Bootstrap: A Statistical Method, Study notes of Statistics

Victoria University (VU), Statistics

This paper attempts to introduce readers with the concept and methodology of bootstrap in Statistics, which is placed under a larger umbrella of resampling.

Typology: Study notes

2021/2022

Uploaded on 07/05/2022

Bootstrap: A Statistical Method

Kesar Singh and Minge Xie

Rutgers University

Abstract

This paper attempts to introduce readers to the concept and methodology of bootstrap in Statistics, which is placed under a larger umbrella of resampling. A major portion of the discussion should be accessible to anyone who has had a couple of college level applied statistics courses. Towards the end, we attempt to provide glimpses of the vast literature published on the topic, which should be helpful to someone aspiring to go into the depth of the methodology. A section is dedicated to illustrating real data examples. We think the selected set of references covers the greater part of the developments on this subject matter.

1. Introduction and the Idea

B. Efron (1979) introduced the Bootstrap method. It spread like brushfire in statistical sciences within a couple of decades. Now if one conducts a “Google search” for the above title, an astounding 1.86 million records will be mentioned; scanning through even a fraction of these records is a daunting task. We attempt first to explain the idea behind the method and its purpose at a rather rudimentary level. The primary task of a statistician is to summarize a sample-based study and generalize the findings to the parent population in a scientific manner. A technical term for a sample summary number is (sample) statistic. Some basic sample statistics are the sample mean, sample median, sample standard deviation, etc. Of course, a summary statistic like the sample mean will fluctuate from sample to sample, and a statistician would like to know the magnitude of these fluctuations around the corresponding population parameter in an overall sense. This is then used in assessing margins of error. The entire picture of all possible values of a sample statistic, presented in the form of a probability distribution, is called a sampling distribution. There is plenty of theoretical knowledge of sampling distributions, which can be found in any textbook of mathematical statistics. A general intuitive method applicable to just
about any kind of sample statistic, one that keeps the user away from the technical tedium, has its own special appeal. Bootstrap is such a method.

To understand bootstrap, suppose it were possible to draw repeated samples (of the same size) from the population of interest, a large number of times. Then one would get a fairly good idea about the sampling distribution of a particular statistic from the collection of its values arising from these repeated samples. But that is impractical: it would be too expensive and would defeat the purpose of a sample study, which is to gather information cheaply and in a timely fashion. The idea behind bootstrap is to use the data of the sample study at hand as a “surrogate population” for the purpose of approximating the sampling distribution of a statistic; i.e., to resample (with replacement) from the sample data at hand and create a large number of “phantom samples” known as bootstrap samples. The sample summary is then computed on each of the bootstrap samples (usually a few thousand). A histogram of the set of these computed values is referred to as the bootstrap distribution of the statistic.

In bootstrap’s most elementary application, one produces a large number of “copies” of a sample statistic, computed from these phantom bootstrap samples. Then a small percentage, say $100(\alpha/2)\%$ (usually $\alpha = 0.05$), is trimmed off from the lower as well as from the upper end of these numbers. The range of the remaining $100(1-\alpha)\%$ of the values is declared as the confidence limits of the corresponding unknown population summary number of interest, with level of confidence $100(1-\alpha)\%$. This method is referred to as the bootstrap percentile method. We shall return to it later in the article.
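The resampling recipe and the percentile trimming described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the number of replications (2000), the seed and the data values are assumptions made for the example, not part of the paper.

```python
import numpy as np

def bootstrap_distribution(sample, statistic, n_boot=2000, seed=0):
    """Return n_boot bootstrap copies of `statistic` computed on resamples of `sample`."""
    rng = np.random.default_rng(seed)
    x = np.asarray(sample, dtype=float)
    n = x.size
    copies = np.empty(n_boot)
    for b in range(n_boot):
        # Resample with replacement, keeping the original sample size n.
        resample = rng.choice(x, size=n, replace=True)
        copies[b] = statistic(resample)
    return copies

def percentile_interval(copies, alpha=0.05):
    """Trim 100*(alpha/2)% of the bootstrap copies from each end."""
    lower, upper = np.quantile(copies, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Hypothetical skewed sample, for illustration only.
data = np.array([2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 6.8, 7.5, 9.1, 12.3, 15.8, 21.0])

median_copies = bootstrap_distribution(data, np.median)
ci_lower, ci_upper = percentile_interval(median_copies)
print(f"sample median = {np.median(data):.2f}")
print(f"95% percentile interval = ({ci_lower:.2f}, {ci_upper:.2f})")
```

A histogram of `median_copies` would be the bootstrap distribution of the sample median in the sense used above.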

2. The Theoretical Support

Let us develop some mathematical notation for convenience. Suppose a population parameter $\theta$ is the target of a study; say, for example, $\theta$ is the household median income of a chosen community. A random sample of size $n$ yields the data $(X_1, X_2, \ldots, X_n)$. Suppose the corresponding sample statistic computed from this data set is $\hat\theta$ (the sample median in the case of the example). For most sample statistics, the sampling distribution of $\hat\theta$ for large $n$ ($n \ge 30$ is generally accepted as a large sample size) is bell shaped with center $\theta$ and standard deviation [...]

Similarly, the sampling distribution of $(\bar X - \mu)/SE$, with $SE = s/\sqrt{n}$, will be approximated by the bootstrap distribution of $(\bar X_B - \bar X)/SE_B$, with $SE_B = s_B/\sqrt{n}$. The earliest results on second order correction were reported in Singh (1981) and Babu and Singh (1983). In the subsequent years, a flood of large sample results on bootstrap, with substantially higher depth, followed. A name among the researchers in this area that stands out is Peter Hall of Australian National University.
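As a rough illustration of the studentized quantity above, the following sketch computes bootstrap copies of $(\bar X_B - \bar X)/SE_B$. It assumes that $\bar X_B$ and $s_B$ denote the mean and standard deviation of a bootstrap sample (the preview does not show their definition); the function name and the data values are hypothetical.

```python
import numpy as np

def studentized_bootstrap(sample, n_boot=2000, seed=0):
    """Bootstrap copies of (mean_B - mean) / SE_B, with SE_B = s_B / sqrt(n)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(sample, dtype=float)
    n = x.size
    x_bar = x.mean()
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.choice(x, size=n, replace=True)
        # Assumes the resample is not constant, so that s_B > 0.
        se_b = xb.std(ddof=1) / np.sqrt(n)
        t_star[b] = (xb.mean() - x_bar) / se_b
    return t_star

# Hypothetical skewed sample, for illustration only.
sample = np.array([2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 6.8, 7.5, 9.1, 12.3, 15.8, 21.0])

t_star = studentized_bootstrap(sample)
q_low, q_high = np.quantile(t_star, [0.025, 0.975])
print(f"bootstrap 2.5% and 97.5% points of t*: {q_low:.2f}, {q_high:.2f}")
print("normal-theory counterparts: -1.96, 1.96")
```

For skewed data the two bootstrap quantiles are typically not symmetric around zero, which is the kind of departure from the normal limit that second order results describe.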

3. Primary Applications of Bootstrap

3.1 Approximating Standard Error of a Sample Estimate:

Let us suppose information is sought about a population parameter $\theta$. Suppose $\hat\theta$ is a sample estimator of $\theta$ based on a random sample of size $n$, i.e. $\hat\theta$ is a function of the data $(X_1, X_2, \ldots, X_n)$. In order to estimate the standard error of $\hat\theta$, as the sample varies over the class of all possible samples, one has the following simple bootstrap approach:

Compute $(\theta_1^*, \theta_2^*, \ldots, \theta_N^*)$, using the same computing formula as the one used for $\hat\theta$, but now base it on $N$ different bootstrap samples (each of size $n$). A crude recommendation for the size $N$ could be $N = n^2$ (in our judgment), unless $n^2$ is too large. In that case, it could be reduced to an acceptable size, say $n \log_e n$. One defines

$$SE_B(\hat\theta) = \Big[\frac{1}{N}\sum_{i=1}^{N}(\theta_i^* - \hat\theta)^2\Big]^{1/2}$$

following the philosophy of bootstrap: replace the population by the empirical population.

An older resampling technique used for this purpose is the jackknife, though bootstrap is more widely applicable. The famous example where the jackknife fails while bootstrap is still useful is that of $\hat\theta$ = the sample median.
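The standard error formula above translates directly into code. The sketch below follows the text in centering the squared deviations at $\hat\theta$ and in using $N = n^2$ replications, falling back to about $n \log_e n$ when $n^2$ is large. The cutoff `max_boot` that decides what counts as "too large" is a hypothetical parameter, as are the function name and the data.

```python
import math
import numpy as np

def bootstrap_se(sample, statistic, max_boot=5000, seed=0):
    """SE_B = sqrt( (1/N) * sum_i (theta_i^* - theta_hat)^2 )."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    n_boot = n ** 2                      # crude recommendation N = n^2
    if n_boot > max_boot:                # hypothetical cutoff for "too large"
        n_boot = max(2, int(n * math.log(n)))
    rng = np.random.default_rng(seed)
    theta_hat = statistic(x)
    theta_star = np.array(
        [statistic(rng.choice(x, size=n, replace=True)) for _ in range(n_boot)]
    )
    return float(np.sqrt(np.mean((theta_star - theta_hat) ** 2)))

# Hypothetical sample, for illustration only.
data = np.array([2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 6.8, 7.5, 9.1, 12.3, 15.8, 21.0])

se_mean = bootstrap_se(data, np.mean)
se_median = bootstrap_se(data, np.median)
textbook_se_mean = data.std(ddof=1) / math.sqrt(data.size)
print(f"bootstrap SE of mean   = {se_mean:.3f} (s/sqrt(n) = {textbook_se_mean:.3f})")
print(f"bootstrap SE of median = {se_median:.3f}")
```

For the sample mean the result can be compared with the textbook value $s/\sqrt{n}$; for the sample median no such simple formula is available, which is where the bootstrap earns its keep.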

3.2 Bias correction by bootstrap:

The mean of the sampling distribution of $\hat\theta$ often differs from $\theta$, usually by an amount $c/n$ for large $n$. In statistical language, one writes

$$\mathrm{Bias}(\hat\theta) = E(\hat\theta) - \theta \approx O(1/n).$$

A bootstrap based approximation to this bias is

$$\frac{1}{N}\sum_{i=1}^{N}\theta_i^* - \hat\theta = \mathrm{Bias}_B(\hat\theta) \quad \text{(say)},$$

where $\theta_i^*$ are bootstrap copies of $\hat\theta$, as defined in the earlier subsection. Clearly, this construction is also based on the standard bootstrap thinking: replace the population by the empirical population of the sample. The bootstrap bias-corrected estimator is $\hat\theta_c = \hat\theta - \mathrm{Bias}_B(\hat\theta)$. It needs to be pointed out that the older resampling technique called the jackknife is more popular with statisticians for the purpose of bias estimation.

3.3 Bootstrap Confidence Intervals:

Confidence intervals for a given population parameter $\theta$ are sample-based ranges $[\hat\theta_1, \hat\theta_2]$ given out for the unknown number $\theta$. The range possesses the property that $\theta$ would lie within its bounds with a high (specified) probability. The latter is referred to as the confidence level. Of course, this probability is with respect to all possible samples, each sample giving rise to a confidence interval, which thus depends on the chance mechanism involved in drawing the samples. The two most commonly used levels of confidence are 95% and 99%. We limit ourselves to the level 95% for our discussion here. Traditional confidence intervals rely on the knowledge of the sampling distribution of $\hat\theta$, exact or asymptotic as $n \to \infty$. Here are some standard brands of confidence intervals constructed using bootstrap.

Bootstrap Percentile Method: This method was mentioned in the introduction itself, because of its popularity, which is primarily due to its simplicity and natural appeal. Suppose one settles for 1000 bootstrap replications of $\hat\theta$, denoted by $(\theta_1^*, \theta_2^*, \ldots, \theta_{1000}^*)$. After ranking from bottom to top, let us denote these bootstrap values as $(\theta_{(1)}^*, \theta_{(2)}^*, \ldots, \theta_{(1000)}^*)$. Then the bootstrap percentile confidence interval at 95% level of confidence would be $[\theta_{(25)}^*, \theta_{(975)}^*]$. Turning to the theoretical aspects of this method, it should be pointed out that the method requires the symmetry of the sampling distribution of $\hat\theta$ around $\theta$. The reason is that the method approximates the sampling distribution of $\hat\theta - \theta$ by the [...]

[...] interval is known to achieve higher accuracy than the earlier method, which is referred to as “second order accuracy” in the technical literature.

We end the section with a remark that B. Efron proposed corrections to the rudimentary percentile method to bring in extra accuracy. These corrections are known as Efron’s “bias-correction” and “accelerated bias-correction”. The details can be found in Efron and Tibshirani (1993). The bootstrap-t automatically takes care of such corrections, although the bootstrapper needs to look for a formula for SE, which is avoided in the percentile method.

4. Some Real Data Examples

Example 1. (Skewed Univariate Data) In the first example, the data are taken from Hollander and Wolfe (1999, page 63), and represent the effect of illumination (difference between counts with and without illumination) on the rate of beak-clapping among chick embryos; see the end of the section. The boxplot suggests lack of normality of the population. We have carried out bootstrap analysis on the median and on the mean. A noteworthy finding is the lack of symmetry of the bootstrap-t histogram, which differs from the limiting normal curve. The 95% confidence intervals coming from our analysis for both the mean and the median (centered bootstrap percentile method) cover the range [10, 30], roughly speaking. This range represents the overall difference (increase) in the beak-clapping counts per minute due to illumination.

Figure 1. Boxplot of the measurements is presented in (a). Bootstrap distributions of the sample mean, sample median and t* statistic are plotted in (b)-(d), respectively. The dotted lines in (b) and (c) correspond respectively to the sample mean and sample median. Based on the bootstrap distributions, the 95% confidence interval for the population median by the bootstrap percentile method is (4.7000, 24.7000), and by the centered bootstrap percentile method is (10.5000, 30.5000). The 95% confidence interval for the population mean by the percentile bootstrap method is (10.0960, 28.1200), and by the centered bootstrap method is (9.4880, 27.5120). The bootstrap-t 95% CI for the population mean is (12.9413, 30.8147). Note that the bootstrap-t on the mean shows a skewed histogram of the t-distribution.
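The intervals reported in the caption can be cross-checked with a little arithmetic. The sketch below assumes that the centered percentile interval is the percentile interval reflected about the point estimate, i.e. $[2\hat\theta - \theta_{(975)}^*,\ 2\hat\theta - \theta_{(25)}^*]$; the definition itself is not visible in this preview, so this is an assumption, and the function names are hypothetical. Under that assumption, the point estimate can be backed out of each reported pair and the pair checked for consistency.

```python
import math

def implied_estimate(percentile_ci, centered_ci):
    """If centered = (2t - U, 2t - L) for percentile = (L, U), then L + centered_upper = 2t."""
    return (percentile_ci[0] + centered_ci[1]) / 2

def centered_interval(theta_hat, percentile_ci):
    """Reflect a percentile interval (L, U) about the point estimate theta_hat."""
    lower, upper = percentile_ci
    return 2 * theta_hat - upper, 2 * theta_hat - lower

# (percentile interval, centered interval) as reported in the caption of Figure 1.
reported = {
    "median": ((4.7, 24.7), (10.5, 30.5)),
    "mean": ((10.096, 28.12), (9.488, 27.512)),
}

for name, (pct, cen) in reported.items():
    theta_hat = implied_estimate(pct, cen)
    rebuilt = centered_interval(theta_hat, pct)
    consistent = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(rebuilt, cen))
    print(f"{name}: implied estimate = {theta_hat:.4f}, reflection consistent = {consistent}")
```

Both reported pairs are consistent with the reflection form, which under this assumption implies a sample median of about 17.6 and a sample mean of about 18.8 for the beak-clapping data.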

Example 2. (Bivariate Data) In this example, the data are from Collins et al. (1999), which assess body fat in collegiate football players (Devore, 2003, page 553). We study the correlation between the BOD and HW measurements; see the data at the end of this section. Here, BOD is BOD POD, a whole body air-displacement plethysmograph, and HW refers to hydrostatic weighing. The sample size is modest, but reasonable for bootstrap methods. As bivariate data consist of $n$ pairs of data, say $(X_i, Y_i)$ for $i = 1, \ldots, n$, one draws a pair of data randomly at a time in the bootstrap resampling. For instance, the first draw could be $(X_7, Y_7)$, followed by $(X_3, Y_3)$, etc.
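Drawing one pair at a time amounts to resampling row indices, so that each $X_i$ stays attached to its $Y_i$. A minimal sketch for the correlation coefficient follows; the data values are made up for illustration and are not the Collins et al. measurements, and the function name is hypothetical.

```python
import numpy as np

def bootstrap_correlation(x, y, n_boot=2000, seed=0):
    """Bootstrap copies of the sample correlation, resampling (X_i, Y_i) pairs together."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    r_star = np.empty(n_boot)
    for b in range(n_boot):
        # One set of indices is used for both variables, so pairs are never broken up.
        idx = rng.integers(0, n, size=n)
        r_star[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    return r_star

# Hypothetical paired measurements, for illustration only.
x = np.array([12.1, 9.8, 15.3, 20.4, 7.6, 18.2, 11.0, 14.7, 22.5, 16.9])
y = np.array([13.0, 9.1, 14.8, 21.9, 8.4, 17.5, 12.2, 13.9, 23.8, 16.1])

r_star = bootstrap_correlation(x, y)
ci_lower, ci_upper = np.quantile(r_star, [0.025, 0.975])
print(f"sample correlation = {np.corrcoef(x, y)[0, 1]:.3f}")
print(f"95% percentile interval = ({ci_lower:.3f}, {ci_upper:.3f})")
```

Resampling `x` and `y` with separate index draws would destroy the dependence between the two measurements, which is exactly what the correlation is meant to capture.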

A sizable amount of journal literature on the topic is directed towards the proposal and study of bootstrap schemes which will produce decent results in various statistical situations. The setup that has been the basis of the foregoing discussion is basic, and there are many types of departures from it. How does one bootstrap in the case of two-stage sampling or stratified sampling? Natural schemes are not hard to think of.

Bootstrapping in the case of data with regression models has attracted a lot of attention. There are two schemes which stand out: in one, the covariate(s) and the response variable are resampled together (called the paired bootstrap); the other bootstraps the “residuals” (= response – fitted model value) and then reconstructs the bootstrap regression data by plugging in the estimated regression parameters (called the residual bootstrap). The paired bootstrap remains valid, in the sense of giving the correct outcome in the limit as $n \to \infty$, even if the error variances in the model are unequal; this is a property which the residual bootstrap lacks. The shortcoming is compensated by the fact that the latter scheme brings additional accuracy in the estimation of standard error. This is the classic tug of war between efficiency and robustness in statistics (see Liu and Singh (1992)).

A lot harder to bootstrap are time series data. Needless to say, time series analysis is of critical importance in several disciplines, especially in econometrics. The sources of difficulty are two-fold: (I) time series data possess serial dependence, i.e. $X_{T+1}$ depends on $X_T$, $X_{T-1}$, etc.; (II) the statistical population changes with time, which is known as non-stationarity. It was noted very early on (see Singh (1981) for m-dependent data) that the classical bootstrap cannot handle dependent data. A fair amount of research has been dedicated to modifying the bootstrap so that it automatically brings the dependence structure of the original sample into the bootstrap samples. The scheme of moving-block bootstrap has become quite well known (introduced in Künsch (1989) and Liu and Singh (1992)). Politis and Romano are well known authors on the topic, whose contributions have led to significant advancements on the topic of resampling in general. In a moving-block bootstrap scheme, one draws a block of data at a time, instead of one of the $X_i$’s at a time, in order to preserve the underlying serial dependence structure that is present in the sample. There is plenty of ongoing research in the area of bootstrap methodology on econometric data.
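The block-drawing idea can be sketched as follows. This is a simplified illustration, assuming overlapping blocks of a fixed length whose starting points are drawn uniformly, with the concatenated blocks truncated to the original length. The block length of 10, the simulated autoregressive series and the function name are all hypothetical choices for the example.

```python
import numpy as np

def moving_block_resample(series, block_length, rng):
    """Build one bootstrap series by concatenating randomly chosen contiguous blocks."""
    x = np.asarray(series)
    n = x.size
    n_blocks = -(-n // block_length)  # ceiling division: enough blocks to cover n points
    # Every admissible starting point 0, ..., n - block_length is equally likely.
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)
    pieces = [x[s:s + block_length] for s in starts]
    return np.concatenate(pieces)[:n]

rng = np.random.default_rng(0)

# Hypothetical serially dependent series: X_t = 0.6 * X_{t-1} + noise.
noise = rng.normal(size=200)
series = np.empty(200)
series[0] = noise[0]
for t in range(1, 200):
    series[t] = 0.6 * series[t - 1] + noise[t]

# Bootstrap standard error of the sample mean under block resampling.
means = np.array([moving_block_resample(series, 10, rng).mean() for _ in range(1000)])
block_se = means.std(ddof=1)
print(f"moving-block bootstrap SE of the mean = {block_se:.3f}")
```

Within each block the original ordering of the observations is kept, which is how the short-range dependence is carried over into the bootstrap series; drawing single observations instead would treat the data as independent.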

6. The great m out of n bootstrap (with $m/n \to 0$)

There are various types of conditions under which the straightforward bootstrap becomes inconsistent, meaning that the bootstrap estimate of the sampling distribution and the true sampling distribution do not approach the same limit as the sample size $n$ tends to $\infty$. That means, for large samples, one is bound to end up with an inaccurate statistical inference. The examples include, just to name a few: bootstrapping the sample minimum or sample maximum, which estimate an end-point of a population distribution (Bickel and Freedman (1981)); the case of the sample mean when the population variance is $\infty$ (Athreya (1981)); bootstrapping sample eigenvalues when population eigenvalues have multiplicity (Eaton and Tyler (1991)); and the case of the sample median when the population density is discontinuous at the population median (Huang et al. (1996)).

Luckily, a general remedy exists, and that is to keep the bootstrap sample size $m$ much lower than the original size. Mathematically speaking, one requires $m \to \infty$ and $m/n \to 0$ as $n \to \infty$. In theory this fixes the problem; for users, however, it is somewhat troublesome. How to choose $m$? An obvious suggestion would be to settle for a fraction of $n$, say 20% or so. It should be pointed out that in good situations, where the regular bootstrap is fine, such an $m$ is not advisable as it will result in loss of efficiency. See Bickel (2003) for a recent survey on the topic.
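One symptom of the failure for the sample maximum is easy to see numerically. With bootstrap samples of the full size $n$, the bootstrap maximum coincides with the observed maximum with probability $1 - (1 - 1/n)^n \approx 0.63$, however large $n$ is, whereas with size $m$ this probability is $1 - (1 - 1/n)^m$, which vanishes when $m/n \to 0$. The sketch below estimates both by simulation; the uniform data, $n = 200$, $m = 40$ (the 20% suggestion) and the function name are assumptions made for the illustration.

```python
import numpy as np

def fraction_max_reproduced(sample, m, n_boot=4000, seed=0):
    """Fraction of size-m bootstrap samples whose maximum equals the observed maximum."""
    rng = np.random.default_rng(seed)
    x = np.asarray(sample, dtype=float)
    top = x.max()
    hits = 0
    for _ in range(n_boot):
        resample = rng.choice(x, size=m, replace=True)
        hits += int(resample.max() == top)
    return hits / n_boot

# Hypothetical sample from a distribution with a finite upper end-point.
n = 200
sample = np.random.default_rng(42).uniform(size=n)

frac_full = fraction_max_reproduced(sample, m=n)         # ordinary bootstrap
frac_m = fraction_max_reproduced(sample, m=int(0.2 * n))  # m out of n with m = 20% of n
print(f"n out of n: {frac_full:.3f} (theory {1 - (1 - 1 / n) ** n:.3f})")
print(f"m out of n: {frac_m:.3f} (theory {1 - (1 - 1 / n) ** int(0.2 * n):.3f})")
```

A fixed fraction such as 20% only reduces this point mass; it is the requirement $m/n \to 0$ that removes it in the limit, which is part of why the choice of $m$ is troublesome in practice.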

References:

Athreya, K.B. (1986). Bootstrap of the mean in the infinite variance case. Ann. Stat. 14, 724-731.
Azzalini, A. and Hall, P. (2000). Reducing variability using bootstrap methods with quantitative constraints. Biometrika 87, 895-906.
Babu, G.J. (1984). Bootstrapping statistics with linear combination of Chi-square as weak limit. Sankhya A 46, 85-93.
Babu, G.J. and Singh, K. (1983). Inference on means using the bootstrap. Ann. Stat. 11, 999-
Beran, R. (1984). Prepivoting to reduce level errors of confidence sets. Biometrika 74, 151-173.
Beran, R. (1990). Refining bootstrap simultaneous confidence sets. Jour. Amer. Stat. Assoc. 85, 417-428.
Eaton, M.L. and Tyler, D.E. (1991). On Wielandt's inequality and its application to the asymptotic distribution of the eigenvalues of a random symmetric matrix. Ann. Stat. 19, 260-271.
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Ann. Stat. 7, 1-26.
Efron, B. (1987). Better bootstrap confidence intervals (with discussions). Jour. Amer. Stat. Assoc. 82, 171-200.
Efron, B. (1992). Jackknife-after-bootstrap standard errors and influence functions (with discussions). J. R. Stat. Soc. B 54, 83-127.
Efron, B. (1994). Missing data, imputation and the bootstrap (with discussions). Jour. Amer. Stat. Assoc. 89, 463-479.
Efron, B. and Tibshirani, R.J. (1993). AN INTRODUCTION TO THE BOOTSTRAP. Chapman and Hall, New York.
Freedman, D.A. (1981). Bootstrapping regression models. Ann. Stat. 9, 1218-1228.
Hall, P. (1988). Theoretical comparison of bootstrap confidence intervals (with discussions). Ann. Stat. 16, 927-953.
Hall, P. (1989). On efficient bootstrap simulation. Biometrika 76, 613-617.
Hall, P. (1992). Bootstrap confidence intervals in nonparametric regression. Ann. Stat. 20, 695-
Hall, P. (1992). THE BOOTSTRAP AND EDGEWORTH EXPANSION. Springer Verlag, N.Y.
Hinkley, D.V. (1988). Bootstrap methods (with discussions). J. Roy. Stat. Soc. B 50, 321-337.
Hollander, M. and Wolfe, D.A. (1999). NONPARAMETRIC STATISTICAL METHODS (2nd edition). John Wiley & Sons, N.Y.
Huang, J.S., Sen, P.K. and Shao, J. (1996). Bootstrapping a sample quantile when the density has a jump. Stat. Sinica 6, 1996.
Künsch, H.R. (1989). The jackknife and bootstrap for general stationary observations. Ann. Stat. 17, 1217-1241.
Lahiri, S.N. (1993). Bootstrapping the studentized sample mean of lattice variables. J. Mult. Analy. 45, 247-256.
Lahiri, S.N. (1993). On the moving block bootstrap under long range dependence. Stat. Prob. Letters 18, 405-413.
Liu, R.Y. and Singh, K. (1992). Efficiency and robustness in resampling. Ann. Stat. 20, 370-
Liu, R.Y. and Singh, K. (1992). Moving block jackknife and bootstrap capture weak dependence. EXPLORING THE LIMITS OF BOOTSTRAP, R. Lepage and L. Billard (eds.). Wiley, N.Y.
Lunneborg, EE. (2000). DATA ANALYSIS BY RESAMPLING: CONCEPTS AND APPLICATIONS. Duxbury Press.
Mammen, E. (1992). WHEN DOES BOOTSTRAP WORK. ASYMPTOTIC RESULTS AND SIMULATIONS. Springer Verlag, N.Y.
Politis, D.N. and Romano, J.P. (1994). The stationary bootstrap. Jour. Amer. Stat. Assoc. 89, 1303-1313.
Rubin, D.B. (1981). The Bayesian bootstrap. Ann. Stat. 9, 130-134.
Shao, J. and Tu, D. (1995). THE JACKKNIFE AND BOOTSTRAP. Springer Verlag, N.Y.
Singh, K. (1981). On asymptotic accuracy of Efron’s bootstrap. Ann. Stat. 9, 1187-1195.
Singh, K. (1998). Breakdown theory for bootstrap quantiles. Ann. Stat. 26, 1719-1732.
Singh, K. and Xie, M. (2003). Bootlier-plot: Bootstrap based outlier detection plot. Sankhya 65, 532-559.
Taylor, C.C. (1989). Bootstrap choice of smoothing parameter in kernel density estimation. Biometrika 76, 705-712.
Tibshirani, R.J. (1988). Variance stabilization and the bootstrap. Biometrika 75, 433-444.
Wu, C.F.J. (1986). Jackknife, bootstrap and other resampling procedures (with discussions). Ann. Stat. 14, 1261-1350.

Young, G.A. (1994). Bootstrap: More than a stab in the dark? (with discussion). Stat. Sci. 9, 382-
