Performs single or double bootstrap (or bootknife) resampling and calculates
confidence intervals.
-- Function File: CI = bootci (NBOOT, BOOTFUN, D)
-- Function File: CI = bootci (NBOOT, BOOTFUN, D1,...,DN)
-- Function File: CI = bootci (NBOOT, {BOOTFUN, D}, NAME, VALUE)
-- Function File: CI = bootci (NBOOT, {BOOTFUN, D1, ..., DN}, NAME, VALUE)
-- Function File: CI = bootci (...,'type', TYPE)
-- Function File: CI = bootci (...,'type', 'stud', 'nbootstd', NBOOTSTD)
-- Function File: CI = bootci (...,'type', 'cal', 'nbootcal', NBOOTCAL)
-- Function File: CI = bootci (...,'alpha', ALPHA)
-- Function File: CI = bootci (...,'strata', STRATA)
-- Function File: CI = bootci (...,'loo', LOO)
-- Function File: CI = bootci (...,'seed', SEED)
-- Function File: CI = bootci (...,'Options', PAROPT)
-- Function File: [CI, BOOTSTAT] = bootci (...)
-- Function File: [CI, BOOTSTAT, BOOTSAM] = bootci (...)
'CI = bootci (NBOOT, BOOTFUN, D)' draws NBOOT bootstrap resamples from
the rows of a data sample D and returns 95% confidence intervals (CI) for
the bootstrap statistics computed by BOOTFUN [1]. BOOTFUN is a function
handle (e.g. specified with @), or a string indicating the function name.
The third input argument, data D (a column vector or a matrix), is used
as input for BOOTFUN. The bootstrap resampling method yields first-order
balance [2-3].
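As a rough illustration of what first-order balance means, the sketch below pools NBOOT copies of every row index, shuffles the pool, and cuts it into resamples, so that each row is used exactly NBOOT times overall. This is a generic construction and not necessarily how bootci implements it; the variable names n, nboot, idx, bootsam and counts are illustrative assumptions.

```octave
% Sketch of first-order balanced resampling (assumed, generic construction)
n = 5;                                        % sample size (illustrative)
nboot = 4;                                    % number of resamples (illustrative)
idx = repmat ((1:n).', nboot, 1);             % nboot copies of each row index
idx = idx(randperm (n * nboot));              % shuffle the pooled indices
bootsam = reshape (idx, n, nboot);            % one resample per column
counts = accumarray (bootsam(:), 1, [n, 1]);  % times each row index was drawn
```

Every entry of counts equals nboot, whereas with ordinary independent resampling the counts would only equal nboot on average.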
'CI = bootci (NBOOT, BOOTFUN, D1,...,DN)' is as above except that the
third and subsequent numeric input arguments are data vectors that are
used to create inputs for BOOTFUN.
'CI = bootci (NBOOT, {BOOTFUN, D}, NAME, VALUE)' is as above but includes
setting optional parameters using NAME-VALUE pairs.
'CI = bootci (NBOOT, {BOOTFUN, D1, ..., DN}, NAME, VALUE)' is as above but
includes setting optional parameters using NAME-VALUE pairs.
bootci can take a number of optional parameters as NAME-VALUE pairs:
'CI = bootci (..., 'alpha', ALPHA)' where ALPHA sets the lower and upper
bounds of the confidence interval(s). The value of ALPHA must be between
0 and 1. The nominal lower and upper percentiles of the confidence
intervals CI are then 100*(ALPHA/2)% and 100*(1-ALPHA/2)% respectively,
and nominal central coverage of the intervals is 100*(1-ALPHA)%. The
default value of ALPHA is 0.05.
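The arithmetic relating ALPHA to the nominal percentiles and coverage can be checked with a few lines. The variable names below are illustrative only.

```octave
% Nominal percentiles and central coverage implied by ALPHA
alpha = 0.05;                        % default value of ALPHA
lower_pct = 100 * (alpha / 2);       % nominal lower percentile (2.5)
upper_pct = 100 * (1 - alpha / 2);   % nominal upper percentile (97.5)
coverage = 100 * (1 - alpha);        % nominal central coverage (95)
```

Setting ALPHA to 0.1 instead gives the 5th and 95th percentiles and 90% nominal coverage.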
'CI = bootci (..., 'type', TYPE)' computes the bootstrap confidence
intervals CI using one of the following methods:
<> 'norm' or 'normal': Using bootstrap bias and standard error [4].
<> 'per' or 'percentile': Percentile method [1,4].
<> 'basic': Basic bootstrap method [1,4].
<> 'bca': Bias-corrected and accelerated method [5,6] (Default).
<> 'stud' or 'student': Studentized bootstrap (bootstrap-t) [1,4].
<> 'cal': Calibrated percentile method (by double bootstrap [7]).
Note that when BOOTFUN is the mean, the percentile, basic and bca
intervals are automatically expanded using Student's t-distribution in
order to improve coverage for small samples [8]. The bootstrap-t method
includes an additive correction to stabilize the variance when the sample
size is small [9].
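To make the difference between two of the simpler methods concrete, the sketch below computes textbook percentile and basic intervals [1,4] from plain (unbalanced) bootstrap resamples of the variance. It is an assumption-laden illustration, not the bootci implementation: it omits balanced resampling, interpolation and the small-sample corrections noted above, and all variable names are illustrative.

```octave
% Textbook percentile and basic intervals (sketch, not the bootci internals)
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';
n = numel (data);
nboot = 1999;
alpha = 0.1;
theta = var (data, 1);                   % statistic computed on the sample
bootsam = randi (n, n, nboot);           % row indices, one resample per column
bootstat = var (data(bootsam), 1);       % statistic computed on each resample
sorted = sort (bootstat);
k_lo = round ((nboot + 1) * alpha / 2);        % order statistic, lower limit
k_hi = round ((nboot + 1) * (1 - alpha / 2));  % order statistic, upper limit
ci_per = [sorted(k_lo); sorted(k_hi)];         % percentile interval
ci_basic = [2 * theta - sorted(k_hi); ...      % basic interval reflects the
            2 * theta - sorted(k_lo)];         % percentiles about theta
```

The two intervals have the same length by construction; they differ only in how they are positioned around the sample estimate.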
'CI = bootci (..., 'type', 'stud', 'nbootstd', NBOOTSTD)' computes the
Studentized bootstrap confidence intervals CI, with the standard errors
of the bootstrap statistics estimated automatically using resampling
methods. NBOOTSTD is a positive integer defining the number of bootstrap
resamples used to compute the standard errors. The default value of
NBOOTSTD is 100.
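One way to picture the role of NBOOTSTD is as the size of a second level of resampling applied to each first-level resample. The sketch below shows this for a single first-level resample of the mean; it is a generic nested bootstrap under that assumption, not a description of the bootci internals, and the variable names are illustrative.

```octave
% Standard error of a statistic for one resample via a second-level bootstrap
data = [48 36 20 29 42 42 20 42 22 41].';
n = numel (data);
nbootstd = 100;                          % default value of NBOOTSTD
outer = data(randi (n, n, 1));           % one first-level resample
inner = outer(randi (n, n, nbootstd));   % nbootstd second-level resamples
se = std (mean (inner, 1));              % standard error for this resample
```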
'CI = bootci (..., 'type', 'cal', 'nbootcal', NBOOTCAL)' computes the
calibrated percentile bootstrap confidence intervals CI, with the
calibrated percentiles of the bootstrap statistics estimated from NBOOTCAL
bootstrap data samples. NBOOTCAL is a positive integer value. The default
value of NBOOTCAL is 199.
'CI = bootci (..., 'strata', STRATA)' sets STRATA, which are identifiers
that define the grouping of the rows of the data D for stratified
bootstrap resampling. STRATA should be a column vector or cell array with
the same number of rows as the data.
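The idea of stratified resampling can be sketched as drawing rows with replacement separately within each group, so that every resample keeps the original group sizes. The example below assumes numeric identifiers and uses illustrative variable names; it is not the bootci implementation.

```octave
% Stratified resampling sketch: resample rows within each stratum
data = [1 2 3 4 10 20 30].';
strata = [1 1 1 1 2 2 2].';              % numeric group identifiers (assumed)
n = numel (data);
idx = zeros (n, 1);
groups = unique (strata);
for k = 1:numel (groups)
  rows = find (strata == groups(k));     % rows belonging to this stratum
  m = numel (rows);
  idx(rows) = rows(randi (m, m, 1));     % draw with replacement within it
end
resample = data(idx);
```

Because each position is filled from its own stratum, strata(idx) is identical to strata.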
'CI = bootci (..., 'loo', LOO)' sets LOO, a logical scalar that specifies
whether the resamples of size n should be obtained by sampling from the
original data (false) or from Leave-One-Out (LOO) jackknife samples (true)
of the data, otherwise known as bootknife resampling [10]. Default is
false.
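A single bootknife resample can be sketched as follows: one row is left out at random, and n rows are then drawn with replacement from the remaining n - 1. This follows the general description in [10]; how bootci chooses the omitted row is not stated here, so the random choice and the variable names are assumptions.

```octave
% One bootknife resample: leave one row out, then draw n from the other n-1
data = (1:6).';
n = numel (data);
omit = randi (n);                        % row left out (chosen at random here)
keep = [1:omit-1, omit+1:n].';           % leave-one-out (jackknife) sample
idx = keep(randi (n - 1, n, 1));         % n draws from the n-1 retained rows
resample = data(idx);
```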
'CI = bootci (..., 'seed', SEED)' initialises the Mersenne Twister random
number generator using an integer SEED value so that bootci results are
reproducible.
'CI = bootci (..., 'Options', PAROPT)' specifies options that govern if
and how to perform bootstrap iterations using multiple processors (if the
Parallel Computing Toolbox or Octave Parallel package is available). This
argument is a structure with the following recognised fields:
<> 'UseParallel': If true, use parallel processes to accelerate
bootstrap computations on multicore machines,
specifically non-vectorized function evaluations,
double bootstrap resampling and jackknife function
evaluations. Default is false for serial computation.
In MATLAB, the default is true if a parallel pool
has already been started.
<> 'nproc': nproc sets the number of parallel processes.
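A PAROPT structure with the two recognised fields might be built as below. The values shown (parallel execution on two processes) are arbitrary choices for illustration.

```octave
% Options structure for parallel bootstrap computations (illustrative values)
paropt = struct ('UseParallel', true, 'nproc', 2);
```

The structure would then be passed as the value of the 'Options' NAME-VALUE pair.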
'[CI, BOOTSTAT] = bootci (...)' also returns the bootstrap statistics
used to calculate the confidence intervals CI.
'[CI, BOOTSTAT, BOOTSAM] = bootci (...)' also returns BOOTSAM, a matrix
of indices from the bootstrap. Each column in BOOTSAM corresponds to one
bootstrap sample and contains the row indices of the values drawn from
the nonscalar data argument to create that sample.
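The relationship between a matrix of resample indices and the bootstrap statistics can be sketched without calling bootci. Here a random index matrix stands in for BOOTSAM and the mean stands in for BOOTFUN; both are assumptions made for illustration.

```octave
% How an index matrix like BOOTSAM maps to bootstrap statistics (sketch)
data = [48 36 20 29 42].';
n = numel (data);
nboot = 3;
bootsam = randi (n, n, nboot);           % stand-in for the BOOTSAM output
resamples = data(bootsam);               % column j holds the j-th resample
bootstat = mean (resamples, 1);          % one statistic per column
```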
Bibliography:
[1] Efron and Tibshirani (1993) An Introduction to the
Bootstrap. New York, NY: Chapman & Hall
[2] Davison et al. (1986) Efficient Bootstrap Simulation.
Biometrika, 73: 555-66
[3] Booth, Hall and Wood (1993) Balanced Importance Resampling
for the Bootstrap. The Annals of Statistics. 21(1):286-298
[4] Davison and Hinkley (1997) Bootstrap Methods and their Application.
(Cambridge University Press)
[5] Efron (1987) Better Bootstrap Confidence Intervals. JASA,
82(397): 171-185
[6] Efron and Tibshirani (1993) An Introduction to the
Bootstrap. New York, NY: Chapman & Hall
[7] Hall, Lee and Young (2000) Importance of interpolation when
constructing double-bootstrap confidence intervals. Journal
of the Royal Statistical Society. Series B. 62(3): 479-491
[8] Hesterberg, Tim (2014), What Teachers Should Know about the
Bootstrap: Resampling in the Undergraduate Statistics Curriculum,
http://arxiv.org/abs/1411.5279
[9] Polansky (2000) Stabilizing bootstrap-t confidence intervals
for small samples. Can J Stat. 28(3):501-516
[10] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling
vs. Smoothing; Proceedings of the Section on Statistics & the
Environment. Alexandria, VA: American Statistical Association.
bootci (version 2023.07.04)
Author: Andrew Charles Penn
https://www.researchgate.net/profile/Andrew_Penn/
Copyright 2019 Andrew Charles Penn
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/
The following code
## Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
0 33 28 34 4 32 24 47 41 24 26 30 41].';
## 95% BCa bootstrap confidence intervals for the mean
ci = bootci (1999, @mean, data)
Produces the following output
ci = 23.616 34.358
The following code
## Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
0 33 28 34 4 32 24 47 41 24 26 30 41].';
## 95% calibrated percentile bootstrap confidence intervals for the mean
ci = bootci (1999, {@mean, data}, 'type', 'cal','nbootcal',199)
## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 23.975 34.269
The following code
## Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
0 33 28 34 4 32 24 47 41 24 26 30 41].';
## 95% calibrated percentile bootstrap confidence intervals for the median
## with smoothing
ci = bootci (1999, {@smoothmedian, data}, 'type', 'cal', 'nbootcal', 199)
## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 25.040 36.477
The following code
## Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
0 33 28 34 4 32 24 47 41 24 26 30 41].';
## 90% percentile bootstrap confidence intervals for the variance
ci = bootci (1999, {{@var,1}, data}, 'type', 'per', 'alpha', 0.1)
Produces the following output
ci =
96.629
235.910
The following code
## Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
0 33 28 34 4 32 24 47 41 24 26 30 41].';
## 90% BCa bootstrap confidence intervals for the variance
ci = bootci (1999, {{@var,1}, data}, 'type', 'bca', 'alpha', 0.1)
Produces the following output
ci = 117.01 260.73
The following code
## Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
0 33 28 34 4 32 24 47 41 24 26 30 41].';
## 90% Studentized bootstrap confidence intervals for the variance
ci = bootci (1999, {{@var,1}, data}, 'type', 'stud', ...
'nbootstd', 50, 'alpha', 0.1)
## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 108.55 297.71
The following code
## Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
0 33 28 34 4 32 24 47 41 24 26 30 41].';
## 90% calibrated percentile bootstrap confidence intervals for the variance
ci = bootci (1999, {{@var,1}, data}, 'type', 'cal', 'nbootcal', ...
199, 'alpha', 0.1)
## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 111.53 268.13
The following code
## Input bivariate dataset
x = [2.12,4.35,3.39,2.51,4.04,5.1,3.77,3.35,4.1,3.35, ...
4.15,3.56, 3.39,1.88,2.56,2.96,2.49,3.03,2.66,3].';
y = [2.47,4.61,5.26,3.02,6.36,5.93,3.93,4.09,4.88,3.81, ...
4.74,3.29,5.55,2.82,4.23,3.23,2.56,4.31,4.37,2.4].';
## 95% BCa bootstrap confidence intervals for the correlation coefficient
ci = bootci (1999, @cor, x, y)
## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 0.5050 0.8633
The following code
## Spatial Test Data from Table 14.1 of Efron and Tibshirani (1993)
## An Introduction to the Bootstrap in Monographs on Statistics and Applied
## Probability 57 (Springer)
## AIM:
## To construct 90% nonparametric bootstrap confidence intervals for var(A,1)
## var(A,1) = 171.5
## Exact intervals based on Normal theory are [118.4, 305.2].
## Calculations using Matlab's 'Statistics and Machine Learning toolbox'
## (R2020b)
##
## A = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
## 0 33 28 34 4 32 24 47 41 24 26 30 41].';
## varfun = @(A) var(A, 1);
## rng('default'); % For reproducibility
## rng('default'); ci1 = bootci (19999,{varfun,A},'alpha',0.1,'type','norm');
## rng('default'); ci2 = bootci (19999,{varfun,A},'alpha',0.1,'type','per');
## rng('default'); ci4 = bootci (19999,{varfun,A},'alpha',0.1,'type','bca');
## rng('default'); ci5 = bootci (19999,{varfun,A},'alpha',0.1,'type','stud');
##
## Summary of results from Matlab's 'Statistics and Machine Learning toolbox'
## (R2020b)
##
## method | 0.05 | 0.95 | length | shape |
## -------------------|--------|--------|--------|-------|
## ci1 - normal | 108.9 | 247.4 | 138.5 | 1.21 |
## ci2 - percentile | 97.6 | 235.8 | 138.2 | 0.87 |
## ci4 - BCa | 114.9 | 260.5 | 145.6 | 1.57 |*
## ci5 - bootstrap-t | 46.7 | 232.5 | 185.8 | 0.49 |**
## -------------------|--------|--------|--------|-------|
## parametric - exact | 118.4 | 305.2 | 186.8 | 2.52 |
##
## * Bug in the fx0 subfunction of Matlab's bootci
## ** Bug in the bootstud subfunction of Matlab's bootci
##
## Calculations using the 'boot' and 'bootstrap' packages in R
##
## library (boot) # Functions from Davison and Hinkley (1997)
## A <- c(48,36,20,29,42,42,20,42,22,41,45,14,6,
## 0,33,28,34,4,32,24,47,41,24,26,30,41);
## n <- length(A)
## var.fun <- function (d, i) {
## # Function to compute the population variance
## n <- length (d);
## return (var (d[i]) * (n - 1) / n) };
## boot.fun <- function (d, i) {
## # Compute the estimate
## t <- var.fun (d, i);
## # Compute sampling variance of the estimate using Tukey's jackknife
## n <- length (d);
## U <- empinf (data=d[i], statistic=var.fun, type="jack", stype="i");
## var.t <- sum (U^2 / (n * (n - 1)));
## return ( c(t, var.t) ) };
## set.seed(1)
## var.boot <- boot (data=A, statistic=boot.fun, R=19999, sim='balanced')
## ci1 <- boot.ci (var.boot, conf=0.90, type="norm")
## ci2 <- boot.ci (var.boot, conf=0.90, type="perc")
## ci3 <- boot.ci (var.boot, conf=0.90, type="basic")
## ci4 <- boot.ci (var.boot, conf=0.90, type="bca")
## ci5 <- boot.ci (var.boot, conf=0.90, type="stud")
##
## library (bootstrap) # Functions from Efron and Tibshirani (1993)
## set.seed(1);
## ci4a <- bcanon (A, 19999, var.fun, alpha=c(0.05,0.95))
## set.seed(1);
## ci5a <- boott (A, var.fun, nboott=19999, nbootsd=499, perc=c(.05,.95))
##
## Summary of results from 'boot' and 'bootstrap' packages in R
##
## method | 0.05 | 0.95 | length | shape |
## -------------------|--------|--------|--------|-------|
## ci1 - normal | 109.6 | 246.7 | 137.1 | 1.22 |
## ci2 - percentile | 97.9 | 234.8 | 136.9 | 0.86 |
## ci3 - basic | 108.3 | 245.1 | 136.8 | 1.16 |
## ci4 - BCa | 116.0 | 260.7 | 144.7 | 1.60 |
## ci4a - BCa | 115.8 | 260.6 | 144.8 | 1.60 |
## ci5 - bootstrap-t | 112.0 | 291.8 | 179.8 | 2.02 |
## ci5a - bootstrap-t | 116.1 | 290.9 | 174.8 | 2.16 |
## -------------------|--------|--------|--------|-------|
## parametric - exact | 118.4 | 305.2 | 186.8 | 2.52 |
##
## Calculations using the 'statistics-resampling' package for Octave/Matlab
##
## A = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
## 0 33 28 34 4 32 24 47 41 24 26 30 41].';
## ci1 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','norm','seed',1);
## ci2 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','per','seed',1);
## ci3 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','basic','seed',1);
## ci4 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','bca','seed',1);
## ci5 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','stud',...
## 'nbootstd',100,'seed',1);
## ci6 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','cal', ...
## 'nbootcal',499,'seed',1);
##
## Summary of results from 'statistics-resampling' package for Octave/Matlab
##
## method | 0.05 | 0.95 | length | shape |
## -------------------|--------|--------|--------|-------|
## ci1 - normal | 110.1 | 246.2 | 136.1 | 1.22 |
## ci2 - percentile | 98.1 | 234.7 | 136.6 | 0.86 |
## ci3 - basic | 108.4 | 245.0 | 136.6 | 1.17 |
## ci4 - BCa | 116.1 | 259.3 | 143.2 | 1.59 |
## ci5 - bootstrap-t | 114.0 | 290.3 | 176.3 | 2.07 |
## ci6 - calibrated | 115.3 | 276.4 | 161.1 | 1.87 |
## -------------------|--------|--------|--------|-------|
## parametric - exact | 118.4 | 305.2 | 186.8 | 2.52 |
##
## Simulation results for constructing 90% confidence intervals for the
## variance of a population N(0,1) from 1000 random samples of size 26
## (analogous to the situation above). Simulation performed using the
## bootsim script with nboot of 1999.
##
## method | coverage | lower | upper | length | shape |
## ---------------------|----------|--------|--------|--------|-------|
## normal | 81.5% | 3.0% | 15.5% | 0.77 | 1.21 |
## percentile | 81.5% | 0.9% | 17.6% | 0.76 | 0.91 |
## basic | 81.1% | 2.5% | 16.4% | 0.78 | 1.09 |
## BCa | 84.2% | 5.4% | 10.4% | 0.86 | 1.82 |
## bootstrap-t | 89.2% | 4.3% | 6.5% | 0.99 | 2.15 |
## calibrated | 87.4% | 4.2% | 8.4% | 0.91 | 2.03 |
## ---------------------|----------|--------|--------|--------|-------|
## parametric - exact | 90.8% | 3.7% | 5.5% | 0.99 | 2.52 |
gives an example of how 'bootci' is used.
Package: statistics-resampling