# doc-cache created by Octave 11.1.0
# name: cache
# type: cell
# rows: 3
# columns: 22
# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4
boot


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4521
 This function returns resampled data or indices created by balanced bootstrap 
 or bootknife resampling.

 -- Function File: BOOTSAM = boot (N, NBOOT)
 -- Function File: BOOTSAM = boot (X, NBOOT)
 -- Function File: BOOTSAM = boot (..., NBOOT, LOO)
 -- Function File: BOOTSAM = boot (..., NBOOT, LOO, SEED)
 -- Function File: BOOTSAM = boot (..., NBOOT, LOO, SEED, WEIGHTS)

     'BOOTSAM = boot (N, NBOOT)' generates NBOOT bootstrap samples of length N.
     The samples generated are composed of indices within the range 1:N, which
     are chosen by random resampling with replacement [1]. N and NBOOT must be
     positive integers. The returned value, BOOTSAM, is a matrix of indices,
     with N rows and NBOOT columns. The efficiency of the bootstrap simulation
     is ensured by sampling each of the indices exactly NBOOT times, for first-
     order balance [2-3]. Balanced resampling only applies when NBOOT > 1.
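     For illustration, the first-order balance property can be reproduced
     with a short sketch (Python here, since the boot MEX file itself is
     compiled C++; this is not the package implementation): concatenate
     NBOOT copies of the indices 1:N and randomly permute them, so that
     every index appears exactly NBOOT times overall.

```python
import random

def balanced_boot_indices(n, nboot, seed=None):
    """Generate nboot bootstrap resamples (columns) of indices 1..n with
    first-order balance: each index appears exactly nboot times across
    the whole set of resamples."""
    rng = random.Random(seed)
    pool = [i for i in range(1, n + 1) for _ in range(nboot)]  # n*nboot indices
    rng.shuffle(pool)  # a random permutation preserves the overall counts
    # split the shuffled pool into nboot columns of length n
    return [pool[j * n:(j + 1) * n] for j in range(nboot)]

cols = balanced_boot_indices(5, 200, seed=1)
flat = [i for col in cols for i in col]
# every index is sampled exactly nboot times (first-order balance)
assert all(flat.count(i) == 200 for i in range(1, 6))
```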

     'BOOTSAM = boot (X, NBOOT)' generates NBOOT bootstrap samples, each the
     same length as X (N). X must be a numeric vector, and NBOOT must be a
     positive integer. BOOTSAM is a matrix of values from X, with N rows
     and NBOOT columns. The samples generated contain values of X, which
     are chosen by balanced bootstrap resampling as described above [1-3].
     Balanced resampling only applies when NBOOT > 1.

     Note that the values of N and NBOOT map onto int32 data types in the 
     boot MEX file. Therefore, these values must never exceed (2^31)-1.

     'BOOTSAM = boot (..., NBOOT, LOO)' sets the resampling method. If LOO
     is false, the resampling method used is balanced bootstrap resampling.
     If LOO is true, the resampling method used is balanced bootknife
     resampling [4]. The latter involves creating leave-one-out (jackknife)
     samples of size N - 1, and then drawing resamples of size N with
     replacement from the jackknife samples, thereby incorporating Bessel's
     correction into the resampling procedure. LOO must be a scalar logical
     value. The default value of LOO is false.
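     A minimal sketch of one bootknife resample follows (illustrative
     Python, not the MEX implementation; the balanced version rotates the
     left-out observation systematically, whereas here it is chosen at
     random for simplicity):

```python
import random

def bootknife_sample(x, seed=None):
    """One bootknife resample: draw len(x) values with replacement from a
    leave-one-out (jackknife) subsample of x, so the resampling procedure
    incorporates Bessel's correction."""
    rng = random.Random(seed)
    omit = rng.randrange(len(x))            # observation left out at random
    jack = x[:omit] + x[omit + 1:]          # jackknife sample of size n - 1
    return [rng.choice(jack) for _ in x]    # resample of size n

s = bootknife_sample([1.0, 2.0, 3.0, 4.0], seed=42)
assert len(s) == 4
```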

     'BOOTSAM = boot (..., NBOOT, LOO, SEED)' sets a seed to initialize
     the pseudo-random number generator to make resampling reproducible between
     calls to the boot function. Note that the mex function compiled from the
     source code boot.cpp is not thread-safe. Below is an example of a line of
     code one can run in Octave/Matlab before attempting parallel operation of
     boot.mex in order to ensure that the initial random seeds of each thread
     are unique:
       • In Octave:
            pararrayfun (nproc, @boot, 1, 1, false, 1:nproc)
       • In Matlab:
            ncpus = feature('numcores'); 
            parfor i = 1:ncpus; boot (1, 1, false, i); end;

     'BOOTSAM = boot (..., NBOOT, LOO, SEED, WEIGHTS)' sets a weight
     vector of length N. If WEIGHTS is empty or not provided, the default 
     is a vector of length N, with each element equal to NBOOT (i.e. uniform
     weighting). Each element of WEIGHTS is the number of times that the
     corresponding index (or element in X) is represented in BOOTSAM.
     Therefore, the sum of WEIGHTS must equal N * NBOOT. 
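     The WEIGHTS semantics can be sketched by generalizing balanced
     resampling (illustrative Python; 'weighted_balanced_indices' is a
     hypothetical helper, not part of the package): build a pool that
     contains index i exactly WEIGHTS(i) times, permute it, and split it
     into NBOOT columns.

```python
import random

def weighted_balanced_indices(weights, nboot, seed=None):
    """Resampled indices where index i (1-based) appears exactly
    weights[i-1] times across all nboot columns. The weights must sum
    to n * nboot, mirroring the WEIGHTS constraint described above."""
    n = len(weights)
    assert sum(weights) == n * nboot        # required: sum(WEIGHTS) == N * NBOOT
    rng = random.Random(seed)
    pool = [i + 1 for i, w in enumerate(weights) for _ in range(w)]
    rng.shuffle(pool)
    return [pool[j * n:(j + 1) * n] for j in range(nboot)]

# uniform weighting (the default) for N = 4, NBOOT = 100
assert sum([100] * 4) == 4 * 100
```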

  Bibliography:
  [1] Efron, and Tibshirani (1993) An Introduction to the
        Bootstrap. New York, NY: Chapman & Hall
  [2] Davison et al. (1986) Efficient Bootstrap Simulation.
        Biometrika, 73: 555-66
  [3] Booth, Hall and Wood (1993) Balanced Importance Resampling
        for the Bootstrap. The Annals of Statistics. 21(1):286-298
  [4] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling 
        vs. Smoothing; Proceedings of the Section on Statistics & the 
        Environment. Alexandria, VA: American Statistical Association.

  boot (version 2024.04.24)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 This function returns resampled data or indices created by balanced bootstra...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
boot1way


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8004
 Performs resampling under the null hypothesis and computes p-values for
 (multiple) comparisons among independent samples in a one-way layout.

 -- Function File: boot1way (DATA, GROUP)
 -- Function File: boot1way (..., NAME, VALUE)
 -- Function File: boot1way (..., 'bootfun', BOOTFUN)
 -- Function File: boot1way (..., 'nboot', NBOOT)
 -- Function File: boot1way (..., 'ref', REF)
 -- Function File: boot1way (..., 'alpha', ALPHA)
 -- Function File: boot1way (..., 'Options', PAROPT)
 -- Function File: PVAL = boot1way (DATA, GROUP, ...)
 -- Function File: [PVAL, C] = boot1way (DATA, GROUP, ...)
 -- Function File: [PVAL, C, STATS] = boot1way (DATA, GROUP, ...)
 -- Function File: [PVAL, C, STATS, BOOTSTAT] = boot1way (DATA, GROUP, ...)
 -- Function File: [...] = boot1way (..., 'display', DISPLAYOPT)

     'boot1way (DATA, GROUP)' performs a bootstrap version of a randomization
     test [1] for comparing independent samples of data in a one-way layout.
     Pairwise multiple comparison tests are computed by the single-step
     maximum absolute t-statistic (maxT) procedure, which controls the family-
     wise error rate (FWER) in a manner analogous to the Tukey-Kramer Honestly
     Significant Difference test. The results are displayed in a formatted
     table and the differences between groups are plotted along with the
     symmetric 95% bootstrap-t confidence intervals (CI). The colours of the
     markers and error bars depend on the value of the multiplicity-adjusted
     p-values: red if p < .05, or blue if p > .05. All of the p-values
     reported represent the outcome of two-tailed tests. DATA must be a
     numeric column vector or
     matrix, where categorization of the DATA rows is achieved by labels in
     GROUP. GROUP must be a vector or cell array with the same number of
     rows as DATA.  
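     The single-step maxT adjustment described above can be sketched as
     follows (illustrative Python with toy t-ratios, not the boot1way
     implementation): the adjusted p-value for each comparison is the
     proportion of null bootstrap resamples whose maximum absolute t-ratio
     reaches the observed one.

```python
def maxT_adjusted_pvalues(t_obs, t_boot):
    """Single-step maxT adjustment. t_obs: observed |t| ratios, one per
    comparison. t_boot: bootstrap resamples generated under the null, each
    a list of |t| ratios for the same comparisons. The adjusted p-value for
    comparison k is the fraction of resamples whose maximum |t| (over all
    comparisons) is at least t_obs[k], which controls the FWER."""
    maxima = [max(ts) for ts in t_boot]
    nboot = len(t_boot)
    return [sum(m >= t for m in maxima) / nboot for t in t_obs]

# two comparisons, four toy bootstrap resamples of null t-ratios
p = maxT_adjusted_pvalues([2.5, 0.5],
                          [[1.0, 2.0], [3.0, 0.5], [0.2, 0.1], [2.6, 1.0]])
assert p == [0.5, 0.75]
```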

     boot1way can take a number of optional parameters as NAME-VALUE pairs:

     'boot1way (..., 'bootfun', BOOTFUN)' also specifies BOOTFUN: the function
     calculated on the original sample and the bootstrap resamples. BOOTFUN
     must be either a:
        o function handle, function name or an anonymous function,
        o string of a function name, or
        o a cell array where the first cell is one of the above function
          definitions and the remaining cells are (additional) input arguments 
          to that function (after the data arguments).
        In all cases, BOOTFUN must take DATA for the initial input argument(s).
        BOOTFUN must calculate a statistic representative of the finite data
        sample; it should NOT be an estimate of a population parameter (unless
        they are one of the same). By default, BOOTFUN is @mean. If a robust
        alternative to the mean is required, set BOOTFUN to 'robust' to
        implement a smoothed version of the median (a.k.a. @smoothmedian). 

     'boot1way (..., 'nboot', NBOOT)' is a scalar or a vector of up to two
     positive integers indicating the number of resamples for the first
     (bootstrap) and second (bootknife) levels of iterated resampling. If NBOOT
     is a scalar value, or if NBOOT(2) is set to 0, then standard errors are
     calculated either without resampling (if BOOTFUN is @mean) or using
     Tukey's jackknife. This implementation of jackknife requires the
     Statistics package/toolbox. The default value of NBOOT is the vector:
     [999,99].

     'boot1way (..., 'ref', REF)' sets the GROUP to use as the reference group
     for the multiple comparison tests. If REF is a recognised member of GROUP,
     then the maxT procedure for treatment versus reference controls the
     family-wise error rate (FWER) in a manner analogous to Dunnett's multiple
     comparison tests.

     'boot1way (..., 'alpha', ALPHA)' specifies the two-tailed significance
     level for CI coverage. The default value of ALPHA is 0.05 for 95%
     confidence intervals.

     'boot1way (..., 'Options', PAROPT)' specifies options that govern if
     and how to perform bootstrap iterations using multiple processors (if the
     Parallel Computing Toolbox or Octave Parallel package is available). This
     argument is a structure with the following recognised fields:
        o 'UseParallel': If true, use parallel processes to accelerate
                         bootstrap computations on multicore machines,
                         specifically non-vectorized function evaluations,
                         double bootstrap resampling and jackknife function
                         evaluations. Default is false for serial computation.
                         In MATLAB, the default is true if a parallel pool
                         has already been started. 
        o 'nproc':       nproc sets the number of parallel processes (optional)

     'PVAL = boot1way (DATA, GROUP, ...)' returns the p-value(s) for the
     (multiple) two-tailed test(s). Note that the p-value(s) returned are
     already adjusted to control the family-wise, type I error rate and are
     truncated at the resolution limit determined by the number of bootstrap
     replicates, specifically 1/NBOOT(1).

     '[PVAL, C] = boot1way (DATA, GROUP, ...)' also returns a 10-column matrix
     that summarises the multiple comparison test results. The columns of C are:
       - column 1:  test GROUP number
       - column 2:  reference GROUP number
       - column 3:  value of BOOTFUN evaluated for the test GROUP
       - column 4:  value of BOOTFUN evaluated for the reference GROUP
       - column 5:  the difference between the groups (column 3 minus column 4)
       - column 6:  LOWER bound of the 100*(1-ALPHA)% bootstrap-t CI
       - column 7:  UPPER bound of the 100*(1-ALPHA)% bootstrap-t CI
       - column 8:  t-ratio
       - column 9:  multiplicity-adjusted p-value
       - column 10: minimum false positive risk for the p-value

     '[PVAL, C, STATS] = boot1way (DATA, GROUP, ...)' also returns a structure 
     containing additional statistics. The stats structure contains the 
     following fields:

       gnames   - group names used in the GROUP input argument. The index of 
                  gnames corresponds to the numbers used to identify GROUPs
                  in columns 1 and 2 of the output argument C
       ref      - index of the reference group
       groups   - group index and BOOTFUN value for each group, with sample
                  size, standard error and CI; the CIs of two groups start to
                  overlap at a multiplicity-adjusted p-value of approximately
                  0.05
       Var      - weighted mean (pooled) sampling variance
       nboot    - number of bootstrap resamples (1st and 2nd resampling layers)
       alpha    - two-tailed significance level for the CI reported in C.

     '[PVAL, C, STATS, BOOTSTAT] = boot1way (DATA, GROUP, ...)' also returns
     the maximum test statistic computed for each bootstrap resample.

     '[...] = boot1way (..., 'display', DISPLAYOPT)' where DISPLAYOPT is a
     logical value (true or false) that specifies whether to display the
     results and plot the graph in addition to creating the output arguments.
     The default is true.

  BIBLIOGRAPHY:
  [1] Efron, and Tibshirani (1993) An Introduction to the Bootstrap. 
        New York, NY: Chapman & Hall

  boot1way (version 2024.04.24)
  Bootstrap tests for comparing independent groups in a one-way layout
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs resampling under the null hypothesis and computes p-values for
 (mu...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
bootbayes


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9026
 Performs Bayesian nonparametric bootstrap and calculates posterior statistics 
 for the mean(s) or regression coefficients of univariate or multivariate data.


 -- Function File: bootbayes (Y)
 -- Function File: bootbayes (Y, X)
 -- Function File: bootbayes (Y, X, CLUSTID)
 -- Function File: bootbayes (Y, X, BLOCKSZ)
 -- Function File: bootbayes (Y, X, ..., NBOOT)
 -- Function File: bootbayes (Y, X, ..., NBOOT, PROB)
 -- Function File: bootbayes (Y, X, ..., NBOOT, PROB, PRIOR)
 -- Function File: bootbayes (Y, X, ..., NBOOT, PROB, PRIOR, SEED)
 -- Function File: bootbayes (Y, X, ..., NBOOT, PROB, PRIOR, SEED, L)
 -- Function File: STATS = bootbayes (Y, ...)
 -- Function File: [STATS, BOOTSTAT] = bootbayes (Y, ...)

     'bootbayes (Y)' performs Bayesian nonparametric bootstrap [1] to create
     1999 bootstrap statistics, each representing the weighted mean(s) of the
     column vector (or column-major matrix), Y, using a vector of weights
     randomly generated from a symmetric Dirichlet distribution. The resulting
     bootstrap (or posterior [1,2]) distribution(s) is/are summarised for each
     column of Y (i.e. each outcome) with the following statistics printed to
     the standard output:
        - original: the mean(s) of the data column(s) of Y
        - bias: bootstrap bias estimate(s)
        - median: the median(s) of the posterior distribution(s)
        - stdev: the standard deviation(s) of the posterior distribution(s)
        - CI_lower: lower bound(s) of the 95% credible interval(s)
        - CI_upper: upper bound(s) of the 95% credible interval(s)
          By default, the credible intervals are shortest probability
          intervals, which represent a more computationally stable version
          of the highest posterior density interval [3].
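     The core of the Bayesian bootstrap can be sketched in a few lines
     (illustrative Python, not the bootbayes implementation): draw weights
     from a symmetric Dirichlet distribution, here by normalizing
     independent Gamma variates, and record the weighted mean for each
     posterior draw.

```python
import random

def bayes_boot_means(y, nboot, alpha=1.0, seed=None):
    """Bayesian bootstrap (Rubin, 1981) for the mean of y: each posterior
    draw is a weighted mean with weights from a symmetric Dirichlet(alpha)
    distribution, generated by normalizing Gamma(alpha, 1) variates."""
    rng = random.Random(seed)
    stats = []
    for _ in range(nboot):
        g = [rng.gammavariate(alpha, 1.0) for _ in y]
        total = sum(g)                       # normalize onto the unit simplex
        stats.append(sum(gi * yi for gi, yi in zip(g, y)) / total)
    return stats

post = bayes_boot_means([1.0, 2.0, 3.0, 4.0], nboot=2000, seed=7)
# weighted means always lie within the range of the observed data
assert min(post) >= 1.0 and max(post) <= 4.0
```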

     'bootbayes (Y, X)' also specifies the design matrix (X) for least squares
     regression of Y on X. X should be a column vector or matrix with the same
     number of rows as Y. If the X input argument is empty, the default for X
     is a column of ones (i.e. intercept only) and thus the statistic computed
     reduces to the mean (as above). The statistics calculated and returned in
     the output then relate to the coefficients from the regression of Y on X.
     Y can be a column vector (for univariate regression) or a matrix (for
     multivariate regression).

     'bootbayes (Y, X, CLUSTID)' specifies a vector or cell array of numbers
     or strings respectively to be used as cluster labels or identifiers.
     Rows in Y (and X) with the same CLUSTID value are treated as clusters with
     dependent errors. Rows of Y (and X) assigned to a particular cluster
     will have identical weights during Bayesian bootstrap. If empty (default),
     no clustered resampling is performed and all errors are treated as
     independent.

     'bootbayes (Y, X, BLOCKSZ)' specifies a scalar, which sets the block size
     for bootstrapping when the residuals have serial dependence. Identical
     weights are assigned within each (consecutive) block of length BLOCKSZ
     during Bayesian bootstrap. Rows of Y (and X) within the same block are
     treated as having dependent errors. If empty (default), no block
     resampling is performed and all errors are treated as independent.
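     The block assignment can be sketched as follows (illustrative Python;
     'block_ids' is a hypothetical helper, not part of the package):
     consecutive rows are grouped into blocks of length BLOCKSZ, and all
     rows within a block share one Dirichlet weight.

```python
def block_ids(n, blocksz):
    """Assign each of n consecutive rows to a block of length blocksz;
    rows sharing a block id receive identical Bayesian-bootstrap weights,
    so their errors are treated as dependent."""
    return [i // blocksz for i in range(n)]

# 7 rows with BLOCKSZ = 3: two full blocks and one partial trailing block
assert block_ids(7, 3) == [0, 0, 0, 1, 1, 1, 2]
```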

     'bootbayes (Y, X, ..., NBOOT)' specifies the number of bootstrap resamples,
     where NBOOT must be a positive integer. If empty, the default value of
     NBOOT is 1999.

     'bootbayes (Y, X, ..., NBOOT, PROB)' where PROB is numeric and sets the
     lower and upper bounds of the credible interval(s). The value(s) of PROB
     must be between 0 and 1. PROB can either be:
       <> scalar: To set the central mass of the shortest probability
                  intervals (SPI) to 100*PROB%
       <> vector: A pair of probabilities defining the lower and upper
                  percentiles of the credible interval(s) as 100*(PROB(1))%
                  and 100*(PROB(2))% respectively.
          Credible intervals are not calculated when the value(s) of PROB
          is/are NaN. The default value of PROB is 0.95.

     'bootbayes (Y, X, ..., NBOOT, PROB, PRIOR)' accepts a positive real
     numeric scalar to parametrize the form of the symmetric Dirichlet
     distribution. The Dirichlet distribution is the conjugate PRIOR used to
     randomly generate weights on the unit simplex for linear least-squares
     fitting of the observed data, and subsequently to estimate the posterior
     for the regression coefficients by Bayesian bootstrap and any derived 
     linear estimates or contrasts.

     If PRIOR is not provided or is empty, the default value of PRIOR is
     'auto'. The behaviour of 'auto' depends on whether X is provided and
     whether the model contains slope coefficients.

     If no X is provided, or in intercept-only models, the value 'auto'
     sets PRIOR so that the Bayesian-bootstrap posterior standard deviation
     of the mean equals the usual frequentist standard error, i.e. 
     std (Y, 0) / sqrt(N). Here N denotes the number of independent sampling 
     units (e.g., observations, clusters, or blocks). Thus:

          PRIOR (i.e. alpha) = 1 - 2 / N

     With this setting, std (BOOTSTAT, 0, 2) = std (Y, 0) / sqrt (N) and
     var (BOOTSTAT, 0, 2) = std (Y, 0)^2 / N (up to Monte Carlo error).
     When N = 2 (Haldane prior, PRIOR = 0) and the statistic is the mean, the
     posterior standard deviation equals the frequentist standard error exactly
     (up to Monte Carlo error):

         std (BOOTSTAT, 1, 2) = std (Y, 1) = std (Y, 0) / sqrt (N)

     If X is a design matrix including slope predictor terms, the value
     'auto' generalizes the above by providing a global Bessel-style
     correction matching the overall variance scale on average across
     coefficients. Thus:

          PRIOR (i.e. alpha) = 1 - (tr (H) + 1) / N = 1 - (rank (X) + 1) / N

     Here tr(H) (and equivalently rank(X)) is the effective model degrees of
     freedom, and N is the number of independent sampling units. Equivalently:

          PRIOR (i.e. alpha) = 1 - (N - dfe + 1) / N

     where dfe = N - rank(X) is the effective error degrees of freedom.
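     The 'auto' formulas above can be collected into one helper
     (illustrative Python; 'auto_prior' is a hypothetical function, shown
     only to check that the two stated forms agree):

```python
def auto_prior(n, rank_X=None):
    """'auto' value of PRIOR for the symmetric Dirichlet distribution,
    following the formulas in this help text: 1 - 2/N for the mean
    (intercept-only) case, and 1 - (rank(X) + 1)/N when the design matrix
    X includes slope terms. n is the number of independent sampling units."""
    if rank_X is None or rank_X == 1:       # mean / intercept-only model
        return 1.0 - 2.0 / n
    return 1.0 - (rank_X + 1.0) / n

assert auto_prior(2) == 0.0                 # N = 2 gives the Haldane prior
assert auto_prior(10) == 0.8
# equivalent form via the error degrees of freedom: dfe = N - rank(X),
# so PRIOR = 1 - (N - dfe + 1)/N
n, r = 20, 3
dfe = n - r
assert auto_prior(n, r) == 1.0 - (n - dfe + 1.0) / n
```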

     Alternative standard prior choices include: 1 for Bayes’ rule (uniform on
     the simplex), 0.5 for the transformation-invariant Jeffreys prior (for the
     Dirichlet weights), and 0 for the Haldane prior. Priors lower than 1 
     produce a more conservative (wider) posterior, whereas priors greater 
     than 1 are more liberal, shrinking the posterior bootstrap statistics 
     toward the maximum-likelihood estimates.

     (For the Haldane prior, normal-quantile CIs use std(BOOTSTAT,1,2) to
     match the population normalization used for the interval formula.)

     'bootbayes (Y, X, ..., NBOOT, PROB, PRIOR, SEED)' initialises the
     Mersenne Twister random number generator using an integer SEED value so
     that 'bootbayes' results are reproducible.

     'bootbayes (Y, X, ..., NBOOT, PROB, PRIOR, SEED, L)' multiplies the
     regression coefficients by the hypothesis matrix L. If L is not provided
     or is empty, it will assume the default value of 1 (i.e. no change to
     the design). Otherwise, L must have the same number of rows as the number
     of columns in X.

     'STATS = bootbayes (...)' returns a structure where each field contains
     a matrix of size (p x q), where p is the number of predictors and q is
     the number of outcomes (columns in Y). Fields include: original, bias, 
     median, stdev, CI_lower, CI_upper and prior.

     '[STATS, BOOTSTAT] = bootbayes (Y, ...)' also returns the bootstrap
     statistics. If Y is a column vector, BOOTSTAT is a (p x nboot) matrix. 
     If Y is a matrix (q > 1), BOOTSTAT is a (1 x q) cell array where each
     cell contains the (p x nboot) matrix for that outcome.

  Bibliography:
  [1] Rubin (1981) The Bayesian Bootstrap. Ann. Statist. 9(1):130-134
  [2] Weng (1989) On a Second-Order Asymptotic Property of the Bayesian
        Bootstrap Mean. Ann. Statist. 17(2):705-710
  [3] Liu, Gelman & Zheng (2015). Simulation-efficient shortest probability
        intervals. Statistics and Computing, 25(4), 809–819. 

  bootbayes (version 2026.02.02)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs Bayesian nonparametric bootstrap and calculates posterior statistic...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
bootcdf


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2170
 Computes the empirical cumulative distribution function (ECDF), accounting for
 the presence of ties. Useful for bootstrap statistics, which often contain ties.

 -- Function File: [x, F] = bootcdf (y)
 -- Function File: [x, F] = bootcdf (y, trim)
 -- Function File: [x, F] = bootcdf (y, trim, m)
 -- Function File: [x, F] = bootcdf (y, trim, m, tol)
 -- Function File: [x, F, P] = bootcdf (...)

     '[x, F] = bootcdf (y)' computes the empirical cumulative distribution
     function (ECDF) of the vector y of length N. This function accounts for
     the presence of ties and so is suitable for computing the ECDF of
     bootstrap statistics.

     '[x, F] = bootcdf (y, trim)' removes redundant rows of the ECDF when trim
     is true. When trim is false, x and F are the same length as y. The
     default is true.

     '[x, F] = bootcdf (y, trim, m)' specifies the denominator in the
     calculation of F as (N + m). Accepted values of m are 0 or 1, with the
     default being 0. When m is 1, quantiles formed from x and F are akin to
     qtype 6 in the R quantile function.
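     The tie handling described above can be sketched as follows
     (illustrative Python assuming exact ties and ignoring the tol
     argument; not the bootcdf implementation): tied values share the
     largest cumulative probability of their group, and trimming drops the
     redundant rows.

```python
def ecdf_with_ties(y, trim=True, m=0):
    """ECDF of y accounting for ties: tied values share the maximum
    cumulative probability of their tie group, with denominator N + m."""
    n = len(y)
    ys = sorted(y)
    # cumulative probability at each sorted position, denominator N + m
    F = [(i + 1) / (n + m) for i in range(n)]
    # propagate the largest F backwards through each tie group
    for i in range(n - 2, -1, -1):
        if ys[i] == ys[i + 1]:
            F[i] = F[i + 1]
    if trim:  # drop redundant rows within tie groups
        keep = [i for i in range(n) if i == n - 1 or ys[i] != ys[i + 1]]
        ys = [ys[i] for i in keep]
        F = [F[i] for i in keep]
    return ys, F

x, F = ecdf_with_ties([3, 1, 2, 2])
assert x == [1, 2, 3] and F == [0.25, 0.75, 1.0]
```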

     '[x, F] = bootcdf (y, trim, m, tol)' applies a tolerance for the absolute
     difference in y values that constitutes a tie. The default tolerance
     is 1e-12 for double precision, or 1e-6 for single precision.

     '[x, F, P] = bootcdf (...)' also returns the distribution of P values.

  bootcdf (version 2024.04.21)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Computes the empirical cumulative distribution function (ECDF), accounting f...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
bootci


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8407
 Advanced bootstrap confidence intervals: Performs bootstrap and calculates
 various types of confidence intervals from the resulting bootstrap statistics. 

 -- Function File: CI = bootci (NBOOT, BOOTFUN, D)
 -- Function File: CI = bootci (NBOOT, BOOTFUN, D1,...,DN)
 -- Function File: CI = bootci (NBOOT, {BOOTFUN, D}, NAME, VALUE)
 -- Function File: CI = bootci (NBOOT, {BOOTFUN, D1, ..., DN}, NAME, VALUE)
 -- Function File: CI = bootci (...,'type', TYPE)
 -- Function File: CI = bootci (...,'type', 'stud', 'nbootstd', NBOOTSTD)
 -- Function File: CI = bootci (...,'type', 'cal', 'nbootcal', NBOOTCAL)
 -- Function File: CI = bootci (...,'alpha', ALPHA)
 -- Function File: CI = bootci (...,'strata', STRATA)
 -- Function File: CI = bootci (...,'loo', LOO)
 -- Function File: CI = bootci (...,'seed', SEED)
 -- Function File: CI = bootci (...,'Options', PAROPT)
 -- Function File: [CI, BOOTSTAT] = bootci (...)
 -- Function File: [CI, BOOTSTAT, BOOTSAM] = bootci (...)

     'CI = bootci (NBOOT, BOOTFUN, D)' draws NBOOT bootstrap resamples from
     the rows of a data sample D and returns 95% confidence intervals (CI) for
     the bootstrap statistics computed by BOOTFUN [1]. BOOTFUN is a function 
     handle (e.g. specified with @), or a string indicating the function name. 
     The third input argument, data D (a column vector or a matrix), is used
     as input for BOOTFUN. The bootstrap resampling method yields first-order
     balance [2-3].

     'CI = bootci (NBOOT, BOOTFUN, D1,...,DN)' is as above except that the
     third and subsequent numeric input arguments are data (column vectors
     or matrices) that are used to create inputs for BOOTFUN.

     'CI = bootci (NBOOT, {BOOTFUN, D}, NAME, VALUE)' is as above but includes
     setting optional parameters using Name-Value pairs.

     'CI = bootci (NBOOT, {BOOTFUN, D1, ..., DN}, NAME, VALUE)' is as above but
     includes setting optional parameters using NAME-VALUE pairs.

     bootci can take a number of optional parameters as NAME-VALUE pairs:

     'CI = bootci (..., 'alpha', ALPHA)' where ALPHA sets the lower and upper 
     bounds of the confidence interval(s). The value of ALPHA must be between
     0 and 1. The nominal lower and upper percentiles of the confidence
     intervals CI are then 100*(ALPHA/2)% and 100*(1-ALPHA/2)% respectively,
     and nominal central coverage of the intervals is 100*(1-ALPHA)%. The
     default value of ALPHA is 0.05.

     'CI = bootci (..., 'type', TYPE)' computes bootstrap confidence interval 
     CI using one of the following methods:
      <> 'norm' or 'normal': Using bootstrap bias and standard error [4].
      <> 'per' or 'percentile': Percentile method [1,4].
      <> 'basic': Basic bootstrap method [1,4].
      <> 'bca': Bias-corrected and accelerated method [5,6] (Default).
      <> 'stud' or 'student': Studentized bootstrap (bootstrap-t) [1,4].
      <> 'cal': Calibrated percentile method (by double bootstrap [7]).
       Note that when BOOTFUN is the mean, the percentile, basic and bca
       intervals are automatically expanded using Student's t-distribution in
       order to improve coverage for small samples [8]. To compute the
       confidence intervals for the mean without this correction for
       narrowness bias, define the mean in BOOTFUN as an anonymous function
       (e.g. @(x) mean(x)). The bootstrap-t method includes an additive
       correction to stabilize the variance when the sample size is small [9].
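     For comparison, the two simplest interval types can be sketched as
     follows (illustrative Python with nearest-rank percentiles; the
     percentile calculations in bootci differ in detail):

```python
def percentile_ci(bootstat, alpha=0.05):
    """Percentile interval: the 100*(alpha/2) and 100*(1-alpha/2)
    empirical percentiles of the bootstrap statistics (nearest rank)."""
    s = sorted(bootstat)
    nboot = len(s)
    lo = s[max(0, int(nboot * alpha / 2) - 1)]
    hi = s[min(nboot - 1, int(nboot * (1 - alpha / 2)))]
    return lo, hi

def basic_ci(bootstat, theta_hat, alpha=0.05):
    """Basic (reflected) interval: 2*theta_hat minus the opposite
    percentile bounds."""
    lo, hi = percentile_ci(bootstat, alpha)
    return 2 * theta_hat - hi, 2 * theta_hat - lo

boot = list(range(1, 1001))   # stand-in bootstrap statistics 1..1000
lo, hi = percentile_ci(boot, 0.05)
assert lo <= hi
blo, bhi = basic_ci(boot, 500, 0.05)
assert blo <= bhi
```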

     'CI = bootci (..., 'type', 'stud', 'nbootstd', NBOOTSTD)' computes the
     Studentized bootstrap confidence intervals CI, with the standard errors
     of the bootstrap statistics estimated automatically using resampling
     methods. NBOOTSTD is a positive integer defining the number of bootstrap
     resamples used to compute the standard errors. The default value of
     NBOOTSTD is 100.

     'CI = bootci (..., 'type', 'cal', 'nbootcal', NBOOTCAL)' computes the
     calibrated percentile bootstrap confidence intervals CI, with the
     calibrated percentiles of the bootstrap statistics estimated from NBOOTCAL
     bootstrap data samples. NBOOTCAL is a positive integer value. The default
     value of NBOOTCAL is 199.

     'CI = bootci (..., 'strata', STRATA)' sets STRATA, which are identifiers
     that define the grouping of the DATA rows for stratified bootstrap
     resampling. STRATA should be a column vector or cell array with the same
     number of rows as the DATA.

     'CI = bootci (..., 'loo', LOO)' is a logical scalar that specifies whether
     the resamples of size n should be obtained by sampling from the original
     data (false) or from Leave-One-Out (LOO) jackknife samples (true) of the
     data - otherwise known as bootknife resampling [10]. Default is false.

     'CI = bootci (..., 'seed', SEED)' initialises the Mersenne Twister random
     number generator using an integer SEED value so that bootci results are
     reproducible.

     'CI = bootci (..., 'Options', PAROPT)' specifies options that govern if
     and how to perform bootstrap iterations using multiple processors (if the
     Parallel Computing Toolbox or Octave Parallel package is available). This
     argument is a structure with the following recognised fields:
       <> 'UseParallel': If true, use parallel processes to accelerate
                         bootstrap computations on multicore machines,
                         specifically non-vectorized function evaluations,
                         double bootstrap resampling and jackknife function
                         evaluations. Default is false for serial computation.
                         In MATLAB, the default is true if a parallel pool
                         has already been started. 
       <> 'nproc':       nproc sets the number of parallel processes (optional)

     '[CI, BOOTSTAT] = bootci (...)' also returns the bootstrap statistics
     used to calculate the confidence intervals CI.
   
     '[CI, BOOTSTAT, BOOTSAM] = bootci (...)' also returns BOOTSAM, a matrix 
     of indices from the bootstrap. Each column in BOOTSAM corresponds to one 
     bootstrap sample and contains the row indices of the values drawn from 
     the nonscalar data argument to create that sample.

  Bibliography:
  [1] Efron, and Tibshirani (1993) An Introduction to the
        Bootstrap. New York, NY: Chapman & Hall
  [2] Davison et al. (1986) Efficient Bootstrap Simulation.
        Biometrika, 73: 555-66
  [3] Booth, Hall and Wood (1993) Balanced Importance Resampling
        for the Bootstrap. The Annals of Statistics. 21(1):286-298
  [4] Davison and Hinkley (1997) Bootstrap Methods and their Application.
        (Cambridge University Press)
  [5] Efron (1987) Better Bootstrap Confidence Intervals. JASA, 
        82(397): 171-185 
  [6] Efron, and Tibshirani (1993) An Introduction to the
        Bootstrap. New York, NY: Chapman & Hall
  [7] Hall, Lee and Young (2000) Importance of interpolation when
        constructing double-bootstrap confidence intervals. Journal
        of the Royal Statistical Society. Series B. 62(3): 479-491
  [8] Hesterberg, Tim (2014), What Teachers Should Know about the 
        Bootstrap: Resampling in the Undergraduate Statistics Curriculum, 
        http://arxiv.org/abs/1411.5279
  [9] Polansky (2000) Stabilizing bootstrap-t confidence intervals
        for small samples. Can J Stat. 28(3):501-516
  [10] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling 
        vs. Smoothing; Proceedings of the Section on Statistics & the 
        Environment. Alexandria, VA: American Statistical Association.

  bootci (version 2024.05.13)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Advanced bootstrap confidence intervals: Performs bootstrap and calculates
 ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
bootclust


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8214
 Performs balanced bootstrap (or bootknife) resampling of clusters or blocks of
 data and calculates bootstrap bias, standard errors and confidence intervals.

 -- Function File: bootclust (DATA)
 -- Function File: bootclust (DATA, NBOOT)
 -- Function File: bootclust (DATA, NBOOT, BOOTFUN)
 -- Function File: bootclust ({D1, D2, ...}, NBOOT, BOOTFUN)
 -- Function File: bootclust (DATA, NBOOT, {BOOTFUN, ...})
 -- Function File: bootclust (DATA, NBOOT, BOOTFUN, ALPHA)
 -- Function File: bootclust (DATA, NBOOT, BOOTFUN, ALPHA, CLUSTID)
 -- Function File: bootclust (DATA, NBOOT, BOOTFUN, ALPHA, BLOCKSZ)
 -- Function File: bootclust (DATA, NBOOT, BOOTFUN, ALPHA, ..., LOO)
 -- Function File: bootclust (DATA, NBOOT, BOOTFUN, ALPHA, ..., LOO, SEED)
 -- Function File: bootclust (DATA, NBOOT, BOOTFUN, ALPHA, ..., LOO, SEED, NPROC)
 -- Function File: STATS = bootclust (...)
 -- Function File: [STATS, BOOTSTAT] = bootclust (...)
 -- Function File: [STATS, BOOTSTAT, BOOTDATA] = bootclust (...)

     'bootclust (DATA)' uses nonparametric balanced bootstrap resampling
     to generate 1999 resamples from clusters or contiguous blocks of rows of
     the DATA (column vector or matrix) [1]. By default, each row is its own
     cluster/block (i.e. no clustering or blocking). The means of the resamples
     are then computed and the following statistics are displayed:
        - original: the original estimate(s) calculated by BOOTFUN and the DATA
        - bias: bootstrap estimate of the bias of the sampling distribution(s)
        - std_error: bootstrap estimate(s) of the standard error(s)
        - CI_lower: lower bound(s) of the 95% bootstrap confidence interval(s)
        - CI_upper: upper bound(s) of the 95% bootstrap confidence interval(s)

     'bootclust (DATA, NBOOT)' specifies the number of bootstrap resamples,
     where NBOOT is a positive integer scalar. The default value of NBOOT
     is 1999.

     'bootclust (DATA, NBOOT, BOOTFUN)' also specifies BOOTFUN: the function
     calculated on the original sample and the bootstrap resamples. BOOTFUN
     must be one of the following:
       <> a function handle or anonymous function,
       <> a string containing a function name, or
       <> a cell array where the first cell is one of the above function
          definitions and the remaining cells are (additional) input arguments 
          to that function (after the data arguments).
        In all cases BOOTFUN must take DATA for the initial input argument(s).
        BOOTFUN can return a scalar or any multidimensional numeric variable,
        but the output will be reshaped as a column vector. BOOTFUN must
        calculate a statistic representative of the finite data sample; it
        should NOT be an estimate of a population parameter (unless they are
        one and the same). If BOOTFUN is @mean or 'mean', the narrowness bias
        of the confidence intervals for single bootstrap is reduced by
        expanding the probabilities of the percentiles using Student's
        t-distribution [2]. By default, BOOTFUN is @mean.

     'bootclust ({D1, D2, ...}, NBOOT, BOOTFUN)' resamples from the clusters
     or blocks of rows of the data vectors D1, D2 etc and the resamples are
     passed onto BOOTFUN as multiple data input arguments. All data vectors
     and matrices (D1, D2 etc) must have the same number of rows.

     'bootclust (DATA, NBOOT, BOOTFUN, ALPHA)', where ALPHA is numeric,
     sets the lower and upper bounds of the confidence interval(s). The
     value(s) of ALPHA must be between 0 and 1. ALPHA can either be:
       <> scalar: To set the (nominal) central coverage of equal-tailed
                  percentile confidence intervals to 100*(1-ALPHA)%.
       <> vector: A pair of probabilities defining the (nominal) lower and
                  upper percentiles of the confidence interval(s) as
                  100*(ALPHA(1))% and 100*(ALPHA(2))% respectively. The
                  percentiles are bias-corrected and accelerated (BCa) [3].
        The default value of ALPHA is the vector: [.025, .975], for a 95%
        BCa confidence interval.

     'bootclust (DATA, NBOOT, BOOTFUN, ALPHA, CLUSTID)' also sets CLUSTID,
     which are identifiers that define the grouping of the DATA rows for
     cluster bootstrap resampling. CLUSTID should be a column vector or
     cell array with the same number of rows as the DATA. Rows in DATA with
     the same CLUSTID value are treated as clusters of observations that are
     resampled together.
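     For illustration, a minimal sketch of cluster resampling (the data and
     cluster identifiers are hypothetical):

```octave
% Hypothetical example: cluster bootstrap of the mean with 10 clusters
% of 3 observations each
y = randn (30, 1);                       % example data
clustid = kron ((1:10)', ones (3, 1));   % cluster identifiers (10 clusters)
bootclust (y, 1999, @mean, [0.025, 0.975], clustid);
```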

     'bootclust (DATA, NBOOT, BOOTFUN, ALPHA, BLOCKSZ)' groups consecutive
     DATA rows into non-overlapping blocks of length BLOCKSZ for simple block
     bootstrap resampling [4]. Note that this variation of block bootstrap is
     a special case of resampling clustered data. By default, BLOCKSZ is 1.

     'bootclust (DATA, NBOOT, BOOTFUN, ALPHA, ..., LOO)' sets the resampling
     method. If LOO is false, the resampling method used is balanced bootstrap
     resampling. If LOO is true, the resampling method used is balanced
     bootknife resampling [5]. Where N is the number of clusters or blocks,
     bootknife cluster or block resampling involves creating leave-one-out
     jackknife samples of size N - 1, and then drawing resamples of size N with
     replacement from the jackknife samples, thereby incorporating Bessel's
     correction into the resampling procedure. LOO must be a scalar logical
     value. The default value of LOO is false.

     'bootclust (DATA, NBOOT, BOOTFUN, ALPHA, ..., LOO, SEED)' initialises
     the Mersenne Twister random number generator using an integer SEED value
     so that bootclust results are reproducible.

     'bootclust (DATA, NBOOT, BOOTFUN, ALPHA, ..., LOO, SEED, NPROC)' also
     sets the number of parallel processes to use for jackknife computations
     and non-vectorized function evaluations during bootstrap and on multicore
     machines. This feature requires the Parallel package (in Octave), or the
     Parallel Computing Toolbox (in MATLAB). This option is ignored during
     bootstrap function evaluations when BOOTFUN is vectorized.

     'STATS = bootclust (...)' returns a structure with the following fields
     (defined above): original, bias, std_error, CI_lower, CI_upper.

     '[STATS, BOOTSTAT] = bootclust (...)' returns BOOTSTAT, a vector or matrix
     of bootstrap statistics calculated over the bootstrap resamples.

     '[STATS, BOOTSTAT, BOOTDATA] = bootclust (...)' returns BOOTDATA, a 1-by-
     NBOOT cell array of datasets generated by cluster or block bootstrap
     resampling.
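     Putting the output arguments together, a hypothetical block bootstrap
     call might look like this (the data and block size are assumptions):

```octave
% Hypothetical example: simple block bootstrap of the mean with
% non-overlapping blocks of length 5
y = randn (40, 1);                   % example data
[stats, bootstat, bootdata] = bootclust (y, 1999, @mean, 0.05, 5);
% stats    : structure with original, bias, std_error, CI_lower, CI_upper
% bootdata : 1-by-1999 cell array of block-resampled datasets
```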

  BIBLIOGRAPHY:
  [1] Davison and Hinkley (1997). Bootstrap methods and their application
        (Vol. 1). New York, NY: Cambridge University Press.
  [2] Hesterberg, Tim (2014), What Teachers Should Know about the 
        Bootstrap: Resampling in the Undergraduate Statistics Curriculum, 
        http://arxiv.org/abs/1411.5279
  [3] Efron and Tibshirani (1993) An Introduction to the Bootstrap. 
        New York, NY: Chapman & Hall
  [4] Carlstein (1986) The use of subseries values for estimating the
        variance of a general statistic from a stationary sequence. 
        Ann. Statist. 14, 1171-9
  [5] Hesterberg (2004) Unbiasing the Bootstrap—Bootknife Sampling 
        vs. Smoothing; Proceedings of the Section on Statistics & the 
        Environment. Alexandria, VA: American Statistical Association.

  bootclust (version 2024.05.16)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs balanced bootstrap (or bootknife) resampling of clusters or blocks ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
bootint


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3142
 Computes percentile confidence interval(s) directly from a vector (or row-
 major matrix) of bootstrap statistics.

 -- Function File: CI = bootint (BOOTSTAT)
 -- Function File: CI = bootint (BOOTSTAT, PROB)
 -- Function File: CI = bootint (BOOTSTAT, PROB, ORIGINAL)

     'CI = bootint (BOOTSTAT)' computes simple 95% percentile confidence
     intervals [1,2] directly from the vector, or rows* of the matrix in
     BOOTSTAT, where BOOTSTAT contains bootstrap statistics such as those
     generated using the `bootstrp` function. Depending on the application,
     bootstrap confidence intervals with better coverage and accuracy can
     be computed using the various dedicated bootstrap confidence interval
     functions from the statistics-resampling package.

        * The matrix should have dimensions P * NBOOT, where P corresponds to
          the number of parameter estimates and NBOOT corresponds to the number
          of bootstrap samples.

     'CI = bootint (BOOTSTAT, PROB)' returns confidence intervals, where
     PROB is numeric and sets the lower and upper bounds of the confidence
     interval(s). The value(s) of PROB must be between 0 and 1. PROB can
     either be:
       <> scalar: To set the central mass of the percentile confidence
                  interval(s) to 100*PROB%
       <> vector: A pair of probabilities defining the lower and upper
                  percentiles of the confidence interval(s) as 100*(PROB(1))%
                  and 100*(PROB(2))% respectively.
          The default value of PROB is the vector: [0.025, 0.975], for an
          equal-tailed 95% percentile confidence interval.

     'CI = bootint (BOOTSTAT, PROB, ORIGINAL)' uses the ORIGINAL estimates
     associated with BOOTSTAT to correct PROB and the resulting confidence
     intervals (CI) for median bias. The confidence intervals returned in CI
     therefore become bias-corrected percentile intervals [3,4].
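     As a hedged sketch (the data are hypothetical, and the bootstrap
     statistics are assumed to come from the package's 'bootstrp' function
     as mentioned above):

```octave
% Hypothetical example: percentile and bias-corrected percentile
% confidence intervals for the median
x = randn (50, 1);                          % example data
bootstat = bootstrp (1999, @median, x);     % bootstrap medians
ci  = bootint (bootstat);                   % simple 95% percentile interval
cbc = bootint (bootstat, 0.95, median (x)); % bias-corrected interval
```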

  BIBLIOGRAPHY:
  [1] Efron (1979) Bootstrap Methods: Another look at the jackknife.
        Annals Stat. 7,1-26
  [2] Efron, and Tibshirani (1993) An Introduction to the Bootstrap. 
        New York, NY: Chapman & Hall
  [3] Efron (1981) Nonparametric Standard Errors and Confidence Intervals.
        Can J Stat. 9(2):139-172
  [4] Efron (1982) The jackknife, the bootstrap, and other resampling plans.
        SIAM-NSF, CBMS #38

  bootint (version 2024.05.19)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Computes percentile confidence interval(s) directly from a vector (or row-
 ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
bootknife


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10132
 Performs one or two levels of bootknife resampling and calculates bootstrap
 bias, standard errors and confidence intervals.

 -- Function File: bootknife (DATA)
 -- Function File: bootknife (DATA, NBOOT)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN)
 -- Function File: bootknife ({D1, D2, ...}, NBOOT, BOOTFUN)
 -- Function File: bootknife (DATA, NBOOT, {BOOTFUN, ...})
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC, BOOTSAM)
 -- Function File: STATS = bootknife (...)
 -- Function File: [STATS, BOOTSTAT] = bootknife (...)
 -- Function File: [STATS, BOOTSTAT, BOOTSAM] = bootknife (...)

     'bootknife (DATA)' uses a variant of nonparametric bootstrap, called
     bootknife [1], to generate 1999 resamples from the rows of the DATA
     (column vector or matrix) and compute their means and display the
     following statistics:
        - original: the original estimate(s) calculated by BOOTFUN and the DATA
        - bias: bootstrap estimate of the bias of the sampling distribution(s)
        - std_error: bootstrap estimate(s) of the standard error(s)
        - CI_lower: lower bound(s) of the 95% bootstrap confidence interval(s)
        - CI_upper: upper bound(s) of the 95% bootstrap confidence interval(s)

     'bootknife (DATA, NBOOT)' specifies the number of bootstrap resamples,
     where NBOOT can be either:
       <> scalar: A positive integer specifying the number of bootstrap
                  resamples [2,3] for single bootstrap, or
       <> vector: A pair of positive integers defining the number of outer and
                  inner (nested) resamples for iterated (a.k.a. double)
                  bootstrap and coverage calibration [3-6].
        The default value of NBOOT is the scalar: 1999.

     'bootknife (DATA, NBOOT, BOOTFUN)' also specifies BOOTFUN: the function
     calculated on the original sample and the bootstrap resamples. BOOTFUN
     must be one of the following:
       <> a function handle or anonymous function,
       <> a string containing a function name, or
       <> a cell array where the first cell is one of the above function
          definitions and the remaining cells are (additional) input arguments 
          to that function (after the data arguments).
        In all cases BOOTFUN must take DATA for the initial input argument(s).
        BOOTFUN can return a scalar or any multidimensional numeric variable,
        but the output will be reshaped as a column vector. BOOTFUN must
        calculate a statistic representative of the finite data sample; it
        should NOT be an estimate of a population parameter (unless they are
        one and the same). If BOOTFUN is @mean or 'mean', the narrowness bias
        of the confidence intervals for single bootstrap is reduced by
        expanding the probabilities of the percentiles using Student's
        t-distribution [7]. To compute confidence intervals for the mean
        without this correction for narrowness bias, define the mean within
        an anonymous function instead (e.g. @(x) mean(x)). By default,
        BOOTFUN is @mean.
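     The accepted BOOTFUN forms can be sketched as follows (the data are
     hypothetical; 'trimmean' is used only to illustrate the cell-array form
     and assumes that function is available):

```octave
x = randn (25, 1);                     % example data
bootknife (x, 1999, @mean);            % function handle (the default)
bootknife (x, 1999, 'mean');           % function name as a string
bootknife (x, 1999, {@trimmean, 10});  % cell array: calls trimmean (x, 10)
bootknife (x, 1999, @(x) mean (x));    % anonymous function; bypasses the
                                       % narrowness-bias correction
```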

     'bootknife ({D1, D2, ...}, NBOOT, BOOTFUN)' resamples from the rows of D1,
     D2 etc and the resamples are passed to BOOTFUN as multiple data input
     arguments. All data vectors and matrices (D1, D2 etc) must have the same
     number of rows.

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA)', where ALPHA is numeric,
     sets the lower and upper bounds of the confidence interval(s). The
     value(s) of ALPHA must be between 0 and 1. ALPHA can either be:
       <> scalar: To set the (nominal) central coverage of equal-tailed
                  percentile confidence intervals to 100*(1-ALPHA)%. The
                  intervals are either simple percentiles for single
                  bootstrap, or percentiles with calibrated central coverage 
                  for double bootstrap.
       <> vector: A pair of probabilities defining the (nominal) lower and
                  upper percentiles of the confidence interval(s) as
                  100*(ALPHA(1))% and 100*(ALPHA(2))% respectively. The
                  percentiles are either bias-corrected and accelerated (BCa)
                  for single bootstrap, or calibrated for double bootstrap.
        Note that the type of coverage calibration (i.e. equal-tailed or
        not) depends on whether NBOOT is a scalar or a vector. Confidence
        intervals are not calculated when the value(s) of ALPHA is/are NaN.
        The default value of ALPHA is the vector: [.025, .975], for a 95%
        confidence interval.
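     For example, a hypothetical double (iterated) bootstrap with coverage
     calibration, where NBOOT is a vector of outer and inner resample counts:

```octave
x = randn (25, 1);                     % example data (assumed)
% 1999 outer and 199 inner resamples; calibrated 95% intervals
stats = bootknife (x, [1999, 199], @mean, [0.025, 0.975]);
```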

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA)' also sets STRATA, which
     are identifiers that define the grouping of the DATA rows for stratified*
     bootstrap resampling. STRATA should be a column vector or cell array with
     the same number of rows as the DATA.

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC)' also sets the
     number of parallel processes to use to accelerate computations of double
     bootstrap, jackknife and non-vectorized function evaluations on multicore
     machines. This feature requires the Parallel package (in Octave), or the
     Parallel Computing Toolbox (in MATLAB).

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC, BOOTSAM)' uses
     bootstrap resampling indices provided in BOOTSAM. BOOTSAM should be a
     matrix with the same number of rows as the DATA. When BOOTSAM is
     provided, the first element of NBOOT is ignored.
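     A hedged sketch of reusing precomputed resampling indices (the data are
     hypothetical, and STRATA and NPROC are assumed to accept [] and 0 as
     pass-through defaults):

```octave
x = randn (25, 1);                     % example data
bootsam = boot (25, 1999, true);       % balanced bootknife indices
stats = bootknife (x, 1999, @mean, [0.025, 0.975], [], 0, bootsam);
```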

     'STATS = bootknife (...)' returns a structure with the following fields
     (defined above): original, bias, std_error, CI_lower, CI_upper.

     '[STATS, BOOTSTAT] = bootknife (...)' returns BOOTSTAT, a vector or matrix
     of bootstrap statistics calculated over the (first, or outer layer of)
     bootstrap resamples.

     '[STATS, BOOTSTAT, BOOTSAM] = bootknife (...)' also returns BOOTSAM, the
     matrix of indices (32-bit integers) used for the (first, or outer
     layer of) bootstrap resampling. Each column in BOOTSAM corresponds
     to one bootstrap resample and contains the row indices of the values
     drawn from the nonscalar DATA argument to create that sample.

  * For cluster resampling, use the 'bootclust' function instead. Clustered
    or serially dependent data can also be analysed by the 'bootstrp', 
    'bootwild' and 'bootbayes' functions.

  DETAILS:
    For a DATA sample with n rows, bootknife resampling involves creating
  leave-one-out jackknife samples of size n - 1 and then drawing resamples
  of size n with replacement from the jackknife samples [1]. In contrast
  to bootstrap, bootknife resampling produces unbiased estimates of the
  standard error of BOOTFUN when n is small. The resampling of DATA rows
  is balanced in order to reduce Monte Carlo error, particularly for
  estimating the bias of BOOTFUN [8,9].
    For single bootstrap, the confidence intervals are constructed from the
  quantiles of a kernel density estimate of the bootstrap statistics
  (with shrinkage correction). 
    For double bootstrap, calibration is used to improve the accuracy of the 
  bias and standard error, and coverage of the confidence intervals [2-6]. 
  Double bootstrap confidence intervals are constructed from the empirical
  distribution of the bootstrap statistics by linear interpolation. 
    This function has no input arguments for specifying a random seed. However,
  one can reset the random number generator with a SEED value using the
  following command:

     boot (1, 1, false, SEED);

    Please see the help documentation for the function 'boot' for more
  information about setting the seed for parallel execution of bootknife.
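  For instance, seeding the generator before resampling (hypothetical data;
  SEED = 1 here):

```octave
x = randn (25, 1);                     % example data
boot (1, 1, false, 1);                 % seed the generator used by boot
stats = bootknife (x, 1999, @mean);    % results are now reproducible
```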

  BIBLIOGRAPHY:
  [1] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling 
        vs. Smoothing; Proceedings of the Section on Statistics & the 
        Environment. Alexandria, VA: American Statistical Association.
  [2] Davison A.C. and Hinkley D.V. (1997) Bootstrap Methods And Their 
        Application. (Cambridge University Press)
  [3] Efron, and Tibshirani (1993) An Introduction to the Bootstrap. 
        New York, NY: Chapman & Hall
  [4] Booth J. and Presnell B. (1998) Allocation of Monte Carlo Resources for
        the Iterated Bootstrap. J. Comput. Graph. Stat. 7(1):92-112 
  [5] Lee and Young (1999) The effect of Monte Carlo approximation on coverage
        error of double-bootstrap confidence intervals. J R Statist Soc B.
        61:353-366.
  [6] Hall, Lee and Young (2000) Importance of interpolation when
        constructing double-bootstrap confidence intervals. Journal
        of the Royal Statistical Society. Series B. 62(3): 479-491
  [7] Hesterberg, Tim (2014), What Teachers Should Know about the 
        Bootstrap: Resampling in the Undergraduate Statistics Curriculum, 
        http://arxiv.org/abs/1411.5279
  [8] Davison et al. (1986) Efficient Bootstrap Simulation.
        Biometrika, 73: 555-66
  [9] Gleason, J.R. (1988) Algorithms for Balanced Bootstrap Simulations. 
        The American Statistician. Vol. 42, No. 4 pp. 263-266

  bootknife (version 2024.05.15)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs one or two levels of bootknife resampling and calculates bootstrap
...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
bootlm


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 29679
 Uses bootstrap to calculate confidence intervals (and p-values) for the
 regression coefficients from a linear model and performs N-way ANOVA.

 -- Function File: bootlm (Y, X)
 -- Function File: bootlm (Y, GROUP)
 -- Function File: bootlm (Y, GROUP, ..., NAME, VALUE)
 -- Function File: bootlm (Y, GROUP, ..., 'dim', DIM)
 -- Function File: bootlm (Y, GROUP, ..., 'continuous', CONTINUOUS)
 -- Function File: bootlm (Y, GROUP, ..., 'model', MODELTYPE)
 -- Function File: bootlm (Y, GROUP, ..., 'standardize', STANDARDIZE)
 -- Function File: bootlm (Y, GROUP, ..., 'varnames', VARNAMES)
 -- Function File: bootlm (Y, GROUP, ..., 'method', METHOD)
 -- Function File: bootlm (Y, GROUP, ..., 'method', 'bayesian', 'prior', PRIOR)
 -- Function File: bootlm (Y, GROUP, ..., 'alpha', ALPHA)
 -- Function File: bootlm (Y, GROUP, ..., 'display', DISPOPT)
 -- Function File: bootlm (Y, GROUP, ..., 'contrasts', CONTRASTS)
 -- Function File: bootlm (Y, GROUP, ..., 'nboot', NBOOT)
 -- Function File: bootlm (Y, GROUP, ..., 'clustid', CLUSTID)
 -- Function File: bootlm (Y, GROUP, ..., 'blocksz', BLOCKSZ)
 -- Function File: bootlm (Y, GROUP, ..., 'posthoc', POSTHOC)
 -- Function File: bootlm (Y, GROUP, ..., 'seed', SEED)
 -- Function File: STATS = bootlm (...)
 -- Function File: [STATS, BOOTSTAT] = bootlm (...)
 -- Function File: [STATS, BOOTSTAT, AOVSTAT] = bootlm (...)
 -- Function File: [STATS, BOOTSTAT, AOVSTAT, PRED_ERR] = bootlm (...)
 -- Function File: [STATS, BOOTSTAT, AOVSTAT, PRED_ERR, MAT] = bootlm (...)
 -- Function File: MAT = bootlm (Y, GROUP, ..., 'nboot', 0)

        Fits a linear model with categorical and/or continuous predictors (i.e.
     independent variables) on a continuous outcome (i.e. dependent variable)
     and computes the following statistics for each regression coefficient:
          - name: the name(s) of the regression coefficient(s)
          - coeff: the value of the regression coefficient(s)
          - CI_lower: lower bound(s) of the 95% confidence interval (CI)
          - CI_upper: upper bound(s) of the 95% confidence interval (CI)
          - p-val: two-tailed p-value(s) for the parameter(s) being equal to 0
        By default, confidence intervals and Null Hypothesis Significance Tests
     (NHSTs) for the regression coefficients (H0 = 0) are calculated by wild
     (cluster) unrestricted bootstrap-t and are robust when normality and
     homoscedasticity cannot be assumed [1].

        Usage of this function is very similar to that of 'anovan'. Data (Y)
     is a numeric variable, and the predictor(s) are specified in GROUP (a.k.a.
     X), which can be a vector, matrix, or cell array. For a single predictor,
     GROUP can be a vector of numbers, logical values or characters, or a cell
     array of numbers, logical values or character strings, with the same
     number of elements (n) as Y. For K predictors, GROUP can be either a
     1-by-K cell array, where each cell corresponds to a predictor as defined
     above, an n-by-K cell array, or an n-by-K array of numbers or characters. If
     Y, or the definitions of each predictor in GROUP, are not column vectors,
     then they will be transposed or reshaped to be column vectors. Rows of
     data whose outcome (Y) or value of any predictor is NaN or Inf are
     excluded. For examples of function usage, please see demonstrations in
     the manual at:

     https://gnu-octave.github.io/statistics-resampling/function/bootlm.html

     Note that if GROUP is defined as a design matrix X, containing (one or
     more) continuous predictors only, a constant term (intercept) should
     not be included in X - one is automatically added to the model. For models
     containing any categorical predictors, the 'bootlm' function creates the
     design matrix for you. If you have already constructed a design matrix,
     consider using the functions 'bootwild' and 'bootbayes' instead.

     'bootlm' can take a number of optional parameters as name-value
     pairs.
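     A minimal, hypothetical one-way example using name-value pairs (the
     data, group coding and chosen options are assumptions):

```octave
% Three groups of 10 observations with different means
y = [randn(10, 1); randn(10, 1) + 1; randn(10, 1) + 2];
g = [ones(10, 1); 2 * ones(10, 1); 3 * ones(10, 1)];
stats = bootlm (y, g, 'varnames', {'treatment'}, 'seed', 1);
```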

     '[...] = bootlm (Y, GROUP, ..., 'varnames', VARNAMES)'

       <> VARNAMES must be a cell array of strings with each element
          containing a predictor name for each column of GROUP. By default
          (if not parsed as optional argument), VARNAMES are
          'X1','X2','X3', etc.

     '[...] = bootlm (Y, GROUP, ..., 'continuous', CONTINUOUS)'

       <> CONTINUOUS is a vector of indices indicating which of the
          columns (i.e. predictors) in GROUP should be treated as
          continuous predictors rather than as categorical predictors.
          The relationship between continuous predictors and the outcome
          should be linear.

     '[...] = bootlm (Y, GROUP, ..., 'model', MODELTYPE)'

       <> MODELTYPE can be specified as one of the following:

             o 'linear' (default): compute N main effects with no
               interactions.

             o 'interaction': compute N main effects and N*(N-1)/2
                two-factor interactions

             o 'full': compute the N main effects and interactions at
               all levels

             o a scalar integer: representing the maximum interaction
               order

             o a matrix of term definitions: each row is a term and
               each column is a predictor

               -- Example:
               A two-way design with interaction would be: [1 0; 0 1; 1 1]

     '[...] = bootlm (Y, GROUP, ..., 'standardize', STANDARDIZE)'

       <> STANDARDIZE can be either 'off' (or false, default) or 'on' (or true),
          which controls whether the outcome and any continuous predictors in
          the model should be converted to standard scores before model
          fitting to give standardized regression coefficients. Please see the
          documentation relating to the 'posthoc' input argument to read about
          further consequences of turning on 'standardize'.

     '[...] = bootlm (Y, GROUP, ..., 'method', METHOD)'

       <> METHOD can be specified as one of the following:

             o 'wild' (default): Wild bootstrap-t, using the 'bootwild'
               function. Please see the help documentation below and in the
               function 'bootwild' for more information about this method [1].

             o 'bayesian': Bayesian bootstrap, using the 'bootbayes' function.
                Please see the help documentation below and in the function
                'bootbayes' for more information about this method [2]. This
                method is well-optimized for large data sets.

             Note that p-values are a frequentist concept and are only computed
             and returned from bootlm when the METHOD is 'wild'. Since the wild
             bootstrap method here (based on Webb's 6-point distribution)
             imposes symmetry on the sampling of the residuals, we recommend
             using 'wild' bootstrap for (two-sided) hypothesis tests, and
             instead use 'bayesian' bootstrap with the 'auto' prior setting
             (see below) for estimation of precision/uncertainty (e.g. credible
             intervals).

     '[...] = bootlm (Y, GROUP, ..., 'method', 'bayesian', 'prior', PRIOR)'

       <> Sets the prior for Bayesian bootstrap. Possible values are:

             o scalar: A non-negative (>= 0) scalar (Dirichlet concentration
                  alpha) that parametrizes the symmetric Dirichlet used to
                  generate weights for linear least squares and the
                  nonparametric Bayesian bootstrap posterior. (Note: alpha = 0
                  is the Haldane case.)

             o 'auto' (default): Sets value(s) for PRIOR that effectively
                  incorporate Bessel's correction a priori, so that the
                  variance of the posterior (i.e. of the rows of BOOTSTAT)
                  becomes an unbiased estimator of the sampling variance*.
                  The bootlm function uses a dedicated prior for each linear
                  estimate. Please see the help documentation for the function
                  'bootbayes' for the exact formulas and details [2].

     '[...] = bootlm (Y, GROUP, ..., 'alpha', ALPHA)'

       <> ALPHA is numeric and sets the lower and upper bounds of the
          confidence or credible interval(s). The value(s) of ALPHA must be
          between 0 and 1. ALPHA can either be:

             o scalar: Set the central mass of the intervals to 100*(1-ALPHA)%.
                  For example, 0.05 for a 95% interval. If METHOD is 'wild',
                  then the intervals are symmetric bootstrap-t confidence
                  intervals [1]. If METHOD is 'bayesian', then the intervals
                  are shortest probability credible intervals [2].

             o vector: A pair of probabilities defining the lower and upper
                   bounds of the interval(s) as 100*(ALPHA(1))% and
                  100*(ALPHA(2))% respectively. For example, [.025, .975] for
                  a 95% interval. If METHOD is 'wild', then the intervals are
                  asymmetric bootstrap-t confidence intervals [1]. If METHOD is
                  'bayesian', then the intervals are simple percentile credible
                  intervals [2].

               The default value of ALPHA is the scalar: 0.05.
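
                As a sketch (hypothetical variables y and g), both forms of
                ALPHA can be passed as follows:

                ```octave
                % Central 99% symmetric bootstrap-t intervals (scalar ALPHA)
                stats = bootlm (y, g, 'display', 'off', 'alpha', 0.01);

                % Asymmetric bounds from a pair of probabilities
                stats = bootlm (y, g, 'display', 'off', ...
                                'alpha', [0.025, 0.975]);
                ```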

     '[...] = bootlm (Y, GROUP, ..., 'display', DISPOPT)'

       <> DISPOPT can be either 'on' (or true, default) or 'off' (or false)
          and controls the display of the model formula, a table of model
          parameter estimates and a figure of diagnostic plots. The p-values
          are formatted in APA-style.

     '[...] = bootlm (Y, GROUP, ..., 'contrasts', CONTRASTS)'

       <> CONTRASTS can be specified as one of the following:

             o A string corresponding to one of the built-in contrasts
               listed below:

                  o 'treatment' (default): Treatment contrast (or dummy)
                    coding. The intercept represents the mean of the first
                    level of all the predictors. Each slope coefficient
                    compares one level of a predictor (or interaction
                    between predictors) with the first level for that/those
                    predictor(s), at the first level of all the other
                    predictors. The first (or reference level) of the
                    predictor(s) is defined as the first level of the
                    predictor (or combination of the predictors) listed in
                    the GROUP argument. This type of contrast is ideal for
                    one-way designs or factorial designs of nominal predictor
                    variables that have an obvious reference or control group.

                  o 'anova' or 'simple': Simple (ANOVA) contrast coding. The
                    intercept represents the grand mean. Each slope coefficient
                    represents the difference between one level of a predictor
                    (or interaction between predictors) to the first level for
                    that/those predictor(s), averaged over all levels of the
                    other predictor(s). The first (or reference level) of the
                    predictor(s) is defined as the first level of the predictor
                    (or combination of the predictors) listed in the GROUP
                    argument. The columns of this contrast coding scheme sum
                    to zero. This type of contrast is ideal for nominal
                    predictor variables that have an obvious reference or
                    control group and that are modelled together with a
                    covariate or blocking factor.

                  o 'poly': Polynomial contrast coding for trend analysis.
                    The intercept represents the grand mean. The remaining
                    slope coefficients returned are for linear, quadratic,
                    cubic etc. trends across the levels. In addition to the
                    columns of this contrast coding scheme summing to zero,
                    this contrast coding is orthogonal (i.e. the off-diagonal
                    elements of its autocovariance matrix are zero) and so
                    the slope coefficients are independent. This type of
                    contrast is ideal for ordinal predictor variables, in
                    particular, predictors with ordered levels that are evenly
                    spaced.

                  o 'helmert': Helmert contrast coding. The intercept
                    represents the grand mean. Each slope coefficient
                    represents the difference between one level of a predictor
                    (or interaction between predictors) with the mean of the
                    subsequent levels, where the order of the predictor levels
                    is as they appear in the GROUP argument. In addition to the
                    columns of this contrast coding scheme summing to zero,
                    this contrast coding is orthogonal (i.e. the off-diagonal
                    elements of its autocovariance matrix are zero) and so the
                    slope coefficients are independent. This type of contrast
                    is ideal for predictor variables that are either ordinal,
                    or nominal with their levels ordered such that the contrast
                    coding reflects tests of some hypotheses of interest about
                     the nested grouping of the predictor levels. Note that
                     the one-level predictor is the reference and is subtracted
                     from the mean of the subsequent predictor levels.

                  o 'effect': Deviation effect coding. The intercept represents
                    the grand mean. Each slope coefficient compares one level
                    of a predictor (or interaction between predictors) with the
                    grand mean. Note that a slope coefficient is omitted for
                    the first level of the predictor(s) listed in the GROUP
                    argument. The columns of this contrast coding scheme sum to
                    zero. This type of contrast is ideal for nominal predictor
                    variables when there is no obvious reference group.

                  o 'sdif' or 'sdiff': Successive differences contrast coding.
                    The intercept represents the grand mean. Each slope
                    coefficient represents the difference between one level of
                    a predictor (or interaction between predictors) to the
                    previous one, where the order of the predictor levels is
                    as they appear in the GROUP argument. The columns of this
                     contrast coding scheme sum to zero. This type of
                    contrast is ideal for ordinal predictor variables.

              o A matrix containing a custom contrast coding scheme (i.e.
                the generalized inverse of contrast weights). Rows in
                the contrast matrices correspond to predictor levels in the
                order that they first appear in the GROUP column. The
                matrix must contain one column fewer than the number of
                predictor levels.

          If the linear model contains more than one predictor and a
          built-in contrast coding scheme was specified, then those
          contrasts are applied to all predictors. To specify different
          contrasts for different predictors in the model, CONTRASTS should
          be a cell array with the same number of cells as there are
          columns in GROUP. Each cell should define contrasts for the
          respective column in GROUP by one of the methods described
          above. If cells are left empty, then the default contrasts
          are applied. Contrasts for cells corresponding to continuous
          predictors are ignored.
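
           For example (a sketch with hypothetical predictors g1 and g2),
           different contrasts can be set per predictor with a cell array:

           ```octave
           % Helmert coding for the first predictor; an empty cell leaves
           % the second predictor with the default 'treatment' coding
           stats = bootlm (y, {g1, g2}, 'contrasts', {'helmert', []});
           ```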

     '[...] = bootlm (Y, GROUP, ..., 'nboot', NBOOT)'

       <> Specifies the number of bootstrap resamples, where NBOOT must be a
          positive integer. If empty, the default value of NBOOT is 9999.

     '[...] = bootlm (Y, GROUP, ..., 'clustid', CLUSTID)'

       <> Specifies a vector or cell array of numbers or strings respectively
          to be used as cluster labels or identifiers. Rows of the data with
          the same CLUSTID value are treated as clusters with dependent errors.
           If CLUSTID is a row vector or a two-dimensional array or matrix,
           then it will be transposed or reshaped into a column vector. If
           empty (default), no clustered resampling is performed and all
           errors are treated as independent. The standard errors computed
           are cluster-robust standard errors.

     '[...] = bootlm (Y, GROUP, ..., 'blocksz', BLOCKSZ)'

       <> Specifies a scalar, which sets the block size for bootstrapping when
          the errors have serial dependence. Rows of the data within the same
          block are treated as having dependent errors. If empty (default),
          no clustered resampling is performed and all errors are treated
          as independent. The standard errors computed are cluster robust.
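
        Both dependence options can be sketched as follows (hypothetical
        data and cluster identifiers):

        ```octave
        % Rows sharing a CLUSTID value are treated as one cluster
        clustid = [1; 1; 2; 2; 3; 3; 4; 4];
        stats = bootlm (y, g, 'clustid', clustid);

        % For serially dependent errors, use blocks of 4 rows instead
        stats = bootlm (y, g, 'blocksz', 4);
        ```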

     '[...] = bootlm (Y, GROUP, ..., 'dim', DIM)'

       <> DIM can be specified as one of the following:

              o a cell array of strings corresponding to variable names
                defined in VARNAMES, or

             o a scalar or vector specifying the dimension(s),

          over which 'bootlm' calculates and returns estimated marginal means
          instead of regression coefficients. For example, the value [1 3]
          computes the estimated marginal mean for each combination of the
          levels of the first and third predictors. The rows of the estimates
          returned are sorted according to the order of the dimensions
          specified in DIM. The default value of DIM is empty, which makes
          'bootlm' return the statistics for the model coefficients. If DIM
          is, or includes, a continuous predictor then 'bootlm' will return an
          error. The following statistics are printed when specifying 'dim':
             - name: the name(s) of the estimated marginal mean(s)
             - mean: the estimated marginal mean(s)
             - CI_lower: lower bound(s) of the 95% confidence interval (CI)
             - CI_upper: upper bound(s) of the 95% confidence interval (CI)
             - N: the number of independent sampling units used to compute CIs
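
           As a sketch (hypothetical two-predictor model; the 'varnames'
           option, documented earlier, supplies the names used by DIM):

           ```octave
           % Estimated marginal means over the first predictor, specified
           % either numerically or by variable name
           stats = bootlm (y, {g1, g2}, 'varnames', {'A', 'B'}, 'dim', 1);
           stats = bootlm (y, {g1, g2}, 'varnames', {'A', 'B'}, ...
                           'dim', {'A'});
           ```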

     '[...] = bootlm (Y, GROUP, ..., 'posthoc', POSTHOC)'

       <> When DIM is specified, POSTHOC comparisons along DIM can be one of
          the following:

             o empty or 'none' (default) : No posthoc comparisons are performed.
               The statistics returned are for the estimated marginal means.

             o 'pairwise' : Pairwise comparisons are performed.

             o 'trt_vs_ctrl' : Treatment vs. Control comparisons are performed.
               Control corresponds to the level(s) of the predictor(s) listed
               within the first row of STATS when POSTHOC is set to 'none'.

             o {'trt_vs_ctrl', k} : Treatment vs. Control comparisons are
               performed. The control is group number k as returned when
               POSTHOC is set to 'none'.

             o a two-column numeric matrix defining each pair of comparisons,
               where each number in the matrix corresponds to the row number
               of the estimated-marginal means as listed for the same value(s)
               of DIM and when POSTHOC is set to 'none'. The comparison
               corresponds to the row-wise differences: column 1 - column 2.
               See demo 6 for an example.

          All of the posthoc comparisons use the step-down max |T| procedure
          to control the type I error rate, but the confidence intervals are
           not adjusted for multiple comparisons [3]. If the 'standardize'
           input argument is set to 'on', the estimates, confidence intervals
          bootstrap statistics for the comparisons are converted to estimates
          of Cohen's d standardized effect sizes. Cohen's d here is calculated
          from the residual standard deviation, which is estimated from the
          standard errors and the sample sizes. As such, the effect sizes
          calculated exclude variability attributed to other predictors in
          the model.
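
           The posthoc options can be sketched as follows (hypothetical
           predictors g1 and g2):

           ```octave
           % Pairwise comparisons of marginal means along the first predictor
           stats = bootlm (y, {g1, g2}, 'dim', 1, 'posthoc', 'pairwise');

           % Treatment vs. control, taking group 2 (as listed when POSTHOC
           % is 'none') as the control
           stats = bootlm (y, {g1, g2}, 'dim', 1, ...
                           'posthoc', {'trt_vs_ctrl', 2});
           ```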

     '[...] = bootlm (Y, GROUP, ..., 'seed', SEED)' initialises the Mersenne
     Twister random number generator using an integer SEED value so that
     'bootlm' results are reproducible.

     'bootlm' can return up to four output arguments:

     'STATS = bootlm (...)' returns a structure with the following fields:
       - 'method': The bootstrap method
       - 'name': The names of each of the estimates
       - 'estimate': The value of the estimates
       - 'CI_lower': The lower bound(s) of the confidence/credible interval(s)
       - 'CI_upper': The upper bound(s) of the confidence/credible interval(s)
       - 'pval': The p-value(s) for the hypothesis that the estimate(s) == 0
       - 'fpr': The minimum false positive risk (FPR) for each p-value [4].
       - 'N': The number of independent sampling units used to compute CIs
       - 'prior': The prior used for Bayesian bootstrap. This will return a
                  scalar for regression coefficients, or a P x 1 or P x 2
                  matrix for estimated marginal means or posthoc tests
                  respectively, where P is the number of parameter estimates.
       - 'levels': A cell array with the levels of each predictor.
       - 'contrasts': A cell array of contrasts associated with each predictor.

          Note that the p-values returned are truncated at the resolution
          limit determined by the number of bootstrap replicates, specifically 
           1 / (NBOOT + 1). Values for the minimum false positive risk are
          computed using the Sellke-Berger approach. The following fields are
          only returned when 'estimate' corresponds to model regression
          coefficients: 'levels' and 'contrasts'.

     '[STATS, BOOTSTAT] = bootlm (...)' also returns a P x NBOOT matrix of
     bootstrap statistics for the estimated parameters, where P is the number
     of parameters estimated in the model. Depending on the DIM and POSTHOC
     input arguments set by the user, the estimated parameters whose bootstrap
     statistics are returned will be either regression coefficients, the
     estimated marginal means, or the mean differences between groups of a
     categorical predictor for posthoc testing.

     '[STATS, BOOTSTAT, AOVSTAT] = bootlm (...)' also computes bootstrapped
     ANOVA statistics* and returns them in a structure with the following
     fields: 
       - 'MODEL': The formula of the linear model(s) in Wilkinson's notation
       - 'SS': Sum-of-squares
       - 'DF': Degrees of freedom
       - 'MS': Mean-squares
       - 'F': F-Statistic
       - 'PVAL': p-values
       - 'FPR': The minimum false positive risk for each p-value [4]
       - 'SSE': Sum-of-Squared Error
       - 'DFE': Degrees of Freedom for Error
       - 'MSE': Mean Squared Error

       The ANOVA implemented uses sequential (type I) sums-of-squares and so
       the results and their interpretation depend on the order** of predictors
       in the GROUP variable (when the design is not balanced). Thus, the null
       model used for comparison for each model is the model listed directly
       above it in AOVSTAT; for the first model, the null model is the
       intercept-only model. The procedure here for bootstrapping ANOVA is
       performed with the null hypothesis imposed and requires that the method
       used is 'wild' bootstrap AND that no other statistics are requested 
       (i.e. estimated marginal means or posthoc tests).

       The bootlm function treats all model predictors as fixed effects during
       ANOVA tests. Note also that the bootlm function can be used to compute
       p-values for ANOVA with accounting for dependence structures such as
       block or nested designs by wild cluster bootstrap resampling (see the 
       'clustid' or 'blocksz' options).

       ** See demo 7 for an example of how to obtain results for ANOVA using
          type II sums-of-squares, which test hypotheses that give results
          invariant to the order of the predictors, regardless of whether
          sample sizes are equal or not.
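
        Requesting AOVSTAT can be sketched as follows (hypothetical
        predictors g1 and g2; note that the method must be 'wild' and no
        'dim' or 'posthoc' options may be set):

        ```octave
        [stats, bootstat, aovstat] = bootlm (y, {g1, g2});
        disp (aovstat.F)     % F-statistics for sequentially added predictors
        disp (aovstat.PVAL)  % bootstrap p-values
        ```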

     '[STATS, BOOTSTAT, AOVSTAT, PRED_ERR] = bootlm (...)' also computes
     refined bootstrap estimates of prediction error* and returns statistics
     derived from it in a structure containing the following fields:
       - 'MODEL': The formula of the linear model(s) in Wilkinson's notation
       - 'PE': Bootstrap estimate of prediction error [5]
       - 'PRESS': Bootstrap estimate of predicted residual error sum of squares
       - 'RSQ_pred': Bootstrap estimate of predicted R-squared
       - 'EIC': Extended (Efron) Information Criterion [6]
       - 'RL': Relative likelihood (compared to the intercept-only model)
       - 'Wt': EIC expressed as weights

       The linear models evaluated are the same as for AOVSTAT, except that the 
       output also includes the statistics for the intercept-only model. Note
       that PRED_ERR statistics are only returned when the method used is
       'wild' bootstrap AND when no other statistics are requested (i.e.
       estimated marginal means or posthoc tests). Computations of the
       statistics in PRED_ERR are compatible with the 'clustid' and 'blocksz'
        options. Note that it is possible (and not unusual) to get a negative
        value for RSQ_pred, particularly for the intercept-only model.
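
      Requesting PRED_ERR can be sketched as follows (same hypothetical
      model as above, with the same 'wild'-only restriction):

      ```octave
      [stats, bootstat, aovstat, pred_err] = bootlm (y, {g1, g2});
      disp (pred_err.RSQ_pred)  % predicted R-squared; can be negative
      disp (pred_err.Wt)        % EIC expressed as model weights
      ```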

     * If the parallel computing toolbox (Matlab) or package (Octave) is
       installed and loaded, then these computations will be automatically
       accelerated by parallel processing on platforms with multiple processors.

     '[STATS, BOOTSTAT, AOVSTAT, PRED_ERR, MAT] = bootlm (...)' also returns
     a structure containing the design matrix of the predictors (X), the raw
     regression coefficients (b), the outcome (Y), the column vector (ID) of
     arbitrary numeric identifiers for the independent sampling units, the 
     cell array of contrast/coding matrices (C) used to generate X, and the
     hypothesis matrix (L) for computing any specified estimated marginal means
     or posthoc comparisons.

     'MAT = bootlm (Y, GROUP, ..., 'nboot', 0)'

       <> Performs the least-squares fit only and returns MAT without any 
          bootstrap resampling. All other input options are accepted (even if
          some are ignored), but only one output argument can be requested.
           The 'display' option, if true, creates the figure of diagnostic
           plots but does not print any results to stdout.

  Bibliography:
  [1] Penn, A.C. statistics-resampling manual: `bootwild` function reference.
        https://gnu-octave.github.io/statistics-resampling/function/bootwild.html 
        and references therein. Last accessed 02 Sept 2024.
  [2] Penn, A.C. statistics-resampling manual: `bootbayes` function reference.
        https://gnu-octave.github.io/statistics-resampling/function/bootbayes.html
        and references therein. Last accessed 02 Sept 2024.
  [3] Westfall, P. H., & Young, S. S. (1993). Resampling-Based Multiple 
        Testing: Examples and Methods for p-Value Adjustment. Wiley.
  [4] David Colquhoun (2019) The False Positive Risk: A Proposal Concerning
        What to Do About p-Values, The American Statistician, 73:sup1, 192-201
  [5] Efron and Tibshirani (1993) An Introduction to the Bootstrap. 
        New York, NY: Chapman & Hall. pg 247-252
  [6] Konishi & Kitagawa (2008), "Bootstrap Information Criterion" In: 
        Information Criteria and Statistical Modeling. Springer Series in
        Statistics. Springer, NY.

  bootlm (version 2024.07.08)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Uses bootstrap to calculate confidence intervals (and p-values) for the
 reg...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
bootmode


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2728
 Uses bootstrap to evaluate the likely number of real peaks (i.e. modes)
 in the distribution of a single set of data.

 -- Function File: H = bootmode (X, M)
 -- Function File: H = bootmode (X, M, NBOOT)
 -- Function File: H = bootmode (X, M, NBOOT, KERNEL)
 -- Function File: H = bootmode (X, M, NBOOT, KERNEL, NPROC)
 -- Function File: [H, P] = bootmode (X, M, ...)
 -- Function File: [H, P, CRITVAL] = bootmode (X, M, ...)

     'H = bootmode (X, M)' tests whether the distribution underlying the 
     univariate data in vector X has M modes. The method employs the
     smooth bootstrap as described [1]. The parsimonious approach is to
     iteratively call this function, each time incrementally increasing
     the number of modes until the null hypothesis (H0) is accepted (i.e.
     H=0), where H0 corresponds to the number of modes being equal to M. 
        - If H = 0, H0 cannot be rejected at the 5% significance level.
        - If H = 1, H0 can be rejected at the 5% significance level.

     'H = bootmode (X, M, NBOOT)' sets the number of bootstrap replicates.

     'H = bootmode (X, M, NBOOT, KERNEL)' sets the kernel for kernel
     density estimation. Possible values are:
        o 'Gaussian' (default)
        o 'Epanechnikov'

     'H = bootmode (X, M, NBOOT, KERNEL, NPROC)' sets the number of parallel
      processes to use to accelerate computations. This feature requires the
      Parallel package (in Octave), or the Parallel Computing Toolbox (in
      Matlab).

     '[H, P] = bootmode (X, M, ...)' also returns the two-tailed p-value of
      the bootstrap hypothesis test.

     '[H, P, CRITVAL] = bootmode (X, M, ...)' also returns the critical
     bandwidth (i.e. the smallest bandwidth achievable to obtain a kernel
     density estimate with M modes).
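
     The parsimonious, iterative use of bootmode described above can be
     sketched as follows (hypothetical bimodal sample):

     ```octave
     % Increment M until H0 (the data have M modes) is accepted (H = 0)
     x = [randn(50, 1); randn(50, 1) + 4];
     m = 0;
     H = 1;
     while (H == 1)
       m = m + 1;
       [H, P] = bootmode (x, m, 2000);
     end
     fprintf ('Likely number of modes: %d (p = %.3f)\n', m, P)
     ```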

  Bibliography:
  [1] Efron and Tibshirani. Chapter 16 Hypothesis testing with the
       bootstrap in An introduction to the bootstrap (CRC Press, 1994)

  bootmode (version 2023.05.02)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Uses bootstrap to evaluate the likely number of real peaks (i.e. modes)
 in ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
bootridge


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 30261
 Empirical Bayes penalized regression for univariate or multivariate outcomes, 
 with shrinkage tuned to minimize prediction error computed by .632 bootstrap.

 -- Function File: bootridge (Y, X)
 -- Function File: bootridge (Y, X, CATEGOR)
 -- Function File: bootridge (Y, X, CATEGOR, NBOOT)
 -- Function File: bootridge (Y, X, CATEGOR, NBOOT, ALPHA)
 -- Function File: bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L)
 -- Function File: bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L, DEFF)
 -- Function File: bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L, DEFF, SEED)
 -- Function File: bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L, DEFF, SEED, TOL)
 -- Function File: S = bootridge (Y, X, ...)
 -- Function File: [S, YHAT] = bootridge (Y, X, ...)

      'bootridge (Y, X)' fits an empirical Bayes ridge regression model using
      a linear Normal (Gaussian) likelihood with an empirical Bayes normal
      ridge prior on the regression coefficients. The ridge tuning constant
      (lambda) is optimized via .632 bootstrap-based machine learning (ML) to
      minimize out-of-bag prediction error [1, 2]. Y is an m-by-q matrix of
      outcomes and X is an m-by-n design matrix whose first column must
      correspond to an intercept term. If an intercept term (a column of ones)
      is not found in the first column of X, one is added automatically. If any
      rows of X or Y contain missing values (NaN) or infinite values (+/- Inf),
      the corresponding observations are omitted before fitting.

      For each outcome, the function prints posterior summaries for regression
      coefficients or linear estimates, including posterior means, equal-tailed
      credible intervals, Bayes factors (lnBF10), and the marginal prior used
      for inference. When multiple outcomes are fitted (q > 1), the function
      additionally prints posterior summaries for the residual correlations
      between outcomes, reported as unique (lower-triangular) outcome pairs.
      For each correlation, the printed output includes the estimated
      correlation and its credible interval.

      Interpretation note (empirical Bayes):
        Bayes factors reported by 'bootridge' are empirical-Bayes
        approximations based on a data-tuned ridge prior. They are best viewed
        as model-comparison diagnostics (evidence on a predictive,
        information-theoretic scale) rather than literal posterior odds under
        a fully specified prior [3-5]. The log scale (lnBF10) is numerically
        stable and recommended for interpretation; BF10 may be shown as 0 or
        Inf when beyond machine range, while lnBF10 remains finite. These
        Bayesian statistics converge to standard conjugate Bayesian evidence
        as the effective residual degrees of freedom (df_t) increase.

      For convenience, the statistics-resampling package also provides the
      function `bootlm`, which offers a user-friendly but feature-rich interface
      for fitting univariate linear models with continuous and categorical
      predictors. The design matrix X and hypothesis matrix L returned in the
      MAT-file produced by `bootlm` can be supplied directly to `bootridge`.
      The outputs of `bootlm` also provide a consistent definition of the model
      coefficients, thereby facilitating interpretation of parameter estimates,
      contrasts, and posterior summaries. The design matrix X and hypothesis
      matrix L can also be obtained in the same way from one of the outcomes
      of a multivariate data set, and then used to fit all the outcomes with
      bootridge.

      'bootridge (Y, X, CATEGOR)' specifies the predictor columns that
      correspond to categorical variables. CATEGOR must be a scalar or vector
      of integer column indices referring to columns of X (excluding the
      intercept). Alternatively, if all predictor terms are categorical, set
      CATEGOR to 'all' or '*'. CATEGOR does NOT create or modify dummy or
      contrast coding; users are responsible for supplying an appropriately
      coded design matrix X. The indices in CATEGOR are used to identify
      predictors that represent categorical variables, even when X is already
      coded, so that variance-based penalty scaling is not applied to these
      terms.

      For categorical predictors in ridge regression, use meaningful centered
      and preferably orthogonal (e.g. Helmert or polynomial) contrasts whenever
      possible, since shrinkage occurs column-wise in the coefficient basis.
      Orthogonality leads to more stable shrinkage and tuning of the ridge
      parameter. Although the prior is not rotationally invariant, Bayes
      factors for linear contrasts defined via a hypothesis matrix (L) are
      typically more stable when the contrasts defining the coefficients are
      orthogonal.

      'bootridge (Y, X, CATEGOR, NBOOT)' sets the number of bootstrap samples
      used to estimate the .632 bootstrap prediction error. The bootstrap* has
      first order balance to improve the efficiency for variance estimation,
      and utilizes bootknife (leave-one-out) resampling to guarantee
      observations in the out-of-bag samples. The default value of NBOOT is
      100, but more resamples are recommended to reduce Monte Carlo error.

      The bootstrap tuning of the ridge parameter relies on resampling
      functionality provided by the statistics-resampling package. In
      particular, `bootridge` depends on the functions `bootstrp` and `boot` to
      perform balanced bootstrap and bootknife (leave-one-out) resampling and
      generate out-of-bag samples. These functions are required for estimation
      of the .632 bootstrap prediction error used to select the ridge tuning
      constant.

      'bootridge (Y, X, CATEGOR, NBOOT, ALPHA)' sets the central mass of equal-
      tailed credible intervals (CI) to (1 - ALPHA) with probability mass
      ALPHA/2 in each tail, and sets the threshold for the adjusted stability
      selection (SS) probabilities of the regression coefficients to (1 - ALPHA).
      ALPHA must be a scalar value between 0 and 1. The default value of ALPHA
      is 0.05 for 95% intervals.

      'bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L)' specifies a hypothesis
      matrix L of size n-by-c defining c linear contrasts or model-based
      estimates of the regression coefficients. In this case, posterior
      summaries and credible intervals are reported for the linear estimates
      rather than the model coefficients.

      'bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L, DEFF)' specifies a design
      effect used to account for clustering or dependence. DEFF inflates the
      posterior covariance and reduces the effective degrees of freedom (df_t) 
      to ensure Bayes factors and intervals are calibrated for the effective 
      sample size. For a mean, Kish's formula DEFF = 1+(g-1)*r (where g is 
      cluster size) suggests an upper bound of g. However, for regression 
      slopes, the realized DEFF depends on the predictor type: it can exceed 
      g for between-cluster predictors or be less than 1 for within-cluster 
      predictors. DEFF is best estimated as the ratio of clustered-to-i.i.d. 
      sampling variances - please see DETAIL below. Default DEFF is 1.

      'bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L, DEFF, SEED)' initialises the
      Mersenne Twister random number generator using an integer SEED value so
      that the bootstrap results are reproducible. The Monte Carlo error of
      the results can be assessed by repeating the analysis multiple times,
      each time with a different random seed.

      'bootridge (Y, X, CATEGOR, NBOOT, ALPHA, L, DEFF, SEED, TOL)' controls
      the convergence tolerance for optimizing the ridge tuning constant lambda
      on the log10 scale. Hyperparameter optimization terminates when the width
      of the current bracket satisfies:

          log10 (lambda_high) - log10 (lambda_low) < TOL.

      Thus, TOL determines the relative (multiplicative) precision of lambda.
      The default value TOL = 0.005 corresponds to approximately a 1% change in
      lambda, which is typically well below the Monte Carlo noise of the .632
      bootstrap estimate of prediction error.
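For intuition, the serial golden-section variant of this search can be sketched as follows. This is a minimal illustration, not the package's implementation; `pred_err` is a placeholder for the .632 bootstrap prediction error as a function of lambda:

```python
import math

def golden_section_log10(pred_err, lo, hi, tol=0.005):
    """Minimize pred_err(lambda) over [10**lo, 10**hi] on the log10
    scale, stopping when the bracket width falls below tol.
    (For brevity, pred_err is re-evaluated at both probe points each
    iteration rather than cached.)"""
    invphi = (math.sqrt(5) - 1) / 2            # 1 / golden ratio
    a, b = lo, hi
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    while (b - a) > tol:
        if pred_err(10 ** c) < pred_err(10 ** d):
            b, d = d, c                        # shrink bracket to [a, d]
            c = b - invphi * (b - a)
        else:
            a, c = c, d                        # shrink bracket to [c, b]
            d = a + invphi * (b - a)
    return 10 ** ((a + b) / 2)

# Toy error surface with a minimum at lambda = 10
lam = golden_section_log10(lambda l: (math.log10(l) - 1) ** 2, -3, 3)
```

Because the bracket width is measured on the log10 scale, the returned lambda is accurate to a fixed multiplicative factor, matching the behaviour of TOL described above.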

      * If sufficient parallel resources are available (three or more workers),
        the optimization uses a parallel k‑section search; otherwise, a serial
        golden‑section search is used. The tolerance TOL applies identically
        in both cases. The benefit of parallel processing is most evident when
        NBOOT is very large. In GNU Octave, the maximum number of workers used
        can be set by the user before running bootridge, for example, for three
        workers with the command:

          setenv ('OMP_NUM_THREADS', '3')

      'S = bootridge (Y, X, ...)' returns a structure containing posterior
      summaries including posterior means, credibility intervals, Bayes factors,
      prior summaries, the bootstrap-optimized ridge parameter, residual
      covariance estimates, and additional diagnostic information.

      The output S is a structure containing the following fields (listed in
      order of appearance):

        o coefficient
            n-by-q matrix of posterior mean regression coefficients for each
            outcome when no hypothesis matrix L is specified.

        o estimate
            c-by-q matrix of posterior mean linear estimates when a hypothesis
            matrix L is specified. This field is returned instead of
            'coefficient' when L is non-empty.

        o CI_lower
            Matrix of lower bounds of the (1 - ALPHA) credibility intervals
            for coefficients or linear estimates. Dimensions match those of
            'coefficient' or 'estimate'.

        o CI_upper
            Matrix of upper bounds of the (1 - ALPHA) credibility intervals
            for coefficients or linear estimates. Dimensions match those of
            'coefficient' or 'estimate'.

        o BF10
            Matrix of Bayes factors (BF10) for testing whether each regression
            coefficient or linear estimate equals zero, computed using the
            Savage–Dickey density ratio. Values may be reported as 0 or Inf
            when outside floating‑point range; lnBF10 remains finite and is
            the recommended evidential scale.

        o lnBF10
            Matrix of natural logarithms of the Bayes factors (BF10). Positive
            values indicate evidence in favour of the alternative hypothesis,
            whereas negative values indicate evidence in favour of the null.
              lnBF10 < -1  is approx. BF10 < 0.3
              lnBF10 > +1  is approx. BF10 > 3.0

        o prior
            Cell array describing the marginal inference-scale prior used for
            each coefficient or estimate in Bayes factor computation.
            Reported as 't (mu, sigma, df_t)' on the coefficient (or estimate)
            scale; see CONDITIONAL VS MARGINAL PRIORS for details.

        o Deff
            Design effect used to inflate the residual covariance and reduce
            inferential degrees of freedom to account for clustering.

        o lambda
            Scalar ridge tuning constant selected by minimizing the .632
            bootstrap estimate of prediction error (then scaled by DEFF).

        o df_lambda
            Effective residual degrees of freedom under ridge regression,
            defined as m minus the trace of the ridge hat matrix. Used for
            residual variance estimation (scale); does NOT include DEFF.

        o df_t
            Inferential degrees of freedom, which is df_lambda adjusted for
            the design effect.

        o Sigma_Y_hat
            Estimated residual covariance matrix of the outcomes, inflated by
            the design effect DEFF when applicable. For a univariate outcome,
            this reduces to the residual variance.

        o tau2_hat
            Estimated prior covariance of the regression coefficients across
            outcomes, proportional to Sigma_Y_hat and inversely proportional
            to the ridge parameter lambda.

        o Sigma_Beta
            Cell array of posterior covariance matrices of the regression
            coefficients. Each cell corresponds to one outcome and contains
            the covariance matrix for that outcome.

        o nboot
            Number of bootstrap samples used to estimate the .632 bootstrap
            prediction error.

        o tol
            Numeric tolerance used in the golden-section search for optimizing
            the ridge tuning constant.

        o iter
            Number of iterations performed by the golden-section search.

        o pred_err
            The minimized prediction error calculated using the optimal lambda.
            Note that the pred_err calculation is summed over the outcome
            variables (columns) from the fit on Y (after internal
            standardization), using the predictors X (after internal
            centering).

        o RSQ_pred
            The predicted R-squared for the fit across all the outcomes.
               RSQ_pred (average) = 1 - (S.pred_err / q),
            where q is the number of outcomes.

        o stability
            The probabilities that the sign of the regression coefficients
            remained consistent across max(nboot,1999) bootstrap resamples [6].
            Raw probabilities are smoothed using a Jeffreys prior and, if
            applicable, adjusted by the design effect (Deff). In the printed
            summary, stability exceeding (1 - ALPHA / 2) is indicated by (+)
            or (-) to denote the consistent direction of the effect.

        o RTAB
            Matrix summarizing residual correlations (strictly lower-
            triangular pairs). The columns correspond to outcome J, outcome I,
            the correlation coefficient, and its credible interval bounds.

            Credible intervals for correlations are computed on Fisher’s z
            [7] using a t‑based sampling distribution with effective degrees 
            of freedom df_t, and then back‑transformed. See CONDITIONAL VS 
            MARGINAL PRIORS and DETAIL below. Diagonal entries are undefined
            and not included.

        o P_vec 
            A vector of predictor-wise penalty weights used to normalize
            shrinkage across the predictor terms.

      '[S, YHAT] = bootridge (Y, X, ...)' returns fitted values.

      DETAIL: The model implements an empirical Bayes ridge regression that
      simultaneously addresses the problems of multicollinearity, multiple 
      comparisons, and clustered dependence. The sections below provide
      detail on the applications to which this model is well suited and the
      principles of its operation.

      REGULARIZATION AND MULTIPLE COMPARISONS: 
      Unlike classical frequentist methods (e.g., Bonferroni) that penalize 
      inference-stage decisions (p-values), `bootridge` penalizes the estimates 
      themselves via shrinkage. By pooling information across all predictors to 
      learn the global penalty (lambda), the model automatically adjusts its 
      skepticism to the design's complexity. This provides a principled 
      probabilistic alternative to family-wise error correction: noise-driven 
      effects are shrunken toward zero, while stable effects survive the 
      penalty. This "Partial Pooling" ensures that Bayes factors are 
      appropriately conservative without the catastrophic loss of power 
      associated with classical post-hoc adjustments [8, 9]. See later section
      on STATISTICAL INFERENCE AND ERROR CONTROL.

      PREDICTIVE OPTIMIZATION:
      The ridge tuning constant (hyperparameter) is selected empirically by
      minimizing the .632 bootstrap estimate of prediction error [1, 2]. This
      aligns lambda with minimum estimated out‑of‑sample mean squared
      prediction error, ensuring the model is optimized for generalizability
      rather than mere in-sample fit [10–12]. This lambda in turn determines the
      scale of the Normal ridge prior used to shrink slope coefficients toward
      zero [13].

      CONDITIONAL VS MARGINAL PRIORS:
      The ridge penalty (lambda) corresponds to a Normal prior on the
      regression coefficients CONDITIONAL on the residual variance:
          Beta | sigma^2 ~ Normal(0, tau^2 * sigma^2),
      where tau^2 is determined by lambda. This conditional Normal prior
      fully defines the ridge objective function and is held fixed during
      lambda optimisation (prediction-error minimisation).

      For inference, however, uncertainty in the residual variance is
      explicitly acknowledged. Integrating over variance uncertainty under
      an empirical‑Bayes approximation induces a marginal Student’s t
      distribution for coefficients and linear estimates, which is used
      for credible intervals and Bayes factors.

      PRIOR CALIBRATION & DATA INDEPENDENCE:
      To prevent circularity in the prior selection, lambda is optimized 
      solely by minimizing the .632 bootstrap out-of-bag (OOB) error. 
      This ensures the prior precision is determined by the model's 
      ability to predict "unseen" observations (data points not used 
      for the coefficient estimation in a given bootstrap draw), 
      thereby maintaining a principled separation between the data used 
      for likelihood estimation and the data used for prior tuning.

      STABILITY SELECTION:
      The directional reproducibility of the sign of the regression
      coefficients under resampling is quantified and reported as Stability
      Selection (SS). It is possible for a shrunken coefficient to be highly
      stable in sign despite having only an anecdotal Bayes factor.

      BAYES FACTORS:
      For regression coefficients and linear estimates, Bayes factors are
      computed using the Savage–Dickey density ratio evaluated on the
      marginal inference scale. Prior and posterior densities are Student’s
      t distributions with shared degrees of freedom (df_t), reflecting
      uncertainty in the residual variance under an empirical‑Bayes
      approximation [3–5].
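The Savage–Dickey ratio itself is simple to compute once the marginal t densities are available. The sketch below (with made-up prior and posterior parameters, not the package's internal code) evaluates BF10 as the ratio of the prior to the posterior density at zero:

```python
import math

def t_pdf(x, mu, sigma, df):
    """Density of a Student's t distribution with location mu,
    scale sigma and df degrees of freedom."""
    z = (x - mu) / sigma
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    c /= math.sqrt(df * math.pi) * sigma
    return c * (1 + z * z / df) ** (-(df + 1) / 2)

def savage_dickey_bf10(prior_scale, post_mean, post_scale, df_t):
    """BF10 = prior density at zero / posterior density at zero,
    both evaluated on the marginal (t) inference scale."""
    return (t_pdf(0.0, 0.0, prior_scale, df_t)
            / t_pdf(0.0, post_mean, post_scale, df_t))

# A posterior pulled well away from zero yields evidence for H1
bf10 = savage_dickey_bf10(prior_scale=1.0, post_mean=0.8,
                          post_scale=0.2, df_t=30)
```

When the posterior density at zero underflows, BF10 is reported as Inf; working with lnBF10 (here, `math.log(bf10)`) avoids this, as noted for the output fields above.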

      For residual correlations between outcomes, credible intervals are 
      computed on Fisher’s z [7] with effective degrees of freedom df_t and 
      then back‑transformed to r.
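As a rough numeric sketch of this construction (the critical value and the classical standard error 1/sqrt(df_t - 1) are illustrative stand-ins for the exact internal quantities):

```python
import math

def fisher_z_ci(r, df_t, tcrit):
    """Equal-tailed interval for a correlation: transform r to
    Fisher's z, build a t-based interval on the z scale, then
    back-transform the endpoints with tanh."""
    z = math.atanh(r)                   # Fisher's z transform
    se = 1.0 / math.sqrt(df_t - 1)      # illustrative standard error
    return math.tanh(z - tcrit * se), math.tanh(z + tcrit * se)

# tcrit ~ 97.5th percentile of a t distribution with df_t = 28
lo, hi = fisher_z_ci(r=0.5, df_t=28, tcrit=2.05)
```

The back-transformation keeps the interval inside (-1, 1) and makes it asymmetric about r, as expected for correlations away from zero.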

      SUMMARY OF PRIORS:
      The model employs the following priors for empirical Bayes inference:

        o Intercept: Improper flat/Uniform prior, U(-Inf, Inf).

        o Slopes: Marginal Student’s t prior on the coefficient (or estimate)
          scale, t(0, sigma_prior, df_t), with scale determined by the
          bootstrap‑optimised ridge parameter (lambda) and design effect
          DEFF.

          In the limit (high df_t), the inferential framework converges to a 
          Normal-Normal conjugate prior where the prior precision is 
          determined by the optimized lambda. At lower df_t, the function 
          provides more robust, t-marginalized inference to account for 
          uncertainty in the error variance.

        o Residual Variance: Implicit (working) Inverse-Gamma prior,
          Inv-Gamma(df_t/2, Sigma_Y_hat), induced by variance estimation
          and marginalization and used to generate the t-layer.

        o Correlations: An improper flat prior is assumed on Fisher’s z
          transform of the correlation coefficients. Under this prior, the
          posterior for z is proportional to the t‑based sampling distribution
          implied by the effective degrees of freedom df_t.

      UNCERTAINTY AND CLUSTERING:
      The design effect specified by DEFF is integrated throughout the model
      consistent with its definition:
              DEFF(parameter) =  Var_true(parameter) / Var_iid(parameter)
      This guards against dependence between observations leading to anti-
      conservative inference. This adjustment occurs at three levels:

      1. Prior Learning: The ridge tuning constant (lambda) is selected by
         minimizing predictive error on the i.i.d. bootstrap scale and then 
         divided by DEFF. This "dilutes" the prior precision, ensuring that
         the prior variance reflects the effective sample size:
             lambda_iid   = sigma^2 / tau^2_iid
             tau^2_true   = DEFF * tau^2_iid
             lambda_true  = sigma^2 / tau^2_true = lambda_iid / DEFF
         where sigma^2 (a.k.a. Sigma_Y_hat) is residual variance (data space)
         and tau^2 (a.k.a. tau2_hat) is the prior variance (parameter space).

      2. Scale Estimation: Residual variance (Sigma_Y_hat) is estimated using
         the ridge-adjusted degrees of freedom (df_lambda = m - trace(H_lambda))
         and is then inflated by a factor of DEFF. This yields an "effective"
         noise scale on the derived parameter statistics that accounts for
         within-cluster correlation [14, 15] according to:
             Var_true(beta_hat) = DEFF * Var_iid(beta_hat)

      3. Inferential Shape: A marginal Student’s t layer is used for all
         quantiles and Bayes factors to propagate uncertainty in the
         residual variance and effective sample size. To prevent over-
         certainty in small-cluster settings, the inferential degrees of
         freedom are reduced: 
             df_t = (m / DEFF) - trace (H_lambda), where m is size (Y, 1)
         This ensures that both the scale (width) and the shape (tails) of the
         posterior distributions are calibrated for the effective sample size.
         The use of t‑based adjustments is akin to placing an Inverse-Gamma
         prior (alpha = df_t / 2, beta = Sigma_Y_hat) on the residual variance
         and is in line with classical variance component approximations (e.g.,
         Satterthwaite/Kenward–Roger) and ridge inference recommendations
         [16–18].

      4. Stability Selection: The sign-consistency probabilities (denoted as
         stability) under bootstrap resampling are adjusted for the design
         effect via a Probit-link transformation: 
            Phi ( Phi^-1(stability) / sqrt (Deff) )
         where Phi and Phi^-1 are the cumulative standard normal distribution
         function and its inverse, respectively. This adjustment ensures that
         the reported stability reflects the effective sample size rather than
         the raw number of observations, preventing over-certainty in the
         presence of clustered or dependent data.
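The arithmetic of the four adjustments above can be checked with a short sketch; all numeric values below are arbitrary illustrations, not outputs of a real fit:

```python
import math
from statistics import NormalDist

# Arbitrary illustrative values
sigma2, tau2_iid, deff = 4.0, 0.5, 2.0

# 1. Prior learning: lambda chosen on the i.i.d. scale is diluted by DEFF
lambda_iid  = sigma2 / tau2_iid            # sigma^2 / tau^2_iid
tau2_true   = deff * tau2_iid
lambda_true = sigma2 / tau2_true           # equals lambda_iid / DEFF

# 2. Scale estimation: residual variance is inflated by DEFF
m, trace_H  = 100, 6.0
df_lambda   = m - trace_H                  # used to estimate sigma^2
sigma2_true = deff * sigma2                # Var_true = DEFF * Var_iid

# 3. Inferential shape: reduced degrees of freedom for the t layer
df_t = m / deff - trace_H

# 4. Stability selection: probit-link shrinkage of sign probabilities
nd = NormalDist()
def adjust_stability(p, d):
    return nd.cdf(nd.inv_cdf(p) / math.sqrt(d))
```

With DEFF = 1 the probit adjustment leaves the stability probabilities unchanged; as DEFF grows, they are pulled toward 0.5, reflecting the smaller effective sample size.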

      ESTIMATING THE DESIGN EFFECT:
      While DEFF = 1 + (g - 1) * r provides a useful analytical upper bound 
      based on cluster size (g) and intraclass correlation (r), the realized 
      impact of dependence on regression slopes often varies by predictor type. 
      For complex designs, DEFF is best estimated as the mean ratio of the 
      parameter variances—obtained from the variances of the bootstrap 
      distributions under a cluster-robust estimator (e.g., wild cluster 
      bootstrap via `bootwild` or cluster-based Bayesian bootstrap via 
      `bootbayes`) relative to an i.i.d. assumption. Supplying this 
      "Effective DEFF" allows `bootridge` to provide analytical Bayesian 
      inference that approximates the results of a full hierarchical or 
      resampled model [14, 15].
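A minimal sketch of this "Effective DEFF" calculation, assuming per-parameter bootstrap variances have already been obtained from a cluster-robust scheme and an i.i.d. scheme (the numbers below are hypothetical):

```python
import statistics

# Hypothetical bootstrap variances for three regression parameters:
# one set under a cluster-robust resampling scheme (e.g. wild
# cluster bootstrap), one under an i.i.d. assumption.
var_clustered = [0.040, 0.036, 0.044]
var_iid       = [0.020, 0.018, 0.022]

# Effective DEFF: mean ratio of clustered to i.i.d. parameter variances
deff = statistics.mean(c / i for c, i in zip(var_clustered, var_iid))
```

The resulting scalar would then be supplied as the DEFF argument to `bootridge`.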

      DIAGNOSTIC ASSESSMENT:
      Users should utilize `bootlm` for formal diagnostic plots (Normal 
      Q-Q, Spread-Location, Cook’s Distance). These tools identify 
      influential observations that may require inspection before or 
      after ridge fitting.

      SUITABILITY: 
      This function is designed for models with continuous outcomes and 
      assumes a linear Normal (Gaussian) likelihood. It is not suitable for 
      binary, count, or categorical outcomes. However, binary and categorical 
      predictors are supported. 

      INTERNAL SCALING AND STANDARDIZATION: 
      All scaling and regularization procedures for optimizing the ridge
      parameter are handled internally to ensure numerical stability and
      balanced, scale-invariant shrinkage. To ensure all outcomes contribute 
      equally to the global regularization regardless of their units, the 
      ridge parameter (lambda) is optimized using internally standardized 
      outcomes. 

      When refitting the model with the optimal ridge parameter, while 
      predictors are maintained on their original scale, the ridge penalty 
      matrix is automatically constructed with diagonal elements proportional 
      to the column variances of X. This ensures that the shrinkage applied 
      to coefficients is equivalent to that of standardized predictors, 
      without requiring manual preprocessing (categorical terms are identified 
      via CATEGOR and are exempt from this variance-based penalty scaling). 
      Following optimization, the final model is refit to the outcomes on 
      their original scale; consequently, all posterior summaries, 
      credibility intervals, and prior standard deviations are reported 
      directly on the original coefficient scale for ease of interpretation.
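The equivalence between variance-scaled penalization and manual standardization can be verified in the single-predictor case. The sketch below uses an illustrative variance normalization; the package's exact scaling conventions may differ:

```python
# One centered predictor, one outcome: ridge with the penalty scaled
# by the predictor's variance matches ridge on the standardized
# predictor, back-transformed to the original scale.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]            # centered predictor
y = [-3.9, -2.1, 0.2, 1.8, 4.0]
lam = 1.5                                   # ridge tuning constant

sxx = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
s2 = sxx / len(x)                           # column variance (illustrative)

# (a) original scale, penalty proportional to the column variance
b_orig = sxy / (sxx + lam * s2)

# (b) standardized predictor, plain penalty, then back-transform
sd = s2 ** 0.5
bs = (sxy / sd) / (sxx / s2 + lam)
b_std = bs / sd
```

Both routes give the same coefficient, which is why no manual preprocessing of the predictors is required.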

      STATISTICAL INFERENCE AND ERROR CONTROL:
      Inference is provided via three complementary metrics: Credibility
      Intervals (CI), Bayes Factors (BF), and Stability Selection (SS)
      probabilities. Conditioned on a bootstrap-optimized ridge penalty, these
      statistics exhibit superior control over Type M (magnitude) and Type S
      (sign) errors relative to unpenalized estimators. The inherent shrinkage
      provides implicit False Discovery Rate (FDR) control for CIs and BFs by
      suppressing noise-driven inflation, providing more conservative global
      error control than unpenalized methods. Conversely, SS probabilities
      prioritize statistical power in sparse or low signal-to-noise ratio (SNR)
      settings; while SS maintains marginal False Positive Rate (FPR) control
      near ALPHA, it lacks the intrinsic FDR protection afforded by shrinkage
      when interpreting multiple simultaneous inferences. The reliability of
      all metrics improves as the Signal-to-Noise Ratio (SNR) increases. 

                           CI           BF           SS   
           FDR-Controlled <----------------------------> FPR-Controlled
        (High Stringency)                                (High Discovery) 

      See the last demo in this function for simulation code and results
      supporting these claims.

      See also: `bootstrp`, `boot`, `bootlm`, `bootbayes` and `bootwild`.

  Bibliography:
  [1] Delaney, N. J. & Chatterjee, S. (1986) Use of the Bootstrap and Cross-
      Validation in Ridge Regression. Journal of Business & Economic Statistics,
      4(2):255–262. https://doi.org/10.1080/07350015.1986.10509520
  [2] Efron, B. & Tibshirani, R. J. (1993) An Introduction to the Bootstrap.
      New York, NY: Chapman & Hall, pp. 247–252.
      https://doi.org/10.1201/9780429246593
  [3] Dickey, J. M. & Lientz, B. P. (1970) The Weighted Likelihood Ratio,
      Sharp Hypotheses about Chances, the Order of a Markov Chain. Ann. Math.
      Statist., 41(1):214–226. (Savage–Dickey)
      https://doi.org/10.1214/aoms/1177697203
  [4] Morris, C. N. (1983) Parametric Empirical Bayes Inference: Theory and
      Applications. JASA, 78(381):47–55. https://doi.org/10.2307/2287098
  [5] Wagenmakers, E.-J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010) 
      Bayesian hypothesis testing for psychologists: A tutorial on the 
      Savage–Dickey method. Cognitive Psychology, 60(3):158–189.
      https://doi.org/10.1016/j.cogpsych.2009.12.001
  [6] Meinshausen, N. & Buhlmann, P. (2010) Stability selection. J. R. Statist.
      Soc. B. 72(4): 417-473. https://doi.org/10.1111/j.1467-9868.2010.00740.x
  [7] Fisher, R. A. (1921) On the "Probable Error" of a Coefficient of
      Correlation Deduced from a Small Sample. Metron, 1:3–32. (Fisher z)
  [8] Gelman, A., Hill, J., & Yajima, M. (2012) Why we usually don't worry 
      about multiple comparisons. J. Res. on Educ. Effectiveness, 5:189–211.
      https://doi.org/10.1080/19345747.2011.618213
  [9] Efron, B. (2010) Large-Scale Inference: Empirical Bayes Methods for 
      Estimation, Testing, and Prediction. Cambridge University Press.
      https://doi.org/10.1017/CBO9780511761362
 [10] Hastie, T., Tibshirani, R., & Friedman, J. (2009) The Elements of
      Statistical Learning (2nd ed.). Springer.
      https://doi.org/10.1007/978-0-387-84858-7
 [11] Ye, J. (1998) On Measuring and Correcting the Effects of Data Mining and
      Model Selection. JASA, 93(441):120–131. (Generalized df)
      https://doi.org/10.1080/01621459.1998.10474094
 [12] Akaike, H. (1973) Information Theory and an Extension of the Maximum
      Likelihood Principle. In: 2nd Int. Symp. on Information Theory. (AIC/KL)
      https://doi.org/10.1007/978-1-4612-0919-5_38
 [13] Hoerl, A. E. & Kennard, R. W. (1970) Ridge Regression: Biased Estimation
      for Nonorthogonal Problems. Technometrics, 12(1):55–67.
      https://doi.org/10.1080/00401706.1970.10488634
 [14] Neuhaus, J. M., & Segal, M. R. (1993). Design effects for binary 
      regression models fitted to dependent data. Statistics in Medicine, 
      12(13):1259–1268. https://doi.org/10.1002/sim.4780121309
 [15] Cameron, A. C., & Miller, D. L. (2015) A Practitioner's Guide to 
      Cluster-Robust Inference. J. Hum. Resour., 50(2):317–372.
      https://doi.org/10.3368/jhr.50.2.317
 [16] Satterthwaite, F. E. (1946) An Approximate Distribution of Estimates of
      Variance Components. Biometrics Bulletin, 2(6):110–114.
      https://doi.org/10.2307/3002019
 [17] Kenward, M. G. & Roger, J. H. (1997) Small Sample Inference for Fixed 
      Effects from Restricted Maximum Likelihood. Biometrics, 53(3):983–997.
      https://doi.org/10.2307/2533558
 [18] Vinod, H. D. (1987) Confidence Intervals for Ridge Regression Parameters.
      In Time Series and Econometric Modelling, pp. 279–300.
      https://doi.org/10.1007/978-94-009-4790-0_19

 bootridge (version 2026.02.18)
 Author: Andrew Charles Penn



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Empirical Bayes penalized regression for univariate or multivariate outcomes...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
bootstrp


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8394
 Bootstrap: Resample with replacement to generate new samples and return the
 statistic(s) calculated by evaluating the specified function on each resample.


 -- Function File: BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D)
 -- Function File: BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D1, ..., DN)
 -- Function File: BOOTSTAT = bootstrp (..., D1, ..., DN, 'Match', MATCH)
 -- Function File: BOOTSTAT = bootstrp (..., 'Options', PAROPT)
 -- Function File: BOOTSTAT = bootstrp (..., 'Weights', WEIGHTS)
 -- Function File: BOOTSTAT = bootstrp (..., 'Strata', STRATA)
 -- Function File: BOOTSTAT = bootstrp (..., 'LOO', LOO)
 -- Function File: BOOTSTAT = bootstrp (..., 'Seed', SEED)
 -- Function File: [BOOTSTAT, BOOTSAM] = bootstrp (...)
 -- Function File: [BOOTSTAT, BOOTSAM, STATS] = bootstrp (...)
 -- Function File: [BOOTSTAT, BOOTSAM, STATS, BOOTOOB] = bootstrp (...)

     'BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D)' draws NBOOT bootstrap resamples
     with replacement from the rows of the data D and returns the statistic
     computed by BOOTFUN in BOOTSTAT [1]. BOOTFUN is a function handle (e.g.
     specified with @) or name, a string indicating the function name, or a
     cell array, where the first cell is one of the above function definitions
     and the remaining cells are (additional) input arguments to that function
     (after the data argument(s)). The third input argument is the data
     (column vector, matrix or cell array), which is supplied to BOOTFUN. This
     function is the only function in the statistics-resampling package to also
     accept cell arrays for the data arguments. The simulation method used by
     default is bootstrap resampling with first order balance [2-3]; see help
     for the 'boot' function for more information.

     'BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D1, ..., DN)' is as above except
     that the third and subsequent input arguments are multiple data objects
     (column vectors, matrices or cell arrays), which are used as input for
     BOOTFUN.

     'BOOTSTAT = bootstrp (..., D1, ..., DN, 'Match', MATCH)' controls the
     resampling strategy when multiple data arguments are provided. When MATCH
     is true, row indices of D1 to DN are the same (i.e. matched) for each
     resample. This is the default strategy when D1 to DN all have the same
     number of rows. If MATCH is set to false, then row indices are resampled
     independently for D1 to DN in each of the resamples. When any of the data
     D1 to DN have a different number of rows, this input argument is ignored
     and MATCH is enforced to have a value of false. Note that the MATLAB
     bootstrp function only operates in a mode equivalent to MATCH = true.
     One application of setting MATCH to false is to perform stratified
     bootstrap resampling.

     'BOOTSTAT = bootstrp (..., 'Options', PAROPT)' specifies options that
     govern if and how to perform bootstrap iterations using multiple
     processors (if the Parallel Computing Toolbox or Octave Parallel package
     is available). This argument is a structure with the following recognised
     fields:
        o 'UseParallel': If true, use parallel processes to accelerate
                         bootstrap computations on multicore machines. 
                         Default is false for serial computation. In MATLAB,
                         the default is true if a parallel pool has already
                         been started. 
        o 'nproc':       sets the number of parallel processes (optional)

     'BOOTSTAT = bootstrp (..., D, 'Weights', WEIGHTS)' sets the resampling
     weights. WEIGHTS must be a column vector with the same number of rows as
     the data, D. If WEIGHTS is empty or not provided, the default is a vector
     of length N with uniform weighting 1/N. 

     'BOOTSTAT = bootstrp (..., D1, ... DN, 'Weights', WEIGHTS)' as above if
     MATCH is true. If MATCH is false, a 1-by-N cell array of column vectors
     can be provided to specify independent resampling weights for D1 to DN.

     'BOOTSTAT = bootstrp (..., D1, ... DN, 'Strata', STRATA)' performs 
     balanced stratified resampling according to the group identifiers in
     STRATA, which must be a vector of length (N) equal to the number of rows 
     in D. This option cannot be used in conjunction with 'Weights', and it
     requires 'Match' = true; as such, it applies the same stratum-wise
     indices to all matched data arguments.

     'BOOTSTAT = bootstrp (..., 'LOO', LOO)' sets the simulation method. If 
     LOO is false, the resampling method used is balanced bootstrap resampling.
     If LOO is true, the resampling method used is balanced bootknife
     resampling [4]. The latter involves creating leave-one-out (jackknife)
     samples of size N - 1, and then drawing resamples of size N with
     replacement from the jackknife samples, thereby incorporating Bessel's
     correction into the resampling procedure. LOO must be a scalar logical
     value. The default value of LOO is false.

     'BOOTSTAT = bootstrp (..., 'Seed', SEED)' initialises the Mersenne Twister
     random number generator using an integer SEED value so that the bootstrap
     results are reproducible.

     '[BOOTSTAT, BOOTSAM] = bootstrp (...)' also returns indices used for
     bootstrap resampling. If MATCH is true or only one data argument is
     provided, BOOTSAM is a matrix. If multiple data arguments are provided
     and MATCH is false, BOOTSAM is returned in a 1-by-N cell array of
     matrices, where each cell corresponds to the respective data argument
     D1 to DN.  To get the output samples BOOTSAM without applying a function,
     set BOOTFUN to empty (i.e. []).

     '[BOOTSTAT, BOOTSAM, STATS] = bootstrp (...)' also calculates and returns
     the following basic statistics relating to each column of BOOTSTAT: 
        - original: the original estimate(s) calculated by BOOTFUN and the DATA
        - mean: the mean of the bootstrap distribution(s)
        - bias: bootstrap estimate of the bias of the sampling distribution(s)
        - bias_corrected: original estimate(s) after subtracting the bias
        - var: bootstrap variance of the original estimate(s)
        - std_error: bootstrap estimate(s) of the standard error(s)
     If BOOTSTAT is not numeric, STATS only returns the 'original' field. If
     BOOTFUN is empty, then the value of the 'original' field is also empty.

     '[BOOTSTAT, BOOTSAM, STATS, BOOTOOB] = bootstrp (...)' also returns the
     out-of-bag samples. If match is true, BOOTOOB is a 1-by-NBOOT cell array
     where each cell contains a vector of indices corresponding to the out-of-
     bag (OOB) sample observations for the corresponding bootstrap sample. If
     match is false, BOOTOOB is a nested 1-by-N cell array for the respective
     data argument D1 to DN, each containing a 1-by-NBOOT cell array of their
     OOB samples. Note that using bootknife resampling (by setting LOO to true)
     guarantees that all OOB samples have at least one observation.

  Bibliography:
  [1] Efron, and Tibshirani (1993) An Introduction to the
        Bootstrap. New York, NY: Chapman & Hall
  [2] Davison et al. (1986) Efficient Bootstrap Simulation.
        Biometrika, 73: 555-66
  [3] Booth, Hall and Wood (1993) Balanced Importance Resampling
        for the Bootstrap. The Annals of Statistics. 21(1):286-298
  [4] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling 
        vs. Smoothing; Proceedings of the Section on Statistics & the 
        Environment. Alexandria, VA: American Statistical Association.

  bootstrp (version 2024.05.24)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Bootstrap: Resample with replacement to generate new samples and return the
...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
bootwild


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7851
 Performs wild bootstrap and calculates bootstrap-t confidence intervals and 
 p-values for the mean, or the regression coefficients from a linear model.

 -- Function File: bootwild (y)
 -- Function File: bootwild (y, X)
 -- Function File: bootwild (y, X, CLUSTID)
 -- Function File: bootwild (y, X, BLOCKSZ)
 -- Function File: bootwild (y, X, ..., NBOOT)
 -- Function File: bootwild (y, X, ..., NBOOT, ALPHA)
 -- Function File: bootwild (y, X, ..., NBOOT, ALPHA, SEED)
 -- Function File: bootwild (y, X, ..., NBOOT, ALPHA, SEED, L)
 -- Function File: STATS = bootwild (y, ...)
 -- Function File: [STATS, BOOTSTAT] = bootwild (y, ...)
 -- Function File: [STATS, BOOTSTAT, BOOTSSE] = bootwild (y, ...)
 -- Function File: [STATS, BOOTSTAT, BOOTSSE, BOOTFIT] = bootwild (y, ...)
 -- Function File: [STATS, BOOTSTAT, BOOTSSE, BOOTFIT, BOOTDAT] = bootwild (y, ...)

     'bootwild (y)' performs a null hypothesis significance test for the
     mean of y being equal to 0. This function performs wild (cluster)
     unrestricted bootstrap-t resampling of the residuals using Webb's 6-point
     distribution, and computes confidence intervals and p-values [1-4]. The
     following statistics are printed to the standard output:
        - original: the mean of the data vector y
        - std_err: heteroscedasticity-consistent standard error(s) (HC1 or CR1)
        - CI_lower: lower bound(s) of the 95% bootstrap-t confidence interval
        - CI_upper: upper bound(s) of the 95% bootstrap-t confidence interval
        - tstat: Student's t-statistic
        - pval: two-tailed p-value(s) for the parameter(s) being equal to 0
        - fpr: minimum false positive risk for the corresponding p-value
          By default, the confidence intervals are symmetric bootstrap-t
          confidence intervals. The minimum false positive risk (FPR) is
          computed according to the Sellke-Berger approach as described [5,6].

     'bootwild (y, X)' also specifies the design matrix (X) for least squares
     regression of y on X. X should be a column vector or matrix with the same
     number of rows as y. If the X input argument is empty, the default for X
     is a column of ones (i.e. intercept only) and thus the statistic computed
     reduces to the mean (as above). The statistics calculated and returned in
     the output then relate to the coefficients from the regression of y on X.

     'bootwild (y, X, CLUSTID)' specifies a vector or cell array of numbers
     or strings respectively to be used as cluster labels or identifiers.
     Rows in y (and X) with the same CLUSTID value are treated as clusters
     with dependent errors. Rows of y (and X) assigned to a particular
     cluster will have identical resampling during wild bootstrap. If empty
     (default), no clustered resampling is performed and all errors are
     treated as independent. The standard errors computed are cluster robust.

     'bootwild (y, X, BLOCKSZ)' specifies a scalar, which sets the block size
     for bootstrapping when the residuals have serial dependence. Identical
     resampling occurs within each (consecutive) block of length BLOCKSZ
     during wild bootstrap. Rows of y (and X) within the same block are
     treated as having dependent errors. If empty (default), no block
     resampling is performed and all errors are treated as independent.
     The standard errors computed are cluster robust.

     'bootwild (y, X, ..., NBOOT)' specifies the number of bootstrap resamples,
     where NBOOT must be a positive integer. If empty, the default value of
     NBOOT is 1999.

     'bootwild (y, X, ..., NBOOT, ALPHA)', where ALPHA is numeric, sets the
     lower and upper bounds of the confidence interval(s). The value(s) of
     ALPHA must be between 0 and 1. ALPHA can either be:
        o scalar: To set the (nominal) central coverage of SYMMETRIC
                  bootstrap-t confidence interval(s) to 100*(1-ALPHA)%.
                  For example, 0.05 for a 95% confidence interval.
        o vector: A pair of probabilities defining the (nominal) lower and
                  upper bounds of ASYMMETRIC bootstrap-t confidence interval(s)
                  as 100*(ALPHA(1))% and 100*(ALPHA(2))% respectively. For
                  example, [.025, .975] for a 95% confidence interval.
        The default value of ALPHA is the scalar: 0.05, for symmetric 95% 
        bootstrap-t confidence interval(s).

     'bootwild (y, X, ..., NBOOT, {ALPHA})' as above, except that p-values
     become independent of the confidence intervals since they are adjusted
     to control the family-wise error rate (FWER) across multiple comparisons
     using the step-down max |T| procedure [7]. Confidence intervals remain
     based on the individual bootstrap-t distribution even when FWER control
     is requested. By default, no multiple comparison procedure is used.

     'bootwild (y, X, ..., NBOOT, ALPHA, SEED)' initialises the Mersenne
     Twister random number generator using an integer SEED value so that
     'bootwild' results are reproducible.

     'bootwild (y, X, ..., NBOOT, ALPHA, SEED, L)' multiplies the regression
     coefficients by the hypothesis matrix L. If L is not provided or is empty,
     it will assume the default value of 1 (i.e. no change to the design). 

     'STATS = bootwild (...)' returns a structure with the following fields:
     original, std_err, CI_lower, CI_upper, tstat, pval, fpr and the sum-of-
     squared error (sse).

     '[STATS, BOOTSTAT] = bootwild (...)' also returns a vector (or matrix) of
     bootstrap statistics (BOOTSTAT) calculated over the bootstrap resamples
     (before studentization).

     '[STATS, BOOTSTAT, BOOTSSE] = bootwild (...)' also returns a vector
     containing the sum-of-squared error for the fit on each bootstrap 
     resample.

     '[STATS, BOOTSTAT, BOOTSSE, BOOTFIT] = bootwild (...)' also returns an
     N-by-NBOOT matrix containing the N fitted values for each of the NBOOT
     bootstrap resamples.

     '[STATS, BOOTSTAT, BOOTSSE, BOOTFIT, BOOTDAT] = bootwild (...)' also
     returns an N-by-NBOOT matrix containing the N data points for each of
     the NBOOT bootstrap resamples.
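     As a usage sketch (hypothetical data; coefficient values and the seed
     are illustrative):

```octave
% Hypothetical regression data: y = 2 + 3*x + noise
n = 20;
x = (1:n)';
y = 2 + 3 * x + randn (n, 1);
X = [ones(n, 1), x];              % design matrix including an intercept

% Wild bootstrap-t with 1999 resamples, 95% intervals and a fixed seed
% (the empty third argument means no clustered or block resampling)
stats = bootwild (y, X, [], 1999, 0.05, 1);

% stats.original holds the coefficient estimates; stats.CI_lower and
% stats.CI_upper hold the bounds of the symmetric 95% bootstrap-t intervals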

  Bibliography:
  [1] Wu (1986). Jackknife, bootstrap and other resampling methods in
        regression analysis (with discussions). Ann Stat. 14: 1261–1350.
  [2] Cameron, Gelbach and Miller (2008) Bootstrap-based Improvements for
        Inference with Clustered Errors. Rev Econ Stat. 90(3), 414-427
  [3] Webb (2023) Reworking wild bootstrap-based inference for clustered
        errors. Can J Econ. https://doi.org/10.1111/caje.12661
  [4] Cameron and Miller (2015) A Practitioner’s Guide to Cluster-Robust
        Inference. J Hum Resour. 50(2):317-372
  [5] Colquhoun (2019) The False Positive Risk: A Proposal Concerning What
        to Do About p-Values, Am Stat. 73:sup1, 192-201
  [6] Sellke, Bayarri and Berger (2001) Calibration of p-values for Testing
        Precise Null Hypotheses. Am Stat. 55(1), 62-71
  [7] Westfall, P. H., & Young, S. S. (1993). Resampling-Based Multiple 
        Testing: Examples and Methods for p-Value Adjustment. Wiley.

  bootwild (version 2024.05.23)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs wild bootstrap and calculates bootstrap-t confidence intervals and ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3
cor


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2038
 Vectorized function for computing Pearson's correlation coefficient (RHO)
 between each of the respective columns in two data vectors or matrices.

 -- Function File: RHO = cor (X, Y)
 -- Function File: R2  = cor (X, Y, 'squared')

     'RHO = cor (X, Y)' computes Pearson's correlation coefficient (RHO)
     between the column vectors X and Y. If X and Y are matrices, then
     RHO will be a row vector corresponding to column-wise correlation
     coefficients. Hence this function is vectorised for rapid computation
     of the correlation coefficient in bootstrap resamples. Note that
     unlike the native @corr function, the correlation coefficients
     returned here are representative of the finite data sample and are
     not unbiased estimates of the population parameter.
 
     cor (X, Y) = ...

     mean ( (X - mean (X)) .* (Y - mean (Y)) ) ./ (std (X, 1) .* std (Y, 1))

     'R2 = cor (X, Y, 'squared')' as above but returns the correlation
     coefficient squared (i.e. the coefficient of determination).

    HINT: To use this function to compute Spearman's rank correlation,
    independently transform X and Y to ranks, with tied observations
    receiving the same average rank, before providing them as input to
    this function.
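    As a usage sketch (hypothetical data; the manual ranking shown assumes
    no tied observations, per the hint above):

```octave
x = [1; 2; 3; 4; 5];
y = [2; 1; 4; 3; 5];

rho = cor (x, y);               % Pearson's correlation coefficient
r2  = cor (x, y, 'squared');    % coefficient of determination (rho^2)

% Spearman's rank correlation: rank-transform X and Y first, then call cor
[~, xi] = sort (x); rx = zeros (size (x)); rx(xi) = (1:numel (x))';
[~, yi] = sort (y); ry = zeros (size (y)); ry(yi) = (1:numel (y))';
rs = cor (rx, ry);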

  cor (version 2023.05.02)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Vectorized function for computing Pearson's correlation coefficient (RHO)
 b...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
credint


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2664
 Computes credible interval(s) directly from a vector (or row-major matrix) of
 the posterior(s) obtained by Bayesian bootstrap.

 -- Function File: CI = credint (BOOTSTAT)
 -- Function File: CI = credint (BOOTSTAT, PROB)

     'CI = credint (BOOTSTAT)' computes 95% credible intervals directly from
     the vector, or rows* of the matrix in BOOTSTAT, where BOOTSTAT contains
     posterior (or Bayesian bootstrap) statistics, such as those generated
     using the `bootbayes` function (or the `bootlm` function with the method
     set to 'bayesian'). The credible intervals are shortest probability
     intervals (SPI), which represent a more computationally stable version
     of the highest posterior density interval [1,2].

        * The matrix should have dimensions P * NBOOT, where P corresponds to
          the number of parameter estimates and NBOOT corresponds to the number
          of posterior (or Bayesian bootstrap) samples.

     'CI = credint (BOOTSTAT, PROB)' returns credible intervals, where PROB is
     numeric and sets the lower and upper bounds of the credible interval(s).
     The value(s) of PROB must be between 0 and 1. PROB can either be:
       <> scalar: To set the central mass of shortest probability intervals
                  to 100*PROB%
       <> vector: A pair of probabilities defining the lower and upper
                  percentiles of the credible interval(s) as 100*(PROB(1))%
                  and 100*(PROB(2))% respectively.
          The default value of PROB is the scalar: 0.95, for a 95% shortest 
          posterior credible interval.
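     As a usage sketch (hypothetical posterior draws standing in for output
     from `bootbayes`):

```octave
% Hypothetical posterior (Bayesian bootstrap) statistics:
% 1 parameter, 2000 posterior draws
bootstat = 5 + randn (1, 2000);

ci95 = credint (bootstat);                 % default 95% shortest probability
                                           % interval (SPI)
ci90 = credint (bootstat, 0.90);           % SPI with 90% central mass
ci   = credint (bootstat, [0.025, 0.975]); % explicit lower/upper percentiles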

  Bibliography:
  [1] Liu, Gelman & Zheng (2015). Simulation-efficient shortest probability
        intervals. Statistics and Computing, 25(4), 809–819. 
  [2] Gelman (2020) Shortest Posterior Intervals.
        https://discourse.mc-stan.org/t/shortest-posterior-intervals/16281/16

  credint (version 2023.09.03)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Computes credible interval(s) directly from a vector (or row-major matrix) o...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
deffcalc


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1682
 Computes the design effect (DEFF), which can subsequently be used to correct
 sample size calculations using the 'sampszcalc' function.

 -- Function File: DEFF = deffcalc (BOOTSTAT, BOOTSTAT_SRS)

     'DEFF = deffcalc (BOOTSTAT, BOOTSTAT_SRS)' computes the design effect
     (DEFF) by taking the ratio of the variance of the bootstrap statistics
     from a complex design over the variance of bootstrap statistics from
     simple random sampling with replacement:

            DEFF = var (BOOTSTAT, 0, 2) ./ var (BOOTSTAT_SRS, 0, 2);

     BOOTSTAT and BOOTSTAT_SRS must be row vectors, or matrices with dimensions
     of size P * NBOOT, where P is the number of parameters being estimated
     and NBOOT is the number of bootstrap statistics. The number of parameters
     being estimated (but not the number of bootstrap resamples) must be the
     same to compute DEFF using this function.
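     As a usage sketch (the bootstrap statistics here are simulated rather
     than produced by actual resampling):

```octave
% Hypothetical bootstrap statistics: 1 parameter, 2000 resamples each
bootstat     = 2 * randn (1, 2000);   % complex design (larger variance)
bootstat_srs = randn (1, 2000);       % simple random sampling (SRS)

% Ratio of variances along the second dimension, as in the formula above;
% with these inputs DEFF is approximately 4
deff = deffcalc (bootstat, bootstat_srs);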

  deffcalc (version 2023.09.17)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Computes the design effect (DEFF), which can subsequently be used to correct...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
randtest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4703
 Performs permutation or randomization tests for regression coefficients.

 -- Function File: PVAL = randtest (X, Y)
 -- Function File: PVAL = randtest (X, Y, NREPS)
 -- Function File: PVAL = randtest (X, Y, NREPS, FUNC)
 -- Function File: PVAL = randtest (X, Y, NREPS, FUNC, SEED)
 -- Function File: [PVAL, STAT] = randtest (...)
 -- Function File: [PVAL, STAT, FPR] = randtest (...)
 -- Function File: [PVAL, STAT, FPR, PERMSTAT] = randtest (...)

     'PVAL = randtest (X, Y)' uses the approach of Manly [1,2] to perform
     a randomization (or permutation) test of the null hypothesis that the
     coefficients from the regression of Y on X are equal to 0. The value
     returned is a 2-tailed p-value. Note that the Y values
     are centered before randomization or permutation to also provide valid
     null hypothesis tests of the intercept. To include an intercept term in
     the regression, X must contain a column of ones.

     Hint: For one-sample or two-sample randomization/permutation tests,
     please use the 'randtest1' or 'randtest2' functions respectively.

     'PVAL = randtest (X, Y, NREPS)' specifies the number of resamples without
     replacement to take in the randomization test. By default, NREPS is 5000.
     If the number of possible permutations is smaller than NREPS, the test
     becomes exact. For example, if the number of sampling units (i.e. rows
     in Y) is 6, then the number of possible permutations is factorial (6) =
     720, so NREPS will be truncated at 720 and sampling will systematically
     evaluate all possible permutations. 

     'PVAL = randtest (X, Y, NREPS, FUNC)' also specifies a custom function
     calculated on the original samples, and the permuted or randomized
     resamples. Note that FUNC must compute statistics related to regression,
     and should either be a:
        o function handle or anonymous function,
        o string of function name, or
        o a cell array where the first cell is one of the above function
          definitions and the remaining cells are (additional) input arguments 
          to that function (other than the data arguments).
        See the built-in demos for example usage with @mldivide for linear
        regression coefficients, or with @cor for the correlation coefficient.
        The default value of FUNC is @mldivide.

        If function evaluations cannot be vectorized and the parallel computing
        toolbox (Matlab) or Parallel package (Octave) is installed and loaded,
        then the function evaluations will be automatically accelerated by
        parallel processing on platforms with multiple processors. In GNU
        Octave, the maximum number of workers used can be set by the user
        before running randtest, for example, for 2 workers with the command:

          setenv ('OMP_NUM_THREADS', '2')

     'PVAL = randtest (X, Y, NREPS, FUNC, SEED)' initialises the Mersenne
     Twister random number generator using an integer SEED value so that
     the results of 'randtest' are reproducible when the test is approximate
     (i.e. when using randomization because not all permutations can be
     evaluated systematically).

     '[PVAL, STAT] = randtest (...)' also returns the test statistic.

     '[PVAL, STAT, FPR] = randtest (...)' also returns the minimum false
     positive risk (FPR) calculated for the p-value, computed using the
     Sellke-Berger approach.

     '[PVAL, STAT, FPR, PERMSTAT] = randtest (...)' also returns the
     statistics of the permutation distribution.
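     As a usage sketch (hypothetical data; the seed is illustrative):

```octave
% Hypothetical simple linear regression with an intercept term
X = [ones(12, 1), (1:12)'];          % column of ones includes the intercept
y = 0.5 * (1:12)' + randn (12, 1);

% Permutation test of the regression coefficients using the default
% @mldivide statistic, with a fixed seed for reproducibility
[pval, stat] = randtest (X, y, 5000, @mldivide, 1);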

  Bibliography:
  [1] Manly (1997) Randomization, Bootstrap and Monte Carlo Method in Biology.
       2nd Edition. London: Chapman & Hall.
  [2] Hesterberg, Moore, Monaghan, Clipson, and Epstein (2011) Bootstrap
       Methods and Permutation Tests (BMPT) in Introduction to the Practice
       of Statistics, 7th Edition by Moore, McCabe and Craig.

  randtest (version 2024.04.17)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 73
 Performs permutation or randomization tests for regression coefficients.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
randtest1


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4312
 Performs a permutation or randomization test to assess if a sample comes from
 a population with a given value for the mean or some other location parameter.

 -- Function File: PVAL = randtest1 (A, M)
 -- Function File: PVAL = randtest1 (A, M, NREPS)
 -- Function File: PVAL = randtest1 (A, M, NREPS, FUNC)
 -- Function File: PVAL = randtest1 (A, M, NREPS, FUNC, SEED)
 -- Function File: PVAL = randtest1 ([A, GA], ...)
 -- Function File: [PVAL, STAT] = randtest1 (...)
 -- Function File: [PVAL, STAT, FPR] = randtest1 (...)
 -- Function File: [PVAL, STAT, FPR, PERMSTAT] = randtest1 (...)

     'PVAL = randtest1 (A, M)' performs a randomization (or permutation) test
     to ascertain whether the data sample in the column vector A comes from a
     population with mean equal to the value M. The value returned is a 2-
     tailed p-value against the null hypothesis computed using the absolute
     values of the mean. This function generates resamples by independently
     and randomly flipping the signs of values in (A - M).

     'PVAL = randtest1 (A, M, NREPS)' specifies the number of resamples to
     take in the randomization test. By default, NREPS is 5000. If the number
     of possible permutations is smaller than NREPS, the test becomes exact.
     For example, if the number of sampling units (i.e. rows) in the sample
     is 12, then the number of possible permutations is 2^12 = 4096, so NREPS
     will be truncated at 4096 and sampling will systematically evaluate all
     possible permutations. 

     'PVAL = randtest1 (A, M, NREPS, FUNC)' specifies a custom function
     calculated on the original samples, and the permuted or randomized
     resamples. Note that FUNC must compute a location parameter and
     should either be a:
        o function handle or anonymous function,
        o string of function name, or
        o a cell array where the first cell is one of the above function
          definitions and the remaining cells are (additional) input arguments 
          to that function (other than the data arguments).
        See the built-in demos for example usage using the mean.

     'PVAL = randtest1 (A, M, NREPS, FUNC, SEED)' initialises the Mersenne
     Twister random number generator using an integer SEED value so that
     the results of 'randtest1' are reproducible when the test is approximate
     (i.e. when using randomization if not all permutations can be 
     evaluated systematically).

     'PVAL = randtest1 ([A, GA], M, ...)' also specifies the sampling
     units (i.e. clusters) using consecutive positive integers in GA for A.
     Defining the sampling units has applications for clustered resampling,
     for example in the case of nested experimental designs. Note that when
     sampling units contain different numbers of values, function evaluations
     after sampling cannot be vectorized. If the parallel computing toolbox
     (Matlab) or parallel package (Octave) is installed and loaded, then the
     function evaluations will be automatically accelerated by parallel
     processing on platforms with multiple processors.

     '[PVAL, STAT] = randtest1 (...)' also returns the test statistic.

     '[PVAL, STAT, FPR] = randtest1 (...)' also returns the minimum false
     positive risk (FPR) calculated for the p-value, computed using the
     Sellke-Berger approach.

     '[PVAL, STAT, FPR, PERMSTAT] = randtest1 (...)' also returns the
     statistics of the permutation distribution.
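     As a usage sketch (hypothetical data; the seed is illustrative):

```octave
% Hypothetical sample: does the population mean differ from 10?
A = [9.2; 10.5; 8.8; 11.1; 10.9; 9.6; 10.2; 9.9];

% Sign-flipping randomization test of the mean against M = 10, with a
% fixed seed; with 8 rows there are 2^8 = 256 possible permutations,
% so the test is evaluated exactly
[pval, stat] = randtest1 (A, 10, 5000, @mean, 1);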

  randtest1 (version 2024.04.21)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs a permutation or randomization test to assess if a sample comes fro...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
randtest2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6715
 Performs a permutation or randomization test to compare the distributions of 
 two independent or paired data samples. 

 -- Function File: PVAL = randtest2 (A, B)
 -- Function File: PVAL = randtest2 (A, B, PAIRED)
 -- Function File: PVAL = randtest2 (A, B, PAIRED, NREPS)
 -- Function File: PVAL = randtest2 (A, B, PAIRED, NREPS, FUNC)
 -- Function File: PVAL = randtest2 (A, B, PAIRED, NREPS, FUNC, SEED)
 -- Function File: PVAL = randtest2 ([A, GA], [B, GB], ...)
 -- Function File: [PVAL, STAT] = randtest2 (...)
 -- Function File: [PVAL, STAT, FPR] = randtest2 (...)
 -- Function File: [PVAL, STAT, FPR, PERMSTAT] = randtest2 (...)

     'PVAL = randtest2 (A, B)' performs a randomization (or permutation) test
     to ascertain whether data samples A and B come from populations with
     the same distribution. Distributions are compared using the Wasserstein
     metric [1,2], which is the area of the difference between the empirical
     cumulative distribution functions of A and B. The data in A and B should
     be column vectors that represent measurements of the same variable. The
     value returned is a 2-tailed p-value against the null hypothesis computed
     using the absolute values of the test statistics.

     'PVAL = randtest2 (A, B, PAIRED)' specifies whether A and B should be
     treated as independent (unpaired) or paired samples. PAIRED accepts a
     logical scalar:
        o false (default): As above. The rows of samples A and B combined are
                permuted or randomized.
        o true: Performs a randomization or permutation test to ascertain
                whether paired or matched data samples A and B come from
                populations with the same distribution. The vectors A and B
                must each contain the same number of rows, where each row
                across A and B corresponds to a pair of matched observations.
                Within each pair, the allocation of data to samples A or B is
                permuted or randomized [3].

     'PVAL = randtest2 (A, B, PAIRED, NREPS)' specifies the number of resamples
     without replacement to take in the randomization test. By default, NREPS
     is 5000. If the number of possible permutations is smaller than NREPS, the
     test becomes exact. For example, if the number of sampling units across
     two independent samples is 6, then the number of possible permutations is
     factorial (6) = 720, so NREPS will be truncated at 720 and sampling will
     systematically evaluate all possible permutations. If the number of
     sampling units in each paired sample is 12, then the number of possible
     permutations is 2^12 = 4096, so NREPS will be truncated at 4096 and
     sampling will systematically evaluate all possible permutations. 

     'PVAL = randtest2 (A, B, PAIRED, NREPS, FUNC)' also specifies a custom
     function calculated on the original samples, and the permuted or
     randomized resamples. Note that FUNC must compute a difference statistic
     between samples A and B, and should either be a:
        o function handle or anonymous function,
        o string of function name, or
        o a cell array where the first cell is one of the above function
          definitions and the remaining cells are (additional) input arguments 
          to that function (other than the data arguments).
        See the built-in demos for example usage with the mean [3], or variance.

     'PVAL = randtest2 (A, B, PAIRED, NREPS, FUNC, SEED)' initialises the
     Mersenne Twister random number generator using an integer SEED value so
     that the results of 'randtest2' are reproducible when the test is
     approximate (i.e. when using randomization because not all permutations
     can be evaluated systematically).

     'PVAL = randtest2 ([A, GA], [B, GB], ...)' also specifies the sampling
     units (i.e. clusters) using consecutive positive integers in GA and GB
     for A and B respectively. Defining the sampling units has applications
     for clustered resampling, for example in the case of nested experimental
     designs. If PAIRED is false, numeric identifiers in GA and GB must be
     unique (e.g. 1,2,3 in GA, 4,5,6 in GB) - resampling of clusters then
     occurs across the combined sample of A and B. If PAIRED is true, numeric
     identifiers in GA and GB must be identical (e.g. 1,2,3 in GA, 1,2,3 in
     GB) - resampling is then restricted to exchange of clusters between A 
     and B only where the clusters have the same identifier. Note that when
     sampling units contain different numbers of values, function evaluations
     after sampling cannot be vectorized. If the parallel computing toolbox
     (Matlab) or Parallel package (Octave) is installed and loaded, then the
     function evaluations will be automatically accelerated by parallel
     processing on platforms with multiple processors. In GNU Octave, the
     maximum number of workers used can be set by the user before running
     randtest2, for example, for 2 workers with the command:

          setenv ('OMP_NUM_THREADS', '2')

     '[PVAL, STAT] = randtest2 (...)' also returns the test statistic.

     '[PVAL, STAT, FPR] = randtest2 (...)' also returns the minimum false
     positive risk (FPR) calculated for the p-value, computed using the
     Sellke-Berger approach.

     '[PVAL, STAT, FPR, PERMSTAT] = randtest2 (...)' also returns the
     statistics of the permutation distribution.
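     As a usage sketch (hypothetical data; the seed and anonymous function
     are illustrative):

```octave
% Hypothetical independent samples measuring the same variable
A = [4.1; 5.2; 3.8; 6.0; 5.5];
B = [6.3; 7.1; 5.9; 8.2; 7.4];

% Unpaired test using the default test statistic (the Wasserstein metric)
pval = randtest2 (A, B, false, 5000);

% Paired test of the difference in means instead, with a fixed seed;
% with 5 pairs there are 2^5 = 32 permutations, so the test is exact
[pval2, stat] = randtest2 (A, B, true, 5000, @(a, b) mean (a) - mean (b), 1);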

  Bibliography:
  [1] Dowd (2020) A New ECDF Two-Sample Test Statistic. arXiv.
       https://doi.org/10.48550/arXiv.2007.01360
  [2] https://en.wikipedia.org/wiki/Wasserstein_metric
  [3] Hesterberg, Moore, Monaghan, Clipson, and Epstein (2011) Bootstrap
       Methods and Permutation Tests (BMPT) in Introduction to the Practice
       of Statistics, 7th Edition by Moore, McCabe and Craig.

  randtest2 (version 2024.04.17)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs a permutation or randomization test to compare the distributions of...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
sampszcalc


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3724
 Performs sample size calculations, with optional correction for the design
 effect deviating from unity.

 -- Function File: N = sampszcalc (TESTTYPE, EFFSZ)
 -- Function File: N = sampszcalc (TESTTYPE, EFFSZ, POW)
 -- Function File: N = sampszcalc (TESTTYPE, EFFSZ, POW, ALPHA)
 -- Function File: N = sampszcalc (TESTTYPE, EFFSZ, POW, ALPHA, TAILS)
 -- Function File: N = sampszcalc (TESTTYPE, EFFSZ, POW, ALPHA, TAILS, DEFF)

      'N = sampszcalc (TESTTYPE, EFFSZ)' returns the sample size, N, required
      to reach a significance level (alpha) of 0.05 in a two-tailed version of
      the test specified in TESTTYPE for the specified effect size, EFFSZ,
      with a power of 0.8 (i.e. a type II error rate of 1 - 0.8 = 0.2). For
      two-sample tests, N corresponds to the size of each sample.

        TESTTYPE can be:

          't2' : two-sample unpaired t-test

          't'  : paired t-test or one-sample t-test

          'z2' : two-sample unpaired z-test (Normal approximation)

          'z'  : paired z-test or one-sample z-test (Normal approximation)

          'r'  : significance test for correlation

        EFFSZ can be a numeric value corresponding to the standardized effect
        size: Cohen's d or h (when TESTTYPE is 't2', 't', 'z2' or 'z'), or 
        Pearson's correlation coefficient (when TESTTYPE is 'r'). For
        convenience, EFFSZ can also be one of the following strings:

          'small'  : which is 0.2 for Cohen's d (or h), or 0.1 for Pearson's r.

          'medium' : which is 0.5 for Cohen's d (or h), or 0.3 for Pearson's r.

          'large'  : which is 0.8 for Cohen's d (or h), or 0.5 for Pearson's r.

       'N = sampszcalc (TESTTYPE, EFFSZ, POW)' also sets the desired power of
       the test. The power corresponds to 1 - beta, where beta is the type II
       error rate (i.e. the probability of not rejecting the null hypothesis
       when it is actually false). (Default is 0.8)

       'N = sampszcalc (TESTTYPE, EFFSZ, POW, ALPHA)' also sets the desired
       significance level, ALPHA, of the test. ALPHA corresponds to the type I
       error rate (i.e. the probability of rejecting the null hypothesis when
       it is actually true). (Default is 0.05)

       HINT: If the test is expected to be among a family of tests, divide
       ALPHA by the number of tests so that the sample size calculations will
       maintain the desired power after correction for multiple comparisons.

       'N = sampszcalc (TESTTYPE, EFFSZ, POW, ALPHA, TAILS)' also sets whether
       the test is one-tailed (TAILS = 1) or two-tailed (TAILS = 2). (Default
       is 2)

       'N = sampszcalc (TESTTYPE, EFFSZ, POW, ALPHA, TAILS, DEFF)' also sets
       the design effect to correct the sample size calculation. (Default is 1)
       DEFF can be estimated by dividing the sampling variance of the parameter
       of interest from a complex experimental design by the equivalent
       statistic computed using simple random sampling with replacement.
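
       As a rough illustration of the underlying calculation for the
       Normal-approximation case ('z2'), the per-group sample size can be
       sketched in Python. The function name sampszcalc_z2 and the formula
       shown are assumptions based on standard power-analysis theory, not
       the package's actual code:

```python
from math import ceil
from statistics import NormalDist

def sampszcalc_z2(effsz, power=0.80, alpha=0.05, tails=2, deff=1.0):
    """Per-group sample size for a two-sample z-test ('z2')."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)  # critical quantile
    z_beta = NormalDist().inv_cdf(power)               # power quantile
    n = 2 * ((z_alpha + z_beta) / effsz) ** 2          # per-group size
    return ceil(n * deff)  # correct for the design effect, then round up

# A 'medium' effect (Cohen's d = 0.5) at the default power and alpha:
print(sampszcalc_z2(0.5))  # -> 63 per group
```

       For TESTTYPE 't2' the package would use Student-t rather than Normal
       quantiles, giving slightly larger sample sizes.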

  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
 Performs sample size calculations, with optional correction for the design
 ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
smoothmad


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1589
 Calculates a smoothed version of the median absolute deviation (MAD).

 -- Function File: MAD = smoothmad (X)
 -- Function File: MAD = smoothmad (X, GROUP)
 -- Function File: MAD = smoothmad (X, GROUP, CONSTANT)

     'MAD = smoothmad (X)' calculates a smoothed version of the median
     absolute deviation (MAD) for each column of the data in X. The
     statistics are scaled by a constant of 1.41 to make the estimator
     consistent with the standard deviation for normally distributed data.
 
     'MAD = smoothmad (X, GROUP)' defines group membership of the rows
     of X and returns the pooled smooth MAD. GROUP must be a numeric
     vector with the same number of rows as X.

     'MAD = smoothmad (X, GROUP, CONSTANT)' sets the CONSTANT to scale
     the value of the MAD. (Default is 1.41).

  smoothmad (version 2023.05.02)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 70
 Calculates a smoothed version of the median absolute deviation (MAD).



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 12
smoothmedian


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3543
 Calculates a smoothed version of the median.

 -- Function File: M = smoothmedian (X)
 -- Function File: M = smoothmedian (X, DIM)
 -- Function File: M = smoothmedian (X, DIM, TOL)

     If X is a vector, find the univariate smoothed median (M) of X. If X is a
     matrix, compute the univariate smoothed median value for each column and
     return them in a row vector. If the optional argument DIM is given,
     operate along this dimension. Arrays of more than two dimensions are not
     currently supported. The MEX file versions of this function ignore (omit)
     NaN values, whereas the m-file includes NaN in its calculations. Use the
     'which' command to establish which version of the function is being used.

     The smoothed median is a slightly smoothed version of the ordinary median
     and is an M-estimator that is both robust and efficient:

     | Asymptotic                            | Mean |    Median  |    Median  |
     | properties                            |      | (smoothed) | (ordinary) |
     |---------------------------------------|------|------------|------------|
     | Breakdown point                       | 0.00 |      0.341 |      0.500 |
     | Pitman efficacy                       | 1.00 |      0.865 |      0.637 |

     Smoothing the median is achieved by minimizing the objective function:

           S (M) = sum (((X(i) - M).^2 + (X(j) - M).^2).^ 0.5)
                  i < j
 
     where i and j refer to the indices of the Cartesian product of each
     column of X with itself.

     With the ordinary median as the initial value of M, this function
     minimizes the above objective function by finding the root of the first
     derivative using a fast, but reliable, Newton-Bisection hybrid algorithm.
     The tolerance (TOL) is the maximum value of the step size that is
     acceptable to break from optimization. By default, TOL = range * 1e-04.
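
     The minimization described above can be sketched in Python. This is a
     hypothetical illustration that uses simple bisection on the derivative
     of the convex objective, in place of the function's faster
     Newton-Bisection hybrid:

```python
import itertools
import math

def smoothmedian(x, tol=None):
    """Smoothed median of a vector x by minimizing
    S(M) = sum over i < j of sqrt((x[i] - M)^2 + (x[j] - M)^2)."""
    pairs = list(itertools.combinations(x, 2))

    def dS(M):
        # First derivative of the convex objective S with respect to M
        s = 0.0
        for a, b in pairs:
            r = math.hypot(a - M, b - M)
            if r > 0:
                s += -((a - M) + (b - M)) / r
        return s

    lo, hi = min(x), max(x)
    if tol is None:
        tol = (hi - lo) * 1e-4  # default tolerance, as in the help text
    # Bisect on the monotone derivative: the root of dS minimizes S
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dS(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# For symmetric data the smoothed median coincides with the ordinary median:
print(round(smoothmedian([1, 2, 3, 4, 5]), 3))  # -> 3.0
```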

     The smoothing works by slightly reducing the breakdown point of the median.
     Bootstrap confidence intervals using the smoothed median have good
     coverage for the ordinary median of the population distribution and can be
     used to obtain second order accurate intervals with Studentized bootstrap
     and calibrated percentile bootstrap methods [1]. When the population
     distribution is thought to be strongly skewed, coverage errors can be
     reduced by improving symmetry through appropriate data transformation.
     Unlike kernel-based smoothing approaches, bootstrapping smoothmedian does
     not require explicit choice of a smoothing parameter or a probability
     density function.

  Bibliography:
  [1] Brown, Hall and Young (2001) The smoothed median and the
       bootstrap. Biometrika 88(2):519-534

  smoothmedian (version 2023.05.02)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 45
 Calculates a smoothed version of the median.





