| Type: | Package | 
| Title: | Modelling of Extreme Values | 
| Version: | 2.0 | 
| Description: | Various tools for the analysis of univariate, multivariate and functional extremes. Exact simulation from max-stable processes (Dombry, Engelke and Oesting, 2016, <doi:10.1093/biomet/asw008>, R-Pareto processes for various parametric models, including Brown-Resnick (Wadsworth and Tawn, 2014, <doi:10.1093/biomet/ast042>) and Extremal Student (Thibaud and Opitz, 2015, <doi:10.1093/biomet/asv045>). Threshold selection methods, including Wadsworth (2016) <doi:10.1080/00401706.2014.998345>, and Northrop and Coleman (2014) <doi:10.1007/s10687-014-0183-z>. Multivariate extreme diagnostics. Estimation and likelihoods for univariate extremes, e.g., Coles (2001) <doi:10.1007/978-1-4471-3675-0>. | 
| License: | GPL-3 | 
| URL: | https://lbelzile.github.io/mev/, https://github.com/lbelzile/mev/ | 
| BugReports: | https://github.com/lbelzile/mev/issues/ | 
| Depends: | R (≥ 3.4) | 
| Imports: | alabama, methods, nleqslv, numDeriv, Rcpp (≥ 0.12.16), Rsolnp, stats, | 
| Suggests: | boot, cobs, evd, expint, knitr, MASS, mvPot (≥ 0.1.4), mvtnorm, gmm, revdbayes, rmarkdown, ismev, tinytest, TruncatedNormal (≥ 1.1) | 
| LinkingTo: | Rcpp, RcppArmadillo | 
| LazyData: | true | 
| RoxygenNote: | 7.3.3 | 
| VignetteBuilder: | knitr | 
| Encoding: | UTF-8 | 
| ByteCompile: | true | 
| NeedsCompilation: | yes | 
| Packaged: | 2025-10-22 09:18:46 UTC; lbelzile | 
| Author: | Leo Belzile | 
| Maintainer: | Leo Belzile <belzilel@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-10-22 09:50:02 UTC | 
Joint maximum likelihood estimation for exponential model
Description
Calculates the MLEs of the rate parameter, and joint asymptotic covariance matrix of these MLEs over a range of thresholds as supplied by the user.
Usage
.Joint_MLE_Expl(x, u = NULL, k, q1, q2 = 1, param)
Arguments
| x | vector of data | 
| u | vector of thresholds. If not supplied, then  | 
| k | number of thresholds to consider if u not supplied | 
| q1 | lower quantile to consider for threshold | 
| q2 | upper quantile to consider for threshold | 
| param | character specifying  | 
Value
a list with
- mle vector of MLEs above the supplied thresholds 
- cov joint asymptotic covariance matrix of these MLEs 
Author(s)
Jennifer L. Wadsworth
GP fitting function of Grimshaw (1993)
Description
Function for estimating parameters k and a for a random sample from a GPD.
Usage
.fit.gpd.grimshaw(x)
Arguments
| x | sample values | 
Value
a list with the maximum likelihood estimates of components a and k
Author(s)
Scott D. Grimshaw
Robust threshold selection of Dupuis
Description
The optimal bias-robust estimator (OBRE) for the generalized Pareto. This function returns robust estimates and the associated weights.
Usage
.fit.gpd.rob(dat, thresh, k = 4, tol = 1e-05, show = FALSE)
Arguments
| dat | a numeric vector of data | 
| thresh | threshold parameter | 
| k | bound on the influence function; the constant  | 
| tol | numerical tolerance for OBRE weights iterations. | 
| show | logical: should diagnostics and estimates be printed. Default to  | 
Value
a list with the same components as fit.gpd,
in addition to
-  estimate: optimal bias-robust estimates of thescaleandshapeparameters.
-  weights: vector of OBRE weights.
References
Dupuis, D.J. (1998). Exceedances over High Thresholds: A Guide to Threshold Selection, Extremes, 1(3), 251–261.
See Also
Examples
dat <- rexp(100)
.fit.gpd.rob(dat, 0.1)
Posterior predictive distribution and density for the GEV distribution
Description
This function calculates the posterior predictive density at points x based on a matrix of posterior density parameters
Usage
.gev.postpred(x, posterior, Nyr = 100, type = c("density", "quantile"))
Arguments
| x | 
 | 
| posterior | 
 | 
| Nyr | number of years to extrapolate | 
| type | string indicating whether to return the posterior  | 
Value
a vector of values for the posterior predictive density or quantile at x, depending on the value of type
Maximum likelihood method for the generalized Pareto Model
Description
Maximum-likelihood estimation for the generalized Pareto model, including generalized linear modelling of each parameter. This function was adapted by Paul Northrop to include the gradient in the gpd.fit routine from ismev.
Usage
.gpd_2D_fit(
  xdat,
  threshold,
  npy = 365,
  ydat = NULL,
  sigl = NULL,
  shl = NULL,
  siglink = identity,
  shlink = identity,
  siginit = NULL,
  shinit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)
Arguments
| xdat | numeric vector of data to be fitted. | 
| threshold | a scalar or a numeric   vector of the same length as  | 
| npy | number of observations per year/block. | 
| ydat | matrix of covariates for generalized linear modelling of the parameters (or  | 
| sigl | numeric vector of integers, giving the columns of  | 
| shl | numeric vector of integers, giving the columns of  | 
| siglink | inverse link functions for generalized linear modelling of the scale parameter | 
| shlink | inverse link functions for generalized linear modelling of the shape parameter | 
| siginit | numeric giving initial value(s) for parameter estimates. If  | 
| shinit | numeric giving initial value(s) for the shape parameter estimate; if  | 
| show | logical; if  | 
| method | optimization method (see  | 
| maxit | maximum number of iterations. | 
| ... | other control parameters for the optimization. These are passed to components of the  | 
Details
For non-stationary fitting it is recommended that the covariates within the generalized linear models are (at least approximately) centered and scaled (i.e. the columns of ydat should be approximately centered and scaled).
The form of the GP model used follows Coles (2001) Eq (4.7). In particular, the shape parameter is defined so that positive values imply a heavy tail and negative values imply a bounded upper value.
Value
a list with components
- nexc
- scalar giving the number of threshold exceedances. 
- nllh
- scalar giving the negative log-likelihood value. 
- mle
- numeric vector giving the MLE's for the scale and shape parameters, resp. 
- rate
- scalar giving the estimated probability of exceeding the threshold. 
- se
- numeric vector giving the standard error estimates for the scale and shape parameter estimates, resp. 
- trans
- logical indicator for a non-stationary fit. 
- model
- list with components - sigland- shl.
- link
- character vector giving inverse link functions. 
- threshold
- threshold, or vector of thresholds. 
- nexc
- number of data points above the threshold. 
- data
- data that lie above the threshold. For non-stationary models, the data are standardized. 
- conv
- convergence code, taken from the list returned by - optim. A zero indicates successful convergence.
- nllh
- negative log likelihood evaluated at the maximum likelihood estimates. 
- vals
- matrix with three columns containing the maximum likelihood estimates of the scale and shape parameters, and the threshold, at each data point. 
- mle
- vector containing the maximum likelihood estimates. 
- rate
- proportion of data points that lie above the threshold. 
- cov
- covariance matrix. 
- se
- numeric vector containing the standard errors. 
- n
- number of data points (i.e., the length of - xdat).
- npy
- number of observations per year/block. 
- xdata
- data that has been fitted. 
References
Coles, S., 2001. An Introduction to Statistical Modeling of Extreme Values. Springer-Verlag, London.
Is the matrix conditionally negative semi-definite? Function adapted from 'is.CNSD' in the CEGO package, v 2.1.0
Description
Is the matrix conditionally negative semi-definite? Function adapted from 'is.CNSD' in the CEGO package, v 2.1.0
Usage
.is.CNSD(X, tol = 1e-08)
Arguments
| X | a symmetric matrix | 
| tol | tolerance value; eigenvalues between  | 
Author(s)
Martin Zaefferer
Internal function
Description
Takes a list of asymmetry parameters with an associated dependence vector and returns the corresponding asymmetry matrix for the asymmetric logistic and asymmetric negative logistic models
Usage
.mvasym.check(asy, dep, d, model = c("alog", "aneglog", "maxlin"))
Arguments
| asy | a list of  | 
| dep | vector of  | 
| d | dimension of the model | 
| model | either  | 
Details
This function is extracted from the evd package and modified (C) Alec Stephenson
Value
a matrix of asymmetry components, enumerating all possible 2^d-1 subsets of
the power set
Generate from Brown-Resnick process Y \sim {P_x}, where
P_{x} is probability of extremal function
Description
Generate from Brown-Resnick process Y \sim {P_x}, where
P_{x} is probability of extremal function
Usage
.rPBrownResnick(index, Sigma_chol, Sigma)
Arguments
| index | index of the location. An integer in 0, ...,  | 
| Sigma | a positive definite covariance matrix | 
Value
a d-vector from P_x
Generate from extremal Husler-Reiss distribution Y \sim {P_x}, where
P_{x} is probability of extremal function
Description
Generate from extremal Husler-Reiss distribution Y \sim {P_x}, where
P_{x} is probability of extremal function
Usage
.rPHuslerReiss(index, cholesky, Sigma)
Arguments
| index | index of the location. An integer in 0, ...,  | 
| cholesky | the Cholesky root of  | 
| Sigma | a covariance matrix formed from the symmetric square matrix of coefficients  | 
Value
a d-vector from P_x
Generate from Smith model (moving maxima) Y \sim {P_x}, where
P_{x} is probability of extremal function
Description
Generate from Smith model (moving maxima) Y \sim {P_x}, where
P_{x} is probability of extremal function
Usage
.rPSmith(index, Sigma_chol, loc)
Arguments
| index | index of the location. An integer in 0, ...,  | 
| Sigma_chol | the Cholesky root of the covariance matrix | 
| loc | location matrix | 
Value
a d-vector from P_x
Generate from bilogistic Y \sim {P_x}, where
P_{x} is the probability of extremal functions
Description
Generate from bilogistic Y \sim {P_x}, where
P_{x} is the probability of extremal functions
Usage
.rPbilog(d, index, alpha)
Arguments
| d | dimension of the 1-sample | 
| index | index of the location. An integer in 0, ...,  | 
| alpha | a  | 
Value
a d-vector from P_x
Generate from extremal Dirichlet Y \sim {P_x}, where
P_{x} is the probability of extremal functions from the Dirichlet model of
Coles and Tawn.
Description
Note: we generate from the Dirichlet rather than the Gamma distribution, since the former is parallelized
Usage
.rPdir(d, index, alpha, irv = FALSE)
Arguments
| d | dimension of the 1-sample | 
| index | index of the location. An integer in 0, ...,  | 
| alpha | a  | 
| irv | should the usual model ( | 
Value
a d-vector from P_x
Generate from extremal Dirichlet Y \sim {P_x}, where
P_{x} is the probability of extremal functions from a Dirichlet mixture
Description
Generate from extremal Dirichlet Y \sim {P_x}, where
P_{x} is the probability of extremal functions from a Dirichlet mixture
Usage
.rPdirmix(d, index, alpha, weight)
Arguments
| d | dimension of the 1-sample | 
| index | index of the location. An integer in 0, ...,  | 
| alpha | a  | 
| weight | a  | 
Value
a d-vector from P_x
Generate from extremal Student-t Y \sim {P_x}, where
P_{x} is probability of extremal function
Description
Generate from extremal Student-t Y \sim {P_x}, where
P_{x} is probability of extremal function
Usage
.rPexstud(index, cholesky, sigma, al)
Arguments
| index | index of the location. An integer in 0, ...,  | 
| cholesky | Cholesky root of transformed correlation matrix | 
| sigma | a positive semi-definite correlation matrix | 
| al | the alpha parameter in Proposition 7. Corresponds to degrees of freedom - 1 | 
Value
a d-vector from P_x
Generate from logistic Y \sim {P_x}, where
P_{x} is probability of extremal function scaled by a Frechet variate
Description
Generate from logistic Y \sim {P_x}, where
P_{x} is probability of extremal function scaled by a Frechet variate
Usage
.rPlog(d, index, theta)
Arguments
| d | dimension of the 1-sample | 
| index | index of the location. An integer in 0, ...,  | 
| theta | a one-dimensional parameter for the logistic model, strictly greater than 1. | 
Value
a d-vector from P_x
Generate from negative logistic Y \sim {P_x}, where
P_{x} is probability of extremal function scaled by a Frechet variate
Description
Generate from negative logistic Y \sim {P_x}, where
P_{x} is probability of extremal function scaled by a Frechet variate
Usage
.rPneglog(d, index, theta)
Arguments
| d | dimension of the 1-sample | 
| index | index of the location. An integer in 0, ...,  | 
| theta | a one-dimensional parameter for the negative logistic model | 
Value
a d-vector from P_x
Samples from exceedances at site (scaled extremal function definition)
Description
Samples from exceedances at site (scaled extremal function definition)
Usage
.rPsite(n, j, d, par, model, Sigma, loc)
Arguments
| n | sample size | 
| j | index of the site or variable | 
| d | dimension of the multivariate distribution | 
| par | a vector of parameters | 
| model | integer, currently ranging from 1 to 9, corresponding respectively to
(1)  | 
| Sigma | covariance matrix for Brown-Resnick, Smith and extremal student. Default for compatibility | 
| loc | matrix of location for Smith model. | 
Value
a n by d matrix containing the sample
Generates from Q_i, the spectral measure of the bilogistic model
Description
Simulation algorithm of Boldi (2009) for the bilogistic model
Usage
.rbilogspec(n, alpha)
Arguments
| n | sample size | 
| alpha | vector of parameter of dimension  | 
Value
an n by d sample from the spectral distribution
References
Boldi (2009). A note on the representation of parametric models for multivariate extremes. Extremes 12, 211–218.
Generates from Q_i, the spectral measure of the Brown-Resnick model
Description
Simulation algorithm of Dombry et al. (2015)
Usage
.rbrspec(n, Sigma_chol, Sigma)
Arguments
| n | sample size | 
| Sigma_chol | Cholesky root of  | 
| Sigma | 
 | 
Value
an n by d sample from the spectral distribution
References
Dombry, Engelke and Oesting (2016). Exact simulation of max-stable processes, Biometrika, 103(2), 303–317.
Generates from Q_i, the spectral measure of the Dirichlet mixture model
Description
Simulation algorithm of Dombry et al. (2015)
Usage
.rdirmixspec(n, d, alpha, weight)
Arguments
| n | sample size | 
| d | dimension of the 1-sample | 
| alpha | a  | 
| weight | a  | 
Value
an n by d sample from the spectral distribution
Generates from Q_i, the spectral measure of the extremal Dirichlet
model
Description
This model was introduced in Coles and Tawn (1991); the present method uses the simulation algorithm of Boldi (2009) for the extremal Dirichlet model
Usage
.rdirspec(n, d, alpha, irv = FALSE)
Arguments
| n | sample size | 
| d | dimension of sample | 
| alpha | vector of Dirichlet parameters of dimension  | 
| irv | should the usual model ( | 
Value
an n by d sample from the spectral distribution
References
Boldi (2009). A note on the representation of parametric models for multivariate extremes. Extremes 12, 211–218.
Generates from Q_i, the spectral measure of the extremal Student model
Description
Generates from Q_i, the spectral measure of the extremal Student model
Usage
.rexstudspec(n, sigma, al)
Arguments
| sigma | a positive semi-definite covariance matrix with unit variance | 
| al | the alpha parameter in Proposition 7. Corresponds to degrees of freedom - 1 | 
Value
an n by d sample from the spectral distribution
Generates from Q_i, the spectral measure of the Husler-Reiss model
Description
Generates from Q_i, the spectral measure of the Husler-Reiss model
Usage
.rhrspec(n, Lambda)
Arguments
| Lambda | an symmetric square matrix of coefficients  | 
Value
an n by d sample from the spectral distribution
Generates from Q_i, the spectral measure of the logistic model
Description
Simulation algorithm of Dombry et al. (2015)
Usage
.rlogspec(n, d, theta)
Arguments
| n | sample size | 
| theta | a one-dimensional parameter | 
Value
an n by d sample from the spectral distribution
References
Dombry, Engelke and Oesting (2016). Exact simulation of max-stable processes, Biometrika, 103(2), 303–317.
Multivariate extreme value distribution sampling algorithm via angular measure
Description
This algorithm corresponds to Algorithm 1 in Dombry, Engelke and Oesting (2016), using the formulation of the Dirichlet mixture of Coles and Tawn (1991) as described and derived in Boldi (2009) for the bilogistic and extremal Dirichlet model. Models currently implemented include logistic, negative logistic, extremal Dirichlet and bilogistic MEV.
Usage
.rmevA1(n, d, par, model, Sigma, loc)
Arguments
| n | sample size | 
| d | dimension of the multivariate distribution | 
| par | a vector of parameters | 
| model | integer, currently ranging from 1 to 9, corresponding respectively to
(1)  | 
| Sigma | covariance matrix for Brown-Resnick, Smith and extremal student. Conditionally negative definite matrix of parameters for the Huesler–Reiss model. Default matrix for compatibility | 
| loc | matrix of location for Smith model. | 
Value
a n by d matrix containing the sample
Multivariate extreme value distribution sampling algorithm via extremal functions
Description
Code implementing Algorithm 2 in Dombry, Engelke and Oesting (2016)
Usage
.rmevA2(n, d, par, model, Sigma, loc)
Arguments
| n | sample size | 
| d | dimension of the multivariate distribution | 
| par | a vector of parameters | 
| model | integer, currently ranging from 1 to 9, corresponding respectively to
(1)  | 
| Sigma | covariance matrix for Brown-Resnick, Smith and extremal student. Default for compatibility | 
| loc | matrix of location for Smith model. | 
Value
a n by d matrix containing the sample
Random samples from asymmetric logistic distribution
Description
Simulation algorithm of Stephenson (2003), using exact-samples from the logistic
Usage
.rmevasy(n, d, par, asym, ncompo, Sigma, model)
Arguments
| n | sample size | 
| d | dimension of the multivariate distribution | 
| par | a vector of parameters | 
| asym | matrix of bool indicating which component belong to the corresponding row logistic model | 
| ncompo | number of components for the (negative) logistic in row | 
| Sigma | matrix of asymmetry parameters | 
Value
a n by d matrix containing the sample
References
Stephenson, A. G. (2003) Simulating multivariate extreme value distributions of logistic type. Extremes, 6(1), 49–60.
Joe, H. (1990). Families of min-stable multivariate exponential and multivariate extreme value distributions, 9, 75–81.
Random sampling from spectral distribution on l1 sphere
Description
Generate from Q_i, the spectral measure of a given multivariate extreme value model
Usage
.rmevspec_cpp(n, d, par, model, Sigma, loc)
Arguments
| n | sample size | 
| d | dimension of the multivariate distribution | 
| par | a vector of parameters | 
| model | integer, currently ranging from 1 to 9, corresponding respectively to
(1)  | 
| Sigma | covariance matrix for Brown-Resnick and extremal student, symmetric matrix
of squared coefficients  | 
| loc | matrix of locations for the Smith model | 
Value
a n by d matrix containing the sample
References
Dombry, Engelke and Oesting (2016). Exact simulation of max-stable processes, Biometrika, 103(2), 303–317.
Boldi (2009). A note on the representation of parametric models for multivariate extremes. Extremes 12, 211–218.
Multivariate Normal distribution sampler (Rcpp version), derived using the eigendecomposition of the covariance matrix Sigma. The function utilizes the arma random normal generator
Description
Multivariate Normal distribution sampler (Rcpp version), derived using the eigendecomposition of the covariance matrix Sigma. The function utilizes the arma random normal generator
Usage
.rmnorm_arma(n, Mu, Xmat, eigen = TRUE)
Arguments
| n | sample size | 
| Mu | mean vector. Will set the dimension | 
| Xmat | covariance matrix, of same dimension as  | 
Value
an n sample from a multivariate Normal distribution
Generates from Q_i, the spectral measure of the negative logistic model
Description
Simulation algorithm of Dombry et al. (2016)
Usage
.rneglogspec(n, d, theta)
Arguments
| n | sample size | 
| theta | a one-dimensional parameter | 
Value
an n by d sample from the spectral distribution
References
Dombry, Engelke and Oesting (2016). Exact simulation of max-stable processes, Biometrika, 103(2), 303–317.
Generates from Q_i, the spectral measure of the pairwise Beta model
Description
This model was introduced in Cooley, Davis and Naveau (2010). The sample is drawn from a mixture and the algorithm follows from the proof of Theorem 1 in Ballani and Schlather (2011) and is written in full in Algorithm 1 of Sabourin et al. (2013).
Usage
.rpairbetaspec(n, d, alpha, beta)
Arguments
| n | sample size | 
| alpha | concentration parameter | 
| beta | vector of all pairwise component (lexicographic order, by row) | 
Value
a matrix of samples from the angular distribution
an n by d sample from the spectral distribution
Author(s)
Leo Belzile
References
Cooley, D., R.A. Davis and P. Naveau (2010). The pairwise beta distribution: A flexible parametric multivariate model for extremes, Journal of Multivariate Analysis, 101(9), 2103–2117.
Ballani, D. and M. Schlather (2011). A construction principle for multivariate extreme value distributions, Biometrika, 98(3), 633–645.
Sabourin, A., P. Naveau and A. Fougeres (2013). Bayesian model averaging for extremes, Extremes, 16, 325–350.
Generates from Q_i, the spectral measure of the pairwise exponential model
Description
The sample is drawn from a mixture and the algorithm follows from the proof of Theorem 1 in Ballani and Schlather (2011).
Usage
.rpairexpspec(n, d, alpha, beta)
Arguments
| n | sample size | 
| alpha | concentration parameter | 
| beta | vector of all pairwise component (lexicographic order, by row) | 
Value
a matrix of samples from the angular distribution
an n by d sample from the spectral distribution
Author(s)
Leo Belzile
References
Ballani, D. and M. Schlather (2011). A construction principle for multivariate extreme value distributions, Biometrika, 98(3), 633–645.
Generates from Q_i, the spectral measure of the Smith model (moving maxima)
Description
Simulation algorithm of Dombry et al. (2015)
Usage
.rsmithspec(n, Sigma_chol, loc)
Arguments
| n | sample size | 
| Sigma_chol | Cholesky decomposition of the  | 
| loc | location matrix | 
Value
an n by d sample from the spectral distribution
References
Dombry, Engelke and Oesting (2016). Exact simulation of max-stable processes, Biometrika, 103(2), 303–317.
Generates from Q_i, the spectral measure of the weighted Dirichlet model
Description
This model was introduced in Ballani and Schlather (2011).
Usage
.rwdirbsspec(n, d, alpha, beta)
Arguments
| n | sample size | 
| alpha | vector of concentration parameters | 
| beta | vector of Dirichlet components | 
Value
a matrix of samples from the angular distribution
an n by d sample from the spectral distribution
Author(s)
Leo Belzile
References
Ballani, D. and M. Schlather (2011). A construction principle for multivariate extreme value distributions, Biometrika, 98(3), 633–645.
Generates from Q_i, the spectral measure of the weighted exponential model
Description
This model was introduced in Ballani and Schlather (2011).
Usage
.rwexpbsspec(n, d, alpha, beta)
Arguments
| n | sample size | 
| alpha | vector of concentration parameters | 
| beta | vector of Dirichlet components | 
Value
a matrix of samples from the angular distribution
an n by d sample from the spectral distribution
Author(s)
Leo Belzile
References
Ballani, D. and M. Schlather (2011). A construction principle for multivariate extreme value distributions, Biometrika, 98(3), 633–645.
Weighted empirical distribution function
Description
Compute an empirical distribution function with weights w at each of x
Usage
.wecdf(x, w)
Arguments
| x | observations | 
| w | weights | 
Value
a function of class ecdf
Transform variogram matrix to covariance of conditional random field
Description
The matrix Lambda is half the semivariogram matrix. The function
returns the conditional covariance with respect to entries in co,
restricted to the subA rows and the subB columns.
Usage
Lambda2cov(Lambda, co, subA, subB)
Arguments
| Lambda | Negative definite matrix for the Huesler–Reiss model | 
| co | vector of integer with conditioning sites | 
| subA | vector of integers with sub-entries (not in  | 
| subB | vector of integers with sub-entries (not in  | 
Value
a covariance matrix
Score and likelihood ratio tests fit of equality of shape over multiple thresholds
Description
The function returns a P-value path for the score testand/or likelihood ratio test for equality of the shape parameters over multiple thresholds under the generalized Pareto model.
Usage
NC.diag(
  xdat,
  u,
  GP.fit = c("Grimshaw", "nlm", "optim", "ismev"),
  do.LRT = FALSE,
  size = NULL,
  plot = TRUE,
  ...,
  xi.tol = 0.001
)
Arguments
| xdat | numeric vector of raw data | 
| u | 
 | 
| GP.fit | function used to optimize the generalized Pareto model. | 
| do.LRT | boolean indicating whether to perform the likelihood ratio test (in addition to the score test) | 
| size | level at which a horizontal line is drawn on multiple threshold plot | 
| plot | logical; if  | 
| ... | additional parameters passed to  | 
| xi.tol | numerical tolerance for threshold distance; if the absolute value of  | 
Details
The default method is 'Grimshaw' using the reduction of the parameters to a one-dimensional
maximization. Other options are one-dimensional maximization of the profile the nlm function or optim.
Two-dimensional optimisation using 2D-optimization ismev using the routine
from gpd.fit from the ismev library, with the addition of the algebraic gradient.
The choice of GP.fit should make no difference but the options were kept.
Warning: the function will not recover from failure of the maximization routine, returning various error messages.
Value
a plot of P-values for the test at the different thresholds u
Author(s)
Paul J. Northrop and Claire L. Coleman
References
Grimshaw (1993). Computing Maximum Likelihood Estimates for the Generalized Pareto Distribution, Technometrics, 35(2), 185–191.
Northrop & Coleman (2014). Improved threshold diagnostic plots for extreme value analyses, Extremes, 17(2), 289–303.
Wadsworth & Tawn (2012). Likelihood-based procedures for threshold diagnostics and uncertainty in extreme value modelling, J. R. Statist. Soc. B, 74(3), 543–567.
Examples
## Not run: 
data(nidd)
u <- seq(65,90, by = 1L)
NC.diag(nidd, u, size = 0.05)
## End(Not run)
Extreme U-statistic Pickands estimator
Description
[Deprecated function]
Given a random sample of n exceedances, the estimator
returns an estimator of the shape parameter or extreme
value index using a kernel of order 3, based on
k largest exceedances of xdat. Oorschot et al. (2023) parametrize the model in terms of m=n-k, so that m=n corresponds to using only the three largest observations.
Usage
PickandsXU(xdat, m)
Arguments
| xdat | vector of observations of length  | 
| m | number of largest order statistics  | 
Details
The calculations are based on the recursions provided in Lemma 4.3 of Oorschot et al.
References
Oorschot, J, J. Segers and C. Zhou (2023), Tail inference using extreme U-statistics, Electronic Journal of Statistics, 17(1): 1113-1159. doi:10.1214/23-EJS2129
Stein's vector of weights
Description
Computes a vector of decreasing weights for exceedances.
Usage
Stein_weights(n, gamma = 1)
Arguments
| n | integer, sample size | 
| gamma | positive scalar for power | 
Value
a vector of positive weights of length n
References
Stein, M.L. (2023). A weighted composite log-likelihood approach to parametric estimation of the extreme quantiles of a distribution. Extremes 26, 469-507 <doi:10.1007/s10687-023-00466-w>
Wadsworth's univariate and bivariate exponential threshold diagnostics
Description
Function to produce diagnostic plots and test statistics for the threshold diagnostics exploiting structure of maximum likelihood estimators based on the non-homogeneous Poisson process likelihood
Usage
W.diag(
  xdat,
  model = c("nhpp", "exp", "invexp"),
  u = NULL,
  k,
  q1 = 0,
  q2 = 1,
  par = NULL,
  M = NULL,
  nbs = 1000,
  alpha = 0.05,
  plots = c("LRT", "WN", "PS"),
  UseQuantiles = FALSE,
  changepar = TRUE,
  transform = TRUE,
  ...
)
Arguments
| xdat | a numeric vector of data to be fitted. | 
| model | string specifying whether the univariate or bivariate diagnostic should be used. Either  | 
| u | optional; vector of candidate thresholds. | 
| k | number of thresholds to consider (if  | 
| q1 | lowest quantile for the threshold sequence. | 
| q2 | upper quantile limit for the threshold sequence ( | 
| par | parameters of the NHPP likelihood. If  | 
| M | number of superpositions or 'blocks' / 'years' the process corresponds to (can affect the optimization) | 
| nbs | number of simulations used to assess the null distribution of the LRT, and produce the p-value | 
| alpha | significance level of the LRT | 
| plots | vector of strings indicating which plots to produce;  | 
| UseQuantiles | logical; use quantiles as the thresholds in the plot? | 
| changepar | logical; if  | 
| ... | additional parameters passed to  | 
Details
The function is a wrapper for the univariate (non-homogeneous Poisson process model) and bivariate exponential dependence model.
For the latter, the user can select either the rate or inverse rate parameter  (the inverse rate parametrization  works better for uniformity
of the p-value distribution under the LR test.
There are two options for the bivariate diagnostic: either provide pairwise minimum of marginally
exponentially distributed margins or provide a n times 2 matrix with the original data, which
is transformed to exponential margins using the empirical distribution function.
Value
plots of the requested diagnostics and an invisible list with components
-  MLE: maximum likelihood estimates from all thresholds
-  Cov: joint asymptotic covariance matrix for\xi,\etaor1/\eta.
-  WN: values of the white noise process
-  LRT: values of the likelihood ratio test statistic vs threshold
-  pval: P-value of the likelihood ratio test
-  k: final number of thresholds used
-  thresh: threshold selected by the likelihood ratio procedure
-  qthresh: quantile level of threshold selected by the likelihood ratio procedure
-  thresh0: vector of candidate thresholds
-  qthresh0: quantile level of candidate thresholds
-  mle.u: maximum likelihood estimates for the selected threshold
-  model: model fitted
Author(s)
Jennifer L. Wadsworth
References
Wadsworth, J.L. (2016). Exploiting Structure of Maximum Likelihood Estimators for Extreme Value Threshold Selection, Technometrics, 58(1), 116-126, http://dx.doi.org/10.1080/00401706.2014.998345.
Examples
## Not run: 
set.seed(123)
# Parameter stability only
W.diag(xdat = abs(rnorm(5000)), model = 'nhpp',
       k = 30, q1 = 0, plots = "PS")
W.diag(rexp(1000), model = 'nhpp', k = 20, q1 = 0)
xbvn <- rmnorm(n = 6000,
                mu = rep(0, 2),
                Sigma = cbind(c(1, 0.7), c(0.7, 1)))
# Transform margins to exponential manually
xbvn.exp <- -log(1 - pnorm(xbvn))
#rate parametrization
W.diag(xdat = apply(xbvn.exp, 1, min), model = 'exp',
       k = 30, q1 = 0)
W.diag(xdat = xbvn, model = 'exp', k = 30, q1 = 0)
#inverse rate parametrization
W.diag(xdat = apply(xbvn.exp, 1, min), model = 'invexp',
       k = 30, q1 = 0)
## End(Not run)
Abisko rainfall
Description
Daily non-zero rainfall measurements in Abisko (Sweden) from January 1913 until December 2014.
Arguments
| date | 
 | 
| precip | rainfall amount (in mm) | 
Format
a data frame with 15132 rows and two variables
Source
Abisko Scientific Research Station
References
A. Kiriliouk, H. Rootzen, J. Segers and J.L. Wadsworth (2019), Peaks over thresholds modeling with multivariate generalized Pareto distributions, Technometrics, 61(1), 123–135, <doi:10.1080/00401706.2018.1462738>
Estimation of the bivariate angular dependence function
Description
Estimation of the bivariate angular dependence function
Usage
adf(
  xdat,
  qlev = 0.95,
  estimator = c("hill", "mle", "bayes"),
  level = 0.95,
  ties.method = "random",
  angles = seq(0, 1, by = 0.02),
  plot = TRUE
)
Arguments
| xdat | an  | 
| qlev | quantile level on uniform scale at which to threshold data. Default to 0.95 | 
| estimator | string indicating the estimation method | 
| level | level for confidence intervals, default to 0.95 | 
| ties.method | method for handling of ties in rank transformation | 
| angles | vector of angles at which to evaluate the angular dependence function
The confidence intervals are based on normal quantiles. The standard errors for the  | 
| plot | logical indicating whether to plot the function, defaults to  | 
Value
a plot of the angular dependence function if plot=TRUE, plus an invisible list with components
-  anglethe sequence of angles in (0,1) at which thelambdavalues are evaluated
-  coefpoint estimates of the angular dependence function
-  lowerlevel% confidence interval for lambda (lower bound)
-  upperlevel% confidence interval for lambda (upper bound)
References
J.L. Wadsworth and J.A. Tawn (2013). A new representation for multivariate tail probabilities, Bernoulli, 19(5B), 2689-2714.
Examples
set.seed(12)
dat <- mev::rmev(n = 1000, d = 2, model = "log", param = 0.1)
adf(xdat = dat, estimator = 'hill')
Bivariate angular dependence function for extrapolation based on rays
Description
The scale parameter g(w) in the Ledford and Tawn approach is estimated empirically for
x large as 
\hat{g}(w) = \frac{\Pr(X_P>xw, Y_P>x(1-w))}{\Pr(X_P>x, Y_P>x)}
where the sample (X_P, Y_P) are observations on the unit Pareto scale.
The coefficient \eta is estimated using maximum likelihood as the
shape parameter of a generalized Pareto distribution on \min(X_P, Y_P).
Usage
angextrapo(xdat, qlev = 0.95, w = seq(0.05, 0.95, length = 20), ...)
Arguments
| xdat | an  | 
| qlev | quantile level on uniform scale at which to threshold data. Default to 0.95 | 
| w | vector of unique angles between 0 and 1 at which to evaluate scale empirically. | 
| ... | additional arguments, for backward compatibility | 
Value
a list with elements
-  w: angles between zero and one
-  g: scale function at a given value ofw
-  eta: Ledford and Tawn tail dependence coefficient
References
Ledford, A.W. and J. A. Tawn (1996), Statistics for near independence in multivariate extreme values. Biometrika, 83(1), 169–187.
Examples
angextrapo(rmev(n = 1000, model = 'log', d = 2, param = 0.5))
Rank-based transformation to angular measure
Description
The method uses the pseudo-polar transformation for suitable norms, transforming
the data to pseudo-observations, than marginally to unit Frechet or unit Pareto.
Empirical or Euclidean weights are computed and returned alongside with the angular and
radial sample for values above threshold(s) thresh, specified in terms of quantiles
of the radial component R or marginal quantiles. Only complete tuples are kept.
Usage
angmeas(
  xdat,
  thresh,
  Rnorm = c("l1", "l2", "linf"),
  Anorm = c("l1", "l2", "linf", "arctan"),
  marg = c("frechet", "pareto"),
  wgt = c("empirical", "euclidean"),
  region = c("sum", "min", "max"),
  is.angle = FALSE,
  ...
)
Arguments
| xdat | an  | 
| thresh | threshold of length 1 for  | 
| Rnorm | character string indicating the norm for the radial component. | 
| Anorm | character string indicating the norm for the angular component.  | 
| marg | character string indicating choice of marginal transformation, either to Frechet or Pareto scale | 
| wgt | character string indicating weighting function for the equation. Can be based on Euclidean or empirical likelihood for the mean | 
| region | character string specifying which observations to consider (and weight).  | 
| is.angle | logical indicating whether observations are already angle with respect to  | 
| ... | additional arguments | 
Details
The empirical likelihood weighted mean problem is implemented for all thresholds,
while the Euclidean likelihood is only supported for diagonal thresholds specified
via region=sum.
Value
a list with arguments ang for the d-1 pseudo-angular sample, rad with the radial component
and possibly wts if Rnorm='l1' and the empirical likelihood algorithm converged. The Euclidean algorithm always returns weights even if some of these are negative.
a list with components
-  angmatrix of pseudo-angular observations
-  radvector of radial contributions
-  wtsempirical or Euclidean likelihood weights for angular observations
Author(s)
Leo Belzile
References
Einmahl, J.H.J. and J. Segers (2009). Maximum empirical likelihood estimation of the spectral measure of an extreme-value distribution, Annals of Statistics, 37(5B), 2953–2989.
de Carvalho, M. and B. Oumow and J. Segers and M. Warchol (2013). A Euclidean likelihood estimator for bivariate tail dependence, Comm. Statist. Theory Methods, 42(7), 1176–1192.
Owen, A.B. (2001). Empirical Likelihood, CRC Press, 304p.
Examples
x <- rmev(n = 25, d = 3, param = 0.5, model = 'log')
wts <- angmeas(xdat = x, Rnorm = 'l1', Anorm = 'l1', marg = 'frechet', wgt = 'empirical')
wts2 <- angmeas(xdat = x, Rnorm = 'l2', Anorm = 'l2', marg = 'pareto')
Dirichlet mixture smoothing of the angular measure
Description
This function computes the empirical or Euclidean likelihood
estimates of the spectral measure and uses the points returned from a call to angmeas to compute the Dirichlet
mixture smoothing of de Carvalho, Warchol and Segers (2012), placing a Dirichlet kernel at each observation.
Usage
angmeasdir(
  xdat,
  thresh,
  Rnorm = c("l1", "l2", "linf"),
  Anorm = c("l1", "l2", "linf", "arctan"),
  marg = c("frechet", "pareto"),
  wgt = c("empirical", "euclidean"),
  region = c("sum", "min", "max"),
  is.angle = FALSE,
  ...
)
Arguments
| xdat | an  | 
| thresh | threshold of length 1 for  | 
| Rnorm | character string indicating the norm for the radial component. | 
| Anorm | character string indicating the norm for the angular component.  | 
| marg | character string indicating choice of marginal transformation, either to Frechet or Pareto scale | 
| wgt | character string indicating weighting function for the equation. Can be based on Euclidean or empirical likelihood for the mean | 
| region | character string specifying which observations to consider (and weight).  | 
| is.angle | logical indicating whether observations are already angle with respect to  | 
| ... | additional arguments | 
Details
The cross-validation bandwidth is the solution of
\max_{\nu} \sum_{i=1}^n \log \left\{ \sum_{k=1,k \neq i}^n p_{k, -i} f(\mathbf{w}_i; \nu \mathbf{w}_k)\right\},
where f is the density of the Dirichlet distribution, p_{k, -i} is the Euclidean weight
obtained from estimating the Euclidean likelihood problem without observation i.
Value
an invisible list with components
-  nubandwidth parameter obtained by cross-validation;
-  dirparmatnbydmatrix of Dirichlet parameters for the mixtures;
-  wtsmixture weights.
Examples
set.seed(123)
x <- rmev(n = 100, d = 2L, param = 0.5, model = 'log')
out <- angmeasdir(x)
Automated mean residual life plots
Description
This function implements the automated proposal from Section 2.2 of Langousis et al. (2016) for mean residual life plots. It returns the threshold that minimize the weighted mean square error and moment estimators for the scale and shape parameter based on weighted least squares.
Usage
automrl(xdat, kmax, thresh, plot = TRUE, ...)
Arguments
| xdat | [numeric] vector of observations | 
| kmax | [integer] maximum number of order statistics | 
| thresh | [numeric] vector of thresholds; if missing, uses all order statistics from the 20th largest until  | 
| plot | [logical] if  | 
| ... | additional arguments, currently ignored | 
Details
The procedure consists in estimating the usual
Value
a list containing
-  thresh: selected threshold
-  scale: scale parameter estimate
-  shape: shape parameter estimate
References
Langousis, A., A. Mamalakis, M. Puliga and R. Deidda (2016). Threshold detection for the generalized Pareto distribution: Review of representative methods and application to the NOAA NCDC daily rainfall database, Water Resources Research, 52, 2659–2681.
Censored likelihood for multivariate peaks over threshold models
Description
Censored likelihoods for various parametric limiting models over region determined by
\{y \in F: \max_{j=1}^D \sigma_j \frac{y^\xi_j-1}{\xi_j}+\mu_j  > u\};
where \mu is loc, \sigma is scale and \xi is shape.
Usage
clikmgp(
  dat,
  thresh,
  mthresh = thresh,
  loc,
  scale,
  shape,
  par,
  model = c("log", "neglog", "br", "xstud"),
  likt = c("mgp", "pois", "binom"),
  lambdau = 1,
  ...
)
Arguments
| dat | matrix of observations | 
| thresh | functional threshold for the maximum | 
| mthresh | vector of individuals thresholds under which observations are censored | 
| loc | vector of location parameter for the marginal generalized Pareto distribution | 
| scale | vector of scale parameter for the marginal generalized Pareto distribution | 
| shape | vector of shape parameter for the marginal generalized Pareto distribution | 
| par | list of parameters:  | 
| model | string indicating the model family, one of  | 
| likt | string indicating the type of likelihood, with an additional contribution for the non-exceeding components: one of   | 
| lambdau | vector of marginal rate of marginal threshold exceedance. | 
| ... | additional arguments (see Details) | 
Details
Optional arguments can be passed to the function via ...
-  censoredmatrix of booleans andNAindicating whether observationsdatfall below the mthresholdmthresh
-  clcluster instance created bymakeCluster(default toNULL)
-  ncorsnumber of cores for parallel computing of the likelihood
-  numAbovePerRownumber of observations above mthreshold (non-missing) per row
-  numAbovePerColnumber of observations above mthreshold (non-missing) per column
-  mmaxmaximum per column
-  B1number of replicates for quasi Monte Carlo integral for the exponent measure
-  B2number of replicates for quasi Monte Carlo integral for the censored intensity contribution
-  genvec1generating vector for the quasi Monte Carlo routine (exponent measure), associated withB1
-  genvec2generating vector for the quasi Monte Carlo routine (individual obs contrib), associated withB2
Value
the value of the log-likelihood with attributes expme, giving the exponent measure
Note
The location and scale parameters are not identifiable unless one of them is fixed.
Confidence intervals for profile likelihood objects
Description
Computes confidence intervals for the parameter psi for profile likelihood objects.
This function uses spline interpolation to derive level confidence intervals
Usage
## S3 method for class 'eprof'
confint(
  object,
  parm,
  level = 0.95,
  prob = c((1 - level)/2, 1 - (1 - level)/2),
  print = FALSE,
  method = c("cobs", "smooth.spline"),
  boundary = FALSE,
  ...
)
Arguments
| object | an object of class  | 
| parm | a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. | 
| level | confidence level, with default value of 0.95 | 
| prob | percentiles, with default giving symmetric 95% confidence intervals | 
| print | should a summary be printed. Default to  | 
| method | string for the method, either  | 
| boundary | logical; if  | 
| ... | additional arguments passed to functions. Providing a logical  | 
Value
returns a 2 by 3 matrix containing point estimates, lower and upper confidence intervals based on the likelihood root and modified version thereof
Threshold selection via coefficient of variation
Description
This function computes the empirical coefficient of variation and computes a weighted statistic comparing the squared distance with the theoretical coefficient variation corresponding to a specific shape parameter (estimated from the data using a moment estimator as the value minimizing the test statistic, or using maximum likelihood). The procedure stops if there are no more than 10 exceedances above the highest threshold
Usage
cvselect(
  xdat,
  thresh,
  method = c("mle", "wcv", "cv"),
  nsim = 999L,
  nthresh = 10L,
  level = 0.05,
  lazy = FALSE
)
Arguments
| xdat | [vector] vector of observations | 
| thresh | [vector] vector of threshold. If missing, set to  | 
| method | [string], either moment estimator for the (weighted) coefficient of variation ( | 
| nsim | [integer] number of bootstrap replications | 
| nthresh | [integer] number of thresholds, if  | 
| level | [numeric] probability level for sequential testing procedure | 
| lazy | [logical] compute the bootstrap p-value until the test stops rejecting at level  | 
Value
a list with elements
-  thresh0: value of threshold returned by the procedure,NAif the hypothesis is rejected at all thresholds
-  thresh: sorted vector of candidate thresholds
-  cindex: index of selected threshold amongthreshorNAif none returned
-  pval: bootstrap p-values, withNAiflazyand the p-value exceeds level at lower thresholds
-  shape: shape parameter estimates
-  nexc: number of exceedances of each thresholdthresh
-  method: estimation method for the shape parameter
References
del Castillo, J. and M. Padilla (2016). Modelling extreme values by the residual coefficient of variation, SORT, 40(2), pp. 303–320.
Distance matrix with geometric anisotropy
Description
The function computes the distance between locations, with geometric anisotropy.
Consider real parameters \theta_1 and \theta_2, and the transformation \psi=\arctan(\theta_1/\theta_2)/2 and r=1 +\theta_1^2 + \theta_2^2.
The dilation and rotation matrix is
\left(\begin{matrix} \sqrt{r}\cos(\rho) & -\sqrt{r}\sin(\rho) \\ \sin(\rho)/\sqrt{r} & \cos(\rho)/\sqrt{r} \end{matrix} \right).
The parametrization is convenient for optimization purposes, as the parameter vector is unconstrained and the transformation has unit Jacobian.
Usage
dgeoaniso(loc, theta)
Arguments
| loc | a  | 
| theta | numeric vector of length 2, real parameters | 
Value
a d by d square matrix of pairwise distance
References
Rai, K. and Brown, P.E. (2025), A parameter transformation of the anisotropic Matérn covariance function. Canadian Journal of Statistics e11839. doi:10.1002/cjs.11839
Distance matrix with geometric anisotropy
Description
The function computes the distance between locations, with geometric anisotropy.
The parametrization assumes there is a scale parameter, say \sigma, so that scale
is the distortion for the second component only. The angle rho must lie in
[-\pi/2, \pi/2]. The dilation and rotation matrix is
\left(\begin{matrix} \cos(\rho) & \sin(\rho) \\ - \sigma\sin(\rho) & \sigma\cos(\rho) \end{matrix} \right)
Usage
distg(loc, scale, rho)
Arguments
| loc | a  | 
| scale | numeric vector of length 1, greater than 1. | 
| rho | angle for the anisotropy, must be larger than  | 
Value
a d by d square matrix of pairwise distance
Extended generalised Pareto families
Description
This function provides the log-likelihood and quantiles for the three different families presented
in Papastathopoulos and Tawn (2013) and the two proposals of Gamet and Jalbert (2022), plus exponential tilting. All of the models contain an additional parameter, \kappa \ge 0.
All families share the same tail index as the generalized Pareto distribution, while allowing for lower thresholds.
For most models, the distribution reduce to the generalised Pareto when \kappa=1 (for models gj-tnorm and logist, on the boundary of the parameter space when \kappa \to 0).
egp.retlev gives the return levels for the extended generalised Pareto distributions
Arguments
| xdat | vector of observations, greater than the threshold | 
| thresh | threshold value | 
| par | parameter vector ( | 
| model | a string indicating which extended family to fit | 
| show | logical; if  | 
| p | extreme event probability;  | 
| plot | logical; if  | 
Details
For return levels, the p argument can be related to T year exceedances as follows:
if there are n_y observations per year, than take p
to equal 1/(Tn_y) to obtain the T-years return level.
Value
egp.ll returns the log-likelihood value, while egp.retlev returns a plot of the return levels if plot=TRUE and a list with tail probabilities p, return levels retlev, thresholds thresh and model name model.
Usage
egp.ll(xdat, thresh, model, par)
egp.retlev(xdat, thresh, par, model, p, plot=TRUE)
Author(s)
Leo Belzile
References
Papastathopoulos, I. and J. Tawn (2013). Extended generalised Pareto models for tail estimation, Journal of Statistical Planning and Inference 143(3), 131–143, <doi:10.1016/j.jspi.2012.07.001>.
Gamet, P. and Jalbert, J. (2022). A flexible extended generalized Pareto distribution for tail estimation. Environmetrics, 33(6), <doi:10.1002/env.2744>.
Examples
set.seed(123)
xdat <- rgp(1000, loc = 0, scale = 2, shape = 0.5)
par <- fit.egp(xdat, thresh = 0, model = 'gj-beta')$par
p <- c(1/1000, 1/1500, 1/2000)
# With multiple thresholds
th <- c(0, 0.1, 0.2, 1)
opt <- tstab.egp(xdat, thresh = th, model = 'gj-beta')
egp.retlev(xdat = xdat, thresh = th, model = 'gj-beta', p = p)
opt <- tstab.egp(xdat, th, model = 'pt-power', plots = NA)
egp.retlev(xdat = xdat, thresh = th, model = 'pt-power', p = p)
Extended generalised Pareto log likelihood
Description
This function provides the log-likelihood and quantiles for the different extended generalized Pareto distributions. All families share the same tail index as the generalized Pareto, and reduce to the latter when kappa=1.
Usage
egp.ll(
  xdat,
  thresh,
  par,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist")
)
egp.retlev(
  xdat,
  thresh,
  par,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  p,
  confint = FALSE,
  plot = TRUE
)
Arguments
| xdat | vector of observations, greater than the threshold | 
| thresh | threshold value | 
| par | parameter vector ( | 
| model | a string indicating which extended family to fit | 
| p | extreme event probability;  | 
| plot | logical; if  | 
Fit of extended GP models and parameter stability plots
Description
This function is an alias of fit.egp.
Usage
egp.fit(
  xdat,
  thresh,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  init,
  show = FALSE
)
Details
Supported for backward compatibility
Deprecated function for parameter stability plots
Description
Deprecated function for parameter stability plots
Usage
egp.fitrange(
  xdat,
  thresh,
  model = c("egp1", "egp2", "egp3"),
  plots = 1:3,
  umin,
  umax,
  nint
)
Profile log likelihood for extended generalized Pareto models
Description
Computes the profile log likelihood over a grid of values of \psi for various parameters, including return levels.
Usage
egp.pll(
  psi,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  param = c("kappa", "scale", "shape", "retlev"),
  mle = NULL,
  xdat,
  thresh = NULL,
  plot = FALSE,
  method = c("Nelder", "nlminb", "BFGS"),
  p,
  ...
)
Arguments
| psi | grid of values for the parameter to profile | 
| model | string; choice of extended eneralized Pareto model. | 
| param | string; parameter to profile | 
| mle | a vector or matrix with maximum likelihood estimates of  | 
| xdat | vector of observations | 
| thresh | vector of positive thresholds. If  | 
| plot | logical; if  | 
| method | string giving the optimization method for the outer optimization in the augmented Lagrangian routine; one of  | 
| p | tail probability for return level if  | 
| ... | additional arguments, currently ignored | 
Value
an object of class eprof
Fit an extended generalized Pareto distribution of Naveau et al.
Description
Deprecated function name to fit an extended generalized Pareto family. The user should call fit.extgp instead.
Usage
egp2.fit(
  data,
  model = 1,
  method = c("mle", "pwm"),
  init,
  censoring = c(0, Inf),
  rounded = 0,
  CI = FALSE,
  R = 1000,
  ncpus = 1,
  plots = TRUE
)
Arguments
| data | data vector. | 
| model | integer ranging from 0 to 4 indicating the model to select (see  | 
| method | string; either  | 
| init | vector of initial values, comprising of  | 
| censoring | numeric vector of length 2 containing the lower and upper bound for censoring;  | 
| rounded | numeric giving the instrumental precision (and rounding of the data), with default of 0. | 
| R | integer; number of bootstrap replications. | 
| ncpus | integer; number of CPUs for parallel calculations (default: 1). | 
| plots | logical; whether to produce histogram and density plots. | 
See Also
Extended generalized Pareto distribution
Description
Density function, distribution function, quantile function and random number generation for various extended generalized Pareto distributions
Usage
pegp(
  q,
  scale,
  shape,
  kappa,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  lower.tail = TRUE,
  log.p = FALSE
)
degp(
  x,
  scale,
  shape,
  kappa,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  log = FALSE
)
qegp(
  p,
  scale,
  shape,
  kappa,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  lower.tail = TRUE
)
regp(
  n,
  scale,
  shape,
  kappa,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist")
)
Arguments
| scale | scale parameter, strictly positive. | 
| shape | shape parameter. | 
| kappa | shape parameter for the tilting distribution. | 
| model | string giving the distribution of the model | 
| lower.tail | logical; if  | 
| log.p,log | logical; if  | 
| x,q | vector of quantiles | 
| p | vector of probabilities | 
| n | scalar number of observations | 
References
Papastathopoulos, I. and J. Tawn (2013). Extended generalised Pareto models for tail estimation, Journal of Statistical Planning and Inference 143(3), 131–143, <doi:10.1016/j.jspi.2012.07.001>.
Gamet, P. and Jalbert, J. (2022). A flexible extended generalized Pareto distribution for tail estimation. Environmetrics, 33(6), <doi:10.1002/env.2744>.
Self-concordant empirical likelihood for a vector mean
Description
Self-concordant empirical likelihood for a vector mean
Usage
emplik(
  dat,
  mu = rep(0, ncol(dat)),
  lam = rep(0, ncol(dat)),
  eps = 1/nrow(dat),
  M = 1e+30,
  thresh = 1e-30,
  itermax = 100
)
Arguments
| dat | 
 | 
| mu | 
 | 
| lam | starting values for Lagrange multiplier vector, default to zero vector | 
| eps | lower cutoff for  | 
| M | upper cutoff for  | 
| thresh | convergence threshold for log likelihood (default of  | 
| itermax | upper bound on number of Newton steps. | 
Value
a list with components
-  logelrlog empirical likelihood ratio.
-  lamLagrange multiplier (vector of lengthd).
-  wtsnvector of observation weights (probabilities).
-  convboolean indicating convergence.
-  niternumber of iteration until convergence.
-  ndecNewton decrement.
-  gradnormnorm of gradient of log empirical likelihood.
Author(s)
Art Owen, C++ port by Leo Belzile
References
Owen, A.B. (2013). Self-concordance for empirical likelihood, Canadian Journal of Statistics, 41(3), 387–397.
Eskdalemuir Observatory Daily Rainfall
Description
This dataset contains exceedances of 30mm for daily cumulated rainfall observations over the period 1970-1986. These data were aggregated from hourly series.
Format
a vector with 93 daily cumulated rainfall measurements exceeding 30mm.
Details
The station is one of the rainiest of the whole UK, with an average 1554m of cumulated rainfall per year. The data consisted of 6209 daily observations, of which 4409 were non-zero. Only the 93 largest observations are provided.
Source
Met Office.
Exponent measure for multivariate generalized Pareto distributions
Description
Integrated intensity over the region defined by [0, z]^c for logistic, Huesler-Reiss, Brown-Resnick and extremal Student processes.
Usage
expme(
  z,
  par,
  model = c("log", "neglog", "hr", "br", "xstud"),
  method = c("TruncatedNormal", "mvtnorm", "mvPot")
)
Arguments
| z | vector at which to estimate exponent measure | 
| par | list of parameters | 
| model | string indicating the model family | 
| method | string indicating the package from which to extract the numerical integration routine | 
Value
numeric giving the measure of the complement of [0,z].
Note
The list par must contain different arguments depending on the model. For the Brown–Resnick model, the user must supply the conditionally negative definite matrix Lambda following the parametrization in Engelke et al. (2015) or the covariance matrix Sigma, following Wadsworth and Tawn (2014). For the Husler–Reiss model, the user provides the mean and covariance matrix, m and Sigma. For the extremal student, the covariance matrix Sigma and the degrees of freedom df. For the logistic model, the strictly positive dependence parameter alpha.
Examples
## Not run: 
# Extremal Student
Sigma <- stats::rWishart(n = 1, df = 20, Sigma = diag(10))[, , 1]
expme(z = rep(1, ncol(Sigma)), par = list(Sigma = cov2cor(Sigma), df = 3), model = "xstud")
# Brown-Resnick model
D <- 5L
loc <- cbind(runif(D), runif(D))
di <- as.matrix(dist(rbind(c(0, ncol(loc)), loc)))
semivario <- function(d, alpha = 1.5, lambda = 1) {
  (d / lambda)^alpha
}
Vmat <- semivario(di)
Lambda <- Vmat[-1, -1] / 2
expme(z = rep(1, ncol(Lambda)), par = list(Lambda = Lambda), model = "br", method = "mvPot")
Sigma <- outer(Vmat[-1, 1], Vmat[1, -1], "+") - Vmat[-1, -1]
expme(z = rep(1, ncol(Lambda)), par = list(Lambda = Lambda), model = "br", method = "mvPot")
## End(Not run)
Extremal index estimators
Description
The function implements estimators of the extremal index based on interexceedance time and gap of exceedances. The maximum likelihood estimator and iteratively reweighted least square estimators of Suveges (2007) as well as the intervals estimator. The implementation differs from the presentation of the paper in that an iteration limit is enforced to make sure the iterative procedure terminates. Multiple thresholds can be supplied.
Usage
ext.index(
  xdat,
  quantile = 0.95,
  method = c("wls", "mle", "intervals"),
  plot = FALSE,
  warn = FALSE,
  ...
)
Arguments
| xdat | numeric vector of observations | 
| quantile | a vector of quantile levels in (0,1) for estimation of the extremal index. Defaults to 0.95 | 
| method | a string specifying the chosen method. Must be either  | 
| plot | logical; if  | 
| warn | logical; if  | 
| ... | additional arguments, for backward compatibility | 
Details
The iteratively reweighted least square is a procedure based on the gaps of exceedances S_n=T_n-1
The model is first fitted to non-zero gaps, which are rescaled to have unit exponential scale. The slope
between the theoretical quantiles and the normalized gap of exceedances is b=1/\theta,
with intercept a=\log(\theta)/\theta.
As such, the estimate of the extremal index is based on \hat{\theta}=\exp(\hat{a}/\hat{b}).
The weights are chosen in such a way as to reduce the influence of the smallest values.
The estimator exploits the dual role of \theta as the parameter of the mean for
the interexceedance time as well as the mixture proportion for the non-zero component.
The maximum likelihood is based on an independence likelihood for the rescaled gap of exceedances,
namely \bar{F}(u_n)S(u_n). The score equation is equivalent to a quadratic equation in
\theta and the maximum likelihood estimate is available in closed form.
Its validity requires however condition D^{(2)}(u_n) to apply;
this should be checked by the user beforehand.
A warning is emitted if the effective sample size is less than 50 observations.
Value
a vector or matrix of estimated extremal index of dimension length(method) by length(q).
Author(s)
Leo Belzile
References
Ferro and Segers (2003). Inference for clusters of extreme values, JRSS: Series B, 65(2), 545-556.
Suveges (2007) Likelihood estimation of the extremal index. Extremes, 10(1), 41-55.
Suveges and Davison (2010), Model misspecification in peaks over threshold analysis. Annals of Applied Statistics, 4(1), 203-221.
Fukutome, Liniger and Suveges (2015), Automatic threshold and run parameter selection: a climatology for extreme hourly precipitation in Switzerland. Theoretical and Applied Climatology, 120(3), 403-416.
Examples
## Not run: 
set.seed(234)
# Moving maxima model with theta=0.5
a <- 1; theta <-  1/(1+a)
sim <- rgev(10001, loc=1/(1+a),scale=1/(1+a),shape=1)
x <- pmax(sim[-length(sim)]*a,sim[-1])
q <- seq(0.9, 0.99, by = 0.01)
ext.index(xdat = x, quantile = q, method = c ('wls', 'mle'))
## End(Not run)
Estimators of the extremal coefficient
Description
These functions estimate the extremal coefficient using an approximate sample from the Frechet distribution.
Usage
extcoef(
  xdat,
  coord = NULL,
  thresh = NULL,
  estimator = c("schlather", "smith", "fmado"),
  margtrans = c("emp", "gev", "none"),
  ties.method = "random",
  prob = 0,
  plot = TRUE,
  ...
)
Arguments
| xdat | an  | 
| coord | an optional  | 
| thresh | threshold parameter (default is to keep all data if  | 
| estimator | string indicating which estimator to compute, one of  | 
| margtrans | string indicating which method to use to transform the margins to unit Frechet scale, either  | 
| ties.method | method for handling of ties in rank transformation | 
| prob | probability of not exceeding threshold  | 
| plot | logical; should cloud or matrix of pairwise empirical estimates be plotted? Default to  | 
| ... | additional parameters passed to the function, currently ignored. | 
Details
The Smith estimator: suppose Z(x) is simple max-stable vector (i.e., with unit Frechet marginals). Then
1/Z is unit exponential and 1/\max(Z(s_1), Z(s_2))
is exponential with rate \theta = \max\{Z(s_1), Z(s_2)\}.
The extremal index for the pair can therefore be calculated using
the reciprocal mean.
The Schlather and Tawn estimator: the likelihood of the naive estimator for a pair
of two sites A is
 \mathrm{card}\left\{ j: \max_{i \in A} X_i^{(j)}\bar{X}_i)>z \right\}
\log(\theta_A) - \theta_A \sum_{j=1}^n \left[ \max \left\{z, \max_{i \in A}
(X_i^{(j)}\bar{X}_i)\right\}\right]^{-1},
where \bar{X}_i = n^{-1} \sum_{j=1}^n 1/X_i^{(j)} is the harmonic mean and z
is a threshold on the unit Frechet scale.
The search for the maximum likelihood estimate for every pair A
is restricted to the interval [1,3].
A binned version of the extremal coefficient cloud is also returned.
The Schlather estimator is not self-consistent.
The Schlather and Tawn estimator includes as special case
the Smith estimator if we do not censor the data (p = 0)
and do not standardize observations by their harmonic mean.
The F-madogram estimator is a non-parametric estimate based on a stationary process
Z; the extremal coefficient satisfies
\theta(h)=\frac{1+2\nu(h)}{1-2\nu(h)},
where
\nu(h) = \frac{1}{2} \mathsf{E}[|F(Z(s+h)-F(Z(s))|]
The implementation only uses complete pairs to calculate the relative ranks.
All estimators are coded in plain R and computations are not optimized.
The estimation time can therefore be large for large data sets.
If there are no missing observations, the routine fmadogram
from the SpatialExtremes package should be preferred as it is
noticeably faster.
The data will typically consist of max-stable vectors or block maxima.
Both of the Smith and the Schlather–Tawn estimators require unit
Frechet margins; the margins will be standardized
to the unit Frechet scale, either parametrically or nonparametrically unless margtrans="none".
If margtrans = "gev", a parametric GEV model is fitted to each column of dat using maximum likelihood
estimation and transformed back using the probability integral transform. If method = "emp",
using the empirical distribution function. The latter is the default, as it is appreciably faster.
Value
an invisible list with vectors dist if coord is non-null or else a matrix of pairwise indices ind,
extcoef and the supplied estimator, fmado and binned. If estimator == "schlather", an additional matrix with 2 columns containing the binned distance binned with the h and the binned extremal coefficient.
References
Schlather, M. and J. Tawn (2003). A dependence measure for multivariate and spatial extremes, Biometrika, 90(1), pp.139–156.
Cooley, D., P. Naveau and P. Poncet (2006). Variograms for spatial max-stable random fields, In: Bertail P., Soulier P., Doukhan P. (eds) Dependence in Probability and Statistics. Lecture Notes in Statistics, vol. 187. Springer, New York, NY
R. J. Erhardt, R. L. Smith (2012), Approximate Bayesian computing for spatial extremes, Computational Statistics and Data Analysis, 56, pp.1468–1481.
Examples
## Not run: 
coord <- 10 * cbind(runif(50), runif(50))
di <- as.matrix(dist(coord))
dat <- rmev(
  n = 1000,
  d = 100,
  param = 3,
  sigma = exp(-di / 2),
  model = 'xstud'
)
res <- extcoef(xdat = dat, coord = coord)
# Extremal Student extremal coefficient function
XT.extcoeffun <- function(h, nu, corrfun, ...) {
  if (!is.function(corrfun)) {
    stop('Invalid function \"corrfun\".')
  }
  h <- unique(as.vector(h))
  rhoh <- sapply(h, corrfun, ...)
  cbind(
    h = h,
    extcoef = 2 * pt(sqrt((nu + 1) * (1 - rhoh) / (1 + rhoh)), nu + 1)
  )
}
#This time, only one graph with theoretical extremal coef
plot(res$dist, res$extcoef, ylim = c(1, 2), pch = 20)
abline(v = 2, col = 'gray')
extcoefxt <- XT.extcoeffun(
  seq(0, 10, by = 0.1),
  nu = 3,
  corrfun = function(x) {
    exp(-x / 2)
  }
)
lines(
  extcoefxt[, 'h'],
  extcoefxt[, 'extcoef'],
  type = 'l',
  col = 'blue',
  lwd = 2
)
# Brown--Resnick extremal coefficient function
BR.extcoeffun <- function(h, vario, ...) {
  if (!is.function(vario)) {
    stop('Invalid function \"vario\".')
  }
  h <- unique(as.vector(h))
  gammah <- sapply(h, vario, ...)
  cbind(h = h, extcoef = 2 * pnorm(sqrt(gammah / 4)))
}
extcoefbr <- BR.extcoeffun(
  seq(0, 20, by = 0.25),
  vario = function(x) {
    2 * x^0.7
  }
)
lines(
  extcoefbr[, 'h'],
  extcoefbr[, 'extcoef'],
  type = 'l',
  col = 'orange',
  lwd = 2
)
coord <- 10 * cbind(runif(20), runif(20))
di <- as.matrix(dist(coord))
dat <- rmev(
  n = 1000,
  d = 20,
  param = 3,
  sigma = exp(-di / 2),
  model = 'xstud'
)
res <- extcoef(
  xdat = dat,
  coord = coord,
  estimator = "smith"
)
## End(Not run)
Extended generalised Pareto families of Naveau et al. (2016)
Description
Density function, distribution function, quantile function and random generation for the extended generalized Pareto distribution (GPD) with scale and shape parameters.
Arguments
| q | vector of quantiles | 
| x | vector of observations | 
| p | vector of probabilities | 
| n | sample size | 
| prob | mixture probability for model  | 
| kappa | shape parameter for  | 
| delta | additional parameter for  | 
| sigma | scale parameter | 
| xi | shape parameter | 
| type | integer between 0 to 5 giving the model choice | 
| step | function of step size for discretization with default  | 
| log | logical; should the log-density be returned (default to  | 
| unifsamp | sample of uniform; if provided, the data will be used in place of new uniform random variates | 
| censoring | numeric vector of length 2 containing the lower and upper bound for censoring | 
Details
The extended generalized Pareto families proposed in Naveau et al. (2016) retain the tail index of the distribution while being compliant with the theoretical behavior of extreme low rainfall. There are five proposals, the first one being equivalent to the GP distribution.
-  type0 corresponds to uniform carrier,G(u)=u.
-  type1 corresponds to a three parameters family, with carrierG(u)=u^\kappa.
-  type2 corresponds to a three parameters family, with carrierG(u)=1-V_\delta((1-u)^\delta).
-  type3 corresponds to a four parameters family, with carrierG(u)=1-V_\delta((1-u)^\delta))^{\kappa/2}. 
-  type4 corresponds to a five parameter model (a mixture oftype2, withG(u)=pu^\kappa + (1-p)*u^\delta
Usage
pextgp(q, prob=NA, kappa=NA, delta=NA, sigma=NA, xi=NA, type=1)
dextgp(x, prob=NA, kappa=NA, delta=NA, sigma=NA, xi=NA, type=1, log=FALSE)
qextgp(p, prob=NA, kappa=NA, delta=NA, sigma=NA, xi=NA, type=1)
rextgp(n, prob=NA, kappa=NA, delta=NA, sigma=NA, xi=NA, type=1, unifsamp=NULL, censoring=c(0,Inf))
Author(s)
Raphael Huser and Philippe Naveau
References
Naveau, P., R. Huser, P. Ribereau, and A. Hannart (2016), Modeling jointly low, moderate, and heavy rainfall intensities without a threshold selection, Water Resour. Res., 52, 2753-2769, doi:10.1002/2015WR018552.
Carrier distribution for the extended GP distributions of Naveau et al.
Description
Density, distribution function, quantile function and random number generation for the carrier distributions of the extended Generalized Pareto distributions.
Arguments
| u | vector of observations ( | 
| prob | mixture probability for model  | 
| kappa | shape parameter for  | 
| delta | additional parameter for  | 
| type | integer between 0 to 5 giving the model choice | 
| log | logical; should the log-density be returned (default to  | 
| n | sample size | 
| unifsamp | sample of uniform; if provided, the data will be used in place of new uniform random variates | 
| censoring | numeric vector of length 2 containing the lower and upper bound for censoring | 
| direct | logical; which method to use for sampling in model of  | 
Usage
pextgp.G(u, type=1, prob, kappa, delta)
dextgp.G(u, type=1, prob=NA, kappa=NA, delta=NA, log=FALSE)
qextgp.G(u, type=1, prob=NA, kappa=NA, delta=NA)
rextgp.G(n, prob=NA, kappa=NA, delta=NA,
type=1, unifsamp=NULL, direct=FALSE, censoring=c(0,1))
Author(s)
Raphael Huser and Philippe Naveau
See Also
Pairwise extremogram for max-risk functional
Description
The function computes the pairwise chi estimates and plots them as a function of the distance between sites. The function also includes utilities for geometric anisotropy.
Usage
extremo(xdat, margp, coord, scale = 1, rho = 0, plot = FALSE, ...)
Arguments
| xdat | data matrix | 
| margp | marginal probability above which to threshold observations | 
| coord | matrix of coordinates (one site per row) | 
| scale | geometric anisotropy scale parameter | 
| rho | geometric anisotropy angle parameter | 
| plot | logical; should a graph of the pairwise estimates against distance? Default to  | 
| ... | additional arguments passed to plot | 
Value
an invisible matrix with pairwise estimates of chi along with distance (unsorted)
Examples
## Not run: 
lon <- seq(650, 720, length = 10)
lat <- seq(215, 290, length = 10)
# Create a grid
grid <- expand.grid(lon,lat)
coord <- as.matrix(grid)
dianiso <- distg(coord, 1.5, 0.5)
sgrid <- scale(grid, scale = FALSE)
# Specify marginal parameters `loc` and `scale` over grid
eta <- 26 + 0.05*sgrid[,1] - 0.16*sgrid[,2]
tau <- 9 + 0.05*sgrid[,1] - 0.04*sgrid[,2]
# Parameter matrix of Huesler--Reiss
# associated to power variogram
Lambda <- ((dianiso/30)^0.7)/4
# Regular Euclidean distance between sites
di <- distg(coord, 1, 0)
# Simulate generalized max-Pareto field
set.seed(345)
simu1 <- rgparp(n = 1000, thresh = 50, shape = 0.1, riskf = "max",
                scale = tau, loc = eta, sigma = Lambda, model = "hr")
extdat <- extremo(dat = simu1, margp = 0.98, coord = coord,
                  scale = 1.5, rho = 0.5, plot = TRUE)
# Constrained optimization
# Minimize distance between extremal coefficient from fitted variogram
mindistpvario <- function(par, emp, coord){
alpha <- par[1]; if(!isTRUE(all(alpha > 0, alpha < 2))){return(1e10)}
scale <- par[2]; if(scale <= 0){return(1e10)}
a <- par[3]; if(a<1){return(1e10)}
rho <- par[4]; if(abs(rho) >= pi/2){return(1e10)}
semivariomat <- power.vario(distg(coord, a, rho), alpha = alpha, scale = scale)
  sum((2*(1-pnorm(sqrt(semivariomat[lower.tri(semivariomat)]/2))) - emp)^2)
}
hin <- function(par, ...){
  c(1.99-par[1], -1e-5 + par[1],
    -1e-5 + par[2],
    par[3]-1,
    pi/2 - par[4],
    par[4]+pi/2)
  }
opt <- alabama::auglag(
  par = c(0.7, 30, 1, 0),
  hin = hin,
  control.outer = list(trace = 0),
  fn = function(par){
   mindistpvario(par, emp = extdat$prob, coord = coord)})
stopifnot(opt$kkt1, opt$kkt2)
# Plotting the extremogram in the deformed space
distfa <- distg(loc = coord, opt$par[3], opt$par[4])
plot(
 x = c(distfa[lower.tri(distfa)]),
 y = extdat$prob,
 pch = 20,
 yaxs = "i",
 xaxs = "i",
 bty = 'l',
 xlab = "distance",
 ylab= "cond. prob. of exceedance",
 ylim = c(0,1))
lines(
  x = (distvec <- seq(0,200, length = 1000)),
  y = 2*(1-pnorm(sqrt(power.vario(distvec, alpha = opt$par[1],
                               scale = opt$par[2])/2))),
  col = 2,
  lwd = 2)
## End(Not run)
Parameter stability plot and maximum likelihood routine for extended GP models
Description
The function tstab.egp provides classical threshold stability plot for (\kappa, \sigma, \xi).
The fitted parameter values are displayed with pointwise normal 95% confidence intervals.
The function returns an invisible list with parameter estimates and standard errors, and p-values for the Wald test that \kappa=1.
The plot is for the modified scale (as in the generalised Pareto model) and as such it is possible that the modified scale be negative.
tstab.egp can also be used to fit the model to multiple thresholds.
Usage
fit.egp(
  xdat,
  thresh = 0,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  start = NULL,
  method = c("Nelder", "nlminb", "BFGS"),
  fpar = NULL,
  show = FALSE,
  ...
)
Arguments
| xdat | vector of observations, greater than the threshold | 
| thresh | threshold value | 
| model | a string indicating which extended family to fit | 
| start | optional named list of initial values, with  | 
| method | the method to be used. See Details. Can be abbreviated. | 
| fpar | a named list with fixed parameters, either  | 
| show | logical; if  | 
| ... | additional parameters, for backward compatibility purposes | 
Details
fit.egp is a numerical optimization routine to fit the extended generalised Pareto models of Papastathopoulos and Tawn (2013),
using maximum likelihood estimation.
Value
fit.egp outputs the list returned by optim, which contains the parameter values, the hessian and in addition the standard errors
tstab.egp returns a plot(s) of the parameters fit over the range of provided thresholds, with pointwise normal confidence intervals; the function also returns an invisible list containing notably the matrix of point estimates (par) and standard errors (se).
Author(s)
Leo Belzile
References
Papastathopoulos, I. and J. Tawn (2013). Extended generalised Pareto models for tail estimation, Journal of Statistical Planning and Inference 143(3), 131–143.
Examples
xdat <- mev::rgp(
  n = 100,
  loc = 0,
  scale = 1,
  shape = 0.5)
fitted <- fit.egp(
  xdat = xdat,
  thresh = 1,
  model = "pt-gamma",
  show = TRUE)
thresh <- mev::qgp(seq(0.1, 0.5, by = 0.05), 0, 1, 0.5)
tstab.egp(
   xdat = xdat,
   thresh = thresh,
   model = "pt-gamma")
xdat <- regp(
  n = 100,
  scale = 1,
  shape = 0.1,
  kappa = 0.5,
  model = "pt-power"
)
fit.egp(
 xdat = xdat,
 model = "pt-power",
 show = TRUE,
 fpar = list(kappa = 1),
 method = "Nelder"
)
Fit an extended generalized Pareto distribution of Naveau et al.
Description
This is a wrapper function to obtain PWM or MLE estimates for the extended GP models of Naveau et al. (2016) for rainfall intensities. The function calculates confidence intervals by means of nonparametric percentile bootstrap and returns histograms and QQ plots of the fitted distributions. The function handles both censoring and rounding.
Usage
fit.extgp(
  data,
  model = 1,
  method = c("mle", "pwm"),
  init,
  censoring = c(0, Inf),
  rounded = 0,
  confint = FALSE,
  R = 1000,
  ncpus = 1,
  plots = TRUE
)
Arguments
| data | data vector. | 
| model | integer ranging from 0 to 4 indicating the model to select (see  | 
| method | string; either  | 
| init | vector of initial values, comprising of  | 
| censoring | numeric vector of length 2 containing the lower and upper bound for censoring;  | 
| rounded | numeric giving the instrumental precision (and rounding of the data), with default of 0. | 
| confint | logical; should confidence interval be returned (percentile bootstrap). | 
| R | integer; number of bootstrap replications. | 
| ncpus | integer; number of CPUs for parallel calculations (default: 1). | 
| plots | logical; whether to produce histogram and density plots. | 
Details
The different models include the following transformations:
-  model0 corresponds to uniform carrier,G(u)=u.
-  model1 corresponds to a three parameters family, with carrierG(u)=u^\kappa.
-  model2 corresponds to a three parameters family, with carrierG(u)=1-V_\delta((1-u)^\delta).
-  model3 corresponds to a four parameters family, with carrierG(u)=1-V_\delta((1-u)^\delta))^{\kappa/2}. 
-  model4 corresponds to a five parameter model (a mixture oftype2, withG(u)=pu^\kappa + (1-p)*u^\delta
Author(s)
Raphael Huser and Philippe Naveau
References
Naveau, P., R. Huser, P. Ribereau, and A. Hannart (2016), Modeling jointly low, moderate, and heavy rainfall intensities without a threshold selection, Water Resour. Res., 52, 2753-2769, doi:10.1002/2015WR018552.
See Also
Examples
## Not run: 
data(rain, package = "ismev")
fit.extgp(
  rain[rain > 0],
  model = 1,
  method = 'mle',
  init = c(0.9, fit.gpd(rain)$est),
  rounded = 0.1,
  confint = TRUE,
  R = 20
)
## End(Not run)
Maximum likelihood estimation for the generalized extreme value distribution
Description
This function returns an object of class mev_gev, with default methods for printing and quantile-quantile plots.
The default starting values are the solution of the probability weighted moments.
Usage
fit.gev(
  xdat,
  start = NULL,
  method = c("nlminb", "BFGS"),
  show = FALSE,
  fpar = NULL,
  warnSE = FALSE
)
Arguments
| xdat | a numeric vector of data to be fitted. | 
| start | named list of starting values | 
| method | string indicating the outer optimization routine for the augmented Lagrangian. One of  | 
| show | logical; if  | 
| fpar | a named list with optional fixed components  | 
| warnSE | logical; if  | 
Value
a list containing the following components:
-  estimatea vector containing the maximum likelihood estimates.
-  std.erra vector containing the standard errors.
-  vcovthe variance covariance matrix, obtained as the numerical inverse of the observed information matrix.
-  methodthe method used to fit the parameter.
-  nllhthe negative log-likelihood evaluated at the parameterestimate.
-  convergencecomponents taken from the list returned byauglag. Values other than0indicate that the algorithm likely did not converge.
-  countscomponents taken from the list returned byauglag.
-  xdatvector of data
Examples
xdat <- mev::rgev(n = 100)
fit.gev(xdat, show = TRUE)
# Example with fixed parameter
fit.gev(xdat, show = TRUE, fpar = list(shape = 0))
Maximum likelihood estimation for the generalized Pareto distribution
Description
Numerical optimization of the generalized Pareto distribution for
data exceeding threshold.
This function returns an object of class mev_gpd, with default methods for printing and quantile-quantile plots.
Usage
fit.gpd(
  xdat,
  threshold = 0,
  method = "Grimshaw",
  show = FALSE,
  MCMC = NULL,
  k = 4,
  tol = 1e-08,
  fpar = NULL,
  warnSE = FALSE
)
Arguments
| xdat | a numeric vector of data to be fitted. | 
| threshold | the chosen threshold. | 
| method | the method to be used. See Details. Can be abbreviated. | 
| show | logical; if  | 
| MCMC | 
 | 
| k | bound on the influence function ( | 
| tol | numerical tolerance for OBRE weights iterations ( | 
| fpar | a named list with fixed parameters, either  | 
| warnSE | logical; if  | 
Details
The default method is 'Grimshaw', which maximizes the profile likelihood for the ratio scale/shape.  Other options include 'obre' for optimal B-robust estimator of the parameter of Dupuis (1998), vanilla maximization of the log-likelihood using constrained optimization routine 'auglag', 1-dimensional optimization of the profile likelihood using nlm and optim. Method 'ismev' performs the two-dimensional optimization routine gpd.fit from the ismev library, with in addition the algebraic gradient.
The approximate Bayesian methods ('zs' and 'zhang') are extracted respectively from Zhang and Stephens (2009) and Zhang (2010) and consists of a approximate posterior mean calculated via importance
sampling assuming a GPD prior is placed on the parameter of the profile likelihood.
Value
If method is neither 'zs' nor 'zhang', a list containing the following components:
-  estimatea vector containing thescaleandshapeparameters (optimized and fixed).
-  std.erra vector containing the standard errors. Formethod = "obre", these are Huber's robust standard errors.
-  vcovthe variance covariance matrix, obtained as the numerical inverse of the observed information matrix. Formethod = "obre", this is the sandwich Godambe matrix inverse.
-  thresholdthe threshold.
-  methodthe method used to fit the parameter. See details.
-  nllhthe negative log-likelihood evaluated at the parameterestimate.
-  natnumber of points lying above the threshold.
-  patproportion of points lying above the threshold.
-  convergencecomponents taken from the list returned byoptim. Values other than0indicate that the algorithm likely did not converge (in particular 1 and 50).
-  countscomponents taken from the list returned byoptim.
-  exceedancesexcess over the threshold.
Additionally, if method = "obre", a vector of OBRE weights.
Otherwise, a list containing
-  thresholdthe threshold.
-  methodthe method used to fit the parameter. See Details.
-  natnumber of points lying above the threshold.
-  patproportion of points lying above the threshold.
-  approx.meana vector containing containing the approximate posterior mean estimates.
and in addition if MCMC is neither FALSE, nor NULL
-  post.meana vector containing the posterior mean estimates.
-  post.sea vector containing the posterior standard error estimates.
-  accept.rateproportion of points lying above the threshold.
-  niterlength of resulting Markov Chain
-  burninamount of discarded iterations at start, capped at 10000.
-  thinthinning integer parameter describing
Note
Some of the internal functions (which are hidden from the user) allow for modelling of the parameters using covariates. This is not currently implemented within gp.fit, but users can call internal functions should they wish to use these features.
Author(s)
Scott D. Grimshaw for the Grimshaw option. Paul J. Northrop and Claire L. Coleman for the methods optim, nlm and ismev.
J. Zhang and Michael A. Stephens (2009) and Zhang (2010) for the zs and zhang approximate methods and L. Belzile for methods auglag and obre, the wrapper and MCMC samplers.
If show = TRUE, the optimal B robust estimated weights for the largest observations are printed alongside with the
p-value of the latter, obtained from the empirical distribution of the weights. This diagnostic can be used to guide threshold selection:
small weights for the r-largest order statistics indicate that the robust fit is driven by the lower tail
and that the threshold should perhaps be increased.
References
Davison, A.C. (1984). Modelling excesses over high thresholds, with an application, in Statistical extremes and applications, J. Tiago de Oliveira (editor), D. Reidel Publishing Co., 461–482.
Grimshaw, S.D. (1993). Computing Maximum Likelihood Estimates for the Generalized Pareto Distribution, Technometrics, 35(2), 185–191.
Northrop, P.J. and C. L. Coleman (2014). Improved threshold diagnostic plots for extreme value analyses, Extremes, 17(2), 289–303.
Zhang, J. (2010). Improving on estimation for the generalized Pareto distribution, Technometrics 52(3), 335–339.
Zhang, J. and M. A. Stephens (2009). A new and efficient estimation method for the generalized Pareto distribution. Technometrics 51(3), 316–325.
Dupuis, D.J. (1998). Exceedances over High Thresholds: A Guide to Threshold Selection, Extremes, 1(3), 251–261.
See Also
Examples
data(eskrain)
fit.gpd(eskrain, threshold = 35, method = 'Grimshaw', show = TRUE)
fit.gpd(eskrain, threshold = 30, method = 'zs', show = TRUE)
Maximum likelihood estimation of the point process of extremes
Description
Data above threshold is modelled using the limiting point process
of extremes.
Usage
fit.pp(
  xdat,
  threshold = 0,
  npp = 1,
  np = NULL,
  method = c("nlminb", "BFGS"),
  start = NULL,
  show = FALSE,
  fpar = NULL,
  warnSE = FALSE
)
Arguments
| xdat | a numeric vector of data to be fitted. | 
| threshold | the chosen threshold. | 
| npp | number of observation per period. See Details | 
| np | number of periods of data, if  | 
| method | the method to be used. See Details. Can be abbreviated. | 
| start | named list of starting values | 
| show | logical; if  | 
| fpar | a named list with optional fixed components  | 
| warnSE | logical; if  | 
Details
The parameter npp controls the frequency of observations.
If data are recorded on a daily basis, using a value of npp = 365.25
yields location and scale parameters that correspond to those of the
generalized extreme value distribution fitted to block maxima.
Value
a list containing the following components:
-  estimatea vector containing all parameters (optimized and fixed).
-  std.erra vector containing the standard errors.
-  vcovthe variance covariance matrix, obtained as the numerical inverse of the observed information matrix.
-  thresholdthe threshold.
-  methodthe method used to fit the parameter. See details.
-  nllhthe negative log-likelihood evaluated at the parameterestimate.
-  natnumber of points lying above the threshold.
-  patproportion of points lying above the threshold.
-  convergencecomponents taken from the list returned byoptim. Values other than0indicate that the algorithm likely did not converge (in particular 1 and 50).
-  countscomponents taken from the list returned byoptim.
References
Coles, S. (2001), An introduction to statistical modelling of extreme values. Springer : London, 208p.
Examples
data(eskrain)
pp_mle <- fit.pp(eskrain, threshold = 30, np = 6201)
plot(pp_mle)
Estimator of the second order tail index parameter
Description
Estimator of the second order tail index parameter
Usage
fit.rho(xdat, k, method = c("fagh", "dk", "ghp", "gbw"), ...)
Arguments
| xdat | vector of positive observations | 
| k | number of highest order statistics to use for estimation | 
| method | string for the estimator | 
| ... | additional arguments passed to individual routinescurrently ignored. | 
Examples
# Example with rho = -0.2
n <- 1000
xdat <- mev::rgp(n = n, shape = 0.2)
kmin <- floor(n^0.995)
kmax <- ceiling(n^0.999)
rho_est <- fit.rho(
   xdat = xdat,
   k = n - kmin:kmax)
rho_med <- mean(rho_est$rho)
Maximum likelihood estimates of point process for the r-largest observations
Description
This uses a constrained optimization routine to return the maximum likelihood estimate
based on an n by r matrix of observations. Observations should be ordered, i.e.,
the r-largest should be in the last column.
Usage
fit.rlarg(
  xdat,
  start = NULL,
  method = c("nlminb", "BFGS"),
  show = FALSE,
  fpar = NULL,
  warnSE = FALSE
)
Arguments
| xdat | a numeric vector of data to be fitted. | 
| start | named list of starting values | 
| method | the method to be used. See Details. Can be abbreviated. | 
| show | logical; if  | 
| fpar | a named list with fixed parameters, either  | 
| warnSE | logical; if  | 
Value
a list containing the following components:
-  estimatea vector containing all the maximum likelihood estimates.
-  std.erra vector containing the standard errors.
-  vcovthe variance covariance matrix, obtained as the numerical inverse of the observed information matrix.
-  methodthe method used to fit the parameter.
-  nllhthe negative log-likelihood evaluated at the parameterestimate.
-  convergencecomponents taken from the list returned byauglag. Values other than0indicate that the algorithm likely did not converge.
-  countscomponents taken from the list returned byauglag.
-  xdatannbyrmatrix of data
Examples
xdat <- rrlarg(n = 10, loc = 0, scale = 1, shape = 0.1, r = 4)
fit.rlarg(xdat)
Shape parameter estimates
Description
Wrapper to estimate the tail index or shape parameter of an extreme value distribution. Each function has similar sets of arguments, a vector or scalar number of order statistics k and
a vector of positive observations xdat. The method argument allows users to choose between different indicators, including the Hill estimator (hill, for positive observations and shape only), the moment estimator of Dekkers and de Haan (mom or dekkers), the de Vries estimator of de Haan and Peng (vries), the generalized jackknife estimator of Gomes et al. (genjack), the Beirlant, Vynckier and Teugels generalized quantile estimator (bvt or genquant), the Pickands estimator (pickands), the extreme U-statistics estimator of Oorschot, Segers and Zhou (osz), or the exponential rgression model of Beirlant et al. (erm).
Usage
fit.shape(
  xdat,
  k,
  method = c("hill", "rbm", "osz", "vries", "genjack", "mom", "dekkers", "genquant",
    "pickands", "erm"),
  ...
)
Arguments
| xdat | vector of positive observations of length  | 
| k | number of largest order statistics | 
| method | estimation method. | 
| ... | additional parameters passed to functions | 
Value
a data frame with the number of order statistics k and the shape parameter estimate shape, or a single numeric value if k is a scalar.
Maximum likelihood estimation for weighted generalized Pareto distribution
Description
Weighted maximum likelihood estimation, with user-specified vector of weights.
Usage
fit.wgpd(xdat, threshold = 0, weightfun = Stein_weights, start = NULL, ...)
Arguments
| xdat | vector of observations | 
| threshold | numeric, value of the threshold | 
| weightfun | function whose first argument is the length of the weight vector | 
| start | optional vector of scale and shape parameters for the optimization routine, defaults to  | 
| ... | additional arguments passed to the weighting function  | 
Value
a list with components
-  estimatea vector containing thescaleandshapeparameters (optimized and fixed).
-  std.erra vector containing the standard errors.
-  vcovthe variance covariance matrix, obtained as the numerical inverse of the observed information matrix.
-  thresholdthe threshold.
-  methodthe method used to fit the parameter. See details.
-  nllhthe negative log-likelihood evaluated at the parameterestimate.
-  natnumber of points lying above the threshold.
-  patproportion of points lying above the threshold.
-  convergencelogical indicator of convergence.
-  weightsvector of weights for exceedances.
-  exceedancesexcess over the threshold, sorted in decreasing order.
French wind data
Description
Daily mean wind speed (in km/h) at four stations in the south of France, namely Cap Cepet (S1), Lyon St-Exupery (S2), Marseille Marignane (S3) and Montelimar (S4).
The data includes observations from January 1976 until April 2023; days containing missing values are omitted.
Format
A data frame with 17209 observations and 8 variables:
- date
- date of measurement 
- S1
- wind speed (in km/h) at Cap Cepet 
- S2
- wind speed (in km/h) at Lyon Saint-Exupery 
- S3
- wind speed (in km/h) at Marseille Marignane 
- S4
- wind speed (in km/h) at Montelimar 
- H2
- humidity (in percentage) at Lyon Saint-Exupery 
- T2
- mean temperature (in degree Celcius) at Lyon Saint-Exupery 
The metadata attribute includes latitude and longitude (in degrees, minutes, seconds), altitude (in m), station name and station id.
Source
European Climate Assessment and Dataset project https://www.ecad.eu/
References
Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453.
Examples
data(frwind, package = "mev")
head(frwind)
attr(frwind, which = "metadata")
Magnetic storms
Description
Absolute magnitude of 373 geomagnetic storms lasting more than 48h with absolute magnitude (dst) larger than 100 in 1957-2014.
Format
a vector of size 373
Note
For a detailed article presenting the derivation of the Dst index, see http://wdc.kugi.kyoto-u.ac.jp/dstdir/dst2/onDstindex.html
Source
Aki Vehtari
References
World Data Center for Geomagnetism, Kyoto, M. Nose, T. Iyemori, M. Sugiura, T. Kamei (2015), Geomagnetic Dst index, <doi:10.17593/14515-74000>.
Generalized extreme value distribution
Description
Likelihood, score function and information matrix, bias, approximate ancillary statistics and sample space derivative for the generalized extreme value distribution
Arguments
| par | vector of  | 
| dat | sample vector | 
| method | string indicating whether to use the expected  ( | 
| V | vector calculated by  | 
| n | sample size | 
| p | vector of probabilities | 
Usage
gev.ll(par, dat)
gev.ll.optim(par, dat)
gev.score(par, dat)
gev.infomat(par, dat, method = c('obs','exp'))
gev.retlev(par, p)
gev.bias(par, n)
gev.Fscore(par, dat, method=c('obs','exp'))
gev.Vfun(par, dat)
gev.phi(par, dat, V)
gev.dphi(par, dat, V)
Functions
-  gev.ll: log likelihood
-  gev.ll.optim: negative log likelihood parametrized in terms of location,log(scale)and shape in order to perform unconstrained optimization
-  gev.score: score vector
-  gev.infomat: observed or expected information matrix
-  gev.retlev: return level, corresponding to the(1-p)th quantile
-  gev.bias: Cox-Snell first order bias
-  gev.Fscore: Firth's modified score equation
-  gev.Vfun: vector implementing conditioning on approximate ancillary statistics for the TEM
-  gev.phi: canonical parameter in the local exponential family approximation
-  gev.dphi: derivative matrix of the canonical parameter in the local exponential family approximation
References
Firth, D. (1993). Bias reduction of maximum likelihood estimates, Biometrika, 80(1), 27–38.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer, 209 p.
Cox, D. R. and E. J. Snell (1968). A general definition of residuals, Journal of the Royal Statistical Society: Series B (Methodological), 30, 248–275.
Cordeiro, G. M. and R. Klein (1994). Bias correction in ARMA models, Statistics and Probability Letters, 19(3), 169–176.
Firth's modified score equation for the generalized extreme value distribution
Description
Firth's modified score equation for the generalized extreme value distribution
Usage
gev.Fscore(par, dat, method = "obs")
Arguments
| par | vector of  | 
| dat | sample vector | 
| method | string indicating whether to use the expected ('exp') or the observed ('obs' - the default) information matrix. | 
References
Firth, D. (1993). Bias reduction of maximum likelihood estimates, Biometrika, 80(1), 27–38.
See Also
N-year return levels, median and mean estimate
Description
N-year return levels, median and mean estimate
Usage
gev.Nyr(par, nobs, N, type = c("retlev", "median", "mean"), p = 1/N)
Arguments
| par | vector of location, scale and shape parameters for the GEV distribution | 
| nobs | integer number of observation on which the fit is based | 
| N | integer number of observations for return level. See Details | 
| type | string indicating the statistic to be calculated (can be abbreviated). | 
| p | probability indicating the return level, corresponding to the quantile at 1-1/p | 
Details
If there are n_y observations per year, the L-year return level is obtained by taking
N equal to n_yL.
Value
a list with components
- est point estimate 
- var variance estimate based on delta-method 
- type statistic 
Asymptotic bias of block maxima for fixed sample sizes
Description
Asymptotic bias of block maxima for fixed sample sizes
Usage
gev.abias(shape, rho)
Arguments
| shape | shape parameter | 
| rho | second-order parameter, non-positive | 
Value
a vector of length three containing the bias for location, scale and shape (in this order)
References
Dombry, C. and A. Ferreira (2017). Maximum likelihood estimators based on the block maxima method. https://arxiv.org/abs/1705.00465
Bias correction for GEV distribution
Description
Bias corrected estimates for the generalized extreme value distribution using Firth's modified score function or implicit bias subtraction.
Usage
gev.bcor(par, dat, corr = c("subtract", "firth"), method = c("obs", "exp"))
Arguments
| par | parameter vector ( | 
| dat | sample of observations | 
| corr | string indicating which correction to employ either  | 
| method | string indicating whether to use the expected  ( | 
Details
Method subtractsolves
\tilde{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}} + b(\tilde{\boldsymbol{\theta}}
for \tilde{\boldsymbol{\theta}}, using the first order term in the bias expansion as given by gev.bias.
The alternative is to use Firth's modified score and find the root of
U(\tilde{\boldsymbol{\theta}})-i(\tilde{\boldsymbol{\theta}})b(\tilde{\boldsymbol{\theta}}),
where U is the score vector, b is the first order bias and i is either the observed or Fisher information.
The routine uses the MLE (bias-corrected) as starting values and proceeds
to find the solution using a root finding algorithm.
Since the bias-correction is not valid for \xi < -1/3, any solution that is unbounded
will return a vector of NA as the solution does not exist then.
Value
vector of bias-corrected parameters
Examples
set.seed(1)
dat <- mev::rgev(n=40, loc = 1, scale=1, shape=-0.2)
par <- mev::fit.gev(dat)$estimate
gev.bcor(par, dat, 'subtract')
gev.bcor(par, dat, 'firth') #observed information
gev.bcor(par, dat, 'firth','exp')
Cox-Snell first order bias for the GEV distribution
Description
Bias vector for the GEV distribution based on an n sample.
Due to numerical instability, values of the information matrix and the bias
are linearly interpolated when the value of the shape parameter is close to zero.
Usage
gev.bias(par, n)
Arguments
| par | vector of  | 
| n | sample size | 
See Also
Information matrix for the generalized extreme value distribution
Description
Information matrix for the generalized extreme value distribution
Usage
gev.infomat(par, dat, method = c("obs", "exp"), nobs = length(dat))
Arguments
| par | vector of  | 
| dat | sample vector | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations | 
Value
a 3 by 3 matrix containing the expected or observed information
Log likelihood for the generalized extreme value distribution
Description
Function returning the density of an n sample from the GEV distribution.
Usage
gev.ll(par, dat)
gev.ll.optim(par, dat)
Arguments
| par | vector of  | 
| dat | sample vector | 
Details
gev.ll.optim returns the negative log likelihood parametrized in terms of location, log(scale) and shape in order to perform unconstrained optimization
See Also
Generalized extreme value maximum likelihood estimates for various quantities of interest
Description
This function calls the fit.gev routine on the sample of block maxima and returns maximum likelihood
estimates for all quantities of interest, including location, scale and shape parameters, quantiles and mean and
quantiles of maxima of N blocks.
Usage
gev.mle(
  xdat,
  args = c("loc", "scale", "shape", "quant", "Nmean", "Nquant"),
  N,
  p,
  q
)
Arguments
| xdat | sample vector of maxima | 
| args | vector of strings indicating which arguments to return the maximum likelihood values for. | 
| N | size of block over which to take maxima. Required only for  | 
| p | tail probability. Required only for  | 
| q | level of quantile for maxima of  | 
Value
named vector with maximum likelihood estimated parameter values for arguments args
Examples
dat <- mev::rgev(n = 100, shape = 0.2)
gev.mle(xdat = dat, N = 100, p = 0.01, q = 0.5)
Profile log-likelihood for the generalized extreme value distribution
Description
This function calculates the profile likelihood along with two small-sample corrections based on Severini's (1999) empirical covariance and the Fraser and Reid tangent exponential model approximation.
Usage
gev.pll(
  psi,
  param = c("loc", "scale", "shape", "quant", "Nmean", "Nquant"),
  mod = "profile",
  dat,
  N = NULL,
  p = NULL,
  q = NULL,
  correction = TRUE,
  plot = TRUE,
  ...
)
Arguments
| psi | parameter vector over which to profile (unidimensional) | 
| param | string indicating the parameter to profile over | 
| mod | string indicating the model, one of  | 
| dat | sample vector | 
| N | size of block over which to take maxima. Required only for  | 
| p | tail probability. Required only for  | 
| q | probability level of quantile. Required only for  | 
| correction | logical indicating whether to use  | 
| plot | logical; should the profile likelihood be displayed? Default to  | 
| ... | additional arguments such as output from call to  | 
Details
The two additional mod available are tem, the tangent exponential model (TEM) approximation and
modif for the penalized profile likelihood based on p^* approximation proposed by Severini.
For the latter, the penalization is based on the TEM or an empirical covariance adjustment term.
Value
a list with components
-  mle: maximum likelihood estimate
-  psi.max: maximum profile likelihood estimate
-  param: string indicating the parameter to profile over
-  std.error: standard error ofpsi.max
-  psi: vector of parameter\psigiven inpsi
-  pll: values of the profile log likelihood atpsi
-  maxpll: value of maximum profile log likelihood
In addition, if mod includes tem
-  normal: maximum likelihood estimate and standard error of the interest parameter\psi
-  r: values of likelihood root corresponding to\psi
-  q: vector of likelihood modifications
-  rstar: modified likelihood root vector
-  rstar.old: uncorrected modified likelihood root vector
-  tem.psimax: maximum of the tangent exponential model likelihood
In addition, if mod includes modif
-  tem.mle: maximum of tangent exponential modified profile log likelihood
-  tem.profll: values of the modified profile log likelihood atpsi
-  tem.maxpll: value of maximum modified profile log likelihood
-  empcov.mle: maximum of Severini's empirical covariance modified profile log likelihood
-  empcov.profll: values of the modified profile log likelihood atpsi
-  empcov.maxpll: value of maximum modified profile log likelihood
References
Fraser, D. A. S., Reid, N. and Wu, J. (1999), A simple general formula for tail probabilities for frequentist and Bayesian inference. Biometrika, 86(2), 249–264.
Severini, T. (2000) Likelihood Methods in Statistics. Oxford University Press. ISBN 9780198506508.
Brazzale, A. R., Davison, A. C. and Reid, N. (2007) Applied asymptotics: case studies in small-sample statistics. Cambridge University Press, Cambridge. ISBN 978-0-521-84703-2
Examples
## Not run: 
set.seed(123)
dat <- rgev(n = 100, loc = 0, scale = 2, shape = 0.3)
gev.pll(psi = seq(0,0.5, length = 50), param = 'shape', dat = dat)
gev.pll(psi = seq(-1.5, 1.5, length = 50), param = 'loc', dat = dat)
gev.pll(psi = seq(10, 40, length = 50), param = 'quant', dat = dat, p = 0.01)
gev.pll(psi = seq(12, 100, length = 50), param = 'Nmean', N = 100, dat = dat)
gev.pll(psi = seq(12, 90, length = 50), param = 'Nquant', N = 100, dat = dat, q = 0.5)
## End(Not run)
Return level for the generalized extreme value distribution
Description
This function returns the 1-pth quantile of the GEV.
Usage
gev.retlev(par, p)
Arguments
| par | vector of  | 
| p | vector of probabilities | 
See Also
Score vector for the generalized extreme value distribution
Description
Score vector for the generalized extreme value distribution
Usage
gev.score(par, dat)
Arguments
| par | vector of  | 
| dat | sample vector | 
Value
a vector of length containing the derivative of the
Tangent exponential model approximation for the GEV distribution
Description
The function gev.tem provides a tangent exponential model (TEM) approximation
for higher order likelihood inference for a scalar parameter for the generalized extreme value distribution.
Options include location scale and shape parameters as well as value-at-risk (or return levels).
The function attempts to find good values for psi that will
cover the range of options, but the fail may fit and return an error.
Usage
gev.tem(
  param = c("loc", "scale", "shape", "quant", "Nmean", "Nquant"),
  dat,
  psi = NULL,
  p = NULL,
  q = 0.5,
  N = NULL,
  n.psi = 50,
  plot = TRUE,
  correction = TRUE
)
Arguments
| param | parameter over which to profile | 
| dat | sample vector for the GEV distribution | 
| psi | scalar or ordered vector of values for the interest parameter. If  | 
| p | tail probability for the (1-p)th quantile (return levels). Required only if  | 
| q | probability level of quantile. Required only for  | 
| N | size of block over which to take maxima. Required only for  | 
| n.psi | number of values of  | 
| plot | logical indicating whether  | 
| correction | logical indicating whether spline.corr should be called. | 
Value
an invisible object of class fr (see tem in package hoa) with elements
-  normal: maximum likelihood estimate and standard error of the interest parameter\psi
-  par.hat: maximum likelihood estimates
-  par.hat.se: standard errors of maximum likelihood estimates
-  th.rest: estimated maximum profile likelihood at (\psi,\hat{\lambda})
-  r: values of likelihood root corresponding to\psi
-  psi: vector of interest parameter
-  q: vector of likelihood modifications
-  rstar: modified likelihood root vector
-  rstar.old: uncorrected modified likelihood root vector
-  param: parameter
Author(s)
Leo Belzile
Examples
## Not run: 
set.seed(1234)
dat <- rgev(n = 40, loc = 0, scale = 2, shape = -0.1)
gev.tem('shape', dat = dat, plot = TRUE)
gev.tem('quant', dat = dat, p = 0.01, plot = TRUE)
gev.tem('scale', psi = seq(1, 4, by = 0.1), dat = dat, plot = TRUE)
dat <- rgev(n = 40, loc = 0, scale = 2, shape = 0.2)
gev.tem('loc', dat = dat, plot = TRUE)
gev.tem('Nmean', dat = dat, p = 0.01, N=100, plot = TRUE)
gev.tem('Nquant', dat = dat, q = 0.5, N=100, plot = TRUE)
## End(Not run)
Tangent exponential model statistics for the generalized extreme value distribution
Description
Matrix of approximate ancillary statistics, sample space derivative of the log likelihood and mixed derivative for the generalized extreme value distribution.
Usage
gev.Vfun(par, dat)
gev.phi(par, dat, V)
gev.dphi(par, dat, V)
Arguments
| par | vector of  | 
| dat | sample vector | 
| V | vector calculated by  | 
See Also
Generalized extreme value distribution (quantile/mean of N-block maxima parametrization)
Description
Likelihood, score function and information matrix,
approximate ancillary statistics and sample space derivative
for the generalized extreme value distribution  parametrized in terms of the
quantiles/mean of N-block maxima parametrization z, scale and shape.
Arguments
| par | vector of  | 
| dat | sample vector | 
| V | vector calculated by  | 
| q | probability, corresponding to  | 
| qty | string indicating whether to calculate the  | 
Usage
gevN.ll(par, dat, N, q, qty = c('mean', 'quantile'))
gevN.ll.optim(par, dat, N, q = 0.5, qty = c('mean', 'quantile'))
gevN.score(par, dat, N, q = 0.5, qty = c('mean', 'quantile'))
gevN.infomat(par, dat, qty = c('mean', 'quantile'), method = c('obs', 'exp'), N, q = 0.5, nobs = length(dat))
gevN.Vfun(par, dat, N, q = 0.5, qty = c('mean', 'quantile'))
gevN.phi(par, dat, N, q = 0.5, qty = c('mean', 'quantile'), V)
gevN.dphi(par, dat, N, q = 0.5, qty = c('mean', 'quantile'), V)
Functions
-  gevN.ll: log likelihood
-  gevN.score: score vector
-  gevN.infomat: expected and observed information matrix
-  gevN.Vfun: vector implementing conditioning on approximate ancillary statistics for the TEM
-  gevN.phi: canonical parameter in the local exponential family approximation
-  gevN.dphi: derivative matrix of the canonical parameter in the local exponential family approximation
Author(s)
Leo Belzile
Information matrix of the generalized extreme value distribution (quantile/mean of N-block maxima parametrization)
Description
Information matrix of the generalized extreme value distribution (quantile/mean of N-block maxima parametrization)
Usage
gevN.infomat(
  par,
  dat,
  method = c("obs", "exp"),
  qty = c("mean", "quantile"),
  N,
  q = 0.5,
  nobs = length(dat)
)
Arguments
| par | vector of  | 
| dat | sample vector | 
| qty | string indicating whether to calculate the  | 
| q | probability, corresponding to  | 
See Also
Negative log likelihood of the generalized extreme value distribution (quantile/mean of N-block maxima parametrization)
Description
Negative log likelihood of the generalized extreme value distribution (quantile/mean of N-block maxima parametrization)
Usage
gevN.ll(par, dat, N, q = 0.5, qty = c("mean", "quantile"))
Arguments
| par | vector of  | 
| dat | sample vector | 
| q | probability, corresponding to  | 
| qty | string indicating whether to calculate the  | 
See Also
This function returns the mean of N observations from the GEV.
Description
This function returns the mean of N observations from the GEV.
Usage
gevN.mean(par, N)
Arguments
| par | vector of  | 
See Also
Score of the generalized extreme value distribution (quantile/mean of N-block maxima parametrization)
Description
Score of the generalized extreme value distribution (quantile/mean of N-block maxima parametrization)
Usage
gevN.score(par, dat, N, q = 0.5, qty = c("mean", "quantile"))
Arguments
| par | vector of  | 
| dat | sample vector | 
| q | probability, corresponding to  | 
| qty | string indicating whether to calculate the  | 
See Also
Tangent exponential model statistics for the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Description
Vector implementing conditioning on approximate ancillary statistics for the TEM
Usage
gevN.Vfun(par, dat, N, q = 0.5, qty = c("mean", "quantile"))
gevN.phi(par, dat, N, q = 0.5, qty = c("mean", "quantile"), V)
gevN.dphi(par, dat, N, q = 0.5, qty = c("mean", "quantile"), V)
Arguments
| par | vector of  | 
| dat | sample vector | 
| q | probability, corresponding to  | 
| qty | string indicating whether to calculate the  | 
| V | vector calculated by  | 
See Also
Generalized extreme value distribution
Description
Density function, distribution function, quantile function and random number generation for the generalized extreme value distribution.
Usage
qgev(p, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, log.p = FALSE)
rgev(n, loc = 0, scale = 1, shape = 0)
dgev(x, loc = 0, scale = 1, shape = 0, log = FALSE)
pgev(q, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, log.p = FALSE)
Arguments
| p | vector of probabilities | 
| loc | scalar or vector of location parameters whose length matches that of the input | 
| scale | scalar or vector of positive scale parameters whose length matches that of the input | 
| shape | scalar shape parameter | 
| lower.tail | logical; if  | 
| n | scalar number of observations | 
| x,q | vector of quantiles | 
| log,log.p | logical; if  | 
Details
The distribution function of a GEV distribution with parameters
loc = \mu, scale = \sigma and
shape = \xi is
F(x) = \exp\{-[1 + \xi (x - \mu) / \sigma] ^ {-1/\xi} \}
for 1 + \xi (x - \mu) / \sigma > 0.  If \xi = 0 the
distribution function is defined as the limit as \xi tends to zero.
The quantile function, when evaluated at zero or one, returns the lower and upper endpoint, whether the latter is finite or not.
Author(s)
Leo Belzile, with code adapted from Paul Northrop
References
Jenkinson, A. F. (1955) The frequency distribution of the annual maximum (or minimum) of meteorological elements. Quart. J. R. Met. Soc., 81, 158-171. Chapter 3: doi:10.1002/qj.49708134804
Coles, S. G. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer-Verlag, London. doi:10.1007/978-1-4471-3675-0_3
Generalized extreme value distribution (return level parametrization)
Description
Likelihood, score function and information matrix,
approximate ancillary statistics and sample space derivative
for the generalized extreme value distribution  parametrized in terms of the return level z, scale and shape.
Arguments
| par | vector of  | 
| dat | sample vector | 
| p | tail probability, corresponding to  | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations | 
| V | vector calculated by  | 
Usage
gevr.ll(par, dat, p)
gevr.ll.optim(par, dat, p)
gevr.score(par, dat, p)
gevr.infomat(par, dat, p, method = c('obs', 'exp'), nobs = length(dat))
gevr.Vfun(par, dat, p)
gevr.phi(par, dat, p, V)
gevr.dphi(par, dat, p, V)
Functions
-  gevr.ll: log likelihood
-  gevr.ll.optim: negative log likelihood parametrized in terms of return levels,log(scale)and shape in order to perform unconstrained optimization
-  gevr.score: score vector
-  gevr.infomat: observed information matrix
-  gevr.Vfun: vector implementing conditioning on approximate ancillary statistics for the TEM
-  gevr.phi: canonical parameter in the local exponential family approximation
-  gevr.dphi: derivative matrix of the canonical parameter in the local exponential family approximation
Author(s)
Leo Belzile
Observed information matrix for GEV distribution (return levels)
Description
The information matrix is parametrized in terms of return level ((1-p)th quantile), scale and shape.
Usage
gevr.infomat(par, dat, method = c("obs", "exp"), p, nobs = length(dat))
Arguments
| par | vector of  | 
| dat | sample vector | 
| method | string indicating whether to use the expected  ( | 
| p | tail probability, corresponding to  | 
| nobs | integer number of observations | 
See Also
Negative log likelihood of the generalized extreme value distribution (return levels)
Description
Negative log likelihood of the generalized extreme value distribution (return levels)
Negative log likelihood parametrized in terms of location, log return level and shape in order to perform unconstrained optimization
Usage
gevr.ll(par, dat, p)
gevr.ll.optim(par, dat, p)
Arguments
| par | vector of  | 
| dat | sample vector | 
| p | tail probability, corresponding to  | 
See Also
Score of the log likelihood for the GEV distribution (return levels)
Description
Score of the log likelihood for the GEV distribution (return levels)
Usage
gevr.score(par, dat, p)
Arguments
| par | vector of  | 
| dat | sample vector | 
| p | tail probability, corresponding to  | 
See Also
Tangent exponential model statistics for the GEV distribution (return level)
Description
Vector implementing conditioning on approximate ancillary statistics for the TEM
Usage
gevr.Vfun(par, dat, p)
gevr.phi(par, dat, p, V)
gevr.dphi(par, dat, p, V)
Arguments
| par | vector of  | 
| dat | sample vector | 
| p | tail probability, corresponding to  | 
| V | vector calculated by  | 
See Also
Maximum likelihood estimate of generalized Pareto applied to threshold exceedances
Description
The function fit.gpd is a wrapper around gp.fit
Usage
gp.fit(
  xdat,
  threshold,
  method = c("Grimshaw", "auglag", "nlm", "optim", "ismev", "zs", "zhang"),
  show = FALSE,
  MCMC = NULL,
  fpar = NULL,
  warnSE = TRUE
)
Arguments
| xdat | a numeric vector of data to be fitted. | 
| threshold | the chosen threshold. | 
| method | the method to be used. See Details. Can be abbreviated. | 
| show | logical; if  | 
| MCMC | 
 | 
| fpar | a named list with fixed parameters, either  | 
| warnSE | logical; if  | 
Value
a list; see fit.gpd for details
Generalized Pareto distribution
Description
Likelihood, score function and information matrix, bias, approximate ancillary statistics and sample space derivative for the generalized Pareto distribution
Arguments
| par | vector of  | 
| dat | sample vector | 
| tol | numerical tolerance for the exponential model | 
| method | string indicating whether to use the expected  ( | 
| V | vector calculated by  | 
| n | sample size | 
Usage
gpd.ll(par, dat, tol=1e-5)
gpd.ll.optim(par, dat, tol=1e-5)
gpd.score(par, dat)
gpd.infomat(par, dat, method = c('obs','exp'))
gpd.bias(par, n)
gpd.Fscore(par, dat, method = c('obs','exp'))
gpd.Vfun(par, dat)
gpd.phi(par, dat, V)
gpd.dphi(par, dat, V)
Functions
-  gpd.ll: log likelihood
-  gpd.ll.optim: negative log likelihood parametrized in terms oflog(scale)and shape in order to perform unconstrained optimization
-  gpd.score: score vector
-  gpd.infomat: observed or expected information matrix
-  gpd.bias: Cox-Snell first order bias
-  gpd.Fscore: Firth's modified score equation
-  gpd.Vfun: vector implementing conditioning on approximate ancillary statistics for the TEM
-  gpd.phi: canonical parameter in the local exponential family approximation
-  gpd.dphi: derivative matrix of the canonical parameter in the local exponential family approximation
Author(s)
Leo Belzile
References
Firth, D. (1993). Bias reduction of maximum likelihood estimates, Biometrika, 80(1), 27–38.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer, 209 p.
Cox, D. R. and E. J. Snell (1968). A general definition of residuals, Journal of the Royal Statistical Society: Series B (Methodological), 30, 248–275.
Cordeiro, G. M. and R. Klein (1994). Bias correction in ARMA models, Statistics and Probability Letters, 19(3), 169–176.
Giles, D. E., Feng, H. and R. T. Godwin (2016). Bias-corrected maximum likelihood estimation of the parameters of the generalized Pareto distribution, Communications in Statistics - Theory and Methods, 45(8), 2465–2483.
Firth's modified score equation for the generalized Pareto distribution
Description
Firth's modified score equation for the generalized Pareto distribution
Usage
gpd.Fscore(par, dat, method = c("obs", "exp"))
Arguments
| par | vector of  | 
| dat | sample vector | 
| method | string indicating whether to use the expected  ( | 
References
Firth, D. (1993). Bias reduction of maximum likelihood estimates, Biometrika, 80(1), 27–38.
See Also
Asymptotic bias of threshold exceedances for k order statistics
Description
The formula given in de Haan and Ferreira, 2007 (Springer). Note that the latter differs from that found in Drees, Ferreira and de Haan.
Usage
gpd.abias(shape, rho)
Arguments
| shape | shape parameter | 
| rho | second-order parameter, non-positive | 
Value
a vector of length containing the bias for scale and shape (in this order)
References
Dombry, C. and A. Ferreira (2017). Maximum likelihood estimators based on the block maxima method. https://arxiv.org/abs/1705.00465
Bias correction for GP distribution
Description
Bias corrected estimates for the generalized Pareto distribution using Firth's modified score function or implicit bias subtraction.
Usage
gpd.bcor(par, dat, corr = c("subtract", "firth"), method = c("obs", "exp"))
Arguments
| par | parameter vector ( | 
| dat | sample of observations | 
| corr | string indicating which correction to employ either  | 
| method | string indicating whether to use the expected  ( | 
Details
Method subtract solves
\tilde{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}} + b(\tilde{\boldsymbol{\theta}}
for \tilde{\boldsymbol{\theta}}, using the first order term in the bias expansion as given by gpd.bias.
The alternative is to use Firth's modified score and find the root of
U(\tilde{\boldsymbol{\theta}})-i(\tilde{\boldsymbol{\theta}})b(\tilde{\boldsymbol{\theta}}),
where U is the score vector, b is the first order bias and i is either the observed or Fisher information.
The routine uses the MLE as starting value and proceeds
to find the solution using a root finding algorithm.
Since the bias-correction is not valid for \xi < -1/3, any solution that is unbounded
will return a vector of NA as the bias correction does not exist then.
Value
vector of bias-corrected parameters
Examples
set.seed(1)
dat <- rgp(n=40, scale=1, shape=-0.2)
par <- gp.fit(dat, threshold=0, show=FALSE)$estimate
gpd.bcor(par,dat, 'subtract')
gpd.bcor(par,dat, 'firth') #observed information
gpd.bcor(par,dat, 'firth','exp')
Cox-Snell first order bias expression for the generalized Pareto distribution
Description
Bias vector for the GP distribution based on an n sample.
Usage
gpd.bias(par, n)
Arguments
| par | vector of  | 
| n | sample size | 
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer, 209 p.
Cox, D. R. and E. J. Snell (1968). A general definition of residuals, Journal of the Royal Statistical Society: Series B (Methodological), 30, 248–275.
Cordeiro, G. M. and R. Klein (1994). Bias correction in ARMA models, Statistics and Probability Letters, 19(3), 169–176.
Giles, D. E., Feng, H. and R. T. Godwin (2016). Bias-corrected maximum likelihood estimation of the parameters of the generalized Pareto distribution, Communications in Statistics - Theory and Methods, 45(8), 2465–2483.
See Also
Bootstrap approximation for generalized Pareto parameters
Description
Given an object of class mev_gpd,
returns a matrix of parameter values to mimic
the estimation uncertainty.
Usage
gpd.boot(object, B = 1000L, method = c("post", "norm"))
Arguments
| object | object of class  | 
| B | number of pairs to sample | 
| method | string; one of  | 
Details
Two options are available: a normal approximation to the scale and shape based on the maximum likelihood estimates and the observed information matrix. This method uses forward sampling to simulate from a bivariate normal distribution that satisfies the support and positivity constraints
The second approximation uses the ratio-of-uniforms method to obtain samples from the posterior distribution with uninformative priors, thus mimicking the joint distribution of maximum likelihood. The benefit of the latter is that it is more reliable in small samples and when the shape is negative.
Value
a matrix of size B by 2 whose columns contain scale and shape parameters
Information matrix for the generalized Pareto distribution
Description
The function returns the expected or observed information matrix.
Usage
gpd.infomat(par, dat, method = c("obs", "exp"), nobs = length(dat))
Arguments
| par | vector of  | 
| dat | sample vector | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations | 
See Also
Log likelihood for the generalized Pareto distribution
Description
Function returning the density of an n sample from the GP distribution.
gpd.ll.optim returns the negative log likelihood parametrized in terms of log(scale) and shape
in order to perform unconstrained optimization. The function is coded in such a way that the log likelihood is infinite when \xi < -1.
Usage
gpd.ll(par, dat, tol = 1e-05)
gpd.ll.optim(par, dat, tol = 1e-05)
Arguments
| par | vector of  | 
| dat | sample vector | 
| tol | numerical tolerance for the exponential model | 
See Also
Estimation of generalized Pareto parameters via L-moments
Description
Given a sample of exceedances, compute the first four L-moments and use either the first two to obtain the scale and shape (default), or else use L-skewness and L-scale to compute the scale and shape of the generalized Pareto distribution
Usage
gpd.lmom(xdat, thresh, sorted = FALSE, Lskew = FALSE)
Arguments
| xdat | [numeric] vector of observations | 
| thresh | [numeric] optional threshold argument | 
| sorted | [logical] if  | 
| Lskew | [logical]; if  | 
Value
a vector of length two with the scale and shape estimates
Generalized Pareto maximum likelihood estimates for various quantities of interest
Description
This function calls the fit.gpd routine on the sample of excesses and returns maximum likelihood
estimates for all quantities of interest, including scale and shape parameters, quantiles and value-at-risk,
expected shortfall and mean and quantiles of maxima of N threshold exceedances
Usage
gpd.mle(
  xdat,
  args = c("scale", "shape", "quant", "VaR", "ES", "Nmean", "Nquant"),
  m,
  N,
  p,
  q
)
Arguments
| xdat | sample vector of excesses | 
| args | vector of strings indicating which arguments to return the maximum likelihood values for | 
| m | number of observations of interest for return levels. Required only for  | 
| N | size of block over which to take maxima. Required only for  | 
| p | tail probability, equivalent to  | 
| q | level of quantile for N-block maxima. Required only for  | 
Value
named vector with maximum likelihood values for arguments args
Examples
xdat <- mev::rgp(n = 30, shape = 0.2)
gpd.mle(xdat = xdat, N = 100, p = 0.01, q = 0.5, m = 100)
Profile log-likelihood for the generalized Pareto distribution
Description
This function calculates the (modified) profile likelihood based on the p^* formula.
There are two small-sample corrections that use a proxy for
\ell_{\lambda; \hat{\lambda}},
which are based on Severini's (1999) empirical covariance
and the Fraser and Reid tangent exponential model approximation.
Usage
gpd.pll(
  psi,
  param = c("scale", "shape", "quant", "retlev", "VaR", "ES", "Nmean", "Nquant"),
  mod = "profile",
  mle = NULL,
  dat,
  m = NULL,
  N = NULL,
  p = NULL,
  q = NULL,
  correction = TRUE,
  thresh = NULL,
  plot = TRUE,
  ...
)
Arguments
| psi | parameter vector over which to profile (unidimensional) | 
| param | string indicating the parameter to profile over | 
| mod | string indicating the model. See Details. | 
| mle | maximum likelihood estimate in  | 
| dat | sample vector of excesses, unless  | 
| m | number of observations of interest for return levels. Required only for  | 
| N | size of block over which to take maxima. Required only for  | 
| p | tail probability, equivalent to  | 
| q | level of quantile for N-block maxima. Required only for  | 
| correction | logical indicating whether to use  | 
| thresh | numerical threshold above which to fit the generalized Pareto distribution | 
| plot | logical; should the profile likelihood be displayed? Default to  | 
| ... | additional arguments such as output from call to  | 
Details
The three mod available are profile (the default), tem, the tangent exponential model (TEM) approximation and
modif for the penalized profile likelihood based on p^* approximation proposed by Severini.
For the latter, the penalization is based on the TEM or an empirical covariance adjustment term.
Value
a list with components
-  mle: maximum likelihood estimate
-  psi.max: maximum profile likelihood estimate
-  param: string indicating the parameter to profile over
-  std.error: standard error ofpsi.max
-  psi: vector of parameter\psigiven inpsi
-  pll: values of the profile log likelihood atpsi
-  maxpll: value of maximum profile log likelihood
-  family: a string indicating "gpd"
-  thresh: value of the threshold, by default zero
In addition, if mod includes tem
-  normal: maximum likelihood estimate and standard error of the interest parameter\psi
-  r: values of likelihood root corresponding to\psi
-  q: vector of likelihood modifications
-  rstar: modified likelihood root vector
-  rstar.old: uncorrected modified likelihood root vector
-  tem.psimax: maximum of the tangent exponential model likelihood
In addition, if mod includes modif
-  tem.mle: maximum of tangent exponential modified profile log likelihood
-  tem.profll: values of the modified profile log likelihood atpsi
-  tem.maxpll: value of maximum modified profile log likelihood
-  empcov.mle: maximum of Severini's empirical covariance modified profile log likelihood
-  empcov.profll: values of the modified profile log likelihood atpsi
-  empcov.maxpll: value of maximum modified profile log likelihood
Examples
## Not run: 
dat <- rgp(n = 100, scale = 2, shape = 0.3)
gpd.pll(psi = seq(-0.5, 1, by=0.01), param = 'shape', dat = dat)
gpd.pll(psi = seq(0.1, 5, by=0.1), param = 'scale', dat = dat)
gpd.pll(psi = seq(20, 35, by=0.1), param = 'quant', dat = dat, p = 0.01)
gpd.pll(psi = seq(20, 80, by=0.1), param = 'ES', dat = dat, m = 100)
gpd.pll(psi = seq(15, 100, by=1), param = 'Nmean', N = 100, dat = dat)
gpd.pll(psi = seq(15, 90, by=1), param = 'Nquant', N = 100, dat = dat, q = 0.5)
## End(Not run)
Score vector for the generalized Pareto distribution
Description
Score vector for the generalized Pareto distribution
Usage
gpd.score(par, dat)
Arguments
| par | vector of  | 
| dat | sample vector | 
See Also
Tangent exponential model approximation for the GP distribution
Description
The function gpd.tem provides a tangent exponential model (TEM) approximation
for higher order likelihood inference for a scalar parameter for the generalized Pareto distribution. Options include
scale and shape parameters as well as value-at-risk (also referred to as quantiles, or return levels)
and expected shortfall. The function attempts to find good values for psi that will
cover the range of options, but the fit may fail and return an error. In such cases, the user can try to find good
grid of starting values and provide them to the routine.
Usage
gpd.tem(
  dat,
  param = c("scale", "shape", "quant", "VaR", "retlev", "ES", "Nmean", "Nquant"),
  psi = NULL,
  m = NULL,
  thresh = 0,
  n.psi = 50,
  N = NULL,
  p = NULL,
  q = NULL,
  plot = FALSE,
  correction = TRUE,
  ...
)
Arguments
| dat | sample vector for the GP distribution | 
| param | parameter over which to profile | 
| psi | scalar or ordered vector of values for the interest parameter. If  | 
| m | number of observations of interest for return levels. See Details. Required only for  | 
| thresh | threshold value corresponding to the lower bound of the support or the location parameter of the generalized Pareto distribution. | 
| n.psi | number of values of  | 
| N | size of block over which to take maxima. Required only for  | 
| p | tail probability, equivalent to  | 
| q | level of quantile for N-block maxima. Required only for  | 
| plot | logical indicating whether  | 
| correction | logical indicating whether spline.corr should be called. | 
| ... | additional arguments, for backward compatibility | 
Details
As of version 1.11, this function is a wrapper around gpd.pll.
The interpretation for m is as follows: if there are on average m_y observations per year above the threshold, then  m = Tm_y corresponds to T-year return level.
Value
an invisible object of class fr (see tem in package hoa) with elements
-  normal: maximum likelihood estimate and standard error of the interest parameter\psi
-  par.hat: maximum likelihood estimates
-  par.hat.se: standard errors of maximum likelihood estimates
-  th.rest: estimated maximum profile likelihood at (\psi,\hat{\lambda})
-  r: values of likelihood root corresponding to\psi
-  psi: vector of interest parameter
-  q: vector of likelihood modifications
-  rstar: modified likelihood root vector
-  rstar.old: uncorrected modified likelihood root vector
-  param: parameter
Author(s)
Leo Belzile
Examples
set.seed(123)
dat <- rgp(n = 40, scale = 1, shape = -0.1)
#with plots
m1 <- gpd.tem(param = 'shape', n.psi = 50, dat = dat, plot = TRUE)
## Not run: 
m2 <- gpd.tem(param = 'scale', n.psi = 50, dat = dat)
m3 <- gpd.tem(param = 'VaR', n.psi = 50, dat = dat, m = 100)
#Providing psi
psi <- c(seq(2, 5, length = 15), seq(5, 35, length = 45))
m4 <- gpd.tem(param = 'ES', dat = dat, m = 100, psi = psi, correction = FALSE)
mev:::plot.fr(m4, which = c(2, 4))
plot(fr4 <- spline.corr(m4))
confint(m1)
confint(m4, parm = 2, warn = FALSE)
m5 <- gpd.tem(param = 'Nmean', dat = dat, N = 100, psi = psi, correction = FALSE)
m6 <- gpd.tem(param = 'Nquant', dat = dat, N = 100, q = 0.7, correction = FALSE)
## End(Not run)
Tangent exponential model statistics for the generalized Pareto distribution
Description
Matrix of approximate ancillary statistics, sample space derivative of the log likelihood and mixed derivative for the generalized Pareto distribution.
Usage
gpd.Vfun(par, dat)
gpd.phi(par, dat, V)
gpd.dphi(par, dat, V)
Arguments
| par | vector of  | 
| dat | sample vector | 
| V | vector calculated by  | 
See Also
Generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Description
Likelihood, score function and information matrix,
approximate ancillary statistics and sample space derivative
for the generalized Pareto distribution parametrized in terms of average maximum of N exceedances.
The parameter N corresponds to the number of threshold exceedances of interest over which the maxima is taken.
z is the corresponding expected value of this block maxima.
Note that the actual parametrization is in terms of excess expected mean, meaning expected mean minus threshold.
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| N | block size for threshold exceedances. | 
| tol | numerical tolerance for the exponential model | 
| V | vector calculated by  | 
Details
The observed information matrix was calculated from the Hessian using symbolic calculus in Sage.
Usage
gpdN.ll(par, dat, N, tol=1e-5)
gpdN.score(par, dat, N)
gpdN.infomat(par, dat, N, method = c('obs', 'exp'), nobs = length(dat))
gpdN.Vfun(par, dat, N)
gpdN.phi(par, dat, N, V)
gpdN.dphi(par, dat, N, V)
Functions
-  gpdN.ll: log likelihood
-  gpdN.score: score vector
-  gpdN.infomat: observed information matrix for GP parametrized in terms of mean of the maximum ofNexceedances and shape
-  gpdN.Vfun: vector implementing conditioning on approximate ancillary statistics for the TEM
-  gpdN.phi: canonical parameter in the local exponential family approximation
-  gpdN.dphi: derivative matrix of the canonical parameter in the local exponential family approximation
Author(s)
Leo Belzile
Information matrix of the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Description
Information matrix of the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Usage
gpdN.infomat(par, dat, N, method = c("obs", "exp"), nobs = length(dat))
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| N | block size for threshold exceedances. | 
See Also
Negative log likelihood of the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Description
Negative log likelihood of the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Usage
gpdN.ll(par, dat, N)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| N | block size for threshold exceedances. | 
See Also
This function returns the mean of N maxima from the GP.
Description
This function returns the mean of N maxima from the GP.
Usage
gpdN.mean(par, N)
Arguments
| par | vector of length 2 containing  | 
| N | block size for threshold exceedances. | 
See Also
This function returns the qth percentile of N maxima from the GP.
Description
This function returns the qth percentile of N maxima from the GP.
Usage
gpdN.quant(par, q, N)
Arguments
| par | vector of length 2 containing  | 
| N | block size for threshold exceedances. | 
See Also
Score vector of the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Description
Score vector of the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Usage
gpdN.score(par, dat, N)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| N | block size for threshold exceedances. | 
See Also
Tangent exponential model statistics for the generalized Pareto distribution (mean of maximum of N exceedances parametrization)
Description
Vector implementing conditioning on approximate ancillary statistics for the TEM
Usage
gpdN.Vfun(par, dat, N)
gpdN.phi(par, dat, N, V)
gpdN.dphi(par, dat, N, V)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| N | block size for threshold exceedances. | 
| V | vector calculated by  | 
See Also
Generalized Pareto distribution (expected shortfall parametrization)
Description
Likelihood, score function and information matrix, approximate ancillary statistics and sample space derivative for the generalized Pareto distribution parametrized in terms of expected shortfall.
The parameter m corresponds to \zeta_u/(1-\alpha), where \zeta_u is the rate of exceedance over the threshold
u and \alpha is the percentile of the expected shortfall.
Note that the actual parametrization is in terms of excess expected shortfall, meaning expected shortfall minus threshold.
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
| tol | numerical tolerance for the exponential model | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations | 
| V | vector calculated by  | 
Details
The observed information matrix was calculated from the Hessian using symbolic calculus in Sage.
Usage
gpde.ll(par, dat, m, tol=1e-5)
gpde.ll.optim(par, dat, m, tol=1e-5)
gpde.score(par, dat, m)
gpde.infomat(par, dat, m, method = c('obs', 'exp'), nobs = length(dat))
gpde.Vfun(par, dat, m)
gpde.phi(par, dat, V, m)
gpde.dphi(par, dat, V, m)
Functions
-  gpde.ll: log likelihood
-  gpde.ll.optim: negative log likelihood parametrized in terms of log expected shortfall and shape in order to perform unconstrained optimization
-  gpde.score: score vector
-  gpde.infomat: observed information matrix for GPD parametrized in terms of rate of expected shortfall and shape
-  gpde.Vfun: vector implementing conditioning on approximate ancillary statistics for the TEM
-  gpde.phi: canonical parameter in the local exponential family approximation
-  gpde.dphi: derivative matrix of the canonical parameter in the local exponential family approximation
Author(s)
Leo Belzile
Observed information matrix for the GP distribution (expected shortfall)
Description
The information matrix is parametrized in terms of excess expected shortfall and shape
Usage
gpde.infomat(par, dat, m, method = c("obs", "exp"), nobs = length(dat))
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations | 
See Also
Negative log likelihood of the generalized Pareto distribution (expected shortfall)
Description
Negative log likelihood of the generalized Pareto distribution (expected shortfall)
Negative log likelihood of the generalized Pareto distribution (expected shortfall) - optimization The negative log likelihood is parametrized in terms of log expected shortfall and shape in order to perform unconstrained optimization
Usage
gpde.ll(par, dat, m)
gpde.ll.optim(par, dat, m)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
See Also
Score vector for the GP distribution (expected shortfall)
Description
Score vector for the GP distribution (expected shortfall)
Usage
gpde.score(par, dat, m)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
See Also
Tangent exponential model statistics for the generalized Pareto distribution (expected shortfall)
Description
Vector implementing conditioning on approximate ancillary statistics for the TEM
Usage
gpde.Vfun(par, dat, m)
gpde.phi(par, dat, V, m)
gpde.dphi(par, dat, V, m)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
| V | vector calculated by  | 
See Also
Generalized Pareto distribution
Description
Density function, distribution function, quantile function and random number generation for the generalized Pareto distribution.
Usage
pgp(q, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, log.p = FALSE)
dgp(x, loc = 0, scale = 1, shape = 0, log = FALSE)
qgp(p, loc = 0, scale = 1, shape = 0, lower.tail = TRUE)
rgp(n, loc = 0, scale = 1, shape = 0)
Arguments
| loc | location parameter. | 
| scale | scale parameter, strictly positive. | 
| shape | shape parameter. | 
| lower.tail | logical; if  | 
| log.p,log | logical; if  | 
| x,q | vector of quantiles | 
| p | vector of probabilities | 
| n | scalar number of observations | 
References
Coles, S. G. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer-Verlag, London. doi:10.1007/978-1-4471-3675-0_3
Generalized Pareto distribution (return level parametrization)
Description
Likelihood, score function and information matrix, approximate ancillary statistics and sample space derivative for the generalized Pareto distribution parametrized in terms of return levels.
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
| tol | numerical tolerance for the exponential model | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations | 
| V | vector calculated by  | 
Details
The observed information matrix was calculated from the Hessian using symbolic calculus in Sage.
The interpretation for m is as follows: if there are on average m_y observations per year above the threshold, then  m=Tm_y corresponds to T-year return level.
Usage
gpdr.ll(par, dat, m, tol=1e-5)
gpdr.ll.optim(par, dat, m, tol=1e-5)
gpdr.score(par, dat, m)
gpdr.infomat(par, dat, m, method = c('obs', 'exp'), nobs = length(dat))
gpdr.Vfun(par, dat, m)
gpdr.phi(par, V, dat, m)
gpdr.dphi(par, V, dat, m)
Functions
-  gpdr.ll: log likelihood
-  gpdr.ll.optim: negative log likelihood parametrized in terms oflog(scale)and shape in order to perform unconstrained optimization
-  gpdr.score: score vector
-  gpdr.infomat: observed information matrix for GPD parametrized in terms of rate ofm-year return level and shape
-  gpdr.Vfun: vector implementing conditioning on approximate ancillary statistics for the TEM
-  gpdr.phi: canonical parameter in the local exponential family approximation
-  gpdr.dphi: derivative matrix of the canonical parameter in the local exponential family approximation
Author(s)
Leo Belzile
Observed information matrix for GP distribution (return levels)
Description
The information matrix is parametrized in terms of rate of m-year return level and shape
Usage
gpdr.infomat(par, dat, m, method = c("obs", "exp"), nobs = length(dat))
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations | 
See Also
Negative log likelihood of the generalized Pareto distribution (return levels)
Description
Negative log likelihood of the generalized Pareto distribution (return levels)
Negative log likelihood parametrized in terms of log return level and shape in order to perform unconstrained optimization
Usage
gpdr.ll(par, dat, m)
gpdr.ll.optim(par, dat, m)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
See Also
Score of the profile log likelihood for the GP distribution (return levels parametrization)
Description
Score of the profile log likelihood for the GP distribution (return levels parametrization)
Usage
gpdr.score(par, dat, m)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
See Also
Tangent exponential model statistics for the generalized Pareto distribution (return level)
Description
Vector implementing conditioning on approximate ancillary statistics for the TEM
Usage
gpdr.Vfun(par, dat, m)
gpdr.phi(par, dat, V, m)
gpdr.dphi(par, dat, V, m)
Arguments
| par | vector of length 2 containing  | 
| dat | sample vector | 
| m | number of observations of interest for return levels. See Details | 
| V | vector calculated by  | 
See Also
Transformation from the generalized Pareto to unit Pareto
Description
Transformation from the generalized Pareto to unit Pareto
Usage
gpdtopar(dat, loc = 0, scale, shape, lambdau = 1)
Arguments
| dat | vector or matrix of data | 
| loc | vector of location parameters | 
| scale | vector of scale parameters, strictly positive | 
| shape | shape parameter | 
| lambdau | vector of probability of marginal threshold exceedance | 
Value
a vector or matrix of the same dimension as dat with unit Pareto observations
Interpret bivariate threshold exceedance models
Description
This is an adaptation of the evir package interpret.gpdbiv function.
interpret.fbvpot deals with the output of a call to
fbvpot from the evd and to handle families other than the logistic distribution.
The likelihood derivation comes from expression 2.10 in Smith et al. (1997).
Usage
ibvpot(fitted, q, silent = FALSE)
Arguments
| fitted | the output of  | 
| q | a vector of quantiles to consider, on the data scale. Must be greater than the thresholds. | 
| silent | boolean; whether to print the interpretation of the result. Default to  | 
Details
The list fitted must contain
-  modela string; seebvevdfrom packageevdfor options
-  parama named vector containing the parameters of themodel, as well as parametersscale1,shape1,scale2andshape2, corresponding to marginal GPD parameters.
-  thresholda vector of length 2 containing the two thresholds.
-  patthe proportion of observations above the correspondingthreshold
Value
an invisible numeric vector containing marginal, joint and conditional exceedance probabilities.
Author(s)
Leo Belzile, adapting original S code by Alexander McNeil
References
Smith, Tawn and Coles (1997), Markov chain models for threshold exceedances. Biometrika, 84(2), 249–268.
See Also
interpret.gpdbiv in package evir
Examples
if (requireNamespace("evd", quietly = TRUE)) {
y <- rgp(1000,1,1,1)
x <- y*rmevspec(n=1000,d=2,sigma=cbind(c(0,0.5),c(0.5,0)), model='hr')
mod <- evd::fbvpot(x, threshold = c(1,1), model = 'hr', likelihood ='censored')
ibvpot(mod, c(20,20))
}
Information matrix test statistic and MLE for the extremal index
Description
The information matrix test (IMT), proposed by Suveges and Davison (2010), is based
on the difference between the expected quadratic score and the second derivative of
the log-likelihood. The asymptotic distribution for each threshold u and gap K
is asymptotically \chi^2 with one degree of freedom. The approximation is good for
N>80 and conservative for smaller sample sizes. The test assumes independence between gaps.
Usage
infomat.test(xdat, thresh, q, K, plot = TRUE, ...)
Arguments
| xdat | data vector | 
| thresh | threshold vector | 
| q | vector of probability levels to define threshold if  | 
| K | int specifying the largest K-gap | 
| plot | logical: should the graphical diagnostic be plotted? | 
| ... | additional arguments, currently ignored | 
Details
The procedure proposed in Suveges & Davison (2010) was corrected for erratas.
The maximum likelihood is based on the limiting mixture distribution of
the intervals between exceedances (an exponential with a point mass at zero).
The condition D^{(K)}(u_n) should be checked by the user.
Fukutome et al. (2015) propose an ad hoc automated procedure
- Calculate the interexceedance times for each K-gap and each threshold, along with the number of clusters 
- Select the ( - u,- K) pairs for which IMT < 0.05 (corresponding to a P-value of 0.82)
- Among those, select the pair ( - u,- K) for which the number of clusters is the largest
Value
an invisible list of matrices containing
-  IMTa matrix of test statistics
-  pvalsa matrix of approximate p-values (corresponding to probabilities under a\chi^2_1distribution)
-  mlea matrix of maximum likelihood estimates for each given pair (u,K)
-  loglika matrix of log-likelihood values at MLE for each given pair (u,K)
-  thresholda vector of thresholds based on empirical quantiles at supplied levels.
-  qthe vectorqsupplied by the user
-  Kthe largest gap number, supplied by the user
Author(s)
Leo Belzile
References
Fukutome, Liniger and Suveges (2015), Automatic threshold and run parameter selection: a climatology for extreme hourly precipitation in Switzerland. Theoretical and Applied Climatology, 120(3), 403-416.
Suveges and Davison (2010), Model misspecification in peaks over threshold analysis. Annals of Applied Statistics, 4(1), 203-221.
White (1982), Maximum Likelihood Estimation of Misspecified Models. Econometrica, 50(1), 1-25.
Examples
infomat.test(xdat = rgp(n = 10000),
             q = seq(0.1, 0.9, length = 10),
             K = 3)
Intensity function for the Brown-Resnick model
Description
The intensity function includes the normalizing constants
Usage
intensBR(tdat, Lambda, cholPrecis = NULL)
Arguments
| tdat | matrix of unit Pareto observations | 
| Lambda | conditionally negative definite parameter matrix of the Huesler–Reiss model | 
| cholPrecis | Cholesky root of the corresponding precision matrix  | 
Value
log intensity contribution
Intensity function for the extremal Student model
Description
The intensity function includes the normalizing constants
Usage
intensXstud(tdat, df, Sigma, cholPrecis = NULL)
Arguments
| tdat | matrix of unit Pareto observations | 
| df | degrees of freedom, must be larger than 1 | 
| Sigma | scale matrix | 
| cholPrecis | Cholesky root of the precision matrix  | 
Value
log intensity contribution
Jacobian of the transformation from generalized Pareto to unit Pareto distribution
Description
If dat is a vector, the arguments loc, scale and shape should be numericals of length 1.
Usage
jac_gpd_pareto(dat, loc = 0, scale, shape, lambdau = 1, censored)
Arguments
| dat | vector or matrix of data | 
| loc | vector of location parameters | 
| scale | vector of scale parameters, strictly positive | 
| shape | shape parameter | 
| lambdau | vector of probability of marginal threshold exceedance | 
| censored | a matrix of logical indicating whether the observations are censored | 
Value
log-likelihood contribution for the Jacobian
Estimators of the tail coefficient
Description
Estimators proposed by Krupskii and Joe under second order expansion
for the coefficient of tail dependence \eta and the
joint tail orthant probability
Usage
kjtail(
  xdat,
  qlev,
  ptail = NULL,
  mqu = NULL,
  type = 1,
  ties.method = eval(formals(rank)$ties.method),
  ...
)
Arguments
| xdat | a matrix of observations | 
| qlev | vector of quantile levels | 
| ptail | tail probability smaller than  | 
| mqu | marginal quantile levels for semiparametric estimation; data above this are modelled using a generalized Pareto distribution. If  | 
| type | integer indicating the estimator type | 
| ties.method | method for handling of ties in rank transformation | 
| ... | additional arguments, for backward compatibility | 
Value
a list with elements
-  pquantile level for estimation
-  etamatrix of estimated coefficient of tail dependence\eta, and standard errors
-  k1parameter of the tail expansion
-  patproportion of observations above the threshold
-  lambdatail dependence coefficient (sic)
-  tailprobtail probability, ifptailis provided
Note
EXPERIMENTAL. The numerical optimization of the likelihood surface is difficult, as the function is ill-behaved. Visual inspection of estimates is often necessary to check for possible lack of convergence.
Examples
d <- 2
rho <- 0.9
Sigma <- matrix(rho, d, d) + diag(1 - rho, d)
eta_true <- 1/sum(Sigma)
data <- rmnorm(
   n = 1e4,
   mu = rep(0, d),
  Sigma = Sigma)
q <- seq(0.95, 0.995, by = 0.005)
kj <- kjtail(xdat = data, qlev = q)
plot(kj)
abline(h = (1+rho)/2, col = 2)
Estimation of the bivariate angular dependence function of Wadsworth and Tawn (2013)
Description
Estimation of the bivariate angular dependence function of Wadsworth and Tawn (2013)
Usage
lambdadep(
  dat,
  qu = 0.95,
  method = c("hill", "mle", "bayes"),
  plot = TRUE,
  level = 0.95,
  ...
)
Arguments
| dat | an  | 
| qu | quantile level on uniform scale at which to threshold data. Default to 0.95 | 
| method | string indicating the estimation method | 
| plot | logical indicating whether to return the graph of  | 
| level | level for confidence intervals, default to 0.95 | 
| ... | additional arguments, used for backward compatibility The confidence intervals are based on normal quantiles. The standard errors for the  | 
Value
a plot of the lambda function if plot=TRUE, plus an invisible list with components
-  wthe sequence of angles in (0,1) at which thelambdavalues are evaluated
-  lambdapoint estimates of lambda
-  lower.confintlevel% confidence interval for lambda (lower bound)
-  upper.confintlevel% confidence interval for lambda (upper bound)
Examples
## Not run: 
set.seed(12)
dat <- rmev(n = 1000, d = 2, model = "log", param = 0.1)
lambdadep(dat, method = 'hill')
lambdadep(dat, method = 'bayes')
lambdadep(dat, method = 'mle')
# With independent observations
dat <- matrix(runif(n = 2000), ncol = 2)
lambdadep(dat, method = 'hill')
## End(Not run)
Leeds air pollution
Description
Daily maximum data (hourly for PM10) on air pollution for the Leeds Centre station in Yorkshire and Humberside station. The data goes from January 1st, 1993, until December 31st, 2024. Data show seasonality and there are some outliers. From December 2nd, 2008 onwards, particulate matters (PM10 and PM2.5) are measured using a tapered element oscillating microbalance (TEOM) and Filter Dynamics Measurement System (FDMS). The data for PM2.5 is missing before the change of instrumentation. A total of 231 daily measurements with only missing values were removed during preprocessing.
Usage
leedspollution
Format
A data frame with 11455 rows and 8 variables:
- date
- [character] a date with format yyy-mm-dd 
- O3
- [integer] ozone (in nanograms per cubic meter) 
- NO
- [integer] nitrogen oxyde (in nanograms per cubic meter) 
- CO
- [double] carbon monoxyde (in micrograms per cubic meter) 
- NO2
- nitrogen dioxyde (in nanograms per cubic meter) 
- SO2
- sulphur dioxide (in nanograms per cubic meter) 
- PM10
- [integer] particulate matter 10, (in nanograms per cubic meter) 
- PM2.5
- [integer] particulate matter 2.5, (in nanograms per cubic meter) 
Source
Crown 2025 copyright Defra via uk-air.defra.gov.uk, licenced under the Open Government Licence (OGL).
Likelihood for multivariate peaks over threshold models
Description
Likelihood for the various parametric limiting models over region determined by
\{y \in F: \max_{j=1}^D \sigma_j \frac{y^\xi_j-1}{\xi_j}+\mu_j  > u\};
where \mu is loc, \sigma is scale and \xi is shape.
Usage
likmgp(
  dat,
  thresh,
  loc,
  scale,
  shape,
  par,
  model = c("log", "br", "xstud"),
  likt = c("mgp", "pois", "binom"),
  lambdau = 1,
  ...
)
Arguments
| dat | matrix of observations | 
| thresh | functional threshold for the maximum | 
| loc | vector of location parameter for the marginal generalized Pareto distribution | 
| scale | vector of scale parameter for the marginal generalized Pareto distribution | 
| shape | vector of shape parameter for the marginal generalized Pareto distribution | 
| par | list of parameters:  | 
| model | string indicating the model family, one of  | 
| likt | string indicating the type of likelihood, with an additional contribution for the non-exceeding components: one of   | 
| lambdau | vector of marginal rate of marginal threshold exceedance. | 
| ... | additional arguments (see Details) | 
Details
Optional arguments can be passed to the function via ...
-  clcluster instance created bymakeCluster(default toNULL)
-  ncorsnumber of cores for parallel computing of the likelihood
-  mmaxmaximum per column
-  B1number of replicates for quasi Monte Carlo integral for the exponent measure
-  genvec1generating vector for the quasi Monte Carlo routine (exponent measure), associated withB1
Value
the value of the log-likelihood with attributes expme, giving the exponent measure
Note
The location and scale parameters are not identifiable unless one of them is fixed.
Maiquetia Daily Rainfall
Description
Daily cumulated rainfall (in mm) at Maiquetia airport, Venezuela. The observations cover the period from January 1961 to December 1999. The original series had missing days in February 1996 (during which there were 2 days with 1hr each of light rain) and January 1998 (no rain). These were replaced by zeros.
Format
a vector of size 14244 containing daily rainfall (in mm),
Source
J.R. Cordova and M. González, accessed 25.11.2018 from <https://rss.onlinelibrary.wiley.com/hub/journal/14679876/series-c-datasets>
References
Coles, S. and L.R. Pericchi (2003). Anticipating Catastrophes through Extreme Value Modelling, Applied Statistics, 52(4), 405-416.
Coles, S., Pericchi L.R. and S. Sisson (2003). A fully probabilistic approach to extreme rainfall modeling, Journal of Hydrology, 273, 35-50.
Examples
## Not run: 
data(maiquetia, package = "mev")
day <- seq.Date(from = as.Date("1961-01-01"), to = as.Date("1999-12-31"), by = "day")
nzrain <- maiquetia[substr(day, 1, 4) < 1999 & maiquetia > 0]
fit.gpd(nzrain, threshold = 30, show = TRUE)
## End(Not run)
Transform arguments using max stability
Description
Given a vector of location, scale and shape parameters,
compute the corresponding parameters for block of size
m assuming a generalized extreme value distribution.
Usage
maxstable(pars, m = 1L, inverse = FALSE)
Arguments
| pars | vector of location, scale and shape parameters | 
| m | [integer] block size | 
| inverse | [logical] whether to compute the parameters for the inverse relationship (defaults to  | 
Examples
maxstable(pars = maxstable(pars = c(1,2,0), m = 10), m = 10, inv = TRUE)
maxstable(pars = maxstable(pars = c(1,2,0.1), m = 5), m = 1/5)
P-P plot for testing max stability
Description
The diagnostic, proposed by Gabda, Towe, Wadsworth and Tawn,
relies on the fact that, for max-stable vectors on the unit Gumbel scale,
the distribution of the maxima is Gumbel distribution with a location parameter equal to the exponent measure.
One can thus consider tuples of size m and estimate the location parameter via maximum likelihood
and transforming observations to the standard Gumbel scale. Replicates are then pooled and empirical quantiles are defined.
The number of combinations of m vectors can be prohibitively large, hence only nmax randomly selected
tuples are selected from all possible combinations. The confidence intervals are obtained by a
nonparametric bootstrap, by resampling observations with replacement observations for the selected tuples and re-estimating the
location parameter. The procedure can be computationally intensive as a result.
Usage
maxstabtest(
  dat,
  m = prod(dim(dat)[-1]),
  nmax = 500L,
  B = 1000L,
  ties.method = "random",
  plot = TRUE
)
Arguments
| dat | matrix or array of max-stable observations, typically block maxima. The first dimension should consist of replicates | 
| m | integer indicating how many tuples should be aggregated. | 
| nmax | maximum number of pairs. Default to 500L. | 
| B | number of nonparametric bootstrap replications. Default to 1000L. | 
| ties.method | string indicating the method for  | 
| plot | logical indicating whether a graph should be produced (default to  | 
Value
a Tukey probability-probability plot with 95
References
Gabda, D.; Towe, R. Wadsworth, J. and J. Tawn, Discussion of “Statistical Modeling of Spatial Extremes” by A. C. Davison, S. A. Padoan and M. Ribatet. Statist. Sci. 27 (2012), no. 2, 189–192.
Multivariate Normal distribution sampler
Description
Sampler derived using the eigendecomposition of the covariance
matrix Sigma. The function uses the Armadillo random normal generator
Usage
mvrnorm(n, mu, Sigma)
Arguments
| n | sample size | 
| mu | mean vector. Will set the dimension | 
| Sigma | a square covariance matrix, of same dimension as  | 
Value
an n sample from a multivariate Normal distribution
Examples
mvrnorm(n = 10, mu = c(0,2), Sigma = diag(2))
River Nidd Flow
Description
The data consists of exceedances over the threshold 65 cubic meter per second of the River Nidd at Hunsingore Weir, for 35 years of data between 1934 and 1969.
Format
a vector of size 154
Source
Natural Environment Research Council (1975). Flood Studies Report, volume 4. pp. 235–236.
References
Davison, A.C. and R.L. Smith (1990). Models for Exceedances over High Thresholds (with discussion), Journal of the Royal Statistical Society. Series B (Methodological), 52(3), 393–442.
See Also
nidd.thresh from the evir package
Nutrient data
Description
Interview component of survey 'What we eat in America'. These are extracted from the 2015–2016 National Health and Nutrition Examination Survey (NHANES, https://wwwn.cdc.gov/nchs/nhanes/Default.aspx) report and consist of the total nutrients for all food and beverage intake ingested over a 24 hours period.
Usage
nutrients
Format
A data frame with 9544 rows and 38 variables:
- prot
- proteins (in grams) 
- carb
- carbonhydrate (in gram) 
- sugr
- total sugars (in gram) 
- fibe
- dietary fibers (in grams) 
- tfat
- total fat (in grams) 
- sfat
- saturated fat (in grams) 
- mfat
- monounsaturated fat (in grams) 
- pfat
- polyunsaturated fat (in grams) 
- chol
- cholesterol (in milligrams) 
- atoc
- vitamin E as alpha-tocopherol (in milligrams) 
- ret
- retinol (in micrograms) 
- vara
- Vitamin A as retinol activity equivalents (in micrograms). 
- acar
- alpha-carotene (in micrograms) 
- bcar
- beta-carotene (in micrograms) 
- cryp
- beta-cryptoxanthin (in micrograms) 
- lyco
- lycopene (in micrograms) 
- lz
- lutein and zeaxanthin (in micrograms). 
- vb1
- thiamin (vitamin B1, in milligrams) 
- vb2
- riboflavin (vitamin B2, in milligrams) 
- niac
- niacin (in milligrams) 
- vb6
- vitamin B5 (in milligrams) 
- fola
- total folate (in micrograms) 
- fa
- folic acid (in micrograms) 
- ff
- food folate (in micrograms) 
- chl
- total choline (in milligrams) 
- vb12
- vitamin B12 (in micrograms) 
- vc
- vitamin C (in milligrams) 
- vd
- vitamin D (comprising D2 and D3, in micrograms) 
- vk
- vitamin K (in micrograms) 
- calc
- calcium (in milligrams) 
- phos
- phosphorus (in milligrams) 
- magn
- magnesium (in milligrams) 
- iron
- iron (in milligrams) 
- zinc
- zinc (in milligrams) 
- copp
- copper (in milligrams) 
- sodi
- sodium (in milligrams) 
- pota
- potassium (in milligrams) 
- sele
- selenium (in micrograms) 
Details
Note that the sample design oversampled specific population targets and that only respondants are provided. The website contains more information about sampling weights. There are multiple missing records.
Note
These data are subject to a data user agreement, available at https://www.cdc.gov/nchs/policy/data-user-agreement.html
Source
National Center for Health Statistics, now available from the Wayback Machine via https://web.archive.org/web/20201029113801/https://wwwn.cdc.gov/Nchs/Nhanes/2015-2016/DR1TOT_I.XPT
Deaths from pandemics
Description
The data base contains estimated records of the number of deaths from pandemics.
Usage
pandemics
Format
A data frame with 72 rows and 8 variables:
- event
- name of the event 
- startyear
- start year of the event 
- endyear
- end year of the event 
- lower
- lower bound on estimated deaths (in thousands) 
- average
- average estimated deaths (in thousands) 
- upper
- upper bound on estimated deaths (in thousands) 
- saverage
- scaled average of estimated deaths (in thousands) 
- population
- estimated population at risk (in thousands) 
Source
Cirillo, P. and N.N. Taleb (2020). Tail risk of contagious diseases. Nat. Phys. 16, 606–613 (2020). <doi:10.1038/s41567-020-0921-x>
Smith's penultimate approximations
Description
The function takes as arguments the distribution and density functions. There are two options:
method='bm' yields block maxima and method='pot' threshold exceedances.
For method='bm', the user should provide in such case the block sizes via the
argument m, whereas if method='pot', a vector of threshold values should be
provided. The other argument (thresh or m depending on the method) is ignored.
Usage
penultimate(family, method = c("bm", "pot"), thresh, qlev, m, ...)
Arguments
| family | the name of the parametric family. Will be used to obtain  | 
| method | either block maxima ( | 
| thresh | vector of thresholds for method  | 
| qlev | vector of quantile levels for method  | 
| m | vector of block sizes for method  | 
| ... | additional arguments passed to  | 
Details
Alternatively, the user can provide functions densF, quantF and distF for the density,
quantile function and distribution functions, respectively. The user can also supply the derivative
of the density function, ddensF. If the latter is missing, it will be approximated using finite-differences.
For method = "pot", the function computes the reciprocal hazard and its derivative on the log scale to avoid numerical overflow. Thus, the density function should have argument log and the distribution function arguments log.p and lower.tail, respectively.
Value
a data frame containing
-  loc: location parameters (method='bm')
-  scale: scale parameters
-  shape: shape parameters
-  thresh: thresholds (ifmethod='pot'), percentile corresponding to threshold (ifmethod='pot')
-  m: block sizes (ifmethod='bm')
Author(s)
Leo Belzile
References
Smith, R.L. (1987). Approximations in extreme value theory. Technical report 205, Center for Stochastic Process, University of North Carolina, 1–34.
Examples
# Threshold exceedance for Normal variables
quants <- seq(1, 5, by = 0.02)
penult <- penultimate(
   family = "norm",
   method = 'pot',
   thresh = quants,
   ddensF = function(x){-x*dnorm(x)}, # optional argument
   )
plot(x = quants,
     y = penult$shape,
     type = 'l',
     xlab = 'quantile',
    ylab = 'Penultimate shape',
    ylim = c(-0.5, 0))
# Block maxima for Gamma variables
# User must provide arguments for shape (or rate), for which there is no default
m <- seq(30, 3650, by = 30)
penult <- penultimate(family = 'gamma', method = 'bm', m = m, shape = 0.1)
plot(x = m,
     y = penult$shape,
     type = 'l',
     xlab = 'quantile',
     ylab = 'penultimate shape')
# Comparing density of GEV approximation with true density of maxima
m <- 100 # block of size 100
p <- penultimate(
  family = 'norm',
  ddensF = function(x){-x*dnorm(x)},
  method = 'bm',
  m = m)
x <- seq(1, 5, by = 0.01)
plot(
  x = x,
  y = m * dnorm(x) * exp((m-1) * pnorm(x, log.p = TRUE)),
  type = 'l',
  ylab = 'density',
  main = 'Distribution of the maxima of\n 100 standard normal variates')
lines(x, mev::dgev(x, loc = p$loc, scale = p$scale, shape = 0), col = 2)
lines(x, mev::dgev(x, loc = p$loc, scale = p$scale, shape = p$shape), col = 4)
legend(
 x = 'topright',
 lty = c(1, 1, 1),
 col = c(1, 2, 4),
 legend = c('exact', 'ultimate', 'penultimate'),
 bty = 'n')
Extended GP functions
Description
These functions are documented in extgp and in extgp.G for the carrier distributions supported in the unit interval.
Usage
pextgp.G(u, type = 1, prob, kappa, delta)
dextgp.G(u, type = 1, prob = NA, kappa = NA, delta = NA, log = FALSE)
qextgp.G(u, type = 1, prob = NA, kappa = NA, delta = NA)
rextgp.G(
  n,
  prob = NA,
  kappa = NA,
  delta = NA,
  type = 1,
  unifsamp = NULL,
  direct = FALSE,
  censoring = c(0, 1)
)
pextgp(q, prob = NA, kappa = NA, delta = NA, sigma = NA, xi = NA, type = 1)
dextgp(
  x,
  prob = NA,
  kappa = NA,
  delta = NA,
  sigma = NA,
  xi = NA,
  type = 1,
  log = FALSE
)
qextgp(
  p,
  prob = NA,
  kappa = NA,
  delta = NA,
  sigma = NA,
  xi = NA,
  type = 1,
  step = 0
)
rextgp(
  n,
  prob = NA,
  kappa = NA,
  delta = NA,
  sigma = NA,
  xi = NA,
  type = 1,
  unifsamp = NULL,
  censoring = c(0, Inf)
)
See Also
Plot of (modified) profile likelihood
Description
The function plots the (modified) profile likelihood and the tangent exponential profile likelihood
Usage
## S3 method for class 'eprof'
plot(x, ...)
Arguments
| x | |
| ... | further arguments to  | 
Value
a graph of the (modified) profile likelihoods
References
Brazzale, A. R., Davison, A. C. and Reid, N. (2007). Applied Asymptotics: Case Studies in Small-Sample Statistics. Cambridge University Press, Cambridge.
Severini, T. A. (2000). Likelihood Methods in Statistics. Oxford University Press, Oxford.
Plot of tangent exponential model profile likelihood
Description
This function is adapted from the plot.fr function from the hoa package bundle.
It differs from the latter mostly in the placement of legends.
Usage
## S3 method for class 'fr'
plot(x, ...)
Arguments
| x | |
| ... | further arguments to  | 
Details
Plots produced depend on the integers provided in which. 1 displays the Wald pivot, the likelihood root r, the modified likelihood root rstar and the likelihood modification q as functions of the parameter psi. 2 gives the renormalized profile log likelihood and adjusted form, with the maximum likelihood having ordinate value of zero. 3 provides the significance function, a transformation of 1. Lastly, 4 plots the correction factor as a function of the likelihood root; it is a diagnostic plot aimed for detecting failure of
the asymptotic approximation, often due to poor numerics in a neighborhood of r=0; the function should be smooth. The function spline.corr is designed to handle this by correcting numerically unstable estimates, replacing outliers and missing values with the fitted values from the fit.
Value
graphs depending on argument which
References
Brazzale, A. R., Davison, A. C. and Reid, N. (2007). Applied Asymptotics: Case Studies in Small-Sample Statistics. Cambridge University Press, Cambridge.
Plots for random block maximum estimator
Description
The function returns plot of the shape estimator along with the value (and 95% Wald-based confidence interval) at the selected threshold, or a plot of the empirical Bayes risk.
Usage
## S3 method for class 'mev_shape_rbm'
plot(x, type = c("shape", "risk"), log = TRUE, ...)
Arguments
| x | object of class  | 
| type | [string] type of plot, either  | 
| log | [logical] if  | 
| ... | additional arguments, currently ignored | 
Value
one or more plots
Plots for Varty and al. metric-based threshold selection
Description
This S3 method produces quantile-quantile plots with confidence and tolerance bands on various scale (uniform, exponential, generalized Pareto), or a plot of the metric as a function of the threshold.
Usage
## S3 method for class 'mev_thselect_vmetric'
plot(
  x,
  type = c("qq", "pp", "exp", "metric"),
  B = 1000L,
  probs = c(0.025, 0.975),
  ...
)
Arguments
| x | an object of class  | 
| type | string; a single string indicating the choice of plot | 
| B | number of simulations for variability of estimation | 
| probs | quantile levels for intervals. | 
| ... | additional arguments, currently ignored | 
Value
NULL; the function is used to produce a plot
Sequential analysis diagnostic plots for threshold selection
Description
Sequential analysis diagnostic plots for threshold selection
Usage
## S3 method for class 'mev_thselect_wadsworth'
plot(x, type = c("wn", "ps"), ...)
Arguments
| x | object returned by a call to  | 
| type | string giving the plots to produce | 
| ... | additional arguments passed to plotting function | 
Value
NULL; the method is used to generate plots
Mean residual life parameter stability plot
Description
Mean residual life parameter stability plot
Usage
## S3 method for class 'mev_tstab_mrl'
plot(
  x,
  xlab = c("thresh", "nexc"),
  level = 0.95,
  type = c("band", "ptwise"),
  ...
)
Arguments
| x | object resulting from a call to  | 
| xlab | [string]; whether to plot mean residual life plot as a function of threshold value of number of exceedances | 
| level | [numeric] level of Wald confidence intervals | 
| type | [string] whether to plot pointwise confidence intervals using segments ( | 
| ... | additional arguments, currently ignored | 
Value
NULL; use to produce plots
Power variogram model
Description
The power variogram model is
\gamma(h) = (\|h\|/\lambda)^\alpha, \quad \lambda>0, \alpha \in [0,2).
Usage
power.vario(h, alpha, scale = 1)
Arguments
| h | vector or matrix of pairwise distances | 
| alpha | smoothness parameter | 
| scale | scale parameter | 
Value
a vector or matrix of variogram values of the same length as h
Power exponential correlation model
Description
The power correlation model is
\rho(h) = \exp\{-(\|h\|/\lambda)^\alpha\}, \quad \lambda>0, \alpha \in [0,2).
Usage
powerexp.cor(h, alpha = 1, scale = 1)
Arguments
| h | vector or matrix of pairwise distances | 
| alpha | smoothness parameter | 
| scale | scale parameter | 
Value
a vector or matrix of correlations of the same dimensions as h
Poisson process of extremes.
Description
Likelihood, score function and information matrix for the Poisson process likelihood.
Arguments
| par | vector of  | 
| dat | sample vector | 
| u | threshold | 
| method | string indicating whether to use the expected  ( | 
| np | number of periods of observations. This is a post hoc adjustment for the intensity so that the parameters of the model coincide with those of a generalized extreme value distribution with block size  | 
| nobs | number of observations for the expected information matrix. Default to  | 
Usage
pp.ll(par, dat)
pp.ll(par, dat, u, np)
pp.score(par, dat)
pp.infomat(par, dat, method = c('obs', 'exp'))
Functions
-  pp.ll: log likelihood
-  pp.score: score vector
-  pp.infomat: observed or expected information matrix
Author(s)
Leo Belzile
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer, 209 p.
Wadsworth, J.L. (2016). Exploiting Structure of Maximum Likelihood Estimators for Extreme Value Threshold Selection, Technometrics, 58(1), 116-126, http://dx.doi.org/10.1080/00401706.2014.998345.
Sharkey, P. and J.A. Tawn (2017). A Poisson process reparameterisation for Bayesian inference for extremes, Extremes, 20(2), 239-263, http://dx.doi.org/10.1007/s10687-016-0280-2.
Information matrix for the Poisson process likelihood
Description
The function returns the expected or observed information matrix.
Usage
pp.infomat(par, dat, method = c("obs", "exp"), u, np = 1, nobs = length(dat))
Arguments
| par | vector of  | 
| dat | sample vector | 
| method | string indicating whether to use the expected  ( | 
| u | threshold | 
| np | number of periods of observations. This is a post hoc adjustment for the intensity so that the parameters of the model coincide with those of a generalized extreme value distribution with block size  | 
| nobs | number of observations for the expected information matrix. Default to  | 
Value
information matrix of the NHPP
Note
For the expected information matrix, the number of points above the threshold is random, but should correspond to
np\Lambda. The parametrization for np is shared between fit.pp, pp.ll, etc.
The entries for the information matrix are given in Sharkey and Tawn (2017), but contains some typos which were corrected.
References
Sharkey, P. and J.A. Tawn (2017). A Poisson process reparameterisation for Bayesian inference for extremes, Extremes, 20(2), 239-263, http://dx.doi.org/10.1007/s10687-016-0280-2.
See Also
Examples
## Not run: 
dat <- rgp(n <- 1e3, 0.1, 2, -0.1)
np <- 10
mle <- fit.pp(dat, threshold = 0, np =  np)$par
info_obs <- pp.infomat(par = mle, dat = dat, method = "obs", u = 0, np = np)
info_exp <- pp.infomat(par = mle, dat = dat, method = "exp", u = 0, np = np)
info_obs/info_exp
## End(Not run)
Log-likelihood of Poisson process of threshold exceedances
Description
This function returns the log-likelihood of the non-homogeneous Poisson process
of exceedances above threshold u, adjusted so that there are np periods
of observations.
Usage
pp.ll(par, dat, u, np = 1)
Arguments
| par | vector of  | 
| dat | sample vector | 
| u | threshold | 
| np | number of periods of observations. This is a post hoc adjustment for the intensity so that the parameters of the model coincide with those of a generalized extreme value distribution with block size  | 
Value
log-likelihood of the NHPP
See Also
Score vector for the Poisson process of threshold exceedances
Description
Returns the score vector of the NHPP.
Usage
pp.score(par, dat, u, np = 1)
Arguments
| par | vector of  | 
| dat | sample vector | 
| u | threshold | 
| np | number of periods of observations. This is a post hoc adjustment for the intensity so that the parameters of the model coincide with those of a generalized extreme value distribution with block size  | 
Value
score vector of NHPP
See Also
Weissman's quantile estimator
Description
Given a small probability of exceedance p,
the number of exceedances k out of n observation
above the threshold u (thresh) (corresponding typically to the (k+1)th order statistic, compute the tail quantile at level Q(1-p) using the estimator of Weissman (1978) under the assumption of Pareto tail (positive shape \xi), viz.
 Q(1-p) = u \left(\frac{k}{pn}\right)^{\xi}.
Usage
qweissman(p, k, n, thresh, shape)
Arguments
| p | tail probability, must be larger than the proportion of exceedances  | 
| k | vector of the number of exceedances above  | 
| n | integer, total sample size | 
| thresh | vector of thresholds | 
| shape | vector of positive shape parameters | 
Value
a vector of tail quantiles
References
Weissman, I. (1978). Estimation of Parameters and Larger Quantiles Based on the k Largest Observations. Journal of the American Statistical Association, 73(364), 812–815. <doi:10.2307/2286285>.
Examples
set.seed(2025)
p <- 1/100
xdat <- rgp(n = 1000, loc = 2, scale = 2, shape = 0.4)
hill <- shape.hill(xdat, k = seq(20L, 100L, by = 10L))
thresh <- sort(xdat, decreasing = TRUE)[hill$k+1]
qweissman(
   p = 1/100,
   k = hill$k,
   n = length(xdat),
   thresh = thresh,
   shape = hill$shape)
# Compare with true quantile
qgp(1/100, loc = 2, scale = 2, shape = 0.4, lower.tail = FALSE)
Random variate generation for Dirichlet distribution on S_{d}
Description
A function to sample Dirichlet random variables, based on the representation as ratios of Gamma. Note that the RNG will generate on the full simplex and the sum to one constraint is respected here
Usage
rdir(n, alpha, normalize = TRUE)
Arguments
| n | sample size | 
| alpha | vector of parameter | 
| normalize | boolean. If  | 
Value
sample of dimension d (size of alpha) from the Dirichlet distribution.
Examples
rdir(n=100, alpha=c(0.5,0.5,2),TRUE)
rdir(n=100, alpha=c(3,1,2),FALSE)
Simulation from generalized R-Pareto processes
Description
The generalized R-Pareto process is supported on (loc - scale / shape, Inf) if shape > 0,
or (-Inf, loc - scale / shape) for negative shape parameters, conditional on (X-r(loc))/r(scale)>0.
The standard Pareto process corresponds to scale = loc = rep(1, d).
Usage
rgparp(
  n,
  shape = 1,
  thresh = 1,
  risk = c("mean", "sum", "site", "max", "min", "l2"),
  siteindex = NULL,
  d,
  loc,
  scale,
  param,
  sigma,
  model = c("log", "neglog", "bilog", "negbilog", "hr", "br", "xstud", "smith",
    "schlather", "ct", "sdir", "dirmix"),
  weights,
  vario,
  coord = NULL,
  ...
)
Arguments
| n | number of observations | 
| shape | shape parameter of the generalized Pareto variable | 
| thresh | univariate threshold for the exceedances of risk functional | 
| risk | string indicating the risk functional. | 
| siteindex | integer between 1 and d specifying the index of the site or variable | 
| d | dimension of sample | 
| loc | location vector | 
| scale | scale vector | 
| param | parameter vector for the logistic, bilogistic, negative bilogistic and extremal Dirichlet (Coles and Tawn) model. Parameter matrix for the Dirichlet mixture. Degree of freedoms for extremal student model. See Details. | 
| sigma | covariance matrix for Brown-Resnick and extremal Student-t distributions. Symmetric matrix of squared  coefficients  | 
| model | for multivariate extreme value distributions, users can choose between 1-parameter logistic and negative logistic, asymmetric logistic and negative logistic, bilogistic, Husler-Reiss, extremal Dirichlet model (Coles and Tawn) or the Dirichlet mixture. Spatial models include the Brown-Resnick, Smith, Schlather and extremal Student max-stable processes. Max linear models are also supported | 
| weights | vector of length  | 
| vario | semivariogram function whose first argument must be distance. Used only if provided in conjunction with  | 
| coord | 
 | 
| ... | additional arguments for the  | 
Value
an n by d sample from the generalized R-Pareto process, with attributes
accept.rate if the procedure uses rejection sampling.
Examples
rgparp(n = 10, risk = 'site', siteindex = 2, d = 3, param = 2.5,
   model = 'log', scale = c(1, 2, 3), loc = c(2, 3, 4), shape = 0.5)
rgparp(n = 10, risk = 'max', d = 4, param = c(0.2, 0.1, 0.9, 0.5),
   scale = 1:4, loc = 1:4, model = 'bilog')
rgparp(n = 10, risk = 'sum', d = 3, param = c(0.8, 1.2, 0.6, -0.5),
   scale = 1:3, loc = 1:3, model = 'sdir')
vario <- function(x, scale = 0.5, alpha = 0.8){ scale*x^alpha }
grid.coord <- as.matrix(expand.grid(runif(4), runif(4)))
rgparp(n = 10, risk = 'max', vario = vario, coord = grid.coord,
   model = 'br', scale = runif(16), loc = rnorm(16))
Second order tail index estimator of Drees and Kaufmann
Description
Estimator of the second order regular variation parameter \rho \leq 0 parameter for heavy-tailed data proposed by Drees and Kaufmann (1998)
Usage
rho.dk(xdat, k, tau = 0.5)
Arguments
| xdat | vector of positive observations | 
| k | number of highest order statistics to use for estimation | 
| tau | tuning parameter  | 
References
Drees, H. and E. Kaufmann (1998). Selecting the optimal sample fraction in univariate extreme value estimation, Stochastic Processes and their Applications, 75(2), 149-172, <doi:10.1016/S0304-4149(98)00017-9>.
Second order tail index estimator of Fraga Alves et al.
Estimator of the second order regular variation parameter \rho \leq 0 parameter for heavy-tailed data proposed by Fraga Alves et al. (2003)
Description
Second order tail index estimator of Fraga Alves et al.
Estimator of the second order regular variation parameter \rho \leq 0 parameter for heavy-tailed data proposed by Fraga Alves et al. (2003)
Usage
rho.fagh(xdat, k, tau = 0)
Arguments
| xdat | vector of positive observations | 
| k | number of highest order statistics to use for estimation | 
| tau | scalar real tuning parameter. Default values is 0, which is typically chosen whenever  | 
References
Fraga Alves, M.I., Gomes, M. Ivette, and de Haan, Laurens (2003). A new class of semi-parametric estimators of the second order parameter. Portugaliae Mathematica. Nova Serie 60(2), 193-213. <http://eudml.org/doc/50867>.
Examples
# Example with rho = -0.2
n <- 1000
xdat <- mev::rgp(n = n, shape = 0.2)
kmin <- floor(n^0.995)
kmax <- ceiling(n^0.999)
rho_est <- rho.fagh(
   xdat = xdat,
   k = n - kmin:kmax)
rho_med <- mean(rho_est$rho)
Second order tail index estimator of Goegebeur et al. (2008)
Description
Estimator of the second order regular variation parameter \rho \leq 0 parameter for heavy-tailed data based on ratio of kernel goodness-of-fit statistics.
Usage
rho.gbw(xdat, k)
Arguments
| xdat | vector of positive observations | 
| k | number of highest order statistics to use for estimation | 
References
Goegebeur, Y., J. Beirlant and T. de Wet (2008). Linking Pareto-tail kernel goodness-of-fit statistics with tail index at optimal threshold and second order estimation. REVSTAT-Statistical Journal, 6(1), 51-69. <doi:10.57805/revstat.v6i1.57>
Second order tail index estimator of Gomes et al.
Description
Estimator of the second order regular variation parameter \rho \leq 0 parameter for heavy-tailed data proposed by Gomes et al. (2003)
Usage
rho.ghp(xdat, k, alpha = 2)
Arguments
| xdat | vector of positive observations | 
| k | number of highest order statistics to use for estimation | 
| alpha | positive scalar tuning parameter | 
References
Gomes, M.I., de Haan, L. & Peng, L. (2002). Semi-parametric Estimation of the Second Order Parameter in Statistics of Extremes. Extremes 5, 387–414. <doi:10.1023/A:1025128326588>
Distribution of the r-largest observations
Description
Likelihood, score function and information matrix for the r-largest observations likelihood.
Arguments
| par | vector of  | 
| dat | an  | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations for the expected information matrix. Default to  | 
| r | number of order statistics kept. Default to  | 
Usage
rlarg.ll(par, dat, u, np)
rlarg.score(par, dat)
rlarg.infomat(par, dat, method = c('obs', 'exp'), nobs = nrow(dat), r = ncol(dat))
Functions
-  rlarg.ll: log likelihood
-  rlarg.score: score vector
-  rlarg.infomat: observed or expected information matrix
Author(s)
Leo Belzile
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer, 209 p.
Smith, R.L. (1986).  Extreme value theory based on the r largest annual events, Journal of Hydrology, 86(1-2), 27–43, http://dx.doi.org/10.1016/0022-1694(86)90004-1.
Information matrix for the r-largest observations.
Description
The function returns the expected or observed information matrix.
Usage
rlarg.infomat(
  par,
  dat,
  method = c("obs", "exp"),
  nobs = nrow(dat),
  r = ncol(dat)
)
Arguments
| par | vector of  | 
| dat | an  | 
| method | string indicating whether to use the expected  ( | 
| nobs | number of observations for the expected information matrix. Default to  | 
| r | number of order statistics kept. Default to  | 
See Also
Log-likelihood of the point process of r-largest observations
Description
The function returns the log-likelihood for an n by r matrix of observations.
Usage
rlarg.ll(par, dat)
Arguments
| par | vector of  | 
| dat | an  | 
See Also
Score of the r-largest observations
Description
The score is computed via linear interpolation for the shape parameter in a neighborhood of zero
Usage
rlarg.score(par, dat)
Arguments
| par | vector of  | 
| dat | an  | 
Value
score vector of size 3
See Also
Exact simulations of multivariate extreme value distributions
Description
Implementation of the random number generators for multivariate extreme-value distributions and max-stable processes based on the two algorithms described in Dombry, Engelke and Oesting (2016).
Usage
rmev(
  n,
  d,
  param,
  asy,
  sigma,
  model = c("log", "alog", "neglog", "aneglog", "bilog", "negbilog", "hr", "br", "xstud",
    "smith", "schlather", "ct", "sdir", "dirmix", "pairbeta", "pairexp", "wdirbs",
    "wexpbs", "maxlin"),
  alg = c("ef", "sm"),
  weights = NULL,
  vario = NULL,
  coord = NULL,
  grid = FALSE,
  dist = NULL,
  ...
)
Arguments
| n | number of observations | 
| d | dimension of sample | 
| param | parameter vector for the logistic, bilogistic, negative bilogistic and extremal Dirichlet (Coles and Tawn) model. Parameter matrix for the Dirichlet mixture. Degree of freedoms for extremal student model. See Details. | 
| asy | list of asymmetry parameters, as in function  | 
| sigma | covariance matrix for Brown-Resnick and extremal Student-t distributions. Symmetric matrix of squared  coefficients  | 
| model | for multivariate extreme value distributions, users can choose between 1-parameter logistic and negative logistic, asymmetric logistic and negative logistic, bilogistic, Husler-Reiss, extremal Dirichlet model (Coles and Tawn) or the Dirichlet mixture. Spatial models include the Brown-Resnick, Smith, Schlather and extremal Student max-stable processes. Max linear models are also supported | 
| alg | algorithm, either simulation via extremal function ( | 
| weights | vector of length  | 
| vario | semivariogram function whose first argument must be distance. Used only if provided in conjunction with  | 
| coord | 
 | 
| grid | Logical.  | 
| dist | symmetric matrix of pairwise distances. Default to  | 
| ... | additional arguments for the  | 
Details
The vector param differs depending on the model
-  log: one dimensional parameter greater than 1
-  alog:2^d-d-1dimensional parameter fordep. Values are recycled if needed.
-  neglog: one dimensional positive parameter
-  aneglog:2^d-d-1dimensional parameter fordep. Values are recycled if needed.
-  bilog:d-dimensional vector of parameters in[0,1]
-  negbilog:d-dimensional vector of negative parameters
-  ct, dir, negdir, sdir:d-dimensional vector of positive (a)symmetry parameters. Fordirandnegdir, ad+1vector consisting of thedDirichlet parameters and the last entry is an index of regular variation in(-\min(\alpha_1, \ldots, \alpha_d), 1]treated as shape parameter
-  xstud: one dimensional parameter corresponding to degrees of freedomalpha
-  dirmix:dbym-dimensional matrix of positive (a)symmetry parameters
-  pairbeta, pairexp:d(d-1)/2+1vector of parameters, containing the concentration parameter and the coefficients of the pairwise beta, in lexicographical order e.g.,\beta_{12}, \beta_{13}, \ldots
-  wdirbs, wexpbs:2dvector ofdconcentration parameters followed by thedDirichlet parameters
Stephenson points out that the multivariate asymmetric negative logistic model given in e.g. Coles and Tawn (1991) is not a valid distribution function in dimension d>3 unless additional constraints are imposed on the parameter values.
The implementation in mev uses the same construction as the asymmetric logistic distribution (see the vignette). As such it does not match the bivariate implementation of rbvevd.
The dependence parameter of the evd package for the Husler-Reiss distribution can be recovered taking
for the Brown–Resnick model  2/r=\sqrt(2\gamma(h)) where h is the lag vector between sites and r=1/\lambda for the Husler–Reiss.
Value
an n by d exact sample from the corresponding multivariate extreme value model
Warning
As of version 1.8 (August 16, 2016), there is a distinction between models hr and br. The latter is meant to be used in conjunction with variograms. The parametrization differs between the two models.
The family of scaled Dirichlet is now parametrized by a parameter in -\min(\alpha) appended to the the d vector param containing the parameter alpha
of the Dirichlet model. Arguments model='dir' and model='negdir' are still supported internally, but not listed in the options.
Author(s)
Leo Belzile
References
Dombry, Engelke and Oesting (2016). Exact simulation of max-stable processes, Biometrika, 103(2), 303–317.
See Also
Examples
set.seed(1)
rmev(n=100, d=3, param=2.5, model='log', alg='ef')
rmev(n=100, d=4, param=c(0.2,0.1,0.9,0.5), model='bilog', alg='sm')
## Spatial example using power variogram
#NEW: Semi-variogram must take distance as argument
semivario <- function(x, scale, alpha){ scale*x^alpha }
#grid specification
grid.coord <- as.matrix(expand.grid(runif(4), runif(4)))
rmev(n=100, vario=semivario, coord=grid.coord, model='br', scale = 0.5, alpha = 1)
#using the Brown-Resnick model with a covariance matrix
vario2cov <- function(coord, semivario,...){
 sapply(1:nrow(coord), function(i) sapply(1:nrow(coord), function(j)
  semivario(sqrt(sum((coord[i,])^2)), ...) +
  semivario(sqrt(sum((coord[j,])^2)), ...) -
  semivario(sqrt(sum((coord[i,]-coord[j,])^2)), ...)))
}
rmev(n=100, sigma=vario2cov(grid.coord, semivario = semivario, scale = 0.5, alpha = 1), model='br')
# asymmetric logistic model - see function 'rmvevd' from package 'evd '
asy <- list(0, 0, 0, 0, c(0,0), c(0,0), c(0,0), c(0,0), c(0,0), c(0,0),
  c(.2,.1,.2), c(.1,.1,.2), c(.3,.4,.1), c(.2,.2,.2), c(.4,.6,.2,.5))
rmev(n=1, d=4, param=0.3, asy=asy, model="alog")
#Example with a grid (generating an array)
rmev(n=10, sigma=cbind(c(2,1), c(1,3)), coord=cbind(runif(4), runif(4)), model='smith', grid=TRUE)
## Example with Dirichlet mixture
alpha.mat <- cbind(c(2,1,1),c(1,2,1),c(1,1,2))
rmev(n=100, param=alpha.mat, weights=rep(1/3,3), model='dirmix')
rmev(n=10, param=c(0.1,1,2,3), d=3, model='pairbeta')
Random samples from spectral distributions of multivariate extreme value models.
Description
Generate from Q_i, the spectral measure of a given multivariate extreme value model based on the L1 norm.
Usage
rmevspec(
  n,
  d,
  param,
  sigma,
  model = c("log", "neglog", "bilog", "negbilog", "hr", "br", "xstud", "smith",
    "schlather", "ct", "sdir", "dirmix", "pairbeta", "pairexp", "wdirbs", "wexpbs",
    "maxlin"),
  weights = NULL,
  vario = NULL,
  coord = NULL,
  grid = FALSE,
  dist = NULL,
  ...
)
Arguments
| n | number of observations | 
| d | dimension of sample | 
| param | parameter vector for the logistic, bilogistic, negative bilogistic and extremal Dirichlet (Coles and Tawn) model. Parameter matrix for the Dirichlet mixture. Degree of freedoms for extremal student model. See Details. | 
| sigma | covariance matrix for Brown-Resnick and extremal Student-t distributions. Symmetric matrix of squared  coefficients  | 
| model | for multivariate extreme value distributions, users can choose between 1-parameter logistic and negative logistic, asymmetric logistic and negative logistic, bilogistic, Husler-Reiss, extremal Dirichlet model (Coles and Tawn) or the Dirichlet mixture. Spatial models include the Brown-Resnick, Smith, Schlather and extremal Student max-stable processes. Max linear models are also supported | 
| weights | vector of length  | 
| vario | semivariogram function whose first argument must be distance. Used only if provided in conjunction with  | 
| coord | 
 | 
| grid | Logical.  | 
| dist | symmetric matrix of pairwise distances. Default to  | 
| ... | additional arguments for the  | 
Details
The vector param differs depending on the model
-  log: one dimensional parameter greater than 1
-  neglog: one dimensional positive parameter
-  bilog:d-dimensional vector of parameters in[0,1]
-  negbilog:d-dimensional vector of negative parameters
-  ct,dir,negdir:d-dimensional vector of positive (a)symmetry parameters. Alternatively, ad+1vector consisting of thedDirichlet parameters and the last entry is an index of regular variation in(0, 1]treated as scale
-  xstud: one dimensional parameter corresponding to degrees of freedomalpha
-  dirmix:dbym-dimensional matrix of positive (a)symmetry parameters
-  pairbeta, pairexp:d(d-1)/2+1vector of parameters, containing the concentration parameter and the coefficients of the pairwise beta, in lexicographical order e.g.,\beta_{1,2}, \beta_{1,3}, \ldots
-  wdirbs, wexpbs:2dvector ofdconcentration parameters followed by thedDirichlet parameters
Value
an n by d exact sample from the corresponding multivariate extreme value model
Note
This functionality can be useful to generate for example Pareto processes with marginal exceedances.
Author(s)
Leo Belzile
References
Dombry, Engelke and Oesting (2016). Exact simulation of max-stable processes, Biometrika, 103(2), 303–317.
Boldi (2009). A note on the representation of parametric models for multivariate extremes. Extremes 12, 211–218.
Examples
set.seed(1)
rmevspec(n=100, d=3, param=2.5, model='log')
rmevspec(n=100, d=3, param=2.5, model='neglog')
rmevspec(n=100, d=4, param=c(0.2,0.1,0.9,0.5), model='bilog')
rmevspec(n=100, d=2, param=c(0.8,1.2), model='ct') #Dirichlet model
rmevspec(n=100, d=2, param=c(0.8,1.2,0.5), model='sdir') #with additional scale parameter
#Variogram gamma(h) = scale*||h||^alpha
#NEW: Variogram must take distance as argument
vario <- function(x, scale=0.5, alpha=0.8){ scale*x^alpha }
#grid specification
grid.coord <- as.matrix(expand.grid(runif(4), runif(4)))
rmevspec(n=100, vario=vario,coord=grid.coord, model='br')
## Example with Dirichlet mixture
alpha.mat <- cbind(c(2,1,1),c(1,2,1),c(1,1,2))
rmevspec(n=100, param=alpha.mat, weights=rep(1/3,3), model='dirmix')
Multivariate Normal distribution sampler
Description
Sampler derived using the eigendecomposition of the covariance
matrix Sigma. The function uses the Armadillo random normal generator
Usage
rmnorm(n, mu, Sigma)
Arguments
| n | sample size | 
| mu | mean vector. Will set the dimension | 
| Sigma | a square covariance matrix, of same dimension as  | 
Value
an n sample from a multivariate Normal distribution
Examples
rmnorm(n = 10, mu = c(0,2), Sigma = diag(2))
Simulation from R-Pareto processes
Description
Simulation from R-Pareto processes
Usage
rparp(
  n,
  shape = 1,
  risk = c("sum", "site", "max", "min", "l2"),
  siteindex = NULL,
  d,
  param,
  sigma,
  model = c("log", "neglog", "bilog", "negbilog", "hr", "br", "xstud", "smith",
    "schlather", "ct", "sdir", "dirmix"),
  weights,
  vario,
  coord = NULL,
  ...
)
Arguments
| n | number of observations | 
| shape | shape tail index of Pareto variable | 
| risk | string indicating risk functional. | 
| siteindex | integer between 1 and d specifying the index of the site or variable | 
| d | dimension of sample | 
| param | parameter vector for the logistic, bilogistic, negative bilogistic and extremal Dirichlet (Coles and Tawn) model. Parameter matrix for the Dirichlet mixture. Degree of freedoms for extremal student model. See Details. | 
| sigma | covariance matrix for Brown-Resnick and extremal Student-t distributions. Symmetric matrix of squared  coefficients  | 
| model | for multivariate extreme value distributions, users can choose between 1-parameter logistic and negative logistic, asymmetric logistic and negative logistic, bilogistic, Husler-Reiss, extremal Dirichlet model (Coles and Tawn) or the Dirichlet mixture. Spatial models include the Brown-Resnick, Smith, Schlather and extremal Student max-stable processes. Max linear models are also supported | 
| weights | vector of length  | 
| vario | semivariogram function whose first argument must be distance. Used only if provided in conjunction with  | 
| coord | 
 | 
| ... | additional arguments for the  | 
Details
For riskf=max and riskf=min, the procedure uses rejection sampling based on Pareto variates
sampled from sum and may be slow if d is large.
Value
an n by d sample from the R-Pareto process, with attributes
accept.rate if the procedure uses rejection sampling.
Examples
rparp(n=10, risk = 'site', siteindex=2, d=3, param=2.5, model='log')
rparp(n=10, risk = 'min', d=3, param=2.5, model='neglog')
rparp(n=10, risk = 'max', d=4, param=c(0.2,0.1,0.9,0.5), model='bilog')
rparp(n=10, risk = 'sum', d=3, param=c(0.8,1.2,0.6, -0.5), model='sdir')
vario <- function(x, scale=0.5, alpha=0.8){ scale*x^alpha }
grid.coord <- as.matrix(expand.grid(runif(4), runif(4)))
rparp(n=10, risk = 'max', vario=vario, coord=grid.coord, model='br')
Simulation from Pareto processes using composition sampling
Description
The algorithm performs forward sampling by simulating first from a
mixture, then sample angles conditional on them being less than (max) or greater than (min) one.
The resulting sample from the angular distribution is then multiplied by
Pareto variates with tail index shape.
Usage
rparpcs(
  n,
  model = c("log", "neglog", "br", "xstud"),
  risk = c("max", "min"),
  param = NULL,
  d,
  Lambda = NULL,
  Sigma = NULL,
  df = NULL,
  shape = 1,
  ...
)
Arguments
| n | sample size. | 
| model | string indicating the model family. | 
| risk | string indicating the risk functional. Only  | 
| param | parameter value for the logistic or negative logistic model | 
| d | dimension of the multivariate model, only needed for logistic or negative logistic models | 
| Lambda | parameter matrix for the Brown–Resnick model. See Details. | 
| Sigma | correlation matrix if  | 
| df | degrees of freedom for extremal Student process. | 
| shape | tail index of the Pareto variates (reciprocal shape parameter). Must be strictly positive. | 
| ... | additional parameters, currently ignored | 
Details
For the moment, only exchangeable models and models based n elliptical processes are handled.
The parametrization of the Brown–Resnick is in terms of the matrix Lambda, which
is formed by evaluating the semivariogram \gamma at sites s_i, s_j, meaning that
\Lambda_{i,j} = \gamma(s_i, s_j)/2.
The argument Sigma is ignored for the Brown-Resnick model
if Lambda is provided by the user.
Value
an n by d matrix of samples, where d = ncol(Sigma), with attributes mixt.weights.
Author(s)
Leo Belzile
See Also
rparp for general simulation of Pareto processes based on an accept-reject algorithm.
Examples
## Not run: 
#Brown-Resnick, Wadsworth and Tawn (2014) parametrization
D <- 20L
coord <- cbind(runif(D), runif(D))
semivario <- function(d, alpha = 1.5, lambda = 1){0.5 * (d/lambda)^alpha}
Lambda <- semivario(as.matrix(dist(coord))) / 2
rparpcs(n = 10, Lambda = Lambda, model = 'br', shape = 0.1)
#Extremal Student
Sigma <- stats::rWishart(n = 1, df = 20, Sigma = diag(10))[,,1]
rparpcs(n = 10, Sigma = cov2cor(Sigma), df = 3, model = 'xstud')
## End(Not run)
Simulation of generalized Huesler-Reiss Pareto vectors via composition sampling
Description
Sample from the generalized Pareto process associated to Huesler-Reiss spectral profiles.
For the Huesler-Reiss Pareto vectors, the matrix Sigma is utilized to build Q viz.
Q = \Sigma^{-1} - \frac{\Sigma^{-1}\mathbf{1}_d\mathbf{1}_d^\top\Sigma^{-1}}{\mathbf{1}_d^\top\Sigma^{-1}\mathbf{1}_d}.
The location vector m and Sigma are the parameters of the underlying log-Gaussian process.
Usage
rparpcshr(n, u, alpha, Sigma, m)
Arguments
| n | sample size | 
| u | vector of marginal location parameters (must be strictly positive) | 
| alpha | vector of shape parameters (must be strictly positive). | 
| Sigma | covariance matrix of process, used to define  | 
| m | location vector of Gaussian distribution. | 
Value
n by d matrix of observations
References
Ho, Z. W. O and C. Dombry (2019), Simple models for multivariate regular variations and the Huesler-Reiss Pareto distribution, Journal of Multivariate Analysis (173), p. 525-550, doi:10.1016/j.jmva.2019.04.008
Examples
D <- 20L
coord <- cbind(runif(D), runif(D))
di <- as.matrix(dist(rbind(c(0, ncol(coord)), coord)))
semivario <- function(d, alpha = 1.5, lambda = 1){(d/lambda)^alpha}
Vmat <- semivario(di)
Sigma <- outer(Vmat[-1, 1], Vmat[1, -1], '+') - Vmat[-1, -1]
m <- Vmat[-1,1]
## Not run: 
samp <- rparpcshr(n = 100, u = c(rep(1, 10), rep(2, 10)),
          alpha = seq(0.1, 1, length = 20), Sigma = Sigma, m = m)
## End(Not run)
Simulate r-largest observations from point process of extremes
Description
Simulate the r-largest observations from a Poisson point process with intensity
\Lambda(x) = (1+\xi(x-\mu)/\sigma)^{-1/\xi}
.
Usage
rrlarg(n, r, loc = 0, scale = 1, shape = 0)
Arguments
| n | sample size | 
| r | number of observations per block | 
| loc | location parameter | 
| scale | scale parameter | 
| shape | shape parameter | 
Value
an n by r matrix of samples from the point process, ordered from largest to smallest in each row.
Variogram model of Schlather and Moreva
Description
The variogram model is
\gamma(h) = \frac{[1+\{(\|h\|/\lambda\}^\alpha]^{\beta/\alpha}-1}{2^{\beta/\alpha}-1}, \quad 0 < \alpha \leq 2, \beta \leq 2.
The model is defined at \beta=0 by continuity.
Usage
schlather.vario(h, alpha, beta, scale = 1)
Arguments
| h | vector or matrix of pairwise distances | 
| alpha | smoothness parameter | 
| beta | shape parameter, must be less than 2 | 
| scale | scale parameter | 
Value
a vector or matrix of variogram values of the same length as h
Ramos and Ledford test of independence
Description
The Ramos and Ledford (2005) score test of independence is a modification of tests by Tawn (1988) and Ledford and Tawn (1996) for a logistic model parameter \alpha=1; the latter two have scores with zero expectation, but the variance of the score are infinite, which produces non-regularity and yield test, once suitably normalized, that converge slowly to their asymptotic null distribution. The test, designed for bivariate samples, transforms observations to have unit Frechet margins and considers a bivariate censored likelihood approach for the logistic distribution.
Usage
scoreindep(xdat, p, test = c("ledford", "tawn"))
Arguments
| xdat | a  | 
| p | probability level for the marginal threshold | 
| test | string; if  | 
Value
a list with elements
- stat
- value of the score test statistic 
- pval
- asymptotic p-value 
- test
- testargument
Exponential regression estimator
Description
This function implements the exponential regression estimator of the shape parameter for the case of Pareto tails with positive shape index.
Usage
shape.erm(xdat, k, method = c("bdgm", "fh"), bounds = NULL)
Arguments
| xdat | vector of observations | 
| k | vector of integer, the number of largest observations to consider | 
| method | string; one of  | 
| bounds | vector of length 2 giving the bounds for  | 
Details
The second-order parameter is difficult to pin down, and while values within [-1,0) are most logical under Hall model, the model parameters become unidentifiable when \rho \to 0. The default constraint restrict -5 <\rho < -0.5, with the upper bound changed to -0.25 for sample of sizes larger than 5000 observations. Users can set the value of the bounds for \rho via argument bounds. The optimization is initialized at the Hill estimator.
Value
a data frame with columns
-  knumber of exceedances
-  shapeestimate of the shape parameter
-  rhoestimate of the second-order regular variation index
-  scaleestimate of the scale parameter
References
Feuerverger, A. and P. Hall (1999), Estimating a tail exponent by modelling departure from a Pareto distribution, The Annals of Statistics 27(2), 760-781. <doi:10.1214/aos/1018031215>
Beirlant, J., Dierckx, G., Goegebeur, Y. G. Matthys (1999). Tail Index Estimation and an Exponential Regression Model. Extremes, 2, 177–200 (1999). <doi:10.1023/A:1009975020370>
Generalized jackknife shape estimator
Description
Generalized jackknife shape estimator
Usage
shape.genjack(xdat, k)
Arguments
| xdat | vector of positive observations | 
| k | vector of order statistics; if missing, a vector going from 10 to sample size minus one. | 
Value
a data frame with the number of order statistics k and the shape parameter estimate shape, or a single numeric value if k is a scalar integer.
References
Gomes, I.M., João Martins, M. and Neves, M. (2000) Alternatives to a Semi-Parametric Estimator of Parameters of Rare Events-The Jackknife Methodology. Extremes, 3, 207–229. doi:10.1023/A:1011470010228
Beirlant et al. generalized quantile shape estimator
Description
This estimator estimates the real shape parameter based on generalized quantile plots based on mean excess functions, generalized median excesses or trimmed mean excesses.
Usage
shape.genquant(
  xdat,
  k,
  type = c("genmean", "genmed", "trimmean"),
  weight,
  p = 0.5
)
Arguments
| xdat | 
 | 
| k | number of upper order statistics | 
| type | string indicating the estimator choice, one of  | 
| weight | weight a kernel function on  | 
| p | number between zero and one giving the proportion of order statistics for the second threshold | 
References
Beirlant, J., Vynckier P. and J.L. Teugels (1996). Excess functions and estimation of the extreme-value index. Bernoulli, 2(4), 293-318.
Hill's estimator of the shape parameter
Description
Given a sample of positive observations, calculate the tail index or shape parameter. The shape estimate returned is positive.
Usage
shape.hill(xdat, k)
Arguments
| xdat | vector of positive observations | 
| k | vector of order statistics; if missing, a vector going from 10 to sample size minus one. | 
Value
a data frame with the number of order statistics k and the shape parameter estimate shape, or a single numeric value if k is a scalar integer.
References
Hill, B.M. (1975). A simple general approach to inference about the tail of a distribution. Annals of Statistics, 3, 1163-1173.
Examples
xdat <- mev::rgp(n = 200, loc = 1, scale = 0.5, shape = 0.5)
shape.hill(xdat)
Lower-trimmed Hill estimator for the shape parameter
Description
Given a sample of Pareto-tailed samples (positive tail index),
compute the lower-trimmed Hill estimator. If k0=k, the estimator reduces to Hill's estimator for the shape index
Usage
shape.lthill(xdat, k, k0 = k, sorted = FALSE, ...)
Arguments
| xdat | [numeric] vector of positive observations | 
| k | [integer] number of order statistics for the threshold | 
| k0 | [integer] vector of number of largest order statistics, no greater than  | 
| sorted | [logical] if  | 
| ... | additional arguments for other routines (notably  | 
Value
a scalar with the shape parameter estimate if k0 is a scalar, otherwise a data frame with columns k0 for the number of exceedances and shape for the tail index.
References
Bladt, M., Albrecher, H. & Beirlant, J. (2020) Threshold selection and trimming in extremes. Extremes, 23, 629-665 . doi:10.1007/s10687-020-00385-0
Examples
# Pareto sample
n <- 200
xdat <- 10/(1 - runif(n)) - 10
shape.lthill(xdat = xdat, k = 100, k0 = 5:100)
Dekkers and de Haan moment estimator for the shape
Description
Given a sample of exceedances, compute the moment estimator of the real shape parameter.
Usage
shape.moment(xdat, k)
Arguments
| xdat | vector of positive observations of length  | 
| k | number of largest order statistics | 
Value
a data frame with the number of order statistics k and the shape parameter estimate shape, or a single numeric value if k is a scalar.
References
Dekkers, A.L.M. and de Haan, L. (1989). On the estimation of the extreme-value index and large quantile estimation., Annals of Statistics, 17, 1795-1833.
Extreme U-statistic Pickands estimator
Description
Given a random sample of n exceedances, the estimator
returns an estimator of the shape parameter or extreme
value index using a kernel of order 3, based on
k largest exceedances of xdat. Note that the method does not allow for ties.
Usage
shape.osz(xdat, k, ...)
Arguments
| xdat | vector of observations of length  | 
| k | number of largest order statistics  | 
| ... | additional arguments for backward compatibility | 
Details
The calculations are based on the recursions provided in Lemma 4.3 of Oorschot et al.
References
Oorschot, J, J. Segers and C. Zhou (2023), Tail inference using extreme U-statistics, Electronic Journal of Statistics 17(1): 1113-1159. doi:10.1214/23-EJS2129
Examples
xdat <- rgp(n = 1000, shape = 0.2)
shape.osz(xdat, k = 10)
Pickand's shape estimator
Description
Given a sample of size n of positive exceedances, compute
the real shape parameter \xi based on the k largest order statistics.
Usage
shape.pickands(xdat, k)
Arguments
| xdat | vector of positive observations of length  | 
| k | number of largest order statistics | 
Value
a data frame with the number of order statistics k and the shape parameter estimate shape, or a single numeric value if k is a scalar.
References
Pickands, III, J. (1975). Statistical inference using extreme order statistics. Annals of Statistics, 3, 119-131.
Random block maxima shape estimator of Wager
Description
Computes the shape estimator for varying k up to sample size of maximum kmax largest observations
Usage
shape.rbm(xdat, k = 10:floor(length(xdat)/2), ...)
Arguments
| xdat | [vector] sample exceedances | 
| k | [int] vector of integers for which to compute the estimator | 
| ... | additional parameters, currently ignored | 
Value
a list with elements
- k
- number of exceedances 
- shape
- tail index estimate, strictly positive 
- risk
- empirical Bayes estimate of risk 
- thresh
- threshold given by the smallest order statistic considered in the sample 
References
Wager, S. (2014). Subsampling extremes: From block maxima to smooth tail estimation, Journal of Multivariate Analysis, 130, 335-353, doi:10.1016/j.jmva.2014.06.010
Trimmed Hill estimator for the shape parameter
Description
Given a sample of Pareto-tailed samples (positive tail index),
compute the trimmed Hill estimator. If k0=k, the estimator
reduces to Hill's estimator for the shape index
Usage
shape.trimhill(xdat, k, k0, sorted = FALSE)
Arguments
| xdat | [numeric] vector of positive observations | 
| k | [integer] number of order statistics for the threshold | 
| k0 | [integer] number of largest order statistics, strictly less than  | 
| sorted | [logical] if  | 
Value
a scalar with the shape parameter estimate
References
Bhattacharya, S., Kallitsis, M. and S. Stoev, (2019) Data-adaptive trimming of the Hill estimator and detection of outliers in the extremes of heavy-tailed data. Electronic Journal of Statistics 13, 1872–1925
de Vries shape estimator
Description
Given a sample of exceedances, compute the moment estimator of the positive shape parameter using the ratio of log ratio of exceedance and it's square.
Usage
shape.vries(xdat, k)
Arguments
| xdat | vector of positive observations of length  | 
| k | number of largest order statistics | 
Value
a data frame with the number of order statistics k and the shape parameter estimate shape, or a single numeric value if k is a scalar.
References
de Haan, L. and Peng, L. (1998). Comparison of tail index estimators, Statistica Neerlandica 52, 60-70.
de Haan, L. and Peng, L. (1998). Comparison of tail index estimators. Statistica Neerlandica, 52: 60-70. doi:10.1111/1467-9574.00068
Smith's penultimate approximations
Description
The function takes as arguments the distribution and density functions. There are two options:
method='bm' yields block maxima and method='pot' threshold exceedances.
For method='bm', the user should provide in such case the block sizes via the
argument m, whereas if method='pot', a vector of threshold values should be
provided. The other argument (u or m depending on the method) is ignored.
Usage
smith.penult(
  family,
  method = c("bm", "pot"),
  u = NULL,
  qu = NULL,
  m = NULL,
  returnList = TRUE,
  ...
)
Arguments
| family | the name of the parametric family. Will be used to obtain  | 
| method | either block maxima ( | 
| u | vector of thresholds for method  | 
| qu | vector of quantiles for method  | 
| m | vector of block sizes for method  | 
| returnList | logical; should the arguments be returned as a list or as a matrix of parameter | 
| ... | additional arguments passed to  | 
Details
Alternatively, the user can provide functions densF, quantF and distF for the density,
quantile function and distribution functions, respectively. The user can also supply the derivative
of the density function, ddensF. If the latter is missing, it will be approximated using finite-differences.
For method = "pot", the function computes the reciprocal hazard and its derivative on the log scale to avoid numerical overflow. Thus, the density function should have argument log and the distribution function arguments log.p and lower.tail, respectively.
Value
either a vector, a matrix if either length(m)>1 or length(u)>1 or a list (if returnList) containing
-  loc: location parameters (method='bm')
-  scale: scale parameters
-  shape: shape parameters
-  u: thresholds (ifmethod='pot'), percentile corresponding to threshold (ifmethod='pot')
-  m: block sizes (ifmethod='bm')
Author(s)
Leo Belzile
References
Smith, R.L. (1987). Approximations in extreme value theory. Technical report 205, Center for Stochastic Process, University of North Carolina, 1–34.
Examples
#Threshold exceedance for Normal variables
qu <- seq(1,5,by=0.02)
penult <- smith.penult(family = "norm", ddensF=function(x){-x*dnorm(x)},
   method = 'pot', u = qu)
plot(qu, penult$shape, type='l', xlab='Quantile',
   ylab='Penultimate shape', ylim=c(-0.5,0))
#Block maxima for Gamma variables -
#User must provide arguments for shape (or rate)
m <- seq(30, 3650, by=30)
penult <- smith.penult(family = 'gamma', method = 'bm', m=m, shape=0.1)
plot(m, penult$shape, type='l', xlab='Quantile', ylab='Penultimate shape')
#Comparing density of GEV approximation with true density of maxima
m <- 100 #block of size 100
p <- smith.penult(family='norm',
   ddensF=function(x){-x*dnorm(x)}, method='bm', m=m, returnList=FALSE)
x <- seq(1, 5, by = 0.01)
plot(x, m*dnorm(x)*exp((m-1)*pnorm(x,log.p=TRUE)),type='l', ylab='Density',
main='Distribution of the maxima of\n 100 standard normal variates')
lines(x, mev::dgev(x,loc=p[1], scale=p[2], shape=0),col=2)
lines(x, mev::dgev(x,loc=p[1], scale=p[2], shape=p[3]),col=3)
legend(x = 'topright',lty = c(1,1,1,1), col = c(1,2,3,4),
   legend = c('exact', 'ultimate', 'penultimate'), bty = 'n')
Smith's third penultimate approximation
Description
This function returns the density and distribution functions
of the 3rd penultimate approximation for extremes of Smith (1987). It requires
knowledge of the exact constants \epsilon and \rho described in the paper.
Usage
smith.penult.fn(
  loc,
  scale,
  shape,
  eps,
  rho = NULL,
  method = c("bm", "pot"),
  mdaGumbel = FALSE,
  ...
)
Arguments
| loc | location parameter returned by  | 
| scale | scale parameter returned by  | 
| shape | shape parameter returned by  | 
| eps | parameter vector, see Details. | 
| rho | second-order parameter, model dependent | 
| method | one of  | 
| mdaGumbel | logical indicating whether the function  | 
| ... | additional parameters, currently ignored. These are used for backward compatibility due to a change in the names of the arguments. | 
Details
Let F, f denote respectively the distribution and density functions and define the function \phi(x)  as
\phi(x)=-\frac{F(x)\log F(x)}{f(x)}
for block maxima.
The sequence loc corresponds to b_n otherwise, defined as the solution of F(b_n)=\exp(-1/n).
The scale is given by a_n=\phi(b_n), the shape as \gamma_n=\phi'(b_n). These are returned by a call to smith.penult.
For threshold exceedances, b_n is replaced by the sequence of thresholds u and we
take instead \phi(x) to be the reciprocal hazard function \phi(x)=(1-F(x))/f(x).
In cases where the distribution function is in the maximum domain of
attraction of the Gumbel distribution, \rho is possibly undetermined and
\epsilon can be equal to \phi(b_n)\phi''(b_n).
For distributions in the maximum domain of
attraction of the Gumbel distribution and that are class N, it is also possible to abstract from the \rho parameter by substituting the function H_{\rho} by x^3/6 without affecting the rate of convergence. This can be done by setting mdaGumbel=TRUE in the function call.
Warning
The third penultimate approximation does not yield a valid distribution function over the whole range of the original distribution, but is rather valid in a neighborhood of the true support of the distribution of maxima/threshold exceedance.
The function handles the most standard failure (decreasing distribution function and negative densities), but any oscillatory behaviour will not necessarily be captured.
This is inherent to the method and can be resolved by ‘not’ evaluating the functions F and f at the faulty points.
References
Smith, R.L. (1987). Approximations in extreme value theory. Technical report 205, Center for Stochastic Process, University of North Carolina, 1–34.
Examples
#Normal maxima example from Smith (1987)
m <- 100 #block of size 100
p <- smith.penult(family='norm',
   ddensF=function(x){-x*dnorm(x)}, method='bm', m=m, returnList=FALSE)
approx <- smith.penult.fn(loc=p[1], scale=p[2], shape=p[3],
   eps=p[3]^2+p[3]+p[2]^2, mdaGumbel=TRUE, method='bm')
x <- seq(0.5,6,by=0.001)
#First penultimate approximation
plot(x, exp(m*pnorm(x, log.p=TRUE)),type='l', ylab='CDF',
main='Distribution of the maxima of\n 100 standard normal variates')
lines(x, mev::pgev(x,loc=p[1], scale=p[2], shape=0),col=2)
lines(x, mev::pgev(x,loc=p[1], scale=p[2], shape=p[3]),col=3)
lines(x, approx$F(x),col=4)
legend(x='bottomright',lty=c(1,1,1,1),col=c(1,2,3,4),
   legend=c('Exact','1st approx.','2nd approx.','3rd approx'),bty='n')
plot(x, m*dnorm(x)*exp((m-1)*pnorm(x,log.p=TRUE)),type='l', ylab='Density',
main='Distribution of the maxima of\n 100 standard normal variates')
lines(x, mev::dgev(x,loc=p[1], scale=p[2], shape=0),col=2)
lines(x, mev::dgev(x,loc=p[1], scale=p[2], shape=p[3]),col=3)
lines(x, approx$f(x),col=4)
legend(x='topright',lty=c(1,1,1,1),col=c(1,2,3,4),
 legend=c('Exact','1st approx.','2nd approx.','3rd approx'),bty='n')
#Threshold exceedances
par <- smith.penult(family = "norm", ddensF=function(x){-x*dnorm(x)},
method='pot', u=4, returnList=FALSE)
approx <- smith.penult.fn(loc=par[1], scale=par[2], shape=par[3],
 eps=par[3]^2+par[3]+par[2]^2, mdaGumbel=TRUE, method='pot')
x <- seq(4.01,7,by=0.01)
#Distribution function
plot(x, 1-(1-pnorm(x))/(1-pnorm(par[1])),type='l', ylab='Conditional CDF',
main='Exceedances over 4\n for standard normal variates')
lines(x, mev::pgp(x, loc=par[1], scale=par[2], shape=0),col=2)
lines(x, mev::pgp(x, loc=par[1], scale=par[2], shape=par[3]),col=3)
lines(x, approx$F(x),col=4)
#Density
plot(x, dnorm(x)/(1-pnorm(par[1])),type='l', ylab='Conditional density',
main='Exceedances over 4\n for standard normal variates')
lines(x, dgp(x, loc=par[1], scale=par[2], shape=0),col=2)
lines(x, dgp(x, loc=par[1], scale=par[2], shape=par[3]),col=3)
lines(x, approx$f(x),col=4)
Spline correction for Fraser-Reid approximations
Description
The tangent exponential model can be numerically unstable for values close to r = 0.
This function corrects these incorrect values, which are interpolated using splines.
The function takes as input an object of class fr and returns the same object with
different rstar values.
Usage
spline.corr(fr, method = c("cobs", "smooth.spline"))
Arguments
| fr | an object of class  | 
| method | string for the method, either  | 
Details
If available, the function uses cobs from the eponym package. The latter handles constraints and smoothness penalties, and is more robust than the equivalent smooth.spline.
Value
an object of class fr, containing as additional arguments spline and a modified rstar argument.
Warning
While penalized (robust) splines often do a good job at capturing and correcting for numerical outliers and NA, it
may also be driven by unusual values lying on the profile log-likelihood the curve or fail to detect outliers (or falsely identifying ‘correct’ values as outliers). The user should always validate by comparing the plots of both the uncorrected (raw) output of the object with that of spline.corr.
Semi-parametric marginal transformation to uniform
Description
The function spunif transforms a matrix or vector of data x
to the pseudo-uniform scale using a semiparametric transform. Data below the threshold
are transformed to pseudo-uniforms using a rank transform, while data above the threshold
are assumed to follow a generalized Pareto distribution. The parameters of the latter are
estimated using maximum likelihood if either scale = NULL or shape = NULL.
Usage
spunif(x, thresh, scale = NULL, shape = NULL)
Arguments
| x | matrix or vector of data | 
| thresh | vector of marginal thresholds | 
| scale | vector of marginal scale parameters for the generalized Pareto | 
| shape | vector of marginal shape parameters for the generalized Pareto | 
Value
a matrix or vector of the same dimension as x, with pseudo-uniform observations
Author(s)
Leo Belzile
Examples
x <- rmev(1000, d = 3, param = 2, model = 'log')
thresh <- apply(x, 2, quantile, 0.95)
spunif(x, thresh)
Stein's weighted generalized Pareto likelihood
Description
Stein's weighted generalized Pareto likelihood
Usage
stein_gp_lik(pars, xdat, weights = rep(1, length(xdat)), ...)
Arguments
| pars | vector of length with scale and shape parameters | 
| xdat | vector of exceedances in decreasing order if  | 
| weights | vector of weights | 
| ... | additional arguments, currently ignored | 
References
Stein, M.L. (2023). A weighted composite log-likelihood approach to parametric estimation of the extreme quantiles of a distribution. Extremes 26, 469-507 <doi:10.1007/s10687-023-00466-w>
Coefficient of tail correlation and tail dependence
Description
For data with unit Pareto margins, the coefficient of tail dependence \eta is defined  via 
\Pr(\min(X) > x) = L(x)x^{-1/\eta},
where L(x) is a slowly varying function. Ignoring the latter, several estimators of \eta can be defined. In unit Pareto margins, \eta is a nonnegative shape parameter that can be estimated by fitting a generalized Pareto distribution above a high threshold. In exponential margins, \eta is a scale parameter and the maximum likelihood estimator of the latter is the Hill estimator. Both methods are based on peaks-over-threshold and the user can choose between pointwise confidence obtained through a likelihood ratio test statistic ("lrt") or the Wald statistic ("wald").
Usage
taildep(
  xdat,
  qlev = NULL,
  nq = 40,
  qlim = c(0.8, 0.99),
  depmeas = c("eta", "chi"),
  estimator = list(eta = c("emp", "betacop", "gpd", "hill"), chi = c("emp", "betacop")),
  confint = c("wald", "lrt"),
  level = 0.95,
  trunc = TRUE,
  margtrans = c("emp", "none"),
  ties.method = "random",
  plot = TRUE,
  ...
)
Arguments
| xdat | an  | 
| qlev | vector of percentiles between 0 and 1 | 
| nq | number of quantiles of the structural variable at which to form a grid; only used if  | 
| qlim | limits for the sequence  | 
| depmeas | dependence measure, either of  | 
| estimator | named list giving the estimation method for  | 
| confint | string indicating the type of confidence interval for  | 
| level | the confidence level required (default to 0.95). | 
| trunc | logical indicating whether the estimates and confidence intervals should be truncated in  | 
| margtrans | marginal transformation; if  | 
| ties.method | string indicating the type of method for  | 
| plot | logical; should graphs be plotted? | 
| ... | additional arguments passed to  | 
Details
The most common approach for estimation is the empirical survival copula, by evaluating the proportion of sample minima with uniform margins that exceed a given x. An alternative estimator uses a smoothed estimator of the survival copula using Bernstein polynomial, resulting in the so-called betacop estimator. Approximate pointwise confidence intervals for the latter are obtained by assuming the proportion of points is binomial.
The coefficient of tail correlation \chi is
\chi = \lim_{u \to 1} \frac{\Pr(F_1(X_1)>u, \ldots, F_D(X_D)>u)}{1-u}.
Asymptotically independent vectors have \chi = 0. The estimator uses an estimator of the survival copula
Value
a named list with elements
-  qlev: aKvector of percentile levels
-  eta: aKby 3 matrix with point estimates, lower and upper confidence intervals
-  chi: aKby 3 matrix with point estimates, lower and upper confidence intervals
Note
As of version 1.15, the percentiles used are from the minimum variable. This ensures that, regardless of the number of variables, there is no error message returned because the quantile levels are too low for there to be observations
See Also
chiplot for bivariate empirical estimates of \chi and \bar{\chi}.
Examples
## Not run: 
set.seed(765)
# Max-stable model
dat <- rmev(n = 1000, d = 4, param = 0.7, model = "log")
taildep(dat, confint = 'wald')
## End(Not run)
Bridging the singularity for higher order asymptotics
Description
The correction factor \log(q/r)/r for the
likelihood root is unbounded in the vincinity of
the maximum likelihood estimator. The thesis of
Rongcai Li (University of Toronto, 2001)
explores different ways of bridging this
singularity, notably using asymptotic expansions.
Usage
tem.corr(fr, print.warning = FALSE)
Arguments
| fr | an object of class  | 
| print.warning | logical; should warning message be printed? Default to  | 
Details
The poor man's method used here consists in
fitting a robust regression to 1/q-1/r
as a function of r and using predictions
from the model to solve for q. This
approach is seemingly superior to that
previously used in spline.corr.
Value
an object of class fr, containing as additional arguments spline and a modified rstar argument.
P-P plot for testing max stability
Description
The diagnostic, proposed by Gabda, Towe, Wadsworth and Tawn,
relies on the fact that, for max-stable vectors on the unit Gumbel scale,
the distribution of the maxima is Gumbel distribution with a location parameter equal to the exponent measure.
One can thus consider tuples of size m and estimate the location parameter via maximum likelihood
and transforming observations to the standard Gumbel scale. Replicates are then pooled and empirical quantiles are defined.
The number of combinations of m vectors can be prohibitively large, hence only nmax randomly selected
tuples are selected from all possible combinations. The confidence intervals are obtained by a
nonparametric bootstrap, by resampling observations with replacement observations for the selected tuples and re-estimating the
location parameter. The procedure can be computationally intensive as a result.
Usage
test.maxstab(
  xdat,
  m = prod(dim(dat)[-1]),
  nmax = 500L,
  B = 1000L,
  ties.method = "random",
  plot = TRUE,
  ...
)
Arguments
| xdat | matrix or array of max-stable observations, typically block maxima. The first dimension should consist of replicates | 
| m | integer indicating how many tuples should be aggregated. | 
| nmax | maximum number of pairs. Default to 500L. | 
| B | number of nonparametric bootstrap replications. Default to 1000L. | 
| ties.method | string indicating the method for  | 
| plot | logical indicating whether a graph should be produced (default to  | 
| ... | additional arguments for backward compatibility | 
Value
a Tukey probability-probability plot with 95% confidence intervals obtained using a nonparametric bootstrap
References
Gabda, D.; Towe, R. Wadsworth, J. and J. Tawn, Discussion of “Statistical Modeling of Spatial Extremes” by A. C. Davison, S. A. Padoan and M. Ribatet. Statist. Sci. 27 (2012), no. 2, 189–192.
Examples
## Not run: 
xdat <- mev::rmev(n = 250, d = 100, param = 0.5, model = "log")
test.maxstab(xdat, m = 100)
xdat <- rmnorm(n = 250, Sigma = diag(0.5, 10) + matrix(0.5, 10, 10), mu = rep(0, 10))
test.maxstab(xdat, m = 2, nmax = 100)
test.maxstab(xdat, m = ncol(xdat))
## End(Not run)
Ramos and Ledford test of independence
Description
The Ramos and Ledford (2005) score test of independence is a modification of tests by Tawn (1988) and Ledford and Tawn (1996) for a logistic model parameter \alpha=1; the latter two have scores with zero expectation, but the variance of the score are infinite, which produces non-regularity and yield test, once suitably normalized, that converge slowly to their asymptotic null distribution. The test, designed for bivariate samples, transforms observations to have unit Frechet margins and considers a bivariate censored likelihood approach for the logistic distribution.
Usage
test.scoreindep(xdat, p, test = c("ledford", "tawn"))
Arguments
| xdat | a  | 
| p | probability level for the marginal threshold | 
| test | string; if  | 
Value
a list with elements
- stat
- value of the score test statistic 
- pval
- asymptotic p-value 
- test
- testargument
Examples
samp <- rmev(n = 1000, d = 2,
    param = 0.99, model = "log")
(test.scoreindep(samp, p = 0.9))
Automatic L-moment ratio selection method
Description
Given a sample of observations, calculate the L-skewness and L-kurtosis over a set of candidate thresholds. For each threshold candidate, we find the L-skewness that minimizes the sum of squared distance between the theoretical L-skewness and L-kurtosis of the generalized Pareto distribution,
\min_{\tau_3} (t_3-\tau_3)^2 + [t_4 - \tau_3(1+5\tau_3)/(5+\tau_3)]^2.
The function returns the threshold with the minimum distance.
Usage
thselect.alrs(xdat, thresh, plot = FALSE)
Arguments
| xdat | [numeric] vector of observations | 
| thresh | [numeric] vector of candidate thresholds. If missing, 20 sample quantiles starting at the 0.25 quantile in increments of 3.75 percent. | 
| plot | [logical] if  | 
Value
scalar for the chosen numeric threshold
References
Silva Lomba, J., Fraga Alves, M.I. (2020). L-moments for automatic threshold selection in extreme value analysis. Stoch Environ Res Risk Assess, 34, 465–491. doi:10.1007/s00477-020-01789-x
Lower truncated Hill threshold selection
Description
Given a sample of positive data with Pareto tail, the algorithm computes the optimal number of order statistics that minimizes the variance of the average left truncated tail index estimator, and uses the relationship to the Hill estimator for the Hall class of distributions to derive the optimal number (minimizing the asymptotic mean squared error) of the Hill estimator. The default value for the second order regular variation index is taken to be \rho=-1.
Usage
thselect.bab(
  xdat,
  kmin = floor(0.2 * length(xdat)),
  kmax = length(xdat) - 1L,
  rho = -1,
  test = FALSE,
  nsim = 999L,
  level = 0.95
)
Arguments
| xdat | [vector] positive vector of exceedances | 
| kmin | [int] minimum number of exceedances | 
| kmax | [int] maximum number of exceedances for the estimation of the shape parameter. | 
| rho | [double] scalar for the second order regular variation index, a negative number. | 
| test | [logical] if  | 
| nsim | [int] number of replications for Monte Carlo test, used only if  | 
| level | [double] confidence level for test | 
Value
a list with the number of order statistics for the Hill estimator, k0 and the corresponding shape estimate shape, the average lower-trimmed Hill estimator shape._lth and the number of order statistics upon which the latter is based, k0_lth.
References
Bladt, M., Albrecher, H. & Beirlant, J. (2020) Threshold selection and trimming in extremes. Extremes, 23, 629-665 . doi:10.1007/s10687-020-00385-0
Threshold selection via coefficient of variation
Description
This function computes the empirical coefficient of variation and computes a weighted statistic comparing the squared distance with the theoretical coefficient variation corresponding to a specific shape parameter (estimated from the data using a moment estimator as the value minimizing the test statistic, or using maximum likelihood). The procedure stops if there are no more than 10 exceedances above the highest threshold.
Usage
thselect.cv(
  xdat,
  thresh,
  method = c("mle", "wcv", "cv"),
  nsim = 999L,
  nthresh = 10L,
  level = 0.05,
  lazy = FALSE,
  plot = FALSE
)
Arguments
| xdat | [vector] vector of observations | 
| thresh | [vector] vector of threshold. If missing, set to  | 
| method | [string], either moment estimator for the (weighted) coefficient of variation ( | 
| nsim | [integer] number of bootstrap replications | 
| nthresh | [integer] number of thresholds, if  | 
| level | [numeric] probability level for sequential testing procedure | 
| lazy | [logical] compute the bootstrap p-value until the test stops rejecting at level  | 
| plot | [logical] if  | 
Value
a list with elements
-  thresh: value of threshold returned by the procedure,NAif the hypothesis is rejected at all thresholds
-  thresh0: sorted vector of candidate thresholds
-  cindex: index of selected threshold amongthresh0orNAif none returned
-  pval: bootstrap p-values, withNAiflazyand the p-value exceeds level at lower thresholds
-  shape: shape parameter estimates
-  nexc: number of exceedances of each thresholdthresh0
-  method: estimation method for the shape parameter
Note
The authors suggest transformation of
Y = -1/(X + c) + 1/c,
where X are exceedances and c=\sigma/\xi is the ratio of estimated scale and shape parameters. For heavy-tailed distributions with \xi > 0.25, this may be preferable, but must be conducted outside of the function.
References
del Castillo, J. and M. Padilla (2016). Modelling extreme values by the residual coefficient of variation, SORT, 40(2), pp. 303–320.
Examples
thselect.cv(
 xdat = rgp(1000),
 thresh = qgp(seq(0,0.9, by = 0.1)),
 nsim = 99,
 lazy = TRUE,
 plot = TRUE)
Generalized quantile threshold selection
Description
The methodology proposed by Beirlant, Vynckier and Teugels (1996)
uses an asymptotic expansion of the mean squared error for Hill's estimator given
a random sample with Pareto tails and positive shape, using an exponential regression.
The value of k is selected to minimize the mean squared error given optimal weighting scheme. This
depends on the order of regular variation \rho, which is obtained based on the slope of the difference
in Hill estimators, suitably reweighted. The iterative procedure of Beirlant et al. alternates between parameter estimation
until convergence. It returns the Hill shape estimate, the number of higher order statistic, the parameter rho and
estimates of the standard error of the shape and the mean squared error, based on the ultimate parameter values.
Since the weights can become negative, there is no guarantee that the mean squared error estimate is positive, nor that
the estimated value of \rho is nonpositive.
Usage
thselect.expgqt(
  xdat,
  maxiter = 10L,
  tol = 2,
  kmin = max(10, floor(length(xdat)/100)),
  kmax = floor(0.8 * length(xdat)),
  ...
)
Arguments
| xdat | [vector] sample of exceedances | 
| maxiter | [int] maximum number of iteration | 
| tol | [double] tolerance for difference in value of  | 
| kmin | [int] minimum number of exceedances for the estimator | 
| kmax | [int] maximum number of exceedances for the estimator | 
| ... | additional arguments, currently ignored | 
Value
a list with components
-  shapethe Hill estimator of the shape, based on theklargest order statistics
-  k0number of high order statistics for estimation of the shape using Hill's estimator
-  rhoestimate of the second order regular variation parameter
-  msemean squared error estimate of the shape parameter
-  sestandard error of the shape parameter
-  convergencelogical; ifTRUE, indicates that the method converged to a fixed point withintolbefore reaching the maximum number of iterationsmaxiter
References
Beirlant, J., Vynckier, P., & Teugels, J. L. (1996). Excess Functions and Estimation of the Extreme-Value Index. Bernoulli, 2(4), 293–318. doi:10.2307/3318416
Examples
# Simulate Pareto data - log(xdat) is exponential with rate 2
xdat <- rgp(n = 200, loc = 1, scale = 0.5, shape = 0.5)
(thselect.expgqt(xdat))
Kernel-based threshold selection of Goegebeur, Beirlant and de Wet (2008)
Description
Kernel-based threshold selection of Goegebeur, Beirlant and de Wet (2008)
Usage
thselect.gbw(
  xdat,
  kmax,
  kernel = c("Jackson", "Lewis"),
  rho = c("gbw", "ghp", "fagh", "dk"),
  ...
)
Arguments
| xdat | [vector] sample exceedances | 
| kmax | [int] maximum number of exceedances considered | 
| kernel | [string] kernel choice, one of  | 
| rho | string for the estimator of the second order regular variation. Can also be a negative scalar | 
| ... | additional arguments, for backward compatibility purposes | 
Value
a list with elements
- k0: number of exceedances
- shape: Hill's shape estimate
- rho: second-order regular variation parameter estimate
- gof: goodness-of-fit statistic for the chosen threshold.
References
Goegebeur , Y., Beirlant , J., and de Wet , T. (2008). Linking Pareto-Tail Kernel Goodness-of-fit Statistics with Tail Index at Optimal Threshold and Second Order Estimation. REVSTAT-Statistical Journal, 6(1), 51–69. <doi:10.57805/revstat.v6i1.57>
Examples
xdat <- rgp(n = 1000, scale = 2, shape = 0.5)
(thselect.gbw(xdat, kmax = 500))
Mahalanobis distance-based methodology
Description
Compute the Mahalanobis distance-based threshold method over a grid of thresholds by transforming data from generalized Pareto to unit exponential based on probability weighted moment estimates, then computing the first L-moment and the L-skewness. The latter are compared to the theoretical counterparts from a unit exponential sample of the same size, which is used to compute the Mahalanobis distance. The threshold returned is the one which minimizes the distance.
Usage
thselect.ksmd(xdat, thresh, approx = c("asymptotic", "mc"), nsim = 1000L)
Arguments
| xdat | [numeric] vector of observations | 
| thresh | [numeric] vector of candidate thresholds. If missing, 20 sample quantiles starting at the 0.25 quantile in increments of 3.75 percent. | 
| approx | [string] method to use to obtain moments of first L-moment | 
| nsim | [integer] number of replications for Monte Carlo approximation | 
Value
a list with components
-  thresh0: selected threshold returned by the procedure
-  thresh: vector of candidate thresholds
-  pval: scalar p-value for the chi-square approximation to the test statistic for the selected threshold
-  dist: vector of Mahalanobis distance
-  approx: type of approximation
References
Kiran, K. G. and Srivinas, V.V. (2021). A Mahalanobis distance-based automatic threshold selection method for peaks over threshold model. Water Resources Research 57. <doi:10.1029/2020WR027534>
Minimum distance threshold selection procedure
Description
Minimum distance threshold selection procedure
Usage
thselect.mdps(xdat)
Arguments
| xdat | vector of positive exceedances | 
Value
a list with components
- k0
- order statistic corresponding to threshold (number of exceedances) 
- shape
- Hill's estimator of the tail index based on k0 exceedances 
- thresh0
- numerical value of the threshold, the n-k0+1 order statistic of the original sample 
References
Clauset, A., Shalizi, C.R. and Newman, M.E.J. (2009). Power-Law Distributions in Empirical Data. SIAM Review. Society for Industrial and Applied Mathematics, 51, 661-703, doi:10.1137/070710111
Automated mean residual life plots
Description
This function implements the automated proposal from Section 2.2 of Langousis et al. (2016) for mean residual life plots. It returns the threshold that minimize the weighted mean square error and moment estimators for the scale and shape parameter based on weighted least squares.
Usage
thselect.mrl(xdat, thresh, kmax, plot = TRUE, ...)
Arguments
| xdat | [numeric] vector of observations | 
| thresh | [numeric] vector of thresholds; if missing, uses all order statistics from the 20th largest until  | 
| kmax | [integer] maximum number of order statistics | 
| plot | [logical] if  | 
| ... | additional arguments, currently ignored | 
Details
The procedure consists in estimating the usual mean residual life as a function of the threshold, and looking for an order statistic or threshold value above which the fit is more or less linear.
Value
a list containing
-  thresh: candidate threshold vector
-  thresh0: selected threshold
-  scale: scale parameter estimate
-  shape: shape parameter estimate
-  mrl: empirical mean excess values
-  xdat: ordered observations
-  intercept: intercept for mean excess value at chosen threshold
-  slope: slope for mean excess value at chosen threshold
-  tmanual: logical;TRUEif the user passed a vector of thresholds
References
Langousis, A., A. Mamalakis, M. Puliga and R. Deidda (2016). Threshold detection for the generalized Pareto distribution: Review of representative methods and application to the NOAA NCDC daily rainfall database, Water Resources Research, 52, 2659–2681.
Examples
thselect.mrl(rgp(n = 100))
Northop and Coleman piecewise generalized Pareto threshold selection diagnostic
Description
The model tests the null hypothesis of a generalized Pareto above each threshold in thresh against the alternative of a piecewise generalized Pareto model with continuity constraints.
Usage
thselect.ncpgp(xdat, thresh, test = "score", plot = FALSE, level = 0.95, ...)
Arguments
| xdat | [vector] observations | 
| thresh | [vector] candidate thresholds | 
| test | [string] indicating whether to perform  | 
| plot | [logical]; if  | 
| level | [double] confidence level for confidence interval, defaults to 0.95 | 
| ... | additional arguments, for backward compatibility purposes | 
Value
an object of class mev_thselect_ncpgp containing the test statistic (stat), the p-values (pval), the threshold candidates (thresh) and the selected threshold (thresh0).
Prediction error C-criterion threshold selection method
Description
This function computes the non-robust Pareto prediction error of Dupuis and Victoria-Feser (2003), termed C-criterion, for the Hill estimator of the shape parameter. The threshold returned is the value of the threshold, taken from order statistics, that minimizes the average prediction error.
Usage
thselect.pec(xdat, kmax)
Arguments
| xdat | vector of observations | 
| kmax | maximum number of order statistics to consider. Default to sample size if left unspecified. | 
Value
a list with the number of exceedances k, the chosen threshold thresh0 and the corresponding Hill estimator shape estimate shape.
References
Dupuis, D.J. and M.-P. Victoria-Feser (2003). A Prediction Error Criterion for Choosing the Lower Quantile in Pareto Index Estimation, University of Geneva, technical report, https://archive-ouverte.unige.ch/unige:5789.
Pickands' order statistics threshold selection method
Description
Restricting to the largest fourth of the data, returns the number of exceedances that minimizes the Kolmogorov-Smirnov statistic, i.e., the maximum absolute difference between the estimated generalized Pareto and the empirical distribution of exceedances. Relative to the paper, different estimation methods are proposed.
Usage
thselect.pickands(xdat, thresh, method = c("mle", "lmom", "quartiles"))
Arguments
| xdat | [numeric] vector of observations | 
| thresh | [numeric] vector of candidate thresholds. If missing, defaults to order statistics from the 10th to a quarter of the sample size. | 
| method | [string] estimation method, either the quartiles of Pickands (1975), maximum likelihood, probability weighted moments or L-moments | 
Value
a list with components
-  k0: number of exceedances
-  thresh0: selected threshold returned by the procedure
-  thresh: vector of candidate thresholds
-  dist; vector of Kolmogorov-Smirnoff distance
-  method; string for the estimation method
-  scale: estimated scale parameter at the chosen threshold
-  shape: estimated shape parameter at the chosen threshold
Note
The quartiles estimator of Pickands is robust, but very inefficient. It is provided for historical reasons.
References
James Pickands III (1975). Statistical inference using extreme order statistics, Annals of Statistics, 3(1) 119-131, doi:10.1214/aos/1176343003
Threshold selection for the random block maxima method
Description
Threshold selection for the random block maxima method
Usage
thselect.rbm(xdat, kmax = length(xdat))
Arguments
| xdat | [vector] sample exceedances | 
| kmax | maximum number of exceedances to consider. | 
Value
a list with elements
Threshold selection via SAMSEE
Description
Smooth asymptotic mean squared error estimator of Schneider et al. (2021) for threshold selection. The implementation uses a second-order regular variation index of -1
Usage
thselect.samsee(xdat)
Arguments
| xdat | vector of positive exceedances | 
Value
a list with elements
- k0
- optimal number of exceedances 
- shape
- Hill estimator of the tail index 
- thresh0
- selected threshold 
References
Schneider, L.F., Krajina, A. & Krivobokova, T. (2021). Threshold selection in univariate extreme value analysis, Extremes, 24, 881–913 doi:10.1007/s10687-021-00405-7
Threshold selection diagnostic of Suveges and Davison
Description
The information matrix test (IMT), proposed by Suveges and Davison (2010), is based
on the difference between the expected quadratic score and the second derivative of
the log-likelihood. The asymptotic distribution for each threshold u and gap K
is asymptotically \chi^2 with one degree of freedom. The approximation is good for
N>80 and conservative for smaller sample sizes. The test assumes independence between gaps.
Usage
thselect.sdinfo(xdat, thresh, qlev, plot = FALSE, kmax = 1)
Arguments
| xdat | [vector] vector of observations | 
| thresh | [vector] candidate thresholds | 
| qlev | [vector] probability levels to define threshold if  | 
| plot | [logical]; should the graphical diagnostic be plotted? | 
| kmax | [int] the largest K-gap under consideration for clusters | 
Details
The procedure proposed in Suveges & Davison (2010) was corrected for erratas.
The maximum likelihood is based on the limiting mixture distribution of
the intervals between exceedances (an exponential with a point mass at zero).
The condition D^{(K)}(u_n) should be checked by the user.
Fukutome et al. (2015) propose an ad hoc automated procedure
- Calculate the interexceedance times for each K-gap and each threshold, along with the number of clusters 
- Select the ( - u,- K) pairs for which IMT < 0.05 (corresponding to a P-value of 0.82)
- Among those, select the pair ( - u,- K) for which the number of clusters is the largest
Value
an invisible list of class with elements
-  thresha vector of thresholds based on empirical quantiles at supplied levels.
-  stata matrix of test statistics
-  pvala matrix of approximate p-values (corresponding to probabilities under a\chi^2_1distribution)
-  mlea matrix of maximum likelihood estimates for each given pair of thresholds and gaps
-  loglika matrix of log-likelihood values at MLE for each given pair of elements inthreshand gap in0, \ldots,\code{kmax}
-  quantilequantile levels for thresholds, if supplied by the user
-  kmaxthe largest gap number
Author(s)
Leo Belzile
References
Fukutome, Liniger and Suveges (2015), Automatic threshold and run parameter selection: a climatology for extreme hourly precipitation in Switzerland. Theoretical and Applied Climatology, 120(3), 403-416.
Suveges and Davison (2010), Model misspecification in peaks over threshold analysis. Annals of Applied Statistics, 4(1), 203-221.
White (1982), Maximum Likelihood Estimation of Misspecified Models. Econometrica, 50(1), 1-25.
Examples
thselect.sdinfo(
  xdat = rgp(n = 10000),
  qlev = seq(0.1, 0.9, length = 10),
  kmax = 3)
Metric-based threshold selection
Description
Adaptation of Varty et al.'s metric-based threshold automated diagnostic for the independent and identically distributed case with no rounding.
Usage
thselect.vmetric(
  xdat,
  thresh,
  B = 199L,
  type = c("eqd", "exp", "qq", "pp"),
  dist = c("l1", "l2"),
  uq = FALSE,
  neval = 1000L,
  ci = 0.95
)
Arguments
| xdat | vector of observations | 
| thresh | vector of thresholds | 
| B | number of bootstrap replications | 
| type | string indicating scale, either  | 
| dist | string indicating norm, either  | 
| uq | logical; if  | 
| neval | number of points at which to estimate the metric. Default to 1000L | 
| ci | level of symmetric confidence interval. Default to 0.95 | 
Details
The algorithm proceeds by first computing the maximum likelihood algorithm and then simulating datasets from replication with parameters drawn from a bivariate normal approximation to the maximum likelihood estimator distribution.
For each bootstrap sample, we refit the model and convert the quantiles to exponential or uniform variates. The mean absolute or mean squared distance is calculated on these. The threshold returned is the one with the lowest value of the metric.
Value
an invisible list with components
-  thresh: scalar threshold minimizing criterion
-  thresh0: vector of candidate thresholds
-  metric: value of the metric criterion evaluated at each threshold
-  type: argumenttype
-  dist: argumentdist
References
Varty, Z. and J.A. Tawn and P.M. Atkinson and S. Bierman (2021+), Inference for extreme earthquake magnitudes accounting for a time-varying measurement process.
Murphy, C., Tawn, J. A., & Varty, Z. (2024). Automated Threshold Selection and Associated Inference Uncertainty for Univariate Extremes. Technometrics, 67(2), 215–224. <doi:10.1080/00401706.2024.2421744>
Examples
xdat <- rexp(1000, rate = 1/2)
thresh <- quantile(xdat, prob = c(0.25,0.5, 0.75))
Wadsworth's sequential analysis threshold selection
Description
Function to produce diagnostic plots and test statistics for the threshold diagnostics exploiting structure of maximum likelihood estimators based on the non-homogeneous Poisson process likelihood or the coefficient of tail dependence
Usage
thselect.wseq(
  xdat,
  thresh,
  qlev,
  model = c("nhpp", "taildep", "rtaildep"),
  npp = 1,
  nsim = 1000L,
  level = 0.95,
  plot = FALSE,
  ...
)
Arguments
| xdat | a numeric vector or matrix of data to be fitted. | 
| thresh | vector of candidate thresholds. | 
| qlev | vector of probabilities for empirical quantiles used in place of the threshold, used if argument  | 
| model | string specifying whether the univariate or multivariate diagnostic should be used. Either  | 
| npp | number of observations per period for the non-homogeneous point process model. Default to 1. | 
| nsim | number of Monte Carlo simulations used to assess the null distribution of the test statistic | 
| level | confidence level of intervals, defaults to 0.95 | 
| plot | logical; if  | 
| ... | additional parameters passed to internal routine | 
Details
The function is a wrapper for the univariate (non-homogeneous Poisson process model) and exponential dependence model applied to the minimum component (tail dependence coefficient).
For the latter, the user can select either the rate ("taildep" or inverse rate parameter  ("rtaildep"). The inverse rate parametrization  works better for uniformity of the p-value distribution under the likelihood ratio test for the changepoint.
For the coefficient of tail dependence, users must provide pairwise minimum of marginally exponentially distributed margins (see example)
Value
an object of class invisible list with components
-  thresh0: threshold selected by the likelihood ratio procedure
-  thresh: vector of candidate thresholds
-  coef: maximum likelihood estimates from all thresholds
-  vcov: joint asymptotic covariance matrix for shape\xior coefficient of tail dependence\eta, or it's reciprocal.
-  wn: values of the white noise process
-  stat: value of the likelihood ratio test statistic for the changepoint test
-  pval: P-value of the likelihood ratio test
-  mle: maximum likelihood estimates for the selected threshold
-  model: model fitted, eithernhpp,exporinvexp
-  nsim: number of Monte Carlo simulations for changepoint test
-  xdat: vector of observations
Author(s)
Jennifer L. Wadsworth, Léo Belzile
References
Wadsworth, J.L. (2016). Exploiting Structure of Maximum Likelihood Estimators for Extreme Value Threshold Selection, Technometrics, 58(1), 116-126, http://dx.doi.org/10.1080/00401706.2014.998345.
Examples
## Not run: 
set.seed(123)
xdat <- abs(rnorm(5000))
thresh <- quantile(xdat, seq(0, 0.9, by = 0.1))
(diag <- thselect.wseq(
 xdat = xdat,
 thresh = thresh,
 plot = TRUE,
 type = "ps"))
# Multivariate example, with coefficient of tail dependence
xbvn <- rmnorm(n = 6000L,
                mu = rep(0, 2),
                Sigma = cbind(c(1, 0.7), c(0.7, 1)))
thselect.wseq(
  xdat = xbvn,
  qlev = seq(0, 0.9, length.out = 30),
  model = 'taildep',
  plot = TRUE)
## End(Not run)
Coefficient of variation threshold stability plot
Description
This function calculates parametric estimates of the coefficient of variation with pointwise Wald confidence intervals along with empirical estimates and returns a threshold stability plot.
Usage
tstab.cv(
  xdat,
  thresh,
  method = c("empirical", "mle", "wcv", "cv"),
  nthresh = 10L,
  nsim = 99L,
  plot = TRUE,
  level = 0.95,
  ...
)
Arguments
| xdat | [vector] vector of observations | 
| thresh | [vector] vector of threshold. If missing, set to  | 
| method | [string], either moment estimator for the (weighted) coefficient of variation ( | 
| nthresh | [integer] number of thresholds, if  | 
| nsim | [integer] number of bootstrap replications | 
| plot | [logical] if  | 
| level | [numeric] probability level for sequential testing procedure | 
| ... | additional parameters, notably for package  | 
Examples
tstab.cv(
xdat = rgp(1000),
thresh = qgp(seq(0,0.9, by = 0.1)),
method = "cv")
 tstab.cv(
xdat = rgp(1000),
thresh = qgp(seq(0,0.9, by = 0.1)),
method = "empirical")
Threshold stability plots for extended generalized Pareto models
Description
Threshold stability plots for extended generalized Pareto models
Usage
tstab.egp(
  xdat,
  thresh,
  model = c("pt-beta", "pt-gamma", "pt-power", "gj-tnorm", "gj-beta", "exptilt",
    "logist"),
  param = c("shape", "kappa"),
  type = c("wald", "profile"),
  transform = FALSE,
  level = 0.95,
  plot = TRUE,
  ...
)
Arguments
| xdat | vector of observations, greater than the threshold | 
| thresh | threshold value | 
| model | a string indicating which extended family to fit | 
| param | [string] parameter, either  | 
| type | [string] confidence interval type, either  | 
| transform | logical; if  | 
| level | [double] confidence interval level, default to 0.95. | 
| plot | [logical] if  | 
| ... | additional arguments for the plot function, currently ignored. | 
Examples
xdat <- rgp(n = 1000)
tstab.egp(
 xdat = xdat,
 thresh = c(0, quantile(xdat, 0.5)),
 model = "gj-tnorm",
 param = "kappa",
 transform = TRUE)
Parameter stability plots for peaks-over-threshold
Description
This function computes the maximum likelihood estimate at each provided threshold and plots the estimates (pointwise), along with 95% confidence/credible intervals obtained using Wald or profile confidence intervals, or else from 1000 independent draws from the posterior distribution under vague independent normal prior on the log-scale and shape. The latter two methods better reflect the asymmetry of the estimates than the Wald confidence intervals.
Usage
tstab.gpd(
  xdat,
  thresh,
  method = c("wald", "profile", "post"),
  level = 0.95,
  plot = TRUE,
  which = c("scale", "shape"),
  changepar = TRUE,
  ...
)
Arguments
| xdat | a vector of observations | 
| thresh | a vector of candidate thresholds at which to compute the estimates. | 
| method | string indicating the method for computing confidence or credible intervals.
Must be one of  | 
| level | confidence level of the intervals. Default to 0.95. | 
| plot | logical; should parameter stability plots be displayed? Default to  | 
| which | character vector with elements  | 
| changepar | logical; if  | 
| ... | additional arguments passed to  | 
Value
a list with components
- threshold: vector of numerical threshold values.
- mle: matrix of modified scale and shape maximum likelihood estimates.
- lower: matrix of lower bounds for the confidence or credible intervals.
- upper: matrix of lower bounds for the confidence or credible intervals.
- method: method for the confidence or coverage intervals.
plots of the modified scale and shape parameters, with pointwise confidence/credible intervals
and an invisible data frame containing the threshold thresh and the modified scale and shape parameters.
Note
The function is hard coded to prevent fitting a generalized Pareto distribution to samples of size less than 10. If the estimated shape parameters are all on the boundary of the parameter space (meaning \hat{\xi}=-1), then the plots return one-sided confidence intervals for both the modified scale and shape parameters: these typically suggest that the chosen thresholds are too high for estimation to be reliable.
Author(s)
Leo Belzile
See Also
Examples
dat <- abs(rnorm(10000))
u <- qnorm(seq(0.9,0.99, by= 0.01))
par(mfrow = c(1,2))
tstab.gpd(xdat = dat, thresh = u, changepar = FALSE)
## Not run: 
tstab.gpd(xdat = dat, thresh = u, method = "profile")
tstab.gpd(xdat = dat, thresh = u, method = "post")
## End(Not run)
Threshold stability plot for Hill estimator
Description
Threshold stability plot for Hill estimator
Usage
tstab.hill(xdat, kmax, method = "hill", ..., log = TRUE)
Arguments
| xdat | [vector] sample exceedances | 
| kmax | [int] maximum number of order statistics | 
| method | [string] name of estimator for shape parameter. Default to  | 
| ... | additional arguments passed to  | 
| log | [logical] should the x-axis for the number of order statistics used for estimation be displayed on the log scale? Default to  | 
Value
a plot of shape estimates as a function of the number of exceedances
Examples
xdat <- rgp(n = 250, loc = 1, scale = 2, shape = 0.5)
tstab.hill(xdat)
Threshold stability plots for left-truncated Hill estimators
Description
Given a vector of exceedances and some potential choices of k for the threshold, compute the left-truncated Hill estimators for each value of k and use these to compute the variance and slope of the estimator
Usage
tstab.lthill(xdat, k, which = c("lthill", "var", "slope"), log = TRUE, ...)
Arguments
| xdat | [numeric] vector of positive observations | 
| k | [integer] number of order statistics for the threshold | 
| which | [string] the type of plot, showing the left-truncated Hill plot on the log, the log of the variance of the estimator, or the log slope | 
| log | [logical] if  | 
| ... | additional parameters for color, etc. to be passed to plot | 
Value
an invisible list with lthill, order statistics, the log variance and the log scale.
References
Bladt, M., Albrecher, H. & Beirlant, J. (2020) Threshold selection and trimming in extremes. Extremes, 23, 629-665 . doi:10.1007/s10687-020-00385-0
Examples
xdat <- 10/(1 - runif(n = 1000)) - 10
tstab.lthill(xdat = xdat, k = c(50,100,200))
Mean residual life plot
Description
Computes mean of sample exceedances over a range of thresholds or for a pre-specified number of largest order statistics, and returns a plot with 95% Wald-based confidence intervals as a function of either the threshold or the number of exceedances. The main purpose is the plotting method, which generates the so-called mean residual life plot. The latter should be approximately linear over the threshold for a generalized Pareto distribution
Usage
tstab.mrl(
  xdat,
  thresh,
  kmin = 10L,
  kmax = length(xdat),
  plot = TRUE,
  level = 0.95,
  xlab = c("thresh", "nexc"),
  type = c("band", "ptwise"),
  ...
)
Arguments
| xdat | vector of sample observations | 
| thresh | vector of thresholds | 
| kmin | integer giving the minimum number of exceedances; ignored if  | 
| kmax | integer giving the maximum number of exceedances; ignored if  | 
| plot | logical; if  | 
| level | double giving the level of confidence intervals for the plot, default to 0.95 | 
| xlab | string indicating whether to use thresholds ( | 
| type | string whether to plot pointwise confidence intervals using segments ( | 
| ... | additional arguments, currently ignored | 
Value
an invisible list with mean sample exceedances and standard deviation, number of exceedances, threshold
References
Davison, A.C. and R.L. Smith (1990). Models for Exceedances over High Thresholds (with discussion), Journal of the Royal Statistical Society. Series B (Methodological), 52(3), 393–442.
Examples
tstab.mrl(
 xdat = rgp(n = 100, shape = -0.5),
 xlab = "thresh",
 kmax = 50)
tstab.mrl(
 rexp(100),
 thresh = qexp(seq(0, 0.9, by = 0.01)))
Venice Sea Levels
Description
The venice data contains the 10 largest yearly sea levels (in cm)
from 1887 until 2019. Only the yearly maximum is available for 1922
and the six largest observations for 1936.
Format
a data frame with 133 rows and 11 columns containing the year of the measurement (first column)
and ordered 10-largest yearly observations, reported in decreasing order from largest (r1) to smallest (r10).
Note
Smith (1986) notes that the annual maxima seems to fluctuate around a constant sea level up to 1930 or so, after which there is potential linear trend. Records of threshold exceedances above 80 cm (reported on the website) indicate that observations are temporally clustered.
The observations from 1931 until 1981 can be found in Table 1 in Smith (1986), who reported data from Pirazzoli (1982). The values from 1983 until 2019 were extracted by Anthony Davison from the City of Venice website (accessed in May 2020) and are licensed under the CC BY-NC-SA 3.0 license. The Venice City website indicates that later measurements were recorded by an instrument located in Punta Salute.
Source
City of Venice, Historical archive <https://www.comune.venezia.it/node/6214>. Last accessed November 5th, 2020.
References
Smith, R. L. (1986) Extreme value theory based on the r largest annual events. Journal of Hydrology 86, 27–43.
Pirazzoli, P., 1982. Maree estreme a Venezia (periodo 1872-1981). Acqua Aria 10, 1023-1039.
Coles, S. G. (2001) An Introduction to Statistical Modelling of Extreme Values. London: Springer.
See Also
Best 200 times of Women 1500m Track
Description
200 all-time best performance (in seconds) of women 1500-meter run.
Format
a vector of size 200
Source
<http://www.alltime-athletics.com/w_1500ok.htm>, accessed 14.08.2018
Coefficient of extremal asymmetry
Description
This function implements estimators of the bivariate coefficient of extremal asymmetry proposed in Semadeni's (2021) PhD thesis. Two estimators are implemented: one based on empirical distributions, the second using empirical likelihood.
Usage
xasym(
  xdat,
  qlev = NULL,
  nq = 40,
  qlim = c(0.8, 0.99),
  estimator = c("emp", "elik"),
  confint = c("none", "wald", "bootstrap"),
  level = 0.95,
  B = 999L,
  ties.method = "random",
  plot = TRUE,
  ...
)
Arguments
| xdat | an  | 
| qlev | vector of quantile levels at which to evaluate extremal asymmetry | 
| nq | integer; number of quantiles at which to evaluate the coefficient if  | 
| qlim | a vector of length 2 with the probability limits for the quantiles | 
| estimator | string indicating the estimation method, one of  | 
| confint | string for the method used to derive confidence intervals, either  | 
| level | probability level for confidence intervals, default to 0.95 or bounds for the interval | 
| B | integer; number of bootstrap replicates (if applicable) | 
| ties.method | string; method for handling ties. See the documentation of rank for available options. | 
| plot | logical; if  | 
| ... | additional parameters for plots | 
Details
Let U, V be uniform random variables and define the partial extremal dependence coefficients
\varphi_{+}(u) = \Pr(V > U \mid U > u, V > u),
,
\varphi_{-}(u) = \Pr(V < U \mid U > u, V > u),
\varphi_0(u) = \Pr(V = U \mid U > u, V > u).
Define
 \varphi(u) = \frac{\varphi_{+} - \varphi_{-}}{\varphi_{+} + \varphi_{-}}
and the coefficient of extremal asymmetry as \varphi = \lim_{u \to 1} \varphi(u).
The empirical likelihood estimator, derived for max-stable vectors with unit Frechet margins, is
\widehat{\varphi}_{\mathrm{el}} = \frac{\sum_i p_i \mathrm{I}(w_i \leq 0.5) - 0.5}{0.5 - 2\sum_i p_i(0.5-w_i) \mathrm{I}(w_i \leq 0.5)}
where p_i is the empirical likelihood weight for observation i, \mathrm{I} is an indicator function and w_i is the pseudo-angle associated to the first coordinate, derived based on exceedances above u.
Value
an invisible data frame with columns
- threshold
- vector of thresholds on the probability scale 
- coef
- extremal asymmetry coefficient estimates 
- confint
- either - NULLor a matrix with two columns containing the lower and upper bounds for each threshold
References
Semadeni, C. (2020). Inference on the Angular Distribution of Extremes, PhD thesis, EPFL, no. 8168.
Examples
## Not run: 
samp <- rmev(n = 1000,
             d = 2,
             param = 0.2,
             model = "log")
xasym(samp, confint = "wald")
xasym(samp, method = "emplik")
## End(Not run)
Coefficient of extremal asymmetry
Description
This function implements estimators of the bivariate coefficient of extremal asymmetry proposed in Semadeni's (2021) PhD thesis. Two estimators are implemented: one based on empirical distributions, the second using empirical likelihood.
Usage
xdep.asym(
  xdat,
  qlev = NULL,
  nq = 40,
  qlim = c(0.8, 0.99),
  estimator = c("emp", "elik"),
  confint = c("none", "wald", "bootstrap"),
  level = 0.95,
  B = 999L,
  ties.method = "random",
  plot = TRUE,
  ...
)
Arguments
| xdat | an  | 
| qlev | vector of quantile levels at which to evaluate extremal asymmetry | 
| nq | integer; number of quantiles at which to evaluate the coefficient if  | 
| qlim | a vector of length 2 with the probability limits for the quantiles | 
| estimator | string indicating the estimation method, one of  | 
| confint | string for the method used to derive confidence intervals, either  | 
| level | probability level for confidence intervals, default to 0.95 or bounds for the interval | 
| B | integer; number of bootstrap replicates (if applicable) | 
| ties.method | string; method for handling ties. See the documentation of rank for available options. | 
| plot | logical; if  | 
| ... | additional arguments for backward compatibility | 
Details
Let U, V be uniform random variables and define the partial extremal dependence coefficients
\varphi_{+}(u) = \Pr(V > U \mid U > u, V > u),
,
\varphi_{-}(u) = \Pr(V < U \mid U > u, V > u),
\varphi_0(u) = \Pr(V = U \mid U > u, V > u).
Define
 \varphi(u) = \frac{\varphi_{+} - \varphi_{-}}{\varphi_{+} + \varphi_{-}}
and the coefficient of extremal asymmetry as \varphi = \lim_{u \to 1} \varphi(u).
The empirical likelihood estimator, derived for max-stable vectors with unit Frechet margins, is
\widehat{\varphi}_{\mathrm{el}} = \frac{\sum_i p_i \mathrm{I}(w_i \leq 0.5) - 0.5}{0.5 - 2\sum_i p_i(0.5-w_i) \mathrm{I}(w_i \leq 0.5)}
where p_i is the empirical likelihood weight for observation i, \mathrm{I} is an indicator function and w_i is the pseudo-angle associated to the first coordinate, derived based on exceedances above u.
Value
an invisible data frame with columns
- qlev
- quantile level of thresholds 
- coef
- extremal asymmetry coefficient estimates 
- lower
- either - NULLor a vector containing the lower bound of the confidence interval
- upper
- either - NULLor a vector containing the lower bound of the confidence interval
References
Semadeni, C. (2020). Inference on the Angular Distribution of Extremes, PhD thesis, EPFL, no. 8168.
Examples
## Not run: 
samp <- rmev(n = 1000,
             d = 2,
             param = 0.2,
             model = "log")
xdep.asym(samp, confint = "wald")
xdep.asym(samp, method = "emplik", confint = "none")
## End(Not run)
Coefficient of tail correlation
Description
The coefficient of tail correlation \chi is
\chi = \lim_{u \to 1} \frac{\Pr(F_1(X_1)>u, \ldots, F_D(X_D)>u)}{1-u}.
Asymptotically independent vectors have \chi = 0. The estimator uses an estimator of the survival copula
Usage
xdep.chi(
  xdat,
  qlev = NULL,
  nq = 40,
  qlim = c(0.8, 0.99),
  estimator = c("emp", "betacop", "gpd", "hill"),
  confint = c("wald", "lrt"),
  level = 0.95,
  margtrans = c("emp", "none"),
  ties.method = "random",
  plot = TRUE,
  ...
)
Arguments
| xdat | an  | 
| qlev | vector of percentiles between 0 and 1 | 
| nq | number of quantiles of the structural variable at which to form a grid; only used if  | 
| qlim | limits for the sequence  | 
| estimator | string giving estimator to employ | 
| confint | string indicating the type of confidence interval, one of  | 
| level | the confidence level required (default to 0.95). | 
| margtrans | string giving the marginal transformation, one of  | 
| ties.method | string indicating the type of method for  | 
| plot | logical; if  | 
| ... | additional arguments to  | 
Value
a data frame
-  qlev: quantile level of estimates
-  coef: point estimates
-  lower: lower bound of confidence interval
-  upper: lower bound of confidence interval
Examples
## Not run: 
set.seed(765)
# Max-stable model
dat <- rmev(n = 1000, d = 2, param = 0.7, model = "log")
xdep.chi(dat, confint = 'wald')
## End(Not run)
Coefficient chi-bar
Description
For data with unit Pareto margins, the coefficient \bar{\chi} = 2\eta-1 is defined  via 
\Pr(\min(X) > x) = L(x)x^{-1/\eta},
where L(x) is a slowly varying function. Ignoring the latter, several estimators of \eta can be defined. In unit Pareto margins, \eta is a nonnegative shape parameter that can be estimated by fitting a generalized Pareto distribution above a high threshold. In exponential margins, \eta is a scale parameter and the maximum likelihood estimator of the latter is the Hill estimator. Both methods are based on peaks-over-threshold and the user can choose between pointwise confidence obtained through a likelihood ratio test statistic ("lrt") or the Wald statistic ("wald").
Usage
xdep.chibar(
  xdat,
  qlev = NULL,
  nq = 40,
  qlim = c(0.8, 0.99),
  estimator = c("emp", "betacop"),
  confint = c("wald", "lrt"),
  level = 0.95,
  margtrans = c("emp", "none"),
  ties.method = "random",
  plot = TRUE,
  ...
)
Arguments
| xdat | an  | 
| qlev | vector of percentiles between 0 and 1 | 
| nq | number of quantiles of the structural variable at which to form a grid; only used if  | 
| qlim | limits for the sequence  | 
| estimator | string giving estimator to employ | 
| confint | string indicating the type of confidence interval, one of  | 
| level | the confidence level required (default to 0.95). | 
| margtrans | string giving the marginal transformation, one of  | 
| ties.method | string indicating the type of method for  | 
| plot | logical; if  | 
| ... | additional arguments to  | 
Details
The most common approach for estimation is the empirical survival copula, by evaluating the proportion of sample minima with uniform margins that exceed a given x. An alternative estimator uses a smoothed estimator of the survival copula using Bernstein polynomial, resulting in the so-called betacop estimator. Approximate pointwise confidence intervals for the latter are obtained by assuming the proportion of points is binomial.
Value
a data frame
-  qlev: quantile level of estimates
-  coef: point estimates
-  lower: lower bound of confidence interval
-  upper: lower bound of confidence interval
References
Ledford, A.W. and J. A. Tawn (1996), Statistics for near independence in multivariate extreme values. Biometrika, 83(1), 169–187.
Ledford, A.W. and J. A. Tawn (1996), Statistics for near independence in multivariate extreme values. Biometrika, 83(1), 169–187.
Examples
## Not run: 
set.seed(765)
# Max-stable model
dat <- rmev(n = 1000, d = 2, param = 0.7, model = "log")
xdep.chibar(dat, confint = 'wald')
## End(Not run)
Coefficient of tail dependence
Description
For data with unit Pareto margins, the coefficient of tail dependence \eta is defined  via 
\Pr(\min(X) > x) = L(x)x^{-1/\eta},
where L(x) is a slowly varying function. Ignoring the latter, several estimators of \eta can be defined. In unit Pareto margins, \eta is a nonnegative shape parameter that can be estimated by fitting a generalized Pareto distribution above a high threshold. In exponential margins, \eta is a scale parameter and the maximum likelihood estimator of the latter is the Hill estimator. Both methods are based on peaks-over-threshold and the user can choose between pointwise confidence obtained through a likelihood ratio test statistic ("lrt") or the Wald statistic ("wald").
Usage
xdep.eta(
  xdat,
  qlev = NULL,
  nq = 40,
  qlim = c(0.8, 0.99),
  estimator = c("emp", "betacop", "gpd", "hill", "kj"),
  confint = c("wald", "lrt"),
  level = 0.95,
  margtrans = c("emp", "sp", "none"),
  ties.method = "random",
  plot = TRUE,
  mqlev = NULL,
  ...
)
Arguments
| xdat | an  | 
| qlev | vector of percentiles between 0 and 1 | 
| nq | number of quantiles of the structural variable at which to form a grid; only used if  | 
| qlim | limits for the sequence  | 
| estimator | string giving estimator to employ | 
| confint | string indicating the type of confidence interval, one of  | 
| level | the confidence level required (default to 0.95). | 
| margtrans | string giving the marginal transformation, one of  | 
| ties.method | string indicating the type of method for  | 
| plot | logical; if  | 
| mqlev | marginal quantile levels for semiparametric estimation for estimator  | 
| ... | additional arguments to  | 
Details
The most common approach for estimation is the empirical survival copula, by evaluating the proportion of sample minima with uniform margins that exceed a given x. An alternative estimator uses a smoothed estimator of the survival copula using Bernstein polynomial, resulting in the so-called betacop estimator. Approximate pointwise confidence intervals for the latter are obtained by assuming the proportion of points is binomial.
Value
a data frame
-  qlev: quantile level of estimates
-  coef: point estimates
-  lower: lower bound of confidence interval
-  upper: lower bound of confidence interval
References
Ledford, A.W. and J. A. Tawn (1996), Statistics for near independence in multivariate extreme values. Biometrika, 83(1), 169–187.
Examples
## Not run: 
set.seed(765)
# Max-stable model
dat <- rmev(n = 1000, d = 2, param = 0.7, model = "log")
xdep.eta(dat, confint = 'wald')
## End(Not run)
Extremal index estimators
Description
The function implements estimators of the extremal index based on interexceedance time and gap of exceedances. The maximum likelihood estimator and iteratively reweighted least square estimators of Suveges (2007) as well as the intervals estimator. The implementation differs from the presentation of the paper in that an iteration limit is enforced to make sure the iterative procedure terminates. Multiple thresholds can be supplied.
Usage
xdep.xindex(
  xdat,
  qlev = 0.95,
  estimator = c("wls", "mle", "intervals"),
  confint = c("none", "wald", "lrt"),
  level = 0.95,
  plot = FALSE,
  warn = FALSE,
  ...
)
Arguments
| xdat | numeric vector of observations | 
| qlev | a vector of quantile levels in (0,1) for estimation of the extremal index. Defaults to 0.95 | 
| estimator | a string specifying the chosen method (only one allowed). Must be either  | 
| confint | string indicating the type of confidence interval, one of  | 
| level | the confidence level required (default to 0.95). | 
| plot | logical; if  | 
| warn | logical; if  | 
| ... | additional arguments, for backward compatibility | 
Details
The iteratively reweighted least square is a procedure based on the gaps of exceedances S_n=T_n-1
The model is first fitted to non-zero gaps, which are rescaled to have unit exponential scale. The slope
between the theoretical quantiles and the normalized gap of exceedances is b=1/\theta,
with intercept a=\log(\theta)/\theta.
As such, the estimate of the extremal index is based on \hat{\theta}=\exp(\hat{a}/\hat{b}).
The weights are chosen in such a way as to reduce the influence of the smallest values.
The estimator exploits the dual role of \theta as the parameter of the mean for
the interexceedance time as well as the mixture proportion for the non-zero component.
The maximum likelihood is based on an independence likelihood for the rescaled gap of exceedances,
namely \bar{F}(u_n)S(u_n). The score equation is equivalent to a quadratic equation in
\theta and the maximum likelihood estimate is available in closed form.
Its validity requires however condition D^{(2)}(u_n) to apply;
this should be checked by the user beforehand.
A warning is emitted if the effective sample size is less than 50 observations.
Value
a data frame
-  qlev: quantile level of estimates
-  coef: point estimates
-  lower: lower bound of confidence interval
-  upper: lower bound of confidence interval
Author(s)
Leo Belzile
References
Ferro and Segers (2003). Inference for clusters of extreme values, JRSS: Series B, 65(2), 545-556.
Suveges (2007) Likelihood estimation of the extremal index. Extremes, 10(1), 41-55.
Suveges and Davison (2010), Model misspecification in peaks over threshold analysis. Annals of Applied Statistics, 4(1), 203-221.
Fukutome, Liniger and Suveges (2015), Automatic threshold and run parameter selection: a climatology for extreme hourly precipitation in Switzerland. Theoretical and Applied Climatology, 120(3), 403-416.
Examples
set.seed(234)
# Moving maxima model with theta=0.5
a <- 1; theta <-  1/(1+a)
sim <- rgev(10001, loc=1/(1+a),scale=1/(1+a),shape=1)
x <- pmax(sim[-length(sim)]*a,sim[-1])
q <- seq(0.9, 0.99, by = 0.01)
xdep.xindex(
  xdat = x,
  qlev = q,
  estimator = "mle",
  confint = "wald")