Type: | Package |
Title: | Estimating Means, Standard Deviations and Visualising Distributions using Quantiles |
Version: | 0.1.2 |
Maintainer: | Udara Kumaranathunga <U.Kumaranathunga@latrobe.edu.au> |
Author: | Udara Kumaranathunga [cre], Alysha De Livera [aut], Luke Prendergast [aut] |
Description: | Implements a novel density-based approach for estimating unknown means, visualizing distributions, and meta-analyses of quantiles. A detailed vignette with example datasets and code to prepare data and run analyses is available at https://bookdown.org/a2delivera/metaquant/. The methods are described in the pre-print by De Livera, Prendergast and Kumaranathunga (2024, <doi:10.48550/arXiv.2411.10971>). |
License: | GPL-3 |
Encoding: | UTF-8 |
Depends: | R (≥ 3.5.0) |
RoxygenNote: | 7.3.2 |
Imports: | gld, sld, stats, ggplot2, plotly, magrittr, dplyr, estmeansd |
NeedsCompilation: | no |
Packaged: | 2025-08-11 06:08:04 UTC; 22102238 |
Repository: | CRAN |
Date/Publication: | 2025-08-21 07:10:02 UTC |
Estimating Unknown Parameters using Five-Number Summary
Description
This function provides estimates of the parameters of the generalised lambda distribution (GLD), the sample mean and the standard deviation from a 5-number summary {minimum, first quartile, median, third quartile, maximum} of a study with sample size n, using the method described in De Livera et al. (2024).
Usage
est.density.five(
min = NULL,
q1 = NULL,
med = NULL,
q3 = NULL,
max = NULL,
n = NULL,
opt = TRUE
)
Arguments
min |
numeric value representing the sample minimum. |
q1 |
numeric value representing the first quartile of the sample. |
med |
numeric value representing the median of the sample. |
q3 |
numeric value representing the third quartile of the sample. |
max |
numeric value representing the sample maximum. |
n |
numeric value specifying the sample size. |
opt |
logical value indicating whether to apply the optimisation step in estimating parameters using theoretical quantiles.
The default value is TRUE. |
Details
De Livera et al. (2024) proposed using the generalised lambda distribution (GLD) to estimate unknown parameters for studies reporting 5-number summaries in the meta-analysis context.
The GLD is a four-parameter family of distributions defined by its quantile function under the FKML parameterisation (Freimer et al., 1988).
De Livera et al. propose that the GLD quantile function can be used to approximate a sample's distribution using 5-point summaries.
The four parameters of the GLD quantile function are a location parameter (\lambda_1), an inverse scale parameter (\lambda_2 > 0), and two shape parameters (\lambda_3 and \lambda_4).
The parameters of the GLD are estimated by formulating and solving a set of simultaneous equations that relate the estimated sample quantiles to their theoretical GLD counterparts.
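As a minimal sketch of the FKML quantile function referred to above, the base R code below evaluates Q(u) = \lambda_1 + (1/\lambda_2) [ (u^{\lambda_3} - 1)/\lambda_3 - ((1 - u)^{\lambda_4} - 1)/\lambda_4 ]. The helper name qgld_fkml and all parameter values are hypothetical and chosen for illustration only; this is not the package's internal code.

```r
# FKML quantile function of the generalised lambda distribution.
# qgld_fkml() is a hypothetical helper written for illustration only.
qgld_fkml <- function(u, lambda1, lambda2, lambda3, lambda4) {
  stopifnot(lambda2 > 0, all(u > 0 & u < 1))
  # Each shape term tends to a logarithm as its shape parameter tends to zero.
  left  <- if (lambda3 == 0) log(u) else (u^lambda3 - 1) / lambda3
  right <- if (lambda4 == 0) log(1 - u) else ((1 - u)^lambda4 - 1) / lambda4
  lambda1 + (left - right) / lambda2
}

probs <- c(0.25, 0.5, 0.75)

# Equal shape parameters give quartiles that are symmetric about lambda1.
q_sym <- qgld_fkml(probs, lambda1 = 10, lambda2 = 2, lambda3 = 0.3, lambda4 = 0.3)
q_sym

# Unequal shape parameters give an asymmetric distribution.
q_skew <- qgld_fkml(probs, lambda1 = 10, lambda2 = 2, lambda3 = 0.3, lambda4 = -0.1)
q_skew
```

With equal shape parameters the median equals the location parameter, which makes the role of \lambda_1 easy to check by hand.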
Value
- parameters: named numeric vector representing the estimated parameters ('location', 'inverse scale', 'shape 1', 'shape 2') of the GLD.
- mean: numeric value of the estimated mean of the sample using the GLD.
- sd: numeric value of the estimated standard deviation of the sample using the GLD.
References
Alysha De Livera, Luke Prendergast, and Udara Kumaranathunga. A novel density-based approach for estimating unknown means, distribution visualisations, and meta-analyses of quantiles, 2024. Pre-print available here: https://arxiv.org/abs/2411.10971.
Marshall Freimer, Georgia Kollia, Govind S Mudholkar, and C Thomas Lin. A study of the generalized tukey lambda family. Communications in Statistics-Theory and Methods, 17(10):3547–3567, 1988.
Warren Gilchrist. Statistical modelling with quantile functions. Chapman and Hall/CRC, 2000.
R. King, B. Dean, S. Klinke, and P. van Staden. gld: Estimation and Use of the Generalised (Tukey) Lambda Distribution. R package version 2.6.7, 2025. doi:10.32614/CRAN.package.gld. https://CRAN.R-project.org/package=gld.
See Also
est.density.minq2max(), est.density.q1q2q3()
Examples
#Generate 5-number summary data
set.seed(123)
n <- 1000
x <- stats::rlnorm(n, 4, 0.3)
quants <- c(min(x), stats::quantile(x, probs = c(0.25, 0.5, 0.75)), max(x))
#Estimate GLD parameters using 5-number summary
params <- est.density.five(min = quants[1], q1 = quants[2], med = quants[3], q3 = quants[4],
                           max = quants[5], n = n, opt = TRUE)$parameters
params
Estimating Unknown Parameters using Minimum, Median and Maximum
Description
This function provides estimates of the parameters of the skew logistic distribution (SLD), the sample mean and the standard deviation from a 3-number summary {minimum, median (q_2), maximum} of a study with sample size n, using the method described in De Livera et al. (2024).
Usage
est.density.minq2max(
min = NULL,
med = NULL,
max = NULL,
n = NULL,
opt = TRUE
)
Arguments
min |
numeric value representing the sample minimum. |
med |
numeric value representing the median of the sample. |
max |
numeric value representing the sample maximum. |
n |
numeric value specifying the sample size. |
opt |
logical value indicating whether to apply the optimisation step in estimating parameters using theoretical quantiles.
The default value is TRUE. |
Details
De Livera et al. (2024) proposed using the skew logistic distribution (SLD) to estimate unknown parameters for studies reporting 3-number summaries in the meta-analysis context.
The quantile-based skew logistic distribution, introduced by Gilchrist (2000) and further modified by van Staden and King (2015), is used to approximate the sample's distribution using 3-point summaries.
The SLD quantile function is defined by three parameters: a location parameter (\lambda), a scale parameter (\eta), and a skewing parameter (\delta).
The parameters of the SLD are estimated by formulating and solving a set of simultaneous equations that relate the estimated sample quantiles to their theoretical SLD counterparts.
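To make the three SLD parameters concrete, here is a small base R sketch of the quantile function as given by van Staden and King (2015), Q(p) = \lambda + \eta [ (1 - \delta) log(p) - \delta log(1 - p) ]. The helper name qsld_sketch and the parameter values are hypothetical; it only illustrates how the skewing parameter moves mass between the tails.

```r
# Quantile function of the quantile-based skew logistic distribution.
# qsld_sketch() is a hypothetical helper, not a function of this package.
qsld_sketch <- function(p, location, scale, skew) {
  stopifnot(scale > 0, skew >= 0, skew <= 1, all(p > 0 & p < 1))
  location + scale * ((1 - skew) * log(p) - skew * log(1 - p))
}

probs <- c(0.05, 0.5, 0.95)

# skew = 0.5 gives a symmetric (logistic) shape centred at the location.
q_half <- qsld_sketch(probs, location = 50, scale = 8, skew = 0.5)

# skew > 0.5 lengthens the upper tail relative to the lower tail.
q_right <- qsld_sketch(probs, location = 50, scale = 8, skew = 0.8)

# Distances from the 5th percentile to the median and from the median to the 95th.
diff(q_half)
diff(q_right)
```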
Value
- parameters: named numeric vector representing the estimated parameters ('location', 'scale', 'skewing') of the SLD.
- mean: numeric value of the estimated mean of the sample using the SLD.
- sd: numeric value of the estimated standard deviation of the sample using the SLD.
References
Alysha De Livera, Luke Prendergast, and Udara Kumaranathunga. A novel density-based approach for estimating unknown means, distribution visualisations, and meta-analyses of quantiles, 2024. Pre-print available here: https://arxiv.org/abs/2411.10971.
Warren Gilchrist. Statistical modelling with quantile functions. Chapman and Hall/CRC, 2000.
P. J. van Staden and R. A. R. King. The quantile-based skew logistic distribution. Statistics & Probability Letters, 96:109–116, 2015.
R. King and P. van Staden. sld: Estimation and Use of the Quantile-Based Skew Logistic Distribution. R package version 1.0.1, 2022. doi:10.32614/CRAN.package.sld. https://CRAN.R-project.org/package=sld.
See Also
est.density.five(), est.density.q1q2q3()
Examples
#Generate 3-number summary data
set.seed(123)
n <- 1000
x <- stats::rlnorm(n, 4, 0.3)
quants <- c(min(x), stats::quantile(x, probs = 0.5), max(x))
#Estimate SLD parameters using 3-number summary
params <- est.density.minq2max(min = quants[1], med = quants[2], max = quants[3],
                               n = n, opt = TRUE)$parameters
params
Estimating Unknown Parameters using First Quartile, Median and Third Quartile
Description
This function provides estimates of the parameters of the skew logistic distribution (SLD), the sample mean and the standard deviation from a 3-number summary {first quartile (q_1), median (q_2), third quartile (q_3)} of a study with sample size n, using the method described in De Livera et al. (2024).
Usage
est.density.q1q2q3(
q1 = NULL,
med = NULL,
q3 = NULL,
n = NULL,
opt = TRUE
)
Arguments
q1 |
numeric value representing the first quartile of the sample. |
med |
numeric value representing the median of the sample. |
q3 |
numeric value representing the third quartile of the sample. |
n |
numeric value specifying the sample size. |
opt |
logical value indicating whether to apply the optimisation step in estimating parameters using theoretical quantiles.
The default value is TRUE. |
Details
De Livera et al. (2024) proposed using the skew logistic distribution (SLD) to estimate unknown parameters for studies reporting 3-number summaries in the meta-analysis context.
The quantile-based skew logistic distribution, introduced by Gilchrist (2000) and further modified by van Staden and King (2015), is used to approximate the sample's distribution using 3-point summaries.
The SLD quantile function is defined by three parameters: a location parameter (\lambda), a scale parameter (\eta), and a skewing parameter (\delta).
The parameters of the SLD are estimated by formulating and solving a set of simultaneous equations that relate the estimated sample quantiles to their theoretical SLD counterparts.
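The idea of matching quantiles to their theoretical counterparts can be sketched in base R for the quartile scenario. The code assumes the SLD quantile function of van Staden and King (2015), Q(p) = \lambda + \eta [ (1 - \delta) log(p) - \delta log(1 - p) ], and simply sets Q(0.25), Q(0.5) and Q(0.75) equal to the reported quartiles. It ignores the sample size n and the optimisation step controlled by opt, so it is a simplified illustration and not the estimation procedure of est.density.q1q2q3(). The quartile values are made up for the example.

```r
# Reported first quartile, median and third quartile (illustrative values).
probs <- c(0.25, 0.5, 0.75)
q_obs <- c(44.6, 54.2, 66.9)

# Q(p) = location + a * log(p) - b * log(1 - p), with a = scale * (1 - skew)
# and b = scale * skew, is linear in (location, a, b), so three quantiles
# give a 3 x 3 linear system.
design <- cbind(1, log(probs), -log(1 - probs))
sol <- solve(design, q_obs)

# Recover scale and skewing from a and b.
scale_hat <- sol[2] + sol[3]
est <- c(location = sol[1], scale = scale_hat, skewing = sol[3] / scale_hat)
est

# The fitted quantiles reproduce the reported quartiles.
fitted <- drop(design %*% sol)
fitted
```

Nothing in this sketch forces the solved skewing value into [0, 1] or the scale to be positive; a full implementation would need to handle summaries for which the unconstrained solution is not a valid SLD.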
Value
- parameters: named numeric vector representing the estimated parameters ('location', 'scale', 'skewing') of the SLD.
- mean: numeric value of the estimated mean of the sample using the SLD.
- sd: numeric value of the estimated standard deviation of the sample using the SLD.
References
Alysha De Livera, Luke Prendergast, and Udara Kumaranathunga. A novel density-based approach for estimating unknown means, distribution visualisations, and meta-analyses of quantiles, 2024. Pre-print available here: https://arxiv.org/abs/2411.10971.
Warren Gilchrist. Statistical modelling with quantile functions. Chapman and Hall/CRC, 2000.
P. J. van Staden and R. A. R. King. The quantile-based skew logistic distribution. Statistics & Probability Letters, 96:109–116, 2015.
R. King and P. van Staden. sld: Estimation and Use of the Quantile-Based Skew Logistic Distribution. R package version 1.0.1, 2022. doi:10.32614/CRAN.package.sld. https://CRAN.R-project.org/package=sld.
See Also
est.density.five(), est.density.minq2max()
Examples
#Generate 3-number summary data
set.seed(123)
n <- 1000
x <- stats::rlnorm(n, 4, 0.3)
quants <- c(stats::quantile(x, probs = c(0.25, 0.5, 0.75)))
#Estimate SLD parameters using 3-number summary
params <- est.density.q1q2q3(q1 = quants[1], med = quants[2], q3 = quants[3],
                             n = n, opt = TRUE)$parameters
params
Estimating Sample Mean using Quantiles
Description
This function estimates the sample mean from a study presenting quantile summary measures with the sample size (n). The quantile summaries can fall into one of the following categories:
- S_1: { minimum, median, maximum }
- S_2: { first quartile, median, third quartile }
- S_3: { minimum, first quartile, median, third quartile, maximum }
The est.mean function implements the newly proposed flexible quantile-based distribution methods for estimating the sample mean (De Livera et al., 2024).
It also incorporates existing methods for estimating sample means described by Luo et al. (2018) and McGrath et al. (2020).
Usage
est.mean(
min = NULL,
q1 = NULL,
med = NULL,
q3 = NULL,
max = NULL,
n = NULL,
method = "gld/sld",
opt = TRUE
)
Arguments
min |
numeric value representing the sample minimum. |
q1 |
numeric value representing the first quartile of the sample. |
med |
numeric value representing the median of the sample. |
q3 |
numeric value representing the third quartile of the sample. |
max |
numeric value representing the sample maximum. |
n |
numeric value specifying the sample size. |
method |
character string specifying the approach used to estimate the sample means. The options are the following:
|
opt |
logical value indicating whether to apply the optimisation step of |
Details
The 'gld/sld' method of est.mean (i.e., the method of De Livera et al. (2024)) uses the following quantile-based distributions:
- Generalised Lambda Distribution (GLD) for estimating the sample mean using 5-number summaries (S_3).
- Skew Logistic Distribution (SLD) for estimating the sample mean using 3-number summaries (S_1 and S_2).
The generalised lambda distribution (GLD) is a four-parameter family of distributions defined by its quantile function under the FKML parameterisation (Freimer et al., 1988).
De Livera et al. propose that the GLD quantile function can be used to approximate a sample's distribution using 5-point summaries.
The four parameters of the GLD quantile function are a location parameter (\lambda_1), an inverse scale parameter (\lambda_2 > 0), and two shape parameters (\lambda_3 and \lambda_4).
The quantile-based skew logistic distribution (SLD), introduced by Gilchrist (2000) and further modified by van Staden and King (2015), is used to approximate the sample's distribution using 3-point summaries.
The SLD quantile function is defined by three parameters: a location parameter (\lambda), a scale parameter (\eta), and a skewing parameter (\delta).
For the 'gld/sld' method, the parameters of the GLD and SLD are estimated by formulating and solving a set of simultaneous equations. These equations relate the estimated sample quantiles to their theoretical counterparts under the respective distribution (GLD or SLD). Finally, the mean for each scenario is calculated by integrating functions of the estimated quantile function.
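The integration step relies on the standard identity that, for a quantile function Q, the mean is the integral of Q(u) over (0, 1) and the second moment is the integral of Q(u)^2. The base R sketch below illustrates that identity only. A log-normal quantile function stands in for a fitted GLD or SLD quantile function, which is an assumption made so that the result can be compared with a known closed form; the names q_fun, m1 and m2 are hypothetical.

```r
# Stand-in quantile function: log-normal with the parameters used in the Examples.
# In practice this would be the fitted GLD or SLD quantile function.
q_fun <- function(u) stats::qlnorm(u, meanlog = 4, sdlog = 0.3)

# First and second moments as integrals of the quantile function over (0, 1).
m1 <- stats::integrate(q_fun, lower = 0, upper = 1)$value
m2 <- stats::integrate(function(u) q_fun(u)^2, lower = 0, upper = 1)$value

# Mean and standard deviation recovered from the quantile function.
c(mean = m1, sd = sqrt(m2 - m1^2))

# Closed-form log-normal mean and standard deviation for comparison.
c(mean = exp(4 + 0.3^2 / 2),
  sd = sqrt((exp(0.3^2) - 1) * exp(2 * 4 + 0.3^2)))
```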
Value
mean: numeric value representing the estimated mean of the sample.
References
Alysha De Livera, Luke Prendergast, and Udara Kumaranathunga. A novel density-based approach for estimating unknown means, distribution visualisations, and meta-analyses of quantiles, 2024. Pre-print available here: https://arxiv.org/abs/2411.10971.
Dehui Luo, Xiang Wan, Jiming Liu, and Tiejun Tong. Optimally estimating the sample mean from the sample size, median, mid-range, and/or mid-quartile range. Statistical methods in medical research, 27(6):1785–1805, 2018.
Xiang Wan, Wenqian Wang, Jiming Liu, and Tiejun Tong. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC medical research methodology, 14:1–13, 2014.
Sean McGrath, XiaoFei Zhao, Russell Steele, Brett D Thombs, Andrea Benedetti, and DEPRESsion Screening Data (DEPRESSD) Collaboration. Estimating the sample mean and standard deviation from commonly reported quantiles in meta-analysis. Statistical methods in medical research, 29(9):2520–2537, 2020.
Marshall Freimer, Georgia Kollia, Govind S Mudholkar, and C Thomas Lin. A study of the generalized tukey lambda family. Communications in Statistics-Theory and Methods, 17(10):3547–3567, 1988.
Warren Gilchrist. Statistical modelling with quantile functions. Chapman and Hall/CRC, 2000.
P. J. van Staden and R. A. R. King. The quantile-based skew logistic distribution. Statistics & Probability Letters, 96:109–116, 2015.
R. King, B. Dean, S. Klinke, and P. van Staden. gld: Estimation and Use of the Generalised (Tukey) Lambda Distribution. R package version 2.6.7, 2025. doi:10.32614/CRAN.package.gld. https://CRAN.R-project.org/package=gld.
R. King and P. van Staden. sld: Estimation and Use of the Quantile-Based Skew Logistic Distribution. R package version 1.0.1, 2022. doi:10.32614/CRAN.package.sld. https://CRAN.R-project.org/package=sld.
Examples
#Generate 5-point summary data
set.seed(123)
n <- 1000
x <- stats::rlnorm(n, 4, 0.3)
quants <- c(min(x), stats::quantile(x, probs = c(0.25, 0.5, 0.75)), max(x))
obs_mean <- mean(x)
#Estimate sample mean using s3 (5 number summary)
est_mean_s3 <- est.mean(min = quants[1], q1 = quants[2], med = quants[3], q3 = quants[4],
                        max = quants[5], n = n, method = "gld/sld")
est_mean_s3
#Estimate sample mean using s1 (min, median, max)
est_mean_s1 <- est.mean(min = quants[1], med = quants[3], max = quants[5],
                        n = n, method = "gld/sld")
est_mean_s1
#Estimate sample mean using s2 (q1, median, q3)
est_mean_s2 <- est.mean(q1 = quants[2], med = quants[3], q3 = quants[4],
                        n = n, method = "gld/sld")
est_mean_s2
Estimating Sample Standard Deviation using Quantiles
Description
This function estimates the sample standard deviation from a study presenting quantile summary measures with the sample size (n). The quantile summaries can fall into one of the following categories:
- S_1: { minimum, median, maximum }
- S_2: { first quartile, median, third quartile }
- S_3: { minimum, first quartile, median, third quartile, maximum }
The est.sd function implements the newly proposed flexible quantile-based distribution methods for estimating the sample standard deviation by De Livera et al. (2024), as well as existing methods for estimating sample standard deviations by Shi et al. (2020) and McGrath et al. (2020).
Usage
est.sd(
min = NULL,
q1 = NULL,
med = NULL,
q3 = NULL,
max = NULL,
n = NULL,
method = "shi/wan",
opt = TRUE
)
Arguments
min |
numeric value representing the sample minimum. |
q1 |
numeric value representing the first quartile of the sample. |
med |
numeric value representing the median of the sample. |
q3 |
numeric value representing the third quartile of the sample. |
max |
numeric value representing the sample maximum. |
n |
numeric value specifying the sample size. |
method |
character string specifying the approach used to estimate the sample standard deviations. The options are the following:
|
opt |
logical value indicating whether to apply the optimisation step of |
Details
For details of the 'gld/sld' method, see est.mean.
Value
sd: numeric value representing the estimated standard deviation of the sample.
References
Alysha De Livera, Luke Prendergast, and Udara Kumaranathunga. A novel density-based approach for estimating unknown means, distribution visualisations, and meta-analyses of quantiles. Submitted for Review, 2024, pre-print available here: https://arxiv.org/abs/2411.10971
Jiandong Shi, Dehui Luo, Hong Weng, Xian-Tao Zeng, Lu Lin, Haitao Chu, and Tiejun Tong. Optimally estimating the sample standard deviation from the five-number summary. Research synthesis methods, 11(5):641–654, 2020.
Xiang Wan, Wenqian Wang, Jiming Liu, and Tiejun Tong. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC medical research methodology, 14:1–13, 2014.
Sean McGrath, XiaoFei Zhao, Russell Steele, Brett D Thombs, Andrea Benedetti, and DEPRESsion Screening Data (DEPRESSD) Collaboration. Estimating the sample mean and standard deviation from commonly reported quantiles in meta-analysis. Statistical methods in medical research, 29(9):2520–2537, 2020.
Examples
#Generate 5-point summary data
set.seed(123)
n <- 1000
x <- stats::rlnorm(n, 5, 0.5)
quants <- c(min(x), stats::quantile(x, probs = c(0.25, 0.5, 0.75)), max(x))
obs_sd <- sd(x)
#Estimate sample SD using s3 (5 number summary)
est_sd_s3 <- est.sd(min = quants[1], q1 = quants[2], med = quants[3], q3 = quants[4],
                    max = quants[5], n = n, method = "gld/sld")
est_sd_s3
#Estimate sample SD using s1 (min, median, max)
est_sd_s1 <- est.sd(min = quants[1], med = quants[3], max = quants[5],
                    n = n, method = "gld/sld")
est_sd_s1
#Estimate sample SD using s2 (q1, median, q3)
est_sd_s2 <- est.sd(q1 = quants[2], med = quants[3], q3 = quants[4],
                    n = n, method = "gld/sld")
est_sd_s2
Visualising Densities using Quantiles
Description
The function estimates and visualizes the density curves of one-group or two-group studies presenting quantile summary measures with the sample size (n). The quantile summaries can fall into one of the following categories:
- S_1: { minimum, median, maximum }
- S_2: { first quartile, median, third quartile }
- S_3: { minimum, first quartile, median, third quartile, maximum }
The plotdist function uses the following quantile-based distribution methods for visualising densities using quantiles (De Livera et al., 2024):
- Generalised Lambda Distribution (GLD) when 5-number summaries are presented (S_3).
- Skew Logistic Distribution (SLD) when 3-number summaries are presented (S_1 and S_2).
Usage
plotdist(
data,
xmin = NULL,
xmax = NULL,
ymax = NULL,
length.out = 1000,
title = "",
xlab = "x",
ylab = "Density",
line.size = 0.5,
title.size = 12,
lab.size = 10,
color.g1 = "pink",
color.g2 = "skyblue",
color.g1.pooled = "red",
color.g2.pooled = "blue",
label.g1 = NULL,
label.g2 = NULL,
display.index = FALSE,
display.legend = FALSE,
pooled.dist = FALSE,
pooled.only = FALSE,
opt = TRUE
)
Arguments
data |
data frame containing the quantile summary data. For one-group studies, the input dataset may contain the following columns depending on the quantile scenario:
For two-group studies, the data frame may also contain the following columns for the second group: |
xmin |
numeric value for the lower limit of the x-axis for density calculation. It is recommended to set this to a value smaller than the smallest value across the quantile summaries to ensure the density curve is fully captured.
If |
xmax |
numeric value for the upper limit of the x-axis for density calculation. It is recommended to set this to a value larger than the largest value across the quantile summaries to ensure the density curve is fully captured.
If |
ymax |
numeric value for the upper limit of the y-axis. If NULL, the highest density value will be used. |
length.out |
integer specifying the number of points along the x-axis for density calculation. Default is 1000. |
title |
character string for the plot title. Default is an empty string. |
xlab |
character string for the x-axis label. Default is "x". |
ylab |
character string for the y-axis label. Default is "Density". |
line.size |
numeric. Thickness of the density curve lines. Default is 0.5. |
title.size |
numeric. Font size for the plot title. Default is 12. |
lab.size |
numeric. Font size for axis labels. Default is 10. |
color.g1 |
character string specifying the color for individual density curves of group 1 for each study (row). Default is "pink". |
color.g2 |
character string specifying the color for individual density curves of group 2 for each study (row). Default is "skyblue". |
color.g1.pooled |
character string specifying the color for the pooled density curve of group 1. Default is "red". |
color.g2.pooled |
character string specifying the color for the pooled density curve of group 2. Default is "blue". |
label.g1 |
character string indicating the label or name for group 1 (e.g., 'Treatment'). |
label.g2 |
character string indicating the label or name for group 2 (e.g., 'Control'). If |
display.index |
logical. If |
display.legend |
logical. If |
pooled.dist |
logical. If |
pooled.only |
logical. If |
opt |
logical value indicating whether to apply the optimization step when estimating GLD or SLD parameters. The default value is TRUE. |
Details
The generalised lambda distribution (GLD) is a four-parameter family of distributions defined by its quantile function under the FKML parameterisation (Freimer et al., 1988).
De Livera et al. propose that the GLD quantile function can be used to approximate a sample's distribution using 5-point summaries.
The four parameters of the GLD quantile function are a location parameter (\lambda_1), an inverse scale parameter (\lambda_2 > 0), and two shape parameters (\lambda_3 and \lambda_4).
The quantile-based skew logistic distribution (SLD), introduced by Gilchrist (2000) and further modified by van Staden and King (2015), is used to approximate the sample's distribution using 3-point summaries.
The SLD quantile function is defined by three parameters: a location parameter (\lambda), a scale parameter (\eta), and a skewing parameter (\delta).
The parameters of the GLD and SLD are estimated by formulating and solving a series of simultaneous equations that relate the estimated quantiles to the population counterparts of the respective distribution (GLD or SLD). The plotdist function uses these estimated parameters to compute the density data using the dgl function from the gld package and the dsl function from the sld package.
To generate pooled density plots, use the pooled.dist or pooled.only arguments as described in the Arguments section.
The pooled density curves represent a weighted average of individual study densities, with weights determined by sample sizes. The method is similar to obtaining pooled estimates of effects in a standard meta-analysis, and it serves as a way to visualize combined estimated distributional information across studies.
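A sample-size-weighted average of densities can be sketched in a few lines of base R. The code below assumes weights proportional to n, which is one natural reading of "weights determined by sample sizes" but is not confirmed here as the exact weighting used by plotdist. Normal densities stand in for the fitted GLD or SLD densities, and the means and standard deviations are made up for illustration.

```r
# Evaluation grid and per-study sample sizes (illustrative values).
x <- seq(0, 130, length.out = 1000)
n <- c(226, 230, 200)

# One column of density values per study. Normal densities are stand-ins
# for the fitted GLD/SLD densities.
dens <- cbind(
  stats::dnorm(x, mean = 66, sd = 15),
  stats::dnorm(x, mean = 68, sd = 14),
  stats::dnorm(x, mean = 63, sd = 16)
)

# Weights proportional to sample size (an assumption about the weighting).
w <- n / sum(n)

# Pooled density: weighted average of the study densities at each grid point.
pooled <- drop(dens %*% w)

# Because the weights sum to 1, the pooled curve still integrates to about 1.
sum(pooled) * (x[2] - x[1])
```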
Value
An interactive plotly object visualizing the estimated density curve(s) for one or two groups.
References
Alysha De Livera, Luke Prendergast, and Udara Kumaranathunga. A novel density-based approach for estimating unknown means, distribution visualisations, and meta-analyses of quantiles. Submitted for Review, 2024, pre-print available here: https://arxiv.org/abs/2411.10971
Marshall Freimer, Georgia Kollia, Govind S Mudholkar, and C Thomas Lin. A study of the generalized tukey lambda family. Communications in Statistics-Theory and Methods, 17(10):3547–3567, 1988.
Warren Gilchrist. Statistical modelling with quantile functions. Chapman and Hall/CRC, 2000.
P. J. van Staden and R. A. R. King. The quantile-based skew logistic distribution. Statistics & Probability Letters, 96:109–116, 2015.
Examples
#Example dataset of 3-point summaries (min, med, max) for 2 groups
data_3num_2g <- data.frame(
  study.index = c("Study 1", "Study 2", "Study 3"),
  min.g1 = c(15, 15, 13),
  med.g1 = c(66, 68, 63),
  max.g1 = c(108, 101, 100),
  n.g1 = c(226, 230, 200),
  min.g2 = c(18, 19, 15),
  med.g2 = c(73, 82, 81),
  max.g2 = c(110, 115, 100),
  n.g2 = c(226, 230, 200)
)
print(data_3num_2g)
#Density plots of two groups along with the pooled plots
plot_2g <- plotdist(
  data_3num_2g,
  xmin = 10,
  xmax = 125,
  title = "Example Density Plots of Two Groups",
  xlab = "x data",
  color.g1 = "skyblue",
  color.g2 = "pink",
  color.g1.pooled = "blue",
  color.g2.pooled = "red",
  label.g1 = "Treatment",
  label.g2 = "Control",
  display.legend = TRUE,
  pooled.dist = TRUE
)
print(plot_2g)