Type: | Package |
Title: | Time-Based Rolling Functions |
Version: | 0.1.7 |
Description: | Provides rolling statistical functions based on date and time windows instead of n-lagged observations. |
URL: | https://mps9506.github.io/tbrf/ |
BugReports: | https://github.com/mps9506/tbrf/issues |
License: | GPL-3 | file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.1 |
Depends: | R (≥ 2.10), ggplot2 (≥ 2.2.1) |
Imports: | boot, dplyr, lubridate, purrr, rlang, stats, tibble, tidyr |
Suggests: | spelling, covr, testthat, knitr, rmarkdown |
VignetteBuilder: | knitr |
Language: | en-US |
Config/Needs/website: | mps9506/mpsTemplates |
NeedsCompilation: | no |
Packaged: | 2025-08-19 13:46:51 UTC; michael.schramm |
Author: | Michael Schramm |
Maintainer: | Michael Schramm <mpschramm@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-08-19 14:10:02 UTC |
Time-Based Rolling Functions
Description
Provides rolling statistical functions based on date and time windows instead of n-lagged observations.
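To make the distinction concrete, the following base R sketch contrasts a window defined by elapsed time with a window defined by a fixed number of observations. The data, the 7-day window, and the treatment of the window boundary are assumptions chosen for illustration; this is not the package's implementation.

```r
# Hypothetical, irregularly spaced observations
dates  <- as.Date(c("2020-01-01", "2020-01-02", "2020-01-10",
                    "2020-01-11", "2020-02-15"))
values <- c(1, 2, 3, 4, 5)

# Time-based window: mean of every observation in the 7 days
# up to and including each date
time_roll <- vapply(seq_along(dates), function(i) {
  in_window <- dates <= dates[i] & dates > dates[i] - 7
  mean(values[in_window])
}, numeric(1))

# Observation-based window: mean of the current and previous observation,
# regardless of how far apart in time they are
obs_roll <- vapply(seq_along(values), function(i) {
  mean(values[max(1, i - 1):i])
}, numeric(1))

data.frame(dates, values, time_roll, obs_roll)
```

With irregular sampling the two columns diverge: the last observation is averaged with one taken more than a month earlier in `obs_roll`, but stands alone in `time_roll`.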
Author(s)
Michael Schramm
See Also
Useful links:
- https://mps9506.github.io/tbrf/
- Report bugs at https://github.com/mps9506/tbrf/issues
Dissolved oxygen measurements from the Tres Palacios River
Description
Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The 'Average_DO' field is the mean of dissolved oxygen concentrations (mg/L) measured at a field site on that day. The 'Min_DO' field is the minimum dissolved oxygen concentration measured at that site on that day.
Usage
data(Dissolved_Oxygen)
Format
A data frame with 236 rows and 6 variables:
- Station_ID
unique water quality monitoring station identifier
- Date
sampling date in yyyy-mm-dd format
- Param_Code
unique parameter code
- Param_Desc
parameter description with units
- Average_DO
mean of dissolved oxygen measurement, in mg/L
- Min_DO
minimum of dissolved oxygen measurement, in mg/L
Source
https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm
Enterococci bacteria measurements from the Tres Palacios River
Description
Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The 'Value' field is the lab-measured value of Enterococci bacteria (MPN/100 mL) from grab samples collected at 'Station_ID' on the Tres Palacios River on 'Date'.
Usage
data(Entero)
Format
A data frame with 212 rows and 5 variables:
- Station_ID
unique water quality monitoring station identifier
- Date
sampling date in yyyy-mm-dd format
- Param_Code
unique parameter code
- Param_Desc
parameter description with units
- Value
Enterococci concentration, in MPN/100 mL
Source
https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm
Confidence Intervals for Binomial Probabilities
Description
An implementation of the binconf function in Frank Harrell's Hmisc package. Produces 1-alpha confidence intervals for binomial probabilities.
Usage
binom_ci(
x,
n,
alpha = 0.05,
method = c("wilson", "exact", "asymptotic"),
return.df = FALSE
)
Arguments
x |
vector containing the number of "successes" for binomial variates. |
n |
vector containing the numbers of corresponding observations. |
alpha |
probability of a type I error, so confidence coefficient = 1-alpha. |
method |
character string specifying which method to use. The "exact" method uses the F distribution to compute exact (based on the binomial cdf) intervals; the "wilson" interval is score-test-based; and the "asymptotic" is the text-book, asymptotic normal interval. Following Agresti and Coull, the Wilson interval is to be preferred and so is the default. |
return.df |
logical flag to indicate that a data frame rather than a matrix be returned. |
Author(s)
Frank Harrell, modified by Michael Schramm
References
A. Agresti and B.A. Coull, Approximate is better than "exact" for interval estimation of binomial proportions, American Statistician, 52:119–126, 1998.
R.G. Newcombe, Logit confidence intervals and the inverse sinh transformation, American Statistician, 55:200–202, 2001.
L.D. Brown, T.T. Cai and A. DasGupta, Interval estimation for a binomial proportion (with discussion), Statistical Science, 16:101–133, 2001.
Examples
binom_ci(46,50,method="wilson")
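As a rough illustration of the score-test-based ("wilson") interval described under `method`, the sketch below evaluates the closed-form expression in base R for a single `x` and `n`. The helper name `wilson_sketch` is hypothetical, and this is not the package's source; it only reproduces the textbook formula.

```r
# Hypothetical helper: Wilson score interval for x successes in n trials
wilson_sketch <- function(x, n, alpha = 0.05) {
  p <- x / n                       # point estimate
  z <- qnorm(1 - alpha / 2)        # standard normal quantile
  denom  <- 1 + z^2 / n
  center <- (p + z^2 / (2 * n)) / denom
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(PointEst = p, Lower = center - half, Upper = center + half)
}

res <- wilson_sketch(46, 50)
res
```

Base R's `prop.test(46, 50, correct = FALSE)` reports the same interval, which gives a quick cross-check.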
Calculates the Geometric Mean
Description
Originally from Paul McMurdie, Ben Bolker, and Gregor on Stack Overflow: https://stackoverflow.com/questions/2602583/geometric-mean-is-there-a-built-in
Usage
gm_mean(x, na.rm = TRUE, zero.propagate = FALSE)
Arguments
x |
vector of numeric values |
na.rm |
logical TRUE/FALSE remove NA values |
zero.propagate |
logical TRUE/FALSE. Allows the optional propagation of zeros. |
Value
the geometric mean of the vector
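For orientation, a geometric mean of strictly positive values can be computed as the exponential of the mean of the logs. The helper below, `geo_mean_sketch`, is a hypothetical minimal version. It does not attempt to reproduce how `gm_mean` treats zeros or negative values, which is governed by `zero.propagate`.

```r
# Hypothetical helper: geometric mean of strictly positive values
geo_mean_sketch <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]     # drop missing values first
  exp(mean(log(x)))                # exp of the mean log
}

gm1 <- geo_mean_sketch(c(1, 10, 100))   # 10, up to floating-point rounding
gm2 <- geo_mean_sketch(c(2, 8, NA))     # 4, the NA is dropped
```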
Returns the Geomean and CI
Description
Generates Geometric mean and confidence intervals using bootstrap.
Usage
gm_mean_ci(
window,
conf = 0.95,
na.rm = TRUE,
type = "basic",
R = 1000,
parallel = "no",
ncpus = getOption("boot.ncpus", 1L),
cl = NULL,
zero.propagate = FALSE
)
Arguments
window |
vector of data values |
conf |
confidence level of the required interval. |
na.rm |
logical |
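type |
character string, one of the interval types accepted by boot::boot.ci; defaults to "basic". |
R |
the number of bootstrap replicates; see boot::boot. |
parallel |
the type of parallel operation to be used (if any); see boot::boot. |
ncpus |
integer: number of processes to be used in parallel operation; see boot::boot. |
cl |
optional parallel or snow cluster for use if parallel = "snow"; see boot::boot. |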
zero.propagate |
logical |
Value
named list with geometric mean and (optionally) specified confidence interval
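The arguments above mirror those of `boot::boot` and `boot::boot.ci`. The sketch below shows how a bootstrap interval for a geometric mean might be obtained with those two functions directly. The simulated data, the seed, and the statistic function `geo_mean_stat` are assumptions for illustration, and the package's internals may differ.

```r
library(boot)

set.seed(42)
x <- rlnorm(50, meanlog = 1, sdlog = 0.5)   # hypothetical positive data

# Statistic in the form boot() expects: data plus resampled indices
geo_mean_stat <- function(data, idx) exp(mean(log(data[idx])))

b  <- boot(data = x, statistic = geo_mean_stat, R = 1000)
ci <- boot.ci(b, conf = 0.95, type = "basic")

estimate  <- b$t0              # geometric mean of the original sample
ci_limits <- ci$basic[1, 4:5]  # lower and upper endpoints
```

Swapping `geo_mean_stat` for a statistic built on `mean` or `median` gives the analogous intervals for those summaries.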
List NA
Description
function to return tibble with NAs as specified
Usage
list_NA(x)
Arguments
x |
named vector |
Value
empty tibble
Returns the mean and CI
Description
Generates mean and confidence intervals using bootstrap.
Usage
mean_ci(
window,
conf = 0.95,
na.rm = TRUE,
type = "basic",
R = 1000,
parallel = "no",
ncpus = getOption("boot.ncpus", 1L),
cl = NULL
)
Arguments
window |
vector of data values |
conf |
confidence level of the required interval. |
na.rm |
logical |
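type |
character string, one of the interval types accepted by boot::boot.ci; defaults to "basic". |
R |
the number of bootstrap replicates; see boot::boot. |
parallel |
the type of parallel operation to be used (if any); see boot::boot. |
ncpus |
integer: number of processes to be used in parallel operation; see boot::boot. |
cl |
optional parallel or snow cluster for use if parallel = "snow"; see boot::boot. |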
Value
named list with mean and (optionally) specified confidence interval
Returns the median and CI
Description
Generates median and confidence intervals using bootstrap.
Usage
median_ci(
window,
conf = 0.95,
na.rm = TRUE,
type = "basic",
R = 1000,
parallel = "no",
ncpus = getOption("boot.ncpus", 1L),
cl = NULL
)
Arguments
window |
vector of data values |
conf |
confidence level of the required interval. |
na.rm |
logical |
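type |
character string, one of the interval types accepted by boot::boot.ci; defaults to "basic". |
R |
the number of bootstrap replicates; see boot::boot. |
parallel |
the type of parallel operation to be used (if any); see boot::boot. |
ncpus |
integer: number of processes to be used in parallel operation; see boot::boot. |
cl |
optional parallel or snow cluster for use if parallel = "snow"; see boot::boot. |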
Value
named list with median and (optionally) specified confidence interval
Open Window
Description
calculates the period at each row from the row of interest
Usage
open_window(x, tcolumn, unit = "years", n, i, na.pad)
Arguments
x |
dataframe |
tcolumn |
time column |
unit |
unit |
n |
desired n |
i |
row number |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. |
Value
vector
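One way to picture "the period at each row from the row of interest" is as a vector of elapsed times measured back from row `i`. The base R sketch below uses days and a half-open window. The data, the unit, and the choice of which boundary is inclusive are all assumptions, and the package itself may compute this differently.

```r
# Hypothetical sampling dates
dates <- as.Date(c("2020-01-01", "2020-01-05", "2020-01-09", "2020-01-20"))

i <- 3   # row of interest
n <- 7   # window length, in days

# Elapsed days from each row to row i (negative for later rows)
elapsed <- as.numeric(difftime(dates[i], dates, units = "days"))

# Rows that fall inside the n-day window ending at row i
in_window <- elapsed >= 0 & elapsed < n
```

Here `elapsed` is 8, 4, 0, -11, so only rows 2 and 3 fall inside the window.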
Step ribbon statistic
Description
Provides stairstep values for ribbon plots. This was originally in Bob Rudis's ggalt package which is no longer on CRAN.
Usage
stat_stepribbon(
mapping = NULL,
data = NULL,
geom = "ribbon",
position = "identity",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
direction = "hv",
...
)
Arguments
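mapping |
Set of aesthetic mappings created by aes(). |
data |
The data to be displayed in this layer. There are three options: If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(). A data.frame, or other object, will override the plot data. A function will be called with a single argument, the plot data; the return value must be a data.frame and will be used as the layer data. |
geom |
which geom to use; defaults to "ribbon" |
position |
A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. |
na.rm |
If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If FALSE, overrides the default aesthetics, rather than combining with them. |
direction |
direction of the stairs; defaults to "hv" |
... |
Other arguments passed on to layer(). |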
Author(s)
Bob Rudis
References
https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/9cFWHaH1CPs
Examples
x <- 1:10
df <- data.frame(x=x, y=x+10, ymin=x+7, ymax=x+12)
gg <- ggplot(df, aes(x, y))
gg <- gg + geom_ribbon(aes(ymin=ymin, ymax=ymax),
stat="stepribbon", fill="#b2b2b2")
gg <- gg + geom_step(color="#2b2b2b")
gg
gg <- ggplot(df, aes(x, y))
gg <- gg + geom_ribbon(aes(ymin=ymin, ymax=ymax),
stat="stepribbon", fill="#b2b2b2",
direction="hv")
gg <- gg + geom_step(color="#2b2b2b")
gg
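To show what "stairstep values" means in practice, the base R sketch below expands a set of points into the horizontal-then-vertical ("hv") sequence that a step plot traces. The helper `stairstep_hv` is hypothetical and is not the code used by this package or by ggalt; it only illustrates the transformation.

```r
# Hypothetical helper: expand (x, y) points into "hv" stair steps
stairstep_hv <- function(x, y) {
  n  <- length(x)
  xs <- rep(seq_len(n), each = 2)[-1]       # 1, 2, 2, 3, 3, ..., n, n
  ys <- rep(seq_len(n), each = 2)[-2 * n]   # 1, 1, 2, 2, ..., n - 1, n - 1, n
  data.frame(x = x[xs], y = y[ys])
}

steps <- stairstep_hv(x = 1:3, y = c(10, 20, 30))
steps
```

Each original y value is held until the next x is reached and only then jumps. Applying the same expansion to `ymin` and `ymax` gives a stepped ribbon.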
Time-Based Rolling Binomial Probability
Description
Produces a rolling time-window based vector of binomial probability and confidence intervals.
Usage
tbr_binom(.tbl, x, tcolumn, unit = "years", n, alpha = 0.05, na.pad = TRUE)
Arguments
.tbl |
dataframe with two variables. |
x |
indicates the variable column containing "success" and "failure" observations coded as 1 or 0. |
tcolumn |
indicates the variable column containing Date or Date-Time values. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window in the selected units. |
alpha |
numeric, probability of a type 1 error, so confidence coefficient = 1-alpha |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE' |
Value
tibble with binomial point estimate and confidence intervals.
Examples
## Generate Sample Data
df <- tibble::tibble(
date = sample(seq(as.Date('2000-01-01'), as.Date('2015/12/30'), by = "day"), 100),
value = rbinom(100, 1, 0.25)
)
## Run Function
tbr_binom(df, x = value,
tcolumn = date, unit = "years", n = 5,
alpha = 0.1, na.pad = FALSE)
Binomial test based on time window
Description
Binomial test based on time window
Usage
tbr_binom_window(x, tcolumn, unit = "years", n, i, alpha, na.pad)
Arguments
x |
column containing "success" and "failure" observations as 0 or 1 |
tcolumn |
formatted time column |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
rows |
alpha |
numeric, probability of a type 1 error, so confidence coefficient = 1-alpha |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. |
Value
list
Time-Based Rolling Geometric Mean
Description
Produces a rolling time-window based vector of geometric means and confidence intervals.
Usage
tbr_gmean(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)
Arguments
.tbl |
a data frame with at least two variables: a time column formatted as date or date/time, and a value column. |
x |
column containing the values to calculate the geometric mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE' |
... |
additional arguments passed to gm_mean_ci |
Value
tibble with columns for the rolling geometric mean and upper and lower confidence levels.
Examples
## Return a tibble with new rolling geometric mean column
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)
## Not run:
## Return a tibble with rolling geometric mean and 95% CI
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)
Geometric mean based on a time-window
Description
Geometric mean based on a time-window
Usage
tbr_gmean_window(x, tcolumn, unit = "years", n, i, na.pad, ...)
Arguments
x |
column containing the values to calculate the geometric mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
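i |
row |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. |
... |
additional arguments passed to gm_mean_ci |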
Value
list
Time-Based Rolling Mean
Description
Produces a rolling time-window based vector of means and confidence intervals.
Usage
tbr_mean(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)
Arguments
.tbl |
a data frame with at least two variables: a time column formatted as date or date/time, and a value column. |
x |
column containing the numeric values to calculate the mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE' |
... |
additional arguments passed to mean_ci |
Value
tibble with columns for the rolling mean and upper and lower confidence intervals.
Examples
## Return a tibble with new rolling mean column
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)
## Not run:
## Return a tibble with rolling mean and 95% CI
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)
Mean Based on a Time-Window
Description
Mean Based on a Time-Window
Usage
tbr_mean_window(x, tcolumn, unit = "years", n, i, na.pad, ...)
Arguments
x |
column containing the values to calculate the mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. |
... |
additional arguments passed to mean_ci |
Value
list
Time-Based Rolling Median
Description
Produces a rolling time-window based vector of medians and confidence intervals.
Usage
tbr_median(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)
Arguments
.tbl |
a data frame with at least two variables: a time column formatted as date or date/time, and a value column. |
x |
column containing the numeric values to calculate the median. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE' |
... |
additional arguments passed to median_ci |
Value
tibble with columns for the rolling median and upper and lower confidence intervals.
Examples
## Return a tibble with new rolling median column
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years",
n = 5, na.pad = FALSE)
## Not run:
## Return a tibble with rolling median and 95% CI
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)
Median Based on a Time-Window
Description
Median Based on a Time-Window
Usage
tbr_median_window(x, tcolumn, unit = "years", n, i, na.pad, ...)
Arguments
x |
column containing the values to calculate the median. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. |
... |
additional arguments passed to median_ci |
Value
list
Use Generic Functions with Time Windows
Description
Use Generic Functions with Time Windows
Usage
tbr_misc(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, func, ...)
Arguments
.tbl |
a data frame with at least two variables: a time column formatted as date or date/time, and a value column. |
x |
column containing the values the function is applied to. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE' |
func |
specified function |
... |
optional additional arguments passed to function |
Value
tibble
Examples
tbr_misc(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years",
n = 5, na.pad = FALSE, func = mean)
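The idea of passing an arbitrary function through `func` can be sketched in base R as a higher-order helper that applies it to whatever values fall inside each time window. The helper `roll_by_time`, its day-based window, and its reading of `na.pad` (return `NA` until at least one full window of record precedes the row) are assumptions for illustration, not the package's code.

```r
# Hypothetical helper: apply `func` over trailing windows of `days` days
roll_by_time <- function(dates, values, days, func, ..., na.pad = TRUE) {
  vapply(seq_along(dates), function(i) {
    # Incomplete window: less than `days` of record precedes this row
    if (na.pad && as.numeric(dates[i] - min(dates)) < days) return(NA_real_)
    in_window <- dates <= dates[i] & dates > dates[i] - days
    func(values[in_window], ...)
  }, numeric(1))
}

dates  <- as.Date("2020-01-01") + c(0, 3, 8, 9, 15)
values <- c(4, 8, 6, 2, 10)

padded   <- roll_by_time(dates, values, days = 7, func = max)
unpadded <- roll_by_time(dates, values, days = 7, func = max, na.pad = FALSE)
```

With padding, the first two rows are `NA` because fewer than 7 days of data precede them. Without padding, the same rows report the maximum of whatever observations are available.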
Time-Based Rolling Standard Deviation
Description
Time-Based Rolling Standard Deviation
Usage
tbr_sd(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE, na.pad = TRUE)
Arguments
.tbl |
a data frame with at least two variables: a time column formatted as date or date/time, and a value column. |
x |
column containing the values to calculate the standard deviation. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.rm |
logical. Should missing values be removed? |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE' |
Value
tibble with column for the rolling sd.
Examples
tbr_sd(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)
Standard Deviation Based on a Time-Window
Description
Standard Deviation Based on a Time-Window
Usage
tbr_sd_window(x, tcolumn, unit = "years", n, i, na.pad, ...)
Arguments
x |
column containing the values to calculate the standard deviation. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. |
... |
additional arguments passed to stats::sd() |
Value
numeric value
Time-Based Rolling Sum
Description
Time-Based Rolling Sum
Usage
tbr_sum(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE, na.pad = TRUE)
Arguments
.tbl |
a data frame with at least two variables: a time column formatted as date or date/time, and a value column. |
x |
column containing the values to calculate the sum. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.rm |
logical. Should missing values be removed? |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE' |
Value
dataframe with column for the rolling sum.
Examples
tbr_sum(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)
Sum Based on a Time-Window
Description
Sum Based on a Time-Window
Usage
tbr_sum_window(x, tcolumn, unit = "years", n, i, na.rm, na.pad)
Arguments
x |
column containing the values to calculate the sum. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
na.rm |
logical. Should missing values be removed? |
na.pad |
logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. |
Value
numeric value
tbrf extensions to ggplot2
Description
tbrf makes use of the ggproto class system to extend the functionality of ggplot2. In general, the actual classes should be of little interest to users, as the standard ggplot2 API of using geom_* and stat_* functions to build up the plot is encouraged.
References
https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/9cFWHaH1CPs