Following a trial's recruitment is important for timing analyses and for ensuring that the trial does not run for too long (longer trials are more expensive). accrualPlot provides tools for easily creating recruitment plots and even for predicting when a trial will have recruited all of its participants.
The package is loaded like any other:
library(accrualPlot)
#> Loading required package: lubridate
#> Warning: package 'lubridate' was built under R version 4.0.5
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
accrual_df
To work with accrualPlot, we need some data, specifically dates and, optionally, site identifiers. Here’s some data that we will use in the following examples.
set.seed(1234)
x <- as.Date("2020-12-07") + sample(c(-20:20), 50, replace = TRUE)
site <- as.factor(paste0("Site", sample(1:3, 50, replace = TRUE)))
accrual_dfs are simply data frames with the number of participants recruited on each day on which participants were recruited, together with the day the trial (or site) began recruiting.
Monocentric trials obviously have only a single site, so we only need the x object we just created. We can pass this into the accrual_create_df function.
df <- accrual_create_df(x)
print(df, head = TRUE)
#> 50 participants recruited between 2020-11-18 and 2020-12-27
#> Date Freq Cumulative
#> 1 2020-11-18 0 0
#> 2 2020-11-18 2 2
#> 3 2020-11-19 2 4
#> 4 2020-11-20 3 7
#> 5 2020-11-21 2 9
#> 6 2020-11-22 2 11
In this case, the accrual_df is a single data frame.
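To make the structure concrete, roughly the same Date, Freq and Cumulative columns can be built with base R. This is only an illustrative sketch, not how accrual_create_df is implemented: it omits the initial zero row that marks the start date, and the object name manual is hypothetical.

```r
# Recreate the example dates used above
set.seed(1234)
x <- as.Date("2020-12-07") + sample(c(-20:20), 50, replace = TRUE)

# Count participants per recruitment day, then accumulate
counts <- table(x)
manual <- data.frame(
  Date = as.Date(names(counts)),
  Freq = as.vector(counts)
)
manual$Cumulative <- cumsum(manual$Freq)

head(manual)
```

The last value of Cumulative equals the total number of participants, which is the quantity the cumulative plots track.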
For multicentric trials, we should also pass the site identifier to accrual_create_df in the by argument.
df2 <- accrual_create_df(x, by = site)
print(df2, head = TRUE)
#> Site1:
#> 16 participants recruited between 2020-11-18 and 2020-12-27
#> Date Freq Cumulative
#> 1 2020-11-18 0 0
#> 2 2020-11-18 2 2
#> 3 2020-11-19 1 3
#> 4 2020-11-20 1 4
#> 5 2020-11-22 1 5
#> 6 2020-11-24 1 6
#>
#> Site2:
#> 19 participants recruited between 2020-11-21 and 2020-12-27
#> Date Freq Cumulative
#> 1 2020-11-21 0 0
#> 2 2020-11-21 1 1
#> 3 2020-11-24 2 3
#> 4 2020-12-01 1 4
#> 5 2020-12-02 1 5
#> 6 2020-12-03 1 6
#>
#> Site3:
#> 15 participants recruited between 2020-11-19 and 2020-12-27
#> Date Freq Cumulative
#> 1 2020-11-19 0 0
#> 2 2020-11-19 1 1
#> 3 2020-11-20 2 3
#> 4 2020-11-21 1 4
#> 5 2020-11-22 1 5
#> 6 2020-11-25 1 6
#>
#> Overall:
#> 50 participants recruited between 2020-11-18 and 2020-12-27
#> Date Freq Cumulative
#> 1 2020-11-18 0 0
#> 2 2020-11-18 2 2
#> 3 2020-11-19 2 4
#> 4 2020-11-20 3 7
#> 5 2020-11-21 2 9
#> 6 2020-11-22 2 11
In this case, the accrual_df is a list of data frames, one for each site plus one for the trial overall.
By default, the start and end dates are taken from the dates that you pass to accrual_create_df. You can override these via the start_date and current_date arguments. This is useful for slowly recruiting trials (such as those with particularly strict inclusion criteria), where the first and last recruitment dates can differ considerably from the dates on which recruitment opened and on which the data were extracted. For example, our fictitious trial might have started recruiting on the 1st November. Adding this information changes the rest of the output, such as the number of days accruing and the accrual rate.
For multicentric trials where different sites started recruiting at different times, we can pass a vector to start_date.
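A minimal sketch of such a call, assuming start_date accepts a Date object. The object name df3 matches the one summarised later in this document, but the original call is not shown here, so treat the exact arguments as an assumption.

```r
library(accrualPlot)

# Recreate the example dates used above
set.seed(1234)
x <- as.Date("2020-12-07") + sample(c(-20:20), 50, replace = TRUE)

# Assumed call: recruitment opened on 1 November 2020,
# before the first participant was enrolled
df3 <- accrual_create_df(x, start_date = as.Date("2020-11-01"))
print(df3, head = TRUE)
```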
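As a sketch, a per-site vector could be supplied as follows. The opening dates are hypothetical, and it is an assumption here that the vector is matched to the sites in the order of the factor levels of site; the object name df4 matches the one used in the plotting examples, whose original definition is not shown.

```r
library(accrualPlot)

# Recreate the example data used above
set.seed(1234)
x <- as.Date("2020-12-07") + sample(c(-20:20), 50, replace = TRUE)
site <- as.factor(paste0("Site", sample(1:3, 50, replace = TRUE)))

# Hypothetical opening dates for Site1, Site2 and Site3
# (assumed to follow the order of levels(site))
site_starts <- as.Date(c("2020-11-01", "2020-11-15", "2020-11-10"))

df4 <- accrual_create_df(x, by = site, start_date = site_starts)
print(df4, head = TRUE)
```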
accrualPlot has three flavours of plots:
* Cumulative
* Absolute
* Prediction
and supplies both base graphics and ggplot2 implementations of each (the latter allowing easier modification).
Cumulative plots show a standard step function of the number of participants recruited up to a given point in time. The plots are produced via the plot method (which is a wrapper for the function accrual_plot_cum).
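With base graphics, calling plot on an accrual_df should be enough, since cumulative is described as the standard plot type. A minimal sketch using the two objects created earlier:

```r
library(accrualPlot)

# Recreate the example data and accrual_dfs used above
set.seed(1234)
x <- as.Date("2020-12-07") + sample(c(-20:20), 50, replace = TRUE)
site <- as.factor(paste0("Site", sample(1:3, 50, replace = TRUE)))
df <- accrual_create_df(x)
df2 <- accrual_create_df(x, by = site)

# Two panels side by side: monocentric and multicentric
par(mfrow = c(1, 2))
plot(df)
plot(df2)
```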
For ggplot2 graphics, use the engine option:
library(patchwork)
#> Warning: package 'patchwork' was built under R version 4.0.5
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.0.5
p1 <- plot(df, engine = "ggplot")
p2 <- plot(df2, engine = "ggplot")
p3 <- plot(df4, engine = "ggplot") +
labs(col = "Site") +
theme_classic() +
theme(legend.position = c(.35,.9),
legend.key.height = unit(2, "mm"),
legend.text=element_text(size=8),
legend.title=element_blank(),
axis.text.x = element_text(angle = 45, vjust = 1, hjust=1),
axis.title.x = element_blank())
p1 + p2 + p3
Recruitment plots per unit time can be obtained via the absolute method (specify which = "absolute" to plot).
par(mfrow = c(1, 3))
plot(df, which = "abs", unit = "week")
plot(df2, which = "abs", unit = "week")
plot(df4, which = "abs", unit = "week")
Options for unit are year, month, week and day.
Where multiple sites exist, the different sites are indicated by different colours on the stacked bars.
p1 <- plot(df, which = "abs", unit = "week", engine = "ggplot")
p2 <- plot(df2, which = "abs", unit = "week", engine = "ggplot")
p3 <- plot(df4, which = "abs", unit = "week", engine = "ggplot") +
labs(fill = "Site") +
ylim(0,12) +
theme_classic() +
theme(legend.position = c(.6,0.9),
legend.justification = "left",
legend.key.height = unit(2, "mm"),
legend.key.width = unit(2, "mm"),
legend.title=element_blank(),
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
axis.title.x = element_blank())
p1 + p2 + p3
It is also possible to predict the time point at which a certain number of participants will have been recruited (for estimating when a study will be complete). If we want to recruit a total of 75 participants, we can put that in the target option.
par(mfrow = c(1, 3))
plot(df, which = "predict", target = 75)
plot(df2, which = "predict", target = 75)
plot(df4, which = "predict", target = 75, center_legend = "strip")
Or with ggplot2:
p1 <- plot(df, which = "predict", target = 75, engine = "ggplot2") +
theme(plot.title.position = "plot")
p2 <- plot(df2, which = "predict", target = c(30, 25, 35, 90), engine = "ggplot2") +
labs(col = NULL) +
theme_classic() +
theme(legend.position = c(.025,.9),
legend.justification = "left",
legend.key.height = unit(2, "mm"),
legend.key.width = unit(2, "mm"),
legend.background = element_rect(fill = NA),
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
axis.title.x = element_blank())
p1 + p2
In the second ggplot2 example above, we specify a different target for each site, plus a study-level target. The syntax is the same for base graphics.
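To give a feel for what such a prediction involves, the sketch below extrapolates cumulative recruitment with an ordinary linear regression in base R. It is an illustration of the general idea only; the model accrualPlot fits internally may differ (for instance in how observations are weighted), and the names acc, fit and predicted_end are hypothetical.

```r
# Recreate the example dates used above
set.seed(1234)
x <- as.Date("2020-12-07") + sample(c(-20:20), 50, replace = TRUE)

# Cumulative recruitment per day, with days counted from the first enrolment
counts <- table(x)
acc <- data.frame(
  Date = as.Date(names(counts)),
  Cumulative = cumsum(as.vector(counts))
)
acc$Day <- as.numeric(acc$Date - min(acc$Date))

# Straight-line fit of cumulative recruitment against time
fit <- lm(Cumulative ~ Day, data = acc)

# Solve the fitted line for the day on which the target is reached
target <- 75
day_at_target <- (target - coef(fit)[["(Intercept)"]]) / coef(fit)[["Day"]]
predicted_end <- min(acc$Date) + ceiling(day_at_target)
predicted_end
```

A straight line assumes a constant recruitment rate, so the extrapolated date is only as reliable as that assumption.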
Tables of recruitment can also be generated using accrualPlot, via the summary method. As with absolute recruitment above, a unit of time can be specified.
# accrual_table(df)
summary(df, unit = "day")
#> start_date time n
#> 1 First participant in Days accruing Participants accrued
#> 2 18Nov2020 39 50
#> rate
#> 1 Accrual rate (per day)
#> 2 1.28
summary(df2, unit = "day")
#> name start_date time n
#> 1 Center First participant in Days accruing Participants accrued
#> 2 Site1 18Nov2020 39 16
#> 3 Site2 21Nov2020 36 19
#> 4 Site3 19Nov2020 38 15
#> 5 Overall 18Nov2020 39 50
#> rate
#> 1 Accrual rate (per day)
#> 2 0.41
#> 3 0.53
#> 4 0.39
#> 5 1.28
summary(df3, unit = "day")
#> start_date time n
#> 1 First participant in Days accruing Participants accrued
#> 2 01Nov2020 56 50
#> rate
#> 1 Accrual rate (per day)
#> 2 0.89
summary(df3, unit = "day", header = FALSE)
#> start_date time n rate
#> 1 01Nov2020 56 50 0.89
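The rates in these tables are consistent with the number of participants accrued divided by the time spent accruing. The following check uses only the values printed above and is a reading of the output, not a statement about how summary computes the rate internally.

```r
# Values copied from the summary(df2, unit = "day") output above
n <- c(Site1 = 16, Site2 = 19, Site3 = 15, Overall = 50)
days <- c(Site1 = 39, Site2 = 36, Site3 = 38, Overall = 39)

# Participants per day, rounded as in the table
rate <- round(n / days, 2)
rate
```

The same arithmetic reproduces the df3 row: 50 participants over 56 days gives 0.89 per day, which shows how moving the start date back to 1 November lowers the reported rate.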