Multiple Bias Modeling

Denis Haine

Epidemiologic studies can suffer from more than one bias. Bias functions in episensr can be applied sequentially to quantify bias resulting from multiple biases.

Following the example in Lash et al., we can use the study by Chien et al.. It is a case-control study looking at the association between antidepressant use and the occurrence of breast cancer. The observed OR was 1.2 [0.9–1.6].

chien <- matrix(c(118, 832, 103, 884),
                dimnames = list(c("BC+", "BC-"), c("AD+", "AD-")),
                nrow = 2, byrow = TRUE)
AD+ AD-
BC+ 118 832
BC- 103 884

Records on medication use differed between participants, from pharmacy records and self-reported use, leading to misclassification:

library(episensr)
seq_bias1 <- chien %>%
    misclass(., type = "exposure",
             bias_parms = c(24/(24+19), 18/(18+13), 144/(144+2), 130/(130+4)))
seq_bias1
## ── Observed data ───────────────────────────────────────────────────────────────
## • Outcome: BC+
## • Comparing: AD+ vs. AD-
## 
##     AD+ AD-
## BC+ 118 832
## BC- 103 884
##                                        2.5%     97.5%
## Observed Relative Risk: 1.1012443 0.9646019 1.2572431
##    Observed Odds Ratio: 1.2172330 0.9192874 1.6117443
## ── Bias-adjusted measures ──
##                                                              2.5%    97.5%
## Misclassification Bias Corrected Relative Risk: 1.256946                  
##    Misclassification Bias Corrected Odds Ratio: 1.628058 1.113458 2.380487

Controls and cases also enrolled into the study at different rates. By storing the result of misclass() in an object named here seq_bias1, we have access to the various elements returned by the function (see the help() for a given function to know what is returned). The bias-adjusted cell frequencies can be accessed as corr_data.

seq_bias1$corr_data
##          AD+      AD-
## BC+ 192.8332 757.1668
## BC- 133.5114 853.4886

This 2-by-2 table will be used as starting values for the next sequential bias adjustment, selection().

seq_bias2 <- seq_bias1$corr_data %>%
    selection(., bias_parms = c(.734, .605, .816, .756))
seq_bias2
## ── Observed data ───────────────────────────────────────────────────────────────
## • Outcome: BC+
## • Comparing: AD+ vs. AD-
## 
##          AD+      AD-
## BC+ 192.8332 757.1668
## BC- 133.5114 853.4886
##                                      2.5%    97.5%
## Observed Relative Risk: 1.256946 1.132669 1.394858
##    Observed Odds Ratio: 1.628058 1.278900 2.072541
## ── Bias-adjusted measures ──
##                                                 
## Selection Bias Corrected Relative Risk: 1.172097
##    Selection Bias Corrected Odds Ratio: 1.448430

The adjusted OR is now 1.45 Again, the bias-adjusted cell frequencies can be used in the next bias analysis, confounders().

seq_bias2$corr_data
##          AD+      AD-
## BC+ 262.7156 1251.515
## BC- 163.6169 1128.953

The association between antidepressant use and breast cancer was adjusted for various confounders (race/ethnicity, income, etc.). None of these confounders were found to change the association by more than 10%. However, for illustration, we can add the effect of a potential confounder (e.g. physical activity):

seq_bias3 <- seq_bias2$corr_data %>%
    confounders(., type = "OR", bias_parms = c(.8, .299, .436))
seq_bias3
## ── Observed data ───────────────────────────────────────────────────────────────
## • Outcome: BC+
## • Comparing: AD+ vs. AD-
## 
##           AD+       AD-
## BC+  262.7156 1251.5153
## BC-  163.6169 1128.9532
##                                        2.5%    97.5%
##         Crude Odds Ratio: 1.448430 1.172758 1.788903
## Odds Ratio, Confounder +: 1.406219                  
## Odds Ratio, Confounder -: 1.406219
## ── Bias-adjusted measures ──
##                                        OR due to confounding
## Standardized Morbidity Ratio: 1.406219              1.030018
##              Mantel-Haenszel: 1.406219              1.030018

The serially bias-adjusted OR is 1.406. And the adjusted cells by confounders, for regular exercise:

seq_bias3$cfder_data
##          AD+      AD-
## BC+ 66.83851 478.2302
## BC- 48.92144 492.2236

And for no regular exercise:

seq_bias3$nocfder_data
##          AD+      AD-
## BC+ 195.8771 773.2851
## BC- 114.6954 636.7296

The same process can be realized in a probabilistic framework, as each probsens(), probsens.sel() and probsens_conf() provides the adjusted cell frequencies as A1, B1, C1, D1 in sim_df.

Exposed Not exposed
Cases A1 B1
Controls C1 D1
mod1 <- chien %>%
    probsens(., type = "exposure",
             seca = list("trapezoidal", c(.45, .5, .6, .65)),
             seexp = list("trapezoidal", c(.4, .48, .58, .63)),
             spca = list("trapezoidal", c(.95, .97, .99, 1)),
             spexp = list("trapezoidal", c(.96, .98, .99, 1)),
             corr_se = .8, corr_sp = .8)
## ℹ Calculating observed measures
## ⠙ Assign probability distributions
## ✔ Assign probability distributions [12ms]
## 
## ⠙ Simple bias analysis
## ✔ Simple bias analysis [20ms]
## 
## ⠙ Incorporating random error
## ✔ Incorporating random error [29ms]
## 
str(mod1)
## List of 7
##  $ obs_data    : num [1:2, 1:2] 118 103 832 884
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:2] "BC+" "BC-"
##   .. ..$ : chr [1:2] "AD+" "AD-"
##  $ obs_measures: num [1:2, 1:3] 1.101 1.217 0.965 0.919 1.257 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:2] " Observed Relative Risk:" "    Observed Odds Ratio:"
##   .. ..$ : chr [1:3] " " "2.5%" "97.5%"
##  $ adj_measures: num [1:4, 1:3] 1.073 1.073 1.151 1.152 0.962 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:4] "Relative Risk -- systematic error:" "                      total error:" "   Odds Ratio -- systematic error:" "                      total error:"
##   .. ..$ : chr [1:3] "Median" "p2.5" "p97.5"
##  $ sim_df      :'data.frame':    1000 obs. of  27 variables:
##   ..$ seca   : num [1:1000] 0.577 0.497 0.535 0.531 0.6 ...
##   ..$ seexp  : num [1:1000] 0.57 0.452 0.479 0.493 0.608 ...
##   ..$ spca   : num [1:1000] 0.965 0.96 0.983 0.979 0.97 ...
##   ..$ spexp  : num [1:1000] 0.983 0.967 0.989 0.99 0.97 ...
##   ..$ A1     : num [1:1000] 157 174 197 192 157 ...
##   ..$ B1     : num [1:1000] 793 776 753 758 793 ...
##   ..$ C1     : num [1:1000] 156 168 197 193 127 ...
##   ..$ D1     : num [1:1000] 831 819 790 794 860 ...
##   ..$ prevca : num [1:1000] 0.171 0.178 0.196 0.174 0.166 ...
##   ..$ prevexp: num [1:1000] 0.182 0.17 0.19 0.187 0.116 ...
##   ..$ ppvca  : num [1:1000] 0.775 0.727 0.887 0.843 0.801 ...
##   ..$ ppvexp : num [1:1000] 0.881 0.739 0.909 0.921 0.726 ...
##   ..$ npvca  : num [1:1000] 0.917 0.898 0.897 0.909 0.924 ...
##   ..$ npvexp : num [1:1000] 0.911 0.896 0.89 0.895 0.949 ...
##   ..$ ab     : num [1:1000] 170 177 184 190 158 169 195 180 153 176 ...
##   ..$ bb     : num [1:1000] 780 773 766 760 792 781 755 770 797 774 ...
##   ..$ cb     : num [1:1000] 178 170 180 202 137 153 210 177 145 171 ...
##   ..$ db     : num [1:1000] 809 817 807 785 850 834 777 810 842 816 ...
##   ..$ corr_RR: num [1:1000] 0.995 1.049 1.038 0.985 1.11 ...
##   ..$ corr_OR: num [1:1000] 0.991 1.1 1.077 0.972 1.238 ...
##   ..$ rr_se_b: num [1:1000] 0.0605 0.0586 0.0579 0.0581 0.0599 ...
##   ..$ or_se_b: num [1:1000] 0.118 0.119 0.116 0.113 0.127 ...
##   ..$ z      : num [1:1000] -0.405 -0.818 1.345 2.238 1.091 ...
##   ..$ tot_RR : num [1:1000] 1.02 1.101 0.96 0.865 1.04 ...
##   ..$ tot_OR : num [1:1000] 1.039 1.213 0.921 0.754 1.078 ...
##   ..$ syst_RR: num [1:1000] 1.03 1.05 1.03 1.02 1.15 ...
##   ..$ syst_OR: num [1:1000] 1.06 1.09 1.05 1.04 1.35 ...
##  $ reps        : num 1000
##  $ fun         : chr "probsens"
##  $ warnings    : NULL
##  - attr(*, "class")= chr [1:3] "episensr" "episensr.probsens" "list"

Each of the lines from mod1$sim_df[, 5:8] can then be tabulated and fed to the next bias function. Be careful, as the number of observations can quickly become unmanageable.

head(mod1$sim_df[, 5:8])
##         A1       B1       C1       D1
## 1 156.8580 793.1420 155.7892 831.2108
## 2 174.2493 775.7507 168.3679 818.6321
## 3 197.1024 752.8976 196.5320 790.4680
## 4 192.4206 757.5794 193.0553 793.9447
## 5 157.2871 792.7129 126.7854 860.2146
## 6 174.1386 775.8614 142.1461 844.8539