Model Evaluation

library(bayesrules)

For Bayesian model evaluation, the bayesrules package provides three functions: prediction_summary(), classification_summary(), and naive_classification_summary(). Each has a cross-validation counterpart: prediction_summary_cv(), classification_summary_cv(), and naive_classification_summary_cv(), respectively.

Functions                                                          Response      Model
prediction_summary(), prediction_summary_cv()                      Quantitative  rstanreg
classification_summary(), classification_summary_cv()              Binary        rstanreg
naive_classification_summary(), naive_classification_summary_cv()  Categorical   naiveBayes

Prediction Summary

Given a set of observed data including a quantitative response variable y and an rstanreg model of y, prediction_summary() returns four measures of the model's posterior prediction quality:

  1. Median absolute prediction error (mae) measures the typical difference between the observed y values and their posterior predictive medians (stable = TRUE) or means (stable = FALSE).

  2. Scaled mae (mae_scaled) measures the typical number of absolute deviations (stable = TRUE) or standard deviations (stable = FALSE) that observed y values fall from their predictive medians (stable = TRUE) or means (stable = FALSE).

  3. and 4. within_50 and within_95 report the proportion of observed y values that fall within their 50% and 95% posterior prediction intervals. These are the default probability levels; they can be changed with the prob_inner and prob_outer arguments, and the output columns are named to match. The example below uses 60% and 80% posterior prediction intervals.

# Data generation (no seed is set, so exact results vary between runs)
example_data <- data.frame(x = sample(1:100, 20))
example_data$y <- example_data$x * 3 + rnorm(20, 0, 5)


# rstanreg model
example_model <- rstanarm::stan_glm(y ~ x,  data = example_data, refresh = FALSE)

# Prediction Summary
prediction_summary(example_model, example_data, 
                   prob_inner = 0.6, prob_outer = 0.80, 
                   stable = TRUE)
       mae mae_scaled within_60 within_80
1 3.710897   0.936282       0.6      0.85
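
The four measures can be approximated by hand from posterior predictive draws. The following is a minimal sketch in base R, not the package's internal code. It assumes a hypothetical matrix, draws, with one row per draw and one column per observation (the shape returned by rstanarm::posterior_predict()); here the matrix is simulated so the block runs without fitting a model. The names y_obs, draws, and within_interval() are illustrative, and the use of R's mad() with its default scaling constant is an assumption about how the "absolute deviations" are measured.

```r
set.seed(84735)

# Hypothetical inputs: 10 observed responses and 4000 simulated
# predictive draws per observation (rows = draws, columns = observations).
y_obs <- c(12, 30, 45, 51, 66, 78, 90, 120, 150, 210)
draws <- sapply(y_obs, function(mu) rnorm(4000, mean = mu + 2, sd = 5))

# Per-observation predictive centre and spread (the stable = TRUE flavour)
pred_median <- apply(draws, 2, median)
pred_mad    <- apply(draws, 2, mad)

# Typical absolute error, raw and scaled by the predictive spread
abs_error  <- abs(y_obs - pred_median)
mae        <- median(abs_error)
mae_scaled <- median(abs_error / pred_mad)

# Proportion of observations inside the central prediction interval
within_interval <- function(prob) {
  tail_prob <- (1 - prob) / 2
  lower <- apply(draws, 2, quantile, probs = tail_prob)
  upper <- apply(draws, 2, quantile, probs = 1 - tail_prob)
  mean(y_obs >= lower & y_obs <= upper)
}

data.frame(mae = mae, mae_scaled = mae_scaled,
           within_60 = within_interval(0.6),
           within_80 = within_interval(0.8))
```

Replacing median() and mad() with mean() and sd() gives the stable = FALSE analogue described in items 1 and 2.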

Similarly, prediction_summary_cv() returns the same four measures computed by cross-validation: one row per fold in $folds, and their average across folds in $cv. The k argument sets the number of folds to use for cross-validation.

prediction_summary_cv(model = example_model, data = example_data, 
                      k = 2, prob_inner = 0.6, prob_outer = 0.80)
$folds
  fold      mae mae_scaled within_60 within_80
1    1 5.131659  0.7802569       0.5       0.8
2    2 5.465685  0.7136092       0.6       1.0

$cv
       mae mae_scaled within_60 within_80
1 5.298672   0.746933      0.55       0.9
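
The logic of k-fold cross-validation can be sketched in a few lines of base R. This is an illustration under stated assumptions rather than the package's code: lm() stands in for refitting the rstanreg model so the block runs quickly, only the mae measure is computed, and the names sim_data, fold_id, and fold_mae are hypothetical. Averaging the per-fold values is consistent with the $cv row in the output above, which equals the column means of $folds.

```r
set.seed(84735)

# Simulated data in the same spirit as the example above
n <- 20
sim_data <- data.frame(x = sample(1:100, n))
sim_data$y <- sim_data$x * 3 + rnorm(n, 0, 5)

k <- 2
# Randomly assign each row to one of k folds of (nearly) equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

fold_mae <- sapply(seq_len(k), function(i) {
  train   <- sim_data[fold_id != i, ]
  holdout <- sim_data[fold_id == i, ]
  fit     <- lm(y ~ x, data = train)  # stand-in for refitting the Bayesian model
  # Median absolute error on the rows the model did not see
  median(abs(holdout$y - predict(fit, newdata = holdout)))
})

folds <- data.frame(fold = seq_len(k), mae = fold_mae)
cv    <- data.frame(mae = mean(fold_mae))
list(folds = folds, cv = cv)
```

Because each fold is scored on data held out from fitting, the cross-validated errors are typically larger than the in-sample errors, as in the outputs above.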

Classification Summary

Given a set of observed data including a binary response variable y and an rstanreg model of y, the classification_summary() function returns summaries of the model’s posterior classification quality. These summaries include a confusion matrix as well as estimates of the model’s sensitivity, specificity, and overall accuracy. The cutoff argument represents the probability cutoff to classify a new case as positive.

# Data generation (binary y simulated from a logistic model with slope 3)
x <- rnorm(20)
z <- 3*x
prob <- 1/(1+exp(-z))
y <- rbinom(20, 1, prob)
example_data <- data.frame(x = x, y = y)


# rstanreg model
example_model <- rstanarm::stan_glm(y ~ x, data = example_data, 
                                    family = binomial, refresh = FALSE)

# Classification Summary
classification_summary(model = example_model, data = example_data, cutoff = 0.5)
$confusion_matrix
 y 0  1
 0 6  3
 1 1 10

$accuracy_rates
                          
sensitivity      0.9090909
specificity      0.6666667
overall_accuracy 0.8000000
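
The accuracy rates follow directly from the confusion matrix. The sketch below re-enters the counts printed above, assuming that rows hold the observed y values and columns hold the classifications (this reading reproduces the reported rates). The object name cm is illustrative.

```r
# Counts from the output above: rows = observed y, columns = classification
cm <- matrix(c(6, 1, 3, 10), nrow = 2,
             dimnames = list(observed = c("0", "1"),
                             classified = c("0", "1")))

# Share of observed 1s classified as 1, and of observed 0s classified as 0
sensitivity      <- cm["1", "1"] / sum(cm["1", ])  # 10 / 11
specificity      <- cm["0", "0"] / sum(cm["0", ])  # 6 / 9
# Share of all cases classified correctly
overall_accuracy <- sum(diag(cm)) / sum(cm)        # 16 / 20

c(sensitivity = sensitivity, specificity = specificity,
  overall_accuracy = overall_accuracy)
```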

The classification_summary_cv() function returns cross-validated estimates of the same accuracy rates, for each fold ($folds) and averaged across folds ($cv). The k argument sets the number of folds to use for cross-validation.

classification_summary_cv(model = example_model, data = example_data,
                          k = 2, cutoff = 0.5)                   
$folds
  fold sensitivity specificity overall_accuracy
1    1         1.0        0.75              0.9
2    2         0.8        0.60              0.7

$cv
  sensitivity specificity overall_accuracy
1         0.9       0.675              0.8
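
To see what the cutoff argument controls, the sketch below classifies ten cases at three cutoffs. All values are hypothetical: y_obs and p_pos are made-up outcomes and posterior probabilities that y = 1, and rates_at() is an illustrative helper. Treating a probability equal to the cutoff as positive is an assumption, not a documented rule of the package.

```r
# Hypothetical observed outcomes and posterior probabilities that y = 1
y_obs <- c(0, 0, 0, 0, 1, 0, 1, 1, 1, 1)
p_pos <- c(0.05, 0.20, 0.35, 0.55, 0.45, 0.60, 0.65, 0.80, 0.90, 0.95)

rates_at <- function(cutoff) {
  # Classify as positive when the probability reaches the cutoff
  # (the handling of exact ties is an assumption)
  classified <- as.integer(p_pos >= cutoff)
  c(cutoff      = cutoff,
    sensitivity = mean(classified[y_obs == 1] == 1),
    specificity = mean(classified[y_obs == 0] == 0))
}

# One row per cutoff
t(sapply(c(0.3, 0.5, 0.7), rates_at))
```

Raising the cutoff makes a positive classification harder to earn, so sensitivity falls while specificity rises.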

Naive Classification Summary

Given a set of observed data including a categorical response variable y and a naiveBayes model of y, the naive_classification_summary() function returns summaries of the model’s posterior classification quality. These summaries include a confusion matrix as well as an estimate of the model’s overall accuracy.

# Data
data(penguins_bayes, package = "bayesrules")

# naiveBayes model
example_model <- e1071::naiveBayes(species ~ bill_length_mm, data = penguins_bayes)

# Naive Classification Summary
naive_classification_summary(model = example_model, data = penguins_bayes, 
                             y = "species")
$confusion_matrix
   species       Adelie Chinstrap       Gentoo
    Adelie 95.39% (145) 0.00% (0)  4.61%   (7)
 Chinstrap  5.88%   (4) 8.82% (6) 85.29%  (58)
    Gentoo  6.45%   (8) 4.84% (6) 88.71% (110)

$overall_accuracy
[1] 0.7587209
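
The percentages and the overall accuracy can be recovered from the counts in parentheses. The sketch below re-enters those counts, assuming rows are the observed species and columns the classified species (each row of the printed matrix sums to 100%, which supports this reading). The object names are illustrative.

```r
species <- c("Adelie", "Chinstrap", "Gentoo")

# Counts from the confusion matrix above, entered row by row
counts <- matrix(c(145, 0,   7,
                     4, 6,  58,
                     8, 6, 110),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(observed = species, classified = species))

# Row percentages, as printed in the confusion matrix
row_pct <- round(100 * prop.table(counts, margin = 1), 2)
row_pct

# Overall accuracy: correctly classified penguins over all penguins
overall_accuracy <- sum(diag(counts)) / sum(counts)  # 261 / 344
overall_accuracy
```

The diagonal of row_pct gives the per-species accuracy, which shows that the model classifies Chinstrap penguins poorly despite a reasonable overall accuracy.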

Similarly, naive_classification_summary_cv() returns the cross-validated accuracy rates for each fold ($folds) along with a cross-validated confusion matrix ($cv). The k argument sets the number of folds to use for cross-validation.

naive_classification_summary_cv(model = example_model, data = penguins_bayes, 
                                y = "species", k = 2)
$folds
  fold    Adelie  Chinstrap    Gentoo overall_accuracy
1    1 0.9864865 0.34482759 0.7826087        0.7965116
2    2 0.8974359 0.05128205 0.9272727        0.7151163

$cv
   species       Adelie   Chinstrap       Gentoo
    Adelie 94.08% (143)  0.00%  (0)  5.92%   (9)
 Chinstrap  5.88%   (4) 17.65% (12) 76.47%  (52)
    Gentoo  8.87%  (11)  6.45%  (8) 84.68% (105)