Getting started with iDIFr

What is DIF?

Differential Item Functioning (DIF) occurs when test-takers from different groups who have the same underlying ability have different probabilities of answering a test item correctly. DIF threatens the validity of comparisons across groups and is a central concern in fair assessment.
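
To make the definition concrete, here is a minimal base R sketch (it does not use iDIFr). It assumes a Rasch-type logistic response model; the difficulty value and the 0.9 logit shift are illustrative assumptions, not values taken from the package.

```r
# Illustrative only: uniform DIF under an assumed Rasch-type model.
theta <- 0     # same underlying ability for both test-takers
b     <- 0.5   # item difficulty (hypothetical value)
shift <- 0.9   # extra difficulty experienced by the focal group (hypothetical)

p_reference <- plogis(theta - b)          # P(correct) in the reference group
p_focal     <- plogis(theta - b - shift)  # P(correct) in the focal group

# Equal ability, unequal success probability: this gap is what DIF refers to.
round(c(reference = p_reference, focal = p_focal), 3)
```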

iDIFr makes DIF analysis accessible, with particular support for intersectional group designs — where groups are defined by combinations of demographic variables, such as gender × nationality × age band.


A simple example

We’ll use a small synthetic dataset with known DIF to illustrate the workflow.

Step 1: Generate (or load) your data

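Your data should be a data frame in which each row is a test-taker, the item responses are columns (scored 0/1 in this example), and the grouping variables are further columns. Here we simulate such a dataset: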

set.seed(42)
dat <- simulate_dif(
  n_persons  = 600,
  n_items    = 20,
  n_groups   = 2,
  dif_items  = c(3, 7, 12),  # these items have DIF
  dif_effect = 0.9,
  dif_type   = "uniform"
)

head(dat[c(1:5, 21)])  # first 5 items + group column
#>   item_1 item_2 item_3 item_4 item_5 group
#> 1      1      0      0      0      0    G1
#> 2      0      1      1      0      1    G1
#> 3      1      0      0      0      0    G1
#> 4      1      1      0      1      0    G1
#> 5      1      1      0      1      0    G2
#> 6      1      0      0      0      1    G2
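
If you load your own data instead, a quick sanity check of the layout can save time later. The sketch below uses base R only and a tiny hand-made data frame; the `item_` column prefix and the assumption that items are scored 0/1 follow the output above and may not match your data.

```r
# Toy data in the same layout as above: binary item columns plus a group column.
toy <- data.frame(
  item_1 = c(1, 0, 1, 1),
  item_2 = c(0, 1, 0, 1),
  group  = c("G1", "G1", "G2", "G2")
)

# Pick out the item columns by their (assumed) name prefix.
item_cols <- grep("^item_", names(toy), value = TRUE)

# Every item column should contain only 0/1 scores (NA tolerated by assumption).
is_binary <- vapply(toy[item_cols], function(x) all(x %in% c(0, 1, NA)), logical(1))
is_binary

# The grouping column should have at least two observed levels.
nlevels(factor(toy$group)) >= 2
```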

Step 2: Check your group structure

Before running the analysis, use check_groups() to inspect group cell sizes. This is especially important in intersectional designs where small cells can reduce statistical power.

check_groups(dat, group = ~ group)

For an intersectional design with multiple demographic variables:

# Add nationality and age-band variables for illustration
dat$nationality <- sample(c("UK", "DE", "FR"), nrow(dat), replace = TRUE)
dat$age_band    <- sample(c("18-30", "31-45", "46+"), nrow(dat), replace = TRUE)

check_groups(dat, group = ~ group * nationality * age_band)

If any cells are too small, check_groups() will tell you and point you to merge_groups().
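
To see why cell sizes shrink so quickly, the following base R sketch counts the cells of a crossed design by hand. It is not how check_groups() is implemented, and the minimum cell size of 30 is a hypothetical cut-off chosen for illustration.

```r
set.seed(1)
n <- 600
toy <- data.frame(
  group       = sample(c("G1", "G2"), n, replace = TRUE),
  nationality = sample(c("UK", "DE", "FR"), n, replace = TRUE),
  age_band    = sample(c("18-30", "31-45", "46+"), n, replace = TRUE)
)

# One cell per combination of the three variables: 2 x 3 x 3 = 18 cells.
cells <- as.data.frame(table(toy), responseName = "n")
nrow(cells)

# Cells below a hypothetical minimum size (not iDIFr's actual threshold).
min_n <- 30
cells[cells$n < min_n, ]
```

With 600 persons spread over 18 cells, the average cell holds only about 33 people, so some cells will usually fall below any reasonable minimum.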

Step 3: Run the DIF analysis

Supply your data, the item columns, a group formula, and which method(s) to use.

result <- idifr(
  data   = dat,
  items  = 1:20,
  group  = ~ group,
  method = c("LR", "LRT")
)

method is required; you must choose one or more methods. Options are:

Method  What it does
"LR"    Logistic Regression: flexible, non-IRT, effect size via Nagelkerke ΔR²
"LRT"   IRT Likelihood Ratio Test: model-based, effect size via standardised chi
"MOB"   Model-based recursive partitioning: non-parametric, detects intersectional instability
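
For intuition about the "LR" option, here is a generic logistic regression DIF test for a single item, written in base R. It is a sketch of the general technique and not iDIFr's internal code; the simulated data, the use of a DIF-free anchor score as the matching variable, and the helper name `nagelkerke()` are all assumptions made for this example.

```r
set.seed(42)
n     <- 600
group <- factor(rep(c("G1", "G2"), each = n / 2))
theta <- rnorm(n)                                   # latent ability

# Ten DIF-free anchor items plus one studied item with a 0.9 logit shift against G2.
anchors <- sapply(seq(-1, 1, length.out = 10),
                  function(b) rbinom(n, 1, plogis(theta - b)))
studied <- rbinom(n, 1, plogis(theta - 0.9 * (group == "G2")))
total   <- rowSums(anchors)                         # matching variable

m0 <- glm(studied ~ total,         family = binomial)  # no-DIF model
m1 <- glm(studied ~ total + group, family = binomial)  # uniform-DIF model

# Likelihood ratio test for the added group term.
lrt     <- anova(m0, m1, test = "Chisq")
p_value <- lrt[["Pr(>Chi)"]][2]

# Nagelkerke R^2 from the residual and null deviances of a binary GLM.
nagelkerke <- function(m) {
  n_obs <- length(m$y)
  (1 - exp((m$deviance - m$null.deviance) / n_obs)) /
    (1 - exp(-m$null.deviance / n_obs))
}
delta_r2 <- nagelkerke(m1) - nagelkerke(m0)

c(p_value = p_value, delta_r2 = delta_r2)
```

The p-value answers "is there a group effect after matching on ability?", while ΔR² indicates how much that effect matters.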

Step 4: Explore the results

# Flagged items with effect sizes
print(result)

# Full breakdown by method
summary(result)

# Effect size heatmap
plot(result)

# Method concordance
plot(result, type = "concordance")

# Flat data frame for your own analysis
df <- tidy(result)

Intersectional DIF

The key feature of iDIFr is first-class support for intersectional group structures. Where conventional DIF analysis examines one demographic variable at a time, intersectional analysis asks: does DIF appear at the combination of gender × nationality × age, even when no individual variable shows DIF?
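
A small worked example shows how this can happen. The numbers below are hypothetical logit shifts in item difficulty for a 2 × 2 design with equal cell sizes; they are not output from iDIFr.

```r
# Hypothetical 2 x 2 design: logit shift in item difficulty for each cell.
cells <- expand.grid(gender = c("F", "M"), nationality = c("UK", "DE"))
cells$shift <- c(0.5, -0.5, -0.5, 0.5)   # order: F-UK, M-UK, F-DE, M-DE

# Averaged over the other variable, neither margin shows any shift ...
by_gender      <- tapply(cells$shift, cells$gender, mean)
by_nationality <- tapply(cells$shift, cells$nationality, mean)
by_gender
by_nationality

# ... yet every intersectional cell departs from zero.
cells
```

Because the shifts cancel within each margin, a gender-only or nationality-only analysis would see nothing, while the crossed analysis would.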

result_intersectional <- idifr(
  data   = dat,
  items  = 1:20,
  group  = ~ group * nationality * age_band,  # crossing all three variables
  method = c("LR", "LRT")
)

print(result_intersectional)

Handling small cells

Intersectional designs often produce small cells. iDIFr will warn you about them but will still run the analysis. To merge sparse cells:

grp <- check_groups(dat, group = ~ group * nationality * age_band)

merged_dat <- merge_groups(
  grp,
  age_band = list("18-45" = c("18-30", "31-45"))  # combine two age bands
)

# Re-run with merged groups
result_merged <- idifr(merged_dat, 1:20,
                       group  = ~ group * nationality * age_band,
                       method = c("LR", "LRT"))
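
If it helps to see what merging means at the level of a single variable, the base R sketch below collapses two factor levels into one. It illustrates the idea only and says nothing about how merge_groups() works internally.

```r
age_band <- factor(c("18-30", "31-45", "46+", "18-30", "46+"))

# Assigning a named list to levels() collapses the listed old levels into one.
# Levels not named in the list would become NA, so "46+" is listed too.
levels(age_band) <- list("18-45" = c("18-30", "31-45"), "46+" = "46+")

table(age_band)
```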

Effect sizes

iDIFr leads with effect sizes, not just p-values. Flagging criteria require both a significant p-value (after adjustment) and a meaningful effect size.

Method  Effect size            Classification
LR      Nagelkerke ΔR²         A: <.035 · B: .035–.070 · C: ≥.070
LRT     Std. chi               Negligible: <.20 · Moderate: .20–.50 · Large: ≥.50
MOB     Std. score difference  Negligible: <.20 · Moderate: .20–.50 · Large: ≥.50
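
The two-part flagging rule can be sketched in a few lines of base R. The per-item values are made up, and the Benjamini-Hochberg adjustment and the 0.05 alpha level are assumptions for this example; iDIFr's defaults may differ.

```r
# Hypothetical per-item results for the LR method.
res <- data.frame(
  item     = paste0("item_", 1:5),
  p_value  = c(0.001, 0.004, 0.030, 0.200, 0.0005),
  delta_r2 = c(0.080, 0.040, 0.050, 0.090, 0.010)
)

# A/B/C bands from the table above; right = FALSE makes intervals [lower, upper).
res$class <- cut(res$delta_r2,
                 breaks = c(-Inf, 0.035, 0.070, Inf),
                 labels = c("A", "B", "C"),
                 right  = FALSE)

# Assumed adjustment method and alpha level.
res$p_adj   <- p.adjust(res$p_value, method = "BH")

# Flag only when BOTH the adjusted p-value and the effect size clear their bars.
res$flagged <- res$p_adj < 0.05 & res$delta_r2 >= 0.035

res
```

Here item_4 has a large effect size but is not significant, and item_5 is significant but negligible in size, so neither is flagged.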

Intersectional Contrast Analysis (ICA)

Set ica = TRUE in idifr() to go one step further than a single intersectional analysis. With this option, idifr() runs one single-variable analysis per demographic variable plus one intersectional analysis, then classifies each item by comparing where it was flagged.

ica_res <- idifr(
  data   = dat,
  items  = 1:20,
  group  = ~ group * nationality * age_band,
  method = "LR",
  ica    = TRUE
)

print(ica_res)                 # includes the ICA classification section
tidy(ica_res, table = "ica")   # flat ICA classification table

Four item classifications are possible:

Classification     Meaning
amplified          Flagged in single-variable and intersectional runs
pure_intersection  Only flagged in the intersectional run
obscured           Flagged in a single-variable run but not intersectionally
none               Not flagged anywhere
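
The classification logic in the table can be written as a small function. `classify_ica()` is a hypothetical helper for illustration, not a function exported by iDIFr, and the flags are made up.

```r
# Hypothetical helper mirroring the four classifications in the table above.
classify_ica <- function(single, intersectional) {
  ifelse(single & intersectional,  "amplified",
  ifelse(!single & intersectional, "pure_intersection",
  ifelse(single & !intersectional, "obscured", "none")))
}

# Made-up flags for four items.
single         <- c(TRUE, FALSE, TRUE, FALSE)  # flagged in any single-variable run
intersectional <- c(TRUE, TRUE, FALSE, FALSE)  # flagged in the intersectional run

ica_class <- classify_ica(single, intersectional)
ica_class
```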

Note: ICA runs multiple analyses without correcting p-values across analyses. The effect_threshold argument (default 0.035) acts as a de facto stricter criterion, because an item must also reach this effect size to be flagged. Interpret pure_intersection and obscured findings with caution in small samples.


Further reading