Title: Access 'Malawi Integrated Household Survey' Data
Version: 0.1.5
Description: Provides programmatic access to the 'Malawi Integrated Household Survey' ('IHS') via the 'World Bank Microdata Library' API <https://microdata.worldbank.org/api-documentation/>. Users can search variables, download data for 'IHS' rounds 2 through 5, and work with complex survey designs, with no manual file management required.
License: MIT + file LICENSE
Depends: R (≥ 4.1.0)
URL: https://github.com/vituk123/ihsMW
BugReports: https://github.com/vituk123/ihsMW/issues
Imports: httr2 (≥ 1.0.0), arrow (≥ 12.0.0), dplyr (≥ 1.1.0), readr (≥ 2.1.0), rlang (≥ 1.1.0), cli (≥ 3.6.0), rappdirs (≥ 0.3.3), vctrs (≥ 0.6.0), stringdist (≥ 0.9.10)
Suggests: srvyr (≥ 1.2.0), survey (≥ 4.2.0), httptest2 (≥ 0.1.0), testthat (≥ 3.0.0), usethis (≥ 2.2.0), pkgdown (≥ 2.0.0), knitr (≥ 1.40), rmarkdown (≥ 2.20), withr (≥ 2.5.0), jsonlite (≥ 1.8.0)
Encoding: UTF-8
RoxygenNote: 7.3.3
VignetteBuilder: knitr
Config/testthat/edition: 3
Language: en-US
NeedsCompilation: no
Packaged: 2026-04-29 18:43:17 UTC; vitumbikokayuni
Author: Vitumbiko Kayuni [aut, cre]
Maintainer: Vitumbiko Kayuni <vitumbikokayuni@gmail.com>
Repository: CRAN
Date/Publication: 2026-05-04 11:40:02 UTC

Download Malawi IHS microdata

Description

The main interface to the ihsMW package. Downloads one or more IHS variables across one or more survey rounds, applies cross-round harmonisation, and returns the data in the requested format.

Usage

IHS(
  indicator,
  round = "IHS5",
  module = NULL,
  return = c("data.frame", "list", "survey"),
  format = c("parquet", "rds", "csv"),
  cache = TRUE,
  extra = FALSE
)

Arguments

indicator

Character vector of harmonised variable names. Use ihs_search to discover variable names.

round

Character vector of IHS rounds to include. One or more of "IHS2", "IHS3", "IHS4", "IHS5", or "all". Note: IHS1 is not currently available via the API. Default: "IHS5".

module

Optional character string to restrict to a specific module. If NULL (default), the correct module is determined automatically from the crosswalk.

return

Output format: "data.frame" (default), "list", or "survey".

format

File format for download and caching: "parquet" (default), "rds", or "csv".

cache

Logical. If TRUE (default), use and populate the disk cache.

extra

Logical. If FALSE (default), return only the requested indicator columns plus household ID columns. If TRUE, include all variables in the downloaded module (stratum, cluster, weights, etc.).
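
In the Usage signature, return and format each list every accepted value, and the first value is the documented default. A minimal sketch of that R idiom follows, assuming the arguments are resolved with base R's match.arg() or something equivalent (this page does not confirm the mechanism). resolve_choice() is a hypothetical helper and not part of ihsMW.

```r
# Hypothetical helper showing how a choice vector resolves to one value.
# Assumption: ihsMW resolves `format` (and `return`) in a comparable way.
resolve_choice <- function(format = c("parquet", "rds", "csv")) {
  match.arg(format)
}

default_choice  <- resolve_choice()       # nothing supplied: first choice wins
explicit_choice <- resolve_choice("csv")  # a valid choice is returned as given

# A value outside the listed choices raises an error.
bad_choice <- try(resolve_choice("xlsx"), silent = TRUE)
```

Under this idiom, leaving the argument out and passing the first listed value explicitly are equivalent.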

Value

If return = "data.frame": a single data.frame with an ihs_round column.
If return = "list": a named list of data.frames, one per round.
If return = "survey": a tbl_svy or svydesign object.
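
The "list" and "data.frame" shapes carry the same rows in different layouts. The sketch below uses made-up toy data to show how a named list of per-round data.frames relates to a single stacked data.frame with an ihs_round column. It is an illustration in base R, not the package's internal code, and the case_id column is used here purely as an example ID column.

```r
# Toy stand-in for the documented return = "list" shape:
# a named list of data.frames, one per round. All values are made up.
by_round <- list(
  IHS4 = data.frame(case_id = c("a1", "a2"), rexp_cat01 = c(10, 20)),
  IHS5 = data.frame(case_id = c("b1", "b2", "b3"), rexp_cat01 = c(30, 40, 50))
)

# Stack the rounds and record each row's origin in an ihs_round column,
# mirroring the shape documented for return = "data.frame".
stacked <- do.call(rbind, Map(
  function(df, nm) cbind(ihs_round = nm, df),
  by_round, names(by_round)
))
rownames(stacked) <- NULL

table(stacked$ihs_round)
```

Going the other way, split(stacked, stacked$ihs_round) recovers a per-round list.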

See Also

ihs_search to find variable names.
IHS_survey for weighted survey analysis.
ihs_crosswalk_check to assess cross-round comparability.

Examples

## Not run: 
  # One-time setup
  ihs_auth()

  # Download a single variable from the latest round
  df <- IHS("rexp_cat01", round = "IHS5")

  # Multiple variables, multiple rounds
  df <- IHS(c("rexp_cat01", "hh_a02"), round = c("IHS4", "IHS5"))

  # All supported rounds
  df <- IHS("rexp_cat01", round = "all")

  # Return as a named list of data.frames
  lst <- IHS("rexp_cat01", round = c("IHS3", "IHS4", "IHS5"), return = "list")

  # Include weights and design variables
  df <- IHS("rexp_cat01", round = "IHS5", extra = TRUE)

  # Use rds format instead of parquet
  df <- IHS("rexp_cat01", round = "IHS5", format = "rds")

## End(Not run)


Create a survey design object for Malawi IHS data

Description

Creates a complex survey design object using the survey and srvyr packages. The sampling weights, strata, and clusters appropriate to the requested round are incorporated automatically, so that estimates reflect that round's sample design.

Usage

IHS_survey(indicator, round = "IHS5", ...)

Arguments

indicator

Character vector of harmonised variable names.

round

A single round string (e.g. "IHS5") or "all".

...

Additional arguments passed to IHS, such as module or format.

Value

A tbl_svy object if the srvyr package is installed, otherwise a svydesign object from the survey package. If multiple rounds are requested, returns a named list of survey objects.

Note

Survey weights differ across IHS rounds and reflect the complex sample design of each survey. Estimates produced using this function are representative at the national, urban/rural, regional, and district level for each round independently. Do not pool weights across rounds without consulting the relevant Basic Information Document for each round. Cite the sampling methodology: NSO Malawi (year), IHS[N] Basic Information Document. National Statistical Office, Zomba, Malawi.
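
To make the advice about not pooling weights concrete, the sketch below computes a weighted mean within each round separately. The data are made up, and hh_wgt is a hypothetical weight column name chosen for illustration. It yields point estimates only; standard errors require the full design (strata and clusters), which is what the survey objects returned by IHS_survey are for.

```r
# Toy data: hh_wgt is a hypothetical weight column; all values are made up.
toy <- data.frame(
  ihs_round  = c("IHS4", "IHS4", "IHS5", "IHS5"),
  rexp_cat01 = c(100, 200, 150, 250),
  hh_wgt     = c(1, 3, 2, 2)
)

# Split by round first, then apply each round's weights within that round only.
per_round <- sapply(
  split(toy, toy$ihs_round),
  function(d) weighted.mean(d$rexp_cat01, d$hh_wgt)
)

per_round
```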

Examples

## Not run: 
  svy <- IHS_survey("rexp_cat01", round = "IHS5")
  survey::svymean(~rexp_cat01, design = svy)
  svy |> srvyr::summarise(mean_cons = srvyr::survey_mean(rexp_cat01))

## End(Not run)


Set up World Bank Microdata API Key

Description

The World Bank Microdata Library uses API keys for authenticated endpoints. The key is stored in the environment variable WORLDBANK_MICRODATA_KEY.

If key is NULL, this function prints an interactive guide to obtaining an API key. If a key is provided, the function validates it against the NADA API, saves it to the session, and appends it to your ~/.Renviron file for future sessions.
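
Because the key lives in an environment variable, a script can check for it with base R before attempting a download. The sketch below is one way to do that; has_microdata_key() is a hypothetical helper, not an ihsMW function, and the key value is a placeholder.

```r
# Hypothetical helper: TRUE when the variable is set to a non-empty string.
has_microdata_key <- function() {
  nzchar(Sys.getenv("WORLDBANK_MICRODATA_KEY", unset = ""))
}

# Demonstrate with a placeholder value, then restore the previous state.
old <- Sys.getenv("WORLDBANK_MICRODATA_KEY", unset = NA)
Sys.setenv(WORLDBANK_MICRODATA_KEY = "placeholder-not-a-real-key")
key_visible <- has_microdata_key()
if (is.na(old)) {
  Sys.unsetenv("WORLDBANK_MICRODATA_KEY")
} else {
  Sys.setenv(WORLDBANK_MICRODATA_KEY = old)
}

key_visible
```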

Usage

ihs_auth(key = NULL)

Arguments

key

A single string containing your World Bank Microdata API key. Defaults to NULL.

Value

Invisibly returns the API key (if provided) or NULL.

Examples

## Not run: 
# Print interactive setup guide
ihs_auth()

# Set your API key
ihs_auth("paste_your_key_here")

## End(Not run)

Clear Cached IHS Data

Description

Removes downloaded datasets from the internal package cache, which is useful for freeing up disk space. You can clear the cache for a single round or in its entirety.

Usage

ihs_cache_clear(round = NULL)

Arguments

round

A specific round to clear (e.g. "IHS5"). If NULL (default), all cached IHS data is targeted; whether a confirmation prompt is shown first depends on whether the session is interactive.

Value

Invisibly returns NULL.

Examples

## Not run: 
# Clear all
ihs_cache_clear()

# Clear only IHS3 data
ihs_cache_clear(round = "IHS3")

## End(Not run)

Display Information About Cached IHS Data

Description

Scans the internal ihsMW cache directory and reports on any previously downloaded datasets.

Usage

ihs_cache_info()

Value

A tibble summarizing the cached files.
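
For readers curious about what such a summary involves, the sketch below builds a comparable table for a throwaway directory using base R. The real cache location, the file naming scheme, and the columns returned by ihs_cache_info() are not documented on this page, so everything here is an assumption made for illustration.

```r
# Stand-in cache directory with placeholder files (not real IHS data).
cache_dir <- file.path(tempdir(), "ihs_cache_demo")
dir.create(cache_dir, showWarnings = FALSE)
writeLines("x", file.path(cache_dir, "IHS4_demo.csv"))
writeLines("xy", file.path(cache_dir, "IHS5_demo.csv"))

# Collect one row per file: name, size in bytes, last modification time.
files <- list.files(cache_dir, full.names = TRUE)
info <- file.info(files)
cache_summary <- data.frame(
  file     = basename(files),
  size     = info$size,
  modified = info$mtime
)

cache_summary

# Remove the demo directory again.
unlink(cache_dir, recursive = TRUE)
```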

Examples

## Not run: 
ihs_cache_info()

## End(Not run)

Check Crosswalk Health

Description

Evaluates the ihsMW crosswalk variable map. Prints a formatted report indicating how many variables are present across rounds, and flags any variables needing manual review.

Usage

ihs_crosswalk_check(verbose = TRUE)

Arguments

verbose

Logical. If TRUE (default), prints the report to the console using message().

Value

A tibble containing the master crosswalk, returned invisibly.
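
The kind of coverage summary described above can be sketched on a toy crosswalk. The column layout used here (one column per round holding the round-specific name, NA when absent) and the rule for flagging a variable are both assumptions made for illustration; the structure of the real crosswalk tibble may differ.

```r
# Toy crosswalk: hypothetical layout, made-up variable names.
cw <- data.frame(
  harmonised = c("var_a", "var_b", "var_c"),
  IHS4 = c("a_old", NA, "c_old"),
  IHS5 = c("a_new", "b_new", "c_new")
)
rounds <- c("IHS4", "IHS5")

# TRUE where a harmonised variable has a counterpart in a given round.
present <- !is.na(cw[rounds])

# Number of variables available in each round.
coverage <- colSums(present)

# Assumed flagging rule: a variable missing from at least one round.
needs_review <- cw$harmonised[rowSums(present) < length(rounds)]

coverage
needs_review
```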

Examples

## Not run: 
cw <- ihs_crosswalk_check()

## End(Not run)

Set up World Bank Microdata API Key (Alias)

Description

A wrapper for ihs_auth() meant for use in scripted or non-interactive environments.

Usage

ihs_key_set(key)

Arguments

key

A single string containing your World Bank Microdata API key.

Value

Invisibly returns the API key.

Examples

## Not run: 
ihs_key_set("paste_your_key_here")

## End(Not run)

Look up a variable label locally or remotely

Description

Looks up the label that describes what a variable measures. The bundled (offline) harmonised mappings are searched first; if the variable is not found there, the function falls back to the NADA API.

Usage

ihs_label(variable, round = "IHS5")

Arguments

variable

A single character string naming the variable to look up, either a harmonised name or a round-specific (non-harmonised) name.

round

The survey round in which to look the variable up when it is not a harmonised name. Default: "IHS5".

Value

The label of the variable, describing what it measures.
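
The local-first, remote-fallback lookup described for ihs_label can be sketched as follows. All names here (local_labels, remote_lookup, lookup_label) are hypothetical, the labels are made up, and the remote step is a stub that performs no network request; the package's actual implementation is not shown on this page.

```r
# Toy offline table standing in for the bundled mappings (made-up labels).
local_labels <- c(var_a = "Label for A", var_b = "Label for B")

# Hypothetical stub for the remote step; a real one would query the API.
remote_lookup <- function(variable, round) {
  paste0("remote label for ", variable, " (", round, ")")
}

# Try the offline table first and fall back to the remote step otherwise.
lookup_label <- function(variable, round = "IHS5") {
  if (variable %in% names(local_labels)) {
    return(local_labels[[variable]])
  }
  remote_lookup(variable, round)
}

lookup_label("var_a")                   # found locally
lookup_label("hh_x99", round = "IHS4")  # not found locally, uses the stub
```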

Examples

ihs_label("rexp_cat01")

Inspect available modules for a study

Description

Lists the data files (modules) that make up a survey round, together with the number of variables in each.

Usage

ihs_modules(round = "IHS5")

Arguments

round

The round whose modules should be listed (e.g. "IHS5").

Value

Invisibly returns a tibble describing the modules of the round.

Examples

## Not run: 
ihs_modules("IHS5")

## End(Not run)

Search the harmonisation crosswalk for variables

Description

Searches the manually curated harmonisation crosswalk bundled with ihsMW for variables matching a keyword.

Usage

ihs_search(keyword, round = NULL, fields = c("name", "label", "module"))

Arguments

keyword

A single search string to find (case-insensitive).

round

Limits search to a specific round. Valid inputs are "IHS2", "IHS3", "IHS4", "IHS5". Defaults to NULL (all rounds).

fields

A character vector of fields to include in the search. Valid fields are "name", "label", and "module".

Value

A tibble with cross-round harmonised search results.
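
A case-insensitive keyword search restricted to chosen fields can be sketched in base R on a toy table. The table contents and the search_cw() helper are hypothetical, and plain substring matching is an assumption: the package lists stringdist among its imports, so its own matching may behave differently.

```r
# Toy crosswalk with the three searchable fields (made-up entries).
cw <- data.frame(
  name   = c("cons_total", "hh_size", "food_exp"),
  label  = c("Total consumption", "Household size", "Food expenditure"),
  module = c("consumption_agg", "hh_roster", "consumption_agg")
)

# Hypothetical helper: keep rows where the keyword occurs in any chosen field.
search_cw <- function(keyword, fields = c("name", "label", "module")) {
  fields <- match.arg(fields, several.ok = TRUE)
  hit <- Reduce(`|`, lapply(fields, function(f) {
    # Lower-case both sides so the comparison ignores case.
    grepl(tolower(keyword), tolower(cw[[f]]), fixed = TRUE)
  }))
  cw[hit, , drop = FALSE]
}

search_cw("CONSUMPTION")                    # matches via label or module
search_cw("consumption", fields = "label")  # label field only
```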

Examples

ihs_search("consumption")
ihs_search("expenditure", round = "IHS5")
ihs_search("age", fields = c("name", "label"))

Inspect all variables for a study

Description

Lists the variables available for a survey round, queried live from the NADA API.

Usage

ihs_variables(round = "IHS5", module = NULL)

Arguments

round

A specific round to fetch variables for (e.g. "IHS5").

module

Optional module name used to restrict the listing to variables in that module (matched case-insensitively).

Value

Invisibly returns a tibble of the variables found, alongside their known names where a mapping exists.

Examples

## Not run: 
ihs_variables(round = "IHS4")
ihs_variables(round = "IHS5", module = "hh_mod_g")

## End(Not run)