Getting started with datasusr

datasusr provides fast, in-memory reading of DATASUS .dbc files and a complete workflow for discovering, downloading, and caching Brazilian public health data.

The fastest way: datasus_fetch()

If you know the source, file type, period, and state you need, datasus_fetch() handles listing, downloading, and reading in a single call:

library(datasusr)

df <- datasus_fetch(
  source    = "SIHSUS",
  file_type = "RD",
  year      = 2024,
  month     = 1,
  uf        = "PE"
)

df

The result is a tibble ready for analysis with dplyr, ggplot2, or any tidyverse tool. Files are cached by default, so running the same call again skips the download entirely.
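
The caching behaviour described above can be pictured roughly as follows. This is a minimal base R sketch, not datasusr's actual implementation: fetch_cached(), its downloader argument, the example URL, and the cache directory are all hypothetical names used only for illustration.

```r
# Hypothetical helper: download a file only if it is not already cached.
# `downloader` is any function(url, destfile) that writes the file to disk.
fetch_cached <- function(url, cache_dir, downloader = utils::download.file) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  dest <- file.path(cache_dir, basename(url))
  if (!file.exists(dest)) {
    downloader(url, dest)  # cache miss: fetch the file once
  }
  dest                     # either way, return the local path
}

# Demonstration with a stub downloader, so no network access is needed.
calls <- 0
stub <- function(url, destfile) {
  calls <<- calls + 1
  writeLines("placeholder", destfile)
}

cache_dir <- file.path(tempdir(), "datasus-cache-demo")
unlink(cache_dir, recursive = TRUE)  # start from an empty cache

p1 <- fetch_cached("ftp://example.invalid/RDPE2401.dbc", cache_dir, stub)
p2 <- fetch_cached("ftp://example.invalid/RDPE2401.dbc", cache_dir, stub)
```

The second call finds the file already on disk and never invokes the downloader, which is the effect the default cache is meant to have.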

Reading a local DBC file

If you already have a .dbc file on disk, use read_datasus_dbc() directly:

x <- read_datasus_dbc("RDPE2401.dbc")
x
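
The file name in this example appears to follow the usual SIHSUS monthly pattern: file type, two-letter state code, two-digit year, two-digit month. The base R sketch below builds such names; dbc_file_name() is a hypothetical helper, not part of datasusr, and other DATASUS systems may use different naming schemes.

```r
# Hypothetical helper: assemble a monthly file name such as "RDPE2401.dbc".
dbc_file_name <- function(file_type, uf, year, month) {
  sprintf(
    "%s%s%02d%02d.dbc",
    toupper(file_type),
    toupper(uf),
    as.integer(year) %% 100L,  # keep only the last two digits of the year
    as.integer(month)
  )
}

dbc_file_name("RD", "PE", 2024, 1)  # "RDPE2401.dbc"

# Names for several months and states at once
grid <- expand.grid(month = 1:3, uf = c("PE", "PB"), stringsAsFactors = FALSE)
file_names <- dbc_file_name("RD", grid$uf, 2024, grid$month)
```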

Selecting columns

DATASUS files often have dozens of columns. Use select to keep only what you need; reading fewer columns is faster and uses less memory:

x <- read_datasus_dbc(
  "RDPE2401.dbc",
  select = c("uf_zi", "ano_cmpt", "munic_res", "val_tot")
)
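
Once only the relevant columns are loaded, a typical next step is a grouped summary. The sketch below uses a small hand-made data frame in place of a real file; the column names match the select call above, but every value is invented purely for illustration, and base R is used so the example has no dependencies.

```r
# Toy stand-in for the result of read_datasus_dbc(); all values are made up.
x_demo <- data.frame(
  uf_zi     = c("260000", "260000", "260000"),
  ano_cmpt  = c(2024L, 2024L, 2024L),
  munic_res = c("261160", "261160", "260960"),
  val_tot   = c(100.5, 200, 50),
  stringsAsFactors = FALSE
)

# Total value per municipality of residence
totals <- aggregate(val_tot ~ munic_res, data = x_demo, FUN = sum)
totals
```

The same summary could be written with dplyr's group_by() and summarise() if the tidyverse is already loaded.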

Controlling column types

By default, datasusr inspects each numeric field to decide between integer and double. You can override this with col_types, parse date fields with parse_dates, and turn the automatic guessing off with guess_types = FALSE:

x <- read_datasus_dbc(
  "SPPE2401.dbc",
  select     = c("sp_gestor", "sp_naih", "sp_dtinter", "sp_valato"),
  col_types  = c(
    sp_gestor  = "character",
    sp_naih    = "character",
    sp_dtinter = "date",
    sp_valato  = "double"
  ),
  parse_dates = TRUE,
  guess_types = FALSE
)
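
To see why explicit types can matter, the base R sketch below mimics two of the conversions requested above. It assumes that date fields arrive as eight-character YYYYMMDD strings and that hospitalization numbers are 13-digit identifiers; the sample values are invented, and the conversions shown are not datasusr's internal code.

```r
# Made-up raw values, as they might look before type conversion
raw_date <- c("20240115", "20240131", "")
raw_naih <- "2624100000012"  # illustrative 13-digit identifier

# Dates: parse YYYYMMDD strings; empty or malformed entries become NA
parsed <- as.Date(raw_date, format = "%Y%m%d")

# Identifiers: a 13-digit value does not fit in R's 32-bit integer type,
# so coercion yields NA. Keeping such fields as character avoids this.
as_int <- suppressWarnings(as.integer(raw_naih))
is.na(as_int)
```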

Exploring available data

Before downloading, you can browse the internal catalog to discover which sources and file types are available:

datasus_sources()
datasus_file_types(source = "SIHSUS")
datasus_file_types(source = "CNES")

Step-by-step workflow

For more control, you can use the individual functions instead of datasus_fetch():

# 1. Build the FTP paths
datasus_build_path(source = "SIHSUS", file_type = "RD", year = 2024, month = 1)

# 2. List files (validated against FTP)
files <- datasus_list_files(
  source    = "SIHSUS",
  file_type = "RD",
  year      = 2024,
  month     = 1:3,
  uf        = c("PE", "PB")
)

# 3. Download with cache
downloads <- datasus_download(files, use_cache = TRUE)

# 4. Read
x <- read_datasus_dbc(downloads$local_file[[1]])

To skip FTP validation (useful when the server is slow), set check_exists = FALSE in datasus_list_files().
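
Step 4 above reads only the first downloaded file. When several months or states are involved, one common pattern is to read every file and stack the results. The sketch below is generic base R: read_all() is a hypothetical helper, and a stub reader stands in for read_datasus_dbc() so the example runs without real files. It assumes all files share the same columns.

```r
# Hypothetical helper: read each path with `reader` and row-bind the results.
read_all <- function(paths, reader) {
  parts <- lapply(paths, function(p) {
    d <- reader(p)
    d$source_file <- basename(p)  # record which file each row came from
    d
  })
  do.call(rbind, parts)
}

# Stub reader returning two rows per file; it stands in for a real DBC reader.
stub_reader <- function(path) data.frame(n = 1:2)

paths <- c("RDPE2401.dbc", "RDPB2401.dbc")
combined <- read_all(paths, stub_reader)
```

In the real workflow the call would presumably look like read_all(downloads$local_file, read_datasus_dbc), though that combination is untested here.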

Territorial data (municipalities, regions)

DATASUS publishes territorial reference tables (municipalities, health regions, etc.) as CSV files. Use datasus_get_territory() to download and read them:

# Download municipalities table
municipios <- datasus_get_territory("tb_municip")
municipios

# Other available tables
datasus_ftp_ls("ftp://ftp.datasus.gov.br/territorio/tabelas/")

Finding documentation and data dictionaries

Each information system has documentation files on the DATASUS FTP. Use datasus_docs_url() to find them:

# See all documentation paths
datasus_docs_url()

# List documentation files for a specific system
datasus_docs_url("CNES")
datasus_ftp_ls(datasus_docs_url("CNES")$docs_url[[1]])

Next steps

See the other vignettes for more detail: