
Reads the metadata about an IPUMS extract from a DDI codebook into an ipums_ddi object.

These metadata contain parsing instructions for the associated fixed-width data file, contextual labels for variables and values in the data, and general extract information.

See Downloading IPUMS files below for information about downloading IPUMS DDI codebook files.

Usage

read_ipums_ddi(
  ddi_file,
  lower_vars = FALSE,
  file_select = deprecated(),
  data_layer = deprecated()
)

Arguments

ddi_file

Path to a DDI .xml file downloaded from IPUMS. See Downloading IPUMS files below.

lower_vars

Logical indicating whether to convert variable names to lowercase. Defaults to FALSE for consistency with IPUMS conventions.

data_layer, file_select

[Deprecated] Reading DDI files contained in a .zip archive has been deprecated. Please provide the full path to the .xml file to be loaded in ddi_file.
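
Because reading a codebook directly from a .zip archive is deprecated, one option is to extract the .xml file first and pass its path to ddi_file. The following is a minimal sketch using base R's unzip(); the helper name read_ddi_from_zip() is hypothetical and not part of ipumsr, and it assumes the archive contains exactly one .xml file.

```r
library(ipumsr)

# Hypothetical helper (not part of ipumsr): extract the DDI codebook from a
# zipped extract and read it with read_ipums_ddi()
read_ddi_from_zip <- function(zip_path, exdir = tempdir()) {
  # List the archive contents without extracting anything
  zip_contents <- utils::unzip(zip_path, list = TRUE)$Name
  xml_name <- grep("\\.xml$", zip_contents, value = TRUE, ignore.case = TRUE)

  # Assumption: the archive holds exactly one codebook
  if (length(xml_name) != 1) {
    stop("Expected exactly one .xml file in the archive, found ", length(xml_name))
  }

  # Extract only the codebook; unzip() returns the path it wrote
  xml_path <- utils::unzip(zip_path, files = xml_name, exdir = exdir)

  read_ipums_ddi(xml_path)
}
```

Calling read_ddi_from_zip("path/to/extract.zip") with your own archive path would then return an ipums_ddi object in the same way as a direct call to read_ipums_ddi().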

Value

An ipums_ddi object containing the extract's variable-level and file-level metadata.

Downloading IPUMS files

The DDI codebook (.xml) file provided with IPUMS microdata extracts can be downloaded through the IPUMS extract interface or (for some collections) within R using the IPUMS API.

If using the IPUMS extract interface:

  • Download the DDI codebook by right-clicking the DDI link in the Codebook column of the extract interface and selecting Save as... (on Safari, you may have to select Download Linked File As...). Make sure that the codebook is downloaded in .xml format.

If using the IPUMS API:

  • For supported collections, use download_extract() to download a completed extract via the IPUMS API. This automatically downloads both the DDI codebook and the data file from the extract and returns the path to the codebook file.
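
The API route can be sketched as follows. This assumes ipumsr 0.8.0 or later (where define_extract_micro() is available) and that an IPUMS API key has already been registered, for example with set_ipums_api_key(). The wrapper name download_example_ddi() is hypothetical, and the sample and variables are illustrative only.

```r
library(ipumsr)

# Hypothetical wrapper (not part of ipumsr). Running it requires network
# access and a registered IPUMS API key.
download_example_ddi <- function(download_dir = tempdir()) {
  # Illustrative extract definition; substitute your own samples and variables
  extract <- define_extract_micro(
    collection = "cps",
    description = "Example CPS extract",
    samples = "cps2018_03s",
    variables = c("AGE", "SEX")
  )

  # Submit the request and block until the extract is ready to download
  submitted <- submit_extract(extract)
  ready <- wait_for_extract(submitted)

  # Downloads both the codebook and the data file; returns the codebook path
  ddi_path <- download_extract(ready, download_dir = download_dir)

  read_ipums_ddi(ddi_path)
}
```

Nothing is submitted until the function is called, so defining it has no side effects.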

See also

read_ipums_micro(), read_ipums_micro_chunked() and read_ipums_micro_yield() to read data from IPUMS microdata extracts.

ipums_var_info() and ipums_file_info() to view metadata about an ipums_ddi object.

ipums_list_files() to list files in an IPUMS extract.

Examples

# Example codebook file
ddi_file <- ipums_example("cps_00157.xml")

# Load metadata into an `ipums_ddi` object
ddi <- read_ipums_ddi(ddi_file)

# Use the object to load its associated data
cps <- read_ipums_micro(ddi)
#> Use of data from IPUMS CPS is subject to conditions including that users should cite the data appropriately. Use command `ipums_conditions()` for more details.

head(cps)
#> # A tibble: 6 × 8
#>    YEAR SERIAL MONTH     ASECWTH STATEFIP       PERNUM ASECWT INCTOT            
#>   <dbl>  <dbl> <int+lbl>   <dbl> <int+lbl>       <dbl>  <dbl> <dbl+lbl>         
#> 1  1962     80 3 [March]   1476. 55 [Wisconsin]      1  1476.      4883         
#> 2  1962     80 3 [March]   1476. 55 [Wisconsin]      2  1471.      5800         
#> 3  1962     80 3 [March]   1476. 55 [Wisconsin]      3  1579. 999999998 [Missin…
#> 4  1962     82 3 [March]   1598. 27 [Minnesota]      1  1598.     14015         
#> 5  1962     83 3 [March]   1707. 27 [Minnesota]      1  1707.     16552         
#> 6  1962     84 3 [March]   1790. 27 [Minnesota]      1  1790.      6375         

# Or get metadata information directly
ipums_var_info(ddi)
#> # A tibble: 8 × 10
#>   var_name var_label        var_desc val_labels code_instr start   end imp_decim
#>   <chr>    <chr>            <chr>    <list>     <chr>      <dbl> <dbl>     <dbl>
#> 1 YEAR     Survey year      "YEAR r… <tibble>   "YEAR is …     1     4         0
#> 2 SERIAL   Household seria… "SERIAL… <tibble>   "SERIAL i…     5     9         0
#> 3 MONTH    Month            "MONTH … <tibble>    NA           10    11         0
#> 4 ASECWTH  Annual Social a… "ASECWT… <tibble>   "ASECWTH …    12    22         4
#> 5 STATEFIP State (FIPS cod… "STATEF… <tibble>    NA           23    24         0
#> 6 PERNUM   Person number i… "PERNUM… <tibble>   "PERNUM i…    25    26         0
#> 7 ASECWT   Annual Social a… "ASECWT… <tibble>   "ASECWT i…    27    37         4
#> 8 INCTOT   Total personal … "INCTOT… <tibble>   "99999999…    38    46         0
#> # ℹ 2 more variables: var_type <chr>, rectypes <lgl>

ipums_file_info(ddi)[1:2]
#> $ipums_project
#> [1] "IPUMS CPS"
#> 
#> $extract_date
#> [1] "2023-07-10"
#> 

# If variable metadata have been lost from a data source, reattach from
# its corresponding `ipums_ddi` object:
cps <- zap_ipums_attributes(cps)

ipums_var_label(cps$STATEFIP)
#> [1] NA

cps <- set_ipums_var_attributes(cps, ddi$var_info)

ipums_var_label(cps$STATEFIP)
#> [1] "State (FIPS code)"
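
The examples above all use the default lower_vars = FALSE. As a brief sketch of the alternative, the same example codebook can be read with lower_vars = TRUE and its variable names compared against the defaults; the object names used here are illustrative.

```r
library(ipumsr)

ddi_file <- ipums_example("cps_00157.xml")

# Read the same codebook twice: once with defaults, once with lowercase names
ddi_default <- read_ipums_ddi(ddi_file)
ddi_lower <- read_ipums_ddi(ddi_file, lower_vars = TRUE)

# Variable names recorded in each object's metadata
default_names <- ipums_var_info(ddi_default)$var_name
lower_names <- ipums_var_info(ddi_lower)$var_name

# Only the case of the names should differ
identical(lower_names, tolower(default_names))
```

Lowercase names can be convenient in R code, at the cost of no longer matching the uppercase names used in IPUMS documentation.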