Add values to an existing extract definition for an IPUMS microdata collection
Source: R/api_define_extract.R
Add new values or replace existing values in an IPUMS microdata extract
definition. All fields are optional; any field that is omitted is left
unchanged. Supplying a value for a field that takes a single value, such as
description or data_format, replaces the existing value with the supplied
value.
This function is marked as experimental because it is typically not the best
option for maintaining reproducible extract definitions and may be retired
in the future. For reproducibility, users should strive to build extract
definitions with define_extract_micro().
If you have a complicated extract definition to revise, but do not have
the original extract definition code that created it, we suggest that you
save the revised extract as a JSON file with save_extract_as_json(). This
will create a stable version of the extract definition that
can be used in the future as needed.
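The workflow suggested above could look roughly like the following sketch. It assumes the ipumsr package is installed; the file name revised_extract.json is a hypothetical choice for illustration, and define_extract_from_json() is used here only to show that the saved definition can be read back.

```r
library(ipumsr)

# Start from an existing definition (built here only for illustration)
extract <- define_extract_micro(
  collection = "usa",
  description = "2013 ACS Data",
  samples = "us2013a",
  variables = c("SEX", "AGE", "YEAR")
)

# Revise the definition
revised <- add_to_extract(extract, samples = "us2014a")

# Save the revised definition to a JSON file
# (hypothetical file name, written to a temporary directory)
json_path <- file.path(tempdir(), "revised_extract.json")
save_extract_as_json(revised, file = json_path)

# Later, rebuild the definition from the saved file
restored <- define_extract_from_json(json_path)
restored
```

Keeping the JSON file under version control alongside analysis code is one way to make the revised definition recoverable without the original definition code.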
To remove existing values from an IPUMS microdata extract definition, use
remove_from_extract().
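As a minimal sketch of the reverse operation, assuming the ipumsr package is installed, values added with add_to_extract() can be taken out again with remove_from_extract(). The sample and variable names are illustrative only.

```r
library(ipumsr)

extract <- define_extract_micro(
  collection = "usa",
  description = "2013 ACS Data",
  samples = "us2013a",
  variables = c("SEX", "AGE", "YEAR")
)

# Add a sample and two variables
extended <- add_to_extract(
  extract,
  samples = "us2014a",
  variables = c("MARST", "BIRTHYR")
)

# Remove the added sample and one of the added variables
trimmed <- remove_from_extract(
  extended,
  samples = "us2014a",
  variables = "MARST"
)

trimmed
```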
Learn more about the IPUMS API in vignette("ipums-api").
Usage
# S3 method for class 'micro_extract'
add_to_extract(
  extract,
  description = NULL,
  samples = NULL,
  variables = NULL,
  time_use_variables = NULL,
  sample_members = NULL,
  data_format = NULL,
  data_structure = NULL,
  rectangular_on = NULL,
  case_select_who = NULL,
  data_quality_flags = NULL,
  ...
)
Arguments
- extract
An ipums_extract object.
- description
Description of the extract.
- samples
Vector of samples to include in the extract request. Use get_sample_info() to identify sample IDs for a given collection.
- variables
Character vector of variable names or a list of var_spec objects created by var_spec() containing specifications for all variables to include in the extract.
If a variable already exists in the extract, its specifications will be added to those that already exist for that variable.
- time_use_variables
Vector of names of IPUMS-defined time use variables or a list of specifications for user-defined time use variables to include in the extract request. Use tu_var_spec() to create a tu_var_spec object containing a time use variable specification.
- sample_members
Indication of whether to include additional sample members in the extract request. If provided, must be one of "include_non_respondents", "include_household_members", or both.
Sample member selection is only available for the IPUMS ATUS collection ("atus").
- data_format
Format for the output extract data file. Either "fixed_width" or "csv".
Note that while "stata", "spss", or "sas9" are also accepted, these file formats are not supported by ipumsr data-reading functions.
- data_structure
Data structure for the output extract data.
  - "rectangular" provides data in which every row has the same record type (determined by "rectangular_on"), with variables from other record types written onto associated records of the chosen type (e.g. household variables written onto person records).
  - "hierarchical" provides data that include rows of differing record types, with records ordered according to their hierarchical structure (e.g. each person record is followed by the activity records for that person).
  - "household_only" provides household records only. This data structure is only available for the IPUMS USA collection ("usa").
- rectangular_on
If data_structure is "rectangular", records on which to rectangularize. One of "P" (person), "A" (activity), "I" (injury), or "R" (round).
Defaults to "P" if data_structure is "rectangular" and NULL otherwise.
- case_select_who
Indication of how to interpret any case selections included for variables in the extract definition.
  - "individuals" includes records for all individuals who match the specified case selections.
  - "households" includes records for all members of each household that contains an individual who matches the specified case selections.
Defaults to "individuals". Use var_spec() to add case selections for specific variables.
- data_quality_flags
Set to TRUE to include data quality flags for all applicable variables in the extract definition. This will override the data_quality_flags specification for individual variables in the definition.
Use var_spec() to add data quality flags for specific variables.
- ...
Ignored.
Details
If the supplied extract definition comes from a previously submitted extract request, this function will reset the definition to an unsubmitted state.
To modify variable-specific parameters for variables that already exist
in the extract, create a new variable specification with var_spec().
See also
remove_from_extract() to remove values from an extract definition.
submit_extract() to submit an extract request.
download_extract() to download extract data files.
define_extract_micro() to create a new extract definition from scratch.
Examples
extract <- define_extract_micro(
  collection = "usa",
  description = "2013 ACS Data",
  samples = "us2013a",
  variables = c("SEX", "AGE", "YEAR")
)
# Add a single sample
add_to_extract(extract, samples = "us2014a")
#> Unsubmitted IPUMS USA extract
#> Description: 2013 ACS Data
#>
#> Samples: (2 total) us2013a, us2014a
#> Variables: (3 total) SEX, AGE, YEAR
# Add samples and variables
extract2 <- add_to_extract(
  extract,
  samples = "us2014a",
  variables = c("MARST", "BIRTHYR")
)
# Modify specifications for variables in the extract by using `var_spec()`
# with the existing variable name:
add_to_extract(
  extract,
  samples = "us2014a",
  variables = var_spec("SEX", case_selections = "2")
)
#> Unsubmitted IPUMS USA extract
#> Description: 2013 ACS Data
#>
#> Samples: (2 total) us2013a, us2014a
#> Variables: (3 total) SEX, AGE, YEAR
# You can make multiple modifications or additions by providing a list
# of `var_spec()` objects:
add_to_extract(
  extract,
  samples = "us2014a",
  variables = list(
    var_spec("RACE", attached_characteristics = "mother"),
    var_spec("SEX", case_selections = "2"),
    var_spec("RELATE")
  )
)
#> Unsubmitted IPUMS USA extract
#> Description: 2013 ACS Data
#>
#> Samples: (2 total) us2013a, us2014a
#> Variables: (5 total) SEX, AGE, YEAR, RACE, RELATE
# Fields that take only a single value are replaced
add_to_extract(extract, description = "New description")$description
#> [1] "New description"
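The same replacement behavior applies to other single-value fields such as data_format. The following is a minimal sketch, assuming the ipumsr package is installed; it also illustrates that add_to_extract() returns a modified copy, so the original definition object is unchanged unless the result is reassigned.

```r
library(ipumsr)

extract <- define_extract_micro(
  collection = "usa",
  description = "2013 ACS Data",
  samples = "us2013a",
  variables = c("SEX", "AGE", "YEAR")
)

# data_format takes a single value, so the supplied value replaces the old one
csv_extract <- add_to_extract(extract, data_format = "csv")
csv_extract$data_format
#> [1] "csv"

# The original object keeps its previous data_format
extract$data_format
```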