Add values to an existing IPUMS NHGIS extract definition
Source:R/api_define_extract.R
add_to_extract.nhgis_extract.Rd
Add new values to an IPUMS NHGIS extract definition.
All fields are optional, and if omitted, will be unchanged.
Supplying a value for fields that take a single value, such as
description
and data_format
, will replace the existing value with
the supplied value.
This function is marked as experimental because it is typically not the best
option for maintaining reproducible extract definitions and may be retired
in the future. For reproducibility, users should strive to build extract
definitions with define_extract_nhgis()
.
If you have a complicated extract definition to revise, but do not have
the original extract definition code that created it, we suggest that you
save the revised extract as a JSON file with save_extract_as_json()
. This
will create a stable version of the extract definition that
can be used in the future as needed.
To remove existing values from an IPUMS NHGIS extract definition, use
remove_from_extract()
.
Learn more about the IPUMS API in vignette("ipums-api")
.
Usage
# S3 method for class 'nhgis_extract'
add_to_extract(
extract,
description = NULL,
datasets = NULL,
time_series_tables = NULL,
geographic_extents = NULL,
shapefiles = NULL,
breakdown_and_data_type_layout = NULL,
tst_layout = NULL,
data_format = NULL,
...
)
Arguments
- extract
An
ipums_extract
object.- description
Description of the extract.
- datasets
List of
ds_spec
objects created byds_spec()
containing the specifications for the datasets to include in the extract request. See examples.If a dataset already exists in the extract, its new specifications will be added to those that already exist for that dataset.
- time_series_tables
List of
tst_spec
objects created bytst_spec()
containing the specifications for the time series tables to include in the extract request.If a time series table already exists in the extract, its new specifications will be added to those that already exist for that time series table.
- geographic_extents
Vector of geographic extents to use for all of the
datasets
in the extract definition (for instance, to obtain data within a particular state). Use"*"
to select all available extents.Required when any of the
datasets
included in the extract definition includegeog_levels
that require extent selection. Seeget_metadata_nhgis()
to determine if a geographic level requires extent selection. At the time of writing, NHGIS supports extent selection only for blocks and block groups.- shapefiles
Names of any shapefiles to include in the extract request.
- breakdown_and_data_type_layout
The desired layout of any
datasets
that have multiple data types or breakdown values."single_file"
(default) keeps all data types and breakdown values in one file"separate_files"
splits each data type or breakdown value into its own file
Required if any
datasets
included in the extract definition consist of multiple data types (for instance, estimates and margins of error) or have multiple breakdown values specified. Seeget_metadata_nhgis()
to determine whether a requested dataset has multiple data types.- tst_layout
The desired layout of all
time_series_tables
included in the extract definition."time_by_column_layout"
(wide format, default): rows correspond to geographic units, columns correspond to different times in the time series"time_by_row_layout"
(long format): rows correspond to a single geographic unit at a single point in time"time_by_file_layout"
: data for different times are provided in separate files
Required when an extract definition includes any
time_series_tables
.- data_format
The desired format of the extract data file.
"csv_no_header"
(default) includes only a minimal header in the first row"csv_header"
includes a second, more descriptive header row."fixed_width"
provides data in a fixed width format
Note that by default,
read_nhgis()
removes the additional header row in"csv_header"
files.Required when an extract definition includes any
datasets
ortime_series_tables
.- ...
Ignored
Details
For extract fields that take a single value, add_to_extract()
will
replace the existing value with the new value provided for that field.
It is not necessary to first remove this value using
remove_from_extract()
.
If the supplied extract definition comes from a previously submitted extract request, this function will reset the definition to an unsubmitted state.
See also
remove_from_extract()
to remove
values from an extract definition.
define_extract_nhgis()
to create a new extract definition.
submit_extract()
to submit an extract request.
download_extract()
to download extract data files.
Examples
extract <- define_extract_nhgis(
datasets = ds_spec("1990_STF1", c("NP1", "NP2"), "county")
)
# Add a new dataset or time series table to the extract
add_to_extract(
extract,
datasets = ds_spec("1990_STF2a", "NPA1", "county")
)
#> Unsubmitted IPUMS NHGIS extract
#> Description:
#>
#> Dataset: 1990_STF1
#> Tables: NP1, NP2
#> Geog Levels: county
#>
#> Dataset: 1990_STF2a
#> Tables: NPA1
#> Geog Levels: county
add_to_extract(
extract,
time_series_tables = tst_spec("A00", "state")
)
#> Unsubmitted IPUMS NHGIS extract
#> Description:
#>
#> Dataset: 1990_STF1
#> Tables: NP1, NP2
#> Geog Levels: county
#>
#> Time Series Table: A00
#> Geog Levels: state
# If a dataset/time series table name already exists in the definition
# its specification will be modified by adding the new specifications to
# the existing ones
add_to_extract(
extract,
datasets = ds_spec("1990_STF1", "NP4", "nation")
)
#> Unsubmitted IPUMS NHGIS extract
#> Description:
#>
#> Dataset: 1990_STF1
#> Tables: NP1, NP2, NP4
#> Geog Levels: county, nation
# You can add new datasets and modify existing ones simultaneously by
# providing a list of `ds_spec` objects
add_to_extract(
extract,
datasets = list(
ds_spec("1990_STF1", "NP4", "nation"),
ds_spec("1990_STF2a", "NPA1", "county")
)
)
#> Unsubmitted IPUMS NHGIS extract
#> Description:
#>
#> Dataset: 1990_STF1
#> Tables: NP1, NP2, NP4
#> Geog Levels: county, nation
#>
#> Dataset: 1990_STF2a
#> Tables: NPA1
#> Geog Levels: county
# Values that can only take a single value are replaced
add_to_extract(extract, data_format = "fixed_width")$data_format
#> [1] "fixed_width"