Reads a dataset downloaded from the NHGIS extract system. Expects the data in csv format (with or without the extra header row).

read_nhgis(data_file, data_layer = NULL, verbose = TRUE,
  var_attrs = c("val_labels", "var_label", "var_desc"))

read_nhgis_sf(data_file, shape_file, data_layer = NULL,
  shape_layer = data_layer, shape_encoding = "latin1",
  verbose = TRUE, var_attrs = c("val_labels", "var_label", "var_desc"))

read_nhgis_sp(data_file, shape_file, data_layer = NULL,
  shape_layer = data_layer, shape_encoding = "latin1",
  verbose = TRUE, var_attrs = c("val_labels", "var_label", "var_desc"))

Arguments

data_file

Filepath to the data (either the .zip file directly downloaded from the website, the path to the unzipped folder, or the path to the unzipped .csv file directly).

data_layer

For .zip extracts with multiple datasets, the name of the data to load. Accepts a character vector specifying the file name, or dplyr_select_style conventions. Data layer must uniquely identify a dataset.

verbose

Logical, indicating whether to print progress information to console.

var_attrs

Variable attributes to add from the codebook, defaults to adding all (val_labels, var_label and var_desc). See set_ipums_var_attributes for more details.

shape_file

Filepath to the shape files (either the .zip file directly downloaded from the website, or the path to the unzipped folder, or the unzipped .shp file directly).

shape_layer

(Defaults to using the same value as data_layer) Specification of which shape files to load using the same semantics as data_layer. Can load multiple shape files, which will be combined.

shape_encoding

The text encoding to use when reading the shape file. Typically the default reads the data correctly, but if unexpected characters appear in your data, you may need to set the encoding manually. Defaults to "latin1" for NHGIS.
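
To illustrate why shape_encoding can matter, here is a minimal base R sketch of what happens when latin1 bytes are treated as another encoding. It does not call ipumsr; the place name is a hypothetical example, not taken from an NHGIS extract, and the variable names are illustrative only.

```r
# Hypothetical attribute value stored as latin1 bytes: 0xF1 is "ñ" in latin1.
raw_name <- "Ca\xf1on City"

# The lone 0xF1 byte is not valid UTF-8, so a reader that assumes UTF-8
# would show a garbled character in its place.
is_valid_utf8 <- validUTF8(raw_name)

# Declaring the correct source encoding recovers the intended text.
fixed_name <- iconv(raw_name, from = "latin1", to = "UTF-8")

print(is_valid_utf8)
print(fixed_name)
```

When garbled characters like this show up in the shape attributes, passing a different value to shape_encoding is the analogous fix at read time.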

Value

read_nhgis returns a tbl_df with only the tabular data, read_nhgis_sf returns an sf object with the data and the shapes, and read_nhgis_sp returns a SpatialPolygonsDataFrame with the data and the shapes.

See also

Examples

csv_file <- ipums_example("nhgis0008_csv.zip")
shape_file <- ipums_example("nhgis0008_shape_small.zip")

data_only <- read_nhgis(csv_file)
#> Use of data from NHGIS is subject to conditions including that users should
#> cite the data appropriately. Use command `ipums_conditions()` for more details.
#>
#>
#> Reading data file...

# If sf package is available, can load as sf object
if (require(sf)) {
  sf_data <- read_nhgis_sf(csv_file, shape_file)
}
#> Use of data from NHGIS is subject to conditions including that users should
#> cite the data appropriately. Use command `ipums_conditions()` for more details.
#>
#>
#> Reading data file...
#> Reading geography...
#> options: ENCODING=latin1
#> Reading layer `US_pmsa_1990' from data source `C:\Users\burkx031\AppData\Local\Temp\RtmpgnQu91\file1ee4b7d3f8\US_pmsa_1990.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 71 features and 8 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -2336182 ymin: -1247086 xmax: 2075339 ymax: 1476544
#> epsg (SRID): NA
#> proj4string: +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs

# If sp package is available, can load as SpatialPolygonsDataFrame
if (require(rgdal) && require(sp)) {
  sp_data <- read_nhgis_sp(csv_file, shape_file)
}
#> Use of data from NHGIS is subject to conditions including that users should
#> cite the data appropriately. Use command `ipums_conditions()` for more details.
#>
#>
#> Reading data file...
#> Reading geography...
#> OGR data source with driver: ESRI Shapefile
#> Source: "C:\Users\burkx031\AppData\Local\Temp\RtmpgnQu91\file1ee45aa01ab3", layer: "US_pmsa_1990"
#> with 71 features
#> It has 8 fields
#> Warning: Column `GISJOIN` has different attributes on LHS and RHS of join