Helpers for joining shape files downloaded from the IPUMS website to data from extracts. For historical reasons, the attributes (such as variable type) of variables in the shape files do not always match those in the data files.

ipums_shape_left_join(
  data,
  shape_data,
  by,
  suffix = c("", "SHAPE"),
  verbose = TRUE
)

ipums_shape_right_join(
  data,
  shape_data,
  by,
  suffix = c("", "SHAPE"),
  verbose = TRUE
)

ipums_shape_inner_join(
  data,
  shape_data,
  by,
  suffix = c("", "SHAPE"),
  verbose = TRUE
)

ipums_shape_full_join(
  data,
  shape_data,
  by,
  suffix = c("", "SHAPE"),
  verbose = TRUE
)

Arguments

data

A dataset, usually one that has been aggregated to a geographic level.

shape_data

A shape file loaded with read_ipums_sf or read_ipums_sp.

by

A vector of variable names to join on. As in the dplyr join functions, a named vector indicates that the variable names differ between the data and the shape file: each name refers to a variable in data and each value to the corresponding variable in shape_data.

suffix

For variables that appear in both data and shape_data but are not joined on, a suffix to append to the variable names. Defaults to nothing for data variables and "_SHAPE" for variables from the shape file.

verbose

If TRUE, reports information about geometries dropped in the merge.
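The by and suffix conventions above follow the dplyr join functions, so they can be illustrated with a plain dplyr join on two small tables. This is only a sketch under stated assumptions: the tibbles and their values are made up, no geometry is involved, and dplyr::inner_join stands in for ipums_shape_inner_join, whose internals may differ.

```r
library(dplyr)

# Hypothetical toy tables: `data` stands in for an extract and `shape`
# for the attribute table of a shape file (no geometry column here).
data <- tibble(
  GEO2 = c(1, 2, 3),
  NAME = c("a", "b", "c"),
  POP  = c(10, 20, 30)
)
shape <- tibble(
  GEOLEVEL2 = c(1, 2, 4),
  NAME      = c("A", "B", "D")
)

# Named `by`: the name (GEO2) is the key in `data`, the value (GEOLEVEL2)
# is the key in `shape`. NAME is in both tables but is not joined on, so
# the copy coming from `shape` receives the "_SHAPE" suffix.
joined <- inner_join(
  data, shape,
  by = c("GEO2" = "GEOLEVEL2"),
  suffix = c("", "_SHAPE")
)

names(joined)
```

In this sketch the joined table keeps the key under the name used in data, and the two NAME columns can be told apart by the suffix.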

Value

Returns an sf object or a SpatialPolygonsDataFrame, depending on what was passed in.

Examples

# Note that these examples use NHGIS data so that they use the example data provided,
# but the functions read_nhgis_sf/read_nhgis_sp perform this merge for you.

data <- read_nhgis(ipums_example("nhgis0008_csv.zip"))
#> Use of data from NHGIS is subject to conditions including that users should
#> cite the data appropriately. Use command `ipums_conditions()` for more details.
#> 
#> 
#> Reading data file...

if (require(sf)) {
  sf <- read_ipums_sf(ipums_example("nhgis0008_shape_small.zip"))
  data_sf <- ipums_shape_inner_join(data, sf, by = "GISJOIN")
}
#> Loading required package: sf
#> Linking to GEOS 3.9.1, GDAL 3.3.2, PROJ 7.2.1; sf_use_s2() is TRUE
#> options:        ENCODING=latin1 
#> Reading layer `US_pmsa_1990' from data source 
#>   `C:\Users\derek\AppData\Local\Temp\RtmpCkCwmv\file36e8722d263a\US_pmsa_1990.shp' 
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 71 features and 8 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -2336182 ymin: -1247086 xmax: 2075339 ymax: 1476544
#> Projected CRS: Albers

if (require(sp) && require(rgdal)) {
  sp <- read_ipums_sp(ipums_example("nhgis0008_shape_small.zip"))
  data_sp <- ipums_shape_inner_join(data, sp, by = "GISJOIN")
}
#> Loading required package: sp
#> Loading required package: rgdal
#> Please note that rgdal will be retired by the end of 2023,
#> plan transition to sf/stars/terra functions using GDAL and PROJ
#> at your earliest convenience.
#> 
#> rgdal: version: 1.5-32, (SVN revision 1176)
#> Geospatial Data Abstraction Library extensions to R successfully loaded
#> Loaded GDAL runtime: GDAL 3.3.2, released 2021/09/01
#> Path to GDAL shared files: C:/Users/derek/AppData/Local/R/win-library/4.2/rgdal/gdal
#> GDAL binary built with GEOS: TRUE 
#> Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
#> Path to PROJ shared files: C:/Users/derek/AppData/Local/R/win-library/4.2/rgdal/proj
#> PROJ CDN enabled: FALSE
#> Linking to sp version:1.4-7
#> To mute warnings of possible GDAL/OSR exportToProj4() degradation,
#> use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
#> OGR data source with driver: ESRI Shapefile 
#> Source: "C:\Users\derek\AppData\Local\Temp\RtmpCkCwmv\file36e8143d553e", layer: "US_pmsa_1990"
#> with 71 features
#> It has 8 fields

if (FALSE) {
  # Sometimes variable names won't match between datasets (for example, in IPUMS International)
  data <- read_ipums_micro("ipumsi_00004.xml")
  shape <- read_ipums_sf("geo2_br1980_2010.zip")
  data_sf <- ipums_shape_inner_join(data, shape, by = c("GEO2" = "GEOLEVEL2"))
}
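When a join reports dropped geometries, it can help to see which keys have no partner on the other side. The sketch below does this with dplyr::anti_join on made-up tables. The GISJOIN values are hypothetical and the tables carry no geometry. It illustrates the idea and does not reproduce the check the ipums_shape_*_join functions perform.

```r
library(dplyr)

# Hypothetical toy tables sharing a GISJOIN key
data <- tibble(GISJOIN = c("G001", "G002", "G003"), POP = c(10, 20, 30))
shape <- tibble(GISJOIN = c("G001", "G002", "G004"))

# Rows of `shape` with no matching row in `data`:
# these are the ones a left or inner join onto `data` would drop.
shape_only <- anti_join(shape, data, by = "GISJOIN")

# Rows of `data` with no matching row in `shape`:
# an inner join would drop these as well.
data_only <- anti_join(data, shape, by = "GISJOIN")

shape_only$GISJOIN
data_only$GISJOIN
```

Running a check like this before choosing between the left, right, inner, and full variants shows what each choice would keep.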