`R/utils.R`

`interpolate_pw.Rd`

A common use case when working with time-series small-area Census data is to transfer data from one set of shapes (e.g. 2010 Census tracts) to another set of shapes (e.g. 2020 Census tracts). Population-weighted interpolation is one solution to this problem: it takes into account the distribution of the population within a Census unit when transferring data between incongruent units.

```
interpolate_pw(
  from,
  to,
  to_id = NULL,
  extensive,
  weights,
  weight_column = NULL,
  weight_placement = c("surface", "centroid"),
  crs = NULL
)
```

- from
The spatial dataset from which numeric attributes will be interpolated to target zones. By default, all numeric columns in this dataset will be interpolated.

- to
The target geometries (zones) to which numeric attributes will be interpolated.

- to_id
(optional) An ID column in the target dataset to be retained in the output. For data obtained with tidycensus, this will be `"GEOID"` by convention. If `NULL`, the output dataset will include a column `id` that uniquely identifies each row.

- extensive
If `TRUE`, return weighted sums; if `FALSE`, return weighted means.

- weights
An input spatial dataset to be used as weights. If the dataset is not of geometry type `POINT`, it will be converted to points by the function with `sf::st_point_on_surface()`. For US-based applications, this will commonly be a Census block dataset obtained with the tigris or tidycensus packages.

- weight_column
(optional) A column in `weights` used for weighting in the interpolation process. Typically this will be a column representing the population (or other weighting metric, like housing units) of the input weights dataset. If `NULL` (the default), each feature in `weights` is given an equal weight of 1.

- weight_placement
(optional) One of `"surface"`, where weight polygons are converted to points on polygon surfaces with `sf::st_point_on_surface()`, or `"centroid"`, where polygon centroids are used instead with `sf::st_centroid()`. Defaults to `"surface"`. This argument is not necessary if weights are already of geometry type `POINT`.

- crs
(optional) The EPSG code of the output projected coordinate reference system (CRS). Useful as all input layers (`from`, `to`, and `weights`) must share the same CRS for the function to run correctly.

A dataset of class sf with the geometries and an ID column from `to` (the target shapes) but with numeric attributes of `from` interpolated to those shapes.

The approach implemented here is based on Esri's data apportionment algorithm, in which an "apportionment layer" of points (referred to here as the `weights`) is used to determine how to weight areas of overlap between origin and target zones. Users must supply a "from" dataset as an sf object (the dataset from which numeric columns will be interpolated) and a "to" dataset, also of class sf, that contains the target zones. A third sf object, the "weights", may be an object of geometry type `POINT` or polygons from which points will be derived using `sf::st_point_on_surface()`.

An intersection is computed between `from` and `to`, and a spatial join is computed between the intersection layer and the weights layer, represented as points. A specified `weight_column` in `weights` will be used to determine the relative influence of each point on the allocation of values between `from` and `to`; if no weight column is specified, all points will be weighted equally.
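The allocation step described above can be illustrated with a small base R sketch. It is only an illustration of the arithmetic, not the package's internal code: the `points` table, its column names, and the zone IDs are hypothetical, and each weight point is assumed to already carry the origin and target zone it falls in (the information the intersection and spatial join would provide).

```r
# Hypothetical toy data: one row per weight point (e.g. a Census block),
# already tagged with the origin ("from") and target ("to") zone it falls in.
# With no weight column, every point would simply get weight = 1.
points <- data.frame(
  from_id = c("A", "A", "A", "B", "B"),
  to_id   = c("X", "X", "Y", "Y", "Y"),
  weight  = c(100, 50, 50, 30, 70)
)

# Total weight falling in each (from, to) overlap area
overlap <- aggregate(weight ~ from_id + to_id, data = points, FUN = sum)

# Total weight in each origin zone
from_total <- aggregate(weight ~ from_id, data = points, FUN = sum)
names(from_total)[2] <- "from_weight"

# Share of each origin zone's weight that lands in each overlap
overlap <- merge(overlap, from_total, by = "from_id")
overlap$share <- overlap$weight / overlap$from_weight

print(overlap)
```

In this toy case, 75% of zone A's weight falls in target zone X and 25% in Y, while all of zone B's weight falls in Y, so those shares determine how A's and B's values are divided among the target zones.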

The `extensive` parameter (logical) should correctly reflect the type of values being interpolated. If `TRUE`, the function returns a weighted sum for each zone. If `FALSE`, a weighted mean will be returned. For Census data, `extensive = TRUE` should be used for transferring counts / estimated counts between zones. Derived metrics (e.g. population density, percentages, etc.) should use `extensive = FALSE`. Margins of error in the ACS will not be transferred correctly with this function, so please use with caution.
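The difference between the two settings can be seen in a small base R sketch. It assumes a hypothetical `overlap` table with one row per overlap between an origin and a target zone; the column names and values are made up for illustration, and the code mirrors the weighted-sum and weighted-mean arithmetic rather than reproducing the function's implementation.

```r
# Hypothetical overlap table: one row per (from, to) overlap, holding the
# summed point weight in that overlap and the origin zone's attributes.
overlap <- data.frame(
  from_id = c("A", "A", "B"),
  to_id   = c("X", "Y", "Y"),
  weight  = c(150, 50, 100),  # summed weights in each overlap
  count   = c(400, 400, 90),  # extensive variable (e.g. a count of workers)
  pct     = c(20, 20, 5)      # intensive variable (e.g. a percentage)
)

# Extensive: split each origin count by its weight share,
# then sum the pieces within each target zone.
from_weight <- ave(overlap$weight, overlap$from_id, FUN = sum)
overlap$count_part <- overlap$count * overlap$weight / from_weight
extensive <- sapply(split(overlap$count_part, overlap$to_id), sum)

# Intensive: weighted mean of the origin values within each target zone.
intensive <- sapply(
  split(overlap, overlap$to_id),
  function(d) weighted.mean(d$pct, d$weight)
)

extensive  # X = 300, Y = 190 (total of 490 is preserved)
intensive  # X = 20,  Y = 10
```

Note that the extensive result preserves the overall total (400 + 90 = 490), whereas the intensive result stays within the range of the input values, which is why counts and derived metrics need different settings.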

```
if (FALSE) {
  # Example: interpolating work-from-home from 2011-2015 ACS
  # to 2020 shapes
  library(tidycensus)
  library(tidyverse)
  library(tigris)
  options(tigris_use_cache = TRUE)

  # Origin data: 2011-2015 ACS estimates on 2010-vintage tracts;
  # keep only the estimate column (and geometry) for interpolation
  wfh_15 <- get_acs(
    geography = "tract",
    variables = "B08006_017",
    year = 2015,
    state = "AZ",
    county = "Maricopa",
    geometry = TRUE
  ) %>%
    select(estimate)

  # Target zones: 2020 tract geometries
  wfh_20 <- get_acs(
    geography = "tract",
    variables = "B08006_017",
    year = 2020,
    state = "AZ",
    county = "Maricopa",
    geometry = TRUE
  )

  # Weights: 2020 Census blocks with their population counts
  maricopa_blocks <- blocks(
    "AZ",
    "Maricopa",
    year = 2020
  )

  # Counts are extensive, so request weighted sums
  wfh_15_to_20 <- interpolate_pw(
    from = wfh_15,
    to = wfh_20,
    to_id = "GEOID",
    weights = maricopa_blocks,
    weight_column = "POP20",
    crs = 26949,
    extensive = TRUE
  )
}
```