class: center, middle, inverse, title-slide

# Mapping 2020 US Census Data in R
### Kyle Walker
### March 18, 2022

---

## About me

.pull-left[

* Associate Professor of Geography at TCU
* Spatial data science researcher and consultant
* R package developer: __tidycensus__, __tigris__, __mapboxapi__
* Book: [_Analyzing US Census Data: Methods, Maps and Models in R_](https://walker-data.com/census-r/)
  - Available for free online right now;
  - To be published in print with CRC Press in fall 2022

]

.pull-right[

<img src=img/photo.jpeg>

]

---

## SSDAN workshop series

* Last week: an introduction to 2020 US Census data
  - [If you missed the workshop, view the slides here](https://walker-data.com/umich-workshop-2022/intro-2020-census/#1)
* __Right now__: Mapping 2020 Census data
* Friday, March 25: a first look at the 2016-2020 American Community Survey data with R and __tidycensus__

---

## Today's agenda

* Hour 1: Census geographic data and "GIS" in R
* Hour 2: Making maps with 2020 US Census data
* Hour 3: Interactive mapping and advanced geometry handling

---

## General setup

* Packages required for today's workshop:

```r
install.packages(c("tidycensus", "tidyverse", "terra",
                   "tmap", "mapview", "rosm", "crsuggest"))
```

* Other required packages will be picked up as dependencies of these packages
* Or use the pre-built RStudio Cloud environment at https://rstudio.cloud/project/3705005

---

class: middle, center, inverse

## Part 1: Census geographic data and "GIS" in R

---

## US Census Geography

<img src=img/census_diagram.png style="width: 500px">

.footnote[Source: [US Census Bureau](https://www2.census.gov/geo/pdfs/reference/geodiagram.pdf)]

---

## Census TIGER/Line shapefiles

.pull-left[

<img src=img/tiger_logo.png style="width: 350px">

]

.pull-right[

* TIGER: Topologically Integrated Geographic Encoding and Referencing database
* High-quality series of geographic datasets released by the US Census Bureau
* Distributed as _shapefiles_, a common GIS data format comprised
of several related files

]

.footnote[Image source: [US Census Bureau](https://www2.census.gov/geo/pdfs/maps-data/data/tiger/tgrshp2020/TGRSHP2020_TechDoc.pdf)]

---

## A typical GIS workflow

<img src="img/co_counties.png" style="width: 800px">

---

## The __tigris__ R package

.pull-left[

<img src="https://raw.githubusercontent.com/walkerke/tigris/master/tools/readme/tigris_sticker.png" style="width: 400px">

]

.pull-right[

* R interface to the US Census Bureau's TIGER/Line shapefile FTP server
* No API key necessary - just install the package and start using Census shapefiles in R!

]

---

## Basic usage of tigris

* To use tigris, call a function that corresponds to the Census geography you want, optionally by `state` or `county`, when appropriate
* Defaults to 2020 unless `year` is otherwise specified (up to 2021 is available)

```r
library(tigris)

tx_counties <- counties(state = "TX")
```

---

```r
tx_counties
```

```
## Simple feature collection with 254 features and 17 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -106.6456 ymin: 25.83716 xmax: -93.50804 ymax: 36.5007
## Geodetic CRS: NAD83
## First 10 features:
## STATEFP COUNTYFP COUNTYNS GEOID NAME NAMELSAD LSAD CLASSFP
## 8 48 327 01383949 48327 Menard Menard County 06 H1
## 12 48 189 01383880 48189 Hale Hale County 06 H1
## 14 48 011 01383791 48011 Armstrong Armstrong County 06 H1
## 36 48 057 01383814 48057 Calhoun Calhoun County 06 H1
## 39 48 077 01383824 48077 Clay Clay County 06 H1
## 56 48 361 01383966 48361 Orange Orange County 06 H1
## 64 48 177 01383874 48177 Gonzales Gonzales County 06 H1
## 69 48 147 01383859 48147 Fannin Fannin County 06 H1
## 71 48 265 01383918 48265 Kerr Kerr County 06 H1
## 100 48 391 01383981 48391 Refugio Refugio County 06 H1
## MTFCC CSAFP CBSAFP METDIVFP FUNCSTAT ALAND AWATER INTPTLAT
## 8 G4020 <NA> <NA> <NA> A 2336237980 613559 +30.8852677
## 12 G4020 352 38380 <NA> A 2602109423 246678 +34.0684364
## 14 G4020 108 11100 <NA> A 2354617585 12183672 +34.9641790
##
36 G4020 544 38920 <NA> A 1312960774 1361631001 +28.4417191
## 39 G4020 <NA> 48660 <NA> A 2819871858 72506860 +33.7859042
## 56 G4020 <NA> 13140 <NA> A 864501044 118476459 +30.1223229
## 64 G4020 <NA> <NA> <NA> A 2762704862 8204086 +29.4619121
## 69 G4020 206 14300 <NA> A 2307251052 20847065 +33.5911611
## 71 G4020 314 28500 <NA> A 2857617603 10231764 +30.0599530
## 100 G4020 <NA> <NA> <NA> A 1995545557 123688759 +28.3221157
## INTPTLON geometry
## 8 -099.8588613 MULTIPOLYGON (((-99.82187 3...
## 12 -101.8228879 MULTIPOLYGON (((-102.0874 3...
## 14 -101.3566363 MULTIPOLYGON (((-101.6251 3...
## 36 -096.5795739 MULTIPOLYGON (((-96.87329 2...
## 39 -098.2129174 MULTIPOLYGON (((-98.42308 3...
## 56 -093.8940999 MULTIPOLYGON (((-94.03178 3...
## 64 -097.4919205 MULTIPOLYGON (((-97.72765 2...
## 69 -096.1049882 MULTIPOLYGON (((-96.38281 3...
## 71 -099.3533388 MULTIPOLYGON (((-99.25911 2...
## 100 -097.1624723 MULTIPOLYGON (((-97.40849 2...
```

---

```r
plot(tx_counties$geometry)
```

![](index_files/figure-html/basic-plot-1.png)<!-- -->

---

## The __sf__ package and simple feature geometry

.pull-left[

<img src="https://user-images.githubusercontent.com/520851/34887433-ce1d130e-f7c6-11e7-83fc-d60ad4fae6bd.gif" style="width: 400px">

]

.pull-right[

* The sf package implements a _simple features data model_ for vector spatial data in R
* Vector geometries: _points_, _lines_, and _polygons_ stored in a list-column of a data frame
* Allows for tidy spatial data analysis

]

---

class: middle, center, inverse

## Discussion: Making sense of GIS data

---

## Datasets available in tigris

* __Legal entities__: units that have legal significance in the US (e.g. states, counties)
* __Statistical entities__: units that are used to tabulate Census data but do not have legal standing (e.g. Census tracts or block groups)
* __Geographic features__: other geographic datasets provided by the Census Bureau that are not used for demographic tabulation (e.g.
roads, water)

---

## Polygons: statistical entities

```r
travis_tracts <- tracts(state = "TX", county = "Travis")

plot(travis_tracts$geometry)
```

![](index_files/figure-html/travis-tracts-1.png)<!-- -->

---

## Lines: geographic features

```r
travis_roads <- roads(state = "TX", county = "Travis")

plot(travis_roads$geometry)
```

![](index_files/figure-html/travis-roads-1.png)<!-- -->

---

## Points: geographic features

```r
dc_landmarks <- landmarks("DC", type = "point")

plot(dc_landmarks$geometry)
```

![](index_files/figure-html/dc-landmarks-1.png)<!-- -->

---

## How tigris works

When you call a tigris function, it does the following:

* _Downloads_ your data from the US Census Bureau website;
* _Stores_ your data in a temporary directory by default;
* _Loads_ your data into R as a simple features object using `sf::st_read()`
* Recommended option: use `options(tigris_use_cache = TRUE)` to cache downloaded shapefiles and prevent having to re-download every time you use them

---

class: middle, center, inverse

## tigris features and options

---

## Cartographic boundary shapefiles

.pull-left[

* Question I've received over the years: "Why does Michigan look so weird?"
* The core TIGER/Line shapefiles include _water area_ that belongs to US states and counties

]

.pull-right[

```r
mi_counties <- counties("MI")

plot(mi_counties$geometry)
```

![](index_files/figure-html/michigan-tiger-1.png)<!-- -->

]

---

## Cartographic boundary shapefiles

.pull-left[

* Use the argument `cb = TRUE` to obtain a _cartographic boundary shapefile_ pre-clipped to the US shoreline
* For some geographies, highly generalized (1:5 million and 1:20 million) shapefiles are available with the `resolution` argument

]

.pull-right[

```r
mi_counties_cb <- counties("MI", cb = TRUE)

plot(mi_counties_cb$geometry)
```

![](index_files/figure-html/michigan-cb-1.png)<!-- -->

]

---

## Understanding yearly differences in TIGER/Line files

* Whereas legal entities change shape very rarely (but they do change!), statistical entities change with every decennial Census
* tigris fetches Census shapefiles from 1990 up through 2020

```r
tarrant90 <- tracts("TX", "Tarrant", cb = TRUE, year = 1990)
tarrant00 <- tracts("TX", "Tarrant", cb = TRUE, year = 2000)
tarrant10 <- tracts("TX", "Tarrant", cb = TRUE, year = 2010)
tarrant20 <- tracts("TX", "Tarrant", cb = TRUE, year = 2020)
```

---

```r
par(mfrow = c(2, 2))

plot(tarrant90$geometry, main = "1990")
plot(tarrant00$geometry, main = "2000")
plot(tarrant10$geometry, main = "2010")
plot(tarrant20$geometry, main = "2020")
```

![](index_files/figure-html/plot-yearly-data-1.png)<!-- -->

---

## Interactive viewing of data with __mapview__

* The mapview package brings interactive spatial data viewing to R:

```r
library(mapview)

mapview(tarrant20)
```

* As an extension, use the leafsync package to interactively compare two or more maps

```r
library(leafsync)

sync(mapview(tarrant90), mapview(tarrant20))
```

---

## National datasets

* National enumeration units datasets for select geographies (counties, Census tracts, block groups) can be acquired when `cb = TRUE` by leaving the `state` argument blank

```r
us_tracts <- tracts(cb = TRUE)
```
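
---

## National datasets

* Each tract in a national file carries an 11-character `GEOID`: a 2-digit state FIPS code, a 3-digit county FIPS code, and a 6-digit tract code
* A minimal base R sketch of splitting those pieces apart, e.g. to subset a national file by state. `split_tract_geoid()` is a hypothetical helper written for this slide, not a tigris function; the three GEOIDs are taken from the first rows of the `us_tracts` printout

```r
# GEOIDs copied from the first three rows of the us_tracts printout
geoids <- c("17089853004", "11001005003", "06037482503")

# Hypothetical helper (not part of tigris): split an 11-character
# tract GEOID into its state, county, and tract components
split_tract_geoid <- function(geoid) {
  data.frame(
    GEOID    = geoid,
    STATEFP  = substr(geoid, 1, 2),   # 2-digit state FIPS code
    COUNTYFP = substr(geoid, 3, 5),   # 3-digit county FIPS code
    TRACTCE  = substr(geoid, 6, 11),  # 6-digit tract code
    stringsAsFactors = FALSE
  )
}

geoid_parts <- split_tract_geoid(geoids)

# Keep only tracts in the District of Columbia (state FIPS "11")
dc_tracts <- geoid_parts[geoid_parts$STATEFP == "11", ]
dc_tracts
```

* Keep the codes as character strings: converting to numeric would drop leading zeros (e.g. `"06"` for California)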
---

```r
us_tracts
```

```
## Simple feature collection with 85190 features and 13 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -179.1489 ymin: -14.5487 xmax: 179.7785 ymax: 71.36516
## Geodetic CRS: NAD83
## First 10 features:
## STATEFP COUNTYFP TRACTCE AFFGEOID GEOID NAME
## 1 17 089 853004 1400000US17089853004 17089853004 8530.04
## 2 11 001 005003 1400000US11001005003 11001005003 50.03
## 3 06 037 482503 1400000US06037482503 06037482503 4825.03
## 4 31 153 010630 1400000US31153010630 31153010630 106.30
## 5 12 057 013706 1400000US12057013706 12057013706 137.06
## 6 36 013 035800 1400000US36013035800 36013035800 358
## 7 56 039 967704 1400000US56039967704 56039967704 9677.04
## 8 26 099 261200 1400000US26099261200 26099261200 2612
## 9 26 065 000700 1400000US26065000700 26065000700 7
## 10 42 045 409601 1400000US42045409601 42045409601 4096.01
## NAMELSAD STUSPS NAMELSADCO STATE_NAME LSAD
## 1 Census Tract 8530.04 IL Kane County Illinois CT
## 2 Census Tract 50.03 DC District of Columbia District of Columbia CT
## 3 Census Tract 4825.03 CA Los Angeles County California CT
## 4 Census Tract 106.30 NE Sarpy County Nebraska CT
## 5 Census Tract 137.06 FL Hillsborough County Florida CT
## 6 Census Tract 358 NY Chautauqua County New York CT
## 7 Census Tract 9677.04 WY Teton County Wyoming CT
## 8 Census Tract 2612 MI Macomb County Michigan CT
## 9 Census Tract 7 MI Ingham County Michigan CT
## 10 Census Tract 4096.01 PA Delaware County Pennsylvania CT
## ALAND AWATER geometry
## 1 3622334 91650 MULTIPOLYGON (((-88.35003 4...
## 2 94136 0 MULTIPOLYGON (((-77.03195 3...
## 3 729678 0 MULTIPOLYGON (((-118.0995 3...
## 4 4964876 0 MULTIPOLYGON (((-96.23429 4...
## 5 535141 10298 MULTIPOLYGON (((-82.34769 2...
## 6 6855007 0 MULTIPOLYGON (((-79.33622 4...
## 7 199584343 3264367 MULTIPOLYGON (((-111.0366 4...
## 8 1258303 0 MULTIPOLYGON (((-83.00763 4...
## 9 910872 33682 MULTIPOLYGON (((-84.56296 4...
## 10 5880610 0 MULTIPOLYGON (((-75.36051 4...
```

---

## Part 1 exercises

* Give tigris a try for yourselves! [Explore the available geographies in the tigris documentation](https://github.com/walkerke/tigris) and fetch data for a state and/or county of your choosing. Plot the result with `plot()` or with `mapview()`.

---

class: middle, center, inverse

## Part 2: Mapping US Census data in R

---

## 2020 Census data in tidycensus: a refresher

* Last week, we learned how to use the __tidycensus__ package to acquire data from the 2020 decennial US Census
* If you need to: [review these slides to get set up with an API key and learn how to make some basic data pulls](https://walker-data.com/umich-workshop-2022/intro-2020-census/#11)
* Core function: `get_decennial()`

---

## Typical Census GIS workflows

Traditionally, getting "spatial" Census data requires:

--

* Fetching shapefiles from the Census website;

--

* Downloading a CSV of data, cleaning/formatting it;

--

* Loading geometries and data into your GIS of choice;

--

* Aligning key fields in your GIS and joining your data

---

## Geometry in tidycensus

* tidycensus takes care of this entire process with the argument `geometry = TRUE`

```r
library(tidycensus)
options(tigris_use_cache = TRUE)

tx_population <- get_decennial(
  geography = "county",
  variables = "P1_001N",
  state = "TX",
  year = 2020,
* geometry = TRUE
)
```

---

```r
tx_population
```

```
## Simple feature collection with 254 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -106.6456 ymin: 25.83738 xmax: -93.50829 ymax: 36.5007
## Geodetic CRS: NAD83
## # A tibble: 254 × 5
## GEOID NAME variable value geometry
## <chr> <chr> <chr> <dbl> <MULTIPOLYGON [°]>
## 1 48005 Angelina County, Texas P1_001N 86395 (((-95.00488 31.42396, -95.0033…
## 2 48225 Houston County, Texas P1_001N 22066 (((-95.77535 31.12205, -95.7744…
## 3 48373 Polk County, Texas P1_001N 50123 (((-95.20018 30.82457, -95.1750…
## 4 48473 Waller County,
Texas P1_001N 56794 (((-96.19178 30.13842, -96.1901…
## 5 48441 Taylor County, Texas P1_001N 143208 (((-100.1518 32.09064, -100.151…
## 6 48055 Caldwell County, Texas P1_001N 45883 (((-97.89707 29.86437, -97.8961…
## 7 48355 Nueces County, Texas P1_001N 353178 (((-97.11172 27.89307, -97.1083…
## 8 48273 Kleberg County, Texas P1_001N 31040 (((-97.3178 27.49456, -97.3159 …
## 9 48489 Willacy County, Texas P1_001N 20164 (((-97.2581 26.42544, -97.25596…
## 10 48377 Presidio County, Texas P1_001N 6131 (((-104.9801 30.62514, -104.979…
## # … with 244 more rows
```

---

## Basic mapping with base plotting

```r
plot(tx_population["value"])
```

![](index_files/figure-html/plot-geometry-1.png)<!-- -->

---

## Basic mapping with ggplot2

* `geom_sf()`: ggplot2 method to use simple features in your data visualization workflows

```r
library(tidyverse)

tx_map <- ggplot(tx_population, aes(fill = value)) +
  geom_sf()
```

---

```r
tx_map
```

![](index_files/figure-html/plot-geom-sf-1.png)<!-- -->

---

class: middle, center, inverse

## Mapping Census data with tmap

---

## The tmap package

.pull-left[

<img src="https://user-images.githubusercontent.com/2444081/106523217-12c12880-64e1-11eb-8d55-01442d535400.png" style="width: 350px">

]

.pull-right[

* Comprehensive package for thematic mapping in R
* ggplot2-like syntax, but designed in a way to feel friendly to GIS cartographers coming to R for mapping

]

---

## Example data

* Our example: comparing the distributions of racial and ethnic groups in Hennepin County, Minnesota (Minneapolis)

```r
hennepin_race <- get_decennial(
  geography = "tract",
  state = "MN",
  county = "Hennepin",
  variables = c(
    Hispanic = "P2_002N",
    White = "P2_005N",
    Black = "P2_006N",
    Native = "P2_007N",
    Asian = "P2_008N"
  ),
  summary_var = "P2_001N",
  year = 2020,
  geometry = TRUE
) %>%
  mutate(percent = 100 * (value / summary_value))
```

---

```r
dplyr::glimpse(hennepin_race)
```

```
## Rows: 1,645
## Columns: 7
## $ GEOID <chr> "27053026707", "27053026707", "27053026707", "2705302670…
## $ NAME <chr> "Census Tract 267.07, Hennepin County, Minnesota", "Cens…
## $ variable <chr> "Hispanic", "White", "Black", "Native", "Asian", "Hispan…
## $ value <dbl> 175, 4215, 258, 34, 178, 224, 4581, 658, 21, 188, 285, 2…
## $ summary_value <dbl> 5188, 5188, 5188, 5188, 5188, 5984, 5984, 5984, 5984, 59…
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-93.49271 4..., MULTIPOLYGO…
## $ percent <dbl> 3.37316885, 81.24518119, 4.97301465, 0.65535852, 3.43099…
```

---

## Basic plotting with tmap

.pull-left[

* `tm_shape()` initializes the shape; `tm_polygons()` shows the polygons for quick display

```r
library(tmap)

hennepin_black <- filter(hennepin_race,
                         variable == "Black")

tm_shape(hennepin_black) +
  tm_polygons()
```

]

.pull-right[

![](index_files/figure-html/polygons-map-1.png)<!-- -->

]

---

## Choropleth mapping with tmap

.pull-left[

* _Choropleth maps_ show statistical variation through color or shading of areas
* They generally should be used with _normalized data_ such as rates or percentages, not counts

```r
tm_shape(hennepin_black) +
  tm_polygons(col = "percent")
```

]

.pull-right[

![](index_files/figure-html/choropleth-show-1.png)<!-- -->

]

---

## Modifying choropleth options

* Color palettes can be modified with the `palette` parameter, which accepts ColorBrewer and viridis palettes
* If you've mapped with GIS software before, the `style` parameter implements various breaks methods, including `"equal"`, `"quantile"` and `"jenks"`

---

.pull-left[

```r
tm_shape(hennepin_black) +
  tm_polygons(col = "percent",
              style = "quantile",
              n = 7,
              palette = "Purples",
              title = "2020 US Census") +
  tm_layout(title = "Percent Black\nby Census tract",
            frame = FALSE,
            legend.outside = TRUE)
```

]

.pull-right[

![](index_files/figure-html/custom-choropleth-show-1.png)<!-- -->

]

---

## tmap choropleth tips and tricks

* Use `tmaptools::palette_explorer()` to interactively browse color options
* Use the option `legend.hist = TRUE` in `tm_polygons()` to display the distribution
of data values among classes

---

.pull-left[

```r
library(sf)

tm_shape(hennepin_black) +
  tm_polygons(col = "percent",
              style = "jenks",
              n = 7,
              palette = "viridis",
              title = "2020 US Census",
              legend.hist = TRUE) +
  tm_layout(title = "Percent Black population\nby Census tract",
            frame = FALSE,
            legend.outside = TRUE)
```

]

.pull-right[

![](index_files/figure-html/jenks-show-1.png)<!-- -->

]

---

## Graduated symbol maps

* Graduated symbols: using _size_ of a symbol to represent statistical variation on a map
* Implemented in tmap with `tm_bubbles()`

```r
symbol_map <- tm_shape(hennepin_black) +
  tm_polygons() +
  tm_bubbles(size = "value", alpha = 0.5,
             col = "navy") +
  tm_layout(legend.outside = TRUE,
            legend.outside.position = "bottom")
```

---

```r
symbol_map
```

![](index_files/figure-html/bubbles-map-1.png)<!-- -->

---

## Faceted mapping

* `tm_facets()` allows for comparative small multiples maps. It works well with long-form spatial data returned by tidycensus

```r
facet_map <- tm_shape(hennepin_race) +
  tm_facets(by = "variable", scale.factor = 4) +
  tm_fill(col = "percent",
          style = "quantile",
          n = 7,
          palette = "Blues") +
* tm_layout(legend.position = c(-0.7, 0.15))
```

---

```r
facet_map
```

![](index_files/figure-html/facet-map-1.png)<!-- -->

---

## Dot-density maps

.pull-left[

* Dot-density maps are useful alternatives to choropleth maps as they can represent internal heterogeneity of Census geographies
* __A brand-new function in tidycensus__ (GitHub version for now), `as_dot_density()`, helps you create group-wise dots for dot-density mapping
* Map the result with `tm_dots()`

]

.pull-right[

```r
hennepin_dots <- hennepin_race %>%
  as_dot_density(
    value = "value",
    values_per_dot = 250,
    group = "variable"
  )

hennepin_base <- hennepin_race %>%
  distinct(GEOID, .keep_all = TRUE)

dot_map <- tm_shape(hennepin_base) +
  tm_polygons(col = "white") +
  tm_shape(hennepin_dots) +
  tm_dots(col = "variable",
          palette = "Set1",
          size = 0.01) +
  tm_layout(legend.outside = TRUE)
```

]

---

```r
dot_map
```

![](index_files/figure-html/dot-density-map-show-1.png)<!-- -->

---

class: middle, center, inverse

## Adding cartographic elements

---

## Adding basemaps

.pull-left[

* Thematic maps often will use _basemaps_ with a semi-transparent overlay to provide important geographic context to the mapped data
* The easiest way to get a basemap for __tmap__ is with the __rosm__ package

```r
library(rosm)

basemap <- osm.raster(
  st_bbox(hennepin_black),
  zoom = 10,
  type = "cartolight",
  crop = TRUE
)
```

]

.pull-right[

```r
tm_shape(basemap) +
  tm_rgb()
```

![](index_files/figure-html/show-basemap-1.png)<!-- -->

]

---

## Adding basemaps

.pull-left[

* Modify the transparency with an `alpha` value to show the basemap beneath the Census tract polygons

```r
tm_shape(basemap) +
  tm_rgb() +
  tm_shape(hennepin_black) +
  tm_polygons(col = "percent",
              style = "quantile",
              n = 7,
              palette = "Purples",
              title = "2020 US Census",
*             alpha = 0.6) +
  tm_layout(title = "Percent Black\nby Census tract",
            legend.outside = TRUE)
```

]

.pull-right[

![](index_files/figure-html/map-with-basemap-show-1.png)<!-- -->

]

---

## Adding cartographic elements

.pull-left[

* __tmap__ also has built-in support for north arrows, scale bars, and credits

```r
tm_shape(basemap) +
  tm_rgb() +
  tm_shape(hennepin_black) +
  tm_polygons(col = "percent",
              style = "quantile",
              n = 7,
              palette = "Purples",
              title = "2020 US Census",
              alpha = 0.6) +
  tm_layout(title = "Percent Black\nby Census tract",
            legend.outside = TRUE) +
* tm_scale_bar(position = c("left", "BOTTOM")) +
* tm_compass(position = c("right", "top")) +
* tm_credits("Basemap (c) CARTO, OSM",
*            bg.color = "white",
*            position = c("RIGHT", "BOTTOM"),
*            bg.alpha = 0,
*            align = "right")
```

]

.pull-right[

![](index_files/figure-html/cartographic-elements-show-1.png)<!-- -->

]

---

## What if I still want to use a GIS?

* No problem!
Write out your data to a shapefile/GeoJSON/GeoPackage with `sf::st_write()` and load into your GIS of choice
* Recommendation: use `output = "wide"` for multi-variable datasets (easier to use in a desktop GIS)

```r
library(sf)

st_write(tx_population, "data/tx_population.shp")
```

---

## Part 2 exercises

Try making your own map with tmap!

* If you are just getting started with tidycensus/the tidyverse, make a race/ethnicity map by adapting the code provided in this section but for a different county.
* If you are comfortable with tidycensus at this stage, pick a different variable to map instead!

---

class: middle, center, inverse

## Part 3: Interactive mapping and advanced geometry handling

---

## Interactive maps with tmap

* Regular __tmap__ code can generate interactive maps by setting `tmap_mode("view")`

```r
tmap_mode("view")

tm_shape(hennepin_black) +
  tm_polygons(col = "percent",
              style = "quantile",
              n = 7,
              palette = "Purples",
              title = "Percent Black<br/>by Census tract",
              alpha = 0.6)
```

---

## Customizing interactive maps

* As with static maps, __tmap__ includes a wide range of options for customizing interactive maps

```r
*tmap_options(basemaps = c("Esri.WorldTopoMap", "Stamen.TonerLite",
*                          "CartoDB.DarkMatter"))

tm_shape(hennepin_black) +
  tm_polygons(col = "percent",
              style = "quantile",
              n = 7,
              palette = "Purples",
              title = "Percent Black<br/>by Census tract",
              alpha = 0.6,
*             id = "NAME")
```

---

## Saving interactive maps

* Use `tmap_save()` to write an interactive map to an HTML file for use on a website (also works for static plots!)
```r
# tmap_obj is a placeholder: assign your tmap code to an object first
tmap_save(tmap_obj, "hennepin_black.html")
```

---

## Interactive maps with mapview

* Informative interactive choropleth maps can be generated quickly with `mapview::mapview()` by specifying a `zcol` argument

```r
mapview(hennepin_black, zcol = "percent")
```

---

class: middle, center, inverse

## Additional resources for interactive maps

---

class: middle, center, inverse

## Advanced geometry handling

---

## Mapping population density

.pull-left[

* When __tidycensus__ calls __tigris__ internally to get Census geometries, it discards all the information (besides the geometry) from the __tigris__ object by default
* The argument `keep_geo_vars = TRUE` keeps the information from __tigris__, which includes a column, `ALAND`, for land area in square meters
* This can be used to calculate and map population density

]

.pull-right[

```r
mi_density <- get_decennial(
  geography = "county",
  variables = "P1_001N",
  state = "MI",
  year = 2020,
* geometry = TRUE,
* keep_geo_vars = TRUE
) %>%
  mutate(density = 1000000 * (value / ALAND))
```

]

---

## Mapping population density

* Maps of population density (and everything else, typically!) should aim to minimize distortion of areas
* Solution: use a _projected coordinate reference system_

```r
tmap_mode("plot")
library(crsuggest)

# View the mi_crs object and pick a suitable CRS for the state
mi_crs <- suggest_crs(mi_density)

# Use that CRS for your map
mi_density_map <- tm_shape(mi_density, projection = 6497) +
  tm_polygons(col = "density", style = "cont",
              palette = "Blues", title = "People/km2")
```

---

```r
mi_density_map
```

![](index_files/figure-html/mi-density-map-1.png)<!-- -->

---

## The problem with national maps

* A common use-case is generating a map of data for the entire United States.
Let's get some data:

```r
us_percent_hispanic <- get_decennial(
  geography = "county",
  variables = "P2_002N",
  summary_var = "P2_001N",
  year = 2020,
  geometry = TRUE
) %>%
  mutate(percent = 100 * (value / summary_value))
```

---

## The problem with national maps

* As we'll see, mapping the entire US by default does not work well

```r
national_map <- tm_shape(us_percent_hispanic) +
  tm_polygons(col = "percent", palette = "plasma",
              lwd = 0.05, style = "cont",
              legend.is.portrait = FALSE,
              title = "% Hispanic, 2020 Census") +
  tm_layout(legend.outside = TRUE,
            legend.outside.position = "bottom")
```

---

```r
national_map
```

![](index_files/figure-html/show-national-map-1.png)<!-- -->

---

## "Shifting" and re-scaling US geometry

* Use the `shift_geometry()` function in the __tigris__ package to move Alaska, Hawaii, and Puerto Rico to better positions for national mapping

```r
us_rescaled <- shift_geometry(us_percent_hispanic)

rescaled_map <- tm_shape(us_rescaled) +
  tm_polygons(col = "percent", palette = "plasma",
              lwd = 0.05, style = "cont",
              legend.is.portrait = FALSE,
              title = "% Hispanic, 2020 Census") +
  tm_layout(legend.outside = TRUE,
            legend.outside.position = "bottom",
            legend.outside.size = 0.2)
```

---

```r
rescaled_map
```

![](index_files/figure-html/rescaled-map-show-1.png)<!-- -->

---

## "Shifting" and re-scaling US geometry

* Use `preserve_area = TRUE` to preserve the area of Alaska, Hawaii, and Puerto Rico relative to the continental US

```r
us_shifted <- shift_geometry(us_percent_hispanic,
                             position = "outside",
                             preserve_area = TRUE)

shifted_map <- tm_shape(us_shifted) +
  tm_polygons(col = "percent", palette = "plasma",
              lwd = 0.05, style = "cont",
              legend.is.portrait = FALSE,
              title = "% Hispanic, 2020 Census") +
  tm_layout(legend.position = c("left", "BOTTOM"))
```

---

```r
shifted_map
```

![](index_files/figure-html/shifted-map-show-1.png)<!-- -->

---

## The problem with water areas

.pull-left[

* Locations with significant water area (e.g.
Seattle, New York City) will often have Census tracts that cover rivers and lakes where no one lives
* Census maps may then misrepresent what viewers expect to see from a location

]

.pull-right[

```r
king_asian <- get_decennial(
  geography = "tract",
  variables = c(asian = "P2_008N",
                total = "P2_001N"),
  state = "WA",
  county = "King",
  geometry = TRUE,
  year = 2020,
  output = "wide"
) %>%
  mutate(percent = 100 * (asian / total))

mapview(king_asian, zcol = "percent")
```

]

---

## "Erasing" water areas

.pull-left[

* The `erase_water()` function in the __tigris__ package will remove water areas above a given area threshold (size percentile) from a spatial dataset
* Performs best when data are in a _projected coordinate reference system_ ([read here for more information](https://walker-data.com/census-r/census-geographic-data-and-applications-in-r.html#coordinate-reference-systems))

]

.pull-right[

```r
library(sf)

king_erase <- king_asian %>%
* st_transform(6596) %>%
* erase_water(area_threshold = 0.9)

mapview(king_erase, zcol = "percent")
```

]

---

## Dasymetric dot-density mapping

.pull-left[

* The new `as_dot_density()` function in __tidycensus__ integrates `erase_water()` for _dasymetric dot placement_, where dots will not be placed in water areas
* Helpful for locations like Minneapolis, which have a lot of lakes!

]

.pull-right[

```r
hennepin_dots_dasy <- hennepin_race %>%
  st_transform(26914) %>%
  as_dot_density(
    value = "value",
    values_per_dot = 250,
    group = "variable",
    erase_water = TRUE
  )

hn_basemap <- osm.raster(st_bbox(hennepin_race), zoom = 10,
                         type = "cartolight", crop = TRUE)

tm_shape(hn_basemap) +
  tm_rgb() +
  tm_shape(hennepin_dots_dasy) +
  tm_dots(col = "variable",
          palette = "Set1",
          size = 0.01) +
  tm_layout(legend.outside = TRUE)
```

]

---

<img src="img/hennepin_dot_map.png" style="width: 800px">

---

## Part 3 exercises

* Using a static map you created in Part 2 of the workshop, try converting to an interactive map with `tmap_mode("view")`
* Experiment with the `shift_geometry()` function and try different combinations of the `preserve_area` and `position` arguments. Which layout do you prefer, and why?

---

class: middle, center, inverse

## Thank you!