Workflows with R and tidycensus
2024-02-08
Associate Professor of Geography at TCU
Spatial data science researcher and consultant
Package developer: tidycensus, tigris, mapboxapi, crsuggest, idbr (R), pygris (Python)
Book: Analyzing US Census Data: Methods, Maps, and Models in R
Today: Working with the 2022 American Community Survey with R and tidycensus
Thursday, February 22nd: Analyzing 2020 Decennial US Census Data in R
Thursday, March 7th: Doing “GIS” and making maps with US Census Data in R
Hour 1: The American Community Survey, R, and tidycensus
Hour 2: ACS data workflows
Hour 3: An introduction to ACS microdata
Annual survey of 3.5 million US households
Covers topics not available in decennial US Census data (e.g. income, education, language, housing characteristics)
Available as 1-year estimates (for geographies of population 65,000 and greater) and 5-year estimates (for geographies down to the block group)
Data delivered as estimates characterized by margins of error
data.census.gov is the main, revamped interactive data portal for browsing and downloading Census datasets, including the ACS
The US Census Application Programming Interface (API) allows developers to access Census data resources programmatically
Wrangles Census data internally to return tidyverse-ready format (or traditional wide format if requested);
Automatically downloads and merges Census geometries to data for mapping;
Includes tools for handling margins of error in the ACS and working with survey weights in the ACS PUMS;
States and counties can be requested by name (no more looking up FIPS codes!)
R: programming language and software environment for data analysis (and wherever else your imagination can take you!)
RStudio: integrated development environment (IDE) for R developed by Posit
Posit Cloud: run RStudio with today’s workshop pre-configured at https://posit.cloud/content/7549022
To get started, install the packages you’ll need for today’s workshop
If you are using the Posit Cloud environment, these packages are already installed for you
tidycensus (and the Census API) can be used without an API key, but you will be limited to 500 queries per day
Power users: visit https://api.census.gov/data/key_signup.html to request a key, then activate the key from the link in your email.
Once activated, use the census_api_key() function to set your key as an environment variable
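A sketch of the one-time setup (the key string below is a placeholder):

```r
library(tidycensus)

# Set your Census API key; install = TRUE writes the key to your
# .Renviron file so it persists across R sessions
census_api_key("YOUR_KEY_HERE", install = TRUE)
```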
The get_acs() function is your portal to access ACS data using tidycensus
The two required arguments are geography and variables. As of v1.6, the function defaults to the 2018-2022 5-year ACS
Results are returned with the columns GEOID, NAME, variable, estimate, and moe
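The county-level output that follows can be reproduced with a call along these lines (B25077_001 is median home value):

```r
library(tidycensus)

# Median home value (B25077_001) for all US counties,
# using the default 2018-2022 5-year ACS
median_value <- get_acs(
  geography = "county",
  variables = "B25077_001"
)
```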
# A tibble: 3,222 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 01001 Autauga County, Alabama B25077_001 191800 7996
2 01003 Baldwin County, Alabama B25077_001 266000 6916
3 01005 Barbour County, Alabama B25077_001 102700 11171
4 01007 Bibb County, Alabama B25077_001 120100 13377
5 01009 Blount County, Alabama B25077_001 159800 6189
6 01011 Bullock County, Alabama B25077_001 87700 20560
7 01013 Butler County, Alabama B25077_001 94800 5984
8 01015 Calhoun County, Alabama B25077_001 140500 5181
9 01017 Chambers County, Alabama B25077_001 116900 9814
10 01019 Cherokee County, Alabama B25077_001 158700 8550
# ℹ 3,212 more rows
1-year ACS data are more current, but are only available for geographies of population 65,000 and greater
Access 1-year ACS data with the argument survey = "acs1"; the function defaults to survey = "acs5"
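The place-level output below can be generated with a call like this (a sketch):

```r
# Median home value for Census-designated places, 2022 1-year ACS;
# only places with population 65,000 and greater are returned
median_value_1yr <- get_acs(
  geography = "place",
  variables = "B25077_001",
  survey = "acs1",
  year = 2022
)
```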
# A tibble: 646 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 0103076 Auburn city, Alabama B25077_001 335200 22622
2 0107000 Birmingham city, Alabama B25077_001 125500 14964
3 0121184 Dothan city, Alabama B25077_001 190800 8133
4 0135896 Hoover city, Alabama B25077_001 393400 19743
5 0137000 Huntsville city, Alabama B25077_001 294700 16881
6 0150000 Mobile city, Alabama B25077_001 178800 11552
7 0151000 Montgomery city, Alabama B25077_001 155200 10868
8 0177256 Tuscaloosa city, Alabama B25077_001 297600 30475
9 0203000 Anchorage municipality, Alaska B25077_001 367900 10111
10 0404720 Avondale city, Arizona B25077_001 400300 22495
# ℹ 636 more rows
The table parameter can be used to obtain all related variables in a "table" at once
# A tibble: 54,774 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 01001 Autauga County, Alabama B19001_001 22308 369
2 01001 Autauga County, Alabama B19001_002 990 265
3 01001 Autauga County, Alabama B19001_003 656 187
4 01001 Autauga County, Alabama B19001_004 1026 303
5 01001 Autauga County, Alabama B19001_005 1335 329
6 01001 Autauga County, Alabama B19001_006 741 205
7 01001 Autauga County, Alabama B19001_007 822 218
8 01001 Autauga County, Alabama B19001_008 840 270
9 01001 Autauga County, Alabama B19001_009 921 260
10 01001 Autauga County, Alabama B19001_010 962 279
# ℹ 54,764 more rows
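The B19001 household income output above can be reproduced with a call like the following (a sketch):

```r
# All variables in table B19001 (household income) for US counties
income_table <- get_acs(
  geography = "county",
  table = "B19001",
  year = 2022
)
```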
For geographies available below the state level, the state parameter allows you to query data for a specific state
For smaller geographies (Census tracts, block groups), a county can also be requested
tidycensus translates state names and postal abbreviations internally, so you don’t need to remember the FIPS codes!
Example: data on median home value in San Diego County, California by Census tract
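A sketch of that query; note that state and county can be given by name rather than FIPS code:

```r
# Median home value by Census tract in San Diego County, California
sd_value <- get_acs(
  geography = "tract",
  variables = "B25077_001",
  state = "CA",
  county = "San Diego",
  year = 2022
)
```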
# A tibble: 737 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 06073000100 Census Tract 1; San Diego County; Calif… B25077_… 1633800 71171
2 06073000201 Census Tract 2.01; San Diego County; Ca… B25077_… 1331000 147432
3 06073000202 Census Tract 2.02; San Diego County; Ca… B25077_… 891100 97240
4 06073000301 Census Tract 3.01; San Diego County; Ca… B25077_… 957500 232555
5 06073000302 Census Tract 3.02; San Diego County; Ca… B25077_… 761700 108681
6 06073000400 Census Tract 4; San Diego County; Calif… B25077_… 799100 94490
7 06073000500 Census Tract 5; San Diego County; Calif… B25077_… 1025000 81768
8 06073000600 Census Tract 6; San Diego County; Calif… B25077_… 727700 92078
9 06073000700 Census Tract 7; San Diego County; Calif… B25077_… 736400 102788
10 06073000800 Census Tract 8; San Diego County; Calif… B25077_… 678400 119751
# ℹ 727 more rows
To search for variables, use the load_variables() function along with a year and dataset
The View() function in RStudio allows for interactive browsing and filtering
ACS datasets available for variable lookup include:
Detailed Tables
Data Profile (add "/profile" for variable lookup)
Subject Tables (add "/subject")
Comparison Profile (add "/cprofile")
Supplemental Estimates (use "acsse")
Migration Flows (access with get_flows())
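For example, to browse the Detailed Tables and Data Profile variables for the 2018-2022 5-year ACS (a sketch):

```r
library(tidycensus)

# 2018-2022 5-year ACS Detailed Tables variables
vars <- load_variables(2022, "acs5")

# Data Profile variables use the "/profile" suffix
profile_vars <- load_variables(2022, "acs5/profile")

# Browse and filter interactively in RStudio
View(vars)
```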
# A tibble: 2,548 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 01 Alabama B01001_001 5074296 NA
2 01 Alabama B01001_002 2461248 6178
3 01 Alabama B01001_003 146169 3134
4 01 Alabama B01001_004 158767 6029
5 01 Alabama B01001_005 164578 5689
6 01 Alabama B01001_006 97834 3029
7 01 Alabama B01001_007 70450 2897
8 01 Alabama B01001_008 42597 4156
9 01 Alabama B01001_009 34623 3440
10 01 Alabama B01001_010 97373 4627
# ℹ 2,538 more rows
# A tibble: 52 × 100
GEOID NAME B01001_001E B01001_001M B01001_002E B01001_002M B01001_003E
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 01 Alabama 5074296 NA 2461248 6178 146169
2 02 Alaska 733583 NA 385667 2351 23043
3 04 Arizona 7359197 NA 3678381 2695 201423
4 05 Arkansas 3045637 NA 1504488 4216 90239
5 06 California 39029342 NA 19536425 6410 1081904
6 08 Colorado 5839926 NA 2960896 4278 154565
7 09 Connecticut 3626205 NA 1776689 2237 91513
8 10 Delaware 1018396 NA 494657 1092 27456
9 11 District o… 671803 NA 319763 733 20038
10 12 Florida 22244823 NA 10953468 6169 563703
# ℹ 42 more rows
# ℹ 93 more variables: B01001_003M <dbl>, B01001_004E <dbl>, B01001_004M <dbl>,
# B01001_005E <dbl>, B01001_005M <dbl>, B01001_006E <dbl>, B01001_006M <dbl>,
# B01001_007E <dbl>, B01001_007M <dbl>, B01001_008E <dbl>, B01001_008M <dbl>,
# B01001_009E <dbl>, B01001_009M <dbl>, B01001_010E <dbl>, B01001_010M <dbl>,
# B01001_011E <dbl>, B01001_011M <dbl>, B01001_012E <dbl>, B01001_012M <dbl>,
# B01001_013E <dbl>, B01001_013M <dbl>, B01001_014E <dbl>, …
Census variables can be hard to remember; using a named vector to request variables will replace the Census IDs with a custom input
In long form, these custom inputs will populate the variable column; in wide form, they will replace the column names
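A sketch of a named-vector request that would produce output like the table below; the Data Profile variable IDs here are illustrative, so check load_variables() for the vintage you are using, as profile IDs can shift between years:

```r
# Custom names on the left replace the Census IDs in the output
ca_education <- get_acs(
  geography = "county",
  state = "CA",
  variables = c(
    percent_high_school = "DP02_0062P",  # illustrative IDs -
    percent_bachelors = "DP02_0065P",    # verify with load_variables()
    percent_graduate = "DP02_0066P"
  ),
  year = 2022
)
```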
# A tibble: 174 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 06001 Alameda County, California percent_high_school 16.7 0.4
2 06001 Alameda County, California percent_bachelors 28.3 0.3
3 06001 Alameda County, California percent_graduate 21.3 0.3
4 06003 Alpine County, California percent_high_school 25.7 7.5
5 06003 Alpine County, California percent_bachelors 20.6 7.5
6 06003 Alpine County, California percent_graduate 18.7 8.5
7 06005 Amador County, California percent_high_school 30.7 2.2
8 06005 Amador County, California percent_bachelors 13.6 1.8
9 06005 Amador County, California percent_graduate 5.9 1.1
10 06007 Butte County, California percent_high_school 22.3 0.9
# ℹ 164 more rows
Use the load_variables() function to find a variable that interests you that we haven't used yet.
Use get_acs() to fetch data on that variable from the ACS for counties, similar to how we did for median household income.
Values available in the 5-year ACS may not be available in the corresponding 1-year ACS tables
If available, they will likely have larger margins of error
Your job as an analyst: balance need for certainty vs. need for recency in estimates
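The two state-level tables that follow compare the same variable from the 1-year and 5-year ACS; a sketch of the calls:

```r
# B16001_054: a detailed language-spoken-at-home variable, by state
acs1_est <- get_acs(
  geography = "state",
  variables = "B16001_054",
  survey = "acs1",
  year = 2022
)

acs5_est <- get_acs(
  geography = "state",
  variables = "B16001_054",
  survey = "acs5",
  year = 2022
)
```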
# A tibble: 52 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 01 Alabama B16001_054 666 556
2 02 Alaska B16001_054 NA NA
3 04 Arizona B16001_054 1906 1342
4 05 Arkansas B16001_054 NA NA
5 06 California B16001_054 154917 14153
6 08 Colorado B16001_054 1643 1968
7 09 Connecticut B16001_054 4039 2965
8 10 Delaware B16001_054 0 203
9 11 District of Columbia B16001_054 NA NA
10 12 Florida B16001_054 3311 1969
# ℹ 42 more rows
# A tibble: 52 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 01 Alabama B16001_054 493 256
2 02 Alaska B16001_054 24 39
3 04 Arizona B16001_054 2894 626
4 05 Arkansas B16001_054 641 308
5 06 California B16001_054 146406 5724
6 08 Colorado B16001_054 1358 472
7 09 Connecticut B16001_054 2093 715
8 10 Delaware B16001_054 7 12
9 11 District of Columbia B16001_054 22 24
10 12 Florida B16001_054 2466 636
# ℹ 42 more rows
As opposed to decennial US Census data, ACS estimates include information on uncertainty, represented by the margin of error in the moe column
This means that in some cases, visualization of estimates without reference to the margin of error can be misleading
Walkthrough: building a margin of error visualization with ggplot2
The data are not sorted by value, making comparisons difficult
The axis and tick labels are not intuitive
The Y-axis labels contain repetitive information (" County, Utah")
We’ve made no attempt to customize the styling
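The utah_income object used in the plotting code can be fetched like this (a sketch; B19013_001 is median household income):

```r
library(tidycensus)
library(tidyverse)
library(scales)

utah_income <- get_acs(
  geography = "county",
  variables = "B19013_001",
  state = "UT",
  year = 2022
)
```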
Use reorder() to sort counties by the value of their ACS estimates, improving legibility
Use labs() to label the plot and its axes, and change the theme to one of several built-in options

utah_plot_errorbar <- ggplot(utah_income, aes(x = estimate,
                                              y = reorder(NAME, estimate))) +
  geom_errorbar(aes(xmin = estimate - moe, xmax = estimate + moe),
                width = 0.5, linewidth = 0.5) +
  geom_point(color = "darkblue", size = 2) +
  scale_x_continuous(labels = label_dollar()) +
  scale_y_discrete(labels = function(x) str_remove(x, " County, Utah")) +
  labs(title = "Median household income, 2018-2022 ACS",
       subtitle = "Counties in Utah",
       caption = "Data acquired with R and tidycensus. Error bars represent margin of error around estimates.",
       x = "ACS estimate",
       y = "") +
  theme_minimal(base_size = 12)
One of the best features of tidycensus is the argument geometry = TRUE, which gets you the correct Census geometries with no hassle
get_acs() with geometry = TRUE returns a spatial Census dataset containing simple feature geometries; learn more on March 7
Let’s take a look at some examples
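The Cook County output that follows was generated with a call along these lines (a sketch; DP02_0068P is an educational attainment percentage from the Data Profile):

```r
# Tract-level Data Profile percentage for Cook County, Illinois,
# with simple feature geometries attached via geometry = TRUE
cook_education <- get_acs(
  geography = "tract",
  variables = "DP02_0068P",
  state = "IL",
  county = "Cook",
  year = 2022,
  geometry = TRUE
)
```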
geometry = TRUE does the hard work for you of acquiring and pre-joining spatial Census data

Simple feature collection with 1332 features and 5 fields (with 1 geometry empty)
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -88.26364 ymin: 41.46971 xmax: -87.52416 ymax: 42.15426
Geodetic CRS: NAD83
First 10 features:
GEOID NAME variable estimate
1 17031822400 Census Tract 8224; Cook County; Illinois DP02_0068P 16.4
2 17031740100 Census Tract 7401; Cook County; Illinois DP02_0068P 40.9
3 17031828100 Census Tract 8281; Cook County; Illinois DP02_0068P 19.4
4 17031826600 Census Tract 8266; Cook County; Illinois DP02_0068P 17.1
5 17031720500 Census Tract 7205; Cook County; Illinois DP02_0068P 58.5
6 17031750300 Census Tract 7503; Cook County; Illinois DP02_0068P 64.5
7 17031826500 Census Tract 8265; Cook County; Illinois DP02_0068P 20.1
8 17031825504 Census Tract 8255.04; Cook County; Illinois DP02_0068P 36.4
9 17031827100 Census Tract 8271; Cook County; Illinois DP02_0068P 14.2
10 17031824900 Census Tract 8249; Cook County; Illinois DP02_0068P 11.3
moe geometry
1 5.9 MULTIPOLYGON (((-87.79876 4...
2 8.5 MULTIPOLYGON (((-87.70137 4...
3 6.4 MULTIPOLYGON (((-87.549 41....
4 4.4 MULTIPOLYGON (((-87.63033 4...
5 7.6 MULTIPOLYGON (((-87.69645 4...
6 7.9 MULTIPOLYGON (((-87.6912 41...
7 6.8 MULTIPOLYGON (((-87.63427 4...
8 11.9 MULTIPOLYGON (((-87.71377 4...
9 6.4 MULTIPOLYGON (((-87.656 41....
10 5.0 MULTIPOLYGON (((-87.71712 4...
Mapping, GIS, and spatial data is the subject of our March 7 workshop - so be sure to check that out!
Even before we dive deeper into spatial data, it is very useful to be able to explore your results on an interactive map
Our solution: the mapview() function, with the zcol argument specifying the column to visualize
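Assuming cook_education is a dataset retrieved with geometry = TRUE, a minimal sketch:

```r
library(mapview)

# Interactive map shaded by the estimate column
mapview(cook_education, zcol = "estimate")
```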
Consider using Public Use Microdata Areas (PUMAs) for geographically-consistent substate mapping
PUMAs are typically used for microdata geography; however, I find them quite useful to approximate real state submarkets, planning areas, etc.
Variables in the Data Profile and Subject Tables can change names over time
You’ll need to watch out for the Connecticut issue and changing geographies
The 2020 1-year ACS was not released (and is not in tidycensus), so your time-series can break if you are using iteration to pull data
Swap in a variable from Part 1, "B25077_001" (median home value), for the analysis in this section, and try the following:
For a state of your choosing, how do margins of error differ among counties for median home values in the 1-year and 5-year ACS?
Can you visualize trends in median home value for a county of your choosing using mapview()?
Microdata: individual-level survey responses made available to researchers
The ACS Public Use Microdata Series (PUMS) allows for detailed cross-tabulations not available in aggregated data
The 1-year PUMS covers about 1 percent of the US population; the 5-year PUMS covers about 5 percent (so, not the full ACS)
Data downloads available in bulk from the Census FTP server or from data.census.gov’s MDAT tool
Other resource for cleaned, time-series microdata: IPUMS
get_pums() requires specifying one or more variables and the state for which you'd like to request data. state = 'all' can get data for the entire USA, but it takes a while!
The function defaults to the 5-year ACS with survey = "acs5"; 1-year ACS data are available with survey = "acs1".
The default year is 2022 in the latest version of tidycensus; data are available back to 2005 (1-year ACS) and 2005-2009 (5-year ACS). 2020 1-year data are not available.
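The Oregon output below can be reproduced with a call like this (a sketch):

```r
library(tidycensus)

# Person-level PUMS records for Oregon, 2022 1-year ACS
or_pums <- get_pums(
  variables = c("AGEP", "HHT", "SEX"),
  state = "OR",
  survey = "acs1",
  year = 2022
)
```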
# A tibble: 43,708 × 8
SERIALNO SPORDER WGTP PWGTP AGEP ST HHT SEX
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 2022GQ0000103 1 0 15 43 41 b 1
2 2022GQ0000264 1 0 11 29 41 b 1
3 2022GQ0000368 1 0 115 79 41 b 2
4 2022GQ0000407 1 0 2 20 41 b 2
5 2022GQ0000504 1 0 18 59 41 b 1
6 2022GQ0000523 1 0 29 60 41 b 1
7 2022GQ0000539 1 0 73 27 41 b 1
8 2022GQ0000570 1 0 73 90 41 b 2
9 2022GQ0000718 1 0 14 59 41 b 1
10 2022GQ0000776 1 0 4 85 41 b 1
# ℹ 43,698 more rows
get_pums() returns some technical variables by default without the user needing to request them specifically. These include:
SERIALNO: a serial number that uniquely identifies households in the sample;
SPORDER: the order of the person in the household; when combined with SERIALNO, uniquely identifies a person;
WGTP: the household weight;
PWGTP: the person weight
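The person weight can be used to tabulate population counts; assuming or_pums holds Oregon person records with the default weights, summing PWGTP should approximate the published state population (compare with the get_acs() total that follows):

```r
# Estimate Oregon's total population by summing person weights
sum(or_pums$PWGTP)
```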
# A tibble: 1 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 41 Oregon B01003_001 4240137 NA
The pums_variables dataset is your one-stop shop for browsing variables in the ACS PUMS
It is a long-form dataset that organizes specific value codes by variable so you know what you can get. You'll use information in the var_code column to fetch variables, but pay attention to the var_label, val_code, val_label, and data_type columns
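For example, to look up the value codes for a single PUMS variable (a sketch):

```r
library(tidycensus)
library(dplyr)

# Value codes and labels for HHT (household type), 2022 1-year PUMS
pums_variables %>%
  filter(var_code == "HHT", year == 2022, survey == "acs1") %>%
  select(var_code, var_label, val_code, val_label, data_type)
```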
The recode = TRUE argument in get_pums() appends recoded columns to your returned dataset based on information available in pums_variables
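Adding recode = TRUE to the earlier Oregon call produces labeled output like the table below (a sketch):

```r
or_pums_recoded <- get_pums(
  variables = c("AGEP", "HHT", "SEX"),
  state = "OR",
  survey = "acs1",
  year = 2022,
  recode = TRUE  # appends ST_label, HHT_label, SEX_label columns
)
```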
# A tibble: 43,708 × 11
SERIALNO SPORDER WGTP PWGTP AGEP ST HHT SEX ST_label HHT_label
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> <ord> <ord>
1 2022GQ0000103 1 0 15 43 41 b 1 Oregon/OR N/A (GQ/…
2 2022GQ0000264 1 0 11 29 41 b 1 Oregon/OR N/A (GQ/…
3 2022GQ0000368 1 0 115 79 41 b 2 Oregon/OR N/A (GQ/…
4 2022GQ0000407 1 0 2 20 41 b 2 Oregon/OR N/A (GQ/…
5 2022GQ0000504 1 0 18 59 41 b 1 Oregon/OR N/A (GQ/…
6 2022GQ0000523 1 0 29 60 41 b 1 Oregon/OR N/A (GQ/…
7 2022GQ0000539 1 0 73 27 41 b 1 Oregon/OR N/A (GQ/…
8 2022GQ0000570 1 0 73 90 41 b 2 Oregon/OR N/A (GQ/…
9 2022GQ0000718 1 0 14 59 41 b 1 Oregon/OR N/A (GQ/…
10 2022GQ0000776 1 0 4 85 41 b 1 Oregon/OR N/A (GQ/…
# ℹ 43,698 more rows
# ℹ 1 more variable: SEX_label <ord>
PUMS datasets - especially from the 5-year ACS - can get quite large. The variables_filter argument can return a subset of data from the API, reducing long download times
variables_filter is specified as a named list where the name represents the PUMS variable and the value represents a vector of values you are requesting from the API
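A sketch of a filtered request consistent with the output below; the filter values here are illustrative:

```r
# Women (SEX = 2) aged 30-49 in Oregon, 2018-2022 5-year PUMS
or_filtered <- get_pums(
  variables = c("AGEP", "HHT", "SEX"),
  state = "OR",
  survey = "acs5",
  variables_filter = list(
    SEX = 2,       # illustrative filter values
    AGEP = 30:49
  )
)
```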
# A tibble: 25,703 × 8
SERIALNO SPORDER WGTP PWGTP AGEP ST HHT SEX
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 2018GQ0003965 1 0 8 32 41 b 2
2 2018GQ0004720 1 0 6 46 41 b 2
3 2018GQ0004942 1 0 5 42 41 b 2
4 2018GQ0006933 1 0 3 30 41 b 2
5 2018GQ0006965 1 0 23 32 41 b 2
6 2018GQ0007480 1 0 11 40 41 b 2
7 2018GQ0007813 1 0 19 34 41 b 2
8 2018GQ0010876 1 0 28 41 41 b 2
9 2018GQ0012843 1 0 14 37 41 b 2
10 2018GQ0018588 1 0 67 32 41 b 2
# ℹ 25,693 more rows
In the previous hour, you were introduced to PUMAs
Public Use Microdata Areas (PUMAs) are the smallest available geographies at which records are identifiable in the PUMS datasets
PUMAs are redrawn with each decennial US Census, and typically are home to 100,000-200,000 people. The 2022 ACS is the first that aligns with the new 2020 PUMAs
Request PUMA geography by including "PUMA" in your variables vector in get_pums()
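A sketch of such a request, consistent with the output below:

```r
# Including "PUMA" returns each record's Public Use Microdata Area
or_pums_puma <- get_pums(
  variables = c("AGEP", "PUMA"),
  state = "OR",
  survey = "acs1",
  year = 2022
)
```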
# A tibble: 43,708 × 7
SERIALNO SPORDER WGTP PWGTP AGEP PUMA ST
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 2022HU0427155 1 124 124 55 04703 41
2 2022HU0427210 1 117 117 57 06721 41
3 2022HU0427210 2 117 128 56 06721 41
4 2022HU0427249 1 56 55 72 06722 41
5 2022HU0427249 2 56 57 69 06722 41
6 2022HU0427278 1 52 52 69 00503 41
7 2022HU0427278 2 52 57 77 00503 41
8 2022HU0427304 1 193 194 64 03904 41
9 2022HU0427314 1 57 57 73 05114 41
10 2022HU0427314 2 57 59 71 05114 41
# ℹ 43,698 more rows
PUMS data represent a smaller sample than the regular ACS, so understanding error around tabulated estimates is critical
The Census Bureau recommends using successive difference replication to calculate standard errors, and provides replicate weights to do this
tidycensus includes tools to help you get replicate weights and format your data for appropriate survey-weighted analysis
Request replicate weights with the rep_weights argument
# A tibble: 43,708 × 87
SERIALNO SPORDER AGEP PUMA ST WGTP PWGTP PWGTP1 PWGTP2 PWGTP3 PWGTP4
<chr> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2022GQ0000… 1 43 00503 41 0 15 16 16 17 18
2 2022GQ0000… 1 29 04704 41 0 11 25 13 2 10
3 2022GQ0000… 1 79 03904 41 0 115 115 121 115 124
4 2022GQ0000… 1 20 06724 41 0 2 2 2 3 3
5 2022GQ0000… 1 59 01900 41 0 18 19 20 17 17
6 2022GQ0000… 1 60 03905 41 0 29 28 30 30 26
7 2022GQ0000… 1 27 04705 41 0 73 71 71 71 48
8 2022GQ0000… 1 90 03904 41 0 73 76 73 74 73
9 2022GQ0000… 1 59 09100 41 0 14 13 14 14 16
10 2022GQ0000… 1 85 05116 41 0 4 4 2 3 3
# ℹ 43,698 more rows
# ℹ 76 more variables: PWGTP5 <dbl>, PWGTP6 <dbl>, PWGTP7 <dbl>, PWGTP8 <dbl>,
# PWGTP9 <dbl>, PWGTP10 <dbl>, PWGTP11 <dbl>, PWGTP12 <dbl>, PWGTP13 <dbl>,
# PWGTP14 <dbl>, PWGTP15 <dbl>, PWGTP16 <dbl>, PWGTP17 <dbl>, PWGTP18 <dbl>,
# PWGTP19 <dbl>, PWGTP20 <dbl>, PWGTP21 <dbl>, PWGTP22 <dbl>, PWGTP23 <dbl>,
# PWGTP24 <dbl>, PWGTP25 <dbl>, PWGTP26 <dbl>, PWGTP27 <dbl>, PWGTP28 <dbl>,
# PWGTP29 <dbl>, PWGTP30 <dbl>, PWGTP31 <dbl>, PWGTP32 <dbl>, …
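The output above, with person replicate weights PWGTP1 through PWGTP80 appended, can be generated with a call along these lines (a sketch):

```r
or_pums_rep <- get_pums(
  variables = c("AGEP", "PUMA"),
  state = "OR",
  survey = "acs1",
  year = 2022,
  rep_weights = "person"  # "housing" and "both" are also accepted
)
```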
tidycensus links to the survey and srvyr packages for managing PUMS data as complex survey samples
The to_survey()
function will format your data with replicate weights for correct survey-weighted estimation
srvyr conveniently links R’s survey infrastructure to familiar tidyverse-style workflows
Standard errors can be multiplied by 1.645 to get familiar 90% confidence level margins of error
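Assuming or_pums_rep holds PUMS data retrieved with rep_weights = "person", the or_survey object used in the tabulation below can be created with:

```r
library(srvyr)

# Convert to a survey design object that carries the replicate weights
or_survey <- to_survey(or_pums_rep, type = "person")
```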
or_survey %>%
group_by(PUMA) %>%
summarize(median_age = survey_median(AGEP)) %>%
mutate(median_age_moe = median_age_se * 1.645)
# A tibble: 35 × 4
PUMA median_age median_age_se median_age_moe
<chr> <dbl> <dbl> <dbl>
1 00301 33 0.754 1.24
2 00501 40 1.51 2.48
3 00502 41 1.26 2.07
4 00503 43 1.51 2.48
5 00504 43 1.26 2.07
6 01701 41 0.754 1.24
7 01702 48 1.51 2.48
8 01900 46 0.502 0.826
9 02901 40 1.26 2.07
10 02902 44 1.76 2.89
# ℹ 25 more rows
Tabulated median ages are not identical to published estimates, but are very close
Use published estimates if available; use PUMS data to generate estimates that aren’t available in the published tables