Title: | Retrieve Data from the Census APIs |
---|---|
Description: | A wrapper for the U.S. Census Bureau APIs that returns data frames of Census data and metadata. Available datasets include the Decennial Census, American Community Survey, Small Area Health Insurance Estimates, Small Area Income and Poverty Estimates, Population Estimates and Projections, and more. |
Authors: | Hannah Recht [aut, cre] |
Maintainer: | Hannah Recht <[email protected]> |
License: | GPL-3 |
Version: | 0.9.0.9000 |
Built: | 2024-11-01 04:39:16 UTC |
Source: | https://github.com/hrecht/censusapi |
Some small geographies in some Census APIs can only be requested within a state hierarchy. This is a list of state FIPS codes that can be looped over to retrieve data for all states.
fips
A list of FIPS codes for the 50 states and the District of Columbia.
fips
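The looping pattern described above can be sketched as follows. This is a minimal sketch rather than package code: `get_state_tracts()`, `fetch_all_states()`, and `stub_fetch()` are hypothetical helpers, the dataset and variables are borrowed from the getCensus() examples in this manual, and a stub stands in for the API call so the snippet runs without a key or network access.

```r
# Hypothetical helper: request all tracts in one state.
# Calling it assumes censusapi is installed and an API key is available.
get_state_tracts <- function(state_fips) {
  censusapi::getCensus(
    name = "acs/acs5",
    vintage = 2022,
    vars = c("NAME", "B01001_001E"),
    region = "tract:*",
    regionin = paste0("state:", state_fips)
  )
}

# Hypothetical helper: call fetch_one() once per state code and stack the results.
fetch_all_states <- function(state_codes, fetch_one) {
  do.call(rbind, lapply(state_codes, fetch_one))
}

# Offline stand-in that returns one row per requested state.
stub_fetch <- function(state_fips) {
  data.frame(state = state_fips, stringsAsFactors = FALSE)
}

demo_codes <- c("01", "02", "04")  # Alabama, Alaska, Arizona
demo <- fetch_all_states(demo_codes, stub_fetch)
print(demo)

# With the package attached, the real request would presumably look like:
# library(censusapi)
# all_tracts <- fetch_all_states(fips, get_state_tracts)
```

Passing the fetch function as an argument keeps the loop itself independent of any particular dataset, so the same helper can be reused for other state-bound geographies.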
Retrieve a Census API key stored in the .Renviron file
get_api_key()
A CENSUS_KEY or CENSUS_API_KEY string stored in the user's .Renviron file, or a warning message printed once per R session if none is found.
Other helpers:
has_api_key()
## Not run:
get_api_key()
## End(Not run)
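Variables defined in .Renviron are exposed to R as environment variables at startup, so the lookup can be approximated with base R alone. The sketch below is a rough illustration: `lookup_census_key()` is a hypothetical stand-in, not the package's implementation, and the order in which the two variable names are checked is an assumption.

```r
# Hypothetical stand-in for the key lookup; the real get_api_key() may differ.
lookup_census_key <- function() {
  for (var_name in c("CENSUS_KEY", "CENSUS_API_KEY")) {
    value <- Sys.getenv(var_name, unset = "")
    if (nzchar(value)) {
      return(value)
    }
  }
  warning("No Census API key found in CENSUS_KEY or CENSUS_API_KEY.")
  invisible(NULL)
}

# Remember any existing value so the demonstration leaves the session unchanged.
old_key <- Sys.getenv("CENSUS_KEY", unset = NA)

# Set a placeholder (not a real key) for the current session only.
Sys.setenv(CENSUS_KEY = "PLACEHOLDER_KEY")
key <- lookup_census_key()
print(key)

# Restore the previous state.
if (is.na(old_key)) Sys.unsetenv("CENSUS_KEY") else Sys.setenv(CENSUS_KEY = old_key)
```

To make a key persist across sessions, add a line such as `CENSUS_KEY=your_key_here` to .Renviron and restart R.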
Retrieve Census data from a given API
getCensus(
  name,
  vintage = NULL,
  key = NULL,
  vars,
  region = NULL,
  regionin = NULL,
  time = NULL,
  show_call = FALSE,
  convert_variables = TRUE,
  year = NULL,
  date = NULL,
  period = NULL,
  monthly = NULL,
  category_code = NULL,
  data_type_code = NULL,
  naics = NULL,
  pscode = NULL,
  naics2012 = NULL,
  naics2007 = NULL,
  naics2002 = NULL,
  naics1997 = NULL,
  sic = NULL,
  ...
)
name |
The programmatic name of your dataset, e.g. "timeseries/poverty/saipe" or "acs/acs5". Use listCensusApis() to see valid dataset names. Required. |
vintage |
Vintage (year) of dataset, e.g. 2014. Not required for timeseries APIs. |
key |
A Census API key, obtained at
https://api.census.gov/data/key_signup.html. If you have a |
vars |
List of variables to get. Required. |
region |
Geography to get. |
regionin |
Optional hierarchical geography to limit region. |
time |
Time period of data to get. Required for most timeseries APIs. |
show_call |
Display the underlying API call that was sent to the Census Bureau. Default is FALSE. |
convert_variables |
Convert columns that are likely numbers into numeric data. Default is TRUE. If FALSE, all columns will be character, which is the type returned by the Census Bureau. |
year, date, period, monthly, category_code, data_type_code, naics, pscode, naics2012, naics2007, naics2002, naics1997, sic |
Optional arguments used in some timeseries data APIs. |
... |
Other valid arguments to pass to the Census API. Note: the APIs are case sensitive. |
A data frame with results from the specified U.S. Census Bureau dataset.
# Get total population and median household income for Census places
# (cities, towns, villages) in a single state from the 5-year American Community Survey.
acs_simple <- getCensus(
  name = "acs/acs5",
  vintage = 2022,
  vars = c("NAME", "B01001_001E", "B19013_001E"),
  region = "place:*",
  regionin = "state:01")
head(acs_simple)

# Get all data from the B08301 variable group, "Means of Transportation to Work."
# This returns estimates as well as margins of error and annotation flags.
acs_group <- getCensus(
  name = "acs/acs5",
  vintage = 2022,
  vars = "group(B08301)",
  region = "state:*")
head(acs_group)

# Retrieve 2020 Decennial Census block group data within a specific Census tract,
# using the regionin argument to precisely specify the Census tract, county,
# and state.
decennial_block_group <- getCensus(
  name = "dec/dhc",
  vintage = 2020,
  vars = c("NAME", "P1_001N"),
  region = "block group:*",
  regionin = "state:36+county:027+tract:220300")
head(decennial_block_group)

# Get poverty rates for children and for people of all ages beginning in 2000 using the
# Small Area Income and Poverty Estimates API
saipe <- getCensus(
  name = "timeseries/poverty/saipe",
  vars = c("NAME", "SAEPOVRT0_17_PT", "SAEPOVRTALL_PT"),
  region = "state:01",
  time = "from 2000")
head(saipe)

# Get the number of employees and number of establishments in the construction sector,
# NAICS2017 code 23, using the County Business Patterns API
cbp <- getCensus(
  name = "cbp",
  vintage = 2021,
  vars = c("EMP", "ESTAB", "NAICS2017_LABEL"),
  region = "county:*",
  NAICS2017 = 23)
head(cbp)
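To make the show_call option more concrete, here is a rough sketch of how these arguments could map onto a Census API request URL. `build_census_url()` is a hypothetical helper written for illustration, not the package's internal code; it leaves out the key parameter and the optional timeseries arguments, and the exact URL that getCensus() sends may differ.

```r
# Hypothetical helper: assemble a Census API request URL from getCensus-style arguments.
build_census_url <- function(name, vintage = NULL, vars, region = NULL,
                             regionin = NULL, time = NULL) {
  base <- "https://api.census.gov/data"

  # Timeseries datasets have no vintage segment in the path.
  path <- if (is.null(vintage)) {
    paste(base, name, sep = "/")
  } else {
    paste(base, vintage, name, sep = "/")
  }

  # c() drops NULL entries, so unused arguments disappear from the query.
  query <- c(
    get = paste(vars, collapse = ","),
    `for` = region,
    `in` = regionin,
    time = time
  )

  # Percent-encode characters such as spaces, leaving ":", "*" and "," intact.
  encoded <- vapply(query, utils::URLencode, character(1), USE.NAMES = FALSE)
  paste0(path, "?", paste0(names(query), "=", encoded, collapse = "&"))
}

acs_url <- build_census_url(
  name = "acs/acs5",
  vintage = 2022,
  vars = c("NAME", "B01001_001E"),
  region = "place:*",
  regionin = "state:01")
print(acs_url)

saipe_url <- build_census_url(
  name = "timeseries/poverty/saipe",
  vars = c("NAME", "SAEPOVRTALL_PT"),
  region = "state:01",
  time = "from 2000")
print(saipe_url)
```

Comparing a URL built this way with the one printed by `show_call = TRUE` can help when debugging a request that returns an error.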
Is there a saved Census API key in the .Renviron file?
has_api_key()
TRUE or FALSE.
Other helpers:
get_api_key()
has_api_key()
Scrapes https://api.census.gov/data.json and returns a data frame that includes columns for dataset title, description, name, vintage, url, dataset type, and other useful fields.
listCensusApis(name = NULL, vintage = NULL)
name |
Optional complete or partial API dataset programmatic name. For
example, "acs", "acs/acs5", "acs/acs5/subject". If using a partial name,
this needs to be the left-most part of the dataset name before |
vintage |
Optional vintage (year) of dataset. |
A data frame with the following columns:
title: Short written description of the dataset.
name: Programmatic name of the dataset.
vintage: Year of the survey, for use with microdata and aggregate datasets.
type: Dataset type, which is either "Aggregate", "Microdata", or "Timeseries".
temporal: Time period of the dataset. Warning: not always documented.
spatial: Spatial region of the dataset. Warning: not always documented.
url: Base URL of the dataset endpoint.
modified: Date last modified. Warning: sometimes out of date.
description: Long written description of the dataset.
contact: Email address for specific questions about the Census Bureau survey.
Other metadata:
listCensusMetadata(), makeVarlist()
# Get information about every dataset available in the APIs
apis <- listCensusApis()
head(apis)

# Get information about all vintage 2022 datasets
apis_2022 <- listCensusApis(vintage = 2022)
head(apis_2022)

# Get information about all timeseries datasets
apis_timeseries <- listCensusApis(name = "timeseries")
head(apis_timeseries)

# Get information about 2020 Decennial Census datasets
apis_decennial_2020 <- listCensusApis(name = "dec", vintage = 2020)
head(apis_decennial_2020)

# Get information about one particular dataset
api_sahie <- listCensusApis(name = "timeseries/healthins/sahie")
head(api_sahie)
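Because the result is an ordinary data frame, it can be narrowed further with base R after it is retrieved. The sketch below uses a small hand-written stand-in, `apis_stub`, with a subset of the documented columns so that it runs offline; its rows are illustrative assumptions, not actual API output.

```r
# Hand-written stand-in for a listCensusApis() result (illustrative rows only).
apis_stub <- data.frame(
  name = c("acs/acs5", "cps/voting/nov", "timeseries/poverty/saipe"),
  vintage = c(2022, 2020, NA),
  type = c("Aggregate", "Microdata", "Timeseries"),
  stringsAsFactors = FALSE
)

# Keep only timeseries datasets, using the documented type column.
timeseries_only <- apis_stub[apis_stub$type == "Timeseries", ]

# Keep datasets whose programmatic name starts with "acs".
acs_only <- apis_stub[startsWith(apis_stub$name, "acs"), ]

print(timeseries_only$name)
print(acs_only$name)
```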
Get information about a Census Bureau API dataset, including its available variables, geographies, variable groups, and value labels
listCensusMetadata( name, vintage = NULL, type = "variables", group = NULL, variable_name = NULL, include_values = FALSE )
name |
API programmatic name - e.g. acs/acs5. Use |
vintage |
Vintage (year) of dataset. Not required for timeseries APIs. |
type |
Type of metadata to return. Options are:
|
group |
An optional variable group code, used to return metadata for a specific group of variables only. Variable groups are not used for all APIs. |
variable_name |
A name of a specific variable used to return value labels for that variable. Value labels are not used for all APIs. |
include_values |
Use with |
A data frame with metadata about the specified API endpoint.
Other metadata:
listCensusApis(), makeVarlist()
# type: variables
# List the variables available in the Small Area
# Health Insurance Estimates.
variables <- listCensusMetadata(
  name = "timeseries/healthins/sahie",
  type = "variables")
head(variables)

# type: variables for a single variable group
# List the variables that are included in the B17020 group in the
# 5-year American Community Survey.
variable_group <- listCensusMetadata(
  name = "acs/acs5",
  vintage = 2022,
  type = "variables",
  group = "B17020")
head(variable_group)

# type: variables, with value labels
# Create a data dictionary with all variable names and encoded values
# for a microdata API.
variable_values <- listCensusMetadata(
  name = "cps/voting/nov",
  vintage = 2020,
  type = "variables",
  include_values = TRUE)
head(variable_values)

# type: geographies
# List the geographies available in the 5-year American Community Survey.
geographies <- listCensusMetadata(
  name = "acs/acs5",
  vintage = 2022,
  type = "geographies")
head(geographies)

# type: groups
# List the variable groups available in the 5-year American
# Community Survey.
groups <- listCensusMetadata(
  name = "acs/acs5",
  vintage = 2022,
  type = "groups")
head(groups)

# type: values for a single variable
# List the value labels of the NAICS2017 variable in the County
# Business Patterns dataset.
naics_values <- listCensusMetadata(
  name = "cbp",
  vintage = 2021,
  type = "values",
  variable_name = "NAICS2017")
head(naics_values)
Return a list of variable names or a data frame of variable metadata containing a given string. This can be used to create a list of variables to later pass to getCensus, or a data frame documenting variables used in a given project.
makeVarlist(name, vintage = NULL, find, varsearch = "all", output = "list")
name |
API programmatic name - e.g. acs/acs5. Use |
vintage |
Vintage (year) of dataset. Not required for timeseries APIs. |
find |
A string to find in the variable metadata. |
varsearch |
Optional argument specifying which fields to search. Default is "all". Options are "all", "name", "label", or "concept". |
output |
Optional argument, specifying output to "list" or "dataframe". Default is "list". |
A list of variable names or a data frame containing variable metadata, depending on the output argument.
Other metadata:
listCensusApis(), listCensusMetadata()
# Return a list, and then use getCensus function to retrieve those variables
myvars <- makeVarlist(
  name = "timeseries/poverty/saipe",
  find = "Ages 0-4",
  varsearch = "label")
myvars

saipe_dt <- getCensus(
  name = "timeseries/poverty/saipe",
  time = 2016,
  vars = myvars,
  region = "state:*")
head(saipe_dt)
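The search itself can be pictured as a string match over the chosen metadata fields. The sketch below is a guess at the general idea, not the package's implementation: `find_vars()` is a hypothetical helper, literal (non-regex) matching is an assumption, and `metadata_stub` is a hand-written stand-in whose labels are illustrative placeholders rather than text copied from the API.

```r
# Hypothetical helper: search metadata fields for a string and return names or rows.
find_vars <- function(metadata, find, varsearch = "all", output = "list") {
  fields <- if (varsearch == "all") c("name", "label", "concept") else varsearch

  # A row matches if the string appears in any of the searched fields.
  hits <- Reduce(`|`, lapply(fields, function(field) {
    grepl(find, metadata[[field]], fixed = TRUE)
  }))

  matched <- metadata[hits, , drop = FALSE]
  if (output == "list") matched$name else matched
}

# Hand-written stand-in for variable metadata (labels are placeholders).
metadata_stub <- data.frame(
  name = c("SAEPOVRT0_17_PT", "SAEPOVRTALL_PT"),
  label = c("Ages 0-17 in Poverty, Rate Estimate",
            "All ages in Poverty, Rate Estimate"),
  concept = c(NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

myvars_demo <- find_vars(metadata_stub, find = "Ages 0-17", varsearch = "label")
print(myvars_demo)

vars_table <- find_vars(metadata_stub, find = "Poverty", output = "dataframe")
print(nrow(vars_table))
```

Returning names by default makes the result easy to pass straight to the vars argument of getCensus, while the data frame form suits documentation of the variables used in a project.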