The most difficult thing about working with BLS data is gaining a clear understanding on what data are available and what they represent. Some of the more popular data sets can be found on the BLS Databases, Tables & Calculations website. The selected examples below do not include all series or databases.
OES (Occupational Employment Statistics) includes employment, benefits, and wages segmented by metro area and occupation.
National Compensation Survey-Benefits includes survey data of those who have benefits available and who takes advantage of those benefits.
Other wage data can be found in the CPS, CES, and QCEW, which are covered in the Employment seciton of these vignettes.
Note: The hyperlinks above link to lists of the most popular seriesIDs, but are only a small sample of all the data tracked by the BLS.
The OES contains similar wage data found in the CPS, but often has more resolution in certain geographic areas. Unlike the CPS, the OES is an annual survey and does not keep time series data.
For example, we may want to compare the average hourly wage of Computer and Information Systems Managers in Orlando, FL to those in San Jose, CA. Notice, below the survey only returns values for 2015.
# Computer and Information Systems Managers in Orlando, FL and San Jose, CA.
# Orlando: "OEUM003674000000011302103"
# San Jose: "OEUM004194000000011302108"
library(blscrapeR)
df <- bls_api(c("OEUM003674000000011302103", "OEUM004194000000011302108"))
head(df)
Another OES example would be to grab the most recent Annual mean wage for All Occupations in All Industries in the United States.
library(blscrapeR)
df <- bls_api("OEUN000000000000000000004")
head(df)
This data set includes time series data on how much employers pay for employee benefits as a total cost and as a percent of employee wages and salaries.
For example, if we want to see the total cost of benefits per hour work and also see what percentage that is of the total compensation, we could run the following script.
library(blscrapeR)
library(tidyverse)
df <- bls_api(c("CMU1030000000000D", "CMU1030000000000P"))
# Spread series ids and rename columns to human readable format.
df.sp <- spread(df, seriesID, value) %>%
rename("hourly_cost"=CMU1030000000000D, "pct_of_wages"=CMU1030000000000P) %>%
# Percentages are represented as floating integers. Fix this to avoid confusion.
mutate(pct_of_wages = pct_of_wages*0.01)
head(df.sp)
This survey includes data on how many Americans have access to certain benefits. For example, we can see the percentage of those who have access to paid vacation days and those who have access to Health insurance through their employers.
library(blscrapeR)
library(tidyverse)
df <- bls_api(c("NBU10500000000000033030", "NBU11500000000000028178"))
# Spread series ids and rename columns to human readable format.
df.sp <- spread(df, seriesID, value) %>%
rename("pct_paid_vacation"=NBU10500000000000033030, "pct_health_ins"=NBU11500000000000028178) %>%
# Value data are in whole numbers but represent percentages. Fix this to avoid confusion.
mutate(pct_of_wages = pct_of_wages*0.01, pct_health_ins = pct_health_ins*0.01)
head(df.sp)