Household demographic metadata for households participating in the Customer Journey study. Due to the nature of the data, demographic information is not available for all households.
demographics
A data frame with 801 rows and 8 variables
household_id: Uniquely identifies each household
age: Estimated age range
income: Household income range
home_ownership: Homeowner status (Homeowner, Renter, Unknown)
marital_status: Marital status (Married, Unmarried, Unknown)
household_size: Size of household up to 5+
household_comp: Household composition description
kids_count: Number of children present up to 3+
84.51°, Customer Journey study, https://www.8451.com/area51/
a tibble
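Most of the variables are stored as ordered factors (`<ord>`), so range-style comparisons are possible. The following is a minimal sketch of that idea on a toy data frame; the `toy` object and the `income_levels` vector are illustrative assumptions (a hypothetical subset of levels), not the package's actual level set.

```r
# Hypothetical subset of income levels, ordered from lowest to highest.
# The real `demographics$income` may have more levels than shown here.
income_levels <- c("Under 15K", "15-24K", "25-34K", "35-49K", "50-74K")

# Toy stand-in for a few demographic rows (illustrative values only)
toy <- data.frame(
  household_id = c("1", "1001", "1003", "101"),
  income = factor(
    c("35-49K", "50-74K", "25-34K", "Under 15K"),
    levels = income_levels,
    ordered = TRUE
  ),
  stringsAsFactors = FALSE
)

# Ordered factors support comparison operators against a level name
higher_income <- toy[toy$income >= "35-49K", ]
```

With the real data, `levels(demographics$income)` shows the actual ordering before any such comparison is written.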
# \donttest{
# full data set
demographics
#> # A tibble: 801 × 8
#>    household_id age   income    home_ownership marital_status household_size
#>    <chr>        <ord> <ord>     <ord>          <ord>          <ord>
#>  1 1            65+   35-49K    Homeowner      Married        2
#>  2 1001         45-54 50-74K    Homeowner      Unmarried      1
#>  3 1003         35-44 25-34K    NA             Unmarried      1
#>  4 1004         25-34 15-24K    NA             Unmarried      1
#>  5 101          45-54 Under 15K Homeowner      Married        4
#>  6 1012         35-44 35-49K    NA             Married        5+
#>  7 1014         45-54 15-24K    NA             Married        4
#>  8 1015         45-54 50-74K    Homeowner      Unmarried      1
#>  9 1018         45-54 35-49K    Homeowner      Married        5+
#> 10 1020         45-54 25-34K    Homeowner      Married        2
#> # ℹ 791 more rows
#> # ℹ 2 more variables: household_comp <ord>, kids_count <ord>
# Transaction line items for households that have no demographic metadata
library(dplyr)
transactions_sample %>%
  anti_join(demographics, by = "household_id")
#> # A tibble: 32,801 × 11
#>    household_id store_id basket_id   product_id quantity sales_value retail_disc
#>    <chr>        <chr>    <chr>       <chr>         <dbl>       <dbl>       <dbl>
#>  1 2261         309      31625220889 940996            1        3.86        0.43
#>  2 2131         368      32053127496 873902            1        1.59        0.9
#>  3 511          316      32445856036 847901            1        1           0.69
#>  4 918          340      32074655895 1085604           1        1.29        0
#>  5 1688         450      34850403304 1028715           1        2           1.79
#>  6 467          31782    31280745102 896613            2        6.55        4.44
#>  7 1947         32004    32744181707 978497            1        3.99        0
#>  8 568          446      32932232291 949023            1        3.49        0.5
#>  9 1783         369      33409764350 1079223           1        1           0
#> 10 401          31642    40955342402 839753            1        0.17        0
#> # ℹ 32,791 more rows
#> # ℹ 4 more variables: coupon_disc <dbl>, coupon_match_disc <dbl>, week <int>,
#> #   transaction_timestamp <dttm>
# }
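The `anti_join()` call above keeps transaction rows whose `household_id` has no match in `demographics`. As a rough sketch of how that filtering join behaves, and how it pairs with `semi_join()`, the example below uses two small toy tibbles; `toy_demographics`, `toy_transactions`, and their values are made up for illustration and are not records from the package.

```r
library(dplyr)

# Toy stand-ins for the package data (hypothetical values, not real records)
toy_demographics <- tibble(
  household_id = c("1", "1001", "1003"),
  income       = c("35-49K", "50-74K", "25-34K")
)
toy_transactions <- tibble(
  household_id = c("1", "1001", "2261", "2261", "511"),
  sales_value  = c(3.86, 1.59, 1.00, 1.29, 2.00)
)

# Rows whose household has no demographic record
unmatched <- toy_transactions %>%
  anti_join(toy_demographics, by = "household_id")

# Rows whose household does have a demographic record
matched <- toy_transactions %>%
  semi_join(toy_demographics, by = "household_id")

# Share of distinct transacting households that lack demographics
coverage <- toy_transactions %>%
  distinct(household_id) %>%
  mutate(has_demographics = household_id %in% toy_demographics$household_id) %>%
  summarise(share_missing = mean(!has_demographics))
```

Both joins only filter rows of the left-hand table and never add columns from `demographics`, so the result keeps the column layout of the transactions table.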