Product metadata for all products purchased by households participating in the Customer Journey study.
products
A data frame with 92,331 rows and 7 variables
product_id: Uniquely identifies each product
manufacturer_id: Uniquely identifies each manufacturer
department: Groups similar products together at the highest level
brand: Indicates whether the product is a Private or National label brand
product_category: Groups similar products together at a lower level
product_type: Groups similar products together at the lowest level
package_size: Indicates package size (not available for all products)
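The three grouping columns form a hierarchy, from department down to product_type. The following is a minimal sketch of how one might count the distinct values at each level and check how often package_size is missing. It uses a small hypothetical data frame (toy_products, with invented values) that only mirrors the column names listed above; it is not taken from the data set.

```r
# Hypothetical toy rows with the same columns as `products`;
# the values are invented for illustration only.
toy_products <- data.frame(
  product_id       = c("1", "2", "3", "4"),
  manufacturer_id  = c("10", "10", "20", "20"),
  department       = c("GROCERY", "GROCERY", "GROCERY", "PASTRY"),
  brand            = factor(c("National", "Private", "Private", "Private")),
  product_category = c("COOKIES", "COOKIES", "SPICES", "BREAD"),
  product_type     = c("TRAY PACK", "SPECIALTY", "SEASONINGS", "ITALIAN"),
  package_size     = c("12 OZ", NA, "2 OZ", NA),
  stringsAsFactors = FALSE
)

# Number of distinct values at each level of the hierarchy
levels_count <- sapply(
  toy_products[c("department", "product_category", "product_type")],
  function(x) length(unique(x))
)

# Share of products without a package size
share_missing_size <- mean(is.na(toy_products$package_size))
```

The same two expressions should work on the real products tibble, assuming the completejourney package is loaded.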
84.51°, Customer Journey study, https://www.8451.com/area51/
a tibble
# \donttest{
# full data set
products
#> # A tibble: 92,331 × 7
#>    product_id manufacturer_id department    brand  product_category product_type
#>    <chr>      <chr>           <chr>         <fct>  <chr>            <chr>
#>  1 25671      2               GROCERY       Natio… FRZN ICE         ICE - CRUSH…
#>  2 26081      2               MISCELLANEOUS Natio… NA               NA
#>  3 26093      69              PASTRY        Priva… BREAD            BREAD:ITALI…
#>  4 26190      69              GROCERY       Priva… FRUIT - SHELF S… APPLE SAUCE
#>  5 26355      69              GROCERY       Priva… COOKIES/CONES    SPECIALTY C…
#>  6 26426      69              GROCERY       Priva… SPICES & EXTRAC… SPICES & SE…
#>  7 26540      69              GROCERY       Priva… COOKIES/CONES    TRAY PACK/C…
#>  8 26601      69              DRUG GM       Priva… VITAMINS         VITAMIN - M…
#>  9 26636      69              PASTRY        Priva… BREAKFAST SWEETS SW GDS: SW …
#> 10 26691      16              GROCERY       Priva… PNT BTR/JELLY/J… HONEY
#> # ℹ 92,321 more rows
#> # ℹ 1 more variable: package_size <chr>
# Transaction line items that don't have product metadata
library(dplyr)
transactions_sample %>%
  anti_join(products, by = "product_id")
#> # A tibble: 222 × 11
#>    household_id store_id basket_id   product_id quantity sales_value retail_disc
#>    <chr>        <chr>    <chr>       <chr>         <dbl>       <dbl>       <dbl>
#>  1 1166         408      31969185576 5978656           0           0           0
#>  2 867          369      40436331223 5978656           0           0           0
#>  3 40           406      40085429046 5978648           0           0           0
#>  4 1633         32004    32187016694 5978656           0           0           0
#>  5 2305         450      35000781083 5978648           0           0           0
#>  6 910          299      35293721775 5978648           0           0           0
#>  7 2178         343      32557004315 5978656           0           0           0
#>  8 115          329      40968842134 5978648           0           0           0
#>  9 367          368      35840831094 5978648           0           0           0
#> 10 2462         403      40853241224 5978648           0           0           0
#> # ℹ 212 more rows
#> # ℹ 4 more variables: coupon_disc <dbl>, coupon_match_disc <dbl>, week <int>,
#> #   transaction_timestamp <dttm>
# }
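The anti-join in the example above keeps only the transaction rows whose product_id has no match in the product table. As a rough illustration of that logic, here is a base R sketch on hypothetical toy data (toy_transactions and toy_products are invented for this example and are not part of the package). It is an approximate equivalent for a single key column, not how dplyr implements anti_join().

```r
# Hypothetical toy data; the IDs are invented for illustration.
toy_transactions <- data.frame(
  basket_id  = c("b1", "b1", "b2", "b3"),
  product_id = c("1", "2", "9", "9"),
  stringsAsFactors = FALSE
)
toy_products <- data.frame(
  product_id = c("1", "2", "3"),
  department = c("GROCERY", "GROCERY", "PASTRY"),
  stringsAsFactors = FALSE
)

# Keep transaction rows whose product_id does not appear in the product table
unmatched <- toy_transactions[
  !toy_transactions$product_id %in% toy_products$product_id,
]

# Count unmatched line items per product_id
unmatched_counts <- table(unmatched$product_id)
```

Tabulating the unmatched IDs this way can show whether the missing metadata is concentrated in a few product_id values, as the printed output above suggests for the sample.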