Product metadata for all products purchased by households participating in the Customer Journey study.

products

Format

A data frame with 92,331 rows and 7 variables

  • product_id: Uniquely identifies each product

  • manufacturer_id: Uniquely identifies each manufacturer

  • department: Groups similar products together

  • brand: Indicates Private or National label brand

  • product_category: Groups similar products together at a lower level

  • product_type: Groups similar products together at the lowest level

  • package_size: Indicates package size (not available for all products)

Source

84.51°, Customer Journey study, http://www.8451.com/area51/

Value

products

a tibble

Examples

# full data set
products
#> # A tibble: 92,331 x 7
#>    product_id manufacturer_id department brand product_category product_type
#>    <chr>      <chr>           <chr>      <fct> <chr>            <chr>
#>  1 25671      2               GROCERY    Nati… FRZN ICE         ICE - CRUSH…
#>  2 26081      2               MISCELLAN… Nati… NA               NA
#>  3 26093      69              PASTRY     Priv… BREAD            BREAD:ITALI…
#>  4 26190      69              GROCERY    Priv… FRUIT - SHELF S… APPLE SAUCE
#>  5 26355      69              GROCERY    Priv… COOKIES/CONES    SPECIALTY C…
#>  6 26426      69              GROCERY    Priv… SPICES & EXTRAC… SPICES & SE…
#>  7 26540      69              GROCERY    Priv… COOKIES/CONES    TRAY PACK/C…
#>  8 26601      69              DRUG GM    Priv… VITAMINS         VITAMIN - M…
#>  9 26636      69              PASTRY     Priv… BREAKFAST SWEETS SW GDS: SW …
#> 10 26691      16              GROCERY    Priv… PNT BTR/JELLY/J… HONEY
#> # … with 92,321 more rows, and 1 more variable: package_size <chr>

# Transaction line items that don't have product metadata
require("dplyr")
transactions_sample %>%
  anti_join(products, "product_id")
#> # A tibble: 222 x 11
#>    household_id store_id basket_id product_id quantity sales_value retail_disc
#>    <chr>        <chr>    <chr>     <chr>         <dbl>       <dbl>       <dbl>
#>  1 1166         408      31969185… 5978656           0           0           0
#>  2 867          369      40436331… 5978656           0           0           0
#>  3 40           406      40085429… 5978648           0           0           0
#>  4 1633         32004    32187016… 5978656           0           0           0
#>  5 2305         450      35000781… 5978648           0           0           0
#>  6 910          299      35293721… 5978648           0           0           0
#>  7 2178         343      32557004… 5978656           0           0           0
#>  8 115          329      40968842… 5978648           0           0           0
#>  9 367          368      35840831… 5978648           0           0           0
#> 10 2462         403      40853241… 5978648           0           0           0
#> # … with 212 more rows, and 4 more variables: coupon_disc <dbl>,
#> #   coupon_match_disc <dbl>, week <int>, transaction_timestamp <dttm>
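
The variable descriptions in Format suggest two further checks: how many distinct values exist at each grouping level (department, product_category, product_type), and how often package_size is missing. The following is a minimal sketch, not part of the original examples. It assumes `products` is available from the completejourney package; the names `hierarchy_counts` and `missing_share` are hypothetical and introduced only for illustration. No output is shown because the results depend on the installed version of the data.

```r
library(dplyr)
library(completejourney)  # assumption: the package that provides `products`

# Number of distinct values at each level of the product grouping
hierarchy_counts <- products %>%
  summarise(
    departments = n_distinct(department),
    categories  = n_distinct(product_category),
    types       = n_distinct(product_type)
  )
hierarchy_counts

# Share of products with no package size recorded
missing_share <- products %>%
  summarise(missing_package_size = mean(is.na(package_size)))
missing_share
```

Both summaries return a one-row tibble, so they can be combined with `bind_cols()` if a single overview row is preferred.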