Product metadata for all products purchased by households participating in the Customer Journey study.

products

Format

A data frame with 92,331 rows and 7 variables:

  • product_id: Uniquely identifies each product

  • manufacturer_id: Uniquely identifies each manufacturer

  • department: Groups similar products together

  • brand: Indicates whether the product is a Private or National label brand

  • product_category: Groups similar products together at a lower level

  • product_type: Groups similar products together at the lowest level

  • package_size: Indicates package size (not available for all products)

Source

84.51°, Customer Journey study, https://www.8451.com/area51/

Value

products: a tibble

Examples

# \donttest{
# full data set
products
#> # A tibble: 92,331 × 7
#>    product_id manufacturer_id department    brand  product_category product_type
#>    <chr>      <chr>           <chr>         <fct>  <chr>            <chr>       
#>  1 25671      2               GROCERY       Natio… FRZN ICE         ICE - CRUSH…
#>  2 26081      2               MISCELLANEOUS Natio… NA               NA          
#>  3 26093      69              PASTRY        Priva… BREAD            BREAD:ITALI…
#>  4 26190      69              GROCERY       Priva… FRUIT - SHELF S… APPLE SAUCE 
#>  5 26355      69              GROCERY       Priva… COOKIES/CONES    SPECIALTY C…
#>  6 26426      69              GROCERY       Priva… SPICES & EXTRAC… SPICES & SE…
#>  7 26540      69              GROCERY       Priva… COOKIES/CONES    TRAY PACK/C…
#>  8 26601      69              DRUG GM       Priva… VITAMINS         VITAMIN - M…
#>  9 26636      69              PASTRY        Priva… BREAKFAST SWEETS SW GDS: SW …
#> 10 26691      16              GROCERY       Priva… PNT BTR/JELLY/J… HONEY       
#> # ℹ 92,321 more rows
#> # ℹ 1 more variable: package_size <chr>
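
# A minimal sketch, not part of the original examples: count products per
# department and brand to see how the metadata is distributed.
# Assumes dplyr is installed and `products` is available as above;
# `dept_brand_counts` is a name chosen here for illustration.
dept_brand_counts <- dplyr::count(products, department, brand, sort = TRUE)
dept_brand_counts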

# Transaction line items that don't have product metadata
library(dplyr)
transactions_sample %>%
  anti_join(products, by = "product_id")
#> # A tibble: 222 × 11
#>    household_id store_id basket_id   product_id quantity sales_value retail_disc
#>    <chr>        <chr>    <chr>       <chr>         <dbl>       <dbl>       <dbl>
#>  1 1166         408      31969185576 5978656           0           0           0
#>  2 867          369      40436331223 5978656           0           0           0
#>  3 40           406      40085429046 5978648           0           0           0
#>  4 1633         32004    32187016694 5978656           0           0           0
#>  5 2305         450      35000781083 5978648           0           0           0
#>  6 910          299      35293721775 5978648           0           0           0
#>  7 2178         343      32557004315 5978656           0           0           0
#>  8 115          329      40968842134 5978648           0           0           0
#>  9 367          368      35840831094 5978648           0           0           0
#> 10 2462         403      40853241224 5978648           0           0           0
#> # ℹ 212 more rows
#> # ℹ 4 more variables: coupon_disc <dbl>, coupon_match_disc <dbl>, week <int>,
#> #   transaction_timestamp <dttm>
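
# A minimal sketch, not part of the original examples: attach product
# metadata to every transaction line item. Line items without a match, such
# as those listed above, keep NA in the metadata columns.
# Assumes dplyr is attached as above; `transactions_with_products` is a name
# chosen here for illustration.
transactions_with_products <- transactions_sample %>%
  left_join(products, by = "product_id")
transactions_with_products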
# }