DMC Historical demand forecasting datasets
EDIT: Research originally done as part of a HSA, I’ll publish it here because why not, may be useful.
Best
- Demand Forecast for Optimized Inventory Planning | Kaggle
- Literally DMC 2020: DMC 2020 – DATA MINING CUP
- We have orders for ItemID, for each a salesPrice
- items.csv with item/brand/manufacturer, customerRating and three category attributes (‘categorical/hierarchical’?).
- Retail Store Sales Transactions (Scanner Data) | Kaggle
- One year: 2016
- Really nice!
- Date, CustomerID, item, category, quantity, ~price
- “The anonymized dataset includes 64.682 transactions of 5.242 SKU’s sold to 22.625 customers during one year.”
- Retail Data Set | Kaggle
- Date, Article+quantity, price, discount, customer ID
- 29k rows, 2019-2023
- Demand Forecasting | Kaggle
- 2011-2013, ~2k stores
- for each week: product id, units sold, total price + base price (?), whether it was featured and displayed (?)
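Several of the datasets above are transaction-level (date, customer, item, category, quantity, price), while demand forecasting usually wants per-item demand per period. A minimal sketch of that aggregation in plain Python; the rows, column order and SKU names below are made up for illustration and don't come from any of the listed datasets:

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (date, customer_id, item, category, quantity, price).
transactions = [
    (date(2016, 1, 4), "C1", "SKU-1", "CAT-A", 2, 3.50),
    (date(2016, 1, 6), "C2", "SKU-1", "CAT-A", 1, 3.50),
    (date(2016, 1, 7), "C1", "SKU-2", "CAT-B", 5, 1.20),
    (date(2016, 1, 12), "C3", "SKU-1", "CAT-A", 4, 3.40),
]


def weekly_demand(rows):
    """Sum the quantity sold per (item, ISO year, ISO week)."""
    demand = defaultdict(int)
    for day, _customer, item, _category, quantity, _price in rows:
        iso_year, iso_week, _weekday = day.isocalendar()
        demand[(item, iso_year, iso_week)] += quantity
    return dict(demand)


weekly = weekly_demand(transactions)
for key in sorted(weekly):
    print(key, weekly[key])
```

Grouping by ISO week is just one choice; with real data, pandas and a proper resampling step would probably be more convenient.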
Interesting
- Walmart Dataset | Kaggle
- 45 stores, weekly sales in total, not per-item!
- Holiday week or not + Temperature / fuel price / CPI / Unemployment
- 2010-2012, of course USA
- Store Item Demand Forecasting Challenge | Kaggle
- Daily sales of 50 items across 10 stores, no categories
- 2013-2017
- time-based train/test split
- Used here: Machine Learning for Retail Demand Forecasting | by Samir Saci | Towards Data Science
- Predicting the sales of products of a retail chain | Kaggle
- 2012-2014, product/dept and one category
- 333 stores in 3 Indian states
- Has prices for each product by outlet and week
- Supermarket sales | Kaggle
- per-category data!
- category (“Electronics”), quantity bought, price, datetime of sale
- Price, + tax and A LOT of similar info
- “Gross income” whatever this means
- Branch, city
- Customer gender/membership
- Customer rating of shopping experience
- Jan-Mar 2019
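For the datasets with a time-based train/test split, the idea is that the model only ever sees data from before the period it is evaluated on. A rough sketch of such a split, assuming rows shaped like (date, store, item, units sold); the rows and the cutoff date are invented for illustration and are not the actual split of any dataset above:

```python
from datetime import date

# Hypothetical daily sales rows: (date, store, item, units_sold).
sales = [
    (date(2017, 12, 30), 1, 1, 14),
    (date(2013, 1, 1), 1, 1, 13),
    (date(2017, 1, 1), 1, 1, 19),
    (date(2016, 6, 15), 2, 1, 21),
]


def time_based_split(rows, cutoff):
    """Rows strictly before `cutoff` go to train, the rest go to test."""
    ordered = sorted(rows, key=lambda row: row[0])
    train = [row for row in ordered if row[0] < cutoff]
    test = [row for row in ordered if row[0] >= cutoff]
    return train, test


# The cutoff is an arbitrary assumption: hold out the last year of data.
train, test = time_based_split(sales, cutoff=date(2017, 1, 1))
print(len(train), "train rows,", len(test), "test rows")
```

Unlike a random split, this avoids leaking future sales into training.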
Not too interesting
- Retail Sales Dataset 2018-2022 | Kaggle
- Monthly sales data, article and “group” (=category), avg. price etc.
- 2018-2022, no stores/cities, “global retail company” - could be aggregated worldwide?
- Pakistan’s Largest E-Commerce Dataset | Kaggle
- Everything we might want, but E-commerce and Pakistan
- Categories and items not anonymized!
- Looks messy
- Completed/cancelled orders
- Retail Demand Forecasting Dataset | Kaggle
- Unknown/anonymous
- Products belonging to one of 70k categories (long-tailed distribution)
- warehouse demand
- Jan-Nov 2016
- For each week: whether it’s a state/school holiday, promo, etc.
- Weekly SKU level Product Sales Transactions | Kaggle
- for each article - number of sales in 6 weeks + price + store
- Forecasts for Product Demand | Kaggle
- More about manufacturing demand and bulk things than retail
- UCI Machine Learning Repository: Demand Forecasting for a store Data Set
- Meal delivery company
- Category, sub-category, price, discount, featured on homepage etc.
- Has per-meal metadata (cuisine etc.)
- Hierarchical sales data of an Italian grocery store - Mendeley Data
- 2014-2018, pasta sold in an Italian store, + presence/absence of promotion
- Australia Grocery Product Dataset | Kaggle
- No demand data, just non-anonymized products with category/subcategory and price, available in Australia on a per-city-postal-code basis, plus whether each is in stock at that time
- May be useful as taxonomy or something?..
- Superstore Dataset | Kaggle
- FAKE Tableau sample dataset, but is 10/10 what we look for
- Has everything, not anonymized - product + quantity, category+sub-category, price, discount
- order/shipping date, customer name/segment/city
- 2014-2017
TODO
- Continue looking through this: Search | Kaggle
- Corporación Favorita Grocery Sales Forecasting | Kaggle
- Ecuador, the usual
- “dates, store and item information, whether that item was being promoted, as well as the unit sales. Additional files include supplementary information that may be useful in building your models.”
- TODO I get an error when I “accept” the terms of the competition in two browsers, can’t look inside
- Retail Sales Forecasting | Kaggle
- “Brazilian top retailer”, many stores etc.
- Four columns, ‘venda’ and ‘estoque’ (Portuguese for ‘sale’ and ‘stock’) ?..
- I guess all categories and items summed up
- 2014-2016
Sources / useful search terms
- retail | grocery, demand | sales
- SKU is a term of art, basically ArticleID
- “merchandise planning”
- Kaggle category ‘retail and shopping’: Find Open Datasets and Machine Learning Projects | Kaggle
- SKU: Search | Kaggle
- Dataset Search
- Dataset Search