DMC Historical demand forecasting datasets
EDIT: This research was originally done as part of an HSA; I’m publishing it here because why not, it may be useful.
- Demand Forecast for Optimized Inventory Planning | Kaggle
- Literally DMC 2020: DMC 2020 – DATA MINING CUP
- We have orders per ItemID, each with a salesPrice
- items.csv with item/brand/manufacturer, customerRating and three category attributes (‘categorical/hierarchical’?).
- Retail Store Sales Transactions (Scanner Data) | Kaggle
- One year: 2016
- Really nice!
- Date, CustomerID, item, category, quantity, ~price
The anonymized dataset includes 64.682 transactions of 5.242 SKU’s sold to 22.625 customers during one year.
- Retail Data Set | Kaggle
- Date, Article+quantity, price, discount, customer ID
- 29k rows, 2019-2023
- Demand Forecasting | Kaggle
- 2011-2013, ~2k stores
- for each week: product id, units sold, total price + base price (?), whether it was featured and displayed (?)
- Walmart Dataset | Kaggle
- 45 stores, weekly sales in total, not per-item!
- Holiday week or not + Temperature / fuel price / CPI / Unemployment
- 2010-2012, of course USA
- Store Item Demand Forecasting Challenge | Kaggle
- Daily sales of 50 items across 10 stores, no categories
- 2013-2017
- time-based train/test split
- Used here: Machine Learning for Retail Demand Forecasting | by Samir Saci | Towards Data Science
- Predicting the sales of products of a retail chain | Kaggle
- 2012-2014, product/dept and one category
- 333 stores in 3 Indian states
- Has prices for each product by outlet and week
- Supermarket sales | Kaggle
- category (“Electronics”), quantity bought, price, datetime of sale
- Price, + tax and A LOT of similar info
- “Gross income” whatever this means
- Branch, city
- Customer gender/membership
- Customer rating of shopping experience
- Jan-Mar 2019
- per-category data!
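Most of the datasets above boil down to rows of date + item + quantity, and the Store Item Demand Forecasting Challenge uses a time-based train/test split. Below is a minimal sketch of both steps (summing transactions into daily per-item demand, then splitting by date), using only the standard library. The toy rows, the tuple layout, and the helper names `daily_demand` and `time_based_split` are made up for illustration and do not match any particular dataset's schema.

```python
from collections import defaultdict
from datetime import date

# Toy transaction log as (date, item, quantity) tuples; the layout is an
# assumption for illustration, not the schema of any dataset listed above.
transactions = [
    (date(2016, 1, 1), "A", 2),
    (date(2016, 1, 1), "A", 1),
    (date(2016, 1, 1), "B", 1),
    (date(2016, 1, 2), "A", 3),
    (date(2016, 1, 3), "A", 1),
    (date(2016, 1, 3), "B", 4),
    (date(2016, 1, 4), "B", 2),
]


def daily_demand(rows):
    """Sum quantities per (date, item), returned sorted by date, then item."""
    totals = defaultdict(int)
    for day, item, quantity in rows:
        totals[(day, item)] += quantity
    return sorted((day, item, qty) for (day, item), qty in totals.items())


def time_based_split(rows, cutoff):
    """Rows dated before `cutoff` form the train set, the rest the test set."""
    train = [row for row in rows if row[0] < cutoff]
    test = [row for row in rows if row[0] >= cutoff]
    return train, test


demand = daily_demand(transactions)
train, test = time_based_split(demand, cutoff=date(2016, 1, 3))
print(len(train), "train rows,", len(test), "test rows")
```

Splitting on a cutoff date rather than shuffling keeps every test row later than every train row, which is the usual reason to prefer it for forecasting.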
Not too interesting
- Retail Sales Dataset 2018-2022 | Kaggle
- Monthly sales data, article and “group” (=category), avg. price etc.
- 2018-2022, no stores/cities, “global retail company” - could be aggregated worldwide?
- Pakistan’s Largest E-Commerce Dataset | Kaggle
- Everything we might want, but E-commerce and Pakistan
- Categories and items not anonymized!
- Looks messy
- Completed/cancelled orders
- Retail Demand Forecasting Dataset | Kaggle
- Unknown/anonymous
- Products belonging to one of 70k (long-tailed distr.) categories
- warehouse demand
- Jan-Nov 2016
- For each week: it’s a state/school holiday, promo, etc.
- Weekly SKU level Product Sales Transactions | Kaggle
- for each article - number of sales in 6 weeks + price + store
- Forecasts for Product Demand | Kaggle
- More about manufacturing demand and bulk things than retail
- UCI Machine Learning Repository: Demand Forecasting for a store Data Set
- Meal delivery company
- Category, sub-category, price, discount, featured on homepage etc.
- Has per-meal metadata (cuisine etc.)
- Hierarchical sales data of an Italian grocery store - Mendeley Data
- 2014-2018, pasta sold in an Italian store, + presence/absence of promotion
- Australia Grocery Product Dataset | Kaggle
- No demand, just non-anonymized products with category/subcategory and price, available in Australia on a per-city-postal-code basis, plus whether each one is in stock at that time
- May be useful as taxonomy or something?..
- Superstore Dataset | Kaggle
- FAKE Tableau sample dataset, but is 10/10 what we look for
- Has everything, not anonymized - product + quantity, category+sub-category, price, discount
- order/shipping date, customer name/segment/city
- 2014-2017
- Continue looking through this: Search | Kaggle
- Corporación Favorita Grocery Sales Forecasting | Kaggle
- Ecuador, the usual
dates, store and item information, whether that item was being promoted, as well as the unit sales. Additional files include supplementary information that may be useful in building your models.
- TODO I get an error when I “accept” the terms of the competition in two browsers, can’t look inside
- Retail Sales Forecasting | Kaggle
- “Brazilian top retailer”, many stores etc.
- Four columns, ‘venda’ and ‘estoque’ (Portuguese for ‘sale’ and ‘stock’) ?..
- I guess all categories and items summed up
- 2014-2016
Sources / useful search terms
- retail | grocery, demand | sales
- SKU (stock keeping unit) is a term of art, basically ArticleID
- “merchandise planning”
- Kaggle category ‘retail and shopping’: Find Open Datasets and Machine Learning Projects | Kaggle
- SKU: Search | Kaggle
- Dataset Search
- Dataset Search