A data-driven study on demand forecasting for Southeast Asian supply chains — real data, measured results, honest tradeoffs.
We collected 18 million demand records from a Thai FMCG distributor, built a unified forecast pipeline across 4,200 SKUs, and measured what actually works — and what doesn't.
42% of SKU-store-week forecasts deviate from actual demand by more than 30%. The root cause isn't bad planners. It's bad data.
In most Southeast Asian supply chains, demand forecasting still lives in spreadsheets. Planners spend Monday mornings copying last week's sales data into Excel, adjusting a few numbers based on gut feel, and emailing the result to procurement. This process fails 42% of the time — meaning the forecast deviates from actual demand by more than 30% at the SKU-store-week level.
The root cause isn't bad planners. It's bad data. Sales data is not demand data. When a product stocks out, sales drop to zero — but demand didn't disappear. When a promotion runs, sales spike — but organic demand didn't actually change. Every forecast built on raw sales data inherits these distortions.
The problem compounds across echelons. A retailer sees a 10% demand dip and cuts orders by 20%. The distributor, seeing a 20% order drop, slashes their own orders by 35%. The manufacturer, now seeing a 35% decline, cancels a production run. A modest demand signal becomes a catastrophic supply response.
This is the bullwhip effect — and it's not theoretical. In our data, a 10% demand variation at the consumer level amplified to a 50% order variation at the manufacturer level. The coefficient of variation nearly quintupled across four echelons.
- **Demand signal processing:** Each echelon forecasts independently using distorted signals from the level below, amplifying errors upstream.
- **Order batching:** Companies batch orders weekly or monthly to reduce transaction costs, creating lumpy demand patterns that mask true consumption.
- **Forward buying:** Promotions inflate orders beyond real demand, followed by artificial demand troughs.
- **Shortage gaming:** When supply is constrained, buyers inflate orders to secure allocation, then cancel when supply normalizes.
- **Lead-time variability:** Unpredictable lead times force planners to pad safety stock, increasing order volatility across the chain.
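The amplification mechanics are easy to reproduce in a toy simulation. This sketch is illustrative only: the 1.7x over-reaction factor, the echelon names, and the noise level are assumptions, not the study's parameters. It shows the coefficient of variation growing roughly fivefold across four echelons, matching the shape of what we measured:

```python
import random
import statistics

def simulate_bullwhip(consumer_demand, n_echelons=4, amplify=1.7):
    """Each echelon over-reacts to deviations in the orders it observes
    from the echelon below (illustrative behavior, not the study's model)."""
    signals = [list(consumer_demand)]
    for _ in range(n_echelons - 1):
        downstream = signals[-1]
        mean = statistics.mean(downstream)
        # Over-ordering on upticks, over-cutting on dips: amplify every
        # deviation from the observed mean by a fixed factor.
        orders = [max(0.0, mean + amplify * (q - mean)) for q in downstream]
        signals.append(orders)
    return signals

def cv(series):
    """Coefficient of variation: stdev relative to the mean."""
    return statistics.stdev(series) / statistics.mean(series)

random.seed(7)
demand = [100 + random.gauss(0, 5) for _ in range(52)]  # ~5% consumer noise
chain = simulate_bullwhip(demand)
for level, s in zip(["store", "retailer", "distributor", "manufacturer"], chain):
    print(f"{level:>12}: CV = {cv(s):.2f}")
```

Because each echelon multiplies the downstream deviation by the same factor, variance compounds geometrically: three hand-offs at 1.7x yields roughly a 5x swing in CV, even though mean demand never moved.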
"Most supply chains don't have a forecasting problem — they have a data problem. Fix the data pipeline first. The models will follow."
We partnered with one of Thailand's largest FMCG distributors to gain access to their complete demand data across 120 retail locations over 24 consecutive months.
The dataset includes point-of-sale transactions, inventory movements, supplier orders, and promotional calendars. It spans 4,200 active SKUs across four major categories.
Raw data is never forecast-ready. We applied four critical cleansing steps to transform sales data into true demand signals:
1. **Stockout detection:** Identified periods where zero sales were caused by stockouts, not zero demand. These periods were flagged and excluded from training data.
2. **Demand imputation:** For censored periods, estimated what demand would have been using Bayesian imputation based on neighboring stores and historical patterns.
3. **Substitution mapping:** Mapped product substitution relationships: when SKU-A stocks out, what percentage of customers switch to SKU-B versus leaving the store?
4. **Promotion tagging:** Tagged each promotion event with type (price cut, BOGO, display, bundle), depth, duration, and cannibalization scope.
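A minimal sketch of the first two steps, assuming weekly rows with `sales` and ending `on_hand` inventory. The imputation here is deliberately simplified to a neighbor-store mean; the pipeline's actual Bayesian imputation also weighs historical patterns.

```python
def flag_censored(rows):
    """Mark weeks where zero sales coincide with zero ending inventory:
    likely a stockout, not zero demand."""
    return [
        {**r, "censored": r["sales"] == 0 and r["on_hand"] == 0}
        for r in rows
    ]

def impute_demand(rows, neighbor_sales):
    """Replace censored observations with the mean of sales at comparable
    neighbor stores for the same week (a simplified stand-in for the
    Bayesian imputation described above)."""
    out = []
    for r, neighbors in zip(rows, neighbor_sales):
        demand = r["sales"]
        if r["censored"] and neighbors:
            demand = sum(neighbors) / len(neighbors)
        out.append({**r, "demand": demand})
    return out

history = [
    {"week": 1, "sales": 40, "on_hand": 12},
    {"week": 2, "sales": 0,  "on_hand": 0},   # stockout week: censored
    {"week": 3, "sales": 0,  "on_hand": 30},  # genuine zero-demand week
]
neighbors = [[38, 41], [36, 44], [0, 1]]      # same-week neighbor sales
cleaned = impute_demand(flag_censored(history), neighbors)
```

Note the week-3 row: inventory was on the shelf, so zero sales really is zero demand and stays at zero. Conflating the two cases is exactly the distortion that makes raw sales data unusable for forecasting.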
"We don't forecast sales. We forecast demand. Sales is what you achieved. Demand is what customers wanted."
Every component built, tested, and measured. From raw POS data to production forecast API — fully automated, deployable with one command.
The pipeline ingests data from three primary sources — point-of-sale systems, ERP inventory data, and supplier order feeds — into a Delta Lake for versioned, ACID-compliant storage. A feature engine computes 47 features per SKU-store-week, including lag features, rolling statistics, promotion indicators, weather data, holiday flags, pricing elasticities, and competitive presence metrics.
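To make the feature engine concrete, here is a sketch of three representative features out of the 47: a one-week lag, a four-week lag, and a trailing rolling mean. The frame and column names are hypothetical, not the production schema.

```python
import pandas as pd

# Hypothetical weekly SKU-store slice; the real feature engine computes
# 47 features per SKU-store-week, this shows three representative ones.
df = pd.DataFrame({
    "sku": ["A"] * 6,
    "store": [1] * 6,
    "week": range(1, 7),
    "demand": [100, 120, 90, 110, 105, 130],
    "on_promo": [0, 0, 1, 0, 0, 1],
})

grp = df.groupby(["sku", "store"])["demand"]
df["demand_lag_1"] = grp.shift(1)   # last week's demand
df["demand_lag_4"] = grp.shift(4)   # same week, four weeks back
df["demand_roll_mean_4"] = grp.transform(
    lambda s: s.shift(1).rolling(4, min_periods=1).mean()
)                                    # trailing 4-week mean, excluding this week

print(df[["week", "demand", "demand_lag_1", "demand_roll_mean_4"]])
```

The `shift(1)` inside the rolling mean matters: at forecast time you don't know the current week's demand, so every feature must be computed from strictly earlier data or it leaks the target.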
| Metric | Value |
|---|---|
| Raw data volume | 4.2 GB |
| Feature store volume | 12.8 GB |
| Ingestion throughput | 45K rec/s |
| Full pipeline runtime | 18 min |
| API P95 latency | 187 ms |
| Dashboard refresh | 5 s |
"The architecture isn't the impressive part. The hard part is getting the data right: unconstrained demand, clean promotion tags, and substitution maps."
Real numbers from a real deployment. Not projections. Not estimates. Measured before-and-after.
| Metric | Before | After | Impact |
|---|---|---|---|
| Forecast Error (MAE) | 42.3% | 14.6% | -27.7 pp |
| Excess Inventory | 31.7M THB | 13.1M THB | 18.6M THB freed |
| Annual Carrying Cost Saved | — | 4.6M THB/year | |
For context, here's how those numbers compare to industry benchmarks:

| Industry | Typical MAE | Good MAE | Best-in-Class MAE |
|---|---|---|---|
| FMCG | 35–45% | 20–30% | 12–18% |
| Manufacturing | 30–40% | 18–25% | 10–15% |
| Pharma | 25–35% | 15–22% | 8–12% |
| E-commerce | 40–55% | 25–35% | 15–22% |
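Percentage error figures like these are commonly computed as a weighted MAPE: total absolute error divided by total actual demand, which (unlike per-period MAPE) doesn't blow up on zero-demand weeks. A sketch with toy numbers, not the study's data:

```python
def wmape(actuals, forecasts):
    """Weighted MAPE: total absolute error over total actual demand.
    Robust to zero-demand periods, unlike per-period MAPE."""
    abs_err = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return abs_err / sum(actuals)

# Toy SKU-store-week slice (illustrative numbers only)
actuals   = [100, 80, 0, 120, 60]
forecasts = [ 90, 95, 10, 110, 75]
print(f"wMAPE = {wmape(actuals, forecasts):.1%}")  # → wMAPE = 16.7%
```

Note the third week: actual demand is zero, which a plain MAPE would divide by. Weighting by total demand sidesteps that and also keeps high-volume SKUs from being drowned out by noisy slow movers.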
Want to see what your data pipeline is missing? We'll run a free 2-hour data audit.
Five steps from assessment to continuous optimization. Platform-agnostic means we recommend honestly and deliver the right solution.
1. **Assess:** Audit your current data sources, forecast process, and accuracy baselines to identify the biggest gaps and quickest wins.
2. **Build:** Ingest POS, inventory, promotions, and external signals into a unified, versioned data lake.
3. **Model:** Train and benchmark multiple forecast models per SKU segment, selecting the best approach for each demand pattern.
4. **Monitor:** Deploy rigorous accuracy measurement at SKU-store-week granularity with automated alerting on forecast degradation.
5. **Optimize:** Continuously retrain models, refine features, and extend the pipeline as new data sources and business needs emerge.
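The monitoring step's degradation alert can be sketched as a trailing-window check against the deployed baseline. The 14.6% baseline comes from the results table above; the window size and tolerance multiplier are hypothetical tuning knobs, not production values.

```python
from collections import deque

def degradation_alert(errors, window=4, baseline=0.146, tolerance=1.25):
    """Flag the weeks where a SKU's trailing-window mean error drifts
    more than `tolerance`x above the deployed baseline (0.146 = the
    14.6% post-deployment figure; window and tolerance are
    hypothetical tuning knobs)."""
    recent = deque(maxlen=window)
    alerts = []
    for week, err in enumerate(errors, start=1):
        recent.append(err)
        trailing = sum(recent) / len(recent)
        if len(recent) == window and trailing > baseline * tolerance:
            alerts.append((week, round(trailing, 3)))
    return alerts

# Weekly wMAPE drifting upward: the last weeks trip the alert
errors = [0.12, 0.14, 0.13, 0.15, 0.22, 0.25, 0.28]
print(degradation_alert(errors))
```

Averaging over a window instead of alerting on single bad weeks keeps one promotion-distorted observation from paging a planner, while a sustained drift still surfaces within a few weeks.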