A data-driven study on demand forecasting for Southeast Asian supply chains — real data, measured results, honest tradeoffs.
We collected 18 million demand records from a Thai FMCG distributor, built a unified forecast pipeline across 4,200 SKUs, and measured what actually works — and what doesn't.
42% of SKU-store-week forecasts deviate from actual demand by more than 30%. The root cause isn't bad planners. It's bad data.
In most Southeast Asian supply chains, demand forecasting still lives in spreadsheets. Planners spend Monday mornings copying last week's sales data into Excel, adjusting a few numbers based on gut feel, and emailing the result to procurement. This process fails 42% of the time — meaning the forecast deviates from actual demand by more than 30% at the SKU-store-week level.
The root cause isn't bad planners. It's bad data. Sales data is not demand data. When a product stocks out, sales drop to zero — but demand didn't disappear. When a promotion runs, sales spike — but organic demand didn't actually change. Every forecast built on raw sales data inherits these distortions.
The problem compounds across echelons. A retailer sees a 10% demand dip and cuts orders by 20%. The distributor, seeing a 20% order drop, slashes their own orders by 35%. The manufacturer, now seeing a 35% decline, cancels a production run. A modest demand signal becomes a catastrophic supply response.
This is the bullwhip effect — and it's not theoretical. In our data, a 10% demand variation at the consumer level amplified to a 50% order variation at the manufacturer level. The coefficient of variation nearly quintupled across four echelons.
- **Demand signal processing:** Each echelon forecasts independently using distorted signals from the level below, amplifying errors upstream.
- **Order batching:** Companies batch orders weekly or monthly to reduce transaction costs, creating lumpy demand patterns that mask true consumption.
- **Forward buying:** Promotions inflate orders beyond real demand, followed by artificial demand troughs.
- **Shortage gaming:** When supply is constrained, buyers inflate orders to secure allocation, then cancel when supply normalizes.
- **Lead-time variability:** Unpredictable lead times force planners to pad safety stock, increasing order volatility across the chain.
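The amplification mechanics are easy to reproduce in a toy simulation. This sketch is illustrative only: the 1.7x over-reaction factor, the echelon names, and the noise level are assumptions, not the study's parameters. It shows the coefficient of variation growing roughly fivefold across four echelons, matching the shape of what we measured:

```python
import random
import statistics

def simulate_bullwhip(consumer_demand, n_echelons=4, amplify=1.7):
    """Each echelon over-reacts to deviations in the orders it observes
    from the echelon below (illustrative behavior, not the study's model)."""
    signals = [list(consumer_demand)]
    for _ in range(n_echelons - 1):
        downstream = signals[-1]
        mean = statistics.mean(downstream)
        # Over-ordering on upticks, over-cutting on dips: amplify every
        # deviation from the observed mean by a fixed factor.
        orders = [max(0.0, mean + amplify * (q - mean)) for q in downstream]
        signals.append(orders)
    return signals

def cv(series):
    """Coefficient of variation: stdev relative to the mean."""
    return statistics.stdev(series) / statistics.mean(series)

random.seed(7)
demand = [100 + random.gauss(0, 5) for _ in range(52)]  # ~5% consumer noise
chain = simulate_bullwhip(demand)
for level, s in zip(["store", "retailer", "distributor", "manufacturer"], chain):
    print(f"{level:>12}: CV = {cv(s):.2f}")
```

Because each echelon multiplies the downstream deviation by the same factor, variance compounds geometrically: three hand-offs at 1.7x yields roughly a 5x swing in CV, even though mean demand never moved.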
"Most supply chains don't have a forecasting problem — they have a data problem. Fix the data pipeline first. The models will follow."
We partnered with one of Thailand's largest FMCG distributors to gain access to their complete demand data across 120 retail locations over 24 consecutive months.
The dataset includes point-of-sale transactions, inventory movements, supplier orders, and promotional calendars. It spans 4,200 active SKUs across four major categories.
Raw data is never forecast-ready. We applied four critical cleansing steps to transform sales data into true demand signals:
1. **Stockout detection:** Identified periods where zero sales were caused by stockouts, not zero demand. These periods were flagged and excluded from training data.
2. **Demand imputation:** For censored periods, estimated what demand would have been using Bayesian imputation based on neighboring stores and historical patterns.
3. **Substitution mapping:** Mapped product substitution relationships: when SKU-A stocks out, what percentage of customers switch to SKU-B versus leaving the store?
4. **Promotion tagging:** Tagged each promotion event with type (price cut, BOGO, display, bundle), depth, duration, and cannibalization scope.
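A minimal sketch of the first two steps, assuming weekly rows with `sales` and ending `on_hand` inventory. The imputation here is deliberately simplified to a neighbor-store mean; the pipeline's actual Bayesian imputation also weighs historical patterns.

```python
def flag_censored(rows):
    """Mark weeks where zero sales coincide with zero ending inventory:
    likely a stockout, not zero demand."""
    return [
        {**r, "censored": r["sales"] == 0 and r["on_hand"] == 0}
        for r in rows
    ]

def impute_demand(rows, neighbor_sales):
    """Replace censored observations with the mean of sales at comparable
    neighbor stores for the same week (a simplified stand-in for the
    Bayesian imputation described above)."""
    out = []
    for r, neighbors in zip(rows, neighbor_sales):
        demand = r["sales"]
        if r["censored"] and neighbors:
            demand = sum(neighbors) / len(neighbors)
        out.append({**r, "demand": demand})
    return out

history = [
    {"week": 1, "sales": 40, "on_hand": 12},
    {"week": 2, "sales": 0,  "on_hand": 0},   # stockout week: censored
    {"week": 3, "sales": 0,  "on_hand": 30},  # genuine zero-demand week
]
neighbors = [[38, 41], [36, 44], [0, 1]]      # same-week neighbor sales
cleaned = impute_demand(flag_censored(history), neighbors)
```

Note the week-3 row: inventory was on the shelf, so zero sales really is zero demand and stays at zero. Conflating the two cases is exactly the distortion that makes raw sales data unusable for forecasting.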
"We don't forecast sales. We forecast demand. Sales is what you achieved. Demand is what customers wanted."
Every component built, tested, and measured. From raw POS data to production forecast API — fully automated, deployable with one command.
The pipeline ingests data from three primary sources — point-of-sale systems, ERP inventory data, and supplier order feeds — into a Delta Lake for versioned, ACID-compliant storage. A feature engine computes 47 features per SKU-store-week, including lag features, rolling statistics, promotion indicators, weather data, holiday flags, pricing elasticities, and competitive presence metrics.
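To make the feature engine concrete, here is a sketch of three representative features out of the 47: a one-week lag, a four-week lag, and a trailing rolling mean. The frame and column names are hypothetical, not the production schema.

```python
import pandas as pd

# Hypothetical weekly SKU-store slice; the real feature engine computes
# 47 features per SKU-store-week, this shows three representative ones.
df = pd.DataFrame({
    "sku": ["A"] * 6,
    "store": [1] * 6,
    "week": range(1, 7),
    "demand": [100, 120, 90, 110, 105, 130],
    "on_promo": [0, 0, 1, 0, 0, 1],
})

grp = df.groupby(["sku", "store"])["demand"]
df["demand_lag_1"] = grp.shift(1)   # last week's demand
df["demand_lag_4"] = grp.shift(4)   # same week, four weeks back
df["demand_roll_mean_4"] = grp.transform(
    lambda s: s.shift(1).rolling(4, min_periods=1).mean()
)                                    # trailing 4-week mean, excluding this week

print(df[["week", "demand", "demand_lag_1", "demand_roll_mean_4"]])
```

The `shift(1)` inside the rolling mean matters: at forecast time you don't know the current week's demand, so every feature must be computed from strictly earlier data or it leaks the target.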
| Metric | Value |
|---|---|
| Raw data volume | 4.2 GB |
| Feature store volume | 12.8 GB |
| Ingestion throughput | 45K rec/s |
| Full pipeline runtime | 18 min |
| API P95 latency | 187 ms |
| Dashboard refresh | 5 s |
"The architecture isn't the impressive part. The hard part is getting the data right: unconstrained demand, clean promotion tags, and substitution maps."
Real numbers from a real deployment. Not projections. Not estimates. Measured before-and-after.
| Metric | Before | After | Impact |
|---|---|---|---|
| Forecast Error (MAE) | 42.3% | 14.6% | -27.7 pp |
| Excess Inventory | 31.7M THB | 13.1M THB | 18.6M THB freed |
| Annual Carrying Cost Saved | — | 4.6M THB/year | |
For context, here's how those numbers compare to industry benchmarks:

| Industry | Typical MAE | Good MAE | Best-in-Class MAE |
|---|---|---|---|
| FMCG | 35–45% | 20–30% | 12–18% |
| Manufacturing | 30–40% | 18–25% | 10–15% |
| Pharma | 25–35% | 15–22% | 8–12% |
| E-commerce | 40–55% | 25–35% | 15–22% |
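Percentage error figures like these are commonly computed as a weighted MAPE: total absolute error divided by total actual demand, which (unlike per-period MAPE) doesn't blow up on zero-demand weeks. A sketch with toy numbers, not the study's data:

```python
def wmape(actuals, forecasts):
    """Weighted MAPE: total absolute error over total actual demand.
    Robust to zero-demand periods, unlike per-period MAPE."""
    abs_err = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return abs_err / sum(actuals)

# Toy SKU-store-week slice (illustrative numbers only)
actuals   = [100, 80, 0, 120, 60]
forecasts = [ 90, 95, 10, 110, 75]
print(f"wMAPE = {wmape(actuals, forecasts):.1%}")  # → wMAPE = 16.7%
```

Note the third week: actual demand is zero, which a plain MAPE would divide by. Weighting by total demand sidesteps that and also keeps high-volume SKUs from being drowned out by noisy slow movers.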
Want to see what your data pipeline is missing? We'll run a free 2-hour data audit.
Five steps from assessment to continuous optimization. Platform-agnostic means we recommend honestly and deliver the right solution.
1. **Assess:** Audit your current data sources, forecast process, and accuracy baselines to identify the biggest gaps and quickest wins.
2. **Build:** Ingest POS, inventory, promotions, and external signals into a unified, versioned data lake.
3. **Model:** Train and benchmark multiple forecast models per SKU segment, selecting the best approach for each demand pattern.
4. **Monitor:** Deploy rigorous accuracy measurement at SKU-store-week granularity with automated alerting on forecast degradation.
5. **Optimize:** Continuously retrain models, refine features, and extend the pipeline as new data sources and business needs emerge.
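The monitoring step's degradation alert can be sketched as a trailing-window check against the deployed baseline. The 14.6% baseline comes from the results table above; the window size and tolerance multiplier are hypothetical tuning knobs, not production values.

```python
from collections import deque

def degradation_alert(errors, window=4, baseline=0.146, tolerance=1.25):
    """Flag the weeks where a SKU's trailing-window mean error drifts
    more than `tolerance`x above the deployed baseline (0.146 = the
    14.6% post-deployment figure; window and tolerance are
    hypothetical tuning knobs)."""
    recent = deque(maxlen=window)
    alerts = []
    for week, err in enumerate(errors, start=1):
        recent.append(err)
        trailing = sum(recent) / len(recent)
        if len(recent) == window and trailing > baseline * tolerance:
            alerts.append((week, round(trailing, 3)))
    return alerts

# Weekly wMAPE drifting upward: the last weeks trip the alert
errors = [0.12, 0.14, 0.13, 0.15, 0.22, 0.25, 0.28]
print(degradation_alert(errors))
```

Averaging over a window instead of alerting on single bad weeks keeps one promotion-distorted observation from paging a planner, while a sustained drift still surfaces within a few weeks.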