Deeper book levels add noise, not signal. On SET, the top of the book is the entire game.
750 out-of-sample regressions across 25 stocks, 5 depth configurations, and 6 horizons from 5 seconds to 15 minutes. L1-only beats every deeper-book configuration at every horizon. At long horizons, deeper models go negative — predicting worse than the average return.
Imbalance: bid_qty − ask_qty at a given level. Positive = more buyers stacked; negative = more sellers.
A trader looking at a quote terminal sees ten levels of book on each side: twenty rows of bids and offers. The unspoken assumption is that all twenty rows carry information. That is why exchanges sell deeper book feeds, why infrastructure teams subscribe to L10, and why analytics tools display the full ladder.
But how much of that information is actually predictive? If we built a model of the next mid-price move using only L1, vs L1 + L2, vs the full L1–L10, how much extra predictive power would the deeper levels contribute?
This chapter answers that question on SET data — and the answer is surprising.
The relationship between order book depth and short-horizon price prediction has been studied for two decades. Three findings are well-established in the literature.
These findings are consistent across US equities, European equities, and major futures contracts. What hasn't been measured is whether the same hierarchy holds on SET, with its binding tick grid (Chapter 4) and L1-concentrated depth (Chapter 5). The empirical contribution of this chapter is to settle that question for SET data — and the result is sharper than expected.
For each (stock × horizon × depth-config) cell:
Features: per-level imbalance (bid_qty[k] − ask_qty[k]), included for k = 1…depth.
5 depth configs (L1, L1–L2, L1–L3, L1–L5, L1–L10) × 6 horizons (5s, 10s, 30s, 60s, 5min, 15min) × 25 stocks = 750 regressions. Each cell is evaluated on its own out-of-sample data.
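The experiment grid and the per-level imbalance features can be sketched as follows. This is a minimal illustration, not the study's code: the ticker names, the snapshot quantities, and the function name `imbalance_features` are hypothetical, and only the depth configurations and horizons come from the text.

```python
from itertools import product

# Depth configs and horizons as listed in the text.
DEPTH_CONFIGS = {"L1": 1, "L1-L2": 2, "L1-L3": 3, "L1-L5": 5, "L1-L10": 10}
HORIZONS_SEC = [5, 10, 30, 60, 300, 900]
# Placeholder tickers (hypothetical); the study uses 25 SET stocks.
STOCKS = [f"STOCK{i:02d}" for i in range(25)]


def imbalance_features(bid_qty, ask_qty, depth):
    """Return bid_qty[k] - ask_qty[k] for book levels 1..depth.

    bid_qty and ask_qty are sequences ordered from level 1 outward,
    so Python index 0 corresponds to L1.
    """
    return [bid_qty[k] - ask_qty[k] for k in range(depth)]


# One regression per (stock, horizon, depth-config) cell.
cells = list(product(STOCKS, HORIZONS_SEC, DEPTH_CONFIGS))
print(len(cells))  # 25 * 6 * 5 = 750

# Made-up ten-level snapshot, used only to show the feature shape.
snapshot_bids = [500, 1200, 900, 700, 400, 300, 300, 200, 100, 100]
snapshot_asks = [800, 1000, 900, 900, 100, 300, 200, 200, 300, 100]
print(imbalance_features(snapshot_bids, snapshot_asks, DEPTH_CONFIGS["L1-L3"]))
# [-300, 200, 0]
```

Each deeper configuration simply appends more columns to the same feature vector, which is what makes the configurations directly comparable.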
Sessions filtered to continuous trading only (10:00–12:30 and 14:00–16:30 Bangkok). Predictions whose horizon spans the lunch break or end of day are dropped.
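A hedged sketch of the session filter described above. It assumes timestamps are naive `datetime` values already expressed in Bangkok local time and that session boundaries are inclusive; both are assumptions for illustration, as is the function name `in_same_session`.

```python
from datetime import datetime, time, timedelta

# Continuous-trading windows from the text (Bangkok local time).
SESSIONS = [(time(10, 0), time(12, 30)), (time(14, 0), time(16, 30))]


def in_same_session(ts, horizon_sec):
    """True if ts and ts + horizon both fall inside one continuous session.

    Samples whose horizon crosses the lunch break or the close return False
    and would be dropped.
    """
    end = ts + timedelta(seconds=horizon_sec)
    if end.date() != ts.date():
        return False
    return any(
        start <= ts.time() and end.time() <= stop for start, stop in SESSIONS
    )


# Arbitrary example date; only the time of day matters here.
t = datetime(2024, 1, 8, 12, 29, 0)
print(in_same_session(t, 30))   # True: ends 12:29:30, before the break
print(in_same_session(t, 300))  # False: would end at 12:34, inside the break
```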
Mean OOS R² across the 25 stocks, in percent. The L1-only row is best in every column.
| Depth | 5s | 10s | 30s | 60s | 5min | 15min |
|---|---|---|---|---|---|---|
| L1 only | 0.78 | 1.11 | 1.76 | 2.18 | 2.53 | 1.49 |
| L1–L2 | 0.74 | 1.05 | 1.65 | 1.97 | 2.11 | 0.99 |
| L1–L3 | 0.71 | 0.99 | 1.52 | 1.77 | 1.49 | −0.70 |
| L1–L5 | 0.69 | 0.95 | 1.43 | 1.62 | 0.80 | −2.25 |
| L1–L10 | 0.57 | 0.71 | 0.98 | 0.87 | −1.39 | −8.02 |
1. L1 alone is the best predictor at every horizon. Adding levels never helps on average.
2. At long horizons, deeper-book models go negative OOS: L1–L10 at 5min, and L1–L3, L1–L5, and L1–L10 at 15min. A negative OOS R² means the model fits the test set worse than simply predicting the average return. That is a strong overfitting signal: deeper levels look meaningful in training, then collapse out of sample.
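To make the meaning of a negative OOS R² concrete, here is a minimal sketch. It assumes the benchmark is the mean of the test-set returns (the standard R² definition); the chapter does not state which average it uses, and the numbers below are made up.

```python
def oos_r2(y_true, y_pred):
    """Out-of-sample R^2: 1 - SS_res / SS_tot.

    SS_tot uses the mean of y_true as the benchmark (an assumption).
    A negative value means the predictions have a larger squared error
    than that constant-mean benchmark.
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative returns only, not study data.
returns = [1.0, -1.0, 2.0, -2.0]

print(oos_r2(returns, [0.5, -0.5, 1.0, -1.0]))  # 0.75: right direction, damped
print(oos_r2(returns, [0.0, 0.0, 0.0, 0.0]))    # 0.0: same as predicting the mean
print(oos_r2(returns, [-1.0, 1.0, -0.5, 0.5]) < 0)  # True: worse than the mean
```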
Each line traces what happens as we increase the depth of book used in the regression. All slopes are negative.
| Going from L1 to L1–L10 | Median R² change |
|---|---|
| at 5s | −0.17 pp |
| at 10s | −0.34 pp |
| at 30s | −0.66 pp |
| at 60s | −1.25 pp |
| at 5min | −2.76 pp |
| at 15min | −6.37 pp |
The deeper the book you use, the worse you predict. The longer the horizon, the more pronounced the damage.
A reasonable interpretation, consistent with this chapter's findings and with earlier chapters in the series:
Deeper levels carry more noise than information. L2 through L10 quantities reflect resting orders far from the action: they update infrequently, get cancelled often, and rarely participate in price discovery. When fed into a regression, they add degrees of freedom but not signal, which causes overfitting on the training set and worse performance on the test set.
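The mechanism can be illustrated with a toy simulation. The data below is synthetic Gaussian noise, not SET data, and the coefficient, sample sizes, and helper names are all assumptions chosen for illustration. One column carries a weak signal and the remaining columns are pure noise; the in-sample fit can only improve as columns are added, while the out-of-sample fit has no such guarantee and typically deteriorates.

```python
import random


def ols_fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(beta, X):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]


def r2(y, pred):
    m = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    return 1.0 - ss_res / sum((a - m) ** 2 for a in y)


def make_data(n, n_noise, rng):
    """Column 0 is a weak signal; the other columns are pure noise."""
    X, y = [], []
    for _ in range(n):
        signal = rng.gauss(0, 1)
        X.append([signal] + [rng.gauss(0, 1) for _ in range(n_noise)])
        y.append(0.15 * signal + rng.gauss(0, 1))
    return X, y


def take(X, k):
    return [row[:k] for row in X]


rng = random.Random(0)
X_train, y_train = make_data(60, 9, rng)
X_test, y_test = make_data(2000, 9, rng)

results = {}
for k in (1, 2, 3, 5, 10):  # number of columns used, mirroring L1 .. L1-L10
    beta = ols_fit(take(X_train, k), y_train)
    results[k] = (r2(y_train, predict(beta, take(X_train, k))),
                  r2(y_test, predict(beta, take(X_test, k))))
    print(k, round(results[k][0], 4), round(results[k][1], 4))
```

The first printed R² column (in-sample) is non-decreasing by construction of nested least squares; the second (out-of-sample) is the one that matters for prediction.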
This is consistent with the SET microstructure findings from earlier chapters:
If almost all the trading happens at L1, almost all the signal lives at L1. Deeper levels become decoration — useful to display, useless to predict from.
L1 imbalance only, 30-second horizon, 95% bootstrap confidence interval per stock.
18 of 25 stocks show statistically significant predictability (95% bootstrap CI excludes zero).
Best: SCM (4.8% R²), CPALL (3.3%), KCE (2.9%), VGI (2.8%), HANA (2.6%), GULF (2.6%).
Essentially unpredictable: PTT (−0.2%, CI crosses zero), EA (0.0%), TOP (0.3%, not significant).
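One way such a per-stock interval could be computed is a percentile bootstrap over test-set (return, prediction) pairs. The chapter does not describe its bootstrap scheme, so the i.i.d. pair resampling below is an assumption; with serially dependent returns a block bootstrap may be more appropriate. The demo data is synthetic.

```python
import random


def oos_r2(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def bootstrap_r2_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for OOS R^2, resampling pairs with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(oos_r2([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


# Synthetic demo: one informative predictor and one uninformative predictor.
demo = random.Random(1)
x = [demo.gauss(0, 1) for _ in range(300)]
y = [xi + demo.gauss(0, 1) for xi in x]
noise_pred = [demo.gauss(0, 1) for _ in range(300)]

lo_sig, hi_sig = bootstrap_r2_ci(y, x)
lo_noise, hi_noise = bootstrap_r2_ci(y, noise_pred)

# "Significant" in the sense used here: the CI lower bound is above zero.
print(lo_sig > 0)    # informative predictor
print(lo_noise > 0)  # uninformative predictor
```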
Percentage of stocks where the 95% CI lower bound is above zero.
| Horizon | Significant |
|---|---|
| 5s | 76% (19/25) |
| 10s | 80% (20/25) |
| 30s | 72% (18/25) |
| 60s | 72% (18/25) |
| 5min | 52% (13/25) |
| 15min | 32% (8/25) |
Short-horizon prediction is broadly significant. At 5min only about half of the stocks remain significant (52%), and at 15min only a minority do (32%).
Heatmap of 25 stocks × 5 depth configurations at 30-second horizon. The best column for nearly every row is L1 or L1–L2.
Stocks where L1 alone works (top of chart) usually keep working at L1–L3, then degrade at L1–L5 and L1–L10. Stocks that don't work at L1 (PTT, EA, TOP) are not rescued by adding depth; they get worse. The best column is never L1–L10.
There is no stock where adding deeper book levels improved predictability meaningfully out-of-sample.
If L1 is where the signal lives, the natural follow-up is: how much more signal can we extract by using L1 behavior, not just L1 state?
We tested an enriched feature set built entirely from L1 — no deeper book required:
bid_qty − ask_qty (the baseline).
| Horizon | Baseline (L1 imbalance) | Enriched (L1 dynamics) | Lift |
|---|---|---|---|
| 5s | 0.78% | 1.04% | +0.26 pp |
| 10s | 1.11% | 2.15% | +1.04 pp |
| 30s | 1.76% | 3.00% | +1.24 pp |
| 60s | 2.18% | 3.52% | +1.34 pp |
| 5min | 2.53% | 3.00% | +0.47 pp |
| 15min | 1.49% | −0.23% | −1.72 pp (overfits) |
At the sweet-spot horizons (10s–60s), enriching L1 features raises mean OOS R² by roughly 60% to 95% in relative terms (for example, from 1.11% to 2.15% at 10s). The lift is broad-based: 84–96% of stocks see improvement at these horizons.
The contrast with Section 5 is the key finding of this chapter:
Adding deeper book levels (L2–L10): negative effect on OOS R² at every horizon.
Enriching L1 features: positive effect on OOS R² at every horizon except 15min.
The path to better short-horizon prediction on SET is not deeper book data. It is richer features at L1. Microprice, spread state, imbalance change, and quote-update rate carry information that imbalance alone misses — and they all live at the top of the book.
A trader who can compute these features in real time on L1 data extracts more signal than one paying for L10 and using it naively. This also reinforces why message-level LOB reconstruction matters: most of these features (velocity, message rate, microprice) cannot be computed from a snapshot feed. They require the full message stream.
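A rough sketch of how features of this kind could be derived from a stream of L1 quote updates. The chapter does not give its exact definitions, so everything here is an assumption for illustration: the `L1Quote` record, the size-weighted microprice formula, the spread measured in ticks, the 10-second trailing window, and the example quotes.

```python
from dataclasses import dataclass


@dataclass
class L1Quote:
    ts: float      # seconds since session start (assumed time base)
    bid_px: float
    bid_qty: int
    ask_px: float
    ask_qty: int


def l1_features(quotes, tick, window_sec=10.0):
    """Build per-update features from top-of-book data only.

    Assumes quotes are time-ordered and quantities are positive.
    The update count is O(n^2) here for brevity; a real-time version
    would keep a sliding window instead.
    """
    rows = []
    for i, q in enumerate(quotes):
        imbalance = q.bid_qty - q.ask_qty
        prev = quotes[i - 1] if i else q
        mid = (q.bid_px + q.ask_px) / 2
        # Size-weighted microprice: leans toward the ask when bids are heavier.
        micro = (q.bid_px * q.ask_qty + q.ask_px * q.bid_qty) / (q.bid_qty + q.ask_qty)
        rows.append({
            "imbalance": imbalance,
            "imbalance_change": imbalance - (prev.bid_qty - prev.ask_qty),
            "microprice": micro,
            "micro_minus_mid": micro - mid,
            "spread_ticks": round((q.ask_px - q.bid_px) / tick),
            "updates_in_window": sum(
                1 for p in quotes[: i + 1] if q.ts - p.ts <= window_sec
            ),
        })
    return rows


# Made-up quote updates for a stock with a 0.05 tick.
quotes = [
    L1Quote(0.0, 10.00, 300, 10.10, 100),
    L1Quote(4.0, 10.00, 100, 10.10, 300),
    L1Quote(12.0, 10.05, 200, 10.10, 200),
]
feats = l1_features(quotes, tick=0.05)
for row in feats:
    print(row)
```

Note that imbalance change and update rate depend on the sequence of updates, not on any single book state, which is why a message-level stream is needed to compute them.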
This study uses licensed market data obtained through commercial agreement. Infozense is not affiliated with the Stock Exchange of Thailand. No market data is distributed through this website. This content is for educational and analytical purposes only and does not constitute investment advice.