588 → 2,125 → 5,240 → 195 → 214 alerts. How weak signals become strong verdicts.
No single signal is reliable. Price spikes have innocent explanations. Volume surges happen during news events. But when four independent signals align, the probability of coincidence collapses.
SET ITCH encodes the aggressor side explicitly in every trade message. The exchange sets the aggressor field to ‘B’ (buy-initiated) or ‘A’ (ask-initiated) when the trade matches; the side is reported, not inferred. On most US and European equity venues, by contrast, the aggressor is inferred with the Lee-Ready algorithm, which compares the trade price to the prevailing quote and misclassifies some trades near the spread midpoint. The 97.7% buy-aggressor figure shown in the Top Alert is exchange-reported, not estimated, so it carries no classification-model error.
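To make the contrast concrete, here is a minimal sketch of inferred classification (a quote rule with a tick-test fallback, in the spirit of Lee-Ready) next to simply counting exchange-reported flags. The function names, argument layout, and sample prices are assumptions for illustration; they are not the SET ITCH message layout or the study's code.

```python
from typing import Iterable, Optional


def infer_aggressor(price: float, bid: float, ask: float,
                    prev_price: Optional[float]) -> Optional[str]:
    """Infer the aggressor side when the venue does not report it.

    Returns 'B' (buy-initiated), 'A' (ask-initiated), or None if undetermined.
    """
    mid = (bid + ask) / 2
    if price > mid:
        return "B"
    if price < mid:
        return "A"
    # Trade exactly at the midpoint: fall back to the tick test,
    # which is where inferred classification becomes error-prone.
    if prev_price is not None and price > prev_price:
        return "B"
    if prev_price is not None and price < prev_price:
        return "A"
    return None


def buy_aggressor_ratio(flags: Iterable[str]) -> float:
    """Share of trades whose exchange-reported flag is 'B'."""
    flags = list(flags)
    return sum(f == "B" for f in flags) / len(flags) if flags else 0.0


# Hypothetical quotes and trades, for illustration only.
print(infer_aggressor(10.75, bid=10.0, ask=11.0, prev_price=None))
print(infer_aggressor(10.5, bid=10.0, ask=11.0, prev_price=None))
print(buy_aggressor_ratio(["B", "B", "B", "A"]))
```

With a reported flag, the ratio is a plain count; the inference step, and its undetermined cases, disappears.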
A second structural advantage is the tick grid. SET’s binding tick floor (Chapter 4) means the spread is pinned at one tick approximately 98% of the time, so every trade is unambiguously bid- or ask-initiated and there is no spread-crossing or midpoint-trade ambiguity. Directional pressure expresses through aggressor flow rather than spread compression. The four-signal architecture in this chapter works particularly well because of this: on markets with elastic spreads, the aggressor signal would be noisier and weaker.
Together: the third signal in the funnel (‘Aggressor > 70%’) is measured directly from the protocol and lives in a market structure where that measurement is more discriminative than it would be elsewhere.
An ‘event’ is one instrument × one window where the threshold was met. Each signal alone produces hundreds or thousands of hits. But when multiple independent signals fire on the same event, the false positive rate falls exponentially with the number of signals.
If each signal has an independent false positive rate of 10%, one signal alone produces too many alerts. Two signals together: 1% false positive. Three signals: 0.1%. Four signals: 0.01%. This is the Bayesian argument popularized by statistician Nate Silver in The Signal and the Noise — each additional signal multiplies the evidence.
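The arithmetic above can be sketched in a few lines of Python. The function name is hypothetical, and the calculation assumes the signals are fully independent, which is the idealized case described in the text.

```python
import math
from typing import Sequence


def joint_false_positive_rate(rates: Sequence[float]) -> float:
    """Probability that every signal fires on an innocent event,
    assuming the signals fire independently of one another."""
    return math.prod(rates)


# Illustrative 10% per-signal rate, as in the text.
for k in range(1, 5):
    rate = joint_false_positive_rate([0.10] * k)
    print(f"{k} signal(s): {rate:.2%}")
```

If the signals are correlated (price and volume often are), the joint rate will be higher than this product.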
The composite score assigns +1 for each signal that fires. Score-2 means two signals aligned. Score-3 means three. The 20 score-3 alerts are the highest priority for investigation.
Session return exceeds ±5% — 588 instrument-days flagged.
Daily volume exceeds 3× the 20-day rolling average — 2,125 flagged.
Buy-side aggressor ratio exceeds 70% — 5,240 flagged.
Average trade size declines by >50% during the run-up — 195 flagged.
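The four signals and the +1-per-signal composite score can be sketched as follows. The thresholds come from the text; the `InstrumentDay` record, its field names, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class InstrumentDay:
    session_return: float       # e.g. 0.08 for +8%
    volume: float               # today's volume
    avg_volume_20d: float       # 20-day rolling average volume
    buy_aggressor_ratio: float  # share of buy-initiated trades, 0..1
    trade_size_change: float    # e.g. -0.6 for a 60% decline in average size


def signal_flags(d: InstrumentDay) -> Dict[str, bool]:
    """Evaluate each signal as a binary flag using the thresholds in the text."""
    return {
        "price": abs(d.session_return) > 0.05,
        "volume": d.volume > 3 * d.avg_volume_20d,
        "aggressor": d.buy_aggressor_ratio > 0.70,
        "trade_size": d.trade_size_change < -0.50,
    }


def composite_score(d: InstrumentDay) -> int:
    """+1 for each signal that fires."""
    return sum(signal_flags(d).values())


# Hypothetical instrument-day: three of the four signals fire.
example = InstrumentDay(0.08, 4_000_000, 1_000_000, 0.977, -0.20)
print(signal_flags(example), composite_score(example))
```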
This is the highest-scoring alert in the dataset. The price, volume, aggressor ratio, and trade size tell a coherent story.
The four-panel view reveals the manipulation anatomy: (1) price rises sharply, (2) volume surges, (3) nearly all trades are buy-initiated — suggesting a single actor or coordinated group, (4) average trade size shrinks as retail participants pile in. The dump phase shows volume staying high while the aggressor ratio inverts and price collapses.
The distribution is steep: most multi-signal alerts fire exactly two signals. Twenty alerts fire three signals — these are the highest priority for investigation. No alert fired all four signals simultaneously.
For a regulator with limited investigation capacity, the score provides a prioritization mechanism. Investigate the 20 score-3 alerts first — they represent the intersection of abnormal return, volume surge, and directional dominance. This reduces the investigation queue from thousands of daily anomalies to a manageable 20.
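A prioritization step of this kind might look like the sketch below. The alert tuples and instrument names are made up for illustration, and `investigation_queue` is a hypothetical helper, not part of the scoring engine described here.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Alert = Tuple[str, int]  # (instrument, composite score)


def score_distribution(alerts: Iterable[Alert]) -> Counter:
    """Count how many alerts fall at each composite score."""
    return Counter(score for _, score in alerts)


def investigation_queue(alerts: Iterable[Alert], min_score: int = 3) -> List[Alert]:
    """Keep alerts at or above min_score, highest score first."""
    hits = [a for a in alerts if a[1] >= min_score]
    return sorted(hits, key=lambda a: a[1], reverse=True)


# Hypothetical alerts.
alerts = [("AAA", 2), ("BBB", 3), ("CCC", 2), ("DDD", 3), ("EEE", 2)]
print(score_distribution(alerts))
print(investigation_queue(alerts))
```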
The scoring engine runs in 65 ms across all 4,500 instruments using Database A. That is fast enough for intraday surveillance: alerts can be generated every 5 minutes throughout the trading session, not just at end of day. The limiting factor is not compute but analyst bandwidth.
Nate Silver's insight in The Signal and the Noise: most predictions fail not because the model is wrong but because the base rate is ignored. Multi-signal scoring is Bayesian updating in action.
With one signal (price > 5%), you flag 588 instrument-days. Assume 90% are innocent: that is 529 false positives. Unworkable.
Add a second signal (volume surge). If it fires on innocent instrument-days independently of the price signal, at a 10% rate, only about one in ten of those false positives survives: roughly 53 instead of 529.
Add a third signal (aggressor dominance) under the same assumption and about 5 remain. Add a fourth and the expected count falls below 1. At score-3, you are looking at events where three independent anomalies coincide, and the probability that this happens by chance is very small.
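The Bayesian updating behind this argument can be sketched in odds form: start from a base rate and multiply the odds by a likelihood ratio for each signal that fires. The base rate and the per-signal hit rates below are hypothetical placeholders, not estimates from the dataset, and the signals are assumed independent.

```python
from typing import Sequence, Tuple


def posterior_probability(prior: float,
                          fired: Sequence[Tuple[float, float]]) -> float:
    """Posterior probability of manipulation after the given signals fire.

    prior: base rate of manipulation before seeing any signal.
    fired: (true_positive_rate, false_positive_rate) for each fired signal.
    """
    odds = prior / (1 - prior)
    for tpr, fpr in fired:
        odds *= tpr / fpr  # likelihood ratio of one independent signal
    return odds / (1 + odds)


PRIOR = 0.01         # hypothetical base rate
SIGNAL = (0.9, 0.1)  # hypothetical hit rate and false positive rate

for k in range(5):
    p = posterior_probability(PRIOR, [SIGNAL] * k)
    print(f"{k} signal(s) fired: posterior = {p:.3f}")
```

The point about base rates is visible in the first step: with a low prior, a single signal still leaves the posterior small, and it takes several aligned signals to move it decisively.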
This funnel architecture is not specific to markets. The same pattern applies to any multi-sensor anomaly detection: IoT sensor arrays (temperature + vibration + power draw), fraud detection (amount + velocity + geolocation + device fingerprint), cybersecurity (packet rate + port scan + payload signature). One signal lies. Four signals convict.
The funnel is composable: adding a fifth signal (e.g., a PIN-style informed-flow estimator, or spread compression) costs near-zero compute but further reduces false positives. The architecture is designed for extensibility — each signal is a module that produces a binary flag per instrument-window.
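One way to picture this extensibility is a registry of signal functions, each mapping an instrument-window to a binary flag. This is a sketch under assumptions: the metric names, the `composite` helper, and the spread-compression threshold are all hypothetical.

```python
from typing import Callable, Dict, Mapping

Window = Mapping[str, float]       # metrics for one instrument x one window
Signal = Callable[[Window], bool]  # each module returns a binary flag

BASE_SIGNALS: Dict[str, Signal] = {
    "price": lambda w: abs(w["session_return"]) > 0.05,
    "volume": lambda w: w["volume_ratio"] > 3.0,
    "aggressor": lambda w: w["buy_aggressor_ratio"] > 0.70,
    "trade_size": lambda w: w["trade_size_change"] < -0.50,
}


def composite(window: Window, signals: Mapping[str, Signal]) -> int:
    """Sum of binary flags across whichever signal modules are registered."""
    return sum(1 for fires in signals.values() if fires(window))


# Adding a fifth module is one more entry; the threshold is a placeholder.
EXTENDED = dict(BASE_SIGNALS)
EXTENDED["spread_compression"] = lambda w: w["spread_change"] < -0.25

window = {
    "session_return": 0.07, "volume_ratio": 4.2,
    "buy_aggressor_ratio": 0.9, "trade_size_change": -0.1,
    "spread_change": -0.4,
}
print(composite(window, BASE_SIGNALS), composite(window, EXTENDED))
```

Because the score is a plain sum over the registry, the scoring function itself does not change when a module is added.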
— Nate Silver, The Signal and the Noise: Why So Many Predictions Fail — but Some Don't (Penguin Press, 2012), Ch 8
Each signal is an independent module. The composite score is a simple sum. The power is not in complexity but in the combination of orthogonal evidence.
The thresholds (5%, 3×, 70%, 50%) are calibrated from the empirical distribution of the dataset. They are not arbitrary — each represents approximately the 95th percentile of its respective metric. Changing the thresholds shifts the precision/recall trade-off: tighter thresholds catch fewer manipulations but with higher confidence.
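Calibrating a threshold at roughly the 95th percentile of a metric could be sketched with the standard library as below. The sample data is synthetic, and the helper names are assumptions; the study's own calibration procedure is not shown in this text.

```python
import statistics
from typing import List, Sequence


def percentile_threshold(values: Sequence[float], pct: int = 95) -> float:
    """Cut point at the given percentile (pct must be a multiple of 5 here)."""
    cuts = statistics.quantiles(values, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    return cuts[pct // 5 - 1]


def flagged(values: Sequence[float], threshold: float) -> List[float]:
    """Observations that exceed the threshold."""
    return [v for v in values if v > threshold]


# Synthetic metric values, for illustration only.
metric = [float(x) for x in range(1, 101)]
loose = percentile_threshold(metric, 90)
tight = percentile_threshold(metric, 95)
print(loose, len(flagged(metric, loose)))
print(tight, len(flagged(metric, tight)))
```

Raising the percentile flags fewer observations, which is the precision/recall trade-off described above.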
— Larry Harris, Trading and Exchanges: Market Microstructure for Practitioners (Oxford University Press, 2003)
The multi-signal scoring pattern is domain-agnostic. Any system that monitors high-frequency event streams for anomalies benefits from the same architecture.
Signal 1: temperature exceeds threshold. Signal 2: vibration frequency shifts. Signal 3: power draw spikes. Signal 4: product defect rate increases. One signal = routine maintenance check. Three signals = stop the line. Same funnel, different data.
Signal 1: transaction amount exceeds pattern. Signal 2: velocity exceeds normal. Signal 3: new device or geolocation. Signal 4: recipient is flagged entity. Banks that use single-signal rules block 70% of legitimate transactions. Multi-signal scoring cuts false positives by 90%.
Signal 1: unusual packet rate. Signal 2: port scan detected. Signal 3: payload signature match. Signal 4: connection from known-bad IP range. SOC teams drown in single-signal alerts — thousands per day. Multi-signal scoring surfaces the 5 that matter.
Signal 1: heart rate anomaly. Signal 2: blood pressure deviation. Signal 3: oxygen saturation drop. Signal 4: activity pattern change. Each alarm alone triggers alarm fatigue. Combined scoring directs nurse attention to the patient most likely deteriorating.
"One signal lies. Two signals suggest. Three signals convict. The architecture is the same whether you are catching market manipulation, predicting machine failure, or detecting fraud — independent signals, Bayesian combination, exponential false-positive decay."
This study uses licensed market data obtained through commercial agreement. Infozense is not affiliated with the Stock Exchange of Thailand. No market data is distributed through this website. This content is for educational and analytical purposes only and does not constitute investment advice.