Aggregated tests on SET data show null. A tier-stratified event study, after one critical correction, finds Harris’s asymmetric-impact signature alive in mid-volume stocks.
Walk-the-book on 22 trading days × 75 SET stocks, stratified by volume quintile. The high-volume tier absorbs the signature; the mid-volume tier carries it (log_diff +0.47, p < 1e-8); the low-volume tier runs directionally with T3 but is underpowered. The fingerprint lives in the middle.
75 stocks (T1 + T3 + T5)
22 trading days
T3: where the fingerprint grips
+0.47 median log-diff (T3 × 30-min × 5% POV)
ⓘ About the sample
We tested across the full SET equity activity spectrum, not just liquid names. From a qualifying universe of 684 SET common stocks (≥500 trades over 22 trading days, median price ≥1 THB, no halts), we ranked by total volume and split into five quintiles. We then randomly sampled 25 stocks from T1 (top 20% by volume), T3 (middle 20%), and T5 (bottom 20%) — fixed random seed 42 for reproducibility. T1 stocks trade roughly 1,000× the daily volume of T5 stocks. This stratification is the core of the design: a single test pooled across the activity spectrum would have hidden the tier-specific result this chapter reports.
📡 Why this test works on SET specifically
Two structural features of SET data matter for measuring manipulation. First, SET ITCH encodes the aggressor side directly in every trade message — no Lee-Ready inference needed. On most US/European venues, aggressor classification is heuristic and noisy near the spread midpoint. On SET it is exact.
Second, the tick grid (Chapter 4: spread pinned at one tick approximately 98% of the trading day) is a structural feature, not a measurement convenience. It also means our walk-the-book impact estimates carry no model error from spread misclassification — but it has a deeper consequence: it suppresses Harris’s mechanism at low order sizes. That suppression turns out to be central to the chapter’s findings.
📚 Quick definitions
Asymmetry ratio — buy-side walk-the-book impact divided by sell-side impact at a given order size. 1.0 = symmetric; >1 = buying costlier; <1 = selling costlier.
|log asymmetry| — natural log of the asymmetry ratio, taken absolute. Zero = perfect symmetry; 0.69 = 2× asymmetry in either direction. Used because Harris’s prediction is about deviation from 1.0, not direction.
POV — Percentage of (daily) Volume; order size as a fraction of the day’s total traded volume.
Volume tier — quintile of total 22-day volume across qualifying SET common stocks. T1 = top 20%, T5 = bottom 20%.
Event window — a (stock, day, intraday time bucket) where the composite of the S1, S2, S4 surveillance signals fires at score ≥ 2.
Matched control — same stock, same time-of-day bucket, different day, no signals firing — K=5 controls per event, chosen to minimize log-volume distance to the event window.
Walk-the-book — pre-trade cost estimate built by simulating execution against the reconstructed visible book at the time of each snapshot.
Wilcoxon signed-rank — non-parametric paired test on the median of event-vs-control differences.
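To make the asymmetry-ratio, |log asymmetry|, and walk-the-book definitions concrete, here is a minimal Python sketch. It assumes impact is measured as the distance between the volume-weighted fill price and the quote midpoint, and it runs on a made-up three-level book with prices in integer ticks. The function names and the book are illustrative assumptions, not the chapter's production code.

```python
import math


def walk_the_book(levels, quantity):
    """Average fill price for `quantity` shares taken from `levels`.

    `levels` is a list of (price, size) pairs ordered best to worst.
    """
    remaining = quantity
    cost = 0.0
    for price, size in levels:
        take = min(remaining, size)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("order exceeds visible depth")
    return cost / quantity


def asymmetry(asks, bids, quantity):
    """Return (asymmetry ratio, |log asymmetry|) for one book snapshot."""
    mid = (asks[0][0] + bids[0][0]) / 2
    buy_impact = walk_the_book(asks, quantity) - mid
    sell_impact = mid - walk_the_book(bids, quantity)
    ratio = buy_impact / sell_impact
    return ratio, abs(math.log(ratio))


# Hypothetical book: prices in ticks, one-tick spread, thinner ask side.
asks = [(101, 1000), (102, 500), (103, 2000)]
bids = [(100, 1000), (99, 3000), (98, 3000)]

small_ratio, small_log = asymmetry(asks, bids, 500)    # fills entirely at L1
large_ratio, large_log = asymmetry(asks, bids, 2000)   # walks three ask levels
print(small_ratio, small_log)
print(large_ratio, round(large_log, 3))
```

The small order fills at the best quote on both sides, so the ratio is exactly 1.0. The larger order consumes the thinner ask side faster and the ratio rises to 1.25, meaning buying costs 25% more than selling at that size.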
All results from real SET ITCH data. Not simulated.
Section 1
The Question
Harris’s prediction: trade-based manipulation profits from asymmetric price impact between buying and selling. The question is whether SET data carries that signature.
Harris’s Trading and Exchanges describes a specific mechanism by which trade-based manipulation becomes profitable: asymmetric price impact between buying and selling. A manipulator pushes the price up cheaply (each share bought moves the price a lot, i.e. high buy-side impact), accumulates a position, then liquidates into the inflated price (low sell-side impact means the exit gives back little of the gain). The asymmetry between the two sides is the profit. The empirical prediction is sharp: stocks where manipulation occurs should show buy-side walk-the-book impact materially larger than sell-side impact during the manipulation episode.
The mechanism implies three market-wide predictions for an emerging market like SET:
Asymmetric impact should exist somewhere in the market — otherwise the mechanism has nothing to grip.
It should concentrate in low-volume, thin-book stocks where less institutional attention, thinner depth, and retail-dominated flow create the right conditions.
Surveillance built around asymmetric-impact signatures should flag manipulation candidates.
The first version of this chapter tested all three and found a null result on every dimension. The conclusion drafted at the time: “the fingerprint Harris’s theory requires doesn’t exist here.”
That conclusion was wrong. The fingerprint exists. The aggregated test was structurally blind to it. This chapter shows the path from the wrong answer to the right one.
Section 2
What the Literature Does and Doesn’t Say
Harris’s prediction is textbook. The empirical test design comes from a different literature — event studies on prosecuted manipulation episodes.
Harris’s prediction is textbook, not empirical. The empirical manipulation literature uses a different unit of analysis: prosecuted episodes from regulator records, often event-studied with abnormal returns, abnormal volume, and order-imbalance signatures within windows. Episode-level analysis is the standard template for testing trade-based manipulation predictions. Aggregating across stock-days or stock-months smooths over the very phenomenon the test should detect.
The standard tool: identify candidate manipulation windows, measure impact statistics inside those windows, compare to matched controls outside them, run a paired non-parametric test on the differential. The candidate windows can come from regulator enforcement records or, in markets without rich enforcement data, from surveillance-signal anomalies built from public trade flow.
On SET, regulator enforcement data is not publicly available at the granularity needed. The surveillance-signal proxy — building candidate events from S1 (price), S2 (volume), S3 (aggressor share), S4 (trade size decline) as introduced in Chapter 7 — is the natural alternative. The chapter uses this proxy after one critical correction: removing S3 from the event-defining composite, which we explain in Section 4.
Section 3
Why the Aggregated Test Was Blind
A median of 1.000 to three decimals seems definitive. It isn’t — it is consistent with two distinct realities the median statistic cannot distinguish.
The first-pass test computed, per stock per POV, the median asymmetry ratio across all snapshots over 22 days, then the median across the 25 stocks per tier. Results at 5% POV:
| Tier | Median asymmetry |
|---|---|
| T1 (most active) | 1.08 |
| T3 (mid) | 1.00 |
| T5 (lowest) | 1.00 |
Nothing in T5 — exactly where the theory predicts manipulation should concentrate. A median of 1.000 to three decimals seems definitive. It is not. It is consistent with two distinct realities the median cannot distinguish between:
Reality A: No manipulation exists; baseline trading is symmetric; the median correctly reports symmetry.
Reality B: Manipulation exists in roughly 0.1% of stock-time at 5× asymmetry; the remaining 99.9% is symmetric; the snapshot mean is 0.999 × 1.0 + 0.001 × 5.0 = 1.004; the median across 60,000 snapshots per stock-day is exactly 1.000 because manipulation episodes never reach the middle of the distribution.
A median-of-medians collapses sparse rare-event signals to indistinguishable from absent. The aggregate test as run cannot reject the null even if manipulation exists at 5× intensity in 0.1% of stock-time. This is a power problem dressed as a finding.
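The Reality B arithmetic can be reproduced in a few lines. This is a minimal sketch using the figures from the text (60,000 snapshots, 0.1% of them at 5× asymmetry, the rest at 1.0); the variable names are illustrative.

```python
import statistics

N_SNAPSHOTS = 60_000        # snapshots in one stock-day, from the text
MANIPULATED_SHARE = 0.001   # 0.1% of stock-time
MANIPULATED_RATIO = 5.0     # 5x asymmetry inside the episode

n_manip = round(N_SNAPSHOTS * MANIPULATED_SHARE)
ratios = [1.0] * (N_SNAPSHOTS - n_manip) + [MANIPULATED_RATIO] * n_manip

mean_ratio = statistics.fmean(ratios)
median_ratio = statistics.median(ratios)
max_ratio = max(ratios)

print(f"manipulated snapshots: {n_manip}")
print(f"mean   = {mean_ratio:.3f}")
print(f"median = {median_ratio:.3f}")
print(f"max    = {max_ratio:.3f}")
```

The mean moves by 0.004, the median does not move at all, and only a tail statistic such as the maximum registers the episode.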
A second descriptive cut surfaces the issue empirically. Per stock-day, what is the 95th percentile of snapshot-level asymmetry?
| Tier | Median p50 across stock-days | Median p95 | Median p99 |
|---|---|---|---|
| T1 | 1.43× | 2.68× | 3.20× |
| T3 | 1.35× | 2.91× | 3.12× |
| T5 | 1.00× | 2.30× | 2.47× |
The median within-stock-day snapshot is symmetric. The 95th percentile within-stock-day is 2.3–2.9× asymmetric. Nearly every stock-day on SET contains moments where buying costs 2–3× more than selling, or vice versa. Those moments are the natural locus of manipulation if it exists. The median erases them.
One simulated stock-day. A 60-second episode of 4× buy/sell asymmetry sits inside ~6.5 hours of symmetric trading. Snapshot-level p95 captures it; snapshot-level median and the daily aggregate erase it.
This is not a flaw of the median statistic — the median is the correct summary of typical behavior. It is a flaw of the test design that asked ‘what does typical behavior look like?’ when the right question is ‘what happens in flagged windows?’
Section 4
One Surveillance Signal Was Self-Fulfilling
S3 (aggressor share > 70% daily) fires on 74% of low-volume stock-days — mechanically, not because they’re being manipulated. Including it in the event-defining composite while measuring asymmetry inside the flagged window is near-tautological.
Before rebuilding the test, an audit of the Chapter 7 surveillance signals revealed an endogeneity problem. The four daily signals fire at very different rates across volume tiers:
| Tier | S1 (price > 5%) | S2 (volume > 3×) | S3 (aggressor > 70%) | S4 (size < 70%) |
|---|---|---|---|---|
| T1 | 22.1% | 7.1% | 12.8% | 9.9% |
| T3 | 7.3% | 6.0% | 50.7% | 24.7% |
| T5 | 5.2% | 14.5% | 74.0% | 28.0% |
S3 fires on 74% of T5 stock-days. The mechanism is mechanical: when daily volume is sparse, two or three larger trades push the daily aggressor share past 70%. S3 is detecting thin trading more than directional pressure.
This matters for any test that uses S3 as part of event definition while also measuring buy/sell asymmetry. Aggressor share is mechanically correlated with the asymmetry construct. Including it in the event flag while measuring asymmetry inside the flagged window is testing whether high-aggressor windows have high asymmetry — a near-tautology rather than a manipulation test.
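Why a 70% aggressor-share threshold fires mechanically on thin days can be illustrated with a simple binomial calculation. The sketch below is a deliberate simplification and rests on assumptions not stated in the source: trades are equal-sized, each trade's aggressor side is an independent fair coin, and the signal fires when either side's share exceeds the threshold. The function name is hypothetical.

```python
from math import comb


def prob_s3_fires(n_trades, threshold_pct=70):
    """P(one side's share of trades exceeds threshold_pct) with no directional pressure.

    Assumes equal-sized trades and independent 50/50 aggressor sides.
    """
    hits = 0
    for buys in range(n_trades + 1):
        sells = n_trades - buys
        # Integer comparison avoids floating-point edge cases at exactly 70%.
        if 100 * buys > threshold_pct * n_trades or 100 * sells > threshold_pct * n_trades:
            hits += comb(n_trades, buys)
    return hits / 2 ** n_trades


p_three = prob_s3_fires(3)
p_ten = prob_s3_fires(10)
p_thousand = prob_s3_fires(1000)

for n, p in [(3, p_three), (10, p_ten), (1000, p_thousand)]:
    print(f"{n:>5} trades/day -> P(S3 fires by chance) = {p:.3g}")
```

Even with perfectly balanced flow, a three-trade day trips the threshold a quarter of the time under these assumptions, while a thousand-trade day essentially never does. Real trade sizes are unequal, which makes sparse days even more prone to crossing the threshold.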
S3 fires in 87% of all score-2+ events in the original Chapter 7 funnel. Drop S3 and the picture changes:
Score-2+ events with S3 included: 311
Score-2+ events with S3 excluded: 59 (19% of original)
Tier composition with S3 included: 9.3% / 19.3% / 33.6% (T1 / T3 / T5) — the prior ‘3.6× more in T5’ finding
Tier composition with S3 excluded (per stock-day basis): no longer T5-dominant; events distribute by total window count per tier
The ‘3.6× more surveillance signals in low-volume tier’ finding from Chapter 7 was largely an S3 artifact — it measured thin-trading episodes, not manipulation episodes. A separate Chapter 7 fix surfaces this honestly. For this chapter’s event-study, S3 is excluded from the event-defining composite.
Section 5
Window-Level Signal Re-Computation
Daily aggregates are too coarse for an event-study that needs intraday windows. We recomputed S1, S2, S4 at three window granularities — with per-stock percentile thresholds to avoid tier-confounding.
With S3 removed, the remaining signals are S1, S2, S4. The Chapter 7 versions were daily aggregates — too coarse for an event-study that needs intraday windows. We recomputed them at three window granularities (5-minute, 30-minute, 60-minute) over the 22 trading days × 75 stocks. For each (stock, day, window):
S1_w fires if the window’s high-low return is in the top 5% of that stock’s windows of the same size.
S2_w fires if the window’s volume is in the top 5% of that stock’s windows of the same size.
S4_w fires if the window’s average trade size is in the bottom 5% of that stock’s windows of the same size.
Per-stock percentile thresholds avoid the tier-confounding that plagued the original daily-level Chapter 7 composite. Each stock contributes its own ‘abnormal’ windows relative to its own typical behavior.
Composite_w = S1_w + S2_w + S4_w, ranging 0 to 3. Event windows are those with composite_w ≥ 2. Window counts:
| Window | Total windows | Event windows (score ≥ 2) |
|---|---|---|
| 5-min | 38,631 | 663 |
| 30-min | 10,341 | 199 |
| 60-min | 6,230 | 132 |
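The window-level signals and the composite defined above can be sketched as follows. This is a minimal illustration for one stock and one window size, not the chapter's pipeline: the field names (`hl_return_bps`, `volume`, `avg_trade_size`) are assumptions, the percentile cut points use Python's default `statistics.quantiles` interpolation, and the 100 synthetic windows are constructed so that high-volume windows also carry larger average trade sizes.

```python
import statistics


def flag_windows(windows):
    """Add S1_w, S2_w, S4_w, composite_w and is_event to one stock's windows.

    Thresholds are that stock's own 95th / 5th percentiles, so every stock
    is judged against its own typical behavior.
    """
    ret_p95 = statistics.quantiles([w["hl_return_bps"] for w in windows], n=20)[-1]
    vol_p95 = statistics.quantiles([w["volume"] for w in windows], n=20)[-1]
    size_p05 = statistics.quantiles([w["avg_trade_size"] for w in windows], n=20)[0]

    flagged = []
    for w in windows:
        s1 = int(w["hl_return_bps"] > ret_p95)      # top 5% high-low return
        s2 = int(w["volume"] > vol_p95)             # top 5% volume
        s4 = int(w["avg_trade_size"] < size_p05)    # bottom 5% average trade size
        composite = s1 + s2 + s4
        flagged.append({**w, "S1_w": s1, "S2_w": s2, "S4_w": s4,
                        "composite_w": composite, "is_event": composite >= 2})
    return flagged


# Synthetic windows for a single hypothetical stock (illustrative only).
windows = [
    {"window_id": i, "hl_return_bps": i, "volume": 1000 * i, "avg_trade_size": 50 + i}
    for i in range(1, 101)
]
flagged = flag_windows(windows)

n_events = sum(w["is_event"] for w in flagged)
n_s4 = sum(w["S4_w"] for w in flagged)
max_score = max(w["composite_w"] for w in flagged)
s2_and_s4 = sum(1 for w in flagged if w["S2_w"] and w["S4_w"])
print(n_events, n_s4, max_score, s2_and_s4)
```

In this toy data the five busiest windows fire S1 and S2 together, the five smallest-trade windows fire S4 alone, and no window reaches a score of 3, which mirrors the pattern the chapter reports for the real sample.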
Zero score-3 events at any window size. The original 20 score-3 events in Chapter 7 were all S3-anchored. Without S3, a score of 3 needs S1, S2, and S4 to fire in the same window, and each fires in only about 5% of a stock’s windows; the triple co-occurrence is vanishingly rare.
Operational note: the composite is in practice S1+S2 only. In this 22-day sample, every score≥2 event is an S1+S2 co-occurrence (volume spike + price move). S2 and S4 are operationally near-mutually-exclusive — when volume is in the top 5%, average trade size is rarely in the bottom 5%, because the same windows that aggregate high volume also aggregate larger institutional fills. S4 fires in 358 windows over the sample but never simultaneously with S1 or S2. The chapter’s ‘multi-signal composite’ architecture is therefore, on this sample, a test of asymmetric impact inside price-and-volume burst windows specifically. We test S4-only windows separately as a robustness check (Section 7) and find no asymmetric impact there — useful information about where the signal lives.
Section 6
The Event-Study Design
For each event window, compare within-window walk-the-book asymmetry against matched non-event controls. Tail statistic. Activity-matched. Paired non-parametric test.
Treatment statistic per window: The maximum |log asymmetry| across snapshots in the window — a tail-sensitive statistic. The within-window median was tested first and shows mostly null (we return to this in Section 9), consistent with Section 3’s argument that medians collapse rare events. The maximum captures the extreme asymmetric moment inside the window if one exists.
Matching: For each event window (stock i, day t, time-of-day bucket b), find the K = 5 control windows of the same stock at the same time-of-day on different days where composite_w = 0, ranked by smallest absolute log-volume distance to the event window. Take their per-window |log asymmetry| max, and compute the median across the K controls.
Why volume-matched controls. Event windows by construction have higher volume than baseline (S2 fires on top 5% volume). High-volume windows mechanically generate more extreme tail asymmetry simply by having more snapshots and more depth consumption. Volume-matched controls strip out this confound: any surviving event-vs-control differential is over and above the mechanical activity effect.
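Differential: log_diff = log(event_max) − log(control_median_max). The statistic lives in log space and is symmetric around zero: a positive value means the event window’s most extreme moment exceeds that of its matched controls.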
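The matching rule and the differential can be sketched in a few lines. The example assumes the candidate controls have already been filtered to the same stock, the same time-of-day bucket, a different day, and composite_w = 0; the dictionary keys and all numbers are made up for illustration. Note that the log is undefined when a window's max |log asymmetry| is exactly zero, which a real pipeline would have to handle.

```python
import math
import statistics


def matched_log_diff(event, candidates, k=5):
    """log(event max) minus log(median of the K nearest controls' max).

    Controls are ranked by absolute log-volume distance to the event window.
    """
    ranked = sorted(
        candidates,
        key=lambda c: abs(math.log(c["volume"]) - math.log(event["volume"])),
    )
    controls = ranked[:k]
    control_median_max = statistics.median([c["max_abs_log_asym"] for c in controls])
    return math.log(event["max_abs_log_asym"]) - math.log(control_median_max)


# Hypothetical event window and candidate controls.
event = {"volume": 100_000, "max_abs_log_asym": 0.8}
candidates = [
    {"volume": 90_000, "max_abs_log_asym": 0.5},
    {"volume": 110_000, "max_abs_log_asym": 0.4},
    {"volume": 50_000, "max_abs_log_asym": 0.9},    # too far in volume, not selected
    {"volume": 105_000, "max_abs_log_asym": 0.5},
    {"volume": 95_000, "max_abs_log_asym": 0.6},
    {"volume": 200_000, "max_abs_log_asym": 1.2},   # too far in volume, not selected
    {"volume": 98_000, "max_abs_log_asym": 0.5},
]

log_diff = matched_log_diff(event, candidates)
print(round(log_diff, 3))
```

Taking the median across the K controls keeps a single noisy control window from setting the baseline.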
Inference: Wilcoxon signed-rank test on the paired log_diffs across event windows, stratified by tier (T1, T3, T5) and reported pooled (ALL). Bootstrap 95% CI on the median log_diff (10,000 resamples, fixed seed). Reported at four POV levels (1%, 2%, 5%, 10%) and three window sizes (5, 30, 60 min) — twelve cells per tier, 48 total.
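For readers who want to see the inference step spelled out, the sketch below implements a simplified Wilcoxon signed-rank test and a percentile bootstrap interval for the median using only the standard library. It is an approximation under stated assumptions: the p-value uses the normal approximation without tie or continuity corrections (an exact small-sample test, or a library routine such as `scipy.stats.wilcoxon`, would be preferable in practice), the seed value is a placeholder, and the twelve log_diffs are invented.

```python
import math
import random
import statistics


def wilcoxon_signed_rank(diffs):
    """Two-sided signed-rank test, normal approximation. Returns (W_plus, p)."""
    nonzero = [d for d in diffs if d != 0]          # zero differences are dropped
    n = len(nonzero)
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1              # tied values share the mean rank
        for t in range(i, j + 1):
            ranks[order[t]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return w_plus, p_value


def bootstrap_median_ci(values, n_resamples=10_000, seed=0, level=0.95):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    n = len(values)
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return medians[lo_idx], medians[hi_idx]


# Invented paired log_diffs for one tier x window x POV cell.
log_diffs = [0.47, 0.52, 0.31, -0.05, 0.60, 0.12, 0.44, 0.38, -0.10, 0.55, 0.29, 0.41]
w_plus, p_value = wilcoxon_signed_rank(log_diffs)
ci_low, ci_high = bootstrap_median_ci(log_diffs)
print(w_plus, round(p_value, 4))
print(ci_low, ci_high)
```

Fixing the seed makes the interval reproducible from run to run, which is the reason the chapter reports a fixed seed for its bootstrap.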
Section 7
Findings
The activity-matched event-study results. T1 absorbs, T3 carries the fingerprint, T5 directional but underpowered.
Event-window vs. matched-control: log asymmetry differential, by tier × window × POV. Vertical reference at log_diff = 0. CIs excluding zero marked.
T1 (high-volume, n = 51–510 events per cell): Effects close to zero at all window sizes and POVs after activity matching. At 60-min × 10% POV the point estimate is mildly negative (log_diff = −0.21, p = 0.04, CI [−0.30, −0.03]) — events show slightly less extreme asymmetric moments than matched controls at the deepest order sizes. Most cells: no signal.
T3 (mid-volume, n = 48–95 events per cell): Robust positive log_diff across every combination of window size and POV.
| Window | POV | log_diff | p | Bootstrap CI |
|---|---|---|---|---|
| 30-min | 1% | +0.48 | 2.3e-10 | [0.24, 0.77] |
| 30-min | 2% | +0.55 | 1.1e-9 | [0.25, 0.70] |
| 30-min | 5% | +0.47 | 8.1e-9 | [0.25, 0.64] |
| 30-min | 10% | +0.33 | 1.8e-6 | [0.21, 0.49] |
| 60-min | 1% | +0.28 | 1.4e-8 | [0.12, 0.53] |
| 60-min | 5% | +0.27 | 4.4e-7 | [0.15, 0.42] |
A log_diff of 0.47 in log space corresponds to event-window max asymmetry being approximately 1.6× the matched-control max asymmetry. Effects survive Bonferroni correction across the full 48-cell grid (α = 0.001) in every reported T3 cell at 30-min and 60-min windows.
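Both conversions in the paragraph above are quick to check. The sketch assumes a family-wise error rate of 0.05 (the text states only the corrected α of about 0.001) and takes the p-values from the T3 table.

```python
import math

N_CELLS = 4 * 3 * 4            # POV levels x window sizes x tier groups
FAMILY_ALPHA = 0.05            # assumed family-wise error rate
bonferroni_alpha = FAMILY_ALPHA / N_CELLS

headline_ratio = math.exp(0.47)   # log_diff back to a multiplicative ratio

# p-values reported for the T3 cells at 30-min and 60-min windows.
t3_p_values = {
    ("30-min", "1%"): 2.3e-10,
    ("30-min", "2%"): 1.1e-9,
    ("30-min", "5%"): 8.1e-9,
    ("30-min", "10%"): 1.8e-6,
    ("60-min", "1%"): 1.4e-8,
    ("60-min", "5%"): 4.4e-7,
}
all_survive = all(p < bonferroni_alpha for p in t3_p_values.values())

print(f"cells = {N_CELLS}, corrected alpha = {bonferroni_alpha:.5f}")
print(f"exp(0.47) = {headline_ratio:.2f}")
print(f"all reported T3 cells survive: {all_survive}")
```

The weakest reported T3 cell (30-min × 10% POV, p = 1.8e-6) still sits well below the corrected threshold.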
Seed robustness across 100 random T3 samples. To test whether the headline 0.47 effect is a fluke of one 25-stock random draw, we re-ran the entire pipeline 100 times — each time drawing 25 random T3 stocks from the full 136-stock T3 universe, recomputing window signals, and re-running the matched event study. The median log_diff across 100 seeds is 0.477 (matching the original sample’s 0.47), the 5th–95th percentile range is [0.29, 0.63], 100% of seeds produced a positive log_diff, and 98 of 100 seeds were significant at Bonferroni-style p < 0.001. The original seed=42 sample landed at essentially the population median. The T3 effect is approximately 1.5–1.6× max asymmetry (event vs. matched control) across the entire T3 quintile, not a lucky 25-stock subset.
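The resampling loop behind this check has a simple shape. The sketch below is a hypothetical outline only: `run_event_study` stands in for the full pipeline (window signals, matching, median log_diff), and the stand-in used here simply returns the median of invented per-stock effects so the loop can run. None of the numbers it prints are the chapter's results.

```python
import random
import statistics


def seed_robustness(universe, run_event_study, n_seeds=100, sample_size=25):
    """Re-run a pipeline on repeated random draws and summarize the results."""
    results = []
    for seed in range(n_seeds):
        rng = random.Random(seed)                    # one reproducible draw per seed
        sample = rng.sample(universe, sample_size)
        results.append(run_event_study(sample))
    cuts = statistics.quantiles(results, n=20)       # 5%, 10%, ..., 95% cut points
    return {
        "median": statistics.median(results),
        "p05": cuts[0],
        "p95": cuts[-1],
        "share_positive": sum(r > 0 for r in results) / len(results),
    }


# Placeholder universe of 136 tickers with invented per-stock effects.
universe = [f"T3_{i:03d}" for i in range(136)]
fake_effect = {name: 0.30 + 0.003 * i for i, name in enumerate(universe)}


def run_event_study(sample):
    """Stand-in for the real pipeline: median of the invented effects."""
    return statistics.median([fake_effect[name] for name in sample])


summary = seed_robustness(universe, run_event_study)
print(summary)
```

Passing the pipeline in as a function keeps the resampling logic separate from the event-study code, so the same loop can be reused for other tiers or sample sizes.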
Within-sample stability. Per-stock log_diff in the original T3 sample, sorted. 23 of 25 stocks show positive median log_diff. The two negative stocks contribute only 1 event each; their removal shifts the overall median by less than 0.04. The finding is not driven by a small number of outlier stocks.
S4-only events: where the signal isn’t. As an honest robustness check, we ran the same event-study design on the 93 T3 windows where only S4 fires (composite = 1, S4_w = 1) — small-trade-size episodes without a price or volume spike. The differential is log_diff = 0.000, Wilcoxon p = 0.67 — flat, no effect. The asymmetric-impact signature lives specifically in price-and-volume burst windows (S1+S2), not in small-trade-size episodes. This is informative about where the signal lives, not a contradiction of the headline.
T5 (low-volume, n = 12–13 events per cell): Positive point estimates with the same shape as T3 (log_diff 0.28–0.55), but small samples make inference unstable. Wilcoxon p-values are below 0.01 in most cells, but bootstrap CIs often include zero. Descriptive support, not inferential.
Pooled across tiers (ALL): Consistently positive at every combination, p < 1e-5 at all window sizes and POVs.
Section 8
The Tier Story
Three tiers tell three different stories. The shape of these findings is itself the finding.
Distribution of within-window max |log asymmetry|: events vs. matched controls. T1 absorbs the asymmetry. T3 carries the fingerprint. T5 directional but underpowered.
T1 absorbs. High-volume stocks have deep, professional liquidity. A would-be manipulator runs into institutional resistance — refresh, counterflow, and quick arbitrage erase the asymmetric impact the moment it appears. Even in surveillance-flagged windows, the within-window max asymmetry is no higher than in volume-matched control windows. After matching for activity, the deep-POV point estimate is slightly negative — consistent with institutional liquidity responding to directional flow and dampening it.
T3 carries the fingerprint. Mid-volume stocks have enough depth to be tradeable but not enough institutional presence to absorb directional pressure. Event windows show roughly 30–75% more extreme asymmetric moments than matched-volume controls (log_diff of 0.27 to 0.55 in the reported cells). This is the strongest test of Harris’s prediction in the chapter, and it confirms the theory at this volume range.
T5 cannot be tested at this sample. Low-volume stocks generate too few joint-signal anomalies in 22 trading days to support inferential conclusions. The point estimates run with T3 (positive, comparable magnitude), but the bootstrap intervals are too wide. A multi-quarter sample would resolve this. As reported here, T5 is consistent with Harris’s prediction directionally but does not statistically confirm it.
The shape of these three findings is itself the finding. Harris’s mechanism does not grip uniformly: it requires a specific intersection of (a) enough depth for the stock to be tradeable, (b) not so much depth that institutional flow absorbs the move, and (c) enough trading activity that the asymmetric episode produces measurable signal. The mid-volume tier is where these three conditions intersect on SET.
Section 9
Why the Tail Matters and the Median Doesn’t
The within-window max captures the question Harris asked. The within-window median answers a different question entirely.
A natural reviewer question: why use the within-window max instead of the median? The median version of the same test produces mostly null results (a few cells reach p < 0.01 at the deepest POVs in T3, but none survive Bonferroni). The max version produces highly significant findings in T3 across every cell.
The difference is not a statistical trick. It is the point of Section 3 made formally: the median is a measure of typical behavior; manipulation episodes are not typical. A 30-minute window with one 60-second burst of 4× buy-side impact and 1,740 seconds of symmetric trading has a median asymmetry near 1.0 and a max near 4.0. The median test asks whether typical trading in flagged windows differs from typical trading in control windows. The max test asks whether extreme moments in flagged windows are more extreme than extreme moments in control windows. Harris’s prediction is about extreme moments. The max test answers the question Harris asked. The median test does not.
This is a methodological choice, and we report both for honesty:
| Statistic | T3 × 30-min × 5% POV | p (Wilcoxon) | Bootstrap CI |
|---|---|---|---|
| within-window median | log_diff = 0.04 | 0.19 | [0.00, 0.19] |
| within-window p95 | log_diff = 0.41 | 8.1e-8 | [0.28, 0.55] |
| within-window max | log_diff = 0.47 | 8.1e-9 | [0.25, 0.64] |
The tail statistics agree with each other and dominate the median in revealing the signal Harris predicted.
The MAX-statistic POV gradient: T3 log_diff across POV depths × window sizes. Significance survives Bonferroni in every cell.
The MAX-statistic POV gradient. The chart above shows the T3 effect across POV depths from 1% to 10%, at both 30-min and 60-min windows. The effect is robustly positive across every POV level, with significance surviving Bonferroni in every cell. The gradient is essentially flat at 60-min and peaks slightly at 2% POV at 30-min — neither matching nor refuting any specific mechanism prediction. We discuss what this implies for the tick-grid story in Section 10.
Section 10
Mechanism: What the Tick Grid Explains and What It Doesn’t
The tick grid explains the aggregated null. It does not explain the corrected event-study finding. Those are two different conclusions and we should not conflate them.
The tick grid (Chapter 4: spread = 1 tick approximately 98% of the trading day) is a structural feature of SET that has direct consequences for any walk-the-book impact test. The clean consequence: at small POV, the buy walk fills entirely at L1 (best ask) and the sell walk at L1 (best bid), and on a 1-tick-spread market those two fills are by construction equally distant from midpoint. The median asymmetry is exactly 1.000 at low POV — not because there is no manipulation, but because the floor of measurable asymmetry is below the resolution of the median test. This is the empirical content of Section 3’s aggregated null and the structural reason why an aggregated median-of-medians test cannot detect Harris’s mechanism on a tick-bound market.
The MAX statistic that produces this chapter’s headline result does not inherit that suppression. The within-window MAX is sensitive to rare moments where book depth has temporarily thinned, an iceberg refresh has not arrived, or a sequence of aggressive orders has consumed several levels — moments where the realized asymmetry is unconstrained by the tick floor. The MAX test detects the manipulation signature in spite of the tick grid, not because of it. This is consistent with the empirical POV gradient: the MAX-statistic T3 effect is robustly positive across all POV levels from 1% to 10%, with no monotonic relationship to depth.
Put together, the tick grid explains why aggregated tests on SET have been blind to Harris’s mechanism — not why the corrected event-study test succeeds. The chapter’s methodological contribution is therefore not ‘we found a way to confirm the mechanism mathematically’ but ‘we found a way to detect the signature empirically despite a market structure that suppresses its average expression.’
Section 11
What This Means for Surveillance
Five practical implications for a surveillance team operating on SET tick data.
The original Chapter 7 funnel produces 311 score-2+ stock-day alerts across 22 trading days × 75 stocks. After removing S3 (the self-fulfilling thin-trading signal), 59 alerts remain — a more honest count of actual signal anomalies. Of those, the ones associated with statistically significant manipulation-style asymmetry are concentrated in mid-volume stocks.
Practical implications for a surveillance team operating on SET tick data:
Drop S3 from any composite intended to detect manipulation rather than thin trading. S3 detects sparse-volume aggressor concentration — useful signal for liquidity surveillance, misleading for manipulation surveillance.
Look in mid-volume stocks. The fingerprint is not in the most-active mainboard names (T1 absorbs it) and is not yet inferentially demonstrated in the thinnest tier (T5 underpowered at this sample size).
Use intraday windows, not daily aggregates. Manipulation episodes do not last all day. The 30-minute window detects what the daily aggregate hides.
Use within-window tail statistics. The max or 95th-percentile snapshot asymmetry within an event window carries the signal; the within-window median erases it.
Match controls on activity. Comparing event windows to matched-volume controls on the same stock at the same time of day rules out the trivial activity-correlation interpretation. This is what makes the surviving signal interpretable as Harris-style asymmetry rather than a mechanical artifact.
Section 12
Honest Caveats
The findings have real limits. A skeptic should object to each of these; we surface them ourselves.
Read these before citing the numbers.
Sample size and seed robustness. 22 trading days is one month. The 75-stock primary sample is 11% of the qualifying SET common stock universe. We tested the T3 finding’s sampling robustness by re-running the entire pipeline on 100 random 25-stock T3 draws from the full 136-stock T3 universe. The median log_diff across 100 seeds was 0.477 (matching the headline 0.47), the 5th–95th percentile range was [0.29, 0.63], and 98 of 100 seeds were significant at Bonferroni-style p < 0.001. The headline is not a lucky draw. A multi-quarter sample on a broader universe remains the natural next step but is not required to establish the qualitative finding.
The surveillance signals are proxies. We do not have prosecuted manipulation cases from SET to use as ground truth. The S1/S2/S4 composite is a publicly-replicable proxy for ‘candidate manipulation window’: necessary in the absence of enforcement data, but not the same as confirmed manipulation. Some flagged windows are likely innocent (high volume from news, large block flows from rebalancing); some unflagged windows may contain real manipulation that the composite missed.
Walk-the-book is mechanical, not strategic. Our impact measure assumes the order eats through visible depth instantly without triggering book response. Real manipulator trades trigger quote revision, refresh, and counterflow. The measured impact is an upper bound on the cost a real trader would face. This bias affects events and controls equally and should mostly cancel in the differential, but it means we are measuring visible-book asymmetry, not true-execution asymmetry.
Activity matching is on volume, not realized volatility. Controls are matched on log-volume distance only (K=5 nearest controls per event within same stock and same time-of-day). We attempted a joint (volume + realized-vol) match as a robustness check, but the candidate-control pool at our sample size (25 T3 stocks × ~10 time-of-day buckets × non-event days) is too thin for K=5 nearest neighbors to actually discriminate across multiple match dimensions: the same controls are selected regardless of which distance metric is used. We disclose this honestly: vol-matched results are not separately identified from volume-matched results in our sample. A larger sample with denser control pools would enable proper joint matching.
Signal composite is operationally narrower than its description. Every composite≥2 event in this sample is an S1+S2 co-occurrence; S4 fires 358 times in 30-min windows but never co-occurs with S1 or S2. The chapter’s test is therefore, on this sample, specifically a test of asymmetric impact in price-and-volume burst windows. The S4-only robustness check (Section 7) shows no asymmetric impact in small-trade-size episodes, which is informative about where the signal lives. A broader universe or longer sample might produce score≥2 events involving S4; ours does not.
T5 underpower. Low-volume stocks generate too few S1/S2/S4 joint anomalies to draw inferential conclusions. The point estimates support Harris’s prediction; the intervals are too wide. We report T5 results descriptively and note that the chapter’s central finding (T3 confirms, T1 does not) rests on T1 and T3.
Multiple testing. We ran 48 cells (4 POVs × 4 tier groups × 3 windows). The T3 cells survive Bonferroni-corrected α = 0.001 by an order of magnitude. Pooled and T1 cells should be interpreted with the multiplicity in mind.
Tick-grid approximation in per-snapshot impact. Per-snapshot impact was computed from pre-extracted snapshots that store only the best-bid and best-ask prices plus quantities at ten levels. Prices at L2–L10 were approximated using the 1-tick step assumption justified by Chapter 4. Validation against full message-stream replay on the same stocks shows the per-stock median asymmetry estimates agree to within 5% on 80% of stocks. Any residual approximation error affects events and controls equally and should not bias the differential.
Generalization. Findings are specific to SET tick data over the 22-day sample. Other tick-bound emerging markets (e.g., Hong Kong with its tick table, India with circuit breakers) may show similar patterns; tick-fine markets (US, decimalized European) almost certainly show the mechanism at smaller POV. The mid-tier-grip story may itself be SET-specific or generalize; only replication will tell.
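The tick-grid approximation described in the caveats above (prices at L2–L10 inferred from the best quotes) amounts to stepping one tick per level away from the touch. A minimal sketch, assuming a single constant tick size across all ten levels and prices held in integer minor units; the function name and the example quotes are hypothetical:

```python
def approximate_level_prices(best_bid, best_ask, tick, n_levels=10):
    """Infer L1..Ln prices on each side by stepping one tick per level.

    Assumes every level is occupied and the tick size does not change
    across the levels, which is the approximation, not a guarantee.
    """
    bids = [best_bid - level * tick for level in range(n_levels)]
    asks = [best_ask + level * tick for level in range(n_levels)]
    return bids, asks


# Hypothetical quote in integer price units with a tick of 5 units.
bid_prices, ask_prices = approximate_level_prices(best_bid=1000, best_ask=1005, tick=5)
print(bid_prices)
print(ask_prices)
```

Pairing these inferred prices with the stored per-level quantities gives the book that a walk-the-book calculation consumes; gaps in the real book (empty price levels) are exactly where the approximation and a full message replay would diverge.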
Section 13
What Changed Between Drafts
The path from null finding to positive finding is itself the chapter’s methodological contribution.
This section is unusual in a published chapter but worth keeping for one reason: the path from the previous draft’s null finding to this draft’s positive finding is itself the chapter’s methodological contribution.
The original aggregated test reported ‘the fingerprint doesn’t exist here.’ That conclusion was driven by:
Aggregation at the wrong level. Stock-day medians of snapshot medians collapse sparse signals to noise. The same data, sliced at the event-window level with tail statistics, reveals what the aggregate hides.
A self-fulfilling surveillance signal. S3 (daily aggressor share > 70%) fires on 74% of T5 stock-days because thin volume mechanically produces concentrated aggressor shares. Including it in the event composite while measuring asymmetry is near-circular.
Median-of-medians erases tails. Manipulation episodes are tail events by definition. Median statistics are insensitive to tails. The right within-window statistic for detecting localized episodes is the tail (p95 or max).
Failure to control for activity. Event windows are flagged for high activity; high-activity windows have more extreme tail asymmetry mechanically. Volume-matched controls strip this confound.
Each correction was substantive. Together they took a null finding to a robust positive in mid-volume stocks. The first chapter draft’s conclusion was wrong; this chapter’s conclusion is what the data actually says.
References
Harris, L. (2003). Trading and Exchanges: Market Microstructure for Practitioners. Oxford University Press. (Chapters 29–30 for manipulation theory and the asymmetric-impact prediction tested in this chapter.)
Aggarwal, R. K., & Wu, G. (2006). “Stock Market Manipulations.” Journal of Business, 79(4), 1915–1953. (Canonical empirical template for event-window analysis of trade-based manipulation using regulator enforcement records.)
Comerton-Forde, C., & Putniņš, T. J. (2014). “Stock Price Manipulation: Prevalence and Determinants.” Review of Finance, 18(1), 23–66. (Prosecuted-case panel and matched-control design that this chapter’s event-study adapts to surveillance-flagged windows.)
Comerton-Forde, C., & Putniņš, T. J. (2011). “Measuring Closing Price Manipulation.” Journal of Financial Intermediation, 20(2), 135–158. (Surveillance signature methodology for end-of-day manipulation; methodologically adjacent to this chapter’s intraday approach.)
Allen, F., & Gale, D. (1992). “Stock-Price Manipulation.” Review of Financial Studies, 5(3), 503–529. (Theoretical anchor for trade-based manipulation; predicts pump-and-dump price patterns underlying Harris’s empirical prediction.)
Kyle, A. S., & Viswanathan, S. (2008). “How to Define Illegal Price Manipulation.” American Economic Review: Papers & Proceedings, 98(2), 274–279. (Definitional framing.)
This study uses licensed market data obtained through commercial agreement. Infozense is not affiliated with the Stock Exchange of Thailand. No market data is distributed through this website. This content is for educational and analytical purposes only and does not constitute investment advice.