HOT · NOW · Manual · Broker OFF · Bridge LIVE · BUILD PHASE │ ◈ 10 brains race ◇ Forge has memory ◆ CI-LB > +0.3R · Holm · Static · not real-time
🪜 Deployment Phase Ladder · where the engine is
1 · PAPER TRADING ◀ NOW
Hot mode, paper fills. Everything works, is connected, and proves expectancy. Advance: works end-to-end + positive expectancy on paper.
2 · QUANTOWER DEMO
Verify every MECHANICAL op — order send, connections, bridges, automation. Mechanics, not edge.
3 · LIVE · months away
Real broker / prop-firm account — the only phase touching real money. Only after 1 + 2 prove out.
Operative question is always "does this function work in paper?" — live framing is months out. Maps under: P1 ≈ G0 / Hot Run · P2 ≈ exec-ramp Phase 0 + G1 · P3 ≈ later gates.
◉
10 racing brains
Portfolio fleet · NOA leader +138.9R
◬
208 modules
Shared substrate · all brains read from it
⧉
~150 fields/fire
Substrate density · 16 categories
⟁
12,915 fires resolved
Deduped portfolio sample · HOT-RUN
⬢
6 layers · ~55 cards
This atlas · current state only
Risk & Performance Notice
Futures trading involves substantial risk and is not suitable for every investor. Past, simulated, hypothetical, replay-based, research, or model-generated results are not necessarily indicative of future performance. No representation is being made that any account will or is likely to achieve profits or losses similar to those shown. Unless explicitly marked as broker-executed live trades, all displayed outcomes should be treated as research/model outputs and may not reflect slippage, commissions, liquidity constraints, execution delays, or trader behavior. The 12,915-fire deduplicated sample referenced in this atlas was generated during research and hot-run evaluation phases across 10 racing brains — not under live broker execution.
Figure provenance — read every $ / R with its tag
A number whose provenance is left for the reader to guess is the ambiguous-zero applied to money. Default: every $ / R figure here is REALIZED in the shadow paper-race (it actually happened across the books — still not real broker money). Anything else MUST carry a tag: ⟨modeled⟩ / ⟨counterfactual · would-have⟩ (never happened, even in paper — e.g. "if reactive had been anticipated") · ⟨pre-haircut⟩ (claimed R before execution slippage) · ⟨synthetic⟩ (generated data) · ⟨live⟩ (real broker fills, when they exist). A would-have dollar next to a realized dollar, unlabeled, reads as real money — and that lie sizes positions. Per feedback_silent_failure_pattern.md R6.
Phase Ladder · 0 → 4 · 5-phase trajectory from substrate truth to autonomous portfolio · ★ Phase 1 active
PHASE 0 · CLOSED ✓
Data & measurement integrity
Make the numbers trustworthy before trusting them — every screen, file and journal now reads the same trade record from one source, so a profit can't be counted twice or quietly disappear. This was the foundation: measure honestly first, judge later.
PHASE 1 · ★ active
Race all 10 brains on the same live tape and find the one with a real, repeatable edge — proven across enough trades and different market conditions that it can't be luck. Nothing graduates until the math rules out chance.
Contenders: NOA +138.9R · WCONS +117.4R
▸
PHASE 2 · queued
Graduation candidate
Lock the winning brain's rules so they can't drift, then re-test them on data they were never tuned on — each market type on its own. The edge has to survive being frozen, not just look good in hindsight.
Gate: PROMOTION_READY
▸
PHASE 3 · queued
Execution qualification
Take the frozen brain from "good signal on screen" to "real fill on a real account," one careful rung at a time — simulator, then demo broker — measuring how much the edge shrinks at each step before any real money.
Gate: D3 reached
▸
PHASE 4 · ★ NORTH-STAR
Portfolio allocation · NOA operates
The end goal: once two or more brains have earned their place, NOA runs them as a portfolio — sizing each by market conditions while you set the risk and stop pressing buttons. The engine becomes the trader.
≥ 2 brains graduated
Pivot trigger: if no brain clears Phase 1 by month 6 (~2026-11-13) → architecture council fires.
Each phase has a hard gate. No graduation without clearing it.
●
Substrate · L1
what every brain sees — shared market reality, classified once, read by all 10 racers · 7 CARDS
Shared by construction. The substrate is computed ONCE per tick and consumed by every brain (NOA · ANT-NOA · BRO · CONS · WCONS · AGG · PA · PRECISION · LLQ · SCHL · SAVT). No brain owns its own private feed — when we say "the substrate," we mean the single source of truth all 10 racers measure against. Brain differences live downstream in Pillars, conviction, and exit policy — not here.
Order Flow
01
In plain terms. This watches what buyers and sellers are actually doing underneath each price move — who's winning, who's absorbing the other side, and whether the move has real force behind it. It's the engine's first defense against a breakout that looks real on the chart but has nothing buying or selling to back it up.
Tracks who's winning each bar — buyers or sellers — and flags when price disagrees with that pressure
Spots when big players are absorbing the other side, in four stages: building, confirmed, exhausted, reset
Three ways to know who took the trade — exchange tag, or inferred two different ways as fallback
Knows the difference between someone refilling an order and someone aggressively taking it
Prevents
Fake breakouts where price jumps but no real buying or selling backs it up.
⟨engine-order-flow.js · NQEliteAbsorption⟩
4-state FSM
Liquidity
02
In plain terms. This maps the price levels institutions actually defend — and the traps they spring when the crowd piles in. It tells a real breakout from a fake one engineered to grab stops and reverse, so the engine doesn't chase a move built to trap it.
Tells a fake breakout (price grabs stops then snaps back) from a real one (grabs stops then keeps going)
Detects failed breakouts — price pokes through a key level but can't hold above it
Watches the levels that actually matter: yesterday's high/low, today's opening range, fair-value zones
If a symbol's levels stop behaving predictably, stop firing on that symbol
Prevents
Chasing breakouts that were engineered to grab your stops and reverse.
⟨stop-hunt-panel · pipeline sweep-state⟩
±1 tick dedup
Market Profile
03
In plain terms. This tracks where the market spent time agreeing on price — and where it refused to. It shows where today's fair value sits and which way it's drifting, so the engine knows whether to expect a snap-back or a genuine shift in where price wants to trade.
Measures the first hour's range against the day's typical move — not against a fixed number
Tracks where most volume traded today and how the fair-value zone is shifting
Volume-weighted average price tracked from key moments — open, big news, session start
Yesterday's high and low survive restarts — never lose context after a reload
Prevents
Betting on a snap-back when the market is actually shifting where it wants to trade.
⟨engine-ib-* · poc-tracker · vwap-context⟩
Dalton framework
Whale & OI
04
In plain terms. This reads the footprints institutions leave that retail never sees — repeated one-sided slamming, contracts added or closed overnight, and where the real size is sitting. It keeps the engine from fighting a level a big player is actively defending.
Spots bursts where the same side is repeatedly slamming the market
Watches how many contracts institutions added or closed overnight
Sees how thick the real buy and sell orders are at each price — not just how many there are
Recognizes when a big player is actively defending a price level on a pullback
Prevents
Fighting positions that institutions are actively defending.
⟨whale-tracker · oi-tracker · aggressor-streak⟩
Quantower L2
Cross-Market
05
In plain terms. This checks whether all four index futures are pulling the same way — and who's leading the move. When NQ runs but the rest of the market quietly turns, this is what flags it before the engine trades a move the broader tape doesn't support.
Compares NQ · ES · YM · RTY every bar — reads the broader market picture
Who's pulling the move: tech, broad market, small-caps — or are they disagreeing?
Flags when one index makes a new high but its sibling doesn't — a classic warning
Detects when all four go quiet at once — often the calm before a real move
Prevents
Trading a strong NQ while the rest of the market quietly rolls over.
⟨engine-cross-index · noa-cross-market⟩
4-symbol fabric
Macro Context
06
In plain terms. This is what the big-picture market is doing to your setup right now — fear, the dollar, and the calendar of major releases. The same setup can be a green light on a quiet day and a hard no in the minutes around a Fed decision; this is what tells them apart.
Tracks fear (VIX) and the US dollar (DXY) — and what they're saying today
Knows when Fed days, jobs reports, and inflation prints are about to hit
Stops firing signals in the minutes around major news releases
Same setup can be a buy on a weak-dollar day and a no-go on a strong-dollar day
Prevents
Trading into Fed-day chaos, or fighting which way the dollar wants to go.
⟨engine-news-risk · macro_daily.json⟩
VIX · DXY · event
HTF · Volatility
07
In plain terms. This sets the bigger frame you're trading inside — how much the market typically moves today and what kind of day it is (trending, choppy, or reversing). It's what stops the engine from using calm-day tactics on a wild day, or the reverse.
Sizes targets relative to how much the market typically moves today — not against fixed numbers
Classifies the day: trending · choppy · reversing × calm · normal · loud
Knows Asian-session hours move differently and handles them separately
If you lose too much in one day, the day ends — full stop
Prevents
Using choppy-day tactics on a trending day, or the other way around.
⟨daily-risk-gate · ib-day-snapshot⟩
3×3 regime grid
◆
Pillars · L2
how the brains think — the cognitive primitives every racer inherits before adding its own conviction · 7 CARDS
Inherited, not invented. Every brain inherits these pillars — confluence weighting, setup catalog, bias arbitration, risk plan, cadence. The 10 racers differ in which pillars they emphasize, which setups they whitelist, and when they hold fire — but the pillars themselves are common ground. Memory (formerly the MI layer) folds in here as the seventh pillar.
Institutional Confluence
08
In plain terms. This is the engine's single official scorer — it weighs every piece of evidence (order flow, market structure, location, the wider context) into one number, and downgrades signals that are really just saying the same thing twice. One place owns that math, so no two screens can disagree.
One score combining four things: order flow · market structure · location · wider context
Different signals matter more for different setups — weights are tuned per setup, not globally
When most evidence agrees, the few outliers get downgraded — they can't inflate the score
One place owns the math — no parallel copies allowed to drift
Prevents
A high score that's really the same signal getting counted twice.
⟨rr-confluence · production-context⟩
minority × (1−d/140)
Setup Catalog
09
In plain terms. This is the library of trade patterns the engine is allowed to recognize — every family backed by published research, not hunches. When two patterns fire on the same trade it merges them, so the count reflects real, distinct bets rather than the same idea wearing two names.
Every family has academic research + experienced-trader literature behind it
When two setups fire on the same trade, they get merged — 15 fires really means ~10 distinct bets
A setup only counts as "proven" after at least 30 trades (fewer if research already supports it)
Plans to reweight overlapping setups are documented but locked until the data earns them
Prevents
Shipping near-identical setups under different names and pretending they're independent.
⟨setup-classification · docs/setup_theses.md⟩
10 families
Bias Arbitrator
10
In plain terms. Before anything gets sized, this decides what kind of trade it even is — with the trend or against it — and refuses the muddy middle. When the read is clean you keep the edge; the fuzzy fallback path is the expensive one, and it's measured.
For every signal, picks one of two clean paths: continuation or reversal
When the read is muddy and falls through to the backup path, trades come out about 1.6R worse on average — measured
Going long and going short use different rules — they aren't mirror images
Decides what kind of trade this is before sizing it — never after
Prevents
Mixed signals diluting what was a clean directional read.
⟨engine-bias-arbitrator.js⟩
Δ −1.6R fallback
Phase F · Risk Plan
11
In plain terms. This is the layer whose whole job is to say no — cap how much reward you chase per trade, block trades fighting the day's trend, and end the day when losses hit the limit. It exists so a great-looking setup can't talk you into an oversized bet.
Caps reward-to-risk per symbol — no lottery tickets: NQ at 2.5× max, ES at 4×
Hard-blocks trades that go against the day's trend
Hit the daily loss limit, the day ends — no exceptions
Currently advice only · enforcement activates in Automated Run
Prevents
Oversized "lottery ticket" trades that come from gut, not data.
⟨daily-risk-gate · l4-risk-advisor⟩
RR cap NQ 2.5 / ES 4.0
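A minimal sketch of how the three "say no" rules on this card could combine into one advisory check. Only the caps (NQ 2.5, ES 4.0) come from the card; the `Plan` type, `review_plan`, the trend labels and the daily-limit parameter are hypothetical names for illustration, not the contents of daily-risk-gate or l4-risk-advisor.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Reward-to-risk caps quoted on the card. Everything else below is assumed.
RR_CAP = {"NQ": 2.5, "ES": 4.0}

@dataclass
class Plan:
    symbol: str
    direction: str   # "long" or "short"
    entry: float
    stop: float
    target: float

def review_plan(plan: Plan, day_trend: str, day_pnl_r: float,
                daily_loss_limit_r: float) -> Tuple[str, Optional[float]]:
    """Return (verdict, target); verdict is 'ok', 'capped' or 'blocked'."""
    # Rule 3: daily loss limit reached, so the day is over.
    if day_pnl_r <= -abs(daily_loss_limit_r):
        return "blocked", None
    # Rule 2: hard-block trades against the day's trend.
    against = (day_trend == "up" and plan.direction == "short") or \
              (day_trend == "down" and plan.direction == "long")
    if against:
        return "blocked", None
    risk = abs(plan.entry - plan.stop)
    if risk == 0:
        return "blocked", None  # no defined risk, nothing to size against
    # Rule 1: cap reward-to-risk per symbol by pulling the target in.
    rr = abs(plan.target - plan.entry) / risk
    cap = RR_CAP.get(plan.symbol)
    if cap is not None and rr > cap:
        sign = 1.0 if plan.direction == "long" else -1.0
        return "capped", plan.entry + sign * cap * risk
    return "ok", plan.target
```

The function only returns a verdict, which matches the card's "advice only" status: nothing here places or cancels an order.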
Phase G · Additive Setups
12
In plain terms. These are four newer trade patterns, each grounded in published research rather than invention — playing index disagreement, gap retraces, overnight order build-up, and failed first breakouts. Same bar as everything else: no setup ships without real evidence behind it.
G1 · NQ and ES disagree (one makes a new high, the other doesn't) — trade the laggard
G2 · The open gaps, price retraces halfway back — fade the move
G3 · Overnight volume piles up one way — ride it into the morning (NY Fed paper)
G3b · London grabs liquidity, NY's first breakout fails — fade the failure
Prevents
Inventing setups out of thin air with no real evidence behind them.
⟨production-context setup definitions⟩
NY Fed sr917
Cadence & Pause
13
In plain terms. This is the engine's sense of when to stay quiet — limiting how often each symbol fires so the screen doesn't fill with noise. Pausing a symbol only silences the alerts; behind the scenes it keeps watching and learning. One missed signal beats a stream of noisy ones.
Limits how often each symbol and setup can fire — no spam
Pausing a symbol stops the signals on screen, but keeps the engine watching and learning
Quiet on the screen, busy in the background — the engine never stops collecting data
One missed signal is far cheaper than a stream of noisy ones
Prevents
Signal-spam that wears down your focus and your trust in the system.
⟨cadence-gate · symbol-pause⟩
silence = information
Memory · Market Intelligence
14
In plain terms. This is the engine's memory — it records the events, the episodes, and how they resolved, so any brain can ask "have I seen this before?" before committing. It turns a repeating market into evidence instead of a fresh surprise every time.
5 MI modules: events (CPI/NFP/FOMC) · episodes (high-variance windows) · outcomes (how each episode resolved) · store (durable IDB) · intelligence (the read API)
Every brain can query MarketIntelligence for context before committing — "this regime resolved BEARISH in the last 8 sessions"
10 brains on one tape — the active battlefield (Phase 1). Same substrate, same pillars, different reads on what fires. · 10 BRAINS
Same race, different reads. Every brain consumes the Substrate (L1) and inherits the Pillars (L2). They diverge on which setups they take, how strict their conviction gate is, and which regime cells they specialize in. Phase 1 asks: which racer's CI-lower clears +0.3R per trade on ≥100 fires across ≥2 regime cells, Holm-corrected across N=10? Winner snapshot freezes into Phase 2. Until then, all 10 race in shadow on the live tape.
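The Phase 1 question above can be sketched as a Holm step-down across the racers followed by the three threshold checks. This is an illustration under stated assumptions: the alpha level (0.05), the field names, and the reading that fire count and CI-lower are taken at brain level are all hypothetical; the source only fixes +0.3R, ≥100 fires, ≥2 regime cells and N=10.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down: returns one True/False per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p is tested at alpha/m, the next at alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, every larger p fails too
    return reject

def phase1_gate(brains, alpha=0.05, ci_floor=0.3, min_fires=100, min_cells=2):
    """brains: list of dicts with hypothetical keys name, p, ci_low, n, regime_cells."""
    significant = holm_reject([b["p"] for b in brains], alpha)
    return [
        b["name"]
        for b, sig in zip(brains, significant)
        if sig
        and b["ci_low"] > ci_floor
        and b["n"] >= min_fires
        and b["regime_cells"] >= min_cells
    ]
```

With ten racers, the smallest p-value has to clear alpha/10 before anything else is even considered, which is what keeps one lucky book from graduating.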
🏆 Portfolio Scoreboard · Deduplicated · current handoff truth · all 10 racers · NOA leader · AGG losing book
+54.2R portfolio net · −$51,515 on stop-dist + commissions · n=12,915
Brain | Kind | R | $ REAL | n | Phase 1 read
★ NOA | broad · self-discovery | +138.9 | +$12,659 | 660 | Real leader · NQ +100R · ES +39R · positive both symbols · Phase 1 contender
PA | | +6.1 | −$16,615 | 1,534 | R-vs-$ divergence · tier-sizing impact · NQ +56R / ES −50R · investigation flagged before any Phase 2 promotion
CONS | consensus | −1.3 | −$238 | 440 | Essentially breakeven · waiting for WCONS to graduate first
ANT-NOA | pre-arm head | −53.1 | −$12,386 | 897 | NQ −80R drags portfolio · counterfactual: strip top 5 leaks → would promote to NO_EDGE_YET
BRO | Brooks PA inheritance | −55.9 | −$12,387 | 274 | Negative both symbols · first auto-proposal: split by regime — wins trend_down, loses trend_up
AGG | broad aggressive | −97.9 | −$34,001 | 8,861 | Losing book · ES bleeds −707R · ES quarantine + regime heal shipped 2026-06-06 · counterfactual proves the gap is STRUCTURAL — leak quarantines alone don't graduate AGG
PRECISION | whitelist over AGG | new | — | 0 | 8th racer · 9 documented-winner cells (5 NQ gold mines + 4 ES survivors) · install-date floor 2026-06-09 09:30 ET
The only brains that learn the market themselves — Scholar (readable checklist) + Savant (learned representation). Train offline, ship frozen to browser. 2026-06-10: their lifetime-0 fires were a regime-bucket BUG (`_resolveRegime`→`'unknown'`→max CQL penalty→qLower −0.93), NOT "abstaining by design" — fixed (→ ProvenanceStamper). Now gates honestly; fires in trend/expansion. Still data-gated for retrain (≥6 RTH days, have 2).
★ NOA
B1
In plain terms. The fleet's lead brain and its broadest — it hunts edge across every kind of setup rather than specializing, and discovers new patterns on its own instead of waiting to be told what to trade. Right now it's the only racer making money on both NQ and ES.
Phase 1 contender — needs CI-lower to clear +0.3R on ≥2 regime cells
⟨engine-noa-*.js · 32 modules⟩
Phase 1 contender
★ WCONS
B2
In plain terms. This one trades no setups of its own — it listens to all the other brains and takes a weighted vote, trusting each in proportion to how well it's done lately. A "wisdom of the crowd" brain that leans hardest on whoever's currently hot.
+117.4R · +$11,452 · n=249
NQ +90R · ES +27R · positive both
First Holm-significant cell in the fleet: trend_up n=63 ci_low +0.166R (p<0.001)
Phase 1 contender alongside NOA
⟨engine-wcons-*.js⟩
Phase 1 contender
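One way to picture the difference between a weighted vote (WCONS) and the plain equal vote used as the control (CONS). This is a sketch only: the weighting rule (recent mean R, floored at zero), the 0.5 threshold and every name below are assumptions, not the logic in engine-wcons-*.js or engine-cons-*.js.

```python
def _decide(score, threshold):
    # Map a score in [-1, 1] to +1 (long), -1 (short) or 0 (stand aside).
    if score >= threshold:
        return 1
    if score <= -threshold:
        return -1
    return 0

def weighted_vote(votes, recent_r, threshold=0.5):
    """votes: brain -> +1/-1/0. recent_r: brain -> recent mean R (hypothetical input)."""
    # A brain with no recent edge (mean R <= 0) gets zero weight.
    weights = {b: max(recent_r.get(b, 0.0), 0.0) for b in votes}
    total = sum(weights.values())
    if total == 0:
        return 0  # nobody has earned trust lately, so do nothing
    score = sum(weights[b] * v for b, v in votes.items()) / total
    return _decide(score, threshold)

def equal_vote(votes, threshold=0.5):
    """Control: every brain counts the same."""
    if not votes:
        return 0
    return _decide(sum(votes.values()) / len(votes), threshold)
```

For example, with votes `{"a": 1, "b": -1, "c": -1}` and recent mean R `{"a": 0.5, "b": 0.1, "c": 0.0}`, the weighted vote goes long because brain `a` carries most of the weight, while the equal vote stands aside.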
PA
B3
In plain terms. A pure price-action reader — it trades off the shape and quality of the bars themselves (clean structure, second-entry pullbacks), not order flow or news. The busiest brain on the tape, but its dollars and its R disagree sharply — flagged for a look before it can advance.
+6.1R but −$16,615 · severe R-vs-$ divergence
NQ +56R · ES −50R — symbol asymmetry
n=1,534 — most-fired discretionary read
Sizing-architecture investigation flagged before Phase 2
Real conviction tier: fusion-bias agreement + regime alignment grade each fire 1–4 (was hardcoded tier-2 on every fire, polluting the band tape-wide)
CONS
B4
In plain terms. The control experiment — a plain, equal vote of all the brains, nobody weighted. It exists to prove the weighted version (WCONS) is actually earning its keep: if simple averaging worked just as well, the weighting would be pointless.
−1.3R · −$238 · n=440
Essentially breakeven · the "no edge if we just average everyone" baseline
WCONS proves the weighting is doing real work · CONS proves it's necessary
Not a Phase 1 candidate — exists as the methodological control
⟨engine-cons-*.js⟩
control · breakeven
Anticipation Spine · 6 arms
B5
In plain terms. Six brains that try to get in early — arming a trade just before the signal fully forms, each specialized in a different read (price action, Brooks, order flow, aggressive, consensus). Five of the six make money together; the original lead arm is the one that bleeds.
BRO
B6
In plain terms. Trades the classic Al Brooks price-action playbook — bar-by-bar reading, second entries, wedges, final flags. The catch: Brooks's hand-written odds are treated as starting assumptions, not facts, and they haven't survived the post-2025 market — it loses on both symbols.
−55.9R · −$12,387 · n=274
Negative both symbols · Brooks priors don't survive 2025+ regime
First auto-proposal: split by regime — wins on trend_down (+0.39R), loses on trend_up (−0.52R)
Brooks's explicit probabilities = Bayesian priors, not facts
⟨engine-noa-brooks.js v0.3 · 39 videos ingested⟩
Bleeding · regime-split candidate
AGG
B7
In plain terms. The volume brain — it fires on every setup that clears the minimum bar, holding nothing back. That makes it the busiest and the biggest loser: it bleeds heavily on ES, and the gap is structural, not just a handful of bad setups.
PRECISION
B8
In plain terms. A sharpshooter built on top of AGG — instead of firing on everything, it only takes the handful of specific setup-and-symbol combinations that have actually proven to win. Brand new, still gathering its first trades.
LLQ
B9
In plain terms. A single-question experiment: does AGG do better if it only trades in thin, low-liquidity conditions — the one market state where its edge tested positive? It's AGG with one regime filter, there to settle that question.
Asks: does AGG-restricted-to-low_liquidity outperform AGG-broad?
Built on the Forge's Holm-significant edge cell: low_liquidity n=3540 ci_low +0.077R
Same install-date floor as PRECISION
Hypothesis instrument — tests whether regime-filter alone is enough
⟨engine-shadow-book-llq.js⟩
Accumulating · regime test
SCHL + SAVT · The Self-Learning Brains
B10
The only brains in the fleet that learn the market themselves — no setup catalog, no hand-coded pillars, no human-written rules. They look at the data and discover their own edge.
Every other brain (NOA · BRO · AGG · PA · etc.) trades setups we designed. SCHL + SAVT design their own.
SCHL — the Scholar: discovers edge as a readable checklist. We can audit what it learned and why. Asks: can edge be NAMED?
SAVT — the Savant: discovers edge as a learned feature representation. Cannot explain itself in words — it just feels the pattern. Asks: can edge be FELT?
Two paradigms, one race — they're testing whether trader edge is something we can name or only something the machine can feel.
Safe by construction: trained wild offline in Python; a frozen snapshot ships to the browser. They learn in the lab, never on live capital.
Data-gated for retrain (≥6 RTH session-days of arena tapes, have 2). They'd rather stay silent than fire on a half-trained policy.
⚠ 2026-06-10 bug fix: their lifetime-0 fires were NOT "abstaining by design" — a regime-bucket bug (`_resolveRegime` returned `'unknown'`, a key absent from the CQL table) pinned the conservatism penalty at max (1.0) → qLower ≈ −0.93 on every candidate, strangling a validated +0.087R edge. Fixed (→ `ProvenanceStamper.getRegimeKey` + `regimeAtFire` stamp); qLower −0.93 → +0.0147. They now gate honestly and fire in trend/expansion where qHat clears the floor.
🧊 Cold · CALIBRATION — shadow capture, no signals shown. A tool, not the default.
🔥 Hot · NOW · HOT — every brain fires real signals visibly, Paper Trader runs on screen. No real account.
🤖 Automated · LIVE — graduated brain executes against real broker (Phase 3 ladder).
⬠
The Lab + The Forge · L4
how the system discovers, governs, and calibrates itself — the Lab measures, the Co-Pilot watches and acts, the Forge stamps + ranks findings, the Constitution fences the layers, and the calibration loop grounds the roster · 15 CARDS
Three instruments, one loop.
The Lab (docs/research_plan.html) is where measurement happens — three tabs (Measure · Optimize · Discover) read the deduplicated truth and answer "what does the data say this week?"
The Co-Pilot (docs/copilot.html) is the 6-faced sentinel — Hunter / Cross-Brain find leaks · Scout / Oracle find rising edges · the scanner self-schedules at 09:25 + 16:05 ET weekdays + a Saturday-night discovery slot (14:05 ET) · the AUTO watcher acts on graded-eligible findings (money-path permanently blocked).
The Forge (docs/optimization.html) is where findings get persisted — cell-significance scans run with Holm-Bonferroni + session-clustered bootstrap CI, lifecycle stamps each finding (NEW / PERSISTENT / CONFIRMED / DROPPED), counterfactual ranking tests "would stripping this leak graduate the book?", auto-emit ships paste-ready quarantine patches + specialist-book stubs, and the Synthesizer composes regime-route policy candidates against the OOS gauntlet.
The Discovery Lab (claude/auto_audit.py + lab_investigator.py + lab_features.py) is the autonomous quant researcher — it invents thousands of hypothesis cells per sweep across 30 dimensions (18 recorded — incl. the standing watches: news event/phase/aftertaste, roll window, institutional alignment, payout-multiple — + 12 derived senses: MFE/give-back/round-trip/heat/life/risk/RR bands, cross-book agreement, follower, regime alignment, streak context), screens them through BH-FDR + parent-lift, tortures survivors with an 11-test interrogation battery (strip-best, day-consistency, R-vs-$ money-truth, composition, cost floor…), grades each with a verdict ladder (CONFIRMED→DEAD) + evidence score 0–100, auto-freezes forward tests, runs the operator's question-templates forever, and briefs the Co-Pilot in desk language. Two gears: sentinel daily (forward-test health + edge-rot, quiet unless burning) · full discovery every Saturday night (findings land while the market is closed).
It discovers, interrogates, and talks — it wires nothing. The system has memory of its own discoveries and a sentinel watching them.
🧪 The Lab · docs/research_plan.html · measurement + research + discovery surface · 5 cards
Measure Tab · Standings
L1
In plain terms. The weekly report card — it ranks every brain on how it's actually doing, broken down by market condition, and only calls an edge real once the statistics clear a strict bar. This is where you look to answer "who's winning, and is it real?"
Per-book aggregate: n, mean R, PF, win rate, session-clustered bootstrap CI
9-cell heatmap: regime × outcome — colors graduate by sample density
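A minimal sketch of a session-clustered bootstrap CI for a book's mean R, assuming a percentile interval and fires given as `(session_id, r)` pairs. The function name, the resample count and the seed handling are illustrative choices, not the Lab's actual implementation. The point of clustering is that whole sessions are resampled, so fires from the same session move together instead of being treated as independent draws.

```python
import random
from collections import defaultdict

def session_clustered_ci(fires, n_boot=2000, alpha=0.05, seed=0):
    """fires: list of (session_id, r). Returns (ci_low, ci_high) for mean R."""
    by_session = defaultdict(list)
    for session_id, r in fires:
        by_session[session_id].append(r)
    sessions = list(by_session.values())
    if not sessions:
        raise ValueError("no fires to resample")
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        total, count = 0.0, 0
        # Draw as many sessions as we observed, with replacement.
        for _draw in range(len(sessions)):
            picked = rng.choice(sessions)
            total += sum(picked)
            count += len(picked)
        means.append(total / count)
    means.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return means[lo_idx], means[hi_idx]
```

With only a handful of sessions the interval gets wide quickly, which is consistent with the atlas treating small-sample cells as unproven.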
In plain terms. The workshop view — it shows the live race, how close each brain is to graduating, and which leaks are worth fixing first. From here you jump to the Forge for the deep dig.
The Race: top quarantine candidates ranked by counterfactual ΔR — promotion-ready first
Graduation Ladder: per-book %s derived from verdict tier + would-promote flag
Calibration Bay: per-book aggregate + top 6 leaks + top 4 experiments + living loop
🔧 Links to the Forge (`docs/optimization.html`) for deeper hyp drill-down
ARENA Phase 2(b) LIVE (2026-06-08): confirmation × exit-policy counterfactual cells from resolver_vm matrix replay across 36k ARM tapes — all 5 structural anticipation arms (ANT-PA · AGG · CONS · BRO · OF), brain-matched (each brain's confirmations replay only on its own tapes → genuinely distinct cells, e.g. PA +0.07/−0.03 vs AGG +0.04/+0.01). CI-backed. ANT-NOA out (learned pre-arm path). Historical cells use midPx-synthesized bars; live v1.1 capture adds engine bars + the 4 flow confirmations going forward.
In plain terms. The "what did we get wrong" tab — it ranks the week by where the system was most confidently mistaken, so the biggest surprises rise to the top instead of hiding inside the average.
Surprise rank: rank everything by how wrong our prior was — biggest deltas first
Confidence × wrongness: high-conviction misfires get the loudest signal
Regime mix shifts: detect when the distribution moves under us
Blind spots: regimes where n is too small to trust either way
Live now — was rendering fixture data for an unknown period (caught 2026-06-06)
In plain terms. A set of five research tools whose only job is to find what we don't yet know — what surprised us, where we're blind for lack of data, and which next trade would teach us the most. It proposes; it never trades.
surprise_rank.py · what was most surprising vs prior
blind_spots.py · regimes / setups with insufficient evidence
auto_experiment.py · pre-registered shadow tests, no fishing
unknown_clusters.py · clusters of unnamed edge
active_learning.py · what next fire would teach us most
Fail-closed below the 10-session floor · discovery proposes, pre-reg disposes
In plain terms. The autonomous researcher — it dreams up thousands of possible edges, assumes each is fake until proven otherwise, puts the survivors through a brutal test battery, and reports back in plain desk language. It investigates and talks; it changes nothing on its own.
Hypothesis generator: ~4,100 cells/sweep over 21 dims (9 recorded + 12 derived senses incl. cross-book agreement + streak) + 60 seeded deep probes
Screen: BH-FDR (q=0.10) + parent-lift — a child cell must BEAT its parent or it's the parent's edge in a costume
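The screen above can be sketched as a Benjamini-Hochberg pass at q=0.10 followed by a parent-lift filter. The source does not say how "beat" is measured, so comparing mean R is an assumption, as are the dict keys `p`, `mean_r` and `parent_mean_r`.

```python
def bh_fdr(p_values, q=0.10):
    """Benjamini-Hochberg: returns one True/False per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Find the largest rank whose p-value sits under the BH line.
        if p_values[idx] <= rank / m * q:
            k_max = rank
    keep = [False] * m
    for idx in order[:k_max]:
        keep[idx] = True
    return keep

def screen(cells, q=0.10):
    """Keep cells that survive BH-FDR AND beat their parent (assumed: on mean R)."""
    passed = bh_fdr([c["p"] for c in cells], q)
    return [
        c for c, ok in zip(cells, passed)
        if ok and c["mean_r"] > c["parent_mean_r"]
    ]
```

A child cell that is significant but no better than its parent is dropped at the second step: that is the "parent's edge in a costume" case.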
🐙 The Co-Pilot · docs/copilot.html · 6-faced sentinel — finds problems, recommends fixes, acts in one click · 1 card
🐙 The Octopus · 6 faces
CP
In plain terms. The watchdog that doesn't just measure the brains — it hunts for problems and acts. Six "faces": two sniff out leaks, two spot edges that are rising, one runs the scans on a schedule, and one is allowed to act on findings that have earned it — never anything touching real money.
🔬 The Forge · docs/optimization.html · instrument with memory — scan, classify, persist, surface, compose · 8 cards
Cell-Significance Scanner
F1
In plain terms. The lie-detector for edges — it takes every brain-and-condition slice and asks whether its profit could just be luck, using strict statistics and a held-back chunk of data to confirm the edge survives out of sample.
Holm-Bonferroni across the family of cells per book — multiple-testing honest
Session-clustered bootstrap for the CI — same-session fires aren't independent
Walk-forward holdout (2026-06-07) — newest 30% of fires by ts held out; each hyp gets holdout_passes. First scan caught a NOA edge_cell overfit (train +0.388R → holdout −0.659R, sign flipped)
Regime-rotation expectancy (2026-06-07) — regime_weighted_meanR + regime_weighted_usd_per_week; cells whose edge sits in now-rare regimes get marked down
Temporal stability + $-impact ranking — surface the load-bearing leaks first
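The walk-forward holdout described in this card can be sketched as a time-ordered split with the newest 30% of fires held back. The pass rule here (train and holdout means share a sign) is an assumption chosen to reproduce the sign-flip case the card mentions; the scanner's real `holdout_passes` criterion is not given in the source. The fire dict keys `ts` and `r` are also illustrative.

```python
def walk_forward_split(fires, holdout_frac=0.30):
    """Sort by timestamp and hold out the newest fraction of fires."""
    ordered = sorted(fires, key=lambda f: f["ts"])
    n_hold = int(round(len(ordered) * holdout_frac))
    cut = len(ordered) - n_hold
    return ordered[:cut], ordered[cut:]

def mean_r(fires):
    return sum(f["r"] for f in fires) / len(fires)

def holdout_passes(fires, holdout_frac=0.30):
    train, hold = walk_forward_split(fires, holdout_frac)
    if not train or not hold:
        return False  # fail closed when either side is empty
    # Assumed rule: the edge must keep its sign on data it was not found on.
    return mean_r(train) * mean_r(hold) > 0
```

A cell that averages positive in training and negative in the holdout returns False here, which is the same shape as the overfit the first scan caught.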
In plain terms. The Forge's memory — every edge it finds gets tagged (new, holding, confirmed, or dead) so the system remembers what it discovered and notices the moment an edge stops being true.
Lifecycle: 🆕 NEW (absent from prior) · 🔁 PERSISTENT (in prior + current, < 3 dates) · ✓ CONFIRMED (≥3 distinct ET dates) · ❌ DROPPED (in prior, gone from current)
Drift status (2026-06-07): on first sighting, snapshot install_meanR / install_ci_low / install_usd_per_week. Each later scan computes sign-preserving drift_ratio against the snapshot
STABLE (≥0.95) · SLIPPING (0.7–0.95) · DRIFTING (<0.7) · INVERTED (sign flipped) · DEAD (CI now brackets zero) · NEW (no baseline)
Loudest alarm = CONFIRMED + DEAD — was real for ≥3 scan-dates, significance now lost (the "silently dying edge" case)
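The lifecycle and drift labels above can be sketched as two small classifiers. The thresholds are the ones listed on this card; the order in which the checks run, the treatment of a zero baseline as "no baseline", and the function and argument names are assumptions.

```python
def lifecycle(in_prior, in_current, distinct_dates):
    """NEW / PERSISTENT / CONFIRMED / DROPPED, following the card's definitions."""
    if in_prior and not in_current:
        return "DROPPED"
    if in_current and not in_prior:
        return "NEW"
    if in_prior and in_current:
        return "CONFIRMED" if distinct_dates >= 3 else "PERSISTENT"
    return None  # never seen, nothing to stamp

def drift_status(install_mean_r, current_mean_r, ci_low, ci_high):
    """Compare the current mean R with the snapshot taken at first sighting."""
    if install_mean_r is None or install_mean_r == 0:
        return "NEW"  # no usable baseline (zero baseline treated the same: assumption)
    # Sign-preserving ratio: positive while the edge keeps its original sign.
    ratio = current_mean_r / install_mean_r
    if ratio < 0:
        return "INVERTED"
    if ci_low <= 0 <= ci_high:
        return "DEAD"
    if ratio >= 0.95:
        return "STABLE"
    if ratio >= 0.7:
        return "SLIPPING"
    return "DRIFTING"

def is_loudest_alarm(lifecycle_label, drift_label):
    # The "silently dying edge" case named on the card.
    return lifecycle_label == "CONFIRMED" and drift_label == "DEAD"
```

Because the ratio divides by the install value, a leak (negative install mean) that stays negative still reads as a positive ratio, so the same bands work for edges and leaks.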
In plain terms. Asks the "what if we fixed this" question — if we stopped a brain from taking its worst leak, would it actually graduate to a better grade? Ranks the leaks by how much fixing each one would help.
Per-leak: simulates strip + recomputes aggregate ci_low via session-clustered bootstrap
In plain terms. Finds the trades that are pure noise — lots of volume, zero edge either way. Not losers exactly, just dead weight that piles on risk and cost without adding any profit.
The OPPOSITE pattern from edge/leak — not bleeding, not edge, just commission burn + variance source
Body text projects commission burn over disk window
Independent of Holm — these don't compete for significance, they're confirmed null
First-run: 0 cells qualify (CIs too wide at 7 sessions) · framework in place for when disk grows
⟨cell_significance._classify_dead_weight()⟩
5th classification
Auto-Emit · Patches + Stubs
F5
In plain terms. Turns a finding into ready-to-paste code — closing the gap between "we spotted a leak" and "we shipped the fix." The Forge doesn't just point at the problem; it hands you the patch.
In plain terms. A one-brain X-ray — slice it any single way (by setup, by hour, by market type) and see exactly what makes that brain win or lose, value by value.
6 inner-ring conditions: setupId · regime · symbol · direction · convictionTier · exit
In plain terms. The Forge's alert light — it waves when something new shows up and keeps a memory of what faded away, so a real find isn't missed and a dead one isn't chased twice.
Top-right chip: 🔬 polls /discoveries every 5 min
Amber pulse when N new hyps since last seen scan_date
Green steady when confirmed hyps exist · grey faint when idle/offline
Click → opens Forge in new tab + clears the alarm (last-seen scan_date stamp)
Dropped archive panel below hyp grid: last 15 hyps with kind, $/wk, lifecycle, stability
⟨assets/js/engine-discoveries-chip.js · /hyp_action POST · dropped.jsonl⟩
5-min poll · 4 states · archive
★ The Synthesizer · 5-stage gauntlet
F8
In plain terms. Doesn't just pick a winning brain — it builds a combined strategy that routes to different brains by market condition, then puts that blended result through a tough out-of-sample gauntlet before it can be called ready.
Stage 1: per-cell stats with session-clustered bootstrap CI on resolved fires (scanner quarantines applied)
Stage 2: regime route — per (sym × regime) pick the book with highest CI-lower at n≥30 + mean R ≥ +0.05R
Stage 3: correlation dedup — buckets aligned fires by (sym, ~5-min, direction) and flags pairs aligning ≥3×/week
Stage 4: OOS gauntlet on the COMPOSED stream — walk-forward + session-clustered CI on the OOS pool, not on the parts
Stage 5: frozen artifact mirroring the Cerberus policy schema generalized
First live run: 14,828 outcomes / 21d → 119 cells → 6 routes → 396 composed historical fires → 100% folds positive → OOS CI-lower +0.626R per trade → PROMOTION_READY
TRADABLE CUTOVER (2026-06-10): the composer now routes only on HOLDABLE fires — each book as a real netting account (≤1 position per FAMILY risk slot — NQ/MNQ one NASDAQ slot, ES/MES one SP slot), via shadow_tape.tradable_per_book (TRADABLE_ONLY). Routing on the raw stacked tape promoted books whose edge is un-holdable fantasy.
The rerank: 9 → 7 routes. AGG (raw +1037R but tradable −334R — fires 16,921×, 730 holdable, those lose) and PRECISION (+1288R → +52R, 96% fantasy) REMOVED from the strategy. New routes → ANT-OF · ANT-PA · ANT-CONS · WCONS (the tradable leaders; WCONS ×3). Still PROMOTION_READY, OOS CI-lower +0.550 — the edge survives the honesty filter, the fantasy doesn't.
⚠ Caveat: pre-haircut. PROMOTION_READY = backtest verdict, not deployment authorization. Phase 1 Trust Calibration measures what survives a real broker — see L6.
Independently corroborated 2026-06-10: the Twin Timing Edge instrument (`claude/twin_timing_edge.py`) — a matched-pair method that pairs each reactive brain with its anticipation twin on the SAME setup (via `coFireClusterId`) and measures ΔR on the honest co-fired subset — found the SAME routing the synthesizer did (route to ANT arms where they win; PA/ANT-PA +0.45R [+0.29,+0.60] n=409). ZERO conflicts on diff. Two independent methods converge. Reactive brains kept as the control arm; the twin-table is a corroborator/watchdog, NOT an override engine. Finding: edge is a better entry RULE not earlier timing; conditioning flips signs (NOA loses overall but wins +0.83R in expansion).
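Stage 2 of the gauntlet (the regime route) can be sketched as a simple selection over per-cell stats. The thresholds n ≥ 30 and mean R ≥ +0.05R are from the text; the cell field names (`book`, `sym`, `regime`, `n`, `mean_r`, `ci_low`) are assumptions for illustration.

```python
def route_regimes(cells, min_n=30, min_mean_r=0.05):
    """Per (sym, regime), pick the book with the highest CI-lower among eligible cells."""
    best = {}
    for c in cells:
        if c["n"] < min_n or c["mean_r"] < min_mean_r:
            continue  # not eligible to route
        key = (c["sym"], c["regime"])
        if key not in best or c["ci_low"] > best[key]["ci_low"]:
            best[key] = c
    return {key: c["book"] for key, c in best.items()}
```

Selecting on the CI lower bound rather than the mean favors books whose edge is well supported by the sample, not just large on average.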
In plain terms. The honesty check on profit — a single brain is a signal lab, and adding up overlapping signals as P&L is a lie a real account could never have held. This counts only the trades you could actually have taken, in real dollars.
The rule: ≤1 position per FAMILY risk slot (NQ/MNQ = one NASDAQ slot, ES/MES = one SP slot — a micro is a size rung of the slot, never a second position; risk-slot law 2026-06-12) · no hedge. Interval-overlap netting walk over the time-ordered fires.
One tape, two projections: raw (every fire = edge sample, for discovery) + tradable (what an account could hold, for trust/routing). The raw tape is untouched.
The lie quantified (per-book raw → tradable): PRECISION +1288 → +52 (96% fantasy) · AGG +1037 → −334 (FLIPS NEGATIVE) · NOA +217 → +22 · WCONS +166 → +131 (most honest, 21% blocked). It cuts both ways — PA/BRO score HIGHER tradable (stacking HID their edge). It reranks the race.
SSOT lockstep: BookCanon.tradable (JS, live cards) ↔ shadow_tape.tradable (Python, canonical — reads real direction + resolveTs). The cutover feeds the Synthesizer (F8).
netting projection · AGG +1037→−334 · reranks the race
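The interval-overlap netting walk can be sketched for a single book as below. The family mapping follows the rule stated above; the fire fields (`sym`, `ts`, `resolve_ts`, `r`) and the choice to let a new fire open at the exact moment the previous one resolves are assumptions.

```python
FAMILY = {"NQ": "NASDAQ", "MNQ": "NASDAQ", "ES": "SP", "MES": "SP"}

def tradable_projection(fires):
    """Keep a fire only if its family risk slot is free when it opens."""
    slot_free_at = {}   # family slot -> time the held position resolves
    held = []
    for f in sorted(fires, key=lambda f: f["ts"]):
        slot = FAMILY[f["sym"]]
        if f["ts"] >= slot_free_at.get(slot, float("-inf")):
            held.append(f)
            slot_free_at[slot] = f["resolve_ts"]
        # else: blocked, a netting account could not have held it
    return held

fires = [
    {"sym": "NQ",  "ts": 0,  "resolve_ts": 10, "r": 1.0},
    {"sym": "MNQ", "ts": 5,  "resolve_ts": 8,  "r": 2.0},   # blocked: NASDAQ slot busy
    {"sym": "ES",  "ts": 5,  "resolve_ts": 12, "r": -0.5},
    {"sym": "NQ",  "ts": 10, "resolve_ts": 15, "r": 0.5},
]
raw_r = sum(f["r"] for f in fires)
tradable_r = sum(f["r"] for f in tradable_projection(fires))
```

The raw list is never modified, which matches the "one tape, two projections" idea: the raw sum and the tradable sum are two reads of the same fires.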
🧬 Meta Thesis Engine
MB-T
In plain terms. Shifts the unit of thinking from "a trade" to "a thesis" — one living directional idea per market, born from the evidence, scored continuously, and killed when it stops being true. Born, lives, dies, learns.
The hard bound: ≤1 open position per FAMILY risk slot (NQ/MNQ, ES/MES — micros are size rungs of the slot, never a second position) · no hedge, intra-ticker or across the family. Without it the shadow P&L logs fills a real netting account could never take — a gross lie. Caught live: ES 8 open shorts / NQ 3.
Thesis Health: every brain ±, regime ±, EMA-smoothed. Reversal = flip only when the incumbent thesis is DEAD and the opposite is born. Exit-on-death = close when the idea dies, not only on stop.
Honesty (Pillar IX): below the n=8 floor → "I don't know" → death/reversal DISABLED. Structure now, tune after data (the PF-0.92 caveat baked in).
Ride free off the object: Belief State (bull/bear/neutral %) · Conviction · Uncertainty · Thesis Journal · Self-Calibration.
Surfaced: the Meta Brain · The Mind page (cognition lens) — Belief Map, Conscience (S7), Stream of consciousness. Plus the Meta Brain Trader Journal (the autonomous trader's logbook, calendar-first).
DECISION REPLAY (2026-06-12): every Meta commit ships a DECISION record (book v0.5.0) — chosen + the rejected field reconstructed from a 90s candidate buffer + decision stability (did the lead churn NQ→ES→NQ?) + the load-bearing argument + prosecutor override. Graded OFFLINE by decision_grader.py: allocationDelta = chosenR − bestHoldableRivalR (rivals replayed on the 1m bar tape, ⟨counterfactual⟩) — "good pick" vs "left money". This is Allocation Score v2: the per-decision grade vs the actual field. Surfaces in the journal row expand (why this, not that, how steady) + the Lab (DECISION_QUALITY: torn fields = stand-down candidate). Decision Families groups graded decisions by (regime × newsPhase × competition × stability) → the bleeding family is a pattern of mistakes, not a single bad trade. Governed by charter law:decision_metric_governance — no decision metric outranks real net tradable P&L (Goodhart guard).
one thesis per ticker · exit-on-death · ≤1 position/family slot · decision replay
The Constitution · Influence Boundaries
§
In plain terms. The wiring rulebook, enforced by the build itself — it defines which layer may influence which, and if any part tries to reach somewhere it shouldn't, the build simply fails. Guardrails with teeth.
check_constitution.py scans 38 core files / 4 layers → 0 violations; --self-test proves the fence has teeth
Amendment test: an idea may bend a rule freely but never break a pillar (truth · no silent mutation · judgment-tied-to-evidence · operator-owns-money · auditable)
★ THE CHARTER · Capital Allocation Under Uncertainty
⚖
In plain terms. The one law above all others: the product isn't signals or trades — it's growing capital wisely across two real slots (one position per market, max), and protecting it when there's no real edge to deploy. Everything else is an instrument serving this.
One judging question for ALL work: does it improve how limited capital is allocated — or the trustworthiness of allocation measurement? Neither → backlog
Objective: net tradable $ per DAILY risk budget · constraints-as-law (not ratios) · NQ+ES same-dir = ONE risk unit (~0.9 correlated) · sizing = a discrete rung ladder
Gates G0–G4 with advance AND kill criteria at every rung · G0 now (paper supremacy, span-guarded) · G1 blocked on the operator's broker-connector decision
Instruments: regret ledger (flat scores 100 only when the tape agrees) · Allocation Score + selector drift · gate odometer (reads the law) · time-to-pay prior ($/slot-hour) · prosecutors (mute briefs — authority is EARNED) · black-swan sentinel (advisory, never flattens)
Lab re-chartered: every bus finding gate-impact stamped (G0 / measurement-trust / backlog) · deployment questions outrank entry edges
In plain terms. The loop that turns a noisy pile of brains into a trustworthy roster the Meta Brain can lean on — scan, decide, apply, check the change actually stuck, then audit itself.
The cell is the unit (brain × setup × regime × symbol), never the brain
Reactive vs anticipated is the biggest lever — but mind provenance: the "+$190k" figure is ⟨MODELED · counterfactual would-have⟩, AND it's book-vs-book (selectivity-inflated). The realized spine number is +$29k (shadow paper-book), not $190k of real money. The HONEST matched-pair ΔR (claude/twin_timing_edge.py, co-fired subset only) is +0.45R/trade on PA and FLIPS by regime (NOA loses overall, wins only in expansion). Don't size off the $190k.
EV-ranking, not win-rate (Brier lies on asymmetric payoff) + the cost floor (~0.09R commissions — a thin positive R isn't edge)
STABLE gate: never auto-cut INSUFFICIENT_WINDOWS · R1–R6 codified in calibration_policy_v1.json (the autopilot seed)
Autonomous-hand safety (designed, gated): verify-applied · dead-man's-switch watchdog (silence = alarm) · Telegram escalation to the operator's phone
what protects the race: observability, kill-rules, EOD discipline, chip-as-alarm doctrine, propagation hygiene · 8 CARDS
Three Timing Frames
23
In plain terms. Three clocks that must agree before a trade commits — the big-picture clock (what kind of day is it), the signal clock (is a setup forming), and the execution clock (commit on bar close). One commit, three layers of timing checked.
Big-picture clock · what kind of day is it · how loud · where is value moving
Signal clock · is a setup arming inside that bigger picture?
Execution clock · commits on bar close, with safeguards against flicker
The engine has the execution clock today · the next phase makes all three explicit
Prevents
Pulling the trigger without checking what the bigger timeframe is doing.
⟨decision-clock · render-stabilizer⟩
3 clocks · 1 commit
One Source of Truth
24
In plain terms. A rule, not a feature — every screen reads the same official engine number, never a convenient copy. It's what stops two parts of the app from quietly showing different values for the same thing.
Every number on every screen reads from the same source the engine itself uses
Built into the architecture — not a "be careful" rule that can be forgotten later
Four real bugs were caught the one time a redesign tried to shortcut around it
A blank panel beats a wrong one
Prevents
Screens that look right and feel right, but quietly show the wrong number.
⟨buildLiveEngineExplanation() · pattern⟩
4 bypass bugs caught
Build Phase · Lock Lifted
25
In plain terms. The current operating stance: the engine is being built, so we add aggressively and filter later rather than guarding every change. Data from before this point was declared invalid and isn't trusted.
Lock lifted after structurally broken engine was identified — prior data invalidated
Build aggressively: new modules, new brains, new whitelist books — all welcome
The Forge filters later — cell-significance + Holm + lifecycle do the curation
Lock returns only when bridge data returns and operator confirms — not a calendar date
Prevents
Premature freezing of an architecture that hasn't found its edge yet.
⟨MEMORY.md · 2026-05-22 lift⟩
BUILD PHASE active
The Kill Rule
26
In plain terms. When the evidence says a feature doesn't work, it gets removed — not "tuned harder." A discipline against endlessly tweaking something the data already says is dead.
Every proposed fix carries a written "when do we kill this idea" clause — before it ships
Each candidate has an explicit, measurable kill condition
If four weeks of fresh data don't support it, it's dropped — not patched or rescued
Applies to the AI layer too — same rule, same standard, no favorites
Prevents
Keeping features alive out of attachment, after the data has already killed them.
⟨ROADMAP §LATER S1–S12 · noa doctrine⟩
12 kill conditions
EOD Killswitch
27
In plain terms. No brain is allowed to hold a trade past the market close — full stop, anchored to real exchange time and wired into every part that can close a position. Stops overnight risk getting carried by accident.
16:00 ET force-flatten: every open shadow position closes at last-valid R (exit='eod_flat')
No opens between 16:00 and 18:00 ET, nor at any time on weekends
Wired into all 6 resolution owners + Paper Trader — flatten-guard + open-gate per brain
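A minimal sketch of the flatten-guard and open-gate described above. It assumes the caller passes a wall-clock `datetime` already expressed in exchange time (ET); the position fields (`exit`, `last_valid_r`, `r`) and function names are hypothetical.

```python
from datetime import datetime, time

FLATTEN_AT = time(16, 0)   # 16:00 ET force-flatten
REOPEN_AT = time(18, 0)    # opens allowed again from 18:00 ET

def opens_allowed(now_et):
    """Open-gate: no new positions 16:00-18:00 ET or on weekends."""
    if now_et.weekday() >= 5:          # Saturday = 5, Sunday = 6
        return False
    return not (FLATTEN_AT <= now_et.time() < REOPEN_AT)

def eod_flatten(positions, now_et):
    """Flatten-guard: inside the closed window, close every open position."""
    if opens_allowed(now_et):
        return positions
    out = []
    for p in positions:
        if p["exit"] is None:
            # Close at the last valid R and tag the exit reason.
            p = {**p, "exit": "eod_flat", "r": p["last_valid_r"]}
        out.append(p)
    return out
```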
In plain terms. One official list of which setups are retired — replacing three separate lists that used to disagree and caused a real bug. Now there's a single source for "is this thing benched."
Merged read across LES + PA — one call, normalized cell shape (book / sym / setupId|regime / source / reason / ts)
Metadata layer persisted to localStorage.nqelite.quarantine_registry.metadata.v1
Subscribe events on add / lift — Analyst + Actions + UI all see the same state
V1 additive — LES + PA still own their lists; registry normalizes the seam
Prevents
5-way quarantine-state divergence — the audit's R3 critical.
In plain terms. Lets new listeners plug into the pipeline without editing the money-path file — the pipeline announces once, and anything that wants to listen subscribes itself. Keeps the dangerous core file untouched as the system grows.
publish('decision_tail', {ctx, decs, sym, d}) — one call from pipeline.js per tick
LES + MetaBrain self-register via subscribe('decision_tail', handler) at boot
CanonicalEngineState stays direct — pipeline reads its return value into ctx
Subscriber contract: read-only on payload (Meta Brain's autonomy goes through formal select → MetaDecision → OMS, not in-pipeline mutation)
Prevents
Pipeline.js becoming the implicit observer registry. The audit's R11.
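The publish/subscribe seam can be illustrated in a few lines. The engine's version lives in JavaScript; this Python sketch only mirrors the shape of the contract (one publish call, self-registering subscribers, read-only payload) and uses `MappingProxyType` as a stand-in for whatever enforcement the real bus applies.

```python
from collections import defaultdict
from types import MappingProxyType

_subscribers = defaultdict(list)

def subscribe(topic, handler):
    """Listeners register themselves; the publisher never names them."""
    _subscribers[topic].append(handler)

def publish(topic, payload):
    """Announce once; every subscriber receives a read-only view of the payload."""
    frozen = MappingProxyType(dict(payload))   # shallow read-only wrapper
    for handler in list(_subscribers[topic]):
        handler(frozen)
```

Because the wrapper is shallow, nested objects inside the payload would still be mutable; a stricter contract would need a deep copy or immutable payload types.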
In plain terms. Touch one instrument, check the other — no fix ships for NQ without verifying it didn't break ES. A guard against fixing one symbol while quietly breaking its twin.
Every NQ-side change triggers an ES-side parallel scan — same logic, different parameters
Caught real bugs: INV-2 Pass-1 shipped NQ-only, Pass-2 found the ES mirror was broken
Extends to 4-surface propagation: engine change → NOA Guide + Roadmap (md+html) + Atlas (EN+HE) in the same response
Cross-module trace before any edit: what does this touch downstream? The system is interconnected — no edit lives alone
Prevents
Fixing one instrument while silently breaking the parallel — the most expensive class of bug that passes all single-symbol tests.
In plain terms. The rule that a status which can never turn red is lying. The enemy is the "ambiguous zero" — a 0 or "idle" that looks identical whether it's correct or the thing is actually dead. Every health light must be able to scream.
Born 2026-06-06 (SCHL/SAVT live but policy fetch blocked 2 days); upgraded 2026-06-10 into a standing review with 5 alarms shipped + 5 refinements
Smoke checks (`engine-smoke-test.js`): book_reader_coverage (FAIL — a book firing on disk but unread) · doc_claim_drift (WARN — a "by design/healthy" doc claim masking a dead brain) · disk_source_fresh (WARN — a 0 that's source-unreachable, not empty)
Chip states (`engine-noa-desk.js`): red .policy-down ⛔ · amber .strangled ⚠ (brain blocked, not idle) · amber .gate-locked 🔒 (engine gated, not idle)
Truth source: `engine-brain-truth.js` classifies brains FIRING/IDLE/STRANGLED/DOWN
5 refinements: name the ambiguous zero · prove BOTH directions on the real trigger · false alarms are silent failures too · watch the watcher · encode FAIL/WARN/passive severity
CPU budget guard (2026-06-11) (`engine-cpu-sentinel.js`): the same invariant applied to CPU — every continuous timer has a per-fire ceiling (default 150ms); a breach turns the CPU chip RED and NAMES the offender (⛔ sustained / ⚡ spike) the moment it lands, so a heavy new timer is caught in ONE session, not after days of silent drift (+32,000ms). Levers: cache the expensive per-tick read > run-O(n)-every-Nth > cap growth > throttle. Doctrine + "fix CPU" procedure: `feedback_cpu_budget_doctrine.md`.
Order-flow capture chip (2026-06-13) (`engine-stability-observer.js` → #tbOFCapChip in the ALERTS row): the doctrine applied to the Event-Fragility-v2 capture — the 60s observer now logs continuous pre-event order flow (deltaPct/cvdSlope/DOM/absorption) so the quiet coiled pre-event tape is no longer a capture blind-spot. The chip shows the captured value live, color-coded by delta, or an honest OF · idle when a sample had no feed data — never a blank that hides a dead capture (it IS the captured row, one source of truth).
Prevents
The system looking alive (or "fine by design") when it's dead — across chips, counts, AND doc claims.
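The CPU budget guard can be sketched as a wrapper that times each timer fire against a ceiling and names the offender. The 150 ms default is from the text; the class name, the injectable clock, and the chip string format are assumptions, and the sustained-versus-spike distinction is not modeled here.

```python
import time

class CpuSentinel:
    def __init__(self, ceiling_ms=150.0, clock=time.perf_counter):
        self.ceiling_ms = ceiling_ms
        self.clock = clock
        self.breaches = []          # (timer name, elapsed ms)

    def run(self, name, fn):
        """Run one timer fire and record a breach if it exceeds the ceiling."""
        start = self.clock()
        result = fn()
        elapsed_ms = (self.clock() - start) * 1000.0
        if elapsed_ms > self.ceiling_ms:
            self.breaches.append((name, elapsed_ms))
        return result

    def chip(self):
        """GREEN when no breach was seen; otherwise RED plus the worst offender."""
        if not self.breaches:
            return "GREEN"
        name, ms = max(self.breaches, key=lambda b: b[1])
        return f"RED {name} {ms:.0f}ms"
```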
In plain terms. Three words, locked: Cold, Hot, Automated — each meaning one specific thing, no synonyms, no improvising. Confusing "which mode are we in" is one of the most expensive mistakes in this system.
how it ships: NOA is the Phase 4 north-star operator. The execution ramp + Phase 1 Trust Calibration gate get the graduated brain into a real broker without losing capital to its own scaffolding. · 10 CARDS
Two halves, one trajectory. Today NOA is a racer in L3 (the brain) AND the operator-in-training here in L6 (the face + hands). Phase 4 north-star: NOA the racer wins Phase 1, gets snapshotted into Phase 2, qualified through Phase 3's execution ramp, and arrives in Phase 4 as a meta-allocator routing across all graduated brains. The execution ramp at execution/ is the deliberate, broker-safe path from "real signal" to "real fill on a real account." Already shipped: OMS Phase 0-Sim (69 tests green, 5 schemas, hash-chain ledger) · Meta Brain paper book (5 phases inside the dashboard, mirrors the Synthesizer's 6 routes) · Phase 1 Trust Calibration pipeline (measures claimed-R vs realized-R; pipeline real, data synthetic until a broker connects). Blocked on: operator picks broker connector (Quantower Trading Simulator first, AMP Rithmic demo for haircut measurement).
NOA Doctrine · 10 Commandments
N1
In plain terms. NOA is plumbing, not a character — and ten hard rules, enforced in code, keep her that way: stay silent unless there's something real to say, never make things up, never offer false comfort. The voice serves the edge, not the other way around.
10 commandments: silence-first · no comfort · no tick narration · no fake certainty · no block execution · no punishment · no therapist · no hallucinated confidence · no omniscience · no social companion
Doctrine outranks feature requests, council enthusiasm, ship-faster pressure
In plain terms. When a trade fires it freezes the reason you took it, then checks reality against that reason on every tick — surfacing the moment something material changes and staying quiet for ordinary noise. It's what keeps you honest about why you're still in.
In plain terms. The end state — once at least two brains have proven themselves, NOA runs them as a portfolio and the engine, not you, is the one trading. The whole ladder points here.
Meta-allocator: regime-routing table — Phase 1's per-regime data populates it directly
Correlation-aware sizing: two brains firing same direction on same regime cell don't double-up
Drawdown budget partitioned across brains — brain-level breakers feed portfolio-level
Operator's role collapses to: risk-limit setter + monthly reviewer · no intraday touch
Stage 6 of the MI evolution ladder — designed, awaits graduating brains
⟨routeBook(regime) → {book, size, exit} · MI v0.2 Stage 6⟩
North-star · ≥2 brains graduated
Cerberus · Offline-Trained Frozen Policy
N4
In plain terms. A different kind of brain — it learns hard offline in Python, and only a frozen, tested snapshot ever ships to the live browser. Because it never learns live, it can't quietly drift away from what was validated.
In plain terms. The careful eight-rung path from "software test harness" to "real autonomous trading at size" — each rung must be earned, with the brain's rules frozen and promotion automatic only when the bar is cleared.
In plain terms. The order-handling skeleton everything live will be built on — one writer, save-before-you-send, and a tamper-evident ledger. Boring on purpose: this is the part that must never lose or duplicate an order.
In plain terms. For real money, the stop-loss and target must live at the broker, not in the browser — because a browser stop disappears the moment you lose connection. Non-negotiable.
🔺 Quantower "Local" SL/TP are client-side and vanish on disconnect/power-loss → forbidden for live
🔺 Quantower built-in Simulator has no broker → can prove software, never broker-side protection
A SIMULATOR_LOCAL stop can never pass the live-eligibility gate (fail-closed, machine-enforced)
Real CQG/Rithmic-routed broker demo required before any live execution
In plain terms. Not one of the racers — the boss above them. It reads every brain's vote, trusts each by track record, cancels out overlap, holds back when uncertain or when the real-world haircut is too steep, allocates the capital, and even checks whether it's beating the best single brain. Paper-only — it never touches the live money-path.
S1 · Haircut-aware EV · engine-meta-brain-selector.js — scores each route on expected R AFTER slippage/fees, scaled to avgWin/avgLoss (not a flat 1R penalty)
S2 · Intervention-bias wall (Gap 10) · engine-cf-snapshotter.js — freezes each candidate brain's state at decision time → CF_SNAPSHOT; brain trust learns from its OWN outcome, never from Meta's intervention
S3 · Provenance-keyed trust · claude/trust_reducer.py — Trust EMA keyed by (brain × sym × regime × session × provenance × surface); paper trust ≠ demo ≠ live
S4 · Candidate ingest · engine-meta-brain-ingest.js — polls the synthesizer-routed brains' fire streams; Meta mirrors the ROUTED brain per (sym × regime), not a rare engine SIGNAL. ingestCandidate() books via selector + haircut + sizing + MFE
Regime Commander · when the policy's primary brain is silent/dormant, the next crowd-aligned brain COVERS as secondary — tagged + sized-down, measurable apart
S6 · Capital Router · engine-meta-brain-allocator.js — every 10s/symbol: ONE decision over all voters, correlation-collapsed (Gap 2/5 — the Oracle-flagged cluster counts once, not N×), quantized to integer contracts round-down, emits an allocation_vector
S7 · Holy Grail self-audit · engine-meta-brain-self-audit.js — "is Meta beating the BEST single brain?" If not → selfConfidence < 1 → the allocator sizes Meta DOWN. PROVING / BEATING / LAGGING / FAILING
S1–S7 live · correlation-collapsed · self-policing · SHADOW
Plane-B learning ledger
N8b
In plain terms. The tamper-evident record under the Meta Brain — kept in a separate database from the live-order ledger on purpose, so learning can never contaminate the money-path and the Meta's meddling can never muddy which brain deserves the credit.
execution/contracts/allocation_vector_v1.schema.json — FROZEN. ONE decision = ONE intent. Closes Gaps 1·2·4·6·10·11·12·13·14
claude/plane_b_ledger.py — hash-chained append-only (reuses OMS canonical_event_hash); writer-id authority enforced at the append boundary; 13 tests green (tamper · truncation · auth · replay)
Disk server: POST /push_plane_b_event (per-request SQLite connections — thread-safe on ThreadingHTTPServer) · GET /plane_b_status · GET /meta_brain_state
Cross-origin: dashboard mirrors Meta state to disk every 30s (engine-meta-brain-state-push.js); standalone page reads it — one writer, many readers, no IDB-race dupe class
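A hash-chained append-only ledger with writer-id authority can be sketched as follows. This is a generic illustration of the technique, not `plane_b_ledger.py`: the `event_hash` function here is a stand-in for the OMS `canonical_event_hash`, whose exact canonicalization is not shown in the source.

```python
import hashlib
import json

GENESIS = "0" * 64

def event_hash(prev_hash, event):
    """Hash of the previous row's hash plus a canonical JSON form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

class Ledger:
    def __init__(self, authorized_writers):
        self.authorized = set(authorized_writers)
        self.rows = []

    def append(self, writer_id, event):
        # Authority is enforced at the append boundary.
        if writer_id not in self.authorized:
            raise PermissionError(f"writer {writer_id!r} may not append")
        prev = self.rows[-1]["hash"] if self.rows else GENESIS
        self.rows.append({"writer": writer_id, "event": event,
                          "hash": event_hash(prev, event)})

    def verify(self):
        """Recompute the chain; any edited event breaks every later hash."""
        prev = GENESIS
        for row in self.rows:
            if row["hash"] != event_hash(prev, row["event"]):
                return False
            prev = row["hash"]
        return True
```

This sketch detects edits to stored events but not removal of the newest rows; catching truncation would additionally need the expected head hash or row count stored somewhere outside the chain.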
In plain terms. Two different jobs kept separate: Plane A asks "where should we trade next?" and Plane B asks "how do we sharpen the brains we already have?" Mixing the two questions is how systems fool themselves.
Plane A · docs/meta_brain.html · violet brand accent · the rich 6-section view: status hero + route cards + recent fires strip + Meta Brain findings feed + mode-switch self-score + voice log
Plane B · docs/copilot.html · pure Co-Pilot — sharpen the books currently on the tape, no Meta Brain inline (single-line pointer above footbar)
Co-Pilot page: ⬢ Meta Brain nav link added · Meta Brain inline panel REMOVED → Plane-B is now leak/edge-only
Signals-page chip strip: MB chip (violet accent, trust-state badge ·U/·W/·T/·S/·F once ≥20 sealed) · Meta Brain is the Journal's 19th first-class book
Operator decision 2026-06-08: don't let Meta Brain's policy noise contaminate the active sharpening surface
In plain terms. The reality check before any brain graduates — it compares what a brain claimed it made against what it would really have made (and claimed fills vs real fills). That gap, the "haircut," has to be applied before anyone trusts a brain's numbers. Waiting on a real broker to fill in real figures.
Analyzer · claude/trust_calibration.py — pairs engine fires with broker fills by intent_id, computes R_shortfall = actual_R − claimed_R (universal across winners + losers), session-clustered bootstrap CI by (sym × regime × book)
Spec · execution/contracts/broker_fills_v1.schema.json — FROZEN. Two record types (fill / exit), additionalProperties: false
Routing config · execution/config/route_to_demo_v1.json: default NOA-only, both syms, RTH, max 2 concurrent / 10 daily / $1500 risk, kill on a 4-loss streak or $600 daily loss
Promotion gate: Synthesizer verdict + haircut.is_synthetic == false + n ≥ 50 per sym + shortfall_CI_upper < 0 · all four must clear before composed policy leaves shadow
⚠ Pipeline real, data synthetic. Synth NOA NQ: −0.022R/trade · ES: −0.116R/trade — sanity-check only. Real numbers blocked on operator broker pick.
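The pairing and the four-part promotion gate can be sketched as below. The formula `R_shortfall = actual_R − claimed_R` and the gate conditions are taken from the lines above as written; the record field names and function signatures are assumptions, and the bootstrap that produces the CI is omitted.

```python
def r_shortfalls(fires, fills):
    """Pair engine fires with broker fills by intent_id; return actual_R - claimed_R."""
    claimed = {f["intent_id"]: f["claimed_r"] for f in fires}
    return [fill["actual_r"] - claimed[fill["intent_id"]]
            for fill in fills if fill["intent_id"] in claimed]

def promotion_ready(synth_verdict, haircut_is_synthetic, n_per_sym,
                    shortfall_ci_upper, min_n=50):
    """All four conditions must clear before the composed policy leaves shadow."""
    return bool(
        synth_verdict == "PROMOTION_READY"
        and not haircut_is_synthetic
        and n_per_sym
        and all(n >= min_n for n in n_per_sym.values())
        and shortfall_ci_upper < 0
    )
```

Fills without a matching fire are skipped rather than guessed at, so an unpaired fill never contributes to the haircut estimate.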
every threshold, gate, setup weight, and order-flow rule, straight from the engine code · 4 TABS · 40+ CARDS
Thresholds · 6
Gates & Pipeline · 16
Setups & Confluence · 10
Order Flow · 7
Order Flow Threshold
Delta Classification
6% · 350 · 120
P1
How The Engine Separates Real Directional Aggression From Market-Maker Noise.
Measure · Rule · Outcome · Calibration · Sources
Core Decision
Institutional Delta Requires Meaningful Size Plus Confirmed Directional Travel.
Magnitude: |deltaPct| >= 6% OR |delta| >= absFloor
Efficiency: directional displacement / total path >= 45%
Pass Interpretation: Delta can support initiative flow, bias confidence, and downstream order-flow confirmation.
Fail Interpretation: Delta is treated as noise, absorption risk, or inventory rebalancing unless other evidence overrides it.
Measures
Every bar the engine receives carries a delta: the net difference between aggressive buying volume, where market orders hit the ask, and aggressive selling volume, where market orders hit the bid. Raw delta is meaningless without context: 200 contracts on a 3,000-volume bar is a 6.7% skew that barely clears the 6% line; the same 200 on a 1,200-volume bar is a 16.7% directional skew, which suggests someone is acting with intent.
Gate 1: Magnitude Rule
The engine runs a dual-gate test. First, percentage: is this bar's delta at least ±6% of its total volume? Second, absolute floor: is the raw delta at least 350 contracts for NQ or 120 contracts for ES? Either gate passing qualifies the bar as meaningful. The absolute floor exists because during thin pre-market or lunch bars, even a 10% skew might represent only 30 contracts, which is statistically irrelevant for a futures instrument that trades millions daily.
Gate 2: Efficiency Rule
On top of delta magnitude, the engine evaluates bar efficiency: did price actually travel in the delta's direction? Efficiency equals directional price displacement divided by total path traveled. A bar with at least 45% efficiency moved purposefully in one direction. Below 45%, the bar oscillated: the delta might be real, but the price action is indecisive, and the move is more likely market-maker rebalancing than institutional commitment.
Parameter Summary
Parameter · Exact Rule · Role In The Engine
Magnitude Filters
Relative Threshold · ±6% Of Bar Volume · Filters Weak Directional Skew.
Instrument Floors
NQ Absolute Floor · 350 Contracts · Minimum Raw Delta For NQ Depth.
ES Absolute Floor · 120 Contracts · Minimum Raw Delta For ES Depth.
Combined Gate · |Δ%| ≥ 6% OR |Δ| ≥ absFloor · Marks Delta Magnitude As Meaningful.
Efficiency Confirmation
Bar Efficiency Gate · ≥45% Displacement ÷ Path · Requires Price To Confirm The Flow.
Min Price Move (NQ) · 16 Ticks = 4 Pts · Avoids Microscopic NQ Drift.
Min Price Move (ES) · 10 Ticks = 2.5 Pts · Avoids Microscopic ES Drift.
Calibration Rationale
The 6% threshold was derived empirically: below it, correlation between delta sign and next-bar direction drops below statistical noise.
The roughly 3:1 ratio between NQ (350) and ES (120) absolute floors matches the typical volume ratio between the two instruments.
The 45% efficiency threshold aligns with microstructure research showing that bars below roughly 40-50% efficiency are dominated by market-maker inventory rebalancing, not directional intent.
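The delta rules in this card can be sketched as one function. The thresholds are the ones listed above; treating the minimum price move as an additional AND condition, passing displacement and path in as precomputed points, and the labels `MEANINGFUL` / `NOISE` are assumptions for illustration, since the source does not show how the engine wires these together.

```python
TICK = 0.25                                  # NQ and ES tick size in points
ABS_FLOOR = {"NQ": 350, "ES": 120}           # contracts
MIN_MOVE_TICKS = {"NQ": 16, "ES": 10}

def magnitude_ok(sym, delta, volume):
    """Gate 1: relative skew of at least 6% OR raw delta at the absolute floor."""
    delta_pct = abs(delta) / volume if volume else 0.0
    return delta_pct >= 0.06 or abs(delta) >= ABS_FLOOR[sym]

def classify_delta(sym, delta, volume, displacement_pts, path_pts):
    """Combine magnitude, bar efficiency (>= 45%), and minimum price move."""
    efficient = path_pts > 0 and abs(displacement_pts) / path_pts >= 0.45
    moved = abs(displacement_pts) / TICK >= MIN_MOVE_TICKS[sym]
    if magnitude_ok(sym, delta, volume) and efficient and moved:
        return "MEANINGFUL"
    return "NOISE"
```

Under the OR rule as written, a thin bar with a 10% skew on 30 contracts still passes the magnitude gate, so any thin-bar filtering would have to come from the efficiency and minimum-move checks or from logic not described here.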
Volume Classification
Volume & Market Tempo
4 tiers · self-calibrating
P2
Two separate systems: volatility-rank buckets for regime, and trade-count tempo for institutional participation.
Rank · Classify · Tempo · Offset · Score
Core Decision
Volume regime and bar-level tempo are independent systems that jointly adjust confluence threshold and bias confidence.
Regime: volatility rank percentile → 4 tiers (LOW / NORMAL / HIGH / EXTREME)
Tempo: bar trades vs baseline · WEAK <0.6× · STRONG ≥1.5×
When It Works: HIGH/EXTREME regime lowers confluence threshold; STRONG tempo confirms institutional participation behind setups.
When It Fails: LOW regime raises threshold +4; WEAK tempo flags a retail-only environment, so setups lack institutional backing.
Volume Regime Classification
The engine classifies market activity through two independent lenses. Volume regime uses a volatility rank (0-100 percentile) provided by the bridge from the instrument's recent history. This isn't a fixed number — "high volume" on a quiet August day means something different than "high volume" on an FOMC day. If the bridge can't provide a rank, the engine falls back to today's intraday range: NQ range >220 pts = EXTREME, <70 pts = LOW.
Market Order Tempo
Market order tempo is a separate, bar-level check. It counts trades and contracts in each bar and compares them to fixed baselines (NQ: 900 trades / 12,000 contracts; ES: 600 / 8,000). Below 60% of baseline = WEAK (retail noise, no institutional footprint). At or above 150% = STRONG (institutions are participating). This distinction matters because a "high confluence" setup in a WEAK tempo environment is suspicious: who is going to move the market in your favor?
Score Adjustment
Both systems feed into score adjustments: volume regime shifts the confluence threshold (LOW: +4 harder, EXTREME: -5 easier), while tempo classification informs Gate 2.5's bias confidence and several order-flow conditions.
Volume Regime Tiers
Volume Regime · Volatility Rank · Fallback (NQ range)
EXTREME · ≥85th percentile · >220 pts
HIGH · ≥70th · >140 pts
NORMAL · ≥35th · 70–140 pts
LOW · <35th · <70 pts
Tempo Baselines
Tempo Gate · NQ · ES
Baseline · 900 trades / 12,000 vol · 600 trades / 8,000 vol
WEAK (<0.6×) · <540t or <7,200v · <360t or <4,800v
STRONG (≥1.5×) · ≥1,350t or ≥18,000v · ≥900t or ≥12,000v
Volume Score Offsets
Volume → Score Offset · Effect
LOW · +4 (raise threshold, harder to fire)
NORMAL · 0 (baseline)
HIGH · -3 (lower threshold, easier)
EXTREME · -5 (much easier, conviction in loud markets)
Calibration Rationale
Self-calibrating percentile bands prevent the system from mislabeling a quiet Tuesday as "low volume" when it's actually normal for that contract's seasonal pattern.
The tempo baseline (900/600 trades) was calibrated from CME Group session data for NQ and ES regular trading hours.
The 0.6×/1.5× multipliers for WEAK/STRONG are derived from the point where institutional participation visibility inflects in order-book data.
Liquidity Detection
Sweep & Trap Detection
7-bar window · 3 confirm paths
P3
The engine's institutional-behavior detector: where stops cluster, how sweeps are identified, and when a sweep becomes a trap.
Target → Sweep → Window → Confirm → Score
Core Decision
A sweep past a liquidity target must either be accepted as a breakout or confirmed as a trap within a 7-bar window.
Sweep: price past target by ≥ break distance (NQ: 4t / ES: 3t) within 24 bars
Trap: confirmed trap scores 82 confidence, the highest-conviction signal
When It Works: Confirmed trap at 82 confidence; institutional stop-hunt detected, highest-conviction reversal signal.
When It Fails: Sweep accepted (holds past fail distance for full window); a real breakout, not a trap.
Liquidity Targets
The engine maintains a ranked list of liquidity targets — session extremes, prior-day highs/lows, VPOC, VWAP, and equal highs/lows — scored by type, proximity, and session context. Each target represents a probable stop-cluster location: retail traders place stops outside these levels, creating pools of resting orders that institutional players can exploit.
Sweep Detection & Acceptance
A sweep is detected when price pushes past a target by at least the break distance (NQ: 4 ticks / 1 pt, ES: 3 ticks / 0.75 pt) within the last 24 bars. The engine then watches a 7-bar window: if price returns inside the level, it's a potential trap. If price holds beyond the fail distance (NQ: 6 ticks, ES: 4 ticks) for the full window, the sweep is accepted — real breakout, not a trap.
Trap Confirmation
Trap confirmation requires one of three signals within 4 bars after the sweep: (1) price reverses by at least trapMinTicks (NQ: 8 / ES: 5), (2) aggressive delta ≥10% stalls with a 2-tick reversal, or (3) the last 4 bars form a micro-range tighter than sweepFailTicks — the market froze after the attempt. Confirmed traps score 82 confidence in the order flow state machine — the highest-conviction signal the engine produces.
Sweep Distances
Distance | NQ (ticks / pts) | ES (ticks / pts)
Sweep break | 4t / 1.00 pt | 3t / 0.75 pt
Sweep fail (accepted) | 6t / 1.50 pt | 4t / 1.00 pt
Trap min reversal | 8t / 2.00 pt | 5t / 1.25 pt
Trap max window | 12t / 3.00 pt | 8t / 2.00 pt
Gap significant | 80 pts / 0.4% | 20 pts / 0.3%
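The break and fail distances above can be sketched as a window classifier. This is an illustration under stated assumptions: it judges the 7-bar window on bar closes, leaves out the 24-bar lookback, and uses hypothetical state labels (`NO_SWEEP`, `POTENTIAL_TRAP`, `ACCEPTED`, `PENDING`) and a hypothetical `classifySweep` helper.

```javascript
const SWEEP = {
  NQ: { breakTicks: 4, failTicks: 6 },
  ES: { breakTicks: 3, failTicks: 4 },
};
const TICK_SIZE = 0.25; // points per tick for NQ and ES
const WINDOW_BARS = 7;

// side: +1 for a sweep above a high, -1 for a sweep below a low.
// extreme: furthest price reached past the level; closesAfter: closes of the bars that follow.
function classifySweep(symbol, level, side, extreme, closesAfter) {
  const cfg = SWEEP[symbol];
  const excursionTicks = (side * (extreme - level)) / TICK_SIZE;
  if (excursionTicks < cfg.breakTicks) return 'NO_SWEEP';
  const bars = closesAfter.slice(0, WINDOW_BARS);
  // Assumption: any close back at or inside the level marks a potential trap.
  if (bars.some((c) => side * (c - level) <= 0)) return 'POTENTIAL_TRAP';
  const held = bars.length === WINDOW_BARS &&
    bars.every((c) => (side * (c - level)) / TICK_SIZE >= cfg.failTicks);
  return held ? 'ACCEPTED' : 'PENDING';
}

console.log(classifySweep('NQ', 20000, 1, 20001.5, [20001, 19999.5]));
```

A `POTENTIAL_TRAP` result would still need one of the three confirmation paths before it counts as a confirmed trap.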
Trap Probability Scoring
Trap Probability | Component
Base | 25 pts
+ Reclaim | +28
+ Trap confirmed | +28
+ Fake breakout | +12
+ Contra delta | +10
− Accepted | −38
High confidence | ≥65 probability
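The component table above adds up as follows. A minimal sketch: the flag names and `trapProbability` are hypothetical, and clamping the sum to 0–100 is an assumption, since the table does not say how totals outside that range are handled.

```javascript
// Component weights from the table.
const TRAP_WEIGHTS = {
  reclaim: 28,
  trapConfirmed: 28,
  fakeBreakout: 12,
  contraDelta: 10,
  accepted: -38,
};
const TRAP_BASE = 25;
const HIGH_CONFIDENCE = 65;

function trapProbability(flags) {
  let score = TRAP_BASE;
  for (const [key, weight] of Object.entries(TRAP_WEIGHTS)) {
    if (flags[key]) score += weight;
  }
  const probability = Math.min(100, Math.max(0, score)); // clamp is assumed
  return { probability, highConfidence: probability >= HIGH_CONFIDENCE };
}

// Reclaim plus confirmed trap: 25 + 28 + 28 = 81.
console.log(trapProbability({ reclaim: true, trapConfirmed: true }));
```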
Calibration Rationale
The break/fail distances are calibrated to each instrument's tick value. NQ at $5/tick needs 4 ticks ($20) to register a meaningful push past a level — less than that is just bid/ask bounce.
ES at $12.50/tick needs only 3 ticks ($37.50), because each ES tick already carries a larger dollar displacement.
The trap window (8–12 / 5–8 ticks) corresponds to typical retail stop-cluster distances from key levels — this is where the "victims" are positioned, based on common retail order-placement patterns observed in DOM data.
Risk Management
Risk Guardrails
-3.75R halt · 12-trade window · setup retirement
P4
Hard limits the engine will not cross — behavioral brakes calibrated to prevent judgment-degradation spirals.
Track → Halt → Size → Retire
Core Decision
Three independent brakes prevent judgment-degradation spirals: daily loss halt, drawdown-scaled position sizing, and rolling setup retirement.
Daily Halt: cumulative realized R hits -3.75R → new setups halted
When It Works: Position sizing stays full, setups remain active, engine operates at maximum capacity.
When It Fails: Engine halts at -3.75R, sizes down to 25% floor at deep drawdown, or benches broken setups permanently.
Daily Loss Halt & Circuit Breaker
The engine tracks cumulative realized R across all closed signals since midnight ET. When the daily total hits -3.75R, new setups are halted. The reason is not that the edge disappeared; it is that a run of three to four standard-size losing trades degrades judgment. A separate circuit breaker fires after 5 consecutive losses regardless of R magnitude, and a 30-fire hard cap prevents signal spam even if realizedR data isn't populated.
Setup Retirement
Setup retirement operates on a rolling window of the last 12 closed trades per setup type. If a setup's rolling average expectancy drops below -0.12R with at least 6 completed trades, it's benched ("Retired"). Between -0.05R and -0.12R, it's flagged as "Degrading" with a -5 priority penalty. Retired setups continue computing for observation but can never fire — the engine doesn't ride a dead horse.
Risk Parameters
Parameter | Value
Daily loss halt | -3.75R cumulative
Circuit breaker | 5 consecutive losses
Hard fire cap | 30 fires / day (safety net)
RR floor | 1.5:1 (HOT) / 2.0:1 (AUTOMATED)
RR cap (advisory) | 2.5 NQ / 4.0 ES (off in Phase A)
Retire threshold | -0.12R rolling expectancy, n≥6
Degrade threshold | -0.05R rolling expectancy
Rolling window | Last 12 closed trades per setup
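The halt and retirement parameters above can be sketched as two checks. This is an illustration only: `setupStatus`, `dailyBrake`, the `Active` label, and the brake names are hypothetical, and treating a setup with fewer than 6 trades as at most "Degrading" is an assumption.

```javascript
// Rolling setup health over the last 12 closed trades (values in R).
function setupStatus(closedR) {
  const recent = closedR.slice(-12);
  if (recent.length === 0) return 'Active';
  const expectancy = recent.reduce((sum, r) => sum + r, 0) / recent.length;
  if (recent.length >= 6 && expectancy < -0.12) return 'Retired';
  if (expectancy < -0.05) return 'Degrading'; // carries the -5 priority penalty
  return 'Active'; // label assumed for a healthy setup
}

// Daily brakes, checked in the order the text introduces them.
function dailyBrake({ realizedR, consecutiveLosses, fires }) {
  if (realizedR <= -3.75) return 'DAILY_LOSS_HALT';
  if (consecutiveLosses >= 5) return 'CIRCUIT_BREAKER';
  if (fires >= 30) return 'FIRE_CAP';
  return null; // no brake engaged
}

console.log(setupStatus([-1, -1, 1, -1, -1, 1])); // mean is about -0.33R over 6 trades
```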
Calibration Rationale
The -3.75R daily halt aligns with institutional prop desk standards where traders are pulled after 3-4 standard-risk losing trades.
The position degradation curve is a behavioral brake: it doesn't predict whether the next trade wins — it ensures the inevitable revenge trade costs less.
The 12-trade rolling window for retirement balances responsiveness (catching broken setups quickly) against noise resistance (not benching a setup over two bad days).
Risk Management
Risk Plan & Position Sizing
stop · entry · R-budget · contract count
P4.5
From signal to trade: stop placement → entry price → dollar risk budget → contract count → instrument → size tier. Every live trade starts here.
Plan → Budget → Size → Tier → Mirror
Core Decision
Given a setup's entry price and stop loss, compute the all-in contract plan that keeps stop risk plus commissions at or below the effective-equity R-budget.
Current Selector: standard if all-in fits, else micro fallback
When It Works: Every trade has a defined dollar risk before entry. Sizing scales down automatically under drawdown, so no manual calculation is required.
When It Fails: Entry or stop price not available at render time; the chip shows the plan fallback. Trader overrides with HALF or PROBE tier manually.
Stop Placement & Entry Price
The risk plan is anchored to two prices: entry and stop loss. These come from the setup's structural plan — the engine reads them from linked.entryPrice / linked.stopLoss at render time (signal card) and at journal-entry time (addSetupToTraderJournal()). If no linked plan exists, the engine attempts to derive them from buildStructuredPlan(). Stop distance is computed as |entry − stop| in points — direction agnostic.
Dollar Risk Budget
Base risk is a flat percentage of effective account equity: (accountSize + closed realized P&L) × maxRiskPctPerTrade when accountEquityAutoAdjust is enabled. This is then scaled by the automatic risk multiplier from riskScaleForNextTrade(): drawdown state, conviction tier, and bounded ATR-risk trim from stop width, target reach, and entry-to-VWAP extension. No regime multiplier: the regime gates already filter which setups fire; sizing does not second-guess what the gate decided.
Contract Count & Instrument Auto-Select
engine-position-sizer.js now optimizes executable contract mixes instead of forcing one instrument class. It can recommend all-micro, all-mini, or mixed output such as 1 NQ + 2 MNQ when that best fits the adjusted all-in risk budget. Dollar risk includes commissions: minis use $3.50 round turn, micros use $1.50. The chip displays all-in economics and the exact contract mix.
Size Tiers — FULL / HALF / PROBE
Three manual tiers appear on every signal card. FULL (default — no click required): trade the full adjusted risk budget. HALF: cut the budget to 50% before computing contracts. PROBE: bypass all math — always 1 micro, regardless of stop distance, drawdown, or account size. PROBE is not a sizing calculation — it is a veto. The trade is on, but exposure is minimal. Tier selection persists per setup ID for the session; clicking a new tier re-renders the chip immediately.
Instrument Specs
Instrument | Point Value | Type | Round Turn
NQ | $20 / pt | Standard | $3.50
MNQ | $2 / pt | Micro | $1.50
ES | $50 / pt | Standard | $3.50
MES | $5 / pt | Micro | $1.50
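All-in sizing with the specs above can be illustrated with a greedy fill: as many standard contracts as fit, then micros with the remainder. This is a sketch, not the optimizer in engine-position-sizer.js, whose selection logic is not shown in the text; `greedyMix` and `allInRiskPerContract` are hypothetical names.

```javascript
// Point values and round-turn commissions from the table.
const SPECS = {
  NQ: { pointValue: 20, roundTurn: 3.5, micro: 'MNQ' },
  MNQ: { pointValue: 2, roundTurn: 1.5 },
  ES: { pointValue: 50, roundTurn: 3.5, micro: 'MES' },
  MES: { pointValue: 5, roundTurn: 1.5 },
};

// Stop risk plus commission for one contract.
function allInRiskPerContract(symbol, stopPts) {
  const spec = SPECS[symbol];
  return stopPts * spec.pointValue + spec.roundTurn;
}

// Greedy mix (assumed strategy): standards first, micros fill what is left.
function greedyMix(standard, entry, stop, budget) {
  const stopPts = Math.abs(entry - stop); // direction agnostic
  if (stopPts === 0) throw new Error('entry and stop must differ');
  const micro = SPECS[standard].micro;
  const perStd = allInRiskPerContract(standard, stopPts);
  const perMicro = allInRiskPerContract(micro, stopPts);
  const nStd = Math.floor(budget / perStd);
  const nMicro = Math.floor((budget - nStd * perStd) / perMicro);
  return {
    [standard]: nStd,
    [micro]: nMicro,
    allInRisk: nStd * perStd + nMicro * perMicro,
  };
}

// $250 budget, 10-point NQ stop: 1 NQ ($203.50) + 2 MNQ ($43.00) = $246.50 all-in.
console.log(greedyMix('NQ', 20000, 19990, 250));
```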
Position Sizing Curve (Drawdown Scale)
Drawdown (R) | Scale | Effect on $250 base budget
0 | 100% | $250
-2 | 85% | $213
-4 | 70% | $175
-6 | 55% | $138
-8 | 40% | $100
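The budget formula and drawdown curve can be combined in a short sketch. Two assumptions are made here: values between the listed drawdown points are interpolated linearly, and the last listed value (40%) is held beyond -8R because the table stops there. The conviction and ATR-risk components of riskScaleForNextTrade() are left out; `drawdownScale` and `riskBudget` are illustrative names.

```javascript
// Drawdown (R) → scale, from the table.
const DRAWDOWN_CURVE = [[0, 1.0], [-2, 0.85], [-4, 0.70], [-6, 0.55], [-8, 0.40]];

function drawdownScale(drawdownR) {
  // Clamp to the tabulated range (assumption: no extrapolation past -8R).
  const d = Math.min(0, Math.max(-8, drawdownR));
  for (let i = 1; i < DRAWDOWN_CURVE.length; i++) {
    const [x0, y0] = DRAWDOWN_CURVE[i - 1];
    const [x1, y1] = DRAWDOWN_CURVE[i];
    if (d >= x1) return y0 + ((d - x0) / (x1 - x0)) * (y1 - y0);
  }
  return DRAWDOWN_CURVE[DRAWDOWN_CURVE.length - 1][1];
}

// Base risk on effective equity, scaled by drawdown only.
function riskBudget({ accountSize, realizedPnl, maxRiskPctPerTrade, drawdownR }) {
  return (accountSize + realizedPnl) * maxRiskPctPerTrade * drawdownScale(drawdownR);
}

// Defaults from the config table at -4R drawdown: 25,000 × 1% × 70% ≈ $175.
console.log(riskBudget({ accountSize: 25000, realizedPnl: 0, maxRiskPctPerTrade: 0.01, drawdownR: -4 }));
```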
Config Fields
Field | Default | Location
accountSize | 25,000 | config.js · INSTITUTIONAL_TUNING
accountEquityAutoAdjust | true | config.js · INSTITUTIONAL_TUNING
maxRiskPctPerTrade | 0.01 (1%) | config.js · INSTITUTIONAL_TUNING
roundTurnCommissionByInstrument | NQ/ES 3.50 · MNQ/MES 1.50 | config.js · INSTITUTIONAL_TUNING
sizeByDrawdown | curve above | config.js · INSTITUTIONAL_TUNING
Calibration Rationale
1% fixed risk per trade is the standard institutional starting point — aggressive enough to compound, conservative enough to survive a cold streak without account damage.
All-in sizing prevents commission drag from being invisible: contract count is based on stop risk plus round-turn fees.
Auto instrument selection now supports mixed mini+micro sizing, so full-size progression can be precise without excessive micro-only commissions.
PROBE tier exists for high-uncertainty setups where the trader wants presence but not exposure — one micro is near-zero cost and still captures the full experience in the substrate.
No regime multiplier by design: if a setup passed all gates, it has already been regime-filtered. Sizing adds only execution-geometry risk via ATR-proxy stop/target/VWAP measurements.
Signal Grading
Scoring, EV & Grading
sigmoid · 64% cap · 12-trade blend
P5
How confluence score converts to a win probability via a conservative sigmoid model — and what each grade means.
Score → Sigmoid → Blend → EV → Grade
Core Decision
Confluence score converts to win probability via a sigmoid capped at 64%, then computes EV to decide whether to fire.
Win Prob: sigmoid inflected at score 72, ceiling 64%
EV: (winProb × R:R) − (1−winProb) × 1R, fire at EV ≥ 0.0
When It Works: EV ≥ 0.0 fires the signal; EV ≥ 0.8 earns a premium badge for highest-conviction setups.
When It Fails: EV negative; the setup is computed but suppressed. A near-miss (EV ≥ -0.2) is tracked as Armed for observation.
Sigmoid Win Probability
Confluence score (0–100) doesn't directly fire a signal. It first converts to a win probability through a sigmoid curve inflected at score 72, capped at a 64% ceiling. The reasoning: even perfect alignment doesn't guarantee >64% win rate in futures markets. A score of 55 maps to ~25% win probability; 72 maps to ~43%; 90 maps to ~61%.
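One logistic curve that reproduces the three quoted points (55 → ~25%, 72 → ~43%, 90 → ~61%) is sketched below. Only the inflection at 72 and the 64% ceiling come from the text. The 22% floor and the steepness of 0.1425 are values fitted here to match those points; they are assumptions, not the engine's parameters.

```javascript
const CEILING = 0.64;     // from the text
const INFLECTION = 72;    // from the text
const FLOOR = 0.22;       // assumed: puts the midpoint at (0.22 + 0.64) / 2 = 0.43
const STEEPNESS = 0.1425; // assumed: fitted so that score 90 lands near 61%

function sigmoidWinProb(score) {
  const logistic = 1 / (1 + Math.exp(-STEEPNESS * (score - INFLECTION)));
  return FLOOR + (CEILING - FLOOR) * logistic;
}

for (const score of [55, 72, 90]) {
  console.log(score, sigmoidWinProb(score).toFixed(3));
}
```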
Historical Win Rate Blending
If the setup type has ≥12 completed trades in the outcome registry, the engine blends historical win rate into the curve estimate. Historical data gets up to 85% weight (curve always contributes at least 15%). If the actual win rate underperforms the curve by >8%, the historical weight is boosted to 92% — the model recognizes it was overestimating and defers to reality.
Expected Value & Multipliers
Expected Value is then computed as: EV = (winProb × R:R) − (1−winProb) × 1R. This is further adjusted by six multipliers: market fit (0.6–1.0×), directional alignment (±5%, −12% for conflict), confirmation status (+4%/−2%), production mode (+3%/−2%), sweep/absorption bonus (+6%), and chop penalty (−5% to −15%). The final win probability is hard-bounded between 15% and 72%.
Grade Thresholds
Grade | Score | Approx Win Prob
S (elite) | ≥90 | ~61%
A+ | 80–89 | ~54%
A | 70–79 | ~40%
B | <70 | <37%
EV Decision Thresholds
EV Decision | Threshold
Signal (Hot Run) | EV ≥ 0.0
Signal (Automated Run) | EV ≥ 0.5
Premium badge | EV ≥ 0.8
Armed (near-miss) | EV ≥ -0.2
Historical blend trigger | ≥12 closed trades
Win prob ceiling | 64% (sigmoid cap)
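The EV formula and the decision thresholds above can be sketched as follows. The state labels and function names are hypothetical, the six multipliers are omitted, and treating an Automated Run setup with EV between -0.2 and 0.5 as Armed is an assumption.

```javascript
// EV in R: win pays R:R, loss costs 1R.
function expectedValue(winProb, rr) {
  return winProb * rr - (1 - winProb) * 1;
}

function evDecision(ev, mode) {
  const fireAt = mode === 'AUTOMATED' ? 0.5 : 0.0; // Hot Run fires at EV >= 0.0
  if (ev >= fireAt) return ev >= 0.8 ? 'SIGNAL_PREMIUM' : 'SIGNAL';
  if (ev >= -0.2) return 'ARMED'; // near-miss, tracked for observation
  return 'SUPPRESSED';
}

// 43% win probability at 2:1 gives 0.43 × 2 − 0.57 = 0.29R.
const exampleEv = expectedValue(0.43, 2);
console.log(exampleEv.toFixed(2), evDecision(exampleEv, 'HOT'), evDecision(exampleEv, 'AUTOMATED'));
```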
Calibration Rationale
The sigmoid inflection at 72 was set empirically: substrate data shows a sharp inflection in outcome quality around that score.
The 64% ceiling is a market-structure constraint — no retail-accessible strategy sustains >64% win rate in liquid futures at meaningful holding periods.
The 12-trade minimum before blending historical WR prevents overfitting to a handful of early results.
Hot Run's 0.0 EV threshold is intentionally permissive — we're collecting signal-quality data, not filtering for profit.
Session & Data Quality
Session Windows, Slippage & Data Quality
64 quality floor · 3.2t MOC
P6
How long the engine remembers, what it assumes about execution cost, and when it declares itself blind.
Memory → Slippage → Quality → Halt
Core Decision
Target memory tightens through the day, slippage is modeled conservatively per session, and data quality below 64/100 halts production.
Quality: score from 100 downward · floor 64/100 → production halted
Feed: < 10 ticks/min = hard stop · < 22 = warning
When It Works: Quality ≥85 (grade A); full production with accurate slippage estimates and fresh target memory.
When It Fails: Quality <64 (grade D) halts the engine. A blank screen beats a wrong one.
Target Memory Windows
Target memory tightens through the trading day. In Asia (slow, level-driven), a liquidity target persists in memory for 26 minutes and up to 130 ticks away. By NY (fast, momentum-driven), the same target is stale at 18 minutes / 100 ticks. The scan range — how far the engine looks for new targets — follows the same tightening: 90 ticks in Asia, 64 in NY.
Slippage Model
Slippage is modeled per session and adjusted by volume regime. Base assumptions range from 1.8 ticks (Asia, thin but orderly) to 3.2 ticks (MOC, thin and chaotic). A volume multiplier scales these: EXTREME volume costs 1.6× the base slippage. Reversion setups get a 0.9× style discount (tighter fills at levels); continuation setups get 1.1× (looser fills chasing). A flat 2-tick round-trip cost is added for commissions and spread.
Data Quality Scoring
Data quality is scored from 100 downward, with deductions for each data gap. Missing price = −35. Stale feed = −24. Missing order book = −5. The quality floor is 64/100 (grade C) — below that, the engine halts production. The philosophy: a blank screen beats a wrong one. Feed health is monitored via a 1-minute sliding window of tick arrivals; below 10 ticks/minute = hard stop, below 22 = warning.
Session Memory Windows
Window | Asia | London | NY
Target memory | 26 min | 22 min | 18 min
Max distance | 130t | 115t | 100t
Scan range | 90t | 78t | 64t
Slippage Assumptions
Slippage | Base (ticks) | At EXTREME vol
Opening Drive | 3.0 | 4.8
Power Hour | 2.6 | 4.16
MOC | 3.2 | 5.12
Lunch | 2.2 | 3.52
London | 2.0 | 3.20
Asia | 1.8 | 2.88
Style adj. | reversion ×0.9 · continuation ×1.1
Round-trip cost | 2 ticks fixed
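The slippage table can be read as base × volume multiplier × style adjustment, plus the fixed round-trip cost. The sketch below assumes that order of operations and uses 1.0 for every regime other than EXTREME, since the text quotes only the EXTREME multiplier; `modeledCostTicks` is a hypothetical name.

```javascript
const BASE_SLIPPAGE_TICKS = {
  'Opening Drive': 3.0,
  'Power Hour': 2.6,
  MOC: 3.2,
  Lunch: 2.2,
  London: 2.0,
  Asia: 1.8,
};
const STYLE_ADJ = { reversion: 0.9, continuation: 1.1 };
const ROUND_TRIP_TICKS = 2;

function modeledCostTicks(session, style, volumeRegime) {
  // Only the EXTREME multiplier is given in the text; others are assumed neutral.
  const volMult = volumeRegime === 'EXTREME' ? 1.6 : 1.0;
  return BASE_SLIPPAGE_TICKS[session] * volMult * STYLE_ADJ[style] + ROUND_TRIP_TICKS;
}

// MOC continuation at EXTREME volume: 3.2 × 1.6 × 1.1 + 2 ≈ 7.63 ticks.
console.log(modeledCostTicks('MOC', 'continuation', 'EXTREME').toFixed(2));
```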
Data Quality Grades
Data Quality Score | Impact
A (≥85) | Full production
B (72–84) | Production with penalty
C (64–71) | Minimum passing
D (<64) | Production halted
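The deduction scoring, grade bands, and feed-health check can be sketched together. Only the three deductions quoted in the text are included; the gap keys, function names, and the `OK` label are hypothetical.

```javascript
// Deductions quoted in the text; the engine may apply others not listed here.
const DEDUCTIONS = { missingPrice: 35, staleFeed: 24, missingOrderBook: 5 };

function qualityScore(gaps) {
  const total = gaps.reduce((sum, gap) => sum + (DEDUCTIONS[gap] || 0), 0);
  return Math.max(0, 100 - total);
}

function qualityGrade(score) {
  if (score >= 85) return 'A';
  if (score >= 72) return 'B';
  if (score >= 64) return 'C';
  return 'D'; // production halted
}

// Ticks per minute over the 1-minute sliding window.
function feedHealth(ticksPerMinute) {
  if (ticksPerMinute < 10) return 'HARD_STOP';
  if (ticksPerMinute < 22) return 'WARNING';
  return 'OK';
}

// A missing price alone scores 65 (grade C); adding a missing order book drops to 60 (grade D).
console.log(qualityGrade(qualityScore(['missingPrice', 'missingOrderBook'])));
```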
Calibration Rationale
Target memory windows tighten because volatility accelerates through the trading day — a level that's relevant for 26 minutes in Asia is ancient by NY.
The slippage model is intentionally conservative: it should overestimate execution cost, not underestimate it.
The MOC session gets the highest base (3.2t) because market-on-close flow creates the most adverse fills relative to displayed liquidity.
The 64/100 quality floor is aggressive by design — it means any two medium-severity data gaps halt the engine.
Serial AND cascade — each gate must pass. Hard blocks terminate immediately. Soft blocks accumulate as WAIT. Advisory gates (Phase A) compute but don't enforce.
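The cascade semantics can be sketched as a loop over gates. This is an illustration under assumptions: the gate object shape (`id`, `advisory`, `check`) and the `BLOCKED` / `WAIT` / `PASS` result labels are hypothetical, chosen to mirror the wording above.

```javascript
// Each gate: { id, advisory?, check(ctx) → { pass, severity: 'hard' | 'soft' } }
function runCascade(gates, ctx) {
  const softBlocks = [];
  for (const gate of gates) {
    const result = gate.check(ctx);
    if (result.pass) continue;
    if (gate.advisory) continue; // advisory gates compute but don't enforce
    if (result.severity === 'hard') {
      return { status: 'BLOCKED', gate: gate.id, softBlocks }; // terminate immediately
    }
    softBlocks.push(gate.id); // soft blocks accumulate
  }
  return { status: softBlocks.length > 0 ? 'WAIT' : 'PASS', softBlocks };
}

console.log(runCascade([
  { id: 'G1', check: () => ({ pass: true }) },
  { id: 'G1.5a', check: () => ({ pass: false, severity: 'soft' }) },
], {}));
```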
Pipeline Gate
Gate 0 — Institutional Hard Blocks
3 sub-gates · hard block
G0
System-level kill switches. Three sub-gates that terminate before anything else runs.
Check → Evaluate → Decide → Record
Core Decision
Any institutional block, symbol kill, or quarantined setup terminates the signal immediately.
Veto: institutional block flag set → signal dies
Kill Switch: per-symbol toggle → hard block
Quarantine: blacklisted setup → compute only, never SIGNAL
Gate Passes: No institutional blocks, symbol is live, setup is not quarantined; the signal proceeds to Gate 1.
Gate Blocks: Signal dies immediately with no appeal. Quarantined setups continue computing for substrate observation only.
Institutional Veto
Gate 0 is the pipeline's first checkpoint and the only one that cannot be overridden by any downstream condition. It runs three independent checks in sequence. Gate 0 (Institutional Veto) scans for any reason the system has flagged as an institutional-level block — circuit breakers, exchange halts, or operator-set kill conditions. If any flag is set, the signal dies immediately with no appeal.
Symbol Kill Switch & Setup Quarantine
Gate 0a (Symbol Kill Switch) is a per-symbol manual toggle. During Phase 1 calibration, ES was locked in analysis-only mode — setups were computed and logged but never promoted to signal status. Gate 0b (Setup Quarantine) checks a static blacklist of setups that failed in live data. Quarantined setups continue computing (substrate observation only) but can never promote to SIGNAL. BUILD PHASE (2026-05-22): all 8 previously quarantined setups reinstated — prior data declared invalid (structurally broken engine). Fresh data collection will re-evaluate all 50 setups on equal footing.
0 | Institutional Veto | Any institutional block reason set → signal dies. No appeal. | Hard
0a | Symbol Kill Switch | Per-symbol toggle. ES was analysis-only during Phase 1 calibration. | Hard
0b | Setup Quarantine | BUILD PHASE: all quarantines lifted (2026-05-23). 0 setups currently benched. Prior quarantine data invalidated by engine restructure. | Hard
Calibration Rationale
The quarantine list is data-driven, not opinion-driven — each benched setup was quarantined after its rolling performance fell below the retirement threshold or showed structural failure (0% WR).
Rather than deleting them, the engine keeps them computing for observation — if market conditions change and the setup rehabilitates in substrate data, it can be unbenched.
The kill switch was used to keep ES in watch-only mode until its own calibration data was collected.
Pipeline Gate
Gate 1 / 1.5 — Market Tradeable + State Confidence
50% confidence floor
G1
Is the market open and classified with enough confidence to act?
Check → Evaluate → Decide → Record
Core Decision
Market must be open, state confidence must be at least 50%, and production mode must be active.
G1: market open + valid state → hard/soft by reason
G1.5a: state confidence < 50% → soft block
G1.5b: tradeMode = NONE → score penalty
G1.5c: regime transition → proportional penalty
Gate Passes: Market is open, engine trusts its classification, and production mode is active; the signal proceeds to bias check.
Gate Blocks: Pre-market or post-close hard-blocks. Low confidence or missing thesis imposes soft score penalties.
Market Tradeable
Gate 1 consults the market context analyzer to determine whether the market is open and in a valid, tradeable state. The severity of any block depends on the reason: a pre-market or post-close condition is a hard block (no point evaluating setups when the market is closed), while a data-quality degradation might only impose a soft penalty.
State Confidence & Transition Risk
Gate 1.5 evaluates three separate confidence dimensions. 1.5a (State Confidence) requires the engine's own classification of the current market context to be at least 50% confident. Below that threshold, the engine doesn't trust its regime classification — it might be calling a "trending" market when it's actually rotating, and every downstream gate depends on that classification being at least plausible. 1.5b (Production Mode) penalizes when tradeMode = NONE — no sweep+trap pattern detected and no trend exception active. This means the engine has no high-confidence thesis about what the market is doing. 1.5c (Transition Risk) applies a proportional penalty when the market is transitioning between regimes (e.g., from range to trend), because regime boundaries are statistically where false signals cluster.
1 | Market Tradeable | Market context analyzer says the market is open and valid. Severity-based: hard or soft depending on the block reason. | Var
1.5a | State Confidence ≥ 50 | Production state must be at least 50% confident. Below that the engine doesn't trust its own classification. | Soft
1.5b | Production Mode | If tradeMode = NONE (no sweep+trap, no trend exception), score takes a penalty. | Soft
1.5c | Transition Risk | Market transitioning between regimes → score penalty proportional to transition risk score. | Soft
Calibration Rationale
The 50% confidence floor is generous — it's a sanity check, not a filter. Below 50 the engine is essentially guessing what kind of market this is, and guessing is not a strategy.
Transition penalties exist because regime boundaries are where false signals cluster — a mean-reversion setup that fires during a regime transition to trend will get run over.
Pipeline Gate
Gate 2.5 — Bias Arbitrator
55% min · 8% edge
G2.5
Does the setup's direction match the market's directional bias? Five modules vote.
Check → Evaluate → Decide → Record
Core Decision
Setup direction must align with the declared bias from five voting modules.
Bias: side > 55% AND lead ≥ 8% → directional bias declared
The Bias Arbitrator aggregates votes from five independent directional modules — each contributes a long score and a short score based on its own evidence. The arbitrator then applies a voting framework: if one side exceeds 55% and leads the other by at least 8%, a directional bias is declared. If both sides exceed 55% within 8% of each other, the state is CONFLICT — not uncertainty, but genuine contradictory evidence, which is worse for signal quality than having no bias at all.
Sweep Override & Continuation Fallback
The gate checks whether the proposed setup's direction aligns with the declared bias. A long setup in a SHORT bias environment gets a penalty. However, a critical exception exists: if there's an actionable sweep or absorption event, the bias check is skipped entirely. The logic is that liquidity events (stop hunts, institutional absorption at a level) override directional bias — a trapped short squeeze doesn't care what the trend says.
The CONTINUATION_FALLBACK state deserves attention: when bias direction matches but confidence is below the decisive threshold, the arbitrator falls back to a continuation assumption. Cohort attribution analysis found this fallback costs approximately -1.6R on average — it's a known weak spot that ships as a soft penalty rather than a hard block, awaiting more data to decide its fate.
Arbitrator Tuning
Tuning | Value
Side minimum | 55% (score to declare side)
Side edge | 8% (L-S gap for decisive call)
Conflict minimum | 55% (both sides ≥ this = CONFLICT)
Conflict edge | 8% (max gap to be "conflict" not "edge")
None max | 55% (below = no direction)
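The tuning values above can be sketched as a decision function over the aggregated long and short scores. The text does not define the case where one side clears 55% but leads by less than 8% while the other stays under 55%; the sketch returns a hypothetical `INDECISIVE` label there, which is where a continuation fallback would plausibly apply. `arbitrateBias` is an illustrative name.

```javascript
const SIDE_MIN = 55; // percent
const SIDE_EDGE = 8; // percentage points

function arbitrateBias(longScore, shortScore) {
  const gap = Math.abs(longScore - shortScore);
  const top = Math.max(longScore, shortScore);
  // Both sides have a case and neither leads decisively.
  if (longScore >= SIDE_MIN && shortScore >= SIDE_MIN && gap < SIDE_EDGE) return 'CONFLICT';
  // One side clears the minimum and leads by at least the edge.
  if (top > SIDE_MIN && gap >= SIDE_EDGE) return longScore > shortScore ? 'LONG' : 'SHORT';
  if (top < SIDE_MIN) return 'NONE';
  return 'INDECISIVE'; // hypothetical label for the case the table leaves open
}

console.log(arbitrateBias(70, 40), arbitrateBias(60, 58), arbitrateBias(40, 50));
```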
Calibration Rationale
The 8% edge requirement prevents firing on razor-thin directional advantages that could flip on the next bar.
CONFLICT state isn't "we don't know" — it's "both sides have a case," which is empirically worse for signal outcomes than NONE (where neither side has evidence).
The sweep/absorption override exists because liquidity events carry their own directional conviction independent of trend.
Pipeline Gate
Gate 2.6 / 2.7 — Market Story + Chop
55 fit · severe chop = kill
G2.6
Does the setup family match what the auction is doing — and is the market in chop?
Check → Evaluate → Decide → Record
Core Decision
Setup family must fit the auction narrative; severe chop without liquidity events kills the signal.
G2.6: story fit < 55 → soft penalty · fit < 35 + score < 60 + CHOP_NO_EDGE → hard block
G2.7: SEVERE_NO_EDGE_CHOP without sweep/absorption → hard block
Gate Passes: Setup family fits the auction narrative with a fit score of at least 55, and chop is mild enough to allow continuation.
Gate Blocks: Severe chop hard-blocks the signal. Low story fit imposes soft penalty; extreme mismatch escalates to hard block.
Market Story Fit
Gate 2.6 (Market Story Fit) computes a 0-100 score measuring how well the proposed setup family matches the current auction narrative. A continuation setup in a trending market scores high; the same setup in a rotating, mean-reverting market scores low. Below 55, the signal takes a soft penalty. Below 35, with a raw confluence score below 60 and the market classified as CHOP_NO_EDGE, the gate escalates to a hard block — the thesis has no structural support.
Chop Permissions
Gate 2.7 (Chop Permissions) is the engine's most aggressive defensive gate. When the market story evaluator classifies the environment as SEVERE_NO_EDGE_CHOP — meaning no directional conviction, no institutional footprint, and no structural level nearby — the gate hard-blocks unless there's an actionable sweep or absorption event. Other chop states (MILD_CHOP, ROTATIONAL_CHOP) impose graduated score penalties but allow the signal to continue. The distinction matters: mild chop can produce legitimate mean-reversion setups at levels, but severe chop is a fee-burning machine.
2.6 | Market Story Fit | Market fit score 0–100. Below 55 → soft block. Below 35 with score <60 and story = CHOP_NO_EDGE → hard block. | Soft
2.7 | Chop Permissions | SEVERE_NO_EDGE_CHOP without actionable sweep/absorption → hard block. Other chop states → score penalty. | Hard*
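The two gates can be sketched as small decision functions. The result labels and function names are hypothetical, and letting a severe-chop signal with an actionable sweep or absorption pass without any penalty is an assumption; the text only says the hard block is lifted.

```javascript
// Gate 2.6: story fit, with escalation to a hard block on extreme mismatch.
function gate26(storyFit, confluenceScore, story) {
  if (storyFit < 35 && confluenceScore < 60 && story === 'CHOP_NO_EDGE') return 'HARD_BLOCK';
  if (storyFit < 55) return 'SOFT_PENALTY';
  return 'PASS';
}

// Gate 2.7: chop permissions.
function gate27(chopState, hasSweepOrAbsorption) {
  if (chopState === 'SEVERE_NO_EDGE_CHOP') {
    return hasSweepOrAbsorption ? 'PASS' : 'HARD_BLOCK'; // PASS here is assumed
  }
  if (chopState === 'MILD_CHOP' || chopState === 'ROTATIONAL_CHOP') return 'SCORE_PENALTY';
  return 'PASS';
}

console.log(gate26(30, 55, 'CHOP_NO_EDGE'), gate27('SEVERE_NO_EDGE_CHOP', true));
```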
Calibration Rationale
CHOP_NO_EDGE is the engine saying "this market is going nowhere and there's no institutional footprint to ride." Hard-blocking it prevents the most expensive error in trading: forcing a trade when there's nothing to trade.
The 35/60 threshold is deliberately low — only the clearest "no edge" conditions trigger it.
The sweep/absorption override exists because these events can break a chop regime entirely.
Pipeline Gate
Gate 3 / 3.5 — Regime + Session Playbook
3 sessions · 6 families
G3
Setup style must match market regime, and only approved families run per session.
G3 STRICT: style mismatch → hard block · low_participation → hard block
G3 BALANCED: style mismatch → soft penalty
G3.5: setup outside session playbook → blocked or downgraded
Gate Passes: Setup style matches regime and is in the session's approved playbook; the signal proceeds to state validity checks.
Gate Blocks: Style mismatch blocks (STRICT) or penalizes (BALANCED). Low participation hard-blocks all setups regardless.
Regime Compatibility
Gate 3 (Regime Compatibility) enforces that continuation setups only fire in trend/expansion regimes, and reversion setups only fire in range/rotation regimes. In STRICT mode, a style mismatch is a hard block — a breakout setup in a ranging market simply cannot fire. In BALANCED mode, the mismatch imposes a soft penalty, allowing the setup to continue at a score disadvantage. Both modes hard-block all setups in low_participation regimes, because thin markets produce unreliable signals regardless of setup quality.
Session Playbook
Gate 3.5 (Session Playbook) restricts which setup families are approved for each trading session. Asia (slow, level-driven) permits key-mag, exhaust-rev, and vpoc-mig — setups that work with clear levels and minimal flow. London permits ib-reject and trap-rev — setups that exploit the London open's stop-hunting patterns. NY (institutional flow-driven) permits ib-brk, ib-ext, and flow-surge — setups that ride directional momentum. Setups outside their playbook are either blocked (STRICT) or downgraded (BALANCED). The playbook is calibrated from historical outcome data grouped by session, not from theory about what "should" work.
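The session playbook amounts to a lookup keyed by session. A minimal sketch using the families named above; `gate35` and the result labels are hypothetical.

```javascript
// Approved setup families per session, as listed in the text.
const PLAYBOOK = {
  Asia: ['key-mag', 'exhaust-rev', 'vpoc-mig'],
  London: ['ib-reject', 'trap-rev'],
  NY: ['ib-brk', 'ib-ext', 'flow-surge'],
};

function gate35(session, family, mode) {
  const approved = PLAYBOOK[session] || [];
  if (approved.includes(family)) return 'PASS';
  return mode === 'STRICT' ? 'BLOCKED' : 'DOWNGRADED'; // BALANCED downgrades
}

console.log(gate35('Asia', 'ib-brk', 'STRICT')); // breakout family outside the Asia playbook
```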
3 | Regime Compatibility | Continuation setups allowed in trend/expansion. Reversion setups allowed in range/rotation. Banned in low_part. STRICT = hard block, BALANCED = soft.
Not every setup works in every session. Asia is slow and level-driven — breakout setups waste capital chasing moves that don't follow through.
NY has real institutional flow — reversal setups against that flow lose.
The low_participation hard block protects against the deadliest trap: a "perfect" setup in an empty market where there's no one to move price in your favor.
Pipeline Gate
Gate 3.6 / 3.7 — State Validity + Dominance
gap + breakout + dominance
G3.6
Gap authority, breakout direction, and tier-1 dominance scope.
Check → Evaluate → Decide → Record
Core Decision
Setup must not fight gap magnetism, clean breakouts, or active tier-1 signals.
G3.6a: gap opposes direction → soft block (sweep+trap overrides)
G3.6b: untrapped breakout opposite → soft block
G3.7: tier-1 active (ldn-sweep, fail-auc) → lower tiers yield
Gate Passes: No gap conflict, no opposing breakout, no tier-1 dominance contention; the signal proceeds to institutional checks.
Gate Blocks: Soft penalties for gap or breakout headwinds. Lower-tier setups suppressed when tier-1 is active.
Gap Authority & Breakout State
Gate 3.6a (Gap Authority) checks whether a significant overnight gap conflicts with the setup's direction. A long setup when there's a large bearish gap (price opened well below prior close) faces headwind — the gap acts as an overhead magnet pulling price back. This is a soft block, not hard, because a confirmed sweep+trap can override the gap's authority.
Gate 3.6b (Breakout State) penalizes setups that fight a clean, untrapped breakout. If price has broken out of a range in one direction and no trap has been confirmed, setups pointing the opposite way are swimming upstream.
Tier-1 Dominance
Gate 3.7 (Tier-1 Dominance) is a priority arbitration mechanism: when a tier-1 setup (London sweep, failed auction — the highest-conviction patterns in the system) is active, lower-tier setups in the opposite direction are suppressed. Even lower-tier setups in the same direction yield — the tier-1 setup takes the slot. This prevents signal clutter and ensures the best signal gets the attention.
3.6a | Gap Authority | Significant gap conflicts with setup direction → soft block, unless sweep+trap is active. | Soft
3.6b | Breakout State | Clean breakout opposite to setup direction without trap confirmation → soft block. | Soft
3.7 | Tier-1 Dominance | When a tier-1 setup (ldn-sweep, fail-auc) is active in opposite direction, lower-tier setups yield. Same direction but lower tier also yields. | Soft
Calibration Rationale
When the best setup in the system is pointing one way, lesser setups pointing the other way should defer.
Tier-1 setups have the strongest historical edge — when they speak, the rest of the pipeline listens.
The gap override for sweep+trap reflects that institutional stop-hunting events can negate gap magnetism entirely.
Pipeline Gate
Gate 3.8 / 3.9 — Whale + OI Conviction
whale · OI · soft
G3.8
Institutional positioning checks — are whales fighting you, and does open interest contradict your trade?
Check → Evaluate → Decide → Record
Core Decision
Institutional DOM clustering and open interest dynamics must not contradict the setup's direction.
G3.8: whale defense opposes direction → soft penalty · alignment = research note
G3.9: SHORT_COVERING + long → penalty · LONG_CAPITULATION + short → penalty
Gate Passes: Institutional positioning doesn't contradict the setup, or aligns with it; the signal proceeds to final qualification.
Gate Blocks: Soft penalties for opposing whale defense or OI contradictions. These are not hard blocks, because institutions can be wrong at turning points.
Whale Defense
Gate 3.8 (Whale Defense) reads DOM (Depth of Market) clustering patterns. When institutional-sized orders cluster at resistance above the current price (defense-resistance) or at support below (defense-support), the engine detects "whale defense" — large players actively protecting a level. If their defense opposes the setup's direction (e.g., massive sell-side clustering at resistance while a long setup tries to fire), the setup takes a soft penalty. If whale defense aligns with the setup, it's logged as a research note but doesn't adjust the score — alignment is expected, not bonus-worthy.
OI Conviction
Gate 3.9 (OI Conviction) cross-references open interest dynamics with the setup's direction. A long setup firing during a SHORT_COVERING environment gets penalized: the rally looks like buying, but OI falling says it's shorts closing, not new longs entering — the rally is structural unwind, not genuine demand. Similarly, a short setup during LONG_CAPITULATION faces a penalty because the selloff is exhaustion, not new bearish conviction. Counter-trend penalties apply when longing against REAL_DOWNTREND or shorting against REAL_UPTREND. All OI gates are soft penalties because institutional positioning can be wrong, and a strong enough setup-level signal can override positioning headwinds.
3.8 | Whale Defense | Institutional DOM clustering (defense-resistance, defense-support) opposing setup direction → soft penalty. Alignment = research note only. | Soft
3.9 | OI Conviction | Long setup + SHORT_COVERING → "rally is fake." Short setup + LONG_CAPITULATION → "selloff is exhaustion." Long vs REAL_DOWNTREND or short vs REAL_UPTREND → counter-trend penalty. | Soft
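The two soft gates above can be sketched as simple lookups. This is a minimal illustration in Python, not the engine's code: the function names, the reason strings, and the reading that defense-support opposes a short are assumptions, and the atlas gives no penalty magnitudes, so the sketch returns a reason or verdict label rather than a number.

```python
from typing import Optional

# Contradictions named in the Gate 3.9 description. The reason strings are
# hypothetical paraphrases; the atlas does not give penalty sizes.
OI_CONTRADICTIONS = {
    ("LONG", "SHORT_COVERING"): "rally is short covering, not new longs",
    ("SHORT", "LONG_CAPITULATION"): "selloff is exhaustion, not new shorts",
    ("LONG", "REAL_DOWNTREND"): "counter-trend long",
    ("SHORT", "REAL_UPTREND"): "counter-trend short",
}


def oi_conviction_penalty(direction: str, oi_state: str) -> Optional[str]:
    """Gate 3.9: return the soft-penalty reason, or None if OI does not contradict."""
    return OI_CONTRADICTIONS.get((direction, oi_state))


def whale_defense_verdict(direction: str, defense: Optional[str]) -> str:
    """Gate 3.8: opposing defense is a soft penalty; alignment is only a research note.

    Assumption: defense-resistance opposes longs and defense-support opposes shorts.
    """
    if defense is None:
        return "none"
    opposes = (direction == "LONG" and defense == "defense-resistance") or (
        direction == "SHORT" and defense == "defense-support"
    )
    return "soft_penalty" if opposes else "research_note"
```

Neither function can return a hard block, which mirrors the "soft only" design of both gates.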
Calibration Rationale
These are the "bigger fish" gates — if institutional positioning contradicts the setup, the setup is fighting the wrong crowd.
Short covering rallies look identical to real buying on the tape, but they die once the shorts finish covering. OI dynamics distinguish real conviction from structural unwind.
Soft penalties, not hard blocks, because institutions can be wrong too — and often are at turning points.
Pipeline Gate
Gate 4–8 — Score, Confirmation & Risk
5 gates · retire to RR
G4-8
Final qualification: retirement check, confluence threshold, critical fails, entry timing, and risk/reward validation.
Check · Evaluate · Decide · Record
Core Decision
Setup must pass retirement, confluence, critical-fail, confirmation, and R:R checks to fire.
G8: R:R < rrFloor (1.5 HOT / 2.0 AUTOMATED) → BLOCKED
Gate Passes: Setup is not retired, confluence meets threshold, no critical failures, R:R is at or above rrFloor (1.5 HOT / 2.0 AUTOMATED); eligible for SIGNAL.
Gate Blocks: Retirement benches the setup. Two or more critical fails hard-block. R:R below floor forces BLOCKED regardless of score quality.
Retirement & Confluence
Gate 4 (Retirement) checks the setup's rolling expectancy across its last 12 closed trades. If expectancy falls below -0.12R with at least 6 completed trades, the setup is benched: it keeps computing (for substrate observation) but cannot fire while benched. Between -0.05R and -0.12R it's flagged as "Degrading" with a -5 priority penalty. Gate 5 (Confluence Threshold) is the raw score check: the setup's weighted confluence must reach a dynamic threshold (~68% base, adjusted by volume regime and qualification profile). BALANCED mode gets a -4 offset, making the threshold more permissive during data collection.
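A minimal Python sketch of the Gate 4 retirement check, using the thresholds stated above. The function name, the status labels, and the boundary handling are assumptions. The atlas also does not say what happens below -0.12R with fewer than 6 trades; this sketch treats that case as Degrading.

```python
from statistics import mean
from typing import Sequence

WINDOW = 12            # last closed trades considered
MIN_TRADES = 6         # minimum sample before a setup can be benched
BENCH_BELOW = -0.12    # expectancy (in R) below which the setup is benched
DEGRADE_BELOW = -0.05  # expectancy at or below which the setup is "Degrading"


def retirement_status(closed_r: Sequence[float]) -> str:
    """Classify a setup from the R-multiples of its closed trades (oldest first)."""
    recent = list(closed_r)[-WINDOW:]
    if not recent:
        return "active"
    expectancy = mean(recent)
    if expectancy < BENCH_BELOW and len(recent) >= MIN_TRADES:
        return "benched"
    if expectancy <= DEGRADE_BELOW:
        # Assumption: also covers "below -0.12R but sample too small to bench".
        return "degrading"
    return "active"
```

Because only the last 12 trades are read, a benched setup whose recent results recover would drop back to "active" under this sketch, which is one way to read the auto-rehabilitation described elsewhere in the atlas.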
Critical Fails & Entry Confirmation
Gate 6 (Critical Fails) catches single-point-of-failure conditions. If any individual evaluator with weight ≥18 fails (scores zero), that's a critical fail. One critical fail is logged as a warning. Two or more critical fails trigger a hard block — the thesis has multiple load-bearing pillars that collapsed. Gate 7 (Entry Confirmation) requires setups below the bypass threshold (~75-80%) to show confirming price action, order flow, or liquidity signals before promoting. Above the bypass threshold, the confluence itself is confirmation enough.
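The counting rule for Gate 6 and the bypass rule for Gate 7 can be sketched as follows. This is an illustrative Python sketch: the function names, the (weight, score) input shape, and the default bypass value of 75 (the atlas only says roughly 75 to 80%) are assumptions.

```python
from typing import List, Mapping, Tuple

CRITICAL_WEIGHT = 18  # evaluators at or above this weight are load-bearing


def critical_fail_verdict(
    evaluators: Mapping[str, Tuple[int, float]]
) -> Tuple[str, List[str]]:
    """Gate 6: evaluators maps name -> (weight, score); a zero score is a fail."""
    failed = sorted(
        name
        for name, (weight, score) in evaluators.items()
        if weight >= CRITICAL_WEIGHT and score == 0
    )
    if len(failed) >= 2:
        return "blocked", failed   # two or more collapsed pillars: hard block
    if len(failed) == 1:
        return "warning", failed   # one collapsed pillar: logged only
    return "pass", failed


def needs_entry_confirmation(score_pct: float, bypass_pct: float = 75.0) -> bool:
    """Gate 7: below the bypass threshold the setup must show confirmation."""
    return score_pct < bypass_pct
```

A failed evaluator with weight below 18 never counts toward the block, however many of them fail.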
Risk & R:R Tiering
Gate 8 (Risk & R:R) validates the stop/target geometry. R:R is calculated from the planned entry, stop, and target prices. The floor is phase-dependent: 1.5:1 in HOT Run, 2.0:1 in Automated Run. Below that, the setup is BLOCKED regardless of confluence. R:R bands (1.5–2.0, 2.0–2.5, 2.5–3.0, 3.0–4.0, 4.0+) are tracked for statistical analysis but don't gate signal states. Calibration tuning clamps dynamic RR to the range 2.5–6.0 (config.js minRRGate/maxRRGate).
GEX modifier (2026-05-21): dynamic R:R adjusts by at most ±0.40R based on options gamma exposure:
LONG near a hard call wall: +0.25
SHORT near a hard put wall: +0.25
COILED regime (favors breakouts): -0.15
PINNED regime (favors mean-reversion): -0.10
CASCADE regime (penalizes fades): +0.40
Research-constrained: GEX modifiers only fire in CALM/NORMAL VIX (FlashAlpha 8yr: GEX adds zero in elevated/stressed VIX).
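The floor check and the statistical banding can be sketched in a few lines of Python. The names below are hypothetical, the geometry check is an added safeguard, and treating each band's lower edge as inclusive is an assumption; the GEX modifier and the 2.5 to 6.0 clamp are left out because the atlas does not specify how they combine with the floor.

```python
from bisect import bisect_right

RR_FLOOR = {"HOT": 1.5, "AUTOMATED": 2.0}  # phase-dependent floors from the text
BAND_EDGES = [1.5, 2.0, 2.5, 3.0, 4.0]
BAND_LABELS = ["<1.5", "1.5-2.0", "2.0-2.5", "2.5-3.0", "3.0-4.0", "4.0+"]


def risk_reward(entry: float, stop: float, target: float) -> float:
    """R:R from planned prices; works for longs and shorts."""
    risk = entry - stop
    reward = target - entry
    if risk * reward <= 0:
        # Stop and target must sit on opposite sides of the entry.
        raise ValueError("invalid stop/target geometry")
    return abs(reward) / abs(risk)


def gate8(entry: float, stop: float, target: float, phase: str = "HOT") -> str:
    """Hard gate: BLOCKED when R:R is below the floor for the current phase."""
    return "BLOCKED" if risk_reward(entry, stop, target) < RR_FLOOR[phase] else "PASS"


def rr_band(rr: float) -> str:
    """Statistics-only band label; it never changes the signal state."""
    return BAND_LABELS[bisect_right(BAND_EDGES, rr)]
```

The same plan can pass in HOT Run and be blocked in Automated Run, since only the floor changes between phases.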
5 | Confluence Threshold | Score must reach dynamic threshold (base ~68%, STRICT offset 0, BALANCED offset -4). Gap logged if below. | Soft
6 | Critical Fails | Any individual evaluation with weight ≥18 that fails. One = logged. Two or more = blocked. | Hard*
7 | Entry Confirmation | If score below bypass threshold (~75–80%), requires price action / order flow / liquidity confirmation. | Soft
8 | Risk & R:R Tier | Validates stop/target feasibility. Below rrFloor (1.5 HOT / 2.0 AUTOMATED) = BLOCKED. R:R bands tracked for statistics. | Hard
Calibration Rationale
Retirement prevents the engine from throwing good money after bad.
The critical-fail gate (weight ≥18) catches single-point-of-failure conditions: one failed heavyweight check is logged as a warning because it makes the thesis suspect regardless of total score, and two or more block the setup.
R:R tiering means even a high-confluence signal doesn't fire if the risk/reward math doesn't work — this is where the engine enforces the asymmetry that makes edge-based trading viable.
Pipeline Gate
Gate 9 / 9b / 10 — Advisory (Phase A Off)
3 gates · advisory only
ADV
Computed but not enforced. Will block signals when switched on after calibration proves them.
Check · Evaluate · Decide · Record
Core Decision
Advisory gates log what they would block but do not enforce — evidence before enforcement.
G9: R:R > NQ 2.5:1 / ES 4.0:1 → would hard-block (advisory)
G9b: direction violation → would hard-block · style violation → would soft-block (advisory)
G10: permanently disabled; 840-trade OOS showed it destroyed value
Gate Passes: All advisory gates currently pass by default; they compute and log but never block during Phase A.
Gate Blocks: No enforcement in Phase A. When activated post-calibration, G9 and G9b would hard/soft block as documented.
R:R Inflation Cap
Gate 9 (R:R Inflation Cap) addresses a structural bias in the engine's trade planner: while historical resolved trades show a median R:R of ~1.5:1, the planner consistently produces 4.5:1 median projections. The cap would hard-block any signal where the projected R:R exceeds 2.5:1 for NQ or 4.0:1 for ES. Currently advisory-only: the engine logs what it would have blocked, but doesn't enforce, pending calibration data proving that the cap improves outcomes rather than just filtering high-conviction setups.
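The "log but do not enforce" pattern can be sketched as below. This is a hedged Python illustration: the function name, the enforce flag, and the log record fields are assumptions, while the two caps come from the text.

```python
from typing import List, Optional

RR_CAP = {"NQ": 2.5, "ES": 4.0}  # projected R:R caps from the text


def gate9(
    symbol: str,
    projected_rr: float,
    enforce: bool = False,          # Phase A: advisory only
    log: Optional[List[dict]] = None,
) -> bool:
    """Return True when the signal may proceed past Gate 9."""
    would_block = projected_rr > RR_CAP[symbol]
    if log is not None:
        # The advisory verdict is always recorded, enforced or not.
        log.append(
            {
                "gate": "G9",
                "symbol": symbol,
                "projected_rr": projected_rr,
                "would_block": would_block,
                "enforced": enforce,
            }
        )
    return not (would_block and enforce)
```

With enforce left at False the gate never stops a signal, but the log still accumulates the would-block verdicts that calibration needs.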
Regime Fit Gate & Session Router
Gate 9b (Regime Fit Gate) layers a higher-timeframe regime check on top of Gate 3's style compatibility. It reads the daily regime bias and applies two sub-checks: a direction violation (shorting in a trend_up day) would be a hard block; a style violation (breakout setup in a range day) would be a soft block. Like Gate 9, it's advisory-only pending data proof. Gate 10 (Session Router) is permanently disabled. Out-of-sample validation on 840 historical trades showed it blocked profitable families more often than unprofitable ones — it destroyed net value. Kept in the codebase as an architectural placeholder and a reminder that "intuitive" filters can fail empirical validation.
9 | R:R Inflation Cap | NQ capped at 2.5:1, ES at 4.0:1. Historical median RR was 1.5:1 but planner produces 4.5:1 median. Would hard-block inflated projections. | Off
9b | Regime Fit Gate | Day regime bias + style constraints. Short setup in trend_up → direction violation. Breakout in range → style violation. Direction = hard, style = soft (when enforced). | Off
10 | Session Router | Permanently disabled. OOS validation on 840 trades showed it blocked profitable families. Kept as architectural placeholder. | Off
Calibration Rationale
Phase A philosophy: observe before enforcing. These gates log advisory verdicts so we can see what they would have blocked.
When the data proves a gate adds edge, it gets switched on.
Gate 10 was explicitly rejected — 840-trade OOS validation showed it destroyed value. Ideas that sound right but lose money get killed, not debated.
Pipeline Gate
Final Decision States
6 states · SIGNAL to fire
DEC
What the pipeline produces after all gates run.
Check · Evaluate · Decide · Record
Core Decision
Pipeline resolves each evaluation into one of six discrete states — only SIGNAL can fire.
SIGNAL: all gates pass + R:R confirmed → can fire
ARMED: strong confluence, waiting on one condition → auto-promotes if resolved
BLOCKED: hard gate failed → structural rejection
Gate Passes: SIGNAL state reached; the setup fires. All six states plus the determining gate are written to the substrate.
Gate Blocks: ARMED, WATCH, BLOCKED, CONFLICT, or PAUSED; each conveys why the signal didn't fire for trader review.
Signal & Armed States
After every gate has executed, the pipeline resolves the signal into one of six discrete states. SIGNAL means every gate passed and R:R is confirmed — this is the only state that can fire. ARMED means confluence is strong and most gates passed, but the signal is waiting for one remaining condition (typically R:R confirmation or entry timing). Armed signals are near-miss candidates: if the pending condition resolves, the signal promotes automatically on the next evaluation cycle.
Non-Firing States
WATCH is the passive observation state — the setup is being evaluated but isn't close enough to actionability. BLOCKED means at least one hard gate failed — structural rejection with no workaround. CONFLICT is a distinct state from BLOCKED: it means the directional evidence is genuinely split, and the engine refuses to pick a side. PAUSED is operator-initiated — the trader manually paused this symbol via the dashboard's per-symbol pause button. All six states, along with the specific gate that determined them, are written to the substrate for every evaluation cycle. The trader sees all of this on the dashboard and can act on contextual nuance the gate system can't encode.
Decision State Matrix
State | Can Fire | Meaning
SIGNAL | YES | All gates pass, R:R confirmed
ARMED | NO (auto-promotes if resolved) | Strong confluence, waiting R:R or confirm
WATCH | NO | Watching, not actionable
BLOCKED | NO | Hard gate failed
CONFLICT | NO | Directional conflict
PAUSED | NO | Trader paused this symbol
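The six states map naturally onto an enumeration. A minimal Python sketch follows; the class and function names are hypothetical, and the promotion helper only models the ARMED rule described above.

```python
from enum import Enum


class DecisionState(Enum):
    SIGNAL = "SIGNAL"
    ARMED = "ARMED"
    WATCH = "WATCH"
    BLOCKED = "BLOCKED"
    CONFLICT = "CONFLICT"
    PAUSED = "PAUSED"


def can_fire(state: DecisionState) -> bool:
    """Only SIGNAL can fire; every other state is informational."""
    return state is DecisionState.SIGNAL


def next_cycle(state: DecisionState, pending_resolved: bool) -> DecisionState:
    """ARMED promotes to SIGNAL on the next evaluation if its pending condition resolved."""
    if state is DecisionState.ARMED and pending_resolved:
        return DecisionState.SIGNAL
    return state
```

Keeping the state as an enum rather than a boolean preserves the reason a setup did not fire, which is the point of having six states.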
Calibration Rationale
Six states, not two. Binary pass/fail loses information.
ARMED means "this was close — if one condition flips, it fires." CONFLICT means "the evidence is split."
The granularity exists because the human trader needs to know WHY something didn't fire, not just that it didn't — the context informs manual decisions and post-session review.
Scoring Framework
Global Confluence Categories
6 categories · 100 pts
CFG
Every setup scores from the same 6-category budget. Each category has a fixed weight — conditions within it divide those points.
Category · Weight · Score · Threshold
Core Decision
Every setup scores against a fixed 100-point budget divided into six categories — same budget, different distributions per thesis.
Pass: Setup scores above confluence threshold; qualifies for signal generation and trader presentation.
Fail: Setup scores below threshold; filtered out, logged to substrate for analysis only.
Budget Architecture
Every setup in the engine — regardless of family — scores against the same 100-point budget divided into six categories. The budget is fixed but the distribution is not: each setup type allocates different weights to different conditions within each category, reflecting what matters for that specific thesis. An IB breakout weights trigger conditions heavily; an exhaustion reversal weights flow divergence conditions.
Category Rationale
Location receives the largest allocation (25 points) because where price sits relative to key structural levels is the single strongest predictor of whether a setup succeeds. A perfect trigger at a meaningless level is noise; a mediocre trigger at a critical level is a trade. Regime (20) and Trigger (20) share the next tier — regime ensures the market environment supports the thesis, while the trigger is the specific event that creates the opportunity. Flow (15) confirms the thesis with real-time order flow evidence. R:R (10) scores the mathematical quality of the stop/target geometry. Risk Filters (10) are binary safety checks — session timing, stop-hunt clearance, book stability — that don't generate edge but prevent easily avoidable losses.
Calibration Rationale
Location gets the most weight (25) because where price is relative to key levels matters more than any single trigger; this is the "where" that defines the trade.
Flow gets less (15) because it confirms but doesn't initiate.
Risk filters (10) are binary safety checks, not edge generators.
Every setup sums to exactly 100 points (different distributions, same budget), ensuring apples-to-apples comparison across families.
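The fixed budget can be written down and checked directly. In this Python sketch the dictionary keys and the idea of passing a "fraction met" per category are assumptions made for illustration; the six weights are the ones stated above.

```python
from typing import Mapping

# Category weights from the text; key names are hypothetical.
CATEGORY_WEIGHTS = {
    "location": 25,
    "regime": 20,
    "trigger": 20,
    "flow": 15,
    "rr": 10,
    "risk_filters": 10,
}

# The budget invariant: every setup scores against exactly 100 points.
assert sum(CATEGORY_WEIGHTS.values()) == 100


def total_score(fraction_met: Mapping[str, float]) -> float:
    """Sum each category's weight times the fraction (0.0 to 1.0) of it that was met."""
    return sum(
        weight * fraction_met.get(name, 0.0)
        for name, weight in CATEGORY_WEIGHTS.items()
    )
```

A setup that satisfies only Location in full scores 25 under this sketch, well short of a roughly 68-point threshold, which illustrates why no single category can carry a signal alone.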
Qualification Mode
Qualification Profiles
STRICT vs BALANCED
QP
Two modes that control how many setups can fire and how selective the engine is.
Category · Weight · Score · Threshold
Core Decision
STRICT maximizes signal quality; BALANCED maximizes data collection — Phase A runs BALANCED deliberately.
Pass: Setup meets profile requirements; proceeds through pipeline to signal generation.
Fail: Setup rejected by profile constraints; hard block (STRICT) or downgraded score (BALANCED).
Profile Mechanics
The engine operates in one of two qualification profiles that control selectivity across the entire pipeline. STRICT is sniper mode: one setup per symbol, minimum 2.2:1 R:R, no score offset, and regime/playbook mismatches are hard blocks. BALANCED is scouting mode: two setups per symbol can compete, 1.8:1 R:R floor, -4 score offset (lowering the effective confluence threshold), and regime/playbook mismatches impose soft penalties rather than hard blocks.
Phase A Strategy
Phase A (current) runs BALANCED to maximize data collection. The engine intentionally fires more signals to build a statistical sample for each setup type. Once enough data accumulates to reliably distinguish setup quality, the profile will shift to STRICT — fewer signals, higher average quality, lower noise. The -4 score offset in BALANCED mode means a setup that needs 68 in STRICT only needs 64 in BALANCED. This isn't a quality compromise — it's deliberate observational permissiveness.
Profile Parameters
Parameter | STRICT | BALANCED
Setups / symbol | 1 | 2
Min R:R | 2.2 | 1.8
Score offset | 0 | -4
Regime gate | Hard block | Soft (downgrade)
Playbook gate | Hard block | Soft (downgrade)
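The two profiles reduce to a small parameter record. The Python sketch below uses hypothetical names and ignores the volume-regime adjustment to the threshold, which the atlas mentions but does not quantify; the parameter values are taken from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    setups_per_symbol: int
    min_rr: float
    score_offset: int
    mismatch_is_hard_block: bool  # regime / playbook mismatch handling


STRICT = Profile(setups_per_symbol=1, min_rr=2.2, score_offset=0, mismatch_is_hard_block=True)
BALANCED = Profile(setups_per_symbol=2, min_rr=1.8, score_offset=-4, mismatch_is_hard_block=False)

BASE_THRESHOLD = 68  # approximate base confluence threshold from the text


def effective_threshold(profile: Profile, base: int = BASE_THRESHOLD) -> int:
    """The score offset shifts the confluence threshold (68 in STRICT, 64 in BALANCED)."""
    return base + profile.score_offset


def qualifies(score: float, rr: float, profile: Profile) -> bool:
    """Simplified check: score and R:R against the profile's limits only."""
    return score >= effective_threshold(profile) and rr >= profile.min_rr
```

A setup scoring 65 with 1.9:1 R:R qualifies under BALANCED but not under STRICT, which is the permissiveness Phase A relies on.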
Calibration Rationale
STRICT optimizes for signal quality at the cost of sample size.
BALANCED optimizes for learning at the cost of signal purity.
Phase A needs volume — you can't evaluate what you don't fire.
The trade-off is explicit and temporary.
Setup Family
Structural Breakout Family
3 setups · continuation
F1
IB breakout, IB extension, breakout retest — continuation through structure.
Thesis · Trigger · Confluence · Qualification · Signal
Core Decision
The initial balance is the structural reference — breakout, extension, and retest each weight different evidence for the same directional thesis.
High Confluence: One-sided IB with confirmed directional flow and structural alignment; institutional continuation trade.
Low Confluence: Balanced IB or weak flow; breakout is noise, not institutional intent.
Breakout & Extension
The Structural Breakout family trades the initial balance (IB — the range formed in the first hour of RTH) as a structural reference. IB Breakout (ib-brk) fires when price decisively exits the IB range. Its heaviest weight is ib_one_sided (22) — was the IB dominated by one side? A tight, one-sided IB that breaks out is institutional intent; a wide, balanced IB that breaks randomly is noise. IB Extension (ib-ext) fires after the breakout holds and extends. It shifts weight from the trigger to flow confirmation: delta_aligned (20) and mkt_order_tempo (14) together require proven, sustained flow in the breakout direction.
Breakout Retest
Breakout Retest (brk-ret) is the family's highest-conviction variant. It fires when a breakout level is retested and holds. The weight profile flips entirely: sweep_reclaim (20) and key_level_near (18) replace ib_one_sided — the thesis is no longer "the breakout happened" but "the breakout survived its first challenge." Absorption (14) confirms that aggressive selling at the retest level was absorbed by passive buyers. The not_lunch filter (8) exists because lunch-hour retests frequently fail due to thin liquidity, not structural weakness.
Condition Weights
Condition | ib-brk | ib-ext | brk-ret
ib_one_sided | 22 | 20 | -
delta_aligned | 18 | 20 | 16
vwap_side | 12 | 10 | -
cross_aligned | 12 | 10 | 12
mkt_order_tempo | 8 | 14 | -
sweep_reclaim | - | - | 20
key_level_near | - | - | 18
absorption | - | - | 14
stop_hunt_clear | 8 | 8 | 12
gap_confirms | - | 12 | -
ib_tight | 10 | - | -
prime_window | 6 | - | -
session_match | 4 | 6 | -
not_lunch | - | - | 8
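The condition weights above can be encoded and summed into a confluence score. This Python sketch treats every condition as a binary pass/fail, which is an assumption (the engine's evaluators may score partially), and the function name is hypothetical; the weights are copied from the table.

```python
from typing import Iterable, Mapping

IB_BRK = {
    "ib_one_sided": 22, "delta_aligned": 18, "vwap_side": 12, "cross_aligned": 12,
    "mkt_order_tempo": 8, "stop_hunt_clear": 8, "ib_tight": 10,
    "prime_window": 6, "session_match": 4,
}
IB_EXT = {
    "ib_one_sided": 20, "delta_aligned": 20, "vwap_side": 10, "cross_aligned": 10,
    "mkt_order_tempo": 14, "stop_hunt_clear": 8, "gap_confirms": 12,
    "session_match": 6,
}
BRK_RET = {
    "delta_aligned": 16, "cross_aligned": 12, "sweep_reclaim": 20,
    "key_level_near": 18, "absorption": 14, "stop_hunt_clear": 12, "not_lunch": 8,
}


def confluence_score(weights: Mapping[str, int], passed: Iterable[str]) -> int:
    """Sum the weights of the conditions that passed; unknown names are ignored."""
    passed_set = set(passed)
    return sum(w for name, w in weights.items() if name in passed_set)
```

For example, a breakout retest that passes sweep_reclaim, key_level_near, absorption, and delta_aligned scores 20 + 18 + 14 + 16 = 68 under this binary reading.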
Calibration Rationale
IB breakout weights one-sided action (22) because that's the primary signal — the IB was dominated by one side.
Extension shifts to delta (20) + tempo (14) because it's a follow-through trade; you need proven flow continuation.
Breakout retest flips to sweep_reclaim (20) because the thesis is "breakout held, retested, and reclaimed" — the trigger IS the level reclaim.
Setup Family
Mean Reversion Family
4 setups · reversion
F2
IB fade, VWAP bounce, VWAP deviation snap, value area fade — counter-move plays.
Thesis · Trigger · Confluence · Qualification · Signal
Core Decision
Price moved too far from fair value — divergence and absorption confirm the move is running out of fuel.
vwap-dev: vwap_extreme 24 + delta_divergence 20
vwap-bnc: vwap_bounce_zone 24 + delta_divergence 18
ib-fade: delta_divergence 20 + ib_one_sided 18
va-fade: naked_vpoc 18 + delta_divergence 18
High Confluence: Flow diverges from price at an extreme level with absorption; high-probability snap back to fair value.
Low Confluence: No divergence or absorption; catching a falling knife, not a mean reversion.
Family Thesis
Mean reversion setups bet that price has moved too far from fair value and will snap back. The family shares two dominant signals: delta divergence (flow disagreeing with price direction) and absorption (price stopping despite aggressive hitting). These two conditions together answer the question: "Is the move running out of fuel?"
Setup Variants
VWAP Deviation Snap (vwap-dev) has the family's most concentrated weight: vwap_extreme at 24 points. The entire thesis is "price extended ≥2σ from VWAP — mean reversion probability is high." Without extreme VWAP deviation, the setup doesn't exist. VWAP Bounce (vwap-bnc) similarly loads 24 on vwap_bounce_zone — the trade IS the bounce at VWAP. IB Fade (ib-fade) combines divergence (20) with one-sided IB (18) — the IB pushed hard one way, but flow says the push is exhausted. Value Area Fade (va-fade) anchors on naked VPOC (18) and key levels (14) — fading from a value area boundary that hasn't been visited yet is a high-quality mean-reversion thesis because the unvisited VPOC acts as a magnet.
Condition Weights
Condition | ib-fade | vwap-bnc | vwap-dev | va-fade
delta_divergence | 20 | 18 | 20 | 18
absorption | 16 | 14 | 16 | 16
vwap_side/extreme | 14 | - | 24 | -
vwap_bounce_zone | - | 24 | - | -
ib_one_sided | 18 | - | - | -
naked_vpoc | - | - | - | 18
key_level_near | - | 12 | 12 | 14
cross_aligned | 12 | 12 | 10 | 12
stop_hunt_clear | 12 | 10 | 10 | 12
not_lunch | 8 | 10 | 8 | 10
Calibration Rationale
Reversion trades live or die by divergence and absorption — without them, you're catching a falling knife.
VWAP deviation snap loads 24 on vwap_extreme because the entire thesis is "price extended too far from fair value."
VWAP bounce puts 24 on bounce_zone — the trade is the zone itself.
These concentrated weights ensure the setup can't fire if its core thesis is absent.
High Confluence: Flow exhaustion confirmed at a key level with absorption; trapped participants will fuel the reversal.
Low Confluence: No divergence or absorption proof; the move may be a legitimate trend, not a trap.
Exhaustion Reversal
This family trades failed moves: situations where a directional push exhausted itself, trapping participants on the wrong side. The thesis is contrarian: the losers' forced exits fuel the reversal. Exhaustion Reversal (exhaust) carries the heaviest single condition weight among the core families: delta_divergence at 26 (only the Phase G+H specialist triggers, at up to 30, weigh more). The signal is unambiguous: aggressive flow pushed price to a level (key_level_near at 20), but the flow is dying (divergence) while passive absorption (18) holds the level. Three independent lines of evidence converge on one conclusion: the move is over.
Trap & IB Rejection
Trap & Reverse (trap-rev) leads with sweep_reclaim (20) — the failed breakout IS the trade. Price pushed past a level, swept stops, and reclaimed. The trapped participants are now underwater, and their forced exits become your fuel. IB Rejection (ib-rej) combines one-sided IB (18) with divergence (20) — the IB pushed hard one way but the rejection needs both: the setup (the push) and the proof (the flow reversal). Session_match (6) gives a small bonus because IB rejections are most reliable when they happen during the approved session for the setup family.
Condition Weights
Condition | ib-rej | trap-rev | exhaust
delta_divergence | 20 | 18 | 26
sweep_reclaim | 18 | 20 | -
absorption | 14 | 16 | 18
ib_one_sided | 18 | - | -
key_level_near | - | 14 | 20
cross_aligned | 14 | 12 | 12
stop_hunt_clear | 10 | 12 | 14
session_match | 6 | - | -
not_lunch | - | 8 | 10
Calibration Rationale
Exhaustion reversal loads 26 on divergence because the entire thesis is "flow dried up at a level." It's the strongest conviction about single-condition importance among the core families.
Trap & reverse leads with sweep_reclaim (20) because the failed breakout IS the trade — you need proof that the move was a trap.
IB rejection combines the push (18) with the proof (20) — you need both.
High Confluence: Overwhelming institutional flow or strong level magnetism with book confirmation; ride the directional wave.
Low Confluence: Weak flow or no structural magnet; chasing momentum without confirmation is suicidal.
Flow Surge
This family trades institutional flow events — situations where the order book tells a clear directional story. Flow Surge (flow-surge) is the engine's most aggressive setup, requiring overwhelming unidirectional flow: delta_aligned at 24 + mkt_order_tempo at 20. Together, these two conditions require both the magnitude (massive delta in one direction) AND the participation (heavy institutional trade count). DOM alignment (14) adds a third confirmation: the depth-of-market book structure should support the direction. This is a momentum trade — ride the institutional wave.
Level Magnet & Volume Migration
Key Level Magnet (key-mag) trades the pull toward an unvisited structural level. Its heaviest weight is naked_vpoc (22) — the unvisited Volume Point of Control is the magnet itself, a price level where significant volume transacted previously but the current session hasn't reached yet. Key_level_near (18) and absorption (16) add structural and flow confirmation. Volume Migration Follow (vol-mig) tracks when the VPOC physically migrates — it shifts (20) toward a new level, and the engine follows with delta confirmation (18) and cross-market alignment (14). This is a trend-following variant: the auction itself is voting on the new fair value.
Condition Weights
Condition | flow-surge | key-mag | vol-mig
delta_aligned | 24 | 14 | 18
mkt_order_tempo | 20 | - | 16
naked_vpoc | - | 22 | 20
key_level_near | 10 | 18 | 12
dom_aligned | 14 | - | -
absorption | - | 16 | -
vwap_side | 12 | - | -
cross_aligned | 10 | 12 | 14
stop_hunt_clear | 10 | 10 | 12
not_lunch | - | 8 | 8
Calibration Rationale
Flow surge needs 44 combined points from delta + tempo because it's the riskiest setup type — chasing momentum without extreme flow confirmation is suicidal.
Key level magnet leads with naked_vpoc (22) because the unvisited VPOC is the magnet — without it, there's nothing to be attracted to.
Volume migration follows the market's own vote on new fair value.
High Confluence: Unique structural trigger fires with session and flow confirmation; specialist edge with concentrated thesis.
Low Confluence: Core trigger absent; the setup literally doesn't exist without its defining condition.
Specialist Thesis
Phase G+H setups are specialist plays defined by unique structural triggers that don't exist in the core families. They carry the highest single-condition weights in the system (22-30 points) because each setup is its trigger — without the specific condition, the setup literally doesn't exist.
Phase G Variants
Gap Halfback Fade (gap-fade) loads 30 on its trigger (gap_inside_range) and 25 on first_5min; together these two conditions make up 55% of the setup's total budget. The thesis: when the market opens with a gap that falls inside yesterday's range, and the first 5 minutes show reversal, the gap will fill at least 50% (halfback).
London Sweep (ldn-sweep) at 28 triggers on the London session's characteristic stop-hunting pattern: price pushes above/below the Asian range to sweep stops, then reverses into the NY open.
Overnight Continuation (ovn-cont) at 28 uses overnight VWAP (at_ovn_vwap, 18) as its anchor; the thesis is that the overnight direction established by Asia/London will continue into NY.
SMT Divergence (smt-lag) at 25 fires when NQ and ES diverge structurally: one makes a new high/low while the other doesn't confirm, suggesting the leader is trapping participants.
Phase H Variants
Delta Divergence Reversal (delta-div) at 28 fires when cumulative delta diverges from price at a key level: flow says "no" while price says "yes." The highest-weight single flow condition in the system.
Opening Shock Reversal (open-shock) combines extreme opening-range expansion (22) with failed continuation (25); if the first minutes spike violently but can't sustain, the reversion trade has structure. US session only.
Whale Cluster Pullback (whale-pull) leads with absorption (28); the thesis is that visible institutional defense at a level creates a pullback anchor.
Tape Climax Exhaustion (tape-climax) loads delta_divergence (24) + mkt_order_tempo (20): the market is hitting hard but getting nowhere, and tempo spikes are terminal, not sustaining.
Settlement Magnet (settle-mag) loads near_settlement (30) as a pure distance-based magnet play in the afternoon (PM) session: price gravitates toward settlement in the last hours. No TP1/TP2, managed by distance to target.
Pre-Event Compression (fomc-comp) at 22 requires pre_event_day from the economic calendar (FOMC/CPI/NFP) + tight IB (20); the thesis is that pre-event compression resolves directionally once the event arrives. Powered by data-economic-calendar.js with 63 confirmed events through mid-2027.
Condition Weights — Phase G
Condition | ldn-sweep | gap-fade | ovn-cont | smt-lag
london_sweep / gap / overnight / smt | 28 | 30 | 28 | 25
delta_aligned | 14 | 14 | 12 | 18
ny_session / first_5min / first_30min | 18 | 25 | 22 | -
absorption | 8 | 10 | - | 12
prime_window / at_ovn_vwap | 10 | - | 18 | -
cross_aligned | 8 | 6 | 8 | -
mkt_order_tempo | 7 | - | - | 10
stop_hunt_clear | 7 | 7 | 6 | 8
vwap_side | - | 8 | - | 12
session_match | - | - | 6 | 7
not_lunch | - | - | - | 8
Condition Weights — Phase H
Condition | delta-div | open-shock | whale-pull | tape-climax | settle-mag | fomc-comp
Primary trigger | 28 | 25 | 28 | 24 | 30 | 22
Secondary trigger | 18 | 22 | 18 | 20 | 22 | 20
delta_aligned / divergence | - | 16 | 16 | - | 16 | 18
absorption | 16 | 12 | - | 18 | - | -
key_level_near | 18 | - | 12 | 14 | 12 | -
cross_aligned | 12 | 10 | 8 | 8 | 10 | 14
stop_hunt_clear | 10 | 8 | 8 | 10 | 10 | 10
Calibration Rationale
These setups have the highest single-condition weights in the system (22-30) because each one is defined by its unique trigger.
Phase G: 4 specialist plays — gap, London sweep, overnight carry, SMT divergence. Each one IS its trigger.
Phase H: 6 microstructure plays — delta divergence, opening shock, whale pullback, tape climax, settlement magnet, pre-event compression. Researched and queued during BUILD PHASE.
Settlement magnet and pre-event compression use new data sources: near_settlement (distance-to-settlement evaluator) and pre_event_day (powered by data-economic-calendar.js with 63 confirmed FOMC/CPI/NFP/GDP/PCE events through mid-2027).
Setup Quarantine
Quarantine & Session Map
0 benched · 3 sessions
QS
Which setups are benched, and which sessions allow which families.
Category · Weight · Score · Threshold
Core Decision
Quarantine is data-driven — each benched setup demonstrated structural failure, not a bad streak. Session playbooks map approved families to session liquidity profiles.
Quarantine: objective performance collapse → manual unblock only
Retirement: auto-bench at -0.12R expectancy → auto-rehabilitate
Pass: Setup not quarantined and approved for current session; proceeds to scoring pipeline.
Fail: Setup quarantined or session-blocked; still computes and writes to substrate but never fires a signal.
Quarantine List
The quarantine list is the engine's holding pen for setups that failed in live data. BUILD PHASE (2026-05-23): all 8 previously quarantined setups have been reinstated. The engine was declared structurally broken on 2026-05-22 — all prior data is invalidated. Fresh data collection will re-evaluate every setup (now 50 total, including 12 new Phase H entries) on equal footing. Historical quarantine reasons preserved as comments in config.js for reference. The distinction between quarantine and retirement remains: retired setups auto-bench at -0.12R expectancy and can auto-rehabilitate; quarantined setups require manual operator intervention.
Session Playbook
The session playbook maps each trading session to its approved setup families. Asia (pre-London, low participation) permits only level-based plays: key-mag, exhaust-rev, vpoc-mig. Breakout setups are banned — thin Asia liquidity produces false breakouts. London permits trap-based plays: ib-reject, trap-rev — because the London open characteristically sweeps Asia stops. NY gets the full momentum arsenal: ib-brk, ib-ext, flow-surge — because NY has the institutional flow to sustain breakouts.
Quarantined Setups
Quarantined | Reason
None | All quarantines lifted for BUILD PHASE (2026-05-23). Prior data invalidated by engine restructure. Fresh data collection will determine new quarantine candidates.
Session Approved Families
Session | Approved Families
Asia | key-mag, exhaust-rev, vpoc-mig
London | ib-reject, trap-rev
NY | ib-brk, ib-ext, flow-surge
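The session playbook and quarantine check amount to two set lookups. In this Python sketch the function name is hypothetical, and treating any identifier not listed for a session as "not approved" is an assumption; the identifiers are copied exactly as they appear in the table above.

```python
from typing import AbstractSet

SESSION_PLAYBOOK = {
    "Asia": {"key-mag", "exhaust-rev", "vpoc-mig"},
    "London": {"ib-reject", "trap-rev"},
    "NY": {"ib-brk", "ib-ext", "flow-surge"},
}


def may_fire(
    setup_id: str,
    session: str,
    quarantined: AbstractSet[str] = frozenset(),  # empty during BUILD PHASE
) -> bool:
    """A setup may fire only if it is not quarantined and its session approves it.

    Blocked setups would still compute and write to the substrate; this
    function only answers whether a signal is allowed.
    """
    if setup_id in quarantined:
        return False
    return setup_id in SESSION_PLAYBOOK.get(session, set())
```

Passing the quarantine set as an argument keeps the manual-unblock decision outside the function, matching the operator-controlled nature of quarantine.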
Calibration Rationale
The quarantine list is data-driven, not opinion-driven. Each benched setup was quarantined after demonstrating structural failure, not a bad streak.
The session playbook is calibrated from historical outcome data grouped by session — Asia breakouts fail because there's no institutional flow to sustain them, not because breakouts are bad setups.
Liquidity Analysis
Liquidity Target Scoring
7 types · 44pt proximity cap
LIQ
How the engine scores potential sweep targets by type, proximity, and session.
Category · Weight · Score · Threshold
Core Decision
Each liquidity target gets a composite score from type base + proximity adjustment + session bonus — producing a live priority queue of the "hottest" sweep targets.
High Confluence: Session extreme nearby with session match; highest-priority sweep target in the live queue.
Low Confluence: Range target far away; effectively invisible to the current move, deprioritized.
Type Hierarchy
The engine maintains a real-time ranked list of liquidity targets — price levels where stop orders are likely to cluster. Each target receives a composite score combining three factors: type base score, proximity adjustment, and session context bonus. The type hierarchy reflects institutional behavior: session extremes (Asia/London/OR highs and lows) score highest (34) because they're the most visible stop-cluster locations — virtually every retail trader places stops outside them. Prior-day extremes (PDH/PDL at 32) serve a similar role at daily scale.
Proximity & Session Modifiers
The proximity formula adds up to 44 points based on how close the target is: max(0, 44 − distTicks × 0.55). A target at the current price gets +44, a target 40 ticks away gets +22, and a target 80 or more ticks away gets zero (80 × 0.55 = 44). The decay rate of 0.55 per tick was calibrated to match the typical effective range of institutional sweep operations: stops too far away aren't getting swept in the current move. A session match bonus of +8 rewards targets that belong to the current session (Asia high targeted during Asia session) because same-session extremes are the freshest, most-watched levels. Range targets score only 6 base with an additional -20 penalty because intraday ranges are weak references: everyone sees them, but they lack the institutional significance of session extremes.
Base Scores by Type
Target Type | Base Score
Asia/London/OR High-Low | 34
PDH / PDL | 32
Value Area | 26
VPOC | 24
VWAP | 22
Equal Highs/Lows | 16
Range | 6
Score Modifiers
Modifier | Effect
Proximity | max(0, 44 - distTicks × 0.55)
Session match bonus | +8 (Asia target in Asia, etc.)
Range penalty | -20
Equal H/L penalty | -8
Calibration Rationale
Session extremes score highest because they're the most obvious stop-cluster locations — every retail trader sets stops outside them.
PDH/PDL are close behind for the same reason at daily scale.
Range gets only 6 (effectively -14 after penalty) because it's a weak, overused reference.
Proximity degrades at 0.55/tick — a target 80 ticks away is effectively invisible to the current move.
The scoring produces a live priority queue: the engine always knows which liquidity target is the "hottest" right now.
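The composite score described above can be sketched in a few lines. This is a minimal illustration, not the engine's code: the base scores, penalties, proximity formula, and +8 session bonus come from the tables above, while the `target` object shape, the type keys, and the function names are hypothetical.

```javascript
// Base scores by target type, from the "Base Scores by Type" table.
// The key names are illustrative, not the engine's identifiers.
const BASE_SCORE = {
  SESSION_EXTREME: 34, // Asia/London/OR high-low
  PDH_PDL: 32,
  VALUE_AREA: 26,
  VPOC: 24,
  VWAP: 22,
  EQUAL_HL: 16,
  RANGE: 6,
};

// Flat penalties from the "Score Modifiers" table.
const TYPE_PENALTY = { RANGE: -20, EQUAL_HL: -8 };

// Proximity bonus: max(0, 44 - distTicks * 0.55).
function proximityBonus(distTicks) {
  return Math.max(0, 44 - distTicks * 0.55);
}

// Composite score = base + proximity + session bonus + type penalty.
// Assumed target shape: { type, session, distTicks }.
function scoreTarget(target, currentSession) {
  const base = BASE_SCORE[target.type];
  const sessionMatch = target.session != null && target.session === currentSession;
  const penalty = TYPE_PENALTY[target.type] ?? 0;
  return base + proximityBonus(target.distTicks) + (sessionMatch ? 8 : 0) + penalty;
}

// Highest score first, so the "hottest" target leads the queue.
function rankTargets(targets, currentSession) {
  return targets
    .map((t) => ({ ...t, score: scoreTarget(t, currentSession) }))
    .sort((a, b) => b.score - a.score);
}
```

Under these assumptions, an Asia session extreme 40 ticks away during the Asia session scores 34 + 22 + 8 = 64, while a range target 80 ticks away scores 6 + 0 − 20 = −14.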
Order Flow Engine
Order Flow State Machine
9 states · 15–82 conf
OF
9 states evaluated in priority order. First match wins. Each state carries a quality rating and confidence score.
Detect → Classify → Confirm → Score
Core Decision
Priority-first state classification: first matching state wins, no further evaluation.
Confirmed: High-quality state (Trap 82, Initiative 78); institutional behavior detected, full conviction signal.
Rejected: Low-quality state (Noisy 35, Stale 15); no interpretable flow, signal suppressed.
State Machine Logic
The order flow state machine is the engine's real-time interpretation of what the market is doing right now. Every bar, it evaluates 9 possible states in a fixed priority order — the first state whose conditions are met wins, and no further states are evaluated. This priority-first design prevents ambiguity: when multiple states could apply, the highest-conviction interpretation takes precedence.
High-Priority States
Stale (15) checks first as a circuit breaker — if the latest flow data exceeds the age threshold, every other state is meaningless. Trap Flow (82) evaluates second and carries the highest confidence in the system: confirmed stop hunt with reclaim means institutional behavior — this is the most reliable signal the engine produces. Initiative Buy/Sell (78) require the tightest triple-confirmation: meaningful delta magnitude + CVD confirmation + efficient price travel in the same direction. Missing any one of the three drops the state.
Mid & Low-Priority States
CVD Divergence (72) catches a specific failure: price makes a new extreme but cumulative volume delta doesn't confirm — the "who's buying?" gap. Absorption (68) detects price stopping despite aggressive hitting, especially near a structural level. Exhaustion (62) is the "dying move" state: price is still traveling but delta has lost its meaningful threshold, and either volume or CVD is diverging. Delta Not Confirmed (48) captures the awkward middle: delta is meaningful but price/CVD don't agree. Noisy/Unusable (35) is the default — no conditions met, no interpretation possible.
Price moving but delta NOT meaningful + (vol ≥65%ile OR CVD diverging)
Delta Not Confirmed | LOW | 48 | Delta is meaningful but price/CVD don't confirm
Noisy / Unusable | LOW | 35 | Default: no conditions met
Calibration Rationale
Priority order matters: after the Stale circuit-breaker check, Trap Flow evaluates first (82 confidence) because confirmed institutional stop-hunting is the highest-conviction signal; when it's present, no lower-priority state is evaluated.
Initiative Buy/Sell (78) require the full triple-confirmation picture.
Exhaustion (62) is the "dying move" detector.
The spread from 15 to 82 reflects the genuine information-content gap between stale noise and confirmed institutional trapping.
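The first-match-wins evaluation can be illustrated with an ordered rule list. This is a sketch under stated assumptions: the confidence values come from the text, Stale is checked first and Trap Flow second as described, but the relative order of the mid-tier states, the state identifiers, and every field on the `f` snapshot (`dataAgeMs`, `stopHunt`, `cvdConfirms`, and so on) are hypothetical stand-ins for whatever the engine actually computes.

```javascript
// Index order is priority order. Each `when` reads hypothetical,
// precomputed fields from a per-bar flow snapshot `f`.
const OF_STATES = [
  { name: "STALE", conf: 15, when: (f) => f.dataAgeMs > f.maxAgeMs },
  { name: "TRAP_FLOW", conf: 82, when: (f) => f.stopHunt && f.reclaim },
  // Initiative needs all three confirmations; delta sign picks the side.
  { name: "INITIATIVE_BUY", conf: 78,
    when: (f) => f.deltaMeaningful && f.cvdConfirms && f.efficientTravel && f.deltaSign > 0 },
  { name: "INITIATIVE_SELL", conf: 78,
    when: (f) => f.deltaMeaningful && f.cvdConfirms && f.efficientTravel && f.deltaSign < 0 },
  { name: "CVD_DIVERGENCE", conf: 72, when: (f) => f.newPriceExtreme && !f.cvdConfirms },
  { name: "ABSORPTION", conf: 68, when: (f) => f.aggressiveHitting && !f.priceMoving },
  { name: "EXHAUSTION", conf: 62,
    when: (f) => f.priceMoving && !f.deltaMeaningful && (f.volPercentile >= 65 || f.cvdDiverging) },
  { name: "DELTA_NOT_CONFIRMED", conf: 48,
    when: (f) => f.deltaMeaningful && !(f.cvdConfirms && f.efficientTravel) },
];

// Default when no rule matches.
const NOISY = { name: "NOISY", conf: 35 };

// Return the first state whose condition holds; later states are not evaluated.
function classifyOrderFlow(f) {
  const hit = OF_STATES.find((s) => s.when(f));
  return hit ? { name: hit.name, conf: hit.conf } : NOISY;
}
```

Keeping the rules in an array makes the priority order explicit and reviewable: reordering two entries is the only change needed to change which interpretation wins when both apply.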
Order Flow Engine
Directional Intent Scoring
62% min · 8 modes
DIR
Five modules vote on long vs. short. The winner must clear both a minimum score and a clear edge over the other side.
Detect → Classify → Confirm → Score
Core Decision
Dual-threshold directional call: side minimum 62% AND edge 10% required to declare direction.
Rejected: CONFLICT or NO_DIRECTION; both sides too close or too weak, no directional trade eligible.
Voting & Thresholds
Directional intent scoring aggregates five independent modules — each evaluates its own evidence and produces a long score and a short score. The scores are weighted and combined into a final directional call. The thresholds are intentionally stricter than the Bias Arbitrator (Gate 2.5): the side minimum is 62% (vs. 55%), and the edge requirement is 10% (vs. 8%). This dual-threshold design prevents two failure modes: declaring direction when evidence is thin (minimum check) and declaring direction when both sides have nearly equal evidence (edge check).
Directional Modes
Beyond the binary long/short call, the system infers a directional mode — how the market is expressing its direction. ACCEPTANCE means price is being accepted in the direction (trending smoothly). PULLBACK means the direction holds but price is retracing (potential entry opportunity). TRAP_REVERSAL means a failed move trapped participants on the wrong side (highest conviction). The mode tells the setup system what kind of trade to look for: a pullback in an uptrend calls for continuation setups; a trap reversal calls for reversal setups.
Confidence Floor
The confidence floor at 60% flags marginal directional reads. A call that passes the math (62% side, 10% edge) but has low confidence (below 60%) is technically valid but unreliable — the engine logs it as a research note rather than a conviction signal. Confidence ranges from 45% to 75%, with the floor at 60% reflecting the point where cohort attribution data shows directional calls become predictive of setup outcomes.
Threshold Parameters
Parameter | Value
Side minimum | 62% (to declare LONG or SHORT)
Side edge | 10% (L-S gap required)
Conflict min | 55% (both sides ≥ this = CONFLICT)
Conflict edge | 10% (max gap for CONFLICT state)
None max | 55% (below = NONE)
Confidence floor | 60% (range 45–75%)
Mode Taxonomy
Mode | Variations
LONG_* | ACCEPTANCE · PULLBACK · TRAP_REVERSAL
SHORT_* | ACCEPTANCE · PULLBACK · TRAP_REVERSAL
Mixed | CONFLICT · NO_DIRECTION
Calibration Rationale
62% is the real scoring minimum (vs. 55% in the bias arbitrator's pipeline version) because directional intent is a higher-stakes call — it determines the trade direction, not just a gate penalty.
The 10% edge prevents "barely long" calls.
Mode inference (ACCEPTANCE vs PULLBACK vs TRAP_REVERSAL) tells the setup system what kind of trade to look for — a critical routing decision that determines which setup families are eligible.
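The dual-threshold call can be sketched as follows. The threshold values are the ones in the parameter table; everything else is an assumption made for illustration: the function names, the choice to take the already-weighted long and short percentages as inputs (the weighting of the five modules is not specified here), and the tie-break that treats a gap of exactly 10 as a declared side rather than a CONFLICT.

```javascript
// Thresholds from the "Threshold Parameters" table, in percent.
const DIR = { sideMin: 62, sideEdge: 10, conflictMin: 55, conflictEdge: 10, confFloor: 60 };

// Inputs are the final weighted long/short scores (0-100).
function declareDirection(longPct, shortPct) {
  const gap = Math.abs(longPct - shortPct);
  const top = Math.max(longPct, shortPct);
  // Both checks must pass: the leading side is strong enough AND clearly ahead.
  if (top >= DIR.sideMin && gap >= DIR.sideEdge) {
    return longPct > shortPct ? "LONG" : "SHORT";
  }
  // Both sides have real evidence but neither has a clear edge.
  if (longPct >= DIR.conflictMin && shortPct >= DIR.conflictMin && gap <= DIR.conflictEdge) {
    return "CONFLICT";
  }
  return "NO_DIRECTION";
}

// A directional call below the confidence floor is kept, but only as a research note.
function assessDirection(longPct, shortPct, confidencePct) {
  const direction = declareDirection(longPct, shortPct);
  const directional = direction === "LONG" || direction === "SHORT";
  const grade = directional && confidencePct >= DIR.confFloor ? "CONVICTION" : "RESEARCH_NOTE";
  return { direction, grade };
}
```

For example, 70 long against 50 short declares LONG, while 64 against 58 fails the edge check and lands in CONFLICT because both sides clear 55.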
Order Flow Engine
Absorption Quality States
7 states · 18–86
ABS
Absorption isn't binary: it moves through 7 scored states (six named states, with REVERSAL_CONFIRMED scored at two tiers, 86 and 84) from initial detection to confirmed reversal or invalidation.
Detect → Classify → Confirm → Score
Core Decision
Progressive state machine: absorption evidence accumulates from initial detection (58) to reversal confirmed (86) or invalidated (18).
Entry: ACTIVE_ABSORPTION at 58 (initial detection)
Peak: REVERSAL_CONFIRMED at 86 (full convergence)
Kill: INVALIDATED at 18 (price accepted through level)
Rejected: INVALIDATED (18). Price bulldozed through the absorbed level, thesis dead.
Detection Model
Absorption detection answers a critical question: is someone passively absorbing aggressive flow at a level? When large limit orders silently eat incoming market orders without letting price move, it's invisible on the price chart but detectable in order flow — high volume + no price movement = passive absorption. The engine models this as a progressive state machine rather than a binary flag.
Early States
ACTIVE_ABSORPTION (58) is the initial detection: order flow shows absorption, but it's just a fact — someone is sitting on a level. It becomes interesting at TRAP_BUILDING (74) when structural evidence accumulates: a reclaim, a failed breakout, or proximity to a planned target suggests the absorption will hold. CONTINUATION_WARNING (66) is the default middle state — absorption is present but there's no reversal evidence yet, and the market might push through.
Convergence States
The highest states require convergence. REVERSAL_CONFIRMED at 86 needs reclaim or failed breakout PLUS opposite structure or hold — the full picture. At 84, it needs divergence plus at least one structural confirmation. TWO_REVERSAL_WARNINGS (80) fires on CVD divergence alone without structural confirmation — it's close to reversal conviction but lacks the physical proof. INVALIDATED (18) kills the thesis entirely: price accepted through the absorbed level. The passive defender lost — game over.
State Scoring Table
State | Score | Trigger
REVERSAL_CONFIRMED | 86 | (Reclaim or failed breakout) + (opposite structure or hold)
REVERSAL_CONFIRMED | 84 | Divergence + (reclaim or failed breakout or opposite structure)
Confirmed: Side flip validated; reversal banner shown for 30 seconds, new absorption side declared.
Rejected: Flip conditions not met; current side held, noisy observations filtered out.
Low-Pass Filter
Raw absorption detection is noisy — in a fast market, the absorption side can appear to flip on every tick as aggressive flow alternates between bid and ask. The stability tracker is a low-pass filter that prevents meaningless side changes from polluting the engine's absorption state.
Weighted Voting
The tracker maintains a 60-second history window of scored observations. Each observation carries the quality score from the absorption detection module, and votes are weighted by score — a high-quality observation (e.g., REVERSAL_CONFIRMED at 86) outweighs several low-quality ones (ACTIVE_ABSORPTION at 58). To flip the declared side, the new side must satisfy three independent conditions simultaneously: (1) maintain its lead for at least 5 ticks, (2) hold for at least 10 seconds, and (3) achieve ≥60% weighted dominance in the recent vote window.
Grace & Banner
A fade grace period of 10 seconds holds the last declared side even when no new observations arrive — this prevents the tracker from going blank during brief pauses in flow. The reversal banner lasts 30 seconds: when a genuine side flip is confirmed, the engine displays a reversal alert on the dashboard for 30 seconds to ensure the trader notices the change. Observations below score 50 are dropped entirely — low-quality noise shouldn't influence the stability calculation.
Stability Parameters
Parameter | Value
History window | 60 seconds
Flip minimum ticks | 5 ticks
Flip minimum time | 10 seconds
Flip lead % | 60% weighted dominance
Min observation score | 50 (drop below this)
Reversal banner | 30 seconds
Fade grace | 10 seconds (holds last side)
Recent vote window | 25 ticks
Calibration Rationale
Without stability constraints, absorption side would flip on every tick in a noisy market, making the signal useless.
The 60% lead requirement means the new side must convincingly dominate, not just edge ahead.
The 10-second minimum prevents reactionary flips to single large prints — institutional iceberg orders can produce momentary opposite-side signals that shouldn't cause a flip.
Score-weighted voting ensures that a single high-quality reversal observation can outweigh multiple low-quality noise observations.
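The three flip conditions can be sketched as a pure function over the observation history. The numeric parameters come from the table above; the rest is assumed for illustration: one observation is treated as one tick, votes are weighted linearly by score (the actual weighting curve is not stated), "maintaining a lead" is read as holding more than half of the weighted vote, and the observation shape `{ t, side, score }` and function names are hypothetical.

```javascript
// Parameters from the "Stability Parameters" table (times in milliseconds).
const STAB = {
  historyMs: 60000, flipMinTicks: 5, flipMinMs: 10000,
  flipLead: 0.6, minScore: 50, voteWindow: 25,
};

// Score-weighted share of the vote held by `side` (linear weighting assumed).
function weightedShare(side, obs) {
  const total = obs.reduce((sum, o) => sum + o.score, 0);
  if (total === 0) return 0;
  const mine = obs.reduce((sum, o) => sum + (o.side === side ? o.score : 0), 0);
  return mine / total;
}

// obs: [{ t, side: "BID" | "ASK", score }] ordered oldest first.
function shouldFlip(obs, declaredSide, now) {
  // Drop low-quality observations and anything older than the history window.
  const usable = obs.filter((o) => o.score >= STAB.minScore && now - o.t <= STAB.historyMs);
  const challenger = declaredSide === "BID" ? "ASK" : "BID";

  // Walk backward to find how long the challenger has led without interruption.
  let leadTicks = 0;
  let leadSince = null;
  for (let end = usable.length; end > 0; end--) {
    const votes = usable.slice(Math.max(0, end - STAB.voteWindow), end);
    if (weightedShare(challenger, votes) <= 0.5) break;
    leadTicks += 1;
    leadSince = usable[end - 1].t;
  }

  const dominance = weightedShare(challenger, usable.slice(-STAB.voteWindow));
  return (
    leadTicks >= STAB.flipMinTicks &&                        // (1) lead held for 5+ ticks
    leadSince !== null && now - leadSince >= STAB.flipMinMs && // (2) lead held for 10+ seconds
    dominance >= STAB.flipLead                               // (3) 60%+ weighted dominance
  );
}
```

Because all three checks are combined with a logical AND, a burst of high-scoring opposite-side prints inside a couple of seconds passes the dominance check but still fails the time check, which is the filtering behavior the rationale above describes.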
Order Flow Engine
Market Order Tempo
3 levels · 0.6× / 1.5×
TMO
How the engine reads market "loudness" — are institutions actively participating or is this retail noise?
Detect → Classify → Confirm → Score
Core Decision
Bar-by-bar institutional participation classification against fixed baselines per symbol.
WEAK: <0.6× baseline → confluence +4
NORMAL: 0.6× – 1.5× baseline → no adjustment
STRONG: ≥1.5× baseline → confluence −3 to −5
Confirmed: STRONG tempo; institutions actively participating, confluence threshold lowered, flow IS confirmation.
Rejected: WEAK tempo; retail noise, no institutional footprint, confluence threshold raised by 4.
Baseline Measurement
Market order tempo is a per-bar measurement of institutional participation intensity. The engine counts both the number of trades (order frequency) and the total contracts (order size) in each bar, comparing them against fixed baselines derived from average RTH bar statistics: NQ baseline is 900 trades / 12,000 contracts per bar; ES baseline is 600 trades / 8,000 contracts.
Threshold Adjustment
Below 60% of baseline (WEAK), the market is dominated by retail noise: small orders, no institutional footprint. The engine responds by raising the confluence threshold by 4 points, demanding more evidence before firing. At or above 150% of baseline (STRONG), institutions are actively participating. The engine lowers the threshold by 3 to 5 points because the heavy participation itself is a form of confirmation: less structural evidence is needed when the order flow is shouting a direction.
Macro vs Micro Timescale
The tempo check is separate from the volume regime (P2) and works at a different timescale. Volume regime uses daily percentile rank (macro context); tempo measures bar-by-bar participation (micro context). A STRONG tempo bar in a LOW volume regime means "today is quiet overall, but right now someone big showed up" — that's a relevant signal for the current bar's evaluation.
Tempo Thresholds
Level | NQ Trades | NQ Volume | ES Trades | ES Volume
WEAK (<0.6×) | <540 | <7,200 | <360 | <4,800
NORMAL | 540–1,349 | 7,200–17,999 | 360–899 | 4,800–11,999
STRONG (≥1.5×) | ≥1,350 | ≥18,000 | ≥900 | ≥12,000
Calibration Rationale
WEAK tempo means the market order flow is below 60% of baseline — retail noise, no institutional footprint; setups need more confluence to compensate.
STRONG means real participation — the flow itself IS confirmation.
The 0.6×/1.5× multipliers mark the empirical inflection points where institutional participation becomes visible (or invisible) in order flow data.
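A minimal sketch of the tempo classification, using the per-symbol baselines and the 0.6×/1.5× multipliers given above. How the trade-count and contract-volume readings are combined is not stated, so this sketch assumes the conservative reading that both must agree before a bar leaves NORMAL; the function names and the range representation of the adjustment are also illustrative.

```javascript
// Fixed per-bar RTH baselines from the text.
const TEMPO_BASELINE = {
  NQ: { trades: 900, contracts: 12000 },
  ES: { trades: 600, contracts: 8000 },
};

// Assumption: both ratios must cross a threshold for the bar to be WEAK or STRONG.
function tempoLevel(symbol, trades, contracts) {
  const base = TEMPO_BASELINE[symbol];
  if (!base) throw new Error(`no tempo baseline for ${symbol}`);
  const tradeRatio = trades / base.trades;
  const sizeRatio = contracts / base.contracts;
  if (tradeRatio >= 1.5 && sizeRatio >= 1.5) return "STRONG";
  if (tradeRatio < 0.6 && sizeRatio < 0.6) return "WEAK";
  return "NORMAL";
}

// Confluence-threshold adjustment as a [low, high] range of points.
// STRONG is given as a range in the text (3 to 5 points lower).
const CONFLUENCE_ADJ = { WEAK: [4, 4], NORMAL: [0, 0], STRONG: [-5, -3] };

function adjustedThresholdRange(baseThreshold, level) {
  const [lo, hi] = CONFLUENCE_ADJ[level];
  return [baseThreshold + lo, baseThreshold + hi];
}
```

With a hypothetical base threshold of 70, a WEAK bar would require 74 points of confluence and a STRONG bar somewhere between 65 and 67.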
Order Flow Engine
Research Microstructure Signals (R1–R4)
4 signals · 10 substrate fields
R1–R4
Academic-validated microstructure signals wired into the RR profiler, stop adjustment, and per-fire substrate.
Market Data → Research Signals → Decision Chain
Four Signals
Each signal has a computation function (engine-order-flow.js) and decision-chain wiring (engine-rr-confluence.js + engine-pipeline.js).
R1 VPIN: |buyVol − sellVol| / totalVol · rolling 50-bucket · TOXIC ≥0.7 / ELEVATED ≥0.5 / NORMAL ≥0.3 / CLEAN
R2 Vol Clock: barVolume / median(100 bars) · SURGE ≥2.0 / FAST ≥1.5 / NORMAL ≥0.7 / SLOW ≥0.4 / DEAD
R1 VPIN (Easley, Lopez de Prado, O’Hara) — Volume-Synchronized Probability of Informed Trading. Uses actual aggressor-classified buy/sell volume, not bulk classification. High VPIN precedes volatility events with R² ≈ 0.4 in the original paper.
R2 Volume Clock (Lopez de Prado, “Advances in Financial ML”) — volume-time vs wall-clock-time. When bars take longer to fill (SLOW/DEAD), signals are dominated by noise. The meta-multiplier gates the reliability of ALL other signal adjustments.
R3 First-30-min (Gao, Han, Li, Zhou 2018, Journal of Financial Economics): the first 30 minutes of the session predict the last 30 minutes’ direction. Crossed with GEX regime: CASCADE amplifies (dealers sell dips and buy rips, trading in the same direction as the move), PINNED dampens (dealers absorb moves).
R4 OFI (Cont, Kukanov, Stoikov 2014): Order Flow Imbalance velocity. Change in best bid size minus change in best ask size. R² ≈ 70% for short-term price prediction in the original paper. Scaffolded for L2 depth data (connecting this week).
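The R1 and R2 formulas above are simple enough to sketch directly. The thresholds and window sizes are the ones listed for each signal; the bucket shape `{ buyVol, sellVol }`, the function names, and the choice to pool imbalance over the whole window (equivalent to averaging per-bucket imbalance when buckets have equal volume) are assumptions for illustration, not the contents of engine-order-flow.js.

```javascript
// R1: pooled |buy - sell| / total over the most recent `window` volume buckets.
function vpin(buckets, window = 50) {
  let imbalance = 0;
  let total = 0;
  for (const { buyVol, sellVol } of buckets.slice(-window)) {
    imbalance += Math.abs(buyVol - sellVol);
    total += buyVol + sellVol;
  }
  return total === 0 ? 0 : imbalance / total;
}

function vpinLabel(v) {
  if (v >= 0.7) return "TOXIC";
  if (v >= 0.5) return "ELEVATED";
  if (v >= 0.3) return "NORMAL";
  return "CLEAN";
}

// Median of a non-empty numeric array (copy before sorting to avoid mutation).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// R2: current bar volume relative to the median of up to 100 prior bars.
function volClockRatio(barVolume, priorVolumes, lookback = 100) {
  return barVolume / median(priorVolumes.slice(-lookback));
}

function volClockLabel(ratio) {
  if (ratio >= 2.0) return "SURGE";
  if (ratio >= 1.5) return "FAST";
  if (ratio >= 0.7) return "NORMAL";
  if (ratio >= 0.4) return "SLOW";
  return "DEAD";
}
```

The numeric `sort` comparator matters here: JavaScript's default sort compares elements as strings, which would misorder volumes such as 900 and 12000.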
3-layer utilization audit: “What does the engine KNOW vs what does it DO about it?” Wired institutional confluence into RR floor. Added raw-input substrate fields for post-hoc forensic decomposition.
Audit → Decision Chain → Substrate Forensics
Build Sync · 2026-05-24
Today’s shipped cluster wired RR/state harmonization and entry-timing decomposition into live decisions, substrate persistence, and weekly extraction cohorts.
RR + Timing → Substrate: RR ATR14/state-coupling fields and P3.8 NO_FILL/CHASE timing fields are now captured on SIGNAL + BLOCKED rows, so weekly extraction can isolate entry-friction failure modes.
Regime → RR: Day regime severity is now wired into RR: EXTREME +0.25R, ELEVATED +0.10R. Previously, regime severity gated production but didn’t modulate magnitude.
Market-makers are structurally short options. To stay delta-neutral, they must hedge dynamically. Positive GEX = dealers buy dips, sell rips (stabilizing). Negative GEX = dealers sell into drops, buy into rips (amplifying). The gamma flip level is where this behavior inverts.
Cross-Market Edge
Data sourced from ETF options (SPY/QQQ), not futures options (ES/NQ). Independent participant pool: pension funds, insurance companies, retail equity vs futures prop desks. Cross-market confirmation with mechanical (not discretionary) basis.
VIX Integration
Research-validated: GEX is a VIX modifier, not standalone signal (FlashAlpha 8yr backtest: ρ=-0.14 after VIX control). Combined regime matrix: PINNED (calm+stabilizing) · COILED (calm+amplifying) · DAMPENED · VOLATILE · CASCADE.
Key Levels
Level | Definition
Gamma Flip | Price where net GEX crosses zero (regime boundary)
Call Wall | Strike with highest call GEX (mechanical ceiling)
Put Wall | Strike with highest put GEX (mechanical floor)
Vol Trigger | Put wall below which negative-gamma cascading accelerates