ATLAS
The Architecture of Edge
10-brain portfolio racing NQ / ES intraday — shared substrate · per-brain edge measurement · the winner graduates to autonomous execution.
COLD | HOT · NOW | AUTOMATED
HOT · NOW | Manual | Broker OFF | Bridge LIVE | BUILD PHASE | ◈ 10 brains race | ◇ Forge has memory | ◆ CI-LB > +0.3R · Holm | Static · not real-time
🪜 Deployment Phase Ladder · where the engine is
1 · PAPER TRADING ◀ NOW
Hot mode, paper fills. Everything works, is connected, and proves expectancy. Advance: works end-to-end + positive expectancy on paper.
2 · QUANTOWER DEMO
Verify every MECHANICAL op — order send, connections, bridges, automation. Mechanics, not edge.
3 · LIVE months away
Real broker / prop-firm account — the only phase touching real money. Only after 1 + 2 prove out.
Operative question is always "does this function work in paper?" — live framing is months out. Maps under: P1 ≈ G0 / Hot Run · P2 ≈ exec-ramp Phase 0 + G1 · P3 ≈ later gates.
10 racing brains
Portfolio fleet · NOA leader +138.9R
208 modules
Shared substrate · all brains read from it
~150 fields/fire
Substrate density · 16 categories
12,915 fires resolved
Deduped portfolio sample · HOT-RUN
6 layers · ~55 cards
This atlas · current state only
Risk & Performance Notice
Futures trading involves substantial risk and is not suitable for every investor. Past, simulated, hypothetical, replay-based, research, or model-generated results are not necessarily indicative of future performance. No representation is being made that any account will or is likely to achieve profits or losses similar to those shown. Unless explicitly marked as broker-executed live trades, all displayed outcomes should be treated as research/model outputs and may not reflect slippage, commissions, liquidity constraints, execution delays, or trader behavior. The 12,915-fire deduplicated sample referenced in this atlas was generated during research and hot-run evaluation phases across 10 racing brains — not under live broker execution.
Figure provenance — read every $ / R with its tag
A number whose provenance is left for the reader to guess is the ambiguous-zero applied to money. Default: every $ / R figure here is REALIZED in the shadow paper-race (it actually happened across the books — still not real broker money). Anything else MUST carry a tag: ⟨modeled⟩ / ⟨counterfactual · would-have⟩ (never happened, even in paper — e.g. "if reactive had been anticipated") · ⟨pre-haircut⟩ (claimed R before execution slippage) · ⟨synthetic⟩ (generated data) · ⟨live⟩ (real broker fills, when they exist). A would-have dollar next to a realized dollar, unlabeled, reads as real money — and that lie sizes positions. Per feedback_silent_failure_pattern.md R6.
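A minimal sketch of the tagging rule above, assuming a small value type that refuses to render a non-realized figure without its tag. The names `Provenance`, `Figure`, and `render` are illustrative and are not taken from the engine; only the tag vocabulary comes from this notice.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    REALIZED = "realized · shadow"            # default: happened in the shadow paper-race
    MODELED = "modeled"
    COUNTERFACTUAL = "counterfactual · would-have"
    PRE_HAIRCUT = "pre-haircut"
    SYNTHETIC = "synthetic"
    LIVE = "live"


@dataclass(frozen=True)
class Figure:
    value: float
    unit: str                                  # "R" or "$"
    provenance: Provenance = Provenance.REALIZED

    def render(self) -> str:
        if self.unit == "$":
            sign = "-" if self.value < 0 else "+"
            body = f"{sign}${abs(self.value):,.0f}"
        else:
            body = f"{self.value:+.1f}R"
        # Only the default (realized) figure may appear without a tag.
        if self.provenance is Provenance.REALIZED:
            return body
        return f"{body} ⟨{self.provenance.value}⟩"
```

Because the tag is attached to the number itself rather than to the surrounding sentence, a would-have dollar cannot be printed next to a realized dollar without its label.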
SUBSTRATE
what the engine sees
Layer 1
PILLARS
how it thinks
Layer 2
THE RACE
10 brains compete
Layer 3
LAB + FORGE
how it discovers
Layer 4
DISCIPLINE
what protects it
Layer 5
NOA OPERATOR
+ execution ramp
Layer 6
REFERENCE
how it's calibrated
REF
Phase Ladder · 0 → 4 · 5-phase trajectory from substrate truth to autonomous portfolio
★ Phase 1 active
PHASE 0 CLOSED ✓
Data & measurement integrity
Make the numbers trustworthy before trusting them — every screen, file and journal now reads the same trade record from one source, so a profit can't be counted twice or quietly disappear. This was the foundation: measure honestly first, judge later.
Gate cleared 2026-06-07 · SSOT spine hardened 2026-06-10
PHASE 1 ★ NOW
Identify the edge · 10-brain race
Race all 10 brains on the same live tape and find the one with a real, repeatable edge — proven across enough trades and different market conditions that it can't be luck. Nothing graduates until the math rules out chance.
Contenders: NOA +138.9R · WCONS +117.4R
PHASE 2 queued
Graduation candidate
Lock the winning brain's rules so they can't drift, then re-test them on data they were never tuned on — each market type on its own. The edge has to survive being frozen, not just look good in hindsight.
Gate: PROMOTION_READY
PHASE 3 queued
Execution qualification
Take the frozen brain from "good signal on screen" to "real fill on a real account," one careful rung at a time — simulator, then demo broker — measuring how much the edge shrinks at each step before any real money.
Gate: D3 reached
PHASE 4 ★ NORTH-STAR
Portfolio allocation · NOA operates
The end goal: once two or more brains have earned their place, NOA runs them as a portfolio — sizing each by market conditions while you set the risk and stop pressing buttons. The engine becomes the trader.
≥ 2 brains graduated
Pivot trigger: if no brain clears Phase 1 by month 6 (~2026-11-13) → architecture council fires. Each phase has a hard gate. No graduation without clearing it.
Substrate L1
what every brain sees — shared market reality, classified once, read by all 10 racers
7 CARDS
Shared by construction. The substrate is computed ONCE per tick and consumed by every brain (NOA · ANT-NOA · BRO · CONS · WCONS · AGG · PA · PRECISION · LLQ · SCHL + SAVT). No brain owns its own private feed: when we say "the substrate," we mean the single source of truth all 10 racers measure against. Brain differences live downstream in Pillars, conviction, and exit policy, not here.
Order Flow
01
In plain terms. This watches what buyers and sellers are actually doing underneath each price move — who's winning, who's absorbing the other side, and whether the move has real force behind it. It's the engine's first defense against a breakout that looks real on the chart but has nothing buying or selling to back it up.
  • Tracks who's winning each bar — buyers or sellers — and flags when price disagrees with that pressure
  • Spots when big players are absorbing the other side, in four stages: building, confirmed, exhausted, reset
  • Three ways to know who took the trade — exchange tag, or inferred two different ways as fallback
  • Knows the difference between someone refilling an order and someone aggressively taking it
Prevents Fake breakouts where price jumps but no real buying or selling backs it up.
engine-order-flow.js · NQEliteAbsorption
4-state FSM
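The four absorption stages named on this card (building, confirmed, exhausted, reset) can be sketched as a small table-driven state machine. The event names and the transition table below are hypothetical; the atlas names the states but does not give the real triggers.

```python
from enum import Enum


class AbsorptionState(Enum):
    BUILDING = "building"
    CONFIRMED = "confirmed"
    EXHAUSTED = "exhausted"
    RESET = "reset"


# Hypothetical events; the engine's actual triggers are not documented here.
TRANSITIONS = {
    (AbsorptionState.RESET, "absorption_seen"): AbsorptionState.BUILDING,
    (AbsorptionState.BUILDING, "threshold_met"): AbsorptionState.CONFIRMED,
    (AbsorptionState.BUILDING, "level_broken"): AbsorptionState.RESET,
    (AbsorptionState.CONFIRMED, "absorber_fading"): AbsorptionState.EXHAUSTED,
    (AbsorptionState.EXHAUSTED, "level_broken"): AbsorptionState.RESET,
    (AbsorptionState.EXHAUSTED, "timeout"): AbsorptionState.RESET,
}


def step(state, event):
    """Return the next state; unknown (state, event) pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, start=AbsorptionState.RESET):
    """Fold a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

Keeping the transitions in one table means an out-of-order event (for example a confirmation with no build-up) is simply ignored instead of jumping stages.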
Liquidity
02
In plain terms. This maps the price levels institutions actually defend — and the traps they spring when the crowd piles in. It tells a real breakout from a fake one engineered to grab stops and reverse, so the engine doesn't chase a move built to trap it.
  • Tells a fake breakout (price grabs stops then snaps back) from a real one (grabs stops then keeps going)
  • Detects failed breakouts — price pokes through a key level but can't hold above it
  • Watches the levels that actually matter: yesterday's high/low, today's opening range, fair-value zones
  • If a symbol's levels stop behaving predictably, stop firing on that symbol
Prevents Chasing breakouts that were engineered to grab your stops and reverse.
stop-hunt-panel · pipeline sweep-state
±1 tick dedup
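As a rough illustration of the two ideas on this card, the sketch below collapses levels that sit within ±1 tick of each other and labels a poke through a resistance level as a sweep that reverses or a sweep that continues. The function names and the close-based rule are assumptions; the 0.25-point tick is the standard NQ / ES tick size.

```python
TICK = 0.25  # NQ and ES both trade in 0.25 index-point ticks


def dedup_levels(levels, tick=TICK, tol_ticks=1):
    """Drop any level within ±tol_ticks of the last level already kept."""
    kept = []
    for px in sorted(levels):
        if kept and abs(px - kept[-1]) <= tol_ticks * tick + 1e-9:
            continue
        kept.append(px)
    return kept


def classify_sweep(level, bar_high, bar_close):
    """Label a bar against a resistance level (assumed rule: judge by the close)."""
    if bar_high <= level:
        return "no_sweep"
    # Price traded through the level: did it hold above, or snap back under?
    return "sweep_continuation" if bar_close > level else "sweep_reversal"
```

A support level would use the mirrored comparison on the bar low.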
Market Profile
03
In plain terms. This tracks where the market spent time agreeing on price — and where it refused to. It shows where today's fair value sits and which way it's drifting, so the engine knows whether to expect a snap-back or a genuine shift in where price wants to trade.
  • Measures the first hour's range against the day's typical move — not against a fixed number
  • Tracks where most volume traded today and how the fair-value zone is shifting
  • Volume-weighted average price tracked from key moments — open, big news, session start
  • Yesterday's high and low survive restarts — never lose context after a reload
Prevents Betting on a snap-back when the market is actually shifting where it wants to trade.
engine-ib-* · poc-tracker · vwap-context
Dalton framework
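Two of the measurements on this card are simple enough to show directly: a VWAP anchored at a chosen bar (open, news, session start) and the opening range expressed as a fraction of the day's typical move. This is a sketch with illustrative function names, not the code in `vwap-context` or `engine-ib-*`.

```python
def anchored_vwap(bars, anchor_index):
    """bars: list of (price, volume). Returns the running VWAP from the anchor bar on."""
    cum_pv = 0.0
    cum_v = 0.0
    out = []
    for price, volume in bars[anchor_index:]:
        cum_pv += price * volume
        cum_v += volume
        out.append(cum_pv / cum_v if cum_v else None)
    return out


def ib_ratio(ib_high, ib_low, typical_daily_range):
    """Initial-balance range relative to the typical daily range, not a fixed number."""
    if typical_daily_range <= 0:
        raise ValueError("typical_daily_range must be positive")
    return (ib_high - ib_low) / typical_daily_range
```

For example, `ib_ratio(21050, 21000, 200)` returns 0.25: the first hour covered a quarter of a typical day.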
Whale & OI
04
In plain terms. This reads the footprints institutions leave that retail never sees — repeated one-sided slamming, contracts added or closed overnight, and where the real size is sitting. It keeps the engine from fighting a level a big player is actively defending.
  • Spots bursts where the same side is repeatedly slamming the market
  • Watches how many contracts institutions added or closed overnight
  • Sees how thick the real buy and sell orders are at each price — not just how many there are
  • Recognizes when a big player is actively defending a price level on a pullback
Prevents Fighting positions that institutions are actively defending.
whale-tracker · oi-tracker · aggressor-streak
Quantower L2
Cross-Market
05
In plain terms. This checks whether all four index futures are pulling the same way — and who's leading the move. When NQ runs but the rest of the market quietly turns, this is what flags it before the engine trades a move the broader tape doesn't support.
  • Compares NQ · ES · YM · RTY every bar — reads the broader market picture
  • Who's pulling the move: tech, broad market, small-caps — or are they disagreeing?
  • Flags when one index makes a new high but its sibling doesn't — a classic warning
  • Detects when all four go quiet at once — often the calm before a real move
Prevents Trading a strong NQ while the rest of the market quietly rolls over.
engine-cross-index · noa-cross-market
4-symbol fabric
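The "one index makes a new high but its sibling doesn't" check can be sketched as below. The lookback length and function names are assumptions made for illustration.

```python
def new_high(series, lookback):
    """True if the last value exceeds the max of the previous `lookback` values."""
    prior = series[-lookback - 1:-1]
    return len(prior) == lookback and series[-1] > max(prior)


def smt_divergence(highs_by_symbol, lookback=20):
    """Return (diverging, leaders, laggards) for a dict of symbol -> list of highs."""
    flags = {sym: new_high(h, lookback) for sym, h in highs_by_symbol.items()}
    leaders = sorted(sym for sym, f in flags.items() if f)
    laggards = sorted(sym for sym, f in flags.items() if not f)
    # Divergence needs at least one symbol on each side of the disagreement.
    return bool(leaders) and bool(laggards), leaders, laggards
```

When all four symbols make a new high together, `leaders` holds all of them and the divergence flag stays false.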
Macro Context
06
In plain terms. This is what the big-picture market is doing to your setup right now — fear, the dollar, and the calendar of major releases. The same setup can be a green light on a quiet day and a hard no in the minutes around a Fed decision; this is what tells them apart.
  • Tracks fear (VIX) and the US dollar (DXY) — and what they're saying today
  • Knows when Fed days, jobs reports, and inflation prints are about to hit
  • Stops firing signals in the minutes around major news releases
  • Same setup can be a buy on a weak-dollar day and a no-go on a strong-dollar day
Prevents Trading into Fed-day chaos, or fighting which way the dollar wants to go.
engine-news-risk · macro_daily.json
VIX · DXY · event
HTF · Volatility
07
In plain terms. This sets the bigger frame you're trading inside — how much the market typically moves today and what kind of day it is (trending, choppy, or reversing). It's what stops the engine from using calm-day tactics on a wild day, or the reverse.
  • Sizes targets relative to how much the market typically moves today — not against fixed numbers
  • Classifies the day: trending · choppy · reversing × calm · normal · loud
  • Knows Asian-session hours move differently and handles them separately
  • If you lose too much in one day, the day ends — full stop
Prevents Using choppy-day tactics on a trending day, or the other way around.
daily-risk-gate · ib-day-snapshot
3×3 regime grid
Pillars L2
how the brains think — the cognitive primitives every racer inherits before adding its own conviction
7 CARDS
Inherited, not invented. Every brain inherits these pillars — confluence weighting, setup catalog, bias arbitration, risk plan, cadence. The 10 racers differ in which pillars they emphasize, which setups they whitelist, and when they hold fire — but the pillars themselves are common ground. Memory (formerly the MI layer) folds in here as the seventh pillar.
Institutional Confluence
08
In plain terms. This is the engine's single official scorer — it weighs every piece of evidence (order flow, market structure, location, the wider context) into one number, and downgrades signals that are really just saying the same thing twice. One place owns that math, so no two screens can disagree.
  • One score combining four things: order flow · market structure · location · wider context
  • Different signals matter more for different setups — weights are tuned per setup, not globally
  • When most evidence agrees, the few outliers get downgraded — they can't inflate the score
  • One place owns the math — no parallel copies allowed to drift
Prevents A high score that's really the same signal getting counted twice.
rr-confluence · production-context
minority × (1−d/140)
Setup Catalog
09
In plain terms. This is the library of trade patterns the engine is allowed to recognize — every family backed by published research, not hunches. When two patterns fire on the same trade it merges them, so the count reflects real, distinct bets rather than the same idea wearing two names.
  • Every family has academic research + experienced-trader literature behind it
  • When two setups fire on the same trade, they get merged — 15 fires really means ~10 distinct bets
  • A setup only counts as "proven" after at least 30 trades (fewer if research already supports it)
  • Plans to reweight overlapping setups are documented but locked until the data earns them
Prevents Shipping near-identical setups under different names and pretending they're independent.
setup-classification · docs/setup_theses.md
10 families
Bias Arbitrator
10
In plain terms. Before anything gets sized, this decides what kind of trade it even is — with the trend or against it — and refuses the muddy middle. When the read is clean you keep the edge; the fuzzy fallback path is the expensive one, and it's measured.
  • For every signal, picks one of two clean paths: continuation or reversal
  • When the read is muddy and falls through to the backup path, trades lose ~1.6× risk on average — measured
  • Going long and going short use different rules — they aren't mirror images
  • Decides what kind of trade this is before sizing it — never after
Prevents Mixed signals diluting what was a clean directional read.
engine-bias-arbitrator.js
Δ −1.6R fallback
Phase F · Risk Plan
11
In plain terms. This is the layer whose whole job is to say no — cap how much reward you chase per trade, block trades fighting the day's trend, and end the day when losses hit the limit. It exists so a great-looking setup can't talk you into an oversized bet.
  • Caps reward-to-risk per symbol — no lottery tickets: NQ at 2.5× max, ES at 4×
  • Hard-blocks trades that go against the day's trend
  • Hit the daily loss limit, the day ends — no exceptions
  • Currently advice only · enforcement activates in Automated Run
Prevents Oversized "lottery ticket" trades that come from gut, not data.
daily-risk-gate · l4-risk-advisor
RR cap NQ 2.5 / ES 4.0
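A minimal sketch of the three rules on this card: the per-symbol reward-to-risk cap, the counter-trend block, and the daily loss stop. The caps (NQ 2.5, ES 4.0) come from the card; the function names, the trend labels, and the `daily_loss_limit_r` parameter are hypothetical.

```python
RR_CAP = {"NQ": 2.5, "ES": 4.0}  # caps stated on this card


def capped_target(symbol, entry, stop, target):
    """Pull the target in so reward/risk never exceeds the symbol's cap."""
    risk = abs(entry - stop)
    if risk == 0:
        raise ValueError("stop must differ from entry")
    direction = 1 if target > entry else -1
    reward = min(abs(target - entry), RR_CAP[symbol] * risk)
    return entry + direction * reward


def risk_advice(side, day_trend, day_pnl_r, daily_loss_limit_r):
    """Advice only: returns a verdict string, it does not place or cancel anything."""
    if day_pnl_r <= -abs(daily_loss_limit_r):
        return "BLOCK: daily loss limit"
    against = (side == "long" and day_trend == "down") or (
        side == "short" and day_trend == "up"
    )
    return "BLOCK: counter-trend" if against else "OK"
```

With a 10-point stop on NQ, a 50-point target is pulled in to 25 points, which is 2.5× the risk.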
Phase G · Additive Setups
12
In plain terms. These are four newer trade patterns, each grounded in published research rather than invention — playing index disagreement, gap retraces, overnight order build-up, and failed first breakouts. Same bar as everything else: no setup ships without real evidence behind it.
  • G1 · NQ and ES disagree (one makes a new high, the other doesn't) — trade the laggard
  • G2 · The open gaps, price retraces halfway back — fade the move
  • G3 · Overnight volume piles up one way — ride it into the morning (NY Fed paper)
  • G3b · London grabs liquidity, NY's first breakout fails — fade the failure
Prevents Inventing setups out of thin air with no real evidence behind them.
production-context setup definitions
NY Fed sr917
Cadence & Pause
13
In plain terms. This is the engine's sense of when to stay quiet — limiting how often each symbol fires so the screen doesn't fill with noise. Pausing a symbol only silences the alerts; behind the scenes it keeps watching and learning. One missed signal beats a stream of noisy ones.
  • Limits how often each symbol and setup can fire — no spam
  • Pausing a symbol stops the signals on screen, but keeps the engine watching and learning
  • Quiet on the screen, busy in the background — the engine never stops collecting data
  • One missed signal is far cheaper than a stream of noisy ones
Prevents Signal-spam that wears down your focus and your trust in the system.
cadence-gate · symbol-pause
silence = information
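The distinction this card draws, quiet on the screen but still recording, can be sketched as a gate that decides only whether a signal is shown while journaling every candidate either way. The class name, the journal layout, and the minimum-gap parameter are assumptions for illustration.

```python
class CadenceGate:
    def __init__(self, min_gap_s):
        self.min_gap_s = min_gap_s   # hypothetical per (symbol, setup) spacing, in seconds
        self.last_shown = {}         # (symbol, setup) -> timestamp of last shown signal
        self.paused = set()          # symbols silenced on screen
        self.journal = []            # every candidate, shown or not

    def offer(self, symbol, setup, ts):
        """Return True if the signal should be shown; always record the candidate."""
        key = (symbol, setup)
        last = self.last_shown.get(key)
        if symbol in self.paused:
            shown, reason = False, "paused"
        elif last is not None and ts - last < self.min_gap_s:
            shown, reason = False, "cadence"
        else:
            shown, reason = True, "ok"
            self.last_shown[key] = ts
        self.journal.append((ts, symbol, setup, shown, reason))
        return shown
```

Suppressed candidates stay in the journal with their reason, so a paused symbol keeps contributing data.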
Memory · Market Intelligence
14
In plain terms. This is the engine's memory — it records the events, the episodes, and how they resolved, so any brain can ask "have I seen this before?" before committing. It turns a repeating market into evidence instead of a fresh surprise every time.
  • 5 MI modules: events (CPI/NFP/FOMC) · episodes (high-variance windows) · outcomes (how each episode resolved) · store (durable IDB) · intelligence (the read API)
  • Every brain can query MarketIntelligence for context before committing — "this regime resolved BEARISH in the last 8 sessions"
  • 3-minute per-type cooldown — repeated insights don't spam the narration channel
  • Was its own layer (MI) — folded into Pillars as the seventh because it's shared infrastructure, not a brain
Prevents The same regime fooling the system twice — memory turns event recurrence into evidence, not surprise.
engine-noa-market-intelligence.js · IDB nqelite_mi_store_v1
5 modules · durable · 3-min cooldown
The Race L3
10 brains on one tape — the active battlefield (Phase 1). Same substrate, same pillars, different reads on what fires.
10 BRAINS
Same race, different reads. Every brain consumes the Substrate (L1) and inherits the Pillars (L2). They diverge on which setups they take, how strict their conviction gate is, and which regime cells they specialize in. Phase 1 asks: which racer's CI-lower clears +0.3R per trade on ≥100 fires across ≥2 regime cells, Holm-corrected across N=10? Winner snapshot freezes into Phase 2. Until then, all 10 race in shadow on the live tape.
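The Phase 1 question combines a multiple-comparison correction with a per-cell floor. The sketch below shows the standard Holm-Bonferroni step-down procedure and one possible reading of the gate; the alpha level, the cell dictionary layout, and the choice to count fires across all cells are assumptions, since the atlas states the thresholds but not the exact aggregation.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down. Returns booleans aligned with p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: once one test fails, every larger p-value fails too
    return reject


def phase1_clears(cells, ci_floor=0.3, min_fires=100, min_cells=2):
    """cells: list of {"n", "ci_low", "holm_ok"} for one brain's regime cells."""
    total_n = sum(c["n"] for c in cells)
    passing = [c for c in cells if c["holm_ok"] and c["ci_low"] > ci_floor]
    return total_n >= min_fires and len(passing) >= min_cells
```

Under this reading, a single Holm-significant cell with ci_low +0.166R does not clear the gate: it is below the +0.3R floor and it is only one cell.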
🏆 Portfolio Scoreboard Deduplicated · current handoff truth · all 10 racers · NOA leader · AGG losing book
+54.2R portfolio net · −$51,515 on stop-dist + commissions · 12,915 n
Brain | Kind | R | $ REAL | n | Phase 1 read
★ NOA | broad · self-discovery | +138.9 | +$12,659 | 660 | Real leader · NQ +100R · ES +39R · positive both symbols · Phase 1 contender
★ WCONS | weighted consensus | +117.4 | +$11,452 | 249 | NQ +90R · ES +27R · trend_up cell Holm-significant (n=63, ci_low +0.166R) · Phase 1 contender
PA | price-action lens | +6.1 | −$16,615 | 1,534 | R-vs-$ divergence · tier-sizing impact · NQ +56R / ES −50R · investigation flagged before any Phase 2 promotion
CONS | consensus | −1.3 | −$238 | 440 | Essentially breakeven · waiting for WCONS to graduate first
ANT-NOA | pre-arm head | −53.1 | −$12,386 | 897 | NQ −80R drags portfolio · counterfactual: strip top 5 leaks → would promote to NO_EDGE_YET
BRO | Brooks PA inheritance | −55.9 | −$12,387 | 274 | Negative both symbols · first auto-proposal: split by regime (wins trend_down, loses trend_up)
AGG | broad aggressive | −97.9 | −$34,001 | 8,861 | Losing book · ES bleeds −707R · ES quarantine + regime heal shipped 2026-06-06 · counterfactual proves the gap is STRUCTURAL: leak quarantines alone don't graduate AGG
PRECISION | whitelist over AGG | new | | 0 | 8th racer · 9 documented-winner cells (5 NQ gold mines + 4 ES survivors) · install-date floor 2026-06-09 09:30 ET
LLQ | regime whitelist over AGG | new | | 0 | 9th racer · low_liquidity-only filter (the Holm-significant edge cell n=3540 ci_low +0.077R)
SCHL / SAVT | self-learning · no setup catalog | abstain | | 0 | The only brains that learn the market themselves: Scholar (readable checklist) + Savant (learned representation). Train offline, ship frozen to browser. 2026-06-10: their lifetime-0 fires were a regime-bucket BUG (`_resolveRegime`→`'unknown'`→max CQL penalty→qLower −0.93), NOT "abstaining by design"; fixed (→ ProvenanceStamper). Now gates honestly; fires in trend/expansion. Still data-gated for retrain (≥6 RTH days, have 2).
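The headline totals can be cross-checked against the per-book rows. The figures below are copied from the scoreboard (books with 0 fires are omitted); the helper name is illustrative.

```python
# (R, dollars, n) per book, copied from the scoreboard rows.
BOOKS = {
    "NOA":     (138.9,  12659,  660),
    "WCONS":   (117.4,  11452,  249),
    "PA":      (6.1,   -16615, 1534),
    "CONS":    (-1.3,    -238,  440),
    "ANT-NOA": (-53.1, -12386,  897),
    "BRO":     (-55.9, -12387,  274),
    "AGG":     (-97.9, -34001, 8861),
}


def portfolio_totals(books):
    """Sum R (rounded to one decimal), dollars, and fire counts across books."""
    total_r = round(sum(r for r, _, _ in books.values()), 1)
    total_usd = sum(usd for _, usd, _ in books.values())
    total_n = sum(n for _, _, n in books.values())
    return total_r, total_usd, total_n
```

The rows reproduce the +54.2R net and the 12,915 fire count exactly. The dollar column sums to −$51,516, one dollar away from the −$51,515 headline, which may simply reflect per-book rounding.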
★ NOA
B1
In plain terms. The fleet's lead brain and its broadest: it hunts edge across every kind of setup rather than specializing, and discovers new patterns on its own instead of waiting to be told what to trade. Right now it leads the race and is one of only two racers (with WCONS) making money on both NQ and ES.
  • +138.9R · +$12,659 · n=660 deduplicated
  • Positive both symbols: NQ +100R · ES +39R
  • Pillar weighting + bias arbitration + auto-experiment graduation
  • Phase 1 contender — needs CI-lower clear at +0.3R on ≥2 regime cells
engine-noa-*.js · 32 modules
Phase 1 contender
★ WCONS
B2
In plain terms. This one trades no setups of its own — it listens to all the other brains and takes a weighted vote, trusting each in proportion to how well it's done lately. A "wisdom of the crowd" brain that leans hardest on whoever's currently hot.
  • +117.4R · +$11,452 · n=249
  • NQ +90R · ES +27R · positive both
  • First Holm-significant cell in the fleet: trend_up n=63 ci_low +0.166R (p<0.001)
  • Phase 1 contender alongside NOA
engine-wcons-*.js
Phase 1 contender
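The difference between a weighted vote (WCONS) and a plain equal vote (the CONS control) can be sketched with one function. How WCONS actually derives its weights from recent performance is not specified in this atlas, so the weights here are simply passed in; names are illustrative.

```python
def consensus(votes, weights=None):
    """votes: {brain: +1 long, -1 short, 0 flat}. Returns a score in [-1, 1]."""
    if weights is None:
        # Equal vote: every brain counts the same (the control case).
        weights = {brain: 1.0 for brain in votes}
    # Negative or missing weights are treated as zero trust.
    usable = {brain: max(weights.get(brain, 0.0), 0.0) for brain in votes}
    total = sum(usable.values())
    if total == 0:
        return 0.0
    return sum(votes[brain] * usable[brain] for brain in votes) / total
```

With votes of +1, −1, −1 the equal vote leans short (−0.33), while weights of 3, 0.5, 0.5 flip the same votes to +0.5: the weighting, not the votes, carries the decision.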
PA
B3
In plain terms. A pure price-action reader: it trades off the shape and quality of the bars themselves (clean structure, second-entry pullbacks), not order flow or news. It is the most-fired discretionary read on the tape (AGG fires more overall), but its dollars and its R disagree sharply, so it is flagged for a look before it can advance.
  • +6.1R but −$16,615 · severe R-vs-$ divergence
  • NQ +56R · ES −50R — symbol asymmetry
  • n=1,534 — most-fired discretionary read
  • Sizing-architecture investigation flagged before Phase 2
  • Real conviction tier: fusion-bias agreement + regime alignment grade each fire 1–4 (was hardcoded tier-2 on every fire, polluting the band tape-wide)
engine-price-action-state.js · price-action lens
Divergent · investigate
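A positive R total alongside a negative dollar total is possible whenever position size varies between trades. The worked example below uses made-up numbers (not PA's actual trades) to show the mechanism: small-sized winners, large-sized losers.

```python
def totals(trades):
    """trades: list of (r_multiple, dollars_at_risk). Returns (sum of R, sum of $)."""
    total_r = sum(r for r, _ in trades)
    total_usd = sum(r * risk for r, risk in trades)
    return total_r, total_usd


# Illustrative only: three +1R wins sized at $100 risk, two -1R losses sized at $400.
EXAMPLE = [(1, 100), (1, 100), (1, 100), (-1, 400), (-1, 400)]
```

`totals(EXAMPLE)` gives +1R but −$500: R treats every trade as equal, dollars weight each trade by its size, so a tier-sizing scheme that bets bigger on the losing fires produces exactly this kind of divergence.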
CONS
B4
In plain terms. The control experiment — a plain, equal vote of all the brains, nobody weighted. It exists to prove the weighted version (WCONS) is actually earning its keep: if simple averaging worked just as well, the weighting would be pointless.
  • −1.3R · −$238 · n=440
  • Essentially breakeven · the "no edge if we just average everyone" baseline
  • WCONS proves the weighting is doing real work · CONS proves it's necessary
  • Not a Phase 1 candidate — exists as the methodological control
engine-cons-*.js
control · breakeven
Anticipation Spine · 6 arms
B5
In plain terms. Six brains that try to get in early — arming a trade just before the signal fully forms, each specialized in a different read (price action, Brooks, order flow, aggressive, consensus). Five of the six make money together; the original lead arm is the one that bleeds.
  • ANT-PA · +$12,838 · 80.3% WR · n=66 — standout · price-action pre-arm
  • ANT-BRO · +$8,628 · 76.1% · n=46 · Brooks-style pre-arm
  • ANT-OF · +$4,129 · 76.2% · n=21 · order-flow pre-arm
  • ANT-AGG · +$2,366 · 65.0% · n=60 · broad-aggressive pre-arm
  • ANT-CONS · +$1,668 · 100% · n=5 (small-n caveat) · consensus pre-arm
  • ANT-NOA · −$12,386 · 21.9% · n=897 — the losing arm (NQ −80R drag) · counterfactual: strip top 5 leaks → promotes to NO_EDGE_YET
  • Strict Holm: 0 spine cells survive family-wise correction · spine arms have noisier per-cell distributions but win in aggregate
engine-anticipation-layer.js · engine-ant-{pa,bro,of,agg,cons}.js · arena pairs vs NOA reactive
+$29k aggregate ⟨realized · shadow⟩ · 1 leader bleeds
BRO
B6
In plain terms. Trades the classic Al Brooks price-action playbook: bar-by-bar reading, second entries, wedges, final flags. The catch: Brooks's hand-written odds are treated as starting assumptions, not facts, and they haven't survived the 2025+ market, where it loses on both symbols.
  • −55.9R · −$12,387 · n=274
  • Negative both symbols · Brooks priors don't survive 2025+ regime
  • First auto-proposal: split by regime — wins on trend_down (+0.39R), loses on trend_up (−0.52R)
  • Brooks's explicit probabilities = Bayesian priors, not facts
engine-noa-brooks.js v0.3 · 39 videos ingested
Bleeding · regime-split candidate
AGG
B7
In plain terms. The volume brain — it fires on every setup that clears the minimum bar, holding nothing back. That makes it the busiest and the biggest loser: it bleeds heavily on ES, and the gap is structural, not just a handful of bad setups.
  • −97.9R · −$34,001 · n=8,861 — the losing book
  • ES bleeds −707R alone · NQ +609R can't cover
  • ES quarantine shipped 2026-06-06 (3 setupId tuples) · regime heal recovered 1,647 labels
  • Counterfactual proves: structural edge gap — leak quarantines alone won't graduate it
engine-leading-edge-shadow.js · SHADOW_CONFIG.quarantine
Losing · MARGINAL_POSITIVE after heal
PRECISION
B8
In plain terms. A sharpshooter built on top of AGG — instead of firing on everything, it only takes the handful of specific setup-and-symbol combinations that have actually proven to win. Brand new, still gathering its first trades.
  • 9 (sym, setupId) cells: 5 NQ gold mines + 4 ES survivors
  • NQ: tape-climax-L/S · smt-cont-L/S · exhaust-rev-S
  • ES: tape-climax-L · smt-cont-L · overnight-cont-S · exhaust-rev-S
  • Install-date floor 2026-06-09 09:30 ET · no retro-credit · first Saturday read 2026-06-13
engine-shadow-book-precision.js · install-date floor
Accumulating · 0 fires
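A sketch of the PRECISION filter as described: a whitelist of (symbol, setupId) cells plus an install-date floor with no retro-credit. The setup id strings are an assumed expansion of the "L/S" shorthand on this card, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4))  # Eastern Time in June is UTC-4 (daylight time)
INSTALL_FLOOR = datetime(2026, 6, 9, 9, 30, tzinfo=EDT)

# Assumed id spelling: "tape-climax-L/S" expanded to "-L" and "-S" variants.
WHITELIST = {
    ("NQ", "tape-climax-L"), ("NQ", "tape-climax-S"),
    ("NQ", "smt-cont-L"), ("NQ", "smt-cont-S"),
    ("NQ", "exhaust-rev-S"),
    ("ES", "tape-climax-L"), ("ES", "smt-cont-L"),
    ("ES", "overnight-cont-S"), ("ES", "exhaust-rev-S"),
}


def precision_accepts(symbol, setup_id, fired_at):
    """Accept only whitelisted cells fired at or after the install-date floor."""
    if fired_at < INSTALL_FLOOR:
        return False  # no retro-credit for fires before the book existed
    return (symbol, setup_id) in WHITELIST
```

The floor check comes first so that historical winners in a whitelisted cell never count toward the new book.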
LLQ
B9
In plain terms. A single-question experiment: does AGG do better if it only trades in thin, low-liquidity conditions — the one market state where its edge tested positive? It's AGG with one regime filter, there to settle that question.
  • Asks: does AGG-restricted-to-low_liquidity outperform AGG-broad?
  • Built on the Forge's Holm-significant edge cell: low_liquidity n=3540 ci_low +0.077R
  • Same install-date floor as PRECISION
  • Hypothesis instrument — tests whether regime-filter alone is enough
engine-shadow-book-llq.js
Accumulating · regime test
SCHL + SAVT · The Self-Learning Brains
B10
The only brains in the fleet that learn the market themselves — no setup catalog, no hand-coded pillars, no human-written rules. They look at the data and discover their own edge.
  • Every other brain (NOA · BRO · AGG · PA · etc.) trades setups we designed. SCHL + SAVT design their own.
  • SCHL — the Scholar: discovers edge as a readable checklist. We can audit what it learned and why. Asks: can edge be NAMED?
  • SAVT — the Savant: discovers edge as a learned feature representation. Cannot explain itself in words — it just feels the pattern. Asks: can edge be FELT?
  • Two paradigms, one race — they're testing whether trader edge is something we can name or only something the machine can feel.
  • Safe by construction: trained wild offline in Python; a frozen snapshot ships to the browser. They learn in the lab, never on live capital.
  • Data-gated for retrain (≥6 RTH session-days of arena tapes, have 2). They'd rather stay silent than fire on a half-trained policy.
  • ⚠ 2026-06-10 bug fix: their lifetime-0 fires were NOT "abstaining by design" — a regime-bucket bug (`_resolveRegime` returned `'unknown'`, a key absent from the CQL table) pinned the conservatism penalty at max (1.0) → qLower ≈ −0.93 on every candidate, strangling a validated +0.087R edge. Fixed (→ `ProvenanceStamper.getRegimeKey` + `regimeAtFire` stamp); qLower −0.93 → +0.0147. They now gate honestly and fire in trend/expansion where qHat clears the floor.
engine-noa-solo.js · scholar_policy_live.json · savant_policy_live.json
pure self-learning · regime-bucket fixed 06-10
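The failure mode behind the 2026-06-10 fix, an unrecognized key silently receiving the maximum penalty, is a general lookup pattern worth seeing in miniature. The table values and regime keys below are hypothetical; only the max penalty of 1.0 and the `'unknown'` key come from the write-up above.

```python
# Hypothetical penalty table; the real CQL table is not shown in this atlas.
CQL_PENALTY = {"trend": 0.2, "expansion": 0.3}
MAX_PENALTY = 1.0


def penalty_silent(regime_key):
    """The bug pattern: a missing key quietly falls back to the maximum penalty."""
    return CQL_PENALTY.get(regime_key, MAX_PENALTY)


def penalty_strict(regime_key):
    """Fail loudly instead, so a bad key surfaces as an error rather than as silence."""
    if regime_key not in CQL_PENALTY:
        raise KeyError(f"regime key {regime_key!r} is not in the penalty table")
    return CQL_PENALTY[regime_key]
```

With the silent default, every candidate stamped `'unknown'` looks maximally risky and never fires, which is indistinguishable on screen from a brain that is abstaining by design.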
🏃 Run modes — same for all 10 brains
🧊 Cold · CALIBRATION — shadow capture, no signals shown. A tool, not the default.
🔥 Hot · NOW · HOT — every brain fires real signals visibly, Paper Trader runs on screen. No real account.
🤖 Automated · LIVE — graduated brain executes against real broker (Phase 3 ladder).
The Lab + The Forge L4
how the system discovers, governs, and calibrates itself — the Lab measures, the Co-Pilot watches and acts, the Forge stamps + ranks findings, the Constitution fences the layers, and the calibration loop grounds the roster
15 CARDS
Three instruments, one loop.
The Lab (docs/research_plan.html) is where measurement happens: three tabs (Measure · Optimize · Discover) read the deduplicated truth and answer "what does the data say this week?"
The Co-Pilot (docs/copilot.html) is the 6-faced sentinel: Hunter / Cross-Brain find leaks · Scout / Oracle find rising edges · the scanner self-schedules at 09:25 + 16:05 ET weekdays + a Saturday-night discovery slot (14:05 ET) · the AUTO watcher acts on graded-eligible findings (money-path permanently blocked).
The Forge (docs/optimization.html) is where findings get persisted: cell-significance scans run with Holm-Bonferroni + session-clustered bootstrap CI, lifecycle stamps each finding (NEW / PERSISTENT / CONFIRMED / DROPPED), counterfactual ranking tests "would stripping this leak graduate the book?", auto-emit ships paste-ready quarantine patches + specialist-book stubs, and the Synthesizer composes regime-route policy candidates against the OOS gauntlet.
The Discovery Lab (claude/auto_audit.py + lab_investigator.py + lab_features.py) is the autonomous quant researcher. It invents thousands of hypothesis cells per sweep across 30 dimensions: 18 recorded (incl. the standing watches: news event/phase/aftertaste, roll window, institutional alignment, payout-multiple) + 12 derived senses (MFE/give-back/round-trip/heat/life/risk/RR bands, cross-book agreement, follower, regime alignment, streak context). It screens them through BH-FDR + parent-lift, tortures survivors with an 11-test interrogation battery (strip-best, day-consistency, R-vs-$ money-truth, composition, cost floor…), grades each with a verdict ladder (CONFIRMED→DEAD) + evidence score 0–100, auto-freezes forward tests, runs the operator's question-templates forever, and briefs the Co-Pilot in desk language.
Two gears: sentinel daily (forward-test health + edge-rot, quiet unless burning) · full discovery every Saturday night (findings land while the market is closed).
It discovers, interrogates, and talks; it wires nothing. The system has memory of its own discoveries and a sentinel watching them.
🧪 The Lab · docs/research_plan.html measurement + research + discovery surface 5 cards
Measure Tab · Standings
L1
In plain terms. The weekly report card — it ranks every brain on how it's actually doing, broken down by market condition, and only calls an edge real once the statistics clear a strict bar. This is where you look to answer "who's winning, and is it real?"
  • Per-book aggregate: n, mean R, PF, win rate, session-clustered bootstrap CI
  • 9-cell heatmap: regime × outcome — colors graduate by sample density
  • Phase 1 tier: GRADUATION_LIKELY / MARGINAL_POSITIVE / NO_EDGE_YET / MARGINAL_NEGATIVE / QUARANTINE_LIKELY
  • Deep slices: per-setup, per-symbol, per-hour, per-session — drill on any racer
  • 6-check pipeline at the bottom enforces a single-source-of-truth read
research_plan.html · #measureView · v20260607-r6
CI-lower · Holm · 9-cell grid
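A session-clustered bootstrap resamples whole trading sessions rather than individual fires, so trades from the same day stay together. The sketch below is one common percentile form of that idea; the resample count, alpha, and function name are assumptions, and the Lab's actual implementation may differ.

```python
import random


def clustered_bootstrap_ci_low(r_by_session, n_boot=2000, alpha=0.05, seed=0):
    """Lower percentile bound for mean R, resampling sessions with replacement.

    r_by_session: {session_id: [R multiples]}, each session assumed non-empty.
    """
    rng = random.Random(seed)
    sessions = list(r_by_session.values())
    means = []
    for _ in range(n_boot):
        # Draw as many sessions as we have, with replacement, keeping each intact.
        draw = [rng.choice(sessions) for _ in sessions]
        sample = [r for session in draw for r in session]
        means.append(sum(sample) / len(sample))
    means.sort()
    return means[int((alpha / 2) * n_boot)]
```

Resampling by session usually gives a wider interval than resampling by trade, which is the point: fires on the same day share the same tape and are not independent evidence.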
Optimize Tab · The Race
L2
In plain terms. The workshop view — it shows the live race, how close each brain is to graduating, and which leaks are worth fixing first. From here you jump to the Forge for the deep dig.
  • The Race: top quarantine candidates ranked by counterfactual ΔR — promotion-ready first
  • Graduation Ladder: per-book %s derived from verdict tier + would-promote flag
  • Calibration Bay: per-book aggregate + top 6 leaks + top 4 experiments + living loop
  • 🔧 Links to the Forge (`docs/optimization.html`) for deeper hyp drill-down
  • ARENA Phase 2(b) LIVE (2026-06-08): confirmation × exit-policy counterfactual cells from resolver_vm matrix replay across 36k ARM tapes — all 5 structural anticipation arms (ANT-PA · AGG · CONS · BRO · OF), brain-matched (each brain's confirmations replay only on its own tapes → genuinely distinct cells, e.g. PA +0.07/−0.03 vs AGG +0.04/+0.01). CI-backed. ANT-NOA out (learned pre-arm path). Historical cells use midPx-synthesized bars; live v1.1 capture adds engine bars + the 4 flow confirmations going forward.
research_plan.html · #optimizeView · build_races_and_ladder() · trigger_arena_replay.py · build_arena_proper()
races + ladder + cal bay + ARENA Phase 2(b)
Discover Tab · The Learn Pane
L3
In plain terms. The "what did we get wrong" tab — it ranks the week by where the system was most confidently mistaken, so the biggest surprises rise to the top instead of hiding inside the average.
  • Surprise rank: rank everything by how wrong our prior was — biggest deltas first
  • Confidence × wrongness: high-conviction misfires get the loudest signal
  • Regime mix shifts: detect when the distribution moves under us
  • Blind spots: regimes where n is too small to trust either way
  • Live now · was rendering fixture data for an unknown period (caught 2026-06-06)
research_plan.html · #discoverView · j.lab.discover.surprise
surprise · wrongness · blind spots
Epistemic Suite · 5 tools
L4
In plain terms. A set of five research tools whose only job is to find what we don't yet know — what surprised us, where we're blind for lack of data, and which next trade would teach us the most. It proposes; it never trades.
  • surprise_rank.py · what was most surprising vs prior
  • blind_spots.py · regimes / setups with insufficient evidence
  • auto_experiment.py · pre-registered shadow tests, no fishing
  • unknown_clusters.py · clusters of unnamed edge
  • active_learning.py · what next fire would teach us most
  • Fail-closed below the 10-session floor · discovery proposes, pre-reg disposes
claude/epistemic_common.py · 5 tools · EPISTEMIC_LAYER_DESIGN.md
discovery engine · fail-closed
Discovery Lab · Auto-Audit
L5
In plain terms. The autonomous researcher — it dreams up thousands of possible edges, assumes each is fake until proven otherwise, puts the survivors through a brutal test battery, and reports back in plain desk language. It investigates and talks; it changes nothing on its own.
  • Hypothesis generator: ~4,100 cells/sweep over 21 dims (9 recorded + 12 derived senses incl. cross-book agreement + streak) + 60 seeded deep probes
  • Screen: BH-FDR (q=0.10) + parent-lift — a child cell must BEAT its parent or it's the parent's edge in a costume
  • 11-test battery → reason chain: strip-best trade/day · day/regime consistency · cross-ticker · cost floor · outlier · composition · R-vs-$ money-truth
  • Verdict ladder CONFIRMED/PROMISING/NEW/THIN/DIVERGENT/ARTIFACT/DEAD (edge-rot) + evidence 0–100 · counterfactual Δ · auto-frozen forward tests
  • Question templates: every operator question becomes a permanent investigation (10 live)
  • Two gears: sentinel daily (gate health, quiet) · full sweep Saturday night — desk brief pinned in the Co-Pilot WATCH zone
  • Outcome features (MFE/give-back/…) = lookahead → DIAGNOSTIC only, never tradable edges
claude/auto_audit.py · lab_investigator.py · lab_features.py · question_templates.json
presumed false until proven · 2 gears
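The screen step in the Discovery Lab card above names BH-FDR at q=0.10. A minimal sketch of the Benjamini-Hochberg step-up procedure follows, assuming each candidate cell arrives with a p-value. The function name `bh_fdr_survivors` and the input shape are illustrative and are not taken from `auto_audit.py`; the parent-lift check is not modeled here.

```python
from typing import Dict, List


def bh_fdr_survivors(p_values: Dict[str, float], q: float = 0.10) -> List[str]:
    """Return the cell ids that survive Benjamini-Hochberg at FDR level q."""
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ranked)
    cutoff_rank = 0
    # Step-up rule: keep the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    for k, (_, p) in enumerate(ranked, start=1):
        if p <= (k / m) * q:
            cutoff_rank = k
    # Every cell ranked at or below the cutoff survives, even if an earlier one missed its own line.
    return [cell for cell, _ in ranked[:cutoff_rank]]


# Hypothetical p-values for five candidate cells.
example_cells = {"A": 0.001, "B": 0.012, "C": 0.04, "D": 0.30, "E": 0.75}
example_survivors = bh_fdr_survivors(example_cells, q=0.10)
```

With these made-up inputs the per-rank thresholds are 0.02, 0.04, 0.06, 0.08 and 0.10, so cells A, B and C pass and D and E are screened out.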
🐙 The Co-Pilot · docs/copilot.html · 6-faced sentinel: finds problems, recommends fixes, acts in one click · 1 card
🐙 The Octopus · 6 faces
CP
In plain terms. The watchdog that doesn't just measure the brains — it hunts for problems and acts. Six "faces": two sniff out leaks, two spot edges that are rising, one runs the scans on a schedule, and one is allowed to act on findings that have earned it — never anything touching real money.
  • Hunter · finds bleeds → daily bleed / weekly cell bleed / engine paused / parent-silent-while-arm-profits
  • Cross-Brain · catches when one brain bleeds while a sibling profits on the same setup
  • Scout · finds things getting BETTER → FIX_VALIDATED · EDGE_RISING · GRADUATION_CANDIDATE
  • Oracle · warns about what's coming → CELL_DRIFT_WARN · REGIME_SHIFT · BRAIN_CORRELATION_SPIKE
  • Scanner self-scheduler · runs copilot_scan.py at 09:25 + 16:05 ET weekdays (ET-anchored, DST-safe, boot-catch-up)
  • AUTO watcher · armed; per action type opt-in; eligibility gate ≥20 grades + ≥90% hit-rate; money-path PERMANENTLY blocked at two layers
engine-noa-analyst.js (façade, 438 LOC) → engine-noa-findings-bus + engine-noa-analyst-tier1/tier2/tier3 + engine-noa-bridge · engine-noa-actions.js v0.2.0 · claude/shadow_race_disk_server.py v0.4.0 · /scanner_status
6/6 faces live · 2 scheduled scans/day · analyst decomposed 2026-06-09 (75K→18K core)
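The AUTO watcher bullet in the card above states an eligibility gate of at least 20 grades and a hit-rate of at least 90%, per-action opt-in, and a permanent money-path block. A minimal sketch of that gate follows. The record shape, the function name `auto_eligible`, and the action names in `MONEY_PATH_ACTIONS` are hypothetical, and only one of the two blocking layers the card mentions is modeled.

```python
from dataclasses import dataclass

MIN_GRADES = 20        # from the card: >= 20 graded outcomes
MIN_HIT_RATE = 0.90    # from the card: >= 90% hit-rate
# Hypothetical action names; the real list is not given in the source.
MONEY_PATH_ACTIONS = frozenset({"place_order", "modify_order"})


@dataclass(frozen=True)
class ActionRecord:
    action_type: str
    opted_in: bool   # per-action-type opt-in
    grades: int      # number of graded past recommendations
    hits: int        # how many of those grades were correct


def auto_eligible(rec: ActionRecord) -> bool:
    """True only when an action type has earned the right to run unattended."""
    if rec.action_type in MONEY_PATH_ACTIONS:
        return False  # money-path is never eligible, regardless of track record
    if not rec.opted_in or rec.grades < MIN_GRADES:
        return False
    return rec.hits / rec.grades >= MIN_HIT_RATE
```

Checking the money-path block first means a perfect track record can never unlock it, which matches the "PERMANENTLY blocked" wording.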
🔬 The Forge · docs/optimization.html · instrument with memory: scan, classify, persist, surface, compose · 8 cards
Cell-Significance Scanner
F1
In plain terms. The lie-detector for edges — it takes every brain-and-condition slice and asks whether its profit could just be luck, using strict statistics and a held-back chunk of data to confirm the edge survives out of sample.
  • 5 classifications: gold_mine · edge_cell · severe_leak · bleed_cell · dead_weight
  • Holm-Bonferroni across the family of cells per book — multiple-testing honest
  • Session-clustered bootstrap for the CI — same-session fires aren't independent
  • Walk-forward holdout (2026-06-07) — newest 30% of fires by ts held out; each hyp gets holdout_passes. First scan caught a NOA edge_cell overfit (train +0.388R → holdout −0.659R, sign flipped)
  • Regime-rotation expectancy (2026-06-07): regime_weighted_meanR + regime_weighted_usd_per_week; cells whose edge sits in now-rare regimes get marked down
  • Temporal stability + $-impact ranking — surface the load-bearing leaks first
claude/cell_significance.py · scan() + build_arena_early() + 4 gates
Holm · cluster CI · holdout · regime-weighted
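The scanner card above applies Holm-Bonferroni across the family of cells in each book. A minimal sketch of the Holm step-down rule follows, assuming one p-value per cell. The name `holm_rejections` and the `alpha=0.05` default are assumptions; the source does not state the family-wise level it uses.

```python
from typing import Dict


def holm_rejections(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, bool]:
    """Holm step-down: map each cell id to True if its null hypothesis is rejected."""
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ranked)
    rejected = {cell: False for cell in p_values}
    for i, (cell, p) in enumerate(ranked):
        # The i-th smallest p-value (0-based) is compared against alpha / (m - i).
        if p <= alpha / (m - i):
            rejected[cell] = True
        else:
            break  # once one cell fails, every larger p-value fails too
    return rejected


# Hypothetical p-values for the cells of one book.
example_book = {"a": 0.010, "b": 0.040, "c": 0.030, "d": 0.005}
example_verdicts = holm_rejections(example_book)
```

In this made-up family, `d` and `a` clear their thresholds (0.0125 and about 0.0167), `c` fails at 0.025, and `b` is never tested even though 0.04 is below 0.05 on its own. That early stop is what keeps the family honest about multiple testing.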
Hypothesis Lifecycle
F2
In plain terms. The Forge's memory — every edge it finds gets tagged (new, holding, confirmed, or dead) so the system remembers what it discovered and notices the moment an edge stops being true.
  • Lifecycle: 🆕 NEW (absent from prior) · 🔁 PERSISTENT (in prior + current, < 3 dates) · ✓ CONFIRMED (≥3 distinct ET dates) · ❌ DROPPED (in prior, gone from current)
  • Drift status (2026-06-07): on first sighting, snapshot install_meanR / install_ci_low / install_usd_per_week. Each later scan computes sign-preserving drift_ratio against the snapshot
  • STABLE (≥0.95) · SLIPPING (0.7-0.95) · DRIFTING (<0.7) · INVERTED (sign flipped) · DEAD (CI now brackets zero) · NEW (no baseline)
  • Loudest alarm = CONFIRMED + DEAD — was real for ≥3 scan-dates, significance now lost (the "silently dying edge" case)
  • Distinct-scan-date counting prevents 50 Monday loads falsely confirming a Monday-only finding
claude/hyp_history.py · install snapshot + drift_status · current.json · scans.jsonl · actions.jsonl
4 lifecycle · 6 drift · CONFIRMED+DEAD = loudest
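The six drift states in the card above can be sketched as a small classifier. Two things here are assumptions rather than facts from `hyp_history.py`: the drift ratio is taken to be current mean R divided by the install mean R (so a sign flip makes it negative), and the precedence NEW, then DEAD, then INVERTED, then the ratio bands is a guess at the ordering.

```python
from typing import Optional


def drift_status(
    install_mean_r: Optional[float],
    current_mean_r: float,
    current_ci_low: float,
    current_ci_high: float,
) -> str:
    """Classify a hypothesis against its install snapshot (assumed precedence)."""
    if install_mean_r is None or install_mean_r == 0:
        return "NEW"        # no usable baseline to compare against
    if current_ci_low <= 0.0 <= current_ci_high:
        return "DEAD"       # the CI now brackets zero
    ratio = current_mean_r / install_mean_r  # sign-preserving by construction
    if ratio < 0:
        return "INVERTED"   # the sign flipped relative to the snapshot
    if ratio >= 0.95:
        return "STABLE"
    if ratio >= 0.7:
        return "SLIPPING"
    return "DRIFTING"
```

For example, an edge installed at +0.40R that now measures +0.30R with a CI of [+0.05, +0.50] has a ratio of about 0.75 and reads SLIPPING; the same edge with a CI of [−0.10, +0.30] reads DEAD.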
Counterfactual Ranking
F3
In plain terms. Asks the "what if we fixed this" question — if we stopped a brain from taking its worst leak, would it actually graduate to a better grade? Ranks the leaks by how much fixing each one would help.
  • Per-leak: simulates strip + recomputes aggregate ci_low via session-clustered bootstrap
  • Maps ci_low → Phase 1 tier · stamps counterfactual_would_promote: bool
  • Identifies counterfactual_promotion_depth — first depth where book promotes
  • The load-bearing finding: AGG's edge gap is STRUCTURAL — strip all 6 leaks → still MARGINAL_NEGATIVE
  • ANT-NOA: strip 5 → would promote · PA: 1 leak insufficient
cell_significance._compute_counterfactual_path()
structural gap detector
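The per-leak step in the card above (strip the leak, then recompute the aggregate ci_low with a session-clustered bootstrap) can be sketched as follows. The fire tuple shape, function names, resample count, seed, and the two-sided 95% percentile interval are all assumptions for illustration; the mapping from ci_low to a Phase 1 tier is not modeled because the source does not give the thresholds.

```python
import random
from typing import Dict, List, Tuple

Fire = Tuple[str, str, float]  # (session_id, cell_id, R) - hypothetical shape


def clustered_ci_low(fires: List[Fire], n_boot: int = 2000,
                     alpha: float = 0.05, seed: int = 7) -> float:
    """Lower percentile bound on mean R, resampling whole sessions, not fires."""
    if not fires:
        raise ValueError("need at least one fire")
    by_session: Dict[str, List[float]] = {}
    for session, _, r in fires:
        by_session.setdefault(session, []).append(r)
    sessions = sorted(by_session)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    means = []
    for _ in range(n_boot):
        pooled: List[float] = []
        # Draw sessions with replacement; each draw brings all of its fires along.
        for s in rng.choices(sessions, k=len(sessions)):
            pooled.extend(by_session[s])
        means.append(sum(pooled) / len(pooled))
    means.sort()
    return means[int((alpha / 2) * n_boot)]


def counterfactual_ci_low(fires: List[Fire], leak_cell: str) -> float:
    """ci_low of the book if it had never taken fires in leak_cell."""
    return clustered_ci_low([f for f in fires if f[1] != leak_cell])
```

Resampling sessions rather than individual fires is the point: fires inside one session share the same tape, so treating them as independent would make the interval look tighter than it is.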
Dead-Weight Detection
F4
In plain terms. Finds the trades that are pure noise — lots of volume, zero edge either way. Not losers exactly, just dead weight that piles on risk and cost without adding any profit.
  • The OPPOSITE pattern from edge/leak — not bleeding, not edge, just commission burn + variance source
  • Card body text projects the commission burn over the disk window
  • Independent of Holm — these don't compete for significance, they're confirmed null
  • First-run: 0 cells qualify (CIs too wide at 7 sessions) · framework in place for when disk grows
cell_significance._classify_dead_weight()
5th classification
Auto-Emit · Patches + Stubs
F5
In plain terms. Turns a finding into ready-to-paste code — closing the gap between "we spotted a leak" and "we shipped the fix." The Forge doesn't just point at the problem; it hands you the patch.
  • Quarantine patch: AGG (sym,setupId) → 1-line tuple · regime → 2-path guidance
  • Specialist book stub: 99-line LLQ/PRECISION-pattern module + persistence entry
  • 📋 Copy buttons per code block (clipboard API)
  • Workflow: "see finding (1 min) → review + paste (3-5 min)" — 6× speedup vs. hand-writing
  • Never auto-writes JS files — emitter outputs text in JSON, rendered as HTML
cell_significance._emit_quarantine_patch() · _emit_specialist_book_stub()
6× workflow speedup
Radial Examiner
F6
In plain terms. A one-brain X-ray — slice it any single way (by setup, by hour, by market type) and see exactly what makes that brain win or lose, value by value.
  • 6 inner-ring conditions: setupId · regime · symbol · direction · convictionTier · exit
  • 16 outer-ring context fields (VIX regime · gamma · HTF align · M1 episode · CVD trend · flow bias)
  • Session-clustered CI per value · 'pos' / 'neg' / 'inc' classification
  • First auto-surfaced proposal: BRO — split by regime (trend_down wins, trend_up loses)
cell_significance.build_examiner() · renderExaminer()
6 inner · 16 outer (pending)
Discoveries Chip + Dropped Archive
F7
In plain terms. The Forge's alert light — it waves when something new shows up and keeps a memory of what faded away, so a real find isn't missed and a dead one isn't chased twice.
  • Top-right chip: 🔬 polls /discoveries every 5 min
  • Amber pulse when N new hyps since last seen scan_date
  • Green steady when confirmed hyps exist · grey faint when idle/offline
  • Click → opens Forge in new tab + clears the alarm (last-seen scan_date stamp)
  • Dropped archive panel below hyp grid: last 15 hyps with kind, $/wk, lifecycle, stability
assets/js/engine-discoveries-chip.js · /hyp_action POST · dropped.jsonl
5-min poll · 4 states · archive
★ The Synthesizer · 5-stage gauntlet
F8
In plain terms. Doesn't just pick a winning brain — it builds a combined strategy that routes to different brains by market condition, then puts that blended result through a tough out-of-sample gauntlet before it can be called ready.
  • Stage 1: per-cell stats with session-clustered bootstrap CI on resolved fires (scanner quarantines applied)
  • Stage 2: regime route — per (sym × regime) pick the book with highest CI-lower at n≥30 + mean R ≥ +0.05R
  • Stage 3: correlation dedup — buckets aligned fires by (sym, ~5-min, direction) and flags pairs aligning ≥3×/week
  • Stage 4: OOS gauntlet on the COMPOSED stream — walk-forward + session-clustered CI on the OOS pool, not on the parts
  • Stage 5: frozen artifact mirroring the Cerberus policy schema generalized
  • First live run: 14,828 outcomes / 21d → 119 cells → 6 routes → 396 composed historical fires → 100% folds positive → OOS CI-lower +0.626R per trade → PROMOTION_READY
  • TRADABLE CUTOVER (2026-06-10): the composer now routes only on HOLDABLE fires — each book as a real netting account (≤1 position per FAMILY risk slot — NQ/MNQ one NASDAQ slot, ES/MES one SP slot), via shadow_tape.tradable_per_book (TRADABLE_ONLY). Routing on the raw stacked tape promoted books whose edge is un-holdable fantasy.
  • The rerank: 9 → 7 routes. AGG (raw +1037R but tradable −334R — fires 16,921×, 730 holdable, those lose) and PRECISION (+1288R → +52R, 96% fantasy) REMOVED from the strategy. New routes → ANT-OF · ANT-PA · ANT-CONS · WCONS (the tradable leaders; WCONS ×3). Still PROMOTION_READY, OOS CI-lower +0.550 — the edge survives the honesty filter, the fantasy doesn't.
  • ⚠ Caveat: pre-haircut. PROMOTION_READY = backtest verdict, not deployment authorization. Phase 1 Trust Calibration measures what survives a real broker — see L6.
  • Independently corroborated 2026-06-10: the Twin Timing Edge instrument (`claude/twin_timing_edge.py`) — a matched-pair method that pairs each reactive brain with its anticipation twin on the SAME setup (via `coFireClusterId`) and measures ΔR on the honest co-fired subset — found the SAME routing the synthesizer did (route to ANT arms where they win; PA/ANT-PA +0.45R [+0.29,+0.60] n=409). ZERO conflicts on diff. Two independent methods converge. Reactive brains kept as the control arm; the twin-table is a corroborator/watchdog, NOT an override engine. Finding: edge is a better entry RULE not earlier timing; conditioning flips signs (NOA loses overall but wins +0.83R in expansion).
claude/synthesizer.py · docs/synthesizer.html · synthesizer_policy_v1.json · claude/twin_timing_edge.py · twin_timing_table.json
TRADABLE CUTOVER · +0.550R CI-lower · 7 routes · AGG+PRECISION removed · pre-haircut
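Stage 2 of the Synthesizer card above picks, for each (sym × regime), the book with the highest CI-lower among cells with n≥30 and mean R ≥ +0.05R. A minimal sketch of that selection follows. The `CellStats` shape and the function name `regime_routes` are hypothetical, and the example numbers are made up; whether the real composer also requires a positive CI-lower is not stated in the source, so it is not enforced here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class CellStats:
    book: str
    sym: str
    regime: str
    n: int
    mean_r: float
    ci_low: float


def regime_routes(cells: Iterable[CellStats], min_n: int = 30,
                  min_mean_r: float = 0.05) -> Dict[Tuple[str, str], str]:
    """Map each (sym, regime) to the qualifying book with the highest CI-lower."""
    best: Dict[Tuple[str, str], CellStats] = {}
    for c in cells:
        if c.n < min_n or c.mean_r < min_mean_r:
            continue  # fails the sample-size or mean-R floor
        key = (c.sym, c.regime)
        if key not in best or c.ci_low > best[key].ci_low:
            best[key] = c
    return {key: c.book for key, c in best.items()}


example_routes = regime_routes([
    CellStats("ANT-PA", "NQ", "trend_up", 40, 0.20, 0.10),
    CellStats("WCONS", "NQ", "trend_up", 35, 0.15, 0.12),
    CellStats("AGG", "NQ", "trend_up", 10, 0.50, 0.40),   # too few fires
    CellStats("BRO", "ES", "range", 60, 0.01, -0.05),     # mean R below floor
])
```

Ranking on CI-lower rather than mean R means a book with a flashier average but a wider interval loses the route, and a (sym, regime) with no qualifying book simply gets no route.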
💰 Tradable Projection · the anti-lie
T1
In plain terms. The honesty check on profit — a single brain is a signal lab, and adding up overlapping signals as P&L is a lie a real account could never have held. This counts only the trades you could actually have taken, in real dollars.
  • The rule: ≤1 position per FAMILY risk slot (NQ/MNQ = one NASDAQ slot, ES/MES = one SP slot — a micro is a size rung of the slot, never a second position; risk-slot law 2026-06-12) · no hedge. Interval-overlap netting walk over the time-ordered fires.
  • One tape, two projections: raw (every fire = edge sample, for discovery) + tradable (what an account could hold, for trust/routing). The raw tape is untouched.
  • The lie quantified (per-book raw → tradable): PRECISION +1288 → +52 (96% fantasy) · AGG +1037 → −334 (FLIPS NEGATIVE) · NOA +217 → +22 · WCONS +166 → +131 (most honest, 21% blocked). It cuts both ways — PA/BRO score HIGHER tradable (stacking HID their edge). It reranks the race.
  • SSOT lockstep: BookCanon.tradable (JS, live cards) ↔ shadow_tape.tradable (Python, canonical — reads real direction + resolveTs). The cutover feeds the Synthesizer (F8).
  • Staged: dashboard headline flip (needs JS rows enriched w/ dir+resolveTs — stripped rows over-block high-freq books) · trust pipeline (separate Plane-B path) · graduation firing-gate (a brain literally holds ≤1/ticker once promoted).
assets/js/engine-book-canon.js · claude/shadow_tape.py · claude/synthesizer.py
netting projection · AGG +1037→−334 · reranks the race
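The netting rule in the card above (at most one position per family risk slot, walked over time-ordered fires) can be sketched as below. The `Fire` shape and function name are hypothetical, and the canonical implementation is `shadow_tape.tradable`, not this sketch. Two simplifications are assumptions: a new fire may open at the exact timestamp the previous one resolves, and direction is ignored because any overlapping fire in the same slot is blocked either way.

```python
from dataclasses import dataclass
from typing import Dict, List

# From the card: a micro is a size rung of its family slot, never a second position.
FAMILY_SLOT = {"NQ": "NASDAQ", "MNQ": "NASDAQ", "ES": "SP", "MES": "SP"}


@dataclass(frozen=True)
class Fire:
    sym: str
    open_ts: float
    resolve_ts: float
    r: float


def tradable_projection(fires: List[Fire]) -> List[Fire]:
    """Keep only fires a single netting account could actually have held."""
    busy_until: Dict[str, float] = {}
    held: List[Fire] = []
    for f in sorted(fires, key=lambda x: x.open_ts):
        slot = FAMILY_SLOT[f.sym]
        if f.open_ts >= busy_until.get(slot, float("-inf")):
            held.append(f)
            busy_until[slot] = f.resolve_ts  # slot stays occupied until this fire resolves
        # otherwise the slot is occupied and the fire is blocked
    return held


example_tape = [
    Fire("NQ", 0, 10, 1.0),
    Fire("MNQ", 5, 8, 2.0),    # blocked: NASDAQ slot busy until t=10
    Fire("ES", 5, 20, -1.0),
    Fire("NQ", 10, 15, 0.5),
    Fire("MES", 19, 25, 3.0),  # blocked: SP slot busy until t=20
]
raw_r = sum(f.r for f in example_tape)
tradable_r = sum(f.r for f in tradable_projection(example_tape))
```

On this made-up tape the raw sum is +5.5R while the holdable sum is +0.5R, which is the same shape of gap the card reports per book. The input list is left untouched, mirroring the "one tape, two projections" rule.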
🧬 Meta Thesis Engine
MB-T
In plain terms. Shifts the unit of thinking from "a trade" to "a thesis" — one living directional idea per market, born from the evidence, scored continuously, and killed when it stops being true. Born, lives, dies, learns.
  • The hard bound: ≤1 open position per FAMILY risk slot (NQ/MNQ, ES/MES — micros are size rungs of the slot, never a second position) · no hedge, intra-ticker or across the family. Without it the shadow P&L logs fills a real netting account could never take — a gross lie. Caught live: ES 8 open shorts / NQ 3.
  • Thesis Health: every brain ±, regime ±, EMA-smoothed. Reversal = flip only when the incumbent thesis is DEAD and the opposite is born. Exit-on-death = close when the idea dies, not only on stop.
  • Honesty (Pillar IX): below the n=8 floor → "I don't know" → death/reversal DISABLED. Structure now, tune after data (the PF-0.92 caveat baked in).
  • Ride free off the object: Belief State (bull/bear/neutral %) · Conviction · Uncertainty · Thesis Journal · Self-Calibration.
  • Surfaced: the Meta Brain · The Mind page (cognition lens) — Belief Map, Conscience (S7), Stream of consciousness. Plus the Meta Brain Trader Journal (the autonomous trader's logbook, calendar-first).
  • DECISION REPLAY (2026-06-12): every Meta commit ships a DECISION record (book v0.5.0) — chosen + the rejected field reconstructed from a 90s candidate buffer + decision stability (did the lead churn NQ→ES→NQ?) + the load-bearing argument + prosecutor override. Graded OFFLINE by decision_grader.py: allocationDelta = chosenR − bestHoldableRivalR (rivals replayed on the 1m bar tape, ⟨counterfactual⟩) — "good pick" vs "left money". This is Allocation Score v2: the per-decision grade vs the actual field. Surfaces in the journal row expand (why this, not that, how steady) + the Lab (DECISION_QUALITY: torn fields = stand-down candidate). Decision Families groups graded decisions by (regime × newsPhase × competition × stability) → the bleeding family is a pattern of mistakes, not a single bad trade. Governed by charter law: decision_metric_governance — no decision metric outranks real net tradable P&L (Goodhart guard).
engine-meta-brain-thesis.js · engine-meta-brain-book.js (gate · DECISION record) · claude/decision_grader.py · docs/meta_mind.html · engine-meta-journal.js
one thesis per ticker · exit-on-death · ≤1 position/family slot · decision replay
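The Decision Replay bullet above defines allocationDelta = chosenR − bestHoldableRivalR. As a worked instance, here is a minimal sketch; the function name, the rival-dictionary shape, and returning `None` when no holdable rival exists are assumptions, not behavior confirmed for `decision_grader.py`.

```python
from typing import Dict, Optional


def allocation_delta(chosen_r: float,
                     holdable_rival_r: Dict[str, float]) -> Optional[float]:
    """chosenR minus the best R among rivals that were actually holdable."""
    if not holdable_rival_r:
        return None  # nothing to compare against, so no grade
    return chosen_r - max(holdable_rival_r.values())


# Hypothetical decision: the chosen candidate made +0.5R, a rejected rival made +1.25R.
example_delta = allocation_delta(0.5, {"ES_short": 1.25, "NQ_long": -0.5})
```

A negative value reads as "left money" (here −0.75R against the best rival) and a positive one as "good pick". Per the charter bullet, this grade stays subordinate to real net tradable P&L.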
The Constitution · Influence Boundaries
§
In plain terms. The wiring rulebook, enforced by the build itself — it defines which layer may influence which, and if any part tries to reach somewhere it shouldn't, the build simply fails. Guardrails with teeth.
  • Boundaries: Lab discovers · Forge proposes · Meta trades · Co-Pilot explains
  • Co-Pilot = autonomous HAND, not judgment: ✓execute owners' blessed verdicts (blast-radius gated) · ✗originate (invent trust · score/rank · author promotion · trade)
  • check_constitution.py scans 38 core files / 4 layers → 0 violations; --self-test proves the fence has teeth
  • Amendment test: an idea may bend a rule freely but never break a pillar (truth · no silent mutation · judgment-tied-to-evidence · operator-owns-money · auditable)
  • Doc + guard are ONE law — amend together
claude/check_constitution.py · docs/constitution.html · /constitution
38 files · 0 violations · build-enforced
★ THE CHARTER · Capital Allocation Under Uncertainty
In plain terms. The one law above all others: the product isn't signals or trades — it's growing capital wisely across two real slots (one position per market, max), and protecting it when there's no real edge to deploy. Everything else is an instrument serving this.
  • One judging question for ALL work: does it improve how limited capital is allocated — or the trustworthiness of allocation measurement? Neither → backlog
  • Objective: net tradable $ per DAILY risk budget · constraints-as-law (not ratios) · NQ+ES same-dir = ONE risk unit (~0.9 correlated) · sizing = a discrete rung ladder
  • Gates G0–G4 with advance AND kill criteria at every rung · G0 now (paper supremacy, span-guarded) · G1 blocked on the operator's broker-connector decision
  • Instruments: regret ledger (flat scores 100 only when the tape agrees) · Allocation Score + selector drift · gate odometer (reads the law) · time-to-pay prior ($/slot-hour) · prosecutors (mute briefs — authority is EARNED) · black-swan sentinel (advisory, never flattens)
  • Lab re-chartered: every bus finding gate-impact stamped (G0 / measurement-trust / backlog) · deployment questions outrank entry edges
assets/data/charter_v1.json · claude/regret_ledger.py · engine-meta-brain-prosecutors.js · engine-shock-sentinel.js · DECISIONS 2026-06-12
G0 · 2 slots · the charter card lives on /copilot
Brain Calibration · self-calibration loop
In plain terms. The loop that turns a noisy pile of brains into a trustworthy roster the Meta Brain can lean on — scan, decide, apply, check the change actually stuck, then audit itself.
  • The cell is the unit (brain × setup × regime × symbol), never the brain
  • Reactive vs anticipated is the biggest lever — but mind provenance: the "+$190k" figure is ⟨MODELED · counterfactual would-have⟩, AND it's book-vs-book (selectivity-inflated). The realized spine number is +$29k (shadow paper-book), not $190k of real money. The HONEST matched-pair ΔR (claude/twin_timing_edge.py, co-fired subset only) is +0.45R/trade on PA and FLIPS by regime (NOA loses overall, wins only in expansion). Don't size off the $190k.
  • EV-ranking, not win-rate (Brier lies on asymmetric payoff) + the cost floor (~0.09R commissions — a thin positive R isn't edge)
  • STABLE gate: never auto-cut INSUFFICIENT_WINDOWS · R1–R6 codified in calibration_policy_v1.json (the autopilot seed)
  • Autonomous-hand safety (designed, gated): verify-applied · dead-man's-switch watchdog (silence = alarm) · Telegram escalation to the operator's phone
calibration_policy_v1.json · confidence_calibration_v1.json · feedback_calibration_doctrine.md
cell-unit · EV-rank · STABLE gate · verify+watchdog
Cross-Cutting Discipline L5
what protects the race — observability, kill-rules, EOD discipline, chip-as-alarm doctrine, propagation hygiene
8 CARDS
Three Timing Frames
23
In plain terms. Three clocks that must agree before a trade commits — the big-picture clock (what kind of day is it), the signal clock (is a setup forming), and the execution clock (commit on bar close). One commit, three layers of timing checked.
  • Big-picture clock · what kind of day is it · how loud · where is value moving
  • Signal clock · is a setup arming inside that bigger picture?
  • Execution clock · commits on bar close, with safeguards against flicker
  • The engine has the execution clock today · the next phase makes all three explicit
Prevents Pulling the trigger without checking what the bigger timeframe is doing.
decision-clock · render-stabilizer
3 clocks · 1 commit
One Source of Truth
24
In plain terms. A rule, not a feature — every screen reads the same official engine number, never a convenient copy. It's what stops two parts of the app from quietly showing different values for the same thing.
  • Every number on every screen reads from the same source the engine itself uses
  • Built into the architecture — not a "be careful" rule that can be forgotten later
  • Four real bugs were caught the one time a redesign tried to shortcut around it
  • A blank panel beats a wrong one
Prevents Screens that look right and feel right, but quietly show the wrong number.
buildLiveEngineExplanation() · pattern
4 bypass bugs caught
Build Phase · Lock Lifted
25
In plain terms. The current operating stance: the engine is being built, so we add aggressively and filter later rather than guarding every change. Data from before this point was declared invalid and isn't trusted.
  • Lock lifted after structurally broken engine was identified — prior data invalidated
  • Build aggressively: new modules, new brains, new whitelist books — all welcome
  • The Forge filters later — cell-significance + Holm + lifecycle do the curation
  • Lock returns only when bridge data returns and operator confirms — not a calendar date
Prevents Premature freezing of an architecture that hasn't found its edge yet.
MEMORY.md · 2026-05-22 lift
BUILD PHASE active
The Kill Rule
26
In plain terms. When the evidence says a feature doesn't work, it gets removed — not "tuned harder." A discipline against endlessly tweaking something the data already says is dead.
  • Every proposed fix carries a written "when do we kill this idea" clause — before it ships
  • Each candidate has an explicit, measurable kill condition
  • If four weeks of fresh data don't support it, it's dropped — not patched or rescued
  • Applies to the AI layer too — same rule, same standard, no favorites
Prevents Keeping features alive out of attachment, after the data has already killed them.
ROADMAP §LATER S1–S12 · noa doctrine
12 kill conditions
EOD Killswitch
27
In plain terms. No brain is allowed to hold a trade past the market close — full stop, anchored to real exchange time and wired into every part that can close a position. Stops overnight risk getting carried by accident.
  • 16:00 ET force-flatten: every open shadow position closes at last-valid R (exit='eod_flat')
  • No opens between 16:00–18:00 ET + all weekends
  • Wired into all 6 resolution owners + Paper Trader — flatten-guard + open-gate per brain
  • Visible topbar chip (🛑 EOD FLAT / 🟢 EOD 16:00 ET)
  • window.EODKill.flattenNow() = manual panic-flatten anytime (30s forced-halt window)
Prevents Post-close tape contamination — phantom resolutions on illiquid after-hours noise.
assets/js/engine-eod-killswitch.js
16:00 ET · 6 owners + Paper
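The open-gate in the card above (no opens from 16:00 to 18:00 ET, none on weekends) depends on anchoring to exchange time rather than the machine clock. A minimal sketch using the standard-library `zoneinfo` follows. The function name is hypothetical, the real guard lives in `engine-eod-killswitch.js`, and treating 18:00 itself as open again is an assumption about the boundary.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")
FLATTEN_AT = time(16, 0)   # from the card: 16:00 ET force-flatten
REOPEN_AT = time(18, 0)    # from the card: no opens until 18:00 ET


def opens_allowed(now_utc: datetime) -> bool:
    """True if a new shadow position may open at this instant."""
    now_et = now_utc.astimezone(ET)  # DST handled by the tz database
    if now_et.weekday() >= 5:        # Saturday = 5, Sunday = 6
        return False
    return not (FLATTEN_AT <= now_et.time() < REOPEN_AT)


# 2026-06-10 is a Wednesday in daylight time (UTC-4): 20:00 UTC is 16:00 ET.
summer_close = datetime(2026, 6, 10, 20, 0, tzinfo=timezone.utc)
# 2026-01-14 is a Wednesday in standard time (UTC-5): 21:00 UTC is 16:00 ET.
winter_close = datetime(2026, 1, 14, 21, 0, tzinfo=timezone.utc)
```

The two example instants differ by an hour in UTC but both land on 16:00 ET, which is why the check converts to the exchange zone first instead of hard-coding a UTC offset.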
Quarantine Registry
27a
In plain terms. One official list of which setups are retired — replacing three separate lists that used to disagree and caused a real bug. Now there's a single source for "is this thing benched."
  • Merged read across LES + PA — one call, normalized cell shape (book / sym / setupId|regime / source / reason / ts)
  • Metadata layer persisted to localStorage.nqelite.quarantine_registry.metadata.v1
  • Subscribe events on add / lift — Analyst + Actions + UI all see the same state
  • V1 additive — LES + PA still own their lists; registry normalizes the seam
Prevents 5-way quarantine-state divergence — the audit's R3 critical.
assets/js/engine-quarantine-registry.js · window.QuarantineRegistry
A4 shipped 2026-06-09 · audit R3 closed
Decision Event Bus
27b
In plain terms. Lets new listeners plug into the pipeline without editing the money-path file — the pipeline announces once, and anything that wants to listen subscribes itself. Keeps the dangerous core file untouched as the system grows.
  • publish('decision_tail', {ctx, decs, sym, d}) — one call from pipeline.js per tick
  • LES + MetaBrain self-register via subscribe('decision_tail', handler) at boot
  • CanonicalEngineState stays direct — pipeline reads its return value into ctx
  • Subscriber contract: read-only on payload (Meta Brain's autonomy goes through formal select → MetaDecision → OMS, not in-pipeline mutation)
Prevents Pipeline.js becoming the implicit observer registry. The audit's R11.
assets/js/engine-decision-event-bus.js · window.DecisionEventBus
D2 V1+V2 shipped 2026-06-09 · adding observer = one-line subscribe
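The publish/subscribe contract in the card above can be illustrated with a language-neutral sketch. The real bus is JavaScript (`window.DecisionEventBus`); this Python version is only an analogy, and the class internals, the `errors` list, and the shallow read-only view are assumptions about one way to honor the "read-only on payload" rule.

```python
from collections import defaultdict
from types import MappingProxyType
from typing import Any, Callable, DefaultDict, List, Mapping, Tuple

Handler = Callable[[Mapping[str, Any]], None]


class DecisionEventBus:
    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.errors: List[Tuple[str, Exception]] = []  # failures are recorded, not hidden

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Observers register themselves; the publisher never lists them."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Mapping[str, Any]) -> int:
        """Deliver a read-only view to every subscriber; return successful deliveries."""
        view = MappingProxyType(dict(payload))  # top-level keys cannot be reassigned
        delivered = 0
        for handler in list(self._handlers[topic]):
            try:
                handler(view)
                delivered += 1
            except Exception as exc:  # one bad observer must not break the publisher
                self.errors.append((topic, exc))
        return delivered
```

Adding an observer is then a single `bus.subscribe("decision_tail", handler)` call at boot, and a handler that tries to write to the payload raises instead of silently mutating pipeline state. The view is shallow, so nested objects would still need their own protection.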
Cross-Symbol Audit
28
In plain terms. Touch one instrument, check the other — no fix ships for NQ without verifying it didn't break ES. A guard against fixing one symbol while quietly breaking its twin.
  • Every NQ-side change triggers an ES-side parallel scan — same logic, different parameters
  • Caught real bugs: INV-2 Pass-1 shipped NQ-only, Pass-2 found the ES mirror was broken
  • Extends to 4-surface propagation: engine change → NOA Guide + Roadmap (md+html) + Atlas (EN+HE) in the same response
  • Cross-module trace before any edit: what does this touch downstream? The system is interconnected — no edit lives alone
Prevents Fixing one instrument while silently breaking the parallel — the most expensive class of bug that passes all single-symbol tests.
INV-2 · guardian gate · CLAUDE.md propagation rule
NQ↔ES · 4-surface propagation
Silent-Failure Doctrine (chip-as-alarm)
29
In plain terms. The rule that a status which can never turn red is lying. The enemy is the "ambiguous zero" — a 0 or "idle" that looks identical whether it's correct or the thing is actually dead. Every health light must be able to scream.
  • Born 2026-06-06 (SCHL/SAVT live but policy fetch blocked 2 days); upgraded 2026-06-10 into a standing review with 5 alarms shipped + 5 refinements
  • Smoke checks (`engine-smoke-test.js`): book_reader_coverage (FAIL — a book firing on disk but unread) · doc_claim_drift (WARN — a "by design/healthy" doc claim masking a dead brain) · disk_source_fresh (WARN — a 0 that's source-unreachable, not empty)
  • Chip states (`engine-noa-desk.js`): red .policy-down ⛔ · amber .strangled ⚠ (brain blocked, not idle) · amber .gate-locked 🔒 (engine gated, not idle)
  • Truth source: `engine-brain-truth.js` classifies brains FIRING/IDLE/STRANGLED/DOWN
  • 5 refinements: name the ambiguous zero · prove BOTH directions on the real trigger · false alarms are silent failures too · watch the watcher · encode FAIL/WARN/passive severity
  • CPU budget guard (2026-06-11) (`engine-cpu-sentinel.js`): the same invariant applied to CPU — every continuous timer has a per-fire ceiling (default 150ms); a breach turns the CPU chip RED and NAMES the offender (⛔ sustained / ⚡ spike) the moment it lands, so a heavy new timer is caught in ONE session, not after days of silent drift (+32,000ms). Levers: cache the expensive per-tick read > run-O(n)-every-Nth > cap growth > throttle. Doctrine + "fix CPU" procedure: `feedback_cpu_budget_doctrine.md`.
  • Order-flow capture chip (2026-06-13) (`engine-stability-observer.js` → #tbOFCapChip in the ALERTS row): the doctrine applied to the Event-Fragility-v2 capture — the 60s observer now logs continuous pre-event order flow (deltaPct/cvdSlope/DOM/absorption) so the quiet coiled pre-event tape is no longer a capture blind-spot. The chip shows the captured value live, color-coded by delta, or an honest OF · idle when a sample had no feed data — never a blank that hides a dead capture (it IS the captured row, one source of truth).
Prevents The system looking alive (or "fine by design") when it's dead — across chips, counts, AND doc claims.
engine-smoke-test.js · engine-brain-truth.js · engine-noa-desk.js · feedback_silent_failure_pattern.md · silent_failure_ledger.md
5 alarms · standing review · maintain mode
Run-Mode Vocabulary Lock
30
In plain terms. Three words, locked: Cold, Hot, Automated — each meaning one specific thing, no synonyms, no improvising. Confusing "which mode are we in" is one of the most expensive mistakes in this system.
  • 🧊 Cold = CALIBRATION mode · shadow capture, tool not default
  • 🔥 Hot · NOW = HOT mode · real signals, manual trader, Paper Trader on screen
  • 🤖 Automated = LIVE mode · real broker, real money, frozen graduated policy
  • Every "is it firing?" debug starts with: verify the live mode via EVDecision.phase.mode
  • Phase4Resume auto-CALIBRATION was disabled 2026-06-06 — HOT is the standing posture, code-enforced
Prevents Silent mode drift — the engine running in CALIBRATION while the operator believes HOT.
engine-ev-decision.js · feedback_run_modes_vocabulary.md
3 modes · locked
NOA Operator + Execution Ramp L6
how it ships — NOA is the Phase 4 north-star operator. The execution ramp + Phase 1 Trust Calibration gate get the graduated brain into a real broker without losing capital to its own scaffolding.
10 CARDS
Two halves, one trajectory. Today NOA is a racer in L3 (the brain) AND the operator-in-training here in L6 (the face + hands). Phase 4 north-star: NOA the racer wins Phase 1, gets snapshotted into Phase 2, qualified through Phase 3's execution ramp, and arrives in Phase 4 as a meta-allocator routing across all graduated brains. The execution ramp at execution/ is the deliberate, broker-safe path from "real signal" to "real fill on a real account." Already shipped: OMS Phase 0-Sim (69 tests green, 5 schemas, hash-chain ledger) · Meta Brain paper book (5 phases inside the dashboard, mirrors the Synthesizer's 6 routes) · Phase 1 Trust Calibration pipeline (measures claimed-R vs realized-R; pipeline real, data synthetic until a broker connects). Blocked on: operator picks broker connector (Quantower Trading Simulator first, AMP Rithmic demo for haircut measurement).
NOA Doctrine · 10 Commandments
N1
In plain terms. NOA is plumbing, not a character — and ten hard rules, enforced in code, keep her that way: stay silent unless there's something real to say, never make things up, never offer false comfort. The voice serves the edge, not the other way around.
  • 10 commandments: silence-first · no comfort · no tick narration · no fake certainty · no block execution · no punishment · no therapist · no hallucinated confidence · no omniscience · no social companion
  • Doctrine outranks feature requests, council enthusiasm, ship-faster pressure
  • Voice: 3 channels (alerts · companion · copilot) · 5–9 word templates · 85 lifecycle clips · text-only companion (voice rollout deferred)
  • Read-only consumer of engine state — zero engine-logic touch, ever
noa_should_not_doctrine.md · engine-noa-cognitive.js · engine-noa-voice.js
10 commandments · 3 channels
Thesis + Trade Companion
N2
In plain terms. When a trade fires it freezes the reason you took it, then checks reality against that reason on every tick — surfacing the moment something material changes and staying quiet for ordinary noise. It's what keeps you honest about why you're still in.
  • 5 lifecycle states: WATCH → PERMISSION → IN_TRADE → EXIT → DEBRIEF
  • Thesis: frozen at PERMISSION with mustRemainTrue[] + cancelIf[] + captured regime
  • Integrity scoring: SOLID_GREEN → PULSING_AMBER → BROKEN_AMBER → RED_FRACTURED → GRAY
  • Trade Companion 5 wires: weakening · strength · BE advisory · pressure · chart badge
  • 3 surfaces (chart overlay · left rail · topbar pill) all read the same signal object
engine-noa-thesis.js · engine-noa-trade-companion.js
5 states · 5 wires · 3 surfaces
★ Phase 4 · Portfolio Operator
N3
In plain terms. The end state — once at least two brains have proven themselves, NOA runs them as a portfolio and the engine, not you, is the one trading. The whole ladder points here.
  • Meta-allocator: regime-routing table — Phase 1's per-regime data populates it directly
  • Correlation-aware sizing: two brains firing same direction on same regime cell don't double-up
  • Drawdown budget partitioned across brains — brain-level breakers feed portfolio-level
  • Operator's role collapses to: risk-limit setter + monthly reviewer · no intraday touch
  • Stage 6 of the MI evolution ladder — designed, awaits graduating brains
routeBook(regime) → {book, size, exit} · MI v0.2 Stage 6
North-star · ≥2 brains graduated
Cerberus · Offline-Trained Frozen Policy
N4
In plain terms. A different kind of brain — it learns hard offline in Python, and only a frozen, tested snapshot ever ships to the live browser. Because it never learns live, it can't quietly drift away from what was validated.
  • Architecture: offline-trained / browser-deployed / FROZEN-policy — autonomous-execution doctrine
  • Two brains, two paradigms: SCHL (Scholar — readable, auditable, "can edge be NAMED?") + SAVT (Savant — self-supervised, black-box, "can edge be FELT?")
  • 3 heads each: HUNTER (CQL-conservative off-policy value) · SKEPTIC (adversarial meta-label) · SENTINEL (BOCPD regime-death anticipation, data-starved Phase 2)
  • Decision gate = conviction × (1−refuteProb) × regimeConfidence
  • Boot relaxation 2026-06-08: minExpectedR: 0.10 → 0.05. DR-OPE drMeanR is +0.087R ⟨modeled · off-policy estimate⟩ — original gate was structurally never reachable. _decisionOverrides layer keeps frozen-policy doctrine intact while runtime tunables win.
  • Data-gated: need ≥6 RTH session-days · have 2 · abstaining by design until retrain
engine-noa-solo.js · scholar_policy_live.json · savant_policy_live.json · getReasonCounters()
offline → frozen → deployed · relaxed
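The decision-gate product listed above can be written out as a short Python sketch. It assumes all three inputs are normalized to [0, 1], which the atlas does not state, and it does not model how the product is compared against minExpectedR, since that wiring is not described here.

```python
def decision_gate(conviction: float, refute_prob: float,
                  regime_confidence: float) -> float:
    """conviction x (1 - refuteProb) x regimeConfidence.

    Assumption: every input lies in [0, 1]; anything else is rejected.
    """
    for name, value in (("conviction", conviction),
                        ("refute_prob", refute_prob),
                        ("regime_confidence", regime_confidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return conviction * (1.0 - refute_prob) * regime_confidence
```

Because the terms multiply, any single head near zero (a confident SKEPTIC refutation, or low regime confidence) pulls the whole score toward zero.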
Execution Ramp · 0-Sim → D3
N5
In plain terms. The careful eight-rung path from "software test harness" to "real autonomous trading at size" — each rung must be earned, with the brain's rules frozen and promotion automatic only when the bar is cleared.
  • 0-Sim · OMS harness against Quantower built-in Simulator · ✅ 69 tests green, contracts shipped
  • 0-Broker · CQG/Rithmic demo · native bracket survival across power-loss · ◻ blocked on connector selection
  • A · demo assisted — proposes entry, operator confirms · ◻
  • B → C · auto-entry under caps → full demo auto · ◻
  • D1 → D2 → D3 · live supervised micro → limited autonomy → scaled by drawdown budget · ◻
HANDOFF_autonomous_execution_ramp_2026-05-31.md §11–§14
8 phases · north-star = D3
OMS Phase 0-Sim · Shipped
N6
In plain terms. The order-handling skeleton everything live will be built on — one writer, save-before-you-send, and a tamper-evident ledger. Boring on purpose: this is the part that must never lose or duplicate an order.
  • Contracts: 5 JSON Schemas (intent · command · event · operator · common) + ledger.sql (WAL + hash-chain) + 12 invariants + DISARMED-on-restart
  • OMS: clock · ledger · contracts · risk · state machine · kill policy (3 severities) · reconcile · reconnect
  • 69 tests green: contracts, ledger durability/tamper, state machine, risk, reconcile, end-to-end drills
  • C# adapter skeleton: NQEliteExecutionBridge.cs with // VERIFY markers · ◻ not compiled yet
execution/oms/ · execution/contracts/ · execution/adapter/
69 tests · 5 schemas · WAL hash-chain
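To show why a hash-chained ledger is tamper-evident, here is a minimal in-memory Python sketch. It is not the contents of ledger.sql or the OMS hashing routine: the SHA-256 choice, the JSON canonicalization, and the function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first row


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous row's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append-only write: each row commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev,
                  "hash": event_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered row breaks the chain."""
    prev = GENESIS
    for row in chain:
        if row["prev"] != prev or row["hash"] != event_hash(prev, row["payload"]):
            return False
        prev = row["hash"]
    return True


ledger = []
append(ledger, {"type": "INTENT", "intent_id": "i-1"})
append(ledger, {"type": "COMMAND", "intent_id": "i-1"})
ok_before = verify(ledger)
ledger[0]["payload"]["intent_id"] = "i-2"  # simulate tampering with row 0
ok_after = verify(ledger)
```

Editing any earlier row changes its recomputed hash, so verification fails from that row onward. Note that a bare chain cannot detect rows dropped from the tail unless the latest hash is also stored elsewhere.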
Live SL/TP Hard Rule
N7
In plain terms. For real money, the stop-loss and target must live at the broker, not in the browser — because a browser stop disappears the moment you lose connection. Non-negotiable.
  • 🔺 Quantower "Local" SL/TP are client-side and vanish on disconnect/power-loss → forbidden for live
  • 🔺 Quantower built-in Simulator has no broker → can prove software, never broker-side protection
  • A SIMULATOR_LOCAL stop can never pass the live-eligibility gate (fail-closed, machine-enforced)
  • Real CQG/Rithmic-routed broker demo required before any live execution
  • Kill hierarchy: PAUSE → CONTROLLED_FLAT → EMERGENCY_FLAT · broker-kill non-overridable
execution/contracts/states.json · live-eligibility gate
broker-side mandatory
★ Meta Brain · the autonomous mastermind
N8
In plain terms. Not one of the racers — the boss above them. It reads every brain's vote, trusts each by track record, cancels out overlap, holds back when uncertain or when the real-world haircut is too steep, allocates the capital, and even checks whether it's beating the best single brain. Paper-only — it never touches the live money-path.
  • S1 · Haircut-aware EV · engine-meta-brain-selector.js — scores each route on expected R AFTER slippage/fees, scaled to avgWin/avgLoss (not a flat 1R penalty)
  • S2 · Intervention-bias wall (Gap 10) · engine-cf-snapshotter.js — freezes each candidate brain's state at decision time → CF_SNAPSHOT; brain trust learns from its OWN outcome, never from Meta's intervention
  • S3 · Provenance-keyed trust · claude/trust_reducer.py — Trust EMA keyed by (brain × sym × regime × session × provenance × surface); paper trust ≠ demo ≠ live
  • S4 · Candidate ingest · engine-meta-brain-ingest.js — polls the synthesizer-routed brains' fire streams; Meta mirrors the ROUTED brain per (sym × regime), not a rare engine SIGNAL. ingestCandidate() books via selector + haircut + sizing + MFE
  • Regime Commander · when the policy's primary brain is silent/dormant, the next crowd-aligned brain COVERS as secondary — tagged + sized-down, measurable apart
  • S5 · Uncertainty gate · engine-meta-brain-consensus.js — cross-brain consensus; STAND_DOWN when ≥4 voters split (minority_share > 0.40). Trust-weighted votes
  • S6 · Capital Router · engine-meta-brain-allocator.js — every 10s/symbol: ONE decision over all voters, correlation-collapsed (Gap 2/5 — the Oracle-flagged cluster counts once, not N×), quantized to integer contracts round-down, emits an allocation_vector
  • S7 · Holy Grail self-audit · engine-meta-brain-self-audit.js — "is Meta beating the BEST single brain?" If not → selfConfidence < 1 → the allocator sizes Meta DOWN. PROVING / BEATING / LAGGING / FAILING
engine-meta-brain-{book,scoring,selector,copilot,ingest,consensus,allocator,self-audit,state-push}.js · engine-cf-snapshotter.js · MetaBrain / MetaBrainAllocator / MetaBrainSelfAudit globals
S1–S7 live · correlation-collapsed · self-policing · SHADOW
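The S5 stand-down rule and the S6 round-down quantization can be sketched as follows. This is a hedged Python illustration, not engine-meta-brain-consensus.js: it assumes minority_share is computed on trust-weighted votes, that the voter count is the raw number of votes, and that an exact tie also stands down.

```python
import math
from typing import List, Tuple

MIN_VOTERS = 4
MAX_MINORITY_SHARE = 0.40


def consensus(votes: List[Tuple[int, float]]) -> str:
    """votes: (direction, trust) pairs, direction +1 long / -1 short."""
    long_w = sum(trust for direction, trust in votes if direction > 0)
    short_w = sum(trust for direction, trust in votes if direction < 0)
    total = long_w + short_w
    if total <= 0 or long_w == short_w:
        return "STAND_DOWN"  # assumption: no weight or an exact tie stands down
    minority_share = min(long_w, short_w) / total
    if len(votes) >= MIN_VOTERS and minority_share > MAX_MINORITY_SHARE:
        return "STAND_DOWN"
    return "LONG" if long_w > short_w else "SHORT"


def quantize_contracts(raw_size: float) -> int:
    """Integer contracts, rounded down, never negative."""
    return max(0, math.floor(raw_size))
```

Rounding down means a fractional allocation such as 2.9 contracts trades as 2, so quantization can only shrink exposure.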
Plane-B learning ledger
N8b
In plain terms. The tamper-evident record under the Meta Brain — kept in a separate database from the live-order ledger on purpose, so learning can never contaminate the money-path and the Meta's meddling can never muddy which brain deserves the credit.
  • execution/contracts/allocation_vector_v1.schema.json — FROZEN. ONE decision = ONE intent. Closes Gaps 1·2·4·6·10·11·12·13·14
  • execution/contracts/plane_b_event_v1.schema.json — 5 event types: META_DECISION · CF_SNAPSHOT · CF_OUTCOME · TRUST_UPDATE · ALLOCATOR_VERSION
  • claude/plane_b_ledger.py — hash-chained append-only (reuses OMS canonical_event_hash); writer-id authority enforced at the append boundary; 13 tests green (tamper · truncation · auth · replay)
  • Disk server: POST /push_plane_b_event (per-request SQLite connections — thread-safe on ThreadingHTTPServer) · GET /plane_b_status · GET /meta_brain_state
  • Cross-origin: dashboard mirrors Meta state to disk every 30s (engine-meta-brain-state-push.js); standalone page reads it — one writer, many readers, no IDB-race dupe class
execution/contracts/{allocation_vector,plane_b_event}_v1.schema.json · claude/plane_b_ledger.py · research_data/plane_b/plane_b_ledger.db
hash-chained · Plane-A/B separated · audited
Plane A vs Plane B
N9
In plain terms. Two different jobs kept separate: Plane A asks "where should we trade next?" and Plane B asks "how do we sharpen the brains we already have?" Mixing the two questions is how systems fool themselves.
  • Plane A · docs/meta_brain.html · violet brand accent · the rich 6-section view: status hero + route cards + recent fires strip + Meta Brain findings feed + mode-switch self-score + voice log
  • Plane B · docs/copilot.html · pure Co-Pilot — sharpen the books currently on the tape, no Meta Brain inline (single-line pointer above footbar)
  • Co-Pilot page: ⬢ Meta Brain nav link added · Meta Brain inline panel REMOVED → Plane-B is now leak/edge-only
  • Signals-page chip strip: MB chip (violet accent, trust-state badge ·U/·W/·T/·S/·F once ≥20 sealed) · Meta Brain is the Journal's 19th first-class book
  • Operator decision 2026-06-08: don't let Meta Brain's policy noise contaminate the active sharpening surface
docs/meta_brain.html · docs/copilot.html · engine-noa-desk.js · /meta_brain route
A = next · B = now
★ Phase 1 · Trust Calibration
N10
In plain terms. The reality check before any brain graduates — it compares what a brain claimed it made against what it would really have made (and claimed fills vs real fills). That gap, the "haircut," has to be applied before anyone trusts a brain's numbers. Waiting on a real broker to fill in real figures.
  • Analyzer · claude/trust_calibration.py — pairs engine fires with broker fills by intent_id, computes R_shortfall = actual_R − claimed_R (universal across winners + losers), session-clustered bootstrap CI by (sym × regime × book)
  • Spec · execution/contracts/broker_fills_v1.schema.json — FROZEN. Two record types (fill / exit), additionalProperties: false
  • JS lookup · engine-haircut.js · window.Haircut.discount(claimedR, {sym, regime, book}) → expected_R + factor + CI + is_synthetic flag · resolution priority (sym × regime) → (sym × book) → (sym) → −0.30R default
  • Dashboard · docs/trust_calibration.html at /trust_calibration — headline tiles, per-regime + per-book breakdowns, [SYNTHETIC] banner
  • Routing config · execution/config/route_to_demo_v1.json — default NOA-only, both syms, RTH, max 2 concurrent / 10 daily / $1500 risk, kill on 4 streak or $600 daily loss
  • Promotion gate: Synthesizer verdict + haircut.is_synthetic == false + n ≥ 50 per sym + shortfall_CI_upper < 0 · all four must clear before composed policy leaves shadow
  • Pipeline real, data synthetic. Synth NOA NQ: −0.022R/trade · ES: −0.116R/trade — sanity-check only. Real numbers blocked on operator broker pick.
trust_calibration.py · broker_fills_v1.schema.json · simulate_broker_fills.py · engine-haircut.js · docs/trust_calibration.html · phase1_trust_calibration_handoff.md
measurement layer · synthetic · awaiting broker
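The shortfall definition and the haircut resolution priority can be sketched in Python. The table layout, key shapes, and function names below are hypothetical, and applying the haircut additively to the claimed R is an assumption; only the priority order and the −0.30R default come from the card above.

```python
from typing import Dict, Optional, Tuple

DEFAULT_HAIRCUT_R = -0.30  # documented fallback when no cell matches


def r_shortfall(actual_r: float, claimed_r: float) -> float:
    """R_shortfall = actual_R - claimed_R (negative when fills were worse)."""
    return actual_r - claimed_r


def resolve_haircut(table: Dict[Tuple[str, ...], float], sym: str,
                    regime: Optional[str] = None,
                    book: Optional[str] = None) -> Tuple[float, str]:
    """Walk (sym x regime) -> (sym x book) -> (sym) -> default."""
    candidates = []
    if regime is not None:
        candidates.append((("regime", sym, regime), "sym x regime"))
    if book is not None:
        candidates.append((("book", sym, book), "sym x book"))
    candidates.append((("sym", sym), "sym"))
    for key, label in candidates:
        if key in table:
            return table[key], label
    return DEFAULT_HAIRCUT_R, "default"


def discount(claimed_r: float, haircut_r: float) -> float:
    """Assumption: expected R is the claimed R plus the (negative) haircut."""
    return claimed_r + haircut_r
```

For example, with a hypothetical table `{("regime", "NQ", "TREND"): -0.05}`, a lookup for NQ in TREND resolves at the first level, while any symbol absent from the table falls through to the −0.30R default.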
Parameter Reference REF
every threshold, gate, setup weight, and order-flow rule — straight from the engine code
4 TABS · 40+ CARDS
Thresholds · 6
Gates & Pipeline · 16
Setups & Confluence · 10
Order Flow · 7
Order Flow Threshold
Delta Classification
6% · 350 · 120
P1
How the engine separates real directional aggression from market-maker noise.
Measure Rule Outcome Calibration Sources
Core Decision
Institutional delta requires meaningful size plus confirmed directional travel.
Magnitude: |deltaPct| ≥ 6% OR |delta| ≥ absFloor
Efficiency: directional displacement / total path ≥ 45%
Pass Interpretation Delta can support initiative flow, bias confidence, and downstream order-flow confirmation.
Fail Interpretation Delta is treated as noise, absorption risk, or inventory rebalancing unless other evidence overrides it.
Measures

Every bar the engine receives carries a delta: the net difference between aggressive buying volume, where market orders hit the ask, and aggressive selling volume, where market orders hit the bid. Raw delta is meaningless without context: 200 contracts on a 3,000-volume bar is a 6.7% skew that only just clears the 6% gate; the same 200 on a 1,200-volume bar is a 16.7% directional skew, which suggests someone is acting with intent.

Gate 1: Magnitude Rule

The engine runs a dual-gate test. First, percentage: is this bar's delta at least ±6% of its total volume? Second, absolute floor: is the raw delta at least 350 contracts for NQ or 120 contracts for ES? Either gate passing qualifies the bar as meaningful. The absolute floor exists because during thin pre-market or lunch bars, even a 10% skew might represent only 30 contracts, which is statistically irrelevant for a futures instrument that trades millions daily.

Gate 2: Efficiency Rule

On top of delta magnitude, the engine evaluates bar efficiency: did price actually travel in the delta's direction? Efficiency equals directional price displacement divided by total path traveled. A bar with at least 45% efficiency moved purposefully in one direction. Below 45%, the bar oscillated: the delta might be real, but the price action is indecisive, and the move is more likely market-maker rebalancing than institutional commitment.
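The two gates can be combined in a short Python sketch. The thresholds come from this card; representing a bar's path as a list of intrabar prices and the function names are assumptions, and the minimum-price-move check from the parameter table is left out for brevity.

```python
from typing import Sequence

PCT_THRESHOLD = 0.06                 # |delta| / volume
ABS_FLOOR = {"NQ": 350, "ES": 120}   # contracts
EFFICIENCY_MIN = 0.45


def magnitude_ok(delta: float, volume: float, symbol: str) -> bool:
    """Gate 1: either the percentage test or the absolute floor passes."""
    if volume <= 0:
        return False
    return abs(delta) / volume >= PCT_THRESHOLD or abs(delta) >= ABS_FLOOR[symbol]


def efficiency(prices: Sequence[float], delta: float) -> float:
    """Gate 2 input: displacement in the delta's direction / total path."""
    path = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    if path == 0 or delta == 0:
        return 0.0
    sign = 1 if delta > 0 else -1
    displacement = max(0.0, sign * (prices[-1] - prices[0]))
    return displacement / path


def is_institutional_delta(delta: float, volume: float, symbol: str,
                           prices: Sequence[float]) -> bool:
    return magnitude_ok(delta, volume, symbol) and \
        efficiency(prices, delta) >= EFFICIENCY_MIN
```

A bar that travels 100 → 102 → 101 → 104 on positive delta covers a 6-point path for 4 points of displacement, an efficiency of about 67%. The same path on negative delta scores 0 because price moved against the flow.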

Parameter Summary
Parameter | Exact Rule | Role In The Engine
Magnitude Filters
Relative Threshold | ±6% of bar volume | Filters weak directional skew.
Instrument Floors
NQ Absolute Floor | 350 contracts | Minimum raw delta for NQ depth.
ES Absolute Floor | 120 contracts | Minimum raw delta for ES depth.
Combined Gate | |Δ%| ≥ 6% OR |Δ| ≥ absFloor | Marks delta magnitude as meaningful.
Efficiency Confirmation
Bar Efficiency Gate | ≥45% displacement ÷ path | Requires price to confirm the flow.
Min Price Move (NQ) | 16 ticks = 4 pts | Avoids microscopic NQ drift.
Min Price Move (ES) | 10 ticks = 2.5 pts | Avoids microscopic ES drift.
Calibration Rationale
  • The 6% threshold was derived empirically: below it, correlation between delta sign and next-bar direction drops below statistical noise.
  • The roughly 3:1 ratio (350/120 ≈ 2.9) between the NQ and ES absolute floors matches the typical volume ratio between the two instruments.
  • The 45% efficiency threshold aligns with microstructure research showing that bars below roughly 40-50% efficiency are dominated by market-maker inventory rebalancing, not directional intent.
Volume Classification
Volume & Market Tempo
4 tiers · self-calibrating
P2
Two separate systems: volatility-rank buckets for regime, and trade-count tempo for institutional participation.
Rank Classify Tempo Offset Score
Core Decision
Volume regime and bar-level tempo are independent systems that jointly adjust confluence threshold and bias confidence.
Regime: volatility rank percentile → 4 tiers (LOW / NORMAL / HIGH / EXTREME)
Tempo: bar trades vs baseline · WEAK <0.6× · STRONG ≥1.5×
When It Works HIGH/EXTREME regime lowers confluence threshold; STRONG tempo confirms institutional participation behind setups.
When It Fails LOW regime raises threshold +4; WEAK tempo flags retail-only environment — setups lack institutional backing.
Volume Regime Classification

The engine classifies market activity through two independent lenses. Volume regime uses a volatility rank (0-100 percentile) provided by the bridge from the instrument's recent history. This isn't a fixed number — "high volume" on a quiet August day means something different than "high volume" on an FOMC day. If the bridge can't provide a rank, the engine falls back to today's intraday range: NQ range >220 pts = EXTREME, <70 pts = LOW.

Market Order Tempo

Market order tempo is a separate, bar-level check. It counts trades and contracts in each bar and compares them to fixed baselines (NQ: 900 trades / 12,000 contracts; ES: 600 / 8,000). Below 60% of baseline = WEAK (retail noise, no institutional footprint). At or above 150% = STRONG (institutions are participating). This distinction matters because a "high confluence" setup in a WEAK tempo environment is suspicious: who's going to move the market in your favor?

Score Adjustment

Both systems feed into score adjustments: volume regime shifts the confluence threshold (LOW: +4 harder, EXTREME: -5 easier), while tempo classification informs Gate 2.5's bias confidence and several order-flow conditions.
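Both classifiers reduce to threshold lookups, sketched here in Python with the tier boundaries, baselines, and offsets from this card. The name of the middle tempo tier ("NORMAL") and the choice to test STRONG before WEAK when trades and contracts disagree are assumptions.

```python
REGIME_OFFSET = {"LOW": 4, "NORMAL": 0, "HIGH": -3, "EXTREME": -5}
TEMPO_BASELINE = {"NQ": (900, 12_000), "ES": (600, 8_000)}  # (trades, contracts)


def volume_regime(rank: float) -> str:
    """Map a 0-100 volatility rank percentile onto the four tiers."""
    if rank >= 85:
        return "EXTREME"
    if rank >= 70:
        return "HIGH"
    if rank >= 35:
        return "NORMAL"
    return "LOW"


def tempo(trades: int, contracts: int, symbol: str) -> str:
    """Compare a bar's trades/contracts to the fixed per-symbol baseline."""
    base_trades, base_contracts = TEMPO_BASELINE[symbol]
    if trades >= 1.5 * base_trades or contracts >= 1.5 * base_contracts:
        return "STRONG"
    if trades < 0.6 * base_trades or contracts < 0.6 * base_contracts:
        return "WEAK"
    return "NORMAL"  # assumed label for the unnamed middle band


def adjusted_threshold(base_threshold: float, rank: float) -> float:
    """Shift the confluence threshold by the regime's score offset."""
    return base_threshold + REGIME_OFFSET[volume_regime(rank)]
```

With a hypothetical base threshold of 70, a LOW-regime session would require 74 and an EXTREME one 65.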

Volume Regime Tiers
Volume Regime | Volatility Rank | Fallback (NQ range)
EXTREME | ≥85th percentile | >220 pts
HIGH | ≥70th | >140 pts
NORMAL | ≥35th | 70–140 pts
LOW | <35th | <70 pts
Tempo Baselines
Tempo Gate | NQ | ES
Baseline | 900 trades / 12,000 vol | 600 trades / 8,000 vol
WEAK (<0.6×) | <540t or <7,200v | <360t or <4,800v
STRONG (≥1.5×) | ≥1,350t or ≥18,000v | ≥900t or ≥12,000v
Volume Score Offsets
Volume → Score Offset | Effect
LOW | +4 (raise threshold, harder to fire)
NORMAL | 0 (baseline)
HIGH | -3 (lower threshold, easier)
EXTREME | -5 (much easier, conviction in loud markets)
Calibration Rationale
  • Self-calibrating percentile bands prevent the system from mislabeling a quiet Tuesday as "low volume" when it's actually normal for that contract's seasonal pattern.
  • The tempo baseline (900/600 trades) was calibrated from CME Group session data for NQ and ES regular trading hours.
  • The 0.6×/1.5× multipliers for WEAK/STRONG are derived from the point where institutional participation visibility inflects in order-book data.
Liquidity Detection
Sweep & Trap Detection
7-bar window · 3 confirm paths
P3
The engine's institutional-behavior detector: where stops cluster, how sweeps are identified, and when a sweep becomes a trap.
Target Sweep Window Confirm Score
Core Decision
A sweep past a liquidity target must either be accepted as a breakout or confirmed as a trap within a 7-bar window.
Sweep: price past target by ≥ break distance (NQ: 4t / ES: 3t) within 24 bars
Trap: confirmed trap scores 82 confidence, the highest-conviction signal
When It Works Confirmed trap at 82 confidence — institutional stop-hunt detected, highest-conviction reversal signal.
When It Fails Sweep accepted (holds past fail distance for full window) — real breakout, not a trap.
Liquidity Targets

The engine maintains a ranked list of liquidity targets — session extremes, prior-day highs/lows, VPOC, VWAP, and equal highs/lows — scored by type, proximity, and session context. Each target represents a probable stop-cluster location: retail traders place stops outside these levels, creating pools of resting orders that institutional players can exploit.

Sweep Detection & Acceptance

A sweep is detected when price pushes past a target by at least the break distance (NQ: 4 ticks / 1 pt, ES: 3 ticks / 0.75 pt) within the last 24 bars. The engine then watches a 7-bar window: if price returns inside the level, it's a potential trap. If price holds beyond the fail distance (NQ: 6 ticks, ES: 4 ticks) for the full window, the sweep is accepted — real breakout, not a trap.
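The sweep test and the 7-bar acceptance window can be sketched in Python. The break and fail distances come from this card, and both instruments tick in 0.25-point increments. Using bar closes for the window check, the `direction` convention, and the "PENDING" label are assumptions.

```python
from typing import Sequence

TICK = 0.25                      # NQ and ES tick size in points
BREAK_TICKS = {"NQ": 4, "ES": 3}
FAIL_TICKS = {"NQ": 6, "ES": 4}
WINDOW_BARS = 7


def is_sweep(extreme: float, level: float, direction: int, symbol: str) -> bool:
    """direction = +1 for a sweep above a high, -1 for a sweep below a low."""
    return direction * (extreme - level) >= BREAK_TICKS[symbol] * TICK


def classify_window(closes: Sequence[float], level: float,
                    direction: int, symbol: str) -> str:
    """Judge the bars that follow a sweep (assumed: by their closes)."""
    window = list(closes)[:WINDOW_BARS]
    # Any close back inside the level makes the sweep a trap candidate.
    if any(direction * (c - level) < 0 for c in window):
        return "POTENTIAL_TRAP"
    fail_distance = FAIL_TICKS[symbol] * TICK
    held = all(direction * (c - level) >= fail_distance for c in window)
    if len(window) == WINDOW_BARS and held:
        return "ACCEPTED"
    return "PENDING"
```

A POTENTIAL_TRAP result is only a candidate; the three confirmation paths described next decide whether it becomes a confirmed trap.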

Trap Confirmation

Trap confirmation requires one of three signals within 4 bars after the sweep: (1) price reverses by at least trapMinTicks (NQ: 8 / ES: 5), (2) aggressive delta ≥10% stalls with a 2-tick reversal, or (3) the last 4 bars form a micro-range tighter than sweepFailTicks — the market froze after the attempt. Confirmed traps score 82 confidence in the order flow state machine — the highest-conviction signal the engine produces.

Sweep Distances
Distance | NQ (ticks / pts) | ES (ticks / pts)
Sweep break | 4t / 1.00 pt | 3t / 0.75 pt
Sweep fail (accepted) | 6t / 1.50 pt | 4t / 1.00 pt
Trap min reversal | 8t / 2.00 pt | 5t / 1.25 pt
Trap max window | 12t / 3.00 pt | 8t / 2.00 pt
Gap significant | 80 pts / 0.4% | 20 pts / 0.3%
Trap Probability Scoring
Trap Probability | Component
Base | 25 pts
+ Reclaim | +28
+ Trap confirmed | +28
+ Fake breakout | +12
+ Contra delta | +10
− Accepted | −38
High confidence | ≥65 probability
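The additive scoring in the table can be sketched as a small Python function. The component weights are taken from the table; the flag names and the clamp to the 0–100 range are assumptions.

```python
from typing import Iterable

BASE_POINTS = 25
COMPONENT_POINTS = {
    "reclaim": 28,
    "trap_confirmed": 28,
    "fake_breakout": 12,
    "contra_delta": 10,
    "accepted": -38,
}
HIGH_CONFIDENCE = 65


def trap_probability(flags: Iterable[str]) -> int:
    """Base 25 plus each active component, clamped to 0..100 (assumed)."""
    score = BASE_POINTS + sum(COMPONENT_POINTS[f] for f in set(flags))
    return max(0, min(100, score))


def is_high_confidence(flags: Iterable[str]) -> bool:
    return trap_probability(flags) >= HIGH_CONFIDENCE
```

A reclaim plus a confirmed trap gives 25 + 28 + 28 = 81, which clears the 65 bar; an accepted sweep alone gives 25 − 38, which clamps to 0.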
Calibration Rationale
  • The break/fail distances are calibrated to each instrument's tick value. NQ at $5/tick needs 4 ticks ($20) to register a meaningful push past a level — less than that is just bid/ask bounce.
  • ES at $12.50/tick needs only 3 ($37.50) for a comparable dollar displacement.
  • The trap window (8–12 / 5–8 ticks) corresponds to typical retail stop-cluster distances from key levels — this is where the "victims" are positioned, based on common retail order-placement patterns observed in DOM data.
Risk Management
Risk Guardrails
-3.75R halt · 12-trade window · setup retirement
P4
Hard limits the engine will not cross — behavioral brakes calibrated to prevent judgment-degradation spirals.
Track Halt Size Retire
Core Decision
Three independent brakes prevent judgment-degradation spirals: daily loss halt, drawdown-scaled position sizing, and rolling setup retirement.
Daily Halt: cumulative realized R hits -3.75R → new setups halted
Retire: rolling 12-trade expectancy < -0.12R (n≥6) → setup benched
When It Works Position sizing stays full, setups remain active, engine operates at maximum capacity.
When It Fails Engine halts at -3.75R, sizes down to 25% floor at deep drawdown, or benches broken setups permanently.
Daily Loss Halt & Circuit Breaker

The engine tracks cumulative realized R across all closed signals since midnight ET. When the daily total hits -3.75R, new setups are halted, not because the edge disappeared, but because three to four standard-size losing trades in a row degrade judgment. A separate circuit breaker fires after 5 consecutive losses regardless of R magnitude, and a 30-fire hard cap prevents signal spam even if realizedR data isn't populated.

Setup Retirement

Setup retirement operates on a rolling window of the last 12 closed trades per setup type. If a setup's rolling average expectancy drops below -0.12R with at least 6 completed trades, it's benched ("Retired"). Between -0.05R and -0.12R, it's flagged as "Degrading" with a -5 priority penalty. Retired setups continue computing for observation but can never fire — the engine doesn't ride a dead horse.
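The three brakes described above can be sketched in Python. Thresholds are from this card; the "ACTIVE" label, the strict `<` comparisons at the retire and degrade boundaries, and applying the n ≥ 6 minimum to the Degrading flag as well are assumptions.

```python
from typing import List

DAILY_HALT_R = -3.75
MAX_CONSECUTIVE_LOSSES = 5
ROLLING_WINDOW = 12
MIN_TRADES = 6
RETIRE_BELOW = -0.12
DEGRADE_BELOW = -0.05


def daily_halt(closed_r: List[float]) -> bool:
    """True once cumulative realized R for the day reaches -3.75R."""
    return sum(closed_r) <= DAILY_HALT_R


def circuit_breaker(closed_r: List[float]) -> bool:
    """True when the last five closed trades were all losses."""
    tail = closed_r[-MAX_CONSECUTIVE_LOSSES:]
    return len(tail) == MAX_CONSECUTIVE_LOSSES and all(r < 0 for r in tail)


def setup_status(setup_r: List[float]) -> str:
    """Rolling expectancy over the last 12 closed trades of one setup type."""
    window = setup_r[-ROLLING_WINDOW:]
    if len(window) < MIN_TRADES:
        return "ACTIVE"
    expectancy = sum(window) / len(window)
    if expectancy < RETIRE_BELOW:
        return "RETIRED"
    if expectancy < DEGRADE_BELOW:
        return "DEGRADING"
    return "ACTIVE"
```

Three losses of 1.25R each sum to exactly −3.75R and trip the daily halt, while a setup with only three closed trades stays active no matter how bad they were, because the sample is below the n ≥ 6 minimum.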

Risk Parameters
Parameter | Value
Daily loss halt | -3.75R cumulative
Circuit breaker | 5 consecutive losses
Hard fire cap | 30 fires / day (safety net)
RR floor | 1.5:1 (HOT) / 2.0:1 (AUTOMATED)
RR cap (advisory) | 2.5 NQ / 4.0 ES (off in Phase A)
Retire threshold | -0.12R rolling expectancy, n≥6
Degrade threshold | -0.05R rolling expectancy
Rolling window | Last 12 closed trades per setup
Calibration Rationale
  • The -3.75R daily halt aligns with institutional prop desk standards where traders are pulled after 3-4 standard-risk losing trades.
  • The position degradation curve is a behavioral brake: it doesn't predict whether the next trade wins — it ensures the inevitable revenge trade costs less.
  • The 12-trade rolling window for retirement balances responsiveness (catching broken setups quickly) against noise resistance (not benching a setup over two bad days).
Risk Management
Risk Plan & Position Sizing
stop · entry · R-budget · contract count
P4.5
From signal to trade: stop placement → entry price → dollar risk budget → contract count → instrument → size tier. Every live trade starts here.
Plan Budget Size Tier Mirror
Core Decision
Given a setup's entry price and stop loss, compute the all-in contract plan that keeps stop risk plus commissions at or below the effective-equity R-budget.
Risk Budget: effectiveAccount × maxRiskPctPerTrade × riskScale × sizeMode
All-In Risk: (|entry−stop| × pointValue × contracts) + commissions
Current Selector: standard if all-in fits, else micro fallback
When It Works Every trade has a defined dollar risk before entry. Sizing scales down automatically under drawdown — no manual calculation required.
When It Fails Entry or stop price not available at render time — chip shows plan fallback. Trader overrides with HALF or PROBE tier manually.
Stop Placement & Entry Price

The risk plan is anchored to two prices: entry and stop loss. These come from the setup's structural plan — the engine reads them from linked.entryPrice / linked.stopLoss at render time (signal card) and at journal-entry time (addSetupToTraderJournal()). If no linked plan exists, the engine attempts to derive them from buildStructuredPlan(). Stop distance is computed as |entry − stop| in points — direction agnostic.

Dollar Risk Budget

Base risk is a flat percentage of effective account equity: (accountSize + closed realized P&L) × maxRiskPctPerTrade when accountEquityAutoAdjust is enabled. This is then scaled by the automatic risk multiplier from riskScaleForNextTrade(): drawdown state, conviction tier, and bounded ATR-risk trim from stop width, target reach, and entry-to-VWAP extension. No regime multiplier: the regime gates already filter which setups fire; sizing does not second-guess what the gate decided.

Contract Count & Instrument Auto-Select

engine-position-sizer.js now optimizes executable contract mixes instead of forcing one instrument class. It can recommend all-micro, all-mini, or mixed output such as 1 NQ + 2 MNQ when that best fits the adjusted all-in risk budget. Dollar risk includes commissions: minis use $3.50 round turn, micros use $1.50. The chip displays all-in economics and the exact contract mix.
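A single-instrument version of the budget and all-in sizing math can be sketched in Python. This does not reproduce the mixed mini+micro optimizer in engine-position-sizer.js. Point values and commissions are from the Instrument Specs table; treating sizeMode as a tier factor (FULL = 1.0, HALF = 0.5) and the function names are assumptions.

```python
import math

POINT_VALUE = {"NQ": 20.0, "MNQ": 2.0, "ES": 50.0, "MES": 5.0}
ROUND_TURN = {"NQ": 3.50, "MNQ": 1.50, "ES": 3.50, "MES": 1.50}
TIER_FACTOR = {"FULL": 1.0, "HALF": 0.5}  # assumed meaning of sizeMode


def risk_budget(effective_account: float, max_risk_pct: float,
                risk_scale: float, tier: str = "FULL") -> float:
    """effectiveAccount x maxRiskPctPerTrade x riskScale x sizeMode."""
    return effective_account * max_risk_pct * risk_scale * TIER_FACTOR[tier]


def all_in_risk(entry: float, stop: float, instrument: str,
                contracts: int) -> float:
    """Stop risk plus round-turn commissions for `contracts` contracts."""
    per_contract = (abs(entry - stop) * POINT_VALUE[instrument]
                    + ROUND_TURN[instrument])
    return per_contract * contracts


def max_contracts(budget: float, entry: float, stop: float,
                  instrument: str) -> int:
    """Largest whole contract count whose all-in risk fits the budget."""
    return math.floor(budget / all_in_risk(entry, stop, instrument, 1))
```

With a $25,000 account at 1% and no drawdown scaling, the budget is $250. A 10-point stop costs $203.50 all-in per NQ contract, so one NQ fits; the same stop costs $21.50 per MNQ, so eleven micros fit ($236.50). PROBE bypasses this math entirely and always returns one micro.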

Size Tiers — FULL / HALF / PROBE

Three manual tiers appear on every signal card. FULL (default — no click required): trade the full adjusted risk budget. HALF: cut the budget to 50% before computing contracts. PROBE: bypass all math — always 1 micro, regardless of stop distance, drawdown, or account size. PROBE is not a sizing calculation — it is a veto. The trade is on, but exposure is minimal. Tier selection persists per setup ID for the session; clicking a new tier re-renders the chip immediately.

Instrument Specs
Instrument | Point Value | Type | Round Turn
NQ | $20 / pt | Standard | $3.50
MNQ | $2 / pt | Micro | $1.50
ES | $50 / pt | Standard | $3.50
MES | $5 / pt | Micro | $1.50
Position Sizing Curve (Drawdown Scale)
Drawdown (R) | Scale | Effect on $250 base budget
0 | 100% | $250
-2 | 85% | $213
-4 | 70% | $175
-6 | 55% | $138
-8 | 40% | $100
Config Fields
Field | Default | Location
accountSize | 25,000 | config.js · INSTITUTIONAL_TUNING
accountEquityAutoAdjust | true | config.js · INSTITUTIONAL_TUNING
maxRiskPctPerTrade | 0.01 (1%) | config.js · INSTITUTIONAL_TUNING
roundTurnCommissionByInstrument | NQ/ES 3.50 · MNQ/MES 1.50 | config.js · INSTITUTIONAL_TUNING
sizeByDrawdown | curve above | config.js · INSTITUTIONAL_TUNING
Calibration Rationale
  • 1% fixed risk per trade is the standard institutional starting point — aggressive enough to compound, conservative enough to survive a cold streak without account damage.
  • All-in sizing prevents commission drag from being invisible: contract count is based on stop risk plus round-turn fees.
  • Auto instrument selection now supports mixed mini+micro sizing, so full-size progression can be precise without excessive micro-only commissions.
  • PROBE tier exists for high-uncertainty setups where the trader wants presence but not exposure — one micro is near-zero cost and still captures the full experience in the substrate.
  • No regime multiplier by design: if a setup passed all gates, it has already been regime-filtered. Sizing adds only execution-geometry risk via ATR-proxy stop/target/VWAP measurements.
Signal Grading
Scoring, EV & Grading
sigmoid · 64% cap · 12-trade blend
P5
How confluence score converts to a win probability via a conservative sigmoid model — and what each grade means.
Score Sigmoid Blend EV Grade
Core Decision
Confluence score converts to win probability via a sigmoid capped at 64%, then computes EV to decide whether to fire.
Win Prob: sigmoid inflected at score 72, ceiling 64%
EV: (winProb × R:R) − (1−winProb) × 1R, fire at EV ≥ 0.0
When It Works EV ≥ 0.0 fires the signal; EV ≥ 0.8 earns a premium badge for highest-conviction setups.
When It Fails EV negative — setup computed but suppressed. Near-miss (EV ≥ -0.2) tracked as Armed for observation.
Sigmoid Win Probability

Confluence score (0–100) doesn't directly fire a signal. It first converts to a win probability through a sigmoid curve inflected at score 72, capped at a 64% ceiling. The reasoning: even perfect alignment doesn't guarantee >64% win rate in futures markets. A score of 55 maps to ~25% win probability; 72 maps to ~43%; 90 maps to ~61%.

Historical Win Rate Blending

If the setup type has ≥12 completed trades in the outcome registry, the engine blends historical win rate into the curve estimate. Historical data gets up to 85% weight (curve always contributes at least 15%). If the actual win rate underperforms the curve by >8%, the historical weight is boosted to 92% — the model recognizes it was overestimating and defers to reality.
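The blending rule can be sketched in Python under two stated assumptions: the historical weight is applied at its 85% maximum as soon as 12 trades exist (the ramp behind "up to 85%" is not specified here), and ">8%" is read as 8 percentage points of win rate.

```python
MIN_TRADES_FOR_BLEND = 12
HIST_WEIGHT = 0.85            # assumed to apply in full once n >= 12
HIST_WEIGHT_BOOSTED = 0.92    # used when history underperforms the curve
UNDERPERFORM_GAP = 0.08       # assumed to mean 8 percentage points


def blend_win_prob(curve_p: float, hist_p: float, n_trades: int) -> float:
    """Blend the sigmoid estimate with the setup's historical win rate."""
    if n_trades < MIN_TRADES_FOR_BLEND:
        return curve_p
    underperforming = (curve_p - hist_p) > UNDERPERFORM_GAP
    weight = HIST_WEIGHT_BOOSTED if underperforming else HIST_WEIGHT
    return weight * hist_p + (1.0 - weight) * curve_p
```

For a curve estimate of 60% against a realized 40% over 20 trades, the boosted weight applies and the blend is 0.92 × 0.40 + 0.08 × 0.60 = 41.6%.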

Expected Value & Multipliers

Expected Value is then computed as: EV = (winProb × R:R) − (1−winProb) × 1R. This is further adjusted by six multipliers: market fit (0.6–1.0×), directional alignment (±5%, −12% for conflict), confirmation status (+4%/−2%), production mode (+3%/−2%), sweep/absorption bonus (+6%), and chop penalty (−5% to −15%). The final win probability is hard-bounded between 15% and 72%.
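The EV formula and the decision thresholds from this card can be sketched in Python. The six multipliers are omitted, and the label names plus the rule that anything below the fire threshold but at or above −0.2 counts as Armed in both modes are assumptions.

```python
FIRE_THRESHOLD = {"HOT": 0.0, "AUTOMATED": 0.5}
PREMIUM_EV = 0.8
ARMED_EV = -0.2


def expected_value(win_prob: float, reward_risk: float) -> float:
    """EV in R: (winProb x R:R) - (1 - winProb) x 1R."""
    return win_prob * reward_risk - (1.0 - win_prob) * 1.0


def decide(ev: float, mode: str = "HOT") -> str:
    """Map an EV onto PREMIUM / SIGNAL / ARMED / SUPPRESSED."""
    if ev >= FIRE_THRESHOLD[mode]:
        return "PREMIUM" if ev >= PREMIUM_EV else "SIGNAL"
    if ev >= ARMED_EV:
        return "ARMED"
    return "SUPPRESSED"
```

At a 50% win probability and 2:1 reward-to-risk the EV is 0.5R, which fires in both modes. At 25% and 2:1 the EV is −0.25R, which falls below even the Armed band.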

Grade Thresholds
Grade | Score | Approx Win Prob
S (elite) | ≥90 | ~61%
A+ | 80–89 | ~54%
A | 70–79 | ~40%
B | <70 | <37%
EV Decision Thresholds
EV Decision | Threshold
Signal (Hot Run) | EV ≥ 0.0
Signal (Automated Run) | EV ≥ 0.5
Premium badge | EV ≥ 0.8
Armed (near-miss) | EV ≥ -0.2
Historical blend trigger | ≥12 closed trades
Win prob ceiling | 64% (sigmoid cap)
Calibration Rationale
  • The sigmoid inflection at 72 was set empirically: substrate data shows a sharp inflection in outcome quality around that score.
  • The 64% ceiling is a market-structure constraint — no retail-accessible strategy sustains >64% win rate in liquid futures at meaningful holding periods.
  • The 12-trade minimum before blending historical WR prevents overfitting to a handful of early results.
  • Hot Run's 0.0 EV threshold is intentionally permissive — we're collecting signal-quality data, not filtering for profit.
Session & Data Quality
Session Windows, Slippage & Data Quality
64 quality floor · 3.2t MOC
P6
How long the engine remembers, what it assumes about execution cost, and when it declares itself blind.
Memory Slippage Quality Halt
Core Decision
Target memory tightens through the day, slippage is modeled conservatively per session, and data quality below 64/100 halts production.
Quality: score from 100 downward · floor 64/100 → production halted
Feed: < 10 ticks/min = hard stop · < 22 = warning
When It Works Quality ≥85 (grade A) — full production with accurate slippage estimates and fresh target memory.
When It Fails Quality <64 (grade D) — engine halts. A blank screen beats a wrong one.
Target Memory Windows

Target memory tightens through the trading day. In Asia (slow, level-driven), a liquidity target persists in memory for 26 minutes and up to 130 ticks away. By NY (fast, momentum-driven), the same target is stale at 18 minutes / 100 ticks. The scan range — how far the engine looks for new targets — follows the same tightening: 90 ticks in Asia, 64 in NY.

Slippage Model

Slippage is modeled per session and adjusted by volume regime. Base assumptions range from 1.8 ticks (Asia, thin but orderly) to 3.2 ticks (MOC, thin and chaotic). A volume multiplier scales these: EXTREME volume costs 1.6× the base slippage. Reversion setups get a 0.9× style discount (tighter fills at levels); continuation setups get 1.1× (looser fills chasing). A flat 2-tick round-trip cost is added for commissions and spread.
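The slippage arithmetic can be sketched in Python using the base values from this card. Only the EXTREME multiplier (1.6×) is documented, so other regimes default to 1.0 here as a placeholder. The session keys, the function names, and multiplying the style adjustment on top of the volume multiplier are assumptions.

```python
from typing import Optional

BASE_SLIPPAGE_TICKS = {
    "OPENING_DRIVE": 3.0, "POWER_HOUR": 2.6, "MOC": 3.2,
    "LUNCH": 2.2, "LONDON": 2.0, "ASIA": 1.8,
}
VOLUME_MULTIPLIER = {"EXTREME": 1.6}   # other regimes: assumed 1.0
STYLE_ADJUSTMENT = {"reversion": 0.9, "continuation": 1.1}
ROUND_TRIP_TICKS = 2.0                 # flat commissions + spread


def modeled_slippage(session: str, regime: str,
                     style: Optional[str] = None) -> float:
    """Session base x volume multiplier x optional style adjustment."""
    slip = BASE_SLIPPAGE_TICKS[session] * VOLUME_MULTIPLIER.get(regime, 1.0)
    if style is not None:
        slip *= STYLE_ADJUSTMENT[style]
    return slip


def modeled_cost(session: str, regime: str,
                 style: Optional[str] = None) -> float:
    """Slippage plus the flat 2-tick round-trip cost, in ticks."""
    return modeled_slippage(session, regime, style) + ROUND_TRIP_TICKS
```

MOC at EXTREME volume comes to 3.2 × 1.6 = 5.12 ticks of slippage, matching the table below, and 7.12 ticks once the flat round-trip cost is added.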

Data Quality Scoring

Data quality is scored from 100 downward, with deductions for each data gap. Missing price = −35. Stale feed = −24. Missing order book = −5. The quality floor is 64/100 (grade C) — below that, the engine halts production. The philosophy: a blank screen beats a wrong one. Feed health is monitored via a 1-minute sliding window of tick arrivals; below 10 ticks/minute = hard stop, below 22 = warning.
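The deduction scoring, grade bands, and feed-health check can be sketched in Python. Only the three deductions named above are included; the gap identifiers and the status labels are hypothetical, and counting each distinct gap once is an assumption.

```python
from typing import Iterable

DEDUCTIONS = {"missing_price": 35, "stale_feed": 24, "missing_order_book": 5}
QUALITY_FLOOR = 64


def quality_score(gaps: Iterable[str]) -> int:
    """Start at 100 and subtract each distinct gap's deduction."""
    return max(0, 100 - sum(DEDUCTIONS[g] for g in set(gaps)))


def quality_grade(score: int) -> str:
    if score >= 85:
        return "A"
    if score >= 72:
        return "B"
    if score >= QUALITY_FLOOR:
        return "C"
    return "D"  # production halted


def feed_status(ticks_per_minute: float) -> str:
    """1-minute sliding-window tick count against the two feed limits."""
    if ticks_per_minute < 10:
        return "HARD_STOP"
    if ticks_per_minute < 22:
        return "WARNING"
    return "OK"
```

A missing price together with a stale feed scores 100 − 35 − 24 = 41, well under the floor, while a missing order book alone leaves the score at 95 (grade A).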

Session Memory Windows
Window | Asia | London | NY
Target memory | 26 min | 22 min | 18 min
Max distance | 130t | 115t | 100t
Scan range | 90t | 78t | 64t
Slippage Assumptions
Slippage | Base (ticks) | At EXTREME vol
Opening Drive | 3.0 | 4.8
Power Hour | 2.6 | 4.16
MOC | 3.2 | 5.12
Lunch | 2.2 | 3.52
London | 2.0 | 3.20
Asia | 1.8 | 2.88
Style adj. | reversion ×0.9 · continuation ×1.1
Round-trip cost | 2 ticks fixed
Data Quality Grades
Data Quality Score | Impact
A (≥85) | Full production
B (72–84) | Production with penalty
C (64–71) | Minimum passing
D (<64) | Production halted
Calibration Rationale
  • Target memory windows tighten because volatility accelerates through the trading day — a level that's relevant for 26 minutes in Asia is ancient by NY.
  • The slippage model is intentionally conservative: it should overestimate execution cost, not underestimate it.
  • The MOC session gets the highest base (3.2t) because market-on-close flow creates the most adverse fills relative to displayed liquidity.
  • The 64/100 quality floor is aggressive by design — it means any two medium-severity data gaps halt the engine.
Serial AND cascade — each gate must pass. Hard blocks terminate immediately. Soft blocks accumulate as WAIT. Advisory gates (Phase A) compute but don't enforce.
Pipeline Gate
Gate 0 — Institutional Hard Blocks
3 sub-gates · hard block
G0
System-level kill switches. Three sub-gates that terminate before anything else runs.
Check · Evaluate · Decide · Record
Core Decision
Any institutional block, symbol kill, or quarantined setup terminates the signal immediately.
Veto: institutional block flag set → signal dies
Kill Switch: per-symbol toggle → hard block
Quarantine: blacklisted setup → compute only, never SIGNAL
Gate Passes No institutional blocks, symbol is live, setup is not quarantined — signal proceeds to Gate 1.
Gate Blocks Signal dies immediately with no appeal. Quarantined setups continue computing for substrate observation only.
Institutional Veto

Gate 0 is the pipeline's first checkpoint and the only one that cannot be overridden by any downstream condition. It runs three independent checks in sequence. Gate 0 (Institutional Veto) scans for any reason the system has flagged as an institutional-level block — circuit breakers, exchange halts, or operator-set kill conditions. If any flag is set, the signal dies immediately with no appeal.

Symbol Kill Switch & Setup Quarantine

Gate 0a (Symbol Kill Switch) is a per-symbol manual toggle. During Phase 1 calibration, ES was locked in analysis-only mode — setups were computed and logged but never promoted to signal status. Gate 0b (Setup Quarantine) checks a static blacklist of setups that failed in live data. Quarantined setups continue computing (substrate observation only) but can never promote to SIGNAL. BUILD PHASE (2026-05-22): all 8 previously quarantined setups reinstated — prior data declared invalid (structurally broken engine). Fresh data collection will re-evaluate all 50 setups on equal footing.
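The three Gate 0 checks can be sketched as a short function that returns on the first failure. The `Gate0Context` container, its field names, and the reason strings are hypothetical; only the order of the checks and their hard-block behaviour come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Gate0Context:
    """Hypothetical container for the state Gate 0 consults."""
    institutional_block_reasons: list = field(default_factory=list)
    killed_symbols: set = field(default_factory=set)
    quarantined_setups: set = field(default_factory=set)


def gate0(ctx, symbol, setup_id):
    """Run sub-gates 0, 0a, 0b in order; return (passed, reason)."""
    if ctx.institutional_block_reasons:
        return False, "G0: " + ctx.institutional_block_reasons[0]
    if symbol in ctx.killed_symbols:
        return False, "G0a: symbol kill switch"
    if setup_id in ctx.quarantined_setups:
        # Quarantined setups keep computing for observation but never promote.
        return False, "G0b: setup quarantined (compute only)"
    return True, "G0: pass"
```

With all quarantines lifted, `quarantined_setups` would be empty and only the veto and kill-switch checks could block.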

Gate | Name | Rule | Type
0 | Institutional Veto | Any institutional block reason set → signal dies. No appeal. | Hard
0a | Symbol Kill Switch | Per-symbol toggle. ES was analysis-only during Phase 1 calibration. | Hard
0b | Setup Quarantine | BUILD PHASE: all quarantines lifted (2026-05-23). 0 setups currently benched. Prior quarantine data invalidated by engine restructure. | Hard
Calibration Rationale
  • The quarantine list is data-driven, not opinion-driven — each benched setup was quarantined after its rolling performance fell below the retirement threshold or showed structural failure (0% WR).
  • Rather than deleting them, the engine keeps them computing for observation — if market conditions change and the setup rehabilitates in substrate data, it can be unbenched.
  • The kill switch was used to keep ES in watch-only mode until its own calibration data was collected.
Pipeline Gate
Gate 1 / 1.5 — Market Tradeable + State Confidence
50% confidence floor
G1
Is the market open and classified with enough confidence to act?
Check · Evaluate · Decide · Record
Core Decision
Market must be open, state confidence must be at least 50%, and production mode must be active.
G1: market open + valid state → hard/soft by reason
G1.5a: state confidence < 50% → soft block
G1.5b: tradeMode = NONE → score penalty
G1.5c: regime transition → proportional penalty
Gate Passes Market is open, engine trusts its classification, and production mode is active — signal proceeds to bias check.
Gate Blocks Pre-market or post-close hard-blocks. Low confidence or missing thesis imposes soft score penalties.
Market Tradeable

Gate 1 consults the market context analyzer to determine whether the market is open and in a valid, tradeable state. The severity of any block depends on the reason: a pre-market or post-close condition is a hard block (no point evaluating setups when the market is closed), while a data-quality degradation might only impose a soft penalty.

State Confidence & Transition Risk

Gate 1.5 evaluates three separate confidence dimensions. 1.5a (State Confidence) requires the engine's own classification of the current market context to be at least 50% confident. Below that threshold, the engine doesn't trust its regime classification — it might be calling a "trending" market when it's actually rotating, and every downstream gate depends on that classification being at least plausible. 1.5b (Production Mode) penalizes when tradeMode = NONE — no sweep+trap pattern detected and no trend exception active. This means the engine has no high-confidence thesis about what the market is doing. 1.5c (Transition Risk) applies a proportional penalty when the market is transitioning between regimes (e.g., from range to trend), because regime boundaries are statistically where false signals cluster.

Gate | Name | Rule | Type
1 | Market Tradeable | Market context analyzer says the market is open and valid. Severity-based: hard or soft depending on the block reason. | Var
1.5a | State Confidence ≥ 50 | Production state must be at least 50% confident. Below that the engine doesn't trust its own classification. | Soft
1.5b | Production Mode | If tradeMode = NONE (no sweep+trap, no trend exception), score takes a penalty. | Soft
1.5c | Transition Risk | Market transitioning between regimes → score penalty proportional to transition risk score. | Soft
Calibration Rationale
  • The 50% confidence floor is generous — it's a sanity check, not a filter. Below 50 the engine is essentially guessing what kind of market this is, and guessing is not a strategy.
  • Transition penalties exist because regime boundaries are where false signals cluster — a mean-reversion setup that fires during a regime transition to trend will get run over.
Pipeline Gate
Gate 2.5 — Bias Arbitrator
55% min · 8% edge
G2.5
Does the setup's direction match the market's directional bias? Five modules vote.
Check · Evaluate · Decide · Record
Core Decision
Setup direction must align with the declared bias from five voting modules.
Bias: side > 55% AND lead ≥ 8% → directional bias declared
Conflict: both sides > 55% within 8% → CONFLICT
Override: actionable sweep/absorption → bias check skipped
Gate Passes Setup direction aligns with declared bias, or an actionable sweep/absorption event overrides the check.
Gate Blocks Direction opposes bias — score penalty applied. CONTINUATION_FALLBACK costs ~-1.6R on average.
Voting Framework

The Bias Arbitrator aggregates votes from five independent directional modules — each contributes a long score and a short score based on its own evidence. The arbitrator then applies a voting framework: if one side exceeds 55% and leads the other by at least 8%, a directional bias is declared. If both sides exceed 55% within 8% of each other, the state is CONFLICT — not uncertainty, but genuine contradictory evidence, which is worse for signal quality than having no bias at all.
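The voting rule above can be sketched as a small classifier over the two aggregated scores. How the five modules' votes are combined into one long score and one short score is not specified here, so the sketch takes both as inputs. The function name and the handling of boundary cases (a lead of exactly 8 counts as decisive, a score of exactly 55 can count toward CONFLICT) are assumptions based on the tuning values; the CONTINUATION_FALLBACK path is not modeled.

```python
SIDE_MIN = 55.0   # minimum score to declare a side
SIDE_EDGE = 8.0   # required lead for a decisive call


def declare_bias(long_score, short_score):
    """Classify aggregated 0-100 scores as LONG, SHORT, CONFLICT, or NONE."""
    gap = abs(long_score - short_score)
    # Both sides have a case and neither leads decisively.
    if long_score >= SIDE_MIN and short_score >= SIDE_MIN and gap < SIDE_EDGE:
        return "CONFLICT"
    if long_score > SIDE_MIN and long_score - short_score >= SIDE_EDGE:
        return "LONG"
    if short_score > SIDE_MIN and short_score - long_score >= SIDE_EDGE:
        return "SHORT"
    # Assumption: anything else is treated as no declared direction.
    return "NONE"
```

Separating CONFLICT from NONE keeps the distinction the text draws between contradictory evidence and absent evidence.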

Sweep Override & Continuation Fallback

The gate checks whether the proposed setup's direction aligns with the declared bias. A long setup in a SHORT bias environment gets a penalty. However, a critical exception exists: if there's an actionable sweep or absorption event, the bias check is skipped entirely. The logic is that liquidity events (stop hunts, institutional absorption at a level) override directional bias — a trapped short squeeze doesn't care what the trend says.

The CONTINUATION_FALLBACK state deserves attention: when bias direction matches but confidence is below the decisive threshold, the arbitrator falls back to a continuation assumption. Cohort attribution analysis found this fallback costs approximately -1.6R on average — it's a known weak spot that ships as a soft penalty rather than a hard block, awaiting more data to decide its fate.

Arbitrator Tuning
Tuning | Value
Side minimum | 55% (score to declare side)
Side edge | 8% (L-S gap for decisive call)
Conflict minimum | 55% (both sides ≥ this = CONFLICT)
Conflict edge | 8% (max gap to be "conflict" not "edge")
None max | 55% (below = no direction)
Calibration Rationale
  • The 8% edge requirement prevents firing on razor-thin directional advantages that could flip on the next bar.
  • CONFLICT state isn't "we don't know" — it's "both sides have a case," which is empirically worse for signal outcomes than NONE (where neither side has evidence).
  • The sweep/absorption override exists because liquidity events carry their own directional conviction independent of trend.
Pipeline Gate
Gate 2.6 / 2.7 — Market Story + Chop
55 fit · severe chop = kill
G2.6
Does the setup family match what the auction is doing — and is the market in chop?
Check · Evaluate · Decide · Record
Core Decision
Setup family must fit the auction narrative; severe chop without liquidity events kills the signal.
G2.6: story fit < 55 → soft penalty · fit < 35 + score < 60 + CHOP_NO_EDGE → hard block
G2.7: SEVERE_NO_EDGE_CHOP without sweep/absorption → hard block
Gate Passes Setup family fits the auction narrative with a fit score of at least 55, or chop is mild enough to allow continuation.
Gate Blocks Severe chop hard-blocks the signal. Low story fit imposes soft penalty; extreme mismatch escalates to hard block.
Market Story Fit

Gate 2.6 (Market Story Fit) computes a 0-100 score measuring how well the proposed setup family matches the current auction narrative. A continuation setup in a trending market scores high; the same setup in a rotating, mean-reverting market scores low. Below 55, the signal takes a soft penalty. Below 35, with a raw confluence score below 60 and the market classified as CHOP_NO_EDGE, the gate escalates to a hard block — the thesis has no structural support.

Chop Permissions

Gate 2.7 (Chop Permissions) is the engine's most aggressive defensive gate. When the market story evaluator classifies the environment as SEVERE_NO_EDGE_CHOP — meaning no directional conviction, no institutional footprint, and no structural level nearby — the gate hard-blocks unless there's an actionable sweep or absorption event. Other chop states (MILD_CHOP, ROTATIONAL_CHOP) impose graduated score penalties but allow the signal to continue. The distinction matters: mild chop can produce legitimate mean-reversion setups at levels, but severe chop is a fee-burning machine.
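Both gates reduce to a few threshold comparisons, sketched below. The thresholds and state names come from the text; the function names and verdict strings are hypothetical, and returning a plain pass when a sweep or absorption event overrides severe chop is an assumption (the engine may still apply a penalty in that case).

```python
def gate_2_6(story_fit, confluence_score, story):
    """Market Story Fit: escalate to a hard block only on extreme mismatch."""
    if story_fit < 35 and confluence_score < 60 and story == "CHOP_NO_EDGE":
        return "HARD_BLOCK"
    if story_fit < 55:
        return "SOFT_PENALTY"
    return "PASS"


def gate_2_7(chop_state, actionable_liquidity_event):
    """Chop Permissions: severe chop blocks unless a sweep/absorption is live."""
    if chop_state == "SEVERE_NO_EDGE_CHOP":
        # Assumption: the override yields a plain pass.
        return "PASS" if actionable_liquidity_event else "HARD_BLOCK"
    if chop_state in ("MILD_CHOP", "ROTATIONAL_CHOP"):
        return "SOFT_PENALTY"
    return "PASS"
```

The hard block in `gate_2_6` needs all three conditions at once, which is why only the clearest "no edge" cases trigger it.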

Gate | Name | Rule | Type
2.6 | Market Story Fit | Market fit score 0–100. Below 55 → soft block. Below 35 with score <60 and story = CHOP_NO_EDGE → hard block. | Soft
2.7 | Chop Permissions | SEVERE_NO_EDGE_CHOP without actionable sweep/absorption → hard block. Other chop states → score penalty. | Hard*
Calibration Rationale
  • CHOP_NO_EDGE is the engine saying "this market is going nowhere and there's no institutional footprint to ride." Hard-blocking it prevents the most expensive error in trading: forcing a trade when there's nothing to trade.
  • The 35/60 threshold is deliberately low — only the clearest "no edge" conditions trigger it.
  • The sweep/absorption override exists because these events can break a chop regime entirely.
Pipeline Gate
Gate 3 / 3.5 — Regime + Session Playbook
3 sessions · 6 families
G3
Setup style must match market regime, and only approved families run per session.
Check · Evaluate · Decide · Record
Core Decision
Continuation setups require trend/expansion; reversion requires range/rotation. Session playbook restricts families.
G3 STRICT: style mismatch → hard block · low_participation → hard block
G3 BALANCED: style mismatch → soft penalty
G3.5: setup outside session playbook → blocked or downgraded
Gate Passes Setup style matches regime and is in the session's approved playbook — signal proceeds to state validity checks.
Gate Blocks Style mismatch blocks (STRICT) or penalizes (BALANCED). Low participation hard-blocks all setups regardless.
Regime Compatibility

Gate 3 (Regime Compatibility) enforces that continuation setups only fire in trend/expansion regimes, and reversion setups only fire in range/rotation regimes. In STRICT mode, a style mismatch is a hard block — a breakout setup in a ranging market simply cannot fire. In BALANCED mode, the mismatch imposes a soft penalty, allowing the setup to continue at a score disadvantage. Both modes hard-block all setups in low_participation regimes, because thin markets produce unreliable signals regardless of setup quality.
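A minimal sketch of the regime compatibility rule. The regime and style names follow the text; the function name and the verdict strings are hypothetical.

```python
# Regimes in which each setup style is allowed to fire.
ALLOWED_REGIMES = {
    "continuation": {"trend", "expansion"},
    "reversion": {"range", "rotation"},
}


def gate_3(style, regime, profile):
    """Return PASS, SOFT_PENALTY, or HARD_BLOCK for a style/regime pair."""
    if regime == "low_participation":
        return "HARD_BLOCK"  # applies in both STRICT and BALANCED
    if regime in ALLOWED_REGIMES[style]:
        return "PASS"
    return "HARD_BLOCK" if profile == "STRICT" else "SOFT_PENALTY"
```

The low-participation check comes first so that no profile setting can soften it.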

Session Playbook

Gate 3.5 (Session Playbook) restricts which setup families are approved for each trading session. Asia (slow, level-driven) permits key-mag, exhaust-rev, and vpoc-mig — setups that work with clear levels and minimal flow. London permits ib-reject and trap-rev — setups that exploit the London open's stop-hunting patterns. NY (institutional flow-driven) permits ib-brk, ib-ext, and flow-surge — setups that ride directional momentum. Setups outside their playbook are either blocked (STRICT) or downgraded (BALANCED). The playbook is calibrated from historical outcome data grouped by session, not from theory about what "should" work.
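The playbook is effectively a lookup table, sketched below with the setup identifiers listed above. The function name and verdict strings are hypothetical.

```python
# Approved setups per session, as listed in the text.
SESSION_PLAYBOOK = {
    "Asia": {"key-mag", "exhaust-rev", "vpoc-mig"},
    "London": {"ib-reject", "trap-rev"},
    "NY": {"ib-brk", "ib-ext", "flow-surge"},
}


def gate_3_5(setup_id, session, profile):
    """Check a setup against the playbook for the current session."""
    if setup_id in SESSION_PLAYBOOK[session]:
        return "PASS"
    return "BLOCKED" if profile == "STRICT" else "DOWNGRADED"
```

Keeping the playbook as data rather than branching logic makes it easy to recalibrate from session-grouped outcome data.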

Gate | Name | Rule | Type
3 | Regime Compatibility | Continuation setups allowed in trend/expansion. Reversion setups allowed in range/rotation. Banned in low_participation. STRICT = hard block, BALANCED = soft. | Var
3.5 | Session Playbook | Asia: key-mag, exhaust-rev, vpoc-mig. London: ib-reject, trap-rev. NY: ib-brk, ib-ext, flow-surge. Outside playbook = blocked (STRICT) or downgraded (BALANCED). | Var
Calibration Rationale
  • Not every setup works in every session. Asia is slow and level-driven — breakout setups waste capital chasing moves that don't follow through.
  • NY has real institutional flow — reversal setups against that flow lose.
  • The low_participation hard block protects against the deadliest trap: a "perfect" setup in an empty market where there's no one to move price in your favor.
Pipeline Gate
Gate 3.6 / 3.7 — State Validity + Dominance
gap + breakout + dominance
G3.6
Gap authority, breakout direction, and tier-1 dominance scope.
Check · Evaluate · Decide · Record
Core Decision
Setup must not fight gap magnetism, clean breakouts, or active tier-1 signals.
G3.6a: gap opposes direction → soft block (sweep+trap overrides)
G3.6b: untrapped breakout opposite → soft block
G3.7: tier-1 active (ldn-sweep, fail-auc) → lower tiers yield
Gate Passes No gap conflict, no opposing breakout, no tier-1 dominance contention — signal proceeds to institutional checks.
Gate Blocks Soft penalties for gap or breakout headwinds. Lower-tier setups suppressed when tier-1 is active.
Gap Authority & Breakout State

Gate 3.6a (Gap Authority) checks whether a significant overnight gap conflicts with the setup's direction. A long setup when there's a large bearish gap (price opened well below prior close) faces headwind — the gap acts as an overhead magnet pulling price back. This is a soft block, not hard, because a confirmed sweep+trap can override the gap's authority.

Gate 3.6b (Breakout State) penalizes setups that fight a clean, untrapped breakout. If price has broken out of a range in one direction and no trap has been confirmed, setups pointing the opposite way are swimming upstream.

Tier-1 Dominance

Gate 3.7 (Tier-1 Dominance) is a priority arbitration mechanism: when a tier-1 setup (London sweep, failed auction — the highest-conviction patterns in the system) is active, lower-tier setups in the opposite direction are suppressed. Even lower-tier setups in the same direction yield — the tier-1 setup takes the slot. This prevents signal clutter and ensures the best signal gets the attention.
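Because lower-tier setups yield in both the opposite and the same direction, the arbitration does not need the direction at all. A minimal sketch follows; the function name is hypothetical, and the rule that one tier-1 setup never yields to another is an assumption.

```python
TIER1_SETUPS = {"ldn-sweep", "fail-auc"}


def yields_to_tier1(candidate_setup, active_setups):
    """True when a lower-tier candidate must yield to an active tier-1 setup."""
    if candidate_setup in TIER1_SETUPS:
        return False  # assumption: tier-1 setups do not yield to each other
    return any(s in TIER1_SETUPS for s in active_setups)
```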

Gate | Name | Rule | Type
3.6a | Gap Authority | Significant gap conflicts with setup direction → soft block, unless sweep+trap is active. | Soft
3.6b | Breakout State | Clean breakout opposite to setup direction without trap confirmation → soft block. | Soft
3.7 | Tier-1 Dominance | When a tier-1 setup (ldn-sweep, fail-auc) is active in opposite direction, lower-tier setups yield. Same direction but lower tier also yields. | Soft
Calibration Rationale
  • When the best setup in the system is pointing one way, lesser setups pointing the other way should defer.
  • Tier-1 setups have the strongest historical edge — when they speak, the rest of the pipeline listens.
  • The gap override for sweep+trap reflects that institutional stop-hunting events can negate gap magnetism entirely.
Pipeline Gate
Gate 3.8 / 3.9 — Whale + OI Conviction
whale · OI · soft
G3.8
Institutional positioning checks — are whales fighting you, and does open interest contradict your trade?
Check · Evaluate · Decide · Record
Core Decision
Institutional DOM clustering and open interest dynamics must not contradict the setup's direction.
G3.8: whale defense opposes direction → soft penalty · alignment = research note
G3.9: SHORT_COVERING + long → penalty · LONG_CAPITULATION + short → penalty
Gate Passes Institutional positioning doesn't contradict the setup, or aligns with it — signal proceeds to final qualification.
Gate Blocks Soft penalties for opposing whale defense or OI contradictions. Not hard blocks — institutions can be wrong at turning points.
Whale Defense

Gate 3.8 (Whale Defense) reads DOM (Depth of Market) clustering patterns. When institutional-sized orders cluster at resistance above the current price (defense-resistance) or at support below (defense-support), the engine detects "whale defense" — large players actively protecting a level. If their defense opposes the setup's direction (e.g., massive sell-side clustering at resistance while a long setup tries to fire), the setup takes a soft penalty. If whale defense aligns with the setup, it's logged as a research note but doesn't adjust the score — alignment is expected, not bonus-worthy.

OI Conviction

Gate 3.9 (OI Conviction) cross-references open interest dynamics with the setup's direction. A long setup firing during a SHORT_COVERING environment gets penalized: the rally looks like buying, but OI falling says it's shorts closing, not new longs entering — the rally is structural unwind, not genuine demand. Similarly, a short setup during LONG_CAPITULATION faces a penalty because the selloff is exhaustion, not new bearish conviction. Counter-trend penalties apply when longing against REAL_DOWNTREND or shorting against REAL_UPTREND. All OI gates are soft penalties because institutional positioning can be wrong, and a strong enough setup-level signal can override positioning headwinds.
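The four penalized combinations can be written as a lookup keyed on direction and OI state. The state names come from the text; the function name, the lowercase direction labels, and the reason strings are hypothetical, and penalty sizes are not given here so none are modeled.

```python
# (setup direction, OI state) pairs that draw a soft penalty.
OI_PENALTY_REASONS = {
    ("long", "SHORT_COVERING"): "rally is short covering, not new longs",
    ("short", "LONG_CAPITULATION"): "selloff is exhaustion, not new shorts",
    ("long", "REAL_DOWNTREND"): "counter-trend long",
    ("short", "REAL_UPTREND"): "counter-trend short",
}


def gate_3_9(direction, oi_state):
    """Return the penalty reason, or None when OI does not contradict the setup."""
    return OI_PENALTY_REASONS.get((direction, oi_state))
```

A `None` result means the signal continues without an OI penalty; no combination here is a hard block.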

Gate | Name | Rule | Type
3.8 | Whale Defense | Institutional DOM clustering (defense-resistance, defense-support) opposing setup direction → soft penalty. Alignment = research note only. | Soft
3.9 | OI Conviction | Long setup + SHORT_COVERING → "rally is fake." Short setup + LONG_CAPITULATION → "selloff is exhaustion." Long vs REAL_DOWNTREND or short vs REAL_UPTREND → counter-trend penalty. | Soft
Calibration Rationale
  • These are the "bigger fish" gates — if institutional positioning contradicts the setup, the setup is fighting the wrong crowd.
  • Short covering rallies look identical to real buying on the tape, but they die once the shorts finish covering. OI dynamics distinguish real conviction from structural unwind.
  • Soft penalties, not hard blocks, because institutions can be wrong too — and often are at turning points.
Pipeline Gate
Gate 4–8 — Score, Confirmation & Risk
5 gates · retire to RR
G4-8
Final qualification: retirement check, confluence threshold, critical fails, entry timing, and risk/reward validation.
Check · Evaluate · Decide · Record
Core Decision
Setup must pass retirement, confluence, critical-fail, confirmation, and R:R checks to fire.
G4: rolling expectancy < -0.12R (6+ trades) → benched
G5: confluence < ~68% dynamic threshold → soft block
G6: 2+ critical fails (weight ≥18) → hard block
G7: score below bypass threshold (~75–80%) → entry confirmation required
G8: R:R < rrFloor (1.5 HOT / 2.0 AUTOMATED) → BLOCKED
Gate Passes Setup is not retired, confluence meets threshold, no critical failures, R:R is above rrFloor (1.5 HOT / 2.0 AUTOMATED) — eligible for SIGNAL.
Gate Blocks Retirement benches the setup. Critical fails hard-block. R:R below the floor is BLOCKED regardless of score quality.
Retirement & Confluence

Gate 4 (Retirement) checks the setup's rolling expectancy across its last 12 closed trades. Below -0.12R expectancy with at least 6 completed trades — the setup is benched. It continues computing (for substrate observation) but can never fire. Between -0.05R and -0.12R it's flagged as "Degrading" with a -5 priority penalty. Gate 5 (Confluence Threshold) is the raw score check: the setup's weighted confluence must reach a dynamic threshold (~68% base, adjusted by volume regime and qualification profile). BALANCED mode gets a -4 offset, making the threshold more permissive during data collection.
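The retirement rule can be sketched over a list of closed-trade results in R. Treating expectancy as the mean R of the window, and requiring the 6-trade minimum for the "Degrading" flag as well, are assumptions; the 12-trade window and the -0.12R / -0.05R thresholds come from the text. The function name and status strings are hypothetical.

```python
WINDOW = 12          # last N closed trades
MIN_TRADES = 6       # minimum sample before a setup can be benched
BENCH_BELOW = -0.12  # R
DEGRADE_BELOW = -0.05


def retirement_status(closed_trade_r):
    """Classify a setup as ACTIVE, DEGRADING, or BENCHED from its recent R results."""
    window = closed_trade_r[-WINDOW:]
    if len(window) < MIN_TRADES:
        return "ACTIVE"  # not enough data to judge
    expectancy = sum(window) / len(window)  # assumption: mean R per trade
    if expectancy < BENCH_BELOW:
        return "BENCHED"
    if expectancy < DEGRADE_BELOW:
        return "DEGRADING"
    return "ACTIVE"
```

Because only the last 12 trades count, a setup benched on old results can recover once newer trades push the window's mean back above the threshold.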

Critical Fails & Entry Confirmation

Gate 6 (Critical Fails) catches single-point-of-failure conditions. If any individual evaluator with weight ≥18 fails (scores zero), that's a critical fail. One critical fail is logged as a warning. Two or more critical fails trigger a hard block — the thesis has multiple load-bearing pillars that collapsed. Gate 7 (Entry Confirmation) requires setups below the bypass threshold (~75-80%) to show confirming price action, order flow, or liquidity signals before promoting. Above the bypass threshold, the confluence itself is confirmation enough.
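A minimal sketch of the critical-fail count. The weight ≥18 cutoff, the "scores zero" definition of a fail, and the one-warning / two-block rule come from the text; the function name and the `(name, weight, score)` tuple layout are hypothetical.

```python
CRITICAL_WEIGHT = 18


def gate_6(evaluations):
    """evaluations: iterable of (name, weight, score) tuples.

    Returns (verdict, names of the critical evaluators that failed).
    """
    failed = [name for name, weight, score in evaluations
              if weight >= CRITICAL_WEIGHT and score == 0]
    if len(failed) >= 2:
        return "HARD_BLOCK", failed
    if len(failed) == 1:
        return "WARNING", failed
    return "PASS", failed
```

For example, an ib-brk evaluation in which both `ib_one_sided` (22) and `delta_aligned` (18) score zero would hard-block, while a zero on a lighter condition would not count.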

Risk & R:R Tiering

Gate 8 (Risk & R:R) validates the stop/target geometry. R:R is calculated from the planned entry, stop, and target prices. The floor is phase-dependent: 1.5:1 in HOT Run, 2.0:1 in Automated Run — below that, the setup is BLOCKED regardless of confluence. R:R bands (1.5–2.0, 2.0–2.5, 2.5–3.0, 3.0–4.0, 4.0+) are tracked for statistical analysis but don't gate signal states. Calibration tuning clamps dynamic RR to the range 2.5–6.0 (config.js minRRGate/maxRRGate). GEX modifier (2026-05-21): dynamic R:R adjusts ±0.40R max based on options gamma exposure — LONG near a hard call wall (+0.25), SHORT near hard put wall (+0.25), COILED regime favors breakouts (-0.15), PINNED regime favors mean-reversion (-0.10), CASCADE regime penalizes fades (+0.40). Research-constrained: GEX modifiers only fire in CALM/NORMAL VIX (FlashAlpha 8yr: GEX adds zero in elevated/stressed VIX).
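The R:R check and the GEX clamp can be sketched as follows. The phase floors, the ±0.40R cap, and the CALM/NORMAL VIX restriction come from the text. The function names are hypothetical, and combining several GEX modifiers by summing them before clamping is an assumption.

```python
RR_FLOOR = {"HOT": 1.5, "AUTOMATED": 2.0}
GEX_MAX_ADJUST = 0.40


def reward_to_risk(entry, stop, target):
    """R:R from planned prices; works for longs and shorts."""
    risk = abs(entry - stop)
    if risk == 0:
        raise ValueError("stop must differ from entry")
    return abs(target - entry) / risk


def gate_8(entry, stop, target, phase):
    """BLOCKED when R:R is below the phase floor, regardless of confluence."""
    rr = reward_to_risk(entry, stop, target)
    return "PASS" if rr >= RR_FLOOR[phase] else "BLOCKED"


def gex_adjustment(modifiers, vix_regime):
    """Total GEX adjustment in R, clamped to +/-0.40; zero outside CALM/NORMAL VIX."""
    if vix_regime not in ("CALM", "NORMAL"):
        return 0.0
    total = sum(modifiers)  # assumption: modifiers are additive
    return max(-GEX_MAX_ADJUST, min(GEX_MAX_ADJUST, total))
```

The same plan can pass in HOT Run and be blocked in Automated Run: a 2-point stop with a 3-point target is exactly 1.5:1.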

Gate | Name | Rule | Type
4 | Retirement | Setup's rolling expectancy below -0.12R → benched. Keeps computing, can't fire. | Soft
5 | Confluence Threshold | Score must reach dynamic threshold (base ~68%, STRICT offset 0, BALANCED offset -4). Gap logged if below. | Soft
6 | Critical Fails | Any individual evaluation with weight ≥18 that fails. One = logged. Two or more = blocked. | Hard*
7 | Entry Confirmation | If score below bypass threshold (~75–80%), requires price action / order flow / liquidity confirmation. | Soft
8 | Risk & R:R Tier | Validates stop/target feasibility. Below rrFloor (1.5 HOT / 2.0 AUTOMATED) = BLOCKED. R:R bands tracked for statistics. | Hard
Calibration Rationale
  • Retirement prevents the engine from throwing good money after bad.
  • The critical-fail gate (weight ≥18) catches single-point-of-failure conditions — if one heavyweight check fails, the whole thesis is suspect regardless of total score.
  • R:R tiering means even a high-confluence signal doesn't fire if the risk/reward math doesn't work — this is where the engine enforces the asymmetry that makes edge-based trading viable.
Pipeline Gate
Gate 9 / 9b / 10 — Advisory (Phase A Off)
3 gates · advisory only
ADV
Computed but not enforced. Will block signals when switched on after calibration proves them.
Check · Evaluate · Decide · Record
Core Decision
Advisory gates log what they would block but do not enforce: evidence before enforcement.
G9: R:R > NQ 2.5:1 / ES 4.0:1 → would hard-block (advisory)
G9b: direction violation → would hard-block · style violation → would soft-block (advisory)
G10: permanently disabled; 840-trade OOS showed it destroyed value
Gate Passes All advisory gates currently pass by default — they compute and log but never block during Phase A.
Gate Blocks No enforcement in Phase A. When activated post-calibration, G9 and G9b would hard/soft block as documented.
R:R Inflation Cap

Gate 9 (R:R Inflation Cap) addresses a structural bias in the engine's trade planner: while historical resolved trades show a median R:R of ~1.5:1, the planner consistently produces 4.5:1 median projections. The cap would hard-block any signal where the projected R:R exceeds 2.5:1 for NQ or 4.0:1 for ES. Currently advisory-only: the engine logs what it would have blocked, but doesn't enforce, pending calibration data proving that the cap improves outcomes rather than just filtering high-conviction setups.

Regime Fit Gate & Session Router

Gate 9b (Regime Fit Gate) layers a higher-timeframe regime check on top of Gate 3's style compatibility. It reads the daily regime bias and applies two sub-checks: a direction violation (shorting in a trend_up day) would be a hard block; a style violation (breakout setup in a range day) would be a soft block. Like Gate 9, it's advisory-only pending data proof. Gate 10 (Session Router) is permanently disabled. Out-of-sample validation on 840 historical trades showed it blocked profitable families more often than unprofitable ones — it destroyed net value. Kept in the codebase as an architectural placeholder and a reminder that "intuitive" filters can fail empirical validation.

Gate | Name | Rule | Type
9 | R:R Inflation Cap | NQ capped at 2.5:1, ES at 4.0:1. Historical median RR was 1.5:1 but planner produces 4.5:1 median. Would hard-block inflated projections. | Off
9b | Regime Fit Gate | Day regime bias + style constraints. Short setup in trend_up → direction violation. Breakout in range → style violation. Direction = hard, style = soft (when enforced). | Off
10 | Session Router | Permanently disabled. OOS validation on 840 trades showed it blocked profitable families. Kept as architectural placeholder. | Off
Calibration Rationale
  • Phase A philosophy: observe before enforcing. These gates log advisory verdicts so we can see what they would have blocked.
  • When the data proves a gate adds edge, it gets switched on.
  • Gate 10 was explicitly rejected — 840-trade OOS validation showed it destroyed value. Ideas that sound right but lose money get killed, not debated.
Pipeline Gate
Final Decision States
6 states · SIGNAL to fire
DEC
What the pipeline produces after all gates run.
Check · Evaluate · Decide · Record
Core Decision
Pipeline resolves each evaluation into one of six discrete states; only SIGNAL can fire.
SIGNAL: all gates pass + R:R confirmed → can fire
ARMED: strong confluence, waiting one condition → auto-promotes if resolved
BLOCKED: hard gate failed → structural rejection
Gate Passes SIGNAL state reached — the setup fires. All six states plus the determining gate are written to the substrate.
Gate Blocks ARMED, WATCH, BLOCKED, CONFLICT, or PAUSED — each conveys why the signal didn't fire for trader review.
Signal & Armed States

After every gate has executed, the pipeline resolves the signal into one of six discrete states. SIGNAL means every gate passed and R:R is confirmed — this is the only state that can fire. ARMED means confluence is strong and most gates passed, but the signal is waiting for one remaining condition (typically R:R confirmation or entry timing). Armed signals are near-miss candidates: if the pending condition resolves, the signal promotes automatically on the next evaluation cycle.

Non-Firing States

WATCH is the passive observation state — the setup is being evaluated but isn't close enough to actionability. BLOCKED means at least one hard gate failed — structural rejection with no workaround. CONFLICT is a distinct state from BLOCKED: it means the directional evidence is genuinely split, and the engine refuses to pick a side. PAUSED is operator-initiated — the trader manually paused this symbol via the dashboard's per-symbol pause button. All six states, along with the specific gate that determined them, are written to the substrate for every evaluation cycle. The trader sees all of this on the dashboard and can act on contextual nuance the gate system can't encode.

Decision State Matrix
State | Can Fire | Meaning
SIGNAL | YES | All gates pass, R:R confirmed
ARMED | WATCH | Strong confluence, waiting R:R or confirm
WATCH | NO | Watching, not actionable
BLOCKED | NO | Hard gate failed
CONFLICT | NO | Directional conflict
PAUSED | NO | Trader paused this symbol
Calibration Rationale
  • Six states, not two. Binary pass/fail loses information.
  • ARMED means "this was close — if one condition flips, it fires." CONFLICT means "the evidence is split."
  • The granularity exists because the human trader needs to know WHY something didn't fire, not just that it didn't — the context informs manual decisions and post-session review.
Scoring Framework
Global Confluence Categories
6 categories · 100 pts
CFG
Every setup scores from the same 6-category budget. Each category has a fixed weight — conditions within it divide those points.
Category Weight Score Threshold
Core Decision
Every setup scores against a fixed 100-point budget divided into six categories — same budget, different distributions per thesis.
Budget: Location 25 + Regime 20 + Trigger 20 + Flow 15 + R:R 10 + Risk 10 = 100
Pass Setup scores above confluence threshold — qualifies for signal generation and trader presentation.
Fail Setup scores below threshold — filtered out, logged to substrate for analysis only.
Budget Architecture

Every setup in the engine — regardless of family — scores against the same 100-point budget divided into six categories. The budget is fixed but the distribution is not: each setup type allocates different weights to different conditions within each category, reflecting what matters for that specific thesis. An IB breakout weights trigger conditions heavily; an exhaustion reversal weights flow divergence conditions.

Category Rationale

Location receives the largest allocation (25 points) because where price sits relative to key structural levels is the single strongest predictor of whether a setup succeeds. A perfect trigger at a meaningless level is noise; a mediocre trigger at a critical level is a trade. Regime (20) and Trigger (20) share the next tier — regime ensures the market environment supports the thesis, while the trigger is the specific event that creates the opportunity. Flow (15) confirms the thesis with real-time order flow evidence. R:R (10) scores the mathematical quality of the stop/target geometry. Risk Filters (10) are binary safety checks — session timing, stop-hunt clearance, book stability — that don't generate edge but prevent easily avoidable losses.
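The fixed budget can be checked mechanically. The sketch below verifies that a weight table sums to exactly 100 and scores a set of passed items against it. Scoring whole categories as all-or-nothing units is a simplification for illustration, since the per-condition weights for each setup are not listed here; the function names are hypothetical.

```python
# Category budget from the text.
CATEGORY_BUDGET = {
    "Location": 25,
    "Regime": 20,
    "Trigger": 20,
    "Flow": 15,
    "R:R": 10,
    "Risk Filters": 10,
}


def check_budget(weights):
    """Raise if a weight table does not sum to exactly 100 points."""
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights sum to {total}, expected 100")


def confluence_score(weights, passed):
    """Sum the weights of the items that passed (binary scoring assumed)."""
    return sum(w for name, w in weights.items() if name in passed)


check_budget(CATEGORY_BUDGET)  # 25 + 20 + 20 + 15 + 10 + 10 == 100
```

The same `check_budget` call could be run against each setup's own condition weights, which is what makes scores comparable across families.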

Regime 20 · Location 25 · Trigger 20 · Flow 15 · R:R 10 · Risk 10
Category Breakdown
Category | Weight | Conditions
Regime | 20 pts | htf_aligned, ib_tight, cross_aligned, all_four, session_match
Location | 25 pts | vwap_side, vwap_extreme, key_level_near, gap_confirms, naked_vpoc, three_at_level
Trigger | 20 pts | ib_one_sided, mss_confirmed, sweep_reclaim, mkt_order_tempo, first_hour_momentum
Flow | 15 pts | delta_aligned, delta_divergence, dom_aligned, book_flow_sync, cvd_aligned, absorption
R:R | 10 pts | (computed from stop/target geometry)
Risk Filters | 10 pts | stop_hunt_clear, book_stability, prime_window, not_lunch, session_match
Calibration Rationale
  • Location gets the most weight (25) because where price is relative to key levels matters more than any single trigger — this is the "where" that defines the trade.
  • Flow gets less (15) because it confirms but doesn't initiate.
  • Risk filters (10) are binary safety checks, not edge generators.
  • Every setup sums to exactly 100 points — different distributions, same budget — ensuring apples-to-apples comparison across families.
Qualification Mode
Qualification Profiles
STRICT vs BALANCED
QP
Two modes that control how many setups can fire and how selective the engine is.
Category Weight Score Threshold
Core Decision
STRICT maximizes signal quality; BALANCED maximizes data collection — Phase A runs BALANCED deliberately.
STRICT: 1 setup/sym · R:R ≥ 2.2 · offset 0 · hard blocks
BALANCED: 2 setups/sym · R:R ≥ 1.8 · offset -4 · soft penalties
Pass Setup meets profile requirements — proceeds through pipeline to signal generation.
Fail Setup rejected by profile constraints — hard block (STRICT) or downgraded score (BALANCED).
Profile Mechanics

The engine operates in one of two qualification profiles that control selectivity across the entire pipeline. STRICT is sniper mode: one setup per symbol, minimum 2.2:1 R:R, no score offset, and regime/playbook mismatches are hard blocks. BALANCED is scouting mode: two setups per symbol can compete, 1.8:1 R:R floor, -4 score offset (lowering the effective confluence threshold), and regime/playbook mismatches impose soft penalties rather than hard blocks.

Phase A Strategy

Phase A (current) runs BALANCED to maximize data collection. The engine intentionally fires more signals to build a statistical sample for each setup type. Once enough data accumulates to reliably distinguish setup quality, the profile will shift to STRICT — fewer signals, higher average quality, lower noise. The -4 score offset in BALANCED mode means a setup that needs 68 in STRICT only needs 64 in BALANCED. This isn't a quality compromise — it's deliberate observational permissiveness.
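The two profiles map naturally onto a small immutable record. The parameter values are from the text; the class and field names are hypothetical, and the base threshold is taken as a flat 68 with no volume-regime adjustment.

```python
from dataclasses import dataclass

BASE_THRESHOLD = 68  # approximate base confluence threshold


@dataclass(frozen=True)
class QualificationProfile:
    name: str
    setups_per_symbol: int
    min_rr: float
    score_offset: int
    mismatch_is_hard_block: bool  # regime / playbook gates


STRICT = QualificationProfile("STRICT", 1, 2.2, 0, True)
BALANCED = QualificationProfile("BALANCED", 2, 1.8, -4, False)


def effective_threshold(profile):
    """Confluence score a setup must reach under the given profile."""
    return BASE_THRESHOLD + profile.score_offset
```

Switching from BALANCED to STRICT after Phase A then becomes a one-line change of the active profile rather than edits scattered across gates.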

Profile Parameters
Parameter | STRICT | BALANCED
Setups / symbol | 1 | 2
Min R:R | 2.2 | 1.8
Score offset | 0 | -4
Regime gate | Hard block | Soft (downgrade)
Playbook gate | Hard block | Soft (downgrade)
Calibration Rationale
  • STRICT optimizes for signal quality at the cost of sample size.
  • BALANCED optimizes for learning at the cost of signal purity.
  • Phase A needs volume — you can't evaluate what you don't fire.
  • The trade-off is explicit and temporary.
Setup Family
Structural Breakout Family
3 setups · continuation
F1
IB breakout, IB extension, breakout retest — continuation through structure.
Thesis Trigger Confluence Qualification Signal
Core Decision
The initial balance is the structural reference — breakout, extension, and retest each weight different evidence for the same directional thesis.
ib-brk: ib_one_sided 22 + delta_aligned 18
ib-ext: ib_one_sided 20 + delta_aligned 20 + mkt_order_tempo 14
brk-ret: sweep_reclaim 20 + key_level_near 18 + absorption 14
High Confluence: One-sided IB with confirmed directional flow and structural alignment, an institutional continuation trade.
Low Confluence: Balanced IB or weak flow; the breakout is noise, not institutional intent.
Breakout & Extension

The Structural Breakout family trades the initial balance (IB — the range formed in the first hour of RTH) as a structural reference. IB Breakout (ib-brk) fires when price decisively exits the IB range. Its heaviest weight is ib_one_sided (22) — was the IB dominated by one side? A tight, one-sided IB that breaks out is institutional intent; a wide, balanced IB that breaks randomly is noise. IB Extension (ib-ext) fires after the breakout holds and extends. It shifts weight from the trigger to flow confirmation: delta_aligned (20) and mkt_order_tempo (14) together require proven, sustained flow in the breakout direction.

Breakout Retest

Breakout Retest (brk-ret) is the family's highest-conviction variant. It fires when a breakout level is retested and holds. The weight profile flips entirely: sweep_reclaim (20) and key_level_near (18) replace ib_one_sided — the thesis is no longer "the breakout happened" but "the breakout survived its first challenge." Absorption (14) confirms that aggressive selling at the retest level was absorbed by passive buyers. The not_lunch filter (8) exists because lunch-hour retests frequently fail due to thin liquidity, not structural weakness.

Condition Weights
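Condition | ib-brk | ib-ext | brk-ret
ib_one_sided | 22 | 20 | -
delta_aligned | 18 | 20 | 16
vwap_side | 12 | 10 | -
cross_aligned | 12 | 10 | 12
mkt_order_tempo | 8 | 14 | -
sweep_reclaim | - | - | 20
key_level_near | - | - | 18
absorption | - | - | 14
stop_hunt_clear | 8 | 8 | 12
gap_confirms | - | 12 | -
ib_tight | 10 | - | -
prime_window | 6 | - | -
session_match | 4 | 6 | -
not_lunch | - | - | 8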
Calibration Rationale
  • IB breakout weights one-sided action (22) because that's the primary signal — the IB was dominated by one side.
  • Extension shifts to delta (20) + tempo (14) because it's a follow-through trade; you need proven flow continuation.
  • Breakout retest flips to sweep_reclaim (20) because the thesis is "breakout held, retested, and reclaimed" — the trigger IS the level reclaim.
Setup Family
Mean Reversion Family
4 setups · reversion
F2
IB fade, VWAP bounce, VWAP deviation snap, value area fade — counter-move plays.
Thesis Trigger Confluence Qualification Signal
Core Decision
Price moved too far from fair value — divergence and absorption confirm the move is running out of fuel.
vwap-dev: vwap_extreme 24 + delta_divergence 20
vwap-bnc: vwap_bounce_zone 24 + delta_divergence 18
ib-fade: delta_divergence 20 + ib_one_sided 18
va-fade: naked_vpoc 18 + delta_divergence 18
High Confluence: Flow diverges from price at an extreme level with absorption, a high-probability snap back to fair value.
Low Confluence: No divergence or absorption; this is catching a falling knife, not a mean reversion.
Family Thesis

Mean reversion setups bet that price has moved too far from fair value and will snap back. The family shares two dominant signals: delta divergence (flow disagreeing with price direction) and absorption (price stopping despite aggressive hitting). These two conditions together answer the question: "Is the move running out of fuel?"

Setup Variants

VWAP Deviation Snap (vwap-dev) has the family's most concentrated weight: vwap_extreme at 24 points. The entire thesis is "price extended ≥2σ from VWAP — mean reversion probability is high." Without extreme VWAP deviation, the setup doesn't exist. VWAP Bounce (vwap-bnc) similarly loads 24 on vwap_bounce_zone — the trade IS the bounce at VWAP. IB Fade (ib-fade) combines divergence (20) with one-sided IB (18) — the IB pushed hard one way, but flow says the push is exhausted. Value Area Fade (va-fade) anchors on naked VPOC (18) and key levels (14) — fading from a value area boundary that hasn't been visited yet is a high-quality mean-reversion thesis because the unvisited VPOC acts as a magnet.

Condition Weights
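Condition | ib-fade | vwap-bnc | vwap-dev | va-fade
delta_divergence | 20 | 18 | 20 | 18
absorption | 16 | 14 | 16 | 16
vwap_side/extreme | 14 | - | 24 | -
vwap_bounce_zone | - | 24 | - | -
ib_one_sided | 18 | - | - | -
naked_vpoc | - | - | - | 18
key_level_near | - | 12 | 12 | 14
cross_aligned | 12 | 12 | 10 | 12
stop_hunt_clear | 12 | 10 | 10 | 12
not_lunch | 8 | 10 | 8 | 10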
Calibration Rationale
  • Reversion trades live or die by divergence and absorption — without them, you're catching a falling knife.
  • VWAP deviation snap loads 24 on vwap_extreme because the entire thesis is "price extended too far from fair value."
  • VWAP bounce puts 24 on bounce_zone — the trade is the zone itself.
  • These concentrated weights ensure the setup can't fire if its core thesis is absent.
Setup Family
Trap & Reversal Family
3 setups · reversal
F3
IB rejection, trap & reverse, exhaustion reversal — failed moves become setups.
Thesis Trigger Confluence Qualification Signal
Core Decision
Failed moves trap participants on the wrong side — their forced exits fuel the reversal.
exhaust: delta_divergence 26 + key_level_near 20 + absorption 18
trap-rev: sweep_reclaim 20 + delta_divergence 18 + absorption 16
ib-rej: delta_divergence 20 + ib_one_sided 18 + sweep_reclaim 18
High Confluence: Flow exhaustion confirmed at a key level with absorption; trapped participants will fuel the reversal.
Low Confluence: No divergence or absorption proof; the move may be a legitimate trend, not a trap.
Exhaustion Reversal

This family trades failed moves: situations where a directional push exhausted itself, trapping participants on the wrong side. The thesis is contrarian, because the losers' forced exits fuel the reversal. Exhaustion Reversal (exhaust) carries the heaviest single-condition weight among the four core families, delta_divergence at 26; only the Phase G+H specialist triggers run higher. The signal is unambiguous: aggressive flow pushed price to a level (key_level_near at 20), but the flow is dying (divergence) while passive absorption (18) holds the level. Three independent lines of evidence converge on one conclusion: the move is over.

Trap & IB Rejection

Trap & Reverse (trap-rev) leads with sweep_reclaim (20) — the failed breakout IS the trade. Price pushed past a level, swept stops, and reclaimed. The trapped participants are now underwater, and their forced exits become your fuel. IB Rejection (ib-rej) combines one-sided IB (18) with divergence (20) — the IB pushed hard one way but the rejection needs both: the setup (the push) and the proof (the flow reversal). Session_match (6) gives a small bonus because IB rejections are most reliable when they happen during the approved session for the setup family.

Condition Weights
Condition | ib-rej | trap-rev | exhaust
delta_divergence | 20 | 18 | 26
sweep_reclaim | 18 | 20 | -
absorption | 14 | 16 | 18
ib_one_sided | 18 | - | -
key_level_near | - | 14 | 20
cross_aligned | 14 | 12 | 12
stop_hunt_clear | 10 | 12 | 14
session_match | 6 | - | -
not_lunch | - | 8 | 10
Calibration Rationale
  • Exhaustion reversal loads 26 on divergence because the entire thesis is "flow dried up at a level." It's the strongest single-condition conviction among the core families; only the Phase G+H specialist triggers weigh more.
  • Trap & reverse leads with sweep_reclaim (20) because the failed breakout IS the trade — you need proof that the move was a trap.
  • IB rejection combines the push (18) with the proof (20) — you need both.
Setup Family
Order Flow & Level Family
3 setups · flow-driven
F4
Flow surge, key level magnet, volume migration follow — flow-driven entries.
Thesis Trigger Confluence Qualification Signal
Core Decision
The order book tells a clear directional story — institutional flow events or structural level magnetism drive entry.
flow-surge: delta_aligned 24 + mkt_order_tempo 20 + dom_aligned 14
key-mag: naked_vpoc 22 + key_level_near 18 + absorption 16
vol-mig: naked_vpoc 20 + delta_aligned 18 + mkt_order_tempo 16
High Confluence: Overwhelming institutional flow or strong level magnetism with book confirmation; ride the directional wave.
Low Confluence: Weak flow or no structural magnet; chasing momentum without confirmation is suicidal.
Flow Surge

This family trades institutional flow events — situations where the order book tells a clear directional story. Flow Surge (flow-surge) is the engine's most aggressive setup, requiring overwhelming unidirectional flow: delta_aligned at 24 + mkt_order_tempo at 20. Together, these two conditions require both the magnitude (massive delta in one direction) AND the participation (heavy institutional trade count). DOM alignment (14) adds a third confirmation: the depth-of-market book structure should support the direction. This is a momentum trade — ride the institutional wave.

Level Magnet & Volume Migration

Key Level Magnet (key-mag) trades the pull toward an unvisited structural level. Its heaviest weight is naked_vpoc (22) — the unvisited Volume Point of Control is the magnet itself, a price level where significant volume transacted previously but the current session hasn't reached yet. Key_level_near (18) and absorption (16) add structural and flow confirmation. Volume Migration Follow (vol-mig) tracks when the VPOC physically migrates — it shifts (20) toward a new level, and the engine follows with delta confirmation (18) and cross-market alignment (14). This is a trend-following variant: the auction itself is voting on the new fair value.

Condition Weights
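Condition | flow-surge | key-mag | vol-mig
delta_aligned | 24 | 14 | 18
mkt_order_tempo | 20 | - | 16
naked_vpoc | - | 22 | 20
key_level_near | 10 | 18 | 12
dom_aligned | 14 | - | -
absorption | - | 16 | -
vwap_side | 12 | - | -
cross_aligned | 10 | 12 | 14
stop_hunt_clear | 10 | 10 | 12
not_lunch | - | 8 | 8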
Calibration Rationale
  • Flow surge needs 44 combined points from delta + tempo because it's the riskiest setup type — chasing momentum without extreme flow confirmation is suicidal.
  • Key level magnet leads with naked_vpoc (22) because the unvisited VPOC is the magnet — without it, there's nothing to be attracted to.
  • Volume migration follows the market's own vote on new fair value.
Setup Family
Dynamic / Phase G+H Family
10 setups · specialist
F5
Phase G: London sweep, gap fade, overnight continuation, SMT divergence. Phase H: delta divergence reversal, opening shock reversal, whale cluster pullback, tape climax exhaustion, settlement magnet, pre-event compression.
Thesis Trigger Confluence Qualification Signal
Core Decision
Specialist plays defined by unique structural triggers — each setup IS its trigger, carrying the highest single-condition weights (22-30).
gap-fade: gap_inside_range 30 + first_5min 25
ldn-sweep: london_sweep 28 + ny_session 18
ovn-cont: overnight 28 + first_30min 22 + at_ovn_vwap 18
smt-lag: smt 25 + delta_aligned 18
delta-div: delta_divergence 28 + key_level_near 18 + absorption 16
open-shock: open_shock_failed 25 + open_shock_extreme 22 + delta_aligned 16
whale-pull: absorption 28 + dom_aligned 18 + delta_aligned 16
tape-climax: delta_divergence 24 + mkt_order_tempo 20 + absorption 18
settle-mag: near_settlement 30 + afternoon_window 22 + delta_aligned 16
fomc-comp: pre_event_day 22 + ib_tight 20 + delta_aligned 18
High Confluence: Unique structural trigger fires with session and flow confirmation; specialist edge with concentrated thesis.
Low Confluence: Core trigger absent; the setup literally doesn't exist without its defining condition.
Specialist Thesis

Phase G+H setups are specialist plays defined by unique structural triggers that don't exist in the core families. They carry the highest single-condition weights in the system (22-30 points) because each setup is its trigger — without the specific condition, the setup literally doesn't exist.

Phase G Variants

Gap Halfback Fade (gap-fade) loads 30 on its trigger (gap_inside_range) and 25 on first_5min — together these two conditions comprise 55% of the setup's total budget. The thesis: when the market opens with a gap that falls inside yesterday's range, and the first 5 minutes show reversal, the gap will fill at least 50% (halfback). London Sweep (ldn-sweep) at 28 triggers on the London session's characteristic stop-hunting pattern: price pushes above/below the Asian range to sweep stops, then reverses into the NY open. Overnight Continuation (ovn-cont) at 28 uses overnight VWAP (at_ovn_vwap, 18) as its anchor — the thesis is that the overnight direction established by Asia/London will continue into NY. SMT Divergence (smt-lag) at 25 fires when NQ and ES diverge structurally — one makes a new high/low while the other doesn't confirm, suggesting the leader is trapping participants.

Phase H Variants

  • Delta Divergence Reversal (delta-div), at 28, fires when cumulative delta diverges from price at a key level: flow says "no" while price says "yes." No flow condition in the system carries a higher weight.
  • Opening Shock Reversal (open-shock) combines extreme opening-range expansion (22) with failed continuation (25): if the first minutes spike violently but can't sustain, the reversion trade has structure. It is restricted to the US session.
  • Whale Cluster Pullback (whale-pull) leads with absorption (28). The thesis is that visible institutional defense at a level creates a pullback anchor.
  • Tape Climax Exhaustion (tape-climax) loads delta_divergence (24) + mkt_order_tempo (20): the market is hitting hard but getting nowhere, so tempo spikes are terminal rather than sustaining.
  • Settlement Magnet (settle-mag) loads near_settlement (30) as a pure distance-based magnet play in the afternoon session, when price gravitates toward settlement in the last hours. It has no TP1/TP2 and is managed by distance to target.
  • Pre-Event Compression (fomc-comp), at 22, requires pre_event_day from the economic calendar (FOMC/CPI/NFP) plus a tight IB (20). The thesis is that pre-event compression resolves directionally once the event arrives. It is powered by data-economic-calendar.js, which holds 63 confirmed events through mid-2027.

Condition Weights — Phase G
Conditionldn-sweepgap-fadeovn-contsmt-lag
london_sweep / gap / overnight / smt28302825
delta_aligned14141218
ny_session / first_5min / first_30min182522
absorption81012
prime_window / at_ovn_vwap1018
cross_aligned868
mkt_order_tempo710
stop_hunt_clear7768
vwap_side812
session_match67
not_lunch8
Condition Weights — Phase H
Conditiondelta-divopen-shockwhale-pulltape-climaxsettle-magfomc-comp
Primary trigger282528243022
Secondary trigger182218202220
delta_aligned / divergence16161618
absorption161218
key_level_near18121412
cross_aligned1210881014
stop_hunt_clear1088101010
Calibration Rationale
  • These setups have the highest single-condition weights in the system (22-30) because each one is defined by its unique trigger.
  • Phase G: 4 specialist plays — gap, London sweep, overnight carry, SMT divergence. Each one IS its trigger.
  • Phase H: 6 microstructure plays — delta divergence, opening shock, whale pullback, tape climax, settlement magnet, pre-event compression. Researched and queued during BUILD PHASE.
  • Settlement magnet and pre-event compression use new data sources: near_settlement (distance-to-settlement evaluator) and pre_event_day (powered by data-economic-calendar.js with 63 confirmed FOMC/CPI/NFP/GDP/PCE events through mid-2027).
Setup Quarantine
Quarantine & Session Map
0 benched · 3 sessions
QS
Which setups are benched, and which sessions allow which families.
Category Weight Score Threshold
Core Decision
Quarantine is data-driven — each benched setup demonstrated structural failure, not a bad streak. Session playbooks map approved families to session liquidity profiles.
Quarantine: objective performance collapse → manual unblock only
Retirement: auto-bench at -0.12R expectancy → auto-rehabilitate
Pass: Setup not quarantined and approved for current session; proceeds to scoring pipeline.
Fail: Setup quarantined or session-blocked; still computes and writes to substrate but never fires a signal.
Quarantine List

The quarantine list is the engine's holding pen for setups that failed in live data. BUILD PHASE (2026-05-23): all 8 previously quarantined setups have been reinstated. The engine was declared structurally broken on 2026-05-22 — all prior data is invalidated. Fresh data collection will re-evaluate every setup (now 50 total, including 12 new Phase H entries) on equal footing. Historical quarantine reasons preserved as comments in config.js for reference. The distinction between quarantine and retirement remains: retired setups auto-bench at -0.12R expectancy and can auto-rehabilitate; quarantined setups require manual operator intervention.

Session Playbook

The session playbook maps each trading session to its approved setup families. Asia (pre-London, low participation) permits only level-based plays: key-mag, exhaust-rev, vpoc-mig. Breakout setups are banned — thin Asia liquidity produces false breakouts. London permits trap-based plays: ib-reject, trap-rev — because the London open characteristically sweeps Asia stops. NY gets the full momentum arsenal: ib-brk, ib-ext, flow-surge — because NY has the institutional flow to sustain breakouts.

Quarantined Setups
Quarantined | Reason
None | All quarantines lifted for BUILD PHASE (2026-05-23). Prior data invalidated by engine restructure. Fresh data collection will determine new quarantine candidates.
Session Approved Families
Session | Approved Families
Asia | key-mag, exhaust-rev, vpoc-mig
London | ib-reject, trap-rev
NY | ib-brk, ib-ext, flow-surge
Calibration Rationale
  • The quarantine list is data-driven, not opinion-driven. Each benched setup was quarantined after demonstrating structural failure, not a bad streak.
  • The session playbook is calibrated from historical outcome data grouped by session — Asia breakouts fail because there's no institutional flow to sustain them, not because breakouts are bad setups.
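The session gate and the retirement trigger described in this card can be sketched together. This is an illustration under assumptions, not the contents of config.js: the `SESSION_PLAYBOOK` shape, the function names, and the choice to treat an expectancy exactly at -0.12R as benched are all hypothetical. The rehabilitation rule is not specified here, so it is left out.

```javascript
// Approved setups per session, copied from the session table.
const SESSION_PLAYBOOK = {
  Asia: ["key-mag", "exhaust-rev", "vpoc-mig"],
  London: ["ib-reject", "trap-rev"],
  NY: ["ib-brk", "ib-ext", "flow-surge"],
};

const RETIREMENT_EXPECTANCY_R = -0.12;

// A setup may fire only if it is not quarantined and the current
// session approves it. Blocked setups still compute, they just never fire.
function canFire(setupId, session, quarantined = new Set()) {
  if (quarantined.has(setupId)) return false;
  const approved = SESSION_PLAYBOOK[session] || [];
  return approved.includes(setupId);
}

// ASSUMPTION: the auto-bench comparison is inclusive at -0.12R.
function shouldAutoBench(expectancyR) {
  return expectancyR <= RETIREMENT_EXPECTANCY_R;
}

console.log(canFire("ib-brk", "NY")); // true
console.log(canFire("ib-brk", "Asia")); // false
console.log(shouldAutoBench(-0.2)); // true
```

With the quarantine list currently empty, the session map is the only gate in this sketch that can block a setup.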
Liquidity Analysis
Liquidity Target Scoring
7 types · 44pt proximity cap
LIQ
How the engine scores potential sweep targets by type, proximity, and session.
Category Weight Score Threshold
Core Decision
Each liquidity target gets a composite score from type base + proximity adjustment + session bonus — producing a live priority queue of the "hottest" sweep targets.
Score: baseScore + max(0, 44 - distTicks × 0.55) + sessionBonus
Session: +8 when target belongs to current session
High Confluence: Session extreme nearby with session match, the highest-priority sweep target in the live queue.
Low Confluence: Range target far away; effectively invisible to the current move, deprioritized.
Type Hierarchy

The engine maintains a real-time ranked list of liquidity targets — price levels where stop orders are likely to cluster. Each target receives a composite score combining three factors: type base score, proximity adjustment, and session context bonus. The type hierarchy reflects institutional behavior: session extremes (Asia/London/OR highs and lows) score highest (34) because they're the most visible stop-cluster locations — virtually every retail trader places stops outside them. Prior-day extremes (PDH/PDL at 32) serve a similar role at daily scale.

Proximity & Session Modifiers

The proximity formula adds up to 44 points based on how close the target is: max(0, 44 − distTicks × 0.55). A target at the current price gets +44, a target 40 ticks away gets +22, and a target 80 or more ticks away gets nothing. The decay rate of 0.55 per tick was calibrated to match the typical effective range of institutional sweep operations: stops too far away aren't getting swept in the current move. A session match bonus of +8 rewards targets that belong to the current session (for example, the Asia high targeted during the Asia session) because same-session extremes are the freshest, most-watched levels. Range targets score only 6 base and take an additional -20 penalty, for an effective -14 before proximity and session adjustments, because intraday ranges are weak references: everyone sees them, but they lack the institutional significance of session extremes. Equal highs/lows take a smaller -8 penalty on their base of 16.
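The composite score is simple enough to write out. The sketch below is a minimal version that assumes the type penalties from the modifiers table are added to the same sum as the base score, proximity, and session bonus; the type keys and function names are hypothetical.

```javascript
// Base scores and penalties from the tables in this card; keys are illustrative.
const BASE_SCORE = {
  SESSION_EXTREME: 34, // Asia / London / OR high-low
  PDH_PDL: 32,
  VALUE_AREA: 26,
  VPOC: 24,
  VWAP: 22,
  EQUAL_HL: 16,
  RANGE: 6,
};
const TYPE_PENALTY = { RANGE: -20, EQUAL_HL: -8 };
const SESSION_BONUS = 8;

// Up to 44 points, decaying 0.55 per tick, floored at zero.
function proximityBonus(distTicks) {
  return Math.max(0, 44 - distTicks * 0.55);
}

// ASSUMPTION: penalties are additive with the other three terms.
function scoreTarget({ type, distTicks, sessionMatch }) {
  return (
    BASE_SCORE[type] +
    proximityBonus(distTicks) +
    (sessionMatch ? SESSION_BONUS : 0) +
    (TYPE_PENALTY[type] || 0)
  );
}

// Highest score first: the live priority queue of sweep targets.
function rankTargets(targets) {
  return [...targets].sort((a, b) => scoreTarget(b) - scoreTarget(a));
}

const queue = rankTargets([
  { id: "range-low", type: "RANGE", distTicks: 80, sessionMatch: false },
  { id: "asia-high", type: "SESSION_EXTREME", distTicks: 0, sessionMatch: true },
  { id: "pdh", type: "PDH_PDL", distTicks: 40, sessionMatch: false },
]);
console.log(queue.map((t) => t.id)); // asia-high, pdh, range-low
```

A same-session extreme at the current price reaches 34 + 44 + 8 = 86, while a distant range target stays at -14 and sinks to the bottom of the queue.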

Base Scores by Type
Target Type | Base Score
Asia/London/OR High-Low | 34
PDH / PDL | 32
Value Area | 26
VPOC | 24
VWAP | 22
Equal Highs/Lows | 16
Range | 6
Score Modifiers
Modifier | Effect
Proximity | max(0, 44 - distTicks × 0.55)
Session match bonus | +8 (Asia target in Asia, etc.)
Range penalty | -20
Equal H/L penalty | -8
Calibration Rationale
  • Session extremes score highest because they're the most obvious stop-cluster locations — every retail trader sets stops outside them.
  • PDH/PDL are close behind for the same reason at daily scale.
  • Range gets only 6 (effectively -14 after penalty) because it's a weak, overused reference.
  • Proximity degrades at 0.55/tick — a target 80 ticks away is effectively invisible to the current move.
  • The scoring produces a live priority queue: the engine always knows which liquidity target is the "hottest" right now.
Order Flow Engine
Order Flow State Machine
9 states · 15–82 conf
OF
9 states evaluated in priority order. First match wins. Each state carries a quality rating and confidence score.
Detect Classify Confirm Score
Core Decision
Priority-first state classification: first matching state wins, no further evaluation.
Priority: Stale → Trap → Initiative → CVD Div → Absorption → Exhaustion → Delta NC → Noisy
Confidence: range 15–82, first match takes slot
Confirmed: High-quality state (Trap 82, Initiative 78); institutional behavior detected, full conviction signal.
Rejected: Low-quality state (Noisy 35, Stale 15); no interpretable flow, signal suppressed.
State Machine Logic

The order flow state machine is the engine's real-time interpretation of what the market is doing right now. Every bar, it evaluates 9 possible states in a fixed priority order — the first state whose conditions are met wins, and no further states are evaluated. This priority-first design prevents ambiguity: when multiple states could apply, the highest-conviction interpretation takes precedence.

High-Priority States

Stale (15) checks first as a circuit breaker — if the latest flow data exceeds the age threshold, every other state is meaningless. Trap Flow (82) evaluates second and carries the highest confidence in the system: confirmed stop hunt with reclaim means institutional behavior — this is the most reliable signal the engine produces. Initiative Buy/Sell (78) require the tightest triple-confirmation: meaningful delta magnitude + CVD confirmation + efficient price travel in the same direction. Missing any one of the three drops the state.

Mid & Low-Priority States

CVD Divergence (72) catches a specific failure: price makes a new extreme but cumulative volume delta doesn't confirm — the "who's buying?" gap. Absorption (68) detects price stopping despite aggressive hitting, especially near a structural level. Exhaustion (62) is the "dying move" state: price is still traveling but delta has lost its meaningful threshold, and either volume or CVD is diverging. Delta Not Confirmed (48) captures the awkward middle: delta is meaningful but price/CVD don't agree. Noisy/Unusable (35) is the default — no conditions met, no interpretation possible.
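The first-match-wins design can be sketched as an ordered list scanned top to bottom. The order below follows the priority line in this card's summary, and each trigger is reduced to a precomputed boolean flag on a hypothetical `flags` object; the real trigger logic (delta magnitude, CVD checks, efficiency) is not reproduced.

```javascript
// Ordered by priority; the first entry whose flag is set wins.
// Flag names (stale, trap, ...) are hypothetical stand-ins for the triggers.
const FLOW_STATES = [
  { name: "Stale", quality: "LOW", conf: 15, when: (f) => f.stale },
  { name: "Trap Flow", quality: "HIGH", conf: 82, when: (f) => f.trap },
  { name: "Initiative Buy", quality: "HIGH", conf: 78, when: (f) => f.initiativeBuy },
  { name: "Initiative Sell", quality: "HIGH", conf: 78, when: (f) => f.initiativeSell },
  { name: "CVD Divergence", quality: "MED", conf: 72, when: (f) => f.cvdDivergence },
  { name: "Absorption", quality: "MED", conf: 68, when: (f) => f.absorption },
  { name: "Exhaustion", quality: "MED", conf: 62, when: (f) => f.exhaustion },
  { name: "Delta Not Confirmed", quality: "LOW", conf: 48, when: (f) => f.deltaNotConfirmed },
];

// Default when nothing matches.
const DEFAULT_STATE = { name: "Noisy / Unusable", quality: "LOW", conf: 35 };

function classifyFlow(flags) {
  const hit = FLOW_STATES.find((state) => state.when(flags));
  const { name, quality, conf } = hit || DEFAULT_STATE;
  return { name, quality, conf };
}

console.log(classifyFlow({ stale: true, trap: true }).name); // Stale
console.log(classifyFlow({ trap: true, absorption: true }).conf); // 82
console.log(classifyFlow({}).name); // Noisy / Unusable
```

Because evaluation stops at the first hit, a stale feed masks even a valid trap signal, which is the circuit-breaker behavior described above.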

State Priority Table
State | Quality | Conf | Trigger
Stale | LOW | 15 | Flow age exceeds stale threshold
Trap Flow | HIGH | 82 | Stop hunt = 'trap' OR reclaim confirmed + sweep + matching delta
CVD Divergence | MED | 72 | New high without CVD up, or new low without CVD down
Absorption | MED | 68 | Meaningful + near level + (price stalled OR efficiency <28% OR vol ≥70%ile)
Initiative Buy | HIGH | 78 | Price up + meaningful delta + buy-side + CVD confirming + efficient
Initiative Sell | HIGH | 78 | Price down + meaningful delta + sell-side + CVD confirming + efficient
Exhaustion | MED | 62 | Price moving but delta NOT meaningful + (vol ≥65%ile OR CVD diverging)
Delta Not Confirmed | LOW | 48 | Delta is meaningful but price/CVD don't confirm
Noisy / Unusable | LOW | 35 | Default, no conditions met
Calibration Rationale
  • Priority order matters. After the Stale circuit breaker, Trap Flow evaluates first (82 confidence) because confirmed institutional stop-hunting is the highest-conviction signal; when it's present, nothing else matters.
  • Initiative Buy/Sell (78) require the full triple-confirmation picture.
  • Exhaustion (62) is the "dying move" detector.
  • The spread from 15 to 82 reflects the genuine information-content gap between stale noise and confirmed institutional trapping.
Order Flow Engine
Directional Intent Scoring
62% min · 8 modes
DIR
Five modules vote on long vs. short. The winner must clear both a minimum score and a clear edge over the other side.
Detect Classify Confirm Score
Core Decision
Dual-threshold directional call: side minimum 62% AND edge 10% required to declare direction.
Side min: 62% to declare LONG or SHORT
Edge: 10% L-S gap required
Confidence: floor 60% (range 45–75%)
Confirmed: Direction declared with mode (ACCEPTANCE / PULLBACK / TRAP_REVERSAL); routes eligible setup families.
Rejected: CONFLICT or NO_DIRECTION. Both sides too close or too weak, no directional trade eligible.
Voting & Thresholds

Directional intent scoring aggregates five independent modules — each evaluates its own evidence and produces a long score and a short score. The scores are weighted and combined into a final directional call. The thresholds are intentionally stricter than the Bias Arbitrator (Gate 2.5): the side minimum is 62% (vs. 55%), and the edge requirement is 10% (vs. 8%). This dual-threshold design prevents two failure modes: declaring direction when evidence is thin (minimum check) and declaring direction when both sides have nearly equal evidence (edge check).

Directional Modes

Beyond the binary long/short call, the system infers a directional mode — how the market is expressing its direction. ACCEPTANCE means price is being accepted in the direction (trending smoothly). PULLBACK means the direction holds but price is retracing (potential entry opportunity). TRAP_REVERSAL means a failed move trapped participants on the wrong side (highest conviction). The mode tells the setup system what kind of trade to look for: a pullback in an uptrend calls for continuation setups; a trap reversal calls for reversal setups.

Confidence Floor

The confidence floor at 60% flags marginal directional reads. A call that passes the math (62% side, 10% edge) but has low confidence (below 60%) is technically valid but unreliable — the engine logs it as a research note rather than a conviction signal. Confidence ranges from 45% to 75%, with the floor at 60% reflecting the point where cohort attribution data shows directional calls become predictive of setup outcomes.
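The dual-threshold call and the confidence floor can be sketched in a few lines. The thresholds come from this card; the evaluation order (direction first, then CONFLICT, then NO_DIRECTION as the fallback) and the function and field names are assumptions, and mode inference is left out.

```javascript
// Thresholds in percent, from the threshold parameters of this card.
const DIR = {
  sideMin: 62,
  sideEdge: 10,
  conflictMin: 55,
  conflictEdge: 10,
  confidenceFloor: 60,
};

// ASSUMPTION: a directional call is tested before CONFLICT, and anything
// that matches neither falls through to NO_DIRECTION.
function classifyDirection(longPct, shortPct, confidencePct) {
  const gap = Math.abs(longPct - shortPct);
  const top = Math.max(longPct, shortPct);
  let call;
  if (top >= DIR.sideMin && gap >= DIR.sideEdge) {
    call = longPct > shortPct ? "LONG" : "SHORT";
  } else if (
    longPct >= DIR.conflictMin &&
    shortPct >= DIR.conflictMin &&
    gap <= DIR.conflictEdge
  ) {
    call = "CONFLICT";
  } else {
    call = "NO_DIRECTION";
  }
  // A valid call under the confidence floor is logged as a research note.
  const directional = call === "LONG" || call === "SHORT";
  const researchNoteOnly = directional && confidencePct < DIR.confidenceFloor;
  return { call, researchNoteOnly };
}

console.log(classifyDirection(70, 50, 65)); // LONG, conviction signal
console.log(classifyDirection(50, 72, 58)); // SHORT, research note only
console.log(classifyDirection(64, 60, 70)); // CONFLICT: strong side, thin edge
```

The third case shows why both checks exist: 64% clears the side minimum, but a 4-point edge is too thin to call.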

Threshold Parameters
Parameter | Value
Side minimum | 62% (to declare LONG or SHORT)
Side edge | 10% (L-S gap required)
Conflict min | 55% (both sides ≥ this = CONFLICT)
Conflict edge | 10% (max gap for CONFLICT state)
None max | 55% (below = NONE)
Confidence floor | 60% (range 45–75%)
Mode Taxonomy
Mode | Variations
LONG_* | ACCEPTANCE · PULLBACK · TRAP_REVERSAL
SHORT_* | ACCEPTANCE · PULLBACK · TRAP_REVERSAL
Mixed | CONFLICT · NO_DIRECTION
Calibration Rationale
  • 62% is the real scoring minimum (vs. 55% in the bias arbitrator's pipeline version) because directional intent is a higher-stakes call — it determines the trade direction, not just a gate penalty.
  • The 10% edge prevents "barely long" calls.
  • Mode inference (ACCEPTANCE vs PULLBACK vs TRAP_REVERSAL) tells the setup system what kind of trade to look for — a critical routing decision that determines which setup families are eligible.
Order Flow Engine
Absorption Quality States
7 states · 18–86
ABS
Absorption isn't binary: it moves through six named states (seven score tiers, since REVERSAL_CONFIRMED can score 86 or 84) from initial detection to confirmed reversal or invalidation.
Detect Classify Confirm Score
Core Decision
Progressive state machine: absorption evidence accumulates from initial detection (58) to reversal confirmed (86) or invalidated (18).
Entry: ACTIVE_ABSORPTION at 58, initial detection
Peak: REVERSAL_CONFIRMED at 86, full convergence
Kill: INVALIDATED at 18, price accepted through level
Confirmed: REVERSAL_CONFIRMED (86/84): failed move + structural proof + opposite hold, full reversal conviction.
Rejected: INVALIDATED (18): price bulldozed through the absorbed level, thesis dead.
Detection Model

Absorption detection answers a critical question: is someone passively absorbing aggressive flow at a level? When large limit orders silently eat incoming market orders without letting price move, it's invisible on the price chart but detectable in order flow — high volume + no price movement = passive absorption. The engine models this as a progressive state machine rather than a binary flag.

Early States

ACTIVE_ABSORPTION (58) is the initial detection: order flow shows absorption, but it's just a fact — someone is sitting on a level. It becomes interesting at TRAP_BUILDING (74) when structural evidence accumulates: a reclaim, a failed breakout, or proximity to a planned target suggests the absorption will hold. CONTINUATION_WARNING (66) is the default middle state — absorption is present but there's no reversal evidence yet, and the market might push through.

Convergence States

The highest states require convergence. REVERSAL_CONFIRMED at 86 needs reclaim or failed breakout PLUS opposite structure or hold — the full picture. At 84, it needs divergence plus at least one structural confirmation. TWO_REVERSAL_WARNINGS (80) fires on CVD divergence alone without structural confirmation — it's close to reversal conviction but lacks the physical proof. INVALIDATED (18) kills the thesis entirely: price accepted through the absorbed level. The passive defender lost — game over.
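The tiers can be read as a cascade from strongest evidence to weakest. The sketch below is one possible reading, with several assumptions called out: the evidence flags are hypothetical names, INVALIDATED is checked first, "divergence" and "CVD divergence" are treated as the same flag, and a hypothetical `firstDetection` flag separates ACTIVE_ABSORPTION from the CONTINUATION_WARNING default, since this card does not say how the engine tells them apart.

```javascript
// `e` is a hypothetical evidence object of boolean flags.
function absorptionState(e) {
  const failedMove = Boolean(e.reclaim || e.failedBreakout);

  // ASSUMPTION: invalidation overrides every other reading.
  if (e.acceptedThroughLevel) return { state: "INVALIDATED", score: 18 };

  // Failed move plus opposite structure or hold: full convergence.
  if (failedMove && (e.oppositeStructure || e.hold)) {
    return { state: "REVERSAL_CONFIRMED", score: 86 };
  }
  // Divergence plus at least one structural confirmation.
  if (e.cvdDivergence && (failedMove || e.oppositeStructure)) {
    return { state: "REVERSAL_CONFIRMED", score: 84 };
  }
  // Divergence alone, no structural proof.
  if (e.cvdDivergence) return { state: "TWO_REVERSAL_WARNINGS", score: 80 };

  // Structural evidence accumulating.
  if (failedMove || e.nearPlannedTarget) return { state: "TRAP_BUILDING", score: 74 };

  // ASSUMPTION: firstDetection marks the initial observation at a level.
  if (e.firstDetection) return { state: "ACTIVE_ABSORPTION", score: 58 };

  return { state: "CONTINUATION_WARNING", score: 66 };
}

console.log(absorptionState({ reclaim: true, hold: true })); // REVERSAL_CONFIRMED 86
console.log(absorptionState({ cvdDivergence: true })); // TWO_REVERSAL_WARNINGS 80
console.log(absorptionState({ acceptedThroughLevel: true, reclaim: true, hold: true })); // INVALIDATED 18
```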

State Scoring Table
State | Score | Trigger
REVERSAL_CONFIRMED | 86 | Reclaim or failed breakout + opposite structure or hold
REVERSAL_CONFIRMED | 84 | Divergence + (reclaim or failed breakout or opposite structure)
TWO_REVERSAL_WARNINGS | 80 | CVD divergence alone (without structural confirmation)
TRAP_BUILDING | 74 | Reclaim or failed breakout or near planned target
CONTINUATION_WARNING | 66 | Default: absorption present but no reversal evidence
ACTIVE_ABSORPTION | 58 | Initial detection: order flow shows absorption
INVALIDATED | 18 | Price accepted through the absorbed level
Calibration Rationale
  • Absorption alone (58) is interesting but not actionable — anyone can sit on a level temporarily.
  • TRAP_BUILDING (74) means structural evidence is accumulating.
  • REVERSAL_CONFIRMED (86) is the full convergence: failed move + structural proof + opposite hold.
  • The progressive scoring prevents premature action on absorption while ensuring the engine acts decisively when the full picture emerges.
  • INVALIDATED (18) means price bulldozed through — the absorbed level is gone, and any setup based on it is dead.
Order Flow Engine
Absorption Stability Tracker
60s window · 60% lead
STB
Prevents flip-flopping — absorption side must hold for minimum ticks and time before switching.
Detect Classify Confirm Score
Core Decision
Triple-condition side flip: 5 ticks + 10 seconds + 60% weighted dominance required simultaneously.
Window: 60s history of scored observations
Flip: 5 ticks + ≥10s + ≥60% weighted lead
Floor: drop observations below score 50
Confirmed: Side flip validated; reversal banner shown for 30 seconds, new absorption side declared.
Rejected: Flip conditions not met; current side held, noisy observations filtered out.
Low-Pass Filter

Raw absorption detection is noisy — in a fast market, the absorption side can appear to flip on every tick as aggressive flow alternates between bid and ask. The stability tracker is a low-pass filter that prevents meaningless side changes from polluting the engine's absorption state.

Weighted Voting

The tracker maintains a 60-second history window of scored observations. Each observation carries the quality score from the absorption detection module, and votes are weighted by score — a high-quality observation (e.g., REVERSAL_CONFIRMED at 86) outweighs several low-quality ones (ACTIVE_ABSORPTION at 58). To flip the declared side, the new side must satisfy three independent conditions simultaneously: (1) maintain its lead for at least 5 ticks, (2) hold for at least 10 seconds, and (3) achieve ≥60% weighted dominance in the recent vote window.
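The weighted vote and the triple flip condition can be sketched as two small functions. This is a minimal sketch under assumptions: weights are taken as linear in the observation score, the dominance share is computed over the last 25 qualifying observations inside the 60-second window, and the observation shape (`tMs`, `side`, `score`) and all names are hypothetical.

```javascript
const STABILITY = {
  windowMs: 60000, // history window
  minScore: 50, // observations below this are dropped
  voteWindow: 25, // recent vote window, in observations
  flipTicks: 5,
  flipMs: 10000,
  flipLead: 0.6,
};

// observations: [{ tMs, side: "BID" | "ASK", score }], oldest first.
// Returns the score-weighted share of votes for `side`, from 0 to 1.
function weightedShare(observations, side, nowMs) {
  const recent = observations
    .filter((o) => nowMs - o.tMs <= STABILITY.windowMs && o.score >= STABILITY.minScore)
    .slice(-STABILITY.voteWindow);
  const total = recent.reduce((sum, o) => sum + o.score, 0);
  if (total === 0) return 0;
  const forSide = recent
    .filter((o) => o.side === side)
    .reduce((sum, o) => sum + o.score, 0);
  return forSide / total;
}

// All three conditions must hold at the same time.
function canFlip({ leadTicks, leadMs, share }) {
  return (
    leadTicks >= STABILITY.flipTicks &&
    leadMs >= STABILITY.flipMs &&
    share >= STABILITY.flipLead
  );
}

const obs = [
  { tMs: 0, side: "BID", score: 58 },
  { tMs: 1000, side: "ASK", score: 86 },
  { tMs: 2000, side: "ASK", score: 40 }, // dropped: below score 50
];
const share = weightedShare(obs, "ASK", 3000); // 86 / (58 + 86)
console.log(canFlip({ leadTicks: 6, leadMs: 12000, share })); // false
```

Here one strong ASK observation against one weaker BID observation yields a share just under 60%, so the side does not flip even though the tick and time conditions are met.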

Grace & Banner

A fade grace period of 10 seconds holds the last declared side even when no new observations arrive — this prevents the tracker from going blank during brief pauses in flow. The reversal banner lasts 30 seconds: when a genuine side flip is confirmed, the engine displays a reversal alert on the dashboard for 30 seconds to ensure the trader notices the change. Observations below score 50 are dropped entirely — low-quality noise shouldn't influence the stability calculation.

Stability Parameters
Parameter | Value
History window | 60 seconds
Flip minimum ticks | 5 ticks
Flip minimum time | 10 seconds
Flip lead % | 60% weighted dominance
Min observation score | 50 (drop below this)
Reversal banner | 30 seconds
Fade grace | 10 seconds (holds last side)
Recent vote window | 25 ticks
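The triple-condition flip can be sketched as follows. This is a minimal Python illustration, not the engine's code: the class name `StabilityTracker`, the linear score weighting, the reading of "5 ticks" as five consecutive observations with the challenger in the lead, and the two-sided BID/ASK setup are all assumptions. The fade grace and reversal banner are omitted.

```python
from collections import deque

WINDOW_S = 60.0       # history window (seconds)
FLIP_MIN_TICKS = 5    # observations the challenger must stay in the lead
FLIP_MIN_S = 10.0     # seconds the challenger must hold its lead
FLIP_LEAD = 0.60      # weighted share required to count as leading
MIN_SCORE = 50        # observations below this are dropped
RECENT_TICKS = 25     # recent vote window (observations)


class StabilityTracker:
    """Hypothetical sketch of the low-pass side tracker (two sides assumed)."""

    def __init__(self):
        self.history = deque()   # (timestamp, side, score)
        self.side = None         # currently declared absorption side
        self.lead_since = None   # timestamp when the challenger took the lead
        self.lead_ticks = 0      # observations seen while the challenger led

    def observe(self, ts, side, score):
        if score < MIN_SCORE:
            return self.side                      # noise floor: ignore entirely
        self.history.append((ts, side, score))
        while ts - self.history[0][0] > WINDOW_S:
            self.history.popleft()                # expire old observations

        recent = list(self.history)[-RECENT_TICKS:]
        weights = {}
        for _, sd, sc in recent:                  # score-weighted vote (linear, assumed)
            weights[sd] = weights.get(sd, 0) + sc
        leader = max(weights, key=weights.get)
        share = weights[leader] / sum(weights.values())

        if self.side is None:                     # first valid observation declares a side
            self.side = leader
            return self.side
        if leader == self.side or share < FLIP_LEAD:
            self.lead_since, self.lead_ticks = None, 0   # challenge reset
            return self.side

        if self.lead_since is None:
            self.lead_since = ts
        self.lead_ticks += 1
        if self.lead_ticks >= FLIP_MIN_TICKS and ts - self.lead_since >= FLIP_MIN_S:
            self.side = leader                    # all three conditions hold: flip
            self.lead_since, self.lead_ticks = None, 0
        return self.side
```

Under these assumptions a run of opposite-side observations only flips the declared side once the lead has persisted for both the tick count and the time minimum, so a single large print cannot flip it.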
Calibration Rationale
  • Without stability constraints, absorption side would flip on every tick in a noisy market, making the signal useless.
  • The 60% lead requirement means the new side must convincingly dominate, not just edge ahead.
  • The 10-second minimum prevents reactionary flips to single large prints — institutional iceberg orders can produce momentary opposite-side signals that shouldn't cause a flip.
  • Score-weighted voting ensures that a high-quality reversal observation counts for more than a low-quality noise observation, so the vote tracks evidence quality rather than raw tick count.
Order Flow Engine
Market Order Tempo
3 levels · 0.6× / 1.5×
TMO
How the engine reads market "loudness" — are institutions actively participating or is this retail noise?
Detect · Classify · Confirm · Score
Core Decision
Bar-by-bar institutional participation classification against fixed baselines per symbol.
WEAK: <0.6× baseline → confluence +4
NORMAL: 0.6× to <1.5× baseline → no adjustment
STRONG: ≥1.5× baseline → confluence -3 to -5
Confirmed STRONG tempo — institutions actively participating, confluence threshold lowered, flow IS confirmation.
Rejected WEAK tempo — retail noise, no institutional footprint, confluence threshold raised by +4.
Baseline Measurement

Market order tempo is a per-bar measurement of institutional participation intensity. The engine counts both the number of trades (order frequency) and the total contracts (order size) in each bar, comparing them against fixed baselines derived from average RTH bar statistics: NQ baseline is 900 trades / 12,000 contracts per bar; ES baseline is 600 trades / 8,000 contracts.

Threshold Adjustment

Below 60% of baseline (WEAK), the market is dominated by retail noise: small orders, no institutional footprint. The engine responds by raising the confluence threshold by 4 points, demanding more evidence before firing. At or above 150% of baseline (STRONG), institutions are actively participating. The engine lowers the threshold by 3 to 5 points because the heavy participation itself is a form of confirmation; less structural evidence is needed when the tape is shouting a direction.

Macro vs Micro Timescale

The tempo check is separate from the volume regime (P2) and works at a different timescale. Volume regime uses daily percentile rank (macro context); tempo measures bar-by-bar participation (micro context). A STRONG tempo bar in a LOW volume regime means "today is quiet overall, but right now someone big showed up" — that's a relevant signal for the current bar's evaluation.

Tempo Thresholds
Level | NQ Trades | NQ Volume | ES Trades | ES Volume
WEAK (<0.6×) | <540 | <7,200 | <360 | <4,800
NORMAL | 540–1,349 | 7,200–17,999 | 360–899 | 4,800–11,999
STRONG (≥1.5×) | ≥1,350 | ≥18,000 | ≥900 | ≥12,000
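The table follows directly from the baselines and the 0.6× / 1.5× multipliers. The sketch below reproduces it in Python; the function names are hypothetical, and because the source does not say how the trade-count and contract-volume readings are combined into one level, the sketch simply reports each metric separately. Integer arithmetic is used so the boundaries (540, 1,350, and so on) land exactly.

```python
# Fixed per-bar baselines from the text (average RTH bar statistics).
BASELINES = {
    "NQ": {"trades": 900, "contracts": 12_000},
    "ES": {"trades": 600, "contracts": 8_000},
}

# Confluence-threshold adjustment per level, as (min, max) points.
ADJUSTMENT = {"WEAK": (4, 4), "NORMAL": (0, 0), "STRONG": (-5, -3)}


def classify(value, baseline):
    """Classify one per-bar reading against its baseline (0.6x / 1.5x cut-offs)."""
    if value * 10 < baseline * 6:      # value < 0.6 x baseline
        return "WEAK"
    if value * 10 >= baseline * 15:    # value >= 1.5 x baseline
        return "STRONG"
    return "NORMAL"


def tempo(symbol, trades, contracts):
    """Return the level of each metric separately (combination rule not specified)."""
    base = BASELINES[symbol]
    return {
        "trades": classify(trades, base["trades"]),
        "contracts": classify(contracts, base["contracts"]),
    }


print(tempo("NQ", 1400, 19_000))   # both metrics at or above 1.5x baseline
print(ADJUSTMENT["WEAK"])          # threshold raised by 4 points
```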
Calibration Rationale
  • WEAK tempo means the market order flow is below 60% of baseline — retail noise, no institutional footprint; setups need more confluence to compensate.
  • STRONG means real participation — the flow itself IS confirmation.
  • The 0.6×/1.5× multipliers mark the empirical inflection points where institutional participation becomes visible (or invisible) in order flow data.
Order Flow Engine
Research Microstructure Signals (R1–R4)
4 signals · 10 substrate fields
R1–R4
Academic-validated microstructure signals wired into the RR profiler, stop adjustment, and per-fire substrate.
Market Data · Research Signals · Decision Chain
Four Signals
Each signal has a computation function (engine-order-flow.js) and decision-chain wiring (engine-rr-confluence.js + engine-pipeline.js).
R1 VPIN: |buyVol − sellVol| / totalVol · rolling 50-bucket · TOXIC ≥0.7 / ELEVATED ≥0.5 / NORMAL ≥0.3 / CLEAN
R2 Vol Clock: barVolume / median(100 bars) · SURGE ≥2.0 / FAST ≥1.5 / NORMAL ≥0.7 / SLOW ≥0.4 / DEAD
R3 First-30: (price at 10:00 − price at 9:30) / price at 9:30 · power-hour only (15:00–15:45 ET) × GEX regime
R4 OFI: Δ(bestBidSize) − Δ(bestAskSize) · acceleration over 3 snapshots · BID_BUILDING / ASK_BUILDING / NEUTRAL
R1 → RR TOXIC + aligned → −0.15R (informed agrees). TOXIC + counter → +0.30R (counterparty risk). ELEVATED + counter → +0.15R.
R2 → Meta Applied LAST to total adjustment: DEAD ×0.4, SLOW ×0.7 (dampen toward zero). SURGE ×1.15 (amplify — signals trustworthy).
R3 → RR Power-hour only. Aligned → −0.15R × GEX mult. Opposing → +0.20R × GEX mult. CASCADE ×1.5 / PINNED ×0.5.
R4 → RR+Stop RR: aligned → −0.10R, opposing → +0.15R. Stop: OFI opposing trade direction → stop ×0.90 (tighter).
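The wiring rules above compose into one RR adjustment plus a stop multiplier. The Python sketch below is illustrative only: `rr_adjustment` and its parameters are hypothetical names, a multiplier of 1.0 is assumed for GEX regimes and volume-clock states the rules do not mention, and the R2 meta-multiplier is assumed to scale the RR adjustment but not the stop multiplier.

```python
R2_META = {"DEAD": 0.4, "SLOW": 0.7, "SURGE": 1.15}   # others assumed 1.0
GEX_MULT = {"CASCADE": 1.5, "PINNED": 0.5}            # others assumed 1.0


def rr_adjustment(vpin_level, vpin_aligned, vol_clock,
                  power_hour, first30_aligned, gex_regime, ofi_aligned):
    """Return (rr_adjustment_in_R, stop_multiplier).

    *_aligned flags: True = agrees with the trade direction, False = opposes,
    None = no signal (NEUTRAL).
    """
    adj = 0.0

    # R1 VPIN: toxic flow that agrees eases the requirement, counter flow raises it.
    if vpin_level == "TOXIC":
        adj += -0.15 if vpin_aligned else 0.30
    elif vpin_level == "ELEVATED" and not vpin_aligned:
        adj += 0.15

    # R3 First-30: only during the power-hour window, scaled by GEX regime.
    if power_hour and first30_aligned is not None:
        base = -0.15 if first30_aligned else 0.20
        adj += base * GEX_MULT.get(gex_regime, 1.0)

    # R4 OFI: adjusts RR and tightens the stop when the book opposes the trade.
    stop_mult = 1.0
    if ofi_aligned is True:
        adj -= 0.10
    elif ofi_aligned is False:
        adj += 0.15
        stop_mult = 0.90

    # R2 Volume Clock: applied last, to the total adjustment.
    adj *= R2_META.get(vol_clock, 1.0)
    return adj, stop_mult


# Counter-direction toxic flow, opposing first-30 in CASCADE, opposing OFI, DEAD clock.
print(rr_adjustment("TOXIC", False, "DEAD", True, False, "CASCADE", False))
```

Applying R2 last means a DEAD or SLOW volume clock shrinks every other adjustment toward zero, while SURGE amplifies them.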
Research Basis
  • R1 VPIN (Easley, Lopez de Prado, O’Hara) — Volume-Synchronized Probability of Informed Trading. Uses actual aggressor-classified buy/sell volume, not bulk classification. High VPIN precedes volatility events with R² ≈ 0.4 in the original paper.
  • R2 Volume Clock (Lopez de Prado, “Advances in Financial ML”) — volume-time vs wall-clock-time. When bars take longer to fill (SLOW/DEAD), signals are dominated by noise. The meta-multiplier gates the reliability of ALL other signal adjustments.
  • R3 First-30-min (Gao, Han, Li, Zhou 2018, Journal of Financial Economics): the first 30 minutes of the session predict the direction of the last 30 minutes. Crossed with GEX regime: CASCADE amplifies (dealers hedge in the direction of the move, selling drops and buying rips), PINNED dampens (dealers absorb moves by buying dips and selling rips).
  • R4 OFI (Cont, Kukanov, Stoikov 2014): Order Flow Imbalance velocity, computed as the change in best bid size minus the change in best ask size. R² ≈ 70% for short-term price prediction in the original paper. Scaffolded for L2 depth data (connecting this week).
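As an illustration of the R1 and R2 computations, the Python sketch below implements a rolling 50-bucket VPIN and the volume-clock ratio with the thresholds listed for each signal. It is not taken from engine-order-flow.js: the names are hypothetical, and aggregating the rolling window as summed imbalance over summed volume is an assumption about how "rolling 50-bucket" is applied.

```python
from collections import deque
from statistics import median


def classify_vpin(vpin):
    """TOXIC >= 0.7 / ELEVATED >= 0.5 / NORMAL >= 0.3 / CLEAN otherwise."""
    if vpin >= 0.7:
        return "TOXIC"
    if vpin >= 0.5:
        return "ELEVATED"
    if vpin >= 0.3:
        return "NORMAL"
    return "CLEAN"


def classify_vol_clock(ratio):
    """SURGE >= 2.0 / FAST >= 1.5 / NORMAL >= 0.7 / SLOW >= 0.4 / DEAD otherwise."""
    if ratio >= 2.0:
        return "SURGE"
    if ratio >= 1.5:
        return "FAST"
    if ratio >= 0.7:
        return "NORMAL"
    if ratio >= 0.4:
        return "SLOW"
    return "DEAD"


class RollingVpin:
    """|buyVol - sellVol| / totalVol over the last n volume buckets."""

    def __init__(self, n=50):
        self.buckets = deque(maxlen=n)   # (buy_volume, sell_volume) per bucket

    def add(self, buy_vol, sell_vol):
        self.buckets.append((buy_vol, sell_vol))
        imbalance = sum(abs(b - s) for b, s in self.buckets)
        total = sum(b + s for b, s in self.buckets)
        return imbalance / total if total else 0.0


def vol_clock_ratio(bar_volume, recent_volumes):
    """barVolume / median of up to the last 100 bar volumes."""
    return bar_volume / median(recent_volumes[-100:])


vp = RollingVpin()
print(classify_vpin(vp.add(90, 10)))                         # one-sided bucket
print(classify_vol_clock(vol_clock_ratio(3000, [1000, 1500, 2000])))
```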
Utilization Audit
Institutional Confluence → RR + Raw-Input Forensics
L1a wire + 11 forensic fields + regime + chop
L1–L2
3-layer utilization audit: “What does the engine KNOW vs what does it DO about it?” Wired institutional confluence into RR floor. Added raw-input substrate fields for post-hoc forensic decomposition.
Audit · Decision Chain · Substrate Forensics
Build Sync · 2026-05-24
Today’s shipped cluster wired RR/state harmonization and entry-timing decomposition into live decisions, substrate persistence, and weekly extraction cohorts.
L1a Confluence→RR: institutionalConfluenceScore(setup) · score ≥70 → −0.15R · score <35 → +0.20R
P3.7b RR: requiredRR = dynamicRR(base + flow + regime + ATR14 scale + marketState adj), bounded and source-aware
P3.8 Timing: entryTimingClassAtFire + entryNoFillRiskAtFire + entryChaseRiskAtFire → NO_FILL / CHASE extraction cohorts
L1a → RR High confluence (≥70) = broad institutional confirmation → compress RR floor (target more reachable). Low (<35) = thin setup → raise floor.
RR + Timing → Substrate RR ATR14/state-coupling fields and P3.8 NO_FILL/CHASE timing fields are now captured on SIGNAL + BLOCKED rows, so weekly extraction can isolate entry-friction failure modes.
Regime → RR Day regime severity wired into RR: EXTREME +0.25R, ELEVATED +0.10R. Previously gated production but didn’t modulate magnitude.
Chop → RR Chop-proxy (range-bound + low vol + inside value) penalizes breakout/continuation +0.15R, favors reversion/trap −0.05R.
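The three RR-floor modifiers in this card can be combined as in the Python sketch below. It assumes the adjustments are additive, which the `base + flow + regime + ...` form of `requiredRR` suggests but does not prove; `rr_floor_adjustment`, its parameter names, and the setup-kind labels are hypothetical.

```python
REGIME_ADJ = {"EXTREME": 0.25, "ELEVATED": 0.10}        # day regime severity -> R
CHOP_ADJ = {"breakout": 0.15, "continuation": 0.15,     # penalized in chop
            "reversion": -0.05, "trap": -0.05}          # favored in chop


def rr_floor_adjustment(confluence_score, regime_severity, setup_kind, chop):
    """Sum the L1a confluence, regime, and chop adjustments to the RR floor (in R)."""
    adj = 0.0
    if confluence_score >= 70:       # broad institutional confirmation
        adj -= 0.15
    elif confluence_score < 35:      # thin setup
        adj += 0.20
    adj += REGIME_ADJ.get(regime_severity, 0.0)
    if chop:                         # range-bound + low vol + inside value
        adj += CHOP_ADJ.get(setup_kind, 0.0)
    return round(adj, 2)


# High-confluence breakout on an EXTREME day inside chop.
print(rr_floor_adjustment(75, "EXTREME", "breakout", True))
```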
L3 — Expanded 2026-05-24 Analysis Cohorts
  • Market structure: orderFlowState · absorptionState · liquidityBreakoutType · pricePosVsVwap · momentumAlignment · pricePosVsVpoc
  • Regime + HTF: dayRegimeSeverity · htfH1 · htfH4 · htfMSS
  • Execution context (banded): sessionRunwayMin · stopDistanceATR · targetDistanceATR · rangePosPercent
  • P3.7b RR cohorts: rrAtr14Source/Ratio/Scale + rrMarketState/Mode/Severity/Adj
  • P3.8 entry-timing cohorts: preEntry/preSignal state + timing class + no-fill/chase risk bands
  • Total slices: 96 dimensions in weekly_report.py (latest live schema path)
  • Today’s shipped set reflected: P3.11, P3.12, P3.13b, P3.13c, P3.7b, P3.8, and P5.2 verification.
Cross-Market Signal
Gamma Exposure (GEX)
0–45 DTE · 5 regimes
GEX
ETF options OI → Black-Scholes gamma per strike → dealer hedging direction.
Macro Context · Options Market · Dealer Hedging
Core Decision
ETF options OI → Black-Scholes gamma per strike → dealer hedging direction.
GEX: Γ × OI × 100 × S² × 0.01
Sign: Call GEX positive (stabilizing) · Put GEX negative (amplifying)
Pass Spot ABOVE gamma flip → STABILIZING regime → dealers dampen moves, mean-reversion favored.
Fail Spot BELOW gamma flip → AMPLIFYING regime → dealers amplify moves, breakout/momentum favored.
Dealer Mechanics

Market-makers are structurally short options. To stay delta-neutral, they must hedge dynamically. Positive GEX = dealers buy dips, sell rips (stabilizing). Negative GEX = dealers sell into drops, buy into rips (amplifying). The gamma flip level is where this behavior inverts.
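A minimal Python sketch of the per-strike calculation and the flip search, assuming the card's formula Γ × OI × 100 × S² × 0.01 with calls signed positive and puts negative. The option-record layout, the function names, the zero interest rate, and the one-hour floor on time to expiry (so 0-DTE contracts do not divide by zero) are all assumptions made for illustration.

```python
import math


def bs_gamma(spot, strike, t_years, vol, rate=0.0):
    """Black-Scholes gamma (the same for calls and puts)."""
    t = max(t_years, 1.0 / (365.0 * 24.0))   # assumed floor for 0-DTE options
    sd = vol * math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / sd
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return pdf / (spot * sd)


def strike_gex(spot, opt):
    """Signed GEX for one option record: gamma x OI x 100 x S^2 x 0.01."""
    sign = 1.0 if opt["kind"] == "call" else -1.0
    gamma = bs_gamma(spot, opt["strike"], opt["dte"] / 365.0, opt["iv"])
    return sign * gamma * opt["oi"] * 100 * spot ** 2 * 0.01


def net_gex(spot, chain):
    return sum(strike_gex(spot, opt) for opt in chain)


def gamma_flip(chain, lo, hi, steps=200):
    """Sweep spot from lo to hi; return the first price where net GEX changes sign."""
    prev_s, prev_g = lo, net_gex(lo, chain)
    for i in range(1, steps + 1):
        s = lo + (hi - lo) * i / steps
        g = net_gex(s, chain)
        if prev_g * g < 0:                    # sign change: interpolate linearly
            return prev_s + (s - prev_s) * prev_g / (prev_g - g)
        prev_s, prev_g = s, g
    return None                               # no crossing in the swept range


def regime(spot, flip):
    return "STABILIZING" if spot > flip else "AMPLIFYING"


chain = [
    {"kind": "call", "strike": 510, "dte": 30, "iv": 0.20, "oi": 10_000},
    {"kind": "put", "strike": 490, "dte": 30, "iv": 0.20, "oi": 10_000},
]
flip = gamma_flip(chain, 480, 520)
print(flip, regime(515, flip))
```

The flip is found by recomputing gamma at each candidate spot rather than by summing today's per-strike values, because gamma itself changes as spot moves.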

Cross-Market Edge

Data is sourced from ETF options (SPY/QQQ), not futures options (ES/NQ). The participant pool is largely independent of the futures market: pension funds, insurance companies, and retail equity traders rather than futures prop desks. This gives cross-market confirmation with a mechanical (not discretionary) basis.

VIX Integration

Research-validated: GEX is a VIX modifier, not standalone signal (FlashAlpha 8yr backtest: ρ=-0.14 after VIX control). Combined regime matrix: PINNED (calm+stabilizing) · COILED (calm+amplifying) · DAMPENED · VOLATILE · CASCADE.

Key Levels
Level | Definition
Gamma Flip | Price where net GEX crosses zero (regime boundary)
Call Wall | Strike with highest call GEX (mechanical ceiling)
Put Wall | Strike with highest put GEX (mechanical floor)
Vol Trigger | Put wall below which negative-gamma cascading accelerates
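Given per-strike GEX totals, the two walls reduce to simple lookups. The Python sketch below assumes put GEX is stored as negative values (per the sign convention in this card), so "highest put GEX" is read as the largest magnitude, i.e. the most negative entry; the function name and dictionary layout are hypothetical. The gamma flip is not derived here because it requires re-evaluating gamma across a range of spot prices.

```python
def key_levels(call_gex_by_strike, put_gex_by_strike):
    """Pick the call wall and put wall from per-strike GEX totals.

    call_gex_by_strike: {strike: positive GEX}
    put_gex_by_strike:  {strike: negative GEX}
    """
    call_wall = max(call_gex_by_strike, key=call_gex_by_strike.get)
    put_wall = min(put_gex_by_strike, key=put_gex_by_strike.get)  # most negative
    return {"call_wall": call_wall, "put_wall": put_wall}


calls = {500: 1.2e9, 505: 2.8e9, 510: 1.9e9}
puts = {480: -0.9e9, 485: -2.1e9, 490: -1.4e9}
print(key_levels(calls, puts))
```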
Calibration Rationale
  • Industry-standard Black-Scholes gamma calculation (Perfiliev/SpotGamma convention).
  • 0–45 DTE options included.
  • Dealer-short assumption validated by SpotGamma, SqueezeMetrics, FlashAlpha research.
HOT RUN · PHASE A
BRIDGE LIVE · Quantower
BUILD PHASE · lock lifted 2026-05-22
208 modules · ~150 fields/fire