ATLAS · The Architecture of Edge

10-brain portfolio racing NQ / ES intraday — shared substrate · per-brain edge measurement · the winner graduates to autonomous execution.

COLD ▸ HOT · NOW ▸ AUTOMATED

HOT · NOW │ Manual │ Broker OFF │ Bridge LIVE │ BUILD PHASE │ ◈ 10 brains race ◇ Forge has memory ◆ CI-LB > +0.3R · Holm · Static · not real-time

🪜 Deployment Phase Ladder · where the engine is

1 · PAPER TRADING ◀ NOW

Hot mode, paper fills. Everything works, is connected, and proves expectancy. Advance: works end-to-end + positive expectancy on paper.

2 · QUANTOWER DEMO

Verify every MECHANICAL op — order send, connections, bridges, automation. Mechanics, not edge.

3 · LIVE · months away

Real broker / prop-firm account — the only phase touching real money. Only after 1 + 2 prove out.

Operative question is always "does this function work in paper?" — live framing is months out. Maps under: P1 ≈ G0 / Hot Run · P2 ≈ exec-ramp Phase 0 + G1 · P3 ≈ later gates.

◉

10 racing brains

Portfolio fleet · NOA leader +138.9R

◬

208 modules

Shared substrate · all brains read from it

⧉

~150 fields/fire

Substrate density · 16 categories

⟁

12,915 fires resolved

Deduped portfolio sample · HOT-RUN

⬢

6 layers · ~55 cards

This atlas · current state only

Risk & Performance Notice

Futures trading involves substantial risk and is not suitable for every investor. Past, simulated, hypothetical, replay-based, research, or model-generated results are not necessarily indicative of future performance. No representation is being made that any account will or is likely to achieve profits or losses similar to those shown. Unless explicitly marked as broker-executed live trades, all displayed outcomes should be treated as research/model outputs and may not reflect slippage, commissions, liquidity constraints, execution delays, or trader behavior. The 12,915-fire deduplicated sample referenced in this atlas was generated during research and hot-run evaluation phases across 10 racing brains — not under live broker execution.

Figure provenance — read every $ / R with its tag

A number whose provenance is left for the reader to guess is the ambiguous-zero applied to money. Default: every $ / R figure here is REALIZED in the shadow paper-race (it actually happened across the books — still not real broker money). Anything else MUST carry a tag: ⟨modeled⟩ / ⟨counterfactual · would-have⟩ (never happened, even in paper — e.g. "if reactive had been anticipated") · ⟨pre-haircut⟩ (claimed R before execution slippage) · ⟨synthetic⟩ (generated data) · ⟨live⟩ (real broker fills, when they exist). A would-have dollar next to a realized dollar, unlabeled, reads as real money — and that lie sizes positions. Per feedback_silent_failure_pattern.md R6.
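The tagging rule above can be sketched as a small data type. This is a minimal illustration, not the atlas's actual implementation: the `Provenance` enum, the `Figure` class, and `render()` are hypothetical names, and only the tag labels themselves come from the text. The one rule it encodes is that REALIZED is the silent default and every other provenance must print its tag.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    # Tag labels are taken from the notice above; the enum itself is hypothetical.
    REALIZED = "realized"                        # default: happened in the shadow paper-race
    MODELED = "modeled"
    COUNTERFACTUAL = "counterfactual · would-have"
    PRE_HAIRCUT = "pre-haircut"
    SYNTHETIC = "synthetic"
    LIVE = "live"


@dataclass(frozen=True)
class Figure:
    value: float
    unit: str                                    # "R" or "$"
    provenance: Provenance = Provenance.REALIZED

    def render(self) -> str:
        """Format the figure; anything other than REALIZED always carries its tag."""
        if self.unit == "$":
            sign = "-" if self.value < 0 else "+"
            body = f"{sign}${abs(self.value):,.0f}"
        else:
            body = f"{self.value:+.1f}R"
        if self.provenance is Provenance.REALIZED:
            return body
        return f"{body} ⟨{self.provenance.value}⟩"
```

Because the tag is attached to the value rather than to the surrounding prose, a would-have dollar cannot be rendered next to a realized dollar without its label.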

Phase Ladder · 0 → 4 · 5-phase trajectory from substrate truth to autonomous portfolio

★ Phase 1 active

PHASE 0 CLOSED ✓

Data & measurement integrity

Make the numbers trustworthy before trusting them — every screen, file and journal now reads the same trade record from one source, so a profit can't be counted twice or quietly disappear. This was the foundation: measure honestly first, judge later.

Gate cleared 2026-06-07 · SSOT spine hardened 2026-06-10

▸

PHASE 1 ★ NOW

Identify the edge · 10-brain race

Race all 10 brains on the same live tape and find the one with a real, repeatable edge — proven across enough trades and different market conditions that it can't be luck. Nothing graduates until the math rules out chance.

Contenders: NOA +138.9R · WCONS +117.4R

▸

PHASE 2 queued

Graduation candidate

Lock the winning brain's rules so they can't drift, then re-test them on data they were never tuned on — each market type on its own. The edge has to survive being frozen, not just look good in hindsight.

Gate: PROMOTION_READY

▸

PHASE 3 queued

Execution qualification

Take the frozen brain from "good signal on screen" to "real fill on a real account," one careful rung at a time — simulator, then demo broker — measuring how much the edge shrinks at each step before any real money.

Gate: D3 reached

▸

PHASE 4 ★ NORTH-STAR

Portfolio allocation · NOA operates

The end goal: once two or more brains have earned their place, NOA runs them as a portfolio — sizing each by market conditions while you set the risk and stop pressing buttons. The engine becomes the trader.

≥ 2 brains graduated

Pivot trigger: if no brain clears Phase 1 by month 6 (~2026-11-13) → architecture council fires. Each phase has a hard gate. No graduation without clearing it.

●

Substrate L1

what every brain sees — shared market reality, classified once, read by all 10 racers

7 CARDS

Shared by construction. The substrate is computed ONCE per tick and consumed by every brain (NOA · ANT-NOA · BRO · CONS · WCONS · AGG · PA · PRECISION · LLQ · SCHL / SAVT). No brain owns a private feed: when we say "the substrate," we mean the single source of truth all 10 racers measure against. Brain differences live downstream in Pillars, conviction, and exit policy, not here.

Order Flow

In plain terms. This watches what buyers and sellers are actually doing underneath each price move — who's winning, who's absorbing the other side, and whether the move has real force behind it. It's the engine's first defense against a breakout that looks real on the chart but has nothing buying or selling to back it up.

Tracks who's winning each bar — buyers or sellers — and flags when price disagrees with that pressure
Spots when big players are absorbing the other side, in four stages: building, confirmed, exhausted, reset
Three ways to know who took the trade — exchange tag, or inferred two different ways as fallback
Knows the difference between someone refilling an order and someone aggressively taking it

Prevents Fake breakouts where price jumps but no real buying or selling backs it up.

⟨engine-order-flow.js · NQEliteAbsorption⟩

4-state FSM
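The four absorption stages named above (building, confirmed, exhausted, reset) can be pictured as a small state machine. The sketch below is illustrative only: the stage names come from the card, but the event names and the transition table are assumptions, and the real logic lives in engine-order-flow.js, which is not shown here.

```python
from enum import Enum


class Stage(Enum):
    BUILDING = "building"
    CONFIRMED = "confirmed"
    EXHAUSTED = "exhausted"
    RESET = "reset"


# Hypothetical events and transitions; only the four stage names come from the card.
TRANSITIONS = {
    (Stage.RESET, "absorption_seen"): Stage.BUILDING,
    (Stage.BUILDING, "threshold_met"): Stage.CONFIRMED,
    (Stage.BUILDING, "absorption_faded"): Stage.RESET,
    (Stage.CONFIRMED, "absorber_depleted"): Stage.EXHAUSTED,
    (Stage.EXHAUSTED, "cleared"): Stage.RESET,
}


def step(stage: Stage, event: str) -> Stage:
    """Return the next stage; an event that is not valid for the stage changes nothing."""
    return TRANSITIONS.get((stage, event), stage)


def run(events, start: Stage = Stage.RESET) -> Stage:
    """Fold a sequence of events through the machine and return the final stage."""
    stage = start
    for event in events:
        stage = step(stage, event)
    return stage
```

Ignoring invalid events means a "confirmed" reading cannot be reached without first passing through "building", which is the property a staged detector exists to guarantee.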

Liquidity

In plain terms. This maps the price levels institutions actually defend — and the traps they spring when the crowd piles in. It tells a real breakout from a fake one engineered to grab stops and reverse, so the engine doesn't chase a move built to trap it.

Tells a fake breakout (price grabs stops then snaps back) from a real one (grabs stops then keeps going)
Detects failed breakouts — price pokes through a key level but can't hold above it
Watches the levels that actually matter: yesterday's high/low, today's opening range, fair-value zones
If a symbol's levels stop behaving predictably, stop firing on that symbol

Prevents Chasing breakouts that were engineered to grab your stops and reverse.

⟨stop-hunt-panel · pipeline sweep-state⟩

±1 tick dedup

Market Profile

In plain terms. This tracks where the market spent time agreeing on price — and where it refused to. It shows where today's fair value sits and which way it's drifting, so the engine knows whether to expect a snap-back or a genuine shift in where price wants to trade.

Measures the first hour's range against the day's typical move — not against a fixed number
Tracks where most volume traded today and how the fair-value zone is shifting
Volume-weighted average price tracked from key moments — open, big news, session start
Yesterday's high and low survive restarts — never lose context after a reload

Prevents Betting on a snap-back when the market is actually shifting where it wants to trade.

⟨engine-ib-* · poc-tracker · vwap-context⟩

Dalton framework

Whale & OI

In plain terms. This reads the footprints institutions leave that retail never sees — repeated one-sided slamming, contracts added or closed overnight, and where the real size is sitting. It keeps the engine from fighting a level a big player is actively defending.

Spots bursts where the same side is repeatedly slamming the market
Watches how many contracts institutions added or closed overnight
Sees how thick the real buy and sell orders are at each price — not just how many there are
Recognizes when a big player is actively defending a price level on a pullback

Prevents Fighting positions that institutions are actively defending.

⟨whale-tracker · oi-tracker · aggressor-streak⟩

Quantower L2

Cross-Market

In plain terms. This checks whether all four index futures are pulling the same way — and who's leading the move. When NQ runs but the rest of the market quietly turns, this is what flags it before the engine trades a move the broader tape doesn't support.

Compares NQ · ES · YM · RTY every bar — reads the broader market picture
Who's pulling the move: tech, broad market, small-caps — or are they disagreeing?
Flags when one index makes a new high but its sibling doesn't — a classic warning
Detects when all four go quiet at once — often the calm before a real move

Prevents Trading a strong NQ while the rest of the market quietly rolls over.

⟨engine-cross-index · noa-cross-market⟩

4-symbol fabric

Macro Context

In plain terms. This is what the big-picture market is doing to your setup right now — fear, the dollar, and the calendar of major releases. The same setup can be a green light on a quiet day and a hard no in the minutes around a Fed decision; this is what tells them apart.

Tracks fear (VIX) and the US dollar (DXY) — and what they're saying today
Knows when Fed days, jobs reports, and inflation prints are about to hit
Stops firing signals in the minutes around major news releases
Same setup can be a buy on a weak-dollar day and a no-go on a strong-dollar day

Prevents Trading into Fed-day chaos, or fighting which way the dollar wants to go.

⟨engine-news-risk · macro_daily.json⟩

VIX · DXY · event

HTF · Volatility

In plain terms. This sets the bigger frame you're trading inside — how much the market typically moves today and what kind of day it is (trending, choppy, or reversing). It's what stops the engine from using calm-day tactics on a wild day, or the reverse.

Sizes targets relative to how much the market typically moves today — not against fixed numbers
Classifies the day: trending · choppy · reversing × calm · normal · loud
Knows Asian-session hours move differently and handles them separately
If you lose too much in one day, the day ends — full stop

Prevents Using choppy-day tactics on a trending day, or the other way around.

⟨daily-risk-gate · ib-day-snapshot⟩

3×3 regime grid

◆

Pillars L2

how the brains think — the cognitive primitives every racer inherits before adding its own conviction

7 CARDS

Inherited, not invented. Every brain inherits these pillars — confluence weighting, setup catalog, bias arbitration, risk plan, cadence. The 10 racers differ in which pillars they emphasize, which setups they whitelist, and when they hold fire — but the pillars themselves are common ground. Memory (formerly the MI layer) folds in here as the seventh pillar.

Institutional Confluence

In plain terms. This is the engine's single official scorer — it weighs every piece of evidence (order flow, market structure, location, the wider context) into one number, and downgrades signals that are really just saying the same thing twice. One place owns that math, so no two screens can disagree.

One score combining four things: order flow · market structure · location · wider context
Different signals matter more for different setups — weights are tuned per setup, not globally
When most evidence agrees, the few outliers get downgraded — they can't inflate the score
One place owns the math — no parallel copies allowed to drift

Prevents A high score that's really the same signal getting counted twice.

⟨rr-confluence · production-context⟩

minority × (1−d/140)

Setup Catalog

In plain terms. This is the library of trade patterns the engine is allowed to recognize — every family backed by published research, not hunches. When two patterns fire on the same trade it merges them, so the count reflects real, distinct bets rather than the same idea wearing two names.

Every family has academic research + experienced-trader literature behind it
When two setups fire on the same trade, they get merged — 15 fires really means ~10 distinct bets
A setup only counts as "proven" after at least 30 trades (fewer if research already supports it)
Plans to reweight overlapping setups are documented but locked until the data earns them

Prevents Shipping near-identical setups under different names and pretending they're independent.

⟨setup-classification · docs/setup_theses.md⟩

10 families

Bias Arbitrator

In plain terms. Before anything gets sized, this decides what kind of trade it even is — with the trend or against it — and refuses the muddy middle. When the read is clean you keep the edge; the fuzzy fallback path is the expensive one, and it's measured.

For every signal, picks one of two clean paths: continuation or reversal
When the read is muddy and falls through to the backup path, trades lose ~1.6× risk on average — measured
Going long and going short use different rules — they aren't mirror images
Decides what kind of trade this is before sizing it — never after

Prevents Mixed signals diluting what was a clean directional read.

⟨engine-bias-arbitrator.js⟩

Δ −1.6R fallback

Phase F · Risk Plan

In plain terms. This is the layer whose whole job is to say no — cap how much reward you chase per trade, block trades fighting the day's trend, and end the day when losses hit the limit. It exists so a great-looking setup can't talk you into an oversized bet.

Caps reward-to-risk per symbol — no lottery tickets: NQ at 2.5× max, ES at 4×
Hard-blocks trades that go against the day's trend
Hit the daily loss limit, the day ends — no exceptions
Currently advice only · enforcement activates in Automated Run

Prevents Oversized "lottery ticket" trades that come from gut, not data.

⟨daily-risk-gate · l4-risk-advisor⟩

RR cap NQ 2.5 / ES 4.0
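The per-symbol reward-to-risk cap can be expressed in a few lines. This is a sketch under assumptions: the caps (NQ 2.5, ES 4.0) come from the card, while `capped_target` and its arguments are hypothetical names, and the real gate may apply the cap differently (for example by rejecting the trade rather than pulling the target in).

```python
# Caps are from the card; the function and its signature are illustrative.
RR_CAP = {"NQ": 2.5, "ES": 4.0}


def capped_target(symbol: str, entry: float, stop: float, requested_target: float) -> float:
    """Pull a target back so reward never exceeds the symbol's reward-to-risk cap."""
    risk = abs(entry - stop)
    if risk == 0:
        raise ValueError("entry and stop must differ")
    cap = RR_CAP[symbol]  # KeyError for a symbol with no configured cap
    direction = 1 if requested_target >= entry else -1
    reward = min(abs(requested_target - entry), cap * risk)
    return entry + direction * reward
```

For example, an NQ long with 10 points of risk and a requested 50-point target is pulled back to 25 points, while an ES target already inside 4× risk passes through unchanged.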

Phase G · Additive Setups

In plain terms. These are four newer trade patterns, each grounded in published research rather than invention — playing index disagreement, gap retraces, overnight order build-up, and failed first breakouts. Same bar as everything else: no setup ships without real evidence behind it.

G1 · NQ and ES disagree (one makes a new high, the other doesn't) — trade the laggard
G2 · The open gaps, price retraces halfway back — fade the move
G3 · Overnight volume piles up one way — ride it into the morning (NY Fed paper)
G3b · London grabs liquidity, NY's first breakout fails — fade the failure

Prevents Inventing setups out of thin air with no real evidence behind them.

⟨production-context setup definitions⟩

NY Fed sr917

Cadence & Pause

In plain terms. This is the engine's sense of when to stay quiet — limiting how often each symbol fires so the screen doesn't fill with noise. Pausing a symbol only silences the alerts; behind the scenes it keeps watching and learning. One missed signal beats a stream of noisy ones.

Limits how often each symbol and setup can fire — no spam
Pausing a symbol stops the signals on screen, but keeps the engine watching and learning
Quiet on the screen, busy in the background — the engine never stops collecting data
One missed signal is far cheaper than a stream of noisy ones

Prevents Signal-spam that wears down your focus and your trust in the system.

⟨cadence-gate · symbol-pause⟩

silence = information

Memory · Market Intelligence

In plain terms. This is the engine's memory — it records the events, the episodes, and how they resolved, so any brain can ask "have I seen this before?" before committing. It turns a repeating market into evidence instead of a fresh surprise every time.

5 MI modules: events (CPI/NFP/FOMC) · episodes (high-variance windows) · outcomes (how each episode resolved) · store (durable IDB) · intelligence (the read API)
Every brain can query MarketIntelligence for context before committing — "this regime resolved BEARISH in the last 8 sessions"
3-minute per-type cooldown — repeated insights don't spam the narration channel
Was its own layer (MI) — folded into Pillars as the seventh because it's shared infrastructure, not a brain

Prevents The same regime fooling the system twice — memory turns event recurrence into evidence, not surprise.

⟨engine-noa-market-intelligence.js · IDB nqelite_mi_store_v1⟩

5 modules · durable · 3-min cooldown
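The 3-minute per-type cooldown can be sketched as a dictionary of last-emission times. The class and method names are hypothetical, and the choice that a suppressed insight does not restart the timer is an assumption; the real module is JavaScript and may behave differently.

```python
class TypeCooldown:
    """Allow at most one insight per type inside each cooldown window."""

    def __init__(self, cooldown_s: float = 180.0):  # 3 minutes, per the card
        self.cooldown_s = cooldown_s
        self._last_emit = {}

    def allow(self, insight_type: str, now_s: float) -> bool:
        last = self._last_emit.get(insight_type)
        if last is not None and now_s - last < self.cooldown_s:
            return False  # still cooling down; assumed not to reset the timer
        self._last_emit[insight_type] = now_s
        return True
```

Keying the timer by type means a repeated regime insight is muted while an unrelated event insight still gets through.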

▲

The Race L3

10 brains on one tape — the active battlefield (Phase 1). Same substrate, same pillars, different reads on what fires.

10 BRAINS

Same race, different reads. Every brain consumes the Substrate (L1) and inherits the Pillars (L2). They diverge on which setups they take, how strict their conviction gate is, and which regime cells they specialize in. Phase 1 asks: which racer's CI-lower clears +0.3R per trade on ≥100 fires across ≥2 regime cells, Holm-corrected across N=10? Winner snapshot freezes into Phase 2. Until then, all 10 race in shadow on the live tape.
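The Phase 1 question combines a multiple-comparison correction with a threshold on the CI lower bound. The sketch below shows the standard Holm-Bonferroni step-down procedure plus one possible reading of the gate. The thresholds (+0.3R, ≥100 fires, ≥2 cells) come from the text; `alpha=0.05`, the function names, the cell dictionary layout, and the interpretation that the 100 fires are counted per brain are all assumptions.

```python
def holm_reject(pvalues, alpha=0.05):
    """Holm-Bonferroni step-down: returns one reject/keep flag per hypothesis."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Smallest p is tested at alpha/m, the next at alpha/(m-1), and so on.
        if pvalues[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # first failure stops the procedure
    return reject


def clears_phase1(cells, total_fires, min_ci_low=0.3, min_fires=100, min_cells=2):
    """Assumed gate: enough fires, and enough Holm-significant cells above the CI floor."""
    passing = [c for c in cells if c["holm_significant"] and c["ci_low"] > min_ci_low]
    return total_fires >= min_fires and len(passing) >= min_cells
```

Under this reading, WCONS's trend_up cell (ci_low +0.166R, n=249 for the book) is Holm-significant but still short of the gate, because its lower bound sits below +0.3R and only one cell qualifies.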

🏆 Portfolio Scoreboard · Deduplicated · current handoff truth · all 10 racers · NOA leader · AGG losing book

+54.2R portfolio net · −$51,515 on stop-dist + commissions · 12,915 n

Brain	Kind	R	$ REAL	n	Phase 1 read
★ NOA	broad · self-discovery	+138.9	+$12,659	660	Real leader · NQ +100R · ES +39R · positive both symbols · Phase 1 contender
★ WCONS	weighted consensus	+117.4	+$11,452	249	NQ +90R · ES +27R · trend_up cell Holm-significant (n=63, ci_low +0.166R) · Phase 1 contender
PA	price-action lens	+6.1	−$16,615	1,534	R-vs-$ divergence · tier-sizing impact · NQ +56R / ES −50R · investigation flagged before any Phase 2 promotion
CONS	consensus	−1.3	−$238	440	Essentially breakeven · waiting for WCONS to graduate first
ANT-NOA	pre-arm head	−53.1	−$12,386	897	NQ −80R drags portfolio · counterfactual: strip top 5 leaks → would promote to NO_EDGE_YET
BRO	Brooks PA inheritance	−55.9	−$12,387	274	Negative both symbols · first auto-proposal: split by regime — wins trend_down, loses trend_up
AGG	broad aggressive	−97.9	−$34,001	8,861	Losing book · ES bleeds −707R · ES quarantine + regime heal shipped 2026-06-06 · counterfactual proves the gap is STRUCTURAL — leak quarantines alone don't graduate AGG
PRECISION	whitelist over AGG	new	—	0	8th racer · 9 documented-winner cells (5 NQ gold mines + 4 ES survivors) · install-date floor 2026-06-09 09:30 ET
LLQ	regime whitelist over AGG	new	—	0	9th racer · low_liquidity-only filter (the Holm-significant edge cell n=3540 ci_low +0.077R)
SCHL / SAVT	self-learning · no setup catalog	abstain	—	0	The only brains that learn the market themselves — Scholar (readable checklist) + Savant (learned representation). Train offline, ship frozen to browser. 2026-06-10: their lifetime-0 fires were a regime-bucket BUG (`_resolveRegime`→`'unknown'`→max CQL penalty→qLower −0.93), NOT "abstaining by design" — fixed (→ ProvenanceStamper). Now gates honestly; fires in trend/expansion. Still data-gated for retrain (≥6 RTH days, have 2).
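The headline figures can be re-derived from the seven books that have fires. The short check below uses only numbers from the table; the variable names are illustrative. The R and n totals match the headline exactly, and the dollar sum lands within $1 of the headline −$51,515, which is presumably per-book rounding.

```python
# (brain, R, realized $, n) copied from the scoreboard rows that have fires.
rows = [
    ("NOA", 138.9, 12659, 660),
    ("WCONS", 117.4, 11452, 249),
    ("PA", 6.1, -16615, 1534),
    ("CONS", -1.3, -238, 440),
    ("ANT-NOA", -53.1, -12386, 897),
    ("BRO", -55.9, -12387, 274),
    ("AGG", -97.9, -34001, 8861),
]

total_r = round(sum(r for _, r, _, _ in rows), 1)   # portfolio net in R
total_usd = sum(usd for _, _, usd, _ in rows)       # portfolio net in dollars
total_n = sum(n for _, _, _, n in rows)             # deduplicated fire count

print(f"{total_r:+.1f}R  {total_usd:+,}$  n={total_n:,}")
```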

★ NOA

In plain terms. The fleet's lead brain and its broadest — it hunts edge across every kind of setup rather than specializing, and discovers new patterns on its own instead of waiting to be told what to trade. Right now it's the only racer making money on both NQ and ES.

+138.9R · +$12,659 · n=660 deduplicated
Positive both symbols: NQ +100R · ES +39R
Pillar weighting + bias arbitration + auto-experiment graduation
Phase 1 contender — needs CI-lower clear at +0.3R on ≥2 regime cells

⟨engine-noa-*.js · 32 modules⟩

Phase 1 contender

★ WCONS

In plain terms. This one trades no setups of its own — it listens to all the other brains and takes a weighted vote, trusting each in proportion to how well it's done lately. A "wisdom of the crowd" brain that leans hardest on whoever's currently hot.

+117.4R · +$11,452 · n=249
NQ +90R · ES +27R · positive both
First Holm-significant cell in the fleet: trend_up n=63 ci_low +0.166R (p<0.001)
Phase 1 contender alongside NOA

⟨engine-wcons-*.js⟩

Phase 1 contender

PA

In plain terms. A pure price-action reader: it trades off the shape and quality of the bars themselves (clean structure, second-entry pullbacks), not order flow or news. The most-fired discretionary read on the tape, but its dollars and its R disagree sharply, so it is flagged for a look before it can advance.

+6.1R but −$16,615 · severe R-vs-$ divergence
NQ +56R · ES −50R — symbol asymmetry
n=1,534 — most-fired discretionary read
Sizing-architecture investigation flagged before Phase 2
Real conviction tier: fusion-bias agreement + regime alignment grade each fire 1–4 (was hardcoded tier-2 on every fire, polluting the band tape-wide)

⟨engine-price-action-state.js · price-action lens⟩

Divergent · investigate

CONS

In plain terms. The control experiment — a plain, equal vote of all the brains, nobody weighted. It exists to prove the weighted version (WCONS) is actually earning its keep: if simple averaging worked just as well, the weighting would be pointless.

−1.3R · −$238 · n=440
Essentially breakeven · the "no edge if we just average everyone" baseline
WCONS proves the weighting is doing real work · CONS proves it's necessary
Not a Phase 1 candidate — exists as the methodological control

⟨engine-cons-*.js⟩

control · breakeven

Anticipation Spine · 6 arms

In plain terms. Six brains that try to get in early — arming a trade just before the signal fully forms, each specialized in a different read (price action, Brooks, order flow, aggressive, consensus). Five of the six make money together; the original lead arm is the one that bleeds.

ANT-PA · +$12,838 · 80.3% WR · n=66 — standout · price-action pre-arm
ANT-BRO · +$8,628 · 76.1% · n=46 · Brooks-style pre-arm
ANT-OF · +$4,129 · 76.2% · n=21 · order-flow pre-arm
ANT-AGG · +$2,366 · 65.0% · n=60 · broad-aggressive pre-arm
ANT-CONS · +$1,668 · 100% · n=5 (small-n caveat) · consensus pre-arm
ANT-NOA · −$12,386 · 21.9% · n=897 — the losing arm (NQ −80R drag) · counterfactual: strip top 5 leaks → promotes to NO_EDGE_YET
Strict Holm: 0 spine cells survive family-wise correction · spine arms have noisier per-cell distributions but win in aggregate

⟨engine-anticipation-layer.js · engine-ant-{pa,bro,of,agg,cons}.js · arena pairs vs NOA reactive⟩

+$29k aggregate ⟨realized · shadow⟩ · 1 leader bleeds

BRO

In plain terms. Trades the classic Al Brooks price-action playbook — bar-by-bar reading, second entries, wedges, final flags. The catch: Brooks's hand-written odds are treated as starting assumptions, not facts, and they haven't survived the post-2025 market — it loses on both symbols.

−55.9R · −$12,387 · n=274
Negative both symbols · Brooks priors don't survive 2025+ regime
First auto-proposal: split by regime — wins on trend_down (+0.39R), loses on trend_up (−0.52R)
Brooks's explicit probabilities = Bayesian priors, not facts

⟨engine-noa-brooks.js v0.3 · 39 videos ingested⟩

Bleeding · regime-split candidate

AGG

In plain terms. The volume brain — it fires on every setup that clears the minimum bar, holding nothing back. That makes it the busiest and the biggest loser: it bleeds heavily on ES, and the gap is structural, not just a handful of bad setups.

−97.9R · −$34,001 · n=8,861 — the losing book
ES bleeds −707R alone · NQ +609R can't cover
ES quarantine shipped 2026-06-06 (3 setupId tuples) · regime heal recovered 1,647 labels
Counterfactual proves: structural edge gap — leak quarantines alone won't graduate it

⟨engine-leading-edge-shadow.js · SHADOW_CONFIG.quarantine⟩

Losing · MARGINAL_POSITIVE after heal

PRECISION

In plain terms. A sharpshooter built on top of AGG — instead of firing on everything, it only takes the handful of specific setup-and-symbol combinations that have actually proven to win. Brand new, still gathering its first trades.

9 (sym, setupId) cells: 5 NQ gold mines + 4 ES survivors
NQ: tape-climax-L/S · smt-cont-L/S · exhaust-rev-S
ES: tape-climax-L · smt-cont-L · overnight-cont-S · exhaust-rev-S
Install-date floor 2026-06-09 09:30 ET · no retro-credit · first Saturday read 2026-06-13

⟨engine-shadow-book-precision.js · install-date floor⟩

Accumulating · 0 fires

LLQ

In plain terms. A single-question experiment: does AGG do better if it only trades in thin, low-liquidity conditions — the one market state where its edge tested positive? It's AGG with one regime filter, there to settle that question.

Asks: does AGG-restricted-to-low_liquidity outperform AGG-broad?
Built on the Forge's Holm-significant edge cell: low_liquidity n=3540 ci_low +0.077R
Same install-date floor as PRECISION
Hypothesis instrument — tests whether regime-filter alone is enough

⟨engine-shadow-book-llq.js⟩

Accumulating · regime test

SCHL + SAVT · The Self-Learning Brains

B10

The only brains in the fleet that learn the market themselves — no setup catalog, no hand-coded pillars, no human-written rules. They look at the data and discover their own edge.

Every other brain (NOA · BRO · AGG · PA · etc.) trades setups we designed. SCHL + SAVT design their own.
SCHL — the Scholar: discovers edge as a readable checklist. We can audit what it learned and why. Asks: can edge be NAMED?
SAVT — the Savant: discovers edge as a learned feature representation. Cannot explain itself in words — it just feels the pattern. Asks: can edge be FELT?
Two paradigms, one race — they're testing whether trader edge is something we can name or only something the machine can feel.
Safe by construction: trained wild offline in Python; a frozen snapshot ships to the browser. They learn in the lab, never on live capital.
Data-gated for retrain (≥6 RTH session-days of arena tapes, have 2). They'd rather stay silent than fire on a half-trained policy.
⚠ 2026-06-10 bug fix: their lifetime-0 fires were NOT "abstaining by design" — a regime-bucket bug (`_resolveRegime` returned `'unknown'`, a key absent from the CQL table) pinned the conservatism penalty at max (1.0) → qLower ≈ −0.93 on every candidate, strangling a validated +0.087R edge. Fixed (→ `ProvenanceStamper.getRegimeKey` + `regimeAtFire` stamp); qLower −0.93 → +0.0147. They now gate honestly and fire in trend/expansion where qHat clears the floor.

⟨engine-noa-solo.js · scholar_policy_live.json · savant_policy_live.json⟩

pure self-learning · regime-bucket fixed 06-10
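The bug described above is a lookup whose missing-key fallback silently selects the worst case. The Python sketch below illustrates that failure mode and a fail-loud alternative. Everything in it is an assumption made for illustration: the real code is JavaScript, the penalty values are invented, and treating qLower as qHat minus the penalty is a simplification the source does not state.

```python
# Hypothetical penalty table; the keys and values are illustrative only.
CQL_PENALTY = {"trend": 0.05, "expansion": 0.07, "range": 0.40}
MAX_PENALTY = 1.0


def q_lower_silent(q_hat: float, regime_key: str) -> float:
    """Buggy pattern: an unrecognized key quietly receives the maximum penalty."""
    return q_hat - CQL_PENALTY.get(regime_key, MAX_PENALTY)


def q_lower_strict(q_hat: float, regime_key: str) -> float:
    """Fail-loud pattern: an unrecognized key is an error, not a worst-case score."""
    if regime_key not in CQL_PENALTY:
        raise KeyError(f"regime key {regime_key!r} is absent from the penalty table")
    return q_hat - CQL_PENALTY[regime_key]
```

With the silent version, a resolver that returns `'unknown'` pushes every candidate far below any firing floor and the result looks like deliberate abstention; the strict version surfaces the same mistake on the first call.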

🏃 Run modes — same for all 10 brains

🧊 Cold · CALIBRATION — shadow capture, no signals shown. A tool, not the default.

🔥 Hot · NOW · HOT — every brain fires real signals visibly, Paper Trader runs on screen. No real account.

🤖 Automated · LIVE — graduated brain executes against real broker (Phase 3 ladder).

⬠

The Lab + The Forge L4

how the system discovers, governs, and calibrates itself — the Lab measures, the Co-Pilot watches and acts, the Forge stamps + ranks findings, the Constitution fences the layers, and the calibration loop grounds the roster

15 CARDS

Three instruments, one loop.

The Lab (docs/research_plan.html) is where measurement happens: three tabs (Measure · Optimize · Discover) read the deduplicated truth and answer "what does the data say this week?"

The Co-Pilot (docs/copilot.html) is the 6-faced sentinel: Hunter / Cross-Brain find leaks · Scout / Oracle find rising edges · the scanner self-schedules at 09:25 + 16:05 ET weekdays + a Saturday-night discovery slot (14:05 ET) · the AUTO watcher acts on graded-eligible findings (money-path permanently blocked).

The Forge (docs/optimization.html) is where findings get persisted: cell-significance scans run with Holm-Bonferroni + session-clustered bootstrap CI, lifecycle stamps each finding (NEW / PERSISTENT / CONFIRMED / DROPPED), counterfactual ranking tests "would stripping this leak graduate the book?", auto-emit ships paste-ready quarantine patches + specialist-book stubs, and the Synthesizer composes regime-route policy candidates against the OOS gauntlet.

The Discovery Lab (claude/auto_audit.py + lab_investigator.py + lab_features.py) is the autonomous quant researcher. It invents thousands of hypothesis cells per sweep across 30 dimensions (18 recorded, incl. the standing watches: news event/phase/aftertaste, roll window, institutional alignment, payout-multiple; + 12 derived senses: MFE/give-back/round-trip/heat/life/risk/RR bands, cross-book agreement, follower, regime alignment, streak context), screens them through BH-FDR + parent-lift, tortures survivors with an 11-test interrogation battery (strip-best, day-consistency, R-vs-$ money-truth, composition, cost floor…), grades each with a verdict ladder (CONFIRMED→DEAD) + evidence score 0–100, auto-freezes forward tests, runs the operator's question-templates forever, and briefs the Co-Pilot in desk language.

Two gears: sentinel daily (forward-test health + edge-rot, quiet unless burning) · full discovery every Saturday night (findings land while the market is closed).
It discovers, interrogates, and talks — it wires nothing. The system has memory of its own discoveries and a sentinel watching them.

🧪 The Lab · docs/research_plan.html · measurement + research + discovery surface · 5 cards

Measure Tab · Standings

In plain terms. The weekly report card — it ranks every brain on how it's actually doing, broken down by market condition, and only calls an edge real once the statistics clear a strict bar. This is where you look to answer "who's winning, and is it real?"

Per-book aggregate: n, mean R, PF, win rate, session-clustered bootstrap CI
9-cell heatmap: regime × outcome — colors graduate by sample density
Phase 1 tier: GRADUATION_LIKELY / MARGINAL_POSITIVE / NO_EDGE_YET / MARGINAL_NEGATIVE / QUARANTINE_LIKELY
Deep slices: per-setup, per-symbol, per-hour, per-session — drill on any racer
6-check pipeline at the bottom enforces a single-source-of-truth read

⟨research_plan.html · #measureView · v20260607-r6⟩

CI-lower · Holm · 9-cell grid
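A session-clustered bootstrap resamples whole sessions rather than individual trades, so that trades from the same day stay together and within-session correlation does not shrink the interval artificially. The sketch below uses the percentile method, which is an assumption; the function name, defaults, and seed handling are illustrative, and the Lab's actual estimator is not shown in this atlas.

```python
import random


def clustered_bootstrap_ci(sessions, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for mean R, resampling whole sessions with replacement.

    sessions: list of non-empty lists, one list of per-trade R values per session.
    """
    rng = random.Random(seed)          # seeded so the interval is reproducible
    k = len(sessions)
    means = []
    for _ in range(n_boot):
        sample = [sessions[rng.randrange(k)] for _ in range(k)]
        total = sum(sum(s) for s in sample)
        count = sum(len(s) for s in sample)
        means.append(total / count)
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

The first element of the returned pair plays the role of the "CI-lower" figure quoted per cell; with only a handful of sessions the interval widens sharply.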

Optimize Tab · The Race

In plain terms. The workshop view — it shows the live race, how close each brain is to graduating, and which leaks are worth fixing first. From here you jump to the Forge for the deep dig.

The Race: top quarantine candidates ranked by counterfactual ΔR — promotion-ready first
Graduation Ladder: per-book %s derived from verdict tier + would-promote flag
Calibration Bay: per-book aggregate + top 6 leaks + top 4 experiments + living loop
🔧 Links to the Forge (`docs/optimization.html`) for deeper hypothesis drill-down
ARENA Phase 2(b) LIVE (2026-06-08): confirmation × exit-policy counterfactual cells from resolver_vm matrix replay across 36k ARM tapes — all 5 structural anticipation arms (ANT-PA · AGG · CONS · BRO · OF), brain-matched (each brain's confirmations replay only on its own tapes → genuinely distinct cells, e.g. PA +0.07/−0.03 vs AGG +0.04/+0.01). CI-backed. ANT-NOA out (learned pre-arm path). Historical cells use midPx-synthesized bars; live v1.1 capture adds engine bars + the 4 flow confirmations going forward.

⟨research_plan.html · #optimizeView · build_races_and_ladder() · trigger_arena_replay.py · build_arena_proper()⟩

races + ladder + cal bay + ARENA Phase 2(b)

Discover Tab · The Learn Pane

In plain terms. The "what did we get wrong" tab — it ranks the week by where the system was most confidently mistaken, so the biggest surprises rise to the top instead of hiding inside the average.

Surprise rank: rank everything by how wrong our prior was — biggest deltas first
Confidence × wrongness: high-conviction misfires get the loudest signal
Regime mix shifts: detect when the distribution moves under us
Blind spots: regimes where n is too small to trust either way
Live now: it was rendering fixture data for an unknown period (caught 2026-06-06)

⟨research_plan.html · #discoverView · j.lab.discover.surprise⟩

surprise · wrongness · blind spots

Epistemic Suite · 5 tools

In plain terms. A set of five research tools whose only job is to find what we don't yet know — what surprised us, where we're blind for lack of data, and which next trade would teach us the most. It proposes; it never trades.

surprise_rank.py · what was most surprising vs prior
blind_spots.py · regimes / setups with insufficient evidence
auto_experiment.py · pre-registered shadow tests, no fishing
unknown_clusters.py · clusters of unnamed edge
active_learning.py · what next fire would teach us most
Fail-closed below the 10-session floor · discovery proposes, pre-reg disposes

⟨claude/epistemic_common.py · 5 tools · EPISTEMIC_LAYER_DESIGN.md⟩

discovery engine · fail-closed

Discovery Lab · Auto-Audit

In plain terms. The autonomous researcher — it dreams up thousands of possible edges, assumes each is fake until proven otherwise, puts the survivors through a brutal test battery, and reports back in plain desk language. It investigates and talks; it changes nothing on its own.

Hypothesis generator: ~4,100 cells/sweep over 21 dims (9 recorded + 12 derived senses incl. cross-book agreement + streak) + 60 seeded deep probes
Screen: BH-FDR (q=0.10) + parent-lift — a child cell must BEAT its parent or it's the parent's edge in a costume
11-test battery → reason chain: strip-best trade/day · day/regime consistency · cross-ticker · cost floor · outlier · composition · R-vs-$ money-truth
Verdict ladder CONFIRMED/PROMISING/NEW/THIN/DIVERGENT/ARTIFACT/DEAD (edge-rot) + evidence 0–100 · counterfactual Δ · auto-frozen forward tests
Question templates: every operator question becomes a permanent investigation (10 live)
Two gears: sentinel daily (gate health, quiet) · full sweep Saturday night — desk brief pinned in the Co-Pilot WATCH zone
Outcome features (MFE/give-back/…) = lookahead → DIAGNOSTIC only, never tradable edges

⟨claude/auto_audit.py · lab_investigator.py · lab_features.py · question_templates.json⟩

presumed false until proven · 2 gears
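A minimal sketch of the screen step described above, assuming each candidate cell already carries a p-value and a mean R. The function names, the input shapes and the `min_lift` parameter are hypothetical and are not taken from `auto_audit.py`; only the BH-FDR procedure at q=0.10 and the "child must beat its parent" rule come from the card.

```python
from typing import Dict, List


def bh_fdr_survivors(p_values: Dict[str, float], q: float = 0.10) -> List[str]:
    """Benjamini-Hochberg step-up: return the cells that survive at FDR level q."""
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= q * rank / m:
            cutoff = rank  # remember the largest rank that passes
    return [cell for cell, _ in ranked[:cutoff]]


def beats_parent(child_mean_r: float, parent_mean_r: float,
                 min_lift: float = 0.0) -> bool:
    """Parent-lift: a child cell only counts if it improves on its parent.

    min_lift is a hypothetical tunable; the card only says the child must BEAT the parent.
    """
    return child_mean_r - parent_mean_r > min_lift


# Illustrative numbers only
p_vals = {"a": 0.001, "b": 0.02, "c": 0.04, "d": 0.5}
survivors = bh_fdr_survivors(p_vals)
```

With four cells the step-up thresholds are 0.025, 0.05, 0.075 and 0.10, so the first three cells pass the screen and the fourth does not.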

🐙 The Co-Pilot · docs/copilot.html · 6-faced sentinel: finds problems, recommends fixes, acts in one click · 1 card

🐙 The Octopus · 6 faces

In plain terms. The watchdog that doesn't just measure the brains — it hunts for problems and acts. Six "faces": two sniff out leaks, two spot edges that are rising, one runs the scans on a schedule, and one is allowed to act on findings that have earned it — never anything touching real money.

Hunter · finds bleeds → daily bleed / weekly cell bleed / engine paused / parent-silent-while-arm-profits
Cross-Brain · catches when one brain bleeds while a sibling profits on the same setup
Scout · finds things getting BETTER → FIX_VALIDATED · EDGE_RISING · GRADUATION_CANDIDATE
Oracle · warns about what's coming → CELL_DRIFT_WARN · REGIME_SHIFT · BRAIN_CORRELATION_SPIKE
Scanner self-scheduler · runs copilot_scan.py at 09:25 + 16:05 ET weekdays (ET-anchored, DST-safe, boot-catch-up)
AUTO watcher · armed; per action type opt-in; eligibility gate ≥20 grades + ≥90% hit-rate; money-path PERMANENTLY blocked at two layers

⟨engine-noa-analyst.js (façade, 438 LOC) → engine-noa-findings-bus + engine-noa-analyst-tier1/tier2/tier3 + engine-noa-bridge · engine-noa-actions.js v0.2.0 · claude/shadow_race_disk_server.py v0.4.0 · /scanner_status⟩

6/6 faces live · 2 scheduled scans/day · analyst decomposed 2026-06-09 (75K→18K core)
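The AUTO watcher's eligibility gate can be sketched as below, assuming a per-action-type record of graded outcomes. The `ActionRecord` shape and field names are hypothetical; the thresholds (≥20 grades, ≥90% hit-rate), the per-type opt-in and the unconditional money-path block are the rules stated on the card.

```python
from dataclasses import dataclass

MIN_GRADES = 20       # from the card: at least 20 graded outcomes
MIN_HIT_RATE = 0.90   # from the card: at least 90% hit-rate


@dataclass(frozen=True)
class ActionRecord:
    """Hypothetical record shape for one action type."""
    action_type: str
    graded: int
    hits: int
    opted_in: bool
    touches_money_path: bool


def auto_eligible(rec: ActionRecord) -> bool:
    """Return True only when the action type has earned autonomy."""
    if rec.touches_money_path:
        return False  # blocked no matter how good the track record is
    if not rec.opted_in:
        return False  # opt-in is per action type
    if rec.graded < MIN_GRADES:
        return False  # not enough evidence yet
    return rec.hits / rec.graded >= MIN_HIT_RATE
```

Checking the money-path flag first means no later branch can accidentally re-enable a money-touching action.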

🔬 The Forge · docs/optimization.html · instrument with memory: scan, classify, persist, surface, compose · 8 cards

Cell-Significance Scanner

In plain terms. The lie-detector for edges — it takes every brain-and-condition slice and asks whether its profit could just be luck, using strict statistics and a held-back chunk of data to confirm the edge survives out of sample.

5 classifications: gold_mine · edge_cell · severe_leak · bleed_cell · dead_weight
Holm-Bonferroni across the family of cells per book — multiple-testing honest
Session-clustered bootstrap for the CI — same-session fires aren't independent
Walk-forward holdout (2026-06-07) — newest 30% of fires by ts held out; each hyp gets holdout_passes. First scan caught a NOA edge_cell overfit (train +0.388R → holdout −0.659R, sign flipped)
Regime-rotation expectancy (2026-06-07) — regime_weighted_meanR + regime_weighted_usd_per_week; cells whose edge sits in now-rare regimes get marked down
Temporal stability + $-impact ranking — surface the load-bearing leaks first

⟨claude/cell_significance.py · scan() + build_arena_early() + 4 gates⟩

Holm · cluster CI · holdout · regime-weighted
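Two of the scanner's gates can be sketched in a few lines, assuming each cell has a p-value and each fire has a `ts` field. The function names, the `alpha` level and the sign-agreement rule for `holdout_passes` are assumptions made for illustration; the card specifies only Holm-Bonferroni per book and a holdout of the newest 30% of fires.

```python
from typing import Dict, List, Tuple


def holm_rejections(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Holm-Bonferroni step-down over one family of cells (alpha is an assumed level)."""
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ranked)
    rejected = []
    for i, (cell, p) in enumerate(ranked):
        if p > alpha / (m - i):
            break  # the first failure stops every later test
        rejected.append(cell)
    return rejected


def walk_forward_split(fires: List[dict],
                       holdout_frac: float = 0.30) -> Tuple[List[dict], List[dict]]:
    """Hold out the newest fraction of fires by timestamp."""
    ordered = sorted(fires, key=lambda f: f["ts"])
    n_holdout = int(round(len(ordered) * holdout_frac))
    cut = len(ordered) - n_holdout
    return ordered[:cut], ordered[cut:]


def holdout_passes(train_mean_r: float, holdout_mean_r: float) -> bool:
    """Assumed rule: the holdout mean must keep the sign of the train mean."""
    return train_mean_r * holdout_mean_r > 0


fires = [{"ts": t, "r": 0.1} for t in range(10)]
train, holdout = walk_forward_split(fires)
```

Under this assumed rule the NOA example on the card (train +0.388R, holdout −0.659R) fails the holdout, because the sign flipped.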

Hypothesis Lifecycle

In plain terms. The Forge's memory — every edge it finds gets tagged (new, holding, confirmed, or dead) so the system remembers what it discovered and notices the moment an edge stops being true.

Lifecycle: 🆕 NEW (absent from prior) · 🔁 PERSISTENT (in prior + current, < 3 dates) · ✓ CONFIRMED (≥3 distinct ET dates) · ❌ DROPPED (in prior, gone from current)
Drift status (2026-06-07): on first sighting, snapshot install_meanR / install_ci_low / install_usd_per_week. Each later scan computes sign-preserving drift_ratio against the snapshot
STABLE (≥0.95) · SLIPPING (0.7-0.95) · DRIFTING (<0.7) · INVERTED (sign flipped) · DEAD (CI now brackets zero) · NEW (no baseline)
Loudest alarm = CONFIRMED + DEAD — was real for ≥3 scan-dates, significance now lost (the "silently dying edge" case)
Distinct-scan-date counting prevents 50 Monday loads falsely confirming a Monday-only finding

⟨claude/hyp_history.py · install snapshot + drift_status · current.json · scans.jsonl · actions.jsonl⟩

4 lifecycle · 6 drift · CONFIRMED+DEAD = loudest
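The lifecycle and drift labels above can be sketched as two small classifiers. This assumes `drift_ratio` is current mean R divided by the install snapshot, and it assumes a precedence order (NEW, then DEAD, then INVERTED, then the ratio bands); neither detail is stated on the card, and the function names are hypothetical rather than those in `hyp_history.py`.

```python
from typing import Iterable, Optional


def lifecycle(in_prior: bool, in_current: bool, scan_dates: Iterable[str]) -> str:
    """Classify a hypothesis by presence and by DISTINCT ET scan dates."""
    if in_prior and not in_current:
        return "DROPPED"
    if not in_prior:
        return "NEW"
    # A set collapses repeated loads on the same date into one sighting.
    return "CONFIRMED" if len(set(scan_dates)) >= 3 else "PERSISTENT"


def drift_status(install_mean_r: Optional[float], current_mean_r: float,
                 ci_low: float, ci_high: float) -> str:
    """Compare the current scan against the install snapshot."""
    if install_mean_r is None or install_mean_r == 0:
        return "NEW"  # no usable baseline
    if ci_low <= 0.0 <= ci_high:
        return "DEAD"  # CI now brackets zero
    if install_mean_r * current_mean_r < 0:
        return "INVERTED"  # sign flipped
    ratio = current_mean_r / install_mean_r  # same-sign means give a positive ratio
    if ratio >= 0.95:
        return "STABLE"
    if ratio >= 0.7:
        return "SLIPPING"
    return "DRIFTING"
```

Because the ratio is taken between same-sign means, a leak (negative install mean) that shrinks toward zero is reported as slipping or drifting in the same way an edge would be.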

Counterfactual Ranking

In plain terms. Asks the "what if we fixed this" question — if we stopped a brain from taking its worst leak, would it actually graduate to a better grade? Ranks the leaks by how much fixing each one would help.

Per-leak: simulates strip + recomputes aggregate ci_low via session-clustered bootstrap
Maps ci_low → Phase 1 tier · stamps counterfactual_would_promote: bool
Identifies counterfactual_promotion_depth — first depth where book promotes
The load-bearing finding: AGG's edge gap is STRUCTURAL — strip all 6 leaks → still MARGINAL_NEGATIVE
ANT-NOA: strip 5 → would promote · PA: 1 leak insufficient

⟨cell_significance._compute_counterfactual_path()⟩

structural gap detector
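The strip-and-recompute step can be sketched as follows, assuming each fire is a dict with `session`, `cell` and `r` keys. The bootstrap resamples whole sessions so that same-session fires stay together. All names, the percentile method, the resample count and the `promote_threshold` argument are illustrative assumptions, not the behaviour of `_compute_counterfactual_path()`.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional


def clustered_ci_low(fires: List[dict], n_boot: int = 2000,
                     alpha: float = 0.05, seed: int = 7) -> Optional[float]:
    """Lower percentile bound of mean R, resampling sessions with replacement."""
    by_session = defaultdict(list)
    for f in fires:
        by_session[f["session"]].append(float(f["r"]))
    sessions = list(by_session.values())
    if not sessions:
        return None
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    means = []
    for _ in range(n_boot):
        drawn = [rng.choice(sessions) for _ in sessions]
        pooled = [r for session in drawn for r in session]
        means.append(sum(pooled) / len(pooled))
    means.sort()
    return means[int(alpha / 2 * n_boot)]


def counterfactual_strip(fires: List[dict], leak_cell: str,
                         promote_threshold: float) -> Dict[str, object]:
    """Remove one leak cell and report how the aggregate CI-lower moves."""
    kept = [f for f in fires if f["cell"] != leak_cell]
    after = clustered_ci_low(kept)
    return {
        "ci_low_before": clustered_ci_low(fires),
        "ci_low_after": after,
        "counterfactual_would_promote": after is not None and after > promote_threshold,
    }


# Synthetic fires: six identical sessions, each with two good fires and one leak fire
fires = []
for s in range(6):
    fires += [{"session": s, "cell": "good", "r": 0.5},
              {"session": s, "cell": "good", "r": 0.3},
              {"session": s, "cell": "leak", "r": -1.0}]
result = counterfactual_strip(fires, "leak", promote_threshold=0.3)
```

Repeating the strip for the worst leak, then the worst two, and so on gives the depth at which a book would first promote. If no depth promotes, the gap is structural, as the card reports for AGG.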

Dead-Weight Detection

In plain terms. Finds the trades that are pure noise — lots of volume, zero edge either way. Not losers exactly, just dead weight that piles on risk and cost without adding any profit.

The OPPOSITE pattern from edge/leak — not bleeding, not edge, just commission burn + variance source
The card's body text projects the commission burn over the on-disk data window
Independent of Holm — these don't compete for significance, they're confirmed null
First-run: 0 cells qualify (CIs too wide at 7 sessions) · framework in place for when disk grows

⟨cell_significance._classify_dead_weight()⟩

5th classification

Auto-Emit · Patches + Stubs

In plain terms. Turns a finding into ready-to-paste code — closing the gap between "we spotted a leak" and "we shipped the fix." The Forge doesn't just point at the problem; it hands you the patch.

Quarantine patch: AGG (sym,setupId) → 1-line tuple · regime → 2-path guidance
Specialist book stub: 99-line LLQ/PRECISION-pattern module + persistence entry
📋 Copy buttons per code block (clipboard API)
Workflow: "see finding (1 min) → review + paste (3-5 min)" — 6× speedup vs. hand-writing
Never auto-writes JS files — emitter outputs text in JSON, rendered as HTML

⟨cell_significance._emit_quarantine_patch() · _emit_specialist_book_stub()⟩

6× workflow speedup

Radial Examiner

In plain terms. A one-brain X-ray — slice it any single way (by setup, by hour, by market type) and see exactly what makes that brain win or lose, value by value.

6 inner-ring conditions: setupId · regime · symbol · direction · convictionTier · exit
16 outer-ring context fields (VIX regime · gamma · HTF align · M1 episode · CVD trend · flow bias)
Session-clustered CI per value · 'pos' / 'neg' / 'inc' classification
First auto-surfaced proposal: BRO — split by regime (trend_down wins, trend_up loses)

⟨cell_significance.build_examiner() · renderExaminer()⟩

6 inner · 16 outer (pending)

Discoveries Chip + Dropped Archive

In plain terms. The Forge's alert light — it waves when something new shows up and keeps a memory of what faded away, so a real find isn't missed and a dead one isn't chased twice.

Top-right chip: 🔬 polls /discoveries every 5 min
Amber pulse when N new hyps since last seen scan_date
Green steady when confirmed hyps exist · grey faint when idle/offline
Click → opens Forge in new tab + clears the alarm (last-seen scan_date stamp)
Dropped archive panel below hyp grid: last 15 hyps with kind, $/wk, lifecycle, stability

⟨assets/js/engine-discoveries-chip.js · /hyp_action POST · dropped.jsonl⟩

5-min poll · 4 states · archive

★ The Synthesizer · 5-stage gauntlet

In plain terms. Doesn't just pick a winning brain — it builds a combined strategy that routes to different brains by market condition, then puts that blended result through a tough out-of-sample gauntlet before it can be called ready.

Stage 1: per-cell stats with session-clustered bootstrap CI on resolved fires (scanner quarantines applied)
Stage 2: regime route — per (sym × regime) pick the book with highest CI-lower at n≥30 + mean R ≥ +0.05R
Stage 3: correlation dedup — buckets aligned fires by (sym, ~5-min, direction) and flags pairs aligning ≥3×/week
Stage 4: OOS gauntlet on the COMPOSED stream — walk-forward + session-clustered CI on the OOS pool, not on the parts
Stage 5: frozen artifact mirroring the Cerberus policy schema generalized
First live run: 14,828 outcomes / 21d → 119 cells → 6 routes → 396 composed historical fires → 100% folds positive → OOS CI-lower +0.626R per trade → PROMOTION_READY
TRADABLE CUTOVER (2026-06-10): the composer now routes only on HOLDABLE fires — each book as a real netting account (≤1 position per FAMILY risk slot — NQ/MNQ one NASDAQ slot, ES/MES one SP slot), via shadow_tape.tradable_per_book (TRADABLE_ONLY). Routing on the raw stacked tape promoted books whose edge is un-holdable fantasy.
The rerank: 9 → 7 routes. AGG (raw +1037R but tradable −334R — fires 16,921×, 730 holdable, those lose) and PRECISION (+1288R → +52R, 96% fantasy) REMOVED from the strategy. New routes → ANT-OF · ANT-PA · ANT-CONS · WCONS (the tradable leaders; WCONS ×3). Still PROMOTION_READY, OOS CI-lower +0.550 — the edge survives the honesty filter, the fantasy doesn't.
⚠ Caveat: pre-haircut. PROMOTION_READY = backtest verdict, not deployment authorization. Phase 1 Trust Calibration measures what survives a real broker — see L6.
Independently corroborated 2026-06-10 by the Twin Timing Edge instrument (`claude/twin_timing_edge.py`), a matched-pair method that pairs each reactive brain with its anticipation twin on the SAME setup (via `coFireClusterId`) and measures ΔR on the honest co-fired subset. It found the SAME routing the synthesizer did (route to ANT arms where they win; PA/ANT-PA +0.45R [+0.29,+0.60] n=409), with ZERO conflicts on diff: two independent methods converge. Reactive brains are kept as the control arm; the twin-table is a corroborator/watchdog, NOT an override engine. Finding: the edge is a better entry RULE, not earlier timing, and conditioning flips signs (NOA loses overall but wins +0.83R in expansion).

⟨claude/synthesizer.py · docs/synthesizer.html · synthesizer_policy_v1.json · claude/twin_timing_edge.py · twin_timing_table.json⟩

TRADABLE CUTOVER · +0.550R CI-lower · 7 routes · AGG+PRECISION removed · pre-haircut
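Stage 2 of the gauntlet is a small selection rule, sketched here under the assumption that Stage 1 yields one stats dict per (book × sym × regime) cell. The dict keys and the function name are hypothetical; the n≥30 floor, the +0.05R mean floor and the "highest CI-lower wins" rule come from the Stage 2 bullet.

```python
from typing import Dict, List, Tuple


def route_regimes(cells: List[dict], min_n: int = 30,
                  min_mean_r: float = 0.05) -> Dict[Tuple[str, str], str]:
    """Per (sym, regime), pick the qualifying book with the highest CI-lower."""
    best: Dict[Tuple[str, str], dict] = {}
    for c in cells:
        if c["n"] < min_n or c["mean_r"] < min_mean_r:
            continue  # too thin or too weak to be routed to
        key = (c["sym"], c["regime"])
        if key not in best or c["ci_low"] > best[key]["ci_low"]:
            best[key] = c
    return {key: c["book"] for key, c in best.items()}


# Illustrative cells, not real scan output
cells = [
    {"book": "ANT-PA", "sym": "NQ", "regime": "trend_up", "n": 45, "mean_r": 0.20, "ci_low": 0.08},
    {"book": "WCONS", "sym": "NQ", "regime": "trend_up", "n": 60, "mean_r": 0.15, "ci_low": 0.10},
    {"book": "AGG", "sym": "NQ", "regime": "trend_up", "n": 12, "mean_r": 0.90, "ci_low": 0.40},
    {"book": "ANT-OF", "sym": "ES", "regime": "range", "n": 80, "mean_r": 0.03, "ci_low": 0.01},
]
routes = route_regimes(cells)
```

Ranking on CI-lower rather than mean R lets a steadier book beat one with a higher but noisier average, and a regime with no qualifying book simply gets no route.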

💰 Tradable Projection · the anti-lie

In plain terms. The honesty check on profit — a single brain is a signal lab, and adding up overlapping signals as P&L is a lie a real account could never have held. This counts only the trades you could actually have taken, in real dollars.

The rule: ≤1 position per FAMILY risk slot (NQ/MNQ = one NASDAQ slot, ES/MES = one SP slot — a micro is a size rung of the slot, never a second position; risk-slot law 2026-06-12) · no hedge. Interval-overlap netting walk over the time-ordered fires.
One tape, two projections: raw (every fire = edge sample, for discovery) + tradable (what an account could hold, for trust/routing). The raw tape is untouched.
The lie quantified (per-book raw → tradable): PRECISION +1288 → +52 (96% fantasy) · AGG +1037 → −334 (FLIPS NEGATIVE) · NOA +217 → +22 · WCONS +166 → +131 (most honest, 21% blocked). It cuts both ways — PA/BRO score HIGHER tradable (stacking HID their edge). It reranks the race.
SSOT lockstep: BookCanon.tradable (JS, live cards) ↔ shadow_tape.tradable (Python, canonical — reads real direction + resolveTs). The cutover feeds the Synthesizer (F8).
Staged: dashboard headline flip (needs JS rows enriched w/ dir+resolveTs — stripped rows over-block high-freq books) · trust pipeline (separate Plane-B path) · graduation firing-gate (a brain literally holds ≤1/ticker once promoted).

⟨assets/js/engine-book-canon.js · claude/shadow_tape.py · claude/synthesizer.py⟩

netting projection · AGG +1037→−334 · reranks the race
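The interval-overlap netting walk can be sketched as below, assuming each fire carries `sym`, `ts`, `resolve_ts` and `r`. The field names and the boundary rule (a fire starting exactly at the prior resolve time is allowed) are assumptions; the family-slot mapping and the one-position-per-slot rule come from the card.

```python
from typing import Dict, List

# Family risk slots from the card: a micro shares the slot of its full-size contract.
FAMILY = {"NQ": "NASDAQ", "MNQ": "NASDAQ", "ES": "SP", "MES": "SP"}


def tradable_projection(fires: List[dict]) -> List[dict]:
    """Keep only fires a netting account could have held (at most one per family slot)."""
    busy_until: Dict[str, float] = {}
    held = []
    for f in sorted(fires, key=lambda x: x["ts"]):
        slot = FAMILY[f["sym"]]
        if slot in busy_until and f["ts"] < busy_until[slot]:
            continue  # slot occupied; direction is irrelevant, so no hedge either
        busy_until[slot] = f["resolve_ts"]
        held.append(f)
    return held


# Illustrative tape
fires = [
    {"sym": "NQ", "ts": 0, "resolve_ts": 10, "r": 1.0},
    {"sym": "MNQ", "ts": 5, "resolve_ts": 8, "r": 2.0},   # blocked: NASDAQ slot busy
    {"sym": "ES", "ts": 5, "resolve_ts": 20, "r": -1.0},  # held: SP slot free
    {"sym": "NQ", "ts": 10, "resolve_ts": 15, "r": 0.5},  # held: slot freed at ts=10
]
raw_r = sum(f["r"] for f in fires)
tradable_r = sum(f["r"] for f in tradable_projection(fires))
```

The input list is never mutated, which matches the "one tape, two projections" rule: the raw tape stays available for discovery while the filtered view feeds trust and routing.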

🧬 Meta Thesis Engine

MB-T

In plain terms. Shifts the unit of thinking from "a trade" to "a thesis" — one living directional idea per market, born from the evidence, scored continuously, and killed when it stops being true. Born, lives, dies, learns.

The hard bound: ≤1 open position per FAMILY risk slot (NQ/MNQ, ES/MES — micros are size rungs of the slot, never a second position) · no hedge, intra-ticker or across the family. Without it the shadow P&L logs fills a real netting account could never take — a gross lie. Caught live: ES 8 open shorts / NQ 3.
Thesis Health: every brain ±, regime ±, EMA-smoothed. Reversal = flip only when the incumbent thesis is DEAD and the opposite is born. Exit-on-death = close when the idea dies, not only on stop.
Honesty (Pillar IX): below the n=8 floor → "I don't know" → death/reversal DISABLED. Structure now, tune after data (the PF-0.92 caveat baked in).
Ride free off the object: Belief State (bull/bear/neutral %) · Conviction · Uncertainty · Thesis Journal · Self-Calibration.
Surfaced: the Meta Brain · The Mind page (cognition lens) — Belief Map, Conscience (S7), Stream of consciousness. Plus the Meta Brain Trader Journal (the autonomous trader's logbook, calendar-first).
DECISION REPLAY (2026-06-12): every Meta commit ships a DECISION record (book v0.5.0) — chosen + the rejected field reconstructed from a 90s candidate buffer + decision stability (did the lead churn NQ→ES→NQ?) + the load-bearing argument + prosecutor override. Graded OFFLINE by decision_grader.py: allocationDelta = chosenR − bestHoldableRivalR (rivals replayed on the 1m bar tape, ⟨counterfactual⟩) — "good pick" vs "left money". This is Allocation Score v2: the per-decision grade vs the actual field. Surfaces in the journal row expand (why this, not that, how steady) + the Lab (DECISION_QUALITY: torn fields = stand-down candidate). Decision Families groups graded decisions by (regime × newsPhase × competition × stability) → the bleeding family is a pattern of mistakes, not a single bad trade. Governed by charter law: decision_metric_governance — no decision metric outranks real net tradable P&L (Goodhart guard).

⟨engine-meta-brain-thesis.js · engine-meta-brain-book.js (gate · DECISION record) · claude/decision_grader.py · docs/meta_mind.html · engine-meta-journal.js⟩

one thesis per ticker · exit-on-death · ≤1 position/family slot · decision replay
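The per-decision grade reduces to one subtraction, sketched here assuming each rejected rival has a replayed R and a holdable flag. The record shape, the function names and the rule that a delta of zero or more counts as a "good pick" are assumptions; the formula allocationDelta = chosenR − bestHoldableRivalR is the one stated above.

```python
from typing import List, Optional


def allocation_delta(chosen_r: float, rivals: List[dict]) -> Optional[float]:
    """chosenR minus the best R among rivals that could actually have been held."""
    holdable = [rv["r"] for rv in rivals if rv["holdable"]]
    if not holdable:
        return None  # nothing to compare against
    return chosen_r - max(holdable)


def grade(delta: Optional[float]) -> str:
    """Assumed labelling: non-negative delta is a good pick, negative left money."""
    if delta is None:
        return "no holdable rival"
    return "good pick" if delta >= 0 else "left money"


# Illustrative decision
rivals = [
    {"book": "A", "r": 1.2, "holdable": False},  # best raw R, but not holdable
    {"book": "B", "r": 0.8, "holdable": True},
    {"book": "C", "r": -0.3, "holdable": True},
]
delta = allocation_delta(0.5, rivals)
```

Filtering to holdable rivals before taking the maximum keeps the grade from penalising a choice against a rival the account could not have held.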

The Constitution · Influence Boundaries

In plain terms. The wiring rulebook, enforced by the build itself — it defines which layer may influence which, and if any part tries to reach somewhere it shouldn't, the build simply fails. Guardrails with teeth.

Boundaries: Lab discovers · Forge proposes · Meta trades · Co-Pilot explains
Co-Pilot = autonomous HAND, not judgment: ✓execute owners' blessed verdicts (blast-radius gated) · ✗originate (invent trust · score/rank · author promotion · trade)
check_constitution.py scans 38 core files / 4 layers → 0 violations; --self-test proves the fence has teeth
Amendment test: an idea may bend a rule freely but never break a pillar (truth · no silent mutation · judgment-tied-to-evidence · operator-owns-money · auditable)
Doc + guard are ONE law — amend together

⟨claude/check_constitution.py · docs/constitution.html · /constitution⟩

38 files · 0 violations · build-enforced

★ THE CHARTER · Capital Allocation Under Uncertainty

⚖

In plain terms. The one law above all others: the product isn't signals or trades — it's growing capital wisely across two real slots (one position per market, max), and protecting it when there's no real edge to deploy. Everything else is an instrument serving this.

One judging question for ALL work: does it improve how limited capital is allocated — or the trustworthiness of allocation measurement? Neither → backlog
Objective: net tradable $ per DAILY risk budget · constraints-as-law (not ratios) · NQ+ES same-dir = ONE risk unit (~0.9 correlated) · sizing = a discrete rung ladder
Gates G0–G4 with advance AND kill criteria at every rung · G0 now (paper supremacy, span-guarded) · G1 blocked on the operator's broker-connector decision
Instruments: regret ledger (flat scores 100 only when the tape agrees) · Allocation Score + selector drift · gate odometer (reads the law) · time-to-pay prior ($/slot-hour) · prosecutors (mute briefs — authority is EARNED) · black-swan sentinel (advisory, never flattens)
Lab re-chartered: every bus finding gate-impact stamped (G0 / measurement-trust / backlog) · deployment questions outrank entry edges

⟨assets/data/charter_v1.json · claude/regret_ledger.py · engine-meta-brain-prosecutors.js · engine-shock-sentinel.js · DECISIONS 2026-06-12⟩

G0 · 2 slots · the charter card lives on /copilot

Brain Calibration · self-calibration loop

⚖

In plain terms. The loop that turns a noisy pile of brains into a trustworthy roster the Meta Brain can lean on — scan, decide, apply, check the change actually stuck, then audit itself.

The cell is the unit (brain × setup × regime × symbol), never the brain
Reactive vs anticipated is the biggest lever — but mind provenance: the "+$190k" figure is ⟨MODELED · counterfactual would-have⟩, AND it's book-vs-book (selectivity-inflated). The realized spine number is +$29k (shadow paper-book), not $190k of real money. The HONEST matched-pair ΔR (claude/twin_timing_edge.py, co-fired subset only) is +0.45R/trade on PA and FLIPS by regime (NOA loses overall, wins only in expansion). Don't size off the $190k.
EV-ranking, not win-rate (Brier lies on asymmetric payoff) + the cost floor (~0.09R commissions — a thin positive R isn't edge)
STABLE gate: never auto-cut INSUFFICIENT_WINDOWS · R1–R6 codified in calibration_policy_v1.json (the autopilot seed)
Autonomous-hand safety (designed, gated): verify-applied · dead-man's-switch watchdog (silence = alarm) · Telegram escalation to the operator's phone

⟨calibration_policy_v1.json · confidence_calibration_v1.json · feedback_calibration_doctrine.md⟩

cell-unit · EV-rank · STABLE gate · verify+watchdog
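A short worked sketch of why the ranking uses expectancy net of the cost floor rather than win-rate. The function names and the sample cells are hypothetical; the ~0.09R commission floor is the figure given on the card.

```python
from typing import Dict, List, Tuple

COST_FLOOR_R = 0.09  # approximate commission cost per trade, in R, from the card


def net_ev_r(win_rate: float, avg_win_r: float, avg_loss_r: float,
             cost_r: float = COST_FLOOR_R) -> float:
    """Expected R per trade after costs; avg_loss_r is given as a positive size."""
    return win_rate * avg_win_r - (1.0 - win_rate) * avg_loss_r - cost_r


def rank_by_ev(cells: Dict[str, Tuple[float, float, float]]) -> List[Tuple[str, float]]:
    """Sort cells by net expectancy, best first."""
    scored = [(name, net_ev_r(*stats)) for name, stats in cells.items()]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)


# Illustrative cells: (win_rate, avg_win_r, avg_loss_r)
cells = {
    "high_winrate": (0.70, 0.5, 1.5),   # wins often, loses big
    "low_winrate": (0.40, 2.0, 1.0),    # wins rarely, wins big
    "thin_positive": (0.50, 1.1, 1.0),  # +0.05R gross, negative after costs
}
ranked = rank_by_ev(cells)
```

Here the 70% win-rate cell is the worst of the three, and the cell with a thin +0.05R gross expectancy turns negative once the cost floor is subtracted.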

⬡

Cross-Cutting Discipline L5

what protects the race — observability, kill-rules, EOD discipline, chip-as-alarm doctrine, propagation hygiene

8 CARDS

Three Timing Frames

In plain terms. Three clocks that must agree before a trade commits — the big-picture clock (what kind of day is it), the signal clock (is a setup forming), and the execution clock (commit on bar close). One commit, three layers of timing checked.

Big-picture clock · what kind of day is it · how loud · where is value moving
Signal clock · is a setup arming inside that bigger picture?
Execution clock · commits on bar close, with safeguards against flicker
The engine has the execution clock today · the next phase makes all three explicit

Prevents Pulling the trigger without checking what the bigger timeframe is doing.

⟨decision-clock · render-stabilizer⟩

3 clocks · 1 commit

One Source of Truth

In plain terms. A rule, not a feature — every screen reads the same official engine number, never a convenient copy. It's what stops two parts of the app from quietly showing different values for the same thing.

Every number on every screen reads from the same source the engine itself uses
Built into the architecture — not a "be careful" rule that can be forgotten later
Four real bugs were caught the one time a redesign tried to shortcut around it
A blank panel beats a wrong one

Prevents Screens that look right and feel right, but quietly show the wrong number.

⟨buildLiveEngineExplanation() · pattern⟩

4 bypass bugs caught

Build Phase · Lock Lifted

In plain terms. The current operating stance: the engine is being built, so we add aggressively and filter later rather than guarding every change. Data from before this point was declared invalid and isn't trusted.

Lock lifted after structurally broken engine was identified — prior data invalidated
Build aggressively: new modules, new brains, new whitelist books — all welcome
The Forge filters later — cell-significance + Holm + lifecycle do the curation
Lock returns only when bridge data returns and operator confirms — not a calendar date

Prevents Premature freezing of an architecture that hasn't found its edge yet.

⟨MEMORY.md · 2026-05-22 lift⟩

BUILD PHASE active

The Kill Rule

In plain terms. When the evidence says a feature doesn't work, it gets removed — not "tuned harder." A discipline against endlessly tweaking something the data already says is dead.

Every proposed fix carries a written "when do we kill this idea" clause — before it ships
Each candidate has an explicit, measurable kill condition
If four weeks of fresh data don't support it, it's dropped — not patched or rescued
Applies to the AI layer too — same rule, same standard, no favorites

Prevents Keeping features alive out of attachment, after the data has already killed them.

⟨ROADMAP §LATER S1–S12 · noa doctrine⟩

12 kill conditions

EOD Killswitch

In plain terms. No brain is allowed to hold a trade past the market close — full stop, anchored to real exchange time and wired into every part that can close a position. Stops overnight risk getting carried by accident.

16:00 ET force-flatten: every open shadow position closes at last-valid R (exit='eod_flat')
No opens from 16:00 to 18:00 ET, or at any time on weekends
Wired into all 6 resolution owners + Paper Trader — flatten-guard + open-gate per brain
Visible topbar chip (🛑 EOD FLAT / 🟢 EOD 16:00 ET)
window.EODKill.flattenNow() = manual panic-flatten anytime (30s forced-halt window)

Prevents Post-close tape contamination — phantom resolutions on illiquid after-hours noise.

⟨assets/js/engine-eod-killswitch.js⟩

16:00 ET · 6 owners + Paper
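The two guards can be sketched in Python, assuming the caller has already converted the clock to Eastern Time. The real module is JavaScript (`engine-eod-killswitch.js`), so every name below is hypothetical, and the boundary handling (16:00 inclusive, 18:00 exclusive) is an assumption.

```python
from datetime import datetime, time
from typing import List

FLATTEN_AT = time(16, 0)  # force-flatten time, ET
REOPEN_AT = time(18, 0)   # end of the no-open window, ET


def in_flat_window(et_now: datetime) -> bool:
    """True on weekends and from 16:00 up to (not including) 18:00 ET on weekdays."""
    if et_now.weekday() >= 5:  # Saturday = 5, Sunday = 6
        return True
    return FLATTEN_AT <= et_now.time() < REOPEN_AT


def opens_allowed(et_now: datetime) -> bool:
    """Open-gate: no new positions while the flat window is active."""
    return not in_flat_window(et_now)


def flatten_all(positions: List[dict]) -> List[dict]:
    """Flatten-guard: close every open position at its last valid R."""
    return [dict(p, closed_r=p["last_valid_r"], exit="eod_flat") for p in positions]


closed = flatten_all([{"id": 1, "last_valid_r": 0.4}])
```

Keeping both guards behind one `in_flat_window` predicate means the flatten-guard and the open-gate cannot disagree about whether the window is active.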

Quarantine Registry

27a

In plain terms. One official list of which setups are retired — replacing three separate lists that used to disagree and caused a real bug. Now there's a single source for "is this thing benched."

Merged read across LES + PA — one call, normalized cell shape (book / sym / setupId|regime / source / reason / ts)
Metadata layer persisted to localStorage.nqelite.quarantine_registry.metadata.v1
Subscribe events on add / lift — Analyst + Actions + UI all see the same state
V1 additive — LES + PA still own their lists; registry normalizes the seam

Prevents 5-way quarantine-state divergence — the audit's R3 critical.

⟨assets/js/engine-quarantine-registry.js · window.QuarantineRegistry⟩

A4 shipped 2026-06-09 · audit R3 closed

Decision Event Bus

27b

In plain terms. Lets new listeners plug into the pipeline without editing the money-path file — the pipeline announces once, and anything that wants to listen subscribes itself. Keeps the dangerous core file untouched as the system grows.

publish('decision_tail', {ctx, decs, sym, d}) — one call from pipeline.js per tick
LES + MetaBrain self-register via subscribe('decision_tail', handler) at boot
CanonicalEngineState stays direct — pipeline reads its return value into ctx
Subscriber contract: read-only on payload (Meta Brain's autonomy goes through formal select → MetaDecision → OMS, not in-pipeline mutation)

Prevents Pipeline.js becoming the implicit observer registry. The audit's R11.

⟨assets/js/engine-decision-event-bus.js · window.DecisionEventBus⟩

D2 V1+V2 shipped 2026-06-09 · adding observer = one-line subscribe
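The publish/subscribe contract can be illustrated with a Python sketch of the same pattern. The real bus is JavaScript (`window.DecisionEventBus`), so this class is an analogue rather than its API, and `MappingProxyType` only enforces read-only access at the top level of the payload.

```python
from collections import defaultdict
from types import MappingProxyType
from typing import Callable, DefaultDict, List, Mapping


class DecisionEventBus:
    """Minimal pub/sub: the pipeline publishes once, observers register themselves."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[Mapping], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Mapping], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        view = MappingProxyType(payload)  # subscribers get a read-only view
        for handler in list(self._handlers[topic]):
            handler(view)


bus = DecisionEventBus()
seen = []
bus.subscribe("decision_tail", lambda p: seen.append(p["sym"]))


def bad_subscriber(p: Mapping) -> None:
    try:
        p["sym"] = "ES"  # mutation attempt on the read-only view
    except TypeError:
        seen.append("mutation blocked")


bus.subscribe("decision_tail", bad_subscriber)
bus.publish("decision_tail", {"sym": "NQ", "ctx": {}, "decs": [], "d": None})
```

Adding an observer is one `subscribe` call and the publishing code does not change, which is the property the card is after.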

Cross-Symbol Audit

In plain terms. Touch one instrument, check the other — no fix ships for NQ without verifying it didn't break ES. A guard against fixing one symbol while quietly breaking its twin.

Every NQ-side change triggers an ES-side parallel scan — same logic, different parameters
Caught real bugs: INV-2 Pass-1 shipped NQ-only, Pass-2 found the ES mirror was broken
Extends to 4-surface propagation: engine change → NOA Guide + Roadmap (md+html) + Atlas (EN+HE) in the same response
Cross-module trace before any edit: what does this touch downstream? The system is interconnected — no edit lives alone

Prevents Fixing one instrument while silently breaking the parallel — the most expensive class of bug that passes all single-symbol tests.

⟨INV-2 · guardian gate · CLAUDE.md propagation rule⟩

NQ↔ES · 4-surface propagation

Silent-Failure Doctrine (chip-as-alarm)

In plain terms. The rule that a status which can never turn red is lying. The enemy is the "ambiguous zero" — a 0 or "idle" that looks identical whether it's correct or the thing is actually dead. Every health light must be able to scream.

Born 2026-06-06 (SCHL/SAVT live but policy fetch blocked 2 days); upgraded 2026-06-10 into a standing review with 5 alarms shipped + 5 refinements
Smoke checks (`engine-smoke-test.js`): book_reader_coverage (FAIL — a book firing on disk but unread) · doc_claim_drift (WARN — a "by design/healthy" doc claim masking a dead brain) · disk_source_fresh (WARN — a 0 that's source-unreachable, not empty)
Chip states (`engine-noa-desk.js`): red .policy-down ⛔ · amber .strangled ⚠ (brain blocked, not idle) · amber .gate-locked 🔒 (engine gated, not idle)
Truth source: `engine-brain-truth.js` classifies brains FIRING/IDLE/STRANGLED/DOWN
5 refinements: name the ambiguous zero · prove BOTH directions on the real trigger · false alarms are silent failures too · watch the watcher · encode FAIL/WARN/passive severity
CPU budget guard (2026-06-11) (`engine-cpu-sentinel.js`): the same invariant applied to CPU — every continuous timer has a per-fire ceiling (default 150ms); a breach turns the CPU chip RED and NAMES the offender (⛔ sustained / ⚡ spike) the moment it lands, so a heavy new timer is caught in ONE session, not after days of silent drift (+32,000ms). Levers: cache the expensive per-tick read > run-O(n)-every-Nth > cap growth > throttle. Doctrine + "fix CPU" procedure: `feedback_cpu_budget_doctrine.md`.
Order-flow capture chip (2026-06-13) (`engine-stability-observer.js` → #tbOFCapChip in the ALERTS row): the doctrine applied to the Event-Fragility-v2 capture — the 60s observer now logs continuous pre-event order flow (deltaPct/cvdSlope/DOM/absorption) so the quiet coiled pre-event tape is no longer a capture blind-spot. The chip shows the captured value live, color-coded by delta, or an honest OF · idle when a sample had no feed data — never a blank that hides a dead capture (it IS the captured row, one source of truth).

Prevents The system looking alive (or "fine by design") when it's dead — across chips, counts, AND doc claims.

⟨engine-smoke-test.js · engine-brain-truth.js · engine-noa-desk.js · feedback_silent_failure_pattern.md · silent_failure_ledger.md⟩

5 alarms · standing review · maintain mode
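The "ambiguous zero" can be made concrete with a sketch that refuses to collapse "empty" and "unreachable" into the same 0. The enum, the `probe` function and the use of `OSError` as the unreachable signal are assumptions for illustration, not the checks in `engine-smoke-test.js`.

```python
from enum import Enum
from typing import Callable, List, Optional, Tuple


class SourceState(Enum):
    HAS_DATA = "has_data"
    EMPTY = "empty"              # source answered, and there is nothing there
    UNREACHABLE = "unreachable"  # source did not answer; the count is unknown


def probe(fetch: Callable[[], List[dict]]) -> Tuple[SourceState, Optional[int]]:
    """Return a state plus a count; the count is None when it cannot be known."""
    try:
        rows = fetch()
    except OSError:  # covers ConnectionError, TimeoutError, missing files
        return SourceState.UNREACHABLE, None
    if not rows:
        return SourceState.EMPTY, 0
    return SourceState.HAS_DATA, len(rows)


def dead_source() -> List[dict]:
    raise ConnectionError("policy fetch blocked")


healthy = probe(lambda: [{"fire": 1}, {"fire": 2}])
quiet = probe(lambda: [])
dead = probe(dead_source)
```

A chip driven by `SourceState` can show an unreachable source as red and an empty one as idle, instead of rendering both as 0.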

Run-Mode Vocabulary Lock

In plain terms. Three words, locked: Cold, Hot, Automated — each meaning one specific thing, no synonyms, no improvising. Confusing "which mode are we in" is one of the most expensive mistakes in this system.

🧊 Cold = CALIBRATION mode · shadow capture, tool not default
🔥 Hot · NOW = HOT mode · real signals, manual trader, Paper Trader on screen
🤖 Automated = LIVE mode · real broker, real money, frozen graduated policy
Every "is it firing?" debug starts with: verify the live mode via EVDecision.phase.mode
Phase4Resume auto-CALIBRATION was disabled 2026-06-06 — HOT is the standing posture, code-enforced

Prevents Silent mode drift — the engine running in CALIBRATION while the operator believes HOT.

⟨engine-ev-decision.js · feedback_run_modes_vocabulary.md⟩

3 modes · locked

◈

NOA Operator + Execution Ramp L6

how it ships — NOA is the Phase 4 north-star operator. The execution ramp + Phase 1 Trust Calibration gate get the graduated brain into a real broker without losing capital to its own scaffolding.

10 CARDS

Two halves, one trajectory. Today NOA is a racer in L3 (the brain) AND the operator-in-training here in L6 (the face + hands). Phase 4 north-star: NOA the racer wins Phase 1, gets snapshotted into Phase 2, qualified through Phase 3's execution ramp, and arrives in Phase 4 as a meta-allocator routing across all graduated brains. The execution ramp at execution/ is the deliberate, broker-safe path from "real signal" to "real fill on a real account." Already shipped: OMS Phase 0-Sim (69 tests green, 5 schemas, hash-chain ledger) · Meta Brain paper book (5 phases inside the dashboard, mirrors the Synthesizer's 6 routes) · Phase 1 Trust Calibration pipeline (measures claimed-R vs realized-R; pipeline real, data synthetic until a broker connects). Blocked on: operator picks broker connector (Quantower Trading Simulator first, AMP Rithmic demo for haircut measurement).

NOA Doctrine · 10 Commandments

In plain terms. NOA is plumbing, not a character — and ten hard rules, enforced in code, keep her that way: stay silent unless there's something real to say, never make things up, never offer false comfort. The voice serves the edge, not the other way around.

10 commandments: silence-first · no comfort · no tick narration · no fake certainty · no block execution · no punishment · no therapist · no hallucinated confidence · no omniscience · no social companion
Doctrine outranks feature requests, council enthusiasm, ship-faster pressure
Voice: 3 channels (alerts · companion · copilot) · 5–9 word templates · 85 lifecycle clips · text-only companion (voice rollout deferred)
Read-only consumer of engine state — zero engine-logic touch, ever

⟨noa_should_not_doctrine.md · engine-noa-cognitive.js · engine-noa-voice.js⟩

10 commandments · 3 channels

Thesis + Trade Companion

In plain terms. When a trade fires it freezes the reason you took it, then checks reality against that reason on every tick — surfacing the moment something material changes and staying quiet for ordinary noise. It's what keeps you honest about why you're still in.

5 lifecycle states: WATCH → PERMISSION → IN_TRADE → EXIT → DEBRIEF
Thesis: frozen at PERMISSION with mustRemainTrue[] + cancelIf[] + captured regime
Integrity scoring: SOLID_GREEN → PULSING_AMBER → BROKEN_AMBER → RED_FRACTURED → GRAY
Trade Companion 5 wires: weakening · strength · BE advisory · pressure · chart badge
3 surfaces (chart overlay · left rail · topbar pill) all read the same signal object

⟨engine-noa-thesis.js · engine-noa-trade-companion.js⟩

5 states · 5 wires · 3 surfaces

★ Phase 4 · Portfolio Operator

In plain terms. The end state — once at least two brains have proven themselves, NOA runs them as a portfolio and the engine, not you, is the one trading. The whole ladder points here.

Meta-allocator: regime-routing table — Phase 1's per-regime data populates it directly
Correlation-aware sizing: two brains firing same direction on same regime cell don't double-up
Drawdown budget partitioned across brains — brain-level breakers feed portfolio-level
Operator's role collapses to: risk-limit setter + monthly reviewer · no intraday touch
Stage 6 of the MI evolution ladder — designed, awaits graduating brains

⟨routeBook(regime) → {book, size, exit} · MI v0.2 Stage 6⟩

North-star · ≥2 brains graduated

Cerberus · Offline-Trained Frozen Policy

In plain terms. A different kind of brain — it learns hard offline in Python, and only a frozen, tested snapshot ever ships to the live browser. Because it never learns live, it can't quietly drift away from what was validated.

Architecture: offline-trained / browser-deployed / FROZEN-policy — autonomous-execution doctrine
Two brains, two paradigms: SCHL (Scholar — readable, auditable, "can edge be NAMED?") + SAVT (Savant — self-supervised, black-box, "can edge be FELT?")
3 heads each: HUNTER (CQL-conservative off-policy value) · SKEPTIC (adversarial meta-label) · SENTINEL (BOCPD regime-death anticipation, data-starved Phase 2)
Decision gate = conviction × (1−refuteProb) × regimeConfidence
Boot relaxation 2026-06-08: minExpectedR: 0.10 → 0.05. DR-OPE drMeanR is +0.087R ⟨modeled · off-policy estimate⟩ — original gate was structurally never reachable. _decisionOverrides layer keeps frozen-policy doctrine intact while runtime tunables win.
Data-gated: need ≥6 RTH session-days · have 2 · abstaining by design until retrain

⟨engine-noa-solo.js · scholar_policy_live.json · savant_policy_live.json · getReasonCounters()⟩

offline → frozen → deployed · relaxed
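The decision gate on this card is a plain product of three factors. A minimal sketch, assuming all three inputs are probabilities in [0, 1]; the atlas does not say how the product is compared with minExpectedR, so that step is left out. The function names and the range check are illustrative, not taken from engine-noa-solo.js.

```python
MIN_RTH_SESSION_DAYS = 6  # from the card: need >= 6 RTH session-days


def decision_gate(conviction: float, refute_prob: float, regime_confidence: float) -> float:
    """conviction x (1 - refuteProb) x regimeConfidence, as written on the card."""
    inputs = {
        "conviction": conviction,
        "refute_prob": refute_prob,
        "regime_confidence": regime_confidence,
    }
    for name, value in inputs.items():
        # Assumption: each head reports a value in [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return conviction * (1.0 - refute_prob) * regime_confidence


def should_abstain(session_days_available: int) -> bool:
    """Data gate: abstain until enough RTH session-days exist for a retrain."""
    return session_days_available < MIN_RTH_SESSION_DAYS
```

With the card's current state (2 session-days), `should_abstain(2)` is true, so the gate value is never acted on.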

Execution Ramp · 0-Sim → D3

In plain terms. The careful eight-rung path from "software test harness" to "real autonomous trading at size" — each rung must be earned, with the brain's rules frozen and promotion automatic only when the bar is cleared.

0-Sim · OMS harness against Quantower built-in Simulator · ✅ 69 tests green, contracts shipped
0-Broker · CQG/Rithmic demo · native bracket survival across power-loss · ◻ blocked on connector selection
A · demo assisted — proposes entry, operator confirms · ◻
B → C · auto-entry under caps → full demo auto · ◻
D1 → D2 → D3 · live supervised micro → limited autonomy → scaled by drawdown budget · ◻

⟨HANDOFF_autonomous_execution_ramp_2026-05-31.md §11–§14⟩

8 phases · north-star = D3

OMS Phase 0-Sim · Shipped

In plain terms. The order-handling skeleton everything live will be built on — one writer, save-before-you-send, and a tamper-evident ledger. Boring on purpose: this is the part that must never lose or duplicate an order.

Contracts: 5 JSON Schemas (intent · command · event · operator · common) + ledger.sql (WAL + hash-chain) + 12 invariants + DISARMED-on-restart
OMS: clock · ledger · contracts · risk · state machine · kill policy (3 severities) · reconcile · reconnect
69 tests green: contracts, ledger durability/tamper, state machine, risk, reconcile, end-to-end drills
C# adapter skeleton: NQEliteExecutionBridge.cs with // VERIFY markers · ◻ not compiled yet

⟨execution/oms/ · execution/contracts/ · execution/adapter/⟩

69 tests · 5 schemas · WAL hash-chain

Live SL/TP Hard Rule

In plain terms. For real money, the stop-loss and target must live at the broker, not in the browser — because a browser stop disappears the moment you lose connection. Non-negotiable.

🔺 Quantower "Local" SL/TP are client-side and vanish on disconnect/power-loss → forbidden for live
🔺 Quantower built-in Simulator has no broker → can prove software, never broker-side protection
A SIMULATOR_LOCAL stop can never pass the live-eligibility gate (fail-closed, machine-enforced)
Real CQG/Rithmic-routed broker demo required before any live execution
Kill hierarchy: PAUSE → CONTROLLED_FLAT → EMERGENCY_FLAT · broker-kill non-overridable

⟨execution/contracts/states.json · live-eligibility gate⟩

broker-side mandatory
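A fail-closed gate is easiest to read as an allowlist: anything not explicitly approved is rejected. The sketch below assumes a hypothetical `BROKER_NATIVE` label for a broker-resident bracket; only `SIMULATOR_LOCAL` appears in this atlas, and the real check lives in the contracts, not here.

```python
from enum import IntEnum


class KillSeverity(IntEnum):
    """Kill hierarchy from the card, ordered by severity."""
    PAUSE = 1
    CONTROLLED_FLAT = 2
    EMERGENCY_FLAT = 3


# Hypothetical label; the atlas names only the forbidden SIMULATOR_LOCAL kind.
LIVE_ELIGIBLE_STOP_KINDS = frozenset({"BROKER_NATIVE"})


def is_live_eligible(stop_kind) -> bool:
    """Fail closed: unknown, missing, or malformed stop kinds are rejected."""
    return isinstance(stop_kind, str) and stop_kind in LIVE_ELIGIBLE_STOP_KINDS
```

The allowlist shape means a new, unreviewed stop kind is blocked by default rather than waved through.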

★ Meta Brain · the autonomous mastermind

In plain terms. Not one of the racers — the boss above them. It reads every brain's vote, trusts each by track record, cancels out overlap, holds back when uncertain or when the real-world haircut is too steep, allocates the capital, and even checks whether it's beating the best single brain. Paper-only — it never touches the live money-path.

S1 · Haircut-aware EV · engine-meta-brain-selector.js — scores each route on expected R AFTER slippage/fees, scaled to avgWin/avgLoss (not a flat 1R penalty)
S2 · Intervention-bias wall (Gap 10) · engine-cf-snapshotter.js — freezes each candidate brain's state at decision time → CF_SNAPSHOT; brain trust learns from its OWN outcome, never from Meta's intervention
S3 · Provenance-keyed trust · claude/trust_reducer.py — Trust EMA keyed by (brain × sym × regime × session × provenance × surface); paper trust ≠ demo ≠ live
S4 · Candidate ingest · engine-meta-brain-ingest.js — polls the synthesizer-routed brains' fire streams; Meta mirrors the ROUTED brain per (sym × regime), not a rare engine SIGNAL. ingestCandidate() books via selector + haircut + sizing + MFE
Regime Commander · when the policy's primary brain is silent/dormant, the next crowd-aligned brain COVERS as secondary — tagged + sized-down, measurable apart
S5 · Uncertainty gate · engine-meta-brain-consensus.js — cross-brain consensus; STAND_DOWN when ≥4 voters split (minority_share > 0.40). Trust-weighted votes
S6 · Capital Router · engine-meta-brain-allocator.js — every 10s/symbol: ONE decision over all voters, correlation-collapsed (Gap 2/5 — the Oracle-flagged cluster counts once, not N×), quantized to integer contracts round-down, emits an allocation_vector
S7 · Holy Grail self-audit · engine-meta-brain-self-audit.js — "is Meta beating the BEST single brain?" If not → selfConfidence < 1 → the allocator sizes Meta DOWN. PROVING / BEATING / LAGGING / FAILING

⟨engine-meta-brain-{book,scoring,selector,copilot,ingest,consensus,allocator,self-audit,state-push}.js · engine-cf-snapshotter.js · MetaBrain / MetaBrainAllocator / MetaBrainSelfAudit globals⟩

S1–S7 live · correlation-collapsed · self-policing · SHADOW
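The S5 uncertainty gate can be sketched as a trust-weighted vote with a minority-share cutoff. This is an illustration under stated assumptions, not the code in engine-meta-brain-consensus.js: votes are (direction, trust) pairs, the minority share is measured on trust weight, and an exact tie stands down.

```python
def consensus(votes, min_voters=4, max_minority_share=0.40):
    """Return (decision, minority_share) for a list of (direction, trust) votes."""
    voters = [(d, t) for d, t in votes if d in ("LONG", "SHORT") and t > 0]
    long_w = sum(t for d, t in voters if d == "LONG")
    short_w = sum(t for d, t in voters if d == "SHORT")
    total = long_w + short_w
    if total == 0:
        return "STAND_DOWN", 0.0  # assumption: no usable votes means no trade
    minority_share = min(long_w, short_w) / total
    # Card rule: stand down when >= 4 voters are split with minority_share > 0.40.
    if len(voters) >= min_voters and minority_share > max_minority_share:
        return "STAND_DOWN", minority_share
    if long_w == short_w:
        return "STAND_DOWN", minority_share  # assumption: ties never pick a side
    return ("LONG" if long_w > short_w else "SHORT"), minority_share
```

Weighting by trust rather than head count means one low-trust dissenter does not trigger a stand-down on its own.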

Plane-B learning ledger

N8b

In plain terms. The tamper-evident record under the Meta Brain — kept in a separate database from the live-order ledger on purpose, so learning can never contaminate the money-path and the Meta's meddling can never muddy which brain deserves the credit.

execution/contracts/allocation_vector_v1.schema.json — FROZEN. ONE decision = ONE intent. Closes Gaps 1·2·4·6·10·11·12·13·14
execution/contracts/plane_b_event_v1.schema.json — 5 event types: META_DECISION · CF_SNAPSHOT · CF_OUTCOME · TRUST_UPDATE · ALLOCATOR_VERSION
claude/plane_b_ledger.py — hash-chained append-only (reuses OMS canonical_event_hash); writer-id authority enforced at the append boundary; 13 tests green (tamper · truncation · auth · replay)
Disk server: POST /push_plane_b_event (per-request SQLite connections — thread-safe on ThreadingHTTPServer) · GET /plane_b_status · GET /meta_brain_state
Cross-origin: dashboard mirrors Meta state to disk every 30s (engine-meta-brain-state-push.js); standalone page reads it — one writer, many readers, no IDB-race dupe class

⟨execution/contracts/{allocation_vector,plane_b_event}_v1.schema.json · claude/plane_b_ledger.py · research_data/plane_b/plane_b_ledger.db⟩

hash-chained · Plane-A/B separated · audited
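A hash-chained append-only log works by folding each row's predecessor hash into its own hash, so editing any earlier row breaks every later link. The sketch below uses SHA-256 over canonical JSON as an assumption; it does not reproduce the OMS canonical_event_hash, the writer-id check, or the SQLite storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor for the first row


def event_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, event: dict) -> None:
    """Append-only write: each row stores its predecessor's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev, "hash": event_hash(prev, event)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered row fails."""
    prev = GENESIS
    for row in chain:
        if row["prev"] != prev or row["hash"] != event_hash(prev, row["event"]):
            return False
        prev = row["hash"]
    return True
```

This sketch catches edits and reordering only. Detecting a truncated tail additionally needs the latest hash to be recorded somewhere outside the chain.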

Plane A vs Plane B

In plain terms. Two different jobs kept separate: Plane A asks "where should we trade next?" and Plane B asks "how do we sharpen the brains we already have?" Mixing the two questions is how systems fool themselves.

Plane A · docs/meta_brain.html · violet brand accent · the rich 6-section view: status hero + route cards + recent fires strip + Meta Brain findings feed + mode-switch self-score + voice log
Plane B · docs/copilot.html · pure Co-Pilot — sharpen the books currently on the tape, no Meta Brain inline (single-line pointer above footbar)
Co-Pilot page: ⬢ Meta Brain nav link added · Meta Brain inline panel REMOVED → Plane-B is now leak/edge-only
Signals-page chip strip: MB chip (violet accent, trust-state badge ·U/·W/·T/·S/·F once ≥20 sealed) · Meta Brain is the Journal's 19th first-class book
Operator decision 2026-06-08: don't let Meta Brain's policy noise contaminate the active sharpening surface

⟨docs/meta_brain.html · docs/copilot.html · engine-noa-desk.js · /meta_brain route⟩

A = next · B = now

★ Phase 1 · Trust Calibration

N10

In plain terms. The reality check before any brain graduates — it compares what a brain claimed it made against what it would really have made (and claimed fills vs real fills). That gap, the "haircut," has to be applied before anyone trusts a brain's numbers. Waiting on a real broker to fill in real figures.

Analyzer · claude/trust_calibration.py — pairs engine fires with broker fills by intent_id, computes R_shortfall = actual_R − claimed_R (universal across winners + losers), session-clustered bootstrap CI by (sym × regime × book)
Spec · execution/contracts/broker_fills_v1.schema.json — FROZEN. Two record types (fill / exit), additionalProperties: false
JS lookup · engine-haircut.js — window.Haircut.discount(claimedR, {sym, regime, book}) → expected_R + factor + CI + is_synthetic flag · resolution priority (sym × regime) → (sym × book) → (sym) → −0.30R default
Dashboard · docs/trust_calibration.html at /trust_calibration — headline tiles, per-regime + per-book breakdowns, [SYNTHETIC] banner
Routing config · execution/config/route_to_demo_v1.json: default NOA-only, both syms, RTH, max 2 concurrent / 10 daily / $1500 risk, kill on a 4-loss streak or $600 daily loss
Promotion gate: Synthesizer verdict + haircut.is_synthetic == false + n ≥ 50 per sym + shortfall_CI_upper < 0 · all four must clear before composed policy leaves shadow
⚠ Pipeline real, data synthetic. Synth NOA NQ: −0.022R/trade · ES: −0.116R/trade — sanity-check only. Real numbers blocked on operator broker pick.

⟨trust_calibration.py · broker_fills_v1.schema.json · simulate_broker_fills.py · engine-haircut.js · docs/trust_calibration.html · phase1_trust_calibration_handoff.md⟩

measurement layer · synthetic · awaiting broker
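The haircut lookup's resolution priority can be sketched as an ordered key search. This is a Python illustration of the order listed for engine-haircut.js, not a port of it: the table layout, the `"TREND"` regime name, and the additive adjustment (claimed R plus measured shortfall) are assumptions.

```python
DEFAULT_SHORTFALL_R = -0.30  # fallback from the card


def resolve_shortfall(table: dict, sym: str, regime: str, book: str):
    """Return (shortfall_R, level), trying the most specific key first."""
    candidates = (
        ("sym_regime", ("sym_regime", sym, regime)),
        ("sym_book", ("sym_book", sym, book)),
        ("sym", ("sym", sym)),
    )
    for level, key in candidates:
        if key in table:
            return table[key], level
    return DEFAULT_SHORTFALL_R, "default"


def discount(claimed_r: float, table: dict, sym: str, regime: str, book: str):
    """Assumption: expected R = claimed R + R_shortfall (shortfall is usually negative)."""
    shortfall, level = resolve_shortfall(table, sym, regime, book)
    return claimed_r + shortfall, level
```

Returning the matched level alongside the number makes it visible when a figure came from the coarse default rather than a measured cell.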

◎

Parameter Reference REF

every threshold, gate, setup weight, and order-flow rule — straight from the engine code

4 TABS · 40+ CARDS

Thresholds · 6

Gates & Pipeline · 16

Setups & Confluence · 10

Order Flow · 7

Order Flow Threshold

Delta Classification

6% · 350 · 120

How The Engine Separates Real Directional Aggression From Market-Maker Noise.

Measure Rule Outcome Calibration Sources

Core Decision

Institutional Delta Requires Meaningful Size Plus Confirmed Directional Travel.

Magnitude: |deltaPct| ≥ 6% OR |delta| ≥ absFloor

Efficiency: directional displacement / total path ≥ 45%

Pass Interpretation: Delta can support initiative flow, bias confidence, and downstream order-flow confirmation.

Fail Interpretation: Delta is treated as noise, absorption risk, or inventory rebalancing unless other evidence overrides it.

Measures

Every bar the engine receives carries a delta: the net difference between aggressive buying volume, where market orders hit the ask, and aggressive selling volume, where market orders hit the bid. Raw delta is meaningless without context: 200 contracts on a 3,000-volume bar is a 6.7% skew that only just clears the relative threshold; the same 200 on a 1,200-volume bar is a 16.7% directional skew, which means someone is acting with intent.

Gate 1: Magnitude Rule

The engine runs a dual-gate test. First, percentage: is this bar's delta at least ±6% of its total volume? Second, absolute floor: is the raw delta at least 350 contracts for NQ or 120 contracts for ES? Either gate passing qualifies the bar as meaningful. The absolute floor exists because during thin pre-market or lunch bars, even a 10% skew might represent only 30 contracts, which is statistically irrelevant for a futures instrument that trades millions daily.

Gate 2: Efficiency Rule

On top of delta magnitude, the engine evaluates bar efficiency: did price actually travel in the delta's direction? Efficiency equals directional price displacement divided by total path traveled. A bar with at least 45% efficiency moved purposefully in one direction. Below 45%, the bar oscillated: the delta might be real, but the price action is indecisive, and the move is more likely market-maker rebalancing than institutional commitment.
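The two rules above combine as "magnitude AND efficiency", with magnitude itself an OR of the relative and absolute checks. A minimal sketch using the thresholds quoted in this card; the function names are illustrative, the direction-agreement check is an interpretation of "travel in the delta's direction", and the minimum-price-move filter is omitted.

```python
ABS_FLOOR = {"NQ": 350, "ES": 120}  # contracts
PCT_THRESHOLD = 0.06                # 6% of bar volume
EFFICIENCY_THRESHOLD = 0.45         # displacement / total path


def magnitude_ok(delta: float, volume: float, symbol: str) -> bool:
    """Gate 1: relative skew OR absolute floor."""
    if volume <= 0:
        return False
    return abs(delta) / volume >= PCT_THRESHOLD or abs(delta) >= ABS_FLOOR[symbol]


def efficiency(displacement: float, path: float) -> float:
    """Directional price displacement divided by total path traveled."""
    return abs(displacement) / path if path > 0 else 0.0


def is_institutional_delta(delta, volume, displacement, path, symbol) -> bool:
    """Gate 1 AND Gate 2, with price required to move in the delta's direction."""
    same_direction = delta * displacement > 0
    return (
        magnitude_ok(delta, volume, symbol)
        and same_direction
        and efficiency(displacement, path) >= EFFICIENCY_THRESHOLD
    )
```

For the example in the text, `magnitude_ok(200, 1200, "NQ")` passes on the relative check alone (16.7%), while a 400-contract delta on a 10,000-volume NQ bar passes only through the absolute floor.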

Parameter Summary

Parameter	Exact Rule	Role In The Engine
Magnitude Filters
Relative Threshold	±6% Of Bar Volume	Filters Weak Directional Skew.
Instrument Floors
NQ Absolute Floor	350 Contracts	Minimum Raw Delta For NQ Depth.
ES Absolute Floor	120 Contracts	Minimum Raw Delta For ES Depth.
Combined Gate	|Δ%| ≥ 6% OR |Δ| ≥ absFloor	Marks Delta Magnitude As Meaningful.
Efficiency Confirmation
Bar Efficiency Gate	≥45% Displacement ÷ Path	Requires Price To Confirm The Flow.
Min Price Move (NQ)	16 Ticks = 4 Pts	Avoids Microscopic NQ Drift.
Min Price Move (ES)	10 Ticks = 2.5 Pts	Avoids Microscopic ES Drift.

Calibration Rationale

The 6% threshold was derived empirically: below it, the correlation between delta sign and next-bar direction is indistinguishable from statistical noise.
The roughly 3:1 ratio between the NQ (350) and ES (120) absolute floors (350 ÷ 120 ≈ 2.9) matches the typical volume ratio between the two instruments.
The 45% efficiency threshold aligns with microstructure research showing that bars below roughly 40-50% efficiency are dominated by market-maker inventory rebalancing, not directional intent.

Volume Classification

Volume & Market Tempo

4 tiers · self-calibrating

Two separate systems: volatility-rank buckets for regime, and trade-count tempo for institutional participation.

Rank Classify Tempo Offset Score

Core Decision

Volume regime and bar-level tempo are independent systems that jointly adjust confluence threshold and bias confidence.

Regime: volatility rank percentile → 4 tiers (LOW / NORMAL / HIGH / EXTREME)

Tempo: bar trades vs baseline · WEAK <0.6× · STRONG ≥1.5×

When It Works: HIGH/EXTREME regime lowers confluence threshold; STRONG tempo confirms institutional participation behind setups.

When It Fails: LOW regime raises threshold +4; WEAK tempo flags a retail-only environment where setups lack institutional backing.

Volume Regime Classification

The engine classifies market activity through two independent lenses. Volume regime uses a volatility rank (0-100 percentile) provided by the bridge from the instrument's recent history. This isn't a fixed number — "high volume" on a quiet August day means something different than "high volume" on an FOMC day. If the bridge can't provide a rank, the engine falls back to today's intraday range: NQ range >220 pts = EXTREME, <70 pts = LOW.

Market Order Tempo

Market order tempo is a separate, bar-level check. It counts trades and contracts in each bar and compares them to fixed baselines (NQ: 900 trades / 12,000 contracts; ES: 600 / 8,000). Below 60% of baseline = WEAK (retail noise, no institutional footprint). At or above 150% = STRONG (institutions are participating). This distinction matters because a "high confluence" setup in a WEAK tempo environment is suspicious: who's going to move the market in your favor?

Score Adjustment

Both systems feed into score adjustments: volume regime shifts the confluence threshold (LOW: +4 harder, EXTREME: -5 easier), while tempo classification informs Gate 2.5's bias confidence and several order-flow conditions.

Volume Regime Tiers

Volume Regime	Volatility Rank	Fallback (NQ range)
EXTREME	≥85th percentile	>220 pts
HIGH	≥70th	>140 pts
NORMAL	≥35th	70–140 pts
LOW	<35th	<70 pts

Tempo Baselines

Tempo Gate	NQ	ES
Baseline	900 trades / 12,000 vol	600 trades / 8,000 vol
WEAK (<0.6×)	<540t or <7,200v	<360t or <4,800v
STRONG (≥1.5×)	≥1,350t or ≥18,000v	≥900t or ≥12,000v
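The tempo table reduces to two ratios against the fixed baselines. A minimal sketch: the `NORMAL` label for the middle band and the choice to test STRONG before WEAK (so a bar that is strong on either measure is never labeled WEAK) are assumptions, since the atlas does not say how a mixed bar resolves.

```python
BASELINE = {"NQ": (900, 12_000), "ES": (600, 8_000)}  # (trades, contracts) per bar
WEAK_BELOW = 0.6
STRONG_AT = 1.5


def classify_tempo(symbol: str, trades: int, contracts: int) -> str:
    """Compare a bar's trade count and contract volume to the symbol baseline."""
    base_trades, base_contracts = BASELINE[symbol]
    trade_ratio = trades / base_trades
    volume_ratio = contracts / base_contracts
    # Assumption: STRONG wins when the two measures disagree.
    if trade_ratio >= STRONG_AT or volume_ratio >= STRONG_AT:
        return "STRONG"
    if trade_ratio < WEAK_BELOW or volume_ratio < WEAK_BELOW:
        return "WEAK"
    return "NORMAL"
```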

Volume Score Offsets

Volume Regime	Score Offset (Effect)
LOW	+4 (raise threshold — harder to fire)
NORMAL	0 (baseline)
HIGH	-3 (lower threshold — easier)
EXTREME	-5 (much easier — conviction in loud markets)
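The regime tiers, the range fallback, and the score offsets can be read together as one small lookup. A sketch using the table values above; the function names are illustrative and the base confluence threshold of 72 in the usage note is a placeholder, not an engine constant.

```python
SCORE_OFFSET = {"LOW": 4, "NORMAL": 0, "HIGH": -3, "EXTREME": -5}


def regime_from_rank(rank: float) -> str:
    """Volatility rank percentile (0-100) to regime tier."""
    if rank >= 85:
        return "EXTREME"
    if rank >= 70:
        return "HIGH"
    if rank >= 35:
        return "NORMAL"
    return "LOW"


def regime_from_nq_range(range_pts: float) -> str:
    """Fallback when the bridge provides no rank: today's NQ intraday range."""
    if range_pts > 220:
        return "EXTREME"
    if range_pts > 140:
        return "HIGH"
    if range_pts >= 70:
        return "NORMAL"
    return "LOW"


def volume_regime(rank, nq_range_pts: float) -> str:
    """Prefer the bridge rank; fall back to the range rule if rank is None."""
    return regime_from_rank(rank) if rank is not None else regime_from_nq_range(nq_range_pts)


def adjusted_threshold(base_threshold: int, regime: str) -> int:
    """Shift the confluence threshold by the regime's offset."""
    return base_threshold + SCORE_OFFSET[regime]
```

With a placeholder base of 72, a LOW regime raises the bar to 76 and an EXTREME regime lowers it to 67.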

Calibration Rationale

Self-calibrating percentile bands prevent the system from mislabeling a quiet Tuesday as "low volume" when it's actually normal for that contract's seasonal pattern.
The tempo baseline (900/600 trades) was calibrated from CME Group session data for NQ and ES regular trading hours.
The 0.6×/1.5× multipliers for WEAK/STRONG are derived from the point where institutional participation visibility inflects in order-book data.

Liquidity Detection

Sweep & Trap Detection

7-bar window · 3 confirm paths

The engine's institutional-behavior detector: where stops cluster, how sweeps are identified, and when a sweep becomes a trap.

Target Sweep Window Confirm Score

Core Decision

A sweep past a liquidity target must either be accepted as a breakout or confirmed as a trap within a 7-bar window.

Sweep: price past target by ≥ break distance (NQ: 4t / ES: 3t) within 24 bars

Trap: confirmed trap scores 82 confidence, the highest-conviction signal

When It Works: Confirmed trap at 82 confidence. Institutional stop-hunt detected, highest-conviction reversal signal.

When It Fails: Sweep accepted (holds past fail distance for full window). Real breakout, not a trap.

Liquidity Targets

The engine maintains a ranked list of liquidity targets — session extremes, prior-day highs/lows, VPOC, VWAP, and equal highs/lows — scored by type, proximity, and session context. Each target represents a probable stop-cluster location: retail traders place stops outside these levels, creating pools of resting orders that institutional players can exploit.

Sweep Detection & Acceptance

A sweep is detected when price pushes past a target by at least the break distance (NQ: 4 ticks / 1 pt, ES: 3 ticks / 0.75 pt) within the last 24 bars. The engine then watches a 7-bar window: if price returns inside the level, it's a potential trap. If price holds beyond the fail distance (NQ: 6 ticks, ES: 4 ticks) for the full window, the sweep is accepted — real breakout, not a trap.

Trap Confirmation

Trap confirmation requires one of three signals within 4 bars after the sweep: (1) price reverses by at least trapMinTicks (NQ: 8 / ES: 5), (2) aggressive delta ≥10% stalls with a 2-tick reversal, or (3) the last 4 bars form a micro-range tighter than sweepFailTicks — the market froze after the attempt. Confirmed traps score 82 confidence in the order flow state machine — the highest-conviction signal the engine produces.

Sweep Distances

Distance	NQ (ticks / pts)	ES (ticks / pts)
Sweep break	4t / 1.00 pt	3t / 0.75 pt
Sweep fail (accepted)	6t / 1.50 pt	4t / 1.00 pt
Trap min reversal	8t / 2.00 pt	5t / 1.25 pt
Trap max window	12t / 3.00 pt	8t / 2.00 pt
Gap significant	80 pts / 0.4%	20 pts / 0.3%

Trap Probability Scoring

Component	Trap Probability (pts)
Base	25 pts
+ Reclaim	+28
+ Trap confirmed	+28
+ Fake breakout	+12
+ Contra delta	+10
− Accepted	−38
High confidence	≥65 probability
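The trap-probability table is an additive score. A minimal sketch of that sum; the component key names are illustrative and the clamp to 0-100 is an assumption, since the table does not say what happens when the components add up past 100 or below 0. It models this table only, not the separate 82-confidence value in the order flow state machine.

```python
BASE = 25
WEIGHTS = {
    "reclaim": 28,
    "trap_confirmed": 28,
    "fake_breakout": 12,
    "contra_delta": 10,
    "accepted": -38,
}
HIGH_CONFIDENCE = 65


def trap_probability(**flags: bool) -> int:
    """Base score plus the weight of every component flagged True."""
    unknown = set(flags) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    raw = BASE + sum(weight for name, weight in WEIGHTS.items() if flags.get(name))
    return max(0, min(100, raw))  # assumption: clamp to a 0-100 scale
```

A reclaim plus a confirmed trap gives 25 + 28 + 28 = 81, which clears the 65 high-confidence line; an accepted sweep alone drops the score to the floor.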

Calibration Rationale

The break/fail distances are calibrated to each instrument's tick value. NQ at $5/tick needs 4 ticks ($20) to register a meaningful push past a level — less than that is just bid/ask bounce.
ES at $12.50/tick needs only 3 ($37.50) for a comparable dollar displacement.
The trap window (8–12 / 5–8 ticks) corresponds to typical retail stop-cluster distances from key levels — this is where the "victims" are positioned, based on common retail order-placement patterns observed in DOM data.

Risk Management

Risk Guardrails

-3.75R halt · 12-trade window · setup retirement

Hard limits the engine will not cross — behavioral brakes calibrated to prevent judgment-degradation spirals.

Track Halt Size Retire

Core Decision

Three independent brakes prevent judgment-degradation spirals: daily loss halt, drawdown-scaled position sizing, and rolling setup retirement.

Daily Halt: cumulative realized R hits -3.75R → new setups halted

Retire: rolling 12-trade expectancy < -0.12R (n≥6) → setup benched

When It Works: Position sizing stays full, setups remain active, engine operates at maximum capacity.

When It Fails: Engine halts at -3.75R, sizes down to 25% floor at deep drawdown, or benches broken setups permanently.

Daily Loss Halt & Circuit Breaker

The engine tracks cumulative realized R across all closed signals since midnight ET. When the daily total hits -3.75R, new setups are halted: not because the edge disappeared, but because three to four standard-size losing trades degrade judgment. A separate circuit breaker fires after 5 consecutive losses regardless of R magnitude, and a 30-fire hard cap prevents signal spam even if realizedR data isn't populated.
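The three daily brakes described above can be sketched as one check over the day's closed results. The reason labels are illustrative, and treating a breakeven trade (0R) as ending a losing streak is an assumption.

```python
DAILY_HALT_R = -3.75
MAX_CONSECUTIVE_LOSSES = 5
MAX_FIRES_PER_DAY = 30


def halt_reason(closed_r_today, fires_today: int):
    """closed_r_today: realized R per closed signal since midnight ET, oldest first."""
    # The fire cap is checked first because it does not depend on realized R.
    if fires_today >= MAX_FIRES_PER_DAY:
        return "FIRE_CAP"
    if sum(closed_r_today) <= DAILY_HALT_R:
        return "DAILY_LOSS_HALT"
    streak = 0
    for r in reversed(closed_r_today):
        if r < 0:
            streak += 1
        else:
            break  # assumption: a win or breakeven resets the streak
    if streak >= MAX_CONSECUTIVE_LOSSES:
        return "CIRCUIT_BREAKER"
    return None
```

Five losses of -0.5R each total only -2.5R, so the circuit breaker fires before the daily halt would.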

Setup Retirement

Setup retirement operates on a rolling window of the last 12 closed trades per setup type. If a setup's rolling average expectancy drops below -0.12R with at least 6 completed trades, it's benched ("Retired"). Between -0.05R and -0.12R, it's flagged as "Degrading" with a -5 priority penalty. Retired setups continue computing for observation but can never fire — the engine doesn't ride a dead horse.
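The retirement rule is a rolling mean with a minimum sample size. A minimal sketch with the thresholds quoted above; the `ACTIVE` label is illustrative, and applying the n≥6 requirement to the Degrading state as well is an assumption.

```python
WINDOW = 12           # last 12 closed trades per setup
MIN_TRADES = 6
RETIRE_BELOW = -0.12  # R
DEGRADE_BELOW = -0.05  # R


def setup_status(closed_r) -> str:
    """Classify a setup from its closed-trade R results, oldest first."""
    recent = list(closed_r)[-WINDOW:]
    if len(recent) < MIN_TRADES:
        return "ACTIVE"  # not enough evidence to judge
    expectancy = sum(recent) / len(recent)
    if expectancy < RETIRE_BELOW:
        return "RETIRED"
    if expectancy < DEGRADE_BELOW:
        return "DEGRADING"  # the text attaches a -5 priority penalty here
    return "ACTIVE"
```

Because only the last 12 trades count, a setup with a poor early record is judged on its recent results alone.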

Risk Parameters

Parameter	Value
Daily loss halt	-3.75R cumulative
Circuit breaker	5 consecutive losses
Hard fire cap	30 fires / day (safety net)
RR floor	1.5:1 (HOT) / 2.0:1 (AUTOMATED)
RR cap (advisory)	2.5 NQ / 4.0 ES (off in Phase A)
Retire threshold	-0.12R rolling expectancy, n≥6
Degrade threshold	-0.05R rolling expectancy
Rolling window	Last 12 closed trades per setup

Calibration Rationale

The -3.75R daily halt aligns with institutional prop desk standards where traders are pulled after 3-4 standard-risk losing trades.
The position degradation curve is a behavioral brake: it doesn't predict whether the next trade wins — it ensures the inevitable revenge trade costs less.
The 12-trade rolling window for retirement balances responsiveness (catching broken setups quickly) against noise resistance (not benching a setup over two bad days).

Risk Management

Risk Plan & Position Sizing

stop · entry · R-budget · contract count

P4.5

From signal to trade: stop placement → entry price → dollar risk budget → contract count → instrument → size tier. Every live trade starts here.

Plan Budget Size Tier Mirror

Core Decision

Given a setup's entry price and stop loss, compute the all-in contract plan that keeps stop risk plus commissions at or below the effective-equity R-budget.

Risk Budget: effectiveAccount × maxRiskPctPerTrade × riskScale × sizeMode

All-In Risk: (|entry−stop| × pointValue × contracts) + commissions

Current Selector: standard if all-in fits, else micro fallback

When It Works: Every trade has a defined dollar risk before entry. Sizing scales down automatically under drawdown, with no manual calculation required.

When It Fails: Entry or stop price not available at render time, so the chip shows plan fallback. Trader overrides with HALF or PROBE tier manually.

Stop Placement & Entry Price

The risk plan is anchored to two prices: entry and stop loss. These come from the setup's structural plan — the engine reads them from linked.entryPrice / linked.stopLoss at render time (signal card) and at journal-entry time (addSetupToTraderJournal()). If no linked plan exists, the engine attempts to derive them from buildStructuredPlan(). Stop distance is computed as |entry − stop| in points — direction agnostic.

Dollar Risk Budget

Base risk is a flat percentage of effective account equity: (accountSize + closed realized P&L) × maxRiskPctPerTrade when accountEquityAutoAdjust is enabled. This is then scaled by the automatic risk multiplier from riskScaleForNextTrade(): drawdown state, conviction tier, and bounded ATR-risk trim from stop width, target reach, and entry-to-VWAP extension. No regime multiplier: the regime gates already filter which setups fire; sizing does not second-guess what the gate decided.
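The budget formula can be worked through directly. A minimal sketch, assuming `risk_scale` arrives already computed (the internals of riskScaleForNextTrade() are not shown in this atlas) and that the FULL and HALF tiers map to 1.0 and 0.5; PROBE bypasses the calculation and is not modeled.

```python
SIZE_MODE = {"FULL": 1.0, "HALF": 0.5}  # PROBE is a fixed 1-micro veto, not a multiplier


def risk_budget(
    account_size: float,
    closed_pnl: float,
    max_risk_pct: float,
    risk_scale: float,
    tier: str = "FULL",
    auto_adjust: bool = True,
) -> float:
    """effectiveAccount x maxRiskPctPerTrade x riskScale x sizeMode, in dollars."""
    # With accountEquityAutoAdjust on, closed realized P&L moves the base.
    effective = account_size + closed_pnl if auto_adjust else account_size
    return effective * max_risk_pct * risk_scale * SIZE_MODE[tier]
```

With the defaults listed in this card (25,000 account, 1% risk), a full-scale trade has a $250 budget, and an 85% risk scale trims it to $212.50.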

Contract Count & Instrument Auto-Select

engine-position-sizer.js now optimizes executable contract mixes instead of forcing one instrument class. It can recommend all-micro, all-mini, or mixed output such as 1 NQ + 2 MNQ when that best fits the adjusted all-in risk budget. Dollar risk includes commissions: minis use $3.50 round turn, micros use $1.50. The chip displays all-in economics and the exact contract mix.

Size Tiers — FULL / HALF / PROBE

Three manual tiers appear on every signal card. FULL (default — no click required): trade the full adjusted risk budget. HALF: cut the budget to 50% before computing contracts. PROBE: bypass all math — always 1 micro, regardless of stop distance, drawdown, or account size. PROBE is not a sizing calculation — it is a veto. The trade is on, but exposure is minimal. Tier selection persists per setup ID for the session; clicking a new tier re-renders the chip immediately.

Instrument Specs

Instrument	Point Value	Type	Round Turn
NQ	$20 / pt	Standard	$3.50
MNQ	$2 / pt	Micro	$1.50
ES	$50 / pt	Standard	$3.50
MES	$5 / pt	Micro	$1.50
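Using the specs above, all-in risk per contract is stop distance times point value plus the round-turn fee. The sketch below implements only the simple "standard if it fits, else micro" selector; the mixed mini-plus-micro optimizer in engine-position-sizer.js is not reproduced, and charging the round-turn fee per contract is an assumption.

```python
import math

SPECS = {  # instrument: (point value $, round-turn commission $)
    "NQ": (20.0, 3.50),
    "MNQ": (2.0, 1.50),
    "ES": (50.0, 3.50),
    "MES": (5.0, 1.50),
}
MICRO_OF = {"NQ": "MNQ", "ES": "MES"}


def all_in_risk(instrument: str, entry: float, stop: float, contracts: int) -> float:
    """Stop risk plus commissions for a given contract count."""
    point_value, round_turn = SPECS[instrument]
    return (abs(entry - stop) * point_value + round_turn) * contracts


def size_position(symbol: str, entry: float, stop: float, budget: float):
    """Try the standard contract first; fall back to the micro if none fits."""
    for instrument in (symbol, MICRO_OF[symbol]):
        per_contract = all_in_risk(instrument, entry, stop, 1)
        contracts = math.floor(budget / per_contract)
        if contracts >= 1:
            return instrument, contracts
    return None, 0  # even one micro exceeds the budget
```

With a $250 budget, a 10-point NQ stop costs $203.50 all-in and fits one NQ; a 20-point stop costs $403.50, so the selector falls back to 6 MNQ at $41.50 each.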

Position Sizing Curve (Drawdown Scale)

Drawdown (R)	Scale	Effect on $250 base budget
0	100%	$250
-2	85%	$213
-4	70%	$175
-6	55%	$138
-8	40%	$100

Config Fields

Field	Default	Location
accountSize	25,000	config.js · INSTITUTIONAL_TUNING
accountEquityAutoAdjust	true	config.js · INSTITUTIONAL_TUNING
maxRiskPctPerTrade	0.01 (1%)	config.js · INSTITUTIONAL_TUNING
roundTurnCommissionByInstrument	NQ/ES 3.50 · MNQ/MES 1.50	config.js · INSTITUTIONAL_TUNING
sizeByDrawdown	curve above	config.js · INSTITUTIONAL_TUNING

Calibration Rationale

1% fixed risk per trade is the standard institutional starting point — aggressive enough to compound, conservative enough to survive a cold streak without account damage.
All-in sizing prevents commission drag from being invisible: contract count is based on stop risk plus round-turn fees.
Auto instrument selection now supports mixed mini+micro sizing, so full-size progression can be precise without excessive micro-only commissions.
PROBE tier exists for high-uncertainty setups where the trader wants presence but not exposure — one micro is near-zero cost and still captures the full experience in the substrate.
No regime multiplier by design: if a setup passed all gates, it has already been regime-filtered. Sizing adds only execution-geometry risk via ATR-proxy stop/target/VWAP measurements.

Engine Source engine-position-sizer.js (2026-05-24) · config.js · INSTITUTIONAL_TUNING · engine-dashboard-renderers.js · renderActionable() · engine-trade-management.js · buildSignalRecord() · engine-pipeline.js · EVDecision.log() · weekly_report.py

Substrate Mirror sizedInstrument / sizedContracts / sizedGrossDollarRisk / sizedNetDollarRisk / sizedCommissionDollars on signal records; sized*AtFire fields in EV substrate Section 25; weekly_report slices: instrument, contracts, mode, all-in risk, fees, effective account.

Academic Basis Kelly criterion literature · Tharp (1998) fixed-fraction sizing · Auto instrument select: prop desk micro-contract conventions

Signal Grading

Scoring, EV & Grading

sigmoid · 64% cap · 12-trade blend

How confluence score converts to a win probability via a conservative sigmoid model — and what each grade means.

Score Sigmoid Blend EV Grade

Core Decision

Confluence score converts to win probability via a sigmoid capped at 64%, then computes EV to decide whether to fire.

Win Prob: sigmoid inflected at score 72, ceiling 64%

EV: (winProb × R:R) − (1−winProb) × 1R, fire at EV ≥ 0.0

When It Works: EV ≥ 0.0 fires the signal; EV ≥ 0.8 earns a premium badge for highest-conviction setups.

When It Fails: EV negative, so the setup is computed but suppressed. Near-miss (EV ≥ -0.2) tracked as Armed for observation.

Sigmoid Win Probability

Confluence score (0–100) doesn't directly fire a signal. It first converts to a win probability through a sigmoid curve inflected at score 72, capped at a 64% ceiling. The reasoning: even perfect alignment doesn't guarantee >64% win rate in futures markets. A score of 55 maps to ~25% win probability; 72 maps to ~43%; 90 maps to ~61%.
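One logistic curve that reproduces the three quoted points can be sketched as follows. The inflection (72) and ceiling (64%) come from the text; the 22% floor and the 0.1425 slope are values fitted here to the quoted points (55 → ~25%, 72 → ~43%, 90 → ~61%) and are assumptions, not engine constants.

```python
import math

CEILING = 0.64     # from the text
INFLECTION = 72.0  # from the text
FLOOR = 0.22       # assumption: fitted so that score 72 maps to ~43%
SLOPE = 0.1425     # assumption: fitted to the 55 and 90 reference points


def win_probability(score: float) -> float:
    """Logistic curve between FLOOR and CEILING, centered on the inflection score."""
    score = max(0.0, min(100.0, score))  # confluence score is defined on 0-100
    return FLOOR + (CEILING - FLOOR) / (1.0 + math.exp(-SLOPE * (score - INFLECTION)))
```

The same fitted curve gives roughly 54% at score 80 and 40% at score 70, in line with the grade table in this card.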

Historical Win Rate Blending

If the setup type has ≥12 completed trades in the outcome registry, the engine blends historical win rate into the curve estimate. Historical data normally gets up to 85% weight, so the curve contributes at least 15%. If the actual win rate underperforms the curve by >8%, the historical weight is boosted to 92% and the curve's share drops to 8%: the model recognizes it was overestimating and defers to reality.

Expected Value & Multipliers

Expected Value is then computed as: EV = (winProb × R:R) − (1−winProb) × 1R. This is further adjusted by six multipliers: market fit (0.6–1.0×), directional alignment (±5%, −12% for conflict), confirmation status (+4%/−2%), production mode (+3%/−2%), sweep/absorption bonus (+6%), and chop penalty (−5% to −15%). The final win probability is hard-bounded between 15% and 72%.

Grade Thresholds

Grade	Score	Approx Win Prob
S (elite)	≥90	~61%
A+	80–89	~54%
A	70–79	~40%
B	<70	<37%

EV Decision Thresholds

EV Decision	Threshold
Signal (Hot Run)	EV ≥ 0.0
Signal (Automated Run)	EV ≥ 0.5
Premium badge	EV ≥ 0.8
Armed (near-miss)	EV ≥ -0.2
Historical blend trigger	≥12 closed trades
Win prob ceiling	64% (sigmoid cap)
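The EV formula and the threshold table combine into a small decision function. A sketch with illustrative labels; the multipliers and the historical blend are left out, and treating a sub-threshold Automated-mode setup as Armed when EV ≥ -0.2 is an assumption.

```python
SIGNAL_THRESHOLD = {"HOT": 0.0, "AUTOMATED": 0.5}
PREMIUM_EV = 0.8
ARMED_EV = -0.2


def expected_value(win_prob: float, reward_to_risk: float) -> float:
    """EV in R: (winProb x R:R) - (1 - winProb) x 1R."""
    return win_prob * reward_to_risk - (1.0 - win_prob) * 1.0


def ev_decision(win_prob: float, reward_to_risk: float, mode: str = "HOT") -> str:
    """Map EV to a decision using the table's thresholds."""
    ev = expected_value(win_prob, reward_to_risk)
    if ev >= SIGNAL_THRESHOLD[mode]:
        return "PREMIUM_SIGNAL" if ev >= PREMIUM_EV else "SIGNAL"
    if ev >= ARMED_EV:
        return "ARMED"  # near-miss, tracked for observation
    return "SUPPRESSED"
```

At a 43% win probability and 2:1 reward-to-risk, EV is 0.86 − 0.57 = +0.29R: a signal in Hot Run, but below the 0.5 bar for Automated Run. In general EV ≥ 0 requires winProb ≥ 1 / (1 + R:R).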

Calibration Rationale

The sigmoid inflection at 72 was set empirically: substrate data shows a sharp inflection in outcome quality around that score.
The 64% ceiling is a market-structure constraint — no retail-accessible strategy sustains >64% win rate in liquid futures at meaningful holding periods.
The 12-trade minimum before blending historical WR prevents overfitting to a handful of early results.
Hot Run's 0.0 EV threshold is intentionally permissive — we're collecting signal-quality data, not filtering for profit.

Session & Data Quality

Session Windows, Slippage & Data Quality

64 quality floor · 3.2t MOC

How long the engine remembers, what it assumes about execution cost, and when it declares itself blind.

Memory Slippage Quality Halt

Core Decision

Target memory tightens through the day, slippage is modeled conservatively per session, and data quality below 64/100 halts production.

Quality: score from 100 downward · floor 64/100 → production halted

Feed: < 10 ticks/min = hard stop · < 22 = warning

When It Works: Quality ≥85 (grade A). Full production with accurate slippage estimates and fresh target memory.

When It Fails: Quality <64 (grade D). Engine halts. A blank screen beats a wrong one.

Target Memory Windows

Target memory tightens through the trading day. In Asia (slow, level-driven), a liquidity target persists in memory for 26 minutes and up to 130 ticks away. By NY (fast, momentum-driven), the same target is stale at 18 minutes / 100 ticks. The scan range — how far the engine looks for new targets — follows the same tightening: 90 ticks in Asia, 64 in NY.

Slippage Model

Slippage is modeled per session and adjusted by volume regime. Base assumptions range from 1.8 ticks (Asia, thin but orderly) to 3.2 ticks (MOC, thin and chaotic). A volume multiplier scales these: EXTREME volume costs 1.6× the base slippage. Reversion setups get a 0.9× style discount (tighter fills at levels); continuation setups get 1.1× (looser fills chasing). A flat 2-tick round-trip cost is added for commissions and spread.

Data Quality Scoring

Data quality is scored from 100 downward, with deductions for each data gap. Missing price = −35. Stale feed = −24. Missing order book = −5. The quality floor is 64/100 (grade C) — below that, the engine halts production. The philosophy: a blank screen beats a wrong one. Feed health is monitored via a 1-minute sliding window of tick arrivals; below 10 ticks/minute = hard stop, below 22 = warning.

Session Memory Windows

Window	Asia	London	NY
Target memory	26 min	22 min	18 min
Max distance	130t	115t	100t
Scan range	90t	78t	64t

Slippage Assumptions

Session	Base (ticks)	At EXTREME vol (ticks)
Opening Drive	3.0	4.8
Power Hour	2.6	4.16
MOC	3.2	5.12
Lunch	2.2	3.52
London	2.0	3.20
Asia	1.8	2.88
Style adj.	reversion ×0.9 · continuation ×1.1
Round-trip cost	2 ticks fixed
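The slippage assumptions compose as base × volume multiplier × style adjustment, plus the fixed round-trip cost. A minimal sketch: only the EXTREME multiplier (1.6×) is quoted in this card, so every other regime defaults to 1.0 here as an assumption, and the session keys are illustrative spellings of the table rows.

```python
BASE_TICKS = {
    "OPENING_DRIVE": 3.0,
    "POWER_HOUR": 2.6,
    "MOC": 3.2,
    "LUNCH": 2.2,
    "LONDON": 2.0,
    "ASIA": 1.8,
}
VOLUME_MULT = {"EXTREME": 1.6}  # other regimes: assumed 1.0 (not quoted in the atlas)
STYLE_MULT = {"reversion": 0.9, "continuation": 1.1}
ROUND_TRIP_TICKS = 2.0


def modeled_cost_ticks(session: str, regime: str, style: str) -> float:
    """Modeled execution cost in ticks for one trade."""
    slippage = BASE_TICKS[session] * VOLUME_MULT.get(regime, 1.0) * STYLE_MULT[style]
    return slippage + ROUND_TRIP_TICKS
```

A continuation setup into the close at EXTREME volume is the worst case in this table: 3.2 × 1.6 × 1.1 + 2 ≈ 7.6 ticks.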

Data Quality Grades

Grade (Score)	Impact
A (≥85)	Full production
B (72–84)	Production with penalty
C (64–71)	Minimum passing
D (<64)	Production halted
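The quality score, grade bands, and feed-health limits can be sketched together. Only the three deductions quoted in this card are included; the gap key names are illustrative, and counting each gap type once is an assumption.

```python
DEDUCTIONS = {"missing_price": 35, "stale_feed": 24, "missing_order_book": 5}
QUALITY_FLOOR = 64
FEED_HARD_STOP = 10  # ticks per minute
FEED_WARNING = 22


def quality_score(gaps) -> int:
    """Start at 100 and subtract one deduction per distinct data gap."""
    return max(0, 100 - sum(DEDUCTIONS[g] for g in set(gaps)))


def grade(score: int) -> str:
    if score >= 85:
        return "A"
    if score >= 72:
        return "B"
    if score >= QUALITY_FLOOR:
        return "C"
    return "D"


def feed_status(ticks_per_min: float) -> str:
    if ticks_per_min < FEED_HARD_STOP:
        return "HARD_STOP"
    if ticks_per_min < FEED_WARNING:
        return "WARNING"
    return "OK"


def production_allowed(score: int, ticks_per_min: float) -> bool:
    """Production needs both a passing grade and a live feed."""
    return score >= QUALITY_FLOOR and feed_status(ticks_per_min) != "HARD_STOP"
```

A missing price alone leaves the score at 65 (grade C, still passing); adding a stale feed drops it to 41 and halts production.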

Calibration Rationale

Target memory windows tighten because volatility accelerates through the trading day — a level that's relevant for 26 minutes in Asia is ancient by NY.
The slippage model is intentionally conservative: it should overestimate execution cost, not underestimate it.
The MOC session gets the highest base (3.2t) because market-on-close flow creates the most adverse fills relative to displayed liquidity.
The 64/100 quality floor is aggressive by design — it means any two medium-severity data gaps halt the engine.

Serial AND cascade — each gate must pass. Hard blocks terminate immediately. Soft blocks accumulate as WAIT. Advisory gates (Phase A) compute but don't enforce.

Pipeline Gate

Gate 0 — Institutional Hard Blocks

3 sub-gates · hard block

System-level kill switches. Three sub-gates that terminate before anything else runs.

Check Evaluate Decide Record

Core Decision

Any institutional block, symbol kill, or quarantined setup terminates the signal immediately.

Veto: institutional block flag set → signal dies

Kill Switch: per-symbol toggle → hard block

Quarantine: blacklisted setup → compute only, never SIGNAL

Gate Passes: No institutional blocks, symbol is live, setup is not quarantined; signal proceeds to Gate 1.

Gate Blocks: Signal dies immediately with no appeal. Quarantined setups continue computing for substrate observation only.

Institutional Veto

Gate 0 is the pipeline's first checkpoint and the only one that cannot be overridden by any downstream condition. It runs three independent checks in sequence. Gate 0 (Institutional Veto) scans for any reason the system has flagged as an institutional-level block — circuit breakers, exchange halts, or operator-set kill conditions. If any flag is set, the signal dies immediately with no appeal.

Symbol Kill Switch & Setup Quarantine

Gate 0a (Symbol Kill Switch) is a per-symbol manual toggle. During Phase 1 calibration, ES was locked in analysis-only mode — setups were computed and logged but never promoted to signal status. Gate 0b (Setup Quarantine) checks a static blacklist of setups that failed in live data. Quarantined setups continue computing (substrate observation only) but can never promote to SIGNAL. BUILD PHASE (2026-05-22): all 8 previously quarantined setups reinstated — prior data declared invalid (structurally broken engine). Fresh data collection will re-evaluate all 50 setups on equal footing.

Institutional Veto

Any institutional block reason set → signal dies. No appeal.

Hard

Symbol Kill Switch

Per-symbol toggle. ES was analysis-only during Phase 1 calibration.

Hard

Setup Quarantine

BUILD PHASE: all quarantines lifted (2026-05-23). 0 setups currently benched. Prior quarantine data invalidated by engine restructure.

Hard

Calibration Rationale

The quarantine list is data-driven, not opinion-driven — each benched setup was quarantined after its rolling performance fell below the retirement threshold or showed structural failure (0% WR).
Rather than deleting them, the engine keeps them computing for observation — if market conditions change and the setup rehabilitates in substrate data, it can be unbenched.
The kill switch was used to keep ES in watch-only mode until its own calibration data was collected.

Pipeline Gate

Gate 1 / 1.5 — Market Tradeable + State Confidence

50% confidence floor

Is the market open and classified with enough confidence to act?

Check Evaluate Decide Record

Core Decision

Market must be open, state confidence must be at least 50%, and production mode must be active.

G1: market open + valid state → hard/soft by reason

G1.5a: state confidence < 50% → soft block

G1.5b: tradeMode = NONE → score penalty

G1.5c: regime transition → proportional penalty

Gate Passes: Market is open, the engine trusts its classification, and production mode is active; signal proceeds to bias check.

Gate Blocks: Pre-market or post-close conditions hard-block. Low confidence or a missing thesis imposes soft score penalties.

Market Tradeable

Gate 1 consults the market context analyzer to determine whether the market is open and in a valid, tradeable state. The severity of any block depends on the reason: a pre-market or post-close condition is a hard block (no point evaluating setups when the market is closed), while a data-quality degradation might only impose a soft penalty.

State Confidence & Transition Risk

Gate 1.5 evaluates three separate confidence dimensions. 1.5a (State Confidence) requires the engine's own classification of the current market context to be at least 50% confident. Below that threshold, the engine doesn't trust its regime classification — it might be calling a "trending" market when it's actually rotating, and every downstream gate depends on that classification being at least plausible. 1.5b (Production Mode) penalizes when tradeMode = NONE — no sweep+trap pattern detected and no trend exception active. This means the engine has no high-confidence thesis about what the market is doing. 1.5c (Transition Risk) applies a proportional penalty when the market is transitioning between regimes (e.g., from range to trend), because regime boundaries are statistically where false signals cluster.

Market Tradeable

Market context analyzer says the market is open and valid. Severity-based: hard or soft depending on the block reason.

Var

1.5a

State Confidence ≥ 50

Production state must be at least 50% confident. Below that the engine doesn't trust its own classification.

Soft

1.5b

Production Mode

If tradeMode = NONE (no sweep+trap, no trend exception), score takes a penalty.

Soft

1.5c

Transition Risk

Market transitioning between regimes → score penalty proportional to transition risk score.

Soft

Calibration Rationale

The 50% confidence floor is generous — it's a sanity check, not a filter. Below 50 the engine is essentially guessing what kind of market this is, and guessing is not a strategy.
Transition penalties exist because regime boundaries are where false signals cluster — a mean-reversion setup that fires during a regime transition to trend will get run over.

Pipeline Gate

Gate 2.5 — Bias Arbitrator

55% min · 8% edge

G2.5

Does the setup's direction match the market's directional bias? Five modules vote.

Check Evaluate Decide Record

Core Decision

Setup direction must align with the declared bias from five voting modules.

Bias: side > 55% AND lead ≥ 8% → directional bias declared

Conflict: both sides > 55% within 8% → CONFLICT

Override: actionable sweep/absorption → bias check skipped

Gate Passes: Setup direction aligns with declared bias, or an actionable sweep/absorption event overrides the check.

Gate Blocks: Direction opposes bias; score penalty applied. CONTINUATION_FALLBACK costs about -1.6R on average.

Voting Framework

The Bias Arbitrator aggregates votes from five independent directional modules — each contributes a long score and a short score based on its own evidence. The arbitrator then applies a voting framework: if one side exceeds 55% and leads the other by at least 8%, a directional bias is declared. If both sides exceed 55% within 8% of each other, the state is CONFLICT — not uncertainty, but genuine contradictory evidence, which is worse for signal quality than having no bias at all.

Sweep Override & Continuation Fallback

The gate checks whether the proposed setup's direction aligns with the declared bias. A long setup in a SHORT bias environment gets a penalty. However, a critical exception exists: if there's an actionable sweep or absorption event, the bias check is skipped entirely. The logic is that liquidity events (stop hunts, institutional absorption at a level) override directional bias — a trapped short squeeze doesn't care what the trend says.

The CONTINUATION_FALLBACK state deserves attention: when bias direction matches but confidence is below the decisive threshold, the arbitrator falls back to a continuation assumption. Cohort attribution analysis found this fallback costs approximately -1.6R on average — it's a known weak spot that ships as a soft penalty rather than a hard block, awaiting more data to decide its fate.

Arbitrator Tuning

Tuning	Value
Side minimum	55% (score to declare side)
Side edge	8% (L-S gap for decisive call)
Conflict minimum	55% (both sides ≥ this = CONFLICT)
Conflict edge	8% (max gap to be "conflict" not "edge")
None max	55% (below = no direction)
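
The voting thresholds above can be sketched as a single decision function. It takes the already-aggregated long and short scores (0 to 100) as inputs, since how the five modules are combined is not described here. The function name and the handling of a side that clears 55% without an 8-point lead (returned as NONE) are assumptions.

```python
SIDE_MIN = 55.0   # score a side needs before it can be declared
SIDE_EDGE = 8.0   # lead required for a decisive call


def declare_bias(long_score, short_score):
    """Return LONG, SHORT, CONFLICT, or NONE from aggregated side scores."""
    gap = abs(long_score - short_score)

    # Both sides have a case and neither leads decisively.
    if long_score >= SIDE_MIN and short_score >= SIDE_MIN and gap < SIDE_EDGE:
        return "CONFLICT"

    if long_score > SIDE_MIN and long_score - short_score >= SIDE_EDGE:
        return "LONG"
    if short_score > SIDE_MIN and short_score - long_score >= SIDE_EDGE:
        return "SHORT"

    # Assumption: a side above 55 with a thin lead is treated as no bias.
    return "NONE"


def bias_check_applies(actionable_sweep_or_absorption):
    """The bias check is skipped when a liquidity event is actionable."""
    return not actionable_sweep_or_absorption
```

With these rules, 60 vs 58 resolves to CONFLICT, 70 vs 62 to LONG, and 40 vs 50 to NONE.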

Calibration Rationale

The 8% edge requirement prevents firing on razor-thin directional advantages that could flip on the next bar.
CONFLICT state isn't "we don't know" — it's "both sides have a case," which is empirically worse for signal outcomes than NONE (where neither side has evidence).
The sweep/absorption override exists because liquidity events carry their own directional conviction independent of trend.

Pipeline Gate

Gate 2.6 / 2.7 — Market Story + Chop

55 fit · severe chop = kill

G2.6

Does the setup family match what the auction is doing — and is the market in chop?

Check Evaluate Decide Record

Core Decision

Setup family must fit the auction narrative; severe chop without liquidity events kills the signal.

G2.6: story fit < 55 → soft penalty · fit < 35 + score < 60 + CHOP_NO_EDGE → hard block

G2.7: SEVERE_NO_EDGE_CHOP without sweep/absorption → hard block

Gate Passes: Setup family fits the auction narrative with a fit score of at least 55, or chop is mild enough to allow continuation.

Gate Blocks: Severe chop hard-blocks the signal. Low story fit imposes a soft penalty; extreme mismatch escalates to a hard block.

Market Story Fit

Gate 2.6 (Market Story Fit) computes a 0-100 score measuring how well the proposed setup family matches the current auction narrative. A continuation setup in a trending market scores high; the same setup in a rotating, mean-reverting market scores low. Below 55, the signal takes a soft penalty. Below 35, with a raw confluence score below 60 and the market classified as CHOP_NO_EDGE, the gate escalates to a hard block — the thesis has no structural support.

Chop Permissions

Gate 2.7 (Chop Permissions) is the engine's most aggressive defensive gate. When the market story evaluator classifies the environment as SEVERE_NO_EDGE_CHOP — meaning no directional conviction, no institutional footprint, and no structural level nearby — the gate hard-blocks unless there's an actionable sweep or absorption event. Other chop states (MILD_CHOP, ROTATIONAL_CHOP) impose graduated score penalties but allow the signal to continue. The distinction matters: mild chop can produce legitimate mean-reversion setups at levels, but severe chop is a fee-burning machine.
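
A minimal sketch of the two checks, using the documented thresholds and state names. The function names and return labels are hypothetical, and treating a severe-chop override as a clean pass (rather than a reduced penalty) is an assumption.

```python
def story_fit_gate(fit, confluence, story):
    """Gate 2.6: escalate to a hard block only on the clearest no-edge case."""
    if fit < 35 and confluence < 60 and story == "CHOP_NO_EDGE":
        return "HARD_BLOCK"
    if fit < 55:
        return "SOFT_PENALTY"
    return "PASS"


def chop_gate(chop_state, actionable_sweep_or_absorption):
    """Gate 2.7: severe chop blocks unless a liquidity event is actionable."""
    if chop_state == "SEVERE_NO_EDGE_CHOP":
        # Assumption: the override lets the signal through unpenalized.
        return "PASS" if actionable_sweep_or_absorption else "HARD_BLOCK"
    if chop_state in ("MILD_CHOP", "ROTATIONAL_CHOP"):
        return "SOFT_PENALTY"
    return "PASS"
```

Note that all three conditions must hold for the Gate 2.6 hard block: a fit of 30 with a confluence score of 65 still only draws the soft penalty.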

2.6

Market Story Fit

Market fit score 0–100. Below 55 → soft block. Below 35 with score <60 and story = CHOP_NO_EDGE → hard block.

Soft

2.7

Chop Permissions

SEVERE_NO_EDGE_CHOP without actionable sweep/absorption → hard block. Other chop states → score penalty.

Hard*

Calibration Rationale

CHOP_NO_EDGE is the engine saying "this market is going nowhere and there's no institutional footprint to ride." Hard-blocking it prevents the most expensive error in trading: forcing a trade when there's nothing to trade.
The 35/60 threshold is deliberately low — only the clearest "no edge" conditions trigger it.
The sweep/absorption override exists because these events can break a chop regime entirely.

Pipeline Gate

Gate 3 / 3.5 — Regime + Session Playbook

3 sessions · 6 families

Setup style must match market regime, and only approved families run per session.

Check Evaluate Decide Record

Core Decision

Continuation setups require trend/expansion; reversion requires range/rotation. Session playbook restricts families.

G3 STRICT: style mismatch → hard block · low_participation → hard block

G3 BALANCED: style mismatch → soft penalty

G3.5: setup outside session playbook → blocked or downgraded

Gate Passes: Setup style matches regime and is in the session's approved playbook; signal proceeds to state validity checks.

Gate Blocks: Style mismatch blocks (STRICT) or penalizes (BALANCED). Low participation hard-blocks all setups in either mode.

Regime Compatibility

Gate 3 (Regime Compatibility) enforces that continuation setups only fire in trend/expansion regimes, and reversion setups only fire in range/rotation regimes. In STRICT mode, a style mismatch is a hard block — a breakout setup in a ranging market simply cannot fire. In BALANCED mode, the mismatch imposes a soft penalty, allowing the setup to continue at a score disadvantage. Both modes hard-block all setups in low_participation regimes, because thin markets produce unreliable signals regardless of setup quality.

Session Playbook

Gate 3.5 (Session Playbook) restricts which setup families are approved for each trading session. Asia (slow, level-driven) permits key-mag, exhaust-rev, and vpoc-mig — setups that work with clear levels and minimal flow. London permits ib-reject and trap-rev — setups that exploit the London open's stop-hunting patterns. NY (institutional flow-driven) permits ib-brk, ib-ext, and flow-surge — setups that ride directional momentum. Setups outside their playbook are either blocked (STRICT) or downgraded (BALANCED). The playbook is calibrated from historical outcome data grouped by session, not from theory about what "should" work.
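
The regime and playbook rules can be sketched as two lookups. The playbook contents and the STRICT/BALANCED behaviour are taken from the text; the regime labels ("trend", "expansion", "range", "rotation"), function names, and return labels are assumptions for illustration.

```python
# Setups approved per session (from the Session Playbook description).
PLAYBOOK = {
    "Asia": {"key-mag", "exhaust-rev", "vpoc-mig"},
    "London": {"ib-reject", "trap-rev"},
    "NY": {"ib-brk", "ib-ext", "flow-surge"},
}

# Regimes in which each setup style is allowed to fire.
ALLOWED_REGIMES = {
    "continuation": {"trend", "expansion"},
    "reversion": {"range", "rotation"},
}


def regime_gate(style, regime, profile):
    """Gate 3: low participation blocks everything; mismatch depends on profile."""
    if regime == "low_participation":
        return "HARD_BLOCK"
    if regime in ALLOWED_REGIMES[style]:
        return "PASS"
    return "HARD_BLOCK" if profile == "STRICT" else "SOFT_PENALTY"


def playbook_gate(setup, session, profile):
    """Gate 3.5: setups outside the session playbook are blocked or downgraded."""
    if setup in PLAYBOOK[session]:
        return "PASS"
    return "BLOCKED" if profile == "STRICT" else "DOWNGRADED"
```

For example, ib-brk in Asia returns BLOCKED under STRICT and DOWNGRADED under BALANCED, while the same setup in NY passes in both profiles.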

Regime Compatibility

Continuation setups allowed in trend/expansion. Reversion setups allowed in range/rotation. All setups banned in low_participation. STRICT = hard block, BALANCED = soft.

Var

3.5

Session Playbook

Asia: key-mag, exhaust-rev, vpoc-mig. London: ib-reject, trap-rev. NY: ib-brk, ib-ext, flow-surge. Outside playbook = blocked (STRICT) or downgraded (BALANCED).

Var

Calibration Rationale

Not every setup works in every session. Asia is slow and level-driven — breakout setups waste capital chasing moves that don't follow through.
NY has real institutional flow — reversal setups against that flow lose.
The low_participation hard block protects against the deadliest trap: a "perfect" setup in an empty market where there's no one to move price in your favor.

Pipeline Gate

Gate 3.6 / 3.7 — State Validity + Dominance

gap + breakout + dominance

G3.6

Gap authority, breakout direction, and tier-1 dominance scope.

Check Evaluate Decide Record

Core Decision

Setup must not fight gap magnetism, clean breakouts, or active tier-1 signals.

G3.6a: gap opposes direction → soft block (sweep+trap overrides)

G3.6b: untrapped breakout opposite → soft block

G3.7: tier-1 active (ldn-sweep, fail-auc) → lower tiers yield

Gate Passes: No gap conflict, no opposing breakout, no tier-1 dominance contention; signal proceeds to institutional checks.

Gate Blocks: Soft penalties for gap or breakout headwinds. Lower-tier setups are suppressed when tier-1 is active.

Gap Authority & Breakout State

Gate 3.6a (Gap Authority) checks whether a significant overnight gap conflicts with the setup's direction. A long setup when there's a large bearish gap (price opened well below prior close) faces headwind — the gap acts as an overhead magnet pulling price back. This is a soft block, not hard, because a confirmed sweep+trap can override the gap's authority.

Gate 3.6b (Breakout State) penalizes setups that fight a clean, untrapped breakout. If price has broken out of a range in one direction and no trap has been confirmed, setups pointing the opposite way are swimming upstream.

Tier-1 Dominance

Gate 3.7 (Tier-1 Dominance) is a priority arbitration mechanism: when a tier-1 setup (London sweep, failed auction — the highest-conviction patterns in the system) is active, lower-tier setups in the opposite direction are suppressed. Even lower-tier setups in the same direction yield — the tier-1 setup takes the slot. This prevents signal clutter and ensures the best signal gets the attention.

3.6a

Gap Authority

Significant gap conflicts with setup direction → soft block, unless sweep+trap is active.

Soft

3.6b

Breakout State

Clean breakout opposite to setup direction without trap confirmation → soft block.

Soft

3.7

Tier-1 Dominance

When a tier-1 setup (ldn-sweep, fail-auc) is active in opposite direction, lower-tier setups yield. Same direction but lower tier also yields.

Soft

Calibration Rationale

When the best setup in the system is pointing one way, lesser setups pointing the other way should defer.
Tier-1 setups have the strongest historical edge — when they speak, the rest of the pipeline listens.
The gap override for sweep+trap reflects that institutional stop-hunting events can negate gap magnetism entirely.

Pipeline Gate

Gate 3.8 / 3.9 — Whale + OI Conviction

whale · OI · soft

G3.8

Institutional positioning checks — are whales fighting you, and does open interest contradict your trade?

Check Evaluate Decide Record

Core Decision

Institutional DOM clustering and open interest dynamics must not contradict the setup's direction.

G3.8: whale defense opposes direction → soft penalty · alignment = research note

G3.9: SHORT_COVERING + long → penalty · LONG_CAPITULATION + short → penalty

Gate Passes: Institutional positioning doesn't contradict the setup, or aligns with it; signal proceeds to final qualification.

Gate Blocks: Soft penalties for opposing whale defense or OI contradictions. These are not hard blocks, because institutions can be wrong at turning points.

Whale Defense

Gate 3.8 (Whale Defense) reads DOM (Depth of Market) clustering patterns. When institutional-sized orders cluster at resistance above the current price (defense-resistance) or at support below (defense-support), the engine detects "whale defense" — large players actively protecting a level. If their defense opposes the setup's direction (e.g., massive sell-side clustering at resistance while a long setup tries to fire), the setup takes a soft penalty. If whale defense aligns with the setup, it's logged as a research note but doesn't adjust the score — alignment is expected, not bonus-worthy.

OI Conviction

Gate 3.9 (OI Conviction) cross-references open interest dynamics with the setup's direction. A long setup firing during a SHORT_COVERING environment gets penalized: the rally looks like buying, but OI falling says it's shorts closing, not new longs entering — the rally is structural unwind, not genuine demand. Similarly, a short setup during LONG_CAPITULATION faces a penalty because the selloff is exhaustion, not new bearish conviction. Counter-trend penalties apply when longing against REAL_DOWNTREND or shorting against REAL_UPTREND. All OI gates are soft penalties because institutional positioning can be wrong, and a strong enough setup-level signal can override positioning headwinds.
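
The OI Conviction pairings above reduce to a small lookup table. The sketch below only identifies which pairings draw a soft penalty and why. Penalty sizes are not documented, so none are invented here, and the function name and direction labels are hypothetical.

```python
# (setup direction, OI state) pairs that draw a soft penalty, with the reason.
OI_PENALTIES = {
    ("LONG", "SHORT_COVERING"): "rally is shorts closing, not new longs",
    ("SHORT", "LONG_CAPITULATION"): "selloff is exhaustion, not new shorts",
    ("LONG", "REAL_DOWNTREND"): "counter-trend",
    ("SHORT", "REAL_UPTREND"): "counter-trend",
}


def oi_conviction_penalty(direction, oi_state):
    """Return the penalty reason, or None when OI does not contradict the setup."""
    return OI_PENALTIES.get((direction, oi_state))
```

A long during REAL_UPTREND returns None, so the gate adds nothing; a long during SHORT_COVERING returns the structural-unwind reason and would carry a soft penalty.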

3.8

Whale Defense

Institutional DOM clustering (defense-resistance, defense-support) opposing setup direction → soft penalty. Alignment = research note only.

Soft

3.9

OI Conviction

Long setup + SHORT_COVERING → "rally is fake." Short setup + LONG_CAPITULATION → "selloff is exhaustion." Long vs REAL_DOWNTREND or short vs REAL_UPTREND → counter-trend penalty.

Soft

Calibration Rationale

These are the "bigger fish" gates — if institutional positioning contradicts the setup, the setup is fighting the wrong crowd.
Short covering rallies look identical to real buying on the tape, but they die once the shorts finish covering. OI dynamics distinguish real conviction from structural unwind.
Soft penalties, not hard blocks, because institutions can be wrong too — and often are at turning points.

Pipeline Gate

Gate 4–8 — Score, Confirmation & Risk

5 gates · retire to RR

G4-8

Final qualification: retirement check, confluence threshold, critical fails, entry timing, and risk/reward validation.

Check Evaluate Decide Record

Core Decision

Setup must pass retirement, confluence, critical-fail, confirmation, and R:R checks to fire.

G4: rolling expectancy < -0.12R (6+ trades) → benched

G5: confluence < ~68% dynamic threshold → soft block

G6: 2+ critical fails (weight ≥18) → hard block

G8: R:R < rrFloor (1.5 HOT / 2.0 AUTOMATED) → BLOCKED

Gate Passes: Setup is not retired, confluence meets threshold, no critical failures, R:R is at or above rrFloor (1.5 HOT / 2.0 AUTOMATED); eligible for SIGNAL.

Gate Blocks: Retirement benches the setup. Critical fails hard-block. R:R below the floor forces BLOCKED regardless of score quality.

Retirement & Confluence

Gate 4 (Retirement) checks the setup's rolling expectancy across its last 12 closed trades. Below -0.12R expectancy with at least 6 completed trades — the setup is benched. It continues computing (for substrate observation) but can never fire. Between -0.05R and -0.12R it's flagged as "Degrading" with a -5 priority penalty. Gate 5 (Confluence Threshold) is the raw score check: the setup's weighted confluence must reach a dynamic threshold (~68% base, adjusted by volume regime and qualification profile). BALANCED mode gets a -4 offset, making the threshold more permissive during data collection.
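
A minimal sketch of the retirement check, assuming expectancy is the mean R-multiple over the window and that the 6-trade minimum also applies to the "Degrading" flag (the text only states it for benching). The function name and the ACTIVE label are hypothetical.

```python
def retirement_status(closed_trade_r, window=12, min_trades=6):
    """Classify a setup from the R-multiples of its closed trades, oldest first."""
    recent = closed_trade_r[-window:]
    if len(recent) < min_trades:
        return "ACTIVE"  # not enough evidence to judge yet

    expectancy = sum(recent) / len(recent)
    if expectancy < -0.12:
        return "BENCHED"    # keeps computing, can never fire
    if expectancy < -0.05:
        return "DEGRADING"  # flagged with the -5 priority penalty
    return "ACTIVE"
```

Only the last 12 closed trades count, so early losses drop out of the window as new trades resolve.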

Critical Fails & Entry Confirmation

Gate 6 (Critical Fails) catches single-point-of-failure conditions. If any individual evaluator with weight ≥18 fails (scores zero), that's a critical fail. One critical fail is logged as a warning. Two or more critical fails trigger a hard block — the thesis has multiple load-bearing pillars that collapsed. Gate 7 (Entry Confirmation) requires setups below the bypass threshold (~75-80%) to show confirming price action, order flow, or liquidity signals before promoting. Above the bypass threshold, the confluence itself is confirmation enough.
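
The critical-fail count can be sketched as below. The tuple layout and function name are hypothetical; the weight ≥18 cutoff, the zero-score definition of a fail, and the one-warns / two-blocks rule come from the description above.

```python
def critical_fail_gate(evaluations, critical_weight=18):
    """Count heavyweight evaluators that scored zero.

    evaluations: iterable of (name, weight, score) tuples.
    Returns (verdict, names_of_failed_evaluators).
    """
    fails = [
        name
        for name, weight, score in evaluations
        if weight >= critical_weight and score == 0
    ]
    if len(fails) >= 2:
        return "HARD_BLOCK", fails
    if len(fails) == 1:
        return "WARNING", fails
    return "PASS", fails
```

A zero score on a light evaluator (for example a weight-8 filter) never counts toward the block, however many of them fail.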

Risk & R:R Tiering

Gate 8 (Risk & R:R) validates the stop/target geometry. R:R is calculated from the planned entry, stop, and target prices. The floor is phase-dependent: 1.5:1 in HOT Run, 2.0:1 in Automated Run — below that, the setup is BLOCKED regardless of confluence. R:R bands (1.5–2.0, 2.0–2.5, 2.5–3.0, 3.0–4.0, 4.0+) are tracked for statistical analysis but don't gate signal states. Calibration tuning clamps dynamic RR to the range 2.5–6.0 (config.js minRRGate/maxRRGate). GEX modifier (2026-05-21): dynamic R:R adjusts ±0.40R max based on options gamma exposure — LONG near a hard call wall (+0.25), SHORT near hard put wall (+0.25), COILED regime favors breakouts (-0.15), PINNED regime favors mean-reversion (-0.10), CASCADE regime penalizes fades (+0.40). Research-constrained: GEX modifiers only fire in CALM/NORMAL VIX (FlashAlpha 8yr: GEX adds zero in elevated/stressed VIX).
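
The floor check itself is simple geometry. The sketch below covers only the phase-dependent floor; the dynamic R:R clamp and the GEX modifier are left out. Function names are hypothetical, and rejecting plans whose stop and target sit on the same side of entry is an added sanity check, not documented behaviour.

```python
RR_FLOOR = {"HOT": 1.5, "AUTOMATED": 2.0}


def risk_reward(entry, stop, target):
    """Reward-to-risk ratio from planned entry, stop, and target prices."""
    risk = entry - stop
    reward = target - entry
    # Stop and target must sit on opposite sides of entry (long or short).
    if risk == 0 or reward * risk <= 0:
        raise ValueError("stop and target must bracket the entry")
    return abs(reward) / abs(risk)


def rr_gate(entry, stop, target, phase):
    """Gate 8: BLOCKED below the phase floor, regardless of confluence."""
    return "PASS" if risk_reward(entry, stop, target) >= RR_FLOOR[phase] else "BLOCKED"
```

A long at 100 with a stop at 96 and a target at 107 has R:R 1.75, which passes in HOT and is BLOCKED in AUTOMATED.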

Retirement

Setup's rolling expectancy below -0.12R → benched. Keeps computing, can't fire.

Soft

Confluence Threshold

Score must reach dynamic threshold (base ~68%, STRICT offset 0, BALANCED offset -4). Gap logged if below.

Soft

Critical Fails

Any individual evaluation with weight ≥18 that fails. One = logged. Two or more = blocked.

Hard*

Entry Confirmation

If score below bypass threshold (~75–80%), requires price action / order flow / liquidity confirmation.

Soft

Risk & R:R Tier

Validates stop/target feasibility. Below rrFloor (1.5 HOT / 2.0 AUTOMATED) = BLOCKED. R:R bands tracked for statistics.

Hard

Calibration Rationale

Retirement prevents the engine from throwing good money after bad.
The critical-fail gate (weight ≥18) catches single-point-of-failure conditions — if one heavyweight check fails, the whole thesis is suspect regardless of total score.
R:R tiering means even a high-confluence signal doesn't fire if the risk/reward math doesn't work — this is where the engine enforces the asymmetry that makes edge-based trading viable.

Pipeline Gate

Gate 9 / 9b / 10 — Advisory (Phase A Off)

3 gates · advisory only

ADV

Computed but not enforced. Will block signals when switched on after calibration proves them.

Check Evaluate Decide Record

Core Decision

Advisory gates log what they would block but do not enforce — evidence before enforcement.

G9: R:R > NQ 2.5:1 / ES 4.0:1 → would hard-block (advisory)

G9b: direction violation → would hard-block · style violation → would soft-block (advisory)

G10: permanently disabled; 840-trade OOS showed it destroyed value

Gate Passes: All advisory gates currently pass by default; they compute and log but never block during Phase A.

Gate Blocks: No enforcement in Phase A. When activated post-calibration, G9 and G9b would hard/soft block as documented.

R:R Inflation Cap

Gate 9 (R:R Inflation Cap) addresses a structural bias in the engine's trade planner: while historical resolved trades show a median R:R of ~1.5:1, the planner consistently produces 4.5:1 median projections. The cap would hard-block any signal where the projected R:R exceeds 2.5:1 for NQ or 4.0:1 for ES. Currently advisory-only: the engine logs what it would have blocked, but doesn't enforce, pending calibration data proving that the cap improves outcomes rather than just filtering high-conviction setups.
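
The advisory pattern (compute and log the verdict, enforce nothing) can be sketched as follows. The caps are the documented ones; the function name, the enforce flag, and the shape of the returned record are assumptions.

```python
RR_CAP = {"NQ": 2.5, "ES": 4.0}


def rr_inflation_cap(symbol, projected_rr, enforce=False):
    """Gate 9: record what the cap would block; block only when enforced."""
    would_block = projected_rr > RR_CAP[symbol]
    return {
        "gate": "G9",
        "symbol": symbol,
        "projected_rr": projected_rr,
        "would_block": would_block,
        # Phase A runs with enforce=False, so this stays False.
        "blocked": would_block and enforce,
    }
```

Logging would_block separately from blocked is what lets the advisory verdicts be compared against outcomes before the gate is switched on.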

Regime Fit Gate & Session Router

Gate 9b (Regime Fit Gate) layers a higher-timeframe regime check on top of Gate 3's style compatibility. It reads the daily regime bias and applies two sub-checks: a direction violation (shorting in a trend_up day) would be a hard block; a style violation (breakout setup in a range day) would be a soft block. Like Gate 9, it's advisory-only pending data proof. Gate 10 (Session Router) is permanently disabled. Out-of-sample validation on 840 historical trades showed it blocked profitable families more often than unprofitable ones — it destroyed net value. Kept in the codebase as an architectural placeholder and a reminder that "intuitive" filters can fail empirical validation.

R:R Inflation Cap

NQ capped at 2.5:1, ES at 4.0:1. Historical median RR was 1.5:1 but planner produces 4.5:1 median. Would hard-block inflated projections.

Off

Regime Fit Gate

Day regime bias + style constraints. Short setup in trend_up → direction violation. Breakout in range → style violation. Direction = hard, style = soft (when enforced).

Off

Session Router

Permanently disabled. OOS validation on 840 trades showed it blocked profitable families. Kept as architectural placeholder.

Off

Calibration Rationale

Phase A philosophy: observe before enforcing. These gates log advisory verdicts so we can see what they would have blocked.
When the data proves a gate adds edge, it gets switched on.
Gate 10 was explicitly rejected — 840-trade OOS validation showed it destroyed value. Ideas that sound right but lose money get killed, not debated.

Pipeline Gate

Final Decision States

6 states · SIGNAL to fire

DEC

What the pipeline produces after all gates run.

Check Evaluate Decide Record

Core Decision

Pipeline resolves each evaluation into one of six discrete states — only SIGNAL can fire.

SIGNAL: all gates pass + R:R confirmed → can fire

ARMED: strong confluence, waiting on one condition → auto-promotes if resolved

BLOCKED: hard gate failed → structural rejection

Gate Passes: SIGNAL state reached; the setup fires. All six states plus the determining gate are written to the substrate.

Gate Blocks: ARMED, WATCH, BLOCKED, CONFLICT, or PAUSED; each conveys why the signal didn't fire for trader review.

Signal & Armed States

After every gate has executed, the pipeline resolves the signal into one of six discrete states. SIGNAL means every gate passed and R:R is confirmed — this is the only state that can fire. ARMED means confluence is strong and most gates passed, but the signal is waiting for one remaining condition (typically R:R confirmation or entry timing). Armed signals are near-miss candidates: if the pending condition resolves, the signal promotes automatically on the next evaluation cycle.

Non-Firing States

WATCH is the passive observation state — the setup is being evaluated but isn't close enough to actionability. BLOCKED means at least one hard gate failed — structural rejection with no workaround. CONFLICT is a distinct state from BLOCKED: it means the directional evidence is genuinely split, and the engine refuses to pick a side. PAUSED is operator-initiated — the trader manually paused this symbol via the dashboard's per-symbol pause button. All six states, along with the specific gate that determined them, are written to the substrate for every evaluation cycle. The trader sees all of this on the dashboard and can act on contextual nuance the gate system can't encode.

Decision State Matrix

State	Can Fire	Meaning
SIGNAL	YES	All gates pass, R:R confirmed
ARMED	NO	Strong confluence, waiting on R:R or confirmation; auto-promotes once resolved
WATCH	NO	Watching, not actionable
BLOCKED	NO	Hard gate failed
CONFLICT	NO	Directional conflict
PAUSED	NO	Trader paused this symbol
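
The six states and the "only SIGNAL can fire" rule can be sketched as an enum plus a record that carries the determining gate. The class and field names are hypothetical; how the engine resolves precedence between states is not described here and is not modeled.

```python
from dataclasses import dataclass
from enum import Enum


class DecisionState(Enum):
    SIGNAL = "SIGNAL"
    ARMED = "ARMED"
    WATCH = "WATCH"
    BLOCKED = "BLOCKED"
    CONFLICT = "CONFLICT"
    PAUSED = "PAUSED"


@dataclass(frozen=True)
class Evaluation:
    """One evaluation cycle's outcome, as it might be written to the substrate."""
    setup: str
    state: DecisionState
    determining_gate: str


def can_fire(evaluation):
    """Only SIGNAL fires; every other state is informational."""
    return evaluation.state is DecisionState.SIGNAL
```

Keeping determining_gate on the record is what preserves the "why" behind each non-firing state for post-session review.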

Calibration Rationale

Six states, not two. Binary pass/fail loses information.
ARMED means "this was close — if one condition flips, it fires." CONFLICT means "the evidence is split."
The granularity exists because the human trader needs to know WHY something didn't fire, not just that it didn't — the context informs manual decisions and post-session review.

Scoring Framework

Global Confluence Categories

6 categories · 100 pts

CFG

Every setup scores from the same 6-category budget. Each category has a fixed weight — conditions within it divide those points.

Category Weight Score Threshold

Core Decision

Every setup scores against a fixed 100-point budget divided into six categories — same budget, different distributions per thesis.

Budget: Location 25 + Regime 20 + Trigger 20 + Flow 15 + R:R 10 + Risk 10 = 100

Pass: Setup scores above confluence threshold; qualifies for signal generation and trader presentation.

Fail: Setup scores below threshold; filtered out, logged to substrate for analysis only.

Budget Architecture

Every setup in the engine — regardless of family — scores against the same 100-point budget divided into six categories. The budget is fixed but the distribution is not: each setup type allocates different weights to different conditions within each category, reflecting what matters for that specific thesis. An IB breakout weights trigger conditions heavily; an exhaustion reversal weights flow divergence conditions.

Category Rationale

Location receives the largest allocation (25 points) because where price sits relative to key structural levels is the single strongest predictor of whether a setup succeeds. A perfect trigger at a meaningless level is noise; a mediocre trigger at a critical level is a trade. Regime (20) and Trigger (20) share the next tier — regime ensures the market environment supports the thesis, while the trigger is the specific event that creates the opportunity. Flow (15) confirms the thesis with real-time order flow evidence. R:R (10) scores the mathematical quality of the stop/target geometry. Risk Filters (10) are binary safety checks — session timing, stop-hunt clearance, book stability — that don't generate edge but prevent easily avoidable losses.

Regime 20 · Location 25 · Trigger 20 · Flow 15 · R:R 10 · Risk 10

Category Breakdown

Category	Weight	Conditions
Regime	20 pts	htf_aligned, ib_tight, cross_aligned, all_four, session_match
Location	25 pts	vwap_side, vwap_extreme, key_level_near, gap_confirms, naked_vpoc, three_at_level
Trigger	20 pts	ib_one_sided, mss_confirmed, sweep_reclaim, mkt_order_tempo, first_hour_momentum
Flow	15 pts	delta_aligned, delta_divergence, dom_aligned, book_flow_sync, cvd_aligned, absorption
R:R	10 pts	(computed from stop/target geometry)
Risk Filters	10 pts	stop_hunt_clear, book_stability, prime_window, not_lunch, session_match
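
A minimal sketch of the fixed budget and a per-setup score, assuming a setup's score is the sum of the weights of its satisfied conditions. Function names are hypothetical, and callers supply their own per-setup weights; the only documented constants here are the six category weights.

```python
# Category budget from the Category Breakdown table.
CATEGORY_BUDGET = {
    "Location": 25,
    "Regime": 20,
    "Trigger": 20,
    "Flow": 15,
    "R:R": 10,
    "Risk Filters": 10,
}
assert sum(CATEGORY_BUDGET.values()) == 100


def validate_budget(condition_weights):
    """Every setup's condition weights must sum to exactly 100 points."""
    total = sum(condition_weights.values())
    if total != 100:
        raise ValueError(f"setup weights sum to {total}, expected 100")


def confluence_score(condition_weights, met_conditions):
    """Sum the points of the conditions that are currently satisfied."""
    validate_budget(condition_weights)
    return sum(
        points
        for condition, points in condition_weights.items()
        if condition in met_conditions
    )
```

Validating the total on every score keeps setups comparable: a 70 on one family means the same share of budget as a 70 on another.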

Calibration Rationale

Location gets the most weight (25) because where price is relative to key levels matters more than any single trigger — this is the "where" that defines the trade.
Flow gets less (15) because it confirms but doesn't initiate.
Risk filters (10) are binary safety checks, not edge generators.
Every setup sums to exactly 100 points — different distributions, same budget — ensuring apples-to-apples comparison across families.

Qualification Mode

Qualification Profiles

STRICT vs BALANCED

Two modes that control how many setups can fire and how selective the engine is.

Category Weight Score Threshold

Core Decision

STRICT maximizes signal quality; BALANCED maximizes data collection — Phase A runs BALANCED deliberately.

STRICT: 1 setup/sym · R:R ≥ 2.2 · offset 0 · hard blocks

BALANCED: 2 setups/sym · R:R ≥ 1.8 · offset -4 · soft penalties

Pass: Setup meets profile requirements; proceeds through pipeline to signal generation.

Fail: Setup rejected by profile constraints; hard block (STRICT) or downgraded score (BALANCED).

Profile Mechanics

The engine operates in one of two qualification profiles that control selectivity across the entire pipeline. STRICT is sniper mode: one setup per symbol, minimum 2.2:1 R:R, no score offset, and regime/playbook mismatches are hard blocks. BALANCED is scouting mode: two setups per symbol can compete, 1.8:1 R:R floor, -4 score offset (lowering the effective confluence threshold), and regime/playbook mismatches impose soft penalties rather than hard blocks.

Phase A Strategy

Phase A (current) runs BALANCED to maximize data collection. The engine intentionally fires more signals to build a statistical sample for each setup type. Once enough data accumulates to reliably distinguish setup quality, the profile will shift to STRICT — fewer signals, higher average quality, lower noise. The -4 score offset in BALANCED mode means a setup that needs 68 in STRICT only needs 64 in BALANCED. This isn't a quality compromise — it's deliberate observational permissiveness.

Profile Parameters

Parameter	STRICT	BALANCED
Setups / symbol	1	2
Min R:R	2.2	1.8
Score offset	0	-4
Regime gate	Hard block	Soft (downgrade)
Playbook gate	Hard block	Soft (downgrade)
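
The profile mechanics can be sketched as a small gate function. This is a minimal sketch, not the engine's code: the profile values come from the table above, while `qualify`, the field names on `setup`, and `ASSUMED_SOFT_PENALTY` (the size of the BALANCED downgrade, which the atlas does not state) are hypothetical.

```javascript
// Profile values from the Profile Parameters table.
const PROFILES = {
  STRICT: { maxSetupsPerSymbol: 1, minRR: 2.2, scoreOffset: 0, softGates: false },
  BALANCED: { maxSetupsPerSymbol: 2, minRR: 1.8, scoreOffset: -4, softGates: true },
};

// ASSUMPTION: the atlas does not give the size of the soft downgrade.
const ASSUMED_SOFT_PENALTY = 5;

// setup: { score, rr, regimeOk, playbookOk } (hypothetical field names).
function qualify(setup, profileName, baseThreshold) {
  const profile = PROFILES[profileName];
  // The offset lowers the effective threshold (68 becomes 64 in BALANCED).
  const threshold = baseThreshold + profile.scoreOffset;

  if (setup.rr < profile.minRR) {
    return { pass: false, reason: 'rr_below_min', threshold, score: setup.score };
  }

  let score = setup.score;
  for (const gateOk of [setup.regimeOk, setup.playbookOk]) {
    if (gateOk) continue;
    if (!profile.softGates) {
      // STRICT: a regime or playbook mismatch is a hard block.
      return { pass: false, reason: 'hard_block', threshold, score };
    }
    // BALANCED: the mismatch only downgrades the score.
    score -= ASSUMED_SOFT_PENALTY;
  }

  const pass = score >= threshold;
  return { pass, reason: pass ? 'ok' : 'score_below_threshold', threshold, score };
}
```

With a base threshold of 68, a setup scoring 66 at 2.0 R:R passes in BALANCED (threshold 64, R:R floor 1.8) and is rejected in STRICT on the 2.2 R:R floor before the score is even compared.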

Calibration Rationale

STRICT optimizes for signal quality at the cost of sample size.
BALANCED optimizes for learning at the cost of signal purity.
Phase A needs volume — you can't evaluate what you don't fire.
The trade-off is explicit and temporary.

Setup Family

Structural Breakout Family

3 setups · continuation

IB breakout, IB extension, breakout retest — continuation through structure.

Thesis Trigger Confluence Qualification Signal

Core Decision

The initial balance is the structural reference — breakout, extension, and retest each weight different evidence for the same directional thesis.

ib-brk ib_one_sided 22 + delta_aligned 18

ib-ext ib_one_sided 20 + delta_aligned 20 + mkt_order_tempo 14

brk-ret sweep_reclaim 20 + key_level_near 18 + absorption 14

High Confluence One-sided IB with confirmed directional flow and structural alignment — institutional continuation trade.

Low Confluence Balanced IB or weak flow — breakout is noise, not institutional intent.

Breakout & Extension

The Structural Breakout family trades the initial balance (IB — the range formed in the first hour of RTH) as a structural reference. IB Breakout (ib-brk) fires when price decisively exits the IB range. Its heaviest weight is ib_one_sided (22) — was the IB dominated by one side? A tight, one-sided IB that breaks out is institutional intent; a wide, balanced IB that breaks randomly is noise. IB Extension (ib-ext) fires after the breakout holds and extends. It shifts weight from the trigger to flow confirmation: delta_aligned (20) and mkt_order_tempo (14) together require proven, sustained flow in the breakout direction.

Breakout Retest

Breakout Retest (brk-ret) is the family's highest-conviction variant. It fires when a breakout level is retested and holds. The weight profile flips entirely: sweep_reclaim (20) and key_level_near (18) replace ib_one_sided — the thesis is no longer "the breakout happened" but "the breakout survived its first challenge." Absorption (14) confirms that aggressive selling at the retest level was absorbed by passive buyers. The not_lunch filter (8) exists because lunch-hour retests frequently fail due to thin liquidity, not structural weakness.

Condition Weights

Condition	ib-brk	ib-ext	brk-ret
ib_one_sided	22	20	—
delta_aligned	18	20	16
vwap_side	12	10	—
cross_aligned	12	10	12
mkt_order_tempo	8	14	—
sweep_reclaim	—	—	20
key_level_near	—	—	18
absorption	—	—	14
stop_hunt_clear	8	8	12
gap_confirms	—	12	—
ib_tight	10	—	—
prime_window	6	—	—
session_match	4	6	—
not_lunch	—	—	8
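
The 100-point budget can be checked mechanically. The sketch below uses the ib-brk column from the table above and assumes the confluence score is a plain sum of the weights of the conditions that hold; `budget` and `confluenceScore` are hypothetical helper names, not engine functions.

```javascript
// ib-brk column of the Condition Weights table.
const IB_BRK_WEIGHTS = {
  ib_one_sided: 22,
  delta_aligned: 18,
  vwap_side: 12,
  cross_aligned: 12,
  mkt_order_tempo: 8,
  stop_hunt_clear: 8,
  ib_tight: 10,
  prime_window: 6,
  session_match: 4,
};

// Total points a setup can earn; every setup is expected to budget 100.
function budget(weights) {
  return Object.values(weights).reduce((sum, w) => sum + w, 0);
}

// ASSUMPTION: the score is the sum of weights for conditions that are true.
function confluenceScore(weights, conditions) {
  let score = 0;
  for (const [name, weight] of Object.entries(weights)) {
    if (conditions[name] === true) score += weight;
  }
  return score;
}

// One-sided IB with aligned delta, VWAP side and cross-market agreement:
// 22 + 18 + 12 + 12 = 64 of the 100-point budget.
const example = confluenceScore(IB_BRK_WEIGHTS, {
  ib_one_sided: true,
  delta_aligned: true,
  vwap_side: true,
  cross_aligned: true,
});
```

Running `budget` over each column is a cheap guard against a weight edit silently breaking the apples-to-apples comparison across families.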

Calibration Rationale

IB breakout weights one-sided action (22) because that's the primary signal — the IB was dominated by one side.
Extension shifts to delta (20) + tempo (14) because it's a follow-through trade; you need proven flow continuation.
Breakout retest flips to sweep_reclaim (20) because the thesis is "breakout held, retested, and reclaimed" — the trigger IS the level reclaim.

Setup Family

Mean Reversion Family

4 setups · reversion

IB fade, VWAP bounce, VWAP deviation snap, value area fade — counter-move plays.

Thesis Trigger Confluence Qualification Signal

Core Decision

Price moved too far from fair value — divergence and absorption confirm the move is running out of fuel.

vwap-dev vwap_extreme 24 + delta_divergence 20

vwap-bnc vwap_bounce_zone 24 + delta_divergence 18

ib-fade delta_divergence 20 + ib_one_sided 18

va-fade naked_vpoc 18 + delta_divergence 18

High Confluence Flow diverges from price at an extreme level with absorption — high-probability snap back to fair value.

Low Confluence No divergence or absorption — catching a falling knife, not a mean reversion.

Family Thesis

Mean reversion setups bet that price has moved too far from fair value and will snap back. The family shares two dominant signals: delta divergence (flow disagreeing with price direction) and absorption (price stopping despite aggressive hitting). These two conditions together answer the question: "Is the move running out of fuel?"

Setup Variants

VWAP Deviation Snap (vwap-dev) has the family's most concentrated weight: vwap_extreme at 24 points. The entire thesis is "price extended ≥2σ from VWAP — mean reversion probability is high." Without extreme VWAP deviation, the setup doesn't exist. VWAP Bounce (vwap-bnc) similarly loads 24 on vwap_bounce_zone — the trade IS the bounce at VWAP. IB Fade (ib-fade) combines divergence (20) with one-sided IB (18) — the IB pushed hard one way, but flow says the push is exhausted. Value Area Fade (va-fade) anchors on naked VPOC (18) and key levels (14) — fading from a value area boundary that hasn't been visited yet is a high-quality mean-reversion thesis because the unvisited VPOC acts as a magnet.

Condition Weights

Condition	ib-fade	vwap-bnc	vwap-dev	va-fade
delta_divergence	20	18	20	18
absorption	16	14	16	16
vwap_side/extreme	14	—	24	—
vwap_bounce_zone	—	24	—	—
ib_one_sided	18	—	—	—
naked_vpoc	—	—	—	18
key_level_near	—	12	12	14
cross_aligned	12	12	10	12
stop_hunt_clear	12	10	10	12
not_lunch	8	10	8	10

Calibration Rationale

Reversion trades live or die by divergence and absorption — without them, you're catching a falling knife.
VWAP deviation snap loads 24 on vwap_extreme because the entire thesis is "price extended too far from fair value."
VWAP bounce puts 24 on bounce_zone — the trade is the zone itself.
These concentrated weights ensure the setup can't fire if its core thesis is absent.

Setup Family

Trap & Reversal Family

3 setups · reversal

IB rejection, trap & reverse, exhaustion reversal — failed moves become setups.

Thesis Trigger Confluence Qualification Signal

Core Decision

Failed moves trap participants on the wrong side — their forced exits fuel the reversal.

exhaust delta_divergence 26 + key_level_near 20 + absorption 18

trap-rev sweep_reclaim 20 + delta_divergence 18 + absorption 16

ib-rej delta_divergence 20 + ib_one_sided 18 + sweep_reclaim 18

High Confluence Flow exhaustion confirmed at a key level with absorption — trapped participants will fuel the reversal.

Low Confluence No divergence or absorption proof — the move may be a legitimate trend, not a trap.

Exhaustion Reversal

This family trades failed moves: situations where a directional push exhausted itself, trapping participants on the wrong side. The thesis is contrarian, in that the losers' forced exits fuel the reversal. Exhaustion Reversal (exhaust) carries the heaviest single-condition weight among the core families: delta_divergence at 26 (only the specialist Phase G+H triggers weigh more). The signal is unambiguous: aggressive flow pushed price to a level (key_level_near at 20), but the flow is dying (divergence) while passive absorption (18) holds the level. Three independent lines of evidence converge on one conclusion: the move is over.

Trap & IB Rejection

Trap & Reverse (trap-rev) leads with sweep_reclaim (20): the failed breakout IS the trade. Price pushed past a level, swept stops, and reclaimed. The trapped participants are now underwater, and their forced exits become your fuel. IB Rejection (ib-rej) combines one-sided IB (18) with divergence (20): the rejection needs both the setup (the IB pushed hard one way) and the proof (the flow reversal). The session_match condition (6) gives a small bonus because IB rejections are most reliable when they happen during the approved session for the setup family.

Condition Weights

Condition	ib-rej	trap-rev	exhaust
delta_divergence	20	18	26
sweep_reclaim	18	20	—
absorption	14	16	18
ib_one_sided	18	—	—
key_level_near	—	14	20
cross_aligned	14	12	12
stop_hunt_clear	10	12	14
session_match	6	—	—
not_lunch	—	8	10

Calibration Rationale

Exhaustion reversal loads 26 on divergence because the entire thesis is "flow dried up at a level." It's the strongest single-condition conviction among the core families; only the specialist Phase G+H triggers (up to 30) weigh more.
Trap & reverse leads with sweep_reclaim (20) because the failed breakout IS the trade — you need proof that the move was a trap.
IB rejection combines the push (18) with the proof (20) — you need both.

Setup Family

Order Flow & Level Family

3 setups · flow-driven

Flow surge, key level magnet, volume migration follow — flow-driven entries.

Thesis Trigger Confluence Qualification Signal

Core Decision

The order book tells a clear directional story — institutional flow events or structural level magnetism drive entry.

flow-surge delta_aligned 24 + mkt_order_tempo 20 + dom_aligned 14

key-mag naked_vpoc 22 + key_level_near 18 + absorption 16

vol-mig naked_vpoc 20 + delta_aligned 18 + mkt_order_tempo 16

High Confluence Overwhelming institutional flow or strong level magnetism with book confirmation — ride the directional wave.

Low Confluence Weak flow or no structural magnet — chasing momentum without confirmation is suicidal.

Flow Surge

This family trades institutional flow events — situations where the order book tells a clear directional story. Flow Surge (flow-surge) is the engine's most aggressive setup, requiring overwhelming unidirectional flow: delta_aligned at 24 + mkt_order_tempo at 20. Together, these two conditions require both the magnitude (massive delta in one direction) AND the participation (heavy institutional trade count). DOM alignment (14) adds a third confirmation: the depth-of-market book structure should support the direction. This is a momentum trade — ride the institutional wave.

Level Magnet & Volume Migration

Key Level Magnet (key-mag) trades the pull toward an unvisited structural level. Its heaviest weight is naked_vpoc (22): the unvisited Volume Point of Control is the magnet itself, a price level where significant volume transacted previously but which the current session hasn't reached yet. The key_level_near (18) and absorption (16) conditions add structural and flow confirmation. Volume Migration Follow (vol-mig) tracks when the VPOC physically migrates toward a new level: naked_vpoc carries the shift (20), and the engine follows with delta confirmation (18), order tempo (16), and cross-market alignment (14). This is a trend-following variant: the auction itself is voting on the new fair value.

Condition Weights

Condition	flow-surge	key-mag	vol-mig
delta_aligned	24	14	18
mkt_order_tempo	20	—	16
naked_vpoc	—	22	20
key_level_near	10	18	12
dom_aligned	14	—	—
absorption	—	16	—
vwap_side	12	—	—
cross_aligned	10	12	14
stop_hunt_clear	10	10	12
not_lunch	—	8	8

Calibration Rationale

Flow surge needs 44 combined points from delta + tempo because it's the riskiest setup type — chasing momentum without extreme flow confirmation is suicidal.
Key level magnet leads with naked_vpoc (22) because the unvisited VPOC is the magnet — without it, there's nothing to be attracted to.
Volume migration follows the market's own vote on new fair value.

Setup Family

Dynamic / Phase G+H Family

10 setups · specialist

Phase G: London sweep, gap fade, overnight continuation, SMT divergence. Phase H: delta divergence reversal, opening shock reversal, whale cluster pullback, tape climax exhaustion, settlement magnet, pre-event compression.

Thesis Trigger Confluence Qualification Signal

Core Decision

Specialist plays defined by unique structural triggers — each setup IS its trigger, carrying the highest single-condition weights (22-30).

gap-fade gap_inside_range 30 + first_5min 25

ldn-sweep london_sweep 28 + ny_session 18

ovn-cont overnight 28 + first_30min 22 + at_ovn_vwap 18

smt-lag smt 25 + delta_aligned 18

delta-div delta_divergence 28 + key_level_near 18 + absorption 16

open-shock open_shock_failed 25 + open_shock_extreme 22 + delta_aligned 16

whale-pull absorption 28 + dom_aligned 18 + delta_aligned 16

tape-climax delta_divergence 24 + mkt_order_tempo 20 + absorption 18

settle-mag near_settlement 30 + afternoon_window 22 + delta_aligned 16

fomc-comp pre_event_day 22 + ib_tight 20 + delta_aligned 18

High Confluence Unique structural trigger fires with session and flow confirmation — specialist edge with concentrated thesis.

Low Confluence Core trigger absent — the setup literally doesn't exist without its defining condition.

Specialist Thesis

Phase G+H setups are specialist plays defined by unique structural triggers that don't exist in the core families. They carry the highest single-condition weights in the system (22-30 points) because each setup is its trigger — without the specific condition, the setup literally doesn't exist.

Phase G Variants

Gap Halfback Fade (gap-fade) loads 30 on its trigger (gap_inside_range) and 25 on first_5min — together these two conditions comprise 55% of the setup's total budget. The thesis: when the market opens with a gap that falls inside yesterday's range, and the first 5 minutes show reversal, the gap will fill at least 50% (halfback). London Sweep (ldn-sweep) at 28 triggers on the London session's characteristic stop-hunting pattern: price pushes above/below the Asian range to sweep stops, then reverses into the NY open. Overnight Continuation (ovn-cont) at 28 uses overnight VWAP (at_ovn_vwap, 18) as its anchor — the thesis is that the overnight direction established by Asia/London will continue into NY. SMT Divergence (smt-lag) at 25 fires when NQ and ES diverge structurally — one makes a new high/low while the other doesn't confirm, suggesting the leader is trapping participants.

Phase H Variants

Delta Divergence Reversal (delta-div) at 28 fires when cumulative delta diverges from price at a key level: flow says "no" while price says "yes." It is the highest-weight single flow condition in the system. Opening Shock Reversal (open-shock) combines extreme opening-range expansion (22) with failed continuation (25): if the first minutes spike violently but can't sustain, the reversion trade has structure. It runs in the US session only. Whale Cluster Pullback (whale-pull) leads with absorption (28); the thesis is that visible institutional defense at a level creates a pullback anchor. Tape Climax Exhaustion (tape-climax) loads delta_divergence (24) + mkt_order_tempo (20): the market is hitting hard but getting nowhere, so tempo spikes are terminal, not sustaining. Settlement Magnet (settle-mag) loads near_settlement (30) as a pure distance-based magnet play in the afternoon session, when price gravitates toward settlement in the last hours. It has no TP1/TP2 and is managed by distance to target. Pre-Event Compression (fomc-comp) at 22 requires pre_event_day from the economic calendar (FOMC/CPI/NFP/GDP/PCE) plus a tight IB (20); the thesis is that pre-event compression resolves directionally once the event arrives. The calendar is powered by data-economic-calendar.js with 63 confirmed events through mid-2027.

Condition Weights — Phase G

Condition	ldn-sweep	gap-fade	ovn-cont	smt-lag
london_sweep / gap / overnight / smt	28	30	28	25
delta_aligned	14	14	12	18
ny_session / first_5min / first_30min	18	25	22	—
absorption	8	10	—	12
prime_window / at_ovn_vwap	10	—	18	—
cross_aligned	8	6	8	—
mkt_order_tempo	7	—	—	10
stop_hunt_clear	7	7	6	8
vwap_side	—	8	—	12
session_match	—	—	6	7
not_lunch	—	—	—	8

Condition Weights — Phase H

Condition	delta-div	open-shock	whale-pull	tape-climax	settle-mag	fomc-comp
Primary trigger	28	25	28	24	30	22
Secondary trigger	18	22	18	20	22	20
delta_aligned / divergence	—	16	16	—	16	18
absorption	16	12	—	18	—	—
key_level_near	18	—	12	14	12	—
cross_aligned	12	10	8	8	10	14
stop_hunt_clear	10	8	8	10	10	10

Calibration Rationale

These setups have the highest single-condition weights in the system (22-30) because each one is defined by its unique trigger.
Phase G: 4 specialist plays — gap, London sweep, overnight carry, SMT divergence. Each one IS its trigger.
Phase H: 6 microstructure plays — delta divergence, opening shock, whale pullback, tape climax, settlement magnet, pre-event compression. Researched and queued during BUILD PHASE.
Settlement magnet and pre-event compression use new data sources: near_settlement (distance-to-settlement evaluator) and pre_event_day (powered by data-economic-calendar.js with 63 confirmed FOMC/CPI/NFP/GDP/PCE events through mid-2027).

Engine Source engine-production-context.js · PB array + CC object · data-economic-calendar.js · engine-setup-classification.js · engine-setup-diagnostics.js

Academic Basis Gap statistics: Doran et al. (2007) · London sweep: empirical London-to-NY reversal frequency · SMT divergence: inter-market non-confirmation · Delta divergence: order flow analysis frameworks (Easley/Lopez de Prado) · Settlement magnet: futures settlement gravity (Stoll & Whaley 1991) · Pre-event compression: pre-FOMC drift literature (Lucca & Moench 2015)

Setup Quarantine

Quarantine & Session Map

0 benched · 3 sessions

Which setups are benched, and which sessions allow which families.

Category Weight Score Threshold

Core Decision

Quarantine is data-driven — each benched setup demonstrated structural failure, not a bad streak. Session playbooks map approved families to session liquidity profiles.

Quarantine objective performance collapse → manual unblock only

Retirement auto-bench at -0.12R expectancy → auto-rehabilitate

Pass Setup not quarantined and approved for current session — proceeds to scoring pipeline.

Fail Setup quarantined or session-blocked — still computes and writes to substrate but never fires a signal.

Quarantine List

The quarantine list is the engine's holding pen for setups that failed in live data. BUILD PHASE (2026-05-23): all 8 previously quarantined setups have been reinstated. The engine was declared structurally broken on 2026-05-22 — all prior data is invalidated. Fresh data collection will re-evaluate every setup (now 50 total, including 12 new Phase H entries) on equal footing. Historical quarantine reasons preserved as comments in config.js for reference. The distinction between quarantine and retirement remains: retired setups auto-bench at -0.12R expectancy and can auto-rehabilitate; quarantined setups require manual operator intervention.

Session Playbook

The session playbook maps each trading session to its approved setup families. Asia (pre-London, low participation) permits only level-based plays: key-mag, exhaust-rev, vpoc-mig. Breakout setups are banned — thin Asia liquidity produces false breakouts. London permits trap-based plays: ib-reject, trap-rev — because the London open characteristically sweeps Asia stops. NY gets the full momentum arsenal: ib-brk, ib-ext, flow-surge — because NY has the institutional flow to sustain breakouts.

Quarantined Setups

Quarantined	Reason
None — all quarantines lifted for BUILD PHASE (2026-05-23). Prior data invalidated by engine restructure. Fresh data collection will determine new quarantine candidates.

Session Approved Families

Session	Approved Families
Asia	key-mag, exhaust-rev, vpoc-mig
London	ib-reject, trap-rev
NY	ib-brk, ib-ext, flow-surge
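
The gating described in this card can be sketched as a lookup plus two bench checks. The playbook entries and the -0.12R retirement level come from the text above; `gateSetup`, its option names, and treating "at -0.12R" as "at or below -0.12R" are assumptions made for illustration.

```javascript
// Session playbook, copied from the Session Approved Families table.
const SESSION_PLAYBOOK = {
  Asia: ['key-mag', 'exhaust-rev', 'vpoc-mig'],
  London: ['ib-reject', 'trap-rev'],
  NY: ['ib-brk', 'ib-ext', 'flow-surge'],
};

// Retired setups auto-bench at this expectancy (in R).
const RETIREMENT_EXPECTANCY_R = -0.12;

// Decides whether a setup may fire a signal. A blocked setup would still be
// computed and written to the substrate; it just never fires.
function gateSetup(setupId, session, options = {}) {
  const quarantined = options.quarantined || new Set();
  const expectancyR = options.expectancyR;

  // Quarantine is manual: only the operator can remove an id from this set.
  if (quarantined.has(setupId)) return { fires: false, reason: 'quarantined' };

  // ASSUMPTION: "at -0.12R" is read as "at or below -0.12R".
  if (typeof expectancyR === 'number' && expectancyR <= RETIREMENT_EXPECTANCY_R) {
    return { fires: false, reason: 'retired' };
  }

  const approved = SESSION_PLAYBOOK[session] || [];
  if (!approved.includes(setupId)) return { fires: false, reason: 'session_blocked' };

  return { fires: true, reason: 'ok' };
}
```

Because the BUILD PHASE quarantine list is empty, the default `quarantined` set is empty too, so today only the retirement check and the session playbook can block a setup in this sketch.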

Calibration Rationale

The quarantine list is data-driven, not opinion-driven. Each benched setup was quarantined after demonstrating structural failure, not a bad streak.
The session playbook is calibrated from historical outcome data grouped by session — Asia breakouts fail because there's no institutional flow to sustain them, not because breakouts are bad setups.

Liquidity Analysis

Liquidity Target Scoring

7 types · 44pt proximity cap

LIQ

How the engine scores potential sweep targets by type, proximity, and session.

Category Weight Score Threshold

Core Decision

Each liquidity target gets a composite score from type base + proximity adjustment + session bonus — producing a live priority queue of the "hottest" sweep targets.

Score baseScore + max(0, 44 - distTicks × 0.55) + sessionBonus

Session +8 when target belongs to current session

High Confluence Session extreme nearby with session match — highest-priority sweep target in the live queue.

Low Confluence Range target far away — effectively invisible to the current move, deprioritized.

Type Hierarchy

The engine maintains a real-time ranked list of liquidity targets — price levels where stop orders are likely to cluster. Each target receives a composite score combining three factors: type base score, proximity adjustment, and session context bonus. The type hierarchy reflects institutional behavior: session extremes (Asia/London/OR highs and lows) score highest (34) because they're the most visible stop-cluster locations — virtually every retail trader places stops outside them. Prior-day extremes (PDH/PDL at 32) serve a similar role at daily scale.

Proximity & Session Modifiers

The proximity formula adds up to 44 points based on how close the target is: max(0, 44 − distTicks × 0.55). This means a target at the current price gets +44, a target 40 ticks away gets +22, and a target 80 ticks or more away gets zero. The decay rate of 0.55 per tick was calibrated to match the typical effective range of institutional sweep operations: stops too far away aren't getting swept in the current move. A session match bonus of +8 rewards targets that belong to the current session (Asia high targeted during Asia session) because same-session extremes are the freshest, most-watched levels. Range targets score only 6 base with an additional -20 penalty because intraday ranges are weak references: everyone sees them, but they lack the institutional significance of session extremes.

Base Scores by Type

Target Type	Base Score
Asia/London/OR High-Low	34
PDH / PDL	32
Value Area	26
VPOC	24
VWAP	22
Equal Highs/Lows	16
Range	6

Score Modifiers

Modifier	Effect
Proximity	max(0, 44 - distTicks × 0.55)
Session match bonus	+8 (Asia target in Asia, etc.)
Range penalty	-20
Equal H/L penalty	-8
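
The composite score and the resulting priority queue can be sketched as follows. The numbers come from the two tables above; the type keys, `liquidityScore`, `rankTargets`, and the choice to add the type penalties directly into the composite are assumptions of this sketch.

```javascript
// Base scores from the Base Scores by Type table (keys are hypothetical).
const BASE_SCORE = {
  session_extreme: 34, // Asia / London / OR high-low
  pdh_pdl: 32,
  value_area: 26,
  vpoc: 24,
  vwap: 22,
  equal_hl: 16,
  range: 6,
};

// Type penalties from the Score Modifiers table.
const TYPE_PENALTY = { range: -20, equal_hl: -8 };

// Proximity adjustment: up to 44 points, decaying 0.55 per tick, floored at 0.
function proximity(distTicks) {
  return Math.max(0, 44 - distTicks * 0.55);
}

// target: { type, distTicks, sameSession }
function liquidityScore(target) {
  const penalty = TYPE_PENALTY[target.type] || 0;
  const sessionBonus = target.sameSession ? 8 : 0;
  return BASE_SCORE[target.type] + penalty + proximity(target.distTicks) + sessionBonus;
}

// Live priority queue: hottest target first.
function rankTargets(targets) {
  return targets
    .map((t) => ({ ...t, score: liquidityScore(t) }))
    .sort((a, b) => b.score - a.score);
}

const queue = rankTargets([
  { type: 'range', distTicks: 80, sameSession: false }, // 6 - 20 + 0
  { type: 'session_extreme', distTicks: 0, sameSession: true }, // 34 + 44 + 8
  { type: 'pdh_pdl', distTicks: 40, sameSession: false }, // 32 + 22
]);
```

In this example the same-session extreme at the current price tops the queue and the distant range target falls to the bottom with a negative score, matching the "effectively invisible" description.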

Calibration Rationale

Session extremes score highest because they're the most obvious stop-cluster locations: virtually every retail trader sets stops outside them.
PDH/PDL are close behind for the same reason at daily scale.
Range gets only 6 (effectively -14 after penalty) because it's a weak, overused reference.
Proximity degrades at 0.55/tick — a target 80 ticks away is effectively invisible to the current move.
The scoring produces a live priority queue: the engine always knows which liquidity target is the "hottest" right now.

Order Flow Engine

Order Flow State Machine

9 states · 15–82 conf

9 states evaluated in priority order. First match wins. Each state carries a quality rating and confidence score.

Detect Classify Confirm Score

Core Decision

Priority-first state classification: first matching state wins, no further evaluation.

Priority Stale → Trap → Initiative → CVD Div → Absorption → Exhaustion → Delta NC → Noisy

Confidence range 15–82, first match takes slot

Confirmed High-quality state (Trap 82, Initiative 78) — institutional behavior detected, full conviction signal.

Rejected Low-quality state (Noisy 35, Stale 15) — no interpretable flow, signal suppressed.

State Machine Logic

The order flow state machine is the engine's real-time interpretation of what the market is doing right now. Every bar, it evaluates 9 possible states in a fixed priority order — the first state whose conditions are met wins, and no further states are evaluated. This priority-first design prevents ambiguity: when multiple states could apply, the highest-conviction interpretation takes precedence.

High-Priority States

Stale (15) checks first as a circuit breaker — if the latest flow data exceeds the age threshold, every other state is meaningless. Trap Flow (82) evaluates second and carries the highest confidence in the system: confirmed stop hunt with reclaim means institutional behavior — this is the most reliable signal the engine produces. Initiative Buy/Sell (78) require the tightest triple-confirmation: meaningful delta magnitude + CVD confirmation + efficient price travel in the same direction. Missing any one of the three drops the state.

Mid & Low-Priority States

CVD Divergence (72) catches a specific failure: price makes a new extreme but cumulative volume delta doesn't confirm — the "who's buying?" gap. Absorption (68) detects price stopping despite aggressive hitting, especially near a structural level. Exhaustion (62) is the "dying move" state: price is still traveling but delta has lost its meaningful threshold, and either volume or CVD is diverging. Delta Not Confirmed (48) captures the awkward middle: delta is meaningful but price/CVD don't agree. Noisy/Unusable (35) is the default — no conditions met, no interpretation possible.

State Priority Table

State	Quality	Conf	Trigger
Stale	LOW	15	Flow age exceeds stale threshold
Trap Flow	HIGH	82	Stop hunt = 'trap' OR reclaim confirmed + sweep + matching delta
CVD Divergence	MED	72	New high without CVD up, or new low without CVD down
Absorption	MED	68	Meaningful + near level + (price stalled OR efficiency <28% OR vol ≥70%ile)
Initiative Buy	HIGH	78	Price up + meaningful delta + buy-side + CVD confirming + efficient
Initiative Sell	HIGH	78	Price down + meaningful delta + sell-side + CVD confirming + efficient
Exhaustion	MED	62	Price moving but delta NOT meaningful + (vol ≥65%ile OR CVD diverging)
Delta Not Confirmed	LOW	48	Delta is meaningful but price/CVD don't confirm
Noisy / Unusable	LOW	35	Default — no conditions met
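
First-match-wins classification maps naturally onto an ordered list. The sketch below follows the priority chain stated in this card's Core Decision (Stale → Trap → Initiative → CVD Div → Absorption → Exhaustion → Delta NC → Noisy) and takes the quality and confidence values from the table above. The boolean flags on the input (`stale`, `trap`, `initiative`, and so on) are hypothetical stand-ins for the trigger logic in the Trigger column, which is assumed to be evaluated upstream.

```javascript
// Ordered by evaluation priority; the first entry whose test passes wins.
const FLOW_STATES = [
  { name: 'Stale', quality: 'LOW', conf: 15, when: (f) => f.stale },
  { name: 'Trap Flow', quality: 'HIGH', conf: 82, when: (f) => f.trap },
  { name: 'Initiative Buy', quality: 'HIGH', conf: 78, when: (f) => f.initiative === 'buy' },
  { name: 'Initiative Sell', quality: 'HIGH', conf: 78, when: (f) => f.initiative === 'sell' },
  { name: 'CVD Divergence', quality: 'MED', conf: 72, when: (f) => f.cvdDivergence },
  { name: 'Absorption', quality: 'MED', conf: 68, when: (f) => f.absorption },
  { name: 'Exhaustion', quality: 'MED', conf: 62, when: (f) => f.exhaustion },
  { name: 'Delta Not Confirmed', quality: 'LOW', conf: 48, when: (f) => f.deltaNotConfirmed },
  // Default state: always matches, so classifyFlow never returns undefined.
  { name: 'Noisy / Unusable', quality: 'LOW', conf: 35, when: () => true },
];

// flags: precomputed booleans for each state's trigger (hypothetical shape).
function classifyFlow(flags) {
  const state = FLOW_STATES.find((s) => s.when(flags));
  return { name: state.name, quality: state.quality, conf: state.conf };
}
```

Keeping the order in data rather than in nested conditionals makes the priority explicit: stale data short-circuits everything, and a confirmed trap outranks any other live reading even when absorption or divergence flags are also set.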

Calibration Rationale

Priority order matters: after the Stale circuit breaker, Trap Flow (82 confidence) evaluates ahead of every other live state because confirmed institutional stop-hunting is the highest-conviction signal; when it's present, nothing else matters.
Initiative Buy/Sell (78) require the full triple-confirmation picture.
Exhaustion (62) is the "dying move" detector.
The spread from 15 to 82 reflects the genuine information-content gap between stale noise and confirmed institutional trapping.

Order Flow Engine

Directional Intent Scoring

62% min · 8 modes

DIR

Five modules vote on long vs. short. The winner must clear both a minimum score and a clear edge over the other side.

Detect Classify Confirm Score

Core Decision

Dual-threshold directional call: side minimum 62% AND edge 10% required to declare direction.

Side min 62% to declare LONG or SHORT

Edge 10% L-S gap required

Confidence floor 60% (range 45–75%)

Confirmed Direction declared with mode (ACCEPTANCE / PULLBACK / TRAP_REVERSAL) — routes eligible setup families.

Rejected CONFLICT or NO_DIRECTION — both sides too close or too weak, no directional trade eligible.

Voting & Thresholds

Directional intent scoring aggregates five independent modules — each evaluates its own evidence and produces a long score and a short score. The scores are weighted and combined into a final directional call. The thresholds are intentionally stricter than the Bias Arbitrator (Gate 2.5): the side minimum is 62% (vs. 55%), and the edge requirement is 10% (vs. 8%). This dual-threshold design prevents two failure modes: declaring direction when evidence is thin (minimum check) and declaring direction when both sides have nearly equal evidence (edge check).

Directional Modes

Beyond the binary long/short call, the system infers a directional mode — how the market is expressing its direction. ACCEPTANCE means price is being accepted in the direction (trending smoothly). PULLBACK means the direction holds but price is retracing (potential entry opportunity). TRAP_REVERSAL means a failed move trapped participants on the wrong side (highest conviction). The mode tells the setup system what kind of trade to look for: a pullback in an uptrend calls for continuation setups; a trap reversal calls for reversal setups.

Confidence Floor

The confidence floor at 60% flags marginal directional reads. A call that passes the math (62% side, 10% edge) but has low confidence (below 60%) is technically valid but unreliable — the engine logs it as a research note rather than a conviction signal. Confidence ranges from 45% to 75%, with the floor at 60% reflecting the point where cohort attribution data shows directional calls become predictive of setup outcomes.

Threshold Parameters

Parameter	Value
Side minimum	62% (to declare LONG or SHORT)
Side edge	10% (L-S gap required)
Conflict min	55% (both sides ≥ this = CONFLICT)
Conflict edge	10% (max gap for CONFLICT state)
None max	55% (below = NO_DIRECTION)
Confidence floor	60% (range 45–75%)
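
The dual-threshold call can be sketched as a small classifier. Threshold values are taken from the table above; `classifyDirection` and its return shape are hypothetical. Two points the table leaves open are resolved here by assumption: a gap of exactly 10% with a side at 62% or more is treated as a declared direction rather than a conflict, and any score pair that matches neither rule falls back to NO_DIRECTION.

```javascript
const DIR = {
  sideMin: 62, // % needed to declare LONG or SHORT
  sideEdge: 10, // % gap required between long and short
  conflictMin: 55, // both sides at or above this can mean CONFLICT
  conflictEdge: 10, // max gap for CONFLICT
  confidenceFloor: 60, // below this, a call is only a research note
};

// longPct / shortPct: weighted module scores in percent; confidencePct: 45-75.
function classifyDirection(longPct, shortPct, confidencePct) {
  const gap = Math.abs(longPct - shortPct);
  const top = Math.max(longPct, shortPct);
  const bottom = Math.min(longPct, shortPct);

  let call;
  if (top >= DIR.sideMin && gap >= DIR.sideEdge) {
    // Minimum check and edge check both pass: direction is declared.
    call = longPct > shortPct ? 'LONG' : 'SHORT';
  } else if (bottom >= DIR.conflictMin && gap <= DIR.conflictEdge) {
    // Both sides have real evidence and neither has a clear edge.
    call = 'CONFLICT';
  } else {
    // ASSUMPTION: everything else is treated as no direction.
    call = 'NO_DIRECTION';
  }

  const directional = call === 'LONG' || call === 'SHORT';
  // A directional call under the confidence floor is valid but not a conviction signal.
  const conviction = directional && confidencePct >= DIR.confidenceFloor;
  return { call, conviction };
}
```

Mode inference (ACCEPTANCE, PULLBACK, TRAP_REVERSAL) is left out of this sketch because the atlas does not describe the evidence that selects a mode.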

Mode Taxonomy

Mode	Variations
LONG_*	ACCEPTANCE · PULLBACK · TRAP_REVERSAL
SHORT_*	ACCEPTANCE · PULLBACK · TRAP_REVERSAL
Mixed	CONFLICT · NO_DIRECTION

Calibration Rationale

62% is the real scoring minimum (vs. 55% in the bias arbitrator's pipeline version) because directional intent is a higher-stakes call — it determines the trade direction, not just a gate penalty.
The 10% edge prevents "barely long" calls.
Mode inference (ACCEPTANCE vs PULLBACK vs TRAP_REVERSAL) tells the setup system what kind of trade to look for — a critical routing decision that determines which setup families are eligible.

Order Flow Engine

Absorption Quality States

6 states · 7 score tiers · 18–86

ABS

Absorption isn't binary: it transitions through 6 states (7 score tiers, since REVERSAL_CONFIRMED scores 86 or 84) from initial detection to confirmed reversal or invalidation.

Detect Classify Confirm Score

Core Decision

Progressive state machine: absorption evidence accumulates from initial detection (58) to reversal confirmed (86) or invalidated (18).

Entry ACTIVE_ABSORPTION at 58: initial detection

Peak REVERSAL_CONFIRMED at 86: full convergence

Kill INVALIDATED at 18: price accepted through level

Confirmed REVERSAL_CONFIRMED (86/84) — failed move + structural proof + opposite hold, full reversal conviction.

Rejected INVALIDATED (18) — price bulldozed through the absorbed level, thesis dead.

Detection Model

Absorption detection answers a critical question: is someone passively absorbing aggressive flow at a level? When large limit orders silently eat incoming market orders without letting price move, it's invisible on the price chart but detectable in order flow — high volume + no price movement = passive absorption. The engine models this as a progressive state machine rather than a binary flag.

Early States

ACTIVE_ABSORPTION (58) is the initial detection: order flow shows absorption, but it's just a fact — someone is sitting on a level. It becomes interesting at TRAP_BUILDING (74) when structural evidence accumulates: a reclaim, a failed breakout, or proximity to a planned target suggests the absorption will hold. CONTINUATION_WARNING (66) is the default middle state — absorption is present but there's no reversal evidence yet, and the market might push through.

Convergence States

The highest states require convergence. REVERSAL_CONFIRMED at 86 needs reclaim or failed breakout PLUS opposite structure or hold — the full picture. At 84, it needs divergence plus at least one structural confirmation. TWO_REVERSAL_WARNINGS (80) fires on CVD divergence alone without structural confirmation — it's close to reversal conviction but lacks the physical proof. INVALIDATED (18) kills the thesis entirely: price accepted through the absorbed level. The passive defender lost — game over.

State Scoring Table

State	Score	Trigger
REVERSAL_CONFIRMED	86	Reclaim or failed breakout + opposite structure or hold
REVERSAL_CONFIRMED	84	Divergence + (reclaim or failed breakout or opposite structure)
TWO_REVERSAL_WARNINGS	80	CVD divergence alone (without structural confirmation)
TRAP_BUILDING	74	Reclaim or failed breakout or near planned target
CONTINUATION_WARNING	66	Default — absorption present but no reversal evidence
ACTIVE_ABSORPTION	58	Initial detection — order flow shows absorption
INVALIDATED	18	Price accepted through the absorbed level
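
The scoring tiers can be read top-down as a cascade of checks. This is a sketch under stated assumptions, not the engine's logic: the evidence flags (`acceptedThrough`, `reclaim`, `failedBreakout`, `oppositeStructure`, `hold`, `divergence`, `nearPlannedTarget`) are hypothetical names for the triggers in the table, INVALIDATED is assumed to override every other tier, and the `firstDetection` flag is an invented way to separate ACTIVE_ABSORPTION from the CONTINUATION_WARNING default, which the atlas does not spell out.

```javascript
// evidence: object of booleans describing what has been observed at the level.
function absorptionState(evidence) {
  // ASSUMPTION: acceptance through the level kills the thesis regardless of other flags.
  if (evidence.acceptedThrough) return { state: 'INVALIDATED', score: 18 };

  const failedMove = Boolean(evidence.reclaim || evidence.failedBreakout);

  // 86: failed move plus opposite structure or a hold.
  if (failedMove && (evidence.oppositeStructure || evidence.hold)) {
    return { state: 'REVERSAL_CONFIRMED', score: 86 };
  }
  // 84: divergence plus at least one structural confirmation.
  if (evidence.divergence && (failedMove || evidence.oppositeStructure)) {
    return { state: 'REVERSAL_CONFIRMED', score: 84 };
  }
  // 80: divergence alone, no structural proof yet.
  if (evidence.divergence) return { state: 'TWO_REVERSAL_WARNINGS', score: 80 };

  // 74: structural evidence is accumulating.
  if (failedMove || evidence.nearPlannedTarget) return { state: 'TRAP_BUILDING', score: 74 };

  // ASSUMPTION: the very first observation is ACTIVE_ABSORPTION; after that,
  // absorption without reversal evidence is the CONTINUATION_WARNING default.
  if (evidence.firstDetection) return { state: 'ACTIVE_ABSORPTION', score: 58 };
  return { state: 'CONTINUATION_WARNING', score: 66 };
}
```

Ordering the checks from strongest to weakest evidence means a reading is always reported at the highest tier it qualifies for, which is what lets the score rise progressively as confirmation arrives.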

Calibration Rationale

Absorption alone (58) is interesting but not actionable — anyone can sit on a level temporarily.
TRAP_BUILDING (74) means structural evidence is accumulating.
REVERSAL_CONFIRMED (86) is the full convergence: failed move + structural proof + opposite hold.
The progressive scoring prevents premature action on absorption while ensuring the engine acts decisively when the full picture emerges.
INVALIDATED (18) means price bulldozed through — the absorbed level is gone, and any setup based on it is dead.

Order Flow Engine

Absorption Stability Tracker

60s window · 60% lead

STB

Prevents flip-flopping — absorption side must hold for minimum ticks and time before switching.

Detect · Classify · Confirm · Score

Core Decision

Triple-condition side flip: 5 ticks + 10 seconds + 60% weighted dominance required simultaneously.

Window: 60s history of scored observations

Flip: ≥5 ticks + ≥10s + ≥60% weighted lead

Floor: drop observations below score 50

Confirmed: side flip validated; reversal banner shown for 30 seconds, new absorption side declared.

Rejected: flip conditions not met; current side held, noisy observations filtered out.

Low-Pass Filter

Raw absorption detection is noisy — in a fast market, the absorption side can appear to flip on every tick as aggressive flow alternates between bid and ask. The stability tracker is a low-pass filter that prevents meaningless side changes from polluting the engine's absorption state.

Weighted Voting

The tracker maintains a 60-second history window of scored observations. Each observation carries the quality score from the absorption detection module, and votes are weighted by score, so a high-quality observation (e.g., REVERSAL_CONFIRMED at 86) counts for more than a low-quality one (ACTIVE_ABSORPTION at 58). To flip the declared side, the new side must satisfy three independent conditions simultaneously: (1) maintain its lead for at least 5 ticks, (2) hold for at least 10 seconds, and (3) achieve ≥60% weighted dominance in the recent vote window.

Grace & Banner

A fade grace period of 10 seconds holds the last declared side even when no new observations arrive — this prevents the tracker from going blank during brief pauses in flow. The reversal banner lasts 30 seconds: when a genuine side flip is confirmed, the engine displays a reversal alert on the dashboard for 30 seconds to ensure the trader notices the change. Observations below score 50 are dropped entirely — low-quality noise shouldn't influence the stability calculation.

Stability Parameters

Parameter	Value
History window	60 seconds
Flip minimum ticks	5 ticks
Flip minimum time	10 seconds
Flip lead %	60% weighted dominance
Min observation score	50 (drop below this)
Reversal banner	30 seconds
Fade grace	10 seconds (holds last side)
Recent vote window	25 ticks
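
The parameters above can be combined into a small state machine. This is a sketch under stated assumptions rather than the engine's implementation: votes are weighted linearly by score (the source does not specify the weighting function), a "tick" is taken to mean one accepted observation, and the class, method, and field names are hypothetical.

```javascript
// Values come from the Stability Parameters table; field names are hypothetical.
const STABILITY = {
  windowMs: 60000, flipMinTicks: 5, flipMinMs: 10000, flipLead: 0.6,
  minScore: 50, voteWindow: 25, bannerMs: 30000, fadeGraceMs: 10000,
};

class AbsorptionStabilityTracker {
  constructor(params = STABILITY) {
    this.p = params;
    this.history = [];         // accepted observations inside the history window
    this.side = null;          // declared side: "BID" or "ASK"
    this.challenge = null;     // { sinceMs, ticks } while the other side leads
    this.bannerUntilMs = 0;
    this.lastSeenMs = -Infinity;
  }

  observe(tMs, side, score) {
    if (score < this.p.minScore) return this.side;   // floor: drop low-quality noise
    this.lastSeenMs = tMs;
    this.history.push({ tMs, side, score });
    this.history = this.history.filter((o) => tMs - o.tMs <= this.p.windowMs);
    if (this.side === null) {
      this.side = side;                               // first accepted observation declares a side
      return this.side;
    }
    const recent = this.history.slice(-this.p.voteWindow);
    const total = recent.reduce((sum, o) => sum + o.score, 0);
    const against = recent
      .filter((o) => o.side !== this.side)
      .reduce((sum, o) => sum + o.score, 0);

    if (against / total < this.p.flipLead) {
      this.challenge = null;                          // lead lost: restart the count
      return this.side;
    }
    if (this.challenge === null) this.challenge = { sinceMs: tMs, ticks: 0 };
    this.challenge.ticks += 1;
    const heldMs = tMs - this.challenge.sinceMs;
    if (this.challenge.ticks >= this.p.flipMinTicks && heldMs >= this.p.flipMinMs) {
      this.side = this.side === "BID" ? "ASK" : "BID";
      this.challenge = null;
      this.bannerUntilMs = tMs + this.p.bannerMs;     // reversal banner window
    }
    return this.side;
  }

  // Fade grace: keep reporting the last side for a short time with no new observations.
  current(tMs) {
    return tMs - this.lastSeenMs <= this.p.fadeGraceMs ? this.side : null;
  }

  bannerActive(tMs) {
    return tMs < this.bannerUntilMs;
  }
}
```

Resetting the challenge whenever the weighted lead drops below 60% is what makes the three conditions simultaneous: the tick count and the 10-second clock only accumulate while dominance holds.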

Calibration Rationale

Without stability constraints, absorption side would flip on every tick in a noisy market, making the signal useless.
The 60% lead requirement means the new side must convincingly dominate, not just edge ahead.
The 10-second minimum prevents reactionary flips to single large prints — institutional iceberg orders can produce momentary opposite-side signals that shouldn't cause a flip.
Score-weighted voting ensures that high-quality reversal observations count for more than low-quality noise observations.

Order Flow Engine

Market Order Tempo

3 levels · 0.6× / 1.5×

TMO

How the engine reads market "loudness" — are institutions actively participating or is this retail noise?

Detect · Classify · Confirm · Score

Core Decision

Bar-by-bar institutional participation classification against fixed baselines per symbol.

WEAK: <0.6× baseline → confluence threshold +4

NORMAL: 0.6× – 1.5× baseline → no adjustment

STRONG: ≥1.5× baseline → confluence threshold −3 to −5

Confirmed: STRONG tempo; institutions actively participating, confluence threshold lowered, flow IS confirmation.

Rejected: WEAK tempo; retail noise, no institutional footprint, confluence threshold raised by 4.

Baseline Measurement

Market order tempo is a per-bar measurement of institutional participation intensity. The engine counts both the number of trades (order frequency) and the total contracts (order size) in each bar, comparing them against fixed baselines derived from average RTH bar statistics: NQ baseline is 900 trades / 12,000 contracts per bar; ES baseline is 600 trades / 8,000 contracts.

Threshold Adjustment

Below 60% of baseline (WEAK), the market is dominated by retail noise: small orders, no institutional footprint. The engine responds by raising the confluence threshold by 4 points, demanding more evidence before firing. At or above 150% of baseline (STRONG), institutions are actively participating. The engine lowers the threshold by 3 to 5 points because the heavy participation itself is a form of confirmation: less structural evidence is needed when the order flow is shouting a direction.

Macro vs Micro Timescale

The tempo check is separate from the volume regime (P2) and works at a different timescale. Volume regime uses daily percentile rank (macro context); tempo measures bar-by-bar participation (micro context). A STRONG tempo bar in a LOW volume regime means "today is quiet overall, but right now someone big showed up" — that's a relevant signal for the current bar's evaluation.

Tempo Thresholds

Level	NQ Trades	NQ Volume	ES Trades	ES Volume
WEAK (<0.6×)	<540	<7,200	<360	<4,800
NORMAL	540–1,349	7,200–17,999	360–899	4,800–11,999
STRONG (≥1.5×)	≥1,350	≥18,000	≥900	≥12,000
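
Each cell in the table is the per-symbol baseline multiplied by 0.6 or 1.5. A minimal sketch of the classification follows. The source does not say how the trade-count and contract-volume readings are combined, so taking the weaker of the two ratios is an assumption, as are the function and field names. The adjustment is returned as a range because the rule for choosing within −3 to −5 is not given.

```javascript
// Baselines are the per-bar figures quoted in the text.
const TEMPO_BASELINES = {
  NQ: { trades: 900, contracts: 12000 },
  ES: { trades: 600, contracts: 8000 },
};

function classifyTempo(symbol, trades, contracts) {
  const base = TEMPO_BASELINES[symbol];
  if (!base) throw new Error(`No tempo baseline for ${symbol}`);

  // Assumption: the weaker of the two ratios decides the level, so STRONG
  // needs both readings at or above 1.5x and either one below 0.6x gives WEAK.
  const ratio = Math.min(trades / base.trades, contracts / base.contracts);

  if (ratio < 0.6) return { level: "WEAK", ratio, thresholdAdj: [4, 4] };
  if (ratio >= 1.5) return { level: "STRONG", ratio, thresholdAdj: [-5, -3] };
  return { level: "NORMAL", ratio, thresholdAdj: [0, 0] };
}
```

For example, `classifyTempo("NQ", 1400, 19000)` lands in STRONG, while the same trade count with only 12,000 contracts stays NORMAL under the weaker-ratio assumption.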

Calibration Rationale

WEAK tempo means the market order flow is below 60% of baseline — retail noise, no institutional footprint; setups need more confluence to compensate.
STRONG means real participation — the flow itself IS confirmation.
The 0.6×/1.5× multipliers mark the empirical inflection points where institutional participation becomes visible (or invisible) in order flow data.

Order Flow Engine

Research Microstructure Signals (R1–R4)

4 signals · 10 substrate fields

R1–R4

Academic-validated microstructure signals wired into the RR profiler, stop adjustment, and per-fire substrate.

Market Data · Research Signals · Decision Chain

Four Signals

Each signal has a computation function (engine-order-flow.js) and decision-chain wiring (engine-rr-confluence.js + engine-pipeline.js).

R1 VPIN: |buyVol − sellVol| / totalVol · rolling 50-bucket · TOXIC ≥0.7 / ELEVATED ≥0.5 / NORMAL ≥0.3 / CLEAN

R2 Vol Clock: barVolume / median(100 bars) · SURGE ≥2.0 / FAST ≥1.5 / NORMAL ≥0.7 / SLOW ≥0.4 / DEAD

R3 First-30: (price_10:00 − price_9:30) / price_9:30 · power-hour only (15:00–15:45 ET) × GEX regime

R4 OFI: Δ(bestBidSize) − Δ(bestAskSize) · acceleration over 3 snapshots · BID_BUILDING / ASK_BUILDING / NEUTRAL
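
The R1 and R2 formulas can be sketched as follows. This is illustrative only and is not taken from engine-order-flow.js. The function names and the bucket shape `{ buyVol, sellVol }` are hypothetical, and VPIN is computed as total imbalance over total volume across the last 50 buckets, which equals the per-bucket average when buckets hold equal volume.

```javascript
// R1: rolling VPIN over the most recent volume buckets.
function vpin(buckets, lookback = 50) {
  let imbalance = 0;
  let total = 0;
  for (const { buyVol, sellVol } of buckets.slice(-lookback)) {
    imbalance += Math.abs(buyVol - sellVol);
    total += buyVol + sellVol;
  }
  return total === 0 ? 0 : imbalance / total;
}

function classifyVpin(value) {
  if (value >= 0.7) return "TOXIC";
  if (value >= 0.5) return "ELEVATED";
  if (value >= 0.3) return "NORMAL";
  return "CLEAN";
}

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// R2: current bar volume relative to the median of the prior bars.
function volumeClock(barVolume, priorVolumes, lookback = 100) {
  const recent = priorVolumes.slice(-lookback);
  if (recent.length === 0) return 0;
  const m = median(recent);
  return m > 0 ? barVolume / m : 0;
}

function classifyVolumeClock(ratio) {
  if (ratio >= 2.0) return "SURGE";
  if (ratio >= 1.5) return "FAST";
  if (ratio >= 0.7) return "NORMAL";
  if (ratio >= 0.4) return "SLOW";
  return "DEAD";
}
```

Using the median rather than the mean for the volume-clock baseline keeps a single outsized bar from distorting the comparison.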

R1 → RR: TOXIC + aligned → −0.15R (informed agrees). TOXIC + counter → +0.30R (counterparty risk). ELEVATED + counter → +0.15R.

R2 → Meta: Applied LAST to total adjustment: DEAD ×0.4, SLOW ×0.7 (dampen toward zero). SURGE ×1.15 (amplify; signals trustworthy).

R3 → RR: Power-hour only. Aligned → −0.15R × GEX mult. Opposing → +0.20R × GEX mult. CASCADE ×1.5 / PINNED ×0.5.

R4 → RR+Stop: RR: aligned → −0.10R, opposing → +0.15R. Stop: OFI opposing trade direction → stop ×0.90 (tighter).
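
The wiring above amounts to summing the R1, R3, and R4 adjustments and then scaling the total by the R2 meta-multiplier. A minimal sketch under stated assumptions: the input field names and the "ALIGNED" / "COUNTER" / "OPPOSING" encodings are hypothetical, and any GEX regime or volume-clock level without a listed multiplier is assumed to use ×1.

```javascript
const GEX_MULT = { CASCADE: 1.5, PINNED: 0.5 };               // others assumed x1
const VOL_CLOCK_MULT = { DEAD: 0.4, SLOW: 0.7, SURGE: 1.15 }; // FAST / NORMAL assumed x1

function researchAdjustments(s) {
  let rrAdj = 0;

  // R1 VPIN: informed flow relative to the trade direction.
  if (s.vpinLevel === "TOXIC" && s.vpinFlow === "ALIGNED") rrAdj -= 0.15;
  if (s.vpinLevel === "TOXIC" && s.vpinFlow === "COUNTER") rrAdj += 0.30;
  if (s.vpinLevel === "ELEVATED" && s.vpinFlow === "COUNTER") rrAdj += 0.15;

  // R3 First-30: only evaluated during the power hour, scaled by GEX regime.
  if (s.inPowerHour) {
    const gexMult = GEX_MULT[s.gexRegime] ?? 1;
    if (s.first30 === "ALIGNED") rrAdj -= 0.15 * gexMult;
    if (s.first30 === "OPPOSING") rrAdj += 0.20 * gexMult;
  }

  // R4 OFI: adjusts RR and tightens the stop when it opposes the trade.
  if (s.ofi === "ALIGNED") rrAdj -= 0.10;
  if (s.ofi === "OPPOSING") rrAdj += 0.15;
  const stopMult = s.ofi === "OPPOSING" ? 0.90 : 1;

  // R2 Vol Clock: applied last, to the total adjustment.
  rrAdj *= VOL_CLOCK_MULT[s.volClock] ?? 1;

  return { rrAdj, stopMult };
}
```

As a worked case, TOXIC counter-flow (+0.30R) plus opposing OFI (+0.15R) in a SLOW volume clock gives 0.45R × 0.7 ≈ 0.315R, with the stop tightened to ×0.90.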

Research Basis

R1 VPIN (Easley, Lopez de Prado, O’Hara): Volume-Synchronized Probability of Informed Trading. Uses actual aggressor-classified buy/sell volume, not bulk classification. High VPIN precedes volatility events with R² ≈ 0.4 in the original paper.
R2 Volume Clock (Lopez de Prado, “Advances in Financial ML”): volume-time vs wall-clock-time. When bars take longer to fill (SLOW/DEAD), signals are dominated by noise. The meta-multiplier gates the reliability of ALL other signal adjustments.
R3 First-30-min (Gao, Han, Li, Zhou 2018, Journal of Financial Economics): the first 30 minutes of the session predict the direction of the last 30 minutes. Crossed with GEX regime: CASCADE amplifies (dealers hedge in the direction of the move, selling into drops and buying into rips), PINNED dampens (dealers absorb moves).
R4 OFI (Cont, Kukanov, Stoikov 2014): Order Flow Imbalance velocity. Change in best bid size minus change in best ask size. R² ≈ 70% for short-term price prediction in the original paper. Scaffolded for L2 depth data (connecting this week).
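
The R4 definition (change in best bid size minus change in best ask size, with acceleration over 3 snapshots) can be sketched as below. The source gives no acceleration threshold, so `threshold` is a hypothetical parameter, as are the snapshot field names. The size-only difference matches the order flow imbalance of Cont, Kukanov, and Stoikov only while the best bid and ask prices are unchanged between snapshots.

```javascript
// Size-only order flow imbalance between two consecutive top-of-book snapshots.
function ofiStep(prev, curr) {
  return (curr.bidSize - prev.bidSize) - (curr.askSize - prev.askSize);
}

// Classify the change in OFI across the three most recent snapshots.
function classifyOfi(snapshots, threshold) {
  if (snapshots.length < 3) return "NEUTRAL";
  const [a, b, c] = snapshots.slice(-3);
  const acceleration = ofiStep(b, c) - ofiStep(a, b);
  if (acceleration > threshold) return "BID_BUILDING";
  if (acceleration < -threshold) return "ASK_BUILDING";
  return "NEUTRAL";
}
```

With snapshots whose bid size goes 100 → 110 → 150 while ask size goes 100 → 100 → 90, the two OFI steps are 10 and 50, so the acceleration is 40 and any threshold below 40 classifies the book as BID_BUILDING.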

Utilization Audit

Institutional Confluence → RR + Raw-Input Forensics

L1a wire + 11 forensic fields + regime + chop

L1–L2

3-layer utilization audit: “What does the engine KNOW vs what does it DO about it?” Wired institutional confluence into RR floor. Added raw-input substrate fields for post-hoc forensic decomposition.

Audit · Decision Chain · Substrate Forensics

Build Sync · 2026-05-24

Today’s shipped cluster wired RR/state harmonization and entry-timing decomposition into live decisions, substrate persistence, and weekly extraction cohorts.

L1a Confluence→RR: institutionalConfluenceScore(setup) · score≥70 → −0.15R · score<35 → +0.20R

P3.7b RR: requiredRR = dynamicRR(base + flow + regime + ATR14 scale + marketState adj), bounded and source-aware

P3.8 Timing: entryTimingClassAtFire + entryNoFillRiskAtFire + entryChaseRiskAtFire → NO_FILL / CHASE extraction cohorts

L1a → RR: High confluence (≥70) = broad institutional confirmation → compress RR floor (target more reachable). Low (<35) = thin setup → raise floor.

RR + Timing → Substrate: RR ATR14/state-coupling fields and P3.8 NO_FILL/CHASE timing fields are now captured on SIGNAL + BLOCKED rows, so weekly extraction can isolate entry-friction failure modes.

Regime → RR: Day regime severity wired into RR: EXTREME +0.25R, ELEVATED +0.10R. Previously it gated production but didn’t modulate magnitude.

Chop → RR: Chop-proxy (range-bound + low vol + inside value) penalizes breakout/continuation +0.15R, favors reversion/trap −0.05R.
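
The three RR modifiers described in this card (confluence, day regime, chop) can be sketched as additive terms on a base RR floor. This is a partial illustration under stated assumptions: it omits the flow, ATR14 scale, and marketState terms of `dynamicRR`, the setup-kind strings and function names are hypothetical, and the clamp bounds are supplied by the caller because the source says only that the result is "bounded".

```javascript
function confluenceRrAdj(score) {
  if (score >= 70) return -0.15;   // broad institutional confirmation
  if (score < 35) return 0.20;     // thin setup
  return 0;
}

const REGIME_RR_ADJ = { EXTREME: 0.25, ELEVATED: 0.10 };

function chopRrAdj(isChop, setupKind) {
  if (!isChop) return 0;
  if (setupKind === "breakout" || setupKind === "continuation") return 0.15;
  if (setupKind === "reversion" || setupKind === "trap") return -0.05;
  return 0;
}

// bounds = { min, max } is a hypothetical argument; the source gives no values.
function requiredRrSketch(baseRr, ctx, bounds) {
  const raw =
    baseRr +
    confluenceRrAdj(ctx.confluenceScore) +
    (REGIME_RR_ADJ[ctx.dayRegimeSeverity] ?? 0) +
    chopRrAdj(ctx.isChop, ctx.setupKind);
  return Math.min(bounds.max, Math.max(bounds.min, raw));
}
```

For a breakout with confluence 75 on an EXTREME chop day, a base of 2.0R becomes 2.0 − 0.15 + 0.25 + 0.15 = 2.25R before clamping.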

L3 — Expanded 2026-05-24 Analysis Cohorts

Market structure: orderFlowState · absorptionState · liquidityBreakoutType · pricePosVsVwap · momentumAlignment · pricePosVsVpoc
Regime + HTF: dayRegimeSeverity · htfH1 · htfH4 · htfMSS
Execution context (banded): sessionRunwayMin · stopDistanceATR · targetDistanceATR · rangePosPercent
P3.7b RR cohorts: rrAtr14Source/Ratio/Scale + rrMarketState/Mode/Severity/Adj
P3.8 entry-timing cohorts: preEntry/preSignal state + timing class + no-fill/chase risk bands
Total slices: 96 dimensions in weekly_report.py (latest live schema path)
Today’s shipped set reflected: P3.11, P3.12, P3.13b, P3.13c, P3.7b, P3.8, and P5.2 verification.

Cross-Market Signal

Gamma Exposure (GEX)

0–45 DTE · 5 regimes

GEX

ETF options OI → Black-Scholes gamma per strike → dealer hedging direction.

Macro Context · Options Market · Dealer Hedging

Core Decision

ETF options OI → Black-Scholes gamma per strike → dealer hedging direction.

GEX: Γ × OI × 100 × S² × 0.01

Sign: Call GEX positive (stabilizing) · Put GEX negative (amplifying)

Pass: Spot ABOVE gamma flip → STABILIZING regime → dealers dampen moves, mean-reversion favored.

Fail: Spot BELOW gamma flip → AMPLIFYING regime → dealers amplify moves, breakout/momentum favored.

Dealer Mechanics

Market-makers are structurally short options. To stay delta-neutral, they must hedge dynamically. Positive GEX = dealers buy dips, sell rips (stabilizing). Negative GEX = dealers sell into drops, buy into rips (amplifying). The gamma flip level is where this behavior inverts.
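
The per-strike formula Γ × OI × 100 × S² × 0.01 with the call-positive / put-negative sign convention can be sketched as follows, using the standard Black-Scholes gamma Γ = φ(d1) / (S σ √T). The option field names are hypothetical, the rate defaults to zero, and flooring time to expiry at one day (so 0 DTE does not divide by zero) is an assumption of this sketch.

```javascript
// Standard normal probability density.
function normPdf(x) {
  return Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI);
}

// Black-Scholes gamma; the same expression applies to calls and puts.
function bsGamma(spot, strike, iv, years, rate = 0) {
  const sqrtT = Math.sqrt(years);
  const d1 = (Math.log(spot / strike) + (rate + 0.5 * iv * iv) * years) / (iv * sqrtT);
  return normPdf(d1) / (spot * iv * sqrtT);
}

// Signed GEX for one option line: calls positive, puts negative.
function strikeGex(option, spot) {
  const years = Math.max(option.dte, 1) / 365;   // assumption: floor at one day
  const gamma = bsGamma(spot, option.strike, option.iv, years);
  const gex = gamma * option.oi * 100 * spot * spot * 0.01;
  return option.type === "call" ? gex : -gex;
}

// Net GEX over the 0 to 45 DTE portion of the chain.
function netGex(chain, spot) {
  return chain
    .filter((o) => o.dte >= 0 && o.dte <= 45)
    .reduce((sum, o) => sum + strikeGex(o, spot), 0);
}
```

The S² × 0.01 factor converts gamma into dollar exposure per 1% move in the underlying, and the 100 is the contract multiplier for ETF options.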

Cross-Market Edge

Data sourced from ETF options (SPY/QQQ), not futures options (ES/NQ). This is an independent participant pool: pension funds, insurance companies, and retail equity investors rather than futures prop desks. It gives cross-market confirmation with a mechanical (not discretionary) basis.

VIX Integration

Research-validated: GEX is a VIX modifier, not a standalone signal (FlashAlpha 8yr backtest: ρ=-0.14 after VIX control). Combined regime matrix: PINNED (calm+stabilizing) · COILED (calm+amplifying) · DAMPENED · VOLATILE · CASCADE.

Key Levels

Level	Definition
Gamma Flip	Price where net GEX crosses zero — regime boundary
Call Wall	Strike with highest call GEX — mechanical ceiling
Put Wall	Strike with highest put GEX — mechanical floor
Vol Trigger	Put wall below which negative-gamma cascading accelerates
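
The first three levels in the table can be located from per-strike GEX totals and a net-GEX profile across prices. A minimal sketch, assuming hypothetical input shapes: `strikes` holds signed call and put GEX per strike, and `profile` holds net GEX recomputed at a grid of hypothetical spot prices. Because put GEX is negative under the sign convention above, "highest put GEX" is read here as largest in magnitude. The vol trigger is left out because its definition is not specific enough to code.

```javascript
function keyLevels(strikes, profile) {
  // Call wall: strike carrying the largest call GEX.
  const callWall = strikes.reduce((best, s) => (s.callGex > best.callGex ? s : best)).strike;

  // Put wall: strike carrying the most negative (largest-magnitude) put GEX.
  const putWall = strikes.reduce((best, s) => (s.putGex < best.putGex ? s : best)).strike;

  // Gamma flip: first zero crossing of net GEX, linearly interpolated.
  let gammaFlip = null;
  for (let i = 1; i < profile.length; i++) {
    const a = profile[i - 1];
    const b = profile[i];
    if (a.netGex === 0) {
      gammaFlip = a.price;
      break;
    }
    if (a.netGex * b.netGex < 0) {
      gammaFlip = a.price + ((0 - a.netGex) * (b.price - a.price)) / (b.netGex - a.netGex);
      break;
    }
  }
  return { callWall, putWall, gammaFlip };
}
```

If net GEX never changes sign across the supplied grid, `gammaFlip` stays `null`, which signals that the grid should be widened rather than that no regime boundary exists.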

Calibration Rationale

Industry-standard Black-Scholes gamma calculation (Perfiliev/SpotGamma convention).
0–45 DTE options included.
Dealer-short assumption validated by SpotGamma, SqueezeMetrics, FlashAlpha research.

⬡ Core Engine 8

config dom engine-pipeline engine-ev-decision engine-production-context engine-pipeline-readers engine-book-canon engine-shadow-disk-cache

◈ Market Reading 30

engine-order-flow engine-absorption-button-detail engine-absorption-pillar-direction-fix engine-aggressor-streak engine-poc-tracker engine-vwap-context engine-explicit-exports engine-dom-health-probe engine-bias-arbitrator engine-htf-context engine-session-router engine-runner-day-detector engine-setup-watch engine-cross-index engine-ib-anchor engine-ib-day-snapshot engine-ib-session-cache engine-ib-zero-sanitizer engine-whale-tracker engine-oi-tracker engine-market-clock engine-market-events engine-market-episodes engine-market-outcomes engine-market-memory-store engine-market-intelligence engine-market-intelligence-obs engine-mi-signal-panel data-economic-calendar engine-volume-delta-strip

◆ Risk, Gates & Sizing 12

engine-l4-risk-advisor engine-daily-risk-gate engine-position-sizer engine-news-risk engine-cadence-gate engine-asia-chop-bypass engine-reversal-watch engine-setup-classification engine-setup-diagnostics engine-substrate-boundary-guard engine-canonical-engine-state engine-gate-diagnostic

⟁ Signal & Trade Lifecycle 14

engine-signal-page engine-signal-registry signal-active-whisperer signal-decision-trace signal-decision-trace-rowui engine-trade-management engine-rr-confluence engine-stop-hunt-panel engine-session-wrapup engine-overnight-export engine-state-dwell market-narrator post-mortem-vs-trace diagnostics

🧠 NOA — Cognitive Companion 19

engine-noa-cognitive engine-noa-observability engine-noa-cross-market engine-noa-self engine-noa-brooks engine-noa-thesis engine-noa-thesis-pill engine-noa-avatar engine-noa-voice engine-noa-copilot engine-noa-cockpit engine-noa-left-rail engine-noa-chart-companion engine-noa-topbar engine-noa-market-intelligence engine-noa-briefing engine-noa-mute-controls engine-noa-proactive engine-noa-durable-snapshot

🤖 NOA — Trading · Path to Automation 47

engine-noa-context engine-noa-desk engine-noa-trade-companion engine-noa-journal-log engine-paper-trader engine-paper-trader-ui engine-paper-trader-review engine-leading-edge-shadow engine-price-action-state engine-noa-anticipation engine-noa-fusion engine-noa-consensus engine-noa-consensus-book engine-race-book-focus engine-noa-live-demo engine-noa-council-review engine-noa-learning engine-noa-graduation engine-anticipation-layer engine-ant-pa engine-ant-bro engine-ant-of engine-ant-agg engine-ant-cons engine-weighted-consensus shadow-book-persistence engine-fire-provenance engine-anticipation-path-capture engine-bro-path-capture engine-bro-fork-emitter engine-noa-solo engine-precision-book engine-llq-book engine-meta-brain-book engine-meta-brain-scoring engine-meta-brain-selector engine-meta-brain-discipline engine-meta-brain-market-mind engine-meta-brain-pressure engine-meta-brain-observer engine-meta-brain-copilot engine-haircut engine-noa-actions engine-noa-analyst engine-noa-findings-bus engine-noa-analyst-tier1 engine-noa-analyst-tier2 engine-noa-analyst-tier3 engine-noa-bridge engine-noa-audits engine-noa-findings-panel engine-discoveries-chip engine-quarantine-registry engine-decision-event-bus engine-confidence-schema engine-a1-parity-capture

⟳ Calibration & Substrate 7

engine-calibration engine-calibration-archive engine-calibration-banner engine-es-calibration calibration-learning-surface engine-ev-substrate-store engine-market-thesis

◈ Journal 9

engine-journal-analytics engine-journal-enrichment engine-journal-export engine-journal-patterns engine-journal-persistence engine-journal-renderers engine-journal-symbol-filter engine-journal-unified journal-missed-bridge

⚙ Platform · Infra · Ops 27

websocket-bridge engine-bridge-persistence engine-payload-validator engine-dom-availability engine-data-strategy-ui engine-stability-observer engine-module-health engine-durable-store engine-producer-cache engine-utils engine-input-health-log engine-phase4-resume engine-ui-prefs engine-symbol-pause engine-blocked-tracker engine-view-state-audit engine-alerts engine-decision-debug engine-prose-coarsen engine-tutorial-boot engine-tutorial-content engine-tutorial-fixture engine-tutorial-overlay engine-health-check engine-dom-freshness-log engine-right-panels-relocate engine-eod-killswitch

⬢ UI · Chart · Topbar 17

engine-chart engine-dashboard-renderers engine-misc-renderers engine-misc-helpers engine-blocked-compact engine-sbb-stats-strip engine-overlay-activity engine-render-stabilizer engine-today-r-chip engine-decision-clock engine-key-levels-panel engine-mode-chip engine-health-status-chip engine-desk-lanes-collapse engine-hero-meta engine-ghost-overlays engine-clock-pills

One thing to take away.

The engine's job is not to be right about every signal. It fires only when what big players are doing, what kind of day it is, and the discipline around it all agree — and learns faster than the market changes.

Every card above answers one of six questions

01 What does it see?

02 How does it think?

03 How does it commit?

04 How does it improve?

The cross-cutting cards answer a fifth — how does it refuse to lie to itself.

The NOA cards answer a sixth — how does it talk to you without becoming noise.

HOT RUN · PHASE A

BRIDGE LIVE · Quantower

BUILD PHASE · lock lifted 2026-05-22

208 modules · ~150 fields/fire
