Polyvaria

Every large multi-manager converged on the same design: independent research pods hunting alpha in parallel, with a shared data, risk, and execution platform underneath. The design's limit has always been people. There are only so many world-class pods you can field.

Polyvaria removes that limit. Its research pods are agent teams directed by centralized principal-investigator orchestration: priorities and risk budgets flow down, results and attribution flow up. A pod develops its hypotheses from idea through implementation against the shared backtester and submits the survivors for validation. The PI layer moves compute toward the families earning it and retires lines of research that stall. It also keeps the shared record of what has already failed, so a dead hypothesis doesn't get run twice. Adding a pod is a deployment rather than a hire. Research breadth grows with compute, while the gates still decide what earns weight in the one book.

The platform underneath is staffed the same way. The data service, the backtester, risk, optimization, and execution are separate systems, each owned by its own agent team with its own quality rules. The separation is deliberate. A research pod cannot touch the referee that scores it or the data room it reads from, so there is no way to cheat, only to pass. Every result is recomputed from pinned inputs, and the verdict is cryptographically signed with keys no agent holds. That is what makes the breadth safe to scale. The gates don't care whether the work came from an agent or a human.

Research pods scale horizontally; orchestration stays centralized.
Multi-manager breadth, single-book risk discipline.
No result earns weight until the platform has recomputed it.

Research

The research loop.

It starts when the platform notices a weakness in the book: a decaying signal, a corner of the market with no coverage. An agentic principal investigator turns the gap into funded research, pods build candidate signals, and a statistical validation gate judges what they produce. Survivors are deployed and watched, the watching surfaces the next weakness, and humans govern the whole loop without sitting inside it.

01

Universe

Point-in-time by construction.

The investable universe is rebuilt for every historical date, delisted names included, so the backtest never sees a company the live book couldn't have traded. About 2,500 tradable names on a typical day, sixteen years deep.
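
A minimal sketch of what "rebuilt for every historical date" can look like, assuming each security carries a listing date and an optional delisting date. The `Listing` schema and the tickers are hypothetical, not the platform's actual data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Listing:
    """One security's tradable lifetime (hypothetical schema)."""
    ticker: str
    listed: date
    delisted: Optional[date]  # None while the name is still trading


def universe_as_of(listings: list[Listing], as_of: date) -> list[str]:
    """Return the tickers that were tradable on `as_of`.

    Delisted names stay in the table, so a historical date still sees
    every company that was live then, and none that listed later.
    """
    return sorted(
        x.ticker
        for x in listings
        if x.listed <= as_of and (x.delisted is None or as_of < x.delisted)
    )


# Illustrative rows only.
listings = [
    Listing("AAA", date(2010, 1, 4), None),
    Listing("BBB", date(2011, 6, 1), date(2015, 3, 31)),  # later delisted
    Listing("CCC", date(2018, 9, 10), None),              # not yet listed in 2014
]

print(universe_as_of(listings, date(2014, 7, 1)))  # ['AAA', 'BBB']
print(universe_as_of(listings, date(2020, 7, 1)))  # ['AAA', 'CCC']
```

The 2014 query keeps the name that later delisted and excludes the one that had not yet listed, which is the property that removes survivorship bias from a backtest.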

point-in-time universe

02

Alpha Research

45 signals in production.

Research pods work price, fundamental, event, analyst, and news data into cross-sectional rankings of the universe. Fundamentals enter as first filed and news as it arrived, never as later revised. The newest layer is interaction alpha: momentum confirmed by an earnings surprise, surprise confirmed by cash profitability, reversal that skips announcement windows. In testing, these conditioned signals held up where naive combinations failed. Candidates clear durability gates across three market regimes before they earn a weight.
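
The durability gate can be sketched as follows, using the rule given in the methods appendix: same-sign IC with |t| > 1 in each of three regimes. The function names and the per-period IC values are made up for illustration, and the t-statistic here is the plain mean-over-standard-error form, which is an assumption about how the platform computes it.

```python
import math
from statistics import mean, stdev


def ic_t_stat(ics: list[float]) -> float:
    """t-statistic of the mean of a series of per-period information coefficients.

    Assumes at least two observations with non-zero dispersion.
    """
    return mean(ics) / (stdev(ics) / math.sqrt(len(ics)))


def is_durable(ics_by_regime: dict[str, list[float]], min_abs_t: float = 1.0) -> bool:
    """True when every regime's IC t-stat has the same sign and |t| > min_abs_t."""
    t_stats = [ic_t_stat(ics) for ics in ics_by_regime.values()]
    same_sign = all(t > 0 for t in t_stats) or all(t < 0 for t in t_stats)
    return same_sign and all(abs(t) > min_abs_t for t in t_stats)


# Toy IC series, one list per regime (illustrative values only).
steady = {
    "2012-16": [0.03, 0.01, 0.04, 0.02],
    "2017-21": [0.02, 0.03, 0.01, 0.02],
    "2022-26": [0.01, 0.02, 0.03, 0.02],
}
flipped = {**steady, "2022-26": [-0.02, -0.01, -0.03, -0.02]}  # sign reverses
weak = {**steady, "2022-26": [0.05, -0.04, 0.03, -0.03]}       # t-stat near zero

print(is_durable(steady), is_durable(flipped), is_durable(weak))
```

A signal that works in two regimes and reverses or fades in the third fails the screen, regardless of how strong its pooled statistic looks.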

cross-sectional alpha ranks

03

Machine-Learning Overlay

Two horizon models, one unified score.

Gradient-boosted models blend the signal panel into a unified alpha score at two- and three-week horizons. Every model is trained under a pre-registered protocol: features, folds, and acceptance gates are frozen before training starts, and every trial lands in a permanent ledger. The overfitting math never gets to forget an experiment. Refits are walk-forward with purge and embargo against leakage. In live serving, the overlay abstains rather than guesses when its inputs drift.
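
One way to read "abstains rather than guesses" is a drift gate in front of the served score. The sketch below is an assumption throughout: the drift statistic (shift of the live batch mean in standard errors), the `max_z` tolerance, and the function names are all hypothetical and are not taken from the platform.

```python
from statistics import mean
from typing import Optional


def drift_z(live: list[float], train_mean: float, train_std: float) -> float:
    """Shift of the live batch mean from the training mean, measured in
    standard errors for a batch of this size."""
    standard_error = train_std / len(live) ** 0.5
    return abs(mean(live) - train_mean) / standard_error


def serve_score(
    model_score: float,
    live_features: dict[str, list[float]],
    train_stats: dict[str, tuple[float, float]],  # feature -> (mean, std) at training time
    max_z: float = 4.0,  # hypothetical tolerance
) -> Optional[float]:
    """Return the model score, or None (abstain) if any input has drifted."""
    for name, values in live_features.items():
        mu, sigma = train_stats[name]
        if drift_z(values, mu, sigma) > max_z:
            return None  # abstain: inputs no longer resemble the training data
    return model_score


train_stats = {"momentum": (0.0, 1.0)}  # illustrative training statistics

in_range = serve_score(0.7, {"momentum": [0.1, -0.2, 0.3, -0.1]}, train_stats)
drifted = serve_score(0.7, {"momentum": [3.0, 2.5, 3.5, 3.0]}, train_stats)
print(in_range, drifted)  # 0.7 None
```

Returning `None` instead of a number forces the caller to handle the abstention explicitly, so a drifted score cannot flow into the optimizer by accident.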

unified alpha score

04

Factor Risk Model

Barra-grade, estimated daily.

A cross-sectional factor model in the USE4 tradition: market, seven styles (size, value, momentum, leverage, residual volatility, beta, quality), and industries, re-estimated every trading day. The covariance corrects for the ways risk models actually fail: serial correlation in factor returns, the optimizer's habit of betting hardest on the least-well-estimated directions, and volatility regimes that move faster than a trailing window. Signals are winsorized, z-scored within sector, and residualized against this structure. What survives is return the factor model can't explain, and that residual is what the book bets on.
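
The three signal-conditioning steps named above (winsorize, z-score within sector, residualize) can be sketched in a few lines. This is a simplified illustration under stated assumptions: fixed clip bounds, and a one-factor regression standing in for the full factor model. All names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def winsorize(values: list[float], lower: float, upper: float) -> list[float]:
    """Clip each value into [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def zscore_within_sector(values: dict[str, float], sector_of: dict[str, str]) -> dict[str, float]:
    """Standardize each name against the mean and std of its own sector."""
    members = defaultdict(list)
    for name in values:
        members[sector_of[name]].append(name)
    out = {}
    for names in members.values():
        vals = [values[n] for n in names]
        mu, sd = mean(vals), pstdev(vals)
        for n in names:
            out[n] = 0.0 if sd == 0 else (values[n] - mu) / sd
    return out


def residualize(y: list[float], x: list[float]) -> list[float]:
    """Residuals of a one-factor OLS regression of y on x, with intercept."""
    mx, my = mean(x), mean(y)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return [yi - my - beta * (xi - mx) for xi, yi in zip(x, y)]


# Illustrative raw signal, sector map, and a single factor exposure.
raw = {"A": 1.0, "B": 3.0, "C": 50.0, "D": -2.0, "E": 0.0, "F": 2.0}
sector_of = {"A": "tech", "B": "tech", "C": "tech", "D": "energy", "E": "energy", "F": "energy"}
size_exposure = {"A": 0.5, "B": -0.3, "C": 1.2, "D": -1.0, "E": 0.1, "F": 0.4}

names = list(raw)
clipped = dict(zip(names, winsorize([raw[n] for n in names], -10.0, 10.0)))
z = zscore_within_sector(clipped, sector_of)
resid = residualize([z[n] for n in names], [size_exposure[n] for n in names])
print(dict(zip(names, (round(r, 3) for r in resid))))
```

After the last step the residual has zero sample correlation with the factor exposure, which is the sense in which what survives is return the factor cannot explain.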

residual alpha · factor covariance

05

Portfolio Optimizer

Constrained conic QP.

A conic solver maximizes expected return net of risk and transaction cost, subject to long-only, sector, market-cap, single-name, and turnover constraints. Cost and turnover are charged against current holdings, which keeps the trade list stable from one rebalance to the next, and a turnover circuit-breaker sits behind the solve as a second, independent line. Each plan carries a deterministic, content-addressed identifier. Any day's solve replays exactly.
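
A deterministic, content-addressed identifier can be sketched as a hash over a canonical encoding of the plan. The payload fields, the rounding precision, and the choice of SHA-256 are assumptions made for illustration; the platform's actual scheme is not specified here.

```python
import hashlib
import json


def plan_id(target_weights: dict[str, float], as_of: str, inputs_digest: str) -> str:
    """SHA-256 over a canonical JSON encoding of a rebalance plan.

    The same plan content always yields the same identifier, regardless
    of the order in which the weights were assembled.
    """
    payload = {
        "as_of": as_of,
        "inputs": inputs_digest,
        # Fixed precision is a hypothetical canonicalization choice.
        "weights": {k: round(v, 10) for k, v in target_weights.items()},
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative plans: same content in a different key order, then a changed weight.
id_a = plan_id({"AAA": 0.05, "BBB": 0.03}, "2026-07-01", "abc123")
id_b = plan_id({"BBB": 0.03, "AAA": 0.05}, "2026-07-01", "abc123")
id_c = plan_id({"AAA": 0.05, "BBB": 0.04}, "2026-07-01", "abc123")

print(id_a == id_b, id_a == id_c)  # True False
```

Because the identifier is derived from content alone, replaying a day's solve from pinned inputs either reproduces the same identifier or shows immediately that something changed.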

target weights · rebalance plan

06

Execution

Every order knows its urgency.

Each trade in the plan carries an urgency score built from alpha decay, risk reduction, and drift, then damped by liquidity. The score maps to a schedule: patient limit orders where the edge is slow, immediate marketable orders where it is dying. Target price bands and participation caps come from a spread-and-impact cost model. Post-trade TCA feeds realized costs back into the optimizer, so the cost curve the book trades against ends up reflecting what it actually pays.
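
A toy version of the urgency mapping, to make the shape of the rule concrete. The blend weights, the multiplicative liquidity damping, the 0.5 threshold, and the schedule names are all hypothetical; only the inputs (alpha decay, risk reduction, drift, liquidity) come from the description above.

```python
def urgency(alpha_decay: float, risk_reduction: float, drift: float, liquidity: float) -> float:
    """Blend three drivers, each in [0, 1], then damp by liquidity in (0, 1].

    Lower liquidity scales the score down, so illiquid names trade more patiently.
    """
    blended = 0.5 * alpha_decay + 0.3 * risk_reduction + 0.2 * drift  # hypothetical weights
    return blended * liquidity


def schedule(score: float, threshold: float = 0.5) -> str:
    """Map an urgency score to an order schedule."""
    return "marketable" if score >= threshold else "patient_limit"


fast_liquid = urgency(0.9, 0.6, 0.5, liquidity=1.0)    # edge is dying, name is liquid
fast_illiquid = urgency(0.9, 0.6, 0.5, liquidity=0.4)  # same edge, thin name
slow_liquid = urgency(0.1, 0.2, 0.1, liquidity=1.0)    # edge is slow

for score in (fast_liquid, fast_illiquid, slow_liquid):
    print(round(score, 3), schedule(score))
```

The second case shows the damping at work: the same fast-decaying edge is scheduled patiently once the name is too thin to take a marketable order cheaply.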

fills · realized cost

Research Library

Thirteen signal families.

Each family is owned by a research pod that works it for new signals against the platform's backtester and sixteen years of point-in-time history. Published anomalies are assumed to be decaying from the day they were published, and are haircut accordingly. More candidates mean more accidental discoveries, which is exactly why validation sits outside the pod. A candidate stays on the bench until the referee says it clears, and the referee is never its author.

The lake holds 74 datasets backfilled to 2010: vendor feeds, reference tables, risk factors, and the 45 derived signals the book trades on. Scout agents keep it growing against the book's coverage gaps, and datasets that stop earning are retired.

Untrusted code, trusted verdicts.

Agent-written signal code runs in a fail-closed sandbox: no network, no filesystem, no credentials. Only numbers cross the boundary. The platform recomputes every statistic itself and signs the verdict with a key the agents don't hold, so no agent can mint its own acceptance.

Negative results are archival.

Each investigation ends in a findings memo whose verdict is set by the scorer, never the author, and a row in a research log where rejects are first-class. When deeper history exposed a shipped signal as a small-sample artifact, it was deleted from production the same week. The no's are written down so they stay dead.

Execution calibrates on its own fills.

The linear and square-root impact coefficients in the cost model are fitted to the platform's own broker fills and refitted as trade history accumulates. Slippage assumptions track what the book actually pays to trade, and the same fills drive the post-trade TCA that order scheduling is tuned against.
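
Fitting a linear plus square-root impact curve to fills is an ordinary least-squares problem with two regressors. The sketch below assumes the cost is modeled as `linear * p + root * sqrt(p)` with `p` the participation rate; that functional form, the variable names, and the synthetic fills are illustrative assumptions, not the platform's calibration code.

```python
import math


def fit_impact(participation: list[float], cost_bps: list[float]) -> tuple[float, float]:
    """Least-squares fit of cost = linear * p + root * sqrt(p), no intercept.

    Solves the 2x2 normal equations directly.
    """
    x1 = participation
    x2 = [math.sqrt(p) for p in participation]
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    t1 = sum(a * y for a, y in zip(x1, cost_bps))
    t2 = sum(b * y for b, y in zip(x2, cost_bps))
    det = s11 * s22 - s12 * s12
    linear = (t1 * s22 - t2 * s12) / det
    root = (s11 * t2 - s12 * t1) / det
    return linear, root


# Synthetic, noise-free fills generated from known coefficients (20 and 8),
# so the fit can be checked against the truth.
participation = [0.01, 0.02, 0.05, 0.10, 0.20]
cost_bps = [20.0 * p + 8.0 * math.sqrt(p) for p in participation]

linear, root = fit_impact(participation, cost_bps)
print(round(linear, 6), round(root, 6))
```

Refitting as trade history accumulates amounts to calling the same function on a longer list of fills; real fills are noisy, so the recovered coefficients would carry estimation error that this toy data does not show.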

The operators write the platform.

Pipelines, monitors, data ingest, the backtesting harness, and the platform code itself are built and improved by the same agent pods that operate the system day to day; validation stays with gates the builders don't own. There is no separate engineering organization for research to wait on.

EXHIBIT A · FROM THE RESEARCH LOG

signal: insider_grade_consensus
family: Analyst & Sentiment
lifecycle: shipped June 2026 · retired July 2026
research read: IC t-stat 3.04, on a 77-name subset
full history: IC t-stat 0.12 · durable in 0 of 3 regimes
disposition: small-sample artifact · rejected
action: deleted from production · 1,647 partitions dropped
verdict: recomputed from pinned inputs · machine-signed

Mandate: One long-only, unlevered U.S. equity book. No shorts, no derivatives. The mandate is deliberately narrow: every layer of the platform gets to specialize.
Hard constraints: Single names are capped at 5% of the book; one-way turnover is capped at 30% per rebalance, a ceiling the cost penalty keeps the book well under. Sector and market-cap bounds sit beside them, and all of these are constraints inside the solve itself, so a book that violates them can never come out of it. The same limits bound the book's liquidity footprint to what the cost model says can trade without moving prices.
Validation gates: A signal gets live weight only after clearing combinatorial purged cross-validation (folds cut so nothing bleeds back from the future), a deflated Sharpe bar (which rises with every experiment run against the data), a ceiling on the probability of backtest overfitting, and tripwires that plant bait to catch look-ahead. The numbers are net of costs, with commission and impact charged before anything is judged. Results are recomputed from pinned inputs in a sandbox the author can't touch, and the gates are the same for agents and humans.
Fail-safe: Holdings are reconciled against the broker before every rebalance, and any day's decisions replay exactly from pinned inputs. A rebalance that can't reconcile, validate, or solve doesn't trade; the book holds its last valid positions. The human principals can halt the book at any time.
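
To show why the deflated Sharpe bar in the validation gates rises with every experiment, here is a compact sketch of the standard deflated Sharpe ratio calculation. The inputs (a per-period Sharpe of 0.10, 1,250 observations, a cross-trial Sharpe dispersion of 0.03) are invented for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329
_NORMAL = NormalDist()


def expected_max_sharpe(n_trials: int, sharpe_std: float) -> float:
    """Expected best Sharpe among n_trials strategies with no true skill.

    Requires n_trials >= 2. sharpe_std is the dispersion of Sharpe
    estimates across trials, in the same per-period units.
    """
    a = _NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return sharpe_std * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(
    sharpe: float, n_obs: int, n_trials: int, sharpe_std: float,
    skew: float = 0.0, kurtosis: float = 3.0,
) -> float:
    """Probability that the true Sharpe exceeds the multiple-testing benchmark."""
    benchmark = expected_max_sharpe(n_trials, sharpe_std)
    denom = math.sqrt(1.0 - skew * sharpe + (kurtosis - 1.0) / 4.0 * sharpe ** 2)
    return _NORMAL.cdf((sharpe - benchmark) * math.sqrt(n_obs - 1) / denom)


# Same backtest result, judged against 10 trials and then against 1,000.
few = deflated_sharpe(sharpe=0.10, n_obs=1250, n_trials=10, sharpe_std=0.03)
many = deflated_sharpe(sharpe=0.10, n_obs=1250, n_trials=1000, sharpe_std=0.03)

print(few >= 0.95, many >= 0.95)  # True False
```

The identical Sharpe clears a 0.95 bar when ten experiments were run and fails it after a thousand, which is why every trial has to be counted.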

Agent layer: Research pods and subsystem teams, sandboxed with scoped data and compute, under principal-investigator direction. Output reaches production only through the gates.
Optimization: CVXPY with the Clarabel interior-point solver. Conic QP, re-solved daily.
Data lake: Apache Parquet, DuckDB, Polars. Arrow-native end-to-end.
Orchestration: Dagster asset graph. Daily partitions, retry policy, idempotent stages.
Risk: Barra-grade multi-factor model in the USE4 tradition: market, seven styles, and industry factors, with Newey-West, eigenfactor, and volatility-regime adjustments. The differentiated work is what gets residualized against it.
Machine learning: XGBoost over an evidence-curated feature panel; two horizon models, walk-forward refit under combinatorial purged cross-validation.
Verification: Fail-closed kernel sandboxing for agent code; every verdict recomputed from pinned inputs and signed.
Execution: Urgency-tiered order scheduling against a spread-and-impact model; deterministic plan identifiers; broker reconciliation before every rebalance.

Methods appendix: the actual thresholds

cross-validation: combinatorial purged CV, C(6,2) = 15 splits · purge ≥ 21 bars · embargo 10 bars
acceptance: deflated Sharpe probability ≥ 0.95 · probability of backtest overfitting ≤ 0.5
scoring costs: 1 bp commission · next-open fills, one-bar lag · impact charged before judgment
leakage tripwires: future-invariance · planted canary · measured lookback
durability screen: same-sign IC, |t| > 1 in each regime: 2012–16 · 2017–21 · 2022–26
sandbox: no network, read-only root, cleared environment, capabilities dropped · 4 GB / 1,200 CPU-second caps
verdicts: ECDSA P-256 signatures binding signal code, config, and data snapshot hashes
numeric oracle: agreement within rtol 1e-9 against an independent implementation
skeptic panel: three adversarial verifiers, five on dissent · uncertainty defaults to reject
onboarding dry run: 512 MB memory ceiling proven before any backfill
ml overlay: pre-registered protocol · purge 11–16 days, embargo 10–15 by horizon · walk-forward refit
risk model: daily cross-sectional WLS · Newey-West · eigenfactor adjustment · volatility-regime scaling
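
The cross-validation line above, C(6,2) = 15 splits with a 21-bar purge and a 10-bar embargo, can be made concrete with a short generator. This is a sketch of one common convention (purge on both sides of each test block, embargo added after it), which is an assumption; the function name and the 600-bar example are hypothetical.

```python
from itertools import combinations
from typing import Iterator


def cpcv_splits(
    n_bars: int, n_groups: int = 6, n_test_groups: int = 2,
    purge: int = 21, embargo: int = 10,
) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_idx, test_idx) for combinatorial purged cross-validation.

    The bars are cut into n_groups contiguous blocks; every combination of
    n_test_groups blocks serves once as the test set.
    """
    bounds = [(g * n_bars // n_groups, (g + 1) * n_bars // n_groups) for g in range(n_groups)]
    for test_groups in combinations(range(n_groups), n_test_groups):
        blocks = [bounds[g] for g in test_groups]
        test_idx = [i for start, stop in blocks for i in range(start, stop)]
        excluded: set[int] = set()
        for start, stop in blocks:
            # Drop the test bars, `purge` bars before them, and
            # `purge + embargo` bars after them from training.
            excluded.update(range(max(0, start - purge), min(n_bars, stop + purge + embargo)))
        train_idx = [i for i in range(n_bars) if i not in excluded]
        yield train_idx, test_idx


splits = list(cpcv_splits(n_bars=600))
print(len(splits))  # 15

first_train, first_test = splits[0]
print(first_test[0], first_test[-1], first_train[0])  # 0 199 231
```

In the first split the test set covers bars 0 to 199 and training only resumes at bar 231, leaving the 21-bar purge and 10-bar embargo as a gap so nothing bleeds back from the future.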

Polyvaria is in active development and trades proprietary capital only. It is not open to outside investment or inquiries.

A multi-manager research floor, run by agent teams.

Multi-strategy, horizontally scaled.

The pipeline.

Grown, not bought.

How results earn trust.

Where the autonomy stops.

Boring on purpose.