# AwarenessFund — Back-Test Protocol (Spec Sheet)

**Author**: capital-markets-lead (ACG / A-C-Gee)
**Date**: 2026-05-18
**Status**: DRAFT TEMPLATE — REQUIRES EXPERT REVIEW BEFORE PRODUCTION USE
**Audience**: AwarenessFund partners + sister AiCIVs + (eventually) ACG evolution-cycle substrate

---

## §0 — Existing back-test infrastructure audit (ACG repo, 2026-05-18)

| Repo path | What it is | Equity back-test? | Reusable? |
|-----------|------------|-------------------|-----------|
| `opus45-edge-lab/src/backtest/` | Polymarket prediction-market edge-tracker (TrackedEdge / ResolutionOutcome / EdgeTrackingStatus lifecycle) | **NO** | **NO** — different domain (binary-outcome markets) and different data model (event-resolution, not price-series) |
| `opus45-edge-lab/tools/run_backtest.py` | CLI for edge resolution | NO | NO |
| `projects/aiciv-jared/.claude/agents/trading-strategist.md` | Agent manifest (Solana strategy) | NO | NO |
| `projects/selah-clone/.claude/agents/trading-strategist.md` | Agent manifest (crypto strategy) | NO | NO |
| `projects/witness-sim/agents/trading-strategist.md` | Agent manifest (sister-civ ref) | NO | NO |
| `memories/agents/sol-dev/asv-*-20260108.md` | SOL/USDC auto-trader design docs | NO | NO |
| `memories/agents/chart-analyzer/research-guide.md` | TA research notes | NO | NO |

**Audit verdict**: **ZERO equity portfolio back-test capability exists.** AwarenessFund back-test is greenfield. Spec below designs from scratch.

**Implication for build**: The back-test harness itself is a buildable artifact. Suggested home: `projects/awareness-fund/backtest/` (new subdir, infra-lead can scaffold; mind-lead writes the engine; capital-markets-lead provides protocol + acceptance tests).

---

## §1 — Goals of the back-test

1. **Validate the picks-and-shovels thesis** against the past 24 months (2024-05-18 → 2026-05-18) — show outperformance (or not) vs S&P 500, equal-weight S&P, NASDAQ-100, and sector ETFs
2. **Test 5–10 ingestion variations** (research-lead-defined) — discover which weighting / rebalance / risk-overlay produces best risk-adjusted return
3. **Decompose by vertical** — how much of return is Mining vs Energy vs Chip-fab vs AI-endpoint?
4. **Produce ACG-evolution substrate** — back-test outcomes feed ACG's nightly evolution cycle (learnings about thesis-construction, signal-quality, false-positives)
5. **Deliver investor-grade audit trail** — every signal, trade, rebalance reproducible from source data + protocol-spec

---

## §2 — Time-window + data-window definitions

| Parameter | Value | Why |
|-----------|-------|-----|
| Back-test window | 2024-05-18 → 2026-05-18 (24 months) | Per Corey's directive; captures full AI capex acceleration cycle |
| Warm-up window | 2023-05-18 → 2024-05-17 (12 months prior) | For trailing-12m volatility, beta, momentum signals before back-test start |
| Total data window required | 2023-05-18 → 2026-05-18 (36 months) | Free-tier yfinance covers; Polygon free tier covers (2y limit is tight — escalate if extending) |
| Bar frequency | Daily adjusted close (for return), daily OHLC (for risk metrics) | Free-tier compatible; weekly resampling possible |
| Trading-day calendar | NYSE calendar (pandas-market-calendars) | Foreign-listed tickers normalized to NYSE-day for cross-comparison |

---

## §3 — Data sourcing strategy (FREE-TIER FIRST)

### §3.1 Price data

| Source | Cost | Coverage | Use for |
|--------|------|----------|---------|
| **yfinance** (Yahoo wrapper) | Free | OHLC adjusted-close, dividends, splits, all 60 tickers + foreign | Primary price source v1 |
| **Polygon free tier** | Free, 5 req/min, 2y history | EOD US equities | Secondary / yfinance cross-check |
| **FRED** | Free, API key | Commodity prices (PCOPPUSDM, PURANUSDM), FX, rates | Macro overlays + factor returns |

**Cross-source validation rule**: For any ticker that materially drives back-test outcome (top-10 P&L contributors), back-test must pass yfinance↔Polygon EOD price reconciliation (tolerance: 1bp daily). Catches Yahoo's known dividend-adjustment errors.

### §3.2 Fundamentals

| Source | Cost | Coverage | Use for |
|--------|------|----------|---------|
| **SEC EDGAR** | Free, no key, polite-rate-limited | 10-K, 10-Q, 8-K, capex, segment revenue, US-listers only | Quarterly fundamentals for US tickers |
| **FMP free tier** | Free, 250 calls/day | Income statement, balance sheet, cash flow subset | Quarterly fundamentals supplement; **tight** quota (250 calls / 60 tickers ≈ 4 calls/ticker/day — careful sequencing required) |
| **Finnhub free** | Free, 60 req/min | Basic fundamentals + news sentiment | Sentiment overlay feature |
| **Yahoo Finance HTML** | Free, scrape-able | Foreign-ticker fundamentals (.T, .AX listings) | Last-resort for non-EDGAR filers |

### §3.3 Benchmark data

| Series | Source | Cost |
|--------|--------|------|
| S&P 500 (SPX) total return | yfinance: `^GSPC` (price) + `^SP500TR` (TR) | Free |
| Equal-weight S&P 500 | yfinance: `RSP` ETF (proxy) | Free |
| NASDAQ-100 | yfinance: `^NDX` (price) + `QQQ` (TR) | Free |
| Sector ETFs (XLU, XLE, XLB, XLK) | yfinance | Free |
| Risk-free rate | FRED `DGS3MO` (3-month Treasury) | Free |

### §3.4 What we DON'T have on free tier (explicit gaps)

- Real-time intraday data (don't need for daily back-test)
- Earnings transcripts (Finnhub free does NOT include; FMP Premium would)
- Institutional consensus estimates (FactSet/IBES — out of scope, ~$4K/mo)
- Alternative data (web traffic, satellite, credit-card) — out of scope
- Survivorship-bias-corrected ticker universe (would need CRSP — $500+/mo academic) — **MITIGATION**: include known delistings manually; flag survivorship caveat in every back-test output

---

## §4 — Portfolio construction rules

### §4.1 Universe gate (applied each rebalance date)

Universe = the 60-ticker AwarenessFund candidate set (see `ticker-universe-spec.md`).

Tradability filters applied each rebalance:
1. Listed on rebalance date (handles IPOs like ARM mid-window)
2. 60-day median ADV (average daily $-volume) ≥ $10M (institutional-tradable)
3. Bid-ask spread proxy: (high-low)/close ≤ 5% on rebalance date (illiquid-day filter)
4. Not currently halted (SMCI 2024 episode → research-lead's variations should test "halt-handling" rules)

### §4.2 Position sizing (variation-dependent — research-lead defines variations)

Suggested back-test variations to RUN (research-lead may add more):

| Variation | Description | Hypothesis being tested |
|-----------|-------------|--------------------------|
| **V1 — Equal-weight** | 1/60 each, rebalanced monthly | Naïve "buy the whole thesis basket" baseline |
| **V2 — Equal-weight by vertical** | 25% / 25% / 25% / 25% across 4 verticals, equal-weight within | Vertical-level diversification matters |
| **V3 — Market-cap-weight** | Proportional to free-float market cap | Tilts toward the obvious large-cap names |
| **V4 — Inverse-volatility** | Position size ∝ 1/(60-day realized vol) | Risk-parity overlay |
| **V5 — Momentum-tilt (12-1)** | Tilt toward top trailing-12m-excl-most-recent-month winners | Cross-sectional momentum |
| **V6 — Low-volatility** | Top-30 lowest-vol of 60 universe, equal-weight | Defensive overlay |
| **V7 — Quality-tilt** | Top-30 by ROIC (FMP free where available), equal-weight | Quality factor inside thesis |
| **V8 — Concentrated 4-by-4** | 4 best per vertical (16 names) by composite score | Concentration premium |
| **V9 — Picks-and-shovels purity score** | Higher weight to monopolists (ASML, KLAC, NEE) | Bet-on-the-moats variation |
| **V10 — Cap-weighted, sector-neutral** | Cap-weighted within vertical, equal-weighted across verticals | Hybrid V2 + V3 |

**Interface contract with research-lead**: research-lead's `ingestion-variations-spec.md` defines the canonical V1–Vn list and the signal/weighting recipe per variation. This document's V1–V10 are *placeholders*; final list comes from research-lead. Capital-markets back-test protocol is variation-agnostic — each variation = one weighting function `w(t, ticker) → weight`.

### §4.3 Rebalancing

| Parameter | Default | Tested in variations? |
|-----------|---------|------------------------|
| Frequency | Monthly (last trading day) | YES — research-lead should also test quarterly + weekly |
| Cash buffer | 1% (transaction costs + slippage reserve) | NO |
| Tax handling | Pre-tax returns only | NO (would require lot-level tracking — out of scope v1) |
| Rebalance bands | None v1 (full rebalance to target) | Optional — 5% drift bands as separate variation |

### §4.4 Transaction cost + slippage assumptions

| Assumption | Value | Source / rationale |
|------------|-------|---------------------|
| Commission | 0 bps | Modern brokerage reality (Schwab/IBKR/Fidelity zero-commission) |
| Bid-ask half-spread (slippage) | 5 bps liquid / 15 bps small-cap (AEHR, ONTO, SYM, etc.) | Conservative; calibrate from realized spreads in Polygon if upgraded |
| Market-impact cost | +2 bps per 1% of ADV traded (Almgren-Chriss simple linear) | For $10M+ fund AUM stress test |
| Foreign-listed extra cost | +10 bps (currency conversion + spread) for .T, .AX | Conservative |

**Stress variation**: re-run V1 with 30 bps round-trip cost — catches over-trading sensitivity.

### §4.5 Risk overlays (variation-overlays research-lead may compose with above)

| Overlay | Rule |
|---------|------|
| Max single-name weight | 8% (caps concentration) |
| Max single-vertical weight | 40% (forces some diversification) |
| Stop-loss | None v1 (no intra-month exits) |
| Drawdown circuit | If portfolio 30-day return < −15%, halve gross exposure for next month (test as separate variation) |

---

## §5 — Risk metrics captured (per variation, per benchmark)

### §5.1 Return metrics

| Metric | Definition | Why |
|--------|------------|-----|
| Total return | Cumulative geometric return, 24mo | Headline |
| CAGR | Annualized return | Comparable to fund-fact-sheet language |
| Monthly returns array | For statistical tests | Drives all other metrics |
| Win rate | % of months positive | Behavioral durability |
| Best / worst month | Tail behavior | Investor-conversation surface |

### §5.2 Risk metrics

| Metric | Definition | Why |
|--------|------------|-----|
| Annualized volatility | σ(monthly_returns) × √12 | Standard |
| Max drawdown | Peak-to-trough geometric loss | Pain-tolerance |
| Drawdown duration | Days from peak to recovery | "Time-to-heal" |
| Downside deviation | σ of returns below 0 | For Sortino |
| Beta vs S&P 500 | OLS regression slope, monthly | Market sensitivity |
| Correlation vs S&P 500 | ρ(strategy, SPX) | Diversification value |

### §5.3 Risk-adjusted metrics

| Metric | Definition | Why |
|--------|------------|-----|
| **Sharpe ratio** | (CAGR − Rf) / σ | Headline RA |
| **Sortino ratio** | (CAGR − Rf) / downside_dev | Asymmetric RA |
| **Information ratio** | (CAGR − S&P_CAGR) / TE | Active-mgmt fitness |
| Tracking error | σ(strategy − S&P) | Active risk |
| Calmar ratio | CAGR / max_drawdown | DD-adjusted RA |
| Alpha (Jensen) | OLS-intercept vs S&P | Single-factor alpha |
| **Multi-factor alpha** | OLS-intercept vs Fama-French 3 + Momentum (factors from K. French data library) | True picks-and-shovels alpha after controlling for factor exposure |

### §5.4 Attribution metrics (decomposition)

| Metric | Definition | Why |
|--------|------------|-----|
| Vertical contribution | Σ(weight × return) per vertical | "What drove the return?" |
| Top-10 contributors / detractors | Names sorted by $-P&L | Investor-conversation surface |
| Turnover | Σ|Δweight| / 2 annualized | Cost-justification |
| Concentration | Sum of top-5 weights | Diversification check |
| Active share vs S&P | 0.5 × Σ |w_strategy − w_SPX| | Bona-fide-active proof |

### §5.5 Statistical-significance tests

| Test | Use |
|------|-----|
| t-test on monthly alpha vs zero | Is alpha distinguishable from zero? |
| t-test on monthly excess return vs S&P | Is outperformance statistically real or noise? |
| Bootstrap of CAGR distribution (1000 resamples) | Confidence-interval on point estimate |
| Sharpe-ratio standard error (Lo 2002 formula) | Significance of Sharpe vs benchmark Sharpe |

**Honest disclaimer**: 24 months = 24 monthly observations. Statistical power is **low**. Significance tests are reported as substrate for partner discussion, **not** as inference-grade evidence. Longer back-test window (5y+) is upgrade path with Polygon paid.

---

## §6 — Survivorship + look-ahead bias controls

| Risk | Mitigation in v1 (free-tier) | Upgrade-path mitigation |
|------|-------------------------------|--------------------------|
| Survivorship bias | Manually include delistings during window (research-lead's job to enumerate) | CRSP database ($500+/mo academic) gives clean point-in-time universe |
| Look-ahead bias on fundamentals | Use `filing_date` (not `period_end`) for any fundamental signal; lag 1 day | FactSet/FMP Premium have explicit point-in-time data feeds |
| Look-ahead bias on ETF inclusions | Use ETF-constituent snapshots ≤ rebalance date only | If unavailable, use today's constituents and flag caveat |
| Selection bias on ticker universe | Universe was curated 2026-05-18 with hindsight — **explicit caveat in every output** | Pre-register universe before next 24mo back-test window |
| Data revisions | yfinance can revise; pin a snapshot date per back-test run | Use bitemporal data store (DuckDB + revision tracking) |

---

## §7 — Reproducibility + audit-trail requirements

Every back-test run MUST produce:

1. **Manifest file**: variation ID, universe-snapshot timestamp, data-source versions, parameter dict
2. **Trade log**: every rebalance, every trade (ticker, direction, $-amount, executed-price-assumption, cost)
3. **Daily NAV series**: portfolio value EOD for entire window
4. **Position log**: weights per ticker per rebalance date
5. **Source-data hash**: SHA256 of each underlying price/fundamental dataset used
6. **Code-version hash**: git commit of back-test engine that produced the output
7. **Comparison metrics**: full §5 metric set for strategy + each benchmark

Output format: parquet for time-series, JSON manifest, markdown report for human-read. Stored under `projects/awareness-fund/backtest-runs/YYYY-MM-DD-runID/`.

---

## §8 — Build sequencing (proposed)

| Phase | Owner | Deliverable | Days (est) |
|-------|-------|-------------|------------|
| 0 — Spec freeze | capital-markets-lead + research-lead | This doc + ingestion-variations-spec aligned | 1 |
| 1 — Data layer | mind-lead | Free-tier price + fundamental ingest pipeline → parquet | 2–3 |
| 2 — Engine | mind-lead | Portfolio simulator: weights → trades → NAV | 2–3 |
| 3 — Variation harness | mind-lead + capital-markets-lead | V1–V10 (or research-lead's final list) all runnable from CLI | 1–2 |
| 4 — Metrics + reports | capital-markets-lead | §5 metric set computed; markdown reports | 1 |
| 5 — Partner review | All | Walk-through with Corey + Russell + Apex/Jordana | 1 |
| 6 — Iteration | All | Refine variations, add overlays per feedback | ongoing |

Total v1 dev estimate: **8–11 active days**, all on free-tier data.

---

## §9 — Acceptance criteria for v1

The back-test is "v1-complete" when:

1. ✅ All 10 (or research-lead's final) variations run end-to-end on free-tier data
2. ✅ All variations reproduce identical results from a clean re-run (deterministic seeding)
3. ✅ All §5 metrics computed and present in output report
4. ✅ Source-data hashes + code-version hashes embedded in every run
5. ✅ Survivorship-bias caveat explicit in every output
6. ✅ "DRAFT TEMPLATE — REQUIRES EXPERT REVIEW BEFORE PRODUCTION USE" disclaimer on every artifact
7. ✅ Partner-readable markdown report generated automatically
8. ✅ A blind expert-reviewer (e.g., a portfolio manager friend of Russell's) can read the protocol + report and reproduce the headline number ±10bps

---

## §10 — Cross-coordination items for research-lead

Capital-markets back-test protocol is **variation-agnostic**. We need research-lead to deliver:

1. **Final ingestion-variation list** (replace this doc's placeholder V1–V10 with research-lead's canonical variations)
2. **Per-variation signal spec** — exact formula for `w(t, ticker)` per variation
3. **Hypothesis ledger** — what each variation is *testing* (mapped to a falsifiable thesis-claim)
4. **PPO / competing-hypothesis framing** — which variations are explicitly designed as adversarial competitors to each other
5. **Survivorship-corrected ticker enumeration** — which tickers were listed but exited during 2024-05-18 → 2026-05-18 in the 4 verticals (delistings, M&A, bankruptcies)

When research-lead's `ingestion-variations-spec.md` lands at `projects/awareness-fund/research/ingestion-variations-spec.md`, the back-test §4.2 variation list is **superseded** by research-lead's canonical list.

---

## §11 — Out of scope for v1 (explicit non-goals)

- **Live trading** (analytics only; AwarenessFund is fund-strategy substrate, not auto-execution)
- **Options / derivatives** (no premium data on free tier)
- **Intraday signals** (daily-bar only)
- **Multi-currency portfolio P&L** (USD-only for v1; foreign-listed → currency-hedged-USD assumption)
- **Tax-lot accounting** (pre-tax returns only)
- **Capacity analysis** (how much $AUM the strategy could hold without alpha decay)
- **ESG screens / negative screens** (out of scope; partner-defined if needed)

---

**Status**: SPEC v1 — ready for research-lead alignment when ingestion-variations spec lands.

**Risk-disclosure boilerplate to include in every back-test output**:
> *Past performance does not predict future results. Back-tests are hypothetical, derived from historical price data, and do not reflect actual trading, transaction costs in practice, taxes, or fund-management fees. Survivorship bias, selection bias, and look-ahead bias may be material despite mitigation. AwarenessFund partners and reviewers should treat back-test outputs as analytical substrate for discussion, NOT as a predictive forecast or investment recommendation.*