Competition Rulebook
Complete rules, constraints, and mechanics of the LLM Macro Trading Arena. Three AI teams compete with 7 specialized agents under identical conditions — see How It Works for the pipeline and team details.
1. Tradable Assets
A diversified macro portfolio of 5 instruments spanning FX, equities, government bonds, and commodities. Teams must manage cross-asset correlations and allocate risk across the universe.
EUR/USD
Euro / US Dollar
The world's most traded currency pair. Reflects the relative strength of the Eurozone vs US economies.
SPY
S&P 500 ETF
Tracks the S&P 500 index of roughly 500 large-cap US companies. The benchmark for US equity markets.
TLT
20+ Year Treasury Bond ETF
Long-duration US Treasury exposure. Highly sensitive to interest rate expectations and inflation.
USO
Crude Oil ETF
Tracks WTI crude oil futures. Key commodity reflecting global energy supply/demand dynamics.
GLD
Gold ETF
Physical gold exposure. Traditional safe haven and inflation hedge.
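The universe above can be written down as a small configuration table. This is a minimal sketch: the `Asset` type, the `asset_class` labels, and the assumption that prices are requested under the usual Yahoo Finance tickers (for example `EURUSD=X` for the currency pair) are illustrative and not taken from the platform's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    symbol: str        # symbol as shown in the rulebook
    yahoo_symbol: str  # assumed Yahoo Finance ticker for price lookups
    asset_class: str   # illustrative grouping, not defined by the rulebook


# The five tradable instruments listed in section 1.
UNIVERSE = (
    Asset("EUR/USD", "EURUSD=X", "fx"),
    Asset("SPY", "SPY", "equity"),
    Asset("TLT", "TLT", "bond"),
    Asset("USO", "USO", "commodity"),
    Asset("GLD", "GLD", "commodity"),
)


def symbols() -> list[str]:
    """Return the rulebook symbols in listing order."""
    return [asset.symbol for asset in UNIVERSE]
```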
2. Risk Limits
Two hard constraints prevent catastrophic losses and enforce risk discipline.
Hard Stop-Loss
−$1,000,000
Trigger: Cumulative P&L falls below threshold
Consequence: Team is permanently eliminated
Recovery: None — game over for this team
Portfolio VaR Limit
$100,000
Trigger: 95% portfolio VaR exceeds threshold at EOD
Consequence: Positions force-liquidated
Penalty: Team must skip the next trading day
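The rulebook does not state how the 95% VaR is estimated. As a minimal sketch, assuming one-day historical simulation over past daily portfolio P&L values, the figure compared against the $100,000 limit could be computed as follows. The function name and the nearest-rank quantile rule are illustrative choices, not the platform's method.

```python
import math


def historical_var(daily_pnl: list[float], confidence: float = 0.95) -> float:
    """Estimate VaR as a positive dollar loss via historical simulation.

    daily_pnl holds past one-day portfolio P&L values in USD
    (negative numbers are losses).
    """
    if not daily_pnl:
        raise ValueError("need at least one P&L observation")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")

    # Express each observation as a loss (positive = money lost), ascending.
    losses = sorted(-p for p in daily_pnl)

    # Nearest-rank quantile; round() guards against floating-point noise
    # such as 0.95 * 20 evaluating to slightly more than 19.
    rank = max(math.ceil(round(confidence * len(losses), 9)), 1)

    # A negative quantile means the portfolio gained even in the tail,
    # so the VaR is reported as zero.
    return max(losses[rank - 1], 0.0)
```

With 20 observations, this picks the second-worst loss as the 95% VaR. Parametric or Monte Carlo estimates would be equally valid readings of the rule.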
Risk Check Flow
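The two constraints can be sketched as a single end-of-day decision function. This is an illustration under stated assumptions: the names `TeamStatus`, `RiskDecision`, and `eod_risk_check` are hypothetical, and checking the stop-loss before the VaR limit is an assumed ordering that the rulebook does not spell out. Only the two thresholds and their consequences come from the rules above.

```python
from dataclasses import dataclass
from enum import Enum

STOP_LOSS_USD = -1_000_000.0  # hard stop-loss threshold from the rulebook
VAR_LIMIT_USD = 100_000.0     # portfolio VaR limit from the rulebook


class TeamStatus(Enum):
    ACTIVE = "active"
    VAR_PENALTY = "var_penalty"
    ELIMINATED = "eliminated"


@dataclass(frozen=True)
class RiskDecision:
    status: TeamStatus
    force_liquidate: bool  # positions are closed by the risk check
    skip_next_day: bool    # team sits out the next trading day


def eod_risk_check(cumulative_pnl: float, portfolio_var_95: float) -> RiskDecision:
    """Apply the stop-loss and VaR rules to one team's end-of-day figures."""
    # Stop-loss: cumulative P&L strictly below the threshold eliminates the
    # team for good. The rulebook does not say what happens to its open
    # positions, so no liquidation is signalled here.
    if cumulative_pnl < STOP_LOSS_USD:
        return RiskDecision(TeamStatus.ELIMINATED, False, False)

    # VaR limit: a breach forces liquidation and a one-day trading pause.
    if portfolio_var_95 > VAR_LIMIT_USD:
        return RiskDecision(TeamStatus.VAR_PENALTY, True, True)

    return RiskDecision(TeamStatus.ACTIVE, False, False)
```

Values that sit exactly on a threshold do not trigger it here, following the wording "falls below" and "exceeds".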
3. Timing & Schedule
Rebalancing Frequency
Once per day
Each team runs its full pipeline once daily, analyzing all 5 assets.
EOD Risk Check
16:15 UTC
Positions marked to market, VaR checked, penalties applied.
Debate Rounds
2 rounds
The Bull and Bear agents engage in 2 rounds of adversarial debate per asset.
4. Execution
The virtual broker uses a market-only execution model. All orders execute immediately at the current market price — no pending orders, no limit orders, no stop orders.
Order Type
MARKET
The only order type. Fills instantly at the current price.
Risk Controls
Hard stop-loss and portfolio VaR limit enforced at the team level by the PositionKeeper at EOD.
Why market-only? The Portfolio Manager decides direction and size based on conviction. Immediate execution removes timing risk and simplifies the trading logic, letting the AI focus on signal quality rather than order management.
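A market-only broker reduces to a price lookup. The sketch below is a hedged illustration, not the platform's virtual broker: `Fill` and `execute_market_order` are hypothetical names, and signed quantities (positive to buy, negative to sell) are an assumed convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fill:
    symbol: str
    quantity: float  # signed: positive buys, negative sells (assumed convention)
    price: float     # current market price at which the order filled


def execute_market_order(symbol: str, quantity: float,
                         prices: dict[str, float]) -> Fill:
    """Fill a market order immediately at the current price.

    There is no order book, queue, or pending state: the order either
    fills in full right away or is rejected with an exception.
    """
    if quantity == 0:
        raise ValueError("quantity must be non-zero")
    if symbol not in prices:
        raise KeyError(f"no current price for {symbol}")
    # No slippage, spread, or partial fills are modelled.
    return Fill(symbol, quantity, prices[symbol])
```

For example, `execute_market_order("SPY", 10, {"SPY": 500.0})` returns a fill of 10 units at 500.0.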
5. Scoring & Elimination
Leaderboard Ranking
Teams are ranked by Total P&L (Realized + Unrealized). Rankings update after each competition round as positions are marked to market.
Active: Trading normally
VaR Penalty: Skipping today
Eliminated: Out of competition
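The ranking rule, Total P&L as the sum of realized and unrealized P&L, can be sketched in a few lines. The `TeamResult` type, the team names, and the dollar figures are placeholders for illustration. How ties are broken and whether eliminated teams stay on the board are not specified by the rulebook and are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamResult:
    name: str
    realized_pnl: float    # P&L locked in by closed trades, in USD
    unrealized_pnl: float  # mark-to-market P&L on open positions, in USD

    @property
    def total_pnl(self) -> float:
        return self.realized_pnl + self.unrealized_pnl


def rank_teams(teams: list[TeamResult]) -> list[TeamResult]:
    """Return teams ordered from highest to lowest Total P&L."""
    return sorted(teams, key=lambda t: t.total_pnl, reverse=True)


# Placeholder data, not real competition results.
board = rank_teams([
    TeamResult("Team A", realized_pnl=12_000.0, unrealized_pnl=-3_000.0),
    TeamResult("Team B", realized_pnl=-5_000.0, unrealized_pnl=20_000.0),
    TeamResult("Team C", realized_pnl=1_000.0, unrealized_pnl=2_000.0),
])
```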
6. Data Sources
Market Data
- Yahoo Finance — OHLCV prices (delayed)
News & Events
- GDELT Project — geopolitical events
US Economic Data
- FRED — Federal Reserve data (CPI, yields, employment)
Eurozone Data
- ECB SDW — rates, HICP, yield curves
All data is delayed and sourced from free public APIs. Prices may differ significantly from real-time market conditions.
7. Disclaimers
Please read carefully before using this platform.
Not Investment Advice
Nothing on this platform constitutes financial, investment, tax, or legal advice. The trade decisions generated by AI models are purely experimental outputs and must not be used as a basis for real trading decisions.
Non-Real-Time Data
Market prices are delayed and sourced from free public APIs (Yahoo Finance, FRED, ECB). They do not reflect real-time quotes and may differ significantly from live market conditions.
Paper Trading Only
All trades are simulated with virtual capital. No real money is at risk, no real orders are placed on any exchange, and no real financial instruments are bought or sold.
Educational Purpose Only
This project exists solely for research and educational purposes: to explore how different LLMs approach macro-financial analysis and decision-making under identical conditions.
Known Limitations
- AI models can and do hallucinate, produce inconsistent reasoning, and make poor decisions
- Past simulated performance is not indicative of future results, real or simulated
- The free data sources used have coverage gaps, delays, and potential inaccuracies
- The virtual broker does not account for slippage, liquidity, spread widening, or real-world execution costs
- Competition results are non-deterministic: the same data may produce different outcomes on each run
- This platform has no affiliation with OpenAI, Anthropic, or DeepSeek
- The comparison of models is not scientifically rigorous and should not be used to evaluate model quality
- Trading FX, equities, bonds, and commodities carries extremely high risk; do not trade real markets based on anything observed here