Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?
Alpha Opportunity
Alpha Thesis
We believe the Manifold contract for AI achieving >85% on FrontierMath before 2028 is overvalued at 69%; our estimate is 35%. GPT-5.4 currently leads at 47.6% overall and 50% on Tiers 1-3. Progress has been remarkable (from <2% to 47.6% in ~18 months), but reaching 85% requires nearly doubling current performance, including on the hardest Tier 4 research-level problems, where the success rate currently stands at 38%. The rate of improvement is decelerating as the remaining problems get harder.
Key Findings
- Progress Has Been Remarkable But Decelerating: Scores rose from <2% to 47.6% in 18 months, but gains from 50% to 85% are historically harder than gains from 0% to 50%.
- Tier 4 Is the Bottleneck: Research-level problems, currently at a 38% success rate, require genuine mathematical creativity that current LLMs struggle with.
- Benchmark Saturation Pattern: AI benchmarks typically show the same pattern of rapid initial gains followed by an asymptotic slowdown. We expect FrontierMath to follow it.
- 2 Years Is Short: Between March 2026 and January 2028 (about 22 months, or roughly 2 model generations), AI must nearly double its FrontierMath performance.
- 42% of Tier 4 Solved At Least Once: This figure counts a problem as solved if any attempt succeeded; a consistent >85% score requires solving these problems reliably, not just occasionally.
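The size of the gap described in these findings can be made concrete with a little arithmetic. The sketch below is illustrative only: it uses the figures quoted above, treats "<2%" as exactly 2%, assumes a 22-month window (March 2026 through January 2028), and assumes linear progress purely for comparison. The constant and function names are hypothetical.

```python
# Figures quoted in the text; month counts are approximations (assumptions).
START_SCORE = 2.0        # percent, upper bound on the early score ("<2%")
CURRENT_SCORE = 47.6     # percent, current leading overall score
TARGET_SCORE = 85.0      # percent, resolution threshold
MONTHS_ELAPSED = 18      # "~18 months" of progress so far
MONTHS_REMAINING = 22    # March 2026 through January 2028 (assumed window)


def pace_summary():
    """Return (required multiple, historical pace, required pace).

    Both paces are straight-line averages in percentage points per month.
    """
    multiple = TARGET_SCORE / CURRENT_SCORE
    historical_pace = (CURRENT_SCORE - START_SCORE) / MONTHS_ELAPSED
    required_pace = (TARGET_SCORE - CURRENT_SCORE) / MONTHS_REMAINING
    return multiple, historical_pace, required_pace


multiple, historical_pace, required_pace = pace_summary()
print(f"required multiple of current score: {multiple:.2f}x")
print(f"historical average pace: {historical_pace:.2f} points/month")
print(f"required average pace:   {required_pace:.2f} points/month")
```

On this straight-line reading the required average pace (about 1.7 points per month) is below the historical average (about 2.5), which is why the deceleration and Tier 4 arguments carry the weight of the thesis rather than the raw size of the gap.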
Human Bias Detected
Cognitive biases creating this alpha opportunity
The crowd may lack the specialized knowledge needed to narrow the true probability range.
Position Sizing
Kelly Criterion (per $1,000 bankroll)
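As a rough illustration of how a Kelly stake could be derived from the numbers in the thesis, the sketch below applies the standard Kelly formula for a binary share that costs `price` and pays 1 on a win, which reduces to (win probability − price) / (1 − price). It assumes the position is taken on NO (since the thesis is that YES is overpriced) and ignores fees, slippage, and uncertainty in the 35% estimate itself. The function name and the half-Kelly line are illustrative choices, not figures from the original analysis.

```python
def kelly_fraction_binary(win_prob: float, price: float) -> float:
    """Kelly fraction of bankroll for a binary share costing `price` that pays 1 on a win.

    Assumes no fees or slippage and that win_prob is known exactly.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    edge = win_prob - price
    # A non-positive edge means no bet on this side of the market.
    return max(0.0, edge / (1.0 - price))


MARKET_YES = 0.69      # market-implied probability quoted in the text
ESTIMATE_YES = 0.35    # the estimate quoted in the text
BANKROLL = 1_000.0     # dollars

# The thesis is that YES is overpriced, so the position is on NO.
no_price = 1.0 - MARKET_YES
no_prob = 1.0 - ESTIMATE_YES

full_kelly = kelly_fraction_binary(no_prob, no_price)
half_kelly = full_kelly / 2  # fractional Kelly, a common hedge against estimate error

print(f"full Kelly: {full_kelly:.1%} of bankroll (${full_kelly * BANKROLL:,.2f})")
print(f"half Kelly: {half_kelly:.1%} of bankroll (${half_kelly * BANKROLL:,.2f})")
```

With these inputs full Kelly comes to roughly 49% of bankroll, or about $493 per $1,000. That figure is highly sensitive to the 35% estimate, so a fractional stake is the more cautious reading.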