• Playlist
  • Seattle Startup Toolkit
  • Portfolio
  • About
  • Job Board
  • Blog
  • Token Talk
  • News
Menu

Ascend.vc

  • Playlist
  • Seattle Startup Toolkit
  • Portfolio
  • About
  • Job Board
  • Blog
  • Token Talk
  • News

Token Talk 28: Proof Is in the Prompt

August 8, 2025

For decades, mathematicians thought their jobs were safe from the AI boom. Sure, AlphaGo humbled the Go grandmasters, and protein folders won Nobel prizes, but creative proofs just felt like a human sanctuary. Last week that illusion evaporated faster than an integral under a well-placed substitution.

Google DeepMind’s Gemini Deep Think and an experimental OpenAI model walked out of the 66th International Mathematical Olympiad (IMO) with gold-level scores (35/42 points or five of six problems solved). 

The news shocked Polymarket when its “will AI win the IMO gold medal in 2025?” market odds spiked to 67% “yes” on July 19th only to crash down to 14% “yes” a week later. 

Turns out the challenge only resolves when the models are open sourced and results fully reproducible, a bar neither Google nor OpenAI cleared. Fine, I suppose we have to wait until the 67th IMO next July in Shanghai for China to open source math superintelligence.

Still, there is no doubt, AIs are getting better at math. In just two years we’ve gone from ChatGPT failing basic math to this OpenAI and Gemini victory. All made possible because of longer context windows, reasoning, and reinforcement learning.

The wild part is OpenAI’s model wasn’t even optimized for geometry or number theory. It simply burned gargantuan test-time compute, thinking for the entire 4.5-hour window every human contestant faces until a clean proof survived peer review by three former IMO champions. Google’s gold run is estimated at $1.5 million of TPU time. OpenAI's costs are unknown.  

The two labs are now arguing about who announced first and who used the official rubric. Basically: whose gold is shinier. That drama on X is fun, but the scoreboard reads Machines 2 – Humans 0.

Additionally, IMO President Prof. Dr. Gregor Dolinar said the IMO graders found the AIs’ answers to be clear, precise, and mostly easy to follow.

Which is impressive because these problems are complex and demand creative leaps. For example: Problem 3. 

Let N denote the set of positive integers. A function f : N → N is said to be bonza if 

f(a) divides b^a − f(b)^f(a)

for all positive integers a and b. Determine the smallest real constant c such that f(n) ⩽ cn for all bonza functions f and all positive integers n.

The official solution is 4 since every bonza function is one of the following: 

  • The constant 1 function 

  • The identity function 

  • The function that is 1 for all odd n and 2 for all even n

  • The function that is 1 for all odd n, 2 for all even, n != 4, and 16 for n = 4

The 68th IMO will be held in 2027, but the location and exact dates have yet to be announced. Does it matter? Probably not. The real game has already moved beyond high school competitions to the frontiers of math research.

Some in the field are pivoting to solving "explain why the model is wrong" meta-questions. And math AI’s could cause grant committees to be flooded with generated lemmas, potentially requiring disclosure of AI contributions. As mathematicians start offloading boilerplate proofs, a new type of startup is emerging.

Like Harmonic.ai — which was started by RobinHood CEO Vlad Tenev and raised $100 million from Kleiner Perkins and another $75M from Sequoia — to build mathematical super intelligence. Or Lean FRO, a nonprofit which is rallying top formalization talent to build open, verifiable AI mathematicians using Microsoft developed LEAN language to ensure that the AI’s outputs are provably correct.

In the end, these algorithmic improvements are generalizable mechanisms. The IMO is just another milestone, like AlphaGo or AlphaFold, on AI’s  inexorable march into every domain of human intellect. The drama and the caveats of this are temporary distractions. The fundamental truth is that a tool for automating discovery is here, and science will probably never be the same.

Tags Token Talk
Token Talk 27: Sovereignty as a Service →

FEATURED

Featured
Metro Multiples: A ranking of top startup ecosystems by return on investment
Metro Multiples: A ranking of top startup ecosystems by return on investment
Mapping Cascadian Dynamism
Mapping Cascadian Dynamism
Subscribe to Token Talk
Subscribe to Token Talk
You Let AI Help Build Your Product. Can You Still Own It?
You Let AI Help Build Your Product. Can You Still Own It?
Startup-Comp (1).jpg
Early-Stage Hiring, Decoded: What 60 Seattle Startups Told Us
Booming: An Inside Look at Seattle's AI Startup Scene
Booming: An Inside Look at Seattle's AI Startup Scene
SEATTLE AI MARKET MAP V2 - EDITED (V4).jpg
Mapping Seattle's Enterprise AI Startups
Our 2025 Predictions: AI, space policy, and hoverboards
Our 2025 Predictions: AI, space policy, and hoverboards
Mapping Seattle's Active Venture Firms
Mapping Seattle's Active Venture Firms
PHOTOS: Founders Bash 2024
PHOTOS: Founders Bash 2024
VC for the rest of us: A big tech employee’s guide to becoming startup advisors
VC for the rest of us: A big tech employee’s guide to becoming startup advisors
Valley VCs.jpg
Event Recap: Valley VCs Love Seattle Startups
VC for the rest of us: The ultimate guide to investing in venture capital funds for tech employees
VC for the rest of us: The ultimate guide to investing in venture capital funds for tech employees
Seattle VC Firms Led Just 11% of Early-Stage Funding Rounds in 2023
Seattle VC Firms Led Just 11% of Early-Stage Funding Rounds in 2023
Seattle AI Market Map (1).jpg
Mapping the Emerald City’s Growing AI Dominance
SaaS 3.0: Why the Software Business Model Will Continue to Thrive in the Age of In-House AI Development
SaaS 3.0: Why the Software Business Model Will Continue to Thrive in the Age of In-House AI Development
3b47f6bc-a54c-4cf3-889d-4a5faa9583e9.png
Best Practices for Requesting Warm Intros From Your Investors
 

Powered by Squarespace