Gemini 3.1 Pro Preview currently holds top or near-top positions on Humanity's Last Exam leaderboards, scoring 44.7–46.4% accuracy in high-reasoning modes on the 2,500-question frontier benchmark, which tests expert-level reasoning across math, science, and the humanities. This reflects Google's February 2026 Gemini 3 advancements, including Deep Think for complex chain-of-thought processing, and outpaces OpenAI's GPT-5.4 at 41.6–44.3%. However, Meta's April 9 release of Muse Spark, which scored 50.2%, has shifted competitive dynamics and highlights rapid scaling in multimodal reasoning. With no new Gemini models released since the March app updates, trader sentiment hinges on potential announcements at Google I/O in May and on training progress toward the 50% threshold by June 30, amid benchmark saturation risks and evaluation variance.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves.
$289,381 Vol.
50%+: 38%
55%+: 18%
60%+: 9%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market Opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions