Trivia Model Ranking

AI BENCHY Category

See which AI models perform best on Trivia, which ones stay reliable, and where the biggest gaps appear. The table below is sorted by Total Cost, ascending.

Models Shown

Average Trivia Score

3.1

Best Model

Failure Reasons

Wrong answer: 133; API error: 13; No answer: 8

169/169

Rank	Model	Company	Trivia Score	Score	Total Cost	Tests Correct	Response Time (avg)
#95	Gemini 3.1 Flash Lite Preview high	Google	0.0	6.1	$2.310	0/0	0ms
#6	Claude Fable 5 medium	Anthropic	3.0	9.2	$3.165	0/1	25.6s
#9	GPT-5.5 medium	OpenAI	2.8	9.0	$3.679	0/1	37.9s
#136	Grok 4.20 Multi Agent Beta medium	X AI	0.0	5.0	$5.599	0/0	0ms
