Trivia Model Ranking

AI BENCHY Category

See which AI models perform best on Trivia, which ones stay reliable, and where the biggest gaps appear.

Models Shown

Average Trivia Score

3.1

Best Model

Gemini 3.5 Flash (10.0)

Failure Reasons

Wrong answer: 133
API error: 13
No answer: 8

169/169

Rank	Model	Company	Trivia Score	Score	Total Cost	Tests Correct	Response Time (avg)
#91	Gemini 3 PRO Preview medium	Google	3.0	6.2	$0.385	0/1	0ms
#92	Seed-2.0-Lite none	Bytedance Seed	3.0	6.2	$0.019	0/1	1.96s
#93	Gemini 2.5 Flash none	Google	3.0	6.2	$0.016	0/1	1.15s
#94	Gemini 3.1 Flash Lite minimal	Google	3.0	6.1	$0.013	0/1	724ms
#95	Gemini 3.1 Flash Lite Preview high	Google	0.0	6.1	$2.310	0/0	0ms
#96	Gemini 3.1 Flash Lite none	Google	3.0	6.1	$0.013	0/1	733ms
#97	Qwen3.5-Flash none	Qwen	3.0	6.1	$0.005	0/1	588ms
#98	Gemma 4 31B none	Google	3.0	6.1	$0.004	0/1	1.25s
#99	Nemotron 3 Ultra 550b A55b none	NVIDIA	3.0	6.1	$0.027	0/1	1.83s
#100	Qwen3.6 Max Preview none	Qwen	3.0	6.0	$0.075	0/1	1.97s
#101	GLM 5 none	Z.ai	3.0	6.0	$0.027	0/1	3.62s
#102	Qwen3.6 Flash none	Qwen	3.0	6.0	$0.015	0/1	649ms
#103	Qwen3.5-35B-A3B none	Qwen	3.0	5.9	$0.012	0/1	493ms
#104	Qwen3.5-27B none	Qwen	3.0	5.9	$0.015	0/1	599ms
#105	GLM 5V Turbo none	Z.ai	3.0	5.9	$0.052	0/1	2.23s
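
The Response Time (avg) column mixes units (for example 724ms and 1.96s), so the values need normalizing before they can be compared. The following is a minimal sketch, assuming only those two unit suffixes occur; the helper name parse_response_time_ms and the small rows sample are hypothetical and not part of the leaderboard itself.

```python
import re


def parse_response_time_ms(text: str) -> float:
    """Convert a time string such as '724ms' or '1.96s' to milliseconds.

    Assumption: only the 'ms' and 's' suffixes seen in the table are handled.
    """
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(ms|s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value if unit == "ms" else value * 1000.0


# A few (model, response time) pairs copied from the table above.
rows = [
    ("Seed-2.0-Lite none", "1.96s"),
    ("Gemini 3.1 Flash Lite minimal", "724ms"),
    ("Qwen3.5-35B-A3B none", "493ms"),
    ("GLM 5 none", "3.62s"),
]

# Sort ascending by normalized response time (fastest first).
fastest_first = sorted(rows, key=lambda row: parse_response_time_ms(row[1]))

for model, raw_time in fastest_first:
    print(f"{model}: {parse_response_time_ms(raw_time):.0f} ms")
```

Note that a reported 0ms, as in rows #91 and #95, may indicate a missing measurement rather than an instant response, so such rows might be better filtered out before ranking.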

Trivia Ranking

Charts: Top Models by Trivia Score; Trivia Score vs Total Cost; Top Models by Response Time (avg)