Trivia Model Ranking

AI BENCHY Category

See which AI models perform best on Trivia, which ones stay reliable, and where the biggest gaps appear.

Models Shown

169/169

Average Trivia Score

3.1

Best Model

Gemini 3.5 Flash (10.0)

Failure Reasons

Wrong answer: 133
API error: 13
No answer: 8

Rank	Model	Company	Trivia Score	Score	Total Cost	Tests Correct	Response Time (avg)
#162	Laguna Xs.2 none	Poolside	3.0	4.0	$0.000	0/1	0ms
#163	Granite 4.1 8B none	IBM Granite	3.0	4.0	$0.003	0/1	306ms
#164	gpt-oss-120b none	OpenAI	3.0	4.0	$0.010	0/1	47.3s
#165	Qwen3.5-9B medium	Qwen	3.0	3.8	$0.036	0/1	177.0s
#166	Nemotron 3 Nano Omni 30b A3b Reasoning medium	NVIDIA	3.0	3.6	$0.000	0/1	0ms
#167	Nemotron 3 Nano Omni 30b A3b Reasoning none	NVIDIA	3.0	3.5	$0.000	0/1	0ms
#168	Step 3.5 Flash none	Stepfun	3.0	2.6	$0.020	0/1	114.1s
#9	GPT-5.5 medium	OpenAI	2.8	9.0	$3.679	0/1	37.9s
#10	GPT-5.3-Codex medium	OpenAI	2.8	8.9	$0.740	0/1	14.4s
#66	Gemini 3.5 Flash none	Google	2.8	7.0	$1.079	0/1	4.87s
#69	Grok 4.20 Beta medium	X AI	0.0	6.8	$0.750	0/0	0ms
#83	Gemini 3.1 Flash Lite high	Google	0.0	6.5	$2.044	0/0	0ms
#95	Gemini 3.1 Flash Lite Preview high	Google	0.0	6.1	$2.310	0/0	0ms
#132	Hunter Alpha medium	OpenRouter	0.0	5.1	$0.000	0/0	0ms
#136	Grok 4.20 Multi Agent Beta medium	X AI	0.0	5.0	$5.599	0/0	0ms

Trivia Ranking

Top Models by Trivia Score

Trivia Score vs Total Cost

Top Models by Response Time (avg)