Trivia Model Ranking

AI BENCHY Category

See which AI models perform best on Trivia, which ones stay reliable, and where the biggest gaps appear. The table is sorted by Score, ascending, so the lowest-scoring models appear first.

Models Shown: 169/169

Average Trivia Score: 3.1

Best Model: LFM2-24B-A2B (0.0)

Failure Reasons: Wrong answer 133, API error 13, No answer 8

Rank	Model	Company	Trivia Score	Score	Total Cost	Tests Correct	Response Time (avg)
#169	LFM2-24B-A2B none	Liquid	0.0	2.4	$0.001	0/0	0ms
#168	Step 3.5 Flash none	Stepfun	3.0	2.6	$0.020	0/1	114.1s
#167	Nemotron 3 Nano Omni 30b A3b Reasoning none	NVIDIA	3.0	3.5	$0.000	0/1	0ms
#166	Nemotron 3 Nano Omni 30b A3b Reasoning medium	NVIDIA	3.0	3.6	$0.000	0/1	0ms
#165	Qwen3.5-9B medium	Qwen	3.0	3.8	$0.036	0/1	177.0s
#164	gpt-oss-120b none	OpenAI	3.0	4.0	$0.010	0/1	47.3s
#163	Granite 4.1 8B none	IBM Granite	3.0	4.0	$0.003	0/1	306ms
#162	Laguna Xs.2 none	Poolside	3.0	4.0	$0.000	0/1	0ms
#161	Grok 4.1 Fast none	X AI	3.0	4.0	$0.008	0/1	731ms
#160	Grok Build 0.1 none	X AI	3.0	4.2	$0.547	0/1	36.1s
#159	MiMo-V2-Flash none	Xiaomi	3.0	4.3	$0.025	0/1	1.82s
#158	Hy3 preview none	Tencent	3.0	4.3	$0.003	0/1	2.71s
#157	GLM 4.7 Flash medium	Z.ai	3.0	4.3	$0.054	0/1	11.1s
#156	Laguna Xs.2 medium	Poolside	3.0	4.3	$0.000	0/1	0ms
#155	Grok 4.20 none	X AI	0.0	4.4	$0.057	0/0	0ms

Trivia Ranking

Charts: Top Models by Trivia Score; Trivia Score vs Total Cost; Top Models by Response Time (avg)