Trivia Model Ranking

AI BENCHY Category

See which AI models perform best on Trivia, which ones stay reliable, and where the biggest gaps appear.

Models Shown

Average Trivia Score

3.1

Best Model

Gemini 3.5 Flash 10.0

Failure Reasons

Wrong answer: 133, API error: 13, No answer: 8

169/169

Rank	Model	Company	Trivia Score	Score	Total Cost	Tests Correct	Response Time (avg)
#32	Gemini 3.1 Flash Lite Preview medium	Google	3.0	7.8	$0.068	0/1	2.68s
#33	Qwen3.5 Plus 2026-04-20 medium	Qwen	3.0	7.8	$0.317	0/1	92.6s
#34	Gemini 3.1 Flash Lite medium	Google	3.0	7.8	$0.071	0/1	3.08s
#35	Kimi K2.6 medium	Moonshot AI	3.0	7.8	$0.889	0/1	130.3s
#36	Qwen3.5-122B-A10B medium	Qwen	3.0	7.7	$0.588	0/1	52.9s
#37	Grok 4.3 medium	X AI	3.0	7.7	$0.614	0/1	44.5s
#38	Claude Opus 4.6 medium	Anthropic	3.0	7.7	$2.053	0/1	63.2s
#39	Step 3.7 Flash low	Stepfun	3.0	7.7	$0.341	0/1	124.8s
#40	MiniMax M3 medium	Minimax	3.0	7.6	$0.131	0/1	100.8s
#41	DeepSeek V4 Pro high	DeepSeek	3.0	7.6	$0.157	0/1	34.0s
#42	Grok Build 0.1 medium	X AI	3.0	7.6	$0.927	0/1	53.5s
#43	Kimi K2.5 medium	Moonshot AI	3.0	7.5	$0.348	0/1	83.9s
#44	Mercury 2 medium	Inception	3.0	7.5	$0.058	0/1	2.58s
#45	GPT-5.3 Chat none	OpenAI	3.0	7.5	$0.433	0/1	4.38s
#46	GPT-5.4 Nano medium	OpenAI	3.0	7.5	$0.107	0/1	4.81s
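
For readers who want to work with these rows directly, here is a minimal sketch of parsing the tab-separated table and picking out the cheapest and fastest entries. The sample rows are copied from the table above. The function name `parse_rows` and the output field names are illustrative assumptions, not part of AI BENCHY.

```python
import csv
import io

# Header plus four rows copied from the ranking table (tab-separated).
SAMPLE = (
    "Rank\tModel\tCompany\tTrivia Score\tScore\tTotal Cost\tTests Correct\tResponse Time (avg)\n"
    "#32\tGemini 3.1 Flash Lite Preview medium\tGoogle\t3.0\t7.8\t$0.068\t0/1\t2.68s\n"
    "#35\tKimi K2.6 medium\tMoonshot AI\t3.0\t7.8\t$0.889\t0/1\t130.3s\n"
    "#38\tClaude Opus 4.6 medium\tAnthropic\t3.0\t7.7\t$2.053\t0/1\t63.2s\n"
    "#44\tMercury 2 medium\tInception\t3.0\t7.5\t$0.058\t0/1\t2.58s\n"
)


def parse_rows(text):
    """Parse the tab-separated table into dicts with numeric fields.

    Hypothetical helper: the field names below are chosen for this example.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for rec in reader:
        # "Tests Correct" is written as correct/total, e.g. "0/1".
        correct, total = rec["Tests Correct"].split("/")
        rows.append({
            "rank": int(rec["Rank"].lstrip("#")),
            "model": rec["Model"],
            "company": rec["Company"],
            "trivia_score": float(rec["Trivia Score"]),
            "score": float(rec["Score"]),
            "total_cost_usd": float(rec["Total Cost"].lstrip("$")),
            "tests_correct": int(correct),
            "tests_total": int(total),
            "avg_response_s": float(rec["Response Time (avg)"].rstrip("s")),
        })
    return rows


rows = parse_rows(SAMPLE)
cheapest = min(rows, key=lambda r: r["total_cost_usd"])
fastest = min(rows, key=lambda r: r["avg_response_s"])
print("Cheapest:", cheapest["model"], cheapest["total_cost_usd"])
print("Fastest:", fastest["model"], fastest["avg_response_s"])
```

Every model in this slice has the same Trivia Score (3.0) on a single test, so cost and response time are the only columns that separate them here.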

Trivia Ranking

Charts: Top Models by Trivia Score; Trivia Score vs Total Cost; Top Models by Response Time (avg)