# AI BENCHY: Trivia Ranking
See which AI models perform best on Trivia, which ones stay reliable, and where the biggest gaps appear.
| Rank | Model (reasoning effort) | Company | Trivia Score | Overall Score | Tests Correct | Avg Response Time |
|---|---|---|---|---|---|---|
| #72 | GPT-5.5 (none) | OpenAI | 3.0 | 6.7 | 0/1 | 5.01s |
| #73 | Gemini 3.1 Flash Lite (none) | Google | 3.0 | 6.7 | 0/1 | 733ms |
| #76 | Qwen3.5 Plus 2026-02-15 (none) | Qwen | 3.0 | 6.5 | 0/1 | 1.11s |
| #77 | Grok 4.1 Fast (medium) | X AI | 3.0 | 6.5 | 0/1 | 25.5s |
| #78 | GLM 5 (none) | Z.ai | 3.0 | 6.5 | 0/1 | 3.62s |
| #79 | MiMo-V2-Omni (none) | Xiaomi | 3.0 | 6.3 | 0/1 | 1.30s |
| #80 | Mercury 2 (medium) | Inception | 3.0 | 6.3 | 0/1 | 2.58s |
| #81 | Gemini 2.5 Flash (none) | Google | 3.0 | 6.3 | 0/1 | 1.15s |
| #82 | Gemma 4 26B A4B (none) | Google | 3.0 | 6.3 | 0/1 | 778ms |
| #83 | GPT-5 Nano (medium) | OpenAI | 3.0 | 6.2 | 0/1 | 20.1s |
| #84 | DeepSeek V4 Pro (none) | DeepSeek | 3.0 | 6.2 | 0/1 | 15.6s |
| #85 | Nemotron 3 Super (medium) | NVIDIA | 3.0 | 6.1 | 0/1 | 55.3s |
| #86 | Seed-2.0-Lite (none) | Bytedance Seed | 3.0 | 6.0 | 0/1 | 1.96s |
| #87 | GLM 5V Turbo (none) | Z.ai | 3.0 | 6.0 | 0/1 | 2.23s |
| #88 | Owl Alpha (medium) | OpenRouter | 3.0 | 6.0 | 0/1 | 2.38s |