AI BENCHY

Combined Ranking

See which AI models perform best in the Combined category, which stay reliable, and where the biggest gaps appear.

Models Shown: 15

Average Combined Score: 6.3

Rank  Model                        Company     Combined Score  Score  Tests Correct  Response Time (avg)
#59   GLM 5V Turbo medium          Z.ai        6.9             7.2    0/1            15.1s
#133  DeepSeek V3.2 none           DeepSeek    6.5             5.2    0/1            115.9s
#48   Gemini 3 Flash Preview none  Google      4.7             7.4    0/1            3.56s
#51   Mimo V2 PRO medium           Xiaomi      4.7             7.4    0/1            64.7s
#66   Qwen3.5-35B-A3B medium       Qwen        4.7             7.1    0/1            75.3s
#79   Hunter Alpha medium          OpenRouter  4.7             6.7    0/1            30.5s
#130  MiniMax M2.7 medium          Minimax     4.7             5.3    0/1            41.0s
#129  MiniMax M2.5 medium          Minimax     4.5             5.3    0/1            60.4s
#139  DeepSeek V4 Flash none       DeepSeek    4.5             5.0    0/1            112.0s
#16   Gemini 3 Flash Preview low   Google      3.0             8.4    0/1            3.27s
#20   Gemini 3.5 Flash none        Google      3.0             8.1    0/1            0ms
#27   Gemma 4 31B medium           Google      3.0             7.8    0/1            0ms
#32   Gemini 3.5 Flash minimal     Google      3.0             7.7    0/1            3.56s
#34   Qwen3.7 Max none             Qwen        3.0             7.7    0/1            2.17s
#35   Gemini 3 PRO Preview medium  Google      3.0             7.6    0/1            10.4s

[Chart: Top Models by Combined Score]

[Chart: Combined Score vs Total Cost]

[Chart: Top Models by Response Time (avg)]