AI BENCHY

Combined Ranking

See which AI models perform best in the Combined category, which ones stay reliable, and where the biggest gaps appear.

Models Shown: 15

Average Combined Score: 6.3

Rank | Model | Company | Combined Score | Score | Tests Correct | Response Time (avg)
#98 | GLM 5 none | Z.ai | 3.0 | 6.1 | 0/1 | 4.98s
#100 | Grok Build 0.1 none | X AI | 3.0 | 6.0 | 0/1 | 0ms
#101 | Mimo V2 Omni none | Xiaomi | 3.0 | 6.0 | 0/1 | 5.96s
#102 | Gemma 4 26B A4B none | Google | 3.0 | 6.0 | 0/1 | 30.5s
#104 | Nemotron 3 Ultra 550b A55b none | NVIDIA | 3.0 | 6.0 | 0/1 | 4.79s
#106 | Grok 4.20 Beta none | X AI | 3.0 | 5.8 | 0/1 | 6.48s
#107 | Laguna Xs.2 medium | Poolside | 3.0 | 5.8 | 0/1 | 15.9s
#108 | Qwen3.5-Flash none | Qwen | 3.0 | 5.8 | 0/1 | 6.22s
#109 | GLM 5V Turbo none | Z.ai | 3.0 | 5.8 | 0/1 | 6.51s
#110 | Seed-2.0-Lite none | Bytedance Seed | 3.0 | 5.8 | 0/1 | 6.59s
#111 | Owl Alpha medium | OpenRouter | 3.0 | 5.7 | 0/1 | 10.0s
#116 | Hunter Alpha none | OpenRouter | 3.0 | 5.7 | 0/1 | 15.2s
#117 | Qwen3.5-35B-A3B none | Qwen | 3.0 | 5.6 | 0/1 | 47.4s
#118 | Qwen3.6 27B none | Qwen | 3.0 | 5.6 | 0/1 | 9.95s
#119 | Cobuddy medium | Baidu | 3.0 | 5.6 | 0/1 | 47.4s

[Chart: Top Models by Combined Score]

[Chart: Combined Score vs Total Cost]

[Chart: Top Models by Response Time (avg)]