AI BENCHY

Combined Ranking

See which AI models perform best on the Combined ranking, which ones stay reliable, and where the biggest gaps appear.

Models Shown: 13

Average Combined Score: 6.3

| Rank | Model | Company | Combined Score | Score | Tests Correct | Response Time (avg) |
| #155 | Mercury 2 none | Inception | 3.0 | 4.5 | 0/1 | 606ms |
| #156 | Hy3 preview none | Tencent | 3.0 | 4.4 | 0/1 | 35.8s |
| #157 | Grok 4.1 Fast none | X AI | 3.0 | 4.4 | 0/1 | 3.33s |
| #159 | Ling-2.6-1T none | Inclusionai | 3.0 | 4.3 | 0/1 | 23.5s |
| #160 | LFM2-24B-A2B none | Liquid | 3.0 | 4.2 | 0/1 | 0ms |
| #161 | Qwen3.5-9B medium | Qwen | 3.0 | 4.2 | 0/1 | 0ms |
| #162 | Nemotron 3 Nano Omni 30b A3b Reasoning none | NVIDIA | 3.0 | 4.1 | 0/1 | 0ms |
| #163 | Granite 4.1 8B none | IBM Granite | 3.0 | 4.0 | 0/1 | 1.88s |
| #112 | GLM 5.1 none | Z.ai | 2.8 | 5.7 | 0/1 | 32.6s |
| #135 | Kimi K2.5 none | Moonshot AI | 2.8 | 5.2 | 0/1 | 19.2s |
| #158 | GLM 4.7 Flash medium | Z.ai | 2.8 | 4.4 | 0/1 | 65.6s |
| #114 | Qwen3.5 Plus 2026-04-20 none | Qwen | 2.8 | 5.7 | 0/1 | 13.3s |
| #115 | Qwen3.5-27B none | Qwen | 2.8 | 5.7 | 0/1 | 9.39s |

[Charts: Top Models by Combined Score; Combined Score vs Total Cost; Top Models by Response Time (avg)]