AI BENCHY


Combined Ranking

See which AI models perform best in the Combined category, which ones stay reliable, and where the biggest gaps appear. Sorted by Response Time (avg), descending.

Models Shown: 15

Average Combined Score: 6.3

Rank  Model                               Company         Combined Score  Score  Tests Correct  Response Time (avg)
#75   Ring-2.6-1T medium                  Inclusionai     10.0            6.9    1/1            304.2s
#12   Gemini 3.1 Flash Lite Preview high  Google          10.0            8.6    1/1            280.5s
#73   Seed-2.0-Mini medium                Bytedance Seed  10.0            6.9    1/1            262.8s
#30   Qwen3.5-27B medium                  Qwen            10.0            7.8    1/1            164.0s
#53   Gemini 3.1 Flash Lite high          Google          10.0            7.3    1/1            149.2s
#14   Qwen3.6 Max Preview medium          Qwen            10.0            8.5    1/1            121.5s
#133  DeepSeek V3.2 none                  DeepSeek        6.5             5.2    0/1            115.9s
#82   Hy3 preview high                    Tencent         10.0            6.6    1/1            113.1s
#139  DeepSeek V4 Flash none              DeepSeek        4.5             5.0    0/1            112.0s
#29   Qwen3.5-122B-A10B medium            Qwen            10.0            7.8    1/1            107.8s
#72   DeepSeek V3.2 medium                DeepSeek        10.0            7.0    1/1            93.1s
#36   Qwen3.5 Plus 2026-04-20 medium      Qwen            10.0            7.6    1/1            92.4s
#54   GPT-5 Mini medium                   OpenAI          10.0            7.3    1/1            88.2s
#105  Nemotron 3 Super medium             NVIDIA          10.0            5.8    1/1            87.8s
#78   Qwen3.6 27B medium                  Qwen            7.0             6.8    0/1            83.1s
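
For readers who want to re-sort or filter these rows offline, the following is a minimal Python sketch, not the site's own code. The `Row` type and the helper names `sort_by_response_time` and `passed_all` are hypothetical, introduced only for illustration; the four sample rows are copied from the table above. It assumes "Tests Correct" is a "passed/total" pair and that response times are in seconds.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One leaderboard row (hypothetical structure, for illustration only)."""
    rank: int
    model: str
    company: str
    combined_score: float
    score: float
    tests_correct: str      # "passed/total", as shown in the table
    response_time_s: float  # average response time in seconds


# A few rows copied from the table above, deliberately out of order.
ROWS = [
    Row(54, "GPT-5 Mini medium", "OpenAI", 10.0, 7.3, "1/1", 88.2),
    Row(75, "Ring-2.6-1T medium", "Inclusionai", 10.0, 6.9, "1/1", 304.2),
    Row(133, "DeepSeek V3.2 none", "DeepSeek", 6.5, 5.2, "0/1", 115.9),
    Row(12, "Gemini 3.1 Flash Lite Preview high", "Google", 10.0, 8.6, "1/1", 280.5),
]


def sort_by_response_time(rows, descending=True):
    """Return rows ordered by average response time (slowest first by default)."""
    return sorted(rows, key=lambda r: r.response_time_s, reverse=descending)


def passed_all(row):
    """True if the row passed every test in its 'passed/total' pair."""
    passed, total = (int(part) for part in row.tests_correct.split("/"))
    return passed == total


slowest_first = sort_by_response_time(ROWS)
for r in slowest_first:
    status = "pass" if passed_all(r) else "fail"
    print(f"#{r.rank:<4} {r.model:<36} {r.response_time_s:>6.1f}s  {status}")
```

Passing `descending=False` gives the fastest models first, which is the more useful order when response time matters more than score.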

Charts: Top Models by Combined Score; Combined Score vs Total Cost; Top Models by Response Time (avg)