AI BENCHY

Coding Ranking

See which AI models perform best on Coding, which ones stay reliable, and where the biggest gaps appear. The listing is sorted by Coding Score, ascending.

Models Shown: 15

Average Coding Score: 6.1

| Rank | Model                         | Company  | Coding Score | Score | Tests Correct | Response Time (avg) |
|------|-------------------------------|----------|--------------|-------|---------------|---------------------|
| #38  | Qwen3.5-122B-A10B medium      | Qwen     | 4.1          | 7.7   | 0/2           | 119.6s              |
| #56  | Qwen3.5-Flash medium          | Qwen     | 4.1          | 7.4   | 0/2           | 54.2s               |
| #67  | MiMo-V2-Flash medium          | Xiaomi   | 4.1          | 7.1   | 0/2           | 7.20s               |
| #91  | Gemma 4 26B A4B none          | Google   | 4.1          | 6.2   | 0/2           | 3.83s               |
| #141 | Qwen3 Coder Next medium       | Qwen     | 4.1          | 4.7   | 0/2           | 1.17s               |
| #33  | Qwen3.6 Plus medium           | Qwen     | 4.1          | 7.8   | 0/2           | 201.7s              |
| #105 | Cobuddy medium                | Baidu    | 4.1          | 5.7   | 0/2           | 79.2s               |
| #66  | Qwen3.6 Max Preview none      | Qwen     | 4.2          | 7.1   | 0/2           | 3.06s               |
| #113 | GLM 5.1 none                  | Z.ai     | 4.3          | 5.6   | 0/2           | 6.33s               |
| #74  | Laguna M.1 medium             | Poolside | 4.3          | 6.9   | 0/1           | 35.6s               |
| #129 | gpt-oss-120b none             | OpenAI   | 4.3          | 5.2   | 0/1           | 9.57s               |
| #101 | Qwen3.5 Plus 2026-04-20 none  | Qwen     | 4.4          | 5.8   | 0/2           | 2.08s               |
| #125 | GLM 5 Turbo none              | Z.ai     | 4.4          | 5.3   | 0/2           | 2.58s               |
| #142 | Qwen3.5-9B none               | Qwen     | 4.4          | 4.6   | 0/2           | 5.39s               |
| #89  | GLM 5 none                    | Z.ai     | 4.6          | 6.3   | 0/2           | 5.18s               |

Charts: Top Models by Coding Score; Coding Score vs Total Cost; Top Models by Response Time (avg).