AI BENCHY

Coding Ranking

See which AI models perform best on Coding, which ones stay reliable, and where the biggest gaps appear.

Models Shown: 15

Average Coding Score: 6.1

| Rank | Model | Company | Coding Score | Score | Tests Correct | Response Time (avg) |
|------|-------|---------|--------------|-------|---------------|---------------------|
| #73 | GPT-5 Mini medium | OpenAI | 10.0 | 6.9 | 2/2 | 30.7s |
| #81 | Grok 4.20 Multi Agent Beta medium | X AI | 10.0 | 6.6 | 1/1 | 27.1s |
| #82 | Grok Build 0.1 none | X AI | 10.0 | 6.6 | 1/1 | 21.4s |
| #128 | Ling-2.6-flash none | Inclusionai | 10.0 | 5.3 | 1/1 | 11.2s |
| #145 | Nemotron 3 Nano Omni 30b A3b Reasoning none | NVIDIA | 10.0 | 4.6 | 1/1 | 1.27s |
| #9 | Gemini 3.5 Flash none | Google | 8.2 | 8.9 | 1/2 | 39.6s |
| #11 | GPT-5.5 medium | OpenAI | 8.2 | 8.7 | 1/2 | 69.7s |
| #14 | Qwen3.6 Max Preview medium | Qwen | 8.2 | 8.4 | 1/2 | 178.0s |
| #27 | GPT-5.4 medium | OpenAI | 8.2 | 7.9 | 1/2 | 55.0s |
| #44 | GPT-5.2 Chat none | OpenAI | 8.2 | 7.6 | 1/2 | 8.05s |
| #1 | Gemini 3 Flash Preview medium | Google | 7.9 | 9.8 | 1/2 | 96.0s |
| #20 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 7.6 | 8.1 | 1/2 | 193.8s |
| #45 | MiMo-V2-Pro medium | Xiaomi | 7.5 | 7.6 | 1/2 | 94.2s |
| #65 | GPT-5.4 Mini medium | OpenAI | 7.5 | 7.1 | 1/2 | 73.3s |
| #123 | Laguna M.1 none | Poolside | 7.5 | 5.4 | 0/1 | 2.93s |

[Charts: Top Models by Coding Score; Coding Score vs Total Cost; Top Models by Response Time (avg)]