AI BENCHY

Coding Ranking

See which AI models perform best on Coding, which ones stay reliable, and where the biggest gaps appear.

Models Shown: 15

Average Coding Score: 6.1

| Rank | Model | Company | Coding Score | Score | Tests Correct | Response Time (avg) |
|------|-------|---------|--------------|-------|---------------|---------------------|
| #139 | MiMo-V2.5 none | Xiaomi | 6.8 | 4.8 | 1/2 | 3.74s |
| #133 | MiniMax M2.7 medium | Minimax | 6.7 | 5.0 | 1/2 | 54.7s |
| #36 | Gemini 2.5 Flash medium | Google | 6.6 | 7.7 | 1/2 | 54.6s |
| #100 | Owl Alpha medium | Openrouter | 6.6 | 5.8 | 1/2 | 19.1s |
| #116 | Qwen3.6 Flash none | Qwen | 6.6 | 5.5 | 1/2 | 2.34s |
| #30 | Qwen3.6 35B A3B medium | Qwen | 6.6 | 7.8 | 1/2 | 59.3s |
| #83 | Qwen3.6 27B medium | Qwen | 6.6 | 6.6 | 1/2 | 165.4s |
| #54 | Kimi K2.6 medium | Moonshot AI | 6.5 | 7.4 | 1/2 | 118.2s |
| #70 | Qwen3.5-35B-A3B medium | Qwen | 6.5 | 7.0 | 1/2 | 244.5s |
| #84 | Laguna Xs.2 medium | Poolside | 6.3 | 6.6 | 0/1 | 14.4s |
| #117 | Grok 4.20 Beta none | X AI | 5.5 | 5.5 | 0/1 | 1.14s |
| #146 | Ling-2.6-1T none | Inclusionai | 5.5 | 4.5 | 0/1 | 10.6s |
| #94 | GPT-5 Nano medium | OpenAI | 5.4 | 6.1 | 0/2 | 47.8s |
| #95 | DeepSeek V4 Pro none | DeepSeek | 5.4 | 6.0 | 0/2 | 8.27s |
| #147 | GPT-5.4 Nano none | OpenAI | 5.4 | 4.5 | 0/2 | 1.09s |
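
The summary figures for this ranking can be cross-checked against the table rows with a short script. The sketch below is illustrative only: the tuple layout, the `ROWS` name, and the `summarize` helper are assumptions made for this example, not anything the site provides. The values are copied from the Coding Score, Tests Correct, and Response Time (avg) columns as displayed.

```python
from statistics import mean

# Hypothetical layout, chosen for this sketch:
# (model, coding_score, tests_passed, tests_total, avg_response_seconds)
ROWS = [
    ("MiMo-V2.5 none", 6.8, 1, 2, 3.74),
    ("MiniMax M2.7 medium", 6.7, 1, 2, 54.7),
    ("Gemini 2.5 Flash medium", 6.6, 1, 2, 54.6),
    ("Owl Alpha medium", 6.6, 1, 2, 19.1),
    ("Qwen3.6 Flash none", 6.6, 1, 2, 2.34),
    ("Qwen3.6 35B A3B medium", 6.6, 1, 2, 59.3),
    ("Qwen3.6 27B medium", 6.6, 1, 2, 165.4),
    ("Kimi K2.6 medium", 6.5, 1, 2, 118.2),
    ("Qwen3.5-35B-A3B medium", 6.5, 1, 2, 244.5),
    ("Laguna Xs.2 medium", 6.3, 0, 1, 14.4),
    ("Grok 4.20 Beta none", 5.5, 0, 1, 1.14),
    ("Ling-2.6-1T none", 5.5, 0, 1, 10.6),
    ("GPT-5 Nano medium", 5.4, 0, 2, 47.8),
    ("DeepSeek V4 Pro none", 5.4, 0, 2, 8.27),
    ("GPT-5.4 Nano none", 5.4, 0, 2, 1.09),
]


def summarize(rows):
    """Recompute headline numbers from the displayed (already rounded) values."""
    return {
        "models": len(rows),
        # Mean of the rounded scores shown in the table, to one decimal place.
        "avg_coding_score": round(mean(r[1] for r in rows), 1),
        # Model with the lowest average response time.
        "fastest": min(rows, key=lambda r: r[4])[0],
        # Tests passed and tests run, summed over all listed models.
        "tests_passed": sum(r[2] for r in rows),
        "tests_total": sum(r[3] for r in rows),
    }


if __name__ == "__main__":
    for key, value in summarize(ROWS).items():
        print(f"{key}: {value}")
```

Averaging the displayed scores gives 6.2, slightly above the 6.1 reported on the page. One plausible explanation, which is an assumption here, is that the site averages unrounded scores before rounding.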

[Charts: Top Models by Coding Score; Coding Score vs Total Cost; Top Models by Response Time (avg)]