AI BENCHY

Coding Ranking

See which AI models perform best on Coding, which ones stay reliable, and where the biggest gaps appear.

Models Shown: 15

Average Coding Score: 6.1

| Rank | Model | Company | Coding Score | Score | Tests Correct | Response Time (avg) |
|---|---|---|---|---|---|---|
| #99 | Seed-2.0-Lite (none) | Bytedance Seed | 6.8 | 5.9 | 1/2 | 2.95s |
| #6 | Gemini 3.5 Flash (medium) | Google | 6.8 | 9.0 | 1/2 | 9.91s |
| #64 | GPT-5.4 Nano (medium) | OpenAI | 6.8 | 7.1 | 1/2 | 21.1s |
| #68 | Seed-2.0-Mini (medium) | Bytedance Seed | 6.8 | 7.1 | 1/2 | 220.5s |
| #110 | Kimi K2.6 (none) | Moonshot AI | 6.8 | 5.6 | 1/2 | 122.8s |
| #3 | Gemini 3.5 Flash (low) | Google | 6.8 | 9.3 | 1/2 | 5.54s |
| #26 | Qwen3.7 Max (none) | Qwen | 6.8 | 7.9 | 1/2 | 1.39s |
| #34 | Gemini 3.1 Flash Lite Preview (medium) | Google | 6.8 | 7.7 | 1/2 | 3.98s |
| #35 | Gemini 3.1 Flash Lite (medium) | Google | 6.8 | 7.7 | 1/2 | 3.59s |
| #39 | Gemini 3 Flash Preview (none) | Google | 6.8 | 7.7 | 1/2 | 2.19s |
| #41 | Gemini 3.1 Flash Lite Preview (low) | Google | 6.8 | 7.6 | 1/2 | 1.56s |
| #49 | Gemini 3.1 Flash Lite Preview (none) | Google | 6.8 | 7.5 | 1/2 | 1.06s |
| #50 | Gemini 3.1 Flash Lite (low) | Google | 6.8 | 7.4 | 1/2 | 1.71s |
| #55 | DeepSeek V4 Flash (high) | DeepSeek | 6.8 | 7.4 | 1/2 | 58.1s |
| #60 | GLM 5V Turbo (medium) | Z.ai | 6.8 | 7.4 | 1/2 | 54.8s |

[Charts: Top Models by Coding Score; Coding Score vs Total Cost; Top Models by Response Time (avg)]