AI BENCHY


General Intelligence Ranking

See which AI models perform best on General Intelligence, which ones stay reliable, and where the biggest gaps appear. Rows are sorted by Tests Correct, descending.

Models Shown: 15

Average General Intelligence Score: 5.9

| Rank | Model | Company | General Intelligence Score | Score | Tests Correct | Response Time (avg) |
|------|-------|---------|----------------------------|-------|---------------|---------------------|
| #60 | Kimi K2.6 medium | Moonshot AI | 10.0 | 7.2 | 1/1 | 17.8s |
| #68 | Claude Opus 4.8 none | Anthropic | 10.0 | 7.0 | 1/1 | 3.48s |
| #69 | Claude Opus 4.6 medium | Anthropic | 10.0 | 7.0 | 1/1 | 5.04s |
| #85 | Gemma 4 31B none | Google | 10.0 | 6.5 | 1/1 | 2.09s |
| #91 | GPT-5.5 none | OpenAI | 10.0 | 6.4 | 1/1 | 3.41s |
| #98 | GLM 5 none | Z.ai | 10.0 | 6.1 | 1/1 | 3.27s |
| #108 | Qwen3.5-Flash none | Qwen | 10.0 | 5.8 | 1/1 | 803ms |
| #110 | Seed-2.0-Lite none | Bytedance Seed | 10.0 | 5.8 | 1/1 | 3.45s |
| #128 | Qwen3.6 Flash none | Qwen | 10.0 | 5.4 | 1/1 | 947ms |
| #135 | Kimi K2.5 none | Moonshot AI | 10.0 | 5.2 | 1/1 | 4.00s |
| #140 | Qwen3 Coder Next none | Qwen | 10.0 | 4.9 | 1/1 | 1.34s |
| #15 | GPT-5.3-Codex medium | OpenAI | 4.6 | 8.4 | 0/1 | 4.87s |
| #17 | GLM 5 medium | Z.ai | 6.1 | 8.3 | 0/1 | 14.7s |
| #19 | Seed-2.0-Lite medium | Bytedance Seed | 6.7 | 8.2 | 0/1 | 18.2s |
| #21 | GPT-5.4 medium | OpenAI | 4.7 | 8.0 | 0/1 | 4.92s |
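
The row order above (models that passed the test first, then the rest, each group in ascending rank) can be reproduced with a two-part sort key. The following is a minimal sketch, not the site's actual implementation: the `Row` type and the `sort_by_tests_correct` helper are hypothetical names, and the tie-break on rank is an assumption inferred from the listing. Only four of the rows are included to keep the example short.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One leaderboard entry (hypothetical structure for illustration)."""
    rank: int      # overall rank, e.g. 60 for "#60"
    model: str
    passed: int    # numerator of "Tests Correct"
    total: int     # denominator of "Tests Correct"


# A few rows copied from the table, deliberately out of order.
rows = [
    Row(15, "GPT-5.3-Codex medium", 0, 1),
    Row(60, "Kimi K2.6 medium", 1, 1),
    Row(17, "GLM 5 medium", 0, 1),
    Row(68, "Claude Opus 4.8 none", 1, 1),
]


def sort_by_tests_correct(entries: list[Row]) -> list[Row]:
    """Sort by fraction of tests correct (descending), then rank (ascending).

    The rank tie-break is an assumption; the page only states
    "Tests Correct" as the sort column.
    """
    return sorted(entries, key=lambda r: (-r.passed / r.total, r.rank))


ordered = sort_by_tests_correct(rows)
for r in ordered:
    print(f"#{r.rank} {r.model} {r.passed}/{r.total}")
```

Negating the pass fraction lets a single ascending sort handle both keys, so no second pass or reliance on sort stability is needed.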

[Charts: Top Models by General Intelligence Score; General Intelligence Score vs Total Cost; Top Models by Response Time (avg)]