AI BENCHY


Domain specific Ranking

See which AI models perform best in the Domain specific category, which ones stay reliable, and where the biggest gaps appear.

Models Shown: 15

Average Domain specific Score: 4.8

| Rank | Model | Company | Domain specific Score | Score | Tests Correct | Response Time (avg) |
|------|-------|---------|-----------------------|-------|---------------|---------------------|
| #145 | Laguna M.1 none | Poolside | 3.6 | 4.8 | 0/3 | 5.50s |
| #156 | Hy3 preview none | Tencent | 3.6 | 4.4 | 0/3 | 17.6s |
| #161 | Qwen3.5-9B medium | Qwen | 3.6 | 4.2 | 0/3 | 137.7s |
| #162 | Nemotron 3 Nano Omni 30b A3b Reasoning none | NVIDIA | 3.6 | 4.1 | 0/3 | 489ms |
| #17 | GLM 5 medium | Z.ai | 3.5 | 8.3 | 0/3 | 0ms |
| #39 | Qwen3.6 Flash medium | Qwen | 3.5 | 7.5 | 0/3 | 14.6s |
| #41 | Nemotron 3 Ultra 550b A55b medium | NVIDIA | 3.5 | 7.5 | 0/3 | 24.9s |
| #63 | GPT-5.3 Chat none | OpenAI | 3.5 | 7.2 | 0/3 | 13.0s |
| #75 | Ring-2.6-1T medium | Inclusionai | 3.5 | 6.9 | 0/3 | 64.9s |
| #76 | Kimi K2.5 medium | Moonshot AI | 3.5 | 6.8 | 0/3 | 137.3s |
| #144 | GPT-5.4 Mini none | OpenAI | 3.5 | 4.9 | 0/3 | 937ms |
| #153 | Qwen3.6 35B A3B none | Qwen | 3.5 | 4.6 | 0/3 | 7.45s |
| #158 | GLM 4.7 Flash medium | Z.ai | 3.5 | 4.4 | 0/3 | 174.6s |
| #40 | Gemini 3.1 Flash Lite Preview medium | Google | 3.0 | 7.5 | 0/3 | 4.21s |
| #69 | Claude Opus 4.6 medium | Anthropic | 3.0 | 7.0 | 0/3 | 83.4s |
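
The Response Time (avg) column mixes milliseconds (for example 489ms) and seconds (for example 5.50s), so the raw strings cannot be compared directly. The following is a minimal Python sketch of one way to normalize them before sorting. The helper name `to_seconds` and the `rows` list are hypothetical and introduced only for illustration; the values are a small sample copied from the ranking, and this is not how the site itself computes anything.

```python
def to_seconds(value: str) -> float:
    """Convert a response-time string such as '489ms' or '5.50s' to seconds."""
    text = value.strip().lower()
    # Check 'ms' first, because a millisecond value also ends with 's'.
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unrecognized time unit in {value!r}")


# A few (model, response time) pairs copied from the ranking (illustrative subset).
rows = [
    ("Laguna M.1", "5.50s"),
    ("Nemotron 3 Nano Omni 30b A3b Reasoning", "489ms"),
    ("GPT-5.4 Mini", "937ms"),
    ("Qwen3.5-9B", "137.7s"),
]

# Sort from fastest to slowest using the normalized value as the key.
fastest_first = sorted(rows, key=lambda row: to_seconds(row[1]))

for model, raw in fastest_first:
    print(f"{model}: {to_seconds(raw):.3f} s")
```

Checking the `ms` suffix before the `s` suffix matters here, since every millisecond string also ends in `s` and would otherwise fail to parse.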

[Charts: Top Models by Domain specific Score; Domain specific Score vs Total Cost; Top Models by Response Time (avg)]