AI BENCHY


Domain specific Ranking

See which AI models perform best in the Domain specific category, which ones stay reliable, and where the biggest gaps appear. Sorted by Tests Correct, ascending.

Models Shown: 15

Average Domain specific Score: 4.8

| Rank | Model | Company | Domain specific Score | Score | Tests Correct | Response Time (avg) |
|---|---|---|---|---|---|---|
| #87 | Gemini 3.1 Flash Lite minimal | Google | 2.9 | 6.4 | 0/3 | 1.02s |
| #88 | Qwen3.7 Plus none | Qwen | 3.0 | 6.4 | 0/3 | 868ms |
| #90 | Gemini 3.1 Flash Lite none | Google | 2.9 | 6.4 | 0/3 | 762ms |
| #91 | GPT-5.5 none | OpenAI | 2.9 | 6.4 | 0/3 | 1.31s |
| #93 | Qwen3.6 Plus Preview medium | Qwen | 3.0 | 6.3 | 0/3 | 22.1s |
| #98 | GLM 5 none | Z.ai | 3.0 | 6.1 | 0/3 | 2.24s |
| #99 | gpt-oss-120b medium | OpenAI | 2.9 | 6.1 | 0/3 | 50.9s |
| #100 | Grok Build 0.1 none | X AI | 3.6 | 6.0 | 0/3 | 103.7s |
| #102 | Gemma 4 26B A4B none | Google | 3.6 | 6.0 | 0/3 | 2.49s |
| #103 | DeepSeek V4 Pro high | DeepSeek | 2.9 | 6.0 | 0/3 | 205.7s |
| #105 | Nemotron 3 Super medium | NVIDIA | 2.9 | 5.8 | 0/3 | 16.2s |
| #106 | Grok 4.20 Beta none | X AI | 3.0 | 5.8 | 0/3 | 611ms |
| #107 | Laguna Xs.2 medium | Poolside | 4.1 | 5.8 | 0/3 | 11.1s |
| #110 | Seed-2.0-Lite none | Bytedance Seed | 3.6 | 5.8 | 0/3 | 1.33s |
| #112 | GLM 5.1 none | Z.ai | 2.9 | 5.7 | 0/3 | 1.99s |

[Charts: Top Models by Domain specific Score; Domain specific Score vs Total Cost; Top Models by Response Time (avg)]