Domain specific Model Ranking

See which AI models perform best on Domain specific tests, which ones stay reliable, and where the biggest gaps appear. The table is sorted by average response time, fastest first.

Models Shown

Average Domain specific Score

4.7

Best Model

Claude Sonnet 4.6 2.9

Failure Reasons

Wrong answer: 421
Timed out: 43
Extra formatting: 17
No answer: 8
API error: 7
Did not follow instructions: 1
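To put the failure counts in proportion, the short sketch below computes each reason's share of all recorded failures. The counts are copied from the list above; the variable names are illustrative and not part of the page.

```python
# Failure counts as listed on the page.
failure_counts = {
    "Wrong answer": 421,
    "Timed out": 43,
    "Extra formatting": 17,
    "No answer": 8,
    "API error": 7,
    "Did not follow instructions": 1,
}

# Total number of failures with a recorded reason.
total_failures = sum(failure_counts.values())

# Percentage share of each reason, rounded to one decimal place.
failure_shares = {
    reason: round(100 * count / total_failures, 1)
    for reason, count in failure_counts.items()
}

for reason, share in failure_shares.items():
    print(f"{reason}: {share}%")
```

On these numbers, wrong answers account for the large majority of failures, with timeouts a distant second.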

216/216

Rank	Model	Company	Domain specific Score	Score	Total Cost	Tests Correct	Response Time (avg)
#6	Gemini 3.6 Flash low	Google	10.0	9.4	$0.517	3/3	3.96s
#101	GLM 5.2 none	Z.ai	5.3	6.6	$0.128	1/3	4.04s
#179	DeepSeek V3.2 none	DeepSeek	2.9	5.0	$0.054	0/3	4.17s
#68	Gemini 3.1 Flash Lite Preview medium	Google	3.0	7.3	$0.115	0/3	4.21s
#161	Kimi K2.5 none	Moonshot AI	5.3	5.5	$0.127	1/3	4.38s
#132	Qwen3.5 Plus 2026-04-20 none	Qwen	5.3	6.1	$0.122	1/3	4.43s
#155	KAT-Coder-Air V2.5 medium	Kwaipilot	3.0	5.6	$0.048	0/3	4.87s
#184	Ling-2.6-flash none	Inclusionai	3.0	4.9	$0.002	0/3	4.95s
#164	KAT-Coder-Air V2.5 low	Kwaipilot	2.9	5.4	$0.041	0/3	4.99s
#12	Gemini 3.5 Flash medium	Google	7.7	9.1	$0.642	2/3	5.24s
#198	Laguna M.1 none	Poolside	3.6	4.4	$0.009	0/3	5.50s
#173	Mistral Small 4 medium	Mistral	5.3	5.1	$0.096	1/3	6.11s
#183	Nemotron 3 Super none	NVIDIA	3.6	4.9	$0.008	0/3	6.23s
#188	KAT-Coder-Air V2.5 none	Kwaipilot	2.9	4.8	$0.067	0/3	6.24s
#82	Mercury 2 medium	Inception	2.9	7.0	$0.093	0/3	6.48s
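For readers who want to work with the table programmatically, here is a minimal sketch that parses tab-separated rows into typed records. It assumes the column order shown in the table header; the `Row` class, `parse_row` function, and field names are hypothetical and not something the page provides. Three rows from the table serve as sample input.

```python
from dataclasses import dataclass


@dataclass
class Row:
    rank: int
    model: str
    company: str
    domain_score: float
    score: float
    total_cost: float      # US dollars
    correct: int           # tests answered correctly
    total: int             # tests attempted
    avg_response_s: float  # average response time in seconds


def parse_row(line: str) -> Row:
    """Parse one tab-separated table row (assumed column order as in the header)."""
    rank, model, company, domain_score, score, cost, tests, resp = line.split("\t")
    correct, total = tests.split("/")
    return Row(
        rank=int(rank.lstrip("#")),
        model=model,
        company=company,
        domain_score=float(domain_score),
        score=float(score),
        total_cost=float(cost.lstrip("$")),
        correct=int(correct),
        total=int(total),
        avg_response_s=float(resp.rstrip("s")),
    )


# Three rows copied from the table, with explicit tab separators.
SAMPLE = [
    "#6\tGemini 3.6 Flash low\tGoogle\t10.0\t9.4\t$0.517\t3/3\t3.96s",
    "#101\tGLM 5.2 none\tZ.ai\t5.3\t6.6\t$0.128\t1/3\t4.04s",
    "#179\tDeepSeek V3.2 none\tDeepSeek\t2.9\t5.0\t$0.054\t0/3\t4.17s",
]

rows = [parse_row(line) for line in SAMPLE]

# Models that answered at least one test correctly, fastest first.
with_correct = sorted(
    (r for r in rows if r.correct > 0), key=lambda r: r.avg_response_s
)
with_correct_names = [r.model for r in with_correct]
print(with_correct_names)
```

With only three tests per model, the Tests Correct column is coarse, so a filter like this is better suited to quick triage than to fine-grained comparison.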

Domain specific Ranking

[Charts: Top Models by Domain specific Score; Domain specific Score vs Total Cost; Top Models by Response Time (avg)]