Anti-AI Tricks Model Ranking

See which AI models perform best on Anti-AI Tricks, which ones stay reliable, and where the biggest gaps appear.

Models Shown

Average Anti-AI Tricks Score

7.1

Best Model

Failure Reasons

Wrong answer: 293
Did not follow instructions: 33
Extra formatting: 20
API error: 14
No answer: 4
Timed out: 4
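The failure counts above are easier to compare as shares of all recorded failures. The following is a minimal sketch of that arithmetic. It assumes the six reasons are mutually exclusive and together cover every recorded failure, which the page does not state; the names `failure_counts` and `failure_shares` are illustrative, not part of the leaderboard.

```python
# Counts copied from the Failure Reasons summary.
# Assumption: each failed test is counted under exactly one reason.
failure_counts = {
    "Wrong answer": 293,
    "Did not follow instructions": 33,
    "Extra formatting": 20,
    "API error": 14,
    "No answer": 4,
    "Timed out": 4,
}


def failure_shares(counts):
    """Return each reason's fraction of the total failure count."""
    total = sum(counts.values())
    return {reason: n / total for reason, n in counts.items()}


shares = failure_shares(failure_counts)

# Print reasons from most to least common, as percentages.
for reason, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{reason}: {share:.1%}")
```

Under that assumption, wrong answers account for roughly four out of five recorded failures, while API errors, missing answers, and timeouts together make up a small remainder.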

210/210

Rank	Model	Company	Anti-AI Tricks Score	Score	Total Cost	Tests Correct	Response Time (avg)
#114	Qwen3.5-Flash medium	Qwen	10.0	6.2	$0.139	4/4	59.1s
#119	Qwen3.5-35B-A3B medium	Qwen	10.0	6.2	$0.837	4/4	21.1s
#130	Step 3.5 Flash medium	Stepfun	10.0	6.0	$0.108	4/4	40.6s
#133	Gemini 3 PRO Preview medium	Google	10.0	6.0	$0.385	4/4	15.0s
#134	Mimo V2 Omni medium	Xiaomi	10.0	5.9	$0.683	4/4	2.75s
#209	Step 3.5 Flash none	Stepfun	10.0	2.3	$0.020	4/4	35.0s
#179	Ring-2.6-1T none	Inclusionai	9.2	4.8	$0.026	3/4	43.3s
#65	Gemini 3.1 Flash Lite medium	Google	9.1	7.3	$0.117	3/4	2.39s
#64	Gemini 3.1 Flash Lite Preview medium	Google	9.1	7.3	$0.115	3/4	2.33s
#97	LongCat 2.0 high	Meituan	8.9	6.6	$0.469	3/4	7.76s
#149	KAT-Coder-Air V2.5 medium	Kwaipilot	8.7	5.6	$0.048	3/4	3.79s
#5	GPT-5.6 Sol high	OpenAI	8.7	9.4	$1.234	3/4	3.39s
#13	GPT-5.3-Codex medium	OpenAI	8.7	8.9	$0.920	3/4	4.16s
#29	Step 3.7 Flash medium	Stepfun	8.7	8.0	$0.515	3/4	9.65s
#30	GPT-5.2 Chat none	OpenAI	8.7	8.0	$0.604	3/4	3.40s
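
Six models in the table tie at an Anti-AI Tricks Score of 10.0 but differ widely in total cost and average response time. The sketch below shows one possible way to break that tie, by cost first and response time second. The tie-break rule, the `Row` type, and the function name are assumptions for illustration only; the leaderboard does not describe how it orders tied models.

```python
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    total_cost_usd: float
    avg_response_s: float


# The six rows with a 10.0 Anti-AI Tricks Score, copied from the table.
perfect_scorers = [
    Row("Qwen3.5-Flash medium", 0.139, 59.1),
    Row("Qwen3.5-35B-A3B medium", 0.837, 21.1),
    Row("Step 3.5 Flash medium", 0.108, 40.6),
    Row("Gemini 3 PRO Preview medium", 0.385, 15.0),
    Row("Mimo V2 Omni medium", 0.683, 2.75),
    Row("Step 3.5 Flash none", 0.020, 35.0),
]


def by_cost_then_latency(rows):
    """Hypothetical tie-break: cheapest first, then fastest."""
    return sorted(rows, key=lambda r: (r.total_cost_usd, r.avg_response_s))


cheapest_first = by_cost_then_latency(perfect_scorers)
fastest = min(perfect_scorers, key=lambda r: r.avg_response_s)

for row in cheapest_first:
    print(f"{row.model}: ${row.total_cost_usd:.3f}, {row.avg_response_s}s")
print("Fastest:", fastest.model)
```

Sorting on a tuple key keeps the rule explicit and easy to swap, for example to rank by response time first when latency matters more than cost.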

Anti-AI Tricks Ranking

[Charts: Top Models by Anti-AI Tricks Score; Anti-AI Tricks Score vs Total Cost; Top Models by Response Time (avg)]