Instructions following × No answer Ranking

See which AI models are most likely to return No answer on Instructions following tests, so you can spot weak points faster. Sorted by Tests Correct, ascending.

Failure Reasons

Wrong answer: 61, Did not follow instructions: 18, Extra formatting: 3, No answer: 2, API error: 1, Timed out: 1

Categories

Combined: 29, Coding: 18, Trivia: 13, Data parsing and extraction: 8, Domain specific: 8, Anti-AI Tricks: 4, Puzzle Solving: 3, Instructions following: 2, Tool Calling: 2

2/2

Rank	Model	Company	No answer Count	Category Score	Total Cost	Tests Correct	Response Time (avg)
#143	Gemini 3.1 Flash Lite high	Google	1	7.3	$2.044	1/2	23.3s
#204	Qwen3.5-9B medium	Qwen	1	6.5	$0.036	1/2	5.75s
