AI BENCHY
AI BENCHY Category Failures

Anti-AI Tricks: Wrong answer

See which AI models most often fail with "Wrong answer" on Anti-AI Tricks tests, so you can spot weak points faster.

Models Shown: 5
Total Failures: 245
Most Affected Model: Gemini 2.5 Flash 4

Rank | Model                                          | Company     | Wrong answer Count | Category Score | Tests Correct | Response Time (avg)
#137 | Elephant Alpha none                            | Openrouter  | 1                  | 6.6            | 2/4           | 963ms
#138 | Ling-2.6-flash none                            | Inclusionai | 1                  | 6.8            | 2/4           | 11.8s
#149 | Nemotron 3 Nano Omni 30b A3b Reasoning medium  | NVIDIA      | 1                  | 6.4            | 2/4           | 1.20s
#156 | Hy3 preview none                               | Tencent     | 1                  | 4.8            | 1/4           | 11.1s
#161 | Qwen3.5-9B medium                              | Qwen        | 1                  | 5.1            | 1/4           | 34.4s

[Charts: Top Models by Wrong answer Count; Wrong answer Count vs Score; Top Models by Response Time (avg); Top Models by Estimated Wasted Cost]