AI BENCHY

AI BENCHY Category Failures

General Intelligence: Wrong answer

See which AI models most often fail General Intelligence tests with a wrong answer, so you can spot weak points faster. Sort by: Tests Correct ↑.

Models Shown: 2

Total Failures: 32

Most Affected Model: Step 3.7 Flash 1

Top Models by Wrong answer Count

Wrong answer Count vs Score

Top Models by Response Time (avg)

Top Models by Estimated Wasted Cost