AI BENCHY
AI BENCHY Category Failures

General Intelligence: Wrong answer

See which AI models are most likely to fail with a wrong answer on General Intelligence tasks, so you can spot weak points faster. Models are sorted by average response time, ascending.

Models Shown: 2
Total Failures: 32
Most Affected Model: Granite 4.1 8B 1

Charts: Top Models by Wrong answer Count; Wrong answer Count vs Score; Top Models by Response Time (avg); Top Models by Estimated Wasted Cost