AI BENCHY Category Failures
Instructions following: Extra formatting
See which AI models most often fail Instructions following tests because of Extra formatting, so you can spot weak points faster.
| Rank | Model | Company | Extra formatting Count | Category Score | Total Cost | Tests Correct | Response Time (avg) |
|---|---|---|---|---|---|---|---|
| #117 | DeepSeek V4 Flash none | DeepSeek | 1 | 6.5 | $0.007 | 1/2 | 17.5s |
| #158 | Hy3 preview none | Tencent | 1 | 6.3 | $0.003 | 1/2 | 13.0s |