AI BENCHY
General Intelligence: API error

See which AI models most often hit an API error on General Intelligence tests, so you can spot weak points faster.

Models Shown: 12
Total Failures: 12
Most Affected Model: Nemotron 3 Ultra 550b A55b (1)

| Rank | Model | Company | API error Count | Category Score | Tests Correct | Response Time (avg) |
|------|-------|---------|-----------------|----------------|---------------|---------------------|
| #41 | Nemotron 3 Ultra 550b A55b medium | NVIDIA | 1 | 3.7 | 0/1 | 2.52s |
| #72 | DeepSeek V3.2 medium | DeepSeek | 1 | 3.4 | 0/1 | 58.3s |
| #82 | Hy3 preview high | Tencent | 1 | 3.0 | 0/1 | 0ms |
| #89 | Hy3 preview low | Tencent | 1 | 3.0 | 0/1 | 0ms |
| #92 | Laguna M.1 medium | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #93 | Qwen3.6 Plus Preview medium | Qwen | 1 | 3.0 | 0/1 | 0ms |
| #107 | Laguna Xs.2 medium | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #133 | DeepSeek V3.2 none | DeepSeek | 1 | 4.7 | 0/1 | 9.32s |
| #145 | Laguna M.1 none | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #146 | Laguna Xs.2 none | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #149 | Nemotron 3 Nano Omni 30b A3b Reasoning medium | NVIDIA | 1 | 3.0 | 0/1 | 0ms |
| #162 | Nemotron 3 Nano Omni 30b A3b Reasoning none | NVIDIA | 1 | 3.0 | 0/1 | 0ms |
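
The summary figures (Models Shown and Total Failures) can be recomputed from the rows above. The following is a minimal sketch, not the site's own method: the `rows` list is hand-copied from the table, and the variable names and the per-company grouping are illustrative assumptions that do not appear on the page.

```python
from collections import Counter

# (model, company, api_error_count), hand-copied from the table above.
rows = [
    ("Nemotron 3 Ultra 550b A55b medium", "NVIDIA", 1),
    ("DeepSeek V3.2 medium", "DeepSeek", 1),
    ("Hy3 preview high", "Tencent", 1),
    ("Hy3 preview low", "Tencent", 1),
    ("Laguna M.1 medium", "Poolside", 1),
    ("Qwen3.6 Plus Preview medium", "Qwen", 1),
    ("Laguna Xs.2 medium", "Poolside", 1),
    ("DeepSeek V3.2 none", "DeepSeek", 1),
    ("Laguna M.1 none", "Poolside", 1),
    ("Laguna Xs.2 none", "Poolside", 1),
    ("Nemotron 3 Nano Omni 30b A3b Reasoning medium", "NVIDIA", 1),
    ("Nemotron 3 Nano Omni 30b A3b Reasoning none", "NVIDIA", 1),
]

# "Models Shown" is the number of rows; "Total Failures" sums the error counts.
models_shown = len(rows)
total_failures = sum(count for _, _, count in rows)

# Hypothetical extra view: API errors grouped by company (not shown on the page).
failures_by_company = Counter()
for _, company, count in rows:
    failures_by_company[company] += count

print(f"Models Shown: {models_shown}")
print(f"Total Failures: {total_failures}")
for company, count in failures_by_company.most_common():
    print(f"{company}: {count}")
```

Every listed model has exactly one API error, so both totals come out to 12 and the "most affected" position is a tie across all rows.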

[Charts: Top Models by API error Count; API error Count vs Score; Top Models by Response Time (avg); Top Models by Estimated Wasted Cost]