AI BENCHY category failures
General Knowledge: API Error
See which AI models are most likely to hit an API Error in General Knowledge, so you can spot weaknesses quickly. Sorted by: Response time (average), descending.
Failure reasons
| Rank | Model | Company | API Error count | Category score | Correct attempts | Response time (average) |
|---|---|---|---|---|---|---|
| #161 | Qwen3.5-9B medium | Qwen | 1 | 3.0 | 0/1 | 177.0s |
| #35 | Gemini 3 PRO Preview medium | | 1 | 3.0 | 0/1 | 0ms |
| #92 | Laguna M.1 medium | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #93 | Qwen3.6 Plus Preview medium | Qwen | 1 | 3.0 | 0/1 | 0ms |
| #107 | Laguna Xs.2 medium | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #136 | Elephant Alpha medium | Openrouter | 1 | 3.0 | 0/1 | 0ms |
| #137 | Elephant Alpha none | Openrouter | 1 | 3.0 | 0/1 | 0ms |
| #145 | Laguna M.1 none | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #146 | Laguna Xs.2 none | Poolside | 1 | 3.0 | 0/1 | 0ms |
| #149 | Nemotron 3 Nano Omni 30b A3b Reasoning medium | NVIDIA | 1 | 3.0 | 0/1 | 0ms |
| #159 | Ling-2.6-1T none | Inclusionai | 1 | 3.0 | 0/1 | 0ms |
| #162 | Nemotron 3 Nano Omni 30b A3b Reasoning none | NVIDIA | 1 | 3.0 | 0/1 | 0ms |