AI BENCHY Category Errors
General knowledge: Wrong answer
See which AI models are most likely to give a wrong answer on General knowledge, so you can spot weak points faster. Sorted by: Total cost, ascending.
| Rank | Model | Company | Wrong-answer count | Category score | Total cost | Correct tests | Response time (avg.) |
|---|---|---|---|---|---|---|---|
| #108 | Owl Alpha medium | Openrouter | 1 | 3.0 | $0.000 | 0/1 | 2.38s |
| #110 | Owl Alpha none | Openrouter | 1 | 3.0 | $0.000 | 0/1 | 2.50s |
| #131 | North Mini Code none | Cohere | 1 | 3.0 | $0.000 | 0/1 | 37.4s |
| #140 | Cobuddy medium | Baidu | 1 | 3.0 | $0.000 | 0/1 | 37.0s |
| #143 | Ling-2.6-flash none | Inclusionai | 1 | 3.0 | $0.001 | 0/1 | 1.06s |
| #158 | Hy3 preview none | Tencent | 1 | 3.0 | $0.003 | 0/1 | 2.71s |
| #163 | Granite 4.1 8B none | IBM Granite | 1 | 3.0 | $0.003 | 0/1 | 306ms |
| #98 | Gemma 4 31B none | | 1 | 3.0 | $0.004 | 0/1 | 1.25s |
| #121 | Gemma 4 26B A4B none | | 1 | 3.0 | $0.004 | 0/1 | 778ms |
| #141 | GLM 4.7 Flash none | Z.ai | 1 | 3.0 | $0.004 | 0/1 | 692ms |
| #97 | Qwen3.5-Flash none | Qwen | 1 | 3.0 | $0.005 | 0/1 | 588ms |
| #135 | Qwen3.5-9B none | Qwen | 1 | 3.0 | $0.006 | 0/1 | 2.32s |
| #139 | GPT-4o-mini none | OpenAI | 1 | 3.0 | $0.006 | 0/1 | 794ms |
| #142 | Nemotron 3 Super none | NVIDIA | 1 | 3.0 | $0.007 | 0/1 | 8.94s |
| #134 | MiMo-V2.5 none | Xiaomi | 1 | 3.0 | $0.007 | 0/1 | 3.89s |