AI BENCHY Category errors
General knowledge: Wrong answer
See which AI models are most likely to give a wrong answer on General knowledge, so you can spot weak points faster. Sorted by: Response time (avg.), ascending.
| Rank | Model | Company | Wrong answer count | Category score | Total cost | Correct tests | Response time (avg.) |
|---|---|---|---|---|---|---|---|
| #125 | Qwen3.5-122B-A10B none | Qwen | 1 | 3.0 | $0.020 | 0/1 | 295ms |
| #163 | Granite 4.1 8B none | IBM Granite | 1 | 3.0 | $0.003 | 0/1 | 306ms |
| #129 | Mistral Small 4 none | Mistral | 1 | 3.0 | $0.007 | 0/1 | 397ms |
| #148 | Qwen3 Coder Next medium | Qwen | 1 | 3.0 | $0.008 | 0/1 | 399ms |
| #128 | Qwen3.6 35B A3B none | Qwen | 1 | 3.0 | $0.031 | 0/1 | 414ms |
| #103 | Qwen3.5-35B-A3B none | Qwen | 1 | 3.0 | $0.012 | 0/1 | 493ms |
| #151 | Mercury 2 none | Inception | 1 | 3.0 | $0.011 | 0/1 | 548ms |
| #97 | Qwen3.5-Flash none | Qwen | 1 | 3.0 | $0.005 | 0/1 | 588ms |
| #104 | Qwen3.5-27B none | Qwen | 1 | 3.0 | $0.015 | 0/1 | 599ms |
| #130 | Qwen3 Coder Next none | Qwen | 1 | 3.0 | $0.009 | 0/1 | 601ms |
| #102 | Qwen3.6 Flash none | Qwen | 1 | 3.0 | $0.015 | 0/1 | 649ms |
| #141 | GLM 4.7 Flash none | Z.ai | 1 | 3.0 | $0.004 | 0/1 | 692ms |
| #94 | Gemini 3.1 Flash Lite minimal | Google | 1 | 3.0 | $0.013 | 0/1 | 724ms |
| #161 | Grok 4.1 Fast none | X AI | 1 | 3.0 | $0.008 | 0/1 | 731ms |
| #96 | Gemini 3.1 Flash Lite none | Google | 1 | 3.0 | $0.013 | 0/1 | 733ms |