AI BENCHY Category
Combined leaderboard
See which AI models perform best on Combined, which remain reliable, and where the biggest gaps are. Sorted by: Response time (avg.), ascending.
| Rank | Model | Company | Combined score | Score | Correct tests | Response time (avg.) |
|---|---|---|---|---|---|---|
| #91 | GPT-5.5 none | OpenAI | 3.0 | 6.4 | 0/1 | 5.56s |
| #154 | Qwen3.5-9B none | Qwen | 3.0 | 4.6 | 0/1 | 5.91s |
| #101 | Mimo V2 Omni none | Xiaomi | 3.0 | 6.0 | 0/1 | 5.96s |
| #127 | Grok 4.20 none | X AI | 3.0 | 5.4 | 0/1 | 6.04s |
| #108 | Qwen3.5-Flash none | Qwen | 3.0 | 5.8 | 0/1 | 6.22s |
| #3 | Gemini 3.5 Flash low | | 10.0 | 9.4 | 1/1 | 6.44s |
| #106 | Grok 4.20 Beta none | X AI | 3.0 | 5.8 | 0/1 | 6.48s |
| #109 | GLM 5V Turbo none | Z.ai | 3.0 | 5.8 | 0/1 | 6.51s |
| #120 | Mimo V2 PRO none | Xiaomi | 3.0 | 5.6 | 0/1 | 6.58s |
| #110 | Seed-2.0-Lite none | Bytedance Seed | 3.0 | 5.8 | 0/1 | 6.59s |
| #95 | Qwen3.5 Plus 2026-02-15 none | Qwen | 3.0 | 6.3 | 0/1 | 6.65s |
| #147 | GPT-4o-mini none | OpenAI | 3.0 | 4.8 | 0/1 | 7.58s |
| #57 | Step 3.7 Flash low | Stepfun | 10.0 | 7.3 | 1/1 | 7.98s |
| #151 | Trinity Large Preview none | Arcee AI | 3.0 | 4.6 | 0/1 | 8.91s |
| #22 | Step 3.7 Flash medium | Stepfun | 10.0 | 8.0 | 1/1 | 9.06s |