AI BENCHY Category
Tool Calling Leaderboard
See which AI models perform best on tool calling, which remain reliable, and where the biggest differences lie. Sorted by: correct tests.
| Rank | Model | Company | Tool calling score | Score | Total cost | Correct tests | Response time (avg.) |
|---|---|---|---|---|---|---|---|
| #85 | Gemini 3.1 Flash Lite low | | 10.0 | 6.4 | $0.028 | 1/1 | 5.66s |
| #87 | Nemotron 3 Super medium | NVIDIA | 10.0 | 6.3 | $0.021 | 1/1 | 39.7s |
| #89 | Qwen3.5-35B-A3B medium | Qwen | 10.0 | 6.3 | $0.401 | 1/1 | 4.65s |
| #90 | GPT-5.5 none | OpenAI | 10.0 | 6.3 | $0.231 | 1/1 | 3.90s |
| #91 | Gemini 3 PRO Preview medium | | 10.0 | 6.2 | $0.385 | 1/1 | 12.0s |
| #92 | Seed-2.0-Lite none | Bytedance Seed | 10.0 | 6.2 | $0.019 | 1/1 | 3.94s |
| #93 | Gemini 2.5 Flash none | | 10.0 | 6.2 | $0.016 | 1/1 | 1.91s |
| #94 | Gemini 3.1 Flash Lite minimal | | 10.0 | 6.1 | $0.013 | 1/1 | 3.51s |
| #95 | Gemini 3.1 Flash Lite Preview high | | 10.0 | 6.1 | $2.310 | 1/1 | 7.73s |
| #96 | Gemini 3.1 Flash Lite none | | 10.0 | 6.1 | $0.013 | 1/1 | 2.97s |
| #97 | Qwen3.5-Flash none | Qwen | 10.0 | 6.1 | $0.005 | 1/1 | 3.67s |
| #99 | Nemotron 3 Ultra 550b A55b none | NVIDIA | 10.0 | 6.1 | $0.027 | 1/1 | 2.99s |
| #100 | Qwen3.6 Max Preview none | Qwen | 10.0 | 6.0 | $0.075 | 1/1 | 5.27s |
| #101 | GLM 5 none | Z.ai | 10.0 | 6.0 | $0.027 | 1/1 | 11.1s |
| #102 | Qwen3.6 Flash none | Qwen | 10.0 | 6.0 | $0.015 | 1/1 | 2.49s |