AI BENCHY

AI BENCHY Errors

API errors

See which AI models run into API errors most often, so you can spot reliability risks before choosing one.

Models shown: 15

Total errors: 144

Most affected model: Qwen3.6 Plus Preview (8 API errors)

| Rank | Model | Company | API errors | Score | Correct tests | Avg. response time |
|------|-------|---------|------------|-------|---------------|--------------------|
| #93 | Qwen3.6 Plus Preview medium | Qwen | 8 | 6.3 | 9/19 | 15.2s |
| #82 | Hy3 preview high | Tencent | 7 | 6.6 | 11/21 | 56.6s |
| #89 | Hy3 preview low | Tencent | 7 | 6.4 | 10/21 | 24.6s |
| #149 | Nemotron 3 Nano Omni 30b A3b Reasoning medium | NVIDIA | 6 | 4.6 | 4/19 | 17.1s |
| #162 | Nemotron 3 Nano Omni 30b A3b Reasoning none | NVIDIA | 6 | 4.1 | 2/19 | 728ms |
| #96 | Ring-2.6-1T none | Inclusionai | 5 | 6.2 | 9/21 | 55.1s |
| #103 | DeepSeek V4 Pro high | DeepSeek | 5 | 6.0 | 8/21 | 65.2s |
| #35 | Gemini 3 PRO Preview medium | Google | 4 | 7.6 | 14/21 | 9.05s |
| #83 | Step 3.5 Flash none | Stepfun | 4 | 6.6 | 6/12 | 39.0s |
| #92 | Laguna M.1 medium | Poolside | 4 | 6.4 | 9/19 | 14.7s |
| #107 | Laguna Xs.2 medium | Poolside | 4 | 5.8 | 6/19 | 6.73s |
| #133 | DeepSeek V3.2 none | DeepSeek | 4 | 5.2 | 6/21 | 13.8s |
| #145 | Laguna M.1 none | Poolside | 4 | 4.8 | 4/19 | 2.89s |
| #146 | Laguna Xs.2 none | Poolside | 4 | 4.8 | 5/19 | 806ms |
| #156 | Hy3 preview none | Tencent | 4 | 4.4 | 4/21 | 12.9s |
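
The response times in the table mix milliseconds and seconds, and the correct-test counts use different denominators (12, 19, or 21 tests), so the raw columns are not directly comparable. The following is a minimal sketch, not part of AI BENCHY itself, of how one might normalize a few rows before comparing them. The helper names `to_seconds`, `pass_rate`, and `summarize` are hypothetical, and the rows are copied by hand from the table.

```python
import re

# A few rows copied by hand from the table:
# (model, API errors, correct tests, average response time)
ROWS = [
    ("Qwen3.6 Plus Preview medium", 8, "9/19", "15.2s"),
    ("Nemotron 3 Nano Omni 30b A3b Reasoning none", 6, "2/19", "728ms"),
    ("Gemini 3 PRO Preview medium", 4, "14/21", "9.05s"),
]


def to_seconds(text: str) -> float:
    """Convert a duration such as '728ms' or '15.2s' to seconds."""
    match = re.fullmatch(r"([0-9.]+)(ms|s)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value


def pass_rate(text: str) -> float:
    """Convert a 'passed/total' string such as '9/19' to a fraction."""
    passed, total = text.split("/")
    return int(passed) / int(total)


def summarize(rows):
    """Return one dict per row with comparable, unit-free fields."""
    return [
        {
            "model": model,
            "api_errors": errors,
            "pass_rate": pass_rate(tests),
            "seconds": to_seconds(duration),
        }
        for model, errors, tests, duration in rows
    ]


# Hypothetical usage: list the sample rows from fastest to slowest.
for row in sorted(summarize(ROWS), key=lambda r: r["seconds"]):
    print(f'{row["model"]}: {row["seconds"]:.3f}s, '
          f'{row["pass_rate"]:.0%} correct, {row["api_errors"]} API errors')
```

Sorting on the normalized `seconds` field avoids ranking `728ms` after `15.2s`, which a plain string or numeric-prefix comparison would do.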

[Charts: Top models by API error count; API error count vs. Score; Top models by average response time]