Error Leaderboard for Wrong Answer

See which AI models run into Wrong Answer errors most often, so you can spot reliability risks before choosing a model. Sorted by: Response time (avg.) ↑.

Models shown

Total errors: 1558

Most affected model: Nemotron 3 Nano Omni 30b A3b Reasoning 9

Categories

Domain-specific: 412
Anti-AI tricks: 293
Programming: 252
Puzzle solving: 201
General knowledge: 168
Combined: 68
Instruction following: 61
General intelligence: 59
Data parsing and extraction: 41
Tool calling: 3
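The per-category counts can be cross-checked against the reported total of 1558 errors. The following is a minimal sketch in Python with the counts copied from the category breakdown; the dictionary and variable names are illustrative and not part of the page.

```python
# Wrong Answer errors per category, copied from the category breakdown.
# The English category names are translations of the page's labels.
category_errors = {
    "Domain-specific": 412,
    "Anti-AI tricks": 293,
    "Programming": 252,
    "Puzzle solving": 201,
    "General knowledge": 168,
    "Combined": 68,
    "Instruction following": 61,
    "General intelligence": 59,
    "Data parsing and extraction": 41,
    "Tool calling": 3,
}

# Sum of all categories; this should match the "Total errors" figure.
total = sum(category_errors.values())

# Share of all Wrong Answer errors that falls in each category.
shares = {name: count / total for name, count in category_errors.items()}

print(f"total errors: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The categories add up to the stated total, and Domain-specific alone accounts for roughly a quarter of all Wrong Answer errors.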

209/209

Rank	Model	Company	Wrong Answer count	Score	Total cost	Correct tests	Failed tests	Response time (avg.)
#89	Gemini 3 Flash Preview none	Google	8	6.8	$0.085	13/22	9	2.95s
#183	Trinity Large Preview none	Arcee AI	12	4.8	$0.008	4/21	17	2.98s
#145	GLM 5V Turbo none	Z.ai	11	5.6	$0.052	8/21	13	2.99s
#94	Claude Opus 4.7 none	Anthropic	3	6.6	$0.505	16/19	3	3.02s
#164	Inkling none	Thinkingmachines	13	5.2	$0.147	6/22	16	3.50s
#124	Qwen3.6 Flash none	Qwen	12	6.1	$0.062	7/22	15	3.74s
#129	Nemotron 3 Ultra none	NVIDIA	12	6.1	$0.095	8/22	14	3.87s
#141	GLM 5 none	Z.ai	12	5.7	$0.041	9/21	12	4.03s
#154	MiMo-V2.5-Pro none	Xiaomi	11	5.5	$0.068	6/22	16	4.12s
#65	Gemini 3.1 Flash Lite medium	Google	7	7.3	$0.117	13/22	9	4.27s
#116	Seed-2.0-Lite none	Bytedance Seed	13	6.2	$0.066	8/22	14	4.40s
#59	Qwen3.7 Max none	Qwen	7	7.4	$0.197	15/22	7	4.52s
#64	Gemini 3.1 Flash Lite Preview medium	Google	7	7.3	$0.115	13/22	9	4.61s
#168	MiMo-V2.5 none	Xiaomi	14	5.1	$0.025	5/22	17	4.62s
#196	Hunter Alpha none	OpenRouter	9	4.2	$0.000	6/18	12	4.70s
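The failed-test count for each model follows from its correct-tests fraction (total minus correct), and the table order follows the ascending sort on average response time. The sketch below reproduces both for a handful of rows copied from the table. The `Row` type and the helper functions are hypothetical names, and dividing the Wrong Answer count by the total number of tests is an illustrative metric of my own, not one the page defines.

```python
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    wrong_answers: int      # "Wrong Answer count" column
    correct_tests: str      # "correct/total", as printed in the table
    avg_response_s: float   # "Response time (avg.)" in seconds


# A few rows copied from the table, deliberately out of order.
rows = [
    Row("Qwen3.7 Max none", 7, "15/22", 4.52),
    Row("Claude Opus 4.7 none", 3, "16/19", 3.02),
    Row("Trinity Large Preview none", 12, "4/21", 2.98),
    Row("Gemini 3 Flash Preview none", 8, "13/22", 2.95),
]


def split_tests(row: Row) -> tuple[int, int]:
    """Parse the 'correct/total' string into two integers."""
    correct, total = (int(part) for part in row.correct_tests.split("/"))
    return correct, total


def failed_tests(row: Row) -> int:
    """Failed tests = total tests minus correct tests."""
    correct, total = split_tests(row)
    return total - correct


def wrong_answers_per_test(row: Row) -> float:
    """Assumed metric: Wrong Answer count divided by total tests."""
    _, total = split_tests(row)
    return row.wrong_answers / total


# Ascending sort on average response time, as on the page.
by_speed = sorted(rows, key=lambda r: r.avg_response_s)

for r in by_speed:
    print(f"{r.model}: {r.avg_response_s:.2f}s, "
          f"failed {failed_tests(r)}, "
          f"wrong answers per test {wrong_answers_per_test(r):.2f}")
```

Normalizing by the number of tests matters because the models were not all run on the same number of tests (between 18 and 22 in the rows shown), so raw counts are not directly comparable.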

Wrong Answer errors

[Charts: Top models by Wrong Answer count; Wrong Answer count vs. Score; Top models by Response time (avg.)]