Leaderboard for Domain-Specific x API Error

See which AI models are most likely to hit an API error on Domain-Specific tests, so you can spot weak points faster.

Models shown

Total errors

Most affected model

Error reasons

Wrong answer: 412, Timeout: 43, Extra formatting: 17, No answer: 8, API error: 7, Instructions not followed: 1

Categories

Programming: 45, Combined: 26, Tool calls: 17, Anti-AI tricks: 14, Data parsing and extraction: 14, General knowledge: 13, General intelligence: 12, Puzzle solving: 12, Domain-specific: 7, Instruction following: 1

7/7

Rank	Model	Company	API error count	Category score	Total cost	Correct tests	Failed tests	Response time (avg.)
#27	Muse Spark 1.1 high	Meta	1	3.5	$1.694	0/3	3	67.4s
#158	KAT-Coder-Air V2.5 low	Kwaipilot	1	2.9	$0.041	0/3	3	4.99s
#167	Mistral Small 4 medium	Mistral	1	5.3	$0.096	1/3	2	6.11s
#173	DeepSeek V3.2 none	DeepSeek	1	2.9	$0.054	0/3	3	4.17s
#175	Qwen3.6 Plus Preview medium	Qwen	1	3.0	$0.000	0/3	3	22.1s
#199	Hy3 preview none	Tencent	1	3.6	$0.003	0/3	3	17.6s
#210	LFM2-24B-A2B none	Liquid	1	5.9	$0.001	1/3	2	287ms
