Error Leaderboard for Invalid Tool Call

See which AI models run into Invalid tool call errors most often, so you can spot reliability risks before you choose a model.

Models shown

Total errors

100

Most affected model

Ling-2.6-flash (3 invalid tool calls)

Categories

Combined: 91
Tool calls: 9

83/83

Rank	Model	Company	Invalid tool calls	Score	Total cost	Correct tests	Failed tests	Response time (avg.)
#174	Ling-2.6-flash none	Inclusionai	3	4.9	$0.002	6/22	16	10.7s
#27	Muse Spark 1.1 high	Meta	2	8.1	$1.694	12/22	10	31.5s
#28	Inkling high	Thinkingmachines	2	8.0	$1.006	15/22	7	64.2s
#87	Gemini 3.5 Flash minimal	Google	2	6.8	$0.300	14/22	8	2.65s
#91	GLM 5V Turbo medium	Z.ai	2	6.7	$0.457	11/21	10	23.1s
#96	Qwen3.6 27B medium	Qwen	2	6.5	$0.779	10/22	12	106.3s
#119	Inkling low	Thinkingmachines	2	6.1	$0.187	10/22	12	5.15s
#120	Qwen3.6 Flash none	Qwen	2	6.1	$0.062	7/22	15	3.74s
#146	DeepSeek V4 Flash none	DeepSeek	2	5.6	$0.044	5/22	17	36.8s
#148	Qwen3.6 27B none	Qwen	2	5.5	$0.087	7/22	15	10.7s
#165	Qwen3.5-9B none	Qwen	2	5.1	$0.021	4/22	18	19.2s
#167	North Mini Code none	Cohere	2	5.1	$0.000	4/22	18	29.9s
#169	DeepSeek V3.2 none	DeepSeek	2	5.0	$0.054	6/22	16	18.3s
#172	GLM 4.7 Flash none	Z.ai	2	4.9	$0.016	6/22	16	9.15s
#190	GLM 4.7 Flash medium	Z.ai	2	4.3	$0.166	4/22	18	142.6s
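
The raw counts above are easier to compare once they are normalized by the number of tests each model ran. The following is a minimal sketch, assuming the invalid tool call count and the test totals come from the same run (the page does not say so explicitly). The `Row` type and the `invalid_call_rate` and `rank_by_rate` helpers are hypothetical names introduced for illustration, and only three rows from the table are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One leaderboard row; field names are illustrative, not from the page."""
    model: str
    invalid_tool_calls: int
    correct_tests: int
    total_tests: int


def invalid_call_rate(row: Row) -> float:
    """Invalid tool calls per test (a derived metric, not shown on the page)."""
    if row.total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return row.invalid_tool_calls / row.total_tests


def rank_by_rate(rows: list[Row]) -> list[Row]:
    """Sort rows from the highest to the lowest invalid tool call rate."""
    return sorted(rows, key=invalid_call_rate, reverse=True)


# Values copied from three rows of the table above.
ROWS = [
    Row("Ling-2.6-flash none", 3, 6, 22),
    Row("Muse Spark 1.1 high", 2, 12, 22),
    Row("GLM 5V Turbo medium", 2, 11, 21),
]

if __name__ == "__main__":
    for row in rank_by_rate(ROWS):
        rate = invalid_call_rate(row)
        print(f"{row.model}: {rate:.3f} invalid tool calls per test")
```

Under this normalization, GLM 5V Turbo medium ranks slightly above the other models with two invalid tool calls, because it ran 21 tests rather than 22.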

Invalid tool call errors

[Charts: Top models by invalid tool call count; Invalid tool call count vs. score; Top models by response time (avg.)]