AI BENCHY
❤️ Made by XCS

Model name

Qwen: Qwen3.5 Plus 2026-02-15

Benchmarks generated from the Aibenchy test sets on: 19 Feb 2026

Metric               Qwen: Qwen3.5 Plus 2026-02-15
Rank                 #14
Company              Qwen
Score                5.67
Consistency          9.99
Cost per result      0.0997
Total cost           $0.00599
Correct tests        6/12
Pass rate per test   50.0%
Flaky tests          0
Output tokens        775
Reasoning tokens     0

Category breakdown

Category                     Fully passed tests  Score  Consistency  Pass rate per test  Flaky tests  Reasoning score  Cost
Anti-AI Tricks               0/2                 1.00   10.00        0.0%                0            -                $0.00017
Data parsing and extraction  2/2                 10.00  10.00        100.0%              0            -                $0.00369
Domain specific              1/3                 4.00   10.00        33.3%               0            -                $0.00036
Instructions following       2/2                 9.50   10.00        100.0%              0            -                $0.00046
Puzzle Solving               1/3                 5.00   9.96         33.3%               0            -                $0.00133
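
As a quick sanity check, the category rows can be summed and compared against the summary figures (6/12 correct tests, $0.00599 total cost). The sketch below is illustrative only: the tuple layout and variable names are assumptions, and the values are copied by hand from the table above. It computes the share of fully passed tests, which may not be the same metric as the page's "pass rate per test".

```python
# Category rows copied from the breakdown table:
# (category, fully passed tests, total tests, cost in USD)
rows = [
    ("Anti-AI Tricks", 0, 2, 0.00017),
    ("Data parsing and extraction", 2, 2, 0.00369),
    ("Domain specific", 1, 3, 0.00036),
    ("Instructions following", 2, 2, 0.00046),
    ("Puzzle Solving", 1, 3, 0.00133),
]

passed = sum(r[1] for r in rows)   # tests passed in full
total = sum(r[2] for r in rows)    # tests overall
cost = sum(r[3] for r in rows)     # summed per-category cost
pass_share = 100 * passed / total  # share of fully passed tests, in percent

print(f"{passed}/{total} fully passed ({pass_share:.1f}%)")
print(f"summed category cost: ${cost:.5f}")
```

The test counts match the summary exactly. The summed category cost comes out about $0.00002 above the reported total, which is presumably down to rounding in the displayed per-category values.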

Compared models

Compare Qwen: Qwen3.5 Plus 2026-02-15 against...

#13 · Anthropic

Anthropic: Claude Sonnet 4.6

No reasoning

Score: 5.75

Consistency: 9.42

Pass rate per test: 52.8%

Flaky tests: 1

Cost per result: 0.9480

Correct tests: 6/12

Total cost: $0.05688


#15 · Z.ai

Z.ai: GLM 5

No reasoning

Score: 5.42

Consistency: 10.00

Pass rate per test: 50.0%

Flaky tests: 0

Cost per result: 0.0704

Correct tests: 6/12

Total cost: $0.00423


#12 · OpenAI

OpenAI: gpt-oss-120b

Reasoning (medium)

Score: 5.75

Consistency: 7.19

Pass rate per test: 63.9%

Flaky tests: 4

Cost per result: 0.0951

Correct tests: 6/12

Total cost: $0.00571
