AI BENCHY
❤️ Made by XCS

Model name

Anthropic: Claude Sonnet 4.6

Benchmarks generated from the Aibenchy test suites on 19 Feb 2026

Metric | Anthropic: Claude Sonnet 4.6
Rank | #6
Company | Anthropic
Score | 7.00
Consistency | 9.30
Cost per result | 9.3797
Total cost | $0.75038
Correct tests | 8/12
Per-test pass rate | 69.4%
Flaky tests | 1
Output tokens | 28,193
Reasoning tokens | 19,665

Category breakdown

Category | Tests fully passed | Score | Consistency | Per-test pass rate | Flaky tests | Reasoning score | Cost
Anti-AI Tricks | 1/2 | 5.50 | 10.00 | 50.0% | 0 | 9.83 | $0.02304
Data parsing and extraction | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 9.83 | $0.04958
Domain specific | 0/3 | 1.00 | 7.21 | 11.1% | 1 | 5.58 | $0.64205
Instructions following | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 10.00 | $0.01497
Puzzle Solving | 3/3 | 10.00 | 10.00 | 100.0% | 0 | 9.44 | $0.02077
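
The category rows appear to roll up into the overall figures reported for the model. The sketch below checks that reading. It assumes, as an inference from the numbers and not from any documented Aibenchy formula, that the overall score, consistency, and per-test pass rate are means of the category values weighted by the number of tests in each category, and that passed tests, flaky tests, and cost are plain sums. The names `CATEGORIES`, `weighted_mean`, and `aggregate` are illustrative.

```python
# Rows transcribed from the category breakdown table.
# Fields: name, passed, total, score, consistency, pass_rate_pct, flaky, cost_usd
CATEGORIES = [
    ("Anti-AI Tricks",              1, 2,  5.50, 10.00,  50.0, 0, 0.02304),
    ("Data parsing and extraction", 2, 2, 10.00, 10.00, 100.0, 0, 0.04958),
    ("Domain specific",             0, 3,  1.00,  7.21,  11.1, 1, 0.64205),
    ("Instructions following",      2, 2, 10.00, 10.00, 100.0, 0, 0.01497),
    ("Puzzle Solving",              3, 3, 10.00, 10.00, 100.0, 0, 0.02077),
]


def weighted_mean(values, weights):
    """Mean of values, each weighted by the test count of its category."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def aggregate(rows):
    """Roll category rows up into overall figures (assumed aggregation rule)."""
    weights = [row[2] for row in rows]  # tests per category
    return {
        "tests_passed": sum(row[1] for row in rows),
        "tests_total": sum(weights),
        "score": weighted_mean([row[3] for row in rows], weights),
        "consistency": weighted_mean([row[4] for row in rows], weights),
        "pass_rate_pct": weighted_mean([row[5] for row in rows], weights),
        "flaky": sum(row[6] for row in rows),
        "cost_usd": sum(row[7] for row in rows),
    }


summary = aggregate(CATEGORIES)
print(f"Correct tests: {summary['tests_passed']}/{summary['tests_total']}")
print(f"Score: {summary['score']:.2f}")
print(f"Consistency: {summary['consistency']:.2f}")
print(f"Per-test pass rate: {summary['pass_rate_pct']:.1f}%")
print(f"Flaky tests: {summary['flaky']}")
print(f"Total cost: ${summary['cost_usd']:.5f}")
```

Under these assumptions the roll-up matches the reported 8/12 correct tests, score 7.00, consistency 9.30, pass rate 69.4%, and 1 flaky test. The summed cost comes to $0.75041 against the reported $0.75038, a gap small enough to be rounding in the per-category figures. The Domain specific category accounts for about 86% of the total cost while passing none of its three tests.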

Compared models


#5 · OpenAI

OpenAI: GPT-5.2

Reasoning (medium)
Score: 7.92
Consistency: 9.30
Per-test pass rate: 80.6%
Flaky tests: 1
Cost per result: 2.2838
Correct tests: 9/12
Total cost: $0.20554


#7 · Z.ai

Z.ai: GLM 5

Reasoning (medium)
Score: 6.83
Consistency: 7.86
Per-test pass rate: 80.6%
Flaky tests: 3
Cost per result: 1.3424
Correct tests: 8/12
Total cost: $0.10740


#4 · Qwen

Qwen: Qwen3.5 Plus 2026-02-15

Reasoning (medium)
Score: 8.42
Consistency: 9.30
Per-test pass rate: 86.1%
Flaky tests: 1
Cost per result: 2.3151
Correct tests: 10/12
Total cost: $0.23151
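
The page does not define the cost per result figure. For all four models listed here, the values are consistent with total cost divided by the number of correct tests, expressed in US cents. The sketch below checks that reading against the reported numbers. The formula is an inference from these four data points, not a documented definition, and the names `MODELS` and `cost_per_correct_test_cents` are illustrative.

```python
# Figures transcribed from the model summary and the comparison cards:
# model -> (total_cost_usd, correct_tests, reported_cost_per_result)
MODELS = {
    "Anthropic: Claude Sonnet 4.6": (0.75038, 8, 9.3797),
    "OpenAI: GPT-5.2": (0.20554, 9, 2.2838),
    "Z.ai: GLM 5": (0.10740, 8, 1.3424),
    "Qwen: Qwen3.5 Plus 2026-02-15": (0.23151, 10, 2.3151),
}


def cost_per_correct_test_cents(total_cost_usd, correct_tests):
    """Assumed definition: total cost spread over correct tests, in US cents."""
    return total_cost_usd / correct_tests * 100


# Absolute gap between the computed and the reported value, per model.
deviations = {}
for name, (cost, correct, reported) in MODELS.items():
    computed = cost_per_correct_test_cents(cost, correct)
    deviations[name] = abs(computed - reported)
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

Every computed value lands within 0.0002 of the reported one, which fits the displayed totals being rounded. Read this way, Claude Sonnet 4.6 costs roughly four times as much per correct test as GPT-5.2 or Qwen3.5 Plus, and about seven times as much as GLM 5.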

