AI BENCHY
❤️ Made by XCS

Model name

Anthropic: Claude Opus 4.6

Benchmarks generated from the Aibenchy test suites on 19 Feb 2026

Metrics for Anthropic: Claude Opus 4.6
Rank: #16
Company: Anthropic
Score: 5.42
Consistency: 8.60
Cost per result: 12.8695
Total cost: $0.77217
Correct tests: 6/12
Per-test pass rate: 55.5%
Unstable tests: 2
Output tokens: 18,415
Reasoning tokens: 10,289

Category breakdown

Category | Fully passed tests | Score | Consistency | Per-test pass rate | Unstable tests | Reasoning score | Cost
Anti-AI Tricks | 0/2 | 1.00 | 1.62 | 33.3% | 2 | 10.00 | $0.03036
Data parsing and extraction | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 9.83 | $0.07755
Domain specific | 0/3 | 1.00 | 10.00 | 0.0% | 0 | 7.61 | $0.60915
Instructions following | 2/2 | 9.50 | 9.99 | 100.0% | 0 | 9.50 | $0.02231
Puzzle Solving | 2/3 | 7.00 | 10.00 | 66.7% | 0 | 9.44 | $0.03281
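As a sanity check, the category rows can be added up and compared with the summary figures for the model. The sketch below is not part of AI Benchy; the row values are transcribed from the breakdown above, and weighting the per-test pass rate by the number of tests in each category is an assumption about how the overall rate is aggregated.

```python
# Rows transcribed from the category breakdown:
# (category, fully passed, total tests, per-test pass rate in %, cost in USD)
rows = [
    ("Anti-AI Tricks", 0, 2, 33.3, 0.03036),
    ("Data parsing and extraction", 2, 2, 100.0, 0.07755),
    ("Domain specific", 0, 3, 0.0, 0.60915),
    ("Instructions following", 2, 2, 100.0, 0.02231),
    ("Puzzle Solving", 2, 3, 66.7, 0.03281),
]

passed = sum(r[1] for r in rows)
total = sum(r[2] for r in rows)
cost = sum(r[4] for r in rows)

# Assumption: the overall pass rate is the test-count-weighted mean
# of the category pass rates.
weighted_rate = sum(r[2] * r[3] for r in rows) / total

print(f"Fully passed: {passed}/{total}")
print(f"Summed cost: ${cost:.5f}")
print(f"Weighted per-test pass rate: {weighted_rate:.1f}%")
```

The sums agree with the reported 6/12 correct tests, and the summed cost and weighted pass rate land within rounding of the reported $0.77217 and 55.5%.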

Compared models

Compare Anthropic: Claude Opus 4.6 against...

#15 · Z.ai

Z.ai: GLM 5

No reasoning

Score: 5.42

Consistency: 10.00

Per-test pass rate: 50.0%

Unstable tests: 0

Cost per result: 0.0704

Correct tests: 6/12

Total cost: $0.00423


#17 · MiniMax

MiniMax: MiniMax M2.5

Reasoning (medium)

Score: 5.08

Consistency: 6.00

Per-test pass rate: 61.1%

Unstable tests: 6

Cost per result: 4.0276

Correct tests: 5/12

Total cost: $0.20138


#14 · Qwen

Qwen: Qwen3.5 Plus 2026-02-15

No reasoning

Score: 5.67

Consistency: 9.99

Per-test pass rate: 50.0%

Unstable tests: 0

Cost per result: 0.0997

Correct tests: 6/12

Total cost: $0.00599
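
The page does not define "Cost per result". The listed values are consistent with total cost divided by the number of correct tests, expressed in US cents. That reading is an inference from the four models shown here, not a documented formula, and the helper name below is hypothetical. A minimal sketch of the check:

```python
# (total cost in USD, correct tests, listed "Cost per result"),
# transcribed from the model cards on this page.
models = {
    "Anthropic: Claude Opus 4.6": (0.77217, 6, 12.8695),
    "Z.ai: GLM 5": (0.00423, 6, 0.0704),
    "MiniMax: MiniMax M2.5": (0.20138, 5, 4.0276),
    "Qwen: Qwen3.5 Plus 2026-02-15": (0.00599, 6, 0.0997),
}


def cents_per_correct_test(total_cost_usd: float, correct_tests: int) -> float:
    """Assumed formula: total cost in cents divided by correct tests."""
    return 100 * total_cost_usd / correct_tests


# Largest gap between the derived and the listed value per model.
deviations = {}
for name, (cost, correct, listed) in models.items():
    derived = cents_per_correct_test(cost, correct)
    deviations[name] = abs(derived - listed)
    print(f"{name}: derived {derived:.4f}, listed {listed}")
```

Under this assumption the derived values differ from the listed ones by at most about 0.0001, which is what rounding the total cost to five decimals would produce.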
