AI BENCHY
❤️ Made by XCS

Model name

OpenAI: GPT-5.2

Benchmarks generated from the AI Benchy test suites on 19 Feb 2026

Metrics for OpenAI: GPT-5.2
Rank: #5
Company: OpenAI
Score: 7.92
Stability: 9.30
Cost per result: 2.2838
Total cost: $0.20554
Correct tests: 9/12
Pass rate per test: 80.6%
Unstable tests: 1
Output tokens: 1,123
Reasoning tokens: 12,448
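
The page does not say how the 2.2838 figure above is derived. The listed numbers are consistent with dividing the total cost ($0.20554) by the number of fully correct tests (9) and expressing the result in US cents. The sketch below checks that reading. The function name `cost_per_result` and the formula itself are assumptions inferred from the displayed values, not something AI Benchy documents here.

```python
def cost_per_result(total_cost_usd: float, correct_tests: int) -> float:
    """Hypothetical helper: total cost per fully correct test, in US cents.

    Assumption: this formula is inferred from the figures shown on the page.
    """
    if correct_tests <= 0:
        raise ValueError("correct_tests must be positive")
    return total_cost_usd / correct_tests * 100


# Figures for OpenAI: GPT-5.2 as listed on this page
gpt52 = cost_per_result(0.20554, 9)
print(f"{gpt52:.4f}")  # prints 2.2838
```

The same calculation on the other models listed on this page lands within rounding of their displayed values, which suggests the site computes the figure from unrounded totals.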

Category breakdown

Category | Fully passed tests | Score | Stability | Pass rate per test | Unstable tests | Reasoning score | Cost
Anti-AI Tricks | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 8.25 | $0.01131
Data parsing and extraction | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 9.50 | $0.02230
Domain specific | 1/3 | 4.00 | 7.21 | 55.6% | 1 | 3.17 | $0.13697
Instructions following | 2/2 | 9.50 | 10.00 | 100.0% | 0 | 8.00 | $0.01071
Puzzle Solving | 2/3 | 8.00 | 10.00 | 66.7% | 0 | 8.83 | $0.02427

Compared models

#4 · Qwen
Qwen: Qwen3.5 Plus 2026-02-15
Analysis (medium)
Score: 8.42
Stability: 9.30
Pass rate per test: 86.1%
Unstable tests: 1
Cost per result: 2.3151
Correct tests: 10/12
Total cost: $0.23151

#6 · Anthropic
Anthropic: Claude Sonnet 4.6
Analysis (medium)
Score: 7.00
Stability: 9.30
Pass rate per test: 69.4%
Unstable tests: 1
Cost per result: 9.3797
Correct tests: 8/12
Total cost: $0.75038

#3 · Google
Google: Gemini 3 Pro Preview
Analysis (medium)
Score: 8.42
Stability: 10.00
Pass rate per test: 83.3%
Unstable tests: 0
Cost per result: 0.8028
Correct tests: 10/12
Total cost: $0.08029
