AI BENCHY
❤️ Made by XCS

Model name

OpenAI: gpt-oss-120b

Benchmarks were generated from the Aibenchy test sets on 19 Feb 2026.

Metrics for OpenAI: gpt-oss-120b
Rank: #12
Company: OpenAI
Score: 5.75
Stability: 7.19
Cost per result: 0.0951
Total cost: $0.00571
Correct tests: 6/12
Pass rate per test: 63.9%
Unstable tests: 4
Output tokens: 8,060
Reasoning tokens: 23,792

Category breakdown

Category | Fully passed tests | Score | Stability | Pass rate per test | Unstable tests | Reasoning score | Cost
Anti-AI Tricks | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 10.00 | $0.00029
Data parsing and extraction | 1/2 | 5.50 | 5.81 | 83.3% | 1 | 10.00 | $0.00052
Domain specific | 0/3 | 1.00 | 4.41 | 22.2% | 2 | 8.53 | $0.00393
Instructions following | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 9.50 | $0.00040
Puzzle Solving | 1/3 | 5.00 | 7.13 | 44.4% | 1 | 7.89 | $0.00059
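The category rows can be rolled up to cross-check the overall figures for OpenAI: gpt-oss-120b. The sketch below is only a consistency check, not AI Benchy's own method. It assumes each test is run 3 times, which is inferred from the percentages (for example, 83.3% matches 5 of 6 runs) and is not stated on the page. The names `ROWS`, `RUNS_PER_TEST`, and `aggregate` are hypothetical.

```python
# Category rows copied from the breakdown:
# (category, fully passed tests, tests, pass rate per test in %, cost in USD)
ROWS = [
    ("Anti-AI Tricks", 2, 2, 100.0, 0.00029),
    ("Data parsing and extraction", 1, 2, 83.3, 0.00052),
    ("Domain specific", 0, 3, 22.2, 0.00393),
    ("Instructions following", 2, 2, 100.0, 0.00040),
    ("Puzzle Solving", 1, 3, 44.4, 0.00059),
]

# Assumption: 3 runs per test, inferred from the percentages, not documented.
RUNS_PER_TEST = 3


def aggregate(rows, runs_per_test=RUNS_PER_TEST):
    """Roll the per-category rows up into overall totals."""
    fully_passed = sum(r[1] for r in rows)
    tests = sum(r[2] for r in rows)
    # Recover whole passed-run counts from the rounded percentages.
    passed_runs = sum(round(r[3] / 100 * r[2] * runs_per_test) for r in rows)
    total_runs = tests * runs_per_test
    return {
        "fully_passed": fully_passed,
        "tests": tests,
        "pass_rate": round(100 * passed_runs / total_runs, 1),
        "cost": round(sum(r[4] for r in rows), 5),
    }


summary = aggregate(ROWS)
print(summary)
```

Under that assumption the roll-up gives 6 of 12 fully passed tests and a 63.9% pass rate, matching the overall figures. The summed category costs come to $0.00573, slightly above the listed total of $0.00571, which is presumably a rounding effect in the per-category values.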

Compared models

#11 · OpenAI

OpenAI: GPT-5 Nano

Reasoning (medium)

Score: 5.92

Stability: 6.03

Pass rate per test: 72.2%

Unstable tests: 6

Cost per result: 0.4675

Correct tests: 6/12

Total cost: $0.02806

#13 · Anthropic

Anthropic: Claude Sonnet 4.6

No reasoning

Score: 5.75

Stability: 9.42

Pass rate per test: 52.8%

Unstable tests: 1

Cost per result: 0.9480

Correct tests: 6/12

Total cost: $0.05688

#10 · Google

Google: Gemini 3 Flash Preview

No reasoning

Score: 6.25

Stability: 8.60

Pass rate per test: 66.7%

Unstable tests: 2

Cost per result: 0.0754

Correct tests: 7/12

Total cost: $0.00528
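The page does not define "cost per result". For all four models listed here, the figure is close to the total cost expressed in cents divided by the number of correct tests. The sketch below checks that reading. It is an inferred relationship, not a documented formula, and the names `MODELS` and `cents_per_correct_test` are hypothetical.

```python
# (model, total cost in USD, correct tests, cost per result as listed)
MODELS = [
    ("OpenAI: gpt-oss-120b", 0.00571, 6, 0.0951),
    ("OpenAI: GPT-5 Nano", 0.02806, 6, 0.4675),
    ("Anthropic: Claude Sonnet 4.6", 0.05688, 6, 0.9480),
    ("Google: Gemini 3 Flash Preview", 0.00528, 7, 0.0754),
]


def cents_per_correct_test(total_cost_usd, correct_tests):
    """Assumed definition: total cost in cents per fully correct test."""
    return 100 * total_cost_usd / correct_tests


# Largest gap between the derived and the listed value, per model.
gaps = {}
for name, cost, correct, listed in MODELS:
    derived = cents_per_correct_test(cost, correct)
    gaps[name] = abs(derived - listed)
    print(f"{name}: derived {derived:.4f}, listed {listed:.4f}")
```

The derived values agree with the listed ones to within a few ten-thousandths. Small gaps are expected if the site computes the figure from unrounded totals.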