AI BENCHY

#33

GPT-5 Mini

OpenAI · Release: 2025-08-07 · openai/gpt-5-mini::medium

Average score

5.77

Cost per result

1.200

Consistency

8.80

Total cost

$0.084

Passed tests

7

A test counts as fully passed only if all of its runs passed.
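The all-runs rule above can be sketched in a few lines. This is only an illustration: the function names, the sample data, and the definition of a "flaky" test as one with mixed run results are assumptions, not the benchmark's actual implementation.

```python
from typing import Dict, List


def fully_passed(runs: List[bool]) -> bool:
    """A test passes only when it has at least one run and every run passed."""
    return len(runs) > 0 and all(runs)


def summarize(results: Dict[str, List[bool]]) -> Dict[str, int]:
    """Count passed, failed, and flaky tests from per-run outcomes.

    Assumption: "failed" means "not fully passed", and "flaky" means the
    runs of a test disagree (some passed, some failed).
    """
    passed = sum(1 for runs in results.values() if fully_passed(runs))
    flaky = sum(1 for runs in results.values() if any(runs) and not all(runs))
    return {"passed": passed, "failed": len(results) - passed, "flaky": flaky}


# Hypothetical per-run outcomes, not taken from the benchmark.
example = {
    "test_a": [True, True],    # every run passed: counts as passed
    "test_b": [True, False],   # mixed runs: failed, and also flaky
    "test_c": [False, False],  # every run failed: failed, not flaky
}
print(summarize(example))  # {'passed': 1, 'failed': 2, 'flaky': 1}
```

Under this reading a flaky test is also counted among the failed tests, which is consistent with the 7 passed and 7 failed tests reported here alongside 2 unstable ones.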

Failed tests

7

Pass rate per test: 57.1%

Unstable tests

2

Response time: average 21.47s · total 300.64s · max 82.55s

Did not follow instructions: 3 · Wrong answer: 3 · Timed out: 1

Category breakdown

| Category | Average score | Consistency | Passed tests |
| --- | --- | --- | --- |
| Anti-AI Tricks | 7.00 | 9.62 | 2/3 |
| Data parsing and extraction | 9.88 | 10.00 | 2/2 |
| Domain specific | 1.00 | 7.21 | 0/3 |
| Instructions following | 7.00 | 6.64 | 1/2 |
| Puzzle Solving | 4.34 | 9.78 | 1/3 |
| Tool Calling | 10.00 | 10.00 | 1/1 |
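As a quick arithmetic check, the per-category figures can be rolled up and compared with the headline numbers. The sketch below assumes the overall average score is the mean of the category averages weighted by each category's test count; the page does not state how the overall score is computed, so treat that weighting as a guess. The variable names are illustrative.

```python
# (category, average score, passed tests, total tests), copied from the breakdown.
categories = [
    ("Anti-AI Tricks", 7.00, 2, 3),
    ("Data parsing and extraction", 9.88, 2, 2),
    ("Domain specific", 1.00, 0, 3),
    ("Instructions following", 7.00, 1, 2),
    ("Puzzle Solving", 4.34, 1, 3),
    ("Tool Calling", 10.00, 1, 1),
]

total_tests = sum(total for _, _, _, total in categories)
total_passed = sum(passed for _, _, passed, _ in categories)

# Assumed aggregation: weight each category average by its number of tests.
weighted_score = sum(score * total for _, score, _, total in categories) / total_tests

print(total_passed, total_tests)   # 7 14
print(round(weighted_score, 2))    # 5.77
```

Under that assumption the result matches the reported average score of 5.77, and the pass counts add up to the 7 passed tests out of 14 shown in the summary.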