AI BENCHY
❤️ Made by XCS

Model name

StepFun: Step 3.5 Flash

Benchmarks generated from the Aibenchy test sets on 19 Feb 2026

Metric: StepFun: Step 3.5 Flash
Rank: #18
Company: Stepfun
Score: 4.92
Consistency: 7.34
Cost per result: 0.0000
Total cost: $0.00000
Correct tests: 5/12
Pass rate per attempt: 58.3%
Flaky tests: 4
Output tokens: 46,871
Reasoning tokens: 95,440

Category breakdown

| Category | Fully passed tests | Score | Consistency | Pass rate per attempt | Flaky tests | Reasoning score | Cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | 1/2 | 5.50 | 5.81 | 83.3% | 1 | 10.00 | $0.00000 |
| Data parsing and extraction | 1/2 | 5.00 | 10.00 | 50.0% | 0 | 9.75 | $0.00000 |
| Domain specific | 1/3 | 4.00 | 7.21 | 44.4% | 1 | 8.44 | $0.00000 |
| Instructions following | 2/2 | 10.00 | 10.00 | 100.0% | 0 | 9.67 | $0.00000 |
| Puzzle Solving | 0/3 | 2.00 | 4.96 | 33.3% | 2 | 9.22 | $0.00000 |
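
The page does not say how the overall figures are derived from the categories. As a minimal sketch, assuming each overall value is the mean of the category values weighted by the number of tests in that category (an assumption, not a documented formula), the category rows reproduce the overall Score, Consistency, and pass rate reported for this model. The names `CATEGORIES` and `weighted_mean` are hypothetical and used only for illustration.

```python
# Rows copied from the category breakdown:
# (category, number of tests, score, consistency, pass rate per attempt in %)
CATEGORIES = [
    ("Anti-AI Tricks", 2, 5.50, 5.81, 83.3),
    ("Data parsing and extraction", 2, 5.00, 10.00, 50.0),
    ("Domain specific", 3, 4.00, 7.21, 44.4),
    ("Instructions following", 2, 10.00, 10.00, 100.0),
    ("Puzzle Solving", 3, 2.00, 4.96, 33.3),
]


def weighted_mean(rows, column):
    """Mean of one column, weighted by tests per category (assumed weighting)."""
    total_tests = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total_tests


overall_score = weighted_mean(CATEGORIES, 2)
overall_consistency = weighted_mean(CATEGORIES, 3)
overall_pass_rate = weighted_mean(CATEGORIES, 4)

print(round(overall_score, 2), round(overall_consistency, 2), round(overall_pass_rate, 1))
```

Under this assumption the script prints `4.92 7.34 58.3`, matching the overall Score, Consistency, and pass rate per attempt listed for StepFun: Step 3.5 Flash.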

Compared models

#17 · MiniMax

MiniMax: MiniMax M2.5

Reasoning (medium)

Score: 5.08

Consistency: 6.00

Pass rate per attempt: 61.1%

Flaky tests: 6

Cost per result: 4.0276

Correct tests: 5/12

Total cost: $0.20138

#19 · OpenAI

OpenAI: GPT-4o-mini

No reasoning

Score: 4.00

Consistency: 9.98

Pass rate per attempt: 25.0%

Flaky tests: 0

Cost per result: 0.0576

Correct tests: 3/12

Total cost: $0.00173

#16 · Anthropic

Anthropic: Claude Opus 4.6

Reasoning (medium)

Score: 5.42

Consistency: 8.60

Pass rate per attempt: 55.5%

Flaky tests: 2

Cost per result: 12.8695

Correct tests: 6/12

Total cost: $0.77217
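
The page gives no unit or formula for "cost per result". As a rough sketch, assuming it is the total cost in US cents divided by the number of correct tests (an inference from the displayed numbers, not something the page states), the three comparison cards above are reproduced to within rounding of the displayed totals. The names `MODELS` and `cents_per_correct_test` are hypothetical.

```python
# Values copied from the comparison cards:
# (model, total cost in USD, correct tests, displayed cost per result)
MODELS = [
    ("MiniMax: MiniMax M2.5", 0.20138, 5, 4.0276),
    ("OpenAI: GPT-4o-mini", 0.00173, 3, 0.0576),
    ("Anthropic: Claude Opus 4.6", 0.77217, 6, 12.8695),
]


def cents_per_correct_test(total_cost_usd, correct_tests):
    """Assumed definition: total cost in cents divided by correct tests."""
    return total_cost_usd * 100 / correct_tests


for name, total_cost, correct, displayed in MODELS:
    estimate = cents_per_correct_test(total_cost, correct)
    # Small differences are expected because the totals are already rounded.
    print(f"{name}: estimated {estimate:.4f}, displayed {displayed:.4f}")
```

The same assumed formula gives 0 for StepFun: Step 3.5 Flash, whose total cost is listed as $0.00000.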
