AI BENCHY
โค๏ธ Made by XCS

#13

Step 3.5 Flash

Stepfun · Version: 2026-02-01 · stepfun/step-3.5-flash::medium

Average score

7.4

Cost per result

0.000

Consistency

9.1

Total cost

$0.000

Correct tests

Failed tests: 6

Pass rate per test: 68.8%

Flaky tests

2

Flaky tests had mixed results across runs (at least one pass and one fail).
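
The flaky-test definition above (at least one pass and at least one fail across runs) can be sketched in a few lines. This is a minimal illustration, not the site's actual implementation: the function name `find_flaky_tests`, the input shape, and the example data are all assumptions made for this sketch.

```python
from typing import Dict, List


def find_flaky_tests(runs_by_test: Dict[str, List[bool]]) -> List[str]:
    """Return the names of tests with at least one passing and one failing run.

    `runs_by_test` maps a test name to its per-run outcomes (True = pass).
    This input shape is an assumption made for illustration.
    """
    flaky = []
    for name, outcomes in runs_by_test.items():
        # any() is True if some run passed; not all() is True if some run failed.
        if any(outcomes) and not all(outcomes):
            flaky.append(name)
    return flaky


# Hypothetical example data, not taken from the benchmark.
example_runs = {
    "test_a": [True, True, True],     # always passes: stable
    "test_b": [True, False, True],    # mixed results: flaky
    "test_c": [False, False, False],  # always fails: stable (but failing)
}
flaky_tests = find_flaky_tests(example_runs)
```

Under this definition a test that fails on every run is counted as failed, not flaky, so the flaky count and the failed count measure different things.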

Response time (average)

29.10s

Response time (max): 170.45s

Response time (total): 290.96s

Did not follow instructions: 3
Incorrect answer: 3


Category breakdown

Category                     Average score  Consistency  Correct tests
Anti-AI Tricks               10.0           10.0
Combined                     10.0           10.0
Data parsing and extraction  10.0           10.0
Domain specific              4.0            7.2
General Intelligence         6.0            10.0
Instructions following       9.0            6.8
Puzzle Solving               4.0            10.0
Tool Calling                 10.0           10.0