AI BENCHY Compare

Qwen: Qwen3.5-122B-A10B vs StepFun: Step 3.5 Flash

Benchmarks generated from the AI BENCHY test suites on 2026-03-06.

Metric | Qwen: Qwen3.5-122B-A10B (medium, release: 2026-02-24) | StepFun: Step 3.5 Flash (medium, release: 2026-02-01, available for free)
Rank | #10 | #13
Average score | 7.7 | 7.4
Consistency | 9.0 | 9.1
Cost per result | 4.095 | 0.000
Total cost | $0.492 | $0.000
Correct tests | |
Pass rate per test | 79.2% | 68.8%
Flaky tests | 2 | 2
Total runs | 48 (16 x 3) | 48 (16 x 3)
Output tokens | 17,292 | 71,452
Reasoning tokens | 145,625 | 155,147
Response time (avg) | 29.74s | 29.10s
Response time (max) | 119.29s | 170.45s
Response time (total) | 475.83s | 290.96s

Top models by score

Score vs total cost

Response time (avg)

Average score vs response time (avg)

Category breakdown

Anti-AI tricks
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 10.0 | 10.0 | 100.0% | 0 | | 6.99s | 248 | 10,486
StepFun: Step 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 18.54s | 13,924 | 17,208
Mixed
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 10.0 | 10.0 | 100.0% | 0 | | 107.79s | 483 | 11,337
StepFun: Step 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 29.57s | 1,176 | 12,984
Data parsing and extraction
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 9.9 | 10.0 | 100.0% | 0 | | 23.41s | 270 | 16,558
StepFun: Step 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 15.01s | 600 | 13,886
Domain specific
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 10.0 | 7.2 | 11.1% | 1 | | 63.40s | 15,537 | 64,889
StepFun: Step 3.5 Flash | 4.0 | 7.2 | 44.4% | 1 | | 170.45s | 45,350 | 90,436
General intelligence
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 10.0 | 2.2 | 33.3% | 1 | | 34.11s | 66 | 7,592
StepFun: Step 3.5 Flash | 6.0 | 10.0 | 0.0% | 0 | | 6.54s | 2,214 | 2,584
Instruction following
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 10.0 | 10.0 | 100.0% | 0 | | 9.88s | 77 | 7,372
StepFun: Step 3.5 Flash | 9.0 | 6.8 | 83.3% | 1 | | 4.98s | 2,284 | 3,412
Puzzle solving
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 10.0 | 10.0 | 100.0% | 0 | | 17.18s | 289 | 26,165
StepFun: Step 3.5 Flash | 4.0 | 10.0 | 33.3% | 0 | | 7.72s | 5,629 | 10,835
Tool calling
Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens
Qwen: Qwen3.5-122B-A10B | 10.0 | 10.0 | 100.0% | 0 | | 4.60s | 322 | 1,226
StepFun: Step 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 11.91s | 275 | 3,802
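
The per-category token counts can be cross-checked against the summary totals. The sketch below is not part of AI BENCHY; the names `CATEGORY_TOKENS`, `SUMMARY_TOKENS` and `token_totals` are hypothetical, and the numbers are copied by hand from the tables on this page. It assumes that the last two values in each category row are output tokens and reasoning tokens, in that order.

```python
# (output tokens, reasoning tokens) per category, in the order the categories
# appear on the page. Values are transcribed by hand from the category tables.
CATEGORY_TOKENS = {
    "Qwen: Qwen3.5-122B-A10B": [
        (248, 10486), (483, 11337), (270, 16558), (15537, 64889),
        (66, 7592), (77, 7372), (289, 26165), (322, 1226),
    ],
    "StepFun: Step 3.5 Flash": [
        (13924, 17208), (1176, 12984), (600, 13886), (45350, 90436),
        (2214, 2584), (2284, 3412), (5629, 10835), (275, 3802),
    ],
}

# Totals from the summary table (output tokens, reasoning tokens).
SUMMARY_TOKENS = {
    "Qwen: Qwen3.5-122B-A10B": (17292, 145625),
    "StepFun: Step 3.5 Flash": (71452, 155147),
}


def token_totals(rows):
    """Sum output and reasoning tokens over a list of category rows."""
    output = sum(out for out, _ in rows)
    reasoning = sum(reason for _, reason in rows)
    return output, reasoning


for model, rows in CATEGORY_TOKENS.items():
    totals = token_totals(rows)
    # Report whether the category sums agree with the summary table.
    print(model, totals, totals == SUMMARY_TOKENS[model])
```

Under that column assumption, both models' category sums agree with the summary rows for output tokens and reasoning tokens.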
