
AI BENCHY Compare

Compared models

Benchmarks generated from the AI BENCHY test sets dated 2026-04-14

| Metric | Elephant (none) | Qwen3.5-27B (none) | Elephant (medium) | Qwen3.5-27B (medium) |
| --- | --- | --- | --- | --- |
| Version | 2026-04-14 | 2026-02-24 | 2026-04-14 | 2026-02-24 |
| Score | 5.2 | 5.9 | 5.2 | 8.4 |
| Rank | #81 | #64 | #77 | #8 |
| Consistency | 9.6 | 9.2 | 9.6 | 8.8 |
| Correct tests | | | | |
| Pass rate per test | 31.5% | 38.9% | 29.6% | 81.5% |
| Unstable tests | 1 | 2 | 1 | 3 |
| Total runs | 54 | 54 | 54 | 54 |
| Cost per result | 0.000 | 0.265 | 0.000 | 3.822 |
| Total cost | $0.000 | $0.016 | $0.000 | $0.497 |
| Input price | $0.000 / 1M | $0.195 / 1M | $0.000 / 1M | $0.195 / 1M |
| Output price | $0.000 / 1M | $1.560 / 1M | $0.000 / 1M | $1.560 / 1M |
| Output tokens | 2,573 | 3,545 | 2,596 | 2,500 |
| Reasoning tokens | 0 | 0 | 0 | 242,500 |
| Response time (average) | 1.23s | 1.74s | 1.27s | 53.03s |
| Response time (max) | 3.81s | 9.39s | 3.70s | 163.96s |
| Response time (total) | 22.16s | 31.32s | 22.82s | 954.46s |

Charts (not reproduced here): Top models by score; Score vs. total cost; Response time (average); Score vs. response time (average); Total output tokens; Score vs. total output tokens.

Category breakdown

Anti-AI techniques

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 6.6 | 10.0 | 50.0% | 0 | | 963ms | 610 | 0 |
| Qwen3.5-27B (none) | 4.8 | 10.0 | 25.0% | 0 | | 788ms | 267 | 0 |
| Elephant (medium) | 6.6 | 10.0 | 50.0% | 0 | | 1.19s | 815 | 0 |
| Qwen3.5-27B (medium) | 8.7 | 7.9 | 91.7% | 1 | | 19.75s | 569 | 31,505 |

Coding

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 6.4 | 3.3 | 66.7% | 1 | | 1.39s | 375 | 0 |
| Qwen3.5-27B (none) | 10.0 | 10.0 | 100.0% | 0 | | 2.51s | 381 | 0 |
| Elephant (medium) | 5.1 | 3.3 | 33.3% | 1 | | 1.30s | 365 | 0 |
| Qwen3.5-27B (medium) | 10.0 | 10.0 | 100.0% | 0 | | 70.35s | 375 | 19,165 |

Mixed

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 3.0 | 10.0 | 0.0% | 0 | | 3.81s | 731 | 0 |
| Qwen3.5-27B (none) | 2.8 | 1.6 | 33.3% | 1 | | 9.39s | 1,461 | 0 |
| Elephant (medium) | 3.0 | 10.0 | 0.0% | 0 | | 3.70s | 562 | 0 |
| Qwen3.5-27B (medium) | 10.0 | 10.0 | 100.0% | 0 | | 163.96s | 483 | 9,991 |

Data parsing and extraction

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 6.5 | 10.0 | 50.0% | 0 | | 1.04s | 246 | 0 |
| Qwen3.5-27B (none) | 10.0 | 10.0 | 100.0% | 0 | | 1.43s | 243 | 0 |
| Elephant (medium) | 6.5 | 10.0 | 50.0% | 0 | | 979ms | 246 | 0 |
| Qwen3.5-27B (medium) | 10.0 | 10.0 | 100.0% | 0 | | 30.26s | 270 | 16,150 |

Domain-specific

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 3.0 | 10.0 | 0.0% | 0 | | 927ms | 24 | 0 |
| Qwen3.5-27B (none) | 3.0 | 10.0 | 0.0% | 0 | | 540ms | 15 | 0 |
| Elephant (medium) | 3.0 | 10.0 | 0.0% | 0 | | 925ms | 24 | 0 |
| Qwen3.5-27B (medium) | 5.3 | 10.0 | 33.3% | 0 | | 79.53s | 43 | 52,368 |

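General intelligence

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 4.0 | 10.0 | 0.0% | 0 | | 854ms | 106 | 0 |
| Qwen3.5-27B (none) | 5.0 | 10.0 | 0.0% | 0 | | 2.51s | 126 | 0 |
| Elephant (medium) | 4.3 | 10.0 | 0.0% | 0 | | 920ms | 105 | 0 |
| Qwen3.5-27B (medium) | 6.1 | 3.1 | 66.7% | 1 | | 101.41s | 70 | 23,147 |
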
Instruction following

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 9.8 | 10.0 | 100.0% | 0 | | 1.03s | 81 | 0 |
| Qwen3.5-27B (none) | 4.8 | 10.0 | 0.0% | 0 | | 815ms | 69 | 0 |
| Elephant (medium) | 9.8 | 10.0 | 100.0% | 0 | | 987ms | 82 | 0 |
| Qwen3.5-27B (medium) | 10.0 | 10.0 | 100.0% | 0 | | 19.66s | 97 | 11,638 |

Puzzle solving

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 3.3 | 10.0 | 0.0% | 0 | | 849ms | 170 | 0 |
| Qwen3.5-27B (none) | 6.7 | 7.9 | 55.6% | 1 | | 1.37s | 680 | 0 |
| Elephant (medium) | 3.7 | 10.0 | 0.0% | 0 | | 867ms | 166 | 0 |
| Qwen3.5-27B (medium) | 8.2 | 7.7 | 77.8% | 1 | | 64.61s | 245 | 77,213 |

Tool calling

| Model | Score | Consistency | Pass rate per test | Unstable tests | Correct tests | Response time (average) | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elephant (none) | 3.0 | 10.0 | 0.0% | 0 | | 2.79s | 230 | 0 |
| Qwen3.5-27B (none) | 10.0 | 10.0 | 100.0% | 0 | | 3.54s | 303 | 0 |
| Elephant (medium) | 3.0 | 10.0 | 0.0% | 0 | | 2.83s | 231 | 0 |
| Qwen3.5-27B (medium) | 10.0 | 10.0 | 100.0% | 0 | | 7.45s | 348 | 1,323 |
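
One way to sanity-check how the category breakdown lines up with the summary table is to add up the reasoning-token counts in the fourth row of each category and compare the sum with the overall reasoning-token figure reported for Qwen3.5-27B (medium). The sketch below is illustrative only: the dictionary keys are English renderings of the category names, the variable and function names are hypothetical, and the numbers are copied from the tables above.

```python
# Reasoning-token counts from the fourth row of each category table
# (the only row with non-zero reasoning tokens), copied from the page.
reasoning_tokens_by_category = {
    "Anti-AI techniques": 31_505,
    "Coding": 19_165,
    "Mixed": 9_991,
    "Data parsing and extraction": 16_150,
    "Domain-specific": 52_368,
    "General intelligence": 23_147,
    "Instruction following": 11_638,
    "Puzzle solving": 77_213,
    "Tool calling": 1_323,
}

# Overall reasoning-token figure from the summary table for
# Qwen3.5-27B (medium).
REPORTED_TOTAL_REASONING_TOKENS = 242_500


def total_reasoning_tokens(by_category):
    """Sum the per-category reasoning-token counts."""
    return sum(by_category.values())


category_total = total_reasoning_tokens(reasoning_tokens_by_category)
matches_summary = category_total == REPORTED_TOTAL_REASONING_TOKENS
print(category_total, matches_summary)  # prints: 242500 True
```

The per-category counts add up exactly to the summary figure, which supports reading the four rows of each category table in the same order as the four columns of the summary table.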
