AI BENCHY Compare

Inception: Mercury 2 vs OpenAI: gpt-oss-120b

Benchmarks generated from the AI BENCHY test suites on: 2026-04-11

| Metric | Mercury 2 (medium; release: 2026-02-24) | gpt-oss-120b (none; release: 2025-08-05; available free) |
|---|---|---|
| Score | 6.5 | 5.2 |
| Rank | #51 | #79 |
| Consistency | 8.6 | 7.9 |
| Correct tests | | |
| Pass rate per test | 53.7% | 38.9% |
| Flaky tests | 3 | 5 |
| Total runs | 54 | 54 |
| Cost per result | 0.580 | 0.221 |
| Total cost | $0.047 | $0.009 |
| Input price | $0.250 / 1M tokens | $0.039 / 1M tokens |
| Output price | $0.750 / 1M tokens | $0.190 / 1M tokens |
| Output tokens | 3,972 | 44,652 |
| Reasoning tokens | 48,333 | 0 |
| Response time (avg) | 2.21s | 11.96s |
| Response time (max) | 14.63s | 68.97s |
| Response time (total) | 37.51s | 179.34s |
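
The headline numbers above can be turned into a few relative figures. The sketch below is illustrative only: the `Summary` class and the "score per dollar" figure are assumptions introduced here, not AI BENCHY metrics, and the page does not say how its own "Cost per result" value is derived, so that value is not reproduced. All inputs are copied from the summary table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Hypothetical container for a few values copied from the summary table."""
    name: str
    score: float
    total_cost_usd: float
    avg_latency_s: float


mercury = Summary("Mercury 2", score=6.5, total_cost_usd=0.047, avg_latency_s=2.21)
gpt_oss = Summary("gpt-oss-120b", score=5.2, total_cost_usd=0.009, avg_latency_s=11.96)

# How many times longer gpt-oss-120b takes per response, on average.
latency_ratio = gpt_oss.avg_latency_s / mercury.avg_latency_s

# How many times more the Mercury 2 run cost in total.
cost_ratio = mercury.total_cost_usd / gpt_oss.total_cost_usd

# Illustrative derived figure (an assumption, not a site metric): score per dollar spent.
score_per_dollar = {m.name: m.score / m.total_cost_usd for m in (mercury, gpt_oss)}

print(f"Average latency ratio (gpt-oss-120b / Mercury 2): {latency_ratio:.1f}x")
print(f"Total cost ratio (Mercury 2 / gpt-oss-120b): {cost_ratio:.1f}x")
for name, value in score_per_dollar.items():
    print(f"{name}: {value:.0f} score points per USD")
```

Read this way, Mercury 2 scores higher and answers several times faster on average, while gpt-oss-120b is several times cheaper in total for the same 54 runs.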

Charts (not reproduced here): Top models by score; Score vs. total cost; Response time (avg); Score vs. response time (avg); Total output tokens; Score vs. total output tokens

Category breakdown

| Category | Model | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (avg) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|---|
| Anti-AI techniques | Mercury 2 | 6.9 | 9.9 | 50.0% | 0 | | 1.12s | 2,546 | 2,609 |
| Anti-AI techniques | gpt-oss-120b | 6.6 | 8.0 | 58.3% | 1 | | 6.03s | 4,867 | 0 |
| Coding | Mercury 2 | 10.0 | 10.0 | 100.0% | 0 | | 1.53s | 249 | 2,213 |
| Coding | gpt-oss-120b | 4.3 | 1.1 | 66.7% | 1 | | 9.57s | 3,232 | 0 |
| Combined | Mercury 2 | 10.0 | 10.0 | 100.0% | 0 | | 3.28s | 268 | 4,887 |
| Combined | gpt-oss-120b | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Data parsing and extraction | Mercury 2 | 7.3 | 5.9 | 83.3% | 1 | | 1.11s | 183 | 1,656 |
| Data parsing and extraction | gpt-oss-120b | 6.5 | 10.0 | 50.0% | 0 | | 7.12s | 598 | 0 |
| Domain-specific | Mercury 2 | 2.9 | 7.2 | 11.1% | 1 | | 6.48s | 41 | 30,754 |
| Domain-specific | gpt-oss-120b | 3.0 | 10.0 | 0.0% | 0 | | 34.98s | 29,483 | 0 |
| General intelligence | Mercury 2 | 4.8 | 10.0 | 0.0% | 0 | | 821ms | 137 | 542 |
| General intelligence | gpt-oss-120b | 4.6 | 10.0 | 0.0% | 0 | | 2.83s | 586 | 0 |
| Instruction following | Mercury 2 | 10.0 | 10.0 | 100.0% | 0 | | 1.07s | 14 | 958 |
| Instruction following | gpt-oss-120b | 8.4 | 6.9 | 83.3% | 1 | | 5.10s | 1,982 | 0 |
| Puzzle solving | Mercury 2 | 3.9 | 7.5 | 22.2% | 1 | | 934ms | 354 | 2,758 |
| Puzzle solving | gpt-oss-120b | 4.5 | 4.8 | 44.5% | 2 | | 6.86s | 3,904 | 0 |
| Tool calling | Mercury 2 | 10.0 | 10.0 | 100.0% | 0 | | 1.89s | 180 | 1,956 |
| Tool calling | gpt-oss-120b | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 |
