AI BENCHY Compare

Summary

Benchmark comparison: GPT-5.5 vs GPT-5.4 vs Gemini 3.1 Pro Preview vs Claude Opus 4.7

Gemini 3.1 Pro Preview leads on Score with 9.2. GPT-5.5 scores 10.0 on Reliability, tied with the other three models. Claude Opus 4.7 has the lowest total cost at $0.679 and is also the fastest, with an average response time of 4.73s.

Recommended model: Claude Opus 4.7. Its score stays close to the best score here (8.7 vs 9.2), while costing about 2.9x less on average than the other models in this comparison.
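The 2.9x figure is consistent with dividing the mean total cost of the other three models by the total cost of Claude Opus 4.7. The sketch below checks that arithmetic using the totals from the comparison table. That the site derives the ratio this way is an assumption, and the function name is illustrative only.

```python
# Total benchmark cost per model in USD, copied from the comparison table.
total_cost = {
    "GPT-5.5": 3.679,
    "GPT-5.4": 1.210,
    "Gemini 3.1 Pro Preview": 1.054,
    "Claude Opus 4.7": 0.679,
}


def cost_ratio_vs_others(costs, model):
    """Mean cost of every other model divided by the cost of `model`."""
    others = [cost for name, cost in costs.items() if name != model]
    return (sum(others) / len(others)) / costs[model]


ratio = cost_ratio_vs_others(total_cost, "Claude Opus 4.7")
print(f"{ratio:.1f}x")  # prints 2.9x
```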

Benchmarks generated from the AI BENCHY test suites on 2026-06-18.

| Metric | GPT-5.5 | GPT-5.4 | Gemini 3.1 Pro Preview | Claude Opus 4.7 |
| --- | --- | --- | --- | --- |
| Setting | medium | medium | medium | medium |
| Release | 2026-04-24 | 2026-03-05 | 2026-02-19 | 2026-04-16 |
| Score | 9.0 | 8.5 | 9.2 | 8.7 |
| Rank | #9 | #17 | #7 | #13 |
| Reliability | 10.0 | 10.0 | 10.0 | 10.0 |
| Consistency | 8.9 | 8.6 | 10.0 | 9.6 |
| Correct tests | | | | |
| Pass rate per attempt | 87.3% | 76.2% | 90.5% | 82.5% |
| Flaky tests | 3 | 4 | 0 | 1 |
| Total runs | 63 | 63 | 63 | 63 |
| Cost per result | 21.638 | 8.640 | 5.546 | 3.991 |
| Total cost | $3.679 | $1.210 | $1.054 | $0.679 |
| Input price (per 1M tokens) | $5.000 | $2.500 | $2.000 | $5.000 |
| Output price (per 1M tokens) | $30.000 | $15.000 | $12.000 | $25.000 |
| Total input tokens | 34,212 | 34,108 | 41,617 | 65,406 |
| Output tokens | 1,985 | 2,242 | 1,977 | 11,858 |
| Reasoning tokens | 114,925 | 72,707 | 78,896 | 2,198 |
| Response time (avg) | 37.98s | 22.35s | 20.14s | 4.73s |
| Response time (max) | 332.10s | 100.41s | 88.68s | 23.18s |
| Response time (total) | 797.60s | 469.29s | 281.92s | 94.51s |
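The reported total costs can be approximately reproduced from the token counts and per-token prices in the table. The sketch below is a rough check, not the site's actual billing logic: it assumes that reasoning tokens are charged at the output-token price, and the dictionary keys are illustrative names.

```python
# Figures copied from the comparison table.
# Prices are USD per 1M tokens; token counts are totals across all 63 runs.
models = {
    "GPT-5.5": dict(in_price=5.0, out_price=30.0, in_tok=34_212,
                    out_tok=1_985, reason_tok=114_925, reported=3.679),
    "GPT-5.4": dict(in_price=2.5, out_price=15.0, in_tok=34_108,
                    out_tok=2_242, reason_tok=72_707, reported=1.210),
    "Gemini 3.1 Pro Preview": dict(in_price=2.0, out_price=12.0, in_tok=41_617,
                                   out_tok=1_977, reason_tok=78_896, reported=1.054),
    "Claude Opus 4.7": dict(in_price=5.0, out_price=25.0, in_tok=65_406,
                            out_tok=11_858, reason_tok=2_198, reported=0.679),
}


def estimated_total_cost(m):
    """Estimate total cost in USD from token counts and listed prices."""
    input_cost = m["in_tok"] * m["in_price"] / 1_000_000
    # Assumption: reasoning tokens are billed at the output-token rate.
    output_cost = (m["out_tok"] + m["reason_tok"]) * m["out_price"] / 1_000_000
    return input_cost + output_cost


estimates = {name: estimated_total_cost(m) for name, m in models.items()}
for name, estimate in estimates.items():
    reported = models[name]["reported"]
    print(f"{name}: estimated ${estimate:.3f}, reported ${reported:.3f}")
```

Under that assumption every estimate lands within about $0.001 of the reported total. It also shows where the money goes: for GPT-5.5, GPT-5.4, and Gemini 3.1 Pro Preview, reasoning tokens account for most of the bill, while Claude Opus 4.7 spends roughly as much on input as on output.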

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

| Rank | Model | Setting | Cost | Time | Tokens |
| --- | --- | --- | --- | --- | --- |
| #9 | GPT-5.5 | medium | $0.112 | 71.9s | 3,807 |
| #17 | GPT-5.4 | medium | $0.214 | 199.6s | 14,349 |
| #7 | Gemini 3.1 Pro Preview | medium | $0.115 | 87.2s | 9,629 |
| #13 | Claude Opus 4.7 | medium | $0.059 | 26.8s | 2,475 |
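As a quick sanity check on the showcase figures, dividing each cost by its token count gives an effective price per million tokens that can be compared with the listed output prices. The sketch below does that. The page does not say whether the showcase token count includes input or reasoning tokens, so treating it as a single billable total is an assumption.

```python
# Showcase figures as (cost in USD, token count), copied from the entries above.
showcase = {
    "GPT-5.5": (0.112, 3_807),
    "GPT-5.4": (0.214, 14_349),
    "Gemini 3.1 Pro Preview": (0.115, 9_629),
    "Claude Opus 4.7": (0.059, 2_475),
}

# Listed output prices in USD per 1M tokens, from the comparison table.
output_price = {
    "GPT-5.5": 30.0,
    "GPT-5.4": 15.0,
    "Gemini 3.1 Pro Preview": 12.0,
    "Claude Opus 4.7": 25.0,
}


def usd_per_million(cost, tokens):
    """Effective price in USD per 1M tokens."""
    return cost / tokens * 1_000_000


effective = {name: usd_per_million(cost, tokens)
             for name, (cost, tokens) in showcase.items()}
for name, price in effective.items():
    print(f"{name}: ${price:.2f} effective vs ${output_price[name]:.2f} listed")
```

Each effective price comes out slightly below the listed output price, which is what one would expect if the prompt is short and nearly all billed tokens are generated output.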

Charts: Top models by score; Score vs total cost; Response time (avg); Score vs response time (avg); Total output tokens; Score vs total output tokens.

Category breakdown

| Anti-AI tricks | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 4.66s | 606 | 250 | 1,335 |
| GPT-5.4 | 8.3 | 10.0 | 75.0% | 0 | | 4.11s | 606 | 240 | 1,511 |
| Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 7.90s | 498 | 112 | 3,218 |
| Claude Opus 4.7 | 8.3 | 10.0 | 75.0% | 0 | | 1.85s | 894 | 348 | 0 |

| Coding | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 8.8 | 7.8 | 88.9% | 1 | | 59.77s | 7,305 | 362 | 24,959 |
| GPT-5.4 | 8.8 | 7.8 | 88.9% | 1 | | 44.36s | 7,305 | 433 | 24,216 |
| Gemini 3.1 Pro Preview | 7.9 | 9.9 | 66.7% | 0 | | 40.17s | 8,124 | 435 | 41,247 |
| Claude Opus 4.7 | 7.6 | 7.2 | 77.8% | 1 | | 12.96s | 10,635 | 7,629 | 1,114 |

| Combined | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 19.29s | 11,019 | 312 | 2,841 |
| GPT-5.4 | 10.0 | 10.0 | 100.0% | 0 | | 20.57s | 11,019 | 301 | 3,543 |
| Gemini 3.1 Pro Preview | 9.5 | 10.0 | 100.0% | 0 | | 40.61s | 17,240 | 432 | 9,281 |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 21.45s | 24,501 | 2,369 | 1,084 |

| Data parsing and extraction | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 4.18s | 7,140 | 234 | 593 |
| GPT-5.4 | 10.0 | 10.0 | 100.0% | 0 | | 5.32s | 7,140 | 234 | 804 |
| Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 7.72s | 7,265 | 279 | 3,904 |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.37s | 10,533 | 324 | 0 |

| Domain-specific | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 5.3 | 7.2 | 44.4% | 1 | | 164.14s | 723 | 67 | 79,625 |
| GPT-5.4 | 5.3 | 7.2 | 44.4% | 1 | | 74.27s | 619 | 61 | 34,748 |
| Gemini 3.1 Pro Preview | 7.7 | 10.0 | 66.7% | 0 | | 32.73s | 635 | 18 | 12,424 |
| Claude Opus 4.7 | 7.7 | 10.0 | 66.7% | 0 | | 1.17s | 630 | 51 | 0 |

| General intelligence | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 4.16s | 477 | 138 | 223 |
| GPT-5.4 | 4.7 | 3.1 | 33.3% | 1 | | 4.92s | 477 | 145 | 321 |
| Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 11.77s | 490 | 108 | 1,179 |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.87s | 723 | 256 | 0 |

| Instruction following | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 3.36s | 660 | 93 | 538 |
| GPT-5.4 | 10.0 | 10.0 | 100.0% | 0 | | 3.11s | 660 | 93 | 897 |
| Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 9.56s | 621 | 72 | 2,236 |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 1.57s | 939 | 114 | 0 |

| Puzzle solving | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 6.76s | 642 | 241 | 2,225 |
| GPT-5.4 | 8.2 | 7.2 | 88.9% | 1 | | 9.14s | 642 | 441 | 3,815 |
| Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 6.90s | 570 | 235 | 3,128 |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.43s | 939 | 370 | 0 |

| Tool calling | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 10.57s | 5,445 | 258 | 832 |
| GPT-5.4 | 10.0 | 10.0 | 100.0% | 0 | | 13.28s | 5,445 | 264 | 1,031 |
| Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 23.15s | 6,018 | 274 | 982 |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 4.17s | 15,339 | 373 | 0 |

| General knowledge | Score | Consistency | Pass rate per attempt | Flaky tests | Correct tests | Response time (avg) | Input tokens | Output tokens | Reasoning tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | 2.8 | 1.6 | 33.3% | 1 | | 37.86s | 195 | 30 | 1,754 |
| GPT-5.4 | 3.0 | 10.0 | 0.0% | 0 | | 13.95s | 195 | 30 | 1,821 |
| Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 6.27s | 156 | 12 | 1,297 |
| Claude Opus 4.7 | 3.0 | 10.0 | 0.0% | 0 | | 2.25s | 273 | 24 | 0 |
