AI BENCHY Compare

Google: Gemma 4 31B vs OpenAI: GPT-5.5

Last updated at: 2026-05-22

| Metric | Gemma 4 31B (none) | GPT-5.5 (medium) |
| --- | --- | --- |
| Release | 2026-04-02 | 2026-04-24 |
| Availability | Free Available | |
| Score | 6.7 | 8.7 |
| Rank | #76 | #11 |
| Reliability | 10.0 | 10.0 |
| Consistency | 10.0 | 8.8 |
| Tests Correct | | |
| Attempt pass rate | 50.0% | 86.7% |
| Flaky tests | 0 | 3 |
| Total Runs | 60 | 60 |
| Cost per result | 0.030 | 21.891 |
| Total Cost | $0.003 | $3.503 |
| Input Price | $0.120 / 1M | $5.000 / 1M |
| Output Price | $0.370 / 1M | $30.000 / 1M |
| Output Tokens | 1,398 | 1,973 |
| Reasoning Tokens | 0 | 109,510 |
| Response Time (avg) | 3.84s | 37.89s |
| Response Time (max) | 26.13s | 332.10s |
| Response Time (total) | 69.13s | 757.71s |
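
The listed prices and token counts allow a rough cross-check of the Total Cost figures. The sketch below estimates only the output-side spend; it assumes that reasoning tokens are billed at the output rate, which the page does not state. Input token counts are not listed, so the input side is left out. The dictionary and function names are illustrative, not part of AI BENCHY.

```python
# Per-million-token prices and token counts copied from the comparison table.
PRICES_PER_MILLION = {
    "Gemma 4 31B": {"input": 0.120, "output": 0.370},
    "GPT-5.5": {"input": 5.000, "output": 30.000},
}
TOKENS = {
    "Gemma 4 31B": {"output": 1_398, "reasoning": 0},
    "GPT-5.5": {"output": 1_973, "reasoning": 109_510},
}


def output_side_cost(model: str) -> float:
    """Estimate the USD spent on generated tokens for one model.

    Assumption (not stated on the page): reasoning tokens are billed
    at the same rate as ordinary output tokens.
    """
    counts = TOKENS[model]
    billed_tokens = counts["output"] + counts["reasoning"]
    return billed_tokens * PRICES_PER_MILLION[model]["output"] / 1_000_000


if __name__ == "__main__":
    for model in TOKENS:
        print(f"{model}: ${output_side_cost(model):.4f}")
```

Under that assumption the output side comes to roughly $3.34 for GPT-5.5 and about $0.0005 for Gemma 4 31B, both below the reported totals of $3.503 and $0.003. The remainder would be input tokens if the assumption holds.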

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Gemma 4 31B | 6.5 | 10.0 | 50.0% | 0 | | 1.85s | 45 | 0 |
| Anti-AI Tricks | GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 4.66s | 250 | 1,335 |
| Coding | Gemma 4 31B | 6.8 | 10.0 | 50.0% | 0 | | 14.84s | 726 | 0 |
| Coding | GPT-5.5 | 8.2 | 6.7 | 83.3% | 1 | | 69.68s | 341 | 19,515 |
| Combined | Gemma 4 31B | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Combined | GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 19.29s | 312 | 2,841 |
| Data parsing and extraction | Gemma 4 31B | 10.0 | 10.0 | 100.0% | 0 | | 2.25s | 285 | 0 |
| Data parsing and extraction | GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 4.18s | 234 | 593 |
| Domain specific | Gemma 4 31B | 7.7 | 10.0 | 66.7% | 0 | | 3.22s | 27 | 0 |
| Domain specific | GPT-5.5 | 5.3 | 7.2 | 44.4% | 1 | | 164.14s | 67 | 79,625 |
| General Intelligence | Gemma 4 31B | 10.0 | 10.0 | 100.0% | 0 | | 2.09s | 117 | 0 |
| General Intelligence | GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 4.16s | 138 | 223 |
| Instructions following | Gemma 4 31B | 6.5 | 10.0 | 50.0% | 0 | | 2.84s | 78 | 0 |
| Instructions following | GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 3.36s | 93 | 538 |
| Puzzle Solving | Gemma 4 31B | 6.5 | 10.0 | 33.3% | 0 | | 2.95s | 108 | 0 |
| Puzzle Solving | GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 6.78s | 250 | 2,254 |
| Tool Calling | Gemma 4 31B | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Tool Calling | GPT-5.5 | 10.0 | 10.0 | 100.0% | 0 | | 10.57s | 258 | 832 |
| Trivia | Gemma 4 31B | 3.0 | 10.0 | 0.0% | 0 | | 1.25s | 12 | 0 |
| Trivia | GPT-5.5 | 2.8 | 1.6 | 33.3% | 1 | | 37.86s | 30 | 1,754 |
