AI BENCHY

AI BENCHY Compare

Anthropic: Claude Opus 4.7 vs Google: Gemma 4 26B A4B

Last updated at: 2026-04-22

| Metric | Claude Opus 4.7 (none, Release: 2026-04-16) | Gemma 4 26B A4B (medium, Release: 2026-04-03, Free Available) |
| --- | --- | --- |
| Score | 9.2 | 8.0 |
| Rank | #4 | #25 |
| Consistency | 10.0 | 9.0 |
| Tests Correct | | |
| Attempt pass rate | 88.9% | 75.9% |
| Flaky tests | 0 | 2 |
| Total Runs | 54 | 54 |
| Cost per result | 3.155 | 0.214 |
| Total Cost | $0.505 | $0.028 |
| Input Price | $5.000 / 1M | $0.070 / 1M |
| Output Price | $25.000 / 1M | $0.340 / 1M |
| Output Tokens | 6,326 | 15,928 |
| Reasoning Tokens | 0 | 44,631 |
| Response Time (avg) | 3.13s | 25.03s |
| Response Time (max) | 18.27s | 147.47s |
| Response Time (total) | 56.33s | 425.48s |
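
As a reading aid, the summary figures can be turned into a few derived comparisons. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that "Attempt pass rate" means passing runs divided by Total Runs, which the page does not state.

```python
# Figures copied from the summary table; nothing here is new benchmark data.
MODELS = {
    "Claude Opus 4.7": {
        "pass_rate": 0.889, "runs": 54, "total_cost": 0.505, "avg_time_s": 3.13,
    },
    "Gemma 4 26B A4B": {
        "pass_rate": 0.759, "runs": 54, "total_cost": 0.028, "avg_time_s": 25.03,
    },
}


def passing_attempts(pass_rate: float, runs: int) -> int:
    """Estimate passing runs, ASSUMING pass_rate = passing runs / total runs."""
    return round(pass_rate * runs)


def ratio(a: float, b: float) -> float:
    """Return how many times larger a is than b."""
    return a / b


claude = MODELS["Claude Opus 4.7"]
gemma = MODELS["Gemma 4 26B A4B"]

for name, stats in MODELS.items():
    passed = passing_attempts(stats["pass_rate"], stats["runs"])
    print(f"{name}: about {passed} of {stats['runs']} attempts passed")

cost_ratio = ratio(claude["total_cost"], gemma["total_cost"])
time_ratio = ratio(gemma["avg_time_s"], claude["avg_time_s"])
print(f"Total cost, Claude vs Gemma: {cost_ratio:.1f}x")
print(f"Average response time, Gemma vs Claude: {time_ratio:.1f}x")
```

The ratios are computed from rounded table values, so they are approximate.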

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

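Anti-AI Tricks

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 8.3 | 10.0 | 75.0% | 0 | | 2.12s | 522 | 0 |
| Gemma 4 26B A4B | 10.0 | 10.0 | 100.0% | 0 | | 6.20s | 1,142 | 3,045 |

Coding

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.84s | 494 | 0 |
| Gemma 4 26B A4B | 2.8 | 10.0 | 0.0% | 0 | | 147.47s | 3,516 | 4,676 |

Combined

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 9.5 | 10.0 | 100.0% | 0 | | 18.27s | 3,504 | 0 |
| Gemma 4 26B A4B | 9.6 | 10.0 | 100.0% | 0 | | 73.55s | 5,415 | 13,112 |

Data parsing and extraction

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.15s | 324 | 0 |
| Gemma 4 26B A4B | 10.0 | 10.0 | 100.0% | 0 | | 16.51s | 1,567 | 2,827 |

Domain specific

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 7.7 | 10.0 | 66.7% | 0 | | 1.19s | 78 | 0 |
| Gemma 4 26B A4B | 2.9 | 4.4 | 22.2% | 2 | | 23.62s | 2,469 | 7,105 |

General Intelligence

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 3.47s | 257 | 0 |
| Gemma 4 26B A4B | 10.0 | 10.0 | 100.0% | 0 | | 29.76s | 25 | 5,075 |

Instructions following

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 1.46s | 114 | 0 |
| Gemma 4 26B A4B | 10.0 | 10.0 | 100.0% | 0 | | 17.54s | 887 | 4,470 |

Puzzle Solving

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.58s | 661 | 0 |
| Gemma 4 26B A4B | 7.9 | 9.6 | 66.7% | 0 | | 8.52s | 457 | 3,065 |

Tool Calling

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 4.74s | 372 | 0 |
| Gemma 4 26B A4B | 10.0 | 10.0 | 100.0% | 0 | | 9.01s | 450 | 1,256 |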
