AI BENCHY Compare

Gemini 3 PRO Preview vs xAI: Grok Build 0.1

Last updated at: 2026-05-22

| Metric | Gemini 3 PRO Preview (medium) | Grok Build 0.1 (none) |
| --- | --- | --- |
| Release | 2025-11-18 | 2026-05-21 |
| Score | 8.1 | 6.6 |
| Rank | #19 | #82 |
| Reliability | N/A | 10.0 |
| Consistency | 10.0 | 8.0 |
| Tests Correct | | |
| Attempt pass rate | 73.7% | 60.4% |
| Flaky tests | 0 | 4 |
| Total Runs | 60 | 57 |
| Cost per result | 1.406 | 7.805 |
| Total Cost | $0.197 | $0.547 |
| Input Price | $0.000 / 1M | $1.000 / 1M |
| Output Price | $0.000 / 1M | $2.000 / 1M |
| Output Tokens | 1,508 | 267,275 |
| Reasoning Tokens | 10,084 | 0 |
| Response Time (avg) | 9.06s | 28.69s |
| Response Time (max) | 26.24s | 138.35s |
| Response Time (total) | 90.58s | 459.00s |
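
As a rough cross-check of the Grok Build 0.1 figures above, the short Python sketch below multiplies the listed output tokens by the listed output price and compares the result with the listed total cost. It assumes that total cost is simply input cost plus output cost at the listed per-1M prices; the page does not state this, and it does not report input tokens, so the implied input token count is only an estimate derived from rounded figures. All variable and function names are illustrative.

```python
# Figures copied from the Grok Build 0.1 column of the comparison table.
OUTPUT_TOKENS = 267_275
OUTPUT_PRICE_PER_M = 2.000  # USD per 1M output tokens
INPUT_PRICE_PER_M = 1.000   # USD per 1M input tokens
TOTAL_COST = 0.547          # USD, as listed (rounded to three decimals)


def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for a token count at a per-1M-token price."""
    return tokens / 1_000_000 * price_per_million


# Cost attributable to output tokens alone.
out_cost = token_cost(OUTPUT_TOKENS, OUTPUT_PRICE_PER_M)

# ASSUMPTION: whatever is left of the total cost was spent on input tokens.
remainder = TOTAL_COST - out_cost
implied_input_tokens = remainder / INPUT_PRICE_PER_M * 1_000_000

print(f"Output cost:          ${out_cost:.5f}")
print(f"Remainder:            ${remainder:.5f}")
print(f"Implied input tokens: {implied_input_tokens:,.0f}")
```

Under that assumption, output tokens account for almost all of the listed total cost, which is consistent with the large output token count in the table.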

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Gemini 3 PRO Preview | 10.0 | 10.0 | 100.0% | 0 | | 14.99s | 149 | 1,485 |
| Anti-AI Tricks | Grok Build 0.1 | 8.7 | 7.9 | 91.7% | 1 | | 6.30s | 11,162 | 0 |
| Coding | Gemini 3 PRO Preview | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Coding | Grok Build 0.1 | 10.0 | 10.0 | 100.0% | 0 | | 21.41s | 16,568 | 0 |
| Combined | Gemini 3 PRO Preview | 3.0 | 10.0 | 0.0% | 0 | | 10.37s | 351 | 952 |
| Combined | Grok Build 0.1 | 0.0 | 0.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Data parsing and extraction | Gemini 3 PRO Preview | 10.0 | 10.0 | 100.0% | 0 | | 10.84s | 279 | 3,156 |
| Data parsing and extraction | Grok Build 0.1 | 4.7 | 1.6 | 66.7% | 1 | | 9.33s | 6,359 | 0 |
| Domain specific | Gemini 3 PRO Preview | 5.3 | 10.0 | 33.3% | 0 | | 7.01s | 15 | 1,195 |
| Domain specific | Grok Build 0.1 | 3.6 | 7.2 | 22.2% | 1 | | 103.71s | 179,469 | 0 |
| General Intelligence | Gemini 3 PRO Preview | 10.0 | 10.0 | 100.0% | 0 | | 9.34s | 78 | 374 |
| General Intelligence | Grok Build 0.1 | 4.3 | 10.0 | 0.0% | 0 | | 12.47s | 6,647 | 0 |
| Instructions following | Gemini 3 PRO Preview | 9.8 | 10.0 | 100.0% | 0 | | 3.26s | 69 | 754 |
| Instructions following | Grok Build 0.1 | 9.8 | 10.0 | 100.0% | 0 | | 7.36s | 8,970 | 0 |
| Puzzle Solving | Gemini 3 PRO Preview | 10.0 | 10.0 | 100.0% | 0 | | 3.91s | 243 | 1,197 |
| Puzzle Solving | Grok Build 0.1 | 6.4 | 7.7 | 55.6% | 1 | | 9.55s | 14,982 | 0 |
| Tool Calling | Gemini 3 PRO Preview | 10.0 | 10.0 | 100.0% | 0 | | 11.96s | 324 | 971 |
| Tool Calling | Grok Build 0.1 | 0.0 | 0.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Trivia | Gemini 3 PRO Preview | 0.0 | 0.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Trivia | Grok Build 0.1 | 3.0 | 10.0 | 0.0% | 0 | | 36.09s | 23,118 | 0 |
