AI BENCHY Compare

Google: Gemini 3.1 Flash Lite Preview vs xAI: Grok Build 0.1

Last updated at: 2026-05-21

Metric | Gemini 3.1 Flash Lite Preview (none) | Grok Build 0.1 (medium)
Release | 2026-03-03 | 2026-05-21
Score | 7.7 | 7.8
Rank | #46 | #41
Reliability | 10.0 | 10.0
Consistency | 9.7 | 8.9
Tests Correct | |
Attempt pass rate | 66.7% | 71.9%
Flaky tests | 1 | 3
Total Runs | 57 | 57
Cost per result | 0.131 | 4.064
Total Cost | $0.016 | $0.488
Input Price | $0.250 / 1M | $1.000 / 1M
Output Price | $1.500 / 1M | $2.000 / 1M
Output Tokens | 5,370 | 1,947
Reasoning Tokens | 0 | 223,372
Response Time (avg) | 1.28s | 22.28s
Response Time (max) | 3.39s | 88.28s
Response Time (total) | 24.23s | 423.30s
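
As a rough cross-check of the cost rows, the sketch below estimates the output-side spend from the token counts and output prices listed above. It assumes reasoning tokens are billed at the output price, which this page does not state. Input token counts are not reported here, so the result can only be read as a lower bound on Total Cost. The function name `output_cost_usd` is illustrative and not part of any benchmark tooling.

```python
def output_cost_usd(output_tokens: int, reasoning_tokens: int,
                    price_per_million: float) -> float:
    """Estimate the output-side cost in USD.

    Assumption (not confirmed by the page): reasoning tokens are
    billed at the same per-token rate as ordinary output tokens.
    """
    return (output_tokens + reasoning_tokens) * price_per_million / 1_000_000


# Token counts and output prices copied from the comparison table.
gemini = output_cost_usd(5_370, 0, 1.50)
grok = output_cost_usd(1_947, 223_372, 2.00)

print(f"Gemini 3.1 Flash Lite Preview: ${gemini:.4f}")
print(f"Grok Build 0.1: ${grok:.4f}")
```

Under that assumption, both estimates fall below the reported Total Cost ($0.016 and $0.488), with the remainder presumably attributable to input tokens. Most of the Grok Build 0.1 estimate comes from reasoning tokens rather than visible output.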

Charts: Top Models by Score, Score vs Total Cost, Response Time (avg), Score vs Response Time (avg), Total Output Tokens, Score vs Total Output Tokens

Category Breakdown

Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
Anti-AI Tricks | Gemini 3.1 Flash Lite Preview | 7.5 | 8.4 | 66.7% | 1 | | 1.04s | 1,092 | 0
Anti-AI Tricks | Grok Build 0.1 | 10.0 | 10.0 | 100.0% | 0 | | 5.46s | 195 | 9,825
Coding | Gemini 3.1 Flash Lite Preview | 10.0 | 10.0 | 100.0% | 0 | | 1.47s | 640 | 0
Coding | Grok Build 0.1 | 7.3 | 3.7 | 66.7% | 1 | | 30.98s | 354 | 17,734
Combined | Gemini 3.1 Flash Lite Preview | 3.0 | 10.0 | 0.0% | 0 | | 3.20s | 339 | 0
Combined | Grok Build 0.1 | 10.0 | 10.0 | 100.0% | 0 | | 30.81s | 231 | 18,779
Data parsing and extraction | Gemini 3.1 Flash Lite Preview | 10.0 | 10.0 | 100.0% | 0 | | 1.22s | 399 | 0
Data parsing and extraction | Grok Build 0.1 | 10.0 | 10.0 | 100.0% | 0 | | 7.76s | 180 | 10,343
Domain specific | Gemini 3.1 Flash Lite Preview | 5.3 | 10.0 | 33.3% | 0 | | 942ms | 568 | 0
Domain specific | Grok Build 0.1 | 5.3 | 10.0 | 33.3% | 0 | | 77.75s | 501 | 111,807
General Intelligence | Gemini 3.1 Flash Lite Preview | 4.0 | 10.0 | 0.0% | 0 | | 741ms | 69 | 0
General Intelligence | Grok Build 0.1 | 3.8 | 2.5 | 33.3% | 1 | | 10.14s | 78 | 5,386
Instructions following | Gemini 3.1 Flash Lite Preview | 10.0 | 10.0 | 100.0% | 0 | | 1.13s | 574 | 0
Instructions following | Grok Build 0.1 | 9.8 | 10.0 | 100.0% | 0 | | 9.62s | 57 | 12,436
Puzzle Solving | Gemini 3.1 Flash Lite Preview | 10.0 | 10.0 | 100.0% | 0 | | 972ms | 898 | 0
Puzzle Solving | Grok Build 0.1 | 6.2 | 7.5 | 55.6% | 1 | | 8.67s | 161 | 15,476
Tool Calling | Gemini 3.1 Flash Lite Preview | 10.0 | 10.0 | 100.0% | 0 | | 3.39s | 782 | 0
Tool Calling | Grok Build 0.1 | 10.0 | 10.0 | 100.0% | 0 | | 9.40s | 180 | 5,319
Trivia | Gemini 3.1 Flash Lite Preview | 3.0 | 10.0 | 0.0% | 0 | | 814ms | 9 | 0
Trivia | Grok Build 0.1 | 3.0 | 10.0 | 0.0% | 0 | | 26.07s | 10 | 16,267
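
The per-category scores can be tallied to see where each model leads. The following is a minimal sketch with the scores copied from the breakdown above. The dictionary layout and variable names are illustrative only, and the tally treats every category equally, which is not necessarily how the overall Score is weighted.

```python
# Category -> (Gemini 3.1 Flash Lite Preview score, Grok Build 0.1 score),
# copied from the Category Breakdown.
scores = {
    "Anti-AI Tricks": (7.5, 10.0),
    "Coding": (10.0, 7.3),
    "Combined": (3.0, 10.0),
    "Data parsing and extraction": (10.0, 10.0),
    "Domain specific": (5.3, 5.3),
    "General Intelligence": (4.0, 3.8),
    "Instructions following": (10.0, 9.8),
    "Puzzle Solving": (10.0, 6.2),
    "Tool Calling": (10.0, 10.0),
    "Trivia": (3.0, 3.0),
}

# Split categories by which model has the higher score.
gemini_leads = [c for c, (g, x) in scores.items() if g > x]
grok_leads = [c for c, (g, x) in scores.items() if x > g]
ties = [c for c, (g, x) in scores.items() if g == x]

print("Gemini leads:", gemini_leads)
print("Grok leads:", grok_leads)
print("Ties:", ties)
```

On these figures, Gemini 3.1 Flash Lite Preview leads in four categories, Grok Build 0.1 leads in two, and four are tied.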