AI BENCHY Compare

Google: Gemini 3.1 Flash Lite vs xAI: Grok 4.3

Last updated at: 2026-05-08

| Metric | Gemini 3.1 Flash Lite (medium) | Grok 4.3 (medium) |
|---|---|---|
| Release | 2026-05-08 | 2026-05-01 |
| Score | 7.9 | 8.0 |
| Rank | #27 | #24 |
| Reliability | 10.0 | 10.0 |
| Consistency | 9.1 | 8.7 |
| Tests Correct | | |
| Attempt pass rate | 71.9% | 77.2% |
| Flaky tests | 2 | 3 |
| Total Runs | 57 | 57 |
| Cost per result | 0.452 | 4.229 |
| Total Cost | $0.059 | $0.550 |
| Input Price | $0.250 / 1M | $1.250 / 1M |
| Output Price | $1.500 / 1M | $2.500 / 1M |
| Output Tokens | 2,224 | 1,237 |
| Reasoning Tokens | 32,034 | 200,033 |
| Response Time (avg) | 3.14s | 48.41s |
| Response Time (max) | 10.87s | 216.69s |
| Response Time (total) | 59.62s | 919.73s |
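
The headline figures above are easier to compare as ratios. The following is a minimal Python sketch, not part of the original page: the dictionary keys and helper names (`ratio`, `implied_passing_attempts`, `implied_timed_responses`) are hypothetical, and the "implied" values are inferences from the rounded numbers shown, not figures the page reports.

```python
# Figures copied from the summary table above. The derived quantities are
# illustrative inferences and are not reported by the original page.
summary = {
    "Gemini 3.1 Flash Lite": {
        "total_cost_usd": 0.059,
        "avg_response_s": 3.14,
        "total_response_s": 59.62,
        "attempt_pass_rate": 0.719,
        "total_runs": 57,
    },
    "Grok 4.3": {
        "total_cost_usd": 0.550,
        "avg_response_s": 48.41,
        "total_response_s": 919.73,
        "attempt_pass_rate": 0.772,
        "total_runs": 57,
    },
}


def ratio(metric, a="Grok 4.3", b="Gemini 3.1 Flash Lite"):
    """Return how many times larger `metric` is for model a than for model b."""
    return summary[a][metric] / summary[b][metric]


def implied_passing_attempts(model):
    """Passing attempts implied by the rounded pass rate (an inference)."""
    row = summary[model]
    return round(row["attempt_pass_rate"] * row["total_runs"])


def implied_timed_responses(model):
    """Number of responses implied by total / average response time."""
    row = summary[model]
    return round(row["total_response_s"] / row["avg_response_s"])


if __name__ == "__main__":
    print(f"Total cost ratio (Grok / Gemini): {ratio('total_cost_usd'):.1f}x")
    print(f"Avg response time ratio (Grok / Gemini): {ratio('avg_response_s'):.1f}x")
    for name in summary:
        print(name, implied_passing_attempts(name), implied_timed_responses(name))
```

On these numbers, Grok 4.3 costs roughly 9.3 times as much in total and takes roughly 15.4 times as long per response on average, for a 0.1 point higher score. The pass rates are consistent with 41 and 44 passing attempts out of 57 runs, and total divided by average response time comes to about 19 for both models, which suggests the average is taken over 19 items, although the page does not say so.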

Charts (not reproduced): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Anti-AI Tricks | Gemini 3.1 Flash Lite | 9.1 | 10.0 | 75.0% | 0 | | 2.39s | 604 | 4,201 |
| Anti-AI Tricks | Grok 4.3 | 10.0 | 10.0 | 100.0% | 0 | | 8.83s | 88 | 8,207 |
| Coding | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 3.26s | 429 | 2,712 |
| Coding | Grok 4.3 | 10.0 | 10.0 | 100.0% | 0 | | 45.72s | 284 | 9,659 |
| Combined | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 10.87s | 327 | 7,401 |
| Combined | Grok 4.3 | 10.0 | 10.0 | 100.0% | 0 | | 63.99s | 234 | 15,301 |
| Data parsing and extraction | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 2.60s | 279 | 2,845 |
| Data parsing and extraction | Grok 4.3 | 10.0 | 10.0 | 100.0% | 0 | | 18.97s | 180 | 9,546 |
| Domain specific | Gemini 3.1 Flash Lite | 2.9 | 7.2 | 11.1% | 1 | | 3.16s | 15 | 5,165 |
| Domain specific | Grok 4.3 | 5.3 | 7.2 | 44.4% | 1 | | 181.74s | 14 | 111,300 |
| General Intelligence | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 2.60s | 84 | 1,142 |
| General Intelligence | Grok 4.3 | 5.4 | 2.5 | 66.7% | 1 | | 24.70s | 70 | 5,020 |
| Instructions following | Gemini 3.1 Flash Lite | 9.9 | 10.0 | 100.0% | 0 | | 2.59s | 75 | 3,320 |
| Instructions following | Grok 4.3 | 9.8 | 10.0 | 100.0% | 0 | | 18.58s | 57 | 8,713 |
| Puzzle Solving | Gemini 3.1 Flash Lite | 7.6 | 7.2 | 77.8% | 1 | | 1.95s | 165 | 2,450 |
| Puzzle Solving | Grok 4.3 | 5.9 | 7.2 | 55.6% | 1 | | 22.53s | 128 | 14,686 |
| Tool Calling | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 4.55s | 234 | 921 |
| Tool Calling | Grok 4.3 | 10.0 | 10.0 | 100.0% | 0 | | 17.66s | 168 | 4,615 |
| Trivia | Gemini 3.1 Flash Lite | 3.0 | 10.0 | 0.0% | 0 | | 3.08s | 12 | 1,877 |
| Trivia | Grok 4.3 | 3.0 | 10.0 | 0.0% | 0 | | 44.47s | 14 | 12,986 |
