AI BENCHY Compare

Google: Gemini 3.1 Flash Lite vs OpenAI: GPT-5.2 Chat

Last updated at: 2026-05-08

Metric                  Gemini 3.1 Flash Lite (medium)  GPT-5.2 Chat (none)
Release                 2026-05-08                      2025-12-11
Score                   7.9                             7.6
Rank                    #27                             #41
Reliability             10.0                            10.0
Consistency             9.1                             8.8
Tests Correct           -                               -
Attempt pass rate       71.9%                           71.9%
Flaky tests             2                               3
Total Runs              57                              57
Cost per result         0.452                           2.572
Total Cost              $0.059                          $0.309
Input Price             $0.250 / 1M tokens              $1.750 / 1M tokens
Output Price            $1.500 / 1M tokens              $14.000 / 1M tokens
Output Tokens           2,224                           18,585
Reasoning Tokens        32,034                          0
Response Time (avg)     3.14s                           6.85s
Response Time (max)     10.87s                          38.52s
Response Time (total)   59.62s                          130.06s
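The token counts and prices listed for the two models allow a rough sanity check of the Total Cost figures. The sketch below estimates only the output-side share of each bill. It rests on two assumptions that the page does not state: that reasoning tokens are billed at the output price, and that the Output Tokens figure does not already include them. Input token counts are not shown on the page, so the input-side cost is left out; the names `MODELS` and `output_side_cost` are illustrative.

```python
# Figures copied from the comparison table; prices are USD per 1M tokens.
MODELS = {
    "Gemini 3.1 Flash Lite": {
        "output_price_per_m": 1.500,
        "output_tokens": 2_224,
        "reasoning_tokens": 32_034,
        "total_cost": 0.059,
    },
    "GPT-5.2 Chat": {
        "output_price_per_m": 14.000,
        "output_tokens": 18_585,
        "reasoning_tokens": 0,
        "total_cost": 0.309,
    },
}


def output_side_cost(model: dict) -> float:
    """Estimate the cost of generated tokens in USD.

    Assumption (not confirmed by the page): reasoning tokens are billed
    at the output price and are counted separately from output tokens.
    """
    billed_tokens = model["output_tokens"] + model["reasoning_tokens"]
    return billed_tokens * model["output_price_per_m"] / 1_000_000


for name, model in MODELS.items():
    estimate = output_side_cost(model)
    print(f"{name}: ~${estimate:.3f} output-side of ${model['total_cost']:.3f} total")
```

Under these assumptions the estimates stay below the reported totals for both models, which is consistent with the remainder being input-token cost.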

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Category / Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens

Anti-AI Tricks
  Gemini 3.1 Flash Lite   9.1    10.0         75.0%              0            -              2.39s                604            4,201
  GPT-5.2 Chat            8.7    7.9          91.7%              1            -              3.40s                1,807          0

Coding
  Gemini 3.1 Flash Lite   10.0   10.0         100.0%             0            -              3.26s                429            2,712
  GPT-5.2 Chat            10.0   10.0         100.0%             0            -              8.97s                1,345          0

Combined
  Gemini 3.1 Flash Lite   10.0   10.0         100.0%             0            -              10.87s               327            7,401
  GPT-5.2 Chat            10.0   10.0         100.0%             0            -              9.12s                1,243          0

Data parsing and extraction
  Gemini 3.1 Flash Lite   10.0   10.0         100.0%             0            -              2.60s                279            2,845
  GPT-5.2 Chat            10.0   10.0         100.0%             0            -              3.05s                980            0

Domain specific
  Gemini 3.1 Flash Lite   2.9    7.2          11.1%              1            -              3.16s                15             5,165
  GPT-5.2 Chat            5.3    10.0         33.3%              0            -              17.78s               7,810          0

General Intelligence
  Gemini 3.1 Flash Lite   10.0   10.0         100.0%             0            -              2.60s                84             1,142
  GPT-5.2 Chat            4.4    3.0          33.3%              1            -              3.20s                335            0

Instructions following
  Gemini 3.1 Flash Lite   9.9    10.0         100.0%             0            -              2.59s                75             3,320
  GPT-5.2 Chat            7.3    5.9          83.3%              1            -              5.46s                1,528          0

Puzzle Solving
  Gemini 3.1 Flash Lite   7.6    7.2          77.8%              1            -              1.95s                165            2,450
  GPT-5.2 Chat            7.7    10.0         66.7%              0            -              4.42s                1,743          0

Tool Calling
  Gemini 3.1 Flash Lite   10.0   10.0         100.0%             0            -              4.55s                234            921
  GPT-5.2 Chat            10.0   10.0         100.0%             0            -              4.68s                555            0

Trivia
  Gemini 3.1 Flash Lite   3.0    10.0         0.0%               0            -              3.08s                12             1,877
  GPT-5.2 Chat            3.0    10.0         0.0%               0            -              6.89s                1,239          0