
AI BENCHY Compare

Google: Gemini 3 Pro Preview vs OpenAI: GPT-5.2 Chat


Last updated at: 2026-03-06

| Metric | Google: Gemini 3 Pro Preview (medium) | OpenAI: GPT-5.2 Chat (none) |
| --- | --- | --- |
| Release | 2025-11-18 | 2025-12-11 |
| Avg Score | 8.1 | 7.7 |
| Rank | #9 | #11 |
| Tests Correct | | |
| Consistency | 10.0 | 9.5 |
| Cost per result | 1.547 | 2.389 |
| Total Cost | $0.186 | $0.263 |
| Attempt pass rate | 80.0% | 77.8% |
| Flaky tests | 0 | 1 |
| Total runs | 45 (15 x 3) | 45 (15 x 3) |
| Output Tokens | 1,424 | 15,510 |
| Reasoning Tokens | 9,332 | 0 |
| Response Time (avg) | 6.87s | 7.29s |
| Response Time (max) | 11.96s | 38.52s |
| Response Time (total) | 55.00s | 109.31s |

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Avg Score vs Response Time (avg)

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Google: Gemini 3 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 3.75s | 143 | 1,107 |
| Anti-AI Tricks | OpenAI: GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 3.97s | 1,651 | 0 |
| Combined | Google: Gemini 3 Pro Preview | 10.0 | 10.0 | 0.0% | 0 | | 10.37s | 351 | 952 |
| Combined | OpenAI: GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 9.12s | 1,243 | 0 |
| Data parsing and extraction | Google: Gemini 3 Pro Preview | 9.9 | 10.0 | 100.0% | 0 | | 10.84s | 279 | 3,156 |
| Data parsing and extraction | OpenAI: GPT-5.2 Chat | 9.9 | 10.0 | 100.0% | 0 | | 3.05s | 980 | 0 |
| Domain specific | Google: Gemini 3 Pro Preview | 4.0 | 10.0 | 33.3% | 0 | | 7.01s | 15 | 1,195 |
| Domain specific | OpenAI: GPT-5.2 Chat | 4.0 | 10.0 | 33.3% | 0 | | 17.78s | 7,810 | 0 |
| Instructions following | Google: Gemini 3 Pro Preview | 9.5 | 10.0 | 100.0% | 0 | | 3.26s | 69 | 754 |
| Instructions following | OpenAI: GPT-5.2 Chat | 6.0 | 6.1 | 83.3% | 1 | | 5.46s | 1,528 | 0 |
| Puzzle Solving | Google: Gemini 3 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 3.91s | 243 | 1,197 |
| Puzzle Solving | OpenAI: GPT-5.2 Chat | 7.0 | 10.0 | 66.7% | 0 | | 4.42s | 1,743 | 0 |
| Tool Calling | Google: Gemini 3 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 11.96s | 324 | 971 |
| Tool Calling | OpenAI: GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 4.68s | 555 | 0 |
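
One way to sanity-check the figures is to confirm that the per-category token counts add up to the overall Output Tokens and Reasoning Tokens reported in the summary. The sketch below is only an illustration: the dictionary and function names (`CATEGORY_TOKENS`, `REPORTED_TOTALS`, `sum_tokens`, `check_totals`) are hypothetical and not part of AI Benchy, and the numbers are copied by hand from this page.

```python
# Per-category (output_tokens, reasoning_tokens) pairs, copied from the
# category breakdown in page order: Anti-AI Tricks, Combined, Data parsing
# and extraction, Domain specific, Instructions following, Puzzle Solving,
# Tool Calling. All names here are hypothetical helpers for illustration.
CATEGORY_TOKENS = {
    "Google: Gemini 3 Pro Preview": [
        (143, 1107), (351, 952), (279, 3156), (15, 1195),
        (69, 754), (243, 1197), (324, 971),
    ],
    "OpenAI: GPT-5.2 Chat": [
        (1651, 0), (1243, 0), (980, 0), (7810, 0),
        (1528, 0), (1743, 0), (555, 0),
    ],
}

# Overall (output_tokens, reasoning_tokens) as reported in the summary table.
REPORTED_TOTALS = {
    "Google: Gemini 3 Pro Preview": (1424, 9332),
    "OpenAI: GPT-5.2 Chat": (15510, 0),
}


def sum_tokens(rows):
    """Add up output and reasoning tokens across all category rows."""
    output = sum(out for out, _ in rows)
    reasoning = sum(reason for _, reason in rows)
    return output, reasoning


def check_totals():
    """Return, per model, whether the category sums equal the reported totals."""
    return {
        model: sum_tokens(rows) == REPORTED_TOTALS[model]
        for model, rows in CATEGORY_TOKENS.items()
    }


if __name__ == "__main__":
    for model, ok in check_totals().items():
        print(f"{model}: {'matches' if ok else 'MISMATCH'}")
```

The same pattern could be extended to other additive columns if their per-category values were available; averaged metrics such as Score or Response Time (avg) cannot be checked this way without knowing the number of tests in each category.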
