AI BENCHY

Model Name

Google: Gemini 3 Flash Preview

Reasoning (low)

Last updated: Feb 24, 2026

Metric Google: Gemini 3 Flash Preview
Rank: #5
Company: Google
Score: 8.23
Consistency: 8.71
Cost per result: 0.6173
Total Cost: $0.06174
Tests Correct:
Attempt pass rate: 82.0%
Flaky tests: 2
Output Tokens: 936
Reasoning Tokens: 18,071
Response Time (avg): 6746ms
Response Time (total): 87697ms
Response Time (max): 14717ms

Category Breakdown

Category                     Fully passed tests  Score  Consistency  Attempt pass rate  Flaky tests  Reasoning score  Response Time (avg)  Cost
Anti-AI Tricks                                   10.00  10.00        100.0%             0            6.23             3496ms               $0.00844
Data parsing and extraction                      10.00  10.00        100.0%             0            4.73             9460ms               $0.01354
Domain specific                                  4.00   4.41         55.5%              2            1.83             8314ms               $0.01993
Instructions following                           7.50   9.99         50.0%              0            5.00             7016ms               $0.00878
Puzzle Solving                                   10.00  10.00        100.0%             0            7.50             6440ms               $0.01105
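
As a quick sanity check on the breakdown above, the per-category costs can be summed and compared with the reported Total Cost. This is only a minimal sketch: the dictionary name `category_costs` is made up for illustration, and it assumes the last value in each category row is that category's cost in dollars.

```python
from decimal import Decimal

# Per-category costs copied from the Category Breakdown rows
# (assumption: the final value in each row is the category cost in USD).
category_costs = {
    "Anti-AI Tricks": Decimal("0.00844"),
    "Data parsing and extraction": Decimal("0.01354"),
    "Domain specific": Decimal("0.01993"),
    "Instructions following": Decimal("0.00878"),
    "Puzzle Solving": Decimal("0.01105"),
}

# Decimal avoids binary floating-point rounding noise when adding small amounts.
total = sum(category_costs.values(), Decimal("0"))
print(total)  # 0.06174
```

The sum agrees with the reported Total Cost of $0.06174.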

Compared models

Compare Google: Gemini 3 Flash Preview against...

#4 · Qwen

Qwen: Qwen3.5 Plus 2026-02-15

Reasoning (medium)

Score: 8.54

Consistency: 9.35

Attempt pass rate: 87.2%

Flaky tests: 1

Cost per result: 2.1621

Tests Correct:

Total Cost: $0.23784

#6 · OpenAI

OpenAI: GPT-5.3-Codex

Reasoning (medium)

Score: 7.77

Consistency: 8.75

Attempt pass rate: 76.9%

Flaky tests: 2

Cost per result: 4.9342

Tests Correct:

Total Cost: $0.44408

#3 · Google

Google: Gemini 3 Pro Preview

Reasoning (medium)

Score: 8.54

Consistency: 10.00

Attempt pass rate: 84.6%

Flaky tests: 0

Cost per result: 0.7901

Tests Correct:

Total Cost: $0.08692
