AI BENCHY

AI BENCHY Compare

Anthropic: Claude Sonnet 4.6 vs OpenAI: GPT-5.2

Last updated at: 2026-04-14

| Metric | Claude Sonnet 4.6 (none; Release: 2026-02-17) | GPT-5.2 (medium; Release: 2025-12-11) |
| --- | --- | --- |
| Score | 7.4 | 7.5 |
| Rank | #39 | #37 |
| Consistency | 9.6 | 8.1 |
| Tests Correct | | |
| Attempt pass rate | 64.8% | 72.2% |
| Flaky tests | 1 | 4 |
| Total Runs | 54 | 54 |
| Cost per result | 2.376 | 3.193 |
| Total Cost | $0.262 | $0.352 |
| Input Price | $3.000 / 1M | $1.750 / 1M |
| Output Price | $15.000 / 1M | $14.000 / 1M |
| Output Tokens | 7,433 | 2,705 |
| Reasoning Tokens | 0 | 18,977 |
| Response Time (avg) | 4.98s | 14.04s |
| Response Time (max) | 23.84s | 77.80s |
| Response Time (total) | 54.83s | 154.41s |
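
The summary figures above are easier to compare as ratios. The sketch below is a minimal illustration and not part of AI BENCHY: the `SUMMARY` dictionary, its key names, and the `ratio` helper are hypothetical, and the values are copied by hand from the figures above.

```python
# Values copied from the summary figures above. The key names are
# illustrative only and do not reflect any AI BENCHY data format.
SUMMARY = {
    "Claude Sonnet 4.6": {
        "score": 7.4,
        "total_cost_usd": 0.262,
        "avg_response_s": 4.98,
    },
    "GPT-5.2": {
        "score": 7.5,
        "total_cost_usd": 0.352,
        "avg_response_s": 14.04,
    },
}


def ratio(metric: str,
          numerator: str = "GPT-5.2",
          denominator: str = "Claude Sonnet 4.6") -> float:
    """Return how many times larger `metric` is for `numerator` than for `denominator`."""
    return SUMMARY[numerator][metric] / SUMMARY[denominator][metric]


if __name__ == "__main__":
    for metric in ("score", "total_cost_usd", "avg_response_s"):
        print(f"{metric}: GPT-5.2 / Claude Sonnet 4.6 = {ratio(metric):.2f}")
```

On these figures, GPT-5.2 costs about 1.34 times as much in total and takes about 2.82 times as long per response on average, for a score roughly 1% higher.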

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Claude Sonnet 4.6 | 4.8 | 10.0 | 25.0% | 0 | | 2.94s | 1,214 | 0 |
| Anti-AI Tricks | GPT-5.2 | 6.5 | 8.0 | 58.3% | 1 | | 7.81s | 567 | 2,002 |
| Coding | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 3.67s | 523 | 0 |
| Coding | GPT-5.2 | 10.0 | 10.0 | 100.0% | 0 | | 15.12s | 467 | 2,166 |
| Combined | Claude Sonnet 4.6 | 9.5 | 10.0 | 100.0% | 0 | | 23.84s | 3,766 | 0 |
| Combined | GPT-5.2 | 10.0 | 10.0 | 100.0% | 0 | | 14.06s | 291 | 1,757 |
| Data parsing and extraction | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 3.43s | 252 | 0 |
| Data parsing and extraction | GPT-5.2 | 10.0 | 10.0 | 100.0% | 0 | | 3.15s | 234 | 420 |
| Domain specific | Claude Sonnet 4.6 | 7.7 | 10.0 | 66.7% | 0 | | 3.54s | 413 | 0 |
| Domain specific | GPT-5.2 | 5.9 | 7.2 | 55.6% | 1 | | 77.80s | 42 | 10,342 |
| General Intelligence | Claude Sonnet 4.6 | 6.1 | 3.1 | 66.7% | 1 | | 2.56s | 192 | 0 |
| General Intelligence | GPT-5.2 | 3.7 | 9.7 | 0.0% | 0 | | 4.32s | 162 | 269 |
| Instructions following | Claude Sonnet 4.6 | 6.5 | 10.0 | 50.0% | 0 | | 1.96s | 90 | 0 |
| Instructions following | GPT-5.2 | 9.9 | 10.0 | 100.0% | 0 | | 3.12s | 94 | 614 |
| Puzzle Solving | Claude Sonnet 4.6 | 7.7 | 10.0 | 66.7% | 0 | | 2.92s | 536 | 0 |
| Puzzle Solving | GPT-5.2 | 7.7 | 7.3 | 77.8% | 1 | | 5.47s | 609 | 938 |
| Tool Calling | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 4.11s | 447 | 0 |
| Tool Calling | GPT-5.2 | 4.7 | 1.6 | 66.7% | 1 | | 10.30s | 239 | 469 |
