AI BENCHY Compare

DeepSeek: DeepSeek V3.2 vs OpenAI: GPT-5.2

Last updated at: 2026-03-06

| Metric | DeepSeek: DeepSeek V3.2 (none) | OpenAI: GPT-5.2 (medium) |
|---|---|---|
| Release | 2025-12-01 | 2025-12-11 |
| Rank | #33 | #27 |
| Avg Score | 5.5 | 6.5 |
| Consistency | 8.4 | 7.9 |
| Cost per result | 0.220 | 3.125 |
| Total Cost | $0.016 | $0.313 |
| Tests Correct | | |
| Attempt pass rate | 54.2% | 75.0% |
| Flaky tests | 3 | 4 |
| Total Runs | 48 | 48 |
| Output Tokens | 7,823 | 2,220 |
| Reasoning Tokens | 0 | 16,811 |
| Response Time (avg) | 12.86s | 15.33s |
| Response Time (max) | 115.89s | 77.80s |
| Response Time (total) | 205.78s | 138.01s |
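
The headline figures in the summary can be put side by side with a little arithmetic. The following is a minimal sketch, not part of the benchmark itself: the dictionary layout and the helper names `gap` and `ratio` are illustrative assumptions, and only the numbers are taken from the summary.

```python
# Figures copied from the summary; the dictionary layout is illustrative.
summary = {
    "DeepSeek V3.2": {"avg_score": 5.5, "total_cost_usd": 0.016, "pass_rate_pct": 54.2},
    "GPT-5.2": {"avg_score": 6.5, "total_cost_usd": 0.313, "pass_rate_pct": 75.0},
}


def gap(metric, a="GPT-5.2", b="DeepSeek V3.2"):
    """Absolute difference (a minus b) for one metric."""
    return summary[a][metric] - summary[b][metric]


def ratio(metric, a="GPT-5.2", b="DeepSeek V3.2"):
    """How many times larger a's value is than b's."""
    return summary[a][metric] / summary[b][metric]


score_gap = gap("avg_score")
pass_rate_gap = gap("pass_rate_pct")
cost_ratio = ratio("total_cost_usd")

print(f"Avg score gap: {score_gap:+.1f} points")
print(f"Attempt pass rate gap: {pass_rate_gap:+.1f} percentage points")
print(f"Total cost ratio: {cost_ratio:.1f}x")
```

On these numbers, GPT-5.2 averages about one point higher and passes roughly 21 percentage points more attempts, at close to 20 times the total cost of the run.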

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Avg Score vs Response Time (avg)]

Category Breakdown

| Anti-AI Tricks | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 10.0 | 9.7 | 0.0% | 0 | | 8.79s | 1,411 | 0 |
| OpenAI: GPT-5.2 | 7.0 | 7.3 | 77.8% | 1 | | 14.34s | 549 | 2,002 |

| Combined | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 8.0 | 10.0 | 0.0% | 0 | | 115.89s | 2,887 | 0 |
| OpenAI: GPT-5.2 | 10.0 | 10.0 | 100.0% | 0 | | 14.06s | 291 | 1,757 |

| Data parsing and extraction | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 5.4 | 5.8 | 66.7% | 1 | | 9.42s | 1,710 | 0 |
| OpenAI: GPT-5.2 | 9.9 | 10.0 | 100.0% | 0 | | 3.15s | 234 | 420 |

| Domain specific | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 10.0 | 7.2 | 22.2% | 1 | | 1.61s | 24 | 0 |
| OpenAI: GPT-5.2 | 4.0 | 7.2 | 55.6% | 1 | | 77.80s | 42 | 10,342 |

| General Intelligence | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 10.0 | 10.0 | 100.0% | 0 | | 2.86s | 67 | 0 |
| OpenAI: GPT-5.2 | 10.0 | 9.7 | 0.0% | 0 | | 4.32s | 162 | 269 |

| Instructions following | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 10.0 | 10.0 | 100.0% | 0 | | 1.52s | 66 | 0 |
| OpenAI: GPT-5.2 | 9.5 | 10.0 | 100.0% | 0 | | 3.12s | 94 | 614 |

| Puzzle Solving | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 7.7 | 7.5 | 88.9% | 1 | | 7.37s | 1,136 | 0 |
| OpenAI: GPT-5.2 | 7.0 | 7.3 | 77.8% | 1 | | 5.47s | 609 | 938 |

| Tool Calling | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 | 10.0 | 10.0 | 100.0% | 0 | | 11.85s | 522 | 0 |
| OpenAI: GPT-5.2 | 10.0 | 1.6 | 66.7% | 1 | | 10.30s | 239 | 469 |
