DeepSeek V4 Flash (high) vs GPT-5.2 (medium)

Recommended model: DeepSeek V4 Flash (high)

It scores close to the best model in this comparison (7.7 vs 8.4) and costs about 23.7x less than GPT-5.2 (medium).
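As a rough check on the cost claim, the sketch below recomputes the cost ratio from the rounded figures shown in the comparison table. The dictionary names and the `cost_ratio` helper are illustrative only. Because the displayed costs are rounded, the result will not exactly reproduce the quoted 23.7x, and which figure the page itself uses for that number is not stated.

```python
# Figures copied from the comparison table, rounded as displayed there.
CHEAP = "DeepSeek V4 Flash (high)"
EXPENSIVE = "GPT-5.2 (medium)"

total_cost = {CHEAP: 0.041, EXPENSIVE: 0.951}        # "Total Cost" row, USD
cost_per_result = {CHEAP: 0.402, EXPENSIVE: 6.791}   # "Cost per result" row


def cost_ratio(costs: dict[str, float], cheap: str, expensive: str) -> float:
    """Return how many times more the expensive model costs than the cheap one."""
    return costs[expensive] / costs[cheap]


total_ratio = cost_ratio(total_cost, CHEAP, EXPENSIVE)
per_result_ratio = cost_ratio(cost_per_result, CHEAP, EXPENSIVE)

print(f"Total cost ratio:      {total_ratio:.1f}x")       # about 23.2x
print(f"Cost per result ratio: {per_result_ratio:.1f}x")  # about 16.9x
```

The two ratios differ because they are built on different rows of the table, so it is worth saying which one is meant when quoting a single "x times cheaper" figure.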

Detailed comparison

Metric	DeepSeek V4 Flash (high), released 2026-04-24	GPT-5.2 (medium), released 2025-12-11
Score	7.7	8.4
Rank	#52	#27
Reliability	10.0	10.0
Consistency	8.2	8.5
Tests Correct
Attempt pass rate	72.7%	72.7%
Flaky tests	5	4
Total Runs	66	66
Cost per result	0.402	6.791
Total Cost	$0.041	$0.951
Input Price	$0.094 / 1M	$1.750 / 1M
Output Price	$0.188 / 1M	$14.000 / 1M
Total Input Tokens	108,392	105,004
Output Tokens	14,478	9,914
Reasoning Tokens	153,687	44,868
Response Time (avg)	49.75s	22.62s
Response Time (max)	218.13s	102.93s
Response Time (total)	1094.41s	339.28s
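The Total Cost row can be approximately reconstructed from the token counts and per-million prices in the table. The sketch below assumes that reasoning tokens are billed at the output price, which the page does not state; `estimate_cost` is a hypothetical helper, not part of any benchmark tooling.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    reasoning_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate total cost in USD from token counts and per-1M-token prices.

    Assumption: reasoning tokens are charged at the output-token price.
    """
    billed_output = output_tokens + reasoning_tokens
    return (
        input_tokens * input_price_per_m + billed_output * output_price_per_m
    ) / 1_000_000


# Token counts and prices copied from the table above.
deepseek_cost = estimate_cost(108_392, 14_478, 153_687, 0.094, 0.188)
gpt_cost = estimate_cost(105_004, 9_914, 44_868, 1.750, 14.000)

print(f"DeepSeek V4 Flash (high): ${deepseek_cost:.4f}")  # table reports $0.041
print(f"GPT-5.2 (medium):         ${gpt_cost:.4f}")       # table reports $0.951
```

Under this assumption the estimate for GPT-5.2 lands on the reported $0.951, and the DeepSeek estimate comes out slightly above the reported $0.041. The small gap may come from rounding of the displayed prices or from billing details not shown on the page.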

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

[Images: SVG output from DeepSeek V4 Flash (high) and from GPT-5.2 (medium)]

Category:

Anti-AI Tricks	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	8.3	10.0	75.0%	0		28.51s	540	140	7,770
GPT-5.2	6.5	8.0	58.3%	1		7.81s	606	567	2,002

Coding	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	7.8	10.0	66.7%	0		50.60s	7,279	395	34,862
GPT-5.2	10.0	10.0	100.0%	0		22.73s	7,302	511	11,912

Combined	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	6.4	5.8	66.7%	1		104.10s	82,663	4,633	37,533
GPT-5.2	10.0	10.0	100.0%	0		58.50s	82,056	7,304	14,693

Data parsing and extraction	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	10.0	10.0	100.0%	0		28.03s	7,290	201	1,179
GPT-5.2	10.0	10.0	100.0%	0		3.15s	7,140	234	420

Domain specific	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	4.1	4.4	44.5%	2		100.31s	666	27	59,249
GPT-5.2	5.9	7.2	55.6%	1		77.80s	473	42	10,342

General Intelligence	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	6.1	3.1	66.7%	1		25.15s	471	79	632
GPT-5.2	3.7	9.7	0.0%	0		4.32s	477	162	269

Instructions following	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	10.0	10.0	100.0%	0		15.36s	627	63	1,622
GPT-5.2	9.9	10.0	100.0%	0		3.12s	660	94	614

Puzzle Solving	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	8.2	7.2	88.9%	1		26.11s	594	196	1,767
GPT-5.2	7.5	7.3	77.8%	1		5.80s	642	735	924

Tool Calling	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	10.0	10.0	100.0%	0		74.73s	8,079	228	542
GPT-5.2	4.7	1.6	66.7%	1		10.30s	5,453	239	469

Trivia	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	3.0	10.0	0.0%	0		54.46s	183	8,516	8,531
GPT-5.2	3.0	10.0	0.0%	0		28.18s	195	26	3,223
