DeepSeek V4 Flash (high) vs GPT-5.4 Mini (medium)

Recommended model: DeepSeek V4 Flash (high)

It has the higher overall score (7.7 vs. 7.5) while costing about 18.8x less than GPT-5.4 Mini (medium).

Detailed comparison

Metric	DeepSeek V4 Flash (high), released 2026-04-24	GPT-5.4 Mini (medium), released 2026-03-17
Score	7.7	7.5
Rank	#52	#64
Reliability	10.0	10.0
Consistency	8.2	7.7
Tests Correct
Attempt pass rate	72.7%	71.2%
Flaky tests	5	6
Total Runs	66	66
Cost per result	0.402	6.299
Total Cost	$0.041	$0.756
Input Price	$0.094 / 1M	$0.750 / 1M
Output Price	$0.188 / 1M	$4.500 / 1M
Total Input Tokens	108,392	97,155
Output Tokens	14,478	6,211
Reasoning Tokens	153,687	145,544
Response Time (avg)	49.75s	25.94s
Response Time (max)	218.13s	138.75s
Response Time (total)	1094.41s	570.66s
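
The total cost figures can be roughly cross-checked from the token counts and per-token prices in the table. The sketch below assumes that reasoning tokens are billed at the output price and that no caching or other discounts apply; neither assumption is stated in the table, and the helper name `estimate_cost` is hypothetical.

```python
# Figures copied from the comparison table; prices are USD per 1M tokens.
MODELS = {
    "DeepSeek V4 Flash (high)": {
        "input_tokens": 108_392,
        "output_tokens": 14_478,
        "reasoning_tokens": 153_687,
        "input_price": 0.094,
        "output_price": 0.188,
    },
    "GPT-5.4 Mini (medium)": {
        "input_tokens": 97_155,
        "output_tokens": 6_211,
        "reasoning_tokens": 145_544,
        "input_price": 0.750,
        "output_price": 4.500,
    },
}


def estimate_cost(model: dict) -> float:
    """Estimate total cost in USD.

    Assumption (not confirmed by the source): reasoning tokens are
    billed at the same rate as output tokens.
    """
    billed_output = model["output_tokens"] + model["reasoning_tokens"]
    total = (
        model["input_tokens"] * model["input_price"]
        + billed_output * model["output_price"]
    )
    return total / 1_000_000


costs = {name: estimate_cost(m) for name, m in MODELS.items()}
ratio = costs["GPT-5.4 Mini (medium)"] / costs["DeepSeek V4 Flash (high)"]

for name, cost in costs.items():
    print(f"{name}: ${cost:.3f}")
print(f"Cost ratio: {ratio:.1f}x")
```

Under these assumptions the estimate matches the listed GPT-5.4 Mini total and comes out slightly above the listed DeepSeek V4 Flash total. The resulting ratio is therefore close to, but not exactly, the 18.8x quoted in the recommendation. The gap most likely comes from rounding in the displayed prices and totals.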

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

[Figure: SVG outputs of DeepSeek V4 Flash (high) and GPT-5.4 Mini (medium)]

Category:

Anti-AI Tricks	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	8.3	10.0	75.0%	0		28.51s	540	140	7,770
GPT-5.4 Mini	8.6	7.9	91.7%	1		4.05s	606	296	2,876

Coding	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	7.8	10.0	66.7%	0		50.60s	7,279	395	34,862
GPT-5.4 Mini	8.4	7.4	88.9%	1		57.87s	7,305	467	40,902

Combined	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	6.4	5.8	66.7%	1		104.10s	82,663	4,633	37,533
GPT-5.4 Mini	6.9	5.9	66.7%	1		59.64s	74,058	4,347	40,924

Data parsing and extraction	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	10.0	10.0	100.0%	0		28.03s	7,290	201	1,179
GPT-5.4 Mini	10.0	10.0	100.0%	0		2.43s	7,140	234	650

Domain specific	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	4.1	4.4	44.5%	2		100.31s	666	27	59,249
GPT-5.4 Mini	4.1	4.4	44.5%	2		65.31s	619	60	43,286

General Intelligence	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	6.1	3.1	66.7%	1		25.15s	471	79	632
GPT-5.4 Mini	4.5	10.0	0.0%	0		3.72s	477	150	510

Instructions following	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	10.0	10.0	100.0%	0		15.36s	627	63	1,622
GPT-5.4 Mini	9.8	10.0	100.0%	0		2.13s	660	96	1,185

Puzzle Solving	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	8.2	7.2	88.9%	1		26.11s	594	196	1,767
GPT-5.4 Mini	7.8	10.0	66.7%	0		4.37s	642	278	2,443

Tool Calling	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	10.0	10.0	100.0%	0		74.73s	8,079	228	542
GPT-5.4 Mini	4.7	1.6	66.7%	1		9.62s	5,453	251	2,594

Trivia	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
DeepSeek V4 Flash	3.0	10.0	0.0%	0		54.46s	183	8,516	8,531
GPT-5.4 Mini	3.0	10.0	0.0%	0		30.10s	195	32	10,174
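
The per-category scores above can be tallied to see where each model leads. This is a minimal sketch that compares only the Score column; the `tally` helper is hypothetical and not part of the benchmark.

```python
# (category, DeepSeek V4 Flash score, GPT-5.4 Mini score),
# copied from the category tables.
SCORES = [
    ("Anti-AI Tricks", 8.3, 8.6),
    ("Coding", 7.8, 8.4),
    ("Combined", 6.4, 6.9),
    ("Data parsing and extraction", 10.0, 10.0),
    ("Domain specific", 4.1, 4.1),
    ("General Intelligence", 6.1, 4.5),
    ("Instructions following", 10.0, 9.8),
    ("Puzzle Solving", 8.2, 7.8),
    ("Tool Calling", 10.0, 4.7),
    ("Trivia", 3.0, 3.0),
]


def tally(scores):
    """Group categories by which model has the higher score."""
    result = {"DeepSeek V4 Flash": [], "GPT-5.4 Mini": [], "tie": []}
    for category, deepseek, gpt in scores:
        if deepseek > gpt:
            result["DeepSeek V4 Flash"].append(category)
        elif gpt > deepseek:
            result["GPT-5.4 Mini"].append(category)
        else:
            result["tie"].append(category)
    return result


wins = tally(SCORES)
for leader, categories in wins.items():
    print(f"{leader} ({len(categories)}): {', '.join(categories)}")
```

On these figures DeepSeek V4 Flash leads in four categories, GPT-5.4 Mini in three, and three are tied. The largest single gap is Tool Calling (10.0 vs. 4.7).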
