AI BENCHY

AI BENCHY Compare

OpenAI: GPT-5.4 Nano vs Qwen: Qwen3.5-35B-A3B

Last updated at: 2026-03-17

| Metric | GPT-5.4 Nano (medium), release 2026-03-17 | Qwen3.5-35B-A3B (none), release 2026-02-24 |
|---|---|---|
| Rank | #28 | #49 |
| Score | 7.4 | 5.9 |
| Consistency | 9.0 | 8.6 |
| Cost per result | 0.769 | 0.237 |
| Total Cost | $0.077 | $0.015 |
| Tests Correct | | |
| Attempt pass rate | 66.7% | 47.1% |
| Flaky tests | 2 | 3 |
| Total Runs | 51 | 51 |
| Output Tokens | 2,474 | 3,761 |
| Reasoning Tokens | 54,516 | 0 |
| Response Time (avg) | 11.08s | 3.89s |
| Response Time (max) | 94.06s | 47.43s |
| Response Time (total) | 188.39s | 66.07s |
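
The page does not say how the averages and pass rates are derived, so the following is only a rough consistency check on the published figures, not the site's actual method. It assumes that Response Time (avg) is the total divided by some number of timed items, and that Attempt pass rate is the share of passing attempts out of Total Runs; the helper name `implied_counts` is made up for this sketch.

```python
# Figures copied from the summary table above.
RUNS = 51  # "Total Runs", identical for both models

summary = {
    "GPT-5.4 Nano": {"avg_s": 11.08, "total_s": 188.39, "pass_rate": 0.667},
    "Qwen3.5-35B-A3B": {"avg_s": 3.89, "total_s": 66.07, "pass_rate": 0.471},
}


def implied_counts(row, runs=RUNS):
    """Return (timed items, passing attempts) implied by the rounded figures.

    Assumption: avg = total / items and pass_rate = passed / runs.
    """
    items = round(row["total_s"] / row["avg_s"])
    passed = round(row["pass_rate"] * runs)
    return items, passed


for name, row in summary.items():
    items, passed = implied_counts(row)
    print(f"{name}: ~{items} timed items, ~{passed}/{RUNS} passing attempts")
```

Under these assumptions both models come out at about 17 timed items, which would fit 17 tests run three times each (51 runs), though the page does not confirm that reading.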

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Anti-AI Tricks | GPT-5.4 Nano | 8.3 | 10.0 | 75.0% | 0 | | 4.52s | 683 | 2,254 |
| Anti-AI Tricks | Qwen3.5-35B-A3B | 3.4 | 7.9 | 16.7% | 1 | | 1.43s | 574 | 0 |
| Combined | GPT-5.4 Nano | 9.8 | 10.0 | 100.0% | 0 | | 24.13s | 349 | 5,719 |
| Combined | Qwen3.5-35B-A3B | 3.0 | 10.0 | 0.0% | 0 | | 47.43s | 1,833 | 0 |
| Data parsing and extraction | GPT-5.4 Nano | 10.0 | 10.0 | 100.0% | 0 | | 2.54s | 234 | 516 |
| Data parsing and extraction | Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 1.16s | 243 | 0 |
| Domain specific | GPT-5.4 Nano | 5.9 | 7.2 | 55.6% | 1 | | 38.18s | 60 | 43,325 |
| Domain specific | Qwen3.5-35B-A3B | 7.7 | 10.0 | 66.7% | 0 | | 485ms | 15 | 0 |
| General Intelligence | GPT-5.4 Nano | 4.5 | 10.0 | 0.0% | 0 | | 4.15s | 179 | 443 |
| General Intelligence | Qwen3.5-35B-A3B | 6.5 | 3.4 | 66.7% | 1 | | 1.19s | 114 | 0 |
| Instructions following | GPT-5.4 Nano | 9.8 | 10.0 | 100.0% | 0 | | 1.88s | 95 | 521 |
| Instructions following | Qwen3.5-35B-A3B | 6.3 | 10.0 | 50.0% | 0 | | 809ms | 63 | 0 |
| Puzzle Solving | GPT-5.4 Nano | 4.0 | 7.1 | 22.2% | 1 | | 3.65s | 640 | 1,356 |
| Puzzle Solving | Qwen3.5-35B-A3B | 3.9 | 7.4 | 22.2% | 1 | | 1.34s | 655 | 0 |
| Tool Calling | GPT-5.4 Nano | 10.0 | 10.0 | 100.0% | 0 | | 7.71s | 234 | 382 |
| Tool Calling | Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 2.30s | 264 | 0 |
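
As a cross-check between the category breakdown and the summary table, the per-category token counts can be summed and compared with the reported totals. This is a minimal sketch using only the numbers shown on this page; the dictionary layout and the function name `column_total` are illustrative and not part of the site.

```python
# Per-category token counts copied from the category breakdown.
# GPT-5.4 Nano: (output tokens, reasoning tokens)
gpt_tokens = {
    "Anti-AI Tricks": (683, 2254),
    "Combined": (349, 5719),
    "Data parsing and extraction": (234, 516),
    "Domain specific": (60, 43325),
    "General Intelligence": (179, 443),
    "Instructions following": (95, 521),
    "Puzzle Solving": (640, 1356),
    "Tool Calling": (234, 382),
}

# Qwen3.5-35B-A3B: output tokens only (reasoning tokens are 0 in every category).
qwen_output = {
    "Anti-AI Tricks": 574,
    "Combined": 1833,
    "Data parsing and extraction": 243,
    "Domain specific": 15,
    "General Intelligence": 114,
    "Instructions following": 63,
    "Puzzle Solving": 655,
    "Tool Calling": 264,
}


def column_total(rows, index):
    """Sum one position of the (output, reasoning) tuples across categories."""
    return sum(values[index] for values in rows.values())


gpt_output_total = column_total(gpt_tokens, 0)
gpt_reasoning_total = column_total(gpt_tokens, 1)
qwen_output_total = sum(qwen_output.values())

# Share of GPT-5.4 Nano's reasoning tokens spent in one category.
domain_share = gpt_tokens["Domain specific"][1] / gpt_reasoning_total

print(gpt_output_total, gpt_reasoning_total, qwen_output_total)
print(f"Domain specific share of reasoning tokens: {domain_share:.1%}")
```

The sums match the summary table (2,474 and 54,516 for GPT-5.4 Nano, 3,761 for Qwen3.5-35B-A3B), and the Domain specific category alone accounts for roughly four fifths of GPT-5.4 Nano's reasoning tokens.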
