AI BENCHY Compare

Nemotron 3 Ultra 550b A55b vs Qwen3.6 27B

Summary

Nemotron 3 Ultra 550b A55b vs Qwen3.6 27B benchmark comparison: Qwen3.6 27B leads on average score (6.6 vs 6.1) and on attempt pass rate (60.3% vs 44.4%). Nemotron 3 Ultra 550b A55b has the lower total benchmark cost ($0.027 vs $0.440) and the faster average response time (2.27s vs 59.71s).

Recommended model: Nemotron 3 Ultra 550b A55b - It scores close to the best score here (6.1 vs 6.6) while costing about 16.7x less than Qwen3.6 27B.
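The ratios behind this recommendation can be rechecked from the rounded figures shown on this page. The sketch below is only an illustration in Python; the dictionary layout and variable names are assumptions, not part of the benchmark. Note that the rounded totals give a cost ratio of about 16.3x, so the 16.7x quoted above presumably comes from unrounded costs.

```python
# Figures copied from the summary (rounded as displayed on the page).
nemotron = {"score": 6.1, "total_cost_usd": 0.027, "avg_time_s": 2.27}
qwen = {"score": 6.6, "total_cost_usd": 0.440, "avg_time_s": 59.71}

# How many times more Qwen3.6 27B costs, and how many times longer it takes.
cost_ratio = qwen["total_cost_usd"] / nemotron["total_cost_usd"]
speed_ratio = qwen["avg_time_s"] / nemotron["avg_time_s"]
score_gap = qwen["score"] - nemotron["score"]

print(f"cost ratio:  {cost_ratio:.1f}x")
print(f"speed ratio: {speed_ratio:.1f}x")
print(f"score gap:   {score_gap:.1f}")
```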

Last updated at: 2026-06-18

Metric	Nemotron 3 Ultra 550b A55b (none; Release: 2026-06-04; Free Available)	Qwen3.6 27B (medium; Release: 2026-04-20)
Score	6.1	6.6
Rank	#99	#81
Reliability	10.0	10.0
Consistency	9.2	8.2
Tests Correct
Attempt pass rate	44.4%	60.3%
Flaky tests	2	5
Total Runs	63	63
Cost per result	0.000	3.361
Total Cost	$0.027	$0.440
Input Price	$0.500 / 1M	$0.289 / 1M
Output Price	$2.200 / 1M	$3.170 / 1M
Total Input Tokens	43,326	39,376
Output Tokens	2,138	16,189
Reasoning Tokens	0	122,521
Response Time (avg)	2.27s	59.71s
Response Time (max)	13.49s	168.22s
Response Time (total)	47.65s	1254.01s
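
The token counts and per-million prices above suggest how a total cost could be estimated. The Python sketch below assumes that reasoning tokens are billed at the output price (the page does not say so) and ignores any caching or free-tier discounts; the function name estimate_cost is hypothetical. The estimates land near the reported totals but not exactly on them, so the site's own accounting presumably differs in some detail.

```python
def estimate_cost(input_tokens: int, output_tokens: int, reasoning_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate cost in USD from token counts and per-1M-token prices.

    Assumption: reasoning tokens are charged like output tokens.
    """
    input_cost = input_tokens * input_price_per_m / 1_000_000
    output_cost = (output_tokens + reasoning_tokens) * output_price_per_m / 1_000_000
    return input_cost + output_cost

# Token counts and prices copied from the table above.
nemotron_est = estimate_cost(43_326, 2_138, 0, 0.500, 2.200)
qwen_est = estimate_cost(39_376, 16_189, 122_521, 0.289, 3.170)

print(f"Nemotron 3 Ultra 550b A55b: ${nemotron_est:.3f} estimated vs $0.027 reported")
print(f"Qwen3.6 27B: ${qwen_est:.3f} estimated vs $0.440 reported")
```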

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#99 Nemotron 3 Ultra 550b A55b

none

Cost: $0.000
Time: 149.6s
Tokens: 3,405 tok

#81 Qwen3.6 27B

medium

Cost: $0.009
Time: 39.6s
Tokens: 3,090 tok

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

Anti-AI Tricks	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	3.5	8.0	16.7%	1		2.35s	696	239	0
Qwen3.6 27B	8.3	10.0	75.0%	0		12.62s	453	582	4,311

Coding	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	5.5	10.0	33.3%	0		1.02s	7,623	369	0
Qwen3.6 27B	7.7	10.0	66.7%	0		142.99s	5,051	7,968	43,367

Combined	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	3.0	10.0	0.0%	0		4.79s	15,558	357	0
Qwen3.6 27B	7.0	3.7	66.7%	1		83.07s	15,104	2,088	14,689

Data parsing and extraction	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	10.0	10.0	100.0%	0		1.94s	7,944	249	0
Qwen3.6 27B	3.5	1.4	50.0%	2		37.30s	7,778	568	9,404

Domain specific	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	5.3	10.0	33.3%	0		0.70s	789	27	0
Qwen3.6 27B	2.9	7.2	11.1%	1		73.38s	662	3,510	20,352

General Intelligence	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	5.0	10.0	0.0%	0		13.49s	516	101	0
Qwen3.6 27B	6.5	3.4	66.7%	1		39.53s	516	81	3,045

Instructions following	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	10.0	10.0	100.0%	0		1.46s	723	69	0
Qwen3.6 27B	10.0	10.0	100.0%	0		37.96s	699	346	6,548

Puzzle Solving	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	5.9	7.2	55.6%	1		1.06s	726	352	0
Qwen3.6 27B	7.7	10.0	66.7%	0		61.14s	696	255	12,044

Tool Calling	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	10.0	10.0	100.0%	0		2.99s	8,544	264	0
Qwen3.6 27B	10.0	10.0	100.0%	0		16.88s	8,213	390	2,954

Trivia	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Nemotron 3 Ultra 550b A55b	3.0	10.0	0.0%	0		1.83s	207	111	0
Qwen3.6 27B	3.0	10.0	0.0%	0		80.99s	204	401	5,807
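
The category scores above can be tallied to see which model leads where. The Python sketch below is only an illustration; the scores are copied from the tables, while the list layout and the winner helper are assumptions made for this example.

```python
from collections import Counter

# (category, Nemotron 3 Ultra 550b A55b score, Qwen3.6 27B score),
# copied from the category tables above.
scores = [
    ("Anti-AI Tricks", 3.5, 8.3),
    ("Coding", 5.5, 7.7),
    ("Combined", 3.0, 7.0),
    ("Data parsing and extraction", 10.0, 3.5),
    ("Domain specific", 5.3, 2.9),
    ("General Intelligence", 5.0, 6.5),
    ("Instructions following", 10.0, 10.0),
    ("Puzzle Solving", 5.9, 7.7),
    ("Tool Calling", 10.0, 10.0),
    ("Trivia", 3.0, 3.0),
]

def winner(nemotron: float, qwen: float) -> str:
    """Return the name of the higher-scoring model, or 'tie'."""
    if nemotron > qwen:
        return "Nemotron 3 Ultra 550b A55b"
    if qwen > nemotron:
        return "Qwen3.6 27B"
    return "tie"

tally = Counter(winner(n, q) for _, n, q in scores)
for name, count in tally.most_common():
    print(f"{name}: {count}")
```

On these figures Qwen3.6 27B leads in five categories, Nemotron 3 Ultra 550b A55b in two (data parsing and extraction, domain specific), and three are tied.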
