AI BENCHY Compare

Qwen: Qwen3.5-35B-A3B vs Xiaomi: MiMo-V2.5-Pro

Summary

Qwen3.5-35B-A3B vs MiMo-V2.5-Pro benchmark comparison: Qwen3.5-35B-A3B leads on average score (6.3 vs 5.5) and on attempt pass rate (69.8% vs 39.7%). MiMo-V2.5-Pro has the lower total benchmark cost ($0.017 vs $0.401) and the faster average response time (1.78s vs 72.57s).

Recommended model: MiMo-V2.5-Pro - Its score stays close to the best score here (5.5 vs 6.3), while costing about 25.0x less than Qwen3.5-35B-A3B.
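The trade-off behind this recommendation can be put in relative terms. The following is a minimal sketch that only uses the figures quoted in the summary; the variable names are illustrative and not part of the benchmark.

```python
# Figures copied from the summary above.
qwen = {"score": 6.3, "avg_time_s": 72.57, "pass_rate": 0.698}
mimo = {"score": 5.5, "avg_time_s": 1.78, "pass_rate": 0.397}

# Share of the top score that MiMo-V2.5-Pro retains.
score_retained = mimo["score"] / qwen["score"]

# How many times faster MiMo-V2.5-Pro responds on average.
speedup = qwen["avg_time_s"] / mimo["avg_time_s"]

# Attempt pass-rate gap in percentage points (Qwen minus MiMo).
pass_gap_pts = (qwen["pass_rate"] - mimo["pass_rate"]) * 100

print(f"score retained: {score_retained:.1%}")
print(f"speedup: {speedup:.1f}x")
print(f"pass-rate gap: {pass_gap_pts:.1f} points")
```

On these numbers MiMo-V2.5-Pro keeps roughly 87% of the top score and responds about 41 times faster, while trailing by about 30 percentage points on attempt pass rate.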

Last updated at: 2026-07-02

| Metric | Qwen3.5-35B-A3B (medium) | MiMo-V2.5-Pro (none) |
| --- | --- | --- |
| Release | 2026-02-24 | 2026-04-22 |
| Score | 6.3 | 5.5 |
| Rank | #92 | #123 |
| Reliability | 10.0 | 10.0 |
| Consistency | 7.5 | 8.6 |
| Tests Correct | | |
| Attempt pass rate | 69.8% | 39.7% |
| Flaky tests | 6 | 4 |
| Total Runs | 63 | 63 |
| Cost per result | 5.162 | 0.648 |
| Total Cost | $0.401 | $0.017 |
| Input Price | $0.140 / 1M | $0.435 / 1M |
| Output Price | $1.000 / 1M | $0.870 / 1M |
| Total Input Tokens | 42,196 | 30,724 |
| Output Tokens | 40,630 | 3,043 |
| Reasoning Tokens | 353,577 | 0 |
| Response Time (avg) | 72.57s | 1.78s |
| Response Time (max) | 409.98s | 8.32s |
| Response Time (total) | 1524.04s | 37.42s |
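The Total Cost figures can be roughly reproduced from the token counts and per-million-token prices listed above. The sketch below assumes that reasoning tokens are billed at the output price and are not already included in Output Tokens; the page does not state its billing formula, and `estimated_cost` is a hypothetical helper, not part of the benchmark.

```python
def estimated_cost(input_tokens, output_tokens, reasoning_tokens,
                   input_price_per_m, output_price_per_m):
    """Estimate USD cost from token counts and per-1M-token prices.

    Assumption: reasoning tokens are billed at the output rate.
    """
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * input_price_per_m
            + billed_output * output_price_per_m) / 1_000_000


# Token counts and prices copied from the metrics listed above.
qwen_cost = estimated_cost(42_196, 40_630, 353_577, 0.140, 1.000)
mimo_cost = estimated_cost(30_724, 3_043, 0, 0.435, 0.870)
cost_ratio = qwen_cost / mimo_cost

print(f"Qwen3.5-35B-A3B: ${qwen_cost:.3f}")
print(f"MiMo-V2.5-Pro:   ${mimo_cost:.3f}")
print(f"cost ratio:      {cost_ratio:.1f}x")
```

Under that assumption the estimates come to about $0.400 and $0.016, close to the listed $0.401 and $0.017, and the ratio is about 25x, in line with the recommendation. Most of the Qwen3.5-35B-A3B cost comes from its 353,577 reasoning tokens.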

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#92 Qwen3.5-35B-A3B (medium)
Cost: $0.009, Time: 71.4s, Tokens: 8,631

#123 MiMo-V2.5-Pro (none)
Cost: $0.004, Time: 46.4s, Tokens: 4,025

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 21.13s | 672 | 798 | 42,652 |
| Anti-AI Tricks | MiMo-V2.5-Pro | 3.3 | 8.1 | 8.3% | 1 | | 2.67s | 645 | 994 | 0 |
| Coding | Qwen3.5-35B-A3B | 5.9 | 9.3 | 33.3% | 0 | | 206.65s | 4,106 | 23,844 | 111,462 |
| Coding | MiMo-V2.5-Pro | 4.3 | 7.8 | 22.2% | 1 | | 1.41s | 6,559 | 485 | 0 |
| Combined | Qwen3.5-35B-A3B | 4.7 | 1.6 | 66.7% | 1 | | 75.34s | 20,992 | 775 | 12,485 |
| Combined | MiMo-V2.5-Pro | 3.0 | 10.0 | 0.0% | 0 | | 3.54s | 4,695 | 596 | 0 |
| Data parsing and extraction | Qwen3.5-35B-A3B | 7.3 | 5.9 | 83.3% | 1 | | 59.33s | 6,061 | 235 | 19,493 |
| Data parsing and extraction | MiMo-V2.5-Pro | 10.0 | 10.0 | 100.0% | 0 | | 1.32s | 7,758 | 249 | 0 |
| Domain specific | Qwen3.5-35B-A3B | 4.1 | 4.4 | 44.5% | 2 | | 88.34s | 500 | 41 | 46,368 |
| Domain specific | MiMo-V2.5-Pro | 5.3 | 10.0 | 33.3% | 0 | | 0.877s | 753 | 27 | 0 |
| General Intelligence | Qwen3.5-35B-A3B | 2.8 | 1.6 | 33.3% | 1 | | 30.30s | 172 | 20 | 3,753 |
| General Intelligence | MiMo-V2.5-Pro | 4.0 | 10.0 | 0.0% | 0 | | 2.58s | 498 | 87 | 0 |
| Instructions following | Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 24.45s | 699 | 97 | 17,361 |
| Instructions following | MiMo-V2.5-Pro | 6.4 | 10.0 | 50.0% | 0 | | 1.03s | 684 | 66 | 0 |
| Puzzle Solving | Qwen3.5-35B-A3B | 8.2 | 7.2 | 88.9% | 1 | | 33.13s | 597 | 3,592 | 26,585 |
| Puzzle Solving | MiMo-V2.5-Pro | 6.7 | 4.7 | 77.8% | 2 | | 1.30s | 678 | 267 | 0 |
| Tool Calling | Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 4.65s | 8,193 | 309 | 1,365 |
| Tool Calling | MiMo-V2.5-Pro | 10.0 | 10.0 | 100.0% | 0 | | 3.30s | 8,238 | 258 | 0 |
| Trivia | Qwen3.5-35B-A3B | 3.0 | 10.0 | 0.0% | 0 | | 177.35s | 204 | 10,919 | 72,053 |
| Trivia | MiMo-V2.5-Pro | 3.0 | 10.0 | 0.0% | 0 | | 1.89s | 216 | 14 | 0 |
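One quick way to read the category breakdown is to count which model has the higher Score in each category. The sketch below hard-codes the Score values from the rows above; `tally_wins` is a hypothetical helper for illustration and says nothing about how the benchmark weights categories.

```python
# Category -> (Qwen3.5-35B-A3B score, MiMo-V2.5-Pro score), from the breakdown above.
CATEGORY_SCORES = {
    "Anti-AI Tricks": (10.0, 3.3),
    "Coding": (5.9, 4.3),
    "Combined": (4.7, 3.0),
    "Data parsing and extraction": (7.3, 10.0),
    "Domain specific": (4.1, 5.3),
    "General Intelligence": (2.8, 4.0),
    "Instructions following": (10.0, 6.4),
    "Puzzle Solving": (8.2, 6.7),
    "Tool Calling": (10.0, 10.0),
    "Trivia": (3.0, 3.0),
}


def tally_wins(scores):
    """Count categories led by each model, plus ties."""
    counts = {"Qwen3.5-35B-A3B": 0, "MiMo-V2.5-Pro": 0, "tie": 0}
    for qwen_score, mimo_score in scores.values():
        if qwen_score > mimo_score:
            counts["Qwen3.5-35B-A3B"] += 1
        elif mimo_score > qwen_score:
            counts["MiMo-V2.5-Pro"] += 1
        else:
            counts["tie"] += 1
    return counts


wins = tally_wins(CATEGORY_SCORES)
print(wins)
```

With these values Qwen3.5-35B-A3B leads in five categories, MiMo-V2.5-Pro in three, and two (Tool Calling and Trivia) are tied.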
