AI BENCHY Compare

MoonshotAI: Kimi K2.6 vs Qwen: Qwen3.5-122B-A10B

Last updated at: 2026-05-22

Metric                 Kimi K2.6 (medium)  Qwen3.5-122B-A10B (medium)
Release                2026-04-20          2026-02-24
Score                  7.4                 7.7
Rank                   #54                 #38
Reliability            8.3                 10.0
Consistency            8.3                 8.8
Tests Correct
Attempt pass rate      70.8%               71.7%
Flaky tests            4                   3
Total Runs             60                  60
Cost per result        7.630               4.997
Total Cost             $0.916              $0.650
Input Price            $0.730 / 1M         $0.260 / 1M
Output Price           $3.490 / 1M         $2.080 / 1M
Output Tokens          102,488             26,171
Reasoning Tokens       229,389             212,114
Response Time (avg)    54.11s              39.29s
Response Time (max)    215.85s             168.16s
Response Time (total)  1028.14s            785.87s
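
The page does not say how many responses the average and total response times cover. As a rough consistency check, the sketch below divides the total by the average for each model, using only the figures from the summary table. It assumes the total is the plain sum of the same timings the average is taken over; the names `timings` and `implied_response_count` are illustrative and do not come from the site.

```python
# Values copied from the "Response Time (avg)" and "Response Time (total)"
# rows of the summary table.
timings = {
    "Kimi K2.6": {"avg_s": 54.11, "total_s": 1028.14},
    "Qwen3.5-122B-A10B": {"avg_s": 39.29, "total_s": 785.87},
}


def implied_response_count(avg_s: float, total_s: float) -> float:
    """Return total / avg.

    Assumption: the total is the sum of the same timings that the
    average is computed over, so the ratio approximates their count.
    """
    return total_s / avg_s


implied = {name: implied_response_count(**t) for name, t in timings.items()}

for name, count in implied.items():
    print(f"{name}: ~{count:.2f} timed responses")
```

Under that assumption the ratios come out close to 19 for Kimi K2.6 and 20 for Qwen3.5-122B-A10B, well below the 60 total runs listed for each model. The page does not explain the difference, so the result is best read as a hint about how timings are aggregated and not as a confirmed test count.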

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Category                     Model              Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Anti-AI Tricks               Kimi K2.6          7.0    8.0          66.7%              1                           11.59s               7,115          8,934
                             Qwen3.5-122B-A10B  10.0   10.0         100.0%             0                           9.75s                269            16,835
Coding                       Kimi K2.6          6.5    5.3          75.0%              1                           118.23s              9,255          52,215
                             Qwen3.5-122B-A10B  4.1    5.8          33.3%              1                           119.57s              8,036          45,074
Combined                     Kimi K2.6          10.0   10.0         100.0%             0                           40.96s               711            13,876
                             Qwen3.5-122B-A10B  10.0   10.0         100.0%             0                           107.79s              483            11,337
Data parsing and extraction  Kimi K2.6          10.0   10.0         100.0%             0                           20.38s               316            11,305
                             Qwen3.5-122B-A10B  10.0   10.0         100.0%             0                           23.41s               270            16,558
Domain specific              Kimi K2.6          5.3    7.2          44.4%              1                           202.38s              47,035         98,262
                             Qwen3.5-122B-A10B  2.9    7.2          11.1%              1                           63.40s               15,537         64,889
General Intelligence         Kimi K2.6          10.0   10.0         100.0%             0                           17.83s               3,981          4,472
                             Qwen3.5-122B-A10B  3.4    2.2          33.3%              1                           34.11s               66             7,592
Instructions following       Kimi K2.6          10.0   10.0         100.0%             0                           12.53s               3,977          5,269
                             Qwen3.5-122B-A10B  10.0   10.0         100.0%             0                           9.88s                77             7,372
Puzzle Solving               Kimi K2.6          6.0    7.4          55.6%              1                           25.59s               14,140         17,868
                             Qwen3.5-122B-A10B  10.0   10.0         100.0%             0                           17.18s               289            26,165
Tool Calling                 Kimi K2.6          10.0   10.0         100.0%             0                           8.92s                248            1,011
                             Qwen3.5-122B-A10B  10.0   10.0         100.0%             0                           4.60s                322            1,226
Trivia                       Kimi K2.6          3.0    10.0         0.0%               0                           130.27s              15,710         16,177
                             Qwen3.5-122B-A10B  3.0    10.0         0.0%               0                           52.87s               822            15,066
