AI BENCHY Compare
Inception: Mercury 2 vs Qwen: Qwen3.5-35B-A3B
Benchmarks generated from AI BENCHY test suites on: 2026-03-05
| Metric | Inception: Mercury 2 (none; release date 2026-02-24) | Qwen: Qwen3.5-35B-A3B (medium; release date 2026-02-24) |
|---|---|---|
| Rank | #50 | #33 |
| Avg. score | 3.4 | 5.8 |
| Correct tests | | |
| Consistency | 8.9 | 6.7 |
| Cost per result | 0.147 | 4.189 |
| Total cost | $0.006 | $0.336 |
| Pass rate per attempt | 33.3% | 80.0% |
| Unstable tests | 2 | 6 |
| Total attempts | 45 (15 x 3) | 45 (15 x 3) |
| Output tokens | 1,144 | 5,475 |
| Reasoning tokens | 0 | 165,513 |
| Response time (avg.) | 594ms | 44.84s |
| Response time (max) | 1.27s | 106.00s |
| Response time (total) | 8.91s | 672.55s |
Charts: Response time (avg.); Score vs total cost; Avg. score vs response time (avg.)
Category breakdown
| Anti-AI tricks | Score | Consistency | Pass rate per attempt | Unstable tests | Correct tests | Response time (avg.) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | 10.0 | 10.0 | 0.0% | 0 | | 466ms | 274 | 0 |
| Qwen: Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 21.75s | 429 | 36,235 |

| Combined | Score | Consistency | Pass rate per attempt | Unstable tests | Correct tests | Response time (avg.) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | 10.0 | 10.0 | 0.0% | 0 | | 606ms | 131 | 0 |
| Qwen: Qwen3.5-35B-A3B | 10.0 | 1.6 | 66.7% | 1 | | 75.34s | 775 | 12,485 |

| Data parsing and extraction | Score | Consistency | Pass rate per attempt | Unstable tests | Correct tests | Response time (avg.) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | 5.5 | 5.9 | 83.3% | 1 | | 667ms | 180 | 0 |
| Qwen: Qwen3.5-35B-A3B | 5.5 | 5.9 | 83.3% | 1 | | 59.33s | 235 | 19,493 |

| Domain-specific | Score | Consistency | Pass rate per attempt | Unstable tests | Correct tests | Response time (avg.) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | 4.0 | 7.2 | 44.4% | 1 | | 534ms | 46 | 0 |
| Qwen: Qwen3.5-35B-A3B | 10.0 | 4.4 | 44.5% | 2 | | 88.34s | 41 | 46,368 |

| Instruction following | Score | Consistency | Pass rate per attempt | Unstable tests | Correct tests | Response time (avg.) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | 5.5 | 10.0 | 50.0% | 0 | | 551ms | 82 | 0 |
| Qwen: Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 24.45s | 97 | 17,361 |

| Puzzle solving | Score | Consistency | Pass rate per attempt | Unstable tests | Correct tests | Response time (avg.) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | 10.0 | 10.0 | 0.0% | 0 | | 533ms | 234 | 0 |
| Qwen: Qwen3.5-35B-A3B | 4.0 | 4.4 | 77.8% | 2 | | 31.58s | 3,589 | 32,206 |

| Tool calling | Score | Consistency | Pass rate per attempt | Unstable tests | Correct tests | Response time (avg.) | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | 10.0 | 10.0 | 100.0% | 0 | | 1.27s | 197 | 0 |
| Qwen: Qwen3.5-35B-A3B | 10.0 | 10.0 | 100.0% | 0 | | 4.65s | 309 | 1,365 |
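
As a quick sanity check on the figures above, the sketch below re-adds the per-category token and unstable-test counts and compares them with the overall totals. It rests on two assumptions the page does not state: that the overall totals are plain sums over the seven categories, and that the average response time is the total response time divided by the 15 tests. All names in the code (`CATEGORY_ROWS`, `REPORTED`, `summarize`) are illustrative only.

```python
# Per-category rows copied from the category tables, in table order:
# (output tokens, reasoning tokens, unstable tests)
CATEGORY_ROWS = {
    "Inception: Mercury 2": [
        (274, 0, 0), (131, 0, 0), (180, 0, 1), (46, 0, 1),
        (82, 0, 0), (234, 0, 0), (197, 0, 0),
    ],
    "Qwen: Qwen3.5-35B-A3B": [
        (429, 36235, 0), (775, 12485, 1), (235, 19493, 1), (41, 46368, 2),
        (97, 17361, 0), (3589, 32206, 2), (309, 1365, 0),
    ],
}

# Overall figures copied from the summary table.
REPORTED = {
    "Inception: Mercury 2": {
        "output_tokens": 1144, "reasoning_tokens": 0,
        "unstable_tests": 2, "total_response_s": 8.91,
    },
    "Qwen: Qwen3.5-35B-A3B": {
        "output_tokens": 5475, "reasoning_tokens": 165513,
        "unstable_tests": 6, "total_response_s": 672.55,
    },
}

TESTS = 15  # from "45 (15 x 3)": 15 tests, 3 attempts each


def summarize(model: str) -> dict:
    """Recompute totals from the category rows (assumes plain sums)."""
    out, reasoning, unstable = (sum(col) for col in zip(*CATEGORY_ROWS[model]))
    # Assumption: average response time = total response time / number of tests.
    avg_s = REPORTED[model]["total_response_s"] / TESTS
    return {
        "output_tokens": out,
        "reasoning_tokens": reasoning,
        "unstable_tests": unstable,
        "avg_response_s": round(avg_s, 3),
    }


if __name__ == "__main__":
    for model, reported in REPORTED.items():
        recomputed = summarize(model)
        for key in ("output_tokens", "reasoning_tokens", "unstable_tests"):
            status = "ok" if recomputed[key] == reported[key] else "MISMATCH"
            print(f"{model}: {key} = {recomputed[key]} ({status})")
        print(f"{model}: avg response = {recomputed['avg_response_s']} s")
```

Under these assumptions the recomputed sums can be compared directly with the summary table, which also confirms which category column holds output tokens and which holds reasoning tokens.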