AI BENCHY Category
Combined Ranking
See which AI models perform best in the combined ranking, which stay reliable, and where the biggest gaps appear. Sort by: Response Time (avg) ↓.
| Rank | Model | Company | Combined Score | Average Score | Tests Passed | Response Time (avg) |
|---|---|---|---|---|---|---|
| #8 | Gemini 3.1 Flash Lite Preview high | Google | 10.0 | 8.2 | 1/1 | 280.5s |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 10.0 | 6.9 | 1/1 | 262.8s |
| #7 | Qwen3.5-27B medium | Qwen | 10.0 | 8.2 | 1/1 | 164.0s |
| #33 | DeepSeek V3.2 none | DeepSeek | 8.0 | 5.5 | 0/1 | 115.9s |
| #10 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 7.7 | 1/1 | 107.8s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 10.0 | 7.3 | 1/1 | 93.1s |
| #32 | GPT-5 Mini medium | OpenAI | 10.0 | 6.0 | 1/1 | 88.2s |
| #26 | Claude Opus 4.6 medium | Anthropic | 10.0 | 6.6 | 1/1 | 76.7s |
| #21 | MiMo-V2-Flash medium | Xiaomi | 9.0 | 7.2 | 1/1 | 75.7s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 10.0 | 5.5 | 0/1 | 75.3s |
| #28 | Kimi K2.5 medium | Moonshot AI | 10.0 | 6.4 | 1/1 | 71.4s |
| #34 | GPT-5 Nano medium | OpenAI | 10.0 | 5.5 | 1/1 | 66.0s |
| #52 | GLM 4.7 Flash medium | Z.ai | 10.0 | 3.1 | 0/1 | 65.6s |
| #43 | MiniMax M2.5 medium | Minimax | 10.0 | 4.7 | 0/1 | 60.4s |
| #1 | Gemini 3 Flash Preview medium | Google | 10.0 | 10.0 | 1/1 | 50.2s |
| #42 | Qwen3.5-35B-A3B none | Qwen | 10.0 | 4.7 | 0/1 | 47.4s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 10.0 | 8.3 | 1/1 | 46.8s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 10.0 | 7.7 | 1/1 | 46.4s |
| #40 | Qwen3.5-122B-A10B none | Qwen | 10.0 | 5.0 | 0/1 | 46.0s |
| #48 | Qwen3 Coder Next none | Qwen | 10.0 | 4.0 | 0/1 | 45.1s |
| #2 | Gemini 3.1 Pro Preview medium | Google | 9.0 | 9.4 | 1/1 | 40.6s |
| #30 | Grok 4.1 Fast medium | X AI | 10.0 | 6.2 | 1/1 | 37.6s |
| #39 | gpt-oss-120b medium | OpenAI | 10.0 | 5.1 | 1/1 | 31.2s |
| #13 | Step 3.5 Flash medium | Stepfun | 10.0 | 7.4 | 1/1 | 29.6s |
| #14 | GLM 5 medium | Z.ai | 10.0 | 7.4 | 1/1 | 29.0s |
| #16 | Gemini 2.5 Flash medium | Google | 10.0 | 7.4 | 1/1 | 28.4s |
| #25 | Claude Sonnet 4.6 none | Anthropic | 9.0 | 6.8 | 1/1 | 23.8s |
| #9 | GPT-5.4 medium | OpenAI | 10.0 | 8.0 | 1/1 | 20.6s |
| #3 | GPT-5.3-Codex medium | OpenAI | 10.0 | 8.4 | 1/1 | 19.6s |
| #46 | Kimi K2.5 none | Moonshot AI | 10.0 | 4.1 | 0/1 | 19.2s |
| #24 | Qwen3.5-Flash medium | Qwen | 10.0 | 6.9 | 1/1 | 17.8s |
| #12 | Gemini 3.1 Flash Lite Preview medium | Google | 10.0 | 7.5 | 1/1 | 14.9s |
| #27 | GPT-5.2 medium | OpenAI | 10.0 | 6.5 | 1/1 | 14.1s |
| #19 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.3 | 1/1 | 12.0s |
| #17 | Gemini 3.1 Flash Lite Preview low | Google | 10.0 | 7.3 | 0/1 | 11.9s |
| #6 | Gemini 3 Pro Preview medium | Google | 10.0 | 8.2 | 0/1 | 10.4s |
| #41 | Qwen3.5-27B none | Qwen | 10.0 | 4.9 | 0/1 | 9.39s |
| #15 | GPT-5.2 Chat none | OpenAI | 10.0 | 7.4 | 1/1 | 9.12s |
| #45 | Trinity Large Preview none | Arcee AI | 10.0 | 4.2 | 0/1 | 8.91s |
| #47 | GPT-4o-mini none | OpenAI | 10.0 | 4.0 | 0/1 | 7.58s |
| #29 | Qwen3.5 Plus 2026-02-15 none | Qwen | 10.0 | 6.2 | 0/1 | 6.65s |
| #37 | Qwen3.5-Flash none | Qwen | 10.0 | 5.2 | 0/1 | 6.22s |
| #31 | GLM 5 none | Z.ai | 10.0 | 6.0 | 0/1 | 4.98s |
| #38 | Gemini 2.5 Flash none | Google | 10.0 | 5.2 | 0/1 | 4.39s |
| #50 | Qwen3 Coder Next medium | Qwen | 10.0 | 3.5 | 0/1 | 4.28s |
| #20 | Gemini 3 Flash Preview none | Google | 10.0 | 7.2 | 0/1 | 3.56s |
| #53 | Grok 4.1 Fast none | X AI | 10.0 | 2.9 | 0/1 | 3.33s |
| #36 | Mercury 2 medium | Inception | 10.0 | 5.3 | 1/1 | 3.28s |
| #5 | Gemini 3 Flash Preview low | Google | 10.0 | 8.2 | 0/1 | 3.27s |
| #49 | GLM 4.7 Flash none | Z.ai | 10.0 | 3.9 | 0/1 | 3.22s |
| #22 | Gemini 3.1 Flash Lite Preview none | Google | 10.0 | 7.1 | 0/1 | 3.20s |
| #44 | GPT-5.4 none | OpenAI | 10.0 | 4.5 | 0/1 | 2.89s |
| #54 | MiMo-V2-Flash none | Xiaomi | 10.0 | 2.9 | 0/1 | 2.87s |
| #51 | Mercury 2 none | Inception | 10.0 | 3.4 | 0/1 | 606ms |
| #55 | LFM2-24B-A2B none | Liquid | 10.0 | 2.6 | 0/1 | 0ms |
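If you want to re-slice the table yourself (for example, to compare score against latency), here is a minimal Python sketch. The row values are copied from the table above; the field names and the dict-of-rows layout are illustrative assumptions, not the data format AI BENCHY actually exposes.

```python
# Illustrative sketch: re-sorting a few leaderboard rows by average response time.
# Row values come from the table above; the structure is an assumed example layout.
rows = [
    {"rank": 1,  "model": "Gemini 3 Flash Preview medium",      "avg_score": 10.0, "response_s": 50.2},
    {"rank": 3,  "model": "GPT-5.3-Codex medium",                "avg_score": 8.4,  "response_s": 19.6},
    {"rank": 8,  "model": "Gemini 3.1 Flash Lite Preview high",  "avg_score": 8.2,  "response_s": 280.5},
    {"rank": 55, "model": "LFM2-24B-A2B none",                   "avg_score": 2.6,  "response_s": 0.0},
]

# Sort descending by response time, matching the table's "Response Time (avg) ↓" order.
by_latency = sorted(rows, key=lambda r: r["response_s"], reverse=True)

for r in by_latency:
    print(f'#{r["rank"]:>2}  {r["model"]:<38}  score {r["avg_score"]:>4}  {r["response_s"]:>6.1f}s')
```

Swapping the sort key to `avg_score` reproduces the rank order shown in the Rank column instead of the latency order used above.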