AI BENCHY Category
Combined Ranking
See which AI models perform best in the combined ranking, which ones stay reliable, and where the biggest gaps appear. Sorted by: Response Time (Avg) ↑.
| Rank | Model | Company | Combined Score | Avg Score | Correct Tests | Response Time (Avg) |
|---|---|---|---|---|---|---|
| #55 | LFM2-24B-A2B none | Liquid | 10.0 | 2.6 | 0/1 | 0ms |
| #51 | Mercury 2 none | Inception | 10.0 | 3.4 | 0/1 | 606ms |
| #54 | MiMo-V2-Flash none | Xiaomi | 10.0 | 2.9 | 0/1 | 2.87s |
| #44 | GPT-5.4 none | OpenAI | 10.0 | 4.5 | 0/1 | 2.89s |
| #22 | Gemini 3.1 Flash Lite Preview none | Google | 10.0 | 7.1 | 0/1 | 3.20s |
| #49 | GLM 4.7 Flash none | Z.ai | 10.0 | 3.9 | 0/1 | 3.22s |
| #5 | Gemini 3 Flash Preview low | Google | 10.0 | 8.2 | 0/1 | 3.27s |
| #36 | Mercury 2 medium | Inception | 10.0 | 5.3 | 1/1 | 3.28s |
| #53 | Grok 4.1 Fast none | X AI | 10.0 | 2.9 | 0/1 | 3.33s |
| #20 | Gemini 3 Flash Preview none | Google | 10.0 | 7.2 | 0/1 | 3.56s |
| #50 | Qwen3 Coder Next medium | Qwen | 10.0 | 3.5 | 0/1 | 4.28s |
| #38 | Gemini 2.5 Flash none | Google | 10.0 | 5.2 | 0/1 | 4.39s |
| #31 | GLM 5 none | Z.ai | 10.0 | 6.0 | 0/1 | 4.98s |
| #37 | Qwen3.5-Flash none | Qwen | 10.0 | 5.2 | 0/1 | 6.22s |
| #29 | Qwen3.5 Plus 2026-02-15 none | Qwen | 10.0 | 6.2 | 0/1 | 6.65s |
| #47 | GPT-4o-mini none | OpenAI | 10.0 | 4.0 | 0/1 | 7.58s |
| #45 | Trinity Large Preview none | Arcee AI | 10.0 | 4.2 | 0/1 | 8.91s |
| #15 | GPT-5.2 Chat none | OpenAI | 10.0 | 7.4 | 1/1 | 9.12s |
| #41 | Qwen3.5-27B none | Qwen | 10.0 | 4.9 | 0/1 | 9.39s |
| #6 | Gemini 3 Pro Preview medium | Google | 10.0 | 8.2 | 0/1 | 10.4s |
| #17 | Gemini 3.1 Flash Lite Preview low | Google | 10.0 | 7.3 | 0/1 | 11.9s |
| #19 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.3 | 1/1 | 12.0s |
| #27 | GPT-5.2 medium | OpenAI | 10.0 | 6.5 | 1/1 | 14.1s |
| #12 | Gemini 3.1 Flash Lite Preview medium | Google | 10.0 | 7.5 | 1/1 | 14.9s |
| #24 | Qwen3.5-Flash medium | Qwen | 10.0 | 6.9 | 1/1 | 17.8s |
| #46 | Kimi K2.5 none | Moonshot AI | 10.0 | 4.1 | 0/1 | 19.2s |
| #3 | GPT-5.3-Codex medium | OpenAI | 10.0 | 8.4 | 1/1 | 19.6s |
| #9 | GPT-5.4 medium | OpenAI | 10.0 | 8.0 | 1/1 | 20.6s |
| #25 | Claude Sonnet 4.6 none | Anthropic | 9.0 | 6.8 | 1/1 | 23.8s |
| #16 | Gemini 2.5 Flash medium | Google | 10.0 | 7.4 | 1/1 | 28.4s |
| #14 | GLM 5 medium | Z.ai | 10.0 | 7.4 | 1/1 | 29.0s |
| #13 | Step 3.5 Flash medium | Stepfun | 10.0 | 7.4 | 1/1 | 29.6s |
| #39 | gpt-oss-120b medium | OpenAI | 10.0 | 5.1 | 1/1 | 31.2s |
| #30 | Grok 4.1 Fast medium | X AI | 10.0 | 6.2 | 1/1 | 37.6s |
| #2 | Gemini 3.1 Pro Preview medium | Google | 9.0 | 9.4 | 1/1 | 40.6s |
| #48 | Qwen3 Coder Next none | Qwen | 10.0 | 4.0 | 0/1 | 45.1s |
| #40 | Qwen3.5-122B-A10B none | Qwen | 10.0 | 5.0 | 0/1 | 46.0s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 10.0 | 7.7 | 1/1 | 46.4s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 10.0 | 8.3 | 1/1 | 46.8s |
| #42 | Qwen3.5-35B-A3B none | Qwen | 10.0 | 4.7 | 0/1 | 47.4s |
| #1 | Gemini 3 Flash Preview medium | Google | 10.0 | 10.0 | 1/1 | 50.2s |
| #43 | MiniMax M2.5 medium | Minimax | 10.0 | 4.7 | 0/1 | 60.4s |
| #52 | GLM 4.7 Flash medium | Z.ai | 10.0 | 3.1 | 0/1 | 65.6s |
| #34 | GPT-5 Nano medium | OpenAI | 10.0 | 5.5 | 1/1 | 66.0s |
| #28 | Kimi K2.5 medium | Moonshot AI | 10.0 | 6.4 | 1/1 | 71.4s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 10.0 | 5.5 | 0/1 | 75.3s |
| #21 | MiMo-V2-Flash medium | Xiaomi | 9.0 | 7.2 | 1/1 | 75.7s |
| #26 | Claude Opus 4.6 medium | Anthropic | 10.0 | 6.6 | 1/1 | 76.7s |
| #32 | GPT-5 Mini medium | OpenAI | 10.0 | 6.0 | 1/1 | 88.2s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 10.0 | 7.3 | 1/1 | 93.1s |
| #10 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 7.7 | 1/1 | 107.8s |
| #33 | DeepSeek V3.2 none | DeepSeek | 8.0 | 5.5 | 0/1 | 115.9s |
| #7 | Qwen3.5-27B medium | Qwen | 10.0 | 8.2 | 1/1 | 164.0s |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 10.0 | 6.9 | 1/1 | 262.8s |
| #8 | Gemini 3.1 Flash Lite Preview high | Google | 10.0 | 8.2 | 1/1 | 280.5s |
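The response-time column mixes milliseconds and seconds. Below is a minimal sketch in Python (not part of AI BENCHY; the helper `parse_response_time` and the sample rows are copied or assumed for illustration only) of how one might normalize those values to seconds and reproduce the table's ascending sort by response time.

```python
import re

def parse_response_time(value: str) -> float:
    """Convert a response-time cell such as '0ms', '606ms', or '3.27s' to seconds."""
    match = re.fullmatch(r"([\d.]+)\s*(ms|s)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized response time: {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    return number / 1000.0 if unit == "ms" else number

# A few rows copied from the table above: (model, response time).
rows = [
    ("Gemini 3 Flash Preview medium", "50.2s"),
    ("Mercury 2 none", "606ms"),
    ("LFM2-24B-A2B none", "0ms"),
    ("Gemini 3.1 Flash Lite Preview high", "280.5s"),
]

# Sort ascending by response time, mirroring "Response Time (Avg) ↑".
for model, raw in sorted(rows, key=lambda r: parse_response_time(r[1])):
    print(f"{parse_response_time(raw):8.3f} s  {model}")
```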