AI BENCHY Category
Puzzle Solving Rankings
See which AI models perform best at puzzle solving, which ones stay reliable, and where the biggest gaps show up. Sorted by: Response Time (avg) ↓.
| Rank | Model | Company | Puzzle Solving Score | Avg Score | Correct Tests | Response Time (avg) |
|---|---|---|---|---|---|---|
| #7 | Qwen3.5-27B medium | Qwen | 8.3 | 8.2 | 2/3 | 64.6s |
| #24 | Qwen3.5-Flash medium | Qwen | 4.0 | 6.9 | 1/3 | 56.7s |
| #8 | Gemini 3.1 Flash Lite Preview high | Google | 7.0 | 8.2 | 2/3 | 46.3s |
| #28 | Kimi K2.5 medium | Moonshot AI | 4.0 | 6.4 | 1/3 | 45.4s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 7.0 | 7.3 | 2/3 | 36.9s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 10.0 | 8.3 | 3/3 | 34.6s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 4.0 | 5.5 | 1/3 | 31.6s |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 7.0 | 6.9 | 2/3 | 25.9s |
| #48 | Qwen3 Coder Next none | Qwen | 1.3 | 4.0 | 0/3 | 22.9s |
| #34 | GPT-5 Nano medium | OpenAI | 4.0 | 5.5 | 1/3 | 19.8s |
| #10 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 7.7 | 3/3 | 17.2s |
| #14 | GLM 5 medium | Z.ai | 10.0 | 7.4 | 3/3 | 15.6s |
| #32 | GPT-5 Mini medium | OpenAI | 4.3 | 6.0 | 1/3 | 14.1s |
| #52 | GLM 4.7 Flash medium | Z.ai | 10.0 | 3.1 | 0/3 | 12.9s |
| #39 | gpt-oss-120b medium | OpenAI | 1.7 | 5.1 | 0/3 | 11.8s |
| #43 | MiniMax M2.5 medium | Minimax | 4.0 | 4.7 | 1/3 | 11.5s |
| #9 | GPT-5.4 medium | OpenAI | 7.0 | 8.0 | 2/3 | 9.13s |
| #30 | Grok 4.1 Fast medium | X AI | 4.0 | 6.2 | 1/3 | 8.08s |
| #13 | Step 3.5 Flash medium | Stepfun | 4.0 | 7.4 | 1/3 | 7.72s |
| #33 | DeepSeek V3.2 none | DeepSeek | 7.7 | 5.5 | 2/3 | 7.37s |
| #2 | Gemini 3.1 Pro Preview medium | Google | 10.0 | 9.4 | 3/3 | 7.15s |
| #5 | Gemini 3 Flash Preview low | Google | 10.0 | 8.2 | 3/3 | 6.11s |
| #37 | Qwen3.5-Flash none | Qwen | 1.3 | 5.2 | 0/3 | 5.90s |
| #27 | GPT-5.2 medium | OpenAI | 7.0 | 6.5 | 2/3 | 5.47s |
| #3 | GPT-5.3-Codex medium | OpenAI | 9.3 | 8.4 | 2/3 | 5.12s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 10.0 | 7.7 | 3/3 | 4.80s |
| #46 | Kimi K2.5 none | Moonshot AI | 10.0 | 4.1 | 0/3 | 4.73s |
| #26 | Claude Opus 4.6 medium | Anthropic | 7.0 | 6.6 | 2/3 | 4.60s |
| #1 | Gemini 3 Flash Preview medium | Google | 10.0 | 10.0 | 3/3 | 4.43s |
| #15 | GPT-5.2 Chat none | OpenAI | 7.0 | 7.4 | 2/3 | 4.42s |
| #16 | Gemini 2.5 Flash medium | Google | 7.0 | 7.4 | 2/3 | 3.94s |
| #6 | Gemini 3 Pro Preview medium | Google | 10.0 | 8.2 | 3/3 | 3.91s |
| #21 | MiMo-V2-Flash medium | Xiaomi | 7.0 | 7.2 | 2/3 | 3.77s |
| #12 | Gemini 3.1 Flash Lite Preview medium | Google | 7.0 | 7.5 | 2/3 | 3.58s |
| #45 | Trinity Large Preview none | Arcee AI | 4.0 | 4.2 | 1/3 | 3.30s |
| #19 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.3 | 3/3 | 2.93s |
| #25 | Claude Sonnet 4.6 none | Anthropic | 7.0 | 6.8 | 2/3 | 2.92s |
| #29 | Qwen3.5 Plus 2026-02-15 none | Qwen | 7.0 | 6.2 | 2/3 | 2.82s |
| #17 | Gemini 3.1 Flash Lite Preview low | Google | 10.0 | 7.3 | 3/3 | 2.76s |
| #50 | Qwen3 Coder Next medium | Qwen | 10.0 | 3.5 | 0/3 | 2.30s |
| #31 | GLM 5 none | Z.ai | 7.0 | 6.0 | 2/3 | 2.05s |
| #55 | LFM2-24B-A2B none | Liquid | 3.3 | 2.6 | 0/3 | 1.69s |
| #44 | GPT-5.4 none | OpenAI | 4.0 | 4.5 | 1/3 | 1.52s |
| #54 | MiMo-V2-Flash none | Xiaomi | 10.0 | 2.9 | 0/3 | 1.38s |
| #41 | Qwen3.5-27B none | Qwen | 6.3 | 4.9 | 1/3 | 1.37s |
| #42 | Qwen3.5-35B-A3B none | Qwen | 1.7 | 4.7 | 0/3 | 1.34s |
| #47 | GPT-4o-mini none | OpenAI | 2.3 | 4.0 | 0/3 | 1.30s |
| #53 | Grok 4.1 Fast none | X AI | 1.3 | 2.9 | 0/3 | 1.28s |
| #20 | Gemini 3 Flash Preview none | Google | 7.0 | 7.2 | 2/3 | 1.06s |
| #49 | GLM 4.7 Flash none | Z.ai | 3.7 | 3.9 | 0/3 | 1.00s |
| #40 | Qwen3.5-122B-A10B none | Qwen | 4.0 | 5.0 | 1/3 | 982ms |
| #22 | Gemini 3.1 Flash Lite Preview none | Google | 10.0 | 7.1 | 3/3 | 972ms |
| #36 | Mercury 2 medium | Inception | 1.7 | 5.3 | 0/3 | 934ms |
| #38 | Gemini 2.5 Flash none | Google | 4.7 | 5.2 | 1/3 | 576ms |
| #51 | Mercury 2 none | Inception | 10.0 | 3.4 | 0/3 | 533ms |