AI BENCHY category
Puzzle Solving Rankings
See which AI models perform best at puzzle solving, which stay reliable, and where the biggest gaps appear. Sorted by: Response time (avg) ↑.
Related failure reasons
| Rank | Model | Company | Puzzle Solving Score | Average Score | Tests Passed | Response Time (avg) |
|---|---|---|---|---|---|---|
| #51 | Mercury 2 none | Inception | 10.0 | 3.4 | 0/3 | 533ms |
| #38 | Gemini 2.5 Flash none | Google | 4.7 | 5.2 | 1/3 | 576ms |
| #36 | Mercury 2 medium | Inception | 1.7 | 5.3 | 0/3 | 934ms |
| #22 | Gemini 3.1 Flash Lite Preview none | Google | 10.0 | 7.1 | 3/3 | 972ms |
| #40 | Qwen3.5-122B-A10B none | Qwen | 4.0 | 5.0 | 1/3 | 982ms |
| #49 | GLM 4.7 Flash none | Z.ai | 3.7 | 3.9 | 0/3 | 1.00s |
| #20 | Gemini 3 Flash Preview none | Google | 7.0 | 7.2 | 2/3 | 1.06s |
| #53 | Grok 4.1 Fast none | X AI | 1.3 | 2.9 | 0/3 | 1.28s |
| #47 | GPT-4o-mini none | OpenAI | 2.3 | 4.0 | 0/3 | 1.30s |
| #42 | Qwen3.5-35B-A3B none | Qwen | 1.7 | 4.7 | 0/3 | 1.34s |
| #41 | Qwen3.5-27B none | Qwen | 6.3 | 4.9 | 1/3 | 1.37s |
| #54 | MiMo-V2-Flash none | Xiaomi | 10.0 | 2.9 | 0/3 | 1.38s |
| #44 | GPT-5.4 none | OpenAI | 4.0 | 4.5 | 1/3 | 1.52s |
| #55 | LFM2-24B-A2B none | Liquid | 3.3 | 2.6 | 0/3 | 1.69s |
| #31 | GLM 5 none | Z.ai | 7.0 | 6.0 | 2/3 | 2.05s |
| #50 | Qwen3 Coder Next medium | Qwen | 10.0 | 3.5 | 0/3 | 2.30s |
| #17 | Gemini 3.1 Flash Lite Preview low | Google | 10.0 | 7.3 | 3/3 | 2.76s |
| #29 | Qwen3.5 Plus 2026-02-15 none | Qwen | 7.0 | 6.2 | 2/3 | 2.82s |
| #25 | Claude Sonnet 4.6 none | Anthropic | 7.0 | 6.8 | 2/3 | 2.92s |
| #19 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.3 | 3/3 | 2.93s |
| #45 | Trinity Large Preview none | Arcee AI | 4.0 | 4.2 | 1/3 | 3.30s |
| #12 | Gemini 3.1 Flash Lite Preview medium | Google | 7.0 | 7.5 | 2/3 | 3.58s |
| #21 | MiMo-V2-Flash medium | Xiaomi | 7.0 | 7.2 | 2/3 | 3.77s |
| #6 | Gemini 3 Pro Preview medium | Google | 10.0 | 8.2 | 3/3 | 3.91s |
| #16 | Gemini 2.5 Flash medium | Google | 7.0 | 7.4 | 2/3 | 3.94s |
| #15 | GPT-5.2 Chat none | OpenAI | 7.0 | 7.4 | 2/3 | 4.42s |
| #1 | Gemini 3 Flash Preview medium | Google | 10.0 | 10.0 | 3/3 | 4.43s |
| #26 | Claude Opus 4.6 medium | Anthropic | 7.0 | 6.6 | 2/3 | 4.60s |
| #46 | Kimi K2.5 none | Moonshot AI | 10.0 | 4.1 | 0/3 | 4.73s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 10.0 | 7.7 | 3/3 | 4.80s |
| #3 | GPT-5.3-Codex medium | OpenAI | 9.3 | 8.4 | 2/3 | 5.12s |
| #27 | GPT-5.2 medium | OpenAI | 7.0 | 6.5 | 2/3 | 5.47s |
| #37 | Qwen3.5-Flash none | Qwen | 1.3 | 5.2 | 0/3 | 5.90s |
| #5 | Gemini 3 Flash Preview low | Google | 10.0 | 8.2 | 3/3 | 6.11s |
| #2 | Gemini 3.1 Pro Preview medium | Google | 10.0 | 9.4 | 3/3 | 7.15s |
| #33 | DeepSeek V3.2 none | DeepSeek | 7.7 | 5.5 | 2/3 | 7.37s |
| #13 | Step 3.5 Flash medium | Stepfun | 4.0 | 7.4 | 1/3 | 7.72s |
| #30 | Grok 4.1 Fast medium | X AI | 4.0 | 6.2 | 1/3 | 8.08s |
| #9 | GPT-5.4 medium | OpenAI | 7.0 | 8.0 | 2/3 | 9.13s |
| #43 | MiniMax M2.5 medium | Minimax | 4.0 | 4.7 | 1/3 | 11.5s |
| #39 | gpt-oss-120b medium | OpenAI | 1.7 | 5.1 | 0/3 | 11.8s |
| #52 | GLM 4.7 Flash medium | Z.ai | 10.0 | 3.1 | 0/3 | 12.9s |
| #32 | GPT-5 Mini medium | OpenAI | 4.3 | 6.0 | 1/3 | 14.1s |
| #14 | GLM 5 medium | Z.ai | 10.0 | 7.4 | 3/3 | 15.6s |
| #10 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 7.7 | 3/3 | 17.2s |
| #34 | GPT-5 Nano medium | OpenAI | 4.0 | 5.5 | 1/3 | 19.8s |
| #48 | Qwen3 Coder Next none | Qwen | 1.3 | 4.0 | 0/3 | 22.9s |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 7.0 | 6.9 | 2/3 | 25.9s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 4.0 | 5.5 | 1/3 | 31.6s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 10.0 | 8.3 | 3/3 | 34.6s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 7.0 | 7.3 | 2/3 | 36.9s |
| #28 | Kimi K2.5 medium | Moonshot AI | 4.0 | 6.4 | 1/3 | 45.4s |
| #8 | Gemini 3.1 Flash Lite Preview high | Google | 7.0 | 8.2 | 2/3 | 46.3s |
| #24 | Qwen3.5-Flash medium | Qwen | 4.0 | 6.9 | 1/3 | 56.7s |
| #7 | Qwen3.5-27B medium | Qwen | 8.3 | 8.2 | 2/3 | 64.6s |
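The Response Time column mixes milliseconds ("533ms") and seconds ("4.43s"), so re-sorting the table programmatically needs a unit-aware parser. A minimal Python sketch (the row format matches this table; field names are my own) that parses rows and orders them by latency, as the page does:

```python
def parse_time(s: str) -> float:
    """Convert a response-time string like '533ms' or '1.06s' to seconds."""
    if s.endswith("ms"):
        return float(s[:-2]) / 1000.0
    return float(s.rstrip("s"))

def parse_row(line: str) -> dict:
    """Split one markdown table row into named fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    rank, model, company, score, avg, correct, time = cells
    return {
        "rank": int(rank.lstrip("#")),
        "model": model,
        "company": company,
        "puzzle_score": float(score),
        "avg_score": float(avg),
        "tests_passed": correct,
        "time_s": parse_time(time),
    }

# Two sample rows from the table above.
rows = [
    "| #1 | Gemini 3 Flash Preview medium | Google | 10.0 | 10.0 | 3/3 | 4.43s |",
    "| #51 | Mercury 2 none | Inception | 10.0 | 3.4 | 0/3 | 533ms |",
]
by_latency = sorted((parse_row(r) for r in rows), key=lambda d: d["time_s"])
print(by_latency[0]["model"])  # fastest model first
```

Note the sort key is numeric seconds, not the raw string: a lexical sort would put "1.06s" before "533ms".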