AI BENCHY Category
Data Parsing and Extraction Rankings
See which AI models perform best at data parsing and extraction, which remain reliable, and where the biggest gaps appear. Sorted by: response time (average) ↓.
| Rank | Model | Company | Data Parsing and Extraction Score | Average Score | Correct Tests | Response Time (avg) |
|---|---|---|---|---|---|---|
| #50 | Qwen3 Coder Next medium | Qwen | 5.4 | 3.5 | 1/2 | 81.8s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 5.5 | 5.5 | 1/2 | 59.3s |
| #24 | Qwen3.5-Flash medium | Qwen | 5.5 | 6.9 | 1/2 | 57.0s |
| #28 | Kimi K2.5 medium | Moonshot AI | 9.9 | 6.4 | 2/2 | 49.8s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 9.9 | 8.3 | 2/2 | 46.9s |
| #46 | Kimi K2.5 none | Moonshot AI | 5.4 | 4.1 | 1/2 | 42.1s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 9.9 | 7.3 | 2/2 | 36.1s |
| #7 | Qwen3.5-27B medium | Qwen | 9.9 | 8.2 | 2/2 | 30.3s |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 9.9 | 6.9 | 2/2 | 24.3s |
| #10 | Qwen3.5-122B-A10B medium | Qwen | 9.9 | 7.7 | 2/2 | 23.4s |
| #34 | GPT-5 Nano medium | OpenAI | 10.0 | 5.5 | 0/2 | 21.4s |
| #54 | MiMo-V2-Flash none | Xiaomi | 10.0 | 2.9 | 0/2 | 19.7s |
| #13 | Step 3.5 Flash medium | Stepfun | 10.0 | 7.4 | 2/2 | 15.0s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 9.9 | 7.7 | 2/2 | 13.9s |
| #32 | GPT-5 Mini medium | OpenAI | 9.9 | 6.0 | 2/2 | 12.6s |
| #6 | Gemini 3 Pro Preview medium | Google | 9.9 | 8.2 | 2/2 | 10.8s |
| #33 | DeepSeek V3.2 none | DeepSeek | 5.4 | 5.5 | 1/2 | 9.42s |
| #5 | Gemini 3 Flash Preview low | Google | 9.9 | 8.2 | 2/2 | 9.40s |
| #14 | GLM 5 medium | Z.ai | 5.0 | 7.4 | 1/2 | 8.90s |
| #2 | Gemini 3.1 Pro Preview medium | Google | 9.9 | 9.4 | 2/2 | 7.72s |
| #43 | MiniMax M2.5 medium | Minimax | 10.0 | 4.7 | 0/2 | 7.48s |
| #26 | Claude Opus 4.6 medium | Anthropic | 9.9 | 6.6 | 2/2 | 7.37s |
| #8 | Gemini 3.1 Flash Lite Preview high | Google | 9.9 | 8.2 | 2/2 | 7.16s |
| #30 | Grok 4.1 Fast medium | X AI | 9.9 | 6.2 | 2/2 | 6.63s |
| #31 | GLM 5 none | Z.ai | 9.9 | 6.0 | 2/2 | 5.78s |
| #9 | GPT-5.4 medium | OpenAI | 9.9 | 8.0 | 2/2 | 5.32s |
| #49 | GLM 4.7 Flash none | Z.ai | 5.4 | 3.9 | 1/2 | 4.82s |
| #1 | Gemini 3 Flash Preview medium | Google | 9.9 | 10.0 | 2/2 | 4.72s |
| #16 | Gemini 2.5 Flash medium | Google | 9.9 | 7.4 | 2/2 | 4.06s |
| #25 | Claude Sonnet 4.6 none | Anthropic | 9.9 | 6.8 | 2/2 | 3.43s |
| #45 | Trinity Large Preview none | Arcee AI | 9.9 | 4.2 | 2/2 | 3.26s |
| #27 | GPT-5.2 medium | OpenAI | 9.9 | 6.5 | 2/2 | 3.15s |
| #3 | GPT-5.3-Codex medium | OpenAI | 9.9 | 8.4 | 2/2 | 3.07s |
| #15 | GPT-5.2 Chat none | OpenAI | 9.9 | 7.4 | 2/2 | 3.05s |
| #17 | Gemini 3.1 Flash Lite Preview low | Google | 9.9 | 7.3 | 2/2 | 3.00s |
| #12 | Gemini 3.1 Flash Lite Preview medium | Google | 9.9 | 7.5 | 2/2 | 2.29s |
| #19 | GPT-5.3 Chat none | OpenAI | 9.9 | 7.3 | 2/2 | 2.21s |
| #39 | gpt-oss-120b medium | OpenAI | 5.5 | 5.1 | 1/2 | 1.98s |
| #29 | Qwen3.5 Plus 2026-02-15 none | Qwen | 9.9 | 6.2 | 2/2 | 1.89s |
| #37 | Qwen3.5-Flash none | Qwen | 9.9 | 5.2 | 2/2 | 1.57s |
| #52 | GLM 4.7 Flash medium | Z.ai | 5.0 | 3.1 | 1/2 | 1.51s |
| #41 | Qwen3.5-27B none | Qwen | 9.9 | 4.9 | 2/2 | 1.43s |
| #20 | Gemini 3 Flash Preview none | Google | 9.9 | 7.2 | 2/2 | 1.41s |
| #48 | Qwen3 Coder Next none | Qwen | 5.4 | 4.0 | 1/2 | 1.32s |
| #47 | GPT-4o-mini none | OpenAI | 9.9 | 4.0 | 2/2 | 1.27s |
| #22 | Gemini 3.1 Flash Lite Preview none | Google | 9.9 | 7.1 | 2/2 | 1.22s |
| #42 | Qwen3.5-35B-A3B none | Qwen | 9.9 | 4.7 | 2/2 | 1.16s |
| #36 | Mercury 2 medium | Inception | 5.5 | 5.3 | 1/2 | 1.11s |
| #44 | GPT-5.4 none | OpenAI | 9.9 | 4.5 | 2/2 | 1.04s |
| #40 | Qwen3.5-122B-A10B none | Qwen | 9.9 | 5.0 | 2/2 | 1.01s |
| #53 | Grok 4.1 Fast none | X AI | 9.9 | 2.9 | 2/2 | 943ms |
| #55 | LFM2-24B-A2B none | Liquid | 10.0 | 2.6 | 0/2 | 714ms |
| #51 | Mercury 2 none | Inception | 5.5 | 3.4 | 1/2 | 667ms |
| #38 | Gemini 2.5 Flash none | Google | 9.9 | 5.2 | 2/2 | 652ms |
| #21 | MiMo-V2-Flash medium | Xiaomi | 5.5 | 7.2 | 1/2 | 0ms |
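
The response-time column mixes seconds and milliseconds (for example `81.8s` and `943ms`), so anyone re-sorting the table needs to normalize the units first. The following is a minimal sketch of one way to do that; the function name `parse_response_time` and the assumption that only the `s` and `ms` suffixes occur are illustrative and not part of the leaderboard itself.

```python
import re

def parse_response_time(text: str) -> float:
    """Convert a value such as '81.8s' or '943ms' to seconds (hypothetical helper)."""
    # Assumption: only the 's' and 'ms' suffixes appear in the column.
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(ms|s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized response time: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value

# A few values copied from the table above.
samples = ["81.8s", "943ms", "1.01s", "652ms"]

# Slowest first, matching the descending sort used by the table.
ordered = sorted(samples, key=parse_response_time, reverse=True)
print(ordered)
```

Sorting the raw strings would compare text rather than durations, which is why the key function converts everything to seconds before comparing.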