AI BENCHY Category
Parsing and data extraction rankings
See which AI models perform best at parsing and data extraction, which ones stay reliable, and where the biggest gaps appear. Sorted by: Response time (average) ↓.
Models shown: 15
Average parsing and data extraction score: 8.7
Best model: Qwen3.5-9B 3.6

| Rank | Model | Company | Parsing and data extraction score | Score | Tests correct | Response time (average) |
|---|---|---|---|---|---|---|
| #40 | Gemini 3.1 Flash Lite Preview medium | Google | 10.0 | 7.5 | 2/2 | 2.29s |
| #57 | Step 3.7 Flash low | Stepfun | 7.3 | 7.3 | 1/2 | 2.29s |
| #85 | Gemma 4 31B none | Google | 10.0 | 6.5 | 2/2 | 2.25s |
| #63 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.2 | 2/2 | 2.21s |
| #8 | Claude Opus 4.7 none | Anthropic | 10.0 | 8.9 | 2/2 | 2.15s |
| #128 | Qwen3.6 Flash none | Qwen | 10.0 | 5.4 | 2/2 | 2.13s |
| #118 | Qwen3.6 27B none | Qwen | 7.3 | 5.6 | 1/2 | 2.06s |
| #99 | gpt-oss-120b medium | OpenAI | 6.4 | 6.1 | 1/2 | 1.98s |
| #104 | Nemotron 3 Ultra 550b A55b none | NVIDIA | 10.0 | 6.0 | 2/2 | 1.94s |
| #95 | Qwen3.5 Plus 2026-02-15 none | Qwen | 10.0 | 6.3 | 2/2 | 1.89s |
| #110 | Seed-2.0-Lite none | Bytedance Seed | 10.0 | 5.8 | 2/2 | 1.82s |
| #3 | Gemini 3.5 Flash low | Google | 10.0 | 9.4 | 2/2 | 1.81s |
| #68 | Claude Opus 4.8 none | Anthropic | 7.3 | 7.0 | 1/2 | 1.77s |
| #101 | Mimo V2 Omni none | Xiaomi | 10.0 | 6.0 | 2/2 | 1.76s |
| #102 | Gemma 4 26B A4B none | Google | 10.0 | 6.0 | 2/2 | 1.70s |