AI BENCHY Category
Parsing and data extraction ranking
See which AI models perform best at parsing and data extraction, which ones stay reliable, and where the biggest gaps appear. Sorted by: Tests passed ↑.
| Rank | Model | Company | Parsing and data extraction score | Score | Tests passed | Response time (average) |
|---|---|---|---|---|---|---|
| #46 | Qwen3.6 35B A3B medium | Qwen | 10.0 | 7.4 | 2/2 | 13.0s |
| #47 | Grok Build 0.1 medium | X AI | 10.0 | 7.4 | 2/2 | 10.7s |
| #48 | Gemini 3 Flash Preview none | | 10.0 | 7.4 | 2/2 | 1.41s |
| #50 | Gemini 3.1 Flash Lite Preview low | | 10.0 | 7.4 | 2/2 | 3.00s |
| #52 | Claude Sonnet 4.6 medium | Anthropic | 10.0 | 7.4 | 2/2 | 13.9s |
| #53 | Gemini 3.1 Flash Lite high | | 10.0 | 7.3 | 2/2 | 4.49s |
| #54 | GPT-5 Mini medium | OpenAI | 10.0 | 7.3 | 2/2 | 12.6s |
| #55 | GLM 5.1 medium | Z.ai | 10.0 | 7.3 | 2/2 | 9.33s |
| #58 | Gemini 3.1 Flash Lite Preview none | | 10.0 | 7.2 | 2/2 | 1.22s |
| #59 | GLM 5V Turbo medium | Z.ai | 10.0 | 7.2 | 2/2 | 9.60s |
| #60 | Kimi K2.6 medium | Moonshot AI | 10.0 | 7.2 | 2/2 | 20.4s |
| #61 | Gemini 3.1 Flash Lite low | | 10.0 | 7.2 | 2/2 | 1.44s |
| #62 | Step 3.5 Flash medium | Stepfun | 10.0 | 7.2 | 2/2 | 15.0s |
| #63 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.2 | 2/2 | 2.21s |
| #65 | Grok 4.20 medium | X AI | 10.0 | 7.1 | 2/2 | 4.17s |