AI BENCHY Category
Parsing and Data Extraction Rankings
See which AI models perform best at parsing and data extraction, which stay reliable, and where the largest gaps appear.
Models shown
15
Average parsing and data extraction score
8.7
Top model
DeepSeek V4 Flash 10.0

| Rank | Model | Company | Parsing and data extraction score | Score | Tests passed | Response time (avg) |
|---|---|---|---|---|---|---|
| #31 | DeepSeek V4 Flash high | DeepSeek | 10.0 | 7.7 | 2/2 | 28.0s |
| #62 | Step 3.5 Flash medium | Stepfun | 10.0 | 7.2 | 2/2 | 15.0s |
| #139 | DeepSeek V4 Flash none | DeepSeek | 10.0 | 5.0 | 2/2 | 23.8s |
| #1 | Gemini 3 Flash Preview medium | Google | 10.0 | 9.8 | 2/2 | 5.43s |
| #2 | Gemini 3.5 Flash high | Google | 10.0 | 9.6 | 2/2 | 6.43s |
| #3 | Gemini 3.5 Flash low | Google | 10.0 | 9.4 | 2/2 | 1.81s |
| #4 | Gemini 3.1 Pro Preview medium | Google | 10.0 | 9.4 | 2/2 | 7.72s |
| #5 | Qwen3.7 Max medium | Qwen | 10.0 | 9.1 | 2/2 | 8.80s |
| #6 | GPT-5.5 low | OpenAI | 10.0 | 9.0 | 2/2 | 3.28s |
| #7 | Gemini 3.5 Flash medium | Google | 10.0 | 9.0 | 2/2 | 4.07s |
| #8 | Claude Opus 4.7 none | Anthropic | 10.0 | 8.9 | 2/2 | 2.15s |
| #9 | GPT-5.5 medium | OpenAI | 10.0 | 8.8 | 2/2 | 4.18s |
| #11 | Claude Opus 4.7 medium | Anthropic | 10.0 | 8.7 | 2/2 | 2.37s |
| #12 | Gemini 3.1 Flash Lite Preview high | Google | 10.0 | 8.6 | 2/2 | 7.16s |
| #13 | Grok 4.20 Beta medium | X AI | 10.0 | 8.5 | 2/2 | 4.01s |