AI BENCHY

AI BENCHY Failures

Timeout Failures

See which AI models time out most often, so you can gauge reliability risk before choosing one. Sorted by: response time (average), descending.

Models shown: 15

Total failures: 25

Most affected model: Qwen3.5-Flash (3)

| Rank | Model | Company | Timeout count | Average score | Tests correct | Response time (average) |
|---|---|---|---|---|---|---|
| #24 | Qwen3.5-Flash medium | Qwen | 3 | 6.9 | 10/16 | 70.8s |
| #28 | Kimi K2.5 medium | Moonshot AI | 1 | 6.4 | 9/16 | 69.8s |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 4 | 6.9 | 10/16 | 65.1s |
| #7 | Qwen3.5-27B medium | Qwen | 1 | 8.2 | 12/16 | 52.1s |
| #34 | GPT-5 Nano medium | OpenAI | 1 | 5.5 | 7/16 | 47.9s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 4 | 5.5 | 8/16 | 43.9s |
| #43 | MiniMax M2.5 medium | Minimax | 2 | 4.7 | 5/16 | 43.0s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 1 | 7.3 | 11/16 | 39.5s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 2 | 8.3 | 13/16 | 34.5s |
| #10 | Qwen3.5-122B-A10B medium | Qwen | 1 | 7.7 | 12/16 | 29.7s |
| #30 | Grok 4.1 Fast medium | X AI | 1 | 6.2 | 9/16 | 26.3s |
| #32 | GPT-5 Mini medium | OpenAI | 1 | 6.0 | 8/16 | 25.1s |
| #14 | GLM 5 medium | Z.ai | 1 | 7.4 | 11/16 | 16.2s |
| #27 | GPT-5.2 medium | OpenAI | 1 | 6.5 | 10/16 | 15.3s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 1 | 7.7 | 12/16 | 11.2s |
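
The summary figures and the sort order can be cross-checked against the table rows. The following is a minimal sketch in Python, assuming the rows are transcribed by hand into a list of tuples; the variable names are illustrative and not part of AI BENCHY.

```python
# Rows transcribed from the table:
# (model, timeout count, average response time in seconds).
rows = [
    ("Qwen3.5-Flash", 3, 70.8),
    ("Kimi K2.5", 1, 69.8),
    ("Seed-2.0-Mini", 4, 65.1),
    ("Qwen3.5-27B", 1, 52.1),
    ("GPT-5 Nano", 1, 47.9),
    ("Qwen3.5-35B-A3B", 4, 43.9),
    ("MiniMax M2.5", 2, 43.0),
    ("DeepSeek V3.2", 1, 39.5),
    ("Qwen3.5 Plus 2026-02-15", 2, 34.5),
    ("Qwen3.5-122B-A10B", 1, 29.7),
    ("Grok 4.1 Fast", 1, 26.3),
    ("GPT-5 Mini", 1, 25.1),
    ("GLM 5", 1, 16.2),
    ("GPT-5.2", 1, 15.3),
    ("Claude Sonnet 4.6", 1, 11.2),
]

# Number of models listed and the sum of their timeout counts.
models_shown = len(rows)
total_timeouts = sum(count for _, count, _ in rows)

# Check that the rows are ordered by average response time, slowest first.
times = [seconds for _, _, seconds in rows]
sorted_desc = times == sorted(times, reverse=True)

print(models_shown, total_timeouts, sorted_desc)  # 15 25 True
```

The counts agree with the "models shown" and "total failures" figures, and the rows are in descending order of average response time.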

Chart: Top models by timeout count

Chart: Timeout count vs. average score

Chart: Top models by response time (average)