AI BENCHY
❤️ Made by XCS

AI BENCHY category failures

Domain-specific
Timeout

See which AI models are most likely to hit a Timeout in the Domain-specific category, so you can find their weak points faster. Sorted by: Response time (average) ↓.

Models shown

14

Total failures

17

Most affected model

MiniMax M2.5 1
| Rank | Model | Company | Timeout count | Category score | Tests correct | Response time (average) |
|------|-------|---------|---------------|----------------|---------------|-------------------------|
| #43 | MiniMax M2.5 medium | Minimax | 1 | 10.0 | 0/3 | 237.3s |
| #34 | GPT-5 Nano medium | OpenAI | 1 | 4.0 | 1/3 | 204.0s |
| #24 | Qwen3.5-Flash medium | Qwen | 1 | 4.0 | 1/3 | 146.5s |
| #28 | Kimi K2.5 medium | Moonshot AI | 1 | 10.0 | 0/3 | 137.3s |
| #30 | Grok 4.1 Fast medium | X AI | 1 | 4.0 | 1/3 | 121.8s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 2 | 10.0 | 0/3 | 88.3s |
| #7 | Qwen3.5-27B medium | Qwen | 1 | 4.0 | 1/3 | 79.5s |
| #27 | GPT-5.2 medium | OpenAI | 1 | 4.0 | 1/3 | 77.8s |
| #32 | GPT-5 Mini medium | OpenAI | 1 | 10.0 | 0/3 | 44.6s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 1 | 4.0 | 1/3 | 39.3s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 1 | 4.0 | 1/3 | 17.5s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 1 | 10.0 | 0/3 | 0ms |
| #14 | GLM 5 medium | Z.ai | 1 | 10.0 | 0/3 | 0ms |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 3 | 10.0 | 0/3 | 0ms |

Top models by Timeout count

Timeout count vs average score

Top models by Response time (average)

Top models by Estimated wasted cost