AI BENCHY

AI BENCHY Failures

Timeout Failures

See which AI models time out most often, so you can gauge reliability risk before choosing one.

Models shown: 15

Total failures: 59

Most affected model: Qwen3.5-9B (10 timeouts)

| Rank | Model | Company | Timeouts | Score | Tests passed | Avg. response time |
|---|---|---|---|---|---|---|
| #142 | Qwen3.5-9B medium | Qwen | 10 | 4.3 | 3/19 | 80.1s |
| #58 | Seed-2.0-Mini medium | Bytedance Seed | 4 | 7.2 | 11/19 | 68.9s |
| #59 | Qwen3.5-35B-A3B medium | Qwen | 4 | 7.2 | 10/19 | 51.5s |
| #108 | MiniMax M2.5 medium | Minimax | 4 | 5.5 | 5/19 | 43.4s |
| #45 | Qwen3.5-Flash medium | Qwen | 3 | 7.6 | 11/19 | 65.8s |
| #12 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 2 | 8.2 | 14/19 | 51.3s |
| #31 | Qwen3.5-122B-A10B medium | Qwen | 2 | 7.9 | 13/19 | 32.5s |
| #38 | Gemma 4 26B A4B medium | Google | 2 | 7.7 | 13/19 | 33.7s |
| #42 | Kimi K2.6 medium | Moonshot AI | 2 | 7.6 | 12/19 | 49.9s |
| #47 | GLM 5.1 medium | Z.ai | 2 | 7.6 | 12/19 | 24.4s |
| #61 | DeepSeek V3.2 medium | DeepSeek | 2 | 7.2 | 11/19 | 46.1s |
| #71 | Kimi K2.5 medium | Moonshot AI | 2 | 6.8 | 9/19 | 73.4s |
| #73 | Hunter Alpha medium | OpenRouter | 2 | 6.7 | 8/18 | 10.3s |
| #125 | MiniMax M2.7 medium | Minimax | 2 | 5.1 | 4/19 | 30.6s |
| #3 | Claude Opus 4.7 medium | Anthropic | 1 | 8.9 | 16/19 | 3.46s |

Top models by timeout count

Timeout count vs. score

Top models by average response time