AI BENCHY category failures
Anti-AI tricks: Extra formatting
See which AI models are most likely to produce the Extra formatting failure in the Anti-AI tricks category, so you can spot weaknesses quickly. Sorted by: Response time (average), ascending.
Failure reasons
| Rank | Model | Company | Extra formatting count | Category score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #42 | Claude Sonnet 4.6 none | Anthropic | 2 | 4.8 | 1/4 | 2.94s |
| #26 | Claude Sonnet 4.6 medium | Anthropic | 1 | 6.5 | 2/4 | 2.98s |
| #87 | Qwen3 Coder Next none | Qwen | 1 | 3.6 | 0/4 | 3.31s |
| #56 | Grok 4.20 Multi Agent Beta medium | X AI | 1 | 6.9 | 2/4 | 3.46s |
| #37 | Claude Opus 4.6 medium | Anthropic | 2 | 6.4 | 2/4 | 7.45s |
| #64 | DeepSeek V3.2 none | DeepSeek | 2 | 3.2 | 0/4 | 7.63s |
| #41 | MiMo-V2-Flash medium | Xiaomi | 1 | 8.1 | 3/4 | 15.8s |
| #10 | Qwen3.5-27B medium | Qwen | 1 | 8.7 | 3/4 | 19.8s |