AI BENCHY
❤️ Made by XCS

AI BENCHY Category

Puzzle Solving Leaderboard

See which AI models perform best at Puzzle Solving, which ones stay consistent, and where the biggest gaps are. Sorted by: metric, ascending (↑).

Models shown: 55

Average Puzzle Solving Score: 6.5

| Rank | Model | Company | Puzzle Solving Score | Average score | Correct attempts | Response time (average) |
|---|---|---|---|---|---|---|
| #37 | Qwen3.5-Flash none | Qwen | 1.3 | 5.2 | 0/3 | 5.90s |
| #48 | Qwen3 Coder Next none | Qwen | 1.3 | 4.0 | 0/3 | 22.9s |
| #53 | Grok 4.1 Fast none | X AI | 1.3 | 2.9 | 0/3 | 1.28s |
| #36 | Mercury 2 medium | Inception | 1.7 | 5.3 | 0/3 | 934ms |
| #39 | gpt-oss-120b medium | OpenAI | 1.7 | 5.1 | 0/3 | 11.8s |
| #42 | Qwen3.5-35B-A3B none | Qwen | 1.7 | 4.7 | 0/3 | 1.34s |
| #47 | GPT-4o-mini none | OpenAI | 2.3 | 4.0 | 0/3 | 1.30s |
| #55 | LFM2-24B-A2B none | Liquid | 3.3 | 2.6 | 0/3 | 1.69s |
| #49 | GLM 4.7 Flash none | Z.ai | 3.7 | 3.9 | 0/3 | 1.00s |
| #13 | Step 3.5 Flash medium | Stepfun | 4.0 | 7.4 | 1/3 | 7.72s |
| #24 | Qwen3.5-Flash medium | Qwen | 4.0 | 6.9 | 1/3 | 56.7s |
| #28 | Kimi K2.5 medium | Moonshot AI | 4.0 | 6.4 | 1/3 | 45.4s |
| #30 | Grok 4.1 Fast medium | X AI | 4.0 | 6.2 | 1/3 | 8.08s |
| #34 | GPT-5 Nano medium | OpenAI | 4.0 | 5.5 | 1/3 | 19.8s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 4.0 | 5.5 | 1/3 | 31.6s |
| #40 | Qwen3.5-122B-A10B none | Qwen | 4.0 | 5.0 | 1/3 | 982ms |
| #43 | MiniMax M2.5 medium | Minimax | 4.0 | 4.7 | 1/3 | 11.5s |
| #44 | GPT-5.4 none | OpenAI | 4.0 | 4.5 | 1/3 | 1.52s |
| #45 | Trinity Large Preview none | Arcee AI | 4.0 | 4.2 | 1/3 | 3.30s |
| #32 | GPT-5 Mini medium | OpenAI | 4.3 | 6.0 | 1/3 | 14.1s |
| #38 | Gemini 2.5 Flash none | Google | 4.7 | 5.2 | 1/3 | 576ms |
| #41 | Qwen3.5-27B none | Qwen | 6.3 | 4.9 | 1/3 | 1.37s |
| #8 | Gemini 3.1 Flash Lite Preview high | Google | 7.0 | 8.2 | 2/3 | 46.3s |
| #9 | GPT-5.4 medium | OpenAI | 7.0 | 8.0 | 2/3 | 9.13s |
| #12 | Gemini 3.1 Flash Lite Preview medium | Google | 7.0 | 7.5 | 2/3 | 3.58s |
| #15 | GPT-5.2 Chat none | OpenAI | 7.0 | 7.4 | 2/3 | 4.42s |
| #16 | Gemini 2.5 Flash medium | Google | 7.0 | 7.4 | 2/3 | 3.94s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 7.0 | 7.3 | 2/3 | 36.9s |
| #20 | Gemini 3 Flash Preview none | Google | 7.0 | 7.2 | 2/3 | 1.06s |
| #21 | MiMo-V2-Flash medium | Xiaomi | 7.0 | 7.2 | 2/3 | 3.77s |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 7.0 | 6.9 | 2/3 | 25.9s |
| #25 | Claude Sonnet 4.6 none | Anthropic | 7.0 | 6.8 | 2/3 | 2.92s |
| #26 | Claude Opus 4.6 medium | Anthropic | 7.0 | 6.6 | 2/3 | 4.60s |
| #27 | GPT-5.2 medium | OpenAI | 7.0 | 6.5 | 2/3 | 5.47s |
| #29 | Qwen3.5 Plus 2026-02-15 none | Qwen | 7.0 | 6.2 | 2/3 | 2.82s |
| #31 | GLM 5 none | Z.ai | 7.0 | 6.0 | 2/3 | 2.05s |
| #33 | DeepSeek V3.2 none | DeepSeek | 7.7 | 5.5 | 2/3 | 7.37s |
| #7 | Qwen3.5-27B medium | Qwen | 8.3 | 8.2 | 2/3 | 64.6s |
| #3 | GPT-5.3-Codex medium | OpenAI | 9.3 | 8.4 | 2/3 | 5.12s |
| #1 | Gemini 3 Flash Preview medium | Google | 10.0 | 10.0 | 3/3 | 4.43s |
| #2 | Gemini 3.1 Pro Preview medium | Google | 10.0 | 9.4 | 3/3 | 7.15s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 10.0 | 8.3 | 3/3 | 34.6s |
| #5 | Gemini 3 Flash Preview low | Google | 10.0 | 8.2 | 3/3 | 6.11s |
| #6 | Gemini 3 Pro Preview medium | Google | 10.0 | 8.2 | 3/3 | 3.91s |
| #10 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 7.7 | 3/3 | 17.2s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 10.0 | 7.7 | 3/3 | 4.80s |
| #14 | GLM 5 medium | Z.ai | 10.0 | 7.4 | 3/3 | 15.6s |
| #17 | Gemini 3.1 Flash Lite Preview low | Google | 10.0 | 7.3 | 3/3 | 2.76s |
| #19 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.3 | 3/3 | 2.93s |
| #22 | Gemini 3.1 Flash Lite Preview none | Google | 10.0 | 7.1 | 3/3 | 972ms |
| #46 | Kimi K2.5 none | Moonshot AI | 10.0 | 4.1 | 0/3 | 4.73s |
| #50 | Qwen3 Coder Next medium | Qwen | 10.0 | 3.5 | 0/3 | 2.30s |
| #51 | Mercury 2 none | Inception | 10.0 | 3.4 | 0/3 | 533ms |
| #52 | GLM 4.7 Flash medium | Z.ai | 10.0 | 3.1 | 0/3 | 12.9s |
| #54 | MiMo-V2-Flash none | Xiaomi | 10.0 | 2.9 | 0/3 | 1.38s |
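
The page does not say how the 6.5 average is computed. As a rough check, the sketch below assumes it is a plain arithmetic mean of the 55 listed Puzzle Solving Scores. The `score_counts` mapping and the `mean_score` helper are illustrative names, not part of the site; the counts were tallied by hand from the rows above.

```python
# Puzzle Solving Score -> number of rows with that score, tallied from the
# leaderboard rows above (hypothetical helper data, not from any site API).
score_counts = {
    1.3: 3, 1.7: 3, 2.3: 1, 3.3: 1, 3.7: 1,
    4.0: 10, 4.3: 1, 4.7: 1, 6.3: 1,
    7.0: 14, 7.7: 1, 8.3: 1, 9.3: 1, 10.0: 16,
}


def mean_score(counts):
    """Return (arithmetic mean, row count) for a score -> count mapping."""
    total_rows = sum(counts.values())
    total_score = sum(score * n for score, n in counts.items())
    return total_score / total_rows, total_rows


mean, rows = mean_score(score_counts)
print(rows, round(mean, 1))  # prints: 55 6.5
```

Under that assumption the listed rows reproduce both the model count (55) and the reported average (6.5 after rounding to one decimal place).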

Top models by Puzzle Solving Score

Puzzle Solving Score vs. total cost

Top models by Response time (average)
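
Response times in the leaderboard mix two units (for example `533ms` and `64.6s`), so they need to be normalized before models can be ranked by speed. The sketch below is one way to do that; `parse_latency` is a hypothetical helper, and `sample` is only a small subset of rows copied from the leaderboard, not the full data set.

```python
def parse_latency(text: str) -> float:
    """Convert a latency string such as '934ms' or '5.90s' to seconds."""
    text = text.strip()
    if text.endswith("ms"):  # check 'ms' first, since it also ends with 's'
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unrecognized latency format: {text!r}")


# A few (model, response time) pairs copied from the leaderboard rows.
sample = [
    ("Mercury 2 none", "533ms"),
    ("Gemini 2.5 Flash none", "576ms"),
    ("GLM 4.7 Flash none", "1.00s"),
    ("Qwen3.5-27B medium", "64.6s"),
    ("Gemini 3.1 Flash Lite Preview none", "972ms"),
]

# Sort fastest first by comparing everything in seconds.
fastest_first = sorted(sample, key=lambda row: parse_latency(row[1]))
for name, latency in fastest_first:
    print(f"{name}: {parse_latency(latency):.3f} s")
```

Sorting on the raw strings would give a misleading order, which is why the comparison is done on the parsed value in seconds.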