AI BENCHY

AI BENCHY failures

Invalid tool call failures

See which AI models run into invalid tool calls most often, so you can spot reliability risks before choosing one. Sorted by: Score ↑.

Models shown

9

Total failures

26

Most affected model

Granite 4.1 8B 1
| Rank | Model | Company | Invalid tool calls | Score | Correct tests | Avg. response time |
|------|-------|---------|--------------------|-------|---------------|--------------------|
| #122 | GLM 4.7 Flash none | Z.ai | 1 | 5.5 | 6/21 | 2.86s |
| #119 | Cobuddy medium | Baidu | 1 | 5.6 | 7/21 | 39.9s |
| #118 | Qwen3.6 27B none | Qwen | 1 | 5.6 | 7/21 | 3.72s |
| #112 | GLM 5.1 none | Z.ai | 1 | 5.7 | 7/21 | 4.10s |
| #107 | Laguna Xs.2 medium | Poolside | 1 | 5.8 | 6/19 | 6.73s |
| #106 | Grok 4.20 Beta none | X AI | 1 | 5.8 | 6/18 | 1.19s |
| #78 | Qwen3.6 27B medium | Qwen | 1 | 6.8 | 10/21 | 59.7s |
| #59 | GLM 5V Turbo medium | Z.ai | 2 | 7.2 | 11/21 | 23.1s |
| #32 | Gemini 3.5 Flash minimal | Google | 1 | 7.7 | 14/21 | 1.57s |

[Charts: Top models by invalid tool call count; Invalid tool call count vs. score; Top models by average response time]