Model leaderboard for Instruction Following

See which AI models perform best at Instruction Following, which ones stay reliable, and where the largest gaps appear. Sorted by: average response time ↓.

Models shown: 210/210

Average Instruction Following score: 8.5

Best model: Kimi K2.5 (10.0)

Failure reasons

Wrong answer: 61
Did not follow instructions: 18
Extra formatting: 3
No response: 2
API error: 1
Timeout: 1

Rank	Model	Company	Instruction Following score	Score	Total cost	Tests passed	Avg. response time
#77	Kimi K2.5 medium	Moonshot AI	10.0	7.0	$0.600	2/2	92.5s
#163	Gemini 3.1 Flash Lite Preview high	Google	9.8	5.3	$2.310	2/2	64.0s
#114	Qwen3.5-Flash medium	Qwen	10.0	6.2	$0.139	2/2	63.5s
#99	Qwen3.6 27B medium	Qwen	10.0	6.5	$0.779	2/2	38.0s
#76	DeepSeek V3.2 medium	DeepSeek	10.0	7.0	$0.078	2/2	35.8s
#135	Hy3 preview high	Tencent	10.0	5.9	$0.048	2/2	34.4s
#57	Qwen3.5 Plus 2026-02-15 medium	Qwen	10.0	7.5	$0.437	2/2	31.9s
#171	North Mini Code none	Cohere	6.5	5.1	$0.000	1/2	30.7s
#179	Ring-2.6-1T none	Inclusionai	9.8	4.8	$0.026	2/2	27.4s
#119	Qwen3.5-35B-A3B medium	Qwen	10.0	6.2	$0.837	2/2	24.4s
#19	Qwen3.6 Max Preview medium	Qwen	10.0	8.4	$1.143	2/2	24.3s
#143	Gemini 3.1 Flash Lite high	Google	7.3	5.6	$2.044	1/2	23.3s
#70	Qwen3.5 Plus 2026-04-20 medium	Qwen	10.0	7.2	$0.317	2/2	20.2s
#58	Qwen3.5-27B medium	Qwen	10.0	7.4	$1.627	2/2	19.7s
#73	Grok 4.3 medium	X AI	9.8	7.1	$0.779	2/2	18.6s
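For readers who want to work with the table programmatically, here is a minimal sketch that parses a few of the rows above and reproduces the page's sort order (average response time, descending). The `Entry` class, the `parse_row` helper, and the field names are hypothetical names chosen for this illustration. The four rows are copied from the table, so any result applies only to that sample, not to all 210 models.

```python
from dataclasses import dataclass

# Sample rows copied from the table above (tab-separated, same column order).
ROWS = """\
#77\tKimi K2.5 medium\tMoonshot AI\t10.0\t7.0\t$0.600\t2/2\t92.5s
#135\tHy3 preview high\tTencent\t10.0\t5.9\t$0.048\t2/2\t34.4s
#171\tNorth Mini Code none\tCohere\t6.5\t5.1\t$0.000\t1/2\t30.7s
#73\tGrok 4.3 medium\tX AI\t9.8\t7.1\t$0.779\t2/2\t18.6s
"""


@dataclass
class Entry:  # hypothetical container for one table row
    rank: int
    model: str
    company: str
    if_score: float      # Instruction Following score
    score: float         # overall score
    total_cost: float    # in US dollars
    passed: int
    total: int
    avg_seconds: float


def parse_row(line: str) -> Entry:
    """Split one tab-separated row and strip the '#', '$' and 's' decorations."""
    rank, model, company, if_score, score, cost, tests, secs = line.split("\t")
    passed, total = tests.split("/")
    return Entry(
        rank=int(rank.lstrip("#")),
        model=model,
        company=company,
        if_score=float(if_score),
        score=float(score),
        total_cost=float(cost.lstrip("$")),
        passed=int(passed),
        total=int(total),
        avg_seconds=float(secs.rstrip("s")),
    )


entries = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Same ordering as the page: slowest average response time first.
by_time_desc = sorted(entries, key=lambda e: e.avg_seconds, reverse=True)

# Cheapest entry in the sample with a perfect Instruction Following score.
perfect = [e for e in entries if e.if_score == 10.0]
cheapest_perfect = min(perfect, key=lambda e: e.total_cost)

for e in by_time_desc:
    print(f"{e.model:<28} {e.if_score:>5} ${e.total_cost:.3f} {e.avg_seconds:>6}s")
```

Keeping the parsing in one small function makes it straightforward to swap the sort key, for example to order by total cost or by Instruction Following score instead.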

Instruction Following leaderboard

Top models by Instruction Following score

Instruction Following score vs. total cost

Top models by average response time