AI BENCHY Compare
xAI: Grok 4.20 vs xAI: Grok 4.3
Benchmark generated from the AI BENCHY test suite on: 2026-05-01
| Metric | Grok 4.20 (medium) | Grok 4.3 (medium) |
|---|---|---|
| Score | 7.0 | 8.2 |
| Rank | #63 | #20 |
| Reliability | N/A | 10.0 |
| Consistency | 7.8 | 8.6 |
| Correct tests | | |
| Pass rate per attempt | 66.7% | 81.5% |
| Flaky tests | 5 | 3 |
| Total runs | 54 | 54 |
| Cost per result | 8.252 | 3.974 |
| Total cost | $0.743 | $0.517 |
| Input price | $2.000 / 1M tokens | $1.250 / 1M tokens |
| Output price | $6.000 / 1M tokens | $2.500 / 1M tokens |
| Output tokens | 1,744 | 1,223 |
| Reasoning tokens | 109,882 | 187,047 |
| Response time (average) | 10.33s | 48.63s |
| Response time (max) | 29.87s | 216.69s |
| Response time (total) | 185.87s | 875.27s |
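
A few quantities implied by the table can be checked with simple arithmetic. The sketch below is only an illustration: the `derived` helper and the dictionary keys are hypothetical names, not part of AI BENCHY, and the interpretation of each derived value is an assumption. In particular, dividing total response time by average response time gives about 18 for both models, which suggests the average is taken over 18 items (possibly tests, with 54 runs being three attempts each), but the page does not state this.

```python
# Figures copied from the comparison table; key names are hypothetical.
models = {
    "Grok 4.20": {"pass_rate": 0.667, "runs": 54, "total_cost": 0.743,
                  "avg_time_s": 10.33, "total_time_s": 185.87},
    "Grok 4.3": {"pass_rate": 0.815, "runs": 54, "total_cost": 0.517,
                 "avg_time_s": 48.63, "total_time_s": 875.27},
}

def derived(m):
    """Quantities implied by the table (illustrative helper, not an AI BENCHY API)."""
    return {
        # Pass rate per attempt times total runs: approximate number of passing runs.
        "passing_runs": round(m["pass_rate"] * m["runs"]),
        # Total time over average time: how many items the average appears to cover.
        "timed_items": round(m["total_time_s"] / m["avg_time_s"]),
        # Total cost spread evenly over all runs, in dollars.
        "cost_per_run": m["total_cost"] / m["runs"],
    }

summary = {name: derived(m) for name, m in models.items()}
for name, d in summary.items():
    print(name, d)
```

Under these assumptions, Grok 4.3 passes more runs at a lower cost per run, while taking several times longer per response.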
[Charts: Score vs. total cost; Response time (average); Score vs. response time (average); Total output tokens; Score vs. total output tokens; Category breakdown]