#160 xAI: Grok 4.20
- Cost: $0.004
- Time: 6.5s
- Tokens: 1,367
Summary
Grok 4.20 scores 4.4 on AI BENCHY and ranks #160. Its reliability is not available (N/A); it has a 28.6% pass rate, a total cost of $0.057, and an average response time of 1.11s.
What makes Grok 4.20 unique: Its total benchmark cost is unusually low for its score range. It is notably fast compared with similar models.
Identity note
Grok 4.20 Beta was the preview version of Grok 4.20.
Score
4.4
Consistency
8.5
Reliability
N/A
Total Output Tokens
1,923
Total Input Tokens
41,313
Input Price
$1.250 / 1M
Output Price
$2.500 / 1M
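The token totals and per-million prices above can be combined into a rough cost estimate. The sketch below assumes the cost is simply input tokens times input price plus output tokens times output price; the page does not state its exact accounting, and `estimate_cost` is a hypothetical helper name. Under that assumption the estimate comes to about $0.0564, close to but slightly below the reported $0.057 total.

```python
from decimal import Decimal


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: str, output_price_per_m: str) -> Decimal:
    """Estimate USD cost from token counts and per-1M-token prices.

    Assumption: cost = tokens * price / 1,000,000 for each direction,
    with no extra fees, caching discounts, or rounding per request.
    """
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m) / million
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m) / million
    return input_cost + output_cost


# Figures taken from the current run on this page.
estimate = estimate_cost(41_313, 1_923, "1.250", "2.500")
print(f"${estimate:.4f}")
```

The small gap between this estimate and the reported total may come from rounding or from billed tokens not shown in the totals; that is a guess, not something the page confirms.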
Flaky tests
0
Flaky tests had mixed outcomes across runs (at least one pass and one fail).
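The flaky-test definition above can be expressed as a small check. This is a minimal sketch, assuming each test's outcomes are available as a list of pass/fail booleans; the function names and the sample data are hypothetical and not taken from AI BENCHY.

```python
def is_flaky(outcomes: list[bool]) -> bool:
    """Return True when the runs include at least one pass and one fail."""
    return any(outcomes) and not all(outcomes)


def count_flaky(results: dict[str, list[bool]]) -> int:
    """Count tests whose outcomes were mixed across runs."""
    return sum(is_flaky(outcomes) for outcomes in results.values())


# Hypothetical per-test outcomes across three runs (True = pass).
example = {
    "test_a": [True, True, True],     # always passes: not flaky
    "test_b": [False, False, False],  # always fails: not flaky
    "test_c": [True, False, True],    # mixed outcomes: flaky
}
print(count_flaky(example))  # 1
```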
Generation showcase
Prompt: Create a detailed SVG illustration of a hamster playing table tennis.
Run history
| Tested on | Score | Reliability | Tests Correct | Total Cost |
|---|---|---|---|---|
| 2026-05-06 14:16 Re-test (current run) | 5.4 | N/A | | $0.057 ↓ |
| 2026-05-06 14:16 Re-test | 5.4 | N/A | | $0.095 |
| 2026-05-06 14:16 Re-test | 5.4 | N/A | | $0.095 |
| 2026-05-06 14:16 Suite changed | 5.4 | N/A | | $0.095 |
| 2026-04-11 01:44 First recorded run | 5.2 | N/A | | $0.095 |
Run comparison
| Run | Score | Consistency | Reliability | Tests Correct | Flaky tests | Total Output Tokens | Total Input Tokens | Total Cost | Response Time (avg) |
|---|---|---|---|---|---|---|---|---|---|
| 2026-05-06 14:16 · Current run | 4.4 | 8.5 | N/A | 6/18 | 0 | 1,923 | 41,313 | $0.057 | 1.11s |
| 2026-04-11 01:44 · First recorded run | 5.2 | 9.5 | N/A | 5/18 | 1 | 1,967 | 0 | $0.095 | 1.11s |
| Difference | -0.9 | -1.0 | N/A | +1 | -1 | -44 | +41,313 | -$0.038 | -4ms |
These two runs used different benchmark suites, so the deltas reflect both model changes and suite changes.
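The Difference row is the current run minus the first recorded run, column by column. The sketch below recomputes it from the displayed values; the dictionary keys and `run_deltas` are hypothetical names. Because it works from rounded figures, the score delta comes out as -0.8 rather than the -0.9 shown in the table, which presumably was computed from unrounded scores (an assumption, as the page does not say).

```python
# Displayed values from the run comparison table.
current = {
    "score": 4.4, "consistency": 8.5, "tests_correct": 6, "flaky_tests": 0,
    "output_tokens": 1923, "input_tokens": 41313, "total_cost": 0.057,
}
first = {
    "score": 5.2, "consistency": 9.5, "tests_correct": 5, "flaky_tests": 1,
    "output_tokens": 1967, "input_tokens": 0, "total_cost": 0.095,
}


def run_deltas(current_run: dict, baseline_run: dict) -> dict:
    """Subtract the baseline from the current run for every shared metric."""
    return {
        key: round(current_run[key] - baseline_run[key], 3)
        for key in current_run
        if key in baseline_run
    }


deltas = run_deltas(current, first)
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+}")
```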
Price History
Historical pricing data for this model from OpenRouter.
| Date | Input Price | Output Price |
|---|---|---|
| 2026-06-04 15:40 | $1.250 / 1M | $2.500 / 1M |
Scores by category
| Category | Score | Consistency | Tests Correct |
|---|---|---|---|
| Anti-AI Tricks | 4.8 | 10.0 | |
| Coding | 1.1 | 3.1 | |
| Combined | 3.0 | 10.0 | |
| Data parsing and extraction | 10.0 | 10.0 | |
| Domain specific | 3.0 | 10.0 | |
| General Intelligence | 4.8 | 10.0 | |
| Instructions following | 6.3 | 10.0 | |
| Puzzle Solving | 5.3 | 10.0 | |
| Tool Calling | 10.0 | 10.0 | |
| Trivia | 0.0 | 0.0 | |