AI BENCHY Compare
OpenAI: GPT-5.4 vs Elephant
Last updated at: 2026-04-14
| Metric | GPT-5.4 (none) | Elephant (medium) |
|---|---|---|
| Score | 5.9 | 5.2 |
| Rank | #63 | #77 |
| Consistency | 9.1 | 9.6 |
| Tests Correct | | |
| Attempt pass rate | 42.6% | 29.6% |
| Flaky tests | 2 | 1 |
| Total Runs | 54 | 54 |
| Cost per result | 1.477 | 0.000 |
| Total Cost | $0.104 | $0.000 |
| Input Price | $2.500 / 1M | $0.000 / 1M |
| Output Price | $15.000 / 1M | $0.000 / 1M |
| Output Tokens | 2,317 | 2,596 |
| Reasoning Tokens | 0 | 0 |
| Response Time (avg) | 1.51s | 1.27s |
| Response Time (max) | 2.95s | 3.70s |
| Response Time (total) | 27.21s | 22.82s |
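The pass rates and output prices in the table can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the benchmark's tooling. It assumes that "Attempt pass rate" means passing attempts divided by "Total Runs", and that output cost is output tokens times the listed price per million tokens. The function names `implied_passes` and `output_cost_usd` are hypothetical.

```python
# Figures copied from the comparison table; nothing here is measured independently.
TOTAL_RUNS = 54

MODELS = {
    "GPT-5.4 (none)": {"pass_rate_pct": 42.6, "output_tokens": 2317, "output_price_per_m": 15.0},
    "Elephant (medium)": {"pass_rate_pct": 29.6, "output_tokens": 2596, "output_price_per_m": 0.0},
}


def implied_passes(pass_rate_pct: float, total_runs: int) -> int:
    """Passing attempts implied by a rounded pass rate.

    Assumption: pass rate = passing attempts / total runs.
    """
    return round(pass_rate_pct / 100 * total_runs)


def output_cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Cost of the output tokens alone, in USD (input tokens are not included)."""
    return output_tokens * price_per_million / 1_000_000


if __name__ == "__main__":
    for name, row in MODELS.items():
        passes = implied_passes(row["pass_rate_pct"], TOTAL_RUNS)
        cost = output_cost_usd(row["output_tokens"], row["output_price_per_m"])
        print(f"{name}: ~{passes}/{TOTAL_RUNS} passing attempts, output cost ${cost:.4f}")
```

Under these assumptions the listed rates correspond to 23 and 16 passing attempts out of 54. The output-token cost for GPT-5.4 comes out below the listed Total Cost of $0.104, which fits with the total also covering input tokens.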
[Charts: Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens; Category Breakdown]