#58

gpt-oss-120b

OpenAI · Release: 2025-08-05 · openai/gpt-oss-120b::medium

Score

6.0

Consistency

7.6

Total Cost

$0.010

Total Output Tokens

47,595

Input Price

$0.039 / 1M

Output Price

$0.190 / 1M
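
The per-million-token prices can be related to the token count with simple arithmetic. The sketch below is only an illustration: it computes the output-side cost from the 47,595 output tokens and the $0.190 / 1M output price listed on this page. The input token count is not shown here, so the input-side cost is left out, and the helper name `token_cost` is hypothetical.

```python
# Figures copied from this page; the input token count is not listed,
# so only the output-side cost can be reproduced.
OUTPUT_TOKENS = 47_595
OUTPUT_PRICE_PER_M = 0.190  # USD per 1M output tokens
INPUT_PRICE_PER_M = 0.039   # USD per 1M input tokens (unused: no input count given)


def token_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of `tokens` at a price quoted per 1M tokens."""
    return tokens * price_per_million / 1_000_000


output_cost = token_cost(OUTPUT_TOKENS, OUTPUT_PRICE_PER_M)
print(f"output cost: ${output_cost:.5f}")  # output cost: $0.00904
```

The result, about $0.009, covers output tokens only. Any gap between it and a total cost figure would presumably come from input tokens, which this page does not itemize.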

Wrong Tests: 10

Attempt pass rate: 51.0%

Flaky tests

5

Flaky tests had mixed outcomes across runs (at least one pass and one fail).
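
The flaky-test definition above can be sketched in a few lines. This is a minimal illustration that assumes each test's runs are recorded as a list of booleans. The function names, the data layout, and the sample results are hypothetical and are not taken from the benchmark's own code. It also shows one plausible reading of "attempt pass rate" as passed attempts divided by all attempts.

```python
from typing import Dict, List


def classify_tests(runs: Dict[str, List[bool]]) -> Dict[str, str]:
    """Label each test 'pass', 'fail', or 'flaky' from its per-run outcomes."""
    labels: Dict[str, str] = {}
    for name, outcomes in runs.items():
        if not outcomes:
            raise ValueError(f"test {name!r} has no recorded runs")
        if all(outcomes):
            labels[name] = "pass"    # every run passed
        elif not any(outcomes):
            labels[name] = "fail"    # every run failed
        else:
            labels[name] = "flaky"   # at least one pass and one fail
    return labels


def attempt_pass_rate(runs: Dict[str, List[bool]]) -> float:
    """Fraction of individual attempts that passed, across all tests."""
    attempts = [outcome for outcomes in runs.values() for outcome in outcomes]
    return sum(attempts) / len(attempts)


# Hypothetical sample data, for illustration only.
sample = {
    "test_a": [True, True],
    "test_b": [True, False],
    "test_c": [False, False],
}
print(classify_tests(sample))
print(attempt_pass_rate(sample))
```

With this sample, `test_b` is the only flaky test, and three of the six attempts pass.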

Response Time (avg)

15.05s

Response Time (max): 50.92s

Response Time (total): 150.55s

Wrong answer: 6 · Did not follow instructions: 4

Charts

Top Models by Score · Score vs Total Cost · Response Time (avg) · Score vs Response Time (avg) · Total Output Tokens · Score vs Total Output Tokens

Category Breakdown

Category	Score	Consistency	Tests Correct
Anti-AI Tricks	6.7	9.9
Combined	10.0	10.0
Data parsing and extraction	6.4	5.9
Domain specific	2.9	4.4
General Intelligence	4.3	10.0
Instructions following	9.9	10.0
Puzzle Solving	3.2	4.7
Tool Calling	9.8	10.0

Compared models