
AI BENCHY Compare

Nemotron 3 Super 120b A12b vs Z.ai: GLM 4.7 Flash

Last updated at: 2026-03-12

| Metric | Nemotron 3 Super 120b A12b (none) | GLM 4.7 Flash (medium) |
| --- | --- | --- |
| Release | 2026-03-11, Free Available | 2026-01-19 |
| Rank | #59 | #62 |
| Avg Score | 3.4 | 3.1 |
| Consistency | 8.6 | 6.4 |
| Cost per result | 0.000 | 1.040 |
| Total Cost | $0.000 | $0.042 |
| Tests Correct | | |
| Attempt pass rate | 31.3% | 41.7% |
| Flaky tests | 3 | 7 |
| Total Runs | 48 | 48 |
| Output Tokens | 4,222 | 38,682 |
| Reasoning Tokens | 0 | 64,952 |
| Response Time (avg) | 8.90s | 36.84s |
| Response Time (max) | 24.97s | 174.55s |
| Response Time (total) | 142.40s | 331.58s |
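
The summary figures above can be turned into a couple of derived ratios. The following is a minimal sketch, not part of the benchmark itself: the helper names `tokens_per_run` and `slowdown` are hypothetical, and adding output and reasoning tokens together assumes the two counts are reported separately, which the page does not state.

```python
# Values copied from the summary table; the dict layout is illustrative only.
SUMMARY = {
    "Nemotron 3 Super 120b A12b": {
        "total_runs": 48,
        "output_tokens": 4_222,
        "reasoning_tokens": 0,
        "avg_response_s": 8.90,
    },
    "GLM 4.7 Flash": {
        "total_runs": 48,
        "output_tokens": 38_682,
        "reasoning_tokens": 64_952,
        "avg_response_s": 36.84,
    },
}


def tokens_per_run(model: str) -> float:
    """Generated tokens per run (output + reasoning).

    Assumption: output and reasoning tokens are counted separately.
    """
    m = SUMMARY[model]
    return (m["output_tokens"] + m["reasoning_tokens"]) / m["total_runs"]


def slowdown(slower: str, faster: str) -> float:
    """Ratio of average response times between two models."""
    return SUMMARY[slower]["avg_response_s"] / SUMMARY[faster]["avg_response_s"]


if __name__ == "__main__":
    for name in SUMMARY:
        print(f"{name}: {tokens_per_run(name):.1f} tokens per run")
    ratio = slowdown("GLM 4.7 Flash", "Nemotron 3 Super 120b A12b")
    print(f"GLM 4.7 Flash avg response time is {ratio:.2f}x Nemotron's")
```

Under those assumptions, GLM 4.7 Flash generates far more tokens per run and takes roughly four times as long per response on average, while the average scores (3.4 vs 3.1) stay close.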

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Avg Score vs Response Time (avg); Total Output Tokens; Avg Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Nemotron 3 Super 120b A12b | 10.0 | 10.0 | 0.0% | 0 | | 7.14s | 2,171 | 0 |
| Anti-AI Tricks | GLM 4.7 Flash | 4.0 | 4.5 | 55.6% | 2 | | 27.09s | 1,085 | 5,597 |
| Combined | Nemotron 3 Super 120b A12b | 10.0 | 10.0 | 0.0% | 0 | | 19.98s | 124 | 0 |
| Combined | GLM 4.7 Flash | 10.0 | 2.1 | 33.3% | 1 | | 65.57s | 2,585 | 20,648 |
| Data parsing and extraction | Nemotron 3 Super 120b A12b | 9.9 | 10.0 | 100.0% | 0 | | 7.92s | 249 | 0 |
| Data parsing and extraction | GLM 4.7 Flash | 5.0 | 10.0 | 50.0% | 0 | | 1.51s | 584 | 2,755 |
| Domain specific | Nemotron 3 Super 120b A12b | 10.0 | 7.2 | 22.2% | 1 | | 6.23s | 26 | 0 |
| Domain specific | GLM 4.7 Flash | 10.0 | 4.4 | 33.3% | 2 | | 174.55s | 33,000 | 25,394 |
| General Intelligence | Nemotron 3 Super 120b A12b | 3.0 | 9.9 | 0.0% | 0 | | 24.97s | 170 | 0 |
| General Intelligence | GLM 4.7 Flash | 10.0 | 9.7 | 0.0% | 0 | | 18.14s | 18 | 2,138 |
| Instructions following | Nemotron 3 Super 120b A12b | 4.5 | 6.9 | 33.3% | 1 | | 1.50s | 66 | 0 |
| Instructions following | GLM 4.7 Flash | 5.0 | 5.8 | 66.7% | 1 | | 2.97s | 388 | 2,181 |
| Puzzle Solving | Nemotron 3 Super 120b A12b | 4.7 | 10.0 | 33.3% | 0 | | 7.50s | 1,135 | 0 |
| Puzzle Solving | GLM 4.7 Flash | 10.0 | 7.2 | 11.1% | 1 | | 12.90s | 798 | 5,225 |
| Tool Calling | Nemotron 3 Super 120b A12b | 10.0 | 1.6 | 66.7% | 1 | | 16.00s | 281 | 0 |
| Tool Calling | GLM 4.7 Flash | 10.0 | 10.0 | 100.0% | 0 | | 15.95s | 224 | 1,014 |