Qwen3.7 Plus vs GLM 5V Turbo benchmark comparison: Qwen3.7 Plus leads on average score (7.2 vs 5.9), has the lower benchmark cost ($0.023 vs $0.052), and is slightly faster on average (2.85s vs 2.99s). Its attempt pass rate is also higher (47.6% vs 38.1%).
Recommended model: Qwen3.7 Plus - It has the better score here (7.2), and GLM 5V Turbo costs about 2.3x as much on this benchmark.
GLM 5V Turbo. Archived model: this model is no longer updated or tested on new tests. Release: 2026-04-01
Score
Qwen3.7 Plus: 7.2
GLM 5V Turbo: 5.9
The score summarizes broad quality across our full private benchmark suite, so ranking reflects consistent performance.
Rank
Qwen3.7 Plus: #60
GLM 5V Turbo: #105
Reliability
Qwen3.7 Plus: 10.0
GLM 5V Turbo: 10.0
First-attempt success score: 10.0 means no retryable target API or rate-limit failures before successful calls; tracked failures lower the score.
Consistency
Qwen3.7 Plus: 10.0
GLM 5V Turbo: 10.0
The consistency score reflects run-to-run stability (10 = very consistent, even if consistently wrong).
Tests Correct
Qwen3.7 Plus: wrong answer: 10; did not follow instructions: 1
GLM 5V Turbo: wrong answer: 11; did not follow instructions: 2
A test is fully passed only if every run passed for that test.
Attempt pass rate
Qwen3.7 Plus: 47.6%
GLM 5V Turbo: 38.1%
Attempt pass rate = passed attempts / total attempts across runs.
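The pass-rate definition can be checked against the failure counts listed under Tests Correct. The sketch below is only a sanity check: the figure of 21 attempts per model is an assumption (it is inferred from total response time divided by average response time and is not stated on the page), and the function name is hypothetical.

```python
def attempt_pass_rate(passed: int, total: int) -> float:
    """Return passed attempts as a percentage of total attempts."""
    if total <= 0:
        raise ValueError("total attempts must be positive")
    return 100.0 * passed / total


# Assumption: 21 attempts per model (59.86s / 2.85s and 62.74s / 2.99s are both ~21).
ATTEMPTS = 21

# Failed attempts = wrong answers + instruction-following failures.
qwen_failed = 10 + 1
glm_failed = 11 + 2

qwen_rate = attempt_pass_rate(ATTEMPTS - qwen_failed, ATTEMPTS)
glm_rate = attempt_pass_rate(ATTEMPTS - glm_failed, ATTEMPTS)

print(f"Qwen3.7 Plus: {qwen_rate:.1f}%")  # 47.6%
print(f"GLM 5V Turbo: {glm_rate:.1f}%")   # 38.1%
```

Under that assumption, both computed rates match the reported values.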
Flaky tests
Qwen3.7 Plus: 0
GLM 5V Turbo: 0
Flaky tests had mixed outcomes across runs (at least one pass and one fail).
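The "flaky" and "fully passed" definitions can be expressed as two small predicates over a test's per-run outcomes. This is a minimal sketch, not the benchmark's actual implementation; the function names and the sample outcomes are hypothetical.

```python
from typing import Sequence


def is_fully_passed(outcomes: Sequence[bool]) -> bool:
    """A test is fully passed only if every run passed."""
    return len(outcomes) > 0 and all(outcomes)


def is_flaky(outcomes: Sequence[bool]) -> bool:
    """A test is flaky if it has at least one pass and at least one fail."""
    return any(outcomes) and not all(outcomes)


# Hypothetical per-run outcomes for three tests (True = pass).
runs = {
    "always_pass": [True, True, True],
    "always_fail": [False, False, False],
    "mixed": [True, False, True],
}

flaky = [name for name, outcomes in runs.items() if is_flaky(outcomes)]
fully_passed = [name for name, outcomes in runs.items() if is_fully_passed(outcomes)]

print(flaky)         # ['mixed']
print(fully_passed)  # ['always_pass']
```

A test that fails on every run is neither flaky nor passed, which is consistent with the note that a consistency score of 10 can mean "consistently wrong".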
Total Runs
Qwen3.7 Plus: 63
GLM 5V Turbo: 63
Cost per result
Qwen3.7 Plus: 0.276 cents
GLM 5V Turbo: 0.645 cents
Average cost per correct benchmark answer, in cents (lower is better).
Total Cost
Qwen3.7 Plus: $0.023 (current price)
GLM 5V Turbo: $0.052 (current price)
Input Price
Qwen3.7 Plus: $0.320 / 1M tokens
GLM 5V Turbo: $1.200 / 1M tokens
Output Price
Qwen3.7 Plus: $1.280 / 1M tokens
GLM 5V Turbo: $4.000 / 1M tokens
Total Input Tokens
Qwen3.7 Plus: 42,510
GLM 5V Turbo: 37,100
Output Tokens
Qwen3.7 Plus: 6,578
GLM 5V Turbo: 1,766
Reasoning Tokens
Qwen3.7 Plus: 0
GLM 5V Turbo: 0
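The page does not state how Total Cost is derived. As a rough cross-check, a simple linear estimate (tokens times the per-million price, with no caching or other adjustments, which is an assumption) reproduces GLM 5V Turbo's $0.052 and comes within about $0.001 of Qwen3.7 Plus's $0.023. The function name is hypothetical.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float) -> float:
    """Linear cost estimate: token counts times per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Token counts and prices as listed in the comparison above.
qwen_cost = estimate_cost_usd(42_510, 6_578, 0.320, 1.280)
glm_cost = estimate_cost_usd(37_100, 1_766, 1.200, 4.000)

print(f"Qwen3.7 Plus: ${qwen_cost:.4f}")  # $0.0220
print(f"GLM 5V Turbo: ${glm_cost:.4f}")   # $0.0516
print(f"Cost ratio (GLM / Qwen): {glm_cost / qwen_cost:.1f}x")
```

The small gap for Qwen3.7 Plus suggests the reported total includes something this estimate leaves out, or that it is rounded differently.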
Response Time (avg)
Qwen3.7 Plus: 2.85s
GLM 5V Turbo: 2.99s
Response Time (max)
Qwen3.7 Plus: 29.38s
GLM 5V Turbo: 6.51s
Response Time (total)
Qwen3.7 Plus: 59.86s
GLM 5V Turbo: 62.74s
Generation showcase
Hamster playing table tennis
Prompt: Create a detailed SVG illustration of a hamster playing table tennis.
Qwen3.7 Plus: score 6.5, consistency 10.0, attempt pass rate 50.0%, flaky tests 0. Failed attempts: wrong answer 2. Response time: avg 1.38s, max 2.69s, total 5.51s. Tokens: 696 input, 349 output, 0 reasoning.
GLM 5V Turbo (archived): score 4.8, consistency 10.0, attempt pass rate 25.0%, flaky tests 0. Failed attempts: wrong answer 3. Response time: avg 3.13s, max 5.90s, total 12.50s.

Qwen3.7 Plus: score 5.5, consistency 10.0, attempt pass rate 33.3%, flaky tests 0. Failed attempts: wrong answer 2. Response time: avg 2.15s, max 4.39s, total 6.44s. Tokens: 7,911 input, 639 output, 0 reasoning.
GLM 5V Turbo (archived): score 5.5, consistency 10.0, attempt pass rate 33.3%, flaky tests 0. Failed attempts: wrong answer 2. Response time: avg 3.13s, max 5.30s, total 9.40s.

Qwen3.7 Plus: score 10.0, consistency 10.0, attempt pass rate 100.0%, flaky tests 0. No failed answers. Response time: avg 29.38s, max 29.38s, total 29.38s. Tokens: 14,952 input, 4,505 output, 0 reasoning.
GLM 5V Turbo (archived): score 3.0, consistency 10.0, attempt pass rate 0.0%, flaky tests 0. Failed attempts: wrong answer 1. Response time: avg 6.51s, max 6.51s, total 6.51s.

Qwen3.7 Plus: score 10.0, consistency 10.0, attempt pass rate 100.0%, flaky tests 0. No failed answers. Response time: avg 1.43s, max 1.57s, total 2.86s. Tokens: 7,794 input, 243 output, 0 reasoning.
GLM 5V Turbo (archived): score 10.0, consistency 10.0, attempt pass rate 100.0%, flaky tests 0. No failed answers. Response time: avg 3.81s, max 5.69s, total 7.62s.

Qwen3.7 Plus: score 3.0, consistency 10.0, attempt pass rate 0.0%, flaky tests 0. Failed attempts: wrong answer 3. Response time: avg 868ms, max 1.02s, total 2.60s. Tokens: 789 input, 18 output, 0 reasoning.
GLM 5V Turbo (archived): score 5.3, consistency 10.0, attempt pass rate 33.3%, flaky tests 0. Failed attempts: wrong answer 2. Response time: avg 2.09s, max 2.39s, total 6.26s.

Qwen3.7 Plus: score 5.3, consistency 10.0, attempt pass rate 0.0%, flaky tests 0. Failed attempts: did not follow instructions 1. Response time: avg 1.33s, max 1.33s, total 1.33s. Tokens: 522 input, 78 output, 0 reasoning.
GLM 5V Turbo (archived): score 4.6, consistency 10.0, attempt pass rate 0.0%, flaky tests 0. Failed attempts: did not follow instructions 1. Response time: avg 2.22s, max 2.22s, total 2.22s.

Qwen3.7 Plus: score 6.3, consistency 10.0, attempt pass rate 50.0%, flaky tests 0. Failed attempts: wrong answer 1. Response time: avg 929ms, max 1.05s, total 1.86s. Tokens: 711 input, 72 output, 0 reasoning.
GLM 5V Turbo (archived): score 6.5, consistency 10.0, attempt pass rate 50.0%, flaky tests 0. Failed attempts: wrong answer 1. Response time: avg 1.97s, max 2.43s, total 3.93s.

Qwen3.7 Plus: score 7.7, consistency 10.0, attempt pass rate 66.7%, flaky tests 0. Failed attempts: wrong answer 1. Response time: avg 1.71s, max 2.65s, total 5.13s. Tokens: 714 input, 443 output, 0 reasoning.
GLM 5V Turbo (archived): score 5.3, consistency 10.0, attempt pass rate 33.3%, flaky tests 0. Failed attempts: did not follow instructions 1, wrong answer 1. Response time: avg 2.40s, max 3.81s, total 7.21s.

Qwen3.7 Plus: score 10.0, consistency 10.0, attempt pass rate 100.0%, flaky tests 0. No failed answers. Response time: avg 3.54s, max 3.54s, total 3.54s. Tokens: 8,211 input, 222 output, 0 reasoning.
GLM 5V Turbo (archived): score 10.0, consistency 10.0, attempt pass rate 100.0%, flaky tests 0. No failed answers. Response time: avg 4.86s, max 4.86s, total 4.86s.

Qwen3.7 Plus: score 3.0, consistency 10.0, attempt pass rate 0.0%, flaky tests 0. Failed attempts: wrong answer 1. Response time: avg 1.21s, max 1.21s, total 1.21s. Tokens: 210 input, 9 output, 0 reasoning.
GLM 5V Turbo (archived): score 3.0, consistency 10.0, attempt pass rate 0.0%, flaky tests 0. Failed attempts: wrong answer 1. Response time: avg 2.23s, max 2.23s, total 2.23s.