A test is fully passed only if every run passed for that test.
Wrong answer: 6. No answer: 3. Response time: avg 49.43s, max 192.75s, total 988.58s…
Anti-AI Tricks: 10.0. No failed answers. Response time: avg 13.40s, max 45.73s, total 53.58s.
Coding: 3.7. No answer: 1. Wrong answer: 1. Response time: avg 126.82s, max 192.75s, total 253.65s.
Combined: 10.0. No failed answers. Response time: avg 13.01s, max 13.01s, total 13.01s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 14.72s, max 24.97s, total 29.43s.
Domain specific: 4.1. Wrong answer: 2. No answer: 1. Response time: avg 149.64s, max 163.21s, total 448.91s.
General Intelligence: 5.5. Wrong answer: 1. Response time: avg 4.17s, max 4.17s, total 4.17s.
Instructions following: 9.8. No failed answers. Response time: avg 1.52s, max 1.89s, total 3.03s.
Puzzle Solving: 5.3. Wrong answer: 2. Response time: avg 10.22s, max 23.65s, total 30.66s.
Tool Calling: 10.0. No failed answers. Response time: avg 2.79s, max 2.79s, total 2.79s.
Trivia: 3.0. No answer: 1. Response time: avg 149.34s, max 149.34s, total 149.34s.
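The "fully passed" rule and the three response-time figures reported for each category can be sketched as follows. This is a minimal illustration, not the benchmark's actual implementation: the `Run` record, the function names, and the test ids are hypothetical, and the individual run times are invented so that their maximum and total match the 45.73s and 53.58s reported for Anti-AI Tricks above. It also assumes the average is taken over individual runs.

```python
from dataclasses import dataclass


@dataclass
class Run:
    test_id: str    # hypothetical identifier of the test this run belongs to
    passed: bool    # whether this single run passed
    seconds: float  # response time of this run


def fully_passed(runs):
    """Map each test id to True only if every run of that test passed."""
    result = {}
    for run in runs:
        result[run.test_id] = result.get(run.test_id, True) and run.passed
    return result


def response_time_stats(runs):
    """Return the avg, max and total response time over all runs."""
    times = [run.seconds for run in runs]
    total = sum(times)
    return {"avg": total / len(times), "max": max(times), "total": total}


# Hypothetical runs: two tests, two runs each.
runs = [
    Run("t1", True, 45.73),
    Run("t1", True, 2.50),
    Run("t2", True, 2.60),
    Run("t2", False, 2.75),  # one failed run means t2 is not fully passed
]

print(fully_passed(runs))
print(response_time_stats(runs))
```

Under these assumptions, `t2` is not fully passed even though one of its two runs succeeded, and the average is simply the total divided by the number of runs.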
Wrong answer: 1. Response time: avg 8.30s, max 34.82s, total 165.92s…
Anti-AI Tricks: 10.0. No failed answers. Response time: avg 2.57s, max 3.60s, total 10.27s.
Coding: 10.0. No failed answers. Response time: avg 24.62s, max 34.82s, total 49.24s.
Combined: 10.0. No failed answers. Response time: avg 22.37s, max 22.37s, total 22.37s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 6.43s, max 8.51s, total 12.87s.
Domain specific: 7.6. Wrong answer: 1. Response time: avg 14.09s, max 22.00s, total 42.27s.
General Intelligence: 10.0. No failed answers. Response time: avg 3.63s, max 3.63s, total 3.63s.
Instructions following: 10.0. No failed answers. Response time: avg 3.35s, max 3.42s, total 6.69s.
Puzzle Solving: 10.0. No failed answers. Response time: avg 3.23s, max 3.68s, total 9.69s.
Tool Calling: 9.8. No failed answers. Response time: avg 4.96s, max 4.96s, total 4.96s.
Trivia: 10.0. No failed answers. Response time: avg 3.94s, max 3.94s, total 3.94s.
Wrong answer: 3. No answer: 1. Response time: avg 9.34s, max 38.03s, total 186.84s…
Anti-AI Tricks: 10.0. No failed answers. Response time: avg 3.95s, max 5.76s, total 15.79s.
Coding: 10.0. No failed answers. Response time: avg 14.97s, max 22.27s, total 29.93s.
Combined: 9.8. No failed answers. Response time: avg 38.03s, max 38.03s, total 38.03s.
Data parsing and extraction: 7.1. Wrong answer: 1. Response time: avg 12.29s, max 19.64s, total 24.59s.
Domain specific: 5.3. Wrong answer: 2. Response time: avg 14.15s, max 28.41s, total 42.46s.
General Intelligence: 10.0. No failed answers. Response time: avg 2.46s, max 2.46s, total 2.46s.
Instructions following: 10.0. No failed answers. Response time: avg 3.32s, max 5.07s, total 6.63s.
Puzzle Solving: 10.0. No failed answers. Response time: avg 3.95s, max 4.33s, total 11.85s.
Tool Calling: 10.0. No failed answers. Response time: avg 8.96s, max 8.96s, total 8.96s.
Trivia: 3.0. No answer: 1. Response time: avg 6.14s, max 6.14s, total 6.14s.
Wrong answer: 2. Response time: avg 20.77s, max 88.68s, total 269.96s…
Anti-AI Tricks: 10.0. No failed answers. Response time: avg 7.90s, max 9.52s, total 15.80s.
Coding: 7.0. Wrong answer: 1. Response time: avg 54.28s, max 88.68s, total 108.56s.
Combined: 9.5. No failed answers. Response time: avg 40.61s, max 40.61s, total 40.61s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 7.72s, max 7.72s, total 7.72s.
Domain specific: 7.7. Wrong answer: 1. Response time: avg 32.73s, max 32.73s, total 32.73s.
General Intelligence: 10.0. No failed answers. Response time: avg 11.77s, max 11.77s, total 11.77s.
Instructions following: 10.0. No failed answers. Response time: avg 9.56s, max 9.56s, total 9.56s.
Puzzle Solving: 10.0. No failed answers. Response time: avg 6.90s, max 8.49s, total 13.79s.
Tool Calling: 10.0. No failed answers. Response time: avg 23.15s, max 23.15s, total 23.15s.
Trivia: 10.0. No failed answers. Response time: avg 6.27s, max 6.27s, total 6.27s.
Wrong answer: 5. Did not follow instructions: 2. Response time: avg 22.31s, max 100.41s, total 446.17s…
Anti-AI Tricks: 8.3. Wrong answer: 1. Response time: avg 4.11s, max 6.42s, total 16.42s.
Coding: 8.2. Wrong answer: 1. Response time: avg 54.98s, max 96.94s, total 109.96s.
Combined: 10.0. No failed answers. Response time: avg 20.57s, max 20.57s, total 20.57s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 5.32s, max 5.40s, total 10.64s.
Domain specific: 5.3. Wrong answer: 2. Response time: avg 74.27s, max 100.41s, total 222.80s.
Instructions following: 10.0. No failed answers. Response time: avg 3.11s, max 3.68s, total 6.22s.
Puzzle Solving: 8.2. Did not follow instructions: 1. Response time: avg 9.14s, max 18.14s, total 27.41s.
Tool Calling: 10.0. No failed answers. Response time: avg 13.28s, max 13.28s, total 13.28s.
Trivia: 3.0. Wrong answer: 1. Response time: avg 13.95s, max 13.95s, total 13.95s.
Extra formatting: 3. Wrong answer: 3. Timed out: 1. Response time: avg 15.81s, max 46.35s, total 189.71s…
Coding: 7.2. Extra formatting: 1. Response time: avg 33.87s, max 35.76s, total 67.74s.
Combined: 10.0. No failed answers. Response time: avg 46.35s, max 46.35s, total 46.35s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 13.90s, max 13.90s, total 13.90s.
General Intelligence: 10.0. No failed answers. Response time: avg 4.94s, max 4.94s, total 4.94s.
Instructions following: 10.0. No failed answers. Response time: avg 2.61s, max 2.61s, total 2.61s.
Puzzle Solving: 10.0. No failed answers. Response time: avg 5.31s, max 6.24s, total 10.62s.
Tool Calling: 10.0. No failed answers. Response time: avg 7.48s, max 7.48s, total 7.48s.
Trivia: 3.0. Wrong answer: 1. Response time: avg 30.09s, max 30.09s, total 30.09s.
Anti-AI Tricks: 6.4. Extra formatting: 2. Response time: avg 7.45s, max 11.88s, total 14.90s.
Coding: 7.2. Did not follow instructions: 1. Response time: avg 29.37s, max 35.63s, total 58.74s.
Combined: 10.0. No failed answers. Response time: avg 76.66s, max 76.66s, total 76.66s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 7.37s, max 7.37s, total 7.37s.
General Intelligence: 10.0. No failed answers. Response time: avg 5.04s, max 5.04s, total 5.04s.
Instructions following: 10.0. No failed answers. Response time: avg 2.43s, max 2.43s, total 2.43s.
Puzzle Solving: 7.7. Wrong answer: 1. Response time: avg 4.71s, max 4.75s, total 9.41s.
Tool Calling: 10.0. No failed answers. Response time: avg 9.73s, max 9.73s, total 9.73s.
Trivia: 3.0. Wrong answer: 1. Response time: avg 63.24s, max 63.24s, total 63.24s.
Anti-AI Tricks: 8.7. Wrong answer: 1. Response time: avg 37.16s, max 140.53s, total 148.65s.
Coding: 10.0. No failed answers. Response time: avg 137.63s, max 137.63s, total 137.63s.
Combined: 10.0. No failed answers. Response time: avg 149.23s, max 149.23s, total 149.23s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 4.49s, max 4.96s, total 8.98s.
Domain specific: 3.6. Wrong answer: 3. Response time: avg 139.90s, max 141.40s, total 419.69s.
Instructions following: 7.3. No answer: 1. Response time: avg 23.26s, max 43.87s, total 46.51s.
Puzzle Solving: 5.7. Did not follow instructions: 2. Response time: avg 50.83s, max 144.85s, total 152.49s.
Tool Calling: 10.0. No failed answers. Response time: avg 6.44s, max 6.44s, total 6.44s.
Wrong answer: 2. Did not follow instructions: 1. Response time: avg 68.14s, max 280.52s, total 1090.28s…
Anti-AI Tricks: 10.0. No failed answers. Response time: avg 43.87s, max 121.88s, total 131.62s.
Combined: 10.0. No failed answers. Response time: avg 280.52s, max 280.52s, total 280.52s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 7.16s, max 8.54s, total 14.31s.
Domain specific: 5.3. Wrong answer: 2. Response time: avg 127.58s, max 133.93s, total 382.74s.
General Intelligence: 10.0. No failed answers. Response time: avg 5.25s, max 5.25s, total 5.25s.
Instructions following: 9.8. No failed answers. Response time: avg 64.03s, max 124.45s, total 128.06s.
Puzzle Solving: 7.7. Did not follow instructions: 1. Response time: avg 46.68s, max 134.22s, total 140.04s.
Tool Calling: 10.0. No failed answers. Response time: avg 7.73s, max 7.73s, total 7.73s.
Wrong answer: 4. Response time: avg 37.88s, max 332.10s, total 757.66s…
Anti-AI Tricks: 10.0. No failed answers. Response time: avg 4.66s, max 6.74s, total 18.65s.
Coding: 8.2. Wrong answer: 1. Response time: avg 69.68s, max 130.26s, total 139.35s.
Combined: 10.0. No failed answers. Response time: avg 19.29s, max 19.29s, total 19.29s.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 4.18s, max 4.35s, total 8.36s.
Domain specific: 5.3. Wrong answer: 2. Response time: avg 164.14s, max 332.10s, total 492.41s.
General Intelligence: 10.0. No failed answers. Response time: avg 4.16s, max 4.16s, total 4.16s.
Instructions following: 10.0. No failed answers. Response time: avg 3.36s, max 3.46s, total 6.73s.
Puzzle Solving: 10.0. No failed answers. Response time: avg 6.76s, max 10.54s, total 20.28s.
Tool Calling: 10.0. No failed answers. Response time: avg 10.57s, max 10.57s, total 10.57s.
Trivia: 2.8. Wrong answer: 1. Response time: avg 37.86s, max 37.86s, total 37.86s.
Anti-AI Tricks: 6.9. Extra formatting: 1. Wrong answer: 1. Response time: avg 3.46s, max 4.38s, total 13.86s.
Coding: 10.0. No failed answers. Response time: avg 27.11s, max 27.11s, total 27.11s.
Combined: 3.0. API error: 1. Response time: avg 0ms, max 0ms, total 0ms.
Data parsing and extraction: 10.0. No failed answers. Response time: avg 5.54s, max 7.51s, total 11.08s.
Instructions following: 9.8. No failed answers. Response time: avg 3.52s, max 3.80s, total 7.04s.
Tool Calling: 3.0. API error: 1. Response time: avg 0ms, max 0ms, total 0ms.