A test is fully passed only if every run passed for that test.Wrong answer: 4Response Time (avg)5.81sResponse Time (max)14.72sResponse Time (total)116.25s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.48sResponse Time (max)4.31sResponse Time (total)13.94s
Coding
: 7.3 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)6.66sResponse Time (max)6.94sResponse Time (total)13.31s
Combined
: 3.0 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)3.27sResponse Time (max)3.27sResponse Time (total)3.27s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)9.40sResponse Time (max)14.72sResponse Time (total)18.80s
Domain specific
: 5.3 A test is fully passed only if every run passed for that test.Wrong answer: 2Response Time (avg)8.05sResponse Time (max)14.40sResponse Time (total)24.15s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.68sResponse Time (max)3.68sResponse Time (total)3.68s
Instructions following
: 9.9 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)7.02sResponse Time (max)7.35sResponse Time (total)14.03s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)5.77sResponse Time (max)10.27sResponse Time (total)17.32s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.99sResponse Time (max)4.99sResponse Time (total)4.99s
Trivia
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.75sResponse Time (max)2.75sResponse Time (total)2.75s
A test is fully passed only if every run passed for that test.Wrong answer: 2Did not follow instructions: 1Response Time (avg)68.14sResponse Time (max)280.52sResponse Time (total)1090.28s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)43.87sResponse Time (max)121.88sResponse Time (total)131.62s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)280.52sResponse Time (max)280.52sResponse Time (total)280.52s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)7.16sResponse Time (max)8.54sResponse Time (total)14.31s
Domain specific
: 5.3 A test is fully passed only if every run passed for that test.Wrong answer: 2Response Time (avg)127.58sResponse Time (max)133.93sResponse Time (total)382.74s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)5.25sResponse Time (max)5.25sResponse Time (total)5.25s
Instructions following
: 9.8 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)64.03sResponse Time (max)124.45sResponse Time (total)128.06s
Puzzle Solving
: 7.7 A test is fully passed only if every run passed for that test.Did not follow instructions: 1Response Time (avg)46.68sResponse Time (max)134.22sResponse Time (total)140.04s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)7.73sResponse Time (max)7.73sResponse Time (total)7.73s
A test is fully passed only if every run passed for that test.Wrong answer: 3Response Time (avg)3.02sResponse Time (max)18.27sResponse Time (total)57.44s…
Anti-AI Tricks
: 8.3 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)2.12sResponse Time (max)3.75sResponse Time (total)8.50s
Coding
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.84sResponse Time (max)2.84sResponse Time (total)2.84s
Combined
: 9.5 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)18.27sResponse Time (max)18.27sResponse Time (total)18.27s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.15sResponse Time (max)2.33sResponse Time (total)4.29s
Domain specific
: 7.7 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)1.19sResponse Time (max)1.40sResponse Time (total)3.58s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.47sResponse Time (max)3.47sResponse Time (total)3.47s
Instructions following
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)1.46sResponse Time (max)1.68sResponse Time (total)2.91s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.46sResponse Time (max)3.72sResponse Time (total)7.38s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.74sResponse Time (max)4.74sResponse Time (total)4.74s
Trivia
: 3.0 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)1.46sResponse Time (max)1.46sResponse Time (total)1.46s
A test is fully passed only if every run passed for that test.Wrong answer: 3Response Time (avg)13.83sResponse Time (max)33.37sResponse Time (total)276.53s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)6.36sResponse Time (max)8.75sResponse Time (total)25.44s
Coding
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)22.98sResponse Time (max)32.31sResponse Time (total)45.96s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)19.60sResponse Time (max)19.60sResponse Time (total)19.60s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)8.80sResponse Time (max)10.25sResponse Time (total)17.60s
Domain specific
: 5.9 A test is fully passed only if every run passed for that test.Wrong answer: 2Response Time (avg)24.94sResponse Time (max)29.00sResponse Time (total)74.81s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)11.70sResponse Time (max)11.70sResponse Time (total)11.70s
Instructions following
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)7.46sResponse Time (max)10.17sResponse Time (total)14.92s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)8.84sResponse Time (max)11.71sResponse Time (total)26.51s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)6.63sResponse Time (max)6.63sResponse Time (total)6.63s
Trivia
: 3.0 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)33.37sResponse Time (max)33.37sResponse Time (total)33.37s
A test is fully passed only if every run passed for that test.Wrong answer: 2Did not follow instructions: 1Response Time (avg)4.29sResponse Time (max)12.05sResponse Time (total)85.72s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.09sResponse Time (max)2.56sResponse Time (total)8.35s
Coding
: 6.8 A test is fully passed only if every run passed for that test.Did not follow instructions: 1Response Time (avg)9.91sResponse Time (max)11.59sResponse Time (total)19.82s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)12.05sResponse Time (max)12.05sResponse Time (total)12.05s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.07sResponse Time (max)5.60sResponse Time (total)8.14s
Domain specific
: 7.7 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)5.24sResponse Time (max)6.43sResponse Time (total)15.73s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.52sResponse Time (max)2.52sResponse Time (total)2.52s
Instructions following
: 9.9 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.70sResponse Time (max)3.07sResponse Time (total)5.40s
Puzzle Solving
: 7.7 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)2.38sResponse Time (max)2.55sResponse Time (total)7.15s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.81sResponse Time (max)3.81sResponse Time (total)3.81s
Trivia
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.75sResponse Time (max)2.75sResponse Time (total)2.75s
A test is fully passed only if every run passed for that test.Wrong answer: 2Timed out: 1Response Time (avg)4.48sResponse Time (max)23.18sResponse Time (total)85.21s…
Anti-AI Tricks
: 8.3 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)1.85sResponse Time (max)2.71sResponse Time (total)7.38s
Coding
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)14.79sResponse Time (max)23.18sResponse Time (total)29.59s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)21.45sResponse Time (max)21.45sResponse Time (total)21.45s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.37sResponse Time (max)3.30sResponse Time (total)4.74s
Domain specific
: 7.7 A test is fully passed only if every run passed for that test.Timed out: 1Response Time (avg)1.17sResponse Time (max)1.40sResponse Time (total)2.35s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.87sResponse Time (max)2.87sResponse Time (total)2.87s
Instructions following
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)1.57sResponse Time (max)1.66sResponse Time (total)3.14s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.43sResponse Time (max)2.89sResponse Time (total)7.28s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.17sResponse Time (max)4.17sResponse Time (total)4.17s
Trivia
: 3.0 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)2.25sResponse Time (max)2.25sResponse Time (total)2.25s
A test is fully passed only if every run passed for that test.Wrong answer: 3Response Time (avg)9.43sResponse Time (max)56.19sResponse Time (total)188.66s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.41sResponse Time (max)6.32sResponse Time (total)17.64s
Coding
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)14.42sResponse Time (max)21.06sResponse Time (total)28.85s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)9.56sResponse Time (max)9.56sResponse Time (total)9.56s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.28sResponse Time (max)5.13sResponse Time (total)6.56s
Domain specific
: 5.3 A test is fully passed only if every run passed for that test.Wrong answer: 2Response Time (avg)28.05sResponse Time (max)56.19sResponse Time (total)84.16s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)5.17sResponse Time (max)5.17sResponse Time (total)5.17s
Instructions following
: 9.9 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.74sResponse Time (max)3.99sResponse Time (total)7.48s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.74sResponse Time (max)5.61sResponse Time (total)14.21s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.96sResponse Time (max)4.96sResponse Time (total)4.96s
Trivia
: 3.0 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)10.06sResponse Time (max)10.06sResponse Time (total)10.06s
A test is fully passed only if every run passed for that test.Wrong answer: 2Response Time (avg)2.98sResponse Time (max)6.44sResponse Time (total)59.59s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.52sResponse Time (max)5.40sResponse Time (total)10.08s
Coding
: 6.8 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)5.54sResponse Time (max)5.59sResponse Time (total)11.08s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)6.44sResponse Time (max)6.44sResponse Time (total)6.44s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)1.81sResponse Time (max)2.32sResponse Time (total)3.63s
Domain specific
: 7.7 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)3.39sResponse Time (max)4.44sResponse Time (total)10.16s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.27sResponse Time (max)2.27sResponse Time (total)2.27s
Instructions following
: 9.9 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)1.86sResponse Time (max)2.10sResponse Time (total)3.73s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.35sResponse Time (max)3.25sResponse Time (total)7.06s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.27sResponse Time (max)3.27sResponse Time (total)3.27s
Trivia
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)1.88sResponse Time (max)1.88sResponse Time (total)1.88s
A test is fully passed only if every run passed for that test.Wrong answer: 2Response Time (avg)20.77sResponse Time (max)88.68sResponse Time (total)269.96s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)7.90sResponse Time (max)9.52sResponse Time (total)15.80s
Coding
: 7.0 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)54.28sResponse Time (max)88.68sResponse Time (total)108.56s
Combined
: 9.5 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)40.61sResponse Time (max)40.61sResponse Time (total)40.61s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)7.72sResponse Time (max)7.72sResponse Time (total)7.72s
Domain specific
: 7.7 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)32.73sResponse Time (max)32.73sResponse Time (total)32.73s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)11.77sResponse Time (max)11.77sResponse Time (total)11.77s
Instructions following
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)9.56sResponse Time (max)9.56sResponse Time (total)9.56s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)6.90sResponse Time (max)8.49sResponse Time (total)13.79s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)23.15sResponse Time (max)23.15sResponse Time (total)23.15s
Trivia
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)6.27sResponse Time (max)6.27sResponse Time (total)6.27s
A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)16.50sResponse Time (max)117.26sResponse Time (total)330.06s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.88sResponse Time (max)5.73sResponse Time (total)15.53s
Coding
: 7.9 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)95.96sResponse Time (max)117.26sResponse Time (total)191.92s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)22.42sResponse Time (max)22.42sResponse Time (total)22.42s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)5.43sResponse Time (max)6.18sResponse Time (total)10.86s
Domain specific
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)15.27sResponse Time (max)34.09sResponse Time (total)45.80s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)5.19sResponse Time (max)5.19sResponse Time (total)5.19s
Instructions following
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.04sResponse Time (max)4.70sResponse Time (total)8.08s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.05sResponse Time (max)5.64sResponse Time (total)12.15s
Tool Calling
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)12.60sResponse Time (max)12.60sResponse Time (total)12.60s
Trivia
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)5.50sResponse Time (max)5.50sResponse Time (total)5.50s
A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)8.30sResponse Time (max)34.82sResponse Time (total)165.92s…
Anti-AI Tricks
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)2.57sResponse Time (max)3.60sResponse Time (total)10.27s
Coding
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)24.62sResponse Time (max)34.82sResponse Time (total)49.24s
Combined
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)22.37sResponse Time (max)22.37sResponse Time (total)22.37s
Data parsing and extraction
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)6.43sResponse Time (max)8.51sResponse Time (total)12.87s
Domain specific
: 7.6 A test is fully passed only if every run passed for that test.Wrong answer: 1Response Time (avg)14.09sResponse Time (max)22.00sResponse Time (total)42.27s
General Intelligence
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.63sResponse Time (max)3.63sResponse Time (total)3.63s
Instructions following
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.35sResponse Time (max)3.42sResponse Time (total)6.69s
Puzzle Solving
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.23sResponse Time (max)3.68sResponse Time (total)9.69s
Tool Calling
: 9.8 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)4.96sResponse Time (max)4.96sResponse Time (total)4.96s
Trivia
: 10.0 A test is fully passed only if every run passed for that test.No failed answers.Response Time (avg)3.94sResponse Time (max)3.94sResponse Time (total)3.94s