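AI BENCHY Categories
Domain-Specific Rankings
See which AI models perform best on Domain-Specific tasks, which are more consistent, and where the gaps mainly appear. Sorted by: Response time (average) ↓.
| Rank | Model | Company | Domain-Specific Score | Score | Tests Correct | Response Time (avg) |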
|---|---|---|---|---|---|---|
| #119 | Cobuddy medium | Baidu | 2.9 | 5.6 | 0/3 | 128.2s |
| #12 | Gemini 3.1 Flash Lite Preview high | | 5.3 | 8.6 | 1/3 | 127.6s |
| #86 | Grok 4.1 Fast medium | X AI | 5.8 | 6.5 | 1/3 | 121.8s |
| #82 | Hy3 preview high | Tencent | 5.3 | 6.6 | 1/3 | 109.0s |
| #100 | Grok Build 0.1 none | X AI | 3.6 | 6.0 | 0/3 | 103.7s |
| #31 | DeepSeek V4 Flash high | DeepSeek | 4.1 | 7.7 | 0/3 | 100.3s |
| #64 | MiMo-V2-Flash medium | Xiaomi | 5.9 | 7.2 | 1/3 | 96.0s |
| #14 | Qwen3.6 Max Preview medium | Qwen | 2.9 | 8.5 | 0/3 | 95.9s |
| #19 | Seed-2.0-Lite medium | Bytedance Seed | 5.9 | 8.2 | 1/3 | 88.7s |
| #66 | Qwen3.5-35B-A3B medium | Qwen | 4.1 | 7.1 | 0/3 | 88.3s |
| #69 | Claude Opus 4.6 medium | Anthropic | 3.0 | 7.0 | 0/3 | 83.4s |
| #30 | Qwen3.5-27B medium | Qwen | 5.3 | 7.8 | 1/3 | 79.5s |
| #42 | GPT-5.2 medium | OpenAI | 5.9 | 7.5 | 1/3 | 77.8s |
| #21 | GPT-5.4 medium | OpenAI | 5.3 | 8.0 | 1/3 | 74.3s |
| #96 | Ring-2.6-1T none | Inclusionai | 5.3 | 6.2 | 1/3 | 73.4s |