APEX
Leaderboard
Categories: Overall, Frontend, Backend, Full-Stack, Debugging, Refactoring, Code Review, From Scratch, Multi-Language
Levels: All Levels, Easy, Medium, Hard, Expert, Master
| # | Model | ELO | Peak | Avg Score | Avg Cost | Consistency |
|---|-------|-----|------|-----------|----------|-------------|
| 1 | Claude Opus 4.8 | 1946 | 1965 | 90.0 | $2.04 | 100.0% |
| 2 | Claude Opus 4.7 | 1880 | 1896 | 88.3 | $1.47 | 100.0% |
| 3 | GPT 5.5 | 1840 | 1858 | 86.8 | $1.61 | 94.3% |
| 4 | GLM 5.2 | 1795 | 1892 | 85.3 | $0.39 | 97.1% |
| 5 | GPT 5.4 Mini | 1767 | 1782 | 84.3 | $0.38 | 95.7% |
| 6 | Composer 2.5 | 1765 | 1793 | 84.7 | $0.39 | 100.0% |
| 7 | Claude Opus 4.6 | 1758 | 1773 | 84.7 | $1.11 | 97.1% |
| 8 | Kimi K2.7 Code | 1746 | 1772 | 83.9 | $0.60 | 97.1% |
| 9 | Grok 4.3 | 1743 | 1793 | 83.5 | $0.39 | 97.1% |
| 10 | Claude Sonnet 4.6 | 1743 | 1760 | 83.8 | $0.31 | 91.4% |
| 11 | Minimax M3 | 1741 | 1756 | 83.9 | $0.15 | 98.6% |
| 12 | Claude Opus 4.5 | 1698 | 1712 | 82.3 | $0.88 | 92.9% |
| 13 | Kimi K2.6 | 1678 | 1694 | 80.8 | $0.47 | 91.4% |
| 14 | GLM 5.1 | 1669 | 1686 | 80.7 | $0.22 | 92.9% |
| 15 | Qwen3.7 Max | 1664 | 1681 | 80.2 | $1.00 | 92.7% |
| 16 | Deepseek V4 Pro | 1656 | 1672 | 80.2 | $0.37 | 90.0% |
| 17 | GPT 5.2 | 1644 | 1662 | 80.2 | $0.19 | 84.4% |
| 18 | GPT 5.3 Codex Spark | 1642 | 1659 | 79.3 | $0.32 | 92.9% |
| 19 | GPT 5.2 Codex | 1642 | 1659 | 78.5 | $0.13 | 80.3% |
| 20 | Deepseek V4 Flash | 1641 | 1658 | 79.6 | $0.06 | 88.6% |
| 21 | Gemini 3.5 Flash | 1639 | 1655 | 79.8 | $0.27 | 94.3% |
| 22 | GPT 5.3 Codex | 1639 | 1655 | 80.2 | $0.13 | 82.8% |
| 23 | Minimax M2.7 [NVFP4] | 1621 | 1638 | 79.4 | <$0.01 | 93.5% |
| 24 | Qwen3.6 27b [Q4_K_XL] | 1615 | 1617 | 78.8 | <$0.01 | 87.1% |
| 25 | Qwen3.6 Plus | 1610 | 1627 | 77.8 | $0.40 | 90.0% |
| 26 | GPT 5.1 Codex Mini | 1598 | 1612 | 76.6 | $0.59 | 82.2% |
| 27 | Qwen3.6 35b A3b Q4 K XL [Q4_K_XL] | 1577 | 1596 | 76.5 | <$0.01 | 85.9% |
| 28 | Gemini 3.1 Pro Preview | 1564 | 1580 | 75.9 | $0.57 | 70.3% |
| 29 | Claude Sonnet 4.5 | 1557 | 1572 | 76.1 | $0.25 | 72.3% |
| 30 | Minimax M2.7 | 1554 | 1571 | 76.4 | $0.05 | 78.6% |
| 31 | Qwen3.6 35b A3b [BF16] | 1547 | 1567 | 76.2 | <$0.01 | 77.1% |
| 32 | Gemini 3 Pro Preview | 1540 | 1557 | 74.9 | $0.46 | 72.3% |
| 33 | Qwen3.6 27b [BF16] | 1530 | 1547 | 73.2 | <$0.01 | 67.5% |
| 34 | Qwen3.5 397b A17b Q4 K XL [Q4_K_XL] | 1524 | 1541 | 74.8 | $0.85 | 75.7% |
| 35 | GLM 5 | 1522 | 1540 | 73.2 | $0.15 | 69.4% |
| 36 | Claude Haiku 4.5 | 1498 | 1513 | 71.5 | $0.07 | 63.1% |
| 37 | Qwen3.5 122b A10b [Q4_K_XL] | 1493 | 1509 | 71.8 | $0.24 | 60.2% |
| 38 | Kimi K2.5 | 1493 | 1509 | 72.4 | $0.06 | 68.9% |
| 39 | Gemini 3 Flash Preview | 1486 | 1504 | 72.4 | $0.02 | 67.4% |
| 40 | GLM 4.7 | 1486 | 1502 | 71.8 | $0.10 | 64.5% |
| 41 | Qwen3.5 Plus 02.15 | 1482 | 1499 | 70.6 | $0.13 | 56.5% |
| 42 | Grok 4 | 1466 | 1483 | 71.8 | $0.27 | 66.9% |
| 43 | GLM 4.7 [Q4_K_XL] | 1451 | 1467 | 71.2 | $0.04 | 56.1% |
| 44 | Qwen3.5 27b | 1421 | 1435 | 70.0 | $0.36 | 59.8% |
| 45 | Qwen3.5 122b A10b | 1416 | 1432 | 69.6 | $0.38 | 57.3% |
| 46 | Gemini 2.5 Pro | 1409 | 1425 | 68.4 | $0.27 | 53.4% |
| 47 | Grok 4.1 Fast | 1397 | 1413 | 68.7 | $0.05 | 60.2% |
| 48 | Minimax M2.1 | 1394 | 1410 | 65.1 | $0.05 | 46.4% |
| 49 | Deepseek R1 0528 | 1386 | 1405 | 64.6 | $0.05 | 43.7% |
| 50 | Deepseek V3.2 | 1360 | 1377 | 64.0 | $0.04 | 37.5% |
| 51 | Qwen3 Coder | 1358 | 1374 | 60.8 | $0.11 | 37.8% |
| 52 | Minimax M2.5 | 1358 | 1373 | 65.5 | $0.14 | 45.5% |
| 53 | GLM 4.6 | 1357 | 1373 | 64.4 | $0.11 | 40.7% |
| 54 | GLM 4.5 | 1352 | 1368 | 65.9 | $0.10 | 49.5% |
| 55 | Qwen3 Coder Plus | 1351 | 1367 | 63.2 | $0.07 | 39.3% |
| 56 | Grok Code Fast 1 | 1348 | 1365 | 65.8 | $0.07 | 42.4% |
| 57 | Qwen3.5 35b A3b | 1347 | 1365 | 63.2 | $0.08 | 42.1% |
| 58 | Minimax M2.5 [Q4_K_XL] | 1346 | 1362 | 63.9 | $0.03 | 33.3% |
| 59 | Qwen3.5 Flash 02.23 | 1341 | 1357 | 63.4 | $0.06 | 46.3% |
| 60 | Qwen3.5 27b [Q4_K_M] | 1334 | 1351 | 63.1 | $0.18 | 42.7% |
| 61 | Devstral 2512 | 1324 | 1341 | 63.0 | $0.10 | 32.4% |
| 62 | Step 3.5 Flash | 1312 | 1328 | 60.4 | $0.13 | 39.8% |
| 63 | Qwen3 Coder Next | 1301 | 1317 | 61.2 | $0.02 | 36.1% |
| 64 | GLM 4.5 Air | 1299 | 1317 | 60.3 | $0.03 | 29.1% |
| 65 | GPT OSS 120b | 1295 | 1312 | 59.1 | $0.12 | 30.2% |
| 66 | Trinity Large Preview:free | 1280 | 1293 | 54.5 | $0.05 | 20.3% |
| 67 | Qwen3 Coder Flash | 1267 | 1284 | 59.1 | $0.02 | 27.5% |
| 68 | Qwen3 Coder Next [Q4_K_XL] | 1263 | 1281 | 58.3 | <$0.01 | 19.5% |
| 69 | Gemini 2.5 Flash Lite | 1255 | 1271 | 56.6 | $0.02 | 21.5% |
| 70 | Qwen3.5 35b A3b [Q4_K_XL] | 1224 | 1239 | 49.4 | $0.05 | 17.2% |
| 71 | GPT OSS 20b | 1220 | 1235 | 53.8 | $0.11 | 21.9% |
| 72 | GLM 4.7 Flash | 1210 | 1226 | 55.8 | $0.01 | 22.4% |
| 73 | Nemotron 3 Nano 30b A3b | 1192 | 1207 | 52.3 | $0.09 | 18.3% |
| 74 | Qwen3 Coder 30b [Q4_K_M] | 1133 | 1152 | 52.1 | $0.01 | 12.5% |
ELO Distribution