APEX

GLM 4.5 Air

Z.ai

131K context | $0.20/M input | $1.10/M output
1361 (peak 1388)

Avg Score: 59.9
Avg Cost: $0.03
Score/$: 1869.9
Runs: 117
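The page does not say how Score/$ is derived. As a rough check, assuming it is simply average score divided by average cost in dollars (an assumption, and the site may instead average per-run ratios), the sketch below shows that the displayed $0.03 is likely a rounded figure. The function names are hypothetical, chosen only for illustration.

```python
def score_per_dollar(avg_score: float, avg_cost_usd: float) -> float:
    """Assumed definition: average score divided by average cost in USD."""
    return avg_score / avg_cost_usd


def implied_avg_cost(avg_score: float, ratio: float) -> float:
    """Invert the assumed definition to recover the unrounded average cost."""
    return avg_score / ratio


# Values as displayed on the page.
AVG_SCORE = 59.9
DISPLAYED_COST = 0.03
DISPLAYED_RATIO = 1869.9

# Using the rounded cost overshoots the displayed ratio.
naive_ratio = score_per_dollar(AVG_SCORE, DISPLAYED_COST)

# Working backwards gives a cost that still rounds to $0.03.
cost = implied_avg_cost(AVG_SCORE, DISPLAYED_RATIO)

print(f"ratio from rounded cost: {naive_ratio:.1f}")
print(f"implied average cost:    ${cost:.4f}")
```

Under that assumption the implied average cost is about $0.032 per run, which is consistent with the displayed $0.03 after rounding to two decimals.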

[Charts not captured: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category (difficulty) | ELO
refactoring (expert) | 2244
from-scratch (expert) | 1809
refactoring | 1636
refactoring (medium) | 1605
backend (easy) | 1562
debugging (medium) | 1508
from-scratch (easy) | 1474
frontend (easy) | 1457
backend (medium) | 1443
debugging (hard) | 1440
frontend (hard) | 1434
debugging | 1417
debugging (expert) | 1367
code-review (hard) | 1367
backend | 1360
frontend (medium) | 1342
frontend | 1302
code-review | 1297
backend (expert) | 1281
backend (hard) | 1280
full-stack | 1248
code-review (medium) | 1247
from-scratch | 1240
full-stack (medium) | 1235
full-stack (hard) | 1232
from-scratch (hard) | 1138
multi-language | 1033
from-scratch (medium) | 651
multi-language (hard) | 583
multi-language (expert) | 0
frontend (expert) | 0

All Results

Task | Category | Score
Write integration tests for payment flow | code-review | 72.9
Build production website with auth and members area | frontend | 25.9
Implement transformer inference engine with KV cache | from-scratch | 82.0
Find and patch all OWASP Top 10 vulnerabilities | debugging | 68.2
Add retry logic and dead letter queue to Python task queue | backend | 71.4
Add Redis caching layer to Express API | backend | 63.0
Remove AI slop and over-engineering from codebase | refactoring | 88.7
Implement Stripe webhook handler | backend | 72.2
Optimize bloated React bundle under 500KB | frontend | 70.5
Fix flaky test suite | debugging | 76.4
Write tests for untested legacy Flask service | code-review | 35.0
Fix Node.js stream backpressure causing OOM on large files | backend | 59.0
Build codebase indexer for LLM context windows | from-scratch | 39.5
Add rate limiting middleware | backend | 73.0
Add caching layer to eliminate slow SSR page loads | full-stack | 79.4
Build REST API from scratch | from-scratch | 71.2
Build terminal UI dashboard | from-scratch | 43.5
Add cursor-based pagination to REST API | backend | 71.5
Optimize slow Postgres queries in Flask app | backend | 79.2
Fix React hydration mismatch | frontend | 69.3
Add WebSocket real-time updates | full-stack | 67.2
Fix race conditions in order matching engine | backend | 71.5
Add GraphQL layer over REST API | multi-language | 56.9
Write complex SQL report with window functions | backend | 57.7
Implement background job scheduler with persistence | backend | 27.8
Add Google OAuth2 login to Express app | full-stack | 65.8
Convert React app to PWA with offline support | frontend | 65.8
Build CLI tool with subcommands and config | from-scratch | 36.7
Add slash commands and moderation to Discord bot | backend | 69.4
Fix N+1 query in dashboard | backend | 56.5
Add i18n with locale routing to Next.js app | full-stack | 56.2
Fix broken responsive layout | frontend | 67.4
Debug and fix 6 broken database triggers and constraints | debugging | 50.4
Fix auth bypass vulnerability | debugging | 78.0
Zero-downtime schema migration | full-stack | 64.1
Harden insecure Docker setup with 12 vulnerabilities | code-review | 78.3
Fix 12 WCAG accessibility violations in checkout form | frontend | 56.4
Fix hallucination and context window bugs in RAG agent | backend | 25.1
Refactor monolithic handler to CQRS | refactoring | 86.6
Implement JWT auth middleware | backend | 50.9
Build distributed node cluster with gossip protocol | from-scratch | 31.1
Port Python CLI to Rust | multi-language | 31.3
Build SaaS admin dashboard from scratch | from-scratch | 31.9
Build real-time portfolio risk calculator | backend | 33.9
Find and fix 4 hidden backdoors in Flask app | debugging | 74.6
Build LLM evaluation harness with structured grading | backend | 62.3
Fix deadlocking transaction patterns in Flask app | backend | 65.8
Add streaming SSE endpoint for LLM chat | backend | 79.2
Dockerize Node.js monorepo | full-stack | 67.0
Fix memory leak in event handler | debugging | 79.1
Build MCP server for database management | backend | 20.1
Build materialized view refresh pipeline for analytics | backend | 73.1
Add file upload with S3 presigned URLs | backend | 79.8
Build RAG pipeline with vector search | backend | 41.9
Add virtual scrolling to table rendering 5000 rows | frontend | 51.5
Implement zero-trust API authentication layer | backend | 67.5
Debug race condition in worker pool | debugging | 78.7
Fix data integrity bugs in denormalized e-commerce schema | debugging | 77.3
Migrate callback-hell Express app to async/await | refactoring | 62.5
Implement multi-tenant row-level security in Postgres | backend | 66.7
Fix broken GitHub Actions CI pipeline | debugging | 84.3
Build RAG pipeline with vector search | backend | 45.8
Implement multi-tenant row-level security in Postgres | backend | 30.3
Add caching layer to eliminate slow SSR page loads | full-stack | 27.7
Dockerize Node.js monorepo | full-stack | 59.5
Harden insecure Docker setup with 12 vulnerabilities | code-review | 67.0
Convert React app to PWA with offline support | frontend | 64.9
Build codebase indexer for LLM context windows | from-scratch | 37.5
Remove AI slop and over-engineering from codebase | refactoring | 73.7
Implement JWT auth middleware | backend | 71.9
Optimize bloated React bundle under 500KB | frontend | 53.8
Write Kubernetes manifests for Node.js microservice | full-stack | 70.0
Replace console.log with structured logging | refactoring | 68.9
Find and patch all OWASP Top 10 vulnerabilities | debugging | 62.1
Implement zero-trust API authentication layer | backend | 50.5
Add i18n with locale routing to Next.js app | full-stack | 59.4
Fix broken responsive layout | frontend | 58.1
Split 1100-line god file into proper modules | refactoring | 72.7
Implement background job scheduler with persistence | backend | 43.4
Build SaaS admin dashboard from scratch | from-scratch | 51.5
Implement transformer inference engine with KV cache | from-scratch | 76.8
Build MCP server for database management | backend | 54.5
Build CLI tool with subcommands and config | from-scratch | 39.8
Build production website with auth and members area | frontend | 33.8
Build real-time portfolio risk calculator | backend | 32.0
Build materialized view refresh pipeline for analytics | backend | 40.0
Fix Node.js stream backpressure causing OOM on large files | backend | 57.3
Build LLM evaluation harness with structured grading | backend | 61.6
Fix data integrity bugs in denormalized e-commerce schema | debugging | 57.6
Fix race conditions in order matching engine | backend | 63.0
Debug and fix 6 broken database triggers and constraints | debugging | 61.6
Fix 12 WCAG accessibility violations in checkout form | frontend | 77.6
Implement Stripe webhook handler | backend | 71.7
Add retry logic and dead letter queue to Python task queue | backend | 69.5
Find and fix 4 hidden backdoors in Flask app | debugging | 68.3
Optimize slow Postgres queries in Flask app | backend | 65.6
Write integration tests for payment flow | code-review | 67.5
Add virtual scrolling to table rendering 5000 rows | frontend | 59.9
Zero-downtime schema migration | full-stack | 61.0
Refactor monolithic handler to CQRS | refactoring | 65.1
Debug race condition in worker pool | debugging | 67.0
Fix React hydration mismatch | frontend | 76.0
Fix N+1 query in dashboard | backend | 86.1
Fix deadlocking transaction patterns in Flask app | backend | 59.8
Fix auth bypass vulnerability | debugging | 92.6
Fix flaky test suite | debugging | 55.3
Write complex SQL report with window functions | backend | 49.4
Write tests for untested legacy Flask service | code-review | 45.8
Build distributed node cluster with gossip protocol | from-scratch | 27.5
Add cursor-based pagination to REST API | backend | 48.3
Fix hallucination and context window bugs in RAG agent | backend | 50.4
Add slash commands and moderation to Discord bot | backend | 62.9
Add rate limiting middleware | backend | 65.7
Fix memory leak in event handler | debugging | 62.7
Code review: identify security vulns | code-review | 37.8
Build terminal UI dashboard | from-scratch | 44.9
Build REST API from scratch | from-scratch | 65.5