APEX

GLM 4.5

Z.ai

131K context, $0.60/M input, $2.20/M output
1352 (peak 1368)

Avg Score: 65.3
Avg Cost: $0.09
Score/$: 728.5
Runs: 65
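
The summary figures above are rounded for display, so they do not reproduce one another exactly. The sketch below assumes that Score/$ is the average score divided by the average cost per run (the page does not state the formula) and backs out the unrounded cost that this assumption would imply. The variable names are illustrative only.

```python
# Summary figures as displayed on the page (rounded for display).
avg_score = 65.3
avg_cost_usd = 0.09
reported_score_per_dollar = 728.5

# Assumption: Score/$ = average score / average cost per run.
naive_ratio = avg_score / avg_cost_usd

# Back out the unrounded average cost implied by the reported ratio.
implied_cost_usd = avg_score / reported_score_per_dollar

print(f"ratio from rounded cost: {naive_ratio:.1f}")
print(f"implied average cost:    ${implied_cost_usd:.4f}")
```

Under that assumption, the rounded cost gives a ratio a few points below the reported 728.5, and the implied cost still rounds to the displayed $0.09, so the figures are consistent with each other up to rounding.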

Charts (not reproduced): Win/Loss/Draw, Scoring Dimensions, Score Distribution

Category ELOs

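| Category | Difficulty | ELO |
|---|---|---|
| from-scratch | expert | 1931 |
| frontend | easy | 1668 |
| code-review | hard | 1655 |
| debugging | medium | 1523 |
| backend | medium | 1486 |
| debugging | hard | 1477 |
| debugging | - | 1410 |
| backend | - | 1384 |
| frontend | medium | 1381 |
| full-stack | medium | 1380 |
| frontend | - | 1374 |
| backend | hard | 1349 |
| code-review | - | 1328 |
| full-stack | - | 1322 |
| backend | expert | 1300 |
| refactoring | expert | 1295 |
| from-scratch | - | 1266 |
| debugging | expert | 1266 |
| from-scratch | easy | 1260 |
| full-stack | hard | 1252 |
| refactoring | - | 1241 |
| code-review | medium | 1239 |
| refactoring | medium | 1174 |
| from-scratch | hard | 1114 |
| frontend | expert | 1100 |
| frontend | hard | 1043 |
| multi-language | - | 1002 |
| multi-language | expert | 585 |
| backend | easy | 365 |
| from-scratch | medium | 176 |
| multi-language | hard | 0 |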

All Results

| Task | Category | Score |
|---|---|---|
| Implement background job scheduler with persistence | backend | 29.8 |
| Add caching layer to eliminate slow SSR page loads | full-stack | 76.8 |
| Write Kubernetes manifests for Node.js microservice | full-stack | 83.5 |
| Add virtual scrolling to table rendering 5000 rows | frontend | 72.9 |
| Build SaaS admin dashboard from scratch | from-scratch | 40.7 |
| Add cursor-based pagination to REST API | backend | 82.4 |
| Migrate callback-hell Express app to async/await | refactoring | 55.7 |
| Build codebase indexer for LLM context windows | from-scratch | 32.5 |
| Dockerize Node.js monorepo | full-stack | 68.8 |
| Fix memory leak in event handler | debugging | 70.9 |
| Build real-time portfolio risk calculator | backend | 54.9 |
| Implement transformer inference engine with KV cache | from-scratch | 84.0 |
| Write complex SQL report with window functions | backend | 77.4 |
| Write tests for untested legacy Flask service | code-review | 43.0 |
| Implement JWT auth middleware | backend | 80.1 |
| Fix race conditions in order matching engine | backend | 76.3 |
| Fix hallucination and context window bugs in RAG agent | backend | 34.7 |
| Convert React app to PWA with offline support | frontend | 67.7 |
| Build materialized view refresh pipeline for analytics | backend | 73.2 |
| Build LLM evaluation harness with structured grading | backend | 55.3 |
| Port Python CLI to Rust | multi-language | 38.0 |
| Zero-downtime schema migration | full-stack | 53.9 |
| Build production website with auth and members area | frontend | 60.1 |
| Fix 12 WCAG accessibility violations in checkout form | frontend | 72.0 |
| Implement Stripe webhook handler | backend | 72.7 |
| Build distributed node cluster with gossip protocol | from-scratch | 46.8 |
| Code review: identify security vulns | code-review | 70.3 |
| Build REST API from scratch | from-scratch | 75.7 |
| Fix deadlocking transaction patterns in Flask app | backend | 71.0 |
| Fix Node.js stream backpressure causing OOM on large files | backend | 84.5 |
| Find and patch all OWASP Top 10 vulnerabilities | debugging | 67.3 |
| Debug race condition in worker pool | debugging | 85.6 |
| Debug and fix 6 broken database triggers and constraints | debugging | 48.0 |
| Add Redis caching layer to Express API | backend | 77.4 |
| Add WebSocket real-time updates | full-stack | 73.1 |
| Write integration tests for payment flow | code-review | 76.5 |
| Fix N+1 query in dashboard | backend | 47.9 |
| Implement zero-trust API authentication layer | backend | 57.4 |
| Add i18n with locale routing to Next.js app | full-stack | 72.6 |
| Add streaming SSE endpoint for LLM chat | backend | 79.2 |
| Fix auth bypass vulnerability | debugging | 91.5 |
| Add rate limiting middleware | backend | 39.6 |
| Remove AI slop and over-engineering from codebase | refactoring | 77.9 |
| Find and fix 4 hidden backdoors in Flask app | debugging | 80.4 |
| Fix flaky test suite | debugging | 68.8 |
| Implement multi-tenant row-level security in Postgres | backend | 59.0 |
| Fix broken responsive layout | frontend | 76.8 |
| Fix broken GitHub Actions CI pipeline | debugging | 90.9 |
| Split 1100-line god file into proper modules | refactoring | 58.1 |
| Add Google OAuth2 login to Express app | full-stack | 30.4 |
| Add GraphQL layer over REST API | multi-language | 36.8 |
| Build RAG pipeline with vector search | backend | 77.7 |
| Add file upload with S3 presigned URLs | backend | 65.3 |
| Harden insecure Docker setup with 12 vulnerabilities | code-review | 74.7 |
| Optimize bloated React bundle under 500KB | frontend | 64.8 |
| Replace console.log with structured logging | refactoring | 47.8 |
| Build CLI tool with subcommands and config | from-scratch | 55.0 |
| Build MCP server for database management | backend | 78.2 |
| Fix data integrity bugs in denormalized e-commerce schema | debugging | 78.3 |
| Optimize slow Postgres queries in Flask app | backend | 66.0 |
| Add slash commands and moderation to Discord bot | backend | 75.5 |
| Add retry logic and dead letter queue to Python task queue | backend | 79.7 |
| Refactor monolithic handler to CQRS | refactoring | 63.5 |
| Build terminal UI dashboard | from-scratch | 40.8 |
| Fix React hydration mismatch | frontend | 78.0 |
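
The All Results rows can be cross-checked against the summary figures at the top of the page. The sketch below transcribes the scores from the table, grouped by category, and computes the run count, the overall mean, and a plain per-category mean. The name `SCORES_BY_CATEGORY` and the grouping are illustrative, not part of the page, and the per-category means are simple averages, not the page's ELO ratings.

```python
from statistics import mean

# Scores transcribed from the All Results table, grouped by category.
SCORES_BY_CATEGORY = {
    "backend": [29.8, 82.4, 54.9, 77.4, 80.1, 76.3, 34.7, 73.2,
                55.3, 72.7, 71.0, 84.5, 77.4, 47.9, 57.4, 79.2,
                39.6, 59.0, 77.7, 65.3, 78.2, 66.0, 75.5, 79.7],
    "full-stack": [76.8, 83.5, 68.8, 53.9, 73.1, 72.6, 30.4],
    "frontend": [72.9, 67.7, 60.1, 72.0, 76.8, 64.8, 78.0],
    "from-scratch": [40.7, 32.5, 84.0, 46.8, 75.7, 55.0, 40.8],
    "refactoring": [55.7, 77.9, 58.1, 47.8, 63.5],
    "debugging": [70.9, 67.3, 85.6, 48.0, 91.5, 80.4, 68.8, 90.9, 78.3],
    "code-review": [43.0, 70.3, 76.5, 74.7],
    "multi-language": [38.0, 36.8],
}

# Flatten to a single list to check the run count and the overall average.
all_scores = [s for scores in SCORES_BY_CATEGORY.values() for s in scores]
overall_mean = mean(all_scores)

# Plain arithmetic mean per category (not an ELO rating).
category_means = {cat: mean(scores) for cat, scores in SCORES_BY_CATEGORY.items()}

print(f"runs: {len(all_scores)}, mean score: {overall_mean:.1f}")
for cat, avg in sorted(category_means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cat:<15}{len(SCORES_BY_CATEGORY[cat]):>3} runs  mean {avg:.1f}")
```

The run count and overall mean agree with the Runs and Avg Score figures shown at the top of the page. Because these are raw score averages, their ordering need not match the Category ELOs.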