APEX

GLM 4.6

Z.ai

200K context, $0.60/M input, $2.20/M output
1357 (peak 1373)
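
The per-million-token prices listed above translate into a per-run cost as in the following minimal sketch. The token counts are hypothetical and chosen only for illustration; the sketch also assumes plain per-token billing with no caching discounts or other adjustments, which the page does not confirm.

```python
# Listed prices for GLM 4.6, in US dollars per million tokens.
INPUT_PRICE_PER_M = 0.60
OUTPUT_PRICE_PER_M = 2.20


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one run at the listed per-million-token prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical token counts, not taken from the benchmark.
example_cost = run_cost(input_tokens=50_000, output_tokens=20_000)
print(f"${example_cost:.3f}")
```

Because output tokens cost roughly 3.7 times as much as input tokens at these prices, runs that generate a lot of code will tend to be dominated by the output term.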

Avg Score: 64.4
Avg Cost: $0.08
Score/$: 797.3
Runs: 65
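
The page does not define Score/$. One plausible reading, assumed here, is average score divided by average cost per run; it could equally be a mean of per-run ratios. Under the assumed reading, the sketch below shows why the displayed figures do not divide exactly: the displayed cost is rounded to cents.

```python
avg_score = 64.4          # Avg Score as displayed
avg_cost_shown = 0.08     # Avg Cost as displayed (rounded to cents)
score_per_dollar = 797.3  # Score/$ as displayed

# Dividing by the rounded cost overshoots the displayed ratio.
naive_ratio = avg_score / avg_cost_shown

# Unrounded cost that would reproduce the displayed ratio,
# under the assumed definition Score/$ = Avg Score / Avg Cost.
implied_cost = avg_score / score_per_dollar

print(f"naive ratio:  {naive_ratio:.1f}")
print(f"implied cost: ${implied_cost:.4f}")
```

If the assumed definition holds, the average cost per run is closer to 8.1 cents than to a flat 8 cents.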

[Charts not captured: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
from-scratch | expert | 1704
frontend | hard | 1623
code-review | hard | 1605
full-stack | medium | 1489
debugging | hard | 1487
debugging | expert | 1440
full-stack | - | 1430
debugging | - | 1405
frontend | expert | 1403
multi-language | hard | 1402
frontend | - | 1397
full-stack | hard | 1375
frontend | medium | 1369
backend | medium | 1366
refactoring | medium | 1359
multi-language | - | 1355
refactoring | - | 1347
code-review | - | 1332
backend | - | 1319
backend | expert | 1295
backend | hard | 1293
from-scratch | - | 1291
code-review | medium | 1259
frontend | easy | 1220
from-scratch | hard | 1198
from-scratch | easy | 1179
multi-language | expert | 1115
debugging | medium | 1106
from-scratch | medium | 794
refactoring | expert | 653
backend | easy | 250

All Results

Task | Category | Score
Remove AI slop and over-engineering from codebase | refactoring | 45.0
Add Redis caching layer to Express API | backend | 72.3
Find and patch all OWASP Top 10 vulnerabilities | debugging | 71.1
Add Google OAuth2 login to Express app | full-stack | 72.8
Convert React app to PWA with offline support | frontend | 70.7
Implement background job scheduler with persistence | backend | 41.1
Build distributed node cluster with gossip protocol | from-scratch | 38.5
Build LLM evaluation harness with structured grading | backend | 36.4
Build RAG pipeline with vector search | backend | 44.6
Add WebSocket real-time updates | full-stack | 75.8
Fix React hydration mismatch | frontend | 84.5
Build production website with auth and members area | frontend | 66.0
Debug race condition in worker pool | debugging | 87.3
Write complex SQL report with window functions | backend | 76.1
Build codebase indexer for LLM context windows | from-scratch | 50.6
Fix data integrity bugs in denormalized e-commerce schema | debugging | 86.1
Fix auth bypass vulnerability | debugging | 89.5
Fix 12 WCAG accessibility violations in checkout form | frontend | 80.4
Implement Stripe webhook handler | backend | 73.0
Optimize slow Postgres queries in Flask app | backend | 52.0
Debug and fix 6 broken database triggers and constraints | debugging | 59.6
Implement multi-tenant row-level security in Postgres | backend | 65.5
Fix broken GitHub Actions CI pipeline | debugging | 76.8
Build terminal UI dashboard | from-scratch | 53.1
Add rate limiting middleware | backend | 37.6
Add i18n with locale routing to Next.js app | full-stack | 69.0
Build MCP server for database management | backend | 76.8
Fix broken responsive layout | frontend | 69.1
Add cursor-based pagination to REST API | backend | 81.6
Write tests for untested legacy Flask service | code-review | 41.1
Fix deadlocking transaction patterns in Flask app | backend | 71.9
Build real-time portfolio risk calculator | backend | 62.6
Fix race conditions in order matching engine | backend | 59.3
Write integration tests for payment flow | code-review | 75.3
Fix hallucination and context window bugs in RAG agent | backend | 65.1
Add caching layer to eliminate slow SSR page loads | full-stack | 82.0
Zero-downtime schema migration | full-stack | 65.3
Build materialized view refresh pipeline for analytics | backend | 50.4
Fix flaky test suite | debugging | 51.2
Add retry logic and dead letter queue to Python task queue | backend | 57.8
Split 1100-line god file into proper modules | refactoring | 63.4
Add streaming SSE endpoint for LLM chat | backend | 80.5
Add slash commands and moderation to Discord bot | backend | 63.2
Build SaaS admin dashboard from scratch | from-scratch | 47.3
Migrate callback-hell Express app to async/await | refactoring | 55.7
Optimize bloated React bundle under 500KB | frontend | 61.2
Harden insecure Docker setup with 12 vulnerabilities | code-review | 75.1
Build CLI tool with subcommands and config | from-scratch | 53.2
Find and fix 4 hidden backdoors in Flask app | debugging | 78.0
Implement zero-trust API authentication layer | backend | 68.5
Add virtual scrolling to table rendering 5000 rows | frontend | 42.1
Fix Node.js stream backpressure causing OOM on large files | backend | 85.5
Build REST API from scratch | from-scratch | 73.4
Dockerize Node.js monorepo | full-stack | 71.8
Add file upload with S3 presigned URLs | backend | 51.4
Replace console.log with structured logging | refactoring | 55.7
Fix memory leak in event handler | debugging | 74.5
Write Kubernetes manifests for Node.js microservice | full-stack | 85.8
Port Python CLI to Rust | multi-language | 46.7
Implement transformer inference engine with KV cache | from-scratch | 80.7
Code review: identify security vulns | code-review | 72.4
Refactor monolithic handler to CQRS | refactoring | 45.8
Fix N+1 query in dashboard | backend | 56.5
Implement JWT auth middleware | backend | 41.5
Add GraphQL layer over REST API | multi-language | 72.8
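
As a consistency check, the scores listed under All Results can be averaged and compared with the reported Avg Score and Runs. The sketch below assumes Avg Score is the unweighted mean of the per-task scores, which the page does not state; the scores are transcribed in the order they appear above.

```python
from statistics import fmean

# Per-task scores from All Results, in listed order.
scores = [
    45.0, 72.3, 71.1, 72.8, 70.7, 41.1, 38.5, 36.4, 44.6, 75.8,
    84.5, 66.0, 87.3, 76.1, 50.6, 86.1, 89.5, 80.4, 73.0, 52.0,
    59.6, 65.5, 76.8, 53.1, 37.6, 69.0, 76.8, 69.1, 81.6, 41.1,
    71.9, 62.6, 59.3, 75.3, 65.1, 82.0, 65.3, 50.4, 51.2, 57.8,
    63.4, 80.5, 63.2, 47.3, 55.7, 61.2, 75.1, 53.2, 78.0, 68.5,
    42.1, 85.5, 73.4, 71.8, 51.4, 55.7, 74.5, 85.8, 46.7, 80.7,
    72.4, 45.8, 56.5, 41.5, 72.8,
]

run_count = len(scores)
mean_score = fmean(scores)  # unweighted mean (assumed definition)

print(f"runs: {run_count}")
print(f"mean score: {mean_score:.1f}")
print(f"range: {min(scores)} to {max(scores)}")
```

Under that assumption the listed results reproduce both summary figures: 65 runs and a mean of 64.4.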