APEX
GLM 4.6

Z.ai

200K context · $0.60/M input · $2.20/M output
ELO 1441 (peak 1451)

Avg Score: 64.1
Avg Cost: $0.11
Score/$: 575.0
Runs: 123
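The cost and Score/$ figures above can be related with a little arithmetic. The following is a minimal sketch, not the site's actual method: the per-token prices come from the listing above, while the token counts, the function names `run_cost` and `score_per_dollar`, and the assumption that Score/$ is simply average score divided by average cost are all hypothetical.

```python
# Prices per million tokens, taken from the listing above.
INPUT_PRICE_PER_M = 0.60
OUTPUT_PRICE_PER_M = 2.20


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one run, given token counts (hypothetical helper)."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


def score_per_dollar(avg_score: float, avg_cost: float) -> float:
    """Assumed definition: average score divided by average cost."""
    return avg_score / avg_cost


# Hypothetical run: 100K input tokens and 25K output tokens.
example_cost = run_cost(100_000, 25_000)

# Using the rounded figures shown on the page.
ratio_from_rounded = score_per_dollar(64.1, 0.11)

# Average cost implied by the displayed Score/$ of 575.0, under the same assumption.
implied_avg_cost = 64.1 / 575.0

print(f"example run cost: ${example_cost:.3f}")
print(f"score/$ from rounded values: {ratio_from_rounded:.1f}")
print(f"implied unrounded avg cost: ${implied_avg_cost:.4f}")
```

Dividing the rounded values gives roughly 582.7 rather than the displayed 575.0. That gap would be consistent with the page dividing unrounded values (an average cost near $0.1115), though the page may also compute the ratio differently, for example per run.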

Charts (not captured in this extract): Win/Loss/Draw, Scoring Dimensions, Score Distribution

Category ELOs

Category | Difficulty | ELO
multi-language | expert | 1924
from-scratch | expert | 1733
frontend | expert | 1691
multi-language | - | 1686
multi-language | hard | 1656
frontend | hard | 1590
frontend | easy | 1555
debugging | expert | 1550
debugging | hard | 1536
from-scratch | hard | 1523
code-review | medium | 1512
frontend | - | 1484
full-stack | medium | 1480
refactoring | medium | 1477
code-review | - | 1472
backend | expert | 1463
debugging | - | 1462
full-stack | - | 1461
full-stack | hard | 1451
backend | medium | 1431
refactoring | - | 1428
frontend | medium | 1424
from-scratch | - | 1411
backend | - | 1405
code-review | hard | 1395
backend | hard | 1372
debugging | medium | 1191
from-scratch | medium | 1151
from-scratch | easy | 1052
refactoring | expert | 680
backend | easy | 115
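To give the ratings above some intuition, here is a minimal sketch of the standard Elo expected-score formula. It assumes the conventional 400-point scale; whether APEX computes its category ELOs this way is not stated on the page, and the opponent rating of 1500 and the function name `expected_score` are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


# Hypothetical opponent rated 1500, compared against two category ratings above.
OPPONENT = 1500
for label, rating in [("multi-language expert", 1924), ("backend hard", 1372)]:
    p = expected_score(rating, OPPONENT)
    print(f"{label} ({rating}) vs {OPPONENT}: expected score {p:.2f}")
```

Under this formula, equal ratings give an expected score of 0.5, and a 400-point lead corresponds to roughly a 10-to-1 advantage.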

All Results

Task | Category | Score
Remove AI slop and over-engineering from codebase | refactoring | 45.0
Add Redis caching layer to Express API | backend | 72.3
Find and patch all OWASP Top 10 vulnerabilities | debugging | 71.1
Add Google OAuth2 login to Express app | full-stack | 72.8
Convert React app to PWA with offline support | frontend | 70.7
Implement background job scheduler with persistence | backend | 41.1
Build distributed node cluster with gossip protocol | from-scratch | 28.0
Build LLM evaluation harness with structured grading | backend | 36.4
Build RAG pipeline with vector search | backend | 44.6
Add WebSocket real-time updates | full-stack | 78.2
Fix React hydration mismatch | frontend | 84.5
Build production website with auth and members area | frontend | 58.5
Debug race condition in worker pool | debugging | 87.3
Write complex SQL report with window functions | backend | 76.1
Build codebase indexer for LLM context windows | from-scratch | 50.6
Fix data integrity bugs in denormalized e-commerce schema | debugging | 86.1
Fix auth bypass vulnerability | debugging | 89.5
Fix 12 WCAG accessibility violations in checkout form | frontend | 80.4
Implement Stripe webhook handler | backend | 73.0
Optimize slow Postgres queries in Flask app | backend | 52.0
Debug and fix 6 broken database triggers and constraints | debugging | 59.6
Implement multi-tenant row-level security in Postgres | backend | 65.5
Fix broken GitHub Actions CI pipeline | debugging | 76.8
Build terminal UI dashboard | from-scratch | 51.1
Add rate limiting middleware | backend | 37.6
Add i18n with locale routing to Next.js app | full-stack | 69.0
Build MCP server for database management | backend | 76.8
Fix broken responsive layout | frontend | 69.1
Add cursor-based pagination to REST API | backend | 81.6
Write tests for untested legacy Flask service | code-review | 34.8
Fix deadlocking transaction patterns in Flask app | backend | 71.9
Build real-time portfolio risk calculator | backend | 62.6
Fix race conditions in order matching engine | backend | 59.3
Write integration tests for payment flow | code-review | 75.3
Fix hallucination and context window bugs in RAG agent | backend | 65.1
Add caching layer to eliminate slow SSR page loads | full-stack | 82.0
Zero-downtime schema migration | full-stack | 65.3
Build materialized view refresh pipeline for analytics | backend | 50.4
Fix flaky test suite | debugging | 51.2
Add retry logic and dead letter queue to Python task queue | backend | 57.8
Split 1100-line god file into proper modules | refactoring | 63.4
Add streaming SSE endpoint for LLM chat | backend | 80.5
Add slash commands and moderation to Discord bot | backend | 63.2
Build SaaS admin dashboard from scratch | from-scratch | 47.3
Migrate callback-hell Express app to async/await | refactoring | 55.7
Optimize bloated React bundle under 500KB | frontend | 61.2
Harden insecure Docker setup with 12 vulnerabilities | code-review | 75.1
Build CLI tool with subcommands and config | from-scratch | 44.4
Find and fix 4 hidden backdoors in Flask app | debugging | 78.0
Implement zero-trust API authentication layer | backend | 68.5
Add virtual scrolling to table rendering 5000 rows | frontend | 42.1
Fix Node.js stream backpressure causing OOM on large files | backend | 85.5
Build REST API from scratch | from-scratch | 73.4
Dockerize Node.js monorepo | full-stack | 71.8
Add file upload with S3 presigned URLs | backend | 51.4
Replace console.log with structured logging | refactoring | 55.7
Fix memory leak in event handler | debugging | 74.5
Write Kubernetes manifests for Node.js microservice | full-stack | 85.8
Port Python CLI to Rust | multi-language | 65.5
Implement transformer inference engine with KV cache | from-scratch | 80.7
Code review: identify security vulns | code-review | 72.4
Refactor monolithic handler to CQRS | refactoring | 45.8
Fix N+1 query in dashboard | backend | 56.5
Implement JWT auth middleware | backend | 41.5
Write tests for untested legacy Flask service | code-review | 41.1
Add WebSocket real-time updates | full-stack | 75.8
Port Python CLI to Rust | multi-language | 46.7
Implement multi-tenant row-level security in Postgres | backend | 67.2
Add file upload with S3 presigned URLs | backend | 63.1
Split 1100-line god file into proper modules | refactoring | 72.2
Find and patch all OWASP Top 10 vulnerabilities | debugging | 67.8
Implement JWT auth middleware | backend | 66.0
Dockerize Node.js monorepo | full-stack | 68.4
Add caching layer to eliminate slow SSR page loads | full-stack | 71.1
Implement zero-trust API authentication layer | backend | 56.8
Add i18n with locale routing to Next.js app | full-stack | 67.8
Fix broken responsive layout | frontend | 75.1
Remove AI slop and over-engineering from codebase | refactoring | 87.0
Convert React app to PWA with offline support | frontend | 43.3
Harden insecure Docker setup with 12 vulnerabilities | code-review | 88.5
Write Kubernetes manifests for Node.js microservice | full-stack | 77.8
Build codebase indexer for LLM context windows | from-scratch | 26.8
Replace console.log with structured logging | refactoring | 47.8
Optimize bloated React bundle under 500KB | frontend | 62.7
Build CLI tool with subcommands and config | from-scratch | 55.9
Implement background job scheduler with persistence | backend | 26.5
Build MCP server for database management | backend | 74.8
Implement transformer inference engine with KV cache | from-scratch | 47.9
Build SaaS admin dashboard from scratch | from-scratch | 62.9
Build production website with auth and members area | frontend | 67.2
Fix deadlocking transaction patterns in Flask app | backend | 59.8
Fix race conditions in order matching engine | backend | 74.7
Write complex SQL report with window functions | backend | 72.1
Build real-time portfolio risk calculator | backend | 53.5
Fix data integrity bugs in denormalized e-commerce schema | debugging | 50.6
Build LLM evaluation harness with structured grading | backend | 64.2
Debug and fix 6 broken database triggers and constraints | debugging | 57.4
Implement Stripe webhook handler | backend | 56.9
Add Redis caching layer to Express API | backend | 76.5
Find and fix 4 hidden backdoors in Flask app | debugging | 92.1
Add virtual scrolling to table rendering 5000 rows | frontend | 73.0
Add GraphQL layer over REST API | multi-language | 72.8
Fix auth bypass vulnerability | debugging | 92.9
Fix flaky test suite | debugging | 58.2
Fix Node.js stream backpressure causing OOM on large files | backend | 80.5
Fix memory leak in event handler | debugging | 60.3
Fix N+1 query in dashboard | backend | 62.0
Fix React hydration mismatch | frontend | 68.0
Code review: identify security vulns | code-review | 60.3
Debug race condition in worker pool | debugging | 82.3
Optimize slow Postgres queries in Flask app | backend | 53.6
Add slash commands and moderation to Discord bot | backend | 73.1
Add cursor-based pagination to REST API | backend | 71.1
Fix hallucination and context window bugs in RAG agent | backend | 72.5
Build REST API from scratch | from-scratch | 73.9
Build terminal UI dashboard | from-scratch | 51.0
Add rate limiting middleware | backend | 42.8
Write integration tests for payment flow | code-review | 69.5
Fix 12 WCAG accessibility violations in checkout form | frontend | 80.7
Zero-downtime schema migration | full-stack | 58.0
Build distributed node cluster with gossip protocol | from-scratch | 58.1
Add retry logic and dead letter queue to Python task queue | backend | 54.0
Refactor monolithic handler to CQRS | refactoring | 48.6