APEX

GLM 5

Z.ai

200K context | $1.00/M input | $3.20/M output
ELO: 1632 (peak 1642)

Avg Score: 73.1
Avg Cost: $0.15
Score/$: 492.5
Runs: 124
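
The page does not define how Score/$ is derived. A minimal sketch, assuming it is simply the average score divided by the average cost per run (the helper name `score_per_dollar` is hypothetical, not part of APEX): the rounded figures shown give about 487.3 rather than 492.5, which would be consistent with the displayed $0.15 being a rounded value. It could equally be an average of per-run ratios, which cannot be checked from this page.

```python
def score_per_dollar(avg_score: float, avg_cost_usd: float) -> float:
    """Hypothetical helper: score points per dollar spent (assumed definition)."""
    if avg_cost_usd <= 0:
        raise ValueError("avg_cost_usd must be positive")
    return avg_score / avg_cost_usd


# Values exactly as displayed on the page (both are rounded).
displayed_ratio = score_per_dollar(73.1, 0.15)
print(round(displayed_ratio, 1))  # 487.3

# Average cost that would reproduce the reported 492.5 under this assumption.
implied_cost = 73.1 / 492.5
print(round(implied_cost, 4))  # 0.1484, which rounds to the displayed $0.15
```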

[Charts not captured: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

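| Category | Difficulty | ELO |
|---|---|---|
| frontend | expert | 2287 |
| backend | easy | 2265 |
| from-scratch | expert | 2072 |
| from-scratch | medium | 2031 |
| multi-language | hard | 2003 |
| refactoring | expert | 1853 |
| full-stack | hard | 1814 |
| from-scratch | hard | 1788 |
| from-scratch | | 1782 |
| full-stack | | 1742 |
| full-stack | medium | 1696 |
| backend | hard | 1692 |
| debugging | medium | 1684 |
| backend | medium | 1674 |
| backend | | 1671 |
| from-scratch | easy | 1659 |
| frontend | hard | 1655 |
| frontend | | 1634 |
| backend | expert | 1626 |
| frontend | easy | 1625 |
| debugging | hard | 1624 |
| frontend | medium | 1597 |
| debugging | | 1583 |
| debugging | expert | 1578 |
| refactoring | | 1520 |
| refactoring | medium | 1481 |
| multi-language | | 1477 |
| code-review | medium | 1396 |
| code-review | | 1379 |
| code-review | hard | 1279 |
| multi-language | expert | 1269 |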

All Results

| Task | Category | Score |
|---|---|---|
| Build codebase indexer for LLM context windows | from-scratch | 50.1 |
| Fix N+1 query in dashboard | backend | 57.2 |
| Fix deadlocking transaction patterns in Flask app | backend | 70.0 |
| Fix 12 WCAG accessibility violations in checkout form | frontend | 72.6 |
| Add rate limiting middleware | backend | 82.5 |
| Convert React app to PWA with offline support | frontend | 60.5 |
| Dockerize Node.js monorepo | full-stack | 77.0 |
| Fix flaky test suite | debugging | 86.7 |
| Add Redis caching layer to Express API | backend | 85.8 |
| Optimize bloated React bundle under 500KB | frontend | 83.0 |
| Debug and fix 6 broken database triggers and constraints | debugging | 88.8 |
| Fix broken GitHub Actions CI pipeline | debugging | 82.9 |
| Replace console.log with structured logging | refactoring | 57.2 |
| Refactor monolithic handler to CQRS | refactoring | 81.0 |
| Fix data integrity bugs in denormalized e-commerce schema | debugging | 79.7 |
| Add Google OAuth2 login to Express app | full-stack | 84.6 |
| Port Python CLI to Rust | multi-language | 34.3 |
| Build CLI tool with subcommands and config | from-scratch | 68.3 |
| Build terminal UI dashboard | from-scratch | 56.4 |
| Fix Node.js stream backpressure causing OOM on large files | backend | 89.8 |
| Fix memory leak in event handler | debugging | 82.4 |
| Harden insecure Docker setup with 12 vulnerabilities | code-review | 73.3 |
| Zero-downtime schema migration | full-stack | 81.9 |
| Add retry logic and dead letter queue to Python task queue | backend | 76.5 |
| Add file upload with S3 presigned URLs | backend | 85.2 |
| Debug race condition in worker pool | debugging | 91.8 |
| Add virtual scrolling to table rendering 5000 rows | frontend | 82.4 |
| Add caching layer to eliminate slow SSR page loads | full-stack | 82.0 |
| Implement transformer inference engine with KV cache | from-scratch | 85.5 |
| Implement zero-trust API authentication layer | backend | 52.8 |
| Build MCP server for database management | backend | 83.6 |
| Fix hallucination and context window bugs in RAG agent | backend | 64.8 |
| Add streaming SSE endpoint for LLM chat | backend | 84.2 |
| Migrate callback-hell Express app to async/await | refactoring | 69.2 |
| Build LLM evaluation harness with structured grading | backend | 79.7 |
| Fix race conditions in order matching engine | backend | 79.2 |
| Add WebSocket real-time updates | full-stack | 79.5 |
| Remove AI slop and over-engineering from codebase | refactoring | 84.5 |
| Write Kubernetes manifests for Node.js microservice | full-stack | 84.8 |
| Build REST API from scratch | from-scratch | 77.3 |
| Add i18n with locale routing to Next.js app | full-stack | 73.5 |
| Write complex SQL report with window functions | backend | 62.6 |
| Fix auth bypass vulnerability | debugging | 88.8 |
| Implement JWT auth middleware | backend | 53.0 |
| Add cursor-based pagination to REST API | backend | 81.8 |
| Optimize slow Postgres queries in Flask app | backend | 88.0 |
| Add GraphQL layer over REST API | multi-language | 80.9 |
| Implement multi-tenant row-level security in Postgres | backend | 55.2 |
| Find and fix 4 hidden backdoors in Flask app | debugging | 91.3 |
| Build distributed node cluster with gossip protocol | from-scratch | 70.8 |
| Implement Stripe webhook handler | backend | 78.3 |
| Build real-time portfolio risk calculator | backend | 71.0 |
| Find and patch all OWASP Top 10 vulnerabilities | debugging | 70.0 |
| Build materialized view refresh pipeline for analytics | backend | 70.3 |
| Fix broken responsive layout | frontend | 77.0 |
| Build production website with auth and members area | frontend | 72.1 |
| Split 1100-line god file into proper modules | refactoring | 60.5 |
| Add slash commands and moderation to Discord bot | backend | 74.3 |
| Implement background job scheduler with persistence | backend | 71.7 |
| Build RAG pipeline with vector search | backend | 60.1 |
| Fix React hydration mismatch | frontend | 82.2 |
| Write tests for untested legacy Flask service | code-review | 59.5 |
| Code review: identify security vulns | code-review | 27.1 |
| Build SaaS admin dashboard from scratch | from-scratch | 61.1 |
| Write integration tests for payment flow | code-review | 71.2 |
| Implement zero-trust API authentication layer | backend | 71.8 |
| Optimize bloated React bundle under 500KB | frontend | 75.0 |
| Add caching layer to eliminate slow SSR page loads | full-stack | 81.5 |
| Write Kubernetes manifests for Node.js microservice | full-stack | 88.4 |
| Dockerize Node.js monorepo | full-stack | 83.5 |
| Find and patch all OWASP Top 10 vulnerabilities | debugging | 73.7 |
| Harden insecure Docker setup with 12 vulnerabilities | code-review | 78.8 |
| Convert React app to PWA with offline support | frontend | 3.0 |
| Fix broken responsive layout | frontend | 77.2 |
| Remove AI slop and over-engineering from codebase | refactoring | 82.9 |
| Build codebase indexer for LLM context windows | from-scratch | 49.8 |
| Replace console.log with structured logging | refactoring | 47.0 |
| Implement multi-tenant row-level security in Postgres | backend | 75.2 |
| Split 1100-line god file into proper modules | refactoring | 66.3 |
| Implement JWT auth middleware | backend | 79.2 |
| Add i18n with locale routing to Next.js app | full-stack | 75.0 |
| Add streaming SSE endpoint for LLM chat | backend | 78.4 |
| Build CLI tool with subcommands and config | from-scratch | 68.1 |
| Build production website with auth and members area | frontend | 74.3 |
| Build MCP server for database management | backend | 82.4 |
| Implement background job scheduler with persistence | backend | 45.5 |
| Build SaaS admin dashboard from scratch | from-scratch | 75.8 |
| Implement transformer inference engine with KV cache | from-scratch | 87.7 |
| Add cursor-based pagination to REST API | backend | 71.2 |
| Fix data integrity bugs in denormalized e-commerce schema | debugging | 77.6 |
| Add retry logic and dead letter queue to Python task queue | backend | 43.3 |
| Write complex SQL report with window functions | backend | 83.5 |
| Add slash commands and moderation to Discord bot | backend | 69.1 |
| Build LLM evaluation harness with structured grading | backend | 63.5 |
| Fix hallucination and context window bugs in RAG agent | backend | 57.0 |
| Implement Stripe webhook handler | backend | 79.7 |
| Build real-time portfolio risk calculator | backend | 74.0 |
| Debug and fix 6 broken database triggers and constraints | debugging | 80.7 |
| Fix deadlocking transaction patterns in Flask app | backend | 73.0 |
| Find and fix 4 hidden backdoors in Flask app | debugging | 91.7 |
| Write tests for untested legacy Flask service | code-review | 45.9 |
| Add Redis caching layer to Express API | backend | 82.2 |
| Optimize slow Postgres queries in Flask app | backend | 86.7 |
| Fix broken GitHub Actions CI pipeline | debugging | 93.3 |
| Add virtual scrolling to table rendering 5000 rows | frontend | 81.6 |
| Fix 12 WCAG accessibility violations in checkout form | frontend | 82.5 |
| Fix Node.js stream backpressure causing OOM on large files | backend | 89.8 |
| Build distributed node cluster with gossip protocol | from-scratch | 60.9 |
| Fix flaky test suite | debugging | 82.5 |
| Write integration tests for payment flow | code-review | 37.2 |
| Fix auth bypass vulnerability | debugging | 93.7 |
| Add GraphQL layer over REST API | multi-language | 75.5 |
| Zero-downtime schema migration | full-stack | 72.2 |
| Add rate limiting middleware | backend | 87.5 |
| Fix N+1 query in dashboard | backend | 88.4 |
| Fix React hydration mismatch | frontend | 69.5 |
| Build terminal UI dashboard | from-scratch | 70.9 |
| Refactor monolithic handler to CQRS | refactoring | 66.0 |
| Code review: identify security vulns | code-review | 65.5 |
| Fix memory leak in event handler | debugging | 60.9 |
| Add WebSocket real-time updates | full-stack | 78.2 |
| Debug race condition in worker pool | debugging | 90.3 |
| Fix race conditions in order matching engine | backend | 69.8 |
| Build REST API from scratch | from-scratch | 67.9 |