APEX
Grok 4

OpenRouter

256K context, $3.00/M input, $15.00/M output
ELO: 1594 (peak 1610)

Avg Score: 71.5
Avg Cost: $0.27
Score/$: 263.8
Runs: 121
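
The cost and Score/$ figures above can be related with a little arithmetic. The sketch below is illustrative only: the prices are the per-million-token rates listed in the header, while the token counts and the function names `run_cost` and `score_per_dollar` are hypothetical and not taken from APEX. It also assumes Score/$ is simply average score divided by average cost; dividing the rounded values shown here gives about 264.8, close to but not exactly the displayed 263.8, so the page presumably works from unrounded or per-run figures.

```python
# Prices copied from the model header (USD per million tokens).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one run from its token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def score_per_dollar(avg_score: float, avg_cost: float) -> float:
    """Score points per USD (assumed definition: average score / average cost)."""
    return avg_score / avg_cost


if __name__ == "__main__":
    # Hypothetical token counts, chosen only for illustration.
    print(f"cost: ${run_cost(20_000, 5_000):.3f}")
    # Rounded averages as displayed on this page.
    print(f"score/$: {score_per_dollar(71.5, 0.27):.1f}")
```

Because output tokens are priced at five times the input rate, a run's cost under this model is driven mostly by how much the model writes.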

Category ELOs

backend (easy): 2456
from-scratch (expert): 2072
from-scratch (medium): 1975
code-review (hard): 1948
debugging (expert): 1877
debugging (medium): 1754
multi-language (hard): 1744
full-stack (medium): 1705
from-scratch (hard): 1702
from-scratch: 1688
backend (medium): 1669
backend (expert): 1633
backend: 1632
frontend (easy): 1625
refactoring (expert): 1621
frontend (medium): 1614
refactoring (medium): 1607
full-stack: 1606
debugging: 1601
refactoring: 1589
from-scratch (easy): 1587
backend (hard): 1562
full-stack (hard): 1552
frontend: 1495
code-review: 1480
debugging (hard): 1425
code-review (medium): 1409
multi-language: 1380
frontend (expert): 1176
frontend (hard): 939
multi-language (expert): 417

All Results

Task | Category | Score
Implement background job scheduler with persistence | backend | 66.8
Build production website with auth and members area | frontend | 56.6
Fix N+1 query in dashboard | backend | 73.7
Port Python CLI to Rust | multi-language | 44.3
Debug and fix 6 broken database triggers and constraints | debugging | 93.0
Fix race conditions in order matching engine | backend | 80.4
Add slash commands and moderation to Discord bot | backend | 79.3
Add i18n with locale routing to Next.js app | full-stack | 69.7
Add file upload with S3 presigned URLs | backend | 84.4
Fix broken responsive layout | frontend | 68.4
Find and fix 4 hidden backdoors in Flask app | debugging | 89.0
Fix hallucination and context window bugs in RAG agent | backend | 63.0
Refactor monolithic handler to CQRS | refactoring | 57.4
Debug race condition in worker pool | debugging | 85.5
Add GraphQL layer over REST API | multi-language | 77.3
Fix broken GitHub Actions CI pipeline | debugging | 80.0
Remove AI slop and over-engineering from codebase | refactoring | 73.0
Code review: identify security vulns | code-review | 76.2
Replace console.log with structured logging | refactoring | 57.6
Implement multi-tenant row-level security in Postgres | backend | 72.6
Add cursor-based pagination to REST API | backend | 72.5
Add Google OAuth2 login to Express app | full-stack | 76.7
Optimize slow Postgres queries in Flask app | backend | 80.0
Fix deadlocking transaction patterns in Flask app | backend | 64.6
Add Redis caching layer to Express API | backend | 78.5
Add caching layer to eliminate slow SSR page loads | full-stack | 89.1
Harden insecure Docker setup with 12 vulnerabilities | code-review | 88.4
Implement transformer inference engine with KV cache | from-scratch | 88.8
Add streaming SSE endpoint for LLM chat | backend | 84.0
Write integration tests for payment flow | code-review | 82.5
Build MCP server for database management | backend | 77.8
Add retry logic and dead letter queue to Python task queue | backend | 79.0
Implement Stripe webhook handler | backend | 83.6
Migrate callback-hell Express app to async/await | refactoring | 56.8
Build real-time portfolio risk calculator | backend | 50.0
Dockerize Node.js monorepo | full-stack | 73.0
Write complex SQL report with window functions | backend | 69.9
Convert React app to PWA with offline support | frontend | 71.1
Add WebSocket real-time updates | full-stack | 72.6
Zero-downtime schema migration | full-stack | 74.1
Build RAG pipeline with vector search | backend | 53.8
Fix React hydration mismatch | frontend | 73.1
Fix flaky test suite | debugging | 93.2
Fix data integrity bugs in denormalized e-commerce schema | debugging | 91.6
Write Kubernetes manifests for Node.js microservice | full-stack | 78.3
Build codebase indexer for LLM context windows | from-scratch | 53.8
Fix memory leak in event handler | debugging | 74.4
Split 1100-line god file into proper modules | refactoring | 41.0
Fix auth bypass vulnerability | debugging | 71.3
Build CLI tool with subcommands and config | from-scratch | 48.5
Build distributed node cluster with gossip protocol | from-scratch | 35.1
Build LLM evaluation harness with structured grading | backend | 81.0
Build terminal UI dashboard | from-scratch | 47.9
Add rate limiting middleware | backend | 89.8
Build REST API from scratch | from-scratch | 78.1
Build materialized view refresh pipeline for analytics | backend | 75.5
Implement JWT auth middleware | backend | 47.7
Fix Node.js stream backpressure causing OOM on large files | backend | 87.0
Build SaaS admin dashboard from scratch | from-scratch | 56.5
Write tests for untested legacy Flask service | code-review | 36.3
Optimize bloated React bundle under 500KB | frontend | 79.1
Find and patch all OWASP Top 10 vulnerabilities | debugging | 63.1
Implement zero-trust API authentication layer | backend | 70.1
Add virtual scrolling to table rendering 5000 rows | frontend | 58.0
Fix 12 WCAG accessibility violations in checkout form | frontend | 71.3
Remove AI slop and over-engineering from codebase | refactoring | 87.3
Convert React app to PWA with offline support | frontend | 74.9
Harden insecure Docker setup with 12 vulnerabilities | code-review | 82.3
Replace console.log with structured logging | refactoring | 62.4
Add file upload with S3 presigned URLs | backend | 72.2
Build codebase indexer for LLM context windows | from-scratch | 50.5
Implement multi-tenant row-level security in Postgres | backend | 73.0
Fix broken responsive layout | frontend | 77.2
Dockerize Node.js monorepo | full-stack | 81.0
Add caching layer to eliminate slow SSR page loads | full-stack | 77.8
Split 1100-line god file into proper modules | refactoring | 78.5
Write Kubernetes manifests for Node.js microservice | full-stack | 82.5
Optimize bloated React bundle under 500KB | frontend | 84.8
Implement JWT auth middleware | backend | 85.1
Implement zero-trust API authentication layer | backend | 74.7
Add i18n with locale routing to Next.js app | full-stack | 69.2
Find and patch all OWASP Top 10 vulnerabilities | debugging | 75.8
Implement transformer inference engine with KV cache | from-scratch | 83.8
Add Google OAuth2 login to Express app | full-stack | 73.1
Fix hallucination and context window bugs in RAG agent | backend | 65.5
Implement background job scheduler with persistence | backend | 65.1
Build production website with auth and members area | frontend | 48.4
Fix data integrity bugs in denormalized e-commerce schema | debugging | 71.2
Build MCP server for database management | backend | 75.5
Fix race conditions in order matching engine | backend | 80.2
Build real-time portfolio risk calculator | backend | 72.4
Find and fix 4 hidden backdoors in Flask app | debugging | 73.7
Fix deadlocking transaction patterns in Flask app | backend | 66.6
Build CLI tool with subcommands and config | from-scratch | 50.7
Build SaaS admin dashboard from scratch | from-scratch | 71.0
Write complex SQL report with window functions | backend | 73.2
Build LLM evaluation harness with structured grading | backend | 62.5
Debug and fix 6 broken database triggers and constraints | debugging | 89.3
Write tests for untested legacy Flask service | code-review | 37.9
Add Redis caching layer to Express API | backend | 11.5
Add retry logic and dead letter queue to Python task queue | backend | 81.8
Optimize slow Postgres queries in Flask app | backend | 78.7
Add slash commands and moderation to Discord bot | backend | 82.5
Fix 12 WCAG accessibility violations in checkout form | frontend | 65.7
Build distributed node cluster with gossip protocol | from-scratch | 75.2
Add virtual scrolling to table rendering 5000 rows | frontend | 77.5
Fix Node.js stream backpressure causing OOM on large files | backend | 90.4
Write integration tests for payment flow | code-review | 75.2
Add GraphQL layer over REST API | multi-language | 75.4
Fix auth bypass vulnerability | debugging | 86.3
Add rate limiting middleware | backend | 81.7
Refactor monolithic handler to CQRS | refactoring | 64.2
Zero-downtime schema migration | full-stack | 53.5
Add cursor-based pagination to REST API | backend | 86.5
Fix flaky test suite | debugging | 73.3
Fix React hydration mismatch | frontend | 72.3
Fix memory leak in event handler | debugging | 53.3
Fix N+1 query in dashboard | backend | 67.5
Build terminal UI dashboard | from-scratch | 69.9
Debug race condition in worker pool | debugging | 82.3
Build REST API from scratch | from-scratch | 84.0