APEX

GPT 5.2 Codex

OpenRouter

400K context · $1.75/M input · $14.00/M output
ELO 1753 (peak 1772)

Avg Score: 78.0
Avg Cost: $0.18
Score/$: 424.5
Runs: 122

[Charts: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

code-review (hard): 2301
frontend (hard): 2126
from-scratch (expert): 2072
from-scratch (hard): 1993
multi-language (expert): 1924
backend (easy): 1923
backend (expert): 1915
from-scratch (easy): 1901
debugging (expert): 1900
from-scratch (medium): 1896
full-stack (medium): 1865
from-scratch: 1859
refactoring (expert): 1822
frontend (medium): 1821
backend (hard): 1813
full-stack: 1810
full-stack (hard): 1810
debugging (hard): 1780
backend: 1772
debugging: 1762
code-review: 1757
frontend: 1747
frontend (expert): 1743
code-review (medium): 1732
debugging (medium): 1732
multi-language: 1699
backend (medium): 1696
multi-language (hard): 1695
refactoring (medium): 1667
refactoring: 1662
frontend (easy): 1572

All Results

Task | Category | Score
Write complex SQL report with window functions | backend | 75.1
Debug and fix 6 broken database triggers and constraints | debugging | 93.0
Migrate callback-hell Express app to async/await | refactoring | 68.2
Build distributed node cluster with gossip protocol | from-scratch | 42.5
Harden insecure Docker setup with 12 vulnerabilities | code-review | 89.3
Build codebase indexer for LLM context windows | from-scratch | 59.3
Replace console.log with structured logging | refactoring | 64.7
Fix data integrity bugs in denormalized e-commerce schema | debugging | 93.7
Add caching layer to eliminate slow SSR page loads | full-stack | 81.6
Fix auth bypass vulnerability | debugging | 92.2
Fix race conditions in order matching engine | backend | 92.5
Fix hallucination and context window bugs in RAG agent | backend | 61.1
Optimize slow Postgres queries in Flask app | backend | 89.3
Fix broken GitHub Actions CI pipeline | debugging | 79.7
Fix deadlocking transaction patterns in Flask app | backend | 74.5
Add i18n with locale routing to Next.js app | full-stack | 81.8
Fix 12 WCAG accessibility violations in checkout form | frontend | 87.7
Find and patch all OWASP Top 10 vulnerabilities | debugging | 74.7
Build real-time portfolio risk calculator | backend | 67.8
Implement zero-trust API authentication layer | backend | 77.9
Add streaming SSE endpoint for LLM chat | backend | 84.8
Remove AI slop and over-engineering from codebase | refactoring | 89.3
Optimize bloated React bundle under 500KB | frontend | 78.7
Add Redis caching layer to Express API | backend | 84.6
Implement background job scheduler with persistence | backend | 80.0
Build materialized view refresh pipeline for analytics | backend | 78.7
Build CLI tool with subcommands and config | from-scratch | 76.6
Zero-downtime schema migration | full-stack | 74.4
Add file upload with S3 presigned URLs | backend | 87.5
Split 1100-line god file into proper modules | refactoring | 60.3
Build MCP server for database management | backend | 89.5
Write integration tests for payment flow | code-review | 88.3
Add WebSocket real-time updates | full-stack | 83.7
Implement multi-tenant row-level security in Postgres | backend | 88.4
Build SaaS admin dashboard from scratch | from-scratch | 76.0
Build terminal UI dashboard | from-scratch | 66.8
Implement JWT auth middleware | backend | 53.3
Add slash commands and moderation to Discord bot | backend | 84.0
Fix N+1 query in dashboard | backend | 78.2
Implement transformer inference engine with KV cache | from-scratch | 88.2
Fix React hydration mismatch | frontend | 80.5
Fix memory leak in event handler | debugging | 89.3
Build RAG pipeline with vector search | backend | 43.5
Build LLM evaluation harness with structured grading | backend | 81.3
Write Kubernetes manifests for Node.js microservice | full-stack | 93.7
Add retry logic and dead letter queue to Python task queue | backend | 86.9
Add virtual scrolling to table rendering 5000 rows | frontend | 87.8
Implement Stripe webhook handler | backend | 85.8
Add GraphQL layer over REST API | multi-language | 41.0
Add cursor-based pagination to REST API | backend | 75.5
Fix flaky test suite | debugging | 93.3
Find and fix 4 hidden backdoors in Flask app | debugging | 88.3
Dockerize Node.js monorepo | full-stack | 76.5
Build REST API from scratch | from-scratch | 66.2
Code review: identify security vulns | code-review | 82.0
Convert React app to PWA with offline support | frontend | 82.5
Fix Node.js stream backpressure causing OOM on large files | backend | 89.2
Build production website with auth and members area | frontend | 64.8
Add Google OAuth2 login to Express app | full-stack | 84.1
Port Python CLI to Rust | multi-language | 67.4
Add rate limiting middleware | backend | 72.2
Debug race condition in worker pool | debugging | 92.0
Write tests for untested legacy Flask service | code-review | 72.3
Refactor monolithic handler to CQRS | refactoring | 60.5
Fix broken responsive layout | frontend | 76.5
Build codebase indexer for LLM context windows | from-scratch | 42.2
Implement JWT auth middleware | backend | 88.5
Replace console.log with structured logging | refactoring | 64.3
Fix broken responsive layout | frontend | 72.2
Implement zero-trust API authentication layer | backend | 84.4
Remove AI slop and over-engineering from codebase | refactoring | 88.0
Add streaming SSE endpoint for LLM chat | backend | 77.7
Harden insecure Docker setup with 12 vulnerabilities | code-review | 87.3
Add file upload with S3 presigned URLs | backend | 75.6
Find and patch all OWASP Top 10 vulnerabilities | debugging | 76.1
Optimize bloated React bundle under 500KB | frontend | 82.8
Add i18n with locale routing to Next.js app | full-stack | 78.8
Dockerize Node.js monorepo | full-stack | 78.5
Add caching layer to eliminate slow SSR page loads | full-stack | 92.3
Convert React app to PWA with offline support | frontend | 82.5
Implement multi-tenant row-level security in Postgres | backend | 82.8
Split 1100-line god file into proper modules | refactoring | 78.3
Write Kubernetes manifests for Node.js microservice | full-stack | 93.7
Implement background job scheduler with persistence | backend | 80.8
Build production website with auth and members area | frontend | 68.0
Build SaaS admin dashboard from scratch | from-scratch | 76.3
Build real-time portfolio risk calculator | backend | 73.0
Build MCP server for database management | backend | 84.1
Implement transformer inference engine with KV cache | from-scratch | 89.0
Fix race conditions in order matching engine | backend | 87.0
Fix hallucination and context window bugs in RAG agent | backend | 75.2
Build CLI tool with subcommands and config | from-scratch | 33.8
Build LLM evaluation harness with structured grading | backend | 67.5
Build materialized view refresh pipeline for analytics | backend | 75.3
Write complex SQL report with window functions | backend | 81.2
Fix deadlocking transaction patterns in Flask app | backend | 67.9
Fix data integrity bugs in denormalized e-commerce schema | debugging | 63.0
Debug and fix 6 broken database triggers and constraints | debugging | 85.3
Fix 12 WCAG accessibility violations in checkout form | frontend | 89.2
Find and fix 4 hidden backdoors in Flask app | debugging | 90.3
Write tests for untested legacy Flask service | code-review | 63.9
Add Google OAuth2 login to Express app | full-stack | 79.5
Add slash commands and moderation to Discord bot | backend | 79.7
Add retry logic and dead letter queue to Python task queue | backend | 80.9
Optimize slow Postgres queries in Flask app | backend | 88.8
Add GraphQL layer over REST API | multi-language | 74.1
Build distributed node cluster with gossip protocol | from-scratch | 81.0
Add virtual scrolling to table rendering 5000 rows | frontend | 88.5
Fix Node.js stream backpressure causing OOM on large files | backend | 87.2
Fix React hydration mismatch | frontend | 79.5
Write integration tests for payment flow | code-review | 67.0
Fix auth bypass vulnerability | debugging | 92.1
Add rate limiting middleware | backend | 81.7
Fix flaky test suite | debugging | 89.3
Zero-downtime schema migration | full-stack | 61.4
Refactor monolithic handler to CQRS | refactoring | 68.6
Add cursor-based pagination to REST API | backend | 64.7
Build terminal UI dashboard | from-scratch | 62.0
Fix N+1 query in dashboard | backend | 75.5
Fix memory leak in event handler | debugging | 82.2
Build REST API from scratch | from-scratch | 88.8
Debug race condition in worker pool | debugging | 94.7