APEX

GPT 5.3 Codex

OpenRouter

400K context, $1.75/M input, $14.00/M output
1808 (peak 1839)

Avg Score: 80.2
Avg Cost: $0.28
Score/$: 285.8
Runs: 122

[Charts not reproduced: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

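| Category | Difficulty | ELO |
|---|---|---|
| from-scratch | medium | 3237 |
| backend | easy | 2506 |
| frontend | expert | 2458 |
| multi-language | expert | 2400 |
| refactoring | expert | 2370 |
| from-scratch | hard | 2224 |
| multi-language | hard | 2101 |
| from-scratch | expert | 2090 |
| frontend | hard | 2046 |
| from-scratch | | 2010 |
| frontend | medium | 2001 |
| full-stack | medium | 1909 |
| debugging | expert | 1900 |
| frontend | | 1892 |
| backend | medium | 1880 |
| backend | hard | 1877 |
| debugging | medium | 1865 |
| backend | | 1833 |
| multi-language | | 1829 |
| from-scratch | easy | 1819 |
| full-stack | | 1796 |
| code-review | hard | 1775 |
| debugging | | 1755 |
| refactoring | | 1743 |
| backend | expert | 1736 |
| full-stack | hard | 1734 |
| code-review | medium | 1698 |
| refactoring | medium | 1686 |
| debugging | hard | 1686 |
| code-review | | 1677 |
| frontend | easy | 1643 |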

All Results

| Task | Category | Score |
|---|---|---|
| Build codebase indexer for LLM context windows | from-scratch | 56.1 |
| Add file upload with S3 presigned URLs | backend | 84.2 |
| Build CLI tool with subcommands and config | from-scratch | 78.8 |
| Add virtual scrolling to table rendering 5000 rows | frontend | 87.3 |
| Replace console.log with structured logging | refactoring | 58.0 |
| Add WebSocket real-time updates | full-stack | 82.7 |
| Implement transformer inference engine with KV cache | from-scratch | 89.8 |
| Build distributed node cluster with gossip protocol | from-scratch | 82.3 |
| Fix memory leak in event handler | debugging | 88.9 |
| Fix broken responsive layout | frontend | 77.3 |
| Add Redis caching layer to Express API | backend | 87.0 |
| Fix auth bypass vulnerability | debugging | 80.2 |
| Add GraphQL layer over REST API | multi-language | 84.4 |
| Convert React app to PWA with offline support | frontend | 88.4 |
| Fix broken GitHub Actions CI pipeline | debugging | 84.9 |
| Fix flaky test suite | debugging | 93.7 |
| Refactor monolithic handler to CQRS | refactoring | 58.9 |
| Build LLM evaluation harness with structured grading | backend | 83.0 |
| Add cursor-based pagination to REST API | backend | 87.0 |
| Remove AI slop and over-engineering from codebase | refactoring | 87.3 |
| Dockerize Node.js monorepo | full-stack | 84.4 |
| Implement JWT auth middleware | backend | 53.0 |
| Build materialized view refresh pipeline for analytics | backend | 76.2 |
| Build REST API from scratch | from-scratch | 85.7 |
| Add i18n with locale routing to Next.js app | full-stack | 80.0 |
| Fix race conditions in order matching engine | backend | 90.0 |
| Fix data integrity bugs in denormalized e-commerce schema | debugging | 93.3 |
| Build production website with auth and members area | frontend | 72.1 |
| Add slash commands and moderation to Discord bot | backend | 88.8 |
| Implement zero-trust API authentication layer | backend | 74.9 |
| Add Google OAuth2 login to Express app | full-stack | 87.2 |
| Write complex SQL report with window functions | backend | 76.7 |
| Migrate callback-hell Express app to async/await | refactoring | 62.7 |
| Optimize bloated React bundle under 500KB | frontend | 81.9 |
| Implement Stripe webhook handler | backend | 85.4 |
| Debug and fix 6 broken database triggers and constraints | debugging | 83.5 |
| Find and fix 4 hidden backdoors in Flask app | debugging | 69.5 |
| Add caching layer to eliminate slow SSR page loads | full-stack | 80.3 |
| Fix 12 WCAG accessibility violations in checkout form | frontend | 86.9 |
| Add rate limiting middleware | backend | 91.3 |
| Build RAG pipeline with vector search | backend | 66.8 |
| Optimize slow Postgres queries in Flask app | backend | 91.7 |
| Debug race condition in worker pool | debugging | 93.3 |
| Split 1100-line god file into proper modules | refactoring | 51.1 |
| Build real-time portfolio risk calculator | backend | 73.3 |
| Build terminal UI dashboard | from-scratch | 78.8 |
| Build SaaS admin dashboard from scratch | from-scratch | 54.2 |
| Find and patch all OWASP Top 10 vulnerabilities | debugging | 72.8 |
| Add streaming SSE endpoint for LLM chat | backend | 89.2 |
| Code review: identify security vulns | code-review | 77.5 |
| Implement multi-tenant row-level security in Postgres | backend | 58.8 |
| Fix N+1 query in dashboard | backend | 73.5 |
| Fix hallucination and context window bugs in RAG agent | backend | 67.0 |
| Write Kubernetes manifests for Node.js microservice | full-stack | 94.3 |
| Fix React hydration mismatch | frontend | 65.8 |
| Implement background job scheduler with persistence | backend | 84.4 |
| Harden insecure Docker setup with 12 vulnerabilities | code-review | 90.7 |
| Add retry logic and dead letter queue to Python task queue | backend | 88.0 |
| Write tests for untested legacy Flask service | code-review | 66.4 |
| Write integration tests for payment flow | code-review | 79.6 |
| Fix Node.js stream backpressure causing OOM on large files | backend | 89.3 |
| Port Python CLI to Rust | multi-language | 76.0 |
| Fix deadlocking transaction patterns in Flask app | backend | 61.0 |
| Build MCP server for database management | backend | 91.7 |
| Zero-downtime schema migration | full-stack | 66.0 |
| Build codebase indexer for LLM context windows | from-scratch | 76.0 |
| Convert React app to PWA with offline support | frontend | 80.5 |
| Implement multi-tenant row-level security in Postgres | backend | 79.7 |
| Find and patch all OWASP Top 10 vulnerabilities | debugging | 80.4 |
| Remove AI slop and over-engineering from codebase | refactoring | 88.7 |
| Implement JWT auth middleware | backend | 85.8 |
| Harden insecure Docker setup with 12 vulnerabilities | code-review | 91.4 |
| Add caching layer to eliminate slow SSR page loads | full-stack | 88.1 |
| Add streaming SSE endpoint for LLM chat | backend | 79.8 |
| Split 1100-line god file into proper modules | refactoring | 85.3 |
| Add i18n with locale routing to Next.js app | full-stack | 79.6 |
| Implement zero-trust API authentication layer | backend | 74.8 |
| Dockerize Node.js monorepo | full-stack | 76.9 |
| Fix broken responsive layout | frontend | 76.5 |
| Optimize bloated React bundle under 500KB | frontend | 85.5 |
| Add file upload with S3 presigned URLs | backend | 76.2 |
| Replace console.log with structured logging | refactoring | 56.0 |
| Write Kubernetes manifests for Node.js microservice | full-stack | 93.7 |
| Write tests for untested legacy Flask service | code-review | 70.5 |
| Implement background job scheduler with persistence | backend | 67.5 |
| Build production website with auth and members area | frontend | 79.7 |
| Build SaaS admin dashboard from scratch | from-scratch | 83.4 |
| Build MCP server for database management | backend | 90.1 |
| Implement transformer inference engine with KV cache | from-scratch | 88.6 |
| Build CLI tool with subcommands and config | from-scratch | 81.7 |
| Build real-time portfolio risk calculator | backend | 76.8 |
| Build LLM evaluation harness with structured grading | backend | 64.0 |
| Build materialized view refresh pipeline for analytics | backend | 81.3 |
| Fix hallucination and context window bugs in RAG agent | backend | 79.9 |
| Fix data integrity bugs in denormalized e-commerce schema | debugging | 75.5 |
| Fix race conditions in order matching engine | backend | 82.1 |
| Debug and fix 6 broken database triggers and constraints | debugging | 87.0 |
| Write complex SQL report with window functions | backend | 69.4 |
| Fix deadlocking transaction patterns in Flask app | backend | 61.1 |
| Find and fix 4 hidden backdoors in Flask app | debugging | 89.5 |
| Add Redis caching layer to Express API | backend | 89.7 |
| Add Google OAuth2 login to Express app | full-stack | 88.2 |
| Add slash commands and moderation to Discord bot | backend | 88.3 |
| Add retry logic and dead letter queue to Python task queue | backend | 89.3 |
| Fix 12 WCAG accessibility violations in checkout form | frontend | 88.1 |
| Optimize slow Postgres queries in Flask app | backend | 84.7 |
| Build distributed node cluster with gossip protocol | from-scratch | 82.7 |
| Write integration tests for payment flow | code-review | 73.0 |
| Fix Node.js stream backpressure causing OOM on large files | backend | 90.8 |
| Add virtual scrolling to table rendering 5000 rows | frontend | 87.5 |
| Add GraphQL layer over REST API | multi-language | 79.5 |
| Add rate limiting middleware | backend | 86.5 |
| Build terminal UI dashboard | from-scratch | 88.7 |
| Implement Stripe webhook handler | backend | 85.8 |
| Zero-downtime schema migration | full-stack | 68.0 |
| Refactor monolithic handler to CQRS | refactoring | 76.7 |
| Fix React hydration mismatch | frontend | 86.5 |
| Fix flaky test suite | debugging | 90.0 |
| Fix memory leak in event handler | debugging | 84.8 |
| Fix N+1 query in dashboard | backend | 91.0 |
| Build REST API from scratch | from-scratch | 86.6 |
| Debug race condition in worker pool | debugging | 85.3 |