APEX

GPT 5.3 Codex Spark

OpenAI

200K context, $0.75/M input, $4.50/M output
1642 (peak 1659)

Avg Score: 79.3
Avg Cost: $0.32
Score/$: 248.9
Runs: 70


Category ELOs

Category | Difficulty | ELO
from-scratch | medium | 2377
frontend | expert | 2268
code-review | hard | 2248
backend | easy | 2234
frontend | hard | 2061
multi-language | hard | 2052
from-scratch | expert | 2005
from-scratch | easy | 1945
code-review | medium | 1821
code-review | - | 1816
frontend | easy | 1797
refactoring | medium | 1796
debugging | medium | 1721
full-stack | medium | 1700
from-scratch | - | 1696
refactoring | - | 1679
debugging | expert | 1672
backend | hard | 1657
from-scratch | hard | 1655
backend | expert | 1651
frontend | - | 1641
full-stack | - | 1637
backend | - | 1637
debugging | - | 1635
debugging | hard | 1619
backend | medium | 1619
full-stack | hard | 1600
frontend | master | 1573
multi-language | - | 1545
frontend | medium | 1538
backend | master | 1138
multi-language | expert | 1055
refactoring | expert | 941

All Results

Task | Category | Score
Migrate Express monolith to modular architecture | backend | 62.1
Build interactive data visualization dashboard | frontend | 72.1
Add streaming SSE endpoint for LLM chat | backend | 85.3
Find and fix 4 hidden backdoors in Flask app | debugging | 90.9
Add Redis caching layer to Express API | backend | 83.9
Build codebase indexer for LLM context windows | from-scratch | 33.8
Build real-time portfolio risk calculator | backend | 77.3
Find and patch all OWASP Top 10 vulnerabilities | debugging | 73.1
Build 3D browser game with physics and multiplayer sync | frontend | 80.2
Implement multi-tenant row-level security in Postgres | backend | 80.0
Optimize bloated React bundle under 500KB | frontend | 75.8
Add file upload with S3 presigned URLs | backend | 80.8
Add retry logic and dead letter queue to Python task queue | backend | 83.6
Fix 12 WCAG accessibility violations in checkout form | frontend | 89.0
Harden insecure Docker setup with 12 vulnerabilities | code-review | 93.9
Fix auth bypass vulnerability | debugging | 86.5
Implement Stripe webhook handler | backend | 84.4
Fix flaky test suite | debugging | 86.0
Fix Node.js stream backpressure causing OOM on large files | backend | 88.5
Code review: identify security vulns | code-review | 83.5
Debug race condition in worker pool | debugging | 88.6
Build MCP server for database management | backend | 87.0
Fix memory leak in event handler | debugging | 87.0
Implement JWT auth middleware | backend | 51.6
Fix and extend Chrome browser extension | frontend | 70.1
Add Google OAuth2 login to Express app | full-stack | 74.2
Build LLM evaluation harness with structured grading | backend | 78.9
Add virtual scrolling to table rendering 5000 rows | frontend | 80.8
Debug and fix 6 broken database triggers and constraints | debugging | 86.8
Fix data integrity bugs in denormalized e-commerce schema | debugging | 86.8
Write Kubernetes manifests for Node.js microservice | full-stack | 91.8
Zero-downtime schema migration | full-stack | 75.2
Optimize slow Postgres queries in Flask app | backend | 75.3
Build distributed node cluster with gossip protocol | from-scratch | 82.9
Add WebSocket real-time updates | full-stack | 87.5
Write integration tests for payment flow | code-review | 85.0
Migrate callback-hell Express app to async/await | refactoring | 82.5
Build production website with auth and members area | frontend | 78.1
Convert React app to PWA with offline support | frontend | 82.5
Fix React hydration mismatch | frontend | 70.3
Remove AI slop and over-engineering from codebase | refactoring | 89.4
Build REST API from scratch | from-scratch | 84.9
Split 1100-line god file into proper modules | refactoring | 84.3
Fix race conditions in order matching engine | backend | 79.7
Fix N+1 query in dashboard | backend | 83.5
Fix broken GitHub Actions CI pipeline | debugging | 90.9
Replace console.log with structured logging | refactoring | 79.2
Fix hallucination and context window bugs in RAG agent | backend | 79.2
Fix deadlocking transaction patterns in Flask app | backend | 79.3
Add caching layer to eliminate slow SSR page loads | full-stack | 77.5
Add i18n with locale routing to Next.js app | full-stack | 76.7
Fix broken responsive layout | frontend | 80.0
Write tests for untested legacy Flask service | code-review | 79.5
Build RAG pipeline with vector search | backend | 79.0
Port Python CLI to Rust | multi-language | 45.5
Add slash commands and moderation to Discord bot | backend | 77.7
Add cursor-based pagination to REST API | backend | 80.5
Build terminal UI dashboard | from-scratch | 79.3
Build materialized view refresh pipeline for analytics | backend | 72.1
Add rate limiting middleware | backend | 87.5
Write complex SQL report with window functions | backend | 80.8
Dockerize Node.js monorepo | full-stack | 85.5
Build multi-tool LLM agent runtime | backend | 70.7
Implement transformer inference engine with KV cache | from-scratch | 85.8
Build CLI tool with subcommands and config | from-scratch | 71.0
Implement zero-trust API authentication layer | backend | 76.1
Build SaaS admin dashboard from scratch | from-scratch | 80.9
Add GraphQL layer over REST API | multi-language | 84.5
Implement background job scheduler with persistence | backend | 80.5
Refactor monolithic handler to CQRS | refactoring | 54.5