APEX

Claude Opus 4.5

Anthropic

200K context | $15.00/M input | $75.00/M output
ELO 1698 (peak 1712)

Avg Score: 82.3
Avg Cost: $0.88
Score/$: 93.7
Runs: 70

[Charts not reproduced: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

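| Category | Difficulty | ELO |
|---|---|---|
| multi-language | expert | 2607 |
| frontend | easy | 2383 |
| multi-language | hard | 2365 |
| frontend | expert | 2320 |
| from-scratch | easy | 2308 |
| from-scratch | medium | 2254 |
| frontend | hard | 2229 |
| from-scratch | expert | 2157 |
| backend | easy | 2109 |
| multi-language | | 2005 |
| from-scratch | hard | 1999 |
| refactoring | medium | 1940 |
| from-scratch | | 1914 |
| full-stack | medium | 1850 |
| code-review | medium | 1845 |
| refactoring | | 1819 |
| frontend | | 1778 |
| frontend | medium | 1736 |
| debugging | hard | 1703 |
| backend | master | 1697 |
| backend | expert | 1696 |
| full-stack | | 1689 |
| code-review | | 1689 |
| debugging | medium | 1672 |
| backend | hard | 1663 |
| backend | | 1652 |
| refactoring | expert | 1607 |
| full-stack | hard | 1603 |
| backend | medium | 1600 |
| debugging | | 1595 |
| frontend | master | 1532 |
| debugging | expert | 1502 |
| code-review | hard | 1322 |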

All Results

| Task | Category | Score |
|---|---|---|
| Migrate Express monolith to modular architecture | backend | 86.3 |
| Fix and extend Chrome browser extension | frontend | 74.3 |
| Build 3D browser game with physics and multiplayer sync | frontend | 77.5 |
| Build interactive data visualization dashboard | frontend | 67.7 |
| Build multi-tool LLM agent runtime | backend | 85.3 |
| Port Python CLI to Rust | multi-language | 81.5 |
| Write tests for untested legacy Flask service | code-review | 81.8 |
| Fix Node.js stream backpressure causing OOM on large files | backend | 94.4 |
| Fix React hydration mismatch | frontend | 87.6 |
| Add Redis caching layer to Express API | backend | 74.0 |
| Add WebSocket real-time updates | full-stack | 87.9 |
| Build RAG pipeline with vector search | backend | 84.3 |
| Add retry logic and dead letter queue to Python task queue | backend | 80.5 |
| Add GraphQL layer over REST API | multi-language | 90.4 |
| Code review: identify security vulns | code-review | 83.5 |
| Implement background job scheduler with persistence | backend | 76.2 |
| Migrate callback-hell Express app to async/await | refactoring | 86.5 |
| Implement transformer inference engine with KV cache | from-scratch | 89.3 |
| Build distributed node cluster with gossip protocol | from-scratch | 78.5 |
| Fix broken GitHub Actions CI pipeline | debugging | 95.0 |
| Optimize bloated React bundle under 500KB | frontend | 81.4 |
| Find and patch all OWASP Top 10 vulnerabilities | debugging | 91.5 |
| Add streaming SSE endpoint for LLM chat | backend | 79.0 |
| Add file upload with S3 presigned URLs | backend | 78.1 |
| Implement JWT auth middleware | backend | 86.9 |
| Build codebase indexer for LLM context windows | from-scratch | 78.8 |
| Replace console.log with structured logging | refactoring | 92.8 |
| Add caching layer to eliminate slow SSR page loads | full-stack | 89.2 |
| Harden insecure Docker setup with 12 vulnerabilities | code-review | 95.6 |
| Convert React app to PWA with offline support | frontend | 82.3 |
| Add i18n with locale routing to Next.js app | full-stack | 78.2 |
| Implement zero-trust API authentication layer | backend | 82.6 |
| Remove AI slop and over-engineering from codebase | refactoring | 90.5 |
| Split 1100-line god file into proper modules | refactoring | 86.2 |
| Dockerize Node.js monorepo | full-stack | 84.8 |
| Implement multi-tenant row-level security in Postgres | backend | 78.6 |
| Fix broken responsive layout | frontend | 86.9 |
| Write Kubernetes manifests for Node.js microservice | full-stack | 89.4 |
| Refactor monolithic handler to CQRS | refactoring | 68.7 |
| Fix flaky test suite | debugging | 82.0 |
| Build real-time portfolio risk calculator | backend | 77.5 |
| Zero-downtime schema migration | full-stack | 66.7 |
| Fix 12 WCAG accessibility violations in checkout form | frontend | 91.8 |
| Build production website with auth and members area | frontend | 80.2 |
| Build SaaS admin dashboard from scratch | from-scratch | 82.2 |
| Fix hallucination and context window bugs in RAG agent | backend | 73.2 |
| Build CLI tool with subcommands and config | from-scratch | 81.8 |
| Build MCP server for database management | backend | 83.2 |
| Build LLM evaluation harness with structured grading | backend | 73.8 |
| Fix race conditions in order matching engine | backend | 80.4 |
| Fix data integrity bugs in denormalized e-commerce schema | debugging | 72.8 |
| Build materialized view refresh pipeline for analytics | backend | 75.3 |
| Fix deadlocking transaction patterns in Flask app | backend | 87.9 |
| Debug and fix 6 broken database triggers and constraints | debugging | 78.3 |
| Write complex SQL report with window functions | backend | 84.0 |
| Find and fix 4 hidden backdoors in Flask app | debugging | 93.1 |
| Add Google OAuth2 login to Express app | full-stack | 79.0 |
| Optimize slow Postgres queries in Flask app | backend | 83.2 |
| Add slash commands and moderation to Discord bot | backend | 82.0 |
| Add virtual scrolling to table rendering 5000 rows | frontend | 79.3 |
| Write integration tests for payment flow | code-review | 69.9 |
| Fix auth bypass vulnerability | debugging | 93.7 |
| Add cursor-based pagination to REST API | backend | 48.5 |
| Implement Stripe webhook handler | backend | 77.3 |
| Add rate limiting middleware | backend | 83.2 |
| Build terminal UI dashboard | from-scratch | 77.5 |
| Build REST API from scratch | from-scratch | 93.7 |
| Fix N+1 query in dashboard | backend | 91.5 |
| Fix memory leak in event handler | debugging | 85.2 |
| Debug race condition in worker pool | debugging | 91.9 |