APEX
Claude Opus 4.5

Anthropic

200K context · $15.00/M input · $75.00/M output
ELO: 1783 (peak 1794)

Avg Score: 82.6
Avg Cost: $0.70
Score/$: 117.2
Runs: 65

[Charts not captured: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | ELO
multi-language, hard | 3135
multi-language, expert | 2875
frontend, easy | 2632
from-scratch, medium | 2530
frontend, expert | 2388
multi-language | 2277
frontend, hard | 2218
from-scratch, hard | 2184
from-scratch, easy | 2127
from-scratch, expert | 2090
from-scratch | 2026
backend, easy | 1985
refactoring, medium | 1965
code-review, medium | 1913
frontend | 1899
refactoring | 1857
code-review, hard | 1856
full-stack, medium | 1851
code-review | 1837
frontend, medium | 1829
refactoring, expert | 1822
backend, hard | 1793
backend, expert | 1787
backend | 1759
backend, medium | 1749
full-stack | 1743
debugging, hard | 1692
full-stack, hard | 1681
debugging, medium | 1681
debugging, expert | 1649
debugging | 1632

All Results

Task | Category | Score
Port Python CLI to Rust | multi-language | 81.5
Write tests for untested legacy Flask service | code-review | 81.8
Fix Node.js stream backpressure causing OOM on large files | backend | 94.4
Fix React hydration mismatch | frontend | 87.6
Add Redis caching layer to Express API | backend | 74.0
Add WebSocket real-time updates | full-stack | 87.9
Build RAG pipeline with vector search | backend | 84.3
Add retry logic and dead letter queue to Python task queue | backend | 80.5
Add GraphQL layer over REST API | multi-language | 90.4
Code review: identify security vulns | code-review | 83.5
Implement background job scheduler with persistence | backend | 76.2
Migrate callback-hell Express app to async/await | refactoring | 86.5
Implement transformer inference engine with KV cache | from-scratch | 89.3
Build distributed node cluster with gossip protocol | from-scratch | 78.5
Fix broken GitHub Actions CI pipeline | debugging | 95.0
Optimize bloated React bundle under 500KB | frontend | 81.4
Find and patch all OWASP Top 10 vulnerabilities | debugging | 91.5
Add streaming SSE endpoint for LLM chat | backend | 79.0
Add file upload with S3 presigned URLs | backend | 78.1
Implement JWT auth middleware | backend | 86.9
Build codebase indexer for LLM context windows | from-scratch | 78.8
Replace console.log with structured logging | refactoring | 92.8
Add caching layer to eliminate slow SSR page loads | full-stack | 89.2
Harden insecure Docker setup with 12 vulnerabilities | code-review | 95.6
Convert React app to PWA with offline support | frontend | 82.3
Add i18n with locale routing to Next.js app | full-stack | 78.2
Implement zero-trust API authentication layer | backend | 82.6
Remove AI slop and over-engineering from codebase | refactoring | 90.5
Split 1100-line god file into proper modules | refactoring | 86.2
Dockerize Node.js monorepo | full-stack | 84.8
Implement multi-tenant row-level security in Postgres | backend | 78.6
Fix broken responsive layout | frontend | 86.9
Write Kubernetes manifests for Node.js microservice | full-stack | 89.4
Refactor monolithic handler to CQRS | refactoring | 68.7
Fix flaky test suite | debugging | 82.0
Build real-time portfolio risk calculator | backend | 77.5
Zero-downtime schema migration | full-stack | 66.7
Fix 12 WCAG accessibility violations in checkout form | frontend | 91.8
Build production website with auth and members area | frontend | 77.0
Build SaaS admin dashboard from scratch | from-scratch | 82.2
Fix hallucination and context window bugs in RAG agent | backend | 73.2
Build CLI tool with subcommands and config | from-scratch | 81.8
Build MCP server for database management | backend | 83.2
Build LLM evaluation harness with structured grading | backend | 73.8
Fix race conditions in order matching engine | backend | 80.4
Fix data integrity bugs in denormalized e-commerce schema | debugging | 72.8
Build materialized view refresh pipeline for analytics | backend | 75.3
Fix deadlocking transaction patterns in Flask app | backend | 87.9
Debug and fix 6 broken database triggers and constraints | debugging | 78.3
Write complex SQL report with window functions | backend | 84.0
Find and fix 4 hidden backdoors in Flask app | debugging | 93.1
Add Google OAuth2 login to Express app | full-stack | 79.0
Optimize slow Postgres queries in Flask app | backend | 83.2
Add slash commands and moderation to Discord bot | backend | 82.0
Add virtual scrolling to table rendering 5000 rows | frontend | 79.3
Write integration tests for payment flow | code-review | 69.9
Fix auth bypass vulnerability | debugging | 93.7
Add cursor-based pagination to REST API | backend | 48.5
Implement Stripe webhook handler | backend | 77.3
Add rate limiting middleware | backend | 83.2
Build terminal UI dashboard | from-scratch | 77.5
Build REST API from scratch | from-scratch | 93.7
Fix N+1 query in dashboard | backend | 91.5
Fix memory leak in event handler | debugging | 85.2
Debug race condition in worker pool | debugging | 91.9