APEX
Claude Opus 4.8

Anthropic

200K context · $15.00/M input · $75.00/M output
ELO 1946 (peak 1965)

Avg Score: 90.0
Avg Cost: $2.04
Score/$: 44.2
Runs: 70

[Charts: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
from-scratch | medium | 3237
frontend | expert | 3237
multi-language | expert | 2921
frontend | easy | 2755
refactoring | expert | 2748
code-review | hard | 2611
from-scratch | hard | 2488
from-scratch | easy | 2388
multi-language | hard | 2381
from-scratch | - | 2200
backend | easy | 2180
from-scratch | expert | 2168
frontend | hard | 2166
multi-language | - | 2113
refactoring | - | 2065
refactoring | medium | 2061
frontend | - | 2045
backend | expert | 2044
frontend | medium | 2037
full-stack | hard | 2034
backend | hard | 2023
full-stack | - | 1999
full-stack | medium | 1991
code-review | - | 1967
backend | - | 1956
backend | medium | 1940
code-review | medium | 1934
frontend | master | 1897
debugging | expert | 1848
debugging | medium | 1804
backend | master | 1771
debugging | - | 1749
debugging | hard | 1715

All Results

Task | Category | Score
Add streaming SSE endpoint for LLM chat | backend | 95.6
Fix and extend Chrome browser extension | frontend | 82.3
Optimize bloated React bundle under 500KB | frontend | 91.3
Add i18n with locale routing to Next.js app | full-stack | 89.0
Migrate Express monolith to modular architecture | backend | 87.3
Build interactive data visualization dashboard | frontend | 80.7
Build multi-tool LLM agent runtime | backend | 87.5
Build 3D browser game with physics and multiplayer sync | frontend | 86.0
Build CLI tool with subcommands and config | from-scratch | 91.0
Dockerize Node.js monorepo | full-stack | 90.4
Fix N+1 query in dashboard | backend | 89.6
Write tests for untested legacy Flask service | code-review | 91.0
Implement JWT auth middleware | backend | 90.2
Write integration tests for payment flow | code-review | 93.0
Fix flaky test suite | debugging | 90.8
Fix memory leak in event handler | debugging | 92.2
Refactor monolithic handler to CQRS | refactoring | 89.6
Fix 12 WCAG accessibility violations in checkout form | frontend | 91.0
Write Kubernetes manifests for Node.js microservice | full-stack | 92.6
Write complex SQL report with window functions | backend | 90.5
Add GraphQL layer over REST API | multi-language | 90.8
Convert React app to PWA with offline support | frontend | 89.3
Split 1100-line god file into proper modules | refactoring | 90.2
Build LLM evaluation harness with structured grading | backend | 87.6
Port Python CLI to Rust | multi-language | 89.0
Code review: identify security vulns | code-review | 85.6
Fix Node.js stream backpressure causing OOM on large files | backend | 92.0
Add slash commands and moderation to Discord bot | backend | 90.0
Fix race conditions in order matching engine | backend | 90.9
Fix broken GitHub Actions CI pipeline | debugging | 92.2
Add file upload with S3 presigned URLs | backend | 85.9
Implement background job scheduler with persistence | backend | 89.2
Migrate callback-hell Express app to async/await | refactoring | 92.3
Debug and fix 6 broken database triggers and constraints | debugging | 89.3
Implement zero-trust API authentication layer | backend | 90.4
Remove AI slop and over-engineering from codebase | refactoring | 93.0
Add retry logic and dead letter queue to Python task queue | backend | 87.2
Build real-time portfolio risk calculator | backend | 86.9
Optimize slow Postgres queries in Flask app | backend | 90.0
Add Google OAuth2 login to Express app | full-stack | 87.2
Build production website with auth and members area | frontend | 91.6
Fix deadlocking transaction patterns in Flask app | backend | 89.2
Add rate limiting middleware | backend | 86.5
Build materialized view refresh pipeline for analytics | backend | 91.0
Implement multi-tenant row-level security in Postgres | backend | 90.7
Implement Stripe webhook handler | backend | 90.7
Fix hallucination and context window bugs in RAG agent | backend | 89.3
Build MCP server for database management | backend | 89.3
Build RAG pipeline with vector search | backend | 88.7
Find and patch all OWASP Top 10 vulnerabilities | debugging | 89.2
Add virtual scrolling to table rendering 5000 rows | frontend | 90.3
Add cursor-based pagination to REST API | backend | 87.8
Build terminal UI dashboard | from-scratch | 93.0
Add caching layer to eliminate slow SSR page loads | full-stack | 90.3
Fix React hydration mismatch | frontend | 90.2
Fix broken responsive layout | frontend | 94.6
Zero-downtime schema migration | full-stack | 91.0
Find and fix 4 hidden backdoors in Flask app | debugging | 93.5
Build SaaS admin dashboard from scratch | from-scratch | 92.2
Fix data integrity bugs in denormalized e-commerce schema | debugging | 89.8
Build codebase indexer for LLM context windows | from-scratch | 88.7
Fix auth bypass vulnerability | debugging | 94.1
Add Redis caching layer to Express API | backend | 89.7
Add WebSocket real-time updates | full-stack | 89.4
Replace console.log with structured logging | refactoring | 90.7
Build distributed node cluster with gossip protocol | from-scratch | 90.6
Implement transformer inference engine with KV cache | from-scratch | 90.3
Build REST API from scratch | from-scratch | 94.8
Harden insecure Docker setup with 12 vulnerabilities | code-review | 92.2
Debug race condition in worker pool | debugging | 88.9