APEX

Claude Haiku 4.5

Anthropic

128K context, <$0.01/M input, <$0.01/M output
ELO: 1553 (peak 1557)

Avg Score: 71.1
Avg Cost: $0.07
Score/$: 973.6
Runs: 65

[Charts not reproduced: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category (difficulty) | ELO
multi-language (hard) | 2110
backend (easy) | 2098
from-scratch (easy) | 2011
multi-language (expert) | 1994
refactoring (expert) | 1984
code-review (hard) | 1967
multi-language | 1871
refactoring (medium) | 1811
refactoring | 1789
code-review (medium) | 1780
code-review | 1756
full-stack (medium) | 1727
from-scratch (hard) | 1711
frontend (hard) | 1705
full-stack | 1672
debugging (medium) | 1669
full-stack (hard) | 1652
frontend (expert) | 1641
backend (hard) | 1608
from-scratch | 1601
frontend (easy) | 1572
debugging (hard) | 1567
debugging | 1526
debugging (expert) | 1496
backend | 1470
frontend | 1469
backend (medium) | 1391
frontend (medium) | 1364
backend (expert) | 1336
from-scratch (medium) | 1229
from-scratch (expert) | 406

All Results

Task | Category | Score
Build SaaS admin dashboard from scratch | from-scratch | 68.1
Add retry logic and dead letter queue to Python task queue | backend | 60.9
Build RAG pipeline with vector search | backend | 74.3
Write tests for untested legacy Flask service | code-review | 82.8
Build distributed node cluster with gossip protocol | from-scratch | 53.5
Add Google OAuth2 login to Express app | full-stack | 66.0
Migrate callback-hell Express app to async/await | refactoring | 84.9
Build terminal UI dashboard | from-scratch | 55.0
Build materialized view refresh pipeline for analytics | backend | 63.9
Zero-downtime schema migration | full-stack | 87.8
Add Redis caching layer to Express API | backend | 56.0
Implement background job scheduler with persistence | backend | 48.2
Port Python CLI to Rust | multi-language | 69.9
Fix broken GitHub Actions CI pipeline | debugging | 95.0
Add rate limiting middleware | backend | 84.9
Add GraphQL layer over REST API | multi-language | 80.2
Code review: identify security vulns | code-review | 79.3
Fix race conditions in order matching engine | backend | 87.2
Build MCP server for database management | backend | 87.3
Add file upload with S3 presigned URLs | backend | 71.8
Replace console.log with structured logging | refactoring | 74.6
Add streaming SSE endpoint for LLM chat | backend | 23.0
Split 1100-line god file into proper modules | refactoring | 81.0
Implement multi-tenant row-level security in Postgres | backend | 5.0
Optimize bloated React bundle under 500KB | frontend | 72.2
Find and patch all OWASP Top 10 vulnerabilities | debugging | 73.7
Add i18n with locale routing to Next.js app | full-stack | 75.0
Implement JWT auth middleware | backend | 83.4
Convert React app to PWA with offline support | frontend | 80.1
Build codebase indexer for LLM context windows | from-scratch | 81.9
Fix broken responsive layout | frontend | 76.9
Add caching layer to eliminate slow SSR page loads | full-stack | 87.5
Write Kubernetes manifests for Node.js microservice | full-stack | 84.6
Harden insecure Docker setup with 12 vulnerabilities | code-review | 92.8
Dockerize Node.js monorepo | full-stack | 83.8
Implement zero-trust API authentication layer | backend | 45.8
Remove AI slop and over-engineering from codebase | refactoring | 88.4
Implement transformer inference engine with KV cache | from-scratch | 53.8
Build production website with auth and members area | frontend | 65.7
Build CLI tool with subcommands and config | from-scratch | 75.0
Write integration tests for payment flow | code-review | 70.3
Build LLM evaluation harness with structured grading | backend | 40.5
Refactor monolithic handler to CQRS | refactoring | 72.8
Fix hallucination and context window bugs in RAG agent | backend | 75.4
Build real-time portfolio risk calculator | backend | 52.5
Fix data integrity bugs in denormalized e-commerce schema | debugging | 50.8
Write complex SQL report with window functions | backend | 78.5
Fix deadlocking transaction patterns in Flask app | backend | 73.9
Debug and fix 6 broken database triggers and constraints | debugging | 84.0
Find and fix 4 hidden backdoors in Flask app | debugging | 92.5
Optimize slow Postgres queries in Flask app | backend | 74.3
Add slash commands and moderation to Discord bot | backend | 54.9
Fix 12 WCAG accessibility violations in checkout form | frontend | 83.2
Add virtual scrolling to table rendering 5000 rows | frontend | 62.8
Fix Node.js stream backpressure causing OOM on large files | backend | 81.0
Fix auth bypass vulnerability | debugging | 92.1
Fix flaky test suite | debugging | 79.5
Implement Stripe webhook handler | backend | 59.4
Add cursor-based pagination to REST API | backend | 62.6
Fix N+1 query in dashboard | backend | 63.8
Add WebSocket real-time updates | full-stack | 60.6
Fix memory leak in event handler | debugging | 60.4
Debug race condition in worker pool | debugging | 87.3
Fix React hydration mismatch | frontend | 56.2
Build REST API from scratch | from-scratch | 90.9