APEX

Step 3.5 Flash

OpenRouter

256K context · $0.10/M input · $0.30/M output
ELO 1312 (peak 1328)
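
As a rough illustration of how the listed per-token prices translate into a per-run cost, here is a minimal sketch. The prices come from the listing above; the token counts are hypothetical and are not taken from any actual run on this page.

```python
# Prices from the listing above (USD per million tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.30


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one run at the listed prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical token counts, chosen only to show the arithmetic.
example_cost = run_cost(input_tokens=200_000, output_tokens=30_000)
print(f"${example_cost:.3f}")
```

With these assumed token counts the cost lands near three cents, the same order of magnitude as the average cost reported for this model.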

Avg Score: 53.8
Avg Cost: $0.03
Score/$: 1945.5
Runs: 65
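
The displayed Score/$ does not equal the displayed Avg Score divided by the displayed Avg Cost. One plausible reading, which the page does not confirm, is that the ratio is computed from an unrounded cost. The sketch below checks that reading using only the three displayed figures.

```python
# Figures as displayed on the page.
avg_score = 53.8
avg_cost_displayed = 0.03          # USD, rounded to cents on the page
score_per_dollar_displayed = 1945.5

# Ratio computed from the rounded figures.
naive_ratio = avg_score / avg_cost_displayed

# Cost implied by the displayed ratio (assumes ratio = avg score / avg cost).
implied_cost = avg_score / score_per_dollar_displayed

print(f"naive ratio:  {naive_ratio:.1f}")
print(f"implied cost: ${implied_cost:.4f}")
```

Under that assumption the implied average cost is a little under $0.028, which rounds to the displayed $0.03. The ratio could also be an average of per-run ratios; the page does not say which.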

[Charts not reproduced: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

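| Category | Difficulty | ELO |
|---|---|---|
| from-scratch | medium | 1832 |
| debugging | hard | 1550 |
| code-review | medium | 1405 |
| backend | expert | 1402 |
| code-review | | 1383 |
| frontend | medium | 1373 |
| backend | | 1331 |
| frontend | expert | 1331 |
| backend | hard | 1324 |
| backend | medium | 1318 |
| from-scratch | | 1315 |
| full-stack | medium | 1299 |
| frontend | | 1283 |
| from-scratch | hard | 1269 |
| refactoring | | 1264 |
| full-stack | | 1261 |
| debugging | | 1256 |
| refactoring | medium | 1240 |
| full-stack | hard | 1202 |
| code-review | hard | 1198 |
| from-scratch | easy | 1143 |
| refactoring | expert | 856 |
| multi-language | | 793 |
| debugging | medium | 537 |
| debugging | expert | 529 |
| multi-language | hard | 204 |
| from-scratch | expert | 172 |
| backend | easy | 0 |
| multi-language | expert | 0 |
| frontend | hard | 0 |
| frontend | easy | 0 |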

All Results

| Task | Category | Score |
|---|---|---|
| Fix race conditions in order matching engine | backend | 65.8 |
| Port Python CLI to Rust | multi-language | 4.0 |
| Build CLI tool with subcommands and config | from-scratch | 49.5 |
| Add cursor-based pagination to REST API | backend | 6.5 |
| Add Google OAuth2 login to Express app | full-stack | 5.8 |
| Build codebase indexer for LLM context windows | from-scratch | 46.3 |
| Add retry logic and dead letter queue to Python task queue | backend | 28.0 |
| Build real-time portfolio risk calculator | backend | 27.2 |
| Implement JWT auth middleware | backend | 28.0 |
| Fix N+1 query in dashboard | backend | 80.0 |
| Implement background job scheduler with persistence | backend | 34.5 |
| Fix auth bypass vulnerability | debugging | 85.0 |
| Add streaming SSE endpoint for LLM chat | backend | 63.6 |
| Write integration tests for payment flow | code-review | 61.0 |
| Implement Stripe webhook handler | backend | 78.5 |
| Fix broken GitHub Actions CI pipeline | debugging | 59.7 |
| Write tests for untested legacy Flask service | code-review | 63.5 |
| Add file upload with S3 presigned URLs | backend | 74.8 |
| Add rate limiting middleware | backend | 28.0 |
| Add WebSocket real-time updates | full-stack | 27.9 |
| Fix flaky test suite | debugging | 23.0 |
| Fix memory leak in event handler | debugging | 70.5 |
| Refactor monolithic handler to CQRS | refactoring | 52.0 |
| Fix 12 WCAG accessibility violations in checkout form | frontend | 28.0 |
| Find and patch all OWASP Top 10 vulnerabilities | debugging | 28.0 |
| Fix data integrity bugs in denormalized e-commerce schema | debugging | 28.0 |
| Code review: identify security vulns | code-review | 22.0 |
| Optimize bloated React bundle under 500KB | frontend | 51.8 |
| Debug and fix 6 broken database triggers and constraints | debugging | 41.8 |
| Migrate callback-hell Express app to async/await | refactoring | 22.0 |
| Add i18n with locale routing to Next.js app | full-stack | 28.0 |
| Optimize slow Postgres queries in Flask app | backend | 79.1 |
| Dockerize Node.js monorepo | full-stack | 28.0 |
| Fix broken responsive layout | frontend | 32.0 |
| Build LLM evaluation harness with structured grading | backend | 63.8 |
| Implement transformer inference engine with KV cache | from-scratch | 43.0 |
| Remove AI slop and over-engineering from codebase | refactoring | 22.0 |
| Fix React hydration mismatch | frontend | 62.9 |
| Find and fix 4 hidden backdoors in Flask app | debugging | 90.9 |
| Add virtual scrolling to table rendering 5000 rows | frontend | 59.1 |
| Build terminal UI dashboard | from-scratch | 69.7 |
| Add GraphQL layer over REST API | multi-language | 54.0 |
| Build MCP server for database management | backend | 76.9 |
| Build REST API from scratch | from-scratch | 72.7 |
| Add Redis caching layer to Express API | backend | 57.8 |
| Fix deadlocking transaction patterns in Flask app | backend | 79.2 |
| Convert React app to PWA with offline support | frontend | 84.8 |
| Build production website with auth and members area | frontend | 64.3 |
| Split 1100-line god file into proper modules | refactoring | 80.8 |
| Write Kubernetes manifests for Node.js microservice | full-stack | 81.7 |
| Build RAG pipeline with vector search | backend | 43.0 |
| Implement multi-tenant row-level security in Postgres | backend | 74.8 |
| Build SaaS admin dashboard from scratch | from-scratch | 44.3 |
| Write complex SQL report with window functions | backend | 41.8 |
| Add slash commands and moderation to Discord bot | backend | 53.3 |
| Fix hallucination and context window bugs in RAG agent | backend | 50.5 |
| Implement zero-trust API authentication layer | backend | 77.3 |
| Debug race condition in worker pool | debugging | 87.9 |
| Add caching layer to eliminate slow SSR page loads | full-stack | 81.7 |
| Replace console.log with structured logging | refactoring | 60.0 |
| Harden insecure Docker setup with 12 vulnerabilities | code-review | 88.4 |
| Zero-downtime schema migration | full-stack | 82.5 |
| Build materialized view refresh pipeline for analytics | backend | 80.0 |
| Fix Node.js stream backpressure causing OOM on large files | backend | 52.3 |
| Build distributed node cluster with gossip protocol | from-scratch | 66.8 |
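
The results above can be cross-checked against the summary figures. The sketch below assumes that Avg Score is a plain unweighted mean over all runs, which the page does not state. The scores are transcribed from the results list and grouped by category; the per-category means are a derived illustration, not figures shown on the page.

```python
from statistics import mean

# Scores transcribed from the All Results list, grouped by category.
scores_by_category = {
    "backend": [65.8, 6.5, 28.0, 27.2, 28.0, 80.0, 34.5, 63.6, 78.5, 74.8,
                28.0, 79.1, 63.8, 76.9, 57.8, 79.2, 43.0, 74.8, 41.8, 53.3,
                50.5, 77.3, 80.0, 52.3],
    "multi-language": [4.0, 54.0],
    "from-scratch": [49.5, 46.3, 43.0, 69.7, 72.7, 44.3, 66.8],
    "full-stack": [5.8, 27.9, 28.0, 28.0, 81.7, 81.7, 82.5],
    "debugging": [85.0, 59.7, 23.0, 70.5, 28.0, 28.0, 41.8, 90.9, 87.9],
    "code-review": [61.0, 63.5, 22.0, 88.4],
    "refactoring": [52.0, 22.0, 22.0, 80.8, 60.0],
    "frontend": [28.0, 51.8, 32.0, 62.9, 59.1, 84.8, 64.3],
}

# Flatten to one list so the overall figures can be recomputed.
all_scores = [s for scores in scores_by_category.values() for s in scores]
run_count = len(all_scores)
overall_mean = mean(all_scores)
print(f"runs: {run_count}, mean score: {overall_mean:.1f}")

# Per-category means, highest first (derived here, not shown on the page).
ranked = sorted(scores_by_category.items(), key=lambda kv: mean(kv[1]), reverse=True)
for category, scores in ranked:
    print(f"{category:15s} n={len(scores):2d} mean={mean(scores):.1f}")
```

The first line printed is `runs: 65, mean score: 53.8`, which agrees with the Runs and Avg Score figures reported for this model.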