APEX

Gemini 3 Pro Preview

OpenRouter

1049K context · $2.00/M input · $12.00/M output
1540 (peak 1557)

Avg Score: 75.0
Avg Cost: $0.41
Score/$: 184.9
Runs: 61
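
The page does not say how Score/$ is computed. As a minimal sketch, assuming it is simply the average score divided by the average cost per run, the displayed figures can be checked as follows. The function name `score_per_dollar` is hypothetical and used only for illustration. Dividing the displayed values gives about 182.9 rather than 184.9; the gap is consistent with the displayed cost being rounded up from roughly $0.4056, though that is an inference and not something the page states.

```python
def score_per_dollar(avg_score: float, avg_cost_usd: float) -> float:
    """Hypothetical helper: score points obtained per dollar spent.

    Assumes Score/$ = average score / average cost; the page does not
    confirm this definition.
    """
    if avg_cost_usd <= 0:
        raise ValueError("average cost must be positive")
    return avg_score / avg_cost_usd


# Ratio computed from the values exactly as displayed on the page.
from_displayed = score_per_dollar(75.0, 0.41)

# Average cost that the displayed Score/$ of 184.9 would imply.
implied_cost = 75.0 / 184.9

print(f"Score/$ from displayed values: {from_displayed:.1f}")
print(f"Average cost implied by 184.9: ${implied_cost:.4f}")
```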

Category ELOs

backend (easy): 2201
refactoring (expert): 2127
from-scratch (expert): 2113
from-scratch (medium): 1974
debugging (medium): 1698
full-stack (hard): 1697
frontend (hard): 1686
full-stack: 1655
code-review (medium): 1651
backend (expert): 1636
full-stack (medium): 1620
from-scratch (hard): 1611
refactoring: 1592
from-scratch: 1591
multi-language (expert): 1591
backend: 1587
backend (hard): 1585
refactoring (medium): 1539
backend (medium): 1530
code-review: 1489
debugging (hard): 1488
frontend (easy): 1487
frontend: 1449
debugging: 1447
frontend (medium): 1429
debugging (expert): 1346
multi-language: 1330
frontend (expert): 1307
from-scratch (easy): 1118
multi-language (hard): 564
code-review (hard): 129

All Results

Task | Category | Score
Find and fix 4 hidden backdoors in Flask app | debugging | 93.7
Build materialized view refresh pipeline for analytics | backend | 72.7
Build CLI tool with subcommands and config | from-scratch | 72.2
Add Redis caching layer to Express API | backend | 82.8
Add slash commands and moderation to Discord bot | backend | 82.5
Find and patch all OWASP Top 10 vulnerabilities | debugging | 69.9
Migrate callback-hell Express app to async/await | refactoring | 58.7
Fix race conditions in order matching engine | backend | 86.6
Fix N+1 query in dashboard | backend | 68.7
Implement Stripe webhook handler | backend | 84.6
Add streaming SSE endpoint for LLM chat | backend | 82.5
Fix React hydration mismatch | frontend | 74.6
Build LLM evaluation harness with structured grading | backend | 82.7
Fix broken responsive layout | frontend | 73.1
Zero-downtime schema migration | full-stack | 84.6
Fix 12 WCAG accessibility violations in checkout form | frontend | 82.3
Add virtual scrolling to table rendering 5000 rows | frontend | 79.2
Build REST API from scratch | from-scratch | 72.3
Dockerize Node.js monorepo | full-stack | 73.2
Fix flaky test suite | debugging | 87.6
Optimize slow Postgres queries in Flask app | backend | 78.0
Build MCP server for database management | backend | 85.3
Fix Node.js stream backpressure causing OOM on large files | backend | 58.6
Code review: identify security vulns | code-review | 89.3
Build RAG pipeline with vector search | backend | 75.3
Write tests for untested legacy Flask service | code-review | 63.0
Build real-time portfolio risk calculator | backend | 74.5
Replace console.log with structured logging | refactoring | 61.9
Add rate limiting middleware | backend | 86.8
Fix auth bypass vulnerability | debugging | 86.5
Optimize bloated React bundle under 500KB | frontend | 75.6
Add GraphQL layer over REST API | multi-language | 51.4
Debug and fix 6 broken database triggers and constraints | debugging | 76.3
Write integration tests for payment flow | code-review | 42.4
Convert React app to PWA with offline support | frontend | 61.4
Add WebSocket real-time updates | full-stack | 83.3
Port Python CLI to Rust | multi-language | 57.6
Build distributed node cluster with gossip protocol | from-scratch | 71.6
Build codebase indexer for LLM context windows | from-scratch | 51.0
Write Kubernetes manifests for Node.js microservice | full-stack | 85.0
Add i18n with locale routing to Next.js app | full-stack | 75.1
Build terminal UI dashboard | from-scratch | 72.3
Implement JWT auth middleware | backend | 75.5
Split 1100-line god file into proper modules | refactoring | 88.0
Implement multi-tenant row-level security in Postgres | backend | 76.2
Remove AI slop and over-engineering from codebase | refactoring | 84.9
Implement zero-trust API authentication layer | backend | 79.9
Harden insecure Docker setup with 12 vulnerabilities | code-review | 82.8
Add caching layer to eliminate slow SSR page loads | full-stack | 90.5
Implement background job scheduler with persistence | backend | 73.6
Implement transformer inference engine with KV cache | from-scratch | 87.7
Fix hallucination and context window bugs in RAG agent | backend | 61.0
Build production website with auth and members area | frontend | 63.5
Build SaaS admin dashboard from scratch | from-scratch | 69.0
Write complex SQL report with window functions | backend | 80.7
Fix data integrity bugs in denormalized e-commerce schema | debugging | 75.8
Fix deadlocking transaction patterns in Flask app | backend | 60.5
Add retry logic and dead letter queue to Python task queue | backend | 80.5
Refactor monolithic handler to CQRS | refactoring | 78.4
Fix memory leak in event handler | debugging | 55.2
Debug race condition in worker pool | debugging | 86.7