APEX

Gemini 3 Pro Preview

OpenRouter

1049K context · $2.00/M input · $12.00/M output
ELO 1641 (peak 1665)

Avg Score: 74.9
Avg Cost: $0.46
Score/$: 163.2
Runs: 94
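
As a rough check on how the headline figures relate, the sketch below assumes Score/$ is simply the average score divided by the average cost in dollars. The page does not state the formula, so `score_per_dollar` is a hypothetical helper, not the site's own calculation.

```python
# Headline figures as displayed on the page (both are already rounded).
avg_score = 74.9
avg_cost_usd = 0.46


def score_per_dollar(score: float, cost_usd: float) -> float:
    """Assumed definition: average score divided by average cost in USD."""
    if cost_usd <= 0:
        raise ValueError("cost must be positive")
    return score / cost_usd


ratio = score_per_dollar(avg_score, avg_cost_usd)
print(f"{ratio:.1f}")  # prints 162.8
```

The result from the rounded inputs is close to, but not exactly, the displayed 163.2. Under this assumed formula, the gap would be consistent with the site dividing by an unrounded average cost slightly below $0.46.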

[Charts not shown: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
multi-language | expert | 2835
frontend | easy | 2632
backend | easy | 2631
refactoring | expert | 2520
from-scratch | medium | 2480
code-review | hard | 2170
from-scratch | expert | 2072
from-scratch | easy | 1867
full-stack | hard | 1819
from-scratch | - | 1783
frontend | hard | 1756
full-stack | - | 1715
refactoring | - | 1699
from-scratch | hard | 1697
full-stack | medium | 1691
code-review | - | 1669
frontend | - | 1666
backend | expert | 1665
backend | - | 1652
backend | medium | 1650
backend | hard | 1649
frontend | medium | 1643
debugging | medium | 1632
code-review | medium | 1619
refactoring | medium | 1617
frontend | expert | 1544
debugging | expert | 1524
debugging | - | 1485
debugging | hard | 1466
multi-language | - | 1428
multi-language | hard | 326

All Results

Task | Category | Score
Find and fix 4 hidden backdoors in Flask app | debugging | 93.7
Build materialized view refresh pipeline for analytics | backend | 72.7
Build CLI tool with subcommands and config | from-scratch | 72.2
Add Redis caching layer to Express API | backend | 82.8
Add slash commands and moderation to Discord bot | backend | 82.5
Find and patch all OWASP Top 10 vulnerabilities | debugging | 69.9
Migrate callback-hell Express app to async/await | refactoring | 58.7
Fix race conditions in order matching engine | backend | 86.6
Fix N+1 query in dashboard | backend | 68.7
Implement Stripe webhook handler | backend | 84.6
Add streaming SSE endpoint for LLM chat | backend | 82.5
Fix React hydration mismatch | frontend | 74.6
Build LLM evaluation harness with structured grading | backend | 82.7
Fix broken responsive layout | frontend | 73.1
Zero-downtime schema migration | full-stack | 84.6
Fix 12 WCAG accessibility violations in checkout form | frontend | 82.3
Add virtual scrolling to table rendering 5000 rows | frontend | 79.2
Build REST API from scratch | from-scratch | 72.3
Dockerize Node.js monorepo | full-stack | 73.2
Fix flaky test suite | debugging | 87.6
Optimize slow Postgres queries in Flask app | backend | 78.0
Build MCP server for database management | backend | 85.3
Fix Node.js stream backpressure causing OOM on large files | backend | 58.6
Code review: identify security vulns | code-review | 89.3
Build RAG pipeline with vector search | backend | 75.3
Write tests for untested legacy Flask service | code-review | 63.0
Build real-time portfolio risk calculator | backend | 74.5
Replace console.log with structured logging | refactoring | 61.9
Add rate limiting middleware | backend | 86.8
Fix auth bypass vulnerability | debugging | 86.5
Optimize bloated React bundle under 500KB | frontend | 75.6
Add GraphQL layer over REST API | multi-language | 51.4
Debug and fix 6 broken database triggers and constraints | debugging | 76.3
Write integration tests for payment flow | code-review | 42.4
Convert React app to PWA with offline support | frontend | 61.4
Add WebSocket real-time updates | full-stack | 83.3
Port Python CLI to Rust | multi-language | 57.6
Build distributed node cluster with gossip protocol | from-scratch | 71.6
Build codebase indexer for LLM context windows | from-scratch | 51.0
Write Kubernetes manifests for Node.js microservice | full-stack | 85.0
Add i18n with locale routing to Next.js app | full-stack | 75.1
Build terminal UI dashboard | from-scratch | 72.3
Implement JWT auth middleware | backend | 75.5
Add i18n with locale routing to Next.js app | full-stack | 69.9
Split 1100-line god file into proper modules | refactoring | 88.0
Find and patch all OWASP Top 10 vulnerabilities | debugging | 71.5
Implement multi-tenant row-level security in Postgres | backend | 76.2
Remove AI slop and over-engineering from codebase | refactoring | 84.9
Optimize bloated React bundle under 500KB | frontend | 77.8
Convert React app to PWA with offline support | frontend | 55.5
Fix broken responsive layout | frontend | 90.7
Implement zero-trust API authentication layer | backend | 79.9
Replace console.log with structured logging | refactoring | 62.2
Build codebase indexer for LLM context windows | from-scratch | 37.3
Harden insecure Docker setup with 12 vulnerabilities | code-review | 82.8
Dockerize Node.js monorepo | full-stack | 74.0
Add caching layer to eliminate slow SSR page loads | full-stack | 90.5
Write Kubernetes manifests for Node.js microservice | full-stack | 84.8
Find and fix 4 hidden backdoors in Flask app | debugging | 74.3
Implement background job scheduler with persistence | backend | 73.6
Implement transformer inference engine with KV cache | from-scratch | 87.7
Build MCP server for database management | backend | 82.5
Fix hallucination and context window bugs in RAG agent | backend | 61.0
Build production website with auth and members area | frontend | 63.5
Build SaaS admin dashboard from scratch | from-scratch | 69.0
Build real-time portfolio risk calculator | backend | 70.0
Build LLM evaluation harness with structured grading | backend | 66.1
Build CLI tool with subcommands and config | from-scratch | 56.3
Fix race conditions in order matching engine | backend | 86.8
Write complex SQL report with window functions | backend | 80.7
Build materialized view refresh pipeline for analytics | backend | 64.3
Fix data integrity bugs in denormalized e-commerce schema | debugging | 75.8
Fix deadlocking transaction patterns in Flask app | backend | 60.5
Debug and fix 6 broken database triggers and constraints | debugging | 78.8
Write tests for untested legacy Flask service | code-review | 49.3
Optimize slow Postgres queries in Flask app | backend | 84.0
Add slash commands and moderation to Discord bot | backend | 69.5
Fix 12 WCAG accessibility violations in checkout form | frontend | 84.8
Add retry logic and dead letter queue to Python task queue | backend | 80.5
Fix Node.js stream backpressure causing OOM on large files | backend | 90.4
Add virtual scrolling to table rendering 5000 rows | frontend | 85.9
Write integration tests for payment flow | code-review | 75.0
Fix auth bypass vulnerability | debugging | 88.9
Build distributed node cluster with gossip protocol | from-scratch | 72.4
Zero-downtime schema migration | full-stack | 62.1
Add rate limiting middleware | backend | 79.0
Fix flaky test suite | debugging | 82.4
Refactor monolithic handler to CQRS | refactoring | 78.4
Fix N+1 query in dashboard | backend | 86.1
Fix React hydration mismatch | frontend | 82.7
Fix memory leak in event handler | debugging | 55.2
Debug race condition in worker pool | debugging | 86.7
Build terminal UI dashboard | from-scratch | 75.9
Build REST API from scratch | from-scratch | 88.0
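
Many tasks appear twice in the results, so a per-task mean and run-to-run spread can be more informative than a single score. The sketch below is one possible way to compute both from (task, category, score) rows. It uses six rows copied from the table above; the `summarize` helper and the choice of max minus min as the spread are illustrative assumptions, not part of the benchmark.

```python
from collections import defaultdict
from statistics import mean

# A small sample of rows copied from the results table.
rows = [
    ("Fix flaky test suite", "debugging", 87.6),
    ("Fix flaky test suite", "debugging", 82.4),
    ("Build REST API from scratch", "from-scratch", 72.3),
    ("Build REST API from scratch", "from-scratch", 88.0),
    ("Build codebase indexer for LLM context windows", "from-scratch", 51.0),
    ("Build codebase indexer for LLM context windows", "from-scratch", 37.3),
]


def summarize(rows):
    """Return {task: (mean score, max - min spread)} across repeated runs."""
    by_task = defaultdict(list)
    for task, _category, score in rows:
        by_task[task].append(score)
    return {
        task: (mean(scores), max(scores) - min(scores))
        for task, scores in by_task.items()
    }


summary = summarize(rows)
for task, (avg, spread) in summary.items():
    print(f"{task}: mean {avg:.2f}, spread {spread:.1f}")
```

In this sample, "Build REST API from scratch" varies by more than 15 points between its two runs, which suggests a single run is a noisy estimate for some tasks.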