APEX

Gemini 3.1 Pro Preview

OpenRouter

1049K context · $2.00/M input · $12.00/M output
1672 (peak 1676)

Avg Score: 75.7
Avg Cost: $0.57
Score/$: 132.6
Runs: 101
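The Score/$ figure appears to relate the average score to the average cost per run. The page does not state the formula, so the sketch below assumes a simple ratio; the function name `score_per_dollar` is hypothetical. It also shows that the displayed 132.6 is consistent with this ratio once rounding of the displayed $0.57 is taken into account.

```python
def score_per_dollar(avg_score: float, avg_cost_usd: float) -> float:
    """Assumed definition: average score divided by average cost per run."""
    return avg_score / avg_cost_usd


# Values as displayed on the page (both are rounded for display).
displayed = score_per_dollar(75.7, 0.57)

# A cost shown as $0.57 could be anywhere in [0.565, 0.575) before rounding,
# which bounds the ratio the page could report under this assumption.
low = score_per_dollar(75.7, 0.575)
high = score_per_dollar(75.7, 0.565)

print(f"{displayed:.1f}")            # 132.8
print(f"{low:.1f} to {high:.1f}")    # 131.7 to 134.0
```

The ratio of the displayed values is 132.8 rather than 132.6, which suggests the page computes Score/$ from unrounded figures, or from a different aggregation such as a per-run average.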

[Charts: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category (difficulty) | ELO
multi-language (expert) | 3238
frontend (easy) | 2632
backend (easy) | 2631
refactoring (expert) | 2558
from-scratch (medium) | 2530
frontend (expert) | 2354
code-review (hard) | 2125
from-scratch (expert) | 2090
from-scratch (hard) | 2015
from-scratch | 1911
multi-language (hard) | 1829
from-scratch (easy) | 1819
refactoring | 1818
frontend (hard) | 1806
frontend | 1737
backend (hard) | 1724
refactoring (medium) | 1724
multi-language | 1684
backend | 1675
code-review | 1670
backend (medium) | 1667
debugging (expert) | 1660
full-stack (medium) | 1633
code-review (medium) | 1629
frontend (medium) | 1627
backend (expert) | 1612
full-stack | 1605
full-stack (hard) | 1604
debugging (hard) | 1593
debugging | 1489
debugging (medium) | 992

All Results

Task | Category | Score
Replace console.log with structured logging | refactoring | 68.9
Build SaaS admin dashboard from scratch | from-scratch | 63.6
Fix Node.js stream backpressure causing OOM on large files | backend | 86.3
Find and patch all OWASP Top 10 vulnerabilities | debugging | 69.7
Build LLM evaluation harness with structured grading | backend | 81.7
Find and fix 4 hidden backdoors in Flask app | debugging | 90.7
Fix React hydration mismatch | frontend | 71.1
Implement background job scheduler with persistence | backend | 79.7
Implement multi-tenant row-level security in Postgres | backend | 80.3
Write complex SQL report with window functions | backend | 78.1
Fix broken responsive layout | frontend | 78.0
Write tests for untested legacy Flask service | code-review | 49.9
Build codebase indexer for LLM context windows | from-scratch | 54.8
Implement Stripe webhook handler | backend | 81.2
Build CLI tool with subcommands and config | from-scratch | 82.5
Fix race conditions in order matching engine | backend | 91.8
Add streaming SSE endpoint for LLM chat | backend | 87.7
Split 1100-line god file into proper modules | refactoring | 61.3
Debug and fix 6 broken database triggers and constraints | debugging | 90.0
Build terminal UI dashboard | from-scratch | 65.3
Migrate callback-hell Express app to async/await | refactoring | 65.0
Fix 12 WCAG accessibility violations in checkout form | frontend | 85.5
Code review: identify security vulns | code-review | 78.1
Fix memory leak in event handler | debugging | 48.9
Fix auth bypass vulnerability | debugging | 92.1
Zero-downtime schema migration | full-stack | 82.8
Add WebSocket real-time updates | full-stack | 76.5
Build real-time portfolio risk calculator | backend | 65.0
Add retry logic and dead letter queue to Python task queue | backend | 88.3
Optimize slow Postgres queries in Flask app | backend | 85.0
Add cursor-based pagination to REST API | backend | 79.2
Build distributed node cluster with gossip protocol | from-scratch | 61.9
Build production website with auth and members area | frontend | 64.6
Fix data integrity bugs in denormalized e-commerce schema | debugging | 84.2
Add rate limiting middleware | backend | 87.5
Remove AI slop and over-engineering from codebase | refactoring | 85.8
Add caching layer to eliminate slow SSR page loads | full-stack | 89.2
Add Redis caching layer to Express API | backend | 46.5
Fix broken GitHub Actions CI pipeline | debugging | 67.5
Add GraphQL layer over REST API | multi-language | 78.5
Port Python CLI to Rust | multi-language | 76.5
Code review: identify security vulns | code-review | 83.2
Add WebSocket real-time updates | full-stack | 84.2
Build RAG pipeline with vector search | backend | 78.3
Add file upload with S3 presigned URLs | backend | 44.3
Build codebase indexer for LLM context windows | from-scratch | 57.6
Implement zero-trust API authentication layer | backend | 67.7
Add i18n with locale routing to Next.js app | full-stack | 35.8
Fix broken responsive layout | frontend | 89.5
Split 1100-line god file into proper modules | refactoring | 86.8
Implement JWT auth middleware | backend | 69.0
Write Kubernetes manifests for Node.js microservice | full-stack | 93.8
Convert React app to PWA with offline support | frontend | 63.4
Optimize bloated React bundle under 500KB | frontend | 76.6
Remove AI slop and over-engineering from codebase | refactoring | 73.5
Add caching layer to eliminate slow SSR page loads | full-stack | 87.6
Dockerize Node.js monorepo | full-stack | 63.5
Find and patch all OWASP Top 10 vulnerabilities | debugging | 74.6
Replace console.log with structured logging | refactoring | 64.2
Add streaming SSE endpoint for LLM chat | backend | 69.8
Harden insecure Docker setup with 12 vulnerabilities | code-review | 87.2
Implement multi-tenant row-level security in Postgres | backend | 42.4
Write tests for untested legacy Flask service | code-review | 42.9
Fix deadlocking transaction patterns in Flask app | backend | 57.1
Build production website with auth and members area | frontend | 74.9
Build CLI tool with subcommands and config | from-scratch | 76.6
Build SaaS admin dashboard from scratch | from-scratch | 76.9
Implement transformer inference engine with KV cache | from-scratch | 89.8
Implement background job scheduler with persistence | backend | 80.7
Build MCP server for database management | backend | 86.1
Build real-time portfolio risk calculator | backend | 72.8
Fix hallucination and context window bugs in RAG agent | backend | 64.5
Build LLM evaluation harness with structured grading | backend | 80.3
Debug and fix 6 broken database triggers and constraints | debugging | 72.3
Fix data integrity bugs in denormalized e-commerce schema | debugging | 78.3
Fix race conditions in order matching engine | backend | 90.5
Write complex SQL report with window functions | backend | 65.6
Build materialized view refresh pipeline for analytics | backend | 66.3
Find and fix 4 hidden backdoors in Flask app | debugging | 87.4
Optimize slow Postgres queries in Flask app | backend | 90.4
Add virtual scrolling to table rendering 5000 rows | frontend | 80.5
Add Google OAuth2 login to Express app | full-stack | 76.0
Fix Node.js stream backpressure causing OOM on large files | backend | 81.9
Add slash commands and moderation to Discord bot | backend | 85.1
Add retry logic and dead letter queue to Python task queue | backend | 89.4
Fix 12 WCAG accessibility violations in checkout form | frontend | 83.0
Write integration tests for payment flow | code-review | 74.7
Build distributed node cluster with gossip protocol | from-scratch | 85.8
Fix auth bypass vulnerability | debugging | 92.9
Refactor monolithic handler to CQRS | refactoring | 80.5
Add rate limiting middleware | backend | 85.2
Zero-downtime schema migration | full-stack | 64.3
Fix React hydration mismatch | frontend | 79.5
Implement Stripe webhook handler | backend | 76.7
Fix flaky test suite | debugging | 63.8
Build terminal UI dashboard | from-scratch | 79.6
Fix N+1 query in dashboard | backend | 90.6
Add cursor-based pagination to REST API | backend | 85.2
Debug race condition in worker pool | debugging | 87.0
Fix memory leak in event handler | debugging | 73.0
Build REST API from scratch | from-scratch | 86.3