APEX
Gemini 3.1 Pro Preview

OpenRouter

1049K context, $2.00/M input, $12.00/M output
ELO 1564 (peak 1580)

Avg Score: 75.3
Avg Cost: $0.53
Score/$: 142.5
Runs: 65
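
The page does not state how Score/$ is derived. A minimal sketch, assuming it is simply the average score divided by the average cost in dollars (the function name `score_per_dollar` is hypothetical): with the rounded figures shown here this gives about 142.1 rather than the 142.5 reported, so the site presumably works from unrounded or per-run values.

```python
def score_per_dollar(avg_score: float, avg_cost_usd: float) -> float:
    """Score points per dollar spent.

    Assumed formula (not confirmed by the page): avg_score / avg_cost_usd.
    """
    if avg_cost_usd <= 0:
        raise ValueError("average cost must be positive")
    return avg_score / avg_cost_usd


# Rounded values taken from the summary above.
ratio = score_per_dollar(75.3, 0.53)
print(f"{ratio:.1f}")
```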

Category ELOs

multi-language (expert): 2331
refactoring (expert): 2250
backend (easy): 2245
from-scratch (expert): 2157
from-scratch (easy): 2012
from-scratch (medium): 1912
frontend (hard): 1884
multi-language (hard): 1738
frontend (easy): 1690
from-scratch: 1686
from-scratch (hard): 1663
multi-language: 1657
full-stack (medium): 1637
debugging (expert): 1621
backend (hard): 1611
backend: 1599
debugging (hard): 1582
backend (medium): 1578
code-review (medium): 1575
full-stack: 1574
backend (expert): 1567
code-review (hard): 1555
refactoring: 1544
code-review: 1538
full-stack (hard): 1533
debugging: 1487
frontend: 1482
refactoring (medium): 1460
frontend (medium): 1418
frontend (expert): 1331
debugging (medium): 989

All Results

Task | Category | Score
Replace console.log with structured logging | refactoring | 68.9
Build SaaS admin dashboard from scratch | from-scratch | 63.6
Fix Node.js stream backpressure causing OOM on large files | backend | 86.3
Find and patch all OWASP Top 10 vulnerabilities | debugging | 69.7
Build LLM evaluation harness with structured grading | backend | 81.7
Find and fix 4 hidden backdoors in Flask app | debugging | 90.7
Fix React hydration mismatch | frontend | 71.1
Implement background job scheduler with persistence | backend | 79.7
Implement multi-tenant row-level security in Postgres | backend | 80.3
Write complex SQL report with window functions | backend | 78.1
Fix broken responsive layout | frontend | 78.0
Write tests for untested legacy Flask service | code-review | 49.9
Build codebase indexer for LLM context windows | from-scratch | 54.8
Implement Stripe webhook handler | backend | 81.2
Build CLI tool with subcommands and config | from-scratch | 82.5
Fix race conditions in order matching engine | backend | 91.8
Add streaming SSE endpoint for LLM chat | backend | 87.7
Split 1100-line god file into proper modules | refactoring | 61.3
Debug and fix 6 broken database triggers and constraints | debugging | 90.0
Build terminal UI dashboard | from-scratch | 70.5
Migrate callback-hell Express app to async/await | refactoring | 65.0
Fix 12 WCAG accessibility violations in checkout form | frontend | 85.5
Code review: identify security vulns | code-review | 83.2
Fix memory leak in event handler | debugging | 48.9
Fix auth bypass vulnerability | debugging | 92.1
Zero-downtime schema migration | full-stack | 82.8
Add WebSocket real-time updates | full-stack | 84.2
Build real-time portfolio risk calculator | backend | 65.0
Add retry logic and dead letter queue to Python task queue | backend | 88.3
Optimize slow Postgres queries in Flask app | backend | 85.0
Add cursor-based pagination to REST API | backend | 79.2
Build distributed node cluster with gossip protocol | from-scratch | 69.9
Build production website with auth and members area | frontend | 64.6
Fix data integrity bugs in denormalized e-commerce schema | debugging | 84.2
Add rate limiting middleware | backend | 87.5
Remove AI slop and over-engineering from codebase | refactoring | 85.8
Add caching layer to eliminate slow SSR page loads | full-stack | 89.2
Add Redis caching layer to Express API | backend | 46.5
Fix broken GitHub Actions CI pipeline | debugging | 67.5
Add GraphQL layer over REST API | multi-language | 78.5
Port Python CLI to Rust | multi-language | 76.5
Build RAG pipeline with vector search | backend | 78.3
Add file upload with S3 presigned URLs | backend | 44.3
Implement zero-trust API authentication layer | backend | 67.7
Add i18n with locale routing to Next.js app | full-stack | 35.8
Implement JWT auth middleware | backend | 69.0
Write Kubernetes manifests for Node.js microservice | full-stack | 93.8
Convert React app to PWA with offline support | frontend | 63.4
Optimize bloated React bundle under 500KB | frontend | 76.6
Dockerize Node.js monorepo | full-stack | 63.5
Harden insecure Docker setup with 12 vulnerabilities | code-review | 87.2
Fix deadlocking transaction patterns in Flask app | backend | 57.1
Implement transformer inference engine with KV cache | from-scratch | 89.8
Build MCP server for database management | backend | 86.1
Fix hallucination and context window bugs in RAG agent | backend | 64.5
Build materialized view refresh pipeline for analytics | backend | 66.3
Add virtual scrolling to table rendering 5000 rows | frontend | 80.5
Add Google OAuth2 login to Express app | full-stack | 76.0
Add slash commands and moderation to Discord bot | backend | 85.1
Write integration tests for payment flow | code-review | 74.7
Refactor monolithic handler to CQRS | refactoring | 80.5
Fix flaky test suite | debugging | 63.8
Fix N+1 query in dashboard | backend | 90.6
Debug race condition in worker pool | debugging | 87.0
Build REST API from scratch | from-scratch | 86.3