APEX

Gemini 3.5 Flash

Google

1049K context · $1.50/M input · $9.00/M output
ELO 1639 (peak 1655)

Avg Score: 79.8
Avg Cost: $0.27
Score/$: 292.9
Runs: 70

[Charts not shown: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
from-scratch | medium | 2158
frontend | expert | 2136
backend | easy | 2109
from-scratch | easy | 2089
from-scratch | hard | 1929
code-review | hard | 1923
from-scratch | expert | 1904
multi-language | expert | 1896
frontend | hard | 1838
from-scratch | - | 1809
refactoring | medium | 1764
backend | hard | 1756
debugging | medium | 1750
frontend | easy | 1747
backend | expert | 1741
refactoring | - | 1711
full-stack | hard | 1709
backend | - | 1682
refactoring | expert | 1675
code-review | medium | 1665
code-review | - | 1661
full-stack | - | 1641
backend | medium | 1612
frontend | - | 1564
full-stack | medium | 1559
debugging | - | 1517
debugging | expert | 1498
multi-language | - | 1497
frontend | master | 1495
frontend | medium | 1465
debugging | hard | 1458
backend | master | 1450
multi-language | hard | 1438

All Results

Task | Category | Score
Fix and extend Chrome browser extension | frontend | 58.3
Build interactive data visualization dashboard | frontend | 76.4
Write tests for untested legacy Flask service | code-review | 84.7
Fix auth bypass vulnerability | debugging | 78.8
Build multi-tool LLM agent runtime | backend | 80.8
Build REST API from scratch | from-scratch | 88.2
Add virtual scrolling to table rendering 5000 rows | frontend | 72.5
Add GraphQL layer over REST API | multi-language | 73.2
Debug and fix 6 broken database triggers and constraints | debugging | 82.5
Harden insecure Docker setup with 12 vulnerabilities | code-review | 88.5
Implement JWT auth middleware | backend | 74.3
Add WebSocket real-time updates | full-stack | 85.0
Build distributed node cluster with gossip protocol | from-scratch | 82.5
Fix React hydration mismatch | frontend | 70.0
Implement zero-trust API authentication layer | backend | 80.5
Build materialized view refresh pipeline for analytics | backend | 79.6
Fix memory leak in event handler | debugging | 71.3
Add rate limiting middleware | backend | 83.9
Optimize bloated React bundle under 500KB | frontend | 83.2
Add caching layer to eliminate slow SSR page loads | full-stack | 82.0
Write complex SQL report with window functions | backend | 88.5
Add i18n with locale routing to Next.js app | full-stack | 81.8
Optimize slow Postgres queries in Flask app | backend | 87.5
Find and patch all OWASP Top 10 vulnerabilities | debugging | 63.5
Fix 12 WCAG accessibility violations in checkout form | frontend | 85.1
Add slash commands and moderation to Discord bot | backend | 81.7
Build SaaS admin dashboard from scratch | from-scratch | 77.2
Zero-downtime schema migration | full-stack | 81.0
Add Redis caching layer to Express API | backend | 85.2
Fix broken GitHub Actions CI pipeline | debugging | 90.9
Code review: identify security vulns | code-review | 73.5
Fix N+1 query in dashboard | backend | 73.0
Build LLM evaluation harness with structured grading | backend | 77.3
Fix data integrity bugs in denormalized e-commerce schema | debugging | 85.6
Debug race condition in worker pool | debugging | 84.7
Build MCP server for database management | backend | 83.3
Fix Node.js stream backpressure causing OOM on large files | backend | 91.3
Add streaming SSE endpoint for LLM chat | backend | 82.4
Fix deadlocking transaction patterns in Flask app | backend | 83.0
Write Kubernetes manifests for Node.js microservice | full-stack | 86.0
Replace console.log with structured logging | refactoring | 82.4
Write integration tests for payment flow | code-review | 80.0
Build 3D browser game with physics and multiplayer sync | frontend | 76.0
Migrate Express monolith to modular architecture | backend | 67.9
Implement multi-tenant row-level security in Postgres | backend | 79.2
Migrate callback-hell Express app to async/await | refactoring | 84.8
Build RAG pipeline with vector search | backend | 79.7
Find and fix 4 hidden backdoors in Flask app | debugging | 90.2
Convert React app to PWA with offline support | frontend | 74.6
Add Google OAuth2 login to Express app | full-stack | 76.3
Add cursor-based pagination to REST API | backend | 73.5
Add retry logic and dead letter queue to Python task queue | backend | 79.8
Fix broken responsive layout | frontend | 78.7
Implement background job scheduler with persistence | backend | 79.7
Build real-time portfolio risk calculator | backend | 72.7
Add file upload with S3 presigned URLs | backend | 86.5
Fix flaky test suite | debugging | 87.5
Refactor monolithic handler to CQRS | refactoring | 70.0
Port Python CLI to Rust | multi-language | 65.3
Fix race conditions in order matching engine | backend | 87.3
Build terminal UI dashboard | from-scratch | 75.0
Dockerize Node.js monorepo | full-stack | 76.7
Implement transformer inference engine with KV cache | from-scratch | 83.2
Remove AI slop and over-engineering from codebase | refactoring | 83.4
Implement Stripe webhook handler | backend | 82.9
Fix hallucination and context window bugs in RAG agent | backend | 77.7
Build codebase indexer for LLM context windows | from-scratch | 74.9
Build CLI tool with subcommands and config | from-scratch | 82.0
Build production website with auth and members area | frontend | 76.3
Split 1100-line god file into proper modules | refactoring | 83.7