APEX

Gemini 2.5 Flash Lite

Google

1000K context | $0.10/M input | $0.40/M output
1201 (peak 1210)

Avg Score: 52.9
Avg Cost: $0.02
Score/$: 2183.8
Runs: 52

Category ELOs

Category | Difficulty | ELO
debugging | medium | 1579
from-scratch | hard | 1362
refactoring | medium | 1338
from-scratch | easy | 1332
multi-language | - | 1300
multi-language | expert | 1288
backend | expert | 1286
refactoring | - | 1268
full-stack | medium | 1260
frontend | medium | 1249
debugging | - | 1240
code-review | medium | 1228
code-review | - | 1212
from-scratch | - | 1197
frontend | - | 1154
full-stack | - | 1150
backend | - | 1142
debugging | expert | 1135
debugging | hard | 1083
backend | hard | 964
backend | medium | 956
full-stack | hard | 881
frontend | easy | 552
frontend | hard | 546
frontend | expert | 417
code-review | hard | 307
backend | easy | 228
refactoring | expert | 103
from-scratch | medium | 0
multi-language | hard | 0
from-scratch | expert | 0

All Results

Task | Category | Score
Fix broken GitHub Actions CI pipeline | debugging | 52.8
Implement zero-trust API authentication layer | backend | 62.3
Add caching layer to eliminate slow SSR page loads | full-stack | 65.5
Add i18n with locale routing to Next.js app | full-stack | 19.3
Remove AI slop and over-engineering from codebase | refactoring | 65.0
Optimize bloated React bundle under 500KB | frontend | 68.5
Build codebase indexer for LLM context windows | from-scratch | 47.1
Add streaming SSE endpoint for LLM chat | backend | 67.5
Find and patch all OWASP Top 10 vulnerabilities | debugging | 37.8
Implement multi-tenant row-level security in Postgres | backend | 33.4
Split 1100-line god file into proper modules | refactoring | 50.8
Harden insecure Docker setup with 12 vulnerabilities | code-review | 70.0
Write Kubernetes manifests for Node.js microservice | full-stack | 80.4
Convert React app to PWA with offline support | frontend | 40.9
Fix broken responsive layout | frontend | 60.1
Replace console.log with structured logging | refactoring | 69.3
Implement JWT auth middleware | backend | 34.8
Dockerize Node.js monorepo | full-stack | 67.0
Port Python CLI to Rust | multi-language | 58.9
Migrate callback-hell Express app to async/await | refactoring | 73.8
Write complex SQL report with window functions | backend | 50.8
Build MCP server for database management | backend | 51.8
Implement transformer inference engine with KV cache | from-scratch | 37.5
Build production website with auth and members area | frontend | 44.5
Fix deadlocking transaction patterns in Flask app | backend | 68.0
Fix N+1 query in dashboard | backend | 45.0
Build real-time portfolio risk calculator | backend | 57.2
Debug and fix 6 broken database triggers and constraints | debugging | 74.4
Build LLM evaluation harness with structured grading | backend | 44.9
Optimize slow Postgres queries in Flask app | backend | 61.1
Zero-downtime schema migration | full-stack | 53.0
Add cursor-based pagination to REST API | backend | 27.8
Write integration tests for payment flow | code-review | 37.8
Build distributed node cluster with gossip protocol | from-scratch | 36.0
Add slash commands and moderation to Discord bot | backend | 51.0
Write tests for untested legacy Flask service | code-review | 45.5
Fix 12 WCAG accessibility violations in checkout form | frontend | 64.3
Fix flaky test suite | debugging | 81.2
Add rate limiting middleware | backend | 46.3
Find and fix 4 hidden backdoors in Flask app | debugging | 67.3
Implement Stripe webhook handler | backend | 48.3
Add GraphQL layer over REST API | multi-language | 29.8
Fix hallucination and context window bugs in RAG agent | backend | 29.5
Fix data integrity bugs in denormalized e-commerce schema | debugging | 45.5
Refactor monolithic handler to CQRS | refactoring | 34.7
Fix memory leak in event handler | debugging | 59.9
Add WebSocket real-time updates | full-stack | 62.7
Debug race condition in worker pool | debugging | 57.3
Fix React hydration mismatch | frontend | 65.3
Build terminal UI dashboard | from-scratch | 18.4
Build REST API from scratch | from-scratch | 79.5
Fix race conditions in order matching engine | backend | 77.3