APEX
Kimi K2.5

OpenRouter

262K context | $0.45/M input | $2.25/M output
1590 (peak 1603)

Avg Score: 72.1
Avg Cost: $0.13
Score/$: 563.7
Runs: 122
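The Score/$ figure can be sanity-checked against the other summary values. The sketch below assumes Score/$ is simply average score divided by average cost in USD, which the page does not state; the variable names are illustrative. Under that assumption the rounded values give roughly 554.6 rather than 563.7, which would be consistent with the displayed $0.13 being rounded from about $0.128, or with the ratio being averaged per run instead.

```python
# Summary values as displayed on the page.
avg_score = 72.1
avg_cost_usd = 0.13          # displayed value, probably rounded
displayed_score_per_dollar = 563.7

# Assumption: Score/$ = average score / average cost.
score_per_dollar = avg_score / avg_cost_usd

# Average cost that would reproduce the displayed Score/$ exactly.
implied_cost_usd = avg_score / displayed_score_per_dollar

print(f"score per dollar from rounded values: {score_per_dollar:.1f}")
print(f"implied unrounded average cost: ${implied_cost_usd:.4f}")
```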

[Charts not captured: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
backend | easy | 2334
multi-language | expert | 2238
code-review | hard | 2170
multi-language | hard | 2101
from-scratch | expert | 2072
from-scratch | medium | 2043
frontend | expert | 1828
full-stack | medium | 1728
frontend | hard | 1721
code-review | - | 1701
full-stack | hard | 1700
full-stack | - | 1686
multi-language | - | 1684
code-review | medium | 1678
debugging | medium | 1664
from-scratch | hard | 1650
from-scratch | - | 1640
debugging | expert | 1604
debugging | hard | 1597
backend | medium | 1580
debugging | - | 1578
backend | - | 1577
backend | hard | 1574
backend | expert | 1571
frontend | - | 1570
from-scratch | easy | 1559
refactoring | medium | 1552
frontend | medium | 1540
frontend | easy | 1538
refactoring | - | 1524
refactoring | expert | 1376

All Results

Task | Category | Score
Add Google OAuth2 login to Express app | full-stack | 82.5
Build LLM evaluation harness with structured grading | backend | 73.8
Find and patch all OWASP Top 10 vulnerabilities | debugging | 69.3
Write integration tests for payment flow | code-review | 75.0
Fix hallucination and context window bugs in RAG agent | backend | 49.1
Build distributed node cluster with gossip protocol | from-scratch | 48.0
Add i18n with locale routing to Next.js app | full-stack | 74.7
Implement multi-tenant row-level security in Postgres | backend | 73.7
Port Python CLI to Rust | multi-language | 47.9
Debug race condition in worker pool | debugging | 89.5
Build real-time portfolio risk calculator | backend | 74.9
Build SaaS admin dashboard from scratch | from-scratch | 55.6
Fix race conditions in order matching engine | backend | 74.5
Add cursor-based pagination to REST API | backend | 80.8
Fix broken responsive layout | frontend | 70.1
Find and fix 4 hidden backdoors in Flask app | debugging | 93.8
Convert React app to PWA with offline support | frontend | 66.3
Implement transformer inference engine with KV cache | from-scratch | 86.0
Fix N+1 query in dashboard | backend | 53.4
Fix deadlocking transaction patterns in Flask app | backend | 77.4
Add rate limiting middleware | backend | 81.7
Fix memory leak in event handler | debugging | 81.8
Add retry logic and dead letter queue to Python task queue | backend | 82.9
Fix Node.js stream backpressure causing OOM on large files | backend | 90.3
Write Kubernetes manifests for Node.js microservice | full-stack | 90.2
Add GraphQL layer over REST API | multi-language | 84.1
Code review: identify security vulns | code-review | 79.0
Zero-downtime schema migration | full-stack | 76.1
Debug and fix 6 broken database triggers and constraints | debugging | 71.9
Build terminal UI dashboard | from-scratch | 62.3
Fix 12 WCAG accessibility violations in checkout form | frontend | 84.3
Build MCP server for database management | backend | 84.0
Fix broken GitHub Actions CI pipeline | debugging | 83.3
Fix auth bypass vulnerability | debugging | 88.5
Split 1100-line god file into proper modules | refactoring | 63.8
Implement JWT auth middleware | backend | 53.8
Optimize slow Postgres queries in Flask app | backend | 78.4
Add slash commands and moderation to Discord bot | backend | 72.7
Add file upload with S3 presigned URLs | backend | 79.8
Fix React hydration mismatch | frontend | 83.3
Add WebSocket real-time updates | full-stack | 84.4
Build production website with auth and members area | frontend | 67.1
Build codebase indexer for LLM context windows | from-scratch | 27.5
Add virtual scrolling to table rendering 5000 rows | frontend | 54.0
Replace console.log with structured logging | refactoring | 44.7
Fix data integrity bugs in denormalized e-commerce schema | debugging | 85.0
Implement Stripe webhook handler | backend | 84.4
Harden insecure Docker setup with 12 vulnerabilities | code-review | 76.3
Dockerize Node.js monorepo | full-stack | 75.3
Implement zero-trust API authentication layer | backend | 67.5
Add streaming SSE endpoint for LLM chat | backend | 81.8
Optimize bloated React bundle under 500KB | frontend | 80.5
Build materialized view refresh pipeline for analytics | backend | 76.7
Fix flaky test suite | debugging | 87.5
Add caching layer to eliminate slow SSR page loads | full-stack | 81.2
Write tests for untested legacy Flask service | code-review | 53.3
Build RAG pipeline with vector search | backend | 49.5
Remove AI slop and over-engineering from codebase | refactoring | 74.8
Implement background job scheduler with persistence | backend | 58.0
Migrate callback-hell Express app to async/await | refactoring | 58.0
Write complex SQL report with window functions | backend | 66.8
Build CLI tool with subcommands and config | from-scratch | 74.5
Refactor monolithic handler to CQRS | refactoring | 42.6
Add Redis caching layer to Express API | backend | 85.2
Build REST API from scratch | from-scratch | 77.2
Build codebase indexer for LLM context windows | from-scratch | 31.9
Harden insecure Docker setup with 12 vulnerabilities | code-review | 73.2
Convert React app to PWA with offline support | frontend | 64.2
Replace console.log with structured logging | refactoring | 62.7
Fix broken responsive layout | frontend | 74.7
Dockerize Node.js monorepo | full-stack | 75.2
Add caching layer to eliminate slow SSR page loads | full-stack | 85.9
Implement zero-trust API authentication layer | backend | 71.7
Implement multi-tenant row-level security in Postgres | backend | 64.2
Split 1100-line god file into proper modules | refactoring | 88.8
Optimize bloated React bundle under 500KB | frontend | 74.8
Remove AI slop and over-engineering from codebase | refactoring | 76.8
Find and patch all OWASP Top 10 vulnerabilities | debugging | 75.0
Implement JWT auth middleware | backend | 30.3
Add i18n with locale routing to Next.js app | full-stack | 72.5
Add file upload with S3 presigned URLs | backend | 74.3
Write Kubernetes manifests for Node.js microservice | full-stack | 82.7
Build MCP server for database management | backend | 76.0
Implement background job scheduler with persistence | backend | 71.3
Build production website with auth and members area | frontend | 69.6
Build SaaS admin dashboard from scratch | from-scratch | 73.5
Implement transformer inference engine with KV cache | from-scratch | 86.9
Build CLI tool with subcommands and config | from-scratch | 34.8
Build real-time portfolio risk calculator | backend | 72.1
Write integration tests for payment flow | code-review | 75.2
Fix race conditions in order matching engine | backend | 73.2
Debug and fix 6 broken database triggers and constraints | debugging | 77.2
Find and fix 4 hidden backdoors in Flask app | debugging | 69.7
Fix Node.js stream backpressure causing OOM on large files | backend | 90.4
Add GraphQL layer over REST API | multi-language | 70.7
Write complex SQL report with window functions | backend | 81.9
Zero-downtime schema migration | full-stack | 70.6
Add cursor-based pagination to REST API | backend | 67.6
Refactor monolithic handler to CQRS | refactoring | 60.5
Add slash commands and moderation to Discord bot | backend | 71.7
Build distributed node cluster with gossip protocol | from-scratch | 67.2
Fix flaky test suite | debugging | 83.2
Code review: identify security vulns | code-review | 85.8
Fix hallucination and context window bugs in RAG agent | backend | 52.0
Add Redis caching layer to Express API | backend | 77.8
Fix 12 WCAG accessibility violations in checkout form | frontend | 78.9
Add virtual scrolling to table rendering 5000 rows | frontend | 76.7
Write tests for untested legacy Flask service | code-review | 61.5
Build terminal UI dashboard | from-scratch | 49.5
Build LLM evaluation harness with structured grading | backend | 54.1
Fix auth bypass vulnerability | debugging | 92.9
Add rate limiting middleware | backend | 63.0
Fix React hydration mismatch | frontend | 76.2
Optimize slow Postgres queries in Flask app | backend | 82.7
Fix deadlocking transaction patterns in Flask app | backend | 71.3
Implement Stripe webhook handler | backend | 67.8
Fix data integrity bugs in denormalized e-commerce schema | debugging | 78.5
Fix memory leak in event handler | debugging | 64.0
Build REST API from scratch | from-scratch | 83.3
Debug race condition in worker pool | debugging | 88.4
Add retry logic and dead letter queue to Python task queue | backend | 74.5
Fix N+1 query in dashboard | backend | 89.7
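
Many tasks in the results list above appear more than once with different scores, so a per-task summary helps separate consistent results from noisy ones. The sketch below is one way to do that in Python; the helper name summarize_by_task and the (task, category, score) tuple layout are illustrative assumptions, and only five rows copied from the list are included.

```python
from collections import defaultdict

# A few (task, category, score) rows copied from the results list.
rows = [
    ("Debug race condition in worker pool", "debugging", 89.5),
    ("Debug race condition in worker pool", "debugging", 88.4),
    ("Implement JWT auth middleware", "backend", 53.8),
    ("Implement JWT auth middleware", "backend", 30.3),
    ("Port Python CLI to Rust", "multi-language", 47.9),
]


def summarize_by_task(rows):
    """Group scores by task; return {task: (n_runs, mean, max - min)}."""
    scores = defaultdict(list)
    for task, _category, score in rows:
        scores[task].append(score)
    return {
        task: (len(vals), sum(vals) / len(vals), max(vals) - min(vals))
        for task, vals in scores.items()
    }


summary = summarize_by_task(rows)
for task, (n_runs, mean, spread) in summary.items():
    print(f"{task}: runs={n_runs} mean={mean:.2f} spread={spread:.1f}")
```

On these sample rows the worker-pool task varies by about one point between runs, while the JWT middleware task varies by more than twenty, so a single run of the latter says little on its own.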