APEX

Kimi K2.6

OpenRouter

262K context, $0.73/M input, $3.49/M output
1678 (peak 1694)

Avg Score: 80.8
Avg Cost: $0.47
Score/$: 170.7
Runs: 70
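The page does not say how Score/$ is derived. As a rough check, the sketch below assumes it is the average score divided by the average cost per run (an assumption, not something the page confirms) and compares that against the published figure. The variable and function names are illustrative only.

```python
# Summary figures as displayed on the page (already rounded).
avg_score = 80.8          # "Avg Score"
avg_cost = 0.47           # "Avg Cost" in USD, rounded to cents
score_per_dollar = 170.7  # "Score/$"


def ratio(score: float, cost: float) -> float:
    """Score per dollar under the ASSUMED definition: score / cost."""
    return score / cost


# Ratio recomputed from the rounded figures shown on the page.
naive = ratio(avg_score, avg_cost)

# Average cost that would reproduce the published Score/$ exactly.
implied_cost = avg_score / score_per_dollar

print(f"Score/$ from rounded figures: {naive:.1f}")
print(f"Avg cost implied by published Score/$: ${implied_cost:.4f}")
```

Under this assumption the rounded figures give about 171.9 rather than 170.7, and the implied average cost is about $0.4733, which still rounds to the displayed $0.47. The small gap is therefore consistent with display rounding, though the page may also compute the ratio differently (for example, per run and then averaged).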

[Charts not shown: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
multi-language | hard | 2381
refactoring | expert | 2312
code-review | hard | 2248
from-scratch | medium | 2238
backend | easy | 2201
from-scratch | easy | 2076
code-review | medium | 1830
code-review | - | 1829
frontend | hard | 1820
backend | hard | 1802
multi-language | - | 1796
refactoring | - | 1778
from-scratch | hard | 1772
refactoring | medium | 1761
debugging | medium | 1761
debugging | expert | 1747
multi-language | expert | 1740
backend | expert | 1716
from-scratch | - | 1713
backend | - | 1696
full-stack | hard | 1695
from-scratch | expert | 1694
frontend | easy | 1679
frontend | expert | 1675
frontend | master | 1662
frontend | - | 1649
full-stack | - | 1648
backend | medium | 1643
frontend | medium | 1641
debugging | - | 1608
full-stack | medium | 1594
debugging | hard | 1504
backend | master | 1230

All Results

Task | Category | Score
Add streaming SSE endpoint for LLM chat | backend | 84.3
Replace console.log with structured logging | refactoring | 73.3
Fix and extend Chrome browser extension | frontend | 63.6
Migrate Express monolith to modular architecture | backend | 70.2
Build real-time portfolio risk calculator | backend | 41.7
Implement zero-trust API authentication layer | backend | 81.7
Add file upload with S3 presigned URLs | backend | 70.3
Write tests for untested legacy Flask service | code-review | 88.1
Optimize slow Postgres queries in Flask app | backend | 87.1
Build production website with auth and members area | frontend | 70.5
Build SaaS admin dashboard from scratch | from-scratch | 62.1
Add slash commands and moderation to Discord bot | backend | 75.7
Fix deadlocking transaction patterns in Flask app | backend | 86.6
Implement background job scheduler with persistence | backend | 72.5
Dockerize Node.js monorepo | full-stack | 81.9
Migrate callback-hell Express app to async/await | refactoring | 85.3
Build LLM evaluation harness with structured grading | backend | 82.9
Add Redis caching layer to Express API | backend | 86.6
Fix race conditions in order matching engine | backend | 90.9
Remove AI slop and over-engineering from codebase | refactoring | 86.3
Fix flaky test suite | debugging | 87.0
Split 1100-line god file into proper modules | refactoring | 86.4
Write Kubernetes manifests for Node.js microservice | full-stack | 86.8
Build MCP server for database management | backend | 87.0
Implement JWT auth middleware | backend | 82.4
Build CLI tool with subcommands and config | from-scratch | 70.5
Add GraphQL layer over REST API | multi-language | 91.2
Add WebSocket real-time updates | full-stack | 79.3
Fix N+1 query in dashboard | backend | 78.3
Code review: identify security vulns | code-review | 78.7
Add caching layer to eliminate slow SSR page loads | full-stack | 78.3
Build RAG pipeline with vector search | backend | 87.5
Build materialized view refresh pipeline for analytics | backend | 81.7
Build codebase indexer for LLM context windows | from-scratch | 81.3
Write integration tests for payment flow | code-review | 84.8
Build distributed node cluster with gossip protocol | from-scratch | 81.6
Fix broken GitHub Actions CI pipeline | debugging | 93.0
Add i18n with locale routing to Next.js app | full-stack | 82.2
Build multi-tool LLM agent runtime | backend | 67.3
Fix 12 WCAG accessibility violations in checkout form | frontend | 84.7
Port Python CLI to Rust | multi-language | 62.2
Convert React app to PWA with offline support | frontend | 81.3
Implement multi-tenant row-level security in Postgres | backend | 83.2
Find and patch all OWASP Top 10 vulnerabilities | debugging | 80.5
Find and fix 4 hidden backdoors in Flask app | debugging | 89.3
Fix auth bypass vulnerability | debugging | 92.2
Fix data integrity bugs in denormalized e-commerce schema | debugging | 88.4
Write complex SQL report with window functions | backend | 87.5
Fix hallucination and context window bugs in RAG agent | backend | 81.8
Fix memory leak in event handler | debugging | 52.2
Debug and fix 6 broken database triggers and constraints | debugging | 86.8
Build 3D browser game with physics and multiplayer sync | frontend | 85.0
Optimize bloated React bundle under 500KB | frontend | 71.4
Implement transformer inference engine with KV cache | from-scratch | 80.5
Zero-downtime schema migration | full-stack | 81.7
Build interactive data visualization dashboard | frontend | 79.1
Add retry logic and dead letter queue to Python task queue | backend | 82.8
Build terminal UI dashboard | from-scratch | 77.3
Refactor monolithic handler to CQRS | refactoring | 82.5
Implement Stripe webhook handler | backend | 87.5
Add Google OAuth2 login to Express app | full-stack | 80.8
Fix React hydration mismatch | frontend | 84.4
Fix Node.js stream backpressure causing OOM on large files | backend | 91.7
Build REST API from scratch | from-scratch | 87.3
Harden insecure Docker setup with 12 vulnerabilities | code-review | 93.5
Add virtual scrolling to table rendering 5000 rows | frontend | 82.4
Fix broken responsive layout | frontend | 77.7
Debug race condition in worker pool | debugging | 84.8
Add rate limiting middleware | backend | 86.8
Add cursor-based pagination to REST API | backend | 80.9