By the ToolNav Team · 6 min read

Moonshot Releases Kimi K2.7-Code — Open Weights at $0.95/$4 per MTok

TL;DR

Moonshot AI released Kimi K2.7-Code (model ID kimi-k2.7-code) on June 12, 2026, publishing full weights to Hugging Face under a Modified MIT license. It is a 1-trillion-parameter Mixture-of-Experts model (32B active) tuned for long-horizon agentic coding, priced at just $0.95 / $4.00 per million input/output tokens — a fraction of frontier-tier rates. The catch for operators: every benchmark published so far is Moonshot's own — there are no independent third-party numbers yet. See the best AI coding tools for how it stacks up.

  • $0.95 / $4.00: per million input/output tokens; model ID kimi-k2.7-code; under a fifth of Opus 4.8 rates
  • 1T params: Mixture-of-Experts with ~32B active across 384 experts; 256K-token context window
  • Modified MIT: full weights on Hugging Face; self-hosting and air-gapped deployment are possible
  • Vendor benchmarks only: +21.8% on Kimi Code Bench v2 over K2.6, but no independent third-party numbers yet

Moonshot AI released Kimi K2.7-Code on June 12, 2026, announcing it via kimi.com/code and publishing the full model weights to Hugging Face the same day. The model ID is kimi-k2.7-code. It is a coding-first release: a 1-trillion-parameter Mixture-of-Experts model with roughly 32B active parameters across 384 experts, a 256K-token context window, and a Modified MIT license that allows self-hosting. In a market where the frontier-tier coding models are closed and priced at $5–$10 per million input tokens, an open-weight 1T model at under a dollar per million input tokens is a genuinely different cost structure.

The pricing is the headline. Moonshot lists API access at $0.95 per million input tokens and $4.00 per million output tokens. For comparison, that is under a fifth of Claude Opus 4.8's $5/$25 (about 19% on input and 16% on output) and under a tenth of the now-suspended Fable 5's $10/$50. For a current cross-model price comparison, see the AI Tool Pricing Database. The model pairs with Kimi Code, Moonshot's terminal-first coding agent, with membership plans listed from $19 per month. For high-volume agentic workloads, where output token counts dominate the bill, the gap is large enough to change which workloads are economically viable to automate.
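
To make the comparison concrete, here is a minimal sketch that computes the price ratios from the list prices quoted above. The dictionary layout and the function name are our own illustration, not anything Moonshot publishes, and list prices can change.

```python
# List prices in USD per million tokens (input, output), as quoted in this article.
PRICES = {
    "kimi-k2.7-code": (0.95, 4.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Fable 5": (10.00, 50.00),
}


def price_ratio(model: str, baseline: str) -> tuple[float, float]:
    """Return (input_ratio, output_ratio) of the model's price to the baseline's."""
    m_in, m_out = PRICES[model]
    b_in, b_out = PRICES[baseline]
    return m_in / b_in, m_out / b_out


if __name__ == "__main__":
    for baseline in ("Claude Opus 4.8", "Fable 5"):
        r_in, r_out = price_ratio("kimi-k2.7-code", baseline)
        print(f"vs {baseline}: input {r_in:.1%}, output {r_out:.1%}")
```

The output-side ratio is the one to watch for agentic workloads, since that is where most of the spend tends to land.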

Open weights change the deployment math. Because the full weights ship on Hugging Face under a Modified MIT license, K2.7-Code is not only an API — it is a model teams can run themselves. That opens self-hosting and air-gapped deployment for organizations that cannot send code to a third-party API at all: regulated industries, defense-adjacent work, or any team with a hard data-residency requirement. The closed frontier models cannot serve that segment regardless of price. See our roundup of the best AI coding tools for where self-hostable options fit.
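
Self-hosting a model this size is a hardware question before it is a licensing one. The rough sketch below estimates storage for the weights alone at a few common precisions. The precisions are illustrative assumptions (the article does not say which format the released weights use), and the estimate ignores KV cache, activations, and runtime overhead. Although only about 32B parameters are active per token, a Mixture-of-Experts model generally still needs all expert weights loaded or quickly reachable, so the full 1T count is the relevant number for memory.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1T total parameters, per the article

# Bytes per parameter for common storage precisions.
# ASSUMPTION: illustrative only; the release precision is not stated in the article.
BYTES_PER_PARAM = {"bf16": 2.0, "int8/fp8": 1.0, "4-bit": 0.5}


def weight_footprint_gb(total_params: int, bytes_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


for name, bpp in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_footprint_gb(TOTAL_PARAMS, bpp):,.0f} GB for weights alone")
```

Even at 4-bit precision the weights alone come to roughly half a terabyte, which is why the hosted API remains the simpler route for teams without a hard residency constraint.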

Moonshot's efficiency claim. The company reports that K2.7-Code uses roughly 30% fewer "thinking" tokens than its predecessor K2.6 while scoring higher on Moonshot's coding benchmarks — a meaningful claim, because reasoning-token usage is a direct driver of both latency and cost on agentic tasks. If it holds up, the effective cost advantage is larger than the headline per-token price suggests.
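
A quick way to see how much that claim could matter is to model a single task. The sketch below assumes thinking tokens are billed at the output rate, which the article does not confirm, and uses a hypothetical task shape. It compares cost before and after a 30% cut in thinking tokens at the same per-token price, so it isolates the token-count effect only.

```python
INPUT_PRICE = 0.95         # USD per million input tokens, from the article
OUTPUT_PRICE = 4.00        # USD per million output tokens, from the article
THINKING_REDUCTION = 0.30  # Moonshot's reported reduction vs K2.6 (vendor claim)


def task_cost(input_tokens: int, answer_tokens: int, thinking_tokens: int) -> float:
    """USD cost of one task. ASSUMPTION: thinking tokens bill at the output rate."""
    billed_output = answer_tokens + thinking_tokens
    return (input_tokens * INPUT_PRICE + billed_output * OUTPUT_PRICE) / 1_000_000


def saving_from_fewer_thinking_tokens(
    input_tokens: int, answer_tokens: int, thinking_tokens: int
) -> float:
    """Fractional cost saving when thinking tokens drop by THINKING_REDUCTION."""
    before = task_cost(input_tokens, answer_tokens, thinking_tokens)
    reduced_thinking = round(thinking_tokens * (1 - THINKING_REDUCTION))
    after = task_cost(input_tokens, answer_tokens, reduced_thinking)
    return 1 - after / before


if __name__ == "__main__":
    # Hypothetical task shape, chosen only for illustration.
    saving = saving_from_fewer_thinking_tokens(20_000, 2_000, 10_000)
    print(f"Per-task saving at equal prices: {saving:.1%}")
```

The saving grows with the share of the bill that comes from thinking tokens, so reasoning-heavy tasks would benefit most if the vendor figure holds up.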

The caveat operators cannot skip. Every benchmark Moonshot has published for K2.7-Code is self-run and self-reported — it reports +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite, all relative to K2.6, with the company running the evaluations itself. As of release, there are no independent third-party numbers for K2.7-Code on the standard public leaderboards — a gap reporters flagged at launch. Vendor-reported gains on a vendor's own benchmark are a starting point, not a verdict. The improvement-over-our-last-model framing also says nothing about how it compares to the closed frontier models most teams actually weigh it against.

What it means for builders. Kimi K2.7-Code is best read as a cost-and-control play rather than a confirmed capability leap. If you run high-volume agentic coding and your bill is dominated by output tokens, the price difference is worth a real evaluation. If you have data-residency constraints that rule out closed APIs, the open weights are the actual unlock. But until independent benchmarks land, treat the performance claims as unverified and run the model against your own repository and your own task suite before trusting it on anything that matters.
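
Running your own evaluation does not need heavy tooling. Below is a minimal, model-agnostic sketch of a side-by-side pass-rate check. Everything in it is hypothetical: `model_a` and `model_b` are canned stand-ins that you would replace with real client calls, and the single task is a toy. Executing generated code is risky, so a real harness should run checks in a sandbox.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the model output is acceptable


def pass_rate(model: Callable[[str], str], tasks: list[Task]) -> float:
    """Fraction of tasks whose model output passes its check."""
    if not tasks:
        raise ValueError("task suite is empty")
    passed = sum(1 for t in tasks if t.check(model(t.prompt)))
    return passed / len(tasks)


# Hypothetical stand-ins returning canned strings; swap in real API or local calls.
def model_a(prompt: str) -> str:
    return "def add(a, b): return a + b"


def model_b(prompt: str) -> str:
    return "def add(a, b): return a - b"


def add_works(source: str) -> bool:
    """Check generated code by executing it. Only do this inside a sandbox."""
    scope: dict = {}
    try:
        exec(source, scope)
        return scope["add"](2, 3) == 5
    except Exception:
        return False


TASKS = [Task("add", "Write a Python function add(a, b).", add_works)]

if __name__ == "__main__":
    for name, model in (("model_a", model_a), ("model_b", model_b)):
        print(f"{name}: pass rate {pass_rate(model, TASKS):.0%}")
```

The useful part is the shape: the same task suite and the same checks applied to every model, with tasks drawn from your own repository rather than a public benchmark.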

Why It Matters

An open-weight 1-trillion-parameter coding model at under a dollar per million input tokens is a different cost structure than the closed frontier, and for output-token-heavy agentic workloads the gap is large enough to change what is economical to automate. The Modified MIT license unlocks self-hosting and air-gapped deployment — serving regulated and data-residency-bound teams that closed APIs cannot reach at any price. But the performance story is entirely vendor-reported: every published benchmark is Moonshot's own, with no independent third-party validation yet, and the improvement is measured against Moonshot's previous model rather than the closed leaders teams actually compare against. The right posture is evaluation, not adoption-on-faith.

Who's Affected

  • Teams running high-volume agentic coding — if your bill is dominated by output tokens, K2.7-Code's $4/MTok output rate versus $25 on Opus 4.8 is worth a real cost evaluation.
  • Organizations with data-residency or air-gap requirements — the open weights under a Modified MIT license make self-hosting possible, which closed frontier APIs cannot offer regardless of price.
  • Operators choosing between coding models on benchmark claims — be cautious: all of K2.7-Code's numbers are vendor-reported on proprietary suites, with no independent validation at release.
  • Cost-sensitive solo builders and small teams — Kimi Code membership starts at $19/month (the 'Moderato' tier), a low entry point for agentic coding compared to frontier-tier tooling.

What To Do Now

  1. Benchmark it on your own repository before trusting any claim. The published numbers are Moonshot's own and untested by third parties, so run K2.7-Code against your real task suite and compare output quality side by side with your current model.
  2. Model the cost on output tokens, not input. The savings are largest where output dominates. Pull your current token mix and project the bill at $0.95/$4.00 to see whether the difference is material for your workload.
  3. Evaluate self-hosting only if you actually need it. The open weights are the real unlock for data-residency or air-gapped use. If you have no such constraint, the hosted API is simpler, so do not take on inference ops for a price you can get via API.
  4. Wait for independent leaderboard results before betting a critical workflow on it. Treat the vendor benchmarks as a reason to test, not a reason to migrate. Keep a proven model as your default until third-party numbers confirm the gains.
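
For the cost-modeling step, a projection like the following is enough to get started. It is a sketch under stated assumptions: the monthly token mix is hypothetical and should be replaced with figures from your own usage export, and the prices are the list rates quoted in this article.

```python
# USD per million tokens (input, output), as quoted in the article.
K27_CODE = (0.95, 4.00)
OPUS_48 = (5.00, 25.00)


def monthly_bill(input_mtok: float, output_mtok: float, prices: tuple[float, float]) -> float:
    """Projected monthly bill in USD for a token mix given in millions of tokens."""
    in_price, out_price = prices
    return input_mtok * in_price + output_mtok * out_price


def output_share(input_mtok: float, output_mtok: float, prices: tuple[float, float]) -> float:
    """Fraction of the projected bill attributable to output tokens."""
    _, out_price = prices
    return output_mtok * out_price / monthly_bill(input_mtok, output_mtok, prices)


if __name__ == "__main__":
    # Hypothetical mix: replace with numbers from your own usage export.
    input_mtok, output_mtok = 2_000.0, 500.0
    for name, prices in (("kimi-k2.7-code", K27_CODE), ("Claude Opus 4.8", OPUS_48)):
        bill = monthly_bill(input_mtok, output_mtok, prices)
        share = output_share(input_mtok, output_mtok, prices)
        print(f"{name}: ${bill:,.2f}/month, output share {share:.0%}")
```

If the output share of your current bill is small, the headline gap will matter less for your workload than the per-token prices suggest.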
