Cursor Composer 2.5: In-House Coding Model at a Fraction of Frontier Price
TL;DR
Cursor shipped Composer 2.5, its in-house AI coding model, positioned as a frontier-class alternative to Claude Opus 4.7 at a meaningfully lower per-token price.
- May 19, 2026: Composer 2.5 release date; successor to Composer 2 (March 19, 2026)
- $0.50 / $2.50: price per million input/output tokens, Composer 2.5 standard tier (Cursor reported)
- 79.8% / 80.5%: SWE-Bench Multilingual, Composer 2.5 vs Claude Opus 4.7 (Cursor reported)
- 85%: share of compute Cursor says it spent on its own post-training pipeline vs the base model
Cursor released Composer 2.5 on May 19, 2026 — an update to its in-house coding model, positioned as a frontier-class option for sustained multi-file work at meaningfully lower token cost than Claude Opus 4.7 or GPT-5.5. The release is a continuation of Cursor's strategy, started with Composer 1 in late 2025, of training its own coding-specific models rather than only renting frontier models from Anthropic or OpenAI.
What changed. Composer 2.5 is built on the same open-source Kimi K2.5 base checkpoint that Cursor used for Composer 2 (released March 19, 2026), but Cursor says roughly 85% of its compute budget for this update went into its own post-training and reinforcement learning pipeline. Cursor reports two pricing tiers, a standard variant and a faster interactive variant. The standard tier costs $0.50 per million input tokens and $2.50 per million output tokens. The faster tier, used by default for interactive sessions inside the Cursor editor, costs $3 per million input tokens and $15 per million output tokens, six times the standard rate. Both tiers are priced below Claude Opus 4.7's published rates.
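To make the tier gap concrete, here is a minimal sketch of the per-session arithmetic. The per-million rates are the ones Cursor reports above. The token counts (2 million input, 400,000 output) and the names `Rate` and `session_cost` are assumptions chosen for illustration, not figures or APIs from Cursor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Price in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


# Rates as reported by Cursor for Composer 2.5 (see the paragraph above).
COMPOSER_STANDARD = Rate(input_per_million=0.50, output_per_million=2.50)
COMPOSER_FAST = Rate(input_per_million=3.00, output_per_million=15.00)


def session_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one session at the given rate."""
    total = (input_tokens * rate.input_per_million
             + output_tokens * rate.output_per_million)
    return total / 1_000_000


# Hypothetical session size, assumed purely for illustration.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 400_000

standard_cost = session_cost(COMPOSER_STANDARD, INPUT_TOKENS, OUTPUT_TOKENS)
fast_cost = session_cost(COMPOSER_FAST, INPUT_TOKENS, OUTPUT_TOKENS)

print(f"standard tier: ${standard_cost:.2f}")  # standard tier: $2.00
print(f"fast tier:     ${fast_cost:.2f}")      # fast tier:     $12.00
```

To compare against another model, construct a `Rate` from that provider's current published prices and pass the same token counts.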
The benchmark claims. Cursor's published numbers put Composer 2.5 at roughly 79.8% on SWE-Bench Multilingual, within striking distance of Claude Opus 4.7's reported 80.5%. On CursorBench v3.1, the company's own internal benchmark, Composer 2.5 scores 63.2% at default effort, edging out both Claude Opus 4.7 and GPT-5.5. Both sets of figures are reported by Cursor itself. Independent third-party SWE-Bench comparisons against the latest Claude and GPT models on the same prompts are not yet public. Treat the headline numbers as Cursor's positioning until external benchmarks land.
Why this matters for ToolNav's audience. For solo builders and indie hackers who use Cursor for AI-assisted coding, the practical impact is cost. Sustained multi-file work — refactors, agentic background tasks, long-running Composer sessions — burns tokens, and the choice of underlying model is a meaningful chunk of the monthly bill. If Composer 2.5's reported benchmarks hold up on your specific codebase, switching the default model from Claude Opus to Cursor's own model could lower per-task cost by an order of magnitude without losing capability. The honest test is to run your actual work — a real refactor, a real feature build — on both and compare output quality, not just benchmark scores. See our Cursor vs Claude Code comparison for where the GUI-first and terminal-native workflows differ, and Cursor vs GitHub Copilot for the budget-first developer choice. For a broader category map, our Best AI Coding Tools roundup tracks how Cursor stacks up against the rest of the field.
The competitive pattern. Composer 2.5 is part of a broader shift: tools that used to be thin wrappers on frontier model APIs are increasingly training their own models tuned to their specific workflows. Cursor's bet is that a coding-specific model on its own training pipeline can match general-purpose frontier models on the work that actually pays Cursor's bills — long-horizon code edits, multi-file refactors, terminal use, and agentic tasks. If that bet pays off, the cost-per-build math gets noticeably better for Cursor users versus tools still routing every call to Claude or GPT.
What to do this week. If you're already on Cursor Pro, switch the default model to Composer 2.5 for one real task: a refactor, a feature build, or a multi-file edit. Compare output quality against your previous default (Claude or GPT) on the same prompt. The cost saving is real only if the quality holds. If you're evaluating coding tools for the first time, Find My Tool can narrow the field based on your stack and budget. The AI Tool Pricing Database tracks current rates across the major coding-tool providers if you want a side-by-side cost view.
Why It Matters
Cursor is moving from frontier-model renter to frontier-model builder. Composer 2.5 is the second iteration of its in-house coding model, and the pricing gap versus Claude Opus 4.7 and GPT-5.5 is wide enough to materially change cost-per-build for anyone running sustained agentic work inside Cursor. The benchmark claims need independent verification, but the strategic signal is clear: the coding-tool layer is starting to vertically integrate. That has knock-on effects for every other tool that still routes every prompt to a third-party frontier API.
Who's Affected
- Cursor Pro and Ultra subscribers. Composer 2.5 is available now as the default option in the model picker. Your existing subscription quotas apply differently depending on whether you stay on Composer 2.5 or switch to a frontier model.
- Indie builders running long agentic sessions. Background Agents, multi-file refactors, and other workflows that consume large token counts are exactly where Composer 2.5's price advantage compounds. If you've been hitting cost ceilings on Claude Opus inside Cursor, this is the lever to pull first.
- Teams evaluating coding tool budgets. If your team uses Cursor Business or Ultra, the model choice is now a real budget decision. Run the same benchmark task across Composer 2.5, Claude Opus, and GPT-5.5 on your actual codebase before locking in a default for the team.
- Anthropic and OpenAI. Both are watching closely. If Cursor's positioning holds up under independent benchmarking, expect API pricing pressure on Claude Opus and GPT-5.5 within the coding-tool channel specifically.
What To Do Now
1. Treat the benchmarks as Cursor's claims. SWE-Bench and CursorBench numbers are Cursor's framing of its own model. Run the comparison on your real work before treating the headline figures as facts.
2. The cost math is the real story. Even if Composer 2.5 is slightly weaker than Claude Opus on the hardest tasks, the price difference is large enough that the total productive output per dollar may favour it. Measure on dollars-per-finished-feature, not on benchmark percentages alone.
3. Don't conflate model with tool. Cursor is the IDE; Composer 2.5 is one model inside it. You can still pick Claude Opus or GPT-5.5 inside Cursor when the task warrants it. The decision is per-task, not all-or-nothing.
4. Watch the vertical-integration pattern. Cursor is the most public example, but other coding tools and AI builders will likely follow. Plan for a market where the underlying model is increasingly bundled into the tool rather than a separable choice.
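One way to put the dollars-per-finished-feature idea into practice is to log each task you run and divide total spend by the number of tasks that produced shippable output. The sketch below is one possible definition, not a standard metric. The `TaskRun` record, the model labels, and every cost figure are placeholders assumed for illustration; they are not measurements of any real model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TaskRun:
    model: str        # label for the model used (placeholder names below)
    cost_usd: float   # what the run cost, taken from your own billing data
    finished: bool    # True if the output was good enough to ship


def dollars_per_finished_feature(runs: List[TaskRun], model: str) -> Optional[float]:
    """Total spend on a model divided by its number of finished tasks.

    Failed runs still count toward spend, so a cheap model that often
    fails can end up costing more per finished feature.
    Returns None if the model finished nothing.
    """
    model_runs = [r for r in runs if r.model == model]
    finished = sum(1 for r in model_runs if r.finished)
    if finished == 0:
        return None
    return sum(r.cost_usd for r in model_runs) / finished


# Placeholder log, assumed for illustration only.
runs = [
    TaskRun("model-a", 2.0, True),
    TaskRun("model-a", 2.0, True),
    TaskRun("model-a", 2.0, False),  # a failed attempt still costs money
    TaskRun("model-b", 8.0, True),
    TaskRun("model-b", 8.0, True),
]

for name in ("model-a", "model-b"):
    print(name, dollars_per_finished_feature(runs, name))
# model-a 3.0
# model-b 8.0
```

Counting failed runs in the numerator is a deliberate choice here: it lets a lower per-token price be weighed against a lower success rate in a single number.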