The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent Google whitepaper emphasizes that in AI-assisted software development, the model itself is only a small part of system performance. The key lies in harness design and context engineering, shifting focus from model size to configuration and verification.

A new Google whitepaper asserts that the model in AI coding agents accounts for only about 10% of the system’s behavior. This challenges the common focus on model size and suggests that harness design and context engineering are more critical for effective AI development. The paper emphasizes that the shift in software engineering is toward expressing intent and trusting machines to interpret that intent, rather than relying solely on larger models.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that 85% of professional developers use AI coding agents regularly, with 51% using them daily. It notes that roughly 41% of all new code is generated by AI. The core insight is that model size matters less than the harness: the prompts, rules, tools, and observability layers surrounding the model. Experiments cited in the paper show that changing only the harness can significantly improve performance with the same model; in one cited result, an agent moved from outside the Top 30 into the Top 5 on Terminal Bench 2.0. The authors argue that failures are often due to configuration errors (a missing tool, a vague rule, a noisy context) rather than model limitations, which makes harness design the key to reliable AI systems.
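To make the harness idea concrete, here is a minimal sketch, not the paper's implementation. The `Harness` class, the fake `toy_model`, and the "CALL <tool> <arg>" reply convention are all hypothetical names invented for illustration. The same fixed model runs under two harnesses, and only the one with the tool configured succeeds.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: rules, tools, and an observability log."""
    system_rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def build_prompt(self, task: str) -> str:
        rules = "\n".join(f"RULE: {r}" for r in self.system_rules)
        tool_names = ", ".join(sorted(self.tools)) or "none"
        return f"{rules}\nTOOLS: {tool_names}\nTASK: {task}"

    def run(self, model: Model, task: str) -> str:
        prompt = self.build_prompt(task)
        self.log.append(f"prompt_chars={len(prompt)}")
        reply = model(prompt)
        # Assumed convention for this sketch: "CALL <tool> <arg>" requests a tool.
        if reply.startswith("CALL "):
            parts = reply.split(" ", 2)
            name = parts[1] if len(parts) > 1 else ""
            arg = parts[2] if len(parts) > 2 else ""
            if name not in self.tools:
                # A configuration failure, not a model failure.
                self.log.append(f"missing_tool={name}")
                return f"error: tool '{name}' is not configured"
            self.log.append(f"tool={name}")
            return self.tools[name](arg)
        return reply


def toy_model(prompt: str) -> str:
    """A fixed fake model: it always asks to run the tests."""
    return "CALL run_tests tests/"


bare = Harness()
equipped = Harness(
    system_rules=["Run the tests before answering."],
    tools={"run_tests": lambda path: f"ran tests in {path}: ok"},
)

print(bare.run(toy_model, "fix the bug"))      # error: tool 'run_tests' is not configured
print(equipped.run(toy_model, "fix the bug"))  # ran tests in tests/: ok
```

The model is identical in both runs; the difference in outcome comes entirely from what the harness provides, and the log records the missing tool so the failure is visible as a configuration problem.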

At a glance
Report
When: published March 2026
The development: Google’s new whitepaper on SDLC highlights that the model used in AI coding agents accounts for only 10% of system behavior, with the majority driven by harness and context engineering.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts; “does it seem to work?”; disposable code; high risk.
Structured AI-Assisted: detailed prompts plus constraints; manual testing; features in real codebases.
Agentic Engineering: formal specs; automated tests, evals, and CI gates; production scale; low risk.

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
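The split between tests and evals can be sketched as a single gate. This is an illustrative assumption, not the paper's tooling: `slugify`, `keyword_score`, `run_evals`, `ci_gate`, and the 0.8 threshold are hypothetical, and the keyword scorer is a deliberately crude stand-in for a real eval metric.

```python
from typing import Callable, List, Tuple


def slugify(title: str) -> str:
    """Deterministic code under test: lowercase, hyphen-separated words."""
    return "-".join(title.lower().split())


def run_tests() -> bool:
    """Tests: exact assertions on deterministic behavior."""
    return slugify("The New SDLC") == "the-new-sdlc"


def keyword_score(output: str, required: List[str]) -> float:
    """Hypothetical eval scorer: fraction of required terms present."""
    text = output.lower()
    return sum(term in text for term in required) / len(required)


def run_evals(
    generate: Callable[[str], str],
    cases: List[Tuple[str, List[str]]],
    threshold: float = 0.8,
) -> bool:
    """Evals: score non-deterministic output and compare to a threshold."""
    scores = [keyword_score(generate(prompt), required) for prompt, required in cases]
    return sum(scores) / len(scores) >= threshold


def ci_gate(generate: Callable[[str], str], cases: List[Tuple[str, List[str]]]) -> bool:
    """Pass only when both the tests and the evals pass."""
    return run_tests() and run_evals(generate, cases)


# Fake generators standing in for a model call.
def grounded(prompt: str) -> str:
    return "The harness matters more than the model."


def vague(prompt: str) -> str:
    return "It depends."


cases = [("Summarize the harness idea", ["harness", "model"])]
print(ci_gate(grounded, cases))  # True
print(ci_gate(vague, cases))     # False
```

The tests assert an exact result; the evals accept any output that scores above the threshold. A gate that requires both is what separates the right-hand end of the spectrum from the left.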
The idea worth building your strategy around

Agent = Model + Harness

Model: ~10%.
Harness (prompts, tools, context, hooks, sandboxes, observability): ~90%. This is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding: low CapEx, high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx, low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing (cheap models for the easy work).
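The routing lever and the cost of fix-it loops can be sketched as follows. Everything here is an assumption made for illustration: the tier names, the prices, the keyword list, and the file-count cutoff are invented and do not come from the paper or any provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # made-up price, for illustration only


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 2.00)

# Hypothetical signals that a task needs the stronger model.
HARD_MARKERS = ("refactor", "migrate", "security", "architecture")


def route(task: str, touches_files: int) -> ModelTier:
    """Send easy work to the cheap tier and hard work to the strong tier."""
    is_hard = touches_files > 3 or any(m in task.lower() for m in HARD_MARKERS)
    return STRONG if is_hard else CHEAP


def estimated_cost(tier: ModelTier, tokens: int, attempts: int = 1) -> float:
    """Token cost grows linearly with the number of fix-it attempts."""
    return tier.cost_per_1k_tokens * tokens / 1000 * attempts


easy = route("Rename a variable", touches_files=1)
hard = route("Refactor the auth module", touches_files=6)
print(easy.name, hard.name)  # small-model large-model

# First-pass success versus a three-attempt fix-it loop on the same tier.
first_pass = estimated_cost(STRONG, tokens=4000, attempts=1)
fix_it_loop = estimated_cost(STRONG, tokens=4000, attempts=3)
print(fix_it_loop / first_pass)
```

Under these invented numbers, two things drive OpEx: which tier handles a task, and how many attempts it takes. That is why the text pairs routing with context engineering for first-pass success.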
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

Why Harness and Context Engineering Trump Model Size

This shift in focus from model size to harness and context engineering has major implications for AI development strategies. It suggests that organizations can achieve better results by investing in system configuration, verification, and structured context rather than constantly chasing larger models. This approach can reduce costs, improve reliability, and enhance security, making AI deployment more sustainable and controllable.

Evolution of AI Development and the SDLC Shift

The whitepaper builds on the evolving understanding of AI’s role in software engineering. Historically, larger models promised better performance, but recent experiments and industry practices indicate that system design, verification, and context management are more influential. The paper references recent trends where AI is integrated into development workflows, with a growing emphasis on configuration and scaffolding to control AI behavior. This aligns with broader industry observations that effective AI systems depend heavily on how they are built and managed, not just on the models themselves.

“The model is only 10% of what determines behavior; the harness is 90%. Focus on configuration and context engineering.”

— Addy Osmani

What Aspects of Harness and Context Are Still Unclear

It is not yet clear how organizations will scale these insights across diverse AI applications or how quickly industry practices will shift towards harness-centric development. The precise methods for optimal context engineering and their long-term effectiveness remain under study.

Next Steps for Implementing Harness-Centric AI Development

Organizations are likely to focus on building and refining harness components, including tools, prompts, and verification frameworks. Future research and industry practice will explore standardized approaches to context management and cost-effective scaling. Monitoring and evaluating the impact of these strategies on reliability and security will be critical as AI integration deepens.

Key Questions

Why is model size less important than the harness?

The whitepaper shows that system behavior is driven mostly by how the AI is configured and managed, not just the size of the underlying model. The harness—prompts, tools, and rules—shapes the output significantly.

How can organizations improve their AI systems based on this insight?

Focusing on system configuration, context engineering, and verification can lead to more reliable and cost-effective AI systems. Investing in building robust harnesses is now more critical than chasing larger models.

Does this mean larger models are obsolete?

Not necessarily. Larger models still provide capabilities, but their value is amplified when paired with well-designed harnesses and context management. The emphasis shifts from size to system design.

What are the main challenges in adopting harness-centric development?

Developing standardized, scalable methods for context engineering and verification can be complex. Organizations will need to invest in tools and expertise to manage these systems effectively.

Source: ThorstenMeyerAI.com
