The Risk
In May 2026, three major AI providers shipped breaking changes within two weeks of each other. OpenAI deprecated an API version. Anthropic restructured pricing. Google released a new generation that obsoleted prompts written six months prior.
The teams that survived the month had one thing in common: a thin routing layer between their application code and any specific provider. The teams that didn't were on the phone with their agency arguing about who pays for the emergency upgrade.
This is why being model-agnostic by architecture is non-negotiable.
What It Means
A model-agnostic architecture has three properties:
1. Application code calls a generic AI interface, not a provider-specific SDK.
```ts
// Bad
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const response = await openai.chat.completions.create({ ... });

// Good
import { ai } from "@/lib/ai";
const response = await ai.complete({ ... });
```
2. The routing layer can swap providers per call type without changing application code.
The routing layer (LiteLLM, OpenRouter, or a custom 100-300 line wrapper) accepts a generic call and routes it to whichever provider is currently configured as best for the task. Routing rules live in configuration, not in application code.
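To make the idea concrete, here is a minimal sketch of what a custom wrapper might look like. Everything in it is an assumption for illustration: the task names, the `createAi` factory, the `ProviderAdapter` signature, and the placeholder provider and model names are hypothetical, and the stub adapters stand in for real SDK calls. It is not how LiteLLM or OpenRouter work internally.

```ts
// Hypothetical sketch of a routing layer; all names here are illustrative.
type Task = "chat" | "summarize" | "extract";

interface CompletionRequest {
  task: Task;
  prompt: string;
}

interface CompletionResult {
  provider: string;
  model: string;
  text: string;
}

// A provider adapter hides one vendor SDK or HTTP API behind a common signature.
type ProviderAdapter = (model: string, prompt: string) => Promise<string>;

interface Route {
  provider: string;
  model: string;
}

type RoutingRules = Record<Task, Route>;

function createAi(rules: RoutingRules, adapters: Record<string, ProviderAdapter>) {
  // Look up the route for a task; fail loudly if its adapter is not registered.
  function resolve(task: Task): Route {
    const route = rules[task];
    if (!Object.prototype.hasOwnProperty.call(adapters, route.provider)) {
      throw new Error(`No adapter registered for provider "${route.provider}"`);
    }
    return route;
  }

  // Application code calls this and never learns which vendor answered.
  async function complete(req: CompletionRequest): Promise<CompletionResult> {
    const route = resolve(req.task);
    const text = await adapters[route.provider](route.model, req.prompt);
    return { provider: route.provider, model: route.model, text };
  }

  return { resolve, complete };
}

// Stub adapters stand in for real SDK calls so the sketch runs offline.
const adapters: Record<string, ProviderAdapter> = {
  "provider-a": async (model, prompt) => `[a:${model}] ${prompt}`,
  "provider-b": async (model, prompt) => `[b:${model}] ${prompt}`,
};

// Placeholder routing rules: swapping a provider means editing this table only.
const rules: RoutingRules = {
  chat: { provider: "provider-a", model: "large-model" },
  summarize: { provider: "provider-b", model: "small-model" },
  extract: { provider: "provider-b", model: "small-model" },
};

const ai = createAi(rules, adapters);
```

With this shape, moving `summarize` traffic from one provider to another is a one-line change to `rules`; callers of `ai.complete({ task: "summarize", prompt })` are untouched.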
3. Provider keys live as environment variables, not in code.
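Adding a provider that the routing layer already supports takes about 30 seconds: add the env var, add a routing rule. No application code changes, and no redeploy if the routing config is read at runtime.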
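As a rough illustration of env-driven configuration, the sketch below reads provider keys and routing rules from an environment object. The `AI_ROUTES` variable, its `task=provider:model` format, the `KEY_VARS` table, and the function names are all assumptions made up for this example; only `OPENAI_API_KEY` appears earlier in this article. In a Node application you would pass `process.env` as the `env` argument.

```ts
// Hypothetical env-driven configuration; variable names and format are assumptions.
type Env = Record<string, string | undefined>;

interface Route {
  provider: string;
  model: string;
}

// Maps each provider name to the env var assumed to hold its API key.
const KEY_VARS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
};

// Providers whose key is present and non-empty in the environment.
function enabledProviders(env: Env): string[] {
  return Object.entries(KEY_VARS)
    .filter(([, keyVar]) => (env[keyVar] ?? "") !== "")
    .map(([name]) => name);
}

// Parses AI_ROUTES, e.g. "chat=openai:model-x,summarize=anthropic:model-y".
function loadRoutes(env: Env): Record<string, Route> {
  const enabled = new Set(enabledProviders(env));
  const routes: Record<string, Route> = {};
  for (const raw of (env.AI_ROUTES ?? "").split(",")) {
    const entry = raw.trim();
    if (entry === "") continue;
    const eq = entry.indexOf("=");
    const colon = entry.indexOf(":", eq + 1);
    if (eq <= 0 || colon <= eq + 1 || colon === entry.length - 1) {
      throw new Error(`Malformed route "${entry}", expected task=provider:model`);
    }
    const task = entry.slice(0, eq);
    const provider = entry.slice(eq + 1, colon);
    const model = entry.slice(colon + 1);
    // Refuse to route to a provider whose key is missing, rather than fail at call time.
    if (!enabled.has(provider)) {
      throw new Error(`Route "${task}" targets "${provider}", but its API key is not set`);
    }
    routes[task] = { provider, model };
  }
  return routes;
}
```

Validating routes against the available keys at startup means a typo in the config fails immediately instead of on the first production request.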
What This Costs You
The first time you build this, it costs maybe a day. Maybe two if the team has never done it before.
After that, it costs nothing. The library does the work.
What It Saves You
Every time a provider deprecates an API version: zero engineering effort. Swap the routing rule. Done.
Every time a provider raises prices: route the high-volume calls to a cheaper provider. Zero code change.
Every time a new model lands: test it against your evals. If it scores better at lower cost, swap. Zero code change.
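The swap rule above can be written down as a small check. This is a sketch under stated assumptions: the `EvalResult` shape, the cost unit, and the optional `minScoreGain` margin (a guard against swapping on noise) are hypothetical and not part of any particular eval framework.

```ts
// Hypothetical summary of one model's run against your eval suite.
interface EvalResult {
  model: string;
  score: number;                // mean eval score, higher is better
  costPerThousandCalls: number; // in whatever currency you are billed in
}

// Swap only if the candidate scores better AND costs less.
// minScoreGain is an assumed safety margin so tiny, noisy gains do not trigger a swap.
function shouldSwap(current: EvalResult, candidate: EvalResult, minScoreGain = 0): boolean {
  const scoresBetter = candidate.score - current.score > minScoreGain;
  const costsLess = candidate.costPerThousandCalls < current.costPerThousandCalls;
  return scoresBetter && costsLess;
}
```

A candidate that is cheaper but scores worse, or better but more expensive, returns `false` here; those cases are trade-offs that deserve a human decision rather than an automatic swap.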
Every time a customer asks "are you locked into one AI vendor?": show them the routing config.
What "Locked In" Looks Like
These are red flags in any AI agency proposal:
- "We use OpenAI exclusively because it's the best."
- "Our platform handles all the AI provider stuff for you." (Translation: their platform is the lock-in.)
- "Switching providers would require a rewrite."
- "We negotiated a special rate with [provider] so it's already the cheapest."
The first three are technical lock-in. The fourth is financial lock-in. Both are bad.
What "Free" Looks Like
These are green flags:
- "We use LiteLLM as the routing layer. Here is the config."
- "You pay model providers directly through your own accounts."
- "We test new models against your eval suite quarterly."
- "Our routing rules are in your repo. You can change them."
If your AI vendor cannot say these four things, they are selling you lock-in.
The Tooling
The libraries that handle this well in 2026:
- LiteLLM. Most mature, supports 100+ providers, drop-in OpenAI-compatible interface.
- OpenRouter. A hosted service rather than a library: one OpenAI-compatible API in front of many providers, so it can be used as the routing layer itself.
- Vercel AI SDK. Provider-agnostic and ergonomic, but optimized for streaming UX.
- In-house wrapper. For teams that want full control. Usually 200-300 lines.
We default to LiteLLM on most engagements. It is boring, stable, and exactly the right level of abstraction.
The Bigger Picture
Model-agnostic architecture is not just risk management. It is competitive advantage.
A team that can swap providers in hours can:
- Optimize cost per output continuously
- Test new model releases the day they ship
- Negotiate with providers from a position of strength
- Recover from provider outages without downtime
A team locked in cannot do any of that.
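Outage recovery is the easiest of these to sketch. The example below tries providers in priority order and falls through to the next one when a call fails. The `NamedProvider` type and `completeWithFallback` function are hypothetical names for illustration; a production version would likely also add timeouts, retries, and a distinction between retryable and non-retryable errors.

```ts
// Hypothetical provider handle: a name plus a function that performs the call.
interface NamedProvider {
  name: string;
  call: (prompt: string) => Promise<string>;
}

interface FallbackResult {
  provider: string;
  text: string;
}

// Try each provider in order; return the first success, or throw with every failure listed.
async function completeWithFallback(
  providers: NamedProvider[],
  prompt: string,
): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const text = await p.call(prompt);
      return { provider: p.name, text };
    } catch (err) {
      // Record the failure and move on to the next provider in the list.
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Because the provider list is data, the fallback order can live in the same routing config as everything else.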
This is one of the seven engineering principles every Kastling engagement enforces. We do not ship architecture our clients will regret in six months.