Most teams either never discuss vendor lock-in or tackle it with the “everything must be open source” reflex. In the field, neither produces good results.
The truth is more balanced: some vendors give you speed, others give you security or compliance. The problem isn’t using a vendor; it’s not knowing the cost of leaving and failing to put operational control points in place.
What does an “exit plan” mean? (my definition)
An exit plan is the joint answer to:
- If I had to migrate this system within 6 months, what’s the biggest technical blocker?
- On the operations side, which runbooks collapse if the vendor goes away?
- On the data and identity side, where does it become a one-way door?
In short, the question is not “could we leave one day?” but “if we had to leave, in how many weeks and at what risk?”
Five classic mistakes that deepen lock-in
- Storing data in the vendor’s format (export never tested)
- Tying observability to the vendor (logs/metrics aren’t portable)
- Burying identity and authorization into a single platform (no IAM portability)
- Making operations dependent on the vendor’s UI (no CLI/API, no IaC)
- Leaving the exit clauses weak in the contract (egress, SLA, support)
Technical exit strategy: “portable layers”
In practice I recommend 4 layers:
1) Identity boundary
- The IdP (SSO) should be under the organization’s control where possible
- Standardize the authorization model on a “role” basis
- Keep audit logs in-house
2) Data boundary
- Export format: open and automated (parquet/csv/json or similar)
- Derive a “migration window” from your RPO/RTO targets
- Regular export tests: not “I got the file” but “I restored it”
3) Operations boundary
- IaC for installation/configuration (terraform/opentofu, etc.)
- Runbooks should rely on API/CLI, not the vendor UI
- “Break-glass” and incident flow should be owned by the organization
4) Observability boundary
- The telemetry pipeline should not be embedded in the vendor; route it through a layer you control, such as an OTel Collector
- Define alarms and dashboards in a portable model
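To make “portable model” concrete, here is a minimal sketch, assuming alerts are kept as vendor-neutral records and rendered into each backend's format at deploy time. The `Alert` fields, the example metric name, and the second renderer's output shape are hypothetical; the first renderer produces a dictionary shaped like a Prometheus alerting rule (`alert`, `expr`, `for`, `labels`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Vendor-neutral alert definition owned by the organization (hypothetical schema)."""
    name: str
    metric: str
    op: str            # one of ">", "<", ">=", "<="
    threshold: float
    for_minutes: int
    severity: str


def to_prometheus_rule(alert: Alert) -> dict:
    """Render the alert as a dict shaped like a Prometheus alerting rule."""
    return {
        "alert": alert.name,
        "expr": f"{alert.metric} {alert.op} {alert.threshold}",
        "for": f"{alert.for_minutes}m",
        "labels": {"severity": alert.severity},
    }


def to_generic_vendor(alert: Alert) -> dict:
    """Render the same alert for an imaginary vendor API (shape is an assumption)."""
    return {
        "title": alert.name,
        "condition": {
            "metric": alert.metric,
            "comparator": alert.op,
            "value": alert.threshold,
        },
        "window_seconds": alert.for_minutes * 60,
        "priority": alert.severity,
    }


# The single source of truth; only the renderers change in a migration.
HIGH_ERROR_RATE = Alert(
    name="HighErrorRate",
    metric="http_error_ratio",
    op=">",
    threshold=0.05,
    for_minutes=10,
    severity="page",
)
```

The point of the design is that the alert catalogue lives in your repository, and a migration means writing one new renderer rather than re-creating every alarm by hand in a new UI.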
Operational contract: “technical ops clauses”
The items I like to nail down in the vendor contract (or in an addendum):
- Egress cost: pulling data out shouldn’t carry a “penalty”
- Export SLA: is the export API / workflow stable?
- Support: during an incident, which channel and what response time?
- Audit log access: who holds the logs, and for how long?
- Change notice: how are breaking changes announced?
These clauses matter as much as “technical design”; half the risk lives in the contract.
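The egress clause is easier to negotiate with a number attached. The following is a rough back-of-the-envelope sketch, not a pricing model: the function name, the per-GB price, the link speed, and the 70% utilization factor are all illustrative assumptions, and it uses decimal units (1 TB = 1000 GB).

```python
def egress_estimate(data_tb, price_per_gb, link_gbps, utilization=0.7):
    """Estimate the cost and duration of pulling a dataset out of a vendor.

    data_tb      -- dataset size in terabytes (decimal, 1 TB = 1000 GB)
    price_per_gb -- egress price per gigabyte (assumed flat, no tiers)
    link_gbps    -- nominal link speed in gigabits per second
    utilization  -- fraction of the link you expect to sustain (assumption)

    Returns a (cost, days) tuple.
    """
    if data_tb < 0 or price_per_gb < 0:
        raise ValueError("size and price must be non-negative")
    if link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link speed must be positive and utilization in (0, 1]")

    cost = data_tb * 1000 * price_per_gb
    total_bits = data_tb * 8e12                      # 1 TB = 8 * 10^12 bits
    seconds = total_bits / (link_gbps * 1e9 * utilization)
    return cost, seconds / 86400                     # 86400 seconds per day


# Hypothetical inputs: 50 TB, 0.09 per GB, 1 Gbps link at 70% utilization.
cost, days = egress_estimate(50, 0.09, 1.0)
```

Comparing the resulting duration with the migration window you derived from RPO/RTO shows quickly whether the contract's egress terms are compatible with the exit plan.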
The leadership angle: how do you keep an exit plan alive?
The operational rhythm I use:
- Quarterly “risk review”: vendor dependency map + actions
- Every 6 months, an “export/restore” mini rehearsal
- For major vendor decisions, a mandatory “exit plan check”
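The export/restore rehearsal can start very small. The sketch below assumes records are exported as JSON Lines and that a local temporary file stands in for the real restore target; the function names are hypothetical. It captures the “not I got the file, but I restored it” rule by comparing record counts and an order-independent checksum of what went out against what came back.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fingerprint(records):
    """Order-independent SHA-256 digest of a list of JSON-serializable dicts."""
    lines = sorted(json.dumps(r, sort_keys=True) for r in records)
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def export_jsonl(records, path):
    """Write one JSON document per line (stand-in for the vendor's export)."""
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, sort_keys=True) + "\n")


def restore_jsonl(path):
    """Read the export back (stand-in for loading into the target system)."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def rehearse(records):
    """Return True only if the restored data matches the source exactly."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "export.jsonl"
        export_jsonl(records, path)
        restored = restore_jsonl(path)
    return (
        len(restored) == len(records)
        and fingerprint(restored) == fingerprint(records)
    )
```

In a real rehearsal the export would come from the vendor's API and the restore would target the system you would migrate to; the comparison step stays the same.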
Conclusion
Managed correctly, vendor lock-in stops being a “nightmare” and becomes a deliberate trade-off. The exit plan works when you’ve drawn the identity / data / operations / observability boundaries correctly and tied the contract clauses to operational reality. The most critical difference, though, is rehearsal: an exit plan that hasn’t been tested might as well not exist.