“I’m renegotiating a 3-year service contract for data center Disaster Recovery (DR) services. My current Service Provider offers the ability to exit the agreement if they fail to perform. What other SLAs should be in place?”
ANSWER: On the surface, this seems like the ultimate supplier promise: “If we don’t deliver, you can leave – no questions asked.” That’s perfect for major issues, but outsourcing relationships are more typically plagued by chronic small problems than catastrophic delivery failures. You will need some tactical tools in addition to the nuclear option of terminating the agreement.
Defined Service Level Agreements (SLAs) should encourage favorable behavior and align work priorities to business needs. SLAs should be meaningful, measurable, and enforceable. In your example, even if the Service Provider’s SLAs are measuring the right metrics (RTO and RPO are among typical DR SLAs), there is no appropriate management mechanism to improve the environment. Your only redress is termination, which can be difficult and expensive: a cure that’s worse than the disease, and the service provider knows it. Transitions are also fraught with risk: imagine migrating to a new environment from a cranky service provider, especially if you decide to use a competitor.
Service providers have largely accepted defined SLAs, and in some instances even become vocal advocates. SLAs help their delivery teams manage customer expectations and prioritize resources. In some emerging fields like Business Process Outsourcing, providers occasionally use defined SLAs as a competitive differentiator to unseat incumbent suppliers, actively challenging prospective clients to measure and monitor service delivery.
While there are many standard candidates for SLA metrics, like uptime, availability, and response time, it’s important to correlate them to meaningful business-related metrics. Leading indicators are also better than lagging indicators. For example, a traditional DR SLA would grant a financial credit if, pursuant to a declared disaster, the supplier fails to restore the client environment within an agreed-upon period. That is a lagging indicator. Leading indicators would be SLAs for tests, DR plan updates, training, and so on, all of which are critical to a successful restoration during an actual disaster. Wherever possible, find a predictor of success.
SLA remedies don’t have to be financial. If your supplier fails to restore your environment during a DR test, a re-test makes more sense than a bucket of cash. Or negotiate the right to exit the agreement if the supplier repeatedly fails the DR tests.
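As a rough illustration of how such a non-financial remedy clause might be evaluated, here is a minimal Python sketch. Everything in it is hypothetical: the RTO and RPO targets, the rule that two consecutive failed tests trigger the exit right, and the names `DrTestResult`, `meets_targets`, and `remedy` are assumptions made for the example, not terms from any actual contract.

```python
from dataclasses import dataclass


@dataclass
class DrTestResult:
    restore_hours: float       # time taken to restore the environment in the test
    data_loss_minutes: float   # window of data lost in the test


# Hypothetical contract targets (assumptions for illustration only).
RTO_HOURS = 24.0                      # Recovery Time Objective
RPO_MINUTES = 60.0                    # Recovery Point Objective
EXIT_AFTER_CONSECUTIVE_FAILURES = 2   # assumed meaning of "repeatedly fails"


def meets_targets(result: DrTestResult) -> bool:
    """A test passes only if both the RTO and RPO targets are met."""
    return (result.restore_hours <= RTO_HOURS
            and result.data_loss_minutes <= RPO_MINUTES)


def remedy(history: list[DrTestResult]) -> str:
    """Return the remedy owed after the latest test (history is oldest first)."""
    consecutive_failures = 0
    for result in reversed(history):
        if meets_targets(result):
            break
        consecutive_failures += 1

    if consecutive_failures == 0:
        return "none"
    if consecutive_failures >= EXIT_AFTER_CONSECUTIVE_FAILURES:
        return "exit right"
    return "re-test"
```

With these assumed thresholds, a single failed test yields a re-test, while a second failure in a row gives the client the right to exit. A real agreement would define its own targets and its own meaning of "repeatedly."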
SLAs are an important monitor of your outsourcing relationship. They enable you to manage key service delivery elements because everyone has a vested interest in success. They can be difficult to negotiate, but they are an extremely effective tool when applied creatively.