Watermelon SLAs

How to Recognize and Deal with Watermelon SLAs

In these difficult times caused by the COVID-19 crisis, there’s an increased focus on IT service delivery and support performance, in particular on whether IT’s efforts are focused on doing the right things, in line with business expectations. In some instances, this will mean that performance targets are changed or even dropped. But it also means that IT organizations need to be better informed about how well their performance is meeting the needs of business stakeholders, and especially about whether what’s reported as success is actually meeting those needs.

It’s, of course, important that, as IT service management (ITSM) professionals (yep, that’s me), we’re accountable for the quality of the IT services we provide to our customers. Service level agreements (SLAs) are a way of accomplishing this and, done well, they can make a big difference to service quality levels and customer happiness. And who doesn’t want happy customers?

All too often though, customers are faced with what’s called a “watermelon SLA” – one whose metric target, when performance is assessed against it, says that all is well when in reality we in IT have left a trail of unhappy customers in our wake. So, the SLA – or SLA target – is something that’s green on the outside but, once cut into, is red on the inside. Just like a watermelon!

Think about it – have you ever been in a service review meeting where the service provider was reporting everything was on target when in reality you’ve experienced unscheduled downtime, poor performance, or end-user complaints? This is a watermelon SLA – green on the outside but red on the inside. Please read on for tips on how to recognize and fix them.

The risks of “bad” SLAs

Badly thought out SLAs affect us all – customers and providers alike.

From a customer perspective, poorly-designed SLAs can cause any or all of the following issues:

  • Lack of clarity about the product or service
  • Scope creep
  • No escalation process (for when things, unfortunately, go wrong)
  • Business-impacting incidents not being given the appropriate levels of focus or taking too long to resolve
  • Service requests not being dealt with in a timely manner
  • Problems going undetected and undiagnosed
  • Lack of confidence in the service provider and out-of-process (or shadow IT) initiatives putting the organization at risk

From a service provider perspective, having poor SLAs and documentation can cause:

  • A focus on the wrong things, for example speed rather than quality
  • Having a differing understanding of what applications and services are truly business-critical
  • No understanding of business impact
  • Small issues or concerns spiraling out of control
  • Poor satisfaction ratings
  • Constantly firefighting and dealing with escalations and complaints
  • Adversarial service review meetings, with one side feeling defensive and the other experiencing poor levels of service

What does bad look like?

Badly thought out SLA measurements are nothing new. There are many examples of how measurements can drive the wrong outcomes, but the two most relevant here are the laws of Campbell and Goodhart, a psychologist/social scientist and an economist respectively.

In the world of social science, Campbell’s law states that:

“Once a metric has been identified as a primary indicator of success, its ability to accurately measure success tends to be compromised.”

A similar law exists in the realms of economics; Goodhart’s law states that:

“When a measure becomes a target, it ceases to be a good measure.”

In short, whatever environment you’re in and whichever industry you’re part of, all too often companies use SLAs as a box-ticking exercise – something to be documented before moving on to the next task. The danger with this approach is that the focus becomes all about hitting potentially arbitrary targets rather than the overall service quality.

Poor metrics might also reward the wrong behaviors. For instance, triaging incidents too fast risks missing vital information or not applying workarounds to existing problems and known errors.

If the service desk analyst is only focused on speed, then things like adding proper notes, checking the knowledge base, or taking a couple of extra seconds to attempt a first-time fix will fall by the wayside. Over time, this will lead to a decrease in incidents being resolved at the first point of contact, a dip in overall service quality, and an increase in time taken to restore. None of which are good.

Avoiding Watermelon SLAs

If you’re worried about the watermelon SLA effect – and I hope I’ve convinced you enough that you should be – here are some key things to look out for:

  • Targets that are vague/not enforceable
  • Measurements that aren’t achievable
  • Little or no information about business criticality or impact
  • Overly punitive measures
  • Technical jargon
  • SLAs that don’t have agreed timelines
  • Decreased customer satisfaction ratings
  • Service review meetings that go badly despite targets being met

Then there are everyday IT support activities that we can do better – whether related to outcomes we target through metrics (in SLAs) or not – because, after all, our customers do deserve better!

It’s also not just about money and client renewals/job retention. Our service desk and support teams also deserve better. IT needs to be empowered to provide the best possible levels of service, not be constrained by SLAs. Instead, the message needs to be that anyone who touches the service we provide should have the best possible experience because, as IT professionals, we need to put quality front and center.

It’s time to change the conversation around SLAs

When we focus on numbers, all we see are the numbers, not the end-user experience. For instance, the SLA for your payroll application may have an availability target of 99.5%. Say you meet that target – all is good, right? No – not if the downtime that the remaining 0.5% allows hit at month-end, when the payroll run was being processed.

It instead needs to be all about end-user experience and we need to shift the paradigm from measurements and metrics to business impact.

In the payroll example, the SLA targets may well have been met but do the Finance and Payroll teams feel good about using the service? And what about the end users (employees) who may have experienced delays or errors in getting their wages?
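To put rough numbers on the payroll example, here’s a minimal Python sketch. Everything in it apart from the 99.5% target is an assumption made up for illustration: the 30-day reporting period, the outage times, the payroll run window, and the helper names downtime_budget and overlap. It isn’t how any particular ITSM tool calculates availability; it just shows how a target can be met on paper while most of a business-critical window is lost.

```python
from datetime import datetime, timedelta


def downtime_budget(target_pct, period):
    """Downtime that an availability target tolerates over a reporting period."""
    return period * (1 - target_pct / 100)


def overlap(start_a, end_a, start_b, end_b):
    """Length of time two intervals share (zero if they do not overlap)."""
    return max(min(end_a, end_b) - max(start_a, start_b), timedelta(0))


# The 99.5% target comes from the example; the rest is hypothetical.
target_pct = 99.5
period = timedelta(days=30)

# Hypothetical single outage on the last day of the month.
outage_start = datetime(2020, 4, 30, 9, 0)
outage_end = datetime(2020, 4, 30, 12, 0)

# Hypothetical month-end payroll run window.
payroll_start = datetime(2020, 4, 30, 8, 0)
payroll_end = datetime(2020, 4, 30, 12, 0)

budget = downtime_budget(target_pct, period)
downtime = outage_end - outage_start

# The "outside" of the watermelon: what the SLA report shows.
availability_pct = 100 * (1 - downtime / period)
sla_met = availability_pct >= target_pct

# The "inside": how much of the business-critical window was lost.
lost = overlap(outage_start, outage_end, payroll_start, payroll_end)
payroll_window_lost_pct = 100 * lost / (payroll_end - payroll_start)

print(f"Downtime budget: {budget}")
print(f"Measured availability: {availability_pct:.2f}% (target met: {sla_met})")
print(f"Payroll window lost: {payroll_window_lost_pct:.0f}%")
```

With these made-up figures, a 99.5% target over 30 days tolerates about 3.6 hours of downtime, so a single 3-hour outage still reports as green. But that same outage wipes out three quarters of the assumed payroll window. Reporting a business-impact figure like this alongside the availability number is one way to cut the watermelon open before the service review meeting does it for you.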

Watermelon SLAs hurt us all, so it’s time that we fixed them.

Have you seen watermelon SLAs in action? How did they affect your organization? Please let me know in the comments!


Posted by Joe the IT Guy

Native New Yorker. Loves everything IT-related (and hugs). Passionate blogger and Twitter addict. Oh...and resident IT Guy at SysAid Technologies (almost forgot the day job!).