3 Important Cloud Views for the ITSM Capacity Manager
When applications and infrastructure run out of capacity, they adversely impact services and business operations. To mitigate this risk, it’s common for a capacity manager role to exist within an organization to ensure that business service capacity is understood. This includes non-technical capabilities (for example, IT service desk agents) as well as IT resources.
Now that cloud services are mainstream and many IT services run on them, IT service management (ITSM) capacity managers need to be cloud aware. You can find out more about the capacity management implications of cloud in this blog: “How Cloud Changes ITIL Capacity Management from the Vague to the Specific.”
And, to help you out further, this blog provides important information on what you need to do to modernize capacity management in the context of cloud services.
Capacity Management in the Cloud
Although the leading public cloud providers such as Amazon Web Services (AWS) and Microsoft Azure have more than one hundred cloud services between them, in practice only a handful of those services are commonly used to underpin the business. For the capacity manager, three views matter most, and each has a different capacity profile:
- Elastic and automated compute
- Higher-order cloud services like databases
- Wasted capacity and cost-optimization
I’m going to look at each of these and then finish with a 5-step action plan for cloud capacity management.
1. Elastic and Automated Compute
Long gone are the months once required to provision servers to add more capacity, which has in turn largely removed the need for long-lead-time forecasting. Servers can be scaled up – made bigger – or scaled out – made more numerous – in minutes. The approvals are what take up the time now.
The capacity manager is now much more interested in getting work off their plate by programming the cloud to do the job for them. Using the cloud’s elastic features, here’s what a savvy capacity manager does now:
- Work out, using real data, which data points indicate that more or less capacity is needed. This could be CPU, transactions, queue sizes, memory usage, etc.
- Based on the data points, program the cloud to scale up and scale down, and to add instances to or remove them from load balancers or other integrated systems.
- Sometimes the cloud’s elastic features are not appropriate for large-scale events like ticket releases, so a capacity manager needs to know how to pre-scale services, for example by pre-warming load balancers (which involves a support ticket with the cloud service provider).
Elastic scaling services such as AWS EC2 Auto Scaling groups (ASGs) can scale instance counts up and down across multiple datacenters without any manual intervention, and they can also update load balancers and other dependent systems.
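To make the idea of "programming the cloud" concrete, here is a minimal sketch of the kind of proportional rule a scaling policy applies: given a metric (say, average CPU) and a target value, work out how many instances would bring the metric back to target. This is a simplified illustration, not the algorithm any particular provider uses, and the function name, parameters, and limits are all hypothetical.

```python
import math


def desired_instances(current: int, metric_value: float, target_value: float,
                      min_size: int = 1, max_size: int = 10) -> int:
    """Return an instance count that should bring the average metric back to target.

    Assumes load is spread evenly, so the average metric falls in proportion
    to the number of instances. All names and defaults are illustrative.
    """
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value must be > 0")
    # Proportional rule: more load per instance than the target means more instances.
    raw = math.ceil(current * metric_value / target_value)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    # 4 instances running at 90% CPU against a 60% target.
    print(desired_instances(4, 90.0, 60.0))
```

The minimum and maximum bounds matter as much as the rule itself: they are where the capacity manager caps spend and guarantees a baseline level of service.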
2. Higher-Order Cloud Services Like Databases
Friends don’t let friends manage virtual machines.
Certainly not by hand, anyway!
The leading public cloud providers are making life easier – and their platform much stickier – by bundling up common stacks, like databases, into a managed service.
For databases, a capacity manager no longer needs to work with an internal DBA and infrastructure team to manage the database infrastructure capacity. Using a cloud service provider dashboard, it’s possible for a capacity manager to add read capacity with “read replicas,” scale up the database instance by changing the underlying instance size, and even tweak the database engine parameters. These features are inconsistently available from the leading cloud service providers, with AWS’s Relational Database Service (RDS) ahead and Azure, Google, and others trailing behind.
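As a rough illustration of how the two main levers differ, the sketch below picks between adding a read replica and scaling up the instance based on how busy the database is and how read-heavy the workload is. The thresholds, the function name, and the returned labels are assumptions made for the example, not provider guidance; real decisions would use measured data for the workload in question.

```python
def recommend_db_action(cpu_percent: float, read_fraction: float,
                        cpu_threshold: float = 75.0,
                        read_heavy: float = 0.8) -> str:
    """Suggest a capacity action for a managed relational database.

    cpu_percent:   average CPU utilization of the primary instance (0-100).
    read_fraction: share of queries that are reads (0.0-1.0).
    Thresholds are hypothetical defaults chosen for illustration only.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    if cpu_percent < cpu_threshold:
        return "no change"
    # Read replicas only relieve read traffic, so they suit read-heavy workloads.
    if read_fraction >= read_heavy:
        return "add read replica"
    # Write-heavy or mixed load has to be absorbed by a larger primary instance.
    return "scale up instance"


if __name__ == "__main__":
    print(recommend_db_action(cpu_percent=85.0, read_fraction=0.9))
```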
Other database services such as NoSQL systems (e.g. AWS DynamoDB) and data warehouse systems (e.g. AWS Redshift) have different capacity management dimensions but they still have the same ease of management.
3. Wasted Capacity and Cost-Optimization
In the cloud, cost optimization is the #1 concern and now the leading initiative for mature cloud users according to the annual RightScale State of Cloud Report.
This was again reflected in the 2017 version of the report, with its position even higher across each maturity level.
Cost optimization is 100% a capacity management concern!
Because all cloud resources are metered, their usage and cost are transparent. This means not just providing enough capacity, but tuning the capacity to match the workload with as little wastage as possible.
One example is a capacity manager implementing (with technical assistance) a switch-off policy for developer environments. If you turn off developer environments out of hours, you can save 560 hours of processing in a 720-hour month, which is about 78% of the compute bill for those environments.
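The arithmetic behind that figure can be sketched in a few lines. The example assumes a 30-day (720-hour) month in which environments run 8 hours a day on 20 working days; that schedule is an assumption chosen because it reproduces the 560 hours quoted above, and 560 out of 720 works out to about 77.8%.

```python
def switch_off_savings(hours_in_month: int = 720,
                       hours_on_per_day: int = 8,
                       working_days: int = 20) -> tuple[int, float]:
    """Return (hours saved, fraction of metered hours saved) for a switch-off policy.

    Defaults are illustrative assumptions: a 720-hour month with environments
    running only 8 hours a day on 20 working days.
    """
    hours_on = hours_on_per_day * working_days   # hours the environment actually runs
    hours_off = hours_in_month - hours_on        # hours no longer billed
    return hours_off, hours_off / hours_in_month


if __name__ == "__main__":
    saved, fraction = switch_off_savings()
    print(f"{saved} hours saved, {fraction:.1%} of metered hours")
```

Only hourly metered charges stop when an environment is switched off; anything billed regardless of running state, such as storage, would need to be accounted for separately.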
An Action Plan for ITSM Capacity Managers
By combining all of the above practices, the cloud-savvy ITSM capacity manager can use cloud to deliver significant benefits to the business.
These are likely to be new skills to acquire and practice, so here’s a plan to help get there:
- Purchase and study online cloud training on AWS or Azure. Learning by doing is not enough by itself; it’s important to learn what’s correct and what’s not.
- Practice scale-out with systems that are capacity-related such as AWS EC2 and AWS Application Load Balancers.
- Practice scale-up with right-sizing instance sizes, understanding the difference between cloud instance families and sizes.
- Use cloud tagging and policies to improve cost-optimization with capacity management.
- Formalize your new cloud capacity management approach into a “Cloud Service Management Framework.”
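For the tagging step in the plan above, a small sketch shows why tags matter for cost optimization: once every resource carries a tag, spend can be grouped by it, and anything untagged stands out. The resource records, the "environment" tag key, and the cost figures are made up for illustration; in practice the data would come from the cloud provider's billing export.

```python
from collections import defaultdict


def cost_by_tag(resources: list[dict], tag_key: str = "environment") -> dict[str, float]:
    """Sum monthly cost per value of tag_key; untagged resources are grouped together."""
    totals: dict[str, float] = defaultdict(float)
    for resource in resources:
        # Resources missing the tag are reported under "untagged" so they are visible.
        tag_value = resource.get("tags", {}).get(tag_key, "untagged")
        totals[tag_value] += resource["monthly_cost"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical inventory; names and costs are illustrative only.
    inventory = [
        {"name": "web-1", "monthly_cost": 100.0, "tags": {"environment": "prod"}},
        {"name": "web-2", "monthly_cost": 50.0, "tags": {"environment": "prod"}},
        {"name": "dev-1", "monthly_cost": 40.0, "tags": {"environment": "dev"}},
        {"name": "mystery", "monthly_cost": 25.0, "tags": {}},
    ]
    print(cost_by_tag(inventory))
```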
Once this is all done, your traditional capacity management skills will be fit for public cloud service use cases and you will be better able to demonstrate the business value of your capacity management efforts.
Your traditional capacity management skills will still be relevant, given that most organizations will use a combination of cloud and on-premises delivery models. You can find “5 Ways to Improve Your Capacity Management” here.