Optimizing Cloud Infrastructure Costs During Uncertain Times
Do not reduce or shut down services without consulting your technology leaders. The last thing your company
needs is to hinder its own services. Instead, hold data-driven FinOps discussions before agreeing on
what to shut down or where to optimize.
Use this eBook to learn how to generate the right kinds of cloud cost data for bigger, more strategic savings discussions with other
business leaders. Also learn tactics that any engineering, DevOps, product management, or FinOps team can use to generate
savings and increase efficiency.
The goal:
To find a level of efficiency in how you operate your cloud infrastructure that can
yield savings without having to turn things off.
Before any discussion of altering cloud services begins, make sure every technologist, finance professional, and business leader is
looking at the same cloud cost data. This cost data is complicated: it is billed by the second and, for large companies, often
delivered monthly as millions of rows of spreadsheet data.
Rather than reading raw billing data, ensure that your technology and finance teams have a cloud cost management platform or tool
in place to make sense of it. These kinds of services, like Cloudability, ingest all of this data, process it within an analytics engine, and
produce easy-to-read dashboards and reports. This creates a common language that helps technologists, finance personnel, and business
leaders make sense of cloud costs.
Spoiler alert:
It’s no fun to actually read through a detailed billing report.
The more a team can tag resources, the better and more accurate their cloud cost allocation will be. Dashboards and reports will be clearer and
discussions about who owns or owes what from a cloud cost standpoint will make much more sense.
This tagging work can be a challenge, however. Cloudability typically recommends three ways to help improve cloud service tagging:
1. Create a policy around tagging new resources. Any new service MUST be tagged.
2. No manual tagging of services. Instead, use automation software, like Terraform, Puppet, or Chef, to tag newly-created cloud resources.
3. Use a management platform to ingest and report on these tags. Manual human work often takes too long or contains too many errors.
Some engineers even consider untagged resources a form of technical debt. The longer teams put off tagging or re-tagging services to improve
allocation accuracy, the less accurate their budgeting and forecasting become. Avoid this opportunity cost by taking the time to tag
resources that have missing, misspelled, or even multiple-language tags.
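A tag audit of this kind can be sketched in a few lines of Python. This is only an illustration: the required tag keys (`team`, `project`, `environment`) and the similarity cutoff are assumptions, not a Cloudability policy, and a real audit would read tags from your provider's inventory or billing export.

```python
from difflib import get_close_matches

# Hypothetical tagging policy: every resource must carry these keys.
REQUIRED_TAGS = {"team", "project", "environment"}

def audit_tags(resource_tags):
    """Return (missing, suspect) for one resource's tag dictionary.

    missing: required keys that are absent (after normalising case/whitespace).
    suspect: extra keys that look like misspellings of a missing required key.
    """
    keys = {key.strip().lower() for key in resource_tags}
    missing = sorted(REQUIRED_TAGS - keys)
    suspect = {}
    for key in sorted(keys - REQUIRED_TAGS):
        match = get_close_matches(key, missing, n=1, cutoff=0.8)
        if match:
            suspect[key] = match[0]
    return missing, suspect
```

For example, a resource tagged `{"Team": "payments", "projct": "checkout"}` is reported as missing `environment` and `project`, with `projct` flagged as a likely misspelling of `project`.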
Once you have these reports ready to inform your teams, it’s time to discuss ways to lower cloud costs and improve infrastructure efficiency.
A thoroughly tagged infrastructure means every resource and its cost connect back to the team or project that owns it.
It also means those costs are ingested into a cloud cost management solution continuously, rather than only when the monthly bill arrives.
While some businesses have the ability to invest internal resources to create their own data-driven anomaly detection, many might find it easier
to use an existing, proven solution. Cloudability provides this level of anomaly detection within its multi-cloud cost management platform. It
monitors continuously and informs viewers and budget owners when a spike or dip occurs. The sooner these people know, the quicker they
can take action.
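To make the idea of a cost spike or dip concrete, here is a minimal sketch of one common approach: comparing each day's cost with the mean and standard deviation of the preceding days. It is not how Cloudability detects anomalies; the window size and threshold are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def find_anomalies(daily_costs, window=7, threshold=3.0):
    """Return indices of days whose cost deviates sharply from the prior window.

    A day is flagged when it lies more than `threshold` standard deviations
    from the mean of the previous `window` days.
    """
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is unusual.
            if daily_costs[i] != mu:
                anomalies.append(i)
            continue
        if abs(daily_costs[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies
```

With daily costs of `[100, 102, 98, 101, 99, 100, 103, 250, 101]`, only the day at index 7 (the jump to 250) is flagged.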
Find a platform that can take thoroughly tagged cloud service utilization and costs and generate meaningful budget analysis and forecasts
from them. As measures are taken to contract spending across the business, this cloud cost information and proactive analysis can help teams get
ready to adapt to changes in budget.
Cloudability recommends using a cloud cost management tool to ingest data, analyze, and produce these forecasts. DIY solutions are possible, but
might end up costing more to build than the amount of money to be saved!
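For intuition only, the simplest possible forecast is a run-rate projection: take month-to-date spend, divide by days elapsed, and extend it to the end of the month. The function name and figures below are hypothetical, and a real platform would account for seasonality, commitments, and tagging.

```python
import calendar
from datetime import date

def run_rate_forecast(month_to_date_spend, today):
    """Project end-of-month spend from the average daily spend so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = month_to_date_spend / today.day
    return daily_rate * days_in_month

# Example: $12,000 spent by April 10 projects to $36,000 for the 30-day month.
projected = run_rate_forecast(12000.0, date(2021, 4, 10))
```

Comparing `projected` against a team's budget gives an early warning well before the monthly bill arrives.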
EC2 instances of various types and sizes have different per-second costs. The general-purpose M family caters to typical workloads at an affordable rate.
However, if users require compute-intensive, memory-heavy, or GPU-focused workloads, they’ll need to pay different rates for these services.
Using a cloud cost management platform, look back at tags and allocation to see which teams, projects, or departments are generating high VM or
compute costs. Dig deeper and see which services they’re using. Ask these questions:
Pro tip:
If you’re unsure about how well your teams are utilizing their existing services, a cloud cost management platform can use
machine-learning data analytics to make sense of it all for you. See how Cloudability combines multiple cloud cost data points
from multiple cloud providers to surface your actual cloud costs in detail.
While faster SSD or NVMe storage is the go-to for active workloads, users can opt for cheaper “cold” storage solutions for data they hardly touch or use.
These solutions are often cheaper to run over time, but might incur charges to move that data from cold storage back into everyday use.
Use a cloud cost management platform to help monitor read and write activity and get a sense of how much storage is actually costing your teams
across your cloud infrastructure.
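The trade-off between a lower storage rate and retrieval charges can be checked with simple arithmetic. The sketch below uses made-up per-GB prices purely for illustration; substitute the actual rates from your provider's price list.

```python
def monthly_storage_cost(size_gb, price_per_gb,
                         retrieved_gb=0.0, retrieval_price_per_gb=0.0):
    """Monthly cost = storage charge + charge for data pulled back out."""
    return size_gb * price_per_gb + retrieved_gb * retrieval_price_per_gb

# Hypothetical prices, NOT real provider rates.
hot = monthly_storage_cost(1000, price_per_gb=0.02)
cold = monthly_storage_cost(1000, price_per_gb=0.004,
                            retrieved_gb=100, retrieval_price_per_gb=0.03)
cold_is_cheaper = cold < hot
```

Under these assumed prices, cold storage wins while only a tenth of the data is retrieved each month; if most of the data were read back monthly, the retrieval charge would erase the saving.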
An example could be looking at a compute cluster and seeing if each individual instance is using all of its capability. Were these instances set up
during a time of intense utilization, and are they now running idle at times? This is an opportunity for change and for savings. Using a cloud cost
management platform to help determine where this waste is happening can help your technology teams determine what to shut off and what to
rightsize.
Speaking of rightsizing...
One way to find savings by the second is to ensure that services on your infrastructure are the correct size. Using a data platform to help your
teams determine what percentage of utilization is efficient can help with rightsizing discussions.
You might find numerous instances that are too large for their workloads. This is an opportunity to size down and find some savings at scale.
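As a rough sketch of how oversized instances might be shortlisted, the function below flags any instance whose peak CPU utilization stayed under a chosen ceiling for the whole observation period. The 40 percent ceiling, the instance IDs, and the use of CPU alone are assumptions; a real review would also weigh memory, network, and disk.

```python
def rightsizing_candidates(utilization, max_peak=40.0):
    """Return {instance_id: peak_cpu} for instances that never exceeded max_peak.

    utilization maps an instance ID to a list of CPU-percent samples.
    Instances with no samples are skipped rather than guessed at.
    """
    candidates = {}
    for instance_id, samples in utilization.items():
        if not samples:
            continue
        peak = max(samples)
        if peak < max_peak:
            candidates[instance_id] = peak
    return candidates

# Hypothetical sample data.
fleet = {
    "i-web-1": [12.0, 18.5, 22.0],   # never busy: candidate to size down
    "i-batch-1": [35.0, 91.0, 60.0], # busy at peak: leave alone
    "i-new-1": [],                   # no data yet
}
shortlist = rightsizing_candidates(fleet)
```

Using the peak rather than the average is a deliberately conservative choice, so an instance that is idle most of the day but saturated during a nightly job is not flagged.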
Note:
If you change the size of a cloud service that is attached to a lower committed-rate agreement (e.g., AWS EC2 instances
and their respective Reserved Instances), you’ll want to be ready to convert or sell those RIs!
Use a cloud cost management platform not only to ingest and manage all cloud services, but also to surface reports that help identify
which services require rightsizing. This avoids guesswork and uses actual data over time to help your technology teams
determine different baselines for service utilization. These baselines can show which services are operating efficiently and which are
underutilized. Some might even be overburdened, creating opportunities to reduce service downtime by provisioning enough
computing power to sustain those workloads.
Leading cloud cost management platforms will have API functionality to connect to infrastructure and automate rightsizing, taking away risky,
error-prone manual processes.
To help determine which services have this type of seasonal increased utilization, use a cloud cost management platform. Create a dashboard by
tagged service to know exactly which teams, projects, or departments tend to have “spiky” utilization and costs. This can be a great starting point
to create automation around temporary scaling of these services.
There’s no need to keep instances on when they aren’t in use. This usage with low utilization is actually waste. Hunting for this waste manually can
be quite the chore, especially within a massive, scaled cloud infrastructure. Instead, rely on a data analytics platform to help your teams identify
and rightsize services to improve utilization.
The places where insights can come up can be surprising as well. In a well-tagged infrastructure running a cloud cost management platform,
sometimes “zombie costs” from services left on or unattended can be detected by users operating within different teams or even locations.
Democratizing how teams access cloud cost and utilization data opens up more opportunities for FinOps-minded members to seek out ways to
save, react to anomalies, and generally be more vocal about cloud cost optimization.
When there’s a spike or a need for more services, simply turn them on. But don’t forget to “turn the lights off” when you’re done. Using a cloud
cost management platform, FinOps experts let automation do the work by setting upper and lower utilization thresholds that services should
stay within. These guardrails help teams be proactive about whether their infrastructure requires more (or fewer) services. This is
far better than over-purchasing and forgetting about things (and having them incur costs by the second!).
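The threshold idea above can be sketched as a simple decision rule. The 30 and 75 percent bounds and the action names are hypothetical; in practice the decision would feed an autoscaling policy or an alert rather than a return value.

```python
def scaling_action(avg_utilization, low=30.0, high=75.0):
    """Decide what to do when utilization leaves the [low, high] band."""
    if avg_utilization > high:
        return "scale_out"   # add capacity before users feel it
    if avg_utilization < low:
        return "scale_in"    # turn the lights off on idle capacity
    return "hold"
```

For instance, `scaling_action(12.0)` suggests scaling in, while `scaling_action(50.0)` leaves the service as it is.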
Once your teams tag infrastructure properly and thoroughly, and identify which types of services and how many of them they need, it’s time to
use those services at the best rate possible. Across most major cloud service providers, there are reserved rate discounts that require
committing to a certain amount of time in exchange for a lower rate (e.g. AWS’s Reserved Instances). There can be multiple lower rates to choose from
depending on the term of the commitment as well as whether payment is made upfront or not.
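Whether a commitment pays off depends on how much of it you actually use. The sketch below assumes a reservation is billed for every hour of its term whether or not it is used, while on-demand usage is billed only for hours actually run. The hourly rates are made-up illustration values, not real AWS prices.

```python
def break_even_utilization(on_demand_hourly, reserved_effective_hourly):
    """Fraction of the term an instance must run for the reservation to pay off."""
    return reserved_effective_hourly / on_demand_hourly

def cheaper_option(on_demand_hourly, reserved_effective_hourly, utilization):
    """Compare the cost of one term-hour under each pricing model."""
    on_demand_cost = on_demand_hourly * utilization
    reserved_cost = reserved_effective_hourly  # paid regardless of usage
    return "reserved" if reserved_cost < on_demand_cost else "on_demand"

# Hypothetical rates: $0.10/hour on demand, $0.06/hour effective when reserved.
threshold = break_even_utilization(0.10, 0.06)
```

With these assumed rates the reservation only wins if the instance runs more than about 60 percent of the time, which is why rightsizing and usage analysis should come before any commitment.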
Some of these discounts are very flexible and can be converted as well (once again, AWS RIs come to mind). Specifically with convertible AWS RIs, if you
change the service related to the RI, you can exchange the RI to match the new service and maintain the coverage of the discount. This leads to one tactic
that any FinOps team can investigate if looking for ways to save on cloud costs.
Use a data analytics tool, like a cloud cost management platform, to help your teams ensure that every pre-paid discount is being applied to
an active cloud service. For example, your team might want to emulate Atlassian’s waterline strategy, where they ensure that their AWS Reserved
Instances cover at least 90 percent of their active instances. With their infrastructure changing all of the time, having an automated way of
reapplying or adjusting discounts ensures they aren’t paying the full, non-reserved rate for a majority of their services.
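A coverage target like the 90 percent waterline can be tracked with a short calculation. This is a minimal sketch assuming coverage is measured in instance-hours; the function names and numbers are hypothetical and this is not a description of Atlassian's actual tooling.

```python
def reservation_coverage(running_hours, reserved_hours):
    """Share of running instance-hours that were covered by a reservation."""
    if running_hours <= 0:
        return 0.0
    return min(running_hours, reserved_hours) / running_hours

def hours_to_reach_target(running_hours, reserved_hours, target=0.90):
    """Additional reserved hours needed to reach the target coverage."""
    return max(0.0, target * running_hours - reserved_hours)

# Hypothetical month: 1,000 instance-hours run, 850 of them reserved.
coverage = reservation_coverage(1000, 850)
shortfall = hours_to_reach_target(1000, 850)
```

Here coverage is 85 percent, so roughly 50 more reserved instance-hours would be needed to reach the 90 percent line. Reserving more hours than actually run is capped at 100 percent coverage, which is a reminder that unused reservations are waste too.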
There also might be the case where a previously purchased reservation simply doesn’t make sense anymore. Maybe the service is no longer being
used or engineers have determined that they no longer need it. Instead of letting the reservations go unused, try converting them into
different service sizes or types that your infrastructure actually uses, or sell excess reservations on the Amazon EC2 Reserved Instance Marketplace to gain back some of
those costs.
While AWS has its own set of pre-paid discounts via Reserved Instances, Microsoft Azure has its own system, called Azure Reserved Virtual
Machine Instances. Google Cloud Platform users can purchase “Committed Use Discounts” across their compute services. Every platform has a
different type of committed usage discount and they must be tracked, analyzed, and adjusted in their own way to map to an ever-changing multi-
cloud infrastructure.
The best way to reduce the amount of manual analysis across clouds and to get accurate analytics and recommendations is to find a multi-cloud
cost management platform that can ingest and more importantly retain cloud cost data from multiple services and serve them to your teams from a
FinOps efficiency perspective.
This is where a cloud cost management platform comes in. Use one that has a feature focused on Reserved Instance recommendations and
adjustments using the AWS API. This removes manual changes while ensuring that recommendations and adjustments are backed by real cost and
usage data.
While these discounts may not be as deep as those of standard RIs or as granularly adjustable, Savings Plans cater to teams that know they’ll use
EC2 and Fargate specifically but don’t know how their infrastructure might change or scale.
A Savings Plan might be the path to take if your teams expect changes in workload and infrastructure needs, but don’t want to constantly do the
legwork of adjusting or converting RIs, or reassigning unused ones to other services.
Even with an AWS Savings Plan, making sense of those cloud costs requires an additional layer of data and analysis. Use a cloud cost management
platform to help your teams understand where each dollar within those Savings Plans is going—by team, department, project, or more—and
attribute these costs to real business metrics that your organization might be using.
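One simple way to attribute a shared commitment is to split its cost in proportion to each team's covered usage, then divide by a business metric. The sketch below assumes that approach; the team names, usage figures, and the "orders" metric are all hypothetical.

```python
def allocate_commitment(commitment_cost, covered_usage_by_team):
    """Split a shared commitment's cost in proportion to each team's usage."""
    total = sum(covered_usage_by_team.values())
    if total == 0:
        return {team: 0.0 for team in covered_usage_by_team}
    return {team: commitment_cost * usage / total
            for team, usage in covered_usage_by_team.items()}

def cost_per_unit(allocated_cost, units):
    """Tie an allocated cost to a business metric, e.g. cost per order."""
    return allocated_cost / units if units else float("inf")

# Hypothetical month: a $100 commitment shared by two teams.
shares = allocate_commitment(100.0, {"checkout": 30, "search": 10})
checkout_cost_per_order = cost_per_unit(shares["checkout"], units=1500)
```

In this example the checkout team carries $75 of the commitment and the search team $25, which works out to five cents per order for checkout.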
If your business is seeking new ways to reduce costs, and the IT infrastructure is where it is looking, we hope this guide can be of some
use. The four tactics covered here are building blocks for a greater FinOps practice.
According to the FinOps Foundation and the best practice book, Cloud FinOps, there are three cyclical phases that finance experts and
technologists must recognize to create strong processes and culture around cloud cost management: Inform, Optimize, and Operate.
Following this guide not only helps your teams immediately identify ways to save, but also helps them start building the foundation for a stronger FinOps
practice across your organization. Times of market contraction and uncertainty are normally short. At some point your business will get back on
track, and the optimization work put in now will return massive savings later on.
To ensure that proper information is utilized across your teams, it’s critical to rely on a data- and analytics-driven cloud cost management platform.
Take the manual work out of seeking out efficiency and savings and rely on a proven platform to help you find these opportunities to save and to
conquer cloud costs moving forward. The less time your FinOps experts (technologists, finance, and operations) spend on manually counting costs
and building charts, the more time they can spend responding to times of uncertainty to help put your business back on the track toward success.
Eliminate your cloud waste with actionable insights into your usage and cost data - Get started today.
© 2021 Apptio, Inc. All rights reserved. Trademarks and logos are the property of their respective owners. A1674 V2104-1 Apptio.com