The advent of cloud computing has enabled us to develop faster and ship more than ever before, but we’re quickly realizing (if we haven’t already) that it comes with a tradeoff: we’re spending a lot more on cloud than we anticipated. Our battle now is not about increasing our speed of innovation, but about cloud cost management -- managing and optimizing the costs that have come with faster innovation.
While we want to empower developers to build, test, and deploy code quickly, we also want to be as efficient as we can with our spend. Whether it’s a team in finance, operations, engineering, or the new hotness of FinOps, someone is on the hook for keeping those cloud costs in check!
Let’s go through a few approaches we can tangibly take to understand and manage cloud costs.
Pros
Cons
Typically, this method is best used by early startups and in some cases, growing SMBs, and it’s most effective when these requirements are met:
Naturally, we can manage our cloud costs by keeping track of what we’re using and where in a spreadsheet. Makes sense and seems easy enough, right? This approach can make a lot of sense when we’re spinning up ten AWS EC2 instances and three S3 buckets, but anyone who’s done their own management knows that this quickly becomes a chore, and is, in fact, impossible as we scale -- especially when we start introducing things like containers.
Think about the work involved in managing just your thirteen AWS resources. Now, think about what that would mean if you introduced containerization and exploded the complexity of managing these resources: you have to keep track of how containers and resources map to each other, and the usage at any given point in time, making it difficult to get an accurate accounting of what’s going on if you do it manually.
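To make the container bookkeeping concrete, here is a minimal sketch of one common way to attribute a shared instance's cost to the containers running on it: split the cost in proportion to each container's CPU request. The function name, the container names, and the hourly price are all made up for illustration, and a real setup would also have to account for memory, idle capacity, and usage that changes over time.

```python
def allocate_node_cost(node_hourly_cost, cpu_by_container):
    """Split one node's hourly cost across containers by CPU share.

    cpu_by_container maps a (hypothetical) container name to the CPU
    cores it requested on this node during the hour.
    """
    total_cpu = sum(cpu_by_container.values())
    if total_cpu == 0:
        # Nothing requested CPU, so there is nothing to attribute.
        return {name: 0.0 for name in cpu_by_container}
    return {
        name: node_hourly_cost * cpu / total_cpu
        for name, cpu in cpu_by_container.items()
    }


# Illustrative numbers only: a $0.40/hour instance running three containers.
shares = allocate_node_cost(0.40, {"web": 2.0, "worker": 1.0, "cron": 1.0})
for name, cost in shares.items():
    print(f"{name}: ${cost:.2f}/hour")
```

Doing this by hand means repeating the calculation for every node and every hour, which is where the spreadsheet approach starts to fall apart.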
As we have more cloud resources to manage, the effort it takes to manage them manually grows exponentially.
If this model is what works best for our org, it’s not that hard! In fact, cloud providers have created a concept called tagging specifically so that we can organize and manage our cloud resources. We can then leverage the cloud provider’s built-in cost dashboard to see what things are costing based on how we’ve tagged those resources.
Tags are essentially key-value pairs assigned to each resource, much like entries in a hash map. They let us define how we want to identify resources and assign the key-value pairs required to uniquely identify each one. For example, in our small pool of resources, we know that to uniquely identify a resource we only need the owner and the application, so we can tag every resource accordingly. In this instance, we might end up with tags on our three AWS S3 buckets that read {"owner": "foo", "application": "bar"}, {"owner": "joe", "application": "schmoe"}, and {"owner": "sally", "application": "pally"}. Armed with this identification ability, we can hop into our AWS Cost Explorer dashboard, see exactly what these resources are costing us, and map those costs to our internal model.
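As a rough illustration of why tags make costing easy, the sketch below models each resource as a dictionary carrying its tags and groups spend by any tag key. The resource names and monthly costs are invented for the example, and the data is hard-coded rather than pulled from a cloud provider, so treat it as a picture of the idea and not as a billing integration.

```python
from collections import defaultdict

# Hypothetical inventory: names and monthly costs are made up.
resources = [
    {"name": "bucket-1", "tags": {"owner": "foo", "application": "bar"}, "monthly_cost": 12.50},
    {"name": "bucket-2", "tags": {"owner": "joe", "application": "schmoe"}, "monthly_cost": 4.25},
    {"name": "bucket-3", "tags": {"owner": "sally", "application": "pally"}, "monthly_cost": 30.00},
    {"name": "instance-1", "tags": {"owner": "foo", "application": "bar"}, "monthly_cost": 40.00},
]


def cost_by_tag(resources, key):
    """Sum monthly cost per value of the given tag key."""
    totals = defaultdict(float)
    for resource in resources:
        # Resources missing the tag land in an explicit "untagged" bucket.
        value = resource["tags"].get(key, "untagged")
        totals[value] += resource["monthly_cost"]
    return dict(totals)


print(cost_by_tag(resources, "owner"))
print(cost_by_tag(resources, "application"))
```

The same grouping works for any tag key we choose to standardize on, which is why a consistent tagging schema matters so much.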
Ultimately, doing this relies heavily on creating strong governance around the usage of cloud resources. We want to create consistency in how we identify resources, regardless of how they are invoked, and follow an identification schema that lets us effectively group and find resources for us to keep track of costs and eventually optimize them. At that point, we can come to the table at any time to do a costing exercise and see where costs are coming from and how we might be able to optimize them.
Tagging always runs the risk of resources being tagged incorrectly or not at all, and the burden falls on us to maintain the policies and validate the tags over time. The strategy depends on good tag coverage and good tag accuracy, and when either one slips, our understanding of costs breaks down with it. On the other hand, it does give us lots of control, and when the tags are right we’ll understand our costs better than ever!
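Since the whole approach hinges on coverage, it can help to measure it. The sketch below, which assumes the owner and application keys from the earlier example are the required ones, reports which resources are missing tags and what fraction of the inventory is fully tagged. The inventory is hypothetical, and checking that tag values are actually correct would need extra rules of our own.

```python
REQUIRED_KEYS = ("owner", "application")  # assumed tagging policy


def missing_tags(resource, required=REQUIRED_KEYS):
    """Return the required tag keys that are absent or empty."""
    tags = resource.get("tags", {})
    return [key for key in required if not tags.get(key)]


def tag_coverage(resources, required=REQUIRED_KEYS):
    """Fraction of resources that carry every required tag."""
    if not resources:
        return 1.0
    compliant = sum(1 for r in resources if not missing_tags(r, required))
    return compliant / len(resources)


# Hypothetical inventory with two policy violations.
inventory = [
    {"name": "bucket-1", "tags": {"owner": "foo", "application": "bar"}},
    {"name": "bucket-2", "tags": {"owner": "joe", "application": "schmoe"}},
    {"name": "instance-7", "tags": {"owner": "sally"}},
    {"name": "instance-8", "tags": {}},
]

print(f"coverage: {tag_coverage(inventory):.0%}")
for resource in inventory:
    gaps = missing_tags(resource)
    if gaps:
        print(f"{resource['name']} is missing: {', '.join(gaps)}")
```

Running a check like this on a schedule is one way to catch tag drift before it distorts the cost picture.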
Pros
Cons
Typically, this method is best used by companies that are SMBs or early mid-market, and it’s most effective when these requirements are met:
For those of us who have tried to stick with the manual method as we hit our stride and grow our use cases, we know that it becomes nearly impossible to properly track and tag all of the resources in order to understand costs and create more efficiency. In fact, trying to maintain tag hygiene sometimes creates more inefficiency than it removes, which makes it a poor way to manage cloud costs as we scale.
At this point, we want to strongly consider some kind of management software, including cloud cost management tools, whether that’s something built in-house or provided by a vendor to help with a few core problems:
At this point, we can confidently say we understand our cloud costs.
But who understands the cloud costs, and who is responsible for understanding them? We have to realize that with cloud cost management software, we’ve flipped the script. When we were small, engineers owned and managed resources and cost, but with cost management software we go from bottom-up cloud cost management to top-down: engineers are tasked with innovation while a manager (usually sitting between a few different teams and having no context) is tasked with managing the costs.
With cost management software, we suddenly abstract away a lot of the really difficult work that manual management of cloud costs requires, and there’s a good chance we’ll net a bunch of savings that previously hadn’t been discovered via some great “reserved” resource pricing or rightsizing opportunities. That, and there’s a good chance we’re able to get this view without any tagging! What’s not to like?
The promise of such a top-down management method is realized early on in the process and time to value is certainly minimized because there are a lot of upfront opportunities to optimize the cloud resource fleet. But there’s a caveat: even in a scenario where we can track every resource and identify every savings opportunity, it’s tough to actually implement them because chances are, we can’t just go and make changes. We’ll have to contend with individual budget owners, CI/CD issues and velocity, and rope multiple owners and functions into every conversation.
Not only does this mean the savings we found might never actually get implemented, but also that we’ve solved one problem only to run into a brick wall. A top-down cost optimizer that doesn’t get into the weeds can’t solve these problems, in the same way that an armchair analyst can’t do the job of someone on the ground.
We can definitely understand our costs now, but we’re not much closer to actually managing them well or optimizing them, which was the whole idea.
As it turns out, cloud cost management software provides great point-in-time views of what’s going on and automates some of the thinking around what to do at a high level, but it’s really just a nicer way of doing the same manual management. Now we get to spend our time trying to talk others into cost optimizations rather than trying to find the optimizations in the first place. We’ve made progress, but this situation is still far from ideal.
If we really want to create a long-term scalable model, we have to consider that the top-down approach traditional cloud cost management software provides may not be the best fit for our needs. It’s awesome for a high-level view and powerful recommendations, but that way of putting a strategy into practice doesn’t work for everyone. For these folks, it’s important to solve the core efficiency problem, which is a different kind of challenge.
Pros
Cons
This method works well for companies at all stages, but particularly well for growing SMBs, mid-market companies, and especially enterprises. It’s most effective when these requirements are met:
There are a lot of good things about cloud cost management software, and we definitely want to preserve those. In particular, regardless of the solution we choose, we want to ensure we can have a simple view that helps us understand our cloud costs. But we want to get away from top-down management of cloud costs, because we see that as we scale to greater heights, it creates a lot of inefficiencies and doesn’t get to the core of the problem: the engineering teams that are building things.
We want our engineers to build awesome stuff and we don’t want to slow down their velocity. After all, this is the whole reason we moved to the cloud and created this innovation model. At the same time, we can’t escape the fact that this model has created cost concerns and we want to get more efficient about it.
Let’s take a step back and see what we know:
Long story short, we want the control we had when we could handle it ourselves, we want to keep moving fast, and we want to operate at scale. There’s no need to imagine whether that could happen, because that’s where collaborative cost management comes into the picture.
The collaborative cost management model leverages the best of bottom-up ownership of resources as well as the top-down view of what’s happening. The question to answer then is how to implement such a model.
At its core, the model is composed of two parts. First, we need a top-down view of things -- we already have that from our cost management solution. Second, we need to bring engineers back into the cost picture and empower them to track and manage their costs -- and we can already track resource-level costs, so this bottom-up need is already teed up to be solved.
If we can do both of these things, suddenly we have a better understanding of our overall costs than we ever did, and we do it so much faster. Imagine this: when we look at a point-in-time view of our cloud costs and have questions, 1) we can easily find the answers instead of having to spend time tracking down budget owners; and 2) we find that we’re already managing our costs pretty well because engineers are on top of it from the get-go. In this scenario, we’re never worrying about our costs, we’re still innovating quickly, and now we’ve created cost efficiency and cloud cost savings.
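To show how the two halves can share one source of truth, here is a minimal sketch in which the same resource-level cost records feed both an org-wide summary (the top-down view) and a per-team breakdown that engineers could check themselves (the bottom-up view). The team names, resource names, and costs are hypothetical, and a real cost management tool would supply this data instead of a hard-coded list.

```python
from collections import defaultdict

# Hypothetical resource-level cost records.
costs = [
    {"team": "payments", "resource": "api-cluster", "monthly_cost": 900.0},
    {"team": "payments", "resource": "ledger-db", "monthly_cost": 350.0},
    {"team": "search", "resource": "index-nodes", "monthly_cost": 1200.0},
]


def org_summary(costs):
    """Top-down view: total monthly cost per team."""
    totals = defaultdict(float)
    for row in costs:
        totals[row["team"]] += row["monthly_cost"]
    return dict(totals)


def team_view(costs, team):
    """Bottom-up view: one team's resources, most expensive first."""
    rows = [row for row in costs if row["team"] == team]
    return sorted(rows, key=lambda row: row["monthly_cost"], reverse=True)


print(org_summary(costs))
for row in team_view(costs, "payments"):
    print(f"{row['resource']}: ${row['monthly_cost']:.2f}")
```

Because both views come from the same records, a question raised in the summary can be answered by the team that owns the line items, without a separate hunt for budget owners.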
To create this model, recall that we need to have the right tools in place, and it’s a matter of getting engineers involved. Engineers will need training, of course, but we’ll also need the tooling to make their involvement possible. Hopefully our cost management solution makes this easy.
Once the engineers know what they’re responsible for and how to stay on top of it, we’re off to the races and have a new and efficient collaborative cloud cost management model in place. All pretty easy to do!
Here’s the plug: Harness Cloud Cost Management -- your comprehensive cloud cost management platform. All of the goodies we want in our cloud cost management model packed right in, and it’s built into software, so most of the heavy lifting is done already. The best part? It’s part of a CI/CD platform, so it ties right into your usual engineering process instead of adding yet another tool to the long list.
Do you want to learn more about which cloud cost optimization tools there are out there? We mapped out the top cloud cost management tools to consider.
Managing cloud costs is a big problem, but it really shouldn’t be. We conquered the data center, why can’t we conquer the cloud? Get your free trial of Harness Cloud Cost Management to make your cloud cost management easy.