Managing Cloud Spend in SaaS: 7 overlooked opinions

Cloud spend is a critical part of modern business operations, yet several useful opinions about managing it are routinely overlooked. They deserve a fresh breakdown.

Startups may begin with ample cloud credits and funding, but as customer and business data grow, so do cloud usage and cost. Our partner discussions indicate that ‘cloud cost’ will become a key talking point in 2023 for many late-stage startups, especially in SaaS.

As SaaS startups grow, managing cloud spend becomes a crucial task because of their unique financial and operational needs: they operate on a subscription-based model, so maintaining a comfortable gross margin is essential for profitability. And as cloud usage grows more complex, managing that spend effectively becomes harder.

Below are seven overlooked opinions on cloud spend, especially relevant for SaaS companies, that could make a big impact. Let's get started!

Find common ground: a North Star metric

Given the elastic nature of the cloud, it is important to agree on a shared “North Star” metric—such as the cost of a single transaction—that can be understood and utilized by business, tech, and finance teams to make decisions. The North Star metric can vary based on the business and its goals and can be expressed as cost per user, user activity, or page views, but for the purpose of simplicity, we'll use ‘cost of a single transaction’.

For instance, in my experience working for a CRM SaaS company, we used ‘retail transactions’ as a key metric. Instead of dividing cloud cost by total transactions alone, we split it into two parts: a) active cost and b) holding cost.

The cost of each transaction hitting the system in real time is the active cost per transaction, say $0.10/txn, while the cost of retaining a transaction for analytics is the holding cost per transaction, say $0.001/txn per month of retention. This split closely reflected the nature of our systems, as new transactions (real-time processing) and historical transactions (analytical processing) were handled by separate systems.

This enabled our sales teams to price new accounts based on:

  • Expected monthly transactions, and
  • Transaction retention months on which analytics is needed 

This also enabled finance teams to calculate gross margins per account. 
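The pricing logic above can be sketched as a small calculation. This is our illustration, not the company's actual model; the rates are the example figures from the text, and the function names are ours:

```python
# Sketch of per-account cost and gross margin using the two-part
# transaction cost model described above. Rates are the illustrative
# figures from the text, not real benchmarks.

ACTIVE_COST_PER_TXN = 0.10    # real-time processing cost per transaction
HOLDING_COST_PER_TXN = 0.001  # monthly retention cost per stored transaction

def monthly_cloud_cost(monthly_txns: int, retention_months: int) -> float:
    """Active cost for new transactions plus holding cost for the
    transactions retained for analytics."""
    active = monthly_txns * ACTIVE_COST_PER_TXN
    holding = monthly_txns * retention_months * HOLDING_COST_PER_TXN
    return active + holding

def gross_margin(monthly_price: float, monthly_txns: int, retention_months: int) -> float:
    """Gross margin fraction for an account at a given subscription price."""
    cost = monthly_cloud_cost(monthly_txns, retention_months)
    return (monthly_price - cost) / monthly_price

# An account doing 100k txns/month with 12 months of retention costs
# roughly 100,000 * $0.10 + 100,000 * 12 * $0.001 ≈ $11,200/month.
```

With this in hand, sales can quote a price that protects the target margin, and finance can track realized margin per account as usage drifts.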

Tech teams were also given OKRs to reduce both active cost and holding cost per transaction. This gave every team a clear picture of the cost associated with each transaction and helped them make informed decisions.

Tech teams often avoid this conversation because some systems are not directly tied to transactions, and that is normal. An approximate model is sufficient, and multiple parameters can be included as long as they align with the cost model.

Consider private cloud deployment if you move up-market

A single multi-tenant deployment shared across all customers is how SaaS startups create agility. That's true, but there is still a case for private cloud deployments.

Here’s why:

  • As SaaS startups move up-market and sign accounts with high transaction volumes, mixing a high-volume account's workload with low-volume accounts can make your cloud comparatively less efficient. For instance, a customer may request over-provisioning (and pay extra for it), tighter response times, or a different disaster recovery strategy. Accommodating these in a shared deployment increases cost and complexity for everyone.
  • Even optimal resource utilization won't stop cloud costs from rising in proportion to transaction volume; that is how cloud pricing is structured. But not all SaaS startups price the way cloud providers bill: you may charge more for high-transaction accounts, yet offer enterprise deals or volume discounts. The mismatch between your pricing model and the cloud provider's billing model can erode your gross margins.

In these scenarios, deploying a private cloud in the customer's own account may be better: it offloads cloud spend to the customer and can provide the level of service and performance that high-end customers expect. It can also give early-stage SaaS startups an edge over large enterprises, since data stays within the customer's cloud, which can be a key selling point for large companies concerned about data security and compliance.

However, it's an overlooked approach because it comes with higher maintenance costs for tech teams managing multiple environments. Nonetheless, modern DevOps tools and practices have come a long way to make it manageable.

Keep cloud optionality within striking distance

One reason to consider cloud optionality is the potential for private cloud deployments, where the customer may have a preference for a particular cloud. Other reasons include:

  • Lower costs in a specific cloud region,
  • The possibility of future cloud partnerships, and
  • Creating leverage by avoiding dependence on a single cloud provider.

Cloud optionality does not mean deploying on multiple clouds; it means staying within striking distance of a switch, maintaining the ability to move to another cloud if needed. This position makes SaaS startups more adaptable to market changes and better placed to reduce costs and improve performance.

Also, cloud optionality does not mean avoiding managed services altogether; it means preferring "blue-collar" services that have equivalents on other clouds. For example, Amazon Aurora (PostgreSQL-compatible edition) is a relational database service that is fully compatible with the PostgreSQL engine and offers simpler lifecycle management. Using Aurora spares you from operating Postgres on your own, while retaining the option to switch to another cloud provider if needed. In this way, you can be both "cloud agnostic" and "cloud native" at the same time.
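To see what the "blue-collar" choice buys you: an application that speaks plain PostgreSQL needs only a different connection string to move between Aurora and any other Postgres, managed or self-hosted. A minimal sketch (the hostnames are hypothetical placeholders):

```python
# Because Aurora is wire-compatible with PostgreSQL, switching providers
# is a configuration change, not a code change. Hostnames below are
# invented placeholders, not real endpoints.

def postgres_dsn(host: str, db: str, user: str, port: int = 5432) -> str:
    """Build a standard libpq-style connection URI; application code
    using it is identical regardless of who hosts Postgres."""
    return f"postgresql://{user}@{host}:{port}/{db}"

# Same application, different provider: only the host changes.
aurora_dsn = postgres_dsn("mycluster.cluster-abc.us-east-1.rds.amazonaws.com", "app", "svc")
other_dsn  = postgres_dsn("10.0.0.5", "app", "svc")  # e.g., a self-managed Postgres
```

The switching cost is then mostly operational (migration, testing), not a rewrite.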

Decentralize cloud cost governance

Decentralized cost governance for SaaS startups is a less popular but potentially more sustainable option than centralized war room-based cost governance. The popular theory is that developers don't care about cloud costs, but that's not true! Developers work in small squads and rarely have visibility into the overall cloud cost, so they cannot act on it.

To make decentralized cost governance work, two things are necessary:

  • Fine-grained cost attribution to the services for which developers are responsible, and
  • Treating attributed cost as a first-class metric in developer sprints, on par with availability and performance 

With decentralized cost governance, developers have greater ownership and accountability for their systems’ costs and are empowered to make decisions that can create new cost optimization opportunities.
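Fine-grained attribution usually rides on resource tags. As an illustration (the squad names, services, and figures are all invented), tagged billing line items can be rolled up per owning squad:

```python
# Sketch: roll up tagged billing line items to the squad that owns each
# service, so every team sees exactly the cost it is accountable for.
# The records stand in for a cloud billing export; values are invented.
from collections import defaultdict

billing_lines = [
    {"service": "checkout-api", "squad": "payments",  "cost": 412.50},
    {"service": "ledger-db",    "squad": "payments",  "cost": 980.00},
    {"service": "search-index", "squad": "discovery", "cost": 310.25},
]

def cost_by_squad(lines: list[dict]) -> dict[str, float]:
    """Aggregate line-item cost by the owning squad tag."""
    totals: dict[str, float] = defaultdict(float)
    for line in lines:
        totals[line["squad"]] += line["cost"]
    return dict(totals)

# payments: 1392.50, discovery: 310.25
```

Once each squad has its own number, that number can sit in the sprint board next to availability and latency.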

On the other hand, centralized war room-based cost governance can be effective in the short term but may not be sustainable in the long term. This approach is often dependent on a small team of experts who are responsible for managing costs. However, this can lead to a lack of ownership and accountability among other teams, which can make it more difficult to achieve cost savings in the long run.

Document service resource requirements

Teams often implement various tools to analyze cloud bills and work backward to the services consuming the most resources. However, they rarely document each service's resource requirements formally, which leads to missed cost optimization opportunities.

To address this, document the requirements for each service, including:

  • Whether the service must always be running or can run as a job,
  • Whether the service can sustain restarts (to balance on-demand and spot instances), and
  • How the application needs to scale.
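The checklist above lends itself to a machine-readable record per service. A sketch (the field names and example services are ours, not a standard):

```python
# Sketch: capture each service's resource requirements as structured
# data so they can be reviewed alongside cost dashboards.
# Field names and example values are illustrative, not a standard.
from dataclasses import dataclass

@dataclass
class ServiceRequirements:
    name: str
    always_on: bool          # must run continuously, or can it be a batch job?
    restart_tolerant: bool   # can it survive restarts (spot-instance eligible)?
    scaling: str             # e.g., "horizontal", "vertical", "none"

def spot_candidates(services: list[ServiceRequirements]) -> list[str]:
    """Services that tolerate restarts are candidates for spot instances."""
    return [s.name for s in services if s.restart_tolerant]

catalog = [
    ServiceRequirements("report-generator", always_on=False, restart_tolerant=True,  scaling="horizontal"),
    ServiceRequirements("billing-db",       always_on=True,  restart_tolerant=False, scaling="vertical"),
]
```

Simple queries over such a catalog (spot candidates, jobs that could stop running continuously) surface optimizations a bill alone won't show.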

By formally documenting and analyzing these requirements, organizations can gain a deeper understanding of their cost baseline and identify cost optimization opportunities that may not be apparent from cost dashboards alone. This can help organizations make more informed decisions about how to allocate resources and optimize costs.

That said, this requires documenting and analyzing resource requirements systematically and consistently. It takes initial effort but offers long-term benefits in cost savings and improved efficiency.

Focus on low-risk, non-production environments first

In SaaS startups, non-production costs are usually attributed to the technology teams, while production costs are attributed to P&Ls. The latter is typically larger and more directly tied to the business. Optimizing production costs makes sense, but it involves risks that need careful planning; low-hanging cost optimizations can start in non-production environments, which are less risky.

Some ideas for cost optimization in non-production environments include:

  • Running everything on spot instances,
  • Constraining resource requests,
  • Keeping separate environments for functional and load testing, and
  • Switching off environments when unused (e.g., on weekends)
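The last idea reduces to a simple schedule check that a shutdown job can apply to non-production resources. A minimal sketch (the weekends-off policy mirrors the text; extend it with nightly windows or holidays as needed):

```python
# Sketch: decide whether a non-production environment should be up at a
# given time. A scheduled job could call this and stop/start resources
# accordingly. The policy (off on weekends) is illustrative.
from datetime import datetime

def should_be_running(now: datetime) -> bool:
    """Non-prod stays off on weekends (Monday=0 ... Sunday=6)."""
    return now.weekday() < 5
```

The same predicate can gate a cloud API call (stop instances, scale a cluster to zero) in whatever scheduler you already run.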

It's also important to have a clear and cost-effective local development environment. 

Non-production infrastructure can cost up to 20% of production infrastructure (for a typical SaaS company with around $1M of cloud spend), and there are generally quick ways to bring it below 8%.

By focusing initially on cost optimization in non-production environments, you can gain immediate benefits without introducing unnecessary risks to production systems.

Apply budget constraints on Data Workloads

In the last decade, SaaS startups have benefited from the data revolution and have added intelligence to their offerings. Modern data processing systems such as Apache Spark and Snowflake have made this possible with sub-second querying capabilities, but at the cost of higher resource usage than traditional warehouses. The shift has moved infrastructure from constrained systems to unconstrained ones that can handle ad-hoc queries. It's now common for a company's data workload costs to exceed 50% of its total cloud spend.

While we don't want to go back to legacy-constrained systems, it is important to find a middle ground and create a bounded approach.

Luckily, there is a common pattern in building data products: they usually start as experimental queries in notebooks and are eventually productized if the ROI is significant. We propose that cloud cost should also factor into deciding what gets productized next, with the budget acting as the trigger.

Experimental workloads are often the most expensive, since productization, by definition, means doing the same work cost-optimally. So allocating budgets and choosing when to productize keeps costs under control.
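A sketch of the budget gate (the budget figure, query names, and costs are all invented): track experimental spend per query, and when the allocated budget is exhausted, surface the costliest queries as productization candidates.

```python
# Sketch: gate experimental data-workload spend behind a budget and
# surface the costliest recurring queries as productization candidates.
# Budget figure and per-query costs are invented for illustration.

def over_budget(spend_by_query: dict[str, float], budget: float) -> bool:
    """True when total experimental spend has exhausted the budget."""
    return sum(spend_by_query.values()) > budget

def productization_candidates(spend_by_query: dict[str, float], top_n: int = 2) -> list[str]:
    """Most expensive experimental queries first: the best ROI for
    turning into pre-computed, cost-optimized datasets."""
    return sorted(spend_by_query, key=spend_by_query.get, reverse=True)[:top_n]

monthly_spend = {"churn-exploration": 4200.0, "ad-hoc-funnel": 900.0, "cohort-deepdive": 2600.0}
```

When `over_budget` fires, the top candidates move into the productization queue instead of continuing to run as raw queries.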

The goal of productization is to create a set of pre-calculated reports, dashboards, or datasets that can be accessed quickly without much variation, reducing the need for raw queries and controlling costs. This process should be monitored to ensure costs stay within the allocated budget, e.g., 20% of data-workload spend.




Get Your Facets Developer Control Plane

Consult our experts for your DevOps needs by booking a demo