An analysis of cloud computing costs
Cloud computing has brought consumption-based billing of technology into the mainstream. The variable spend model, however, is nuanced and can be complex to master. The ease of provisioning and fulfilment on the cloud has softened cost controls. In response, financial management paradigms are changing to bring cost-sensitivity into the world of software development and operations (DevOps) on the cloud. Earlier, good coders needed to worry only about maximizing algorithmic performance, but on the cloud, a developer’s choices also need to optimize operating expenses. The line between engineers and finance managers has thus blurred. Let’s understand the cost complexity problem on the cloud and discuss ways of approaching it.
Twist of Role – The Developer is also a Finance Manager: In the world of cloud, DevOps engineers manage large systems through code. Compute, storage and network resources are programmatically provisioned and de-provisioned on the go. This brings massive benefits such as software-defined operations and scalability. However, these gains are based on the premise of DevOps engineers expanding their responsibilities to include the procurement and release of underpinning infrastructure and services in real-time.
Billing on the cloud accumulates every month like utility bills and varies dynamically based on usage. This calls for tight coupling between application deployment and cloud cost control. The cheapest on-demand instance on the three major hyperscalers (Amazon Web Services or AWS, Microsoft Azure, and Google Cloud) costs under $100 a year. However, a buggy autoscaling policy or a faulty deployment pipeline can add redundant instances and multiply costs. At the time of this writing in 2022, the most expensive virtual machine on one of the hyperscalers costs about a million dollars a year. Leaving even a single such instance inadvertently turned on can force a small business into bankruptcy.
To take another example, if a system administrator provisions an object storage container and leaves it unsecured, with writes open to the public, it can attract unsolicited uploads of petabytes of data that erode your budget.
Spend control levers are thus shifting from siloed policies set by procurement and finance departments to code written by development teams and Site Reliability Engineers (SREs). In parallel, the cloud run rates of businesses are increasing steadily; it is no longer uncommon to see monthly cloud spends in the millions of dollars. This means that you might be entrusting your corporate fiscal health to your developer or SRE when you assign them technical deliverables. In this pay-as-you-use model, the developer thus also performs some functions of the finance manager; the resulting practice is called FinOps.
Cost Convolution with the Variable Spend Model: Comprehending the cloud rate card can itself be an onerous task. If you are developing a cloud-native India-scale application with a peak concurrent user load in the double-digit millions, how can you model optimal cloud infrastructure costs?
Let’s start with compute. In addition to classical on-demand instances, cloud providers also offer a rental construct variously called Reserved Instances or Committed Use Instances. The commitment cannot be released until the end of the agreed duration, so the capacity is paid for whether or not it is used. But if high utilization can be consistently maintained, you will typically see a discount of ~40% over on-demand instances for a 1-year commitment and ~60% for a 3-year purchase obligation. Services that need to be always on, such as database instances, could advantageously reside on reserved compute.
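The trade-off between on-demand and reserved compute can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a simplified model in which a reservation is billed for every hour of the year at the discounted rate while on-demand is billed only for hours actually used, and the function names and the hourly rate are hypothetical rather than taken from any provider's rate card.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def breakeven_utilization(discount: float) -> float:
    """Fraction of the year an instance must run before a reservation pays off.

    Assumes the reservation is billed for all hours at (1 - discount) times
    the on-demand rate, so break-even occurs when usage equals (1 - discount).
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in the range [0, 1)")
    return 1.0 - discount


def cheaper_option(hourly_rate: float, hours_used: float, discount: float) -> str:
    """Compare one year of on-demand spend with a one-year reservation."""
    on_demand_cost = hourly_rate * hours_used
    reserved_cost = hourly_rate * (1.0 - discount) * HOURS_PER_YEAR
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


if __name__ == "__main__":
    # Hypothetical rate of $0.10 per hour and the ~40% discount cited above.
    print(breakeven_utilization(0.40))
    print(cheaper_option(0.10, hours_used=8760, discount=0.40))  # always-on database
    print(cheaper_option(0.10, hours_used=2000, discount=0.40))  # occasional batch job
```

Under these assumptions, a ~40% discount only pays off if the instance runs for more than about 60% of the year, which is why always-on services are the natural candidates for reservation.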
Cloud providers also offer Spot Instances or Preemptible Instances that get activated and deactivated based on the customer’s bid price relative to the current on-demand compute economics. If parts of your application can be designed to live with the uncertainties associated with preemptible spot compute, you can see savings of up to 90% on your spend. You could also take advantage of auto-scaling policies and serverless computing to further optimize your compute cost model.
Or take the case of storage services. Cloud storage usually has complex billing rules with many variables in the pricing equation. The incurred charge is based on the chosen tier, which is a function of the frequency of access, the speed of retrieval, the desired degree of replication and similar factors. It may also depend on volume; the first slab (of pre-determined size) used during a month might cost more per gigabyte than the next.
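Slab-based pricing of this kind is easy to misjudge by multiplying total volume by a single rate. A minimal sketch of the calculation follows; the slab sizes and per-gigabyte prices are invented for illustration and do not correspond to any real provider's tariff.

```python
def tiered_storage_cost(gb_stored, tiers):
    """Monthly charge when each successive slab has its own per-GB price.

    `tiers` is an ordered list of (slab_size_gb, price_per_gb) pairs.
    A slab size of None means "everything above the previous slabs".
    """
    if gb_stored < 0:
        raise ValueError("gb_stored must be non-negative")
    remaining = gb_stored
    total = 0.0
    for slab_size, price_per_gb in tiers:
        if remaining <= 0:
            break
        # Bill only the part of the volume that falls inside this slab.
        billed = remaining if slab_size is None else min(remaining, slab_size)
        total += billed * price_per_gb
        remaining -= billed
    if remaining > 0:
        raise ValueError("tiers do not cover the stored volume")
    return total


# Hypothetical rate card: first 1,000 GB, next 9,000 GB, then the rest.
EXAMPLE_TIERS = [(1_000, 0.03), (9_000, 0.02), (None, 0.01)]

if __name__ == "__main__":
    # 1,500 GB spans the first slab fully and 500 GB of the second.
    print(tiered_storage_cost(1_500, EXAMPLE_TIERS))
```

A real bill would add further terms for the access tier, retrieval, replication and requests, so this covers only the volume component described above.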
Most cloud resources are priced based on the selected “t-shirt size”. You can, for example, choose different network gateway sizes based on requirements of bandwidth and uptime guarantees. There are also charges associated with services such as monitoring, logging and security controls.
Innovating with Cost Awareness: To obtain the expected return on investment with cloud, it is thus key to adopt principles of FinOps and build guard rails so that innovation becomes cost-aware.
Most clouds natively offer sophisticated tools that can monitor spend against set budgets and generate billing alarms when a threshold is breached. Some hyperscalers also offer virtual advisors that can analyse a cloud deployment, decipher spending patterns using AI models, and recommend actions to optimize cost. Several vendors specialize in cost visualization tools that offer past spend observability and predict future charges.
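The logic behind a budget alarm is simple enough to sketch. The example below is not any provider's API; the function names, the threshold levels and the straight-line forecast are all assumptions chosen to show the idea of comparing actual and projected spend against a set budget.

```python
def breached_thresholds(spend_to_date, budget, thresholds=(0.5, 0.8, 1.0)):
    """Return the budget fractions that current spend has reached or passed."""
    return [t for t in thresholds if spend_to_date >= t * budget]


def forecast_month_end(spend_to_date, day_of_month, days_in_month):
    """Naive straight-line projection of the month's final bill.

    Assumes spend continues at the average daily rate seen so far; commercial
    tools use richer models, so treat this as a first approximation.
    """
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must lie within the month")
    return spend_to_date / day_of_month * days_in_month


if __name__ == "__main__":
    budget = 1_000.0  # hypothetical monthly budget in dollars
    spend = 850.0     # hypothetical spend on day 20 of a 30-day month
    print(breached_thresholds(spend, budget))
    projected = forecast_month_end(spend, day_of_month=20, days_in_month=30)
    if projected > budget:
        print(f"Projected overrun: {projected - budget:.2f}")
```

Alerting on the projection as well as on actual spend gives a team time to react before the budget is breached rather than after.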
Cost exploration tools available in the market can detect anomalies in your spend trends and recommend how to resize cloud environments based on observed compute and memory utilization. They can, for example, identify under-utilized servers and suggest new tiers better suited to usage. This may include recommendations to adjust server configurations, move on-demand instances to reserved servers, or automatically stop development-test servers during periods of lean usage such as weekends. Cost-conscious DevOps responds to the information and advice generated by such tools.
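Two of these recommendations can be illustrated in a few lines. The sketch below assumes utilization figures have already been collected as fractions between 0 and 1; the `ServerStats` type, the server names and the 20% and 30% cut-offs are hypothetical choices, not values used by any particular tool.

```python
from dataclasses import dataclass

HOURS_PER_WEEK = 168
WEEKEND_HOURS = 48


@dataclass
class ServerStats:
    name: str
    avg_cpu: float  # average CPU utilization, 0.0 to 1.0
    avg_mem: float  # average memory utilization, 0.0 to 1.0


def underutilized(servers, cpu_threshold=0.20, mem_threshold=0.30):
    """Names of servers whose CPU and memory both sit below the cut-offs."""
    return [
        s.name
        for s in servers
        if s.avg_cpu < cpu_threshold and s.avg_mem < mem_threshold
    ]


def weekend_stop_saving_fraction():
    """Share of weekly on-demand hours avoided by stopping a server all weekend."""
    return WEEKEND_HOURS / HOURS_PER_WEEK


if __name__ == "__main__":
    fleet = [
        ServerStats("web-1", avg_cpu=0.65, avg_mem=0.70),
        ServerStats("test-1", avg_cpu=0.05, avg_mem=0.10),
        ServerStats("batch-1", avg_cpu=0.10, avg_mem=0.80),
    ]
    print(underutilized(fleet))
    print(f"{weekend_stop_saving_fraction():.1%}")
```

Requiring both metrics to be low avoids downsizing a server that is idle on CPU but memory-bound, and stopping a development-test server for the whole weekend removes 48 of 168 weekly hours, a little under 29% of its on-demand compute charge.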
To harness disruption, innovate, and digitize your business models using the power of cloud computing, real-time cost-awareness has to be built into technical solution development. This calls for adopting FinOps, a combination of new practices and a culture shift.
Sreekrishnan Venkateswaran, CTO, Kyndryl India