
AI in Cloud Computing: Are We Building Smarter Clouds or Just More Expensive Ones?

Learn what AI in cloud computing is and whether we are building smarter clouds or just more expensive ones.

Priyanka Shaw | 15 Jun 2026 | 9 min read

In 2018, a company moving to the cloud had a straightforward purpose: to lower infrastructure costs, to increase scalability, or to eliminate server maintenance. The cloud had become a utility, much like electricity. Businesses wanted computing resources only when they needed them, and they did not want to worry about what happened behind the scenes.

Today, that same organization is no longer asking how many servers it requires. Instead, it is asking entirely different questions:

  • “Can AI foresee a system failure before it happens?”

  • “Can AI automatically optimize our cloud spending?”

  • “Can AI recognize security threats we might overlook?”

  • “Can AI tell us which decisions to make next?”

The cloud is no longer just a utility; it is fast becoming a decision-making platform. That shift may be the most important evolution in cloud computing since the cloud itself.

The Narrative Everybody Gets Wrong

The AI narrative in cloud computing is often reduced to the simplest story:

Cloud provides computing power

AI takes up that computing power

Everybody wins

The reality is more complex. AI is not just another workload that can be switched on in the cloud. It fundamentally changes how cloud environments function, how technology is managed, and how decisions are made.

The question is not whether cloud environments need humans today. The question is whether future cloud environments will require human management at all.

A Tale of Two Cloud Environments

Consider two companies. Both run a high-traffic e-commerce site that serves millions of customers each day.

Company A: Conventional Cloud Operations

A customer experiences a slow checkout page. An engineer goes through the logs to find out what happened and locates the bottleneck. The team then scales up resources accordingly, and the customer's problem is fixed. Total time to fix the customer's issue: 4 hours.

Company B: AI-Driven Cloud Operations

The AI continuously monitors the app for performance anomalies. It detects unusual latency in one of the databases and, comparing it against normal behavior, identifies a high probability of performance degradation. The AI provisions additional resources, and the bottleneck is eliminated before the customer even knows there was an issue. Total time to fix the customer's issue: 0 hours, because the customer never experienced one.
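The detection step in Company B's story can be illustrated with a very small sketch. It assumes the "AI" is nothing more than a z-score check of the latest database latency against a recent baseline; the function name, the sample values, and the threshold of 3.0 are illustrative assumptions, not a description of any vendor's actual system.

```python
from statistics import mean, stdev

def is_latency_anomaly(history_ms, current_ms, z_threshold=3.0):
    """Return True when current_ms sits far above the recent baseline.

    history_ms: recent latency samples in milliseconds (hypothetical data).
    z_threshold: how many standard deviations count as unusual (assumed).
    """
    if len(history_ms) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history_ms)
    spread = stdev(history_ms)
    if spread == 0:
        # Perfectly flat history: any increase is unusual.
        return current_ms > baseline
    z_score = (current_ms - baseline) / spread
    return z_score > z_threshold

# Hypothetical database latency samples, in milliseconds.
history = [20, 22, 19, 21, 20, 23, 21, 20]

print(is_latency_anomaly(history, 22))  # False: within normal variation
print(is_latency_anomaly(history, 95))  # True: far above the baseline
```

A production system would likely use seasonality-aware models and would trigger scaling only after further checks, but the principle is the same: compare current behavior with what is normal and act before customers notice.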

The difference is not in the infrastructure. Both companies are using the same cloud. The difference is intelligence. 

Traditional cloud         | AI-driven cloud
Reacts to problems        | Predicts problems
Human-led monitoring      | Continuous automated analysis
Fixed operational rules   | Adaptive decision-making
Post-incident response    | Pre-incident prevention
Manual optimization       | Automated optimization

This is where the disruption really begins. 

AI Is Quietly Becoming the Cloud’s Operating System

Cloud vendors have traditionally competed on three main types of offerings: storage, networking, and compute.

All three are now commodity products, so the new battleground between cloud vendors is intelligence. The major vendors increasingly differentiate themselves by answering some combination of these four questions:

  • What workloads should we optimize?

  • What resources are wasted and costing us money?

  • Which applications will fail next?

  • What security events will require immediate action?

In the past, engineers within the enterprise answered these questions; today, they are increasingly answered by AI. That may make operations far more efficient. However, it also creates risk.

The Unspoken Problem

AI makes operations more efficient. However, efficiency and accuracy are not the same thing. Think about an AI designed to reduce operational costs in a cloud environment. It detects underutilized resources and recommends that the company shut them down. This makes good sense and follows the logic of cost-cutting.

However, what happens if one of those underutilized resources supports an application that hits peak load only twice a month? The AI sees a wasteful resource that would be cost-effective to eliminate, while the business sees capacity it keeps for resiliency: two very different views of the same information.
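The gap between those two views can be made concrete with a small sketch. It assumes a resource is judged from daily CPU utilization samples; the function name, the thresholds of 10% and 70%, and the sample data are all hypothetical and chosen only to illustrate the point.

```python
def classify_resource(cpu_samples, idle_threshold=10.0, peak_threshold=70.0):
    """Classify a resource from CPU utilization samples (percentages).

    Looking only at the average flags bursty resources as waste.
    Checking the peak as well separates truly idle resources from
    ones that are quiet most of the time but critical on peak days.
    Thresholds are assumed values, not recommendations.
    """
    average = sum(cpu_samples) / len(cpu_samples)
    peak = max(cpu_samples)
    if average < idle_threshold and peak >= peak_threshold:
        return "bursty: keep, review with owner"
    if average < idle_threshold:
        return "idle: candidate for shutdown"
    return "active"

# 30 hypothetical daily readings: near-idle except two peak days.
twice_a_month = [3.0] * 28 + [85.0, 90.0]
# 30 hypothetical daily readings: never used.
never_used = [3.0] * 30

print(classify_resource(twice_a_month))
print(classify_resource(never_used))
```

Both resources have an average below 10%, so an optimizer that measures only the average would treat them identically. Only the second check, which encodes a business concern, tells them apart.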

This is a dilemma organizations rarely discuss candidly. I believe that AI optimizes for what it can measure, while an organization must optimize for what the business values. The two often do not line up.

Why AI May Make Cloud Costs Worse

The biggest promise of AI in cloud computing is significant savings through cost optimization. Yet AI workloads are themselves expensive to run, and they look very different from traditional workloads:

Traditional cloud workload   | AI-powered workload
Predictable resource usage   | Highly variable resource usage
Standard compute instances   | GPU-intensive infrastructure
Moderate storage growth      | Massive data requirements
Lower energy consumption     | Significantly higher processing demands
Easier cost forecasting      | Complex cost management

Running an AI platform requires significant resources. Training models, performing inference, working with large datasets, storing embeddings, and maintaining vector databases can all dramatically increase the cost of cloud services.

At the same time, many organizations are seeing their cloud expenditure reach an all-time high even as they adopt AI to become more efficient. The challenge is not whether AI has valuable benefits; the challenge is showing that the value AI creates exceeds what it costs.

Security: The Greatest Opportunity and the Greatest Risk

AI is one of the greatest tools for cloud security because it can analyze billions of events and flag suspicious activity, identifying anomalies that might otherwise go unnoticed by a human analyst.

However, AI systems are also attack surfaces. Organizations must now secure:

  • Cloud Infrastructure

  • Applications

  • Data

  • Machine Learning Models

  • AI APIs

  • Training Pipelines

Security teams are discovering a paradox in AI. It is a valuable tool for defending cloud environments against breaches, but at the same time, it creates entirely new risks and entry points into an organization.

The organizations that thrive in the new era of AI will be those that recognize both sides of the coin: the benefit and the risk.

Transition From Infrastructure Management to Trust Management

Historically, cloud teams focused on managing the infrastructure that stores and processes data.

The emphasis has shifted from infrastructure alone toward building trust in the results that AI generates:

  • Can we trust the recommendation?

  • Can we trust the prediction?

  • Can we trust the automated action?

  • Is it possible to understand why the AI reached its conclusion?

That shift will change how cloud teams work. Future cloud engineers will spend less effort setting up servers and more effort validating the decisions AI makes. This is not a technology issue. It is a governance issue.

How the Cloud Will Evolve over the Next 5 Years

Experts say that cloud operations will become primarily autonomous.

Imagine logging into your cloud console and seeing:

  • "Costs reduced by 17%."

  • "Vulnerability remediated."

  • "3 workloads optimized."

  • "Future outage prevented."

No tickets, no manual labor, no war rooms. Just outcomes.

The technology is moving toward this type of outcome. However, before adopting automation, companies must do their due diligence and not assume that more automation will produce better outcomes. Historically, automation has produced better results when human decision-making remains part of the process.
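One way to keep human decision-making in the process is an approval gate: low-risk actions are applied automatically, while high-risk ones wait for a person. The sketch below is a minimal illustration under stated assumptions. The `ProposedAction` type, the "low"/"high" risk labels, and the `approver` callback are all hypothetical names introduced here, not part of any cloud provider's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ProposedAction:
    description: str
    risk: str  # "low" or "high" (hypothetical labels)

def route_action(
    action: ProposedAction,
    approver: Optional[Callable[[ProposedAction], bool]] = None,
) -> str:
    """Decide what happens to an action proposed by an automated system."""
    if action.risk == "low":
        return "auto-applied"
    # High-risk actions are applied only if a human explicitly approves.
    if approver is not None and approver(action):
        return "applied after approval"
    return "held for review"

resize = ProposedAction("Resize an over-provisioned test instance", "low")
shutdown = ProposedAction("Shut down a rarely used production database", "high")

print(route_action(resize))
print(route_action(shutdown))
print(route_action(shutdown, approver=lambda a: True))
```

The design choice worth noting is the default: when no one has approved a high-risk action, it is held rather than applied, so automation speeds up routine work without removing human judgment from consequential decisions.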

The Future of AI and the Cloud

AI will not eliminate the role of cloud engineers; it will change it. The leaders of successful companies will not simply automate everything. They will use AI to create value while still applying human discernment to decide what is done with that value.

This distinction will separate companies that automate merely for automation's sake from those that apply AI with true purpose.

The primary challenge is ensuring that businesses can maintain the appropriate level of control over their AI systems while also benefiting from the improvements they provide.

Final Thoughts

AI has reshaped, and will continue to reshape, the nature of work done within the cloud. Organizations that use the cloud purely as a way to consume resources may see their usage grow without realizing any benefit from AI. Others will harness its potential to transform their overall productivity. AI-enabled innovations will drive new business opportunities as companies use data in new ways to develop products and services and to deliver superior customer experiences.

By leveraging the cloud to provide the resources and capabilities required to effectively utilize AI technologies, companies can capitalize on these new opportunities to create value for themselves and their customers.
