As we step into 2024, the goal is not to play the visionary but to engage in a strategic exercise: anticipating market directions and making informed decisions about positioning and learning.
In this spirit, I present three key predictions for the cloud technology landscape in 2024:
The emerging discipline of LLM Cost Management as Generative AI becomes increasingly embedded in applications,
The expansion of Cloud FinOps beyond the Public Cloud to integrate Private Clouds and Hybrid infrastructure, along with PaaS and SaaS cost management,
And the rising global traction of Cloud Sustainability and GreenOps, especially in the wake of new EU regulations.
These trends are not mere buzzwords but pivotal elements that, I believe, will shape cloud strategy in 2024, emphasizing responsible and efficient resource use aligned with broader business and environmental objectives.
1. Emerging Discipline of LLM Cost Management: Balancing Performance and Price in Generative AI
Gartner predicts that by 2026, more than 80% of enterprises will have used GenAI APIs and models and/or deployed GenAI-enabled applications in production environments, up from less than 5% in early 2023.
The current challenges in managing cloud costs for generative AI and LLMs are already significant. These include the high computational demands of training state-of-the-art models, leading to substantial cloud infrastructure costs. Large-scale data storage for these models incurs additional expenses, as does the ongoing need to manage and adapt to new data. The intensive use of high-cost GPUs and TPUs for model training and inference further escalates costs. Moreover, scalability issues, where balancing resource allocation with model performance is crucial, add to the complexity.
As generative AI gains broader adoption across industries, these cost implications are expected to become more pronounced in 2024, affecting a wider range of organizations beyond those currently developing and training these models. The increasing demand, expanding use cases, and compute-intensive nature of these AI applications will likely drive up cloud costs. Monitoring and controlling the costs of LLM training and inference will become crucial.
Estimating and forecasting LLM costs is a complex challenge due to variations in tokenization among different models. For example, GPT-3's Byte-Pair Encoding and BERT's WordPiece approach result in different token structures and lengths, affecting cost calculations. This is particularly relevant for multilingual content, where a single English word might be one token, but its equivalent in another language could be multiple tokens.
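To make the tokenization effect concrete, here is a minimal sketch of a per-request cost estimate. Everything in it is an assumption for illustration: the helper names `tokens_for` and `estimate_cost`, the tokens-per-word ratios, and the prices are placeholders, not any vendor's actual tokenizer behavior or rates.

```python
import math


def tokens_for(word_count: int, tokens_per_word: float) -> int:
    """Approximate a token count from a word count and an assumed ratio."""
    return math.ceil(word_count * tokens_per_word)


def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost of one request when input and output tokens are priced separately."""
    return (input_tokens / 1000) * price_in_per_1k \
        + (output_tokens / 1000) * price_out_per_1k


# Hypothetical tokens-per-word ratios for the same content in two languages.
RATIOS = {"english": 1.5, "other_language": 2.5}

# Hypothetical prices in USD per 1,000 tokens.
PRICE_IN, PRICE_OUT = 0.01, 0.03

for language, ratio in RATIOS.items():
    prompt_tokens = tokens_for(500, ratio)   # a 500-word prompt
    answer_tokens = tokens_for(200, ratio)   # a 200-word answer
    cost = estimate_cost(prompt_tokens, answer_tokens, PRICE_IN, PRICE_OUT)
    print(f"{language}: {prompt_tokens + answer_tokens} tokens, "
          f"~${cost:.4f} per request")
```

Under these assumed numbers, the same 700 words cost noticeably more in the language that tokenizes less compactly, which is why forecasts built on word counts alone tend to drift from the actual bill.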
Beyond tokenization, key questions to consider when evaluating the costs of LLMs include:
Whether to use a proprietary model like GPT-4 or an open-source LLM from Mistral or the Hugging Face library,
The choice of platform for training, e.g. Google Vertex AI vs. Amazon Bedrock,
And deciding between training a smaller LLM from scratch or adapting a larger model using fine-tuning techniques such as LoRA.
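As a rough illustration of why LoRA-style fine-tuning changes the cost picture, the sketch below compares trainable parameter counts for a single weight matrix. LoRA keeps the original d x k matrix frozen and trains two low-rank factors of shapes d x r and r x k. The dimensions and rank are hypothetical, and the function names are made up for this example. Real training cost also depends on which layers are adapted, optimizer state, and activation memory.

```python
def full_finetune_params(d: int, k: int) -> int:
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for LoRA: factors B (d x r) and A (r x k)."""
    return r * (d + k)


# Hypothetical layer size and LoRA rank, chosen only for illustration.
d = k = 4096
r = 8

full = full_finetune_params(d, k)
lora = lora_params(d, k, r)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (r={r}): {lora:,} trainable parameters "
      f"({lora / full:.2%} of full)")
```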
2. Expanding Horizons of Cloud FinOps: Integrating Private and Hybrid Clouds with PaaS Cost Management
As FinOps methodologies gain momentum, companies are applying them not just to public clouds but also to private and hybrid architectures, seeking to optimize their overall cloud spend.
The FinOps Open Cost & Usage Specification (FOCUS™) supports this trend by standardizing cloud cost data across all major cloud service providers, enabling a clearer visualization and management of costs. This evolution reflects a broader trend where cloud costs, sustainability, and efficiency are becoming key drivers in IT investment decisions across various sectors.
Moreover, as the usage of PaaS solutions like Databricks and Snowflake expands, FinOps platforms are increasingly including these in their monitoring scope.
Simplifying and unifying billing data across multi-cloud environments, FOCUS™ aids organizations in managing costs more effectively, enhancing transparency, and fostering a culture of accountability and optimization in cloud financial management.
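The idea of a common billing schema can be sketched in a few lines. The example below maps rows from two imaginary providers onto shared column names and then aggregates cost per service. The target names `ServiceName`, `BilledCost`, and `ProviderName` follow FOCUS-style naming, but the specification itself is the authoritative source for column definitions. The source-side field names and all amounts are simplified placeholders, not any provider's actual export format.

```python
from collections import defaultdict

# Hypothetical, simplified billing rows from two providers.
PROVIDER_A_ROWS = [
    {"svc": "Compute", "cost_usd": 120.0},
    {"svc": "Storage", "cost_usd": 30.5},
]
PROVIDER_B_ROWS = [
    {"product": "Compute", "amount": 80.0},
    {"product": "Data Warehouse", "amount": 210.25},
]

# Per-provider mapping from source field to the shared column name.
FIELD_MAPS = {
    "provider_a": {"svc": "ServiceName", "cost_usd": "BilledCost"},
    "provider_b": {"product": "ServiceName", "amount": "BilledCost"},
}


def normalize(rows, provider):
    """Rename provider-specific fields to the shared schema and tag the provider."""
    mapping = FIELD_MAPS[provider]
    normalized = []
    for row in rows:
        record = {target: row[source] for source, target in mapping.items()}
        record["ProviderName"] = provider
        normalized.append(record)
    return normalized


def cost_by_service(records):
    """Sum billed cost per service across all providers."""
    totals = defaultdict(float)
    for record in records:
        totals[record["ServiceName"]] += record["BilledCost"]
    return dict(totals)


unified = (normalize(PROVIDER_A_ROWS, "provider_a")
           + normalize(PROVIDER_B_ROWS, "provider_b"))

for service, total in sorted(cost_by_service(unified).items()):
    print(f"{service}: {total:.2f}")
```

Once every source lands in the same columns, a cross-provider view such as total compute spend becomes a simple aggregation rather than a per-provider reporting exercise.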
3. Cloud Sustainability and GreenOps Gaining Global Traction
The demand for green data centre infrastructure, especially in Europe, is set to surge, reflecting a growing global concern for cloud sustainability. This trend is bolstered by the European Union’s Corporate Sustainability Reporting Directive (CSRD), which requires large companies to report their greenhouse gas emissions starting in 2025. This regulation will push organizations to report more thoroughly on the carbon footprint of their IT operations, including the cloud, making sustainability a critical aspect of how they run. Nearly 50,000 companies in Europe, along with more than 10,000 non-EU companies and their European subsidiaries, will be impacted by this directive.
This requirement is expected to be a compelling event that will shift cloud sustainability from the current minority of sustainability enthusiasts to a broader audience of early adopters and eventually to the majority. Public cloud providers are anticipated to introduce more tools for carbon monitoring, energy management, and circular economy capabilities.
Additionally, Managed Service Providers (MSPs) are preparing to offer cloud-based sustainability services, emphasizing the integration of sustainability considerations into cloud services. This aligns with the growing awareness of energy consumption risks associated with AI applications and underscores the need for sustainable Cloud practices.
These insights paint a picture of a cloud technology landscape in 2024 where cost optimization, FinOps evolution, and sustainability are not just buzzwords but critical components of cloud strategy.
These trends suggest a shift towards more responsible and efficient use of cloud resources, aligning with broader business and environmental goals.
I encourage everyone to engage in this forward-thinking exercise. Share your own predictions for 2024 in the comments below. Let's collaborate and discuss how these emerging trends might shape our strategies and decisions in the cloud domain. Your insights are valuable in painting a comprehensive picture of what the future holds.