Artificial Intelligence (AI) has continued to reshape the technological landscape. As we enter 2024, experts and AI agents are offering their predictions on the trends that will define AI networking in the next 12 months.
Exponential growth in AI workloads
In a world where AI applications like ChatGPT, Bard, and xAI’s Grok have become commonplace, the hunger for larger, more powerful AI models persists. The exponential growth in computing power has enabled large-scale AI model training, and the demand for bigger and better models shows no signs of slowing down. Hyperscalers are now challenged to support even larger workloads with clusters of thousands of GPUs.
This growth is driven by the development of new AI algorithms and the widespread adoption of AI applications across various industries. Consequently, AI workloads are expanding, and the GPU clusters supporting them are growing. The efficient use of these clusters and the successful training of AI models heavily depend on the underlying architecture and network connectivity.
Open networking: A paradigm shift
Hyperscalers have already embraced open and disaggregated networking solutions in their data centers. The rationale behind this shift is clear: monolithic and proprietary networking solutions can’t deliver the scalability, flexibility, and cost-effectiveness required for managing large-scale compute resources.
Proprietary networking solutions have long been suitable for High-Performance Computing (HPC), but they tend to stifle innovation and drive up costs due to a lack of competition. On the other hand, open and standard networking solutions are essential for the growth of the AI ecosystem. They enable cost-effective infrastructure for high-scale workloads, fostering the proliferation of Large Language Models (LLMs) and enabling new applications to flourish.
The Ultra Ethernet Consortium (UEC) is set to play a pivotal role in this transformation, promoting a move from proprietary interconnects to a standardized, Ethernet-based model for open AI networking. Ethernet adoption for AI back-end networking is predicted to expand significantly in 2024.
Edge computing and distributed architecture
While large back-end clusters excel at handling complex tasks and training extensive AI models, the trend in 2024 is to move computing power closer to applications, enhancing user experiences, especially in scenarios requiring rapid decision-making. While a fully distributed AI workload might not be realized this year, the momentum toward edge computing continues to grow.
This shift necessitates more frequent interconnections between front-end and back-end networks. However, it also highlights a pressing networking issue: the inconsistency in connectivity protocols between these two network segments. To streamline network management and potentially boost overall performance, the industry is beginning to take steps toward unified networking solutions through initiatives like the UEC.
Sustainable and energy-efficient networking
As AI workloads intensify, particularly those involving thousands of GPUs, substantial power consumption becomes a major concern. Although the energy impact of networking is lower than that of computation, it’s a concern that needs to be addressed. Moreover, the carbon footprint remains a key issue regardless of scale.
In response, new AI networking solutions are expected to emphasize energy efficiency more. This includes adopting energy-efficient hardware and aligning with the principles of the circular economy to promote sustainability. Additionally, advanced software designed to enhance resource utilization is anticipated to gain prominence.
AIOps for network operations
AIOps is already making its mark in the networking world, with several vendors implementing it to improve network operations. In 2024, increased investment in AIOps tools is expected to significantly improve the efficiency of network operations, reshaping the networking landscape.
Powered by AI, predictive analytics and real-time anomaly detection can play a pivotal role in resolving potential network issues and improving reliability. As AI networking evolves, high-performance connectivity is poised to improve substantially through the integration of AIOps.
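To make the anomaly-detection idea concrete, here is a minimal sketch of how an AIOps pipeline might flag unusual link latency in real time, using a rolling-window z-score. The class name, window size, and threshold are illustrative assumptions, not a description of any vendor's product; production systems typically use far more sophisticated models.

```python
from collections import deque
import math

class LatencyAnomalyDetector:
    """Flags latency samples that deviate sharply from recent history.

    A rolling z-score is one of the simplest real-time anomaly tests;
    the window size and threshold here are illustrative values only.
    """

    def __init__(self, window: int = 64, threshold: float = 3.0):
        self.samples = deque(maxlen=window)  # recent latency history
        self.threshold = threshold           # z-score cutoff

    def observe(self, latency_us: float) -> bool:
        """Record a sample; return True if it looks anomalous."""
        anomalous = False
        if len(self.samples) >= 8:  # wait for some history first
            mean = sum(self.samples) / len(self.samples)
            var = sum((x - mean) ** 2 for x in self.samples) / len(self.samples)
            std = math.sqrt(var)
            if std > 0 and abs(latency_us - mean) / std > self.threshold:
                anomalous = True
        self.samples.append(latency_us)
        return anomalous

# Example: steady ~100 µs link latency, then a sudden spike
detector = LatencyAnomalyDetector()
baseline = [100.0 + (i % 5) * 0.5 for i in range(50)]
flags = [detector.observe(s) for s in baseline]
print(any(flags))               # False: normal traffic is not flagged
print(detector.observe(500.0))  # True: the latency spike is flagged
```

In practice the alert would feed an automated remediation or ticketing workflow rather than a print statement, which is where the operational gains of AIOps come from.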