NVIDIA’s H100 Tensor Core GPUs set a new standard for generative AI performance in their MLPerf debut. The GPUs delivered top performance on all eight tests, including a demanding new test for generative AI, and excelled both per accelerator and at scale in massive servers running the large language models (LLMs) that power generative AI.
Unmatched scalability and at-scale performance
A notable result came from a cluster of 3,584 H100 GPUs, co-developed by startup Inflection AI and operated by cloud service provider CoreWeave, which completed a massive GPT-3-based training benchmark in less than eleven minutes. CoreWeave CTO Brian Venturo emphasized that customers see this same performance today, thanks to thousands of H100 GPUs running on the company’s fast, low-latency InfiniBand networks.
Inflection AI leveraged the exceptional performance of the H100 GPUs to build its advanced LLM, which powers its first personal AI called Pi (personal intelligence). The company plans to create personal AIs that users can interact with in simple, natural ways. Mustafa Suleyman, CEO of Inflection AI, highlighted that anyone can experience the power of a personal AI today based on their state-of-the-art large language model trained on CoreWeave’s powerful network of H100 GPUs.
The MLPerf benchmarks reaffirmed the exceptional performance of the H100 GPUs. They achieved the highest performance across all benchmarks, including large language models, recommenders, computer vision, medical imaging, and speech recognition. The H100 GPUs were also the only chips to run all eight tests, demonstrating the versatility of the NVIDIA AI platform.
The benchmarks also showcased the excellence of the H100 GPUs when running at scale. On every MLPerf test, the H100 GPUs set new performance records for AI training. Optimizations across the technology stack enabled near-linear performance scaling as submissions grew from hundreds to thousands of H100 GPUs. CoreWeave’s cloud-based performance mirrored that of NVIDIA’s AI supercomputer running in a local data center, highlighting the low-latency capabilities of NVIDIA Quantum-2 InfiniBand networking.
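Near-linear scaling means that multiplying the GPU count by some factor cuts training time by nearly the same factor. A minimal sketch of how scaling efficiency is typically computed is below; the function name and the example run times are hypothetical, not figures from the actual submissions:

```python
def scaling_efficiency(base_gpus, base_minutes, scaled_gpus, scaled_minutes):
    """Fraction of ideal (linear) speedup achieved when scaling up GPU count.

    1.0 means perfectly linear scaling; values near 1.0 are "near-linear".
    """
    actual_speedup = base_minutes / scaled_minutes   # how much faster it really ran
    ideal_speedup = scaled_gpus / base_gpus          # how much faster linear scaling predicts
    return actual_speedup / ideal_speedup

# Hypothetical example: 512 GPUs finishing in 64 minutes vs.
# 3,584 GPUs (7x more) finishing in 10.9 minutes.
eff = scaling_efficiency(512, 64.0, 3584, 10.9)
print(f"{eff:.0%}")  # prints "84%"
```

In this illustrative case, a 7x increase in GPUs yields roughly 5.9x the speed, i.e. about 84% of the ideal linear speedup.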
Broad ecosystem and industry recognition
MLPerf updated its benchmark for recommendation systems, and NVIDIA was the only company to submit results on the enhanced benchmark. The broad ecosystem of NVIDIA AI was showcased in this round, with submissions from nearly a dozen companies, including major system makers such as ASUS, Dell Technologies, GIGABYTE, Lenovo, and QCT. This level of participation reassures users that they can expect excellent performance from NVIDIA AI, whether in the cloud or in their own data centers.
NVIDIA’s commitment to performance across all workloads is evident through its participation in MLPerf. The benchmarks cover various workloads, including computer vision, translation and reinforcement learning, generative AI, and recommendation systems. Users can rely on MLPerf results to make informed buying decisions as the tests are transparent and objective, backed by a broad group of industry leaders.
Driving energy efficiency and software accessibility
The importance of energy efficiency for AI performance was also emphasized. NVIDIA GPUs in accelerated data centers enable increased efficiency by using fewer server nodes, reducing rack space and energy consumption. Accelerated networking further boosts efficiency and performance, while ongoing software optimizations bring additional gains on the same hardware. Energy-efficient performance not only benefits the environment but also helps organizations speed up the time to market and develop advanced applications.
NVIDIA AI Enterprise, the software layer of the NVIDIA AI platform, provides optimized performance on leading accelerated computing infrastructure. It offers enterprise-grade support, security, and reliability for running AI in corporate data centers. In addition, the software NVIDIA used for its MLPerf submissions is available in the MLPerf repository, so anyone can draw on these results in their own work.