Today at Ignite, Microsoft is announcing the public preview of the NC H100 v5 Virtual Machine Series, the latest addition to our portfolio of purpose-built infrastructure for High Performance Computing (HPC) and Artificial Intelligence (AI) workloads.
The new Azure NC H100 v5 series is powered by NVIDIA Hopper Generation H100 NVL 94GB PCIe Tensor Core GPUs and 4th Gen AMD EPYC™ Genoa processors, delivering powerful performance and flexibility for a wide range of AI and HPC applications.
What are the benefits of NC H100 v5 VMs?
Azure NC H100 v5 VMs are designed to accelerate a broad range of AI and HPC workloads.
Azure NC H100 v5 VMs offer the following features and capabilities:
| Size | vCPU | Memory (GiB) | NVIDIA H100 NVL PCIe GPUs | HBM3 Memory Capacity | Azure Network (Gbps) |
|---|---|---|---|---|---|
| Standard_NC40ads_H100_v5 | 40 | 320 | 1 | 94 GB | 40 |
| Standard_NC80adis_H100_v5 | 80 | 640 | 2 | 188 GB | 80 |

Preliminary specification, subject to change.
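The two preview sizes scale linearly: the dual-GPU size doubles every resource of the single-GPU size, and aggregate HBM3 capacity is simply 94 GB per H100 NVL GPU. A minimal sanity-check sketch, with the values copied from the spec table above:

```python
# Preview VM sizes from the spec table above (preliminary, subject to change).
sizes = {
    "Standard_NC40ads_H100_v5":  {"vcpu": 40, "mem_gib": 320, "gpus": 1, "hbm3_gb": 94,  "net_gbps": 40},
    "Standard_NC80adis_H100_v5": {"vcpu": 80, "mem_gib": 640, "gpus": 2, "hbm3_gb": 188, "net_gbps": 80},
}

small = sizes["Standard_NC40ads_H100_v5"]
large = sizes["Standard_NC80adis_H100_v5"]

# The dual-GPU size is exactly 2x the single-GPU size in every dimension.
for key in small:
    assert large[key] == 2 * small[key], key

# Aggregate HBM3 capacity equals GPU count x 94 GB per H100 NVL GPU.
for spec in sizes.values():
    assert spec["hbm3_gb"] == spec["gpus"] * 94
```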
How do NC H100 v5 VMs compare to the previous generation?
The NC H100 v5 VMs offer significant performance improvements over the previous generation of Azure NC-series VMs.
What are the performance test results of NC H100 v5 VMs?
We have conducted initial performance tests on the NC H100 v5 VMs using several AI benchmarks and workloads. The results show that the NC H100 v5 VMs can achieve between 1.6x and 1.9x the inference performance of the previous generation on the single-GPU VM size, depending on the workload. Performance is expected to improve over time as NVIDIA releases further software optimizations:
Figure 1: Preliminary performance results of the NC H100 v5-series vs NC A100 v4-series on AI inference workloads for 1xGPU VM size.
GPT-J is a large-scale language model with 6 billion parameters, based on GPT-3 architecture, and submitted as part of MLPerf Inference v3.1 benchmark. GPT-J can generate natural and coherent text for various natural language generation tasks, such as text summarization, text completion, and text generation. GPT-J inference requires high compute, memory, and communication bandwidth to process the large amount of data and parameters involved in the model.
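To give a rough sense of why GPU memory matters for this model: at FP16 precision each parameter occupies 2 bytes, so the 6-billion-parameter GPT-J weights alone take roughly 12 GB, before activations and KV cache. A back-of-the-envelope sketch (the 94 GB figure is the per-GPU HBM3 capacity from the spec table above):

```python
# Rough FP16 memory footprint of GPT-J weights (activations and KV cache extra).
params = 6e9           # GPT-J parameter count
bytes_per_param = 2    # FP16 precision: 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9

hbm3_gb = 94           # per-GPU HBM3 capacity on the H100 NVL
assert weights_gb < hbm3_gb  # the weights fit comfortably on a single GPU

print(f"GPT-J FP16 weights: ~{weights_gb:.0f} GB of {hbm3_gb} GB HBM3")
```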
We compared the inference performance of GPT-J on the dual-GPU VM size of the Azure NC H100 v5 series against an on-premises system powered by the previous generation of NVIDIA A100 Tensor Core GPUs. Our results show that the NC H100 v5 VMs can achieve up to 2.5x performance improvements over the prior results (Figure 2).
Figure 2: Relative inference performance on the model GPT-J (6 billion parameters) from MLPerf Inference v3.1 between the Dell submission on the on-premises A100 platform (3.1-0061) and Azure on NC80adis_H100_v5 virtual machines (unverified).
How to access the preview of the Azure NC H100 v5-series
The NC H100 v5-series is currently in public preview and available in the Azure South Central US region. Availability will expand to additional regions in the coming months.
If you are interested in trying out the NC H100 v5-series, sign up for preview here: https://aka.ms/NCadsH100v5PreviewSignup
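Once approved for the preview, a VM of this series can be deployed like any other size via the Azure CLI. A sketch, assuming hypothetical resource group and VM names and the Ubuntu 22.04 image alias (substitute your own values):

```shell
# Hypothetical names; replace the resource group, VM name, and credentials with your own.
az vm create \
  --resource-group my-hpc-rg \
  --name my-nc-h100-vm \
  --location southcentralus \
  --size Standard_NC40ads_H100_v5 \
  --image Ubuntu2204 \
  --admin-username azureuser \
  --generate-ssh-keys
```

The `--location` value matches the South Central US region where the preview is available; `--size` selects the single-GPU preview size from the table above.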
Confidential computing on NVIDIA H100
Confidential computing is the protection of data in use by performing computation in hardware-based, attested Trusted Execution Environments (TEEs). These TEEs prevent unauthorized access or modification of application code and data during use. The Azure confidential computing team is excited to announce the NCC H100 v5-series Azure confidential VMs with NVIDIA H100 Tensor Core GPUs in preview. These VMs are ideal for training, fine-tuning, and serving popular open-source models, such as Stable Diffusion and its larger variants (SDXL, SSD…) and language models (Zephyr, Falcon, GPT2, MPT, Llama2, Wizard, Xwin).
For more information on the Azure NC H100 v5-series and the NCC H100 v5-series VMs, you can check out the following resources: