Gen AI: The next S curve for the semiconductor industry?

As generative AI (gen AI) applications such as ChatGPT take the world by storm, businesses are figuring out how to capture their tremendous value. We expect gen AI’s impact on productivity to add up to $4.4 trillion annually to the global economy.

A growing need for gen AI means the demand for computational power is skyrocketing. This surge in demand is driving not only innovation in software for gen AI applications but also heavy capital investment in infrastructure, from data centers to semiconductor fabs.

The big question for executives and other leaders is whether the semiconductor industry will be able to provide enough supply to support the huge incoming demand of gen AI. My colleagues and I have developed a perspective to help CXOs develop strategies and identify opportunities for growth.

First, let’s look at what gen AI needs to run effectively. Gen AI applications typically run on dedicated servers in specialized data centers equipped with graphics processing units (GPUs) or AI accelerators. These processors excel at parallel processing, efficiently handling complex tasks and large datasets. These servers differ in many ways from a typical server today; for example, they consume up to 20 times more electricity.

So where is this demand coming from? And how will gen AI be used? We expect to see two different types of applications for generative AI: B2C and B2B use cases. We estimate B2C use cases will account for about 60 percent of the demand for computational power (in teraflops) by 2030. This will involve basic and advanced consumer queries for private use—for example, in e-mail drafting or conversion of text into visual creations. We expect B2B use cases[1] to make up the other approximately 40 percent of the demand. These include use cases such as advanced content creation for businesses (for example, gen AI-assisted code creation), addressing customer inquiries, generating standard financial reporting, and so on.

In both the B2C and B2B markets, the demand for gen AI splits into training and inference. Training builds the model from substantial amounts of data, while inference generates the real-time outputs. By 2030, we project training to account for only 5 to 8 percent of demand, with the remainder going to inference.

Along with the rising demand for computing power and servers comes a corresponding need for semiconductor chips. The areas with the highest demand in terms of wafer volume are logic chips (central processing units, GPUs, and AI accelerators), memory chips (high-bandwidth memory [HBM] and dynamic random-access memory [DRAM]), data storage chips (NAND), power semiconductor chips, and other components. Semiconductor leaders will need to account for some variables as well: both the penetration rate of AI use cases (such as queries per day) and the complexity of the respective AI models (such as the size of a model) will substantially influence the need for compute capacity.

Given the dynamic development of gen AI, predicting its compute needs is challenging. To aid semiconductor leaders, we’ve developed several possible scenarios for gen AI’s impact on the data center, along with their underlying assumptions. There will also be implications for edge devices such as smartphones, which we do not cover here, as we assume the upside potential of gen AI there will be smaller.

For this post, we look at only one scenario, which we call the “base scenario.” We assume approximately ten daily queries by every smartphone user and two underlying classes of gen AI models, similar to GPT-3.5 and GPT-4 today but differing in capability and compute requirements. In addition, we assume that five of the six B2B use cases analyzed by the McKinsey Global Institute (all but complex concision) are adopted, creating approximately 90 percent of the estimated value of up to $4.4 trillion. Based on these parameters, we anticipate demand for gen AI data centers with a capacity of around 390 gigawatts (GW) by 2030, plus additional available capacity of roughly 130 GW for other applications. This represents an eightfold increase over today’s capacity. Combined, all data centers would then account for about 10 percent of global electricity demand.
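As a quick sanity check, the scenario's capacity figures can be tied together with simple arithmetic. Note that today's baseline capacity is not stated in the post; the ~65 GW figure below is inferred from the "eightfold increase" claim and should be treated as an assumption.

```python
# Back-of-the-envelope check of the base scenario's data center capacity figures.
# The implied "today" baseline is derived from the eightfold-increase claim,
# not stated directly in the post.

GEN_AI_CAPACITY_GW = 390   # projected gen AI data center capacity by 2030
OTHER_CAPACITY_GW = 130    # additional capacity for other applications

total_2030_gw = GEN_AI_CAPACITY_GW + OTHER_CAPACITY_GW
implied_today_gw = total_2030_gw / 8  # "eightfold increase compared to today"

print(f"Total 2030 capacity: {total_2030_gw} GW")            # 520 GW
print(f"Implied capacity today: {implied_today_gw:.0f} GW")  # ~65 GW
```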

As logic chips perform the actual calculations in the server, they will also face increasing demand. In our scenario, the demand for servers from gen AI translates into an additional demand of 2.4 million wafers by 2030 on the smallest node sizes. This suggests a requirement for the equivalent of up to six fabs with a nameplate capacity of 480,000 wafers per year running the latest logic chip technology by 2030. These fabs would be in addition to the 24 fabs already slated to produce at seven nanometers and below within the same time frame.

DRAM, including DDR and HBM, which stores data during computation, is another component to consider. By 2030, we expect the demand for the latest DRAM technologies to be between about nine million and 14 million wafers, accounting for 25 to 35 percent of global DRAM demand. This will require capacity equivalent to eight to 12 fabs with a nameplate capacity of 1.2 million wafers per year. The wide range reflects the uncertainty in the memory demand of AI accelerators: simplifying the complex underlying architecture scenarios, the lower end of the range represents a scenario with limited growth in memory per device, whereas the upper end represents a scenario with moderate growth.

For NAND storage, we anticipate a demand of approximately four million wafers by 2030, accounting for approximately 12 percent of total NAND demand. Meeting this additional demand will require the equivalent of three fabs with a nameplate capacity of 1.6 million wafers per year (exhibit).
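The fab-equivalent figures above follow from dividing wafer demand by nameplate capacity and rounding up to whole fabs. A minimal sketch of that arithmetic (the round-up convention is an assumption on our part; note the post quotes "up to six" logic fabs, which implies some headroom beyond the straight division, perhaps for utilization below nameplate):

```python
import math

def fabs_needed(wafer_demand_per_year, nameplate_capacity_per_year):
    """Round annual wafer demand up to whole fab-equivalents at nameplate capacity."""
    return math.ceil(wafer_demand_per_year / nameplate_capacity_per_year)

# Logic: 2.4M wafers at 480,000 wafers/year per fab
logic_fabs = fabs_needed(2_400_000, 480_000)       # 5 (post allows for up to six)

# DRAM: 9M-14M wafers at 1.2M wafers/year per fab
dram_fabs_low = fabs_needed(9_000_000, 1_200_000)   # 8
dram_fabs_high = fabs_needed(14_000_000, 1_200_000)  # 12

# NAND: 4M wafers at 1.6M wafers/year per fab
nand_fabs = fabs_needed(4_000_000, 1_600_000)      # 3
```

The DRAM and NAND counts reproduce the post's eight-to-12 and three-fab figures exactly.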

In total, we estimate the cost of these new fabs at more than $250 billion. Given the constraints on capital, available labor, and materials, this high demand for additional fabs will be challenging to meet. The industry will likely need to look for innovative ways to reduce the wafer demand—and thus the demand for new fabs—until 2030. These could include chip design improvements (such as developing higher-performance logic chips), higher data densities, or algorithmic improvements such as those we have seen with the invention of transformer models.

Gen AI will undoubtedly change the way digital products and services are built and delivered. But it will require a lot of computing power, which, in turn, means that many players in the semiconductor industry will need to step up. The need for new fabs for logic, DRAM, and NAND chips is just the tip of the iceberg. Gen AI will also need more capacity for power semiconductors, optical transceivers, and much more. The full force of the supply chain—construction workers, machines, raw materials, engineers, technicians—will be necessary to bring this new capacity online.

We wish to thank Demi Liu, Dr. Klaus Pototzky, Mark Patel, Diana Tang, Rutger Vrijen, Wenyi (Wendy) Zhu, and many more for their contributions to this post.


[1] We consider six B2B use cases: coding and software development, creative content generation, customer engagement, innovation, simple concision, and complex concision.

Rainer Veit

#connect You 🫵🏻 to the world of electronics

5mo

That’s what I’m saying. The semiconductor industry is predicting relief in 2024, driven by AI and e-mobility. Both require tons of other components as well, which in my opinion will heat up the shortage for the rest. To overcome this situation, digital tools can provide a clearer picture and better data for decisions. #APIinaday helps to connect with all major suppliers of electronic components within one day, without any prior knowledge.

Georg Steinberger

Semiconductor Industry and Market Expert

5mo

By when does this foreseeable shortage hit the market? Now? 2025? Is that why Microsoft will start building nuke plants?

Sheikh Shabnam

Producing end-to-end Explainer & Product Demo Videos || Storytelling & Strategic Planner

5mo

Exciting times ahead in the world of genAI and semiconductors! 🚀

Alex Joseph Varghese

Senior Manager @Accenture Strategy | Semiconductors | Growth | Cost Optimization | Organization and Digital Transformation

5mo

Good insights. As SRAM scaling slows to a halt, will chiplet technology be the key to unlocking new performance potential?

Fabian Forthmann

Empowering financial institutions with reliable AI and MLOps solutions

5mo

Important topic!

