No, ChatGPT isn’t going to cause another GPU shortage

ChatGPT is exploding, and the backbone of its AI model relies on Nvidia graphics cards. One analyst said around 10,000 Nvidia GPUs were used to train ChatGPT, and as the service continues to expand, so does the need for GPUs. Anyone who lived through the rise of crypto in 2021 can smell a GPU shortage on the horizon.

I’ve seen a few reporters draw that exact connection, but it’s misguided. The days of crypto-driven GPU shortages are behind us. Although we’ll likely see a surge in demand for graphics cards as AI continues to boom, that demand isn’t directed toward the best graphics cards installed in gaming rigs.

Why Nvidia GPUs are built for AI

A render of Nvidia's RTX A6000 GPU.

First, we’ll address why Nvidia graphics cards are so great for AI. Nvidia has bet on AI for the past several years, and it’s paid off with the company’s stock price soaring after the rise of ChatGPT. There are two reasons why you see Nvidia at the heart of AI training: tensor cores and CUDA.

CUDA is Nvidia’s parallel computing platform and programming interface, used in everything from its most expensive data center GPUs to its cheapest gaming GPUs. CUDA acceleration is supported in machine learning libraries like TensorFlow, vastly speeding up training and inference. CUDA is a big part of why AMD is so far behind Nvidia in AI.

Don’t confuse CUDA with Nvidia’s CUDA cores, however. CUDA is the platform that a ton of AI apps run on, while CUDA cores are just the cores inside Nvidia GPUs. They share a name, and CUDA cores are better optimized to run CUDA applications. Nvidia’s gaming GPUs have CUDA cores and they support CUDA apps.

Tensor cores are basically dedicated AI cores. They handle matrix multiplication, which is the math at the heart of AI training. The idea here is simple: multiply and add whole blocks of numbers at once instead of one pair at a time, and training gets dramatically faster. A conventional core works through those multiplications one after another, while a Tensor core completes a small matrix multiplication in a single step.
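To make the matrix math concrete, here is a minimal Python sketch of the multiply-accumulate operation, D = A × B + C, that this kind of hardware applies to small blocks of numbers. It is an illustration only: the function name and the 2×2 block size are assumptions picked for readability, and plain Python works through the steps one at a time rather than in parallel the way dedicated hardware does.

```python
def matmul_accumulate(a, b, c):
    """Return D = A x B + C for small square matrices given as lists of rows."""
    n = len(a)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Start from the accumulator value, then add each product in the row/column pair.
            total = c[i][j]
            for k in range(n):
                total += a[i][k] * b[k][j]
            d[i][j] = total
    return d


# Hypothetical 2x2 inputs, chosen only to keep the arithmetic easy to follow.
a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]

result = matmul_accumulate(a, b, c)
print(result)
```

The three nested loops are the part that matters. A general-purpose core runs them sequentially, while matrix hardware is built to finish the whole block at once.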

Again, Nvidia’s gaming GPUs like the RTX 4080 have Tensor cores (and sometimes even more than costly data center GPUs). However, for all of the specs Nvidia cards have to accelerate AI models, none of them are as important as memory. And Nvidia’s gaming GPUs don’t have a lot of memory.

It all comes down to memory

A stack of HBM memory.
Wikimedia

“Memory size is the most important,” according to Jeffrey Heaton, author of several books on artificial intelligence and a professor at Washington University in St. Louis. “If you do not have enough GPU RAM, your model fitting/inference simply stops.”

Heaton, who has a YouTube channel dedicated to how well AI models run on certain GPUs, noted that CUDA cores are important as well, but memory capacity is the dominant factor when it comes to how a GPU functions for AI. The RTX 4090 has a lot of memory by gaming standards — 24GB of GDDR6X — but very little compared to a data center-class GPU. For instance, Nvidia’s latest H100 GPU has 80GB of HBM3 memory, as well as a massive 5,120-bit memory bus.

You can get by with less, but you still need a lot of memory. Heaton recommends beginners have no less than 12GB, while a typical machine learning engineer will have one or two 48GB professional Nvidia GPUs. According to Heaton, “most workloads will fall more in the single A100 to eight A100 range.” Nvidia’s A100 GPU comes with 40GB of memory in its base configuration, and an 80GB version is also available.
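As a rough way to read those memory figures, the sketch below estimates the largest model whose weights alone would fit on a card of a given size. It assumes 16-bit weights (2 bytes per parameter) and counts a gigabyte as one billion bytes. Both are assumptions made for illustration, and the function name is hypothetical. Real limits are lower, because training also has to hold gradients, optimizer state, and intermediate results.

```python
BYTES_PER_PARAM_FP16 = 2  # assumption: weights stored as 16-bit values


def max_params_billions(memory_gb, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Upper bound, in billions of parameters, on weights that fit in memory_gb."""
    # With 1 GB taken as 1e9 bytes, the billions cancel out.
    return memory_gb / bytes_per_param


# Memory sizes mentioned in the article.
cards = {
    "Beginner minimum": 12,
    "RTX 4090": 24,
    "A100": 40,
    "Professional card": 48,
}

for name, memory_gb in cards.items():
    limit = max_params_billions(memory_gb)
    print(f"{name} ({memory_gb}GB): weights for up to ~{limit:.0f} billion parameters")
```

Under these assumptions, doubling a card’s memory doubles the size of the model it can hold, no matter how many cores it has.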

You can see this scaling in action, too. Puget Systems shows a single A100 with 40GB of memory performing around twice as fast as a single RTX 3090 with its 24GB of memory. And that’s despite the fact that the RTX 3090 has roughly 50% more CUDA cores and about three-quarters as many Tensor cores.

Memory is the bottleneck, not raw processing power. That’s because training AI models relies on large datasets, and the more of that data you can store in memory, the faster (and more accurately) you can train a model.

Different needs, different dies

Hopper H100 graphics card.

Nvidia’s gaming GPUs generally aren’t suitable for AI due to how little video memory they have compared to enterprise-grade hardware, but there’s a separate issue here as well. Nvidia’s workstation GPUs don’t usually share a GPU die with its gaming cards.

For instance, the A100 that Heaton referenced uses the GA100 GPU, which is a die from Nvidia’s Ampere range that was never used on gaming-focused cards (including the high-end RTX 3090 Ti). Similarly, Nvidia’s latest H100 uses a completely different architecture than the RTX 40-series, meaning it uses a different die as well.

There are exceptions. Nvidia’s AD102 GPU, which is inside the RTX 4090, is also used in a small range of Ada Lovelace enterprise GPUs (the L40 and RTX 6000). In most cases, though, Nvidia can’t just repurpose a gaming GPU die for a data center card. They’re separate worlds.

There are some fundamental differences between the GPU shortage we saw due to crypto mining and the rise in popularity of AI models. According to Heaton, the GPT-3 model required over 1,000 Nvidia A100 GPUs to train and about eight to run. These GPUs have access to the high-bandwidth NVLink interconnect as well, while Nvidia’s RTX 40-series GPUs don’t. It’s comparing a maximum of 24GB of memory on Nvidia’s gaming cards to several hundred gigabytes pooled across GPUs like the A100 with NVLink.
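A back-of-the-envelope calculation shows why a model of this size spills across several cards. The sketch below is an estimate under stated assumptions, not Heaton’s method. It uses the publicly reported 175 billion parameters for GPT-3 (a figure that does not come from this article), assumes 16-bit weights, and counts only the weights, so real deployments need additional headroom. The function name is hypothetical.

```python
import math


def gpus_needed(num_params, bytes_per_param, gpu_memory_gb):
    """Minimum number of GPUs whose combined memory can hold the weights."""
    weights_gb = num_params * bytes_per_param / 1e9  # 1 GB taken as 1e9 bytes
    return math.ceil(weights_gb / gpu_memory_gb)


GPT3_PARAMS = 175_000_000_000  # assumption: publicly reported GPT-3 size
FP16_BYTES = 2                 # assumption: 16-bit weights

for label, memory_gb in [("RTX 4090", 24), ("A100 40GB", 40), ("A100 80GB", 80)]:
    count = gpus_needed(GPT3_PARAMS, FP16_BYTES, memory_gb)
    print(f"{label}: at least {count} GPUs for the weights alone")
```

With these inputs the weights come to about 350GB, which works out to nine 40GB A100s or five 80GB ones. That is in the same range as the roughly eight GPUs Heaton cites, and far beyond what a single 24GB gaming card can hold.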

There are some other concerns, such as memory dies being allocated for professional GPUs over gaming ones, but the days of rushing to your local Micro Center or Best Buy for the chance to find a GPU in stock are gone. Heaton summed that point up nicely: “Large language models, such as ChatGPT, are estimated to require at least eight GPUs to run. Such estimates assume the high-end A100 GPUs. My speculation is that this could cause a shortage of the higher-end GPUs, but may not affect gamer-class GPUs, with less RAM.”

Jacob Roach
Former Digital Trends Contributor
Jacob Roach is the lead reporter for PC hardware at Digital Trends. In addition to covering the latest PC components, from…