The future of fast PC graphics? Connecting directly to SSDs

Performance boosts are expected with each new generation of the best graphics cards, but it seems that Nvidia and IBM have their sights set on greater changes.

The companies teamed up to work on Big accelerator Memory (BaM), a technology that involves connecting graphics cards directly to superfast SSDs. This could result in larger GPU memory capacity and faster bandwidth while limiting the involvement of the CPU.

[Figure: A chart breaks down Nvidia and IBM's BaM technology. Image source: Arxiv]

Similar technology has been explored before. Microsoft's DirectStorage application programming interface (API) works in a somewhat similar way, speeding up data transfers between the SSD and the GPU. However, DirectStorage relies on external software, applies only to games, and works only on Windows. Nvidia and IBM researchers are working together on a solution that removes the need for a proprietary API while still connecting GPUs directly to SSDs.

The method, amusingly referred to as BaM, was described in a paper written by the team that designed it. Connecting a GPU directly to an SSD could deliver a meaningful performance boost, especially for resource-heavy tasks such as machine learning. As such, it would mostly be used in professional high-performance computing (HPC) scenarios.

Current approaches to such heavy workloads require the graphics card either to rely on large amounts of special-purpose memory, such as HBM2, or to be given efficient access to SSD storage. With datasets only growing in size, optimizing the connection between the GPU and storage is essential for efficient data transfers. This is where BaM comes in.

“BaM mitigates the I/O traffic amplification by enabling the GPU threads to read or write small amounts of data on-demand, as determined by the compute,” said the researchers in their paper, first cited by The Register. “The goal of BaM is to extend GPU memory capacity and enhance the effective storage access bandwidth while providing high-level abstractions for the GPU threads to easily make on-demand, fine-grain access to massive data structures in the extended memory hierarchy.”

[Image: An Nvidia GPU core sits on a table. Niels Broekhuijsen / Digital Trends]

For many people who don’t work directly with this subject, the details may seem complicated, but the gist of it is that Nvidia wants to rely less on the processor and connect directly to the source of the data. This would both make the process more efficient and free up the CPU, making the graphics card much more self-sufficient. The researchers claim that this design would be able to compete with DRAM-based solutions while remaining cheaper to implement.
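To make the idea concrete, here is a minimal sketch (not Nvidia's actual implementation, and purely illustrative) of why letting GPU threads fetch small amounts of data on demand reduces wasted traffic compared with a CPU staging large chunks ahead of time. The chunk and block sizes are assumptions chosen only for illustration.

```python
import random

def cpu_orchestrated(needed_offsets, chunk=1 << 20):
    """Traditional model: the CPU can't know exactly which bytes the
    GPU will touch, so it stages whole 1 MiB chunks into GPU memory."""
    chunks = {off // chunk for off in needed_offsets}
    return len(chunks) * chunk  # total bytes moved

def bam_style(needed_offsets, granularity=512):
    """BaM-style model: GPU threads read only the small storage blocks
    they actually dereference, at a much finer granularity."""
    blocks = {off // granularity for off in needed_offsets}
    return len(blocks) * granularity  # total bytes moved

# A sparse access pattern: 1,000 scattered 8-byte reads across a
# 1 GiB dataset, typical of graph or recommendation workloads.
random.seed(0)
offsets = [random.randrange(1 << 30) for _ in range(1000)]

print("CPU-staged bytes:", cpu_orchestrated(offsets))
print("On-demand bytes: ", bam_style(offsets))
```

For sparse access patterns like this, the fine-grained model moves orders of magnitude less data, which is the "I/O traffic amplification" the researchers say BaM mitigates.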

Although Nvidia and IBM are breaking new ground with their BaM technology, AMD explored this area first: in 2016, it unveiled the Radeon Pro SSG, a workstation GPU with integrated M.2 SSDs. However, the Radeon Pro SSG was intended strictly as a graphics solution, while Nvidia is taking the concept a few steps further, aiming at complex and heavy compute workloads.

The team working on BaM plans to release the details of its software and hardware optimizations as open source, allowing others to build on its findings. There is no word on when, if ever, BaM might be implemented in future Nvidia products.
