As AI workloads become more demanding, selecting the right DRAM configuration is crucial for improving performance. But what matters more, speed or capacity? In this article, five different DRAM kits will be evaluated on a gaming/AI PC to find the best DRAM configuration.
Different DRAM setups will be tested using the Ollama tool, measuring tokens per second and memory usage across three large language models (LLMs).
The test platform consists of:
Three LLMs of varying sizes, with different memory demands, were evaluated:
The Ollama utility was used to load each model. Ollama reports a tokens/second figure that measures inference speed, and system memory usage was tracked to see how each DRAM configuration handles each model. The prompts tested were the following:
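For reference, Ollama's HTTP API (`/api/generate`) reports `eval_count` and `eval_duration` (in nanoseconds) in its final response, and a tokens/second figure like the one used in these charts can be derived from them. A minimal sketch; the sample response values below are illustrative, not measurements from this test:

```python
# Derive a tokens/second figure from the fields Ollama's /api/generate
# endpoint returns in its final (done=true) JSON response.
# eval_count    : number of tokens generated
# eval_duration : generation time in nanoseconds

def tokens_per_second(response: dict) -> float:
    """Convert Ollama's eval_count/eval_duration fields to tokens/s."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)

# Illustrative response fragment (not a measurement from this article):
sample = {"eval_count": 256, "eval_duration": 8_000_000_000}  # 8 s in ns
print(f"{tokens_per_second(sample):.1f} tokens/s")  # 256 tokens / 8 s = 32.0
```

The same number is printed directly by the CLI when a model is run with `ollama run <model> --verbose` (shown as "eval rate").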
Finally, the Final Fantasy XIV Dawntrail benchmark was run to measure the gaming performance of each DRAM configuration. The goal is to find the DRAM configuration that can handle AI tasks and gaming at the same time without compromising performance.
Fig.1: Tokens/second for each DRAM configuration tested across the different LLMs, using only the CPU.
Fig.2: Performance of the DRAM configurations tested when the LLMs run with the RTX 4090. Results are in tokens/second.
Fig.3: System memory usage measured while running each LLM, CPU-only vs. CPU+GPU. Results are approximate, in GB.
Fig.4: FFXIV Dawntrail benchmark results for each of the DRAM configurations tested.
A few important observations emerge from the results:
The 32GB and 48GB configurations could not fit the DeepSeek-R1 (70B) model when using the CPU only. Even with the RTX 4090, 32GB of DRAM is borderline while the 70B model is in use.
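A back-of-the-envelope calculation explains why: a 70-billion-parameter model quantized to 4 bits needs roughly 35 GB for its weights alone, before the OS, KV cache, and runtime overhead are counted. A rough sketch under that assumption; the actual footprint depends on the quantization format and context length:

```python
# Rough weight-memory estimate for a quantized LLM.
# Real memory use also includes the KV cache, runtime overhead, and
# the OS itself, so treat this as a lower bound.

def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

print(weight_memory_gb(70, 4))  # 70B @ 4-bit -> 35.0 GB, tight on a 32GB kit
print(weight_memory_gb(70, 8))  # 70B @ 8-bit -> 70.0 GB
```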
The higher speed of the 8400 MT/s CUDIMM kit outperforms every other DRAM configuration in gaming, but even with an RTX 4090, running a large AI model leaves little available memory.
For AI-heavy workloads, DRAM capacity plays a critical role, especially with larger models. For hybrid systems that handle both gaming and AI, however, speed and latency optimization matter just as much. As LLMs are continuously optimized for different hardware and their requirements shrink, it is best to have extra memory capacity so that a larger, more accurate model can run on the system.
Based on considerations outlined above, the DRAM kit of choice is the CMH96GX5M2B7000C40.