As AI workloads become more demanding, selecting the right DRAM configuration is crucial for improving performance. But what matters more, speed or capacity? In this article, five different DRAM kits will be evaluated on a gaming/AI PC to find the best DRAM configuration.
Different DRAM setups will be tested using the Ollama tool, measuring tokens per second and memory usage across three large language models (LLMs).
The test platform consists of:
Three LLMs of varying sizes, with different memory demands, were evaluated:
The Ollama utility was used to load each model. Ollama reports a tokens/second figure that measures inference speed, and system memory usage was tracked to see how each DRAM configuration handles each model. The prompts tested were the following:
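For reference, Ollama's HTTP API (`/api/generate`) reports `eval_count` and `eval_duration` (in nanoseconds) in its final response, and a tokens/second figure like the one used in these charts can be derived from them. A minimal sketch; the sample response values below are illustrative, not measurements from this test:

```python
# Derive a tokens/second figure from the fields Ollama's /api/generate
# endpoint returns in its final (done=true) JSON response.
# eval_count    : number of tokens generated
# eval_duration : generation time in nanoseconds

def tokens_per_second(response: dict) -> float:
    """Convert Ollama's eval_count/eval_duration fields to tokens/s."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)

# Illustrative response fragment (not a measurement from this article):
sample = {"eval_count": 256, "eval_duration": 8_000_000_000}  # 8 s in ns
print(f"{tokens_per_second(sample):.1f} tokens/s")  # 256 tokens / 8 s = 32.0
```

The same number is printed directly by the CLI when a model is run with `ollama run <model> --verbose` (shown as "eval rate").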
Finally, the Final Fantasy XIV Dawntrail benchmark was run to measure the gaming performance of each DRAM configuration. The goal is to find the DRAM configuration that can handle AI tasks and gaming at the same time without compromising performance.
Fig.1: Tokens/second for each DRAM configuration tested across the different LLMs, using only the CPU.
Fig.2: Performance of the DRAM configurations tested when the LLMs run with the RTX 4090. Results are in tokens/second.
Fig.3: System memory usage measured while running each LLM, CPU-only vs. CPU+GPU. Results are approximate, in GB.
Fig.4: FFXIV Dawntrail benchmark results for each of the DRAM configurations tested.
A few important observations emerge from the results:
The 32GB and 48GB configurations could not fit the DeepSeek-R1 (70B) model when using the CPU only. Even with the RTX 4090, 32GB of DRAM is borderline while the 70B model is in use.
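A back-of-the-envelope calculation explains why: a 70-billion-parameter model quantized to 4 bits needs roughly 35 GB for its weights alone, before the OS, KV cache, and runtime overhead are counted. A rough sketch under that assumption; the actual footprint depends on the quantization format and context length:

```python
# Rough weight-memory estimate for a quantized LLM.
# Real memory use also includes the KV cache, runtime overhead, and
# the OS itself, so treat this as a lower bound.

def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

print(weight_memory_gb(70, 4))  # 70B @ 4-bit -> 35.0 GB, tight on a 32GB kit
print(weight_memory_gb(70, 8))  # 70B @ 8-bit -> 70.0 GB
```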
The higher speed of the 8400 MT/s CUDIMM kit outperforms every other DRAM configuration in gaming, but even with an RTX 4090, running a large AI model leaves little available memory.
For AI-heavy workloads, DRAM capacity plays a critical role, especially with larger models. For hybrid systems that handle both gaming and AI, however, speed and latency optimization matter just as much. As LLMs are continuously optimized for different hardware and their requirements shrink, it is best to have extra memory capacity so that a larger, more accurate model can run on the system.
Based on considerations outlined above, the DRAM kit of choice is the CMH96GX5M2B7000C40.