Fundamental Latency Trade-offs in Architecting DRAM Caches (Alloy Cache)
Authors: Moinuddin K. Qureshi and Gabriel H. Loh
Venue: MICRO 2012
The authors present a new DRAM cache architecture that optimizes for latency rather than hit rate. They show that every optimization made to improve hit rate must be weighed against the cache latency it adds, which yields a "break-even hit rate" the optimization must reach before it pays off. The authors compare against three key baselines: an infeasibly large SRAM tag array, the prior "LH-Cache", and an ideal latency-optimized cache. They find that "un-optimizations" improve latency while only minimally impacting hit rate. Their final design improves performance by 35% over the baseline and by nearly 20% over the prior LH-Cache. The specific contributions are:
- Moving to a direct-mapped cache provides a significant improvement by reducing hit latency
- Combining the tag store and data store into a single entity improves performance by 21%
- Adding a small per-core predictor of whether data will reside in main memory or in the DRAM cache enhances performance further, to a total of 35%
- These latency optimizations even outperform a design using an SRAM array for tags
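The break-even hit-rate argument can be illustrated with a small average-latency model. This is a sketch with hypothetical cycle counts, not the paper's exact parameters: every access probes the DRAM cache first, so a miss pays the probe latency plus the trip to main memory.

```python
def avg_access_latency(hit_rate, cache_latency, memory_latency):
    """Average latency when every access probes the DRAM cache first.

    A hit costs only the cache latency; a miss costs the cache probe
    plus the main-memory access.
    """
    return (hit_rate * cache_latency
            + (1 - hit_rate) * (cache_latency + memory_latency))


def break_even_hit_rate(base_hit_rate, extra_latency, memory_latency):
    """Hit rate a slower design must reach to match a faster baseline.

    Setting the two average latencies equal and solving shows the
    slower design must recover extra_latency / memory_latency of
    additional hit rate just to break even.
    """
    return base_hit_rate + extra_latency / memory_latency


# Hypothetical numbers: a 50-cycle cache, 200-cycle memory, 50% hit rate.
# An "optimization" that adds 20 cycles of cache latency must lift the
# hit rate to 60% before it delivers any benefit at all.
base = avg_access_latency(0.5, 50, 200)
target = break_even_hit_rate(0.5, 20, 200)
```

With these assumed numbers, `avg_access_latency(0.6, 70, 200)` equals the baseline's 150 cycles exactly, showing why hit-rate optimizations that add latency can easily be net losses.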