Showing posts from May, 2018

Coordinated Management of Multiple Interacting Resources in Chip Multiprocessors: A Machine Learning Approach

Authors: Ramazan Bitirgen, Engin Ipek, Jose F. Martinez
Venue: MICRO 2008
This paper presents a scheme to dynamically allocate system resources. The paper focuses on LLC partitioning by ways, bandwidth partitioning, and DVFS. They propose Coordinated Hill-climbing to allocate these resources dynamically. The system first profiles in the default fair-share configuration. If the prediction framework shows a high CoV (coefficient of variation) for the baseline performance, the algorithm does nothing. However, if the CoV is low enough for the predictions to be trusted, a profiling phase occurs. Once the initial training set is provided, the controller continues to sample for every 1 and 5 intervals. The model itself is an ensemble of fifty 2-layer FC ANNs, each of which has 9 inputs (power, cache usage, read hits/misses, write hits/misses, bandwidth usage and L2 cache dirty ratio). The model attempts to predict performance from these statistics. The model guides the search, such that search is shifte...
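To make the CoV check concrete, here is a minimal sketch of gating on ensemble agreement. It is not the paper's implementation: the function names, the use of the population standard deviation, and the `cov_threshold` value of 0.1 are all assumptions made for illustration. Each element of `predictions` stands for the performance predicted by one member of the ensemble.

```python
import statistics


def coefficient_of_variation(predictions):
    """Return stdev / |mean| across the ensemble's predictions."""
    mean = statistics.fmean(predictions)
    if mean == 0:
        # CoV is undefined for a zero mean; treat it as "no agreement".
        return float("inf")
    return statistics.pstdev(predictions) / abs(mean)


def trusted_prediction(predictions, cov_threshold=0.1):
    """Return the ensemble mean if members agree, otherwise None.

    cov_threshold is a hypothetical cut-off chosen for this sketch.
    """
    if coefficient_of_variation(predictions) > cov_threshold:
        return None  # high CoV: do nothing with this prediction
    return statistics.fmean(predictions)
```

A caller would act on the returned mean only when it is not `None`, and otherwise keep the current (for example fair-share) allocation.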

Using Multiple Input, Multiple Output Formal Control to Maximize Resource Efficiency in Architectures

Authors: Raghavendra Pradyumna Pothukuchi, Amin Ansari, Petros Voulgaris, and Josep Torrellas
Venue: ISCA 2016
This paper uses control theory to design an adaptive system. While many approaches exist, the key novelty of this work is the multiple-input, multiple-output coordination, which considers multiple trade-offs simultaneously to tune the system. The framework essentially uses several "training set" workloads to train a controller that is able to predict power and performance. Then, a "test set" of workloads is used to evaluate the controller's performance when trying to tune the power-performance trade-offs. The benefit is formal guarantees; the cost is that the user must specify reference values. More specifically, rather than "maximize x trade-off", the MIMO controller works best when the user specifies "achieve values x1, x2, with weights w1 and w2" for importance. Overall, this work excellently integrates considerations such a...
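The "achieve values x1, x2, with weights w1 and w2" formulation can be illustrated with a small sketch. The quadratic penalty used here is an assumption of this example, chosen because it is a common way to express weighted reference tracking; the post does not state the exact cost the controller uses, and the function name is hypothetical.

```python
def weighted_tracking_cost(outputs, references, weights):
    """Sum of w_i * (y_i - r_i)^2 over all controlled outputs.

    The quadratic form is an assumption for illustration only.
    """
    if not (len(outputs) == len(references) == len(weights)):
        raise ValueError("outputs, references and weights must have equal length")
    return sum(w * (y - r) ** 2 for y, r, w in zip(outputs, references, weights))
```

For example, with two outputs such as performance and power, a larger weight on one of them tells the controller that missing that reference value matters more than missing the other.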

Scaling Datacenter Accelerators with Computation Reuse Architectures

Authors: Adi Fuchs, David Wentzlaff. Princeton University
Venue: ISCA 2018
Being the third paper at ISCA-18 that exploits input redundancy in one way or another (after EVA2 and Euphrates), COREx (COmputation-REuse Accelerators) proposes an effective idea to improve the speedup and energy efficiency of datacenters. The paper is motivated by the manifestation of Zipf's law in data center workloads such as internet traffic and data compression. As the paper title suggests, COREx stores the inputs and outputs of common kernels, and skips computation by sending the stored output to the host if the current input matches a stored input. They define the storing step as "memoization". Trading communication for computation, this work is the exact opposite of AMNESIAC (published at ASPLOS-17), which trades computation for communication. They define three constraints that need to be satisfied in order for memoization to be successful. (1) Correct results: Memoization must pr...
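The store-and-skip idea can be sketched in software as a lookup table placed in front of a kernel. This is only an illustration of the general technique, not the COREx hardware design: the class name `ReuseTable`, its hit and miss counters, and the unbounded dictionary are assumptions of this sketch, and inputs are assumed to be hashable.

```python
class ReuseTable:
    """Hypothetical software stand-in for a computation-reuse table."""

    def __init__(self, kernel):
        self.kernel = kernel  # the function whose results are reused
        self.table = {}       # stored input -> stored output
        self.hits = 0
        self.misses = 0

    def run(self, kernel_input):
        # Same input as a stored one: skip the computation entirely.
        if kernel_input in self.table:
            self.hits += 1
            return self.table[kernel_input]
        # New input: compute, then store the pair for later reuse.
        self.misses += 1
        result = self.kernel(kernel_input)
        self.table[kernel_input] = result
        return result
```

With a Zipf-like input distribution, a small number of popular inputs would account for most calls, so the hit counter would grow much faster than the miss counter. A real design would also need a bounded table and a replacement policy, which this sketch leaves out.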