SoftSKU: Optimizing Server Architectures for Microservice Diversity @Scale
Authors: Akshitha Sriraman, Abhishek Dhanotia, Thomas F. Wenisch
Venue: ISCA 2019
This work comprises two main parts: a detailed workload analysis, and a tool that tunes coarse-grain parameters based on general application (microservice) behavior. The authors analyze workloads in Facebook's datacenter in the categories of Web, Feed, Ads, and Cache, which have varying throughput and latency requirements. The datacenter workloads exhibit significant front-end stalls (instruction fetch misses), significant branch mispredictions, and significant back-end stalls (mostly data cache misses).
uSKU is presented as a tool that automates parameter tuning to optimize the system for specific classes of microservices. The knobs explored are core frequency, uncore frequency, core count, code and data prioritization (CDP), prefetchers, and transparent and static huge pages. Knobs are tested independently, so the search does not consider dependent effects between them (Gaussian process search seems a good fit for this space). CDP in the LLC is one of the most interesting results: the authors find that trading an increased data cache miss rate for fewer instruction misses results in better overall throughput. Additionally, their framework finds better performance by tuning prefetcher configurations (when bandwidth is saturated) and by changing the number of allocated huge pages.
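To make the "knobs are tested independently" point concrete, here is a minimal sketch of a one-knob-at-a-time sweep. It is not the authors' implementation: the function name, the knob names, and the toy throughput model with its numbers are all invented for illustration. The toy model deliberately includes an interaction between two knobs to show what an independent sweep can miss.

```python
from typing import Callable, Dict, Hashable, List

Config = Dict[str, Hashable]


def tune_independently(
    baseline: Config,
    knob_space: Dict[str, List[Hashable]],
    measure: Callable[[Config], float],
) -> Config:
    """Sweep each knob on its own while all other knobs stay at baseline.

    Returns the configuration built from each knob's individually best value.
    """
    best = dict(baseline)
    base_score = measure(baseline)
    for knob, settings in knob_space.items():
        best_score = base_score
        for value in settings:
            trial = dict(baseline)  # only this one knob differs from baseline
            trial[knob] = value
            score = measure(trial)
            if score > best_score:
                best_score = score
                best[knob] = value
    return best


def toy_throughput(cfg: Config) -> float:
    """Hypothetical throughput model; the numbers are made up."""
    score = 100.0
    if cfg["cdp"]:
        score += 4.0
    if cfg["thp"] == "always":
        score += 2.0
    if cfg["cdp"] and cfg["thp"] == "always":
        score -= 5.0  # assumed interaction: the two gains do not compose
    return score


baseline = {"cdp": False, "thp": "madvise"}
knob_space = {"cdp": [False, True], "thp": ["madvise", "always", "never"]}

chosen = tune_independently(baseline, knob_space, toy_throughput)
joint_best = {"cdp": True, "thp": "madvise"}

print(chosen, toy_throughput(chosen))
print(joint_best, toy_throughput(joint_best))
```

In this toy setup each knob looks beneficial in isolation, yet the combined configuration scores lower than enabling CDP alone. That gap is the kind of dependent effect a joint search (for example, a Gaussian process based one) would be better placed to find.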