Posts

Showing posts from September, 2019

Dominant Resource Fairness: Fair Allocation of Multiple Resource Types

Authors: Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion Stoica
Venue: NSDI 2011

This work presents a very computationally efficient scheduling algorithm in the context of data centers. The problem is presented as fair resource allocation, but the goal is accomplished through choosing which task to schedule (and how many of each). This is done by assigning each task a resource vector of its requirements, alongside a corresponding vector of available resources. The algorithm considers each job's allocation via its dominant resource. For example, if a job uses 1 CPU and 1 GB of memory, but there are 4 CPUs and 8 GB of memory, it would be dominated by its CPU usage (1/4 > 1/8). Tasks are continually scheduled such that the job with the lowest dominant resource share is given priority. The algorithm takes O(log(n)) time for n tasks. The work presents 4 main properties, as well as 4 other "nice to have" properties. I'll briefly ...
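To make the dominant-share idea concrete, here is a minimal Python sketch of the scheduling loop described above. It is not the authors' implementation: the function names `dominant_resource` and `drf_schedule` are made up for illustration, job "b" and its demands are hypothetical, and dropping a job once its next task no longer fits is a simplifying assumption. Only the 1 CPU / 1 GB job on a 4 CPU / 8 GB cluster comes from the example in this post. A binary heap keeps each "pick the lowest dominant share" step logarithmic in the number of jobs.

```python
import heapq


def dominant_resource(demand, capacity):
    """Return (resource, share) for the resource with the largest demand/capacity ratio."""
    shares = {r: demand[r] / capacity[r] for r in capacity}
    name = max(shares, key=shares.get)
    return name, shares[name]


def drf_schedule(capacity, demands):
    """Greedily launch tasks, always serving the job with the lowest dominant share.

    capacity: dict mapping resource name -> total amount in the cluster
    demands:  dict mapping job name -> dict of per-task resource demand
    Returns a dict mapping job name -> number of tasks launched.
    """
    used = {r: 0 for r in capacity}
    tasks = {job: 0 for job in demands}
    # Min-heap of (dominant share, job); every job starts with a share of 0.
    heap = [(0.0, job) for job in sorted(demands)]
    heapq.heapify(heap)
    while heap:
        _, job = heapq.heappop(heap)
        need = demands[job]
        # Assumption of this sketch: a job whose next task does not fit is dropped.
        if any(used[r] + need[r] > capacity[r] for r in capacity):
            continue
        for r in capacity:
            used[r] += need[r]
        tasks[job] += 1
        allocated = {r: tasks[job] * need[r] for r in capacity}
        heapq.heappush(heap, (dominant_resource(allocated, capacity)[1], job))
    return tasks


capacity = {"cpu": 4, "mem": 8}
demands = {
    "a": {"cpu": 1, "mem": 1},  # the job from the example above
    "b": {"cpu": 1, "mem": 4},  # hypothetical memory-heavy job
}
print(dominant_resource(demands["a"], capacity))
print(drf_schedule(capacity, demands))
```

Job "a" is CPU-dominated and job "b" is memory-dominated, so the loop alternates between them until the CPUs run out.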

Post-Silicon CPU Adaptation Made Practical Using Machine Learning

Authors: Stephen J. Tarsa, Gautham Chinya, Hong Wang, et al.
Venue: ISCA 2019

Preface: Before I begin, I'd like to note that this is one of my favorite papers of 2019; it is well-written, shows poise in the application of machine learning techniques, and gives consideration to real-world applicability. I read this paper and produced my own slides for it, which can be found here.

Overview: This paper presents an adaptive architecture controlled by a machine learning solution. Adaptive architecture itself is not a novel idea; there have been several works regarding tile-based clock gating, heterogeneous core scheduling, pipeline gating, etc. This work chooses a simple adaptive piece of hardware: a binary decision to enable or disable a cluster. In this case, a cluster comprises an instruction cache, a decoder, a memory execution unit, a register file, a ROB, and execution units. The authors hint at this being something like a modern SMT core, which can use all its resource fo...

Stream-based Memory Access Specialization for General Purpose Processors

Authors: Zhengrong Wang, Tony Nowatzki
Venue: ISCA 2019

This paper presents changes to the architecture, ISA, and compiler to optimize the performance of memory accesses in load/store streams. Streams are defined as "the dynamic sequence of memory operations associated with a static instruction, where the longest extent is defined as the entry and exit of the outermost containing loop." These can be characterized as affine (simple strides), indirect (based off a single pointer), or pointer-chasing. Most streams are affine or indirect, according to the authors' measurements. To accelerate the memory subsystem, code must be augmented with semantics which pass information to the proposed stream engine. Based on this and other code information, the stream engine can fetch data ahead of time. The authors also extend this design with the option to bypass the cache in streaming designs. The mechanism as a whole outperforms 1000 instruction run-ahead processing as well as hardware...
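As a rough illustration of the three stream categories, the Python sketch below generates the address sequence each kind of stream would touch. It reflects my reading of the categories, not the paper's formal definitions; the function names, base addresses, and the dictionary standing in for linked-list memory are all hypothetical.

```python
def affine_stream(base, stride, count):
    """Addresses of a simple strided access, e.g. a[i] for i in range(count)."""
    return [base + i * stride for i in range(count)]


def indirect_stream(base, index_values, elem_size):
    """Addresses of an access such as a[b[i]], where the index comes from another stream."""
    return [base + idx * elem_size for idx in index_values]


def pointer_chasing_stream(head, next_of):
    """Addresses visited by following next pointers, e.g. a linked-list walk.

    next_of is a hypothetical stand-in for memory: node address -> next node address.
    """
    addrs = []
    node = head
    while node is not None:
        addrs.append(node)
        node = next_of.get(node)
    return addrs


print(affine_stream(0x1000, 8, 4))
print(indirect_stream(0x2000, [3, 0, 2], 4))
print(pointer_chasing_stream(0x10, {0x10: 0x40, 0x40: 0x20, 0x20: None}))
```

The affine sequence is fully predictable from a base and a stride, the indirect one depends on values loaded by another stream, and the pointer-chasing one depends on the value of its own previous load, which is what makes it the hardest to fetch ahead of time.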

Spectre Attacks: Exploiting Speculative Execution

Normally, I prefer to keep these blog posts very academic. However, I'd like to move to a more conversational format, and I think this paper presents the perfect starting point for such a shift. This paper is one of the more "advanced" papers I've read. It requires a rich and detailed understanding of computer architecture AND operating system constructs. This post may contain misinformation, and therefore I advise any potential reader to consider reading the paper. *** Spectre leverages speculative execution to expose a timing side-channel. The two main variants described in this paper are Variant 1: Bounds Check Bypass and Variant 2: Branch Target Injection. It is worth noting that Spectre attacks are perhaps most useful in the case of trying to access data within the same process. The common example of this is a web browser which runs multiple threads and JavaScript. The site's JavaScript can use Spectre to gain access to what you are typing in...

Meltdown: Reading Kernel Memory from User Space

Authors: Moritz Lipp, Michael Schwarz, Daniel Gruss, et al.
Venue: arXiv

This paper describes one of the massive mainstream CPU security vulnerabilities exposed in 2018. The attack initially was independent of software vulnerabilities, and works even in the presence of KASLR/ASLR. The attack is based on speculative execution, which can result in what they call "transient instructions". These instructions begin execution, but do not finish, so there is no change in architectural state. However, there are micro-architectural artifacts, which can be exploited via the Flush+Reload attack methodology. There are three main details worth explaining that are non-trivial: 1. Exception handling. When the program attempts to read from kernel memory, this will cause an exception. One way to handle this is by having the exception happen on a separate thread. However, an even more elegant solution is to put the attack in Intel TSX (transactional memory). In this case, an exception is...