Sparrow: Distributed, Low Latency Scheduling
Authors: Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica
Venue: SOSP 2013

This work presents Sparrow, a stateless, decentralized cluster scheduler. The scheduling component uses two key ideas: batch sampling and late binding. Batch sampling extends the power of two choices [1], which shows that probing two randomly chosen machines and placing a task on the less loaded one sharply cuts the "tail" of the load distribution compared with placing the task on a single randomly selected machine. Batch sampling generalizes this to a whole job: for a job of m tasks, the scheduler probes d·m machines and places the m tasks on the m least loaded of them. Late binding delays the actual task transfer until the machine is ready to process the request. This can be thought of as leaving a placeholder in the worker's queue: when the worker finally reaches the placeholder, the actual task is transferred from the scheduler to the worker. This avoids having to rely on inaccurate metrics such as queue depth. Each worker maintains its "instance" of Sparrow, which us...
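
The two ideas described above, batch sampling and late binding, can be illustrated with a minimal Python sketch. It is not Sparrow's actual code: the names `batch_sample`, `JobReservations`, and `request_task` are hypothetical, worker load is modeled as a plain list of queue lengths, and the sketch assumes there are at least m workers.

```python
import heapq
import random
from collections import deque


def batch_sample(queue_lengths, m, d=2, rng=random):
    """Probe d*m random workers and return the m least loaded of them.

    queue_lengths[i] is the (assumed) reported load of worker i.
    With m=1 and d=2 this reduces to the power of two choices.
    """
    n_probes = min(d * m, len(queue_lengths))
    probed = rng.sample(range(len(queue_lengths)), n_probes)
    return heapq.nsmallest(m, probed, key=lambda w: queue_lengths[w])


class JobReservations:
    """Scheduler-side state for one job under late binding (illustrative)."""

    def __init__(self, tasks):
        self._unlaunched = deque(tasks)

    def request_task(self, worker_id):
        """Called when a worker reaches this job's placeholder in its queue.

        Returns the next unlaunched task, or None if every task has
        already been handed to another worker.
        """
        if self._unlaunched:
            return self._unlaunched.popleft()
        return None
```

In this sketch the scheduler would leave a placeholder on each probed worker rather than a task. The first workers to call `request_task` receive the job's tasks, and later callers get `None` and move on to the next entry in their queue, so placement depends on which workers actually become free first rather than on the queue lengths seen at probe time.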