Learning Scheduling Algorithms for Data Processing Clusters

Authors: Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Zili Meng, Mohammad Alizadeh
Venue: Proceedings of the ACM Special Interest Group on Data Communication


This paper uses reinforcement learning to learn a scheduling policy for Spark jobs. The scheduler makes two main decisions: (i) which stage to schedule next and (ii) how much parallelism to give that stage. The RL problem is formulated as: given the state of the cluster and the job DAGs as input, output a scheduling action. The reward is defined as -T x J, where T is the time elapsed since the previous scheduling decision and J is the number of jobs in the system; accumulating this penalty over an episode is what pushes the policy toward low average job completion time (a small worked sketch follows below). Decision making is particularly difficult because a job can present a dependency DAG of arbitrary shape, yet the neural network input is of fixed size. To solve this, an embedding method based on graph convolutional neural networks [1] is used (sketched below). The RL policy network then predicts a composite action: the stage to run and its maximum parallelism level. To train the network under continuous job arrivals, a differential reward is used for feedback, and variance reduction techniques are applied to compensate for the "input-driven" environment [2]. The net result performs 19% better than the fair scheduling algorithm.
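
To make the reward concrete, here is a minimal Python sketch of the time-weighted penalty described above; the function name and arguments are illustrative, not from the paper's code. After every scheduling decision the agent is charged for each job that sat in the system during the elapsed interval, so the accumulated penalty tracks total (and hence average) job completion time.

```python
# Illustrative sketch of the -T x J penalty reward (not the authors' code).

def step_reward(time_elapsed, num_jobs_in_system):
    """Penalty accrued since the previous scheduling decision:
    -(elapsed wall-clock time) * (jobs queued or running during it)."""
    return -time_elapsed * num_jobs_in_system

# Example: 2 seconds pass while 5 jobs are in the system -> reward = -10.
print(step_reward(2.0, 5))
```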

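The graph embedding step can be pictured as message passing over each job's DAG: node features (e.g. remaining work, number of tasks) are aggregated along dependency edges so that every stage ends up with a fixed-size vector regardless of DAG shape. The sketch below is a rough illustration under that reading; the aggregation direction, feature choices, and the stand-in nonlinearities are assumptions, not the paper's exact architecture.

```python
# Rough sketch of per-node message passing over a job DAG (illustrative only).
import numpy as np

def embed_dag(features, neighbors, f, g):
    """features: {node: feature vector}; neighbors: {node: list of nodes
    whose messages feed into this node}. f, g: small nonlinear transforms
    standing in for the learned networks."""
    emb = {}
    def visit(node):
        if node in emb:
            return emb[node]
        agg = sum((g(visit(c)) for c in neighbors.get(node, [])),
                  np.zeros_like(features[node]))
        emb[node] = f(features[node] + agg)  # combine own features with messages
        return emb[node]
    for n in features:
        visit(n)
    return emb

# Toy example: stage 0 aggregates messages from stages 1 and 2.
relu = lambda x: np.maximum(x, 0.0)
feats = {0: np.array([1.0, 0.5]), 1: np.array([0.2, 0.1]), 2: np.array([0.3, 0.4])}
deps = {0: [1, 2]}
print(embed_dag(feats, deps, relu, relu))
```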

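Finally, the composite action can be read as two softmax heads on top of the embeddings: one scores every runnable stage, the other scores a discrete set of parallelism limits. The sketch below uses a placeholder linear scorer and made-up dimensions purely to show the shape of the decision; it is not the paper's network.

```python
# Illustrative two-headed action: pick a stage, then a parallelism limit.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def choose_action(stage_embeddings, parallelism_scores, rng=np.random):
    """stage_embeddings: (num_stages, d) vectors from the graph embedding.
    parallelism_scores: (num_limits,) scores for candidate parallelism caps."""
    w = np.ones(stage_embeddings.shape[1])        # placeholder score weights
    stage_probs = softmax(stage_embeddings @ w)   # which stage to run next
    limit_probs = softmax(parallelism_scores)     # how many executors to allow
    stage = rng.choice(len(stage_probs), p=stage_probs)
    limit = rng.choice(len(limit_probs), p=limit_probs)
    return stage, limit

# Toy example: three runnable stages, parallelism limits of 1..4 executors.
emb = np.array([[1.5, 1.0], [0.2, 0.1], [0.3, 0.4]])
print(choose_action(emb, np.array([0.1, 0.4, 0.3, 0.2])))
```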
Further reading:
[1]: T. N. Kipf and M. Welling. "Semi-Supervised Classification with Graph Convolutional Networks."
[2]: H. Mao et al. "Variance Reduction for Reinforcement Learning in Input-Driven Environments."
