Learning Scheduling Algorithms for Data Processing Clusters
Authors: Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Zili Meng, Mohammad Alizadeh
Venue: Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM)

This paper uses reinforcement learning to learn a scheduling policy for Spark jobs. The scheduler makes two main decisions: (i) which stage to schedule next and (ii) how much parallelism to allot to that stage. The RL problem is formulated as follows: given the state of the cluster and the input job DAGs, output a scheduling action. The reward is defined as -T × J, where T is the duration of the time step and J is the number of jobs in the system during that step; this penalizes jobs for the time they spend in the system, so maximizing reward minimizes average job completion time. The decision-making process is particularly difficult because a job can present a dependency DAG of any shape, yet the neural network input is of fixed size. To solve this, a method based on graph convolutional neural networks [1] is used to embed the DAGs into fixed-size vectors. The RL policy network predicts a composite action: a stage together with a maximum parallelism level for it. To train the network in the case of continuous job arrivals, ...
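The graph-embedding idea can be sketched as a bottom-up message pass over each job DAG: every stage's embedding aggregates transformed embeddings of its child stages, so arbitrary DAG shapes reduce to fixed-size per-node vectors. The sketch below is a simplified illustration of this style of embedding, not the paper's exact architecture; the transforms `f` and `g` stand in for small trained neural networks, and the diamond-shaped DAG is a made-up toy example.

```python
import numpy as np

def embed_dag(features, children, f, g):
    """Bottom-up message passing over a DAG.

    features: dict node -> fixed-size feature vector (np.ndarray)
    children: dict node -> list of child nodes (dependents)
    f, g: nonlinear transforms (stand-ins for trained networks)
    Returns dict node -> embedding with the same fixed dimension.
    """
    emb = {}

    def compute(v):
        if v in emb:
            return emb[v]
        # Aggregate messages from children; leaves aggregate nothing.
        msg = sum((f(compute(u)) for u in children.get(v, [])),
                  np.zeros_like(features[v]))
        emb[v] = g(msg) + features[v]
        return emb[v]

    for v in features:
        compute(v)
    return emb

# Toy stand-ins for the learned nonlinear transformations.
f = lambda x: np.tanh(x)
g = lambda x: np.maximum(x, 0.0)  # ReLU

# Hypothetical diamond DAG: stage 0 depends on 1 and 2, which depend on 3.
features = {i: np.ones(4) * (i + 1) for i in range(4)}
children = {0: [1, 2], 1: [3], 2: [3]}

emb = embed_dag(features, children, f, g)
```

Because every node's embedding has the same dimension regardless of DAG shape, the per-node vectors (plus pooled per-job and global summaries) can be fed to a fixed-size policy network.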