Learning Scheduling Algorithms for Data Processing Clusters
Authors: Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Zili Meng, Mohammad Alizadeh
Venue: Proceedings of the ACM Special Interest Group on Data Communication
This paper uses reinforcement learning to learn a scheduling policy for Spark jobs. The scheduler makes two main decisions: (i) which stage to schedule next and (ii) how much parallelism to give that stage. The RL problem is formulated as: given the state of the cluster and the input job DAGs, output a scheduling action. The reward is defined as -T x J, where T is the duration of the time step and J is the number of jobs in the system. The decision-making process is particularly difficult because a job can present a dependency DAG of any shape, yet the neural network input is of fixed size. To solve this, a method based on graph convolutional neural networks [1] is used to embed the DAG into per-node embeddings. The RL policy network then predicts a composite action: a stage together with a maximum parallelism level for it. To train the network under continuous job arrivals, a differential reward is used as feedback. Additionally, variance reduction techniques are applied to compensate for the "input-driven" environment [2]. The resulting scheduler performs 19% better than the fair scheduling baseline.
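To make the two core ideas concrete, here is a minimal Python sketch (not the paper's implementation): a simple child-to-parent message-passing pass that turns a DAG of any shape into fixed-size per-node embeddings, and the time-weighted penalty reward. All names (embed_dag, step_reward, the weight matrices) are illustrative assumptions, and the learned transformations are stood in for by random linear layers.

```python
import numpy as np

def embed_dag(node_features, children, num_iters=3, hidden=8, rng=np.random):
    """Propagate messages from children to parents so each node's embedding
    summarizes its sub-DAG, regardless of the DAG's shape or size.
    node_features: (n, d) array; children: list of child-index lists per node."""
    n = len(node_features)
    W_in = rng.randn(node_features.shape[1], hidden) * 0.1   # stand-in for a learned node transform
    W_msg = rng.randn(hidden, hidden) * 0.1                  # stand-in for a learned message transform
    h = np.tanh(node_features @ W_in)                        # initial per-node embedding
    for _ in range(num_iters):
        msgs = np.zeros_like(h)
        for v in range(n):
            for c in children[v]:                            # aggregate messages from children
                msgs[v] += np.tanh(h[c] @ W_msg)
        h = np.tanh(node_features @ W_in + msgs)
    return h                                                 # one fixed-size vector per stage

def step_reward(time_elapsed, num_jobs_in_system):
    """Per-step penalty: -(duration of the step) x (jobs in the system).
    Summed over an episode this penalizes total job-time in the system,
    so minimizing it corresponds to reducing average job completion time."""
    return -time_elapsed * num_jobs_in_system
```

A policy head would then score each stage embedding from embed_dag to pick the next stage, and a second head would pick its parallelism limit, giving the composite action described above.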
Further reading:
[1]: T. N. Kipf and M. Welling. "Semi-Supervised Classification with Graph Convolutional Networks."
[2]: H. Mao et al. "Variance Reduction for Reinforcement Learning in Input-Driven Environments."