Automatic Database Management System Tuning Through Large-scale Machine Learning

Authors: Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, Bohan Zhang
Venue:    SIGMOD 2017

The paper presents OtterTune, an automated framework for tuning database configuration knobs. OtterTune combines offline and online learning, which lets it recognize workloads that behave like ones it has already seen and converge faster during online optimization.
    Before a target workload can be tuned, a repository of observations covering various knob settings and workloads must be collected. From this data, OtterTune first uses factor analysis (a dimensionality reduction technique) to prune the set of performance metrics. Next, the tuning knobs are ranked via Lasso regularization. This allows the automated tuner to shrink the search space by prioritizing the most impactful knobs.
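    To make the knob-ranking step concrete, here is a minimal sketch in plain Python. It assumes that "ranking via Lasso" means ordering knobs by when they first receive a nonzero weight as the L1 penalty is relaxed, which is one common way to turn Lasso into a ranking and not necessarily the exact procedure in the paper. The knob names, the synthetic data, and the penalty grid are all made up for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=200):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that ignores knob j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def rank_knobs(X, y, knob_names, alphas):
    """Order knobs by when they first get a nonzero weight as alpha shrinks."""
    order = []
    for alpha in sorted(alphas, reverse=True):
        w = lasso_coordinate_descent(X, y, alpha)
        entering = [j for j in range(len(w))
                    if w[j] != 0.0 and knob_names[j] not in order]
        entering.sort(key=lambda j: -abs(w[j]))  # ties: larger weight first
        order.extend(knob_names[j] for j in entering)
    # Knobs that were never selected go last, in their original order.
    order.extend(name for name in knob_names if name not in order)
    return order


def make_demo_data(n=200, seed=0):
    """Synthetic data (an assumption): only knob_a and knob_b affect the metric."""
    rng = random.Random(seed)
    names = ["knob_a", "knob_b", "knob_c", "knob_d"]  # hypothetical knob names
    raw = [[rng.gauss(0, 1) for _ in names] for _ in range(n)]
    y = [3.0 * r[0] + 1.0 * r[1] + rng.gauss(0, 0.1) for r in raw]
    # Standardize each knob column and center the metric.
    means = [sum(r[j] for r in raw) / n for j in range(len(names))]
    stds = [(sum((r[j] - means[j]) ** 2 for r in raw) / n) ** 0.5
            for j in range(len(names))]
    X = [[(r[j] - means[j]) / stds[j] for j in range(len(names))] for r in raw]
    y_mean = sum(y) / n
    return X, [v - y_mean for v in y], names


if __name__ == "__main__":
    X, y, names = make_demo_data()
    print(rank_knobs(X, y, names, alphas=[2.0, 1.0, 0.5, 0.2, 0.1]))
```

    On this toy data, the knob with the largest true effect enters the model at the highest penalty, so it lands at the top of the ranking; a tuner could then restrict its search to the first few knobs in that list.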
    Online, OtterTune first tries to match the current workload to a previously observed one. The next phase is exploration/exploitation, guided by a Gaussian Process. This model is not only accurate but also provides a confidence estimate for each prediction. By adding the predicted variance to points that have not been tested yet, the model can choose greedily while still exploring unknown points. The selection is further refined by gradually scaling up the complexity of the search space, incrementally adding more tuning knobs throughout the process.
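    The exploration/exploitation idea can be sketched in plain Python as follows. This is an illustration under assumptions, not the paper's implementation: it uses a single knob, a zero-mean Gaussian Process with an RBF kernel, and an upper-confidence-bound score (predicted mean plus `kappa` times the predicted standard deviation) as a stand-in for "adding variance to untested points". The objective is assumed to be something to maximize, such as throughput, and `kappa`, `length_scale`, and `noise` are hypothetical parameters.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar knob values."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular L with L * L^T == A, for a positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x


def solve_lower_transposed(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(xs, ys, x_new, length_scale=1.0, noise=1e-6):
    """Posterior mean and variance at x_new given observations (xs, ys)."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    weights = solve_lower_transposed(L, solve_lower(L, ys))  # K^-1 y
    k_star = [rbf(x, x_new, length_scale) for x in xs]
    mean = sum(k * w for k, w in zip(k_star, weights))
    v = solve_lower(L, k_star)
    variance = max(rbf(x_new, x_new, length_scale) - sum(t * t for t in v), 0.0)
    return mean, variance


def pick_next(xs, ys, candidates, kappa=2.0, length_scale=1.0, noise=1e-6):
    """Choose the candidate with the highest mean + kappa * stddev score."""
    def score(x):
        mean, variance = gp_posterior(xs, ys, x, length_scale, noise)
        return mean + kappa * math.sqrt(variance)
    return max(candidates, key=score)


if __name__ == "__main__":
    tried = [0.0, 1.0, 2.0]      # knob values already tested
    observed = [1.0, 2.0, 0.5]   # centered metric at those values
    print(pick_next(tried, observed, candidates=[0.5, 1.0, 1.5, 5.0]))
```

    With `kappa=0` the choice is purely greedy on the predicted mean; larger values of `kappa` give more credit to candidates far from anything tested, where the predicted variance is high.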
    The authors also compare against the prior state of the art, iTuned, and find that their model reaches better configurations, faster. Overall, the pipeline is very intuitive, and the careful application of different techniques yields a substantial improvement. As a side note, it would have been very interesting to see various sensitivity studies; however, the authors clearly lacked the space to provide such detail. I have no doubts about the efficacy of the approach; I just would have liked to see more data.
