Title: Learning Representations for Hyperparameter Transfer Learning
Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization, such as hyperparameter optimization, critical in deep learning. Typically, BO relies on conventional Gaussian process regression, whose algorithmic complexity is cubic in the number of evaluations. As a result, Gaussian process-based BO cannot leverage large numbers of past function evaluations, for example, to warm-start related BO runs. After a brief intro to BO and an overview of several use cases at Amazon, I will discuss a multi-task adaptive Bayesian linear regression model, whose computational complexity is attractive (linear) in the number of function evaluations and able to leverage information of related black-box functions through a shared deep neural net. Experimental results show that the neural net learns a representation suitable for warm-starting related BO runs and that they can be accelerated when the target black-box function (e.g., validation loss) is learned together with other related signals (e.g., training loss). The proposed method was found to be at least one order of magnitude faster than competing neural net-based methods recently published in the literature.
This is joint work with Valerio Perrone, Rodolphe Jenatton, and Matthias Seeger. It will be presented at NIPS 2018.
|Date||October 18, 2018 (Thu) 10:30 - 11:30|