Automated Deep Learning: Joint Neural Architecture and Hyperparameter Search (discussions) | AISC
Toronto Deep Learning Series, 10 December 2018
Paper: https://arxiv.org/abs/1807.06906
Discussion Lead: Mark Donaldson (Ryerson University)
Discussion Facilitator: Masoud Hashemi (RBC)
Host: Shopify
Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search
While existing work on neural architecture search (NAS) tunes hyperparameters in a separate post-processing step, we demonstrate that architectural choices and other hyperparameter settings interact in a way that can render this separation suboptimal. Likewise, we demonstrate that the common practice of using very few epochs during the main NAS phase and many more epochs during post-processing is inefficient, because the relative rankings of configurations under these two training regimes are only weakly correlated. To combat both of these problems, we propose to use a recent combination of Bayesian optimization and Hyperband (BOHB) for efficient joint neural architecture and hyperparameter search.
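The "recent combination of Bayesian optimization and Hyperband" is BOHB (Falkner et al., 2018), and the joint-search idea is to put architectural choices and training hyperparameters into a single configuration space, with the epoch count as the fidelity that Hyperband varies. Below is a minimal sketch of that setup using the hpbandster and ConfigSpace packages from the same research group. The specific search space, the budget range, and the synthetic objective inside `compute` are illustrative assumptions, not the paper's actual space or training pipeline.

```python
import math

import ConfigSpace as CS
import hpbandster.core.nameserver as hpns
from hpbandster.core.worker import Worker
from hpbandster.optimizers import BOHB


def joint_search_space():
    """One ConfigSpace holding architectural AND training hyperparameters,
    so BOHB optimizes them jointly rather than in separate phases."""
    cs = CS.ConfigurationSpace()
    cs.add_hyperparameters([
        # Architectural choices (illustrative, not the paper's space)
        CS.UniformIntegerHyperparameter('n_layers', lower=2, upper=8),
        CS.UniformIntegerHyperparameter('n_units', lower=16, upper=512, log=True),
        CS.CategoricalHyperparameter('activation', ['relu', 'tanh']),
        # Training hyperparameters, searched in the same space
        CS.UniformFloatHyperparameter('lr', lower=1e-4, upper=1e-1, log=True),
        CS.UniformIntegerHyperparameter('batch_size', lower=16, upper=256, log=True),
    ])
    return cs


class JointNASWorker(Worker):
    """Evaluates one joint (architecture + hyperparameter) configuration.
    `budget` is the number of training epochs BOHB allocates to it."""

    def compute(self, config, budget, **kwargs):
        # Synthetic stand-in objective so the sketch runs without data or
        # GPUs: a fake "validation error" that depends on the configuration
        # and improves with budget. Replace with real training/evaluation.
        val_error = (abs(math.log10(config['lr']) + 2.5)   # best near lr ~ 3e-3
                     + 0.1 * abs(config['n_layers'] - 5)   # best near 5 layers
                     ) / (1.0 + 0.05 * budget)             # more epochs help
        return {'loss': float(val_error), 'info': {'epochs': int(budget)}}


if __name__ == '__main__':
    # hpbandster coordinates workers through a local nameserver.
    NS = hpns.NameServer(run_id='joint_nas', host='127.0.0.1', port=None)
    NS.start()

    worker = JointNASWorker(nameserver='127.0.0.1', run_id='joint_nas')
    worker.run(background=True)

    # Cheap low-epoch evaluations feed BOHB's model; only promising
    # configurations are promoted to the full epoch budget.
    bohb = BOHB(configspace=joint_search_space(), run_id='joint_nas',
                nameserver='127.0.0.1', min_budget=3, max_budget=81)
    result = bohb.run(n_iterations=10)

    bohb.shutdown(shutdown_workers=True)
    NS.shutdown()

    best = result.get_id2config_mapping()[result.get_incumbent_id()]['config']
    print('Best joint configuration found:', best)
```

Note how this addresses both problems from the abstract: there is no separate hyperparameter post-processing step, because architecture and training hyperparameters are sampled together, and the low-epoch runs are not a throwaway proxy but the low-fidelity signal that BOHB's Bayesian-optimization model uses to decide what to train longer.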