Ritchie Ng

Hyperparameter Optimization: The Great Conundrum

With every deep learning algorithm comes a set of hyperparameters. Optimizing them is crucial to achieving faster convergence and lower error rates. For many years, the vast majority of people in the deep learning community have relied on common heuristics to tune hyperparameters such as learning rates, decay rates and L2 regularization. In recent works, researchers have tried to cast hyperparameter optimization as a deep learning problem, but these approaches are limited by their lack of scalability. I show how it is now possible to perform scalable hyperparameter optimization that accelerates convergence and that can be trained on one problem while still enjoying the benefits of transfer learning. This has an impact at the industrial level, where deep learning algorithms can be accelerated to convergence without manual hand-tuning, even for large models.

I am currently conducting research in the fields of deep learning, computer vision and natural language processing at the NExT Search Centre, which was jointly set up by the National University of Singapore (NUS) and Tsinghua University with the support of the Media Development Authority (MDA) of Singapore. As a recipient of the Global Merit Scholarship, NUS's top full scholarship, I am able to freely explore deep learning and push the boundaries of artificial intelligence. With more than 60 machine learning and deep learning guides published online and tens of thousands of lines of code in open-source projects in Python, C++, TensorFlow, PyTorch and good old bash scripting, I continually push for greater and easier access to the benefits of deep learning.