Branco Weiss Fellow Since
2022
Research Category
Computer Science
Research Location
Helmholtz AI, Munich, Germany
Background
Machine learning, especially in the form of deep learning, has made remarkable progress in recent years and has permeated many areas of modern life. However, most current deep learning models are still quite data-hungry, which is why the most salient advances have come in areas where huge datasets are readily available, such as image recognition and language processing. This is partially due to the community’s focus on “universal” learning algorithms, which can learn from any kind of data without prior assumptions. An alternative approach is to make use of the prior knowledge that is available in many important domains, such as medical or scientific applications, which promises to make the resulting methods not only more performant but also more data-efficient. Bayesian statistics prescribes how to use prior knowledge optimally in any learning problem; integrating these ideas into deep learning models has given rise to the nascent field of Bayesian deep learning.
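To make the core idea concrete, the following toy sketch (hypothetical data and architecture, not taken from the fellowship description) shows what a prior means in Bayesian deep learning: the network’s weights are treated as random variables with a prior distribution, typically an isotropic Gaussian, and learning targets the posterior formed by combining this prior with the data likelihood.

```python
# Minimal toy sketch: a Bayesian view of a small neural network, where the
# weights get an isotropic Gaussian prior and the (unnormalized) log posterior
# is the sum of log prior and log likelihood. All sizes and data are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data.
X = rng.normal(size=(20, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)

def unpack(theta):
    """Split a flat parameter vector into the weights of a one-hidden-layer net."""
    return theta[:10].reshape(1, 10), theta[10:20], theta[20:30], theta[30]

def predict(theta, X):
    W1, b1, W2, b2 = unpack(theta)
    h = np.tanh(X @ W1 + b1)  # hidden layer with 10 units
    return h @ W2 + b2        # scalar output per input

def log_prior(theta, prior_std=1.0):
    """Isotropic Gaussian prior over all weights -- the common default choice."""
    return -0.5 * np.sum((theta / prior_std) ** 2)

def log_likelihood(theta, X, y, noise_std=0.1):
    """Gaussian observation noise around the network's predictions."""
    resid = y - predict(theta, X)
    return -0.5 * np.sum((resid / noise_std) ** 2)

def log_posterior(theta, X, y):
    """Unnormalized log posterior = log prior + log likelihood (Bayes' rule)."""
    return log_prior(theta) + log_likelihood(theta, X, y)

theta = rng.normal(size=31)  # 10 + 10 + 10 + 1 weights
print(log_posterior(theta, X, y))
```

In practice, the posterior over weights is approximated with methods such as variational inference or Markov chain Monte Carlo; the sketch only shows the quantity those methods target.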
However, previous research has shown that the priors commonly used in Bayesian deep learning are often suboptimal and can lead to various pathologies, and that better priors can be identified using model selection and meta-learning techniques. These preliminary results have strong implications for how prior knowledge from different application areas can be incorporated into modern machine learning models.
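One standard model selection route to better priors is to maximize the marginal likelihood of the data with respect to the prior’s hyperparameters (empirical Bayes). The sketch below (a hypothetical toy example, not the author’s actual method) does this in the simplest setting where the marginal likelihood is available in closed form, Bayesian linear regression, by searching over the prior variance.

```python
# Toy sketch of prior selection via the marginal likelihood ("empirical Bayes"):
# for Bayesian linear regression, y ~ N(0, prior_var * X X^T + noise_var * I),
# so we can score candidate prior variances in closed form and keep the best one.
import numpy as np

rng = np.random.default_rng(1)

# Toy data generated with a known weight scale.
n, d = 50, 5
X = rng.normal(size=(n, d))
w_true = 0.5 * rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def log_marginal_likelihood(X, y, prior_var, noise_var=0.01):
    """log p(y | X, prior_var) under a zero-mean Gaussian weight prior."""
    n = len(y)
    C = prior_var * (X @ X.T) + noise_var * np.eye(n)
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (y @ np.linalg.solve(C, y) + logdet + n * np.log(2 * np.pi))

# Grid search over candidate prior variances; the maximizer is the
# "type-II maximum likelihood" choice of prior.
grid = np.logspace(-3, 2, 50)
scores = [log_marginal_likelihood(X, y, a) for a in grid]
print("selected prior variance:", grid[int(np.argmax(scores))])
```

For deep models the marginal likelihood is intractable and must itself be approximated, which is part of what makes prior selection in Bayesian deep learning an open research question.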
Details of Research
Dr. Vincent Fortuin aims to extend his previous research on Bayesian deep learning to discover better ways of designing priors for Bayesian neural networks and deep Gaussian processes. He also plans to synthesize his research expertise in representation learning and meta-learning into a unified framework, in which reusable representations are learned on unlabeled data and then used as inputs to Bayesian models in downstream applications with small labeled datasets. Moreover, both the representation learning and the downstream predictive models can incorporate Bayesian prior knowledge, specified from existing domain expertise or meta-learned from related tasks. Taken together, this framework aims to achieve robust performance and uncertainty estimation in critical applications with small datasets, thus unlocking application areas in which deep learning has so far not been feasible and helping to deliver on the promises the machine learning community has made in recent years.
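The two-stage pattern described above can be sketched as follows (a hypothetical illustration: PCA stands in for a learned deep encoder, and a conjugate Bayesian linear model serves as the downstream predictor; none of this is taken from the actual project code).

```python
# Toy sketch of the two-stage pattern: learn a reusable representation on
# plentiful unlabeled data, then fit a Bayesian predictive model with
# uncertainty estimates on a small labeled set on top of it.
import numpy as np

rng = np.random.default_rng(2)

# Stage 1: "representation learning" on unlabeled data (PCA as a placeholder
# for a learned deep encoder).
X_unlabeled = rng.normal(size=(500, 20))
mean = X_unlabeled.mean(axis=0)
_, _, Vt = np.linalg.svd(X_unlabeled - mean, full_matrices=False)

def encode(X):
    """Project inputs onto the top-5 principal components."""
    return (X - mean) @ Vt[:5].T

# Stage 2: conjugate Bayesian linear regression on a small labeled dataset.
X_labeled = rng.normal(size=(15, 20))
y = X_labeled[:, 0] + 0.1 * rng.normal(size=15)
Z = encode(X_labeled)

prior_var, noise_var = 1.0, 0.01
A = Z.T @ Z / noise_var + np.eye(Z.shape[1]) / prior_var  # posterior precision
cov = np.linalg.inv(A)                                    # posterior covariance
w_mean = cov @ Z.T @ y / noise_var                        # posterior mean

# Predictive mean and variance for a new point: with so little labeled data,
# the predictive variance honestly reflects the remaining uncertainty.
z_new = encode(rng.normal(size=(1, 20)))
pred_mean = (z_new @ w_mean).item()
pred_var = (noise_var + z_new @ cov @ z_new.T).item()
print(pred_mean, pred_var)
```

The point of the Bayesian head is that, with only a handful of labeled examples, the predictive variance signals how little the model knows, which is exactly the behavior needed in critical applications with small datasets.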