Glen Berseth

I am an assistant professor at the Université de Montréal and Mila. My research explores how to use deep learning and reinforcement learning to develop generalist robots.

I am an assistant professor at the Université de Montréal, a core academic member of Mila - Quebec AI Institute, a CIFAR AI chair, and co-director of the Robotics and Embodied AI Lab (REAL). I was previously a postdoctoral researcher with Berkeley Artificial Intelligence Research (BAIR), working with Sergey Levine. My previous and current research has focused on solving sequential decision-making problems for real-world autonomous learning systems (robots). Specifically, my research has covered reinforcement, continual, meta-, and hierarchical learning, as well as human-robot collaboration. I have published at top venues across the disciplines of robotics, machine learning, and computer animation. Currently, I am teaching a course on robot learning at Université de Montréal and Mila that covers the most recent research on machine learning techniques for creating generalist robots.

To see a more formal biography, click here.

Interested in joining the lab?

Are you interested in the practical and theoretical challenges of creating generalist problem-solving robots? Please see this page to apply. I may not respond to emails.

News

Jan 2022: New paper accepted to ICLR on Continual Meta-Reinforcement Learning.
Sep 2021: New paper accepted to CoRL on autonomous robot learning!
Sep 2021: New paper accepted to NeurIPS on surprise minimization in partially observed environments!
Sep 2021: I will be teaching a course on deep reinforcement learning for robotics in January 2022!
May 2021: I will be joining Mila and starting as an assistant professor at the University of Montreal in August 2021!
Apr 2021: Our paper on RL for bipedal robots, to be presented at ICRA 2021, was featured in MIT Technology Review.
Mar 2021: Associate Editor for IROS 2021
Feb 2021: Two papers accepted to ICRA 2021!
Jan 2021: SMiRL: Surprise Minimizing RL in Unstable Environments receives oral presentation at ICLR 2021 (top 1.8% of submitted papers)
Jan 2021: Two papers accepted to ICLR 2021
Jan 2021: Invited talk at IJCAI workshop Neuro-Cognitive Modeling of Humans and Environments
Aug 2020: Deep Integration of Physical Humanoid Control and Crowd Navigation receives best paper runner-up at MIG 2020
Sep 2019: Visual Imitation work featured in MIT CSAIL podcast
Apr 2019: Started postdoc at the Berkeley Artificial Intelligence Research (BAIR) group, working in the Robotic AI & Learning Lab (RAIL) with Sergey Levine
Feb 2019: Defended Ph.D. Thesis at the University of British Columbia under the supervision of Michiel van de Panne
Aug 2017: SIGGRAPH paper DeepLoco: Dynamic Locomotion Skills Using Hierarchical Deep Reinforcement Learning covered in Popular Mechanics.
Mar 2017: Awarded NSERC PhD scholarship
May 2016: Modelling Dynamic Brachiation receives best poster award at Graphics Interface 2016
Mar 2016: Robust Space-Time Footsteps for Agent-Based Steering receives best short paper award at CASA 2016

Representative Publications

General task learning by inferring rewards from example data
Can we use reinforcement learning to learn general-purpose policies that can perform a wide range of different tasks, resulting in flexible and reusable skills? Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity. Categorical contexts preclude generalization to entirely new tasks. Goal-conditioned policies may enable some generalization, but cannot capture all tasks that might be desired. In this paper, we propose goal distributions as a general and broadly applicable task representation suitable for contextual policies. Goal distributions are general in the sense that they can represent any state-based reward function when equipped with an appropriate distribution class, while the particular choice of distribution class allows us to trade off expressivity and learnability. We develop an off-policy algorithm called distribution-conditioned reinforcement learning (DisCo) to efficiently learn these policies. We evaluate DisCo on a variety of robot manipulation tasks and find that it significantly outperforms prior methods on tasks that require generalization to new goal distributions.
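As a rough illustration of how a goal distribution can act as a task description, the sketch below scores a state by its log-likelihood under a diagonal Gaussian. This is a toy example under stated assumptions, not the DisCo implementation: the function name, the choice of a diagonal Gaussian, the use of log-likelihood as the reward, and the example task are all hypothetical.

```python
import math


def diag_gaussian_log_prob(state, mean, var):
    """Log-density of `state` under a diagonal Gaussian (hypothetical reward)."""
    if not (len(state) == len(mean) == len(var)):
        raise ValueError("state, mean, and var must have the same length")
    total = 0.0
    for s, m, v in zip(state, mean, var):
        if v <= 0:
            raise ValueError("variances must be positive")
        # Per-dimension Gaussian log-density, summed because dimensions are independent.
        total += -0.5 * (math.log(2.0 * math.pi * v) + (s - m) ** 2 / v)
    return total


# Hypothetical task: "reach x = 1.0, with y left essentially unconstrained".
# A small variance pins a dimension down; a large variance means "don't care".
goal_mean = [1.0, 0.0]
goal_var = [0.01, 100.0]

reward_on_target = diag_gaussian_log_prob([1.0, 5.0], goal_mean, goal_var)
reward_off_target = diag_gaussian_log_prob([0.0, 0.0], goal_mean, goal_var)

# The state with the correct x scores higher even though its y is far from 0.
print(reward_on_target > reward_off_target)
```

In this toy setting, shrinking every variance approaches an ordinary single-goal task, while widening the variance on a dimension tells the policy that dimension does not matter. That is one way to read the trade-off between expressivity and learnability described above.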
Entropy minimization for emergent behaviour
All living organisms carve out environmental niches within which they can maintain relative predictability amidst the ever-increasing entropy around them [schneider1994, friston2009]. Humans, for example, go to great lengths to shield themselves from surprise: we band together in millions to build cities with homes, supplying water, food, gas, and electricity to control the deterioration of our bodies and living spaces amidst heat and cold, wind and storm. The need to discover and maintain such surprise-free equilibria has driven great resourcefulness and skill in organisms across very diverse natural habitats. Motivated by this, we ask: could the motive of preserving order amidst chaos guide the automatic acquisition of useful behaviors in artificial agents?