
Dynamic Terrain Traversal Skills Using Reinforcement Learning

Xue Bin Peng     Glen Berseth     Michiel van de Panne
University of British Columbia

Real-time planar simulation of a dog capable of traversing terrains with gaps, walls, and steps. The control policy for this skill is computed offline using reinforcement learning. The ground markings indicate the landing points for the front and hind legs.

    In this work we construct physics-based controllers for traversing challenging terrain with highly dynamic gaits. The system is evaluated on two characters, a quadruped and a biped. We use reinforcement learning to learn good navigation strategies.

    Abstract

    The locomotion skills developed for physics-based characters most often target flat terrain. However, much of their potential lies with the creation of dynamic, momentum-based motions across more complex terrains. In this paper, we learn controllers that allow simulated characters to traverse terrains with gaps, steps, and walls using highly dynamic gaits. This is achieved using reinforcement learning, with careful attention given to the action representation, non-parametric approximation of both the value function and the policy, epsilon-greedy exploration, and the learning of a good state distance metric. The methods enable a 21-link planar dog and a 7-link planar biped to navigate challenging sequences of terrain using bounding and running gaits. We evaluate the impact of the key features of our skill learning pipeline on the resulting performance.
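    To give a feel for two of the ingredients named above, here is a minimal sketch of a non-parametric (nearest-neighbor) value estimate combined with epsilon-greedy action selection. All names (`knn_value`, `epsilon_greedy`, `step`) and the plain Euclidean metric are illustrative assumptions; the paper learns a state distance metric rather than using a fixed one.

    ```python
    import math
    import random

    def knn_value(state, samples, k=3):
        """Non-parametric value estimate: average the values of the k
        nearest stored (state, value) samples. A plain Euclidean metric
        is assumed here; the paper instead learns a distance metric."""
        nearest = sorted(samples, key=lambda sv: math.dist(state, sv[0]))[:k]
        return sum(v for _, v in nearest) / len(nearest)

    def epsilon_greedy(state, actions, step, samples, epsilon=0.1, k=3):
        """Epsilon-greedy selection over a hypothetical one-step model:
        with probability epsilon take a random action; otherwise take the
        action whose successor state has the highest estimated value."""
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: knn_value(step(state, a), samples, k))
    ```

    During learning, the exploratory actions injected by the epsilon branch generate new (state, value) samples that refine the non-parametric estimate over time.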

    Bibtex

    You can find the paper describing the project here.
    You can find the presentation for the work here (coming soon).


This video demonstrates example results from our method.

Supplementary video

Featured in SIGGRAPH 2015 - Technical Papers Trailer (@ 2:38).