Talk by Shane Gu, a Research Scientist at Google Brain: Model-Based Reinforcement Learning with Predictability Maximization

October 31, 2019 17:00

Abstract

Title: Model-Based Reinforcement Learning with Predictability Maximization

Abstract:
Intelligence is often associated with the ability to optimize the environment to maximize one’s objectives (e.g. survival). In particular, the ability to predict the future conditioned on one’s own actions enables intelligent agents to efficiently evaluate possible futures and choose the best one to realize. Such model-based reinforcement learning (RL) algorithms have recently shown promising results for sample-efficient learning in robotics and gaming RL environments. However, standard model-based approaches naively try to predict everything about the world, including noise that is neither predictable nor controllable. In this talk, I will share my recent work on temporal difference models (TDM) and dynamics-aware discovery of skills (DADS), and discuss how goal-conditioned Q-learning and empowerment (the ability to predictively change the world) relate to model-based RL and can learn abstracted Markov Decision Processes (MDPs) in which predictability is inherently maximized. I’ll show that such approaches enable successful model-based planning in difficult environments where classic model-based planners fail, while significantly outperforming model-free approaches in terms of sample efficiency. I’ll end with a discussion of how reachability and empowerment/mutual information connect to each other, and of potential directions for future research.

Bio:
Shixiang (Shane) Gu is a Research Scientist at Google Brain, where he mainly works on research problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. His recent research focuses on scalable RL methods that could solve difficult continuous control problems in the real world, and it has been covered by the Google Research blog and MIT Technology Review. He completed his PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, where he was co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During his PhD, he also interned and collaborated closely with Sergey Levine/Ilya Sutskever at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. He holds a B.A.Sc. in Engineering Science from the University of Toronto, where he did his thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. He is a Japan-born Chinese Canadian. Having lived in Japan, China, Canada, the US, the UK, and Germany, he goes by multiple names: Shane Gu, Shixiang Gu, 顾世翔, 顧世翔(ぐうせいしょう).

More Information

Date	November 13, 2019 (Wed) 15:00 - 16:30
URL	https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/99969

Venue

〒103-0027 Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo

Center for Advanced Intelligence Project
