TrustML Young Scientist Seminar #39 20221031

November 2, 2022 11:35

Description

The 39th Seminar
Date and Time: Oct. 31st, 2:00 pm – 3:00 pm (JST)
Venue: Zoom webinar
Language: English

Speaker: Tuan Dam (TU Darmstadt)
Title: A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search
Short Abstract:
Monte-Carlo Tree Search (MCTS) is a class of methods for solving complex decision-making problems through the synergy of Monte-Carlo planning and Reinforcement Learning (RL). The highly combinatorial nature of the problems commonly addressed by MCTS requires efficient exploration strategies for navigating the planning tree and value backup methods that converge quickly. These issues are particularly evident in recent advances that combine MCTS with deep neural networks for function approximation. In this work, we propose two methods for improving the convergence rate and exploration, based on a newly introduced backup operator and on entropy regularization. We provide strong theoretical guarantees that bound the convergence rate, approximation error, and regret of our methods. Moreover, we introduce a mathematical framework based on the $\alpha$-divergence for backup and exploration in MCTS. We show that this formulation unifies different approaches, including our newly introduced ones, so that different methods can be obtained by simply changing the value of $\alpha$. In practice, our unified perspective offers a flexible way to balance exploration and exploitation by tuning the single parameter $\alpha$ according to the problem at hand. We validate our methods through a rigorous empirical study ranging from basic toy problems to complex Atari games and covering both MDP and POMDP problems.


Center for Advanced Intelligence Project
