Masatoshi Uehara (Harvard University)
Efficient Offline Reinforcement Learning
Off-policy evaluation and learning, the study of novel sequential decision policies using observational data, is crucial for applications of reinforcement learning (RL) where exploration is limited, such as in medicine. However, off-policy RL is also notoriously difficult due to the curse of horizon, wherein the overlap between any policy and the observed data diminishes exponentially with trajectory length. To tackle this issue, we consider for the first time the semi-parametric efficiency limits of both policy evaluation and policy gradient estimation in a Markov decision process (MDP). Since these bounds describe the best-achievable error, they exactly characterize when the curse of horizon makes the problem intractable. In particular, our results show there is hope when one leverages Markovian and/or time-invariant structure. To capitalize on this, we propose a new off-policy evaluation estimator we call Double Reinforcement Learning (DRL), which we show is efficient under weak conditions on the estimation of nuisances using flexible machine learning methods. We show how this also translates to statistically efficient off-policy policy gradients (EOPPG), which can enable off-policy learning. We prove EOPPG enjoys a unique 3-way double robustness and that its statistical efficiency translates to strong regret bounds that eschew the curse-of-horizon issues that plague existing approaches. We demonstrate the significant benefits of our approaches in various reinforcement learning settings.
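The doubly robust idea underlying estimators like DRL can be illustrated in a minimal one-step (bandit-style) setting: combine a learned outcome model with an importance-weighted correction of its residuals. The sketch below is hedged and hypothetical; the simulated policies, reward values, and the deliberately biased outcome model `q_hat` are illustrative stand-ins, not the talk's estimator or data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Behavior policy: chooses action 1 with probability 0.3.
# Target policy: deterministically chooses action 1.
b_probs = np.array([0.7, 0.3])
a = rng.choice([0, 1], size=n, p=b_probs)

# True mean rewards per action (unknown to the estimator), plus noise.
true_q = np.array([0.0, 1.0])
r = true_q[a] + rng.normal(0.0, 0.1, size=n)

# Importance weights for the deterministic target policy pi(a=1) = 1.
w = (a == 1) / b_probs[1]

# A deliberately biased outcome model, standing in for a learned nuisance.
q_hat = np.array([0.1, 0.8])

# Doubly robust estimate: model prediction under pi,
# plus an importance-weighted correction of the model's residuals.
dr = q_hat[1] + np.mean(w * (r - q_hat[a]))
```

Even though `q_hat` is biased (it predicts 0.8 instead of the true 1.0), the weighted residual term corrects the bias, so `dr` lands near the true target-policy value of 1.0. Conversely, if the importance weights were misspecified but the outcome model were correct, the residual term would vanish in expectation; this is the double robustness that the multi-step estimators in the talk extend to sequential settings.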
Weihua Hu (Stanford University)
Advances in Graph Neural Networks: Expressive Power, Pre-training, and Open Graph Benchmark
Machine learning on graphs, especially with Graph Neural Networks (GNNs), is an emerging field of research with diverse application domains. In this talk, I will first present our theoretical and methodological advances in GNNs, analyzing the expressive power of GNNs and proposing effective pre-training strategies for them. Next, I will address the field's lack of appropriate benchmark datasets for rigorously and reliably evaluating progress. To this end, I will present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets that facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale (up to 100+ million nodes and 1+ billion edges), encompass multiple important graph ML tasks, and cover a diverse range of domains. We show that OGB datasets present significant challenges of scalability to large-scale graphs and out-of-distribution generalization under realistic data splits, indicating fruitful opportunities for future research. OGB also provides an automated end-to-end graph ML pipeline that simplifies and standardizes graph data loading, experimental setup, and model evaluation.
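One intuition behind the expressive-power analysis of GNNs is that the choice of neighborhood aggregator matters: a sum aggregator can distinguish neighborhoods (multisets of node features) that a mean aggregator collapses. The toy numpy sketch below is illustrative only; the feature vectors are made up and this is not code from the talk or from OGB.

```python
import numpy as np

# Two neighborhoods as multisets of identical one-hot node features:
# {x, x} versus {x}.
neigh_a = np.array([[1.0, 0.0], [1.0, 0.0]])
neigh_b = np.array([[1.0, 0.0]])

# Mean aggregation maps both neighborhoods to the same vector.
mean_a, mean_b = neigh_a.mean(axis=0), neigh_b.mean(axis=0)

# Sum aggregation preserves the multiset size and separates them.
sum_a, sum_b = neigh_a.sum(axis=0), neigh_b.sum(axis=0)

print(np.allclose(mean_a, mean_b))  # mean cannot tell the two apart
print(np.allclose(sum_a, sum_b))    # sum can
```

A GNN layer built on the mean aggregator would assign the two center nodes identical representations regardless of its learned weights, while a sum-based layer can separate them; this kind of argument is what bounds the discriminative power of different GNN architectures.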
|Date and time||2020/08/05 (Wed) 10:00 - 12:00|