October 6, 2020 16:23

Abstract

This is an online event. Registration is required.
https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/112800

Talk 1

Speaker: Bo Dai (Google Brain) (https://sites.google.com/site/daibohr/)

Title: Reinforcement Learning via Dual Lens

Abstract: Offline reinforcement learning (RL) aims to exploit large amounts of historical experience for future decision making. In this talk, after reviewing some basic concepts of convex duality, we summarize our recent work on how duality can be applied to a variety of offline RL problems, including policy evaluation, confidence interval estimation, policy optimization, and imitation learning. The derivations not only provide a unified treatment of and perspective on many existing methods but, more importantly, also yield a number of novel RL algorithms aimed at practical applications.
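
For readers unfamiliar with the convex duality mentioned in the abstract, the textbook definition of the Fenchel conjugate is given here as background. It is not taken from the talk itself, and the symbols f, x, and y are generic placeholders:

```latex
% Fenchel conjugate of a function f
f^{*}(y) = \sup_{x} \bigl\{ \langle x, y \rangle - f(x) \bigr\}

% For a proper, convex, lower semicontinuous f, the biconjugate recovers f
f(x) = \sup_{y} \bigl\{ \langle x, y \rangle - f^{*}(y) \bigr\}
```

The second identity is what allows a convex term in an objective to be rewritten as a maximization over an auxiliary dual variable.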

The talk is based on the following papers:
1. Ofir Nachum and Bo Dai. Reinforcement Learning via Fenchel-Rockafellar Duality.
2. Ofir Nachum, *Bo Dai, Ilya Kostrikov, Yinlam Chow, Lihong Li, Dale Schuurmans. AlgaeDICE: Policy gradient from arbitrary experience.
3. *Bo Dai, *Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvari, Dale Schuurmans. CoinDICE: Off-policy Confidence Interval Estimation, NeurIPS 2020 (Spotlight).
4. *Mengjiao Yang, *Ofir Nachum, *Bo Dai, Lihong Li, Dale Schuurmans. Off-Policy Evaluation via the Regularized Lagrangian, NeurIPS 2020.
5. *Junfeng Wen, *Bo Dai, Lihong Li, Dale Schuurmans. Batch Stationary Distribution Estimation, ICML 2020.
6. *Ruiyi Zhang, *Bo Dai, Lihong Li, Dale Schuurmans. GenDICE: Generalized Offline Estimation of Stationary Values, ICLR 2020 (Oral).
7. Ofir Nachum, *Yinlam Chow, Bo Dai, Lihong Li. DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections, NeurIPS 2019 (Spotlight).

Short Bio: Bo Dai is a research scientist at Google Brain. He obtained his Ph.D. from Georgia Tech. He received best paper awards at AISTATS 2016 and at the NIPS 2017 Workshop on Machine Learning for Molecules and Materials. His research interests lie in developing principled (deep) machine learning methods using tools from optimization, especially for reinforcement learning and representation learning for structured data, as well as their applications.

Talk 2

Speaker: Jingfeng Zhang (NUS -> RIKEN-AIP) (jingfeng.zhang9660@gmail.com)

Title: Adversarial Robustness via Adversarial Training.

Abstract:
Adversarial data, crafted by adding human-imperceptible noise to natural data, can easily fool deep models trained in the standard way. This raises security concerns in applications such as medicine, finance, and autonomous driving. Adversarial training is so far the most effective method for obtaining robustness against adversarial data.
It has been commonly believed that adversarial robustness and standard accuracy on natural data hurt each other. In this talk, we challenge this belief. First, we propose friendly adversarial training (FAT), which can improve accuracy while maintaining robustness. Second, we propose geometry-aware instance-reweighted adversarial training (GAIRAT), which can improve robustness while maintaining accuracy. Combining the two directions (i.e., FAT and GAIRAT), we improve both the robustness and the accuracy of standard adversarial training.
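
As background on what "adversarial data" and "adversarial training" mean, here is a minimal sketch of a generic one-step, sign-of-gradient attack and one training step on the perturbed input, using a toy logistic-regression model. It is not FAT or GAIRAT and is not taken from the papers below. All function names, the model, and the numeric values are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(w, b, x, y):
    """Cross-entropy loss of a linear model for one example, label y in {0, 1}."""
    p = sigmoid(x @ w + b)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def perturb(w, b, x, y, eps):
    """One-step L-infinity attack: shift x by eps along the sign of the
    gradient of the loss with respect to the input."""
    grad_x = (sigmoid(x @ w + b) - y) * w
    return x + eps * np.sign(grad_x)

def adversarial_training_step(w, b, x, y, eps, lr):
    """One gradient step on the loss evaluated at the perturbed input."""
    x_adv = perturb(w, b, x, y, eps)
    err = sigmoid(x_adv @ w + b) - y
    return w - lr * err * x_adv, b - lr * err

# Toy values (hypothetical).
w, b = np.array([1.0, -2.0]), 0.0
x, y = np.array([0.5, 0.1]), 1.0

x_adv = perturb(w, b, x, y, eps=0.1)
w_new, b_new = adversarial_training_step(w, b, x, y, eps=0.1, lr=0.1)

print("loss on natural input:  ", logistic_loss(w, b, x, y))
print("loss on perturbed input:", logistic_loss(w, b, x_adv, y))
print("loss on perturbed input after one step:", logistic_loss(w_new, b_new, x_adv, y))
```

The perturbation stays within eps of the original input in every coordinate yet increases the loss. The training step then reduces the loss on that perturbed input.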

The talk is based on the following papers:
Jingfeng Zhang, Xilie Xu, Bo Han, Gang Niu, Lizhen Cui, Masashi Sugiyama, and Mohan Kankanhalli, Attacks Which Do Not Kill Training Make Adversarial Learning Stronger, ICML 2020. Paper link https://arxiv.org/abs/2002.11242
Jingfeng Zhang, Jianing Zhu, Gang Niu, Bo Han, Masashi Sugiyama, and Mohan Kankanhalli, Geometry-aware Instance-reweighted Adversarial Training. New work. Paper link https://arxiv.org/abs/2010.01736

Short Bio: Jingfeng Zhang is currently a Ph.D. candidate at the National University of Singapore (NUS). He is expected to obtain his Ph.D. in Dec. 2020. He will join RIKEN-AIP as a postdoctoral researcher under the supervision of Prof. Masashi Sugiyama, starting in Jan. 2021. His research interests lie in robustness in machine learning and its applications.

More Information

Date October 29, 2020 (Thu) 10:00 - 12:00
URL https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/112800