Abstract
Talk by Prof. Vincent Y. F. Tan (National University of Singapore)
Title: Thompson Sampling for Cascading Bandits
Abstract:
We design and analyze TS-Cascade, a Thompson sampling algorithm for the cascading bandit problem. In TS-Cascade, Bayesian estimates of the click probability are constructed using a univariate Gaussian; this leads to a more efficient exploration procedure vis-`a-vis existing UCB-based approaches. We also incorporate the empirical variance of each item’s click probability into the Bayesian updates. These two novel features allow us to prove an expected regret bound of the form $tilde{O}(sqrt{KLT})$ where $L$ and $K$ are the number of ground items and the number of items in the chosen list respectively and $Tge L$ is the number of Thompson sampling update steps. This matches the state-of-the-art regret bounds for UCB-based algorithms. More importantly, it is the first theoretical guarantee on a Thompson sampling algorithm for any stochastic combinatorial bandit problem model with partial feedback. Empirical experiments demonstrate superiority of TS-Cascade compared to existing UCB-based procedures in terms of the expected cumulative regret and the time complexity.
This is joint work with Wang-Chi Cheung (IHPC, A*STAR) and Zixin Zhong (NUS) and the details can be found here (https://arxiv.org/abs/1810.01187).
Bio: Vincent Y. F. Tan is currently an Associate Professor in the Department of Electrical and Computer Engineering and the Department of Mathematics at the National University of Singapore (NUS). He received the B.A. and M.Eng. degrees in Electrical and Information Sciences from Cambridge University in 2005 and the Ph.D. degree in Electrical Engineering and Computer Science (EECS) from the Massachusetts Institute of Technology (MIT) in 2011. His research interests include information theory, machine learning, and statistical signal processing.
Dr. Tan received the MIT EECS Jin-Au Kong outstanding doctoral thesis prize in 2011, the NUS Young Investigator Award in 2014, the NUS Engineering Young Researcher Award in 2018, and the Singapore National Research Foundation (NRF) Fellowship (Class of 2018). He is also an IEEE Information Theory Society Distinguished Lecturer for 2018/9. He has authored a research monograph on “Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities” in the Foundations and Trends in Communications and Information Theory Series (NOW Publishers). He is currently serving as an Associate Editor of the IEEE Transactions on Signal Processing.
More Information
Date | November 21, 2018 (Wed) 13:00 - 14:30 |
URL | https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/81365 |