Branislav Kveton

I am a Principal Scientist at AWS AI Labs. I was at Google Research from 2018 to 2021, at Adobe Research from 2014 to 2018, at Technicolor’s Research Center from 2011 to 2014, and at Intel Research from 2006 to 2011. Before 2006, I was a graduate student in the Intelligent Systems Program at the University of Pittsburgh. My advisor was Milos Hauskrecht. My e-mail is bkveton@amazon.com.

I propose, analyze, and apply algorithms that learn incrementally, run in real time, and converge to near-optimal solutions as the number of observations increases. Most of my recent work focuses on designing bandit algorithms for structured real-world problems.

The exploration-exploitation trade-off is a fundamental problem in online learning: an agent must choose between exploration actions, which help it learn a better model, and exploitation actions, which leverage the model it already has. The multi-armed bandit has become the de facto standard framework for studying this problem. In a multi-armed bandit, the actions of the agent are arms. The arms can be treatments in a clinical trial or ads on a website. After an arm is pulled, the agent receives its reward. The agent aims to maximize its cumulative reward. It does not know the mean rewards of the arms a priori and thus must decide whether to explore, and learn more about the arms, or exploit, and pull the arm with the highest estimated reward thus far.
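The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration, not an algorithm from any of the papers below: it uses UCB1, a standard textbook bandit algorithm, and assumes Bernoulli rewards. The function name `ucb1`, its parameters, and the example mean rewards are hypothetical choices made for illustration.

```python
import math
import random


def ucb1(means, horizon, seed=0):
    """Simulate UCB1 on Bernoulli arms (illustrative sketch).

    means:   true mean reward of each arm; hidden from the agent and
             used only to simulate rewards (an assumption of this demo)
    horizon: total number of arm pulls
    Returns the number of pulls per arm and the cumulative reward.
    """
    rng = random.Random(seed)
    num_arms = len(means)
    pulls = [0] * num_arms      # how many times each arm was pulled
    totals = [0.0] * num_arms   # sum of rewards observed for each arm
    cumulative_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= num_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Estimated mean reward plus an exploration bonus that
            # shrinks as an arm is pulled more often.
            def index(i):
                bonus = math.sqrt(2.0 * math.log(t) / pulls[i])
                return totals[i] / pulls[i] + bonus

            arm = max(range(num_arms), key=index)

        # The agent observes only the reward of the pulled arm.
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
        cumulative_reward += reward

    return pulls, cumulative_reward


if __name__ == "__main__":
    pulls, total = ucb1([0.2, 0.5, 0.8], horizon=5000)
    print("pulls per arm:", pulls)
    print("cumulative reward:", total)
```

The bonus term makes rarely pulled arms look optimistic, so the agent explores them; as the estimates sharpen, the arm with the highest mean reward should receive most of the pulls.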

I made several fundamental contributions to the field of multi-armed bandits. My earlier work focused on structured bandit problems with graphs, submodularity, semi-bandit feedback, and low-rank matrices. This culminated in my work on online learning to rank, where we design bandit algorithms that handle both combinatorial action sets and partial feedback. These algorithms are simple, theoretically sound, robust, and remain the state of the art. My recent work focuses on making bandit algorithms practical. This involves follow-the-perturbed-leader exploration, which can be analyzed up to generalized linear bandits and applied to neural networks; latent bandits, which can be combined with offline graphical models; and even learning of bandit algorithms from logged data.

Some of my recent papers are below.

2024

Aadirupa Saha and Branislav Kveton. Only Pay for What Is Uncertain: Variance-Adaptive Thompson Sampling. In Proceedings of the 12th International Conference on Learning Representations, 2024.

Shima Alizadeh, Aniruddha Bhargava, Karthick Gopalswamy, Lalit Jain, Branislav Kveton, and Ge Liu. Pessimistic Off-Policy Multi-Objective Optimization. In Proceedings of the 27th International Conference on Artificial Intelligence and Statistics, 2024.

Ziqian Lin, Hao Ding, Nghia Hoang, Branislav Kveton, Anoop Deoras, and Hao Wang. Pre-trained Recommender Systems: A Causal Debiasing Perspective. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024.

Behnam Rahdari, Hao Ding, Ziwei Fan, Yifei Ma, Zhuotong Chen, Anoop Deoras, and Branislav Kveton. Logic-Scaffolding: Personalized Aspect-instructed Recommendation Explanation Generation using LLMs. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024.

2023

Alexia Atsidakou, Branislav Kveton, Sumeet Katariya, Constantine Caramanis, and Sujay Sanghavi. Finite-Time Logarithmic Bayes Regret Upper Bounds. In Advances in Neural Information Processing Systems 36, 2023.

Branislav Kveton, Yi Liu, Johan Matteo Kruijssen, and Yisu Nie. Non-Compliant Bandits. In Proceedings of the 32nd International Conference on Information and Knowledge Management, 2023.

Hao Ding, Branislav Kveton, Yifei Ma, Youngsuk Park, Venkataramana Kini, Yupeng Gu, Ravi Divvela, Fei Wang, Anoop Deoras, and Hao Wang. Trending Now: Modeling Trend Recommendations. In Proceedings of the 17th ACM Conference on Recommender Systems, 2023.

Anusha Lalitha, Kousha Kalantari, Yifei Ma, Anoop Deoras, and Branislav Kveton. Fixed-Budget Best-Arm Identification with Heterogeneous Reward Variances. In Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence, 2023.

Joey Hong, Branislav Kveton, Manzil Zaheer, Sumeet Katariya, and Mohammad Ghavamzadeh. Multi-Task Off-Policy Learning from Bandit Feedback. In Proceedings of the 40th International Conference on Machine Learning, 2023.

Yu-Guan Hsieh, Shiva Kasiviswanathan, Branislav Kveton, and Patrick Blobaum. Thompson Sampling with Diffusion Generative Prior. In Proceedings of the 40th International Conference on Machine Learning, 2023.

Runzhe Wan, Haoyu Wei, Branislav Kveton, and Rui Song. Multiplier Bootstrap-Based Exploration. In Proceedings of the 40th International Conference on Machine Learning, 2023.

Imad Aouali, Branislav Kveton, and Sumeet Katariya. Mixed-Effect Thompson Sampling. In Proceedings of the 26th International Conference on Artificial Intelligence and Statistics, 2023.

Javad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, and Sumeet Katariya. Meta-Learning for Simple Regret Minimization. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, 2023.

2022

Yu-Guan Hsieh, Shiva Kasiviswanathan, and Branislav Kveton. Uplifting Bandits. In Advances in Neural Information Processing Systems 35, 2022.

Branislav Kveton, Muhammad Jehangir Amjad, Christophe Diot, Dimitris Konomis, Augustin Soule, and Xiaolong Yang. Optimal Probing with Statistical Guarantees for Network Monitoring at Scale. Computer Communications, 192:119-131, 2022.

MohammadJavad Azizi, Branislav Kveton, and Mohammad Ghavamzadeh. Fixed-Budget Best-Arm Identification in Structured Bandits. In Proceedings of the 31st International Joint Conference on Artificial Intelligence, 2022.

Nan Wang, Hongning Wang, Maryam Karimzadehgan, Branislav Kveton, and Craig Boutilier. IMO3: Interactive Multi-Objective Off-Policy Optimization. In Proceedings of the 31st International Joint Conference on Artificial Intelligence, 2022.

Runzhe Wan, Branislav Kveton, and Rui Song. Safe Exploration for Efficient Policy Evaluation and Comparison. In Proceedings of the 39th International Conference on Machine Learning, 2022.

Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, and Mohammad Ghavamzadeh. Deep Hierarchy in Bandits. In Proceedings of the 39th International Conference on Machine Learning, 2022.

Behnam Rahdari, Branislav Kveton, and Peter Brusilovsky. The Magic of Carousels: Single vs. Multi-List Recommender Systems. In Proceedings of the 33rd ACM Conference on Hypertext and Social Media, 2022.

Ruihao Zhu and Branislav Kveton. Safe Optimal Design with Applications in Off-Policy Learning. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, 2022.

Rong Zhu and Branislav Kveton. Random Effect Bandits. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, 2022.

Branislav Kveton, Ofer Meshi, Masrour Zoghi, and Zhen Qin. On the Value of Prior in Online Learning to Rank. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, 2022.

Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, and Craig Boutilier. Thompson Sampling with a Mixture Prior. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, 2022.

Joey Hong, Branislav Kveton, Manzil Zaheer, and Mohammad Ghavamzadeh. Hierarchical Bayesian Bandits. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, 2022.

2021

Soumya Basu, Branislav Kveton, Manzil Zaheer, and Csaba Szepesvari. No Regrets for Learning the Prior in Bandits. In Advances in Neural Information Processing Systems 34, 2021.

Nan Wang, Branislav Kveton, and Maryam Karimzadehgan. CORe: Capitalizing On Rewards in Bandit Exploration. In Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence, 2021.

Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, and Csaba Szepesvari. Meta-Thompson Sampling. In Proceedings of the 38th International Conference on Machine Learning, 2021.

Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, and Amr Ahmed. Non-Stationary Off-Policy Optimization. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021.

2020

Craig Boutilier, Chih-wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, and Manzil Zaheer. Differentiable Meta-Learning of Bandit Policies. In Advances in Neural Information Processing Systems 33, 2020.

Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, and Craig Boutilier. Latent Bandits Revisited. In Advances in Neural Information Processing Systems 33, 2020.

Tong Yu, Branislav Kveton, Zheng Wen, Ruiyi Zhang, and Ole Mengshoel. Graphical Models Meet Bandits: A Variational Thompson Sampling Approach. In Proceedings of the 37th International Conference on Machine Learning, 2020.

Branislav Kveton, Manzil Zaheer, Csaba Szepesvari, Lihong Li, Mohammad Ghavamzadeh, and Craig Boutilier. Randomized Exploration in Generalized Linear Bandits. In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020.

Sharan Vaswani, Abbas Mehrabian, Audrey Durand, and Branislav Kveton. Old Dog Learns New Tricks: Randomized UCB for Bandit Problems. In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020.

2019

Branislav Kveton, Csaba Szepesvari, Mohammad Ghavamzadeh, and Craig Boutilier. Perturbed-History Exploration in Stochastic Multi-Armed Bandits. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, 2019.

Chang Li, Branislav Kveton, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvari, and Masrour Zoghi. BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback. In Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence, 2019.

Branislav Kveton, Csaba Szepesvari, Mohammad Ghavamzadeh, and Craig Boutilier. Perturbed-History Exploration in Stochastic Linear Bandits. In Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence, 2019.

Gaurush Hiranandani, Harvineet Singh, Prakhar Gupta, Iftikhar Burhanuddin, Zheng Wen, and Branislav Kveton. Cascading Linear Submodular Bandits: Accounting for Position Bias and Diversity in Online Learning to Rank. In Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence, 2019.

Branislav Kveton, Csaba Szepesvari, Sharan Vaswani, Zheng Wen, Tor Lattimore, and Mohammad Ghavamzadeh. Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits. In Proceedings of the 36th International Conference on Machine Learning, 2019.

Yang Cao, Zheng Wen, Branislav Kveton, and Yao Xie. Nearly Optimal Adaptive Procedure with Change Detection for Piecewise-Stationary Bandit. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019.

Sumeet Katariya, Branislav Kveton, Zheng Wen, and Vamsi Potluru. Conservative Exploration using Interleaving. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019.

Thanh Tan Nguyen, Ali Shameli, Yasin Abbasi-Yadkori, Anup Rao, and Branislav Kveton. Sample Efficient Graph-Based Optimization with Noisy Observations. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019.

2018

Tor Lattimore, Branislav Kveton, Shuai Li, and Csaba Szepesvari. TopRank: A Practical Algorithm for Online Stochastic Ranking. In Advances in Neural Information Processing Systems 31, 2018.

Charles Chen, Sungchul Kim, Hung Bui, Ryan Rossi, Eunyee Koh, Branislav Kveton, and Razvan Bunescu. Predictive Analysis by Leveraging Temporal User Behavior and User Embeddings. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018.

Xiuyuan Lu, Zheng Wen, and Branislav Kveton. Efficient Online Recommendation via Low-Rank Ensemble Sampling. In Proceedings of the 12th ACM Conference on Recommender Systems, 2018.

Tong Yu, Branislav Kveton, Zheng Wen, Hung Bui, and Ole Mengshoel. SpectralLeader: Online Spectral Learning for Single Topic Models. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, 2018.

Shuai Li, Yasin Abbasi-Yadkori, Branislav Kveton, S. Muthukrishnan, Vishwa Vinay, and Zheng Wen. Offline Evaluation of Ranking Policies with Click Models. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2018.

Branislav Kveton, S. Muthukrishnan, Hoa Vu, and Yikun Xian. Finding Subcube Heavy Hitters in Analytics Data Streams. In Proceedings of the 2018 Web Conference, 2018.

2017

Zheng Wen, Branislav Kveton, Michal Valko, and Sharan Vaswani. Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback. In Advances in Neural Information Processing Systems 30, 2017.

Tong Yu, Branislav Kveton, and Ole Mengshoel. Thompson Sampling for Optimizing Stochastic Local Search. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, 2017.

Sumeet Katariya, Branislav Kveton, Csaba Szepesvari, Claire Vernade, and Zheng Wen. Bernoulli Rank-1 Bandits for Click Feedback. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, 2017.

Sharan Vaswani, Branislav Kveton, Zheng Wen, Mohammad Ghavamzadeh, Laks Lakshmanan, and Mark Schmidt. Model-Independent Online Learning for Influence Maximization. In Proceedings of the 34th International Conference on Machine Learning, 2017.

Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvari, Tomas Tunys, Zheng Wen, and Masrour Zoghi. Online Learning to Rank in Stochastic Click Models. In Proceedings of the 34th International Conference on Machine Learning, 2017.

Shi Zong, Branislav Kveton, Shlomo Berkovsky, Azin Ashkan, and Zheng Wen. Get to the Bottom: Causal Analysis for User Modeling. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization, 2017.

Sumeet Katariya, Branislav Kveton, Csaba Szepesvari, Claire Vernade, and Zheng Wen. Stochastic Rank-1 Bandits. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017.

Shi Zong, Branislav Kveton, Shlomo Berkovsky, Azin Ashkan, Nikos Vlassis, and Zheng Wen. Does Weather Matter? Causal Analysis of TV Logs. In Proceedings of the 26th International World Wide Web Conference, 2017.

2016

Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, and Siqi Sun. Graphical Model Sketch. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, 2016.

Branislav Kveton and Shlomo Berkovsky. Minimal Interaction Content Discovery in Recommender Systems. ACM Transactions on Interactive Intelligent Systems 6, pages 15:1-15:25, 2016.

Suvash Sedhain, Hung Bui, Jaya Kawale, Nikos Vlassis, Branislav Kveton, Aditya Menon, Trung Bui, and Scott Sanner. Practical Linear Models for Large-Scale One-Class Collaborative Filtering. In Proceedings of the 25th International Joint Conference on Artificial Intelligence, 2016.

Shi Zong, Hao Ni, Kenny Sung, Nan Rosemary Ke, Zheng Wen, and Branislav Kveton. Cascading Bandits for Large-Scale Recommendation Problems. In Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence, 2016.

Sumeet Katariya, Branislav Kveton, Csaba Szepesvari, and Zheng Wen. DCM Bandits: Learning to Rank with Multiple Clicks. In Proceedings of the 33rd International Conference on Machine Learning, 2016.

2015

Branislav Kveton, Zheng Wen, Azin Ashkan, and Csaba Szepesvari. Combinatorial Cascading Bandits. In Advances in Neural Information Processing Systems 28, 2015.

Jaya Kawale, Hung Bui, Branislav Kveton, Long Tran-Thanh, and Sanjay Chawla. Efficient Thompson Sampling for Online Matrix-Factorization Recommendation. In Advances in Neural Information Processing Systems 28, 2015.

Branislav Kveton, Csaba Szepesvari, Zheng Wen, and Azin Ashkan. Cascading Bandits: Learning to Rank in the Cascade Model. In Proceedings of the 32nd International Conference on Machine Learning, 2015.

Zheng Wen, Branislav Kveton, and Azin Ashkan. Efficient Learning in Large-Scale Combinatorial Semi-Bandits. In Proceedings of the 32nd International Conference on Machine Learning, 2015.

Azin Ashkan, Branislav Kveton, Shlomo Berkovsky, and Zheng Wen. Optimal Greedy Diversity for Recommendation. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, 2015.

Branislav Kveton, Zheng Wen, Azin Ashkan, and Csaba Szepesvari. Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits. In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, 2015.

Branislav Kveton and Shlomo Berkovsky. Minimal Interaction Search in Recommender Systems. In Proceedings of the 20th ACM Conference on Intelligent User Interfaces, 2015.

2014

Victor Gabillon, Branislav Kveton, Zheng Wen, Brian Eriksson, and S. Muthukrishnan. Large-Scale Optimistic Adaptive Submodularity. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, 2014.

Branislav Kveton, Zheng Wen, Azin Ashkan, and Hoda Eydgahi. Matroid Bandits: Practical Large-Scale Combinatorial Bandits. In Proceedings of AAAI Workshop on Sequential Decision-Making with Big Data, 2014.

Tomas Kocak, Michal Valko, Remi Munos, Branislav Kveton, and Shipra Agrawal. Spectral Bandits for Smooth Graph Functions with Applications in Recommender Systems. In Proceedings of AAAI Workshop on Sequential Decision-Making with Big Data, 2014.

Branislav Kveton, Zheng Wen, Azin Ashkan, Hoda Eydgahi, and Brian Eriksson. Matroid Bandits: Fast Combinatorial Optimization with Learning. In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, 2014.

Salman Salamatian, Nadia Fawaz, Branislav Kveton, and Nina Taft. SPPM: Sparse Privacy Preserving Mappings. In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, 2014.

Michal Valko, Remi Munos, Branislav Kveton, and Tomas Kocak. Spectral Bandits for Smooth Graph Functions. In Proceedings of the 31st International Conference on Machine Learning, 2014.

Before 2014

Please see the list of my publications on Google Scholar.