Multi-Armed Bandits: Combinatorial/Submodular/Gaussian:
Exploiting structure in the data across multiple dimensions leads to efficient techniques for adaptive sampling and fingerprinting. The accompanying figure shows how different network service attributes depend on time, spatial location, and data-service quality measurements, motivating a tensor structure. Further, the map shows that indoor received signal strength has spatial dependence and that signals from different APs are correlated; adaptive sampling can therefore reduce the number of fingerprints needed for localization.
- Mridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn, and Abhishek Umrawal, "Stochastic Top K-Subset Bandits with Linear Space and Non-Linear Feedback with Applications to Social Influence Maximization," Accepted to ACM/IMS Transactions on Data Science, Dec 2021.
- Mridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn, and Abhishek Umrawal, "DART: aDaptive Accept RejecT for non-linear top-K subset identification," in Proc. AAAI, Feb 2021 (21% acceptance rate, 1692/7911).
- Mridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn, and Abhishek Umrawal, "Stochastic Combinatorial Bandits with Linear Space and Non-Linear Feedback," in Proc. ALT, Mar 2021 (PMLR 132:306-339, 2021.) (29.3% acceptance rate, 46/157).
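A minimal sketch of the linear-space idea behind top-K bandits: keep only one count and one running mean per arm (O(N) statistics rather than per-subset statistics), and each round play the K arms with the highest UCB index. For simplicity this sketch assumes Bernoulli arms and semi-bandit feedback (one observation per played arm); it is an illustrative baseline, not the DART algorithm or the non-linear full-bandit feedback setting studied in the papers above. The function name `topk_ucb` and all parameters are ours.

```python
import numpy as np

def topk_ucb(arm_means, K, T=2000, rng=None):
    # Per-arm UCB with semi-bandit feedback: each round play the K arms
    # with the highest index and observe one Bernoulli sample per played arm.
    # Linear O(N) space: one pull count and one running mean per arm.
    rng = rng or np.random.default_rng(0)
    p = np.asarray(arm_means, dtype=float)
    N = len(p)
    counts = np.ones(N)
    means = rng.binomial(1, p).astype(float)  # initialize: pull each arm once
    t = N
    for _ in range(T):
        t += 1
        ucb = means + np.sqrt(2.0 * np.log(t) / counts)
        S = np.argpartition(-ucb, K)[:K]          # K arms with largest index
        r = rng.binomial(1, p[S])                 # semi-bandit feedback
        means[S] += (r - means[S]) / (counts[S] + 1)  # incremental mean update
        counts[S] += 1
    # Return the estimated best K-subset.
    return set(int(i) for i in np.argpartition(-means, K)[:K])
```

With well-separated arms, e.g. `topk_ucb([0.9, 0.8, 0.5, 0.3, 0.2], K=2)`, the routine concentrates its pulls on the two best arms and returns their indices.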
Gaussian Process Bandits:
Bayesian optimization is a framework for global search via maximum a posteriori updates rather
than simulated annealing, and has gained prominence for decision-making under uncertainty. In
this work, we cast Bayesian optimization as a multi-armed bandit problem, where the payoff
function is sampled from a Gaussian process (GP). Further, we focus on action selections via upper
confidence bound (UCB) or expected improvement (EI) due to their prevalent use in practice. We derive sublinear regret bounds for GP bandit algorithms, up to factors depending on the
compression parameter, for both discrete and continuous action sets. Experimentally, we observe state-of-the-art accuracy and complexity trade-offs for GP bandit algorithms applied to global optimization, suggesting the merits of compressed GPs in bandit settings.
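The GP-UCB selection rule described above can be sketched as follows: maintain a GP posterior over the payoff function and repeatedly play the candidate action maximizing posterior mean plus a scaled posterior standard deviation. This is a minimal uncompressed illustration of the UCB acquisition rule, not the compressed-GP algorithms of the papers below; the RBF lengthscale, `beta`, noise level, and test function are all our illustrative choices.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-2):
    # GP posterior mean and std at candidates Xs given observations (X, y),
    # computed via a Cholesky factorization of the kernel matrix.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mu, np.sqrt(var)

def gp_ucb(f, candidates, T=30, beta=2.0, noise=1e-2, rng=None):
    # GP-UCB over a discrete action set: play argmax of mean + beta * std.
    rng = rng or np.random.default_rng(0)
    X = np.array([rng.choice(candidates)])            # one random initial arm
    y = np.array([f(X[0]) + noise * rng.standard_normal()])
    for _ in range(T - 1):
        mu, sd = gp_posterior(X, y, candidates, noise)
        x = candidates[np.argmax(mu + beta * sd)]     # UCB acquisition
        X = np.append(X, x)
        y = np.append(y, f(x) + noise * rng.standard_normal())
    return X, y
```

Running `gp_ucb` on a grid over [0, 1] with a smooth payoff such as `sin(3*pi*x) * exp(-x)` drives the queries toward the function's maximizer after a brief exploration phase.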
- Amrit Singh Bedi, Dheeraj Peddireddy, Vaneet Aggarwal, and Alec Koppel, "Sublinear Regret and Belief Complexity in Gaussian Process Bandits via Information Thresholding," Submitted to JMLR, Dec 2020 (under revision).
- Amrit Singh Bedi, Dheeraj Peddireddy, Vaneet Aggarwal, and Alec Koppel, "Efficient Large-Scale Gaussian Process Bandits by Believing only Informative Actions," in Proc. L4DC, Jun 2020.