Fanglue Subject Navigation (方略学科导航)

Search results: 1-10 of 10 records found for "Statistics Bandits". Query time: 0.062 seconds.

Thompson Sampling for Contextual Bandits with Linear Payoffs Thompson Sampling Contextual Bandits Linear Payoffs 2012/11/23

Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several s...

Regret Bounds for Restless Markov Bandits Regret Bounds Restless Markov Bandits 2012/11/23

We consider the restless Markov bandit problem, in which the state of each arm evolves according to a Markov process independently of the learner's actions. We suggest an algorithm that after $T$ step...

From Bandits to Experts: On the Value of Side-Observations Shie Mannor, Ohad Shamir 2011/7/6

We consider an adversarial online learning setting where a decision maker can choose an action in every stage of the game. In addition to observing the reward of the chosen action, the decision maker ...

Efficient Optimal Learning for Contextual Bandits Efficient Optimal Learning Contextual Bandits 2011/7/6

We address the problem of learning in an online setting where the learner repeatedly observes features, selects among a set of actions, and receives reward for the action taken.

A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences Finite-Time Multi-armed Bandits Problems Kullback-Leibler Divergences 2011/6/20

We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite supports (not necessarily known beforehand), whose asymptotic r...

Lipschitz Bandits without the Lipschitz Constant Lipschitz Bandits Constant strategy environments 2011/6/20

We consider the setting of stochastic bandit problems with a continuum of arms. We first point out that the strategies considered so far in the literature only provided theoretical guarantees of the...

PAC-Bayesian Analysis of Martingales and Multiarmed Bandits PAC-Bayesian Analysis Martingales Multiarmed Bandits 2011/6/21

We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent random variables. The first is based on a new lemma that makes it possible to bound expectations of convex functions of...

The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond Stochastic Bandits Beyond KL-UCB 2011/3/21

This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems. We prove two distinct results: first, for arbitrary bounded rew...

Nonparametric Bandits with Covariates Bandit regression regret inferior sampling rate minimax rate 2010/3/11

We consider a bandit problem which involves sequential sampling from two populations (arms). Each arm produces a noisy reward realization which depends on an observable random covariate. The goal is...

X-Armed Bandits X-Armed Bandits stochastic bandits 2010/3/9

We consider a generalization of stochastic bandits where the set of arms, X, is allowed to be a generic measurable space and the mean-payoff function is “locally Lipschitz” with respect to a dissimi...
