Risk-Sensitive Online Learning (2009)
Eyal Even-dar, Michael Kearns, Jennifer Wortman
Abstract. We consider the problem of online learning in settings in which we want to compete not simply with the rewards of the best expert or stock, but with the best trade-off between rewards and...
Bid Optimization in Broad-Match Ad auctions (2009)
Even-dar, Eyal, Mansour, Yishay, Mirrokni, Vahab, Muthukrishnan, S., Nadav, Uri
Ad auctions in sponsored search support ``broad match'' that allows an advertiser to target a large number of queries while bidding only on a limited number. While giving more expressiveness to...
On-line Markov Decision Processes (2008)
Eyal Even-dar, Sham M. Kakade, Yishay Mansour
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts...
The Value of Observation for Monitoring Dynamic Systems (2008)
We consider the fundamental problem of monitoring (i.e. tracking) the belief state in a dynamic system, when the model is only approximately correct and when the initial belief state might be...
Approximate Equivalence of Markov Decision Processes (2008)
Abstract. We consider the problem of finding the minimal s^-equivalent MDP for an MDP given in its tabular form. We show that the problem is NP-Hard and then give a bicriteria approximation algorithm...
Convergence of Optimistic and Incremental Q-Learning (2008)
Abstract We show the convergence of two deterministic variants of Q-learning. The first is the widely used optimistic Q-learning, which initializes the Q-values to large initial values and then...
Regret to the Best vs. Regret to the Average (2008)
Eyal Even-dar, Michael Kearns, Yishay Mansour, Jennifer Wortman
Abstract. We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all...
Eyal Even-dar, Sham M. Kakade, Michael Kearns, Yishay Mansour
We study the stability properties of the dynamics of the standard continuous limit-order mechanism that is used in modern equity markets. We ask whether such mechanisms are susceptible to...
Regret to the Best vs. Regret to the Average (2008)
Eyal Even-dar, Michael Kearns, Yishay Mansour, Jennifer Wortman
Abstract. We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all...
Convergence Time to Nash Equilibrium in Load Balancing (2008)
We study the number of steps required to reach a pure Nash equilibrium in a load balancing scenario where each job behaves selfishly and attempts to migrate to a machine which will minimize its cost....
Eyal Even-dar, Sham M. Kakade, Yishay Mansour
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts...
A Small World Threshold for Economic Network Formation (2008)
We introduce a game-theoretic model for network formation inspired by earlier stochastic models that mix localized and long-distance connectivity. In this model, players may purchase edges at...
Eyal Even-dar, Sham M. Kakade, Yishay Mansour
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts...
We introduce a game-theoretic model for network formation inspired by earlier stochastic models that mix localized and long-distance connectivity. In this model, players may purchase edges at...
Regret to the Best vs. Regret to the Average (2008)
Eyal Even-dar, Michael Kearns, Yishay Mansour, Jennifer Wortman
Abstract. We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all...
The Value of Observation for Monitoring Dynamic Systems (2008)
We consider the fundamental problem of monitoring (i.e. tracking) the belief state in a dynamic system, when the model is only approximately correct and when the initial belief state might be...
A Note on Maximizing the Spread of Influence in Social Networks (2008)
Abstract. We consider the spread maximization problem that was defined by Domingos and Richardson [7, 22]. In this problem, we are given a social network represented as a graph and are required to...
Risk-Sensitive Online Learning (2008)
Eyal Even-dar, Michael Kearns, Jennifer Wortman
Abstract. We consider the problem of online learning in settings in which we want to compete not simply with the rewards of the best expert or stock, but with the best trade-off between rewards and...
We introduce a game-theoretic model for network formation inspired by earlier stochastic models that mix localized and long-distance connectivity. In this model, players may purchase edges at...
Regret to the Best vs. Regret to the Average (2008)
Eyal Even-dar, Michael Kearns, Yishay Mansour, Jennifer Wortman
Abstract. We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all...
Regret to the Best vs. Regret to the Average (2008)
Eyal Even-dar, Michael Kearns, Yishay Mansour, Jennifer Wortman
Abstract. We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all...
Eyal Even-dar, Sham M. Kakade, Michael Kearns, Yishay Mansour
We study the stability properties of the dynamics of the standard continuous limit-order mechanism that is used in modern equity markets. We ask whether such mechanisms are susceptible to...
Convergence of Optimistic and Incremental Q-Learning (2007)
We show the convergence of two deterministic variants of Qlearning. The rst is the widely used optimistic Q-learning, which initializes the Q-values to large initial values and then follows a greedy...
Regret to the Best vs. Regret to the average. (2007)
Even-Dar, Eyal, Kearns, Michael, Mansour, Yishay, Wortman, Jenn
We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all experts, the...
Sponsored search with contexts (2007)
Eyal Even-dar, Michael Kearns, Jennifer Wortman
We examine a formal model of sponsored search in which advertisers can bid not only on search terms, but on search terms under specific contexts. A context is any auxiliary information that might...
Asymptotic active learning (2007)
Maria-florina Balcan, Eyal Even-dar, Steve Hanneke, Michael Kearns, Yishay Mansour, Jennifer Wortman
We describe and analyze a PAC-asymptotic model for active learning. We show that in many cases where it has traditionally been believed that active learning does not help, active learning does help...
Asymptotic active learning (2007)
Maria-florina Balcan, Eyal Even-dar, Steve Hanneke, Michael Kearns, Yishay Mansour, Jennifer Wortman
We describe and analyze a PAC-asymptotic model for active learning. We show that in many cases where it has traditionally been believed that active learning does not help, active learning does help...
Asymptotic active learning (2007)
Maria-florina Balcan, Eyal Even-dar, Steve Hanneke, Michael Kearns, Yishay Mansour, Jennifer Wortman
We describe and analyze a PAC-asymptotic model for active learning. We show that in many cases where it has traditionally been believed that active learning does not help, active learning does help...
A Network Formation Game for Bipartite Exchange Economies (2007)
Eyal Even-dar, Michael Kearns, Siddharth Suri
Abstract We introduce a natural new network formation gamein which buyers and sellers may purchase edges representing trading opportunities between themselves, andthen accrue wealth in the resulting...
Sponsored search with contexts (2007)
Eyal Even-dar, Michael Kearns, Jennifer Wortman
We examine a formal model of sponsored search in which advertisers can bid not only on search terms, but on search terms under specific contexts. A context is any auxiliary information that might...
A Network Formation Game for Bipartite Exchange Economies (2007)
Eyal Even-dar, Michael Kearns, Siddharth Suri
We introduce a natural new network formation game in which buyers and sellers may purchase edges representing trading opportunities between themselves, and then accrue wealth in the resulting...
The Value of Observation for Monitoring Dynamic Systems (2006)
Even-Dar, Eyal, Kakade, Sham, Mansour, Yishay
We consider the fundamental problem of tracking the belief state in a POMDP, when the model is only approximately correct and the initial belief state might be unknown. In this general setting where...
(In)Stability Properties of Limit Order Dynamics (2006)
Even-Dar, Eyal, Kakade, Sham, Kearns, Mchael, Mansour, Yishay
We study the stability properties of the dynamics of the standard continuous limit-order mechanism that is used in modern equity markets. We ask whether such mechanisms are susceptible to ``butterfly...
On Nash Equilibria for a Network Creation Game (2006)
Albers, Susanne, Eilts, Stefan, Even-Dar, Eyal, Mansour, Yishay, Roditty, Liam
We study a network creation game recently proposed by Fabrikant, Luthra, Maneva, Papadimitriou and Shenker. In this game, each player (vertex) can create links (edges) to other players at a cost of...
On Nash equilibria for a network creation game (2006)
Susanne Albers, Stefan Eilts, Eyal Even-dar, Yishay Mansour, Liam Roditty
We study a network creation game recently proposed by Fabrikant, Luthra, Maneva, Papadimitriou and Shenker. In this game, each player (vertex) can create links (edges) to other players at a cost of...
Avrim Blum, Eyal Even-dar, Katrina Ligett
Abstract There has been substantial work developing simple, efficient no-regret algorithms for a wideclass of repeated decision-making problems including online routing. These are adaptive strategies...
Routing Without Regret: On Convergence To Nash . . . (2006)
Avrim Blum, Eyal Even-Dar, Katrina Ligett
There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems including online routing. These are adaptive strategies an...
Eyal Even-dar, Shie Mannor, Yishay Mansour, Sridhar Mahadevan
We incorporate statistical confidence intervals in both the multi-armed bandit and the reinforcement learning problems. In the bandit problem we show that given n arms, it suffices to pull the arms a...
Avrim Blum, Eyal Even-dar, Katrina Ligett
Abstract. There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems including online routing. These are adaptive...
Avrim Blum, Eyal Even-dar, Katrina Ligett
There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems including online routing. These are adaptive strategies an...
in)stability properties of limit order dynamics (2006)
Eyal Even-dar, Sham M. Kakade, Michael Kearns, Yishay Mansour
We study the stability properties of the dynamics of the standard continuous limit-order mechanism that is used in modern equity markets. We ask whether such mechanisms are susceptible to...
Reinforcement Learning in POMDPs Without Resets (2005)
Even-Dar, Eyal, Kakade, Sham, Mansour, Yishay
We consider the most realistic reinforcement learning setting in which an agent starts in an unknown environment (the POMDP) and must follow one continuous and uninterrupted chain of experience with...
Planning in POMDPS using Multiplicity Automata (2005)
Even-Dar, Eyal, Kakade, Sham, Mansour, Yishay
Planning and learning in Partially Observable MDPs (POMDPs) are among the most challenging tasks in both the AI and Operation Research communities. Although solutions to these problems are...
Fast Convergence of Selfish Routing (2005)
Even-Dar, Eyal, Mansour, Yishay
We consider $n$ anonymous selfish users that route their communication through $m$ parallel links. The users are allowed to reroute, concurrently, from overloaded links to underloaded links. The...
Fast Convergence of Selfish Rerouting (2005)
Even-Dar, Eyal, Mansour, Yishay
We consider $n$ identical selfish users that route their communication using $m$ parallel links. The users are allowed to reroute, concurrently, from overloaded links to underloaded links. The...
Y.: Planning in pomdps using multiplicity automata (2005)
Planning and learning in Partially Observable MDPs (POMDPs) are among the most challenging tasks in both the AI and Operation Research communities. Although solutions to these problems are...
Reinforcement learning in POMDPs without resets (2005)
We consider the most realistic reinforcement learning setting in which an agent starts in an unknown environment (the POMDP) and must follow one continuous and uninterrupted chain of experience with...
Reinforcement learning in POMDPs without resets (2005)
We consider the most realistic reinforcement learning setting in which an agent starts in an unknown environment (the POMDP) and must follow one continuous and uninterrupted chain of experience with...
Experts in a Markov Decision Process (2004)
Even-Dar, Eyal, Kakade, Sham, Mansour, Yishay
We consider the MDP setting in which the reward function is chosen arbitrarily (possibly by an adversary) during each time step of play, yet the dynamics remain fixed. Similar to the experts setting,...
Experts in a Markov Decision Process (2004)
Even-Dar, Eyal, Kakade, Sham, Mansour, Yishay
We consider the MDP setting in which the reward function is chosen arbitrarily (possibly by an adversary) during each time step of play, yet the dynamics remain fixed. Similar to the experts setting,...
Convergence time to nash equilibria (2003)
Eyal Even-dar, Alex Kesselman, Yishay Mansour
Abstract. We study the number of steps required to reach a pure Nash Equilibrium in a load balancing scenario where each job behaves selfishly and attempts to migrate to a machine which will minimize...
Action elimination and stopping conditions for reinforcement learning (2003)
Eyal Even-dar, Shie Mannor, Yishay Mansour
We consider incorporating action elimination procedures in reinforcement learning algorithms. We suggest a framework that is based on learning an upper and a lower estimates of the value function or...
Convergence Time to Nash Equilibria (2003)
Eyal Even-dar, Alex Kesselman, Yishay Mansour
We study the number of steps required to reach a pure Nash Equilibrium in a load balancing scenario where each job behaves selfishly and attempts to migrate to a machine which will minimize its cost....
Action elimination and stopping conditions for reinforcement learning (2003)
Eyal Even-dar, Shie Mannor, Yishay Mansour
We consider incorporating action elimination procedures in reinforcement learning algorithms. We suggest a framework that is based on learning an upper and a lower estimates of the value function or...
Convergence time to nash equilibria (2003)
Eyal Even-dar, Alex Kesselman, Yishay Mansour
Abstract. We study the number of steps required to reach a pure Nash Equilibrium in a load balancing scenario where each job behaves selfishly and attempts to migrate to a machine which will minimize...
PAC bounds for multi-armed bandit and Markov decision processes (2002)
Eyal Even-dar, Shie Mannor, Yishay Mansour
Abstract. The bandit problem is revisited and considered under the PAC model. Our main contribution in this part is to show that given n arms, it suffices to pull the arms O ( n ɛ2 log 1) times to...
PAC bounds for multi-armed bandit and Markov decision processes (2002)
Eyal Even-dar, Shie Mannor, Yishay Mansour
1 Introduction The Multi-armed bandit problem is one of the classical problems in decisiontheory. The problem is very simple to model- there are a number of alternative arms, each with a stochastic...
Learning rates for Q-learning (2001)
In this paper we derive convergence rates for Q-learning. We show an interesting relationship between the convergence rate and the learning rate used in the Q-learning. For a polynomial learning...