Leslie Pack Kaelbling

Finding aircraft collision-avoidance strategies using policy search methods (2009)

Kaelbling, Leslie Pack, Lozano-Perez, Tomas

A progress report describing the application of policy gradient and policy search by dynamic programming methods to an aircraft collision avoidance problem inspired by the requirements of...

Efficient Distributed Reinforcement Learning Through Agreement (2009)

Paulina Varshavskaya, Leslie Pack Kaelbling, Daniela Rus

Distributed robotic systems can benefit from automatic controller design and online adaptation by reinforcement learning (RL), but often suffer from the limitations of partial observability....

Multi-Agent Filtering with Infinitely Nested Beliefs (2009)

Luke S. Zettlemoyer, Brian Milch, Leslie Pack Kaelbling

In partially observable worlds with many agents, nested beliefs are formed when agents simultaneously reason about the unknown state of the world and the beliefs of the other agents. The multi-agent...

Automatic Class-Specific 3D Reconstruction from a Single Image (2009)

Chiu, Han-Pang, Kaelbling, Leslie Pack, Lozano-Perez, Tomas

Our goal is to automatically reconstruct 3D objects from a single image, by using prior 3D shape models of classes. The shape models, defined as a collection of oriented primitive shapes centered at...

Abstract (2009)

Natalia H. Gardiol, Leslie Pack Kaelbling

A mobile robot acting in the world is faced with a large amount of sensory data and uncertainty in its action outcomes. Indeed, almost all interesting sequential decision-making domains involve large...

First-Order Variable (2008)

Kristian Kersting, Brian Milch, Luke S. Zettlemoyer, Michael Haimes, Leslie Pack Kaelbling

do not scale well to large populations • Lifted inference idea: • Many individuals are interchangeable in model • Exploit that symmetry to speed up inference

Abstract (2008)

Georgios Theocharous, Leslie Pack Kaelbling

require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and explore a new reinforcement learning algorithm over grid-points in belief...

Combining dynamic abstractions in very large MDPs (2008)

Kurt Steinkraus, Leslie Pack Kaelbling

What: The goal of our research is to develop a computer system capable of creating plans of action and then executing those plans in very large, stochastic domains. Provable optimality is intractable...

Deictic Representation in Reinforcement Learning (2008)

Sarah Finney, Natalia H. Gardiol, Leslie Pack Kaelbling

Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Using a deictic...

Spatial and Temporal Abstractions in POMDPS: Learning and Planning (2008)

Georgios Theocharous, Leslie Pack Kaelbling

Introduction: A popular approach to artificial intelligence is to model an agent and its interaction with its environment through actions, perceptions, and rewards [1]. Intelligent agents should...

Adaptive Envelope MDPs for Relational Equivalence-based Planning (2008)

Gardiol, Natalia H., Kaelbling, Leslie Pack

We describe a method to use structured representations of the environment’s dynamics to constrain and speed up the planning process. Given a problem domain described in a probabilistic logical...

Scaling Techniques for Large Markov Decision Process Planning Problems (2008)

Terran Lane, Leslie Pack Kaelbling

Planning in Large Domains: The Markov decision process (MDP) formalism has emerged as a powerful representation for control and planning domains that are subject to stochastic effects. In particular,...

Learning Rich, Tractable Models of the Real World (2008)

Tim Oates, Leslie Pack Kaelbling

The Problem: We tend to think of the world as being made up of objects. There are chairs and apples and clouds and meetings. Certainly, part of the basis for this view is that there are clumps of...

Interval Programming: A Multiple Criteria Decision Making Model for Autonomous Vehicle Control (2008)

Michael R. Benjamin, Leslie Pack Kaelbling

The Problem: We want to create a new model for capturing and optimizing over multiple competing objectives that characterize autonomous vehicle control in complex, dynamic, and unpredictable...

Grasping POMDPs: Theory and Experiments (2008)

Ross Glashan, Kaijen Hsiao, Leslie Pack Kaelbling, Tomás Lozano-Pérez

We describe a method for planning under uncertainty for robotic manipulation of objects by partitioning the configuration space into a set of regions that are closed under compliant...

Learning and Planning with Probabilistic Relational Rules (2008)

Hanna Pasula, Luke Zettlemoyer, Leslie Pack Kaelbling

The Problem: Our research involves learning models of world action dynamics, which can then be used to construct plans to reach a wide range of goals. The work is applied to simulated worlds, such as...

Reasoning about Large Populations with Lifted Probabilistic Inference (2008)

Kristian Kersting, Brian Milch, Luke S. Zettlemoyer, Michael Haimes, Leslie Pack Kaelbling

We use a concrete problem in the context of planning meetings to show how lifted probabilistic inference can dramatically speed up reasoning. We also extend lifted inference to deal with cardinality...

Planning with Probabilistic Rules in a Relational World (2008)

Natalia H. Gardiol, Leslie Pack Kaelbling

The Problem: Being able to represent and reason about the world as though it were composed of “objects” seems like a useful abstraction. The typical approach to representing a world composed of...

Learning Grammatical Models for Object Recognition (2008)

Aycinena, Meg, Kaelbling, Leslie Pack, Lozano-Perez, Tomas

Many object recognition systems are limited by their inability to share common parts or structure among related object classes. This capability is desirable because it allows information about parts...

Abstract (2008)

Yu-han Chang, Leslie Pack Kaelbling

role of beliefs in multi-agent learning

Learning symbolic models of stochastic domains (2008)

Hanna M. Pasula, Luke S. Zettlemoyer, Leslie Pack Kaelbling

In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a probabilistic, relational planning rule representation that compactly models...

Instructions for Formatting JMLR Articles (2008)

Leslie Pack Kaelbling, David Cohn

This document, which is based on an earlier document by Minton et al. (1999), describes the required formatting of JMLR papers, including margins, fonts, citation styles, and figure placement. It...

Learning symbolic models of stochastic domains (2008)

Hanna M. Pasula, Luke S. Zettlemoyer, Leslie Pack Kaelbling

In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a new probabilistic planning rule representation to compactly model noisy,...

Abstract (2008)

Natalia H. Gardiol, Leslie Pack Kaelbling

A mobile robot acting in the world is faced with a large amount of sensory data and uncertainty in its action outcomes. Indeed, almost all interesting sequential decision-making domains involve large...

Time-Critical Planning and Scheduling Research at Brown University (2008)

Thomas L. Dean, Lloyd Greenwald, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

This report is a summary of recent work on time-critical planning and scheduling at Brown University. Much of our research over the last six years has been concerned with systems that are capable of...

Learning Probabilistic Relational Dynamics for Multiple Tasks (2008)

Deshpande, Ashwin, Milch, Brian, Zettlemoyer, Luke S., Kaelbling, Leslie Pack

The ways in which an agent's actions affect the world can often be modeled compactly using a set of relational probabilistic planning rules. This extended abstract addresses the problem of learning...

Learning Grammatical Models for Object Recognition (2008)

Aycinena Lippow, Meg, Kaelbling, Leslie Pack, Lozano-Perez, Tomas

Many object recognition systems are limited by their inability to share common parts or structure among related object classes. This capability is desirable because it allows information about parts...

Time-Critical Planning and Scheduling in Stochastic Domains (Extended Abstract) (2007)

Thomas L. Dean, Lloyd Greenwald, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

Department of Computer Science, Brown University, Box 1910, Providence, RI 02912. In this note we summarize our recent...

Department of Cognitive and Linguistic Sciences (2007)

Andrew Duchon, William H. Warren, Leslie Pack Kaelbling

ABSTRACT: There is a close relationship between ecological psychology and behavior-based robotics. One point of intersection between these two fields is the investigation of control laws,...

Planning and Acting in Partially Observable Stochastic Domains (2007)

Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov...

A Innovative Claims for the Proposed Research (2007)

Leslie Pack Kaelbling

The research program proposed here is founded on three theses: 1. Neither human programming nor robot learning is alone sufficient for the construction of robust intelligent robot systems. 2. There is...

Learning Sponsored by the National Science Foundation (2007)

Sridhar Mahadevan, Leslie Pack Kaelbling

The findings and recommendations contained in the document are based on the discussions at the workshop, and do not necessarily reflect the views of NSF. This document describes the results of a...

Leslie Pack Kaelbling, Tim Oates, Natalia Hernandez and Sarah Finney (2007)

Leslie Pack Kaelbling, Tim Oates, Natalia Hernandez, Sarah Finney

We are interested in building systems that learn to interact with complex real world environments, by representing the dynamics of the world with models that allow strong generalization...

All Learning is Local: Multi-agent learning in global reward games (2007)

Yu-han Chang, Tracey Ho, Leslie Pack Kaelbling

In large multiagent games, partial observability, coordination, and credit assignment persistently plague attempts to design good learning algorithms.

Automated design of adaptive controllers for modular robots using reinforcement learning (2007)

Paulina Varshavskaya, Leslie Pack Kaelbling, Daniela Rus

Designing distributed controllers for self-reconfiguring modular robots has been consistently challenging. We have developed a reinforcement learning approach which can be used both to automate...

Grasping POMDPs (2007)

Kaijen Hsiao, Leslie Pack Kaelbling, Tomás Lozano-Pérez

We provide a method for planning under uncertainty for robotic manipulation by partitioning the configuration space into a set of regions that are closed under compliant motions. These...

Logical particle filtering (2007)

Luke S. Zettlemoyer, Hanna M. Pasula, Leslie Pack Kaelbling

In this paper, we consider the problem of filtering in relational hidden Markov models. We present a compact representation for such models and an associated logical particle filtering...

Computing action equivalences for planning under time-constraints (2006)

Gardiol, Natalia H., Kaelbling, Leslie Pack

In order for autonomous artificial decision-makers to solve realistic tasks, they need to deal with the dual problems of searching through large state and action spaces under time pressure. We study the...

Inductive Synthesis of Functional Programs: An Explanation Based Generalization Approach (2006)

Emanuel Kitzelmann, Ute Schmid, Leslie Pack Kaelbling

We describe an approach to the inductive synthesis of recursive equations from input/output examples which is based on the classical two-step approach to induction of functional Lisp programs of...

Activity recognition from physiological data using conditional random fields (2006)

Hai Leong Chieu, Wee Sun Lee, Leslie Pack Kaelbling

We describe the application of conditional random fields (CRF) to physiological data modeling for the application of activity recognition. We use the data provided by the Physiological...

Spatial and Temporal Abstractions in POMDPs Applied to Robot Navigation (2005)

Theocharous, Georgios, Mahadevan, Sridhar, Kaelbling, Leslie Pack

Partially observable Markov decision processes (POMDPs) are a well studied paradigm for programming autonomous robots, where the robot sequentially chooses actions to achieve long term goals...

Probabilistic Geometric Grammars for Object Recognition (2005)

Leslie Pack Kaelbling

Certified by: Tomás Lozano-Pérez

Transfer learning with an ensemble of background tasks (2005)

Zvika Marx, Michael T. Rosenstein, Leslie Pack Kaelbling, Thomas G. Dietterich

We demonstrate the transfer of learning from an ensemble of background tasks, which becomes helpful in cases where a single background task does not transfer well. This approach is accomplished...

Hedged learning: regret-minimization with learning experts (2005)

Yu-han Chang, Leslie Pack Kaelbling

In non-cooperative multi-agent situations, there cannot exist a globally optimal, yet opponent-independent learning algorithm. Regret-minimization over a set of strategies optimized for potential...

To transfer or not to transfer (2005)

Michael T. Rosenstein, Zvika Marx, Leslie Pack Kaelbling, Thomas G. Dietterich

With transfer learning, one set of tasks is used to bias learning and improve performance on another task. However, transfer learning may actually hinder performance if the tasks are too dissimilar....

Combining dynamic abstractions in large MDPs (2004)

Steinkraus, Kurt, Kaelbling, Leslie Pack

One of the reasons that it is difficult to plan and act in real-world domains is that they are very large. Existing research generally deals with the large domain size using a static representation...

Representing hierarchical POMDPs as DBNs for multi-scale robot localization (2004)

Georgios Theocharous, Kevin Murphy, Leslie Pack Kaelbling

We explore the advantages of representing hierarchical partially observable Markov decision processes (H-POMDPs) as dynamic Bayesian networks (DBNs). In particular, we focus on the special case of...

Learning distributed control for modular robots (2004)

Paulina Varshavskaya, Leslie Pack Kaelbling, Daniela Rus

We propose to automate controller design for distributed modular robots. In this paper, we present some initial experiments with learning distributed controllers for synthesizing...

Mobilized ad-hoc networks: A reinforcement learning approach (2004)

Yu-han Chang, Tracey Ho, Leslie Pack Kaelbling

Research in mobile ad-hoc networks has focused on situations in which nodes have no control over their movements. We investigate an important but overlooked domain in which nodes do have control over...

Mobilized ad-hoc networks: A reinforcement learning approach (2003)

Chang, Yu-Han, Ho, Tracey, Kaelbling, Leslie Pack

Research in mobile ad-hoc networks has focused on situations in which nodes have no control over their movements. We investigate an important but overlooked domain in which nodes do have control over...

Learning object segmentation from video data (2003)

Ross, Michael G., Kaelbling, Leslie Pack

This memo describes the initial results of a project to create a self-supervised algorithm for learning object segmentation from video data. Developmental psychology and computational experience have...

6.034 Artificial Intelligence, Spring 2003 (2003)

Lozano-Perez, Tomas, Kaelbling, Leslie Pack, Winston, Patrick Henry

Introduces representations, techniques, and architectures used to build applied systems and to account for intelligence from a computational point of view. Applications of rule chaining, heuristic...

Envelope-based Planning in Relational MDPs (2003)

Natalia H. Gardiol, Leslie Pack Kaelbling

Introduction: A mobile robot acting in the world is faced with a large amount of sensory data and uncertainty in its action outcomes. Indeed, almost all interesting sequential decision-making domains...

Representing Hierarchical POMDPs as DBNs for Multi-Scale Map Learning (2003)

Georgios Theocharous, Kevin Murphy, Leslie Pack Kaelbling

We explore the advantages of representing hierarchical partially observable Markov decision processes (H-POMDPs) as dynamic Bayesian networks (DBNs). We use this model for representing and learning...

Representing Hierarchical POMDPs as DBNs for Multi-Scale Robot Localization (2003)

Georgios Theocharous, Kevin Murphy, Leslie Pack Kaelbling

We explore the advantages of representing hierarchical partially observable Markov decision processes (H-POMDPs) as dynamic Bayesian networks (DBNs). In particular, we focus on the special case of...

A systematic approach to learning object segmentation from motion (2003)

Michael G. Ross, Leslie Pack Kaelbling

This paper describes the initial results of a project to create a self-supervised algorithm for learning object segmentation from video data. Developmental psychology and computational experience...

Approximate Planning in POMDPs with Macro-Actions (2003)

Georgios Theocharous, Leslie Pack Kaelbling

Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and...

All Learning is Local: Multi-agent learning in global reward games (2003)

Yu-han Chang, Tracey Ho, Leslie Pack Kaelbling

In large multiagent games, partial observability, coordination, and credit assignment persistently plague attempts to design good learning algorithms.

Learning object segmentation from video data (2003)

Michael G. Ross, Leslie Pack Kaelbling

This memo describes the initial results of a project to create a self-supervised algorithm for learning object segmentation from video data. Developmental psychology and computational experience have...

Learning with Deictic Representation (2002)

Finney, Sarah, Gardiol, Natalia H., Kaelbling, Leslie Pack, Oates, Tim

Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Using a deictic representation...

Date (2002)

Leslie Pack Kaelbling, Thomas Dean Reader

Peder J. Estrup, Dean of the Graduate School and Research

Learning with deictic representations (2002)

Sarah Finney, Natalia H. Gardiol, Leslie Pack Kaelbling, Tim Oates

Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Using a deictic representation...

Learning geometrically-constrained hidden Markov models for robot navigation: Bridging the geometrical-topological gap (2002)

Hagit Shatkay, Leslie Pack Kaelbling

You will come to a place where the streets are not marked. Some windows are lighted but mostly they're darked. A place you could sprain both your elbow and chin! Do you dare to stay out? Do you...

Learning to Cooperate via Policy Search (2001)

Peshkin, Leonid, Kim, Kee-Eung, Meuleau, Nicolas, Kaelbling, Leslie Pack

Cooperative games are those in which both agents share the same payoff structure. Value-based reinforcement-learning algorithms, such as variants of Q-learning, have been applied to learning...

Playing is believing: The role of beliefs in multi-agent learning (2001)

Yu-han Chang, Leslie Pack Kaelbling

We propose a new classification for multi-agent learning algorithms, with each league of players characterized by both their possible strategies and possible beliefs. Using this classification, we...

Approaches to macro decompositions of large Markov decision process planning problems (2001)

Terran Lane, Leslie Pack Kaelbling

Mobile robot navigation tasks are subject to motion stochasticity arising from the robot’s local controllers, which casts the navigational task into a Markov decision process framework. The MDP...

State-based classification of finger gestures from electromyographic signals (2000)

Peter Ju, Leslie Pack Kaelbling, Yoram Singer

Electromyographic signals may provide an important new class of user interface for consumer electronics. In order to make such interfaces effective, it will be crucial to map EMG signals to user...

Learning to Cooperate via Policy Search (2000)

Leonid Peshkin, Kee-eung Kim, Nicolas Meuleau, Leslie Pack Kaelbling

Cooperative games are those in which both agents share the same payoff structure. Value-based reinforcement-learning algorithms, such as variants of Q-learning, have been applied to learning...

Sampling Methods for Action Selection in Influence Diagrams (2000)

Luis E. Ortiz, Leslie Pack Kaelbling

Sampling has become an important strategy for inference in belief networks. It can also be applied to the problem of selecting actions in influence diagrams. In this paper, we present methods with...

Adaptive Importance Sampling for Estimation in Structured Domains (2000)

Luis E. Ortiz, Leslie Pack Kaelbling

Sampling is an important tool for estimating large, complex sums and integrals over high-dimensional spaces. For instance, importance sampling has been used as an alternative to exact methods for...

Practical Reinforcement Learning in Continuous Spaces (2000)

William D. Smart, Leslie Pack Kaelbling

Dynamic control tasks are good candidates for the application of reinforcement learning techniques. However, many of these tasks inherently have continuous state or action variables. This can cause...

Learning finite-state controllers for partially observable environments (1999)

Nicolas Meuleau, Leonid Peshkin, Kee-eung Kim, Leslie Pack Kaelbling

Reactive (memoryless) policies are sufficient in completely observable Markov decision processes (MDPs), but some kind of memory is usually necessary for optimal control of a partially observable...

Solving POMDPs by Searching the Space of Finite Policies (1999)

Nicolas Meuleau, Kee-eung Kim, Leslie Pack Kaelbling, Anthony R. Cassandra

Solving partially observable Markov decision processes (POMDPs) is highly intractable in general, at least in part because the optimal policy may be infinitely large. In this paper, we explore the...

Shortest Paths in a Dynamic Uncertain Domain (1999)

David Meir Blei, Leslie Pack Kaelbling

This paper describes solutions to finding shortest paths in stochastic graphs with partially unknown topologies. We consider graphs which are both static and dynamic. We solve the static problem by...

Learning Policies with External Memory (1999)

Leonid Peshkin, Nicolas Meuleau, Leslie Pack Kaelbling

In order for an agent to perform well in partially observable domains, it is usually necessary for actions to depend on the history of observations. In this paper, we explore a stigmergic approach,...

Accelerating EM: An Empirical Study (1999)

Luis E. Ortiz, Leslie Pack Kaelbling

Many applications require that we learn the parameters of a model from data. EM (Expectation-Maximization) is a method for learning the parameters of probabilistic models with missing or hidden data....

Learning Policies with External Memory (1999)

Leonid Peshkin, Nicolas Meuleau, Leslie Pack Kaelbling

In order for an agent to perform well in partially observable domains, it is usually necessary for actions to depend on the history of observations. In this paper, we explore a stigmergic approach,...

Accelerating EM: An Empirical Study (1999)

Luis E. Ortiz, Leslie Pack Kaelbling

Many applications require that we learn the parameters of a model from data. EM (Expectation-Maximization) is a method for learning the parameters of probabilistic models with missing or hidden data....

Notes on methods based on maximum-likelihood estimation for learning the parameters of the mixture of Gaussians model (1999)

Luis E. Ortiz, Leslie Pack Kaelbling

In these notes, we present and review different methods based on maximum-likelihood estimation for learning the parameters of the mixture-of-Gaussians model. We describe a method based on the...

Notes on methods based on maximum-likelihood estimation for learning the parameters of the mixture of Gaussians model (1999)

Luis E. Ortiz, Leslie Pack Kaelbling

In these notes, we present and review different methods based on maximum-likelihood estimation for learning the parameters of the mixture-of-Gaussians model. We describe a method based on the...

Multi-Value-Functions: Efficient Automatic Action Hierarchies for Multiple Goal MDPs (1999)

Andrew Moore, Leemon Baird, Leslie Pack Kaelbling

In goal-based Markov Decision Problems, it is usual to generate... Actions come from?", and whether it is necessary to have some high-level prior understanding of the class of tasks at hand in...

Planning and acting in partially observable stochastic domains (1998)

Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov...

Hierarchical Solution of Markov Decision Processes using Macro-actions (1998)

Milos Hauskrecht, Nicolas Meuleau, Leslie Pack Kaelbling, Tom Dean, Craig Boutilier

We investigate the use of temporally abstract actions, or macro-actions, in the solution of Markov decision processes. Unlike current models that combine both primitive actions and macro-actions and...

Solving Very Large Weakly Coupled Markov Decision Processes (1998)

Nicolas Meuleau, Milos Hauskrecht, Kee-eung Kim, Leonid Peshkin, Leslie Pack Kaelbling, Thomas Dean, ...

We present a technique for computing approximately optimal solutions to stochastic resource allocation problems modeled as Markov decision processes (MDPs). We exploit two key properties to avoid...

Planning and Acting in Partially Observable Stochastic Domains (1998)

Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov...

Hierarchical Solution of Markov Decision Processes using Macro-actions (1998)

Milos Hauskrecht, Nicolas Meuleau, Leslie Pack Kaelbling, Thomas Dean

MDP for a four-room example. Grey circles mark peripheral states of the original MDP, i.e. states of the abstract MDP.

Solving very large weakly coupled Markov Decision Processes (1998)

Nicolas Meuleau, Milos Hauskrecht, Kee-eung Kim, Leonid Peshkin, Leslie Pack Kaelbling, Thomas Dean

We present a technique for computing approximately optimal solutions to stochastic resource allocation problems modeled as Markov decision processes (MDPs). We exploit two key properties to avoid...

Planning and acting in partially observable stochastic domains (1998)

Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov...

Planning and Acting in Partially Observable Stochastic Domains (1997)

Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov...

Learning Topological Maps with Weak Local Odometric Information (1997)

Hagit Shatkay, Leslie Pack Kaelbling

Topological maps provide a useful abstraction for robotic navigation and planning. Although stochastic maps can theoretically be learned using the Baum-Welch algorithm, without strong prior...

Learning Hidden Markov Models with Geometric Information (1997)

Hagit Shatkay, Leslie Pack Kaelbling

Hidden Markov models (hmms) and partially observable Markov decision processes (pomdps) provide a useful tool for modeling dynamical systems. They are particularly useful for representing...

Strategic directions for artificial intelligence (1996)

Jon Doyle, Thomas Dean, Johan De Kleer, Thomas Dietterich, ...

The field of artificial intelligence (AI) consists of long-standing intellectual and technological efforts addressing several interrelated scientific and practical aims: —constructing intelligent...

Reinforcement learning: a survey (1996)

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of...

The NSF workshop on reinforcement learning: Summary and observations (1996)

Sridhar Mahadevan, Leslie Pack Kaelbling

Reinforcement learning (RL) has become one of the most actively studied learning frameworks in the area of intelligent autonomous agents. This article describes the results of a three-day meeting of...

Reinforcement learning: a survey (1996)

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of...

Reinforcement Learning: A Survey (1996)

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of...

Efficient dynamic-programming updates in partially observable Markov decision processes (1996)

Michael L. Littman, Anthony R. Cassandra, Leslie Pack Kaelbling

We examine the problem of performing exact dynamic-programming updates in partially observable Markov decision processes (pomdps) from a computational complexity viewpoint. Dynamic-programming...

Acting under Uncertainty: Discrete Bayesian Models for Mobile-Robot Navigation (1996)

Anthony R. Cassandra, Leslie Pack Kaelbling, ...

Discrete Bayesian models have been used to model uncertainty for mobile-robot navigation, but the question of how actions should be chosen remains largely unexplored. This paper presents the optimal...

Acting under Uncertainty: Discrete Bayesian Models for Mobile-Robot Navigation (1996)

Anthony R. Cassandra, Leslie Pack Kaelbling, James A. Kurien

Discrete Bayesian models have been used to model uncertainty for mobile-robot navigation, but the question of how actions should be chosen remains largely unexplored. This paper presents the optimal...

Efficient dynamic-programming updates in partially observable Markov decision processes (1996)

Michael L. Littman, Anthony R. Cassandra, Leslie Pack Kaelbling

We examine the problem of performing exact dynamic-programming updates in partially observable Markov decision processes (pomdps) from a computational complexity viewpoint. Dynamic-programming...

Reinforcement Learning: A Survey (1996)

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of...

Reinforcement learning: a survey (1996)

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of...

Reinforcement learning: a survey (1996)

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of...

Reinforcement learning: A survey (1996)

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of...

On the complexity of solving Markov decision problems (1995)

Michael L. Littman, Thomas L. Dean, Leslie Pack Kaelbling

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize...

Planning Under Time Constraints in Stochastic Domains (1995)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, ...

We provide a method, based on the theory of Markov decision processes, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world...

Planning and Acting in Partially Observable Stochastic Domains (1995)

Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov...

Learning policies for partially observable environments: Scaling up (1995)

Michael L. Littman, Anthony R. Cassandra, Leslie Pack Kaelbling

Partially observable Markov decision processes (pomdp's) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the...

Learning policies for partially observable environments: Scaling up (1995)

Michael Littman, Anthony R. Cassandra, Leslie Pack Kaelbling

Partially observable Markov decision processes (pomdp's) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the...

Planning Under Time Constraints in Stochastic Domains (1995)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

We provide a method, based on the theory of Markov decision processes, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world...

A Situated View of Representation and Control (1995)

Stanley J. Rosenschein, Leslie Pack Kaelbling

Intelligent agents are systems that have a complex, ongoing interaction with an environment that is dynamic and imperfectly predictable. Agents are typically difficult to program because the...

On the Complexity of Solving Markov Decision Problems (1995)

Michael Littman, Thomas L. Dean, Leslie Pack Kaelbling

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize...

Ecological Robotics: Controlling Behavior with Optical Flow (1995)

Andrew P. Duchon, William H. Warren, Leslie Pack Kaelbling

There are striking parallels between ecological psychology and new trends in robotics and computer vision, particularly regarding how agents interact with the environment. We present some ideas from...

Learning policies for partially observable environments: Scaling up (1995)

Michael L. Littman, Anthony R. Cassandra, Leslie Pack Kaelbling

Partially observable Markov decision processes (pomdp's) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the...

Learning Dynamics: System Identification for Perceptually Challenged Agents (1995)

Kenneth Basye, Thomas Dean, Leslie Pack Kaelbling

From the perspective of an agent, the input/output behavior of the environment in which it is embedded can be described as a dynamical system. Inputs correspond to the actions executable by the agent...

Planning and Acting in Partially Observable Stochastic Domains (1995)

Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov...

On the Complexity of Solving Markov Decision Problems (1995)

Michael L. Littman, Thomas L. Dean, Leslie Pack Kaelbling

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize...

Learning dynamics: System identification for perceptually challenged agents (1995)

Kenneth Basye, Thomas Dean, Leslie Pack Kaelbling

From the perspective of an agent, the input/output behavior of the environment in which it is embedded can be described as a dynamical system. Inputs correspond to the actions executable by the agent...

Acting Optimally in Partially Observable Stochastic Domains (1994)

Anthony Cassandra, Leslie Pack Kaelbling, Michael L. Littman

In this paper, we describe the partially observable Markov decision process (pomdp) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments,...

Acting Optimally in Partially Observable Stochastic Domains (1994)

Anthony R. Cassandra, Leslie Pack Kaelbling, Michael L. Littman

In this paper, we describe the partially observable Markov decision process (pomdp) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments,...

Associative Reinforcement Learning: A Generate and Test Algorithm (1994)

Leslie Pack Kaelbling

An agent that must learn to act in the world by trial and error faces the reinforcement-learning problem, which is quite different from standard concept learning. Although good algorithms exist for...

A Bibliography of Work Related to Reinforcement Learning (1994)

Leslie Pack Kaelbling, Michael L. Littman, Richard S. Sutton, Paul J. Werbos, Ronald J. Williams, ...

greedy policies based on imperfect value functions. Technical Report NU-CCS-93-14, Northeastern University College of Computer Science, 1993. Tight lower bound on cumulative reward obtained from...

Associative Reinforcement Learning: Functions in k-DNF (1994)

Leslie Pack Kaelbling

An agent that must learn to act in the world by trial and error faces the reinforcement learning problem, which is quite different from standard concept learning. Although good algorithms exist for...

Ecological Robotics (1994)

Andrew P. Duchon, William H. Warren, Leslie Pack Kaelbling

There are striking parallels between ecological psychology and the new trends in robotics and computer vision, particularly regarding how agents interact with the environment. We present some ideas...

Toward Approximate Planning in Very Large Stochastic Domains (1994)

Ann E. Nicholson, Leslie Pack Kaelbling

In this paper we extend previous work on approximate planning in large stochastic domains by adding the ability to plan in automatically generated abstract world views. The dynamics of the domain are...

Acting Optimally in Partially Observable Stochastic Domains (1994)

Anthony R. Cassandra, Leslie Pack Kaelbling, Michael L. Littman

In this paper, we describe the partially observable Markov decision process (pomdp) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments,...

Toward Approximate Planning in Very Large Stochastic Domains (1994)

Ann E. Nicholson, Leslie Pack Kaelbling

In this paper we extend previous work on approximate planning in large stochastic domains by adding the ability to plan in automatically generated abstract world views. The dynamics of the domain are...

Acting Optimally in Partially Observable Stochastic Domains (1994)

Anthony Cassandra, Leslie Pack Kaelbling, Michael L. Littman

In this paper, we describe the partially observable Markov decision process (pomdp) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments,...

Hierarchical Learning in Stochastic Domains: Preliminary Results (1993)

Leslie Pack Kaelbling

This paper presents the HDG learning algorithm, which uses a hierarchical decomposition of the state space to make learning to achieve goals more efficient with a small penalty in path quality....

Planning Under Time Constraints in Stochastic Domains (1993)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

We provide a method, based on the theory of Markov decision processes, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world...

Planning With Deadlines in Stochastic Domains (1993)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

We provide a method, based on the theory of Markov decision problems, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world...

Feedforward and Recurrent Neural Networks and Genetic Programs for Stock Market and Time Series Forecasting (1993)

Peter C. McCluskey, Leslie Pack Kaelbling

Adding recurrence to neural networks improves their time series forecasts. Well chosen inputs such as a window of time-delayed inputs, or intelligently preprocessed inputs, are more important than...

Planning Under Time Constraints in Stochastic Domains (1993)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

We provide a method, based on the theory of Markov decision processes, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world...

Deliberation Scheduling for Time-Critical Sequential Decision Making (1993)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

We describe a method for time-critical decision making involving sequential tasks and stochastic processes. The method employs several iterative refinement routines for solving different aspects of...

Planning With Deadlines in Stochastic Domains (1993)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

We provide a method, based on the theory of Markov decision problems, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world...

Learning to Achieve Goals (1993)

Leslie Pack Kaelbling

Temporal difference methods solve the temporal credit assignment problem for reinforcement learning. An important subproblem of general reinforcement learning is learning to achieve dynamic goals....

Planning Under Time Constraints in Stochastic Domains (1993)

Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson

We provide a method, based on the theory of Markov decision processes, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world...

Input generalization in delayed reinforcement learning: An algorithm and performance comparisons (1991)

David Chapman, Leslie Pack Kaelbling

Delayed reinforcement learning is an attractive framework for the unsupervised learning of action policies for autonomous agents. Some existing delayed reinforcement learning techniques have shown...

Learning Topological Maps with Weak Local Odometric Information

Hagit Shatkay, Leslie Pack Kaelbling

Topological maps provide a useful abstraction for robotic navigation and planning. Although stochastic maps can theoretically be learned using the Baum-Welch algorithm, without strong prior...