Dimitri P. Bertsekas

9 Improved Temporal Difference Methods with Linear Function Approximation (2009)

Dimitri P. Bertsekas, Angelia Nedich, Vivek S. Borkar

Editor’s Summary: This chapter considers temporal difference algorithms within the context of infinite-horizon finite-state dynamic programming problems with discounted cost and linear cost...

Contents (2009)

Dimitri P. Bertsekas, Huizhen Yu

We consider fixed point equations, and approximation of the solution by projection on a low-dimensional subspace. We propose stochastic iterative algorithms, based on simulation, which converge to...

Approximate Dynamic Programming Contents (2009)

Dimitri P. Bertsekas

This is an updated version of the research-oriented Chapter 6 on Approximate Dynamic Programming. In addition to editorial revisions and rearrangements, it includes an account of new research (joint...

New Error Bounds for Approximations from Projected Linear Equations (2008)

Yu, Huizhen, Bertsekas, Dimitri P.

Joint Technical Report of U.H. and M.I.T. Technical Report C-2008-43 Dept. Computer Science University of Helsinki and LIDS Report 2797 Dept. EECS M.I.T. July 2008; revised July 2009

Publisher’s Cataloging-in-Publication Data (2008)

Dimitri P. Bertsekas, Cover Design, Ann Gallager, Bertsekas Dimitri P

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without...

CONVERGENCE THEORIES OF DISTRIBUTED ITERATIVE PROCESSES: A SURVEYt by (2008)

Dimitri P. Bertsekas, John N. Tsitsiklis, Michael Athans

We consider a model of distributed iterative algorithms whereby several processors participate in the computation while collecting, possibly stochastic information from the environment or other...

Auction Algorithms (2008)

Dimitri P. Bertsekas

The auction algorithm is an intuitive method for solving the classical assignment problem. It outperforms substantially its main competitors for important types of problems, both in theory and in...

AN AUCTION ALGORITHM 1 FOR THE MAX-FLOW PROBLEM (2008)

Dimitri P. Bertsekas

To appear in J.O.T.A. We propose a new algorithm for the max-flow problem. It consists of a sequence of augmen-tations along paths constructed by an auction-like algorithm. These paths are not...

Multiplier Methods: A Survey*t (2008)

Dimitri P. Bertsekas

An analysis of the convergence properties of multiplier methods demonstrates their superiority over ordinary penalty methods for constrained minimization. Summary--The purpose of this paper is to...

c ○ 1997 Kluwer Academic Publishers Rollout Algorithms for Combinatorial Optimization ∗ (2008)

Dimitri P. Bertsekas, John N. Tsitsiklis, Cynara Wu

We consider the approximate solution of discrete optimization problems using procedures that are capable of magnifying the effectiveness of any given heuristic algorithm through sequential...

Comments on “Coordination of Groups of Mobile Autonomous Agents Using Nearest Neighbor Rules” (2008)

Dimitri P. Bertsekas, John N. Tsitsiklis

Abstract—We clarify the relation between the model and the convergence results of Jadbabaie et al. to those of Bertsekas et al. We show that the update equations in Jadbabaie et al. are a special...

'I:' (2008)

Dimitri P. Bertsekas, B. Rhodes

Abstract- Tbis paper is cOllcemed with the problem of estimating the state of a linear dynamic system using noise-corrupted observations, wben input disturbances and observation errors are IDIknown...

SIAM J. on Optimization, to appear AN ε-RELAXATION METHOD FOR SEPARABLE CONVEX COST NETWORK FLOW PROBLEMS 1 (2008)

Dimitri P. Bertsekas, Lazaros C. Polymenakos, Paul Tseng

We propose a new method for the solution of the single commodity, separable convex cost network flow problem. The method generalizes the ɛ-relaxation method developed for linear cost problems, and...

Enhanced Fritz John Conditions for Convex Programming 1 by (2008)

Dimitri P. Bertsekas, Asuman E. Ozdaglar, Paul Tseng

We consider convex constrained optimization problems, and we enhance the classical Fritz John optimality conditions to assert the existence of multipliers with special sensitivity properties. In...

6. Using Basis Functions Involving Powers of A (2008)

Dimitri P. Bertsekas, Huizhen Yu

We consider fixed point equations, and approximation of the solution by projection on a low-dimensional subspace. We propose stochastic iterative algorithms, based on simulation, which converge to...

New Error Bounds for Approximations from Projected Linear Equations (2008)

Huizhen Yu, Dimitri P. Bertsekas

We consider linear fixed point equations and their approximations by projection on a low dimensional subspace. We derive new bounds on the approximation error of the solution, which are expressed in...

Auction Algorithms (2007)

Dimitri P. Bertsekas

The auction algorithm is an intuitive method for solving the classical assignment problem. It outperforms substantially its main competitors for important types of problems, both in theory and in...

Play Selection in American Football: A Case Study in Neuro-Dynamic Programming (2007)

Stephen D. Patek, Dimitri P. Bertsekas

: We present a computational case study of neuro-dynamic programming, a recent class of reinforcement learning methods. We cast the problem of play selection in American football as a stochastic...

Admission Control for Wireless Networks (2007)

Cynara C. Wu, Dimitri P. Bertsekas

With the population of wireless subscribers increasing at a rapid rate, overloaded sit-uations are likely to become an increasing problem. Admission control can be used to balance the goals of...

1.1. Linear Algebra and Analysis................. (2007)

Dimitri P. Bertsekas, Angelia Nedić, Asuman E. Ozdaglar

These notes were developed for the needs of the 6.291 class at M.I.T. (Spring 2001). They are copyright-protected, but they may be reproduced freely for noncommercial purposes.

Contact information: (2007)

Dimitri P. Bertsekas, Angelia Nedic, Angelia Nedic, Prof Dimitri, P. Bertsekas

1 Research supported by NSF under Grant ACI-9873339. 1 1.

Distributed Asynchronous Optimal Routing in Data Networks (2007)

Tsitsiklis, John N., Bertsekas, Dimitri P.

We prove convergence of a distributed gradient projection method for optimal routing in a data communication network. The analysis is carried out without any synchronization assumptions and takes...

Some Issues in Distributed Asynchronous Routing in Virtual Circuit Data Networks (2007)

Tsai, Wei K., Tsitsiklis, John N., Bertsekas, Dimitri P.

We consider the behavior of distributed asynchronous routing algorithms for optimizing the flows in a virtual circuit data network, with respect to a given convex cost function. The algorithms...

Abstract (2007)

Huizhen Yu, Dimitri P. Bertsekas

We consider the average cost problem for partially observable Markov decision processes (POMDP) with finite state, observation, and control spaces. We prove that there exists an ɛ-optimal...

Abstract (2006)

Huizhen Yu, Dimitri P. Bertsekas

We consider finite-state Markov decision processes and prove convergence and rate of convergence results for certain least squares policy evaluation algorithms. These are temporal difference methods...

Comment on “Coordination of Groups of Mobile Autonomous Agents Using Nearest Neighbor Rules ” ∗ (2006)

Dimitri P. Bertsekas, John N. Tsitsiklis

We clarify the relation of the model and the convergence results of Jadbabaie et al. [3] to those studied by Bertsekas et al. [6, 5, 1]. We show that the update equations in [3] are a very special...

Abstract (2006)

Huizhen Yu, Dimitri P. Bertsekas

We consider the solution of discounted optimal stopping problems using linear function approximation methods. A Q-learning algorithm for such problems, proposed by Tsitsiklis and Van Roy, is based on...

LECTURE 1 LECTURE OUTLINE • Significance of Feedback DP AS AN OPTIMIZATION METHODOLOGY (2004)

Dimitri P. Bertsekas, Nd Edition, Vols I, Athena Scientific

The slides are copyrighted, but may be freely reproduced and distributed for any noncommercial purpose.

Enhanced Fritz John Conditions for Convex Programming1 (2004)

Dimitri P. Bertsekas, Asuman E. Ozdaglar, Paul Tseng

Abstract We consider convex constrained optimization problems, and we enhance the classical Fritz John optimality conditions to assert the existence of multipliers with special sensitivity...

ECTURE SLIDES ON DYNAMIC PROGRAMMING BASED ON LECTURES GIVEN AT THE (2004)

Dimitri P. Bertsekas

The slides are copyrighted, but may be freely reproduced and distributed for any noncommercial purpose. LECTURE CONTENTS • These slides consist of 24 Lectures, whose summary is given in the next 24...

6.041 / 6.431 Probabilistic Systems Analysis and Applied Probability, Fall 2002 (2002)

Medard, Muriel, Tsitsiklis, John N., Bertsekas, Dimitri P., Abou Faycal, Ibrahim C. (Ibrahim Chafik)

Modeling, quantification, and analysis of uncertainty. Formulation and solution in sample space. Random variables, transform techniques, simple random processes and their probability distributions,...

6.041 / 6.431 Probabilistic Systems Analysis and Applied Probability, Fall 2002 (2002)

Medard, Muriel, Tsitsiklis, John N., Bertsekas, Dimitri P., Abou Faycal, Ibrahim C. (Ibrahim Chafik)

Modeling, quantification, and analysis of uncertainty. Formulation and solution in sample space. Random variables, transform techniques, simple random processes and their probability distributions,...

6.231 Dynamic Programming and Stochastic Control, Fall 2002 (2002)

Bertsekas, Dimitri P.

Sequential decision-making via dynamic programming. Unified approach to optimal control of stochastic dynamic systems and Markovian decision problems. Applications in linear-quadratic control,...

Second Derivative Algorithms for Minimum Delay Distributed Routing in Networks, (2002)

Bertsekas,Dimitri P., Gafni,Eli M., Gallager,Robert G.

We propose a class of algorithms for finding an optimal quasistatic routing in a communication network. The algorithms are based on Gallager's method. Their main feature is that they utilize second...

Projected Newton Methods and Optimization of Multicommodity Flows. (2002)

Bertsekas,Dimitri P., Gafni,Eli M.

A superlinearly convergent Newton-like method for linearly constrained optimization problems is adapted for solution of multicommodity network flow problems of the type arising in communication and...

Distributed power control algorithms for wireless networks (2001)

Cynara C. Wu, Dimitri P. Bertsekas

Power control has been shown to be an e ective way toincrease capacity inwireless systems. In previous work on power control, it has been assumed that power levels can be assigned from a continuous...

Distributed power control algorithms for wireless networks (2001)

Cynara C. Wu, Dimitri P. Bertsekas

Power control has been shown to be an e ective way toincrease capacity inwireless systems. In previous work on power control, it has been assumed that power levels can be assigned from a continuous...

Missile defense and interceptor allocation by neuro-dynamic programming (2000)

Dimitri P. Bertsekas, Mark L. Homer, David A. Logan, Stephen D. Patek, Nils R. S

Abstract|Thepurpose of this paper is to propose a solution methodology for a missile defense problem involving the sequential allocation of defensive resources over a series of engagements. The...

Gradient Convergence In Gradient Methods With Errors (1999)

Dimitri P. Bertsekas, John N. Tsitsiklis

We consider the gradient method x t+1 = x t +# t (s t +w t ), where s t is a descent direction of a function f : # n # # and w t is a deterministic or stochastic error. We assume that #f is Lipschitz...

Distributed Power Control Algorithms for Wireless Networks (1999)

Cynara C. Wu, Dimitri P. Bertsekas

Power control has been shown to be an effective way to increase capacity in wireless systems. In previous work on power control, it has been assumed that power levels can be assigned from a...

Missile Defense and Interceptor Allocation by Neuro-Dynamic Programming (1999)

Dimitri P. Bertsekas, Mark L. Homer, David A. Logan, Stephen D. Patek, Nils R. Sandell, Nils R. S

The purpose of this paper is to propose a solution methodology for a missile defense problem involving the sequential allocation of defensive resources over a series of engagements. The problem is...

ON THE MINIMAX REACHABILITY OF TARGET SETS AND TARGET TUBES, (1998)

Bertsekas,Dimitri P., Rhodes,Ian B.

The paper is concerned with the closed-loop control of discrete-time systems in the presence of uncertainty. The uncertainty may arise as disturbances in the system dynamics, disturbances corrupting...

RECURSIVE STATE ESTIMATION FOR A SET-MEMBERSHIP DESCRIPTION OF UNCERTAINTY, (1998)

Bertsekas,Dimitri P., Rhodes,Ian B.

The paper is concerned with the problem of estimating the state of a linear dynamic system using noise-corrupted observations, when input disturbances and observation errors are unknown except for...

Steepest Descent for Optimization Problems with Nondifferentiable Cost Functionals, (1998)

Bertsekas,Dimitri P., Mitter,Sanjoy K.

The steepest descent method is a commonly used algorithm for finding the minimum of a differentiable cost functional. At each iteration a descent is made at the direction of the negative gradient...

Sufficiently Informative Functions and the Minimax Feedback Control of Uncertain Dynamic Systems, (1998)

Bertsekas,Dimitri P., Rhodes,Ian B.

In the paper the problem of optimal feedback control of uncertain discrete-time dynamic systems is considered where the uncertain quantities do not have a stochastic description but instead are known...

On the Minimax Feedback Control of Uncertain Dynamic Systems, (1998)

Bertsekas,Dimitri P., Rhodes,Ian B.

In the paper the problem of optimal feedback control of uncertain discrete-time dynamic systems is considered where the uncertain quantities do not have a stochastic description but instead are known...

Planning and Scheduling: Dynamic Assignment and Scheduling with Contingencies (1998)

Castanon, David A., Logan, David A., Bertsekas, Dimitri P., Patek, Stephen D.

Planning and scheduling under uncertainty is important in many important Air Force operations. When real time monitoring of the progress of operations is available, it is possible to replan missions...

Stochastic Approximation for Non-Expansive Maps: Application to Q-Learning Algorithms (1998)

Jinane Abounadi, Dimitri P. Bertsekas, Vivek Borkar

We discuss synchronous and asynchronous variants of fixed point iterations of the form x k+1 = x k + #(k) F (x k , # k ) - x k , where F is a non-expansive mapping under a suitable norm, and {# k }...

A Note On Error Bounds For Convex And Nonconvex Programs (1998)

Dimitri P. Bertsekas

Given a single feasible solution xF and a single infeasible solution x I of a mathematical program, we provide an upper bound to the optimal dual value. We assume that xF satisfies a weakened form of...

Finite-State Average Cost Stochastic Games With Compact Constraint Sets And A Recurrence Condition (1998)

Stephen D. Patek, Dimitri P. Bertsekas

. We characterize and establish the existence of stationary equilibrium solutions for a class of finite-state average cost games. We assume that two players choose actions at each stage from compact...

A New Value Iteration Method for the Average Cost Dynamic Programming Problem (1998)

Dimitri P. Bertsekas

. We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists...

Stochastic Shortest Path Games (1997)

Stephen D. Patek, Dimitri P. Bertsekas

. We consider dynamic, two-player, zero-sum games where the "minimizing" player seeks to drive an underlying finite-state dynamic system to a special terminal state along a least expected...

Rollout Algorithms For Combinatorial Optimization (1997)

Dimitri P. Bertsekas, John N. Tsitsiklis, Cynara Wu

We consider the approximate solution of discrete optimization problems using procedures that are capable of magnifying the effectiveness of any given heuristic algorithm through sequential...

Differential Training Of Rollout Policies (1997)

Dimitri P. Bertsekas

We consider the approximate solution of stochastic optimal control problems using a neurodynamic programming/reinforcement learning methodology. We focus on the computation of a rollout policy, which...

A New Class Of Incremental Gradient Methods For Least Squares Problems (1997)

Dimitri P. Bertsekas

The least mean squares (LMS) method for linear least squares problems di#ers from the steepest descent method in that it processes data blocks one-by-one, with intermediate adjustment of the...

A New Class of Incremental Gradient Methods for Least Squares Problems (1997)

Dimitri P. Bertsekas

Abstract. The least mean squares (LMS) method for linear least squares problems differs from the steepest descent method in that it processes data blocks one-by-one, with intermediate adjustment of...

Gradient Convergence in Gradient Methods with Errors (1997)

Dimitri P. Bertsekas, N. Tsitsiklis

Abstract. We consider the gradient method xt+1 = xt + γt(st + wt), where st is a descent direction of a function f: ℜ n →ℜand wt is a deterministic or stochastic error. We assume that ∇f is...

A Neuro-Dynamic Programming Approach to Retailer Inventory Management (1996)

Benjamin Van Roy, Dimitri P. Bertsekas, Yuchun Lee, John N. Tsitsiklis

We present a model of two-echelon retailer inventory systems, and we cast the problem of generating optimal control strategies into the framework of dynamic programming. We formulate two specific...

A New Class Of Incremental Gradient Methods For Least Squares Problems (1996)

Dimitri P. Bertsekas

The LMS method for linear least squares problems di#ers from the steepest descent method in that it processes data blocks one-by-one, with intermediate adjustment of the parameter vector under...

A New Value Iteration Method For The Average Cost Dynamic Programming Problem (1996)

Dimitri P. Bertsekas

We propose a new value iteration method for the classical average cost Markovian Decision problem, under the assumption that all stationary policies are unichain and furthermore there exists a state...

CONVEX COST NETWORK FLOW PROBLEMS1 (1996)

Dimitri P. Bertsekas, Lazaros C. Polymenakos, Paul Tseng

Abstract We consider a generic auction method for the solution of the single commodity, separable convex cost network flow problem. This method provides a unifying framework for the ffl-relaxation...

Parallel asynchronous labelcorrecting methods for shortest paths (1996)

Dimitri P. Bertsekas, Francesca Guerriero, Roberto Musmanno

Abstract. In this paper we develop parallel asynchronous implementations of some known and some new label correcting methods for finding a shortest path from a single origin to all the other nodes of...

Finite termination of asynchronous iterative algorithms (1996)

Serap A. Savari, Dimitri P. Bertsekas

Abstract: We consider n-processor distributed systems where the ith processor executes asyn-chronously the iteration xi = fi(x). It is natural to terminate the iteration of the ith processor when...

A counterexample to temporal differences learning (1995)

Dimitri P. Bertsekas

Sutton's TD(λ) method aims to provide a representation of the cost function in an absorbing Markov chain with transition costs. A simple example is given where the representation obtained...

Partial Proximal Minimization Algorithms for Convex Programming (1995)

Dimitri P. Bertsekas, Paul Tseng

We consider an extension of the proximal minimization algorithm where only some of the minimization variables appear in the quadratic proximal term. We interpret the resulting iterates in terms of...

Generic Rank-One Corrections For Value Iteration In Markovian Decision Problems (1994)

Dimitri P. Bertsekas

Given a lineariterationoftheformx := F (x),we considermo difledversionsoftheform x := F (x + d),whered isa flxed direction,and ischosentominimizethenorm ofthe residualkx + d F (x + d)k. W e...

Partial proximal minimization algorithms for convex programming (1994)

Dimitri P. Bertsekas, Paul Tseng

Abstract. An extension ofthe proximal minimization algorithm is considered where only some of the minimization variables appear in the quadratic proximal term. The resulting iterates are interpreted...

Thevenin Decomposition And Large-Scale Optimization (1993)

Dimitri P. Bertsekas

Thevenin's theorem, one of the most celebrated results of electric circuit theory, provides a two-parameter characterization of the behavior of an arbitrarily large circuit, as seen from two of...

A Simple and Fast Label Correcting Algorithm for Shortest Paths (1993)

Dimitri P. Bertsekas

We propose a new method for ordering the candidate nodes in label correcting methods for shortest path problems. The method is equally simple but much faster than the D'Esopo-Pape algorithm. It...

Parallel primal-dual methods for the minimum cost flow problem (1992)

Bertsekas, Dimitri P., Castañon, David A.

"This report is a substantial revision of report LIDS-P-1998, September 1990."

Parallel primal-dual methods for the minimum cost flow problem (1992)

Bertsekas, Dimitri P., Castañon, David A.

"This report is a substantial revision of report LIDS-P-1998, September 1990."

An Auction/Sequential Shortest Path Algorithm for the Min Cost Flow Problem," Laboratory for Information and Decision Systems Report P-2146 (1992)

Dimitri P. Bertsekas

We propose a new algorithm for the solution of the linear minimum cost network flow problem, based on a sequential shortest path augmentation approach. Each shortest path is constructed by means of...

Auction Algorithms for Network Flow Problems: A Tutorial Introduction (1992)

Dimitri P. Bertsekas

This paper surveys a new and comprehensive class of algorithms for solving the classical linear network flow problem and its various special cases such as shortest path, max-flow, assignment,...

Polynomial Auction Algorithms for Shortest Paths (1992)

Dimitri P. Bertsekas, Stefano Pallottino

In this paper we consider strongly polynomial variations of the auction algorithm for the single origin/many destinations shortest path problem. These variations are based on the idea of graph...

On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators (1992)

Jonathan Eckstein, Dimitri P. Bertsekas

This paper shows, by means of a new type of operator called a splitting operator, that the Douglas-Rachford splitting method for finding a zero of the sum of two monotone operators is a special case...

Convergence rate and termination of asynchronous iterative algorithms (1989)

D. P. Bertsekas, J. N. Tsitsiklis, Dimitri P. Bertsekas, John N. Tsitsiklis

Abstract cations, it is natural to consider distributed exe-We consider iterative algorithms of the form x: = cutions of this iteration whereby the ith processor f (x), executed by a parallel or...

North-Holland On the Douglas-Rachford (1989)

Jonathan Eckstein, Dimitri P. Bertsekas

splitting method and the proximal point algorithm for maximal monotone operators

PARALLEL AND DISTRIBUTED ITERATIVE ALGORITHMS: A SELECTIVE SURVEY 1 (1988)

Dimitri P. Bertsekas, John N. Tsitsiklis

Due to the condition of the original material, there are unavoidable flaws in this reproduction. We have made every effort possible to provide you with the best copy available. If you are...

Adaptive aggregation methods for discounted dynamic programming (1986)

Bertsekas, Dimitri P., Castañon, David A.

"Proceedings of the 25th IEEE Conferecne on Decision and Control, Athens, Greece, December 1986."

Adaptive aggregation methods for discounted dynamic programming (1986)

Bertsekas, Dimitri P., Castañon, David A.

"Proceedings of the 25th IEEE Conferecne on Decision and Control, Athens, Greece, December 1986."

North-Holland RELAXATION METHODS FOR PROBLEMS WITH STRICTLY CONVEX SEPARABLE COSTS AND LINEAR CONSTRAINTS (1986)

Paul Tseng, Dimitri P. Bertsekas

We consider the minimization problem with strictly convex, possibly nondifferentiable, separable cost and linear constraints. The dual of this problem is an unconstrained minimization problem with...

Relaxation methods for minimum cost network flow problems" Laboratory for Information & Decision Systems Report LIDS-P-1339 (1983)

Dimitri P. Bertsekas, Paul Tseng

We view the optimal single commodity network flow problem with linear arc costs and its dual as a pair of monotropic programming problems, i.e. problems of minimizing the separable sum of scalar...

North-Holland A UNIFIED FRAMEWORK FOR PRIMAL-DUAL METHODS IN MINIMUM COST NETWORK FLOW PROBLEMS (1982)

Dimitri P. Bertsekas

We introduce a broad class of algorithms for finding a minimum cost flow in a capacitated network. The algorithms are of the primal-dual type. They maintain primat feasibility with respect to...

North-Holland DISTRIBUTED ASYNCHRONOUS COMPUTATION OF FIXED POINTS* (1981)

Dimitri P. Bertsekas

We present an algorithmic model for distribnted computation of fixed points whereby several processors participate simultaneously in the calculations while exchanging information via communication...

Control of uncertain systems with a set-membership description of the uncertainty. (1971)

Bertsekas, Dimitri P

Massachusetts Institute of Technology. Dept. of Electrical Engineering. Thesis. 1971. Ph.D.

Control of uncertain systems with a set-membership description of the uncertainty. (1971)

Bertsekas, Dimitri P

Massachusetts Institute of Technology. Dept. of Electrical Engineering. Thesis. 1971. Ph.D.