Shalabh Bhatnagar

Parametrized Actor-Critic Algorithms for Finite-Horizon MDPs (2009)

Mohammed Shahid Abdulla, Shalabh Bhatnagar

Abstract — Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probability transition matrix per stage. Thus the curse of dimensionality affects FH-MDPs more...

Discrete Parameter Simulation Optimization Algorithms with Applications to Admission Control with Dependent Service Times (2009)

Vivek Mishra, Shalabh Bhatnagar, N. Hemach

Abstract — We propose certain discrete parameter variants of well known simulation optimization algorithms. Two of these algorithms are based on the smoothed functional (SF) technique while two...

Information theoretic justification of Boltzmann selection and its generalization to Tsallis case (2009)

Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar

On Measure-Theoretic aspects of Nonextensive Entropy Functionals and corresponding Maximum Entropy Prescriptions (2009)

Ambedkar Dukkipati, Shalabh Bhatnagar, M. Narasimha Murty

Shannon entropy of a probability measure P, defined as $-\int_X \frac{dP}{d\mu} \ln \frac{dP}{d\mu} \, d\mu$ on a measure space $(X, M, \mu)$, is not a natural extension from the discrete case. However, maximum entropy (ME)...

A probabilistic constrained nonlinear optimization framework to optimize RED parameters (2009)

Patro, Rajesh Kumar, Bhatnagar, Shalabh

The random early detection (RED) technique has seen a lot of research over the years. However, the functional relationship between RED performance and its parameters viz., queue weight ($\omega_q$),...

Ant Colony Optimization Algorithms for Shortest Path Problems (2009)

Kolavali, Sudha Rani, Bhatnagar, Shalabh

We propose four variants of recently proposed multi-timescale algorithm in [1] for ant colony optimization and study their application on a multi-stage shortest path problem. We study the performance...

Two-Timescale Q-Learning with an Application to Routing in Communication Networks (2008)

Mohan Babu K, Shalabh Bhatnagar

We propose two variants of the Q-learning algorithm that (both) use two timescales. One of these updates Q-values of all feasible state-action pairs at each instant while the other updates Q-values...

Gelfand-Yaglom-Perez Theorem for Generalized Relative Entropy Functionals (2008)

Ambedkar Dukkipati, Shalabh Bhatnagar, M. Narasimha Murty

The measure-theoretic definition of Kullback-Leibler relative-entropy (or simply KL-entropy) plays a basic role in defining various classical information measures on general spaces. Entropy, mutual...

Multimedia Systems manuscript No. (will be inserted by the editor) (2008)

Sudha Velusamy, Lakshmi Gopal, Shalabh Bhatnagar, V Sridhar

Abstract With broadcast Television (TV) going digital, the number of channels and the programs aired have increased tremendously. Millions of audiences of various categories such as adults, children,...

Discrete Parameter Simulation Optimization Algorithms with Applications to Admission Control with Dependent Service Times (2008)

Vivek Mishra, Shalabh Bhatnagar, N. Hemach

Abstract — We propose certain discrete parameter variants of well known simulation optimization algorithms. Two of these algorithms are based on the smoothed functional (SF) technique while two...

An Algorithm for Dynamic Optimal Bandwidth Allocation in Communication Networks (2008)

Diksha Sharma, Shalabh Bhatnagar, Shyam Chakraborty

We study the problem of optimal bandwidth allocation in communication networks. We consider a queueing model with two queues to which traffic from different competing flows arrive. The queue length...

Incremental Natural Actor-Critic Algorithms (2008)

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online...

Robust Optimization of Random Early Detection (2008)

Rahul Vaidya, Shalabh Bhatnagar

Random Early Detection (RED) is the most widely used Adaptive Queue Management (AQM) mechanism in the internet. Although RED shows better performance than its predecessor, DropTail, its performance is...

Quotient evolutionary space: Abstraction of evolutionary process w.r.t macroscopic properties (2008)

Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar

Abstract- Darwinian evolution, which is characterized in terms of particular macroscopic behavior that emerges from microscopic organismic interaction, considers populations as units of evolutionary...

Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes (2008)

Bhatnagar, Shalabh, Abdulla, Mohammed Shahid

We develop four simulation-based algorithms for finite-horizon Markov decision processes. Two of these algorithms are developed for finite state and compact action spaces while the other two are for...

SPSA Based Feature Relevance Estimation For Video Retrieval (2008)

Velusamy, Sudha, Bhatnagar, Shalabh, Basavaraja, S, Sridhar, V

With the availability of a huge amount of video data on various sources, efficient video retrieval tools are increasingly in demand. Video being a multi-modal data, the perceptions of ``relevance''...

Power Law Behavior in Single Server Queue with Random Rates (2008)

Shachi Sharma, Shalabh Bhatnagar

This paper presents a new single server queueing system – MR/MR/1 – that is an extension of the M/M/1 system with random arrival and service rates. We give a sample path based analysis for...

An efficient ad recommendation system for TV programs (2008)

Velusamy, Sudha, Gopal, Lakshmi, Bhatnagar, Shalabh, Varadarajan, Sridhar

With broadcast Television (TV) going digital, the number of channels and the programs aired have increased tremendously. Millions of audiences of various categories such as adults, children, youth...

Fuzzy Clustering Based Ad Recommendation for TV Programs (2008)

Sudha Velusamy, Lakshmi Gopal, Sridhar. V, Shalabh Bhatnagar

Abstract. Advertisements (Ads) are the main revenue earner for Television (TV) broadcasters. As TV reaches a large audience, it acts as the best media for advertisements of products and services. With...

An Efficient and Optimized Bluetooth Scheduling Algorithm for Piconets (2008)

Vijay Prakash Chaturvedi, V. Rakesh, Shalabh Bhatnagar

Abstract. Bluetooth is an emerging standard in short range, low cost and low power wireless networks. MAC is a generic polling based protocol, where a central Bluetooth unit (master) determines...

An Optimal Weighted-Average Congestion Based Pricing Scheme for Enhanced QoS (2008)

Koteswara Rao Vemu, Shalabh Bhatnagar, N. Hemach

Abstract. Pricing is an effective tool to control congestion and achieve quality of service (QoS) provisioning for multiple differentiated levels of service. In this paper, we consider the problem of...

Optimal Control of a Feed-Back Queue via Stochastic Approximation (2008)

Shalabh Bhatnagar, Vinod Sharma

An optimal feed-back control policy which provides good performance in terms of several conflicting performance criteria like mean delay, delay jitter, mean throughput, etc., is obtained for a...

Optimal Parameter Trajectory Estimation in Parameterized SDEs: An Algorithmic Procedure (2008)

Shalabh Bhatnagar, Vivek Kumar Mishra

We consider the problem of estimating the optimal parameter trajectory over a finite time interval in a parameterized stochastic differential equation (SDE), and propose a simulation-based...

Rate Based ABR Flow Control using Two Timescale SPSA (2007)

Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus

In this paper, a two timescale simultaneous perturbation stochastic approximation (SPSA) algorithm is developed and applied to closed loop rate based available bit rate (ABR) flow control. The...

On measure-theoretic aspects of nonextensive entropy functionals and corresponding maximum entropy prescriptions (2007)

Dukkipati, Ambedkar, Bhatnagar, Shalabh, Murty, Narasimha M

Shannon entropy of a probability measure P, defined as $-\int_X \frac{dP}{d\mu} \ln \frac{dP}{d\mu} \, d\mu$ on a measure space $(X, M, \mu)$, is not a natural extension from the discrete case....

Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes (2007)

Abdulla, Mohammed Shahid, Bhatnagar, Shalabh

This article proposes several two-timescale simulation-based actor-critic algorithms for solution of infinite horizon Markov Decision Processes with finite state-space under the average cost...

Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization (2007)

Bhatnagar, Shalabh

In this article, we present three smoothed functional (SF) algorithms for simulation optimization. While one of these estimates only the gradient by using a finite difference approximation with two...

Natural-Gradient Actor-Critic Algorithms (2007)

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee

We prove the convergence of four new reinforcement learning algorithms based on the actor-critic architecture, on function approximation, and on natural gradients. Reinforcement learning is a class of...

Optimal Multi-layered Congestion Based Pricing Schemes for Enhanced (2007)

Koteswara Rao Vemu, Shalabh Bhatnagar, N. Hemach

Pricing is an effective tool to control congestion and achieve quality of service (QoS) provisioning for multiple differentiated levels of service. In this paper, we consider the problem of pricing...

Link-Route Pricing for Enhanced QoS (2007)

Koteswara Rao Vemu, Shalabh Bhatnagar, N. Hemach

Abstract — Pricing is an effective tool to control congestion and achieve quality of service (QoS) provisioning for multiple differentiated levels of service. In this paper, we consider the problem...

and (2007)

Shalabh Bhatnagar

The problem of estimating the time-dependent statistical characteristics of a random dynamical system is studied under two different settings. In the first, the system dynamics is governed by a...

Adaptive newton-based multivariate smoothed functional algorithms for simulation optimization (2007)

Shalabh Bhatnagar

In this paper, we present three smoothed functional (SF) algorithms for simulation optimization. While one of these estimates only the gradient by using a finite difference approximation with two...

New Algorithms of the Q-Learning Type (2007)

Shalabh Bhatnagar, K. Mohan Babu

We propose two algorithms for Q-learning that use the two timescale stochastic approximation methodology. The first of these updates Q-values of all feasible state-action pairs at each instant while...

Reinforcement learning based algorithms for average cost Markov decision processes, Discrete Event Dynamic Systems: Theory and Applications (2007)

Mohammed Shahid Abdulla, Shalabh Bhatnagar

This article proposes several two-timescale simulation-based actor-critic algorithms for solution of infinite horizon Markov Decision Processes with finite state-space under the average cost...

Robust optimization of Random Early Detection (2006)

Vaidya, Rahul, Bhatnagar, Shalabh

Random Early Detection (RED) is the most widely used Adaptive Queue Management (AQM) mechanism in the internet. Although RED shows better performance than its predecessor, DropTail, its performance...

Partition based pattern synthesis technique with efficient algorithms for nearest neighbor classification (2006)

Viswanath, P, Murty, Narasimha M, Bhatnagar, Shalabh

Nearest neighbor (NN) classifier is a popular non-parametric classifier. It is conceptually a simple classifier and shows good performance. Due to the curse of dimensionality effect, the size of...

A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events (2006)

Bhatnagar, Shalabh, Borkar, Vivek S, Akarapu, Madhukar

We study the problem of long-run average cost control of Markov chains conditioned on a rare event. In a related recent work, a simulation based algorithm for estimating performance measures...

Actor-critic algorithms for hierarchical Markov decision processes (2006)

Bhatnagar, Shalabh, Panigrahi, Ranjan J

We consider the problem of control of hierarchical Markov decision processes and develop a simulation based two-timescale actor-critic algorithm in a general framework. We also develop certain...

Nonextensive triangle equality and other properties of Tsallis relative-entropy minimization (2006)

Dukkipati, Ambedkar, Murty, Narasimha M, Bhatnagar, Shalabh

Kullback-Leibler relative-entropy has unique properties in cases involving distributions resulting from relative-entropy minimization. Tsallis relative-entropy is a one-parameter generalization of...

A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events (2006)

Shalabh Bhatnagar, Vivek S. Borkar, Madhukar Akarapu, Shie Mannor

We study the problem of long-run average cost control of Markov chains conditioned on a rare event. In a related recent work, a simulation based algorithm for estimating performance measures...

A Discrete Parameter Stochastic Approximation Algorithm for Simulation Optimization (2005)

Bhatnagar, Shalabh, Kowshik, Hemant J

The authors develop a two-timescale simultaneous perturbation stochastic approximation algorithm for simulation-based parameter optimization over discrete sets. This algorithm is applicable in cases...

Overlap pattern synthesis with an efficient nearest neighbor classifier (2005)

Viswanath, P, Murty, Narasimha M, Bhatnagar, Shalabh

Nearest neighbor (NN) classifier is the most popular non-parametric classifier. It is a simple classifier with no design phase and shows good performance. Important factors affecting the efficiency...

Adaptive Multivariate Three-Timescale Stochastic Approximation Algorithms for Simulation Based Optimization (2005)

Bhatnagar, Shalabh

We develop in this article, four adaptive three-timescale stochastic approximation algorithms for simulation optimization that estimate both the gradient and Hessian of average cost at each update...

Information theoretic justification of Boltzmann selection and its generalization to Tsallis case (2005)

Dukkipati, Ambedkar, Murty, Narasimha M, Bhatnagar, Shalabh

A generalized evolutionary algorithm based on Tsallis statistics is proposed. The algorithm uses Tsallis generalized canonical distribution, which is one parameter generalization of Boltzmann...

Information theoretic justification of Boltzmann selection and its generalization to Tsallis case (2005)

Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar

Abstract- A generalized evolutionary algorithm based on Tsallis statistics is proposed. The algorithm uses Tsallis generalized canonical distribution, which is one parameter generalization of...

A discrete parameter stochastic approximation algorithm for simulation optimization (2005)

Shalabh Bhatnagar, Hemant J. Kowshik

We develop in this paper a two-timescale simultaneous perturbation stochastic approximation algorithm for simulation based parameter optimization over discrete sets. This algorithm is applicable in...

With today’s Internet applications posing newer demands... (2005)

Rajesh Kumar Patro, Shalabh Bhatnagar

The random early detection (RED) technique has seen a lot of research over the years. However, the functional relationship between RED performance and its parameters viz., queue weight (wq), marking...

Reinforcement learning based algorithms for finite horizon Markov decision processes, Submitted (2005)

Shalabh Bhatnagar, Mohammed Shahid Abdulla

Abstract — We develop a simulation based algorithm for finite horizon Markov decision processes with finite state and finite action space. Illustrative numerical experiments with the proposed...

A discrete parameter stochastic approximation algorithm for simulation optimization, Simulation (2005)

Shalabh Bhatnagar, Hemant J. Kowshik

The authors develop a two-timescale simultaneous perturbation stochastic approximation algorithm for simulation-based parameter optimization over discrete sets. This algorithm is applicable in cases...

Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization (2005)

Shalabh Bhatnagar

We develop in this article, four adaptive three-timescale stochastic approximation algorithms for simulation optimization that estimate both the gradient and Hessian of average cost at each update...

A Simultaneous Perturbation Stochastic Approximation-Based Actor-Critic Algorithm for Markov Decision Processes (2004)

Bhatnagar, Shalabh, Kumar, Shishir

A two-timescale simulation-based actor-critic algorithm for solution of infinite horizon Markov decision processes with finite state and compact action spaces under the discounted cost criterion is...

Hierarchical Decision Making in Semiconductor Fabs Using Multi-Time Scale Markov Decision Processes (2004)

Panigrahi, Jnana Ranjan, Bhatnagar, Shalabh

There are different timescales of decision making in semiconductor fabs. While decisions on buying/discarding of machines are made on the slower timescale, those that deal with capacity allocation...

Cauchy Annealing Schedule: An Annealing Schedule for Boltzmann Selection Scheme in Evolutionary Algorithms (2004)

Dukkipati, Ambedkar, Murty, Narasimha M, Bhatnagar, Shalabh

Boltzmann selection is an important selection mechanism in evolutionary algorithms as it has theoretical properties which help in theoretical analysis. However, Boltzmann selection is not used in...

A Pattern Synthesis Technique with an Efficient Nearest Neighbor Classifier for Binary Pattern Recognition (2004)

Viswanath, P, Murty, Narasimha M, Bhatnagar, Shalabh

Important factors affecting the efficiency and performance of the nearest neighbor classifier (NNC) are space, classification time requirements and for high dimensional data, due to the curse of...

Optimized RIO for diffserve networks (2004)

Rahul Vaidya, Shalabh Bhatnagar

Even though Differentiated Services with Assured Forwarding provide bandwidth and other guarantees, the equilibrium queue size of the router depends on network conditions as well as network settings....

Cauchy Annealing Schedule: An Annealing Schedule for Boltzmann Selection Scheme in Evolutionary Algorithms (2004)

Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar

Abstract — Boltzmann selection is an important selection mechanism in evolutionary algorithms as it has theoretical properties which help in theoretical analysis. However, Boltzmann selection is...

Two-Timescale Simultaneous Perturbation Stochastic Approximation Using Deterministic Perturbation Sequences (2003)

Bhatnagar, Shalabh, Fu, Michael C, Marcus, Steven I, Wang, I-Jeng

Simultaneous perturbation stochastic approximation (SPSA) algorithms have been found to be very effective for high-dimensional simulation optimization problems. The main idea is to estimate the...

Quotient Evolutionary Space: Abstraction of Evolutionary process w.r.t macroscopic properties (2003)

Dukkipati, Ambedkar, Murty, Narasimha M, Bhatnagar, Shalabh

Darwinian evolution, which is characterized in terms of particular macroscopic behavior that emerges from microscopic organismic interaction, considers populations as units of evolutionary change. We...

Convergence of simultaneous perturbation stochastic approximation for nondifferentiable optimization (2003)

Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus

Simultaneous perturbation stochastic approximation (SPSA) algorithms have been found to be very effective for high-dimensional simulation optimization problems. The main idea is to estimate the...

Multiscale chaotic SPSA and smoothed functional algorithms for simulation optimization, Simulation (2003)

Shalabh Bhatnagar, Vivek S. Borkar

The authors propose a two-timescale version of the one-simulation smoothed functional (SF) algorithm with extra averaging. They also propose the use of a chaotic simple deterministic iterative...

A time aggregation approach to Markov decision processes. Automatica (2002)

Xi-ren Cao, Zhiyuan Ren, Shalabh Bhatnagar, Michael Fu, Steven Marcus

Abstract We propose a time aggregation approach for the solution of infinite horizon average cost Markov decision processes via policy iteration. In this approach, policy update is only carried out...

Optimal structured feedback policies for ABR flow control using two-timescale SPSA (2001)

Shalabh Bhatnagar, Steven I. Marcus, Pedram Jaefari, M. Fard

Optimal structured feedback control policies for rate-based flow control of available bit rate (ABR) service in asynchronous transfer mode (ATM) networks are obtained in the presence of information...

Randomized Difference Two-Timescale Simultaneous Perturbation Stochastic Approximation Algorithms for Simulation Optimization of Hidden Markov Models (2000)

Bhatnagar, Shalabh, Fu, Michael C., Marcus, Steven I., Bhatnagar, Shashank

We propose two finite difference two-timescale simultaneous perturbation stochastic approximation (SPSA) algorithms for simulation optimization of hidden Markov models. Stability and convergence of both...

Approximate Policy Iteration for Semiconductor Fab-Level Decision Making - a Case Study (2000)

He, Ying, Bhatnagar, Shalabh, Fu, Michael C., Marcus, Steven I.

In this paper, we propose an approximate policy iteration (API) algorithm for a semiconductor fab-level decision making problem. This problem is formulated as a discounted cost Markov Decision Process...

Randomized difference two-timescale simultaneous perturbation stochastic approximation algorithms for simulation optimization of hidden Markov models (2000)

Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus, Shashank Bhatnagar

We propose two finite difference two-timescale simultaneous perturbation stochastic approximation (SPSA) algorithms for simulation optimization of hidden Markov models. Stability and convergence of...

Optimal Multilevel Feedback Policies for ABR Flow Control using Two Timescale SPSA (1999)

Bhatnagar, Shalabh, Fu, Michael C., Marcus, Steven I.

Optimal multilevel control policies for rate based flow control in available bit rate (ABR) service in asynchronous transfer mode (ATM) networks are obtained in the presence of information and...

Optimal Multilevel Feedback Policies for ABR Flow Control using Two Timescale SPSA (1999)

Bhatnagar, Shalabh, Fu, Michael C., Marcus, Steven I.

Optimal multilevel feedback control policies for rate based flow control in available bit rate (ABR) service in asynchronous transfer mode (ATM) networks are obtained in the presence of information and...

A Markov Decision Process Model for Capacity Expansion and Allocation (1999)

Shalabh Bhatnagar, Emmanuel Fernandez-Gaucherand, Michael C. Fu, Ying He, Steven I. Marcus

We present a finite-horizon Markov decision process (MDP) model for providing decision support in semiconductor manufacturing on such critical operational issues as when to add additional capacity...

Optimal Multilevel Feedback Policies for ABR Flow Control using Two Timescale SPSA (1999)

Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus

Optimal multilevel feedback control policies for rate based flow control in available bit rate (ABR) service in asynchronous transfer mode (ATM) networks are obtained in the presence of information...

Optimal Structured Feedback Policies for ABR Flow Control Using Two Timescale SPSA (1994)

Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus, Pedram J. Fard

Abstract—Optimal structured feedback control policies for rate-based flow control of available bit rate service in asynchronous transfer mode networks are obtained in the presence of information...