A LEARNING ALGORITHM FOR DISCRETE-TIME STOCHASTIC CONTROL

V. S. Borkar

doi:10.1017/S0269964800142081

Published online by Cambridge University Press: 01 April 2000

Affiliation: School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India. E-mail: borkar@tifr.res.in

Abstract

A simulation-based algorithm for learning good policies for a discrete-time stochastic control process with unknown transition law is analyzed when the state and action spaces are compact subsets of Euclidean spaces. This extends the Q-learning scheme of discrete state/action problems along the lines of Baker [4]. Almost sure convergence is proved under suitable conditions.
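For orientation, the classical Q-learning scheme for finite state and action sets, which the abstract names as its starting point, can be sketched as below. This is only the standard tabular update driven by a simulator, not the algorithm analyzed in the article for compact state and action spaces. The toy two-state simulator `simulate`, the reward structure, the discount factor `gamma`, and the step-size rule are all assumptions made for illustration.

```python
import random


def simulate(state, action, rng):
    """Hypothetical simulator standing in for the unknown transition law.

    Toy problem (an assumption): action 1 earns reward 1, action 0 earns 0;
    the next state equals the action with probability 0.9.
    """
    reward = 1.0 if action == 1 else 0.0
    next_state = action if rng.random() < 0.9 else 1 - action
    return reward, next_state


def q_learning(n_states=2, n_actions=2, gamma=0.5, steps=20000, seed=0):
    """Classical tabular Q-learning with uniformly random exploration."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        action = rng.randrange(n_actions)  # explore uniformly at random
        reward, next_state = simulate(state, action, rng)
        visits[state][action] += 1
        alpha = 1.0 / visits[state][action]  # decreasing step size
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q = q_learning()
# Greedy action in each state under the learned Q-values.
policy = [max(range(len(row)), key=row.__getitem__) for row in q]
print(policy)
```

In this toy problem action 1 always earns the larger reward, so the greedy policy read off the learned table should select action 1 in both states. A table of this kind is not available when states and actions range over compact subsets of Euclidean spaces, which is the setting the abstract addresses.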

Type: Research Article
Information: Probability in the Engineering and Informational Sciences, Volume 14, Issue 2, April 2000, pp. 243-258

DOI: https://doi.org/10.1017/S0269964800142081
