Measure-Valued Differentiation for Random Horizon Problems
2006, v.12, №3, 509-536
This paper deals with sensitivity analysis (gradient estimation) of random horizon cost functions of Markov chains. More precisely, we consider general state-space Markov chains, where the random horizon is given by the hitting time of the chain on a predefined set. The "cost" of interest is the expectation of a functional of the stopped process. This encompasses a wide range of models, such as the Gambler's ruin problem and performance evaluation for stationary queueing networks. We work within the framework of measure-valued differentiation and provide a general condition under which the gradient of the random horizon performance can be obtained in closed form. For several scenarios that typically occur in applications, we then provide sufficient conditions for this general condition to hold. We illustrate our results with a series of examples. Finally, we discuss unbiased sensitivity estimators and establish a new unbiased estimator for the gradient of stationary Markov chains.
Keywords: weak derivatives, gradient estimation, optimization, Markov chains
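
As an illustration of the kind of random horizon gradient discussed in the abstract, the following minimal sketch treats a simple Gambler's ruin chain. The setup is an assumption made here for illustration and is not taken from the paper: states 0..n with absorption at 0 and n, a step of +1 with probability theta and -1 otherwise, and cost h(x) = P_x(hit n before 0). The derivative of the step law with respect to theta is the difference of the point masses at +1 and -1, which suggests the representation h'(x) = E_x[sum over k < tau of (h(X_k + 1) - h(X_k - 1))]. The function names `win_prob`, `gradient_via_split` and `closed_form` are hypothetical. The code evaluates this expectation exactly by fixed-point sweeps rather than by simulation, so it only checks the formula and is not an estimator.

```python
"""Gambler's ruin: exact check of a measure-valued-derivative style formula.

Assumptions (illustrative, not from the paper): integer states 0..n,
absorbing at 0 and n, step +1 with probability theta and -1 otherwise,
cost h(x) = P_x(hit n before 0).
"""


def win_prob(theta, n, sweeps=2000):
    """Return h with h[x] = P_x(hit n before 0), via Gauss-Seidel sweeps."""
    h = [0.0] * (n + 1)
    h[n] = 1.0  # boundary values: h[0] = 0, h[n] = 1
    for _ in range(sweeps):
        for x in range(1, n):
            h[x] = theta * h[x + 1] + (1.0 - theta) * h[x - 1]
    return h


def gradient_via_split(theta, n, sweeps=2000):
    """Return v with v[x] = E_x[sum_{k < tau} (h(X_k + 1) - h(X_k - 1))].

    The step law theta*delta_{+1} + (1 - theta)*delta_{-1} has derivative
    delta_{+1} - delta_{-1} in theta, so w[y] below is the one-step effect
    of perturbing the transition taken from state y.
    """
    h = win_prob(theta, n, sweeps)
    w = [0.0] * (n + 1)
    for y in range(1, n):
        w[y] = h[y + 1] - h[y - 1]
    # Solve v = w + P v on the interior with v[0] = v[n] = 0, i.e. accumulate
    # w along the path up to (not including) the hitting time.
    v = [0.0] * (n + 1)
    for _ in range(sweeps):
        for x in range(1, n):
            v[x] = w[x] + theta * v[x + 1] + (1.0 - theta) * v[x - 1]
    return v


def closed_form(theta, x, n):
    """Classical ruin formula for h(x); valid for theta != 0.5."""
    r = (1.0 - theta) / theta
    return (1.0 - r ** x) / (1.0 - r ** n)


if __name__ == "__main__":
    theta, n, x0, eps = 0.6, 5, 2, 1e-6
    grad = gradient_via_split(theta, n)[x0]
    # Central finite difference of the closed form, used only as a reference.
    fd = (closed_form(theta + eps, x0, n) - closed_form(theta - eps, x0, n)) / (2 * eps)
    print("formula:", grad, "finite difference:", fd)
```

In a simulation setting the inner differences h(X_k + 1) - h(X_k - 1) would typically be replaced by paired "phantom" paths started from the two perturbed states; the exact version above is chosen only because it can be compared against the classical closed form.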