Semi-Markov Decision Processes

7.0 INTRODUCTION

The previous chapter dealt with the discrete-time Markov decision model. In this model, decisions can be made only at fixed epochs $t = 0, 1, \ldots$. However, in many stochastic control problems the times between the decision epochs are not constant but random. A possible tool for analysing such problems is the semi-Markov decision model. In Section 7.1 we discuss the basic elements of this model. Also, for the optimality criterion of the long-run average cost per time unit, we give a data-transformation method by which the semi-Markov decision model can be converted into an equivalent discrete-time Markov decision model. The data-transformation method enables us to apply the recursive method of value iteration to the semi-Markov decision model. Section 7.2 summarizes various algorithms for the computation of an average cost optimal policy. In Section 7.3 we discuss the value-iteration algorithm for a semi-Markov decision model in which the times between the decision epochs are exponentially distributed. For this particular case the computational effort of the value-iteration algorithm can be reduced considerably by introducing fictitious decision epochs. This simple trick creates sparse transition matrices, leading to a much more effective value-iteration algorithm. Section 7.4 illustrates how value iteration in combination with an embedding idea can be used in the optimization of queues. The semi-Markov decision model is a very useful tool for optimal control in queueing systems. In Section 7.5 we will exploit a remarkable feature of the policy-iteration algorithm, namely that the algorithm typically achieves its largest improvements in costs in the first few iterations. This finding is sometimes useful for attacking the curse of dimensionality in applications with a multidimensional state space. The idea is first to determine the relative values for a reasonable starting policy and then to apply a single policy-improvement step. This heuristic approach will be illustrated for a dynamic routing problem.

A First Course in Stochastic Models, H.C. Tijms. © 2003 John Wiley & Sons, Ltd. ISBNs: 0-471-49880-7 (HB); 0-471-49881-5 (PB)

7.1 THE SEMI-MARKOV DECISION MODEL

Consider a dynamic system whose state is reviewed at random epochs. At those epochs a decision has to be made and costs are incurred as a consequence of the decision made. The set of possible states is denoted by $I$. For each state $i \in I$, a set $A(i)$ of possible actions is available. It is assumed that the state space $I$ and the action sets $A(i)$, $i \in I$, are finite. This controlled dynamic system is called a semi-Markov decision process when the following Markovian properties are satisfied: if at a decision epoch the action $a$ is chosen in state $i$, then the time until the next decision epoch and the state at that epoch depend only on the present state $i$ and the subsequently chosen action $a$ and are thus independent of the past history of the system. Also, the costs incurred until the next decision epoch depend only on the present state and the action chosen in that state. We note that in specific problems the state occurring at the next transition will often depend on the time until that transition. Also, the costs usually consist of lump costs incurred at the decision epochs and rate costs incurred continuously in time.

As an example, consider a single-product inventory system in which the demand process is described by a Poisson process and the inventory position can be replenished at any time. In this example the decision epochs are the demand epochs and they occur randomly in time. The decision is whether or not to raise the inventory position after a demand has occurred. The costs typically consist of fixed replenishment costs and holding costs that are incurred continuously in time.

The long-run average cost per time unit

The long-run average cost per time unit is taken as the optimality criterion. For this criterion the semi-Markov decision model is in fact determined by the following characteristics:

$p_{ij}(a)$ = the probability that at the next decision epoch the system will be in state $j$ if action $a$ is chosen in the present state $i$,

$\tau_i(a)$ = the expected time until the next decision epoch if action $a$ is chosen in the present state $i$,

$c_i(a)$ = the expected costs incurred until the next decision epoch if action $a$ is chosen in the present state $i$.

It is assumed that $\tau_i(a) > 0$ for all $i \in I$ and $a \in A(i)$. As before, a stationary policy $R$ is a rule which assigns to each state $i$ a single action $R_i \in A(i)$ and always prescribes this action whenever the system is observed in state $i$ at a decision epoch. Since the state space is finite, it can be shown that under each stationary policy the number of decisions made in any finite time interval is finite with probability 1. We omit the proof of this result. Let

$X_n$ = the state of the system at the $n$th decision epoch.

Then it follows that under a stationary policy $R$ the embedded stochastic process $\{X_n\}$ is a discrete-time Markov chain with one-step transition probabilities $p_{ij}(R_i)$.
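To make the three model characteristics concrete, the following Python sketch stores $p_{ij}(a)$, $\tau_i(a)$ and $c_i(a)$ for a small model and builds the transition matrix $p_{ij}(R_i)$ of the embedded chain under a stationary policy $R$. The two-state model and all its numbers are hypothetical and serve only as an illustration. The sketch then evaluates the long-run average cost per time unit as the ratio $\sum_j \pi_j c_j(R_j) / \sum_j \pi_j \tau_j(R_j)$, where $\pi$ is the equilibrium distribution of the embedded chain. This ratio is the standard renewal-reward expression and is used here under the added assumption that the embedded chain has a single recurrent class (unichain); it is not derived in the text above.

```python
from fractions import Fraction as F

# Hypothetical two-state, two-action model (illustrative numbers only).
# p[i][a][j] : probability that the next decision epoch finds state j
# tau[i][a]  : expected time until the next decision epoch (must be > 0)
# c[i][a]    : expected costs incurred until the next decision epoch
p = {
    0: {0: [F(1, 2), F(1, 2)], 1: [F(0), F(1)]},
    1: {0: [F(1), F(0)], 1: [F(1, 4), F(3, 4)]},
}
tau = {0: {0: F(2), 1: F(1)}, 1: {0: F(3), 1: F(1, 2)}}
c = {0: {0: F(4), 1: F(3)}, 1: {0: F(6), 1: F(2)}}


def embedded_chain(p, policy):
    """Transition matrix p_ij(R_i) of the embedded chain {X_n}.

    States are assumed to be labelled 0, 1, ..., n-1.
    """
    return [list(p[i][policy[i]]) for i in sorted(policy)]


def stationary_distribution(P):
    """Solve pi = pi P with sum(pi) = 1 by Gauss-Jordan elimination.

    Assumes a unichain matrix; otherwise the system is singular and the
    pivot search below raises StopIteration.
    """
    n = len(P)
    # n-1 balance equations plus the normalisation equation, augmented form.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    A.append([F(1)] * n + [F(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def average_cost(p, tau, c, policy):
    """Long-run average cost per time unit under a stationary policy."""
    pi = stationary_distribution(embedded_chain(p, policy))
    cost = sum(pi[i] * c[i][policy[i]] for i in policy)
    time = sum(pi[i] * tau[i][policy[i]] for i in policy)
    return cost / time


if __name__ == "__main__":
    R = {0: 0, 1: 0}  # stationary policy: action 0 in both states
    print(stationary_distribution(embedded_chain(p, R)))
    print(average_cost(p, tau, c, R))
```

Exact fractions are used instead of floating-point numbers so that the equilibrium probabilities can be checked by hand. For a model of realistic size a numerical linear-algebra routine would be the more natural choice.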
