
Hierarchical Decision Making in Semiconductor Fabs Using Multi-Time Scale Markov Decision Processes

Panigrahi, Jnana Ranjan and Bhatnagar, Shalabh (2004) Hierarchical Decision Making in Semiconductor Fabs Using Multi-Time Scale Markov Decision Processes. In: 43rd IEEE Conference on Decision and Control, 2004. CDC, 14-17 December, Nassau, Bahamas, Vol. 4, pp. 4387-4392.

Abstract

Decision making in semiconductor fabs takes place on different timescales. Decisions on buying or discarding machines are made on the slower timescale, while decisions on capacity allocation and switchover are made on the faster timescale. We formulate this problem along the lines of a recently developed multi-time scale Markov decision process (MMDP) framework. We present numerical experiments in which we use the TD(0) and Q-learning algorithms with a linear approximation architecture, and compare them with the policy iteration algorithm. The experiments cover two scenarios. In the first, transition probabilities are computed and used in the algorithms. In the second, transitions are simulated without explicitly computing the transition probabilities. We observe that TD(0) requires less computation than Q-learning. Moreover, algorithms that use simulated transitions require significantly less computation than their counterparts that compute transition probabilities.
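
To make the second scenario concrete, the following is a minimal sketch of TD(0) with a linear approximation architecture driven purely by simulated transitions. It is not the authors' fab model: the five-state random walk, the two-component feature vector, the step size, and all function names (`simulate_step`, `features`, `td0_linear`, `value`) are assumptions chosen only for illustration, since this record does not give the paper's state space, features, or parameters.

```python
import random

N_STATES = 5  # hypothetical toy chain, standing in for the (unspecified) fab state space


def simulate_step(state, rng):
    """Simulate one transition without ever forming a transition matrix.

    Toy dynamics (an assumption): move left or right with equal probability;
    leaving on the right pays reward 1, leaving on the left pays 0.
    Returns (next_state or None if the episode ended, reward).
    """
    nxt = state + (1 if rng.random() < 0.5 else -1)
    if nxt < 0:
        return None, 0.0
    if nxt >= N_STATES:
        return None, 1.0
    return nxt, 0.0


def features(state):
    """Linear architecture: a bias term plus the normalized position."""
    return [1.0, state / (N_STATES - 1)]


def value(w, state):
    """Approximate value V(s) = w . phi(s)."""
    return sum(wi * pi for wi, pi in zip(w, features(state)))


def td0_linear(episodes=10000, alpha=0.005, gamma=1.0, seed=0):
    """TD(0) with linear function approximation on simulated transitions."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(episodes):
        s = N_STATES // 2  # every episode starts in the middle state
        while s is not None:
            s_next, r = simulate_step(s, rng)
            v_next = 0.0 if s_next is None else value(w, s_next)
            delta = r + gamma * v_next - value(w, s)  # temporal-difference error
            phi = features(s)
            w = [wi + alpha * delta * pi for wi, pi in zip(w, phi)]
            s = s_next
    return w


if __name__ == "__main__":
    weights = td0_linear()
    for s in range(N_STATES):
        print(s, round(value(weights, s), 3))
```

In this toy chain the true value of state s is (s + 1) / 6, which is linear in the chosen features, so the learned values should settle near that line, up to noise from the constant step size. Each update touches only the sampled transition, which illustrates why the simulation-based variants can avoid the cost of computing transition probabilities.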

Item Type: Conference Paper
Publisher: IEEE
Additional Information: ©1990 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Keywords: Semiconductor fab;Multi-time scale Markov decision process (MMDP);Reinforcement learning;Temporal difference (TD(0)) learning;Q-learning
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 05 Dec 2005
Last Modified: 19 Sep 2010 04:21
URI: http://eprints.iisc.ac.in/id/eprint/4257
