Dynamic Programming

 

Instructor: Dr. Rajesh Ganesan

 

    

Office: Engr Building, Room 2217

Phone: (703) 993-1693                                                    

Fax: (703) 993-1521                                                                                 

Email: rganesan@gmu.edu

 

Syllabus.doc

 

Text: Dynamic Programming by Eric Denardo https://www.amazon.com/Dynamic-Programming-Applications-Computer-Science/dp/0486428109 (PDF not available for download)

 

http://castlelab.princeton.edu/Papers/Powell_UnifiedFramework_ICSNewsletterFall2012.pdf

https://ia801309.us.archive.org/2/items/OperationalResearchWinstonWayne/Operational-Research-Winston-Wayne.pdf Winston's book for OR 541/2

 

 

******************************

Midterm Exam due May 6 (Extended to May 13th)

problem 1 - attempt after week 6

problem 2 - attempt after week 5

problem 3 - attempt after week 10

problem 4 - attempt after week 6

problem 5 - attempt after week 5

problem 6 - attempt after week 10

problem 7 - attempt after week 7

problem 8 - attempt after week 7

problem 9 - attempt after week 6

problem 10 - attempt after week 3

problem 11 - attempt after week 10

problem 12 - attempt after week 6

PDF to go with the midterm (from Winston's book, in case the link given at the top does not open). Q8 is Problem 6 on page 82 of Denardo.

 

Project due on final exam day, May 6 (extended to May 13th)

Final exam due May 6 along with the project. 1046 1049 1050. Attempt after week 12 (extended to May 13th)

 

Email submission for all of the above. I am looking for the DP formulation; the definitions of stage, state, action, and contribution function; and the cost and action matrices. If submitted as an Excel sheet, use one sheet per problem and highlight the final answer (the final answer and the path, again only if you can solve it with numerical values). If the problem is solved using code, you may email me the code files and a Word file with the answers. If handwritten, please scan to PDF. You can use a combination of the above modes as well.

*****************************

 

Week 1:

 

Introduction, Big picture

 

Week 2:  Deterministic DP- Finite Horizon

 

Examples for DP/Approx DP,    Notes 1

Applications: shortest path and longest path problems, Dijkstra's algorithm, DP vs. Dijkstra

Practice Problems: Problems 2 and 4 from the text, pages 25-26; Group A and B problems from Winston's book (pages 968-969, from the link given above)
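The backward recursion behind this week's shortest-path examples can be sketched in a few lines. The staged network and arc costs below are invented for illustration; they are not a problem from the text or from Winston.

```python
import math

# Backward recursion for a shortest-path problem on a staged network.
# Nodes are grouped by stage; arcs only run from one stage to the next.
arcs = {
    (1, 2): 4, (1, 3): 1,             # stage 1 -> stage 2
    (2, 4): 5, (3, 4): 8, (3, 5): 3,  # stage 2 -> stage 3
    (4, 6): 2, (5, 6): 6,             # stage 3 -> terminal node 6
}
nodes_by_stage = [[1], [2, 3], [4, 5], [6]]

def shortest_path(arcs, nodes_by_stage):
    """f(i) = min cost from node i to the terminal node."""
    terminal = nodes_by_stage[-1][0]
    f = {terminal: 0.0}
    decision = {}
    for stage in reversed(nodes_by_stage[:-1]):  # work backward, stage by stage
        for i in stage:
            best = math.inf
            for (a, b), c in arcs.items():
                if a == i and b in f and c + f[b] < best:
                    best, decision[i] = c + f[b], b
            f[i] = best
    return f, decision

f, decision = shortest_path(arcs, nodes_by_stage)
path, node = [1], 1
while node in decision:  # trace the optimal path forward from node 1
    node = decision[node]
    path.append(node)
print(f[1], path)  # 10.0 [1, 3, 5, 6]
```

Note that `decision` records the arg-min at each node, which is what lets us trace the optimal path forward after the backward pass.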

 

We are not meeting today, Jan 28th. I sent out an email to the class yesterday.

 

Three videos have been posted today, Jan 28th, to Canvas. Log into Canvas, select the course, and click Modules; you should see the videos. Please review the videos before coming to class on Wed, Feb 4th next week.

 

 

Weeks 3 and 4: DP formulation: Resource allocation (investment and knapsack), min-max problems, equipment replacement with bounded horizon  Notes

Practice Problems: Problem 3, page 63 of the text; Problem 2, page 974, and Problems 2 and 5, page 985, of Winston's book (from the link given above)

Solve Min-max problems, Equipment replacement with bounded Horizon 

Excel solutions to some problems covered in Weeks 1-4
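As a companion to the knapsack material in these weeks, here is a minimal sketch of the stage/state recursion in Python. The item values, weights, and capacity are invented for illustration, not a problem from the text.

```python
def knapsack(values, weights, capacity):
    # Stage = item, state = remaining capacity, action = take or skip.
    # f[s] holds the best value achievable with remaining capacity s.
    f = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Sweep capacity downward so each item is taken at most once.
        for s in range(capacity, w - 1, -1):
            f[s] = max(f[s], v + f[s - w])
    return f[capacity]

print(knapsack([60, 100, 120], [10, 20, 30], 50))  # 220
```

The downward sweep over the state is what distinguishes the 0/1 formulation from the unbounded one; sweeping upward would allow each item to be taken repeatedly.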

 

Week 5: Capacity expansion, multiple resource allocation, capacity allocation, a hybrid inventory-investment problem (objective as state) notes

Practice Problems: Problem 7, page 27 of the text; Problem 6, page 1000 of Winston (similar to one on page 998); Problem 5, page 1014 of Winston's

Solve the above using Excel or code

Week 6:  Deterministic Inventory control - 3 examples, notes for inventory

Practice Problems: Q3, page 105 of the text; Q1 and Q3 from the text, pages 25-26

 

Curse of dimensionality, traveling salesman, population growth model, mitigating the curse of dimensionality, cyclic graph  notes
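The traveling salesman problem makes the curse of dimensionality concrete: the Held-Karp DP below uses the state (set of visited cities, current city), so the state space grows as n·2^n. The distance matrix is invented for illustration.

```python
import itertools

# dist[i][j] = cost of traveling from city i to city j (illustrative numbers)
dist = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]

def tsp(dist):
    n = len(dist)
    # f[(S, j)]: min cost to leave city 0, visit exactly the set S, end at j
    f = {(frozenset([j]), j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for combo in itertools.combinations(range(1, n), size):
            S = frozenset(combo)
            for j in S:
                f[(S, j)] = min(f[(S - {j}, k)] + dist[k][j] for k in S - {j})
    full = frozenset(range(1, n))
    return min(f[(full, j)] + dist[j][0] for j in full)  # close the tour

print(tsp(dist))  # 21
```

Even this exact DP only pushes the tractable range to roughly 20-25 cities, which is why the mitigation techniques in the notes (state aggregation, approximation) matter.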

 

Week 7: Deterministic DP - Infinite Horizon: equipment replacement with unbounded horizon, analytical solution, value iteration.

Spring break

Weeks 8 and 9: 

Stochastic DP - Finite Horizon: problem types; backward recursion only; every action taken from a given state has an associated set of probable outcomes; optimality in the expected sense; tracing of the path.
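A minimal Python sketch of this backward recursion, in the spirit of the gambling example below: with a number of plays left and some chips in hand, choose a bet size to maximize the probability of reaching a target. The win probability, target, and horizon are invented for illustration.

```python
from functools import lru_cache

P_WIN, TARGET, PLAYS = 0.4, 6, 3

@lru_cache(maxsize=None)
def f(plays, chips):
    """Max probability of finishing with at least TARGET chips."""
    if plays == 0:
        return 1.0 if chips >= TARGET else 0.0
    best = 0.0
    for bet in range(chips + 1):  # action: bet anywhere from 0 to chips
        # expectation over the two probable outcomes of the action
        val = (P_WIN * f(plays - 1, chips + bet)
               + (1 - P_WIN) * f(plays - 1, chips - bet))
        best = max(best, val)
    return best

print(round(f(PLAYS, 2), 3))  # 0.16
```

Each action from a given state leads to a set of probable outcomes (win or lose the bet), and the recursion optimizes the expected value, exactly as described above.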

 

Examples of Stochastic DP - finite horizon notes

Stochastic Dynamic Resource Allocation

 

Milk Distribution problem from Winston's notes

Gambling example from Winston's

Practice: Problems 6 and 7, page 130 of Denardo.

 

Stochastic inventory control (s,S) policy 

Practice Problems: problem 4 page 1034 from Winston's book

Practice: Problem 4 page 1023 and Problem 5 page 1035 of Winston's book
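A small finite-horizon stochastic inventory DP can be sketched as follows. All parameters (costs, demand distribution, horizon) are invented for illustration; the fixed ordering cost K is what makes the optimal order-up-to behavior take the (s, S) form discussed above.

```python
from functools import lru_cache

T = 3                               # planning horizon (periods 0..T-1)
K, c, h, p = 4.0, 1.0, 1.0, 3.0     # fixed order, unit, holding, shortage costs
demand = {0: 0.3, 1: 0.4, 2: 0.3}   # per-period demand distribution
MAX_INV = 4

@lru_cache(maxsize=None)
def f(t, x):
    """Min expected cost over periods t..T-1, starting with inventory x."""
    if t == T:
        return 0.0
    best = float("inf")
    for y in range(x, MAX_INV + 1):          # action: order up to level y
        exp = (K + c * (y - x)) if y > x else 0.0
        for d, prob in demand.items():
            left = y - d
            stage = h * left if left >= 0 else p * (-left)   # hold or short
            exp += prob * (stage + f(t + 1, max(left, 0)))   # lost sales
        best = min(best, exp)
    return best

print(round(f(0, 0), 3))
```

State is on-hand inventory, stage is the period, and the expectation is over the demand outcomes of each ordering decision.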

Markov Chains: Limiting Probabilities
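Limiting probabilities solve pi P = pi with the entries of pi summing to 1. A quick numerical sketch via power iteration, on an invented two-state chain:

```python
P = [[0.9, 0.1],    # two-state transition matrix (rows sum to 1)
     [0.5, 0.5]]

def limiting_probs(P, iters=200):
    """Power iteration: start uniform and repeatedly multiply by P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = limiting_probs(P)
print([round(x, 4) for x in pi])  # [0.8333, 0.1667]
```

For this chain the answer can be checked by hand: pi_0 = 5/6 and pi_1 = 1/6 satisfy 0.9·pi_0 + 0.5·pi_1 = pi_0. These limiting probabilities are what the average-reward MDP criterion in weeks 10-11 is built on.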

Weeks 10 and 11: Stochastic DP-Infinite Horizon

Differences between Stochastic DP Finite vs Infinite Horizon notes

Stochastic DP - Infinite horizon notes

MDP - Markov Decision process: Exhaustive enumeration, LP solution to MDP (academic interest)

MDP - Markov Decision process using Stochastic DP - infinite horizon (Bellman's equation)

 

MDP - Average Reward/Cost: Policy and Value Iteration for the machine replacement problem; the idea behind the average criterion

MDP - Discounted Reward/Cost: Policy and Value Iteration for the machine replacement problem.

 

Excel Example value iteration for MDP both average and discounted criteria

policy iteration excel sheet   Machine replacement problem

MATLAB value iteration for machine replacement

Practice problems: 4, 6, 9, 11 on page 1049
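Value iteration for the discounted criterion can be sketched compactly in Python. The two-state machine-replacement-style model below (states good/bad, actions keep/replace) uses invented transition probabilities, costs, and discount factor; it is not the class example.

```python
GAMMA = 0.9  # discount factor

# P[a][s][s']: transition probabilities; cost[a][s]: one-step cost.
P = {
    "keep":    [[0.8, 0.2], [0.0, 1.0]],
    "replace": [[1.0, 0.0], [1.0, 0.0]],  # replacing restores the "good" state
}
cost = {"keep": [1.0, 4.0], "replace": [3.0, 3.0]}

def value_iteration(eps=1e-8):
    V = [0.0, 0.0]          # states: 0 = good, 1 = bad
    while True:
        newV, policy = [], []
        for s in range(2):
            # Bellman equation: minimize cost-to-go over the actions
            q = {a: cost[a][s] + GAMMA * sum(P[a][s][t] * V[t] for t in range(2))
                 for a in P}
            a_star = min(q, key=q.get)
            newV.append(q[a_star])
            policy.append(a_star)
        if max(abs(newV[s] - V[s]) for s in range(2)) < eps:
            return newV, policy
        V = newV

V, policy = value_iteration()
print(policy)  # ['keep', 'replace']
```

Each sweep applies Bellman's equation at every state; the discount factor makes the update a contraction, so the values converge and the greedy policy stabilizes.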

Week 12: Additional MDP examples

Questions for the following problems

 

Water Resource example: see handout  value iteration using excel

Inventory control Example,  see hand out, excel for inventory control

Machine maintenance and gardener fertilization problems; Excel for both problems

Weeks 13 and 14: SMDP - Semi-Markov Decision process:

 

SMDP summary

SMDP questions. For solutions, click the links below

SMDP for Machine replacement problem value iteration   

SMDP for pricing based investment/asset acquisition value iteration

 

ADP/RL Introduction (Need and Motivation)

https://www.geeksforgeeks.org/machine-learning/types-of-reinforcement-learning/

 

Review DP

 

*******************************************************************

How to use these files

1. Save the long or short code.

2. Change the c and X matrices to suit your problem. For some problems these are available at the end of the code, which you can copy and paste to the top of the code. Use Ctrl+R and Ctrl+T to add and remove the comment symbols (%) on the matrices.

2.a. Make sure that the c matrix has only one state in stage 1. This is to initialize. Consequently the first column will be all zeros.

2.b. If c is 0 on a feasible arc, then use 0.0001 instead of zero. If an arc is infeasible, put a large c value for a min problem or a large negative c value for a max problem.

2.c. If an arc does not exist at all then mark the c value as 0.

3. Change max and min to suit your problem.

4. Save and run (green arrow icon). For the very first time you might get a pop up window asking you to add a path. Click on "Add to path".
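The conventions in steps 2.a-2.c can also be sketched in Python (this is an illustrative sketch, not one of the course MATLAB files): c[i][j] = 0 means no arc, 0.0001 stands in for a true zero cost, and a large value marks an infeasible arc in a min problem. The matrix below is invented, and nodes are assumed to be numbered in stage order.

```python
import math

BIG = 1e6  # "large c value" marking an infeasible arc in a min problem
c = [
    [0, 3, 5, 0],       # node 0 reaches nodes 1 and 2
    [0, 0, 0, 2],       # node 1 reaches node 3
    [0, 0, 0, 0.0001],  # node 2 reaches node 3 at (true) zero cost
    [0, 0, 0, 0],       # node 3 is the terminal node
]

def backward_pass(c):
    """Min-cost-to-go from each node; c[i][j] == 0 means no arc i -> j."""
    n = len(c)
    f = [math.inf] * n
    f[n - 1] = 0.0
    action = [None] * n
    for i in range(n - 2, -1, -1):   # nodes are in stage order, so go backward
        for j in range(n):
            if c[i][j] != 0 and c[i][j] < BIG and c[i][j] + f[j] < f[i]:
                f[i], action[i] = c[i][j] + f[j], j
    return f, action

f, action = backward_pass(c)
print(f[0], action[0])  # 5.0 1
```

The `action` array plays the role of the X (action) matrix: it records which successor achieves the minimum at each node, so the optimal path can be read off forward.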

Matlab short code: Finds the length and only 1 path (Generic)

Matlab long code: Finds the length and up to 12 paths (Generic)

matlab code for investment problem to generate cost and action matrix (investment in-class problem)

 

MATLAB for INVENTORY CONTROL problems

matlab code for inventory problem to generate cost and action matrix (for inv control discussed in class page 969 in Winston)

 

*The following are used together*

matlab code for inventory problem to generate cost and action matrix (for Sailco inv control )

matlab code for (for Sailco inv control - first problem from the first class)

 

matlab code for inventory problem to generate cost and action matrix (for pg 25 prob 1 in Denardo's book). You can then use the Generic codes

 

*The following are used together*

matlab code for inventory problem to generate cost and action matrix (for pg 26 prob 3 in Denardo's book)

Long code (for pg 26 prob 3 in Denardo's book). The long code has been modified from the generic one by adding a few extra lines at the bottom

 

********************************************************