Hi,
I am trying to model conflict scenarios with response solutions (for electricity access) across 11 regions in a country using an MDP.
I am stuck on some compilation errors:
(1) Domain violation for elements L, M & H, and (2) Row dimension inconsistency for the regions.
The GAMS file is attached (it is part of a larger model).
Kindly assist.
MDP Modeling
 Attachments

 Conflict Model.gms
 (2.85 KiB) Downloaded 33 times
Re: MDP Modeling
This syntax does not work in GAMS (even if you change L M H to low medium high, or the elements of set Conflicts from low medium high to L M H):
What are you trying to say? If you want to "map", then use a mapping set:
Moreover, you can only optimize a free scalar variable, so "solve MDP using mip maximizing V(Regions, Time);" does not work. I changed your objective to use a scalar variable.
Your BellmanEq is a quadratic constraint, so you can't solve this as a MIP. I changed it to MIQCP. Now this compiles and solves, but it is probably not what you want. These are a lot of rookie mistakes. I suggest that you go back to square one and invest in learning the system by studying the GAMS tutorials.
Michael
Code: Select all
table ConflictToSolution(Regions, Conflicts, Solutions) 'Mapping conflict level to solutions in each region'
L M H A B C
region1 0 0 1 1 0 0
region2 0 0 1 1 0 0
region3 0 0 1 1 0 0
region4 0 0 1 1 0 0
region5 0 0 1 1 0 0
region6 0 0 1 1 0 0
region7 0 0 1 1 0 0
region8 0 0 1 1 0 0
region9 0 0 1 1 0 0
region10 0 0 1 1 0 0
region11 0 0 1 1 0 0;
Code: Select all
set ConflictToSolution(Regions, Conflicts, Solutions) /
region1.high.A
region2.high.A
region3.high.A
region4.high.A
...
/
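The scalar-objective change mentioned above can be sketched as follows. This is a minimal illustration, assuming the sets Regions and Time and the variable V from the attached model; the names TotalV and defTotalV are made up here:

```gams
variable TotalV 'free scalar objective';
equation defTotalV 'aggregate the regional values into one scalar';
* sum the per-region, per-period value into the scalar the solver can maximize
defTotalV.. TotalV =e= sum((Regions, Time), V(Regions, Time));
* then: solve MDP using miqcp maximizing TotalV;
```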
Re: MDP Modeling
Thanks Michael.
Great to see the model work.
However, I've tried something else on the same subject (I attempted the mapping hint but couldn't get it to work) and now have difficulty resolving some of the errors.
For one, "missing suffix" on lines 47 and 48.
Kindly assist.
 Attachments

 Conflict Model_2.gms
 (1.7 KiB) Downloaded 29 times
Re: MDP Modeling
I started looking at this and it is full of compilation errors, and it shows a fundamental lack of understanding of the GAMS language on your side. My initial advice, to invest in learning the language by starting to go through some tutorial material, still holds. Good luck.
Michael
Re: MDP Modeling
Understood.
Just pressed for time and deliverables; I will invest some time later.
In the meantime, can you provide some guidance?
Re: MDP Modeling
Hi Michael,
I have made some adjustments to the model.
Although I have one syntax error to resolve, can you look through it and see if it makes sense?
Ultimately, I want to generate the best possible solution for each region, indicated by the solution with the maximum reward.
 Attachments

 Conflict Model_4.gms
 (4.24 KiB) Downloaded 19 times
Re: MDP Modeling
can you look through and see if it makes sense
No, I can't review the model; this takes insight into the problem and time. If this is student work, check with your advisor. If this is professional work, get some professional help on such tasks.
Michael
Re: MDP Modeling
Hi Michael,
I keep getting an "Uncontrolled set entered as constant" error for the Bellman and Policy equations, which affects the Solve statement below.
I have tried different declaration types (sets, parameters, tables) and different arrangements of the data under these declarations; no luck.
I ultimately want to print optimised solutions/results for regions r1 to r11 based on the highest-reward path, and then the worst, medium, and best cases/paths based on rewards.
Can you assist?
Code: Select all
Sets
Regions /r1*r11/,
RewardMap /r1hA, r1hB, r1hC, r2mA , r2mB, r2mC, r3mA , r3mB, r3mC, r4mA , r4mB, r4mC, r5lA , r5lB, r5lC, r6lA , r6lB, r6lC, r7lA , r7lB, r7lC,r8lA , r8lB, r8lC,r9lA , r9lB, r9lC,r10lA , r10lB, r10lC,r11lA , r11lB, r11lC/,
Conflicts /low, medium, high/,
Solutions /A, B, C/;
alias (Regions, r);
alias (Regions, rr);
alias (Conflicts, c);
alias (Solutions, s);
Table Prob(r, *) "probability of each conflict in each region"
CPr
r1 0.54
r2 0.15
r3 0.10
r4 0.06
r5 0.04
r6 0.03
r7 0.03
r8 0.02
r9 0.02
r10 0.01
r11 0.01;
*Options A=330/132kV, B=33/11kV, C=415V to Solution Low=A, Medium=B or High=C
*The higher the conflict, the closer the grid solution to the point of use and vice versa
Table Reward(RewardMap, *) "reward for taking an option in a conflict in a region"
RewardV
r1hA 5
r1hB 5
r1hC 10
r2mA 3
r2mB 10
r2mC 3
r3mA 3
r3mB 10
r3mC 3
r4mA 3
r4mB 10
r4mC 3
r5lA 10
r5lB 5
r5lC 2
r6lA 10
r6lB 5
r6lC 2
r7lA 10
r7lB 5
r7lC 2
r8lA 10
r8lB 5
r8lC 2
r9lA 10
r9lB 5
r9lC 2
r10lA 10
r10lB 5
r10lC 2
r11lA 10
r11lB 5
r11lC 2;
Parameter discount "discount factor" /0.9/;
Variables
V(r) "expected cumulative discounted reward"
policy(r) "optimal policy";
Equations
Bellman(Regions) "Bellman equation";
V.up(r) = 1000;
V.lo(r) = -1000;
Bellman(Regions).. V.l(Regions) =e= sum(c, prob(Regions, 'CPr') * (Reward(RewardMap, 'RewardV') + discount * sum(r$(ord(r) <> ord(r)), prob(Regions, 'CPr') * V.l(r))));
* calculate the optimal policy for each region (smax instead of argmax)
policy.l(r) = smax(s, sum(c, prob(r, 'CPr')*(Reward(RewardMap, 'RewardV') + discount * sum(rr$(ord(rr) <> ord(r)), prob(r, 'CPr') * V.l(r)))));
Model MDP /all/;
Solve MDP using LP maximizing V;
* print the optimal policy and expected cumulative discounted reward for each region
Loop(Regions,
Display "Region ", r, " Optimal policy: ", policy.l, " Expected cumulative discounted reward: ", V.l
);
Re: MDP Modeling
If you look at your failing equation "Bellman(Regions).. V.l(Regions) =e= sum(c, prob(Regions, 'CPr') * (Reward(RewardMap, 'RewardV') ...", ask yourself what element of RewardMap GAMS should take when instantiating this equation. You don't give any hint for that. This is what GAMS tells you when it says that there is an uncontrolled set. You need to tell GAMS which of the many RewardMap elements it should take for a given Regions and c element (these are your controlled sets). You can control a set by adding it to the equation index set (like Regions) or by summing over it (like set c).
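One way to give GAMS that hint is a mapping set that ties each RewardMap element to its (region, conflict, solution) triple, and then to sum over RewardMap under that map. A sketch only, using the sets declared earlier in the thread; the name rmap and the few example entries are illustrative, and the full map would need all 33 records:

```gams
set rmap(RewardMap, Regions, Conflicts, Solutions)
    'which reward record belongs to which region, conflict and solution'
    / r1hA.r1.high.A, r1hB.r1.high.B, r1hC.r1.high.C /;

* RewardMap is now controlled: it only appears inside a sum driven by rmap
Bellman(Regions).. V(Regions) =e=
    sum((c, s, RewardMap)$rmap(RewardMap, Regions, c, s),
        prob(Regions, 'CPr') * Reward(RewardMap, 'RewardV'));
```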
Michael
Re: MDP Modeling
Thank you, Michael, for the help so far. I went back to tweak the initial code that had worked.
Please, I'm trying something else and can't understand the errors:
I keep getting "767 Unexpected symbol - will terminate the loop - symbol replaced by )" and "409 Unrecognizable item - skip to find a new statement: looking for a ';' or a key word to get started again" errors at the final display; adding parentheses generates more errors.
Please kindly assist.
Code: Select all
sets
regions /r1*r11/
actions /A, B, C/
states /low, medium, high/
states_next /low, medium, high/
time /2023*2050/;
alias (regions, r);
alias (actions, a);
alias (states, s);
alias (states_next, sn);
parameters
reward(states,actions) /low.A 5,
low.B 1,
low.C 5,
medium.A 5,
medium.B 1,
medium.C 5,
high.A 5,
high.B 1,
high.C 5/
discount_rate /0.95/
value(states, regions)
value_new(states, regions)
epsilon /0.01/
max_iter /1000/
iter;
value(states, regions)=0;
value_new(states, regions)=0;
iter=0;
* Transition probability
* Probability of transitioning from one state to another when an action is taken
* Format: (region, current state, action, next state)
* Actions A=415V, B=33/11kV, C=330/132kV
Set transition_prob(regions, states, actions, states_next); transition_prob('r1', 'high', 'A', 'low') = 0.54;
transition_prob('r1', 'high', 'B', 'low') = 0.54;
transition_prob('r1', 'high', 'C', 'low') = 0.54;
transition_prob('r2', 'medium', 'A', 'low') = 0.15;
transition_prob('r2', 'medium', 'B', 'low') = 0.15;
transition_prob('r2', 'medium', 'C', 'low') = 0.15;
transition_prob('r3', 'medium', 'A', 'low') = 0.10;
transition_prob('r3', 'medium', 'B', 'low') = 0.10;
transition_prob('r3', 'medium', 'C', 'low') = 0.10;
transition_prob('r4', 'medium', 'A', 'low') = 0.06;
transition_prob('r4', 'medium', 'B', 'low') = 0.06;
transition_prob('r4', 'medium', 'C', 'low') = 0.06;
transition_prob('r5', 'low', 'A', 'low') = 0.04;
transition_prob('r5', 'low', 'B', 'low') = 0.04;
transition_prob('r5', 'low', 'C', 'low') = 0.04;
transition_prob('r6', 'low', 'A', 'low') = 0.03;
transition_prob('r6', 'low', 'B', 'low') = 0.03;
transition_prob('r6', 'low', 'C', 'low') = 0.03;
transition_prob('r7', 'low', 'A', 'low') = 0.03;
transition_prob('r7', 'low', 'B', 'low') = 0.03;
transition_prob('r7', 'low', 'C', 'low') = 0.03;
transition_prob('r8', 'low', 'A', 'low') = 0.02;
transition_prob('r8', 'low', 'B', 'low') = 0.02;
transition_prob('r8', 'low', 'C', 'low') = 0.02;
transition_prob('r9', 'low', 'A', 'low') = 0.02;
transition_prob('r9', 'low', 'B', 'low') = 0.02;
transition_prob('r9', 'low', 'C', 'low') = 0.02;
transition_prob('r10', 'low', 'A', 'low') = 0.01;
transition_prob('r10', 'low', 'B', 'low') = 0.01;
transition_prob('r10', 'low', 'C', 'low') = 0.01;
transition_prob('r11', 'low', 'A', 'low') = 0.01;
transition_prob('r11', 'low', 'B', 'low') = 0.01;
transition_prob('r11', 'low', 'C', 'low') = 0.01;
* Value iteration to convergence
* V(s) = max_a (R(s,a) + gamma * sum_s' P(s'|s,a) * V(s'))
*while((iter = max_iter) or ((value_new - value) < epsilon),
while(iter = max_iter,
iter = iter + 1;
* value = value_new;
loop(regions,
loop(states,
loop(actions,
value_new(states, regions) = max(reward(states, actions) +
discount_rate * sum(transition_prob(regions, states, actions, states_next), value_new(states, regions)), value_new(states, regions));
);
);
);
);
* Print the optimal policy
* Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) * V(s')
display "Optimal policy for each region:", value_new;
loop(regions,
display regions;
loop(states, display states, " action: ", actions(smax(reward(states, actions) +
discount_rate * sum(transition_prob(regions, states, actions, states_next), value(states_next, regions)),
value(states_next, regions)))
);
* );
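For reference, the Bellman backup in the comment above can be written with GAMS's repeat/until and smax instead of nested loops and while. A sketch under stated assumptions: it reuses the symbols declared in the code above (value, value_new, reward, discount_rate, epsilon, max_iter, iter and the aliases r, s, a, sn), and it assumes the transition probabilities live in a parameter (called tp here) rather than in the set transition_prob:

```gams
* assumed: probabilities stored in a parameter, not a set
parameter tp(regions, states, actions, states_next) 'transition probability';
scalar delta 'largest change in one sweep' /1/;

repeat
    iter = iter + 1;
*   Bellman backup: V(s) = max_a (R(s,a) + gamma * sum_s' P(s'|s,a) * V(s'))
    value_new(s, r) = smax(a, reward(s, a)
        + discount_rate * sum(sn, tp(r, s, a, sn) * value(sn, r)));
    delta = smax((s, r), abs(value_new(s, r) - value(s, r)));
    value(s, r) = value_new(s, r);
until delta < epsilon or iter >= max_iter;
```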