Parallel versus sequential assignments - or beware of "loop"-statements

Frequently asked questions about GAMS

Moderator: aileen

Forum rules
Please ask questions in the other sub-forums
Locked
aileen
User
User
Posts: 136
Joined: 3 years ago

Parallel versus sequential assignments - or beware of "loop"-statements

Post by aileen »

Should I use indexed assignments or loop statements in GAMS?
aileen
User
User
Posts: 136
Joined: 3 years ago

Re: Parallel versus sequential assignments - or beware of "loop"-statements

Post by aileen »

Indexed assignments within GAMS are done using simultaneous or parallel assignments. In most cases you don’t need explicit loops, an assignment is an implicit loop. The loop statement is necessary when parallel assignments are not sufficient - an example is provided below. However, a loop statement is not executed in parallel, but sequentially. This may slow down the execution of your code tremendously. Below are some examples:

A simple example

Code: Select all

set i / i1*i100000 /;
parameter u(i);
* bad!
loop(i, u(i) = uniform(0,2));
* good
u(i) = uniform(0,2);
If you run this fragment with the profile option set to one, you'll get something like:

Code: Select all

----      4 Loop                     7.235     7.235 SECS     14 MB 
----      6 Assignment u             0.000     7.235 SECS     14 MB  100000
The execution of the loop statement was significantly slower than the parallel assignment! Thus parallel assignments should be used wherever possible.

Note: That "parallel" assignment that use the same symbol on the left and on the right creates a copy of the symbol for the use on the right. Hence the updated value on the left is not used when referencing on the right as shown in the next example. If this is required a loop statement must be used. Here is an example, which illustrates this:

Code: Select all

set j      /j1*j4/
parameters a(j);
a('j1')=1;
* wrong!
a(j)$(ord(j)>1)=a(j-1)+1;
display 'Wrong:',a;
* correct!
loop(j$(ord(j)>1), a(j)=a(j-1)+1);
display 'Correct:',a;

Code: Select all

----      6 Wrong:
----      6 PARAMETER a  
j1 1.000,    j2 2.000,    j3 1.000,    j4 1.000

----      9 Correct:
----      9 PARAMETER a  
j1 1.000,    j2 2.000,    j3 3.000,    j4 4.000
Another example
Let's define three sets, stores, products, and weeks. How does the following iteration work exactly?

Code: Select all

found =0;
loop((stores, products, weeks) $ (condition1(weeks)
                              and condition2(stores)
                              and condition3(products)), found =1);
Assuming that we have 6000 stores, 20 products, and 100 weeks. And only one week, week1, satisfies condition1; only one store, store1, satisfies condition2; only one product, product1, satisfies condition3. Will the iteration go through ALL the stores, products, and weeks even though only one element satisfies all the conditions. Given the conditional control, will the dimensions of the sets have a big impact on run time?

GAMS tries to be efficient when it comes to restricting the domains it needs to search. There is some optimization in place, but with a general construct as you have it there, GAMS will go through all the tuples of the loop. You have a better chance when you do:

Code: Select all

set w(weeks), s(stores), p(products);
w(weeks)    = condition1(weeks);
s(stores)   = condition2(stores);
p(products) = condition3(products);
found = card(w) and card(s) and card(p) // or card(w)*card(s)*card(p)
Even if GAMS can't be smart about the individual conditions, it only walks them once card(weeks)+card(stores)+card(products) in the original loop you probably get * instead of +). If the conditions are simple and can be optimized by the GAMS compiler you get constant time. It is a little like code optimization in other language compilers (C, FORTRAN, etc).

A third example
Here we have an example, where we have to match elements from two different sets. We show different formulation, which are much faster than using a loop statement.

Code: Select all

set k /k1*k90000/, t /t1*t10000/;

parameter a(k), b(t), bref(t);
a(k)=normal(0,1);

* expansive loop
loop(k, b(t)$(ord(t) eq ord(k)-card(t))=a(k));
bref(t) = b(t);
option clear=b;

* better, but not yet optimal
b(t) = sum(k$(ord(k)=ord(t)), a(k+card(t)));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
option clear=b;

* much faster using element matching introduced in distribution 227
set sk(k) 'shifted k', kt(k,t);
sk(k+card(t)) =  yes;
option kt(sk:t);
b(t)=sum(k$kt(k,t),a(k));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
option clear=b, clear=kt;

* this formulation is more compact
option kt(k:t);
b(t)=sum(kt(k,t), a(k+card(t)));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
Running the code using the GAMS profiler (profile 1) yields:

Code: Select all

----      7 Loop                    29.218    29.234 SECS     14 MB 
----     12 Assignment b            20.188    49.422 SECS     16 MB  10000
----     18 Assignment sk            0.015    49.437 SECS     17 MB  80000
----     20 Assignment b             0.000    49.437 SECS     21 MB  10000
----     26 Assignment b             0.125    49.562 SECS     21 MB  10000
A fourth example
In this example, a large matrix with a small band of non-zeros off the diagonal is copied from the name space j,jj to a new parallel matching name space x,xx where 'j' and 'x' match perfectly. The band matrix (PDmatrix_jj) is created and the copied to the same matrix with different labels (PDmatrix_xx). This is done once with the combination of two j:x mappings (jx and jxx) in a loop statement and once with a sequence of parallel assignment utilizing a temporary 4-dimensional map symbol 'j_jj_x_xx':

Code: Select all

set x / x1*x50000 /;
singleton set xone(x) / x1 /;
$eval xCard card(x)
set j /1*%xCard% /;
set jx(j,x) / #j:#x /;
alias (x,xx,xxx), (j,jj,jjj), (jx,jxx);

parameter
	LTmatrix_jj(j,jj)
	PDmatrix_jj(j,jj)
    
scalar LTwidth / 3 /;
$eval LTWIDTH LTwidth
set jsub(j) / 1*%LTWIDTH% /;

LTmatrix_jj(j,jj+(ord(j)-LTwidth))$[jsub(jj)] = normal(0,1);
PDmatrix_jj(j,jj+(ord(j)-LTwidth))$[jsub(jj)] = sum(jjj, LTmatrix_jj(j,jjj)*LTmatrix_jj(jj+(ord(j)-LTwidth),jjj));

Parameter
    PDmatrix_xx1(x,xx)
    PDmatrix_xx2(x,xx);

* Slow loop
loop((jx(j,x),jxx(jj,xx))$PDmatrix_jj(j,jj), PDmatrix_xx1(x,xx) = PDmatrix_jj(j,jj));

* Fast sequence of 
set j_jj_x_xx(j,jj,x,xx), x_xx(x,xx);
j_jj_x_xx(j,jj,x+(ord(j)-1),x+(ord(jj)-1))$xone(x) = PDmatrix_jj(j,jj);
option x_xx<j_jj_x_xx;
PDmatrix_xx2(x_xx(x,xx)) = SUM(j_jj_x_xx(j,jj,x,xx), PDmatrix_jj(j,jj));

abort$(smax(x_xx, abs(PDmatrix_xx1(x_xx)-PDmatrix_xx2(x_xx)))>1e-6) 'bad calculation';
The relevant part of the GAMS profiler report yields:

Code: Select all

----     24 Loop                    14.625    18.657 SECS     32 MB 
----     28 Assignment j_jj_x_xx     3.250    21.907 SECS     45 MB  149997
----     29 Other                    0.000    21.907 SECS     49 MB 
----     30 Assignment PDmatrix_xx2  0.093    22.000 SECS     69 MB  149997
So the loop is more than 4 times slower than the sequence of parallel assignment statements.
Locked