## Parallel versus sequential assignments - or beware of "loop"-statements

**Moderator:** aileen

**Forum rules**

Please ask questions in the other sub-forums

### Parallel versus sequential assignments - or beware of "loop"-statements

Should I use indexed assignments or loop statements in GAMS?

### Re: Parallel versus sequential assignments - or beware of "loop"-statements

Indexed assignments within GAMS are done using simultaneous or parallel assignments. In most cases you don’t need explicit loops, an assignment is an implicit loop. The loop statement is necessary when parallel assignments are not sufficient - an example is provided below. However, a loop statement is not executed in parallel, but sequentially. This may slow down the execution of your code tremendously. Below are some examples:

If you run this fragment with the profile option set to one, you'll get something like:

The execution of the loop statement was significantly slower than the parallel assignment! Thus parallel assignments should be used wherever possible.

Note: That "parallel" assignment that use the same symbol on the left and on the right creates a copy of the symbol for the use on the right. Hence the updated value on the left is not used when referencing on the right as shown in the next example. If this is required a loop statement must be used. Here is an example, which illustrates this:

Let's define three sets, stores, products, and weeks. How does the following iteration work exactly?

Assuming that we have 6000 stores, 20 products, and 100 weeks. And only one week,

GAMS tries to be efficient when it comes to restricting the domains it needs to search. There is some optimization in place, but with a general construct as you have it there, GAMS will go through all the tuples of the loop. You have a better chance when you do:

Even if GAMS can't be smart about the individual conditions, it only walks them once

Here we have an example, where we have to match elements from two different sets. We show different formulation, which are much faster than using a loop statement.

Running the code using the GAMS profiler (

In this example, a large matrix with a small band of non-zeros off the diagonal is copied from the name space

The relevant part of the GAMS profiler report yields:

So the loop is more than 4 times slower than the sequence of parallel assignment statements.

**A simple example**Code: Select all

```
set i / i1*i100000 /;
parameter u(i);
* bad!
loop(i, u(i) = uniform(0,2));
* good
u(i) = uniform(0,2);
```

Code: Select all

```
---- 4 Loop 7.235 7.235 SECS 14 MB
---- 6 Assignment u 0.000 7.235 SECS 14 MB 100000
```

Note: That "parallel" assignment that use the same symbol on the left and on the right creates a copy of the symbol for the use on the right. Hence the updated value on the left is not used when referencing on the right as shown in the next example. If this is required a loop statement must be used. Here is an example, which illustrates this:

Code: Select all

```
set j /j1*j4/
parameters a(j);
a('j1')=1;
* wrong!
a(j)$(ord(j)>1)=a(j-1)+1;
display 'Wrong:',a;
* correct!
loop(j$(ord(j)>1), a(j)=a(j-1)+1);
display 'Correct:',a;
```

Code: Select all

```
---- 6 Wrong:
---- 6 PARAMETER a
j1 1.000, j2 2.000, j3 1.000, j4 1.000
---- 9 Correct:
---- 9 PARAMETER a
j1 1.000, j2 2.000, j3 3.000, j4 4.000
```

**Another example**Let's define three sets, stores, products, and weeks. How does the following iteration work exactly?

Code: Select all

```
found =0;
loop((stores, products, weeks) $ (condition1(weeks)
and condition2(stores)
and condition3(products)), found =1);
```

*week1*, satisfies*condition1*; only one store,*store1*, satisfies*condition2*; only one product,*product1*, satisfies*condition3*. Will the iteration go through ALL the stores, products, and weeks even though only one element satisfies all the conditions. Given the conditional control, will the dimensions of the sets have a big impact on run time?GAMS tries to be efficient when it comes to restricting the domains it needs to search. There is some optimization in place, but with a general construct as you have it there, GAMS will go through all the tuples of the loop. You have a better chance when you do:

Code: Select all

```
set w(weeks), s(stores), p(products);
w(weeks) = condition1(weeks);
s(stores) = condition2(stores);
p(products) = condition3(products);
found = card(w) and card(s) and card(p) // or card(w)*card(s)*card(p)
```

*card(weeks)+card(stores)+card(products)*in the original loop you probably get***instead of*+*). If the conditions are simple and can be optimized by the GAMS compiler you get constant time. It is a little like code optimization in other language compilers (C, FORTRAN, etc).**A third example**Here we have an example, where we have to match elements from two different sets. We show different formulation, which are much faster than using a loop statement.

Code: Select all

```
set k /k1*k90000/, t /t1*t10000/;
parameter a(k), b(t), bref(t);
a(k)=normal(0,1);
* expansive loop
loop(k, b(t)$(ord(t) eq ord(k)-card(t))=a(k));
bref(t) = b(t);
option clear=b;
* better, but not yet optimal
b(t) = sum(k$(ord(k)=ord(t)), a(k+card(t)));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
option clear=b;
* much faster using element matching introduced in distribution 227
set sk(k) 'shifted k', kt(k,t);
sk(k+card(t)) = yes;
option kt(sk:t);
b(t)=sum(k$kt(k,t),a(k));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
option clear=b, clear=kt;
* this formulation is more compact
option kt(k:t);
b(t)=sum(kt(k,t), a(k+card(t)));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
```

*profile 1*) yields:Code: Select all

```
---- 7 Loop 29.218 29.234 SECS 14 MB
---- 12 Assignment b 20.188 49.422 SECS 16 MB 10000
---- 18 Assignment sk 0.015 49.437 SECS 17 MB 80000
---- 20 Assignment b 0.000 49.437 SECS 21 MB 10000
---- 26 Assignment b 0.125 49.562 SECS 21 MB 10000
```

**A fourth example**In this example, a large matrix with a small band of non-zeros off the diagonal is copied from the name space

*j,jj*to a new parallel matching name space*x,xx*where*'j'*and*'x'*match perfectly. The band matrix (*PDmatrix_jj*) is created and the copied to the same matrix with different labels (*PDmatrix_xx*). This is done once with the combination of two*j:x*mappings (*jx*and*jxx*) in a loop statement and once with a sequence of parallel assignment utilizing a temporary 4-dimensional map symbol*'j_jj_x_xx'*:Code: Select all

```
set x / x1*x50000 /;
singleton set xone(x) / x1 /;
$eval xCard card(x)
set j /1*%xCard% /;
set jx(j,x) / #j:#x /;
alias (x,xx,xxx), (j,jj,jjj), (jx,jxx);
parameter
LTmatrix_jj(j,jj)
PDmatrix_jj(j,jj)
scalar LTwidth / 3 /;
$eval LTWIDTH LTwidth
set jsub(j) / 1*%LTWIDTH% /;
LTmatrix_jj(j,jj+(ord(j)-LTwidth))$[jsub(jj)] = normal(0,1);
PDmatrix_jj(j,jj+(ord(j)-LTwidth))$[jsub(jj)] = sum(jjj, LTmatrix_jj(j,jjj)*LTmatrix_jj(jj+(ord(j)-LTwidth),jjj));
Parameter
PDmatrix_xx1(x,xx)
PDmatrix_xx2(x,xx);
* Slow loop
loop((jx(j,x),jxx(jj,xx))$PDmatrix_jj(j,jj), PDmatrix_xx1(x,xx) = PDmatrix_jj(j,jj));
* Fast sequence of
set j_jj_x_xx(j,jj,x,xx), x_xx(x,xx);
j_jj_x_xx(j,jj,x+(ord(j)-1),x+(ord(jj)-1))$xone(x) = PDmatrix_jj(j,jj);
option x_xx<j_jj_x_xx;
PDmatrix_xx2(x_xx(x,xx)) = SUM(j_jj_x_xx(j,jj,x,xx), PDmatrix_jj(j,jj));
abort$(smax(x_xx, abs(PDmatrix_xx1(x_xx)-PDmatrix_xx2(x_xx)))>1e-6) 'bad calculation';
```

Code: Select all

```
---- 24 Loop 14.625 18.657 SECS 32 MB
---- 28 Assignment j_jj_x_xx 3.250 21.907 SECS 45 MB 149997
---- 29 Other 0.000 21.907 SECS 49 MB
---- 30 Assignment PDmatrix_xx2 0.093 22.000 SECS 69 MB 149997
```