Colorado GAMS

A GAMS Utility for Ranking One-Dimensional Numeric Data

Thomas F. Rutherford*

Department of Economics
University of Colorado

Paul van der Eijk

GAMS Development Corporation

July, 2003

* This research supported by the GAMS Applied General Equilibrium Research Fund. The author remains responsible for any bugs which exist in this software. This software is not officially supported by GAMS Corporation.

This GAMS libinclude routine implements an O(nlog(n)) algorithm for ranking one-dimensional numeric data within a GAMS program. The routine uses the GDX facility and an external program to sort the data. This document specifies calling syntax and provides some examples of their use. Contents:


This utility consists of an executable file, gdxrank.exe, and a GAMS libinclude file, rank.gms. The former must reside in the GAMS system directory, and the latter should be installed in the GAMS library subdirectory. (It must be in the libinclude directory if you want to have the tool as a standard component of your GAMS installation.) The libinclude file is provided along with other libinclude utilities in inclib.pck. Installation involves copying inclib.pck into your GAMS system directory and running GAMSINST.

If you want to use rank.gms in a model which may be run by someone who may not have installed the inclib.pck library, include copies of rank.gms and gdxrank.exe in your model directory and replace "$libinclude" by "$batinclude".

The file inclib.pck contains all of the libinclude programs which have been developed at the University of Colorado. These include files for data management (GAMS2TXT), output of comma-delimited tables (GAMS2CSV), Excel file input and output (XLLINK), plotting (GNUPLOT), report writing ( GAMS2TBL), use of GEMPACK data files (GAMS2HAR), and scenario management (GAMSSM).

Syntax for rank

$LIBINCLUDE rank v s r [p]

The first three arguments are required. The last is optional. These are defined as following:
v(s)Array of values to be ranked.
sA one-dimensional set, the domain of array v.
r(s)Rank order of element v(s), an integer between 1 and card(s), ranking from smallest to largest.
Optional:(input and output)
p(*) On input this vector specifies percentile levels to be computed. On output, it returns the linearly interpolated percentiles.

General Comments

(i) RANK only works for numeric data. You cannot sort sets.

(ii) The first invocation must be outside of a loop or if block. This routine may be used within a loop or if block only if it is first initialized with blank invocations ("$LIBINCLUDE rank" in a context where set and parameter declarations are permitted (See Example 3).

(iii) The following names are used within these routines and may not be used in the calling program:

        rank_tmp        rank_u          rank_p

(iv) This routine returns rank values and does not return sorted vectors, however rank values are easily used to produce a sorted array. This can be done using computed "leads" and "lags" in GAMS' ordered set syntax, as illustrated in examples 1 and 3 below.

Example 1: Rank a vector, and display the data in sorted order.

set     i       Set on which random data are defined /a,b,d,c,e,f /,
        k       Ordered set for displaying sorted data/1*6/;

parameter       x(i)    Random data to be sorted,
                r(i)    Rank values,
                s(k,i)  Sorted data;

x(i) = uniform(0,1);

$libinclude rank x i r
display x;

*       Generate a sorted list using the ordered set k.

*       This assignment statement illustrates how the rank orders
*       can be used to sort output for display in proper order.  This
*	statement uses GAMS support for computed "leads" and "lags"
*	on the ordered set k:

s(k+(r(i)-ord(k)),i) = x(i);

option s:3:0:1;
display s;

Example1 writes the following lines to ex1.lst:

----     11 PARAMETER x  Random data to be sorted

a 0.172,    b 0.843,    d 0.550,    c 0.301,    e 0.292,    f 0.224

----     75 PARAMETER s  Sorted data

1.a 0.172
2.f 0.224
3.e 0.292
4.c 0.301
5.d 0.550
6.b 0.843

Example 2: Generate percentiles for a random vector.

set     i       Set on which random data are defined /a,b,d,c,e /,
        p       Percentiles (all of them) /0*100/;

parameter       x(i)    Random data to be sorted;

*       Generate the random data on set i:

x(i) = uniform(0,1);
display x;

parameter       r(i)    Rank values,
                pct(*)  Percentiles to be computed /20 20.0, median 50.0, 75 75.0/;

*       Generate ranks and compute the specified percedntiles (Note that 
*       the rank array, r, is required, even if the values are not used.)

$libinclude rank x i r pct

*       Display three percentiles:

display pct;

The random data are displayed as follows in the listing file:

----     11 PARAMETER x  Random data to be sorted

a 0.172,    b 0.843,    d 0.550,    c 0.301,    e 0.292

The interpolated percentiles are computed as follows:

----    103 PARAMETER pct  Percentiles

20     0.268,    75     0.550,    median 0.301

The following code evaluates a full set of percentiles, from 1 to 100. The GAMS special value of EPS is used to represent zero in the percentile calculation. (Percentiles between zero and one are not permitted to avoid misunderstandings about how percentiles are scaled.)

pct(p) = (ord(p) - 1) + eps;
pct("median") = 0;
display pct;

$libinclude rank x i r pct
display pct;

*       Plot the results using GNUPLOT:

set pl(p) /20,40,60,80,100/;
$setglobal domain p
$setglobal labels pl
$libinclude plot pct

Example 3: Use GDXRANK to report multisectoral Monte Carlo results.

One of the most perplexing challenges in economic modeling with GAMS is to present multisectoral results in an easily interpreted format. One simple idea is to present sectoral results in a sorted sequence to make it easier to identify the most seriously affectd sectors. The presentation of results in a multisectoral model is made even more challenging when model results are generated for a randomized set of scenarios. A summary of Monte Carlo results involves reporting both mean results and their sensitivity. One means of characterizing the sensitivity of model results is to report functions of the sample distribution such as the upper and lower quartiles.

In this example, I illustrate how gdxrank can be used to help report results from the Monte Carlo analysis of a multisectoral model.

SET     run     Samples (potentially not all solved) /1*1000/, 
        i       Sectors for which results are to be compared  /

parameter	v(run,i) Monte-Carlo results for all sectors;

*       Load the sectoral results from the sensitivity analysis from
*	a GDX file:

$gdxin 'ex3.gdx'
$load v

set     s(run)		Solved cases,
        k		Count used for quartiles /1*1000/,
        ki		Count used for sectors /1*100/
        qtl		Quartiles (set) /q25,q50,q75/,
	imap(ki,i)	Mapping from ordered list to sector labels;

parameter       qvalue(i,*)     Quartile values
                mean(i)         Mean impact on sector i, 
                meanrank(i)     Mean rank of sector i,
                x(run)          Vector used for sorting,
                r(run)          Rank values returned,
                qv(qtl)         Quartiles (values) /q25 25, q50 50, q75 75/
                quartile(qtl)   Quartiles (evaluated);

*       Identify which scenarios have been solved (some of the runs
*       may have failed):

s(run) = yes$(smax(i,abs(v(run,i))) ne 0);

*       Evaluate means and support for results:

qvalue(i,"mean") = sum(s, v(s,i)) / card(s);
qvalue(i,"min") = smin(s, v(s,i));
qvalue(i,"max") = smax(s, v(s,i));
display qvalue;

*       Determine ranking of sectors by mean impact:

mean(i) = qvalue(i,"mean");
$libinclude rank mean i meanrank 

*       The following statement creates a tuple matching the ordered
*       set, ki, to the set of sectors, i.  In this tuple, the sequence of
*       assignments corresponds to increasing mean impacts:

imap(ki+(meanrank(i)-ord(ki)),i) = yes;
*       Evaluate quartiles of sectoral impacts for each sector:

  x(s) = v(s,i);

*       Load quartile with the perctiles to be
*       evaluated (25th, 50th and 75th):

  quartile(qtl) = qv(qtl);

$libinclude rank x s r quartile

*       Save the quartile values:

  qvalue(i,qtl) = quartile(qtl);
display qvalue;

parameter       results(ki,i,*)         Summary of impacts (sorted);
results(ki,i,"mean")$imap(ki,i) = mean(i);
results(ki,i,qtl)$imap(ki,i) = qvalue(i,qtl);
display results;

The program produces the following display output:

               q25         q50         q75        mean

1 .FOO     -13.767     -12.110     -10.671     -12.259
2 .MWO     -13.496     -11.982     -10.512     -12.035
3 .SCS     -12.244     -10.418      -8.738     -10.588
4 .CLO      -8.763      -7.407      -6.158      -7.480
5 .ADM      -7.865      -5.601      -3.470      -5.812
6 .CNM      -6.437      -5.320      -4.404      -5.461
7 .OTH      -5.732      -4.781      -4.012      -4.892
8 .PIP      -3.142      -1.930      -0.836      -2.145
9 .OIN      -2.124      -1.339      -0.563      -1.358
10.TPP      -2.527      -0.988       0.521      -1.041
11.GEO      -1.542      -0.826      -0.196      -0.883
12.AGF      -1.178      -0.624      -0.072      -0.625
13.ECM       0.137       0.466       0.796       0.441
14.SSM       0.303       0.711       1.079       0.684
15.CON       0.604       1.071       1.479       1.041
16.PST       0.814       1.924       2.966       1.858
17.RLW       1.885       2.787       3.632       2.738
18.OFU       2.789       3.300       3.829       3.330
19.OLE       3.155       3.553       3.964       3.554
20.ELE       3.480       3.896       4.318       3.892
21.PSM       3.733       4.134       4.591       4.168
22.CAT       3.216       4.417       5.461       4.369
23.TRO       3.953       5.123       6.549       5.282
24.OLP       4.614       5.262       5.909       5.295
25.TRD       5.694       6.454       7.250       6.476
26.AIR       5.227       7.182       9.276       7.320
27.FIN       4.919       7.301      10.175       7.640
28.COA       6.581       7.818       9.125       7.878
29.TMS       5.668       8.530      11.086       8.591
30.CHM       8.071       9.299      10.534       9.319
31.TRK       7.031       9.367      11.851       9.577
32.MAR       9.485      13.072      16.424      13.049
33.GAS       4.437      13.183      22.311      13.462
34.FME      14.794      16.617      18.382      16.642
35.NFM      23.299      26.398      29.281      26.263

The tabular report is helpful, but it does not convey the results as immediately as a picture. GNUPLOT's errorbar plot format is a convenient graphical format for portraying this information. The libinclude interface to GNUPLOT does not support this type of plot, so the continuation of the program produces the GNUPLOT command and data files before invoking the GNUPLOT program:

*       Write out a GNUPLOT file to generate a chart of the results:

file kplt /ex3.gnu/; put kplt; kplt.lw=0;
put "reset"/;
put 'set title "Sectoral Impacts with Quartiles"'/;
put "set linestyle 1 lt 8 lw 1 pt 8 ps 0.5"/;
put "set grid"/;
put 'set ylabel "% change"'/;
put "set xzeroaxis"/;
put "set bmargin 4"/;
put "set xlabel 'sector'"/;
put 'set xrange [1:',card(i),']'/;
put 'set xtics rotate (';
loop(ki, loop(i$imap(ki,i),     put '\'/' "',,'" ',ord(ki):0,',';)); 
put @( ')'/;
put "plot 'ex3.dat' notitle with errorbars ls 1"/;
file kpltdata /ex3.dat/; put kpltdata;; kpltdata.nw=14; kpltdata.nd=6;
loop(ki, loop(i$imap(ki,i),  put ord(ki):0,qvalue(i,"mean"), qvalue(i,"q25"), qvalue(i,"q75")/;));
execute 'wgnupl32 ex3.gnu -';

Economics Department, University of Colorado, Boulder CO 80309-0256
Phone: (303) 492-5169, Fax: (303) 492-8969
Last modified July, 2003 by TFR