GTAPinGAMS and GTAP-EG: Global Datasets for Economic Research and Illustrative Models - Chapter 2

2  The GTAP Datasets

All GTAP datasets are defined in terms of three primary sets: r, the set of countries and regions; i, the set of sectors and produced commodities; and f, the set of primary factors. The GTAPinGAMS dataset, like the original GTAP4 dataset, has 45 regions, 50 goods (footnote 5), and 5 primary factors. The GTAP-EG dataset has 45 regions, 23 goods (5 of which are energy goods), and 5 primary factors. The smaller number of goods is determined by the structure of the energy statistics collected at the OECD by Complainville [1998]. Identifiers for regions, sectors, and primary factors, as defined in GTAPinGAMS and GTAP-EG, are presented in Appendix 1.

An important feature of the GTAPinGAMS package is that datasets may be freely aggregated into fewer regions, fewer sectors, and even fewer primary factors. This feature permits a modeller to do preliminary model development using a small dataset to ensure rapid response and a short debug cycle. Once a small model has been implemented, it is a simple matter to expand the number of sectors and/or regions in order to obtain a more precise empirical estimate.
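
The idea of aggregating value data can be sketched in a few lines of Python. This is only an illustration of summing value flows under a mapping, not the GTAPinGAMS aggregation routine (which is described in Section 4). The sector codes, region codes, mappings, and numbers below are all hypothetical.

```python
from collections import defaultdict

def aggregate_values(values, sector_map, region_map):
    """Sum value flows keyed by (sector, region) into coarser categories."""
    out = defaultdict(float)
    for (i, r), v in values.items():
        out[(sector_map[i], region_map[r])] += v
    return dict(out)

# Hypothetical output values for two sectors in two regions.
vom = {
    ("wht", "USA"): 2.0,
    ("gro", "USA"): 3.0,
    ("wht", "CAN"): 1.0,
    ("gro", "CAN"): 0.5,
}

# Hypothetical mappings: both sectors go to AGR, both regions go to NAM.
sector_map = {"wht": "AGR", "gro": "AGR"}
region_map = {"USA": "NAM", "CAN": "NAM"}

vom_agg = aggregate_values(vom, sector_map, region_map)
print(vom_agg)  # {('AGR', 'NAM'): 6.5}
```

Only value data can be summed this way. Tax rates would have to be rebuilt from aggregated revenues and aggregated tax bases rather than added up.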

2.1  The GTAPinGAMS Dataset

The GTAP data describe economic transactions in 1995. All parameters in GTAP are expressed in terms of values (i.e. price times quantity). Units of account in the original GEMPACK representation of GTAP are millions of 1995$. The units in GTAPinGAMS (and GTAP-EG) differ by a factor of 10,000: GTAPinGAMS measures transactions in tens of billions of 1995$. Scaling units in this way ensures better numerical precision in equilibrium calculations.
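
As a small worked example of the rescaling, the following Python sketch converts a value between the two units of account using the factor of 10,000 stated above. The function names and the sample flow are hypothetical.

```python
# One GTAPinGAMS unit is ten billion 1995$, i.e. 10,000 million 1995$.
MILLIONS_PER_GTAPINGAMS_UNIT = 10_000

def to_gtapingams_units(value_in_millions):
    """Millions of 1995$ (GEMPACK) -> tens of billions of 1995$ (GTAPinGAMS)."""
    return value_in_millions / MILLIONS_PER_GTAPINGAMS_UNIT

def to_millions(value_in_gtapingams_units):
    """Tens of billions of 1995$ (GTAPinGAMS) -> millions of 1995$ (GEMPACK)."""
    return value_in_gtapingams_units * MILLIONS_PER_GTAPINGAMS_UNIT

# Hypothetical flow of 250 billion 1995$, i.e. 250,000 million 1995$.
flow_millions = 250_000.0
flow_scaled = to_gtapingams_units(flow_millions)
print(flow_scaled)  # 25.0
```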

The GAMS statements which declare all parameters in a GTAP dataset are shown in Table 1. The parameters beginning with v are base year (1995) value data, most of which are from the original GEMPACK implementation of GTAP. Not all value data from the original dataset are included here. The principal difference is that the GTAPinGAMS dataset stores tax rates rather than gross and net of tax transaction values. The tax parameters, beginning with t, are not in the original GEMPACK dataset.

Fig. 1. GTAP flows explicitly represented in the dataset.

Figure 1 presents the general GTAP database flows, which are explicitly represented in the dataset. Whenever the GTAPinGAMS dataset is read, additional intermediate parameter values are assigned. Additional parameters are calculated based on the general flows. Declarations for the computed parameters are presented in Table 2.

Table 1: Parameters Explicitly Represented in a GTAPinGAMS Dataset
alias (i,j), (r,s);
parameter
        ty(i,r)         Output tax
        ti(j,i,r)       Intermediate input tax
        tf(f,i,r)       Factor tax
        tx(i,s,r)       Export tax rate (defined on a net basis)
        tm(i,s,r)       Import tariff rate
        tg(i,r)         Tax rates on government demand
        tp(i,r)         Tax rate on private demand
        vafm(j,i,r)     Aggregate intermediate inputs
        vfm(f,i,r)      Value of factor inputs (net of tax)
        vxmd(i,r,s)     Value of commodity trade (fob - net export tax)
        vtwr(i,r,s)     Transport services
        vst(i,r)        Value of international transport sales
        vdgm(i,r)       Government demand (domestic)
        vigm(i,r)       Government demand (imported)
        vdpm(i,r)       Aggregate private demands (domestic)
        vipm(i,r)       Aggregate private demands (imported);

Table 2: Computed Benchmark Parameters
parameter
        vim(i,r)        Total value of imports (gross tariff)
        vxm(i,r)        Value of export (gross excise tax)
        vdm(i,r)        Value of domestic output (net excise tax)
        vdfm(i,r)       Aggregate intermediate demand (domestic)
        vifm(i,r)       Aggregate intermediate demand (imported)
        vom(i,r)        Aggregate output value (gross of tax)
        vgm(i,r)        Public expenditures
        vpm(i,r)        Private expenditures
        vg(r)           Total value of public expenditure
        vp(r)           Total value of private expenditure
        vi(r)           Total value of investment
        vt              Value of international trade margins
        vb(*)           Net capital inflows
        market(*,*)     Consistency check for calibrated benchmark
        evoa(f,r)       Value of factor income
        va(d,i,r)       Armington supply
        vd(d,i,r)       Domestic supply
        vm(d,i,r)       Imported supply;

Table 3 describes the GAMS parameter assignment statements for the computed items. Briefly, this is done as follows: (i) aggregate exports at market prices (vxm) are defined from the matrix of bilateral trade flows plus international transport sales; (ii) aggregate imports at market prices (vim) are defined by bilateral exports, export taxes, transportation margins and tariff rates; (iii) domestic output (vdm) is determined as a residual through the zero profit condition; (iv) domestic supply to intermediate demand (vdfm) is defined as a residual given domestic production and other demands for domestic output; (v) import supply to intermediate demand (vifm) is also defined as a residual given aggregate imports and private and public import demand. This sequence of assignments implies that any imbalance in the dataset shows up either as a discrepancy between the demand and supply for intermediate inputs or as an imbalance between the demand and supply of transportation services. The parameter market is created to generate a report on the consistency of the benchmark data. Primary factor markets always balance because endowments are computed residually given benchmark factor demands across sectors. Likewise, regional current account balances are computed from the income-expenditure identity.

In the GTAP models we use the Armington [1969] assumption that goods produced in different regions are qualitatively distinct. The GTAPinGAMS model uses the computed parameters va, vm, and vd, which are defined over the market segment (intermediate, public, or private) represented by the set d.

Table 3: Assignments for Computed Benchmark Parameters
vxm(i,r) = sum(s, vxmd(i,r,s)) + vst(i,r);
vim(i,r) = sum(s,(vxmd(i,s,r)*(1+tx(i,s,r))+vtwr(i,s,r))*(1+tm(i,s,r)));
vdm(i,r) = ( sum(j, vafm(j,i,r)*(1+ti(j,i,r)))
           + sum(f,  vfm(f,i,r)*(1+tf(f,i,r)))) / (1-ty(i,r)) - vxm(i,r);
vdfm(i,r) = vdm(i,r)  - vdgm(i,r) - vdpm(i,r) - vdm(i,r)$cgd(i);
vi(r) = sum(cgd, vdm(cgd,r));
vifm(i,r) = vim(i,r) - vipm(i,r) - vigm(i,r);
vom(i,r) = vdm(i,r) + vxm(i,r);
vgm(i,r) = vigm(i,r)+vdgm(i,r);
vpm(i,r) = vipm(i,r)+vdpm(i,r);
vg(r) = sum(i, vgm(i,r) * (1 + tg(i,r)));
vp(r) = sum(i, vpm(i,r) * (1 + tp(i,r)));
vt = sum((i,r), vst(i,r));
evoa(f,r) = sum(i, vfm(f,i,r));
vb(r) = vp(r) + vg(r) + vdm("cgd",r)
        - sum(f, evoa(f,r))
        - sum(i,     ty(i,r)   * vom(i,r))
        - sum((i,j), ti(j,i,r) * vafm(j,i,r))
        - sum((i,f), tf(f,i,r) * vfm(f,i,r))
        - sum((i,s), tx(i,r,s)  * vxmd(i,r,s))
        - sum((i,s), tm(i,s,r)  * (vxmd(i,s,r)*(1+tx(i,s,r)) + vtwr(i,s,r)) )
        - sum(i,     tg(i,r)*vgm(i,r))
        - sum(i,     tp(i,r)*vpm(i,r));

vm("c",i,r) = vipm(i,r);        vd("c",i,r) = vdpm(i,r);
vm("g",i,r) = vigm(i,r);        vd("g",i,r) = vdgm(i,r);
vm("i",i,r) = vifm(i,r);        vd("i",i,r) = vdfm(i,r);
va(d,i,r) = vm(d,i,r) + vd(d,i,r);
market(r,i) = vdfm(i,r) + vifm(i,r) - sum(j, vafm(i,j,r));
market("world","t") = vt - sum((i,r,s), vtwr(i,r,s));
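
To make the trade-related assignments concrete, here is a minimal Python sketch that evaluates the formulas for vxm, vim, and the transport-market check on a toy benchmark with one good and two regions. The region labels and all numbers are hypothetical; only the formulas follow the assignments above.

```python
# Toy benchmark with one good and two regions; all numbers are hypothetical.
regions = ["A", "B"]

# Bilateral data are keyed (exporter, importer), mirroring vxmd(i,r,s).
vxmd = {("A", "B"): 8.0, ("B", "A"): 4.0}    # trade, fob, net of export tax
vtwr = {("A", "B"): 2.0, ("B", "A"): 1.0}    # transport services per route
tx = {("A", "B"): 0.25, ("B", "A"): 0.0}     # export tax rates
tm = {("A", "B"): 0.5, ("B", "A"): 0.25}     # import tariff rates
vst = {"A": 3.0, "B": 0.0}                   # international transport sales

# vxm(r) = sum over importers s of vxmd(r,s), plus vst(r)
vxm = {r: sum(v for (e, _), v in vxmd.items() if e == r) + vst[r]
       for r in regions}

# vim(r) = sum over exporters s of (vxmd(s,r)*(1+tx(s,r)) + vtwr(s,r))*(1+tm(s,r))
vim = {r: sum((vxmd[k] * (1 + tx[k]) + vtwr[k]) * (1 + tm[k])
              for k in vxmd if k[1] == r)
       for r in regions}

# market("world","t") = vt - total transport services used on all routes
vt = sum(vst.values())
transport_balance = vt - sum(vtwr.values())

print(vxm)                # {'A': 11.0, 'B': 4.0}
print(vim)                # {'A': 6.25, 'B': 18.0}
print(transport_balance)  # 0.0
```

A nonzero transport_balance in such a check would correspond to the imbalance between demand and supply of transportation services that the parameter market is designed to report.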

Table 4 lists declarations and assignments of reference prices for each of the benchmark transactions which are subject to tax. These parameters are used in the GAMS model as part of the calibration of demand functions. For more discussion about the GAMS implementation of the static model, see Section 3.

Table 4: Benchmark Prices
parameter
        pc0(i,r)        Reference price index for private consumption
        pf0(f,i,r)      Reference price index for factor inputs
        pg0(i,r)        Reference price index for public consumption
        pi0(j,i,r)      Reference price index for intermediate inputs
        pt0(i,s,r)      Reference price index for transport
        px0(i,s,r)      Reference price index for imports;
px0(i,s,r) = (1+tx(i,s,r))*(1+tm(i,s,r));
pt0(i,s,r) = 1+tm(i,s,r);
pc0(i,r)   = 1+tp(i,r);
pg0(i,r)   = 1+tg(i,r);
pi0(j,i,r) = 1+ti(j,i,r);
pf0(f,i,r) = 1+tf(f,i,r);
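
One way to read the trade reference prices is that they reproduce the tariff-inclusive import value when applied to the net-of-tax flows: on a single route, vxmd*px0 + vtwr*pt0 equals the corresponding term in the definition of vim. The Python sketch below checks this identity for one good on one route. The numbers are hypothetical.

```python
# Hypothetical values for one good on one route (exporter s, importer r).
vxmd = 8.0   # trade flow, fob, net of export tax
vtwr = 2.0   # transport services on the route
tx = 0.25    # export tax rate
tm = 0.5     # import tariff rate

# Reference prices as assigned in Table 4.
px0 = (1 + tx) * (1 + tm)
pt0 = 1 + tm

# Import value for this route, written as in the definition of vim.
vim_direct = (vxmd * (1 + tx) + vtwr) * (1 + tm)

# The same value, written as benchmark quantities times reference prices.
vim_from_prices = vxmd * px0 + vtwr * pt0

print(px0, pt0)                     # 1.875 1.5
print(vim_direct, vim_from_prices)  # 18.0 18.0
```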

2.2  The GTAP-EG Dataset

The GTAP-EG dataset is the GAMS version of the energy-economic dataset GTAP-E. It has been observed that the GTAP economic data provide a poor representation of energy flows (Babiker and Rutherford [1997]). The process of economic and energy data integration has proceeded in parallel at Purdue University and the University of Colorado at Boulder. Reconciling the data requires heroic adjustments. Two different approaches for calibration have been used. As a result, two energy datasets have been created.

The calibration approach taken at Purdue University is to use the RAS procedure (United Nations [1973]) to fit energy quantities to "target" quantities, and then to use the FIT procedure to adjust the single-region input-output coefficients. The process of incorporating energy data into GTAP is described in detail by Malcolm and Truong [1999]. We denote the Purdue dataset as GTAP-E-FIT. As a result of the FIT procedure, the information from all three data sources (GTAP economic data, IEA energy quantities, and price data) is changed in the process of calibration.
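
For readers unfamiliar with RAS, the following Python sketch shows the generic textbook form of the procedure: a matrix is alternately rescaled by rows and by columns until its row and column sums match given targets. It is not the Purdue implementation and does not cover the FIT step. The function name, starting matrix, and targets are hypothetical.

```python
def ras(matrix, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Biproportional (RAS) scaling of a nonnegative matrix to target margins."""
    a = [row[:] for row in matrix]  # work on a copy
    for _ in range(max_iter):
        # Scale each row so that it sums to its target.
        for i, row in enumerate(a):
            s = sum(row)
            if s > 0:
                factor = row_targets[i] / s
                a[i] = [x * factor for x in row]
        # Scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in a)
            if s > 0:
                factor = target / s
                for row in a:
                    row[j] *= factor
        # Columns now match; stop once the rows match as well.
        err = max(abs(sum(row) - t) for row, t in zip(a, row_targets))
        if err < tol:
            break
    return a

# Hypothetical starting matrix and targets (both sets of targets sum to 10).
start = [[1.0, 2.0], [3.0, 4.0]]
row_targets = [4.0, 6.0]
col_targets = [5.0, 5.0]

balanced = ras(start, row_targets, col_targets)
for row in balanced:
    print([round(x, 4) for x in row])
```

The row and column targets must have the same grand total, otherwise no matrix can satisfy both sets of margins.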

In contrast to the Purdue approach, we apply standard optimization techniques for calibrating the GTAP data to energy statistics. The resulting dataset, which is described in this paper, is called GTAP-EG (GTAP-Energy in GAMS). Accordingly, the dataset and an illustrative model are presented in the GAMS programming language (Brooke, Kendrick, and Meeraus [1992]). The process of creating GTAP-EG by incorporating energy statistics into the GTAP format is described in Rutherford and Paltsev [2000]. The GTAP-EG approach is to modify the GTAP value data as little as required while preserving the IEA energy quantity statistics and most of the prices.

The energy statistics collected at the OECD by Complainville [1998] have 135 regions, 32 goods, and 7 energy commodities. The resulting GTAP-EG dataset has 45 regions, 23 goods (5 of which are energy goods), and 5 primary factors. The aggregation of the 135 IEA-format regions into 45 GTAP regions is shown in Appendix 2. Most of the regional identifiers in the dataset correspond to standard UN three-character country codes (footnote 6).

To combine energy and trade data, the 32 IEA-format sectors are aggregated into 22 sectors. In order to comply with the IEA aggregation, the original 50 industrial sectors of the GTAP data are also aggregated into the same 22 sectors. A sector for the investment composite is added to the original GTAP-GEMPACK representation. Table A.4 in Appendix 1 presents the identifiers for the 23 GTAP-EG sectors. The sectoral identifiers for energy are different from the GTAP-E-FIT identifiers (footnote 7). The differences are noted in Table 5.

A concordance between IEA, GTAP 4, and GTAP-EG production sectors is presented in Appendix 3. The process of incorporating IEA statistics into GTAP-EG format is described in detail in Rutherford and Paltsev [2000]. Sectors may be aggregated to produce more compact datasets. The aggregation routine is described in Section 4.

Table 5: Differences between GTAP-E-FIT and GTAP-EG sectoral identifiers.

Sector                  GTAP-E-FIT   GTAP-EG
Electricity and heat    ELY          ELE
Refined oil products    P_C          OIL
Crude oil               OIL          CRU

Table 6 presents the three-character identifiers used for primary factors. Note that these differ from the primary factor names employed in the GEMPACK model.

Table 6: Differences between GTAP-E-FIT and GTAP-EG primary factor identifiers.

Primary factor          GTAP-E-FIT   GTAP-EG
Land                    Land         LND
Skilled labor           SkLab        SKL
Unskilled labor         UnSkLab      LAB
Capital                 Capital      CAP
Natural resources       NatRes       RES

The GTAP-EG dataset has a structure similar to that of the GTAPinGAMS dataset, with the addition of energy quantities. The general database flows are shown in Figure 1. The parameters explicitly represented in the GTAP-EG dataset are listed in Table 1 and Table 7. The energy parameters, beginning with "e", are in neither the original GTAP dataset nor the GTAPinGAMS dataset. Energy prices can be recovered by dividing the respective values by the energy quantities. IEA statistics are expressed in a common unit, tonnes of oil equivalent. In GTAP-EG, units for electricity are converted into trillion kilowatt-hours (TKWH) and units for other energy flows are converted into exajoules (EJ; see footnote 8).
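
The unit conversions and the price recovery can be illustrated with a short Python sketch. It assumes the standard IEA definition of the tonne of oil equivalent (41.868 GJ, or 11,630 kWh); the exact factors used in building GTAP-EG are not stated in this section, so treat them as an assumption. The function names and sample numbers are hypothetical.

```python
# Standard IEA definition of one tonne of oil equivalent (assumed here).
GJ_PER_TOE = 41.868
KWH_PER_TOE = 11_630.0

def mtoe_to_ej(mtoe):
    """Million tonnes of oil equivalent -> exajoules (1 EJ = 1e9 GJ)."""
    return mtoe * 1e6 * GJ_PER_TOE / 1e9

def mtoe_to_tkwh(mtoe):
    """Million tonnes of oil equivalent -> trillion kilowatt-hours."""
    return mtoe * 1e6 * KWH_PER_TOE / 1e12

def implied_price(value, quantity):
    """Price implied by a value flow and the matching energy quantity."""
    return value / quantity

print(round(mtoe_to_ej(100.0), 4))    # 4.1868
print(round(mtoe_to_tkwh(100.0), 4))  # 1.163

# Hypothetical flow: value of 2.0 (tens of billions of 1995$) for 4.0 EJ.
price = implied_price(2.0, 4.0)   # tens of billions of 1995$ per EJ
price_usd_per_gj = price * 10.0   # 1e10 $ per 1e9 GJ = 10 $ per GJ
print(price, price_usd_per_gj)    # 0.5 5.0
```

With values in tens of billions of 1995$ and quantities in EJ, an implied price of 1.0 corresponds to 10 1995$ per gigajoule.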

Table 7: Energy Parameters Explicitly Represented in a GTAP-EG Dataset
parameter
        eind(i,i,r)     Industrial energy demand,
        efd(i,r)        Final energy demand,
        eexp(i,r)       Energy exports,
        eimp(i,r)       Energy imports;

A summary of economic activity by production sectors and regions in the GTAP-EG dataset is presented in Appendix 4. These numbers differ slightly from those of the GTAP-E-FIT dataset (footnote 9). The two energy datasets are different even though they are based on the same initial data: the GTAP version 4 data (Hertel [1997]), expressed in terms of values (i.e. price times quantity); OECD International Energy Agency statistics (Complainville [1998]), expressed in terms of quantities; and energy price and tax data (Babiker and Malcolm [1998]). The reason for this discrepancy is the different calibration procedures that have been used. Since only two out of the three variables (price, quantity, value) can be regarded as independent, it is problematic to incorporate both price and quantity data into the GTAP database.

To illustrate the difference between GTAP-EG and GTAP-E-FIT, we calculate carbon dioxide emissions and compare the results with the IEA [1997] publication, in which carbon dioxide emissions from fuel combustion are reported. It should be noted that the results from the IEA publication [1997] and the IEA statistics collected by Complainville are different. One source of the difference is International Marine Bunkers, which are present in the IEA book but not in the datasets. The International Marine Bunkers category contains emissions from fuels burned by sea-going ships of all flags that are engaged in international transport. These emissions are excluded from national totals in the IEA publication. As a result, the data for countries with large sea fleets differ substantially.

The CO2 emissions for the full list of GTAP countries are presented in Appendix 4. Table 8 shows the results for the countries where differences in calculated CO2 emissions are substantial. We report carbon dioxide emissions from the IEA publication and compare them with the emissions calculated from the IEA statistics and from the GTAP-E-FIT and GTAP-EG energy datasets. We also provide the numbers for the GTAP-EG dataset without the fix for agriculture in the USA (an ad hoc adjustment) described in Rutherford and Paltsev [2000]. It should be noted that there is a discrepancy between all four sources of the energy data. The calibration procedures employed in GTAP-E-FIT and GTAP-EG do not reproduce the IEA statistics exactly. Carbon dioxide emissions are underestimated in GTAP-E-FIT, while they are slightly overestimated in GTAP-EG.

Table 8: Carbon dioxide emissions (selected countries) - billion tonnes
Region   IEA book   IEA stat   E-FIT    EG before fix   EG
JPN      1.151      1.208      1.145    1.257           1.257
KOR      0.353      0.449      0.396    0.449           0.449
SGP      0.059      0.085      0.085    0.085           0.085
CHN      3.007      3.098      2.902    3.112           3.112
IND      0.803      0.771      0.765    0.773           0.773
CAN      0.471      0.505      0.472    0.506           0.506
USA      5.228      5.339      5.175    5.340           5.460
MEX      0.328      0.328      0.309    0.328           0.328
BRA      0.287      0.269      0.256    0.289           0.289
GBR      0.565      0.605      0.540    0.607           0.607
DEU      0.884      0.973      0.865    0.973           0.973
REU      1.560      1.734      1.628    1.735           1.735
FSU      2.483      2.542      2.341    2.549           2.549
RME      0.817      0.788      0.755    0.827           0.827
ROW      0.518      0.208      0.183    0.208           0.208
total    22.150     22.482     21.272   22.644          22.764
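
The size of the discrepancies can be read off the totals in Table 8. The short Python sketch below expresses each total as a percentage deviation from the total calculated from the IEA statistics. The choice of the IEA statistics as the baseline is ours, made for illustration, and the variable names are hypothetical.

```python
# Totals from the last row of Table 8 (billion tonnes of CO2).
totals = {
    "IEA book": 22.150,
    "IEA stat": 22.482,
    "E-FIT": 21.272,
    "EG before fix": 22.644,
    "EG": 22.764,
}

baseline = totals["IEA stat"]

# Percentage deviation of each total from the IEA-statistics total.
deviation_pct = {name: 100.0 * (value - baseline) / baseline
                 for name, value in totals.items()}

for name, dev in deviation_pct.items():
    print(f"{name:15s} {dev:+.2f}%")
```

On these totals GTAP-E-FIT falls short of the IEA statistics by a little over 5 percent, while GTAP-EG exceeds them by a little over 1 percent.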

Maintained by Sergey Paltsev
Last Updated 01/20/01