Data Analysis Assignment_Prompt

The document outlines an assignment for an International Economics course that involves data analysis using a panel dataset on Global Value Chains for 65 countries over 24 years. Students are required to merge additional data from the World Bank, create new variables, report summary statistics, and conduct various analyses including country-level comparisons and graphical representations. The assignment is due on December 13, 2024, and emphasizes the importance of interpreting the results.

ECON 205: International Economics

Data Analysis Assignment


Instructions: This Assignment is due on December 13, 2024. The final Excel file and the
assignment with graphs and answers should be uploaded on Moodle by the deadline. Explain
and discuss interpretations wherever required.
To complete this data analysis assignment, you will use a panel dataset on Global Value Chains.
The Excel sheet required for this analysis is uploaded on Moodle. The data form a three-way
panel covering 65 countries, 70 industries and 24 years (1995-2018).
It has data on Gross exports, Gross imports, Gross output, Value added (all in millions of US$),
Domestic value added in foreign exports as a share of gross exports (in %), and Foreign value
added as a share of gross exports (in %).
Using this dataset, conduct the following data analysis and answer the following questions.
Exercise/Questions:
Exercise 1:
Let us begin by adding some more data to our existing dataset (excel file).
Use the below link to the World Bank’s World Development Indicator (WDI) database:
https://databank.worldbank.org/source/world-development-indicators
Once you are on this page, look at the data selection options on the left-hand menu.
(a) Do not change the database option. You must use the WDI database.
(b) For Country, manually select the 65 countries listed in our excel sheet.
(c) For Series, search in the search bar and select the following variables:
(i) GDP per capita growth (annual %)
(ii) GDP growth (annual %)
(iii) LFPR: Labor force participation rate, total (% of total population ages 15-64)
(modeled ILO estimate)
(iv) LFPR, male: Labor force participation rate, male (% of male population ages 15-64)
(modeled ILO estimate)
(v) LFPR, female: Labor force participation rate, female (% of female population ages
15-64) (modeled ILO estimate)
(d) For Time, select all years from 1995 to 2018.
Once you have done all the above-mentioned selections, you will be able to “Apply changes” on
the right-hand side (center) of the page. Then, at the top right of the page, below sign in option,
you should be able to see a dropdown menu called “download options”. Click on that and
download the data as an excel file.
Now, you need to merge or match the downloaded dataset from WDI with our existing
excel file. You need to use this merged final dataset file for all analysis moving forward. Let us
call this file the “Master dataset”.
Note: The data you downloaded from WDI is country-level data from 1995-2018. But our excel
file is also at the industry level (70 industries for each country and year). You need to figure out
how to match this country and year level WDI data with the original excel file which has
country, industry and year level data.
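
One possible way to think about this match, sketched in Python rather than Excel or STATA: each country-year value from WDI is simply repeated on every industry row that shares the same country and year. The column names (country, year, indcode, gdp_growth) and all numbers below are made up for illustration and are not the actual headers or values in either file.

```python
# Illustrative rows only: column names and values are assumptions,
# not the real headers or data of the course Excel file or the WDI download.
gvc_rows = [
    {"country": "IND", "indcode": "X01", "year": 1995, "gross_exports": 120.0},
    {"country": "IND", "indcode": "DTOTAL", "year": 1995, "gross_exports": 900.0},
    {"country": "USA", "indcode": "DTOTAL", "year": 1995, "gross_exports": 5000.0},
]
wdi_rows = [
    {"country": "IND", "year": 1995, "gdp_growth": 7.0},
    {"country": "USA", "year": 1995, "gdp_growth": 3.0},
]


def merge_country_year(industry_rows, country_rows, keys=("country", "year")):
    """Attach country-year values to every industry row with the same keys."""
    # Build a lookup table: (country, year) -> WDI row.
    lookup = {tuple(row[k] for k in keys): row for row in country_rows}
    merged = []
    for row in industry_rows:
        match = lookup.get(tuple(row[k] for k in keys), {})
        # Copy only the non-key columns so the keys are not duplicated.
        extra = {k: v for k, v in match.items() if k not in keys}
        merged.append({**row, **extra})
    return merged


master = merge_country_year(gvc_rows, wdi_rows)
```

Both IND rows receive the same gdp_growth value, which is the intended behaviour: a country-level indicator does not vary across industries. In Excel the same idea can be implemented with a lookup on a combined country-and-year key.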

Exercise 2:
Create two new variables called GVC participation and Trade Openness. GVC participation
is our variable for trade in GVCs, while Trade Openness is our variable for trade in terms of
gross exports and imports. Both these variables should be added as two new columns in the same
merged “Master dataset”. The formula for generating these variables is:
GVC participation = DVA in FX share exp (%) + FVA share exp (%)
Trade Openness = [Gross Exports (mils) + Gross Imports (mils)] / Gross Output (mils) × 100
Note: GVC participation measures trade in intermediate inputs while trade openness measures
trade in final goods.
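
As a quick check of the arithmetic, the two formulas can be written as small functions. This is a minimal sketch in Python; the function names, argument names, and the numbers are hypothetical and only illustrate the calculation. GVC participation is the sum of the two share columns, and Trade Openness is gross exports plus gross imports, divided by gross output, times 100.

```python
def gvc_participation(dva_fx_share, fva_share):
    """Sum of the two share columns, both already expressed in %."""
    return dva_fx_share + fva_share


def trade_openness(gross_exports, gross_imports, gross_output):
    """(Exports + imports) as a percentage of gross output.

    Returns None when gross output is zero, so the division is never
    attempted on an empty industry row.
    """
    if gross_output == 0:
        return None
    return (gross_exports + gross_imports) / gross_output * 100


# Made-up values for one row: shares in %, trade and output in millions of US$.
example_gvc = gvc_participation(18.0, 22.0)           # 40.0
example_open = trade_openness(300.0, 200.0, 2000.0)   # 25.0
```

In Excel, each function corresponds to one formula filled down a new column of the Master dataset.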

Exercise 3:
Now, you need to report the summary statistics for all the variables in our Master dataset listed
below. The statistics should be reported in a table form as below.
Table 1.A: All countries
Variable             Mean    S.D.    Min    Max
GVC participation
Trade openness
Value added
GDP per capita
LFPR, total
LFPR, male
LFPR, female
Table 1.B: Developed and Developing countries
                     Developed                      Developing
Variable             Mean    S.D.    Min    Max     Mean    S.D.    Min    Max
GVC participation
Trade openness
Value added
GDP per capita
LFPR, total
LFPR, male
LFPR, female

Note: To distinguish between developed and developing countries in the sample, use the
variable “countrytype” in the original excel sheet data. Countrytype is a binary variable, which
takes a value 0 for developed countries and 1 for developing countries. You should now report
the summary statistics for developed and developing countries using this countrytype variable.
You can either use STATA to create summary stats or do it in excel directly. In STATA you can
use the command: sum variablename if countrytype==0, and so on.
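
For those working outside STATA or Excel, the same four statistics can be sketched in Python with the standard library. The rows and the column name gvc_participation below are invented for illustration; only the countrytype coding (0 for developed, 1 for developing) comes from the note above. The standard deviation used here is the sample standard deviation.

```python
from statistics import mean, stdev

# Made-up rows; only the 0/1 coding of countrytype follows the assignment.
rows = [
    {"country": "AAA", "countrytype": 0, "gvc_participation": 40.0},
    {"country": "BBB", "countrytype": 0, "gvc_participation": 50.0},
    {"country": "CCC", "countrytype": 1, "gvc_participation": 30.0},
    {"country": "DDD", "countrytype": 1, "gvc_participation": 20.0},
]


def summarize(data, variable, countrytype=None):
    """Mean, sample S.D., min and max of one variable.

    Pass countrytype=0 or countrytype=1 to restrict the sample;
    leave it as None to use all countries. Missing values are skipped.
    """
    values = [
        row[variable]
        for row in data
        if row.get(variable) is not None
        and (countrytype is None or row["countrytype"] == countrytype)
    ]
    return {
        "mean": mean(values),
        "sd": stdev(values),  # needs at least two observations
        "min": min(values),
        "max": max(values),
    }


all_stats = summarize(rows, "gvc_participation")
developed_stats = summarize(rows, "gvc_participation", countrytype=0)
developing_stats = summarize(rows, "gvc_participation", countrytype=1)
```

Calling summarize once per variable fills one row of Table 1.A, and calling it with countrytype set to 0 and then 1 fills the two halves of Table 1.B.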

Question 1:

By how much do GVC participation and Trade openness differ across all countries (in Table
1.A), and between developed and developing countries (in Table 1.B)? Briefly describe one key
reason for this difference.

Exercise 4:

Let us now transition from industry to country-level analysis. In other words, we will aggregate
the data across industries, such that our new data is at the country and year level. To do this, you
should create a modified version of the “Master dataset” by only keeping the industry code
indcode = “DTOTAL” or industryid = 70 (and dropping all the other 69 industry groups). This
is the sum total value of each variable across all industries. You should only keep the row
“DTOTAL” for all 65 countries and 24 years. Your dataset will then be at the country and year
level only (65 countries and 24 years, with no industry dimension). Save this dataset as
“Country level dataset”, as a new sheet in the “Master dataset” Excel file.
Note: You can use software such as STATA (or R) for this step, or simply do it in Excel. In
STATA, use a command that keeps just the indcode “DTOTAL”.
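
The filtering step can also be sketched in Python. The rows and the industry code "X01" below are hypothetical; only the code "DTOTAL" and the column name indcode come from the assignment. The second function is an optional sanity check that the result has exactly one row per country and year.

```python
# Made-up Master dataset rows; "X01" is a placeholder industry code.
master_rows = [
    {"country": "IND", "year": 1995, "indcode": "X01", "value_added": 10.0},
    {"country": "IND", "year": 1995, "indcode": "DTOTAL", "value_added": 100.0},
    {"country": "IND", "year": 1996, "indcode": "DTOTAL", "value_added": 110.0},
    {"country": "USA", "year": 1995, "indcode": "X01", "value_added": 50.0},
    {"country": "USA", "year": 1995, "indcode": "DTOTAL", "value_added": 900.0},
]


def keep_total_rows(rows, total_code="DTOTAL"):
    """Return only the rows that hold the all-industry total."""
    return [row for row in rows if row["indcode"] == total_code]


def has_one_row_per_country_year(rows):
    """True when no (country, year) pair appears more than once."""
    keys = [(row["country"], row["year"]) for row in rows]
    return len(keys) == len(set(keys))


country_level = keep_total_rows(master_rows)
```

With the full data, the filtered set should contain 65 × 24 = 1,560 rows if every country has a DTOTAL row in every year.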
Exercise 5:
Now, let’s do some country-level analysis (using the above Country-level dataset). Let us plot
some graphs and analyze the data.
Question 2:
(a) Plot GVC participation (x-axis) and GDP per capita (y-axis) on a graph. The graph is for all
countries in the sample. Analyze and interpret the relationship between these two variables.
(Hint: you can use the command “scatter” in STATA to generate such graphs, or use Excel or
any other software you prefer.)
(b) Plot Trade openness (x-axis) and GDP per capita (y-axis) on a graph. The graph is for all
countries in the sample. Analyze and interpret the relationship between these two variables.
Discuss how the results in this graph differ from the results in part (a).
(c) Plot GVC participation (x-axis) against LFPR, LFPR male and LFPR female (y-axis) on three
separate graphs. Each graph is for all countries in the sample. Analyze and interpret the
relationship between GVC participation and overall employment, and between GVC participation
and employment by gender.

Exercise 6:
Finally, let’s repeat the country-level analysis from Exercise 5 (using the above Country-level
dataset) for a select sample of developed and developing nations.
Now, only take the following sub-sample of countries: United States (USA), Germany, South
Korea, China, Brazil, and India.
For these selected countries, do the following analysis:
Question 3:
(a) Plot GVC participation (y-axis), LFPR (y-axis), and year (x-axis) on a graph. The graph
should have separate trend lines for each of the selected countries from 1995-2018.
Analyze and interpret the trend of GVC and employment for the selected sample of countries.
(b) Plot Trade openness (y-axis), LFPR (y-axis), and year (x-axis) on a graph. The graph should
have separate trend lines for each of the selected countries from 1995-2018.
Analyze and interpret the trend of trade openness and employment for the selected sample of
countries.
Note: Use a different colour for each country’s trend line, and label each line clearly.
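
Before plotting, it can help to arrange the data as one year-ordered series per country, so that each series becomes one trend line. A minimal Python sketch follows; the three-letter country codes, the column names, and the values are assumptions for illustration, and the real file may identify countries differently.

```python
from collections import defaultdict

# Assumed codes for the six selected countries; check them against the data.
SELECTED = {"USA", "DEU", "KOR", "CHN", "BRA", "IND"}

# Made-up Country-level rows, deliberately out of year order.
country_rows = [
    {"country": "IND", "year": 1996, "gvc_participation": 21.0},
    {"country": "USA", "year": 1995, "gvc_participation": 30.0},
    {"country": "IND", "year": 1995, "gvc_participation": 20.0},
    {"country": "FRA", "year": 1995, "gvc_participation": 45.0},
]


def country_series(rows, variable, countries=SELECTED):
    """Group rows into {country: [(year, value), ...]} sorted by year."""
    series = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["year"]):
        if row["country"] in countries:
            series[row["country"]].append((row["year"], row[variable]))
    return dict(series)


gvc_series = country_series(country_rows, "gvc_participation")
```

Each entry of gvc_series can then be handed to any plotting tool as one labelled line; the same function applied to an LFPR column gives the second set of lines for the graph.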
