Introduction
GDP per person employed is a key economic concern in many countries, and every nation
aims to increase its GDP. Much of the recent improvement has been seen in Middle
Eastern nations. It is crucial for countries to manage GDP as their populations grow.
Our study considers Italy. We also take into account a few other variables, such as the
employment rate, industrial production, and population growth over the past five years.
We expect each of these factors to influence the nation's GDP in some way. Using Italy's
quarterly statistics for these variables, we attempt to forecast the upcoming quarter.
To do this, we perform regression analysis as well as several forecasting techniques.
1) The data for Italy were obtained from OECD statistics. They cover 2018 Q1 to
2023 Q2 and contain the following variables on a quarter-on-quarter basis:
a) Active population
b) Industrial Production Index
c) Employment Rate (%)
d) Gross Domestic Product per person employed.
In this part we perform the regression analysis step by step.
First, we use scatter plots to show how each variable depends on the others.
[Figure: scatter plot of Employment Rate % against Active Population (Thousands)]
The above graph shows the relation between the active population at any point in
time and the employment rate. The employment rate increases with the population,
and a straight line can be fitted to describe the relationship between the two
variables. Thus, the graph indicates a positive relationship between the
population and the employment rate.
[Figure: scatter plot of Industrial Production % against Active Population (Thousands)]
In the above graph the industrial production rate is shown against the population.
No straight-line relationship can be identified, because the points are widely
scattered and follow no clear pattern. From this plot, Industrial Production %
does not appear to depend on the population at any point in time.
[Figure: scatter plot of GDP per person employed against Active Population (Thousands)]
The above graph shows the relation between GDP per person employed and active
population. Although the data are scattered, some trend appears as the population
grows, but the relation is not strong. It is therefore not safe to conclude that
the two variables are independent.
[Figure: scatter plot of Industrial Production % against Employment Rate %]
The above scatter plot of industrial production rate against employment rate shows
no visible relationship between the two variables: the data are scattered and do
not follow a single pattern from which a relationship could be concluded.
[Figure: scatter plot of GDP per person employed against Employment Rate %]
The above graph shows the relation between GDP per person employed and the
employment rate. Although the data are scattered, there is some upward trend as
the employment rate grows, but the relation is not strong. It is not safe to
conclude that the two variables are independent, as there can be relationships
that are not visible in a plot.
[Figure: scatter plot of Industrial Production % and GDP per person employed]
The above scatter plot of industrial production rate and GDP per person employed
shows no visible relationship between the two variables: the data are scattered
and do not follow a single pattern from which a relationship could be concluded.
In the next step we look at the correlation matrix of the variables to check for
any existing relationship. Correlation values lie between -1 and 1: 0 indicates
no linear relationship, while 1 and -1 represent perfect positive and negative
linear relationships respectively.
                                Active Population  Employment    Industrial    GDP per person
                                (Thousands)        Rate %        Production %  employed
Active Population (Thousands)    1
Employment Rate %                0.906915194        1
Industrial Production %          0.390470189        0.363190023   1
GDP per person employed         -0.025524239       -0.080158546   0.052576012   1
From the above correlation matrix, there is a high correlation between active
population and employment rate %, since the value (0.907) is close to 1. The other
pairs of variables show some correlation, but it is much weaker. Thus, from the
scatter plot as well as the correlation matrix, we can conclude that Employment
Rate % is strongly positively correlated with the Active Population.
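The correlation matrix in this report was produced in Excel. As a minimal sketch of
the same calculation, the Pearson coefficient can be computed as below. The function
names and the toy values are hypothetical and are NOT the OECD data used here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return {(name_a, name_b): r} for every pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Illustrative values only (assumption); not the data analysed in this report.
toy = {
    "active_population": [2800, 2850, 2900, 2950, 3000],
    "employment_rate": [74.0, 74.8, 75.9, 76.5, 77.4],
    "industrial_production": [1.0, -2.0, 3.0, -1.0, 0.5],
}
matrix = correlation_matrix(toy)
for (a, b), r in matrix.items():
    print(f"{a:>22} vs {b:<22} r = {r: .3f}")
```

Each variable has a correlation of 1 with itself and the matrix is symmetric, which is
why only the lower triangle is shown in the table above.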
In the next step we perform a regression analysis with all the variables, followed
by a regression on some chosen variables. Here we try to predict the GDP per person
employed from the other variables. We use the Regression tool in Excel, which is
available in the Analysis ToolPak add-in.
Our independent variables are Active Population, Employment Rate % and Industrial
Production %, and our dependent variable is GDP per person employed.
The result is as follows:
SUMMARY OUTPUT

Regression Statistics
Multiple R            0.396607
R Square              0.157297
Adjusted R Square     0.016847
Standard Error        3.219707
Observations          22

ANOVA
              df    SS             MS          F
Regression    3     34.82985892    11.60995    1.119948
Residual      18    186.5972514    10.36651
Total         21    221.4271103

                                Coefficients   Standard Error   t Stat      P-value
Intercept                       48.15493       48.55895102      0.99168     0.334498
Active Population (Thousands)   0.014033       0.022218777      0.631564    0.535611
Employment Rate %               0.181568       1.297982606      0.139884    0.890305
Industrial Production %         0.050076       0.163959896      0.305418    0.763551
In this case our independent variables are Active Population, Employment Rate % and
Industrial Production %, while the dependent variable remains GDP per person employed.
The fitted model is:
GDP per person employed = 48.155 + 0.014 * Active Population + 0.182 * Employment Rate % +
0.050 * Industrial Production %
The overall model fits poorly: the Adjusted R Square is 0.017, i.e. after adjusting for
the number of predictors only about 1.7% of the variability of GDP per person employed is
explained by the variables. The p-values in the table also show that none of the
coefficient estimates is statistically significant. For inference we treat a parameter
as statistically significant when its p-value is below 0.05; the lower the p-value, the
stronger the evidence that the coefficient is not zero.
To test the significance of the overall model we use the F statistic reported in the
ANOVA table above.
The F statistic for the above model is F = 1.120
Degrees of freedom 1 (regression) = 3
Degrees of freedom 2 (residual) = 18
From the F table at the 0.05 significance level, the critical value is F3,18 = 3.16
Since F < F3,18, we cannot reject the hypothesis that all slope coefficients are zero,
and we conclude that the regression model is not statistically significant.
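As a cross-check, the summary statistics can be re-derived from the ANOVA sums of
squares and the coefficient table. The sketch below only re-does the arithmetic on the
values reported in the regression output; the variable names are our own.

```python
# Values copied from the regression output reported above.
n_obs = 22            # quarterly observations
k = 3                 # number of independent variables
ss_regression = 34.82985892
ss_residual = 186.5972514

ss_total = ss_regression + ss_residual
df_regression = k
df_residual = n_obs - k - 1

# R Square: share of total variation explained by the model.
r_squared = ss_regression / ss_total
# Adjusted R Square: penalises R Square for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
# F statistic: mean square of regression over mean square of residual.
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)

# t statistic of each estimate = coefficient / standard error.
estimates = {
    "Intercept": (48.15493, 48.55895102),
    "Active Population": (0.014033, 0.022218777),
    "Employment Rate %": (0.181568, 1.297982606),
    "Industrial Production %": (0.050076, 0.163959896),
}
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

print(f"R Square          = {r_squared:.6f}")
print(f"Adjusted R Square = {adj_r_squared:.6f}")
print(f"F                 = {f_statistic:.6f} on ({df_regression}, {df_residual}) df")
for name, t in t_stats.items():
    print(f"t({name}) = {t:.4f}")
```

The recomputed values agree with the Excel output to the precision shown there, and the
degrees of freedom follow directly from 22 observations and 3 independent variables.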
2) In this step we perform three types of forecasting techniques:
a) Naïve forecasting – The value for the last period is used as the forecast for
this period, without adjusting it or attempting to address the causal
factors.
b) Moving Average Forecasting – The average of the last few periods is taken as
the forecast for this period. This is usually used to identify the direction
of the trend. In our case we take the moving average of the last three
periods, i.e. the last three quarters.
c) Exponential Smoothing Forecasting – Weights are assigned to observations
according to their age: recent observations receive more weight and older
observations less. The weights are controlled by a factor named α. In our
case we produced forecasts for values of alpha from 0.1 to 0.9 and compared
the mean squared errors (MSE).
To perform the moving average forecast we use the Moving Average tool in the Analysis
ToolPak add-in, and for exponential smoothing we use the Exponential Smoothing tool in
the same add-in.
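The three techniques can also be sketched in a few lines of Python. This is a minimal
illustration and not the Excel tools used in this report: the function names are
hypothetical, the exponential smoothing is seeded with the first observation (an
assumption), and the toy series is NOT the OECD data.

```python
def naive_forecasts(series):
    """Forecast for period t is the actual of period t-1.
    The last element is the forecast for the next, unobserved period."""
    return [None] + list(series)

def moving_average_forecasts(series, window=3):
    """Forecast for period t is the mean of the previous `window` actuals."""
    out = [None] * window
    for t in range(window, len(series) + 1):
        out.append(sum(series[t - window:t]) / window)
    return out

def exp_smoothing_forecasts(series, alpha):
    """F[t] = alpha * A[t-1] + (1 - alpha) * F[t-1], seeded with F[1] = A[0]."""
    out = [None, series[0]]
    for t in range(2, len(series) + 1):
        out.append(alpha * series[t - 1] + (1 - alpha) * out[t - 1])
    return out

def mse(series, forecasts):
    """Mean squared error over the periods that have a forecast."""
    errors = [(a - f) ** 2 for a, f in zip(series, forecasts) if f is not None]
    return sum(errors) / len(errors)

# Illustrative series only (assumption); not the data analysed in this report.
toy_series = [100.0, 102.0, 101.0, 103.0, 104.0, 102.0]

naive = naive_forecasts(toy_series)
moving = moving_average_forecasts(toy_series, window=3)
print("naive:", naive[-1], "MSE", mse(toy_series, naive))
print("moving average:", moving[-1], "MSE", mse(toy_series, moving))

# Try alpha from 0.1 to 0.9 and keep the value with the smallest MSE.
alphas = [i / 10 for i in range(1, 10)]
mse_by_alpha = {
    a: mse(toy_series, exp_smoothing_forecasts(toy_series, a)) for a in alphas
}
best_alpha = min(mse_by_alpha, key=mse_by_alpha.get)
smoothed = exp_smoothing_forecasts(toy_series, best_alpha)
print("exp. smoothing:", smoothed[-1], "alpha", best_alpha, "MSE", mse_by_alpha[best_alpha])
```

In each list the final element is the forecast for the period after the last
observation, which corresponds to the Q3 2023 forecast in our analysis.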
GDP Forecast:
The forecasted values for Q3 2023 as well as the MSE obtained by applying the
above forecasting methods are represented below.
Technique                              Forecast for Q3 2023    MSE
Naïve Forecast                         101.7                   18.53
Moving Average                         104.8                   16.32
Exponential Smoothing (Alpha = 0.1)    103.2238                12.9
Exponential Smoothing (Alpha = 0.9)    102.1618                17.1
Exponential Smoothing with an alpha value of 0.1 gives the least MSE of the
above forecasting methods. Therefore, for GDP forecasting we will use
exponential smoothing with a smoothing factor of 0.1.
Forecast for Q3 2023 = 103.2238
Industrial Production % Forecast:
The forecasted values for Q3 2023 as well as the MSE obtained by applying the
above forecasting methods are represented below.
Technique                              Forecast for Q3 2023    MSE
Naïve Forecast                         -7.8                    14.627
Moving Average                         -3.8                    23.212
Exponential Smoothing (Alpha = 0.1)    0.227                   19.1
Exponential Smoothing (Alpha = 0.9)    -7.57291                13.9
Exponential Smoothing with Alpha value of 0.9 gives the least MSE of the
above forecasting methods. Therefore, for Industrial Production % forecasting we
will use Exponential smoothing with a smoothing factor of 0.9.
Forecast for Q3 2023 = -7.57
Employment Rate Forecast:
The forecasted values for Q3 2023 as well as the MSE obtained by applying the
above forecasting methods are represented below.
Technique                              Forecast for Q3 2023    MSE
Naïve Forecast                         77.4                    0.234
Moving Average                         77.4                    0.611
Exponential Smoothing (Alpha = 0.1)    76.38                   1.504
Exponential Smoothing (Alpha = 0.9)    77.38                   0.243
Naïve forecasting gives the least MSE of the above forecasting methods.
Therefore, for Employment Rate forecasting we will use naïve forecasting.
Forecast for Q3 2023 = 77.4
Active Population Forecast:
The forecasted values for Q3 2023 as well as the MSE obtained by applying the
above forecasting methods are represented below.
Technique                              Forecast for Q3 2023    MSE
Naïve Forecast                         3002                    568
Moving Average                         2980                    1217
Exponential Smoothing (Alpha = 0.1)    2904.998                5179
Exponential Smoothing (Alpha = 0.9)    2999.313                578
Naïve forecasting gives the least MSE of the above forecasting methods.
Therefore, for Active Population forecasting we will use naïve forecasting.
Forecast for Q3 2023 (in thousands) = 3002
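The selection rule applied to each variable above (choose the technique with the
smallest MSE) can be written out as a short sketch. The forecasts and MSE values are
copied from the four tables in this section; the dictionary layout and names are our
own.

```python
# (forecast for Q3 2023, MSE) per technique, copied from the tables above.
results = {
    "GDP per person employed": {
        "Naive": (101.7, 18.53),
        "Moving Average": (104.8, 16.32),
        "Exponential Smoothing (alpha=0.1)": (103.2238, 12.9),
        "Exponential Smoothing (alpha=0.9)": (102.1618, 17.1),
    },
    "Industrial Production %": {
        "Naive": (-7.8, 14.627),
        "Moving Average": (-3.8, 23.212),
        "Exponential Smoothing (alpha=0.1)": (0.227, 19.1),
        "Exponential Smoothing (alpha=0.9)": (-7.57291, 13.9),
    },
    "Employment Rate %": {
        "Naive": (77.4, 0.234),
        "Moving Average": (77.4, 0.611),
        "Exponential Smoothing (alpha=0.1)": (76.38, 1.504),
        "Exponential Smoothing (alpha=0.9)": (77.38, 0.243),
    },
    "Active Population (Thousands)": {
        "Naive": (3002, 568),
        "Moving Average": (2980, 1217),
        "Exponential Smoothing (alpha=0.1)": (2904.998, 5179),
        "Exponential Smoothing (alpha=0.9)": (2999.313, 578),
    },
}

# For each variable keep the technique whose MSE (second tuple entry) is lowest.
best = {}
for variable, techniques in results.items():
    name, (forecast, error) = min(techniques.items(), key=lambda kv: kv[1][1])
    best[variable] = (name, forecast, error)
    print(f"{variable}: {name}, forecast {forecast}, MSE {error}")
```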
3) In this report we have performed both regression and forecasting. For the
regression we fitted a multiple linear regression model in which the dependent
variable is GDP per person employed and the other variables, namely active
population, industrial production rate and employment rate, are the independent
variables. In this method the values of the dependent variable are related not
to time but to the other variables. The main goal of regression analysis is to
determine how one or more independent variables and a dependent variable are
related, and it is helpful for identifying and measuring the effect of the
independent factors on the dependent variable (Nikolopoulos et al. (1)).
Regression does not predict future values from past values; it can be used for
prediction only when the independent variables have been observed. In our case,
regression can predict the GDP per person employed for Q3 2023 only if we have
the values of active population, industrial production rate and employment rate
for that quarter. Also, since the model we obtained fits poorly, its prediction
of GDP per person will not be reliable and will have a large residual. This is
because the model force-fits the variables to the dependent variable
(Zhao et al. (2)).
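To illustrate the point that regression needs values for the independent variables,
the sketch below evaluates the fitted equation using the Q3 2023 forecasts from part 2
in place of observed values. This calculation is an illustration only and is not part
of the original analysis; because the model is not statistically significant, the
result should not be read as a reliable prediction. The function name is hypothetical.

```python
# Coefficients copied from the regression output in part 1.
COEFFICIENTS = {
    "intercept": 48.15493,
    "active_population": 0.014033,      # per thousand persons
    "employment_rate": 0.181568,        # per percentage point
    "industrial_production": 0.050076,  # per percentage point
}

def predict_gdp_per_person(active_population, employment_rate, industrial_production):
    """Evaluate the fitted linear model for one set of inputs."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["active_population"] * active_population
        + COEFFICIENTS["employment_rate"] * employment_rate
        + COEFFICIENTS["industrial_production"] * industrial_production
    )

# Assumption: the selected Q3 2023 forecasts stand in for observed values.
q3_2023_inputs = {
    "active_population": 3002,       # naive forecast, in thousands
    "employment_rate": 77.4,         # naive forecast
    "industrial_production": -7.57,  # exponential smoothing, alpha = 0.9
}

prediction = predict_gdp_per_person(**q3_2023_inputs)
print(round(prediction, 2))  # prints 103.96
```

Without the three input values the equation cannot be evaluated at all, which is the
practical difference from the time-series forecasts discussed next.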
Forecasting, on the other hand, does not depend on any other features. It is
time dependent: time is the only independent variable, and each of the other
variables is treated separately. A forecast is obtained by looking at the past
data and trying to find some pattern or trend in it; on the basis of that
pattern we predict the future. The values of the other variables are not
important, and we do not need to know their future values to predict the
result. Since forecasting does not depend on any other variable, it can be done
in a number of ways. In our analysis we used three types of forecasting. The
first is naïve forecasting, where the value from the previous period is used as
the forecast for this period. This is a simple method which assumes that
whatever the value was in this period will be the value in the next period. The
next is moving average forecasting, which helps to reveal any trend or pattern
in the data. We select a certain number of periods and average the values of
the most recent periods of that number. This makes sure that the past data are
used over several periods and not only the last period. The last method is
exponential smoothing. This method is used when there are uncontrollable
factors in the environment that might affect the value for the next period.
This dampening is represented by the factor alpha, which makes sure that some
of the older data are taken into account while more weight is given to recent
data.
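The effect of alpha on the weights can be made concrete. In simple exponential
smoothing the forecast is a weighted sum of past observations in which the observation
k periods back receives weight alpha * (1 - alpha)^k. The sketch below prints these
weights for the two alpha values reported in our tables; the function name is
hypothetical.

```python
def smoothing_weights(alpha, n_lags):
    """Weight on the observation k periods back: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]

for alpha in (0.1, 0.9):
    weights = smoothing_weights(alpha, n_lags=4)
    formatted = ", ".join(f"{w:.4f}" for w in weights)
    print(f"alpha = {alpha}: {formatted}")
```

With alpha = 0.9 almost all of the weight falls on the most recent observation, so the
forecast behaves much like the naïve forecast. With alpha = 0.1 the weights decay
slowly, so many past quarters contribute to the forecast.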
Regression analysis identifies the main factors that influence a given result,
which aids resource optimization (Sulaimon, 2015 (3)). For instance, by
determining which variables have the greatest effect on their objectives, it
can help companies allocate their budgets more effectively. Regression analysis
offers insight into the factors affecting a given outcome by describing the
connections between variables, which can be very important for adjusting
tactics. However, its accuracy depends on the values of the other parameters at
that point in time, so the business needs to find those values before it can
make predictions.
Forecasting provides information that is useful for budgeting and company
planning. Anticipating future demand, sales, and financial performance helps
firms set realistic goals and spend resources efficiently. It assists firms in
predicting market trends and consumer behavior, and it supports long-term
strategic vision. It reduces the effect of uncertainty and permits proactive
decision-making. It also reduces complexity and cost, since no other variables
are needed to make a prediction (Billings & Agthe, 1998 (4)). The only tuning
it requires is the dampening factor, which depends on the environment.
References
1) Nikolopoulos, K., Goodwin, P., Patelis, A., & Assimakopoulos, V. (2007).
Forecasting with cue information: A comparison of multiple regression with
alternative forecasting approaches. European Journal of Operational
Research, 180(1), 354-368.
2) Zhao, H., Sinha, A. P., & Bansal, G. (2011). An extended tuning method for
cost-sensitive regression and forecasting. Decision Support Systems, 51(3), 372-383.
3) Sulaimon Mutiu, O. (2015). Application of weighted least squares regression in
forecasting. Int. J. Recent. Res. Interdiscip. Sci., 2(3), 45-54.
4) Billings, R. B., & Agthe, D. E. (1998). State-space versus multiple regression for
forecasting urban water demand. Journal of Water Resources Planning and
Management, 124(2), 113-117.