Final Report - Introduction of Regression Analysis
Regression analysis is used to determine the effect of a change in one event on another event. Changes in an event can be expressed by changes in variable values. The results of the analysis can be used as a basis for assessing a policy.
Suppose, for example, a company adopts a policy of increasing promotional expenses in the expectation that sales will increase. Regression analysis that involves one independent variable (X) is simple regression analysis, while regression analysis that involves more than one independent variable (X) is multiple regression analysis. Regression can also be interpreted as an attempt to estimate change. To avoid misunderstanding: forecasting does not give a definite answer about what will happen, but rather tries to find an approximation of what will happen. Regression thus addresses the question of what will happen in the future and, in doing so, contributes to determining the best decision. Regression analysis is used to predict how far the value of the dependent variable changes when the value of the independent variable is manipulated (changed, increased, or decreased).
1. Basic Theory of Regression Analysis
Regression analysis is a technique for constructing equations and using those equations to make predictions. Thus, regression analysis is often referred to as predictive analysis. Because it produces predictions, the predicted value does not always match the actual value; the smaller the deviation between the predicted value and the actual value, the more precise the regression equation that is formed. Regression analysis can therefore be defined as a statistical method used to determine the likely form of the relationship between variables and to predict or estimate the value of a variable that is not yet known.
An example of simple regression: the amount of pocket money received by students is influenced by the distance from home to campus. Logically, the closer the distance from home to campus, the smaller the student's pocket money; conversely, the farther the distance, the greater the pocket money. Thus the distance from home to campus (variable X) affects the amount of student pocket money (variable Y) positively.
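As a minimal sketch of this example in Python (the numbers below are hypothetical, invented purely for illustration, and NumPy is assumed to be available), the line Y = a + bX can be fitted by least squares:

```python
import numpy as np

# Hypothetical data: distance from home to campus in km (X)
# and weekly pocket money in thousands of rupiah (Y).
distance = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 8.0, 10.0, 12.0])
pocket_money = np.array([150, 180, 210, 260, 320, 340, 400, 450], dtype=float)

# Fit the simple regression line Y = a + b*X (degree-1 polynomial fit).
b, a = np.polyfit(distance, pocket_money, deg=1)
print(f"Y = {a:.2f} + {b:.2f} * X")

# Use the fitted equation to predict pocket money at X = 6 km.
print(f"Predicted pocket money at 6 km: {a + b * 6:.2f}")
```

Note that np.polyfit returns the coefficients from the highest degree down, so the slope is unpacked before the intercept.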
4. Correlation Analysis
Correlation is an analytical technique that belongs to the family of measures of association. Measures of association is a general term for a group of bivariate statistical techniques used to measure the strength of the relationship between two variables. Among the many techniques for measuring association, two correlation techniques remain very popular, namely the Pearson Product Moment Correlation and the Spearman Rank Correlation. Besides these two, there are other correlation techniques, such as Kendall, Chi-Square, the Phi Coefficient, Goodman-Kruskal, Somers, and Wilson. Measures of association assign numerical values to express the degree of association or strength of the relationship between variables.
Two variables are said to be associated if the behavior of one variable affects the other; if there is no such influence, the two variables are called independent. Correlation measures the strength of the relationship between two variables (sometimes more than two) on particular measurement scales: Pearson requires interval or ratio data, Spearman and Kendall use ordinal data, and Chi-Square uses nominal data.
The strength of the relationship is measured on a scale from 0 to 1 in absolute value, and correlation coefficients can be subjected to a two-tailed hypothesis test. A correlation is unidirectional (direct) if the correlation coefficient is positive; conversely, if the correlation coefficient is negative, the correlation runs in the opposite direction (inverse). What is meant by the correlation
coefficient is a statistical measurement of covariation or association between two
variables. If the correlation coefficient is not equal to zero (0), there is some dependence between the two variables. If the correlation coefficient is +1, the relationship is referred to as a perfect correlation, that is, a perfect linear relationship with a positive slope. If the correlation coefficient is -1, the relationship is referred to as a perfect correlation with a negative slope. With a perfect correlation there is no need for hypothesis testing, because the two variables track each other exactly. If the correlation is equal to zero (0), there is no linear relationship between the two variables. In correlation analysis proper, the distinction between independent and dependent variables is not used; the calculation usually just uses the symbol X for the first variable and Y for the second. In the example of the relationship between remuneration and job satisfaction, remuneration is variable X and job satisfaction is variable Y.
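As a brief illustration of this convention (the remuneration and job-satisfaction scores below are hypothetical, and SciPy is assumed to be available), the correlation and its two-tailed p-value can be computed directly:

```python
import numpy as np
from scipy import stats

# Hypothetical interval-scale scores for ten employees.
remuneration = np.array([3.0, 3.5, 4.0, 4.2, 5.0, 5.5, 6.0, 6.5, 7.0, 8.0])  # X
satisfaction = np.array([2.8, 3.2, 3.9, 4.0, 4.6, 5.1, 5.9, 6.2, 6.8, 7.5])  # Y

# The sign of r gives the direction, its magnitude the strength;
# pearsonr also returns the two-tailed p-value for the hypothesis test.
r, p = stats.pearsonr(remuneration, satisfaction)
print(f"r = {r:.3f}, two-tailed p = {p:.4f}")
```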
5. Correlation Coefficient
The correlation coefficient is a value that shows the strength of the linear relationship between two variables. It is usually denoted by the letter r, where r can vary from -1 to +1. An r value close to -1 or +1 indicates a strong relationship between the two variables, while an r value close to 0 indicates a weak one. The + (positive) and - (negative) signs give the direction of the relationship. If the sign is + (positive), the two variables move in the same direction: an increase in X coincides with an increase in Y, and vice versa. If the sign is - (negative), the correlation is inverse: an increase in the value of X is accompanied by a decrease in Y. The Pearson correlation coefficient, or Product Moment Coefficient of Correlation, is a value that shows the closeness of the linear relationship between two variables measured on interval or ratio scales. The formula used is

r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{\left(n \sum X^2 - (\sum X)^2\right)\left(n \sum Y^2 - (\sum Y)^2\right)}}
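A direct translation of this raw-score formula into Python, cross-checked against scipy.stats.pearsonr (the paired data are again hypothetical):

```python
import numpy as np
from scipy import stats

def pearson_r(x, y):
    """Pearson r from the raw-score formula above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    numerator = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    denominator = np.sqrt((n * np.sum(x**2) - np.sum(x)**2) *
                          (n * np.sum(y**2) - np.sum(y)**2))
    return numerator / denominator

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(pearson_r(x, y))          # raw-score formula
print(stats.pearsonr(x, y)[0])  # library value, for comparison
```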
The Spearman rank correlation coefficient is a value that shows the closeness of the relationship between two variables measured on an ordinal scale. The Spearman coefficient is usually denoted by r_s, and the formula used is

r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}

where d_i is the difference between the ranks of the i-th pair of observations and n is the number of pairs.
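The rank formula can likewise be sketched directly; this version assumes no tied ranks, which is the case the formula above covers (with ties, libraries compute Pearson's r on the ranks instead). The scores are hypothetical:

```python
import numpy as np
from scipy import stats

def spearman_rs(x, y):
    """Spearman r_s = 1 - 6*sum(d_i^2) / (n*(n^2 - 1)), assuming no tied ranks."""
    rank_x = np.argsort(np.argsort(x)) + 1  # 1-based ranks of x
    rank_y = np.argsort(np.argsort(y)) + 1  # 1-based ranks of y
    d = rank_x - rank_y                     # rank differences d_i
    n = len(x)
    return 1 - 6 * np.sum(d**2) / (n * (n**2 - 1))

x = [86, 97, 99, 100, 101, 103, 106, 110]  # hypothetical ordinal scores
y = [2, 20, 28, 27, 50, 29, 7, 17]
print(spearman_rs(x, y))      # rank-difference formula
rho, p = stats.spearmanr(x, y)
print(rho)                    # library value, for comparison
```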
If variables X and Y are independent, then r = 0; however, if r = 0, X and Y are not necessarily independent, only unassociated in the linear sense. Note that the correlation coefficient can serve only as an initial indication in an analysis: its value cannot by itself establish a cause-and-effect relationship between X and Y. To arrive at a cause-and-effect conclusion, more intensive research is needed, or the argument must rest on existing theory about whether X affects Y or Y affects X.
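A small sketch of that caveat: below, Y is completely determined by X (Y = X squared), yet the Pearson coefficient is exactly zero, because the dependence is not linear (hypothetical data):

```python
import numpy as np
from scipy import stats

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x**2  # Y depends on X perfectly, but not linearly

r, _ = stats.pearsonr(x, y)
print(f"r = {r:.3f}")  # 0.000: no linear association, yet X and Y are not independent
```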
6. Determination Coefficient
The coefficient of determination shows the extent to which the independent variables in the regression model are able to explain the variation in the dependent variable. It can be read from the R-square (R2) value in the Model Summary table. The coefficient of determination test is carried out to determine how far the exogenous (independent) variables are simultaneously able to explain the endogenous (dependent) variable; the higher the R2 value, the better the predictive power of the proposed research model. In other words, the R2 test measures how large the joint contribution of the independent variables to the dependent variable is. The coefficient of determination lies between 0 and 1. If the value is close to 1, the independent variables provide almost all the information needed to predict the dependent variable; conversely, a small R2 value means that the ability of the independent variables to explain the dependent variable is limited (Ghozali, 2016). According
to Chin (1998), the R-Square value is categorized as strong if more than 0.67, moderate
if more than 0.33 but lower than 0.67, and weak if more than 0.19 but lower than 0.33.
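A short sketch of computing R2 for a simple regression and reading it against the Chin (1998) categories quoted above (the data are hypothetical):

```python
import numpy as np

# Hypothetical data for a simple regression Y = a + b*X.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 3.8, 5.2, 5.9, 7.1, 7.8, 9.2])

b, a = np.polyfit(x, y, deg=1)
y_hat = a + b * x

# R2 = 1 - SS_res / SS_tot: the share of variation in Y explained by X.
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

category = ("strong" if r2 > 0.67 else
            "moderate" if r2 > 0.33 else
            "weak" if r2 > 0.19 else
            "below the weak threshold")
print(f"R2 = {r2:.3f} ({category} by Chin's categories)")
```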