
Business Statistics Method

By Farah Nurul Aisyah (4122001020)
Jasmine Alviana Zalzabillah (4122001070)
Definition of Correlation

Correlation is an analytical technique in statistics used to find the relationship between two quantitative variables. For example, the height and age of elementary school students are positively correlated.
Correlation statistics is a method for determining whether or not there is a linear relationship between variables. If there is a relationship, a change in one variable (X) will be accompanied by a change in the other variable (Y).
Types of Correlation

PRODUCT MOMENT CORRELATION

To compute the correlation coefficient between two variables, each of which is measured on an interval scale, the product moment correlation developed by Karl Pearson is used.

There are two kinds of product moment correlation formulas, namely:
- the product moment correlation with the deviation formula, and
- the product moment correlation with the raw-score formula.

Product Moment Correlation with the Deviation Formula
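A standard form of Pearson's deviation formula, where x = X − X̄ and y = Y − Ȳ are the deviations of each score from its mean, is:

r = \frac{\sum xy}{\sqrt{\left(\sum x^{2}\right)\left(\sum y^{2}\right)}}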


1. Example: finding the correlation coefficient between the mathematics scores and the physics scores obtained by ten students, using the deviation formula.

No.     Math   Physics    x       y       x²      y²      x·y
Resp.    X        Y
1        6.5      6.3     0.0    -0.1     0.00    0.01     0.00
2        7.0      6.8    +0.5    +0.4     0.25    0.16    +0.20
3        7.5      7.2    +1.0    +0.8     1.00    0.64    +0.80
4        7.0      6.8    +0.5    +0.4     0.25    0.16    +0.20
5        6.0      7.0    -0.5    +0.6     0.25    0.36    -0.30
6        6.0      6.2    -0.5    -0.2     0.25    0.04    +0.10
7        5.5      5.1    -1.0    -1.3     1.00    1.69    +1.30
8        6.5      6.0     0.0    -0.4     0.00    0.16     0.00
9        7.0      6.5    +0.5    +0.1     0.25    0.01    +0.05
10       6.0      5.9    -0.5    -0.6     0.25    0.36    +0.30
Total   65.0     63.8      -       -      3.50    3.59     2.65
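Substituting the column totals from the table above into the deviation formula gives, approximately:

r = \frac{\sum xy}{\sqrt{\left(\sum x^{2}\right)\left(\sum y^{2}\right)}} = \frac{2.65}{\sqrt{3.50 \times 3.59}} = \frac{2.65}{3.545} \approx 0.75

so the math and physics scores are fairly strongly and positively correlated.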
Example: the same data arranged for the raw-score formula.

No.
Resp.     X        Y       X²        Y²        X·Y
1         6.5      6.3     42.25     39.69     40.95
2         7.0      6.8     49.00     46.24     47.60
3         7.5      7.2     56.25     51.84     54.00
4         7.0      6.8     49.00     46.24     47.60
5         6.0      7.0     36.00     49.00     42.00
6         6.0      6.2     36.00     38.44     37.20
7         5.5      5.1     30.25     26.01     28.05
8         6.5      6.0     42.25     36.00     39.00
9         7.0      6.5     49.00     42.25     45.50
10        6.0      5.9     36.00     34.81     35.40
Total    65.0     63.8    426.00    410.52    417.30
So, the correlation coefficient can also be computed directly from these totals.
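Using the standard raw-score (Pearson) formula with n = 10, the totals above give approximately:

r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^{2} - (\sum X)^{2}\right]\left[n\sum Y^{2} - (\sum Y)^{2}\right]}}
  = \frac{10(417.30) - (65.0)(63.8)}{\sqrt{\left[10(426.00) - 65.0^{2}\right]\left[10(410.52) - 63.8^{2}\right]}}
  = \frac{26}{\sqrt{35 \times 34.76}} \approx 0.75

which matches the value obtained with the deviation formula.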

Product moment correlation is also commonly used to test the validity of items on attitude instruments and other psychological measures whose item scores are treated as having an interval measurement scale.
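As a quick cross-check, here is a minimal Python sketch (not part of the original slides) that computes the same Pearson coefficient for the math and physics scores used in the example above:

```python
import numpy as np

# Math (X) and physics (Y) scores for the 10 respondents in the example table
x = np.array([6.5, 7.0, 7.5, 7.0, 6.0, 6.0, 5.5, 6.5, 7.0, 6.0])
y = np.array([6.3, 6.8, 7.2, 6.8, 7.0, 6.2, 5.1, 6.0, 6.5, 5.9])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is r
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.2f}")  # approximately 0.75
```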
Kinds of Correlation

The correlation between two variables does not always take the form of the value of Y increasing whenever the value of X increases; when it does, the correlation is interpreted as a positive correlation.

Sometimes the relationship is such that when the value of one variable increases, the other decreases; this kind of relationship is interpreted as a negative correlation. Besides positive and negative correlations, there are also cases in which the relationship between the variables is very weak or no correlation is found at all.
1. Positive Correlation

A positive correlation is a relationship between variables X and Y, often described as a cause-and-effect relationship, in which an increase in the value of variable X is followed by an increase in the value of variable Y.

Examples of positive correlation:
- If more fertilizer is applied (X), rice production also increases (Y).
- The taller a child grows (X), the more their weight also increases (Y).
- The larger the area planted with cocoa (X), the greater the cocoa production (Y).
2. Negative Correlation

Whereas a positive correlation means that an increase in the value of X is followed by an increase in the value of Y, a negative correlation works the other way around: if the value of variable X increases, the value of variable Y actually decreases.

Example of negative correlation:
- If the price of goods (X) keeps increasing, the demand for those goods (Y) is likely to decrease.
3. No Correlation

No correlation occurs when the two variables (X and Y) show no linear relationship at all.

Example:
- Hair length (X) and height (Y) have no relationship whatsoever.
4. Perfect Correlation

A perfect correlation occurs when an increase or decrease in variable X is always exactly proportional to the increase or decrease in variable Y. If a point (scatter) diagram is drawn, the points line up along a straight line, with almost no scatter.

The strength of the relationship between the independent variable and the dependent variable is measured by the correlation coefficient. The symbols used are ρ (rho) for the population correlation coefficient and r for the sample correlation coefficient.
The value of the correlation coefficient lies in the range -1 to +1, where:
- a correlation coefficient of 0 (zero) means there is no relationship between the two variables;
- a negative correlation coefficient means the relationship between the two variables is negative, i.e. they are inversely proportional to each other;
- a positive correlation coefficient means the relationship between the two variables is positive, i.e. they are directly proportional to each other.
Example of a Correlation Problem

Example question 1:

Suppose we want to know how strong the relationship is between a person's monthly income and their monthly expenses (consumption). Data were obtained from 6 respondents.

Solution:
X (income):      800 900 700 600 700 800 (thousands)
Y (consumption): 300 300 200 100 200 200 (thousands)

To calculate the correlation coefficient, an auxiliary table of X, Y, X², Y², and X·Y is compiled. Based on that table, the following totals are obtained:
ΣX  = 4,500
ΣY  = 1,300
ΣX² = 3,430,000
ΣY² = 310,000
ΣXY = 1,010,000
n   = 6
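Substituting these totals into the raw-score formula with n = 6 gives approximately:

r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^{2} - (\sum X)^{2}\right]\left[n\sum Y^{2} - (\sum Y)^{2}\right]}}
  = \frac{6(1{,}010{,}000) - (4{,}500)(1{,}300)}{\sqrt{\left[6(3{,}430{,}000) - 4{,}500^{2}\right]\left[6(310{,}000) - 1{,}300^{2}\right]}}
  = \frac{210{,}000}{\sqrt{330{,}000 \times 170{,}000}} \approx 0.89

so income and consumption in this sample are strongly and positively correlated.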
Regression models

Regression models describe the relationship between variables by fitting a line to the observed data. Linear regression models use a straight line, while logistic and nonlinear regression models use a curved line. Regression allows you to estimate how a dependent variable changes as the independent variable(s) change.

Simple linear regression is used to estimate the relationship between two quantitative variables. You can use simple linear regression when you want to know:
- How strong the relationship is between two variables (e.g. the relationship between rainfall and soil erosion).
- The value of the dependent variable at a certain value of the independent variable (e.g. the amount of soil erosion at a certain level of rainfall).
Example of a Regression Model

Example: you are a social researcher interested in the relationship between income and happiness. You survey 500 people whose incomes range from $15k to $75k and ask them to rank their happiness on a scale from 1 to 10. Your independent variable (income) and dependent variable (happiness) are both quantitative, so you can run a regression analysis to see whether there is a linear relationship between them.
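A minimal Python sketch of such an analysis (not part of the original slides; the survey data here are simulated purely for illustration, and numpy's polyfit stands in for whatever software the researcher would actually use):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated survey: 500 incomes between $15k and $75k (in thousands of dollars)
income = rng.uniform(15, 75, size=500)
# Hypothetical happiness scores (1-10) that rise roughly linearly with income, plus noise
happiness = np.clip(2 + 0.08 * income + rng.normal(0, 1, size=500), 1, 10)

# Fit a simple linear regression: happiness = b0 + b1 * income
b1, b0 = np.polyfit(income, happiness, deg=1)
r = np.corrcoef(income, happiness)[0, 1]

print(f"happiness ~ {b0:.2f} + {b1:.3f} * income,  r = {r:.2f}")
```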
Residual analysis

The analysis of residuals plays an important role in validating the regression model. If the error term in the regression model satisfies the four usual assumptions (the errors have zero mean and constant variance, and are independent and normally distributed), then the model is considered valid. Since the statistical tests for significance are also based on these assumptions, the conclusions resulting from these significance tests are called into question if the assumptions regarding ε are not satisfied.
The ith residual is the difference between the observed value of the
dependent variable, yi, and the value predicted by the
estimated regression equation, ŷi. These residuals, computed from the
available data, are treated as estimates of the model error, ε. As such, they
are used by statisticians to validate the assumptions concerning ε. Good
judgment and experience play key roles in residual analysis.
Graphical plots and statistical tests concerning the residuals are examined
carefully by statisticians, and judgments are made based on these
examinations. The most common residual plot shows ŷ on the horizontal
axis and the residuals on the vertical axis. If the assumptions regarding the
error term, ε, are satisfied, the residual plot will consist of a horizontal
band of points. If the residual analysis does not indicate that the model
assumptions are satisfied, it often suggests ways in which the model can
be modified to obtain better results.
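A minimal, self-contained Python sketch of such a residual plot (it re-creates the simulated income-happiness data from the previous sketch; matplotlib is assumed to be available):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Simulated data and fitted line (same setup as the regression sketch above)
income = rng.uniform(15, 75, size=500)
happiness = np.clip(2 + 0.08 * income + rng.normal(0, 1, size=500), 1, 10)
b1, b0 = np.polyfit(income, happiness, deg=1)

# Residuals: observed minus predicted values
y_hat = b0 + b1 * income
residuals = happiness - y_hat

# Residual plot: y_hat on the horizontal axis, residuals on the vertical axis.
# If the error assumptions hold, the points form a roughly horizontal band around zero.
plt.scatter(y_hat, residuals, s=10)
plt.axhline(0, color="red", linewidth=1)
plt.xlabel("Predicted value (y_hat)")
plt.ylabel("Residual")
plt.title("Residual plot")
plt.show()
```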
Estimation Standard Error

The standard error of the estimate, denoted S_YX, is the standard deviation around the regression line; it measures the variability of the actual Y values around the predicted Y values. Although the least-squares (OLS) method produces the line with the minimum amount of variation, the regression equation is not a perfect predictor unless the coefficient of determination r² = 1.

The variability around the regression line can be seen in a scatter plot of the motorcycle sales data together with its fitted regression line: some observed values lie above the line and others lie below it. In this example, the standard error of the estimate is 0.158 units, as reported in the accompanying Excel output.
Example

If we calculate S_YX manually, we can use the following formula.
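A standard form of that formula, where Ŷᵢ is the value of Y predicted by the regression equation and n is the number of observations, is:

S_{YX} = \sqrt{\frac{\sum_{i=1}^{n}\left(Y_{i} - \hat{Y}_{i}\right)^{2}}{n-2}}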
What is Data Interpretation?

Data interpretation is the process of reviewing data through predefined processes that help assign meaning to the data and arrive at a relevant conclusion. It involves taking the results of data analysis, making inferences about the relationships studied, and using them to draw conclusions.

Therefore, before data can be interpreted, it must first be analyzed. What, then, is data analysis? Data analysis is the process of ordering, categorizing, manipulating, and summarizing data to obtain answers to research questions. It is usually the first step taken towards data interpretation.
Example of Data Interpretation

Data interpretation methods are how analysts help people make sense of numerical data that has been collected, analyzed, and presented. For example, when founders are pitching to potential investors, they must interpret data (e.g. market size, growth rate, etc.) so that it can be better understood.
What are Data Interpretation Methods?

Data interpretation methods are how analysts help people make sense of numerical data that has been collected, analyzed, and presented. Data, when collected in raw form, may be difficult for the layman to understand, which is why analysts need to break down the information gathered so that others can make sense of it.

For example, when founders are pitching to potential investors, they must interpret data (e.g. market size, growth rate, etc.) so that it can be better understood. There are two main methods by which this can be done, namely quantitative methods and qualitative methods.
Qualitative Data Interpretation Method

The qualitative data interpretation method is used to analyze qualitative data, which is also known as categorical data. This method uses texts, rather than numbers or patterns, to describe data. Qualitative data is usually gathered using a wide variety of person-to-person techniques, which may make it more difficult to analyze than data from quantitative research methods.
Quantitative Data Interpretation Method

The quantitative data interpretation method is used to analyze quantitative data, which is also known as numerical data. This data type contains numbers and is therefore analyzed with numbers rather than texts. Quantitative data comes in two main types, namely discrete and continuous data. Continuous data is further divided into interval data and ratio data, with all of these types being numeric.

Because quantitative data naturally exists as numbers, analysts do not need to apply a coding technique to it before it is analyzed. The process of analyzing quantitative data involves statistical techniques such as the standard deviation, mean, and median.
Some of the statistical methods used in analyzing quantitative data are highlighted below:
Mean
The mean is the numerical average of a set of data, calculated by dividing the sum of the values by the number of values in the dataset. It is used to estimate characteristics of a large population from a sample of that population.
For example, online job boards in the US use the data collected from a group of registered users to estimate the salary paid to people in a particular profession. The estimate is usually made using the average salary submitted on their platform for each profession.
Standard deviation
This technique measures how well the responses align with, or deviate from, the mean. It describes the degree of consistency within the responses; together with the mean, it provides insight into a data set.
In the job board example above, if the average salary of writers in the US is $20,000 per annum and the standard deviation is large relative to that mean, we can deduce that individual salaries are spread far apart. This raises further questions, such as why the salaries deviate from each other that much.
With this question, we may conclude that the sample contains people with few years of experience (which translates to a lower salary) and people with many years of experience (translating to a higher salary), but few people with mid-level experience, as in the sketch below.
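A small Python sketch of these two measures; the salary figures are invented purely for illustration:

```python
import numpy as np

# Hypothetical annual salaries (USD) submitted by writers on a job board
salaries = np.array([12_000, 14_000, 15_000, 21_000, 24_000, 27_000, 28_000])

mean = salaries.mean()
# ddof=1 gives the sample standard deviation
std_dev = salaries.std(ddof=1)

print(f"Mean salary: ${mean:,.0f}")
print(f"Standard deviation: ${std_dev:,.0f}")
# A standard deviation that is large relative to the mean indicates
# that the salaries are spread far apart from one another.
```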
Frequency distribution
This technique is used to assess the demography of the respondents or the number of times a particular response appears in the research. It is particularly useful for determining the degree of overlap between data points.
Some other interpretation processes for quantitative data include:
- Regression analysis
- Cohort analysis
- Predictive and prescriptive analysis
Thank you 
