
SPSS Exam

There are two main types of hypotheses discussed in the document: 1) The null hypothesis, which assumes that an event will not occur or that there is no difference. It is represented by H0. 2) The alternate hypothesis, which is the logical opposite of the null hypothesis. It is only considered if the null hypothesis is rejected. It is represented by H1. As an example, the hypotheses for determining if a coin is fair would be: the null hypothesis is that the probability of heads and tails is equal, while the alternate hypothesis is that the probability of heads and tails is not equal.

Uploaded by

Sourabh

A hypothesis is an educated guess about something in the
dataset or world. It should be testable either by experiment
or observation. Hypothesis testing is a type of statistical
analysis in which you put your assumptions about a
population parameter to the test.

Two types of hypothesis:

Null Hypothesis: It is the assumption that the event will not
occur, or that there is no effect or difference. It is treated as
the default position and stands unless the data give enough
evidence to reject it. We use H0 to represent it.

Alternate Hypothesis: It is the logical opposite of the null
hypothesis. We consider it only when we reject the null
hypothesis. We use H1 to represent it.

Eg. Test Case: To determine whether the coin is fair or not

Null Hypothesis: The probability of heads and tails is
equal.
Alternate Hypothesis: The probability of heads and tails is
not equal.
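
The coin test case can be sketched in Python as an exact two-sided binomial test. This is an illustration only, not an SPSS procedure: the function name, the 60-heads-in-100-flips data and the 0.05 significance level are assumptions chosen for the example.

```python
from math import comb


def binomial_two_sided_p(heads: int, flips: int, p: float = 0.5) -> float:
    """Exact two-sided p-value for H0: P(heads) = p."""
    # Probability of every possible number of heads if H0 is true.
    probs = [comb(flips, k) * p**k * (1 - p) ** (flips - k) for k in range(flips + 1)]
    observed = probs[heads]
    # Add up all outcomes that are at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(pr for pr in probs if pr <= observed * (1 + 1e-9))


# Hypothetical experiment: 60 heads in 100 flips.
p_value = binomial_two_sided_p(heads=60, flips=100)
alpha = 0.05  # assumed significance level
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0 (fair coin): {reject_h0}")
```

With these assumed numbers the p-value is slightly above 0.05, so H0 (the coin is fair) is not rejected.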


(1) Normality of Data?

Normality is the property of data which is
distributed according to the Normal Distribution. The Normal
Distribution is a theoretical distribution that is fully
described by its mean and standard deviation. It gives a
bell-shaped and symmetric graph and has one peak
only. The bell shape is the same for every normal
distribution; only its center (mean) and spread
(standard deviation) change. The normal distribution is popular
because it approximately describes many real-life situations, such
as the distribution of people's heights and weights.
We usually use a significance test and its p-value to check
whether the data are normally distributed or
not. A normal distribution is perfectly symmetrical
around its center. Normal distributions are
continuous and have tails that are asymptotic, which
means that they approach but never touch the x-axis.
(2) Types of Data and their Differentiation.
Types of data

1 Nominal data

2 Ordinal data

3 Interval data

4 Ratio data

1. Nominal data: Nominal data is “labeled” or “named” data which can be
divided into various groups that do not overlap. Data is not measured or
evaluated in this case; it is just assigned to multiple groups. These groups
are unique and have no common elements.

Example: town of residence, colour of car, male or female

2. Ordinal data: Ordinal data is classified into categories within
a variable that have a natural rank order. However, the distances between
the categories are uneven or unknown.

For example, the variable “frequency of physical exercise” can be
categorized into the following: 1. Never 2. Rarely 3. Sometimes 4. Often 5.
Always

There is a clear order to these categories, but we cannot say that the
difference between “never” and “rarely” is exactly the same as that
between “sometimes” and “often”. Therefore, this scale is ordinal.

3. Interval data: Interval data is defined as a data
type which is measured along a scale, in which each point is placed at
equal distance from one another. Interval data always appears in the form
of numbers or numerical values where the distance between the two
points is standardized and equal. Unlike ratio data, an interval scale
has no true zero, so ratios between values are not meaningful.
For example: Temperature measured in Centigrade; a cup of coffee at
80°C isn't twice as hot as one at 40°C.

4. Ratio data: Ratio data is defined as quantitative data, having the same
properties as interval data, with equal and meaningful ratios between
values and an absolute “zero” treated as the point of origin. In other
words, there can be no negative numerical value in ratio data.

For example: Heights, Weights, Salaries, Ages. If someone is twice as
heavy as someone else in pounds, this will still be true in kilograms.
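
The difference between interval and ratio scales in the two examples above can be checked with a little arithmetic. The following Python sketch is illustrative only; the helper names are made up for this example.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Shift to a temperature scale that has a true zero."""
    return celsius + 273.15


def pounds_to_kg(pounds: float) -> float:
    """Pure rescaling: zero pounds is still zero kilograms."""
    return pounds * 0.45359237


# Interval scale: the ratio depends on where the (arbitrary) zero sits.
ratio_celsius = 80 / 40
ratio_kelvin = celsius_to_kelvin(80) / celsius_to_kelvin(40)

# Ratio scale: the ratio survives a change of units.
ratio_pounds = 200 / 100
ratio_kg = pounds_to_kg(200) / pounds_to_kg(100)

print(f"Celsius ratio: {ratio_celsius:.2f}, Kelvin ratio: {ratio_kelvin:.2f}")
print(f"Pounds ratio:  {ratio_pounds:.2f}, Kilogram ratio: {ratio_kg:.2f}")
```

The Celsius ratio of 2 shrinks to roughly 1.13 in Kelvin, whereas the weight ratio stays at 2 in both units.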


(3) How can you create a new variable on SPSS based on
an existing variable.

To insert a new variable into a dataset:

In the Data View window, click the name of the column to the
right of where you want your new variable to be inserted.

You can now insert a variable in several ways:

• Click Edit > Insert Variable;
• Right-click an existing variable name and click Insert
Variable; or
• Click the Insert Variable icon.

A new, blank column will appear to the left of the column
or cell you selected.

New variables will be given a generic name (e.g.
VAR00001). You can enter a new name for the variable on
the Variable View tab.

To base the new variable on an existing one, use Transform >
Compute Variable: enter a name for the target variable and an
expression built from the existing variable(s).
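
The idea of deriving a new variable from an existing one can be sketched outside SPSS in Python. This is not SPSS syntax; the `age` data, the `age_group` name and the cut-offs are hypothetical and chosen only to illustrate the logic.

```python
# Each dictionary plays the role of one case (row) in the dataset.
cases = [
    {"id": 1, "age": 23},
    {"id": 2, "age": 41},
    {"id": 3, "age": 67},
]


def age_group(age: int) -> int:
    """Recode age into an assumed 3-level code: 1 = under 30, 2 = 30-59, 3 = 60+."""
    if age < 30:
        return 1
    if age < 60:
        return 2
    return 3


# Add the new variable to every case, computed from the existing one.
for case in cases:
    case["age_group"] = age_group(case["age"])

print(cases)
```

The original `age` values are left untouched; the new column simply sits alongside them.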

(4) Parametric & Non-Parametric Data and their
Differentiation.

(5) What are Outliers and how can you remove them?

An outlier is an observation that lies an abnormal distance from other
values in a random sample from a population. Outliers are unusual values in a
dataset. They are problematic for many statistical analyses because
they can cause tests to either miss significant findings or distort real
results. There are no strict rules for classifying outliers; it usually depends
on understanding of the data and the subject area.

Sort your dataset and look for unusually high or low values; these are
candidate outliers.

We can also plot the data to find outliers; boxplots, histograms and
scatterplots are some commonly used methods.

We can use the Z-score method to check and remove outliers. Z-scores can
quantify the unusualness of an observation when your data follow the
normal distribution. A Z-score is the number of standard deviations
above or below the mean that a value falls.

We can use the IQR method to define fences for the data. Values below
Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR lie outside the fences and are
flagged as outliers. This method does not assume that the data are
normally distributed.

You can use hypothesis tests, such as Grubbs' test, to find outliers.

We can use any of the above-mentioned methods depending on the type
and quantity of data.
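
The Z-score and IQR methods described above can be sketched in Python with the standard library. The data, the Z-score threshold of 3 and the fence multiplier of 1.5 are assumptions for illustration; the quartiles use the default method of `statistics.quantiles`, which may differ slightly from the quartiles SPSS reports.

```python
from statistics import mean, quantiles, stdev

# Hypothetical sample with one suspicious value.
data = [12, 13, 12, 14, 13, 15, 14, 13, 12, 14, 13, 95]


def z_score_outliers(values, threshold=3.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) / s > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


flagged = set(z_score_outliers(data)) | set(iqr_outliers(data))
cleaned = [v for v in data if v not in flagged]  # "removing" the outliers

print("Z-score outliers:", z_score_outliers(data))
print("IQR outliers:    ", iqr_outliers(data))
print("Cleaned data:    ", cleaned)
```

In small samples a single extreme value inflates the standard deviation, so the Z-score method can miss outliers that the IQR fences still catch.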

(6) 3 Methods to check whether Data is Normally
Distributed or not.

Ways to check normality:

1. Shapiro-Wilk Test:
The Shapiro-Wilk test is one of the most powerful tests for a
normal distribution.
• If the p-value of the Shapiro-Wilk test is larger than 0.05, we
do not reject normality and assume a normal distribution.
• If the p-value of the Shapiro-Wilk test is smaller than 0.05, we
reject normality and do not assume a normal distribution.


2. Boxplot:
Here two tests for normality are run. For datasets smaller than 2000
elements, we use the Shapiro-Wilk test; otherwise, the Kolmogorov-
Smirnov test is used. In our case, since we have only 20 elements,
the Shapiro-Wilk test is used.

3. We can also use a histogram. If the variable is normally distributed,
the histogram should take on a “bell” shape with more values
located near the center and fewer values located out on the tails. It
gives us a quick way to visualize the distribution of a variable and
gives us a rough idea.

4. We can calculate the mean, median and mode. If they are equal
or almost equal, the data are consistent with a normal
distribution. This only indicates symmetry, so it is a rough check
rather than proof of normality.
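
The informal mean/median comparison can be sketched in Python using only the standard library. The two samples are made up, and the skewness helper is an extra assumption added for illustration. A formal test such as Shapiro-Wilk needs a statistics package and is not reproduced here.

```python
from statistics import mean, median, pstdev


def skewness(values):
    """Population skewness: close to 0 for symmetric data, positive for a long right tail."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [2, 3, 3, 4, 4, 4, 5, 5, 6]
skewed = [1, 1, 1, 2, 2, 3, 4, 8, 15]

for name, sample in [("symmetric", symmetric), ("skewed", skewed)]:
    print(
        f"{name}: mean={mean(sample):.2f}, "
        f"median={median(sample):.2f}, skewness={skewness(sample):.2f}"
    )
```

For the symmetric sample the mean and median coincide and the skewness is about zero; for the skewed sample the mean is pulled well above the median.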

(8) What is Data Analysis with examples.
DATA ANALYSIS

Data analysis is the process of extracting useful information by inspecting, cleansing,
transforming, and modeling the dataset. The methodologies involved can be
categorized as Descriptive Analysis (it summarizes the data numerically),
Exploratory Analysis (it explores the data visually), Predictive Analysis
(it uses historical data to predict future outcomes) and Inferential
Analysis (it draws conclusions about the population from information
obtained from a sample).

Types of Data Analysis

Based on the methodologies used, data analysis can be divided into the following
four parts:

· Descriptive Analysis

· Exploratory Data Analysis

· Predictive Analysis

· Inferential Analysis

A simple example of data analysis can be seen whenever we make a decision in our
daily lives by evaluating what has happened in the past or what will happen if we
make that decision. Basically, this is the process of analyzing the past or future and
making a decision based on that analysis.



(9) What are the Diff. Steps to Download SPSS pre. Version.

To download and install IBM SPSS Statistics Subscription, go to the IBM
Marketplace and then:

1. Sign in with your IBM account (also known as IBMid). You must register for an
IBMid if you do not already have an active IBM account.

2. After you log in, your account profile provides a Product and
Services section that displays all of the IBM products and services to which
you are entitled.

3. Click Download next to IBM SPSS Statistics Subscription.

4. On the Product and Services page, click the Download link underneath IBM
SPSS Statistics Subscription.

5. Click Save File if prompted.

6. On Microsoft Windows machines, navigate to the saved file location, right-click
the file, and select Run as administrator from the menu.
On Mac OS, you must double-click the installer file after you mount the
disk image.

7. Follow the installation steps (including accepting the license agreement) until
the product installation is complete.

8. IBM SPSS Statistics Subscription is now ready for use.

The way you lay out your data in SPSS will depend upon the kind of data you have and the
analysis you propose to carry out. However, there are some basic principles that apply in all
situations.

(10) What is Dependent and Independent variable
with Examples.

Independent Variable:
As the name suggests, an independent variable (IV) stands alone. Its value does not
change due to the impact of any other variable in the study. An independent variable
may be manipulated by the researcher or may already exist, like age. In statistics,
you estimate the extent to which a change in an independent variable
can explain or predict changes in the dependent variable.

Dependent Variable:
A dependent variable is the variable that changes as a result of the independent
variable manipulation. It’s the outcome you’re interested in measuring, and it
“depends” on your independent variable. You can also predict how much your
dependent variable will change as a result of variation in the independent variable.
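
As an example, hours studied could be the independent variable and exam score the dependent variable. The Python sketch below fits a least-squares line to made-up data to show how change in the dependent variable is predicted from the independent variable; all names and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical data: hours studied (independent) and exam score (dependent).
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 68]


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return slope, intercept


slope, intercept = fit_line(hours_studied, exam_score)
predicted_for_6_hours = intercept + slope * 6

print(f"Each extra hour predicts {slope:.1f} more points.")
print(f"Predicted score after 6 hours: {predicted_for_6_hours:.1f}")
```

The slope is the predicted change in the dependent variable for a one-unit change in the independent variable.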

(12) How can you Structure your Data?

1 SPSS expects you to put each case on a row. Usually this means that
each research subject will have a row to themselves.

2 Categorical variables are best represented by numbers even if they
are not ordered categories; they can then be ascribed a text label using
the "Value Labels" option.

3 The variable name that appears at the top of the column in SPSS is
limited in length and in the characters it will hold. The variable label can
hold a more meaningful description of the variable and will be used
on output (graphs etc.) if you fill it in.

4 If you have two (or more) groups of subjects, each subject will still
have a row to themselves; however, you will need to dedicate a variable
(column) to let the system know which group each subject belongs to.
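
The layout described in these points (one row per case, numeric category codes with text labels, and a grouping column) can be sketched in Python. This is an illustration outside SPSS; the variable names, codes, labels and scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Text labels attached to numeric category codes.
value_labels = {"group": {1: "Control", 2: "Treatment"}}

# One row per subject; the "group" column says which group each subject belongs to.
cases = [
    {"subject": 1, "group": 1, "score": 12},
    {"subject": 2, "group": 1, "score": 13},
    {"subject": 3, "group": 2, "score": 15},
    {"subject": 4, "group": 2, "score": 17},
]

# Use the grouping column to split the scores, reporting labels instead of codes.
scores_by_group = defaultdict(list)
for case in cases:
    label = value_labels["group"][case["group"]]
    scores_by_group[label].append(case["score"])

group_means = {label: mean(scores) for label, scores in scores_by_group.items()}
print(group_means)
```

Because the group is stored as a column rather than as separate tables, any per-group analysis only needs to read that one variable.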



(13) Order to Tackle your Data.



Discriminant analysis is a versatile statistical method often used
by market researchers to classify observations into two or more
groups or categories. In other words, discriminant analysis is
used to assign objects to one group among a number of known
groups.

For example, a doctor could perform a discriminant analysis to
identify patients at high or low risk for stroke. The analysis might
classify patients into high- or low-risk groups, based on personal
attributes (e.g., cholesterol level, body mass) and/or lifestyle
behaviors (e.g., minutes of exercise per week, packs of cigarettes
per day).

As another example, a large international air carrier has
collected data on employees in three different job classifications:
1) customer service personnel, 2) mechanics and 3) dispatchers.
The director of Human Resources wants to know if these three
job classifications appeal to different personality types. Each
employee is administered a battery of psychological tests which
includes measures of interest in outdoor activity, sociability and
conservativeness.