Unit 4 - Data Preparation


29/07/2023

R.NIRMAL KUMAR. BE., M.B.A


Assistant Professor,
Department of Management Studies,
Meenakshi College of Engineering


UNIT – IV DATA PREPARATION AND ANALYSIS

• Data preparation – Editing – Coding – Data entry – Validity of
data – Qualitative vs. quantitative data analyses – Bivariate
and multivariate statistical techniques – Factor analysis –
Discriminant analysis – Cluster analysis – Multiple regression
and correlation – Multidimensional scaling – Application of
statistical software for data analysis.

DATA PREPARATION AND ANALYSIS

• Once the data has been collected, the process of analysis begins.
• First, however, the data has to be translated into an appropriate form.
• This process is known as data preparation.

• RAW DATA → REQUIRED DATA


• After data collection, the researcher must prepare the
data to be analyzed.
• Organizing the data correctly can save a lot of time and
prevent mistakes.
• Most researchers choose to use a database or statistical
analysis program (e.g. Microsoft Excel, SPSS) that they
can format to fit their needs and organize their data
effectively.

• Once the data has been entered, it is crucial that the researcher
checks the data for accuracy.
• This can be accomplished by spot-checking a random sample of
participants' records, but this method is not as effective
as entering the data a second time and searching for
discrepancies between the two versions.
• Double entry is particularly easy to check with numerical
data because the researcher can simply use the database program
to sum the columns of each version and then look for differences
in the totals.
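
The double-entry check described above can be sketched in Python. This is only a minimal illustration, assuming each entry pass is held as a list of numeric rows; the function names and the sample figures are hypothetical and not part of the original slides.

```python
def column_sums(rows):
    """Return the sum of each column in a list of equal-length numeric rows."""
    return [sum(col) for col in zip(*rows)]


def column_sum_differences(first_entry, second_entry):
    """Compare the column totals of two independent entry passes.

    Returns the indexes of the columns whose totals differ, which tells
    the researcher where to look for keying errors.
    """
    sums_a = column_sums(first_entry)
    sums_b = column_sums(second_entry)
    return [i for i, (a, b) in enumerate(zip(sums_a, sums_b)) if a != b]


# Hypothetical data: three respondents, three numeric items each.
first_pass = [[4, 2, 5], [3, 3, 1], [5, 4, 2]]
second_pass = [[4, 2, 5], [3, 8, 1], [5, 4, 2]]  # second item mistyped once

mismatched = column_sum_differences(first_pass, second_pass)
print(mismatched)
```

Equal totals do not prove that the two versions are identical, since two errors in the same column can cancel out, so a cell-by-cell comparison remains the stricter check.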


• Data preparation includes
▫ editing,
▫ coding and
▫ data entry

• It is the activity that ensures the accuracy of the data and their
conversion from raw form to reduced and classified forms that
are more appropriate for analysis.

• Preparing a descriptive statistical summary is another preliminary step
leading to an understanding of the collected data.
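
A descriptive statistical summary of one variable might be produced as follows. This is a sketch using Python's standard `statistics` module; the `describe` helper and the sample responses are assumptions made for illustration.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Hypothetical responses to one questionnaire item on a 1-5 scale.
responses = [1, 2, 3, 4, 5]
summary = describe(responses)
print(summary)
```

In practice the same summary would be run for every coded variable, and out-of-range minimum or maximum values are a quick sign of entry errors.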


EDITING
• The process of checking and adjusting responses in the
completed questionnaires for omissions, legibility, and
consistency and readying them for coding and storage.
• It detects errors and omissions, corrects them when possible, and
certifies that minimum data quality standards are achieved.
• It ensures
▫ Completeness
▫ Accuracy and
▫ Uniformity


TYPES OF EDITING
1. FIELD EDITING
▫ Preliminary editing is done on the same day by a field
supervisor.
▫ Respondents can be contacted immediately in the field if there is
any query.

2. IN-HOUSE (CENTRAL) EDITING
▫ Editing is done by central office staff.
▫ Respondents can be contacted later for clarification.


CODING
• Data coding involves assigning a number to each of the participants'
responses so that they can be entered into a database.
• In coding, categories are the partitions of the data set for a given
variable.
• E.g., if the variable is gender, the categories are male and
female.
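
Coding can be pictured as a lookup from each category to a number. The sketch below assumes a hypothetical codebook (1 = male, 2 = female) and a hypothetical code of 9 for missing answers; none of these values come from the slides.

```python
# Hypothetical codebook: each category of the variable gets a numeric code.
GENDER_CODES = {"male": 1, "female": 2}
MISSING_CODE = 9  # assumed convention for blank or unreadable answers


def code_response(answer, codebook, missing=MISSING_CODE):
    """Translate one raw answer into its numeric code."""
    cleaned = answer.strip().lower()
    return codebook.get(cleaned, missing)


raw_answers = ["Male", "female", " FEMALE ", ""]
coded = [code_response(a, GENDER_CODES) for a in raw_answers]
print(coded)
```

Keeping the codebook in one place means the same mapping is applied to every questionnaire, which supports the uniformity that editing is meant to ensure.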


QUALITATIVE AND QUANTITATIVE DATA ANALYSIS


Multivariate Analysis Techniques


• There are three categories of analysis to be aware of:
▫ Univariate analysis, which looks at just one variable
▫ Bivariate analysis, which analyzes two variables
▫ Multivariate analysis, which looks at more than two
variables


• Multivariate analysis (MVA) is a statistical procedure for the
analysis of data involving more than one type of
measurement or observation.

• It may also mean solving problems where more than one
dependent variable is analyzed simultaneously with
other variables.


• All statistical techniques which simultaneously analyse more
than two variables on a sample of observations can be
categorized as multivariate techniques.

• We may as well use the term ‘multivariate analysis’, which is a
collection of methods for analyzing data in which a number of
observations are available for each object. In the analysis of
many problems, it is helpful to have a number of scores for
each object.


An example of multivariate analysis


• Let’s imagine you’re interested in the relationship
between a person’s social media habits and their
self-esteem. You could carry out a bivariate analysis,
comparing the following two variables:
▫ How many hours a day a person spends on Instagram
▫ Their self-esteem score (measured using a self-esteem
scale)


• You may or may not find a relationship between the two
variables; however, you know that, in reality, self-esteem
is a complex concept. It’s likely impacted by many
different factors, not just how many hours a person
spends on Instagram.

• You might also want to consider factors such as age,
employment status, how often a person exercises, and
relationship status (for example).

• In order to deduce the extent to which each of these
variables correlates with self-esteem, and with each
other, you’d need to run a multivariate analysis.
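
One common starting point for such an analysis is a matrix of pairwise correlations between all the variables. The sketch below computes Pearson correlations in plain Python; the variable names and the five made-up observations are purely illustrative, and a correlation matrix is only a first step rather than a complete multivariate analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(data):
    """Pairwise correlations for a dict mapping variable name -> values."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


# Made-up values for five respondents, for illustration only.
data = {
    "instagram_hours": [1, 2, 3, 4, 5],
    "exercise_per_week": [5, 4, 3, 2, 1],
    "self_esteem": [30, 28, 25, 21, 20],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

Each row shows how one variable moves with every other variable, which is exactly the information a series of separate univariate analyses would miss.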


Multivariate Analysis Techniques


• The score on each test is one variable, Xi, and there are
several, k, of such scores for each object, represented as X1,
X2, ..., Xk.

• Most research studies involve more than two variables. In such
situations we may want to analyse the association
between one (at times many) criterion variable and several
independent variables, or we may be required to study the
association between variables having no dependency
relationships.

• All such analyses are termed multivariate analyses or
multivariate techniques.

REASONS FOR GROWTH OF MULTIVARIATE TECHNIQUES
• The main reason is that a series of univariate
analyses carried out separately for each variable may, at
times, lead to incorrect interpretation of the results.

• This is so because univariate analysis does not
consider the correlation or inter-dependence
among the variables.

• As a result, during the last fifty years, a number of
statisticians have contributed to the development of
several multivariate techniques. Today, these techniques
are being applied in many fields such as
economics, sociology, psychology, agriculture,
anthropology, biology and medicine.

• These techniques are used in analyzing social,
psychological, medical and economic data, especially
when the variables concerning research studies of these
fields are supposed to be correlated with each other and
when rigorous probabilistic models cannot be
appropriately used.

APPLICATIONS OF MULTIVARIATE TECHNIQUES


• For example, take the case of a college entrance examination
in which a number of tests are administered to candidates, and
the candidates scoring high total marks across many
subjects are admitted.

• This system, though apparently fair, may at times be biased in
favour of the subjects with larger standard deviations, because
widely spread marks contribute more to differences in the total.

• Multivariate techniques may be appropriately used in such
situations for developing norms as to who should be admitted
to college.
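
The bias toward subjects with larger standard deviations can be seen in a small numerical sketch. The code below uses z-score standardization, which is a simple remedy chosen here for illustration and not the multivariate technique the slide has in mind; the marks are hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]


# Hypothetical marks for three candidates in two subjects.
# Maths marks are widely spread; language marks are tightly bunched.
maths = [40, 60, 80]
language = [72, 70, 68]

raw_totals = [m + l for m, l in zip(maths, language)]
std_totals = [zm + zl for zm, zl in zip(z_scores(maths), z_scores(language))]

print(raw_totals)  # the ranking follows the maths marks
print(std_totals)  # after standardizing, both subjects weigh equally
```

With raw totals the candidate who is best at maths always comes first, even though the language ranking is exactly reversed; once both subjects are put on the same scale the three candidates are tied.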


• Many medical examinations such as blood pressure
and cholesterol tests are administered to patients.

• Each of the results of such examinations has
significance of its own, but it is also important to
consider relationships between different test results,
or results of the same tests on different occasions, in
order to draw proper diagnostic conclusions and to
determine an appropriate therapy.

CLASSIFICATION OF MULTIVARIATE TECHNIQUES


• Multivariate techniques can be conveniently classified into
two broad categories, viz.,
I. dependence methods and
II. interdependence methods.

• This sort of classification depends upon the question:
are some of the involved variables dependent upon
others?

• If the answer is ‘yes’, we have dependence methods;
but in case the answer is ‘no’, we have
interdependence methods.

Multivariate analysis techniques: Dependence vs. interdependence
• Dependence methods
• Dependence methods are used when one or some of the
variables are dependent on others. Dependence looks
at cause and effect; in other words, can the values of
two or more independent variables be used to explain,
describe, or predict the value of another, dependent
variable?

• To give a simple example, the dependent variable of
“weight” might be predicted by independent variables
such as “height” and “age.”

• Interdependence methods
• Interdependence methods are used to understand the
structural makeup and underlying patterns within a
dataset. In this case, no variables are dependent on
others, so you’re not looking for causal relationships.

• Rather, interdependence methods seek to give meaning
to a set of variables or to group them together in
meaningful ways.


CLASSIFICATION OF MULTIVARIATE TECHNIQUES


• Two more questions are relevant for understanding the
nature of multivariate techniques.

• Firstly, in case some variables are dependent, the
question is: how many variables are dependent?

• The other question is whether the data are metric
or non-metric. This means whether the
data are quantitative, collected on an interval or ratio scale, or
whether the data are qualitative, collected on a nominal or
ordinal scale.


1. MULTIPLE REGRESSION
• In multiple regression we form a linear composite of
explanatory variables in such a way that it has maximum
correlation with a criterion variable.

• This technique is appropriate when the researcher has a single,
metric criterion variable which is supposed to be a function
of other explanatory variables.

• The main objective in using this technique is to predict the
variability of the dependent variable based on its covariance
with all the independent variables.

• One can predict the level of the dependent
phenomenon through a multiple regression
analysis model, given the levels of the independent
variables.

• Given a dependent variable, the linear multiple
regression problem is to estimate constants B1,
B2, ..., Bk and A such that the expression

▫ Y = B1X1 + B2X2 + ... + BkXk + A

provides a good estimate of an individual’s Y
score based on his X scores.
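
The constants B1, ..., Bk and A are usually estimated by least squares. The following is a minimal sketch in plain Python that solves the normal equations for a small data set; the function names and the data are hypothetical, and statistical software such as SPSS would normally do this step.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_multiple_regression(xs, ys):
    """Least-squares estimates for Y = B1*X1 + ... + Bk*Xk + A.

    xs is a list of rows [X1, ..., Xk]; returns ([B1, ..., Bk], A).
    """
    design = [row + [1.0] for row in xs]  # the last column carries A
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    coeffs = solve(xtx, xty)
    return coeffs[:-1], coeffs[-1]


# Hypothetical data generated exactly from Y = 2*X1 + 3*X2 + 5.
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
ys = [2 * x1 + 3 * x2 + 5 for x1, x2 in xs]

betas, intercept = fit_multiple_regression(xs, ys)
print([round(b, 6) for b in betas], round(intercept, 6))
```

Because the sample data contain no noise, the fit should recover the generating constants up to rounding error; with real survey data the estimates only approximate the relationship.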

• Sometimes the researcher may use step-wise regression
techniques to have a better idea of the independent
contribution of each explanatory variable.

• Under these techniques, the investigator adds the independent
contribution of each explanatory variable into the prediction
equation one by one, computing betas and R² at each step.

• Formal computerized techniques are available for the purpose
and the same can be used in the context of a particular
problem being studied by the researcher.

2. MULTIPLE DISCRIMINANT ANALYSIS


• Through the discriminant analysis technique, the researcher
may classify individuals or objects into one of two or
more mutually exclusive and exhaustive groups on
the basis of a set of independent variables.

• Discriminant analysis requires interval independent
variables and a nominal dependent variable.

• For example, suppose that brand preference (say, brand X or Y)
is the dependent variable of interest and its
relationship to an individual’s income, age,
education, etc. is being investigated; then we
should use the technique of discriminant
analysis.


• There happens to be a simple scoring system that assigns a
score to each individual or object. This score is a weighted
average of the individual’s numerical values on the
independent variables. On the basis of this score, the
individual is assigned to the ‘most likely’ category.
• For example, an individual is 20 years old, has an annual income
of Rs 12,000, and has 10 years of formal education. Let b1, b2,
and b3 be the weights attached to the independent variables of
age, income and education respectively.
• The individual’s score (z), assuming a linear score, would be:
▫ z = b1 (20) + b2 (12000) + b3 (10)
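
The scoring rule can be sketched in a few lines of Python. The weights and the cutoff below are hypothetical placeholders; in a real study they would be estimated from the data by the discriminant analysis itself, and the two group labels are assumed for illustration.

```python
def discriminant_score(values, weights):
    """Linear score z = b1*x1 + b2*x2 + ... for one individual."""
    return sum(b * x for b, x in zip(weights, values))


def classify(score, cutoff, low_group="brand X", high_group="brand Y"):
    """Assign the individual to a group by comparing z with a cutoff score."""
    return high_group if score >= cutoff else low_group


# Individual from the example: age 20, income Rs 12,000, 10 years of education.
individual = [20, 12000, 10]

# Hypothetical weights b1, b2, b3 and cutoff; real values come from estimation.
weights = [0.5, 0.001, 1.0]
cutoff = 30.0

z = discriminant_score(individual, weights)
print(z, classify(z, cutoff))
```

A single cutoff is enough for two groups; with more than two groups, each individual would instead be compared against the typical score of every group.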


FACTOR ANALYSIS

• Factor analysis is the generic name given to a class of
techniques whose purpose is data reduction and
summarization.

• Factor analysis is a statistical method used to describe
variability among observed, correlated variables in terms of a
potentially lower number of unobserved variables called
factors.


Basic assumptions
• Scaled data (e.g. 1-5 or 1-7 rating scales)
• Ratio of variables to respondents of at least 1:4
(preferably 1:10)
• Sample must be homogeneous
• Interdependence technique: no distinction
between dependent and independent variables.

MULTIDIMENSIONAL SCALING (MDS)
• Multidimensional scaling is a class of procedures for
representing perceptions and preferences of respondents
spatially by means of a visual display.

• It is also referred to as perceptual mapping: the perceptions
of individuals are mapped in space. From the map we can identify
what they feel about the various brands that the marketer
wants to know about.

• It is mostly used in the field of marketing
▫ To identify the image of the product
▫ To position the product in the customer's mind
▫ Market segmentation
▫ To identify the gap
▫ New product development
Dimensions
• Price
• Safety
• Design
• Comfort
• Mileage

STATISTICS AND TERMS ASSOCIATED WITH MDS
• Similarity judgements:
▫ Similarity judgements are ratings on all possible pairs of
brands or other stimuli using a Likert scale. We can use a
correlation value for comparing each pair of brands:
a value of 1 indicates higher similarity and
a value nearer to 0 indicates lower similarity.
• Preference rankings:
▫ Preference rankings are rank orderings of the brands or
other stimuli from the most preferred to the least preferred.
They are normally obtained from the respondents.
• Stress:
▫ Stress = (original distance do − perceived distance dp) /
average distance
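
The stress definition above is a simplified one. A widely used formal version is Kruskal's stress-1, the square root of the summed squared differences between the target and fitted distances divided by the summed squared fitted distances. The sketch below implements that formula, not the slide's simplified ratio, and assumes the original dissimilarities are used directly as the target distances (as in metric MDS); the sample distances are hypothetical.

```python
import math


def kruskal_stress(original, fitted):
    """Kruskal's stress-1 for two equal-length lists of pairwise distances.

    original: dissimilarities derived from the respondents' judgements
    fitted:   distances between the same pairs on the perceptual map
    """
    numerator = sum((o - f) ** 2 for o, f in zip(original, fitted))
    denominator = sum(f ** 2 for f in fitted)
    return math.sqrt(numerator / denominator)


# Hypothetical distances for three brand pairs.
judged = [3.0, 4.0, 5.0]
on_map = [3.0, 4.0, 6.0]

print(round(kruskal_stress(judged, on_map), 3))
print(kruskal_stress(judged, judged))
```

A value of zero means the map reproduces the judged distances exactly; larger values indicate a poorer fit, which is one input to the decision on the number of dimensions.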

PROCEDURE FOR CONDUCTING MDS


• Formulate the problem
• Obtain input data
• Select an MDS procedure
• Decide on the number of dimensions
• Label the Dimensions and interpret the
configuration.
• Assess Reliability and Validity


CONJOINT ANALYSIS
• Technique developed in the early 1970s
• It measures how buyers value the components of
a product/service bundle
• Customers choose a product on the
combination of many attributes

• It is a tool/technique highly utilized by marketers
• The marketer identifies the possible combinations of
several attributes and then determines which combination of
attributes has the highest value from the customer’s point of
view
• E.g., Coca-Cola in 3 different packs: how does a customer
make a choice? Customers choose the product on the
combination of many attributes
▫ Price and quality
▫ Quality and design
▫ Taste and price
▫ Volume and price
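
Identifying the possible combinations of attributes is the first mechanical step of a conjoint study. The sketch below enumerates every combination (a full factorial design) with Python's `itertools.product`; the attributes and their levels are hypothetical and chosen only to echo the soft-drink example.

```python
from itertools import product

# Hypothetical attributes and levels for a soft-drink pack.
attributes = {
    "volume": ["250 ml", "500 ml", "1 l"],
    "price": ["low", "high"],
    "taste": ["regular", "diet"],
}


def full_factorial(attrs):
    """List every combination of attribute levels as a product profile."""
    names = list(attrs)
    return [dict(zip(names, levels)) for levels in product(*attrs.values())]


profiles = full_factorial(attributes)
print(len(profiles))
print(profiles[0])
```

Respondents would then rate or rank these profiles, and the analysis estimates how much each attribute level contributes to the overall preference. With many attributes the number of profiles grows quickly, so studies often present only a subset.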


CLUSTER ANALYSIS
• Cluster analysis identifies homogeneous
subgroups of study objects or participants
and then studies the data by these subgroups.
• Definition
▫ Cluster analysis is a multivariate data mining
technique whose goal is to group objects (e.g.,
products, respondents, or other entities) based
on a set of user-selected characteristics or
attributes.
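
One widely used clustering method is k-means, shown here as a minimal sketch for two characteristics per object. The slides do not name a specific algorithm, so k-means is an illustrative choice; the points and the starting centroids are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Group 2-D points around the given starting centroids (Lloyd's algorithm)."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids


# Hypothetical respondents described by two ratings each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centres = kmeans(points, centroids=[(0, 0), (10, 10)])
print(labels)
print(centres)
```

The number of clusters and the starting centroids are chosen by the analyst, and different choices can give different groupings, so results are usually checked against several starting points.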


Cluster analysis-Example
