
Chapter 4

Research Strategies and Methods


PRIMARY vs SECONDARY DATA
1) Primary data:
• Data collected directly by the researcher; generally more reliable than secondary data.
• There are four basic types of primary data, distinguished by the way they are collected:
A) Measurement – collections of numbers indicating amounts, e.g. voting polls, exam results, car mileages, oven temperatures etc.
B) Observation – records of events, situations or things experienced with your own senses, perhaps with the help of an instrument, e.g. camera, tape recorder, microscope, etc.
C) Interrogation – data gained by asking and probing, e.g. information about people’s convictions, likes and dislikes etc.
D) Participation – data gained by the experience of doing things, e.g. learning to ride a bike tells you different things about balance and dealing with traffic than merely observing would.
Population and Sampling
Population
• A collective term used to describe the total quantity of things (or
cases) of the type which are the subject of your study such as
certain types of objects, organizations, people or even events.
• It is important to carefully define your target population according
to the purpose and practicalities of your project.
• However, it is normally impossible to get all of them to answer your questions.
• Populations can have the following characteristics:
homogeneous – all cases are similar, e.g. bottles of beer on a
production line;
stratified – contain strata or layers, e.g. people with different
levels of income: low, medium, high;
clustered – contains clusters (groups with similar
characteristics)
grouped by location – different groups according to where
they are e.g. animals in different habitats – desert, equatorial
forest, savannah, tundra.
Sampling

The process of selecting a small group of cases out of a large group is called sampling.
• Samples are easier to collect data from because they are practical, cost-effective, convenient and manageable.
• In a population, there will probably be only certain groups that will be of
interest to your study. This selected category is your sampling frame. It is
from this sampling frame that the sample is selected.
• Sampling errors are statistical errors that arise when a sample does not represent the whole population; they lead to bias in the results, so the sample cannot be used to make accurate generalizations about the population.
Types of sampling methods/procedure:
1. Probability sampling
2. Non-probability sampling
1) Probability sampling
• Every member of the population has a chance of being selected.
It is mainly used in quantitative research. If you want to
produce results that are representative of the whole population,
probability sampling techniques are the most valid choice.
Main types of probability sampling methods
1. Simple random sampling
• In a simple random sample, every member of the population
has an equal chance of being selected. Your sampling frame
should include the whole population.
• Use tools like random number generators or other techniques
that are based entirely on chance.
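A minimal sketch of simple random sampling in Python (the sampling frame of 1,000 member IDs and the sample size of 100 are invented for illustration):

```python
import random

# Hypothetical sampling frame: an ID for every member of the population.
population = list(range(1, 1001))

# Simple random sample: every member has an equal chance of selection.
sample = random.sample(population, k=100)
```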
2. Systematic sampling
• Instead of randomly generating numbers, individuals are chosen
at regular intervals.
• If we use this technique, it is important to make sure that there is
no hidden pattern in the list that might skew the sample.
• For example, if the HR database groups employees by team, and team members are listed in order of seniority, there is a risk that your interval might skip over people in junior roles, resulting in a sample that is skewed towards senior employees.
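A sketch of systematic selection in Python, assuming a sampling frame with no hidden pattern (the frame of 1,000 members and the target sample size are invented):

```python
import random

population = list(range(1, 1001))   # hypothetical sampling frame of 1,000 members
interval = len(population) // 100   # step size for a sample of roughly 100

# Pick a random starting point, then take every interval-th member.
start = random.randrange(interval)
sample = population[start::interval]
```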
3. Stratified sampling
• This sampling method divides the population into subgroups (called strata) based on a relevant characteristic (e.g. gender, age range, income bracket, job role).
• Based on the overall proportions of the population, we calculate how
many people should be sampled from each subgroup. Then you use
random or systematic sampling to select a sample from each
subgroup.
Example
The company has 800 female employees and 200 male employees. You want to ensure that the
sample reflects the gender balance of the company, so you sort the population into two strata
based on gender. Then you use random sampling on each group, selecting 80 women and 20
men, which gives you a representative sample of 100 people.
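A sketch of proportionate stratified sampling for this example in Python (the employee ID strings are invented placeholders):

```python
import random

# The two strata from the example: 800 female and 200 male employees.
strata = {
    "female": [f"F{i}" for i in range(800)],
    "male": [f"M{i}" for i in range(200)],
}

sample_size = 100
total = sum(len(members) for members in strata.values())

# Sample each stratum in proportion to its share of the population:
# 80 women and 20 men for a sample of 100.
sample = []
for group, members in strata.items():
    n = round(sample_size * len(members) / total)
    sample.extend(random.sample(members, n))
```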
4. Cluster sampling
• Involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population.
• This method is good for dealing with large and
dispersed populations, but there is more risk of
error in the sample, as there could be substantial
differences between clusters. It’s difficult to
guarantee that the sampled clusters are really
representative of the whole population.
Example
• The company has offices in 10 cities across the country (all with
roughly the same number of employees in similar roles). You don’t
have the capacity to travel to every office to collect your data, so you
use random sampling to select 3 offices – these are your clusters.
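A sketch of this cluster selection in Python (the office names are invented placeholders):

```python
import random

# The 10 offices from the example are the clusters.
offices = [f"Office {i}" for i in range(1, 11)]

# Cluster sampling: randomly pick 3 whole clusters,
# then collect data from everyone in the chosen offices.
sampled_offices = random.sample(offices, k=3)
```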
2) Non-probability sampling
• Individuals are selected based on non-random criteria, and not
every individual has a chance of being included.
• It is easier and cheaper to access, but it has a higher risk of 
sampling bias. That means the inferences we can make about
the population are weaker than with probability samples, and
your conclusions may be more limited.
• These sampling techniques are often used in exploratory and qualitative research. In these types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial understanding of a small or under-researched population.
Main types of Non-probability sampling methods
1. Convenience Sampling
• A convenience sample simply includes the individuals
who happen to be most accessible to the researcher.
• This is an easy and inexpensive way to gather initial data,
but there is no way to tell if the sample is representative
of the population, so it can’t produce generalizable
results.
Example
• You are researching opinions about student support services in
your university, so after each of your classes, you ask your fellow
students to complete a survey on the topic. This is a convenient
way to gather data, but as you only surveyed students taking the
same classes as you at the same level, the sample is not
representative of all the students at your university.
2. Voluntary response sampling
• Similar to a convenience sample, a voluntary response sample is
mainly based on ease of access. Instead of the researcher choosing
participants and directly contacting them, people volunteer themselves
(e.g. by responding to a public online survey).
Example
• You send out the survey to all students at your university and a lot of students
decide to complete it. This can certainly give you some insight into the topic, but
the people who responded are more likely to be those who have strong opinions
about the student support services, so you can’t be sure that their opinions are
representative of all students.
3. Purposive sampling
• Also known as judgement sampling; it involves the researcher using their expertise to select a sample that is most useful to the purposes of the research.
• It is often used in qualitative research, where the researcher wants to
gain detailed knowledge about a specific phenomenon rather than make
statistical inferences, or where the population is very small and specific.
An effective purposive sample must have clear criteria and rationale for
inclusion.
Example
• You want to know more about the opinions and experiences of disabled
students at your university, so you purposefully select a number of
students with different support needs in order to gather a varied range
of data on their experiences with student services.
4. Snowball sampling
• If the population is hard to access, snowball sampling can
be used to recruit participants via other participants. The
number of people you have access to “snowballs” as you
get in contact with more people.
Example
• You are researching experiences of homelessness in your city.
Since there is no list of all homeless people in the city,
probability sampling isn’t possible. You meet one person who
agrees to participate in the research, and she puts you in contact
with other homeless people that she knows in the area.
2) Secondary data sources: data that have been interpreted and recorded, or written sources.
 Are less reliable than primary data
 Data in the form of news bulletins, books, journals, magazines,
newspapers, documentaries, advertising, the Internet etc.
 Survey data – government census of population, employment,
household surveys, economic data. These may be carried out on
a periodic basis, with frequent regularity or continuously. They
may also be limited by sector, time or area.
 The quality of the data depends on the source and the methods of
presentation.
MEASUREMENT OF DATA
• Data can be measured in different ways depending on their nature. These are commonly referred to as levels of measurement: Nominal, Ordinal, Interval and Ratio.

A)Nominal level
 Divides the data into separate distinctive categories that can then be compared
with each other.
 Example:
o Sex (Male, Female),
o Marital status (Single, Married, Divorced, Widowed).
 Nominal data can be analyzed using only simple graphic and statistical
techniques such as Bar graphs.
B) Ordinal level
 Data can be put in rank order, but the intervals between ranks cannot be measured precisely.
 Example:
o Levels of satisfaction (Low, Medium, High).
C) Interval level
 Data must be able to be measured precisely on a regular scale of some sort, but there is no true zero.
 Example: if one value is so many units (degrees, inches) more or less than another, you have an interval scale.

D) Ratio level
 The most complete level of measurement, having a true zero: the point where the value is truly equal to zero.
 Example: if one value is so many times as big or bright or tall or heavy as another, you have a ratio scale.
COLLECTING AND ANALYSING SECONDARY DATA 

• As researchers, we will face several problems when seeking previously recorded historical data. The main problems are:
 locating and accessing them
 authenticating the sources
 assessing credibility (truthfulness)
 evaluating how representative they are
 selecting methods to interpret them.
Data set and Data processing
• A dataset (or data set) is a collection of data, usually presented in tabular form. Each column represents a particular variable; each row corresponds to a given member of the dataset in question.
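For instance, a tiny dataset sketched in Python (the variables and values are invented):

```python
# Each dict is a row (one member of the dataset);
# each key is a column (one variable).
dataset = [
    {"id": 1, "age": 34, "income": "medium"},
    {"id": 2, "age": 27, "income": "low"},
    {"id": 3, "age": 45, "income": "high"},
]
```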
Data processing
• Manipulation of data by a
computer. It includes the
conversion of raw data to
machine-readable form, flow
of data through the CPU and
memory to output devices, and
formatting or transformation of
output.
• Any use of computers to perform
defined operations on data can
be included under data
processing.
Data processing

• The output or “processed” data can be obtained in various forms. Examples of these forms include image, graph, table, vector file, audio, charts or any other desired format.
• The form obtained depends on the software or method used. When carried out by the machine itself, without human intervention, it is referred to as automatic data processing.
• Data centres are a key component, as they enable the processing, storage, access, sharing and analysis of data. More and more information can be stored in this manner.
• Advancements in areas such as data security, machine learning, data science, network security etc. require a focused approach for reliable, accurate and cost-effective processing.
• All businesses, especially those which require real-time processing, need reliable and efficient data centres. These centres house the critical infrastructure and provide the robust processing required to keep services running.
Statistical analysis
• It is a component of data analytics. Statistical analysis can be
used in situations like gathering research interpretations,
statistical modeling or designing surveys and studies. It can
also be useful for business intelligence organizations that have
to work with large data volumes.
• The goal of statistical analysis is to identify trends. A retail
business, for example, might use statistical analysis to find
patterns in unstructured and semi-structured customer data that
can be used to create a more positive customer experience and
increase sales. 
Steps of statistical analysis
Statistical analysis can be broken down into five discrete steps,
as follows:
• Describe the nature of the data to be analyzed.
• Explore the relation of the data to the underlying population.
• Create a model to summarize an understanding of how the data
relates to the underlying population.
• Prove (or disprove) the validity of the model.
• Employ predictive analytics to run scenarios that will help
guide future actions.
Simple Statistical Analysis
Summarising Data: Grouping and Visualising
• The first thing to do with any data is to summarise it, which means to
present it in a way that best tells the story.
• The starting point is usually to group the raw data into categories,
and/or to visualise it. For example, if you think you may be interested
in differences by age, the first thing to do is probably to group your data
in age categories, perhaps ten- or five-year chunks.
• One of the most common techniques used for summarising is the graph, particularly bar charts, which show every data point in order, or histograms, which are bar charts grouped into broader categories.
 An example is shown below,
which uses three sets of data,
grouped by four categories. This
might, for example, be ‘men’,
‘women’, and ‘other/no gender
specified’, grouped by age
categories 20–29, 30–39, 40–49
and 50–59.
• An alternative to a histogram is
a line chart, which plots each
data point and joins them up
with a line. The same data as in
the bar chart are displayed in a
line graph below.
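A sketch of how such charts could be drawn in Python with matplotlib (the three series and all the counts are invented to match the description above):

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented data: three series grouped by four age categories.
categories = ["20-29", "30-39", "40-49", "50-59"]
men = [12, 18, 15, 9]
women = [14, 16, 17, 11]
other = [3, 4, 2, 2]

x = np.arange(len(categories))
width = 0.25

# Grouped bar chart: one bar per series within each age category.
plt.bar(x - width, men, width, label="men")
plt.bar(x, women, width, label="women")
plt.bar(x + width, other, width, label="other/no gender specified")
plt.xticks(x, categories)
plt.legend()
plt.show()

# The same data as a line chart: plot each point and join them with a line.
for series, label in [(men, "men"), (women, "women"), (other, "other")]:
    plt.plot(categories, series, marker="o", label=label)
plt.legend()
plt.show()
```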
Importance of Visualizing data
Drawing a graph is important because:
• it gives you an immediate ‘picture’ of the data.
• it shows you straight away whether your data are grouped
together, spread about, tending towards high or low values, or
clustered around a central point.
• It will also show you whether you have any ‘outliers’, that is,
very high or very low data values, which you may want to
exclude from the analysis, or at least revisit to check that they are
correct.
• You can also display grouped data in a pie chart.
• It is worth drawing a graph before you start any further analysis, just to have a look at your data.
Measures of Location: Averages
• The average gives you information about the size of the effect of whatever you are testing, in other words, whether it is large or small. There are three measures of average: mean, median and mode.
• Mean (average): has the advantage that it uses all the data values obtained and can be used for further statistical analysis.
• Median: the mid-point of all the data. The median is not skewed by extreme values, but it is harder to use for further statistical analysis.
• Mode: the most common value in a data set.
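A sketch computing all three averages with Python's statistics module (the data values are invented):

```python
import statistics

# Invented data set for illustration.
data = [4, 6, 9, 3, 7, 6]

mean = statistics.mean(data)      # uses every value: about 5.83
median = statistics.median(data)  # mid-point of the ordered data: 6.0
mode = statistics.mode(data)      # most common value: 6
```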


Measures of Spread: Range, Variance and Standard
Deviation
• Researchers often want to look at the spread of the data, i.e., how
widely the data are spread across the whole possible measurement
scale.
• There are three measures which are often used for this:
Range 
• The difference between the largest and smallest values. Example: In {4,
6, 9, 3, 7} the lowest value is 3, and the highest is 9. So the range is 9 −
3 = 6. 
Standard deviation
• Measures the average spread around the mean, and therefore gives a sense of the ‘typical’ distance from the mean.
Variance
• The square of the standard deviation: averaging the squared deviations of each data point from the mean gives the variance.
• In other words, the variance measures the average degree to which each point differs from the mean, and the standard deviation describes how spread out the numbers are by taking the square root of the variance.
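A sketch computing all three measures of spread in Python, reusing the data set from the Range example:

```python
import statistics

# The example data set from the Range definition above.
data = [4, 6, 9, 3, 7]

data_range = max(data) - min(data)     # 9 - 3 = 6
variance = statistics.pvariance(data)  # mean squared deviation from the mean: 4.56
std_dev = statistics.pstdev(data)      # square root of the variance: about 2.14
```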

Skew
• Measures how symmetrical the data set is, or whether it tails off towards high values or towards low values.
• A sample with a long tail of low values is described as negatively skewed, and a sample with a long tail of high values as positively skewed.
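A sketch checking the direction of skew with scipy.stats.skew (scipy is an assumed dependency here; the data values are invented):

```python
from scipy.stats import skew

# Invented data: most values are small, with a long tail of high values.
tail_high = [1, 2, 2, 3, 3, 3, 9]

print(skew(tail_high))  # positive result, i.e. positively skewed
```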
End of Chapter 4
