22BA206

DEPARTMENT OF MASTER OF BUSINESS ADMINISTRATION

VISION OF THE DEPARTMENT


• To develop competent managers with excellent professionalism and a focus on research to meet global challenges and demand.

MISSION OF THE DEPARTMENT


• To impart quality and real time education to contribute in the field of management.
• To impart soft skills, leadership qualities and professional ethics among the young graduates.
• To develop graduates to compete at the global level.
• To deal with the contemporary issues and to cater to the societal needs.

PROGRAMME EDUCATIONAL OBJECTIVES (PEOs):


The MBA programme curriculum is designed to prepare postgraduate students:
PEO1: To have a thorough understanding of the core aspects of the business.
PEO2: To provide the learners with the management tools to identify, analyze and create business
opportunities as well as solve business problems.
PEO3: To prepare them to have a holistic approach towards management functions.
PEO4: To motivate them for continuous learning.

Business Research Methods 1


and Data Analytics
22BA206

PROGRAMME OUTCOMES (POs):

On successful completion of the programme, students will have the following abilities:

PO1: Ability to apply the business acumen gained in practice.


PO2: Ability to understand and solve managerial issues.
PO3: Ability to communicate and negotiate effectively, to achieve organizational and individual
goals.
PO4: Ability to understand one’s own capacity to set achievable targets and complete them.
PO5: Ability to fulfill social outreach.
PO6: Ability to take up challenging assignments.

PROGRAM SPECIFIC OUTCOMES (PSOs)

PSO 1: To design and develop systems for real time problems in the areas related to management and human values using the latest techniques.
PSO 2: To develop innovative, eco-friendly solutions and ideas in the field of Management using various software tools with analytical skills.

PEOs (1 to 4) mapped with POs and PSOs

PEO     PO1  PO2  PO3  PO4  PO5  PO6    PSO1  PSO2

I.       3    3    2    2    2    2      3     3
II.      3    3    3    3    2    1      3     2
III.     3    2    3    3    3    2      3     2
IV.      3    3    2    3    3    2      3     3

COURSE OUTCOMES:                                                          Bloom's Taxonomy Level
CO1: Students would know how to write research proposals                                K3
CO2: Students would be able to analyze data and find solutions to the problems          K2
CO3: Ability to understand the role of Business Analytics in decision making and
     generating reports                                                                 K2
CO4: Ability to identify the appropriate tool for the analytics scenario                K4
CO5: Ability to apply the different analytics tools and generate solutions              K2


SYLLABUS

22BA206- BUSINESS RESEARCH METHODS AND DATA ANALYTICS

Programme & Branch: MBA      Sem.: 2      Category: Analytics      L T P Credit: 3 1 0 4
Prerequisites: Nil

Preamble: This course will help students upskill and stay up-to-date with the latest concepts such as big data analytics, R programming, Python, etc. The course covers data-driven studies along with business ethics, organizational behavior, business and management principles, and marketing management, and develops new business insights in business analytics.
UNIT I INTRODUCTION 12
Research: Meaning, Purpose, Scientific method, types of research; scope of business research. Selection and formulation of a research problem, formulation of hypothesis, Types of hypothesis, operational definition of concepts - Review of literature - Data: Types of data - Primary and secondary data sources - Relevance & Scope of Research in Management and steps involved in the Research Process
UNIT II RESEARCH METHODS AND MEASURES 12
Components of research design - Definition & types, Research methods - Types, Reliability - types of reliability, Validity - types of validity, Variables - types of variables, Sampling - Methods of collection - tools for collecting data; Data collection instruments - Primary & secondary: in-depth interviews, projective techniques and focus groups - interview schedule, interview guide, questionnaire, rating scale, sociometry, check list; pretesting of tools, pilot study. Processing of data: checking, editing, coding, transcription, tabulation, preparation of tables, graphical representation.

UNIT III SAMPLING AND ANALYSIS 12


Data management plan – Sampling & measurement – Tabulation, Introducing database applications, testing for association, Analysis Techniques: Qualitative & Quantitative Analysis Techniques – Techniques of Testing Hypothesis – Methods of analysis: Chi-square, t-test, Correlation & Regression Analysis – Analysis of Variance, etc. – Making Choice of an Appropriate Analysis Technique – market survey.

UNIT IV DATA ANALYTICS 12


Introduction to Data analytics - Types of Data analytics - Data visualization for decision making - Graphical techniques, skewness, kurtosis, formatting data - different operations using chart, pivot chart and formatting plot area - Data wrangling - Business Problem Solving across different domains - Dashboarding Fundamentals
UNIT V INTERMEDIATE DATA ANALYTICS 12
Dashboarding for Enterprise Reporting - Visualization: introducing a visualization tool with dashboarding - Linear & Logistic regression modeling and their types - Time series modeling, forecasting - Introduction to big data - Predictive analysis - Automated analytics - Cloud analytics - Report writing: planning report writing work - target audience, type of report, style of writing, synoptic outline of chapters; steps in drafting the report

MAPPING OF COURSE OUTCOMES WITH THE PROGRAM OUTCOMES:

        PO1  PO2  PO3  PO4  PO5  PO6

CO1      3    2    3    3    3    2
CO2      3    2    3    3    3    3
CO3      3    3    2    3    2    3
CO4      3    3    3    3    3    3
CO5      3    3    3    3    2    3


UNIT I INTRODUCTION 12

Research: Meaning, Purpose, Scientific method, types of research; scope of business research. Selection
and formulation of a research problem, formulation of hypothesis, Types of hypothesis, operational
definition of concepts- Review of literature-Data: Types of data- Primary and secondary data sources -
Relevance & Scope of Research in Management and steps involved in the Research Process

Learning Objectives
• To learn the fundamentals of business research and its significance and need in the present business scenario
• To understand how to formulate a research problem and hypothesis
• To learn about the operationalisation of variables and the research process in business

Learning Outcomes
At the end of the unit, students will be able to:
• Apply different sampling techniques and research methods
• Apply qualitative and quantitative techniques
• Apply the research process in future business

DETAILED SESSION PLAN (TOPIC WISE)

S.No | Title of Topic | Mode of Teaching (PPT/Seminar/Chalk & Board etc.) | Textbook/Reference Book | Link (if applicable: Springboard/Coursera/NPTEL) | Assessment Tool (Quiz/Puzzle/Assignment/Seminar etc.)

1 | Meaning and scope of research method | Chalk and board | Uma Sekaran and Roger Bougie, Research Methods for Business, 5th Edition, Wiley India, New Delhi, 2012. | NPTEL: https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=17&lesson=18 and https://www.youtube.com/watch?v=Bqef3sycmZY | Quiz

2 | Methods and types of research | PPT | Panneerselvam, R., Research Methodology, 2nd Edition, PHI Learning, 2014. | https://www.youtube.com/watch?v=1vf8ZvADxfY

3 | Research Process | Seminar | Panneerselvam, R., Research Methodology, 2nd Edition, PHI Learning, 2014. | https://www.youtube.com/watch?v=axHm4ztonoA

1. INTRODUCTION
In the present fast-track business environment marked by cut-throat competition, many organizations
rely on business research to gain a competitive advantage and greater market share. A good research
study helps organizations to understand processes, products, customers, markets and competition, to
develop policies, strategies and tactics that are most likely to succeed.
The word research is composed of two syllables, re and search. The word re is a prefix meaning
again, anew or over again and the word search is a verb meaning to examine closely and carefully,
to test and try, or to probe. Together they form a noun describing a careful, systematic, patient study
and investigation in some field of knowledge, undertaken to establish facts or principles.
Research is a structured enquiry that utilizes acceptable scientific methodology to solve
problems and create new knowledge that is generally applicable. Scientific methods consist of
systematic observation, classification and interpretation of data.
According to Robert Ross, “Research is essentially an investigation, a recording and an analysis of evidence for the purpose of gaining knowledge”. It can generally be defined as a systematic method of finding solutions to problems. It aims at discovering the truth. It is the search for knowledge through an objective and systematic method of finding solutions to problems. It is carried on both for discovering new facts and for verification of old ones. Therefore, research is a
process of systematic and in-depth study or search of any particular topic, subject or area of
investigation backed by collection, computation, presentation and interpretation of relevant data.
Research need not lead to an ideal solution; it may give rise to new problems which require further research. In other words, research is not an end to a problem, since every research effort gives birth to a new question.

1.1 Research: Meaning


Business research refers to the systematic collection and analysis of data with the purpose of finding answers to problems facing management. It can be carried out with the objective to explore, to describe or to diagnose a phenomenon. It involves establishing objectives and gathering relevant information to obtain the answer to a business issue. It can be conducted to answer a business-related question, such as: What is the target market of my product? Business research can also be used to solve a business-related problem, such as determining how to decrease the amount of excess inventory on hand.
When deciding whether business research is to be conducted or not, the firm keeps in
mind factors like the availability of data, time constraints and the value of the research
information to the company. Adequate planning and information-gathering are essential to
derive results for business.

1.2 Purpose of research

(i) Progress and Good Life

The purpose of all research is progress and the good life. Progress results when the space of ignorance is occupied by knowledge and wisdom, which are the results of good research. Knowledge and wisdom drive mankind to live an orderly, good life.


(ii) Development of Scientific Attitude

One of the purposes of research is to develop a scientific attitude. A scientific attitude is one that asks ‘why’ and ‘how’ and finds answers. This ‘know-why’ and ‘know-how’ attitude nurtures talents, and such intellectual talents are great assets of society.

(iii) Creativity and Innovativeness

One of the purposes of research is to encourage creativity and innovation. New products, new processes and new uses are the means through which the world stays dynamic. A dynamic world is not possible without newness introduced every now and then in every walk of life, and this is possible only through creativity and innovation. Research kindles the creative and innovative instincts of people and experiments with the possibility of new things, instead of waiting for the accidental and slow path of experience to produce creativity and innovation.

(iv) Testing Hypothesis and Establishing Theories

A very important purpose of research is the testing of hypotheses and establishing of theories. As was already pointed out, knowledge is power. That knowledge comes from testing hypotheses and establishing new theories. Proven hypotheses become theories.

(v) Prediction and Control

Applied research has a great say in prediction and control in almost all walks of human endeavor. Prediction is jumping into the future, and theories constitute the launch pad. Control looks for deviation between the actual happening and the predicted happening. In the process, the theories get reevaluated and redefined.

(vi) Purposive Development

Development = Growth + Change. Growth is uni-scaled while change is multi-scaled. In the natural process, development does take place through trial and error, through casual observations, through actual exposure and the like. But this is evolutionary and time consuming. Revolutionary development comes forth through discontinuous change. Research is the seed of such dichotomous change, or even disruptive change, which contributes to purposive development.

(vii) Problem Solving

The purpose of any research is problem solving. What is a problem? A problem is deprivation or depreciation of something. Knowledge deprivation, efficiency deprivation, productivity depreciation, etc., exist. How can these be solved? Research is needed into the forces that cause deprivation and into measures that contain them. Thus, problem solving is a great purpose of research.


(viii) Systematic Evaluation

Research is also carried out to systematically evaluate a process or practice of an organization to know its strengths and weaknesses, so that areas for improvement can be identified.

(ix) Impact Analysis

Research is undertaken to assess the impact of certain measures or change introduced on relevant
variables. Impact studies are useful for biological, social, business, economic and other areas of
decision making.

(x) Methodological Improvement

Another purpose of research is improving research methodology itself. Developments in the field of measurement and scaling are immense. Can these be appropriately used in particular research areas? To answer that question, research needs to be done. Validation, revalidation and de-validation of methodological aspects thus constitute a good piece of research, and this is one of the purposes of research. In fact, any research has a responsibility towards contributing to methodological enrichment.

1.3 Nature of Business Research


Business research is the process of gathering thorough information on all aspects of a company's operations and applying that information to improve operational excellence, which can lead to an increase in sales and profits.
• A study like this can assist businesses in figuring out the product or service that is most profitable.
• It entails determining where money should be spent to boost sales, profitability and/or market share.
Given the increasing competition in all industries, market research has become extremely necessary
to make intelligent and informed decisions that fuel business growth.
Business research is one of the most effective ways to understand your customers and the
overall market, as well as analyze competitors. This type of research aids businesses in determining
market demand and supply. It can help business organizations to cut unnecessary expenses and
develop tailor-made solutions or products that appeal to the demand in the market. Research for
startups aids in gathering information for professional or commercial purposes to assess business
prospects and goals. Business research can also help startups find the right audience profile for their
offerings. It is the holy grail when looking to achieve success in the modern, ultra-competitive
business world.

1.4 Need for Research

The main importance of research is to produce knowledge that can be applied outside a research setting. Research also forms the foundation of program development and policies around the world. It also solves particular existing problems of concern. Research is important because we are able to learn more about things, people, and events. In doing research, we are able to make smart decisions.
Marketing research is important because it allows consumers and producers to become more familiar with the products, goods, and services around them. Research is important to society because it allows us to discover more and more that might make our lives easier, more comfortable, and safer. It presents more information for investigation, which allows for improvements based on greater information and study. Research encourages interdisciplinary approaches to find solutions to problems and to make new discoveries. Research is a basic ingredient for development and therefore serves as a means for rapid economic development.
The main importance or uses may be listed as under:

• It provides a basis for government policies

• It helps in solving various operational and planning problems of business and industry

• Research helps in problem solving

• It is useful to students, professionals, philosophers, literary men, analysts and intellectuals.

1.5 Advantages of Business Research


• Market research can help organizations gain a better perspective and understanding of their
market or target audience. This ensures that the company stays ahead of its competitors.
• Primary and secondary research can act as an insurance policy against obvious but silent
dangers on your business path.
• Market research findings help organizations learn from their weaknesses and adapt to new
business environments.
• By using certain research methodologies for competitor analysis, you can capitalize on your
new-found knowledge to steer ahead of the competition.
• Regular market research initiatives help take the ‘pulse’ of hot market trends, allowing you
to come up with “superhit” products and services.
• It helps with market forecasting, which allows you to project future numbers, characteristics,
and trends within your target market.

1.6 Objectives of Research

• To find out the truth which is hidden and which has not been discovered so far.

• To advance systematic knowledge and formulate basic theories about the forces influencing the relations between groups, as well as those acting on personality development and its adjustment within individuals.

• To improve tools of analysis or to test these against complex human behaviour and institutions.

• To understand social life and thereby gain a greater measure of control over social behaviour.
• To provide an educational program in the accumulated knowledge of group dynamics, in skills of research, in techniques of training leaders and in social action.
1.7. Importance of Business research
• Business research is one of the most effective ways to understand customers, the market and
competitors. Such research helps companies to understand the demand and supply of the market.
Using such research will help businesses reduce costs, and create solutions or products that are
targeted to the demand in the market and the correct audience.


• In-house business research can enable senior management to build an effective team, or to train and mentor when needed. Business research enables the company to track its competitors and hence can give it the upper hand to stay ahead of them. Failures can be avoided by conducting such research, as it can give the researcher an idea of whether the time is right to launch the product or solution and whether the audience is right. It will help the company understand its brand value and measure customer satisfaction, which is essential to continuously innovate and meet customer demands. This will help the company grow its revenue and market share.
• Business research also helps recruit ideal candidates for various roles in the company. By
conducting such research a company can carry out a SWOT analysis, i.e. understand the strengths,
weaknesses, opportunities, and threats. With the help of this information, wise decisions can be
made to ensure business success.
• Business research is the first step that any business owner needs to take to set up a business, to survive, or to excel in the market. The main reason why such research is of utmost importance is that it helps businesses to grow in terms of revenue, market share and brand value.

1.8 Scope of Business Research


Scope of Business Research Includes the Following Areas:
(i) Production Management: Research performs an important function in product development, diversification, introduction of a new product, product improvement, process technologies, choosing a site, new investment, etc.
(ii) Personnel Management: Research works well for job redesign, organization restructuring,
development of motivational strategies and organizational development.
(iii) Marketing Management: Research performs an important part in the choice and size of the target market, and in understanding consumer behavior with regard to the attitudes, lifestyles and influences of the target market. It is the primary tool in determining price policy, selection of channels of distribution and development of sales strategies, product mix, promotional strategies, etc.
(iv) Financial Management: Research can be useful for portfolio management, distribution of
dividend, capital raising, hedging and looking after fluctuations in foreign currency and
product cycles.
(v) Materials Management: It is utilized in choosing the supplier, making the decisions
relevant to make or buy as well as in selecting negotiation strategies.
(vi) General Management: It contributes greatly to developing standards, objectives, long-term goals, and growth strategies. To perform well in a complex environment, you will have to be equipped with an understanding of scientific methods and a way of integrating them into decision making. You will have to understand what good research means and how to conduct it. Since the complexity of the business environment has amplified, there has been a commensurate rise in the number and power of the instruments used to carry out research. There is certainly more knowledge in all areas of management. We have now started to develop much better theories. The computer has given us a quantum leap in the capability to deal with difficult problems. New techniques of quantitative analysis take advantage of this power. Communication and measurement techniques have also improved. These developments reinforce each other and are having a substantial impact on business management.

1.2. SCIENTIFIC METHOD


To be termed scientific, a method of inquiry must be based on empirical and measurable
evidence subject to specific principles of reasoning. The “scientific method” attempts to
minimize the influence of the researchers’ bias on the outcome of an experiment. The researcher
may have a preference for one outcome or another, and it is important that this preference not
bias the results or their interpretation. Sometimes “common sense” and “logic” tempt us into
believing that no test is needed. Another common mistake is to ignore or rule out data which

do not support the hypothesis.


The scientific method is the process by which scientists, collectively and over time,
endeavor to construct an accurate (that is, reliable, consistent and non-arbitrary) representation
of the world.
Recognizing that personal and cultural beliefs influence both our perceptions and our
interpretations of natural phenomena, we aim through the use of standard procedures and
criteria to minimize those influences when developing a theory. As a famous scientist once said,
“Smart people (like smart lawyers) can come up with very good explanations for mistaken
points of view.” In summary, the scientific method attempts to minimize the influence of bias
or prejudice in the experimenter when testing a hypothesis or a theory.
Clover and Balsley define the scientific method as “a systematic step-by-step procedure following the logical process of reasoning.”

1.2.1 Important characteristics of scientific method

(i) Empirical
Scientific method is concerned with the realities that are observable through “sensory
experiences.” It generates knowledge which is verifiable by experience or observation. Some of the
realities could be directly observed, like the number of students present in the class and how many
of them are male and how many female. The same students have attitudes, values, motivations,
aspirations, and commitments. These are also realities which cannot be observed directly, but the
researchers have designed ways to observe these indirectly. Any reality that cannot be put to
“sensory experience” directly or indirectly (existence of heaven, the Day of Judgment, life hereafter,
God’s rewards for good deeds) does not fall within the domain of scientific method.

(ii) Verifiable
Observations made through scientific method are to be verified again by using the senses to
confirm or refute the previous findings. Such confirmations may have to be made by the same
researcher or others. We will place more faith and credence in those findings and conclusions if
similar findings emerge on the basis of data collected by other researchers using the same methods.
To the extent that it does happen (i.e. the results are replicated or repeated) we will gain confidence
in the scientific nature of our research. Replicability, in this way, is an important characteristic of
scientific method. Hence revelations and intuitions are out of the domain of scientific method.

(iii) Cumulative
Prior to the start of any study the researchers try to scan through the literature and see that their
study is not a repetition in ignorance. Instead of reinventing the wheel the researchers take stock of
the existing body of knowledge and try to build on it. Also the researchers do not leave their research
findings into scattered bits and pieces. Facts and figures are to be provided with language and thereby
inferences drawn. The results are to be organized and systematized. Nevertheless, we don’t want to
leave our studies as standalone. A linkage between the present and the previous body of knowledge
has to be established, and that is how the knowledge accumulates. Every new crop of babies does not
have to start from a scratch; the existing body of knowledge provides a huge foundation on which
there searchers build on and hence the knowledge keeps on growing.

(iv) Deterministic
Science is based on the assumption that all events have antecedent causes that are subject to
identification and logical understanding. For the scientist, nothing “just happens” – it happens for a
reason. Scientific researchers try to explain an emerging phenomenon by identifying its causes. Of the identified causes, which ones are the most important? For example, in the 2006 BA/BSc examination of Mumbai University, 67 per cent of the students failed. What could be the determinants of such a mass failure of students? The researcher may try to explain this phenomenon and come up with a variety of reasons, which may pertain to students, teachers, administration, curriculum, books, the examination system, and so on. Looking into such a large number of reasons may yield a highly cumbersome model for problem solution. It is more appropriate to ask, of all these factors, which one is the most important, the second most important, the third most important, and which two in combination are the most important. The researcher tries to narrow down the number of reasons in such a way that some action can be taken. Therefore, the achievement of a meaningful, rather than an elaborate and cumbersome, model for problem solution becomes a critical issue in research. That is parsimony, which implies explanation with the minimum number of variables responsible for an undesirable situation.

(v) Ethical and Ideological Neutrality


The conclusions drawn through interpretation of the results of data analysis should be objective; that is, they should be based on the facts of the findings derived from actual data, and not on our own subjective or emotional values. For instance, if we had a hypothesis stating that greater participation in decision making will increase organizational commitment, and this was not supported by the results, it makes no sense for the researcher to continue arguing that increased opportunities for employee participation would still help. Such an argument would be based not on the factual, data-based research findings, but on the subjective opinion of the researcher. If this was the conviction of the researcher all along, then there was no need to do the research in the first place. Researchers are human beings, having individual ideologies, religious affiliations and cultural differences which can influence their research findings. Any interference of their personal likes and dislikes in their research can contaminate the purity of the data, which ultimately can affect the predictions made by the researcher. Therefore, one of the important characteristics of scientific method is to follow the principle of objectivity, uphold neutrality, and present the results in an unbiased manner.

1.2.2 Research Theory


A theory is defined as a set of systematically interrelated concepts, definitions and propositions that are advanced to explain and predict a phenomenon. It may also specify causal relationships among variables. A theory is an integrated body of definitions, assumptions, and general propositions covering a given subject matter, from which a comprehensive and consistent set of specific and testable principles can be deduced logically. A theory of consumer behaviour, for example, provides a basis for studying consumers and formulating appropriate marketing strategies.

1.2.2.1 Requisites (Criteria) of Theory


Theory starts out as ideas. The criteria to be met by the set of ideas are:
1. They must be logically consistent.
2. They must be interrelated.
3. The statements must be exhaustive.
4. The propositions should be mutually exclusive.
5. They must be capable of being tested through research.


1.2.2.1.1 Methods of Formation of Theory

Deduction: It is one of the important methods employed in theory building. It is a process of drawing generalizations through a process of reasoning on the basis of certain assumptions which are either self-evident or based on observation. By deduction is meant reasoning or inference from the general to the particular, or from the universal to the individual.

E.g., All men are mortal (Major premise)
A is a man (Minor premise)
Therefore, A is mortal (Conclusion)

The conclusion follows logically from the two premises; therefore it is valid. The deduction is the logical conclusion obtained by deducing it from the statements, called the premises of the argument. The argument is so constructed that if the premises are true, the conclusion must also be true. Logical deduction only derives conclusions from given premises; it cannot affirm the truth of the given statements. It serves in connecting different truths, and thus logical derivation is not a means to find ultimate truth.

Induction: It is the process of reasoning from a part to the whole, from the particular to the general, or from the individual to the universal. It gives rise to empirical generalizations. It is a passage from the observed to the unobserved. It involves two processes, namely observation and generalization. Induction may be regarded as a method by means of which the material truth of the premises is established. Generating ideas from empirical observation is the process of induction. As a matter of fact, concepts can be generated from experience, which justifies the description of particular situations towards theory building. It is generally observed that experience is regarded as a sum of individual observations held together by the loose tie of association and constantly extended by the idea of inductive inferences.


It is generally stated that knowledge is based on the foundations of particular facts. In empirical sciences, we start from the consideration of a single case and go on to prove many cases. Consider the following illustration: “I saw a raven in black colour. Other ravens seen by me were also black in colour. All ravens are therefore black.”
The inductive method is classified into two types: enumerative induction and analytical induction.
Retroduction: It is a technique of successive approximation by which the concepts and assumptions of theories are brought into closer alignment with relevant evidence. At the same time, it maintains the logical consistency required of deductive systems.

1.2.3 Scope of scientific methods


Social science research has a vast scope in respect of areas of application. Social science research can be useful in a number of areas such as:
Economic Planning: Social science research can be of immense use in economic planning in a given society. Economic planning requires basic data on the various aspects of our society and economy, resource endowment, and the needs, hopes and problems of the people, etc. Economic planning is undertaken to achieve certain objectives such as:
• To bring about regional development.
• To make optimum use of available resources.
• To bring about self-reliance.
• To generate employment, etc.
Systematic research provides the required data for planning and developing various schemes or programmes such as employment generation programmes, rural development programmes, etc.
Control over Social Phenomena: Through social science research, first-hand information can be obtained about the working of institutions and organisations, which in turn provides greater power of control over social phenomena. Social science research has practical implications for formal and informal styles of managing, organisation structures, and the introduction of changes in the organisation.
Social Welfare: Social research can be used to collect the required data on different aspects of
social life in a given society, so as to develop social welfare programmes. For instance, in a
developing country like India, there are various social welfare problems such as low literacy, law
and order problems due to caste, religion, and other conflicts, social evils like child marriages,
abuse of women, and so on. Therefore, to overcome social problems the Government and other
organizations can collect relevant data through a systematic research, and accordingly develop
various social welfare programmes, such as family welfare campaigns, literacy programmes,
women and children welfare programmes, etc.
Helps to Solve Problems: Research can be undertaken to find solutions to specific problems. For instance, an organization may initiate research to find a solution to the problem of declining sales of its products in the market. An educational institution can undertake research to find out the causes of low attendance or poor results. A government organisation may undertake research to solve a problem or to ascertain the impact of slums on the quality of life in a particular city, among other research activities. Research enables us to find appropriate solutions to specific problems, which in turn helps to improve the quality of performance in various organizations or institutions.
Verifies and Tests Existing Laws: Research may be undertaken to verify and test existing laws or theories. Such verification and testing of existing theories helps to improve our knowledge and ability to handle situations and events. This is especially true when existing theories are not sufficient or relevant to handle certain situations and events; through research, improvements or modifications can then be made in the existing laws or theories.
Develops New Tools and Theories: Research helps to develop new tools, concepts and theories for a better study of an unknown phenomenon. For this purpose, exploratory research is undertaken
to achieve new insights into such phenomenon.
Helps to Predict Events: Research may be undertaken to predict the future course of events. For instance, research may be undertaken to find out the impact of growing unemployment of educated youth on the social life of the society in future. The findings of such research would not only indicate the possible impact, but would also enable the concerned authorities to take appropriate measures to reduce unemployment, to reduce the growth of population and to overcome the negative consequences as and when they take place.
Extends Knowledge: Researchers undertake research to extend the existing knowledge in the physical sciences (such as physics, chemistry, mathematics, etc.) as well as in the social sciences (such as sociology, management, psychology, etc.). Knowledge can be enhanced by undertaking research in general, and by fundamental research in particular.

1.2.4 Distinction between induction and deduction


1. Generalizations: In induction, one arrives at universal generalizations from particular facts. In deduction, one deduces particular conclusions from universal generalizations.
2. Material Truth: Induction is concerned with the establishment of the material truth of universal propositions. Deduction is not concerned with the material truth of the premises.
3. Certainty of Conclusions: The conclusions of the inductive method are only probable and not always certain. The deductive method provides conclusions that are certain. This is because in the inductive method the conclusion is not implied in the premises, whereas the conclusion in the deductive method follows from the premises logically, i.e. it is implied in the premises.
4. Observed Facts: Induction is concerned with discovering facts and the relations between them. Observed facts provide the basis for induction. The propositions from which deductions are made are assumed; in the deductive method, observed facts are not relevant.
5. Conclusion and Premise: In the inductive method, the conclusion goes beyond the premises or the contents of the data; the conclusion is more general than the premises. In the deductive method, the conclusion only seeks to discover what is in the premises and does not go beyond them. The conclusion in deduction is never more general than the premises.

1.3. Types of Research

Business research is a process of acquiring detailed information about all the areas of business and using such information to maximize the sales and profit of the business. Such a study helps companies determine which product or service is most profitable or in demand. In simple words, it can be stated as the acquisition of information or knowledge for professional or commercial purposes to determine opportunities and goals for a business.
Business research can be done for anything and everything. In general, when people speak about
business research it means asking research questions to know where the money can be spent to increase
sales, profits or market share. Such research is critical to make wise and informed decisions.

For example: A mobile company wants to launch a new model in the market, but it is not aware of which dimensions of a mobile are most in demand. Hence, the company conducts business research using various methods to gather information, which is then evaluated, and conclusions are drawn as to which dimensions are most in demand. This will enable the researcher to make wise decisions to position the phone at the right price in the market and hence acquire a larger market share.


1.3.1. Business research: Types and methodologies


Business research is a part of the business intelligence process. It is usually conducted to determine whether a company can succeed in a new region, to understand its competitors, or simply to select a marketing approach for a product. This research can be carried out using qualitative research methods or quantitative research methods.

1.3.1.1 Quantitative research methods


Quantitative research methods are research methods that deal with numbers. Quantitative research is a systematic empirical investigation using statistical, mathematical or computational techniques. Such methods usually start with data collection and then proceed to statistical analysis using various methods. The following are some of the research methods used to carry out business research.

(i) Survey research


Survey research is one of the most widely used methods to gather data, especially for conducting business research. Surveys involve asking various survey questions to a set of respondents through media such as online polls, online surveys, questionnaires, etc. Nowadays, most major corporations use this method to gather data and use it to understand the market and make appropriate business decisions. Various types of surveys are used to conduct survey research: cross-sectional surveys, which collect data from a set of respondents at a given point in time, and longitudinal surveys, which collect data from a set of respondents across several time periods in order to understand changes in the respondents' behavior. With the advancement of technology, surveys can now be sent online through email or social media.
For example: A company wants to know the NPS score for its website, i.e. how satisfied the people visiting its website are. An increase in traffic to the website, or the audience spending more time on it, can result in higher rankings on search engines, which will enable the company to get more leads as well as increase its visibility. Hence, the company can ask people who visit its website a few questions through an online survey to understand their opinions or gain feedback, and then make appropriate changes to the website to increase satisfaction.
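As a minimal illustration of how such a score might be computed (a sketch in Python, assuming the conventional 0-10 NPS scale in which 9-10 are promoters, 7-8 passives and 0-6 detractors; the response values are invented for illustration):

    def nps(scores):
        """Net Promoter Score: % promoters minus % detractors."""
        if not scores:
            raise ValueError("no survey responses")
        promoters = sum(1 for s in scores if s >= 9)    # rated 9 or 10
        detractors = sum(1 for s in scores if s <= 6)   # rated 0 to 6
        return 100.0 * (promoters - detractors) / len(scores)

    # Hypothetical responses collected from a website survey
    responses = [10, 9, 8, 6, 10, 7, 3, 9, 8, 10]
    print(nps(responses))   # 5 promoters, 2 detractors -> 30.0

A positive score means promoters outnumber detractors; tracking it over repeated surveys shows whether website changes are actually improving satisfaction.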

(ii) Correlation research


Correlation research is conducted to understand the relationship between two entities and what impact each one of them has on the other. Using mathematical analysis methods, correlational research enables the researcher to correlate two or more variables. Such research can help reveal patterns, relationships, trends, etc. One variable can also be manipulated to observe its effect on the other. Generally, a conclusion cannot be drawn on the basis of correlational research alone.
For example: Research can be conducted to understand the relationship between colors and gender-based audiences. Using such research and identifying the target audience, a company can choose to produce particular color products to be released in the market. This can enable the company to understand the supply and demand requirements of its products.
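As a rough sketch of the computation involved (Python with NumPy; the paired figures for monthly advertising spend and units sold are purely hypothetical):

    import numpy as np

    # Hypothetical paired observations for two variables
    ad_spend   = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])   # monthly ad spend
    units_sold = np.array([120, 150, 155, 190, 210, 230])   # units sold

    # Pearson correlation coefficient: covariance of the two variables
    # scaled by the product of their standard deviations, in [-1, +1]
    r = np.corrcoef(ad_spend, units_sold)[0, 1]
    print(round(r, 2))   # close to +1, i.e. a strong positive association

A value of r near +1 or -1 indicates a strong linear association, but, as noted above, association alone does not establish cause and effect.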

(iii) Causal-Comparative research


Causal-comparative research is a method based on comparison. It is used to deduce the cause-and-effect relationship between variables. Sometimes also known as quasi-experimental research, it involves establishing an independent variable and analyzing its effects on the dependent variable. In such research no manipulation is done; instead, changes are observed in the variables or groups under the influence of the same conditions. Drawing conclusions from such research is a little tricky, as independent and dependent variables will always exist in a group; hence, all other parameters have to be taken into consideration before drawing any inferences from the research.


For example: Research can be conducted to analyze the effect of good educational facilities in rural areas. Such a study compares the changes in a group of people from rural areas after they are provided with good educational facilities against their situation before. Another example is to analyze the effect of building dams on the farmers or on the production of crops in that area.
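A minimal sketch of how the two groups in such a study might be compared statistically (Python with SciPy; an independent-samples t-test on invented exam scores, used here purely for illustration):

    from scipy import stats

    # Hypothetical exam scores for students from villages with and
    # without the improved educational facilities (illustrative only)
    with_facilities    = [62, 71, 58, 75, 69, 66, 73, 70]
    without_facilities = [55, 49, 61, 52, 58, 47, 60, 54]

    # Welch's independent-samples t-test (does not assume equal variances)
    t_stat, p_value = stats.ttest_ind(with_facilities, without_facilities,
                                      equal_var=False)
    print(t_stat, p_value)

A small p-value suggests the group means differ, but because group membership was not randomly assigned, other parameters must still be considered before inferring cause, as noted above.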

(iv) Experimental research


Experimental research is based on trying to prove a theory. Such research may be useful in business research as it can reveal behavioral traits of a product's consumers, which can lead to more revenue. In this method, an experiment is carried out on a set of subjects to observe, and later analyze, their behavior when exposed to certain parameters.
For example: Experimental research was conducted to understand whether particular colors have an effect on consumers' hunger. A set of subjects was exposed to those particular colors while they were eating, and the subjects were observed. It was seen that certain colors like red or yellow increase hunger. Hence, such research was a boon to the hospitality industry. You can see many food chains like McDonald's, KFC, etc. using such colors in their interiors, brands, as well as packaging. Another inference drawn from experimental research, used widely by most bars and pubs across the world, is that loud music makes a person drink more in less time. This was proven through experimental research and was a key finding for many business owners across the globe.
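As a rough sketch of how the results of such an experiment might be analyzed (Python with SciPy; a chi-square test of independence on an invented 2x2 table of music volume versus repeat drink orders):

    from scipy.stats import chi2_contingency

    # Hypothetical counts: rows are the experimental conditions,
    # columns are "ordered another drink: yes / no" (illustrative only)
    observed = [[48, 32],   # loud music
                [29, 51]]   # quiet music

    chi2, p_value, dof, expected = chi2_contingency(observed)
    print(chi2, dof, p_value)

A small p-value indicates that drinking behavior is associated with the music condition in this made-up data set, which is the kind of evidence such experiments look for.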

(v) Online research / Literature research


Literature research is one of the oldest methods available. It is very economical, and a lot of information can be gathered using such research. Online research or literature research involves gathering information from existing documents and studies, which can be available in libraries, annual reports, etc. Nowadays, with the advancement of technology, such research has become even simpler and accessible to everyone. An individual can directly research online for any information that is needed, which will give him in-depth information about the topic or the organization. Such research is used mostly by marketing and salespeople in the business sector to understand the market or their customers. Such research is carried out using existing information from various sources, although care has to be taken to validate the sources from which the information is collected.
For example: A salesperson has heard that a particular firm is looking for a solution which his company provides. Hence, the salesperson will first search for a decision maker in that company, investigate which department he is from, and understand what the target company is looking for and what it does. Using this research, he can tailor his solution to be spot on when he pitches it to this client. He can also reach out to the customer directly by finding a means to communicate with him through online research.

1.3.1.2 Qualitative research methods


Qualitative research is a method of high importance in business research. Qualitative research involves obtaining data through open-ended, conversational means of communication. Such research enables the researcher to understand not only what the audience thinks but also why they think it. In such research, in-depth information can be gathered from the subjects depending on their responses. There are various types of qualitative research methods, such as interviews, focus groups, ethnographic research, content analysis and case study research, that are widely used. Such methods are of very high importance in business research as they enable the researcher to understand the consumer. Understanding what motivates the consumer to buy, and what does not, is what leads to higher sales, and that is the prime objective of any business.
Following are a few methods that are widely used in today's world by most businesses.


(i) Interviews
Interviews are somewhat similar to surveys; sometimes they may even use the same questions. The difference is that the respondent can answer the open-ended questions at length, and the direction of the conversation or the questions being asked can be changed depending on the responses of the subject. Such a method usually gives the researcher detailed information about the perspectives or opinions of the subject. Carrying out interviews with subject matter experts can also give important information critical to some businesses.
For example: An interview was conducted by a telecom manufacturer with a group of women to understand why it had fewer female customers. After interviewing them, the researcher understood that there were fewer feminine colors in some of the models, hence women preferred not to purchase them. Such information can be critical to a business such as a telecom manufacturer, and it can be used to increase market share by targeting women customers through the launch of some feminine colors in the market.
Another example would be to interview a subject matter expert in social media marketing. Such an interview can enable a researcher to understand why certain types of social media advertising strategies work for a company and why some of them don't.

(ii) Focus groups


A focus group is a set of individuals selected specifically to understand their opinions and behaviors. It is usually a small group, selected keeping in mind the parameters of the target market audience, that discusses a particular product or service. Such a method gives a researcher a larger sample than an interview or a case study, while taking advantage of conversational communication. Nowadays, focus groups can also be sent online surveys to collect data and answer why, what and how questions. Such a method is very useful for testing new concepts or products before they are launched in the market.
For example: Research is conducted with a focus group to understand which screen size is preferred most by the current target market. Such a method can enable a researcher to dig deeper into whether the target market focuses more on the screen size, features or colors of the phone. Using this data, a company can make wise decisions about its product line and secure a higher market share.

(iii) Ethnographic research


Ethnographic research is one of the most challenging research methods, but it can give extremely precise results. Such research is used quite rarely, as it is time-consuming and can be expensive as well. It requires the researcher to adapt to the natural environment of the target audience and observe them to collect data. Such a method is generally used to understand cultures, challenges or other things that occur in that particular setting.
For example: The well-known show "Undercover Boss" is an apt example of how ethnographic research can be used in business. In this show, a senior manager of a large organization works in his own company as a regular employee to understand what improvements can be made, what the culture in the organization is, and to identify hard-working employees and reward them. It can be seen that the researcher has to spend a good amount of time in the natural setting of the employees and adapt to their ways and processes. While observing in this setting, the researcher can find out the information he needs first hand, without loss of information or any bias, and improve certain things that would impact his business.


(iv) Case study research


Case study research is one of the most important methods in business research. It is also used as marketing collateral by most businesses to land more clients. Case study research is conducted to assess customer satisfaction and to document the challenges that were faced and the solutions that the firm provided. From these, inferences are made to point out the benefits that the customer enjoyed by choosing that specific firm. Such research is widely used in other fields like education, the social sciences, and similar. Case studies are provided by businesses to new clients to showcase their capabilities, and hence such research plays a crucial role in the business sector.
For example: A services company has provided a testing solution to one of its clients. Case study research is conducted to find out what challenges were faced during the project, what the scope of the work was, what objective was to be achieved and what solutions were given to tackle the challenges. The study can end with the benefits that the company provided through its solutions, like reduced time to test batches, easy implementation or integration of the system, or even cost reduction. Such a study showcases the capability of the company, and hence it can be presented as empirical evidence to the new prospect.

(v) Website visitor profiling/research


Website intercept surveys, or website visitor profiling/research, is a newer approach that is quite helpful in the business sector. It is an innovative way to collect direct feedback from website visitors using surveys. In recent times a lot of business generation happens online, and hence it is important to understand the visitors of your website, as they are your potential customers. Collecting feedback is critical to any business, as without understanding a customer no business can be successful. A company has to keep its customers satisfied and try to turn them into loyal customers in order to stay on top.
A website intercept survey is an online survey that allows you to target visitors to understand their intent and collect feedback to evaluate the customers' online experience. Information like visitor intention, behavior path and satisfaction with the overall website can be collected this way.
Depending on what information a company is looking for, multiple forms of website intercept surveys can be used to gather responses. Some of the popular ones are pop-ups (also called modal boxes) and on-page surveys.
For example: A prospective customer is looking for a particular product that a company is selling. Once he is directed to the website, an intercept survey will start noting his intent and path. Once the transaction has been made, a pop-up or an on-page survey is presented to the customer to rate the website. Such research enables the researcher to put this data to good use: to understand the customer's intent and path and to improve any parts of the website depending on the responses, which in turn leads to satisfied customers and hence higher revenues and market share.

1.3.1.3 Some Other Types of Research: All other types of research are variations of one or more of the above approaches, based on the purpose of the research, the time required to accomplish it, the environment in which it is done, or some other similar factor.
• One Time Research: From the point of view of time, we can think of research either as one-
time research or longitudinal research. In the former case the research is confined to a single
time-period, whereas in the latter case the research is carried on over several time-periods.
• Laboratory Research: Research can be field-setting research, laboratory research or simulation research, depending upon the environment in which it is to be carried out. Research can also be understood as clinical or diagnostic research. Such research follows case-study methods or in-depth approaches to reach the basic causal relations. Such studies usually go deep into the causes of things or events that interest us, using very small samples and very deep probing data-gathering devices.


• Exploratory Research: The research may be exploratory or it may be formalized. The objective
of exploratory research is the development of hypotheses rather than their testing, whereas
formalized research studies are those with substantial structure and with specific hypotheses to
be tested.
• Historical Research: Historical research is that which utilizes historical sources like documents,
remains, etc., to study events or ideas of the past, including the philosophy of persons and groups
at any remote point of time.
• Conclusion-oriented Research: Research can also be classified as conclusion-oriented and
decision-oriented. While doing conclusion-oriented research, a researcher is free to pick up a
problem, redesign the enquiry as he proceeds and is prepared to conceptualize as he wishes.
Decision-oriented research is always carried out for the needs of a decision maker, and the researcher in
this case is not free to embark upon research according to his own inclination. Operations research
is an example of decision-oriented research, since it is a scientific method of providing executive
departments with a quantitative basis for decisions regarding operations under their control.

1.3.1.4. Advantages of Business research


• Business research helps to identify opportunities and threats.
• It helps identify problems and using this information, wise decisions can be made to tackle the
issue appropriately.
• It helps to understand customers better and hence can be useful to communicate better with the
customers or stakeholders.
• Risks and uncertainties can be minimized by conducting business research in advance.
• Financial outcomes and investments that will be needed can be planned effectively using
business research.
• Such research can help track competition in the business sector.
• Business research can enable a company to make wise decisions as to where to spend and how
much.
• Business research can enable a company to stay up-to-date with the market and its trends and
appropriate innovations can be made to stay ahead in the game.
• Business research helps to measure reputation

1.3.1.5. Disadvantages of Business research


• Business research can be a high-cost affair
• Most of the time, business research is based on assumptions
• Business research can be time-consuming
• Business research can sometimes give you inaccurate information, because of a biased
population or a small focus group.
• Business research results can quickly become obsolete because of the fast-changing markets

1.4. Selection and Formulation of Research Problem

Problem means a question or an issue to be examined. A research problem refers to some
kind of problem which a researcher experiences or observes in the context of either a theoretical
or practical situation. The researcher has to find a suitable course of action by which the
objective can be attained optimally in the context of the given environment. Thus, the selection of a
research problem has high value to society, and the researcher must be able to identify those
problems that need an urgent solution.


1.4.1 Various Aspects of a Research Problem

For an effective formulation of the problem following aspects of the problem are to
be considered by the researcher.
(i) Definition of the problem: Before one takes up a problem for study, one needs to define it
properly. The issues for inquiry are to be identified clearly and specified in detail. If any
existing theoretical framework is to be tested, the particular theorem or theories must be identified.
Similarly, if any assumptions are made or special terms used, their meaning must be made clear. As far
as possible, the statement of the problem should leave no scope for ambiguity.
(ii) Scope of the problem: The research scholar has to fix the four walls of the study and identify
which aspects he is trying to prove. Taking the example of industrial sickness, he should specify
(1) whether the study extends to all types of small-scale industries or is limited to only a few of
them, and (2) whether the study is limited to finding the causes of sickness or also to prescribing
remedies.
(iii) Justification of the problem: Many a time research studies are put to the test of justification
or relevance. Among problems of equal scientific curiosity, the problem that needs an urgent
solution must be given preference.
(iv) Feasibility of the problem: Although a problem may need urgent attention and be justifiable in
several respects, one has to consider its feasibility, i.e., the possibility of conducting the study
successfully. The elements of time, data and cost are to be taken into consideration before a topic
is selected for study.
(v) Originality of the problem: In social sciences, particularly in commerce and management,
there is no systematic compilation of the works already done or in progress. Two people may
be working on more or less similar topics. In such situations it is not advisable to continue
work in the same manner; instead, each of them should try to focus on different aspects, so that
both enrich the field of knowledge with their studies. Another problem a researcher faces is that
the problem he intends to study has already been worked out. Whether he should repeat it or not
depends upon the situation or circumstances which engage his attention.

1.4.2 Requisites or Characteristics of a Good Research Problem

• The problem can be stated clearly and concisely.


• The problem generates research questions.
• It is grounded in theory.
• It relates to one or more academic fields of study.
• It has a base in the research literature.
• It has potential significance/importance.
• It is do-able within the time frame and budget.
• Sufficient data are available or can be obtained.
• The researcher’s methodological strengths can be applied to the problem.
• The problem is new; it is not already answered sufficiently

1.5 Formulation of Research hypothesis

Ordinarily, when one talks about hypothesis, one simply means a mere assumption or
some supposition to be proved or disproved. But for a researcher hypothesis is a formal
question that he/she intends to resolve. Thus, a hypothesis may be defined as a proposition, or a
set of propositions, set forth as an explanation for the occurrence of some specified group of
phenomena, either asserted merely as a provisional conjecture to guide some investigation or
accepted as highly probable in the light of established facts. Quite often a research hypothesis
is a predictive statement, capable of being tested by scientific methods, that relates an
independent variable to some dependent variable.

1.5.1 Characteristics of hypothesis


Hypothesis must possess the following characteristics:
(i) Hypothesis should be clear and precise. If the hypothesis is not clear and precise, the
inferences drawn on its basis cannot be taken as reliable.
(ii) Hypothesis should be capable of being tested. Many a time research programmes have bogged
down in a swamp of untestable hypotheses. Some prior study may be done by the researcher in
order to make the hypothesis a testable one. A hypothesis “is testable if other deductions can
be made from it which, in turn, can be confirmed or disproved by observation.”
(iii) Hypothesis should state relationship between variables, if it happens to be a relational
hypothesis.
(iv) Hypothesis should be limited in scope and must be specific. A researcher must
remember that narrower hypotheses are generally more testable and he should develop
such hypotheses.
(v) Hypothesis should be stated as far as possible in most simple terms so that the same is
easily understandable by all concerned. But one must remember that simplicity of
hypothesis has nothing to do with its significance.
(vi) Hypothesis should be consistent with most known facts i.e., it must be consistent with a
substantial body of established facts. In other words, it should be one which judges
accept as being the most likely.
(vii) Hypothesis should be amenable to testing within a reasonable time. One should not use
even an excellent hypothesis, if the same cannot be tested in reasonable time for one
cannot spend a life-time collecting data to test it.
(viii) Hypothesis must explain the facts that gave rise to the need for explanation. This means
that by using the hypothesis plus other known and accepted generalizations, one should
be able to deduce the original problem condition. Thus hypothesis must actually explain
what it claims to explain; it should have empirical reference.

1.5.2 Different Types of Hypothesis


• Descriptive Hypothesis – Describes the characteristics of a variable (which may be an object,
person, organisation, event, or situation). E.g., employment opportunities of commerce graduates
are greater than those of arts graduates.
• Relational Hypothesis – Establishes a relationship between two variables; the relationship may be
positive, negative or nil. E.g., high income leads to high savings.
• Causal Hypothesis – The change in one variable leads to a change in another variable,
i.e., there are dependent and independent variables: one variable is the cause and the other the
effect.
• Statistical Hypothesis – An association or difference between two variables is hypothesized.
• Null Hypothesis – It states that there is no difference between two populations in respect
of some property.
• Alternative Hypothesis – When we reject the null hypothesis, we accept another hypothesis
known as the alternative hypothesis (a worked sketch of the null/alternative pair follows this list).
• Working Hypothesis – Provisionally accepted as a basis for further research in the hope that
a tenable theory will be produced, even if the hypothesis ultimately fails. In this way, a
working hypothesis is an accepted starting point for further research.
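
To make the null/alternative pair concrete, here is a minimal Python sketch (the savings figures are
invented purely for illustration) that tests the null hypothesis of equal mean savings in two income
groups against the alternative that they differ:

# Illustrative sketch: testing a null hypothesis with a two-sample t-test.
# All figures are invented for demonstration only.
from scipy import stats

high_income_savings = [520, 610, 580, 700, 650, 590, 630]
low_income_savings = [310, 290, 400, 350, 330, 380, 300]

# H0 (null): the two groups have equal mean savings.
# H1 (alternative): the mean savings of the two groups differ.
t_stat, p_value = stats.ttest_ind(high_income_savings, low_income_savings)

alpha = 0.05
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H0 and accept the alternative hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject H0")

A small p-value here leads to rejecting the null hypothesis in favour of the alternative, exactly as
described in the list above.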

1.5.3 Role of hypothesis


In social science research, a hypothesis serves several important functions.
1. A hypothesis guides the direction of the study or investigation; it states what we are looking for.
2. Its purpose is to include in the investigation all available and pertinent data, either to prove
or disprove the hypothesis.
3. Without a hypothesis, research becomes unfocussed or random, and useless data may be collected
in the hope that no important data is omitted.
4. A hypothesis specifies the sources of data, which data shall be studied, and in what context they
shall be studied.
5. It also determines the data needs and prevents a blind search.
6. A hypothesis can suggest the type of research that is likely to be appropriate for studying a
given problem.
7. It determines the most appropriate technique of data analysis.
8. A hypothesis can contribute to the development of theory through the testing of various
hypotheses relating to a stated theory; it is also likely, in some cases, that a hypothesis helps
in constructing a theory.

1.5.4 Sources of hypotheses


Hypotheses may be developed from a variety of sources. Some of them are as follows:-
(i) Observation: Hypotheses can be derived from observation. Relation between production,
cost and output of goods or relationship between price variation and demand are hypothesized
from observation.
(ii) Culture: A very important and major source of hypotheses is the culture or the socio-
economic background in which a researcher has grown. Hypotheses regarding relationships
between caste and family size, or income level and education level depend on the socio-
economic background. In India, caste system plays an important role in determining socio-
economic status, the same may not be the case of some other country.
(iii) Analogies: They are often a source of meaningful hypotheses, e.g., the hypotheses that similar
human types or activities may be found occupying the same territory has come from plant
ecology. Analogy is very suggestive. But one has to be careful in adopting models from other
disciplines. Economic theory has adopted a few models from physics also.
(iv) Theory: Theory is an extremely fertile seed bed of hypotheses. A theory represents what is
known and logical deductions from the theory lead to new hypotheses, which must be true
if the theory is true, e.g., various hypotheses are derived from the theory based on profit
maximization as the aim of a private enterprise. New hypotheses may be derived from the
established theory by method of logical induction or logical deduction.
(v) Findings of other studies: Hypotheses may also be developed from the findings of other
studies. This can happen when a study is repeated under different circumstances or different
time periods or for a different type of population. The findings of exploratory studies may be
formulated as hypotheses for other structured studies which aim at testing a hypothesis, e.g.,
the concept of trickledown effect of economic growth, later on becomes a testable hypotheses.
(vi) Level of knowledge: An important source of hypotheses is the state of knowledge in any
particular science. Hypotheses can be deduced from existing formal theories. If the
hypotheses are rejected, the theory can be modified. If formal theories do not exist or are
scarce, which happens in the case of a new science, hypotheses are generated from the formal
conceptual framework. This leads to the growth of theory. The growth of the statistical theory of
sampling, as also the development of theories of economic growth, illustrates this point. In either
case, the hypotheses are related to the conceptual-theoretical level.
(vii) Continuity of research: Continuous research in a field is itself an important source of
hypotheses. The rejection of hypotheses leads to the formulation of new ones, which explain the
relationships between variables in subsequent studies on the same subject. In short, an ideal
source of fruitful and relevant hypotheses is a fusion of two elements: past experience and
imagination in the disciplined mind of the scientist.

1.6 Operational Definition of Concepts


Research studies usually include terms that must be carefully and precisely defined, so
that others know exactly what has been done and there are no ambiguities. Two types of
definitions can be given: conceptual definitions and operational definitions.
Simply, a conceptual definition explains what to measure or observe (what a word or a
term means for your study), and an operational definition defines exactly how to measure or
observe it.
For example, in a study of stress in students during a university semester, a conceptual
definition would describe what is meant by 'stress', while an operational definition would describe
how the 'stress' would be measured. Consider another example, concerning temperature: an operational
definition would explain how the temperature is measured (the thermometer type, how the thermometer
was positioned, how long it was left in the water, and so on), whereas a conceptual definition might
describe only the scientific definition of temperature.

It is critical to operationally define a variable to lend credibility to the methodology and to ensure
the reproducibility of the study's results. Otherwise, another study may define the same variable
differently, making it difficult to compare the results of the two studies.

An operational definition serves four purposes:

• It establishes the rules and procedures the researcher uses to measure the variable.
• It provides unambiguous and consistent meaning to terms/variables that can be interpreted
differently.
• It makes the collection of data and analysis more focused and efficient.
• It guides what type of data and information we are looking for.
By operationally defining a variable, a researcher can communicate a common methodology to
another researcher. Operational definitions lay down the ground rules and procedures that the
investigator will use to observe and record behavior and write down facts without bias. The sole
purpose of defining the variables operationally is to keep them unambiguous, thereby reducing
errors.
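
As a simple illustration of the idea, the Python sketch below operationally defines 'stress' as the
average of five Likert-scale survey items. The items, the 1-5 scale and the averaging rule are
hypothetical choices made only for demonstration; an actual study would state and justify its own
measurement rule:

# A minimal sketch of an operational definition of 'stress':
# the mean of five Likert items (1 = never, 5 = very often). Hypothetical rule.
def stress_score(item_responses):
    """Operational definition: average of five 1-5 Likert survey items."""
    assert len(item_responses) == 5, "expects exactly five item responses"
    return sum(item_responses) / len(item_responses)

respondent = [4, 3, 5, 4, 2]  # invented answers from one student
print(f"Operationalised stress score: {stress_score(respondent):.2f}")  # 3.60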
Concepts
A concept is a mental construct or a tool used to understand the world around us. Examples
of concepts include intelligence, humor, motivation and desire. These terms have meaning, but they
cannot be seen or observed directly. You cannot pick up intelligence, buy humor, or weigh either of
these. However, you can tell when someone is intelligent or has a sense of humor.
This is because constructs are observed indirectly through behaviors, which provide evidence of the
construct. For example, someone demonstrates intelligence through their academic success, how
they speak, and so on. A person can demonstrate humor by making others laugh through what they say.
Concepts represent things around us that we want to study as researchers.
Defining Concepts


To define a concept for the purpose of research requires the following three things:
• A manner in which to measure the concept indirectly
• A unit of analysis
• Some variation among the units of analysis
The criteria listed above essentially make up a conceptual definition. Below is an example
of a conceptual definition of academic dishonesty:
Academic dishonesty is the extent to which individuals exhibit a disregard towards educational
norms of scholarly integrity.
A breakdown of this definition:
• Measurement: exhibit a disregard towards educational norms of scholarly integrity
• Unit of analysis: individual
• Variation: extent to which
It becomes much easier to shape a research study with these three components.
Measurement Models
A concept is not measured directly, as has already been mentioned. This means that when it is time
to analyze our data, our construct is a latent or unobserved variable. The items on the survey are
observed because people gave us this information directly. This means that the survey items are
observed variables.
The measurement model links the latent variables with the observed variables statistically. A strong
measurement model indicates that the observed variables correlate with the underlying latent
variable or construct.
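
One common way of checking how strongly a set of observed items reflects a single latent construct is
an internal-consistency statistic such as Cronbach's alpha. The Python sketch below computes it on
invented responses; it is offered only as an illustration of the observed-to-latent link, not as a
complete measurement model:

# Hedged sketch: internal consistency of three observed items assumed to
# indicate one latent construct. All responses are invented.
import numpy as np

# rows = respondents, columns = observed items (e.g., 1-5 Likert scores)
items = np.array([
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [4, 4, 5],
])

k = items.shape[1]  # number of observed items
sum_item_variances = items.var(axis=0, ddof=1).sum()
total_score_variance = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - sum_item_variances / total_score_variance)
print(f"Cronbach's alpha = {alpha:.2f}")  # values near 1 suggest the items hang together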

1.7 Review of Literature


“A literature review is a description of the literature relevant to a particular field or topic. It
gives an overview of what has been said, who the key writers are, what are the prevailing theories
and hypotheses, what questions are being asked, and what methods and methodologies are
appropriate and useful”.
A literature review is not just a summary of everything you have read on the topic. It is a
critical analysis of the existing research relevant to your topic, and you should show how the
literature relates to your topic and identify any gaps in the area of research.
The purpose of a literature review is to gain an understanding of the existing research and debates
relevant to a particular topic or area of study, and to present that knowledge in the form of a written
report. Conducting a literature review helps you build your knowledge in your field.
You’ll learn about important concepts, research methods, and experimental techniques that
are used in your field. You’ll also gain insight into how researchers apply the concepts you’re
learning in your unit to real world problems. Another great benefit of literature reviews is that as
you read, you’ll get a better understanding of how research findings are presented and discussed in
your particular discipline. If you pay attention to what you read and try to achieve a similar style,
you’ll become more successful at writing for your discipline.

1.7.1 Evaluating the sources


A literature review should not just be a summary of each source. That would be more like an
annotated bibliography. Instead, you need to:
• Compare and contrast each source to other relevant literature on the topic
• Critically evaluate each source
• Indicate how each source contributes to the body of knowledge about the topic
• Integrate your discussion of the sources into your argument about the state of knowledge on
the topic. You can also organize your literature review report in a way that demonstrates your
evaluation of the sources in terms of how each one relates to other sources and to the major
debates on the topic
• The primary purpose of a traditional or narrative literature review is to analyse and
summarise a body of literature. This is achieved by presenting a comprehensive background
of the literature within the topic of interest to highlight new research streams, identify gaps or
recognize inconsistencies. This type of literature review can help in refining, focusing and
shaping research questions as well as in developing theoretical and conceptual frameworks
(Coughlan et al., 2007).
• The systematic literature review in contrast undertakes a more rigorous approach to
reviewing the literature, perhaps because this type of review is often used to answer highly
structured and specific research questions.
• The meta-analysis literature review involves taking the findings from the chosen literature
and analyzing these findings by using standardized statistical procedures (Coughlan et al.,
2007). Polit and Beck (2006) argue that meta-analysis methods help in drawing conclusions
and detecting patterns and relationships between findings.
• They also discuss meta-synthesis, which is a non-statistical procedure; instead it evaluates
and analyses findings from qualitative studies and aims to build on previous
conceptualizations and interpretations.

1.7.2 Need of literature Review


Our aim is to provide guidance on undertaking a traditional literature review, concentrating
on the context of doing a more traditional and critical literature review rather than a systematic
literature review. Literature reviews are important for numerous reasons.
First, by undertaking a literature review, the information gathered from credible articles or studies
that are relevant, important and valid can be summarised into a document (for example, a thesis
or a dissertation). This then allows the rationale or reason for a study to emerge, which
may include a justification for a specific research approach (McGhee et al, 2007).
Second, it provides a starting point for researchers where they are required to identify and
understand what has been written about a particular area. That will usually mean reading all the
relevant texts and then going through each to summarise, evaluate, critically review, synthesise
and compare these research studies in their chosen area.
Third, carrying out a literature review not only highlights the gaps in knowledge but also ensures
that students, researchers and managers alike are not replicating or repeating previous work - it
identifies discrepancies, knowledge gaps and inconsistencies in the literature.
Lastly, it can support “clarity in thinking about concepts and possible theory development”

1.8 Business Research Data


Research data comes in many different formats and is gathered using a wide variety of
methodologies. In this module, we will provide you with a basic definition and understanding of
what research data are. We'll also explore how data fits into the scholarly research process.
Many people think of data-driven research as something that primarily happens in the sciences. It is
often thought of as involving a spreadsheet filled with numbers. Both of these beliefs are incorrect.
Research data are collected and used in scholarship across all academic disciplines and, while
it can consist of numbers in a spreadsheet, it also takes many different formats, including videos,
images, artifacts, and diaries. Whether a psychologist collecting survey data to better understand
human behavior, an artist using data to generate images and sounds, or an anthropologist using audio
files to document observations about different cultures, scholarly research across all academic fields
is increasingly data-driven.

Official Definition of Research Data


U.S. Office of Management & Budget explained that, “Research data, unlike other types of information,
is collected, observed, or created, for purposes of analysis to produce original research results”
According to University of Edinburgh, "Research data is a recorded factual material commonly accepted
in the scientific community as necessary to validate research findings..."
National Endowment for the Humanities defined about research data as, "they are materials generated or
collected during the course of conducting research."

1.8.1 Research Data Formats


Research data takes many different forms. Data may be intangible, as in measured numerical values
found in a spreadsheet, or an object, as in physical research materials such as samples of rocks,
plants, or insects. Here are some examples of the formats that data can take:
• Documents (text, MS Word), spreadsheets
• Lab notebooks, field notebooks, diaries
• Questionnaires, transcripts, surveys
• Codebooks
• Experimental data
• Films, audio or video tapes/files
• Photographs, image files
• Sensor readings
• Test responses
• Artifacts, specimens, physical samples
• Models, algorithms, scripts
• Content analysis
• Focus group recordings; interview notes
1.8.2 Types of Data
A) Primary data
(i) Primary data means first-hand information collected by an investigator.
(ii) It is collected for the first time.
(iii) It is original and more reliable.
(iv) For example, the population census conducted by the government of India after every ten
years is primary data.
B) Secondary data
(i) Secondary data refers to second-hand information.
(ii) It is not originally collected and rather obtained from already published or unpublished
sources.
(iii) For example, the address of a person taken from the telephone directory or the phone
number of a company taken from websites like Just Dial is secondary data.

1.8.3 Data sources


The sources of data can be classified into two types: statistical and non-statistical. Statistical
sources refer to data that are gathered for official purposes and incorporate censuses and officially
administered surveys. Non-statistical sources refer to data collected for other administrative
purposes or for the private sector.

The following are the two sources of data:


• Internal sources
When data is collected from reports and records of the organisation itself, they are known as the
internal sources.
For example, a company publishes its annual report on profit and loss, total sales, loans, wages, etc.


• External sources
When data is collected from sources outside the organisation, they are known as external
sources. For example, if a tour and travel company obtains information on Karnataka tourism from
Karnataka Transport Corporation, it would be known as an external source of data.

1.8.3.1 Methods of Data Collection


Primary Data
Primary data is collected through the following tools:
1. Questionnaire: A questionnaire is designed as per the purpose and objectives of the research topic,
to be filled in by the sample population; the responses are then analysed for suitable results.
2. Personal Interview: The team conducts personal interviews with every sample based on the objectives
of the research.
3. Survey: The team conducts on-field surveys to assess the behaviour of the sample population in
accordance with the research objectives.
4. Experiments: The team conducts experiments or randomised controlled experiments to assess the
results.
Secondary Data
Secondary data can be accessed from the following sources (a short data-loading sketch follows the list):
1. Journals: Journals published every year are reliable sources of secondary data, since the work is
verified by a group of scholars and hence can be used in research.
2. Government Databases: The government collects and records data over a period of time which can be
used for analysis. Financial and economic data can be found in the RBI and finance ministry
databases. One can find years of historical data on these databases, many of which are available
to the public.
3. UN Databases: The United Nations collects primary data from its field work and interventions,
which is available to the general public for use. Given its reputed name and reliable verification
systems, this database is another good option for sourcing secondary data.
4. Databases of Analytical Companies: Analytical companies like Bloomberg and Statista have their own
databases and analyses which are open to users and can be cited as verifiable sources in
research.
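
In practice, secondary data downloaded from such databases is loaded into analysis software before use.
The minimal Python sketch below illustrates this; the file name 'gdp_data.csv' and the column name
'gdp_growth' are hypothetical placeholders for whatever the chosen database actually supplies:

# Illustrative only: reading secondary data into a DataFrame for analysis.
# 'gdp_data.csv' is a hypothetical export from a public database.
import pandas as pd

df = pd.read_csv("gdp_data.csv")    # secondary data: collected earlier by someone else
print(df.head())                    # inspect the first few rows
print(df["gdp_growth"].describe())  # quick summary of one variable (assumed column)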

1.8.3.2 Comparison of Primary and Secondary Data


Basis for Comparison | Primary Data | Secondary Data
Meaning | Primary data refers to the first-hand data gathered by the researcher himself. | Secondary data means data collected by someone else earlier.
Data | Real-time data | Past data
Process | Very involved | Quick and easy
Source | Surveys, observations, experiments, questionnaires, personal interviews, etc. | Government publications, websites, books, journal articles, internal records, etc.
Cost effectiveness | Expensive | Economical
Collection time | Long | Short
Specific | Always specific to the researcher's needs. | May or may not be specific to the researcher's need.
Available in | Crude form | Refined form
Accuracy and reliability | More | Relatively less

1.9 Steps involved in Research process

Before embarking on the details of research methodology and techniques, it seems appropriate to
present a brief overview of the research process. The research process consists of a series of
actions or steps necessary to effectively carry out research, and the desired sequencing of these
steps. One should remember that the various steps involved in a research process are not mutually
exclusive; nor are they separate and distinct. They do not necessarily follow each other in any
specific order, and the researcher has to be constantly anticipating at each step the requirements
of the subsequent steps.
However, the following order concerning the various steps provides a useful procedural
guideline regarding the research process: (1) formulating the research problem; (2) extensive
literature survey; (3) developing the hypothesis; (4) preparing the research design; (5)
determining sample design; (6) collecting the data; (7) execution of the project; (8) analysis of
data; (9) hypothesis testing; (10) generalizations and interpretation; and (11) preparation of the
report or presentation of the results, i.e., formal write-up of conclusions reached. A brief
description of the above stated steps will be helpful.

Formulating the research problem:


There are two types of research problems, viz., those which relate to states of nature and
those which relate to relationships between variables. At the very outset the researcher must
single out the problem he wants to study, i.e., he must decide the general area of interest or
aspect of a subject- matter that he would like to inquire into. Initially the problem may be stated
in a broad general way and then the ambiguities, if any, relating to the problem be resolved.
Then, the feasibility of a particular solution has to be considered before a working formulation
of the problem can be set up. The formulation of a general topic into a specific research problem,
thus, constitutes the first step in a scientific enquiry. Essentially two steps are involved in
formulating the research problem, viz., understanding the problem thoroughly, and rephrasing
the same into meaningful terms from an analytical point of view.
The researcher must at the same time examine all available literature to get himself
acquainted with the selected problem. He may review two types of literature—the conceptual
literature concerning the concepts and theories, and the empirical literature consisting of studies
made earlier which are similar to the one proposed. The basic outcome of this review will be
the knowledge as to what data and other materials are available for operational purposes which
will enable the researcher to specify his own research problem in a meaningful context. The
problem to be investigated must be defined unambiguously for that will help discriminating
relevant data from irrelevant ones.


Extensive literature survey:


Once the problem is formulated, a brief summary of it should be written down. It is
compulsory for a research worker writing a thesis for a Ph.D. degree to write a synopsis of the
topic and submit it to the necessary Committee or the Research Board for approval.
At this juncture the researcher should undertake extensive literature survey connected with
the problem. For this purpose, the abstracting and indexing journals and published or
unpublished bibliographies are the first place to go to. Academic journals, conference
proceedings, government reports, books etc., must be tapped depending on the nature of the
problem.

Development of working hypotheses:


After the extensive literature survey, the researcher should state in clear terms the working
hypothesis or hypotheses. A working hypothesis is a tentative assumption made in order to draw
out and test its logical or empirical consequences. As such, the manner in which research
hypotheses are developed is particularly important since they provide the focal point for
research. They also affect the manner in which tests must be conducted in the analysis of data
and indirectly the quality of data which is required for the analysis. In most types of research,
the development of working hypothesis plays an important role. Hypothesis should be very
specific and limited to the piece of research in hand because it has to be tested. The role of the
hypothesis is to guide the researcher by delimiting the area of research and to keep him on the
right track.

Preparing the research design:


The research problem having been formulated in clear cut terms, the researcher will be
required to prepare a research design, i.e., he will have to state the conceptual structure within
which research would be conducted. The preparation of such a design facilitates research to be
as efficient as possible yielding maximal information. In other words, the function of research
design is to provide for the collection of relevant evidence with minimal expenditure of effort,
time and money. But how all these can be achieved depends mainly on the research purpose.
Research purposes may be grouped into four categories, viz., (i) Exploration, (ii) Description,
(iii) Diagnosis, and (iv) Experimentation. A flexible research design which provides
opportunity for considering many different aspects of a problem is considered appropriate if the
purpose of the research study is that of exploration. But when the purpose happens to be an
accurate description of a situation or of an association between variables, the suitable design
will be one that minimizes bias and maximizes the reliability of the data collected and analyzed.

Determining sample design:


All the items under consideration in any field of inquiry constitute a ‘universe’ or
‘population’. A complete enumeration of all the items in the ‘population’ is known as a census
inquiry. It can be presumed that in such an inquiry when all the items are covered no element
of chance is left and highest accuracy is obtained. But in practice this may not be true. Even the
slightest element of bias in such an inquiry will get larger and larger as the number of
observations increases. Moreover, there is no way of checking the element of bias or its extent
except through a resurvey or use of sample checks. Besides, this type of inquiry involves a
great deal of time, money and energy. Not only this, census inquiry is not possible in practice
under many circumstances. For instance, blood testing is done only on sample basis. Hence,
quite often we select only a few items from the universe for our study purposes. The items so
selected constitute what is technically called a sample. The sample design to be used must be
decided by the researcher taking into consideration the nature of the inquiry and other related
factors.
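
As a simple illustration of selecting a sample rather than conducting a census, the short Python
sketch below draws a simple random sample of 200 items from a synthetic sampling frame of 10,000
customer IDs (both the frame and the sample size are invented for demonstration):

# A minimal sketch of simple random sampling from a sampling frame.
import random

random.seed(42)  # fixed seed so the illustration is reproducible
population = [f"CUST-{i:05d}" for i in range(10_000)]  # the 'universe'
sample = random.sample(population, k=200)  # study 200 items instead of all 10,000

print(len(sample), "items selected, e.g.:", sample[:3])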


Collecting the data:


In dealing with any real-life problem it is often found that the data at hand are inadequate, and
hence it becomes necessary to collect data that are appropriate. There are several ways of
collecting the appropriate data, which differ considerably in terms of money cost, time and the
other resources at the disposal of the researcher.
Primary data can be collected either through experiment or through survey. If the researcher
conducts an experiment, he observes some quantitative measurements, or the data, with the help
of which he examines the truth contained in his hypothesis. The researcher should select one of
these methods of collecting the data taking into consideration the nature of investigation,
objective and scope of the inquiry, financial resources, available time and the desired degree of
accuracy.

Execution of the project:


Execution of the project is a very important step in the research process. If the execution
of the project proceeds on correct lines, the data to be collected would be adequate and
dependable. The researcher should see that the project is executed in a systematic manner and
in time. If the survey is to be conducted by means of structured questionnaires, data can be
readily machine-processed. In such a situation, questions as well as the possible answers may
be coded. If the data are to be collected through interviewers, arrangements should be made for
proper selection and training of the interviewers. The training may be given with the help of
instruction manuals which explain clearly the job of the interviewers at each step. Occasional
field checks should be made to ensure that the interviewers are doing their assigned job sincerely
and efficiently. A careful watch should be kept for unanticipated factors in order to keep the
survey as much realistic as possible. This, in other words, means that steps should be taken to
ensure that the survey is under statistical control so that the collected information is in
accordance with the pre-defined standard of accuracy.

Analysis of data:
After the data have been collected, the researcher turns to the task of analysing them. The
analysis of data requires a number of closely related operations such as establishment of
categories, the application of these categories to raw data through coding, tabulation and then
drawing statistical inferences. The unwieldy data should necessarily be condensed into a few
manageable groups and tables for further analysis. Thus, researcher should classify the raw data
into some purposeful and usable categories. Coding operation is usually done at this stage
through which the categories of data are transformed into symbols that may be tabulated and
counted. Editing is the procedure that improves the quality of the data for coding. With coding
the stage is ready for tabulation. Tabulation is a part of the technical procedure wherein the
classified data are put in the form of tables. The mechanical devices can be made use of at this
juncture. A great deal of data, especially in large inquiries, is tabulated by computers.
Computers not only save time but also make it possible to study large number of variables
affecting a problem simultaneously. Analysis work after tabulation is generally based on the
computation of various percentages, coefficients, etc., by applying various well defined
statistical formulae.
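
The coding and tabulation steps described above are routinely carried out with software. The short
Python sketch below, using invented survey responses, classifies raw data into categories and produces
the kind of percentage table on which further analysis is based:

# Hedged sketch of coding and tabulation: invented responses are classified
# and cross-tabulated, then expressed as row percentages.
import pandas as pd

raw = pd.DataFrame({
    "gender": ["M", "F", "F", "M", "F", "M", "F", "M"],
    "satisfaction": ["High", "Low", "High", "High", "Low", "Low", "High", "High"],
})

table = pd.crosstab(raw["gender"], raw["satisfaction"], normalize="index") * 100
print(table.round(1))  # row percentages, ready for the report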

Hypothesis-testing:
After analysing the data as stated above, the researcher is in a position to test the
hypotheses, if any, he had formulated earlier. Do the facts support the hypotheses, or do they
happen to be contrary? This is the usual question which should be answered while testing hypotheses.


Various tests, such as the chi-square test, t-test and F-test, have been developed by statisticians
for this purpose. The hypotheses may be tested through the use of one or more of such tests,
depending upon the nature and object of the research inquiry.
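
As an illustration of one such test, the Python sketch below applies the chi-square test of
independence to an invented 2x2 contingency table of observed frequencies:

# Illustrative chi-square test of independence; the frequencies are invented.
from scipy.stats import chi2_contingency

observed = [[45, 15],   # e.g., buyers vs non-buyers who saw an advertisement
            [30, 30]]   # buyers vs non-buyers who did not see it

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
# a p-value below the chosen significance level -> reject the null hypothesis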

Generalizations and interpretation:


If a hypothesis is tested and upheld several times, it may be possible for the researcher to
arrive at generalisation, i.e., to build a theory. As a matter of fact, the real value of research lies
in its ability to arrive at certain generalisations. If the researcher had no hypothesis to start with,
he might seek to explain his findings on the basis of some theory. It is known as interpretation.
The process of interpretation may quite often trigger off new questions which in turn may lead
to further researches.

Preparation of the report or the Thesis:

i. Finally, the researcher has to prepare the report of what has been done by him. Writing of report
must be done with great care keeping in view the following:
ii. The layout of the report should be as follows: (i) the preliminary pages; (ii) the main text, and (iii)
the end matter.
iii. In its preliminary pages the report should carry title and date followed by acknowledgements and
foreword. Then there should be a table of contents followed by a list of tables and list of graphs
and charts, if any, given in the report.
iv. Report should be written in a concise and objective style in simple language avoiding vague
expressions such as ‘it seems,’ ‘there may be’, and the like.
v. Charts and illustrations in the main report should be used only if they present the information more
clearly and forcibly.
vi. Calculated ‘confidence limits’ must be mentioned and the various constraints experienced in
conducting research operations may as well be stated.

Finally the main text of the report should have the following parts:
a) Introduction: It should contain a clear statement of the objective of the research and an
explanation of the methodology adopted in accomplishing the research. The scope of the study
along with various limitations should as well be stated in this part.
b) Summary of findings: After introduction there would appear a statement of findings and
recommendations in non-technical language. If the findings are extensive, they should be
summarized.
c) Main report: The main body of the report should be presented in logical sequence and broken-
down into readily identifiable sections.
d) Conclusion: Towards the end of the main text, researcher should again put down the results of his
research clearly and precisely. In fact, it is the final summing up.
At the end of the report, appendices should be enlisted in respect of all technical data.
Bibliography, i.e., list of books, journals, reports, etc., consulted, should also be given in the
end. Index should also be given specially in a published research report.


I. Write down short answers for the following:


1. What is research? (Pg. 5)
2. Define the nature and purpose of research. (Pg.7 &8)
3. Describe scientific method.(Pg. 9)
4. What is a research theory?(Pg. 11)
5. Define induction and deduction (pg. 12)
6. Describe research methods.(Pg. 15)
7. What are various aspects of research problems?(Pg. 20)
8. What is hypothesis?(Pg. 20)
9. Define data and data collection.(Pg. 25 & 26)
10. Write down the importance of literature review.(Pg. 24)
II. Provide Detailed Answers:
1. Elaborate types and methods of business research (Pg. 15 to 18)
2. Describe types of hypothesis with example (Pg.21 to 22)
3. Explain different types of data collection methods (Pg. 26 to 28)
4. Describe the steps involved in research process.(Pg.28 to 30)
5. Compare primary and secondary data in research(Pg. 27 & 28)


Unit-II RESEARCH METHODS AND MEASURES 12


Components of research design- definition & types; Research methods- types; Reliability- types of reliability;
Validity- types of validity; Variables- types of variables; Sampling- methods; Methods of collection- tools for
collecting data; Data collection instruments- primary & secondary: in-depth interviews, projective techniques and
focus groups- interview schedule, interview guide, questionnaire, rating scale, sociometry, check list; pretesting
of tools, pilot study. Processing of data: checking, editing, coding, transcription, tabulation, preparation of
tables, graphical representation.

Learning Objectives
• To learn about reliability and validity during research tool construction
• To understand Data processing
• To learn presentation and data tabulation

Learning Outcomes
At the end of the unit, students will be able to:
• Apply different data collection methods in business research
• Apply measuring and scaling techniques
• Present data visually for reporting

DETAILED SESSION PLAN (TOPIC WISE)

1. Types of research; Reliability & Validity
   Mode of teaching: Chalk and board/PPT
   Textbook/Reference: William G Zikmund, Barry J Babin, Jon C. Carr, Atanu Adhikari, Mitch Griffin, Business Research Methods: A South Asian Perspective, 8th Edition, Cengage Learning, New Delhi, 2012.
   Links: NPTEL: https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=17&lesson=20; https://www.youtube.com/watch?v=ur-pIS0CxOg
   Mode of assessment: Quiz

2. Sampling Methods
   Mode of teaching: Chalk and board/PPT
   Textbook/Reference: Donald R. Cooper & Pamela S. Schindler, Business Research Methods, 9th Edition, Tata McGraw Hill Publishing, New Delhi.
   Link: https://www.youtube.com/watch?v=pTuj57uXWlk

3. Data processing & presentation
   Mode of teaching: Illustrations
   Textbook/Reference: Panneerselvam, R., Research Methodology, 2nd Edition, PHI Learning, 2014.
   Link: https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=26&lesson=30


2 RESEARCH METHODS AND MEASURES

2.1.1 Research Design

Design, at a basic level, means planning. Generally, some decisions are to be taken before the actual
action. The research design is the conceptual structure within which research is conducted; it constitutes
the blueprint for the collection, measurement and analysis of data. As such the design includes an outline
of what the researcher will do from writing the hypothesis and its operational implications to the final
analysis of data. Decisions regarding what, where, when, how much, by what means concerning an inquiry
or a research study constitute a research design. It is a process of deliberate anticipation directed towards
bringing an expected situation under control. Thus, research design is the plan and structure of
investigation so conceived as to obtain answers to research questions. The plan is the overall scheme or
program of the research. It includes an outline of what the investigator will do, from the selection of
the research problem to the conclusion of the research study.

A Research Design is simply a structural framework of various research methods as well as techniques
that are utilized by a researcher.

A researcher usually chooses the research methodologies and techniques at the start of the research. The
document that contains information about the technique, methods and essential details of a project is called
a research design. Experts define research design as the glue that holds the research project together. It
(research design) helps provide a structure and direction to the research, yielding favorable results.

2.1.2 Definition of Research Design

According to William Zikmund :


"Research design is defined as a master plan specifying the methods and procedures for collection and
analyzing the needed information."

According to Kerlinger :
"Research design is the plan, structure, and strategy of investigation conceived so as to obtain answers to
research questions and to control variance".

According to Selltiz et al. :


"A research design is the arrangement of conditions for collection and analysis of data in a manner that
aims to combine relevance to the research purpose with economy in procedure".

2.1.3 Characteristics of Research Design

A proper design sets your study up for success. Successful research studies provide insights that are
accurate and unbiased. You’ll need to create a survey that meets all of the main characteristics of a design.
There are four key characteristics:

• Neutrality: When you set up your study, you may have to make assumptions about the data you expect
to collect. The results projected in the research should be free from bias and neutral. Understand
opinions about the final evaluated scores and conclusions from multiple individuals and consider those
who agree with the results.


• Reliability: With regularly conducted research, the researcher expects similar results every time.
You’ll only be able to reach the desired results if your design is reliable. Your plan should indicate
how to form research questions to ensure the standard of results.
• Validity: There are multiple measuring tools available. However, the only correct measuring tools are
those which help a researcher in gauging results according to the objective of the research.
The questionnaire developed from this design will then be valid.
• Generalization: The outcome of your design should apply to a population and not just a
restricted sample. A generalized method implies that your survey can be conducted on any part of a
population with similar accuracy.

The above factors affect how respondents answer the research questions, so a good design should
balance all of these characteristics.

2.1.4 Purpose (Need) of research design

1) Reduces Cost:
Research design is needed to reduce the excessive costs in terms of time, money and effort by planning
the research work in advance.
2) Facilitate the Smooth Scaling:
In order to perform the process of scaling smoothly, an efficient research design is of utmost importance.
It makes the research process effective enough to give maximum relevant outcome in an easy way.
3) Helps in Relevant Data Collection and Analysis:
Research design helps the researchers in planning the methods of data collection and analysis as per the
objective of research. It is also responsible for the reliable research work as it is the foundation for entire
research. Lack of proper attention in preparation of research design can harm the entire research work.
4) Assists in Smooth Flow of Research Operations:
Research design is necessary to give better and effective structure to the research. Since all the decisions
are made in advance, therefore, research design facilitates the smooth flow of research operations and
reduces the possible problems of researchers.
5) Helps in Getting Reviews from Experts :
Research design helps in developing an overview about the whole research process and thus assists in
getting responses and reviews from different experts in that field.
6) Provides a Direction to Executives:
Research design directs the researcher as well as the executives involved in the research for giving their
relevant assistance.

2.1.5 Components of Research Design

Research design, when done right, can generate similar results every time it is performed. However,
yielding similar results is only possible if your research design is reliable. Here are some of the
elements/components of a good research design:

• Purpose statement- Statement of research objectives, i.e., why the research project is to be
conducted
• Data collection methods- Methods and procedures used for collection of data, Constitution of
sample size and its procedure out of total population
• Techniques of data analysis- tools and techniques used to analyse data
• Types of research methodologies-
• Challenges of the research


• Prerequisites required for study


• Duration of the research study- Time, costs, and responsibility specification
• Measurement of analysis- Probable output or research outcomes and expected actions to be taken
based on those outcomes

2.1.5.1 Research design can be split into four phases: In order to understand the research design
concept, we can go through following four phases:
1. The sampling design: It deals with the method of selecting items to be observed for the given study.
2. The observational design: It relates to the conditions under which the observations are to be made.
3. The statistical design: It deals with the question of how many subjects are to be observed and how the
observations are to be analysed.
4. The operational design: It deals with the specific techniques by which the procedures specified in the
sampling, statistical and observational designs can be carried out.

2.1.6 Process of Research Design


The stages in the process of research design are interactive in nature and often occur at the same time.
Designing of research study follows given process. Steps in research design :

Step 1: Defining Research Problem :


The definition of research problem is the foremost and important part of a research design process.
Defining the research problem includes supplying the information that is required by the management.
Without defining the research problem appropriately, it is not possible for the researcher to reach
accurate results. While defining the research problem, the researchers first analyse the problems or
opportunities in management, then they analyse the situation. The purpose of clarifying the research
problem is to make sure that the area of concern for research is properly reflected and management
decision is correctly described. After situation analysis, they develop a model for research which helps in
the next step which is specification of information.
Step 2: Assess the Value of Information :
When a research problem is approached, it is usually based on some information. These data are obtained
from past experience as well as other sources. On the basis of this information, some preliminary
judgements are made regarding the research problem. There is always a need for additional information;
some of it is available without additional cost or delay, but waiting and paying for more valuable
information is often difficult.
For example, a car manufacturing industry may be concerned about decrease in the sale of a particular
model. A researcher will look for the solutions by analyzing various aspects.
For this, the researcher has to continuously collect a lot of information and needs to evaluate them by
understanding their value and filtering out useless information.
Step 3: Select the Approach for Data Collection :
For any type of research, a researcher needs data. Once, it is identified that which kind of information is
required for conducting the research, the researchers proceed towards collecting the data. The data can be
collected using secondary or primary sources.
Secondary data is the previous collected information for some other purpose, while the primary data is
collected by the researcher especially for the research problem.
Step 4: Select the Measurement Technique :
After collecting data, the measurement technique for the collected data is selected. The major
measurement techniques used in research are as follows :
i) Questionnaire :
Questionnaire is a formal structure which contains questions to collect the information from the
respondents regarding his attitude, beliefs, behavior, knowledge, etc.


ii) Attitude Scales :


Attitude scales are used to extract the beliefs and feelings of the respondents regarding an object or issue.
iii) Observation :
It is the monitoring of behaviors and psychological changes of the respondents. It is widely used in
research.
iv) Projective Techniques and Depth Interview :
Sometimes direct questions are not sufficient to get true responses from individuals; that is why
different approaches like depth interviews and projective techniques are used. These techniques allow the
respondents to give their responses without any fear. The researcher neither disagrees nor gives advice in
these techniques.
Step 5: Sample Selection :
Once the measurement technique has been selected, the next step is selecting the sample on which to conduct the research. At this stage the researchers select a sample out of the total population instead of considering the population as a whole. A sample can be selected using two families of techniques, i.e., random (probability) sampling techniques and non-random (non-probability) sampling techniques.
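As a small illustration, the sketch below contrasts the two families on a hypothetical population of 1,000 customer IDs (the population, seed and sample size are invented for illustration, not taken from the text): simple random sampling gives every member an equal chance of selection, while a convenience sample simply takes whoever is easiest to reach.

# Minimal Python sketch (hypothetical data): random vs. non-random sampling.
import random

random.seed(42)                      # fixed seed for reproducibility
population = list(range(1, 1001))    # hypothetical sampling frame of 1,000 IDs

# Random (probability) sampling: every member has an equal chance.
simple_random_sample = random.sample(population, k=50)

# Non-random (non-probability) sampling, e.g. convenience sampling:
# take the first 50 members who happen to be easiest to reach.
convenience_sample = population[:50]

print(simple_random_sample[:5])
print(convenience_sample[:5])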
Step 6: Selecting Model of Analysis :
Researchers select the model of analysis, or technique of data analysis, before collecting data. After this, researchers evaluate the technique using hypothetical values to ensure that the measurement technique would provide the desired outcome regarding the research problem.
Step 7: Evaluate the Ethics of Research :
While conducting research, it is very necessary for the researcher to follow ethical practices. Research that is conducted ethically draws the interest of the general public, respondents, clients and other research professionals. Hence, it becomes the duty of the researcher to evaluate the practices used in the research, to avoid any bias on the part of the observer as well as the researcher.
Step 8: Estimate Time and Financial Requirements :
This step is one of the most important steps in designing research. Here, researchers use different methods
like Critical Path Method (CPM) and Programme Evaluation Review Technique (PERT) to design the
plan as well as control process and to determine the resources required.
A flowchart of these activities along with their approximate time is prepared for visual assessment of the
research process. With the help of this chart, the researcher can find out the sequence of activities to be
taken.
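As a small illustration of the kind of estimate PERT produces, the sketch below applies the standard three-point formula te = (O + 4M + P) / 6 to a few hypothetical research activities (the activity names and durations are invented for illustration):

# Hedged Python sketch: the classic PERT three-point time estimate.
def pert_expected_time(o, m, p):
    """Expected duration t_e = (O + 4M + P) / 6."""
    return (o + 4 * m + p) / 6

activities = {                      # (optimistic, most likely, pessimistic) days
    "Design questionnaire": (3, 5, 10),
    "Collect responses":    (10, 14, 21),
    "Analyse data":         (4, 6, 9),
}

total = 0.0
for name, (o, m, p) in activities.items():
    te = pert_expected_time(o, m, p)
    total += te
    print(f"{name}: expected {te:.1f} days")
print(f"Total expected duration if run sequentially: {total:.1f} days")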
Step 9: Prepare the Research Proposal :
The final step in the process of research design is preparing the research proposal. The research proposal, or research design document, sets out the plan for the operation and control of the research. An effective research proposal is prepared before the actual conduct of the research.

2.1.7 Types of Research Design
In business research you can choose a research design by reviewing the methods other research
papers used and learning about different types of research designs. Here are 20 types of research design
that you can consider using for your research project:
1. Exploratory research design
One common type of research design is exploratory design. The exploratory research design format is
useful when you don't have a clearly defined problem to study. Often, this type of research design is less
structured than other research design options, and you can use it as a guide for your initial research to
uncover your research problem.
2. Observational research design
Observational design is also a common type of research design. The observational research design format
emphasizes observing your research topic without altering any variables. When using an observational
research design, you can simply observe behaviors or phenomena and record them rather than conducting
an experiment.
3. Descriptive research design
Descriptive design is another type of research design. The goal of using a descriptive research design is
to describe a research topic, so this type of research is useful when you need more information about your
topic. Descriptive research design can also help you understand the "what," "where," "when" and "how"
of your research topic. The one question that a descriptive research design does not answer is "why."
4. Case study
Another type of observational research design is the case study format. Case studies are analyses of real-
world situations to understand and evaluate past problems and solutions. Therefore, case studies are useful
when you want to test how an idea applies to real life, and this research design is especially popular in
marketing, advertising and social science. The five-part case study format includes:
• Title
• Overview
• Problem
• Solution
• Results
5. Action research design
Another type of research design is the action research design. The action research design format involves
initial exploratory analysis and the development of an action strategy. This design format is collaborative,
and it focuses on finding solutions, making it practical for many research topics. You can use the action
research design when you want to solve real problems.
6. Experimental research design
Experimental research design is also common. The experimental research design is especially useful when
you want to test how different factors affect a situation, making this design type very versatile. The
experimental research design uses the scientific method, which includes elements like:
Hypothesis: A research hypothesis is a statement that describes what you predict your research to reveal.
Independent variable: An independent variable is a variable that does not depend on other variables.
Dependent variable: A dependent variable is a variable that depends on another variable.
Control variable: A control variable is a variable that remains constant throughout a research experiment.
7. Causal research design
The causal research design is another type of research design that researchers commonly choose. The
causal research design format attempts to identify and understand relationships between variables, which
can be valuable across many industries. Causal research designs typically involve at least two variables and explore many possible reasons for a relationship between the variables.
8. Correlational research design
Along with the causal research design, the correlational research design is also commonly used. The
correlational research design format, like the causal format, identifies relationships between variables.
When you use a correlational research design, you measure variables but do not alter them.
9. Diagnostic research design
Another type of research design is the diagnostic research design. The diagnostic research design attempts
to find the underlying factors that cause events or phenomena to occur. This research type is useful to help
you understand what's causing problems so you can find solutions.
10. Cross-sectional research design
Cross-sectional design is another type of observational research design. The cross-sectional research
design involves observing multiple individuals at the same point in time. This research type does not alter
variables.
11. Sequential research design
Sequential research design is another useful type of research design. The sequential research design format
divides research into stages, and each stage builds on the last. Therefore, you can complete sequential
research at multiple points in time, allowing you to study phenomena that occur over periods of time.
12. Cohort research design
Cohort research design, a type of observational research, is another research design type. This type of
research design is commonly used in medicine, but it can also have applications in other industries. Cohort
design involves examining research subjects who have already been exposed to a research topic, making
it especially effective for conducting ethical research on medical topics or risk factors. This design type is
very flexible, and it applies to both primary and secondary data.
13. Historical research design
Researchers can also use historical research design. Using the historical research design allows you to use
past data to test your hypothesis. Historical research relies on historical data like archives, maps, diaries
and logs. Using this research design can be especially useful for completing trend analysis or gathering
context for a research problem.
14. Field research design
Another type of research design is the field research design. The field research design, which is a
qualitative research method, allows you to observe subjects in natural environments. This can allow you
to collect data directly from real-world situations.
15. Systematic review
Systematic review is another type of research design. Completing a systematic review involves reviewing
existing evidence and analyzing data from existing studies. This can allow you to use previous research
to come up with new conclusions.
16. Survey
Researchers also use the survey research design frequently. You can use surveys to gather information
directly from your sample population. Some types of surveys include:
Interviews: Interviews are one popular type of survey. Interviews allow you to ask questions to a research
subject one-on-one, which can give you the opportunity to ask follow-up questions and gain additional
insights.
Online forms: You can also use online forms to conduct surveys. You can use many websites or software
programs to create intuitive online forms with a variety of question types, including short-answer and
multiple-choice.
Focus groups: Focus groups are another key survey method. By using focus groups, you can facilitate
discussions with a group of research subjects to gain valuable research insights from your sample
population.
Questionnaires: Another type of survey is a questionnaire. In a questionnaire, you can simply list questions
for a research subject to answer, making this an effective data collection method.
17. Meta-analysis research design
Meta-analysis is a type of quantitative research design. The meta-analysis research design format uses a
variety of populations from different existing studies. This means that this method allows you to use
previous research to form new conclusions.
18. Mixed-method research design
Researchers can also use a mixed-method research design. Mixed-method research designs combine
multiple research methods to create the best path for a specific research project. This type of research can
include both qualitative and quantitative research methods.
19. Longitudinal research design
Another type of quantitative, observational research design is longitudinal design. The longitudinal
research design involves observing the same sample repeatedly over a period of time. This time span
might be anywhere from a few weeks to several decades, depending on your particular research.
20. Philosophical research design
Philosophical research design is another research design type. The philosophical research design can help
you analyze and understand your research problem. This design type builds on philosophical
argumentation techniques. The three key areas of philosophical research design are:
Epistemology: Epistemology focuses on knowledge and certainty.
Ontology: Ontology focuses on human nature and existence.
Axiology: Axiology is the study of values, and it applies especially to ethics.
You can use the philosophical research design to help you understand research purposes, make ethical
decisions and think critically about your research topic.

2.1.8 Factors Affecting Research Design
Various factors that affect research design are as follows :

1) Research Questions :
Research questions perform an important role in selecting the method used to carry out the research. Various forms of research design include their own methods for collecting data. For example, a survey can be conducted to ask respondents descriptive or interconnected questions, while a case study or a field study can be used to examine a firm's decision-making process.

2) Time and Budget Limits :
Researchers are bound by a restricted amount of time and budget to complete the research study. The researcher can select an experimental or descriptive design when the time and budget constraints are favorable enough for a detailed study; otherwise, an exploratory research design can be adopted when time is limited.

3) Research Objective :
Every research study is carried out to obtain results that help achieve some objective. This research objective influences the selection of the research design. The researcher should adopt the research design which is suitable for the research objective and which provides the best solution to the problem along with valuable results.

4) Research Problem :
Selection of the research design is greatly affected by the type of research problem. For example, the researcher selects an experimental research design to find out the cause-and-effect relationship in the research problem. Similarly, if the research problem involves an in-depth study, the researcher generally adopts an experimental research design method.

5) Personal Experiences :
Selection of the research design also depends upon the personal experience of the researchers. For example, a researcher who has expertise in statistical analysis would be inclined to select a quantitative research design, while researchers who specialize in the theoretical facets of research are likely to select a qualitative research design.
6) Target Audience :
The type of target audience plays a very important role in the selection of research design. The researcher must consider the target audience for which the research is carried out. Audiences may be the general public, business professionals or the government. For example, if the research is intended for the general public, then the researcher may select a qualitative research design. Similarly, a quantitative research design would be appropriate when presenting the report to business experts.

Typically, a good and well-planned research design consists of the following components, or tasks:
• Selection of appropriate type of design: Exploratory, descriptive and/or causal design.
• Identification of the specific information needed, based on the problem at hand and the selected design.
• Specification of measurement and scaling procedures for measuring the selected information.
• Mode of collection of information and specification of appropriate form for data collection.
• Designing of appropriate sampling process and sample size.
• Specification of appropriate data analysis method.

Research design is a plan to answer your research question. A research method is a strategy used to
implement that plan. Research design and methods are different but closely related, because good research
design ensures that the data you obtain will help you answer your research question more effectively. Which design you choose depends on your research goal and on what subjects (and whom) you want to study. Let's say you are interested in studying what makes people happy, or why some students are more conscious about recycling on campus.

2.2 Business research methods

Business research is a part of the business intelligence process. It is usually conducted to determine
whether a company can succeed in a new region, to understand their competitors, or to simply select a
marketing approach for a product. This research can be carried out using qualitative research methods or
quantitative research methods.

Business research can be done for anything and everything. In general, when people speak about business
research it means asking research questions to know where the money can be spent to increase sales,
profits or market share. Such research is critical to make wise and informed decisions.

For example: A mobile company wants to launch a new model in the market, but they are not aware of which dimensions of a mobile phone are most in demand. Hence, the company conducts business research using various methods to gather information, which is then evaluated and conclusions are drawn as to which dimensions are most in demand. This enables the researcher to make wise decisions, position the phone at the right price in the market and hence acquire a larger market share.

2.2.1 Quantitative research methods

Quantitative research methods are research methods that deal with numbers. It is a systematic empirical
investigation using statistical, mathematical or computational techniques. Such methods usually start
with data collection and then proceed to statistical analysis using various methods. The following are some
of the research methods used to carry out business research.
Survey research

Survey research is one of the most widely used methods to gather data especially for conducting business
research. Surveys involve asking various survey questions to a set of audiences through various formats like online polls, online surveys, questionnaires, etc. Nowadays, most major corporations use this
method to gather data and use it to understand the market and make appropriate business decisions.
Various types of surveys are used to conduct survey research: cross-sectional surveys, which collect data from a set of audiences at a given point in time, and longitudinal surveys, which collect data from a set of audiences across various time durations in order to understand changes in the respondents' behavior. With the advancement in technology, surveys can now be sent online through email or social media.

For example: A company wants to know the NPS (Net Promoter Score) for its website, i.e., how satisfied the people visiting the website are. An increase in traffic to the website, or the audience spending more time on it, can result in higher rankings on search engines, which will enable the company to get more leads as well as increase its visibility. Hence, the company can ask people who visit the website a few questions through an online survey to understand their opinions, gain feedback, and make appropriate changes to the website to increase satisfaction.
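For readers unfamiliar with the metric, NPS is conventionally computed from 0-10 ratings as the percentage of promoters (scores 9-10) minus the percentage of detractors (scores 0-6); the sketch below uses invented ratings:

# Minimal Python sketch of the standard NPS calculation (hypothetical ratings).
ratings = [10, 9, 9, 8, 7, 6, 10, 3, 9, 8, 10, 5]

promoters  = sum(1 for r in ratings if r >= 9)   # scores 9-10
detractors = sum(1 for r in ratings if r <= 6)   # scores 0-6

nps = (promoters - detractors) / len(ratings) * 100
print(f"NPS = {nps:.0f}")   # % promoters minus % detractors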

Correlational research

Correlational research is conducted to understand the relationship between two entities and what impact
each one of them has on the other. Using mathematical analysis methods, correlational research enables
the researcher to correlate two or more variables. Such research can help understand patterns,
relationships, trends, etc. Manipulation of one variable is possible to get the desired results as well.
Generally, a conclusion cannot be drawn only on the basis of correlational research.
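As a minimal sketch of what "correlating two variables" looks like in practice (assuming NumPy is available; the figures are invented), one can compute a Pearson correlation coefficient and read off its sign and strength:

# Illustrative sketch: correlation between advertising spend and sales.
import numpy as np

ad_spend = np.array([10, 12, 15, 18, 20, 25])        # hypothetical monthly figures
sales    = np.array([95, 101, 118, 130, 139, 162])

r = np.corrcoef(ad_spend, sales)[0, 1]
print(f"Pearson r = {r:.2f}")   # close to +1 => strong positive relationship
# Note: a high r shows association, not causation, as the text cautions.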

For example: A research can be conducted to understand the relationship between colors and gender-based
audiences. Using such research and identifying the target audience, a company can choose the production
of particular color products to be released in the market. This can enable the company to understand the
supply and demand requirements of its products.

Causal-Comparative research

Causal-Comparative research is a method based on the comparison. It is used to deduce the cause-effect
relationship between variables. Sometimes also known as quasi-experimental research, it involves
establishing an independent variable and analyzing the effects on the dependent variable. In such research,
manipulation is not done; however, changes are observed on the variables or groups under the influence
of the same changes. Drawing conclusions through such research is a little tricky as independent and
dependent variables will always exist in a group, hence all other parameters have to be taken into
consideration before drawing any inferences from the research.

For example: Research can be conducted to analyze the effect of good educational facilities in rural areas. Such a study compares a group of people from rural areas after they are provided with good educational facilities with the same group before the facilities were provided.

Another example can be to analyze the effect of having dams and how it will affect the farmers or
production of crops in that area.
Experimental research

Experimental research is based on trying to prove a theory. Such research may be useful in business
research as it can let the product company know some behavioral traits of its consumers, which can lead
to more revenue. In this method, an experiment is carried out on a set of audiences to observe and later
analyze their behavior when impacted with certain parameters.

For example: Experimental research was conducted recently to understand whether particular colors have an effect on consumers' hunger. A set of the audience was exposed to those particular colors while they were eating, and the subjects were observed. It was seen that certain colors like red or yellow increase hunger. Hence, such research was a boon to the hospitality industry. You can see many food chains like McDonald's, KFC, etc. using such colors in their interiors, branding, as well as packaging.

Another example of inferences drawn from experimental research, which is used widely by most bars/pubs
across the world is that loud music in the workplace or anywhere makes a person drink more in less time.
This was proven through experimental research and was a key finding for many business owners across
the globe.

Online research / Literature research

Literature research is one of the oldest methods available. It is very economical and a lot of information
can be gathered using such research. Online research or literature research involves gathering information
from existing documents and studies which can be available at Libraries, annual reports, etc. Nowadays,
with the advancement in technology, such research has become even more simple and accessible to
everyone. An individual can directly research online for any information that is needed, which will give
him in-depth information about the topic or the organization. Such research is used mostly by marketing
and salespeople in the business sector to understand the market or their customers. Such research is carried
out using existing information that is available from various sources, although care has to be taken to
validate the sources from where the information is going to be collected.

For example: A salesperson has heard that a particular firm is looking for a solution which their company provides. Hence, the salesperson will first identify a decision maker in that company, investigate which department they are from, and understand what the target company is looking for. Using this research, the salesperson can tailor the solution to be spot on when pitching it to this client, and can also reach out to the customer directly by finding a means to communicate with them online.

2.2.2 Qualitative research methods

Qualitative research is a method that has a high importance in business research. Qualitative research
involves obtaining data through open-ended conversational means of communication. Such research
enables the researcher to not only understand what the audience thinks but also why he thinks it. In such
research, in-depth information can be gathered from the subjects depending on their responses. There are
various types of qualitative research methods such as interviews, focus groups, ethnographic research,
content analysis, and case study research, all of which are widely used. Such methods are of very high importance in business research as they enable the researcher to understand the consumer. What motivates the consumer to buy and what does not is what leads to higher sales, and that is the prime objective of any business. Following are a few methods that are widely used in today's world by most businesses.

Interviews
Interviews are somewhat similar to surveys; sometimes they may even use the same questions. The difference is that the respondent can answer these open-ended questions at length, and the direction of the conversation or the questions being asked can be changed depending on the responses of the subject. Such a method usually gives the researcher detailed information about the perspectives or opinions of its subjects. Carrying out interviews with subject matter experts can also give important information critical to some businesses.

For example: An interview was conducted by a telecom manufacturer with a group of women to understand why it has fewer female customers. After interviewing them, the researcher understood that some of the models came in few feminine colors, hence women preferred not to purchase them. Such information can be critical to a business such as a telecom manufacturer, and it can be used to increase market share by targeting women customers and launching some feminine colors in the market.

Another example would be to interview a subject matter expert in social media marketing. Such an
interview can enable a researcher to understand why certain types of social media advertising strategies
work for a company and why some of them don’t.

Focus groups

Focus groups are a set of individuals selected specifically to understand their opinions and behaviors. It is usually a small group, selected keeping in mind the parameters of the target market audience, that is brought together to discuss a particular product or service. Such a method provides the researcher with a larger sample than an interview or a case study while retaining the advantage of conversational communication.

Nowadays, focus groups can also be conducted through online surveys to collect data and answer the why, what and how questions. Such a method is very crucial for testing new concepts or products before they are launched in the market.

For example: Research is conducted with a focus group to understand which screen size is preferred most by the current target market. Such a method enables the researcher to dig deeper into whether the target market focuses more on the screen size, features or colors of the phone. Using this data, a company can make wise decisions about its product line and secure a higher market share.

Ethnographic research

Ethnographic research is one of the most challenging research methods, but it can give extremely precise results. Such research is used quite rarely, as it is time-consuming and can be expensive as well. It requires the researcher to adapt to the natural environment of the target audience and observe it in order to collect data. Such a method is generally used to understand cultures, challenges or other things that occur in that particular setting.

For example: The world-renowned show "Undercover Boss" would be an apt example of how ethnographic research can be used in business. In this show, a member of the senior management of a large organization works in his own company as a regular employee to understand what improvements can be made, what the culture of the organization is, and to identify hard-working employees and reward them. It can be seen that the researcher has to spend a good amount of time in the natural setting of the employees and adapt to their ways and processes. While observing in this setting, the researcher can find out the
information he needs first hand, without loss of information or any bias, and improve certain things that would impact his business.

Case study research

Case study research is one of the most important methods in business research. It is also used as marketing collateral by many businesses to land more clients. Case study research is conducted to assess customer satisfaction and to document the challenges that were faced and the solutions that the firm provided. From these, inferences are made to point out the benefits that the customer enjoyed by choosing that specific firm. Such research is widely used in other fields like education and the social sciences. Case studies are provided by businesses to new clients to showcase their capabilities, and hence such research plays a crucial role in the business sector.

For example: A services company has provided a testing solution to one of its clients. A case study is conducted to find out what challenges were faced during the project, what the scope of the work was, what objective was to be achieved, and what solutions were given to tackle the challenges.
The study can end with the benefits that the company provided through their solutions, like reduced time
to test batches, easy implementation or integration of the system, or even cost reduction. Such a study
showcases the capability of the company and hence it can be stated as empirical evidence to the new
prospect.

Website visitor profiling/research

Website intercept surveys or website visitor profiling/research is something new that has come up and is
quite helpful in the business sector. It is an innovative approach to collect direct feedback from your
website visitors using surveys. In recent times a lot of business generation happens online and hence it is
important to understand the visitors of your website as they are your potential customers. Collecting
feedback is critical to any business as without understanding a customer, no business can be successful.
A company has to keep its customers satisfied and try to make them loyal customers in order to stay on
top.

A website intercept survey is an online survey that allows you to target visitors to understand their intent
and collect feedback to evaluate the customers’ online experience. Information like visitor intention,
behavior path, satisfaction of overall website, can be collected using this.

Depending on what information a company is looking for, multiple forms of website intercept surveys can be used to gather responses. Some of the popular ones are pop-ups (also called modal boxes) and on-page surveys.

For example: A prospective customer is looking for a particular product that a company is selling. Once he is directed to the website, an intercept survey starts noting his intent and path. Once the transaction has been made, a pop-up or an on-page survey is presented to the customer to rate the website. Such research
enables the researcher to put this data to good use and hence understand the customers’ intent, his path
and improve any parts of the website depending on the responses, which in turn would lead to satisfied
customers and hence, higher revenues and market share.
Here is a comparison that highlights the major differences between qualitative and quantitative research:

Qualitative Research:
• Focuses on explaining and understanding experiences and perspectives.
• Uses non-numerical data, such as words, images, and observations.
• Usually uses small sample sizes.
• Typically emphasizes in-depth exploration and interpretation.
• Data analysis involves interpretation and narrative analysis.
• Results are presented descriptively.

Quantitative Research:
• Focuses on quantifying and measuring phenomena.
• Uses numerical data, such as statistics and surveys.
• Usually uses larger sample sizes.
• Typically emphasizes precision and objectivity.
• Data analysis involves statistical analysis and hypothesis testing.
• Results are presented numerically and statistically.

2.2.3 Advantages of Business research

• Business research helps to identify opportunities and threats.
• It helps identify problems and using this information, wise decisions can be made to tackle the issue
appropriately.
• It helps to understand customers better and hence can be useful to communicate better with the
customers or stakeholders.
• Risks and uncertainties can be minimized by conducting business research in advance.
• Financial outcomes and investments that will be needed can be planned effectively using business
research.
• Such research can help track competition in the business sector.
• Business research can enable a company to make wise decisions as to where to spend and how much.
• Business research can enable a company to stay up-to-date with the market and its trends and
appropriate innovations can be made to stay ahead in the game.
• Business research helps to measure reputation

2.2.4 Disadvantages of Business research

• Business research can be a high-cost affair
• Most of the time, business research is based on assumptions
• Business research can be time-consuming
• Business research can sometimes give you inaccurate information, because of a biased population or
a small focus group.
• Business research results can quickly become obsolete because of the fast-changing markets
2.2.5 Importance of Business research

Business research is one of the most effective ways to understand customers, the market and competitors.
Such research helps companies to understand the demand and supply of the market. Using such research
will help businesses reduce costs, and create solutions or products that are targeted to the demand in the
market and the correct audience.

In-house business research can enable senior management to build an effective team or train or mentor
when needed. Business research enables the company to track its competitors and hence can give you the
upper hand to stay ahead of them. Failures can be avoided by conducting such research as it can give the
researcher an idea if the time is right to launch its product/solution and also if the audience is right. It will
help understand the brand value and measure customer satisfaction which is essential to continuously
innovate and meet customer demands.

This will help the company grow its revenue and market share. Business research also helps recruit ideal
candidates for various roles in the company. By conducting such research a company can carry out a
SWOT analysis, i.e. understand the strengths, weaknesses, opportunities, and threats. With the help of this
information, wise decisions can be made to ensure business success.

Business research is the first step for any business owner seeking to set up a business, to survive, or to excel in the market. The main reason why such research is of utmost importance is that it helps businesses to grow in terms of revenue, market share and brand value.

2.3 Reliability in Business Research

The term reliability in business research refers to the consistency of a research study or measuring test
(whether the results can be reproduced under the same conditions). It shows how consistently a method
measures something. If the same result can be consistently achieved by using the same methods under the
same circumstances, the measurement is considered reliable.

For example, if a person weighs themselves during the course of a day they would expect to see a similar
reading. Scales which measured weight differently each time would be of little use. The same analogy
could be applied to a tape measure which measures inches differently each time it was used. It would not
be considered reliable.
Consider another example: You measure the temperature of a liquid sample several times under identical
conditions. The thermometer displays the same temperature every time, so the results are reliable.
If findings from research are replicated consistently they are reliable. A correlation coefficient can be used
to assess the degree of reliability. If a test is reliable it should show a high positive correlation.
Of course, it is unlikely the exact same results will be obtained each time as participants and situations
vary, but a strong positive correlation between the results of the same test indicates reliability.
2.3.1 Methods to Assess Reliability
To determine if your research methods are producing reliable results, you must perform the same task
multiple times or in multiple ways. Typically, this involves changing some aspect of the research
assessment while maintaining control of the research. For example, this could mean:

• Using the same test on different groups of people
• Using different tests on the same group of people.
Both methods maintain control by keeping one element exactly the same and changing other elements to
ensure other factors don't influence the research results. Here are some careers that often test for reliability
in data:

• Media sociologist
• Food scientists
• Forensic science technicians
• Marketing analysts
• Medical scientists
• Economists
• Policy analysts
• Behavioral scientists
• Business analysts

2.3.2 Types of Reliability

1. Test-retest reliability

The test-retest reliability method in research involves giving a group of people the same test more than
once. If the results of the test are similar each time you give it to the sample group, that shows your
research method is likely reliable and not influenced by external factors, like the sample group's mood or
the day of the week. Here are the guidelines for this type of research:

• Pick a consistent research method
• Create a sample group and ensure the members are also consistent
• Administer your test using the chosen method
• Repeat the exact same testing process one or multiple times with the same sample group

Example: Give a group of college students a survey about their satisfaction with their school's parking
lots on Monday and again on Friday, then compare the results to check the test-retest reliability. Consider another example: a test of color blindness for trainee pilot applicants should have high test-retest reliability, because color blindness is a trait that does not change over time.

To measure test-retest reliability, you conduct the same test on the same group of people at two different
points in time. Then you calculate the correlation between the two sets of results.
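A minimal sketch of that calculation, assuming SciPy is available (the scores below are invented: the same eight respondents surveyed on two occasions):

# Test-retest reliability as a correlation between two administrations.
from scipy.stats import pearsonr

monday = [72, 65, 80, 90, 55, 68, 77, 84]
friday = [70, 66, 82, 88, 57, 70, 75, 85]

r, p_value = pearsonr(monday, friday)
print(f"test-retest r = {r:.2f} (p = {p_value:.3f})")
# A strong positive r suggests the measure is stable over time.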

Improving test-retest reliability

• When designing tests or questionnaires, try to formulate questions, statements, and tasks in a way that
won’t be influenced by the mood or concentration of participants.
• When planning your methods of data collection, try to minimize the influence of external factors, and
make sure all samples are tested under the same conditions.
• Remember that changes or recall bias can be expected to occur in the participants over time, and take
these into account.
2. Parallel forms reliability

This strategy involves giving the same group of people multiple types of tests to determine if the results
stay the same when using different research methods. If they do, this means the methods are likely reliable
because, otherwise, the participants in the sample group may behave differently and change the results.
For this strategy to succeed, it's important that:

• Each research method is looking for the same information
• The group of participants behave similarly for each test

Example: In marketing, you may interview customers about a new product, observe them using the
product and give them a survey about how easy the product is to use and compare these results as a parallel
forms reliability test. In educational assessment, it is often necessary to create different versions of tests
to ensure that students don’t have access to the questions in advance. Parallel forms reliability means that,
if the same students take two different versions of a reading comprehension test, they should get similar
results in both tests.

The most common way to measure parallel forms reliability is to produce a large set of questions to
evaluate the same thing, then divide these randomly into two question sets. The same group of respondents
answers both sets, and you calculate the correlation between the results. High correlation between the two
indicates high parallel forms reliability.
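A sketch of that procedure under invented data: a 20-item pool is split randomly into two forms, each respondent is scored on both forms, and the two form scores are correlated (with real, coherent test items a high r would indicate parallel forms reliability):

# Parallel forms reliability: random split of an item pool into two forms.
import numpy as np

rng = np.random.default_rng(0)
n_respondents, n_items = 10, 20
responses = rng.integers(0, 2, size=(n_respondents, n_items))  # 0/1 item scores

items = rng.permutation(n_items)                 # shuffle the item pool
form_a, form_b = items[:n_items // 2], items[n_items // 2:]

score_a = responses[:, form_a].sum(axis=1)       # each respondent's Form A total
score_b = responses[:, form_b].sum(axis=1)       # each respondent's Form B total

r = np.corrcoef(score_a, score_b)[0, 1]
print(f"parallel forms r = {r:.2f}")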

Improving parallel forms reliability

• Ensure that all questions or test items are based on the same theory and formulated to measure the
same thing.

3. Inter-rater reliability

The inter-rater reliability testing involves multiple researchers assessing a sample group and comparing
their results. This can help them avoid influencing factors related to the assessor, including:

• Personal bias
• Mood
• Human error

If most of the results from different assessors are similar, it's likely the research method is reliable and can
produce usable research because the assessors gathered the same data from the group. This is useful for
research methods where each assessor may have different criteria but can still end up with similar research
results, like:

• Observations
• Interviews
• Surveys

Example: Multiple behavioral specialists may observe a group of children playing to determine their social
and emotional development and then compare notes to check for inter-rater reliability. In an observational
study where a team of researchers collect data on classroom behavior, interrater reliability is important:
all the researchers should agree on how to categorize or rate different types of behavior.

To measure interrater reliability, different researchers conduct the same measurement or observation on
the same sample. Then you calculate the correlation between their different sets of results. If all the
researchers give similar ratings, the test has high interrater reliability.
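A minimal sketch of that computation (ratings invented): three raters score the same six subjects, and the average of the pairwise correlations between raters is taken as a simple inter-rater reliability index:

# Inter-rater reliability via the average pairwise correlation between raters.
import numpy as np

ratings = np.array([      # rows = subjects, columns = raters A, B, C
    [7, 8, 7],
    [4, 5, 4],
    [9, 9, 8],
    [6, 6, 7],
    [3, 4, 3],
    [8, 7, 8],
])

corr = np.corrcoef(ratings, rowvar=False)     # 3x3 rater-by-rater correlations
n = corr.shape[0]
avg_r = (corr.sum() - n) / (n * (n - 1))      # mean of the off-diagonal entries
print(f"average inter-rater r = {avg_r:.2f}")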

Improving interrater reliability

• Clearly define your variables and the methods that will be used to measure them.
• Develop detailed, objective criteria for how the variables will be rated, counted or categorized.
• If multiple researchers are involved, ensure that they all have exactly the same information and
training.

4. Internal consistency reliability

Checking for internal consistency in research involves making sure your internal research methods or parts
of research methods deliver the same results. There are two typical ways to make this determination:

Split-half reliability test: You can perform this test by splitting a research method, like a survey or test,
in half, delivering both halves separately to a sample group, then comparing the results to ensure the
method can produce consistent results. If the results are consistent, then the results of the research method
are likely reliable.

Inter-item reliability test: With this assessment, you administer multiple test items to sample groups, as with parallel forms reliability testing, and calculate the correlation between the results for each of the items. With this information, you calculate the average and use that number to determine whether the results are reliable.

Example: You may give a company's cleaning department a questionnaire about which cleaning products
work the best, but you split it in half and give each half to the department separately and calculate the
correlation to test for split-half reliability. Later, you interview the members of the cleaning department, then bring them into small focus groups and observe them at work to determine which cleaning products get the most use and which people like best. You calculate the correlation between these answers and average the results to find the average inter-item reliability. Likewise, to measure customer satisfaction
with an online store, you could create a questionnaire with a set of statements that respondents must agree
or disagree with. Internal consistency tells you whether the statements are all reliable indicators of
customer satisfaction.

Ways to measure it
Two common methods are used to measure internal consistency.

• Average inter-item correlation: For a set of measures designed to assess the same construct, you
calculate the correlation between the results of all possible pairs of items and then calculate the
average.
• Split-half reliability: You randomly split a set of measures into two sets. After testing the entire set
on the respondents, you calculate the correlation between the two sets of responses.
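The sketch below implements both measures on an invented 8-respondent, 6-item response matrix; it uses an odd/even item split rather than a random one, and adds the standard Spearman-Brown correction (not mentioned in the text above) to estimate full-test reliability from the half-test correlation:

# Internal consistency: average inter-item correlation and split-half reliability.
import numpy as np

items = np.array([        # rows = respondents, columns = items (1-5 scale)
    [4, 5, 4, 5, 4, 5],
    [2, 2, 3, 2, 2, 3],
    [5, 5, 5, 4, 5, 5],
    [3, 3, 4, 3, 3, 3],
    [1, 2, 1, 2, 1, 1],
    [4, 4, 5, 4, 4, 4],
    [2, 3, 2, 3, 2, 2],
    [5, 4, 5, 5, 5, 4],
])

# 1) Average inter-item correlation: mean of the off-diagonal correlations.
corr = np.corrcoef(items, rowvar=False)
k = corr.shape[0]
avg_inter_item = (corr.sum() - k) / (k * (k - 1))

# 2) Split-half reliability: correlate the two half-test scores.
half1 = items[:, ::2].sum(axis=1)    # odd-numbered items
half2 = items[:, 1::2].sum(axis=1)   # even-numbered items
r_half = np.corrcoef(half1, half2)[0, 1]
r_full = 2 * r_half / (1 + r_half)   # Spearman-Brown correction

print(f"average inter-item r = {avg_inter_item:.2f}")
print(f"split-half r = {r_half:.2f}, Spearman-Brown = {r_full:.2f}")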
Improving internal consistency

• Take care when devising questions or measures: those intended to reflect the same concept should be based on the same theory and carefully formulated.

Ensuring Reliability
• To enhance the reliability of your research, you need to apply your measurement method consistently.
The chances of reproducing the same results for a test are higher when you maintain the method you’re
using to experiment.
• For example, you want to determine the reliability of the weight of a bag of chips using a scale. You
have to consistently use this scale to measure the bag of chips each time you experiment.
• You must also keep the conditions of your research consistent. For instance, if you’re experimenting
to see how quickly water dries on sand, you need to consider all of the weather elements that day.
• So, if you experimented on a sunny day, the next experiment should also be conducted on a sunny day
to obtain a reliable result.

Summary of the above reliability methods:

• Test-retest: measures the same test over time; use when measuring a property that you expect to stay the same over time.
• Inter-rater: measures the same test conducted by different people; use when multiple researchers are making observations or ratings about the same topic.
• Parallel forms: measures different versions of a test which are designed to be equivalent; use when using two different tests to measure the same thing.
• Internal consistency: measures the individual items of a test; use when using a multi-item test where all the items are intended to measure the same variable.

2.4 Validity in Business Research

Validity refers to how accurately a method measures what it is intended to measure (whether the results
really do represent what they are supposed to measure).

If research has high validity, that means it produces results that correspond to real properties,
characteristics, and variations in the physical or social world.
Consider the same thermometer example given above: if the thermometer shows different temperatures each time, even though you have carefully controlled the conditions to ensure the sample's temperature stays the same, the thermometer is probably malfunctioning, and therefore its measurements are not valid.
2.4.1. Types of Validity
Researchers use validity to determine whether a measurement is accurate or not. The accuracy of
measurement is usually determined by comparing it to the standard value.
When a measurement is consistent over time and has high internal consistency, it increases the likelihood
that it is valid.

1) Content Validity
Content validity is the process of matching the test items with the instructional objectives. Content validity is the most important criterion for the usefulness of a test, especially of an achievement test. It is also called Rational Validity, Logical Validity, Curricular Validity, Internal Validity or Intrinsic Validity.
Content validity refers to the degree or extent to which a test consists of items representing the behaviors that the test maker wants to measure. The extent to which the items of a test are truly representative of the whole content and the objectives of the teaching is called the content validity of the test.
Content validity is estimated by evaluating the relevance of the test items; i.e., the test items must duly cover all the content and behavioural areas of the trait to be measured. It gives an idea of the subject matter or of the change in behaviour. In this way, content validity refers to the extent to which a test contains items representing the behaviour that we are going to measure. The items of the test should cover every relevant characteristic of the whole content area and objectives in the right proportion.
Before constructing the test, the test maker prepares a two-way table of content and objectives, popularly
known as “Specification Table”.
For example, if I were to measure what causes hair loss in women, I would have to consider things like postpartum hair loss, alopecia, hair manipulation, dryness, and so on. By omitting any of these critical factors, you risk significantly reducing the validity of your research because you won't be covering everything necessary to make an accurate deduction.
For example, suppose a certain woman is losing her hair due to postpartum hair loss, excessive manipulation, and dryness, but in my research I only look at postpartum hair loss. My research will show that she has postpartum hair loss. The conclusion is correct as far as it goes, but it does not fully account for the reasons why this woman is losing her hair, and so it is not a fully valid picture.

Some general points for ensuring content validity are given below:
1. Test should serve the required level of students, neither above nor below their standard.
2. Language should be up to the level of the students.
3. Anything which is not in the curriculum should not be included in test items.
4. Each part of the curriculum should be given necessary weightage. More items should be selected from
more important parts of the curriculum.
Limitations:
1. The weightage to be given to different parts of content is subjective.
2. It is difficult to construct the perfect objective test.
3. Content validity is not sufficient or adequate for tests of Intelligence, Achievement, Attitude and to
some extent tests of Personality.
4. The weightage given to different behaviour changes is not objective.

2) Criterion Validity
This measures how well your measurement correlates with the variables you want to compare it with to
get your result. The two main classes of criterion validity are predictive and concurrent.
3) Predictive validity
It helps predict future outcomes based on the data you have. For example, if a large number of students
performed exceptionally well in a test, you can use this to predict that they understood the concept on
which the test was based and will perform well in their exams. Predictive validity is the extent to which a test predicts the future performance of employees.
Predictive validity is concerned with the predictive capacity of a test. It indicates the effectiveness of a
test in forecasting or predicting future outcomes in a specific area. The test user wishes to forecast an
individual's future performance; test scores can be used to predict future behaviour or performance, and hence this is called predictive validity.
In order to find predictive validity, the tester correlates the test scores with testee’s subsequent
performance, technically known as “Criterion”. Criterion is an independent, external and direct measure
of that which the test is designed to predict or measure. Hence, it is also known as “Criterion related
Validity”.
The predictive or empirical validity has been defined by Cureton (1965) as an estimate of the correlation
coefficient between the test scores and the true criterion.
Example:
A medical entrance test is constructed and administered to select candidates for admission into M.B.B.S. courses. Based on the scores made by the candidates on this test, we admit the candidates. After completion of the course they appear in the final M.B.B.S. examination. The scores of the final M.B.B.S. examination are the criterion. The scores of the entrance test and the final examination (criterion) are correlated. A high correlation implies high predictive validity.
Similar examples, like other recruitment or entrance tests in Agriculture, Engineering, Banking, Railways, etc., could be cited here, all of which must have high predictive validity. That is, tests used for recruitment, classification and entrance examinations must have high predictive validity. This type of validity is sometimes referred to as 'Empirical validity' or 'Statistical validity' as
our evaluation is primarily empirical and statistical.
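A minimal sketch of the entrance-test example (all scores invented): the entrance scores are correlated with the final-examination criterion, and a simple least-squares line then forecasts the criterion for a new applicant:

# Predictive validity: correlate test scores with a later criterion measure.
import numpy as np

entrance = np.array([62, 75, 58, 90, 70, 85, 66, 78])   # selection test scores
final    = np.array([60, 72, 55, 88, 74, 80, 63, 79])   # criterion: final exam

r = np.corrcoef(entrance, final)[0, 1]
print(f"predictive validity coefficient r = {r:.2f}")

slope, intercept = np.polyfit(entrance, final, 1)       # simple linear fit
print(f"forecast final score for entrance = 72: {slope * 72 + intercept:.1f}")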
Limitation:
If we get a suitable criterion-measure with which our test results are to be correlated, we can determine
the predictive validity of a test. But it is very difficult to get a good criterion. Moreover, we may not get
criterion-measures for all types of psychological tests.

4) Concurrent validity
Concurrent validity involves correlating the test scores with another set of criterion scores.
Concurrent validity refers to the extent to which the test scores correspond to already established or
accepted performance, known as criterion. To know the validity of a newly constructed test, it is correlated
or compared with some available information.
Thus a test is validated against some concurrently available information. The scores obtained from a newly
constructed test are correlated with pre-established test performance. Suppose we have prepared a test of intelligence and administer it to a group of pupils. The Stanford-Binet test is also administered to the same group. Now the scores made on our newly constructed test and the scores made by the pupils on the Stanford-Binet Intelligence Test are correlated. If the coefficient of correlation is high, our intelligence test is said
to have high concurrent validity.
The dictionary meaning of the term ‘concurrent’ is ‘existing’ or ‘done at the same time’. Thus the term
‘concurrent validity’ is used to indicate the process of validating a new test by correlating its scores with
some existing or available source of information (criterion) which might have been obtained shortly before
or shortly after the new test is given.
For example, setting up a literature test for your students on two different books and assessing them at the
same time. You’re measuring your students’ literature proficiency with these two books. If your students
truly understood the subject, they should be able to correctly answer questions about both books.
To get a criterion measure, we are not required to wait for a long time. Predictive validity differs from concurrent validity in that, for the former, we wait for the future to obtain the criterion measure, whereas in the case of concurrent validity we need not wait for long gaps.

The term ‘concurrent’ here implies the following characteristics:
1. The two tests—the one whose validity is being examined and the one with proven validity—are
supposed to cover the same content area at a given level and the same objective;
2. The population for both the tests remains the same and the two tests are administered in almost similar
environments; and
3. The performance data on both the tests are obtainable almost simultaneously.
This type of validity is also known as "External Validity" or "Functional Validity". Concurrent validity is relevant to tests employed for diagnosis, not for prediction of future success.

5) Face Validity
Face validity refers to the extent to which the test appears to measure what is to be measured. In other words, face validity refers to whether a test appears to be valid or not, i.e., whether, from its external appearance, the items appear to measure the required aspect. If a test appears to measure what the test author desires to measure, we say that the test has face validity. Thus, face validity refers not to what the test measures, but to what the test 'appears to measure'. The content of the test should not obviously appear to be inappropriate or irrelevant.
For example, a test to measure “Skill in addition” should contain only items on addition. When
one goes through the items and feels that all the items appear to measure skill in addition, then it can be said that the test has face validity.
Although it is not an efficient method of assessing the validity of a test, and as such is not usually relied upon, it can still be used as a first step in validating the test. Once the test is validated at face value, we may proceed further to compute the validity coefficient. Moreover, this method helps a test maker to revise the test items to suit the purpose. When a test has to be constructed quickly, or when there is an urgent need for a test and there is no time or scope to determine validity by other, more rigorous methods, face validity can be determined. This type of validity is not adequate on its own, as it operates only at the facial level, and hence should be used only as a last resort.

6) Construct-Related Validity
Construct validity is the extent to which the test may be said to measure a theoretical construct or psychological variable.
A construct is mainly psychological. Usually it refers to a trait or mental process. Construct validation is
the process of determining the extent to which a particular test measures the psychological constructs that
the test maker intends to measure.
It indicates the extent to which a test measures the abstract attributes or qualities which are not
operationally defined.
Gronlund and Linn state, "Construct validation may be defined as the process of determining the extent to which test performance can be interpreted in terms of one or more psychological constructs."
Ebel and Frisbie describe, "Construct validation is the process of gathering evidence to support
the contention that a given test indeed measures the psychological construct that the test makers intended
for it to measure.”
Construct validity is also known as “Psychological Validity” or ‘Trait Validity’ or ‘Logical
Validity’. Construct validity means that the test scores are examined in terms of a construct. It studies the
construct or psychological attributes that a test measures.
The extent to which the test measures the personality traits or mental processes as defined by the test-
maker is known as the construct validity of the test.
While constructing tests on intelligence, attitude, mathematical aptitude, critical thinking, study skills,
anxiety, logical reasoning, reading comprehension etc. we have to go for construct validity. Take for
example, ‘a test of sincerity’.

Before constructing such types of test the test maker is confronted with the questions:
1. What should be the definition of the term sincerity?
2. What types of behaviour are to be expected from a person who is sincere?
3. What type of behaviour distinguishes between sincerity and insincerity?
Each construct has an underlying theory that can be brought to bear in describing and predicting a pupil’s
behaviour.
Gronlund (1981) suggests the following three steps for determining construct validity:
(i) Identify the constructs presumed to account for test performance.
(ii) Derive hypotheses regarding test performance from the theory underlying each construct.
(iii) Verify the hypotheses by logical and empirical means.
It must be noted that construct validity is inferential. It is used primarily when other types of
validity are insufficient to indicate the validity of the test. Construct validity is usually involved in measures
of attributes such as study habits, appreciation, honesty, emotional stability, and sympathy.
7) Convergent validity
Convergent validity is a subtype of construct validity. Construct validity is an indication of how well a
test measures the concept it was designed to measure. Convergent validity refers to how closely a test is
related to other tests that measure the same (or similar) constructs. Here, a construct is a behavior,
attitude, or concept, particularly one that is not directly observable.
Ideally, two tests measuring the same construct, such as stress, should have a moderate to high correlation.
High correlation is evidence of convergent validity, which, in turn, is an indication of construct validity.
Example: Suppose you use two different methods to collect data about anger: observation and a self-
report questionnaire. If the scores of the two methods are similar, this suggests that they indeed measure
the same construct. A high correlation between the two test scores suggests convergent validity. Consider
another example, the scores of two tests, one measuring self-esteem and the other measuring extroversion,
are likely to be correlated—individuals scoring high in self-esteem are more likely to score high in
extroversion. These two tests would then have high convergent validity.
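A minimal Python sketch of such a check, using SciPy's pearsonr on hypothetical anger scores from the two methods (all numbers are invented for illustration):

    # Convergent validity check: correlate scores from two instruments that
    # are intended to measure the same construct.
    from scipy.stats import pearsonr

    observation_scores = [12, 15, 11, 18, 14, 16, 10, 17]  # anger rated by trained observers
    self_report_scores = [14, 16, 10, 19, 13, 17, 11, 18]  # anger from a self-report questionnaire

    r, p_value = pearsonr(observation_scores, self_report_scores)
    print(f"r = {r:.2f}, p = {p_value:.3f}")
    # A moderate-to-high positive r is taken as evidence of convergent validity.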

8) Discriminant validity
Divergent validity, another name for discriminant validity, is used to determine whether a test is too similar
to another test. If a test correlates too strongly with (or is too similar to) another test, it suggests that the
tests are measuring the same thing and are too alike to be considered different. An example would be a test
used by a company for hiring purposes that measures how proficient someone is at a particular skill. If the
test correlates too strongly with an IQ test, then it is essentially just another IQ test rather than a measure
of something different.
Discriminant validity is a way of validating research that involves demonstrating that one scale is
unrelated to other scales. It helps researchers to discriminate between two scales.
Example: Self-esteem vs. musical preferences: a scale measuring teenagers’ self-esteem should show little
relationship with a scale measuring their musical preferences.
Social skills vs. computer skills: assessing the degree of relationship between a measure of social
skills and a measure of computer skills.
Discriminant validity thus examines the validity of your research by determining what it should not be
based on: you rule out elements that are not a strong factor in order to help validate your research. Being a
vegan, for example, does not imply that you are allergic to meat.

9) Factorial Validity:


Factorial validity is the extent of correlation of the different factors with the whole test.
Factorial validity is determined by a statistical technique known as factor analysis, which examines
inter-correlations to identify the factors (which may be verbalised as abilities) constituting the test.
In other words, methods of inter-correlation and other statistical methods are used to estimate factorial
validity. The correlation of the test with each factor is calculated to determine the weight contributed by
each such factor to the total performance of the test.
This tells us about the factor loadings. The relationship of the different factors with the whole test is called
factorial validity. Guilford (1950) suggested that factorial validity is the clearest description of what a
test measures and should by all means be given preference over other types of validity.
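As an illustration only, the following Python sketch fits a two-factor model to simulated item scores using scikit-learn's FactorAnalysis; the latent abilities, the item structure, and all numbers are invented for the example:

    # Factor analysis sketch: six test items, three loading on a "verbal"
    # factor and three on a "numerical" factor, plus random noise.
    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    n = 200                          # hypothetical respondents
    verbal = rng.normal(size=n)      # latent verbal ability
    numerical = rng.normal(size=n)   # latent numerical ability

    items = np.column_stack(
        [verbal + rng.normal(scale=0.5, size=n) for _ in range(3)] +
        [numerical + rng.normal(scale=0.5, size=n) for _ in range(3)]
    )

    fa = FactorAnalysis(n_components=2, random_state=0)
    fa.fit(items)
    print(np.round(fa.components_, 2))  # loadings: rows = factors, columns = items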
A. To Ensure Validity and Reliability in Your Research
You need a bulletproof research design to ensure that your research is both valid and reliable. This means
that your methods, sample, and even you, the researcher, shouldn’t be biased.

Ensuring Validity
There are several ways to determine the validity of your research, and the majority of them require the use
of highly specific and high-quality measurement methods.
Before you begin your test, choose the best method for producing the desired results. This method should
be pre-existing and proven.
Also, your sample should be very specific. If you’re collecting data on how dogs respond to fear, your
results are more likely to be valid if you base them on a specific breed of dog rather than dogs in general.

2.5 Variables
A variable is any kind of attribute or characteristic that you are trying to measure, manipulate and control
in statistics and research. All studies analyze a variable, which can describe a person, place, thing or idea.
A variable's value can change between groups or over time.
For example, if the variable in an experiment is a person's eye color, its value can change from brown to
blue to green from person to person.
Researchers and statisticians use variables to describe and measure the items, places, people or ideas
they're studying. Many types of variables exist, and you must choose the right variable to measure when
designing studies, selecting tests and interpreting results. A strong understanding of variables can lead to
more accurate statistical analyses and results.

2.5.1 Types of Variables


Below are commonly used business research variables:
• Independent variables
• Dependent variables
• Quantitative variables
• Qualitative variables
• Intervening variables
• Moderating variables
• Extraneous variables
• Confounding variables
• Control variables
• Composite variables


2.5.2 Independent vs. dependent variables


Independent variables
• Definition: A variable that stands alone and is not changed by the other variables or factors that are
measured.
• Example: Age. Other variables, such as where someone lives, what they eat or how much they exercise,
are not going to change their age.

Dependent variables
• Definition: A variable that relies on and can be changed by other factors that are measured.
• Example: The grade someone gets on an exam depends on factors such as how much sleep they got and
how long they studied.

Researchers often try to find out whether an independent variable causes other variables to change and in
what way. When analyzing relationships between study objects, researchers often try to determine what
makes the dependent variable change and how. Independent variables can influence dependent variables,
but dependent variables cannot influence independent variables.
2.5.3 Quantitative vs. qualitative variables
Quantitative variables
• Definition: Any data sets that involve numbers or amounts.
• Examples: Height, distance or number of items.
• Types: Discrete and continuous.

Qualitative variables
• Definition: Non-numerical values or groupings.
• Examples: Eye color or dog breed.
• Types: Binary, nominal and ordinal.

Researchers can further categorize quantitative variables into two types:


• Discrete: Any numerical variables you can realistically count, such as the coins in your wallet or the
money in your savings account.
• Continuous: Numerical variables that you could never finish counting, such as time.
Researchers can further categorize qualitative, or categorical, variables into three types:
• Binary: Variables with only two categories, such as male or female, red or blue.
• Nominal: Variables you can organize in more than two categories that do not follow a particular order.
Take, for example, housing types: Single-family home, condominium, tiny home.
• Ordinal: Variables you can organize in more than two categories that follow a particular order. Take,
for example, level of satisfaction: Unsatisfied, neutral, satisfied.
2.5.4 Intervening vs. moderating variables
Intervening variables
• Definition: A theoretical variable used to explain a cause or connection between other study variables.
• Example: Access to health care. If wealth is the independent variable and a long life span is the
dependent variable, a researcher might hypothesize that access to quality health care is the intervening
variable that links wealth and life span.

Moderating variables
• Definition: A variable that changes the relationship between the dependent and independent variables
by strengthening or weakening it.
• Example: Age. In a study looking at the relationship between economic status (independent variable)
and how frequently people get physical exams from a doctor (dependent variable), age is a moderating
variable: the relationship might be weaker in younger individuals and stronger in older individuals.


An intervening variable, also known as a mediator or mediating variable, explains the process through
which two variables are related, while a moderating, or moderator, variable affects the strength and
direction of that relationship.

2.5.5 Extraneous vs. confounding variables

Extraneous variables
• Definition: Factors that affect the dependent variable but that the researcher did not originally consider
when designing the experiment.
• Example: Parental support, prior knowledge of a foreign language or socioeconomic status are
extraneous variables that could influence a study assessing whether private tutoring or online courses
are more effective at improving students' Spanish test scores.

Confounding variables
• Definition: Extra variables that the researcher did not account for that can disguise another variable's
effects and show false correlations.
• Example: In a study of whether a particular genre of movie affects how much candy kids eat,
experiments are held at 9 a.m., noon and 3 p.m. Time could be a confounding variable: the noon group
might be hungrier and therefore eat more candy, because lunchtime is typically at noon.

A confounding variable is a type of extraneous variable that is associated with both the independent and
dependent variables. An extraneous variable is anything that could influence the dependent variable.
These unwanted variables can unintentionally change a study's results or how a researcher interprets those
results. A confounding variable influences the dependent variable, and also correlates with or causally
affects the independent variable. Confounding variables can invalidate your experiment results by making
them biased or suggesting a relationship between variables exists when it does not. Some of the ways to
control confounding variables so they do not affect the results of your experiment include:

• Adjustment: Adjust study parameters to account for the confounding variable and minimize its effects.
• Matching: Compare study groups with the same degree of confounding variables.
• Multivariate analysis: Use when analyzing multiple variables at once.
• Randomization: Spread confounding variables evenly between study groups (a small sketch of this
appears after this list).
• Restriction: Remove subjects or samples that have confounding factors.
• Stratification: Create study subgroups in which the confounding variable does not vary or vary much.
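As referenced in the randomization item above, here is a minimal Python sketch of random assignment, assuming a hypothetical pool of 40 subjects split into two equal groups; all names are illustrative:

    # Randomization sketch: randomly assign subjects to treatment and control
    # groups so that confounders spread evenly between groups on average.
    import random

    subjects = [f"S{i:03d}" for i in range(1, 41)]  # 40 hypothetical subject IDs
    random.seed(42)       # seeded only to make the illustration reproducible
    random.shuffle(subjects)  # a random order amounts to random assignment

    treatment_group = subjects[:20]
    control_group = subjects[20:]
    print("Treatment:", treatment_group[:5], "...")
    print("Control:  ", control_group[:5], "...")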

2.5.6 Control vs. composite variables

Control variables
• Definition: Characteristics that are constant and do not change during a study.
• Example: In an experiment about plant development, control variables might include the amounts of
fertilizer and water each plant gets. These amounts are always the same so that they do not affect the
plants' growth.

Composite variables
• Definition: Two or more variables combined to make a more complex variable.
• Example: Overall health is a composite variable if a researcher uses other variables, such as genetics,
medical care, education, quality of environment and chosen behaviors, to determine overall health in an
experiment.
Control, or controlling, variables have no effect on other variables and are often kept the same throughout
an experiment to prevent bias. Composite variables are often made up of two or more variables that are
highly related to one another conceptually or statistically.

2.6 Research Sampling

A sample is a subset of individuals from a larger population. Sampling means selecting the group that
you will actually collect data from in your research. For example, if you are researching the opinions of
students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

When you conduct research about a group of people, it’s rarely possible to collect data from every person
in that group. Instead, you select a sample. The sample is the group of individuals who will actually
participate in the research.

To draw valid conclusions from your results, you have to carefully decide how you will select a sample
that is representative of the group as a whole. This is called a sampling method. There are two primary
types of sampling methods that you can use in your research:

• Probability sampling involves random selection, allowing you to make strong statistical inferences
about the whole group.
• Non-probability sampling involves non-random selection based on convenience or other criteria,
allowing you to easily collect data.

First, you need to understand the difference between a population and a sample, and identify the target
population of your research.

• The population is the entire group that you want to draw conclusions about.
• The sample is the specific group of individuals that you will collect data from.

The population can be defined in terms of geographical location, age, income, or many other
characteristics.

It can be very broad or quite narrow: maybe you want to make inferences about the whole adult population
of your country; maybe your research focuses on customers of a certain company, patients with a specific
health condition, or students in a single school.

It is important to carefully define your target population according to the purpose and practicalities of your
project.


If the population is very large, demographically mixed, and geographically dispersed, it might be difficult
to gain access to a representative sample. A lack of a representative sample affects the validity of your
results, and can lead to several research biases, particularly sampling bias.

2.6.1 Sampling frame


The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should
include the entire target population (and nobody who is not part of that population).

Example: Sampling frame You are doing research on working conditions at a social media marketing
company. Your population is all 1000 employees of the company. Your sampling frame is the company’s
HR database, which lists the names and contact details of every employee.

2.6.2 Sample size


The number of individuals you should include in your sample depends on various factors, including the
size and variability of the population and your research design. There are different sample size
calculators and formulas depending on what you want to achieve with statistical analysis. Sampling
bias occurs when some members of a population are systematically more likely to be selected in
a sample than others.
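One widely used starting point is Cochran's formula for estimating a proportion, n = z^2 p(1 - p) / e^2. A minimal Python sketch with illustrative values (95% confidence, 5% margin of error, and the conservative assumption p = 0.5):

    # Cochran's sample-size formula for a proportion:
    # n = z^2 * p * (1 - p) / e^2  (before any finite-population correction)
    import math

    z = 1.96   # z-score for 95% confidence
    p = 0.5    # assumed population proportion; 0.5 gives the largest (safest) n
    e = 0.05   # desired margin of error

    n = (z ** 2) * p * (1 - p) / (e ** 2)
    print(math.ceil(n))  # -> 385 respondents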

2.7 Types of sampling methods

2.7.1 Probability sampling methods

Probability sampling means that every member of the population has a chance of being selected. It is
mainly used in quantitative research. If you want to produce results that are representative of the whole
population, probability sampling techniques are the most valid choice.

There are four main types of probability sample.

1. Simple random sampling


In a simple random sample, every member of the population has an equal chance of being selected. Your
sampling frame should include the whole population.


To conduct this type of sampling, you can use tools like random number generators or other techniques
that are based entirely on chance.

Example: Simple random sampling. You want to select a simple random sample of 100 employees of a
social media marketing company. You assign a number to every employee in the company database from
1 to 1000, and use a random number generator to select 100 numbers.
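A minimal Python sketch of this procedure; the employee IDs are hypothetical, and random.sample draws without replacement:

    # Simple random sampling: pick 100 of 1000 employee IDs,
    # each with an equal chance of selection.
    import random

    employee_ids = list(range(1, 1001))          # IDs 1..1000 from the HR database
    random.seed(1)                               # seeded only for a reproducible illustration
    sample = random.sample(employee_ids, k=100)  # sampling without replacement
    print(sorted(sample)[:10], "...")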

2. Systematic sampling
Systematic sampling is similar to simple random sampling, but it is usually slightly easier to conduct.
Every member of the population is listed with a number, but instead of randomly generating numbers,
individuals are chosen at regular intervals.

Example: Systematic sampling. All employees of the company are listed in alphabetical order. From the
first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th
person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.

If you use this technique, it is important to make sure that there is no hidden pattern in the list that might
skew the sample. For example, if the HR database groups employees by team, and team members are
listed in order of seniority, there is a risk that your interval might skip over people in junior roles, resulting
in a sample that is skewed towards senior employees.
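A minimal Python sketch of systematic sampling with a random start; the employee list is hypothetical:

    # Systematic sampling: random starting point within the first interval,
    # then every 10th employee on the ordered list.
    import random

    employees = [f"Employee_{i:04d}" for i in range(1, 1001)]  # alphabetically ordered list
    random.seed(2)
    start = random.randint(0, 9)    # random start within the first 10 positions
    sample = employees[start::10]   # every 10th person from the start
    print(len(sample), sample[:3])  # 100 people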

3. Stratified sampling
Stratified sampling involves dividing the population into subpopulations that may differ in important
ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented
in the sample.

To use this sampling method, you divide the population into subgroups (called strata) based on the relevant
characteristic (e.g., gender identity, age range, income bracket, job role).

Based on the overall proportions of the population, you calculate how many people should be sampled
from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.
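A minimal Python sketch of proportional stratified sampling, assuming three hypothetical job-role strata:

    # Stratified sampling: sample from each stratum in proportion to its
    # share of the population, then pool the subsamples.
    import random

    random.seed(3)
    strata = {
        "junior": [f"J{i}" for i in range(600)],  # 60% of the population
        "mid":    [f"M{i}" for i in range(300)],  # 30%
        "senior": [f"S{i}" for i in range(100)],  # 10%
    }
    population_size = sum(len(members) for members in strata.values())
    sample_size = 100

    sample = []
    for name, members in strata.items():
        k = round(sample_size * len(members) / population_size)  # proportional allocation
        sample.extend(random.sample(members, k))
    print(len(sample))  # -> 100 (60 junior, 30 mid, 10 senior)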

4. Cluster sampling
Cluster sampling also involves dividing the population into subgroups, but each subgroup should have
similar characteristics to the whole sample. Instead of sampling individuals from each subgroup, you
randomly select entire subgroups.

If it is practically possible, you might include every individual from each sampled cluster. If the clusters
themselves are large, you can also sample individuals from within each cluster using one of the techniques
above. This is called multistage sampling.

This method is good for dealing with large and dispersed populations, but there is more risk of error in the
sample, as there could be substantial differences between clusters. It’s difficult to guarantee that the
sampled clusters are really representative of the whole population.

Example: Cluster sampling. The company has offices in 10 cities across the country (all with roughly the
same number of employees in similar roles). You don’t have the capacity to travel to every office to collect
your data, so you use random sampling to select 3 offices – these are your clusters.
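A minimal Python sketch matching this example; office and employee names are hypothetical:

    # Cluster sampling: randomly select 3 of 10 city offices (the clusters),
    # then include every employee in the chosen offices.
    import random

    random.seed(4)
    offices = {f"city_{c}": [f"{c}_emp_{i}" for i in range(100)]
               for c in "ABCDEFGHIJ"}                 # 10 offices, 100 staff each

    chosen_clusters = random.sample(list(offices), k=3)
    sample = [emp for city in chosen_clusters for emp in offices[city]]
    print(chosen_clusters, len(sample))               # 3 offices, 300 employees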


2.7.2 Non-probability sampling methods

In a non-probability sample, individuals are selected based on non-random criteria, and not every
individual has a chance of being included.

This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means
the inferences you can make about the population are weaker than with probability samples, and your
conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as
representative of the population as possible.

Non-probability sampling techniques are often used in exploratory and qualitative research. In these types
of research, the aim is not to test a hypothesis about a broad population, but to develop an initial
understanding of a small or under-researched population.

1. Convenience sampling
A convenience sample simply includes the individuals who happen to be most accessible to the researcher.

This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is
representative of the population, so it can’t produce generalizable results. Convenience samples are at risk
for both sampling bias and selection bias.

Example: Convenience sampling. You are researching opinions about student support services in your
university, so after each of your classes, you ask your fellow students to complete a survey on the topic.
This is a convenient way to gather data, but as you only surveyed students taking the same classes as you
at the same level, the sample is not representative of all the students at your university.

2. Voluntary response sampling


Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead
of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g. by
responding to a public online survey).


Voluntary response samples are always at least somewhat biased, as some people will inherently be more
likely to volunteer than others, leading to self-selection bias.

Example: Voluntary response sampling. You send out the survey to all students at your university and a lot
of students decide to complete it. This can certainly give you some insight into the topic, but the people
who responded are more likely to be those who have strong opinions about the student support services,
so you can’t be sure that their opinions are representative of all students.

3. Purposive sampling
This type of sampling, also known as judgement sampling, involves the researcher using their expertise
to select a sample that is most useful to the purposes of the research.

It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a
specific phenomenon rather than make statistical inferences, or where the population is very small and
specific. An effective purposive sample must have clear criteria and rationale for inclusion. Always make
sure to describe your inclusion and exclusion criteria and beware of observer bias affecting your
arguments.

Example: Purposive sampling You want to know more about the opinions and experiences of disabled
students at your university, so you purposefully select a number of students with different support needs
in order to gather a varied range of data on their experiences with student services.

4. Snowball sampling
If the population is hard to access, snowball sampling can be used to recruit participants via other
participants. The number of people you have access to “snowballs” as you get in contact with more people.
The downside here is also representativeness, as you have no way of knowing how representative your
sample is due to the reliance on participants recruiting others. This can lead to sampling bias.

Example: Snowball sampling You are researching experiences of homelessness in your city. Since there
is no list of all homeless people in the city, probability sampling isn’t possible. You meet one person who
agrees to participate in the research, and she puts you in contact with other homeless people that she knows
in the area.


2.8 Research Techniques and Tools for Data collection

Data collection tools are the devices or instruments used to gather data, such as a paper questionnaire or a
computer-assisted interviewing system. In addition, here are some of the data collection techniques used
by these tools:

• Interviews
• Questionnaires
• Case Studies
• Usage Data
• Checklists
• Surveys
• Observations
• Documents and records
• Focus groups
• Oral histories

Different data collection tools use different techniques as their working principles, and not all tools can
apply every technique. These tools are developed especially for gathering specific types of information
by applying individual data collection methods. Consider the following attributes before utilizing a data
collection tool:
Variable type: Consider the type of information you want to collect, your research niche, and the ultimate
objectives of the research.
Study design: Select the approach you’ll follow to collect this information.
Data collection technique: Decide which techniques and tools you prefer for collecting data.
Sample data: Decide where you want to collect data; this refers to the population to be sampled. Also,
figure out which part of the population will be included in your investigation.
Sample size: Consider how many subjects you want to include in your study.
Sample design: Also, think about the way you’ll select the sample.

2.9 Types of Data Collection

Depending on the problem statement, data collection methods are broadly classified into two
categories:

• Primary Data Collection


• Secondary Data Collection

2.9.1 Primary Data Collection

Primary data collection is the process of researchers gathering raw data directly from primary sources
through surveys, interviews, or experiments. It can be further classified into two categories:

• Quantitative Data Collection Methods


• Qualitative Data Collection Methods


Quantitative Data Collection Methods: Quantitative methods use mathematical calculations to derive
useful data and present them in numbers. For instance, a questionnaire with close-ended questions
produces figures that can be processed mathematically, using methods such as correlation and regression
and measures such as the mean, mode and median.
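For instance, a minimal Python sketch summarising hypothetical close-ended ratings with the standard-library statistics module:

    # Summarising quantitative questionnaire responses: mean, median, mode.
    from statistics import mean, median, mode

    ratings = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]  # hypothetical 1-5 ratings
    print(mean(ratings), median(ratings), mode(ratings))  # 3.8 4.0 4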

Qualitative Data Collection Methods: Qualitative research methods work with non-quantifiable elements
such as feelings, emotions, and opinions. They do not require mathematical calculation or numerical data.
An open-ended questionnaire is an example of this method.

Interviews

The researcher asks questions of a large sampling of people, either by direct interviews or means of mass
communication such as by phone or mail. This method is by far the most common means of data gathering.

Projective Data Gathering

Projective data gathering is an indirect interview, used when potential respondents know why they're being
asked questions and hesitate to answer. For instance, someone may be reluctant to answer questions about
their phone service if a cell phone carrier representative poses the questions. With projective data
gathering, the interviewees get an incomplete question, and they must fill in the rest, using their opinions,
feelings, and attitudes.

Delphi Technique

The Oracle at Delphi, according to Greek mythology, was the high priestess of Apollo’s temple, who gave
advice, prophecies, and counsel. In the realm of data collection, researchers use the Delphi technique by
gathering information from a panel of experts. Each expert answers questions in their field of specialty,
and the replies are consolidated into a single opinion.

Focus Groups

Focus groups, like interviews, are a commonly used technique. The group consists of anywhere from a
half-dozen to a dozen people, led by a moderator, brought together to discuss the issue.

Questionnaires

Questionnaires are a simple, straightforward data collection method. Respondents get a series of questions,
either open or close-ended, related to the matter at hand.

2.9.2 Secondary Data Collection

Secondary data is the type of data that has already been collected by another person or organization for a
different purpose, e.g. reporting or research. You can collect these data from magazines, newspapers,
books, blogs, journals, etc.

Unlike primary data collection, there are no specific collection methods. Instead, since the information
has already been collected, the researcher consults various data sources, such as:


• Financial Statements
• Sales Reports
• Retailer/Distributor/Deal Feedback
• Customer Personal Information (e.g., name, address, age, contact info)
• Business Journals
• Government Records (e.g., census, tax records, Social Security info)
• Trade/Business Magazines
• The internet
In the secondary data collection process, you work with data that someone else has already collected and
analysed. Compared to primary data collection, this is much less expensive and easier to carry out. The
data may be either published or unpublished.

Secondary data sources for published data include-

• Government publications
• Websites
• Public records
• Historical and statistical documents
• Business documents
• Technical and trade journals
• Podcast

Secondary data sources for unpublished data include-

• Diaries
• Letters
• Unpublished biographies

However, depending on your area of research, opportunity, niche type, and ultimate project goal you can
pick any of these data collection methods to make some productive decisions.

2.9.3. Other tools for Data collection

Interview

An interview is a qualitative research method that relies on asking questions in order to collect data.
Interviews involve two or more people, one of whom is the interviewer asking the questions.

There are several types of interviews, often differentiated by their level of structure.

• Structured interviews have predetermined questions asked in a predetermined order.


• Unstructured interviews are more free-flowing.
• Semi-structured interviews fall in between.

Interviews are commonly used in market research, social science, and ethnographic research.


Structured interviews have predetermined questions in a set order. They are often closed-ended,
featuring dichotomous (yes/no) or multiple-choice questions. While open-ended structured interviews
exist, they are much less common. The types of questions asked make structured interviews a
predominantly quantitative tool.

Asking set questions in a set order can help you see patterns among responses, and it allows you to easily
compare responses between participants while keeping other factors constant. This can mitigate research
biases and lead to higher reliability and validity. However, structured interviews can be overly formal, as
well as limited in scope and flexibility. Structured interviews may be a good fit for your research if:

• You feel very comfortable with your topic. This will help you formulate your questions most
effectively.
• You have limited time or resources. Structured interviews are a bit more straightforward to analyze
because of their closed-ended nature, and can be a doable undertaking for an individual.
• Your research question depends on holding environmental conditions between participants constant.

Semi-structured interviews are a blend of structured and unstructured interviews. While the interviewer
has a general plan for what they want to ask, the questions do not have to follow a particular phrasing or
order.

Semi-structured interviews are often open-ended, allowing for flexibility, but follow a predetermined
thematic framework, giving a sense of order. For this reason, they are often considered “the best of both
worlds.”

However, if the questions differ substantially between participants, it can be challenging to look for
patterns, lessening the generalizability and validity of your results.

Semi-structured interviews may be a good fit for your research if:

• You have prior interview experience. It’s easier than you think to accidentally ask a leading question
when coming up with questions on the fly. Overall, spontaneous questions are much more difficult
than they may seem.
• Your research question is exploratory in nature. The answers you receive can help guide your future
research.

An unstructured interview is the most flexible type of interview. The questions and the order in which
they are asked are not set. Instead, the interview can proceed more spontaneously, based on the
participant’s previous answers.

Unstructured interviews are by definition open-ended. This flexibility can help you gather detailed
information on your topic, while still allowing you to observe patterns between participants.

However, so much flexibility means that they can be very challenging to conduct properly. You must be
very careful not to ask leading questions, as biased responses can lead to lower reliability or even
invalidate your research.

Unstructured interviews may be a good fit for your research if:

• You have a solid background in your research topic and have conducted interviews before.


• Your research question is exploratory in nature, and you are seeking descriptive data that will deepen
and contextualize your initial hypotheses.
• Your research necessitates forming a deeper connection with your participants, encouraging them to
feel comfortable revealing their true opinions and emotions.

A focus group brings together a group of participants to answer questions on a topic of interest in a
moderated setting. Focus groups are qualitative in nature and often study the group’s dynamic and body
language in addition to their answers. Responses can guide future research on consumer products and
services, human behavior, or controversial topics.

Focus groups can provide more nuanced and unfiltered feedback than individual interviews and are easier
to organize than experiments or large surveys. However, their small size leads to low external validity and
the temptation as a researcher to “cherry-pick” responses that fit your hypotheses.

A focus group may be a good fit for your research if:

• Your research focuses on the dynamics of group discussion or real-time responses to your topic.
• Your questions are complex and rooted in feelings, opinions, and perceptions that cannot be answered
with a “yes” or “no.”
• Your topic is exploratory in nature, and you are seeking information that will help you uncover new
questions or future research ideas.

Structured interview
• Advantages: Can be used for quantitative research; data can be compared; high reliability and validity;
time-effective for the interviewer and the respondent.
• Disadvantages: The researcher can’t ask additional questions for more clarification or nuance; limited
scope, so you might miss out on interesting data; at risk of response bias; due to the restricted answer
options, people might have to choose the “best fit”.

Semi-structured interview
• Advantages: Can be used in quantitative research; relatively high validity; you can ask additional
questions if needed.
• Disadvantages: Lower validity than the structured interview; at risk of the Hawthorne effect, observer
bias, recall bias, and social desirability bias; you need good conversational skills to get the most out of
the interview; preparation is time-consuming.

Unstructured interview
• Advantages: You can ask additional questions if needed; respondents might feel more at ease; you can
collect rich, qualitative data; can be used if little is known about the topic.
• Disadvantages: Low reliability and validity; you need excellent conversational skills to keep the
interview going; at risk of the Hawthorne effect, observer bias, recall bias, and social desirability bias;
easy to get sidetracked; hard to compare data; preparation is very time-consuming.

Focus group
• Advantages: Efficient, since you interview multiple people at once; respondents are often more at ease;
relatively cost-efficient; easier to discuss difficult topics.
• Disadvantages: You can ask only a limited number of questions due to time constraints; you need good
conversational and leadership skills; higher risk of observer bias, recall bias, and social desirability
bias; you can’t guarantee confidentiality or other ethical considerations, since multiple people are
present.

2.9.4 Benefits of Data Collection

Analyzing the Performances


As an entrepreneur, it’s important to not only collect data but also analyze and use it in innovative ways
for performance analysis. While data can be found in many places, the challenge is gathering reliable data
that helps your business grow and develop quality output. Data collection tools play a vital role here,
yielding accurate results, quicker solutions, and business insights that will propel your business further.
Studying the Customers

There’s a reason why companies like Facebook and Google collect data to study their customers. It can
be helpful when you want to know what your customers are looking for. These two companies aren’t alone
in their quest for data. Any business that wants to grow wants information, which means that there is now
a gold rush of sorts happening with everyone trying to get more data than the other guy by way of social
media and other avenues.

What the tools out there for collecting and gathering data can do is way more than just give you more
customers. They can help you understand and study your customers, who may prove invaluable down the
line as you move your business forward.

Seeking Problem Solutions in Business

If a campaign doesn’t meet your expectations and ends up not working well in the market, even with all
the effort put into it, you can deploy scraping or data collection tools to help you analyze why this
happened. That matters because, if there is a particular area where a product can be improved or modified
before being re-launched or replicated, your time and money aren’t wasted repeating something that
doesn’t get the desired result.

2.9.5 Data Collection Tools


We have discussed various techniques, let’s narrow our focus even further by looking at some specific
tools. For example, we mentioned interviews as a technique, but we can further break that down into
different interview types (or “tools”).


Word Association

The researcher gives the respondent a set of words and asks them what comes to mind when they hear
each word.

Sentence Completion

Researchers use sentence completion to understand what kind of ideas the respondent has. This tool
involves giving an incomplete sentence and seeing how the interviewee finishes it.

Role-Playing

Respondents are presented with an imaginary situation and asked how they would act or react if it was
real.

In-Person Surveys

The researcher asks questions in person.

Online/Web Surveys

These surveys are easy to accomplish, but some users may be unwilling to answer truthfully, if at all.

Mobile Surveys

These surveys take advantage of the increasing proliferation of mobile technology. Mobile collection
surveys rely on mobile devices like tablets or smartphones to conduct surveys via SMS or mobile apps.

Phone Surveys

No researcher can call thousands of people at once, so they need a third party to handle the chore.
However, many people have call screening and won’t answer.

Observation

Sometimes, the simplest method is the best. Researchers who make direct observations collect data
quickly and easily, with little intrusion or third-party bias. Naturally, it’s only effective in small-scale
situations.

2.9.6 Data collection Process

1. Decide What Data You Want to Gather

The first thing that we need to do is decide what information we want to gather. We must choose the
subjects the data will cover, the sources we will use to gather it, and the quantity of information that we
would require. For instance, we may choose to gather information on the categories of products that an
average e-commerce website visitor between the ages of 30 and 45 most frequently searches for.

2. Establish a Deadline for Data Collection

The process of creating a strategy for data collection can now begin. We should set a deadline for our data
collection at the outset of our planning phase. Some forms of data we might want to continuously collect.
We might want to build up a technique for tracking transactional data and website visitor statistics over
the long term, for instance. However, we will track the data throughout a certain time frame if we are
tracking it for a particular campaign. In these situations, we will have a schedule for when we will begin
and finish gathering data.

3. Select a Data Collection Approach

We will select the data collection technique that will serve as the foundation of our data gathering plan at
this stage. We must take into account the type of information that we wish to gather, the time period during
which we will receive it, and the other factors we decide on to choose the best gathering strategy.

4. Gather Information

Once our plan is complete, we can put our data collection plan into action and begin gathering data. In our
DMP, we can store and arrange our data. We need to be careful to follow our plan and keep an eye on
how it's doing. Especially if we are collecting data regularly, setting up a timetable for when we will be
checking in on how our data gathering is going may be helpful. As circumstances alter and we learn new
details, we might need to amend our plan.

5. Examine the Information and Apply Your Findings

It's time to examine our data and arrange our findings after we have gathered all of our information. The
analysis stage is essential because it transforms unprocessed data into insightful knowledge that can be
applied to better our marketing plans, goods, and business judgments. The analytics tools included in our
DMP can be used to assist with this phase. We can put the discoveries to use to enhance our business once
we have discovered the patterns and insights in our data.

2.9.7 Common Challenges in Data Collection

There are some prevalent challenges faced while collecting data, let us explore a few of them to understand
them better and avoid them.

Data Quality Issues

The main threat to the broad and successful application of machine learning is poor data quality. Data
quality must be your top priority if you want to make technologies like machine learning work for you.
Let’s look at some of the most prevalent data quality problems and how to fix them.

Inconsistent Data


When working with various data sources, it's conceivable that the same information will have
discrepancies between sources. The differences could be in formats, units, or occasionally spellings. The
introduction of inconsistent data might also occur during firm mergers or relocations. Inconsistencies in
data have a tendency to accumulate and reduce the value of data if they are not continually resolved.
Organizations that have heavily focused on data consistency do so because they only want reliable data to
support their analytics.

Data Downtime

Data is the driving force behind the decisions and operations of data-driven businesses. However, there
may be brief periods when their data is unreliable or not prepared. Customer complaints and subpar
analytical outcomes are only two ways that this data unavailability can have a significant impact on
businesses. A data engineer spends about 80% of their time updating, maintaining, and guaranteeing the
integrity of the data pipeline. In order to ask the next business question, there is a high marginal cost due
to the lengthy operational lead time from data capture to insight.

Schema modifications and migration problems are just two examples of the causes of data downtime. Data
pipelines can be difficult due to their size and complexity. Data downtime must be continuously
monitored, and it must be reduced through automation.

Ambiguous Data

Even with thorough oversight, some errors can still occur in massive databases or data lakes. For data
streaming at a fast speed, the issue becomes more overwhelming. Spelling mistakes can go unnoticed,
formatting difficulties can occur, and column heads might be deceptive. This unclear data might cause a
number of problems for reporting and analytics.

Duplicate Data

Streaming data, local databases, and cloud data lakes are just a few of the sources of data that modern
enterprises must contend with. They might also have application and system silos. These sources are likely
to duplicate and overlap each other quite a bit. For instance, duplicate contact information has a substantial
impact on customer experience. If certain prospects are ignored while others are engaged repeatedly,
marketing campaigns suffer. The likelihood of biased analytical outcomes increases when duplicate data
are present. It can also result in ML models with biased training data.
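As an illustration, a minimal pandas sketch that drops duplicate contact records before analysis; the records are hypothetical, and pandas is assumed to be available:

    # De-duplication: remove repeated contacts keyed on email address
    # before analysing the data or training a model on it.
    import pandas as pd

    contacts = pd.DataFrame({
        "email": ["a@x.com", "b@x.com", "a@x.com", "c@x.com"],
        "name":  ["Ann", "Ben", "Ann", "Cara"],
    })
    deduped = contacts.drop_duplicates(subset="email", keep="first")
    print(deduped)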

Too Much Data

While we emphasize data-driven analytics and its advantages, a data quality problem with excessive data
exists. There is a risk of getting lost in an abundance of data when searching for information pertinent to
your analytical efforts. Data scientists, data analysts, and business users devote 80% of their work to
finding and organizing the appropriate data. With an increase in data volume, other problems with data
quality become more serious, particularly when dealing with streaming data and big files or databases.

Inaccurate Data


For highly regulated businesses like healthcare, data accuracy is crucial. Given the current experience, it
is more important than ever to increase the data quality for COVID-19 and later pandemics. Inaccurate
information does not provide you with a true picture of the situation and cannot be used to plan the best
course of action. Personalized customer experiences and marketing strategies underperform if your
customer data is inaccurate.

Data inaccuracies can be attributed to a number of things, including data degradation, human error, and
data drift. Worldwide data decay occurs at a rate of about 3% per month, which is quite concerning. Data
integrity can be compromised while data are transferred between different systems, and data quality can
deteriorate over time.

Hidden Data

The majority of businesses only utilize a portion of their data, with the remainder sometimes being lost in
data silos or discarded in data graveyards. For instance, the customer service team might not receive client
data from sales, missing an opportunity to build more precise and comprehensive customer profiles.
Hidden data also means missed opportunities to develop novel products, enhance services, and streamline
procedures.

2.10 Questionnaire

A questionnaire is a list of questions or items used to gather data from respondents about their attitudes,
experiences, or opinions. Questionnaires can be used to
collect quantitative and/or qualitative information.

Questionnaires are commonly used in market research as well as in the social and health sciences. For
example, a company may ask for feedback about a recent customer service experience, or psychology
researchers may investigate health risk perceptions using questionnaires.

Questionnaires vs. surveys


A survey is a research method where you collect and analyze data from a group of people.
A questionnaire is a specific tool or instrument for collecting the data.

Designing a questionnaire means creating valid and reliable questions that address your research
objectives, placing them in a useful order, and selecting an appropriate method for administration.

But designing a questionnaire is only one component of survey research. Survey research also involves
defining the population you’re interested in, choosing an appropriate sampling method, administering
questionnaires, data cleansing and analysis, and interpretation.

Sampling is important in survey research because you’ll often aim to generalize your results to the
population. Gather data from a sample that represents the range of views in the population for externally
valid results. There will always be some differences between the population and the sample, but
minimizing these will help you avoid several types of research bias, including sampling
bias, ascertainment bias, and undercoverage bias.


2.10.1 Questionnaire methods


Questionnaires can be self-administered or researcher-administered. Self-administered questionnaires are
more common because they are easy to implement and inexpensive, but researcher-administered
questionnaires allow deeper insights.

Self-administered questionnaires
Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through
mail. All questions are standardized so that all respondents receive the same questions with identical
wording.

Self-administered questionnaires can be:

• Cost-effective
• Easy to administer for small and large groups
• Anonymous and suitable for sensitive topics
• Self-paced

But they may also be:

• Unsuitable for people with limited literacy or verbal skills


• Susceptible to a nonresponse bias (most people invited may not complete the questionnaire)
• Biased towards people who volunteer because impersonal survey requests often go ignored.

Researcher-administered questionnaires
Researcher-administered questionnaires are interviews that take place by phone, in-person, or online
between researchers and respondents.

Researcher-administered questionnaires can:

• Help you ensure the respondents are representative of your target audience
• Allow clarifications of ambiguous or unclear questions and answers
• Have high response rates because it’s harder to refuse an interview when personal attention is given
to respondents

But researcher-administered questionnaires can be limiting in terms of resources. They are:

• Costly and time-consuming to perform


• More difficult to analyze if you have qualitative responses
• Likely to contain experimenter bias or demand characteristics
• Likely to encourage social desirability bias in responses because of a lack of anonymity

Open-ended vs. closed-ended questions


Your questionnaire can include open-ended or closed-ended questions or a combination of both.

Using closed-ended questions limits your responses, while open-ended questions enable a broad range of
answers. You’ll need to balance these considerations with your available time and resources.


Closed-ended questions
Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from.
Closed-ended questions are best for collecting data on categorical or quantitative variables.

Categorical variables can be nominal or ordinal. Quantitative variables can be interval or ratio.
Understanding the type of variable and level of measurement means you can perform
appropriate statistical analyses for generalizable results.

Examples of closed-ended questions for different variables


Nominal variables include categories that can’t be ranked, such as race or ethnicity. This includes binary
or dichotomous categories.

It’s best to include categories that cover all possible answers and are mutually exclusive. There should be
no overlap between response items.

In binary or dichotomous questions, you’ll give respondents only two options to choose from.

Example: Nominal variables. What is your race?

White
Black or African American
American Indian or Alaska Native
Asian
Native Hawaiian or Other Pacific Islander

Are you satisfied with the current work-from-home policies?


Yes
No
Ordinal variables include categories that can be ranked. Consider how wide or narrow a range you’ll
include in your response items, and their relevance to your respondents.

Example: Ordinal variables. What is your age?

15 or younger
16–35
36–60
61–75
76 or older

Likert scale questions collect ordinal data using rating scales with 5 or 7 points.

Example: Likert-type questions. How satisfied or dissatisfied are you with your online shopping experience
today?

Very dissatisfied
Somewhat dissatisfied
Neither satisfied nor dissatisfied
Somewhat satisfied
Very satisfied


When you have four or more Likert-type questions, you can treat the composite data as quantitative data
on an interval scale. Intelligence tests, psychological scales, and personality inventories use multiple
Likert-type questions to collect interval data.

With interval or ratio scales, you can apply strong statistical hypothesis tests to address your research
aims.
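A minimal Python sketch of forming such a composite from four hypothetical 5-point Likert items:

    # Composite Likert score: average each respondent's answers to four
    # related items and treat the composite as approximately interval data.
    responses = [
        [4, 5, 4, 3],  # respondent 1's answers to four satisfaction items
        [2, 2, 3, 1],  # respondent 2
        [5, 5, 4, 5],  # respondent 3
    ]
    composites = [sum(items) / len(items) for items in responses]
    print(composites)  # [4.0, 2.0, 4.75]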

Pros and cons of closed-ended questions


Well-designed closed-ended questions are easy to understand and can be answered quickly. However, you
might still miss important answers that are relevant to respondents. An incomplete set of response items
may force some respondents to pick the closest alternative to their true answer. These types of questions
may also miss out on valuable detail.

To solve these problems, you can make questions partially closed-ended, and include an open-ended
option where respondents can fill in their own answer.

Open-ended questions
Open-ended, or long-form, questions allow respondents to give answers in their own words. Because there
are no restrictions on their choices, respondents can answer in ways that researchers may not have
otherwise considered. For example, respondents may want to answer “multiracial” for the question on
race rather than selecting from a restricted list.

Example: Open-ended questions

• How do you feel about open science?


• How would you describe your personality?
• In your opinion, what is the biggest obstacle for productivity in remote work?

Open-ended questions have a few downsides.

They require more time and effort from respondents, which may deter them from completing the
questionnaire.

For researchers, understanding and summarizing responses to these questions can take a lot of time and
resources. You’ll need to develop a systematic coding scheme to categorize answers, and you may also
need to involve other researchers in data analysis for high reliability.

Question wording
Question wording can influence your respondents’ answers, especially if the language is unclear,
ambiguous, or biased. Good questions need to be understood by all respondents in the same way (reliable)
and measure exactly what you’re interested in (valid).

Use clear language


You should design questions with your target audience in mind. Consider their familiarity with your
questionnaire topics and language and tailor your questions to them.

For readability and clarity, avoid jargon or overly complex language. Don’t use double negatives because
they can be harder to understand.


Use balanced framing


Respondents often answer in different ways depending on the question framing. Positive frames are
interpreted as more neutral than negative frames and may encourage more socially desirable answers.

Example: Positive vs negative frames

Positive frame: Should protests of pandemic-related restrictions be allowed?
Negative frame: Should protests of pandemic-related restrictions be forbidden?

Use a mix of both positive and negative frames to avoid research bias, and ensure that your question
wording is balanced wherever possible.

Unbalanced questions focus on only one side of an argument. Respondents may be less likely to oppose
the question if it is framed in a particular direction. It’s best practice to provide a counter argument within
the question as well.

Example: Unbalanced vs balanced frames

Unbalanced: Do you favor…? Balanced: Do you favor or oppose…?
Unbalanced: Do you agree that…? Balanced: Do you agree or disagree that…?

Avoid leading questions


Leading questions guide respondents towards answering in specific ways, even if that’s not how they truly
feel, by explicitly or implicitly providing them with extra information.

It’s best to keep your questions short and specific to your topic of interest.

Example: Leading questions

The average daily work commute in the US takes 54.2 minutes and costs $29 per day. Since 2020,
working from home has saved many employees time and money. Do you favor flexible work-from-
home policies even after it’s safe to return to offices?

Experts agree that a well-balanced diet provides sufficient vitamins and minerals, and multivitamins
and supplements are not necessary or effective. Do you agree or disagree that multivitamins are helpful
for balanced nutrition?

2.11 Rating Scale

Rating scales have been popularly used by brands to collect customer feedback on product or service
reviews. These questions are easy to recognize and understand; sometimes respondents don't even need to
read the question. We see smiley or star ratings and know what to do. This type of scale is one of the most
commonly used questionnaire types for online and offline surveys. It consists of closed-ended questions
along with a set of categories as options for respondents. It helps gain information on qualitative and
quantitative attributes.


The most common examples are the Likert scale, star ratings, and sliders. For instance, when you visit an
online shopping site, it asks you to rate your shopping experience.

It is a popular choice for conducting market research. It can gather more relevant information about a
product or certain aspects of the product. Leverage the best market research software that offers you
various types of rating scale questions.

The scale is commonly used to gain feedback or to evaluate. It can be used to gain insight into the
performance of a product, employee satisfaction or skill, customer service performance, etc.

2.11.1 Categories of Rating scale

It is divided into two categories: ordinal scale and interval scale. Some data are measured at the ordinal
level, and some at the interval level.

Ordinal Scale: An ordinal scale gathers data by putting them in a rank without a degree of difference.

Interval Scale: An interval scale measures data with equal distance between two adjacent attributes.

Robust online survey tools should allow you to create interactive surveys with rating questions to keep
the respondents engaged.

Now that we have learned what it is and the two categories of the collected data, let’s look into the different
types.

These six scales gather data based on the categories mentioned above.

1. Numeric scale.
2. Verbal scale.
3. Slider scale.
4. Likert scale.
5. Graphic scale.
6. Descriptive scale.
Researchers should ensure that the survey software they use enables them to create surveys with
various question types. We have explained these six types in detail to help you determine the right time to
use the right question.
Numeric rating scale or NRS
A numeric scale uses numbers to identify the items in a scale. However, not all numbers need to have an
attribute attached to them.
For example, you can ask your target audience to rate your product from 1 to 5 on a scale. You can put 1
as totally dissatisfied and 5 as totally satisfied.
Verbal rating scale or VRS:
Verbal scales are commonly used for pain assessment. Also known as a verbal pain score or verbal
descriptor scale, it compiles a number of statements describing pain intensity and duration.
For instance, when you go to a dentist, you are asked to rate the intensity of your tooth pain. At that
time, you receive a scale with items like “none,” “mild,” “moderate,” “severe,” and “very severe.”
Visual analog scale (VAS) or Slider scale:
The idea behind VAS is to let the audience select any value from the scale between two endpoints. On the
scale, only the endpoints have attributes allotted to them; the rest of the scale is empty.


Often just called a slider scale, the audience can rate whatever they want without being restricted to
particular characteristics or rank.
For example, a scale rating ranges from extremely easy to extremely difficult, with no other value
allotted.
Likert scale:
A Likert scale is a useful tool for effective market research to receive feedback on a wide range of
psychometric attributes. The agree-disagree scale is particularly useful when your intention is to gather
information on frequency, experience, quality, likelihood, etc.
For example, to evaluate employee satisfaction with company policies, a Likert scale is a good tool to
use.
Graphic rating scale:
Instead of numbers, imagine using pictures, such as stars or smiley faces, to ask your customers and
audience to rate.
The stars and smiley faces can generate the same value as a number.
Descriptive scale:
In certain surveys or research, a numeric scale may not help much. A descriptive scale explains each
option for the respondent.
It contains a thorough explanation for the purpose of gathering information with deep insights.
You can use these six types in your surveys to make it an engaging and fun experience for the survey
takers. Robust online survey tools offer a diverse range of question types, such as rating scales, ranking
scales, MCQs, etc.

2.11.2 Advantages of using rating scales in Research


• It is a simple and easy-to-understand question type for both the researcher and the audience.
• It doesn’t take too much of the respondents’ time.
• There are various types of scales to help you create an engaging survey.
• In terms of marketing surveys, this scale is a valuable tool for data analysis. It can gain product
review for evaluation and a further improvement in marketing strategy.
2.11.3 Disadvantages of rating scales

• It does not help collect the reason behind a customer review.


• It captures the overall experience but not the reason behind the audience’s perception.
• In the case of VRS, the scale may oftentimes overestimate the patient’s pain experience. In addition,
patients with limited vocabulary may not understand the statements in a verbal descriptor scale.

2.11.4 Steps to Create a Rating Scale Question


A poorly created rating scale question can potentially confuse your audience, hamper your analysis, and
increase costs. Here are three steps to create the best rating scale questions for any audience:

Step 1: Choose the Right Rating Scale


As we discussed above, there are different types of rating scales available. Depending on your research
goals, you can choose a rating scale that best fits your needs. For instance, if your goal is to identify loyal
or at-risk customers, you can go for the traditional NPS rating scale. Similarly, if you wish to create
surveys for your international audience, you can use graphic or pictorial scales.
Step 2: Choose the Right Response Options
Choosing the right response options is as important as the question itself. The scale should offer enough
answer options for the respondents to easily choose the most valid answer. Depending on your objectives,
you can go for a 1-5 or a 1-7 rating scale. Make sure the listed options are clear, easy to understand, and
free from any technical jargon.


You can include a text box question after your rating scale question so that respondents can get the
opportunity to expand on their previous answers. This will help you better understand why someone gave
you the rating they did.
Step 3: Share on the Best Channels
Once you are happy with the look and feel of your rating scale questionnaire, you must share it with your
target audience. With ProProfs Survey Maker, you can share your survey as a link via email, social media
or embed it directly on your website. Moreover, you can even monitor your responses in real time.

2.12 Socio-Metric Technique


Socio-metric technique or test, as one of the non-testing devices, was first developed by J.L. Moreno and
Helen Jennings in the 1930s. It is a means of presenting simply and graphically the structure
of social relations, lines of communication and the patterns of friendship, attraction and rejection that
exist at a given time among members of a particular group.
Through this technique the counsellor or the guidance personnel can measure the degree of acceptance or
rejection between the members of the group. It is commonly observed that some students always like to
stay together, some students are more liked by all students, some students aren't liked by anyone, and so
on. These social relationships existing among them influence all aspects of their development.
According to Moreno, J.L. (1947), "Sociometry is a method for discovering, describing, and
evaluating social status, structure, and development through measuring the extent of acceptance or
rejection in social groups."

The essential qualities or features of a socio-metric test are as follows:


(i) It is a simple and graphical presentation of data about the group.
(ii) It presents the structure of social relationship that exist among the members of the group.
(iii) It indicates the friendship pattern among group members.
(iv) It indicates the line of attraction and rejection among group members.
(v) It has always a time reference.
(vi) It indicates the person most chosen as the leader, and the person not chosen at all, the isolate.

The Techniques Followed:


(i) If the group is large, divide it into smaller subgroups consisting of ten members each.
(ii) The members of each group or sub-group may be numbered from one to ten.
(iii) Ask each member to write the name or the number of the student with whom he likes most to work, to
play, or to sit, etc. (a small tallying sketch follows).
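To make the tallying step concrete, here is a minimal Python sketch; the group members and their choices are hypothetical.

```python
# A minimal sketch: tallying sociometric choices to spot the most-chosen
# member (the potential leader or "star") and the isolates. The group
# members and their choices are hypothetical.
from collections import Counter

# Each key is a member; the value is the member they most like to work with.
choices = {1: 2, 2: 3, 3: 2, 4: 2, 5: 3}

received = Counter(choices.values())
star = max(received, key=received.get)                 # most frequently chosen
isolates = [m for m in choices if m not in received]   # chosen by no one

print("Choices received:", dict(received))
print("Most chosen (potential leader):", star)
print("Isolates:", isolates)
```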

2.12.1 Uses of Socio-Metric Technique


The Socio-metric technique has the following uses in the guidance programme:
(i) By studying the choice of students through socio-metric technique the teacher can determine the nature
and degree of social relationship existing among the students.
(ii) It is useful in identifying those who are isolated, the one who is not preferred by any other individual.
(iii) It is also useful for identifying those who are liked by many others and who can be better leader of
the group. By working with them guidance can be provided.
(iv) Socio-metric technique is more useful with small groups. The position or status of the individual is
determined on the basis of some particular criterion.
(v) It is a simple, economical and natural method of observational and data collection.
(vi) Socio-metric methods are used whenever human actions like choosing, influencing, dominating and
communicating in group situations are involved.
(vii) They can be employed in a wide variety of research in the laboratory as well as in the field.
(viii) They can be used to discover cliques in groups, communication and influence channels, patterns of
cohesiveness and connectedness, and so on.


2.13 Checklist to Consider at the End of Data Collection

Stacey Barr has formulated the checklist below:

Step 1: Make the purpose clear.


• Identify the performance measures, business questions or decisions that you require the data for.
• List the data items you need to collect (eg these may come from your Performance Measure
Definitions, if you have them, or analysis of the information requirements for your business
questions or decisions).
• Make sure your data items are useful, NOT just interesting. If they are just interesting, then
consider the unintended consequences of collecting them (such as cost, annoying respondents or
data collectors, compromising integrity, etc.).
• Develop a purpose statement for the data collection process, so that everyone understands why it
exists.
Step 2: Define the scope of your data collection.

• List the characteristics that define who or what you will be collecting data about (eg age groups,
roles, activities involved in, education level).
• List the characteristics that define where this data will be collected (eg specific departments or
divisions, geographical locations, specific offices or places of work).
• List the characteristics that define when this data will be collected (eg during November, all the
time, for the next 3 years, until an improvement is achieved).
• Use these lists to define the scope of your data collection: your ‘target population’.
• Check and refine your scope definition by testing it with examples of people, things, places or
times that are out of scope.

Step 3: Design your sample.

• Define how reliable you want the data to be (eg how small a change in your measures do you want to
be able to reliably detect?). This may already be recorded in your Performance Measure Definitions.
• Nominate any demographic or classification (or drilling) variables that you want to use in analysis of
your data (eg do you want to have averages or percentages by geographic location or age group or
department or gender?). This may already be recorded in your Performance Measure Definitions.
• Discuss what kind of results you are expecting, in terms of the range of data values you think you are
likely to get (eg are customers likely to rate their satisfaction mostly at 3 or 4 on your 5 point
satisfaction scale, or are they likely to be more spread out on the scale?).
• Explore logistical constraints of collecting data from your target population e.g. accessibility, cost and
data integrity.
• Use the above four decisions (and a survey statistician or other assistance) to decide whether or not a
sample will be more cost effective than a census. And if you have chosen to go with a sample, get
professional help so you don’t inadvertently make it completely useless:
• Identify a survey statistician or other assistance in survey sampling. It’s a science, not an art.
• Decide whether or not it will be stratified (ie your total sample is really a collection of smaller samples
based on your demographic or classification variable, which may be geographic location, age group,
department or gender). Stratifying a sample can sometimes be a way to reduce the overall sample size
or improve the overall reliability of the results.
• Select a sample size (or sample sizes, if stratifying) that will deliver the reliability you require (a worked formula follows this list).
• Select your sample using a random method – not a convenient method like quotas or volunteers – or
else you run the risk of bias, where the data you get is not representative of your target population.
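As a companion to the sample-size step above, here is a minimal sketch of the standard formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2. This is a common textbook approach rather than anything prescribed by the checklist itself, and it is no substitute for a survey statistician.

```python
# A minimal sketch of the standard sample-size formula for estimating a
# proportion, with an optional finite-population correction.
import math

def sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """z: z-score for the confidence level (1.96 for ~95%);
    p: expected proportion (0.5 is the most conservative choice);
    e: desired margin of error;
    population: optional finite population size."""
    n = (z ** 2) * p * (1 - p) / (e ** 2)
    if population:
        n = n / (1 + (n - 1) / population)  # finite-population correction
    return math.ceil(n)

print(sample_size())                 # 385 for a large population
print(sample_size(population=2000))  # 323 after the correction
```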


Step 4: Develop your data collection instrument.

• Decide the basic method of data collection you want (or can afford), such as self-completion, telephone
interview, face to face interview, focus group, or automated (if possible).
• Formulate questions or constructs around the set of data items you listed at Step 1. Give consideration
to the type of construct that will give you the data you need, such as open-ended questions, yes/no
questions, multiple choice, rating scales, option lists, etc.
• Sequence the questions or constructs in a logical order.
• Check the language and wording of your questions or constructs to remove ambiguity and “fluff”.
Give consideration to providing concise instructions for how to respond to each construct.
• Design a layout for arranging your questions in a readable and usable way. Give consideration to the
medium you will use (such as web page, computer data entry screen, paper, etc.), how you align things
on the “page”, how you use white space to stop it looking like a huge blob of text, how you use contrast
to make questions stand apart from instructions and the response area (eg the option list, the rating
scale, etc.).
• Test your questionnaire or form on a handful of people, ideally those who will collect the data or
provide the data. The obvious problems won’t be obvious to you. Absorb their feedback for ideas on
making the questionnaire more relevant, understandable and usable.

Step 5: Flowchart the procedure of collecting the data.

• Identify the trigger that will let people know that data has to be collected. It might be a customer phone
call, a specific event occurring or finishing, an activity starting.
• Identify how the data will be captured, such as which database will it be entered into.
• List the steps that you think will be involved in the data collection procedure, from the trigger to the
capture of the data. Note down who will take a role in which step.
• Draw a flowchart (or cross-functional process map) that shows the flow of the steps through time,
against who performs them. Give consideration to the expected time frames within which each step
should be performed.
• For each step, identify the resources required to perform it successfully.

Step 6: Pilot test the whole thing.

• Choose a part of your data scope, based on location and time, in which you will conduct your pilot
test.
• List the outcomes that define success for this data collection process. Explore what success might look
like from each stakeholder’s point of view (eg people collecting the data, providing the data, capturing
the data, using the data, etc.). These might include impact on people’s time, data integrity, data
usability, costs and timeliness.
• Develop a Pilot Test plan for testing the data collection process, and a way to observe “evidence of
success”.
• Implement the Pilot Test plan.
• Reflect on the “evidence of success” and summarise what you learned. List changes or improvements
that you need to make to the data collection process and/or resources.
• Make the improvements to your data collection design.
• Deploy the data collection process. Continue to monitor it over time to ensure the “success outcomes”
are tracking well.

2.14 Pilot Study


Pilot studies can play a very important role prior to conducting a full-scale research project.
A pilot study, also called a 'feasibility' study, is a small-scale preliminary study conducted before any
large-scale quantitative research in order to evaluate the potential for a future, full-scale project.
Pilot studies are a fundamental stage of the research process. They can help identify design issues and
evaluate feasibility, practicality, resources, time, and cost of a study before the main research is conducted.
This enables researchers to predict an appropriate sample size, budget accordingly, and improve upon the
study design prior to performing a full-scale project.
Pilot studies also provide researchers with preliminary data so they can gain insight into the potential
results of their proposed experiment.
However, pilot studies should not be used to test hypotheses since the appropriate power and sample size
are not calculated. Rather, pilot studies should be used to assess the feasibility of participant recruitment
or study design.
By conducting a pilot study, researchers will be better prepared to face the challenges that might arise in
the larger study, and they will be more confident with the instruments they will use for data collection.
In some studies, multiple pilot studies may be needed and qualitative and/or quantitative methods may be
used.
In order to avoid bias, pilot studies are usually carried out on individuals who are as similar as possible to
the target population, but not on those who will be a part of the final sample.

Components of a Pilot Study


Whether your research is a clinical trial of a medical treatment or a survey in the form of a questionnaire,
you want your study to be informative and add value to your research field. Things to consider in your
pilot study include:

• Sample size and selection. Your data needs to be representative of the target study population. You
should use statistical methods to estimate the feasibility of your sample size.
• Determine the criteria for a successful pilot study based on the objectives of your study. How will
your pilot study address these criteria?
• When recruiting subjects or collecting samples ensure that the process is practical and manageable.
• Always test the measurement instrument. This could be a questionnaire, equipment, or methods
used. Is it realistic and workable? How can it be improved?
• Data entry and analysis. Run the trial data through your proposed statistical analysis to see whether
your proposed analysis is appropriate for your data set.
• Create a flow chart of the process.

Importance of Pilot Study in Research


Pilot studies should be routinely incorporated into research designs because they:

1. Help define the research question


2. Test the proposed study design and process. This could alert you to issues which may negatively
affect your project.
3. Educate yourself on different techniques related to your study.
4. Test the safety of the medical treatment in preclinical trials on a small number of participants. This
is an essential step in clinical trials.
5. Determine the feasibility of your study, so you don’t waste resources and time.


6. Provide preliminary data that you can use to improve your chances for funding and convince
stakeholders that you have the necessary skills and expertise to successfully carry out the research.

Advantages of Pilot Studies


• Increasing research quality
• Assessing the practicality and feasibility of the main study
• Testing the efficacy of research instruments
• Identifying and addressing any weaknesses or logistical problems
• Collecting preliminary data
• Estimating the time and costs required for the project
• Determining what resources are needed for the study
• Identifying the necessity to modify procedures that do not elicit useful data
• Adding credibility and dependability to the study
• Pretesting the interview format
• Enabling researchers to develop consistent practices and familiarize themselves with the
procedures in the protocol
• Addressing safety issues and management problems

Limitations of Pilot Studies


• Require extra costs, time, and resources.
• Do not guarantee the success of the main study.
• Contamination (ie: if data from the pilot study or pilot participants are included in the main study
results).
• Funding bodies may be reluctant to fund a further study if the pilot study results are published.
• Do not have the power to assess treatment effects due to small sample size.

2.15 Processing of Data

Data processing occurs when data is collected and translated into usable information. Usually performed
by a data scientist or a team of data scientists, data processing must be done correctly so as not to
negatively affect the end product, or data output.

Data processing starts with data in its raw form and converts it into a more readable format (graphs,
documents, etc.), giving it the form and context necessary to be interpreted by computers and utilized by
employees throughout an organization. In essence, it is the conversion of raw data into meaningful,
machine-readable information. "It can refer to the use of automated methods to process commercial data."
Typically, this uses relatively simple, repetitive activities to process large volumes of similar information.
Raw data is the input that goes into some sort of processing to generate meaningful output.


2.15.1 Methods of data processing

You can choose from three primary methods of data processing based on your needs:

Manual data processing: Through this method, users process data manually, meaning they carry out
every step without using electronics or automation software. Though this method is the least expensive
and requires minimal resources, it can be time-consuming and has a higher risk of producing errors.
Mechanical data processing: Mechanical processing involves the use of machines and devices to filter
data, such as calculators, printing presses or typewriters. This method is suitable for simple data processing
endeavors and produces fewer errors but is more complex than other techniques.
Electronic data processing: Researchers process data using modern data processing software and
technologies, feeding an instruction set to the program to analyze the data and yield output. Though this
method is the most expensive, it is also the fastest and most reliable for generating accurate output.

2.15.2 Benefits of data processing in quantitative research

When you use data processing in quantitative research, your company will experience a range of benefits:

• Easier report building


• Higher processing speed
• Cost reduction
• Simple storage
• Greater data accuracy
• Regulatory compliance
• Enhanced security
• Smooth collaboration

Editing
First step in analysis is to edit the raw data. Editing detects errors and omissions and corrects them
wherever possible. The editor's responsibility is to guarantee that data are accurate; consistent with the
intent of the questionnaire; uniformly entered; complete; and arranged to simplify coding and tabulation.
Editing of data may be accomplished in two ways: (i) field editing and (ii) in-house, also called central,
editing. Field editing is preliminary editing of data by a field supervisor on the same day as the interview.
Its purpose is to identify technical omissions, check legibility, and clarify responses that are logically and
conceptually inconsistent. When gaps are present from interviews, a call-back should be made rather than
guessing what the respondent would probably say. The supervisor should re-interview a few respondents,
at least on some pre-selected questions, as a validity check. In central or in-house editing, all the
questionnaires undergo thorough editing. It is a rigorous job performed by central office staff.

Coding:
Coding is the process of assigning symbols (alphabetical or numerical, or both) to the answers so that the
responses can be recorded into a limited number of classes or categories. The classes should be appropriate
to the research problem being studied. They must be exhaustive and mutually exclusive so that the answer
can be placed in one and only one cell in a given category. Further, every
class must be defined in terms of only one concept. The coding is necessary for the efficient analysis of
data. The coding decisions should usually be taken at the designing stage of the questionnaire itself so that
the likely responses to questions are pre-coded. This simplifies computer tabulation of the data for further
analysis. It may be noted that any errors in coding should be eliminated altogether or at least be reduced
to the minimum possible level. Coding for an open-ended question is more tedious than for a closed-ended
question. For a closed-ended or structured question, the coding scheme is very simple and designed prior
to the field work. For example, consider the following question:
What is your gender? 1. Male 2. Female
We may assign a code of '0' to male and '1' to female respondents. These codes may be specified prior to
the field work, and if the codes are written on all questions of a questionnaire, it is said to be wholly
precoded.

The same approach could also be used for coding numeric data that either are not to be coded into categories
or have had their relevant categories specified. For example: What is your monthly income? Here the
respondent would indicate his monthly income, which may be entered in the relevant column.

The same question may also be asked like this: What is your monthly income? < Rs. 5000 / Rs. 5000 - 8999 /
Rs. 9000 - 12999 / Rs. 13000 or above
We may code the class 'less than Rs. 5000' as '1', 'Rs. 5000 - 8999' as '2', 'Rs. 9000 - 12999' as '3' and
'Rs. 13000 or above' as '4'.
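A minimal Python sketch of applying such a coding scheme with pandas; the data frame and column names are hypothetical.

```python
# A minimal sketch: applying a pre-defined coding scheme with pandas so
# every response falls into exactly one class. Data are hypothetical.
import pandas as pd

raw = pd.DataFrame({
    "gender": ["Male", "Female", "Female", "Male"],
    "income": [4200, 7500, 11000, 15000],
})

# Pre-coded scheme from the example above: 0 = male, 1 = female.
raw["gender_code"] = raw["gender"].map({"Male": 0, "Female": 1})

# Income classes coded 1-4: <5000, 5000-8999, 9000-12999, 13000 or above.
bins = [0, 5000, 9000, 13000, float("inf")]
raw["income_code"] = pd.cut(raw["income"], bins=bins,
                            labels=[1, 2, 3, 4], right=False)
print(raw)
```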

Transcription

Qualitative research is more about exploring an idea or a topic instead of finding specific, concrete,
objective answers. Since qualitative research focuses on individuals, groups, and cultures, its data
can’t be measured with tools like thermometers and scales. Instead, qualitative data is measured with
questionnaires, observations, or interviews. All this can make qualitative data more difficult to record
and copy compared with quantitative data.

Qualitative researchers are focused on understanding a person’s opinion or why people behave in
certain ways. This means that researchers may conduct and record focus groups, group discussions,
individual interviews, or observations of a person or group of people. They may capture and preserve
the resulting data with video or audio recordings.

These interviews and other events create important data. However, that data is usually unstructured
and needs to be sorted through and organized before researchers can make sense of it.

This is where qualitative data transcription is incredibly important. Transcription creates a text-based
version of any original audio or video recording. Qualitative data transcription provides a good first
step in arranging your data systematically and analyzing it.

Transcription is vital for qualitative research because it:

• Puts qualitative data and information into a text-based format


• Makes data easier to analyze and share
• Allows researchers to become more immersed into the data they collect
• Helps researchers create a narrative with their data
• Makes patterns easier to find
• Helps preserve the accuracy and integrity of the data


• Lets researchers focus on their observations instead of worrying about note-taking

Once data is transcribed in a text format, it can be put into a spreadsheet or similar type of document,
or entered into a qualitative data analysis tool. After data transcription, a qualitative researcher can
read through and annotate the transcriptions, then conceptualize and organize the data to
conduct inductive or deductive analysis. From there, it is a lot easier to make connections between
different observations or findings, and then write them up in the form of a study, report, or article.

Transcripts that are used mainly to select quotes and sound bites may not need the same level of details as
transcripts which will be systematically reviewed, grouped into themes (often through a process of
coding), and analyzed for content.
The sections below provide guidance on:
1) whether to transcribe, 2) budgeting time and resources required for transcription, 3) hiring transcribers,
4) tips and best practices in transcription, and 5) ethics and confidentiality.

1) Deciding Whether to Transcribe
Is it always necessary to transcribe audio or audio-visual recordings?


Not always. Some qualitative data analysis software allows users to code sound recordings (as opposed to
written text). However, keep in mind that this won’t work if the recordings require translation. Also, some
researchers prefer to have written transcripts when conducting analysis, either for ease of use or to have a
back-up in case of technical failure.

Tabulation
Tabulation is the systematic and logical representation of figures in rows and columns to ease comparison
and statistical analysis. It eases comparison by bringing related information closer to each other and helps
further in statistical research and interpretation. In other words, tabulation is a method of arranging or
organizing data in a tabular form. The tabulation process may be simple or complex depending upon the
type of categorization.

Tabulation is defined as the process of placing classified data in tabular form. A table is a systematic
arrangement of statistical information in rows and columns. The rows of a table are the horizontal
arrangement of data whereas the columns of a table are the vertical arrangement of data.

Components of Tabulation
Table Number –
This is the first part of a table and is given on top of any table to facilitate easy identification and for
further reference.


Title of the Table –


One of the most important parts of any table is its title. The title is placed either just below the table
number or to its right. It is imperative for the title to be brief, crisp, and carefully worded to describe
the table's contents effectively.
Headnote –
The headnote of a table is presented in the portion just below the title. It provides information about
the unit of data in the table, like “amount in Rupees” or “quantity in kilograms”, etc.
Column Headings or Captions –
The headings of the columns are referred to as captions. A caption consists of one or more column heads.
It should be brief, short, and self-explanatory. Column headings are written in the middle of the column
in small letters.
Row Headings or Stubs –
The title of each horizontal row is called a stub.
Body of a Table –
This is the portion that contains the numeric information collected from investigated facts. The data
in the body is presented in rows which are read horizontally from left to right and in columns, read
vertically from top to bottom.
Footnote –
Given at the bottom of a table above the source note, a footnote is used to state any fact that is not
clear from the table’s title, headings, caption or stub. For instance, if a table represents the profit
earned by a company, a footnote can be used to state if said profit is earned before, or after tax
calculations.
Source Note –
As its name suggests, a source note refers to the source from where the table’s information has been
collected.
Objectives of Tabulation
Tabulation essentially bridges the gap between the collection of data and analysing them. The primary
objectives of tabulation are given below –
To simplify complex data
It reduces the bulk of information, i.e., it reduces raw data in a simplified and meaningful form so that
it can be easily interpreted by a common man in less time.
To bring out essential features of data
It brings out the chief/main characteristics of data.
It presents facts clearly and precisely without textual explanation.
To facilitate comparison
The representation of data in rows and columns is helpful in simultaneous detailed comparison on the
basis of several parameters.
To facilitate statistical analysis
Tables serve as the best source of organised data for statistical analysis.
The task of computing average, dispersion, correlation, etc., becomes easier if data is presented in the
form of a table.
To save space
A table presents facts in a better way than the textual form.
It saves space without sacrificing the quality and quantity of data.

Types of Tabulation
Simple Tabulation or One-way Tabulation
When the data in the table are tabulated to one characteristic, it is termed as a simple tabulation or
one-way tabulation.


For example, Data tabulation of all the people of the World is classified according to one single
characteristic like religion.

Double Tabulation or Two-way Tabulation


When the data in the table are tabulated considering two different characteristics at a time, then it is
defined as a double tabulation or two-way tabulation.
For example, Data tabulation of all the people of the World is classified by two different characteristics
like religion and sex.
Complex Tabulation
When the data in the table are tabulated according to many characteristics, it is referred to as a complex
tabulation.

For example, Data tabulation of all the people of the World is classified by three or more characteristics
like religion, sex, and literacy, etc.
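These one-way and two-way tabulations can also be produced programmatically; here is a minimal pandas sketch with hypothetical respondent data.

```python
# A minimal sketch: one-way and two-way tabulation with pandas,
# using hypothetical respondent data.
import pandas as pd

df = pd.DataFrame({
    "religion": ["A", "B", "A", "A", "B", "C"],
    "sex":      ["M", "F", "F", "M", "M", "F"],
})

# Simple (one-way) tabulation: counts by a single characteristic.
print(df["religion"].value_counts())

# Double (two-way) tabulation: counts by two characteristics at once.
print(pd.crosstab(df["religion"], df["sex"], margins=True))
```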
There are a few general rules that have to be followed while constructing tables. These are

• The tables illustrated should be self-explanatory, simple and attractive. There should be no need
for further explanation (details). If the volume of information is substantial, it is best to put it
down in multiple tables instead of a single one; overcrowding a single table increases the chances
of mistakes and defeats the purpose of forming a table. However, each table formed should also be
complete in itself and serve the purpose of analysis.
• The number of rows and columns should be kept minimal to present information in a crisp and
concise manner.
• Before tabulating, data should be approximated, wherever necessary.
• Stubs and captions should be self-explanatory and should not require the help of footnotes to be
comprehended.
• If certain portions of the data collected cannot be tabulated under any stub or caption, they should
be put down in a separate table under the heading "miscellaneous".
• Quantity and quality of data should not be compromised under any scenario while forming a table.

Steps to prepare Table


1. Name your table. Write a title at the top of your paper. Make sure the title relates to the data you will
put in your table.
2. Figure out how many columns and rows you need.
3. Draw the table. Using a ruler, draw a large box. Make the necessary number of columns and rows.
Don't forget to leave the top row blank. This is where you will label your columns.
4. Label all your columns. The leftmost column should be reserved for your independent variable. For
example, if you're researching how much rain fell in the past year, your independent variable would
be the months of the year. Thus, your leftmost column would be labeled "Month" and the next column
would be labeled "Rainfall."
5. Record the data from your experiment or research in the appropriate columns. You want the
information in your table to be clear and obvious to anyone who sees it. When you're finished there
should be a number in every space. If there is an average or derived result from your data, that number
should be recorded in the rightmost column.
6. Check your table. Look over your work to make sure everything is correct and clear.

Some points you should consider before drafting the tables in your research report:


• Finalize the results that are required to be presented in tabular form.


• Include the data or results that are relevant to the main aim of the study without being choosy and
including only those results that support your hypothesis.
• Create each table in a lucid manner and style without cluttering it with in-table citations.
• Number the tables in a sequence according to their occurrence in the text.
• Don’t mix tables with figures. Maintain separate numbering systems for tables and figures.
• Create tables in a storytelling manner. Remember that your tables communicate a story to the
reader that runs parallel to the text.
• If you are using or reproducing tables from other published articles, obtain permission from the
copyright holder (usually the publisher) and/or acknowledge the source.
• Do not repeat the tabular contents in the text again; that will create confusion among readers.
• Use clear and informative text for each table title.
• Take extra care while extending the data in your tables. If you have too many tables, consider
using them as appendices or supplementary materials.
• Create tables with sufficient spacing in the layout so that they do not look messy, crowded, or
cluttered.
• Do not forget to spell out abbreviations used in the tables, ideally in the footnotes.

Tables and illustrations are important tools for efficiently communicating information and data contained
in your research paper to the readers. They present complex results in a comprehensible and organized
manner.

However, it is advisable to use tables and illustrations wisely so as to maximize the impact of your
research. They should be organized in an easy-to-understand format to convey the information and
findings collected in your research. The tabular information helps the reader identify the theme of the
study more readily. Although data tables should be complete, they should not be too complex. Instead of
including a large volume of data in a single unwieldy table, it is prudent to use small tables to help readers
identify the important information easily.



For the reader, a research paper that is dense and text-heavy can be tiresome. Conversely, tables not
only encapsulate your data lucidly, but also welcome a visual relief for the reader. They add value to the
layout of your paper. Besides, and more importantly, reviewers often glance at your tabulated data and
illustrations first before delving into the text. Therefore, tables can be the initial draw for a reviewer and
deliver a positive impact about your research paper. If you can achieve an optimum balance among your
text, tables, and illustrations, it can go a long way toward being published.

Graphical Representation
Graphical Representation is a way of analysing numerical data. It exhibits the relation between data,
ideas, information and concepts in a diagram. It is easy to understand and it is one of the most important
learning strategies. It always depends on the type of information in a particular domain. There are different
types of graphical representation. Some of them are as follows (a brief plotting sketch follows the list):

• Line Graphs – Line graph or the linear graph is used to display the continuous data and it is useful
for predicting future events over time.
• Bar Graphs – Bar Graph is used to display the category of data and it compares the data using
solid bars to represent the quantities.
• Histograms – The graph that uses bars to represent the frequency of numerical data that are
organised into intervals. Since all the intervals are equal and continuous, all the bars have the same
width.
• Line Plot – It shows the frequency of data on a given number line. An 'x' is placed above the number
line each time that data value occurs.
• Frequency Table – The table shows the number of pieces of data that falls within the given
interval.
• Circle Graph – Also known as the pie chart that shows the relationships of the parts of the whole.
The circle is considered with 100% and the categories occupied is represented with that specific
percentage like 15%, 56%, etc.
• Stem and Leaf Plot – In the stem and leaf plot, the data are organised from least value to the
greatest value. The digits of the least place values from the leaves and the next place value digit
forms the stems.
• Box and Whisker Plot – This plot summarises the data by dividing it into four parts. The box and
whiskers show the range (spread) and the middle (median) of the data.
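As a minimal illustration of two of the chart types above, a bar graph and a histogram, here is a matplotlib sketch; all data are hypothetical.

```python
# A minimal sketch of a bar graph and a histogram using matplotlib.
# All data are hypothetical.
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Bar graph: compares categories using solid bars.
ax1.bar(["Q1", "Q2", "Q3", "Q4"], [120, 95, 140, 110])
ax1.set_title("Quarterly sales")
ax1.set_ylabel("Units sold")

# Histogram: frequency of numeric data grouped into equal intervals.
scores = [55, 62, 64, 70, 71, 73, 75, 78, 80, 82, 85, 91]
ax2.hist(scores, bins=5, edgecolor="black")
ax2.set_title("Exam score distribution")
ax2.set_xlabel("Score")

plt.tight_layout()
plt.show()
```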

General Rules for Graphical Representation of Data


There are certain rules to effectively present the information in the graphical representation. They are:

• Suitable Title: Make sure that the appropriate title is given to the graph which indicates the subject
of the presentation.
• Measurement Unit: Mention the measurement unit in the graph.
• Proper Scale: To represent the data in an accurate manner, choose a proper scale.
• Index: Index the appropriate colours, shades, lines, design in the graphs for better understanding.
• Data Sources: Include the source of information wherever it is necessary at the bottom of the
graph.
• Keep it Simple: Construct a graph in an easy way that everyone can understand.


• Neat: Choose the correct size, fonts, colours, etc., in such a way that the graph serves as a visual
aid for the presentation of information.


Exercise
I. Write down short answers for the following:

1. What is research design?(Pg. 34)


2. Define the characteristics and purpose of research design. (Pg. 34&35 )
3. List out the factors affecting research design. (Pg. 40 & 41 )
4. Describe the advantages and disadvantages of business research. (Pg.46 & 47 )
5. Define reliability and validity. (Pg. 47 & 51)
6. Describe interview and its types. (Pg. )
7. What are variables? (Pg. 56 )
8. What is sample? (Pg. 59)
9. Define data collection and tools. (Pg. 64 to 69)
10. Write down the importance of pilot study. (Pg. 83 to 84)
11. List out the check list in data collection. (Pg. 81 to 83 )
12. Define graphical representation. (Pg.91 & 92 )
II. Provide Detailed Answers:
1. Elaborate the process of research design. (Pg. 36 & 37 )
2. Describe types of research design with example (Pg.37 to 40 )
3. Explain different types of research methods with illustrations (Pg. 41 to 46)
4. Describe different types of reliability and validity (Pg.48 to 56 )
5. Describe different types of variables (Pg.56 to 59 )
6. Describe probability and non probability techniques (Pg. 60 to 63 )
7. Elaborate data collection process with different research tools. (Pg. 70 to 71 )


Unit-III SAMPLING AND ANALYSIS 12


Data management plan – Sampling & measurement – Tabulation – Introducing database applications – Testing for association –
Analysis Techniques: Qualitative & Quantitative Analysis Techniques – Techniques of Testing Hypothesis – Methods of
analysis: Chi-square, t-test, Correlation & Regression Analysis, Analysis of Variance, etc. – Making Choice of an Appropriate
Analysis Technique – Market survey.

Learning Objectives
• To learn about Sampling measurement and techniques
• To learn qualitative and quantitative analysis techniques
• To understand different statistical testing methods

Learning Outcomes
At the end of the unit, students will be able to:
• Apply different database application methods
• Apply data analytic techniques
• Choose relevant research tools and methods for analysis.

DETAILED SESSION PLAN (TOPIC WISE)

S.No | Title of Topic | Mode of Teaching (PPT/Seminar/Chalk & Board etc.) | Textbook/Reference Book | Link (if applicable, on Springboard/Coursera/NPTEL)

1 | Sampling & Measurement | Chalk and board/PPT | William G Zikmund, Barry J Babin, Jon C. Carr, Atanu Adhikari, Mitch Griffin, Business Research Methods: A South Asian Perspective, 8th Edition, Cengage Learning, New Delhi, 2012. | NPTEL: https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=26&lesson=31 and https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=44&lesson=46

2 | Analysis techniques | Chalk and board/PPT | James R. Evans, "Business Analytics - Methods, Models and Decisions", Pearson Ed, 2012. | https://www.youtube.com/watch?v=aEgIzibOKTs

3 | Methods of analysis | Chalk and board/PPT | Marc J. Schniederjans, Dara G. Schniederjans and Christopher M. Starkey, "Business Analytics Principles, Concepts, and Applications - What, Why, and How", Pearson Ed, 2014 | https://www.youtube.com/watch?v=gp5xQHdbwwI

Mode of Assessment Tool (Quiz/Puzzle/Assignment/Seminar etc.): Quiz


3. SAMPLING AND ANALYSIS

3.1.1 Data management plan

A data management plan (DMP) is a written document that describes the data you expect to acquire or
generate during the course of a research project, how you will manage, describe, analyze, and store those
data, and what mechanisms you will use at the end of your project to share and preserve your data.
You may have already considered some or all of these issues with regard to your research project, but
writing them down helps you formalize the process, identify weaknesses in your plan, and provide you
with a record of what you intend(ed) to do.
Data management is best addressed in the early stages of a research project, but it is never too late to
develop a data management plan.
Research Data can occur in a variety of formats that include, but are not limited to:
• notebooks
• survey responses
• software and code
• measurements from laboratory or field equipment (such as IR spectra or hygrothermograph charts)
• images (such as photographs, films, scans, or autoradiograms)
• audio recordings
• physical samples
A proper DMP (Data Management Plan) is a formal plan that outlines how a business researcher intends
to manage research data during and after a research project. It includes a wide range of tasks and
procedures, such as:
• Collecting, processing, validating, and storing data
• Integrating different types of data from disparate sources, including structured and unstructured data
• Ensuring high data availability and disaster recovery
• Governing how data is used and accessed by people and apps
• Protecting and securing data and ensuring data privacy

A DMP is a living document: business research is all about discovery, and the process of doing research
sometimes requires you to shift gears and revise your intended path. You may need to alter your DMP as
the course of your research changes. Remember, any time your research plans change, you should review
your DMP to ensure that it still meets your needs.

3.1.2 Components of Data management plan


ICPSR formulated the below components or elements of data management

Data description: A description of the information to be gathered; the nature and scale of the data that will be generated or collected.

Existing data: A survey of existing data relevant to the project and a discussion of whether and how these data will be integrated.

Format: Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats.

Metadata: A description of the metadata to be provided along with the generated data, and a discussion of the metadata standards used.

Storage and backup: Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data.

Security: A description of technical and procedural protections for information, including confidential information, and how permissions, restrictions, and embargoes will be enforced.

Responsibility: Names of the individuals responsible for data management in the research project.

Intellectual property rights: Entities or persons who will hold the intellectual property rights to the data, and how IP will be protected if necessary. Any copyright constraints (e.g., copyrighted data collection instruments) should be noted.

Access and sharing: A description of how data will be shared, including access procedures, embargo periods, technical mechanisms for dissemination and whether access will be open or granted only to specific user groups. A timeframe for data sharing and publishing should also be provided.

Audience: The potential secondary users of the data.

Selection and retention periods: A description of how data will be selected for archiving, how long the data will be held, and plans for eventual transition or termination of the data collection in the future.

Archiving and preservation: The procedures in place or envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence.

Ethics and privacy: A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise.

Budget: The costs of preparing data and documentation for archiving and how these costs will be paid. Requests for funding may be included.

Data organization: How the data will be managed during the project, with information about version control, naming conventions, etc.

Quality assurance: Procedures for ensuring data quality during the project.

Legal requirements: A listing of all relevant federal or funder requirements for data management and data sharing.

Data management is the practice of collecting, organizing, protecting, and storing an organization’s
data so it can be analyzed for business decisions. As organizations create and consume data at unprecedented
rates, data management solutions become essential for making sense of the vast quantities of data. Today’s
leading data management software ensures that reliable, up-to-date data is always used to drive decisions.
The software helps with everything from data preparation to cataloging, search, and governance, allowing
people to quickly find the information they need for analysis.

3.1.3 Types of Data Management


Data management plays several roles in an organization’s data environment, making essential functions
easier and less time-intensive. These data management techniques include the following:
• Data preparation is used to clean and transform raw data into the right shape and format for
analysis, including making corrections and combining data sets.


• Data pipelines enable the automated transfer of data from one system to another.
• ETLs (Extract, Transform, Load) are built to take the data from one system, transform it, and load
it into the organization’s data warehouse (a minimal sketch follows this list).
• Data catalogs help manage metadata to create a complete picture of the data, providing a summary
of its changes, locations, and quality while also making the data easy to find.
• Data warehouses are places to consolidate various data sources, contend with the many data types
businesses store, and provide a clear route for data analysis.
• Data governance defines standards, processes, and policies to maintain data security and integrity.
• Data architecture provides a formal approach for creating and managing data flow.
• Data security protects data from unauthorized access and corruption.
• Data modeling documents the flow of data through an application or organization.
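For instance, a bare-bones ETL step might look like the following pandas sketch. The file names and column names are hypothetical, and a real pipeline would load into a proper warehouse rather than a CSV file.

```python
# A minimal ETL sketch in pandas. File names and column names are
# hypothetical placeholders, not a real pipeline.
import pandas as pd

# Extract: read a raw export.
raw = pd.read_csv("survey_export.csv")

# Transform: standardize column names, drop incomplete rows, and coerce
# the rating column to numeric.
raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
clean = raw.dropna(subset=["respondent_id", "rating"]).copy()
clean["rating"] = pd.to_numeric(clean["rating"], errors="coerce")

# Load: write the cleaned table to the analysis store.
clean.to_csv("warehouse_ratings.csv", index=False)
```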

3.1.4 Importance of Data Management


Data management is a crucial first step to employing effective data analysis at scale, which leads to
important insights that add value to your customers and improve your bottom line. With effective data
management, people across an organization can find and access trusted data for their queries. Some
benefits of an effective data management solution include:
Visibility
Data management can increase the visibility of your organization’s data assets, making it easier for
people to quickly and confidently find the right data for their analysis. Data visibility allows your
company to be more organized and productive, allowing employees to find the data they need to better
do their jobs.
Reliability
Data management helps minimize potential errors by establishing processes and policies for usage and
building trust in the data being used to make decisions across your organization. With reliable, up-to-
date data, companies can respond more efficiently to market changes and customer needs.
Security
Data management protects your organization and its employees from data losses, thefts, and breaches
with authentication and encryption tools. Strong data security ensures that vital company information is
backed up and retrievable should the primary source become unavailable. Additionally, security
becomes more and more important if your data contains any personally identifiable information that
needs to be carefully managed to comply with consumer protection laws.
Scalability
Data management allows organizations to effectively scale data and usage occasions with repeatable
processes to keep data and metadata up to date. When processes are easy to repeat, your organization
can avoid the unnecessary costs of duplication, such as employees conducting the same research over
and over again or re-running costly queries unnecessarily.
3.1.5 Addressing Challenges of Data Management
Because data management plays a crucial role in today’s digital economy, it’s important that systems
continue to evolve to meet your organization’s data needs. Traditional data management processes make
it difficult to scale capabilities without compromising governance or security. Modern data management
software must address several challenges to ensure trusted data can be found.
1. Increased data volumes
Every department within your organization has access to diverse types of data and specific needs to
maximize its value. Traditional models require IT to prepare the data for each use case and then maintain
the databases or files. As more data accumulates, it’s easy for an organization to become unaware of
what data it has, where the data is, and how to use it.
2. New roles for analytics
As your organization increasingly relies on data-driven decision-making, more of your people are asked
to access and analyze data. When analytics falls outside a person’s skill set, understanding naming
conventions, complex data structures, and databases can be a challenge. If it takes too much time or
effort to convert the data, analysis won’t happen and the potential value of that data is diminished or
lost.
3. Compliance requirements
Constantly changing compliance requirements make it a challenge to ensure people are using the right
data. An organization needs its people to quickly understand what data they should or should not be
using—including how and what personally identifiable information (PII) is ingested, tracked, and
monitored for compliance and privacy regulations.
3.1.6 Establishing Data Management Best Practices
Data management is a critical business driver used to ensure data is acquired, validated, stored, and
protected in a standardized way. It is essential to develop and deploy the right processes so end users
are confident their data is reliable, accessible, and up to date. To make sure that your data is managed
most effectively and efficiently, here are seven best practices for your business to consider.
1. Build strong file naming and cataloging conventions
If you are going to utilize data, you have to be able to find it. You can’t measure it if you can’t manage
it. Create a reporting or file system that is user- and future-friendly—descriptive, standardized file names
that will be easy to find and file formats that allow users to search and discover data sets with long-term
access in mind.
• To list dates, a standard format is YYYY-MM-DD or YYYYMMDD.
• To list times, it is best to use either a Unix timestamp or a standardized 24-hour notation, such as HH:MM:SS, as sketched below. If your company is national or even global, users can note where the information they are looking for is from and find it by time zone.
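A short sketch of producing these standard date and time stamps with Python’s standard library (the values in the comments are merely illustrative):

    from datetime import datetime, timezone

    now = datetime.now(timezone.utc)
    print(now.strftime("%Y-%m-%d"))   # e.g. 2024-05-01  (YYYY-MM-DD)
    print(now.strftime("%H:%M:%S"))   # e.g. 13:45:07    (24-hour HH:MM:SS)
    print(int(now.timestamp()))       # Unix timestamp (seconds since 1970-01-01 UTC)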
Carefully consider metadata for data sets
Essentially, metadata is descriptive information about the data you are using. It should contain information about the data’s content, structure, and permissions so the data is discoverable for future use. If this information is not searchable and does not support discoverability, you cannot depend on being able to use your data years down the line.
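A dataset-level metadata record can be as simple as a small dictionary; the field names below are illustrative assumptions, not a formal metadata standard:

    metadata = {
        "title": "2023 customer survey responses",   # content
        "created": "2023-08-14",                     # provenance
        "structure": "CSV, one row per respondent",  # structure
        "permissions": "internal use only",          # permissions
    }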
2. Data Storage
If you ever intend to be able to access the data you are creating, storage plans are an essential piece of
your process. Find a plan that works for your business for all data backups and preservation methods. A
solution that works for a huge enterprise might not be appropriate for a small project’s needs, so think
critically about your requirements.
A variety of storage locations to consider:
• Desktops/laptops
• Networked drives
• External hard drives
• Optical storage
• Cloud storage
• Flash drives (while a simple method, remember that they do degrade over time and are easily
lost or broken)
3. The 3-2-1 methodology
A simple, commonly used storage system is the 3-2-1 methodology, which makes the following strategic recommendations:
• 3: Store three copies of your data,
• 2: using two types of storage methods,
• 1: with one of them stored offsite.
This method allows smart access and makes sure there is always a copy available in case one type or location is lost or destroyed, without being overly redundant or overly complicated.
4. Documentation
Within data management best practices, we can’t overlook documentation. It’s often smart to produce
multiple levels of documentation that will provide full context to why the data exists and how it can be
utilized.
Documentation levels:
• Project-level
• File-level
• Software used (include the version of the software so if future users are using a different version,
they can work through the differences and software issues that might occur)
• Context (it is essential to give any context to the project, why it was created, if hypotheses were
trying to be proved or disproved, etc.)
5. Commitment to data culture
A commitment to data culture means making sure that your department or company’s leadership prioritizes data experimentation and analytics. Leadership matters when strategy must be set, and when budget or time is needed to ensure that proper training is conducted and received. Additionally, having executive sponsorship as well as lateral buy-in will enable stronger data collaboration across teams in your organization.
6. Data quality, security, and privacy
Building a culture committed to data quality means committing to a secure environment with strong privacy standards. Security matters whether you are providing secure data for internal communications and strategy, or building a relationship of trust with a client by protecting the privacy of their data and information. Your management processes must be able to demonstrate that your networks are secure and that your employees understand the critical nature of data privacy. In today’s digital market, data security has been identified as one of the most significant decision-making factors when companies and consumers make their buying decisions. One data privacy breach is one too many. Plan accordingly.
7. Invest in quality data-management software
When considering these best practices together, it is recommended, if not essential, that you invest in quality data-management software. Putting all the data you create into a manageable working business tool will help you find the information you need, and then create the right data sets and data-extract schedules for your business needs. Data management software works with both internal and external data assets and helps configure your governance plan. Tools such as Tableau offer data-management add-ons that can help you create a robust analytics environment leveraging these best practices. Using reliable software that helps you build, catalog, and govern your data builds trust in the quality of your data and can lead to the adoption of self-service analytics. Use these tools and best practices to bring your data management to the next level and build your analytics culture on managed, trusted, and secure data.
3.2 Sampling and Measurement
Sampling
Sampling is the process of selecting units (e.g., people, organizations) from a population of interest
so that by studying the sample we may fairly generalize our results back to the population from which
they were chosen. Let’s begin by covering some of the key terms in sampling like “population” and
“sampling frame.” Then, because some types of sampling rely upon quantitative models, we’ll talk about
some of the statistical terms used in sampling. Finally, we’ll discuss the major distinction
between probability and nonprobability sampling methods and work through the major types in each.
Measurement
Measurement is the process of observing and recording the observations that are collected as part
of a research effort. There are two major issues that will be considered here.
First, you have to understand the fundamental ideas involved in measuring. Here we consider two major measurement concepts. In Levels of Measurement, I explain the meaning of the four major levels of measurement: nominal, ordinal, interval, and ratio. Then we move on to the reliability of measurement, including consideration of true score theory and a variety of reliability estimators.
Second, you have to understand the different types of measures that you might use in social
research. We consider four broad categories of measurements. Survey research includes the design and
implementation of interviews and questionnaires. Scaling involves consideration of the major methods
of developing and implementing a scale. Qualitative research provides an overview of the broad range
of non-numerical measurement approaches. And the section on unobtrusive measures presents a variety of measurement methods that don’t intrude on or interfere with the context of the research.
3.2.1 Characteristics of Measurement
1. Validity
a. A valid measurement is a quantity or dimension that corresponds to the measured variable
b. There are standard measurements (procedures and expressions) for common variables, but where variables must be operationally defined, or surrogate variables are used because they are more easily measured, the validity of the measurements has to be ensured
2. Accuracy
a. Closeness of measurements to an expected or true value
b. Accuracy is inversely proportional to error (i.e. high accuracy corresponds to low error)
c. Types of error:
i. Gross: blunders caused by carelessness or instrument failure
ii. Systematic: consistent overestimation or underestimation of the target value; usually caused by poor calibration of an instrument or a poor measurement procedure; often small enough to go undetected but results in a poor inference of the target value
iii. Random: human error randomly (normally) distributed with respect to the mean
observation
3. Precision
a. The closeness of repeated measurements to one another
4. Sample
A subset of all the measurements that could be derived from a very large or infinite
population, where the population is defined by one or more common characteristics (e.g. a certain
class of people, slopes in limestone)
• Sample and population refer to the items or to the corresponding sets of measurements
• The purpose of sampling is to
o Gain an impression of an area or collection of things
o To estimate a population parameter
o To test hypotheses: unproven theories or suppositions which are the basis for further investigation
• Advantages of sampling
o The only means of obtaining data about an infinite population (e.g. air temperatures)
o A cost- and time-effective means of obtaining data about a large finite population; better data than hastily collected data for the entire population
o Desirable when measurement is destructive or stressful (e.g. plant sampling, some measurements on people)
3.2.2 Sampling Error
• The difference between a sample estimate and the corresponding population parameter
• Not usually quantifiable since sampling normally is done because the population parameter cannot
be known, however, it can be predicted from statistical theory
• Depends on measurement error and the representativeness of a sample, which in turn depends on
Sample size
o There is a diminishing decrease in sampling error with increasing sample size, that is, the largest
decrease in error occurs with an increase in the size of a small sample
o A minimum sample size is three, since in a sample of two, a bad measurement cannot be
distinguished from the good one
o 30 observations or less is generally considered a small sample
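The diminishing decrease in sampling error described above can be seen in a short simulation; this sketch assumes the numpy library, and the population values are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    population = rng.normal(loc=50, scale=10, size=100_000)

    for n in [5, 30, 100, 1000]:
        # the spread of 2,000 sample means is a picture of the sampling error
        estimates = [rng.choice(population, size=n).mean() for _ in range(2000)]
        print(n, round(float(np.std(estimates)), 2))

The printed spread shrinks sharply between n = 5 and n = 30, and only slowly thereafter.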
The sampling frame
o The means by which the sampled population is identified from the target population
o May be spatial (e.g. a quadrat or transect) or non-spatial (e.g. a telephone book, voter list, or a street corner for the sampling of people)
o Is poor if it causes bias towards the sampling of certain individuals (e.g. overrepresentation of housewives and the unemployed when sampling shoppers on a weekday; overrepresentation of weak soil or rock exposed in stream cuts)
The sampling procedure
o Methods of inferential statistics assume random sampling, that is, that there is an equal probability of choosing every item in the sampled population and every possible sample; these conditions are satisfied only by independent random sampling (i.e. with replacement of measured items), although sampling without replacement is not much different if the target population is very large, since discarding an item does not significantly increase the probability of selecting the remaining items
o However, the distribution of measurements is often arbitrary (e.g. climate stations), because collecting a random sample would be much more difficult
o The random locations (coordinates) of items in a list or in an area are identified from a string of randomly generated numbers; these locations may be clustered with respect to one another
o With systematic sampling, the sampled items are selected at regular intervals and thus have a uniform distribution; however, the sample can be regarded as random only if the population has a random distribution, and if the sampling interval corresponds to some periodic feature in the sampled population, then the sample will be biased
o Stratified sampling, dividing the target population into sub-populations (strata) according to one or more criteria, is done to
▪ Enable comparisons between two or more strata
▪ Obtain a sample that is more evenly distributed among strata (e.g. Canada is often stratified into five regions, because sampling of socioeconomic phenomena will otherwise be biased towards central Canada)
▪ Obtain sub-samples that are more representative of individual strata (i.e. have less variability and sampling error) than an unstratified sample would be of the total area
▪ Gain experimental control over independent variables; for example, if an area is stratified by slope gradient and elevation, then any remaining topographic variation in vegetation cover can be attributed to variations in aspect
o Sampling can be stratified random or stratified systematic
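The sampling procedures just described can be sketched in a few lines of Python using the pandas library; the data frame and its region and income columns are invented for illustration:

    import pandas as pd

    population = pd.DataFrame({
        "region": ["North", "South", "East"] * 100,   # the stratifying criterion
        "income": range(300),
    })

    # Simple random sample of 30 items
    random_sample = population.sample(n=30, random_state=42)

    # Systematic sample: every 10th item after a random start
    systematic_sample = population.iloc[3::10]

    # Stratified random sample: 10 items from each region (stratum)
    stratified_sample = (
        population.groupby("region", group_keys=False)
        .apply(lambda g: g.sample(n=10, random_state=42))
    )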
According to G.C. Helmstadter:
"Measurement is a process of obtaining a numerical description of the extent to which a person or object possesses some characteristics."

According to Kerlinger:
"Measurement is the assignment of numerals to objects or events according to rules."

The major application of such data is in the area of marketing, where measurements are taken regarding the predispositions or attitudes of a company's current and potential customers. By knowing the attitudes of customers, marketing managers can take important decisions which are effective and beneficial to the company. Areas of marketing where measurement techniques are used include product positioning, market segmentation, advertising message effectiveness, etc.
Three categories of things can be measured, as per Abraham Kaplan:
1) Direct Observables :
The things which can directly be observed are called direct observables. For example, by meeting an
individual the brand of his/her wrist watch can be directly observed.
2) Indirect Observables :
The things which cannot be directly observed are called 'indirect observables'. More complex and refined observation efforts are required for observing such things. For example, the minutes of earlier board meetings of corporations can be used to observe past business decisions.
3) Constructs :
The things which cannot be observed directly or indirectly are called 'constructs'. These are theoretical concepts, which are developed by observing different aspects of an operation. For example, IQ is a construct: it cannot be directly or indirectly observed, and is determined only by mathematically scoring the answers to the different questions asked in an IQ test.
3.2.3 Steps in the Measurement Process
1) Developing Behavioral Categories :
The measurement process starts with the behavioral categorization of the different events to be measured. It is crucial to categorize events carefully so as to make the measurement process simple and appropriate.
2) Selection of an Appropriate Calculation Method :
After successfully categorizing the events, the next step in the measurement process is selecting an appropriate calculation method for the different behavioral categories. The different calculation methods are as follows:
a) Frequency Method :
In this method, frequency counts are used for calculations. The number of occurrences of a particular event in a definite time period is called its frequency count. Behaviors or events which occur a number of times within a definite time period, occur for short durations, or have a sharp beginning or ending are calculated with the help of the frequency method.
b) Duration Method :
In the duration method, the length of time an individual is involved in a particular behavior or activity within a fixed time limit is calculated.
c) Interval Method :
In the interval method, the whole observation time limit is divided into different intervals and these intervals are checked for a specific behavior or activity.
3) Using Multiple Observers :
The final step in the measurement process is using multiple observers so as to measure inter-rater reliability. The observations used in the measurement process are as follows:
a) Naturalistic Observation :
Naturalistic observation involves observing in a natural or real environment. Here, the actual behavior of the respondents is observed and recorded, free from manipulation.
b) Participant Observation :
In this type of observation, the researcher joins the group of participants as an individual participant and observes their behavior from within.
c) Contrived Observation :
When a simulated environment is created to observe the natural behavior of the respondents, it is called contrived or structured observation. This type of observation eliminates the need for observing respondents in natural settings.
3.2.4 Functions of Measurement
1) Allows Information to be Summarized :
The framing of proper measures allows information to be summarized and presented in a better way. It also allows researchers to use various graphs, tables, and charts to represent the data properly, making the research and its findings more presentable and attractive to any potential user of the research report.
2) Provides Better Understanding of a Situation :
Measurement allows a better understanding of a situation compared to a scenario where there is no measurement at all. For example, data about a population is obtained only when it is measured; until the data is measured, it does not provide an in-depth understanding of the situation.
3) Allows Quantification of Data and Statistical Sophistication :
The process of measurement allows the researcher to quantify abstract variables and research parameters. The degree of statistical treatment of the data depends upon the measurement scale adopted to quantify the data.
4) Important for the Research Approach :
The selection of measurement techniques also determines the research approach and the way a researcher will tend to solve the research problems. Deciding the measures is thus an essential part of the research activity, and the selection of proper measures goes a long way towards making the research a better planned and organised activity.
5) Provides an Important Set of Tools :
The measurement procedures and instruments used provide invaluable information to the researcher, allowing him to reach a decision regarding the research problem. They also have a bearing on policies and programs. However, the measures that are framed are only the means towards the researcher's objective, not the ends; they help the researcher reach critical decisions regarding a research objective.
3.2.5 Types of Measurement
There are four possible measurement techniques, which are as follows:
1) Questionnaires :
The questionnaire is an inventory of questions used to seek information from respondents on different topics like behavior, demographic and psychographic details, opinions, attitudes, beliefs, feelings, etc. The questions are designed for a particular study and are validated before the study is conducted.
2) Attitude Scales :
Attitude scales seek responses on the feelings of respondents towards a particular object. Attitude scales can be of different types, as follows:
i) Rating scales require a respondent to place an object on a numerically labeled scale.
ii) Ranking scales require the respondents to compare a set of objects and rank them from '1' to '10', where '1' stands for the highest position and '10' for the lowest.
3) Depth Interviews :
In depth interviews, the respondents have complete freedom to express their feelings without any fear of rejection or opposition from others. The responses received from the respondents are recorded in specially designed formats. This technique is used when the researcher wants to gather in-depth information about the feelings and opinions of respondents, or when the researcher wants to examine some new issue or aspect of the study.
Many times, depth interviews are also used to provide clarity or perspective on other gathered data, helping to build a more comprehensive picture of the data that has been collected. Depth interviews should be used in lieu of focus groups where it is felt that the respondents will not be comfortable talking about the topic in a group atmosphere, or where the researcher wants to differentiate between individual opinions and group opinions on a topic of discussion. Depth interviews are also used where the researcher wants to refine questions for a future study or survey.
4) Observation :
Observation is a direct technique of examining behavior or the results of behavior. It requires the researcher to observe the behavior of an individual or a group of people, in a natural setting and over an interval of time. The biggest advantage of this method is that it increases the credibility of the research process: it utilizes trained researchers who are unbiased regarding the research topic, and by observing behavior formally, the observers are often able to identify attitudes and predispositions that are overlooked by other researchers. The disadvantage is that observation is a time-consuming process, and observers often find that their presence influences the behavior of the people being observed, which affects the reliability of the observation process.
3.2.6 Criteria for Good Measurement
Seven important criteria are used for evaluating measurement, which are as follows:
1) Reliability :
Reliability is an important criterion for testing measurement. When the results offered by a measuring instrument are consistent, it is called reliable. Although a reliable instrument is not necessarily a valid instrument, reliability is a precondition for the validity of the measurement.
2) Validity :
The next criterion used for evaluating measurement is validity. The extent
to which a particular measuring instrument measures what it is intended to measure is called its validity. It can also be thought of as utility. It also expresses the extent to which the differences a measuring instrument describes between two behaviors are true differences.
3) Practicality :
Practicality is also a criterion for testing the measuring instrument. The extent to which a particular measuring instrument is suitable, cost-effective, and interpretable denotes the practicality of the instrument.
4) Sensitivity :
The next criterion for evaluating a measurement instrument is its sensitivity. A particular measuring instrument is said to be sensitive if all the variations in responses are effectively measured by it. Measuring instruments dealing only with 'Agree' or 'Disagree' types of responses are not very sensitive; a little modification is required in such instruments so as to record more sensitive responses.
5) Generalisability :
Generalisability is also an important criterion for testing the measuring instrument. An instrument's ability to collect data from a wide range of respondents, along with offering flexibility in its interpretation, is called generalisability.
6) Economy :
The choice of data collection method is also often dictated by economic factors. The rising cost of personal
interviewing first led to an increased use of telephone surveys and subsequently to the current rise in
Internet surveys. In standardized tests, the cost of test materials alone can be such a significant expense
that it encourages multiple reuses.
7) Convenience :
A measuring device passes the convenience test if it is easy to administer. A questionnaire or a
measurement scale with a set of detailed but clear instructions, with examples, is easier to complete
correctly than one that lacks these features. In a well-prepared study, it is not uncommon for the
interviewer instructions to be several times longer than the interview questions. Naturally, the more
complex the concepts and constructs, the greater is the need for clear and complete instructions.
3.2.7 Difficulties in Measurement

Measurement has the following difficulties:
1) Irrelevant Data :
Measurement leads to the generation of enormous amounts of data. However, the data is not always relevant; it may lack purpose at times. Sometimes, measurement tempts marketers to manipulate the real data for their own purposes.
2) Inaccurate Response :
Respondents have a tendency to give inaccurate responses in face-to-face interviews, so it is very important that the research activity elicits correct responses from respondents. Nowadays, web-based surveys have made it possible to reach large target segments quickly and economically.
3) Training in Measurement is Rare :
Measurement requires that people have the necessary skills and knowledge in a particular field. However, very few organisations invest in knowledge and skill building.
4) Delegating Measurement Strategy :
Deciding the right metrics often requires that the incumbents not only have a big-picture perspective but also the power to challenge the dominant marketing mind-sets of the organisation. This is often not possible for middle managers; it requires the involvement of top management. Measurement should not be delegated, as the quest for truth will then take a backseat in the organisation. It needs leadership and focus so that a congenial environment is created in the organisation.
Preciseness and the practical use of research work are the main concerns for several researchers, who care about the contribution their research makes to the concerned field.
3.2.8 Scale of Measurement
Scales of measurement in research and statistics are the different ways in which variables are defined and grouped into different categories. Sometimes called the level of measurement, a scale describes the nature of the values assigned to the variables in a data set.
The term scale of measurement is derived from two keywords in statistics, namely measurement and scale. Measurement is the process of recording observations collected as part of the research. Scaling, on the other hand, is the assignment of objects to numbers or semantics. Merged together, the two terms refer to the relationship between the assigned numbers and the recorded observations.
A measurement scale is used to qualify or quantify data variables in statistics. It determines the kind
of techniques to be used for statistical analysis.
There are different kinds of measurement scales, and the type of data being collected determines the kind of measurement scale to be used for statistical measurement. These measurement scales are four in number, namely the nominal scale, ordinal scale, interval scale, and ratio scale. The measurement scales are used to measure both qualitative and quantitative data, with the nominal and ordinal scales used to measure qualitative data and the interval and ratio scales used to measure quantitative data.
3.2.9 Levels of Data Measurement
The level of measurement of a given data set is determined by the relationship between the values assigned to the attributes of a data variable. For example, the relationship between the values (1 and 2) assigned to the attributes (male and female) of the variable (Gender) is "identity". This is a nominal scale example.

By knowing the different levels of data measurement, researchers are able to choose the best method for statistical analysis. The different levels of data measurement are the nominal, ordinal, interval, and ratio scales.
Nominal Scale
The nominal scale is a scale of measurement that is used for identification purposes. It is the crudest and weakest level of data measurement among the four.
Sometimes known as a categorical scale, it assigns numbers to attributes for easy identification. These numbers, however, carry no quantitative meaning and only act as labels.
The only statistical analysis that can be performed on a nominal scale is the percentage or frequency count.
It can be analyzed graphically using a bar chart and pie chart.
Nominal Scale Example
In the example below, the measurement of the popularity of a political party is measured on a nominal
scale.
Which political party are you affiliated with?
• Independent
• Republican
• Democrat
Labeling Independent as “1”, Republican as “2” and Democrat as “3” does not in any way mean any of
the attributes are better than the other. They are just used as an identity for easy data analysis.
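Since only frequency and percentage counts are valid at this level, a nominal analysis is short; a sketch using Python's standard library, with invented responses:

    from collections import Counter

    responses = ["Independent", "Republican", "Democrat",
                 "Republican", "Democrat", "Democrat"]
    for party, n in Counter(responses).items():
        print(party, n, f"{100 * n / len(responses):.0f}%")

Computing a "mean party" here would be meaningless, which is exactly what the nominal level implies.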
Ordinal Scale
Ordinal Scale involves the ranking or ordering of the attributes depending on the variable being scaled.
The items in this scale are classified according to the degree of occurrence of the variable in question.
The attributes on an ordinal scale are usually arranged in ascending or descending order. It measures the
degree of occurrence of the variable.
Ordinal scale can be used in market research, advertising, and customer satisfaction surveys. It uses
qualifiers like very, highly, more, less, etc. to depict a degree.
We can perform statistical analysis like median and mode using the ordinal scale, but not mean. However,
there are other statistical alternatives to mean that can be measured using the ordinal scale.
Ordinal Scale Example
For example: A software company may need to ask its users:
How would you rate our app?
• Excellent
• Very Good
• Good
• Bad
• Poor
The attributes in this example are listed in descending order.
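Because ordinal labels are ordered, the median (but not the mean) is meaningful; a minimal sketch, mapping the labels above to ranks:

    import statistics

    order = {"Poor": 1, "Bad": 2, "Good": 3, "Very Good": 4, "Excellent": 5}
    ratings = ["Good", "Excellent", "Very Good", "Good", "Bad"]
    median_rank = statistics.median(sorted(order[r] for r in ratings))
    print([label for label, rank in order.items() if rank == median_rank])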
Interval Scale
The interval scale of data measurement is a scale in which the levels are ordered and numerically equal distances on the scale represent equal differences in the variable being measured. It is an extension of the ordinal scale, with the main difference being the existence of equal intervals.
With an interval scale, you not only know that a given attribute A is bigger than another attribute B, but also the extent to which A is larger than B. Also, unlike the ordinal and nominal scales, arithmetic operations can be performed on an interval scale.
Figure: a 5-minute interval time scale.
It is used in various sectors like in education, medicine, engineering, etc. Some of these uses include
calculating a student’s CGPA, measuring a patient’s temperature, etc.
Interval Scale Example
A common example is measuring temperature on the Fahrenheit scale. It can be used in calculating mean,
median, mode, range, and standard deviation.
Ratio Scale
Ratio Scale is the peak level of data measurement. It is an extension of the interval scale, satisfying all four characteristics of a measurement scale: identity, magnitude, equal intervals, and the absolute zero property.
This level of data measurement allows the researcher to compare both the differences and the relative
magnitude of numbers. Some examples of ratio scales include length, weight, time, etc.
With respect to market research, the common ratio scale examples are price, number of customers,
competitors, etc. It is extensively used in marketing, advertising, and business sales.
The ratio scale of data measurement is compatible with all statistical analysis methods like the measures
of central tendency (mean, median, mode, etc.) and measures of dispersion (range, standard deviation,
etc.).
Ratio Scale Example
For example: a survey that collects the weights of the respondents might ask:
Which of the following weight categories do you fall into?
• More than 100 kgs
• 81 – 100 kgs
• 61 – 80 kgs
• 40 – 60 kgs
• Less than 40 kgs
Comparative Scales
In comparative scaling, respondents are asked to make a comparison between one object and the other.
When used in market research, customers are asked to evaluate one product in direct comparison to the
others. Comparative scales can be further divided into pair comparison, rank order, constant sum, and q-
sort scales.
Paired Comparison Scale
The paired comparison scale is a scaling technique that presents respondents with two objects at a time and asks them to choose one according to a predefined criterion. Product researchers use it in comparative product research by asking customers to choose the one they prefer between two closely related products.
For example, suppose there are 3 new features in the last release of a software product, but the company is planning to remove 1 of these features in the next release. The product researchers therefore perform a comparative analysis of the most and least preferred features.
• Which feature do you prefer in each of the following pairs?
• Filter – Voice recorder
• Filter – Video recorder
• Voice recorder – Video recorder
Rank Order Scale
In the rank order scaling technique, respondents are simultaneously provided with multiple options and asked to rank them in order of priority based on a predefined criterion. It is mostly used in marketing to measure preference for a brand, product, or feature.
When used in competitive analysis, the respondent may be asked to rank a group of brands in terms of personal preference, product quality, customer service, etc. The results of this data collection are usually analyzed with conjoint analysis, as the method forces customers to discriminate among options.
The rank order scale is a type of ordinal scale because it orders the attributes from the most preferred to the least preferred but does not have a specific distance between the attributes.
For example:
Rank the following brands from the most preferred to the least preferred.
• Coca-Cola
• Pepsi Cola
• Dr Pepper
• Mountain Dew
Constant Sum Scale
Constant Sum scale is a type of measurement scale where the respondents are asked to allocate a constant
sum of units such as points, dollars, chips or chits among the stimulus objects according to some specified
criterion. The constant sum scale assigns a fixed number of units to each attribute, reflecting the
importance a respondent attaches to it.
This type of measurement scale can be used to determine what influences a customer's decision when choosing which product to buy. For example, you may wish to determine how important price, size, fragrance, and packaging are to a customer when choosing which brand of perfume to buy.
Some of the major setbacks of this technique are that respondents may be confused and end up allocating
more or fewer points than those specified. The researchers are left to deal with a group of data that is not
uniform and may be difficult to analyze.
This can be avoided with a logic or validation feature in the survey tool, which restricts the respondent from allocating more or fewer points than specified.
Q-Sort Scale
The Q-sort scale is a type of measurement scale that uses a rank order scaling technique to sort similar objects with respect to some criterion. The respondents sort a number of statements or attitudes into piles, usually 11 of them.
The Q-Sort Scaling helps in assigning ranks to different objects within the same group, and the differences
among the groups (piles) are clearly visible. It is a fast way of facilitating discrimination among a relatively
large set of attributes.
For example, a new restaurant that is just preparing its menu may want to collect some information about
what potential customers like:
The document provided contains a list of 50 meals. Please choose 10 meals you like, 30 meals you are
neutral about (neither like nor dislike) and 10 meals you dislike.
Non-Comparative Scales
In non-comparative scaling, customers are asked to evaluate only a single object. This evaluation is totally independent of the other objects under investigation. Sometimes called a monadic or metric scale, the non-comparative scale can be further divided into continuous and itemized rating scales.
Continuous Rating Scale
In a continuous rating scale, respondents are asked to rate the objects by placing a mark at the appropriate point on a line running from one extreme of the criterion to the other. Also called the graphic rating scale, it gives the respondent the freedom to place the mark anywhere based on personal preference.
Once the ratings are obtained, the researcher splits the line into several categories and then assigns scores depending on the category in which each rating falls. The rating can be visualized in both horizontal and vertical form.
Although easy to construct, the continuous rating scale has some major setbacks, giving it limited usage
in market research.
Itemized Rating Scale
The itemized rating scale is a type of ordinal scale that assigns numbers to each attribute. Respondents are usually asked to select the attribute that best describes their feelings regarding a predefined criterion. The itemized rating scale is further divided into the Likert scale, the Stapel scale, and the semantic differential scale.
Likert Scale
A Likert scale is an ordinal scale with five response categories, used to order responses to a list of attributes from the most to the least favorable. This scale uses adverbs of degree like very strongly, highly, etc. to indicate the different levels.
Stapel Scale
This is a scale with 10 categories, usually ranging from -5 to +5 with no zero point. It is a vertical scale with 3 columns, where the attributes are placed in the middle column and the lowest (-5) and highest (+5) values appear in the 1st and 3rd columns respectively.
Semantic Differential Scale
This is a seven-point rating scale with endpoints associated with bipolar labels (e.g. good/bad, happy/sad). It can be used in marketing, advertising, and different stages of product development. If more than one item is being investigated, the results can be visualized in a table with more than 3 columns.
Using quantitative and qualitative data in statistics
Once data scientists have a conclusive data set from their sample, they can start to use the information to
draw descriptions and conclusions. To do this, they can use both descriptive and inferential statistics.
Descriptive statistics
Descriptive statistics help demonstrate, represent, analyse and summarise the findings contained in a
sample. They present data in an easy-to-understand and presentable form, such as a table or graph. Without
description, the data would be in its raw form with no explanation.
Frequency counts
One way data scientists can describe statistics is using frequency counts, or frequency statistics, which
describe the number of times a variable exists in a data set. For example, the number of people with blue
eyes or the number of people with a driver’s license in the sample can be counted by frequency. Other
examples include qualifications of education, such as high school diploma, a university degree or
doctorate, and categories of marital status, such as single, married or divorced.
Frequency data is a form of discrete data, as the values cannot be broken down into parts. For continuous data points, such as age, data scientists can use central tendency statistics instead: they find the mean, or average, of the data points. Using the age example, this can tell them the average age of participants in the sample.
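Both ideas fit in a few lines of Python; the sample values are invented for illustration:

    from collections import Counter
    import statistics

    eye_colors = ["blue", "brown", "brown", "green", "blue", "brown"]
    print(Counter(eye_colors))      # frequency count of a discrete variable

    ages = [23, 35, 41, 29, 52, 35]
    print(statistics.mean(ages))    # central tendency of a continuous variable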
While data scientists can draw summaries from the use of descriptive statistics and present them in an
understandable form, they can’t necessarily draw conclusions. That’s where inferential statistics come in.
Inferential statistics
Inferential statistics are used to develop a hypothesis from the data set. It would be impossible to get data
from an entire population, so data scientists can use inferential statistics to extrapolate their results. Using
these statistics, they can make generalisations and predictions about a wider sample group, even if they
haven’t surveyed them all.
An example of using inferential statistics is in an election. Even before the entire country has voted, data
scientists can use these kinds of statistics to make assumptions regarding who might win based on a
smaller sample size.
3.3 Introducing Database Applications
A database is a searchable collection of information. A research database is where you find journal, magazine, and newspaper articles. Each database contains thousands of articles published in many different journals, allowing you to find relevant articles faster than you would by searching individual journals.
Some databases are full text, where they provide the complete text of works such as articles or books. Other databases will only provide abstracts, or summaries, of articles or books.
Searching a library database is different from searching the Internet.
Internet vs. Database

Examples
  Internet: Google, Wikipedia
  Database: Academic Search Premier, JSTOR, ScienceDirect

Authority/Credentials
  Internet: Anyone can publish, and anyone does. Credentials are difficult to verify, and results are not always scholarly.
  Database: Authority and credentials are guaranteed; most articles are scholarly and peer-reviewed.

Results
  Internet: Thousands of results. Duplicates are not filtered out, and many results are not scholarly.
  Database: Hundreds of results or fewer. Duplicates are filtered out, and you can limit to full text.

Relevance
  Internet: Lots of "noise" because no subject headings are assigned; information can be biased, untrue, or irrelevant.
  Database: Databases focus on specific subjects and offer fewer but more relevant results from scholarly publishers and authors.

Limiters
  Internet: Can limit by document type (pdf, doc) and source (gov, org, com).
  Database: Can limit by date, document type, language, format, peer-reviewed status, full-text availability, and more.

Stability of information
  Internet: Information from the Internet is unstable and can disappear at any time. Researchers will often be asked to pay a fee to access journal articles. (Note: these articles are available to you via the Library as part of your tuition.)
  Database: Databases are collections of articles that have appeared in journals, which makes them more stable than the Internet. The information is paid for by subscription and offered as part of a student's tuition.
Popular Database Applications
Oracle
Oracle is the most widely used commercial relational database management system, built in languages such as C, C++, and Java. The database's most recent version, 21c, contains a slew of new features.
Oracle is the database management system that stands above the others: overall, it is the most extensively used RDBMS. It takes up less space, processes data faster, and includes several useful new features such as JSON from SQL.
MySQL
MySQL is one of the most popular databases to use in 2022 in the computer world, especially in web
application development. The main focus of this database is on stability, robustness, and maturity. The
most popular application of this database is for web development solutions.
MySQL is written in C and C++ and uses structured query language. MySQL 8.0 is the most recent version of this database, and it includes a better recovery option. MySQL comes in a variety of editions, each with its own set of features.
MS SQL Server
Microsoft provides a great toolset for this database, both on-premises and in the cloud, and it works well on both Linux and Windows systems. MS SQL is a multi-model database that supports structured data (SQL), semi-structured data (JSON), and spatial data.
It is not as inventive or advanced as some other modern databases, but it has undergone considerable improvements and overhauls over the years.
PostgreSQL
POSTGRES was the initial name of the database; its creator, Michael Stonebraker, was honored with the Turing Award for his contributions to database systems.
PostgreSQL is a database management system written in C and used by businesses that deal with huge amounts of data. This database management software is used by several gaming apps, database automation tools, and domain registrars.
MongoDB
MongoDB, first released in 2009, was the first widely adopted document database and is among the most popular NoSQL databases to use in 2022. Loading and accessing data in an RDBMS from object-oriented programming languages is challenging and requires additional application-level mapping; Mongo was developed to overcome this problem by handling document data directly.
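A minimal sketch of the document model, assuming a local MongoDB server and the pymongo driver (the database, collection, and field names are invented for illustration):

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    db = client["research"]
    db.respondents.insert_one({          # store the object as-is, no mapping layer
        "name": "A. Kumar",
        "answers": {"q1": "Agree", "q2": 4},
    })
    print(db.respondents.count_documents({}))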
IBM DB2
IBM also offers DB2 LUW for Windows, Linux, and Unix. DB2 11.5 is the most recent release, and it speeds up query execution.
DB2 primarily supports the relational model, but it has grown significantly in recent years: it now supports object-relational features and non-relational formats such as JSON and XML.
Redis
Redis is a popular open-source database project. According to Stack Overflow's Annual Developer Survey, Redis has been ranked the most loved database platform. It is a distributed, in-memory key-value database, and it can also be used as a distributed cache and message broker, with optional durability.
Elasticsearch
Elasticsearch is an open-core full-text search engine based on Lucene, first released in 2010 by Shay Banon. It is a distributed, multi-tenant full-text search engine with a REST API.
It provides horizontal scaling via automatic sharding and the REST API. It also supports structured and schema-less data (JSON), which makes it especially well suited to analyzing logging or monitoring data.
Cassandra
Cassandra is an open-core, distributed, wide-column store, first released in 2008 and commonly used as an application database. It is highly scalable database management software, widely used in industry to handle massive amounts of data.
One of its main features is its decentralized (leaderless) design with automatic replication and multi-data-center replication, making it fault-tolerant with no single point of failure. Cassandra and HBase are related wide-column stores with different use cases according to their designs.
MariaDB
It is a Relational Database Management System which is compatible with MySQL Protocol and Clients.
The MySQL server can be easily replaced with MariaDB without requiring any code changes.
This management system provides columnar storage with massively parallel distributed data architecture.
In comparison to MySQL, MariaDB is more community-driven.
OrientDB
OrientDB is an open-source NoSQL multi-model database that enables businesses to leverage the capabilities of a graph database without having to build several systems to handle different data types.
It is a management solution supporting graph, document, key-value, and object-oriented database models, improving performance and security while also allowing for scalability.
SQLite
SQLite is an open-source relational database management system embedded in the application that uses it. It was created in the year 2000, requires no configuration, and does not even need a server or installation. Despite its simplicity, it contains many commonly used database management functionalities and is widely used in mobile development, for example with frameworks like React Native.
DynamoDB
DynamoDB is a nonrelational database from Amazon. It is a serverless database well suited to mobile apps; it scales up and down automatically while also backing up your data. It features built-in security and in-memory caching, as well as consistent latency.
Neo4j
Neo4j is an open-source, Java-based NoSQL graph database that was launched in 2007. It uses a query language known as Cypher, described on its site as the most efficient and expressive way to describe relationship queries.
In this database management system, your data is saved as graphs rather than tables. Neo4j's relationship handling is fast, and it allows you to create and use additional relationships later to "shortcut" and speed up queries over the domain data as the need arises.
Firebirdsql
Firebird is a free SQL relational database management system that operates on Mac OS X, Linux, Microsoft Windows, and a variety of Unix platforms.
It is a mature, multi-platform RDBMS well suited to web applications, and the project offers a variety of funding options, from Firebird memberships to sponsorship commitments.
3.3.1 Analysis Techniques
Qualitative and quantitative analysis are two fundamental methods of collecting and interpreting data in research. The methods can be used independently or concurrently, since they share the same objectives. Each has limitations, so using them concurrently can compensate for the weaknesses of each and produce better-quality results.
Quantitative analysis
Quantitative analysis is often associated with numerical analysis, where data is collected, classified, and then computed for certain findings using a set of statistical methods. Data is chosen randomly in large samples and then analyzed. The advantage of quantitative analysis is that the findings can be generalized to a wider population using research patterns developed in the sample; this is a shortcoming of qualitative data analysis, which offers limited generalization of findings.
Quantitative analysis is more objective in nature. It seeks to understand the occurrence of events and then describe them using statistical methods. However, more clarity can be obtained by using qualitative and quantitative methods concurrently: quantitative analysis normally leaves out the random and rare events in research results, whereas qualitative analysis considers them.
Quantitative analysis is generally concerned with measurable quantities such as weight, length,
temperature, speed, width, and many more. The data can be expressed in a tabular form or any
diagrammatic representation using graphs or charts. Quantitative data can be classified as continuous or
discrete, and it is often obtained using surveys, observations, experiments or interviews.
There are, however, limitations to quantitative analysis. For instance, it can be challenging to uncover relatively new concepts using quantitative analysis, and that is where qualitative analysis comes into the equation to find out "why" a certain phenomenon occurs. That is why the two methods are often used simultaneously.
Qualitative analysis
Qualitative analysis is concerned with the analysis of data that cannot be quantified. This type of data is
about the understanding and insights into the properties and attributes of objects (participants). Qualitative
analysis can get a deeper understanding of “why” a certain phenomenon occurs. The analysis can be used
in conjunction with quantitative analysis or precede it.
Unlike with quantitative analysis that is restricted by certain classification rules or numbers, qualitative
data analysis can be wide ranged and multi-faceted. And it is subjective, descriptive, non-statistical and
exploratory in nature.

Because qualitative analysis seeks a deeper understanding, the researcher must be well versed in whichever physical properties or attributes the study is based on. Oftentimes, the researcher has a relationship with the participants in which their characteristics are disclosed, whereas in a quantitative analysis the characteristics of objects are often undisclosed. The typical data analyzed qualitatively include color,
gender, nationality, taste, appearance, and many more as long as the data cannot be computed. Such data
is obtained using interviews or observations.
There are limitations to qualitative analysis. For instance, it cannot be used to generalize to the population: small samples are used in an unstructured approach, and they are non-representative of the general population, so the method cannot be used to generalize to the entire population. That is where quantitative analysis comes into the picture.
3.3.2 Data Collection for Qualitative and Quantitative Analysis
Preparing data for quantitative data analysis simply means converting it to meaningful and readable formats; the steps to achieve this are below:
Data Validation: This is to evaluate if the data was collected correctly through the required channels
and to ascertain if it meets the set-out standards stated from the onset. This can be done by checking
if the procedure was followed, making sure that the respondents were chosen based on the research
criteria, and checking for completeness in the data.
Data Editing: Large datasets may include errors where fields may be filled incorrectly or left empty
accidentally. To avoid having a faulty analysis, data checks should be done to identify and clear out
anything that may lead to an inaccurate result.
Data Coding: This involves grouping and assigning values to data. It might mean forming tables and
structures to represent the data accurately.
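A minimal sketch of these three steps with the pandas library; the file name, column names, and category codes are illustrative assumptions:

    import pandas as pd

    df = pd.read_csv("survey_responses.csv")

    # Validation: keep only respondents who meet the research criteria
    df = df[df["age"].between(18, 99)]

    # Editing: drop incomplete records that would distort the analysis
    df = df.dropna(subset=["gender", "satisfaction"])

    # Coding: assign numeric values to grouped categories
    df["satisfaction_code"] = df["satisfaction"].map({"Low": 1, "Medium": 2, "High": 3})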
Now that you are familiar with what quantitative data analysis is and how to prepare your data for analysis, the focus will shift to the methods and techniques of quantitative data analysis.
3.3.2.1 Methods and Techniques of Quantitative Data Analysis
Quantitative data analysis involves the use of computational and statistical methods that focus on the statistical, mathematical, or numerical analysis of datasets. It starts with a descriptive statistical phase and is followed up with closer analysis if needed to derive more insight, such as correlation analysis or the production of classifications based on the descriptive statistics.
As can be deduced from the statement above, there are two main, commonly used quantitative data analysis methods: descriptive statistics, used to explain certain phenomena, and inferential statistics, used to make predictions. The two methods are used in different ways, with techniques unique to each; both are explained below.
• Descriptive Statistics
• Inferential Statistics
1) Descriptive Statistics
Descriptive statistics, as the name implies, are used to describe a dataset. They help you understand the details of your data by summarizing it and finding patterns in the specific data sample. They provide absolute numbers obtained from a sample but do not necessarily explain the rationale behind those numbers, and they are mostly used for analyzing single variables. The methods used in descriptive statistics include the following (a short sketch follows this list):
• Mean: This is used to calculate the numerical average of a set of values.
• Median: This is used to get the midpoint of a set of values when the numbers are arranged in
numerical order.
• Mode: This is used to find the most commonly occurring value in a dataset.
• Percentage: This is used to express how a value or group of respondents within the data relates to
a larger group of respondents.
• Frequency: This indicates the number of times a value is found.
• Range: This is the difference between the highest and lowest values in a set of values.
• Standard Deviation: This is used to indicate how dispersed a range of numbers is, meaning, it
shows how close all the numbers are to the mean.
• Skewness: It indicates how symmetrical a range of numbers is, showing if they cluster into a
smooth bell curve shape in the middle of the graph or if they skew towards the left or right.
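
Most of these measures can be computed directly; below is a minimal sketch using Python's standard
library and a small made-up sample (the numbers are purely illustrative):

import statistics
from collections import Counter

sample = [12, 15, 15, 18, 20, 22, 25]

mean = statistics.mean(sample)            # numerical average
median = statistics.median(sample)        # midpoint of the ordered values
mode = statistics.mode(sample)            # most commonly occurring value
frequency = Counter(sample)               # number of times each value is found
value_range = max(sample) - min(sample)   # difference between highest and lowest value
stdev = statistics.stdev(sample)          # dispersion of the values around the mean

print(mean, median, mode, value_range, stdev)
print(frequency)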
2) Inferential Statistics
In quantitative analysis, the expectation is to turn raw numbers into meaningful insight using numerical
values. Descriptive statistics explain the details of a specific dataset using numbers, but they do not
explain the motives behind those numbers; hence the need for further analysis using inferential
statistics.
Inferential statistics aim to make predictions or highlight possible outcomes from the analyzed data
obtained from descriptive statistics. They are used to generalize results and make predictions between
groups, show relationships that exist between multiple variables, and are used for hypothesis testing that
predicts changes or differences.
There are various statistical analysis methods used within inferential statistics; a few are discussed below.
• Cross Tabulations: Cross tabulation or crosstab is used to show the relationship that exists
between two variables and is often used to compare results by demographic groups. It uses a basic
tabular form to draw inferences between different data sets and contains data that is mutually
exclusive or has some connection with each other. Crosstabs are helpful in understanding the
nuances of a dataset and factors that may influence a data point.
• Regression Analysis: Regression analysis is used to estimate the relationship between a set of
variables. It is used to show the correlation between a dependent variable (the variable or outcome
you want to measure or predict) and any number of independent variables (factors that may have
an impact on the dependent variable). Therefore, the purpose of the regression analysis is to
estimate how one or more variables might have an effect on a dependent variable to identify trends
and patterns to make predictions and forecast possible future trends. There are many types of
regression analysis and the model you choose will be determined by the type of data you have for
the dependent variable. The types of regression analysis include linear regression, non-linear
regression, binary logistic regression, etc.
• Monte Carlo Simulation: Monte Carlo simulation also known as the Monte Carlo method is a
computerized technique of generating models of possible outcomes and showing their probability
distributions. It considers a range of possible outcomes and then tries to calculate how likely each
outcome will occur. It is used by data analysts to perform an advanced risk analysis to help in
forecasting future events and taking decisions accordingly.
• Analysis of Variance (ANOVA): This is used to test the extent to which two or more groups
differ from each other. It compares the means of the various groups and allows the analysis of
multiple groups (a short sketch appears after this list).
• Factor Analysis: A large number of variables can be reduced into a smaller number of factors
using the factor analysis technique. It works on the principle that multiple separate observable
variables correlate with each other because they are all associated with an underlying construct. It
helps in reducing large datasets into smaller, more manageable samples.
• Cohort Analysis: Cohort analysis can be defined as a subset of behavioral analytics that operates
from data taken from a given dataset. Rather than looking at all users as one unit, cohort analysis
breaks down data into related groups for analysis where these groups or cohorts usually have
common characteristics or similarities within a defined period.
• MaxDiff Analysis: This is a quantitative data analysis method used to gauge customers'
preferences for a purchase and which parameters rank higher than others in the process.
• Cluster Analysis: Cluster analysis is a technique used to identify structures within a dataset.
Cluster analysis aims to be able to sort different data points into groups that are internally similar
and externally different, that is, data points within a cluster will look like each other and different
from data points in other clusters.
• Time Series Analysis: This is a statistical analytic technique used to identify trends and cycles
over time. It is simply the measurement of the same variables at different points in time like
weekly, and monthly email sign-ups to uncover trends, seasonality, and cyclic patterns. By doing
this, the data analyst can forecast how variables of interest may fluctuate in the future.
• SWOT analysis: This is a quantitative data analysis method that assigns numerical values to
indicate the strengths, weaknesses, opportunities, and threats of an organization, product, or service,
giving a clearer picture of the competition and fostering better business strategies.
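
As a worked illustration of the ANOVA method from the list above, the following minimal Python
sketch compares three groups with SciPy; the scores are hypothetical:

from scipy import stats

group_a = [23, 25, 21, 22, 24]
group_b = [30, 28, 31, 27, 29]
group_c = [24, 26, 25, 23, 27]

# One-way ANOVA: tests whether at least one group mean differs from the others
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A p-value below the chosen significance level suggests the group means are not all equal.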

3.3.2.2 Methods and Techniques of Qualitative Data Analysis

Qualitative data can be observed and recorded. This data type is non-numerical in nature. This type
of data is collected through methods of observations, one-to-one interviews, conducting focus groups, and
similar methods. Qualitative data in statistics is also known as categorical data – data that can be arranged
categorically based on the attributes and properties of a thing or a phenomenon.

Qualitative Data Techniques and Examples


Qualitative data is also called categorical data since this data can be grouped according to
categories.
For example, think of a student reading a paragraph from a book during one of the class sessions. A teacher
who is listening gives feedback on how the child read that paragraph. If the teacher gives feedback based
on fluency, intonation, delivery of words, and clarity of pronunciation without giving a grade to the child,
this is an example of qualitative data.
It’s pretty easy to understand the difference between qualitative and quantitative data. Qualitative data
does not include numbers in its definition of traits, whereas quantitative data is all about numbers.
• The cake is orange, blue, and black in color (qualitative).
• Females have brown, black, blonde, and red hair (qualitative).
Quantitative data is any quantifiable information that can be used for mathematical calculation or
statistical analysis. This form of data helps in making real-life decisions based on mathematical
derivations. Quantitative data is used to answer questions like how many? How often? How much? This
data can be validated and verified.
To better understand the concept of qualitative and quantitative data, it’s best to observe examples of
particular datasets and how they can be defined. The following are examples of quantitative data.
• There are four cakes and three muffins kept in the basket (quantitative).
• One glass of fizzy drink has 97.5 calories (quantitative).

Importance of Qualitative Data


Qualitative data is important in determining the particular frequency of traits or characteristics. It allows
the statistician or the researchers to form parameters through which larger data sets can be observed. It
provides the means by which observers can quantify the world around them.
Qualitative data is about the emotions or perceptions of people, and what they feel. In qualitative
data, these perceptions and emotions are documented. It helps market researchers understand their
consumers' language and deal with the problem effectively and efficiently.
Qualitative Data Collection Methods – Types of Qualitative Data
Qualitative data collection is exploratory; it involves in-depth analysis and research. Its collection
methods mainly focus on gaining insights, reasoning, and motivations; hence, they go deeper in research.
Since this data cannot be measured, researchers prefer methods or data collection tools that are structured
to a limited extent.
Here are the qualitative data collection methods:
1. One-to-One Interviews: It is one of the most commonly used data collection instruments for
qualitative research, mainly because of its personal approach. The interviewer or the researcher collects
data directly from the interviewee on a one-to-one basis. The interview may be informal and unstructured
– conversational. Mostly the open-ended questions are asked spontaneously, with the interviewer letting
the flow of the interview dictate the questions to be asked.
2. Focus groups: This is done in a group discussion setting. The group is limited to 6-10 people, and a
moderator is assigned to moderate the ongoing discussion.
Depending on the data which is sorted, the members of a group may have something in common. For
example, a researcher conducting a study on track runners will choose athletes who are track runners or
were track runners and have sufficient knowledge of the subject matter.
3. Record keeping: This method makes use of the already existing reliable documents and similar sources
of information as the data source. This data can be used in the new research. It is similar to going to a
library. There, one can go over books and other reference material to collect relevant data that can be used
in the research.
4. Process of observation: In this data collection method, the researcher immerses himself/herself in the
setting where the respondents are, keeps a keen eye on the participants, and takes down notes. This is
known as the process of observation.
Besides taking notes, other documentation methods, such as video and audio recording, photography, and
similar methods, can be used.
5. Longitudinal studies: This data collection method is performed on the same data source repeatedly over
an extended period. It is an observational research method that goes on for a few years and, in some cases,
can go on for even decades. This data collection method aims to find correlations through an empirical
study of subjects with common traits.
6. Case studies: In this method, data is gathered by an in-depth analysis of case studies. The versatility of
this method is demonstrated in how this method can be used to analyze both simple and complex subjects.
The strength of this method is how judiciously it uses a combination of one or more qualitative data
collection methods to draw inferences.

3.3.2.2.1 Two Main Approaches to Qualitative Data Analysis


Deductive Approach
The deductive approach involves analyzing qualitative data based on a structure that is predetermined by
the researcher. A researcher can use the questions as a guide for analyzing the data. This approach is quick
and easy and can be used when a researcher has a fair idea about the likely responses that he/she is going
to receive from the sample population.
Inductive Approach
The inductive approach, on the contrary, is not based on a predetermined structure or set ground
rules/framework. It is a more time-consuming and thorough approach to qualitative data analysis. An
inductive approach is often used when a researcher has very little or no idea of the research phenomenon.

3.3.2.3 Steps to Qualitative Data Analysis


Whether you are looking to analyze qualitative data collected through a one-to-one interview or from
a survey, these simple steps will ensure a robust data analysis.
Step 1: Arrange your Data
Once you have collected all the data, it is largely unstructured and sometimes makes no sense at first
glance. Therefore, as a researcher, you first need to transcribe the data collected. The first step in
analyzing your data is arranging it systematically. Arranging data means converting all the data into a
text format. You can either export the data into a spreadsheet, manually type it in, or choose one of the
computer-assisted qualitative data analysis tools.

Step 2: Organize all your Data


After transforming and arranging your data, the immediate next step is to organize your data. You
may have a large amount of information that still needs to be arranged in an orderly manner. One of the
best ways to organize the data is by going back to your research objectives and then organizing the data
based on the questions asked. Arrange your research objective in a table so it appears visually clear. At all
costs, avoid the temptations of working with unorganized data. You will end up wasting time, and no
conclusive results will be obtained.
Step 3: Set a Code to the Data Collected
Setting up proper codes for the collected data takes you a step ahead. Coding is one of the best ways to
compress a tremendous amount of information collected. Qualitative data coding means categorizing and
assigning properties and patterns to the collected data.
Coding is important in this data analysis, as you can derive theories from relevant research findings. After
assigning codes to your data, you can then begin to build on the patterns to gain in-depth insight into the
data that will help make informed decisions.
Step 4: Validate your Data
Validating data is one of the crucial steps of qualitative data analysis for successful research. Since data
is quintessential for research, it is imperative to ensure that the data is not flawed. Please note that data
validation is not just one step in this analysis; this is a recurring step that needs to be followed throughout
the research process. There are two sides to validating data:
• Accuracy of your research design or methods.
• Reliability, which is the extent to which the methods produce accurate data consistently.
Step 5: Concluding the Analysis Process
Finally, it is important to conclude your analysis, which means systematically presenting your data as a
report that can be readily used. The report should state the method that you, as a researcher, used to conduct
the research studies, the positives and negatives, and the study limitations. In the report, you should also
state the suggestions/inferences of your findings and any related areas for future research.

Advantages
1. It helps in-depth analysis: The data collected provide the researchers with a detailed analysis, like
a thematic analysis of subject matters. While collecting it, the researchers tend to probe the participants
and can gather ample information by asking the right kind of questions. The data collected is used to
conclude a series of questions and answers.
2. Understand what customers think: The data helps market researchers understand their customers’
mindsets. The use of qualitative data gives businesses an insight into why a customer purchased a product.
Understanding customer language helps market research infer the data collected more systematically.
3. Rich data: Collected data can also be used to conduct future research. Since the questions asked to
collect qualitative data are open-ended questions, respondents are free to express their opinions, leading
to more information.

Disadvantages
1. Time-consuming: As collecting this data is more time-consuming, fewer people are studied than when
collecting quantitative data. Unless time and budget allow otherwise, a smaller sample size is included.
2. Not easy to generalize: Since fewer people are studied, it is difficult to generalize the results of that
population.
3. Dependent on the researcher’s skills: This type of data is collected through one-to-one interviews,
observations, focus groups, etc. it relies on the researcher’s skills and experience to collect information
from the sample.
It is typically descriptive data and is more difficult to analyze than quantitative data. You now have to
decide which option is best for your research project; remember that obtaining and analyzing qualitative
data takes a little more time, so you should account for this in your planning.

3.4 Hypothesis Testing


Hypothesis testing is the act of testing a hypothesis or a supposition in relation to a statistical
parameter. Analysts implement hypothesis testing in order to test if a hypothesis is plausible or not.

In data science and statistics, hypothesis testing is an important step as it involves the verification of an
assumption that could help develop a statistical parameter. For instance, a researcher establishes a
hypothesis assuming that the average of all odd numbers is an even number.

In order to find the plausibility of this hypothesis, the researcher will have to test the hypothesis using
hypothesis testing methods. Unlike a hypothesis that is ‘supposed’ to stand true on the basis of little or no
evidence, hypothesis testing is required to have plausible evidence in order to establish that a statistical
hypothesis is true.

This is where statistics plays an important role. A number of components are involved in this
process. But before understanding the process involved in hypothesis testing in research methodology, we
shall first understand the types of hypotheses that are involved in the process. Let us get started!

3.4.1 Types of Hypotheses

In data sampling, different types of hypotheses are involved in finding whether the tested samples test
positive for a hypothesis or not. In this segment, we shall look at the different types of hypotheses and
understand the role they play in hypothesis testing.

Alternative Hypothesis

Alternative Hypothesis (H1) or the research hypothesis states that there is a relationship between two
variables (where one variable affects the other). The alternative hypothesis is the main driving force for
hypothesis testing.
It implies that the two variables are related to each other and the relationship that exists between them is
not due to chance or coincidence.
When the process of hypothesis testing is carried out, the alternative hypothesis is the main subject of the
testing process. The analyst intends to test the alternative hypothesis and verifies its plausibility.

Null Hypothesis

The Null Hypothesis (H0) aims to nullify the alternative hypothesis by implying that there exists no
relation between two variables in statistics. It states that the effect of one variable on the other is solely
due to chance and no empirical cause lies behind it.
The null hypothesis is established alongside the alternative hypothesis and is recognized as important as
the latter. In hypothesis testing, the null hypothesis has a major role to play as it influences the testing
against the alternative hypothesis.

Non-Directional Hypothesis

The Non-directional hypothesis states that the relation between two variables has no direction.
Simply put, it asserts that there exists a relation between two variables, but does not recognize the direction
of effect, that is, whether variable A affects variable B or vice versa.

Directional Hypothesis
The Directional hypothesis, on the other hand, asserts the direction of effect of the relationship that exists
between two variables.
Herein, the hypothesis clearly states that variable A affects variable B, or vice versa.

Statistical Hypothesis

A statistical hypothesis is a hypothesis that can be verified to be plausible on the basis of statistics.
By using data sampling and statistical knowledge, one can determine the plausibility of a statistical
hypothesis and find out if it stands true or not.

3.4.2 Performing Hypothesis Testing

Now that we have understood the types of hypotheses and the role they play in hypothesis testing, let us
now move on to understand the process in a better manner.

In hypothesis testing, a researcher is first required to establish two hypotheses - alternative hypothesis and
null hypothesis in order to begin with the procedure.

To establish these two hypotheses, one is required to study data samples, find a plausible pattern among
the samples, and pen down a statistical hypothesis that they wish to test.
A random sample can be drawn from the population to begin hypothesis testing. Of the two
hypotheses, alternative and null, only one can be verified to be true. However, the presence of both
hypotheses is required to make the process successful.
At the end of the hypothesis testing procedure, either of the hypotheses will be rejected and the other one
will be supported. Even though one of the two hypotheses turns out to be true, no hypothesis can ever be
verified 100%.

Therefore, a hypothesis can only be supported based on the statistical samples and verified data. Here is a
step-by-step guide for hypothesis testing.

1. Establish the hypotheses

First things first, one is required to establish two hypotheses - alternative and null, that will set the
foundation for hypothesis testing.
These hypotheses initiate the testing process that involves the researcher working on data samples in order
to either support the alternative hypothesis or the null hypothesis.

2. Generate a testing plan

Once the hypotheses have been formulated, it is now time to generate a testing plan. A testing plan or an
analysis plan involves the accumulation of data samples, determining which statistic is to be considered
and laying out the sample size. All these factors are very important while one is working on hypothesis
testing.

3. Analyze data samples

As soon as a testing plan is ready, it is time to move on to the analysis part. Analysis of data samples
involves configuring statistical values of samples, drawing them together, and deriving a pattern out of
these samples.
While analyzing the data samples, a researcher needs to determine a set of things:

• Significance Level - The level of significance is the threshold used to judge whether a result is
statistically significant; it is the probability of rejecting the null hypothesis when it is actually true.

• Testing Method - The testing method involves a type of sampling-distribution and a test statistic
that leads to hypothesis testing. There are a number of testing methods that can assist in the analysis
of data samples.

• Test statistic - Test statistic is a numerical summary of a data set that can be used to perform
hypothesis testing.

• P-value - The P-value is the probability of finding a sample statistic at least as extreme as the
observed test statistic, assuming the null hypothesis is true; it indicates the plausibility of the null
hypothesis.

4. Infer the results

The analysis of data samples leads to the inference of results that establishes whether the alternative
hypothesis stands true or not. When the P-value is less than the significance level, the null hypothesis
is rejected and the alternative hypothesis turns out to be plausible.
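
A minimal Python sketch of steps 2-4 is shown below, assuming a one-sample t-test as the testing method;
the sample data and the hypothesized mean of 15 are made up for illustration:

from scipy import stats

sample = [14, 16, 15, 13, 17, 15, 16, 14, 15, 18]
alpha = 0.05                                              # significance level
t_stat, p_value = stats.ttest_1samp(sample, popmean=15)   # test statistic and p-value for H0: mean = 15

if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.3f} >= {alpha}: fail to reject the null hypothesis")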

3.4.3 Methods of Hypothesis Testing

As we have already looked into different aspects of hypothesis testing, we shall now look into the different
methods of hypothesis testing. All in all, there are 2 most common types of hypothesis testing methods.
They are as follows -

1. Frequentist Hypothesis Testing

The frequentist approach, or the traditional approach to hypothesis testing, is a hypothesis testing
method that draws its assumptions and conclusions from the current data.

The supposed truths and assumptions are based on the current data and a set of 2 hypotheses are
formulated. A very popular subtype of the frequentist approach is the Null Hypothesis Significance
Testing (NHST).

The NHST approach (involving the null and alternative hypothesis) has been one of the most sought-
after methods of hypothesis testing in the field of statistics ever since its inception in the mid-1950s.

2. Bayesian Hypothesis Testing

A more unconventional and modern method of hypothesis testing, Bayesian hypothesis testing
tests a particular hypothesis using past data samples, known as the prior probability, together with
current data, to assess the plausibility of the hypothesis.

The result obtained indicates the posterior probability of the hypothesis. In this method, the researcher
relies on the prior probability and the posterior probability to conduct the hypothesis testing at hand.

On the basis of this prior probability, the Bayesian approach tests whether a hypothesis is true or false.
The Bayes factor, a major component of this method, is the likelihood ratio between the null hypothesis
and the alternative hypothesis.
The Bayes factor indicates the plausibility of either of the two hypotheses established for hypothesis
testing.

3.4.3.1 Steps to Hypothesis Testing


1. Identify Population and Sample
Example:
Population: All GVSU students who enrolled in STA215 during WINTER 2018
Sample: 50 randomly selected students who enrolled in STA215 during WINTER 2018
2. State the Hypotheses in terms of population parameters
Ho - Null hypothesis; usually the opposite of our research hypothesis. The null hypothesis always
includes equality.
Ha - Alternative hypothesis; corresponds to our research hypothesis. Does not include equality.

Ho: ≥ ≤ =
Ha: < > ≠
Make sure you match the signs so that they are opposite of each other, unless your professor wants Ho
to always have "=".

Example:
Ho: The mean number of GVSU students enrolled in STA215 during WINTER 2018 who
speak English as a second language is 15.
Ha: The mean number of GVSU students enrolled in STA215 during WINTER 2018 who
speak English as a second language is not equal to 15.
3. State Assumptions and Check Conditions
These are the conditions that need to be met in order for the hypothesis test to be performed. If the
conditions are not met, then the results of the test are not valid.
4. Calculate the Test Statistic
The test statistic varies depending on the test performed, see statistical tests handouts for details.
5. Calculate the P-value
P-value = the probability of getting the observed test statistic or something more extreme when
Ho is true. P-values can be found using a calculator or a table from the STA215 textbook Introductory
Applied Statistics: A Variable Approach.
6. State the Conclusion
If P-value > α, then fail to reject the null hypothesis:
"There is insufficient evidence to conclude [Ha in words]."
If P-value < α, then reject the null hypothesis:
"There is sufficient evidence to conclude [Ha in words]."
Remember: never accept Ha; you only reject (or fail to reject) Ho.
In hypothesis testing, an analyst tests a statistical sample, with the goal of providing evidence on
the plausibility of the null hypothesis.

Statistical analysts test a hypothesis by measuring and examining a random sample of the population
being analyzed. All analysts use a random population sample to test two different hypotheses: the null
hypothesis and the alternative hypothesis.
The null hypothesis is usually a hypothesis of equality between population parameters; e.g., a null
hypothesis may state that the population mean return is equal to zero. The alternative hypothesis is
effectively the opposite of a null hypothesis (e.g., the population mean return is not equal to zero). Thus,
they are mutually exclusive, and only one can be true. However, one of the two hypotheses will always
be true.

Example: Real-World Example of Hypothesis Testing


If, for example, a person wants to test that a penny has exactly a 50% chance of landing on heads, the
null hypothesis would be that 50% is correct, and the alternative hypothesis would be that 50% is not
correct.
Mathematically, the null hypothesis would be represented as Ho: P = 0.5. The alternative hypothesis
would be denoted as "Ha" and be identical to the null hypothesis, except with the equal sign struck-
through, meaning that it does not equal 50%.
A random sample of 100 coin flips is taken, and the null hypothesis is then tested. If it is found that the
100 coin flips were distributed as 40 heads and 60 tails, the analyst would assume that a penny does not
have a 50% chance of landing on heads and would reject the null hypothesis and accept the alternative
hypothesis.
If, on the other hand, there were 48 heads and 52 tails, then it is plausible that the coin could be fair and
still produce such a result. In cases such as this where the null hypothesis is "accepted," the analyst states
that the difference between the expected results (50 heads and 50 tails) and the observed results (48 heads
and 52 tails) is "explainable by chance alone."
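
This coin example can be run directly in Python using SciPy's exact binomial test (the binomtest
function, available in SciPy 1.7+); the observed counts are those from the example above:

from scipy.stats import binomtest

# H0: P(heads) = 0.5; observed 40 heads in 100 flips
result = binomtest(k=40, n=100, p=0.5, alternative='two-sided')
print(f"p-value = {result.pvalue:.4f}")
# Compare the p-value with the chosen significance level (e.g. 0.05) to decide
# whether to reject the null hypothesis that the coin is fair.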

3.4.3.2 Error in Decision Making


Using hypothesis testing, you can make decisions about whether your data support or refute your
research predictions with null and alternative hypotheses.
Hypothesis testing starts with the assumption of no difference between groups or no relationship between
variables in the population—this is the null hypothesis. It’s always paired with an alternative
hypothesis, which is your research prediction of an actual difference between groups or a true relationship
between variables.
Example: Null and alternative hypothesis. You test whether a new drug intervention can alleviate symptoms
of an autoimmune disease.
In this case:
• The null hypothesis (H0) is that the new drug has no effect on symptoms of the disease.
• The alternative hypothesis (H1) is that the drug is effective for alleviating symptoms of the disease.
Then, you decide whether the null hypothesis can be rejected based on your data and the results of
a statistical test. Since these decisions are based on probabilities, there is always a risk of making the
wrong conclusion.
• If your results show statistical significance, that means they are very unlikely to occur if the null
hypothesis is true. In this case, you would reject your null hypothesis. But sometimes, this may
actually be a Type I error.
• If your findings do not show statistical significance, they have a high chance of occurring if the
null hypothesis is true. Therefore, you fail to reject your null hypothesis. But sometimes, this may
be a Type II error.
Example: Type I and Type II errors. A Type I error happens when you get false positive results: you
conclude that the drug intervention improved symptoms when it actually didn't. These improvements
could have arisen from other random factors or measurement errors.

A Type II error happens when you get false negative results: you conclude that the drug intervention didn’t
improve symptoms when it actually did. Your study may have missed key indicators of improvements or
attributed any improvements to other factors instead.

3.4.3.3 Type I error


A Type I error means rejecting the null hypothesis when it’s actually true. It means concluding that results
are statistically significant when, in reality, they came about purely by chance or because of unrelated
factors.
The risk of committing this error is the significance level (alpha or α) you choose. That’s a value that you
set at the beginning of your study to assess the statistical probability of obtaining your results (p value).
The significance level is usually set at 0.05 or 5%. This means that your results only have a 5% chance of
occurring, or less, if the null hypothesis is actually true.
If the p value of your test is lower than the significance level, it means your results are statistically
significant and consistent with the alternative hypothesis. If your p value is higher than the significance
level, then your results are considered statistically non-significant.
Example: Statistical significance and Type I error. In your clinical study, you compare the symptoms of
patients who received the new drug intervention or a control treatment. Using a t test, you obtain a p value
of .035. This p value is lower than your alpha of .05, so you consider your results statistically significant
and reject the null hypothesis.
However, the p value means that there is a 3.5% chance of your results occurring if the null hypothesis is
true. Therefore, there is still a risk of making a Type I error.
To reduce the Type I error probability, you can simply set a lower significance level.

Type I error rate


The null hypothesis distribution curve below shows the probabilities of obtaining all possible results if the
study were repeated with new samples and the null hypothesis were true in the population.
At the tail end, the shaded area represents alpha. It’s also called a critical region in statistics.
If your results fall in the critical region of this curve, they are considered statistically significant and the
null hypothesis is rejected. However, this is a false positive conclusion, because the null hypothesis is
actually true in this case!

3.4.3.4 Type II error


A Type II error means not rejecting the null hypothesis when it’s actually false. This is not quite the same
as “accepting” the null hypothesis, because hypothesis testing can only tell you whether to reject the null
hypothesis.
Instead, a Type II error means failing to conclude there was an effect when there actually was. In reality,
your study may not have had enough statistical power to detect an effect of a certain size.
Power is the extent to which a test can correctly detect a real effect when there is one. A power level of
80% or higher is usually considered acceptable.
The risk of a Type II error is inversely related to the statistical power of a study. The higher the statistical
power, the lower the probability of making a Type II error.
Example: Statistical power and Type II error. When preparing your clinical study, you complete a power
analysis and determine that with your sample size, you have an 80% chance of detecting an effect size of
20% or greater. An effect size of 20% means that the drug intervention reduces symptoms by 20% more
than the control treatment.
However, a Type II error may occur if the true effect is smaller than this size. A smaller effect size is
unlikely to be detected in your study due to inadequate statistical power.
Statistical power is determined by:
• Size of the effect: Larger effects are more easily detected.
• Measurement error: Systematic and random errors in recorded data reduce power.
• Sample size: Larger samples reduce sampling error and increase power.
• Significance level: Increasing the significance level increases power.
To (indirectly) reduce the risk of a Type II error, you can increase the sample size or the significance level.

Type II error rate


The alternative hypothesis distribution curve below shows the probabilities of obtaining all possible results
if the study were repeated with new samples and the alternative hypothesis were true in the population.
The Type II error rate is beta (β), represented by the shaded area on the left side. The remaining area under
the curve represents statistical power, which is 1 – β.
Increasing the statistical power of your test directly decreases the risk of making a Type II error.
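
As an illustration, a power analysis like the one in the clinical example can be run with the statsmodels
library. The sketch below solves for the sample size per group needed for 80% power; the standardized
effect size of 0.5 is a made-up value, not the 20% figure used above:

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Solve for the per-group sample size given effect size, alpha, and target power
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required sample size per group: {n_per_group:.1f}")
# Larger samples, larger effects, and a higher alpha all increase power,
# which lowers the Type II error rate (beta = 1 - power).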

For statisticians, a Type I error is usually considered worse. In practical terms, however, either type of
error could be worse depending on your research context.
A Type I error means mistakenly going against the main statistical assumption of a null hypothesis. This
may lead to new policies, practices or treatments that are inadequate or a waste of resources. In contrast,
a Type II error means failing to reject a null hypothesis. It may only result in missed opportunities to
innovate, but these can also have important practical consequences.

3.5 Methods of analysis


Pearson’s correlation coefficient
A typical example for quantifying the association between two variables measured on an interval/ratio
scale is the analysis of relationship between a person’s height and weight. Each of these two characteristic
variables is measured on a continuous scale. The appropriate measure of association for this situation is
Pearson’s correlation coefficient, r, which measures the strength of the linear relationship between
two variables on a continuous scale. The coefficient r takes on the values of −1 through +1. Values of −1
or +1 indicate a perfect linear relationship between the two variables, whereas a value of 0 indicates no
linear relationship. (Negative values simply indicate the direction of the association, whereby as one
variable increases, the other decreases.) Correlation coefficients that differ from 0 but are not −1 or +1
indicate a linear relationship, although not a perfect linear relationship. In practice, ρ (the population
correlation coefficient) is estimated by r, which is the correlation coefficient derived from sample data.
Although Pearson’s correlation coefficient is a measure of the strength of an association (specifically the
linear relationship), it is not a measure of the significance of the association. The significance of an
association is a separate analysis of the sample correlation coefficient, r, using a t-test to measure the
difference between the observed r and the expected r under the null hypothesis.
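
A minimal sketch of this analysis in Python, using SciPy and illustrative height/weight measurements:

from scipy import stats

height_cm = [150, 160, 165, 170, 175, 180, 185]
weight_kg = [50, 56, 61, 65, 70, 76, 82]

# r measures the strength of the linear relationship; the p-value tests its
# significance against the null hypothesis of no linear association
r, p_value = stats.pearsonr(height_cm, weight_kg)
print(f"r = {r:.3f}, p = {p_value:.4f}")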

Spearman rank-order correlation coefficient


The Spearman rank-order correlation coefficient (Spearman rho) is designed to measure the strength of a
monotonic (in a constant direction) association between two variables measured on an ordinal or ranked
scale. Data that result from ranking and data collected on a scale that is not truly interval in nature (e.g.,
data obtained from Likert-scale administration) are subject to Spearman correlation analysis. In addition,
any interval data may be transformed to ranks and analyzed with the Spearman rho, although this results
in a loss of information. Nonetheless, this approach may be used, for example, if one variable of interest is
measured on an interval scale and the other is measured on an ordinal scale. Similar to Pearson’s
correlation coefficient, Spearman rho may be tested for its significance. A similar measure of strength of
association is the Kendall tau, which also may be applied to measure the strength of a monotonic
association between two variables measured on an ordinal or rank scale.
As an example of when Spearman rho would be appropriate, consider the case where there are seven
substantial health threats to a community. Health officials wish to determine a hierarchy of threats in order
to most efficiently deploy their resources. They ask two credible epidemiologists to rank the seven threats
from 1 to 7, where 1 is the most significant threat. The Spearman rho or Kendall tau may be calculated to
measure the degree of association between the epidemiologists’ rankings, thereby indicating
the collective strength of a potential action plan. If there is a significant association between the two sets
of ranks, health officials may feel more confident in their strategy than if a significant association is not
evident.
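
A sketch of this ranking example in Python; the two epidemiologists' rankings of the seven threats below
are hypothetical:

from scipy import stats

epidemiologist_1 = [1, 2, 3, 4, 5, 6, 7]   # rank of each threat, 1 = most significant
epidemiologist_2 = [2, 1, 4, 3, 5, 7, 6]

rho, p_rho = stats.spearmanr(epidemiologist_1, epidemiologist_2)
tau, p_tau = stats.kendalltau(epidemiologist_1, epidemiologist_2)
print(f"Spearman rho = {rho:.3f} (p = {p_rho:.4f}), Kendall tau = {tau:.3f} (p = {p_tau:.4f})")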

Chi-square test
The chi-square test for association (contingency) is a standard measure for association between two
categorical variables. The chi-square test, unlike Pearson’s correlation coefficient or Spearman rho, is a
measure of the significance of the association rather than a measure of the strength of the association.
A simple and generic example follows. If scientists were studying the relationship between gender
and political party, then they could count people from a random sample belonging to the various
combinations: female-Democrat, female-Republican, male-Democrat, and male-Republican. The
scientists could then perform a chi-square test to determine whether there was a significant
disproportionate membership among those groups, indicating an association between gender and political
party.

Relative risk and odds ratio


Specifically in epidemiology, several other measures of association between categorical variables
are used, including relative risk and odds ratio. Relative risk is appropriately applied to
categorical data derived from an epidemiologic cohort study. It measures the strength of an association by
considering the incidence of an event in an identifiable group (numerator) and comparing that with the
incidence in a baseline group (denominator). A relative risk of 1 indicates no association, whereas a
relative risk other than 1 indicates an association.
As an example, suppose that 10 out of 1,000 people exposed to a factor X developed liver cancer,
while only 2 out of 1,000 people who were never exposed to X developed liver cancer. In this case, the
relative risk would be (10/1000)/(2/1000) = 5. Thus, the strength of the association is 5, or, interpreted
another way, people exposed to X are five times more likely to develop liver cancer than people not
exposed to X. If the relative risk was less than 1 (perhaps 0.2, for example), then the strength of the
association would be equally evident but with another explanation: exposure to X reduces the likelihood
of liver cancer five-fold, indicating that X has a protective effect. The categorical variables are exposure
to X (yes or no) and the outcome of liver cancer (yes or no). This calculation of the relative risk, however,
does not test for statistical significance. Questions of significance may be answered by calculation of a
95% confidence interval. If the confidence interval does not include 1, the relationship is considered
significant.
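
The same relative-risk calculation in Python, using the figures from the example above; the 95%
confidence interval uses one common approximation based on the standard error of log(RR):

import math

a, n_exposed = 10, 1000     # liver cancer cases among those exposed to factor X
c, n_unexposed = 2, 1000    # cases among those never exposed

rr = (a / n_exposed) / (c / n_unexposed)   # relative risk = 5.0
se_log_rr = math.sqrt(1/a - 1/n_exposed + 1/c - 1/n_unexposed)
lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
print(f"RR = {rr:.1f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# If the interval does not include 1, the association is considered significant.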

Similarly, an odds ratio is an appropriate measure of strength of association for categorical data
derived from a case-control study. The odds ratio is often interpreted the same way that relative risk is
interpreted when measuring the strength of the association, although this is somewhat controversial when
the risk factor being studied is common.

Additional methods
There are a number of other measures of association for a variety of circumstances. For example, if one
variable is measured on an interval/ratio scale and the second variable is dichotomous (has two outcomes),
then the point-biserial correlation coefficient is appropriate. Other combinations of data types (or
transformed data types) may require the use of more specialized methods to measure the association in
strength and significance.
Other types of association describe the way data are related but are usually not investigated for their own
interest. Serial correlation (also known as autocorrelation), for instance, describes how in a series of events
occurring over a period of time, events that occur closely spaced in time tend to be more similar than those
more widely spaced. The Durbin-Watson test is a procedure to test the significance of such correlations.
If the correlations are evident, then it may be concluded that the data violate the assumptions of
independence, rendering many modeling procedures invalid. A classical example of this problem occurs
when data are collected over time for one particular characteristic. For example, if an epidemiologist
wanted to develop a simple linear regression for the number of infections by month, there would
undoubtedly be serial correlation: each month’s observation would depend on the prior month’s
observation. This serial effect (serial correlation) would violate the assumption of independent
observations for simple linear regression and accordingly render the parameter estimates for simple linear
regression as not credible.

Inferring causality
Perhaps the greatest danger with all measures of association is the temptation to infer causality. Whenever
one variable causes changes in another variable, an association will exist. But whenever an association
exists, it does not always follow that causation exists. In epidemiology, the ability to infer causation from
an association is often weak because many studies are observational and subject to
various alternative explanations for their results. Even when randomization has been applied, as in clinical
trials, inference of causation is often limited.

Testing Association
A measure of association, in statistics, is any of various factors or coefficients used to quantify a
relationship between two or more variables. Measures of association are used in various fields of research
but are especially common in the areas of epidemiology and psychology, where they frequently are used
to quantify relationships between exposures and diseases or behaviours.
A measure of association may be determined by any of several different analyses,
including correlation analysis and regression analysis. (Although the
terms correlation and association are often used interchangeably, correlation in a stricter sense refers to
linear correlation, and association refers to any relationship between variables.) The method used to
determine the strength of an association depends on the characteristics of the data for each variable.
Data may be measured on an interval/ratio scale, an ordinal/rank scale, or a nominal/categorical
scale. These three characteristics can be thought of as continuous, integer, and qualitative categories,
respectively.

3.5.1 Chi Square


The Chi-Square Test of Independence determines whether there is an association between categorical
variables (i.e., whether the variables are independent or related). It is a nonparametric test.
This test is also known as:

• Chi-Square Test of Association.


This test utilizes a contingency table to analyze the data. A contingency table (also known as a cross-
tabulation, crosstab, or two-way table) is an arrangement in which data is classified according to two
categorical variables. The categories for one variable appear in the rows, and the categories for the other
variable appear in columns. Each variable must have two or more categories. Each cell reflects the total
count of cases for a specific pair of categories.
The Chi-Square Test of Independence is commonly used to test the following:
• Statistical independence or association between two categorical variables.
The Chi-Square Test of Independence can only compare categorical variables. It cannot make
comparisons between continuous variables or between categorical and continuous variables. Additionally,
the Chi-Square Test of Independence only assesses associations between categorical variables, and can
not provide any inferences about causation.
If your categorical variables represent "pre-test" and "post-test" observations, then the chi-square test of
independence is not appropriate. This is because the assumption of the independence of observations is
violated. In this situation, McNemar's Test is appropriate.

Consider the example below:


A survey was done to determine if job satisfaction was related to income. A total of 901 people
participated in the survey. The data are shown below. We will use the Chi-Square Test for Association
to determine if the two variables are associated.
1. Enter the data into an Excel worksheet as shown below.
Income            Very Dissatisfied   Little Dissatisfied   Moderately Satisfied   Very Satisfied
<$6,000                  20                   24                     80                  82
$6,000-$15,000           22                   38                    104                 125
$15,000-$25,000          13                   28                     81                 113
>$25,000                  7                   18                     54                  92
2. Select all the data in the table above, including the headings.
3. Select "Misc. Tools" from the "Statistical Tools" panel on the SPC for Excel ribbon.
4. Select the "Chi Square Test for Association" option and then OK.

• Enter Data Range with Labels: enter the range containing the data and the labels; default is the
range selected on the worksheet.

• Alpha: this is the significance level; 1 - alpha is the confidence level. Default is 0.05 for 95%
confidence.
• Row Title: Enter the title of the rows; default is value in first row of selected data.
• Column Title: enter the title of the columns; default is value in row above second column.
• Select OK to generate the results.
• Select Cancel to end the program.
Chi Square Test for Association Output
The output from the Chi-Square Test for Association is shown below. An explanation of the output
follows.

The top part of the output contains the data with the observed and expected values as well as the
contribution of each to χ2. The row and column totals are also given.
The middle portion of the output contains the following:
• Alpha (entered)
• The calculated χ2
• The degrees of freedom
• The critical χ2 value based on alpha and the degrees of freedom
• The calculated p value (will be in red if ≤ alpha)
The bottom portion of the output contains the residuals. The residuals are the difference between the
observed and the expected values. The conclusion is then given based on the values of alpha and the p
value. The null hypothesis (that the variables are not associated) is rejected if the p value < alpha.
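
The same test can also be run outside Excel; here is a minimal Python sketch applying SciPy's chi-square
test of independence to the survey table above:

from scipy.stats import chi2_contingency

observed = [
    [20, 24, 80, 82],    # <$6,000
    [22, 38, 104, 125],  # $6,000-$15,000
    [13, 28, 81, 113],   # $15,000-$25,000
    [7, 18, 54, 92],     # >$25,000
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
# Reject the null hypothesis of no association if the p value < alpha (e.g. 0.05).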

3.5.2 T-test
The t test tells you how significant the differences between group means are. It lets you know if those
differences in means could have happened by chance. The t test is usually used when data sets follow
a normal distribution but you don’t know the population variance.
For example, you might flip a coin 1,000 times and find the number of heads follows a normal distribution
for all trials. So you can calculate the sample variance from this data, but the population variance is
unknown. Or, a drug company may want to test a new cancer drug to find out if it improves life expectancy.

In an experiment, there’s always a control group (a group who are given a placebo, or “sugar pill”). So
while the control group may show an average life expectancy of +5 years, the group taking the new drug
might have a life expectancy of +6 years. It would seem that the drug might work. But it could be due to
a fluke. To test this, researchers would use a Student’s t-test to find out if the results are repeatable for an
entire population.
In addition, a t test uses a t-statistic and compares this to t-distribution values to determine if the results
are statistically significant.
However, note that you can only use a t test to compare two means. If you want to compare three or more
means, use an ANOVA instead.
The T Score.

The t score is a ratio between the difference between two groups and the difference within the groups.
• Larger t scores = more difference between groups.
• Smaller t score = more similarity between groups.
A t score of 3 tells you that the groups are three times as different from each other as they are within each
other. So when you run a t test, bigger t-values equal a greater probability that the results are repeatable.

T-Values and P-values

Every t-value has a p-value to go with it. A p-value from a t test is the probability that the results from
your sample data occurred by chance. P-values range from 0% to 100% and are usually written as a decimal
(for example, a p value of 5% is 0.05). Low p-values indicate your data did not occur by chance. For
example, a p-value of .01 means there is only a 1% probability that the results from an experiment
happened by chance.
Calculating the Statistic / Test Types

There are three main types of t-test:


• One-sample is used to find the mean or average of one group and compare it against a set or
known average.
• An independent Two-Sample test is conducted when samples from two different groups, species,
or populations are studied and compared.
• Paired Sample is the hypothesis testing conducted when two groups belong to the same
population or group.

• Equal Variance is conducted when the sample size in each group or population is the same, or
the variance of the two data sets is similar.

• Unequal Variance is used when the variance and the number of samples in each group are
different.
• An Independent Samples t-test compares the means for two groups.
• A Paired sample t-test compares means from the same group at different times (say, one
year apart).
• A One sample t-test tests the mean of a single group against a known mean.
There are standard step-by-step procedures for an independent samples t test, but you probably don’t want
to calculate the test by hand (the math can get very messy). Use the following tools to calculate the t test:
• How to do a T test in Excel.
• T test in SPSS.
• T-distribution on the TI 89.
• T distribution on the TI 83.

What is a Paired T Test (Paired Samples T Test / Dependent Samples T Test)?


A paired t test (also called a correlated pairs t-test, a paired samples t test or dependent samples t test) is
where you run a t test on dependent samples. Dependent samples are essentially connected — they are
tests on the same person or thing. For example:
• Knee MRI costs at two different hospitals,
• Two tests on the same person before and after training,
• Two blood pressure measurements on the same person using different equipment.
When to Choose a Paired T Test / Paired Samples T Test / Dependent Samples T Test
Choose the paired t-test if you have two measurements on the same item, person or thing. But you should
also choose this test if you have two items that are being measured with a unique condition. For example,
you might be measuring car safety performance in vehicle research and testing and subject the cars to a
series of crash tests. Although the manufacturers are different, you might be subjecting them to the same
conditions.
With a “regular” two sample t test, you’re comparing the means for two different samples. For example,
you might test two different groups of customer service associates on a business-related test or test
students from two universities on their English skills. But if you take a random sample from each group
separately and they have different conditions, your samples are independent and you should run
an independent samples t test (also called between-samples and unpaired-samples).

The null hypothesis for the independent samples t-test is μ1 = μ2. So it assumes the means are equal. With
the paired t test, the null hypothesis is that the pairwise difference between the two tests is equal (H0: µd =
0).
Paired Samples T Test By hand
Example question: Calculate a paired t test by hand for the following data:
(The data table of 11 paired X and Y scores from the original example is not reproduced here.)
Step 1: Subtract each Y score from each X score.

Step 2: Add up all of the values from Step 1 then set this number aside for a moment.

Step 3: Square the differences from Step 1.

Step 4: Add up all of the squared differences from Step 3.

Step 5: Use the following formula to calculate the t-score:

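The formula graphic is not reproduced here; the standard paired-samples t formula, consistent with the
terms defined below, is

t = (ΣD / n) / √[ (ΣD² - (ΣD)² / n) / (n(n - 1)) ]

where n is the number of pairs and D = X - Y for each pair.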
1. The “ΣD” is the sum of X-Y from Step 2.


2. ΣD2: Sum of the squared differences (from Step 4).
3. (ΣD)2: Sum of the differences (from Step 2), squared.
If you’re unfamiliar with the Σ (summation) notation used in the t test, it basically means to “add
everything up”.

Step 6: Subtract 1 from the sample size to get the degrees of freedom. We have 11 items. So 11 – 1 = 10.
Step 7: Find the critical t-value in the t-table, using the degrees of freedom from Step 6. If you don’t have
a specified alpha level, use 0.05 (5%).
For this example, with df = 10 and α = 0.05 (two-tailed), the critical t-value is 2.228.

Step 8: In conclusion, compare the t-table value from Step 7 (2.228) to your calculated t-value (-2.74).
The absolute value of the calculated t (2.74) is greater than the table value at an alpha level of .05. In
addition, the p-value is less than the alpha level: p < .05. So we can reject the null hypothesis that there is
no difference between the means.
Note that you can ignore the minus sign when comparing the two t-values, as the sign only indicates the
direction of the difference; the p-value remains the same for both directions.
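
If you would rather not compute this by hand, a minimal Python sketch with SciPy follows; the eleven
before/after pairs below are illustrative and are not the data from the worked example above:

from scipy import stats

before = [120, 135, 118, 127, 140, 132, 125, 130, 122, 128, 136]
after  = [115, 130, 120, 121, 134, 128, 120, 126, 119, 124, 131]

# Paired-samples t-test: H0 is that the mean pairwise difference is zero
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Reject H0 if the p-value is below the chosen alpha (e.g. 0.05).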

3.5.3 Regression
Regression is a statistical method used in finance, investing, and other disciplines that attempts
to determine the strength and character of the relationship between one dependent variable (usually
denoted by Y) and a series of other variables (known as independent variables).
Also called simple regression or ordinary least squares (OLS), linear regression is the most common form
of this technique. Linear regression establishes the linear relationship between two variables based on
a line of best fit. Linear regression is thus graphically depicted using a straight line with the slope defining
how the change in one variable impacts a change in the other. The y-intercept of a linear regression
relationship represents the value of one variable when the value of the other is zero. Non-linear
regression models also exist, but are far more complex.
Regression analysis is a powerful tool for uncovering the associations between variables observed
in data, but cannot easily indicate causation. It is used in several contexts in business, finance, and
economics. For instance, it is used to help investment managers value assets and understand the
relationships between factors such as commodity prices and the stocks of businesses dealing in those
commodities.
Regression as a statistical technique should not be confused with the concept of regression to the mean
(mean reversion).

3.5.3.1 Regression in Data Science and Data Analytics


According to the renowned American mathematician John Tukey, “An approximate answer to
the right problem is worth a good deal more than an exact answer to an approximate problem". This is
precisely what regression analysis strives to achieve.
Regression analysis is basically a set of statistical processes which investigates the relationship
between a dependent (or target) variable and an independent (or predictor) variable. It helps assess the
strength of the relationship between the variables and can also model the future relationship between
the variables.
Regression analysis is widely used for prediction and forecasting, which overlaps with Machine
Learning. On the other hand, it is also used for time series modeling and finding causal effect
relationships between variables. For example, the relationship between rash driving and the number of
road accidents by a driver can be best analyzed using regression.
The regression method of forecasting, as the name implies, is used for forecasting and for finding the
causal relationship between variables. From a business point of view, the regression method of
forecasting can be helpful for an individual working with data in the following ways:
• Predicting sales in the near and long term.
• Understanding demand and supply.
• Understanding inventory levels.
• Review and understand how variables impact all these factors.
However, businesses can use regression methods to understand the following:
• Why did the customer service calls drop in the past months?
• What will sales look like in the next six months?
• Which ‘marketing promotion’ method to choose?
• Whether to expand the business or to create and market a new product.


The ultimate benefit of regression analysis is to determine which independent variables have the most
effect on a dependent variable. It also helps to determine which factors can be ignored and those that
should be emphasized.

3.5.3.2 Regression Model


A regression model determines a relationship between an independent variable and a dependent variable,
by providing a function. Formulating a regression analysis helps you predict the effects of the independent
variable on the dependent one.
Example: age and height can be described using a linear regression model. Since a person's
height increases as their age increases (through the growing years), they have a linear relationship.
Regression models are commonly used as statistical proof of claims regarding everyday facts.
Different types of regression models
There are three different types of regression models:
1. Linear
2. Non-linear
3. Multiple
Let’s look at them in detail:

Linear regression model


A linear regression model is used to depict a relationship between variables that are proportional to each
other, meaning that the dependent variable increases or decreases with the independent variable.
In the graphical representation, it has a straight line plotted between the variables. Even if the points
are not exactly in a straight line (which is almost always the case), we can still see a pattern and make sense of it.
For example, as the age of a person increases, the level of glucose in their body increases as well.

Non-linear regression model


In the non-linear regression model, the graph doesn't show a linear progression. Depending on how the
response variable reacts to the input variable, the fitted curve will rise or fall, showing the strength of the
effect on the response variable.
To know that a non-linear regression model is the best fit for your scenario, make sure you look into your
variables and their patterns. If you see that the response variable is showing not-so-constant output to the
input variable, you can choose to use a non-linear model for your problem.
For example, a patient’s response to treatment can be good or bad depending on their body’s tendency
and willpower.
Multiple regression model
A multiple regression model is used when there is more than one independent variable affecting a
dependent variable. While predicting the outcome variable, it is important to measure how each of the
independent variables moves in their environment and how their changes will affect the output or target
variable.
For example, the chances of a student failing their test can be dependent on various input variables like
hard work, family issues, health issues, etc.
Stepwise regression modeling
Unlike the above-mentioned regression model types, stepwise regression modeling is more of a technique
used when various input variables affect one output variable. The analyst first measures the input variable
most directly correlated with the output variable and builds a model from it. The rest of the variables come
into the picture when the analyst decides to refine the model.
The analyst may add the remaining inputs one after the other, based on their significance and the extent to
which they affect the target variable.


For example, suppose vegetable prices have increased in a certain area. The reason behind the event can be
anything from natural calamities to transport and supply chain management. When an analyst decides to
put it out on a graph, he will pick the most obvious reason: heavy rainfall in the agricultural regions.
Once the model is built, he can then add the rest of the affecting input variables into the picture based on
their occurrence and significance.

3.5.3.3 Calculating Regression


Linear regression models often use a least-squares approach to determine the line of best fit. The least-
squares technique is determined by minimizing the sum of squares created by a mathematical function.
A square is, in turn, determined by squaring the distance between a data point and the regression line or
mean value of the data set.
Once this process has been completed (usually done today with software), a regression model is
constructed. The general form of each type of regression model is:
Y = a + bX + u (simple linear regression)
Y = a + b1X1 + b2X2 + ... + btXt + u (multiple regression)

where:
Y = the dependent variable you are trying to predict or explain
X = the explanatory (independent) variable(s) you are using to predict or associate with Y
a = the y-intercept
b = the beta coefficient, i.e. the slope of the explanatory variable(s)
u = the regression residual or error term
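As a small illustration of fitting this form by least squares, the sketch below uses SciPy's linregress on invented x/y values; the slope corresponds to b and the intercept to a.

from scipy import stats

# Invented illustration data: X could be a commodity price, Y a stock's return
x = [10.0, 12.5, 13.0, 14.2, 15.8, 17.1, 18.4]
y = [2.1, 2.9, 3.2, 3.6, 4.1, 4.8, 5.0]

result = stats.linregress(x, y)
print(f"slope b = {result.slope:.3f}")        # change in Y per unit of X
print(f"intercept a = {result.intercept:.3f}")  # value of Y when X is zero
print(f"R-squared = {result.rvalue ** 2:.3f}")  # strength of the linear fit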

Example of How Regression Analysis Is Used in Finance


Regression is often used to determine how specific factors, such as the price of a commodity,
interest rates, particular industries, or sectors, influence the price movement of an asset. The
capital asset pricing model (CAPM) is based on regression, and it is utilized to project the expected returns for stocks
and to generate costs of capital. A stock's returns are regressed against the returns of a broader index,
such as the S&P 500, to generate a beta for the particular stock.
Beta is the stock's risk in relation to the market or index and is reflected as the slope in the CAPM model.
The return for the stock in question would be the dependent variable Y, while the independent variable
X would be the market risk premium.
Additional variables such as the market capitalization of a stock, valuation ratios, and recent returns can
be added to the CAPM model to get better estimates for returns. These additional factors are known as
the Fama-French factors, named after the professors who developed the multiple linear regression model
to better explain asset returns.

3.5.4 Analysis of Variance (ANOVA)
Analysis of Variance (ANOVA) is a statistical formula used to compare the means (or averages) of
different groups by analysing their variances. A range of scenarios use it to determine if there is any difference
between the means of different groups. The t- and z-test methods developed in the 20th century were
used for statistical analysis until 1918, when Ronald Fisher created the analysis of variance method.
ANOVA is also called the Fisher analysis of variance, and it is the extension of the t- and z-tests. The
term became well-known in 1925, after appearing in Fisher's book, "Statistical Methods for Research
Workers." It was employed in experimental psychology and later expanded to subjects that were more
complex.
ANOVA splits the observed aggregate variability found inside a data set into two
parts: systematic factors and random factors. The systematic factors have a statistical influence on the
given data set, while the random factors do not. Analysts use the ANOVA test to determine the influence
that independent variables have on the dependent variable in a regression study.

For example, to study the effectiveness of different diabetes medications, scientists design an experiment
to explore the relationship between the type of medicine and the resulting blood sugar level. The sample
population is a set of people. We divide the sample population into multiple groups, and each group
receives a particular medicine for a trial period. At the end of the trial period, blood sugar levels are
measured for each of the individual participants. Then for each group, the mean blood sugar level is
calculated. ANOVA helps to compare these group means to find out if they are statistically different or if
they are similar.
The outcome of ANOVA is the 'F statistic'. This is the ratio of the between-group variance to the within-group
variance, which ultimately produces a figure allowing the conclusion that the null hypothesis is supported
or rejected. If there is a significant difference between the groups, the null hypothesis is not supported, and
the F-ratio will be larger.
ANOVA Terminology
Dependent variable: This is the item being measured that is theorized to be affected by the independent
variables.
Independent variable/s: These are the items being measured that may have an effect on the dependent
variable.
A null hypothesis (H0): This is when there is no difference between the groups or means. Depending on
the result of the ANOVA test, the null hypothesis will either be accepted or rejected.
An alternative hypothesis (H1): When it is theorized that there is a difference between groups and means.
Factors and levels: In ANOVA terminology, an independent variable is called a factor which affects the
dependent variable. Level denotes the different values of the independent variable that are used in an
experiment.
Fixed-factor model: Some experiments use only a discrete set of levels for factors. For example, a fixed-
factor test would be testing three different dosages of a drug and not looking at any other dosages.
Random-factor model: This model draws a random value of level from all the possible values of the
independent variable.

The Formula for ANOVA is:


ANOVA coefficient, F = Mean sum of squares between the groups (MSB) / Mean sum of squares of errors (MSE)
Therefore F = MSB / MSE
where:
Mean squares between groups: MSB = SSB / (k – 1)
Mean squares of errors: MSE = SSE / (N – k)
Degrees of freedom between groups: df1 = k – 1, where k is the number of groups
Degrees of freedom of errors: df2 = N – k, where N is the total number of observations across all k groups
Total degrees of freedom: df3 = N – 1


Moreover, the ANOVA table below represents its many components:

Source of Variation | Sum of Squares       | Degrees of Freedom | Mean Squares        | F Value
Between Groups      | SSB = Σ nj (X̄j – X̄)² | df1 = k – 1        | MSB = SSB / (k – 1) | F = MSB / MSE
Error               | SSE = ΣΣ (X – X̄j)²   | df2 = N – k        | MSE = SSE / (N – k) |
Total               | SST = SSB + SSE      | df3 = N – 1        |                     |

For the above table, the following represents:


SSB = sum of squares between groups
SSE = sum of squares of errors
X̄j = mean of the jth group
X̄ = overall mean
nj = sample size of the jth group
X = each data point (individual observation) in the jth group
N = total number of observations / total sample size
SST = total sum of squares = SSB + SSE
If the value of F is close to 1, then the variance between the means of the groups under observation is
insignificant.
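A minimal sketch of these formulas in Python, assuming hypothetical blood-sugar readings for three medication groups; the by-hand F is cross-checked against SciPy's built-in one-way ANOVA.

from scipy import stats

# Hypothetical blood-sugar readings for three medication groups
g1 = [85, 86, 88, 75, 78, 94]
g2 = [91, 92, 93, 85, 87, 84]
g3 = [79, 78, 88, 94, 92, 85]
groups = [g1, g2, g3]

k = len(groups)                                  # number of groups
N = sum(len(g) for g in groups)                  # total observations
grand_mean = sum(sum(g) for g in groups) / N

# SSB: weighted squared distances of group means from the grand mean
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# SSE: squared distances of each observation from its own group mean
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

msb = ssb / (k - 1)                              # mean squares between groups
mse = sse / (N - k)                              # mean squares of errors
print(f"F (by hand) = {msb / mse:.4f}")

f_stat, p_val = stats.f_oneway(g1, g2, g3)       # library cross-check
print(f"F (SciPy) = {f_stat:.4f}, p = {p_val:.4f}")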

3.5.4.1 Function of ANOVA

The ANOVA test is the initial step in analyzing factors that affect a given data set. Once the test is
finished, an analyst performs additional testing on the systematic factors that measurably contribute to
the data set's inconsistency. The analyst utilizes the ANOVA test results in an F-test to generate additional
data that aligns with the proposed regression models.
The ANOVA test allows a comparison of more than two groups at the same time to determine whether a
relationship exists between them. The result of the ANOVA formula, the F statistic (also called the F-
ratio), allows for the analysis of multiple groups of data to determine the variability between samples and
within samples.
If no real difference exists between the tested groups, which is called the null hypothesis, the result of
the ANOVA's F-ratio statistic will be close to 1. The distribution of all possible values of the F statistic
is the F-distribution. This is actually a group of distribution functions, with two characteristic numbers,
called the numerator degrees of freedom and the denominator degrees of freedom.

Example:
A researcher might, for example, test students from multiple colleges to see if students from one
of the colleges consistently outperform students from the other colleges. In a business application, an
R&D researcher might test two different processes of creating a product to see if one process is better
than the other in terms of cost efficiency.
The type of ANOVA test used depends on a number of factors. It is applied when the data is
experimental. Analysis of variance can be employed when there is no access to statistical software, by
computing ANOVA by hand. It is simple to use and best suited for small samples. With many
experimental designs, the sample sizes have to be the same for the various factor level combinations.
ANOVA is helpful for testing three or more groups. It is similar to running multiple two-sample t-tests.
However, it results in fewer type I errors and is appropriate for a range of issues. ANOVA groups
differences by comparing the means of each group and includes spreading out the variance into diverse
sources. It is employed with subjects, test groups, between groups and within groups.

3.5.4.2 When to use ANOVA


As an analyst, you might use Analysis of Variance (ANOVA) to test a particular hypothesis. You'd use
ANOVA to figure out how your various groups react, with the null hypothesis being that the means of the
various groups are equal. If the difference between the two populations is statistically significant, then the
two populations are unequal.
Now that we have understood what ANOVA is, let’s understand some important terms related to ANOVA.
Significant Terms Related to ANOVA
Means (Grand and Sample)
A sample mean is the average value for a group, whereas the grand mean is the average of sample means
from various groups or the mean of all observations combined.
F-Statistics
The F-statistic or F-ratio is a statistical measure that tells us about the extent of difference between the means
of different samples. The lower the F-ratio, the closer the sample means are.
Sum of Squares
The sum of squares is a technique used in regression analysis to determine the dispersion of data points.
It is used in the ANOVA test to compute the value of F.
Mean Squared Error (MSE)
The Mean Squared Error gives us the average error in the data set.
Hypothesis
In ANOVA, we have Null Hypothesis and an Alternative Hypothesis. The Null hypothesis is valid when
all the sample means are equal, or they don’t have any major difference.
The Alternate Hypothesis is valid when at least one of the sample means is different from the other.
Group Variability
In ANOVA, a group is a set of samples within the independent variable.
Between-group variability occurs when there is a significant variation in the sample distributions of
individual groups.
Within-group variability occurs when there are variations in the sample distribution within a single group.

3.5.4.3 Analysis of Variance Assumptions


The assumptions of the ANOVA test are the same as the general assumptions for any parametric test:
1. An ANOVA can only be conducted if there is no relationship between the subjects in each
sample. This means that subjects in the first group cannot also be in the second group (e.g.
independent samples/between-groups).
2. The different groups/levels should have (approximately) equal sample sizes.
3. An ANOVA can only be conducted if the dependent variable is normally distributed, so that the
middle scores are most frequent and extreme scores are least frequent.
4. Population variances must be equal (i.e. homoscedastic). Homogeneity of variance means that
the deviation of scores (measured by the range or standard deviation for example) is similar
between populations.

3.5.4.4 Type of ANOVA

One Way ANOVA


One-way analysis of variance is commonly called a one-factor test, in relation to the dependent
subject and independent variable. Statisticians utilize it while comparing the means of groups
independent of each other, using the Analysis of Variance coefficient formula. It involves a single
independent variable with at least two levels. The one-way Analysis of Variance is quite similar to the t-test.
A one-way ANOVA (analysis of variance) has one categorical independent variable (also known as a
factor) and a normally distributed continuous (i.e., interval or ratio level) dependent variable.
The independent variable divides cases into two or more mutually exclusive levels, categories, or groups.
The one-way ANOVA tests for differences in the means of the dependent variable, broken down by the
levels of the independent variable.
An example of a one-way ANOVA includes testing a therapeutic intervention (CBT, medication, placebo)
on the incidence of depression in a clinical sample.
Both the One-Way ANOVA and the Independent Samples t-Test can compare the means for two groups.
However, only the One-Way ANOVA can compare the means across three or more groups.

Two Way ANOVA


A two-way ANOVA (analysis of variance) has two or more categorical independent variables (also known
as a factor), and a normally distributed continuous (i.e., interval or ratio level) dependent variable.
The independent variables divide cases into two or more mutually exclusive levels, categories, or groups.
A two-way ANOVA is also called a factorial ANOVA.
An example of a factorial ANOVA is testing the effects of social contact (high, medium, low), job
status (employed, self-employed, unemployed, retired), and family history (no family history, some family
history) on the incidence of depression in a population.

The prerequisite for conducting a two-way ANOVA test is the presence of two independent variables; one
can perform it in two ways:
Two-way ANOVA with replication (repeated measures analysis of variance) – done when the two
independent groups with dependent variables perform different tasks.
Two-way ANOVA without replication – done when there is a single group that is tested twice, for example
testing a player before and after a football game.
Moreover, one must meet the following conditions for its application (a small code sketch follows the list):
• The population should be near normal distribution.
• All samples should be independent.
• Variances of the population have to be equal.
• There should be an equal-sized sample in the group.
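A hedged sketch of a two-way (factorial) ANOVA, assuming the statsmodels library is available; the small depression-score dataset, the factor names, and the column names are all invented for illustration.

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Invented 2 x 2 design with two replications per cell
df = pd.DataFrame({
    "contact": ["high", "high", "low", "low", "high", "low", "high", "low"],
    "status": ["employed", "retired", "employed", "retired"] * 2,
    "score": [12, 18, 15, 22, 11, 16, 13, 24],
})

# C() marks categorical factors; '*' fits both main effects and the interaction
model = ols("score ~ C(contact) * C(status)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # ANOVA table with F values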

Comparing One-Way ANOVA and Two-Way ANOVA


• There are two main types of ANOVA: one-way (or unidirectional) and two-way. There are also variations
of ANOVA. For example, MANOVA (multivariate ANOVA) differs from ANOVA in that the former
tests for multiple dependent variables simultaneously while the latter assesses only one dependent
variable at a time. One-way or two-way refers to the number of independent variables in your analysis
of variance test. A one-way ANOVA evaluates the impact of a sole factor on a sole response variable.
It determines whether all the samples are the same. The one-way ANOVA is used to determine
whether there are any statistically significant differences between the means of three or more
independent (unrelated) groups.
• A two-way ANOVA is an extension of the one-way ANOVA. With a one-way, you have one
independent variable affecting a dependent variable. With a two-way ANOVA, there are two
independents. For example, a two-way ANOVA allows a company to compare worker productivity
based on two independent variables, such as salary and skill set. It is utilized to observe the interaction
between the two factors and tests the effect of two factors at the same time.


• ANOVA tests for differences among the means of the populations by examining the amount of
variation within each sample, relative to the amount of variation between the samples. Analyzing
variance tests the hypothesis that the means of two or more populations are equal.
• In a regression study, analysts use the ANOVA test to determine the impact of independent variables
on the dependent variable.

3.5.4.5 The other types of ANOVA are:


N-Way ANOVA and MANOVA
An N-way ANOVA applies when multiple independent variables affect the dependent variable. MANOVA
(multivariate ANOVA) goes further, in that one can use it to observe multiple dependent variables simultaneously.
Repeated measures ANOVA test: In this ANOVA test, you take sample means from at least three
different sets of test statistics and compare them against one another. This way, you can look for any key
and critical values and notate their statistical significance level as well. You do so primarily through
utilizing repeated F-tests.
Two-way ANOVA test: Also known as a factorial ANOVA test, this two-way approach measures the
interaction effect between two different groups (or independent variables) on a control group. It does so
in part by using F-ratios with two mean square values related to each group.
Two-way ANOVA test with replication: Just as with a typical two-way ANOVA test, you’ll study the
effect size of two separate datasets. The main difference arises in the fact this test requires you to run
multiple studies with different groups of people, while still using the same response variables.

Understanding F-Value in ANOVA


The test statistic for an ANOVA is denoted as F. The formula for ANOVA is F = variance caused by
treatment/variance due to random chance.
The ANOVA F value can tell you if there is a significant difference between the levels of the independent
variable, when p < .05. So, a higher F value indicates that the treatment variables are significant.
Note that the ANOVA alone does not tell us specifically which means were different from one another.
To determine that, we would need to follow up with multiple comparisons (or post-hoc) tests.
When the initial F test indicates that significant differences exist between group means, post hoc tests are
useful for determining which specific means are significantly different when you do not have specific
hypotheses that you wish to test.
Post hoc tests compare each pair of means (like t-tests), but unlike t-tests, they correct the significance
estimate to account for the multiple comparisons.
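A minimal sketch of such post-hoc pairwise comparisons using a simple Bonferroni correction (one common adjustment among several); the three groups reuse the hypothetical data from the one-way example earlier.

from itertools import combinations
from scipy import stats

groups = {"g1": [85, 86, 88, 75, 78, 94],
          "g2": [91, 92, 93, 85, 87, 84],
          "g3": [79, 78, 88, 94, 92, 85]}

pairs = list(combinations(groups, 2))
alpha = 0.05 / len(pairs)            # Bonferroni: divide alpha by number of pairs
for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    verdict = "significant" if p < alpha else "not significant"
    print(f"{a} vs {b}: t = {t:.3f}, p = {p:.4f} -> {verdict}")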

3.5.4.6 Steps to Perform an ANOVA Test


ANOVA tests help you better understand data and statistics. Keep the steps for this basic tutorial in mind
as you conduct this statistical test:
• Assign variables. Start by deciding on your dependent variable group. This will remain constant
throughout all your experimentation. Move on to define your independent variables. Keep in mind that the
number you use will likely lend itself to specific types of ANOVA tests. The goal is to eventually
get an F-value (or F-statistic), or a set of these values, you can understand by the end of the test.
• Choose levels. Within your independent group variables, decide how many different levels you'd like to
test. For instance, if your goal is to see how people respond to heart medication, your variable group would
be this type of medicine in a general sense, and your levels would likely be specific forms or brands of
heart medication.
• Collect data points. To build your F-distribution, start running the test and keep track of data over
time. See what the main effects of the independent variables are. At this phase of statistical analysis,
your goal should be to collect as much data as possible rather than to assign value to it.
• Decide which test to use. Make sure you're using the appropriate form of ANOVA test to capture all
your population mean differences. If you're working with a single independent variable, a simple one-way test can
suffice. Alternatively, if you're using multiple different independent variables, a two-way or MANOVA
approach might be more useful.
• Parse through your data set. After you gather all your data, go through your test results. See if they
hew closely to the standard deviation common to most statistical analysis or give you unpredictable
results. If you're using multiple different independent variables, see if you have equal variances or
wide differences between each. If something seems wrong, consider running a regression analysis
to see where things went awry. You can always conduct some post-hoc tests as well.

Uses of ANOVA
• To test correlation and regression.
• To study the homogeneity in the case of two-way classification.
• To test the significance of the multiple correlation coefficient.
• To test the linearity of regression.

Advantages of ANOVA:
• Whereas the Z test can only be used to compare the means of two populations, the ANOVA test
can be used to compare the means of three or more populations.
• If there are two different treatments/factors affecting the dependent variable, then we can use the
two-way ANOVA test to analyse the effect due to each treatment. The test will tell us whether the
difference due to each of the treatments is significant or not.
• We can check equality of three or more populations means by repeatedly applying Z test pairwise.
But this increases the Type 1 error. On the other hand, the same comparison done by the ANOVA
technique, has low Type 1 error. This means that ANOVA test is a statistically powerful test.
• The ANOVA method is used in clinical testing to check for the effectiveness of experimental
medicines.
• The calculations involved in calculating the F statistics are easy and involve elementary operations
such as squaring, summing up and dividing. The decision criteria for rejecting or accepting the
null hypothesis are easy to understand.
Disadvantages of ANOVA:
• It often happens that the parent populations do not follow the normal distribution. For example,
the lifetimes of products generally follow the Weibull distribution. In such cases the ANOVA
method cannot be used. For instance, we may not be able to use the ANOVA technique to compare
the mean life of bulbs produced by three companies.
• If there are two or more dependent variables then the ANOVA technique cannot be applied. The
MANOVA test must be used in such cases.
• It rarely happens that all the population variances are equal. If the assumption of homoscedasticity
is violated then the use of ANOVA cannot be justified.
• If the null hypothesis is rejected we can only conclude that some population means are unequal.
The ANOVA test does not tell us anything about which of them are unequal. Some post hoc
tests must be carried out in order to know about that.
• Checking all the background assumptions such as independence, normality, homoscedasticity, etc.
is in and of itself a difficult task.
• Although the calculations involved are elementary, they are still tedious to perform by hand. But
ANOVA tests are usually carried out using statistical software, so this is not a huge barrier.

3.6 Making the Choice of an Appropriate Analysis Technique


1. Affinity diagrams


When engaged in brainstorming ideas, how can you avoid information overload? Affinity
diagrams help leaders and teams visually organise numerous ideas and data points in a simplified visual
form.

2. Analytic hierarchy process (AHP)


This decision-making technique helps to mitigate any subjectivity or intuition that goes into a decision.
Going with the gut or being blinkered by a subjective perspective is perfectly natural – it’s human nature,
and in some ways is a remarkable survival technique as it can lead to fast decisions based on personal
lived experience. However, it’s uncommon for a business issue to involve outrunning a non-allegorical
sabre-toothed tiger. Leadership often requires decision making to be analytical and as objective as
possible.
AHP, first developed in the 1970s by Dr. Thomas Saaty, combines the Multiple Criteria decision-
making technique with Paired Comparison and a splash of maths to explore multiple criteria and options
which might result in a single overall goal. The AHP decision making technique is normally reserved for
group solutions to complex challenges. Key use: complex decisions

3. Conjoint analysis

Market researchers will be familiar with this stats-oriented technique. Conjoint analysis is often used to
help forecast how accepting consumers will be of proposed changes. It’s also used to help determine a
brand’s positioning in the market. Conjoint analysis is a survey-based technique that helps reveal how
consumers might value the attributes (such as the function, features or benefits) of a product or service.

4. Cost/benefit analysis
This technique is solely for making decisions of a financial nature. It can also be used to acquire any
financial data you might wish to use as part of another decision-making technique. Key use: financial
decision making

5. Decision making trees


The outcomes aren’t always clear when business decisions need to be made. A business might, for
example, be required to choose between conflicting strategies while hampered by limited resources or
other impediments to success. A decision-making tree can provide a visual aid when considering the
various phases of proposed solutions with unclear outcomes. Key use: assessing multiple outcomes prior
to tough decision making
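A minimal sketch of scoring tree branches by expected monetary value (EMV), one common way to quantify such a tree; the strategies, probabilities and payoffs are all hypothetical.

# Each strategy maps to a list of (probability, payoff) outcome branches
strategies = {
    "Expand": [(0.6, 500_000), (0.4, -200_000)],
    "New product": [(0.3, 900_000), (0.7, -100_000)],
    "Do nothing": [(1.0, 50_000)],
}

# Expected monetary value = sum of probability-weighted payoffs per branch
for name, outcomes in strategies.items():
    emv = sum(p * payoff for p, payoff in outcomes)
    print(f"{name}: EMV = {emv:,.0f}")
# A risk-neutral decision maker picks the branch with the highest EMV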


6. Game theory
Game theory can help business leaders make decisions by putting themselves in the shoes of a third party
– e.g. a client, competitor or consumer – and anticipating what their actions, reactions and motives might
be. Playing out these scenarios in a safe hypothetical space can help a leader make decisions based on the
outcomes of the game.
Game theory can be a useful decision-making technique if you need to take into account exterior third
parties like competitors, clients or legislative authorities. It was invented in 1944 by John von Neumann
and Oskar Morgenstern. Since then, around 20 leading scientists and economists have been awarded the
Nobel Prize in Economic Sciences for their evolution of game theory, so it’s clearly an important aspect
of modern decision-making and analysis.
Game theory models the strategic interaction between two or more players in a situation that involves set
rules. Games are typically co-operative or non co-operative. There are various Players, Actions, Payoffs
and Information (known as PAPI). Players formulate strategies and try to gain as much benefit as they
can. Key use: negotiating with third parties or making strategic decisions that involve third parties

Key opportunities to use game theory in decision making:


• Bargaining and negotiating
• Product/service launch decisions
• Supply chain decisions (e.g. outsourcing)

7. Heuristic methods
Heuristic methods are used to refine a product or service over time, using trial and error. They're not
precise, but they can get the job done. Heuristic methods often have the benefit of saving time and
resources and reducing initial expenditure.
For example, decisions relating to a website launch could be resolved using heuristic methods, if it’s
determined the website doesn’t need to be perfect on launch. It can meet 80% of desired requirements,
and be improved in terms of content and function over time. Key use: save time on making decisions
where a perfect result isn’t required first time round

8. Influence diagrams approach (IDA)


IDA is a technique used in the field of human reliability assessment. It can be used in all kinds of sectors,
from business and HR to the healthcare and nuclear industries.
Decision making sometimes depends greatly on the people involved and their level of reliability. In some
projects, the reliability of the team can make or break a situation. An influence diagram can provide a
visual aid to determine how human error might influence a decision or project, and how much that might
affect outcomes. Key use: reducing the risk of human error in decision making

9. Linear programming (LP)

Linear programming uses maths to represent requirements as linear equations. It is, for example, useful
when making decisions relating to problems cropping up in operations research. Key use: making the most
of limited resources
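A minimal sketch of a production-mix LP solved with SciPy; the profit figures and resource limits are invented. Note that linprog minimises, so the profits are negated to maximise.

from scipy.optimize import linprog

# Maximise profit 40x + 30y subject to labour and material constraints
c = [-40, -30]                  # negated because linprog minimises
A_ub = [[1, 1],                 # labour hours:   x + y <= 40
        [2, 1]]                 # material units: 2x + y <= 60
b_ub = [40, 60]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(f"produce x = {res.x[0]:.1f}, y = {res.x[1]:.1f}")
print(f"maximum profit = {-res.fun:.1f}")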

10. Multiple criteria decision analysis


This decision-making technique allows a business to assess and evaluate various options against a set of
defined business criteria. Typical examples of criteria might be cost/price, level of quality, customer/client
satisfaction, or high returns.
This analysis technique is somewhat like a cost/benefit analysis, except it’s not limited to cost. We make
thousands of decisions every day - often intuitively, but some part of us is weighing up the various criteria.
When we buy a car, we weigh up cost, comfort, safety, fuel economy, function, form and aesthetics. When
we buy a latte, we consider everything from cost and quality to the environmental friendliness of the
packaging.
Multiple criteria decision analysis enables leaders to weigh up different criteria. How does one measure
apples against cheese, or cost against comfort? The following MCDA steps can help (a small scoring sketch follows the list). Key use: making
business decisions that reach a compromise between logical analysis and intuition

Multiple criteria decision analysis in 5 steps


• Specify the context
• Identify available options
• Confirm the objectives and select criteria that represent key values
• Measure each of the criteria in order to figure out their relative importance
• Calculate the different values by averaging out scores and weighting
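A minimal sketch of those five steps as a weighted-scoring model; the options, criteria, weights and 1-10 scores are all hypothetical.

# Step 3: criteria with weights reflecting their relative importance (sum to 1)
criteria = {"cost": 0.4, "comfort": 0.3, "safety": 0.3}

# Steps 1-2 and 4: hypothetical options scored 1-10 against each criterion
scores = {
    "Car A": {"cost": 8, "comfort": 5, "safety": 7},
    "Car B": {"cost": 5, "comfort": 8, "safety": 9},
    "Car C": {"cost": 7, "comfort": 6, "safety": 6},
}

# Step 5: weight and sum the scores for each option
for option, option_scores in scores.items():
    total = sum(option_scores[c] * w for c, w in criteria.items())
    print(f"{option}: weighted score = {total:.2f}")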

11. Multi-voting
When making decisions as a group, use multi-voting to weed out lower priority options. You can then use
other, more exacting techniques to make key decisions on a smaller (and therefore more manageable)
group of options.
Multi-voting can be as simple as giving each member of the group a list of ideas and telling them they can
only vote for the three ideas they consider most important or beneficial. Tally up the votes to determine
which options are deemed most important by the group. Key use: making fair and balanced group
decisions

12. Net present value (NPV) and present value (PV)


The value of money flexes with time. A house bought twenty years ago might be worth far more now,
leading to questions of whether (and when) to sell or buy. Pension payments might rise substantially the
longer a person remains in employment, leading to questions of when to retire.
Calculation of NPV or PV can help a business compare financial options representing future cash flows.
It's key to use critical thinking to question all assumptions when making these calculations in order to
make a genuinely informed decision. Key use: making decisions relating to investment and capital budgeting
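A minimal sketch of the NPV calculation itself: each future cash flow is discounted back to today and summed. The cash flows and the 8% discount rate are invented.

def npv(rate, cashflows):
    # Cash flow at index t is discounted by (1 + rate) ** t
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# Year 0 outlay of -1000, then five annual inflows of 300
flows = [-1000, 300, 300, 300, 300, 300]
print(f"NPV at 8% = {npv(0.08, flows):.2f}")  # positive NPV favours investing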

13. Paired comparison analysis


• The less experienced, cheaper hire or the more expensive, more experienced hire?
• Creating a product/service in-house or buying/outsourcing it?
• The quick, cost-effective option or the expensive, delayed, future-proofed option?
We often have to compare two options in order of importance. Paired comparison analysis can help with
that – and we do it intuitively all the time, but it’s advantageous in business to bring structured analysis
into the mix.
If paired comparison analysis has a catch, it’s that this technique doesn’t really surface any information
identifying the criteria supporting each option. You have to do the legwork yourself – but it’s a good
starting point. Key use: making decisions relating to comparing two options

14. Pro/con technique


The pro/con technique can be used in tandem with paired comparison analysis, and weighing up the pros
and cons of a decision is a tale as old as time.
Similar techniques include the plus/minus/interesting (PMI) technique and force field analysis. Key use:
making decisions relating to comparing two options


15. Scientific method


The scientific method of decision making can also be called a heuristic method, since it’s best used in
circumstances where you don’t need 100% perfection first time round.
We all learned the drill in school – hypothesis, method, results, conclusion. There are a few more steps to
the scientific method, but in essence the format is the same as that of science experiments in school. Key
use: taking a scientific approach to business decisions

Using the scientific method in 7 steps:


• Question – formulate a question
• Research – do background research to gather as much clarity as possible
• Hypothesis - Based on your research, form a hypothesis, or statement you want to test the validity
of
• Experiment – test, test, test!
• Observations – collect data from the experiment
• Results – formulate results based on the data you’ve collected
• Conclusion – determine the validity of your hypothesis

16. Trial and error


Decision making based on trial and error sounds chaotic but it has an established place in business strategy.
Several of the decision-making techniques outlined above have their basis in a structured approach to trial
and error. Heuristic methods and the scientific method feature trial and error as the backbone of their
process. Agile project management is a very flexible management style that incorporates trial and error
into its process with minimum risk.
When using the trial and error method to make decisions, it’s important to acknowledge that any failure
as a result of decisions made is low risk. It’s also vital to reflect deeply on the results in order to understand
the causes of the failure and further remove the risks and challenges on the next iteration of the trial and
error process. Going in circles is not progression. Heading upwards in a spiral is. Key use: a general and
popular approach to making low-risk decisions

3.6.1 Successful Choice of an Appropriate Analysis Technique

The successful choice of an appropriate analysis technique depends on several factors, including:
Nature of data: The type of data you have, such as numerical, categorical, or time-series data, will
determine the suitable analysis method.
Research question: The research question you are trying to answer will influence the choice of technique.
For example, if you want to establish a cause-and-effect relationship, you might consider using regression
analysis.
Number of variables: The number of variables you have and the relationship between them will
determine the appropriate technique. For example, if you have two variables, you might use a scatter plot,
while if you have more than two variables, you might use multiple regression or factor analysis.
Data distribution: The distribution of your data can affect the choice of analysis technique. For example,
if your data is not normally distributed, you might consider using a non-parametric test.
Sample size: The sample size of your data will also influence the choice of analysis technique. For
example, if you have a small sample size, you might consider using a non-parametric test.
It is important to remember that there is often more than one appropriate analysis technique for a
given data set and research question; consulting a statistical expert or the relevant literature can help
determine the most appropriate technique for your study. The choice also depends on the type of data
you have, the research questions you want to answer, and the type of results you want to obtain. Here
again are some common analysis techniques used in various contexts:


Descriptive Statistics: This type of analysis is used to summarize and describe the main features of a
dataset, such as the mean, median, mode, and standard deviation.
Inferential Statistics: This type of analysis involves making generalizations about a population based on
a sample of data. Common techniques include hypothesis testing, regression analysis, and analysis of
variance (ANOVA).
Exploratory Data Analysis (EDA): This type of analysis is used to summarize and visualize the main
features of a dataset, identify patterns, and detect outliers.
Machine Learning: This type of analysis involves training a model on a dataset to make predictions or
classify data points. Common techniques include decision trees, random forests, support vector machines,
and neural networks.
Network Analysis: This type of analysis is used to analyze relationships between nodes in a network,
such as social networks, transportation networks, and biological networks.
Time Series Analysis: This type of analysis is used to analyze data that is collected over time, such as
stock prices, weather patterns, and economic indicators.
It's important to choose the right analysis technique for your data and research questions, as using the
wrong technique can lead to incorrect results and invalid conclusions.

3.7 Market Survey


Market survey is a method used by businesses, organizations, and market researchers to gather information
about a target market, competition, and consumer preferences. It is typically used to inform
decision-making, such as product development, marketing strategies, and market positioning.
Market surveys can be conducted in various forms, including online surveys, phone surveys, in-person
surveys, and mail surveys. The choice of survey method depends on the target audience, the information
being sought, and the budget and resources available.
The steps involved in conducting a market survey typically include:
• Define the research objectives and questions.
• Determine the target population and sample size.
• Choose the survey method (e.g., online, phone, in-person, or mail survey).
• Develop the survey questionnaire.
• Administer the survey.
• Analyze the data.
• Draw conclusions and make recommendations based on the results.

Market surveys can provide
valuable information for businesses and organizations, such as insights into consumer preferences and
behaviors, market size and growth, and competitive landscape. It is important to ensure that the survey is
designed and conducted in a rigorous and scientific manner to ensure the validity and reliability of the
results.
Simply put, a market survey is the survey research and analysis of the market for a particular
product/service, which includes the investigation into customer inclinations and a study of various customer
capabilities such as investment attributes and buying potential. Market surveys are tools to directly collect
feedback from the target audience to understand their characteristics, expectations, and requirements.
Marketers develop new and exciting strategies for upcoming products/services but there can be no
assurance about the success of these strategies. For these to be successful, marketers should determine the
category and features of products/services that the target audiences will readily accept. By doing so, the
success of a new avenue can be assured.
Most marketing managers depend on market surveys to collect information that would catalyze the market
research process. Also, the feedback received from these surveys can be contributory in product marketing
and feature enhancement.
Market surveys collect data about a target market such as pricing trends, customer requirements,
competitor analysis, and other such details.


3.7.1 Purpose of Market Survey


• Gain critical customer feedback: The main purpose of the market survey is to offer marketing
and business managers a platform to obtain critical information about their consumers so that
existing customers can be retained and new ones can be got onboard.
• Understand customer inclination towards purchasing products: This covers details such as whether the
customers will spend a certain amount of money on the products/services, inclination levels
among customers about upcoming features or products, what their thoughts about the
competitor products are, etc.
• Enhance existing products and services: A market survey can also be implemented with the
purpose of improving existing products, analyze customer satisfaction levels along with getting
data about their perception of the market and build a buyer persona using information from existing
clientele database.
• Make well-informed business decisions: Data gathered using market surveys is instrumental in
making major changes in the business which reduces the degree of risks involved in taking
important business decisions.
Market Survey Sources
• Product Surveys: New products/concept testing survey templates offer questions to obtain insights
about products and concepts. These survey questions are curated by market research experts and
can help in analyzing which kind of products or features will work in a market.
• Conference Feedback Surveys: Conference feedback survey templates provide questions that can
be asked to participants of a conference. An organization can organize better conferences by
implementing feedback received from these surveys such as enhancing overall conference
management, improved IT infrastructure, better content coverage or other such factors.
• Focus Group Surveys: Focus group survey templates can be implemented during and after the
recruitment of the focus group. Gaining insights from a dedicated group of 8-10 people can be
done easily with this existing survey template.
• Hardware And Software Surveys: Hardware and software survey templates offer editable
questions about software product evaluation, hardware product evaluation, pre-installation
procedure, technical documentation quality and other such factors.
• Website Surveys: Website survey templates are customizable as per application and consist of
questions pertaining to website customer feedback, visitor profile information, online retail
information etc.

3.7.2 Types of Market Survey


• Multiple types of market surveys are used by enterprises to collect data depending on the objective
of their market research. The information collected can be used to study various aspects of the
market to address topics such as the right time to launch the product/service, to understand the
trends in the market, to measure customer loyalty, to study their competitors and many more.
• There are various types of market surveys out of which we will talk about the top 10 to get
information from customers about their demands, expectations and what they opine about the
competitors. Each one of these market surveys has a different approach and has a marking impact
on the various aspects of a business.
• In order to conduct various types of market surveys, successful enterprises in today’s world, use
powerful market research survey software to get actionable market insights through real-time data
collection and robust analytics. Following are the top 10 types of market surveys that are conducted
by successful enterprises.
1. Market Surveys for segmentation: An organization can spot existing and prospective customers
and understand why the customers have chosen their products/services and the prospects have not yet
made a purchase. This can lead to a structured market segmentation and analysis.


2. Market Surveys for exploring various aspects of the target market: Get information about
factors such as market size, demographic information such as age, gender, family income etc. to lay
out a roadmap by considering growth rate of the market, positioning, and average market share.
3. Market Surveys to probe into the purchase procedure: How does a customer decide on making a
purchase? What are the factors that convert product awareness into sales? This type of market survey
will unveil awareness, information, free trial, purchase, and repeat.
4. Market Surveys to establish buyer persona: These surveys are to build a buyer persona by
knowing about customer preferences, inclination, and capabilities of purchasing a product.
5. Market Surveys to measure customer loyalty: What is the degree of loyalty that the customers
have towards an organization? The answer to this question can be obtained by conducting a market
survey.
6. Market Surveys to analyze a new feature or concept: It is essential for an organization to include
market-compliant features and concepts. Carrying out a market survey to understand which features
to launch helps all the teams involved in the feature development process to work with proper
research.
7. Market Surveys for competitor analysis: Healthy competition is always good for an
organization's progress. Market surveys done with the motive of competitor analysis will produce
results about how the target market weighs the organization's products/services in comparison to
the others in the market.
8. Market Surveys to understand the impact of sales activities: Sales activities are the backbone
of an organization and it becomes crucial to keep track of these activities. Market surveys for sales
activities will produce a report of the impact of sales activities, whether their frequency needs to
increase or any changes the audiences think should be inculcated in the sales process.
9. Market Surveys to assess prices for new products/services: Affordability of products is also an
aspect that drives the market for organizations. Such surveys explore price ranges, product variants
to cater to multiple price ranges, target customers for each of the products, etc.
10. Market Surveys for evaluation of customer service: Good customer service can lead to
enhanced satisfaction levels among customers. Such surveys cover factors such as time taken to
resolve issues, the scope of improvement, best practices of customer service, etc.

3.7.3 Importance of Market Survey


There are various methods of conducting market surveys, including online surveys, telephone surveys,
face-to-face interviews, and focus groups. The method chosen will depend on the target audience, the type
of information being sought, and the budget available for the survey.
When conducting a market survey, it's important to use a well-designed questionnaire that is clear, concise,
and easy to understand. The questionnaire should be designed in such a way as to minimize bias and to
ensure that the data collected is accurate and reliable.
Once the survey data has been collected, it's important to analyze it appropriately using descriptive and
inferential statistical techniques, depending on the research questions and objectives. This will help to
identify trends and patterns in the data, and to draw conclusions about the market and consumer behavior.
Overall, market surveys are a valuable tool for businesses looking to better understand their customers
and make informed decisions about their products, services, and marketing strategies.
There are five factors that illustrate the importance of a market survey:
1. Understanding the demand and supply chain of the target market: A product is most likely to
be successful if it is developed by keeping in mind the demand and supply of the target market. This
way, marketers can obtain insights about market capabilities to absorb new products and concepts to
develop customer-centric products and features.
2. Developing well-thought marketing plans: The World is a target market for an organization,
especially a well-established one. Getting data from the target market through thorough market research
using market surveys and segmentation can be a source of creating concrete and long-term marketing
plans.
3. Figure out customer expectations and needs: All marketing activities revolve around customer
acquisition. All small and large organizations require market surveys to gather feedback from their
target audience regularly, using customer satisfaction tools such as Net Promoter Score, Customer
Effort Score, Customer Satisfaction Score (CSAT) etc. Organizations can analyze customer feedback
to measure customer experience, satisfaction, expectations etc.
4. Accurate launch of new products: Market surveys are influential in understanding where to test
new products or services. Market surveys provide marketers a platform to analyze the scope of success
of upcoming products and make changes in strategizing the product according to the feedback they
receive.
5. Obtain information about customer demographics: Customer demographics form the core of any
business and market surveys can be used to obtain intricate and sensitive details about customer
demographics such as race, ethnicity or family income.

Exercise
I. Write down short answers for the following:
12. What is data management plan. (pg. 95)
13. List out challenges in data management. (pg. 97 & 98)
14. List out measurement and its functions. (pg. 103)
15. Describe Likert's scale. (pg. 109)
16. Define descriptive statistics (pg.)
17. What are database application? (pg. 110)
18. What is cross tabulation? (pg.115)
19. What are Quantitative and qualitative analysis? (pg.113)
20. Define hypothesis. (pg.119)
21. Write down types of hypothesis. (pg.119)

II. Provide Detailed Answers:


8. Elaborate the components of data management plan. (pg. 95
9. Describe measurement process. (pg.106)
10. Explain different types of measurement. (pg. 104)
11. Describe different types of scales used in measurement. (pg.106)
12. Compare internet and database.
13. List out popular database applications used in business. (pg.111)
14. Elaborate data collection for quantitative and qualitative data.(pg. 114)
15. Elaborate type I and II errors. (pg.124)


UNIT-IV DATA ANALYTICS 12

Introduction to Data analytics- Types of Data analytics- Data visualization for decision making- Graphical techniques,
skewness, kurtosis, formatting data- different operations using chart, pivot chart and formatting plot area-Data wrangling -
Business Problem Solving across different domains- Dash boarding Fundamentals

Learning Objectives
• To learn about Types of data analytics
• To understand data visualization
• To understand data wrangling

Learning Outcomes
At the end of the unit, students will be able to:
• Apply different operations of data formatting
• Apply dashboarding fundamentals
• Solve business problems in various fields with data analytics

DETAILED SESSION PLAN (TOPIC WISE)

S.No | Title of Topic | Mode of Teaching (PPT/Seminar/Chalk & Board etc.) | Textbook/Reference Book | Link (if applicable) on Springboard/Coursera/NPTEL | Assessment Tool (Quiz/Puzzle/Assignment/Seminar etc.)
1 | Introduction to data analytics and visualization | Chalk and board/PPT | William G Zikmund, Barry J Babin, Jon C. Carr, Atanu Adhikari, Mitch Griffin, Business Research Methods, A South Asian Perspective, 8th Edition, Cengage Learning, New Delhi, 2012. | https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=26&lesson=31 ; https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=26&lesson=30 | Quiz
2 | Data formatting | Chalk and board/PPT | James R. Evans, "Business Analytics - Methods, Models and Decisions", Pearson Ed, 2012. | https://www.youtube.com/watch?v=1LgkR1R1ACU |
3 | Problem solving | Chalk and board/PPT | Marc J. Schniederjans, Dara G. Schniederjans and Christopher M. Starkey, "Business Analytics Principles, Concepts, and Applications - What, Why, and How", Pearson Ed, 2014 | https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=17&lesson=22 |


4. DATA ANALYTICS
4.1 Introduction to Data analytics

Data analytics is the process of inspecting, cleaning, transforming, and modeling data with the
goal of discovering useful information, drawing conclusions, and supporting decision-making. It involves
the use of statistical and computational methods to extract insights and knowledge from data. The field of
data analytics has grown significantly in recent years with the explosion of data being generated in various
industries, such as healthcare, finance, and e-commerce. It enables organizations to make data-driven
decisions by transforming data into actionable insights. There are many software tools available for data
analytics, such as R, Python, SQL, SAS, and Tableau. The choice of tool depends on the type of data being
analyzed and the specific requirements of the project. Data analytics plays a critical role in making
informed decisions and solving complex problems by converting data into meaningful insights.

Data Analytics can be used to improve various aspects of business, including marketing, operations,
finance, and human resources. For example, a company can use data analytics to optimize its pricing
strategy, understand consumer behavior, improve supply chain management, or identify areas for cost
savings.

The data analytics process typically involves the following steps:

• Data Collection: The first step is to gather the data needed for analysis. This can come from various
sources, such as databases, spreadsheets, or external sources like social media or surveys.
• Data Cleaning: The next step is to clean and preprocess the data to ensure that it is accurate,
consistent, and in a format that can be easily analyzed. This involves removing missing values,
dealing with outliers, and correcting errors.
• Data Exploration: In this step, the data is explored and visualized to get a better understanding of
its structure and characteristics. This can involve creating histograms, scatter plots, and other types
of visualizations to identify patterns and relationships.
• Data Modeling: In this step, statistical and machine learning models are applied to the data to
identify patterns and make predictions. These models can be used to perform regression analysis,
clustering, classification, and more.
• Data Interpretation: The final step is to interpret the results of the analysis and draw meaningful
conclusions. This can involve presenting the findings in a clear and concise manner, making
recommendations, and communicating the results to relevant stakeholders.
Overall, data analytics plays a crucial role in today's data-driven world, helping organizations to make
better decisions and improve their operations.
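To make these five steps concrete, here is a minimal end-to-end sketch in Python using pandas and
scikit-learn (one of several tool choices mentioned above). The file name sales.csv and the column
names are hypothetical placeholders, not a prescribed dataset:

import pandas as pd
from sklearn.linear_model import LinearRegression

# Data Collection: load raw data (hypothetical file and columns)
df = pd.read_csv("sales.csv")  # assumed columns: ad_spend, price, units_sold

# Data Cleaning: drop rows with missing values
df = df.dropna()

# Data Exploration: summary statistics and a quick pattern check
print(df.describe())
print(df.corr())

# Data Modeling: fit a simple regression model
model = LinearRegression()
model.fit(df[["ad_spend", "price"]], df["units_sold"])

# Data Interpretation: inspect the fitted coefficients to draw conclusions
print(model.coef_, model.intercept_)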


4.1.1 Types of Data Analytics


There are several types of data analytics, each with its own approach and focus. Some of the most
common types are:

• Descriptive Analytics: This type of analytics focuses on summarizing and describing the
characteristics of data, such as central tendencies and dispersion. Descriptive analytics can be used
to answer questions such as "What happened?" and "What is happening?"
• Diagnostic Analytics: This type of analytics goes beyond descriptive analytics and is used to
understand the reasons behind events. For example, it can be used to identify the root cause of a
problem or to determine why a particular trend is occurring.
• Predictive Analytics: This type of analytics uses historical data and statistical models to make
predictions about future events. Predictive analytics can be used to answer questions such as "What
will happen?" and "What is likely to happen?"
• Prescriptive Analytics: This type of analytics goes beyond predictive analytics and provides
guidance and recommendations for decision-making. It can be used to answer questions such as
"What should we do?" and "What actions should we take?"
• Big Data Analytics: This type of analytics focuses on analyzing large and complex datasets, often
using distributed processing systems such as Hadoop. Big data analytics can be used to uncover
insights that might not be apparent from smaller datasets.
• Text Analytics: This type of analytics focuses on analyzing unstructured text data, such as
customer reviews or social media posts. Text analytics can be used to extract sentiment, topics,
and key terms from text data.
• Visual Analytics: This type of analytics focuses on the visual representation of data and the use of
interactive visualizations to facilitate data exploration and discovery. Visual analytics can be used
to uncover patterns and relationships that might not be immediately apparent from tabular data.
These are just a few examples of the types of data analytics that exist. The specific approach and
techniques used will depend on the type of data being analyzed, the questions being asked, and the goals
of the analysis.

4.1.1.1 Descriptive Statistics


Descriptive statistics is a branch of statistics that focuses on summarizing, describing, and representing
data. It is concerned with collecting, organizing, and presenting data in a way that makes it easy to
understand and interpret.
The main goal of descriptive statistics is to provide a simple and intuitive summary of the main features
of a dataset. Some of the most common descriptive statistics include measures of central tendency (such
as mean, median, and mode), measures of dispersion (such as range, variance, and standard deviation),
and measures of shape (such as skewness and kurtosis).
Descriptive statistics can be used in a variety of applications, including quality control, marketing
research, and medical research. For example, in quality control, descriptive statistics can be used to
monitor production processes and ensure that products are being produced within specified limits. In
marketing research, descriptive statistics can be used to summarize customer data and understand
consumer behavior.
Descriptive statistics can be calculated using various software tools, including spreadsheets,
statistical packages, and data visualization tools. The results of descriptive statistical analysis can be
presented in various forms, such as tables, graphs, and charts, depending on the type of data and the goals
of the analysis.
Data analysts can use descriptive statistics to summarize more or less any type of data, although it
helps to think of it as the first step in a more protracted process. That’s because while descriptive statistics
may describe trends or patterns, it won’t dig deeper. For this, we need tools
like diagnostic and predictive analytics. Nevertheless, descriptive analytics is exceptionally useful for
introducing yourself to unknown data.
The following kinds of data can all be summarized using descriptive analytics:
• Financial statements
• Surveys
• Social media engagement
• Website traffic
• Scientific findings
• Weather reports
• Traffic data
Essentially, any data set can be summarized in one way or another, meaning descriptive analytics has
an almost endless number of applications. We'll explore some of these in more depth in the use cases
below. First, let's look at some of the benefits and drawbacks of descriptive analytics.
There are several types of descriptive statistics, including:
• Measures of Central Tendency: These are statistics that summarize the "typical" or "average" value of
a dataset. The most common measures of central tendency are the mean (average), median (middle
value), and mode (most frequently occurring value).
• Measures of Dispersion: These are statistics that describe how spread out the values in a dataset are.
The most common measures of dispersion are the range (difference between the largest and smallest
values), variance (a measure of how far the values in a dataset are from the mean), and standard
deviation (a measure of the average deviation of the values from the mean).
• Measures of Shape: These are statistics that describe the shape of the distribution of the data. The most
common measures of shape are skewness (a measure of the asymmetry of a distribution) and kurtosis
(a measure of the peakedness of a distribution).
• Percentiles and Quartiles: These are statistics that divide a dataset into equal parts and describe the
values that correspond to specific portions of the data. For example, the median is the 50th percentile,
and quartiles divide the data into four equal parts.
• Frequency Distributions: These are tables or graphs that summarize the number of occurrences of each
value in a dataset. Frequency distributions can be used to identify patterns and relationships in the
data, and are often used in conjunction with other descriptive statistics.
• Box Plots: These are graphical representations of the distribution of a dataset that provide a quick
summary of the distribution's shape, central tendency, and dispersion. Box plots are particularly useful
for comparing multiple datasets or for identifying outliers in a dataset.
• Histograms: These are graphs that represent the distribution of a dataset by dividing the data into
intervals and counting the number of occurrences in each interval. Histograms are useful for
visualizing the distribution of a dataset and identifying patterns and relationships in the data.
These are some of the most common types of descriptive statistics. The specific type of descriptive statistic
used will depend on the type of data being analyzed, the goals of the analysis, and the questions being
asked.
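As an illustration, most of these measures can be computed in a few lines of Python with pandas and
scipy; the sample values below are made up purely for demonstration:

import pandas as pd
from scipy import stats

data = pd.Series([12, 15, 15, 18, 22, 25, 31, 40])  # made-up sample

# Measures of central tendency
print(data.mean(), data.median(), data.mode().tolist())

# Measures of dispersion: range, variance, standard deviation
print(data.max() - data.min(), data.var(), data.std())

# Measures of shape: skewness and (excess) kurtosis
print(stats.skew(data), stats.kurtosis(data))

# Quartiles (the 0.5 quantile is the median, i.e. the 50th percentile)
print(data.quantile([0.25, 0.5, 0.75]).tolist())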

Advantages of descriptive analytics


Although relatively simplistic as analytical approaches go, descriptive analytics nevertheless has many
advantages. Descriptive analytics:
• Presents otherwise complex data in an easily digestible format.
• Provides a direct measure of the incidence of key data points.
• Is inexpensive and only requires basic mathematical skills to carry out.
• Is faster to carry out, especially with help from tools like Python or MS Excel.
• Relies on data that organizations already have access to, meaning there’s no need to source
additional data.
• Looks at a complete population (rather than data sampling), making it considerably more accurate
than inferential statistics.
But, of course, being so straightforward means descriptive analytics also has its limitations. Let’s explore
some of these next.
Disadvantages of descriptive analytics
Okay, we’ve looked at the strengths of descriptive analytics—but where does it fall short? Some
disadvantages of descriptive analytics include:
• You can summarize data sets you have access to, but these may not tell a complete story.
• You cannot use descriptive analytics to test a hypothesis or understand why data present the way
they do.
• You cannot use descriptive analytics to predict what may happen in the future.
• You cannot generalize your findings to a broader population.
• Descriptive analytics tells you nothing about the data collection methodology, meaning the data
set may include errors.
As you may suspect, although descriptive analytics are useful, it’s important not to overstretch their
capabilities. Fortunately, we have diagnostic and predictive analytics to help fill in the gaps where
descriptive analytics falls short.
Descriptive analytics use cases

Tracking social media engagement


Social media is a key touch point along the sales journey. The ability to measure and present engagement
metrics across a complex constellation of campaigns and social networks is, therefore, vital for
determining the most successful approaches to digital marketing. Fortunately, marketing reports on social
media engagement will include descriptive analytics by default. Clicks, likes, shares, detail expands,
bounce rates, and so on are all measures of social media engagement that can be easily summarized using
descriptive techniques.
For instance, perhaps a company is interested in knowing which social media account is driving the most
traffic to their website. Using descriptive statistics, visualizations, and dashboards, they can easily
compare information about different channels. Similarly, marketing teams can look at specific shareable
content, perhaps comparing videos with blog posts, to see which results in the most clicks.
While none of this information draws direct conclusions (in that it doesn’t measure cause and effect) it’s
still valuable. It helps teams to devise hypotheses or make informed guesses about where to invest their
time and budget.
Streaming and e-commerce
Subscription streaming services like Spotify and Netflix, and e-commerce sites like Amazon and eBay all
use descriptive analytics to identify trends. Descriptive measures help determine what’s currently most
popular with users and buyers. Spotify, for example, uses descriptive analytics to learn which albums or
artists subscribers are listening to. Meanwhile, Amazon uses descriptive analytics to compare customer
purchases. In both cases, these insights inform their recommendation engines.
Netflix, meanwhile, takes this use of descriptive analytics even further. A highly data-driven
company, Netflix uses descriptive analytics to see what genres and TV shows interest their subscribers
most. These insights inform decision-making in areas from new content creation to marketing campaigns,
and even which production companies they work with.

Learning management systems


From traditional education to corporate training, many organizations and schools now use online/offline
hybrid learning. Learning management systems (or LMSs for those in the know!) are a ubiquitous part of
this. LMS platforms track everything from user participation and attendance to test scores, and—in the
case of e-learning courses—even how long it takes learners to complete. Summarizing this information,
descriptive-analytical reports offer a high-level overview of what’s working and what’s not.


Using these data, teachers and training providers can track both individual and organization-level targets.
They can analyze grade curves, or see which teaching resources are most popular. And while they won’t
necessarily know why, it may be possible to infer from the data that videos, for example, are more popular
than, say, written documents. Presenting this information is the first step towards improving course design
and creating better learner outcomes.

4.1.1.2 Diagnostic Analytics


Diagnostic analytics is a type of data analytics that focuses on understanding the reasons behind
events or trends. It goes beyond descriptive analytics, which provides a simple summary of the data, and
is used to identify the root cause of a problem or to determine why a particular trend is occurring.
Diagnostic analytics involves the use of techniques such as data mining, regression analysis, and statistical
process control to analyze data and identify patterns and relationships. The goal is to identify the factors
that are driving a particular outcome, such as a drop in sales or an increase in customer complaints.
Diagnostic analytics can be used in a variety of applications, including healthcare, finance, and
marketing. For example, in healthcare, diagnostic analytics can be used to identify the factors that
contribute to disease outbreaks or to improve patient outcomes. In finance, it can be used to identify the
root cause of financial irregularities or to improve investment performance.
Diagnostic analytics typically involves the use of advanced data analysis tools, including statistical
software, data visualization tools, and machine learning algorithms. The results of diagnostic analytics
can be presented in various forms, such as tables, graphs, and charts, depending on the type of data and
the goals of the analysis.
Overall, diagnostic analytics is a powerful tool for understanding the underlying causes of events and
trends, and provides crucial information for making informed decisions and taking effective actions.


Functions of Diagnostic Analytics:


Diagnostic analytics uses a variety of techniques to provide insights into the causes of trends. These
include:
• Data drilling: Drilling down into a dataset can reveal more detailed information about which aspects
of the data are driving the observed trends. For example, analysts may drill down into national sales
data to determine whether specific regions, customers or retail channels are responsible for increased
sales growth.
• Data mining hunts through large volumes of data to find patterns and associations within the data.
For example, data mining might reveal the most common factors associated with a rise in insurance
claims. Data mining can be conducted manually or automatically with machine learning technology.
• Correlation analysis examines how strongly different variables are linked to each other. For example,
sales of ice cream and refrigerated soda may soar on hot days; a minimal sketch of computing such a
correlation follows this list.
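As referenced above, here is a minimal sketch of correlation analysis in Python with pandas; the daily
observations are invented for illustration:

import pandas as pd

# Hypothetical daily observations
df = pd.DataFrame({
    "temperature_c":   [18, 22, 25, 29, 33, 35],
    "ice_cream_sales": [200, 260, 310, 400, 480, 520],
    "soda_sales":      [150, 190, 240, 300, 370, 410],
})

# Pearson correlation matrix: values near +1 suggest strongly linked variables
print(df.corr())

Note that, as the next section stresses, a strong correlation alone does not establish that one variable
causes the other.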

Three Diagnostic Analytics Categories


The diagnostic analytics process of determining the root cause of a problem or trend typically comprises
three primary stages.
1. Identify anomalies: Trends or anomalies highlighted by descriptive analysis may require
diagnostic analytics if the cause isn’t immediately obvious. In addition, it can sometimes be
difficult to determine whether the results of descriptive analysis really show a new trend, especially
if there’s a lot of natural variability in the data. In those cases, statistical analysis can help to
determine whether the results actually represent a departure from the norm.
2. Discovery: The next step is to look for data that explains the anomalies: data discovery. That may
involve gathering external data as well as drilling into internal data. For example, searching
external data might reveal changes in supply chains, new regulatory requirements, a shifting
competitive landscape or weather patterns that are associated with the anomalous data.
3. Causal relationships: Further investigation can provide insights into whether the associations in
the data point to the true cause of the anomaly. The fact that two events correlate doesn’t
necessarily mean one causes the other. Deeper examination of the data associated with the sales
increase can indicate which factor or factors were the most likely cause.

Use Cases of Diagnostic Analytics


Every department within an organization can benefit from analyzing root causes of events in order to
improve its processes and outcomes. For example, companies can use diagnostic analytics to investigate
the cause of:
• A sudden decline in revenue or increase in expenses
• The popularity of a product or service
• An increase in employee turnover
• Bottlenecks in production or distribution


Diagnostic Analytics Examples


Diagnostic analytics can be helpful in any industry, from manufacturing and retail to health care. After
applying diagnostic analytics to discover why an event occurred, companies can use that knowledge
to create solutions and develop predictive models for the future.

Health care:

Diagnostic analytics can support many areas of health care, including the core function of diagnosing
medical problems. For example, descriptive analytics can answer questions like, how many patients were
admitted to the hospital last month? And how many returned within 30 days? After all, in some cases
reimbursement may be dependent in part on readmittance rates. Descriptive analytics can quantify events
and highlight things like what hospital resources are being used and even model the rate of disease
diagnosis. By comparing that data with historical trends, anomalies can be detected, and then the discovery
work trying to find causal relationships can begin. For example, did high readmittance rates coincide with
a change in rounding policy, or with the time of day patients are sent home? Regardless, the first step
toward prescriptive analytics that solves such issues is to uncover the anomalies.

Retail:

A store that sells eco-friendly products noticed a recent surge in revenue from one state. During discovery,
the company learned that the surge was driven by a leap in sales of a single product — a canvas tote bag.
Research revealed the causal relationship: the state’s governor had signed a law making plastic shopping
bags illegal, causing sales of reusable bags to soar.

Manufacturing:

A contract manufacturer found that a valuable type of machine started experiencing intermittent failures.
By using diagnostic analytics to examine the machines’ logs, the company discovered that routine
software updates had been installed the previous day. It identified the update as a likely cause of failure.
It verified the cause by uninstalling the software, which eliminated the problem.

Human resources:

A company’s annual hiring report showed that one department hired more people than any other
department — but there was no net increase in the department’s staff because it was losing people as fast
as it hired them. Drilling down into the data revealed that many of the positions were for a specific team,
which paid its staff less than the industry average. The company used the information to examine pay
scales, interview employees and take other measures to improve retention.

Advantages of Diagnostic Analytics:


• Root Cause Analysis: The main advantage of diagnostic analytics is that it helps identify the root cause
of a problem or trend, which can lead to more effective solutions and decision-making.
• Improved Problem-Solving: By understanding the underlying causes of events and trends, organizations
can make better-informed decisions and take more effective actions to address problems.
• Better Decision-Making: Diagnostic analytics provides valuable insights into the data, which can help
organizations make more informed and effective decisions.
• Increased Understanding: Diagnostic analytics helps organizations gain a deeper understanding of their
data and the factors that are driving outcomes, which can lead to improved performance and increased
efficiency.
• Cost Savings: By identifying and addressing problems early on, diagnostic analytics can help
organizations avoid costly mistakes and improve overall performance.

Disadvantages of Diagnostic Analytics:


• Complexity: Diagnostic analytics can be complex and time-consuming, requiring specialized skills and
knowledge.
• Data Quality: The quality of the data used for diagnostic analytics is critical; poor-quality data can
lead to incorrect conclusions and ineffective decision-making.
• Cost: Implementing and using diagnostic analytics can be expensive, requiring significant investments
in technology and personnel.
• Limited Usefulness: In some cases, the results of diagnostic analytics may not provide actionable
insights and may not be useful for decision-making.
• Bias: There is a risk of introducing bias into the analysis, which can lead to incorrect conclusions and
ineffective decision-making.

4.1.1.3 Predictive Analysis


The term predictive analytics refers to the use of statistics and modeling techniques to make predictions
about future outcomes and performance. Predictive analytics looks at current and historical data patterns
to determine if those patterns are likely to emerge again. This allows businesses and investors to adjust
where they use their resources to take advantage of possible future events. Predictive analysis can also be
used to improve operational efficiencies and reduce risk.

Types of Predictive Analytical Models


There are three common techniques used in predictive analytics: Decision trees, neural networks, and
regression. Read more about each of these below.
Decision Trees
If you want to understand what leads to someone's decisions, then you may find decision trees useful.
This type of model places data into different sections based on certain variables, such as price or market
capitalization. Just as the name implies, it looks like a tree with individual branches and leaves. Branches
indicate the choices available while individual leaves represent a particular decision.
Decision trees are the simplest models because they're easy to understand and dissect. They're also very
useful when you need to make a decision in a short period of time.
Regression
This is the model that is used the most in statistical analysis. Use it when you want to determine patterns
in large sets of data and when there's a linear relationship between the inputs. This method works by
figuring out a formula, which represents the relationship between all the inputs found in the dataset. For
example, you can use regression to figure out how price and other key factors can shape the performance
of a security.
Neural Networks
Neural networks were developed as a form of predictive analytics by imitating the way the human brain
works. This model can deal with complex data relationships using artificial intelligence and pattern
recognition. Use it if you have several hurdles that you need to overcome like when you have too much
data on hand, when you don't have the formula you need to help you find a relationship between the
inputs and outputs in your dataset, or when you need to make predictions rather than come up with
explanations.
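As a concrete illustration of the regression technique described above, here is a minimal sketch in
Python with scikit-learn; the spend and revenue figures are invented:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical history: advertising spend vs. quarterly revenue (both in lakhs)
X = np.array([[2.0], [3.5], [5.0], [6.5], [8.0]])  # input: ad spend
y = np.array([11.0, 14.2, 18.1, 21.9, 25.5])       # output: revenue

model = LinearRegression().fit(X, y)

# The fitted formula (slope and intercept) represents the input-output relationship
print(model.coef_, model.intercept_)

# Predict the revenue for a planned ad spend of 7.0 lakhs
print(model.predict(np.array([[7.0]])))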

Advantages of Predictive Analytics


There are numerous benefits to using predictive analysis. As mentioned above, this type of analysis can
help entities make predictions about outcomes when no other obvious answers are available.
Investors, financial professionals, and business leaders are able to use models to help reduce risk. For
instance, an investor and their advisor can use certain models to help craft an investment portfolio with
minimal risk to the investor by taking certain factors into consideration, such as age, capital, and goals.
There is a significant impact to cost reduction when models are used. Businesses can determine the
likelihood of success or failure of a product before it launches. Or they can set aside capital for production
improvements by using predictive techniques before the manufacturing process begins.

Uses of Predictive Analytics in Business


Predictive analytics is a decision-making tool in a variety of industries.
Forecasting
Forecasting is essential in manufacturing because it ensures the optimal utilization of resources in
a supply chain. Critical spokes of the supply chain wheel, whether it is inventory management or the
shop floor, require accurate forecasts for functioning.
Predictive modeling is often used to clean and optimize the quality of data used for such forecasts.
Modeling ensures that more data can be ingested by the system, including from customer-facing
operations, to ensure a more accurate forecast.
Credit
Credit scoring makes extensive use of predictive analytics. When a consumer or business applies for
credit, data on the applicant's credit history and the credit record of borrowers with similar characteristics
are used to predict the risk that the applicant might fail to perform on any credit extended.
Underwriting
Data and predictive analytics play an important role in underwriting. Insurance companies examine
policy applicants to determine the likelihood of having to pay out for a future claim based on the current
risk pool of similar policyholders, as well as past events that have resulted in payouts. Predictive models
that consider characteristics in comparison to data about past policyholders and claims are routinely used
by actuaries.
Marketing
Individuals who work in this field look at how consumers have reacted to the overall economy when
planning a new campaign. They can use these shifts in demographics to determine if the current mix of
products will entice consumers to make a purchase.
Active traders, meanwhile, look at a variety of metrics based on past events when deciding whether to
buy or sell a security. Moving averages, bands, and breakpoints are based on historical data and are used
to forecast future price movements.
Fraud Detection
Financial services can use predictive analytics to examine transactions, trends, and patterns. If any of this
activity appears irregular, an institution can investigate it for fraudulent activity. This may be done by
analyzing activity between bank accounts or analyzing when certain transactions occur.
Supply Chain
Supply chain analytics is used to predict and manage inventory levels and pricing strategies. Supply chain
predictive analytics use historical data and statistical models to forecast future supply chain performance,
demand, and potential disruptions. This helps businesses proactively identify and address risks, optimize
resources and processes, and improve decision-making. These steps allow companies to forecast what
materials will be on hand at any given moment and whether there will be any shortages.
Human Resources
Human resources uses predictive analytics to improve various processes, such as forecasting future
workforce needs and skills requirements or analyzing employee data to identify factors that contribute to
high turnover rates. Predictive analytics can also analyze an employee's performance, skills, and
preferences to predict their career progression and help with career development planning in addition to
forecasting diversity or inclusion initiatives.

4.1.1.4 Prescriptive Analysis


Prescriptive analytics is a branch of analytics that focuses on determining the best course of action to take
in a given situation. It uses a combination of mathematical models, algorithms, and decision science
techniques to provide specific recommendations for decision-making.
Prescriptive analytics goes beyond descriptive and predictive analytics by not only analyzing what has
happened in the past and what is likely to happen in the future, but also providing specific
recommendations for action. It incorporates optimization, simulation, and other advanced analytics
techniques to identify the most optimal decision given a set of constraints and objectives.
Examples of applications of prescriptive analytics include supply chain optimization, resource allocation,
predictive maintenance, and financial portfolio optimization.
Overall, prescriptive analytics can help organizations make informed decisions, improve efficiency and
effectiveness, and ultimately achieve their goals.
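To illustrate the optimization component, here is a minimal resource-allocation sketch in Python using
scipy.optimize.linprog. The products, profits, and capacity constraints are invented; real prescriptive
systems embed far richer models:

from scipy.optimize import linprog

# Hypothetical problem: two products with per-unit profits of 40 and 30.
# Machine hours: 2x + 1y <= 100; labour hours: 1x + 2y <= 80.
# linprog minimizes, so profits are negated to maximize them.
result = linprog(
    c=[-40, -30],
    A_ub=[[2, 1], [1, 2]],
    b_ub=[100, 80],
    bounds=[(0, None), (0, None)],
)

print(result.x)     # recommended production quantities (here 40 and 20)
print(-result.fun)  # maximum achievable profit (here 2200)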

Advantages and Disadvantages of Prescriptive Analytics


Advantages
Prescriptive analytics can cut through the clutter of immediate uncertainty and changing conditions. It
can help prevent fraud, limit risk, increase efficiency, meet business goals, and create more loyal
customers. When used effectively, it can help organizations make decisions based on highly analyzed
facts rather than jump to under-informed conclusions based on instinct.
Prescriptive analytics can simulate various outcomes and show the probability of each, helping
organizations to understand the level of risk and uncertainty they face better than they could by relying
on averages. Organizations that use it can gain a better understanding of the likelihood of worst-case
scenarios and plan accordingly.
Disadvantages
But prescriptive analytics is not foolproof. It is only effective if organizations know what questions to
ask and how to react to the answers, and only if its inputs are valid. If the input assumptions are invalid,
the output results will not be accurate.
This form of data analytics is best suited to short-term decisions; businesses shouldn't rely on
prescriptive analytics for long-term ones, because its reliability decreases over longer time horizons.
Not all prescriptive analytics providers are made the same, so it's important for businesses to carefully
consider the technology and who provides it. Some may provide real, concrete results while others make
the promise of big data and fail to deliver.

Examples of prescriptive analytics


Businesses use prescriptive analytics to solve all sorts of real-world problems. Analysts in different
industries can use it to improve their processes:
Marketing and sales
Marketing and sales agencies have access to large amounts of customer data that can help them to
determine optimal marketing strategies, such as what types of products pair well together and how to price
products. Prescriptive analytics allows marketers and sales staff to become more precise with their
campaigns and customer outreach, as they no longer have to act simply on intuition and experience.
Transportation industry
Cost-effective delivery is essential for success and profitability in the package delivery and transportation
industry. Minimizing energy usage through better route planning and solving logistical issues such as
incorrect shipping locations can save time and money.


Shippers produce massive amounts of data. Rather than employing armies of analysts and dispatchers to
decide how to best operate, these businesses can automate and build prescriptive models to provide
recommendations.
Financial markets
Quantitative researchers and traders use statistical modeling to try to maximize returns. Financial firms
can use similar techniques to manage risk and profitability.
For example, financial firms can build algorithms to churn through historical trading data to measure risks
of trades. The resulting analytics can help them decide how to size positions, how to hedge them, or
whether to place trades at all.
Additionally, these firms can use models to reduce transaction costs by figuring out how and when to best
place their trades.
Prescriptive Analytics for Hospitals and Clinics
Prescriptive analytics can be used by hospitals and clinics to improve the outcomes for patients. It puts
health care data in context to evaluate the cost-effectiveness of various procedures and treatments and to
evaluate official clinical methods.
It can also be used to analyze which hospital patients have the highest risk of re-admission so that health
care providers can do more, via patient education and doctor follow-up to stave off constant returns to
the hospital or emergency room.
Prescriptive Analytics for Airlines
Suppose you are the chief executive officer (CEO) of an airline and you want to maximize your
company’s profits. Prescriptive analytics can help you do this by automatically adjusting ticket prices
and availability based on numerous factors, including customer demand, weather, and gasoline prices.
When the algorithm identifies that this year’s pre-Christmas ticket sales from Los Angeles to New York
are lagging last year’s, for example, it can automatically lower prices, while making sure not to drop
them too low in light of this year’s higher oil prices.
At the same time, when the algorithm evaluates the higher-than-usual demand for tickets from St. Louis
to Chicago because of icy road conditions, it can raise ticket prices automatically. The CEO doesn’t have
to stare at a computer all day looking at what’s happening with ticket sales and market conditions and
then instruct workers to log into the system and change the prices manually. Instead, a computer program
can do all of this and more—and at a faster pace, too.
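A toy sketch of the kind of pricing rule this passage describes, written in Python; the thresholds and
adjustment factors are invented, and a production system would derive them from predictive models
rather than hard-coding them:

def adjust_ticket_price(price, sales_vs_last_year, demand_index, oil_price, floor):
    # Sales lagging last year: discount to stimulate demand
    if sales_vs_last_year < 1.0:
        price *= 0.95
    # Unusually high demand (e.g. icy roads on a competing route): raise the fare
    if demand_index > 1.2:
        price *= 1.10
    # High oil prices: raise the minimum acceptable fare
    if oil_price > 100:
        floor *= 1.05
    return max(price, floor)  # never drop below the cost floor

# Example: lagging pre-Christmas sales, normal demand, expensive oil
print(adjust_ticket_price(300.0, 0.9, 1.0, 110.0, 250.0))  # -> 285.0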
Prescriptive Analytics in Banking
Banking is one of the industries that can benefit from prescriptive analytics the most. That's because
companies in this sector are always trying to find ways to better serve their customers while ensuring
they remain profitable. Applying prescriptive analytical tools can help the banking sector to:
• Create models for customer relationship management
• Improve ways to cross-sell and upsell products and services
• Recognize weaknesses that may result in losses, such as anti-money laundering (AML)
• Develop key security and regulatory initiatives like compliance reporting
Prescriptive Analytics in Marketing
Just like banking, data analytics is very critical in the marketing sector. Marketers can use prescriptive
analytics to stay ahead of consumer trends. Using past trends and past performance can give internal and
external marketing departments a competitive edge.
By employing prescriptive analytics, marketers can come up with effective campaigns that target specific
customers at specific times, like, say, advertising to a certain demographic during the Super Bowl.
Corporations can also identify how to engage different customers and how to effectively price
and discount their products and services.

4.1.1.5 Big Data Analytics


Big data analytics refers to the process of examining large and complex datasets, also known as big data,
to uncover hidden patterns, correlations, and other useful information. The goal of big data analytics is to
transform raw data into actionable insights that can inform decision-making and drive business value.
Big data is generated by a variety of sources, including social media, e-commerce transactions, machine-
generated data from IoT devices, and more. The sheer volume and variety of big data present significant
challenges for traditional data processing and storage methods. As a result, big data analytics relies on
newer technologies and approaches, such as distributed computing, NoSQL databases, and machine
learning algorithms.
Some common use cases for big data analytics include customer behavior analysis, fraud detection,
predictive maintenance, and market trend analysis. By leveraging the insights provided by big data
analytics, organizations can gain a competitive advantage, improve operations, and make better-informed
decisions.
Overall, big data analytics is a rapidly evolving field that has the potential to revolutionize the way
organizations operate and compete.

Applications of Big Data Analytics


Here are some examples of the applications of big data analytics:
• Customer Acquisition and Retention: Customer information helps tremendously in identifying
marketing trends and, through data-driven actions, increasing customer satisfaction. For example,
personalization engines for Netflix, Amazon, and Spotify help deliver improved customer experiences
and gain customer loyalty.
• Targeted Ads: Personalized data about interaction patterns, order history, and product page
viewing history can help immensely to create targeted ad campaigns for customers on a larger
scale and at the individual level.
• Product Development: It can generate insights on development decisions, product viability,
performance measurements, etc., and direct improvements that positively serve the customers.
• Price Optimization: Pricing models can be modeled and used by retailers with the help of diverse
data sources to maximize revenues.
• Supply Chain and Channel Analytics: Predictive analytical models help with B2B supplier
networks, preemptive replenishment, route optimizations, inventory management, and notification
of potential delays in deliveries.
• Risk Management: It helps in the identification of new risks with the help of data patterns for the
purpose of developing effective risk management strategies.
• Improved Decision-making: The insights that are extracted from the data can help enterprises
make sound and quick decisions.

Implementing Big Data Analytics


Retail
The retail industry is actively deploying big data analytics. It is applying the techniques of data analytics
to understand what the customers are buying and then offering products and services that are tailor-made
for them.
Today, it is all about having an omnichannel experience. Customers may make contact with a brand on
one channel and then finally buy the product(s) through another channel, meanwhile going through more
intermediary channels. The retailers will have to keep track of these customer journeys, and they must
deploy their marketing and advertising campaigns based on that, to improve the chances of increasing
sales and lowering costs.
Technology
Technology companies are heavily deploying big data analytics. They are finding out more about how
customers interact with websites or apps and gather key information. Based on this, technology companies
can optimize their sales, customer service, customer satisfaction, etc. This also helps them launch new
products and services since we are living in a knowledge-intensive economy, and the companies in the
technology sector are reaping the benefits of big data analytics.
Healthcare
Healthcare is another industry that can benefit from big data analytics tools, techniques, and processes.
Healthcare personnel can diagnose the health of their patients through various tests, run them through the
computers, and look for telltale signs of anomalies, maladies, etc. It also helps in healthcare to improve
patient care and increase the efficiency of the treatment and medication processes. Some diseases can be
diagnosed before their onset so that measures can be taken in a preventive manner rather than a remedial
manner.
Manufacturing
Manufacturing is an industrial sector that is involved with developing physical goods. The life cycle of a
manufacturing process can vary from product to product. Manufacturing systems are involved within the
industry setup and across the manufacturing floor.
There are a lot of technologies that are involved in manufacturing such as the Internet of Things (IoT),
robotics, etc., but the backbone of all of these is firmly based on big data analytics. By using this,
manufacturers can improve their yield, reduce the time to market, enhance the quality, optimize the supply
chain and logistics processes, and build prototypes before the launch of products. It can help manufacturers
through all these steps.
Energy
Most oil and gas companies, which come under the energy sector, are extensive users of big data analytics.
It is deployed when it comes to discovering oil and other natural resources. Tremendous amounts of big
data go into finding out what the price of a barrel of oil will be, what the output should be, and if an oil
well will be profitable or not.
It is also deployed in finding out equipment failures, deploying predictive maintenance, and optimally
using resources in order to reduce capital expenditure.

Advantages of big data analytics


There are quite a few advantages to incorporating big data analytics into a business or organization. These
include:
• Cost reduction: Big data can reduce costs in storing all the business data in one place. Tracking
analytics also helps companies find ways to work more efficiently to cut costs wherever possible.
• Product development: Developing and marketing new products, services, or brands is much easier
when based on data collected from customers’ needs and wants. Big data analytics also helps businesses
understand product viability and keep up with trends.
• Strategic business decisions: The ability to constantly analyze data helps businesses make better and
faster decisions, such as cost and supply chain optimization.
• Customer experience: Data-driven algorithms help marketing efforts (targeted ads, as an example) and
increase customer satisfaction by delivering an enhanced customer experience.
• Risk management: Businesses can identify risks by analyzing data patterns and developing solutions
for managing those risks.
• Entertainment: Providing a personalized recommendation of movies and music according to a
customer’s individual preferences has been transformative for the entertainment industry (think Spotify
and Netflix).
• Education: Big data helps schools and educational technology companies alike develop new
curriculums while improving existing plans based on needs and demands.
• Health care: Monitoring patients’ medical histories helps doctors detect and prevent diseases.
• Government: Big data can be used to collect data from CCTV and traffic cameras, satellites, body
cameras and sensors, emails, calls, and more, to help manage the public sector.
• Marketing: Customer information and preferences can be used to create targeted advertising campaigns
with a high return on investment (ROI).
• Banking: Data analytics can help track and monitor illegal money laundering.


Tools used in big data analytics


Harnessing all of that data requires tools. Thankfully, technology has advanced so that there are many
intuitive software systems available for data analysts to use.
• Hadoop: An open-source framework that stores and processes big data sets. Hadoop is able to handle
and analyze structured and unstructured data.
• Spark: An open-source cluster computing framework used for real-time processing and analyzing data
(see the sketch after this list).
• Data integration software: Programs that allow big data to be streamlined across different platforms,
such as MongoDB, Apache, Hadoop, and Amazon EMR.
• Stream analytics tools: Systems that filter, aggregate, and analyze data that might be stored in different
platforms and formats, such as Kafka.
• Distributed storage: Databases that can split data across multiple servers and have the capability to
identify lost or corrupt data, such as Cassandra.
• Predictive analytics hardware and software: Systems that process large amounts of complex data,
using machine learning and algorithms to predict future outcomes, such as fraud detection, marketing,
and risk assessments.
• Data mining tools: Programs that allow users to search within structured and unstructured big data.
• NoSQL databases: Non-relational data management systems ideal for dealing with raw and
unstructured data.
• Data warehouses: Storage for large amounts of data collected from many different sources, typically
using predefined schemas.
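As a small illustration of the Spark entry above, here is a minimal PySpark sketch (Spark's Python API).
It assumes a working Spark installation, and events.csv with a channel column is a hypothetical
placeholder:

from pyspark.sql import SparkSession

# Start a local Spark session (assumes Spark is installed)
spark = SparkSession.builder.appName("BigDataSketch").getOrCreate()

# Read a (potentially very large) CSV of web events, inferring column types
events = spark.read.csv("events.csv", header=True, inferSchema=True)

# A simple distributed aggregation: count events per channel
events.groupBy("channel").count().orderBy("count", ascending=False).show()

spark.stop()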

Disadvantages of Big Data


Following are the drawbacks or disadvantages of Big Data:
• Traditional storage can cost a lot of money to store big data.
• Much of big data is unstructured.
• Big data analysis can violate principles of privacy.
• It can be used for manipulation of customer records.
• It may increase social stratification.
• Big data analysis is not useful in the short run; it needs to be analyzed over a longer duration to
leverage its benefits.
• Big data analysis results are sometimes misleading.
• Rapid updates in big data can cause mismatches with real figures.

4.1.1.6 Text Analytics


Text analytics, also known as text mining, is a subfield of data analytics that deals with the process of
extracting insights and information from unstructured text data. This includes analyzing the sentiment of
customer feedback, extracting key topics and themes from large volumes of customer reviews, or
identifying trends in news articles.
Text analytics makes use of natural language processing (NLP) techniques and machine learning
algorithms to process and analyze large volumes of text data. The process typically involves pre-
processing the text data, such as removing stop words and stemming, to convert it into a format that can
be analyzed by computational algorithms.
Text analytics can be used in a variety of applications, including customer experience management, social
media monitoring, and sentiment analysis. By extracting insights from text data, organizations can gain
valuable insights into customer opinions and preferences, track industry trends, and inform decision-
making.
Overall, text analytics is a powerful tool for organizations looking to derive meaningful insights from
unstructured text data. With the increasing amounts of text data generated by social media, customer
feedback, and other sources, text analytics is becoming an increasingly important component of data
analytics.
Text mining, text analysis, and text analytics are often used interchangeably, with the end goal of
analyzing unstructured text to obtain insights. However, while text mining (or text analysis) provides
insights of a qualitative nature, text analytics aggregates these results and turns them into something that
can be quantified and visualized through charts and reports.
Text analysis and text analytics often work together to provide a complete understanding of all kinds of
text, like emails, social media posts, surveys, customer support tickets, and more. For example, you can
use text analysis tools to find out how people feel toward a brand on social media (sentiment analysis), or
understand the main topics in product reviews (topic detection).
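For instance, here is a minimal sentiment-analysis sketch in Python using NLTK's VADER analyzer,
one of many available tools; the review text is made up:

import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the sentiment lexicon

sia = SentimentIntensityAnalyzer()

# Hypothetical customer review
review = "The delivery was fast, but the product quality is disappointing."
print(sia.polarity_scores(review))  # neg/neu/pos scores plus a compound score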

Functions of Text Analytics


There are 7 basic steps involved in preparing an unstructured text document for deeper analysis:
1. Language Identification
2. Tokenization
3. Sentence Breaking
4. Part of Speech Tagging
5. Chunking
6. Syntax Parsing
7. Sentence Chaining
Each step is achieved on a spectrum between pure machine learning and pure software rules. Let’s review
each step in order, and discuss the contributions of machine learning and rules-based NLP.
1. Language Identification
The first step in text analytics is identifying what language the text is written in. Spanish? Singlish?
Arabic? Each language has its own idiosyncrasies, so it’s important to know what we’re dealing with.
As basic as it might seem, language identification determines the whole process for every other text
analytics function. So it’s very important to get this sub-function right.
Lexalytics supports 29 languages (first and final shameless plug) spanning dozens of
alphabets, abjads and logographies.
2. Tokenization
Now that we know what language the text is in, we can break it up into pieces. Tokens are the individual
units of meaning you’re operating on. This can be words, phonemes, or even full
sentences. Tokenization is the process of breaking text documents apart into those pieces.
In text analytics, tokens are most frequently just words. A sentence of 10 words, then, would contain 10
tokens. For deeper analytics, however, it’s often useful to expand your definition of a token. For
Lexalytics, tokens can be:
• Words
• Punctuation (exclamation points intensify sentiment)
• Hyperlinks (https://…)
• Possessive markers (apostrophes)
Tokenization is language-specific, and each language has its own tokenization requirements. English, for
example, uses white space and punctuation to denote tokens, and is relatively simple to tokenize.
In fact, most alphabetic languages follow relatively straightforward conventions to break up words,
phrases and sentences. So, for most alphabetic languages, we can rely on rules-based tokenization.
But not every language uses an alphabet.
Many logographic (character-based) languages, such as Chinese, have no space breaks between words.
Tokenizing these languages requires the use of machine learning, and is beyond the scope of this article.
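For alphabetic languages such as English, a minimal rules-based tokenization sketch with NLTK looks
like this (the sample sentence and the library choice are ours, purely illustrative):

import nltk
from nltk.tokenize import word_tokenize

nltk.download("punkt")  # tokenizer models (newer NLTK versions may also need "punkt_tab")

print(word_tokenize("Wow!!! John's new phone arrived, didn't it?"))
# -> ['Wow', '!', '!', '!', 'John', "'s", 'new', 'phone', 'arrived', ',', 'did', "n't", 'it', '?']

Note how punctuation and the possessive marker come out as separate tokens, which matters for deeper
analysis.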
3. Sentence Breaking
Once you’ve identified the tokens, you can tell where the sentences end. (See, look at that period right
there, you knew exactly where the sentence ended, didn’t you Dr. Smart?)


But look again at the second sentence above. Did it end with the period at the end of “Dr.?”
Now check out the punctuation in that last sentence. There’s a period and a question mark right at the end
of it!
Point is, before you can run deeper text analytics functions (such as syntax parsing, #6 below), you must
be able to tell where the boundaries are in a sentence. Sometimes it’s a simple process, and other times…
not so much.
Certain communication channels <cough> Twitter <cough> are particularly complicated to break down.
We have ways of sentence breaking for social media, but we’ll leave that aside for now.
4. Part of Speech Tagging
Once we’ve identified the language of a text document, tokenized it, and broken down the sentences, it’s
time to tag it.
Part of Speech tagging (or PoS tagging) is the process of determining the part of speech of every token
in a document, and then tagging it as such.
For example, we use PoS tagging to figure out whether a given token represents a proper noun or a
common noun, or if it’s a verb, an adjective, or something else entirely.
Part of Speech tagging may sound simple, but much like an onion, you’d be surprised at the layers involved
– and they just might make you cry. At Lexalytics, due to our breadth of language coverage, we’ve had to
train our systems to understand 93 unique Part of Speech tags.
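For a rough illustration, here is PoS tagging with NLTK's default English tagger (a much smaller tag set
than the 93 tags mentioned above):

import nltk

nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")  # NLTK's default English PoS model

tokens = nltk.word_tokenize("The tall man is going to quickly walk under the ladder.")
print(nltk.pos_tag(tokens))
# -> [('The', 'DT'), ('tall', 'JJ'), ('man', 'NN'), ('is', 'VBZ'), ...]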
5. Chunking
Let’s move on to the text analytics function known as Chunking (a few people call it light parsing, but
we don’t). Chunking refers to a range of sentence-breaking systems that splinter a sentence into its
component phrases (noun phrases, verb phrases, and so on).
Before we move forward, I want to draw a quick distinction between Chunking and Part of Speech tagging
in text analytics.
• PoS tagging means assigning parts of speech to tokens
• Chunking means assigning PoS-tagged tokens to phrases
Here’s what it looks like in practice. Take the sentence:
The tall man is going to quickly walk under the ladder.
PoS tagging will identify man and ladder as nouns and walk as a verb.
Chunking will return: [the tall man]_np [is going to quickly walk]_vp [under the ladder]_pp
(np stands for “noun phrase,” vp stands for “verb phrase,” and pp stands for “prepositional phrase.”)
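A minimal sketch of chunking that sentence with NLTK's rules-based RegexpParser; the toy grammar
below only extracts noun phrases and is not Lexalytics' system:

import nltk

# Toy grammar: a noun phrase is an optional determiner, any adjectives, then a noun
parser = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>}")

tagged = nltk.pos_tag(nltk.word_tokenize(
    "The tall man is going to quickly walk under the ladder."))
print(parser.parse(tagged))  # a tree with [the tall man] and [the ladder] as NP chunks

(This assumes the NLTK models from the previous sketches are already downloaded.)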
Got that? Let’s move on.


6. Syntax Parsing
The syntax parsing sub-function is a way to determine the structure of a sentence. In truth, syntax
parsing is really just fancy talk for sentence diagramming. But it’s a critical preparatory step in sentiment
analysis and other natural language processing features.
This becomes clear in the following examples:
• Apple was doing poorly until Steve Jobs…
• Because Apple was doing poorly, Steve Jobs…
• Apple was doing poorly because Steve Jobs…
In the first sentence, Apple is negative, whereas Steve Jobs is positive. In the second, Apple is still
negative, but Steve Jobs is now neutral. In the final example, both Apple and Steve Jobs are negative.
Syntax parsing is one of the most computationally-intensive steps in text analytics. At Lexalytics, we use
special unsupervised machine learning models, based on billions of input words and complex matrix
factorization, to help us understand syntax just like a human would.
7. Sentence Chaining
The final step in preparing unstructured text for deeper analysis is sentence chaining, sometimes known
as sentence relation.
Lexalytics utilizes a technique called “lexical chaining” to connect related sentences. Lexical chaining
links individual sentences by each sentence’s strength of association to an overall topic.
Even if sentences appear many paragraphs apart in a document, the lexical chain will flow through the
document and help a machine detect overarching topics and quantify the overall “feel.”
In fact, once you’ve drawn associations between sentences, you can run complex analyses, such as
comparing and contrasting sentiment scores and quickly generating accurate summaries of long
documents.

Use Cases of Text Analytics

1. Sports Trading
One of the most popular sports to bet on, particularly in Europe, is football (soccer). The top sports traders
gather data from the mainstream media and have a deep understanding of the game and its politics at a
local level.
If you live in England and you bet on English football, irrespective of the division, it’s relatively easy to
understand your market. You can successfully bet on a local second division English team because you
speak the language, read the local newspapers and may even follow some of the team members on Twitter.
But what if you’d like to do the same for a similar team in Spain and you don’t speak a word of Spanish?
A Text Analysis API capable of understanding Spanish would allow you to extract meaning from local
Twitter feeds, giving you insights into what the local fans are saying about their team. These people
understand the squad dynamics at a local level. If, for example, the star striker of Real Club Deportivo
Mallorca has an argument with his wife the night before his cup game, is he as likely to be the top scorer
on match day?
2. Financial Trading
As with sports trading, having an insight into what is happening at a local level can be very valuable to a
financial trader.
Domain-specific sentiment analysis/classification can add real value here. Just as fans have their own
distinct vocabulary for their sport, so too do traders in particular markets.
Intent recognition and Spoken Language Understanding services for detecting user intents (e.g. “buy”,
“sell”, etc) from short utterances can help to guide traders in deciding what to trade, how much and how
quickly.
3. Voice of the Customer (VOC)


VOC applications are primarily used by companies to determine what a customer is saying about a product
or service.
Sources of such data include emails, surveys, call center logs and social media streams like blogs, tweets,
forum posts, newsfeeds, and so on. For example, a telecom company could use voice of customer text
analysis to scan Twitter for customer gripes about their broadband internet services.
This would give them an early warning when customers were annoyed with the performance of the service
and allow them to intercept the issue before it involved the customer calling to officially complain or
request contract cancellation.
4. Fraud
Whether it’s workers claiming false compensation or a motorist disclosing a false home
address, fraudulent activity can be discovered much more quickly when those investigating can join the
dots together, faster.
In the latter case, for example, the guilty party may give an address that has many claims associated with
it or the driven vehicle may have been involved in other claims.
Having the ability to capture this information saves the insurer time and gives them greater insight into
the case.
5. Manufacturing or Warranty Analysis
In this use case, companies examine the text that comes from warranty claims, dealer technician lines,
report orders, customer relations text, and other potential information using text analytics to extract certain
entities or concepts (like the engine or a certain part).
They can then analyze this information, looking at how the entities cluster, whether the clusters are
increasing in size, and whether they are a cause for concern.
6. Customer Service Routing
In this use case, companies can use text analytics to route requests to customer service representatives.
For example, say you’ve sent an email to a company while on hold to one of their reps. You might have
a question or a complaint about one of their products. The company can use text analytics for intelligent
routing of that email to the appropriate person at the company.
This could also be possible in a call center situation, provided you have sufficiently accurate speech-to-
text software.
7. Lead Generation
As was the case with the VOC application, taking timely action on a piece of Social Media information
can be used to both retain and gain new customers.
For example, if a person tweets that they are interested in a certain product or service, text analytics can
discover this & feed this info to a sales rep who can then pursue this prospect and convert them into a
customer.
8. TV Advertising & Audience Analysis
TV shows or live televised events are some of the most talked-about topics on Twitter. Marketers and TV
producers can benefit from using Text Analytics in two distinct ways. If producers can get an
understanding of how their audience ‘feels’ about certain characters, settings, storylines, featured music
etc, they can make adjustments in a bid to appease their viewers and therefore increase the audience size
and viewer ratings.
Marketers can dig into social media streams to analyse the effectiveness of product placement and
commercials aired during the breaks.
For example, the TV character ‘Cersei’ from Game of Thrones is becoming a fashion icon amongst fans,
who regularly Tweet about her latest frock. High street retailers that want to take advantage of this trend
could release a line of ‘Queen of Westeros’ style clothing and align their commercials with shows like
Game of Thrones.
Text Analytics could also be used by TV Executives looking to sell to advertisers. For example, a TV
company could mine viewers' tweets and forum activity to profile their audience more accurately.


So instead of merely pitching the size of their audience to advertisers, they could wow them by identifying
their gender, location, age etc and their feelings towards certain products.
9. Recruitment
Text Analysis could be used in both the search and selection phases of recruitment.
The most basic application would be identifying the skills of a potential hire. In the recruitment industry,
the real value comes from identifying prospects before they become active in the job market.
For example, it would be very powerful to know if somebody tweets about disliking their job or expresses
an interest in working in a different field, larger/smaller company, different location. Once you have
identified such a prospect, you could use Text Analytics to analyse the suitability of this person based on
what others say about them.
Mining news and blog articles, forum postings and other sources could help to evaluate potential hires.
10. Review Sites
Companies like Expedia have millions of reviews on their website, from travellers all over the world.
Given the nature of the site and the fact that their users are looking for a stress-free experience, having to
sift through hundreds of reviews to find a place to stay can be a real turn off.
Text Analysis can be used here to build tools that can summarize multiple properties in 2-3 word phrases.
Instead of scrolling through a list of hotel features like heated pool, massage therapy, buffet breakfast etc,
you could simply say “Luxurious Hotel and Spa”.

Advantages of Text Analytics:


Insights into Customer Feedback: Text analytics can help organizations understand customer feedback,
opinions, and sentiments. This can inform product development and marketing strategies, as well as
improve customer satisfaction.
Trend Analysis: Text analytics can be used to identify trends and patterns in large volumes of text data,
such as social media posts or news articles. This can provide organizations with valuable insights into
industry trends and public opinion.
Efficient Data Processing: Text analytics can automate the process of analyzing large volumes of text
data, making it faster and more efficient than manual methods.
Cost Effective: Text analytics can be a cost-effective way of gaining insights from large volumes of text
data, as it eliminates the need for manual data processing and analysis.
Disadvantages of Text Analytics:
Limitations of NLP Techniques: Natural language processing (NLP) techniques used in text analytics are
not always accurate and can struggle with ambiguity, sarcasm, and other complex language structures.
Bias: Text analytics algorithms can perpetuate and amplify existing biases in the training data, leading to
inaccurate results.
Quality of Data: The quality of the text data used for analysis can greatly affect the accuracy of the results.
Data cleaning and pre-processing are important steps in text analytics, but the quality of the original data
can still impact the results.
Complexity: Text analytics can be a complex process, requiring specialized skills and knowledge of NLP
techniques and machine learning algorithms. This can make it difficult for organizations to implement and
maintain a text analytics solution.

4.1.1.6 Visual Analytics

Visual analytics is a branch of data analytics that emphasizes the use of visual representations to
explore, analyze, and understand complex data. It combines traditional data analysis methods with
interactive visualizations, enabling users to quickly identify patterns, relationships, and trends in large and
complex datasets.
Visual analytics leverages the human brain's ability to process visual information, allowing users
to quickly identify patterns and insights that might be difficult to discern from raw data or traditional


statistical reports. This makes it an effective tool for exploring and communicating data insights to both
technical and non-technical stakeholders.
Applications of visual analytics include exploratory data analysis, data visualization, business intelligence,
and scientific visualization. By using interactive visualizations, users can quickly test hypotheses, compare
multiple datasets, and gain a deeper understanding of the data.
Overall, visual analytics provides a powerful way to explore and understand complex data, helping
organizations to make informed decisions, improve operations, and gain a competitive advantage.

Importance of Data Visualization


Data visualization is essential to assist businesses in quickly identifying data trends, which would
otherwise be a hassle. The pictorial representation of data sets allows analysts to visualize concepts and
new patterns. With data proliferating every day, making sense of the quintillions of bytes being generated
is impossible without data visualization.
Every professional industry benefits from understanding their data, so data visualization is branching out
to all fields where data exists. For every business, information is their most significant leverage. Through
visualization, one can prolifically convey their points and take advantage of that information.
A dashboard, graph, infographics, map, chart, video, slide, etc. all these mediums can be used for
visualizing and understanding data. Visualizing the data enables decision-makers to interrelate the data to
find better insights and reap the importance of data visualization, which are:
1. Analyzing the Data in a Better Way
Analyzing reports helps business stakeholders focus on the areas that require attention. The visual
mediums help analysts understand the key points needed for their business. Whether it is a sales report or
a marketing strategy, a visual representation of data helps companies increase their profits through better
analysis and better business decisions.


2. Faster Decision Making


Humans process visuals better than any tedious tabular forms or reports. If the data communicates well,
decision-makers can quickly take action based on the new data insights, accelerating decision-making,
and business growth simultaneously.
3. Making Sense of Complicated Data
Data visualization allows business users to gain insight into their vast amounts of data. It benefits them to
recognize new patterns and errors in the data. Making sense of these patterns helps the users pay attention
to areas that indicate red flags or progress. This process, in turn, drives the business ahead.

Use cases of Visual analytics


There are many use cases for visual analytics, including:
Business Intelligence: Visual analytics can be used to support decision-making by providing interactive
visualizations of key business metrics and KPIs. This can help executives and managers quickly identify
trends, opportunities, and challenges, and make informed decisions.
Fraud Detection: Visual analytics can help detect fraudulent activity by allowing analysts to interactively
explore large volumes of transaction data and identify suspicious patterns.
Customer Analytics: Visual analytics can be used to understand customer behavior and preferences by
analyzing data from sources such as website logs, customer surveys, and social media. This can inform
marketing and customer experience strategies.
Healthcare Analytics: Visual analytics can be used to improve patient outcomes by analyzing large
volumes of medical data to identify trends, predict outcomes, and inform treatment decisions.
Supply Chain Management: Visual analytics can be used to optimize supply chain operations by
analyzing data on inventory levels, supplier performance, and shipping times. This can help organizations
improve efficiency and reduce costs.
Visual analytics can be used to address a wide range of business challenges and support data-
driven decision-making across a variety of industries. By leveraging the power of visual representation
and interactive exploration, visual analytics can help organizations gain insights, make informed
decisions, and drive business value.

Advantages of Visual Analytics


Businesses are implementing data analytics and visualization tools with increasing frequency in order to
speed up business performance and improve their decision-making process. Some key
benefits of visualization in data analytics include:
• Increased Insight: Visual analytics allows users to interactively explore and understand complex data,
helping to uncover hidden patterns, relationships, and insights that might be difficult to discern from raw
data or traditional statistical reports.
• Improved Communication: Visual analytics can make it easier to communicate data insights to
both technical and non-technical stakeholders. Interactive visualizations can help simplify
complex data and make it easier for decision-makers to understand the results.
• Faster Analysis: Visual analytics can be faster and more intuitive than traditional data analysis
methods, as it leverages the human brain's ability to process visual information. This can allow
users to quickly identify patterns and trends in large and complex datasets.
• Increased Engagement: Visual analytics can be more engaging for users than traditional data
analysis methods, as it provides interactive and intuitive visual representations of the data. This
can help to increase user engagement and improve the overall user experience.
• Improved Decision-Making: By providing a more complete and accurate understanding of
complex data, visual analytics can help organizations make more informed decisions. This can
improve operational efficiency and drive business value.

Disadvantages of Visual Analytics


There are a few potential disadvantages of visual analytics, including:


• Dependence on Visual Representation: Visual analytics relies on the ability to represent data in a
meaningful way, and some types of data may be difficult to visualize effectively. This can limit
the utility of visual analytics in certain cases.
• Bias: Visual analytics can perpetuate and amplify existing biases in the data or the way the data is
represented. For example, the choice of colors or scales used in visualizations can affect the
interpretation of the data.
• Limited Interactivity: Although visual analytics often provides more interactive visualizations than
traditional data analysis methods, it can still have limitations in terms of the level of interaction
and exploration that is possible.
• Skill Requirements: Visual analytics can require specialized skills to create effective visualizations
and extract meaningful insights from the data. This can limit the ability of non-technical
stakeholders to use and understand the results.
• Data Quality: The quality of the data being analyzed can greatly affect the accuracy of the results
of visual analytics. Data cleaning and preparation are critical steps in visual analytics, but the
quality of the original data can still impact the results.

Data visualization for decision making


Data visualization plays a crucial role in decision-making by allowing individuals and organizations to
effectively analyze and understand complex data. By presenting data in an interactive, intuitive, and
visually appealing format, data visualization helps to:
• Identify patterns and trends: Data visualization allows users to quickly identify patterns, trends,
and relationships in large and complex datasets, helping to uncover new insights and inform
decision-making.
• Improve communication: Data visualization makes it easier to communicate data insights to both
technical and non-technical stakeholders. Interactive visualizations can help simplify complex data
and make it easier for decision-makers to understand the results.
• Facilitate exploration: Data visualization enables users to interactively explore data, allowing them
to test hypotheses, compare multiple datasets, and gain a deeper understanding of the data.
• Support collaboration: Data visualization can support collaboration by allowing multiple users to
access and explore the same data, facilitating discussion and decision-making.
• Enhance transparency: By providing an easily understandable representation of the data, data
visualization can increase transparency and help to build trust in decision-making.
Overall, data visualization is a powerful tool for decision-making, as it helps to simplify complex data,
improve communication, facilitate exploration, support collaboration, and enhance transparency. By
effectively leveraging data visualization, organizations can make data-driven decisions, improve
operational efficiency, and drive business value.

While Data Analytics is more involved in bringing some form of structure to unorganized data, Data
Visualization deals with picturing the information to reveal trends and conclusions. In Data
Visualization, information is organized into charts, graphs, and other forms of visual representation. This
simplifies otherwise complicated information and makes it accessible to all the involved stakeholders to
make critical business decisions.

Purpose of Data Visualization


The purpose of data visualization is pretty clear. It is to make sense of the data and use the information
for the organization’s benefits. That said, data is complicated, and it gains more value as and when it gets
visualized. Without visualization, it is challenging to quickly communicate the data findings and identify
patterns to pull insights and interact with the data seamlessly.


Data scientists can find patterns or errors even without visualization. However, it is crucial to communicate
data findings and identify critical information from them, and for this, interactive data visualization tools
make all the difference. A relevant and recent example is the ongoing pandemic: data scientists can
certainly look into the raw data and gain insights, but data visualization is what helps experts stay informed
and calm amid such an abundance of data.
• Data visualization strengthens the impact of messaging for your audiences and presents the data
analysis results in the most persuasive manner. It unifies the messaging systems across all the
groups and fields within the organization.
• Visualization lets you comprehend vast amounts of data at a glance and in a better way. It helps to
understand the data better to measure its impact on the business and communicates the insight
visually to internal and external audiences.
• Decisions can’t be made in a vacuum. Available data and insights support decision analysis, and
unbiased data without inaccuracies gives decision-makers access to the right kind of
information, with visualization representing that information and keeping it relevant.
Data visualization has the potential to solve many business issues. All businesses must incorporate data
visualization tools and reap transformative benefits in their critical areas of operations.

Types of Data Visualization Techniques:


Like Data Analytics, the type of Data Visualization technique you choose will largely depend on the type
of data to be modeled and the purpose. It is worth noting that some Visualizations are manually created
while others are automated. Below are some of the popular Visualization techniques:
• Histograms: This is a Graphical Visualization Tool that organizes a set of data into a range of
frequencies. It bears key similarities to a Bar Graph and organizes information in a way that makes
it easy to interpret (a code sketch follows this list).
• Graphs: These are excellent tools for analyzing the time series relationship in a particular set of
data. For instance, a company’s annual profits could be analyzed based on each month using a
graph.
• Fever Charts: A Fever Chart is an indispensable tool for any business since it shows how data
changes over time. For instance, a particular product’s performance could be analyzed based on
its yearly profits.
• Heatmap Visualization: This tool is based on the psychological fact that the human brain
interprets colors much faster than numbers. It is a graph that uses numerical data points highlighted
in light or warm colors to represent high or low-value points.
• Infographics: Infographics are effective when analyzing complex datasets. They take large
amounts of data and organize it into an easy to interpret format.
These are some of the most popular Visualization techniques you can leverage to level up your Data
Analytics and Visualization workflows.
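As a small illustration of the first technique, a histogram takes only a few lines of Python with matplotlib; the data here is randomly generated purely for demonstration:

import matplotlib.pyplot as plt
import numpy as np

values = np.random.normal(loc=50, scale=10, size=500)  # synthetic data

plt.hist(values, bins=20, edgecolor="black")  # organize the data into 20 frequency bins
plt.title("Histogram of a synthetic variable")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()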

Advantages of Visualization
Data Analytics and Visualization are crucial elements of the business decision-making process. They help
the stakeholders to recognize patterns in the data and devise profitable business strategies. Below are some
of the benefits of Data Analytics and Visualization:
• Better Decision Making: By using skilled Data Analysts and the right software, companies can
identify market trends and make better business decisions to Boost Sales and Profits.
• Better Insights: Companies can get better insights into their Customer Base- using Data Analytics
and Visualization, companies can break large customer data down into smaller sets that can be
used to understand the Client Base better.
• Improving Productivity and Revenue Growth: By looking at the results from Data Analytics
and Visualization, companies get to know which areas they need to invest in and what processes
need to be automated for better efficiency.


• Noting Changes in Market Behaviour: With a real-time Data Analytics and Visualization
Dashboard, company stakeholders can quickly identify changes in market behaviour and make
appropriate business decisions.
• Analyzing Different Markets: Using Data Analytics and Visualization techniques, companies
can analyze different markets and decide which ones to place attention on and which ones to avoid.
• Business Trends: This is one of the most valuable applications of Data Analytics and
Visualization. It allows businesses to examine the present and past trends to make predictions that
determine the way forward for the business.
• Data Relationships: This is one of the most obvious benefits of Data Analytics and Visualization.
It helps companies note the relationships between independent data sets and make business
decisions based on these findings.
Disadvantages of Data Visualization
• It gives estimates, not exactness: while the underlying information may be accurate in predicting
situations, a visualization of it only gives an estimate. It is undoubtedly easy to convert robust and
lengthy data into a simple pictorial format, but such a representation may sometimes lead to
speculative conclusions.
• Bias: visualizations are prepared through a human interface, which means the data that becomes
the basis of the visualization can be biased. The person preparing the data may consider only the
part of the data they find significant and exclude the remainder, which can lead to biased results.
• Lack of explanation: one of the drawbacks of data visualization is that it cannot explain itself,
so different groups in the audience may interpret it in different ways.
• Improper design: if data visualization is viewed as a form of communication, it must be genuine
in conveying its purpose. If the design is not right, it can lead to confusion rather than clarity.
• A wrongly focused audience can miss the core message: however logical a visualization may be,
the clarity of its explanation depends entirely on the focus of its audience.

4.2 Skewness and Kurtosis

Skewness and kurtosis are two statistical measures that describe the shape of a probability
distribution. They provide important information about the distribution of a set of data and can help to
identify patterns or anomalies in the data.

4.2.1 Skewness
Skewness is a measure of the asymmetry of a probability distribution. It tells us whether the distribution
is symmetric or skewed to one side or the other. If the data is symmetrical, the skewness is zero. If the
data is skewed to the right (positive skewness), the mean is greater than the median, and if the data is
skewed to the left (negative skewness), the mean is less than the median.
Kurtosis is a measure of the "peakedness" of a probability distribution. It tells us whether the distribution
has a flat or peaked shape. A distribution with positive kurtosis has a higher peak and fatter tails than a
normal distribution. A distribution with negative kurtosis has a flatter shape and thinner tails than a normal
distribution.
Skewness and kurtosis are important measures for identifying outliers in a data set, as well as for
understanding the overall shape of the data. They can also be used to compare different data sets and to
identify patterns or anomalies in the data.

To Interpret Skewness


The value for skewness can range from negative infinity to positive infinity.
Here’s how to interpret skewness values:
• A negative value for skewness indicates that the tail is on the left side of the distribution, which
extends towards more negative values.
• A positive value for skewness indicates that the tail is on the right side of the distribution, which
extends towards more positive values.
• A value of zero indicates that there is no skewness in the distribution at all, meaning the
distribution is perfectly symmetrical.
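These interpretations are easy to check numerically; the sketch below uses SciPy's skew function on synthetic, randomly generated samples (illustrative only):

import numpy as np
from scipy.stats import skew

right_skewed = np.random.exponential(scale=2.0, size=10_000)   # long right tail
symmetric = np.random.normal(loc=0.0, scale=1.0, size=10_000)  # no skew

print(skew(right_skewed))  # clearly positive: the tail extends to the right
print(skew(symmetric))     # approximately zero: roughly symmetrical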

Types of Skewness
Positive Skewness
A positively skewed distribution (often referred to as Right-Skewed) is a distribution type where most
values are concentrated on the left side of the distribution, while its right tail is longer.
A positively skewed distribution is the complete opposite of a negatively skewed distribution.
A Positively Skewed Curve
In contrast to normally distributed data, where all measures of central tendency (mean, median, and mode)
are equal to each other, with positively skewed data these measures are dispersed. The general
relationship between the central tendency measures in a positively skewed distribution can be expressed
using the following inequality: Mean > Median > Mode


Negative Skewness
A negatively skewed distribution (often referred to as Left-Skewed) is a kind of distribution where more
values are concentrated on the right side of the distribution graph, while its left tail is longer.
A Negatively Skewed Curve
Unlike normally distributed data, where all measures of central tendency (mean, median, and mode)
are equal to each other, with negatively skewed data these measures are dispersed. The general
relationship between the central tendency measures in a negatively skewed distribution can be displayed
using the following inequality: Mode > Median > Mean

Zero Skewness
When a distribution has zero skew, it is symmetrical. Its left and right sides are mirror images.
Normal distributions have zero skew, but they’re not the only distributions with zero skew. Any
symmetrical distribution, such as a uniform distribution or some bimodal (two-peak) distributions, will
also have zero skew.
The easiest way to check whether a variable has a skewed distribution is to plot it in a histogram. For
example, plotting the weights of six-week-old chicks in a histogram shows a distribution that is
approximately symmetrical, with the observations distributed similarly on the left and right sides of its
peak. Such a distribution has approximately zero skew.
Zero skew: mean = median = mode

Examples of Skewed Distribution

1. Cricket Score
Cricket score is one of the best examples of skewed distribution. Let us say that during a match, most of
the players of a particular team scored runs above 50, and only a few of them scored below 10. In such a
case, the data is generally represented with the help of a negatively skewed distribution. Similarly, a
positively skewed distribution can be used if most of the players of a particular team score badly during a
match, and only a few of them tend to perform well.


2. Exam Results
The representation of exam results forms a classic example of skewed distribution in real life. The
distribution of scores obtained by the students of a class on any particularly difficult exam is generally
positively skewed in nature. This is because due to the increased difficulty level of the exam, a majority
of students tend to score low, and only a few of them manage to score high. Similarly, the distribution of
scores obtained on an easy test is negatively skewed in nature because the reduced difficulty level of the
exam helps more students score high, and only a few of them tend to score low.
3. Average Income Distribution
Income distribution is a prominent example of positively skewed distribution. This is because a large
percentage of the total people residing in a particular state tends to fall under the category of a low-income
earning group, while only a few people fall under the high-income earning group. The mean of such data
is generally greater than the other measures of central tendency of data such as median or mode.
4. Distribution of Stock Market Returns
The representation of stock market returns is usually done with the help of negatively skewed distribution.
This is because the stock market mostly provides slightly positive returns on most days, and the negative
returns are only observed occasionally. Hence, the graphical representation of data definitely has more
points on the right side as compared to the left side.

4.2.2 Kurtosis
Kurtosis is a statistical measure used to describe a characteristic of a dataset. When normally distributed
data is plotted on a graph, it generally takes the shape of a bell, called the bell curve.
The plotted data points that are furthest from the mean usually form the tails on each side of the
curve. Kurtosis indicates how much of the data resides in the tails.
Distributions with high kurtosis have more data in their tails than normally distributed data, so the tails
extend further from the mean. Distributions with low kurtosis have less tail data, with tails that stay
closer to the mean.
For investors, high kurtosis of the return distribution curve implies that there have been many price
fluctuations in the past (positive or negative) away from the average returns for the investment. So, an
investor might experience extreme price fluctuations with an investment with high kurtosis. This
phenomenon is known as kurtosis risk.
Kurtosis is all about the tails of the distribution — not the peakedness or flatness. It is used to describe the
extreme values in one versus the other tail. It is actually the measure of outliers present in the distribution.

High kurtosis in a data set is an indicator that the data has heavy tails or outliers. If there is high kurtosis,
we need to investigate why there are so many outliers; it could indicate many things, such as wrong
data entry. Low kurtosis in a data set is an indicator that the data has light tails or a lack of outliers.
If we get low kurtosis (too good to be true), we should likewise investigate and trim the dataset of unwanted
results.


Mesokurtic: This distribution has a kurtosis statistic similar to that of the normal distribution. It means that
the extreme values of the distribution are similar to those of a normal distribution. When
kurtosis is equal to 3, the distribution is mesokurtic.
The kurtosis of a mesokurtic distribution is neither high nor low, rather it is considered to be a baseline
for the two other classifications.

Leptokurtic (Kurtosis > 3): The distribution is longer and its tails are fatter; the peak is higher and sharper
than that of a mesokurtic distribution, which means the data are heavy-tailed, with a profusion of outliers.
Outliers stretch the horizontal axis of the histogram, which makes the bulk of the data appear in a
narrow (“skinny”) vertical range, giving a leptokurtic distribution its characteristic “skinniness.”
Positive excess kurtosis (kurtosis > 3) indicates that a distribution is peaked and possesses thick tails;
leptokurtic distributions therefore have positive excess kurtosis values.
An extreme positive kurtosis indicates a distribution where more of the values are located in the tails of
the distribution rather than around the mean.

Platykurtic (Kurtosis < 3): When kurtosis is below 3 (negative excess kurtosis), the distribution is
platykurtic. A platykurtic distribution is flatter (less peaked) than the normal distribution, with fewer
values in its shorter (i.e. lighter and thinner) tails.
The peak is lower and broader than that of a mesokurtic distribution, which means the data are light-tailed,
with a lack of outliers. Negative excess kurtosis (kurtosis < 3) indicates that a distribution is flat and has
thin tails.


The reason is that its extreme values are less extreme than those of the normal distribution.
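A quick numerical check is possible with SciPy, with one caveat: scipy.stats.kurtosis reports excess kurtosis (kurtosis minus 3) by default, so the mesokurtic baseline appears as 0 rather than 3. The samples below are synthetic:

import numpy as np
from scipy.stats import kurtosis

mesokurtic = np.random.normal(size=100_000)             # excess kurtosis ~ 0
leptokurtic = np.random.standard_t(df=5, size=100_000)  # heavy tails: excess kurtosis > 0
platykurtic = np.random.uniform(size=100_000)           # light tails: excess kurtosis < 0

print(kurtosis(mesokurtic))   # close to 0
print(kurtosis(leptokurtic))  # clearly positive (about 6 in theory for t with df=5)
print(kurtosis(platykurtic))  # about -1.2 for a uniform distribution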

Examples for Kurtosis


1. Let's say that 95% of women are 150–185 cm tall. Such natural phenomena typically show a
   symmetrical distribution: the mean and median are almost the same as the average height, a value of
   about 167.5 cm, which is the exact middle of 150 and 185. Heights cluster around 167.5 cm and taper
   off toward the extreme values of 150 and 185, so there is essentially no excess kurtosis; the
   distribution is close to mesokurtic.
2. Now take a flattened (platykurtic) case. Suppose that in one region the number of childless families
   is 2,000, the number of families with 1 child is 2,050, with 2 children 2,100, with 3 children 2,050,
   and with 4 children 2,000. As you can see, the frequencies are very close to each
   other. Consequently, the distribution is flattened, i.e. the graph runs almost parallel to the X-axis.
3. In another region, suppose the number of childless families is 500, families with one child 1,000,
   families with two children 2,200, families with three children 1,000, and families with four children
   500. Here most of the values are concentrated at families with two children, and the frequencies fall
   away as they move from the center. In that case the distribution is less flat, i.e. the graph is sharper
   and more peaked (leptokurtic).

4.3 Formatting Data


Formatting data in data analytics refers to the process of transforming raw data into a structure that is
suitable for analysis and visualization. The goal of formatting data is to make the data consistent, accurate,
and easy to work with. Some of the common data formatting techniques include:
• Data Cleaning: This involves identifying and correcting errors, missing values, and inconsistent
data in the raw data set.
• Data Transformation: This involves converting data from one format to another. For example,
converting date fields from text to date format, converting string data to numeric data, etc.
• Data Normalization: This involves transforming data values so that they have the same scale and
range. This helps to prevent data skewing and make it easier to compare data values.
• Data Aggregation: This involves summarizing data by grouping it into categories, calculating
summary statistics, and creating new variables based on existing data.
• Data Merging: This involves combining data from multiple sources into a single data set.
Formatting data is a crucial step in the data analytics process, as it helps to ensure that the data is accurate,
consistent, and easy to work with, making it possible to uncover insights and make data-driven decisions.
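A minimal pandas sketch, using invented data, that chains several of these steps (cleaning, transformation, normalization, and aggregation):

import pandas as pd

raw = pd.DataFrame({
    "date":   ["2024-01-01", "2024-01-02", None, "2024-01-04"],
    "sales":  ["100", "250", "175", "300"],
    "region": ["North", "north", "South", "SOUTH"],
})

df = raw.dropna(subset=["date"]).copy()   # cleaning: drop rows with missing dates
df["date"] = pd.to_datetime(df["date"])   # transformation: text to date
df["sales"] = pd.to_numeric(df["sales"])  # transformation: text to number
df["region"] = df["region"].str.title()   # cleaning: inconsistent category labels
df["sales_scaled"] = (df["sales"] - df["sales"].min()) / (
    df["sales"].max() - df["sales"].min()
)                                         # normalization to the [0, 1] range

print(df.groupby("region")["sales"].sum())  # aggregation: totals per region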

4.3.1 Different operations using charts


In data analytics, formatting data using charts is an important step in visualizing data and uncovering
insights. Different types of charts are used to represent data in different ways and highlight specific
patterns and relationships in the data. Many chart types are used for visual representation,
including column, line, pie, doughnut, bar, area, stock, surface, radar, bubble, tree map, waterfall,
map, and pivot charts; MS Excel is a common choice for this kind of data formatting. Some common
types of charts used in data formats include:
• Bar Charts: These are used to compare the values of different categories. They can be used to
represent data vertically (column chart) or horizontally (bar chart).
• Line Charts: These are used to represent continuous data over time, such as stock prices, sales, or
temperatures.
• Pie Charts: These are used to represent the proportion of different categories in a whole. They are
best used for small data sets with few categories.
• Scatter Plots: These are used to represent the relationship between two variables. They are used to
visualize the distribution of data points and identify trends and patterns in the data.
• Histograms: These are used to represent the distribution of a single variable. They provide a visual
representation of the frequency of data points within specific ranges or bins.
• Area Charts: These are used to represent the changes in data over time. They are similar to line
charts, but the area under the line is filled with color to represent the magnitude of the data.
• Box Plots: These are used to represent the distribution of a single variable. They show the
minimum and maximum values, the median, and the interquartile range of the data.

4.3.2 Pivot charts


A pivot chart is a type of chart that is used to represent and analyze data in a pivot table. A pivot table is
a tool for summarizing data in a spreadsheet, and a pivot chart is a visual representation of the data in the
pivot table. The main uses of pivot charts include:
• Data summarization: Pivot charts allow you to quickly summarize and analyze large amounts of
data by aggregating the data into meaningful categories and calculating summary statistics.
• Data visualization: Pivot charts provide a visual representation of the data, making it easier to
identify patterns and trends in the data.
• Data exploration: Pivot charts allow you to explore different aspects of the data by changing the
way the data is organized and aggregated. This can help you uncover new insights and identify
areas for further analysis.
• Data communication: Pivot charts provide a clear and concise way to communicate the results of
your data analysis to others. They can be used to present data in a report, dashboard, or
presentation.
• Data comparison: Pivot charts allow you to compare different categories of data and see how they
relate to each other. This can help you make informed decisions based on the data.
Pivot Charts provide graphical representations of the data in their associated PivotTables. PivotCharts
are also interactive. When you create a PivotChart, the PivotChart Filter Pane appears. You can use this
filter pane to sort and filter the PivotChart's underlying data. Changes that you make to the layout and data
in an associated PivotTable are immediately reflected in the layout and data in the PivotChart and vice
versa.
Usually PivotCharts display data series, categories, data markers, and axes just as standard charts do. You
can also change the chart type and other options such as the titles, the legend placement, the data labels,
the chart location, and so on.
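Outside Excel, the same idea can be sketched in pandas: pivot_table plays the role of the PivotTable, and plotting the result gives a chart over the summarized data (the sales figures below are invented):

import matplotlib.pyplot as plt
import pandas as pd

sales = pd.DataFrame({
    "region":  ["North", "North", "South", "South", "North", "South"],
    "product": ["A", "B", "A", "B", "A", "B"],
    "revenue": [100, 150, 200, 120, 90, 130],
})

# Summarize revenue by region and product, as a PivotTable would
pivot = pd.pivot_table(sales, values="revenue", index="region",
                       columns="product", aggfunc="sum")

pivot.plot(kind="bar")  # chart the summary; re-pivoting the data re-draws the chart
plt.ylabel("Total revenue")
plt.show()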

Difference Between a Pivot Chart and Normal Charts


• A standard chart uses a range of cells; a pivot chart, on the other hand, is based on data summarized
in a pivot table.


• A pivot chart is already dynamic, but a standard chart has to be modified to become a
dynamic chart.

To create charts in MS Excel


Create a chart
1. Select data for the chart.
2. Select Insert > Recommended Charts.
3. Select a chart on the Recommended Charts tab, to preview the chart.
4. Select a chart.
5. Select OK.


To format with Adding a Trendline


1. Select a chart.
2. Select Design > Add Chart Element.
3. Select Trendline and then select the type of trendline you want, such as Linear, Exponential, Linear
Forecast, or Moving Average.

4.3.3 Formatting a plot area


After inserting the chart, if you right-click a pivot chart's plot area — the area that shows the
plotted data — Excel displays a shortcut menu. Choose the last command on this menu, Format Plot Area,
and Excel displays the Format Plot Area pane.
This pane provides several collections of buttons and boxes you can use to specify the
background fill color and pattern, the line and line style, any shadowing, and any third-dimension visual
effect for the chart.
For example, to add a background fill to the plot area, select Fill from the list box on the left side of the
Format Plot Area pane. Then make your choices from the radio buttons and drop-down lists available.

Format your chart using the Ribbon in MS Excel


1. In your chart, click to select the chart element that you want to format.
2. On the Format tab under Chart Tools, do one of the following:
• Click Shape Fill to apply a different fill color, or a gradient, picture, or texture to the chart element.
• Click Shape Outline to change the color, weight, or style of the chart element.
• Click Shape Effects to apply special visual effects to the chart element, such as shadows, bevels,
or 3-D rotation.
• To apply a predefined shape style, on the Format tab, in the Shape Styles group, click the style that
you want. To see all available shape styles, click the More button.
• To change the format of chart text, select the text, and then choose an option on the mini toolbar
that appears. Or, on the Home tab, in the Font group, select the formatting that you want to use.
• To use WordArt styles to format text, select the text, and then on the Format tab in the WordArt
Styles group, choose a WordArt style to apply. To see all available styles, click the More button.

Format your chart using the Format task pane


Select the chart element (for example, data series, axes, or titles), right-click it, and click Format <chart
element>. The Format pane appears with options that are tailored for the selected chart element.
Clicking the small icons at the top of the pane moves you to other parts of the pane with more options. If
you click on a different chart element, you’ll see that the task pane automatically updates to the new chart
element.
For example, to format an axis:
1. Right-click the chart axis, and click Format Axis.
2. In the Format Axis task pane, make the changes you want.
3. You can move or resize the task pane to make working with it easier. Click the chevron in the
upper right.
• Select Move and then drag the pane to a new location.
• Select Size and drag the edge of the pane to resize it.

4.4 Data Wrangling


Data wrangling is the process of removing errors and combining complex data sets to make them more
accessible and easier to analyze. Due to the rapid expansion of the amount of data and data sources


available today, storing and organizing large quantities of data for analysis is becoming increasingly
necessary.
A data wrangling process, also known as a data munging process, consists of reorganizing, transforming
and mapping data from one "raw" form into another in order to make it more usable and valuable for a
variety of downstream uses including analytics.
Data wrangling can be defined as the process of cleaning, organizing, and transforming raw data into the
desired format for analysts to use for prompt decision-making. Also known as data cleaning or data
munging, data wrangling enables businesses to tackle more complex data in less time, produce more
accurate results, and make better decisions. The exact methods vary from project to project depending
upon your data and the goal you are trying to achieve. More and more organizations are increasingly
relying on data wrangling tools to make data ready for downstream analytics.

4.4.1 Importance of Data Wrangling


Some may question if the amount of work and time devoted to data wrangling is worth the effort. A simple
analogy will help you understand. The foundation of a skyscraper is expensive and time-consuming before
the above-ground structure starts. Still, this solid foundation is extremely valuable for the building to stand
tall and serve its purpose for decades. Similarly, for data handling, once the code and infrastructure
foundation are gathered, it will deliver immediate results (sometimes almost instantly) for as long as the
process is relevant. However, skipping necessary data wrangling steps will lead to significant downfalls,
missed opportunities, and erroneous models that damage the reputation of analysis within the organization.
Data wrangling software has become an indispensable part of data processing. The primary
importance of using data wrangling tools can be described as:
• Making raw data usable. Accurately wrangled data guarantees that quality data is entered into the
downstream analysis.
• Getting all data from various sources into a centralized location so it can be used.
• Piecing together raw data according to the required format and understanding the business context
of data
• Automated data integration tools are used as data wrangling techniques that clean and convert
source data into a standard format that can be used repeatedly according to end requirements.
Businesses use this standardized data to perform crucial, cross-data set analytics.
• Cleansing the data from the noise or flawed, missing elements
• Data wrangling acts as a preparation stage for the data mining process, which involves gathering
data and making sense of it.
• Helping business users make concrete, timely decisions
• Data wrangling software typically performs six iterative steps of Discovering, Structuring,
Cleaning, Enriching, Validating, and Publishing data before it is ready for analytics.


Benefits of Data Wrangling


• Data wrangling helps to improve data usability as it converts data into a compatible format for the
end system.
• It helps to quickly build data flows within an intuitive user interface and easily schedule and
automate the data-flow process.
• Integrates various types of information and their sources (like databases, web services, files, etc.)
• Helps users to process very large volumes of data and easily share data-flow techniques.

4.4.2 Data Wrangling Tools


There are different tools for data wrangling that can be used for gathering, importing, structuring, and
cleaning data before it can be fed into analytics and BI apps. You can use automated tools for data
wrangling, where the software allows you to validate data mappings and scrutinize data samples at every
step of the transformation process. This helps to quickly detect and correct errors in data mapping.
Automated data cleaning becomes necessary in businesses dealing with exceptionally large data sets. For
manual data cleaning processes, the data team or data scientist is responsible for wrangling. In smaller
setups, however, non-data professionals are responsible for cleaning data before leveraging it.
Some examples of basic data munging tools are:
• Spreadsheets / Excel Power Query - It is the most basic manual data wrangling tool
• OpenRefine - An automated data cleaning tool that requires programming skills
• Tabula – It is a tool suited for all data types
• Google DataPrep – It is a data service that explores, cleans, and prepares data
• Data wrangler – It is a data cleaning and transforming tool

Data Wrangling Examples


Data wrangling techniques are used for various use-cases. The most commonly used examples of data
wrangling are for:
• Merging several data sources into one data-set for analysis (see the sketch after these lists)
• Identifying gaps or empty cells in data and either filling or removing them
• Deleting irrelevant or unnecessary data
• Identifying severe outliers in data and either explaining the inconsistencies or deleting them to
facilitate analysis
Businesses also use data wrangling tools to
• Detect corporate fraud
• Support data security
• Ensure accurate and recurring data modeling results
• Ensure business compliance with industry standards
• Perform Customer Behavior Analysis
• Reduce time spent on preparing data for analysis
• Promptly recognize the business value of your data
• Find out data trends
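As a concrete sketch of the first two examples above (merging sources and handling gaps), here is a minimal pandas fragment with invented data:

import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Ben", "Cara"]})
orders = pd.DataFrame({"customer_id": [2, 3, 3], "order_total": [250.0, None, 95.0]})

merged = customers.merge(orders, on="customer_id", how="left")  # merge two sources
merged["order_total"] = merged["order_total"].fillna(0)         # fill gaps: no order means 0

print(merged)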

Data Wrangling vs. ETL


ETL stands for Extract, Transform and Load. ETL is a middleware process that involves mining or
extracting data from various sources, joining the data, transforming data as per business rules, and
subsequently loading data to the target systems. ETL is generally used for loading processed data to flat
files or relational database tables.
Though Data Wrangling and ETL look similar, there are key differences between data wrangling and ETL
processes that set them apart.


• Users – Analysts, statisticians, business users, executives, and managers use data wrangling. In
comparison, DW/ETL developers use ETL as an intermediate process linking source systems and
reporting layers.
• Data Structure – Data wrangling involves varied and complex data sets, while ETL involves structured
or semi-structured relational data sets.
• Use Case – Data wrangling is normally used for exploratory data analysis, but ETL is used for
gathering, transforming, and loading data for reporting.

4.5 Business Problem Solving across different domains


Polya’s 4-Step Problem Solving Process
Starting simply, consider the four-step problem-solving process described by G. Polya in his 1945 book
How to Solve It. These process stages are by no means universal, but they serve well as the basis from
which more specific and evolved problem-solving processes can be built. Understanding these core tenets
of the problem-solving process is essential.
1. Understand the Problem
It should go without saying that to effectively solve a problem one must first understand that problem, at
least in part. To help ensure an effective understanding of the problem, consider the following:
• Who are the stakeholders for this problem set?
• What unknown variables exist for this problem?
• Can the problem be broken down into smaller problems and, if so, would this offer an advantage in
solving the larger problem?
• Can this problem be represented graphically such that analysis can be performed for deeper insight?
Problems can be more fully understood and thus more completely addressed by working to answer these
basic questions and address these basic concerns.
2. Plan the Solution
Just as with Software Engineering, without a plan of action few problems are solved. At the very least,
the absence of a plan is likely to produce an inefficient result. Planning techniques vary widely depending
on the problem domain, solution domain, and available resources. Below are some general guidelines to
help structure the planning stages:
• Are there existing solutions to similar problems that could be used for reference or application?
• Can the larger problem be broken into smaller problems in such a way that these sub-problems can be
addressed individually? If so, can any sub-problems be compared to existing problems where full or
partial solutions exist?
• Can a design model be created for a potential solution, such that it can be analyzed, improved, and
ultimately lead to an effective solution?
Learning from existing solutions and past problems is an invaluable tool for addressing new problems.
Particularly in software development, even partial solutions may already exist such that considerable
resources can be conserved in addressing new projects.
3. Carry Out the Plan
Design planning can help model potential problem solutions but eventually, they must be applied to realize
their merit. Carrying out the plans and applying models developed during the planning stage often leads
to the discovery of many previously-unknown variables. This process is often iterative and requires astute
observation, consideration, and adjustment to reach goals. The following points are useful to consider:
• Does the solution align with the plan/model? If not, where and why were deviations made and how
did they affect the outcome of solving the problem?
• Is each aspect of the solution provably correct in that rigorous proofs and/or tests can be applied?
Execution separates ideas from products and requires that one take action. When approaching a problem
this means applying one’s plan and measuring outcomes. Was the solution successful? If not, was
it partially successful? These questions can help identify what about your current process is providing
positive results and where improvements can be made.


4. Examine the Results


Solutions, especially in software, aren’t always wholly successful. This hallmark characteristic of software
products doesn’t necessitate scrapping a project, however. Even large-scale projects can be deemed
successful while still possessing the potential for error. Things like bugs, unforeseen user behaviors, or
overlooked requirements can all be addressed iteratively. Having a robust system to detect such bugs and
errors is essential to having a flexible and dynamic problem-solving framework. Consider these two points
when examining the results when problem-solving:
• Can each aspect of the solution be tested and verified? If so, can the accuracy and effectiveness of
these solutions be consistently measured?
• Does the solution produce results that conform to the stated requirements? If
so, can these results be validated by stakeholders?
Examination of results is a hallmark of scientific and engineering processes, by which efforts can be
quantified. Problem-solving processes are, by their very nature, scientific. This characteristic can
be illustrated by the systematic approaches outlined by common problem-solving frameworks. The
measurement and validation/verification of solution results is arguably the most essential stage of any
problem-solving process.
In the 4-step process, selecting a solution fits within the "plan the solution" stage: it would be done
at the end of step two, after initial solutions have been proposed. An extension of this step, while useful,
is still essentially a modeling- and design-focused step.

4.6 Dashboarding Fundamentals


Dashboards are business intelligence (BI) reporting tools that aggregate and display critical metrics and
key performance indicators (KPIs) in a single screen, enabling users to monitor and examine business
performance at a glance.
A data dashboard is an information management tool that visually tracks, analyses and displays key
performance indicators (KPI), metrics and key data points to monitor the health of a business, department
or specific process. They are customisable to meet the specific needs of a department and company.
Behind the scenes, a dashboard connects to your files, attachments, services and APIs, but on the surface
displays all this data in the form of tables, line charts, bar charts and gauges. A data dashboard is the most
efficient way to track multiple data sources because it provides a central location for businesses to monitor
and analyse performance. Real-time monitoring reduces the hours of analysis and the long lines of
communication that previously challenged businesses.


4.6.1 Need for Dashboarding


The goal in any business should be to take the ever-increasing raw data that is being generated, turn it
into usable information, and then make that information accessible in time for key decision-makers to act
on it. The organizations that are excelling with analytics and dashboarding are driving
efficiency and creating competitive differentiation for themselves by leveraging this existing and often
under-leveraged resource.
Here are some core reasons organizations expend resources to pursue dashboards and dashboarding in
general:
• Makes Data Actionable – Using a dashboard, you can identify positive or negative trends, operational
issues or lagging KPIs and take action quickly.
• Saves Time in Understanding – Today, most business intelligence dashboards are web-based and
can update automatically as the underlying data changes. With a web-based dashboard, or even better
a mobile one, once you have connected your data source(s) and built your dashboard content, it
updates automatically based on your schedule preference.
• Drives Clarity – A well-made dashboard is visual, not a list of facts and numbers. A period-over-
period line or bar graph of sales is much easier to follow than 50 rows of sales totals.
• Creates Agility – Dashboarding platforms offer simple tools that help you visualize and understand
trends in your data. Domo dashboards are a popular choice, offering over 200 chart types to choose
from, including core maps, as well as unlimited expansion with custom charts. You can also set
Alerts to notify you automatically via e-mail, text or push notification when key metrics reach levels
you have preset.
• Highly Secure – Strong web-based dashboard platforms offer customizable data permissions. You
can limit the data each of your users see, and safely share new data and insights instead of sending
sensitive information via e-mail.

4.6.2 Process of Dashboarding


1. Identify what insights would make you more effective: Each dashboard should have one driving
focus. Common examples of dashboard focus areas are sales, finance, inventory, project management,
digital marketing, marketing/ROI and there are many more. Make a list of your core business
measurements and prioritize them.
2. Outline Requirements: Identify what data would support your key metrics, where it is located and
how you want to visualize it.
3. Choose where to build your dashboard. You may be tempted to begin building a dashboard in Excel
or another desktop application, but web-based (and even more important, cloud-native) options are
virtually always a stronger choice. Cloud-native web dashboard platforms (Domo and Looker are two
of the few such options out there) allow for easier access from anywhere, secure sharing and the
peace of mind that all users see the most updated version, without sacrificing power or flexibility.
4. Connect the Data. Modern dashboard platforms are designed to enable scheduled updates to your
data directly from the source. Domo has over a thousand prebuilt connectors for the most widely used
business systems.
5. Secure the Data. Implement security options to suit your business needs. A good cloud-native, web-
based dashboarding platform will offer customizable options to protect your data and its users.
6. Start Building. Dashboard complexity can range from relatively simple to highly interactive.
Initially, build one or two dashboards to become comfortable with the tools, and allow the users to test
them and give you feedback. If your business is larger with more complex needs, you may consider
working with a professional firm, like Graphable, that specializes in implementing your dashboarding platform.
7. Revise. Welcome feedback from users and experiment with different graph types and formats to make
your dashboard more specific. Consider more key metrics that haven't been visualized yet. Customize
your dashboard to make it more intuitive. A common way to make dashboards even more interactive
and customizable by users is to include variables.


8. Expand and Iterate. Once your initial dashboards begin to take shape, it's common for new questions
to arise. Now that you have this new knowledge regarding your finance data, what new insight would
a marketing dashboard allow you to learn? At Graphable we have found this to be a highly iterative
process: as people see their data visualized, they discover more and better ways to gain insight from
that same data. In Domo, for example, many of the customization capabilities are designed for users
to apply on their own, which often decreases the load on IT significantly.

4.6.3 Advantages of Dashboards


Dashboards are valuable because they transform business data into critical information that jumps out at
the user, who can then make sense of it and act on it immediately.
• Fast and effective decision-making- Gives executives, managers and analysts convenient
immediate access to key performance metrics, which help them monitor performance and
processes for a greater understanding of the business.
• On demand, accurate and relevant information in line with business priorities -Dashboards
clearly communicate business objectives throughout the organization and allow users to see
progress towards those goals. This keeps everyone focused and informed. With a personalized
layout, users only see the information that is most important to them, and they can filter out
information that is not relevant.
• Focused identification of problems, inefficiencies or negative trends for immediate action
and improved performance- Users can immediately see any problems and drill down on charts
and links to explore detailed information and analyze data in real time, to determine root causes
and to correct negative trends.


Exercise
I. Write down short answers for the following:

22. What is data analytics?
23. What is descriptive statistics?
24. Define predictive analysis.
25. Describe big data.
26. Define descriptive statistics.
27. What is text analytics?
28. What is data visualization?
29. What are skewness and kurtosis?
30. Define data wrangling.
31. What is a dashboard?

II. Provide Detailed Answers:


16. Elaborate the types of data analytics.
17. Describe applications of big data.
18. Explain use cases of text analytics.
19. Describe different types of charts.
20. Elaborate business problem solving across multiple domains.
21. List out the need for dashboarding.
22. Elaborate the importance and benefits of data wrangling.


UNIT V INTERMEDIATE DATA ANALYTICS 12


Dashboarding for Enterprise Reporting - Visualization: introducing a visualization tool with dashboarding - Linear &
logistic regression modeling and their types - Time series modeling, forecasting - Introduction to big data - Predictive
analysis - Automated analytics - Cloud analytics - Report writing: planning report-writing work - target audience, type of
report, style of writing, synoptic outline of chapters; steps in drafting the report

Learning Objectives
• To learn about Types of data analytics
• To understand data visualization
• To understand data wrangling

Learning Outcomes
At the end of the unit, students will be able to:
• Apply different operations of data formatting
• Apply dashboarding fundamentals
• Solve business problems in various fields with data analytics

DETAILED SESSION PLAN (TOPIC WISE)

1. Title of Topic: Introduction to data analytics and visualization
   Mode of Teaching: Chalk and board/PPT
   Textbook/Reference Book: William G. Zikmund, Barry J. Babin, Jon C. Carr, Atanu Adhikari, Mitch Griffin, Business Research Methods: A South Asian Perspective, 8th Edition, Cengage Learning, New Delhi, 2012.
   Link (NPTEL): https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=26&lesson=31 and https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=26&lesson=30
   Mode of Assessment Tool: Quiz

2. Title of Topic: Data formatting
   Mode of Teaching: Chalk and board/PPT
   Textbook/Reference Book: James R. Evans, "Business Analytics - Methods, Models and Decisions", Pearson Ed, 2012.
   Link: https://www.youtube.com/watch?v=1LgkR1R1ACU

3. Title of Topic: Problem solving
   Mode of Teaching: Chalk and board/PPT
   Textbook/Reference Book: Marc J. Schniederjans, Dara G. Schniederjans and Christopher M. Starkey, "Business Analytics Principles, Concepts, and Applications - What, Why, and How", Pearson Ed, 2014.
   Link (NPTEL): https://onlinecourses.nptel.ac.in/noc23_mg54/unit?unit=17&lesson=22


5. INTERMEDIATE DATA ANALYTICS

5.1 Dash Boarding For Enterprise Reporting

An enterprise dashboard is typically a feature of a software solution that, in the case of a hospital for
example, aggregates data from different departments, or from the same department across different hospitals,
and presents the data on a single screen. Viewing any form of information using graphics makes data easier to
understand, analyze, and process into actionable insights. Adding the functionality of dashboard
reporting provides a comprehensive and clear view of all business insights at a glance. A dashboard can
be created by linking it to Excel databases or other software that data and reporting is produced on. It can
be automated so that as data is uploaded to the database or reporting system, the charts and graphs are
automatically updated. Dashboard reporting becomes critical in a dynamic business environment,
providing real-time insights that can be acted upon to steer organizations toward their goals.

5.1.1 Dashboard Reporting Visualizations


Dashboard Reporting Visualizations are becoming increasingly popular in the collaborative workspace.
Deciding on adopting this system can be challenging for companies that are not aware of its benefits and
features. To better understand what exactly dashboard reporting visualizations are, here is an in-depth and
informative article with everything you should know before making a purchasing decision.
Dashboard reporting visualizations allow you to see data and analytics more easily by visualizing trends
and occurrences. A dashboard is a tool that tracks, analyzes, and displays KPIs, metrics and critical data
points, so that even non-technical users can understand the data behind their company's needs. There are
many reasons to implement dashboard reporting visualizations; the processes, types, and benefits are described below.
Dashboards show visualized data using charts and tables. It is also considered a process oriented system,
where images are used to easily understand the flow of a company’s business practices. Employees and
decision makers use these visual representations to monitor the status of their company against established
goals and industry benchmarks. It should contain the following: charts and impact metrics, icons, images,
drawing objects and organizers. The average person makes an estimated 35,000 decisions each day.
Having the right data that is easy to read helps make better choices.

5.1.2 Advantages of Visualization using Data Dashboard


Here are some advantages that interactive dashboards provide, when compared to traditional, static
reporting:
1. Agility for decision-makers: Interactivity during the analytical process empowers users to answer
critical questions on-demand with the most up-to-date data. Additionally, data can be looked at from
different perspectives and points of view with just a few clicks. Zooming in and out, detailing time
intervals, filtering countries, or showing and hiding specific parameters that you don’t need enables
you to look at data in the most holistic way, like never before.
2. Avoid redundant reports: You need only one tool with state-of-the-art interactive capabilities to
quickly adapt the displayed data, instead of creating 10 static PowerPoint slides. Reports use real-time
data, with intelligent data alerts that enable users to completely eliminate spreadsheets
and presentations. An alarm will notify the user when an anomaly occurs, while neural networks can
ensure smart detection and future forecasts.
3. Less IT involvement: By empowering users to perform their own ad hoc data analysis, a company can
save valuable IT resources since the number of requests for database queries or customizations will
significantly decrease. The IT department can then concentrate on other urgent or valuable tasks while
users can get answers to important business questions quickly.
4. Speed: There is no doubt, swiftness today is a crucial element for any company trying to survive in
our cutthroat digital age. When using traditional spreadsheets or PowerPoint presentations, data is


inserted once and updated manually. With modern reporting tools, there is no need to do so. Real-time
dashboards enable real-time data and that is the beauty and power of BI at its core.
5. Productivity: While static reports have been a useful tool for increasing productivity, in today's modern
economies this is simply not enough. The amount of data that is collected, and needs to be analyzed
is continuously growing, and numerous static or paper sheets or millions of rows and columns cannot
help as much as they used to. The rise of self-service BI tools has enabled users to tinker with the data
on their own, and use modern technologies that will increase their productivity levels.

Use case for business dashboards through visualisation


Business dashboards have become an integral part in businesses of all sizes, especially since implementing
them has become less expensive than ever before. If you are yet to step into the world of dashboards, the
following advantages will assure you that you are making the right decision.

5.1.3 Dashboards for the Decision Making Process and Company Performance
The better the decisions you make for your company, the more it is bound to grow and become profitable.
Dashboards provide decisive individuals with the best tools to support their jobs, especially through the
following tasks:
• Identifying Negative Trends: In addition to activating and stimulating positive trends, effective
management should detect and reduce negative trends. The latter is more important, as localizing,
analyzing and correcting these trends is essential for productivity and company morale.
• Inventing Strategies According to Goals: Dashboards support the decision-making procedure by
providing timely and accurate information. By basing decisions on this information, better
strategies will be developed and an improvement in the company’s performance will be noted.
• Improving Analysis through Visualization Abilities: Pure data will not necessarily identify and
trace most irregularities. Luckily, what may not be visible in spreadsheets may be prominently
displayed in graphic visualizations.
• Measuring Company’s Parameters: Measuring a company’s performance or levels of efficiency
can be difficult, especially since the outside may not reflect what is going on within four walls. As
dashboards support deep analysis, executives and managers alike will be able to detect
inefficiencies and take action against them.


Dashboards’ Effect on Employee Efficiency


When used by employees, dashboards can boost their efficiency and productivity. This is mainly because
this software automates analysis and reporting. Other benefits that impact employee efficiency include:
• Saving Time: Because dashboards automate data collection, analysis and reporting, employees
spend far less time manually gathering figures and compiling reports, freeing them for
higher-value work.
• Creating Interactive Reports: Instead of producing static documents, employees can build
interactive reports that update automatically and let readers drill into the underlying data,
further reducing the reporting burden.

Dashboards’ Ability to Improve Employee Motivation


Dashboards can determine the most productive employees, allowing managers to give them raises and
bonuses as a form of motivation. In addition, this software offers managers the following benefits:
• Tracing Successful Trends: By identifying and implementing successful trends, employees can
become better at their jobs. As a result, they can grow more motivated and earn rewards from their
employers.
• Concentrating on Facts Rather than Forms: Employees spending more time on reports and forms
are bound to grow less motivated. Therefore, by allowing them to use the dashboard, you can
double their interest in their work and reap the benefits of their newfound productivity.
• Understanding Strategy Statements: The most complex strategy will be easily understandable
since it is presented in a transparent way. Due to the lack of ambiguity, employees will work
according to the company’s goals and deliver beyond expectations.

5.2 Visualisation Tools for Dashboarding

Dashboard visualization tools allow you to see how you’re progressing towards your goals via dashboards,
scorecards, charts, and graphs.

The most common way businesses visualize data is via dashboards. In fact, 90% of our survey respondents
have been actively creating and using dashboards for some time. The rest have just started using
dashboards.

And more than half of them are using dashboards in marketing, sales, and web analytics.

Here is a list of dashboard visualization tools our contributors ranked as best:

Tableau
One of the most widely used data visualization tools, Tableau, offers interactive visualization solutions to
more than 57,000 companies.
Providing integration for advanced databases, including Teradata, SAP, MySQL, Amazon AWS,
and Hadoop, Tableau efficiently creates visualizations and graphics from large, constantly-evolving
datasets used for artificial intelligence, machine learning, and Big Data applications.


The Pros of Tableau:


• Excellent visualization capabilities
• Easy to use
• Top class performance
• Supports connectivity with diverse data sources
• Mobile Responsive
• Has an informative community
The Cons of Tableau:
• The pricing is a bit on the higher side
• Auto-refresh and report scheduling options are not available

Dundas BI
Dundas BI offers highly-customizable data visualizations with interactive scorecards, maps, gauges, and
charts, optimizing the creation of ad-hoc, multi-page reports. By providing users full control over visual
elements, Dundas BI simplifies the complex operation of cleansing, inspecting, transforming, and
modeling big datasets.
The Pros of Dundas BI:
• Exceptional flexibility
• A large variety of data sources and charts
• Wide range of in-built features for extracting, displaying, and modifying data
The Cons of Dundas BI:
• No option for predictive analytics
• 3D charts not supported

JupyteR
A web-based application, JupyteR, is one of the top-rated data visualization tools that enable users to
create and share documents containing visualizations, equations, narrative text, and live code. JupyteR is
ideal for data cleansing and transformation, statistical modeling, numerical simulation, interactive
computing, and machine learning.
The Pros of JupyteR:
• Rapid prototyping
• Visually appealing results
• Facilitates easy sharing of data insights
The Cons of JupyteR:
• Tough to collaborate
• At times code reviewing becomes complicated

Zoho Reports
Zoho Reports, also known as Zoho Analytics, is a comprehensive data visualization tool that
integrates Business Intelligence and online reporting services, which allow quick creation and sharing of
extensive reports in minutes. The high-grade visualization tool also supports the import of Big Data from
major databases and applications.
The Pros of Zoho Reports:
• Effortless report creation and modification
• Includes useful functionalities such as email scheduling and report sharing
• Plenty of room for data
• Prompt customer support.
The Cons of Zoho Reports:
• User training needs to be improved
• The dashboard becomes confusing when there are large volumes of data


Google Charts
One of the major players in the data visualization market space, Google Charts, coded with SVG
and HTML5, is famed for its capability to produce graphical and pictorial data visualizations. Google
Charts offers zoom functionality, and it provides users with unmatched cross-platform compatibility with
iOS, Android, and even the earlier versions of the Internet Explorer browser.
The Pros of Google Charts:
• User-friendly platform
• Easy to integrate data
• Visually attractive data graphs
• Compatibility with Google products.
The Cons of Google Charts:
• The export feature needs fine-tuning
• Inadequate demos on tools
• Lacks customization abilities
• Network connectivity required for visualization

Visual.ly
Visual.ly is one of the data visualization tools on the market, renowned for its impressive distribution
network that illustrates project outcomes. Employing a dedicated creative team for data visualization
services, Visual.ly streamlines the process of data import and outsource, even to third parties.
The Pros of Visual.ly:
• Top-class output quality
• Easy to produce superb graphics
• Several link opportunities
The Cons of Visual.ly:
• Few embedding options
• Showcases one point, not multiple points
• Limited scope

RAW
RAW, better known as RawGraphs, works with delimited data such as TSV or CSV files. It serves as
a link between data visualization and spreadsheets. Featuring a range of non-conventional and
conventional layouts, RawGraphs provides robust data security even though it is a web-based application.
The Pros of RAW:
• Simple interface
• Super-fast visual feedback
• Offers a high-level platform for arranging, keeping, and reading user data
• Easy-to-use mapping feature
• Superb readability for visual graphics
• Excellent scalability option
The Cons of RAW:
• Non-availability of log scales
• Not user intuitive

IBM Watson
Named after IBM founder Thomas J. Watson, this high-caliber data visualization tool uses analytical
components and artificial intelligence to detect insights and patterns from both unstructured and structured


data. Leveraging NLP (Natural Language Processing), IBM Watson's intelligent, self-service visualization
tool guides users through the entire insight discovery operation.
The Pros of IBM Watson:
• NLP capabilities
• Offers accessibility from multiple devices
• Predictive analytics
• Self-service dashboards
The Cons of IBM Watson:
• Customer support needs improvement
• High-cost maintenance

Sisense
Regarded as one of the most agile data visualization tools, Sisense gives users access to instant data
analytics anywhere, at any time. The best-in-class visualization tool can identify key data patterns and
summarize statistics to help decision-makers make data-driven decisions.
The Pros of Sisense:
• Ideal for mission-critical projects involving massive datasets
• Reliable interface
• High-class customer support
• Quick upgrades
• Flexibility of seamless customization
The Cons of Sisense:
• Developing and maintaining analytic cubes can be challenging
• Does not support time formats
• Limited visualization versions

Plotly
An open-source data visualization tool, Plotly offers full integration with analytics-centric programming
languages like Matlab, Python, and R, which enables complex visualizations. Widely used for
collaborative work, disseminating, modifying, creating, and sharing interactive, graphical data, Plotly
supports both on-premise installation and cloud deployment; a minimal usage sketch follows the pros and cons below.
The Pros of Plotly:
• Allows online editing of charts
• High-quality image export
• Highly interactive interface
• Server hosting facilitates easy sharing
The Cons of Plotly:
• Speed is a concern at times
• Free version has multiple limitations
• Various screen-flashings create confusion and distraction
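To make this concrete, here is a minimal, illustrative sketch of Plotly's Python interface (plotly.express); the gapminder sample dataset and its column names come bundled with the library, and the chart settings are just one reasonable choice:

```python
# A minimal plotly.express sketch: an interactive scatter chart
# built from the gapminder sample dataset bundled with Plotly.
import plotly.express as px

df = px.data.gapminder().query("year == 2007")
fig = px.scatter(
    df, x="gdpPercap", y="lifeExp",
    size="pop", color="continent", hover_name="country",
    log_x=True, title="Life expectancy vs. GDP per capita (2007)",
)
fig.show()  # renders an interactive chart in the browser or notebook
```

The resulting figure object can be edited online or exported as a high-quality image, which is the sharing workflow noted in the pros above.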

Data Wrapper
Data Wrapper is one of the very few data visualization tools on the market that is available for free. It is
popular among media enterprises because of its inherent ability to quickly create charts and present
graphical statistics on Big Data. Featuring a simple and intuitive interface, Data Wrapper allows users to
create maps and charts that they can easily embed into reports.
The Pros of Data Wrapper:
• Does not require installation for chart creation
• Ideal for beginners


• Free to use
The Cons of Data Wrapper:
• Building complex charts like Sankey is a problem
• Security is an issue as it is an open-source tool

Highcharts
Deployed by seventy-two of the world's top hundred companies, the Highcharts tool is perfect for
visualization of streaming big data analytics. Running on Javascript API and offering integration with
jQuery, Highcharts provides support for cross-browser functionalities that facilitates easy access to
interactive visualizations.
The Pros of Highcharts:
• State-of-the-art customization options
• Visually appealing graphics
• Multiple chart layouts
• Simple and flexible
The Cons of Highcharts:
• Not ideal for small organizations

Fusioncharts
Fusioncharts is one of the most popular and widely-adopted data visualization tools. The Javascript-based,
top-of-the-line visualization tool offers ninety different chart building packages that integrate with major
frameworks and platforms, offering users significant flexibility.
The Pros of Fusioncharts:
• Customized for specific implementations
• Outstanding helpdesk support
• Active community
The Cons of Fusioncharts:
• An expensive data visualization solution
• Complex set-up
• Old-fashioned interface

Infogram
Infogram is one of the most popular software programmes on the internet today. It is a web-based
tool for creating infographics and visualising data. It is primarily intended to assist all users in quickly and
simply creating interesting and interactive reports, infographics, and dashboards with data-driven
information and captivating images. This particular solution provides customers with over 550 maps and
35 charts, 20 ready-made design templates, numerous pictures and icons, a drag-and-drop editor, and other
features. Even someone who is new to the sector may quickly learn how to utilise this programme.
It has a simple editor that allows users to modify the colours and styles of their visualisations, add
corporate logos, and adjust the display choices. In addition, the users will be granted the right to use over
a million icons, GIFs, and photos in their visualisations. Users may add connections to generate traffic to
their website using interactive charts, which allow audiences to examine data using Infogram tabs. Reports
that are interactive and shareable may also be developed and incorporated, with metrics to measure
audience interaction.

Sigma.js
Sigma is a JavaScript library for drawing graphs. It enables developers to incorporate network exploration
into rich online applications and makes it simple to publish networks on websites.
• The Sigma.js layout is fantastic.


• It enables individuals to follow up with interest as soon as possible.


• Sigma.js's performance is currently satisfactory.
• Sigma.js support is fantastic and quite helpful.
• It is good software that is worth trying.

5.3 Linear and Logistic Regression Modeling

Linear Regression and Logistic Regression are two well-known machine learning algorithms that come
under the supervised learning technique. Since both algorithms are supervised in nature, they use labeled
datasets to make predictions. The main difference between them lies in how they are used: Linear Regression
is used for solving regression problems, whereas Logistic Regression is used for solving classification
problems. A description of both algorithms is given below, along with a comparison of their differences.

Linear Regression:
• Linear Regression is one of the simplest machine learning algorithms; it comes under the supervised
learning technique and is used for solving regression problems.
• It is used for predicting a continuous dependent variable with the help of independent variables.
• The goal of linear regression is to find the best-fit line that can accurately predict the output for
the continuous dependent variable.
• If a single independent variable is used for prediction, it is called Simple Linear Regression; if more
than one independent variable is used, the regression is called Multiple Linear Regression.
• By finding the best-fit line, the algorithm establishes the relationship between the dependent variable
and the independent variable(s), and this relationship should be linear in nature.
• The output of linear regression should only be continuous values such as price, age, salary, etc.


For example, with the dependent variable (salary) on the Y-axis and the independent variable (experience)
on the X-axis, the regression line can be written as:
y = a0 + a1x + ε
where a0 and a1 are the coefficients and ε is the error term.

5.3.1 Least Square Regression Line or Linear Regression Line


The most popular method to fit a regression line in the XY plot is the method of least squares. This process
determines the best-fitting line for the observed data by minimizing the sum of the squares of the vertical
deviations from each data point to the line. If a point rests on the fitted line exactly, then its
vertical deviation is 0. Because the deviations are first squared, then added, their positive and
negative values will not cancel.

Linear regression determines the straight line, called the least-squares regression line or LSRL,
that best expresses the observations in a bivariate analysis of a data set. Suppose Y is a dependent variable
and X is an independent variable; then the population regression line is given by:
Y = B0 + B1X
Where
B0 is a constant
B1 is the regression coefficient
If a random sample of observations is given, then the regression line is expressed by;
ŷ = b0 + b1x
where b0 is a constant, b1 is the regression coefficient, x is the independent variable, and ŷ is the predicted
value of the dependent variable.
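To make the least-squares formulas concrete, here is a minimal sketch with made-up data; it computes b1 and b0 from the closed-form expressions and cross-checks them against NumPy's built-in degree-1 fit:

```python
# Least-squares fit of y-hat = b0 + b1*x on hypothetical data:
# b1 = sum((x - x_bar)*(y - y_bar)) / sum((x - x_bar)^2), b0 = y_bar - b1*x_bar
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)   # independent variable
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])     # dependent variable

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
print(f"y-hat = {b0:.3f} + {b1:.3f} * x")

# Cross-check with NumPy's degree-1 (least-squares) polynomial fit:
slope, intercept = np.polyfit(x, y, deg=1)
```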


5.3.2 Features of Linear Regression


For the regression line where the regression parameters b0 and b1 are defined, the properties are given
as:
• The line minimizes the sum of squared differences between observed values and predicted values.
• The regression line passes through the mean of the X and Y variable values.
• The regression constant (b0) is equal to the y-intercept of the linear regression line.
• The regression coefficient (b1) is the slope of the regression line, which is equal to the average
change in the dependent variable (Y) for a unit change in the independent variable (X).

5.3.3 Types Of Linear Regression


Linear regression can be further divided into two types of the algorithm:
Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable, then such
a Linear Regression algorithm is called Simple Linear Regression.
Multiple Linear regression:
If more than one independent variable is used to predict the value of a numerical dependent variable, then
such a Linear Regression algorithm is called Multiple Linear Regression.


Linear Regression Line


A straight line showing the relationship between the dependent and independent variables is called
a regression line. A regression line can show two types of relationship:
Positive Linear Relationship:
If the dependent variable increases on the Y-axis and independent variable increases on X-axis, then such
a relationship is termed as a Positive linear relationship.

Negative Linear Relationship:


If the dependent variable decreases on the Y-axis and independent variable increases on the X-axis, then
such a relationship is called a negative linear relationship.

5.3.4 Assumptions of Linear Regression


Below are some important assumptions of Linear Regression. These are some formal checks while
building a Linear Regression model, which ensures to get the best possible result from the given dataset.
• Linear relationship between the features and target: Linear regression assumes the linear
relationship between the dependent and independent variables.
• Small or no multicollinearity between the features: Multicollinearity means high correlation
between the independent variables. Due to multicollinearity, it may be difficult to find the true
relationship between the predictors and the target variable; that is, it is difficult to determine
which predictor variable is affecting the target variable and which is not. So, the model assumes
either little or no multicollinearity between the features or independent variables.
• Homoscedasticity Assumption: Homoscedasticity is a situation when the error term is the same
for all the values of independent variables. With homoscedasticity, there should be no clear
pattern distribution of data in the scatter plot.
• Normal distribution of error terms: Linear regression assumes that the error term should
follow the normal distribution pattern. If error terms are not normally distributed, then
confidence intervals will become either too wide or too narrow, which may cause difficulties in
finding coefficients.


It can be checked using a Q-Q plot: if the plot shows a straight line without any deviation,
the error terms are normally distributed (a sketch follows this list).
• No autocorrelations: The linear regression model assumes no autocorrelation in error terms. If
there will be any correlation in the error term, then it will drastically reduce the accuracy of the
model. Autocorrelation usually occurs if there is a dependency between residual errors.
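As a minimal sketch, two of these checks (normal residuals via a Q-Q plot, and no autocorrelation via the Durbin-Watson statistic) can be run in Python on synthetic data; statsmodels and matplotlib are assumed to be installed:

```python
# Fit OLS on synthetic data, then check residual normality and autocorrelation.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 3 + 2 * x + rng.normal(scale=1.5, size=x.size)   # linear signal plus noise

model = sm.OLS(y, sm.add_constant(x)).fit()
sm.qqplot(model.resid, line="45", fit=True)          # points near the line => roughly normal
plt.show()
print(durbin_watson(model.resid))                    # values near 2 suggest no autocorrelation
```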

Common Examples of Linear Regression Used in Various Fields:

1. Businesses often use linear regression to understand the relationship between advertising spending and
revenue.
For example, they might fit a simple linear regression model using advertising spending as the predictor
variable and revenue as the response variable. The regression model would take the following form:
Revenue = β0 + β1(ad spending)
The coefficient β0 would represent total expected revenue when ad spending is zero.
The coefficient β1 would represent the average change in total revenue when ad spending is increased by
one unit (e.g. one dollar).
If β1 is negative, it would mean that more ad spending is associated with less revenue.
If β1 is close to zero, it would mean that ad spending has little effect on revenue.
And if β1 is positive, it would mean more ad spending is associated with more revenue.
Depending on the value of β1, a company may decide to either decrease or increase their ad spending.
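A minimal sketch of this example, using hypothetical numbers (the figures below are illustrative, not real data):

```python
# Estimate Revenue = b0 + b1*(ad spending) with ordinary least squares.
import numpy as np
import statsmodels.api as sm

ad_spend = np.array([10, 15, 20, 25, 30, 35], dtype=float)   # hypothetical, e.g. in $1,000s
revenue = np.array([95, 118, 132, 151, 166, 184], dtype=float)

fit = sm.OLS(revenue, sm.add_constant(ad_spend)).fit()
b0, b1 = fit.params
print(f"revenue = {b0:.1f} + {b1:.2f} * ad_spend")   # here b1 > 0: more spend, more revenue
```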

2. Medical researchers often use linear regression to understand the relationship between drug dosage and
blood pressure of patients.
For example, researchers might administer various dosages of a certain drug to patients and observe how
their blood pressure responds. They might fit a simple linear regression model using dosage as the
predictor variable and blood pressure as the response variable. The regression model would take the
following form:
Blood pressure = β0 + β1(dosage)
The coefficient β0 would represent the expected blood pressure when dosage is zero.
The coefficient β1 would represent the average change in blood pressure when dosage is increased by one
unit.
If β1 is negative, it would mean that an increase in dosage is associated with a decrease in blood pressure.
If β1 is close to zero, it would mean that an increase in dosage is associated with no change in blood
pressure.
If β1 is positive, it would mean that an increase in dosage is associated with an increase in blood pressure.
Depending on the value of β1, researchers may decide to change the dosage given to a patient.

3. Agricultural scientists often use linear regression to measure the effect of fertilizer and water on crop
yields.
For example, scientists might use different amounts of fertilizer and water on different fields and see how
it affects crop yield. They might fit a multiple linear regression model using fertilizer and water as the
predictor variables and crop yield as the response variable. The regression model would take the following
form:
crop yield = β0 + β1(amount of fertilizer) + β2(amount of water)
The coefficient β0 would represent the expected crop yield with no fertilizer or water.
The coefficient β1 would represent the average change in crop yield when fertilizer is increased by one
unit, assuming the amount of water remains unchanged.
The coefficient β2 would represent the average change in crop yield when water is increased by one
unit, assuming the amount of fertilizer remains unchanged.


Depending on the values of β1 and β2, the scientists may change the amount of fertilizer and water used to
maximize the crop yield.
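A minimal sketch of this multiple regression, with made-up field data (amounts and yields are purely illustrative):

```python
# Fit crop_yield = b0 + b1*fertilizer + b2*water with scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 10], [2, 10], [2, 20], [3, 20], [4, 30], [5, 30]], dtype=float)  # [fertilizer, water]
y = np.array([12.0, 14.5, 16.0, 18.2, 21.1, 23.4])                                 # crop yield

model = LinearRegression().fit(X, y)
b1, b2 = model.coef_
print(f"yield = {model.intercept_:.2f} + {b1:.2f}*fertilizer + {b2:.2f}*water")
```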

4. Data scientists for professional sports teams often use linear regression to measure the effect that
different training regimens have on player performance.
For example, data scientists in the NBA might analyze how different amounts of weekly yoga sessions
and weightlifting sessions affect the number of points a player scores. They might fit a multiple linear
regression model using yoga sessions and weightlifting sessions as the predictor variables and total points
scored as the response variable. The regression model would take the following form:
points scored = β0 + β1(yoga sessions) + β2(weightlifting sessions)
The coefficient β0 would represent the expected points scored for a player who participates in zero yoga
sessions and zero weightlifting sessions.
The coefficient β1 would represent the average change in points scored when weekly yoga sessions is
increased by one, assuming the number of weekly weightlifting sessions remains unchanged.
The coefficient β2 would represent the average change in points scored when weekly weightlifting
sessions is increased by one, assuming the number of weekly yoga sessions remains unchanged.
Depending on the values of β1 and β2, the data scientists may recommend that a player participates in more
or less weekly yoga and weightlifting sessions in order to maximize their points scored.

5.4 Logistic Regression


• Logistic regression is one of the most popular machine learning algorithms that come under
supervised learning techniques.
• It can be used for classification as well as for regression problems, but it is mainly used for
classification problems.
• Logistic regression is used to predict a categorical dependent variable with the help of independent
variables.
• The output of a logistic regression problem can only be between 0 and 1.
• Logistic regression can be used where a probability between two classes is required, such as
whether it will rain today or not: either 0 or 1, true or false, etc.
• Logistic regression is based on the concept of maximum likelihood estimation. According to this
estimation, the observed data should be most probable.
• In logistic regression, we pass the weighted sum of inputs through an activation function that can map
values in between 0 and 1. Such an activation function is known as the sigmoid function, and the curve
obtained is called the sigmoid curve or S-curve.
• The equation for logistic regression is log(y / (1 − y)) = b0 + b1x, or equivalently
y = 1 / (1 + e^−(b0 + b1x)) (a worked sketch follows this list).
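As a minimal worked sketch, here is a binary logistic regression fit on synthetic data with scikit-learn; note that the predicted probability lies between 0 and 1, and a 0.5 cut-off turns it into a class label:

```python
# Binary logistic regression on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 1))                                    # one predictor
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)  # binary labels

clf = LogisticRegression().fit(X, y)
proba = clf.predict_proba([[0.8]])[0, 1]   # P(y = 1) for x = 0.8
label = clf.predict([[0.8]])[0]            # class 1 if the probability exceeds 0.5
print(f"P(y=1 | x=0.8) = {proba:.2f}, predicted class = {label}")
```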

5.4.1 Features of Logistic Regression


Typical properties of the logistic regression equation include:
• Logistic regression’s dependent variable obeys ‘Bernoulli distribution’
• Estimation/prediction is based on ‘maximum likelihood.’
• Logistic regression does not evaluate the coefficient of determination (R squared) as
observed in linear regression. Instead, the model's fitness is assessed through concordance measures.
For example, the KS (Kolmogorov-Smirnov) statistic looks at the difference between cumulative events and
cumulative non-events to determine the efficacy of models in credit scoring.
While implementing logistic regression, one needs to keep in mind the following key assumptions:
1. The dependent/response variable is binary or dichotomous
The first assumption of logistic regression is that response variables can only take on two possible
outcomes – pass/fail, male/female, and malignant/benign.


This assumption can be checked by simply counting the unique outcomes of the dependent variable. If
more than two possible outcomes surface, then one can consider that this assumption is violated.
2. Little or no multicollinearity between the predictor/explanatory variables
This assumption implies that the predictor variables (or the independent variables) should be
independent of each other. Multicollinearity relates to two or more highly correlated independent
variables. Such variables do not provide unique information in the regression model and lead to wrongful
interpretation.
The assumption can be verified with the variance inflation factor (VIF), which determines the correlation
strength between the independent variables in a regression model.
3. Linear relationship of independent variables to log odds
Log odds refer to the ways of expressing probabilities. Log odds are different from probabilities. Odds
refer to the ratio of success to failure, while probability refers to the ratio of success to everything that can
occur.
For example, suppose you play twelve tennis games with your friend and win five. Here, the odds of you
winning are 5 to 7 (or 5/7), while the probability of you winning is 5 to 12 (as the total games played = 12).
4. Prefers large sample size
Logistic regression analysis yields reliable, robust, and valid results when a larger sample size of the
dataset is considered.
This assumption can be validated by taking into account a minimum of 10 cases considering the least
frequent outcome for each estimator variable. Let’s consider a case where you have three predictor
variables, and the probability of the least frequent outcome is 0.30. Here, the sample size would be (10*3)
/ 0.30 = 100.


5. Problem with extreme outliers


Another critical assumption of logistic regression is the requirement of no extreme outliers in the dataset.
This assumption can be verified by calculating Cook’s distance (Di) for each observation to identify
influential data points that may negatively affect the regression model. In situations when outliers exist,
one can implement the following solutions:
• Eliminate or remove the outliers
• Consider a value of mean or median instead of outliers, or
• Keep the outliers in the model but maintain a record of them while reporting the regression results

6. Consider independent observations


This assumption states that the dataset observations should be independent of each other. The observations
should not be related to each other or emerge from repeated measurements of the same individual type.
The assumption can be verified by plotting residuals against time, which signifies the order of
observations. The plot helps in determining the presence or absence of a random pattern. If a random
pattern is present or detected, this assumption may be considered violated.

5.4.2 Types of Logistic Regression


Logistic regression is classified into binary, multinomial, and ordinal. Each type differs from the other in
execution and theory. Let’s understand each type in detail.
1. Binary logistic regression
Binary logistic regression predicts the relationship between the independent and binary dependent
variables. Some examples of the output of this regression type may be, success/failure, 0/1, or true/false.
Examples:
1. Deciding on whether or not to offer a loan to a bank customer: Outcome = yes or no.
2. Evaluating the risk of cancer: Outcome = high or low.
3. Predicting a team’s win in a football match: Outcome = yes or no.
2. Multinomial logistic regression
A categorical dependent variable has two or more discrete outcomes in a multinomial regression type.
This implies that this regression type has more than two possible outcomes.
Examples:
1. Let’s say you want to predict the most popular transportation type for 2040. Here,
transport type equates to the dependent variable, and the possible outcomes can be electric
cars, electric trains, electric buses, and electric bikes.
2. Predicting whether a student will join a college, vocational/trade school, or corporate
industry.
3. Estimating the type of food consumed by pets, the outcome may be wet food, dry food, or
junk food.
3. Ordinal logistic regression
Ordinal logistic regression applies when the dependent variable is in an ordered state (i.e., ordinal). The
dependent variable (y) specifies an order with two or more categories or levels.
Examples: Dependent variables represent,
1. Formal shirt size: Outcomes = XS/S/M/L/XL
2. Survey answers: Outcomes = Agree/Disagree/Unsure
3. Scores on a math test: Outcomes = Poor/Average/Good

5.4.3 The Sigmoid Function and the Logistic Regression Equation


Logistic regression uses a logistic function called a sigmoid function to map predictions and their
probabilities. The sigmoid function refers to an S-shaped curve that converts any real value to a range
between 0 and 1.


Moreover, if the output of the sigmoid function (estimated probability) is greater than a predefined
threshold on the graph, the model predicts that the instance belongs to that class. If the estimated
probability is less than the predefined threshold, the model predicts that the instance does not belong to
the class.
Assume that, if the output of the sigmoid function is above 0.5, the output is considered as 1. On the other
hand, if the output is less than 0.5, the output is classified as 0. Also, if the graph goes further to the
negative end, the predicted value of y will be 0 and vice versa. In other words, if the output of the sigmoid
function is 0.65, it implies that there are 65% chances of the event occurring; a coin toss, for example.
The sigmoid function is referred to as an activation function for logistic regression and is defined as:

S(value) = 1 / (1 + e^(−value))

where,
e = base of natural logarithms
value = numerical value one wishes to transform
The following equation represents logistic regression:

y = e^(b0 + b1x) / (1 + e^(b0 + b1x))

here,
• x = input value
• y = predicted output
• b0 = bias or intercept term
• b1 = coefficient for input (x)
This equation is similar to linear regression, where the input values are combined linearly to predict an
output value using weights or coefficient values. However, unlike linear regression, the output value
modeled here is a binary value (0 or 1) rather than a numeric value.
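To see the sigmoid and the 0.5 decision threshold in action, here is a minimal sketch with hypothetical coefficients b0 and b1:

```python
# Sigmoid activation and the 0.5 classification threshold.
import numpy as np

def sigmoid(value):
    return 1.0 / (1.0 + np.exp(-value))

b0, b1 = -1.0, 2.0                    # hypothetical intercept and coefficient
x = np.array([-2.0, 0.0, 0.5, 2.0])
p = sigmoid(b0 + b1 * x)              # estimated probabilities, all in (0, 1)
labels = (p > 0.5).astype(int)        # classify with the 0.5 threshold
print(np.round(p, 2), labels)
```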

[Figure: Logistic Regression – Sigmoid Function]

Common Examples of Logistic Regression Used in Various Fields


1. Determine the probability of heart attacks: With the help of a logistic model, medical practitioners
can determine the relationship between variables such as the weight, exercise, etc., of an individual and
use it to predict whether the person will suffer from a heart attack or any other medical complication.


2. Possibility of enrolling into a university: Application aggregators can determine the probability of a
student getting accepted to a particular university or a degree course in a college by studying the
relationship between the estimator variables, such as GRE, GMAT, or TOEFL scores.
3. Identifying spam emails: Email inboxes are filtered to determine if the email communication is
promotional/spam by understanding the predictor variables and applying a logistic regression algorithm
to check its authenticity.

Linear Regression vs. Logistic Regression:
• Linear regression is used to predict a continuous dependent variable using a given set of independent
variables; logistic regression is used to predict a categorical dependent variable.
• Linear regression is used for solving regression problems; logistic regression is used for solving
classification problems.
• In linear regression, we predict the value of continuous variables; in logistic regression, we predict
the values of categorical variables.
• In linear regression, we find the best-fit line, by which we can easily predict the output; in logistic
regression, we find the S-curve, by which we can classify the samples.
• Linear regression uses the least-squares estimation method; logistic regression uses the maximum
likelihood estimation method.
• The output of linear regression must be a continuous value, such as price or age; the output of
logistic regression must be a categorical value, such as 0 or 1, Yes or No.
• Linear regression requires the relationship between the dependent and independent variables to be
linear; logistic regression does not require a linear relationship.
• In linear regression, there may be collinearity between the independent variables; in logistic
regression, there should not be collinearity between the independent variables.

5.5 Time Series Analysis- Modeling and Forecasting


Time series analysis is a specific way of analyzing a sequence of data points collected over an interval of
time. In time series analysis, analysts record data points at consistent intervals over a set period of time
rather than just recording the data points intermittently or randomly. However, this type of analysis is not
merely the act of collecting data over time.
What sets time series data apart from other data is that the analysis can show how variables change over
time. In other words, time is a crucial variable because it shows how the data adjusts over the course of
the data points as well as the final results. It provides an additional source of information and a set order
of dependencies between the data.
Time series analysis typically requires a large number of data points to ensure consistency and reliability.
An extensive data set ensures you have a representative sample size and that analysis can cut through
noisy data. It also ensures that any trends or patterns discovered are not outliers and can account for
seasonal variance. Additionally, time series data can be used for forecasting: predicting future data based
on historical data.


5.5.1 Time series modeling


Time series modeling is used for non-stationary data, things that are constantly fluctuating over time or
are affected by time. Industries like finance, retail, and economics frequently use time series analysis
because currency and sales are always changing. Stock market analysis is an excellent example of time
series analysis in action, especially with automated trading algorithms. Likewise, time series analysis is
ideal for forecasting weather changes, helping meteorologists predict everything from tomorrow’s weather
report to future years of climate change. Examples of time series analysis in action include:
• Weather data
• Rainfall measurements
• Temperature readings
• Heart rate monitoring (EKG)
• Brain monitoring (EEG)
• Quarterly sales
• Stock prices
• Automated stock trading
• Industry forecasts
• Interest rates

Models of time series analysis include:


• Classification: Identifies and assigns categories to the data.
• Curve fitting: Plots the data along a curve to study the relationships of variables within the data.
• Descriptive analysis: Identifies patterns in time series data, like trends, cycles, or seasonal
variation (see the sketch after this list).
• Explanative analysis: Attempts to understand the data and the relationships within it, as well as
cause and effect.
• Exploratory analysis: Highlights the main characteristics of the time series data, usually in a visual
format.
• Forecasting: Predicts future data. This type is based on historical trends. It uses the historical data
as a model for future data, predicting scenarios that could happen along future plot points.
• Intervention analysis: Studies how an event can change the data.
• Segmentation: Splits the data into segments to show the underlying properties of the source
information.
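As a minimal sketch of descriptive analysis, a synthetic monthly series (an upward trend plus a yearly cycle) can be split into trend, seasonal, and residual components with statsmodels' classical decomposition:

```python
# Decompose a synthetic monthly series into trend, seasonal and residual parts.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

idx = pd.date_range("2018-01-31", periods=60, freq="M")
trend = np.linspace(100, 160, 60)                       # upward trend
season = 10 * np.sin(2 * np.pi * np.arange(60) / 12)    # yearly cycle
noise = np.random.default_rng(0).normal(size=60)
series = pd.Series(trend + season + noise, index=idx)

parts = seasonal_decompose(series, model="additive")
print(parts.seasonal.head(12))   # the repeating 12-month seasonal pattern
```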

5.5.2 Time series Forecasting


Time series forecasting is one of the most applied data science techniques in business, finance, supply
chain management, production and inventory planning. Many prediction problems involve a time
component and thus require extrapolation of time series data, or time series forecasting. Time series
forecasting is also an important area of machine learning (ML) and can be cast as a supervised learning
problem. ML methods such as Regression, Neural Networks, Support Vector Machines, Random Forests
and XGBoost — can be applied to it. Forecasting involves taking models fit on historical data and using
them to predict future observations.
Time series forecasting means to forecast or to predict the future value over a period of time. It entails
developing models based on previous data and applying them to make observations and guide future
strategic decisions.
The future is forecast or estimated based on what has already happened. Time series adds a time order
dependence between observations. This dependence is both a constraint and a structure that provides a
source of additional information. Before we discuss time series forecasting methods, let’s define time
series forecasting more closely.


Time series forecasting is a technique for the prediction of events through a sequence of time. It predicts
future events by analyzing the trends of the past, on the assumption that future trends will hold similar to
historical trends. It is used across many fields of study in various applications including:
• Astronomy
• Business planning
• Control engineering
• Earthquake prediction
• Econometrics
• Mathematical finance
• Pattern recognition
• Resources allocation
• Signal processing
• Statistics
• Weather forecasting
Time series forecasting starts with a historical time series. Analysts examine the historical data and check
for patterns of time decomposition, such as trends, seasonal patterns, cyclic patterns and regularity. Many
areas within organizations including marketing, finance and sales use some form of time series forecasting
to evaluate probable technical costs and consumer demand. Models for time series data can have many
forms and represent different stochastic processes.

Models of time series forecasting include:


Time series models are used to forecast events based on verified historical data. Common types include
ARIMA, smoothing-based models, and moving averages; a forecasting sketch follows the list below. Not all
models will yield the same results for the same dataset, so it is critical to determine which one works best
based on the individual time series.
When forecasting, it is important to understand your goal. To narrow down the specifics of your predictive
modeling problem, ask questions about:
1. Volume of data available: more data is often more helpful, offering greater opportunity for
exploratory data analysis, model testing and tuning, and model fidelity.
2. Required time horizon of predictions: shorter time horizons are often easier to predict with
higher confidence than longer ones.
3. Forecast update frequency: forecasts might need to be updated frequently over time or might
need to be made once and remain static (updating forecasts as new information becomes available
often results in more accurate predictions).
4. Forecast temporal frequency: forecasts can often be made at lower or higher frequencies, which
allows harnessing down-sampling and up-sampling of data (this in turn can offer benefits while
modeling).

Time series analysis vs. time series forecasting


While time series analysis is about understanding the dataset, forecasting is about predicting its future values.
Time series analysis comprises methods for analyzing time series data in order to extract meaningful
statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future
values based on previously observed values.
The three aspects of predictive modeling are:
1. Sample data: the data that we collect that describes our problem with known relationships between
inputs and outputs.
2. Learn a model: the algorithm that we use on the sample data to create a model that we can later use
over and over again.
3. Making predictions: the use of our learned model on new data for which we don’t know the output.

Validating and testing a time series model

Among the factors that make time series forecasting challenging are:
• Time dependence of a time series: the basic assumption of a linear regression model, that the
observations are independent, doesn’t hold in this case. Due to the temporal dependencies in time series
data, time series forecasting cannot rely on the usual validation techniques. To avoid biased evaluations,
training data sets should contain only observations that occurred prior to the ones in the validation sets
(a minimal illustration follows this list). Once we have chosen the best model, we can fit it on the entire
training set and evaluate its performance on a separate test set that is subsequent in time.
• Seasonality in a time series: along with an increasing or decreasing trend, most time series show
some form of seasonality, i.e. variations specific to a particular time frame. Time series models can
outperform others on a particular dataset; one model that performs best on one type of dataset may not
perform the same for all others.
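
The sketch below illustrates the first point: time-ordered splits keep every training fold strictly earlier than its validation fold. It assumes numpy and scikit-learn; the series is a stand-in.

# Time-ordered validation: each training fold contains only
# observations that occur before its validation fold.
# Assumes numpy and scikit-learn; the data is a stand-in.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

y = np.arange(12)            # stand-in for a time series
X = y.reshape(-1, 1)         # stand-in features

for train_idx, val_idx in TimeSeriesSplit(n_splits=3).split(X):
    # train indices always precede validation indices in time
    print("train:", train_idx, "validate:", val_idx)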

Data classification
Further, time series data can be classified into two main categories:
• Stock time series data means measuring attributes at a certain point in time, like a static snapshot
of the information as it was.
• Flow time series data means measuring the activity of the attributes over a certain period, which
is generally part of the total whole and makes up a portion of the results.
Data variations
In time series data, variations can occur sporadically throughout the data:
• Functional analysis can pick out the patterns and relationships within the data to identify notable
events.
• Trend analysis means determining consistent movement in a certain direction. There are two types
of trends: deterministic, where we can find the underlying cause, and stochastic, which is random
and unexplainable.
• Seasonal variation describes events that occur at specific and regular intervals during the course
of a year.
• Serial dependence occurs when data points close together in time tend to be related.
Time series analysis and forecasting models must define the types of data relevant to answering the
business question. Once analysts have chosen the relevant data they want to analyze, they choose what
types of analysis and techniques are the best fit.

Common Time series modeling and forecasting used in research:


Time series methods refer to different ways to model time-ordered data. Common types include:
Autoregression (AR), Moving Average (MA), Autoregressive Moving Average (ARMA), Autoregressive
Integrated Moving Average (ARIMA), and Seasonal Autoregressive Integrated Moving-Average
(SARIMA). The important thing is to select the appropriate forecasting method based on the
characteristics of the time series data.
• Box-Jenkins ARIMA models: These univariate models are used to better understand a single
time-dependent variable, such as temperature over time, and to predict future data points of that
variable. They assume the data is stationary, so analysts must difference the data and remove as
much seasonality from past data points as they can. Helpfully, the ARIMA model includes terms
for moving averages, seasonal difference operators, and autoregressive components.
• Box-Jenkins Multivariate Models: Multivariate models are used to analyze more than one time-
dependent variable, such as temperature and humidity, over time.
• Holt-Winters Method: The Holt-Winters method is an exponential smoothing technique designed
to predict outcomes when the data points include seasonality (a minimal sketch follows below).
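
As a minimal sketch of the Holt-Winters method named above, the following Python example fits a triple exponential smoothing model to a seasonal monthly series and forecasts six months ahead. It assumes pandas and statsmodels; the demand figures are hypothetical.

# Holt-Winters (triple exponential smoothing) on a seasonal series.
# Assumes pandas and statsmodels; the numbers are hypothetical.
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Three years of hypothetical monthly demand: seasonality plus a trend
index = pd.date_range("2020-01-01", periods=36, freq="MS")
base = [120, 115, 130, 140, 150, 170, 180, 175, 150, 135, 125, 160]
demand = pd.Series([base[i % 12] + 2 * i for i in range(36)], index=index)

model = ExponentialSmoothing(
    demand, trend="add", seasonal="add", seasonal_periods=12
).fit()
print(model.forecast(6))   # forecast the next six months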

5.6 Introduction to Big Data

Big Data refers to data sets so massive that they cannot be stored, processed, or analyzed using
traditional tools. The history of Big Data analytics can be traced back to the early days of computing,
when organizations first began using computers to store and analyze large amounts of data. However, it
was not until the late 1990s and early 2000s that Big Data analytics really began to take off, as
organizations increasingly turned to computers to help them make sense of the rapidly growing volumes
of data being generated by their businesses.
Today, Big Data analytics has become an essential tool for organizations of all sizes across a wide
range of industries. By harnessing the power of Big Data, organizations are able to gain insights into their
customers, their businesses, and the world around them that were simply not possible before. As the field
of Big Data analytics continues to evolve, we can expect to see even more transformative applications of
this technology in the years to come.
There are millions of data sources that generate data at a very rapid rate. These data sources are
present across the world. Some of the largest sources of data are social media platforms and networks.
Take Facebook as an example: it generates more than 500 terabytes of data every day. This data
includes pictures, videos, messages, and more.
Data also exists in different formats, like structured data, semi-structured data, and unstructured data. For
example, in a regular Excel sheet, data is classified as structured data with a definite format. In contrast,
emails fall under semi-structured, and your pictures and videos fall under unstructured data. All this data
combined makes up Big Data.

5.6.1 Big Data Analytics

Big Data analytics is the process of extracting meaningful insights from large data sets, such as hidden
patterns, unknown correlations, market trends, and customer preferences. It offers various advantages: it
can be used for better decision-making and for preventing fraudulent activities, among other things.

Features of Big Data Analytics


Doug Laney originally characterized Big Data by three Vs (volume, velocity, and variety); value and
veracity have since been added, giving the commonly cited 5 Vs of Big Data:
Volume refers to the amount of data being collected. The data could be structured or unstructured.
Velocity refers to the rate at which data is coming in.
Variety refers to the different kinds of data (data types, formats, etc.) that are coming in for analysis.
Value refers to the usefulness of the collected data.
Veracity refers to the quality of the data coming in from different sources.

5.6.2 General Application and Examples of Big Data Analytics


There are many different ways that Big Data analytics can be used in order to improve businesses and
organizations. Here are some examples:
• Using analytics to understand customer behavior in order to optimize the customer experience
• Predicting future trends in order to make better business decisions
• Improving marketing campaigns by understanding what works and what doesn't
• Increasing operational efficiency by understanding where bottlenecks are and how to fix them
• Detecting fraud and other forms of misuse sooner
These are just a few examples; the possibilities are nearly endless when it comes to Big Data analytics.
It all depends on how you want to use it to improve your business.

5.6.3 Other Applications in Business


Big Data helps corporations make better and faster decisions, because they have more information
available to solve problems and more data on which to test their hypotheses.
Demand forecasting has become more accurate as more and more data is collected about customer
purchases. This data lets companies build forecasting models, predict future demand, and scale production
accordingly. It helps companies, especially manufacturers, reduce the cost of storing unsold inventory in
warehouses.
Big data also has extensive use in applications such as product development and fraud detection.
Customer experience is a major field that has been revolutionized with the advent of Big Data.
Companies are collecting more data about their customers and their preferences than ever. This data is
being leveraged in a positive way, by giving personalized recommendations and offers to customers, who
are more than happy to allow companies to collect this data in return for the personalized services. The
recommendations you get on Netflix, or Amazon/Flipkart are a gift of Big Data!
Machine Learning is another field that has benefited greatly from the increasing popularity of Big Data.
More data means larger datasets for training our ML models, and a better-trained model generally
results in better performance. Also, with the help of Machine Learning, we are now able to automate
tasks that were earlier being done manually.

5.6.4 Use Cases of Big Data Analytics


1. Risk Management
Use Case: Banco de Oro, a Philippine banking company, uses Big Data analytics to identify fraudulent
activities and discrepancies. The organization leverages it to narrow down a list of suspects or root causes
of problems.
2. Product Development and Innovations
Use Case: Rolls-Royce, one of the largest manufacturers of jet engines for airlines and armed forces across
the globe, uses Big Data analytics to analyze how efficient the engine designs are and if there is any need
for improvements.
3. Quicker and Better Decision Making Within Organizations
Use Case: Starbucks uses Big Data analytics to make strategic decisions. For example, the company
leverages it to decide if a particular location would be suitable for a new outlet or not. They will analyze
several different factors, such as population, demographics, accessibility of the location, and more.
4. Improve Customer Experience
Use Case: Delta Air Lines uses Big Data analysis to improve customer experiences. They monitor tweets
to find out their customers’ experience regarding their journeys, delays, and so on. The airline identifies
negative tweets and does what’s necessary to remedy the situation. By publicly addressing these issues
and offering solutions, it helps the airline build good customer relations.

5.6.5 Functions of Big Data Analytics


1. Collect Data
Data collection looks different for every organization. With today’s technology, organizations can gather
both structured and unstructured data from a variety of sources, from cloud storage to mobile
applications to in-store IoT sensors and beyond. Some data will be stored in data
warehouses where business intelligence tools and solutions can access it easily. Raw or unstructured data
that is too diverse or complex for a warehouse may be assigned metadata and stored in a data lake.

2. Process Data
Once data is collected and stored, it must be organized properly to get accurate results on analytical
queries, especially when it’s large and unstructured. Available data is growing exponentially, making data
processing a challenge for organizations. One processing option is batch processing, which looks at large
data blocks over time. Batch processing is useful when there is a longer turnaround time between
collecting and analyzing data. Stream processing looks at small batches of data at once, shortening the
delay time between collection and analysis for quicker decision-making. Stream processing is more
complex and often more expensive.
3. Clean Data
Data big or small requires scrubbing to improve data quality and get stronger results; all data must be
formatted correctly, and any duplicative or irrelevant data must be eliminated or accounted for. Dirty data
can obscure and mislead, creating flawed insights.
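
To make the cleaning step concrete, here is a minimal pandas sketch that standardizes formats and removes duplicate or incomplete rows before analysis. The column names and records are hypothetical.

# Scrubbing a small dataset: standardize formats, drop incomplete
# and duplicate rows. Assumes pandas; the records are made up.
import pandas as pd

raw = pd.DataFrame({
    "customer": [" Alice ", "BOB", "Alice", None],
    "amount":   ["100", "250", "100", "75"],
})

clean = (
    raw.dropna(subset=["customer"])    # drop rows missing a customer
       .assign(customer=lambda d: d["customer"].str.strip().str.title(),
               amount=lambda d: pd.to_numeric(d["amount"]))
       .drop_duplicates()              # remove duplicate records
)
print(clean)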
4. Analyze Data
Getting big data into a usable state takes time. Once it’s ready, advanced analytics processes can turn big
data into big insights. Some of these big data analysis methods include:
• Data mining sorts through large datasets to identify patterns and relationships by identifying
anomalies and creating data clusters.
• Predictive analytics uses an organization’s historical data to make predictions about the future,
identifying upcoming risks and opportunities.
• Deep learning imitates human learning patterns by using artificial intelligence and machine
learning to layer algorithms and find patterns in the most complex and abstract data.

5.6.6 The Lifecycle Phases of Big Data Analytics


Now, let’s discuss how Big Data analytics works:
• Stage 1 - Business case evaluation - The Big Data analytics lifecycle begins with a business case,
which defines the reason and goal behind the analysis.
• Stage 2 - Identification of data - Here, a broad variety of data sources are identified.
• Stage 3 - Data filtering - All of the identified data from the previous stage is filtered here to remove
corrupt data.
• Stage 4 - Data extraction - Data that is not compatible with the tool is extracted and then
transformed into a compatible form.
• Stage 5 - Data aggregation - In this stage, data with the same fields across different datasets are
integrated.
• Stage 6 - Data analysis - Data is evaluated using analytical and statistical tools to discover useful
information.
• Stage 7 - Visualization of data - With tools like Tableau, Power BI, and QlikView, Big Data
analysts can produce graphic visualizations of the analysis.
• Stage 8 - Final analysis result - This is the last step of the Big Data analytics lifecycle, where the
final results of the analysis are made available to business stakeholders who will take action.

5.6.7 Different Types of Big Data Analytics


There are four types of Big Data analytics. The first three are described below; the fourth, prescriptive
analytics, is covered in Section 5.8.
1. Descriptive Analytics
This summarizes past data into a form that people can easily read. This helps in creating reports, like a
company’s revenue, profit, sales, and so on. Also, it helps in the tabulation of social media metrics.
Use Case: The Dow Chemical Company analyzed its past data to increase facility utilization across its
office and lab space. Using descriptive analytics, Dow was able to identify underutilized space. This space
consolidation helped the company save nearly US $4 million annually.
2. Diagnostic Analytics

This is done to understand what caused a problem in the first place. Techniques like drill-down, data
mining, and data recovery are all examples. Organizations use diagnostic analytics because it provides
in-depth insight into a particular problem.
Use Case: An e-commerce company’s report shows that their sales have gone down, although customers
are adding products to their carts. This can be due to various reasons like the form didn’t load correctly,
the shipping fee is too high, or there are not enough payment options available. This is where you can use
diagnostic analytics to find the reason.
3. Predictive Analytics
This type of analytics looks into the historical and present data to make predictions of the future. Predictive
analytics uses data mining, AI, and machine learning to analyze current data and make predictions about
the future. It works on predicting customer trends, market trends, and so on.
Use Case: PayPal determines what kind of precautions they have to take to protect their clients against
fraudulent transactions. Using predictive analytics, the company uses all the historical payment data and
user behavior data and builds an algorithm that predicts fraudulent activities.

5.6.8 Big Data Business Applications


• Customer acquisition and retention. Consumer data can help the marketing efforts of companies,
which can act on trends to increase customer satisfaction. For example, personalization engines for
Amazon, Netflix and Spotify can provide improved customer experiences and create customer
loyalty.
• Targeted ads. Personalization data from sources such as past purchases, interaction patterns and
product page viewing histories can help generate compelling targeted ad campaigns for users on
the individual level and on a larger scale.
• Product development. Big data analytics can provide insights to inform about product viability,
development decisions, progress measurement and steer improvements in the direction of what fits
a business' customers.
• Price optimization. Retailers may opt for pricing models that use and model data from a variety of
data sources to maximize revenues.
• Supply chain and channel analytics. Predictive analytical models can help with preemptive
replenishment, B2B supplier networks, inventory management, route optimizations and the
notification of potential delays to deliveries.
• Risk management. Big data analytics can identify new risks from data patterns for effective risk
management strategies.
• Improved decision-making. Insights business users extract from relevant data can help
organizations make quicker and better decisions.

5.7 Predictive Analytics


Predictive analytics is a branch of advanced analytics that makes predictions about future outcomes using
historical data combined with statistical modeling, data mining techniques and machine learning.
Companies employ predictive analytics to find patterns in this data to identify risks and opportunities.
Predictive analytics is often associated with big data and data science.
Today, companies are inundated with data, from log files to images and video, and all of this data
resides in disparate data repositories across an organization. To gain insights from this data, data scientists
use deep learning and machine learning algorithms to find patterns and make predictions about future
events. Some of these statistical techniques include logistic and linear regression models, neural
networks and decision trees. Some of these modeling techniques use initial predictive learnings to make
additional predictive insights.

5.7.1 Importance of Predictive Analytics


Organizations are turning to predictive analytics to help solve difficult problems and uncover new
opportunities. Common uses include:
Detecting fraud. Combining multiple analytics methods can improve pattern detection and prevent
criminal behavior. As cybersecurity becomes a growing concern, high-performance behavioral analytics
examines all actions on a network in real time to spot abnormalities that may indicate fraud, zero-day
vulnerabilities and advanced persistent threats.
Optimizing marketing campaigns. Predictive analytics are used to determine customer responses or
purchases, as well as promote cross-sell opportunities. Predictive models help businesses attract, retain
and grow their most profitable customers.
Improving operations. Many companies use predictive models to forecast inventory and manage
resources. Airlines use predictive analytics to set ticket prices. Hotels try to predict the number of guests
for any given night to maximize occupancy and increase revenue. Predictive analytics enables
organizations to function more efficiently.
Reducing risk. Credit scores are used to assess a buyer’s likelihood of default for purchases and are a
well-known example of predictive analytics. A credit score is a number generated by a predictive model
that incorporates all data relevant to a person’s creditworthiness. Other risk-related uses include insurance
claims and collections.

5.7.2 Functions of Predictive Analytics


According to the Harvard Business Review, successful predictive analytics strategies need three things.
1. Data – The most common barrier faced by organizations trying to implement predictive analytics
is a lack of reliable data.
2. Statistics – Regression analysis, which estimates relationships among different variables, is the
primary tool organizations use for predictive analytics (a minimal sketch follows below).
3. Assumptions – Every predictive model has an assumption behind it, and it is important to know
what that assumption is and monitor whether it is still true. The general assumption in predictive
analytics is that the future will continue to mimic the past.
Businesses that are able to gather enough relevant data, develop the right type of statistical model and
monitor their assumptions carefully will typically produce more accurate predictions of the future.
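
As a minimal illustration of the regression point above, the sketch below fits a simple linear model relating advertising spend to sales. It assumes numpy and scikit-learn; the figures are hypothetical.

# Simple linear regression: estimate the spend-sales relationship.
# Assumes numpy and scikit-learn; the figures are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

ad_spend = np.array([[10], [15], [20], [25], [30]])   # e.g. $ thousands
sales = np.array([120, 150, 205, 240, 290])           # e.g. units sold

model = LinearRegression().fit(ad_spend, sales)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("predicted sales at 35:", model.predict([[35]]))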

5.7.3 Types of Predictive Modeling


Predictive analytics models are designed to assess historical data, discover patterns, observe trends,
and use that information to predict future trends. Popular predictive analytics models include
classification, clustering, and time series models.
Classification models
Classification models fall under the branch of supervised machine learning models. These models
categorize data based on historical data, describing relationships within a given dataset. For example, this
model can be used to classify customers or prospects into groups for segmentation purposes. Alternatively,
it can be used to answer questions with binary outputs, such as yes or no and true or false; popular use
cases for this are fraud detection and credit risk evaluation. Types of classification models include logistic
regression, decision trees, random forest, neural networks, and Naïve Bayes.
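
Here is a minimal, hypothetical sketch of a classification model answering a binary (yes/no) question in the fraud-detection spirit described above. It assumes scikit-learn; the features and labels are made up.

# Logistic regression answering a binary question (risky or not).
# Assumes scikit-learn; the features and labels are hypothetical.
from sklearn.linear_model import LogisticRegression

# Each row: [transaction amount, hour of day]; label 1 = flagged risky
X = [[25, 14], [900, 3], [40, 11], [1200, 2], [60, 16], [1500, 4]]
y = [0, 1, 0, 1, 0, 1]

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict([[1000, 3]]))        # predicted class for new data
print(clf.predict_proba([[30, 13]]))   # class probabilities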
Clustering models
Clustering models fall under unsupervised learning. They group data based on similar attributes. For
example, an e-commerce site can use the model to separate customers into similar groups based on
common features and develop marketing strategies for each group. Common clustering algorithms include
k-means clustering, mean-shift clustering, density-based spatial clustering of applications with noise
(DBSCAN), expectation-maximization (EM) clustering using Gaussian Mixture Models (GMM), and
hierarchical clustering.
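
The following minimal sketch shows k-means grouping customers by annual spend and order frequency for segmentation, as described above. It assumes numpy and scikit-learn; the customer data is hypothetical.

# K-means clustering for customer segmentation.
# Assumes numpy and scikit-learn; the data is hypothetical.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [annual spend, orders per year]
customers = np.array([[200, 2], [250, 3], [230, 2],
                      [1500, 20], [1600, 25], [1450, 22]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)           # segment assignment per customer
print(kmeans.cluster_centers_)  # average profile of each segment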
Time series models
Time series models use various data inputs at a specific time frequency, such as daily, weekly, monthly,
et cetera. It is common to plot the dependent variable over time to assess the data for seasonality, trends,
and cyclical behavior, which may indicate the need for specific transformations and model types.
Autoregressive (AR), moving average (MA), ARMA, and ARIMA models are all frequently used time
series models. As an example, a call center can use a time series model to forecast how many calls it will
receive per hour at different times of day.

5.7.4 Business Use Cases of Predictive Analytics


Predictive analytics can be deployed across various industries for different business problems. Below
are a few industry use cases that illustrate how predictive analytics can inform decision-making in real-
world situations.
• Banking: Financial services use machine learning and quantitative tools to predict credit risk
and detect fraud. As an example, BondIT is a company that specializes in fixed-income asset-
management services. Predictive analytics allows them to support dynamic market changes in real-
time in addition to static market constraints. This use of technology allows it to both customize
personal services for clients and to minimize risk.
• Healthcare: Predictive analytics in health care is used to detect and manage the care of
chronically ill patients, as well as to track specific infections such as sepsis. Geisinger Health used
predictive analytics to mine health records to learn more about how sepsis is diagnosed and
treated. Geisinger created a predictive model based on health records for more than 10,000
patients who had been diagnosed with sepsis in the past. The model yielded impressive results,
correctly predicting patients with a high rate of survival.
• Human resources (HR): HR teams use predictive analytics and employee survey metrics to
match prospective job applicants, reduce employee turnover and increase employee engagement.
This combination of quantitative and qualitative data allows businesses to reduce their recruiting
costs and increase employee satisfaction, which is particularly useful when labor markets are
volatile.
• Marketing and sales: While marketing and sales teams are very familiar with business
intelligence reports to understand historical sales performance, predictive analytics enables
companies to be more proactive in the way that they engage with their clients across the customer
lifecycle. For example, churn predictions can enable sales teams to identify dissatisfied clients
sooner, enabling them to initiate conversations to promote retention. Marketing teams can leverage
predictive data analysis for cross-sell strategies, and this commonly manifests itself through a
recommendation engine on a brand’s website.
• Supply chain: Businesses commonly use predictive analytics to manage product inventory and
set pricing strategies. This type of predictive analysis helps companies meet customer demand
without overstocking warehouses. It also enables companies to assess the cost and return on their
products over time. If one part of a given product becomes more expensive to import, companies
can project the long-term impact on revenue if they do or do not pass on additional costs to their
customer base. For a deeper look at a case study, you can read more about how FleetPride used
this type of data analytics to inform their decision making on their inventory of parts for excavators
and tractor trailers. Past shipping orders enabled them to plan more precisely to set appropriate
supply thresholds based on demand.
• Forecasting: Forecasting is essential in manufacturing because it ensures the optimal utilization
of resources in a supply chain. Critical spokes of the supply chain wheel, whether it is inventory
management or the shop floor, require accurate forecasts for functioning. Predictive modeling is
often used to clean and optimize the quality of data used for such forecasts. Modeling ensures
that more data can be ingested by the system, including from customer-facing operations, to
ensure a more accurate forecast.
• Credit: Credit scoring makes extensive use of predictive analytics. When a consumer or business
applies for credit, data on the applicant's credit history and the credit record of borrowers with
similar characteristics are used to predict the risk that the applicant might fail to perform on any
credit extended.
• Underwriting: Data and predictive analytics play an important role in underwriting. Insurance
companies examine policy applicants to determine the likelihood of having to pay out for a
future claim based on the current risk pool of similar policyholders, as well as past events that
have resulted in payouts. Predictive models that consider characteristics in comparison to data
about past policyholders and claims are routinely used by actuaries.
• Fraud Detection: Financial services can use predictive analytics to examine transactions, trends,
and patterns. If any of this activity appears irregular, an institution can investigate it for fraudulent
activity. This may be done by analyzing activity between bank accounts or analyzing when certain
transactions occur.

5.7.5 Benefits of Predictive Modeling in Business


An organization that knows what to expect based on past patterns has a business advantage in managing
inventories, workforce, marketing campaigns, and most other facets of operation.
• Security: Every modern organization must be concerned with keeping data secure. A
combination of automation and predictive analytics improves security. Specific patterns associated
with suspicious and unusual end user behavior can trigger specific security procedures.
• Risk reduction: In addition to keeping data secure, most businesses are working to reduce their
risk profiles. For example, a company that extends credit can use data analytics to better understand
if a customer poses a higher-than-average risk of defaulting. Other companies may use predictive
analytics to better understand whether their insurance coverage is adequate.
• Operational efficiency: More efficient workflows translate to improved profit margins. For
example, understanding when a vehicle in a fleet used for delivery is going to need maintenance
before it’s broken down on the side of the road means deliveries are made on time, without the
additional costs of having the vehicle towed and bringing in another employee to complete the
delivery.
• Improved decision making: Running any business involves making calculated decisions. Any
expansion or addition to a product line or other form of growth requires balancing the inherent risk
with the potential outcome. Predictive analytics can provide insight to inform the decision-making
process and offer a competitive advantage.

5.8 Prescriptive Analytics


This type of analytics prescribes the solution to a particular problem. Prescriptive analytics works with
both descriptive and predictive analytics and, most of the time, relies on AI and machine learning.

Use Case: Prescriptive analytics can be used to maximize an airline’s profit. This type of analytics is used
to build an algorithm that will automatically adjust the flight fares based on numerous factors, including
customer demand, weather, destination, holiday seasons, and oil prices.

5.9 Cloud Analytics


Cloud analytics is a term that refers to using cloud-based analytics tools to evaluate information.
With these systems, you can connect and upload data to find new insights and trends that will drive better
decision-making and business results.
Cloud analytics refers to the manipulation and analysis of data that happens in the cloud instead
of locally, in an on-premises business system. Analytics systems hosted in the cloud empower users to
access, aggregate, analyze and utilize data. With these platforms, users can work with large data sets,
identify trends and pinpoint areas for improvement across the organization.
While your laptop’s spreadsheet program or on-premises analytics may be sufficient for a list of a
few thousand data points, those needing to analyze complex data sets made of tens of thousands to millions
of inputs won’t find much success with such a program. Cloud analytics allows companies to process
large data sets in a scalable, more affordable way than building infrastructure to handle the processing on-
site. That’s just one reason to look to the power of the cloud for your data analytics needs.

5.9.1 Cloud Analytics Infrastructure


To support their cloud data analytics needs, most organizations use a mix of cloud types and providers to
gain the most benefit. Here we take a quick look at the four main data service options.
1. Data On-Premises: In this approach, data is stored on local hardware (servers or computers)
which is typically set up at the company’s headquarters. Organizations with highly sensitive data,
such as banks or medical organizations, may choose an on-premises solution for maximum
security.
2. Private cloud: Usually behind a firewall, private cloud services and hardware are dedicated to one
company and provide increased data governance and control. As with on-premises, organizations
with highly sensitive data might select a private cloud approach for increased security.
3. Public cloud: Here you get higher performance and manageability with lower costs. Services are
shared by multiple companies, but each company’s data and applications are hidden from each
other.
4. Hybrid and Multi-cloud: Many organizations use a mixture of different Public, Private and even
on-premises approaches to optimize security, scalability and total cost of ownership. Best-in-class
analytics solutions embrace this approach so organizations can choose which data is stored where
and where analytics occurs.

5.9.2 Cloud Analytics Tools


There are several types of cloud analytics tools. Many of these can be easily accessed through your web
browser. Here are some examples of a few popular types of cloud analytics tools:
• Website Analytics: One of the most common types of cloud analytics is website traffic analytics.
These cloud analytics tools help you understand a website’s traffic, conversion rate, bounce rate and
more so you can make adjustments that improve the user experience, boosting revenue and
profitability.

• Sales Analytics: Sales analytics platforms help you manage customers, leads, evaluate sales across
geographies, and monitor the performance of your sales team. This information can reveal important
trends or signals that help leaders develop more effective sales strategies.
• Financial Analytics: Financial analytics go beyond financial statements to draw out revenue and
expense trends and details in your financial results that would be impossible to find without a large
team of financial analysts.
• Performance Analytics: Performance analytics look at sales, production or other data to find
bottlenecks, sources of expenses and improvement opportunities.
Among any of these categories of tools, you may find more specialized software and varying features.
Basic tools may simply summarize and help you understand data, while more advanced tools leverage
technology like machine learning and artificial intelligence (AI) to analyze large volumes of data and
make predictions based on all of that information.

5.9.3 Advantages of Cloud Analytics


Here are seven advantages of cloud analytics that all business owners and managers should consider when
choosing an analytics solution.
• Growth and Scalability: Cloud platforms offer powerful capabilities on-demand with near-limitless
flexibility. These resources are available as needed for scaling up or scaling down, without any need
to purchase, set up or maintain any of your own servers or other resources as your needs shift with
growth.
• Unified Approach From Anywhere: When your finance, IT, marketing and sales teams all manage
their own database and use different analysis tools, workers may need to be trained on multiple
systems. As a result, they waste time looking for data from disparate systems or need to be trained on
multiple solutions. Cloud analytics pulled from your company’s ERP system gives your entire
workforce a single source of information and analytics, no matter where they’re working from.
• Break Down Silos: When your entire staff uses one system, it facilitates collaboration between
different departments. Even though users only get access to the information they need, teams can more
easily communicate across departments to find more useful and valuable insights.
• Cost Reduction: The costs of housing and maintaining on-premises systems include IT headcount,
hardware, and development efforts; cloud analytics means just one bill and, in many cases, lower
total costs.
• Find Answers Faster: The more powerful servers available through cloud computing, as opposed to
on-premises systems or employee laptops, allow for faster data processing.
• Increased Sharing and Collaboration: The cloud is inherently better for sharing. Instead of email
attachments, network drives and confusion about multiple versions of files, workers can work from
the same data sets and share reports in a few clicks.
• Improved Security: With most cloud analytics, data is regularly backed up to servers in multiple
locations so it’s protected during a fire or natural disaster. Since no information is stored locally, there
are no local hard drives to steal, and sensitive data is not shared through insecure methods like email
or flash drives. All data is password-protected, users only have access to what they need, and audit
logs provide visibility into who accessed what and when and what they did with it.

5.9.4 Features of Cloud Analytics Tools


Cloud analytics systems must be hosted on an internet platform. In most cases, they run in state-of-
the-art data centers that can provide the processing power and storage space needed for analyzing massive
amounts of data. In cloud analytics systems, all generated data is collected and securely stored in the cloud,
where it can be accessed from any internet-connected device. Let us look at the common features that
can boost your productivity and results in analytics:
• Data sources: These are the various sources from which your business data originates. Common
examples include web usage and social media data, as well as data from CRM and ERP systems.

• Data models: A data model structures data and standardizes how data points relate to each other
for analysis. Models can be simple (using data from a single column of a spreadsheet, for example)
or complex, involving several triggers and parameters across multiple dimensions.
• Processing applications: Cloud analytics uses special applications to process huge volumes of
information stored in a data warehouse and reduce time to insight (more on this below).
• Computing power: Cloud analytics requires sufficient computing power to intake, clean,
structure and analyze large volumes of data.
• Analytic models: These are mathematical models that can be used to analyze complex data sets
and predict outcomes.
• Data sharing and storage: Cloud analytics solutions offer data warehousing as a service so that
the business can scale quickly and easily.
In addition to these features, AI is becoming a more integral part of cloud analytics. Machine
learning algorithms, in particular, enable cloud analytics systems to learn on their own and more
accurately predict future outcomes.

5.9.5 Considerations for Deploying Cloud Analytics


• Computing Power: The first step of implementing cloud analytics is ensuring that you have the
required raw computing power to ingest, structure, and analyze data at scale.
• Data Sources: A powerful cloud analytics solution must be able to capture data from various data
sources, including company websites, ERPs, CRMs, social media platforms, the Internet of Things
(IoT), mobile apps, and more.
• Data Models: Using data models to move data from on-premises systems to the cloud can help
minimize business disruption. Cloud-based data models standardize how elements of data relate to
one another and determine the structure of the data.
• Processing Applications: There must be applications in place that can process the large data sets
coming in from different systems. Organizations need a data processing framework for their cloud-
based environments; Apache Spark, Google BigQuery, and Hadoop are some options for developing
such applications (a minimal sketch follows this list).
• Analytic Models: You also need to develop models for predictive analytics and other advanced
analytic functions to run in the cloud.
• Data Sharing: Cloud analytics solutions should also support easy data sharing and storage through a
modern cloud architecture.
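
To make the processing-applications point concrete, here is a minimal, hypothetical PySpark sketch that aggregates a large order file in the cloud. It assumes a working Spark installation; the bucket path and column names are invented for illustration.

# A hypothetical cloud processing job: read a large order file and
# aggregate revenue by region. Assumes PySpark is installed; the
# path and column names are made up.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cloud-analytics-demo").getOrCreate()

orders = spark.read.csv("s3://example-bucket/orders.csv",
                        header=True, inferSchema=True)

revenue_by_region = (
    orders.groupBy("region")
          .agg(F.sum("amount").alias("total_revenue"))
          .orderBy(F.desc("total_revenue"))
)
revenue_by_region.show()
spark.stop()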

5.9.6 Selecting the Perfect Cloud Analytics Platform


Not all cloud-based analytics tools are created equal. It’s important to choose the one that best fits your
data requirements, business needs, and IT ecosystem. Here are ten things to look for in a cloud-based
analytics solution.
1. Truly cloud-based.
Many platforms claim to be cloud – and then require local software for development. If creating the
analytics requires local software, then it isn’t cloud analytics. Performing the development in the cloud is
not only easier for users; it also lowers security risk by removing the need to create local copies of the
data. A true cloud provider will also take on support, infrastructure costs and management, automatic
updates, and disaster recovery – so you can offload those internal management costs and focus on
analytics.
2. Enabling cloud choice: public, private, multi-cloud, and hybrid.
You may already be using multiple clouds to manage data and run applications. At the same time, to
comply with security regulations, you’re probably also keeping some analytics development and
consumption on-premises, or in a virtual private cloud. With dispersed architecture, you’ll want the
flexibility to bring analytics to your data and run analytics computing in the cloud of your choice.
3. Accommodating data gravity.

Many SaaS analytics vendors require you to move your data to their cloud. But moving your data can be
expensive – and by distancing your data from your users, you can introduce latency and performance
issues, too. Search for a solution that lets you keep your data wherever it’s most productive. You’ll want
to avoid getting locked into a single vendor, where your options will dwindle.
4. Single point of entry.
As with any SaaS solution, adoption is key. Make it easy for users by opting for a platform with a single
point of entry for login. Administrators and IT leaders also need a simple way to manage data analytics
across different clouds, regions, and users. Make sure they’ll be able to manage the entire deployment
from one management console – and easily change the deployment model at any time.
5. Self-service and readily available data, at scale, for all.
You shouldn’t have to be a coding pro to get in-depth insights about your data. The best cloud-based
analytics solutions give business users easy access to data through a catalog, a simple user interface where
they can “shop for” and select datasets, easily viewing lineage. The solutions also provide intuitive ways
to get insights, allowing users to explore and analyze in all possible contexts, without limitations.
6. Performance and scalability.
Most analytics solutions struggle with performance. That’s because they’re query-based, restricting users
to predetermined paths in the data and requiring them to reformulate queries whenever they want to pivot.
Look for a high-performing solution that can calculate analytics quickly even when used simultaneously
by a great number of users. And make sure that scaling capacity in any direction will be straightforward
and fast.
7. Augmented analytics.
AI capabilities are becoming increasingly integral to analytics, and different platforms employ them
differently. Instead of black-box AI that operates independently, look for a solution that uses AI to
augment the user experience with things like insight suggestions and natural language interactions. That
gives you the best of both worlds: machine intelligence that augments human intuition and understanding.
8. Orchestration across your cloud ecosystem.
Automation is another tool that’s vastly accelerating analytics delivery and augmenting insight discovery.
AI can speed time-to-insight by automating a wide variety of tasks for the user, including combining data
sets, preparing and transforming data, and creating visualizations.
9. Fully interactive mobile analytics.
From laptops to smartphones, the best cloud analytics solutions provide users with a consistent,
comprehensive experience. This includes the ability to analyze and share data and apps from anywhere.
10. A secure, enterprise-class experience with governed collaboration.
Your cloud analytics platform should allow you to easily assign and change permissions, so your data
stays secure and the right people have access. And when you’re evaluating moving workloads to a SaaS
platform, it’s vital to know that the service provider is following open and audited processes for security
controls.

5.9.7 Benefits of Cloud Analytics in Information Management


Improving Decision Making
This is the main benefit. With cloud analytics, analysis becomes faster and produces effective results in
less time, so organizations can obtain valuable information that improves the quality of their decisions.
When the IT department does all of its own data management, it is hard to make quick, informed decisions.
With cloud analytics, IT professionals can focus on the logic, leaving infrastructure concerns aside.
Timely information then helps you achieve your goals: attract new customers, increase revenue, and bring
your business to a new level faster.
Reducing Spending

While costs may vary from company to company, cloud solutions can be more cost-effective compared
to other IT infrastructures. This is because, with cloud computing, you only pay for what you use.
Consequently, providers supply tools that help businesses save on IT infrastructure costs.
Cost savings are usually achieved through the following methods:
• Access to more versatile IT services
• Replacement of local equipment and server rooms
• Additional services are provided only on request
• Improving operational efficiency
Migrating to cloud infrastructure requires an initial investment. But with the right spending management
strategy, good ROI can be achieved.
Improving Flexibility
At a fundamental level, cloud analytics is more flexible than local computing. There is no need to integrate
additional physical resources if you need to scale up. In fact, in many cloud computing schemes, you can
access more resources in real-time when the need arises and then cut back to save.
For example, suppose you run an e-shop and know there will be more activity on Black Friday. Instead of
maintaining extra servers all year round, the cloud service can provide the additional capacity when it is needed.
Security Improvement
Many people think that data stored in the cloud is insecure. Business owners want to know how secure
their files, programs, and data are in the cloud: what will prevent hackers from accessing this data?
Service providers guarantee security. Combining the best security tools, from encryption of data at rest
and in transit to layered security settings, can provide a very reliable infrastructure. Even if a data leak
occurs, it can be detected within minutes.
But users of cloud services should also be extremely careful and follow all security rules, especially when
controlling access to data and managing security keys. Using consistent security across both your internal
and cloud infrastructure is the best way to ensure security.
Improving Mobility
Remote work is now the norm. This has become especially popular during the pandemic. Businesses that
want to stay competitive are hiring skilled workers from all over the world, whether in the same city as
the company or on the other side of the planet.
The cloud is an internet service. This means that all resources in the cloud are available over the internet.
The user needs to sign in to the account and access the resources in the cloud. Employees don’t need to
work onsite to access the resources they need to do a job.
Sharing
Gone are the days when colleagues needed to email files back and forth to exchange information.
Everyone now uses cloud storage, which updates files in real-time as they are edited. The simplest example
of cloud collaboration software is Google Docs, which can be edited by multiple people.
Cloud analytics combines all company data sources to give a more complete picture of business processes.
All employees, regardless of physical location or time of day, can easily access and share data with
colleagues around the world.
Improved Disaster Recovery
Today, everyone understands how important it is to protect yourself from data loss. Therefore, many
people backup files, photos, etc. in the cloud. The same goes for business.
All organizations need to be aware that data loss will occur at some point. If all their important documents
and files are stored in only one place, one disaster could destroy the entire organization.
Cloud storage allows companies to keep huge amounts of data away from the office and across multiple
locations. If an accident happens, they can simply invoke their recovery protocol to restore operations.
And to prevent data loss and application downtime, cloud providers ensure that data is replicated
across multiple sites. This prevents losses from disasters and other unexpected errors.
Automatic Updates

Another advantage is that you do not have to sit and wait for updates to be installed. Cloud applications
do this automatically and without the intervention of IT staff. This saves time and money. In general, the
tools are very easy to use; workers can use cloud analytics tools without prior training.
Reducing Harm to the Environment
This benefit of the cloud is often overlooked. With species dying out and future generations threatened
by global warming, cloud technologies cause much less environmental harm than machines that must be
installed, maintained, and eventually disposed of. You simply rent or buy space in the cloud rather than
maintaining a server room, significantly saving on electricity.
Cloud infrastructure is the new normal. Technology has come a long way, new options and solutions have
emerged that cannot be ignored. Depending on the characteristics and needs of the business, you have to
choose one of the types of cloud models for the analytical platform. This will be a good start for making
important strategic business decisions.

5.10 Automated Analytics


Automated analytics refers to the use of computer systems to deliver analytical products with little or no
human intervention. Automated analytics can be used in a number of ways. For example, you can automate
full data processes, automate full business intelligence dashboards, create data-driven self-governing
machine learning models, or you can automate singular tasks such as:
• Data discovery
• Data preparation
• Replication
• Maintenance, etc.

Automated analytics are a form of advanced analytics that use emerging technology, such as
machine learning, artificial intelligence, and natural language processing, to assist human data scientists
and analysts with tedious tasks like data gathering and preparation.
Relying on data-driven decision-making can be an effective way to expand a business. Data
collection and analysis allow companies to improve the efficiency of their operations. Analyzing data can
be a long-term process, so relying on automated systems gives employees more time to work on other
important projects.
Automated analytics is the process of using advanced computer programs and simulations to examine
digital information. Depending on a business’s industry, its staff might collect statistical data on customer
information, production processes, profitability, or performance metrics. Using this data to inform
important business decisions can help keep a business profitable, but analyzing these data points manually
can be time-consuming and costly.
Automated analytics systems save time and funds, as you can input data directly into software that
generates reports and makes recommendations based on user preferences. This type of automation is
particularly useful for companies that handle big data, as there could be many individual data points to
analyze on a day-to-day basis. By working with automation software, business owners can produce more
reliable results while prioritizing funds for other projects.

5.10.1 The Need for Automated Analytics


Automating data analytics can save time and money, but not every task is suitable for automation.
Some projects benefit from automation more than others, so it helps to review the common criteria that
make a task a suitable candidate. Here are some aspects that make a data analytics task suitable for
automation:

Repeatability factors
Analyses you only conduct once every few months are generally not suitable for automation. Automation
is often more beneficial for data analysis tasks that your team completes regularly, such as researching
new social media trends. As repetitive tasks can consume a lot of company resources, you can increase
efficiency by reducing the amount of human input required for these activities.
Consistency factors
Automation can be helpful if completing a task consistently is an important factor for accurate results. For
example, if a data task occurs at a specific time of the day, automating this process can ensure a team
completes it on time, allowing them to move forward to the next step of a project. Relying on computerized
systems to complete these tasks consistently may also reduce the number of data errors that occur.
Error mitigation
When collecting data, even small errors can lead to misrepresentations of important information.
Automated systems are often more reliable than manual options, as they rely on complex, pre-set
algorithms. If minor errors can reduce the reliability of your data sets, then using an automated analytics
process can be beneficial. For example, a pharmacologist developing a new medication might use an
automated process to ensure their study results are correct before submitting data for a formal evaluation.
Decision-making
Many businesses rely on dashboards and other types of data visualization to inform decision-making
processes. These data models can be an efficient way to convey important performance metrics to your
team. Updating the data in your dashboard can be a time-consuming process, so automating these types
of data models can save time while providing more reliable information.

5.10.2 Functions of automated analytics


As businesses may collect a vast variety of data, they can use different types of automated data analytics
systems. If you aren't sure about what type of automated analytics solution might work best for your own
projects, it can help to read about some common examples. Here are a few types of automated analytics
solutions:
Data collection
One automatic data analytics process involves creating a library of information to evaluate. As automation
tools benefit from having as many data points as possible, optimizing your data collection process can
help produce more informative results. Compared to employees manually entering data into a spreadsheet,
automation technology can also extract important information from user interactions more efficiently.
Business intelligence
Another type of automated data analytics is the creation of business intelligence metrics. These processes
typically track emerging trends in your business. For example, business intelligence can examine which
geographic locations are producing the most orders, and compare these numbers to the marketing budget
in those areas. These comparisons also allow you to make accurate estimations about where advertising
is the most effective.
Dashboards
Analytics dashboards track relevant information on one screen and present the data in a format every
employee can easily understand. They're one of the most useful tools for companies, as they allow staff
to combine different data analytics solutions and view cross-departmental results. Dashboards also give
businesses a way to visualize trends in data quickly, which can allow teams to conduct more accurate data-
driven decision-making processes.
Machine-learning models
Machine-learning programs create statistical models for tracking changes in business operations. These
programs analyze data points and identify trends to predict what a business' financial future might look
like. Machine-learning models can also help predict changes in the market that might affect a business'
profitability. Using machine-learning models, companies can determine what actions can help them stay
competitive in an industry.
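As a simple illustration, the sketch below fits a linear trend to twelve months of invented revenue figures with scikit-learn and extrapolates one month ahead; a production model would use far richer features and proper validation.

import numpy as np
from sklearn.linear_model import LinearRegression

# Invented monthly revenue figures (month index -> revenue in thousands).
months = np.arange(1, 13).reshape(-1, 1)
revenue = np.array([50, 52, 55, 54, 58, 60, 63, 62, 66, 68, 71, 73])

model = LinearRegression().fit(months, revenue)
forecast = model.predict(np.array([[13]]))[0]   # one step beyond the data
print(f"Projected revenue for month 13: {forecast:.1f}k")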


5.10.3 Stages of Automated Analytics


1. Set clear goals
Before you can move forward with your data analytics automation process, it's often helpful to identify
your goals. Some examples of goals might include trend recognition, increased data collection or
automated performance monitoring. After deciding what your objectives are, you can start exploring
which solutions best meet your needs. Depending on your role, consider meeting with your teammates or
supervisors to outline these goals and better ensure everyone understands them.
2. Decide what metrics to track
If you know what your data analytics automation goals are, the next step is determining which metrics
you want to track. For example, if you want to record more data on a company's customers, you might
track how many consumer profiles your software generates or evaluate their demographic information.
Understanding this information can help you select an automation program that's right for the business'
goals.
3. Select an automation tool
Start researching different programs by inquiring online or connecting with other professionals in the same
field. Depending on a business' goals, a data analysis team might require software that tracks certain items
or processes data using specific methods. For example, a retail store might benefit from an automation
tool that also specializes in inventory management. Taking the time to research each possible solution, or
even trying a free trial, can help save you time and money.
4. Establish criteria and run tests
After selecting an automation tool, spend some time establishing the criteria you want the software to
track. Then, run some tests to see whether the system operates correctly. Verifying that your automation
tool collects the right data helps protect the valuable information entered into the system. Consider
conducting multiple tests, as one option might identify different errors than another (a minimal sketch of
such a test follows step 5).
5. Review and change settings
For the best results, reassess your software and change your configuration on a regular basis. After using
the tool for one project, you may think of some additional information that you want to track. Consider
experimenting with different automation settings and compare your results over time to determine the
most effective methods for a particular data assessment process.
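As referenced in step 4, here is a minimal sketch of automated checks on an extracted batch, written with pandas; the file name, column names, and thresholds are assumptions for illustration only.

import pandas as pd

def validate_extract(df: pd.DataFrame) -> list:
    """Return a list of data-quality problems found in an extracted batch."""
    problems = []
    if df.empty:
        problems.append("extract returned no rows")
    if df["order_id"].duplicated().any():
        problems.append("duplicate order_id values")
    if df["amount"].lt(0).any():
        problems.append("negative amounts present")
    if df["created_at"].isna().any():
        problems.append("missing timestamps")
    return problems

batch = pd.read_csv("daily_extract.csv")   # hypothetical tool output
issues = validate_extract(batch)
print("OK" if not issues else issues)

Running several such checks against every batch catches different classes of errors, which is why multiple tests are worth the effort.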

5.10.4 Implementing Data Analytics


The implementation of data analytics depends on which level of automation you are looking at:
• Partial automation: Automation supports existing behavior but removes parts of the manual work.
For example, your data team would write scripts, which speed up parts of their work.
• End-to-end production: Automation is set up end-to-end, and computer systems produce data
products for human decision-makers to inspect and act on. For example, automation produces KPI
dashboards or alerts of fraudulent behavior without an employee touching anything.
• Full automation: Automation takes business decisions in near real-time without any human
intervention. For example, an AI algorithm decides whether the signal in the data is good enough to
automatically buy or sell assets.
The more you move towards full automation, the more the value of automation shifts from saving hours
to providing independent impacts on the business bottom line.
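For example, a partial-automation script might compute a recurring KPI table from a raw export so that an analyst no longer builds it by hand. A minimal sketch, assuming a hypothetical sales.csv with date, region, and amount columns:

import pandas as pd

# Read the raw export that an analyst previously summarised manually.
sales = pd.read_csv("sales.csv", parse_dates=["date"])

# Build the recurring KPI table: monthly revenue and order count per region.
kpis = (sales
        .assign(month=sales["date"].dt.to_period("M"))
        .groupby(["month", "region"])["amount"]
        .agg(revenue="sum", orders="count")
        .reset_index())

kpis.to_csv("monthly_kpis.csv", index=False)   # feeds the dashboard refresh

Moving the same logic into a scheduled, monitored pipeline would shift this from partial automation towards end-to-end production.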

5.10.5 Process of Automated Analytics

• Identify candidate analytical tasks: Use the checks presented above: the task has business value,
is repetitive, saves time, reduces errors, and can be iteratively improved.


• Set up expectations by formalizing criteria for success: In the early stages, automation should
serve as a way to optimize processes and save time, but be clear on what you expect. Start small
by automating one data pipeline.
• Use devoted platforms and tools to speed up automation: Your engineers can write SQL
procedures and Python scripts to automate code, but relying on specialized tools and platforms
will save you time and money (a minimal scripting sketch follows this list).
• Repeat and evaluate: As you automate part of the data analytics processes and products, evaluate
them against the success criteria set up before. If successful, automate more.
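As an illustration of the scripting option mentioned above, the sketch below runs a SQL query with Python's built-in sqlite3 module and dumps the result to a dated CSV; the database file, table, and columns are hypothetical, and a dedicated platform would replace this for larger workloads.

import csv
import sqlite3
from datetime import date

QUERY = """
    SELECT region, SUM(amount) AS revenue
    FROM sales
    WHERE sale_date = ?
    GROUP BY region
"""

def export_daily_revenue(db_path="warehouse.db"):
    """Run the daily revenue query and write the result to a dated CSV."""
    today = date.today().isoformat()
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(QUERY, (today,)).fetchall()
    with open(f"revenue_{today}.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "revenue"])
        writer.writerows(rows)

if __name__ == "__main__":   # invoke from cron or a workflow scheduler
    export_daily_revenue()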

5.10.6 Benefits of data analytics automation


Here are some common benefits of using data analytics automation:
• Getting fast results: One of the main benefits of automating data analytics is that programs can process
data faster than humans. By using an automated program, you can get results faster and spend less time
researching specific data points.
• Handling more data: Automation software can filter more data than a team of employees in the same
period of time. Data analytics automation programs can also work on multiple different queries
simultaneously, and may analyze greater quantities of user data.
• Saving money: While some data analytics automation programs require a paid license to use, these
programs can still save a company money, as this venture typically requires fewer staff members and
billable work hours. These licensing fees can be a worthwhile investment for a business.
• Increasing productivity: As these programs can produce results faster than manual analysis,
employees have more time to work on other important duties. Data analytics automation programs
also allow staff to incorporate newly evaluated data into project steps, increasing efficiency across
multiple teams.
• Accelerating the reporting process: The average turnaround time from request to analytic delivery can
be shortened by automating (part of) the data pipeline needed to create the analytic report.
• Improving processes and systems: Running analytics manually often involves complex processes.
By automating analytics, you skip the parts that are prone to human error, and once you find an
error in an automated process, you only need to correct it once. With automation, you build future-
proof processes and systems.

5.11 Report Writing


Research Report
A research report is a well-crafted document that outlines the processes, data, and findings of a systematic
investigation. It is an important document that serves as a first-hand account of the research process, and
it is typically considered an objective and accurate source of information.
In many ways, a research report can be considered as a summary of the research process that clearly
highlights findings, recommendations, and other important details. Reading a well-written research report
should provide you with all the information you need about the core areas of the research process.

5.11.1 Features of a Research Report


So how do you recognize a research report when you see one? Here are some of the basic features that
define a research report.
• It is a detailed presentation of research processes and findings, and it usually includes tables and
graphs.
• It is written in a formal language.
• A research report is usually written in the third person.
• It is informative and based on first-hand verifiable information.
• It is formally structured with headings, sections, and bullet points.


• It always includes recommendations for future actions.

5.11.2 Goals of Business Report


Business reports are actual documents that inform by summarizing and analyzing a particular situation,
issue, or facts and then make recommendations to the group or person asking for the report. The goal of
these reports is usually one of the following:
• To examine potential and available solutions to an issue, situation, or problem
• To apply business and management theories to produce different suggestions for improvement
• To show your evaluation, reasoning, and analytical skills in recognizing and considering possible
solutions and outcomes
• To make conclusions about an issue or problem
• To produce a range of suggestions for future action
• To present clear and concise communication skills

5.11.3 Category of Research Report


The research report is classified based on two categories:
Qualitative Research Report
This is the type of report written for qualitative research. It outlines the methods, processes, and findings
of a qualitative method of systematic investigation. In educational research, a qualitative research report
provides an opportunity for one to apply his or her knowledge and develop skills in planning and executing
qualitative research projects.
A qualitative research report is usually descriptive in nature. Hence, in addition to presenting details of
the research process, you must also create a descriptive narrative of the information.
Quantitative Research Report
A quantitative research report is a type of research report that is written for quantitative
research. Quantitative research is a type of systematic investigation that pays attention to numerical or
statistical values in a bid to find answers to research questions.
In this type of research report, the researcher presents quantitative data to support the research process and
findings. Unlike a qualitative research report that is mainly descriptive, a quantitative research report
works with numbers; that is, it is numerical in nature.

5.11.4 Importance of a Research Report


Business research has an essential role to play in varied areas of business. Here are some of the
reasons describing its importance:
• It helps businesses gain better insights about their target customer’s preferences, buying patterns,
pain points, as well as demographics.
• Business Research also provides businesses with a detailed overview of their target markets,
what’s in trend, as well as market demand.
• By studying consumers’ buying patterns and preferences as well as market trends and demands
with the help of business research, businesses can effectively and efficiently curate the best
possible plans and strategies accordingly.
• The importance of business research also lies in highlighting the areas where unnecessary costs
can be minimized and those areas in a business which need more attention and can bring in more
customers and hence boost profits.
• Businesses can constantly innovate as per their customers’ preferences and interests and keep their
attention towards the brand.
• Business Research also plays the role of a catalyst, as it helps businesses thrive in their markets by
capturing all the available opportunities and meeting the needs and preferences of their customers.


5.12 Planning Report Writing


Research is imperative for launching a new product/service or a new feature. The markets today are
extremely volatile and competitive due to new entrants every day who may or may not provide effective
products. An organization needs to make the right decisions at the right time to be relevant in such a
market with updated products that satisfy customer demands.

The details of a research report may change with the purpose of research but the main components of a
report will remain constant. The research approach of the market researcher also influences the style of
writing reports. Here are seven main components of a productive research report:
• Research Report Summary: The entire objective along with the overview of research are to be
included in a summary which is a couple of paragraphs in length. All the multiple components of the
research are explained in brief under the report summary. It should be interesting enough to capture
all the key elements of the report.
• Research Introduction: There is always a primary goal that the researcher is trying to achieve
through a report. In the introduction section, he/she can cover answers related to this goal and establish
a thesis that the report will strive to answer in detail. This section should answer an integral
question: “What is the current situation of the goal?” After the research was conducted, did the
organization achieve the goal successfully, or is it still a work in progress? Provide such details
in the introduction part of the research report.
• Research Methodology: This is the most important section of the report where all the important
information lies. The readers can gain data for the topic along with analyzing the quality of provided
content and the research can also be approved by other market researchers. Thus, this section needs to
be highly informative with each aspect of research discussed in detail. Information needs to be
expressed in chronological order according to its priority and importance. Researchers should include
references in case they gained information from existing techniques.
• Research Results: A short description of the results along with calculations conducted to achieve the
goal will form this section of results. Usually, the exposition after data analysis is carried out in the
discussion part of the report.
• Research Discussion: The results are discussed in extreme detail in this section along with a
comparative analysis of reports that could probably exist in the same domain. Any abnormality
uncovered during research will be deliberated in the discussion section. While writing research reports,
the researcher will have to connect the dots on how the results will be applicable in the real world.
• Research References and Conclusion: Conclude all the research findings along with mentioning
each and every author, article or any content piece from where references were taken.

5.13 Targeting Audience in Research


A target audience is a group of people with common interests, demographics, and behavior. Market
researchers need to collect consumer feedback on certain products and services.
Collecting feedback from random people who aren’t your customers or those with no interest or
knowledge about the research subject will not help you solve problems. To gain valuable business insights,
targeting the right people for your research is crucial. Analyzing your target audience is an essential part
of building your marketing strategy.
Types of Target Audience
For the best research results, divide these audiences into three categories – demography, interests, and
purchasing intentions.
Audience based on demography: Demographics are socio-economic factors that describe individuals.
Demographic factors include attributes like age, education, geographic location, gender, income, and so
on.


For example, for a research study on the impact of the pandemic on young students, target
students aged 18-24, both male and female, from counties with a population of more than
25,000.
Audience based on purchase intentions: E-commerce businesses use a lot of purchase intentions data.
This is a crucial piece of information that they must possess to understand the buying intentions and
interests of potential customers.
For example, researchers group individuals based on the products they specifically look at or show interest
in. This helps them target individuals to capture their feedback on the expectations of the products and
services to enhance them further.
Audience based on personal interests: Interests make up an individual’s hobbies, passions, behavior,
and the things they read about and look for. It can be anything from movie types to music genres, to cars,
books, and dance, to name a few.
For example, you can offer a new action movie to action movie enthusiasts and get their feedback on
different parameters you set to collect genuine feedback.
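A brief pandas sketch of this kind of segmentation, assuming a hypothetical respondent table with demographic and interest columns:

import pandas as pd

respondents = pd.DataFrame({
    "age": [19, 23, 35, 21, 42],
    "gender": ["F", "M", "F", "M", "F"],
    "county_population": [30_000, 12_000, 80_000, 45_000, 26_000],
    "interest": ["action movies", "music", "action movies", "books", "cars"],
})

# Demographic segment: student-age respondents from larger counties.
students = respondents[respondents["age"].between(18, 24) &
                       (respondents["county_population"] > 25_000)]

# Interest segment: action-movie enthusiasts for the feedback study.
action_fans = respondents[respondents["interest"] == "action movies"]

print(len(students), "demographic matches;", len(action_fans), "interest matches")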

5.13.1 Characteristics of a target audience


Researchers must sample the right target audience to make inferences about the entire population. It is
impossible to survey the whole population due to logistical concerns, budgetary limitations, and time
constraints. Dividing the entire population into smaller bits and drawing inferences from them is the most
scientific way of researching a large population.
The quality of the research results directly depends on who your target audience is. Your market research
will yield actionable results if you get the target audience mix right.
Here are three attributes that researchers must keep in mind to create a sample from their target audience (a stratified-sampling sketch follows the list):
1. Diversity: Always ensure the sample is diverse. Ensuring the diversity of a sample can be difficult in
some cases because it is difficult to reach some portions of the population or convince them to take
part in the survey. For a sample to represent the population truly, it must be diverse. A sample that
fails to be diverse and representative of the entire population has serious research consequences.
2. Transparency: The structure and the size of the population depend on various factors. Researchers
must discuss these constraints to maintain a level of transparency about the sample selection
procedures. Researchers must be transparent so that the survey results can be viewed with the correct
perspective.
3. Consistency: Researchers must understand the population thoroughly and must test the consistency
of the sample before launching the survey fully. This is extremely crucial for research studies that
monitor changes across space and time, especially where we need confidence that any variation we
notice in our research data reflects a similar trend across comparable and consistent samples.
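The stratified-sampling sketch referenced above: drawing the same fraction from every stratum keeps the sample diverse, and fixing the random seed keeps repeated waves consistent. The file and column names are assumptions.

import pandas as pd

population = pd.read_csv("population_frame.csv")   # hypothetical sampling frame

# Draw 5% from every region so that no group is under-represented.
sample = (population
          .groupby("region", group_keys=False)
          .apply(lambda g: g.sample(frac=0.05, random_state=42)))

# A fixed random_state makes successive survey waves comparable over time.
print(sample["region"].value_counts())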

5.13.2 Determining a sample size from your target audience


In research, it is impossible to survey your entire target audience, and it is not advisable to try to reach
everybody in it. You must derive a sample from this target to make inferences about the whole
population. Here are some tips to determine your sample size (a worked sample-size sketch follows the list):
1. Identify your research objective: First set your research goals. Know exactly what you want
your research to achieve. Will the research be used to make projections about the entire population
that reside in your target location? Have clarity about these areas.
2. Precision level: Have an expectation in mind about the precision level you want to achieve with
your research. You can adjust your sample size based on the size of your population.
3. Confidence level: Confidence level is inversely related to risk. If you want your risk to be minimal,
your confidence level must be high, and vice versa. Choose your confidence level based on the
criticality of the research.


4. Response rate: Ascertain the response rate you’re likely to receive. If your population is huge,
and you expect a low response rate, increase your sample size. A small sample, especially in a
diverse population will just not give you the accuracy you’re looking for.
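These tips can be made concrete with Cochran's sample-size formula, n0 = z^2 p(1-p) / e^2, adjusted for a finite population and an expected response rate. A minimal sketch with illustrative inputs:

import math

Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}   # common confidence levels

def sample_size(population, confidence=0.95, margin_of_error=0.05,
                proportion=0.5, response_rate=1.0):
    """Estimate how many people to invite to reach the target precision."""
    z = Z_SCORES[confidence]
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    n = n0 / (1 + (n0 - 1) / population)     # finite-population correction
    return math.ceil(n / response_rate)      # inflate to cover non-response

# 95% confidence, +/-5% precision, 30% expected response rate.
print(sample_size(population=20_000, response_rate=0.30))   # about 1257 invites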

5.13.3 Factors of Target Audience


Consider the following demographic factors to select the target audience of research:
• Age
• Gender
• Location
• Education
• Income
• Occupation
• Marital Status
• Ethnicity
Demographic factors tell you who is buying your products or services,
while psychographics reveal why they buy. Observing your core market’s interests and characteristics
can help you gain insights about their preferences and incorporate some of these attributes into
your brand messages. Psychographics covers the following:
• Personality
• Attitudes
• Values
• Interests/hobbies
• Lifestyles
• Behavior

5.14 Types of Report writing


Research reports are mainly of two types: the technical report and the popular report.

Technical Report
A technical report is one that is needed where a complete written record of the research study is required
for the purpose of public dissemination or record-keeping. In these reports, data is presented in a simple
manner and key results are defined properly. A technical report emphasizes the tools used in the study,
the assumptions made, and the presentation of findings along with their limitations.
The outline of a technical report is:
1. Results Summary- Description of the key findings of the study conducted.
2. Nature of Study- Denotes the objectives of the study, the formulation of the problem in
operational terms, the working hypothesis, the types of data needed, and the kinds of analysis.
3. Methods Used- The tools and techniques used for carrying out the study, along with their
limitations, are explained.
4. Data- Description of how the data was collected, their sources, their characteristics, and their
limitations.
5. Data Analysis and Presenting Findings- This is the main body of the report, where the data is
analyzed and the findings are presented along with supporting data. Distinct types of tables and
charts are used for better explanation.
6. Conclusions- The findings are narrated in a detailed manner, and the policy implications drawn
from the results are explained.
7. Bibliography- It provides details of the distinct sources consulted while performing the
research.


8. Technical Appendices- Technical appendices cover mathematical derivations, the questionnaire,
and elaboration of analysis techniques.
9. Index- An index is invariably attached at the end of the report.
The outline of a technical report may not be the same in every case and can vary across technical reports.

Popular Report
A popular report is one that focuses on attractiveness and simplification of data. It is used when the
findings will have policy implications. Emphasis is laid on clear writing, minimization of technical
aspects, and liberal, detailed use of charts and diagrams. Other key characteristics of the popular report
are the use of many subheadings, large print, and the occasional cartoon. Practical emphasis is given
more importance in this type of report.
The general outline of a popular report is as given below:
1. Findings and Their Implications- Focus is given to the practical aspects of the findings of the
study conducted and how these findings apply.
2. Recommendations for Action- Based on the findings, this section of the report provides
recommendations for action.
3. Objectives of Study- A description of the nature of the problem and the key objectives of
conducting the study are explained here.
4. Techniques Used- A review of all the tools and techniques employed, along with the data used
for concluding the study, is given in this portion. All description is given in a non-technical
manner.
5. Results- This is the main portion of the report, where all findings are presented in simplified and
non-technical terms. All sorts of illustrations, such as diagrams and charts, are used liberally.
6. Technical Appendices- Technical appendices provide detailed information on the different
methods used, forms, etc. If the report is meant for the general public, the technical appendices
are kept concise.

5.14.1 Other essential reports used in business research

Informational reports. These reports present facts about a given activity in detail, without any
notes or suggestions. Whatever is gathered is reported without anything by way of explanation or
suggestion. A vice-chancellor asking about the number of candidates appearing at a particular
examination naturally seeks only information of fact (candidates taking up the examination), of course
without any comment. Generally, such reports are of a routine nature; sometimes they may fall under
the statutory routine category. A company registrar asking for an allotment return within the stipulated
period is asking for nothing but an informational routine report, falling under the statutory but routine category.

Analytical reports. These reports contain facts along with analytical explanations, offered either by the
reporter himself or asked for by the one seeking the report. Such reports contain the narration of facts,
collected data and information, classified and tabulated data, and also explanatory notes followed by the
conclusions arrived at or interpretations. A company chairman may ask for a report on falling sales
trends in a particular area. He will in this case naturally be interested in knowing all the details,
including the opinions of the investigator.

Common Business Research reports. These reports are based on research work conducted by
either an individual or a group of individuals on a given problem. An Indian oil company might ask
its research division to find a substitute for petrol; if such a study is conducted, the research division
submits a report detailing its findings and offering its own suggestions, including its conclusions as to
whether such a substitute exists and, if it does, whether it can be put to use effectively and with
advantage. All details naturally have to be given. In fact, such a report is the result of research.

Statutory reports. These reports are to be presented according to the requirements of a particular law,
a rule, or a custom that has now become a rule. The auditor's report to the company registrar has to be
submitted as per the country's legal requirements. A return on compensation paid to factory workers
during a period by a factory has to be submitted to the competent authorities periodically.
These reports are generally prepared in the prescribed form, as the rules prescribe.
Non-statutory reports. These reports are not required by law or by rules; nevertheless, they are prepared
and submitted: (i) for administrative and other conveniences, (ii) for taking a decision in a matter, (iii) for
policy formulation, (iv) for projecting the future, or (v) anything alike, so that efficient and smooth
functioning may be assured and proper and necessary decisions may be taken, with a view to seeing that
everything goes well and the objectives of the organization are achieved with assured success.

Routine reports. These reports are required to be prepared and submitted periodically on matters
required by the organization, so as to help the management take decisions on matters relating to
day-to-day affairs. The main objectives of routine reports are to let the management know what is
happening in the organization, what its progress is, where the deviations are, what measures have been
taken to solve problems, and what to do so that the organization may run smoothly and efficiently.
Routine reports are generally brief; they only give the facts. No comments or explanations are usually
offered in such reports. Generally, forms are prescribed for the preparation and submission of such reports.

Special reports. This type of report is specially required to be prepared and submitted on matters
of a special nature. Suppose, due to an accident, the death of a foreman has occurred in a factory. The
factory manager may ask for a detailed report from the head foreman. Such a report is classified as a
special report. These reports contain not only facts and details but may contain suggestions, comments,
and explanations as well.

Journal Articles
It is helpful to acquaint yourself with the diverse types of articles published by journals.
Although it may appear that there are a great number of article types due to the broad
assortment of names they are published under, most articles published are one of the following
types: original research, review articles, short reports or letters, case studies, and methodologies.

Technical Research Reports


One of the major forms of communication in engineering is the scientific report. In the workplace,
the report is a real working document written by engineers for clients, executives, and other engineers.
This means every report has a purpose beyond the simple presentation of information. Some
common purposes are to:
• persuade a government agency of the importance of a particular course of action
• convince a client that your solution will fulfill their needs
• convince the public that a proposed venture will bring benefits
• persuade a government or council to approve a particular course of action
• influence a client to prefer one design over another
• plead your case before an organization to partner with your company on a plan


Monographs or Books
Research monographs can be reformatted editions of dissertations, theses, or other noteworthy
research reports. Monographs are published by academic presses and commercial scholarly publishers.
A point of distinction is that authors may receive a royalty payment for monographs, whereas,
for most other research publications, such as journal articles and conference papers,
authors do not receive direct payment.
As a commercial work, a monograph will typically be edited to be readable by a more
general or specific audience, depending on to whom the publisher will be marketing the book.
The readership of a research monograph will likely be individuals with varying levels of
proficiency in the field, ranging from students to academics, practitioners to lay people. When
writing, you can presuppose the reader will have some curiosity about the topic, but he or she may not
have much background in the field.
The required complexity or quality of research of a monograph can vary by country,
university, or program, as can the required minimum study period. The word “monograph” can at times
be used to describe a treatise without relation to obtaining an academic degree. The term “monograph”
is also used to refer to the general style of an essay or analogous work.
Professional Meetings report
• A meeting needs a clear purpose statement. The exact goal for the specific meeting will evidently
relate to the whole goal of the group or committee. Determining your purpose is central to a
successful meeting.
• A meeting should not be scheduled just because it was held at the same time last month or because
it is a standing committee. Members will resent the intrusion into their schedules
and quickly perceive the lack of purpose.
• Similarly, if the need for a meeting crops up, one should not dash into it without planning. An
inadequately planned meeting announced at the last minute is sure to be less than useful.
• People may be unable to change their schedules, may fail to attend, or may hinder
the progress and discussion of the group because of their absence. Those who do attend
may feel hindered because they needed more time to prepare and present comprehensive results to the
group or committee.
Business Seminar Reports
• A seminar may be defined as an assembly of employees for the purpose of discussing a stated topic.
Such gatherings are typically interactive sessions where the participants engage in discussions
about the defined topic. The sessions are frequently headed or led by one or two presenters who
serve to steer the discussion along the preferred path.
• A seminar may have numerous purposes or just one purpose. For instance, a seminar may be for the
purpose of education, such as a lecture, where the participants engage in the discussion of an
academic subject with the intention of gaining deeper insight into the subject. Other forms of
educational seminars might be held to impart skills or knowledge to the participants.

5.15 Different styles of writing a synoptic outline of chapters


Following are the different styles of synopsis:

1. Project synopsis

A project synopsis is often used in science and engineering fields and summarizes a project’s goals,
processes, and conclusions. It often starts with a statement summarizing the problem that the project aims
to solve. It delves into methods used and other details that are important to the project, such as relevant
details about the project’s participants.


2. Research synopsis

Of the three main types of synopses, research and project synopses are most often used by research and
scientific institutions. Like a project synopsis, a research synopsis summarizes the problem or question
the research is attempting to solve and then describes how the research was conducted.

Research synopses also give details on the researchers themselves, such as any relevant academic degrees
they hold.

3. Literary synopsis

A literary synopsis is a synopsis of a work of fiction. It summarizes all the critical elements of a book so
that an agent or publisher understands, to a high level of detail, what a book is about without having read
it.

5.15.1 Steps to write successful synopsis

• Make a list of your book’s key elements. These include the most critical story and plot points, conflict,
characters, settings, themes, and tone. For the plot, go through each chapter, and write down one to
three of the most important plot developments from each. Then flesh out each item on your list with
any other important details.

• Write a good opening sentence. This should summarize your character, setting, and the immediate
conflict, ensuring you make it clear what’s at stake. Then link together your detailed list from step 1
to form a first draft of your synopsis.

• Read through the synopsis. Then add any details you may have forgotten. Also, look for details you
included that are not critical and cut them.

• Read through it again. Ensure that the plot and character arcs are clearly defined.

• Give it a final edit and proofread. A one-page synopsis is often ideal, but publishers may request a
synopsis of three to five pages or specify some other length.

5.16 Steps in drafting the report


1. Identify an issue or question
Before developing your analytical report, it's important to identify an issue or question. Your question or
issue is the main topic of your report, and it can help you create an outline. For example, you might create
an analytical report to determine why sales have been lower than usual. You'd use this issue to conduct
research, collect data and propose solutions. To identify an issue or question for your report, try looking
at how your company is currently performing.
2. Gather relevant information
Once you identify your issue or question, start gathering relevant information. This could include data or
resources. If you're making an analytical report about market trends, then you might study your industry's
current market or how competitors are performing. You could even read credible articles that are related
to your report's topic. Inquiry informs your analytical report, so it's important to conduct accurate research.
3. Choose a format
Now that you have the groundwork for your analytical report, you can choose a format for it. There are
several ways to present an analytical report, such as a spreadsheet, document or presentation. Another


popular option is an online dashboard. An online dashboard allows you to create and display charts in a
way that's easy to understand. Try to select a format that can present all of your data.
4. Add charts and other elements
A common component of an analytical report is its charts and other elements. Charts and graphs are how
you display your data, so it's valuable to include several of these visuals. Try to add charts that accurately
represent your findings. Some common graphs for an analytical report include line charts, bar charts,
maps and plots. Along with charts, you can add other elements, such as images and icons. These can
make your charts easier to read and more visually appealing (a minimal chart sketch appears after step 6).
5. Use design practices
Once you've added your charts and other elements, start designing your report. While you can make your
analytical report as simple or complex as you'd like, it's important to use design practices. These guidelines
help make your report visually appealing and easy to read. For example, a common design practice is to
use a layout that's clear, with a mix of visuals and text. Another practice is to use plenty of white space to
improve the readability of your report.
6. Make recommendations
The last component of your report is the recommendations. Since you're trying to solve a problem or
answer a question, it's important to provide a few solutions based on your research. For example, if you
made an analytical report about operational performance, you could make recommendations for how the
company can improve its productivity.
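As referenced in step 4, here is a minimal matplotlib sketch of a report chart; the quarterly sales figures are invented to match the declining-sales example above.

import matplotlib.pyplot as plt

quarters = ["Q1", "Q2", "Q3", "Q4"]
sales = [240, 210, 195, 180]   # invented figures showing the decline

fig, ax = plt.subplots()
ax.bar(quarters, sales, color="steelblue")
ax.set_title("Quarterly Sales")
ax.set_xlabel("Quarter")
ax.set_ylabel("Sales (units)")
fig.savefig("quarterly_sales.png", dpi=150)   # embed in the report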

5.16.1 Components of Research Report


The report must contain the following components
1) Preliminary Part
a) Cover
b) Title
c) Preface
d) Acknowledgement
e) Table of contents
f) List of tables
g) List of graphs
2) Introduction of the Report
a) Introduction
b) Background of the research study
c) Statement of the problem
d) Brief outline of the chapters
3) Review of Literature
a) Books review
b) Review of articles published in books, journals, periodicals, etc
c) Review of articles published in leading newspapers
d) Working papers / discussion papers / study reports
e) Articles on authorised websites
f) A broad conclusion and indications for further research
4) The Research Methodology
a) The theoretical framework (variables)
b) Model / hypothesis
c) Instruments for data collection
d) Data collection
5) Results
a) Pilot study
b) Processing of data


c) Hypothesis / model testing
d) Data analysis and interpretation
e) Tables and figures
6) Concluding Remarks
a) Findings
b) Conclusions
c) Shortcomings
d) Suggestions to the problems
e) Direction for further research
7) Bibliography
a) Appendices & annexure

The following shows the details of the research report.


1) Preliminary Part
The preliminary part may have seven major components – cover, title, preface, acknowledgement, table
of contents, list of tables, list of graphs. Long reports presented in book form have a cover made up of a
card sheet. The cover contains title of the research report, the authority to whom the report is submitted,
name of the author, etc.
The preface introduces the report to the readers, giving a very brief introduction of the report. In the
acknowledgements, the author mentions the names of persons and organisations that have extended
co-operation and helped in the various stages of the research. A table of contents is essential; it gives
the title and page number of each chapter.
2) Introduction of the Report
The introduction of the research report should clearly and logically bring out the background of the
problem addressed in the research. The purpose of the introduction is to introduce the research project to
the readers. A clear statement of the problem with specific questions to be answered is presented in the
introduction. It contains a brief outline of the chapters.
3) Review of Literature
The third section reviews the important literature related to the study. A comprehensive review of the
research literature referred to must be made. Previous research studies and the important writings in the
area under study should be reviewed. Review of literature is helpful to provide a background for the
development of the present study.
The researcher may review concerned books, articles published in edited books, journals and periodicals.
Researcher may also take review of articles published in leading newspapers. A researcher should study
working papers/discussion papers/study reports. It is essential for a broad conclusion and indications for
further research.
4) The Research Methodology
Research methodology is an integral part of the research. It should clearly indicate the universe and the
selection of samples, techniques of data collection, analysis and interpretation, statistical techniques, etc.
5) Results
Results contain pilot study, processing of data, hypothesis/model testing, data analysis and interpretation,
tables and figures, etc. This is the heart of the research report. If a pilot study is planned to be used, its
purpose should be given in the research methodology.
The collected data and the information should be edited, coded, tabulated and analysed with a view to
arriving at a valid and authentic conclusion. Tables and figures are used to clarify the significant
relationships. The results obtained through tables and graphs should be critically interpreted.
6) Concluding Remarks
The concluding remarks should discuss the results obtained in the earlier sections, as well as their
usefulness and implications. It contains findings, conclusions, shortcomings, suggestions to the problem


and direction for future research. Findings are statements of factual information based upon the data
analysis.
Conclusions must clearly explain whether the hypotheses have been established or rejected. This part
requires great expertise and preciseness. A report should also refer to the limitations of the applicability
of the research inferences. It is essential to suggest the theoretical, practical and policy implications of the
research. The suggestions should be supported by scientific and logical arguments. The future direction
of research based on the work completed should also be outlined.
7) Bibliography
The bibliography is an alphabetic list of books, journal articles, reports, etc, published or unpublished,
read, referred to, examined by the researcher in preparing the report. The bibliography should follow
standard formats for books, journal articles, research reports.
The end of the research report may consist of appendices, listed in respect of all technical data. Appendices
are for the purpose of providing detailed data or information that would be too cumbersome within the
main body of the research report.

5.16.2 Qualities of Good Report


Report writing is a highly skilled job. It is a process of analysing, understanding and consolidating the
findings and projecting a meaningful view of the phenomenon studied. Good report writing is essential
for effective communication.
Following are the essential qualities of good report:
• A research report is essentially a scientific documentation. It should have a suggestive title, headings
and sub-headings, paragraphs arranged in a logical sequence.
• A good research report should include everything that is relevant and exclude everything that is
irrelevant. It should contain facts rather than opinions.
• The language of the report should be simple and unambiguous, employing appropriate words,
idioms and expressions. The report should be free from the researcher's biases derived from past
experience, and confusion, pretentiousness and pomposity should be carefully guarded against.
• The report must be free from grammatical mistakes. It must be grammatically accurate. Faulty
construction of sentences makes the meaning of the narrative obscure and ambiguous.
• The report has to take two facts into consideration: firstly, for whom the report is meant, and
secondly, what that reader's level of knowledge is. The subject matter and depth of the report
should suit the level of knowledge of the person for whom it is meant, because not all reports are
meant for research scholars.

Exercise
I. Write down short answers for the following:
32. What is enterprise reporting?
33. What is linear and logistic regression?
34. Define time series analysis.
35. Describe the features of big data.
36. List out the types of big data analytics.
37. What is predictive analysis?
38. What is prescriptive analysis?
39. What is cloud analytics?
40. Define report writing.
41. Describe automated analytics.

II. Provide Detailed Answers:


23. Elaborate the types of visualisation tools.
24. Describe the features of linear and logistic regression.
25. Describe the types of predictive modeling and their business use cases.
26. Describe the life cycle of big data analytics.
27. Elaborate cloud analytics tools.
28. How can you select the perfect cloud analytics platform?
29. Elaborate the steps in the report writing process.
