
Business Analytics Unit 1 Notes

The document outlines the fundamentals of Business Analytics, including its definition, key techniques, applications, and benefits such as data-driven decision making and improved operational efficiency. It also discusses challenges faced in Business Analytics, the data science process, types of data, and the analytics life cycle, which provides a structured approach to solving business problems. Additionally, it compares Business Analytics with Data Science, highlighting their differences in focus, data types, and techniques used.


ANJALAI AMMAL - MAHALINGAM ENGINEERING COLLEGE

KOVILVENNI-614 403, THIRUVARUR DISTRICT


NBA & NAAC Accredited Institution
DEPARTMENT OF INFORMATION TECHNOLOGY
CCW331 – BUSINESS ANALYTICS
Unit 1
INTRODUCTION TO BUSINESS ANALYTICS

Analytics and data science- Analytics life cycle-Types of Analytics-Business Problem definition -
Data collection- Data preparation - Hypothesis generation - Modeling - Validation and Evaluation
– Interpretation - Deployment and iteration.
1.1 Analytics and data science
Analytics:
Analytics is a body of knowledge comprising statistical, mathematical, and operations research techniques; artificial intelligence techniques such as machine learning and deep learning algorithms; data collection and storage; and data management processes such as data extraction, transformation, and loading (ETL).
Business Analytics
• Definition: The practice of using data analysis and statistical methods to generate insights and support informed decision-making.
• Key Focus: Collection, processing, and interpretation of large data volumes.
• Goal: Uncover patterns, trends, and correlations to drive strategic and operational improvements.
Techniques in Business Analytics
The main techniques are descriptive, diagnostic, predictive, and prescriptive analytics, covered in detail in Section 1.3.
Applications
Marketing, sales, operations, finance, supply chain, customer service.
Benefits:
• Data-driven decisions
• Process optimization
• Performance improvement
• Market opportunity identification
• Risk mitigation
• Competitive advantage
• Data-Driven Decision Making
– Enables informed, evidence-based decisions, leading to accurate and reliable outcomes.
• Improved Operational Efficiency
– Optimizes processes, reduces costs, and enhances resource allocation.
• Enhanced Business Performance
– Provides insights into customer behavior and market trends to drive growth and satisfaction.
• Improved Risk Management
– Identifies and mitigates risks, detects fraud, and enhances compliance.
• Personalized Customer Experiences
– Tailors products, services, and marketing to individual customer needs, boosting loyalty.
• Competitive Advantage
– Offers a strategic edge through data-driven decisions and anticipation of market trends.
• Improved Marketing and Sales Effectiveness
– Optimizes campaigns, targets the right audience, and enhances customer engagement.
• Innovation and New Product Development
– Identifies market gaps and customer needs, driving innovation and product improvement.
• Continuous Improvement
– Fosters a culture of refinement through data insights and performance tracking.
• Efficient Resource Allocation
– Optimizes allocation of budgets and resources, improving overall utilization and reducing waste.
Challenges of Business Analytics
1. Data Quality and Availability
• Issues: Poor data quality, incomplete or inconsistent data, limited availability.
• Solution: Ensure data is accurate, reliable, and relevant; use data cleansing and integration processes.
2. Data Governance and Privacy
• Issues: Compliance with data governance and privacy regulations.
• Solution: Implement data governance frameworks, policies, and procedures to protect sensitive information and manage access.
3. Data Integration and Complexity
• Issues: Challenges in integrating data from diverse sources due to format and system variations.
• Solution: Develop robust data integration processes and technologies for holistic insights.
4. Analytical Skills and Talent Gap
• Issues: Shortage of professionals with domain knowledge, statistical expertise, and analytical tool proficiency.
• Solution: Address talent gaps by hiring skilled data analysts and data scientists.
5. Technology Infrastructure
• Issues: Complexity and resource intensity of implementing technology infrastructure for analytics.
• Solution: Invest in scalable and reliable systems for data storage, processing, and analytics.
6. Change Management and Organizational Culture
• Issues: Resistance to adopting data-driven decision-making.
• Solution: Employ change management strategies and leadership support to foster a data-driven culture.
7. Interpretation and Actionability of Insights
• Issues: Difficulty in extracting actionable and understandable insights for decision-makers.
• Solution: Ensure insights are relevant and actionable; translate findings into tangible actions.
8. Cost and Return on Investment (ROI)
• Issues: High costs in technology, talent, and infrastructure for analytics.
• Solution: Carefully assess costs versus potential returns to ensure benefits justify the investment.

Data Science
• A multidisciplinary field for extracting knowledge from structured and unstructured data.
• Combines statistics, mathematics, computer science, and domain expertise.
• Aims to solve complex problems and make data-driven decisions.



Key Skills in Data Science
• Data collection
• Data cleaning and preprocessing
• Data analysis
• Machine learning
• Statistical modeling
• Data visualization
• Communication
Steps in Data Science Process
1. Problem Formulation: Defining the problem or question.
2. Data Collection: Gathering data from databases, APIs, etc.
3. Data Preprocessing: Cleaning and transforming data.
4. Exploratory Data Analysis (EDA): Understanding data characteristics, patterns.
5. Feature Engineering: Selecting or creating features to improve model performance.
6. Model Building: Developing machine learning/statistical models.
7. Model Evaluation: Assessing model performance with metrics and validation.
8. Model Deployment: Integrating the model into production.
9. Model Monitoring and Maintenance: Continuously monitoring and updating the model.
Applications
• Finance
• Healthcare
• Marketing
• E-commerce
• Social Sciences
Comparison
Business Analytics:
Definition: Focuses on the application of data analysis techniques within a business context to improve
decision making and operational efficiency.
Key Characteristics: Primarily deals with historical and structured data to provide insights that guide
business strategies.
Visual Perspective of Business Analytics

[Figure: Business Analytics shown at the intersection of Statistics, Business Intelligence/Information Systems, and Modeling and Optimization.]
Data Science:
Definition: A broader, multidisciplinary field that uses advanced methods (like machine learning and
statistical modeling) to extract insights from both structured and unstructured data.
Key Characteristics: Solves complex problems, makes predictions, and drives innovation beyond just
business applications.



Business Analytics vs Data Science

Data
• Business Analytics: Works mainly with structured data sources such as sales records, customer databases, and financial data. Emphasizes using past data to identify trends, correlations, and patterns that are directly relevant to business decisions.
• Data Science: Handles a diverse array of data types, including structured (e.g., databases), unstructured (e.g., text, images), and semi-structured data (e.g., XML files). Explores varied data sources to address broader questions, such as social media analysis, sensor data, and multimedia content.

Techniques
• Business Analytics:
– Descriptive Analytics: summarizes historical data to understand what happened in the past.
– Diagnostic Analytics: investigates the reasons behind past outcomes by examining data relationships.
– Predictive Analytics: forecasts future trends using historical data and statistical models.
• Data Science: Incorporates a wide array of techniques, including data preprocessing, exploratory data analysis, machine learning, deep learning, natural language processing (NLP), and advanced visualization. Focuses on uncovering hidden patterns in data and building predictive and prescriptive models.

Tools
• Business Analytics: Utilizes familiar tools such as spreadsheets (Excel), data visualization tools (Tableau, Power BI), and Business Intelligence (BI) platforms. Employs statistical software (such as SPSS) to facilitate data reporting and dashboard creation.
• Data Science: Uses programming languages such as Python and R for data manipulation, modeling, and machine learning. Employs specialized libraries (such as Pandas, Scikit-learn, TensorFlow) and frameworks designed for advanced data analysis and machine learning tasks.

Applications
• Business Analytics: Applied in areas such as market research, customer segmentation, sales forecasting, pricing strategies, supply chain optimization, and risk assessment. Aims to enhance operational efficiency, identify growth opportunities, and improve overall business performance.
• Data Science: Utilized in predictive modeling (e.g., customer churn prediction), fraud detection, recommendation systems (e.g., Netflix, Amazon), image and speech recognition, and autonomous vehicles. Focuses on generating actionable insights and leveraging data for technological and strategic innovations.

Further contrasts:
• Business Analytics is the statistical study of business data to gain insights; Data Science is the study of data using statistics, algorithms, and technology.
• Business Analytics uses mostly structured data; Data Science uses both structured and unstructured data.
• Business Analytics does not involve much coding and is more statistics-oriented; in Data Science, coding is widely used, combining traditional analytics practice with good computer science knowledge.
• In Business Analytics, the whole analysis is based on statistical concepts; in Data Science, statistics is applied at the end of the analysis, after coding.
• Business Analytics studies trends and patterns specific to business; Data Science studies almost every kind of trend and pattern.
• Top industries where Business Analytics is used: finance, healthcare, marketing, retail, supply chain, telecommunications. Top industries/applications where Data Science is used: e-commerce, finance, machine learning, manufacturing.



Types of Data
Structured Data
Structured data is organized and easily managed using traditional data management tools such as spreadsheets, databases, or tables. It is typically quantitative and numeric, consisting of numbers, percentages, and other numerical values. Because of its organized nature, structured data is relatively easy to analyze using statistical methods such as regression or correlation analysis.
Unstructured Data
Unstructured data does not have a predefined format or organization, making it difficult to manage with traditional data management tools. Examples include social media posts, emails, images, and videos. Unstructured data is typically qualitative, meaning it is descriptive and narrative. Analyzing it requires advanced analytics techniques such as natural language processing (NLP) or sentiment analysis.
Semi-Structured Data
Semi-structured data is a type of data that has elements of both structured and unstructured data. This type of
data includes information that is partially organized, but not to the extent that it can be classified as
structured data. Examples of semi-structured data include XML and JSON files, which have some
organization but also contain elements of unstructured data. Analyzing semi-structured data typically
requires a combination of traditional data management tools and advanced analytics techniques.
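To make the semi-structured case concrete, the sketch below flattens hypothetical JSON customer records (some with missing fields) into uniform, structured rows using only the standard library:

```python
import json

# Hypothetical semi-structured customer records: shared keys plus optional extras
raw = '''[
  {"id": 1, "name": "Asha", "feedback": "Great service", "tags": ["loyal", "premium"]},
  {"id": 2, "name": "Ravi", "tags": []},
  {"id": 3, "name": "Mena", "feedback": "Slow delivery"}
]'''

records = json.loads(raw)

# Flatten into a structured table: fixed columns, defaults for missing fields
rows = [
    {"id": r["id"], "name": r["name"],
     "feedback": r.get("feedback", ""),      # unstructured free text
     "tag_count": len(r.get("tags", []))}    # derived structured feature
    for r in records
]
```

The `feedback` column stays unstructured text (a candidate for NLP), while `tag_count` becomes a structured numeric feature ready for statistical analysis.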
There are two types of data:
Qualitative and Quantitative data, which are further classified into four categories:
• Nominal data
• Ordinal data
• Discrete data
• Continuous data

Qualitative or Categorical Data

Qualitative or Categorical Data is data that can't be measured or counted in the form of numbers. These types of data are sorted by category, not by number, which is why they are also known as Categorical Data. Such data consist of audio, images, symbols, or text. The gender of a person (male, female, or other) is qualitative data.
Qualitative data are further classified into two parts:
Nominal Data
Nominal data is used to label variables without any order or quantitative value. The color of hair can be considered nominal data, as one color can't be compared with another.
Examples of Nominal Data:
• Color of hair (blonde, red, brown, black, etc.)
• Nationality (Indian, German, American)
Ordinal Data
Ordinal data have a natural ordering, where each value holds a position on a scale. These data are used for observations like customer satisfaction, happiness, etc., but we can't perform arithmetic on them.
Examples of Ordinal Data:
• Feedback, experience, or satisfaction ratings on a scale of 1 to 10
• Letter grades in an exam (A, B, C, D, etc.)
• Ranking of people in a competition (first, second, third, etc.)
Difference between Nominal and Ordinal Data
Nominal data can't be quantified, nor does it have any intrinsic ordering, whereas ordinal data carries a sequential order given by position on a scale.
Nominal data is qualitative (categorical) data, whereas ordinal data is said to be "in-between" qualitative and quantitative data.
Quantitative Data
Quantitative data can be expressed in numerical values, making it countable and amenable to statistical data analysis. These kinds of data are also known as numerical data. They answer questions like "how much," "how many," and "how often." For example, the price of a phone, a computer's RAM, and the height or weight of a person all fall under quantitative data.
Examples of Quantitative Data:
• Height or weight of a person or object
• Room temperature
Quantitative data are further classified into two parts:
Discrete Data
The term discrete means distinct or separate. Discrete data contain values that fall under integers or whole numbers.
Examples of Discrete Data:
• Total number of students present in a class
• Cost of a cell phone
Continuous Data
Continuous data take the form of fractional numbers: for example, the height of a person, the length of an object, or the version number of an Android phone.
Examples of Continuous Data:
• Speed of a vehicle
• Market share price
Difference between Discrete and Continuous Data
• Discrete data are countable and finite; they are whole numbers or integers. Continuous data are measurable; they come in the form of fractions or decimals.
• Discrete data are represented mainly by bar graphs, whereas continuous data are represented by histograms.
• Discrete values cannot be subdivided into smaller pieces; continuous values can be.
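The four categories can be illustrated directly in code. The records below are hypothetical survey responses; the point is which operations are meaningful for each category:

```python
# Hypothetical survey responses illustrating the four data categories
responses = [
    {"hair": "brown", "satisfaction": 3, "visits": 5, "height_cm": 172.5},
    {"hair": "black", "satisfaction": 5, "visits": 2, "height_cm": 160.1},
]

# Nominal: labels with no order -- only equality checks make sense
assert responses[0]["hair"] != responses[1]["hair"]

# Ordinal: ordered labels -- comparisons are valid, arithmetic is not meaningful
assert responses[1]["satisfaction"] > responses[0]["satisfaction"]

# Discrete: countable whole numbers -- sums and counts make sense
total_visits = sum(r["visits"] for r in responses)

# Continuous: measurable fractional values -- averages make sense
avg_height = sum(r["height_cm"] for r in responses) / len(responses)
```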
1.2 Analytics life cycle
Definition: The Analytics Life Cycle is an iterative framework used to systematically apply analytics
techniques to extract insights, solve problems, and support decision making within organizations.
Purpose: It ensures a structured approach to analytics, guiding teams from understanding the business
problem to implementing data driven solutions and continuously improving upon them.



Importance: By following a lifecycle approach, organizations can ensure that analytics efforts are
aligned with business objectives, data quality is maintained, and insights are actionable and relevant.

1. Business Understanding:
Objective: Clearly define the business problem or opportunity that analytics will address. This stage sets the
direction for the entire project.
Activities: Identify key questions, define goals, and determine the scope of the analytics project. Align the
analytics objectives with business priorities to ensure relevance.
Output: A well-defined problem statement, clear objectives, and a project plan that outlines the analytics
approach.
2. Data Acquisition:
Objective: Collect data that is relevant to the business problem. This stage involves gathering data from
various sources to ensure a comprehensive dataset.
Activities: Data can be acquired from internal databases, external data providers, APIs, web scraping, and
other sources. ETL (Extraction, Transformation, and Loading) processes are used to bring data into a usable
state.
Output: A collected dataset that is ready for initial review and cleaning, with all relevant data sources
identified and accessed.
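A toy ETL pass might look like the following, using an in-memory CSV and SQLite purely as stand-ins for a real source system and data warehouse:

```python
import csv
import io
import sqlite3

# Extract: read raw records (an in-memory CSV standing in for a source system)
raw_csv = "order_id,amount\n1,100\n2,\n3,250\n"
reader = csv.DictReader(io.StringIO(raw_csv))

# Transform: drop rows with missing amounts, convert strings to proper types
rows = [(int(r["order_id"]), float(r["amount"]))
        for r in reader if r["amount"]]

# Load: insert the cleaned rows into a target database table
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)", rows)

total = con.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

In practice the extract step would hit files, APIs, or operational databases, but the extract-transform-load shape stays the same.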
3. Data Preparation:
Objective: Prepare the data for analysis by ensuring it is clean, consistent, and formatted correctly.
Activities: This involves data cleaning (removing duplicates, correcting errors, handling missing values),
data transformation (converting data types, scaling), and feature engineering (creating new variables that
may improve model performance).
Output: A high-quality, ready-to-analyze dataset that accurately reflects the information needed for modeling.
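A minimal data-preparation sketch, covering deduplication, mean imputation, and min-max scaling on hypothetical customer records:

```python
# Hypothetical raw dataset with a duplicate, a missing value, and mixed scales
raw = [
    {"customer": "A", "age": 34, "income": 52000},
    {"customer": "A", "age": 34, "income": 52000},   # duplicate record
    {"customer": "B", "age": None, "income": 61000}, # missing age
    {"customer": "C", "age": 45, "income": 48000},
]

# Cleaning: remove exact duplicates, preserving order
seen, deduped = set(), []
for r in raw:
    key = tuple(r.items())
    if key not in seen:
        seen.add(key)
        deduped.append(r)

# Missing values: impute age with the mean of the known ages
known = [r["age"] for r in deduped if r["age"] is not None]
mean_age = sum(known) / len(known)
for r in deduped:
    if r["age"] is None:
        r["age"] = mean_age

# Transformation: min-max scale income to [0, 1] as a derived feature
incomes = [r["income"] for r in deduped]
lo, hi = min(incomes), max(incomes)
for r in deduped:
    r["income_scaled"] = (r["income"] - lo) / (hi - lo)
```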
4. Exploratory Data Analysis (EDA):
Objective: Explore the data to understand its underlying structure, detect patterns, spot anomalies, and form
hypotheses for further analysis.
Activities: Use statistical techniques, summary statistics, and data visualization tools (like histograms,
scatter plots, and box plots) to explore relationships and trends within the data.



Output: Insights into data characteristics, identified patterns, potential correlations, and anomalies that
guide the next steps in the modeling process.
5. Modeling and Analysis:
Objective: Develop models that provide insights, make predictions, or solve specific business problems
using the prepared data.
Activities: Apply various modeling techniques such as regression, classification, clustering, machine
learning algorithms, or optimization methods, depending on the problem.
Model Selection: Choose the best model based on performance metrics such as accuracy, precision, recall,
or other relevant measures.
Output: A validated model or set of models that provide actionable insights or predictions based on the
data.
6. Interpretation and Communication:
Objective: Translate the results of the analysis into meaningful insights that stakeholders can understand
and use for decision making.
Activities: Interpret the results by linking them back to the business objectives. Use visualizations,
dashboards, and storytelling techniques to effectively communicate findings.
Output: A report or presentation that summarizes the key findings, insights, and recommendations, making them accessible to non-technical stakeholders.
7. Implementation and Action:
Objective: Implement the insights and recommendations derived from the analytics in a practical, real-world context.
Activities: This may involve making strategic decisions, changing operational processes, launching new
initiatives, or integrating models into business applications (e.g., automated decision making systems).
Challenges: Ensure stakeholder buy-in, align actions with business goals, and measure the impact of the changes implemented.
Output: Tangible business actions taken based on the insights, with mechanisms in place to measure their
impact on business performance.
8. Monitoring and Iteration:
Objective: Continuously monitor the performance and impact of the implemented solutions, ensuring they
remain effective over time.
Activities: Track key performance indicators (KPIs), validate the model's accuracy with new data, and make
necessary adjustments.
Iteration involves revisiting previous stages to refine models, incorporate new data, or address changing
business needs.
Feedback Loop: Use monitoring results to inform future cycles, making the process dynamic and adaptive.
Output: A loop of continuous improvement, where analytics solutions are regularly updated and optimized,
ensuring sustained value and alignment with business objectives.

1.3 Types of Analytics


• There are four types of analytics: Descriptive, Diagnostic, Predictive, and Prescriptive. Through them you can eliminate flaws and promote informed decisions.
• Implementing these methods makes decision-making much more efficient; however, the right combination of analytics is essential.
• Each type of analytics has its own reasoning and calculated consequences, so you are rarely caught off guard, and each is backed by a structured process for analyzing data at every stage.



Descriptive Analysis
• The first type of data analysis is descriptive analysis. It is the foundation of all data insight, and the simplest and most common use of data in business today.
• Descriptive analysis answers "what happened" by summarizing past data, usually in the form of dashboards.
• The biggest use of descriptive analysis in business is to track Key Performance Indicators (KPIs). KPIs describe how a business is performing against chosen benchmarks.
Business applications of descriptive analysis include:
• KPI dashboards
• Monthly revenue reports
• Sales leads overview
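A monthly revenue report and a growth KPI, two classic descriptive-analytics outputs, reduce to simple aggregation. The transactions below are hypothetical:

```python
from collections import defaultdict

# Hypothetical transaction log: descriptive analytics summarizes what happened
transactions = [
    {"month": "Jan", "region": "North", "revenue": 1200},
    {"month": "Jan", "region": "South", "revenue": 800},
    {"month": "Feb", "region": "North", "revenue": 1500},
    {"month": "Feb", "region": "South", "revenue": 950},
]

# Monthly revenue report: total revenue per month
monthly = defaultdict(int)
for t in transactions:
    monthly[t["month"]] += t["revenue"]

# KPI example: month-over-month revenue growth rate
growth = (monthly["Feb"] - monthly["Jan"]) / monthly["Jan"]
```

A dashboard would render `monthly` as a chart and `growth` as a headline KPI tile.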
Diagnostic Analysis
• After asking "what happened," you may want to dive deeper and ask why it happened. This is where diagnostic analysis comes in.
• Diagnostic analysis takes the insight found through descriptive analytics and drills down to find the cause of that outcome. Organizations use this type of analytics because it creates more connections between data and identifies patterns of behavior.
• A critical aspect of diagnostic analysis is creating detailed information. When a new problem arises, you may already have collected data pertaining to the issue; having that data at your disposal avoids repeated work and connects related problems.
Business applications of diagnostic analysis include:
• A freight company investigating the cause of slow shipments in a certain region
• A SaaS company drilling down to determine which marketing activities increased trials
Predictive Analysis
• Predictive analysis attempts to answer the question "what is likely to happen". It uses previous data to make predictions about future outcomes.
• This type of analysis is another step up from descriptive and diagnostic analysis.
• Predictive analysis uses the data already summarized to make logical predictions about the outcomes of events.
• This analysis relies on statistical modeling, which requires additional technology and manpower to forecast. It is also important to understand that a forecast is only an estimate; the accuracy of predictions depends on quality, detailed data.
• While descriptive and diagnostic analysis are common practice in business, predictive analysis is where many organizations begin to show signs of difficulty. Some companies do not have the manpower to implement predictive analysis everywhere they would like; others are not yet willing to invest in analysis teams across every department, or are not prepared to educate their current teams.
Business applications of predictive analysis include:
• Risk assessment
• Sales forecasting
• Using customer segmentation to determine which leads have the best chance of converting
• Predictive analytics in customer success teams
Prescriptive Analysis
• The final type of data analysis is the most sought after, but few organizations are truly equipped to perform it.
• Prescriptive analysis is the frontier of data analysis, combining the insight from all previous analyses to determine the course of action to take on a current problem or decision.
• Prescriptive analysis uses state-of-the-art technology and data practices. It is a huge organizational commitment, and companies must be sure they are ready and willing to put forth the effort and resources.
6 Steps in the Business Analytics Process

Step 1: Identifying the Problem


The first step of the process is identifying the business problem. The problem could be an actual crisis; it
could be something related to recognizing business needs or optimizing current processes. This is a crucial
stage in Business Analytics as it is important to clearly understand what the expected outcome should be.
When the desired outcome is determined, it is further broken down into smaller goals. Then, business
stakeholders decide on the relevant data required to solve the problem. Some important questions must be
answered in this stage, such as: What kind of data is available? Is there sufficient data? And so on.
Step 2: Exploring Data
Once the problem statement is defined, the next step is to gather data (if required) and, more importantly,
cleanse the data—most organizations would have plenty of data, but not all data points would be accurate or
useful. Organizations collect huge amounts of data through different methods, but at times, junk data or
empty data points would be present in the dataset. These faulty pieces of data can hamper the analysis.
Hence, it is very important to clean the data that has to be analyzed.



To do this, you must impute the missing data, remove outliers, and derive new variables as combinations of other variables. You may also need to plot time series graphs, as they generally reveal patterns and outliers. Removing outliers is very important, as they can have a heavy impact on the accuracy of the model that you create. Moreover, cleaning the data helps you get a better sense of the dataset.
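One common way to remove outliers, sketched here on hypothetical daily order values, is the interquartile-range (IQR) rule:

```python
from statistics import quantiles

# Hypothetical daily order values with an empty data point and an extreme outlier
orders = [120, 135, 128, 140, 132, 9999, 125, 138, None, 130]

# Remove empty data points first
values = [v for v in orders if v is not None]

# IQR rule: flag values beyond 1.5 * IQR outside the quartiles as outliers
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [v for v in values if lo <= v <= hi]
```

Plotting `values` as a time series before and after would show how a single extreme point distorts the picture.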
Step 3: Analysis
Once the data is ready, the next step is to analyze it. Various statistical methods (such as hypothesis testing and correlation analysis) are used to find the insights you are looking for; you can use any method for which you have the data.
The prime way of analyzing is pivoting around the target variable, so you need to take into account all the factors that affect it. In addition, many assumptions are considered when projecting possible outcomes. Generally, at this step the data is sliced and comparisons are made. Through these methods, you are looking for actionable insights.
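Slicing the data and comparing segments around a target variable looks like this in miniature, using hypothetical churn records:

```python
from statistics import mean

# Hypothetical customer records; churn is the target variable
customers = [
    {"plan": "basic",   "tickets": 5, "churned": True},
    {"plan": "basic",   "tickets": 4, "churned": True},
    {"plan": "basic",   "tickets": 1, "churned": False},
    {"plan": "premium", "tickets": 2, "churned": False},
    {"plan": "premium", "tickets": 0, "churned": False},
    {"plan": "premium", "tickets": 6, "churned": True},
]

# Slice by a factor (plan) and compare churn rates across the segments
def churn_rate(segment):
    rows = [c for c in customers if c["plan"] == segment]
    return mean(c["churned"] for c in rows)

rates = {plan: churn_rate(plan) for plan in ("basic", "premium")}

# Compare the average support-ticket count of churned vs retained customers
avg_tickets_churned = mean(c["tickets"] for c in customers if c["churned"])
avg_tickets_kept = mean(c["tickets"] for c in customers if not c["churned"])
```

A formal hypothesis test would then check whether the gap between segments is statistically significant rather than noise.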
Step 4: Prediction and Optimization
Gone are the days when analytics was used only to react. In today's era, Business Analytics is all about being proactive. In this step, you use prediction techniques, such as neural networks or decision trees, to model the data. These techniques help uncover hidden insights and relationships between variables, which in turn reveal patterns in the most important metrics. In practice, many models are used simultaneously, and the models with the highest accuracy are chosen. At this stage, many conditions are also checked as parameters, and answers to many 'what if...?' questions are provided.
Step 5: Making a Decision and Evaluating the Outcome
From the insights you receive from the model built on target variables, a viable plan of action is established in this step to meet the organization's goals and expectations. The plan of action is then put to work, and the waiting period begins: you wait to see the actual outcomes of your predictions and find out how successful you were in your endeavors. Once you get the outcomes, you measure and evaluate them.
Step 6: Optimizing and Updating
Post the implementation of the solution, the outcomes are measured as mentioned above. If you find some
methods through which the plan of action can be optimized, then those can be implemented. If that is not the
case, then you can move on with registering the outcomes of the entire process. This step is crucial for any
analytics in the future because you will have an ever-improving database. Through this database, you can get
closer and closer to maximum optimization. In this step, it is also important to evaluate the ROI (return on
investment). Take a look at the diagram below of the life cycle of business analytics.
1.4 Business Problem definition
• Problem-solving in business is defined as implementing processes that reduce or remove obstacles that are preventing you or others from accomplishing operational and strategic business goals.
• In business, a problem is a situation that creates a gap between the desired and actual outcomes. In addition, a true problem typically does not have an immediately obvious resolution.
Why Problem Solving Is Important in Business
Understanding the importance of problem-solving skills in the workplace will help you develop as a
leader. Problem-solving skills will help you resolve critical issues and conflicts that you come across.
Problem-solving is a valued skill in the workplace because it allows you to:
● Apply a standard problem-solving system to all challenges
● Find the root causes of problems
● Quickly deal with short-term business interruptions
● Form plans to deal with long-term problems and improve the organization
● See challenges as opportunities
● Keep your cool during a crisis
How to Solve Business Problems Effectively
There are many different problem-solving skills, but most can be broken into general steps.
Identify the Key Question:
• Align with business goals: optimize efficiency, improve customer satisfaction, increase sales, reduce costs, or find new opportunities.
Gather Stakeholder Inputs:
• Engage business leaders, managers, experts, and end-users.
• Understand their challenges, pain points, and desired outcomes.
Define the Scope:
• Set clear boundaries and focus areas within the organization.
• Manage expectations by keeping the analytics effort feasible and actionable.
Formulate Measurable Objectives:
• Develop SMART objectives (Specific, Measurable, Attainable, Relevant, Time-bound).
• Example: Increase customer retention by 10% in six months, or reduce inventory costs by 15% in a year.
Analyze Root Causes:
• Investigate underlying factors or drivers contributing to the issue.
• Helps in creating targeted analytics models and strategies.
Consider Data Availability:
• Assess the quality and availability of relevant data sources.
• Identify any data gaps to address for project feasibility.
Evaluate Business Impact:
• Assess potential financial, operational, or strategic impacts.
• Quantify benefits and demonstrate the ROI of the analytics solution.
Document the Problem Definition:
• Summarize the problem definition, objectives, scope, and findings.
• Use it as a reference throughout the analytics project.
1.5 Data collection
Data collection is crucial in business analytics as it enables organizations to gather relevant information for
informed decision-making and gaining insights.
Data collection is the methodological process of gathering information about a specific subject.
Types of Data:
 Customer Data: Information on customer behavior, preferences, and feedback.
 First-party data, which is collected directly from users by your organization
 Second-party data, which is data shared by another organization about its customers (or its
first-party data)
 Third-party data, which is data that's been aggregated and rented or sold by organizations
that don't have a connection to your company or users
 Sales Figures: Data on sales performance and trends.
 Financial Records: Details of financial transactions and accounting data.
 Market Trends: Insights into market dynamics and competitive landscape.
Data Collection Channels: Surveys, interviews, website tracking tools, social media monitoring, and
transactional databases.
Internal Data Sources:
 Enterprise Systems: Data from Customer Relationship Management (CRM) platforms, Enterprise
Resource Planning (ERP) systems, sales, and inventory databases.
 Provides a comprehensive view of internal operations and performance metrics.
External Data Sources:
 Market research firms, government databases, industry reports, and publicly available datasets.
 Helps businesses understand the broader market landscape and external factors impacting the
organization.
In the data life cycle, data collection is the second step. After data is generated, it must be collected to be of
use to your team. After that, it can be processed, stored, managed, analyzed, and visualized to aid in your
organization's decision-making.

Before collecting data, there are several factors you need to define:
 The question you aim to answer
 The data subject(s) you need to collect data from
 The collection timeframe
 The data collection method(s) best suited to your needs
Data Collection Methods
 Surveys and Questionnaires: Collect customer feedback and quantitative data.
 Interviews: Capture in-depth qualitative insights from stakeholders or customers.
 Data Scraping: Extract relevant information from websites or external sources.
 Data Integration: Combine data from multiple sources for a unified view.
Ensuring Data Quality
 Data Governance Frameworks
o Establish processes to validate, clean, and manage data.
 Quality Assurance
o Remove inconsistencies and errors, and ensure data accuracy.
 Compliance
o Adhere to data privacy regulations (e.g., GDPR) to protect sensitive information.
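The quality-assurance checks above can be sketched in Python. This is a minimal illustration, assuming pandas is available; the dataset and column names are hypothetical:

```python
import pandas as pd

# Hypothetical customer dataset with typical quality issues
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "age": [34, None, None, 29, 41],
    "country": ["IN", "IN", "IN", "US", "us"],
})

# Validate: flag duplicate records and missing values before analysis
duplicates = df.duplicated(subset="customer_id").sum()
missing = df["age"].isna().sum()

# Clean: drop duplicate records and standardise categorical codes
clean = df.drop_duplicates(subset="customer_id").copy()
clean["country"] = clean["country"].str.upper()

print(duplicates, missing, len(clean))
```

In practice such checks run inside a data-governance pipeline rather than ad hoc, but the logic of validate-then-clean is the same.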
Methods of Collecting Data
There are two different methods of collecting data: Primary Data Collection and Secondary Data
Collection.
Primary Data
Primary data refers to information collected directly from first-hand sources specifically for a
particular research purpose. This type of data is gathered through various methods, including surveys,
interviews, experiments, observations, and focus groups. One of the main advantages of primary data



is that it provides current, relevant, and specific information tailored to the researcher's needs, offering
a high level of accuracy and control over data quality.

Methods of Collecting Primary Data


There are a number of methods of collecting primary data. Some of the common methods are as follows:
1. Interviews: Collect data through direct, one-on-one conversations with individuals. The investigator
asks questions either directly from the source or from its indirect links.
1. Direct Personal Investigation: The method of direct personal investigation involves collecting
data personally from the source of origin. In simple words, the investigator makes direct contact
with the person from whom he/she wants to obtain information. For example, direct contact with
the household women to obtain information about their daily routine and schedule.
2. Indirect Oral Investigation: In the indirect oral investigation method of collecting primary data,
the investigator does not make direct contact with the person from whom he/she needs information,
instead they collect the data orally from some other person who has the necessary required
information. For example, collecting data of employees from their superiors or managers.
 Advantage: Allows in-depth, detailed responses and immediate follow-up questions.
 Disadvantage: Time-consuming; responses may be affected by interviewer bias.
 Suitable Use Case: Stakeholder consultations, in-depth customer research.
2. Questionnaires: Collect data by asking people a set of questions, either online, on paper, or face-to-
face. In this method, the investigator prepares a questionnaire to collect information through
questionnaires and schedules, keeping in mind the motive of the study. The investigator can
collect data through the questionnaire in two ways:
1. Mailing Method: This method involves mailing the questionnaires to the informants for the
collection of data. The investigator attaches a letter with the questionnaire in the mail to define the
purpose of the study or research.
2. Enumerator’s Method: This method involves the preparation of a questionnaire according to the
purpose of the study or research. However, in this case, the enumerator reaches out to the
informants himself with the prepared questionnaire.
 Advantage: Can reach a large audience quickly and cost-effectively.
 Disadvantage: Responses may be biased or inaccurate; low response rates.
 Suitable Use Case: Customer satisfaction surveys, market research.
3. Observations: The observation method involves collecting data by watching and recording behaviors,
events, or conditions as they naturally occur. The observer systematically watches and notes specific
aspects of a subject's behavior or the environment, either covertly or overtly.
 Advantage: Provides real-time, authentic data without reliance on self-reported information.
 Disadvantage: Observer bias can influence the results, and the presence of an observer might alter
subjects' behavior.



 Suitable Use Case: Studying user interactions with a product in a natural setting, monitoring wildlife
behavior, or assessing classroom dynamics.
4. Experiments: The experiment method involves manipulating one or more variables to determine their
effect on another variable, within a controlled environment. Researchers create two groups (control and
experimental), apply the treatment or variable to the experimental group, and compare the outcomes
between the groups.
 Advantage: Allows for the establishment of cause-and-effect relationships with high precision.
 Disadvantage: Experiments can be artificial, limiting the ability to generalize findings to real-world
settings, and they can be resource-intensive.
 Suitable Use Case: Testing the efficacy of a new drug, assessing the impact of a new teaching
method, or evaluating the effect of a marketing campaign.
5. Focus Group: The focus group method involves gathering a small group of people to discuss a specific
topic or product, facilitated by a moderator. A group of 6-12 participants engages in a guided discussion
led by a moderator who asks open-ended questions to elicit opinions, attitudes, and perceptions.
 Advantage: Provides in-depth insights and diverse perspectives through interactive discussions,
revealing the reasoning behind participants' thoughts and feelings.
 Disadvantage: Results can be influenced by dominant participants or groupthink, and the findings are
not easily generalizable due to the small, non-representative sample size.
 Suitable Use Case: Exploring customer attitudes towards a new product, gathering feedback on a
marketing campaign, or understanding public opinion on social issues.
6. Information from Local Sources or Correspondents: In this method, the investigator appoints
correspondents or local persons at various places, who collect the data and furnish it to the investigator.
With the help of correspondents and local persons, investigators can cover a wide area.
Secondary Data
Secondary data refers to information that has already been collected, processed, and published by
others. This type of data can be sourced from existing research papers, government reports, books,
statistical databases, and company records. The advantage of secondary data is that it is readily
available and often free or less expensive to obtain compared to primary data. It saves time and
resources since the data collection phase has already been completed.
Methods of Collecting Secondary Data
Secondary data can be collected through different published and unpublished sources. Some of them are
as follows:
1. Published Sources
 Government Publications: The government publishes various documents containing different
varieties of information or data, released by the Ministries and the Central and State Governments in India as
part of their routine activity. As the government publishes these statistics, they are fairly reliable for the
investigator. Examples of government publications on statistics are the Annual Survey of Industries, the
Statistical Abstract of India, etc.
 Semi-Government Publications: Different Semi-Government bodies also publish data related to
health, education, deaths and births. These kinds of data are also reliable and used by different
informants. Some examples of semi-government bodies are Metropolitan Councils, Municipalities,
etc.
 Publications of Trade Associations: Various big trade associations collect and publish data from
their research and statistical divisions of different trading activities and their aspects. For example,
data published by Sugar Mills Association regarding different sugar mills in India.
 Journals and Papers: Different newspapers and magazines provide a variety of statistical data in
their writings, which are used by different investigators for their studies.
 International Publications: Different international organizations like IMF, UNO, ILO, World Bank,
etc., publish a variety of statistical information which are used as secondary data.
 Publications of Research Institutions: Research institutions and universities also publish their
research activities and their findings, which are used by different investigators as secondary data. For
example National Council of Applied Economics, the Indian Statistical Institute, etc.
2. Unpublished Sources
Unpublished sources are another source of collecting secondary data. The data in unpublished sources
is collected by different government organizations and other organizations. These organizations
usually collect data for their self-use and are not published anywhere. For example, research work
done by professors, professionals, teachers and records maintained by business and private enterprises.

1.6 Data preparation


Data preparation, also called pre-processing, is the act of cleaning and consolidating raw data prior
to using it for business analysis. It might not be the most celebrated of tasks, but careful data
preparation is a key component of successful data analysis.
Purpose: Transform raw data into a clean, structured, and organized format suitable for analysis.
Key Goals: Ensure data quality, consistency, and relevance.
Why Data Preparation Is Important
The decisions that business leaders make are only as good as the data that supports them. Careful
and comprehensive data preparation ensures analysts trust, understand, and ask better questions of their
data, making their analyses more accurate and meaningful. From more meaningful data analysis comes
better insights and, of course, better outcomes.
To drive the deepest level of analysis and insight, successful teams and organizations must implement a data
preparation strategy that prioritizes:
 Accessibility: Anyone — regardless of skillset — should be able to access data securely from a
single source of truth
 Transparency: Anyone should be able to see, audit, and refine any step in the end-to- end data
preparation process that took place
 Repeatability: Data preparation is notorious for being time-consuming and repetitive, which is
why successful data preparation strategies invest in solutions built for repeatability.
With the right solution in hand, analysts and teams can streamline the data preparation process, and
instead, spend more time getting to valuable business insights and outcomes, faster.
Data Preparation –Steps
Data Cleaning:
 Identify and rectify errors, inconsistencies, and missing values.
 Remove duplicates, handle missing data through imputation, and address outliers.
 Improves dataset accuracy and reliability for subsequent analysis.
Data Integration:
 Combine data from multiple sources into a unified dataset.
 Match and merge records, resolve discrepancies in formats, and establish relationships between
datasets.
 Provides a comprehensive view for better decision-making.
Data Transformation:
 Convert data into a format suitable for analysis.
 Standardize units, aggregate data (e.g., daily to monthly), and create new variables.
 Ensures consistency and comparability across variables.



Data Reduction:
 Apply techniques like Principal Component Analysis (PCA) or feature selection to reduce dataset
complexity.
 Focus on relevant variables to improve computational efficiency and simplify analysis.
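Data reduction via PCA can be sketched briefly in Python. This is an illustrative example, assuming scikit-learn is available; the dataset is synthetic and deliberately built so that two components capture almost all of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dataset: 100 observations of 5 correlated variables,
# constructed so the last 3 columns are linear combinations of the first 2
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.hstack([base, base @ rng.normal(size=(2, 3))])

# Reduce the 5 original variables to 2 principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)
print(pca.explained_variance_ratio_.sum())
```

Real business data is rarely this clean, so the explained-variance ratio is used to decide how many components to keep.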
Data Formatting:
 Structure data in a standardized format, adhering to a predefined schema.
 Define variable names, data types (numerical, categorical, textual), and organize into rows and
columns.
 Facilitates easier analysis and data handling.
Data Validation and Verification:
 Check data accuracy and integrity against business rules, external sources, or known benchmarks.
 Conduct plausibility checks to identify and correct errors.
 Ensures reliability and validity of the prepared dataset for analysis.

The data preparation process can vary depending on industry or need, but typically consists of the
following steps:
 Acquiring data: Determining what data is needed, gathering it, and establishing consistent access to
build powerful, trusted analysis
 Exploring data: Determining the data's quality, examining its distribution, and analyzing the
relationship between each variable to better understand how to compose an analysis
 Cleansing data: Improving data quality and overall productivity to craft error-proof insights
 Transforming data: Formatting, orienting, aggregating, and enriching the datasets used in an analysis
to produce more meaningful insights
1. Acquire Data
The first step in any data preparation process is acquiring the data that an analyst will use for their
analysis. It's likely that analysts rely on others (like IT) to obtain data for their analysis, often from an
enterprise software system or data management system. IT will usually deliver this data in an accessible
format like an Excel document or CSV.
Modern analytic software can remove the dependency on a data-wrangling middleman by tapping right into
trusted sources like SQL, Oracle, SPSS, AWS, Snowflake, Salesforce, and Marketo. This means analysts
can acquire the critical data for their regularly scheduled reports as well as novel analytic projects on their
own.
2. Explore Data
Examining and profiling data helps analysts understand how their analysis will begin to take shape.
Analysts can utilize visual analytics and summary statistics like range, mean, and standard
deviation to get an initial picture of their data. If data is too large to work with easily, segmenting
it can help.
During this phase, analysts should also evaluate the quality of their dataset. Is the data complete?
Are the patterns what was expected? If not, why? Analysts should discuss what they're seeing with the
owners of the data, dig into any surprises or anomalies, and consider if it's even possible to improve



the quality. While it can feel disappointing to disqualify a dataset based on poor quality, it is a wise
move in the long run. Poor quality is only amplified as one moves through the data analytics process.
3. Cleanse Data
During the exploration phase, analysts may notice that their data is poorly structured and in need of
tidying up to improve its quality. This is where data cleansing comes into play. Cleansing data
includes:
 Correcting entry errors
 Removing duplicates or outliers
 Eliminating missing data
 Masking sensitive or confidential information like names or addresses
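The four cleansing tasks above can be sketched in Python. A minimal example, assuming pandas is available; the records and the masking rule are hypothetical:

```python
import pandas as pd

# Hypothetical sales records with entry errors, duplicates, and personal names
sales = pd.DataFrame({
    "order_id": [101, 102, 102, 103],
    "amount":   [250.0, -40.0, -40.0, None],
    "customer": ["Asha R", "Vikram S", "Vikram S", "Meena K"],
})

# Remove duplicates
sales = sales.drop_duplicates()

# Correct entry errors: here, negative amounts are treated as sign mistakes
sales["amount"] = sales["amount"].abs()

# Handle missing data: impute with the median amount
sales["amount"] = sales["amount"].fillna(sales["amount"].median())

# Mask sensitive information: reduce each name word to an initial
sales["customer"] = sales["customer"].str.replace(r"(\w)\w*", r"\1.", regex=True)

print(sales)
```

How each error is corrected (sign fix, imputation, masking rule) is a business decision; the code only shows the mechanics.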
4. Transform Data
Data comes in many shapes, sizes and structures. Some is analysis-ready, while other datasets may look
like a foreign language.
Transforming data to ensure that it's in a format or structure that can answer the questions being asked of
it is an essential step to creating meaningful outcomes. This will vary based on the software or language
that an analyst uses for their data analysis.
A couple of common examples of data transformations are:
● Pivoting or changing the orientation of data
● Converting date formats
● Aggregating sales and performance data across time
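All three transformations listed above can be shown together in a short Python sketch, assuming pandas is available; the sales figures are hypothetical:

```python
import pandas as pd

# Hypothetical daily sales records
raw = pd.DataFrame({
    "date": ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-17"],
    "region": ["North", "South", "North", "South"],
    "sales": [120, 80, 150, 95],
})

# Converting date formats: parse strings into proper datetimes
raw["date"] = pd.to_datetime(raw["date"])

# Aggregating across time: roll daily sales up to monthly totals per region
monthly = (raw
           .groupby([raw["date"].dt.to_period("M"), "region"])["sales"]
           .sum()
           .reset_index())

# Pivoting: reorient the data so each region becomes a column
wide = monthly.pivot(index="date", columns="region", values="sales")
print(wide)
```

The wide, region-per-column layout is often what a reporting or charting tool expects, while the long layout suits statistical modeling.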
1.7 Hypothesis generation
Definition: Hypothesis generation involves formulating educated assumptions based on data, domain
knowledge, and problem context.
Purpose: To guide the analysis by proposing specific statements or relationships between variables that can
be tested and validated.
Problem Understanding:
• Gain a deep understanding of the business problem or research question.
• Gather relevant information, consult experts, and review literature or industry reports.
• Identify key variables influencing the outcome.
Exploratory Data Analysis:
• Use visualizations, summary statistics, and data profiling to identify patterns and potential
relationships.
• Generate initial insights and identify variables with significant impacts.
Domain Knowledge and Expertise:
• Leverage industry knowledge, market dynamics, and customer behavior.
• Engage domain experts and stakeholders for insights and hypothesis suggestions.
Existing Research and Literature:
• Review studies, papers, case studies, and reports relevant to the problem.
• Draw on existing theories and models to inspire and formulate hypotheses.
Data-Driven Hypothesis Formulation:
• Formulate specific, testable hypotheses based on insights from prior steps.
• Example: "Increased marketing expenditure leads to higher sales" or "Customer satisfaction
is higher for premium products."
• Ensure alignment with analysis objectives and testability with available data.
Refining and Prioritizing Hypotheses:
• Refine and prioritize hypotheses based on impact, feasibility, and resource constraints.
• Focus on those with the highest relevance and potential for actionable insights.



Hypothesis Generation vs. Hypothesis Testing

 Hypothesis generation is a process beginning with an educated guess whereas hypothesis testing is a
process to conclude that the educated guess is true/false or the relationship between the variables is
statistically significant or not.
 This latter part could be used for further research using statistical proof. A hypothesis is accepted or
rejected based on the significance level and test score of the test used for testing the hypothesis

 Null hypothesis (H₀): The null hypothesis is the starting assumption in statistics. It says there is no
difference or relationship between groups. For example, a company claims its average production is 50 units per
day. Here:
H₀: The mean daily production (μ) = 50.
 Alternative hypothesis (H₁): The alternative hypothesis is the opposite of the null hypothesis; it
suggests there is a difference between groups. If the company's production is not equal to 50 units
per day, the alternative hypothesis would be:
H₁: The mean daily production (μ) ≠ 50.
Key Terms of Hypothesis Testing
 Level of significance: It refers to the degree of significance at which we accept or reject the null
hypothesis. 100% accuracy is not possible when accepting a hypothesis, so we select a level of
significance. This is normally denoted by α and is generally 0.05 or 5%, which means your output
should be 95% confident to give a similar kind of result in each sample.
 P-value: When analyzing data, the p-value tells you the likelihood of seeing your result if the null
hypothesis is true. If your p-value is less than the chosen significance level, you reject the null
hypothesis; otherwise, you fail to reject it.
 Test Statistic: The test statistic is the number that helps you decide whether your result is significant. It is
calculated from the sample data you collect; for example, it could be used to test whether a machine learning model
performs better than a random guess.
 Critical value: The critical value is a boundary or threshold that helps you decide if your test statistic is
extreme enough to reject the null hypothesis.
 Degrees of freedom: Degrees of freedom are important when we conduct statistical tests; they help
you understand how much the data can vary.
Seven steps of hypothesis testing
Step 1: Specify the null hypothesis and the alternative hypothesis
Step 2: What level of significance?
Step 3: Which test and test statistic are to be performed?
Step 4: State the decision rule
Step 5: Use the sample data to calculate the test statistic
Step 6: Use the test statistic result to make a decision
Step 7: Interpret the decision in the context of the original question
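The seven steps above can be sketched with a one-sample t-test on the earlier production example (H₀: μ = 50). This is an illustrative Python sketch, assuming SciPy and NumPy are available; the sample data is simulated:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 30 daily production figures
rng = np.random.default_rng(42)
sample = rng.normal(loc=52, scale=4, size=30)

# Steps 1-2: H0: mu = 50, H1: mu != 50, at significance level alpha = 0.05
alpha = 0.05

# Steps 3-5: choose a one-sample t-test and compute the test statistic and p-value
t_stat, p_value = stats.ttest_1samp(sample, popmean=50)

# Steps 4 and 6: decision rule: reject H0 if the p-value is below alpha
decision = "reject H0" if p_value < alpha else "fail to reject H0"

# Step 7: interpret the decision in context (is mean production really 50 units?)
print(t_stat, p_value, decision)
```

The interpretation step is done by the analyst, not the code: rejecting H₀ here would mean the data is inconsistent with the claimed 50 units per day.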
1.8 Modeling
Definition: Modeling involves creating mathematical or statistical models to represent business processes,
relationships, or phenomena.
Purpose: To analyze data, make predictions, and derive actionable insights for decision-making.
Steps in the Modeling Process:
Problem Formulation:
• Clearly define the business problem or objective.
• Understand the context, identify key variables, and determine the scope and constraints.
Data Preparation:
• Clean, integrate, transform, and format data for modeling.



• Address missing values, outliers, and ensure data quality.
Model Selection:
• Select appropriate modeling techniques or algorithms (e.g., regression, classification, clustering).
• Train the model using the prepared data to establish relationships between variables.
Model Development:
• Parameter estimation, optimization, and calibration using training data.
• The model is designed to capture the relationship between the input variables and the target variables of interest.
Model Deployment:
• Deploy the model for practical use in the business environment.
• Integrate the model into operational systems, create user interfaces, or incorporate it into decision
support tools.
Model Monitoring and Maintenance:
• Continuously monitor the deployed model's performance.
• Update the model as needed, retrain with new data, and adapt to changing business conditions.
Model training
Depending on the type of question that you're trying to answer, there are many modeling algorithms
available; many are provided through open-source packages in R or Python. The process for model
training includes the following steps:
 Split the input data randomly for modeling into a training data set and a test data set.
 Build the models by using the training data set.
 Evaluate the training and the test data set. Use a series of competing machine-learning algorithms
along with the various associated tuning parameters (known as a parameter sweep) that are geared
toward answering the question of interest with the current data.
 Determine the "best" solution to answer the question by comparing the success metrics between
alternative methods.
1.9 Validation and Evaluation
Model Evaluation
After training, the data scientist focuses next on model evaluation.
 Checkpoint decision: Evaluate whether the model performs sufficiently for production. Some key
questions to ask are:
o Does the model answer the question with sufficient confidence given the test data?
o Should you try any alternative approaches?
o Should you collect additional data, do more feature engineering, or experiment with other
algorithms?
 Interpreting the Model:
o Explain the entire model's behavior or individual predictions locally on your personal machine.
o Enable interpretability techniques for engineered features.
o Explain the behavior of the entire model and of individual predictions in the deployment environment (e.g., Azure).
o Use a visualization dashboard to interact with your model explanations.
o Deploy a scoring explainer alongside your model to observe explanations during inferencing.
 Assessing Fairness:
o Assess the fairness of your model predictions. This process will help you learn more about
fairness in machine learning.
In this stage, organizations use the selected analytical techniques and models to build the actual data models.
This involves training the models on the prepared data, tuning the model parameters, and validating the
model using different evaluation techniques. The model building stage requires a deep understanding of the
selected analytical techniques and algorithms, as well as domain-specific knowledge to interpret the results
accurately.
 Model Evaluation is an integral part of the model development process. It helps to find the best
model that represents our data and how well the chosen model will work in the future.
 There are two methods of evaluating models in data science: Hold-Out and Cross-Validation. To
avoid overfitting, both methods use a test set (not seen by the model) to evaluate model performance.
Hold-Out: In this method, the (usually large) dataset is randomly divided into three subsets:
1. Training set is a subset of the dataset used to build predictive models.
2. Validation set is a subset of the dataset used to assess the performance of the model built in the training
phase. It provides a test platform for fine-tuning the model's parameters and selecting the best-performing
model. Not all modelling algorithms need a validation set.
3. Test set or unseen examples is a subset of the dataset to assess the likely future performance of a
model. If a model fit to the training set much better than it fits the test set, overfitting is probably the
cause.
Cross-Validation: When only a limited amount of data is available, to achieve an unbiased estimate of the
model performance we use k-fold cross-validation. In k-fold cross-validation, we divide the data into k
subsets of equal size. We build models k times, each time leaving out one of the subsets from training and
use it as the test set. If k equals the sample size, this is called "leave-one-out".
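K-fold cross-validation as described above can be sketched briefly in Python, assuming scikit-learn is available; the regression dataset is synthetic:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Hypothetical small dataset, where a single hold-out split would waste data
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=1)

# 5-fold cross-validation: the model is built 5 times, each time leaving out
# one of the 5 equal-sized subsets and using it as the test set
cv = KFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

# Averaging the 5 fold scores gives a less biased estimate of performance
print(scores.mean())
```

Setting `n_splits` equal to the sample size would give the leave-one-out variant mentioned above.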
Model Validation
 Model validation is defined within regulatory guidance as "the set of processes and activities
intended to verify that models are performing as expected, in line with their design objectives, and
business uses." It also identifies "potential limitations and assumptions, and assesses their possible
impact."
 Generally, validation activities are performed by individuals independent of model development or
use; models, therefore, should not be validated by their owners. Models can be highly technical, and
some institutions may find it difficult to assemble a model risk team that has sufficient functional and
technical expertise to carry out independent validation. When faced with this obstacle, institutions
often outsource the validation task to third parties.
 In statistics, model validation is the task of confirming that the outputs of a statistical model are
acceptable with respect to the real data-generating process. In other words, model validation is the
task of confirming that the outputs of a statistical model have enough fidelity to the outputs of the
data-generating process that the objectives of the investigation can be achieved.

The Four Elements


Model validation consists of four crucial elements which should be considered:

1. Conceptual Design
The foundation of any model validation is its conceptual design, which needs a documented coverage
assessment that supports the model's ability to meet business and regulatory needs and the unique risks
facing a bank.
2. System Validation
All technology and automated systems implemented to support models have limitations. An effective
validation includes: firstly, evaluating the processes used to integrate the model's conceptual design and
functionality into the organisation's business setting; and, secondly, examining the processes implemented
to execute the model's overall design.
3. Data Validation and Quality Assessment
Data errors or irregularities impair results and might lead to an organisation's failure to identify and respond
to risks. Best practice indicates that institutions should apply a risk-based data validation, which enables the
reviewer to consider risks unique to the organisation and the model.



4. Process Validation
To verify that a model is operating effectively, it is important to prove that the established processes for the
model's ongoing administration, including governance policies and procedures, support the model's
sustainability.
1.10 Interpretation
Interpretation- Data Patterns and Trends
 Identifying Patterns: Analyzing data to find recurring themes, such as seasonality or cyclical trends in
sales, customer behavior, or market movements.
 Importance:
– Helps in making informed predictions.
– Aids in strategy formulation to leverage opportunities or avoid risks.
 Examples:
– Recognizing peak sales periods.
– Identifying cyclical dips in customer engagement.
Interpretation-Statistical Significance and Relationships
 Statistical Significance:
– Assesses the reliability of findings through p-values, confidence intervals, and effect sizes.
– Validates the strength and robustness of relationships or differences observed in the data.
– Supports evidence-based decision-making.
 Correlations and Relationships:
– Identifies positive or negative correlations between variables.
– Helps uncover cause-and-effect relationships that influence business outcomes.
– Guides resource prioritization, budgeting, and strategic focus on key success drivers.
Interpretation -Predictive Insights and Contextual Understanding
 Predictive Insights:
– Utilizes predictive models to forecast future outcomes.
– Identifies key contributing factors to predictions.
– Supports performance optimization, inventory management, risk mitigation, and customer need
anticipation.
 Contextual Understanding:
– Combines data-driven insights with industry expertise for effective interpretation.
– Ensures insights are applied meaningfully to support business objectives.
– Recognizes the implications, limitations, or biases of the analysis within the broader business
context.
Quantitative data Interpretation
Quantitative data, often known as numerical data, is analyzed using the quantitative data
interpretation approach. Because this data type contains numbers, it is examined using numbers rather than
words. Quantitative analysis is a collection of procedures for analyzing numerical data. It frequently requires
the application of statistical modeling techniques such as standard deviation, mean, and median.
Median: The median is the middle value in a list of numbers that have been sorted ascending or descending,
and it might be more descriptive of the data set than the average.
Mean: The basic mathematical average of two or more values is called a mean. Two ways to determine
the mean for a given collection of numbers are the arithmetic mean method, which uses the sum of the
values in the series, and the geometric mean method, which takes the nth root of the product of the
values.
CCW331 Business Analytics Page 22

Standard deviation: The standard deviation is the positive square root of the variance and is one of the
most fundamental tools of statistical analysis. A low standard deviation indicates that the values are close
to the mean, whereas a large standard deviation indicates that the values are spread far from the mean.
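As a quick illustration, the three measures above can be computed with Python's built-in statistics module; the sales figures below are made up for the example:

```python
import statistics

# Hypothetical monthly sales figures (illustrative data only)
sales = [120, 135, 150, 110, 160, 145, 130]

mean = statistics.mean(sales)      # arithmetic average of the values
median = statistics.median(sales)  # middle value of the sorted list
stdev = statistics.stdev(sales)    # sample standard deviation

print(f"mean={mean:.2f}, median={median}, stdev={stdev:.2f}")
```

Here the median (135) sits close to the mean (about 135.71); with skewed data or outliers the gap would be much larger, which is why the median can be the more descriptive summary.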
There are three common uses for quantitative analysis.
 First, it is used to compare and contrast groups, for instance, the popularity of certain car brands
in different colors.
 Second, it is used to evaluate relationships between variables.
 Third, it is used to test scientifically sound hypotheses, for example, a hypothesis concerning the
effect of a certain vaccination.

Regression Analysis
Regression analysis is a collection of statistical procedures for estimating the relationship between a
dependent variable and one or more independent variables. It can be used to determine the strength of
the relationship between variables and to predict how they will interact in the future.
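A minimal sketch of this idea, fitting a straight line by ordinary least squares to made-up advertising-spend and sales data (the numbers are illustrative, not from the notes):

```python
# Simple linear regression (ordinary least squares) on hypothetical data
x = [10, 20, 30, 40, 50]   # advertising spend
y = [25, 45, 65, 85, 105]  # observed sales

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# slope = covariance(x, y) / variance(x)
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
        / sum((xi - mean_x) ** 2 for xi in x)
intercept = mean_y - slope * mean_x

print(f"sales ~ {intercept:.1f} + {slope:.1f} * spend")
```

For this toy data the fit is exact (sales = 5 + 2 × spend); with real data the fitted line would summarize a noisy relationship, and the slope's magnitude indicates its strength.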

Cohort Analysis
Cohort analysis is a technique for determining how engaged users are over time. It helps distinguish
genuine improvements in engagement from apparent improvements that are really driven by growth,
because it separates growth metrics from engagement metrics. In cohort analysis, groups of people
(cohorts) are tracked to observe how their behavior develops over time.
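The core bookkeeping of a cohort analysis can be sketched in a few lines: group users by the period in which they signed up, then count how many of each cohort are still active in later periods. All user IDs and dates below are invented for the example:

```python
from collections import defaultdict

# Hypothetical activity records: (user_id, signup_month, active_month)
events = [
    ("u1", "2024-01", "2024-01"), ("u1", "2024-01", "2024-02"),
    ("u2", "2024-01", "2024-01"),
    ("u3", "2024-02", "2024-02"), ("u3", "2024-02", "2024-03"),
]

# Count distinct active users per (signup cohort, activity month)
cohorts = defaultdict(set)
for user, cohort, month in events:
    cohorts[(cohort, month)].add(user)

for (cohort, month), users in sorted(cohorts.items()):
    print(f"cohort {cohort}: {len(users)} active in {month}")
```

Reading the January cohort row by row shows retention directly: two users were active in January but only one was still active in February, a drop that growth from later cohorts would otherwise hide.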

Predictive Analysis
The predictive analysis approach seeks to forecast future trends by examining historical and present data.
Predictive analytics techniques, powered by machine learning and deep learning, allow firms to notice
patterns or potential challenges ahead of time and prepare informed initiatives. Businesses use predictive
analytics to address issues and identify new opportunities.
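Full predictive models rely on machine learning, but the underlying idea, projecting forward from historical data, can be sketched with a simple moving-average forecast (the demand series and window size are illustrative assumptions):

```python
def moving_average_forecast(series, k=3):
    """Forecast the next value as the mean of the last k observations."""
    window = series[-k:]
    return sum(window) / len(window)

# Hypothetical weekly demand history
demand = [100, 104, 98, 110, 107, 111]
forecast = moving_average_forecast(demand)
print(round(forecast, 2))  # average of the last 3 points: (110 + 107 + 111) / 3
```

Real predictive analytics replaces this naive average with trained models, but the workflow is the same: learn from the historical series, then act on the projection (for example, in inventory planning).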

Prescriptive Analysis
The prescriptive analysis approach employs tools such as graph analysis. Prescriptive analytics is a type of
data analytics in which technology is used to help organisations make better decisions by analysing raw
data. In particular, prescriptive analytics takes into account information about possible situations or
scenarios, available resources, past performance, and present performance to recommend a course of
action or strategy. It can inform decisions across a wide range of time frames, from the immediate to the
long term.

Conjoint Analysis
Conjoint analysis is a market research method for determining how much customers value the individual
attributes of a product or service. This widely used method combines real-life scenarios and statistical
tools with market decision models.

Cluster Analysis
Cluster analysis is a valuable data-mining technique for any organization that wants to identify distinct
groupings of customers, sales transactions, or other kinds of behaviors and items.
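As a hedged sketch of the idea, the following minimal one-dimensional k-means groups hypothetical customer-spend values into two segments; a real cluster analysis would use a library implementation and multi-dimensional data:

```python
def kmeans_1d(values, centers, iters=10):
    """Tiny 1-D k-means: alternate assignment and center-update steps."""
    groups = [[] for _ in centers]
    for _ in range(iters):
        # Assignment step: attach each value to its nearest center
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Update step: move each center to the mean of its group
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical monthly spend values: a low-spend and a high-spend segment
spend = [12, 15, 14, 80, 85, 78]
centers, groups = kmeans_1d(spend, centers=[0.0, 100.0])
print(centers)  # low-spend center near 13.7, high-spend center near 81
```

The two recovered centers summarize the segments, which is exactly the kind of customer grouping the paragraph above describes.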

1.11 Deployment and Iteration

Definition: Deployment refers to the process of implementing developed analytical models, solutions, or
insights into operational systems or business processes.
Objective: Ensures that insights and recommendations from analytics are used effectively to drive decision-
making and improve business outcomes.
Key Steps in Deployment:
1. Integration:
Connect analytics outputs to operational systems, software applications, or decision support tools.
May involve developing APIs or linking to existing databases for seamless integration.
2. User Interface:
Design user-friendly interfaces such as dashboards, visualizations, or reports. Aim to present insights
clearly and make them easily actionable for stakeholders.
3. Training and Adoption: Provide training and support to users who will be interacting with the analytics
outputs.
• Ensure stakeholders understand how to interpret and leverage insights effectively.
• Promote user adoption and buy-in to maximize the impact of analytics.
4. Monitoring and Performance Evaluation:
– Continuously monitor the performance and impact of deployed analytics solutions.
– Track key performance indicators (KPIs) and assess how effectively insights drive desired outcomes.
– Make necessary adjustments or refinements to improve the analytics deployment over time.
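The monitoring step above can be sketched as a simple check that compares a deployed model's predictions with actual outcomes and flags the model for refinement when its error drifts past a threshold; the function name, numbers, and threshold are all illustrative assumptions, not a standard API:

```python
def needs_retraining(predictions, actuals, mae_threshold=5.0):
    """Flag a deployed model when its mean absolute error exceeds a threshold."""
    errors = [abs(p - a) for p, a in zip(predictions, actuals)]
    mae = sum(errors) / len(errors)
    return mae > mae_threshold, mae

# Hypothetical forecasts vs. observed values from the live system
flag, mae = needs_retraining([100, 105, 98], [102, 99, 97])
print(flag, mae)  # MAE = (2 + 6 + 1) / 3 = 3.0, below the threshold
```

In practice such a check would run on a schedule against live KPIs, feeding the adjustment-and-refinement loop described above.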

Iteration in Business Analytics

Definition: Iteration involves the ongoing improvement and refinement of deployed analytics models or
solutions. It adapts analytics outputs to changing business conditions, data availability, and evolving
requirements.
Objective: To ensure that analytics solutions remain relevant, accurate, and impactful through continuous
feedback, learning, and enhancement.
• Key Aspects of the Iteration Process:
Feedback Collection:
– Gather feedback from users, stakeholders, or customers who interact with the deployed analytics
solutions.
– Feedback helps identify strengths, weaknesses, and areas for improvement in the analytics outputs.
Data Updates:
– Update the data used in analytics models as new data becomes available.
– Incorporate new data points or time periods to keep models up-to-date and reflective of the current
business environment.

• Further Aspects of the Iteration Process:
Model Refinement:
– Continuously refine and improve analytics models based on feedback and insights gained.
– Adjust parameters, add variables, explore alternative algorithms, or integrate new analytical
techniques to enhance model accuracy, predictive power, or performance.
Continuous Learning:
– Foster a culture of continuous learning and improvement within the organization.
– Encourage experimentation, exploration of new techniques, and staying updated with advancements
in analytics.
– This proactive approach helps identify emerging opportunities and challenges, enabling ongoing
enhancement of analytics solutions.
Pros:
 Increased efficiency. Because the iterative process embraces trial and error, it can often help you
achieve your desired result faster than a non-iterative process.
 Increased collaboration. Instead of working from predetermined plans and specs (which also take
a lot of time to create), your team is actively working together.
 Increased adaptability. As you learn new things during the implementation and testing
phases, you can tweak your iteration to best hit your goals, even if that means doing
something you didn't expect to be doing at the start of the iterative process.
 More cost-effective. If you need to change the scope of the project, you'll only have invested the
minimum time and effort into the process.
 Ability to work in parallel. Unlike other, non-iterative methodologies like the waterfall method,
iterations aren't necessarily dependent on the work that comes before them. Team members can
work on several elements of the project in parallel, which can shorten your overall timeline.
 Reduced project-level risk. In the iterative process, risks are identified and addressed during each
iteration. Instead of solving for large risks at the beginning and end of the project, you're
consistently working to resolve low-level risks.
 More reliable user feedback. When you have an iteration that users can interact with or see, they're
able to give you incremental feedback about what works or doesn't work for them.
Cons:
 Increased risk of scope creep. Because of the trial-and-error nature of the iterative process, your
project could develop in ways you didn't expect and exceed your original project scope.
 Inflexible planning and requirements. The first step of the iterative process is to define your project
requirements. Changing these requirements during the iterative process can break the flow of your
work and cause you to create iterations that don't serve your project's purpose.
 Vague timelines. Because team members will create, test, and revise iterations until they reach a
satisfying solution, the iterative timeline isn't clearly defined. Additionally, testing for different
increments can vary in length, which also impacts the overall iterative process timeline.