Fundamentals in Business Analytics Reviewer Prelims


Module 1 Fundamentals in Business Analytics

What is Business Analytics?
- the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models.
- the process by which businesses use statistical methods and technologies to analyze historical data in order to gain new insight and improve strategic decision-making.
- involves using data, technology, and statistical methods to analyze business information. It helps managers make informed decisions, predict future outcomes, and improve processes over time.

Business Analytics Applications
• Management of customer relationships
• Financial and marketing activities
• Supply chain management
• Human resource planning
• Pricing decisions
• Sport team game strategies

Importance of Business Analytics
• Profitability of businesses
• Revenue of businesses
• Shareholder return

Evolution of Business Analytics
1. Operations research
2. Management science
3. Business Intelligence
4. Decision support systems
5. Personal Computer Software

Scope of Business Analytics
1. Descriptive Analytics
2. Predictive Analytics
3. Prescriptive Analytics

Descriptive Analytics
- the data that is used to benchmark or to profile.
- Descriptive analytics serves as a foundational step in understanding past business performance through data analysis. It is akin to taking a snapshot of historical data to create benchmarks or profiles, enabling businesses to grasp trends, patterns, and key performance indicators.
- answers the question “what happened?”
Descriptive Analytics Process:
1. Data Collection
2. Cleaning and Preparation
3. Segmentation
4. Summary and Key Performance Indicators (KPIs)
5. Historical trend analysis
6. Data reporting and visualization

Predictive Analytics
- is used to determine relationships between two different types of data and to make predictions about future data.
- forecasts potential future outcomes.
- answers the question “what is likely to happen?”

Prescriptive Analytics
- used to create recommendations through simulation and optimization models.
- answers the question “how can we make it happen?”

Example
Retail Markdown Decisions
Most department stores clear seasonal inventory by reducing prices. The question is: when should the price be reduced, and by how much, to maximize revenue?
Descriptive analytics: examine historical data for similar products (prices, units sold, advertising,…)
Predictive analytics: predict sales based on price
Prescriptive analytics: find the best sets of pricing and advertising to maximize sales revenue

Data for Business Analytics
What is Data?
For every database system, the heart of the system is what you call the data (Recario, 2018). Data are facts or figures which we can store in a database. Examples are your ID number, the name of your teacher, and the number of students in your class now.
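The retail markdown example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the price and sales history is made up, the predictive step assumes a straight-line relationship between price and units sold, and the prescriptive step simply tries whole-number prices in a hypothetical range. Advertising is left out to keep the sketch short.

```python
# Hypothetical markdown history for similar products: (price, units sold).
history = [(50, 120), (45, 150), (40, 185), (35, 210), (30, 245)]

# Descriptive: summarize what happened.
n = len(history)
mean_price = sum(price for price, _ in history) / n
mean_units = sum(units for _, units in history) / n

# Predictive: fit units = intercept + slope * price by least squares
# (the straight-line form is an assumption of this sketch).
sxy = sum((p - mean_price) * (u - mean_units) for p, u in history)
sxx = sum((p - mean_price) ** 2 for p, _ in history)
slope = sxy / sxx
intercept = mean_units - slope * mean_price


def predicted_units(price):
    """Units the fitted line expects to sell at a given price."""
    return intercept + slope * price


def predicted_revenue(price):
    return price * predicted_units(price)


# Prescriptive: pick the candidate price with the highest predicted revenue.
candidate_prices = range(20, 61)  # hypothetical allowed price range
best_price = max(candidate_prices, key=predicted_revenue)

print(f"average price {mean_price}, average units {mean_units}")
print(f"fitted line: units = {intercept:.1f} + ({slope:.1f}) * price")
print(f"best candidate price: {best_price}")
```

Each block answers one of the three questions: the averages describe what happened, the fitted line predicts what is likely to happen at a new price, and the search recommends a price.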
Metrics – are used to quantify performance
Measures – are numerical values of metrics
Discrete metrics – involve counting (on time or not on time, number or proportion of on-time deliveries)
Continuous metrics – are measured on a continuum (delivery time, package weight, purchase price)

Examples of using Data in Business
Internal:
1. Annual reports
2. Accounting audits
3. Financial profitability analysis
4. Operations management performance
5. Human resource measurements
External:
1. Economic trends
2. Marketing research
New developments:
1. Web behavior
2. Social media
3. Mobile - IoT

Data for Business Analytics
Data – numerical or textual facts and figures that are collected through some type of measurement process.
Information – the result of analyzing data; that is, extracting meaning from data to support evaluation and decision making.

Big Data
- refers to having a very large amount of business information from many different places. Some of it arrives very quickly, and it comes in many different types. A portion of it may also be messy or hard to predict. IBM (International Business Machines Corporation) labels these features as volume (lots of data), variety (different types of data), velocity (data comes in fast), and veracity (dealing with uncertain or messy data).
“The effective use of big data has the potential to transform economies, delivering a new wave of productivity growth and consumer surplus. Using big data will become a key basis of competition for existing companies, and will create new competitors who are able to attract employees that have the critical skills for a big data world.” – McKinsey Global Institute, 2011

Database
– a collection of data that is organized and stored in related tables containing records on people, places, or things.
- commonly used to store and manage large volumes of data in various applications, such as customer information, inventory management, and financial records.

Data set
– a collection of data (often a single spreadsheet or data mining table) that is typically organized for analysis or research purposes.
- can be small or large and can contain different types of data, such as text, numbers, images, or audio.
- often used in data analysis, machine learning, and statistical research to derive insights, build models, or train algorithms.

Types of Data
Discrete – derived from counting something
Continuous – based on a continuous scale of measurement.

4 Types of data based on measurement scale
1. Categorical (nominal) data – sorted into categories according to specified characteristics (customer’s location, employee classification, etc.)
2. Ordinal data – can be ordered or ranked according to some relationship to one another (college football rankings, survey responses, etc.)
3. Interval data – ordinal, but have constant differences between observations and arbitrary zero points (temperature readings)
4. Ratio data – continuous and have a natural zero (monthly sales, delivery times)

Data Reliability and Validity
Reliability – data are accurate and consistent
- refers to the consistency of a measure (whether the results can be reproduced under the same conditions)
Validity – data measure what they are supposed to measure.
- refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure)

Models in Business Analytics
Model – an abstraction or representation of a real system, idea, or object
- often a simplification of the real thing.
- captures the most important features.
- can be a written or verbal description, a visual representation, a mathematical formula, or a spreadsheet.
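Looking back at the four measurement scales listed under Types of Data, the short sketch below shows which operations are meaningful on each scale. All values are hypothetical and chosen only for illustration; the point is the operations, not the numbers.

```python
from collections import Counter

# All values below are hypothetical, for illustration only.

# Categorical (nominal): labels can be counted, but not ordered or averaged.
locations = ["North", "South", "North", "East"]
location_counts = Counter(locations)
most_common_location = location_counts.most_common(1)[0][0]

# Ordinal: categories have an order, so a median response makes sense,
# but the distance between two ranks is not defined.
scale = ["Poor", "Fair", "Good", "Excellent"]
responses = ["Good", "Poor", "Excellent", "Good", "Fair"]
ranks = sorted(scale.index(r) for r in responses)
median_response = scale[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, ratios are not, because zero is
# arbitrary. The same two readings give different ratios in Celsius and
# Fahrenheit.
temps_c = (10.0, 20.0)
temps_f = tuple(c * 9 / 5 + 32 for c in temps_c)
ratio_c = temps_c[1] / temps_c[0]
ratio_f = temps_f[1] / temps_f[0]

# Ratio: a natural zero means ratios survive a change of unit.
sales = (200.0, 400.0)
sales_in_thousands = tuple(s / 1000 for s in sales)
ratio_sales = sales[1] / sales[0]
ratio_sales_thousands = sales_in_thousands[1] / sales_in_thousands[0]

print(most_common_location, median_response)
print(ratio_c, round(ratio_f, 2))
print(ratio_sales, ratio_sales_thousands)
```

The temperature ratio changes with the unit, which is why "twice as hot" is not a valid statement on an interval scale, while "twice the sales" holds in any unit.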
3 forms of a Model
Verbal Model/Description
- a description of a problem or situation using words rather than mathematical symbols or visual representations. It involves explaining relationships, patterns, or concepts in a narrative form. Verbal models are useful for conveying complex ideas in a simple and understandable way, facilitating communication between stakeholders who may not have a background in mathematics or analytics.
Visual Model
- represents data, relationships, or processes using graphical or diagrammatic representations such as charts, graphs, diagrams, or flowcharts. Visual models help in interpreting complex information more easily and identifying patterns or trends visually. They are particularly effective for presenting data analysis results, illustrating concepts, or conveying insights to stakeholders in a clear and intuitive manner.
Mathematical Model
- a formal representation of a real-world problem using mathematical equations, formulas, or symbols. These models quantify relationships between variables and can be used to make predictions, optimize decisions, or simulate scenarios. Mathematical models are particularly powerful for analyzing quantitative data and solving complex problems in areas such as finance, operations research, and predictive analytics.

Uncertainty
- is imperfect knowledge (of what will happen in the future)
Risk
- is the potential of (gaining or) losing something of value. It is the consequence of actions taken under uncertainty.

Prescriptive Decision Models
- help decision makers identify the best solution.
Optimization – finding values of the decision variables that minimize (or maximize) something such as cost (or profit).
Objective Function – the equation for the quantity of interest that is to be minimized (or maximized).
Constraints – limitations or restrictions.
Optimal solution – values of the decision variables at the minimum (or maximum) point.

Problem Solving with Analytics
1. Recognize the Problem
- the first step in solving a problem is recognizing that there is one.
- a problem exists when there is a gap between what is happening and what we think should be happening.
2. Define the Problem
Complexity increases when the following occur:
• large number of courses of action
• the problem belongs to a group and not an individual
• competing objectives
• external groups are affected
• time limitations exist
• problem owner and problem solver are not the same person
3. Structure the Problem
• Stating goals and objectives
• Characterizing the possible decisions
• Identifying any constraints or restrictions
4. Analyze the Problem
Analysis involves:
• some sort of experimentation
• a solution process, such as evaluating different scenarios
• analyzing risks associated with various decision alternatives
• finding a solution that meets certain goals
• determining an optimal solution.
5. Interpret Results and Make a Decision
- Models cannot capture every detail of the real problem. Managers must understand the limitations of models and their underlying assumptions, and often incorporate judgment into making a decision.
6. Implement the Solution
• Translate the results of the model back to the real world.
• Requires providing adequate resources, motivating employees, eliminating resistance to change, modifying organizational policies, and developing trust.

CRISP-DM
- stands for Cross-Industry Standard Process for Data Mining.
- the most common life cycle.
- guides data mining and analytics projects.
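To make the vocabulary of prescriptive decision models (decision variables, objective function, constraints, optimal solution) concrete, here is a minimal sketch. The product-mix problem, its profit figures, and its resource limits are all hypothetical, and the brute-force search over whole-number plans is just the simplest method that works at this size, not a recommended solver.

```python
# Hypothetical product-mix problem (all numbers are made up):
#   decision variables: x = units of product A, y = units of product B
#   objective function: profit = 30x + 20y (to be maximized)
#   constraints:        2x + y <= 100 (labor hours)
#                       x + y <= 80   (material)
#                       x >= 0, y >= 0


def profit(x, y):
    """Objective function: the quantity we want to maximize."""
    return 30 * x + 20 * y


def feasible(x, y):
    """Constraints: True only if the plan respects every limit."""
    return 2 * x + y <= 100 and x + y <= 80


# Enumerate every whole-number plan and keep the feasible ones.
feasible_plans = [
    (x, y)
    for x in range(0, 101)
    for y in range(0, 101)
    if feasible(x, y)
]

# Optimal solution: the feasible plan with the highest objective value.
best_plan = max(feasible_plans, key=lambda plan: profit(*plan))

print("optimal plan (x, y):", best_plan)
print("maximum profit:", profit(*best_plan))
```

For larger problems an optimization library would replace the enumeration, but the four ingredients stay the same.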
Analytics Life Cycle

Business Understanding
- means getting a clear picture of what the business wants to achieve and how data can help. It involves figuring out the problem, setting goals, understanding who needs the information, checking what data is available, and considering any challenges or risks. This step helps ensure that analytics efforts focus on what matters most to the business and can deliver useful insights.
Importance
1. ensures that the project is aligned with the business objectives
2. helps identify the data sources relevant to the problem, saving time and resources later
3. establishes the success criteria for the project
Benefits
1. improved project outcomes
2. reduced risk
3. efficient use of resources
4. improved communication
Key Activities of the Business Understanding Phase:
• Identifying the business problem
• Defining project objectives
• Determining success criteria
• Assessing project feasibility
• Identifying data sources
Best Practices for the Business Understanding Phase
1. Involving stakeholders
2. Conducting a SWOT analysis
3. Defining data mining goals
4. Communicating findings to stakeholders

Data Understanding
- a phase that involves gaining familiarity with the data available for analysis
Data Understanding activities:
• Data Collection - involves understanding the sources, formats, and access methods
• Data Description - understanding its structure and composition
• Data Exploration - performing initial exploratory data analysis
• Verify Data Quality - assessing quality
• Initial Insights and Hypotheses Generation - starting point for further analysis
• Documentation - ensures that insights are gained and captured
Data Understanding objective:
- to gain a comprehensive understanding of the data available for analysis. This understanding informs subsequent phases of the CRISP-DM methodology, particularly data preparation and modeling.

Data Preparation
- an important step in data analytics. It aims at assessing and improving the quality of data for secondary statistical analysis. With this, the data is better understood and the data analysis is performed more accurately and efficiently.
Tasks for Data Preparation
1. Data Cleaning - deals with missing data, noise, and outliers, and corrects inconsistencies in the data, making sure that it is accurate and correct.
2. Data Integration - a process of combining data derived from various data sources (such as databases, flat files, etc.) into a consistent data set for both operational and analytical use.
3. Data Transformation - aims to transform the data values into a format, scale, or unit that is more suitable for analysis.
4. Data Reduction - a process of obtaining a reduced representation of the data set that is much smaller in volume yet produces the same (or almost the same) analytical results.

Modeling
4 tasks:
1. Selecting modeling techniques
2. Generating test design
3. Building model(s)
4. Assessing model(s)
Modeling Technique
- the actual modeling technique that is used.
Modeling Assumptions
- specific assumptions about the data, data quality, or the data format
Building Model
• Parameter settings – With any modelling tool there are often a large number of parameters that can be adjusted.
• Models – These are the actual models produced by the modelling tool, not a report on the models.
• Model descriptions – Describe the resulting models, report on the interpretation of the models, and document any difficulties encountered with their meanings.

Data Preparation using pandas
Pandas
- a Python package providing fast, flexible, and expressive data structures designed to make working with relational or labeled data both easy and intuitive. It is a fundamental high-level building block for doing practical, real-world data analysis in Python.
Pandas is well suited for:
• Tabular data with heterogeneously-typed columns, as you might find in an SQL table or Excel spreadsheet
• Ordered and unordered (not necessarily fixed-frequency) time series data
• Arbitrary matrix data with row and column labels
Virtually any statistical dataset, labeled or unlabeled, can be converted to a pandas data structure for cleaning, transformation, and analysis.
Series
- a single vector of data (like a NumPy array) with an index that labels each element in the vector.
- If an index is not specified, a default sequence of integers is assigned as the index. A NumPy array comprises the values of the Series, while the index is a pandas Index object.
Data Frame
- a tabular data structure, encapsulating multiple Series like columns in a spreadsheet. Data are stored internally as a 2-dimensional object, but the Data Frame allows us to represent and manipulate higher-dimensional data.
- when built from a dict in older pandas versions, columns were sorted by column name; recent versions keep the insertion order
- has a second index, representing the columns
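The description of a Series and its index can be checked with a short sketch. It assumes pandas and NumPy are installed; the month labels and sales figures are hypothetical. It also shows a Data Frame built from two Series that share an index, so each Series becomes one column.

```python
import numpy as np
import pandas as pd

# A Series with no index specified gets a default integer index 0, 1, 2, ...
units = pd.Series([120, 150, 185])
print(list(units.index))

# A Series with explicit labels (hypothetical monthly figures).
months = ["Jan", "Feb", "Mar"]
labeled_units = pd.Series([120, 150, 185], index=months)
prices = pd.Series([50.0, 45.0, 40.0], index=months)

# The values are held in a NumPy array; the labels in a pandas Index.
print(isinstance(labeled_units.to_numpy(), np.ndarray))
print(isinstance(labeled_units.index, pd.Index))

# Elements can be looked up by label.
print(labeled_units["Feb"])

# A Data Frame bundles several Series as columns. It has two indexes:
# one for the rows (frame.index) and one for the columns (frame.columns).
frame = pd.DataFrame({"units": labeled_units, "price": prices})
frame["revenue"] = frame["units"] * frame["price"]
print(frame)
```

Because both Series carry the same labels, pandas aligns them row by row when building the frame and when multiplying the columns.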

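The four data preparation tasks described earlier (cleaning, integration, transformation, reduction) can each be sketched with a pandas call or two. This is a minimal illustration, assuming pandas is installed; the two tables, the outlier cap of 1000, and the choice of median filling and min-max scaling are assumptions made for the example, not a prescribed method.

```python
import pandas as pd

# Hypothetical source tables.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "amount": [100.0, None, 250.0, 80.0, 10000.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "region": ["north", "south", "north", "east"],
})

# 1. Data cleaning: fill the missing amount with the median and cap an
#    obvious outlier (the cap of 1000 is an assumed business rule).
cleaned = orders.copy()
cleaned["amount"] = (
    cleaned["amount"].fillna(cleaned["amount"].median()).clip(upper=1000.0)
)

# 2. Data integration: combine the two sources into one consistent table.
merged = cleaned.merge(customers, on="customer_id", how="left")

# 3. Data transformation: rescale amounts to the range 0 to 1 (min-max).
low, high = merged["amount"].min(), merged["amount"].max()
merged["amount_scaled"] = (merged["amount"] - low) / (high - low)

# 4. Data reduction: keep a smaller representation, one total per region.
by_region = merged.groupby("region")["amount"].sum()

print(merged)
print(by_region)
```

Working on a copy in step 1 leaves the raw `orders` table untouched, which makes it easy to compare the data before and after cleaning.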