This document provides an overview of business analytics including definitions, applications, importance, evolution, scope and key concepts such as descriptive, predictive and prescriptive analytics. It also discusses data, databases, models and the problem solving process in analytics.
Original Description:
Fundamentals in Business Analytics
Original Title
Fundamentals in Business Analytics reviewer prelims
Module 1 Fundamentals in Business Analytics

What is Business Analytics?
- the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models.
- the process by which businesses use statistical methods and technologies to analyze historical data in order to gain new insight and improve strategic decision-making.
- involves using data, technology, and statistical methods to analyze business information. It helps managers make informed decisions, predict future outcomes, and improve processes over time.

Business Analytics Applications
• Management of customer relationships
• Financial and marketing activities
• Supply chain management
• Human resource planning
• Pricing decisions
• Sport team game strategies

Importance of Business Analytics
• Profitability of businesses
• Revenue of businesses
• Shareholder return

Evolution of Business Analytics
1. Operations research
2. Management science
3. Business Intelligence
4. Decision support systems
5. Personal Computer Software

Scope of Business Analytics
1. Descriptive Analytics
2. Predictive Analytics
3. Prescriptive Analytics

Descriptive Analytics
- the data that is used to benchmark or to profile.
- Descriptive analytics serves as a foundational step in understanding past business performance through data analysis. It's akin to taking a snapshot of historical data to create benchmarks or profiles, enabling businesses to grasp trends, patterns, and key performance indicators.
- answers the question “what happened?”

Descriptive Analytics Process:
1. Data Collection
2. Cleaning and Preparation
3. Segmentation
4. Summary and Key Performance Indicators (KPIs)
5. Historical trend analysis
6. Data reporting and visualization

Predictive Analytics
- is used to determine relationships between two different types of data and to make predictions about future data.
- forecasts potential future outcomes.
- answers the question “what is likely to happen?”

Prescriptive Analytics
- used to create recommendations through simulation and optimization models.
- answers the question “how can we make it happen?”

Example: Retail Markdown Decisions
Most department stores clear seasonal inventory by reducing prices. The question is: when to reduce the price, and by how much, to maximize revenue?
Descriptive analytics: examine historical data for similar products (prices, units sold, advertising, …)
Predictive analytics: predict sales based on price
Prescriptive analytics: find the best sets of pricing and advertising to maximize sales revenue

Data for Business Analytics

What is Data?
For every database system, the heart of the system is what you call the data (Recario, 2018). Data are facts or figures which we can store in a database. Examples are your ID number, the name of your teacher, and the number of students in your class now.
Data – numerical or textual facts and figures that are collected through some type of measurement process.
Information – the result of analyzing data; that is, extracting meaning from data to support evaluation and decision making.

Metrics – are used to quantify performance
Measures – are numerical values of metrics
Discrete – metrics involve counting (on time or not on time, number or proportion of on-time deliveries)
Continuous – metrics are measured on a continuum (delivery time, package weight, purchase price)

Examples of using Data in Business
Internal:
1. Annual reports
2. Accounting audits
3. Financial profitability analysis
4. Operations management performance
5. Human resource measurements
External:
1. Economic trends
2. Marketing research
New developments:
1. Web behavior
2. Social media
3. Mobile - IoT

Big Data
- refers to having a very large amount of business information from many different sources. Some of it arrives very quickly, and it comes in many different types. Part of it may also be messy or hard to predict. IBM (International Business Machines Corporation) labels these features as volume (lots of data), variety (different types of data), velocity (data comes in fast), and veracity (dealing with uncertain or messy data).
“The effective use of big data has the potential to transform economies, delivering a new wave of productivity growth and consumer surplus. Using big data will become a key basis of competition for existing companies, and will create new competitors who are able to attract employees that have the critical skills for a big data world.” – McKinsey Global Institute, 2011

Database
- a collection of data that is organized and stored in related tables containing records on people, places, or things.
- commonly used to store and manage large volumes of data in various applications, such as customer information, inventory management, and financial records.

Data set
- a collection of data (often a single spreadsheet or data mining table) that is typically organized for analysis or research purposes.
- can be small or large and can contain different types of data, such as text, numbers, images, or audio.
- often used in data analysis, machine learning, and statistical research to derive insights, build models, or train algorithms.

Types of Data
Discrete – derived from counting something
Continuous – based on a continuous scale of measurement

4 Types of data based on measurement scale
1. Categorical (nominal) Data – sorted into categories according to specified characteristics (customer's location, employee classification, etc.)
2. Ordinal Data – can be ordered or ranked according to some relationship to one another (college football rankings, survey responses, etc.)
3. Interval Data – ordinal but have constant differences between observations and have arbitrary zero points (temperature readings)
4. Ratio Data – continuous and have a natural zero (monthly sales, delivery times)

Data Reliability and Validity
Reliability – data are accurate and consistent.
- refers to the consistency of a measure (whether the results can be reproduced under the same conditions)
Validity – data measure what they are supposed to measure.
- refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure)

Models in Business Analytics
Model – an abstraction or representation of a real system, idea, or object
- often a simplification of the real thing.
- captures the most important features.
- can be a written or verbal description, a visual representation, a mathematical formula, or a spreadsheet.

3 forms of a Model

Verbal Model/Description
- a description of a problem or situation using words rather than mathematical symbols or visual representations. It involves explaining relationships, patterns, or concepts in a narrative form. Verbal models are useful for conveying complex ideas in a simple and understandable way, facilitating communication between stakeholders who may not have a background in mathematics or analytics.

Visual Model
- represents data, relationships, or processes using graphical or diagrammatic representations such as charts, graphs, diagrams, or flowcharts. Visual models help in interpreting complex information more easily and identifying patterns or trends visually. They are particularly effective for presenting data analysis results, illustrating concepts, or conveying insights to stakeholders in a clear and intuitive manner.

Mathematical Model
- a formal representation of a real-world problem using mathematical equations, formulas, or symbols. These models quantify relationships between variables and can be used to make predictions, optimize decisions, or simulate scenarios. Mathematical models are particularly powerful for analyzing quantitative data and solving complex problems in areas such as finance, operations research, and predictive analytics.

Uncertainty
- is imperfect knowledge (of what will happen in the future)

Risk
- is the potential of (gaining or) losing something of value. It is the consequence of actions taken under uncertainty.

Prescriptive Decision Models
- help decision makers identify the best solution.
Optimization – finding values of the decision variables that minimize (or maximize) something such as cost (or profit).
Objective Function – the equation for the quantity of interest that is to be minimized (or maximized).
Constraints – limitations or restrictions.
Optimal solution – values of the decision variables at the minimum (or maximum) point.

Problem Solving with Analytics
1. Recognize the Problem
- the first step in solving a problem is recognizing that there is one.
- a problem exists when there is a gap between what is happening and what we think should be happening.
2. Define the Problem
Complexity increases when the following occur:
• large number of courses of action
• the problem belongs to a group and not an individual
• competing objectives
• external groups are affected
• time limitations exist
• problem owner and problem solver are not the same person
3. Structure the Problem
• Stating goals and objectives
• Characterizing the possible decisions
• Identifying any constraints or restrictions
4. Analyze the Problem
Analysis involves:
• some sort of experimentation
• a solution process, such as evaluating different scenarios
• analyzing risks associated with various decision alternatives
• finding a solution that meets certain goals
• determining an optimal solution
5. Interpret Results and Make a Decision
- Models cannot capture every detail of the real problem. Managers must understand the limitations of models and their underlying assumptions, and often incorporate judgment into making a decision.
6. Implement the Solution
• Translate the results of the model back to the real world.
• Requires providing adequate resources, motivating employees, eliminating resistance to change, modifying organizational policies, and developing trust.

CRISP-DM
- stands for Cross-Industry Standard Process for Data Mining.
- the most common life cycle.
- guides data mining and analytics projects.

Analytics Life Cycle

Business Understanding
- means getting a clear picture of what the business wants to achieve and how data can help. It involves figuring out the problem, setting goals, understanding who needs the information, checking what data is available, and considering any challenges or risks. This step helps ensure that analytics efforts focus on what matters most to the business and can deliver useful insights.
Importance
1. ensures that the project is aligned with the business objectives
2. helps identify the data sources relevant to the problem, saving time and resources later
3. establishes the success criteria for the project
Benefits
1. improved project outcomes
2. reduced risk
3. efficient use of resources
4. improved communication
Key Activities of the Business Understanding Phase:
• Identifying the business problem
• Defining project objectives
• Determining success criteria
• Assessing project feasibility
• Identifying data sources
Best Practices for the Business Understanding Phase
1. Involving stakeholders
2. Conducting a SWOT analysis
3. Defining data mining goals
4. Communicating findings to stakeholders

Data Understanding
- a phase that involves gaining familiarity with the data available for analysis.
Data Understanding activities:
• Data Collection - involves understanding the sources, formats, and access methods
• Data Description - understanding its structure and composition
• Data Exploration - performing initial exploratory data analysis
• Verify Data Quality - assessing quality
• Initial Insights and Hypotheses Generation - starting point for further analysis
• Documentation - ensures that insights are gained and captured
Data Understanding objective:
- to gain a comprehensive understanding of the data available for analysis. This understanding informs subsequent phases of the CRISP-DM methodology, particularly data preparation and modeling.

Data Preparation
- an important step in data analytics. It aims at assessing and improving the quality of data for secondary statistical analysis. With this, the data is better understood and the data analysis is performed more accurately and efficiently.
Tasks for Data Preparation
1. Data Cleaning - deals with missing data, noise, and outliers, and corrects inconsistencies in the data so that it is accurate and correct.
2. Data Integration - a process of combining data derived from various data sources (such as databases, flat files, etc.) into a consistent data set for both operational and analytical use.
3. Data Transformation - aims to transform the data values into a format, scale, or unit that is more suitable for analysis.
4. Data Reduction - a process of obtaining a reduced representation of the data set that is much smaller in volume yet produces the same (or almost the same) analytical results.

Modeling
4 tasks:
1. Selecting modeling techniques
2. Generating a test design
3. Building model(s)
4. Assessing model(s)
Modeling Technique
- the actual modeling technique that is used.
Modeling Assumptions
- specific assumptions about the data, data quality, or the data format.
Building Model
• Parameter settings – With any modelling tool there are often a large number of parameters that can be adjusted.
• Models – These are the actual models produced by the modelling tool, not a report on the models.
• Model descriptions – Describe the resulting models, report on the interpretation of the models, and document any difficulties encountered with their meanings.

Data Preparation using pandas

Pandas
- a Python package providing fast, flexible, and expressive data structures designed to make working with relational or labeled data both easy and intuitive. It is a fundamental high-level building block for doing practical, real-world data analysis in Python.
Pandas is well suited for:
• Tabular data with heterogeneously-typed columns, as you might find in an SQL table or Excel spreadsheet
• Ordered and unordered (not necessarily fixed-frequency) time series data
• Arbitrary matrix data with row and column labels
Virtually any statistical dataset, labeled or unlabeled, can be converted to a pandas data structure for cleaning, transformation, and analysis.

Series
- a single vector of data (like a NumPy array) with an index that labels each element in the vector.
- If an index is not specified, a default sequence of integers is assigned as the index. A NumPy array comprises the values of the Series, while the index is a pandas Index object.

Data Frame
- a tabular data structure, encapsulating multiple series like columns in a spreadsheet. Data are stored internally as a 2-dimensional object, but the Data Frame allows us to represent and manipulate higher-dimensional data.
- sorted by column name
- has a second index, representing the columns
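
To make the Series and Data Frame descriptions concrete, here is a minimal sketch in Python with pandas. The store labels, prices, and unit counts are made-up illustrative values. Filling a missing price with the column mean is only one possible data cleaning choice, used here as an assumption rather than a method prescribed by these notes.

```python
import numpy as np
import pandas as pd

# A Series built without an explicit index gets the default integer index 0, 1, 2.
units_sold = pd.Series([120, 95, 143])

# The same values with a labeled index (store names are hypothetical).
labeled = pd.Series([120, 95, 143], index=["store_a", "store_b", "store_c"])

# The values are held in a NumPy array; the labels live in a pandas Index object.
values = labeled.to_numpy()
index = labeled.index

# A DataFrame groups several Series as named columns that share one row index.
sales = pd.DataFrame(
    {
        "price": [19.99, 24.50, np.nan],  # one missing price, to be cleaned below
        "units": [120, 95, 143],
    },
    index=["store_a", "store_b", "store_c"],
)

# Data cleaning (illustrative choice): fill the missing price with the mean
# of the known prices. fillna returns a new DataFrame and leaves `sales` as is.
cleaned = sales.fillna({"price": sales["price"].mean()})

# Data transformation: derive a revenue column from the existing columns.
cleaned["revenue"] = cleaned["price"] * cleaned["units"]

print(cleaned)
```

In practice the right way to handle a missing value depends on the data and the business question, so the mean fill above should be read as a placeholder for whatever cleaning rule the project calls for.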