FOUNDATION TO DATA SCIENCE
Business Analytics
BASIC STATISTICS REFRESHER AND HOW TO
EXPLORE DATA-Day 1
Prof. Dr. George Mathew
B.Sc., B.Tech, PGDCA, PGDM, MBA, PhD

Data Science
Data science is the science of analyzing raw data and deriving insights from it. Multiple techniques can be used to derive insights, from simple statistical techniques to more sophisticated machine learning techniques.
Data science is used in many industries to support better business decisions, and in the sciences to test models or theories. This requires a process of inspecting, cleaning, transforming, modelling, analysing and interpreting data.

Demand for Business Analytics
Business analytics is a crucial area of study for students looking to enhance their employment prospects. It is predicted that there will be a future shortage of business managers with adequate training in analytics. It is the responsibility of managers to plan, coordinate, organize, and lead their organizations to better performance.

[Figure: Trend of Business Analytics]

Levels of Managers and Type of Decisions
Strategic Level: Strategic decisions are usually the domain of higher-level executives and have a time horizon of three to five years.
Tactical Level: Tactical decisions concern how the organization should achieve the goals and objectives set by its strategy, and they are usually the responsibility of mid-level management. Tactical decisions usually span a year and thus are revisited annually or even every six months.
Operational Level: Operational decisions affect how the firm is run from day to day; they are the domain of operations managers, who are the closest to the customer.

Decision Making Process
1. Identify and define the problem
2. Determine the criteria that will be used to evaluate alternative solutions
3. Determine the set of alternative solutions
4. Evaluate the alternatives
5. Choose an alternative

Making the Right Business Decisions Based on Data
Challenges of Decision Making:
1. Uncertainty is probably the number one challenge.
2. Enormous number of alternatives. Example: What is the best product line for a company that wants to maximize its market share?

Business Analytics
Business analytics is the scientific process of transforming data into insight for making better decisions. Business analytics is used for data-driven or fact-based decision making, which is often seen as more objective than other alternatives for decision making.

A Categorization of Analytical Methods and Models
1. Descriptive analytics encompasses the set of techniques that describe what has happened in the past.
2. Predictive analytics consists of techniques that use models constructed from past data to predict the future or ascertain the impact of one variable on another.
3. Prescriptive analytics differs from descriptive or predictive analytics in that it indicates a best course of action to take; that is, the output of a prescriptive model is a best decision.

Business Analytics in Practice
Business analytics involves tools as simple as reports and graphs, as well as some that are as sophisticated as optimization, data mining, and simulation. In practice, companies that apply analytics often follow a trajectory similar to that shown in Figure 2. Organizations start with basic analytics in the lower left. As they realize the advantages of these analytic techniques, they often progress to more sophisticated techniques in an effort to reap the derived competitive advantage. Predictive and prescriptive analytics are therefore sometimes referred to as advanced analytics. Not all companies reach that level of usage, but those that embrace analytics as a competitive strategy often do.

DESCRIPTIVE STATISTICS
Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation. A characteristic or a quantity of interest that can take on different values is known as a variable. Practically every problem (and opportunity) that an organization (or individual) faces is concerned with the impact of the possible values of relevant variables on the business outcome.
Thus, we are concerned with how the value of a variable can vary; variation is the difference in a variable measured over the observations (time, customers, items, etc.). The role of descriptive analytics is to collect and analyze data to gain a better understanding of variation and its impact on the business setting.
The values of some variables are under the direct control of the decision maker (these are often called decision variables). The values of other variables may fluctuate with uncertainty due to factors outside the direct control of the decision maker. In general, a quantity whose values are not known with certainty is called a random variable, or uncertain variable. When we collect data, we are gathering past observed values, or realizations, of a variable. By collecting these past realizations of one or more variables, our goal is to learn more about the variation of a particular business situation.

Qualitative, Quantitative and Categorical Data
Data can be categorized in several ways based on how they are collected and the type collected. In many cases, it is not feasible to collect data from the population of all elements of interest. In such instances, we collect data from a subset of the population known as a sample. It is very important to collect sample data that are representative of the population data so that generalizations can be made from them. In most cases, a representative sample can be gathered by random sampling of the population data. Dealing with populations and samples can introduce subtle differences in how we calculate and interpret summary statistics. In almost all practical applications of business analytics, we will be dealing with sample data.
Data are considered quantitative data if numeric and arithmetic operations, such as addition, subtraction, multiplication, and division, can be performed on them.
If arithmetic operations cannot be performed on the data, they are considered categorical data. We can summarize categorical data by counting the number of observations or computing the proportions of observations in each category.

Using Excel, Excel Solver

Modifying Data in Excel
Projects often involve so much data that it is difficult to analyze all of the data at once. Here, we examine methods for summarizing and manipulating data using Excel to make the data more manageable and to develop insights.
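
As a complement to the Excel workflow, the distinction between quantitative and categorical data described above can be sketched in a few lines of Python. This is a minimal illustration only: the order amounts and payment methods are hypothetical values invented for the example, and the variable names are assumptions, not part of the course material. The sketch treats the eight records as a sample, so it also shows one of the subtle sample-versus-population differences mentioned earlier: the sample standard deviation divides by n - 1, while the population version divides by n.

```python
import statistics
from collections import Counter

# Hypothetical sample of eight customer orders (illustrative values only).
order_amounts = [120, 80, 100, 140, 60, 100, 90, 110]   # quantitative variable
payment_methods = ["card", "cash", "card", "transfer",
                   "card", "cash", "card", "transfer"]   # categorical variable

# Quantitative data: arithmetic operations are meaningful.
mean_amount = statistics.mean(order_amounts)
sample_sd = statistics.stdev(order_amounts)        # divides by n - 1 (sample)
population_sd = statistics.pstdev(order_amounts)   # divides by n (population)

# Categorical data: summarize by counts and proportions per category.
counts = Counter(payment_methods)
n = len(payment_methods)
proportions = {category: count / n for category, count in counts.items()}

print(f"mean = {mean_amount}, sample sd = {sample_sd:.2f}, "
      f"population sd = {population_sd:.2f}")
for category, count in counts.most_common():
    print(f"{category}: count = {count}, proportion = {proportions[category]:.2f}")
```

For the same data the sample standard deviation is always slightly larger than the population one, which is why it matters to know whether a data set is a sample or the whole population.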
Sorting and Filtering Data in Excel
Excel contains many useful features for sorting and filtering data so that one can more easily identify patterns. Table 1 contains data on the top 20 selling automobiles in the United States in March 2011. The table shows the model and manufacturer of each automobile as well as the sales for the model in March 2011 and March 2010.
Figure 3 shows the data from Table 1 entered into an Excel spreadsheet, and the percent change in sales for each model from March 2010 to March 2011 has been calculated. This is done by entering the formula =(D2-E2)/E2 in cell F2 and then copying the contents of this cell to cells F3 to F20. (We cannot calculate the percent change in sales for the Ford Fiesta because it was not being sold in March 2010.)
Exercise: 02-03 ASAP Discriptive statistics_Excel Solver.xlsx

Conditional Formatting of Data in Excel
To apply a conditional formatting rule to a cell or range, select the cells and then use one of the commands from Home -> Conditional Formatting.
Conditional formatting in Excel can make it easy to identify data that satisfy certain conditions in a data set. For instance, suppose that we wanted to quickly identify the automobile models in Table 1 for which sales had decreased from March 2010 to March 2011. We can quickly highlight these models:
Step 1. Starting with the original data shown in Figure 6, select cells F1:F21
Step 2. Click on the HOME tab in the Ribbon
Step 3. Click Conditional Formatting in the Styles group
Step 4. Select Highlight Cells Rules, and click Less Than from the dropdown menu
Step 5. Enter 0% in the Format cells that are LESS THAN: box
Step 6. Click OK
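
The percent-change formula and the "Less Than 0%" highlighting rule above can also be sketched outside Excel. The following Python example is an illustration under stated assumptions: the model names and sales figures are hypothetical and are not the Table 1 data, and the helper name `percent_change` is invented for this sketch. It mirrors the spreadsheet logic: (current - previous) / previous, with no result when the previous value is unavailable, as with a model that was not sold in the earlier period.

```python
from typing import Optional


def percent_change(current: float, previous: Optional[float]) -> Optional[float]:
    """Return (current - previous) / previous, or None when it is undefined."""
    if previous is None or previous == 0:
        return None
    return (current - previous) / previous


# Hypothetical rows: (model, sales this year, sales last year).
# These are NOT the Table 1 figures; None marks a model with no prior-year sales.
sales = [
    ("Model A", 1200, 1000),
    ("Model B", 900, 1000),
    ("Model C", 500, 400),
    ("Model D", 300, None),
]

changes = {model: percent_change(cur, prev) for model, cur, prev in sales}

# Filtering: the analogue of highlighting cells that are less than 0%.
decreased = [model for model, change in changes.items()
             if change is not None and change < 0]

# Sorting: order the rows by current sales, largest first.
sorted_by_sales = sorted(sales, key=lambda row: row[1], reverse=True)

for model, cur, prev in sorted_by_sales:
    change = changes[model]
    label = "n/a" if change is None else f"{change:+.1%}"
    flag = "  <-- decreased" if model in decreased else ""
    print(f"{model}: {cur} vs {prev} ({label}){flag}")
```

Returning `None` instead of raising an error keeps the row in the listing while making it clear that no percent change exists for it, much like leaving the corresponding spreadsheet cell empty.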