BigData&Analytics Module2
James R. Evans, “Business Analytics: Methods, Models and Decisions” 3rd edition; 2019, Pearson
Business analytics is the use of data, information technology, and statistical analysis
to help managers gain improved insight about their business operations and make better,
fact-based decisions.
Challenges
◦ lack of understanding of how to use analytics
◦ insufficient analytical skills
◦ difficulty in getting good data and sharing information
◦ not understanding the benefits versus perceived costs
Mathematical Techniques:
◦ Data and Statistical methods
◦ Machine Learning (supervised versus unsupervised)
Excel: spreadsheets
Tableau Software: simple drag-and-drop tools for visualizing data from spreadsheets and other databases
SAS / SPSS / RapidMiner: predictive modeling, data mining, machine learning and visualization using visual workflows (not free)
R Studio: https://www.youtube.com/watch?v=_V8eKsto3Ug
SAS: https://www.youtube.com/watch?v=PJOqwQJT_NA
© 2020 Eslsca. All Rights Reserved 8
Big Data & Business Analytics
Business Analytics & Descriptive Statistics
Module 02
Module 2 5th Data Types
Internal
o Annual reports
o Accounting audits
o Financial profitability analysis
o Operations management performance
o Human resource measurements
External
o Economic trends
o Marketing research
New developments: Web behavior – Social Media – Mobile – IoT
o page views
o visitor’s country
o visitor’s demographics
o time of view & duration of view
o products they searched for and viewed
o products purchased
o what reviews they read
o and others …
The effective use of big data has high potential to transform economies in the new era
Categorical data –
sorted into categories by conceptual criteria, with no natural order
Ordinal data –
can be ordered or ranked, but the differences between ranks are not meaningful
Interval data –
ordinal with constant differences between observations, but an arbitrary zero point
Ratio data –
interval data with a natural zero in its scale, so ratios are meaningful
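The four data types differ in which operations give meaningful results. The sketch below illustrates this with made-up values; the variable names, the rating order and the sample data are assumptions for illustration only, not part of the course material.

```python
from statistics import mode

# Hypothetical sample values, one list per data type.
regions = ["North", "South", "South", "East"]       # categorical
ratings = ["poor", "good", "excellent", "good"]     # ordinal
temps_c = [10.0, 20.0, 30.0]                        # interval (arbitrary zero)
sales = [100.0, 200.0, 400.0]                       # ratio (natural zero)

# Categorical: only counting is meaningful, e.g. the most frequent category.
most_common_region = mode(regions)

# Ordinal: ranking is meaningful once an order is declared (assumed here).
rating_order = ["poor", "good", "excellent"]
ranked = sorted(ratings, key=rating_order.index)
highest_rating = ranked[-1]

# Interval: differences are meaningful...
temp_diff = temps_c[1] - temps_c[0]


def to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


# ...but ratios are not: the "ratio" changes with the unit of measurement.
ratio_c = temps_c[1] / temps_c[0]
ratio_f = to_fahrenheit(temps_c[1]) / to_fahrenheit(temps_c[0])

# Ratio: a natural zero makes ratios meaningful ("twice the sales").
sales_ratio = sales[1] / sales[0]

print(most_common_region, highest_rating, temp_diff, ratio_c, ratio_f, sales_ratio)
```

Temperature in Celsius is used as the interval example because its zero point is arbitrary: 20 degrees is not "twice as hot" as 10 degrees, which is why the ratio changes after converting to Fahrenheit.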
Mean
o The total of all the values divided by the size of the data set
o It is the most commonly used measure of central tendency
o The mean of a sample is denoted by ‘x-bar’
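As a minimal sketch of the definition above, the mean can be computed by hand or with Python's standard `statistics` module. The sample values are hypothetical.

```python
from statistics import mean

values = [4, 8, 15, 16, 23, 42]  # hypothetical sample

# Total of all the values divided by the size of the data set.
x_bar_manual = sum(values) / len(values)

# The standard library gives the same result.
x_bar = mean(values)

print(x_bar_manual, x_bar)
```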
Mode
o The value that occurs the most often in a data set
o It is rarely used as a central tendency measure
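A short sketch of finding the mode, assuming Python 3.8 or later for `statistics.multimode`. The sample values are hypothetical; the second list shows that a data set can have more than one mode, which is one reason the mode is a weak summary on its own.

```python
from collections import Counter
from statistics import mode, multimode

values = [2, 3, 3, 5, 7, 3, 5]  # hypothetical sample

# Count how often each value occurs, then pick the most frequent one.
counts = Counter(values)
most_frequent = mode(values)

# A data set may have several modes; multimode returns all of them.
tied = [1, 1, 2, 2, 3]
all_modes = multimode(tied)

print(counts[most_frequent], most_frequent, all_modes)
```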
Range
o The difference between the highest and the lowest values
o The simplest measure of variability
o It can be misleading when the data has outliers: a single outlier can increase
the range dramatically
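The effect of a single outlier on the range can be seen in a small sketch. The data values and the helper name `value_range` are hypothetical.

```python
def value_range(data):
    """Return the difference between the highest and the lowest values."""
    return max(data) - min(data)


data = [12, 15, 14, 13, 16]     # hypothetical sample
with_outlier = data + [95]      # same sample plus one extreme value

range_clean = value_range(data)
range_outlier = value_range(with_outlier)

print(range_clean, range_outlier)
```

Adding one extreme value moves the range from 4 to 83, although the other five observations are unchanged.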
Standard Deviation
o Roughly, the typical distance of the data points from their own mean (the square root of the average squared deviation)
o A low standard deviation indicates that the data points are clustered around
the mean
o A large standard deviation indicates that they are widely scattered around the
mean
o It is a more informative measure of variability than the range because it uses every data point
o The standard deviation of a sample is denoted by ‘s’; that of a population by ‘σ’ (sigma)
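A minimal sketch of the sample standard deviation, assuming the usual n - 1 divisor for a sample (the convention used by `statistics.stdev`; the slides do not state the divisor). The two data sets and the helper name `sample_std` are hypothetical.

```python
import math
from statistics import stdev


def sample_std(data):
    """Sample standard deviation: square root of the sum of squared
    deviations from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))


clustered = [9, 10, 11]   # points close to the mean of 10
scattered = [0, 10, 20]   # same mean, points far from it

s_clustered = sample_std(clustered)
s_scattered = sample_std(scattered)

print(s_clustered, s_scattered)
print(stdev(clustered), stdev(scattered))  # standard library equivalent
```

Both data sets have the same mean, but the scattered one has a standard deviation ten times larger, which matches the clustered versus scattered description above.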
Outlier
o A data point that is significantly greater or smaller than other
data points in a data set.
o Identifying outliers is a useful step when analyzing data
o Outliers can distort descriptive statistics such as the mean, range and standard deviation
o You need to decide whether to exclude them before carrying
out your analysis
o An outlier should be excluded if it is due to measurement or
human error
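One possible way to flag outliers is sketched below. It uses the 1.5 × IQR rule, which is a common convention chosen here as an assumption; the slides do not prescribe a detection method. The data values and the helper name `iqr_outliers` are hypothetical.

```python
from statistics import mean, quantiles


def iqr_outliers(data, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


data = [12, 13, 14, 15, 16, 14, 13, 15, 95]  # hypothetical sample

outliers = iqr_outliers(data)
cleaned = [x for x in data if x not in outliers]

# Compare a descriptive statistic with and without the flagged points.
mean_all = mean(data)
mean_cleaned = mean(cleaned)

print(outliers, mean_all, mean_cleaned)
```

The rule only flags candidates. Whether to exclude a flagged point is still the analyst's decision, for example when it can be traced to a measurement or human error.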
Source: https://corporatefinanceinstitute.com/resources/knowledge/other/uniform-distribution/
Module Completed