BBA 4 - Unit
Unit - IV
Tools Used For Data Analytics
1. Descriptive Analytics: Descriptive analytics answers the question, "What has happened?" It
summarizes historical data to provide insights into past performance and trends. This type of
analytics is foundational, as it lays the groundwork for further analysis.
Example: A retail company uses descriptive analytics to analyze sales data from the
previous year. By generating reports that show monthly sales figures, customer
demographics, and product performance, the company can identify peak sales periods
and popular products. For instance, they might discover that sales of winter clothing
spike in November and December, allowing them to plan inventory and marketing
strategies accordingly.
Software Tools: Common tools for descriptive analytics include Microsoft Excel, Google
Analytics, and Tableau, which help visualize data through charts and dashboards.
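The retail example above can be sketched in a few lines of Python. This is a minimal illustration only: the monthly figures and the helper name peak_months are hypothetical and are not taken from any real company.

```python
# Hypothetical (month, units sold) records for winter clothing.
sales_records = [
    ("Jan", 120), ("Feb", 95), ("Mar", 80), ("Apr", 60),
    ("May", 55), ("Jun", 50), ("Jul", 45), ("Aug", 50),
    ("Sep", 70), ("Oct", 110), ("Nov", 210), ("Dec", 260),
]

def peak_months(records, top_n=2):
    """Return the top_n months with the highest unit sales."""
    ranked = sorted(records, key=lambda rec: rec[1], reverse=True)
    return [month for month, _ in ranked[:top_n]]

# Summarize what has already happened: total, average, and peak periods.
total_units = sum(units for _, units in sales_records)
average_units = total_units / len(sales_records)

print("Total units:", total_units)
print("Average per month:", round(average_units, 1))
print("Peak months:", peak_months(sales_records))
```

With this made-up data the peak months come out as December and November, which matches the kind of seasonal pattern described in the example.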
2. Diagnostic Analytics: Diagnostic analytics goes a step further by answering the question,
"Why did this happen?" It examines data to identify relationships and causes behind trends
observed in descriptive analytics.
Example: Continuing with the retail company, after noticing a spike in winter clothing
sales, they use diagnostic analytics to investigate why this occurred. They analyze
customer feedback, weather data, and marketing campaign performance. They might
find that a targeted advertising campaign during the holiday season significantly
influenced sales, or that a particularly cold winter drove demand for jackets and coats.
Software Tools: Tools like SAS, R, and Python (with libraries such as Pandas) are often
used for diagnostic analytics, as they allow for deeper statistical analysis and data
manipulation.
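One simple diagnostic step is to check how strongly sales move together with possible causes. The sketch below computes the Pearson correlation coefficient by hand; the weekly figures and variable names are hypothetical, and a strong correlation only suggests a relationship, it does not prove the cause.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)

# Hypothetical weekly observations during the winter season.
ad_spend = [10, 12, 15, 20, 25, 30]        # advertising spend (thousands)
avg_temp_c = [12, 10, 9, 6, 3, 1]          # average temperature (Celsius)
coat_sales = [40, 48, 55, 70, 88, 100]     # coats sold

print("Ad spend vs sales:", round(pearson_r(ad_spend, coat_sales), 2))
print("Temperature vs sales:", round(pearson_r(avg_temp_c, coat_sales), 2))
```

A value near +1 means the two series rise together, and a value near -1 means one rises as the other falls (here, colder weeks go with higher coat sales).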
3. Predictive Analytics: Predictive analytics answers the question, "What is likely to happen in
the future?" It uses historical data and statistical algorithms to forecast future outcomes,
helping businesses anticipate trends and make proactive decisions.
Example: A bank employs predictive analytics to assess the likelihood of loan defaults
among its customers. By analyzing past loan performance, customer credit scores, and
economic indicators, the bank can identify which customers are at higher risk of
defaulting. This allows them to adjust lending criteria or offer financial counseling to at-
risk customers.
Software Tools: Predictive analytics often utilizes machine learning platforms like IBM
Watson, Google Cloud AI, and specialized software like RapidMiner or KNIME.
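As a rough illustration of the bank example, the sketch below estimates default risk from past loans grouped by credit score band. It is a deliberately naive stand-in for a real predictive model (which would typically use regression or machine learning); the loan records, band names, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical past loans: (credit score band, did the customer default?)
past_loans = [
    ("low", True), ("low", True), ("low", False), ("low", True),
    ("medium", False), ("medium", True), ("medium", False), ("medium", False),
    ("high", False), ("high", False), ("high", False), ("high", True),
]

def default_rates(loans):
    """Share of past loans that defaulted, per credit score band."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for band, defaulted in loans:
        totals[band] += 1
        if defaulted:
            defaults[band] += 1
    return {band: defaults[band] / totals[band] for band in totals}

def is_high_risk(band, rates, threshold=0.5):
    """Flag a new applicant whose band has a high historical default rate."""
    return rates[band] >= threshold

rates = default_rates(past_loans)
print(rates)
print("Low band high risk?", is_high_risk("low", rates))
```

The historical default rate of each band is used as the predicted likelihood for a new applicant in that band, which is the basic idea of forecasting from past data.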
4. Prescriptive Analytics: Prescriptive analytics answers the question, "What should we do?" It
provides recommendations for actions based on predictive analytics insights, helping
organizations optimize their strategies.
Example: A logistics company uses prescriptive analytics to optimize delivery routes. By
analyzing traffic patterns, weather conditions, and delivery schedules, the software can
recommend the most efficient routes for drivers. For instance, if the system predicts
heavy traffic on a particular route, it might suggest an alternative path that saves time
and fuel costs.
Software Tools: Tools like IBM SPSS, Oracle Analytics Cloud, and advanced BI tools like
ThoughtSpot are commonly used for prescriptive analytics, as they can simulate various
scenarios and provide actionable insights.
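The route recommendation in the logistics example can be sketched as choosing the option with the lowest predicted cost. The route names, predicted minutes, fuel costs, and the driver cost per minute below are hypothetical values chosen only for illustration.

```python
# Hypothetical predictions for three candidate delivery routes.
routes = {
    "Highway":     {"minutes": 55, "fuel_cost": 6.0},
    "Ring road":   {"minutes": 42, "fuel_cost": 7.5},
    "City centre": {"minutes": 48, "fuel_cost": 5.0},
}

def total_cost(route, driver_cost_per_minute=0.5):
    """Combine predicted travel time and fuel into one cost figure."""
    return route["minutes"] * driver_cost_per_minute + route["fuel_cost"]

def recommend_route(candidates):
    """Return the name of the route with the lowest total cost."""
    return min(candidates, key=lambda name: total_cost(candidates[name]))

for name, details in routes.items():
    print(name, total_cost(details))
print("Recommended:", recommend_route(routes))
```

The prediction (travel time) comes from the predictive step; the prescriptive step turns it into a recommended action.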
Real-Life Example:
A small business uses Python scripts and Jupyter Notebooks to analyze customer data for free,
creating sales reports and predictive models without purchasing expensive software.
Real-Life Example:
A large corporation invests in Tableau to create interactive dashboards and reports for
executives, simplifying complex data analysis for strategic decision-making.
Step-by-step Example
Suppose you are a business student analyzing the monthly sales (in units) of a product over 12
months:
Visualizing Data:
Use Insert Chart (e.g., Column Chart) to visualize sales patterns over months.
Summary
Measure              Excel Formula             Explanation
Mean                 =AVERAGE(range)           Average sales over months
Median               =MEDIAN(range)            Middle sales value
Mode                 =MODE.SNGL(range)         Most common sales value
Range                =MAX(range)-MIN(range)    Difference between highest and lowest
Standard Deviation   =STDEV.S(range)           Variability or spread of sales
Using MS Excel to perform descriptive statistics is easy and effective for summarizing data,
making it ideal for business analysis, reports, and decision-making.
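For readers who prefer Python, the same five measures can be computed with the standard statistics module. The 12 monthly sales figures below are hypothetical, since the notes do not list the actual data; statistics.stdev is the sample standard deviation, which corresponds to STDEV.S in Excel.

```python
import statistics

# Hypothetical monthly sales (units) for 12 months.
monthly_sales = [120, 135, 150, 135, 160, 170, 155, 135, 180, 190, 175, 165]

mean_sales = statistics.mean(monthly_sales)            # like =AVERAGE(range)
median_sales = statistics.median(monthly_sales)        # like =MEDIAN(range)
mode_sales = statistics.mode(monthly_sales)            # like =MODE.SNGL(range)
range_sales = max(monthly_sales) - min(monthly_sales)  # like =MAX(range)-MIN(range)
stdev_sales = statistics.stdev(monthly_sales)          # like =STDEV.S(range)

print("Mean:", round(mean_sales, 2))
print("Median:", median_sales)
print("Mode:", mode_sales)
print("Range:", range_sales)
print("Standard deviation:", round(stdev_sales, 2))
```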
Step-by-Step Example
Scenario
Suppose you have the following dataset representing the marks of 10 students:
{1, 4, 6, 1, 8, 15, 18, 1, 5, 1}
Calculation:
Sum = 1 + 4 + 6 + 1 + 8 + 15 + 18 + 1 + 5 + 1 = 60
Number of observations = 10
Mean = 60 / 10 = 6
Calculation:
The value "1" appears 4 times, more than any other value.
Mode = 1
Calculation:
Maximum = 18, Minimum = 1
Range = 18 - 1 = 17
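The three calculations above can be checked with a short Python script using the same marks dataset. Only standard library functions are used.

```python
from statistics import mode

# Marks of the 10 students from the scenario.
marks = [1, 4, 6, 1, 8, 15, 18, 1, 5, 1]

total = sum(marks)                      # 60
mean_mark = total / len(marks)          # 60 / 10
mode_mark = mode(marks)                 # most frequent value
range_mark = max(marks) - min(marks)    # 18 - 1

print("Sum:", total)
print("Mean:", mean_mark)
print("Mode:", mode_mark)
print("Range:", range_mark)
```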
5. Frequency Distribution
Definition: Shows how often each value occurs in the dataset.
Example:
o 1: 4 times
o 4: 1 time
o 5: 1 time
o 6: 1 time
o 8: 1 time
o 15: 1 time
o 18: 1 time
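The frequency distribution listed above can be reproduced with collections.Counter from the Python standard library, using the same marks dataset.

```python
from collections import Counter

# Marks of the 10 students from the scenario.
marks = [1, 4, 6, 1, 8, 15, 18, 1, 5, 1]

# Count how often each distinct value occurs.
frequency = Counter(marks)

for value in sorted(frequency):
    print(f"{value}: {frequency[value]} time(s)")
```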
Frequency distribution: Histogram shows how many patients experienced each level of
reduction
These descriptive statistics help the company understand the typical response and variability
before making broader inferences.
Example:
If the mean score on a test is 75, most students scored around 75.
3. Describe the Spread of Your Data
Standard Deviation: Shows how much values typically differ from the mean. A small
standard deviation means most values are close to the mean; a large one means more
variability.
Range: The difference between the highest and lowest values, indicating the overall
spread.
Example:
If the standard deviation of ages in a group is low, most people are about the same age.
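To see what a small versus a large standard deviation looks like, the sketch below compares two hypothetical groups of ages that share the same mean but differ in spread. The group values are invented for illustration.

```python
from statistics import mean, stdev

# Two hypothetical groups with the same mean age but different spread.
group_a = [24, 25, 25, 26, 25]   # ages close together
group_b = [15, 20, 25, 30, 35]   # ages widely spread

for name, ages in (("Group A", group_a), ("Group B", group_b)):
    spread = max(ages) - min(ages)
    print(name, "mean:", mean(ages),
          "std dev:", round(stdev(ages), 2),
          "range:", spread)
```

Both groups have a mean age of 25, but Group A's standard deviation is much smaller, so its members are much closer in age.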
4. Assess the Shape and Distribution
Skewness: Indicates if your data is symmetric or has a longer tail on one side. Negative
skew means a longer left tail; positive skew means a longer right tail.
Interpretation:
The average (mean) age is 35, with most employees clustered near this value.
The small standard deviation (5) indicates that most employees are close in age.
The range (22–48) shows the youngest and oldest employees.
Slight positive skewness suggests there are a few older employees pulling the mean
slightly above the median.
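A skewness check of this kind can be sketched in Python. The ages below are hypothetical and are not the dataset behind the interpretation above; the skewness function uses one common definition (the average cubed standardized deviation, based on the population standard deviation).

```python
from statistics import mean, median, pstdev

def skewness(values):
    """Average of cubed standardized deviations (population form)."""
    centre = mean(values)
    spread = pstdev(values)
    return sum(((v - centre) / spread) ** 3 for v in values) / len(values)

# Hypothetical employee ages with one noticeably older employee.
ages = [22, 30, 31, 32, 33, 34, 35, 36, 38, 48]

print("Mean:", mean(ages))
print("Median:", median(ages))
print("Skewness:", round(skewness(ages), 2))
```

A positive result indicates a longer right tail, and in this sample the mean is slightly above the median, as the interpretation describes.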
Plotting charts is a fundamental part of data analysis and descriptive statistics. Charts transform
raw numbers into visual stories, making it easier to spot trends, patterns, and outliers, and to
communicate results effectively. The type of chart you choose depends on your data and the
story you want to tell.
Chart Type     Best Used For                                Example
Bar Chart      Categorical data (counts, frequencies)       Which product category sells most units
Pie Chart      Proportion of categories within a whole      Market share of different brands
Line Graph     Data trends over time                        Sales growth across months
Histogram      Distribution of continuous numerical data    Most common test score range in a class
Box Plot       Spread and outliers in numerical data        Comparing income distributions between two regions
Scatter Plot   Relationship between two numeric variables   Correlation between advertising spend and sales
1. Line Graphs
A line chart graphically displays data that changes continuously over time. Each line graph
consists of data points connected by a line to show a trend (continuous change). Line graphs
have an x-axis and a y-axis. In most cases, time is plotted on the horizontal axis.
Example:
The following line graph shows annual sales of a particular business company for the period of
six consecutive years:
Note: the example above uses a single line. However, one line chart can compare multiple trends
by plotting several lines.
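A line graph like this can be drawn in Python with the matplotlib library. The sketch below assumes matplotlib is installed; the years and sales figures are hypothetical because the original chart's values are not given, and the function names are illustrative.

```python
# Hypothetical annual sales (units) for six consecutive years.
years = [2018, 2019, 2020, 2021, 2022, 2023]
annual_sales = [420, 460, 410, 500, 560, 610]

def year_over_year_change(values):
    """Return the change between each pair of consecutive values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

def plot_annual_sales(years, sales, path="annual_sales.png"):
    """Draw a single-line chart with time on the horizontal axis and save it."""
    # matplotlib is imported here so the rest of the sketch runs without it.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display window needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(years, sales, marker="o")
    ax.set_xlabel("Year")
    ax.set_ylabel("Sales (units)")
    ax.set_title("Annual sales")
    fig.savefig(path)
    plt.close(fig)
    return path

# The trend the line makes visible: rises and dips from year to year.
print(year_over_year_change(annual_sales))
```

Calling plot_annual_sales(years, annual_sales) saves the chart as an image file. Adding a second ax.plot call with another series would place a second line on the same axes for comparison.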
2. Bar Charts
Bar charts represent categorical data with rectangular bars. Bar graphs are among the most
popular types of graphs and charts in economics, statistics, marketing, and visualization in
digital customer experience. They are commonly used to compare several categories of data.
Each rectangular bar has a length or height proportional to the value it represents.
One axis of the bar chart presents the categories being compared. The other axis shows a
measured value.
Example:
The bar chart below represents the total sum of sales for Product A and Product B over three
years.
Bars come in two types: vertical or horizontal. Either orientation can be used. The one above
is a vertical type.
3. Pie Charts
Among the statistical types of graphs and charts, the pie chart (or the circle chart) has an
important place. It displays data and statistics in an easy-to-understand ‘pie-slice’
format and illustrates numerical proportion.
Each pie slice is proportional to the size of a particular category within the group as a whole.
In other words, the pie chart breaks down a group into smaller pieces. It shows part-whole
relationships.
To make a pie chart, you need a list of categories and a numerical value for each category.
Example:
The pie chart below represents the proportion of types of transportation used by 1000 students
to go to their school.
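The part-whole idea behind a pie chart can be shown with a short calculation: each slice's share of the total determines its percentage and its angle out of 360 degrees. The transport counts below are hypothetical, since the original chart's values are not given; they are chosen to add up to 1000 students.

```python
# Hypothetical counts of how 1000 students travel to school.
transport_counts = {"Bus": 400, "Bicycle": 250, "Walking": 200, "Car": 150}

total_students = sum(transport_counts.values())

# Each slice's percentage of the whole and its angle in the circle.
percentages = {mode: count * 100 / total_students
               for mode, count in transport_counts.items()}
angles = {mode: count * 360 / total_students
          for mode, count in transport_counts.items()}

for mode in transport_counts:
    print(f"{mode}: {percentages[mode]}% of students, {angles[mode]} degrees")
```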
Pie charts are widely used by data-driven marketers for displaying marketing data.
4. Histogram
A histogram shows continuous data in ordered rectangular columns. Usually, there are no gaps
between the columns.
The histogram displays the frequency distribution (shape) of a data set. At first glance,
histograms look similar to bar graphs. However, there is a key difference between them: a bar
chart represents categorical data, while a histogram represents continuous data.
Histogram Uses:
When the data is continuous.
When you want to represent the shape of the data’s distribution.
When you want to see whether the outputs of two or more processes are different.
To summarize large data sets graphically.
To communicate the data distribution quickly to others.
Example:
The histogram below represents per capita income for five age groups.
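What a histogram does under the hood is group continuous values into equal-width intervals (bins) and count how many values fall in each. The sketch below shows that step for a hypothetical list of test scores; the bin width of 10 and the function name are illustrative choices.

```python
from collections import Counter

# Hypothetical test scores for a class of 15 students.
scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 79, 82, 85, 88, 94]

def bin_counts(values, width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((value // width) * width for value in values)
    return dict(sorted(counts.items()))

# Print a simple text histogram: one '#' per score in the bin.
# Bins that contain no values are simply not listed in this sketch.
for lower_edge, count in bin_counts(scores, 10).items():
    print(f"{lower_edge}-{lower_edge + 9}: {'#' * count}")
```

The tallest row (70-79 here) is the most common score range, which is the shape information a histogram is meant to reveal.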
5. Scatter plot
The scatter plot is an X-Y diagram that shows a relationship between two variables. It is used to
plot data points on a vertical and a horizontal axis. The purpose is to show how much one
variable affects another.
Usually, when there is a relationship between 2 variables, the first one is called independent.
The second variable is called dependent because its values depend on the first variable.
Scatter plots also help you predict the behavior of one variable (dependent) based on the
measure of the other variable (independent).
Example:
The scatter plot below presents data for 7 online stores: their monthly e-commerce sales and
online advertising costs for the last year.
The orange line you see in the plot is called “line of best fit” or a “trend line”. This line is used to
help us make predictions that are based on past data.
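A line of best fit is usually found with the least squares method. The sketch below fits such a line for 7 stores and uses it to predict sales for a new advertising budget; the figures are hypothetical (the original plot's values are not given) and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Read a predicted value off the fitted line."""
    return slope * x + intercept

# Hypothetical data for 7 online stores (both in thousands).
ad_cost = [1, 2, 3, 4, 5, 6, 7]          # independent variable
sales = [12, 15, 21, 24, 30, 33, 40]     # dependent variable

slope, intercept = fit_line(ad_cost, sales)
print("Slope:", round(slope, 2), "Intercept:", round(intercept, 2))
print("Predicted sales for ad cost 8:", round(predict(8, slope, intercept), 2))
```

The slope estimates how much sales change for each extra unit of advertising cost, based only on past data.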
Scatter plots are widely used in data science and statistics. They are a great tool for
visualizing linear regression models.
Plotting charts is an effective way to visualize data and draw meaningful inferences. In this
example, the line chart of monthly sales data provided insights into trends, monthly changes,
and potential areas for improvement. By analyzing the chart, the sales manager can make
informed decisions to enhance business performance and set realistic goals for future growth.