Ids Unit 5 Final

Uploaded by

pnithishreddy14

INTRODUCTION TO DATA SCIENCE(IDS)

Prepared by N.Pandu Ranga Reddy

UNIT-5

Graphical Representation of Data (CHARTS & GRAPHS)

Graphical representation of data turns numbers and facts into pictures and diagrams.

Instead of reading through long lists of numbers, we use charts, graphs, and other visuals to understand information more easily.

In this unit on data visualization, we look at different kinds of graphs and charts that help reveal the patterns and stories hidden in data.
What is Graphical Representation?
Graphical representation is a way of presenting data in pictorial form. It helps a reader understand a large data set easily, because patterns in the data become visible.

There are two ways of representing data:

 Tables
 Pictorial representation through graphs

As the saying goes, “A picture is worth a thousand words”. Presenting data graphically is usually the better choice: surveys and practical studies indicate that people retain and understand information better when it is presented visually.

Types of Graphical Representations

Graphs are the best way to compare different items, because the key features of each item's data can be seen side by side. The main types of graphical representation are described briefly below:
1. Line Graphs

A line graph (also called a line chart or line plot) shows how the value of a variable changes with time. It is drawn by plotting the data points, usually as dots or other markers, and connecting consecutive points with straight-line segments. It is useful for analyzing trends in the data and predicting future trends.

A single line graph can show one variable or several variables against time.

Parts of Line Graph

Parts of the line graph include the following:

 Title: the name of the graph, stating what it shows.
 Axes: a line graph has two axes, the X-axis and the Y-axis.
 Labels: the names given to the x-axis and the y-axis.
 Line: the line segments that connect the data points.
 Point: the marker drawn at each data value, where two segments meet.

Example: Draw a line graph for the given data

No. of Days   1    2    3    4
Absentees     5   10   15   10
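
The example above can be drawn with Matplotlib roughly as follows. This is a minimal sketch, not the figure from the original notes: the function name draw_line_graph and the chart title are assumptions made for illustration, and Matplotlib is imported inside the function so that the data can be inspected even where the library is not installed.

```python
# Data from the example table above.
days = [1, 2, 3, 4]
absentees = [5, 10, 15, 10]


def draw_line_graph(x, y):
    """Plot y against x as a line graph with a marker at each point."""
    import matplotlib.pyplot as plt  # assumed to be installed

    fig, ax = plt.subplots()
    ax.plot(x, y, marker="o")          # segments joining the data points
    ax.set_title("Absentees per day")  # title (wording is an assumption)
    ax.set_xlabel("No. of Days")       # x-axis label
    ax.set_ylabel("Absentees")         # y-axis label
    ax.set_xticks(x)                   # one tick per day
    return fig


# Day-to-day change: this is what the slope of each segment shows.
changes = [later - earlier for earlier, later in zip(absentees, absentees[1:])]
print(changes)  # [5, 5, -5]
```

Calling draw_line_graph(days, absentees) returns the figure, which can then be saved with fig.savefig(...) or displayed with matplotlib.pyplot.show(). The rising segments for days 1 to 3 and the falling segment for day 4 correspond to the positive and negative changes printed above.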
Multiple Line Graph

A multiple line graph shows two or more lines on a single chart. The lines may belong to the same category or to different ones, which makes it easy to compare them. A double line graph is the simplest case of a multiple line graph.

An example of a multiple line graph is shown below:

PIE CHARTS:
A pie chart is a popular and visually intuitive way of representing data. It is a circular graph divided into slices, like a pie, where each slice represents a proportion of the whole. This makes it easy to compare categories at a glance.

Pie charts are ideal for displaying percentage data or showing how individual parts contribute to a total.

The full circle represents the entire data set, and each slice represents one part of it.

Pie charts are also known as circle graphs or pie diagrams.

Pie Chart Examples
In a class of 200 students, a survey was done to collect each student’s favorite sport.
The pie chart of the data is given below:

Since the pie chart gives the percentage for each sport and the total number of students is known, we can recover the number of students for each sport.

 Cricket = 17/100 × 200 = 34 students

 Football = 25/100 × 200 = 50 students
 Badminton = 12/100 × 200 = 24 students
 Hockey = 5/100 × 200 = 10 students
 Other = 41/100 × 200 = 82 students

Pie Chart Formula

The whole pie always represents 100% of the data, and the full circle measures 360°. Each sector of the chart corresponds to a portion of the total, so the angles of all the sectors add up to 360°.

 Converting the data into degrees on a pie chart:
(Given Data / Total Value of Data) × 360°

 Calculating the percentage of each sector from degrees in a pie chart:
(Sector Angle / 360°) × 100
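
The conversions can be checked on the favorite-sports survey with a few lines of Python. This is a small sketch using only the figures given in the example; the function names are illustrative and not part of any library.

```python
total_students = 200
percent_share = {"Cricket": 17, "Football": 25, "Badminton": 12,
                 "Hockey": 5, "Other": 41}


def share_to_count(percent, total):
    """Number of items that a percentage share represents."""
    return percent * total / 100


def value_to_degrees(value, total):
    """Sector angle: (given data / total value of data) x 360."""
    return value * 360 / total


def degrees_to_percent(angle):
    """Percentage of the whole that a sector angle represents."""
    return angle / 360 * 100


counts = {sport: share_to_count(p, total_students)
          for sport, p in percent_share.items()}
angles = {sport: value_to_degrees(c, total_students)
          for sport, c in counts.items()}

for sport in percent_share:
    print(f"{sport}: {counts[sport]:.0f} students, {angles[sport]:.1f} degrees")
```

The counts add up to the 200 students surveyed and the angles add up to 360°, which is a quick way to check that no category has been missed.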

Chart Legend
 Plot/chart legends give meaning to a visualization by assigning labels to the various plot elements.
 Legends are found in maps, where they describe the pictorial language or symbology of the map.
 Legends are used in line graphs to explain the function or the values underlying the different lines of the graph.
 Matplotlib has native support for legends. A legend can be placed inside or outside the chart, and its position can be moved. The legend() method adds the legend to the plot.
 To place the legend inside the chart, simply call legend().

In the above graph, each series is drawn in its own color, and the legend provides the color-based labels “blue” and “green” for clarity.
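
A legend of this kind might be produced as sketched below. The two series and their values are hypothetical, chosen only so that there is something to label, and draw_with_legend is an illustrative helper rather than a Matplotlib function. Matplotlib is imported inside the helper and is assumed to be installed.

```python
# Hypothetical series, one per color, used only to demonstrate the legend.
x = [1, 2, 3, 4]
series = {"blue": [1, 4, 9, 16], "green": [1, 2, 3, 4]}


def draw_with_legend(x, series, outside=False):
    """Plot each series in its own color and attach a legend."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for color, y in series.items():
        # The label given here is the text shown in the legend.
        ax.plot(x, y, color=color, label=color)
    if outside:
        # Anchor the legend's upper-left corner just right of the axes.
        ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1.0))
    else:
        ax.legend()  # inside the chart, at a position Matplotlib chooses
    return fig
```

With outside=True the figure usually needs extra room on the right, for example by saving with fig.savefig("legend.png", bbox_inches="tight"), otherwise the legend may be cut off.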

2. Bar Graphs/Charts
A bar graph is a graphical representation of data in which bars of uniform width are drawn with equal spacing between them along one axis (usually the x-axis), one bar per category. The value of each category is represented by the height of its bar.

The bars may be drawn vertically or horizontally; in either case the length of the bar represents the value, read off the other axis.

Bar graphs are usually used to display ‘categorical data’, i.e., data that fall into distinct categories.

Reading a Bar Graph and comparing two sets of data

To read a bar graph, we ask ourselves a few questions about the displayed graph. Let’s work through a basic example: a bar graph of the different types of fruit liked by people.

What do the X-axis and Y-axis of the graph represent?

The X-axis represents the different types of fruit, such as apple and guava, while the Y-axis represents the number of people.

Overall, what kind of information is the bar graph displaying?

The bar graph displays the number of people who like each type of fruit.
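
A bar graph like the fruit example could be drawn as sketched below. The notes do not give the underlying numbers, so the counts here are hypothetical and serve only to illustrate the layout; draw_bar_graph is an illustrative helper, and Matplotlib is assumed to be installed.

```python
# Hypothetical survey counts (the original figure's values are not given).
fruit_votes = {"Apple": 30, "Guava": 12, "Mango": 45, "Banana": 25}


def draw_bar_graph(counts):
    """Draw one vertical bar of uniform width per category."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(counts), list(counts.values()), width=0.6)
    ax.set_xlabel("Type of fruit")     # categories on the x-axis
    ax.set_ylabel("Number of people")  # bar height gives the value
    ax.set_title("Fruits liked by people")
    return fig


# Reading the graph: the tallest bar is the most-liked fruit.
most_liked = max(fruit_votes, key=fruit_votes.get)
print(most_liked)  # Mango
```

Replacing ax.bar with ax.barh gives the horizontal form, in which the length of each bar, rather than its height, represents the value.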
3. Histograms

A histogram is similar to a bar graph, but it shows the frequency of numerical values rather than the values themselves. The data are organized into intervals, known as bins or classes, and each bar represents the number of values that lie in that interval.

What is a Histogram?

Histograms are graphical representations of data distributions. They consist of bars, each representing the frequency or count of observations falling within a specific bin.

A histogram can also be seen as a variation of a bar chart in which quantitative data values are grouped into classes. This grouping lets you see how frequently values in each class occur in the dataset.

Example:
Suppose you’re analyzing the distribution of scores on a standardized test. You have data for 2000 students, and you want to visualize how many students scored within different score ranges. You can create a histogram from the following data.

Score Range   Frequency
0-25          150
26-50         300
51-75         600
76-100        750
101-125       150
126-150       50

The histogram shows a single peak: the most common score range is 76-100 (750 of the 2000 students), and the frequencies fall away on both sides. The distribution is therefore roughly bell-shaped, although it is not perfectly symmetric.
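
The frequency table can be checked, and the binning step illustrated, with a short Python sketch. The table values come from the example above; the list sample_scores is hypothetical and exists only to show how raw scores are counted into ranges, and count_by_range is an illustrative helper.

```python
# Frequency table from the example above.
score_table = {"0-25": 150, "26-50": 300, "51-75": 600,
               "76-100": 750, "101-125": 150, "126-150": 50}

total_students = sum(score_table.values())
modal_range = max(score_table, key=score_table.get)
print(total_students, modal_range)  # 2000 76-100


def count_by_range(scores, ranges):
    """Count how many scores fall in each inclusive (low, high) range."""
    counts = {r: 0 for r in ranges}
    for score in scores:
        for low, high in ranges:
            if low <= score <= high:
                counts[(low, high)] += 1
                break  # each score belongs to exactly one range
    return counts


ranges = [(0, 25), (26, 50), (51, 75), (76, 100), (101, 125), (126, 150)]
sample_scores = [12, 30, 55, 60, 80, 99, 100, 130]  # hypothetical raw data
print(count_by_range(sample_scores, ranges))
```

The heights of the histogram bars are exactly these counts, one bar per range.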

SCATTER PLOT
A scatter plot, also called a scatter graph, scatter chart, scattergram, scatter diagram, or XY graph, is a graphical technique that uses dots to represent the values of two different numeric variables.

The position of each dot along the horizontal and vertical axes gives the two values for an individual data point.

A scatter plot shows the relationship between two variables on a two-dimensional graph, known mathematically as the Cartesian plane.

It is generally used to plot one independent variable against one dependent variable. The independent variable is plotted on the x-axis and the dependent variable on the y-axis, so that you can visualize the effect of the independent variable on the dependent variable.

A scatter plot is therefore useful when we need to find the relationship between two sets of data, or when we suspect that a relationship between two variables may be the root cause of some problem.

Let's understand the process through an example. The following table gives a data set of two variables.

Matches Played   2   5   7   1   12   15   18
Goals Scored     1   4   5   2    7   12   11

This data set has two variables: the number of matches played by a certain player and the number of goals scored by that player. Suppose we want to find the relationship between them. For now, let us set aside our intuition that the number of goals scored rises with the number of matches played, and assume that we only have the given data set, from which we must extract the relationship between the two variables.

Scatter Plot
In a scatter plot, each pair of values in the data set is represented by a dot.
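
The matches and goals data can be plotted and summarized as sketched below. The plotting helper draw_scatter is illustrative and assumes Matplotlib is installed. The Pearson correlation coefficient is not mentioned in these notes; it is added here as one common way, among others, of putting a number on the relationship that the scatter plot shows.

```python
from math import sqrt

matches = [2, 5, 7, 1, 12, 15, 18]   # independent variable (x-axis)
goals = [1, 4, 5, 2, 7, 12, 11]      # dependent variable (y-axis)


def draw_scatter(x, y):
    """Draw one dot per (x, y) pair."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x, y)
    ax.set_xlabel("Matches Played")
    ax.set_ylabel("Goals Scored")
    return fig


def pearson_r(x, y):
    """Pearson correlation: +1 is a perfect rising line, -1 a falling one."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(matches, goals)
print(round(r, 2))
```

For this data the coefficient comes out close to +1, which matches what the dots show: goals scored tend to rise with matches played.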
BOX PLOT/Box and Whisker Plot

A box plot (also known as a whisker plot, box-and-whisker plot, or box-and-whisker diagram) is another way to display quantitative data.

A box plot is a graphical representation of the distribution of a dataset that depicts numerical data through their quartiles. It displays key summary statistics such as the median, the quartiles, and potential outliers in a concise, visual manner.

With a box plot you can summarize a distribution, identify potential outliers, and compare different datasets compactly.
The box can be displayed either vertically or horizontally, depending on the labeling of the axes.

The box does not need to be perfectly symmetrical, because it represents data that might not be perfectly symmetrical.

Elements of Box Plot

A box plot gives a five-number summary of a set of data which is-

 Minimum – It is the minimum value in the dataset excluding the outliers.

 First Quartile (Q1) – 25% of the data lies below the First (lower) Quartile.
 Median (Q2) – It is the mid-point of the dataset. Half of the values lie below it and half
above.
 Third Quartile (Q3) – 75% of the data lies below the Third (Upper) Quartile.
 Maximum – It is the maximum value in the dataset excluding the outliers.

Note: The box plot shown in the above diagram is a perfect plot with no skewness. The
plots can have skewness and the median might not be at the center of the box.

The area inside the box (the middle 50% of the data) is known as the Interquartile Range (IQR).

The IQR is calculated as –

IQR = Q3 - Q1

Outliers are the data points that lie below the lower limit or above the upper limit. The limits are calculated as –

Lower Limit = Q1 - 1.5*IQR

Upper Limit = Q3 + 1.5*IQR
Values outside these limits are considered outliers. The minimum and maximum are then the smallest and largest values that lie within the limits.

How to create a box plot?

Let us take some sample data to understand how to create a box plot.

Here are the runs scored by a cricket team in a league of 12 matches – 100, 120, 110, 150, 110, 140, 130, 170, 120, 220, 140, 110.

To draw a box plot for the given data, we first arrange the data in ascending order and then find the minimum, first quartile, median, third quartile, and maximum.

Ascending Order
100, 110, 110, 110, 120, 120, 130, 140, 140, 150, 170, 220

Median (Q2) = (120+130)/2 = 125, the mean of the two middle values, since the number of values is even.

To find the First Quartile we take the first six values and find their median.
Q1 = (110+110)/2 = 110

For the Third Quartile, we take the last six values and find their median.
Q3 = (140+150)/2 = 145

Note: If the total number of values is odd then we exclude the Median while calculating
Q1 and Q3. Here since there were two central values we included them. Now, we need to
calculate the Inter Quartile Range.

IQR = Q3-Q1 = 145-110 = 35

We can now calculate the Upper and Lower Limits to find the minimum and maximum
values and also the outliers if any.

Lower Limit = Q1-1.5*IQR = 110-1.5*35 = 57.5

Upper Limit = Q3+1.5*IQR = 145+1.5*35 = 197.5

So, the minimum and maximum between the range [57.5,197.5] for our given data are –
Minimum = 100
Maximum = 170
The outliers which are outside this range are –
Outliers = 220

Now we have all the information, so we can draw the box plot which is as below-

We can see from the diagram that the Median is not exactly at the center of the box and
one whisker is longer than the other. We also have one Outlier.
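
The whole calculation can be reproduced in Python as a check. This is a minimal sketch that follows the median-of-halves rule used in the worked example; the function name box_plot_summary is illustrative. Statistical libraries often use other quartile conventions, so their Q1 and Q3 may differ slightly from the values obtained here.

```python
from statistics import median

runs = [100, 120, 110, 150, 110, 140, 130, 170, 120, 220, 140, 110]


def box_plot_summary(values):
    """Five-number summary, limits and outliers for a box plot."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    upper_half = data[half + n % 2:]  # skips the middle value when n is odd
    q1, q2, q3 = median(lower_half), median(data), median(upper_half)
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [v for v in data if lower_limit <= v <= upper_limit]
    return {
        "min": min(inside),   # end of the lower whisker
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(inside),   # end of the upper whisker
        "iqr": iqr,
        "limits": (lower_limit, upper_limit),
        "outliers": [v for v in data if v < lower_limit or v > upper_limit],
    }


summary = box_plot_summary(runs)
for name, value in summary.items():
    print(name, value)
```

The printed values agree with the hand calculation: Q1 = 110, median = 125, Q3 = 145, IQR = 35, limits 57.5 and 197.5, whiskers ending at 100 and 170, and a single outlier at 220.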

Use-Cases of Box Plot

 Box plots provide a visual summary of the data from which we can quickly identify the central value of the data, how dispersed the data is, and whether the data is skewed.
 The median gives the central value of the data: half of the values lie below it and half above. It is generally not the same as the arithmetic mean.
 Box plots show the skewness of the data:
a) If the median is at the center of the box and the whiskers are about the same length on both ends, then the data is approximately symmetric, as it would be for a normal distribution.
b) If the median lies closer to the First Quartile and the whisker at the lower end is shorter (as in the above example), then the data has a Positive Skew (Right Skew).
c) If the median lies closer to the Third Quartile and the whisker at the upper end is shorter, then the data has a Negative Skew (Left Skew).
 The dispersion or spread of the data can be seen from the minimum and maximum values, which are found at the ends of the whiskers.
 The box plot shows the outliers, which are the points that are numerically distant from the rest of the data.
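
Rules a) to c) can be written as a small function. This is a rough sketch: the function name describe_skew and the "mixed signals" label for cases the rules do not cover are assumptions made here, and the comparison uses exact equality where a real analysis would usually allow some tolerance.

```python
def describe_skew(minimum, q1, med, q3, maximum):
    """Classify skew from the position of the median and the whisker lengths."""
    lower_box, upper_box = med - q1, q3 - med
    lower_whisker, upper_whisker = q1 - minimum, maximum - q3
    if lower_box == upper_box and lower_whisker == upper_whisker:
        return "approximately symmetric"
    if lower_box < upper_box and lower_whisker < upper_whisker:
        return "positive (right) skew"
    if lower_box > upper_box and lower_whisker > upper_whisker:
        return "negative (left) skew"
    return "mixed signals"  # box and whiskers disagree


# Five-number summary of the cricket example (outlier 220 excluded).
print(describe_skew(100, 110, 125, 145, 170))  # positive (right) skew
```

In the cricket example the median is 15 above Q1 but 20 below Q3, and the lower whisker (10) is shorter than the upper whisker (25), so both signs point to a positive skew.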

How to compare box plots?

Box plots make it easy to compare the characteristics of data between categories. Let us look at how we can compare different box plots and derive statistical conclusions from them.
Let us take the below two plots as an example: –
 Compare the Medians: If the median line of one box plot lies outside the box of the other box plot, then there is likely to be a difference between the two groups. Here the median line of Plot B lies outside the box of Plot A.

 Compare the Dispersion or Spread of data: The Interquartile Range (the length of the box) gives an idea of how dispersed the data is. Here Plot A has a longer box than Plot B, which means that the data in Plot A is more dispersed. The length of the whiskers also gives an idea of the overall spread of the data. The extreme values (minimum and maximum) give the range of the distribution: the larger the range, the more scattered the data. Here Plot A has a larger range than Plot B.

 Compare the Outliers: Outliers are unusual data values that are distant from the rest of the data. A larger number of outliers makes predictions more uncertain. We can be more confident when predicting values for a plot that has few or no outliers.

 Compare Skewness: Skewness gives the direction and the magnitude of the lack of symmetry. We have discussed above how to identify skewness. Here Plot A is Positive or Right Skewed and Plot B is Negative or Left Skewed.

Example box plot

Linear Regression in Machine Learning/Data science
Linear regression is one of the simplest and most popular Machine Learning algorithms. It is a statistical method used for predictive analysis. Linear regression makes predictions for continuous (real-valued, numeric) variables such as sales, salary, age, product price, etc.

The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression. Because the relationship is linear, the model describes how the value of the dependent variable changes with the value of the independent variable.

The linear regression model provides a sloped straight line representing the relationship between the variables. Consider the below image:

Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε
Here,

y = Dependent Variable (Target Variable)

x = Independent Variable (Predictor Variable)

a0 = Intercept of the line (gives an additional degree of freedom)

a1 = Linear regression coefficient (scale factor applied to each input value)
ε = Random error

The values of the x and y variables form the training dataset from which the Linear Regression model is built.

Types of Linear Regression

Linear regression can be further divided into two types of algorithm:

o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable, then such a Linear Regression algorithm is called Simple Linear Regression.

o Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent variable, then such a Linear Regression algorithm is called Multiple Linear Regression.

Linear Regression Line

A straight line showing the relationship between the dependent and independent variables is called a regression line. A regression line can show two types of relationship:
o Positive Linear Relationship:
If the dependent variable (on the Y-axis) increases as the independent variable (on the X-axis) increases, the relationship is termed a positive linear relationship.

o Negative Linear Relationship:
If the dependent variable (on the Y-axis) decreases as the independent variable (on the X-axis) increases, the relationship is called a negative linear relationship.
Finding the best fit line:

When working with linear regression, our main goal is to find the best fit line, meaning the line for which the error between the predicted values and the actual values is as small as possible. The best fit line has the least error.

Different values of the weights or coefficients of the line (a0, a1) give different regression lines, so we need to calculate the best values of a0 and a1 to find the best fit line. To do this we use a cost function.

Cost function-

o The cost function is used to estimate the values of the coefficients (a0, a1) for the best fit line.

o The regression coefficients or weights are optimized by minimizing the cost function, which measures how well a linear regression model is performing.

o We can use the cost function to find the accuracy of the mapping function, which maps the input variable to the output variable. This mapping function is also known as the hypothesis function.

o For Linear Regression, we use the Mean Squared Error (MSE) cost function, which is the average of the squared errors between the predicted values and the actual values. For the above linear equation, MSE can be calculated as:

MSE = (1/N) × Σ (yi − (a1xi + a0))², summed over i = 1, ..., N

o Where,

o N = Total number of observations

o yi = Actual value

o (a1xi + a0) = Predicted value

o Residuals: The distance between an actual value and the corresponding predicted value is called a residual. If the observed points are far from the regression line, the residuals will be large, and so the cost function will be high. If the scatter points are close to the regression line, the residuals will be small, and hence the cost function will be low.
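
The cost function can be tried out on the matches and goals data from the scatter plot section. The notes do not say how the minimum of the MSE is found; the sketch below assumes the standard closed-form least-squares solution for simple linear regression, and the function names fit_line and mse are illustrative.

```python
matches = [2, 5, 7, 1, 12, 15, 18]   # x: independent variable
goals = [1, 4, 5, 2, 7, 12, 11]      # y: dependent variable


def fit_line(xs, ys):
    """Return (a0, a1) minimizing the MSE of y = a0 + a1*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx               # slope
    a0 = mean_y - a1 * mean_x    # intercept
    return a0, a1


def mse(xs, ys, a0, a1):
    """Mean of the squared residuals yi - (a1*xi + a0)."""
    return sum((y - (a1 * x + a0)) ** 2 for x, y in zip(xs, ys)) / len(xs)


a0, a1 = fit_line(matches, goals)
best_cost = mse(matches, goals, a0, a1)
other_cost = mse(matches, goals, 0.0, 1.0)  # an arbitrary line, y = x

print(f"a0 = {a0:.2f}, a1 = {a1:.2f}")
print(f"MSE of fitted line = {best_cost:.2f}, MSE of y = x: {other_cost:.2f}")
```

The fitted line has a positive slope, so the data shows a positive linear relationship, and its MSE is lower than that of the arbitrary comparison line, as it must be for the best fit line.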

