
Data Sci

This lab record documents experiments on data visualization in Python using libraries like Matplotlib and Seaborn. Key plots covered include box plots, bar plots, pie charts, and histograms. Code examples demonstrate how to generate each type of plot from sample datasets. Box plots show the distribution of data values while bar plots enable comparison across categories. Pie charts express percentages and histograms bin continuous data for visualization.

Uploaded by

harsat2030
Copyright
© All Rights Reserved
Lab Record

of

INTRODUCTION TO DATA SCIENCE


(CSF345)

BACHELOR OF TECHNOLOGY
In

COMPUTER SCIENCE AND ENGINEERING

Session 2022-23

Submitted to:
Faculty Name: Dr Amit Kumar Mishra
Designation: Associate Professor
School of Computing

Submitted by:
Name: Harshit Vashisth
Roll No.: 200102022
SAP ID: 1000014073
Section: A (P1)

SCHOOL OF COMPUTING
DIT UNIVERSITY, DEHRADUN
(State Private University through State Legislature Act No. 10 of 2013 of Uttarakhand and approved by UGC)
Mussoorie Diversion Road, Dehradun, Uttarakhand - 248009, India.
INDEX
S. NO.   EXPERIMENTS                                                   DATE   REMARK
1.       Describe the installation of Anaconda and Jupyter Notebook.
2.       Data Visualization

EXPERIMENT 1

OBJECTIVE:
Describe the installation of anaconda and jupyter notebook.

ANSWER:

1. Download and install Anaconda:

Go to the Anaconda website and choose the installer for Python 3.10 (64-bit).

Open the download section and download the installer, then run it. Click Next, read the license agreement, and click I Agree. When the next dialog appears, click Next again. Note your installation location and then click Next. Finally, add the installation location to the PATH environment variable; Anaconda is now installed.

2. Jupyter Notebook installation using pip:

Open the Command Prompt: press Windows key + R, type cmd, and click OK.

Type the command ‘pip install jupyter’ and press Enter. pip takes a few moments to download and install the required packages, after which Jupyter Notebook is installed.

To launch Jupyter Notebook, type the command ‘jupyter notebook’ and press Enter.

A Python library is a collection of modules containing functions and classes that other programs can use to perform various tasks. To install a library, use the command ‘pip install <library>’, where <library> is the name of the library.

Some commonly used libraries are:

• NumPy
• Pandas
• Matplotlib
• TensorFlow, etc.

The process to install them is as follows:

To install pandas, run ‘pip install pandas’; pandas is then installed and ready to use. Similarly, to install matplotlib, run ‘pip install matplotlib’.
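
Whether the installations succeeded can be checked from Python itself. The following is a minimal sketch that uses only the standard library. The helper name `is_installed` is hypothetical, and the module names are the import names of the libraries listed above, which may differ from the names used with pip.

```python
from importlib.util import find_spec

def is_installed(module_name):
    """Return True if the top-level module can be located for import."""
    return find_spec(module_name) is not None

# Import names of the libraries listed above (assumed; pip names may differ)
for name in ["numpy", "pandas", "matplotlib", "tensorflow"]:
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Any library reported as missing can then be installed with the corresponding ‘pip install’ command.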

EXPERIMENT 2

OBJECTIVE:
Data visualization in Python

DATA VISUALIZATION-
Data visualization is the representation of data through common graphics such as charts, plots, infographics, and even animations. These visual displays communicate complex data relationships and data-driven insights in a way that is easy to understand. More specifically, data visualization is the process of finding trends and correlations in data by representing it pictorially.

Python offers several plotting libraries, such as Matplotlib, Seaborn, and Plotly, each with different features for creating informative, customized, and appealing plots that present data simply and effectively.

Matplotlib and Seaborn are Python libraries used for data visualization, and both provide built-in functions for plotting different kinds of graphs. Matplotlib is a general-purpose plotting library whose figures can also be embedded into applications, while Seaborn is built on top of Matplotlib and is used primarily for statistical graphics.

BOX PLOT:
A box plot, also known as a box-and-whisker plot, displays a summary of a set of data values through five properties: minimum, first quartile, median, third quartile, and maximum. The five key values are as follows:

• Minimum (lower whisker limit): Q1 - 1.5*IQR
• 1st quartile (Q1): 25th percentile
• Median: 50th percentile
• 3rd quartile (Q3): 75th percentile
• Maximum (upper whisker limit): Q3 + 1.5*IQR

Here IQR is the interquartile range, IQR = Q3 - Q1, which spans from the first quartile (Q1) to the third quartile (Q3). Data points lying beyond the whisker limits are drawn individually as outliers.
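
The quartiles and whisker limits described above can also be computed by hand. The sketch below uses the standard library's `statistics.quantiles` (Python 3.8 or later) on the first sample dataset used in this experiment. The choice of the "inclusive" method is an assumption; other quartile conventions can give slightly different values.

```python
from statistics import quantiles

data1 = [70, 69, 52, 43, 32, 20, 62, 57, 77, 33, 42, 53, 45]

# Quartiles via linear interpolation between data points ("inclusive" method)
q1, median, q3 = quantiles(data1, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range

# Whisker limits: points outside these are treated as outliers
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in data1 if x < lower_limit or x > upper_limit]

print("Q1, median, Q3:", q1, median, q3)
print("IQR:", iqr)
print("Whisker limits:", lower_limit, upper_limit)
print("Outliers:", outliers)
```

For this dataset the quartiles are 42, 52, and 62, so the IQR is 20 and the whisker limits are 12 and 92. No value falls outside them, so there are no outliers.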

Code-
# Import libraries
import matplotlib.pyplot as plt

# Create two sample datasets
data1 = [70, 69, 52, 43, 32, 20,
         62, 57, 77, 33, 42, 53, 45]
data2 = [1, 1, 2, 4, 5, 10,
         10, 11, 11, 14, 23, 30]

# Box plot of the first dataset
plt.boxplot(data1, widths=0.4)
plt.show()

# Box plot of the second dataset, with a filled box (patch_artist=True)
plt.boxplot(data2, widths=0.4, patch_artist=True)
plt.show()

Output-

Code-
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt

# Create dataset
data = {'Laptop': ['Pavilion', 'Legion', 'Omen', 'Macbook',
'Alienware'],
'Ram': [1650, 1090, 1040, 2020, 2060],
'Graphics': [1650, 3090, 2040, 3020, 2060]}

d = pd.DataFrame(data)
# create plot
d.plot.box()
plt.show()

Output-

BAR PLOT:
A bar plot (or bar chart) is one of the most common types of graphic. It shows the relationship between a numeric and a categorical variable. Each category of the categorical variable is represented as a bar, and the size of the bar represents its numeric value. The bars can be plotted vertically or horizontally.

A bar graph shows comparisons among discrete categories. One axis of the chart shows the specific
categories being compared, and the other axis represents a measured value.
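
As a small illustration of the horizontal orientation mentioned above, the sketch below draws the same kind of laptop/RAM data with Matplotlib's `barh`. The helper name `horizontal_bar` and the colors are assumptions made for this example only.

```python
import matplotlib.pyplot as plt

laptops = ['Pavilion', 'Legion', 'Omen', 'Macbook', 'Alienware']
ram = [12, 16, 24, 32, 14]

def horizontal_bar(categories, values):
    """Draw one horizontal bar per category and return the bar container."""
    fig, ax = plt.subplots()
    bars = ax.barh(categories, values, color='grey', edgecolor='blue')
    ax.set_xlabel("Ram")     # the measured value is now on the x-axis
    ax.set_ylabel("Laptop")  # the categories are now on the y-axis
    return bars

bars = horizontal_bar(laptops, ram)
# Call plt.show() afterwards to display the figure
```

With horizontal bars the length of each bar, rather than its height, encodes the numeric value, which can be easier to read when category names are long.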

Code-

# Import libraries
import pandas as pd
import matplotlib.pyplot as plt

# Create dataset
data = {'Laptop': ['Pavilion', 'Legion', 'Omen', 'Macbook',
'Alienware'],
'Ram': [12, 16, 24, 32, 14]}

df = pd.DataFrame(data)

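# Create a bar plot from the DataFrame (the x-axis shows the row index)
df.plot.bar(color=(0.1, 0.1, 0.1, 0.1), edgecolor='blue')

# Show the plot
plt.show()

# Create a second bar plot with laptop names on the x-axis and axis labels
plt.bar(df["Laptop"], df["Ram"], color=(0.2, 0.2, 0.2, 0.2),
        edgecolor='cyan')
plt.xlabel("Laptop")
plt.ylabel("Ram")
plt.show()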

Output-

PIE CHART:
A pie chart, sometimes called a circle chart, is a way of summarizing a set of nominal data or displaying the different values of a given variable (e.g. a percentage distribution). The pie (or circle) represents the total value, i.e., 100 percent, and each slice represents one category's share of that total.
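
The share behind each slice is simply the value divided by the total. The sketch below works this out for the laptop/RAM data used in this experiment, using plain Python; the variable names are chosen for illustration.

```python
laptops = ['Pavilion', 'Legion', 'Omen', 'Macbook', 'Alienware']
ram = [12, 16, 24, 32, 14]

total = sum(ram)  # the whole pie, i.e. 100 percent

# Percentage share of each slice
shares = {name: 100 * value / total for name, value in zip(laptops, ram)}

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Here the total is 98, so the Macbook slice, for example, covers about 32.7 percent of the pie. Matplotlib can print such percentages on the slices when `plt.pie` is given the `autopct` parameter, for instance `autopct='%1.1f%%'`.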

Code-
# Import libraries
import matplotlib.pyplot as plt

# Create dataset
Laptop = ['Pavilion', 'Legion', 'Omen', 'Macbook', 'Alienware']
Ram = [12, 16, 24, 32, 14]

# Create plot: one wedge per laptop, separated by white edges
plt.pie(Ram, labels=Laptop, labeldistance=1.15,
        wedgeprops={'linewidth': 1, 'edgecolor': 'white'})

# show plot
plt.show()

Output-

HISTOGRAM
A histogram is a graphical representation of data points organized into user-specified ranges. Similar in appearance to a bar graph, the histogram condenses a data series into an easily interpreted visual by taking many data points and grouping them into logical ranges or bins. A histogram is an area diagram: a series of rectangles whose bases equal the class widths (the distances between class boundaries) and whose areas are proportional to the frequencies of the corresponding classes. Because the bases span the intervals between class boundaries, adjacent rectangles touch.

Bins are consecutive, non-overlapping intervals of a variable. The matplotlib.pyplot.hist() function computes and draws the histogram of a variable x.
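
To make the binning step concrete, the sketch below counts how many values fall into each bin using only the standard library. The helper name `bin_counts`, the bin edges, and the ten-value sample (the first row of the age data in this experiment) are assumptions for illustration. Each bin includes its left edge and excludes its right edge, except the last bin, which includes both; this is the convention NumPy documents for its histograms.

```python
from bisect import bisect_right

def bin_counts(values, edges):
    """Count values per bin; bins are [lo, hi) except the last, which is [lo, hi]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # value lies outside the binned range
        # Index of the bin whose left edge is the largest edge <= v
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

sample = [44, 57, 47, 53, 42, 43, 21, 33, 93, 59]  # first ten ages
edges = [20, 40, 60, 80, 100]                       # four bins of width 20

counts = bin_counts(sample, edges)
for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"{lo}-{hi}: {c}")
```

For this sample the counts are 2, 7, 0, and 1, and these counts are what the heights of the histogram's rectangles represent when all bins have equal width.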

Code-

#histogram
#import libraries
import matplotlib.pyplot as plt

#create dataset
age = [44, 57, 47, 53, 42, 43, 21, 33, 93, 59,
60, 69, 62, 73, 62, 73, 92, 83, 71, 63,
70, 50, 81, 69, 67, 55, 85, 55, 53, 60,
43, 66, 53, 69, 59, 59, 55, 87, 66, 43,

73, 41, 75, 60, 90, 40, 54, 47, 58, 69,
68, 48, 84, 68, 66, 64, 53, 48, 66, 63,
59, 67, 42, 68, 67, 98, 54, 72, 62, 76,
83, 47, 34, 84, 64, 98, 53, 57, 82, 52,
56, 74, 84, 83, 65, 31, 89, 72, 79, 57,
78, 68, 78, 48, 88, 62, 88, 74, 74, 68,12,16]

#create histogram
plt.hist(age,color="skyblue",edgecolor = 'black')

#create label
plt.xlabel("AGE")
plt.ylabel("POPULATION")
plt.title("Histogram of Age")
plt.show()

#define the edge color and customize the bins (width 5, from 10 to 100)
plt.hist(age, edgecolor='skyblue', color="grey",
         bins=[10, 15, 20, 25, 30, 35, 40, 45, 50, 55,
               60, 65, 70, 75, 80, 85, 90, 95, 100])
plt.xlabel("AGE")
plt.ylabel("POPULATION")
plt.title("Histogram of age")
plt.show()

Output-

https://www.simplilearn.com/tutorials/python-tutorial/data-visualization-in-python#:~:text=The%20process%20of%20finding%20trends,%2C%20Seaborn%2C%20Plotly%2C%20etc.
