Datascience Lab 1-2

DATA SCIENCE

Uploaded by

Geetha A L


MODULE-1

A study was conducted to understand the effect of the number of hours students spent studying on
their performance in the final exams. Write code to plot a line chart with the number of hours spent
studying on the x-axis and the score in the final exam on the y-axis. Use a red '*' as the point
character, label the axes, and give the plot a title.

import matplotlib.pyplot as plt

# Hours spent studying and the matching final exam scores
hours = [10, 9, 2, 15, 10, 16, 11, 16]
score = [95, 80, 10, 50, 45, 98, 38, 93]

# Plot the line chart with a red '*' at each data point
plt.plot(hours, score, marker='*', color='red', linestyle='-')

# Add axis labels and a title
plt.xlabel('Number of Hours Studied')
plt.ylabel('Score in Final Exam')
plt.title('Effect of Hours Studied on Exam Score')

# Show a grid and display the plot
plt.grid(True)
plt.show()
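
Note: the hours list is not in ascending order, so plt.plot joins the points in list order and the line doubles back on itself. The sketch below is one optional way to prepare the data, assuming a line that runs left to right is wanted. The helper name sort_by_hours is made up for this illustration and is not part of the original lab answer.

```python
def sort_by_hours(hours, score):
    """Return both lists reordered so that hours is ascending.

    Hypothetical helper: pairs with equal hours are ordered by score.
    """
    # Pair each hour with its score, then sort the pairs together
    pairs = sorted(zip(hours, score))
    sorted_hours = [h for h, _ in pairs]
    sorted_score = [s for _, s in pairs]
    return sorted_hours, sorted_score


hours = [10, 9, 2, 15, 10, 16, 11, 16]
score = [95, 80, 10, 50, 45, 98, 38, 93]

hours_sorted, score_sorted = sort_by_hours(hours, score)
print(hours_sorted)
print(score_sorted)
```

Passing hours_sorted and score_sorted to plt.plot in place of hours and score should leave every other line of the answer unchanged.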

For the given dataset mtcars.csv (www.kaggle.com/ruiromanini/mtcars), plot a histogram to check
the frequency distribution of the variable 'mpg' (miles per gallon).

import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset (adjust the path if mtcars.csv is not in the working directory)
mtcars = pd.read_csv('mtcars.csv')

# Plot the histogram of mpg using 10 equal-width bins
plt.hist(mtcars['mpg'], bins=10, color='skyblue', edgecolor='black')

# Add axis labels and a title
plt.xlabel('Miles per gallon (mpg)')
plt.ylabel('Frequency')
plt.title('Histogram of Miles per gallon (mpg)')

# Display the plot
plt.show()
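
Note: bins=10 splits the range from the smallest to the largest mpg value into ten equal-width intervals and counts the values that fall in each. The sketch below repeats that counting in plain Python so the bar heights can be checked by hand. The function name bin_counts and the sample values are made up for this illustration; they are not taken from mtcars.csv, and the function assumes at least two distinct values.

```python
def bin_counts(values, bins=10):
    """Count values in equal-width bins between min(values) and max(values).

    Hypothetical helper. Each bin includes its left edge and excludes its
    right edge, except the last bin, which also includes the maximum.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("need at least two distinct values")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would index one past the end, so clamp it
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# Made-up sample values, not the real mpg column
sample_mpg = [10.0, 15.0, 20.0, 20.0, 25.0, 30.0]
counts, edges = bin_counts(sample_mpg, bins=4)
print(counts)  # [1, 1, 2, 2]
print(edges)   # [10.0, 15.0, 20.0, 25.0, 30.0]
```

NumPy's np.histogram follows the same edge rule, so on real data it is the more practical choice for inspecting counts and bin edges.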

MODULE-2

Consider the books dataset BL-Flickr-Images-Book.csv from Kaggle
(https://www.kaggle.com/adeyoyintemidayo/publication-of-books), which contains information
about books.

Write a program to demonstrate the following.

• Import the data into a DataFrame.
• Find and drop the columns which are irrelevant for the book information.
• Change the index of the DataFrame.
• Tidy up fields in the data, such as date of publication, with the help of a simple regular
  expression.
• Combine str methods with NumPy to clean columns.

import pandas as pd
import numpy as np

# Import the data into a DataFrame
df = pd.read_csv('BL-Flickr-Images-Book.csv')

# Display the first few rows of the DataFrame
print("Original DataFrame:")
print(df.head())

# Find and drop the columns which are irrelevant for the book information
irrelevant_columns = ['Edition Statement', 'Corporate Author', 'Corporate Contributors',
                      'Former owner', 'Engraver', 'Contributors', 'Issuance type', 'Shelfmarks']
df.drop(columns=irrelevant_columns, inplace=True)

# Change the index of the DataFrame to the 'Identifier' column
df.set_index('Identifier', inplace=True)

# Tidy up the date of publication with a simple regular expression:
# keep only a four-digit year at the start of the field
df['Date of Publication'] = df['Date of Publication'].str.extract(r'^(\d{4})', expand=False)

# Combine str methods with NumPy to clean the place of publication:
# any value containing 'London' becomes 'London'; otherwise hyphens become spaces.
# na=False keeps missing values from being treated as matches by np.where.
place = df['Place of Publication']
df['Place of Publication'] = np.where(place.str.contains('London', na=False),
                                      'London',
                                      place.str.replace('-', ' ', regex=False))

# Display the cleaned DataFrame
print("\nCleaned DataFrame:")
print(df.head())
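
Note: the two cleaning steps can be tried without downloading the CSV. The sketch below applies the same regular expression and the same np.where pattern to a tiny DataFrame. The sample rows are made up for this illustration and are not copied from the dataset; the 'Year' and 'Place' column names and the pd.to_numeric conversion are extra assumptions, not part of the lab answer.

```python
import numpy as np
import pandas as pd

# Made-up sample rows that imitate messy publication fields
sample = pd.DataFrame({
    'Date of Publication': ['1879 [1878]', '1868', '[1897?]', '1839, 38-54'],
    'Place of Publication': ['London', 'London; Virtue & Yorston',
                             'Oxford', 'Newcastle-upon-Tyne'],
})

# Keep a four-digit year only when the field starts with one;
# anything else (such as '[1897?]') becomes a missing value
years = sample['Date of Publication'].str.extract(r'^(\d{4})', expand=False)

# Optional extra step: turn the year strings into numbers
sample['Year'] = pd.to_numeric(years)

# Values containing 'London' collapse to 'London';
# all other values get their hyphens replaced by spaces
place = sample['Place of Publication']
sample['Place'] = np.where(place.str.contains('London', na=False),
                           'London',
                           place.str.replace('-', ' ', regex=False))

print(sample[['Year', 'Place']])
```

Writing the results to new columns, rather than overwriting the originals as the answer does, makes it easy to compare the raw and cleaned values side by side.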
