TP1 - Machine Learning h

Uploaded by

Anouar Belabbes

Module: Machine Learning 1
Academic year: 2024/2025

Lab 1: Review of the core modules NumPy, Pandas, and Matplotlib for
successful data preparation

Objective of this lab: get started with the most widely used Python libraries in the preprocessing
phase of machine learning projects.

Note: Before starting, open a new Google Colab or Jupyter notebook, and import the
datasets: adults.csv, amazon.csv and apple.csv

Make sure you run all practice examples and exercises on your own machine.

 Getting started with the NumPy library


 Example:
 Exercise : Finding the mean
The following is the grade data of ten students.

Names = ['Jevon', 'Dawn', 'Kayleigh', 'Jadene', 'Kennedy', 'Kaydee', 'Ansh',
         'Flynn', 'Kier', 'Clarence']

Math_grades = [80, 50, 60, 70, 60, 100, 70, 70, 60, 70]

Science_grades = [90, 80, 50, 50, 60, 50, 90, 70, 80, 80]

History_grades = [60, 90, 50, 90, 100, 100, 100, 100, 90, 70]

1 - Write code using NumPy that calculates and reports each student's average grade.

2 - Produce a better-looking report in the format (e.g. Average grade of student Jevon
is: … )
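
One possible approach to this exercise, offered as a minimal sketch rather than the expected solution: stack the three grade lists into a single NumPy array and average across subjects. The lowercase variable names and the two-decimal formatting are choices made here, not requirements of the lab.

```python
import numpy as np

names = ['Jevon', 'Dawn', 'Kayleigh', 'Jadene', 'Kennedy', 'Kaydee', 'Ansh',
         'Flynn', 'Kier', 'Clarence']
math_grades = [80, 50, 60, 70, 60, 100, 70, 70, 60, 70]
science_grades = [90, 80, 50, 50, 60, 50, 90, 70, 80, 80]
history_grades = [60, 90, 50, 90, 100, 100, 100, 100, 90, 70]

# Stack the three subject lists into a 3 x 10 array (one row per subject).
grades = np.array([math_grades, science_grades, history_grades])

# Averaging along axis 0 collapses the subjects, leaving one mean per student.
averages = grades.mean(axis=0)

for name, avg in zip(names, averages):
    print(f"Average grade of student {name} is: {avg:.2f}")
```

Using `axis=1` instead would give the average per subject, which is a useful check that the array is oriented as intended.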

 The Pandas library: loading a dataset and going through its rows and columns

 Example 1: We start by importing the pandas library and loading our dataset as follows:
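
The original screenshot for this example is not reproduced here, so the following is only a sketch of what loading a CSV with pandas typically looks like. The inline three-row CSV and its column names are hypothetical stand-ins so that the snippet runs without the lab files; `adult_df` is the variable name used later in the exercise.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for the lab's CSV file, so the sketch runs anywhere.
sample_csv = io.StringIO(
    "age,education,sex,income\n"
    "39,Bachelors,Male,<=50K\n"
    "50,Bachelors,Male,<=50K\n"
    "38,HS-grad,Female,>50K\n"
)

# In the notebook, pass the path of the uploaded file instead of sample_csv.
adult_df = pd.read_csv(sample_csv)

print(adult_df.head())   # first rows (5 by default)
print(adult_df.shape)    # (number of rows, number of columns)
adult_df.info()          # column names, dtypes, non-null counts
```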
 Example 2: To go through the dataset, we use the .loc and .iloc indexers:
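
As a small illustration of the two indexers, here is a sketch on a made-up four-row frame (not the adult dataset). The index labels 10 to 13 are chosen on purpose so that labels and positions cannot be confused.

```python
import pandas as pd

# Hypothetical toy frame; the labels 10..13 deliberately differ from positions 0..3.
df = pd.DataFrame(
    {"age": [39, 50, 38, 53],
     "relationship": ["Not-in-family", "Husband", "Not-in-family", "Husband"],
     "race": ["White", "White", "White", "Black"],
     "sex": ["Male", "Male", "Male", "Male"]},
    index=[10, 11, 12, 13],
)

# .loc selects by label and includes BOTH ends of a slice.
by_label = df.loc[10:12, "relationship":"sex"]

# .iloc selects by integer position and excludes the stop position.
by_position = df.iloc[0:2, 1:3]

print(by_label)      # rows labelled 10, 11, 12; columns relationship, race, sex
print(by_position)   # rows at positions 0, 1; columns relationship, race
```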
 Exercise : the loc vs iloc functions
Use the adult.csv dataset and run the code shown in the following screenshots. Then answer
the questions.

a) Use the output to explain how .loc and .iloc differ in behavior
when it comes to slicing.

b) Without running the code, and only by looking at the data, what will be the output of
adult_df.loc['10000':'10003', 'relationship':'sex'].

c) Without running the code, and only by looking at the data, what will be the output of
adult_df.iloc[0:3, 7:9].
 Example 3 : Exploring the dataset further using Pandas and Matplotlib

 Pandas: using the groupby function


 Example1: using the groupby function
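
The original example is a screenshot that is not reproduced here. As a hedged sketch of the general pattern, `groupby` splits the rows by one or more key columns and then applies an aggregation to each group. The tiny frame below is invented for illustration.

```python
import pandas as pd

# Hypothetical miniature of the adult data; values are made up for illustration.
df = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female", "Male"],
    "income": ["<=50K", "<=50K", ">50K", ">50K", "<=50K"],
    "age": [30, 40, 50, 60, 20],
})

# One key: split rows by sex, then average the age within each group.
mean_age_by_sex = df.groupby("sex")["age"].mean()

# Several keys: the result is indexed by every (sex, income) combination present.
mean_age_by_sex_income = df.groupby(["sex", "income"])["age"].mean()

print(mean_age_by_sex)
print(mean_age_by_sex_income)
```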

 Exercise :
a- Write Python code that groups the adults by race, sex, and income, and computes the mean of the fnlwgt
feature for each group.
b- Calculate the mean and median of capitalLoss and capitalGain for every race in the
data.
c- Visualise the distribution of the capitalGain of the adults dataset.
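
For parts b and c, one option is `agg` with several function names, followed by a histogram. The sketch below uses invented rows; the column names follow the wording of the exercise and are assumed to match the real file. The `Agg` backend line is only there so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows; the column names follow the exercise wording.
df = pd.DataFrame({
    "race": ["White", "White", "Black", "Black", "White"],
    "capitalGain": [0, 2000, 0, 1000, 4000],
    "capitalLoss": [0, 0, 500, 0, 100],
})

# agg applies each listed function to each selected column, per group.
stats = df.groupby("race")[["capitalGain", "capitalLoss"]].agg(["mean", "median"])
print(stats)

# A histogram shows how the capitalGain values are distributed.
fig, ax = plt.subplots()
ax.hist(df["capitalGain"], bins=5)
ax.set_title("Distribution of capitalGain")
ax.set_xlabel("capitalGain")
ax.set_ylabel("Number of adults")
# In a notebook the figure renders automatically; in a script, call plt.show().
```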

 Exercise :
Given two datasets : Amazon Stock.csv and Apple Stock.csv :

a- Write Python code to load both datasets.
b- Display information about their columns and shape.
c- What does the shape display?
d- Can we specify the number of rows displayed by the head function? Is there a default?
e- Write Python code to display statistical information about the two datasets.
f- Display a list containing only the first two features of the Amazon stock dataset.
g- Build a boxplot of the closing price for Apple. What does this graph reflect?
h- Build a plot displaying the closing price for Amazon. What can you say about it?
i- Make a plot displaying the distributions of the closing price for both Apple and Amazon.
j- Make sure you add a title and a legend to the plot.
k- Calculate the mean and median values of the closing price.
l- Calculate the maximum dividends for both Apple and Amazon.
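
The steps of this exercise can be sketched as follows. The prices are made up, and the column names `Close` and `Dividends` are assumptions that may not match the real stock files; in the notebook, the two frames would come from `pd.read_csv` on the uploaded files instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values standing in for the two stock files; "Close" and "Dividends"
# are assumed column names and may differ in the real CSVs.
apple_df = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 11.0, 13.0],
                         "Dividends": [0.0, 0.2, 0.0, 0.0, 0.25]})
amazon_df = pd.DataFrame({"Close": [20.0, 19.0, 21.0, 22.0, 23.0],
                          "Dividends": [0.0, 0.0, 0.0, 0.0, 0.0]})

print(apple_df.shape)        # (rows, columns)
print(apple_df.head(3))      # head() shows 5 rows unless a count is given
print(apple_df.describe())   # count, mean, std, min, quartiles, max

# Summary statistics for the closing price and the largest dividend.
apple_mean = apple_df["Close"].mean()
apple_median = apple_df["Close"].median()
apple_max_dividend = apple_df["Dividends"].max()

# Left: boxplot of Apple's closing price. Right: both distributions overlaid.
fig, (ax_box, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))
ax_box.boxplot(apple_df["Close"].to_numpy())
ax_box.set_title("Apple closing price")
ax_hist.hist(apple_df["Close"], alpha=0.5, label="Apple")
ax_hist.hist(amazon_df["Close"], alpha=0.5, label="Amazon")
ax_hist.set_title("Closing price distributions")
ax_hist.legend()
```

The `alpha=0.5` transparency keeps both histograms visible where they overlap, and the `label` arguments are what `legend()` picks up.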
