
AIYA Pre-Requisites Session 3

Module 3 introduces Python libraries, focusing on NumPy and Pandas for data manipulation and machine learning. Students will learn to load and analyze CSV data using Pandas, and explore basic numerical operations with NumPy. The module also covers the fundamentals of machine learning with Scikit-Learn, including model creation and accuracy assessment.

Uploaded by

shahaarav315

AIYA (For High School Students)

Pre-Requisites
Module 3 – Introduction to Python Libraries

Python Libraries Overview

 Understanding the role of libraries in programming.


 Importance of libraries in data manipulation and machine learning tasks
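To make the role of a library concrete, the short sketch below computes an average twice: once with a hand-written loop and once with NumPy's np.mean. The list named ages is made up for this illustration and is not part of the course material.

```python
import numpy as np

# Hypothetical sample data, invented for this illustration.
ages = [25, 30, 22, 27]

# Without a library: write the loop yourself.
total = 0
for age in ages:
    total += age
manual_mean = total / len(ages)

# With a library: one call does the same work.
numpy_mean = float(np.mean(ages))

print(manual_mean, numpy_mean)
```

Both values are the same; the library version is shorter and is the form that scales to large datasets.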

Reading Material:

NumPy Tutorial (w3schools.com)

Pandas Tutorial (w3schools.com)

SciPy Tutorial (w3schools.com)

Assignment (Test your understanding):

- Load a simple CSV file containing some data with columns like 'Name', 'Age', 'Gender'.

Utilize Pandas to:


- Read the CSV file.
- Display the dataset.
- Extract specific information such as the mean age or the count of each gender category.
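One possible way to approach the steps above is sketched here. The CSV contents and the file name people.csv are assumptions made up for this example; an in-memory string stands in for the file so that the sketch runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for a real file on disk.
csv_text = """Name,Age,Gender
Alice,25,F
Bob,30,M
Charlie,22,M
Diana,27,F
"""

# pd.read_csv accepts a file path or a file-like object. With a real file
# this would be pd.read_csv("people.csv") (file name assumed).
df = pd.read_csv(io.StringIO(csv_text))

# Display the dataset.
print(df)

# Mean of the Age column.
mean_age = df["Age"].mean()

# Number of rows in each Gender category.
gender_counts = df["Gender"].value_counts()

print(mean_age)
print(gender_counts)
```

Try it with your own CSV file by replacing the io.StringIO(...) argument with the path to the file.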
Summary of Reading Material

What is NumPy?
 Introduction to NumPy for numerical computing in Python
 Creating and manipulating arrays using NumPy

Code                             | Explanation                                             | Output
import numpy as np               | Importing the NumPy library as np                       | NA
arr1 = np.array([1, 2, 3])       | Creating a NumPy array with the elements 1, 2 and 3     |
arr1                             |                                                         |
arr2 = np.zeros((3, 3))          | Creating a 3 x 3 NumPy array filled with 0s             |
arr2                             |                                                         |
arr3 = np.random.rand(3, 3)      | Creating a 3 x 3 NumPy array filled with random values  |
arr3                             |                                                         |
sum_result = np.sum(arr1)        | Summing the elements of arr1                            |
sum_result                       |                                                         |
dot_product = np.dot(arr2, arr3) | Computing the dot product of arr2 and arr3              |
dot_product                      |                                                         |
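The array operations in the table above can be tried together in one script. This is a minimal sketch rather than the course's own solution: the names zeros, rand and identity are chosen here for illustration, and the try/except part is an addition that shows what happens when the shapes passed to np.dot do not line up.

```python
import numpy as np

arr1 = np.array([1, 2, 3])        # 1-D array, shape (3,)
zeros = np.zeros((3, 3))          # 3 x 3 array filled with 0.0
rand = np.random.rand(3, 3)       # 3 x 3 array of random values in [0, 1)

print(arr1.shape, zeros.shape, rand.shape)

total = np.sum(arr1)              # adds up the elements: 1 + 2 + 3

# For 2-D arrays np.dot is matrix multiplication, so the number of columns
# of the first array must equal the number of rows of the second.
identity = np.eye(3)
product = np.dot(identity, rand)  # (3, 3) dot (3, 3) gives (3, 3)

# Mismatched shapes raise ValueError instead of returning a result.
try:
    np.dot(zeros, np.random.rand(2, 2))
    shapes_ok = True
except ValueError:
    shapes_ok = False

print(total, product.shape, shapes_ok)
```

Checking .shape before calling np.dot is a quick way to catch this kind of mistake.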
Pandas for Data Manipulation
Introduction to Pandas
 Overview of Pandas for data manipulation and analysis
 Working with DataFrames in Pandas

Code                                          | Explanation                                      | Output
import pandas as pd                           | Importing the pandas library as pd               |
data = {'Name': ['Alice', 'Bob', 'Charlie'],  | Creating a dictionary                            |
        'Age': [25, 30, 22],                  |                                                  |
        'City': ['New York', 'San Francisco', |                                                  |
                 'Los Angeles']}              |                                                  |
data                                          |                                                  |
df = pd.DataFrame(data)                       | Converting the dictionary to a DataFrame         |
df                                            |                                                  |
mean_age = df['Age'].mean()                   | Finding the mean of the column named 'Age'       |
filtered_data = df[df['Age'] > 25]            | Creating a new DataFrame with the filtered data  |
filtered_data                                 |                                                  |
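The filtering step is easier to follow when it is split in two. The sketch below reuses the same dictionary and shows that df['Age'] > 25 first produces a column of True/False values, which is then used to pick rows. The variable name mask is introduced here for illustration only.

```python
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
    "City": ["New York", "San Francisco", "Los Angeles"],
}
df = pd.DataFrame(data)

# Average of the Age column.
mean_age = df["Age"].mean()

# Step 1: compare every age with 25. The result is one True/False per row.
mask = df["Age"] > 25

# Step 2: keep only the rows where the mask is True.
filtered_data = df[mask]

print(mean_age)
print(mask.tolist())
print(filtered_data)
```

Because the comparison is strictly greater than 25, a row with an age of exactly 25 is not kept.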
Scikit-Learn for Machine Learning
Introduction to Scikit-Learn
 Overview of Scikit-Learn for machine learning in Python
 Basics of supervised learning and classification

Code                                                 | Explanation
from sklearn.model_selection import train_test_split | Importing the required libraries
from sklearn.tree import DecisionTreeClassifier      |
from sklearn.metrics import accuracy_score           |
from sklearn.datasets import make_classification     |
X, y = make_classification(n_samples=1000,           | Creating a dummy dataset with 1000 samples and 20 features, classified into 2 classes
    n_features=20, n_classes=2, random_state=42)     |
X_train, X_test, y_train, y_test = train_test_split( | Splitting the data into training and test sets (20% kept for testing)
    X, y, test_size=0.2, random_state=42)            |
model = DecisionTreeClassifier()                     | Creating a decision tree model
model.fit(X_train, y_train)                          | Training the model on the training data
predictions = model.predict(X_test)                  | Getting the model's predictions for the test data
accuracy = accuracy_score(y_test, predictions)       | Computing the accuracy of the model on the test data
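Put together as one script, the workflow might look like the sketch below. Two points are worth noting: a scikit-learn model has to be trained with fit before predict can be called, and passing random_state to the classifier is an assumption added here so that repeated runs give the same tree.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Dummy dataset: 1000 samples, 20 features, 2 classes.
X, y = make_classification(
    n_samples=1000, n_features=20, n_classes=2, random_state=42
)

# Keep 20% of the rows aside for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# random_state on the classifier is an added assumption for repeatable runs.
model = DecisionTreeClassifier(random_state=42)

# Train the model; calling predict on an unfitted model raises an error.
model.fit(X_train, y_train)

# Predict labels for the unseen test rows and compare with the true labels.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)

print(X_train.shape, X_test.shape, accuracy)
```

The accuracy is the fraction of test rows whose predicted label matches the true label, so it always lies between 0 and 1.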
