2. Python for Data Science
Learning Objectives
UNDERSTANDING THE FUNDAMENTALS OF PYTHON AND ITS VERSATILITY.
PROPERTIES OF LIST AND TUPLE. VARIOUS OPERATIONS PERFORMED ON BOTH LIST AND TUPLE.
# Example of a script
import yagmail
user = yagmail.SMTP(user='[email protected]', password='zpsgowuwqlansmfv')
user.send(to='[email protected]', subject='Sales_Reports',
          contents='This is test mail for python automation',
          attachments='Sales.xlsx')
From the above piece of code it is easy to guess that:
Line 1 imports the yagmail library.
Line 2 sets the sender mail id '[email protected]' and the password 'zpsgowuwqlansmfv'.
Line 3 sends the email to '[email protected]' with the subject Sales_Reports, the body text of the email, and the file Sales.xlsx as an attachment.
ChatGPT Prompt
Explain how the code <Paste code here> works in Python.
Introduction – Use case scenarios
Data analytics
Office automation
Web scraping
Machine learning
Share market (data analytics)
Application development
Web applications (Django and Flask)
Go to https://www.python.org/downloads/
Download the installer that matches your operating system and the Python version you need.
Pip in Python
pip is the package installer for Python. The name is a recursive acronym, usually read as
"Pip Installs Packages" (or "Pip Installs Python"). It is a tool used to install and manage
Python packages (libraries or modules) that are not part of the Python standard library.
Example:
pip install package_name
pip install Django
pip install pandas
- IDLE - https://docs.python.org/3/library/idle.html
COLAB - GOOGLE
Refer to this URL to use Colab notebooks from Google:
https://colab.google/
Variables: Variables are containers for storing data values. Python has no command for
declaring a variable. A variable is created the moment you first assign a value to it.
Example: x = 5
y = "John"
print(x)
print(y)
For more on variable naming, see:
https://www.w3schools.com/python/python_variables_names.asp
Introduction - Python Comments and Variables
Exercise:
Comments – (INT18)
Variables – (INT19)
Introduction - Python Operators
Python Operators: Operators are used to perform operations on variables and values.
Python divides the operators in the following groups:
Arithmetic operators
Assignment operators
Comparison operators
The common convention is to alias NumPy as `np`.
Logical operators
Identity operators
Membership operators
Bitwise operators
Introduction - Python Operators
Arithmetic Operators: Arithmetic operators are used with numeric values to perform
common mathematical operations.
Operator   Name             Example
+          Addition         x + y
-          Subtraction      x - y
*          Multiplication   x * y
/          Division         x / y
%          Modulus          x % y
**         Exponentiation   x ** y
//         Floor division   x // y
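The table above can be tried out with a short sketch. The values of x and y are assumptions chosen only for illustration.

```python
# Two sample operands (hypothetical values chosen for illustration)
x = 17
y = 5

results = {
    "x + y": x + y,    # addition
    "x - y": x - y,    # subtraction
    "x * y": x * y,    # multiplication
    "x / y": x / y,    # division, always returns a float
    "x % y": x % y,    # modulus: remainder after division
    "x ** y": x ** y,  # exponentiation
    "x // y": x // y,  # floor division: quotient rounded down
}

for expression, value in results.items():
    print(expression, "=", value)
```

Note that / gives 3.4 here while // gives 3, which is the difference between division and floor division.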
Introduction - Python Operators
Assignment Operators: Assignment operators are used to assign values to variables:
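A minimal sketch of the common assignment operators follows. The starting value and the list named history are assumptions used only to record each step.

```python
history = []      # hypothetical helper list that records x after each step

x = 10            # plain assignment
history.append(x)
x += 5            # same as x = x + 5
history.append(x)
x -= 3            # same as x = x - 3
history.append(x)
x *= 2            # same as x = x * 2
history.append(x)
x //= 5           # same as x = x // 5
history.append(x)
x **= 2           # same as x = x ** 2
history.append(x)
x %= 5            # same as x = x % 5
history.append(x)
x /= 2            # same as x = x / 2 (result is a float)
history.append(x)

print(history)
```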
Exercise
Arithmetic operators - (INT20)
Comparison operators - (INT20)
Logical operators - (INT20)
Membership operators - (INT20)
Introduction – String operations
String: Python uses string operations to work with strings. Strings in python are surrounded by
either single quotation marks, or double quotation marks.
Function Name   Description
capitalize()    Converts the first character of the string to a capital (uppercase) letter
count()         Returns the number of occurrences of a substring in the string
index()         Returns the position of the first occurrence of a substring in a string
isalnum()       Checks whether all the characters in a given string are alphanumeric
isalpha()       Returns True if all characters in the string are alphabetic
isdecimal()     Returns True if all characters in a string are decimal characters
isdigit()       Returns True if all characters in the string are digits
islower()       Checks if all characters in the string are lowercase
isnumeric()     Returns True if all characters in the string are numeric characters
join()          Returns a string concatenated from the items of an iterable
lower()         Converts all uppercase characters in a string into lowercase
replace()       Replaces all occurrences of a substring with another substring
startswith()    Returns True if a string starts with the given prefix
strip()         Returns the string with leading and trailing characters (whitespace by default) removed
swapcase()      Converts all uppercase characters to lowercase and vice versa
title()         Converts the string to title case
upper()         Converts all lowercase characters in a string into uppercase
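A few of the string functions listed above, shown as a short sketch. The sample text is an assumption chosen for illustration.

```python
text = "  hello world  "           # hypothetical sample string

clean = text.strip()               # remove leading and trailing spaces
print(clean.capitalize())          # first character in uppercase
print(clean.title())               # each word starts in uppercase
print(clean.upper())               # all characters in uppercase
print(clean.swapcase())            # swap lowercase and uppercase

o_count = clean.count("o")         # how many times "o" occurs
world_at = clean.index("world")    # position of the first occurrence
replaced = clean.replace("world", "Python")
joined = "-".join(["2024", "01", "15"])   # join items with a separator

print(o_count, world_at, replaced, joined)
print(clean.startswith("hello"), clean.islower())
print("abc123".isalnum(), "abc".isalpha(), "123".isdigit())
```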
Introduction – String operations
Escape character: To insert characters that are illegal in a string, use an escape character. An escape
character is a backslash \ followed by the character you want to insert.
Code    Result
\'      Single Quote
\\      Backslash
\n      New Line
\r      Carriage Return
\t      Tab
\b      Backspace
\f      Form Feed
\ooo    Octal value
\xhh    Hex value
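The escape codes in the table can be checked with a small sketch. All sample strings are assumptions for illustration.

```python
single = 'It\'s ready'             # \' keeps the quote inside a single-quoted string
path = "C:\\reports\\sales"        # each \\ produces one backslash
two_lines = "Line one\nLine two"   # \n starts a new line
columns = "Name\tAmount"           # \t inserts a tab
octal_hi = "\110\151"              # octal values for the letters H and i
hex_hi = "\x48\x69"                # hex values for the letters H and i

for value in (single, path, two_lines, columns, octal_hi, hex_hi):
    print(value)
```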
Introduction – String operations
Exercise:
Print a string – (INT01)
Print a string with a variable – (INT02)
Multiline Strings – (INT03)
String position – (INT04)
String slicing – (INT05)
JSON vs. Dictionary

JSON: JSON keys can only be strings.
Dictionary: The dictionary's keys can be any hashable object.

JSON: The keys in JSON are ordered sequentially and can be repeated.
Dictionary: The keys in the dictionary cannot be repeated and must be distinct.

JSON: The keys in JSON have a default value of undefined.
Dictionary: There is no default value in dictionaries.

JSON: The values in a JSON file are accessed by using the "." (dot) or "[]" operator.
Dictionary: The subscript operator is used to access the values in the dictionary. For example, if dict = {'A': '123R', 'B': '678S'}, we can retrieve the related data by simply calling dict['A'].

JSON: For the string object, we must use double quotation marks.
Dictionary: For string objects, we can use either single or double quotation marks.

JSON: In JSON, the return object type is a 'string' object type.
Dictionary: The 'dict' object type is the return object type in a dictionary.
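Some of the differences above can be seen with Python's built-in json module. This is a minimal sketch; the record contents are assumptions for illustration.

```python
import json

# A dictionary may use any hashable key, here both strings and an integer
record = {"A": "123R", "B": "678S", 1: "numeric key"}
print(record["A"])                 # subscript access on a dictionary

encoded = json.dumps(record)       # dictionary -> JSON text
print(type(encoded).__name__)      # the JSON form is a str
print(encoded)                     # keys and strings use double quotes

decoded = json.loads(encoded)      # JSON text -> dictionary
print(type(decoded).__name__)      # back to dict
print(list(decoded.keys()))        # the integer key came back as a string

# A repeated key in JSON text: the resulting dictionary keeps only one value
duplicated = json.loads('{"A": "first", "A": "second"}')
print(duplicated)
```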
Introduction - Python Functions
Functions: A function is a block of code which only runs when it is called.
You can pass data, known as parameters, into a function.
A function can return data as a result.
In Python a function is defined using the def keyword:
def my_function():
Arguments in a Function
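The sketch below shows the usual ways of passing arguments: a basic function, a function with operators, default parameters, and a variable number of arguments. All function names and values are hypothetical.

```python
def greet(name):
    # One required argument; the function returns a result
    return "Hello, " + name


def add(a, b):
    # Two positional arguments combined with an operator
    return a + b


def power(base, exponent=2):
    # exponent has a default value, so it may be left out
    return base ** exponent


def total(*amounts):
    # *amounts collects any number of positional arguments into a tuple
    return sum(amounts)


print(greet("Asha"))
print(add(2, 3))
print(power(3), power(2, 5))
print(total(10, 20, 30))
```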
Introduction - Python Functions
Exercise:
Function Basic – (INT21)
Function with Operators - (INT22)
Functions with default parameters - (INT23)
Functions with Variable Number of Arguments - (INT24)
Regex
Regex: A RegEx, or Regular Expression, is a sequence of characters that
forms a search pattern. RegEx can be used to check if a string contains
the specified search pattern.
RegEx Functions:
Function   Description
findall    Returns a list containing all matches
search     Returns a Match object if there is a match anywhere in the string
split      Returns a list where the string has been split at each match
sub        Replaces one or many matches with a string
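The four functions can be sketched on one sample sentence using the built-in re module. The sentence is an assumption for illustration.

```python
import re

text = "The rain in Spain"                 # hypothetical sample text

all_matches = re.findall("ai", text)       # list of every match
first_space = re.search(r"\s", text)       # Match object for the first white space
words = re.split(r"\s", text)              # split at each white space
replaced = re.sub(r"\s", "9", text)        # replace each white space with "9"

print(all_matches)
print(first_space.start())
print(words)
print(replaced)
```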
Regex
Meta Characters:
Character   Description                                                                  Example
[]          A set of characters                                                          "[a-m]"
\           Signals a special sequence (can also be used to escape special characters)   "\d"
.           Any character (except newline character)                                     "he..o"
^           Starts with                                                                  "^hello"
$           Ends with                                                                    "planet$"
*           Zero or more occurrences                                                     "he.*o"
+           One or more occurrences                                                      "he.+o"
?           Zero or one occurrences                                                      "he.?o"
{}          Exactly the specified number of occurrences                                  "he.{2}o"
|           Either or                                                                    "falls|stays"
()          Capture and group
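A short sketch applying some of the metacharacters above. The sample texts, including the invoice number, are assumptions for illustration.

```python
import re

text = "hello planet"                                   # hypothetical sample text

in_set = re.findall("[a-m]", text)                      # characters between a and m
starts = re.search("^hello", text) is not None          # starts with "hello"
ends = re.search("planet$", text) is not None           # ends with "planet"
two_any = re.findall("he..o", text)                     # "he", any two characters, "o"
exact = re.findall("he.{2}o", text)                     # same idea using {2}
either = re.findall("falls|stays", "The rain falls")    # either "falls" or "stays"

# () captures the parts of a match as groups
parts = re.search(r"(\d+)-(\d+)", "Invoice 2024-0042").groups()

print(in_set, starts, ends, two_any, exact, either, parts)
```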
Regex
Special Sequences:
Character   Description                                                                                                    Example
\A          Returns a match if the specified characters are at the beginning of the string                                 "\AThe"
\b          Returns a match where the specified characters are at the beginning or at the end of a word                    r"\bain", r"ain\b"
            (the "r" in the beginning makes sure that the string is treated as a "raw string")
\B          Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word  r"\Bain", r"ain\B"
            (the "r" in the beginning makes sure that the string is treated as a "raw string")
\d          Returns a match where the string contains digits (numbers from 0-9)                                            "\d"
\D          Returns a match where the string DOES NOT contain digits                                                       "\D"
\s          Returns a match where the string contains a white space character                                              "\s"
\S          Returns a match where the string DOES NOT contain a white space character                                      "\S"
\w          Returns a match where the string contains any word characters                                                  "\w"
            (characters from a to Z, digits from 0-9, and the underscore _ character)
\W          Returns a match where the string DOES NOT contain any word characters                                          "\W"
\Z          Returns a match if the specified characters are at the end of the string                                       "Spain\Z"
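The special sequences can be tried on a sample sentence. This is a sketch only: the text is an assumption, and raw strings (the r prefix) are used throughout so the backslashes reach the re module unchanged.

```python
import re

text = "The rain in Spain 2024"                 # hypothetical sample text

at_start = re.findall(r"\AThe", text)           # "The" at the beginning of the string
word_start = re.findall(r"\bain", text)         # "ain" at the beginning of a word
word_end = re.findall(r"ain\b", text)           # "ain" at the end of a word
inside_word = re.findall(r"\Bain", text)        # "ain" NOT at the beginning of a word
digits = re.findall(r"\d", text)                # every digit
spaces = re.findall(r"\s", text)                # every white space character
words = re.findall(r"\w+", text)                # runs of word characters
at_end = re.findall(r"2024\Z", text)            # "2024" at the end of the string

print(at_start, word_start, word_end, inside_word)
print(digits, len(spaces), words, at_end)
```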
Regex
Sets:
Set       Description
[arn]     Returns a match where one of the specified characters (a, r, or n) is present
[a-n]     Returns a match for any lower case character, alphabetically between a and n
[^arn]    Returns a match for any character EXCEPT a, r, and n
[0123]    Returns a match where any of the specified digits (0, 1, 2, or 3) are present
[0-9]     Returns a match for any digit between 0 and 9
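A brief sketch of the sets above, run against a sample sentence that is an assumption for illustration.

```python
import re

text = "The rain in Spain at 8:30"          # hypothetical sample text

any_of_arn = re.findall("[arn]", text)      # a, r, or n
not_arn = re.findall("[^arn]", text)        # every character except a, r, and n
some_digits = re.findall("[0123]", text)    # only the digits 0, 1, 2, 3
all_digits = re.findall("[0-9]", text)      # any digit

print(any_of_arn)
print(len(not_arn))
print(some_digits)
print(all_digits)
```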
The datetime module comes built into Python, so there is no need to install it
separately.
Examples
Date – 01
Date – 02
Date – 03
For further reading:
https://www.w3schools.com/python/python_datetime.asp
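A minimal sketch of common datetime operations. The dates and variable names are assumptions for illustration, not the contents of the Date exercises.

```python
from datetime import datetime, timedelta

invoice_date = datetime(2024, 1, 15)                  # hypothetical invoice date

formatted = invoice_date.strftime("%d-%m-%Y")         # date -> text
due_date = invoice_date + timedelta(days=30)          # add 30 days
year_end = datetime.strptime("2024-03-31", "%Y-%m-%d")   # text -> date
days_left = (year_end - invoice_date).days            # difference in whole days

print(invoice_date.year, invoice_date.month, invoice_date.day)
print(formatted)
print(due_date.strftime("%d-%m-%Y"))
print(days_left)
```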
Regex – AI
Use any AI tool like https://chat.openai.com
Prompt
Explain the regex pattern attached below:
pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'
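Before asking an AI tool, the pattern can also be tried directly. The sketch below assumes a sample sentence with made-up addresses on example domains.

```python
import re

# The pattern from the prompt above, written on a single line
pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'

# Hypothetical text containing two e-mail addresses
text = "Contact sales@example.com or support.team@example.co.in for help."

emails = re.findall(pattern, text)
print(emails)
```

One detail worth raising in the prompt: inside square brackets the | character is matched literally, so [A-Z|a-z] also accepts a "|" in the last part of the address.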
Python If and Else
Python supports the usual logical conditions from mathematics:
Equals: a == b
Not Equals: a != b
Less than: a < b
Less than or equal to: a <= b
Greater than: a > b
Greater than or equal to: a >= b
These conditions can be used in several ways, most commonly in "if statements"
and loops. Python relies on indentation (whitespace at the beginning of a line) to
define scope in the code.
Python If and Else
An "if statement" is written by using the if keyword.
Example:
a = 55
b = 400
if b > a:
    print("b is greater than a")
The elif keyword is Python's way of saying "if the previous conditions were not true, then try this condition".
The not keyword reverses the result of a conditional statement:
Example:
a = 33
b = 200
if not a > b:
    print("a is NOT greater than b")
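The keywords if, elif, else, and, or, and not can be combined as in the sketch below. The function names and thresholds are hypothetical.

```python
def compare(a, b):
    # Exactly one branch runs: if, then elif, then else
    if a > b:
        return "a is greater than b"
    elif a == b:
        return "a and b are equal"
    else:
        return "b is greater than a"


def in_range(value):
    # and: both conditions must be true
    return value >= 10 and value <= 20


def out_of_range(value):
    # or: at least one condition must be true
    return value < 10 or value > 20


print(compare(55, 400))
print(compare(5, 5))
print(in_range(15), out_of_range(15), not in_range(15))
```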
Python If and Else
Nested If - You can have if statements inside if statements, this is called nested if
statements.
Example:
x = 41
if x > 10:
    print("Above ten,")
    if x > 20:
        print("and also above 20!")
    else:
        print("but not above 20.")
Python If and Else
Exercises:
Basic if statement (IFEL01)
Indentation in if (code to explain error) (IFEL02)
Elif (IFEL03)
Else (IFEL04)
IF And (IFEL05)
IF Or (IFEL06)
IF Not (IFEL07)
Nested IF (IFEL08)
Python Loop – For and While
Loop: Python has two primitive loop commands:
while loops
for loops
The while loop: With the while loop we can execute a set of statements as long as a
condition is true.
Example:
i = 1
while i < 6:
    print(i)
    i += 1
The for loop: A for loop is used for iterating over a sequence (that is either a list, a tuple, a
dictionary, a set, or a string).
Example:
fruits = ["apple", "banana", "cherry"]
for x in fruits:
    print(x)
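The same two loops also work on other sequence types, as sketched below. The sample data and variable names are assumptions for illustration.

```python
collected = []

for letter in "abc":            # a string yields its characters
    collected.append(letter)

for number in (1, 2):           # a tuple yields its items
    collected.append(number)

prices = {"apple": 30, "banana": 10}
for name in prices:             # a dictionary yields its keys
    collected.append(name)

# while loop: keep adding until the condition becomes false
total = 0
i = 1
while i < 6:
    total += i
    i += 1

print(collected)
print(total)
```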
Python Loop – For and While
Exercises:
While loop (FW01)
For loop (FW02)
File operations
“r” – Read
“w” – Write
“a” – Append
“x” – Create
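The four modes can be sketched with the built-in open function. The file name notes.txt and the temporary folder are assumptions so that the example does not touch real files.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                 # temporary folder for the demo
path = os.path.join(folder, "notes.txt")    # hypothetical file name

with open(path, "x") as f:                  # "x": create, fails if the file exists
    f.write("first line\n")

with open(path, "a") as f:                  # "a": append to the end of the file
    f.write("second line\n")

with open(path, "r") as f:                  # "r": read the contents
    content = f.read()

with open(path, "w") as f:                  # "w": write, replacing existing contents
    f.write("replaced\n")

with open(path, "r") as f:
    after_write = f.read()

try:
    open(path, "x")                         # the file already exists now
    create_failed = False
except FileExistsError:
    create_failed = True

print(content)
print(after_write)
print(create_failed)
```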
Exercises:
“r” – FAR 01
“w” – FAR 02
“a” – FAR 03
“x” – FAR 04
DATA ANALYSIS FOR PYTHON
Learning Objectives
TO UNDERSTAND THE IMPORTANCE OF PYTHON LIBRARIES IN DATA ANALYSIS.
Reshape arrays into different dimensions using np.reshape or the reshape method.
Pandas - Data Analysis
Pandas is a Python library used for working with data sets.
The name "Pandas" refers to both "Panel Data" and "Python Data Analysis".
The library was created by Wes McKinney in 2008.
Pandas - Data Analysis - Contents
Data Structures
- Series
- Data Frame
Data Alignment
Label Based Indexing
Data Cleaning
Data Aggregation
Data Merging and Joining
Data Visualisation Integration
Pandas - Data Analysis
Examples – Creating and Loading Dataframe
Creating Data Frame
- From Dictionary
Loading Data to Dataframe
- From External Data Sources
- CSV
- JSON
- XML
- Excel
- Database (Tally / Access) using SQL
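A minimal sketch of creating a DataFrame from a dictionary and loading one from CSV and JSON text. The names, ages, and in-memory text are assumptions; with real files, a file path would be passed instead of io.StringIO.

```python
import io

import pandas as pd

# Creating a DataFrame from a dictionary (hypothetical data)
df = pd.DataFrame({"Name": ["Asha", "Ravi", "Meena"], "Age": [28, 35, 24]})

# Loading from CSV: StringIO stands in for a file on disk
csv_text = "Name,Age\nAsha,28\nRavi,35\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# Loading from JSON: a list of records becomes one row per record
json_text = '[{"Name": "Asha", "Age": 28}, {"Name": "Ravi", "Age": 35}]'
df_json = pd.read_json(io.StringIO(json_text))

print(df)
print(df_csv)
print(df_json)
```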
Pandas - Data Analysis - Viewing Data
Examples - Viewing Data
df.head()
df.tail()
df.shape
df.info()
df.describe()
df.sample(n)
These methods are invaluable for getting an initial sense of your data's structure
and content.
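The viewing methods can be tried on a small DataFrame. This is a sketch; the data is an assumption for illustration.

```python
import pandas as pd

# Hypothetical data set with six rows
df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "John", "Sara", "Vik"],
    "Age": [28, 35, 24, 41, 30, 22],
})

first_rows = df.head(2)                  # first two rows
last_rows = df.tail(2)                   # last two rows
rows, columns = df.shape                 # (number of rows, number of columns)
summary = df.describe()                  # statistics for numeric columns
random_rows = df.sample(n=2, random_state=0)   # two randomly chosen rows

df.info()                                # prints column names, types, and non-null counts
print(first_rows)
print(last_rows)
print(rows, columns)
print(summary)
print(random_rows)
```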
Pandas - Data Analysis - Indexing and Selecting Data
Examples - Indexing and Selecting Data
name_column = df['Name']
subset = df[['Name', 'Age']]
young_people = df[df['Age'] < 30]
Note: Loading data was already discussed under Creating and Loading Dataframe
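The three selections can be run on a small DataFrame as sketched below. The data and the City column are assumptions for illustration; the last line adds label-based selection with df.loc.

```python
import pandas as pd

# Hypothetical data set
df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "John"],
    "Age": [28, 35, 24, 41],
    "City": ["Chennai", "Pune", "Delhi", "Kochi"],
})

name_column = df["Name"]                 # one column, returned as a Series
subset = df[["Name", "Age"]]             # several columns, returned as a DataFrame
young_people = df[df["Age"] < 30]        # only the rows where the condition is true

# Label-based selection: rows by condition, column by name
young_names = df.loc[df["Age"] < 30, "Name"]

print(name_column)
print(subset)
print(young_people)
print(young_names)
```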
Data Preprocessing Steps
DATA REDUCTION
Dimensionality Reduction
Principal Component Analysis (PCA)
Feature Selection
Recursive Feature Elimination (RFE)
DATA IMBALANCE HANDLING
Oversampling
Undersampling
Synthetic Data Generation (SMOTE)
Pandas – Extracting data from different data sources
Practical Approach
Module Case Study - 1
Approach - 1
Use a pandas DataFrame to read the JSON file and then write it to Excel
Approach – 2
Use the openpyxl library to read the JSON parts and write them to Excel directly
Pandas – Extracting data from different data sources
Practical Approach
Module Case Study - 2
Approach - 1
Use the xml.etree.ElementTree module
https://docs.python.org/3/library/xml.etree.elementtree.html
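A minimal sketch of reading XML with the ElementTree module. The XML layout, tag names, and values are assumptions for illustration and are not the case study file.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML content; a real file would be read with ET.parse(path)
xml_text = """<ledgers>
  <ledger name="Cash"><closing>1500.50</closing></ledger>
  <ledger name="Bank"><closing>-250.00</closing></ledger>
</ledgers>"""

root = ET.fromstring(xml_text)           # parse the text into an element tree

rows = []
for ledger in root.findall("ledger"):    # every <ledger> directly under the root
    rows.append({
        "name": ledger.get("name"),                      # attribute value
        "closing": float(ledger.findtext("closing")),    # text of the child element
    })

print(rows)
```

A list of dictionaries like rows can then be handed to a DataFrame for further analysis.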
Pandas – Extracting data from different data sources
Practical Approach
Module Case Study - 3
Students may use the Excel files provided and consolidate them into a single file.
Approach:
Get the Ledger Master data from Tally using an SQL query.
Query
Select $Name, $Parent, $_PrimaryGroup, $OpeningBalance, $_ClosingBalance
from Ledger
Libraries used
pyodbc
DATA VISUALIZATION WITH PYTHON
Learning Objectives
INDEPENDENT FEATURES
DEPENDENT FEATURES
TYPES OF ML ALGORITHMS
SUPERVISED ALGORITHM
UNSUPERVISED ALGORITHM
REINFORCEMENT LEARNING