BASIC THINGS THAT WE SHOULD KNOW
Program: A set of instructions. It is also called an
application or software.
A program processes input, manipulates data, and outputs a
result.
Programming: It is a process of creating a set of instructions
that tell a computer how to perform a task.
Programming language: A set of rules that converts
strings (groups of characters), or graphical program elements in
visual languages, into instructions a computer can execute.
TRANSLATORS in Programming Languages:-
A translator converts program code (for example, high-level language code) to machine code.
Machine code consists of 0’s and 1’s (binary bits).
Note:
‘0’ indicates the absence of an electrical pulse.
‘1’ indicates the presence of an electrical pulse.
There are three types of translators:
1. Assembler: It converts assembly language code to machine code.
Assembly language consists of mnemonic codes, which are short
abbreviations, typically 3 to 5 letters long.
For example: ADD, MUL, DIV, SUB, START, STOP, LABEL, etc.
2. Compiler: It converts the entire source code into machine code in a single step.
Examples: C, C++, Java, .NET (Java and .NET compilers produce intermediate bytecode rather than native machine code).
3. Interpreter: It converts source code to machine code line by line.
Examples: Python, JavaScript, Ruby.
INTRODUCTION TO PYTHON
Python is a well-known programming language. It was created by Guido van
Rossum, and released in 1991.
It was designed with an emphasis on code readability, and its syntax allows
programmers to express their concepts in fewer lines of code.
Python is a programming language that lets you work quickly and integrate
systems more effectively.
Compared with languages such as C++ or Java, the same logic usually takes
fewer lines of Python code.
Python supports multiple programming paradigms, including object-oriented,
functional, and procedural programming.
Built-in functions exist for most frequently used operations.
Its philosophy is “Simplicity is the best”.
Why is Python useful for programmers?
The Python programming language is one of the most
accessible programming languages available because its
syntax is simple and uncomplicated and reads close to
natural language.
Because it is easy to learn, Python code can be written quickly.
Development is typically faster than in many other languages, although Python programs often run slower than equivalent compiled code.
It is open source and has a large standard library.
What are the uses of Python?
1. AI
2. Machine learning
3. Data Analytics
4. Data visualisation
5. Web development
6. Game development
7. Programming application
8. Language development
9. Search engine optimisation
10. Finance
LANGUAGE FEATURES
Interpreted
There are no separate compilation and execution steps as in C and C++.
The program runs directly from the source code.
Internally, Python compiles the source code into an intermediate form
called bytecode, which the Python virtual machine (PVM) then executes
on the specific computer.
There is no need to worry about linking and loading libraries, etc.
Platform Independent
Python programs can be developed and executed on multiple
operating system platforms.
Python can be used on Linux, Windows, Macintosh, Solaris and many
more.
Free and Open Source :- Redistributable
High-level Language
In Python, there is no need to take care of low-level details
such as managing the memory used by the program.
Robust:
Exception handling features
Built-in memory management
Rich Library Support
The Python Standard Library is vast.
This is known as the “batteries included” philosophy of Python.
It can help do various things involving regular expressions,
documentation generation, unit testing, threading,
databases, web browsers, email, HTML, WAV files, GUI
and many more.
Besides the standard library, there are various other high-quality
libraries, such as the Python Imaging Library (maintained today
as the Pillow fork), a simple image manipulation library.
Difference Between Java and Python
JAVA                                              PYTHON
1. Fast, secure, general-purpose                  1. Readable, efficient, and powerful high-level
   programming language                              programming language
2. Statically typed                               2. Dynamically typed
3. Longer code compared to Python                 3. Shorter code compared to Java
4. JDBC is popular and widely used for            4. Database access layers are weaker than
   database connectivity                             Java's JDBC
5. Popular for mobile and web applications        5. Popular for data science, ML, AI, and IoT
6. Compiled language                              6. Interpreted language
7. Object-oriented language                       7. Scripting language that also supports
                                                     object-oriented programming
INTERPRETER VS COMPILER
Interpreter                                   Compiler
Translates the program one statement at a     Scans the entire program and translates it
time.                                         as a whole into machine code.
Usually takes less time to analyze the        Usually takes more time to analyze the
source code. However, the overall             source code. However, the overall
execution time is comparatively slower        execution time is comparatively faster
than with compilers.                          than with interpreters.
Generates no object code, hence is            Generates object code, which further
memory efficient.                             requires linking, hence requires more
                                              memory.
Languages such as JavaScript, Python, and     Languages such as C, C++, and Java
Ruby use interpreters.                        use compilers.
Statically Typed vs Dynamically typed
Basic things about Python
Python Indentation:- Indentation refers to the spaces at the
beginning of a code line.
Whereas in other programming languages indentation
is for readability only, in Python it is part of the
syntax.
Python uses indentation to indicate a block of code.
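As a small illustration of how indentation defines a block, the sketch below uses a made-up variable (temperature) and a list that records which lines ran. The four-space indent is the usual convention, not a requirement of this document.

```python
# Indented lines belong to the if block; the unindented line does not.
temperature = 30  # hypothetical sample value
messages = []

if temperature > 25:
    messages.append("It is warm")     # inside the block
    messages.append("Drink water")    # still inside the block
messages.append("Done")               # outside the block: always runs

print(messages)
```

Removing the indentation from the two lines under the if would raise an IndentationError instead of running.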
Python Comments:-
Comments start with # and can be used to explain Python code and
make it more readable.
They can also be used to prevent a line from executing when testing code.
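The three uses of comments described above might look like this in practice. The variable names and values are invented for the example.

```python
# A full-line comment: Python ignores everything after the # sign.
price = 100          # an inline comment that explains a statement
# price = 0          # a "commented out" line: this assignment never runs
total = price * 2

print(total)
```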
Python keywords :- They are special reserved words that have specific
meanings and purposes and can't be used for anything but those specific
purposes.
True, False, None, and, as, assert,
def, class, continue, break, else,
finally, elif, del, except, global, for,
if, from, import, raise, try, or, return,
pass, nonlocal, in, not, is, lambda,
while, with, yield, async, await
Python Variables:-Variables are containers for storing data values.
Python has no command for declaring a variable.
A variable is created the moment you first assign a value to it.
Python allows you to assign values to multiple variables in one
line.
a = "python online free class"
x, y, z = "Python", "online free", "class"
x = y = z = "cat"
The Python print() function is often used to
output variables.
Rules for writing a variable
A variable can have a short name (like x and y) or a more
descriptive name (age, carname, total_volume).
Rules for Python variables:
A variable name must start with a letter or the underscore character.
A variable name cannot start with a number.
A variable name can only contain alphanumeric characters and
underscores (A-Z, a-z, 0-9, and _).
A variable name cannot be a Python keyword.
Variable names are case-sensitive
(age, Age and AGE are three different variables).
Examples:
MARKS    - valid      Marks    - valid      marks    - valid
MaRkS    - valid      _marks   - valid      marks_1  - valid
1marks   - invalid    $marks   - invalid    stu name - invalid
stu_name - valid      pass     - invalid (keyword)   passe - valid
There are 3 types of variable assignment:
1. A single value can be assigned to a single variable.
Example:
id = 123
print(id)
output > 123
name = "Arunpal"
2. A single value can be assigned to multiple variables.
Example:
a = b = c = 45
print(a)
print(b)
print(c)
output > 45 (printed three times)
3. Multiple values can be assigned to multiple variables.
Example:
>>> a, b, c = 11, 12, 13
Data Types:
Text Type: str
Numeric Types: int, float, complex
Sequence Types: list, tuple, range
Mapping Type: dict
Set Types: set, frozenset
Boolean Type: bool
None Type: NoneType
Numeric type:-
There are three numeric types in Python:
int, float, complex
x = 1 # int
y = 2.8 # float
z = 1j # complex
String type
Strings in Python are surrounded by either single quotation
marks or double quotation marks.
A string can be a single-line string or a multi-line string
(a multi-line string is surrounded by triple quotes).
Anything inside the quotation marks, including digits, is
treated as text.
Type:- ‘str’
Ex: a = "Arunpal"
b = """WELCOME TO Besant Technologies FREE
PYTHON CLASSES
@ Indiranagar BANGALORE"""
LIST
Lists are used to store multiple items in a single variable.
Lists are created using square brackets.
Lists allow duplicates.
Ex: a = []  # empty list
b = [1, 1.2, "python class", [1, 2, 3]]
Tuples
Tuples are used to store multiple items in a single variable.
Tuples are written with round brackets.
Ex: a = ()  # empty tuple
b = (1, 1.2, "python class")
Dictionary
Dictionaries are used to store data values in key:value pairs.
A dictionary is a collection which is ordered*, changeable, and does not
allow duplicate keys.
* As of Python 3.7, dictionaries preserve insertion order.
Ex: a = {}  # empty dict
b = {"name": "Rocky",
     "movie": ["KGF 1", "KGF 2"]
}
Python Casting:-
Specify a Variable Type
Casting in Python is done using constructor functions:
int():- Constructs an integer number from an integer literal, a float literal (by
removing all decimals), or a string literal (provided the string represents a whole
number)
float():- Constructs a float number from an integer literal, a float literal or a string literal
(provided the string represents a float or an integer)
str():- Constructs a string from a wide variety of data types, including strings, integer
literals and float literals
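The casting rules above can be checked with a short sketch. The sample values are arbitrary, and the final try/except shows the one case the rules exclude: a string that does not represent a whole number cannot be passed to int() directly.

```python
# int() drops the fractional part of a float and parses whole-number strings.
a = int(2.8)
b = int("3")

# float() accepts integers and numeric strings.
c = float(1)
d = float("4.2")

# str() turns numbers into text.
e = str(3.0)

print(a, b, c, d, e)

# "2.8" is not a whole number, so int() rejects it.
try:
    int("2.8")
except ValueError as error:
    print("conversion failed:", error)
```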
Operators
Arithmetic Operators
Assignment Operators
Comparison Operators
Logical Operators
Membership Operators
Bitwise Operators
Identity Operators
Arithmetic operators are used to perform mathematical operations like
addition, subtraction, multiplication and division.
There are 7 arithmetic operators in Python :
1. Addition
2. Subtraction
3. Multiplication
4. Division
5. Modulus
6. Exponentiation
7. Floor division
Operator   Syntax    Description
+          x + y     Addition: adds two operands
-          x - y     Subtraction: subtracts two operands
*          x * y     Multiplication: multiplies two operands
/          x / y     Division (float): divides the first operand by the second
//         x // y    Division (floor): divides the first operand by the second
%          x % y     Modulus: returns the remainder when the first operand is divided by the second
**         x ** y    Power: returns the first operand raised to the power of the second
Examples:-
val1 = 2
val2 = 3
# using the addition operator
res = val1 + val2
print(res)  # 5
# using the multiplication operator
res = val1 * val2
print(res)  # 6
# using the subtraction operator
res = val1 - val2
print(res)  # -1
# using the division operator
res = val1 / val2
print(res)  # 0.6666666666666666
# using the modulus operator
res = val1 % val2
print(res)  # 2
# using the exponentiation operator
res = val1 ** val2
print(res)  # 8
# using the floor division
res = val1 // val2
print(res)  # 0
Assignment operators are used to assign values to variables.
Operator   Syntax      Description
=          x = y + z   Assign: assigns the value of the right-side expression to the left-side operand
+=         a += b      Add and assign: adds the right operand to the left operand and assigns the result to the left operand
-=         a -= b      Subtract and assign: subtracts the right operand from the left operand and assigns the result to the left operand
*=         a *= b      Multiply and assign: multiplies the left operand by the right operand and assigns the result to the left operand
/=         a /= b      Divide and assign: divides the left operand by the right operand and assigns the result to the left operand
%=         a %= b      Modulus and assign: takes the modulus of the left and right operands and assigns the result to the left operand
//=        a //= b     Floor-divide and assign: divides the left operand by the right operand and assigns the floored value to the left operand
**=        a **= b     Exponent and assign: raises the left operand to the power of the right operand and assigns the result to the left operand
&=         a &= b      Performs bitwise AND on the operands and assigns the result to the left operand
|=         a |= b      Performs bitwise OR on the operands and assigns the result to the left operand
# Examples of Assignment Operators
a = 10
# Assign value
b=a
print(b)
# Add and assign value
b += a
print(b)
# Subtract and assign value
b -= a
print(b)
# multiply and assign
b *= a
print(b)
Comparison operators are used for comparing values. They return either True or False
according to the condition.
Operator   Syntax    Description
>          x > y     Greater than: True if the left operand is greater than the right
<          x < y     Less than: True if the left operand is less than the right
==         x == y    Equal to: True if both operands are equal
!=         x != y    Not equal to: True if the operands are not equal
>=         x >= y    Greater than or equal to: True if the left operand is greater than or equal to the right
<=         x <= y    Less than or equal to: True if the left operand is less than or equal to the right
# Examples of Relational Operators
a = 13
b = 33
# a > b is False
print(a > b)
# a < b is True
print(a < b)
# a == b is False
print(a == b)
# a != b is True
print(a != b)
# a >= b is False
print(a >= b)
# a <= b is True
print(a <= b)
Logical operators are used on conditional statements (either True or False). They
perform Logical AND, Logical OR and Logical NOT operations.
Operator   Syntax     Description
and        x and y    Logical AND: True if both the operands are true
or         x or y     Logical OR: True if either of the operands is true
not        not x      Logical NOT: True if the operand is false
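A minimal sketch of the three logical operators in use. The variables (age, has_ticket) and the thresholds are hypothetical values chosen only to make each result easy to follow.

```python
age = 20             # hypothetical sample values
has_ticket = False

can_enter = age >= 18 and has_ticket     # True only if both sides are true
gets_discount = age < 12 or age > 60     # True if at least one side is true
needs_ticket = not has_ticket            # inverts the value

print(can_enter, gets_discount, needs_ticket)
```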
Membership Operators
in and not in are the membership operators; they are used to test whether a value or variable is in a
sequence.
in        True if the value is found in the sequence
not in    True if the value is not found in the sequence
# Python program to illustrate
# the 'in' and 'not in' operators
x = 24
y = 20
numbers = [10, 20, 30, 40, 50]  # named numbers so the built-in list is not shadowed
if x not in numbers:
    print("x is NOT present in given list")
else:
    print("x is present in given list")
if y in numbers:
    print("y is present in given list")
else:
    print("y is NOT present in given list")
Bitwise operators
In Python, bitwise operators are used to perform bitwise
calculations on integers. The integers are first converted into
binary and then the operations are performed bit by bit, hence
the name bitwise operators. The result is then returned in
decimal format.
Operator   Syntax    Description
&          x & y     Bitwise AND
|          x | y     Bitwise OR
~          ~x        Bitwise NOT
^          x ^ y     Bitwise XOR
Bitwise AND operator: Returns 1 if both
the bits are 1, else 0.
Bitwise OR operator: Returns 1 if either of
the bits is 1, else 0.
Bitwise NOT operator: Returns the one’s
complement of the number.
Bitwise XOR operator: Returns 1 if one of
the bits is 1 and the other is 0, else returns
0.
a = 10 = 1010 (Binary)
b = 4 = 0100 (Binary)
a & b = 1010 & 0100
= 0000
= 0 (Decimal)
a = 10 = 1010 (Binary)
b = 4 = 0100 (Binary)
a | b = 1010 | 0100
= 1110
= 14 (Decimal)
a = 10 = 1010 (Binary)
~a = ~1010
= -(1010 + 1)
= -(1011)
= -11 (Decimal)
a = 10 = 1010 (Binary)
b = 4 = 0100 (Binary)
a ^ b = 1010 ^ 0100
= 1110
= 14 (Decimal)
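The worked binary examples above can be reproduced directly in Python. This sketch reuses the same values (a = 10, b = 4) and uses the built-in bin() only to display the binary form.

```python
a = 10   # 0b1010
b = 4    # 0b0100

print(bin(a), bin(b))

and_result = a & b    # 1010 & 0100
or_result = a | b     # 1010 | 0100
not_result = ~a       # equals -(a + 1)
xor_result = a ^ b    # 1010 ^ 0100

print(and_result, or_result, not_result, xor_result)
```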
Identity Operators
is and is not are the identity operators. Both are used to check
whether two names refer to the same object in memory. Two
variables that are equal are not necessarily identical.
is :- True if the operands are identical
is not :- True if the operands are not identical
a = 10
b = 20
c = a
print(a is not b)  # True
print(a is c)      # True
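The difference between equal and identical is easiest to see with lists. In this sketch (variable names are illustrative), two separately created lists hold the same values but are different objects, while an alias refers to the very same object.

```python
first = [1, 2, 3]
second = [1, 2, 3]    # a separate list with the same contents
alias = first         # another name for the same list object

print(first == second)    # compares values
print(first is second)    # compares object identity
print(first is alias)
```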
If statements
The if statement is the simplest decision-making statement. It is used to
decide whether a certain statement or block of statements will be executed:
if a certain condition is true, the block of statements is executed;
otherwise it is not.
Syntax:-
if condition:
    # Statements to execute if
    # condition is true
If-else Statements
The if statement alone tells us that if a condition is true it will execute
a block of statements and if the condition is false it won’t. But what if
we want to do something else when the condition is false? Here comes
the else statement. We can use the else statement with the if statement to
execute a block of code when the condition is false.
Syntax:-
if (condition):
    # Executes this block if
    # condition is true
else:
    # Executes this block if
    # condition is false
if-elif-else ladder
The conditions are evaluated from the top down. As soon as one of the
conditions is true, the block associated with it is
executed, and the rest of the ladder is bypassed. If none of the conditions is
true, then the final else block is executed.
Syntax:
if condition1:
    # Block of code to execute if condition1 is true
elif condition2:
    # Block of code to execute if condition2 is true
else:
    # Block of code to execute if all conditions are false
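A concrete ladder might look like the sketch below. The marks value and the grade thresholds are assumptions made up for the example. Note that 65 also satisfies the later condition (marks >= 40), but only the first matching branch runs.

```python
marks = 65   # hypothetical sample value

if marks >= 80:
    grade = "A"
elif marks >= 60:
    grade = "B"      # first true condition: the rest of the ladder is skipped
elif marks >= 40:
    grade = "C"
else:
    grade = "F"

print(grade)
```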
Nested-if
A nested if is an if statement placed inside
another if statement. Python allows if statements to be
nested within other if statements.
Syntax:
if (condition1):
    # Executes when condition1 is true
    if (condition2):
        # Executes when condition2 is true
    # inner if block ends here
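As a sketch of nesting, the inner if below is only reached when the outer condition holds. The number and the result strings are invented for illustration.

```python
number = 12   # hypothetical sample value

if number > 0:
    # This inner if is evaluated only for positive numbers.
    if number % 2 == 0:
        result = "positive and even"
    else:
        result = "positive and odd"
else:
    result = "zero or negative"

print(result)
```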
While Loops
With the while loop we can execute a set of statements
as long as a condition is true.
Write a program to print from 1 to 10 using a while loop:
i = 1
while i < 11:
    print(i)
    i += 1  # remember to increment i, or else the loop will continue forever
Break Statement with while loop
With the break statement we can stop the loop even if the
while condition is true.
Write a program to find the first number divisible by 3
between 25 and 35.
i = 25
while i < 36:
    print(i)
    if i % 3 == 0:
        break  # stops at 27, the first multiple of 3
    i += 1
Continue Statement with while loop
With the continue statement we can stop the current iteration
and continue with the next:
Example
Write a program to print the numbers 1 to 10 except 3.
i = 0
while i < 10:
    i += 1
    if i == 3:
        continue
    print(i)
Else Statement with while loop
With the else statement we can run a block of code once when
the condition is no longer true:
Example
Print a message once the condition is false:
i = 1
while i < 11:
    print(i)
    i += 1
else:
    print("i is no longer less than 11")
For loop
A for loop is used for iterating over a sequence (that is
either a list, a tuple, a dictionary, a set, or a string).
With the for loop we can execute a set of statements
once for each item in a list, tuple, set, etc.
Examples:-
cars = ["Kia", "Thar", "Audi"]
for x in cars:
    print(x)
The for loop does not require an indexing variable to
be set beforehand.
range() Function
To loop through a set of code a specified number of
times, we can use the range() function.
The range() function returns a sequence of numbers,
starting from 0 by default, incrementing by 1 (by
default), and ending before a specified number.
for x in range(6):
    print(x)
Note that range(6) is not the values 0 to 6, but the
values 0 to 5.
The range() function defaults to 0 as a starting value; however, it
is possible to specify the starting value by adding a
parameter: range(2, 6), which means values from 2 to 6 (but not
including 6):
for x in range(2, 6):
    print(x)
The range() function defaults to incrementing the sequence by 1;
however, it is possible to specify the increment value by adding a
third parameter: range(3, 31, 3):
for x in range(3, 31, 3):
    print(x)
Else in For Loop
The else keyword in a for loop specifies a block of code to be executed
when the loop is finished:
Example
Print all numbers from 0 to 5, and print a message when the loop has ended:
for x in range(6):
    print(x)
else:
    print("Finally finished!")
Note: The else block will NOT be executed if the loop is stopped by
a break statement.
Example
Break the loop when x is 3; the else block is skipped:
for x in range(6):
    if x == 3:
        break
    print(x)
else:
    print("Finally finished!")
Python Nested for Loop
months = ["jan", "feb", "mar"]
days = ["sun", "mon", "tue"]
for x in months:
    for y in days:
        print(x, y)
print("Good bye!")
Functions
A Python function is a block of organized, reusable code that is
used to perform a single, related action.
In Python a function is defined using the def keyword.
Functions provide better modularity for your application and a
high degree of code reuse.
A function is a block of code which only runs when it is called.
You can pass data, known as parameters, into a function.
A function can return data as a result.
There are two types of functions:
BUILT-IN FUNCTIONS: These functions are part of the Python language.
There are roughly 70 built-in functions; the exact count varies by Python version.
Example:
print(), dir(), abs(), len(), ...
USER-DEFINED FUNCTIONS:
These functions are written by the user according to their requirements.
Advantages:
code reusability
reducing duplication of code
decomposing complex problems into smaller pieces.
abstraction
Syntax:
def <function_name>(parameters):
    statements
    …………………..
    return statement
‘def’ is a keyword used to define a function.
The ‘def’ keyword is followed by a function name.
A function can be defined with or without parameters.
The return statement is optional.
The return statement is used to return a value to the caller.
A function can contain several return statements, but only one of them runs per call.
A single return statement can return multiple values (as a tuple).
Function calling:
A function must be defined before it is called;
otherwise the Python interpreter raises an error.
After the function is created, we can call it directly or from
another function.
Example:
def hello():  # function definition
    print("today we are discussing functions concept")
hello()  # function calling
Function Arguments
The behavior of a function often depends on data provided
to it when it is called. While defining a function, you must give a
list of variables in which the data passed to it is collected. The
variables in the parentheses are called formal arguments.
When the function is called, a value for each of the formal arguments
must be provided. Those values are called actual arguments.
From a function's perspective:
A parameter is the variable listed inside the parentheses in the
function definition.
An argument is the value that is sent to the function when it is
called.
Types of Python Function Arguments
Positional or Required Arguments
Keyword Arguments
Default Arguments
Arbitrary or Variable-length Arguments
Positional or Required Arguments
Required arguments are the arguments passed to a function in
correct positional order. Here, the number of arguments in the
function call should match exactly with the function definition.
def add_numbers(x, y):
    print(x, y)
    return x + y
# Calling the function with positional arguments
result = add_numbers(3, 5)
print(result)  # Output: 8
Keyword Arguments
Keyword arguments are used in function calls.
When you use keyword arguments in a function call, the
caller identifies the arguments by the parameter name, so their order does not matter.
def greet(name, greeting):
    return f"{greeting}, {name}!"
# Calling the function with keyword arguments
print(greet(name="Alice", greeting="Hi"))  # Output: Hi, Alice!
Default Arguments
A default argument is an argument that assumes a default
value if a value is not provided in the function call for that
argument.
def greet(name, greeting="Hello"):
    return f"{greeting}, {name}!"
# Calling the function without specifying the greeting
print(greet("Alice"))  # Output: Hello, Alice!
Arbitrary or Variable-length Arguments
*args: variable-length positional arguments, collected into a tuple.
def my_function(*args):
    for i in args:
        print(i)
my_function(1, 2, 3, 4)
**kwargs: variable-length keyword arguments, collected into a dictionary.
def my_function(**kwargs):
    for key, value in kwargs.items():
        print(f"{key}: {value}")
my_function(name="Alice", age=30, city="New York")
Local variables
Local variables are defined within a function and are
only accessible within that function's scope.
They are created when the function is called and
destroyed once the function execution completes.
Local variables cannot be accessed from outside the
function in which they are defined.
If you try to access a local variable outside its scope,
you'll encounter a NameError.
Global Variables:
Global variables are defined outside of any function and
can be read from any part of the code.
They have a global scope, meaning they are visible to all
functions in the same module.
If a function wants to assign a new value to a global variable, it must
declare it using the global keyword within the function.
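The scope rules above can be sketched as follows. The names (counter, increment, make_label, label) are hypothetical, and the code is assumed to run at module level, for example as a script.

```python
counter = 0                  # global variable

def increment():
    global counter           # required because the function assigns to counter
    counter += 1

def make_label():
    label = "local text"     # local variable: exists only during the call
    return label

increment()
increment()
print(counter)
print(make_label())

# label was local to make_label(), so it is not visible here.
try:
    print(label)
except NameError as error:
    print("NameError:", error)
```

Without the global declaration, the line counter += 1 would raise an UnboundLocalError, because Python would treat counter as a local variable.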
Python built-in functions
abs() - Returns the absolute value of a number
all() - Returns True if all items in an iterable object are true
any() - Returns True if any item in an iterable object is true
bin() - Returns the binary version of a number
bool() - Returns the boolean value of the specified object
chr() - Returns the character for the specified Unicode code point
complex() - Returns a complex number
dict() - Returns a dictionary
divmod() - Returns the quotient and the remainder
when argument1 is divided by argument2
enumerate() - Takes a collection (e.g. a tuple) and returns it as an enumerate object
filter() - Uses a filter function to exclude items from an iterable object
float() - Returns a floating point number
format() - Formats a specified value
frozenset() - Returns a frozenset object
help() - Executes the built-in help system
hex() - Converts a number into a hexadecimal value
input() - Reads user input
int() - Returns an integer number
len() - Returns the length of an object
list() - Returns a list
map() - Returns an iterator that applies the
specified function to each item
max() - Returns the largest item in an iterable
min() - Returns the smallest item in an iterable
next() - Returns the next item from an iterator
oct() - Converts a number into an octal value
ord() - Returns the integer Unicode code point
of the specified character
pow() - Returns the value of x to the power of y
print() - Prints to the standard output device
range() - Returns a sequence of numbers, starting from
0 and incrementing by 1 (by default)
reversed() - Returns a reversed iterator
round() - Rounds a number
set() - Returns a new set object
slice() - Returns a slice object
sorted() - Returns a sorted list
sum() - Sums the items of an iterable
Lambda function
A lambda function is a small anonymous function.
A lambda function can take any number of
arguments, but can only have one expression.
Syntax:
lambda arguments: expression
Why Use Lambda Functions?
The power of lambda is better shown when you use
it as an anonymous function inside another function.
Say you have a function definition that takes one
argument, and that argument will be multiplied by an
unknown number:
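One way to sketch that idea: a function that takes the multiplier and returns a lambda that applies it. The names make_multiplier, doubler, and tripler are invented for this example.

```python
def make_multiplier(n):
    # Returns an anonymous function that multiplies its argument by n.
    return lambda a: a * n

doubler = make_multiplier(2)
tripler = make_multiplier(3)

print(doubler(11))
print(tripler(11))

# A lambda can take several arguments but holds only one expression.
add = lambda x, y: x + y
print(add(5, 6))
```

The same function definition produces both a doubler and a tripler, which is the main reason to return a lambda rather than write each function by hand.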
Uses of Sequence data-type functions
List Methods
Python has a set of built-in methods that you can use on
lists.
Method      Description
append()    Adds an element at the end of the list
clear()     Removes all the elements from the list
copy()      Returns a copy of the list
count()     Returns the number of elements with the specified value
extend()    Adds the elements of a list (or any iterable) to the end of the current list
index()     Returns the index of the first element with the specified value
insert()    Adds an element at the specified position
pop()       Removes the element at the specified position
remove()    Removes the item with the specified value
reverse()   Reverses the order of the list
sort()      Sorts the list
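A short walk through several of the list methods in the table. The fruit names are arbitrary sample data.

```python
fruits = ["apple", "banana"]

fruits.append("cherry")             # add at the end
fruits.insert(1, "mango")           # add at index 1
fruits.extend(["kiwi", "apple"])    # add every item of another iterable
print(fruits)

apple_count = fruits.count("apple")
banana_index = fruits.index("banana")
print(apple_count, banana_index)

last = fruits.pop()                 # removes and returns the last item
fruits.remove("mango")              # removes the first matching value
fruits.sort()                       # sorts in place
fruits.reverse()                    # reverses in place
print(last, fruits)
```

Note that sort() and reverse() change the list itself and return None, so their result should not be assigned back to the variable.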
Tuple Methods
Python has two built-in methods that you can use on
tuples.
Method     Description
count()    Returns the number of times a specified value occurs in a tuple
index()    Searches the tuple for a specified value and returns the position where it was found
Dictionary Methods
Python has a set of built-in methods that you can use on
dictionaries.
Method      Description
clear()     Removes all the elements from the dictionary
copy()      Returns a copy of the dictionary
get()       Returns the value of the specified key
items()     Returns a view object containing a (key, value) tuple for each pair
keys()      Returns a view object containing the dictionary's keys
pop()       Removes the element with the specified key and returns its value
popitem()   Removes and returns the last inserted key-value pair
update()    Updates the dictionary with the specified key-value pairs
values()    Returns a view object containing all the values in the dictionary
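The dictionary methods above could be exercised as in this sketch. The student record is made-up sample data, and list() is used only to display the view objects returned by keys() and values().

```python
student = {"name": "Rocky", "age": 21}   # hypothetical sample data

print(student.get("name"))
print(student.get("city", "unknown"))    # default value when the key is missing

student.update({"city": "Bangalore", "age": 22})
key_list = list(student.keys())
value_list = list(student.values())
print(key_list, value_list)

for key, value in student.items():
    print(key, "->", value)

removed_age = student.pop("age")         # removes the key, returns its value
last_pair = student.popitem()            # removes the last inserted pair
print(removed_age, last_pair, student)
```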
String Methods
Python has a set of built-in methods that you can use on
strings.
Note: All string methods return new values. They do not
change the original string.
Method          Description
capitalize()    Converts the first character to upper case
endswith()      Returns True if the string ends with the specified value
count()         Returns the number of times a specified value occurs in a string
format()        Formats specified values in a string
index()         Searches the string for a specified value and returns the position where it was found
isalnum()       Returns True if all characters in the string are alphanumeric
isalpha()       Returns True if all characters in the string are in the alphabet
isdigit()       Returns True if all characters in the string are digits
isidentifier()  Returns True if the string is a valid identifier
islower()       Returns True if all characters in the string are lower case
istitle()       Returns True if the string follows the rules of a title
isupper()       Returns True if all characters in the string are upper case
join()          Joins the elements of an iterable into one string, using the string as the separator
lower()         Converts a string into lower case
lstrip()        Returns a left-trimmed version of the string
replace()       Returns a string where a specified value is replaced with another specified value
split()         Splits the string at the specified separator and returns a list
splitlines()    Splits the string at line breaks and returns a list
startswith()    Returns True if the string starts with the specified value
strip()         Returns a trimmed version of the string
swapcase()      Swaps cases: lower case becomes upper case and vice versa
title()         Converts the first character of each word to upper case
upper()         Converts a string into upper case
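A few of the string methods in action, using an arbitrary sample sentence. Every call returns a new value, so the original string is unchanged at the end.

```python
text = "  hello python world  "

trimmed = text.strip()                  # remove leading and trailing spaces
capitalized = trimmed.capitalize()
titled = trimmed.title()
words = trimmed.split()                 # split on whitespace
joined = "-".join(words)                # the separator string joins the items
replaced = trimmed.replace("python", "Java")

print(trimmed, capitalized, titled, sep=" | ")
print(words, joined, replaced, sep=" | ")
print(trimmed.startswith("hello"), trimmed.endswith("x"), trimmed.count("o"))
print("abc123".isalnum(), "abc".isalpha(), "123".isdigit())
print(repr(text))                       # the original string is untouched
```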
DataType Mutable Or Immutable?
Boolean (bool) Immutable
Integer (int) Immutable
Float Immutable
String (str) Immutable
tuple Immutable
frozenset Immutable
list Mutable
set Mutable
dict Mutable
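The table above can be illustrated with one mutable and two immutable types. The values are arbitrary, and the try/except shows what happens when code attempts to change an immutable tuple.

```python
numbers = [1, 2, 3]
numbers[0] = 99              # lists are mutable: item assignment works

point = (1, 2, 3)
try:
    point[0] = 99            # tuples are immutable: this raises TypeError
except TypeError as error:
    print("TypeError:", error)

name = "python"
upper_name = name.upper()    # strings are immutable: a new string is returned

print(numbers, point, name, upper_name)
```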
Set methods
add(): Adds an element to the set. If the element is already present, it is not added
again.
clear(): Removes all the elements from the set, making it an empty set.
copy(): Returns a shallow copy of the set.
difference(): Returns a new set containing elements that are present in the first set
but not in the second set(s).
difference_update(): Updates the set, removing elements that are present in the other
specified set(s).
discard(): Removes the specified element from the set if it is present.
intersection(): Returns a new set containing elements that are common to both
sets.
intersection_update(): Updates the set, keeping only the
elements that are also present in the other specified set(s).
isdisjoint(): Returns True if the sets have no elements in
common, otherwise False. Sets are disjoint if their intersection
is an empty set.
issubset(): Returns True if all elements of the set are present in
the specified set, otherwise False.
issuperset(): Returns True if the set contains all elements of
the specified set, otherwise False.
pop(): Removes and returns an arbitrary element from the set.
If the set is empty, it raises a KeyError.
remove(): Removes the specified element from the set. If the
element is not present, it raises a KeyError.
symmetric_difference(): Returns a new set containing elements
that are present in either the first set or the second set, but not
in both.
symmetric_difference_update(): Updates the set,
keeping only the elements that are present in either the
first set or the second set, but not in both.
union(): Returns a new set containing all the unique
elements from both sets.
update(): Updates the set by adding elements from
one or more other sets. Each element of every set
provided is added to the current set.
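A compact sketch of the set methods listed above, using two small example sets. sorted() is used only so the printed order is stable, since sets themselves are unordered.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

# Methods that return a new set.
print(sorted(a.union(b)))
print(sorted(a.intersection(b)))
print(sorted(a.difference(b)))
print(sorted(a.symmetric_difference(b)))

# Relationship checks return booleans.
print({1, 2}.issubset(a), a.issuperset({1, 2}), a.isdisjoint({9, 10}))

# In-place updates change the set itself.
a.add(4)            # already present, so nothing changes
a.discard(100)      # absent element: no error
a.update({7, 8})
try:
    a.remove(100)   # absent element: raises KeyError
except KeyError:
    print("100 is not in the set")

print(sorted(a))
```

The difference between discard() and remove() matters in practice: discard() is the safer choice when the element may not be present.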
List comprehension
List comprehension is a concise and elegant way of
creating lists in Python. It allows you to create a new list
by applying an expression to each item in an iterable
(such as a list, tuple, or range) and optionally filtering the
items based on a condition.
new_list = [expression for item in iterable if condition]
squares = [x ** 2 for x in range(10)]
print(squares)
How To Read Multiple Values From a Single Input?
By using split()
# values separated by spaces, e.g. 1 2 3
nums = [int(v) for v in input("Enter multiple values: ").split()]
print("The list of numbers is: ", nums)
# values separated by commas, e.g. 1,2,3
nums = [int(v) for v in input("Enter multiple values: ").split(",")]
print("The list of numbers is: ", nums)
Dictionary comprehension
Dictionary comprehension is similar to list comprehension
but is used to create dictionaries instead of lists. It provides a
concise and readable way to create dictionaries by applying
an expression to each item in an iterable and optionally
filtering the items based on a condition.
new_dict = {key_expression: value_expression for item in
iterable if condition}
squares_dict = {x: x ** 2 for x in range(5)}
print(squares_dict)
Formatting strings
f-strings are string literals that have an f at the beginning and curly braces
containing expressions that will be replaced with their values.
The string format() method handles string formatting in a similar way: each {}
placeholder in the string is replaced by the corresponding argument.
When we want a generalized print statement, we write the message once and
fill in the values with formatting, instead of rewriting the print statement every time.
Formatting ways:-
name=input("Enter your name: ")
age=int(input("What is your age? "))
# using format function
print("My Name is {} and I am {} years old".format(name,age))
print("Hello, I am {} years old".format(age))
message="My name is {}"
print(message.format(name))
# using f-string
print(f"Hello, My name is {name} and I'm {age} years old.")
Formatting ways:-
print("{} loves {}!!".format("Everyone",
"Python"))
print("Python is {1} {0} {2} {3}"
.format("independent", "platform", "programming",
"language"))
print("Every {3} should have {0} in order to {1} {2}"
.format("PVM", "execute", "Python", "Operating Systems"))
String slicing
# There are two ways to slice the string
mystring = "Rajnikanta"
# Using the slice constructor; slice is a built-in class
s1 = slice(3)
s2 = slice(1, 5, 2)
s3 = slice(-1, -8, -2)
# s4=slice(1,6,-2) # valid, but selects nothing: a negative step needs start > stop
# s4=slice(-2,-7,2) # valid, but selects nothing: a positive step needs start < stop
s4=slice(-8,6)
# s5=slice(7,-8) # selects nothing with the default step of 1; a negative step is required
s5=slice(7,-8,-1)
s6=slice(3,-3)
s7=slice(6,-7,-2)
s8=slice(6,-7,-1)
s9=slice(7,2,-2)
s10=slice(-7) # same as [:-7]: from index 0 up to, but not including, index -7
Note: apply each slice to the string, e.g. print(mystring[s1]), to see the output
print("String slicing without slice class")
print(mystring[:3]) # or mystring[0:3]
print(mystring[1:5:2])
print(mystring[-1:-8:-2])
print(mystring[-8:6])
print(mystring[7:-8:-1])
print(mystring[3:-3])
print(mystring[6:-7:-2])
print(mystring[6:-7:-1])
print(mystring[7:2:-2])
print(mystring[:-7]) # 0 to -8
print(mystring[:]) # copy of the string
print(mystring[4:])
print(mystring[:-1])
print(mystring[::-1])
Module
A module is a simple Python file with a .py extension that contains a collection of
functions and global variables.
It is an executable file, and to organize all the modules we have the concept called
Package in Python.
A module is a single file that is imported with one
import statement and then used. E.g. import mymodule
Package
A package is a collection of Python modules grouped together in a directory. It usually also
contains a special file named __init__.py, which indicates to Python that the directory
should be treated as a package.
Packages help organize related modules into a hierarchical structure, making it easier
to organize and distribute code.
You can import modules from a package using dot notation.
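As a small sketch of dot notation, the example below imports modules from two packages that ship with Python (xml and urllib). The XML text and the URL are only illustrative values.

```python
# xml is a package, etree is a sub-package and ElementTree is a module inside it
import xml.etree.ElementTree as ET

# urllib is a package and parse is a module inside it
from urllib.parse import urlparse

root = ET.fromstring("<course><topic>Python</topic></course>")
print(root.find("topic").text)   # Python

parts = urlparse("https://www.python.org/doc/")
print(parts.netloc)              # www.python.org
```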
Library
In a broad sense, a library refers to a collection of code (modules and/or
packages) that provides specific functionality to be used by other
programs.
Libraries can include both built-in Python libraries (standard libraries) and
third-party libraries.
Built-in libraries are part of the Python standard library and come pre-
installed with Python. They provide core functionality for common tasks.
Third-party libraries are developed by individuals or organizations outside
of the Python core development team. They extend the functionality of Python by
providing additional features and tools for specific tasks.
Standard Library:
datetime: Allows manipulation of dates and times.
json: Provides functions for encoding and decoding JSON data.
math: Contains mathematical functions and constants.
random: Provides functions for generating random numbers.
re: Supports regular expressions for pattern matching.
collections: Contains specialized container datatypes.
csv: Provides functions for working with CSV files.
sys: Provides access to some variables used or
maintained by the Python interpreter.
Third-Party Libraries:
numpy: Provides support for large, multi-dimensional arrays and matrices, along with a collection of
mathematical functions to operate on these arrays.
pandas: Offers data structures and data analysis tools for working with structured data, such as tables
and time series.
matplotlib: Allows creation of static, animated, and interactive visualizations in Python.
requests: Simplifies making HTTP requests in Python.
scikit-learn: Offers machine learning algorithms and tools for data mining and data analysis.
tensorflow: An open-source machine learning library for high-performance numerical computation
and deep learning.
django: A high-level Python web framework for building web applications.
flask: A lightweight web framework for building web applications and APIs.
beautifulsoup4: A library for scraping information from web pages.
sqlalchemy: A SQL toolkit and Object-Relational Mapping (ORM) library for Python.
Python Datetime
A date in Python is not a data type of its own, but we can
import a module named datetime to work with dates as date
objects.
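A minimal sketch of working with date objects is shown below. The date 17 May 2020 is only an illustrative value.

```python
import datetime

# Create a date-time object for a fixed date (year, month, day)
d = datetime.datetime(2020, 5, 17)
print(d)             # 2020-05-17 00:00:00
print(d.year)        # 2020
print(d.weekday())   # 6 (Monday is 0, so 6 means Sunday)

# Date arithmetic is done with timedelta objects
next_week = d + datetime.timedelta(days=7)
print(next_week.date())   # 2020-05-24

# now() returns the current date and time, so its output changes on every run
now = datetime.datetime.now()
print(type(now))
```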
Python Math
Python has a set of built-in math functions,
including an extensive math module, that allows you to perform
mathematical tasks on numbers.
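The sketch below shows a few of the built-in math functions and a few functions from the math module. The numbers used are only illustrative.

```python
import math

# Built-in functions, no import needed
print(min(5, 10, 25))    # 5
print(max(5, 10, 25))    # 25
print(abs(-7.25))        # 7.25
print(pow(4, 3))         # 64

# Functions and constants from the math module
print(math.sqrt(64))     # 8.0
print(math.ceil(1.4))    # 2
print(math.floor(1.4))   # 1
print(math.pi)           # 3.141592653589793
```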
Numpy
NumPy, short for Numerical Python, is a fundamental
package for numerical computing in Python. It provides
support for large, multi-dimensional arrays and matrices,
along with a collection of mathematical functions to operate
on these arrays efficiently. NumPy is a core library for
scientific computing in Python and is widely used in fields
such as data science, machine learning, engineering, and
scientific research.
Key features of NumPy include:
Multi-dimensional Arrays: NumPy's main object is the ndarray, which is a multidimensional array of elements of the same type. These arrays can
be of any dimensionality, allowing for efficient storage and manipulation of large datasets.
Array Operations: NumPy provides a wide range of mathematical functions that operate element-wise on arrays, including arithmetic operations,
trigonometric functions, exponential and logarithmic functions, statistical functions, and more. These operations are optimized for performance,
making them suitable for large-scale numerical computations.
Broadcasting: NumPy allows for arithmetic operations between arrays of different shapes and sizes through broadcasting. Broadcasting
automatically aligns the dimensions of arrays to perform element-wise operations, which simplifies the code and improves efficiency.
Indexing and Slicing: NumPy offers powerful indexing and slicing capabilities for accessing and manipulating elements of arrays. It supports
advanced indexing techniques such as boolean indexing, integer array indexing, and fancy indexing.
Vectorized Computations: NumPy encourages vectorized computations, where operations are applied to entire arrays at once, rather than looping
over individual elements. This approach leads to faster execution times and cleaner code compared to traditional looping constructs.
Random Number Generation: NumPy includes a random number generation module (numpy.random) for generating random numbers from
various probability distributions. This module is useful for tasks such as simulation, modeling, and statistical analysis.
Integration with Other Libraries: NumPy integrates seamlessly with other scientific computing libraries in Python, such as SciPy, Matplotlib,
Pandas, and Scikit-learn, enabling a cohesive ecosystem for data analysis, visualization, and machine learning.
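Some of the key features above (vectorized computations, broadcasting and indexing) can be seen in a short sketch. It assumes NumPy is already installed; the arrays arr and col are only illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# Vectorized computation: the operation is applied to every element, no loop needed
print(arr * 2)             # [2 4 6 8]

# Broadcasting: a (2, 1) array combined with a (4,) array gives a (2, 4) result
col = np.array([[10], [20]])
print((col + arr).shape)   # (2, 4)

# Boolean indexing: keep only the elements where the condition is True
print(arr[arr > 2])        # [3 4]

# Integer array (fancy) indexing: pick elements by a list of positions
print(arr[[0, 3]])         # [1 4]
```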
Why Use NumPy?
In Python we have lists that serve the purpose of arrays, but
they are slow to process when they hold a lot of numerical data.
NumPy aims to provide an array object that is much faster than
traditional Python lists (up to 50x; the actual gain depends on the
operation and the size of the data).
The array object in NumPy is called ndarray. It provides a
lot of supporting functions that make working with ndarray
very easy.
Arrays are very frequently used in data science, where speed
and resources are very important.
Why is NumPy Faster Than Lists?
NumPy arrays are stored at one contiguous place in memory,
unlike lists, so processes can access and manipulate them very
efficiently.
This behavior is called locality of reference in computer
science.
Installation of NumPy
pip install numpy
Once NumPy is installed, import it in your applications by adding the
import keyword:
import numpy
NumPy as np
NumPy is usually imported under the np alias.
alias: In Python, an alias is an alternate name for referring to the same
thing.
Create an alias with the as keyword while importing:
import numpy as np
Now the NumPy package can be referred to as np instead of numpy.
Create a NumPy ndarray Object
NumPy is used to work with arrays. The array object in NumPy is called
ndarray.
We can create a NumPy ndarray object by using the array() function.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
ndarray, ndim
type(): This built-in Python function tells us the type of the object
passed to it. In the code above it shows that arr is of type
numpy.ndarray.
ndim: This attribute of a NumPy array tells us how many dimensions
the array has.
import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
NumPy Array Indexing
Array indexing is the same as accessing an array element.
You can access an array element by referring to its index
number.
The indexes in NumPy arrays start with 0, meaning that
the first element has index 0, and the second has index 1
etc.
Indexing – 1d
Accessing elements and adding
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[2] + arr[3])
Indexing – 2d
Accessing 1st row 2nd column
import numpy as np
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('2nd element on 1st row: ', arr[0, 1])
Indexing -3d
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr[0, 1, 2])
Array Slicing
Slicing arrays
Slicing in python means taking elements from one given index to another given index.
We pass slice instead of index like this: [start:end].
We can also define the step, like this: [start:end:step].
If we don't pass start, it's considered 0
If we don't pass end, it's considered the length of the array in that dimension
If we don't pass step, it's considered 1
Slicing 1-D Arrays
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
Slicing 2-D Arrays
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])
Data Types
i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )
Checking the Data Type of an Array
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)
Creating Arrays With a Defined Data Type
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)
Converting Data Type on Existing Arrays
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype('i')
print(newarr)
print(newarr.dtype)
2nd way
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype(int)
print(newarr)
print(newarr.dtype)
NumPy Array Copy vs View
The main difference between a copy and a view of an array is that
the copy is a new array, and the view is just a view of the original
array.
The copy owns the data and any changes made to the copy will not
affect original array, and any changes made to the original array will
not affect the copy.
The view does not own the data and any changes made to the view
will affect the original array, and any changes made to the original
array will affect the view.
Copy
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42
print(arr)
print(x)
The copy SHOULD NOT be affected by the changes
made to the original array.
VIEW:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42
print(arr)
print(x)
The view SHOULD be affected by the changes made to the
original array.
Check if Array Owns its Data
As mentioned above, copies own the data, and views
do not own the data, but how can we check this?
Every NumPy array has the attribute base that returns
None if the array owns the data.
Otherwise, the base attribute refers to the original object.
Base
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
y = arr.view()
print(x.base)
print(y.base)
The copy returns None.
The view returns the original array.
Array Shape
The shape of an array is the number of elements in each dimension.
NumPy arrays have an attribute called shape that returns a tuple
with each index having the number of corresponding elements.
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape) #(2,4) is the shape
shape
import numpy as np
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('shape of array :', arr.shape)
Reshaping
Reshaping arrays
Reshaping means changing the shape of an array.
The shape of an array is the number of elements in each dimension.
By reshaping we can add or remove dimensions or change number
of elements in each dimension.
Reshape From 1-D to 2-D
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)
print(newarr)
Unknown Dimension
You are allowed to have one "unknown" dimension.
Meaning that you do not have to specify an exact number for one of the dimensions in the
reshape method.
Pass -1 as the value, and NumPy will calculate this number for you.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
newarr = arr.reshape(2, 2, -1)
print(newarr)
Flattening the arrays
Flattening array means converting a multidimensional array into a 1D array.
We can use reshape(-1) to do this.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
newarr = arr.reshape(-1)
print(newarr)
Iterating Arrays
Iterating means going through elements one by one.
As we deal with multi-dimensional arrays in numpy, we can do
this using basic for loop of python.
If we iterate on a 1-D array it will go through each element one by
one.
import numpy as np
arr = np.array([1, 2, 3])
for x in arr:
    print(x)
Iterating 2-D Arrays
In a 2-D array it will go through all the rows.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
for x in arr:
    print(x)
Iterating 3-D Arrays
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in arr:
    print("x represents the 2-D array:")
    print(x)
Iterating Arrays Using nditer()
The function nditer() is a helper function that can be used
for very basic to very advanced iterations. It solves some
basic issues which we face in iteration; let's go through it
with examples.
Iterating on Each Scalar Element
With basic for loops, iterating through each scalar of an
n-dimensional array requires n nested for loops, which can be
difficult to write for arrays with very high dimensionality.
import numpy as np
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
for x in np.nditer(arr):
    print(x)
Joining NumPy Arrays
Joining means putting contents of two or more arrays in a single array.
In SQL we join tables based on a key, whereas in NumPy we join arrays by
axes.
We pass a sequence of arrays that we want to join to the concatenate()
function, along with the axis. If axis is not explicitly passed, it is taken as 0.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)
Joining with stack() (along a new axis)
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.stack((arr1, arr2), axis=1)
print(arr)
Stacking Along Rows
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.hstack((arr1, arr2))
print(arr)
Stacking Along Columns
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.vstack((arr1, arr2))
print(arr)
Splitting NumPy Arrays
Splitting is reverse operation of Joining.
Joining merges multiple arrays into one and Splitting breaks one array
into multiple.
We use array_split() for splitting arrays, we pass it the array we want
to split and the number of splits.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr)
The return value is a list containing three arrays.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 4)
print(newarr)
We also have the method split() available, but it will not adjust the
elements when the source array cannot be divided equally. In the
example above, array_split() worked properly, but split() would
fail.
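The difference between the two functions can be checked with a short sketch like this one. It reuses the six-element array from the example above; the variable name parts is only illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# array_split() accepts an uneven division: 6 elements into 4 parts
parts = np.array_split(arr, 4)
print([p.tolist() for p in parts])   # [[1, 2], [3, 4], [5], [6]]

# split() needs an equal division and raises ValueError otherwise
try:
    np.split(arr, 4)
except ValueError as err:
    print("split() failed:", err)
```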
Split Into Arrays
The return value of the array_split() method is a list containing each of the splits
as an array.
If you split an array into 3 arrays, you can access them from the result just like any
array element:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr[0])
print(newarr[1])
print(newarr[2])
Splitting 2-D Arrays
import numpy as np
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
newarr = np.array_split(arr, 3)
print(newarr)
hsplit() opposite of hstack()
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.hsplit(arr, 3)
print(newarr)
Similarly we can do for vsplit()
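A matching sketch for vsplit(), which is the opposite of vstack() and splits the array along its rows, could look like this. It reuses the array from the hsplit() example.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9],
                [10, 11, 12], [13, 14, 15], [16, 17, 18]])

# Split the 6 rows into 3 arrays of 2 rows each
newarr = np.vsplit(arr, 3)
print(len(newarr))   # 3
print(newarr[0])     # the first two rows: [1 2 3] and [4 5 6]
```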
Searching Arrays
You can search an array for a certain value, and return the indexes that get a match.
To search an array, use the where() method.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4)
print(x)
The example above will return a tuple: (array([3, 5, 6]),)
Which means that the value 4 is present at index 3, 5, and 6.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 0)
print(x)
Search Sorted
There is a method called searchsorted() which performs a binary search in the array, and
returns the index where the specified value would be inserted to maintain the sorted order.
The searchsorted() method is assumed to be used on sorted arrays.
import numpy as np
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7)
print(x)
Sorting Arrays
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
This method returns a copy of the array, leaving the original array
unchanged.
Sorting a 2-D Array
import numpy as np
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))
Filtering Arrays
Getting some elements out of an existing array and creating a new
array out of them is called filtering.
In NumPy, you filter an array using a boolean index list.
A boolean index list is a list of booleans corresponding to indexes
in the array.
import numpy as np
arr = np.array([41, 42, 43, 44])
# Create an empty list
filter_arr = []
# go through each element in arr
for element in arr:
    # if the element is higher than 42, set the value to True, otherwise False:
    if element > 42:
        filter_arr.append(True)
    else:
        filter_arr.append(False)
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
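The loop above can also be written in one step, because a comparison on a NumPy array already produces a boolean array. A minimal sketch using the same values:

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

# The comparison is applied to every element and returns a boolean array
filter_arr = arr > 42
newarr = arr[filter_arr]

print(filter_arr)   # [False False  True  True]
print(newarr)       # [43 44]
```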
Pandas
Pandas is a powerful open-source Python library used for data
manipulation and analysis. It provides easy-to-use data structures
and data analysis tools, making it popular among data scientists,
analysts, and developers working with tabular or structured data.
Key features of Pandas include:
DataFrame: Pandas introduces the DataFrame data structure, which is a two-dimensional, size-mutable, and
heterogeneous tabular data structure with labeled axes (rows and columns). It allows you to store and manipulate data in a
way similar to a spreadsheet or SQL table.
Series: Along with DataFrame, Pandas also provides the Series data structure, which is a one-dimensional labeled array
capable of holding any data type.
Data Manipulation: Pandas offers a wide range of functions and methods for data manipulation, including indexing,
slicing, filtering, sorting, merging, joining, reshaping, grouping, aggregating, and pivoting data.
Data Cleaning: Pandas provides tools for handling missing data, converting data types, removing duplicates, and
performing other data cleaning tasks.
Data Input/Output: Pandas supports various file formats for importing and exporting data, including CSV, Excel, SQL
databases, JSON, HTML, and more..
Integration with Other Libraries: Pandas integrates well with other Python libraries such as NumPy, Matplotlib,
Seaborn, and Scikit-learn, providing seamless interoperability for data analysis, visualization, and machine learning tasks.
Series
A Series is a one-dimensional labeled array-like data structure
that can hold any data type, including integers, floats, strings,
or even Python objects. It is similar to a one-dimensional
NumPy array, but with additional features such as axis labels
(index), which makes it more powerful and versatile.
Characteristics
One-Dimensional: A Series consists of a single dimension of data, similar to a one-dimensional array.
Labeled Index: Each element in a Series is associated with a unique label called the index. The index can be
numeric or non-numeric (e.g., strings, dates), and it provides a way to access and manipulate data within the
Series.
Homogeneous Data: Unlike a Python list, a Series has one data type (dtype) for all its elements; mixed values are
stored with the general object dtype. A single numeric dtype allows for efficient storage and computation.
Flexibility: Series can store various data types, including numeric data (integers, floats), string data, datetime
data, and Python objects.
Vectorized Operations: Series support vectorized operations, allowing you to perform arithmetic operations
and other mathematical functions on the entire Series at once, without the need for explicit looping.
Automatic Alignment: When performing operations between two Series objects, pandas automatically aligns
the data based on the index labels, ensuring that calculations are performed correctly, even if the indexes are not
in the same order.
Name Attribute: A Series can have a name attribute, which provides a label for the Series itself. This can be
useful when working with DataFrame objects, where Series are often used as columns.
DataFrame:
A DataFrame is like the entire table. It is a two-dimensional data
structure where each column is a Series, and each row represents an
observation or record. DataFrames provide a convenient way to organize
and manipulate data, allowing you to perform various operations such as
indexing, filtering, grouping, and aggregation.
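The sketch below creates a small Series and a small DataFrame to show the ideas above. It assumes Pandas is installed; the labels day1 to day3 and all the values are only illustrative.

```python
import pandas as pd

# A Series with a labeled index and a name
calories = pd.Series([420, 380, 390], index=["day1", "day2", "day3"], name="calories")
print(calories["day2"])   # 380
print(calories * 2)       # vectorized: every value is doubled

# Operations between two Series align on the index labels, not on the position
other = pd.Series([1, 2, 3], index=["day3", "day2", "day1"])
print((calories + other)["day1"])   # 423

# A DataFrame built from a dictionary; each column is a Series
df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]},
                  index=["day1", "day2", "day3"])
print(type(df["calories"]))
print(df)
```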
Locate
the loc attribute is used to access a group of rows and columns
by label(s) or a boolean array. It allows you to select data from
a DataFrame based on row labels and/or column labels. The
primary purpose of loc is to locate data based on its index
labels.
Accessing Rows: You can use loc to access one or more rows from a DataFrame
based on their index labels. If a single label is provided, loc returns a Series. If a
list of labels is provided, it returns a DataFrame.
Accessing Columns: You can also use loc to access specific columns of a
DataFrame by providing both row and column labels. This allows you to select
specific rows and columns simultaneously.
Slicing: loc supports slicing operations, allowing you to select a range of rows
or columns using label-based indexing.
Boolean Indexing: You can pass a boolean array to loc to select rows based on a
condition. This is useful for filtering rows based on certain criteria.
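The four uses of loc described above can be sketched as follows. The DataFrame and its labels are only illustrative values.

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]},
                  index=["day1", "day2", "day3"])

row = df.loc["day2"]                           # single label -> Series
rows = df.loc[["day1", "day3"]]                # list of labels -> DataFrame
cell = df.loc["day1", "duration"]              # row label and column label -> one value
span = df.loc["day1":"day2", "calories"]       # label slicing includes the end label
long_sessions = df.loc[df["duration"] > 40]    # boolean indexing

print(cell)                        # 50
print(list(span.index))            # ['day1', 'day2']
print(list(long_sessions.index))   # ['day1', 'day3']
```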
What are CSV files?
CSV files are text files with information separated by commas, saved with the
extension .csv. They allow large amounts of detailed data to be transferred ‘machine-to-
machine’, with little or no reformatting by the user.
You can open a CSV file with any spreadsheet, statistics, or analysis program, such as
Microsoft Excel, the R statistical environment, or Python.
CSV files may open in Excel by default, but they are not designed as Excel files. If CSV
files are opened in Excel, certain information (eg codes with leading zeros) could be
missing. Ideally, they should be
If you have a large DataFrame with many rows, Pandas will only return the first 5 rows
and the last 5 rows.
Use to_string() to print the entire DataFrame.
The number of rows returned is defined in Pandas option settings.
You can check your system's maximum rows with
the pd.options.display.max_rows statement.
In my system the number is 60, which means that if the DataFrame contains more than 60
rows, the print(df) statement will return only the headers and the first and last 5 rows.
You can change the maximum rows number with the same statement.
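A minimal sketch of reading CSV data and checking the display setting is shown below. To keep it self-contained, a short CSV text (illustrative values) stands in for a file on disk; with a real file you would pass its name to pd.read_csv() instead.

```python
import io
import pandas as pd

# CSV text that plays the role of a .csv file (illustrative data)
csv_text = "Duration,Pulse,Calories\n60,110,409.1\n45,117,282.4\n30,103,195.1\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df.to_string())                 # prints every row of the DataFrame

print(pd.options.display.max_rows)    # the current limit (60 unless it was changed)
pd.options.display.max_rows = 100     # change the limit with the same statement
```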
What is a JSON file?
JavaScript Object Notation (JSON) is a standardized format commonly used to transfer
data as text that can be sent over a network. It’s used by lots of APIs and Databases, and
it’s easy for both humans and machines to read.
JSON represents objects as name/value pairs, just like a Python dictionary.
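The link between JSON and a Python dictionary can be seen with the json module from the standard library. The JSON text below is only an illustrative value.

```python
import json

text = '{"name": "Arun", "age": 30, "skills": ["Python", "SQL"]}'

data = json.loads(text)     # JSON text -> Python dictionary
print(data["skills"][0])    # Python

data["age"] = 31
print(json.dumps(data))     # Python dictionary -> JSON text
```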
Analyzing DataFrames
Viewing the Data
One of the most used methods for getting a quick overview of the DataFrame is the head()
method.
The head() method returns the headers and a specified number of rows, starting from the
top.
If the number of rows is not specified, the head() method will return the top 5 rows.
There is also a tail() method for viewing the last rows of the DataFrame.
The tail() method returns the headers and a specified number of rows, starting from the
bottom.
Info About the Data
The DataFrame object has a method called info() that gives you more information about
the data set.
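The head(), tail() and info() methods can be tried on a small DataFrame like the one below. The column names and values are only illustrative; two cells are left empty on purpose so that info() shows fewer non-null values for that column.

```python
import pandas as pd

df = pd.DataFrame({
    "Duration": list(range(10, 110, 10)),
    "Calories": [100.0, 150.0, None, 210.0, 250.0, 300.0, 340.0, None, 410.0, 450.0],
})

print(df.head())     # the first 5 rows
print(df.head(3))    # the first 3 rows
print(df.tail(2))    # the last 2 rows
df.info()            # column names, non-null counts and data types
```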
Null Values
The info() method also tells us how many Non-Null values there are present in each column, and
in our data set it seems like there are 164 of 169 Non-Null values in the "Calories" column.
Which means that there are 5 rows with no value at all, in the "Calories" column, for whatever
reason.
Empty values, or Null values, can be bad when analyzing data, and you should consider
removing rows with empty values. This is a step towards what is called cleaning data, and you
will learn more about that in the next chapters.
Pandas - Cleaning Data
Data cleaning means fixing bad data in your data set.
Bad data could be:
Empty cells
Data in wrong format
Wrong data
Duplicates
Empty cells can potentially give you a wrong result when you analyze data.
One way to deal with empty cells is to remove rows that contain empty cells.
This is usually OK, since data sets can be very big, and removing a few rows will
not have a big impact on the result.
By default, the dropna() method returns a new DataFrame, and will not change the
original.
If you want to change the original DataFrame, use the inplace = True argument:
Now, the dropna(inplace = True) will NOT return a new DataFrame, but it will
remove all rows containing NULL values from the original DataFrame.
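A minimal sketch of both ways of calling dropna() follows. The DataFrame holds illustrative values with one empty cell.

```python
import pandas as pd

df = pd.DataFrame({"Duration": [60, 45, 30], "Calories": [409.1, None, 195.1]})

new_df = df.dropna()           # returns a new DataFrame, df itself is unchanged
print(len(df), len(new_df))    # 3 2

df.dropna(inplace=True)        # changes df itself and returns None
print(len(df))                 # 2
```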
Replace Empty Values
Another way of dealing with empty cells is to insert a new value instead.
This way you do not have to delete entire rows just because of some empty cells.
The fillna() method allows us to replace empty cells with a value:
Replace Only For Specified Columns
The example above replaces all empty cells in the whole Data Frame.
To only replace empty values for one column, specify the column name
for the DataFrame
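Both cases can be sketched as below. The value 130 and the column names are only illustrative; the result of fillna() is assigned back to the column, which is one common way to replace values for a single column.

```python
import pandas as pd

df = pd.DataFrame({"Duration": [60, None, 30], "Calories": [409.1, None, 195.1]})

# Replace every empty cell in the whole DataFrame (returns a new DataFrame)
filled_all = df.fillna(130)
print(filled_all)

# Replace empty cells only in the "Calories" column
df["Calories"] = df["Calories"].fillna(130)
print(df)
```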
Wrong Data
"Wrong data" does not have to be "empty cells" or "wrong format", it can just
be wrong, like if someone registered "199" instead of "1.99".
Sometimes you can spot wrong data by looking at the data set, because you
have an expectation of what it should be.
If you take a look at our data set, you can see that in row 7, the duration is 450,
but for all the other rows the duration is between 30 and 60.
It doesn't have to be wrong, but taking into consideration that this is the data set
of someone's workout sessions, we conclude that this person did
not work out for 450 minutes.
Replacing Values
One way to fix wrong values is to replace them with something else.
In our example, it is most likely a typo, and the value should be "45"
instead of "450", and we could just insert "45" in row 7:
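Replacing a value can be done with loc. The sketch below uses a small illustrative DataFrame, so the wrong value sits in row 2 here instead of row 7; the limit of 120 in the second step is also only an assumption for the example.

```python
import pandas as pd

df = pd.DataFrame({"Duration": [60, 45, 450, 30, 300]})

# Replace one known wrong value by its row label and column name
df.loc[2, "Duration"] = 45

# For bigger data sets a rule can be used: set every value above 120 to 120
df.loc[df["Duration"] > 120, "Duration"] = 120

print(df["Duration"].tolist())   # [60, 45, 45, 30, 120]
```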
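Duplicate rows are rows that have been registered more than one time.
The duplicated() method returns a Boolean value for each row: True for a row that is a duplicate, otherwise False.
print(df.duplicated())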
Removing Duplicates
To remove duplicates, use the drop_duplicates() method.
df.drop_duplicates(inplace = True)
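A small self-contained sketch of finding and removing duplicates, with illustrative values where the third row repeats the first one:

```python
import pandas as pd

df = pd.DataFrame({"Duration": [60, 45, 60], "Pulse": [110, 117, 110]})

flags = df.duplicated()
print(flags.tolist())              # [False, False, True]

df.drop_duplicates(inplace=True)   # removes the repeated row from df itself
print(len(df))                     # 2
```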
Finding Relationships
A great aspect of the Pandas module is the corr() method.
The corr() method calculates the relationship between each column in your data set.
Older Pandas versions ignore "not numeric" columns in corr(). From Pandas 2.0 on, pass
numeric_only=True to skip them, otherwise "not numeric" columns cause an error.
What is a good correlation? It depends on the use, but I think it is safe to say you have to
have at least 0.6 (or -0.6) to call it a good correlation.
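A minimal sketch of corr() on illustrative data follows. The Calories values rise in a straight line with Duration, so their correlation is 1 (up to rounding). The numeric_only=True argument assumes a reasonably recent Pandas version (1.5 or later) and leaves the text column out.

```python
import pandas as pd

df = pd.DataFrame({
    "Duration": [30, 45, 60, 90],
    "Calories": [200, 300, 400, 600],
    "Label": ["a", "b", "c", "d"],   # a "not numeric" column
})

corr = df.corr(numeric_only=True)
print(corr)
print(corr.loc["Duration", "Calories"])   # a perfect positive correlation
```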
Introduction to Data Visualization
Data visualization is the graphical representation of information and data. By using visual
elements like charts, graphs, and maps, data visualization tools provide an accessible way to
see and understand trends, outliers, and patterns in data.
Importance of Data Visualization
Simplifies Complex Data: Helps in simplifying complex data sets to make data-driven
decisions.
Identifies Trends and Patterns: Makes it easier to spot trends and patterns.
Communication: Effective for communicating data insights to stakeholders.
Comparisons: Makes comparisons between different data sets or variables more intuitive.
Decision Making: Aids in faster and more effective decision-making.
What is Matplotlib?
Matplotlib is a plotting library for the Python programming language and its numerical
mathematics extension, NumPy. It provides an object-oriented API for embedding plots
into applications using general-purpose GUI toolkits.
Why Use Matplotlib?
Matplotlib is one of the most widely used data visualization libraries in Python for several
reasons:
Versatility: It can create a wide range of static, animated, and interactive plots.
Customization: Highly customizable to suit various needs and preferences.
Integration: Integrates well with other Python libraries such as NumPy, Pandas, and
SciPy.
Output Formats: Supports various output formats like PNG, PDF, SVG, and more.
Community and Documentation: Well-documented with a large community, making it
easier to find support and examples.
Core Concepts of Matplotlib
Figures and Axes:
Figure: The entire window where everything is drawn, equivalent to a blank canvas.
Axes: The area where the data is plotted, equivalent to a single plot or graph. A figure can contain
multiple axes.
Plotting Functions:
plt.plot(): Line plots
plt.scatter(): Scatter plots
plt.bar(): Bar charts
plt.hist(): Histograms
plt.pie(): Pie charts
Customization:
Titles, labels, legends, colors, and styles can be customized to improve the readability and
aesthetics of the plots.
Grids and annotations can be added for better clarity.
Subplots:
Multiple plots can be created in a single figure using subplots, allowing for complex data
visualizations.
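The core concepts above (figure, axes, plotting functions, customization and subplots) can be put together in a short sketch. It assumes Matplotlib is installed; the data values are only illustrative. The Agg backend and the in-memory buffer are choices made so that the script also runs without a display; in an interactive session plt.show() would open the window instead.

```python
import io

import matplotlib
matplotlib.use("Agg")   # non-interactive backend, no window is needed
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# One figure with two axes (subplots) side by side
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with title, labels, legend and grid
ax1.plot(x, y, color="tab:blue", label="y = x squared")
ax1.set_title("Line plot")
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.legend()
ax1.grid(True)

# Bar chart of the same data
ax2.bar(x, y)
ax2.set_title("Bar chart")

# Write the figure as PNG into memory; fig.savefig("plot.png") would write a file instead
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```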
Example Use Cases of Matplotlib
Scientific Research: Used to visualize data from experiments and simulations.
Finance: Plotting stock prices, market trends, and financial reports.
Machine Learning: Visualizing model performance, feature importance, and data
distributions.
Geospatial Data: Creating maps and geographical plots.
Business Intelligence: Dashboard creation, sales analysis, and operational metrics.
Exception Handling
When an error (an exception, as we call it) occurs, Python will
normally stop and generate an error message.
The try block lets you test a block of code for errors.
The except block lets you handle the error.
The else block lets you execute code when there is no error.
The finally block lets you execute code regardless of the result of
the try and except blocks.
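A small sketch of how the four blocks work together. The function name safe_divide and the steps list are made up for this example; the list only records which blocks were executed.

```python
def safe_divide(a, b):
    """Divide a by b and record which blocks ran (hypothetical helper)."""
    steps = []
    try:
        result = a / b           # may raise ZeroDivisionError
        steps.append("try")
    except ZeroDivisionError:
        result = None            # handle the error instead of crashing
        steps.append("except")
    else:
        steps.append("else")     # runs only when no error occurred
    finally:
        steps.append("finally")  # runs in both cases
    return result, steps


print(safe_divide(10, 2))  # (5.0, ['try', 'else', 'finally'])
print(safe_divide(10, 0))  # (None, ['except', 'finally'])
```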
Raise an exception
To throw (or raise) an exception, use the raise keyword.
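For example, a function can raise an exception when it receives a value it cannot accept. The function set_age and its error message are made up for illustration.

```python
def set_age(age):
    """Return age if it is valid, otherwise raise ValueError (hypothetical helper)."""
    if age < 0:
        raise ValueError("age cannot be negative")
    return age


try:
    set_age(-5)
except ValueError as err:
    message = str(err)
    print("Caught:", message)  # Caught: age cannot be negative
```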
Object oriented programming
Procedural programming
Functional programming
Object oriented programming
Concepts in OOP
Class
Object
Polymorphism
Encapsulation
Inheritance
Abstraction
Constructor
__init__() method
Every class has a method called __init__(), commonly known as the constructor.
It is executed every time an object of the class is created.
The self parameter is a reference to the object on which the method is called.
It is used to access the attributes and methods that belong to that object.
Attributes
Types of attributes
Class Attributes
Object attributes or instance attributes
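A short sketch of the difference between the two types. The class Car and its attribute names are chosen only for illustration.

```python
class Car:
    wheels = 4  # class attribute: shared by every Car object

    def __init__(self, color):
        self.color = color  # instance attribute: belongs to one object only


red = Car("red")
blue = Car("blue")

print(red.wheels, blue.wheels)  # 4 4
print(red.color, blue.color)    # red blue

Car.wheels = 3                  # changing the class attribute affects all objects
print(red.wheels, blue.wheels)  # 3 3
```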
Methods
Basically, a class is a combination of methods and attributes.
Example (in terms of a car)
Attributes: number of tyres, doors, seats, car body, etc.
Methods: how the car runs using all these attributes,
such as how the engine starts and how the brakes operate.
How to create a method
# creating a class
class Student:
    def __init__(self, fullname):
        self.name = fullname

    def hi(self):  # creating a method according to our requirement
        print("hello", self.name)

# creating an object
s1 = Student("Arunpal")
s1.hi()
Abstraction
Hiding the implementation details of a class and showing only the
essential features to the user.
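One possible way to express abstraction in Python is the standard abc module. This is only a sketch: the classes Shape and Rectangle are made up, and abstraction can also be achieved without abc.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Every shape must provide its own area calculation."""


class Rectangle(Shape):
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def area(self):
        # The user calls area() without knowing how it is calculated
        return self._width * self._height


r = Rectangle(3, 4)
print(r.area())  # 12

try:
    Shape()  # an abstract class cannot be instantiated
except TypeError:
    print("Shape is abstract and cannot be created directly")
```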
Encapsulation
Wrapping data and functions into a single unit (object).
It protects our data from accidental change and restricts direct access
to the data present inside the class.
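A minimal sketch of encapsulation, assuming a made-up BankAccount class. The double underscore prefix makes Python rename the attribute internally (name mangling), so it cannot be reached by its plain name from outside the class.

```python
class BankAccount:
    def __init__(self, balance):
        self.__balance = balance  # name-mangled, treated as private

    def deposit(self, amount):
        # The only way to change the balance is through this checked method
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.__balance += amount

    def get_balance(self):
        return self.__balance


acct = BankAccount(100)
acct.deposit(50)
print(acct.get_balance())  # 150

try:
    acct.__balance
except AttributeError:
    print("__balance is not directly accessible from outside the class")
```

Note that this is a convention more than a hard lock: the attribute is still reachable as acct._BankAccount__balance, so it mainly guards against accidental changes.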
Inheritance
When one class (child/derived) derives the properties and methods of another
class (parent/base).
Types
Single-level inheritance
Multilevel inheritance
Multiple inheritance
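The three types listed above can be sketched as follows. All class and method names here are made up for illustration.

```python
class Animal:                   # base class
    def eat(self):
        return "eating"


class Dog(Animal):              # single-level: Dog -> Animal
    def bark(self):
        return "barking"


class Puppy(Dog):               # multilevel: Puppy -> Dog -> Animal
    def play(self):
        return "playing"


class Swimmer:
    def swim(self):
        return "swimming"


class Labrador(Dog, Swimmer):   # multiple: two parent classes
    pass


print(Dog().eat())                    # eating
print(Puppy().eat(), Puppy().bark())  # eating barking
print(Labrador().bark(), Labrador().swim())  # barking swimming
```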
super()
The super() function is used to access the methods of the parent class.
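A short sketch of calling parent methods from a child class. The classes Person and Student and the attribute roll_no are assumptions made for this example.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "Person: " + self.name


class Student(Person):
    def __init__(self, name, roll_no):
        super().__init__(name)   # run the parent constructor first
        self.roll_no = roll_no

    def describe(self):
        # Reuse the parent version and add to it
        return super().describe() + ", roll no " + str(self.roll_no)


s = Student("Arunpal", 7)
print(s.describe())  # Person: Arunpal, roll no 7
```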
Class method
A class method is bound to the class and receives the class as an implicit first argument.
Note: A static method can't access or modify class state and is generally used
for utility functions.
class Student:
    @classmethod
    def collage(cls):
        pass
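To see the difference between the two in action, here is a slightly larger sketch. The counter count and the methods total and is_valid_name are made up for illustration.

```python
class Student:
    count = 0  # class state shared by all students

    def __init__(self, name):
        self.name = name
        Student.count += 1

    @classmethod
    def total(cls):
        # cls is the class itself, so class state is available
        return cls.count

    @staticmethod
    def is_valid_name(name):
        # No self or cls: a plain utility function kept inside the class
        return name.isalpha()


Student("Arun")
Student("Pal")
print(Student.total())                 # 2
print(Student.is_valid_name("Arun1"))  # False
```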
Polymorphism
When the same operator is allowed to have a different meaning according
to the context.
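For example, the + operator adds numbers, joins strings and joins lists. A class can give + its own meaning by defining __add__. The Vector class below is made up for illustration.

```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called when two Vector objects are combined with +
        return Vector(self.x + other.x, self.y + other.y)


print(1 + 2)          # 3
print("py" + "thon")  # python
print([1] + [2])      # [1, 2]

v = Vector(1, 2) + Vector(3, 4)
print(v.x, v.y)       # 4 6
```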
Extra topics
Regex
A RegEx, or Regular Expression, is a sequence of characters that forms a
search pattern.
RegEx can be used to check if a string contains the specified search pattern.
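In Python, regular expressions are available through the built-in re module. A minimal sketch, with a sample sentence chosen only for illustration:

```python
import re

text = "The rain in Spain"

# re.search returns a match object if the pattern is found, otherwise None
match = re.search(r"ai", text)
if match:
    print("found at index", match.start())  # found at index 5

missing = re.search(r"xyz", text)
print(missing)  # None
```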
Uses of regex
Data Validation:
Ensuring that user input meets specific criteria, such as email
addresses, phone numbers, passwords, etc.
Text Search and Extraction:
Searching for specific patterns within text and extracting
relevant information, such as URLs, dates, or mentions.
Data Cleaning and Transformation:
Cleaning up messy data by removing unwanted characters,
formatting inconsistencies, or extracting specific information.
Pattern Matching and Substitution:
Identifying and manipulating patterns within text data, such as
replacing all occurrences of a word or phrase.
Parsing and Tokenization:
Breaking down text into smaller units, such as words, sentences,
or individual components of a programming language.
URL Routing in Web Development:
Defining URL patterns and routing requests to appropriate views or
controllers in web applications.
Syntax Highlighting and Code Analysis:
Supporting syntax highlighting and code analysis in code editors or
IDEs based on language rules.
Log Analysis and Filtering:
Analyzing log files or other structured data sources to extract relevant
information or filter out noise.
Data Extraction for Machine Learning:
Preprocessing text data for machine learning tasks, such as sentiment analysis, named
entity recognition, or text classification.
Automating Text Processing Tasks:
Automating repetitive text processing tasks, such as finding and replacing text across
multiple files or documents.
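A few of the uses above can be sketched with the re module. The patterns and sample strings are assumptions made for illustration; in particular, the e-mail pattern is deliberately simplified and is not a complete validator.

```python
import re

# Data validation: simplified e-mail check (not a complete validator)
email_pattern = re.compile(r"[\w.]+@[\w.]+\.\w+")
valid = bool(email_pattern.fullmatch("user@example.com"))
invalid = bool(email_pattern.fullmatch("not-an-email"))
print(valid, invalid)  # True False

# Text search and extraction: pull out all dates in YYYY-MM-DD form
log = "Errors on 2024-01-15 and 2024-02-03"
dates = re.findall(r"\d{4}-\d{2}-\d{2}", log)
print(dates)  # ['2024-01-15', '2024-02-03']

# Data cleaning / substitution: collapse repeated whitespace
cleaned = re.sub(r"\s+", " ", "too    many   spaces")
print(cleaned)  # too many spaces

# Tokenization: break text into words
words = re.findall(r"\w+", "Hello, world! Bye.")
print(words)  # ['Hello', 'world', 'Bye']
```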
Regex metacharacters
. (Dot): Matches any single character except a newline.
Example 1: a.c
Matches: "abc", "axc", "a$c"
Does not match: "ac", "a\nc"
Example 2: b.t
Matches: "bat", "bet", "b1t"
Does not match: "bt", "b\nt"
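The dot can be tried out with re.fullmatch, which checks whether the whole string fits the pattern. The test strings are chosen only for illustration.

```python
import re

# The dot stands for exactly one character, but not a newline
samples = ["abc", "axc", "a$c", "ac", "a\nc"]
results = {s: bool(re.fullmatch(r"a.c", s)) for s in samples}

for s, ok in results.items():
    print(repr(s), ok)
# 'abc' True
# 'axc' True
# 'a$c' True
# 'ac' False
# 'a\nc' False
```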
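^ (Caret): Matches the start of a string.
Example 1: ^abc
Matches: "abc" at the start of a string, as in "abcdef"
Does not match: "xabc", " abc"
Example 2: ^start
Matches: "start" at the start of a string
Does not match: "restart"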
$ (Dollar): Matches the end of a string.
Example 1: end$
Matches: "end" at the end of a string
Does not match: "end here", "ending"
Example 2: line$
Matches: "line" at the end of a string
Does not match: "line here", "lines"
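The caret and dollar anchors can be checked with re.search, which looks for the pattern anywhere in the string unless an anchor pins it down. The sample strings are chosen only for illustration.

```python
import re

# ^ pins the pattern to the start of the string
starts_ok = bool(re.search(r"^abc", "abcdef"))
starts_bad = bool(re.search(r"^abc", "xabc"))
print(starts_ok, starts_bad)  # True False

# $ pins the pattern to the end of the string
ends_ok = bool(re.search(r"end$", "the end"))
ends_bad = bool(re.search(r"end$", "end here"))
print(ends_ok, ends_bad)  # True False
```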
* (Asterisk): Matches zero or more repetitions of the preceding character.
Example 1: ab*c
Matches: "ac", "abc", "abbbc"
Does not match: "a", "bc"
Example 2: go*d
Matches: "gd", "god", "good"
+ (Plus): Matches one or more repetitions of the preceding character.
Example 1: ab+c
Matches: "abc", "abbc", "abbbc"
Does not match: "ac", "a", "bc"
Example 2: go+d
Matches: "god", "good"
Does not match: "gd", "g", "od"
? (Question Mark): Matches zero or one occurrence of the preceding character.
Example 1: colou?r
Matches: "color", "colour"
Does not match: "colouur", "colr"
Example 2: favou?rite
Matches: "favorite", "favourite"
Does not match: "favoourite", "favorit"
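The three quantifiers *, + and ? can be compared side by side. The helper function matches is made up for this sketch; it keeps only the words that fit the pattern completely.

```python
import re


def matches(pattern, words):
    """Return the words that match the whole pattern (hypothetical helper)."""
    return [w for w in words if re.fullmatch(pattern, w)]


# * allows zero or more "b"
star = matches(r"ab*c", ["ac", "abc", "abbbc", "bc"])
print(star)  # ['ac', 'abc', 'abbbc']

# + requires at least one "b"
plus = matches(r"ab+c", ["ac", "abc", "abbc"])
print(plus)  # ['abc', 'abbc']

# ? makes the "u" optional
optional = matches(r"colou?r", ["color", "colour", "colouur", "colr"])
print(optional)  # ['color', 'colour']
```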