8423 Ecap776 Programming in Python
ECAP776
Edited by
Dr. Ajay Bansal
Programming in Python
Content
Dr. Rajni Bhalla, Lovely Professional University
Unit 01: Python Basics
Objectives
After studying this unit, you will be able to:
Understand basic concepts about python
Learn installation steps in python
Learn control statements
Understand basic concepts of functions in python
Introduction
Python is a popular high-level, general-purpose programming language. Guido van Rossum
created it and first released it in 1991, and the Python Software Foundation now oversees its
development. Its syntax was designed with code readability in mind, which lets programmers
express their ideas in fewer lines of code.
Python is a programming language that enables quick work and more effective system integration.
Ways to start working with Python: 1. Finding an interpreter, 2. Windows, 3. Linux, 4. macOS.
1. Finding an interpreter:
a. Kaggle: hosts notebooks comparable to Jupyter Notebook and supports background execution.
b. Google Colab: requires only a Google account, similar to Kaggle. It offers both GPU and TPU
compute, but only the Pro version allows background execution.
c. Python.org: offers an online shell that behaves like running Python from the command line.
d. Programiz: an online Python compiler (interpreter) that lets programmers write and run
Python code in the browser. It provides an IDLE-like Python shell and accepts user input.
e. Online Python: a free online interpreter (compiler) for writing, running, and sharing Python
code. It is promoted as a fast and reliable way to run Python online.
f. OnlineGDB: an online IDE with a Python interpreter. It makes it quick and simple to run a
Python programme online, and it works with Python 3.
g. Replit: a website for running Python online and for interactive programming. Its name is
derived from the read-eval-print loop (REPL) found in Lisp and Python.
2. Windows: Python can be downloaded from https://www.python.org/. The installer includes
IDLE (Integrated Development and Learning Environment), one of many free tools that can be
used to run Python scripts.
3. Linux: Popular Linux distributions like Fedora and Ubuntu include Python by default. Enter
"python3" (or "python" on older systems) in the terminal emulator to see which version of Python
is installed. When it launches, the interpreter prints the version number.
4. macOS: Older macOS releases shipped with Python 2.7, but Apple removed it in macOS 12.3.
Python 3 must be installed manually from https://www.python.org/.
Python's syntax differs from various other programming languages in that it enables
programmers to construct applications with fewer lines of code.
Python operates on an interpreter system, allowing for the immediate execution of written
code. As a result, prototyping can proceed quickly.
Python can be used in a functional, object-oriented, or procedural manner.
The downloads page on python.org lists every Python version. Choose the version you need,
then click "Download." Let's say we go with Python version 3.9.1.
Upon clicking download, a variety of executable installers with varying operating system
requirements will be made available. Select the installer that best fits your operating system and
download it. Imagine that we choose the Windows installer (64 bits).
The download is less than 30MB in size.
The installation process begins when you click the Install Now button.
The installation procedure will take a few minutes to finish, and after it has, the screen below will
appear.
pip is the standard package manager for Python: it installs and manages Python software
packages. Therefore, confirm that it is set up.
To check whether pip was installed, follow these steps:
1. Launch the command window.
2. Type pip -V and press Enter.
If pip is successfully installed, the command prints the pip version and its install location.
Python and pip have now been successfully installed on our Windows PC.
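The same check can also be made from inside Python. The following is a minimal sketch, not part of the installer's own instructions; the variable names version and pip_available are chosen here for illustration.

```python
import importlib.util
import sys

# Version of the interpreter that is running this script.
version = sys.version_info
print("Python {}.{}.{}".format(version.major, version.minor, version.micro))

# find_spec() returns None when the named package cannot be found,
# so this reports whether pip is importable by this interpreter.
pip_available = importlib.util.find_spec("pip") is not None
print("pip available:", pip_available)
```

If pip is reported as unavailable, running the installer again and keeping the pip option selected is the usual remedy on Windows.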
Example                          Data type
x = 20                           int
x = 20.5                         float
x = 1j                           complex
x = range(6)                     range
x = True                         bool
x = b"Hello"                     bytes
x = bytearray(5)                 bytearray
x = memoryview(bytes(5))         memoryview
x = None                         NoneType
x = complex(1j)                  complex
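The data type of any value can be confirmed with the built-in type() function. The short sketch below checks the values from the table above; the names samples and type_names are illustrative only.

```python
# Values taken from the data type table above.
samples = [
    20,
    20.5,
    1j,
    range(6),
    True,
    b"Hello",
    bytearray(5),
    memoryview(bytes(5)),
    None,
]

# type(value).__name__ gives the name of the value's data type as a string.
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(repr(value), "->", name)
```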
Python divides the operators into the following groups:
Arithmetic operators
Assignment operators
Comparison operators
Logical operators
Identity operators
Membership operators
Bitwise operators
Arithmetic operators:
Operator   Name              Example
+          Addition          x + y
-          Subtraction       x - y
*          Multiplication    x * y
/          Division          x / y
%          Modulus           x % y
**         Exponentiation    x ** y
//         Floor division    x // y
Assignment operators:
Operator   Example    Same as
=          x = 5      x = 5
+=         x += 3     x = x + 3
-=         x -= 3     x = x - 3
*=         x *= 3     x = x * 3
/=         x /= 3     x = x / 3
%=         x %= 3     x = x % 3
//=        x //= 3    x = x // 3
**=        x **= 3    x = x ** 3
|=         x |= 3     x = x | 3
^=         x ^= 3     x = x ^ 3
Comparison operators:
Operator   Name         Example
==         Equal        x == y
!=         Not equal    x != y
Logical operators:
Operator   Description                                                Example
not        Reverses the result; returns False if the result is True   not (x < 5 and x < 10)
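A short sketch may help to see a few of these operators at work. The values of x, y, and total below are arbitrary choices for illustration.

```python
x, y = 7, 2

# Arithmetic operators
print(x + y)    # 9
print(x - y)    # 5
print(x * y)    # 14
print(x / y)    # 3.5 (division always returns a float)
print(x // y)   # 3   (floor division drops the fractional part)
print(x % y)    # 1   (remainder)
print(x ** y)   # 49

# Assignment operators update a variable in place.
total = 5
total += 3      # same as total = total + 3, total is now 8
total //= 3     # same as total = total // 3, total is now 2
total **= 3     # same as total = total ** 3, total is now 8

# Comparison and logical operators produce True or False.
is_equal = (x == y)                 # False
is_different = (x != y)             # True
negated = not (x < 5 and x < 10)    # x < 5 is False, so the result is True
print(is_equal, is_different, negated)
```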
Break Statement
Continue Statement
Pass Statement
Break Statement
In Python, the break statement ends the loop that contains it and transfers control to the first
statement after that loop. It works with both while and for loops. In nested loops (a loop inside
another loop), break ends only the innermost loop that contains it, and control is transferred to the
enclosing outer loop.
Continue statement
When a Python programme reaches a continue statement, it skips the rest of the current iteration
and lets the loop carry on with the next iteration. Unlike break, it does not end the loop. It is usually
placed inside a condition so that only selected iterations are skipped.
Pass statement
The pass statement is a null statement: when it is executed, nothing happens. Control simply moves
on to the next statement without stopping the loop or skipping any steps.
A programmer can use the pass statement as a placeholder to prevent the interpreter from raising
an error when the body of a loop, function, or class is left empty.
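The three control statements described above can be sketched together as follows. The loops and the names visited and pairs are illustrative and not taken from the unit.

```python
visited = []
for number in range(1, 10):
    if number % 2 == 0:
        continue            # skip even numbers, carry on with the next iteration
    if number > 7:
        break               # leave the loop at the first odd number above 7
    visited.append(number)
print(visited)              # [1, 3, 5, 7]

# In nested loops, break ends only the inner loop that contains it.
pairs = []
for outer in range(3):
    for inner in range(3):
        if inner == 1:
            break           # the outer loop still runs all three times
        pairs.append((outer, inner))
print(pairs)                # [(0, 0), (1, 0), (2, 0)]

# pass does nothing; it fills a loop body that would otherwise be empty.
for _ in range(3):
    pass
```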
Creating a Function
In Python a function is defined using the def keyword:
Example:
def my_function():
    # illustrative body; any statements can go here
    print("Hello from a function")
my_function()
Arguments
Functions accept arguments that can contain data.
The function name is followed by parentheses that list the arguments. Separate the arguments
with commas to add as many as you like.
The function in the following example takes only one argument (fname). A word is passed to the
function when it is called, and the function prints it followed by " Application":
Example:
def my_function(fname):
    print(fname + " Application")
my_function("Computer")
my_function("Science")
my_function("System")
Args is a common abbreviation for arguments in Python documentation.
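A function with more than one argument can be sketched as follows. The function describe and its parameter subject are hypothetical names introduced here; the default value shows one common way to make an argument optional.

```python
def describe(fname, subject="Application"):
    """Return the combined text so that the caller decides what to do with it."""
    return fname + " " + subject


# Positional arguments are matched in order.
print(describe("Computer"))                # Computer Application
print(describe("Computer", "Science"))     # Computer Science

# Keyword arguments can be given in any order.
print(describe(subject="System", fname="Operating"))   # Operating System
```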
Summary
Python is a popular high-level, general-purpose programming language. Guido van Rossum
invented it in 1991, and the Python Software Foundation continued to advance it
Web applications can be developed on a server using Python.
Python is cross-platform compatible (Windows, Mac, Linux, Raspberry Pi, etc).
Indentation, which utilises whitespace, is how Python defines scope, including the scope of
loops, functions, and classes. Curly-brackets are frequently used for this in other computer
languages.
Operations on variables and values are carried out using operators.
Loops are used in Python to repeatedly execute a section of code. Control statements
are used to modify a loop's execution from its default behaviour.
The break statement ends the loop that contains it and transfers control to the statement
after the loop.
The continue statement skips the rest of the current iteration and lets the loop carry on
with the next iteration.
The pass statement is a null statement; the programmer uses it where a statement is
required but nothing should happen.
Keywords
Python: A widely used general-purpose, interactive, object-oriented, high-level programming
language.
PYTHONPATH: An environment variable with a role similar to PATH. It tells the Python interpreter
where to locate the module files imported into a program.
PYTHONSTARTUP: An environment variable that holds the path of a Python initialization file. The
file is executed every time the interactive interpreter is launched.
Unix: IDLE is the original Python IDE for Unix.
Windows: PythonWin is the first Windows interface for Python; it is an IDE with a GUI.
Macintosh: You can get the Macintosh version of Python and the IDLE IDE from the main website
in MacBinary or BinHex format.
Reserved Words: You cannot use them as identifier names for constants, variables, or anything
else.
Python Numbers: Number data types store numeric values.
Python Strings: Python defines strings as a contiguous group of characters that are enclosed in
quotation marks.
Python Lists: Of all the compound data types in Python, lists are the most flexible. Items in a list
are enclosed in square brackets ([]) and separated by commas.
Python Tuples: Another sequence data type that resembles the list is the tuple. A tuple is made up
of several values that are separated by commas.
Python Dictionaries: The dictionaries used by Python are something like hash tables. They
consist of key-value pairs and operate similarly to associative arrays or hashes seen in Perl.
Self Assessment
Q1: Who was the Python programming language's creator?
A. Mark
B. Guido van Rossum
C. Alfred novel
D. Ralf Kleinberg
A. Structural programming
B. Object-oriented programming
C. Functional programming
D. All of the above
A. .ps
B. .pyth
C. .py
D. .thon
A. Key
B. Brackets
C. Indentation
D. All of the above
Q6: Which of the subsequent characters is used in Python to provide single-line comments?
A. //
B. #
C. /*
D. */
A. python -version
B. python -v
C. python -V
D. None of above
Q10: In Python programming, which of the following is not a basic data type?
A. Dictionary
B. Class
C. Tuple
D. Lists
Q11: Which of the following statements in Python is used to create an empty set?
A. Empty(a)
B. {}
C. set()
D. None of above
A. try
B. except
C. finally
D. All of the above
Q14: Which of the following is a legitimate Python escape sequence?
A. \n
B. \t
C. \\
D. All of the above
A. C++
B. C
C. Java
D. None of these
6. B 7. C 8. D 9. B 10. B
Review Questions
1. Write down the steps to download and install Python.
2. Write down the challenges faced while installing Python.
3. Explain all Python data types.
4. Explain control statements in Python.
5. What is the difference between lists and tuples in Python?
6. What is Python? What are the benefits of using Python?
Dr. Rajni Bhalla, Lovely Professional University
Unit 02: Python Data Structures
Objectives
After this unit, student would be able to:
Introduction
In Python, strings are enclosed in either single or double quotation marks.
'hello' is the same as "hello".
A string literal can be displayed with the print() function:
Example:
a = "Hello, World!"
print(a)
Example:
Get the character at position 1 (remember that the first character has the position 0):
a = "Hello, World!"
print(a[1])
Example:
Loop through the letters of a string:
for x in "apple":
    print(x)
Example:
The len() function returns the length of a string:
a = "Hello, World!"
print(len(a))
Output:
13
Check if NOT
The keyword not in can be used to determine whether a specific word or character DOES NOT exist
in a string.
Example: Verify that the following text DOES NOT contain the word "file":
txt = "Like many other popular programming languages, strings in Python are arrays of bytes representing Unicode characters."
if "file" not in txt:
    print("No, 'file' is NOT present.")
Output:
No, 'file' is NOT present.
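Example:
Get the characters from position 2 to the end of the string:
b = "Welcome, Students!"
print(b[2:])
Output:
lcome, Students!
Example:
Use negative indexes to start the slice from the end of the string:
b = "Welcome, Students!"
print(b[-5:-2])
Output:
ent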
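Indexing, slicing, and the in and not in keywords can be combined in one short sketch. It reuses the string from the examples above; the names has_word and missing are illustrative.

```python
b = "Welcome, Students!"

print(len(b))      # 18 characters, indexed 0 to 17
print(b[0])        # W  (first character)
print(b[-1])       # !  (last character)
print(b[:7])       # Welcome  (from the start up to, but not including, index 7)
print(b[9:17])     # Students (index 17 itself is excluded)

# Membership tests return True or False.
has_word = "Students" in b
missing = "file" not in b
print(has_word, missing)   # True True
```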
Python-Modify Strings
You can use a variety of built-in methods on strings in Python.
2.9 Uppercase
The upper() method returns the string in upper case.
Example:
a = "Welcome, Students!"
print(a.upper())
Output:
WELCOME, STUDENTS!
2.10 Lowercase
The lower() method returns the string in lower case.
Example:
a = "Welcome, Students!"
print(a.lower())
Output:
welcome, students!
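Example:
The strip() method removes whitespace from the beginning and the end of a string:
a = " Welcome, Students! "
print(a.strip())
Output:
Welcome, Students!
Example:
The replace() method replaces one substring with another:
a = "Welcome, Students!"
print(a.replace("W", "J"))
Output:
Jelcome, Students!
Example:
The split() method splits the string at the given separator and returns a list:
a = "Welcome, Students!"
b = a.split(",")
print(b)
Output:
['Welcome', ' Students!']
Example:
The + operator concatenates two strings:
a = "Welcome"
b = "Students"
c = a + b
print(c)
Output:
WelcomeStudents
Example:
To put a space between the two strings, add " ":
a = "Welcome"
b = "Students"
c = a + " " + b
print(c)
Output:
Welcome Students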
The format() method inserts numbers and other values into a string at the positions marked by
{} placeholders.
There is no limit to the number of arguments that can be passed to the format() method; each
argument is placed into its respective placeholder.
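A brief sketch of format() follows. The values price and quantity and the sentence itself are made up for illustration.

```python
price = 49
quantity = 3

# Each {} is filled by the next argument, in order.
order = "I want {} pieces of item {} for {} dollars.".format(quantity, 567, price)
print(order)       # I want 3 pieces of item 567 for 49 dollars.

# Index numbers inside the braces choose which argument goes where,
# so an argument can be reused.
repeated = "{0} and {1}, then {0} again".format("a", "b")
print(repeated)    # a and b, then a again

# A format specification after the colon controls how a number is shown.
two_decimals = "{:.2f}".format(price)
print(two_decimals)   # 49.00
```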
Example:
To use double quotes inside a string that is itself enclosed in double quotes, escape them with a
backslash:
txt = "String is a collection of \"alphabets\" words or other characters."
print(txt)
Output:
String is a collection of "alphabets" words or other characters.
Escape Characters
Python also uses the following escape symbols:
Code Result
\\ Backslash
\n New Line
\r Carriage Return
\t Tab
\b Backspace
\f Form Feed
2.16 Python-Lists
Lists
Multiple elements can be kept in a single variable by using lists.
One of the four built-in data types in Python for storing data collections is the list; the other three
are the tuple, set, and dictionary, each of which has a unique purpose.
Square brackets are used to form lists.
Example:
thislist = ["onion", "tomato", "brinjal"]
print(thislist)
Output:
['onion', 'tomato', 'brinjal']
List Items
List items are ordered, changeable, and allow duplicate values.
List items are indexed: the first item has the index [0], the second item has the index [1], etc.
Ordered
When we refer to a list as being ordered, we mean that the items have a defined order, and that
order will not change.
If you add new items to a list, they are placed at the end of the list.
Changeable
The list is changeable, which means that after it has been created, we can change, add, and remove
items.
Allow Duplicates
Since lists are indexed, they can contain items with the same value.
List Length
The len() function can be used to count the number of elements in a list.
Example: List items can be of any data type:
list1 = ["onion", "tomato", "brinjal"]
list2 = [2, 3, 4, 5, 6]
list3 = [True, False, False]
Output (when each list is printed):
['onion', 'tomato', 'brinjal']
[2, 3, 4, 5, 6]
[True, False, False]
Example: A list can contain different data types. The following list holds strings, integers, and a
boolean value:
list1 = ["def", 36, False, 42, "male"]
print(list1)
Output:
['def', 36, False, 42, 'male']
Updating Lists
By providing an index or a slice on the left-hand side of the assignment operator, you can change
one or more list elements. You can also add elements to a list by using the append() method.
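Updating a list by index, by slice, and with append() can be sketched as follows. The replacement vegetable names are arbitrary.

```python
thislist = ["onion", "tomato", "brinjal"]

# Change a single element through its index.
thislist[1] = "potato"
print(thislist)    # ['onion', 'potato', 'brinjal']

# Change several elements at once through a slice.
# The replacement may be longer or shorter than the slice it replaces.
thislist[0:2] = ["carrot", "pea", "bean"]
print(thislist)    # ['carrot', 'pea', 'bean', 'brinjal']

# append() adds one element to the end of the list.
thislist.append("okra")
print(thislist)    # ['carrot', 'pea', 'bean', 'brinjal', 'okra']
```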
Python includes the following built-in functions for lists:
1 cmp(list1, list2)
Compares elements of both lists. (Python 2 only; cmp() was removed in Python 3.)
2 len(list)
Gives the total length of the list.
3 max(list)
Returns item from the list with max value.
4 min(list)
Returns item from the list with min value.
5 list(seq)
Converts a tuple (or any other sequence) into a list.
Python lists provide the following methods:
1 list.append(obj)
Appends object obj to list
2 list.count(obj)
Returns count of how many times obj occurs in list
3 list.extend(seq)
Appends the contents of seq to list
4 list.index(obj)
Returns the lowest index in list at which obj appears
5 list.insert(index, obj)
Inserts object obj into list at offset index
6 list.pop([index])
Removes and returns the item at index, or the last item if index is omitted
7 list.remove(obj)
Removes the first occurrence of object obj from list
8 list.reverse()
Reverses objects of list in place
9 list.sort(key=None, reverse=False)
Sorts objects of list in place, using the key function if given
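Several of the list functions and methods listed above can be tried in sequence, as in the sketch below. It assumes Python 3, and the numbers are arbitrary.

```python
numbers = [4, 1, 3, 1]

numbers.extend([9, 2])        # [4, 1, 3, 1, 9, 2]
numbers.insert(0, 7)          # [7, 4, 1, 3, 1, 9, 2]

count = numbers.count(1)      # 2, the value 1 occurs twice
idx = numbers.index(3)        # 3, the position of the first 3
last = numbers.pop()          # removes and returns 2, the last item
numbers.remove(1)             # removes only the first 1 -> [7, 4, 3, 1, 9]

numbers.sort()                # [1, 3, 4, 7, 9]
numbers.reverse()             # [9, 7, 4, 3, 1]

print(numbers, len(numbers), max(numbers), min(numbers))

# sort() accepts a key function; here strings are ordered by their length.
words = ["brinjal", "pea", "onion"]
words.sort(key=len)
print(words)                  # ['pea', 'onion', 'brinjal']

# list() converts a tuple into a list.
converted = list((1, 2, 3))
```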
2.17 Python-Tuples
A tuple is an ordered, unchangeable collection of items. Tuples and lists are both sequences. They
differ in two ways: tuples cannot be altered, whereas lists can, and tuples use parentheses, whereas
lists use square brackets.
A tuple is created simply by writing values separated by commas. You may also choose to enclose
these comma-separated values in parentheses. For instance:
tpl1 = ('fruits', 'vegetables', 2003, 2005)
tpl2 = (5, 6, 7, 8, 9)
tpl3 = "d", "e", "f", "g", "h"
The empty tuple is written as a pair of parentheses containing nothing.
tpl1 = ()
Even when a tuple contains only one value, you must still include a comma.
tpl1 = (50,)
Like string indices, tuple indices begin at 0, and tuples can be sliced, concatenated, and so on.
Updating Tuples
Due to their immutability, tuples cannot be updated, and their element values cannot be changed.
As the following example shows, you can instead build new tuples from pieces of existing tuples:
tpl1 = (35, 45.56)
tpl2 = ('apple', 'orange')
# So, let's create a new tuple as follows
tpl3 = tpl1 + tpl2
print(tpl3)
Output:
(35, 45.56, 'apple', 'orange')
No Enclosing Delimiters
Any group of multiple objects that is comma-separated and written without identifying symbols
(such as square brackets for lists or parentheses for tuples) defaults to a tuple.
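The tuple rules above (no enclosing delimiters, the trailing comma, and immutability) can be sketched as follows. All names and values are illustrative.

```python
# Comma-separated values without brackets are packed into a tuple.
point = 3, 4
print(type(point).__name__)      # tuple

# A tuple can be unpacked into separate variables.
x, y = point

# The trailing comma, not the parentheses, makes a one-item tuple.
single = (50,)
not_a_tuple = (50)               # just the integer 50

# Tuples are immutable: assigning to an element raises TypeError.
try:
    point[0] = 10
except TypeError as error:
    message = str(error)
    print("Cannot change a tuple:", message)

# Concatenation builds a new tuple and leaves the originals unchanged.
combined = point + single
print(combined)                  # (3, 4, 50)
```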
2.18 Python-Dictionary
Each key is separated from its value by a colon (:), the items are separated by commas, and the
entire structure is enclosed in curly brackets. An empty dictionary, with no items at all, is written
with just two curly brackets: {}.
Keys are unique within a dictionary, while values need not be. A dictionary's keys must be of an
immutable data type such as strings, numbers, or tuples, while its values can be of any type.
If we try to retrieve a data item using a key that is not in the dictionary, we get the following
error:
dict = {'Name': 'Alisha', 'Age': 11, 'Class': 'Fifth'}
print("dict['Aalya']: ", dict['Aalya'])
Output:
Traceback (most recent call last):
  File "./prog.py", line 2, in <module>
KeyError: 'Aalya'
Updating Dictionary
You can change a dictionary by adding a new entry (a key-value pair), modifying an existing
entry, or deleting an existing entry.
There are two important points to remember about dictionary keys:
(a) A key may not appear more than once; duplicate keys are not permitted. When a key is
assigned more than once, the most recent assignment wins.
(b) Keys must be immutable. In other words, you can use strings, numbers, or tuples as dictionary
keys, but something like ['key'] is not allowed.
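Updating a dictionary and the two rules about keys can be sketched as follows. The variable is called student rather than dict so that the built-in name is not hidden, and the school name is a made-up value.

```python
student = {"Name": "Alisha", "Age": 11, "Class": "Fifth"}

student["Age"] = 12                     # modify an existing entry
student["School"] = "Example School"    # add a new entry (hypothetical value)
del student["Class"]                    # delete an existing entry
print(student)

# (a) When a key is repeated, the most recent assignment wins.
duplicate = {"Name": "Alisha", "Name": "Aalya"}
print(duplicate)                        # {'Name': 'Aalya'}

# (b) A mutable object such as a list cannot be used as a key.
try:
    bad = {["key"]: 1}
except TypeError as error:
    error_message = str(error)
    print("Invalid key:", error_message)

# A tuple is immutable, so it is allowed as a key.
by_position = {(0, 0): "origin"}
```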
Python includes the following built-in functions for dictionaries:
1 len(dict)
Gives the total length of the dictionary. This would be equal to the number of items in
the dictionary.
2 str(dict)
Produces a printable string representation of a dictionary
3 type(variable)
Returns the type of the passed variable. If the passed variable is a dictionary, then it
returns the dictionary type.
Python dictionaries provide the following methods:
1 dict.copy()
Returns a shallow copy of dictionary dict
2 dict.fromkeys(seq[, value])
Creates a new dictionary with keys from seq and values set to value.
3 dict.get(key, default=None)
For key key, returns value or default if key not in dictionary
4 dict.has_key(key)
Returns true if key in dictionary dict, false otherwise. (Python 2 only; in Python 3 write
key in dict instead.)
5 dict.items()
Returns a view of dict's (key, value) tuple pairs
6 dict.keys()
Returns a view of dictionary dict's keys
7 dict.setdefault(key, default=None)
Similar to get(), but will set dict[key]=default if key is not already in dict
8 dict.update(dict2)
Adds dictionary dict2's key-values pairs to dict
9 dict.values()
Returns a view of dictionary dict's values
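A few of the dictionary methods listed above are sketched below under Python 3. The data reuses the earlier example, while the key "Section" and the default values are illustrative.

```python
student = {"Name": "Alisha", "Age": 11, "Class": "Fifth"}

# get() returns a default instead of raising KeyError for a missing key.
missing = student.get("Aalya")                  # None
fallback = student.get("Aalya", "unknown")      # 'unknown'

# The in keyword replaces the old has_key() method.
has_name = "Name" in student                    # True

# keys(), values() and items() return views; list() turns them into lists.
keys = list(student.keys())
values = list(student.values())
pairs = list(student.items())

# setdefault() inserts the key only when it is not present yet.
section = student.setdefault("Section", "A")

# update() merges another dictionary into this one.
student.update({"Age": 12})

# copy() makes a shallow copy; fromkeys() builds a new dictionary.
backup = student.copy()
counters = dict.fromkeys(["a", "b"], 0)

print(len(student), str(student), type(student))
```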
Summary
Python's strings, like those of many other widely used programming languages, are
sequences of Unicode characters.
Use the len() function to determine a string's length.
The keyword in can be used to determine whether a specific word or character is present
in a string.
The keyword not in can be used to determine whether a specific word or character DOES
NOT exist in a string.
Use the + operator to concatenate, or merge, two strings.
Python lists are similar to the dynamically sized arrays of other languages (such as vector
in C++ and ArrayList in Java). A list is a group of items enclosed in square brackets ([])
and separated by commas.
Refer to the index number to access list items. To retrieve a specific item, place its index
inside the index operator [].
In Python, negative indexes represent positions counted from the end of the sequence.
The built-in append() method adds elements to a list. It can add only one element at a
time; to add many elements with append(), use a loop.
Python is a popular high-level, general-purpose programming language. Guido van
Rossum invented it in 1991, and the Python Software Foundation continued to advance it
Web applications can be developed on a server using Python.
Keywords
Token:The smallest discrete unit in a Python programme is called a token. Tokens are used to
construct each statement and instruction in a programme.
Keywords:In a computer language, keywords are words that have a particular importance or
meaning. They cannot be utilised for any arbitrary reason, including as names for functions or
variables.
Identifiers:The names assigned to any variable, function, class, list, methods, etc. for identification
are known as identifiers.
Reserved Words:You cannot use them as identifier names for constants, variables, or anything
else.
Python Numbers: Number data types store numeric values.
Python Strings:Python defines strings as a contiguous group of characters that are enclosed in
quotation marks.
Literals or Values:The fixed values or data items used in a source code are known as literals.
Python Lists: Of all the compound data types in Python, lists are the most flexible. Items in a list
are enclosed in square brackets ([]) and separated by commas.
Python Tuples: Another sequence data type that resembles the list is the tuple. A tuple is made up
of several values that are separated by commas.
Python Dictionaries: The dictionaries used by Python are something like hash tables. They
consist of key-value pairs and operate similarly to associative arrays or hashes seen in Perl.
Dictionary: Key-value pairs are stored in dictionaries. Each value is stored under a key so that it
can be looked up efficiently.
Review Questions
Q1. What will be the output of the following Python code?
str1="8/2"
print("str1")
print(str1)
A. str1
B. str1 8/2
C. str1 4.0
D. str1
A. od Mor
B. od Morn
C. odMor
D. oodMor
A. This Is my File
B. ThIsis my File
C. ThIs Is my FIle
D. ThIs Is my File
A. 81367
B. Error
C. 1367
D. 8 Error
A. 13
B. 14
C. 15
D. 16
print(x)
print(y)
print(z)
A. 4 4 4.0
B. 4.0 4 4
C. 4 4.0 4
D. None of above
A. 4
B. 5
C. 3
D. 9
A. <class 'list'>
B. <class 'tuple'>
C. <class 'boolean'>
D. None of above
A. {'apple', 'banana', 'cherry', 'apple'}
B. {'banana', 'cherry', 'apple', 'apple'}
C. {'banana', 'apple'}
D. {'banana', 'cherry', 'apple'}
A. D1601
B. Error
C. String not allowed
D. Type incompatible
A. Green
B. Burnt sienna
C. Blue
D. Red
A. del dict
B. Del dictionary
C. rmvdict
D. remove dictionary
6. C 7. A 8. C 9. B 10. D
Dr. Rajni Bhalla, Lovely Professional University
Unit 03: OOP Concepts
Objectives
After studying this unit, you will be able to:
Introduction
Object-oriented programming (OOPs) is a programming style in Python that makes use of
objects and classes. It seeks to bring real-world concepts such as inheritance, polymorphism, and
encapsulation into programming. The fundamental idea behind OOPs is to bind together the data
and the functions that work on it as a single unit, so that no other part of the code can access this
data.
The core ideas of object-oriented programming (OOPs) are:
Class
Object
Method
Inheritance
Polymorphism
Data Abstraction
Encapsulation
3.1 Class
A class is a blueprint for a group of related objects: it holds the model or prototype from which
objects are created. It is a logical entity with attributes and methods. Consider the following
scenario to better appreciate the need for classes: suppose you needed to keep track of a number
of dogs that have various characteristics, such as breed or age. If a list is used, the dog's breed
and age might be the first and second elements, respectively. What if there were 100 different
dogs? How would you know which element is supposed to be which? What if you wanted to give
these dogs additional traits? A list lacks this organisation, and that is exactly the need that classes
address.
A few notes on the Python class:
3.2 Objects
An object is an entity that has a state and behaviour associated with it. It may be any real-world
object, such as a mouse, keyboard, chair, table, or pen. Integers, strings, floating-point numbers,
and even arrays and dictionaries are all objects. More specifically, any single integer or any single
string is an object. The number 12 is an object, the string "Hello, world" is an object, a list is an
object that can hold other objects, and so on. You have been using objects all along, perhaps
without realising it.
An object consists of:
State: It is represented by the attributes of an object. It also reflects the properties of an object.
Behaviour: It is represented by the methods of an object. It also shows how one object responds to
other objects.
Identity: It gives a unique name to an object and enables one object to interact with other objects.
Let's look at the example of the class dog to better understand the state, behaviour, and identity
(explained above).
Breed, age, and colour of the dog are examples of states or attributes.
You may infer from the behaviour whether the dog is eating or sleeping.
a. The self
Class methods must have an extra first parameter in the method definition.
We do not supply a value for this parameter when we call the method; Python provides it.
Even a method that takes no arguments must still have this one parameter.
This is comparable to the this reference in Java and the this pointer in C++.
By convention this parameter is named self. When we call a method of an object as
myobject.method(arg1, arg2), Python automatically converts the call to MyClass.method(myobject,
arg1, arg2).
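The conversion described above can be observed directly. In this sketch the class Dog, its speak method, and the argument "woof" are illustrative; both calls produce the same result.

```python
class Dog:
    def __init__(self, name):
        self.name = name

    def speak(self, sound):
        # self refers to the object the method was called on.
        return "{} says {}".format(self.name, sound)


rodger = Dog("Rodger")

# Usual form: Python passes rodger as self automatically.
via_instance = rodger.speak("woof")

# Equivalent explicit form: the object is passed as the first argument.
via_class = Dog.speak(rodger, "woof")

print(via_instance)
print(via_instance == via_class)
```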
b. The __init__ method
The __init__ method is comparable to the constructors in Java and C++. It is executed as soon as an
object of a class is created. The method is useful for any initialisation you want to perform on your
object. Let's define a class and create some objects using self and the __init__ method.
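Example 1: Class and object creation using class and instance attributes
class Dog:
    # class attribute
    attr1 = "mammal"
    # Instance attribute
    def __init__(self, name):
        self.name = name
# Driver code
# Object instantiation
Rodger = Dog("Rodger")
Tommy = Dog("Tommy")
# Accessing class attributes
print("Rodger is a {}".format(Rodger.__class__.attr1))
print("Tommy is also a {}".format(Tommy.__class__.attr1))
# Accessing instance attributes
print("My name is {}".format(Rodger.name))
print("My name is {}".format(Tommy.name))
Output
Rodger is a mammal
Tommy is also a mammal
My name is Rodger
My name is Tommy
Example 2: Class and object creation using methods
class Dog:
    # class attribute
    attr1 = "mammal"
    # Instance attribute
    def __init__(self, name):
        self.name = name
    def speak(self):
        print("My name is {}".format(self.name))
# Driver code
# Object instantiation
Rodger = Dog("Rodger")
Tommy = Dog("Tommy")
# Accessing class methods
Rodger.speak()
Tommy.speak()
Output
My name is Rodger
My name is Tommy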
3.3 Methods
A function connected to an object is the method. A method is not specific to class instances in
Python. Any sort of object may have methods.
3.4 Inheritance
The capacity of one class to derive or inherit properties from another class is known as inheritance.
The class from which the properties are derived is referred to as the base class or parent class,
and the class that derives the properties is referred to as the derived class or child
class. The advantages of inheritance include:
Types of Inheritance
Single Inheritance
Single inheritance enables a derived class to inherit properties from a single parent class.
Multilevel Inheritance
Multilevel inheritance enables a derived class to inherit properties from an immediate parent
class, which in turn inherits properties from its own parent class.
Hierarchical Inheritance
Hierarchical inheritance enables more than one derived class to inherit properties from the same
parent class.
Multiple Inheritance
Multiple inheritance enables one derived class to inherit properties from more than one base
class.
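The four types of inheritance can be sketched with one small family of classes. The class names below (Animal, Mammal, Dog, Cat, Swimmer, Otter) are hypothetical and chosen only to illustrate the relationships.

```python
class Animal:
    def breathe(self):
        return "breathing"


class Mammal(Animal):            # single inheritance: one parent class
    def feed(self):
        return "feeding milk"


class Dog(Mammal):               # multilevel: Dog -> Mammal -> Animal
    def speak(self):
        return "woof"


class Cat(Mammal):               # hierarchical: Dog and Cat share the parent Mammal
    def speak(self):
        return "meow"


class Swimmer:
    def swim(self):
        return "swimming"


class Otter(Mammal, Swimmer):    # multiple inheritance: two base classes
    pass


print(Dog().breathe())           # inherited from Animal through Mammal
print(Otter().feed(), Otter().swim())

# The method resolution order shows where Python looks for attributes.
mro_names = [cls.__name__ for cls in Otter.__mro__]
print(mro_names)
```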
Example: Python inheritance
# Python code to demonstrate how parent constructors
# are called.
# parent class
class Person(object):
    # __init__ is known as the constructor
    def __init__(self, name, idnumber):
        self.name = name
        self.idnumber = idnumber
    def display(self):
        print(self.name)
        print(self.idnumber)
    def details(self):
        print("My name is {}".format(self.name))
        print("IdNumber: {}".format(self.idnumber))
# child class
class Employee(Person):
    def __init__(self, name, idnumber, salary, post):
        self.salary = salary
        self.post = post
        # invoking the __init__ of the parent class
        Person.__init__(self, name, idnumber)
    def details(self):
        print("My name is {}".format(self.name))
        print("IdNumber: {}".format(self.idnumber))
        print("Post: {}".format(self.post))
# creating an object of Employee (the salary value is illustrative)
a = Employee('Rahul', 886012, 200000, "Intern")
# calling methods through the Employee object
a.display()
a.details()
Output
Rahul
886012
My name is Rahul
IdNumber: 886012
Post: Intern
In the example above, two classes have been created: Person (the parent class) and Employee
(the child class). The Employee class inherits from the Person class. As the display() call in the
code above shows, we can use the methods of the Person class through an Employee object. The
details() method shows how a child class can also modify the parent class's behaviour.
3.5 Polymorphism
Simply put, polymorphism means having multiple forms. For instance, utilising polymorphism, we
can answer the question of whether the given species of birds fly or not using just one function.
Example: Polymorphism in Python
class Bird:
    def intro(self):
        print("There are many types of birds.")
    def flight(self):
        print("Most of the birds can fly but some cannot.")
class sparrow(Bird):
    def flight(self):
        print("Sparrows can fly.")
class ostrich(Bird):
    def flight(self):
        print("Ostriches cannot fly.")
obj_bird = Bird()
obj_spr = sparrow()
obj_ost = ostrich()
obj_bird.intro()
obj_bird.flight()
obj_spr.intro()
obj_spr.flight()
obj_ost.intro()
obj_ost.flight()
OUTPUT
There are many types of birds.
Most of the birds can fly but some cannot.
There are many types of birds.
Sparrows can fly.
There are many types of birds.
Ostriches cannot fly.
3.6 Data Abstraction
Abstraction hides internal details and shows only the functionality. Abstracting something means
giving it a name that captures the essence of what a function or an entire programme does.
3.7 Encapsulation
Encapsulation is one of the core ideas in object-oriented programming (OOP). It describes the idea
of wrapping data, together with the methods that operate on that data, into a single unit. This
restricts direct access to variables and methods and can prevent accidental modification of data.
To prevent accidental change, an object's variable can only be changed by one of the object's
methods. Such variables are known as private variables.
A class is an example of encapsulation, since it encapsulates all of its data: member functions,
variables, and so on.
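# Python program to
# demonstrate private members

# Base class (definition reconstructed; the value of __c is an assumption)
class Base:
    def __init__(self):
        self.a = "EcontentOnline"
        self.__c = "EcontentOnline"  # private member

# Derived class
class Derived(Base):
    def __init__(self):
        # Calling constructor of
        # Base class
        Base.__init__(self)
        print("Calling private member of base class: ")
        print(self.__c)  # raises AttributeError: __c is private to Base

# Driver code
obj1 = Base()
print(obj1.a)
Output
EcontentOnline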
In the example above, __c was created as a private attribute. Code outside the class cannot
directly read or modify the value of this attribute through the name __c, because Python stores it
under the mangled name _Base__c.
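The behaviour of private attributes can be illustrated with a minimal sketch. The Account class and its method names are hypothetical and are not part of the original example; the sketch only assumes Python's standard name mangling for attributes that start with two underscores.

```python
# Hypothetical class used only to illustrate private attributes.
class Account:
    def __init__(self, balance):
        # Stored internally as _Account__balance (name mangling)
        self.__balance = balance

    def deposit(self, amount):
        # Methods of the class can use the private name directly
        self.__balance += amount

    def get_balance(self):
        return self.__balance


acct = Account(100)
acct.deposit(50)
print(acct.get_balance())

# Reading the private name from outside the class fails
try:
    print(acct.__balance)
except AttributeError as err:
    print("AttributeError:", err)

# The mangled name still exists, so privacy is a convention, not a lock
print(acct._Account__balance)
```

The design point is that callers go through deposit() and get_balance(), so the class controls every change to the balance.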
Object-oriented programming | Procedural programming
It makes development and upkeep simpler. | When a project grows in scope, maintaining the code is difficult in procedural programming.
It models real-world entities, so OOP makes it simple to tackle real-world problems. | It does not model the real world. It works with step-by-step instructions broken down into smaller units called functions.
C++, Java, .NET, Python, and C# are examples of object-oriented programming languages. | Procedural languages include C, Fortran, Pascal, VB, and others.
Summary
Object-oriented programming (OOP) is a programming style in Python that makes use of objects
and classes.
A class is a blueprint or prototype from which objects are created; it groups related data and
behaviour.
An object is an entity that has state and behaviour. It may model any physical item, such as a
mouse, keyboard, chair, table, or pen.
The __init__ method is comparable to constructors in Java and C++. It is executed as soon as an
object of the class is created.
A method is a function associated with an object. In Python, methods are not unique to class
instances; any type of object may have methods.
Inheritance is the capacity of one class to derive or inherit properties from another class.
Polymorphism means having multiple forms. For instance, using polymorphism, a single function
can answer whether a given species of bird can fly.
Data abstraction and encapsulation are frequently used interchangeably. Because data
abstraction is achieved through encapsulation, the two terms are almost synonymous.
Encapsulation is one of the core ideas in object-oriented programming (OOP). It describes the
wrapping of data and the methods that operate on that data into a single unit.
Object-oriented programming is an approach to problem solving that uses objects for
computation.
Procedural programming uses a list of instructions to perform computations step by step.
Keywords
OOP: OOP stands for object-oriented programming. Procedural programming involves writing
procedures or functions that perform actions on data, whereas object-oriented programming
involves constructing objects that contain both data and methods.
The __init__ method: All classes have a function called __init__(), which is always executed when
an object of the class is being created.
The __str__ function: The __str__() function determines what is returned when the class object is
rendered as a string.
Object methods: Objects can also contain methods. Methods are functions that belong to the
object.
Self parameter: The self parameter is a reference to the current instance of the class and is used
to access variables that belong to that instance.
Inheritance: By using inheritance, we may create a class that has all the methods and
attributes of another class.
Parent class: The class being inherited from, also known as the base class.
Child class: The class that inherits from another class, also known as the derived class.
Super function: The super() function gives a child class access to its parent's methods and
properties.
Self Assessment
Q1. Which option best encapsulates inheritance?
A. A.__init__(self)
B. B.__init__(self)
C. A.__init__(B)
D. B.__init__(A)
Q4: What function type is a built-in in the context of classes?
A. Double-level
B. Multi-level
C. Single-level
D. Multiple
A. Encapsulation
B. Inheritance
C. Instantiation
D. Polymorphism
A. The capacity of a class to derive individuals from other classes as part of its own definition.
B. Techniques for combining instance variables and methods to limit access to specific class
members
C. focuses on supplying parameters to functions and variables.
D. enables the use of sophisticated software that is well-designed and flexible.
A. Albert Einstein
B. Guido Van Rossum
C. Guido Evan
D. None of these
A. 1975
B. 1989
C. 1972
D. 1990
Q11: Which of the following has the highest precedence in an expression?
A. Addition
B. Subtraction
C. Parentheses
D. Power
A. Object
B. Logical
C. Real
D. Hypothetical
Q15: Which of the following can be considered a combination of data abstraction and
programming?
A. Class
B. Object
C. Inheritance
D. Interfaces
6. C 7. B 8. B 9. B 10. B
Review Questions
1. What do you understand by OOP? Write the code to create an empty Python class.
2. Define objects. Write an example that creates an object with methods.
3. What do you understand by inheritance? Also define the types of inheritance.
4. Write down the differences between single-level inheritance, multilevel inheritance, and
multiple inheritance.
5. Define polymorphism. Write Python code that demonstrates the use of polymorphism.
6. What do you understand by encapsulation? Write a Python program to demonstrate
private members.
7. Write down the differences between object-oriented programming and procedural
programming.
Further Readings
Mark Lutz, Programming Python: Powerful Object-Oriented Programming, O'Reilly
Wes McKinney, Python for Data Analysis, O'Reilly
David Ascher and Mark Lutz, Learning Python, O'Reilly
Eric Matthes, Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction
to Programming, No Starch Press
Web Links
https://www.tutorialspoint.com/python/index.htm
https://www.python.org/downloads/
https://www.w3schools.in/python/data-types
https://www.programiz.com/python-programming/online-compiler/
https://www.codecademy.com/catalog/language/python
Dr. Rajni Bhalla, Lovely Professional University Unit 04: More on OOP Concepts
Objectives
After this unit, student would be able to:
learn various examples to elaborate upon the concepts of Python function overloading.
understand operator overloading
learn method overriding
Introduction
Function overloading occurs when several functions share the same name but differ in the number
of parameters. Unlike some other languages, Python does not support function overloading
directly, and function parameters carry no declared data type. Suppose we still wish to get the
effect of function overloading. In that case, we can give the method's parameters a default value
of None, so that no error is raised when a particular argument is not supplied in the function
call.
Program Output
obj=sumClass()
obj.sum(19, 8, 77) #correct output
obj.sum(18, 20) #throws error
We can see from the programme above that the second sum method replaces the first sum method.
The function produces output when we call it with three arguments, but raises an error when we
call it with two arguments. Thus, Python does not support function overloading directly.
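The effect described above can be reproduced with a minimal sketch. The class body below is an assumption (the names SumClass and sum follow the calls shown in the text); it only relies on the fact that a later definition with the same name replaces an earlier one.

```python
# Assumed class body; only the calls appear in the text above.
class SumClass:
    def sum(self, a, b):
        return a + b

    # Same name again: this definition replaces the two-parameter one
    def sum(self, a, b, c):
        return a + b + c


obj = SumClass()
print(obj.sum(19, 8, 77))      # the three-argument call works

try:
    obj.sum(18, 20)            # the two-argument version no longer exists
except TypeError as err:
    print("TypeError:", err)
```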
Does that mean there is no other way to achieve this behaviour? No. Function overloading can be
emulated in Python in several ways.
One way is to give one or more parameters a default value of None in the function definition. We
also add a check for None in the function body, so that no error occurs when the function is
called without an argument for such a parameter.
Syntax
By setting parameters to None, we can choose to invoke the function with or without those
arguments.
class name:
    def fun(self, p1=None, p2=None, ...):
Program Output
obj=sumClass()
obj.sum(19, 8, 77)#104
obj.sum(18)#Provide more numbers
Here, we can see that function overloading may be implemented by changing the parameter default
values to None and by including a few validations.
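A minimal sketch of this approach follows. The method body is an assumption reconstructed from the calls and results shown above (104 for three numbers, "Provide more numbers" for one); the two-number branch is added for illustration.

```python
# Assumed implementation; the text above shows only the calls and results.
class SumClass:
    def sum(self, a=None, b=None, c=None):
        # All three arguments supplied
        if a is not None and b is not None and c is not None:
            return a + b + c
        # Only the first two supplied
        if a is not None and b is not None:
            return a + b
        # Too few arguments to add anything
        return "Provide more numbers"


obj = SumClass()
print(obj.sum(19, 8, 77))
print(obj.sum(18, 20))
print(obj.sum(18))
```

Comparing with `is not None` rather than relying on truthiness keeps 0 usable as a valid argument.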
Program
class items:
    def __init__(self, cart):
        self.cart = list(cart)

    # special function
    def __len__(self):
        print("The total items are:")
        return len(self.cart)  # built-in function

purchase = items(['apple', 'banana', 'mango', 'grapes'])
print(len(purchase))  # runs the body of the special function
Output
The total items are:
4
The __init__ method is invoked whenever an object of the class is created. Here we customise the
behaviour of Python's built-in len() function, which by default simply returns an object's length.
Whenever we pass an object of our class to len(), Python calls the __len__() method we defined and
returns its result.
Because our own code runs inside __len__(), len() is effectively overloaded for this class.
Example:
Let's create some code in Python that calculates the area of figures using function overloading
(triangle, rectangle, square). We will call the same function with different parameters while setting
the default values of the parameters to None.
Program Output
obj = areaClass()
obj.area(19, 8, 77)  # Area of the triangle 76.0
obj.area(18, 18, 18, 18)  # Area of the square 324
obj.area(72, 38, 72, 38)  # Area of the rectangle 2736
Introduction
Depending on the operands used, an operator's meaning can change in Python.
Program Output
p1 = Point(1, 2)
p2 = Point(2, 3)
print(p1+p2)
Here, we can see that a TypeError was thrown because Python was unable to combine two Point
objects.
With operator overloading, however, we can make this work. First, let's gain some understanding
of special functions.
def __str__(self):
    return "({0},{1})".format(self.x, self.y)
Let's try the print() function once more right now.
Program Output
def __str__(self):
    return "({0},{1})".format(self.x, self.y)
p1 = Point(2, 3)
print(p1)
That's better. It turns out that this same method is also invoked when we use the built-in
functions str() or format().
>>> str(p1)
'(2,3)'
>>> format(p1)
'(2,3)'
So, when you use str(p1) or format(p1), Python internally calls the p1.__str__() method.
Program Output
def __str__(self):
    return "({0},{1})".format(self.x, self.y)
p1 = Point(1, 2)
p2 = Point(2, 3)
print(p1+p2)
In reality, when you write p1 + p2, Python calls p1.__add__(p2), which in turn is
Point.__add__(p1, p2). The addition is then carried out in the manner that we specified.
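A complete version of such a Point class might look like the sketch below. The __init__ and __add__ bodies are assumptions (coordinate-wise addition); the __str__ method follows the fragment shown in the text.

```python
# Sketch of a Point class with __str__ and __add__; the constructor and
# the coordinate-wise addition are assumptions for illustration.
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return "({0},{1})".format(self.x, self.y)

    def __add__(self, other):
        # Called for p1 + p2; returns a new Point
        return Point(self.x + other.x, self.y + other.y)


p1 = Point(1, 2)
p2 = Point(2, 3)
print(p1 + p2)                 # operator form
print(p1.__add__(p2))          # what Python calls internally
print(Point.__add__(p1, p2))   # the equivalent call through the class
```

All three print calls show the same point, because the operator form is translated into the method call.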
We can also overload other operators in a similar manner. Below is a summary of the special
function that needs to be implemented.
Operator | Expression | Internally
Addition | p1 + p2 | p1.__add__(p2)
Subtraction | p1 - p2 | p1.__sub__(p2)
Multiplication | p1 * p2 | p1.__mul__(p2)
Power | p1 ** p2 | p1.__pow__(p2)
Division | p1 / p2 | p1.__truediv__(p2)
Bitwise OR | p1 | p2 | p1.__or__(p2)
Program Output
def __str__(self):
    return "({0},{1})".format(self.x, self.y)
p1 = Point(1,1)
p2 = Point(-2,-3)
p3 = Point(1,-1)
The special functions that must be implemented in order to overload other comparison operators
are listed below.
Table 5 Overload Comparison Operator
Equal to p1 == p2 p1.__eq__(p2)
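Comparison operators can be overloaded in the same way. The sketch below is an assumption for illustration: it defines "less than" by comparing squared distances from the origin and "equal to" by comparing coordinates; the original programme may have used a different rule.

```python
# Sketch: overloading < and == for a Point class.
# The ordering rule (distance from the origin) is an assumption.
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def _dist_sq(self):
        # Squared distance from the origin
        return self.x ** 2 + self.y ** 2

    def __lt__(self, other):
        # Called for p1 < p2
        return self._dist_sq() < other._dist_sq()

    def __eq__(self, other):
        # Called for p1 == p2
        return (self.x, self.y) == (other.x, other.y)


p1 = Point(1, 1)
p2 = Point(-2, -3)
p3 = Point(1, -1)

print(p1 < p2)            # compares 2 with 13
print(p2 < p3)            # compares 13 with 2
print(p1 < p3)            # compares 2 with 2
print(p1 == Point(1, 1))
```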
The object that calls a method will determine which version of the method is executed. When a
method is called from an object of a parent class, the method's parent class version is executed;
however, when a method is called from an object of a subclass, the child class version is executed.
In other words, the version of an overridden method that is run depends on the type of the object
being referenced, not the type of the reference variable.
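Program
# Defining parent class
class Parent():

    # Constructor
    def __init__(self):
        self.value = "Inside Parent"

    # Parent's show method
    def show(self):
        print(self.value)

# Defining child class
class Child(Parent):

    # Constructor
    def __init__(self):
        self.value = "Inside Child"

    # Child's show method (overrides Parent.show)
    def show(self):
        print(self.value)

# Driver's code
obj1 = Parent()
obj2 = Child()
obj1.show()
obj2.show()
Output
Inside Parent
Inside Child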
Example: Let's look at a scenario where we only wish to override methods from one parent
class. The implementation is shown below.
Program
# Python program to demonstrate
# overriding in multiple inheritance
# Driver's code
obj = Child()
obj.show()
obj.display()
Output
Inside Child
Inside Parent2
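A minimal sketch of class definitions that would produce this output is given below. The names Parent1 and Parent2 and the method bodies are assumptions inferred from the printed output; Child overrides show() from the first parent and inherits display() unchanged from the second.

```python
# Assumed class definitions, inferred from the output shown above.
class Parent1:
    def show(self):
        print("Inside Parent1")


class Parent2:
    def display(self):
        print("Inside Parent2")


class Child(Parent1, Parent2):
    # Overrides Parent1.show; Parent2.display is inherited as-is
    def show(self):
        print("Inside Child")


# Driver's code
obj = Child()
obj.show()
obj.display()
```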
Summary
Function overloading occurs when several functions share the same name but differ in the
number of parameters.
In Python, when a function is defined more than once, only the most recent definition is in
effect.
Python operators work for built-in classes, but the same operator behaves differently with
different types.
Class functions whose names begin with a double underscore are called special functions in Python.
Operator overloading in Python is not restricted to arithmetic operators; we can overload
comparison operators as well.
Method overriding is the ability of any object-oriented programming language to let a subclass
or child class provide a specific implementation of a method that is already provided by one
of its super-classes or parent classes.
When a class derives from more than one base class, it is called multiple inheritance.
Keywords
Operator Overloading: When an operator in Python is overloaded, it takes on additional meaning
beyond its predefined operational meaning.
Multiple Inheritance: When a class derives from more than one base class, it is called multiple
inheritance.
Multilevel Inheritance: When a class inherits from a child class, giving a parent, child, and
grandchild relationship.
Function Overloading: When many functions share the same name but have different numbers of
parameters, this is known as function overloading.
Built-in functions: The Python data model provides special functions that let us overload the
behaviour of built-in functions. The names of special functions start and end with double
underscores (__).
Functions:A function is a segment of clean, reusable code that executes a single, connected
operation. A higher level of code reuse and improved application modularity are provided via
functions.
Operator Overloading: Built-in classes are supported by Python operators. However, the same
operator responds differently to several types. For instance, the + operator will combine two lists,
concatenate two strings, or perform arithmetic addition on two numbers.
Method Overriding:Any object-oriented programming language that supports method overriding
enables a subclass or child class to provide a particular implementation of a method that is already
supplied by one of its super-classes or parent classes.
Self Assessment
Q1. Which function overloads the + operator?
A. __add__()
B. __plus__()
C. __sum__()
A. __eq__()
B. __equ__()
C. __equal__()
D. None of the above
A. __more__()
B. __gt__()
C. __ge__()
D. None of the above
A. ||
B. |
C. //
D. /
A. __div__()
B. __ceildiv__()
C. __floordiv__()
D. __truediv__()
Q11: Which of the subsequent claims about variable names in Python is true?
Q12: Which of the following has the highest precedence in an expression?
A. Division
B. Subtraction
C. Power
D. Parentheses
Q13: Which of the following option is not a core data type in the python language?
A. Dictionary
B. Lists
C. Class
D. All of the above
A. install numpy
B. pip install python numpy
C. pip install numpy
D. pip install numpypython
Q15: Which option from the list below best demonstrates how to import the Numpy module
into your programme?
A. import numpy
B. import numpy as np
6. C 7. A 8. A 9. A 10. D
Review Question
1. What is function overloading? Explain with an example.
2. Explain operator overloading with an example.
3. What do you understand by method overriding? Explain with an example.
4. Explain function overloading with an example.
5. What do you understand by type conversion? Also explain the difference between
overloaded functions and overridden functions.
Objectives
After this unit, student would be able to:
Learn how to handle exceptions in your program using try, except and finally statements with
the help of example.
Introduction
In Python, there are two different sorts of errors: syntax errors and exceptions. Errors are issues in a
programme that cause it to halt during execution. On the other hand, exceptions are raised when
internal events take place that alter the program's usual course.
Program Output
In this program, we loop through the values of the randomList list. As discussed earlier, the code
that can raise an exception is placed inside the try block.
If no exception occurs (as for the last value), the except block is skipped and normal flow
continues. If an exception does occur (as for the first and second values), it is handled by the
except block.
Here, we print the name of the exception using the exc_info() function of the sys module. As we
can see, 'a' causes a ValueError and 0 causes a ZeroDivisionError.
Because every exception in Python inherits from the base Exception class, we can also perform the
same task in the following way:
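Both approaches can be sketched together. The list contents ('a', 0 and 2) are assumed from the discussion above, and the seen list is a hypothetical helper added only to record which exceptions occurred.

```python
import sys

random_list = ['a', 0, 2]   # values assumed from the discussion above
seen = []                   # hypothetical helper: names of exceptions raised

for entry in random_list:
    try:
        print("The entry is", entry)
        r = 1 / int(entry)
        break               # stop at the first entry that works
    except Exception as e:
        # First approach: ask the sys module for the exception class
        print("Oops!", sys.exc_info()[0], "occurred.")
        # Second approach: read the class from the caught exception object
        seen.append(e.__class__.__name__)
        print("Next entry.")

print("The reciprocal of", entry, "is", r)
print(seen)
```

Catching Exception rather than using a bare except leaves system-exiting events such as KeyboardInterrupt untouched.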
Program
try:
    # do something
    pass
except ValueError:
    # handle ValueError exception
    pass
except (TypeError, ZeroDivisionError):
    # handle multiple exceptions
    # TypeError and ZeroDivisionError
    pass
except:
    # handle all other exceptions
    pass
Program Output
Enter a number: 4
0.25
However, if we enter 0, we get a ZeroDivisionError, because the code block inside else is not
handled by the preceding except.
Enter a number: 0
Traceback (most recent call last):
File "<string>", line 7, in <module>
reciprocal = 1/num
ZeroDivisionError: division by zero
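The behaviour above can be reproduced with a small sketch of try with an else clause. The function name reciprocal_of_even and the even-number check are assumptions; the value is passed as an argument instead of being read with input() so that the sketch runs unattended.

```python
# Hypothetical function illustrating try...except...else.
def reciprocal_of_even(num):
    try:
        assert num % 2 == 0
    except AssertionError:
        return "Not an even number!"
    else:
        # Runs only if the try block raised nothing.
        # Errors raised here are NOT caught by the except above.
        return 1 / num


print(reciprocal_of_even(1))
print(reciprocal_of_even(4))

try:
    reciprocal_of_even(0)       # 0 is even, so the else block divides by zero
except ZeroDivisionError as err:
    print("ZeroDivisionError:", err)
```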
Python try...finally
The try statement in Python can have an optional finally clause. This clause is always executed
and is generally used to release external resources.
For instance, we might be working with a file, using a Graphical User Interface (GUI), or
connected over the network to a remote data centre.
In all of these situations, we must clean up the resource before the programme terminates,
whether or not it ran successfully. These actions (closing a file, closing the GUI, or
disconnecting from the network) are performed in the finally clause to guarantee their execution.
Here is a file operation example to demonstrate this.
f = open("test.txt", encoding='utf-8')
try:
    # perform file operations
    pass
finally:
    f.close()
This construct ensures that the file is closed even if an exception occurs while the programme is
running. The file is opened before the try block so that f always exists when finally runs.
This user-defined exception is named CustomError and derives from the Exception class. Like other
exceptions, it can be raised using the raise statement with an optional error message.
When we are developing a large Python programme, it is good practice to place all the user-defined
exceptions that our programme raises in a separate file. Many standard modules do this; they
define their exceptions separately in a file usually (but not always) named exceptions.py or
errors.py.
A user-defined exception class can implement everything a normal class can, but we generally keep
it simple and concise. Most implementations declare a custom base class and derive the other
exception classes from it. This idea is shown in the example that follows.
class ValueTooLargeError(Error):
    """Raised when the input value is too large"""
    pass
Output
Enter a number: 10
Congratulations! You guessed it correctly.
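A minimal sketch of such an exception hierarchy follows. The Error base class and ValueTooLargeError follow the fragment above; ValueTooSmallError, the check_guess function, and the target number 10 are assumptions added for illustration, and the guesses are supplied in a tuple instead of being read with input().

```python
# Base class and derived exceptions; everything except ValueTooLargeError
# and its docstring is an assumption for illustration.
class Error(Exception):
    """Base class for other exceptions"""


class ValueTooSmallError(Error):
    """Raised when the input value is too small"""


class ValueTooLargeError(Error):
    """Raised when the input value is too large"""


def check_guess(guess, number=10):
    # Raise a specific exception depending on how the guess compares
    if guess < number:
        raise ValueTooSmallError
    if guess > number:
        raise ValueTooLargeError
    return "Congratulations! You guessed it correctly."


for guess in (3, 15, 10):
    try:
        print(check_guess(guess))
    except ValueTooSmallError:
        print("This value is too small, try again!")
    except ValueTooLargeError:
        print("This value is too large, try again!")
```

Because both exceptions derive from Error, a caller could also catch the whole family with a single except Error clause.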
Program Output
Here, we have overridden the constructor of the Exception class to accept our own custom
arguments, salary and message. The constructor of the parent Exception class is then called
manually with the self.message argument using super().
The custom self.salary attribute is defined to be used later.
When SalaryNotInRangeError is raised, the inherited __str__ method of the Exception class is used
to display the corresponding message.
We can also customise the __str__ method itself by overriding it.
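The sketch below shows one way such an exception could be written. The class name and the salary and message arguments come from the text; the salary range (5000, 15000), the default message, the check_salary helper, and the format returned by __str__ are assumptions.

```python
# Sketch of an exception with a custom constructor and __str__.
# The range, default message, and output format are assumptions.
class SalaryNotInRangeError(Exception):
    def __init__(self, salary, message="Salary is not in (5000, 15000) range"):
        self.salary = salary
        self.message = message
        # Call the parent constructor with the message
        super().__init__(self.message)

    def __str__(self):
        # Overrides the inherited __str__ to include the salary
        return f"{self.salary} -> {self.message}"


def check_salary(salary):
    # Hypothetical helper that raises the custom exception
    if not 5000 < salary < 15000:
        raise SalaryNotInRangeError(salary)
    return salary


print(check_salary(8000))

try:
    check_salary(2000)
except SalaryNotInRangeError as err:
    print(err)            # uses the overridden __str__
    print(err.salary)     # the custom attribute is available afterwards
```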
Program Output
Keywords
finally: The finally keyword is available in Python; the finally block comes after the try and
except blocks and is always executed.
try: A try statement in Python is used to handle exceptions.
try...except: The try clause contains the critical operation that can raise an exception; the except clause handles it.
ArithmeticError: Raised when an error occurs in numeric calculations
KeyError: Raised when a key does not exist in a dictionary
RuntimeError: Raised when an error occurs that does not belong to any specific exception category
UnboundLocalError: Raised when a local variable is referenced before assignment
ZeroDivisionError: Raised when the second operand in a division is zero
IndentationError: Raised when indentation is not correct
ImportError: Raised when an imported module does not exist
AssertionError: Raised when an assert statement fails
AttributeError: Raised when attribute reference or assignment fails
EOFError: Raised when the input() function hits an "end of file" condition (EOF)
Self Assessment
1. How many except statements can a try-except block have?
A. Zero
B. One
A. 12
B. 1
C. 2
D. 11
7. When a ________ error occurs, the interpreter refuses to run the programme until the error
is fixed; instead, we must save and rerun the programmes.
A. Syntax errors
B. Logical error
C. Runtime error
D. All of the above
11. It is raised when a calculation's outcome is greater than the numeric data type's upper
limit.
A. ZeroDivisonError
B. OverFlowError
C. TypeError
D. ValueError
6. C 7. A 8. C 9. D 10. A
Review Questions
1. Describe how the Python try...except clause works.
2. Give an example of how to catch specific exceptions in Python.
3. Explain raising an exception in Python with an example.
4. Differentiate between try with an else clause and try with a finally clause, with examples.
5. Can a try statement in Python have multiple except clauses?
Web Links
https://www.programiz.com/python-programming/user-defined-exception
Objectives
Learn basic concepts about arrays and lists.
Learn to differentiate between array and list.
Learn several array creation routines in Numpy which are used to create Ndarray objects.
Introduction
Python uses the data structures list and array to store many elements. Let's examine some key
distinctions between lists and arrays in Python.
Program Output
An integer is the first element, a string is the second, and a list of characters is the third.
Program Output
List | Array
Can contain elements of different data types. | Contains only elements of the same data type.
No need to import a module explicitly for declaration. | A module must be imported explicitly for declaration.
Can be nested to hold different types of elements. | Nested elements must all be of the same size.
Preferred for shorter sequences of data items. | Preferred for longer sequences of data items.
Greater flexibility makes it simple to modify (add or remove) data. | Less flexibility, because addition and deletion must be done element-wise.
The complete list can be printed without any explicit looping. | A loop must be written to print or access the array's elements.
Uses more memory, to allow easy addition of elements. | Comparatively more compact in memory.
>>> print(y)
[0 2 4 6 8]
>>> z = np.arange(0, 10, 2)
>>> print(z)
The following parameters are passed to the constructor.
Examples
Example 1
# convert list to ndarray
import numpy as np
x = [1, 2, 3]
a = np.asarray(x)
print(a)
Output
[1 2 3]
numpy.frombuffer
This function interprets a buffer as a one-dimensional array. Any object that exposes the buffer
interface can be passed as the argument, and an ndarray is returned.
numpy.frombuffer(buffer, dtype = float, count = -1, offset = 0)
The function takes the following parameters:
1 buffer
Any object that exposes the buffer interface
2 dtype
Data type of the returned ndarray. Defaults to float
3 count
The number of items to read. The default of -1 means all data
4 offset
The starting position to read from. Default is 0
Example
import numpy as np
s = b'Hello World'  # Python 3: frombuffer needs a bytes-like object, not str
a = np.frombuffer(s, dtype='S1')
print(a)
Output
[b'H' b'e' b'l' b'l' b'o' b' ' b'W' b'o' b'r' b'l' b'd']
numpy.fromiter
This function converts any iterable object into an ndarray object. This function returns a brand-new
one-dimensional array.
numpy.fromiter(iterable, dtype, count = -1)
1 iterable
Any iterable object
2 dtype
Data type of resultant array
3 count
The number of items to be read from iterator. Default is -1 which means all data to be
read
The following examples use the built-in range() function. In Python 3, range() returns a range
object, which list() converts to a list. An ndarray object is then created from an iterator over
this list.
Examples
Example 1
# create list object using range function
import numpy as np
numbers = list(range(5))
print(numbers)
Output
[0, 1, 2, 3, 4]
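The next step, building an ndarray from an iterator, can be sketched as follows. The variable names are illustrative; the calls use the numpy.fromiter signature given above.

```python
import numpy as np

numbers = list(range(5))

# Obtain an iterator over the list and build a float ndarray from it
it = iter(numbers)
x = np.fromiter(it, dtype=float)
print(x)

# count limits how many items are read from the iterator
first_three = np.fromiter(iter(numbers), dtype=int, count=3)
print(first_three)
```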
6.4 Indexing
Indexing means referring to an element of an iterable by its position within the iterable.
Positive indexing starts at 0: index 0 represents the first element in the sequence. Negative
indexing starts at -1: index -1 represents the last element in the sequence. Each character in a
string has a corresponding index number that can be used to access that character, so the
characters of a string can be accessed in two different ways.
String             C   O   M   P   U   T   E   R
Positive indexing  0   1   2   3   4   5   6   7
Negative indexing  -8  -7  -6  -5  -4  -3  -2  -1
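The two ways of indexing can be tried out with a short sketch. The string comes from the table above; the list sample is a hypothetical example of a list with mixed element types.

```python
word = "COMPUTER"

# Positive indexing counts from the left, starting at 0
first = word[0]
last_by_position = word[7]

# Negative indexing counts from the right, starting at -1
last = word[-1]
first_by_negative = word[-8]

print(first, last_by_position, last, first_by_negative)

# Indexing works the same way on lists, including nested lists
sample = [1, "two", ['a', 'b']]   # hypothetical list
print(sample[0], sample[-1], sample[2][0])

# An index past the end raises IndexError
try:
    word[8]
except IndexError as err:
    print("IndexError:", err)
```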
Program Output
Program Output
Indexing in List
Program Output
6.5 Slicing
Getting a subset of elements from an iterable based on their indices is referred to as "slicing."
A slice of a string is a substring: a string contained within another string. We use slicing when
we need only a part of the string rather than the whole string.
Syntax:
string[start:end:step]
Parameters
start - index at which the slice starts
end - index at which the slice stops (the element at this index is not included)
step - increment between successive indices, i.e. the step size
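The slice syntax can be illustrated with a few cases. The string "COMPUTER" is reused from the indexing discussion; the tuple is a hypothetical example showing that slicing works on other sequence types too.

```python
text = "COMPUTER"

head = text[0:3]        # indices 0, 1, 2
tail = text[3:]         # from index 3 to the end
every_other = text[::2] # step of 2 over the whole string
middle = text[1:7:2]    # indices 1, 3, 5
last_three = text[-3:]  # negative start counts from the right
reversed_text = text[::-1]

print(head, tail, every_other, middle, last_three, reversed_text)

# Tuples are sliced in the same way
numbers = (1, 2, 3, 4)
print(numbers[1:3])
```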
Program Output
Indexing | Slicing
An IndexError is raised if you try to use an index that is too large. | Out-of-range indices are handled gracefully when used for slicing.
Item assignment through indexing cannot change the length of the list. | Assigning to a slice can change the length of the list or even remove items from it.
A single element or an iterable can be assigned to an index. | Assigning a single element to a slice raises a TypeError; only iterables are permitted.
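Each row of this comparison can be checked with a short sketch. The list nums and its values are hypothetical.

```python
nums = [1, 2, 3, 4, 5]

# Row 1: a too-large index raises IndexError, a too-large slice does not
try:
    nums[10]
except IndexError as err:
    print("IndexError:", err)
print(nums[3:10])

# Row 2: item assignment keeps the length, slice assignment can change it
nums[0] = [9, 9]            # stores the list as one nested element
length_after_index = len(nums)
nums[1:3] = [7]             # replaces two elements with one
length_after_slice = len(nums)
print(nums, length_after_index, length_after_slice)

# Row 3: a single non-iterable value cannot be assigned to a slice
try:
    nums[1:2] = 5
except TypeError as err:
    print("TypeError:", err)

# Assigning an empty list to a slice removes items
nums[2:] = []
print(nums)
```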
Summary
In Python, a list is a collection of items that can be of different data types, such as
numeric, character, and logical values.
An array is a vector whose elements are homogeneous, that is, of the same data type.
numpy.asarray is similar to numpy.array, except that it has fewer parameters.
numpy.frombuffer interprets a buffer as a one-dimensional array. Any object that exposes the
buffer interface can be passed as the argument, and an ndarray is returned.
numpy.fromiter builds a new one-dimensional ndarray object from any iterable object.
Indexing and slicing are available only for sequence data types. Sequence types preserve the
order in which elements are added, which lets us retrieve their elements through indexing
and slicing.
Indexing means referring to an element of an iterable by its position within the iterable.
In positive indexing, we pass the positive index that we want to access in square brackets.
Index numbers start at 0, which denotes the first character of a string.
In negative indexing, we pass the negative index that we want to access in square brackets.
Here the index numbers start at -1, which denotes the last character of a string.
Slicing means getting a subset of elements from an iterable based on their indices.
Tuples can be sliced too, in the same way as lists and strings. Tuple slicing returns several
elements at once.
Keywords
append(): Adds an element at the end of the list
clear(): Removes all the elements from the list
copy(): Returns a copy of the list
count(): Returns the number of elements with the specified value
extend(): Adds the elements of a list (or any iterable) to the end of the current list
index(): Returns the index of the first element with the specified value
insert(): Adds an element at the specified position
pop(): Removes the element at the specified position
remove(): Removes the first item with the specified value
reverse(): Reverses the order of the list
sort(): Sorts the list
SelfAssessment
1. Which of the ensuing commands will result in the creation of a list?
A. list1 = list()
B. list1 = []
C. list1 = list([1, 2, 3])
D. all of the mentioned
10. What will be the output of the following code?
import numpy as np
a=np.array([1,2,3,4,5,6])
print(a)
A. [1 2 3 4 5]
B. [1 2 3 4 5 6]
C. [0 1 2 3 4 5 6]
D. None of the mentioned above
11. Observe the following code and predict the outcome.
import numpy as np
a = np.array([10, 20, 30, 40])
b = np.array([18, 15, 14])
c = np.array([25, 24, 26, 28, 23])
x, y, z = np.ix_(a, b, c)
A. 10
B. 9
C. 8
D. All of the mentioned above
6. C 7. C 8. D 9. D 10. B
Review Questions
1. Explain difference between indexing and slicing
2. Differentiate between Arrays and lists with examples.
3. Write down advantages of NumPy arrays compared with lists.
4. Explain using arrays in python with example.
5. Explain copy (), extend (), index (), pop () and remove () method with example.
Objectives
After this unit, student would be able to:
Introduction
Python has no built-in support for arrays, but Python lists can be used in their place. This unit
demonstrates how to use lists as arrays; to work with true arrays in Python, you must import a
library such as NumPy.
7.1 Arrays
An array stores multiple values in a single variable, as shown in Table 1.
Table 1 Creating an array containing color names
Program Output
What is an Array?
An array is a special type of variable that can store several values at once.
If you have a list of items, such as a list of color names, you could store the colors in
separate variables as follows:
colors1 = "Red"
colors2 = "Green"
colors3 = "Yellow"
Program
colors = ["Red", "Green", "Yellow"]
colors[0] = "Blue"
print(colors)
Output
['Blue', 'Green', 'Yellow']
An array's length is always one more than its highest index.
Program
colors = ["Red", "Green", "Blue"]
colors.pop(1)
print(colors)
Output
['Red', 'Blue']
The remove() method can also be used to delete an element from an array.
The list's remove() method removes only the first occurrence of the specified value.
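The difference between remove() and pop() can be sketched as follows. The list contents are hypothetical and chosen only so that one value appears twice.

```python
# Hypothetical list used only for illustration.
colors = ["Red", "Green", "Blue", "Green"]

colors.remove("Green")      # removes only the first "Green"
print(colors)               # ['Red', 'Blue', 'Green']

removed = colors.pop(1)     # removes and returns the element at index 1
print(removed, colors)      # Blue ['Red', 'Green']
```

remove() searches by value, while pop() works by position and hands back the element it removed.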
Array Methods
Method      Description
count()     Returns the number of elements with the specified value.
extend()    Appends the elements of another list (or any iterable) to the end of the current list.
index()     Returns the position of the first element with the specified value.
Arrays are not supported by default in Python, however Python Lists can be used in their place.
Program Output
Broadcasting Rules:
1. If the arrays do not have the same rank, prepend the shape of the lower-rank array with 1s until
both shapes have the same length.
2. Two arrays are compatible in a dimension if they have the same size in that dimension, or if one
of the arrays has size 1 in that dimension.
3. The arrays can be broadcast together if they are compatible in all dimensions.
4. After broadcasting, each array behaves as if it had a shape equal to the element-wise maximum of
the shapes of the two input arrays.
5. In any dimension where one array has size 1 and the other array has size greater than 1, the
size-1 array behaves as if it were copied along that dimension.
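A minimal sketch of these rules in NumPy follows. The array values and shapes are hypothetical and were chosen only to show a (3,) array and a (2, 1) array being stretched to match a (2, 3) array.

```python
import numpy as np

# Hypothetical shapes chosen for illustration.
a = np.array([[1, 2, 3],
              [4, 5, 6]])       # shape (2, 3)
b = np.array([10, 20, 30])      # shape (3,), treated as (1, 3) by rule 1

c = a + b                       # b behaves as if copied along the first dimension
print(c.shape)                  # (2, 3)
print(c)

col = np.array([[100], [200]])  # shape (2, 1)
d = a + col                     # col behaves as if copied along the second dimension
print(d)
```

No data is actually copied; NumPy only treats the smaller array as if it had the larger shape.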
a. Arithmetic Operators
Arithmetic operators are used with numeric values to perform common mathematical operations:
Operator   Name             Example
+          Addition         x + y
-          Subtraction      x - y
*          Multiplication   x * y
/          Division         x / y
%          Modulus          x % y
**         Exponentiation   x ** y
//         Floor Division   x // y
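The operators in the table can be tried with a short sketch such as the one below; the operand values are arbitrary.

```python
x, y = 17, 5    # hypothetical operands

print(x + y)    # 22
print(x - y)    # 12
print(x * y)    # 85
print(x / y)    # 3.4  (true division always returns a float)
print(x % y)    # 2    (remainder of the division)
print(x ** y)   # 1419857
print(x // y)   # 3    (floor division discards the fractional part)
```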
b. Bitwise Operators
Bitwise operators work on the binary representation of integers. The shift operators are:
Operator   Name                   Description
<<         Zero-fill left shift   Shifts the bits left by pushing zeros in from the right. In fixed-width
                                  representations the leftmost bits fall off; Python integers simply grow.
>>         Signed right shift     Shifts the bits right by pushing copies of the leftmost bit in from the
                                  left, letting the rightmost bits fall off.
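As a small illustration of the two shift operators, assuming an arbitrary starting value of 5:

```python
x = 0b0101            # 5 in binary

left = x << 2         # zeros pushed in from the right
right = x >> 1        # rightmost bit falls off
negative = -8 >> 1    # the sign is preserved on a right shift

print(left, bin(left))    # 20 0b10100
print(right, bin(right))  # 2 0b10
print(negative)           # -4
```

Shifting left by n multiplies by 2**n, and shifting right by n performs floor division by 2**n.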
c. Relational Operators
The primary purpose of the relational operators, commonly referred to as comparison operators, is
to return either true or false depending on the value of the operands.
The relational operators are listed as follows:
1. <
2. >
3. <=
4. >=
5. ==
6. !=
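Each of the six operators listed above returns a Boolean value. A minimal sketch, with arbitrary operands:

```python
a, b = 3, 7     # hypothetical operands

print(a < b)    # True
print(a > b)    # False
print(a <= 3)   # True
print(a >= b)   # False
print(a == b)   # False
print(a != b)   # True
```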
Summary
Arrays are not natively supported by Python, although Python Lists can be used in their place
A unique type of variable called an array has the capacity to store several values at once.
An array element is referred to by its index number.
Use the len() function to return the length of an array (the number of elements in it).
To iterate over each element of an array, use the for in loop.
The append () method can be used to include an element in an array.
The term "broadcasting" describes how NumPy treats arrays with different shapes during
arithmetic operations: subject to certain constraints, the smaller array is broadcast across the
larger array so that they have compatible shapes.
The primary purpose of the relational operators, commonly referred to as comparison
operators, is to return either true or false depending on the value of the operands.
Python's array module can be imported to create an array. An array is created with the
syntax array(typecode, value_list), which takes two arguments: a type code that sets the data
type and a list of values.
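The array module mentioned in the summary can be sketched as follows. The values are hypothetical; "i" is the standard type code for signed integers.

```python
from array import array

numbers = array("i", [1, 2, 3])   # "i" is the type code for signed int
numbers.append(4)                 # arrays support list-like methods

print(numbers)                    # array('i', [1, 2, 3, 4])
print(len(numbers), numbers[0])   # 4 1
```

Unlike a list, an array created this way only accepts values of the declared type.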
Keywords
append() Adds an element at the end of the list
clear() Removes all the elements from the list
copy() Returns a copy of the list
count() Returns the number of elements with the specified value
extend() Add the elements of a list (or any iterable), to the end of the current list
index() Returns the index of the first element with the specified value
insert() Adds an element at the specified position
pop() Removes the element at the specified position
Self Assessment
1. What does the following code produce as output?
L = ['d','e','f','g']
print("".join(L))
A. Error
B. None
C. defg
D. ‘d’,’e’,’f’,’g’
A. <type ‘type’>
B. type ‘int’
C. integer
D. 0
A. 24
B. 25
C. 23
D. 0
10. len(names)
A. Finds the length of the list called names
B. Finds the length of the list called len
C. Finds the length of the list called Length
D. None
A. Red
B. Green
C. Red Green Blue
D. Blue
A. 88
B. 44
C. 8
D. 4+4
A. 1
B. 2
C. 3
D. 0
6. C 7. B 8. B 9. D 10. A
Objectives
Learn the built-in module for mathematical functions
Learn the built-in module for statistical functions
Learn the basic concepts of sort, search and count functions
Introduction
Mathematical calculations are occasionally required when working on business or scientific tasks.
Python has a math module that can handle these calculations. Alongside the basic arithmetic
operators (addition, subtraction, multiplication, and division), the math module's functions cover
more complex operations such as trigonometric, logarithmic, and exponential functions.
This unit introduces the math module from the fundamentals to more advanced concepts, describing
its functions with the aid of practical examples.
We must import the module into our code in order to use it.
import math
Some Constants
Sr.No.   Constant   Description
1        pi         The value of pi: 3.141592
2        e          The value of the natural base e: 2.718282
3        tau        The value of tau: 6.283185
4        inf        Positive infinity
5        nan        Not a number (NaN)
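The constants can be inspected directly. The sketch below also shows two properties worth knowing: tau equals 2*pi, and nan never compares equal to anything, including itself.

```python
import math

print(math.pi)
print(math.e)
print(math.tau)

is_tau_two_pi = math.isclose(math.tau, 2 * math.pi)
print(is_tau_two_pi)              # True
print(math.inf > 10 ** 100)       # True: inf is larger than any finite number
print(math.nan == math.nan)       # False: use math.isnan() to test for NaN
print(math.isnan(math.nan))       # True
```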
Sr.No.   Function          Description
1        ceil(x)           Returns the ceiling of x: the smallest integer greater than or equal to x.
2        copysign(x, y)    Returns x with the sign of y.
3        fabs(x)           Returns the absolute value of x.
4        factorial(x)      Returns the factorial of x, where x >= 0.
5        floor(x)          Returns the floor of x: the greatest integer less than or equal to x.
6        fsum(iterable)    Returns the sum of the elements of an iterable object.
7        gcd(x, y)         Returns the greatest common divisor of x and y.
8        isfinite(x)       Checks whether x is neither infinite nor NaN.
9        isinf(x)          Checks whether x is infinite.
10       isnan(x)          Checks whether x is NaN (not a number).
11       remainder(x, y)   Returns the remainder after dividing x by y.
Example Code
import math
# Floor and ceiling of a float, printed in the order named in the message
print('The Floor and Ceiling value of 23.56 are: ' + str(math.floor(23.56)) + ', ' +
      str(math.ceil(23.56)))
x = 10
y = -15
print('The value of x after copying the sign from y is: ' + str(math.copysign(x, y)))
print('Absolute value of -96 and 56 are: ' + str(math.fabs(-96)) + ', ' + str(math.fabs(56)))
my_list = [12, 4.25, 89, 3.02, -65.23, -7.2, 6.3]
print('Sum of the elements of the list: ' + str(math.fsum(my_list)))
print('The GCD of 24 and 56 : ' + str(math.gcd(24, 56)))
x = float('nan')
if math.isnan(x):
    print('It is not a number')
x = float('inf')
y = 45
if math.isinf(x):
    print('It is Infinity')
print(math.isfinite(x))  # x is not a finite number
print(math.isfinite(y))  # y is a finite number
Output
Sr.No.   Function          Description
1        pow(x, y)         Returns x raised to the power y.
2        sqrt(x)           Returns the square root of x.
3        exp(x)            Returns e raised to the power x, where e = 2.718281...
4        log(x[, base])    Returns the logarithm of x to the given base. The default base is e.
5        log2(x)           Returns the logarithm of x to base 2.
6        log10(x)          Returns the logarithm of x to base 10.
Example Code
import math
print('The value of 5^8: ' + str(math.pow(5, 8)))
print('Square root of 400: ' + str(math.sqrt(400)))
print('The value of e^5: ' + str(math.exp(5)))
print('The value of Log(625), base 5: ' + str(math.log(625, 5)))
print('The value of Log(1024), base 2: ' + str(math.log2(1024)))
print('The value of Log(1024), base 10: ' + str(math.log10(1024)))
Output
Sr.No.   Function      Description
1        sin(x)        Returns the sine of x, where x is in radians.
2        cos(x)        Returns the cosine of x, where x is in radians.
3        tan(x)        Returns the tangent of x, where x is in radians.
4        asin(x)       Returns the inverse sine of x in radians; acos and atan are the corresponding
                       inverses of cosine and tangent.
5        degrees(x)    Converts angle x from radians to degrees.
6        radians(x)    Converts angle x from degrees to radians.
Example Code
import math
print('The value of Sin(60 degree): ' + str(math.sin(math.radians(60))))
print('The value of cos(pi): ' + str(math.cos(math.pi)))
print('The value of tan(90 degree): ' + str(math.tan(math.pi/2)))
print('The angle of sin(0.8660254037844386): ' +
str(math.degrees(math.asin(0.8660254037844386))))
Output
a. mean()
This function returns the arithmetic mean (average) of the sample data, which can be a sequence or
an iterator.
Example Output
b. harmonic_mean()
This function returns the harmonic mean of the data.
Example Output
c. median()
This function returns the median (middle value) of numeric data.
Example Output
d. median_low()
When the number of data points is even, this function returns the lower of the two middle
values; when it is odd, it returns the middle value.
Example Output
e. median_high()
When the number of data points is even, this function returns the higher of the two middle
values; otherwise, it returns the median of the data.
Example Output
f. median_grouped()
This function returns the median of grouped continuous data, estimated by interpolation.
Example Output
g. mode()
This function returns the data point with the greatest number of occurrences from nominal or
discrete data.
Example Output
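The functions above belong to Python's built-in statistics module. The sketch below uses a hypothetical sample with an even number of points so that median(), median_low() and median_high() give different results.

```python
import statistics

data = [2, 4, 4, 5, 7, 8]    # hypothetical sample, even number of points

avg = statistics.mean(data)
mid = statistics.median(data)            # average of the two middle values
low = statistics.median_low(data)        # lower of the two middle values
high = statistics.median_high(data)      # higher of the two middle values
most_common = statistics.mode(data)
h_mean = statistics.harmonic_mean([40, 60])

print(avg, mid, low, high, most_common, h_mean)
```

With this sample the two middle values are 4 and 5, so median() returns 4.5 while median_low() and median_high() return 4 and 5.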
kind          speed   worst case     work space   stable
'quicksort'   1       O(n^2)         0            no
'heapsort'    3       O(n*log(n))    0            no
numpy.sort()
The sort() function returns a sorted copy of the input array. It has the following parameters.
numpy.sort(a, axis, kind, order)
where,
Sr.No.   Parameter   Description
1        a           Array to be sorted.
2        axis        The axis along which the array is to be sorted. If None, the array is
                     flattened before sorting; the default is the last axis.
3        kind        The sorting algorithm. Default is quicksort.
4        order       If the array has fields, the order in which the fields are to be sorted.
Example
import numpy as np
a = np.array([[3,7],[9,1]])
print('Our array is:')
print(a)
print('\n')
print('Applying sort() function:')
print(np.sort(a))
print('\n')
print('Sort along axis 0:')
print(np.sort(a, axis = 0))
print('\n')
Output
numpy.argsort()
The numpy.argsort() function returns an array of data indices by performing an indirect sort on the
input array along the specified axis. The sorted array is built using this indices array.
Example
import numpy as np
x = np.array([3, 1, 2])
print('Our array is:')
print(x)
print('\n')
print('Applying argsort() to x:')
y = np.argsort(x)
print(y)
print('\n')
print('Reconstruct original array in sorted order:')
print(x[y])
print('\n')
print('Reconstruct the original array using loop:')
# visit the elements of x in the order given by the sorted indices
for i in y:
    print(x[i], end=' ')
Output
numpy.lexsort()
The numpy.lexsort() function performs an indirect sort using a sequence of keys. The keys can be
thought of as columns in a spreadsheet. The function returns an array of indices that can be used
to obtain the sorted data. Note that the last key is the primary sort key.
Example
import numpy as np
nm = ('raju','anil','ravi','amar')
dv = ('f.y.', 's.y.', 's.y.', 'f.y.')
ind = np.lexsort((dv,nm))
print('Applying lexsort() function:')
print(ind)
print('\n')
print('Use this index to get sorted data:')
print([nm[i] + ", " + dv[i] for i in ind])
Output
Numerous functions for searching inside an array are available in the NumPy module. There are
functions for determining the maximum, minimum, and items satisfying a particular criterion.
import numpy as np
a = np.array([[30,40,70],[80,20,10],[50,90,60]])
print('Our array is:')
print(a)
Output
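The listing above stops after printing the array. A minimal sketch of how the maximum and minimum positions could then be located with numpy.argmax() and numpy.argmin(), reusing the same array, is:

```python
import numpy as np

a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

flat_max = np.argmax(a)            # index into the flattened array
col_max = np.argmax(a, axis=0)     # row index of the maximum in each column
row_min = np.argmin(a, axis=1)     # column index of the minimum in each row

print(flat_max)    # 7 (the value 90)
print(col_max)     # [1 2 0]
print(row_min)     # [0 2 0]
```

Without an axis argument the functions work on the flattened array, so the result is a single integer.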
numpy.nonzero()
The output of the numpy.nonzero() function is the indexes of the array's non-zero members.
Example
import numpy as np
a = np.array([[30,40,0],[0,20,10],[50,0,60]])
print('Our array is:')
print(a)
print('\n')
print('Applying nonzero() function:')
print(np.nonzero(a))
Output
numpy.where()
The where() function returns the indices of input array members that satisfy the specified criterion.
Example
import numpy as np
x = np.arange(9.).reshape(3, 3)
print('Our array is:')
print(x)
print('Indices of elements > 3')
y = np.where(x > 3)
print(y)
Output
numpy.extract()
The extract() function returns the elements that satisfy a given condition.
Example
import numpy as np
x = np.arange(9.).reshape(3, 3)
print('Our array is:')
print(x)
# define a condition
condition = np.mod(x,2) == 0
print('Element-wise value of condition')
print(condition)
print('Extract elements using condition')
print(np.extract(condition, x))
Output
Keywords
numpy.sort(): A sorted version of the input array is returned by the sort() function.
numpy.argsort(): The numpy.argsort() function returns an array of data indices by performing an
indirect sort on the input array along the specified axis. The sorted array is built using this indices
array.
numpy.lexsort(): function uses a series of keys to conduct an indirect sort. The keys can be
thought of as a spreadsheet column. The function provides an array of indices that can be used to
retrieve the sorted data.
numpy.argmax() and numpy.argmin(): These two operations give back the indices of the
highest and lowest elements along the specified axis.
numpy.nonzero(): The numpy.nonzero() function returns the indices of the array's non-zero
elements.
numpy.where(): The where() function returns the indices of the input array elements that satisfy
the specified condition.
numpy.extract(): The extract() function returns the elements that satisfy a given condition.
Self Assessment
1. What does the following code produce as output?
# Import math Library
import math
print(math.acos(0.65))
A. 0.863211890069541
B. 2.15316056466364
A. 7
B. 9
C. 8
D. 0
A. 7
B. 9
C. 8
D. 8.5
A. 4,7,-4,22,17
B. 4,7,-4,21,15
C. 4,7,-4,22,15
D. 3,7,-4,22,15
A. 7.6666666
B. 8.6666666
C. 9.6666667
D. 7.456789
A. 3.7416573867739413
B. 3.8516573867739413
C. 4.7416573867739413
D. 3.9426673867739413
A. 3
B. 6
C. 5
D. 4
A. 6
B. 7
C. 5
D. 4
A. 852.63665815335037, -687.5493541569879
B. 352.63665815335037, -687.5493541569879
C. 462.63665815335037, -887.5493541569879
D. 452.63665815335037, -687.5493541569879
A. 55.34, 6.0
B. 54.34,6.0
C. 55.24,6.0
D. None of above
E. Red Green Blue
F. Blue
A. 0.359665538729672
B. 0.269665538729672
C. 0.459665538729672
D. 0.259665538729672
A. 10.222222222222223
B. 11.222222222222223
C. 9.222222222222223
A. 1
B. 2
C. 3
D. 4+4
A. 12
B. 64
C. 444
D. None of above
6. A 7. A 8. D 9. C 10. D
Objectives
Series, Dataframe, Sorting, Working with Csv Files
Operations Using Dataframe
Introduction
Python's Pandas package is used to manipulate data sets.
It offers tools for data exploration, cleaning, analysis, and manipulation.
Pandas was created by Wes McKinney in 2008. The name "Pandas" refers to both "Panel Data" and
"Python Data Analysis."
Installation of Pandas
Pandas installation is fairly simple if Python and PIP are already installed on a machine.
Use this command to install it:
pip install pandas
If this command fails, use a Python distribution that already has Pandas installed, such as
Anaconda or Spyder.
Import Pandas
Once Pandas is installed, import it in your applications with the import keyword:
import pandas
Example Output
Pandas Series
Program
import pandas as pd
a = [3, 6, 1]
myvar = pd.Series(a)
print(myvar)
Output
0    3
1    6
2    1
dtype: int64
Labels
The values are identified with their index number if nothing else is supplied. The index of the first
item is 0, that of the second is 1, etc.
To access a certain value, use this label.
Program
import pandas as pd
a = [3, 6, 1]
myvar = pd.Series(a)
print(myvar[0])
Output
3
Create Labels
You are able to name your own labels using the index option.
Program
import pandas as pd
a = [3, 6, 1]
myvar = pd.Series(a, index = ["a", "b", "c"])
print(myvar)
Output
a    3
b    6
c    1
dtype: int64
When you create labels, you can use the label to get to an item.
Program
import pandas as pd
a = [3, 6, 1]
myvar = pd.Series(a, index = ["a", "b", "c"])
print(myvar["b"])
Output
6
Program
import pandas as pd
calories = {"day1": 400, "day2": 320, "day3": 300}
myvar = pd.Series(calories, index = ["day1", "day2"])
print(myvar)
Output
day1    400
day2    320
dtype: int64
9.3 DataFrames
In Pandas, data sets are usually multidimensional tables called DataFrames.
A Series is like a single column, while a DataFrame is the entire table.
Program Output
Program Output
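The relationship between a DataFrame and a Series can be sketched as follows. The column names and values are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the idea.
data = {"calories": [400, 320, 300], "duration": [50, 40, 45]}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

print(df)                      # the whole table
column = df["calories"]        # a single column is a Series
print(type(column).__name__)   # Series
print(df.shape)                # (3, 2): three rows, two columns
```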
Syntax
list.sort(reverse=True|False, key=myFunc)
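Parameter Values
Parameter   Description
reverse     Optional. reverse=True sorts the list in descending order. Default is reverse=False.
key         Optional. A function that specifies the sorting criteria.
Program
# A function that returns the length of the value:
def myFunc(e):
    return len(e)
cars = ['Ford', 'Mitsubishi', 'BMW', 'VW']
cars.sort(key=myFunc)
print(cars)
Output
['VW', 'BMW', 'Ford', 'Mitsubishi']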
Program Output
Structure of CSV
2001 1 39343
2004 5 40000
2015 6 70000
The file is named "Salary Data.csv". The header line of a CSV file contains the names of the
fields (features).
Reading a CSV
Python offers a variety of ways to handle CSV files.
Using csv.reader
CSV files can be read with the csv.reader object from Python's built-in csv module.
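A minimal sketch of csv.reader follows. To keep it self-contained it reads from an in-memory string instead of a file on disk; the column names are assumptions, since the original table header is not shown.

```python
import csv
import io

# In-memory stand-in for a CSV file; the column names are hypothetical.
csv_text = "Year,Experience,Salary\n2001,1,39343\n2004,5,40000\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)     # the first row holds the field names
rows = list(reader)       # every remaining row is a list of strings

print(header)   # ['Year', 'Experience', 'Salary']
print(rows)     # [['2001', '1', '39343'], ['2004', '5', '40000']]
```

With a real file, the same reader would typically be built from open(path, newline=""). Note that every value comes back as a string and must be converted before doing arithmetic.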
Program
# import pandas as pd
import pandas as pd
# list of strings
lst = ['This', 'is', 'my', 'File', 'read', 'it', 'carefully']
# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
print(df)
Output
           0
0       This
1         is
2         my
3       File
4       read
5         it
6  carefully
Program Output
# Create DataFrame
df = pd.DataFrame(data)
Data is arranged in rows and columns in a data frame, which is a two-dimensional data structure.
Basic operations like selecting, removing, adding, and renaming can be done on rows and columns.
Column selection: We have two options for accessing columns in a Pandas DataFrame in order to
choose one of them.
Program Output
Row Selection: Pandas offers a special mechanism for retrieving rows from a DataFrame. The
DataFrame.loc[] indexer retrieves rows by label. Rows can also be selected by passing an integer
position to the iloc[] indexer.
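Column and row selection can be sketched with a small hypothetical DataFrame standing in for a file such as nba.csv. The names and values below are invented for illustration.

```python
import pandas as pd

# Hypothetical data; the index plays the role of the "Name" column.
df = pd.DataFrame(
    {"Team": ["A", "B", "C"], "Age": [25, 31, 28]},
    index=["Avery", "Jae", "John"],
)

ages = df["Age"]                # column selection by name -> Series
subset = df[["Team", "Age"]]    # list of names -> DataFrame
row_by_label = df.loc["Jae"]    # row selection by index label -> Series
row_by_pos = df.iloc[2]         # row selection by integer position -> Series

print(row_by_label)
print(row_by_pos)
```

loc uses the labels stored in the index, while iloc counts positions from 0 regardless of the labels.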
Program Output
Program Output
Output:
Because only one label was passed each time, two Series were returned.
Program Output
import pandas as pd
# making data frame from csv file
data = pd.read_csv("nba.csv", index_col="Name")
# retrieving rows by iloc method
row2 = data.iloc[3]
print(row2)
Summary
A Series is a one-dimensional labelled array that can store any kind of data (integer, string,
float, Python objects, etc.). The axis labels are collectively called the index.
If the data is an ndarray, the index passed must have the same length as the data. If no index
is provided, the default index is range(n), where n is the array length, that is
[0, 1, 2, ..., len(array) - 1].
The values are identified with their index number if nothing else is supplied. The index of
the first item is 0, that of the second is 1, etc.
You are able to name your own labels using the index option.
CSV stands for "Comma Separated Values". It is the simplest way to save tabular data
as plain text. Because data scientists routinely use CSV data in their daily work, it is
important to know how to work with it.
Keywords
DataFrame: A Pandas DataFrame is a two-dimensional data structure with rows and columns,
similar to a two-dimensional array.
Series: A Pandas Series resembles a table's column. It is a one-dimensional array that can hold any
kind of data.
Labels: The values are identified with their index number if nothing else is supplied. The index of
the first item is 0, that of the second is 1, etc.
Key/Value Object: When constructing a Series, you can also use a key/value object such as a
dictionary.
DataFrames: In Pandas, data sets are usually multidimensional tables, or "DataFrames."
CSV Files: CSV (comma-separated values) files are an easy way to store large data sets.
Max_rows: The Pandas option setting that controls how many rows are returned when a DataFrame
is printed.
Column_Selection: To select a column in a Pandas DataFrame, we access it by its column name.
Row_Selection: Rows from a Pandas DataFrame are retrieved using the DataFrame.loc indexer.
Indexing Operator: The indexing operator [] is also used by the .loc and .iloc indexers to make
selections.
Dropna(): The dropna() method removes null values from a DataFrame. Depending on its arguments,
it drops the rows or the columns that contain null values.
Self Assessment
1. Which of the following is not true about DataFrame?
A. A dataframe can be created by passing dictionaries
B. A dataframe is size immutable
C. A dataframe index can be string
D. A column of dataframe can have different types
A. 1
B. 4
C. 3
D. 2
8. Which of the following is used to give user defined column index in DataFrame?
A. index
B. column
C. columns
D. colindex
A. 1
B. 2
C. 3
D. 4
10. With regard to separated value files such as .csv and .tsv, what is the delimiter?
A. Delimiters are not used in separated value files
B. Any character such as the comma (,) or tab (\t) that is used to separate the column data.
11. In separated value files such as .csv and .tsv, what does the first row in the file typically
contain?
A. The source of the data
B. The column names of the data
C. Notes about the table data
D. The author of the table data
12. When iterating over an object returned from csv.reader(), what is returned with each
iteration?
For example, given the following code block that assumes csv_reader is an object returned
from csv.reader(), what would be printed to the console with each iteration?
for item in csv_reader:
print(item)
13. When we create Data Frame from Dictionary of List then Keys becomes the ___________
A. Row Labels
B. Column Labels
C. Both of the above
D. None of the above
14. Data Frame created from a single Series has ______ column
# Import math Library
import math
# Return the value of 9 raised to the power of 3
print (math.pow (4, 3))1
A. 1
B. 2
C. n (n is the number of elements in the series)
D. None of above
15. What do you understand by pandas? Explain use of pandas with example along with
installation procedure.
16. What do you understand by CSV file? Explain the steps to read a CSV file.
17. Explain with example creation of DataFrame from dict of ndarrays.
18. Explain column selection and row selection in a DataFrame with examples.
19. Explain the difference between .loc and .iloc using an example.
6. C 7. A 8. A 9. C 10. C
Objectives
After this unit, student would be able to:
Introduction
Missing data is a constant issue in real-world situations. In fields like machine learning and data
mining, the poor data quality caused by missing values severely hampers the accuracy of model
predictions. Missing value treatment is therefore a prominent area of focus for improving the
accuracy and validity of models in these fields.
Program Output
print(df)
We have produced a DataFrame with missing values using reindexing. NaN stands for Not a
Number in the output.
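Since the original program is not reproduced here, the following is a minimal sketch of how reindexing can introduce missing values. The labels and column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with three labelled rows.
df = pd.DataFrame(
    np.arange(9.0).reshape(3, 3),
    index=["a", "c", "e"],
    columns=["one", "two", "three"],
)

# Reindexing with labels that do not exist ("b", "d") creates rows filled with NaN.
df = df.reindex(["a", "b", "c", "d", "e"])

print(df)
print(df["one"].isnull())   # True for the rows that were introduced
```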
Program Output
Program Output
The value zero is being filled in here, but any other value may be used.
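A short sketch of the common ways to treat missing values in a Series follows; the data is hypothetical. fillna() replaces NaN with a chosen value, ffill() carries the previous value forward, and dropna() removes the missing entries.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan])   # hypothetical data with two gaps

filled_zero = s.fillna(0)     # replace NaN with 0 (any other value works too)
filled_forward = s.ffill()    # propagate the last valid value forward
dropped = s.dropna()          # discard the missing entries instead

print(filled_zero)
print(filled_forward)
print(dropped)
```

None of these calls change s itself; each returns a new Series.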
Method Action
Program Output
Program Output
1. Importing Libraries
2. Input Customer Feedback Dataset
3. Locate Missing Data
4. Check for Duplicates
5. Detect Outliers
6. Normalize Casing
1. Importing Libraries
Let's get your Python script going with NumPy and Pandas installed.
import pandas as pd
import numpy as np
In this situation, the libraries ought to be loaded into your
Input
data = pd.read_csv('feedback.csv')
Output:
As you can see, the dataset we want to examine is "feedback.csv". The "pd.read_csv" call shows
that we are using the Pandas library to read our dataset.
Input
data.isnull()
Output
From here, we actually clean the data using code. There are only two primary options here:
either remove the data or fill in the blanks. If you decide to:
INPUT:
data.duplicated()
OUTPUT:
The result is a list of boolean values in which duplicates are marked with a 'True' reading.
Let's move forward and remove that duplicate (datapoint 8).
INPUT:
data.drop_duplicates()
OUTPUT:
5. Detect Outliers
Numerical values that lie significantly outside the statistical norm are considered outliers. Put
plainly, they are data points so far out of range that they are probably misreads.
Like duplicates, they must be dealt with. Let's pull up our dataset and look for an outlier.
INPUT:
data['Rating'].describe()
OUTPUT:
Look at that "max" value: none of the other values, including the mean (average), are even
close to 100. How you address outliers depends on your understanding of your dataset. In this
instance, the data scientists who entered the data know that they meant to enter a value of 1,
not 100, so we can safely correct the outlier.
INPUT:
data.loc[10,'Rating'] = 1
OUTPUT:
Now that our dataset only contains ratings between 1 and 5, there won't be any big distortion
caused by a single errant 100.
6. Normalize Casing
Last but not least, we'll dot our i's and cross our t's. That means we will standardise (lowercase)
all review titles so that they do not confuse our algorithms, and capitalise customer names so
that they are recognised consistently. Here's how to make every review title lowercase:
INPUT
data['Review Title'] = data['Review Title'].str.lower()
OUTPUT
Looks fantastic! Now let's make sure that none of our software misclassifies a customer name
because it isn't capitalised. Here is how to capitalise "Customer Name" correctly:
INPUT:
data['Customer Name'] = data['Customer Name'].str.title()
OUTPUT:
And there it is: our data set with all the fixes applied. To find and correct inaccurate data and
normalise the remaining data, we made good use of Python packages.
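The whole walk-through can be sketched end to end on a small in-memory table. The rows below are hypothetical stand-ins for feedback.csv; only the column names are taken from the steps above, and the rule used to correct the outlier (any rating above 5 becomes 1) is an assumption made for this example.

```python
import pandas as pd

# Hypothetical stand-in for feedback.csv.
data = pd.DataFrame({
    "Customer Name": ["alice smith", "BOB JONES", "BOB JONES", "carol lee"],
    "Review Title": ["Great Product", "NOT bad", "NOT bad", "Too Slow"],
    "Rating": [5, 3, 3, 100],
})

print(data.isnull().sum())          # locate missing data, per column
print(data.duplicated())            # True marks a repeated row

# drop_duplicates() returns a new frame, so the result must be assigned.
data = data.drop_duplicates().reset_index(drop=True)

print(data["Rating"].describe())    # the max of 100 stands out
data.loc[data["Rating"] > 5, "Rating"] = 1    # assumed correction rule

data["Review Title"] = data["Review Title"].str.lower()
data["Customer Name"] = data["Customer Name"].str.title()
print(data)
```

Calling data.drop_duplicates() on its own only displays the de-duplicated result; assigning it back, as done here, is what actually removes the rows from the working data.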
Summary
The practise of correcting or deleting inaccurate, damaged, improperly formatted, duplicate,
or incomplete data from a dataset is known as data cleaning.
The process of converting data from one format or structure to another is known as data
transformation.
Remove duplicate or pointless observations as well as undesirable observations from your
dataset. The majority of duplicate observations will occur during data gathering.
When you measure or transfer data and discover odd naming conventions, typos, or
incorrect capitalization, those are structural errors.
There will frequently be isolated findings that, at first look, do not seem to fit the data you
are evaluating.
The main measure of how well-founded and likely accurate a concept, conclusion, or
measurement is called validity.
Similar to missing data, duplicates are problematic and choke analytics tools. Let's find them
and get rid of them.
Numerical values that are significantly beyond the statistical norm are considered outliers.
Another choice will need to be made: to maintain the data in the set while simply dropping
the missing values, or to completely remove the feature (the entire column) because there
are so many missing datapoints that it is unusable for analysis.
Keywords
Data type constraints: Each column's values must belong to a specific data type, such as
Boolean, numeric (integer or real), date, etc.
Range Constraints: Most of the time, dates or numbers must fall inside a specified range. In other
words, they have minimum and/or maximum values that are acceptable.
Unique Constraints: A field, or a group of fields, must be distinct throughout a dataset. The same
social security number cannot be shared by two people, for instance.
Set-Membership Constraints: A set of discrete values or codes is used to generate the values for
each column. A person's sex, for instance, may be Female, Male, or Non-Binary.
Foreign-key Constraints: This is the more general case of set membership. The set of values in a
column is defined in a column of another table that contains unique values. For instance, the
"state" column in a US taxpayer database must be one of the US's recognized states or territories;
the list of acceptable states and territories is kept in a separate State table. The term foreign
key is borrowed from relational database terminology.
Regular expression patterns: It may occasionally be necessary to validate text fields in this
manner. For instance, it might be necessary for phone numbers to follow the pattern (999) 999-9999.
Cross-field validation: A certain set of multi-field conditions must be true. For instance, in
laboratory medicine, the components of the differential white blood cell count must add up to 100
(since they are all percentages). In a hospital database, a patient's date of discharge cannot be
earlier than the date of admission.
Accuracy: The degree of conformity of a measure to a standard or a true value.
Completeness: The extent to which all necessary measures are known. Data cleansing techniques
can almost never completely correct incompleteness, since they cannot be used to infer
information that was not originally recorded in the data.
Consistency: The degree to which a set of measures is equivalent across systems.
Uniformity: The extent to which a set of data measures is specified using the same units of
measurement in all systems.
Duplicate Detection: An algorithm is needed for duplicate detection in order to determine
whether the same thing is represented twice in the data.
Parsing: A parser determines whether a string of data complies with the specification for permitted
data. This is comparable to how a parser deals with languages and grammars.
Statistical Methods: An expert may discover values that are unexpected and thus incorrect by
analyzing the data using the mean, standard deviation, range, or clustering algorithms.
Self Assessment
1. Which of the following phrases describes the challenge of identifying abstract patterns (or
structures) in unlabeled data?
A. Supervised learning
B. Unsupervised learning
C. Hybrid learning
D. Reinforcement learning
4. The total number of neonates in the example of predicting the number of births can be
thought of as the ______________.
A. Features
B. Observation
C. Attribute
D. Outcome
7. The ________________ is the analysis carried out to find interesting statistical correlations
between associated attribute-value pairs.
A. Mining of association
B. Mining of correlation
C. Mining of clusters
D. All of the above
8. Which of the following can be characterized as a data object that deviates from the norm (or
the model of available data)?
A. Evaluation Analysis
B. Outlier Analysis
C. Classification
D. Prediction
11. How many different types of data warehousing approaches are there to integrate
heterogeneous databases?
A. 3
B. 4
C. 5
D. 2
6. C 7. B 8. B 9. D 10. D
11. A 12. D 13. A 14. B 15. D
Review Questions
1. What do you understand by data cleaning? Explain best practices for data cleaning.
2. Explain with code how null values are stored in pandas DataFrames.
3. What is the difference between structured and unstructured data?
4. What are the effects of missing values on prediction? Also explain the functions that are used
to handle missing values.
5. Explain the following with examples.
a. How to see the first five rows of a DataFrame in Python
b. Define data profiling.
c. How to check the class of each variable in a pandas DataFrame
d. Write code to see the dimensions of a DataFrame in Python.
e. Explain data mining.
Further Readings
Mark Lutz, Programming Python: Powerful Object-Oriented Programming, O'Reilly
Wes McKinney, Python for Data Analysis, O'Reilly
David Ascher and Mark Lutz, Learning Python, O'Reilly
Eric Matthes, Python Crash Course, 2nd Edition: A Hands-On, Project-Based
Introduction to Programming, No Starch Press
Web Links
https://www.tutorialspoint.com/python/index.htm
https://www.python.org/downloads/
https://www.w3schools.in/python/data-types
https://www.programiz.com/python-programming/online-compiler/
https://www.codecademy.com/catalog/language/python
Dr. Rajni Bhalla, Lovely Professional University Unit 11: Data Visualization
Objectives
After this unit, the student would be able to:
Introduction
When working with data, it can be challenging to fully comprehend it if it is presented only in
tabular form. We must represent our data visually to fully comprehend what it means, to clean it
properly, and to choose the best models for it. Visualization reveals patterns, correlations, and
trends that cannot easily be seen in data presented as a table or CSV file.
Data visualization is the act of using visual representations of our data to identify trends and
relationships. We can use a variety of Python data visualization libraries, such as Matplotlib,
Seaborn, and Plotly, to visualize data.
Python provides several plotting libraries, including Matplotlib, Seaborn, and many other data
visualization tools, with a variety of features for building informative, customized, and visually
appealing plots that present data simply and effectively.
Matplotlib: It is used to plot simple graphs like line charts, bar graphs, and so forth. For
exploratory data analysis, Matplotlib is more customizable and works well with Pandas and NumPy.
Seaborn: It can carry out complicated visualizations with fewer commands. It is primarily used for
statistical visualization and has more built-in themes.
Figure 2 Matplotlib vs Seaborn
Consider the yield of apples (tonnes per hectare) in Kanto. Using this information, let's create a
line graph to show how the apple yield has changed over time. We begin by importing Seaborn and
Matplotlib.
Using Matplotlib
To depict the yield of apples, we are utilizing arbitrary data points.
import matplotlib.pyplot as plt
import seaborn as sns
yield_apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]
plt.plot(yield_apples)
We can also include the values for the x-axis to clarify the graph's meaning.
years = [2010, 2011, 2012, 2013, 2014, 2015]
yield_apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]
plt.plot(years, yield_apples)
Let's give the axes labels so we can demonstrate what each axis represents.
plt.plot(years, yield_apples)
plt.xlabel('Year')
plt.ylabel('Yield (tons per hectare)')
To plot multiple datasets on the same graph, simply call the plt.plot method once for each dataset.
Let's use this to compare the yields of apples and oranges on the same graph.
years = range(2000, 2012)
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931, 0.934, 0.936, 0.937, 0.9375, 0.9372, 0.939]
oranges = [0.962, 0.941, 0.930, 0.923, 0.918, 0.908, 0.907, 0.904, 0.901, 0.898, 0.9, 0.896]
plt.plot(years, apples)
plt.plot(years, oranges)
plt.xlabel('Year')
plt.ylabel('Yield (tons per hectare)')
With the help of the marker parameter, we can use markers to show each data point on our graph.
Matplotlib offers a wide variety of marker shapes, including a circle, cross, square, diamond, etc.
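As a minimal sketch of the marker parameter, the listing below redraws the apple and orange yields with a circle and a cross at each data point. The choice of "o" and "x" is arbitrary, and the Agg backend line is only there so that the sketch also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

years = list(range(2000, 2012))
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931,
          0.934, 0.936, 0.937, 0.9375, 0.9372, 0.939]
oranges = [0.962, 0.941, 0.930, 0.923, 0.918, 0.908,
           0.907, 0.904, 0.901, 0.898, 0.9, 0.896]

# marker="o" draws a circle and marker="x" a cross at every data point
apple_line, = plt.plot(years, apples, marker="o")
orange_line, = plt.plot(years, oranges, marker="x")

plt.xlabel("Year")
plt.ylabel("Yield (tons per hectare)")
plt.legend(["Apples", "Oranges"])
plt.show()
```

Other marker codes include "s" for a square and "D" for a diamond.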
11.4 Seaborn
Seaborn is a high-level interface built on top of Matplotlib. It offers attractive design themes and
colour palettes that make graphs more appealing.
Enter the following command in the terminal to install Seaborn:
pip install seaborn
matplotlib.pyplot.scatter()
Scatter plots use dots to show the relationship between two variables and how a change in one
affects the other. To create a scatter plot, use the scatter() method of the matplotlib library.
Syntax
The syntax for scatter() method is given below:
matplotlib.pyplot.scatter(x_axis_data, y_axis_data, s=None, c=None, marker=None, cmap=None,
vmin=None, vmax=None, alpha=None, linewidths=None, edgecolors=None)
The following parameters are passed to the scatter() method:
x_axis_data: An array containing the x-axis data.
Program
import pandas as pd
import matplotlib.pyplot as plt
# Read the dataset (tips.csv is assumed to be in the working directory)
data = pd.read_csv("tips.csv")
# Scatter plot with day against tip
plt.scatter(data['day'], data['tip'])
# Adding Title to the Plot
plt.title("Scatter Plot")
# Setting the X and Y labels
plt.xlabel('Day')
plt.ylabel('Tip')
plt.show()
Bars can also be stacked on top of one another. Let's plot the data for apples and oranges this way.
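One way to stack bars in Matplotlib is the bottom argument of plt.bar, which tells the second series where each of its bars should start. The sketch below reuses the first six apple and orange yields from the line chart example; treating them as bar data is an assumption made only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

years = list(range(2000, 2006))
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]
oranges = [0.962, 0.941, 0.930, 0.923, 0.918, 0.908]

# First series starts at zero
apple_bars = plt.bar(years, apples, label="Apples")
# Second series starts where the apple bars end
orange_bars = plt.bar(years, oranges, bottom=apples, label="Oranges")

plt.xlabel("Year")
plt.ylabel("Yield (tons per hectare)")
plt.legend()
plt.show()
```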
[Figure: total bill plotted against time of day]
To see how the average bill amount varies across the days of the week, we can create a bar chart.
We can do this by calculating the day-wise averages and then calling plt.bar.
The Seaborn library also offers a barplot function that computes the averages automatically.
The hue argument can be used to compare bars side by side, split by the values of a third
feature.
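Both routes can be sketched as follows. The small tips table is invented for the example (the column names day, sex, and total_bill are assumptions), so the averages it produces are not real results.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical restaurant bills, invented for this sketch
tips = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
    "sex": ["Male", "Female"] * 4,
    "total_bill": [18.0, 16.0, 20.0, 14.0, 22.0, 19.0, 24.0, 20.0],
})

# Route 1: compute the day-wise averages ourselves, then call bar()
day_means = tips.groupby("day", sort=False)["total_bill"].mean()
fig1, ax1 = plt.subplots()
ax1.bar(list(day_means.index), day_means.values)
ax1.set_xlabel("day")
ax1.set_ylabel("average total_bill")

# Route 2: let Seaborn compute the averages; hue splits each day by sex
fig2, ax2 = plt.subplots()
sns.barplot(data=tips, x="day", y="total_bill", hue="sex", ax=ax2)

plt.show()
```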
11.7 Histograms
A histogram is a bar graph that shows how the values of a variable are distributed. The value
ranges (bins) are plotted along the x-axis, and the number of observations that fall into each range
is plotted along the y-axis as the height of a bar. Let's plot histograms using the "Iris" data, which
provides details about flowers.
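A basic histogram can be sketched with plt.hist. The Iris file itself is not included in these notes, so the listing generates synthetic sepal widths as a stand-in; the mean, spread, and bin count are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the sepal widths of the Iris data
rng = np.random.default_rng(0)
sepal_width = rng.normal(loc=3.0, scale=0.4, size=150)

# hist() returns the count in each bin and the bin edges
counts, bin_edges, _ = plt.hist(sepal_width, bins=5)

plt.xlabel("Sepal width (cm)")
plt.ylabel("Number of flowers")
plt.show()
```

With 5 bins there are 6 bin edges, and the counts add up to the number of observations.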
We can include several histograms in a single chart, just like we can with line charts. So that the
bars of one histogram don't obscure those of the others, we can make each histogram less opaque.
Let's create distinct histograms for every type of flower.
If the stacked parameter is set to True, then many histograms can be piled on top of one another.
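The two ideas above, semi-transparent overlapping histograms and stacked histograms, can be sketched side by side. The three groups are synthetic stand-ins for the flower types, and the alpha value and bin settings are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic sepal widths for three flower types (stand-in for the Iris data)
rng = np.random.default_rng(1)
setosa = rng.normal(3.4, 0.4, 50)
versicolor = rng.normal(2.8, 0.3, 50)
virginica = rng.normal(3.0, 0.3, 50)

fig, (ax_overlay, ax_stacked) = plt.subplots(1, 2, figsize=(10, 4))

# Overlapping histograms: alpha below 1 keeps the bars behind visible
shared_bins = np.linspace(1.5, 5.0, 15)
for values, name in [(setosa, "Setosa"), (versicolor, "Versicolor"),
                     (virginica, "Virginica")]:
    ax_overlay.hist(values, bins=shared_bins, alpha=0.4, label=name)
ax_overlay.legend()

# Stacked histograms: pass all datasets at once with stacked=True
stacked_counts, edges, _ = ax_stacked.hist(
    [setosa, versicolor, virginica], bins=10, stacked=True)

plt.show()
```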
Summary
The human mind processes and comprehends data more easily when it is presented with
images, maps, and graphs.
Matplotlib is used to plot simple graphs like line charts, bar graphs, and so forth.
Seaborn can carry out complicated visualisations with fewer commands and is primarily used
for statistical visualisation.
A line chart is a graph that shows data as a collection of points connected by straight
lines.
Use the plt.plot method once for each dataset to plot multiple datasets on the same
graph.
Seaborn is a high-level interface built on top of Matplotlib. It offers attractive design
themes and colour palettes that make graphs more appealing.
Python's Matplotlib library provides a comprehensive toolkit for building static, animated, and
interactive visualisations.
Scatter plots use dots to show the relationship between variables.
When you have categorical data, a bar graph can be used to display it. A bar graph plots the
categories on the x-axis and uses the height of each bar to indicate the value on the y-axis.
A histogram is a bar graph that shows how the values of a variable are distributed. The value
ranges are plotted along the x-axis, and the number of observations in each range is plotted
along the y-axis.
Keywords
Seaborn: A dataset-oriented Python library for creating statistical visualisations.
Bokeh: A visualisation library that targets modern web browsers.
Altair: A declarative statistical visualisation library for Python. Its simple, consistent API is built
on top of the Vega-Lite JSON specification.
Plotly: plotly.py is a high-level, declarative, interactive, open-source, browser-based visualisation
library for Python.
Ggplot: A Python implementation of the grammar of graphics.
Bar Chart: Used to compare metric values across different subsets of the data.
Column Chart: Typically used to compare a single category of data between individual sub-items,
such as revenue between regions.
Stacked Bar Chart: Used to compare the totals of the available groups as well as the composition
of their subgroups.
Pie Chart: Used to show how much each component contributes to a whole.
Area Chart: Used to track changes over time for one or more groups.
Column Histogram: Used to view the distribution of a single variable with few data points.
Scatter Plot: Used to examine the relationship between two variables.
Box Plot: Displays the shape of a distribution, its central value, and its variability.
Waterfall Chart: Illustrates how the value of a variable changes step by step through increments
and decrements.
Venn Diagrams: Used to visualise the relationships between two or three sets of items.
Self Assessment
Q1. Select the one that does not visualize data
A. Charts
B. Shapes
C. Graphs
D. Maps
A. Histogram
B. Boxplot
C. Pie
D. All of the above
A. Bar
B. Line
C. Histogram
D. Box Plot
A. Line
B. Bar
C. Pie
D. Scatter
A. 1D
B. 2D
C. 3D
D. All of above
A. Visualization
B. Analysis
C. Plotting
D. Handling
A. Seaborn
B. Anaconda
C. MATLAB
D. Pyplot
A. visual
B. matlibplot
C. matplotlib
D. matlab
A. line
B. plot
C. graph
D. bar
A. pie
B. basemap
C. bar
D. histogram
A. bar
B. histogram
C. scatterplots
D. basemap
A. show()
B. plotting()
C. plot()
D. plots()
A. Plotting area
B. Legend
C. Axis labels
D. All of the above
6. B 7. A 8. D 9. C 10. D
11. D 12. C 13. C 14. C 15. D
Review Questions
1. What is data visualization? Write down the benefits of data visualization.
2. Write down the differences between Matplotlib and Seaborn.
3. Explain the line chart with an example.
4. What do you understand by Seaborn? Write down the command to install Seaborn. Explain the
use of Seaborn with an example.
5. Explain the differences between a scatter plot, a bar graph, and a histogram.
Further Readings
Maheshwari, Anil. Big Data. McGraw-Hill Education, 2019.
Mayer-Schonberger, Viktor; Cukier, Kenneth (2013). Big Data: A Revolution That
Will Transform How We Live, Work, and Think . Houghton Mifflin Harcourt.
McKinsey Global Institute Report (2011). Big Data: The Next Frontier For
Innovation, Competition, and Productivity. Mckinsey.com
Marz, Nathan, and James Warren (2015). Big Data: Principles and Best Practices of
Scalable Realtime Data Systems. Manning Publications.
Sandy Ryza, Uri Laserson et al. (2014). Advanced Analytics with Spark. O'Reilly.
White, Tom (2014). Mastering Hadoop. O'Reilly.
Web Links
1. Apache Hadoop resources: https://hadoop.apache.org/docs/r2.7.2/
2. Apache HDFS: https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
3. Hadoop API site: http://hadoop.apache.org/docs/current/api/
4. NoSQL databases: http://nosql-database.org/
5. Apache Spark: http://spark.apache.org/docs/latest/
Dr. Rajni Bhalla, Lovely Professional University Unit 12: Data Visualization
Objectives
After this unit, student would be able to
Introduction
Seaborn is a Python visualization library well suited to plotting statistical graphics. It offers
attractive default styles and color palettes that make statistical charts more appealing. It is built
on top of the Matplotlib library and is tightly integrated with the Pandas data structures.
Seaborn aims to make visualization a central part of exploring and understanding data. It offers
dataset-oriented APIs that let us switch between different visual representations of the same
variables for a better understanding of the dataset.
12.2 Installation
For python environment: pip install seaborn
For conda environment: conda install seaborn
Program
# Importing libraries
import numpy as np
import seaborn as sns
# Select a style: white, dark, whitegrid, darkgrid, or ticks
sns.set(style="white")
# Generate a random univariate dataset
rs = np.random.RandomState(10)
d = rs.normal(size=100)
Line Plot: The line plot is one of the most fundamental plots in the Seaborn library. It is mostly
used to depict continuous data, such as a time series.
Program
import seaborn as sns
sns.set(style="dark")
fmri = sns.load_dataset("fmri")
Lmplot: The lmplot is another very simple plot. It displays the data points in a 2D space together
with a line that represents a linear regression model. The x and y arguments set the variables
shown on the horizontal and vertical axes, respectively.
Program
import seaborn as sns
sns.set(style="ticks")
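A minimal sketch of an lmplot call is shown below. The data points are invented so that they roughly follow a straight line, and the column names x and y are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set(style="ticks")

# Hypothetical, roughly linear data invented for this sketch
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6, 7, 8],
    "y": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1],
})

# lmplot draws the points and a fitted regression line; it returns a FacetGrid
grid = sns.lmplot(data=df, x="x", y="y")
regression_ax = grid.axes.flat[0]
plt.show()
```

Unlike most Seaborn functions, lmplot creates its own figure and returns a FacetGrid rather than an Axes object.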
Dealing with multiple figures
Matplotlib: We can open and work with many figures at once, but they must be closed explicitly.
One figure can be closed at a time using matplotlib.pyplot.close(). All the figures can be closed
using matplotlib.pyplot.close("all").
Seaborn: Seaborn automates the creation of multiple figures, but this can lead to out-of-memory
(OOM) problems.
Data frames and arrays
Matplotlib: Matplotlib works well with data frames and arrays. It treats figures and axes as
objects and contains several stateful plotting APIs, so methods such as plot() can work without
parameters.
Seaborn: Seaborn is more organised and convenient than Matplotlib and treats the entire dataset
as a single unit. Because Seaborn is not as stateful, parameters are required when calling methods
such as plot().
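The two ways of closing Matplotlib figures can be checked with a short sketch. It opens three empty figures and uses plt.get_fignums() to list the figures that are still open.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is needed here
import matplotlib.pyplot as plt

plt.close("all")                 # start from a clean state
for _ in range(3):
    plt.figure()                 # each call opens a new figure
open_before = plt.get_fignums()

plt.close()                      # closes only the current figure
open_after_one = plt.get_fignums()

plt.close("all")                 # closes every remaining figure
open_after_all = plt.get_fignums()

print(len(open_before), len(open_after_one), len(open_after_all))
```

Closing figures that are no longer needed keeps memory use down when a script creates many plots in a loop.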
Pandas
Pandas provides tools for processing and cleaning up your data. It is the most widely used Python
data analysis library. In pandas, a data table is called a DataFrame.
Program
import pandas as pd
# Hypothetical sample data for illustration
data = {"name": ["Asha", "Ravi"], "age": [31, 27]}
# Create DataFrame
df = pd.DataFrame(data)
print(df)
Example2: Load the CSV data from the system and display it through pandas.
Program
# import module
import pandas
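The program above breaks off after the import, so the following sketch shows one way the remaining steps might look. To stay self-contained it reads the CSV text from an in-memory buffer; the column names and values are invented, and with a real file you would pass its path to read_csv instead.

```python
import io

import pandas as pd

# Hypothetical CSV content; with a file on disk, use pd.read_csv("your_file.csv")
csv_text = "name,age\nAsha,31\nRavi,27\n"

df = pd.read_csv(io.StringIO(csv_text))

# head() shows the first rows (five by default)
print(df.head())
```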
Seaborn
Line Plot
Scatter Plot
Box plot
Point plot
Count plot
Violin plot
Swarm plot
Bar plot
KDE Plot
Line Plot:
Scatter plots are highly effective, but no single type of visualisation is always the best. Instead,
the visual representation should be adapted to the characteristics of the dataset and to the
purpose of the plot. With some datasets, you may want to understand how one variable changes as
a function of time or of a similarly continuous variable. In this situation, a line plot is a good
choice. In Seaborn, this can be done with the lineplot() function, either directly or through
relplot() by setting kind="line".
Scatter Plot
A scatter plot can be combined with several semantic groupings that help to make a graph easier to
understand. It draws a two-dimensional graphic that can be enhanced by mapping up to three
additional variables using the hue (colour), size, and style parameters. Each of these parameters
controls a visual semantic that is used to distinguish different subsets. Using redundant semantics,
for example both hue and style for the same variable, can make graphics more accessible.
Syntax: seaborn.scatterplot(x=None, y=None, hue=None, style=None, size=None, data=None,
palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None,
size_norm=None, markers=True, style_order=None, x_bins=None, y_bins=None, units=None,
estimator=None, ci=95, n_boot=1000, alpha='auto', x_jitter=None, y_jitter=None, legend='brief',
ax=None, **kwargs)
Parameters:
x, y: Input data variables that should be numeric.
data: Dataframe where each column is a variable and each row is an observation.
hue: Grouping variable that will produce points with different colors.
size: Grouping variable that will produce points with different sizes.
style: Grouping variable that will produce points with different markers.
palette: Colors to use for the different levels of the hue variable.
markers: Object determining how to draw the markers for different levels.
alpha: Proportional opacity of the points.
Returns: This method returns the Axes object with the plot drawn onto it.
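The parameters above can be tried out with a short sketch that maps three extra variables onto colour, point size, and marker shape. The table is invented for the example; its column names only imitate a restaurant tips dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical restaurant data, invented for this sketch
df = pd.DataFrame({
    "total_bill": [10.3, 21.0, 23.7, 24.6, 25.3, 8.8, 26.9, 15.0],
    "tip": [1.7, 3.5, 3.3, 3.6, 4.7, 2.0, 3.1, 2.0],
    "day": ["Thur", "Fri", "Sat", "Sun"] * 2,
    "size": [2, 3, 3, 4, 4, 2, 4, 2],
    "time": ["Lunch", "Dinner"] * 4,
})

# hue -> colour, size -> point size, style -> marker shape
ax = sns.scatterplot(data=df, x="total_bill", y="tip",
                     hue="day", size="size", style="time")
plt.show()
```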
Box Plot:
The seaborn boxplot has a very simple structure. Boxplots represent distributions visually, which is
very helpful when comparing data between two groups.
A boxplot is also called a box-and-whisker plot. The box shows the quartiles of the dataset, and
the whiskers extend to show the rest of the distribution. Box plots are drawn with the boxplot()
function.
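A minimal sketch of boxplot() comparing two groups follows. The scores are invented for the example, and the column names group and score are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical test scores for two groups, invented for this sketch
df = pd.DataFrame({
    "group": ["A"] * 6 + ["B"] * 6,
    "score": [52, 58, 61, 64, 70, 75, 60, 66, 71, 73, 80, 95],
})

# One box per group: the box spans the quartiles, the whiskers the rest
ax = sns.boxplot(data=df, x="group", y="score")
plt.show()
```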
Point Plot:
A point plot represents an estimate of central tendency for a numeric variable by the position of a
dot, and uses error bars to indicate the uncertainty around that estimate. Point plots can be more
useful than bar plots for comparing different levels of one or more categorical variables. They are
particularly good at showing interactions, that is, how the relationship between the levels of one
categorical variable changes across the levels of a second categorical variable. The lines that join
each point from the same hue level allow interactions to be judged by differences in slope, which
is easier for the eye than comparing the heights of several groups of points or bars.
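The idea of reading interactions from slopes can be sketched with pointplot() and a hue variable. The bills below are invented for the example, with two observations per group so that an estimate and an error bar can be drawn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical bills: two observations for each day and sex (invented)
df = pd.DataFrame({
    "day": ["Sat"] * 4 + ["Sun"] * 4,
    "sex": ["Male", "Male", "Female", "Female"] * 2,
    "total_bill": [20.0, 24.0, 18.0, 19.0, 21.0, 27.0, 22.0, 26.0],
})

# Each dot is the mean of a group; lines join dots of the same hue level
ax = sns.pointplot(data=df, x="day", y="total_bill", hue="sex")
plt.show()
```

If the two lines are not parallel, the effect of the day differs between the hue levels, which is what an interaction means here.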
Count Plot:
Syntax: seaborn.countplot(data=None, *, x=None, y=None, hue=None, order=None,
hue_order=None, orient=None, color=None, palette=None, saturation=0.75, width=0.8,
dodge=True, ax=None, **kwargs)
A count plot can be thought of as a histogram across a categorical variable instead of a
quantitative one. The basic API and options are identical to those for barplot(), so you can
compare counts across nested variables. The newer histplot() function offers more functionality,
although its default behaviour is somewhat different.
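A minimal sketch of countplot() is given below. The passenger classes are invented for the example; the tallest bar should match the most frequent category.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical categorical data, invented for this sketch
df = pd.DataFrame({
    "travel_class": ["First", "Third", "Third", "Second", "Third", "First"],
})

# One bar per category; the bar height is the number of rows in that category
ax = sns.countplot(data=df, x="travel_class")
bar_heights = [bar.get_height() for bar in ax.patches]
plt.show()
```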
Violin Plot:
Syntax: seaborn.violinplot(data=None, *, x=None, y=None, hue=None, order=None,
hue_order=None, bw='scott', cut=2, scale='area', scale_hue=True, gridsize=100, width=0.8,
inner='box', split=False, dodge=True, orient=None, linewidth=None, color=None, palette=None,
saturation=0.75, ax=None, **kwargs)
A violin plot plays a similar role to a box-and-whisker plot. It shows the distribution of
quantitative data across several levels of one (or more) categorical variables so that those
distributions can be compared. Unlike a box plot, in which all of the plot components correspond
to actual datapoints, the violin plot features a kernel density estimate of the underlying
distribution. This can be an effective and attractive way to show multiple distributions of data at
once, but keep in mind that the estimation procedure is influenced by the sample size, so violins
for small samples might look misleadingly smooth.
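The sketch below draws one violin per group using only the x and y arguments, leaving every other option at its default. The measurements are randomly generated stand-ins, so the shapes carry no real meaning.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic measurements for two groups, generated for this sketch
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "group": ["A"] * 30 + ["B"] * 30,
    "value": np.concatenate([rng.normal(10, 2, 30), rng.normal(14, 3, 30)]),
})

# Each violin shows a kernel density estimate of the values in one group
ax = sns.violinplot(data=df, x="group", y="value")
plt.show()
```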
Swarm Plot: The Seaborn swarmplot is similar to the stripplot, except that the points are adjusted
so that they do not overlap, which gives a better representation of the distribution of values. A
swarm plot can be drawn on its own, but it also works well as a complement to a box plot.
Passing the data as a DataFrame is preferable, since the column names are then used to annotate
the axes. This style of plot is sometimes called a "beeswarm".
Syntax: seaborn.swarmplot(x=None, y=None, hue=None, data=None, order=None,
hue_order=None, dodge=False, orient=None, color=None, palette=None, size=5,
edgecolor='gray', linewidth=0, ax=None, **kwargs)
Parameters:
x, y, hue: Inputs for plotting long-form data.
data: Dataset for plotting.
color: Color for all of the elements.
size: Radius of the markers, in points.
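As a small sketch of combining the two plots, the listing below draws a box plot first and then places the individual observations on top with swarmplot(). The scores, the black colour, and the marker size are all choices made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical scores for two groups, invented for this sketch
df = pd.DataFrame({
    "group": ["A"] * 8 + ["B"] * 8,
    "score": [5, 6, 6, 7, 7, 7, 8, 9, 6, 8, 8, 9, 9, 9, 10, 12],
})

# Box plot for the summary, swarm plot for the individual observations
ax = sns.boxplot(data=df, x="group", y="score", color="lightgray")
sns.swarmplot(data=df, x="group", y="score", color="black", size=4, ax=ax)
plt.show()
```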
Bar Plot: A bar plot, also known as a bar chart, is a graph that represents categories of data with
rectangular bars whose lengths or heights are proportional to the values they represent. The bars
can be plotted horizontally or vertically. A bar chart shows comparisons among discrete
categories. One axis of the plot shows the specific categories being compared, and the other axis
represents the measured values corresponding to those categories.
Summary
Seaborn is a package that uses Matplotlib as its foundation to plot graphs. It is used to
visualise random distributions.
Relational plots depict the statistical relationship between data points. Visualisation is
essential because it enables humans to recognise trends and patterns in data.
Histograms are plotted using the seaborn distplot, as well as the kdeplot and rugplot
variants.
Another very simple plot is the lmplot. It displays a line denoting a linear regression model
together with data points in a 2D space, and the labels x and y can be set to represent the
horizontal and vertical axes, respectively.
Data is graphically represented in data visualisation. It facilitates data analysis and
forecasting by breaking down a large dataset into manageable graphs.
Matplotlib is used to create simple graphs. Bar graphs, histograms, pie charts, scatter plots,
lines, and other visual representations of data are used to visualize datasets.
Seaborn provides a variety of visualisation patterns and uses attractive themes.
The visual presentation of data is known as data visualisation. It is crucial for data analysis,
and Python offers an excellent ecosystem of data-focused packages for it.
Pandas provides tools for processing and cleaning up your data. It is the most widely used
Python data analysis library.
Seaborn is a Python visualisation library well suited to plotting statistical graphics. It is
built on top of the Matplotlib library and is tightly integrated with the Pandas data
structures.
The seaborn boxplot has a very simple structure. Distributions are represented visually
using boxplots.
A point plot uses the position of the dot to indicate an estimate of the central tendency for a
numerical variable, and error bars are used to show the degree of uncertainty surrounding
that estimate. For comparisons between various levels of one or more categorical variables,
point plots may be more helpful than bar plots.
A violin plot and a box and whisker plot serve the same purpose. In order to allow for
comparison, it displays the distribution of quantitative data across a number of levels of one
(or more) categorical variables.
The Seaborn swarmplot is similar to the stripplot, except that the points are adjusted so
that they do not overlap, which better depicts the distribution of values.
The KNN algorithm, also referred to as K-nearest neighbor, is a non-parametric algorithm
that groups data points according to their proximity and association with other pieces of
available information.
Keywords
Relational plots: Plots used to see how two variables are related.
Categorical plots: Plots that deal with categorical variables and how they can be visualised.
Distribution plots: Plots used to examine univariate and bivariate distributions.
Regression plots: The regression plots in Seaborn are primarily intended to add a visual guide
that helps to emphasise patterns in a dataset during exploratory data analysis.
Matrix plots: A matrix plot is an array of scatterplots.
Multi-plot grids: Drawing multiple instances of the same plot on different subsets of the dataset
is a useful approach.
Visualizations: Data visualisation is the graphical representation of data. It facilitates data
analysis and forecasting by breaking down a large dataset into manageable graphs.
Pandas and Seaborn: Together, Pandas and Seaborn make importing and analysing data much simpler.
Scatter: A scatter plot can be combined with several semantic groupings that help to make a graph
easier to understand.
Box Plot: A boxplot is also called a box-and-whisker plot. The box shows the quartiles of the
dataset, and the whiskers extend to show the rest of the distribution. Box plots are drawn with
the boxplot() function.
Point plot: A point plot represents an estimate of central tendency for a numeric variable by the
position of a dot, and uses error bars to indicate the uncertainty around that estimate.
Self Assessment
Q1. Series and DataFrame's plot function is only a basic wrapper over _____________
A. gplt.plot()
B. plt.plot()
C. plt.plotgraph()
D. none of the mentioned
Q2. Please specify the ideal kind keyword combination for graph plotting.
Q3. Which of the following values of the kind keyword produces a bar plot?
A. Bar
B. Kde
C. Hexbin
D. none of the mentioned
Q4. You can produce a scatter plot matrix using the _________ method in pandas.plotting
(formerly pandas.tools.plotting).
A. sca_matrix
B. scatter_matrix
C. DataFrame.plot
D. all of the mentioned
Q5: Indicate the incorrect kind keyword combination for graph plotting.
Q6. Which of the following plots are used to check if a data set or time series is random?
A. Lag
B. Random
C. Lead
D. None of the mentioned
A. Charts
B. Maps
C. Shapes
D. Graphs
A. Histogram
B. Boxplot
C. Pie
D. All are correct
A. df.plot(type='hist', edge='red')
B. df.plot(type='hist', edgecolor='red')
C. df.plot(type='hist', line='red')
D. df.plot(type='hist', linecolor='red')
Q11: Which of the following is not the parameter of pyplot’s plot() method.
A. Marker
B. Lineheight
C. Linestyle
D. Color
A. Line
B. Bar
C. Pie
D. Scatter
Q13: Which of the following chart element is used to identify data series by its color patterns.
A. Chart title
B. Legend
C. Marker
D. Data Labels
A. 1D
B. 2D
C. 3D
D. All of the above
A. Visualisation
B. Analysis
C. Plotting
D. Handling
6. A 7. C 8. C 9. B 10. D
Review Questions
Further Readings
Mark Lutz, Programming Python: Powerful Object-Oriented Programming, O'Reilly
Web Links
https://www.tutorialspoint.com/python/index.htm
https://www.python.org/downloads/
https://www.w3schools.in/python/data-types
https://www.programiz.com/python-programming/online-compiler/
https://www.codecademy.com/catalog/language/python
Dr. Rajni Bhalla, Lovely Professional University Unit 13: OOP Concepts
Objectives
After studying this unit, you will be able to:
Introduction
Object-oriented programming (OOP) is a programming paradigm in Python that makes use of
objects and classes. It aims to implement real-world concepts such as inheritance, polymorphism,
and encapsulation in programming. The main idea behind OOP is to bind together the data and the
functions that work on it as a single unit, so that no other part of the code can access this data.
The core concepts of object-oriented programming are:
Class
Object
Method
Inheritance
Polymorphism
Data Abstraction
Encapsulation
13.1 Class
A class is a collection of objects. A class contains the blueprint or prototype from which objects are
created. It is a logical entity that contains some attributes and methods. To understand the need
for creating classes, consider the following scenario: suppose you need to keep track of a number
of dogs that may have different attributes, such as breed or age. If a list is used, the first element
could be the dog's breed and the second element its age. What if there were 100 different dogs?
How would you know which element is supposed to be which? What if you wanted to add other
properties to these dogs? This approach lacks organisation, and it is exactly the problem that
classes solve.
A few notes on the Python class:
13.2 Objects
An object is an entity that has a state and behaviour associated with it. It may be any real-world
object such as a mouse, keyboard, chair, table, or pen. Integers, strings, floating-point numbers,
and even arrays and dictionaries are all objects. More specifically, any single integer or any single
string is an object. The number 12 is an object, the string "Hello, world" is an object, a list is an
object that can hold other objects, and so on. You have been using objects all along and may not
even have realised it.
An Object consists of:
State: It is represented by the attributes of an object. It also reflects the properties of an
object.
Behavior: It is represented by the methods of an object. It also reflects the response of an object
to other objects.
Identity: It gives a unique name to an object and enables one object to interact with other
objects.
To better understand state, behaviour, and identity, let's take the example of the class Dog.
The state or attributes can be considered as the breed, age, or colour of the dog.
The behaviour can be considered as whether the dog is eating or sleeping.
a. The self
Methods defined in a class must have an extra first parameter in the method definition.
We do not give a value for this parameter when we call the method; Python provides it.
Even if a method takes no arguments of its own, it must still have this one parameter.
This is similar to the this pointer in C++ and the this reference in Java.
When we call a method of an object as myobject.method(arg1, arg2), Python automatically
converts it into MyClass.method(myobject, arg1, arg2). This is all that the special self is
about.
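The conversion described above can be verified with a short sketch. The Calculator class and its method names are hypothetical and exist only for this illustration.

```python
class Calculator:
    def add(self, a, b):
        # self is the object the method was called on
        return a + b

    def describe(self):
        # No arguments of its own, but self is still required
        return "I am a {}".format(type(self).__name__)


calc = Calculator()

# Calling through the object: Python passes calc as self
via_object = calc.add(2, 3)

# The equivalent explicit call through the class
via_class = Calculator.add(calc, 2, 3)

print(via_object, via_class)   # both calls give the same result
print(calc.describe())
```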
b. The __init__ method
The __init__ method is similar to constructors in C++ and Java. It runs as soon as an object of a
class is instantiated. The method is useful for any initialisation you want to do with your object.
Now let's define a class and create some objects using self and the __init__ method.
Example1: Class and object creation using class and instance properties
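class Dog:
    # Class attribute
    attr1 = "mammal"

    # Instance attribute
    def __init__(self, name):
        self.name = name

# Driver code
# Object instantiation
Rodger = Dog("Rodger")
Tommy = Dog("Tommy")

# Accessing the class attribute
print("Rodger is a {}".format(Rodger.attr1))
print("Tommy is also a {}".format(Tommy.attr1))

# Accessing the instance attributes
print("My name is {}".format(Rodger.name))
print("My name is {}".format(Tommy.name))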
Output
Rodger is a mammal
Tommy is also a mammal
My name is Rodger
My name is Tommy
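Example2: Class and object creation with a method
class Dog:
    # Class attribute
    attr1 = "mammal"

    # Instance attribute
    def __init__(self, name):
        self.name = name

    def speak(self):
        print("My name is {}".format(self.name))

# Driver code
# Object instantiation
Rodger = Dog("Rodger")
Tommy = Dog("Tommy")

# Calling the method on each object
Rodger.speak()
Tommy.speak()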
Output
My name is Rodger
My name is Tommy
13.3 Methods
A method is a function that is associated with an object. In Python, methods are not unique to
class instances: any object type can have methods.
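As a brief illustration of the point that any object type can have methods, the sketch below calls methods on built-in list, str, and float objects. The variable names are chosen only for this example.

```python
# Methods on built-in objects: no user-defined class is involved
names = ["Rodger", "Tommy"]
names.append("Bruno")          # list method

shout = "mammal".upper()       # str method

is_whole = (2.0).is_integer()  # float method

print(names)
print(shout)
print(is_whole)
```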
13.4 Inheritance
The capacity of one class to derive or inherit properties from another class is known as inheritance.
The class from which the properties are derived is referred to as the base class or parent class,
and the class that derives them is referred to as the derived class or child class. A key advantage
of inheritance is code reuse: a child class can use the attributes and methods of its parent
without rewriting them.
Types of Inheritance
Single Inheritance
Single inheritance enables a derived class to inherit properties from a single parent class.
Multilevel Inheritance
Multilevel inheritance enables a derived class to inherit properties from its immediate parent
class, which in turn inherits properties from its own parent class.
Hierarchical Inheritance
Hierarchical inheritance enables more than one derived class to inherit properties from the same
parent class.
Multiple Inheritance
Multiple inheritance enables one derived class to inherit properties from more than one base
class.
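The four types listed above can be sketched in a few lines. All class names here (Animal, Swimmer, Dog, Puppy, Cat, Duck) are hypothetical and exist only to show the shape of each relationship.

```python
class Animal:                      # base class
    def kind(self):
        return "animal"


class Swimmer:                     # a second, unrelated base class
    def swim(self):
        return "can swim"


class Dog(Animal):                 # single: one parent
    def speak(self):
        return "woof"


class Puppy(Dog):                  # multilevel: Puppy -> Dog -> Animal
    pass


class Cat(Animal):                 # hierarchical: Dog and Cat share Animal
    def speak(self):
        return "meow"


class Duck(Animal, Swimmer):       # multiple: two parents
    pass


print(Puppy().kind(), Puppy().speak())
print(Cat().kind(), Cat().speak())
print(Duck().kind(), Duck().swim())

# The method resolution order shows where Python looks for a method
print([cls.__name__ for cls in Duck.__mro__])
```

The last line prints the method resolution order, which is the sequence of classes Python searches when a method is called on a Duck object.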
Example: Python inheritance
# Python code to demonstrate how parent constructors
# are called.

# parent class
class Person(object):
    # __init__ is known as the constructor
    def __init__(self, name, idnumber):
        self.name = name
        self.idnumber = idnumber

    def display(self):
        print(self.name)
        print(self.idnumber)

    def details(self):
        print("My name is {}".format(self.name))
        print("IdNumber: {}".format(self.idnumber))

# child class
class Employee(Person):
    def __init__(self, name, idnumber, salary, post):
        # invoke the parent constructor so that name and idnumber are set
        Person.__init__(self, name, idnumber)
        self.salary = salary
        self.post = post

    def details(self):
        print("My name is {}".format(self.name))
        print("IdNumber: {}".format(self.idnumber))
        print("Post: {}".format(self.post))

# Driver code (the salary value is illustrative; it is never printed)
a = Employee("Rahul", 886012, 200000, "Intern")
# display() is inherited from Person
a.display()
# details() is overridden in Employee
a.details()
Output
Rahul
886012
My name is Rahul
IdNumber: 886012
Post: Intern
In the example above, two classes have been created: Person (the parent class) and Employee (the
child class). The Employee class inherits from the Person class. As the display() method shows, we
can use the methods of the Person class through an Employee object. The details() method shows how
a child class can modify the behaviour of the parent class.
13.5 Polymorphism
Simply put, polymorphism means having multiple forms. For instance, using polymorphism we can
find out whether a given species of bird can fly by calling a single method.
Example: Polymorphism in Python
class Bird:
    def intro(self):
        print("There are many types of birds.")

    def flight(self):
        print("Most of the birds can fly but some cannot.")

class sparrow(Bird):
    def flight(self):
        print("Sparrows can fly.")

class ostrich(Bird):
    def flight(self):
        print("Ostriches cannot fly.")

obj_bird = Bird()
obj_spr = sparrow()
obj_ost = ostrich()

obj_bird.intro()
obj_bird.flight()
obj_spr.intro()
obj_spr.flight()
obj_ost.intro()
obj_ost.flight()
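Output
There are many types of birds.
Most of the birds can fly but some cannot.
There are many types of birds.
Sparrows can fly.
There are many types of birds.
Ostriches cannot fly.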
13.6 Data Abstraction
Abstraction hides internal details and shows only the functionality. Abstracting something means
giving it a name that captures the essence of what a function or an entire programme does.
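One common way to express abstraction in Python is the standard abc module, sketched below. The text does not name this module, so its use here is an assumption, and the Shape, Circle, and Square classes are hypothetical examples.

```python
from abc import ABC, abstractmethod
import math


class Shape(ABC):
    """Exposes what a shape can do, not how each shape does it."""

    @abstractmethod
    def area(self):
        """Return the area; each concrete shape decides how."""


class Circle(Shape):
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


class Square(Shape):
    def __init__(self, side):
        self._side = side

    def area(self):
        return self._side ** 2


# The caller relies only on the area() name, not on the internal details
for shape in (Circle(1.0), Square(2.0)):
    print(type(shape).__name__, round(shape.area(), 2))

# The abstract class itself cannot be instantiated
try:
    Shape()
except TypeError as exc:
    print("Cannot instantiate:", exc)
```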
13.7 Encapsulation
One of the core ideas in object-oriented programming (OOP) is encapsulation. It describes the
idea of wrapping data and the methods that work on that data within a single unit. This restricts
direct access to variables and methods and can prevent accidental modification of data. To
prevent accidental changes, an object's variable can only be changed by an object's method. Such
variables are known as private variables.
A class is an example of encapsulation, as it encapsulates all of its data: member functions,
variables, and so on.
[Figure: a class enclosing its methods and variables]
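# Python program to
# demonstrate private members

# Creating a base class
class Base:
    def __init__(self):
        self.a = "EcontentOnline"
        self.__c = "EcontentOnline"   # private (name-mangled) attribute

# Creating a derived class
class Derived(Base):
    def __init__(self):
        # Calling constructor of
        # Base class
        Base.__init__(self)
        print("Calling private member of base class: ")
        print(self.__c)   # raises AttributeError if Derived() is created

# Driver code
obj1 = Base()
print(obj1.a)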
Output
EcontentOnline
In the example above, the variable __c is created as a private attribute. It cannot be read or
modified directly from outside the Base class.
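The sketch below shows what happens behind a double-underscore attribute. The Account class and its methods are hypothetical; the renaming of __balance to _Account__balance is Python's standard name mangling.

```python
class Account:
    """Hypothetical class illustrating a private attribute."""

    def __init__(self, balance):
        self.__balance = balance      # stored as _Account__balance

    def deposit(self, amount):
        # The only supported way to change the balance
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.__balance += amount

    def balance(self):
        return self.__balance


acct = Account(100)
acct.deposit(50)
print(acct.balance())

try:
    print(acct.__balance)             # not reachable under this name
except AttributeError:
    print("AttributeError: __balance is not visible from outside the class")

# Name mangling is a convention, not a security barrier
print(acct._Account__balance)
```

The last line shows that the attribute can still be reached through its mangled name, so private attributes in Python guard against accidents rather than against deliberate access.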
Object-oriented programming | Procedural programming
It makes development and maintenance simpler. | When a project grows in scope, maintaining the code is difficult in procedural programming.
It models real-world entities, so OOP makes it simple to tackle real-world problems. | It does not model the real world. It works on step-by-step instructions divided into smaller units called functions.
C++, Java, .Net, Python, C#, and other object-oriented programming languages are examples. | Procedural languages include C, Fortran, Pascal, VB, and others.
Summary
The Python programming style known as object-oriented programming (OOP) makes use of objects and classes.
A class is a collection of objects. Classes are the blueprints or prototypes from which objects are created.
An object is an entity that has state and behaviour. It may be any real-world item, such as a mouse, keyboard, chair, table, or pen.
The __init__ method is comparable to constructors in Java and C++. It runs as soon as an object of the class is created.
A method is a function associated with an object. In Python, methods are not unique to class instances: any object type can have methods.
The capacity of one class to derive or inherit properties from another class is known as inheritance.
Simply put, polymorphism means having multiple forms. For instance, using polymorphism we can find out whether a given species of bird can fly by calling a single method.
Data abstraction and encapsulation are often used interchangeably. Since data abstraction is achieved through encapsulation, the two terms are almost synonymous.
One of the core ideas in object-oriented programming (OOP) is encapsulation. It describes the idea of wrapping data and the methods that work on that data within a single unit.
Object-oriented programming is the approach to problem solving in which computation is done using objects.
Procedural programming uses a list of instructions to carry out computation step by step.
Keywords
OOP: OOP stands for object-oriented programming. Procedural programming is about writing procedures or functions that perform operations on the data, while object-oriented programming is about creating objects that contain both data and methods.
The __init__ method: All classes have a method called __init__(), which is always executed when an object is created from the class.
The __str__ method: The __str__() method determines what is returned when an object of the class is rendered as a string.
Object methods: Objects can also contain methods. Methods in objects are functions that belong to the object.
Self parameter: The self parameter is a reference to the current instance of the class and is used to access variables that belong to that instance.
Inheritance: By using inheritance, we can create a class that has all the methods and attributes of another class.
Parent class: The class being inherited from, also known as the base class, is the parent class.
Child class: The class that inherits from another class is referred to as a child class or derived class.
super() function: The super() function gives a child class access to the methods and properties of its parent class.
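Two of the keywords above, __str__() and super(), can be seen working together in the following sketch. It reuses the Person and Employee names from the inheritance example, but the __str__ methods and the shortened constructor are assumptions made for this illustration.

```python
class Person:
    def __init__(self, name, idnumber):
        self.name = name
        self.idnumber = idnumber

    def __str__(self):
        # Used by print() and str() to render the object as text
        return "{} ({})".format(self.name, self.idnumber)


class Employee(Person):
    def __init__(self, name, idnumber, post):
        # super() reaches the parent class without naming it explicitly
        super().__init__(name, idnumber)
        self.post = post

    def __str__(self):
        # Extend, rather than replace, the parent's string form
        return "{}, {}".format(super().__str__(), self.post)


emp = Employee("Rahul", 886012, "Intern")
print(emp)
```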
Self Assessment
Q1: Which option best describes inheritance?
A. A.__init__(self)
B. B.__init__(self)
C. A.__init__(B)
D. B.__init__(A)
Q4: What function type is a built-in in the context of classes?
A. Double-level
B. Multi-level
C. Single-level
D. Multiple
A. Encapsulation
B. Inheritance
C. Instantiation
D. Polymorphism
A. The capacity of a class to derive individuals from other classes as part of its own definition.
B. Techniques for combining instance variables and methods to limit access to specific class
members
C. focuses on supplying parameters to functions and variables.
D. enables the use of sophisticated software that is well-designed and flexible.
A. Albert Einstein
B. Guido Van Rossum
C. Guido Evan
D. None of these
A. 1975
B. 1989
C. 1972
D. 1990
Q11: Which of the following has the highest precedence in an expression?
A. Addition
B. Subtraction
C. Parentheses
D. Power
A. Object
B. Logical
C. Real
D. Hypothetical
Q15: Which of the following can be considered a combination of data abstraction and
programming?
A. Class
B. Object
C. Inheritance
D. Interfaces
6. C 7. B 8. B 9. B 10. B
Review Questions
1. What do you understand by OOP? Write the code to create an empty Python class.
2. Define objects. Write an example that creates an object with methods.
3. What do you understand by inheritance? Define the types of inheritance.
4. Write down the differences between single inheritance, multilevel inheritance, and multiple inheritance.
5. Define polymorphism. Write Python code that demonstrates the use of polymorphism.
6. What do you understand by encapsulation? Write a Python program to demonstrate private members.
7. Write down the differences between object-oriented programming and procedural programming.
Further Readings
Mark Lutz, Programming Python: Powerful Object-Oriented Programming, O'Reilly
Wes McKinney, Python for Data Analysis, O'Reilly
David Ascher and Mark Lutz, Learning Python, O'Reilly
Eric Matthes, Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming, No Starch Press
Web Links
https://www.tutorialspoint.com/python/index.htm
https://www.python.org/downloads/
https://www.w3schools.in/python/data-types
https://www.programiz.com/python-programming/online-compiler/
https://www.codecademy.com/catalog/language/python
Objectives
After studying this unit, you will be able to:
Introduction
Programs that use machine learning algorithms can discover hidden patterns in data, predict
outcomes, and improve their performance based on experience. Different algorithms suit different
tasks: for example, simple linear regression can be used for prediction problems such as stock
market forecasting, and the KNN algorithm can be used for classification problems.
Supervised learning | Unsupervised learning
Supervised learning algorithms are trained using labelled data. | Unsupervised learning algorithms are trained using unlabelled data.
A supervised learning model takes direct feedback to check whether it is predicting the correct output. | An unsupervised learning model does not take any feedback.
A supervised learning model predicts the output. | An unsupervised learning model finds hidden patterns in the data.
In supervised learning, the model receives input data along with the output. | In unsupervised learning, the model receives only input data.
Supervised learning needs supervision to train the model. | Unsupervised learning can train the model without any supervision.
Supervised learning problems can be grouped into classification and regression. | Unsupervised learning problems can be grouped into clustering and association.
Supervised learning can be applied when both the input and the corresponding output are known. | Unsupervised learning can be applied when we have only input data and no corresponding output data.
A supervised learning model yields accurate results. | An unsupervised learning model may give less accurate results by comparison.
Linear regression models a linear relationship between a dependent variable (y) and one or more
independent variables (x). Because the relationship is linear, the model shows how the value of the
dependent variable changes as the value of the independent variable changes. The linear
regression model represents this relationship as a sloped straight line. Consider the figure below:
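For a single independent variable, the sloped line can be written as y = intercept + slope * x. The sketch below estimates both values with the usual least-squares formulas in plain Python rather than scikit-learn; the function name fit_line and the sample points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # the fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return intercept, slope


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [10.0 + 2.0 * x for x in xs]   # noise-free points on y = 10 + 2x
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

Because the sample points lie exactly on a line, the fit recovers that line; with noisy data the estimates would only be close to the true values.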
NumPy: fundamental package for scientific computing, used here to create the example dataset.
pandas: a powerful tool for data analysis and manipulation.
scikit-learn (sklearn): tools for predictive data analysis, including linear regression.
Matplotlib: plotting library for visualization.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression  # import only the LinearRegression class
import matplotlib.pyplot as plt
# IPython/Jupyter magic: omit the next line when running as a plain .py script
%matplotlib inline
The NumPy array x1 is converted to a two-dimensional array (a matrix with one column) because
scikit-learn expects the features in that shape.
reshape(-1, 1): the 1 asks for one column, and the -1 tells NumPy to work out the number of rows
from the length of the original x1.
We begin by creating an instance of the class LinearRegression, named lr.
LinearRegression accepts several optional parameters, but we'll leave them at their default values.
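The code that creates x1 and y is not shown in this unit. The following is a minimal sketch of how such a dataset might be built, assuming the relationship y = 10 + 2*x1 quoted later in the text; the sample size, value range, noise level, and random seed are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed (assumed) for repeatability
x1 = rng.uniform(0, 10, size=100)        # 100 feature values; range is assumed
noise = rng.normal(0, 1, size=100)       # noise level is an assumption
y = 10 + 2 * x1 + noise                  # relationship quoted in the text

# -1: let NumPy infer the number of rows; 1: exactly one column
features = x1.reshape(-1, 1)

print(x1.shape, features.shape, y.shape)
```

The printed shapes show the effect of reshape: x1 is one-dimensional, while features has one row per sample and a single column.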
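features = x1.reshape(-1, 1)   # 2-D array with one column, as sklearn expects
target = y
lr = LinearRegression()        # model with default parameters
lr.fit(features, target)       # estimate the slope and the intercept
print(lr.coef_)                # array holding one slope per feature
print(lr.intercept_)           # scalar intercept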
We can see that .coef_ returns an array, while .intercept_ returns a scalar. The values of these
coefficients are fairly close to their true values (y = 10 + 2*x1). The regression line and the
training dataset can be shown together in a single plot.
plt.figure(figsize=(20,7))
plt.plot(x1, y, 'o')   # training data
x_chart = np.linspace(x1.min(), x1.max(), num=100)
plt.plot(x_chart, lr.intercept_ + lr.coef_[0]*x_chart)   # fitted regression line
Output:
array([10.06295511, 11.85833473, 13.65371434, 15.44909395])
Decision-tree learners can create over-complex trees that do not generalize well from the data. This is called overfitting. Mechanisms such as pruning, setting the minimum number of samples required at a leaf node, or setting the maximum depth of the tree are necessary to avoid this problem.
Decision trees can be unstable because small variations in the data may result in a completely different tree. This problem is mitigated by using decision trees within an ensemble.
As the figure above shows, decision tree predictions are piecewise constant approximations, neither smooth nor continuous. They are therefore not good at extrapolation.
The problem of learning an optimal decision tree is known to be NP-complete under several aspects of optimality, even for simple concepts. Practical decision-tree learning algorithms are therefore based on heuristics such as the greedy algorithm, in which locally optimal decisions are made at each node. Such algorithms cannot guarantee to return the globally optimal decision tree. This can be mitigated by training multiple trees in an ensemble learner, where the features and samples are randomly sampled with replacement.
Some concepts, such as XOR, parity, or multiplexer problems, are hard to learn because decision trees do not express them easily.
Decision-tree learners create biased trees if some classes dominate. It is therefore recommended to balance the dataset before fitting the decision tree.
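The piecewise constant behaviour and the weakness at extrapolation can be seen with the smallest possible tree. The sketch below fits a depth-1 regression tree (a "stump") in plain Python; it is an illustration under simplifying assumptions, not the algorithm used by any particular library, and the function names are hypothetical.

```python
def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold, one mean per side."""
    best = None
    pairs = sorted(zip(xs, ys))
    for i in range(1, len(pairs)):
        # Candidate threshold halfway between neighbouring x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        # Greedy criterion: total squared error of the two leaves
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


xs = [1, 2, 3, 4, 5, 6]
ys = [2 * x for x in xs]           # a straight line, y = 2x
stump = fit_stump(xs, ys)

print(stump)
print([predict_stump(stump, x) for x in xs])
print(predict_stump(stump, 100))   # far outside the training range
```

Inside the training range the stump returns only two distinct values, and for x = 100 it still returns the mean of the right-hand leaf rather than following the line upward.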
Create a specific question or define the data you need, then obtain the required data from the source.
Make sure that the data is in an accessible format; if it is not, convert it to the required format.
Note any obvious anomalies and missing data points that may need attention to obtain the required data.
Create a machine learning model.
Set the baseline model that you want to achieve.
Train the machine learning model on the data.
Use test data to gain insight into the model.
Now compare the performance metrics of the test data with the data predicted by the model.
If the model does not meet your expectations, try improving it, updating your data, or using another data modelling technique.
Finally, interpret the results you have obtained and report accordingly.
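The steps above can be sketched end to end in plain Python. This is a simplified stand-in, not scikit-learn's random forest: it bags depth-1 regression trees trained on bootstrap samples, and because the toy data has a single feature, no feature sampling takes place. The data, the function names, and the number of trees are all assumptions.

```python
import random


def fit_stump(points):
    """Depth-1 regression tree on (x, y) pairs: best threshold by squared error."""
    pts = sorted(points)
    best = None
    for i in range(1, len(pts)):
        if pts[i - 1][0] == pts[i][0]:
            continue                                   # no gap to split in
        t = (pts[i - 1][0] + pts[i][0]) / 2
        left = [y for x, y in pts if x <= t]
        right = [y for x, y in pts if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:                                   # all x equal: predict the mean
        mean = sum(y for _, y in pts) / len(pts)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm


def fit_forest(points, n_trees, rng):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    return [fit_stump([rng.choice(points) for _ in points]) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the individual tree predictions."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)


rng = random.Random(0)
data = [(float(x), 3.0 * x + rng.gauss(0, 1)) for x in range(20)]  # toy data (assumed)
train_pts, test_pts = data[::2], data[1::2]            # simple hold-out split

forest = fit_forest(train_pts, n_trees=50, rng=rng)
baseline = sum(y for _, y in train_pts) / len(train_pts)  # baseline: predict the mean


def mse(predict):
    return sum((predict(x) - y) ** 2 for x, y in test_pts) / len(test_pts)


baseline_mse = mse(lambda x: baseline)
forest_mse = mse(lambda x: predict_forest(forest, x))
print("baseline MSE:", round(baseline_mse, 2))
print("forest MSE:", round(forest_mse, 2))
```

Comparing the two error values corresponds to the step of checking the model's performance metrics against the baseline on test data.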
In the example below, you will apply a similar sampling technique.
Here is a detailed example of how Random Forest Regression is implemented.
# Importing the libraries
Step 3: Select all rows and column 1 from the dataset as x, and all rows and column 2 as y
# The selection code, which was not shown, looks like this:
x = df.iloc[:, :-1]   # ":" selects all rows; ":-1" selects every column except the last
y = df.iloc[:, -1:]   # ":" selects all rows; "-1:" selects only the last column
# The iloc indexer selects data by integer position, so it can pick out a particular cell, row,
# or column of a data frame.
Output
It uses an iterative process to determine the best positions for the K centre points, or centroids.
It assigns each data point to its nearest centroid. The data points closest to a particular centroid form a cluster.
As a result, each cluster contains data points with some commonality and is distinct from the other clusters.
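The two tasks described above, assigning points to the nearest centroid and then moving each centroid, can be sketched in plain Python as follows. The sample points and the starting centroids are assumptions chosen so that the two groups are easy to see; the sketch needs Python 3.8 or later for math.dist.

```python
import math


def kmeans(points, centroids, max_iters=100):
    """Iterative k-means for 2-D points, starting from the given centroids."""
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster)))
            else:
                new_centroids.append(old)      # keep an empty cluster's centroid
        if new_centroids == centroids:         # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

In practice the starting centroids are usually chosen at random or by a seeding heuristic, and different starting points can lead to different clusters.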
The K-means Clustering Algorithm is explained in the diagram below:
Summary
Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
Simple linear regression is an approach for predicting a response using a single feature.
Supervised learning, also known as supervised machine learning, is a subcategory of machine learning and artificial intelligence. It is defined by its use of labelled datasets to train algorithms to classify data or predict outcomes accurately.
Classification uses an algorithm to accurately assign test data to specific categories.
Regression is used to understand the relationship between dependent and independent variables.
Neural networks, used mainly for deep learning algorithms, process training data by mimicking the interconnectivity of the human brain through layers of nodes.
Naive Bayes is a classification approach that adopts the principle of class conditional independence from Bayes' theorem.
Linear regression is used to identify the relationship between a dependent variable and one or more independent variables, and is typically used to make predictions about future outcomes.
Linear regression is used when the dependent variable is continuous, while logistic regression is used when the dependent variable is categorical, meaning it has binary outputs such as "true" and "false" or "yes" and "no".
The support vector machine is a popular supervised learning model developed by Vladimir Vapnik, used for both data classification and regression.
The KNN algorithm, also known as K-nearest neighbor, is a non-parametric algorithm that classifies data points based on their proximity and association to other available data.
Keywords
Linear Regression: In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables.
Linearity: This means that the mean of the response variable is a linear combination of the parameters (regression coefficients) and the predictor variables.
Constant variance: This means that the variance of the errors does not depend on the values of the predictor variables. Thus the variability of the responses for given fixed values of the predictors is the same regardless of how large or small the responses are.
Independence of Errors: This assumes that the errors of the response variables are uncorrelated with each other.
K Nearest Neighbor: K-Nearest Neighbor is one of the simplest machine learning algorithms based on the supervised learning technique.
Lazy Learner Algorithm: K-Nearest Neighbor is also called a lazy learner algorithm because it does not learn from the training dataset immediately. Instead, it stores the dataset and acts on it at the time of classification.
Euclidean Distance: The Euclidean distance is the straight-line distance between two points, as studied in geometry.
Decision Trees: A decision tree (DT) is a non-parametric supervised learning technique used for classification and regression. The goal is to build a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
Random Forests: Random forest is a widely used machine learning technique, created by Leo Breiman and Adele Cutler, that combines the output of multiple decision trees to reach a single result.
K Means Clustering: k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters, in which each observation belongs to the cluster with the nearest mean (the cluster centroid or cluster centre), which serves as a prototype of the cluster.
K Medoids: K-medoids, also known as Partitioning Around Medoids (PAM), uses the medoid instead of the mean and in this way minimises the sum of distances for arbitrary distance functions.
Principal Component Analysis: Principal component analysis (PCA) provides the relaxed solution of k-means clustering, which is specified by the cluster indicators.
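The K Nearest Neighbor, lazy learner, and Euclidean distance keywords fit together in a few lines of code. The sketch below is a minimal plain-Python illustration; the function name knn_predict, the sample points, and the labels "A" and "B" are assumptions, and math.dist needs Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Classify query by majority vote among its k nearest training points."""
    # Lazy learning: nothing is fitted in advance; the stored data is
    # searched only when a prediction is requested.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
         ((6.0, 6.0), "B"), ((7.0, 6.5), "B"), ((6.5, 7.0), "B")]

print(knn_predict(train, (1.2, 1.4), k=3))
print(knn_predict(train, (6.4, 6.4), k=3))
```

A larger k averages over more neighbours, which tends to make the vote less sensitive to individual noisy points.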
Self Assessment
1. Choose the item from the list below that is not a kind of learning.
A. Semi unsupervised learning
B. Supervised learning
C. Unsupervised Learning
D. Reinforcement Learning
2. What is the term for the application of machine learning techniques to a big database?
A. Supervised learning
B. Unsupervised Learning
C. Reinforcement Learning
D. Data mining
7. Any machine learning model's success Hinges on the engineering of a good feature space.
A. Pre-requisite
B. Process
C. Objective
D. None of the above
9. This refers to the transformations applied to the identified data before feeding it to the
algorithm.
A. Problem Identification
B. Identification of Required Data
C. Data Pre-processing
D. Definition of Training Data Set
11. Which of the following kNN options would you consider if there is a lot of noise in the
data?
A. Increase the value of k
B. Decrease the value of k
C. Noise does not depend on k
D. k = 0
13. Which of the following are Decision Tree nodes?
A. Decision Nodes
B. End Nodes
C. Chance Nodes
D. All of the mentioned
6. C 7. A 8. C 9. C 10. D
Review Question
Further Readings
Mark Lutz, Programming Python: Powerful Object-Oriented Programming, O'Reilly
Wes McKinney, Python for Data Analysis, O'Reilly
David Ascher and Mark Lutz, Learning Python, O'Reilly
Eric Matthes, Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming, No Starch Press
Web Links
https://www.tutorialspoint.com/python/index.htm
https://www.python.org/downloads/
https://www.w3schools.in/python/data-types
https://www.programiz.com/python-programming/online-compiler/
https://www.codecademy.com/catalog/language/python