Python for Data Science
Getting Started with Python
Guido van Rossum developed Python in the late 1980s at the National Research
Institute for Mathematics and Computer Science (CWI) in the Netherlands and officially
released it on 20 February 1991.
The language takes its name from the popular British television comedy series
"Monty Python's Flying Circus."
Python – A General Purpose, Interpreted & Interactive
High Level Programming Language
Python, an interpreted and interactive programming
language, serves as a general-purpose tool for various tasks.
Python's source code is freely available under the Python Software
Foundation License, an open-source license that is compatible with the
GNU General Public License (GPL). Its design drew inspiration from
the ABC language and Modula-3.
The GNU General Public License is a series of widely used free software licenses that
guarantee end users the four freedoms to run, study, share, and modify the software.
Anaconda is a distribution of Python and R for scientific computing, simplifying package
management. It includes preinstalled data-science packages for Windows, Linux, and
macOS.
With Anaconda, one can create environments with different Python and package versions,
install/remove/upgrade packages, and benefit from a consistent project environment.
Anaconda saves time by avoiding individual package configuration and provides a package
manager called Conda for easy installation and management.
Spyder – Python Integrated Development Environment (IDE)
A lightweight but powerful, freely available, open-source Python IDE
used in data science and machine learning. Its features include a
Variable Explorer.
Features and Limitations of Python
Features
• Interpreted programming language
• Case sensitive
• Supports OOP concepts
• Portable
• Free, open-source code
• GUI programming
• Interactive mode
• Script mode
• Faster in comparison to R
Limitations
• Not easily convertible
• Has fewer libraries than C, Java and Perl
• Not very strong in catching "type mismatch" errors (types are checked only at run time)
• Slower execution in comparison to compiled languages like C and C++
Real World Applications
Data Science, AI and Machine Learning, Web Development, Web Scraping,
Game Development, Currency Converter, Real-Time Data Access Using APIs, etc.
Data Science
Finding patterns and useful information in large volumes of
data in order to solve problems and make informed decisions.
Due to its simplicity and versatility, Python plays a vital role
in the field of data science.
Points to Remember
❖ The official website of Python is https://www.python.org
❖ To download Python, go to https://www.python.org/downloads
❖ Python files have extension .py
❖ Python is a case sensitive programming language.
❖ A Python identifier is a name used to identify a variable, function, class, module
or other object. An identifier starts with a letter A to Z or a to z or an underscore
(_) followed by zero or more letters, underscores and digits (0 to 9). Python
does not allow punctuation characters such as @, $, and % within identifiers.
❖ Python uses indentation as its method of grouping statements.
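The identifier, case-sensitivity and indentation rules above can be checked with a short sketch. The candidate names and the variables rate and Rate are made up for illustration; str.isidentifier() and the keyword module are part of standard Python.

```python
import keyword

# Hypothetical candidate names checked against the identifier rules.
candidates = ["total", "_count", "value2", "2value", "price$", "user@name"]

for name in candidates:
    # str.isidentifier() applies Python's identifier syntax rules;
    # keyword.iskeyword() rules out reserved words such as "if" or "for".
    valid = name.isidentifier() and not keyword.iskeyword(name)
    print(f"{name!r}: {'valid' if valid else 'invalid'}")

# Case sensitivity: these are two different variables.
rate = 5
Rate = 10

# Indentation groups statements: both indented lines belong to the if block.
if rate != Rate:
    difference = Rate - rate
    print("rate and Rate differ by", difference)
```

Names that start with a digit or contain @ or $ are reported as invalid, which matches the rule stated above.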
Database
A structured collection of related data for a specific
application that allows the data to be stored, retrieved and
updated in a systematic way.
A relational database can be described as a collection of related tables.
A software application that allows users to create, organize and
manage a database efficiently is called a database management
system (DBMS).
Examples: MySQL, Oracle, MongoDB, etc.
SQL
SQL (Structured Query Language) is a language
used for managing and manipulating data in a DBMS. It
provides a set of commands that allow users to interact with
a database, perform various operations, and retrieve
information.
Types of Statements in Python
Simple Statement: A single executable statement in Python.
Compound Statement: A group of statements executed as a unit. It has
a header line ending with a colon, followed by a body of statements
indented under the header line. All statements in the body are at the
same level of indentation.
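As a minimal illustration of the two kinds of statement, the sketch below uses made-up variable names (marks, bonus, total, result).

```python
# Simple statements: each line is a single executable statement.
marks = 72
bonus = 5

# Compound statement: a header line ending with a colon, followed by a
# body whose statements share the same level of indentation.
if marks >= 40:
    total = marks + bonus
    result = "pass"

print(total, result)
```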
Types of Statements in Python
Sequential Statements
Conditional Statements
Iterative Statements
Conditional Statements
if statement: Consists of a boolean
expression followed by one or more
statements.
if expression:
    statement(s)
If the boolean expression evaluates to
True, the block of statement(s)
inside the if statement is executed. If it
evaluates to False, the block is skipped
and execution continues with the first
statement after the end of the if statement.
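A small sketch of a plain if statement follows. The temperature reading and the 37.5 threshold are hypothetical values chosen only for illustration.

```python
temperature = 38.5  # hypothetical reading in degrees Celsius
alerts = []

if temperature > 37.5:
    # Runs only when the condition evaluates to True.
    alerts.append("fever")

# Execution continues here whether or not the block above ran.
print(alerts)
```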
if-else statement: An if statement can be followed
by an optional else statement, whose block executes
when the boolean expression is False.
if expression:
    statement(s)
else:
    statement(s)
The else block runs when the conditional
expression in the if statement evaluates to 0 or
another false value. At most one else statement
can follow an if.
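The two branches can be sketched with a small helper; the function name parity is an illustrative choice, not something defined in these notes.

```python
def parity(number):
    """Return 'even' or 'odd' for an integer."""
    if number % 2 == 0:
        label = "even"
    else:
        # Reached only when the condition above is False.
        label = "odd"
    return label


print(parity(10))
print(parity(7))
```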
Nested if-else statement: An if, elif or else block
can contain another if, if-else or if-elif-else
statement.
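One possible nesting is sketched below, with an inner if-else placed inside the outer if block. The function name describe is hypothetical.

```python
def describe(number):
    """Classify a number using an if-else nested inside another if-else."""
    if number >= 0:
        # Inner decision, evaluated only when the outer condition is True.
        if number == 0:
            return "zero"
        else:
            return "positive"
    else:
        return "negative"


print(describe(5), describe(0), describe(-3))
```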
if-elif-…else statement: The elif statement lets you check
multiple expressions and execute the block of code for the first one
that evaluates to True. Like else, the elif statement is
optional. However, unlike else, of which there can be at most one,
an arbitrary number of elif statements can follow an if.
if expression1:
    statement(s)
elif expression2:
    statement(s)
elif expression3:
    statement(s)
else:
    statement(s)
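A sketch of an if-elif-else chain follows. The grade boundaries are invented for illustration; only the first condition that is True has its block executed.

```python
def grade(score):
    """Map a score to a letter grade using hypothetical cut-offs."""
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"
    elif score >= 40:
        return "C"
    else:
        return "F"


# A score of 80 also satisfies score >= 40, but the chain stops at the
# first matching branch.
print(grade(80))
```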
Iterative Statements
Counter Loop – for loop: It runs for a desired
number of times and can count up or down.
It can iterate over the items of any sequence,
such as a list or a string.
for iterating_var in sequence:
    statement(s)
If the sequence is given as an expression list, it is
evaluated first. Then the first item in the
sequence is assigned to the iterating variable
iterating_var and the statement(s) block is
executed. Each remaining item is assigned to
iterating_var in turn, and the block is executed
again, until the entire sequence is exhausted.
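The sketch below shows a for loop counting up, counting down, and iterating over a string. It uses the built-in range(); the variable names are illustrative.

```python
# Counting up: range(1, 6) yields 1, 2, 3, 4, 5.
squares = []
for n in range(1, 6):
    squares.append(n * n)

# Counting down: a negative step makes range() run in the other direction.
countdown = []
for n in range(5, 0, -1):
    countdown.append(n)

# Iterating over the characters of a string.
vowels = 0
for ch in "data science":
    if ch in "aeiou":
        vowels += 1

print(squares, countdown, vowels)
```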
Conditional Loop – while loop: It
repeatedly executes a target statement as long
as a given condition is true. When the condition
becomes false, program control passes to the
line immediately following the loop.
while expression:
    statement(s)
A loop whose condition never becomes False
never ends and is called an infinite loop. Use
while loops with caution, and make sure the
loop body eventually makes the condition False.
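A minimal while loop is sketched below, using a hypothetical savings calculation. The extra max_years check is one simple guard against an accidental infinite loop; it is an illustrative choice rather than a required pattern.

```python
balance = 1000.0   # hypothetical starting amount
target = 2000.0    # loop until the balance reaches this value
max_years = 100    # safety limit so the loop cannot run forever
years = 0

while balance < target and years < max_years:
    balance *= 1.10   # add 10% interest; this moves balance towards target
    years += 1

print(years, round(balance, 2))
```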
Nested loops: Python programming language
allows us to use one loop inside another loop.
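For example, a small multiplication table can be built with one for loop inside another. The names table and row are illustrative.

```python
table = []
for i in range(1, 4):          # outer loop: one pass per row
    row = []
    for j in range(1, 4):      # inner loop: runs fully for every value of i
        row.append(i * j)
    table.append(row)

for row in table:
    print(row)
```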
Jump Statements
The break statement: Terminates the loop and transfers
execution to the statement immediately following the loop. In nested
loops, break stops only the innermost loop, and execution resumes with
the next line of code after that loop. This statement can be
used in both while and for loops.
The continue statement: Causes the loop to skip the remainder of its body
for the current iteration. A while loop then retests its condition, and a
for loop moves on to the next item. This statement can be
used in both while and for loops.
The pass statement: Used when a statement is required syntactically
but no command or code needs to execute. The pass statement is
a null operation; nothing happens when it executes. It is also useful as a
placeholder where code will eventually go but has not been written yet.
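The three jump statements can be seen together in a short sketch. The readings list, the use of 0 as a stop marker, and the function name not_written_yet are assumptions made for illustration.

```python
readings = [4, -1, 7, 0, 9, 12]   # hypothetical data; 0 marks the end
accepted = []

for value in readings:
    if value < 0:
        continue   # skip the rest of the body and move to the next item
    if value == 0:
        break      # leave the loop entirely; 9 and 12 are never visited
    accepted.append(value)


def not_written_yet():
    pass   # placeholder: a body is required syntactically


print(accepted)
```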
Python as a Preferred Language for Database Programming
• Programming in Python is often more efficient and faster than in many other languages.
• Python programs are portable.
• Python supports platform-independent program development.
• Python supports SQL cursors.
• Python's database libraries take care of opening and closing connections.
• Python supports relational database systems.
Interface Python with MySQL
A library is required to enable connectivity to a database
from within Python. There are many libraries that help to
accomplish this, including mysql.connector (MySQL), sqlite3 (SQLite),
psycopg2 (PostgreSQL), etc.
The Python standard for database interfaces is the Python
DB-API. DB-API modules exist for a wide range of
database servers, such as mSQL, MySQL, PostgreSQL, Informix,
Oracle, Sybase, etc.
DB – API
(Python standard for database interfaces)
The DB-API provides a minimal standard for working with databases using Python
structures and syntax wherever possible. A typical DB-API workflow includes the following steps:
● Importing the API module.
● Acquiring a connection with the database.
● Issuing SQL statements and stored procedures.
● Closing the connection
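The steps listed above can be sketched with sqlite3, the DB-API module that ships with Python. It is used here only as a stand-in so that the example runs without a database server. The student table and its row are made up, and SQLite has no stored procedures, so only plain SQL statements are shown.

```python
import sqlite3   # step 1: import the API module

# Step 2: acquire a connection (":memory:" creates a temporary database
# that lives only for the duration of the program).
connection = sqlite3.connect(":memory:")

# Step 3: issue SQL statements through a cursor.
cursor = connection.cursor()
cursor.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
cursor.execute("INSERT INTO student (roll_no, name) VALUES (?, ?)", (1, "Asha"))
connection.commit()

cursor.execute("SELECT roll_no, name FROM student")
rows = cursor.fetchall()
print(rows)

# Step 4: close the connection.
connection.close()
```

With a MySQL server, the same four steps apply, but the module and the connection arguments differ.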
Database Connection
A database connection object controls the connection to
the database and represents a unique session with a
database, opened from within a program. The connection and
communication between an application and a database system
is referred to as database connectivity.
In Python, we import the mysql.connector package to
establish connectivity with a MySQL database.
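A hedged sketch of opening such a connection is shown below. It assumes the mysql-connector-python package is installed and a MySQL server is running. The function name open_mysql_connection and the credentials in the commented example are hypothetical placeholders.

```python
def open_mysql_connection(host, user, password, database):
    """Open a MySQL session; requires the mysql-connector-python package."""
    # Imported inside the function so this sketch can be loaded even on a
    # machine where the package is not installed.
    import mysql.connector

    return mysql.connector.connect(
        host=host, user=user, password=password, database=database
    )


# Example call (placeholder credentials; needs a running MySQL server):
# connection = open_mysql_connection("localhost", "student", "secret", "school")
# print(connection.is_connected())
# connection.close()
```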
Database Cursor
A database cursor is a special control structure that
facilitates traversal over the records in a database. A cursor
makes it possible to define a result set (a set of data rows)
and perform complex logic on a row-by-row basis.
A result set refers to a set of data rows that are fetched
from the database by executing an SQL query.
The MySQLCursor class instantiates objects that can execute
operations such as SQL statements. Cursor objects interact
with the MySQL server using a MySQLConnection object.
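Row-by-row processing with a cursor can be sketched as follows. Because a MySQLCursor needs a live MySQL server, this example uses a cursor from the standard sqlite3 module instead, which follows the same DB-API pattern of execute() and fetchone(). The marks table and its contents are invented for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE marks (name TEXT, score INTEGER)")
cursor.executemany(
    "INSERT INTO marks (name, score) VALUES (?, ?)",
    [("Asha", 82), ("Ravi", 67), ("Meena", 91)],
)

# Executing a query defines the result set that the cursor traverses.
cursor.execute("SELECT name, score FROM marks ORDER BY score DESC")

first_row = cursor.fetchone()      # fetch a single row
toppers = [first_row[0]]

for name, score in cursor:         # iterate over the remaining rows
    if score >= 80:                # per-row logic
        toppers.append(name)

print(first_row, toppers)
connection.close()
```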
Database Transaction
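A database transaction represents a single
unit of work. Any operation that
modifies the state of the MySQL
database is a transaction. Its changes become
permanent only when the transaction is committed,
and they can be undone with a rollback.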
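Commit and rollback can be sketched with the standard sqlite3 module, again as a stand-in for a MySQL connection. The account table and the amounts are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE account (owner TEXT, balance INTEGER)")
cursor.execute("INSERT INTO account VALUES ('Asha', 100)")
connection.commit()      # make the insert permanent

# A new transaction starts with this modifying statement.
cursor.execute("UPDATE account SET balance = balance - 500 WHERE owner = 'Asha'")
cursor.execute("SELECT balance FROM account WHERE owner = 'Asha'")

if cursor.fetchone()[0] < 0:
    connection.rollback()   # undo the update: the unit of work is abandoned
else:
    connection.commit()     # keep the update

cursor.execute("SELECT balance FROM account WHERE owner = 'Asha'")
final_balance = cursor.fetchone()[0]
print(final_balance)
connection.close()
```

Because the withdrawal would leave a negative balance, the update is rolled back and the stored balance is unchanged.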
Data Types of Python
strings - used to represent text data; the text is written inside
quotation marks. e.g. "ABCD"
integer - used to represent integer numbers. e.g. -1, -2, -3
float - used to represent real numbers. e.g. 1.2, 42.42
boolean - used to represent True or False.
complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j
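The built-in type() function reports the data type of a value. The sample values below are taken from, or modelled on, the examples in the list above.

```python
samples = ["ABCD", -3, 42.42, True, 1.0 + 2.0j]

# type(value).__name__ gives the short name of each value's type.
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(repr(value), "is of type", name)
```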