Python In-Plant Training
This report covers data analysis using Python. We learned the basics of Python and a few
data science libraries. The report consists of both theory and practical content, with
detailed explanations. In short, it describes what we learned and practiced during this
in-plant training.
Finally, we built a project to demonstrate and test what we learned in this training.
The project helped us explore how a real project works.
TABLE OF CONTENTS
CHAPTER 1: INTRODUCTION TO PYTHON
1.1 INTRODUCTION
1.2 LANGUAGE FEATURES
1.3 INDUSTRIAL IMPORTANCE
1.4 APPLICATION
1.5 PROS & CONS
1.6 DIFFERENT PYTHON IDE
1.7 ADVANTAGES
CHAPTER 2: OBJECT AND DATA STRUCTURES BASICS
2.1 NUMBERS
2.1.1 TYPES OF NUMBERS
2.1.2 BASIC ARITHMETIC OPERATION
2.1.3 DIVISION VS FLOOR DIVISION
2.1.4 SQUARE & SQUARE ROOT
2.2 STRING
2.2.1 STRING INDEXING
2.2.2 STRING SLICING
2.2.3 STRING STEP SIZE
2.2.4 STRING CONCATENATE
2.2.5 STRING METHODS
2.2.6 STRING FORMATTING
2.3 LIST
2.3.1 LIST PROPERTIES
2.3.2 NESTED LISTS
2.3.3 LIST METHODS
2.4 TUPLES
2.4.1 BASIC TUPLE METHODS
2.4.2 IMMUTABILITY
2.4.3 WHEN TO USE TUPLE
2.5 DICTIONARIES
2.5.1 NESTING WITH DICTIONARY
2.5.2 A FEW DICTIONARIES METHODS
2.6 SETS
2.7 BOOLEANS
CHAPTER 3: PYTHON STATEMENT
3.1 INDENTATION
3.2 COMPARISON OPERATORS
PYTHON PROGRAMMING 18CS66P
1.1 INTRODUCTION
Often, programmers fall in love with Python because of the increased productivity it
provides. Since there is no compilation step, the edit-test-debug cycle is incredibly fast.
Debugging Python programs is easy: a bug or bad input will never cause a segmentation
fault. Instead, when the interpreter discovers an error, it raises an exception.
Prerequisites:
Interpreted
o There are no separate compilation and execution steps like C and C++.
o Directly run the program from the source code.
o Internally, Python compiles the source code into an intermediate form called
bytecode, which the Python virtual machine then executes on the specific
computer.
o No need to worry about linking and loading with libraries, etc.
Platform Independent
o Python programs can be developed and executed on multiple operating system
platforms.
o Python can be used on Linux, Windows, Macintosh, Solaris and many more.
Free and Open Source; Redistributable
High-level Language
o In Python, no need to take care about low-level details such as managing the
memory used by the program.
Simple
o Closer to English language; Easy to Learn
o More emphasis on the solution to the problem rather than the syntax
Embeddable
o Python can be used within C/C++ program to give scripting capabilities for the
program’s users.
Robust:
o Exception handling features
o Built-in memory management
Rich Library Support
o The Python Standard Library is vast and varied.
o Known as the “batteries included” philosophy of Python; It can help do
various things involving regular expressions, documentation generation, unit
testing, threading, databases, web browsers, CGI, email, XML, HTML, WAV
files, cryptography, GUI and many more.
o Besides the standard library, there are various other high-quality libraries such
as the Python Imaging Library which is an amazingly simple image
manipulation library.
Most companies are now looking for candidates who know Python programming. Those with
knowledge of Python may have a better chance of impressing the interview panel, so
beginners are well advised to start learning Python and become proficient in it.
1.4 APPLICATION
1. Machine learning
2. Data science & Analysis
3. IOT
4. Web development
5. Game development
6. Business application
7. Software Development
8. Web scraping Application
Pros: -
1. Ease of use
2. Multi-paradigm Approach
Cons: -
1.6 DIFFERENT PYTHON IDE
1. PyCharm
2. Visual Studio Code
3. Atom
4. Spyder
5. Vim
6. Thonny
7. Jupyter Notebook
1.7 ADVANTAGES
2.1 NUMBERS
Number data types store numeric values. They are immutable data types, meaning that
changing the value of a number data type results in a newly allocated object.
int (signed integers) − Often called just integers or ints, these are positive or
negative whole numbers with no decimal point. In Python 3 they can be of unlimited size.
long (long integers) − Also called longs, they are integers of unlimited size, written
like integers and followed by an uppercase or lowercase L. This type exists only in
Python 2; Python 3 merged it into int.
float (floating point real values) − Also called floats, they represent real numbers and
are written with a decimal point dividing the integer and fractional parts. Floats may
also be in scientific notation, with E or e indicating the power of 10 (2.5e2 = 2.5 x 10^2
= 250).
complex (complex numbers) − are of the form a + bJ, where a and b are floats and J
(or j) represents the square root of -1 (which is an imaginary number). The real part of
the number is a, and the imaginary part is b. Complex numbers are not used much in
Python programming.
Division: - It divides two numbers and gives the result as a float, including any
fractional part. We use (/) to perform division.
Floor Division: - It divides two numbers and rounds the result down to the nearest
whole number, discarding the fractional part. We use (//) to perform floor division.
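The difference between the two operators can be sketched with a few sample values (the numbers and variable names here are chosen only for illustration):

```python
# True division always returns a float, even when the result is whole.
true_quotient = 7 / 2        # 3.5
whole_quotient = 8 / 2       # 4.0

# Floor division rounds the result down to the nearest whole number.
floor_quotient = 7 // 2      # 3

# "Rounds down" means toward negative infinity, not toward zero.
negative_floor = -7 // 2     # -4

print(true_quotient, whole_quotient, floor_quotient, negative_floor)
```

Note that floor division of a negative number may be surprising at first: -7 // 2 gives -4, not -3.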
2.2 STRING
Strings are used in Python to record text information, such as names. Strings in Python are
actually a sequence, which basically means Python keeps track of every element in the string
as a sequence. For example, Python understands the string “hello” to be a sequence of letters
in a specific order. This means we will be able to use indexing to grab particular letters (like
the first letter, or the last letter).
We know strings are a sequence, which means Python can use indexes to call parts of the
sequence. In Python, we use brackets [] after an object to call its index. We should also note
that indexing starts at 0 for Python.
We can use a colon (:) to perform slicing, which grabs everything up to a designated point.
For example, slicing with [:3] tells Python to grab everything from index 0 up to index 3.
It doesn't include the element at index 3. You'll notice this a lot in Python, where ranges
are usually in the context of "up to, but not including".
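A minimal sketch of indexing and slicing, using the string "hello" mentioned earlier (the variable names are illustrative):

```python
greeting = "hello"

first_letter = greeting[0]    # indexing starts at 0, so this is 'h'
last_letter = greeting[-1]    # negative indexes count from the end: 'o'

first_three = greeting[:3]    # indexes 0, 1, 2 only: 'hel'
from_second = greeting[1:]    # index 1 through the end: 'ello'

print(first_letter, last_letter, first_three, from_second)
```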
We can also use index and slice notation to grab elements of a sequence by a specified step
size (the default is 1). For instance, we can use two colons in a row and then a number
specifying the frequency to grab elements. For example:
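As a sketch of the step-size notation, assuming a sample string chosen for illustration:

```python
letters = "abcdefg"

every_second = letters[::2]       # start to end in steps of 2: 'aceg'
middle_step = letters[1:6:2]      # indexes 1, 3, 5: 'bdf'
reversed_letters = letters[::-1]  # a step of -1 walks backwards: 'gfedcba'

print(every_second, middle_step, reversed_letters)
```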
Strings can be concatenated: we can join two strings using the (+) symbol.
In string formatting, we can place a variable inside a string passed to the print function
by prefixing the string with f (an f-string) and writing the variable name in curly braces.
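A short sketch of concatenation and f-string formatting (the names and values are made up for the example):

```python
first_name = "Ada"
score = 91.456

# Concatenation builds a new string from two existing ones.
greeting = "Hello, " + first_name

# An f-string evaluates the expressions inside the curly braces.
# The optional ":.1f" format spec rounds to one decimal place.
message = f"{first_name} scored {score:.1f}"

print(greeting)
print(message)
```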
2.3 LIST
Earlier when discussing strings, we introduced the concept of a sequence in Python. Lists can
be thought of as the most general version of a sequence in Python. Unlike strings, they are
mutable, meaning the elements inside a list can be changed.
A great feature of Python data structures is that they support nesting. This means we can
have data structures within data structures. For example: A list inside a list.
We can again use indexing to grab elements, but now there are two levels for the index. The
items in the matrix object, and then the items inside that list!
extend() Add the elements of a list (or any iterable), to the end of the current list
index() Returns the index of the first element with the specified value
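The list properties described above (mutability, nesting, and the extend() and index() methods) can be sketched as follows; the sample values and the name matrix are illustrative:

```python
numbers = [1, 2, 3]
numbers[0] = 10               # lists are mutable: now [10, 2, 3]
numbers.extend([4, 5])        # appends each element: [10, 2, 3, 4, 5]
position = numbers.index(3)   # index of the first element equal to 3

# A nested list: each inner list acts as one row.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
second_row = matrix[1]        # first level of indexing selects a row
center = matrix[1][1]         # second level selects an item in that row

print(numbers, position, second_row, center)
```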
2.4 TUPLES
In Python, tuples are very similar to lists; however, unlike lists they are immutable,
meaning they cannot be changed. You would use tuples to represent things that shouldn't
be changed, such as days of the week, or dates on a calendar.
2.4.2 IMMUTABILITY
You may be wondering, "Why bother using tuples when they have fewer available
methods?" To be honest, tuples are not used as often as lists in programming, but are
used when immutability is necessary. If in your program you are passing around an
object and need to make sure it does not get changed, then a tuple becomes your
solution. It provides a convenient source of data integrity.
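A minimal sketch of tuple immutability, using a sample tuple of weekday names chosen for illustration:

```python
weekdays = ("Mon", "Tue", "Wed", "Mon")

# Tuples support only two regular methods: count() and index().
monday_count = weekdays.count("Mon")
tuesday_position = weekdays.index("Tue")

# Trying to change an element raises a TypeError.
try:
    weekdays[0] = "Sun"
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(monday_count, tuesday_position, assignment_failed)
```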
2.5 DICTIONARIES
It’s important to note that dictionaries are very flexible in the data types they can hold.
For example:
Dictionaries are also mutable, so you can access an element by its key and change its
value.
A dictionary is quite a flexible data structure; we can nest one dictionary inside another.
We can access a value in a nested dictionary by chaining multiple keys.
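The following sketch shows a dictionary holding mixed data types, an in-place update, and nested access. The employee record is hypothetical data invented for the example:

```python
employee = {
    "name": "Sam",                        # a string value
    "skills": ["python", "sql"],          # a list value
    "address": {"city": "Pune"},          # a nested dictionary
}

employee["name"] = "Samir"                # dictionaries are mutable
city = employee["address"]["city"]        # chain keys for nested access
first_skill = employee["skills"][0].upper()

field_names = list(employee.keys())       # keys() lists the dictionary's keys

print(employee["name"], city, first_skill, field_names)
```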
2.6 SETS
Sets are an unordered collection of unique elements. We can construct them by using
the set() function.
Note the curly brackets. This does not indicate a dictionary! Although you can draw
analogies as a set being a dictionary with only keys. A set has only unique entries.
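As a brief sketch of set behaviour (sample values only):

```python
# Duplicates are discarded when the set is built.
unique_numbers = set([1, 1, 2, 2, 3])

unique_numbers.add(2)     # already present, so nothing changes
unique_numbers.add(4)     # a new element is added

literal_set = {1, 2, 3}   # curly brackets with plain values make a set
empty_set = set()         # an empty set needs set() ...
empty_dict = {}           # ... because empty curly brackets make a dictionary

print(unique_numbers, literal_set, type(empty_set), type(empty_dict))
```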
2.7 BOOLEANS
Python comes with Booleans (the predefined values True and False, which behave like the
integers 1 and 0). It also has a placeholder object called None.
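A small sketch of how Booleans relate to integers and how None can serve as a placeholder (the variable names are illustrative):

```python
# bool is a subclass of int, so True and False behave like 1 and 0.
is_int = isinstance(True, int)
total = True + True

# Comparisons produce Boolean values.
comparison = 1 < 2

# None is a common placeholder for "no value yet".
result = None
is_missing = result is None

print(is_int, total, comparison, is_missing)
```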
3.1 INDENTATION
Here is some pseudo-code to indicate the use of whitespace and indentation in Python:
Other Languages
if (x) {
    if (y) {
        code-statement;
    }
}
else {
    another-code-statement;
}
Python
if x:
    if y:
        code-statement
else:
    another-code-statement
Python is heavily driven by code indentation and whitespace. This means that code
readability is a core part of the design of the Python language.
3.2 COMPARISON OPERATORS
These operators allow us to compare variables and output a Boolean value (True
or False). If you have any sort of background in math, these operators should be very
straightforward.
if statements in Python allow us to tell the computer to perform alternative actions based on
a certain set of results.
We can then expand the idea further with elif and else statements, which allow us to tell the
computer:
"Hey if this case happens, perform some action. Else, if another case happens, perform some
other action. Else, if none of the above cases happened, perform this action."
Let's look at the syntax format for if statements to get a better idea of this:
if case1:
    perform action1
elif case2:
    perform action2
else:
    perform action3
For example,
Multiple Branches: -
We write this out in a nested structure. Take note of how the if, elif, and else line up in the
code. This can help you see what if is related to what elif or else statements.
Note how the nested if statements are each checked until a True Boolean causes the nested
code below it to run. You should also note that you can put in as many elif statements as you
want before you close off with an else.
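A minimal runnable sketch of the if / elif / else pattern; the score thresholds and grade labels are assumptions made up for this example:

```python
score = 72

# Each condition is checked in order; the first True branch runs
# and the remaining branches are skipped.
if score >= 90:
    grade = "A"
elif score >= 60:
    grade = "B"
else:
    grade = "C"

print(grade)
```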
A for loop acts as an iterator in Python; it goes through items that are in a sequence or any
other iterable item. Objects that we've learned about that we can iterate over include strings,
lists, tuples, and even built-in iterables for dictionaries, such as keys or values.
Here's the general format for a for loop in Python:
for item in object:
    statements to do stuff
The variable name used for the item is completely up to the coder, so use your best judgment
for choosing a name that makes sense and you will be able to understand when revisiting
your code. This item name can then be referenced inside your loop, for example if you
wanted to use if statements to perform checks.
For Example,
Tuples have a special quality when it comes to for loops. If you are iterating through a
sequence that contains tuples, the item can actually be the tuple itself. Alternatively, we
can use tuple unpacking: during the for loop, we unpack each tuple inside the sequence
so that we can access the individual items inside that tuple.
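The loop patterns described above can be sketched as follows (the lists and the prices dictionary are sample data for illustration):

```python
# A for loop with an if check inside the body.
even_total = 0
for number in [1, 2, 3, 4, 5, 6]:
    if number % 2 == 0:
        even_total += number

# Tuple unpacking: each tuple is split into two names per iteration.
pairs = [(1, 2), (3, 4), (5, 6)]
sums = []
for left, right in pairs:
    sums.append(left + right)

# Dictionaries offer items(), which yields (key, value) tuples.
prices = {"apple": 3, "pear": 5}
labels = []
for name, price in prices.items():
    labels.append(f"{name}={price}")

print(even_total, sums, labels)
```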
A while loop repeatedly executes its code statements as long as the condition is true. The
reason it is called a 'loop' is because the code statements are looped through over and over
again until the condition is no longer met.
The general format of a while loop is:
while test:
    code statements
else:
    final code statements
Notice how many times the print statements occurred and how the while loop kept going
until the True condition was met, which occurred once x==10. It's important to note that
once this occurred the code stopped. Let's see how we could add an else statement:
Note how the other else statement wasn't reached and continuing was never printed! After
these brief but simple examples, you should feel comfortable using while statements in
your code.
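The following is a small sketch, not the original notebook example, of a while loop with an else clause. It assumes small limits so the behaviour is easy to trace; the else block runs only when the loop ends without hitting break:

```python
# Loop that ends normally: the else block runs.
x = 0
visited = []
while x < 3:
    visited.append(x)
    x += 1
else:
    loop_finished = True

# Loop that exits early with break: the else block is skipped.
y = 0
else_ran = False
while y < 10:
    if y == 3:
        break
    y += 1
else:
    else_ran = True

print(visited, loop_finished, y, else_ran)
```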
There are a few built-in functions and "operators" in Python that don't fit well into any
category but are quite useful in loops and if-else statements. Let's look at a few of the
most useful ones.
3.6.1 RANGE
The range function allows you to quickly generate a sequence of integers. This comes in
handy a lot, so take note of how to use it! There are 3 parameters you can pass: a start,
a stop, and a step size. Let's see some examples:
Note that range produces its values lazily, much like a generator, so to actually get a list
out of it, we need to cast it to a list with list(). What is a generator? It's a special type of
function that generates values one at a time instead of saving them all in memory. (Strictly
speaking, Python 3's range is a lazy sequence object rather than a generator function.) We
haven't talked about functions or generators yet, so just keep this in your notes for now; we
will discuss this in much more detail later on in your training!
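A few sample calls sketch how the start, stop, and step parameters behave (the values are chosen for illustration):

```python
# Only a stop value: counts from 0 up to, but not including, 5.
first_five = list(range(5))

# Start, stop, and step: even numbers from 2 through 10.
evens = list(range(2, 11, 2))

# range itself is lazy; no list is built until we ask for one.
lazy = range(0, 1_000_000)
lazy_type = type(lazy).__name__

print(first_five, evens, lazy_type, len(lazy))
```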
3.6.2 ENUMERATE
enumerate is a very useful function to use with for loops. Let's imagine the following
situation:
Keeping track of how many loops you've gone through is so common that enumerate was
created so you don't need to worry about creating and updating this index_count or
loop_count variable.
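As a sketch of the situation described above, here is a manual counter next to the equivalent enumerate call (the string "abc" is sample input):

```python
word = "abc"

# Manual approach: create and update index_count by hand.
index_count = 0
manual = []
for letter in word:
    manual.append((index_count, letter))
    index_count += 1

# enumerate yields (index, item) tuples, so no counter is needed.
with_enumerate = []
for index, letter in enumerate(word):
    with_enumerate.append((index, letter))

print(manual)
print(with_enumerate)
```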
3.6.3 ZIP
A list of tuples is a very common data structure in Python, especially when working with
outside libraries. You can use the zip() function to quickly pair up the elements of two
lists; wrapping the result in list() gives a list of tuples.
The min and max functions are built into Python. They find the smallest and largest
values in a sequence or other iterable.
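A short sketch of zip(), min(), and max() on sample lists:

```python
numbers = [10, 40, 20]
letters = ["a", "b", "c"]

# zip pairs items by position; list() collects the pairs.
pairs = list(zip(numbers, letters))

smallest = min(numbers)
largest = max(numbers)

print(pairs, smallest, largest)
```

When the inputs have different lengths, zip stops at the shortest one.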
4.1 METHODS
We've already seen a few examples of methods when learning about Object and Data
Structure Types in Python. Methods are essentially functions built into objects. Later on in
the course we will learn about how to create our own objects and methods using Object
Oriented Programming (OOP) and classes.
Methods perform specific actions on an object and can also take arguments, just like a
function. This lecture will serve as just a brief introduction to methods and get you thinking
about overall design methods that we will touch back upon when we reach OOP in the
course.
You can always use Shift+Tab in the Jupyter Notebook to get more help about a
method. In Python generally, you can use the help() function:
4.2 FUNCTION
Formally, a function is a useful device that groups together a set of statements so they can be
run more than once. They can also let us specify parameters that can serve as inputs to the
functions.
On a more fundamental level, functions allow us to not have to repeatedly write the same
code again and again. If you remember back to the lessons on strings and lists, remember
that we used the function len() to get the length of a string. Since checking the length of a
sequence is a common task, you would want to write a function that can do this repeatedly
on command.
Functions will be one of most basic levels of reusing code in Python, and it will also allow us
to start thinking of program design (we will dive much deeper into the ideas of design when
we learn about Object Oriented Programming).
Why even use functions?
Put simply, you should use functions when you plan on using a block of code multiple times.
The function will allow you to call the same block of code without having to write it multiple
times. This in turn will allow you to create more complex Python scripts. To really
understand this though, we should actually write our own functions!
We begin with def, then a space followed by the name of the function. Try to keep names
relevant; for example, len() is a good name for a function that returns a length. We can use
a triple-quoted string ('''docstring''') to create a docstring, which documents the function.
Be careful! Notice how print_result() doesn't let you actually save the result to a
variable! It only prints the result, so the assignment receives None.
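The difference between printing and returning can be sketched as follows. print_result is the name used in the text; add_numbers and both function bodies are assumptions written for this example:

```python
def add_numbers(first, second):
    """Return the sum of two numbers."""
    return first + second


def print_result(first, second):
    """Print the sum of two numbers (nothing is returned)."""
    print(first + second)


saved = add_numbers(2, 3)        # the returned value can be stored
not_saved = print_result(2, 3)   # prints 5, but the function returns None

print(saved, not_saved)
print(add_numbers.__doc__)       # the docstring is available at runtime
```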
5.1 NumPy
NumPy (or Numpy) is a linear algebra library for Python. It is so important for data science
with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as
one of their main building blocks. NumPy is also incredibly fast, as it has bindings to C
libraries.
NumPy arrays are the main way we will use NumPy throughout the course. NumPy arrays
essentially come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and
matrices are 2-d (but you should note a matrix can still have only one row or one column).
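A minimal sketch of the two array flavors, assuming NumPy is installed and imported under its conventional alias np (the values are sample data):

```python
import numpy as np

vector = np.array([1, 2, 3])                  # 1-d array
matrix = np.array([[1, 2, 3], [4, 5, 6]])     # 2-d array: 2 rows, 3 columns
column = np.array([[1], [2], [3]])            # still 2-d, with a single column

# Arithmetic applies element by element, without an explicit loop.
doubled = vector * 2

print(vector.shape, matrix.shape, column.shape)
print(doubled)
```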
5.2 PANDAS
Pandas is an open-source library made mainly for working with relational or labeled
data easily and intuitively. It is built on top of the NumPy library. Pandas is fast and
offers high performance and productivity for users. It is mostly used for data analysis
and visualization tasks.
Pandas generally provide two data structures for manipulating data, they are:
Series
DataFrame
This notebook is the reference code for getting input and output. Pandas can read a variety
of file types using its pd.read_ methods.
5.2.2 DATAFRAMES
DataFrames are the workhorse of pandas and are directly inspired by the R programming
language. We can think of a DataFrame as a bunch of Series objects put together to share the
same index.
5.2.3 SERIES
A Series is very similar to a NumPy array (in fact it is built on top of the NumPy array
object). What differentiates the NumPy array from a Series, is that a Series can have axis
labels, meaning it can be indexed by a label, instead of just a number location. It also doesn't
need to hold numeric data; it can hold any arbitrary Python Object.
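The relationship between a Series and a DataFrame can be sketched with small sample data, assuming pandas is installed and imported as pd (all names and values are illustrative):

```python
import pandas as pd

# A Series with axis labels instead of plain integer positions.
population = pd.Series([10, 20, 30], index=["a", "b", "c"])
by_label = population["b"]          # look up by label
by_position = population.iloc[1]    # look up by integer position

# A Series can hold arbitrary Python objects, not only numbers.
mixed = pd.Series([len, "text", 3.5])

# A DataFrame: several Series (columns) sharing the same index.
frame = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])
column = frame["x"]                 # selecting one column gives a Series

print(by_label, by_position)
print(type(column).__name__, list(column.index))
```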
5.2.4 GROUP BY
The groupby method allows you to group rows of data together and call aggregate functions
on them.
You can use the .groupby() method to group rows together based on a column name.
For instance, let's group based on Company. This will create a DataFrameGroupBy object:
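As a sketch of grouping by the Company column, assuming pandas is installed; the Sales column and all row values are invented for this example:

```python
import pandas as pd

df = pd.DataFrame({
    "Company": ["GOOG", "GOOG", "MSFT", "MSFT"],
    "Sales": [200, 120, 340, 124],
})

# Grouping alone does no computation; it returns a DataFrameGroupBy object.
by_company = df.groupby("Company")
group_type = type(by_company).__name__

# Selecting a column and calling an aggregate function produces one row per group.
mean_sales = by_company["Sales"].mean()

print(group_type)
print(mean_sales)
```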
5.3 MATPLOTLIB
Matplotlib is the "grandfather" library of data visualization with Python. It was created by
John Hunter to try to replicate the plotting capabilities of MATLAB (another programming
language) in Python. It allows you to create reproducible figures programmatically.
That line is only for Jupyter notebooks. If you are using another editor, you'll use plt.show()
at the end of all your plotting commands to have the figure pop up in another window.
Example
In this project we implement all the concepts that we learned during this in-plant
training (Python). Our final project is data analysis and visualization of San Francisco
employee salaries.
PROCEDURE: -
1. Download the salaries.csv dataset file from Kaggle.com.
2. Download the starter file given by our instructor.
3. Import the required libraries and the .csv file in Jupyter Notebook.
4. After setting everything up, start writing commands to analyze the data.
TOOLS USED FOR THIS PROJECT: -
EDITOR: JUPYTER NOTEBOOK
OTHER: EXCEL, WEB BROWSER
PYTHON LIBRARY: -
PANDAS
MATPLOTLIB
PROGRAM: -
In [3]: sal=pd.read_csv('Salaries.csv')
In [4]: sal.head()
Out[4]:
** Use the .info() method to find out how many entries there are.**
In [5]: sal.info()
In [6]: sal['BasePay'].mean()
Out[6]: 66325.44884050643
In [7]: sal['OvertimePay'].max()
Out[7]: 245131.88
** What is the job title of JOSEPH DRISCOLL ? Note: Use all caps, otherwise you may get an
answer that doesn't match up (there is also a lowercase Joseph Driscoll). **
Out[8]:
Id EmployeeName JobTitle BasePay OvertimePay OtherPay Benefits TotalP
Out[9]: 24 270324.91
Name: TotalPayBenefits, dtype: float64
#sal[sal['TotalPayBenefits']==sal['TotalPayBenefits'].max()]
In [10]:
sal.loc[sal['TotalPayBenefits'].idxmax()]
Out[10]:
** What is the name of lowest paid person (including benefits)? Do you notice something
strange about how much he or she is paid?**
In [11]: #sal[sal['TotalPayBenefits']==sal['TotalPayBenefits'].min()]
sal.loc[sal['TotalPayBenefits'].idxmin()]
Out[11]: Id 148654
EmployeeName Joe Lopez
JobTitle Counselor, Log Cabin Ranch
BasePay 0.0
OvertimePay 0.0
OtherPay -618.13
Benefits 0.0
TotalPay -618.13
TotalPayBenefits -618.13
Year 2014
Notes NaN
Agency San Francisco
Status NaN
Name: 148653, dtype: object
** What was the average (mean) BasePay of all employees per year? (2011-2014) ? **
In [12]: sal.groupby('Year')['BasePay'].mean()
Out[12]: Year
2011 63595.956517
2012 65436.406857
2013 69630.030216
2014 66564.421924
Name: BasePay, dtype: float64
In [13]: sal['JobTitle'].nunique()
Out[13]: 2159
In [14]: sal['JobTitle'].value_counts().head()
** How many Job Titles were represented by only one person in 2013? (e.g. Job Titles with only one
occurence in 2013?) **
In [15]: sum(sal[sal['Year']==2013]['JobTitle'].value_counts() == 1)
Out[15]: 202
In [16]: salYear=sal.groupby('Year')['BasePay'].mean()
OUTPUT SCREENSHOTS
CHAPTER 7: CONCLUSION
In this in-plant training we learned the basic syntax of Python, some data structures, and a
few Python libraries for data analysis. We did several exercises to test our learning and
deepen our understanding of Python. Finally, we built a data analysis project that let us
put our knowledge into practice.
CHAPTER 8: REFERENCE
1. https://www.anaconda.com/
2. https://www.geeksforgeeks.org/
3. https://www.w3schools.com/
4. https://www.kaggle.com/datasets/kaggle/sf-salaries
5. https://www.tutorialspoint.com/