Python Basics

Uploaded by

Ravinder Singh
Copyright
© © All Rights Reserved

NumPy – Mathematical and statistical computing

Pandas – Data processing and analysis

Matplotlib – Data visualisations

Reminder of how to define a function.
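The original example is not in these notes, so here is a minimal sketch of a function definition. The name add_numbers and its arguments are made up for illustration.

```python
# Hypothetical function, only to show the syntax: def, parameters, return.
def add_numbers(a, b):
    """Return the sum of a and b."""
    return a + b


result = add_numbers(2, 3)
print(result)  # 5
```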

Reminder of how to define a class.

Attributes: variables that belong to a class.
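A minimal sketch of a class with attributes, assuming a made-up Employee class (the class, its attributes and the values are not from the original notes).

```python
# Hypothetical class, only to show the syntax.
class Employee:
    def __init__(self, name, salary):
        # Attributes: variables stored on each instance of the class
        self.name = name
        self.salary = salary

    def describe(self):
        """Build a short description from the attributes."""
        return f"{self.name} earns {self.salary}"


emp = Employee("Luke", 50000)
print(emp.name)        # Luke
print(emp.describe())  # Luke earns 50000
```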


The { } acts as a placeholder in the string; the value passed to format( ) is substituted into it.

Very powerful – using f-strings.

Good tip! Do this when numbers are massive, say in the millions or billions.
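A small sketch of both approaches side by side. The variable role and the sentence are made-up examples.

```python
role = "Data Analyst"  # hypothetical value

# str.format(): the { } placeholder is filled with the argument
with_format = "I want to be a {}".format(role)

# f-string: the variable goes straight inside the braces
with_fstring = f"I want to be a {role}"

print(with_format)   # I want to be a Data Analyst
print(with_fstring)  # I want to be a Data Analyst
```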

Exponentiation operator in Python: **
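A quick illustration of the operator, with arbitrary numbers:

```python
squared = 5 ** 2   # 5 to the power of 2
cubed = 2 ** 3     # 2 to the power of 3
root = 9 ** 0.5    # a fractional exponent gives a root

print(squared, cubed, root)  # 25 8 3.0
```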

Lists in Python are ordered, i.e. items keep their position and can be accessed by index. They are also iterable.

You can access one item at a time, but what if you want to access multiple items together? Use slicing.


1. Remember that the ending index of a slice is exclusive.
2. If you add two lists with + and they share some items, the duplicates are kept, in whatever order they appear.
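Both points can be sketched as follows. The contents of job_skills are assumed here, since the original list is not shown in the notes.

```python
job_skills = ["python", "sql", "excel", "tableau", "r", "statistics"]  # assumed values

# Slice from index 0 up to, but not including, index 3
first_three = job_skills[0:3]
print(first_three)  # ['python', 'sql', 'excel']

# Adding lists keeps duplicates, in the order they appear
combined = ["python", "sql"] + ["sql", "excel"]
print(combined)  # ['python', 'sql', 'sql', 'excel']
```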

Right, so coming to the concept of unpacking.

If I do skill1, skill2, skill3..skill6 = job_skills, it is called unpacking: assigning each value in an iterable to a variable in a single line.
So, if I do say skill_concerned, skill_dont_care = job_skills, it’ll give an error because job_skills holds 6 values and I provided only 2 variables. We fix that with the “star” or “unpacking” operator, *, written in front of the last variable: *skill_dont_care.

Doing so assigns the rest, however many there are, as a list to skill_dont_care.
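A sketch of the error and the fix, again with assumed values for job_skills:

```python
job_skills = ["python", "sql", "excel", "tableau", "r", "statistics"]  # assumed values

# Two variables for six values raises a ValueError
try:
    skill_concerned, skill_dont_care = job_skills
except ValueError as err:
    message = str(err)
    print("Error:", message)

# The star collects everything that is left over into a list
skill_concerned, *skill_dont_care = job_skills
print(skill_concerned)  # python
print(skill_dont_care)  # ['sql', 'excel', 'tableau', 'r', 'statistics']
```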

Also, lists are mutable: they can be changed after creation, i.e. values can be inserted, removed, modified etc.
A set is like a list but written in curly braces. There are no colons, which is what separates it from a dictionary. It keeps only unique items, e.g. it removed the duplicate python here.
It is also unordered, so no indexing is available.

The output is alphabetised here, but don’t count on it.

Right, so something interesting here:

On a list, pop( ) takes an index to remove an item; the default is -1, the last item.

If you use pop( ) on a set, which has no order and no indexes, it’ll just remove an arbitrary item. Use remove( ) with the value you want to delete instead.
Why even use a set? When I want to extract the unique items from a list or a tuple very efficiently.
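A short sketch of these set behaviours, using a made-up skills_list:

```python
skills_list = ["python", "sql", "python", "excel"]  # assumed values

# Converting to a set drops the duplicate "python"
unique_skills = set(skills_list)
print(len(skills_list), len(unique_skills))  # 4 3

# List pop() removes by index, the last item by default
last_item = skills_list.pop()
print(last_item)  # excel

# Set remove() deletes a specific value
unique_skills.remove("sql")

# Set pop() removes an arbitrary item, so don't rely on which one
arbitrary_item = unique_skills.pop()
```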

Right – tuples are very important. Very similar to lists: they are ordered, iterable and can hold duplicate items, but they are written in parentheses ( ) and are immutable. So you cannot use append, pop or remove on them.

So, how do you update a tuple if you have to? There’s a workaround: use the + operator, but here too there’s a catch. So you do
luke_skill = luke_skill + new_skill
Yes, luke_skill now holds the updated values, but it is not the same tuple: it is a new object at a different location with a different id. The original tuple was never changed, which is what immutable means, and the id output confirms it.
People often confuse the identity operator (is) with the equality operator (==). Identity checks whether x and y are the same object, i.e. both names point to the same place in memory. Equality only checks whether the values are equal.
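A sketch of the tuple workaround and the difference between is and ==. The tuple contents are assumed; the name original is added only to keep a reference to the first tuple.

```python
luke_skill = ("python", "sql")  # assumed values
original = luke_skill           # second name for the same tuple object
new_skill = ("excel",)

# + builds a brand new tuple; the old one is left untouched
luke_skill = luke_skill + new_skill
print(luke_skill)                      # ('python', 'sql', 'excel')
print(original)                        # ('python', 'sql')
print(id(luke_skill) == id(original))  # False
print(luke_skill is original)          # False

# Equal values do not mean same object
list_a = [1, 2, 3]
list_b = [1, 2, 3]
print(list_a == list_b)  # True
print(list_a is list_b)  # False
```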

DRY – DON’T REPEAT YOURSELF (LOOPS)

Didn’t know this: instead of using + every time in print( ) to join literals, you can simply separate them with commas. print( ) puts a space between its arguments automatically.
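For instance, with a made-up skill variable, the two calls below print the same line:

```python
skill = "python"  # assumed value

# Joining with + needs the spaces written by hand
joined = "I know " + skill + " well"
print(joined)  # I know python well

# With commas, print() inserts a space between the arguments
print("I know", skill, "well")  # I know python well
```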
Two equivalents: a for loop and a list comprehension.

Easy list comprehension – saves 3 LOC compared with using only a for loop.
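A sketch of the two equivalent forms, using an assumed list of skills and upper-casing as a stand-in operation:

```python
skills = ["python", "sql", "excel"]  # assumed values

# Version 1: plain for loop
upper_loop = []
for skill in skills:
    upper_loop.append(skill.upper())

# Version 2: list comprehension, same result in one line
upper_comp = [skill.upper() for skill in skills]

print(upper_loop == upper_comp)  # True
```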

all(condition for item in iterable)

Had a big problem understanding how the list comprehension at 2:51:35 came about. The pattern above is key to understanding it. You’re testing whether the skills that I have exist in the job roles. If yes, then I’m qualified for that role.

Python syntax says that the condition/expression comes first, followed by the for clause that does the iterating. That’s why it’s written as all(condition for item in iterable).
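The comprehension from the video is not reproduced in these notes, so this is only a sketch of the all( ) pattern. The names my_skills and job_roles, their contents, and the direction of the check (every skill a role requires must be one I have) are assumptions.

```python
my_skills = ["python", "sql", "excel"]  # assumed values
job_roles = {                            # assumed structure: role -> required skills
    "Data Analyst": ["sql", "excel"],
    "Data Engineer": ["python", "sql", "spark"],
}

# all(...) is True only if the condition holds for every item
qualified = [
    role
    for role, required in job_roles.items()
    if all(skill in my_skills for skill in required)
]
print(qualified)  # ['Data Analyst']
```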
Right, so in (lambda x : x+3), x is the argument and x+3 is the expression whose value is returned.
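A tiny sketch of that lambda in use. The name add_three is made up.

```python
# Give the lambda a name and call it like a normal function
add_three = lambda x: x + 3
print(add_three(2))  # 5

# Or call it straight away without naming it
immediate = (lambda x: x + 3)(10)
print(immediate)  # 13
```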

Right, so maintaining environments is vital for Python. An environment contains the runtime, the variables and what not. There are two ways we can set one up: Anaconda, or pip install <x>.

pip is a bash/shell command, not a Python command: it is executed in the terminal.
Pandas is really popular for accessing tabular data and using data frames to analyze it.
Matplotlib is used to visualise things.

Libraries can be distributed as packages, and you can download them from PyPI or Anaconda using a package manager. But a package that only includes a few modules and is very basic can’t really be considered a library.

A class is usually named like BaseSalary: capitalised words with no underscore b/w them. Underscores are more commonly used in variable names. We want to quickly identify that it’s a class, so we follow this convention.
NumPy is a very critical library: it is one of the main dependencies of Pandas.
It has a very rich ecosystem:

• NumPy arrays are designed to store data in a structured, multi-dimensional format.

• Lists and tuples are common ways to represent collections of data in Python.

• By providing a list or tuple to np.array(), you're essentially telling NumPy how to structure the data within the array.
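A small sketch of building arrays from a list, a tuple and a nested list. The numbers are arbitrary.

```python
import numpy as np

from_list = np.array([1, 2, 3])    # built from a list
from_tuple = np.array((1, 2, 3))   # built from a tuple

# A nested list becomes a 2-dimensional array (2 rows, 3 columns)
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(from_list.shape)  # (3,)
print(matrix.shape)     # (2, 3)
```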

Right, so say you want to multiply lists/arrays element by element, like in the above use case. This can be done directly in NumPy using arrays.

You can’t do it directly with lists because * doesn’t accept two lists as operands. There are two other ways: one which I’ve been used to, and the other is the zip function, which is also nice:
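The original use case is not shown in these notes, so this sketch uses made-up hours and rates lists. It covers only the zip approach and the NumPy approach.

```python
import numpy as np

hours = [40, 35, 20]  # assumed values
rates = [10, 20, 30]  # assumed values

# hours * rates would raise a TypeError for two plain lists

# zip pairs the items up so they can be multiplied one by one
products_zip = [h * r for h, r in zip(hours, rates)]
print(products_zip)  # [400, 700, 600]

# NumPy arrays multiply element by element directly
products_np = np.array(hours) * np.array(rates)
print(products_np.tolist())  # [400, 700, 600]
```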
Pandas is the industry standard for handling tabular data: anything Excel and CSV can handle, this can handle!
This is how we import the dataset in our Colab notebook.

df.head(n) gives the first n rows.

You can access a particular column with df['column_name'].head(n).

If you want to access multiple columns, you have to pass the names in a list: df[['a', 'b']].
If you want to view a sequence of rows somewhere in b/w, use iloc with slicing.
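The dataset from the notebook is not available here, so this sketch builds a tiny made-up DataFrame (column names and values are assumptions) to show the same selections.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset
df = pd.DataFrame({
    "job_title": ["Data Analyst", "Data Engineer", "Data Scientist", "ML Engineer"],
    "job_location": ["Delhi", "Mumbai", "Pune", "Chennai"],
    "salary_year_avg": [90000.0, 110000.0, 120000.0, 130000.0],
})

first_two = df.head(2)                       # first 2 rows
titles = df["job_title"].head(2)             # one column, first 2 values
subset = df[["job_title", "job_location"]]   # several columns: names in a list
middle = df.iloc[1:3]                        # rows at positions 1 and 2

print(middle)
```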

Right, so when you make some changes, most methods return a new DataFrame and leave the original as it is, unless you assign the result back with df = ...
Or pass inplace=True to the method, which ensures that the changes are applied to df itself.

The ‘axis’ parameter is useful: axis=1 works on columns, e.g. deletes an entire column, and axis=0 works on rows, e.g. deletes all rows where the value is NaN for a particular column.

The above commands quite literally delete the columns or rows concerned. So, we’re deleting the monthly average salary column entirely because we have the yearly one available, and then deleting the rows with NaN values in the yearly average.
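A sketch of those two steps on a made-up DataFrame. The column names salary_month_avg and salary_year_avg are assumptions standing in for the real ones.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset
df = pd.DataFrame({
    "job_title": ["Data Analyst", "Data Engineer", "Data Scientist"],
    "salary_month_avg": [None, 5000.0, None],
    "salary_year_avg": [90000.0, None, 120000.0],
})

# Without assignment or inplace=True, df itself is not changed
df.drop("salary_month_avg", axis=1)
still_there = "salary_month_avg" in df.columns  # True

# Assign the result back to keep the change (axis=1: drop a column)
df = df.drop("salary_month_avg", axis=1)

# Drop rows (axis=0 is the default) where the yearly average is NaN
df.dropna(subset=["salary_year_avg"], inplace=True)

print(df)
```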
