Machine Learning With Python: The Complete Course

Programming

Uploaded by

Magno Junior
Machine Learning with Python
The Complete Course

TELCOMA
Copyright © TELCOMA. All Rights Reserved
Module 2
Data Scientist’s Toolbox



Content:
1. Python - Quick recap

2. Python 2.7.x or 3.x ?

3. Installation and setup

4. Data types, functions and important packages

5. Data manipulation & Data Engineering

6. Data Visualization



Quick recap – Python

• Highly popular, open-source, general-purpose programming language
• Gained a lot of traction with the launch of Pandas, SciPy and scikit-learn
• Extensive production support
• Preferred choice for Deep Learning
• Expected to overtake R as the preferred language for data scientists

Popular packages
pandas, numpy, matplotlib, scipy, statsmodels, scikit-learn



Python 2.7.x or 3.x?
Python 2.x is legacy, Python 3.x is the present and future of the language

A Few Differences

Python 2.7.x
• Has more libraries
• Extensive 3rd party module support
• No new major releases

Python 3.x
• Cutting edge: all new features will be added to 3.x
• Limited 3rd party module support
• Is under active development



Installation & setup

Recommended distribution
– Anaconda Python 3.5.x

What platforms are supported?
- Windows, Linux and Mac (32 bit and 64 bit versions)

What tools do we use?
- IDE: Spyder / Jupyter Notebooks

Windows Installation
• Download the installer (32 or 64 bit)
• Run the .exe file and follow the installation wizard

OSX Installation

Graphical Installer
• Download the graphical installer
• Double-click the downloaded .pkg file and follow the installation wizard

Command Line Installer
• Download the command-line installer
• In your terminal window type the command below and follow the instructions:
  bash Anaconda2-x.x.x-MacOSX-x86_64.sh

Linux Installation
• Download the installer (32 or 64 bit)
• In your terminal window type the command below and follow the instructions:
  bash Anaconda2-x.x.x-Linux-x86_xx.sh

https://www.anaconda.com/download/
https://repo.continuum.io/archive/
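
Once the installer has finished, a quick sanity check can confirm that the interpreter and the packages named in this module are visible. The following is only a sketch: the package list is taken from the "Popular packages" slide, and `sklearn` is assumed to be the import name for scikit-learn.

```python
import importlib.util
import sys

# Import names for the packages listed in this module
# ("sklearn" is the import name of scikit-learn).
packages = ["pandas", "numpy", "matplotlib", "scipy", "statsmodels", "sklearn"]

print("Python", sys.version.split()[0])

# find_spec returns None when a top-level package cannot be located.
availability = {
    name: importlib.util.find_spec(name) is not None for name in packages
}
for name, found in availability.items():
    print(name, "OK" if found else "missing")
```

Any package reported as missing can usually be added from the Anaconda prompt before continuing with the demos.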
Data types

Basic Data types

Boolean     True, False
Integer     -1, 0, 1 (unlimited precision in Python 3; Python 2 limits int to the machine word size and has a separate Long type)
Long        1234 (Python 2 only, unlimited precision)
Float       3.21456, 6.3
Complex     2+9j (numbers with a real and an imaginary part)
String      'This is a string'
Dictionary  {'A': 'item1', 'B': 'item2'}
File        f = open('path/filename', 'rb')
List        [1, 2, 4, 5]
Set         {1, 'ML', 2}
Tuple       (1, 2, 3, 5)

List vs Tuple vs Set vs Dictionary?

• List: use when you need an ordered sequence of homogeneous or heterogeneous items whose values can be changed later in the program
• Tuple: use when you need an ordered sequence of heterogeneous items whose values will not be changed later in the program
• Set: ideal when you don't need to store duplicates and are not concerned about the order of the items
• Dictionary: ideal whenever you need to reference values with keys
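
The differences between the four containers are easiest to see side by side. The snippet below is a minimal sketch; the variable names (`scores`, `point`, `tags`, `items`) are illustrative only.

```python
# List: ordered and mutable, so items can be added or replaced.
scores = [1, 2, 4, 5]
scores.append(7)
scores[0] = 10

# Tuple: ordered but immutable, so assignment raises TypeError.
point = (1, 2, 3, 5)
try:
    point[0] = 9
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# Set: unordered, duplicates are discarded automatically.
tags = {1, "ML", 2, "ML", 1}

# Dictionary: values are looked up by key.
items = {"A": "item1", "B": "item2"}
items["C"] = "item3"

print(scores, point, len(tags), items["A"])
```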



Demo



Data types continued..

Numpy
NumPy's main object is the homogeneous multidimensional array. It comes with fast math functions that operate on the array, and it also provides simple routines for linear algebra and FFT as well as sophisticated random-number generation.

E.g.
    import numpy as np
    a = np.arange(15).reshape(3, 5)
    a

    array([[ 0,  1,  2,  3,  4],
           [ 5,  6,  7,  8,  9],
           [10, 11, 12, 13, 14]])

A sample numpy array with 11 rows (0 to 10) and 3 columns

        0    1    2
   0   12   13   35
   1   14   16   56
   2   15   19   77
   3   17   22   98
   4   18   25  119
   5   20   28  140
   6   21   31  161
   7   23   34  182
   8   24   37  203
   9   26   40  224
  10   27   43  245
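
The capabilities named on the NumPy slide (fast element-wise math, linear algebra, FFT, random numbers) can each be tried in a line or two. This is a small sketch only; the matrix `m` and the seed `0` are arbitrary values chosen for illustration.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# Fast math: operations apply to every element without a Python loop.
doubled = a * 2
col_sums = a.sum(axis=0)  # sum down each column

# Linear algebra: invert a small diagonal matrix.
m = np.array([[2.0, 0.0], [0.0, 4.0]])
inv = np.linalg.inv(m)

# FFT: the transform of a constant signal has only a zero-frequency term.
spectrum = np.fft.fft(np.ones(4))

# Random numbers: a seeded generator gives reproducible draws.
rng = np.random.default_rng(0)
sample = rng.normal(size=1000)

print(col_sums, inv, spectrum.real, sample.shape)
```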



Data types continued..

Pandas

Series
A 1-dimensional labelled array capable of holding any data type.

DataFrame
A 2-dimensional labelled data structure with columns of potentially different types. Analogous to a spreadsheet or SQL table.

Note
The Series is the data structure for a single column of a DataFrame. Conceptually, a DataFrame is a collection of Series that share one index.

A sample pandas DataFrame with 8 rows and 3 columns

Index  Column1  Column2  Column3
1      100      Andy     1/1/1990
2      120      Jake     5/6/1985
3      140      Bill     17/05/2000
4      160      Smith    9/8/1980
5      180      Jane     1/12/1976
6      200      Melvin   17/05/2001
7      220      Roger    5/17/1971
8      240      Rahul    9/19/1966

A sample pandas Series with 8 rows

Index
1      45
2      46
3      47
4      48
5      49
6      50
7      51
8      52

Download data from - https://www.kaggle.com/ludobenistant/hr-analytics/data
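
The sample DataFrame and Series from this slide can be rebuilt in a few lines, which also shows that selecting one column of a DataFrame yields a Series. This is a sketch: the dates in Column3 are kept as plain strings because the slide mixes date formats.

```python
import pandas as pd

index = range(1, 9)  # the slide's index runs from 1 to 8

df = pd.DataFrame(
    {
        "Column1": [100, 120, 140, 160, 180, 200, 220, 240],
        "Column2": ["Andy", "Jake", "Bill", "Smith",
                    "Jane", "Melvin", "Roger", "Rahul"],
        "Column3": ["1/1/1990", "5/6/1985", "17/05/2000", "9/8/1980",
                    "1/12/1976", "17/05/2001", "5/17/1971", "9/19/1966"],
    },
    index=index,
)

# A stand-alone Series with the same index labels.
s = pd.Series(range(45, 53), index=index)

# Selecting a single column returns a Series.
col = df["Column1"]

print(df.shape, type(col).__name__, s.loc[1], s.loc[8])
```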


Functions

Function
A function is a block of organized, reusable code that is used to perform a single, related action.

Defining a function
E.g.

    def functionname(parameters):
        "function_docstring"
        function_suite
        return [expression]
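
A concrete function that follows this template might look like the sketch below. The name `mean_of` and its input values are hypothetical, chosen only to show the docstring, body, and return statement in place.

```python
def mean_of(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    total = sum(values)          # function body (the "suite")
    return total / len(values)   # value handed back to the caller


average = mean_of([12, 14, 15, 17])
print(average)
print(mean_of.__doc__)
```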



Demo



Data Manipulation and Engineering

Most elementary data manipulation exercises

- Reading CSV data from the local system
- Exploring the length and breadth of the data: head/tail/shape/columns
- CRUD
  - CR: add new columns / create new dataframes
  - U: update columns / filter
  - D: delete columns
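
The elementary exercises listed above can be sketched with pandas as follows. The CSV content and its column names (`name`, `dept`, `salary`) are made up for illustration and are read from an in-memory string so the example is self-contained; for a local file, a path would be passed to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical data standing in for a CSV file on disk.
csv_text = """name,dept,salary
Andy,sales,100
Jake,sales,120
Bill,tech,140
Smith,tech,160
"""
df = pd.read_csv(io.StringIO(csv_text))

# Explore the length and breadth of the data.
first_rows = df.head(2)
last_rows = df.tail(2)
n_rows, n_cols = df.shape
column_names = list(df.columns)

# Create: add a new column derived from an existing one.
df["bonus"] = df["salary"] * 0.1

# Update: change values in rows that match a condition.
df.loc[df["dept"] == "tech", "salary"] += 10

# Filter: keep only the rows that satisfy a condition.
high = df[df["salary"] > 120]

# Delete: drop a column.
df = df.drop(columns=["bonus"])

print(n_rows, n_cols, column_names)
print(high)
```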

Data Engineering: Transform data

- String manipulation
- Data rollup (groupby)
- Merge, Join, Pivot, concat
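
Each of these transformations maps onto a pandas call. The sketch below uses two small hypothetical tables (`staff` and `depts`); the column names and values are assumptions made for the example.

```python
import pandas as pd

staff = pd.DataFrame(
    {
        "name": [" andy", "Jake ", "bill"],
        "dept": ["sales", "sales", "tech"],
        "salary": [100, 120, 140],
    }
)
depts = pd.DataFrame({"dept": ["sales", "tech"], "floor": [1, 2]})

# String manipulation: trim whitespace and fix capitalisation.
staff["name"] = staff["name"].str.strip().str.title()

# Data rollup: average salary per department.
by_dept = staff.groupby("dept")["salary"].mean()

# Merge: combine two tables on a shared column.
merged = staff.merge(depts, on="dept", how="left")

# Join: the index-based counterpart of merge.
joined = staff.set_index("dept").join(depts.set_index("dept"))

# Pivot: one row per department with the summed salary.
wide = staff.pivot_table(index="dept", values="salary", aggfunc="sum")

# Concat: stack rows from another table underneath.
more = pd.DataFrame({"name": ["Jane"], "dept": ["tech"], "salary": [180]})
combined = pd.concat([staff, more], ignore_index=True)

print(by_dept, merged, wide, combined, sep="\n\n")
```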



Data Visualization

- Scatter plot
- Bar charts
- Line charts
- Histogram
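
The four chart types can be drawn with matplotlib on a single figure. This is a sketch with synthetic data; the `Agg` backend line is an assumption that lets the code run without a display and can be dropped inside Jupyter or Spyder.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove in a notebook

import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
x = np.arange(10)
y = 2 * x + rng.normal(size=10)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].scatter(x, y)
axes[0, 0].set_title("Scatter plot")

axes[0, 1].bar(["A", "B", "C"], [3, 5, 2])
axes[0, 1].set_title("Bar chart")

axes[1, 0].plot(x, y)
axes[1, 0].set_title("Line chart")

axes[1, 1].hist(rng.normal(size=200), bins=20)
axes[1, 1].set_title("Histogram")

fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name, or `plt.show()` in an interactive session, then renders the result.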



Demo



Next Module: Exploratory Data Analysis, Feature Engineering & Hypothesis Testing

