Python for Data Science: A Beginner's Guide

This document provides an overview of Python as a programming language for data science and analytics. It discusses that Python is an easy to learn, high-level, interpreted, general-purpose, and platform-independent language with a massive community and ecosystem. Popular tools for practicing Python include Anaconda, Jupyter notebooks, Google Colab, and DataCamp courses. The document outlines some basic concepts in Python like data types, operators, control flow statements, functions, modules, and packages.

PHUONG NGUYEN

A CRASH COURSE ON PYTHON


THE EASIEST PROGRAMMING LANGUAGE TO LEARN
DATA SCIENCE/ANALYTICS TOOLS
The first decision that must be made in choosing a data
science platform is whether to use an application-based
solution or to use a programming language.

DATA SCIENCE/ANALYTICS TOOLS
APPLICATION-BASED SOLUTIONS
▪ Well-designed application-based, or point-and-click,
tools make it very quick and easy to develop and
evaluate models, and to perform associated data
manipulation tasks.
▪ Using one of these tools, it is possible to train, evaluate,
and deploy a predictive analytics model in less than an
hour!
▪ Important application-based solutions for building
predictive analytics models include IBM SPSS
Modeler, SAS Enterprise Miner, RapidMiner Studio,
KNIME Analytics Platform, Weka and H2O.ai.

Magic Quadrant for Data Science and Machine Learning Platforms 2019
Source: Gartner (2019)


Magic Quadrant for Data Science and Machine Learning Platforms 2020
Source: Gartner (2020)

DATA SCIENCE/ANALYTICS TOOLS
PROGRAMMING LANGUAGES
▪ Two of the most commonly used programming
languages for predictive analytics are R and
Python.
▪ Building predictive analytics models using a
language like R or Python is not especially difficult.
▪ The advantage of using a programming language
for predictive analytics projects is that it gives the
data analyst huge flexibility. Anything that the
analyst can imagine can be implemented.


Top Software for Analytics, Data Science, Machine Learning in 2018


https://www.kdnuggets.com/2018/05/poll-tools-analytics-data-science-machine-learning-results.html
CONTENT
1. INTRODUCTION TO PYTHON
WHAT & WHY?

2. HOW TO PRACTICE PYTHON


GOOGLE COLAB + DATACAMP

3. BASIC THINGS TO KNOW IN PYTHON


TYPES & OPERATORS
CONTROL FLOW STATEMENTS
FUNCTIONS & METHODS
MODULES & PACKAGES

WHAT IS PYTHON?

https://www.python.org

WHY PYTHON?
HIGH-LEVEL LANGUAGE, EASE OF LEARNING
▪ Python is easy to read, and it can express relatively
complex problems in a few statements, which makes
them look simpler and reduces effort.
▪ Python has a gentle learning curve and is one of the
easiest languages to get up to speed with.
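
As a rough illustration of expressing a task in a few statements, the sketch below counts word frequencies using only the standard library. The sample text and variable names are made up for this example.

```python
from collections import Counter

# Hypothetical sample text, used only for illustration.
text = "the quick brown fox jumps over the lazy dog the end"

# Count how often each word occurs in a single statement.
word_counts = Counter(text.split())

# Keep only the words that appear more than once, using a comprehension.
repeated = {word: n for word, n in word_counts.items() if n > 1}

print(word_counts.most_common(1))  # [('the', 3)]
print(repeated)                    # {'the': 3}
```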

Overall question views for Python
https://stackoverflow.blog/2017/09/06/incredible-growth-python/

WHY PYTHON?
INTERPRETED LANGUAGE
▪ Data analysis is mostly an iterative process in which
a lot of exploration is done in an ad hoc manner.
▪ Python, like R, is an interpreted language and
provides an interactive interface for this kind of
exploration.
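
The kind of ad hoc, step-by-step exploration described above might look like the following sketch, where each step would typically be typed and run on its own in an interactive session. The measurements are invented for illustration.

```python
import statistics

# Hypothetical measurements, invented for this illustration.
values = [12.0, 15.5, 11.2, 30.1, 14.8]

# Step 1: look at a quick summary.
mean_all = statistics.mean(values)
median_all = statistics.median(values)

# Step 2: the mean is above the median, so inspect the largest value.
largest = max(values)

# Step 3: recompute the mean without that value and compare.
trimmed = [v for v in values if v != largest]
mean_trimmed = statistics.mean(trimmed)

print(mean_all, median_all, largest, mean_trimmed)
```

Because each statement runs immediately, the result of one step can decide what to try next.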

WHY PYTHON?
GENERAL-PURPOSE LANGUAGE
▪ Python has a comprehensive set of core
libraries for data science. Unlike R, Python is
not built only for data science; it is a
general-purpose language.
▪ Python can be used to build desktop
applications, enterprise applications, web
applications, mobile apps, and games, and it is
comparatively easy to integrate with existing
systems in an enterprise.

WHY PYTHON?
PLATFORM INDEPENDENT
▪ Besides model building, validation, and prediction,
data science projects need data extraction from
various sources, data cleaning, and data
imputation. Enterprises typically want to build
end-to-end integrated systems, and Python is a
powerful platform for building them.
▪ Python is supported on many platforms, such as
Linux, Windows, and macOS.
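
A small sketch of what platform independence can look like in practice: the same script runs unchanged on each operating system, and `pathlib` builds file paths without hard-coded separators. The directory and file names are hypothetical.

```python
import platform
from pathlib import Path

# Report which operating system the interpreter is running on,
# for example "Linux", "Windows" or "Darwin" (macOS).
os_name = platform.system()

# Build a file path without hard-coding "/" or "\\" separators.
# The directory and file names here are hypothetical.
data_file = Path("data") / "raw" / "sales.csv"

print(os_name)
print(data_file.name, data_file.suffix)
```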

WHY PYTHON?
MASSIVE COMMUNITY
▪ Python is developed under an open-source license,
making it freely usable and distributable, even for
commercial use.
▪ Python has an amazing ecosystem and is excellent
for developing prototypes quickly.
▪ Python’s strong community continuously evolves its
data science libraries and keeps them cutting edge.
There are libraries for linear algebra computations,
statistical analysis, machine learning, visualization,
optimization, and more.

PYTHON LIBRARIES FOR DATA SCIENCE
▪ numpy: Mathematical computations
▪ statsmodels: Statistical modelling
▪ scipy: Scientific computing
▪ pandas: Data structure operations
▪ matplotlib, seaborn: Data visualization
▪ scikit-learn: Machine learning
▪ tensorflow, pytorch: Deep learning
▪ nltk: Natural language processing
▪ librosa: Music and audio processing
▪ scikit-video: Video processing
▪ …
https://pypi.org
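
To see which of the libraries listed above are available in a given environment, one option is the standard-library check sketched below. It only tests whether each module can be found and does not import or install anything. Note that some import names differ from the package names: scikit-learn is imported as `sklearn`, scikit-video as `skvideo`, and pytorch as `torch`.

```python
from importlib.util import find_spec

# Import names for the libraries listed above.
libraries = [
    "numpy", "statsmodels", "scipy", "pandas", "matplotlib", "seaborn",
    "sklearn", "tensorflow", "torch", "nltk", "librosa", "skvideo",
]

# find_spec returns None when a top-level module cannot be found.
availability = {name: find_spec(name) is not None for name in libraries}

for name, installed in availability.items():
    print(f"{name:12} {'installed' if installed else 'missing'}")
```

Distributions such as Anaconda ship many of these libraries preinstalled, so the output depends on the environment.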

WHO USES PYTHON?

https://stackshare.io/python

WHY PYTHON?

Interpreted language

General-purpose language

High-level language

HOW TO PRACTICE PYTHON
▪ For any programmer, and by extension for
any data scientist, the integrated
development environment (IDE) is an
essential tool.
▪ IDEs are designed to maximize programmer
productivity. In general, an IDE has three
basic components: the editor, the
compiler/interpreter, and the debugger.

HOW TO PRACTICE PYTHON: ANACONDA

https://www.anaconda.com

HOW TO PRACTICE PYTHON: JUPYTER
▪ With the advent of web applications, a new
generation of IDEs for interactive languages such
as Python has been developed.
▪ The IPython notebook is an interactive console that
shows Python execution results clearly and
concisely by means of cells.
▪ Jupyter (for Julia, Python and R) aims to reuse the
same web-based IDE for all of these interpreted
languages, not just Python.
▪ Colab notebooks are Jupyter notebooks that are
hosted by Google Colab.
https://ipython.org | https://jupyter.org
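
To make the idea of cells more concrete, the sketch below builds a minimal notebook as a plain Python dictionary and serializes it to JSON, which is how `.ipynb` files are stored. The field names follow version 4 of the notebook format as an assumption of this example, and the cell contents are hypothetical; real notebooks are normally created in the Jupyter or Colab interface rather than by hand.

```python
import json

# A minimal notebook with one Markdown cell and one code cell.
# The cell contents are hypothetical.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Exploring a small list"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["values = [1, 2, 3]\n", "sum(values)"],
        },
    ],
}

# Serialize to the JSON text that would be saved as a .ipynb file,
# then read it back and list the cell types.
notebook_json = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(notebook_json)["cells"]]
print(cell_types)  # ['markdown', 'code']
```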

HOW TO PRACTICE PYTHON: COLAB

https://colab.research.google.com/notebooks/welcome.ipynb

HOW TO PRACTICE PYTHON: DATACAMP

https://www.datacamp.com/courses/introduction-to-data-science-in-python

[Figure: labeled fields "Salt caption", "Course Description", "Certificate", "Hashtag", and "Certificate (Drive link)"]

Instruction:
https://www.youtube.com/watch?v=0wOSefc7kp8

BASIC THINGS TO KNOW IN PYTHON
1. Scalar types (int, float, bool) and collection types (str, list,
tuple, dict, set)
2. Assignment (=), arithmetic (+ - * / % //) and relational
operators (== != < > <= >=)
3. Conditional statements (if … elif … else …)
4. Looping statements (while loops and for loops)
5. Functions and methods
6. Modules and packages
https://github.com/nnbphuong/datascience4biz/blob/master/Introduction_to_Python.ipynb
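
The items above can be sketched in one short script. This is only an illustrative tour and is not taken from the linked notebook; every name and value in it is made up.

```python
import math  # 6. a module from the standard library

# 1. Scalar types and collection types
count = 3                      # int
price = 2.5                    # float
in_stock = True                # bool
name = "widget"                # str
sizes = [1, 2, 3]              # list (mutable)
point = (4, 5)                 # tuple (immutable)
stock = {"widget": 3}          # dict (key -> value)
tags = {"new", "sale", "new"}  # set (duplicates are dropped)

# 2. Assignment, arithmetic and relational operators
total = count * price          # 7.5
quotient = 7 // 2              # floor division gives 3
remainder = 7 % 2              # modulo gives 1
is_cheap = total <= 10         # True

# 3. Conditional statement
if total > 100:
    label = "expensive"
elif total > 5:
    label = "moderate"
else:
    label = "cheap"

# 4. Looping statements
squares = []
for n in sizes:
    squares.append(n * n)

countdown = count
while countdown > 0:
    countdown -= 1

# 5. A function, and a method called on a str object
def circle_area(radius):
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2

shout = name.upper()

print(label, squares, shout, round(circle_area(1), 2))
```

A package is a directory of modules; `import math` above shows the module case, and third-party packages such as pandas are imported with the same `import` statement once installed.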

PYTHON CULTURE
▪ The Python community embraces a set of values,
practices, and philosophies, most of which
are documented in the Python Enhancement
Proposals (PEPs).
▪ You can find a set of widely adopted
conventions and Python principles at this link:
https://www.python.org/dev/peps/
▪ The most important PEP for everyday coding is
PEP 8, the style guide for Python code.
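
As a brief illustration of PEP 8, the sketch below follows a few of its conventions: four-space indentation, two blank lines between top-level definitions, and the recommended naming styles. The class and function are hypothetical and exist only to show the style.

```python
# Constants: UPPER_CASE_WITH_UNDERSCORES
MAX_NAME_LENGTH = 20


# Classes: CapWords
class ColumnCleaner:
    """Hypothetical class, defined only to illustrate naming."""

    def __init__(self, separator="_"):
        # Attributes and variables: lowercase_with_underscores
        self.separator = separator

    def clean(self, raw_name):
        """Trim, lowercase, join words with the separator, and truncate."""
        words = raw_name.strip().lower().split()
        return self.separator.join(words)[:MAX_NAME_LENGTH]


# Functions: lowercase_with_underscores
def clean_all(raw_names):
    """Clean every name in a list using the default separator."""
    cleaner = ColumnCleaner()
    return [cleaner.clean(name) for name in raw_names]


print(clean_all(["  Unit Price ", "Order  Date"]))
# ['unit_price', 'order_date']
```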
