
Data Science - Module 1_ Data Science Fundamentals

Vertical Institute - Data Science Boot Camp - Python Programming - Module 1

Uploaded by

espionage84

MODULE 01 DATA SCIENCE BOOTCAMP

Introduction to Python and Data Science

Reminder

1 Please keep your Zoom camera on for the whole duration of classes.

2 At the start of all classes, please rename yourself to: Name + last 3 digits and letter of your NRIC. Example: John Tan (123A)
Agenda

● Introduction To The Course

● Introduction to Python, Jupyter Notebook and Data Science

● Python Fundamentals

© 2024 Vertical Institute


Course Overview: Module 1-2

Starting from ground zero, you will learn fundamental programming concepts by executing basic functions in Python.

You will be able to code comfortably in Python and understand control flow and conditional programming.



Course Overview: Module 3-4

In the next two modules, you will practice exploratory data analysis for cleaning and aggregating data, learn basic statistical testing of your data, and more.



Course Overview: Module 5-7

Finally, you will build and refine machine learning models to predict patterns from data sets, and tune data parameters for advanced model evaluation.

You will also begin working on your capstone project to solve a real-world problem related to finance.



Capstone Project (100% of final grade)

● You will address a data-related problem to create a predictive model (must be finance-related). You will acquire a real-world finance data set, form a hypothesis about it, and then clean, parse, and apply modelling techniques and data science principles.

● The capstone project will culminate your learning by applying the new tools and concepts learnt to create a report that includes:

✔ A clearly articulated problem statement.
✔ A summary of the data acquisition, cleaning, and parsing stages.
✔ A clear presentation of your predictive model and the processes you took to create it.



🧰 Technicalities

1 How is your Wifi connection? The latest version of Zoom is required if you’re having trouble with Zoom.

2 A laptop is required. Mobile or tablet screens are too small, and the screen will be different from the trainer demo if you use your mobile or tablet.

3 Recordings and PDF slides are provided to you in your learners’ portal on Vertical Institute’s website.
Things to take note… ✍

To receive the funding support, please take note of the following:
● Minimum of 75% attendance (this means that you must attend at least 6 out of 7 lessons).
● Achieve at least a PASS for the Capstone Project.
● The Capstone Project has to be submitted by the deadline given (1 week from the end of the bootcamp).

In the event that the participant fails to fulfil the above-mentioned requirements, the participant is liable for the full amount of the course fee.



What if I missed a lesson? 🗓

If you miss a lesson, you are allowed to attend a make-up class. This will be the corresponding class of another intake in the same week, subject to slot availability.

To arrange a make-up class, you can contact the Teaching Assistant or Admin, and they will be able to arrange one nearer to the date.



INTRODUCTION

Hello there! 👋



INTRODUCTION

Instructor Introduction 👋



👋 Now let’s meet you!
Now it’s your turn to introduce yourself!

● Name
● Occupation/School
● In your spare time, what do you like to do?
● Why do you want to learn Data Science?



📸 Attendance Photo Taking

What is Python and Why Learn it?



Amazon, Google, FB, Netflix – What Do They Have In Common?

FAANG companies love Python and use it for their real-world applications.



What is Python?
● Released in 1991
● Created by Guido van Rossum
● Flexible
● Easy-to-learn
● Open Source

www.i-programmer.info/news/216-python/12748-guido-van-rossum-on-python-and-diversity-in-open-source.html

Jupyter (formerly IPython) Notebook

● Web-based application that can
○ Run Python code
○ Contain data
○ Render visualisations
○ Take notes in markdown



Why Learn Python?
• Named as the most in-demand coding language
• One of the favorite languages of data scientists
• A multi-purpose language: you can build applications in Python as well as perform data analysis



Jupyter Notebook



What is Jupyter Notebook?

• The Jupyter Notebook is a powerful tool for interactively developing and presenting data science projects.

• It’s a single document where you can run code, display the output, and also add explanations, formulas, and charts, making your work more transparent, understandable, repeatable, and shareable.

• Free to install and use!

Credit: https://www.dataquest.io/blog/jupyter-notebook-tutorial/



Setting Up Your Jupyter Notebook

Download Anaconda here:


www.anaconda.com/products/individual



Setting Up Your Jupyter Notebook

Launch the installer and follow the recommended settings during the installation.

Under Advanced Installation Options, make sure to select ‘Register Anaconda as my default Python 3.7’.



Setting Up Your Jupyter Notebook

When the installation is complete, you should be able to access an application called ‘Jupyter Notebook’.



Setting Up Your Jupyter Notebook

● On your web browser, you should be able to see the Jupyter Notebook interface.

● If you’re able to arrive at this screen, the installation should have been completed successfully!



Introduction to Data Science



What is Data Science?
Ever wondered how YouTube’s recommendation engine works? Or how TikTok knows exactly what to show you next?

These predictive functionalities are driven by training a computer to learn from large data sets.

Machine learning is powering innovation in everything from insurance-tech to lending models to fraud detection.

https://www.wsj.com/video/series/inside-tiktoks-highly-secretive-algorithm/investigation-how-tiktok-algorithm-figures-out-your-deepest-desires/6C0C2040-FF25-4827-8528-2BD6612E3796



What is Data Science?
Data science lies at the intersection of business, statistics
and computer science.



What is Data in Data Science?
Traditional data is structured data stored in databases that analysts can manage from one computer; it is in table format and contains numeric or text values.

Big data, on the other hand, is… bigger than traditional data, and not in the trivial sense. From variety (numbers, text, but also images, audio, mobile data, etc.), to velocity (retrieved and computed in real time), to volume (measured in tera-, peta-, exa-bytes), big data is usually distributed across a network of computers.

https://www.kdnuggets.com/2018/06/what-where-how-data-science.html



Data Science Lifecycle



Data Science Use Cases



Python Data Types



Python Data Types
Primitive data types are the building blocks for data manipulation and contain pure, simple values. There are 4 primitive variable types.

Primitive Variable Type | Explanation | Examples
Integers | Represents numeric data, more specifically whole numbers from negative infinity to infinity | -1, 2, 50
Float | Short for floating point number, usually used with decimals | -2.1, 2.8, 3.14159
String | Collection of letters, words or other characters, usually enclosed within a pair of single or double quotes | "words", "1", " "
Boolean | Takes the value True or False. Commonly used for controlling the flow of a program | True, False
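To see the four primitive types in action, the built-in type() function can report the type of any value. The sketch below is only an illustration; the variable names and sample values are made up for this example.

```python
# Hypothetical sample values, one per primitive type.
whole_number = 50          # integer
decimal_number = 3.14159   # float
text = "words"             # string
is_valid = True            # boolean

# type() reports the data type of each value; __name__ gives its short name.
type_names = [type(value).__name__
              for value in (whole_number, decimal_number, text, is_valid)]
print(type_names)

# A digit wrapped in quotes is a string, not a number.
quoted_digit_type = type("1").__name__
print(quoted_digit_type)
```

Running type() on a value is a quick way to check what you are working with before doing arithmetic or text operations on it.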



Python Data Types
Non-primitives are the sophisticated members of the data structure family. They don’t just store a value, but rather a collection of values in various formats.

Non-Primitive Variable Type | Explanation | Examples
Lists | Used to store a collection of heterogeneous (diverse) items. They are mutable, which means you can change their content without changing their identity | [1, 2, 3]; ['a', 'b', 'c']; [1, 'apple', 3]
Dictionaries | Made of key-value pairs. The key identifies the item, and the value holds, as the name suggests, the value of the item | x_dict = {'a': 1, 'b': 2}
Tuples | Another standard sequence data type; however, a tuple is immutable, meaning that once defined, you cannot delete, add or edit any values inside it | ('a', 'b', 'c', 'd', 'e')
Set | Unordered collection of distinct (unique) objects | x = set(['a', 'a', 'b', 'c']) gives {'a', 'b', 'c'}
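The behaviours described in the table (mutability, key lookup, immutability, uniqueness) can be checked directly. This is a minimal sketch; the variable names other than x_dict and x are invented for illustration.

```python
# Lists are mutable: the content changes while the list object stays the same.
items = [1, "apple", 3]
identity_before = id(items)
items.append("b")
same_object = id(items) == identity_before

# Dictionaries look up a value by its key.
x_dict = {"a": 1, "b": 2}
value_of_b = x_dict["b"]

# Tuples are immutable: assigning to an item raises a TypeError.
letters = ("a", "b", "c", "d", "e")
try:
    letters[0] = "z"
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets keep only distinct values, in no guaranteed order.
x = set(["a", "a", "b", "c"])

print(items, same_object, value_of_b, tuple_changed, sorted(x))
```

sorted(x) is used only to print the set in a predictable order, since a set itself has no fixed ordering.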



Variables



Variables Assignment
● A variable is a named place in memory where a programmer can store data and later retrieve it by using the variable name

● As a programmer, you get to choose the names of the variables

Examples of declaring a variable in Python

a = 123  # number
b = 'Hello'  # string
c = [1, 2, 3]  # list
d = {"1": "A", "2": "B"}  # dictionary
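As a small illustration of storing and retrieving data by name, the sketch below assigns a value, reads it back, and then reassigns it. The names greeting, message and counter are hypothetical.

```python
# Store a value under a name chosen by the programmer.
greeting = "Hello"

# Retrieve the stored value later by using the name.
message = greeting + ", world"
print(message)

# Assigning again makes the same name refer to a new value.
counter = 123
counter = counter + 1
print(counter)
```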



Rules of Declaring a Variable

1. A Python variable name can contain lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), and underscores (_).
2. A variable name can’t start with a digit.
3. We can’t use reserved keywords as a variable name.
4. A variable name can’t consist only of digits.
5. A variable name can start with an underscore or a letter.
6. Variable names are case sensitive.
7. There is no limit on the length of a variable name.



Reserved Keywords
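One way to see the reserved keywords for the Python version you are running is the standard library's keyword module. This is a short sketch rather than a complete reference; the exact list varies slightly between Python versions.

```python
import keyword

# keyword.kwlist holds the reserved keywords of the running Python version.
print(keyword.kwlist)

# keyword.iskeyword() checks a single name.
def_is_reserved = keyword.iskeyword("def")
total_is_reserved = keyword.iskeyword("total")
print(def_is_reserved, total_is_reserved)
```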



Best Practices

● Variable names should be lowercase.

● A variable's name should be representative of the value(s) it has been assigned.

● If you must use multiple words in your variable name, use an underscore to separate them.



Examples of Invalid Variables

● 9abc: a variable name can’t start with a digit.
● 123: a variable name can’t consist only of digits.
● x-y: the only special character allowed in a variable name is an underscore.
● def: invalid variable name because it’s a reserved keyword.
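The invalid names above can be checked programmatically. The helper below, is_valid_variable_name, is a hypothetical function written for this illustration; it combines the built-in str.isidentifier() method with keyword.iskeyword(). Note that str.isidentifier() also accepts non-English letters, so it is slightly more permissive than the a-z, A-Z, 0-9 rule of thumb.

```python
import keyword

def is_valid_variable_name(name):
    """Return True if `name` could be used as a variable name."""
    # str.isidentifier() checks the character rules;
    # keyword.iskeyword() rejects reserved keywords such as `def`.
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["9abc", "123", "x-y", "def", "_total", "customer_name"]
results = {name: is_valid_variable_name(name) for name in candidates}
print(results)
```

The first four candidates are rejected and the last two are accepted, matching the rules listed for declaring a variable.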



Built-In functions
Functions are reusable pieces of code that run a specific computation on the input you give them.
A function call is identifiable by the function name followed by round brackets.

● sorted()
● len()
● set()
● list()
● print()
● type()
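A brief sketch of the six built-in functions listed above, applied to a made-up list called scores:

```python
scores = [3, 1, 2, 3]   # hypothetical sample data

ordered = sorted(scores)        # new list in ascending order
count = len(scores)             # number of items
unique = set(scores)            # distinct values only
unique_list = list(unique)      # convert the set back to a list
kind = type(scores).__name__    # short name of the data type

# print() displays values on the screen.
print(ordered, count, unique, kind)
```

sorted() returns a new list and leaves scores unchanged, which is why the original list can still be passed to len() and set() afterwards.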



Python Errors



What are Python Errors?

We can make mistakes while writing a program that lead to errors when we try to run it.

A Python program terminates as soon as it encounters an unhandled error. These errors can be broadly classified into 2 groups:

1. Syntax errors
2. Logical errors (exceptions)



What are Python Errors?

SyntaxError: Code that cannot be interpreted by Python

AttributeError: Accessing an attribute or calling a method that the object’s type does not support

NameError: Using a variable that does not exist yet

TypeError: Performing an operation on an incorrect or unsupported object type

ZeroDivisionError: Dividing a number by zero, or taking a number modulo zero
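Each of the five error types can be triggered on purpose and caught with try/except, so that the program keeps running instead of terminating. The sketch below is illustrative only: error_name and the examples dictionary are hypothetical helpers, and the syntax error is produced by passing a broken code string to the built-in compile() so that the rest of the example can still run.

```python
def error_name(action):
    """Run `action` and return the name of the exception it raises, if any."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return "no error"

# Each entry deliberately triggers one kind of error when called.
examples = {
    "syntax": lambda: compile("x = = 1", "<example>", "exec"),
    "attribute": lambda: (123).upper(),        # integers have no .upper()
    "name": lambda: undefined_variable,        # this name was never assigned
    "type": lambda: "1" + 1,                   # cannot add str and int
    "zero_division": lambda: 1 % 0,            # modulo by zero
}

results = {label: error_name(action) for label, action in examples.items()}
for label, name in results.items():
    print(label, "->", name)
```

Reading the error name in a traceback is usually the fastest way to narrow down what went wrong.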



Python Operators



Arithmetic Operator

The arithmetic operators perform addition, subtraction, multiplication, division, exponentiation, and modulus operations.
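A quick illustration of the six arithmetic operations, using two made-up operands:

```python
a, b = 7, 2   # hypothetical operands

addition = a + b          # 9
subtraction = a - b       # 5
multiplication = a * b    # 14
division = a / b          # 3.5 (the / operator always returns a float)
exponentiation = a ** b   # 49 (7 to the power of 2)
modulus = a % b           # 1 (remainder of 7 divided by 2)

print(addition, subtraction, multiplication, division, exponentiation, modulus)
```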



Relational Operator

● Operators can also be used to compare objects
● Comparison is required for sorting, and sorting helps searching; both are fundamental information processing tools
● Takes in 2 operands and returns a Boolean (True or False), which is later used to control program flow or filter rows during analytics
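The three uses of comparison mentioned above (producing a Boolean, sorting, and filtering rows) might look like this in practice. All names and figures are invented for the example.

```python
price, budget = 120, 100   # hypothetical values

# A relational operator takes 2 operands and returns a Boolean.
over_budget = price > budget
same_amount = price == budget

# The Boolean can control program flow.
if over_budget:
    status = "too expensive"
else:
    status = "affordable"

# Sorting relies on comparing items with each other.
amounts = [250, 40, 120]
sorted_amounts = sorted(amounts)

# A comparison can also filter rows during analysis.
rows = [
    {"customer": "A", "amount": 250},
    {"customer": "B", "amount": 40},
    {"customer": "C", "amount": 120},
]
large_rows = [row for row in rows if row["amount"] > 100]

print(over_budget, same_amount, status, sorted_amounts, large_rows)
```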



Logical Operator

Each comparison operator creates a single Boolean; logical operators combine Booleans to implement logical concepts.
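Python's three logical operators are and, or, and not. The following sketch combines the results of simple comparisons; the scenario and variable names are hypothetical.

```python
age, has_id = 20, True   # hypothetical inputs

is_adult = age >= 18                       # a single Boolean from one comparison

can_enter = is_adult and has_id            # True only if both sides are True
needs_check = (age < 18) or (not has_id)   # True if at least one side is True
is_minor = not is_adult                    # flips True to False and vice versa

print(can_enter, needs_check, is_minor)
```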



Logical Operator Examples



Operator Precedence and Associativity

Operator Precedence (high to low):


1. Python operators have different levels of precedence; operators with higher precedence are evaluated first
2. A good practice is to use parentheses to explicitly indicate the desired evaluation order

Operator Associativity (the order in which Python evaluates an expression containing multiple operators of the same precedence)
1. Left associativity means that the expression is evaluated from left to right (almost all operators)
2. Right associativity means that the expression is evaluated from right to left (for example, the exponentiation operator **)
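A few small expressions make precedence and associativity concrete. The variable names are only labels for this illustration.

```python
# Precedence: multiplication is evaluated before addition.
no_parentheses = 2 + 3 * 4        # 2 + 12 = 14

# Parentheses make the intended order explicit.
with_parentheses = (2 + 3) * 4    # 5 * 4 = 20

# Left associativity: subtraction is evaluated from left to right.
left_to_right = 10 - 4 - 3        # (10 - 4) - 3 = 3

# Right associativity: exponentiation is evaluated from right to left.
right_to_left = 2 ** 3 ** 2       # 2 ** (3 ** 2) = 2 ** 9 = 512

print(no_parentheses, with_parentheses, left_to_right, right_to_left)
```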



CRUD Framework



CRUD Framework

CRUD is an acronym that comes from the world of computing and refers to the four functions that are considered necessary to implement a persistent storage application.

Create: Allows users to create a new record in the database

Read: Allows users to search and retrieve specific records in the table and read their values

Update: Allows users to modify existing records in the database

Delete: Allows users to remove records from a database that are no longer needed
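The four CRUD operations can be tried out on a Python dictionary before touching a real database. This is only a sketch: a dictionary lives in memory and is not persistent storage, and the record ID, name and balance are made up.

```python
# An in-memory stand-in for a database table: record ID -> record.
records = {}

# Create: add a new record.
records["C001"] = {"name": "John Tan", "balance": 100}

# Read: retrieve a record and read its values.
name = records["C001"]["name"]

# Update: modify an existing record.
records["C001"]["balance"] = 150
balance_after_update = records["C001"]["balance"]

# Delete: remove a record that is no longer needed.
del records["C001"]
remaining = len(records)

print(name, balance_after_update, remaining)
```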



CRUD Operations In Finance

A financial institution maintains multiple databases that help manage and keep track of existing customers, financial products and spending patterns. Below are some of the common financial tables:

● A Customer Data Table includes attributes such as first and last name, personal identification
number, contact number, home address, work location, and any other relevant personal details.
● A Product Table that includes the company’s financial products such as credit cards, loans and
trading activities.
● A Transaction Table that contains data at the transaction level for each of the customers, including
frequency, amount and recency.
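To illustrate how CRUD might apply to a customer data table, the sketch below uses Python's built-in sqlite3 module with a temporary in-memory database. The table name, column names and sample values are assumptions chosen for this example, not a real institution's schema.

```python
import sqlite3

# A temporary in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, contact_number TEXT)"
)

# Create: insert a new customer record.
conn.execute(
    "INSERT INTO customers (first_name, last_name, contact_number) VALUES (?, ?, ?)",
    ("John", "Tan", "555-0100"),
)

# Read: retrieve a specific record.
row = conn.execute(
    "SELECT first_name, last_name FROM customers WHERE id = ?", (1,)
).fetchone()

# Update: modify an existing record.
conn.execute("UPDATE customers SET contact_number = ? WHERE id = ?", ("555-0199", 1))
updated_number = conn.execute(
    "SELECT contact_number FROM customers WHERE id = ?", (1,)
).fetchone()[0]

# Delete: remove a record that is no longer needed.
conn.execute("DELETE FROM customers WHERE id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

conn.close()
print(row, updated_number, remaining)
```

The ? placeholders pass values separately from the SQL text, which is the usual way to avoid building queries by string concatenation.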



📝 Recap time!
What are your favorite takeaways?
Let’s share with each other!

Some things to take note…

Links and resources can be accessed in the Learning Portal:
https://elearn.verticalinstitute.com/users/sign_in



📸 Attendance Photo Taking
MODULE 01 DATA SCIENCE BOOTCAMP

Thank you!
