(Student View) Python For DS - Week 1 Lecture
Today’s 3 Breakout
5 Week 1 Lecture
● Key for becoming a Data Analyst, Data Scientist or Machine Learning Engineer
Course overview
Week 1 Week 2
55% 32%
Companies
Roles
… and a range of other cool roles:
● CEO
● Data Engineer
● Biologist
● BI Analyst
● Consultant
● Software Engineer
● Architect
● Doctor
● Support Specialist
● Operations specialist
● Manager
● Project Coordinator
● Systems Engineer
● Journalist
● Professor
● Recruiter
1 Course overview
Generosity
Share your expertise with others and support them along the way.
Bravery
Dare to try. Be willing to take risks with your learning, knowing that our community will support you along the way.
Perseverance
When (not if) times get tough, don’t give up. Remember why you are doing this, and dig deep.
Joy
Celebrate successes and failures, they’re both a part of learning! 🎉
Give emoji reactions and #shoutouts for your classmates.
Making a useful slack workspace
Hey…
Cool cool.
● Project Environment
○ #py-for-dsl-introductions
○ #py-for-ds-shoutouts
○ #py-for-ds-questions
○ #py-for-ds-projects
○ #py-for-ds-tips-and-tricks
○ #py-for-ds-feedback
○ #py-for-ds-announcements
You are already winning!
● NumPy is the underlying array package for most of Python's data science and machine learning tools (e.g. pandas, scikit-learn)
Python Lists vs Numpy
Lists Numpy
How is Numpy faster?
NumPy
8 12 2 3
7 5 11 9
18 10 4 6
Lists (stored for every element)
● Size (Int16)
● Reference Count (Int32)
● Object Type (Int32)
● Object Value (Int64)
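The per-element overhead can be checked with a minimal sketch like the one below. It reuses the 3x4 matrix from the slide; the variable names are illustrative, and the exact size of a Python int depends on the Python build, so no specific number is assumed for it.

```python
import sys
import numpy as np

# The 3x4 example matrix from the slide, stored two ways.
rows = [[8, 12, 2, 3], [7, 5, 11, 9], [18, 10, 4, 6]]
arr = np.array(rows, dtype="int32")

# A NumPy array stores only the raw values: 4 bytes per int32 element.
bytes_per_element = arr.itemsize
total_bytes = arr.nbytes  # 12 elements * 4 bytes each

# A Python int is a full object that also carries bookkeeping fields
# (reference count, type, size), so it takes more space than the raw value.
python_int_bytes = sys.getsizeof(5)

print(bytes_per_element, total_bytes, python_int_bytes)
```

On a typical build the last number is several times larger than the first, which is the overhead the field list above describes.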
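NumPy vs Lists (Contiguous Memory)
● List: elements are separate objects scattered across memory
● Numpy: elements are stored side by side in one contiguous block*
*Operations can be computed in parallel on all of these values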
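A small sketch of what contiguous storage looks like from Python, assuming a freshly created 1D array (the values and names are made up for illustration):

```python
import numpy as np

values = [8, 12, 2, 3, 7, 5, 11, 9]
arr = np.array(values, dtype="int64")

# A freshly created array keeps its elements in one contiguous block.
is_contiguous = arr.flags["C_CONTIGUOUS"]

# Neighbouring elements are exactly one item (8 bytes for int64) apart.
step_in_bytes = arr.strides[0]

# One whole-array expression replaces an explicit Python loop.
doubled_array = arr * 2
doubled_list = [v * 2 for v in values]

print(is_contiguous, step_in_bytes, doubled_array)
```

Both versions produce the same numbers; the array version hands the whole block to compiled code in a single call.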
NumPy is the go-to for numerical computation on a set of data
Let’s get started with numpy!
import numpy as np
a = np.array([1,2,3], dtype='int32')  # 1D array of 32-bit integers
b = np.array([[9.0,8.0,7.0],[6.0,5.0,4.0]])  # 2D array of floats
Other NumPy initialization methods
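The slide content for this heading is not in the text, so the following is only a sketch of commonly used NumPy constructors. The shapes, values, and seed are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))               # 2x3 array filled with 0.0
ones = np.ones((4, 2), dtype="int32")  # 4x2 array filled with 1
filled = np.full((2, 2), 99)           # every element is 99
identity = np.identity(3)              # 3x3 identity matrix
counting = np.arange(0, 10, 2)         # start, stop (exclusive), step
spaced = np.linspace(0.0, 1.0, 5)      # 5 evenly spaced points, ends included

# Random arrays via a seeded generator, so runs are repeatable.
rng = np.random.default_rng(seed=0)
random_floats = rng.random((2, 3))              # uniform floats in [0, 1)
random_ints = rng.integers(0, 10, size=(3, 3))  # integers from 0 to 9

print(counting)
print(spaced)
```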
3D Matrix in Numpy
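As a rough illustration, a 3D array can be thought of as a stack of 2D matrices. The values below are made up.

```python
import numpy as np

# 2 blocks, each with 2 rows and 3 columns -> shape (2, 2, 3)
cube = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])

print(cube.ndim)   # number of dimensions
print(cube.shape)  # size along each dimension

# Index order is block, row, column.
corner = cube[1, 0, 2]
```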
Image Representation with Numpy
● An image can be represented as a matrix of pixel values
● Each color is represented by pixel values for each of red, green and blue components (R, G, B)
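A minimal sketch of this idea, assuming the common layout of height x width x 3 channels with 8-bit values (0 to 255). The tiny 2x3 image is invented for illustration.

```python
import numpy as np

height, width = 2, 3

# Start with an all-black image: every R, G, B value is 0.
image = np.zeros((height, width, 3), dtype=np.uint8)

image[0, 0] = [255, 0, 0]  # top-left pixel becomes pure red
image[1, 2] = [0, 0, 255]  # bottom-right pixel becomes pure blue

# Slicing the last axis pulls out one color channel as a 2D matrix.
red_channel = image[:, :, 0]

print(image.shape)
print(red_channel)
```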
Numpy Indexing
● list[i]
○ Slicing: list[i:i+5]
● matrix_2d[i, j]
● matrix_3d[i, j, k]
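The indexing forms above can be tried on small arrays. The names vector, matrix_2d and matrix_3d are placeholders chosen here, not names from the course materials.

```python
import numpy as np

vector = np.arange(10)                        # values 0 through 9
matrix_2d = np.array([[1, 2, 3], [4, 5, 6]])
matrix_3d = np.arange(24).reshape(2, 3, 4)    # values 0 through 23

i = 2
single = vector[i]          # one element
window = vector[i:i + 5]    # slice of 5 elements starting at i
element = matrix_2d[1, 2]   # row 1, column 2
column = matrix_2d[:, 0]    # every row, column 0
deep = matrix_3d[1, 2, 3]   # block 1, row 2, column 3

print(single, window, element, column, deep)
```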
np.genfromtxt(filename, delimiter=",")
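A sketch of loading comma-separated data with np.genfromtxt. To keep it runnable without the course file, it reads from an in-memory string; the column names and values are hypothetical and the real CSV may look different.

```python
from io import StringIO
import numpy as np

# Hypothetical stand-in data; the real course CSV may have other columns.
csv_text = "id,price,nights\n1,120.5,2\n2,89.0,3\n3,,1\n"

# skip_header=1 drops the header row; empty fields are read as nan.
listings = np.genfromtxt(StringIO(csv_text), delimiter=",", skip_header=1)

print(listings.shape)
print(listings)

# With a file on disk, pass its path instead of the StringIO object:
# listings = np.genfromtxt("WK1_Airbnb_Amsterdam_listings_1.csv", delimiter=",")
```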
"WK1_Airbnb_Amsterdam_listings_1.csv"
Matrix Reshaping
● np.reshape(array, (tuple of ints)) or array.reshape((tuple of ints))
● The new shape must keep the same total number of elements
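A short sketch of the element-count rule, using a made-up 12-element array:

```python
import numpy as np

flat = np.arange(12)                 # 12 elements

grid = flat.reshape((3, 4))          # 3 * 4 == 12, so this works
cube = np.reshape(flat, (2, 3, 2))   # function form, 2 * 3 * 2 == 12
auto = flat.reshape((4, -1))         # -1 lets NumPy infer the missing size

# A shape with a different element count is rejected.
try:
    flat.reshape((5, 3))             # 5 * 3 == 15, not 12
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(grid.shape, cube.shape, auto.shape, reshape_failed)
```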
Merging Matrices
● np.concatenate()
● np.stack()
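The difference between the two functions shows up in the result shapes. A minimal sketch with two invented 2x2 matrices:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# concatenate joins along an axis that already exists.
rows_joined = np.concatenate([a, b], axis=0)  # b's rows go under a's rows
cols_joined = np.concatenate([a, b], axis=1)  # b's columns go beside a's

# stack adds a new axis, so two 2x2 matrices become one 2x2x2 array.
stacked = np.stack([a, b])

print(rows_joined.shape, cols_joined.shape, stacked.shape)
```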
Broadcasting
● With numpy, you can apply a transformation to multiple elements at once
● Vectorized operations & broadcasting in numpy run in compiled code, which makes them MUCH faster than equivalent python loops
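A small sketch of broadcasting, with invented price and fee values. A scalar or a row is stretched to match the larger array, so no explicit loop is needed.

```python
import numpy as np

prices = np.array([[100.0, 200.0, 300.0],
                   [400.0, 500.0, 600.0]])

# A scalar is broadcast to every element.
discounted = prices * 0.5

# A length-3 row is broadcast across both rows of the 2x3 matrix.
fees = np.array([1.0, 2.0, 3.0])
with_fees = prices + fees

# The same result written as nested plain-Python loops, for comparison.
looped = [[p + f for p, f in zip(row, fees.tolist())]
          for row in prices.tolist()]

print(discounted)
print(with_fees)
```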
Neat References