DS4A Resources 2

The document provides resources to help people learn data science and become data literate. It includes sections on why AI is important, how data science, data literacy, and AI differ, key topics in data science, and free resources for learning statistics, Python, and code-free data science. The resources are intended to help people gain foundational skills in analytics through self-study in order to advance their learning and careers.

Uploaded by

Martin Cordoba

EMPOWERMENT

DATA SCIENCE
FOR ALL RESOURCES
Provided here is a list of curated resources that can help you
learn the importance of data science and jumpstart your
development as a data-literate individual.

As part of our mission to make data science for all, we have
worked hard to curate the best materials across the internet
to help you on your journey. We have also ensured that all of
these resources are completely free. They provide a strong
foundation for you to learn analytics, and will also position
you well to continue your learning either in a future DS4A
Empowerment cohort or elsewhere.

If you would like to study before taking the DS4A Empowerment
admissions assessment, please see our 1-page package
of prep materials.
DS4A RESOURCES FOR ALL

CONTENT

HOW TO USE THESE RESOURCES

WHY IS AI IMPORTANT?

HOW ARE DATA SCIENCE, DATA LITERACY, AND AI DIFFERENT?

DATA SCIENCE: KEY TOPICS

STATISTICS RESOURCES

PYTHON RESOURCES
Starting Point
Where to Code
Advancing Your Python

CODE-FREE DATA SCIENCE RESOURCES

PROVIDE YOUR FEEDBACK

HOW TO USE THESE RESOURCES
Going through all the resources provided below would take hundreds of hours. We do not expect that to work for most
people. So how should you use these resources?

First, consider what you want to get out of these resources. Do you want foundational data literacy to stand out in your
current field? Do you want to be able to code with a focus on data? Do you want a career switch to data science?

Second, consider how much time you’re willing to put into this. If you want meaningful results, you need to make an honest
effort and put time in consistently over a sustained period. That said, you do not need to sink in
hundreds of hours to see noticeable development, as we have curated the resources that are most effective per unit of time.

The table below outlines multiple routes you can take through these resources based on your answers to the above
questions (of course, feel free to define your own path). Regardless of what route is best for you, please go through the
“Why is AI Important” and “How are Data Science, Data Literacy, and AI Different” sections to develop a better understanding
of why data science and data literacy are important.

ROUTES BY HOURS AND LEARNING OBJECTIVE

20 HOURS
Data literacy: Read chapter 1 of Data Science: A Gentle Introduction, find your favorite code-free data science resource and become an expert with that tool.
Coding + some data: Work through the learnpython.org tutorials, read chapters 1 & 2 of Data Science: A Gentle Introduction, complete all the Pandas intermediate level Python tutorials.
Data science: Work through the learnpython.org tutorials, read chapters 1, 2, & 7 of Data Science: A Gentle Introduction, complete all the Pandas intermediate level Python tutorials.

50 HOURS
Data literacy: Read all of Data Science: A Gentle Introduction, find your favorite code-free data science resource and become an expert with that tool.
Coding + some data: Work through all of Python for Everybody, read chapters 1 & 2 of Data Science: A Gentle Introduction, complete all the Pandas intermediate level Python tutorials.
Data science: Work through chapters 1 - 11, 15, & 16 of Python for Everybody, work through chapters 1, 2, & 9 of Openstax’s Intro Stats, complete all the Pandas intermediate level Python tutorials.

100+ HOURS
Data literacy: Work through the learnpython.org tutorials, read all of Data Science: A Gentle Introduction, complete all the Pandas intermediate level Python tutorials, learn 2 code-free data science tools (that have different use cases).
Coding + some data: Work through all of Python for Everybody, read through all of Data Science: A Gentle Introduction, complete all the Pandas intermediate level Python tutorials, try out several advanced tutorials from realpython.
Data science: Work through chapters 1 - 11, 15, & 16 of Python for Everybody and work through all of Think Stats, try out several data science/ML advanced tutorials from realpython.

If you’re coming in with prior stats or Python experience, the best route for you is probably not one of the above. If this is the
case for you, we encourage you to pick and choose from the resources listed here that best fill in your missing skills.

WHY IS AI IMPORTANT?
If you work in finance, wouldn’t you want to know which
loan applicants will likely default? Or if you work in
sales, which people are most likely to buy your product?

Many people would approach this by applying rules of
thumb or by using their intuition. These approaches are
somewhat effective, but they also give incorrect conclusions
fairly often. If you could instead predict the outcome
correctly nearly every time, you could save a lot of time
and money.

This is the crux of AI: predicting an outcome as accurately
as possible from the given information. Instead of
using rules of thumb, AI analyzes the data you have
on past loan defaults or successful sales to inform the
decision. As long as you have enough data on past
dealings, you can use AI to improve your future work.
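
To make the contrast with a rule of thumb concrete, here is a minimal sketch in Python. It is not a real AI model: it only picks a cutoff from past data instead of guessing one. The loan records, the debt-to-income feature, and the 0.50 rule of thumb are all made up for illustration.

```python
# Made-up past loans: (debt-to-income ratio, did the applicant default?)
past_loans = [
    (0.10, False), (0.15, False), (0.20, False), (0.25, False),
    (0.30, True), (0.35, False), (0.40, True), (0.45, True),
    (0.55, True), (0.60, True),
]


def count_errors(threshold, loans):
    """Count loans misclassified by 'predict default if ratio >= threshold'."""
    return sum((ratio >= threshold) != defaulted for ratio, defaulted in loans)


def best_threshold(loans):
    """Try each observed ratio as a cutoff and keep the one with the fewest errors."""
    candidates = sorted(ratio for ratio, _ in loans)
    return min(candidates, key=lambda t: count_errors(t, loans))


rule_of_thumb = 0.50  # hypothetical cutoff chosen by intuition
learned = best_threshold(past_loans)

print("rule of thumb errors:", count_errors(rule_of_thumb, past_loans))
print("learned cutoff:", learned, "errors:", count_errors(learned, past_loans))
```

On this made-up data, the intuitive cutoff misclassifies 3 of the 10 past loans, while the cutoff chosen from the data misclassifies 1. Real models use many more features and far more data, but the idea of letting past outcomes set the decision rule is the same.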

Regardless of what industry you work in, AI can
noticeably improve your work in it. For examples of
how AI is changing your industry or others, we suggest
you read the second chapter in the State of AI 2019.

HOW ARE DATA SCIENCE, DATA LITERACY,
AND AI DIFFERENT?
AI is just one component of data science. It is an advanced form of data modelling where the
“author” of the model does not know exactly how the model works (just as we do not know
exactly how our brains function), yet the model still gives useful results.

Typical data science models are explicit - that is, the author knows exactly what the program is
doing and why. These models are extremely valuable and require a similar amount of expertise
as AI to create. The difference is that AI can tackle some more complex problems (like artificial
vision) whereas data science models efficiently tackle more tractable problems (like predicting
future sales accounting for seasonality).
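
As an illustration of an explicit model, here is a minimal Python sketch of a sales forecast that accounts for seasonality. The quarterly figures are made up, and the method (repeat each quarter's value, scaled by year-over-year growth) is just one simple choice among many, not a recommended forecasting technique.

```python
# Made-up quarterly sales (Q1..Q4) for two consecutive years
last_year = [100.0, 120.0, 90.0, 150.0]
this_year = [110.0, 132.0, 99.0, 165.0]


def seasonal_forecast(previous, current):
    """Forecast next year's quarters: this year's value for the same
    quarter, scaled by the overall year-over-year growth."""
    growth = sum(current) / sum(previous)
    return [round(value * growth, 2) for value in current]


forecast = seasonal_forecast(last_year, this_year)
print(forecast)
```

Every step here can be read and explained: the author knows exactly why each forecast value comes out the way it does, which is what makes the model explicit.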

In contrast to AI and data science, data literacy takes a step back from the weeds of
manipulating data and focuses on communicating with data: being able to read and
interpret visualizations and summary statistics to draw accurate conclusions from data. Moreover,
data literacy allows us to effectively communicate those conclusions to anyone, regardless of
their background.
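
As a small example of reading summary statistics carefully, the Python sketch below compares the mean and the median of a made-up list of salaries that contains one very large value. It uses only Python's built-in statistics module.

```python
import statistics

# Made-up monthly salaries; one very large value skews the mean
salaries = [3000, 3200, 3100, 2900, 3300, 20000]

mean = statistics.mean(salaries)
median = statistics.median(salaries)

print(f"mean:   {mean:.0f}")
print(f"median: {median:.0f}")
```

The mean is pulled far above what most people in the list earn, while the median stays close to the typical value. Knowing which summary to trust in a case like this is part of being data literate.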

The below graphic provides an overview of the relationship between data literacy, data analysis,
data science, and AI.

[Graphic: relationship between data literacy, data analysis, data science, and AI. Labels shown in the graphic: communicating results, dashboards, data visualization, summary statistics, EDA, hypothesis testing, statistical inference, cohorts analysis, modeling.]

DATA SCIENCE: KEY TOPICS
Hopefully at this point you see the value of AI and the data
science that underpins it. Since you cannot create useful AI
without a firm foundation in data science, what are the key
topics and skills you should know to build this foundation?

IBM created an excellent guide that answers just that. For a
clean overview, go through pages 3 and 4. If you want a
detailed breakdown of the topics, go through the tables on
pages 6-11.

Fundamentally, you need to understand some statistics and
some programming to create useful insights from data. For
this reason, we suggest that you explore the Python and
statistics resources provided below.

If coding sounds too intimidating, we also suggest several
resources where you can apply data science techniques
without having to code! (see the Code-Free Data Science
Resources section).

STATISTICS RESOURCES
Below is a table of excellent resources to learn statistics with. Each resource comes with a time
estimate to read through the resource thoroughly + time for dedicated practice to solidify your
learning. The read times are based on slow, high absorption reading (your mileage may vary). Do
not just skim through these books if you are looking to learn.

Data Science: A Gentle Introduction
Key considerations: Very easy to read. Presented from a data science (not statistics) perspective. You will need to find your own practice problems.
Why use this resource? Great if you want to learn statistics through data science applications. Gives you a better sense of purpose (to be a data scientist) as you work through the book.
Estimated time commitment: 30h + 20h of practice

Think Stats
Key considerations: Includes code on Github as examples and homeworks. Well written.
Why use this resource? If you already know Python, it will teach you both statistics and Python’s stats libraries.
Estimated time commitment: 30h + 20h of practice

Openstax’s Introductory Statistics
Key considerations: Many detailed exercises throughout the chapters. Provided “try it” exercise solutions.
Why use this resource? You will likely find the text easier to follow than OpenIntro. It will give you a very solid statistics foundation.
Estimated time commitment: 90h + 0h (practice included throughout)

OpenIntro Statistics
Key considerations: Complementary videos for most sections.
Why use this resource? Ideal if you want both lectures and a textbook. It will give you a solid statistics foundation.
Estimated time commitment: 45h + 15h of extra practice

Statistics for beginners video-course
Key considerations: Full video table of contents for easy navigation. Slow and detailed explanations.
Why use this resource? The video is concise and presents a manageable amount of content to learn.
Estimated time commitment: 7.5h video (watch at x1.25 speed) + 25h of practice

PYTHON RESOURCES
In data science, two programming languages are dominant: Python and R. R has a stronger
presence in academia; however, we recommend Python, as it is the more popular language
generally and has a very active community.

STARTING POINT
If you have no familiarity with Python, we suggest you start with the Hello World! tutorial from
learnpython.org. This site provides guides that will walk you through how to code in Python right
from the beginning up to some basic data science applications. You can expect to learn Python’s
syntax from these tutorials, but do not expect to develop a deep understanding of programming.
We suggest you complete all of the tutorials up to and including Pandas Basics. Working through
these tutorials will likely take you 7 to 10 hours to complete.
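
To give a sense of what a first program looks like, here is a minimal sketch of the kind of code such beginner tutorials start with. It is our own illustration, not copied from learnpython.org, and the names used are arbitrary.

```python
# The classic first program: print a greeting
print("Hello, World!")


# A function and a loop are typical next building blocks
def greet(name):
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


for name in ["Ada", "Grace"]:
    print(greet(name))
```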
If you want a deeper understanding of programming, or if you prefer a book to walk you through
Python, we suggest Python for Everybody (which includes recorded lectures and graded assignments
to assist your learning). Be sure to complete the exercises as you go through the book;
otherwise you will not retain what you learn. In total, going through this book and practicing will
likely take you around 40 to 50 hours.

WHERE TO CODE
When you start playing around with your own basic Python programs, we suggest you use
CodeSkulptor. This is an in-browser version of Python that will let you test and run your code
without needing to install anything.
Once you move to more advanced programs (such as reading from and writing to files), you will
have to install Python on your own computer. We suggest you do so by installing Anaconda.
Anaconda is a pain-free way to install Python quickly. It also installs an application called Spyder
(an IDE), which you can write and run your code in. For your convenience, here are the Anaconda
download links for Windows, macOS, and Linux.
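
As an example of the kind of program that needs a local Python install, here is a minimal sketch that writes a small CSV file and reads it back using Python's built-in csv module. The file name, column names, and rows are made up for illustration.

```python
import csv
import os
import tempfile

# Made-up rows to save; values are kept as text, which is how CSV stores them
rows = [
    {"name": "Ana", "score": "91"},
    {"name": "Luis", "score": "84"},
]

# Write to a temporary folder so the example does not touch your own files
path = os.path.join(tempfile.mkdtemp(), "scores.csv")

with open(path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back; each row comes back as a dictionary keyed by column name
with open(path, newline="") as f:
    loaded = list(csv.DictReader(f))

print(loaded)
```

You can paste this into Spyder and run it as-is. Once you are comfortable with it, the Pandas tutorials listed in the next section show a more convenient way to work with the same kind of file.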

ADVANCING YOUR PYTHON


Once you have the fundamentals of Python figured out, if you would like to further develop your
Python skills, we suggest the intermediate level tutorials from realpython.com (not all of these
are free but our suggested ones are). Below we recommend some of the more useful ones for
working with data. Pick and choose the ones that interest you; do not feel as though you need to
complete them all!

Jupyter Notebook Introduction
The Pandas DataFrame
Plotting With Pandas
Tips and Tricks With Pandas
Correlation With Python (Pandas)
Reading and Writing CSV Files
Manipulating Excel With Python
Data Management With Python (SQL)
Web Scraping in Python
Working With JSON Data in Python
Advanced Python Import Techniques
Object-Oriented Programming
Natural Language Processing in Python
Making Web Requests With Python

CODE-FREE DATA SCIENCE RESOURCES
Although programming experience lets you make more advanced models, you can
use services that handle the coding for you. This can both save time for experienced programmers
and bring the power of data science to those lacking a coding background.

Here are some tools you can use to analyze data with the power of data science:

Google Data Studio -- here are some good resources to learn the tool:
The ultimate guide to google data studio
These short video tutorials on how to use the tool
This article lists 8 good code-free data science options with varying use cases

The key to getting value from these code-free options is knowing the right questions to ask (so
that you generate impactful insights). This article explains the importance of this step and how to
best approach it.

PROVIDE YOUR FEEDBACK


We are trying our hardest to provide you with the best free resources to jumpstart your data
science journey. We would love to hear from you about your experience using these resources
(which you liked, which you didn’t) and if you think we missed some resources that would
improve this package.

To provide your feedback and suggestions, or if you would like to talk with us, please contact:
[email protected]

THANK YOU
