Schedule Summer Analytics 2019

The 4-week Summer Analytics 2019 course covers topics like Python, exploratory data analysis, probability, and inferential statistics. Each week is split into two phases: Phase I focuses on concepts through video lectures and courses, while Phase II involves a related assignment to apply the concepts. Week 1 teaches Python basics, EDA, and has students complete two online courses. Week 2 covers pandas, probability, inferential statistics, and requires completing two more courses. Assignments are submitted on GitHub and involve Jupyter notebooks analyzing datasets.

Uploaded by

Ryan Nelson

Summer Analytics 2019 Schedule:

The entire course is spread over 4 weeks. Each week is divided into two
phases: Phase I (4-5 days) and Phase II (2-3 days).

Phase I : You will learn the core concepts of Data Analytics; most of the
theory is covered here. Tuesday to Friday of each week is allotted to this
phase.

Phase II : An assignment/quiz will be uploaded every Saturday morning to help
you apply the concepts learnt during Phase I. You will have 3 days to
implement and finish your work.

Week 1 :

Phase I

We'll be starting with the most important tools for Analytics:

Python : This week consists of learning the basic coding skills in Python
required for data science.
EDA : In statistics, exploratory data analysis (EDA) is an approach to
analyzing data sets to summarize their main characteristics, often with
visual methods (graphs).
Note - This week you have two extra days, 25th and 26th May, so Phase I
will last 7 days.
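To make the EDA definition above concrete, here is a minimal sketch of a numeric summary (mean, median, range, IQR and outliers). The data and the function name summarize are made up for illustration and are not taken from the course material; the 1.5 x IQR rule used for outliers is one common convention, and the "inclusive" quartile method is only one of several.

```python
import statistics


def summarize(values):
    """Return basic EDA summary statistics for a list of numbers."""
    data = sorted(values)
    # Quartiles via the "inclusive" method (an assumption; other conventions exist).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Common rule of thumb: points beyond 1.5 * IQR from the quartiles are outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(data),
        "median": median,
        "range": data[-1] - data[0],
        "iqr": iqr,
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Hypothetical sample data, not part of the course.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]
summary = summarize(scores)
print(summary)
```

The single large value pulls the mean well above the median, which is the kind of pattern EDA is meant to surface before any modelling.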

Breakdown:

For this week you need to audit two courses. The method for doing this is
described at the end.

PS: Do make notes while learning. It is always good practice to try
implementing the code syntax yourself in a Jupyter notebook for better
understanding.

Day 1   Python Basics (Coursera)
        *complete week 1

Day 2   Python Data Structures (Coursera)
        *complete week 2

Day 3   i) EDA - Examining Distributions (Stanford)
           *up to Measures of Spread: Range, IQR & Outliers
        ii) Python Programming Fundamentals (Coursera)
           *up to loops

Day 4   i) EDA - Examining Distributions (Stanford)
           *complete
        ii) Python Programming Fundamentals (Coursera)
           *complete

Day 5   i) EDA - Examining Relationships (Stanford)
           *up to Scatter plots
        ii) Working with data in Python (Coursera)
           *up to pandas

Day 6   i) Working with data in Python (Coursera)
           *complete
        ii) EDA - Examining Relationships (Stanford)
           *complete

Day 7   i) Python Matplotlib (for making plots in Python). Follow this link:
           https://www.youtube.com/watch?v=yZTBMMdPOww

Python For Data Science - visit the link below and follow the steps:

https://www.coursera.org/learn/python-for-applied-data-science?specialization=ibm-data-science-professional-certificate

Step 1 - Click on Join and Enroll now and create your account

Step 2 - Click on the Next button

Step 3 - Click on Audit the course

You are now good to go with this Python course.

Statistics for data analytics - visit the link below:

https://lagunita.stanford.edu/courses/OLI/StatReasoning/Open/about

Enroll and register for this course to get started.

Phase II
This week's assignment consists of two parts, Assignment-1(a) and
Assignment-1(b), based on the material covered in the Week 1 lectures.

The link to the assignment files is:

https://drive.google.com/drive/folders/1mh5YGIHEJZBw_Q1ARQkAhiZ9Wq3v34oC?usp=sharing

Please follow the instructions given in the above folder to complete the
assignment.

Submit your assignment by following the instructions below:

Instructions for uploading the assignment:

1) Create a Github account
2) Create a new repo and name it Assignment_Week1. Write a short
description of the project.
3) Upload all the assignment files that were shared with you, replacing
the ipynb notebook with your updated version
4) Yayy! You're done. Now share the link to the Github repo in the
Google Form below

Link for submission :

https://docs.google.com/forms/d/e/1FAIpQLSdHqUNygPBfO9yizK-o7aB5dvsOc8L6bjJ33gl5_-76yaA58A/viewform?usp=sf_link

If you are new to Github you can take help from the link below:
https://www.youtube.com/watch?v=73I5dRucCds

(Note: You can make multiple submissions; the one with the best marks will
be selected.)

P.S. If you have any doubts, you can post them on the Facebook page or feel
free to contact us.
Sample Solution : Click on the link below to get the solutions to the first
assignment:
https://drive.google.com/drive/folders/1Sh_VPHNdf8CJvr-0lNOwNtdoCJCCym0_?usp=sharing

Week 2:

Phase I

Data Analysis in Python Using Pandas : This week we will build on our
skills and learn more techniques for working with a dataset.

Introduction to Probability : Probability theory is the mathematical
foundation of statistical inference, which is indispensable for analyzing
data affected by chance and is thus essential for data scientists. It may
seem a bit clichéd at first, but probability is one of the most important
foundations of Machine Learning algorithms.

Inferential Statistics : This is the most important concept. You will
apply it throughout your career, and it will help you answer many
interview questions.
The main purpose of inferential statistics is to:
A. Summarize data in a useful and informative manner.
B. Estimate a population characteristic based on a sample.
C. Determine if the data adequately represents the population.
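As a small illustration of how probability underpins inference, the sketch below computes the exact probability that two fair dice sum to 7 by enumerating all outcomes, then estimates the same quantity from simulated rolls. The dice example, the function name simulate, the number of trials and the seed are all assumptions chosen for illustration; they do not come from the course material.

```python
import random
from fractions import Fraction
from itertools import product

# Exact probability: enumerate all 36 equally likely outcomes of two dice.
outcomes = list(product(range(1, 7), repeat=2))
exact = Fraction(sum(1 for a, b in outcomes if a + b == 7), len(outcomes))


def simulate(trials, seed=0):
    """Estimate P(sum of two dice == 7) from `trials` simulated rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(
        1 for _ in range(trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / trials


estimate = simulate(20_000)
print(f"exact = {exact} ({float(exact):.4f}), simulated = {estimate:.4f}")
```

The simulated frequency is an estimate of a population quantity computed from a sample, which is the idea inferential statistics formalizes.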

**This week too you will have to audit two courses. The links and steps are given
at the end.**
Some Instructions:

a) For the Python part this week, it is highly advised to reproduce everything
yourself after watching each video for a better grasp.
b) For the Coursera courses, just watch the video lectures (at 1.5x speed). You are
not required to do their assignments and quizzes.
c) Making notes is highly advised. It will help you grasp things better and save you
time when you need to revisit concepts.

Breakdown:

Day 1   (i) Data Analysis in Python using Pandas: visit the link-
            https://www.youtube.com/watch?v=yzIMircGU5I&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y&index=1
            (up to video 8; watch at 2x speed)
        (ii) Producing Data: Sampling (Stanford)
            *complete
        (iii) Producing Data: Designing Studies (Stanford)
            *complete

Day 2   (i) Data Analysis in Python using Pandas: visit the link-
            https://www.youtube.com/watch?v=YPItfQ87qjM&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y&index=9
            (videos 9-16; watch at 2x speed)
        (ii) Introduction to Probability: Week 3 (Coursera)
            *complete
            (you don't need to do week 1 & week 2)

Day 3   (i) Data Analysis in Python using Pandas: visit the link-
            https://www.youtube.com/watch?v=OYZNk7Z9s6I&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y&index=17
            (videos 17-21; watch at 2x speed)
        (ii) Introduction to Probability: Week 4 (Coursera)
            *complete

Day 4   (i) Data Analysis in Python using Pandas: visit the link-
            https://www.youtube.com/watch?v=0s_1IsROgDc&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y&index=24
            (videos 24-30; watch at 2x speed)
        (ii) Inferential Statistics (Coursera) - Week 1
            *up to CLT & Sampling

Day 5   (i) Inferential Statistics (Coursera) - Week 1
            *up to Confidence Intervals
        (ii) Inferential Statistics (Coursera) - Week 2
            *complete, i.e. Hypothesis Testing & Significance
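The confidence interval and hypothesis testing topics listed for Day 5 can be sketched in a few lines. The example below builds an approximate 95% confidence interval for a mean and a two-sided p-value for a null hypothesis about that mean. The sample values, the null value mu0 = 10 and the function names are hypothetical. The normal (z) approximation is used only to keep the sketch short; for a sample this small a t distribution would normally be more appropriate.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    centre = mean(sample)
    return centre - z * standard_error, centre + z * standard_error


def z_test_p_value(sample, mu0):
    """Two-sided p-value for H0: population mean == mu0 (normal approximation)."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    z = (mean(sample) - mu0) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical measurements, not course data.
sample = [10, 12, 9, 11, 13, 10, 12, 11, 9, 13]
low, high = confidence_interval(sample)
p_value = z_test_p_value(sample, mu0=10)
print(f"95% CI: ({low:.2f}, {high:.2f}), p-value for mu0=10: {p_value:.3f}")
```

A small p-value means the sample would be unlikely if the null hypothesis were true; on its own it does not say how large or important the difference is.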

For Introduction to Probability & Data - visit the link below and
follow the same steps as for Python for Data Science.
https://www.coursera.org/learn/probability-intro

For Inferential Statistics - visit the link below and follow the same
steps as above.
https://www.coursera.org/learn/inferential-statistics-intro

***This is all for Phase I of Week 2. In the upcoming week we'll continue
with Inferential Statistics and start Machine Learning. So stay motivated, guys:
push yourself and complete this week, because something bigger is waiting for you
in the next one :) :) ***
Phase II

The link to Assignment 2 is given below:

https://drive.google.com/drive/folders/1mh5YGIHEJZBw_Q1ARQkAhiZ9Wq3v34oC?usp=sharing

This assignment contains 3 parts. You have to complete the first 2 parts and
follow the same submission procedure as for Assignment 1.

The link for submission is :

https://docs.google.com/forms/d/e/1FAIpQLSdY7yRlKrnkbDqtt516ozHyIjmcej0Hrl89xfV0DTa5z3Wzvw/viewform?usp=sf_link

Part 3 of the assignment is for your practice. It's a good application-based
problem, so do try it.

Duration Given For Assignment : 2 Days

Deadline : 11th June 2019 23:59

Week 3:

Phase I

As mentioned earlier, this week we will finish Inferential Statistics
and start Machine Learning.
Machine Learning : The field of study that gives computers the ability to
learn without being explicitly programmed.
Or
If the performance of a program at a task T, as measured by a performance
measure P, improves with experience E, the program is said to learn.
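The T / P / E definition above can be made concrete with a toy sketch. Here the task T is deciding whether a number is at or above a hidden threshold, the experience E is a growing list of labelled examples, and the performance measure P is accuracy on a fixed set of test points. The learner, the hidden threshold of 30 and the examples are all invented for illustration.

```python
def learn_threshold(examples):
    """Learner: midpoint between the largest negative and smallest positive example."""
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2


def accuracy(threshold, test_points, true_threshold):
    """Performance measure P: fraction of test points classified correctly."""
    correct = sum(
        (x >= threshold) == (x >= true_threshold) for x in test_points
    )
    return correct / len(test_points)


TRUE_THRESHOLD = 30            # hidden rule the program has to learn (task T)
test_points = range(0, 101)    # fixed evaluation set

# Experience E: labelled examples (value, is_at_or_above_threshold).
experience = [
    (10, False), (90, True),
    (20, False), (60, True),
    (28, False), (34, True),
]

accuracies = []
for n_examples in (2, 4, 6):
    threshold = learn_threshold(experience[:n_examples])
    accuracies.append(accuracy(threshold, test_points, TRUE_THRESHOLD))

print(accuracies)  # accuracy rises as the learner sees more examples
```

In this constructed example accuracy rises with every batch of examples; with real, noisy data the improvement is usually less regular.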

**For this week you will have to audit one course from edX, i.e. Machine Learning
with Python: A Practical Introduction by IBM. The link and the enrollment steps are
given at the end.**
This is the most interesting and important week of the course, and you will enjoy
every bit of what you learn.

Instructions:
(i) In the edX course there are Lab sessions after every topic covered. It is
highly recommended to go through the code in them, as it will help you understand
the implementation of each topic.
(ii) Making notes is highly advised. It will help you grasp things better and save
you time when you need to revisit concepts.

Breakdown:

Day 1   (i) Intro of Machine Learning, visit link-
            https://www.youtube.com/watch?v=ukzFI9rgwfU
        (ii) Inferential Statistics (Coursera) - Week 3
            *complete

Day 2   (i) Machine Learning with Python (edX) - Module 1
            *complete
        (ii) Inferential Statistics (Coursera) - Week 4
            *Inference for Proportions (i.e. till Hypothesis
            test for comparing two proportions)

Day 3   (i) Machine Learning with Python (edX) - Module 2
            *up to Simple Linear Regression
        (ii) Inferential Statistics (Coursera) - Week 4
            *complete

Day 4   (i) Machine Learning with Python (edX) - Module 2
            *complete

Day 5   (i) Machine Learning with Python (edX) - Module 3
            *up to Decision Trees

For the Machine Learning with Python course by edX, visit the link and
follow the steps given below:
https://www.edx.org/course/machine-learning-with-python

Step 1 - Click on Enroll now

Step 2 - Click on Audit This Course

You are now good to go with this course.

(**Congrats guys, you have completed about 75% of the course, with just one
more week to go after this. You have come this far, so push yourself to
complete it and build a skill that is in high demand in today's world :) **)

Phase II

The link for Assignment 3 is :

https://drive.google.com/open?id=1mh5YGIHEJZBw_Q1ARQkAhiZ9Wq3v34oC

This assignment has two parts: one implementing Simple Linear Regression
and the other implementing KNN from scratch.
The assignment for the remaining topics taught this week will be uploaded
at the end of the course, as doing it now would be too hectic for you.
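For orientation, here is a minimal sketch of the two techniques named above, written from scratch with only the standard library. It is not the assignment solution: the function names, the toy data and the choice of k are assumptions made for illustration, and the actual assignment notebooks may ask for a different interface.

```python
import math
from collections import Counter


def fit_simple_linear_regression(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to `query` (Euclidean)."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy regression data lying exactly on y = 2x + 1 (hypothetical).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_simple_linear_regression(xs, ys)
print(f"slope = {slope}, intercept = {intercept}")

# Toy classification data: two well-separated clusters (hypothetical).
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(points, labels, query=(2, 2))
print(f"predicted label for (2, 2): {prediction}")
```

math.dist requires Python 3.8 or newer. On real data, features for KNN are usually scaled first so that no single feature dominates the distance.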
The link for submission is :

https://forms.gle/nGApe3RJH1ATgMKh8

Duration Given For Assignment : 3 Days

Deadline : 20th June 2019 23:59

Week 4:

Phase I

We will be completing the Machine Learning course on edX this week.
We will also cover some of the important algorithms in depth. For this
there is a reading portion every day, with links to blogs that discuss
each algorithm and its implementation in detail.

This is the last week of our course. You have come this far, so make sure you
complete this week.

Instructions:
(i) For the reading portion, make sure you take notes and try to implement the
code wherever possible. It is highly advised to google any portion of the code that
you don't understand.
(ii) Watch the Cross Validation videos at 1.5x speed.

Breakdown:
Day 1   (i) Machine Learning with Python (edX) - Module 3 (Classification)
            *complete
        (ii) Linear Regression (reading) - visit link:
            https://towardsdatascience.com/linear-regression-using-python-b136c91bf0a2

Day 2   (i) Machine Learning with Python (edX) - Module 4 (Clustering)
            *up to K-means Clustering
        (ii) KNN (reading) - visit link:
            https://medium.com/machinelearningalgorithms/k-nearest-neighbors-c9823dca611b
        (iii) Decision Tree (reading) - visit link:
            https://medium.com/datadriveninvestor/decision-tree-algorithm-with-hands-on-example-e6c2afb40d38

Day 3   (i) Machine Learning with Python (edX) - Module 4 (Clustering)
            *up to Hierarchical Clustering
        (ii) Logistic Regression (reading) - visit link:
            https://towardsdatascience.com/building-a-logistic-regression-in-python-301d27367c24

Day 4   (i) Machine Learning with Python (edX) - Module 4 (Clustering)
            *complete
        (ii) SVM (reading) - visit link:
            https://medium.com/datadriveninvestor/support-vector-machines-ae0ff2375479

Day 5   Cross Validation:
        1. Four Types Of Cross Validation
        2. Improve Your Model Performance using Cross Validation
        3. Selecting the best model in scikit-learn using cross-validation
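The idea behind the Cross Validation items listed for Day 5 can be sketched without any library: split the data into k folds, train on k-1 of them, score on the held-out fold, and average the scores. The helper names, the toy data and the two candidate models below are assumptions made for illustration. In practice scikit-learn provides this through KFold and cross_val_score, usually with shuffling, which this sketch leaves out to stay deterministic.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def cross_val_mse(xs, ys, k, fit, predict):
    """Average held-out mean squared error over k folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)


# Candidate 1: always predict the mean of the training targets.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


# Candidate 2: least-squares straight line.
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
mse_mean = cross_val_mse(xs, ys, 5, fit_mean, predict_mean)
mse_line = cross_val_mse(xs, ys, 5, fit_line, predict_line)
print(f"mean model MSE = {mse_mean:.2f}, line model MSE = {mse_line:.2e}")
```

Comparing the two cross-validated errors is the model selection step: the candidate with the lower held-out error is preferred, here the straight line.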

(**This marks the end of the theoretical phase of our course. I hope that
you guys had a great and productive time learning with us :) :). Following
this you will have your final assignment, which will earn you a certificate
for successful completion of the course.**)

Phase II

The link for the final assignment is given below:

https://drive.google.com/open?id=1mh5YGIHEJZBw_Q1ARQkAhiZ9Wq3v34oC
This is the final assignment of this course and it has 1 problem. The
problem is described in detail in the assignment folder.
Link for submission-
https://forms.gle/Yh6QUPRuJtWE9ZCQ8

Deadline For Submission - 12th July 2019

​ Update Learning Progress Here
