Introductory Data Science and ML

This document provides an introduction to data science and machine learning, outlining key concepts such as common data science tasks, popular machine learning algorithms for supervised and unsupervised learning, and Python libraries for data analysis, visualization, and machine learning like Pandas, Matplotlib, and Scikit-Learn. It also discusses applications of data science and machine learning as well as resources for learning more about these topics.

INTRODUCTORY DATA SCIENCE AND MACHINE LEARNING


Why Should You Know About Data Science?
Applications of Data Science
• Gaming, Recommendation Systems, Fraud Detection
• Self-Driving Cars
• Virtual Assistants
• YouTube Algorithms
• Cancer Research
Roadmap

■ Discussion about Data Science and Machine Learning
■ Learn a few Machine Learning algorithms and Python Libraries
■ Let's do something great together
What is Data Science?

■ Combines the fields of Statistics and Computer Science
■ Common subdomains and applications of Data Science are ML/AI, Data Mining, Data Warehousing, Big Data, etc.
General tasks in Data Science

■ Data Collection
■ Preprocessing
■ Exploratory Analysis and Visualization
■ Stats/Machine Learning Models
■ Prediction, Correction and Refining
Data Science Tools
Introduction to Machine Learning
Machine Learning Tasks
Supervised Learning

Classification Algorithms
■ K Nearest Neighbors
■ Support Vector Machines
■ Logistic Regression
■ Decision Trees

Regression Algorithms
■ Linear Regression
■ Polynomial Regression
■ Ridge Regression

Many more…
Unsupervised Learning

■ Clustering:
  ■ K-Means Clustering (Spherical Clusters)
  ■ Hierarchical Clustering
  ■ DBSCAN (Density Based Clustering)
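To make the clustering idea concrete, here is a minimal sketch of K-Means using scikit-learn. The two blobs of points are made-up data generated for illustration, and the variable names are assumptions, not part of the original slides.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: two well-separated, roughly spherical blobs of 20 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2))
points = np.vstack([blob_a, blob_b])

# No labels are given: K-Means groups the points on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)   # cluster index for every point
centers = kmeans.cluster_centers_     # one centre per cluster

print(labels)
print(centers)
```

K-Means needs the number of clusters up front and works best on compact, round clusters like these; DBSCAN is usually the better fit when clusters have irregular shapes.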
K Nearest Neighbors (KNN)
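KNN classifies a new point by looking at the k closest training points and taking a majority vote of their labels. The following is a minimal from-scratch sketch using NumPy; the function name `knn_predict` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter

import numpy as np


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbours."""
    # Euclidean distance from the query to every training point.
    distances = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Count the labels of those neighbours and return the most common one.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three points of class "a" and three of class "b".
train_x = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
train_y = np.array(["a", "a", "a", "b", "b", "b"])

prediction = knn_predict(train_x, train_y, np.array([1.1, 1.0]))
print(prediction)
```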
Linear Regression
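Linear regression fits a straight line y = slope * x + intercept that minimizes the squared error on the training data. Here is a minimal sketch using NumPy's least-squares solver; the data is made up and noise-free so the fitted line is easy to check by eye.

```python
import numpy as np

# Made-up, noise-free data that follows y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix: one column for x, one column of ones for the intercept.
design = np.column_stack([x, np.ones_like(x)])

# Least-squares solution of design @ [slope, intercept] = y.
(slope, intercept), *_ = np.linalg.lstsq(design, y, rcond=None)

# Use the fitted line to predict y for an unseen x.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

With real, noisy data the line will not pass through every point; the same call then returns the best-fitting line in the least-squares sense.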
Decision Trees
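A decision tree learns a sequence of if/else questions about the features. Below is a small sketch using scikit-learn's `DecisionTreeClassifier`; the dataset and the feature names `hours_studied` and `hours_slept` are hypothetical and exist only to show what the learned rules look like.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: [hours_studied, hours_slept] -> passed (1) or failed (0).
features = [[1, 4], [2, 8], [3, 5], [6, 7], [7, 4], [8, 8]]
passed = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(features, passed)

# Print the learned if/else rules as plain text.
rules = export_text(tree, feature_names=["hours_studied", "hours_slept"])
print(rules)

# Classify a new, unseen student.
prediction = tree.predict([[5, 6]])[0]
print(prediction)
```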
What is Deep Learning?

■ Deep Learning makes use of Deep Neural Networks
■ A Neural Network generally contains 3 types of layers: Input Layer, Hidden Layer, Output Layer.
■ A Deep Neural Network contains a large number of hidden layers
■ Nodes are connected by Weights, and each node has a Bias. A node computes a weighted sum of its inputs plus its bias and passes it through an activation function; in the simplest model, the node is activated if that value is greater than the threshold value
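The forward pass through such a network can be sketched in a few lines of NumPy. This is only an illustration: the weights and biases below are arbitrary made-up numbers (a real network learns them during training), the hidden layer uses a threshold activation as described above, and the output layer uses a sigmoid, which is an assumption made here to show a smooth alternative.

```python
import numpy as np


def step(z, threshold=0.0):
    """Threshold activation: 1.0 where z exceeds the threshold, else 0.0."""
    return (z > threshold).astype(float)


def sigmoid(z):
    """Smooth activation that squashes any value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Input layer: two input values.
inputs = np.array([1.0, 0.5])

# Hidden layer: 2 inputs -> 2 hidden nodes (made-up weights and biases).
hidden_weights = np.array([[0.4, -0.6],
                           [0.3, 0.8]])
hidden_bias = np.array([0.1, -0.2])

# Output layer: 2 hidden nodes -> 1 output node.
output_weights = np.array([[1.0],
                           [-1.0]])
output_bias = np.array([0.05])

# Weighted sum plus bias, then activation, layer by layer.
hidden_z = inputs @ hidden_weights + hidden_bias
hidden_fired = step(hidden_z)
output = sigmoid(hidden_fired @ output_weights + output_bias)

print(hidden_z, hidden_fired, output)
```

A deep network repeats the same "weighted sum, add bias, apply activation" step across many hidden layers.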
PYTHON DATA SCIENCE LIBRARIES
Libraries

■ NumPy – Fast array operations
■ Pandas – Data processing and spreadsheet-like tables
■ Matplotlib and Seaborn – Data Visualization
■ Sklearn (Scikit-Learn) – Machine Learning Algorithms
Iris Dataset

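■ The "Hello World" of classification problems
■ The task is to classify the flower species (setosa, versicolor or virginica) based on four measurements: sepal length, sepal width, petal length and petal width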
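The Iris dataset ships with scikit-learn, so it can be loaded without downloading anything. A minimal sketch for a first look at it (the variable names are illustrative):

```python
from sklearn.datasets import load_iris

iris = load_iris()
features = iris.data     # one row per flower, one column per measurement
species = iris.target    # species encoded as 0, 1 or 2

print(features.shape)        # number of flowers and measurements
print(iris.feature_names)    # what each column measures
print(iris.target_names)     # species name for each encoded label
```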
NumPy

■ Array computation library
■ A = np.array(list)
■ A.mean() A.sum() A.max() A.min()
■ A.size A.shape A.ndim
■ Operations on NumPy arrays occur element-wise, unlike operations on lists
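A short sketch of these calls on a small made-up array, including the difference between multiplying an array and multiplying a list:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# Summary statistics over all elements.
stats = (a.mean(), a.sum(), a.max(), a.min())   # 3.5, 21, 6, 1

# Size (number of elements), shape (rows, columns) and number of dimensions.
layout = (a.size, a.shape, a.ndim)              # 6, (2, 3), 2

# Arrays operate element by element; lists do not.
doubled_array = a[0] * 2        # array([2, 4, 6])
doubled_list = [1, 2, 3] * 2    # [1, 2, 3, 1, 2, 3] (the list is repeated)
summed_rows = a[0] + a[1]       # array([5, 7, 9])

print(stats, layout)
print(doubled_array, doubled_list, summed_rows)
```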
Pandas

■ Data Processing Library
■ df = pd.DataFrame(dictionary/np array)
■ df.shape, df.ndim, df.size, df.describe()
■ pd.read_csv()
■ df.info()
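A minimal sketch of these calls. The table contents are made up, and the CSV is read from an in-memory string so the example runs without a file on disk; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Build a DataFrame from a dictionary: keys become column names.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "height_cm": [150.0, 160.0, 170.0],
    "age": [20, 30, 40],
})

shape, ndim, size = df.shape, df.ndim, df.size   # (3, 3), 2, 9
summary = df.describe()    # count, mean, std, min, quartiles, max per numeric column
df.info()                  # prints column names, types and non-null counts

# Read CSV data (here from a string standing in for a file).
csv_text = "x,y\n1,2\n3,4\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

print(summary)
print(from_csv)
```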
matplotlib

■ Data Visualization Library
■ pyplot.plot(x, y, 'colorcode shape')
■ pyplot.xlabel()
■ pyplot.ylabel()
■ pyplot.title()
■ pyplot.legend()
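Put together, these calls produce a labelled line plot. The sketch below uses made-up data and the format string "ro--" (red, circle markers, dashed line). It selects the non-interactive "Agg" backend and saves to an in-memory buffer so that it runs without a display; in a notebook you would typically drop those two parts and call `pyplot.show()` instead.

```python
import io

import matplotlib
matplotlib.use("Agg")   # non-interactive backend, so no window is needed
from matplotlib import pyplot

x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

# 'ro--' = red colour, circle markers, dashed line.
pyplot.plot(x, y, "ro--", label="y = x squared")
pyplot.xlabel("x")
pyplot.ylabel("y")
pyplot.title("A simple line plot")
pyplot.legend()

# Save the figure as PNG bytes instead of opening a window.
buffer = io.BytesIO()
pyplot.savefig(buffer, format="png")
```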
Seaborn

■ sns.lmplot(x = col1, y = col2, data = dataset, hue = target, fit_reg = True)
■ sns.countplot(x = col, data = dataset)
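A sketch of both calls on the Iris data, assuming seaborn, pandas and scikit-learn are installed. The shortened column names and the `species` column are introduced here for readability and are not part of the original dataset.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend, so no window is needed
import seaborn as sns
from matplotlib import pyplot
from sklearn.datasets import load_iris

# Load Iris as a pandas DataFrame and give two columns shorter names.
iris = load_iris(as_frame=True)
dataset = iris.frame.rename(columns={
    "sepal length (cm)": "sepal_length",
    "petal length (cm)": "petal_length",
})
# Replace the numeric target (0, 1, 2) with readable species names.
dataset["species"] = dataset["target"].map(dict(enumerate(iris.target_names)))

# Scatter plot with one regression line per species.
grid = sns.lmplot(x="sepal_length", y="petal_length", data=dataset,
                  hue="species", fit_reg=True)

# Bar chart of how many rows each species has, drawn on a fresh figure.
pyplot.figure()
axes = sns.countplot(x="species", data=dataset)
```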
sklearn

■ model.fit(x, y)
■ model.predict(x)
■ accuracy_score(test_labels, prediction) * 100
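These three calls form the core of most scikit-learn workflows: fit on training data, predict on unseen data, then compare the predictions with the true labels. A minimal sketch on the Iris dataset follows; the choice of K Nearest Neighbors, the 75/25 split and the random seed are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Features (x) and species labels (y).
x, y = load_iris(return_X_y=True)

# Hold back 25% of the rows to test the model on data it has not seen.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0, stratify=y
)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(x_train, y_train)            # learn from the training data
prediction = model.predict(x_test)     # predict species for the test rows

# Percentage of test rows whose species was predicted correctly.
accuracy_percent = accuracy_score(y_test, prediction) * 100
print(accuracy_percent)
```

Other scikit-learn models such as `DecisionTreeClassifier` or `LogisticRegression` expose the same `fit` and `predict` methods, so they can be swapped in with a one-line change.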
LET'S IMPLEMENT THESE CONCEPTS PRACTICALLY
Where to get data from?

■ Kaggle.com – Data Science Community (14K Plus)
■ UCI ML Repository (ML Friendly Data)
■ Data.world (dataset aggregator)
■ Datasets subreddit
■ https://registry.opendata.aws
■ https://github.com/awesomedata/awesome-public-datasets
■ Making or collecting your own – Web Scraping, Web Crawling, Surveys using Google Forms, etc.
Where to learn more? (Good Resources)

■ Books – ML for Absolute Beginners, Hands-On ML with Scikit-Learn and TensorFlow, Python Data Science Handbook and more.
■ Websites – Kaggle.com, towardsdatascience.com, medium.com, analyticsvidhya.com, kdnuggets.com, machinelearningmastery.com
■ Courses: edx.org, coursera.org, cognitiveclass.ai, datacamp.com (3 free courses and cheatsheets)
■ YouTube: Deep Learning TV, Google's Series on ML, Siraj Raval, csdojo
