Machine Learning Presentation

The document discusses machine learning topics including artificial intelligence, data science, machine learning techniques, and applications. It covers supervised learning methods like regression, classification, and decision trees. Unsupervised learning techniques like clustering and dimensionality reduction are examined. Other topics include time series modeling, ensemble methods, recommender systems, and text mining.

Machine Learning

Artificial intelligence
• Intelligence displayed by machines that simulates animal and human
intelligence
• Data Science vs Machine Learning vs Artificial Intelligence
• Machine learning (Supervised, unsupervised & deep learning as well as
prediction). Subset of Artificial Intelligence
• Artificial intelligence (Automated decision making)
• Data science (Data transformation, Analytics insight & human decision
making). Use of statistical methods to find patterns in data
ML Topics
1. Intro to AI & Machine learning (Techniques & Applications)
2. Data wrangling & manipulation
3. Supervised learning – Regression & Classification
4. Feature engineering
5. Unsupervised learning
6. Time series modelling
7. Ensemble learning
8. Recommender systems
9. Text mining
Machine Learning techniques & Applications
Techniques
1. Classification
2. Categorization
3. Clustering
4. Trend analysis (use time series data analysis)
5. Anomaly detection
6. Visualization
7. Decision making
Applications
• Healthcare (patient data & risk analytics, medical imaging & diagnostics, wearables,
virtual assistants, inpatient care, drug discovery, lifestyle management & monitoring,
emergency room & surgery)
• Image processing (optical character recognition – OCR, self-driving cars)
• Robotics (human simulation, humanoid robots, industrial robotics)
• Data mining (anomaly detection, grouping & predictions, association rules)
• Video games (reinforcement learning)
• Text analysis (sentiment analysis, spam filtering, information extraction)
Data wrangling & manipulation (acquisition,
exploration, wrangling, manipulation & typecasting)
• Exploration techniques using Python – dimensionality check, dataset
types, slicing & indexing, unique elements, value extraction, feature
mean, median and mode
• Wrangling – techniques (discovering, structuring, cleaning,
enrichment, validation)
• Manipulation using Python – coercion, merging, concatenation &
joining
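The exploration and manipulation steps listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; in practice a dataframe library such as pandas is the usual tool. The dataset, the column names and the `regions` lookup table are hypothetical and exist only for this example.

```python
import statistics

# Hypothetical toy dataset: one dict per row, all values read in as strings.
rows = [
    {"id": "1", "city": "Pune", "age": "34"},
    {"id": "2", "city": "Delhi", "age": "29"},
    {"id": "3", "city": "Pune", "age": "29"},
]
extra = [{"id": "4", "city": "Mumbai", "age": "41"}]

# Concatenation: stack the rows of two datasets.
data = rows + extra

# Coercion (typecasting): convert string fields to integers.
for row in data:
    row["id"] = int(row["id"])
    row["age"] = int(row["age"])

# Dimensionality check: (number of rows, number of columns).
shape = (len(data), len(data[0]))

# Slicing and value extraction: the first two rows, then a single column.
head = data[:2]
ages = [row["age"] for row in data]

# Unique elements of a column.
unique_cities = sorted({row["city"] for row in data})

# Feature mean, median and mode.
age_mean = statistics.mean(ages)
age_median = statistics.median(ages)
age_mode = statistics.mode(ages)

# Merging / joining on a key: attach a region to each row via a lookup table.
regions = {"Pune": "West", "Delhi": "North", "Mumbai": "West"}
merged = [{**row, "region": regions[row["city"]]} for row in data]
```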
Supervised learning – Regression &
Classification
• Classification – output has discrete and finite values
• Logistic regression/sigmoid probability (binary solutions – loan sanction, Pass/Fail, Customer segments,
Cancer prediction, Weather forecast). Accuracy metrics – Confusion matrix, ROC curve, AUC
• Decision trees (sequential, hierarchical decisions). Root node, Decision node & Terminal node
• Random forest (ensemble of decision trees). Uses bagging (bootstrap aggregation), which reduces variance, and
produces decorrelated decision trees
• Naïve Bayes classifier (Bayes theorem and assumption that features are independent)
• Support Vector Machines (hyperplane that separates instances into different categories)
• Regression – output is a continuous numeric variable, e.g. Profit predicted from features such as
R&D spend, Admin, Marketing spend, State etc
• Linear
• Multiple linear
• Polynomial (Quadratic features)
• Ridge regression – used when features are collinear (L2 penalty shrinks the coefficients)
• Lasso regression – similar to ridge but uses an L1 penalty, so it also performs feature selection
• ElasticNet – combines the Ridge & Lasso penalties
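For the logistic regression bullet above, the sigmoid and the confusion-matrix metrics can be sketched as follows. This is an illustrative sketch, not a fitted model: the `weight` and `bias` values are chosen by hand, and the Pass/Fail data is hypothetical.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weight, bias, threshold=0.5):
    """Binary decision from a one-feature logistic model."""
    return 1 if sigmoid(weight * x + bias) >= threshold else 0

def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical Pass/Fail data: hours studied and the true outcome (1 = pass).
hours = [2, 4, 5, 6, 8, 3]
y_true = [0, 0, 1, 1, 1, 1]

# Assumed, not fitted: weight and bias are picked by hand for illustration.
y_pred = [predict(h, weight=1.0, bias=-5.0) for h in hours]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
```

Sweeping `threshold` from 0 to 1 and recording the true-positive and false-positive rates at each value is what traces out the ROC curve; AUC is the area under that curve.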
Unsupervised learning
• Types of clustering (linkage measures between clusters – complete, single, average (mean),
centroid)
• Hierarchical Clustering
• Agglomerative
• Divisive
• Partitioning Clustering
• K-Means (choose k random data points as initial centroids, assign each data point to its closest
centroid, calculate new cluster centroids, repeat until the convergence criterion is met)
• Fuzzy C-Means
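The K-Means steps listed above can be sketched directly. This is a minimal pure-Python version under a few assumptions: points are tuples of numbers, distance is squared Euclidean, convergence means the centroids stop changing, and an empty cluster keeps its previous centroid. The function names and the sample points are illustrative.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: choose k random data points as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Step 2: assign each data point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 3: recompute each centroid as the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]  # assumption: empty cluster keeps its centroid
            for i, cluster in enumerate(clusters)
        ]
        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Two well-separated groups of hypothetical 2-D points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, k=2)
```

Because the starting centroids are random, K-Means can in general settle on different clusterings for different seeds; running it several times and keeping the best result is a common precaution.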
Feature engineering
• Factor analysis (extract underlying causes)
• PCA – Principal Component Analysis (reduces computational complexity).
Eigenvectors & Eigenvalues
• LDA – Linear Discriminant Analysis (reduces dimensions)
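To make the link between PCA and eigenvectors concrete, the sketch below computes the covariance matrix of a small dataset and finds its leading eigenvector by power iteration. This is an illustration only: power iteration is one simple way to get the first principal component and is used here as an assumption, whereas numerical libraries normally use a full eigendecomposition or SVD. The data and function names are hypothetical.

```python
import math

def covariance_matrix(data):
    """Return the sample covariance matrix and the mean-centered data."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered

def top_eigenpair(matrix, iterations=200):
    """Power iteration: leading eigenvalue and unit eigenvector.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v

# Hypothetical 2-D data lying exactly on the line y = 2x.
data = [(1, 2), (2, 4), (3, 6), (4, 8)]
cov, centered = covariance_matrix(data)
eigenvalue, component = top_eigenpair(cov)

# Reduce each 2-D point to one coordinate along the first principal component.
projected = [sum(a * b for a, b in zip(row, component)) for row in centered]
```

The eigenvalue is the variance captured along the component, so keeping only the components with the largest eigenvalues is what reduces the number of dimensions.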
Time series modelling
• Time acts as the independent variable
• Need – understand seasonal & cyclic patterns, detect unusual events,
forecast. Look out for white noise
• Stationarity (mean, variance & autocovariance constant over time). Use differencing & decomposition
to achieve stationarity
• Time series models
• Auto regressive models (AR) – predict using weighted sum of past values
• Auto regressive Integrated Moving Average (ARIMA) – linear combination of past and
present errors and values
• Moving Average (MA)
• Autocorrelation function (ACF) & partial autocorrelation function (PACF)
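Three of the ideas above (differencing, autocorrelation, and an AR prediction as a weighted sum of past values) can be sketched in plain Python. The series and the AR coefficients are hypothetical; in a real model the coefficients would be estimated from data rather than chosen by hand.

```python
def difference(series, lag=1):
    """Differencing: subtract the value `lag` steps earlier to remove trend."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def autocorrelation(series, lag):
    """Sample autocorrelation of the series with itself shifted by `lag`."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator

def ar_forecast(series, coefficients, intercept=0.0):
    """One-step AR forecast: weighted sum of the most recent values.

    coefficients[0] multiplies the latest value, coefficients[1] the one before.
    """
    return intercept + sum(
        c * series[-1 - i] for i, c in enumerate(coefficients)
    )

# Hypothetical series with a steady upward trend (non-stationary mean).
series = [10, 12, 14, 16, 18, 20]

differenced = difference(series)       # the trend is gone after differencing
acf_lag1 = autocorrelation(series, 1)  # correlation with the previous step

# Assumed AR(2) coefficients, chosen by hand for illustration.
next_value = ar_forecast(series, coefficients=[0.5, 0.25])
```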
Ensemble learning
• Combines individual models (averaging & weighted averaging – SVM,
Decision tree, Logistic regression)
• Bagging (bootstrap aggregation) algorithms – reduce variance by
taking the mean of multiple estimates
• Boosting – reduces bias by training weak learners sequentially
• Cross validation
• Adaboost
• Gradient boosting (GBM)
• Extreme Gradient Boosting (XGBoost)
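The averaging and bagging ideas above can be sketched without any particular model. In this illustration the "estimator" is just a function of a data sample (the median), standing in for a trained model; the helper names and the data are hypothetical.

```python
import random
import statistics
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def bagged_estimate(data, estimator, n_rounds=50, seed=0):
    """Bagging: average the estimator over many bootstrap samples."""
    rng = random.Random(seed)
    estimates = [estimator(bootstrap_sample(data, rng)) for _ in range(n_rounds)]
    return sum(estimates) / len(estimates)

def majority_vote(predictions):
    """Combine class predictions from several models by voting."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_average(predictions, weights):
    """Combine numeric predictions, trusting some models more than others."""
    total = sum(p * w for p, w in zip(predictions, weights))
    return total / sum(weights)

# Hypothetical data with one outlier.
data = [3, 5, 7, 9, 100]
bagged_median = bagged_estimate(data, statistics.median)

# Hypothetical outputs of three models (e.g. SVM, decision tree, logistic regression).
vote = majority_vote(["spam", "ham", "spam"])
blended = weighted_average([0.2, 0.4, 0.9], weights=[1, 1, 2])
```

Boosting differs in that the learners are trained one after another, each concentrating on the examples the previous ones got wrong, so it cannot be expressed as a simple average of independent runs.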
Recommender systems (information
filtering technique)
• Collaborative filtering
• Item based nearest neighbor
• User based nearest neighbor
• Cosine & adjusted cosine similarity
• Association rule mining
• Apriori algorithm
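As a sketch of nearest-neighbor filtering with cosine similarity, the example below predicts a missing rating from the ratings of similar users, then shows how the same similarity function applies to items. The ratings, user names and helper functions are hypothetical. One assumption to note: unrated items are treated as 0 when computing vector norms, which is only one of several conventions.

```python
import math

# Hypothetical ratings: user -> {item: rating}. Missing items are unrated.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5},
    "carol": {"A": 1, "B": 5},
}

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse rating vectors (unrated = 0)."""
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict_user_based(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    numerator = denominator = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        similarity = cosine_similarity(ratings[user], their)
        numerator += similarity * their[item]
        denominator += similarity
    return numerator / denominator if denominator else None

def item_vectors(ratings):
    """Transpose to item -> {user: rating} for item-based similarity."""
    items = {}
    for user, their in ratings.items():
        for item, value in their.items():
            items.setdefault(item, {})[user] = value
    return items

# User-based: predict carol's rating for item C from alice and bob.
predicted = predict_user_based(ratings, "carol", "C")

# Item-based: how similar are items A and C, judged by who rated them how?
items = item_vectors(ratings)
similarity_a_c = cosine_similarity(items["A"], items["C"])
```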
Text Mining
• Text preprocessing – text transformation & attribute generation – attribute selection – data
mining or pattern discovery – interpretation or evaluation
• Applications
• Document clustering
• Products insight
• Pattern identification
• Security monitoring
• Natural language toolkit (NLTK) – tokenization, stemming, lemmatization, tagging, parsing &
semantic reasoning, named entity recognition (NER)
• N-grams (unigrams, bigrams & trigrams)
• NLP Process workflow
• Tokenization, stopword removal, stemming & lemmatization, POS tagging, information retrieval
• Structuring sentences
• Syntax, phrase structure rules, syntax tree parsing, rendering syntax tree, chunking & chunk parsing,
chinking (removing sequences of tokens from chunks)
• Context free grammar (CFG)
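The first steps of the workflow above (tokenization, stopword removal, n-grams) can be sketched with the standard library alone. This is a simplified stand-in for what NLTK provides, not NLTK itself: the regular expression, the tiny `STOPWORDS` set and the function names are assumptions made for this example.

```python
import re

# Tiny illustrative stopword list; real toolkits ship much longer ones.
STOPWORDS = {"the", "is", "a", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Drop very common words that carry little meaning."""
    return [t for t in tokens if t not in stopwords]

def ngrams(tokens, n):
    """Sliding windows of n consecutive tokens (n=1 unigrams, n=2 bigrams, ...)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = "The cat sat on the mat"
tokens = tokenize(text)
content_tokens = remove_stopwords(tokens)
bigrams = ngrams(content_tokens, 2)
trigrams = ngrams(content_tokens, 3)
```

Stemming, lemmatization, POS tagging and parsing need linguistic resources (word lists, trained taggers, grammars), which is where a toolkit such as NLTK comes in.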