
Lecture 12 Machine Learning

machine learning lecture
Today's Agenda
8/16/2020

MACHINE LEARNING WITH PYTHON
- PROCESS
- TYPES OF ML
- TYPES OF PROBLEMS
- APPLICATIONS

Machine Learning: ETP Framework

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

Main components of any learning algorithm: Task (T), Performance (P) and Experience (E)

* ML is a field of AI consisting of learning algorithms that:
  - Improve their performance (P)
  - At executing some task (T)
  - Over time with experience (E)

[Figure: examples for the ETP framework. Input data: housing prices, customer transactions, clickstream data. Tasks: predict prices, segment customers, optimize user flows, categorize images. Performance: accurate prices, coherent groupings. The remaining labels are illegible.]

Machine Learning

* Machine Learning allows systems to make decisions autonomously, without external support
* These decisions become possible once the machine has learned from the data and understood the underlying patterns it contains
* Then, through pattern matching and further analysis, the system returns an outcome, which can be a classification or a prediction

ML Process

One approach:
* An ML algorithm is trained on a labelled or unlabelled training data set to produce a model
* New input data is introduced to the ML algorithm
  - It makes a prediction based on the model
* The prediction is then evaluated for accuracy
  - If the accuracy is acceptable, the machine learning algorithm is deployed
  - If the accuracy is not acceptable, the ML algorithm is trained again and again with an augmented training data set

Types and Applications of ML

[Figure: Types of Machine Learning, with example applications: fraud detection, email spam detection, diagnostics, image classification, risk assessment, score prediction, text mining, face recognition, big data visualization, image recognition, biology, city planning, targeted marketing. The branch labels are illegible.]

Supervised Learning

* Supervised Learning is the most popular paradigm for performing machine learning operations
* It is widely used for data where there is a precise mapping between input and output
* The dataset is labelled; the algorithm identifies the features explicitly and carries out predictions or classification accordingly
* As training progresses, the algorithm learns the relationship between input and output so that it can predict the outcome for new data

Supervised Learning

[Figure: Supervised Learning. Labels in the figure: Annotations, Prediction, "These are grapes".]

Supervised Learning Use Case

* Facial recognition is one of the most popular applications of Supervised Learning, and more specifically of Artificial Neural Networks (ANN)
* A Convolutional Neural Network (CNN) is a type of ANN used for identifying people's faces
* These models extract features from the image through various filters
* Finally, if there is a high similarity score between the input image and an image in the database, a positive match is returned

Baidu will provide airports with facial recognition technology that gives access to the ground crew and the staff. Passengers therefore do not have to wait in long queues for flight check-in; they can simply board their flight by scanning their faces.

BAIDU, CHINA'S PREMIER SEARCH ENGINE COMPANY

Supervised Machine Learning

* Train the model on a labelled dataset
  - It contains both the raw input data and its results
* Split the data into a training dataset and a test dataset
  - The training dataset is used to train the network
  - The test dataset acts as new data for predicting results and checking the accuracy of the model
* Accuracy is the main measure of success in supervised learning; with labelled data, model accuracy is usually high
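
The train, predict and evaluate steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the lecture's own code: the data, the one-feature "threshold" model and all function names are assumptions made up for the example.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle labelled rows, then hold out a fraction as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train_rows):
    """Toy 'training': put the decision threshold midway between class means."""
    class0 = [x for x, label in train_rows if label == 0]
    class1 = [x for x, label in train_rows if label == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2

def predict(threshold, x):
    """Predict class 1 when the feature lies above the learned threshold."""
    return 1 if x > threshold else 0

def accuracy(threshold, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for x, label in rows if predict(threshold, x) == label)
    return correct / len(rows)

# Hypothetical labelled data: (feature, label) pairs for two separated classes.
data = [(i / 10, 0) for i in range(20)] + [(i / 10, 1) for i in range(30, 50)]

train_rows, test_rows = train_test_split(data)
threshold = fit_threshold(train_rows)          # learn from the training set only
test_accuracy = accuracy(threshold, test_rows)  # evaluate on unseen rows
print(len(train_rows), len(test_rows), test_accuracy)
```

The test rows are never seen during fitting, which is what lets the accuracy figure stand in for performance on new data.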
Supervised Machine Learning: Learning Phase

[Figure: learning phase. Labelled training data is turned into feature vectors and passed to the algorithm.]

Unsupervised Learning

* Data is not explicitly labelled into different classes
  - There are no labels
* The model learns from the data by finding implicit patterns
* Unsupervised Learning algorithms group the data based on densities, structures, similar segments, and other similar features
* Cluster analysis is one of the most widely used techniques in unsupervised learning

Unsupervised Learning

[Figure: Unsupervised Learning. Labels in the figure: Input, Model, Output.]

Unsupervised Learning Use Case

* Using clustering, businesses can identify potential customer segments for selling their products
* Sales companies can identify the customer segments that are most likely to use their services
* Companies can evaluate the customer segments and then decide where to sell their product to maximize profit

The goal of this company is to ingest and process customer data in order to make it accessible to marketers. It goes one step further by providing smart insights to the marketing team, allowing them to get the maximum profit out of their product marketing.

ISRAELI-BASED STARTUP: OPTIMOVE
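
To make the clustering idea behind customer segmentation concrete, here is a minimal k-means sketch in plain Python. The customer data, the feature meanings and the naive "first k points" initialisation are all assumptions for illustration; the slides do not say which clustering algorithm is used.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then update."""
    centroids = list(points[:k])  # naive initialisation (assumption)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical customers: (monthly spend, visits per month), with no labels.
customers = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]

centroids, clusters = kmeans(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied anywhere; the two segments emerge only from how close the points are to each other.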
Unsupervised Learning in Real Life

[Figure: three steps. (1) Identifying the potential customer base for selling the product. (2) Implementing clustering algorithms to group the customer base. (3) Caption partly illegible: "... product to ... group".]

Reinforcement Learning

* Reinforcement Learning is an area of Artificial Intelligence that allows machines to interact with their dynamic environment in order to reach their goals
* With it, machines and software agents can evaluate the ideal behavior in a specific context
* The agent receives simple reward feedback, known as a reinforcement signal
* With the help of this reward feedback, agents learn the behavior and improve it in the long run

Reinforcement Learning

[Figure: agent and environment loop. Labels: action A, state S, reward R, Environment.]

* The agent in the environment is required to take actions based on the current state
* No answer key is provided to the agent when it has to perform a particular task
* Since there is no training dataset, it learns from its own experience

Reinforcement Learning Use Case

* Google's Active Question Answering (AQA) system makes use of reinforcement learning
* The AQA system reformulates the questions asked by the user
* For example, if you ask the AQA bot "What is the birth date of Nikola Tesla", the bot reformulates it into different questions such as "What is the birth year of Nikola Tesla", "When was Tesla born?" and "When is Tesla's birthday"
* This kind of reformulation traditionally used a sequence-to-sequence (seq2seq) model, but Google has integrated reinforcement learning into its system to interact better with the query-based environment
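
The agent, state, action and reward loop described in the Reinforcement Learning slides can be sketched with tabular Q-learning on a tiny corridor world. This is an illustrative toy, not the AQA system: the environment, the hyperparameters and every name below are assumptions.

```python
import random

N_STATES = 5                 # corridor cells 0..4; reaching cell 4 ends an episode
ACTIONS = (-1, +1)           # index 0 = move left, index 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # hypothetical hyperparameters

def step(state, action):
    """Environment: return (next_state, reward); reward is 1 only at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def choose_action(q, state, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([i for i, value in enumerate(q[state]) if value == best])

def train(episodes=300, seed=0):
    """Learn action values purely from interaction; no labelled answers exist."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                      # cap the episode length
            a = choose_action(q, state, rng)
            next_state, reward = step(state, ACTIONS[a])
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])   # Q-learning update
            state = next_state
            if state == N_STATES - 1:
                break
    return q

q_table = train()
policy = ["left" if values[0] > values[1] else "right" for values in q_table[:-1]]
print(policy)
```

The only feedback the agent ever receives is the reward returned by `step`, which plays the role of the reinforcement signal.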
Reinforcement Learning

* This is a deviation from the traditional seq2seq model, as all the tasks are carried out using reinforcement learning and policy gradient methods
* For a given question $q_0$, we want to obtain the best possible answer $a^*$
* The goal is to maximize the reward: $a^* = \arg\max_a R(a \mid q_0)$

Types of Supervised Machine Learning

* Supervised learning algorithms are task-oriented
* As we provide the algorithm with more and more examples, it learns the task better and yields more accurate output

Types
* Regression
  - The output variable is a real value, such as "dollars" or "weight"
* Classification
  - Predicts an output variable that falls into particular categories, such as "red" or "blue", or "disease" and "no disease"

Algorithms

1. Decision Trees
2. Naive Bayes Classification
3. Support Vector Machines for classification problems
4. Random Forest for classification and regression problems
5. Linear Regression for regression problems
6. Logistic Regression
7. K-Nearest Neighbors
8. Ensemble Methods (Gradient Boosting)
9. Artificial Neural Network

[Figure: overview of algorithms. Legible labels: Logistic Regression, Naive Bayes, Decision Tree, Random Forest, K-Nearest Neighbours; one label is illegible.]

Decision Trees

* Help to make a decision about a data item
* Decision Tree algorithms are used for both prediction and classification in machine learning
* Using a decision tree with a given set of inputs, one can map the various outcomes that result from the consequences or decisions

[Figure: decision tree with the questions "Is sex male?", "Is age > 9.5?" and "Is sibsp > 2.5?".]
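
A decision tree is just a sequence of such questions, so the tree in the figure can be written as nested conditions. In the sketch below only the three questions come from the slide; the leaf outcomes ("survived" / "died"), the branch order and the dictionary field names are assumptions, modelled on the widely used Titanic survival example.

```python
def predict_survival(passenger):
    """Walk the three questions from the figure and return a leaf label.

    The leaf labels are assumed for illustration; the slide shows only the
    questions, not the outcomes.
    """
    if passenger["sex"] != "male":      # Is sex male? -> no
        return "survived"
    if passenger["age"] > 9.5:          # Is age > 9.5? -> yes
        return "died"
    if passenger["sibsp"] > 2.5:        # Is sibsp > 2.5? -> yes
        return "died"
    return "survived"

# Hypothetical passengers (sibsp = number of siblings/spouses aboard).
examples = [
    {"sex": "female", "age": 30, "sibsp": 0},
    {"sex": "male", "age": 40, "sibsp": 1},
    {"sex": "male", "age": 5, "sibsp": 4},
    {"sex": "male", "age": 5, "sibsp": 1},
]
for person in examples:
    print(person, "->", predict_survival(person))
```

A learned tree differs only in that the questions and thresholds are chosen from the training data instead of being written by hand.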
Naive Bayes

* A classification technique based on Bayes' Theorem
* The Naive Bayes classifier assumes that a particular feature in a class is not directly related to any other feature
* Even if features are interdependent, or one feature exists because of another, all of them are treated as contributing independently to the probability of the outcome
* A Naive Bayes model is not difficult to build and is really useful for very large datasets

Naive Bayes

* If a categorical variable takes a category that was not observed in the training set, the model assigns it a probability of 0, which prevents it from making a prediction
* Naive Bayes assumes independence between its features. In real life, it is difficult to gather data that involves completely independent features

Bayes' Theorem

* P(H | E): posterior probability. "A posteriori" means deriving theory from given evidence. It denotes the conditional probability of the hypothesis H, given the evidence E.
* P(E | H): likelihood, the conditional probability of the evidence occurring given the hypothesis. It is the probability of the evidence, assuming that the hypothesis holds true.
* P(H): prior probability, the original probability of the hypothesis H being true before Bayes' Theorem is applied. That is, the probability without involving the data or the evidence.
* P(E): the probability of the evidence occurring, regardless of the hypothesis.

$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$

The posterior probability P(H | E) is the likelihood P(E | H) multiplied by the prior probability of the hypothesis P(H), divided by the probability of the evidence P(E).
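
As a quick check of the formula, Bayes' Theorem can be written as a two-line function. The spam scenario and all numbers below are hypothetical values chosen for illustration, and `evidence_probability` is a helper added here that applies the law of total probability to obtain P(E).

```python
def evidence_probability(likelihood, prior, likelihood_if_not_h):
    """P(E) = P(E|H) * P(H) + P(E|not H) * P(not H)."""
    return likelihood * prior + likelihood_if_not_h * (1 - prior)

def posterior(likelihood, prior, evidence):
    """Bayes' Theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence

# Hypothetical numbers: H = "email is spam", E = "email contains the word 'offer'".
p_e_given_h = 0.9        # P(E|H), assumed
p_h = 0.2                # P(H), assumed
p_e_given_not_h = 0.15   # P(E|not H), assumed

p_e = evidence_probability(p_e_given_h, p_h, p_e_given_not_h)
p_h_given_e = posterior(p_e_given_h, p_h, p_e)
print(round(p_e, 3), round(p_h_given_e, 3))
```

Seeing the evidence raises the probability of the hypothesis from the prior of 0.2 to a posterior of about 0.6.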
Example

Suppose the weather today is cloudy.
* Will it rain today, given that it is cloudy?
* Calculate the probability of rainfall, given the evidence of cloudiness
* The posterior part of the equation is P(Rain | Clouds), where "it will rain today" is the hypothesis (H) and cloudiness is the evidence (E)

Suppose we know that on 60% of rainy days the weather was cloudy. This is the probability of it being cloudy given rain: P(Clouds | Rain) = P(E | H) = 0.6. It is the "backward" probability: the probability of observing the evidence (clouds) given that the hypothesis (rain) is true.

* Out of all the days in the month, 75% are cloudy. This is the probability of cloudiness, P(Clouds) = 0.75
* Since this is a rainy month of the year, it usually rains on 15 days out of 30. That is, the prior probability of the hypothesis is P(Rain) = 15/30 = 0.5 or 50%
* Calculate the probability of rain, given the cloudy weather:
  P(Rain | Clouds) = (P(Clouds | Rain) * P(Rain)) / P(Clouds) = (0.6 * 0.5) / 0.75 = 0.4

Hence there is a 40% chance of rainfall, given the cloudy weather.

Advantages

* Naive Bayes is an easy and quick way to predict the class of a data point
* It can perform multi-class prediction
* When the assumption of independence holds, Naive Bayes performs better than other algorithms such as logistic regression, and it requires less training data

Disadvantages

* If a categorical variable takes a category that was not observed in the training set, the model assigns it a probability of 0, which prevents it from making a prediction
* Naive Bayes assumes independence between its features. In real life, it is difficult to gather data that involves completely independent features

Logistic Regression

* Used for binary classification of data points
* The name comes from a special function, the logistic function, which plays a central role in this method
* A logistic regression model is a probabilistic model
* It finds the probability that a new instance belongs to a certain class
* The output lies between 0 and 1
* The two classes are called the positive and the negative class

Important Parts

* Hypothesis and sigmoid curve
* With the help of the hypothesis, we can derive the likelihood of the event
* The data generated from this hypothesis is fitted with the logistic function, which creates an S-shaped curve known as the "sigmoid"
* Using this function, we can then predict the class
* Equation for logistic regression: $y = \frac{e^{b_0 + b_1 x}}{1 + e^{b_0 + b_1 x}}$
  - $b_0$ is the intercept and $b_1$ is the coefficient of the input $x$. We estimate these two coefficients using maximum likelihood estimation

Example

* Consider classifying emails into spam and ham (not spam)
* Assume that malignant spam falls in the positive class and benign ham in the negative class
* Take several labelled examples of emails and use them to train the model
* After training, the model can be used to predict the class of new emails
* When we feed an example to the model, it returns a value y such that 0 ≤ y ≤ 1
* Suppose the value we get is 0.8
* From this value, we can predict that there is an 80% probability that the tested example is spam
* It can therefore be classified as spam mail
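
The logistic regression equation and the spam decision can be sketched directly in Python. The coefficients `b0` and `b1`, the feature (a count of suspicious words) and the 0.5 decision threshold are assumptions for illustration; in practice the coefficients would be estimated from labelled emails by maximum likelihood.

```python
import math

def logistic(x, b0, b1):
    """y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x)); always between 0 and 1."""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))

def classify(x, b0, b1, threshold=0.5):
    """Positive class (spam) if the predicted probability reaches the threshold."""
    return "spam" if logistic(x, b0, b1) >= threshold else "ham"

# Hypothetical fitted coefficients; x = number of suspicious words in an email.
b0, b1 = -4.0, 1.0

for count in (1, 4, 6):
    probability = logistic(count, b0, b1)
    print(count, round(probability, 3), classify(count, b0, b1))
```

The model itself outputs only a probability; turning it into "spam" or "ham" is a separate thresholding step, and the threshold can be moved if false positives are costly.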
Linear Regression

* Linear regression is an approach for modeling the relationship between a dependent variable y and one or more independent variables x, expressed in a linear form
* The word "linear" indicates that the dependent variable changes linearly with the independent variables
* The rate of change is constant: if x increases or decreases, y changes linearly with it

  y = Ax + B

Logistic and Linear Regression

* Graph this logistic function: $y = \frac{1}{1 + e^{-x}}$
* Here e is Euler's number, the base of the natural logarithm; the function traces an S-shaped curve that takes values between 0 and 1

[Figure: logistic regression curve and linear regression line plotted against x.]

Python Implementation of Simple Linear Regression

* It is the most basic version of linear regression, which predicts a response using a single feature
* The assumption in SLR is that the two variables are linearly related

USING DIABETES DATASET FROM SCIKIT-LEARN

[Code screenshot: simple linear regression on the scikit-learn diabetes dataset. The code text is illegible in this copy.]
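
Since the code on the slide is not recoverable, here is a minimal stand-in that fits y = Ax + B by ordinary least squares in plain Python. It is not the slide's code: the slide names scikit-learn and its diabetes dataset, whereas this sketch uses the closed-form formulas and a small made-up dataset so that it runs without any library.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = A*x + B with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data, roughly following y = 2x + 1 with a little noise.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

A, B = fit_simple_linear_regression(xs, ys)
print("A =", round(A, 3), "B =", round(B, 3))
print("prediction for x = 6:", round(A * 6 + B, 3))
```

With scikit-learn, the same fit is usually obtained from `sklearn.linear_model.LinearRegression`, and the diabetes data named on the slide is available through `sklearn.datasets.load_diabetes`.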
Python Implementation of Multiple Linear Regression

* It is the extension of simple linear regression that predicts a response using two or more features
* A multiple linear regression model always includes an error term, known as the residual error, which accounts for the difference between the predicted and the observed values

USING BOSTON HOUSING DATASET FROM SCIKIT-LEARN

[Code screenshot, only partly legible: imports matplotlib.pyplot, numpy, and datasets, linear_model and metrics from sklearn; fits a linear regression on a train/test split; prints the coefficients and the variance score; then plots the residual errors (prediction minus target) for the training data in green and the test data in blue, under the title "Residual errors".]
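
The following sketch shows the same idea, multiple linear regression followed by residual errors, in plain Python. It is not the slide's code: it solves the normal equations by hand on a small made-up dataset, partly because the Boston housing loader (`load_boston`) was removed from scikit-learn in version 1.2. All names and data are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_multiple_linear_regression(rows, ys):
    """Least squares via the normal equations (X^T X) w = X^T y."""
    x = [[1.0] + list(row) for row in rows]            # leading 1 = intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefficients, row):
    """Intercept plus the weighted sum of the features."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))

# Hypothetical data with two features, roughly y = 1 + 2*x1 + 3*x2 plus noise.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (6, 2)]
ys = [9.2, 7.9, 19.1, 17.8, 26.3, 18.7]

coefficients = fit_multiple_linear_regression(rows, ys)
# Residual error as on the slide: prediction minus observed target.
residuals = [predict(coefficients, row) - y for row, y in zip(rows, ys)]
print([round(c, 3) for c in coefficients])
print([round(r, 3) for r in residuals])
```

The residuals are what the slide's scatter plot displays; for a least-squares fit with an intercept they sum to zero on the training data, so a visible pattern in them points to structure the linear model has missed.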
