Advanced ML

CHAPTER 5 : Classification Techniques

University Prescribed Syllabus (Academic Year 2022-2023)

Ensemble Classifiers: Introduction to Ensemble Methods, Bagging, Boosting, Random Forests, Improving Classification Accuracy of Class-Imbalanced Data, Metrics for Evaluating Classifier Performance, Holdout Method and Random Subsampling, Cross-Validation, Bootstrap, Model Selection Using Statistical Tests of Significance, Comparing Classifiers Based on Cost-Benefit and ROC Curves.

Self-learning Topics: Introduction to ML (Revision), Introduction to Reinforcement Learning.

5.1 Classification
5.2 Model Evaluation and Selection
    5.2.1 Metrics for Evaluating Classifier Performance
    5.2.2 Holdout Method and Random Subsampling
    5.2.3 Cross-Validation
    5.2.4 Bootstrap
    5.2.5 Model Selection Using Statistical Tests of Significance
    5.2.6 Comparing Classifiers Based on Cost-Benefit and ROC Curves
5.3 Techniques to Improve Classification Accuracy
    5.3.1 Introducing Ensemble Methods
    5.3.2 Bagging (Bootstrap Aggregating)
    5.3.3 Boosting and AdaBoost
    5.3.4 Random Forests
    5.3.5 Improving Classification Accuracy of Class-Imbalanced Data
5.4 Reinforcement Learning (RL)
5.5 Multiple Choice Questions

5.1 CLASSIFICATION

We have studied classification and different classifiers earlier. Let us recall some key concepts of classification.

(1) Classification is a form of data analysis that extracts models describing data classes. It is a supervised learning technique used to identify the category of new observations on the basis of training data. A classifier, or classification model, predicts categorical labels (classes), whereas numeric prediction models continuous-valued functions. Classification and numeric prediction are the two major types of prediction problems.

(2) Decision tree induction is a top-down recursive tree induction algorithm, which uses an attribute selection measure to select the attribute tested at each non-leaf node in the tree. It is a tree-structured classifier in which internal nodes represent the features of a dataset, branches represent decision rules, and each leaf node represents an outcome. ID3, C4.5, and CART are examples of such algorithms using different attribute selection measures. Tree pruning algorithms attempt to improve accuracy by removing branches that reflect noise in the data. Early decision tree algorithms typically assume that the data are memory resident.

(3) Naive Bayesian classification is based on Bayes' theorem of posterior probability. It assumes class-conditional independence, i.e., that the effect of an attribute value on a given class is independent of the values of the other attributes. It is a probabilistic classifier, which means it predicts on the basis of the probability of an object. Popular applications of the Naive Bayes algorithm are spam filtering, sentiment analysis, and classifying articles.

(4) A rule-based classifier uses a set of IF-THEN rules for classification. Rules can be extracted from a decision tree; such rules are easily interpretable, and these classifiers are therefore generally used to generate descriptive models. The condition used with "IF" is called the antecedent, and the predicted class of each rule is called the consequent. Rules may also be generated directly from training data using sequential covering algorithms.
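As a quick refresher on these classifiers, here is a minimal sketch (not from the textbook; it assumes scikit-learn and its bundled Iris dataset, which are my choices for illustration) that fits a decision tree and a Gaussian naive Bayes classifier on the same training data and compares their test accuracy.

```python
# A minimal revision sketch, assuming scikit-learn and its bundled Iris dataset
# (neither is prescribed by the textbook).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Decision tree induction: CART-style splits using the Gini index as the attribute selection measure
tree = DecisionTreeClassifier(criterion="gini").fit(X_train, y_train)

# Naive Bayes: class-conditional independence with Gaussian likelihoods per attribute
nb = GaussianNB().fit(X_train, y_train)

print("Decision tree accuracy:", tree.score(X_test, y_test))
print("Naive Bayes accuracy  :", nb.score(X_test, y_test))
```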
5.2 MODEL EVALUATION AND SELECTION

• Once our classification model is ready, we would like an estimate of how accurately the classifier can predict/classify the output class. Based on this, we come to know whether the training done is sufficient or not. We can even think of building more than one classifier and then comparing their accuracies.
• Let us see now: what is accuracy? How can we estimate it? Are some measures of a classifier's accuracy more appropriate than others? How can we obtain a reliable accuracy estimate?

5.2.1 Metrics for Evaluating Classifier Performance

• The following list depicts various metrics/measures for evaluating how "accurate" your classifier is at predicting the class label of tuples:
  (1) Accuracy (2) Error rate (3) Precision (4) Recall (5) Sensitivity (6) Specificity (7) AUC-ROC (8) Log Loss
• Before discussing these measures, we need to understand certain terminology related to the confusion matrix.
• It is the easiest way to measure the performance of a classification problem where the output can be of two or more types of classes.
• A confusion matrix is nothing but a table with two dimensions, the actual class labels and the predicted class labels.
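To make these counts concrete, here is a small illustrative sketch (not part of the textbook; it assumes scikit-learn and made-up labels) that builds a 2x2 confusion matrix and derives accuracy, precision, and recall directly from its entries.

```python
# A minimal sketch, assuming scikit-learn and hypothetical ground-truth/predicted labels.
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # 1 = positive class, 0 = negative class
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)

# The same metrics expressed in terms of the confusion-matrix counts
print("Accuracy :", (tp + tn) / (tp + tn + fp + fn))   # == accuracy_score(y_true, y_pred)
print("Precision:", tp / (tp + fp))                    # == precision_score(y_true, y_pred)
print("Recall   :", tp / (tp + fn))                    # == recall_score(y_true, y_pred)
```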
5.2.5 Model Selection Using Statistical Tests of Significance

• If t > z or t < −z, then our value of t lies in the rejection region, within the distribution's tails. This means that we can reject the null hypothesis that the means of M1 and M2 are the same and conclude that there is a statistically significant difference between the two models. Otherwise, we conclude that any difference between M1 and M2 can be attributed to chance.
• The t-statistic for pairwise comparison is computed as

    t = ( mean_err(M1) − mean_err(M2) ) / sqrt( var(M1 − M2) / k )

  where

    var(M1 − M2) = (1/k) · Σ_{i=1}^{k} [ err(M1)_i − err(M2)_i − ( mean_err(M1) − mean_err(M2) ) ]²

  Here err(Mj)_i is the error rate of model Mj in the i-th cross-validation round, mean_err(Mj) is its mean error rate over the k rounds, and var(M1 − M2) is the variance of the difference between the two models.
• If two test sets are available instead of a single test set, then a nonpaired version of the test is used, where the variance between the means of the two models can be computed as

    var(M1 − M2) = var(M1)/k1 + var(M2)/k2

  and k1 and k2 are the number of cross-validation samples used for M1 and M2, respectively. This is also known as the two-sample t-test. While consulting the table of the t-distribution in this case, the number of degrees of freedom used is taken as the minimum number of degrees of freedom of the two models.
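Below is a small numerical sketch of the paired version of this test (not from the textbook; it assumes NumPy and SciPy and uses invented per-fold error rates). It computes the t-statistic both with the formula above and with SciPy's paired t-test.

```python
# A minimal sketch of the paired t-test over k-fold cross-validation error rates,
# assuming NumPy/SciPy and hypothetical per-fold error rates for models M1 and M2.
import numpy as np
from scipy import stats

err_m1 = np.array([0.21, 0.19, 0.24, 0.22, 0.20, 0.23, 0.18, 0.25, 0.21, 0.22])
err_m2 = np.array([0.18, 0.17, 0.22, 0.19, 0.18, 0.20, 0.16, 0.23, 0.19, 0.20])

k = len(err_m1)
diff = err_m1 - err_m2

# t-statistic as in the text: mean difference over sqrt(variance of differences / k)
t_manual = diff.mean() / np.sqrt(diff.var(ddof=0) / k)

# The same paired test via SciPy; its value differs slightly because SciPy uses
# the unbiased (k-1) variance estimate.
t_scipy, p_value = stats.ttest_rel(err_m1, err_m2)

print("t (formula):", round(t_manual, 3), " t (SciPy):", round(t_scipy, 3), " p-value:", round(p_value, 4))
# If |t| exceeds the critical value for the chosen significance level with k-1 degrees
# of freedom, we reject the null hypothesis that M1 and M2 perform the same.
```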
5.2.6 Comparing Classifiers Based on Cost-Benefit and ROC Curves

• For assessing the costs (risks) and benefits (gains) associated with a classification model, the true positives, true negatives, false positives, and false negatives are useful.
• The cost associated with a false negative (such as incorrectly predicting that a diabetic patient is not diabetic) is far greater than the cost of a false positive (incorrectly, yet conservatively, labeling a nondiabetic patient as diabetic).
• In such cases, we can outweigh one type of error over another by assigning a different cost to each.
• These costs may consider the danger to the patient, financial costs of resulting therapies, and other hospital costs.
• Similarly, the benefits associated with a true positive decision may be different from those of a true negative.
• Up to now, to compute classifier accuracy, we have assumed equal costs. Costs and benefits can instead be incorporated by computing the average cost (or benefit) per decision.
• Other applications involving cost-benefit analysis include loan application decisions and target marketing mailouts.
• ROC (Receiver Operating Characteristic) curves are a useful visual tool for comparing two classification models. As we have studied previously, an ROC curve for a given model shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR).
• The area under the ROC curve (AUC) is a measure of the accuracy of the model.
• Any increase in TPR occurs at the cost of an increase in FPR.
• For a two-class problem, an ROC curve allows us to visualize the trade-off between the rate at which the model can accurately recognize positive cases versus the rate at which it mistakenly identifies negative cases as positive, for different portions of the test set.
• It is immediately apparent that an ROC curve can be used to select a threshold for a classifier which maximizes the true positives while minimizing the false positives.
• However, different types of problems have different optimal classifier thresholds. For a cancer screening test, for example, we may be prepared to put up with a relatively high false positive rate in order to get a high true positive rate, since it is most important to identify possible cancer sufferers.
• For a follow-up test after treatment, however, a different threshold might be more desirable, since we want to minimize false negatives; we don't want to tell a patient they're clear if this is not actually the case.
• The AUC can be used to compare the performance of two or more classifiers. A single threshold can be selected and the classifiers' performance at that point compared, or the overall performance can be compared by considering the AUC.
• A classifier with an AUC of 0.80 is, in absolute terms, better than one with an AUC of 0.79. It is, however, also possible to check whether such differences in AUC are statistically significant.
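As an illustration of comparing two classifiers with ROC curves and AUC, here is a hedged sketch (not from the textbook; it assumes scikit-learn, a synthetic dataset, and two illustrative models of my choosing).

```python
# A minimal ROC/AUC comparison sketch, assuming scikit-learn and a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

m1 = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
m2 = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)

for name, model in [("M1: logistic regression", m1), ("M2: decision tree", m2)]:
    scores = model.predict_proba(X_te)[:, 1]        # continuous output f(X) in [0, 1]
    fpr, tpr, thresholds = roc_curve(y_te, scores)  # TPR/FPR trade-off at every threshold
    print(name, "AUC =", round(roc_auc_score(y_te, scores), 3))
```

Plotting tpr against fpr for each model would show the two ROC curves; the printed AUC values summarize their overall ranking ability.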
5.3 TECHNIQUES TO IMPROVE CLASSIFICATION ACCURACY

• In machine learning, no matter whether we are facing a classification or a regression problem, the choice of the model is extremely important to have any chance of obtaining good results. This choice can depend on many variables of the problem: quantity of data, dimensionality, distribution hypothesis, etc.
• In ensemble learning theory, we call weak learners (or base models) models that can be used as building blocks for designing more complex models by combining several of them.
• The idea of ensemble methods is to try to reduce the bias and/or variance of such weak learners by combining several of them together in order to create a strong learner (or ensemble model) that achieves better performance.
• To outline the definition and practicality of ensemble methods, the example of a decision tree classifier is used here. However, it is important to note that ensemble methods do not only pertain to decision trees.
• A decision tree determines the predicted value based on a series of questions and conditions. For instance, the simple decision tree shown in Fig. 5.3.1 determines whether an individual should play outside or not.
• The tree takes several weather factors into account and, given each factor, either makes a decision or asks another question. In this example, every time it is overcast, we will play outside.
• However, if it is raining, we must ask if it is windy or not. If windy, we will not play.
• But given no wind, tie those shoelaces tight because we are going outside to play.

Fig. 5.3.1 : A decision tree to determine whether to play outside or not

• Decision trees can also solve quantitative problems with the same format. In the tree of Fig. 5.3.2, we want to know whether or not to invest in a commercial real estate property. Is it an office building? A warehouse? An apartment building? Good economic conditions or poor economic conditions? How much will an investment return? These questions are answered and solved using this decision tree.

Fig. 5.3.2 : A decision tree to determine whether or not to invest in real estate

• When making decision trees, there are several factors we must take into consideration: on what features do we make our decisions? What is the threshold for classifying each question into a yes-or-no answer?
• In the first decision tree, what if we wanted to ask ourselves whether we had friends to play with or not? If we have friends, we will play every time; if not, we might continue to ask ourselves questions about the weather. By adding an additional question, we hope to better define the Yes and No classes.
• This is where ensemble methods come into the picture! Rather than just relying on one decision tree and hoping we made the right decision at each split, ensemble methods allow us to take a sample of decision trees into account, calculate which features to use or questions to ask at each split, and make a final predictor based on the aggregated results of the sampled decision trees.

5.3.1 Introducing Ensemble Methods

• Ensemble learning is a machine learning technique that combines several base models in order to produce one optimal predictive model, which helps to improve machine learning results.
• This approach allows the production of better predictive performance compared to a single model.
• The basic idea is to learn a set of classifiers and to allow them to vote. Ensembles tend to be more accurate than their component classifiers.

Fig. 5.3.3 : An overview of ensemble methods (create multiple data sets, create multiple classifiers C1 ... Cn, combine the classifiers into a combined prediction)

• Different types of ensemble classifiers are:
  1. Bagging
  2. Boosting and AdaBoost
  3. Random Forests
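To show the "learn several classifiers and let them vote" idea of Fig. 5.3.3 in code, here is a small hedged sketch (assuming scikit-learn; the particular base models are my choice for illustration, not the textbook's) that combines three different classifiers by majority vote.

```python
# A minimal sketch of combining heterogeneous classifiers by majority vote, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Three different base classifiers ...
base = [("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
        ("dt", DecisionTreeClassifier(max_depth=5))]

# ... combined by hard (majority) voting.
ensemble = VotingClassifier(estimators=base, voting="hard").fit(X_tr, y_tr)

for name, clf in base:
    print(name, "alone:", round(clf.fit(X_tr, y_tr).score(X_te, y_te), 3))
print("voting ensemble:", round(ensemble.score(X_te, y_te), 3))
```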
5.3.2 Bagging (Bootstrap Aggregating)

• This approach combines bootstrapping and aggregation to form one ensemble model; that is why the name is "bagging".
• Consider yourself as a patient who would like to have a diagnosis made based on the symptoms. Instead of asking one doctor, you may choose to ask several.
• If a certain diagnosis occurs more often than any other, you may choose this as the final or best diagnosis.
• That is, the final diagnosis is made based on a majority vote, where each doctor gets an equal vote.
• If we replace each doctor by a classifier, that is the basic idea behind bagging.
• Naturally, a majority vote made by a large group of doctors may be more reliable than a majority vote made by a small group.
• Given a sample of data, multiple bootstrapped subsamples are pulled, and a decision tree is formed on each of the bootstrapped subsamples. Each training set is a bootstrap sample.
• After each subsample decision tree has been formed, an algorithm is used to aggregate over the decision trees to form the most efficient predictor.
• To classify an unknown tuple X, each classifier Mi returns its class prediction, which counts as one vote. The bagged classifier M* counts the votes and assigns the class with the most votes to X.
• Bagging often considers homogeneous weak learners, learns them independently from each other in parallel, and combines them following some kind of deterministic averaging process.
• Bagging can be applied to the prediction of continuous values by taking the average value of each prediction for a given test tuple.

Fig. 5.3.4 : Bagging
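A hedged sketch of bagging follows (assuming scikit-learn and a synthetic dataset; the use of BaggingClassifier over decision trees is my illustration, not mandated by the text): each tree is trained on a bootstrap sample and the ensemble prediction is a majority vote.

```python
# A minimal bagging sketch, assuming scikit-learn; BaggingClassifier's default
# base estimator is a decision tree.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

# 50 trees, each fit on a bootstrap sample drawn with replacement;
# predictions are combined by majority vote (averaging for regression).
bag = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=7).fit(X_tr, y_tr)

single = DecisionTreeClassifier(random_state=7).fit(X_tr, y_tr)
print("single tree :", round(single.score(X_te, y_te), 3))
print("bagged trees:", round(bag.score(X_te, y_te), 3))
```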
5.3.3 Boosting and AdaBoost

• Consider the same example that was taken in the previous section: you, as a patient, have certain symptoms, and instead of consulting one doctor, you choose to consult several. Suppose you assign weights to the value or worth of each doctor's diagnosis, based on the accuracies of previous diagnoses they have made.
• The final diagnosis is then a combination of the weighted diagnoses. This is the basic idea behind boosting.
• Boosting often considers homogeneous weak learners, learns them sequentially in a very adaptive way (a base model depends on the previous ones), and combines them following a deterministic strategy.
• In boosting, weights are also assigned to each training tuple. A series of k classifiers is iteratively learned. After a classifier Mi is learned, the weights are updated to allow the subsequent classifier, Mi+1, to "pay more attention" to the training tuples that were misclassified by Mi.
• The final boosted classifier, M*, combines the votes of each individual classifier, where the weight of each classifier's vote is a function of its accuracy.
• In adaptive boosting (often called "AdaBoost"), we try to define our ensemble model as a weighted sum of L weak learners. It is a popular boosting algorithm.
• The basic idea is that when we build a classifier, we want it to focus more on the misclassified tuples of the previous round. Some classifiers may be better at classifying some "difficult" tuples than others. In this way, we build a series of classifiers that complement each other.
• We are given D, a data set of d class-labeled tuples, (X1, y1), (X2, y2), ..., (Xd, yd), where yi is the class label of tuple Xi.
• Initially, AdaBoost assigns each training tuple an equal weight of 1/d. Generating k classifiers for the ensemble requires k rounds through the rest of the algorithm.
• In round i, the tuples from D are sampled to form a training set Di of size d. Sampling with replacement is used; this means the same tuple may be selected more than once.
• Each tuple's chance of being selected is based on its weight. A classifier model, Mi, is derived from the training tuples of Di.
• Its error is then calculated using Di as a test set. The weights of the training tuples are then adjusted according to how they were classified: if a tuple was incorrectly classified, its weight is increased; if a tuple was correctly classified, its weight is decreased.
• A tuple's weight reflects how difficult it is to classify: the higher the weight, the more often it has been misclassified. These weights are used to generate the training samples for the classifier of the next round. This is how a series of classifiers that complement each other are built.
• To compute the error rate of model Mi, we sum the weights of each of the tuples in Di that Mi misclassified:

    error(Mi) = Σ_{j=1}^{d} wj × err(Xj)

  where err(Xj) is the misclassification error of tuple Xj: if the tuple was misclassified, then err(Xj) is 1; otherwise it is 0.
• If the performance of classifier Mi is so poor that its error exceeds 0.5, then we abandon it. Instead, we try again by generating a new Di training set, from which we derive a new Mi.
• The error rate of Mi affects how the weights of the training tuples are updated. If a tuple in round i was correctly classified, its weight is multiplied by error(Mi)/(1 − error(Mi)).
• Once the weights of all the correctly classified tuples are updated, the weights for all tuples (including the misclassified ones) are normalized so that their sum remains the same as it was before.
• To normalize a weight, we multiply it by the sum of the old weights, divided by the sum of the new weights. As a result, the weights of misclassified tuples are increased and the weights of correctly classified tuples are decreased.
• To predict the class label for a tuple X, boosting assigns a weight to each classifier's vote, based on how well the classifier performed.
• The lower a classifier's error rate, the more accurate it is, and therefore the higher its weight for voting should be. The weight of classifier Mi's vote is calculated as

    log( (1 − error(Mi)) / error(Mi) )

• For each class c, we sum the weights of each classifier that assigned class c to X. The class with the highest sum is the "winner" and is returned as the class prediction for tuple X.
• Bagging is less susceptible to model overfitting. While bagging and boosting can both significantly improve accuracy in comparison to a single model, boosting tends to achieve greater accuracy.
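The hedged sketch below uses scikit-learn's AdaBoostClassifier on a synthetic dataset. Note that scikit-learn implements a closely related reweighting variant (SAMME) rather than the exact resampling procedure described above, but the overall scheme (weak learners trained in rounds, misclassified tuples emphasized, accuracy-weighted voting) is the same; the base learner defaults to a decision stump (a depth-1 tree).

```python
# A minimal AdaBoost sketch, assuming scikit-learn and a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=3)

# k = 100 rounds: each round reweights the training tuples so the next weak
# learner pays more attention to the previously misclassified ones, and each
# learner's vote is weighted according to its accuracy.
boost = AdaBoostClassifier(n_estimators=100, random_state=3).fit(X_tr, y_tr)

stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)
print("single stump:", round(stump.score(X_te, y_te), 3))
print("AdaBoost    :", round(boost.score(X_te, y_te), 3))
```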
5.3.4 Random Forests

• Random forest models can be thought of as an extension over bagging: it is bagging with a slight twist.
• Each classifier in the ensemble is a decision tree classifier, so that the collection of classifiers is a "forest". Each classifier is generated using a random selection of attributes at each node to determine the split. During classification, each tree votes and the most popular class is returned.
• When deciding where to split and how to make decisions, bagged decision trees have the full set of features to choose from. Therefore, although the bootstrapped samples may be slightly different, the data is largely going to break off at the same features throughout each model.
• In contrast, random forest models decide where to split based on a random selection of features. Rather than splitting at similar features at each node throughout, random forest models implement a level of differentiation because each tree will split based on different features.
• This level of differentiation provides a greater ensemble to aggregate over, producing a more accurate predictor.

Fig. 5.3.5 : Random Forest Classifier

• Steps for implementing a Random Forest classifier:
  1. Multiple subsets are created from the original data set, selecting observations with replacement.
  2. A subset of features is selected randomly, and whichever feature gives the best split is used to split the node iteratively.
  3. The tree is grown to the largest extent possible.
  4. Repeat the above steps; the prediction is given based on the aggregation of predictions from n trees.
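A hedged sketch with scikit-learn's RandomForestClassifier follows (the dataset and hyperparameter choices are illustrative): each tree sees a bootstrap sample and considers only a random subset of the features at every split, which de-correlates the trees compared to plain bagging.

```python
# A minimal random forest sketch, assuming scikit-learn and a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=25, n_informative=10, random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=5)

# 200 trees; at each node only sqrt(n_features) randomly chosen features are
# considered as split candidates.
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                                bootstrap=True, random_state=5).fit(X_tr, y_tr)
print("random forest accuracy:", round(forest.score(X_te, y_te), 3))
```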
5.4 REINFORCEMENT LEARNING (RL) © Reinforcement learning is an area of Machine Leaming in which an agent leams to behave in an ‘environment by performing the actions and seeing the results of actions. It is a feedback-based Machine learning technique. It is about taking suitable action to ‘maximize reward in a particular situation. For each good action, the agent gets positive feedback, and for ‘each bad action, the agent gets negative feedback or penalty. =. I erly af (MU-New Syllabus we academic year 22-23)(M7-62) data, unlike supervised Teaming. Since there is py labeled data, s0 the agent is bound to leam by ig experience only se Enironment Teva fea tion Fig. $4.1 : Reinforcement Learning © RL solves a specific type of problem where decision ‘making is sequential, and the goal is long-term, such as ‘game-playing, robotics, etc © The agent interacts with the environment and explores it by itself. The primary goal of an agent in reinforcement learning is to improve the performance by getting the maximum positive rewards, ‘+ The agent learns with the process of hit and tral, and based on the experience, it learns to perform the task in better way. Hence, we can say that “Reinforcement Jearning isa type of machine learning method where an intelligent agent (computer program) interacts with the environment and learns to act within that.” How a Robotic dog Jeams the movement of his arms is an ‘example of Reinforcement learning. It is a core part of Artificial intelligence, and all Al ‘gent works on the concept of reinforcement learning. Here we do not need to pre-program the agent, cams from its own experience without any human intervention, * Example : Suppose there is an Al agent present within & maze environment, and his goal is 0 find the Glamond. The agent interacts with the environment by Performing some actions, and based on those actions, the state ofthe agent gets changed, and it also receives reward or penalty as feedback. Tect-Neo Publications..a SACHIN SHAH Ventuté‘AL8.DS«ll (MU-Sem 7.17) The agent continues doin action, change sate/remss feedback), and by doing explores the environment, The agent leams that what actions fedtak ora and wa scons fad ee feedback penalty. Asa positive reward, the agem gets Positive point and as. penalty, it gets a negative point 11 is employed by various software and machines to find the best possible behavior or path it should take in a specific situation, Reinforcement 8 these three if Teaming differs from supervised Teaming in a way that in supervised teaming the ‘taining data has the answer key with itso the mode! is trained with the correct answer itself whereas in reinforcement leaming, there is no answer but the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, itis bound to lear from its experience. 45.5 MULTIPLE CHOICE QUESTIONS a RECHOICE QUESTIONS Q.5.1 Which ofthe following algorithm isnot an example of an ensemble method? (@) Extra Tree Regressor () Random Forest, (© Gradient Boosting @ Decision Tree ans. (8) Q.5.2 Whatis iru about an ensembled clsier? 1. Classifiers that are sur can yote with more 2. Classifiers canbe surer about a pariolr pat ofthe space 3. Most of the times, it perfoms beter than a single classifier (@) Land2@) Lands (© 2and3 @ Alloftheabove Ans :(@) Q.53 Which ofthe folowing option i / are comect reguding benefits of ensemble model? 1. Better performance 2. Generalized models 3. Better intexpretabilty (@) Land3 (6) 2and3 (©) land? (6) 1,2and3 Vans. 
(0) (MU-New Syllabus w.ef academic year 22-23)(M7-62) Oss Q.56 Qs7 O58 ass sed Ml. Classification Toct Which of the following can be true for selecting base Jearmers for an ensemble? 1. Different feamers ean come from same algorithm ‘with different hyper parameters 2. Different leamers can come from different algorithms 3. Different leamers can come from different training spaces @ 2 (© Vand3 (@) 1,2and3 Yans.:(@) ‘True or False: Ensemble learning can only be applied to supervised learning methods. (@ Tue) False Yans.: (0) ‘Tree or False: Ensembles will yield bad results when ‘there is significant diversity among the models. ‘Note: All individual models have meaningful and good predictions (a) True (b) False ~Ans. + (b) Which ofthe following is / are tre about weak learners used in ensemble model? 1. They have low variance and they don't usually overfit, 2, They have high bias, so they can not solve hard Jeaming problems. 3. They have high variance and they don't usually overfit, (@) Land2 (6) Yand3 (©) 2and3 (8) None ofthese Ans. (a) ‘True or Fase: Ensemble of classifiers may or may not be ‘more accurate than any ofits individual model (@) Te () False Ans. : (a) If you use an ensemble of different base model, is it necessary to tune the hyper parameters of all base ‘modes to improve the ensemble performance? (a) Yes (@) No (© cantsay Ans. 2(0) Q.5.10 Generally, an ensemble method works better, if the individual base models have 2 ‘Note: Suppose each individual base models have scouracy greater than 50%. (@) Less comeation among predictions (©) High cometation among predictions (©) Correlation does not have any impact on ensemble output (©) None ofthe above Ans. : (9) Tab rechieo putcations.A SACHIN SHAN VentureAL& DS. (MU-Som 7-11) 7 as either of the candidates See ae ce ‘works similar to above-discussed eecton procedure int Peroos ae ike Base models of enerle metho (2) Bagging (6) Boosting (©) AOEB (@) None ofthese Ans. suppose you ae given a" predictions on test lab a” ‘Sao Mg) pec We ere ftoing mao) can Be sdf combine Be prisons of thse noe? Nee: Wear working on arson poem 1. Matin 2 Prot 3 avenge 4. Weighed sm 5. Minimum od Maximum 6. General ean @13med elas (© 134and6 ( Allofabove How can we aig he weight Yo opt of difeet dla sense? 1. Uses rita etme opi 2. Chote te weigh xing co aliaon 2. Give beh wigs to moe acre models (tad?) tas (23 Alefabve Q.siz vans. :(@), YAns.:(@) Which of the following is trve about averaging ensemble? (@) Itean only be used in classification problem (@) It-can only be used in regression problem (©) Xt can be used in both classification 2s well as regression (2) None ofthese Yas. (6) Q. 5.15 Suppose there ae 25 base classifies. Each classifier has ‘error rates of © = 0.35. Suppose you are using averaging 4 ensemble technique. What will be the probabilities that ensemble of above 25 classifiers will make a wrong prediction? Note: All classifies are independent ofeach other (=) 005 () 0.05 om (009 Ans. (0) (MU-New Syllabus wef academic year 22-23)(M7-62) Q.517 Qsu8 59 Q.5.20 Qs2 Q.5.22 iadvanced ML Classification ‘rich of the following POFIMetrs can be tye Tending good ensemble model in bagging a algorithms? 1. Max numberof samples 2, Max features 3, Bootstrapping of samples 44, Bootstrapping of Features (o Lands (&) 2and3 (@ 1ad2 (A) allofabove Yam Forte below confusion matrix, what is thereat? 
Nas [5 wos [53272 | 1307 3 io (| ae @07 008 (09 995 Yhza.s0 Which among the following evaluation meties woulg you NOT use to measure the performance of ‘lassification model? (@) Precision (©) Mean Squared Error (6) Recall (Fl score YAms (6) Which ofthe following i comet use of eros validation? (@) Selecting variables to include in a model (©) Comparing predictors (©) Selecting parameters in prediction function (@) Allof the mentioned YAns.:(@) Point out the wrong combination. (e) True negative = correctly rejected (©) False negative = correctly rejected (©) False positive = conetl identified (©) Allof the mentioned YAns:(0) Which ofthe following is a common err measure? (@) Sensitivity (©) Median absolte deviation © Specificity () Allofthe mentioned Yans.1(8) ‘Which ofthe following cross validation versions may 9° be suitable for very large datasets with hundreds of thousind of sample? (@) kefold cross-validation () Leave-one-out cross-validation (©) Holdout method (@ Alot the above ans. Tech-Neo Publications.A SACHIN SHAH Ventu®‘AL& DS-II(MU-Sem Q.525 Q.526 isa disadva vantage of kfld ers (a) The variance ofthe result is increased, NIM estimate is reduced ask validation method? (© Reduced bias (@) The taining algritim hs o times, Fenun from scratch k Reinforcement learnin, aes 1a) (@) Unsupervised laming (©) Supervised teasing (©) Award based learning (None — ‘Which ofthe following is an appicati wie 18 an application of reinforcement (@) Topic modeting (©) Recommendation sytem (©) Pattem recognition (@ Image classification YAns.: (6) Which of the following is true about reinforcement earning? (@) The agent gets rewards or penalty according to the action (©) It's an online learning (© The target of an agent is to maximize the rewards (@) All of the above Ans. (8) Q.5.27 If TP=9 FP=6 FN=26 TN=T0 then Error rate willbe (2) 45 percentage (b) 99 percentage (©) 28 percentage (6) 20 percentage YAns.:(0, (Advancod ML Classification Techni az as as as ar as as 0.10 an a1 0.13 Descriptive Questions What is Rolnforcoment Leaming? Explain the significance of Award and action in RL. Which are the various metrics for evaluating the classifier performance? What is confusion matrix? What aro the contents of rd Compare and contrast different ‘evaluating the classifier performance. fs based on cost mots for How can we compare classi benefit and ROC curves? Write a shor note on: Holdout method How Cross validation method can be used 10 evaluate classifiers? ‘Write a note on: Bootstrap. ‘What is meant by ensemble leaming? What are the ferent typos of ensemble classifiers? Explain how bagging method works. How is boosting method different than bagging? Write a note on: Adaboost Explain Random Forest model in detail, Can we ‘consider this as a extension over bagging method? ‘What is class imbalance problem? Explain it with an example. How can we improve classification accuracy of (Class-imbalanced data? ‘Chapter Ends... ooo