
Unit 1

1. Define Machine Learning.


Assignment
Applications of Machine Learning (ML)

1. Image Recognition
Image recognition is one of the most common uses of machine learning. There are many situations where you can classify the object in a digital image. For example, in the case of a black and white image, the intensity of each pixel serves as one of the measurements. In coloured images, each pixel provides three measurements of intensity in three different colours – red, green and blue (RGB).
Machine learning can be used for face detection in an image as well. There is a separate category for each person in a database of several people. Machine learning is also used for character recognition to discern handwritten as well as printed letters. We can segment a piece of writing into smaller images, each containing a single character.
E.g., the auto friend-tagging suggestion on Facebook, which is done with the help of Facebook's DeepFace project, a deep learning face recognition system.
2. Speech Recognition
Speech recognition is the translation of spoken words into text. It is also known as computer speech recognition or automatic speech recognition. Here, a software application recognizes the words spoken in an audio clip or file and then converts the audio into a text file. The measurement in this application can be a set of numbers that represent the speech signal. We can also segment the speech signal by intensities in different time–frequency bands.
Speech recognition is used in applications like voice user interfaces, voice search and more. Voice user interfaces include voice dialing, call routing, and appliance control. It can also be used for simple data entry and the preparation of structured documents.
3. Self-driving Cars
One of the most exciting applications of machine learning is self-driving cars. Machine learning plays a significant role in self-driving cars. Tesla, a popular car manufacturing company, is working on self-driving cars and uses an unsupervised learning method to train its car models to detect people and objects while driving.
Learning associations is the process of developing insights into the various associations between products. A good example is how seemingly unrelated products can be associated with one another. One application of machine learning is studying the associations between the products that people buy. If a person buys a product, he will be shown similar products because there is a relation between the two products. When new products are launched in the market, they are associated with the old ones to increase their sales.
4. Medical Diagnosis
Machine learning is used for the analysis of clinical parameters and their combinations for prognosis (for example, prediction of disease progression), for the extraction of medical knowledge for outcome research, for therapy planning, and for patient monitoring. These are successful implementations of machine learning methods that help integrate computer-based systems into the healthcare sector.
In medical science, machine learning is used for disease diagnosis. With this, medical technology is growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain. It helps in finding brain tumours and other brain-related diseases easily.
5. Financial Services
Machine learning has a lot of potential in the financial and banking sector and is a driving force behind the popularity of financial services. Machine learning can help banks and financial institutions make smarter decisions, for example by spotting an account closure before it occurs. It can also track the spending patterns of customers.
Machine learning can also perform market analysis. Smart machines can be trained to track spending patterns; the algorithms can identify the trends easily and can react in real time.
6. Product Recommendations
You shopped for a product online a few days back and then you keep receiving emails with shopping suggestions. If not this, then you might have noticed that the shopping website or app recommends items that somehow match your taste. Certainly, this refines the shopping experience, but did you know that it's machine learning doing the magic for you? Product recommendations are made on the basis of your behaviour on the website/app, past purchases, items liked or added to cart, brand preferences, etc.
7. Email Spam and Malware Filtering
Whenever we receive a new email, it is automatically filtered as important, normal, or spam. Important mail arrives in our inbox with the important symbol and spam emails go to our spam box; the technology behind this is machine learning. Below are some spam filters used by Gmail:
Content Filter
Header filter
General blacklists filter
Rules-based filters
Permission filters
Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve Bayes classifier are used for email
spam filtering and malware detection.
8.Predictions while Commuting
Traffic Predictions: We have all been using GPS navigation services. While we do that, our current locations and velocities are saved at a central server for managing traffic. This data is then used to build a map of current traffic. While this helps in preventing traffic jams and supports congestion analysis, the underlying problem is that relatively few cars are equipped with GPS. Machine learning in such scenarios helps to estimate the regions where congestion can be found on the basis of daily experience.
9. Virtual Personal Assistant:
We have various virtual personal assistants such as Google Assistant, Alexa, Cortana and Siri. As the name suggests, they help us find information using voice instructions. These assistants can help us in various ways just through our voice instructions, such as playing music, calling someone, opening an email, scheduling an appointment, etc.
10. Online Fraud Detection:
Machine learning is making our online transactions safe and secure by detecting fraudulent transactions. Whenever we perform an online transaction, there are various ways a fraudulent transaction can take place, such as fake accounts, fake IDs, and stealing money in the middle of a transaction. To detect this, a feed-forward neural network can check whether a transaction is genuine or fraudulent.

2. What are the types of Machine learning


Assignment
3. Differentiate between Supervised and Unsupervised learning
4. Explain steps to design machine learning model
1) Goal Defining – Define the goal of designing the model. Here the task that the machine should perform is decided, and how we define the success of the model is decided.
2) Data Gathering – To design a model, sufficient data is required, which will be analyzed to generate the model. Data can be historical or real-time data.
3) Data Parsing – Once the data is gathered, it is randomly divided into two parts: training data and testing data. Training data is the data given as input to the machine to create a model by analyzing it. Testing data is used to test the created model to check whether it makes correct predictions or not.
4) Model Creation – By applying some algorithm, a model is generated from the training data, which can then be tested with the testing data. The model can be used to predict the value of the target attribute.
5) Accuracy Testing – Once the model is ready, it can be tested using the testing data, and the predicted value of the target attribute is observed. If the actual and predicted values of the target attribute match, the machine has learned successfully and the model can be deployed.
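A minimal sketch of this workflow using scikit-learn; the sample dataset and the chosen algorithm are illustrative assumptions, not from the source:

```python
# Hypothetical end-to-end workflow: gather data, split it, create a model, test accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                      # data gathering (sample dataset)
X_train, X_test, y_train, y_test = train_test_split(   # data parsing: training vs. testing data
    X, y, test_size=0.3, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)  # model creation from training data
predictions = model.predict(X_test)                     # accuracy testing on unseen data
print("Accuracy:", accuracy_score(y_test, predictions))
```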
5. Write a short note on Training versus Testing.
In machine learning, training and testing are two distinct phases that ensure the development of an accurate and generalizable
model:

Training Phase:
The model learns patterns and relationships from a labeled dataset, called the training dataset.
It involves adjusting parameters using techniques like gradient descent to minimize errors.
The goal is to optimize the model's performance on the training data.

Testing Phase:
The trained model is evaluated on an unseen dataset, called the testing dataset.
This phase assesses the model's ability to generalize and perform well on new data.
Metrics such as accuracy, precision, recall, and F1 score are used for evaluation.

Key Difference:
The training phase focuses on learning from data, while the testing phase measures how well the learning generalizes to new
data. A balance between both ensures the model avoids overfitting or underfitting.

6. Write a note on Machine learning tasks.


Characteristics of ML tasks
• Machine learning is the systematic study of algorithms and systems that improve their knowledge or performance with
experience.
• Tasks: the problems that can be solved with machine learning
• Machine learning tasks are divided into two major types of models depending on whether a target variable is involved or not. If a target variable is involved it is called a predictive model; if not, it is called a descriptive model.
• Descriptive models can naturally be learned in an unsupervised setting.
• We can also cluster data with the intention of using the clusters to assign class labels to new data. We call this predictive clustering, to distinguish it from the descriptive form of clustering.
• The most common setting is supervised learning of predictive models – in fact, this is what people commonly mean when they refer to supervised learning. Typical tasks are classification and regression.
• It is also possible to use labelled training data to build a descriptive model that is not primarily intended to predict the target variable, but instead identifies, say, subsets of the data that behave differently with respect to the target variable.
• This example of supervised learning of a descriptive model is called subgroup discovery.

7. Short note on Support Vector Machine

Support Vector Machine (SVM) is a supervised learning algorithm that finds the best decision boundary (hyperplane) to separate the classes.
Linear SVM: Linear SVM is used for linearly separable data, which means that if a dataset can be classified into two classes by using a single straight line, then such data is termed linearly separable data, and the classifier used is called a Linear SVM classifier.
Non-linear SVM: Non-Linear SVM is used for non-linearly separable data, which means that if a dataset cannot be classified by using a straight line, then such data is termed non-linear data, and the classifier used is called a Non-linear SVM classifier.
Hyperplane and Support Vectors in the SVM algorithm:
Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in n-dimensional space, but we need to
find out the best decision boundary that helps to classify the data points. This best boundary is known as the hyperplane of
SVM.
Support Vectors:

The data points or vectors that are the closest to the hyperplane and which affect the position of the hyperplane are termed as
Support Vector. Since these vectors support the hyperplane, hence called a Support vector.
• The dimensions of the hyperplane depend on the number of features in the dataset: if there are 2 features, the hyperplane will be a straight line, and if there are 3 features, the hyperplane will be a 2-dimensional plane.
• We always create a hyperplane that has the maximum margin, i.e., the maximum distance between the hyperplane and the nearest data points of each class.

8. Predictive and descriptive tasks.


Predictive vs. Descriptive Tasks
Predictive and descriptive tasks are two main types of data analysis approaches in data mining and machine learning:
1. Predictive Tasks:
o Focus on forecasting or predicting unknown outcomes based on existing data.
o Require labeled datasets where the target variable (dependent variable) is known.
o Examples include:
▪ Predicting customer churn using past behavior.
▪ Forecasting stock prices based on historical trends.
o Common techniques: regression, classification, and time-series analysis.
2. Descriptive Tasks:
o Aim to uncover patterns, relationships, or summaries within the data without predicting specific outcomes.
o Work with unlabeled data to identify insights or structures.
o Examples include:
▪ Clustering customers into segments based on buying habits.
▪ Identifying frequent itemsets in transaction data (e.g., market basket analysis).
o Common techniques: clustering, association rule mining, and dimensionality reduction.
Key Difference:
Predictive tasks focus on making future predictions, while descriptive tasks concentrate on understanding and summarizing the
existing data. Both are critical for data-driven decision-making.

9. What are the methods of feature selection.


The data may contain n features, out of which only a few may be relevant to the target attribute, i.e., useful for predicting the target attribute's value or for classifying the data. Using the entire data as training data, including irrelevant attribute values, will decrease the accuracy of the model, so feature selection is very important. The features that are relevant with respect to the target variable should be selected while generating the machine learning model.
Feature selection techniques that are easy to use and also give good results:
1. Univariate Selection
2. Feature Importance
3. Correlation matrix with heatmap

Following are the methods of feature selection


Filter Method
• Here the best features are selected by using some statistical technique like the chi-square test, correlation coefficient, ANOVA test, etc. These techniques help to select the best features, i.e., the features that are highly correlated with the target attribute.
• (Note: ANOVA (Analysis of Variance) is a statistical technique that is used to check whether the means of two or more groups are significantly different.)
• x and y are said to be highly correlated if, with x as the independent variable and y as the dependent variable, y increases as x increases. E.g., the area of a circle increases with increasing radius, so radius and area are highly correlated. The correlation coefficient value varies from -1 to 1.
• So in the filter method only the features that are highly correlated with the target attribute are selected.
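A minimal sketch of the filter method using scikit-learn's univariate selection; the dataset and the choice of chi-square as the scoring function are illustrative assumptions:

```python
# Filter method: score each feature against the target with a statistical test
# (chi-square here) and keep the top-k highest-scoring features.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=chi2, k=2)   # keep the 2 best features
X_selected = selector.fit_transform(X, y)

print("Chi-square scores per feature:", selector.scores_)
print("Selected feature indices:", selector.get_support(indices=True))
```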
Wrapper method
refer below

Embedded method
10. Explain wrapper method in detail.

• The wrapper method differs from the filter method in that it does not use a statistical test for feature selection; instead, features are selected based on the performance of a model trained on them.
• There are 3 basic mechanisms in this method.
1) Forward selection – Forward selection is an iterative method in which we start with no features in the model. In each iteration we keep adding the feature which best improves the model, until adding a new variable no longer improves the performance of the model.
E.g., A, B, C, D, E are the independent features.
A -> Model -> accuracy
The model is trained with feature (attribute) A and the accuracy is checked. In the next iteration feature B is also added to train the model and the accuracy is checked again.
AB -> Model -> accuracy
Both accuracies are compared; if the accuracy improves after adding feature B in the second iteration, then B is added to the feature set. If the accuracy does not improve after adding a new feature, that feature is eliminated.
2) Backward elimination – In backward elimination we start with all the features and remove the least significant feature at each iteration, which improves the performance of the model. We repeat this until no improvement is observed on removal of features.
A statistical test can be used to find the feature that has the lowest impact on the target variable, i.e., negligible correlation between the independent variable and the target variable. A chi-square test can be done to find the feature for elimination. In the chi-square method a p-value is calculated:
if p < 0.05, the feature is useful;
if p > 0.05, the feature is not useful and can be eliminated.
3) Recursive Feature Elimination – It is a greedy optimization algorithm which aims to find the best performing feature subset. It repeatedly creates models and sets aside the best or worst performing feature at each iteration. It constructs the next model with the remaining features until all the features are exhausted. It then ranks the features based on the order of their elimination.
These techniques are useful when the dataset is very small.
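A minimal sketch of recursive feature elimination with scikit-learn; the dataset and the estimator choice are illustrative assumptions:

```python
# Recursive Feature Elimination: repeatedly fit a model, drop the weakest feature,
# and rank features by the order in which they were eliminated.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
estimator = LogisticRegression(max_iter=5000)
rfe = RFE(estimator, n_features_to_select=5)   # keep the 5 best features
rfe.fit(X, y)

print("Selected features mask:", rfe.support_)
print("Feature ranking (1 = selected):", rfe.ranking_)
```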

11. What is Expert system ? What are it’s components .Explain with diagram
What is Expert System?
● An expert system is a computer program that is designed to solve complex problems and to provide decision-making ability
like a human expert.
● It performs this by extracting knowledge from its knowledge base using the reasoning and inference rules according to the
user queries.
● The expert system is a part of AI, and the first ES was developed in the year 1970, which was the first successful approach of
artificial intelligence.
● It solves the most complex issue as an expert by extracting the knowledge stored in its knowledge base.
● The system helps in decision making for complex problems using both facts and heuristics like a human expert.
● It is called so because it contains the expert knowledge of a specific domain and can solve any complex problem of that
particular domain.
● These systems are designed for a specific domain, such as medicine, science, etc.
● The performance of an expert system is based on the expert's knowledge stored in its knowledge base.
● The more knowledge stored in the KB, the more that system improves its performance.
● One of the common examples of an ES is a suggestion of spelling errors while typing in the Google search box.

Components of ES
The components of ES include −
I. Knowledge Base
II. Inference Engine
III. User Interface

Knowledge Base
It contains domain-specific and high-quality knowledge. Knowledge is required to exhibit intelligence. The success of any ES
majorly depends upon the collection of highly accurate and precise knowledge.
What is Knowledge?
Data is a collection of facts. Information is data organized as facts about the task domain. Data, information, and past experience combined together are termed knowledge.
Knowledge representation
It is the method used to organize and formalize the knowledge in the knowledge base. It is in the form of IF-THEN-ELSE rules.
Knowledge Acquisition
The success of any expert system majorly depends on the quality, completeness, and accuracy of the information stored in the
knowledge base.
The knowledge base is formed by readings from various experts, scholars, and the Knowledge Engineers.
The knowledge engineer is a person with the qualities of empathy, quick learning, and case analyzing skills.
Inference Engine
Use of efficient procedures and rules by the Inference Engine is essential in deducting a correct, flawless solution. In case of
knowledge-based ES, the Inference Engine acquires and manipulates the knowledge from the knowledge base to arrive at a
particular solution.
The Inference Engine uses the following strategies −

Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen next?”
Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the outcome. It considers all the
facts and rules, and sorts them before concluding to a solution. This strategy is followed for working on conclusion, result, or
effect. For example, prediction of share market status as an effect of changes in interest rates.

Backward Chaining
With this strategy, an expert system finds out the answer to the question, “Why this happened?”
On the basis of what has already happened, the Inference Engine tries to find out which conditions could have happened in the
past for this result. This strategy is followed for finding out cause or reason. For example, diagnosis of blood cancer in humans.

User Interface
The user interface provides interaction between the user of the ES and the ES itself. It generally uses Natural Language Processing so that it can be used by a user who is well versed in the task domain. The user of the ES need not necessarily be an expert in Artificial Intelligence.

12. Explain Expert System with Example. (DENDRAL, MYCIN, PXDES, CaDeT)


Below are some popular examples of the Expert System:
o DENDRAL: It was an artificial intelligence project that was made as a chemical analysis expert system. It was used in organic
chemistry to detect unknown organic molecules with the help of their mass spectra and knowledge base of chemistry.
o MYCIN: It was one of the earliest backward chaining expert systems that was designed to find the bacteria causing infections
like bacteraemia and meningitis. It was also used for the recommendation of antibiotics and the diagnosis of blood clotting
diseases.
o PXDES: It is an expert system that is used to determine the type and level of lung cancer. To determine the disease, it takes a picture of the upper body, which looks like a shadow; this shadow is used to identify the type and degree of harm.
o CaDeT: The CaDet expert system is a diagnostic support system that can detect cancer at early stages.

13. Explain Characteristics of Expert System


Characteristics of Expert System
o High Performance: The expert system provides high performance for solving any type of complex problem of a specific
domain with high efficiency and accuracy.
o Understandable: It responds in a way that can be easily understandable
by the user. It can take input in human language and provides the output in the same way.
o Reliable: It is highly reliable for generating efficient and accurate output.
o Highly responsive: ES provides the result for any complex query within a very short period of time.

Unit – II
1. What is classification? How to check accuracy of binary classification
Assignment
2. Explain terminologies of classification
1) Classifier – an algorithm that maps input data to a specific category.
2) Classification Model – the model predicts the class from the input data given for training.
3) Feature – an individual measurable property or phenomenon that is being observed.
4) Binary Classification – classification with two classes. It has exactly two outcomes and maps each instance to one of the two classes, e.g., positive or negative numbers, even or odd numbers, spam mail or ham mail, etc.
5) Multiclass Classification – classification with more than two classes. Each sample is assigned one and only one label or target.

3. Write a note on confusion matrix


A confusion matrix is a table used in classification tasks to evaluate the performance of a machine learning model. It compares the actual and predicted classifications to give a detailed breakdown of the model's results. The performance of such classifiers can be summarised by means of a table known as a contingency table or confusion matrix.
Table 2.2. (left) A two-class contingency table or confusion matrix depicting the performance
of the decision tree in Figure 2.1. Numbers on the descending diagonal indicate correct predictions,
while the ascending diagonal concerns prediction errors. (right) A contingency table with
the same marginals but independent rows and columns.
In the table above, each row refers to actual classes as recorded in the test set, and each column to classes as predicted by the
classifier. So, for instance, the first row states that the test set contains 50 positives, 30 of which were correctly predicted and
20 incorrectly. The last column and the last row give the marginals (i.e., column and row sums). Marginals are important
because they allow us to assess statistical significance. For instance, the contingency table in Table 2.2 (right) has the same
marginals, but the classifier clearly makes a random choice as to which predictions are positive and which are negative – as a
result the distribution of actual positives and negatives in either predicted class is the same as the overall distribution (uniform
in this case).
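A minimal sketch of computing a confusion matrix with scikit-learn; the actual and predicted label vectors below are made-up toy data, not the 50-positive example from the textbook table:

```python
# Rows of the resulting matrix are actual classes, columns are predicted classes.
from sklearn.metrics import confusion_matrix, accuracy_score

y_actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 1 = positive, 0 = negative
y_predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

cm = confusion_matrix(y_actual, y_predicted, labels=[1, 0])
print(cm)                     # [[TP FN]
                              #  [FP TN]]
print("Accuracy:", accuracy_score(y_actual, y_predicted))
```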

4. Explain the role of confusion matrix for checking the accuracy of the model
Refer above
5. Explain Naive bayes classifier with example
Naive Bayes Classifier
The Naive Bayes Classifier is a probabilistic machine learning algorithm based on Bayes' Theorem. It assumes that features are
independent of each other (naive assumption), simplifying the computation of probabilities. It is particularly effective for
classification tasks with large datasets and works well with text data like spam filtering or sentiment analysis.

Bayes' Theorem
P(class | features) = [ P(features | class) × P(class) ] / P(features)
Working of Naive Bayes
For a new instance, Naive Bayes computes the posterior probability of each class from the class prior and the likelihood of each feature given the class (assuming the features are independent), and predicts the class with the highest posterior probability.

Example
Problem: Predict whether an email is spam or not based on two features:
- Contains the word "Offer" (yes/no).
- Contains the word "Win" (yes/no).
Advantages:
- Simple and fast.
- Handles high-dimensional data well.

Limitations:
- Assumes feature independence, which may not hold true in real-world data.
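A minimal sketch of the spam example above using scikit-learn's Bernoulli Naive Bayes; the tiny training table (presence of the words "Offer" and "Win") is invented for illustration:

```python
# Each row: [contains "Offer", contains "Win"]; label 1 = spam, 0 = not spam.
from sklearn.naive_bayes import BernoulliNB

X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [ 1,      1,      1,      0,      1,      0    ]

model = BernoulliNB().fit(X, y)
new_email = [[1, 0]]                      # contains "Offer" but not "Win"
print("Predicted class:", model.predict(new_email))              # 1 = spam
print("P(not spam), P(spam):", model.predict_proba(new_email))
```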

6. Explain scoring of classifier with example


Scoring of a Classifier
Scoring a classifier refers to evaluating its performance by calculating various metrics based on the predicted and actual
outcomes. These scores help determine the model's effectiveness in classification tasks and guide improvements.
Key Metrics for Scoring
7. Describe Class probability Estimation.
Class probability estimation is the task of predicting, for each instance, a probability (or probability distribution) over the possible classes rather than a single hard class label; the probabilities express how likely the instance is to belong to each class.
Applications
1. Medical Diagnosis: Predicting the likelihood of disease presence.
2. Fraud Detection: Estimating the probability of a transaction being fraudulent.
3. Recommendation Systems: Assigning probabilities to items a user might prefer.
Importance
● Provides insights into the model's confidence in predictions.
● Enables flexibility by adjusting thresholds to balance precision and recall.
● Essential for decision-making in high-stakes applications (e.g., healthcare, finance).
Class probability estimation enhances interpretability and enables more informed, nuanced decision-making.
8. How can a class probability Estimation be assessed? Explain with example.
Class probability estimation can be assessed by evaluating how well the predicted probabilities align with the actual outcomes. The goal is to determine whether the probabilities are calibrated, meaning the predicted likelihoods accurately reflect the real-world occurrence of the events. For example, among all instances to which a well-calibrated model assigns a probability of 0.8 for the positive class, about 80% should actually turn out positive; calibration can be checked with a reliability curve or summarised with a score such as the Brier score (the mean squared difference between predicted probabilities and actual 0/1 outcomes).
9. Explain one vs all and one vs one scheme with respect to multi class classification
• If there are multiple classes over the instance space, the classifier has to assign a particular class to each instance, i.e., predict which class the instance belongs to. There are two techniques for handling such multi-class classification with binary classifiers:
1. One-vs-All (OvA)
● Description:
For a dataset with n classes, the OvA scheme trains one binary classifier per class. Each classifier predicts whether an instance belongs to that class (1) or not (0).
○ For class C1: it predicts C1 vs. all other classes.
○ This is repeated for all n classes.
● Prediction:
For a new instance, the classifier with the highest confidence score determines the predicted class.
● Example:
For 3 classes A, B, C:
○ Train three classifiers:
■ Classifier 1: A vs. B, C
■ Classifier 2: B vs. A, C
■ Classifier 3: C vs. A, B
● Advantages:
○ Easy to implement.
○ Computationally efficient for training.
● Disadvantages:
○ Can struggle with imbalanced datasets.
○ Overlaps in decision boundaries may lead to misclassification.
2. One-vs-One (OvO)
● Description:
For a dataset with n classes, the OvO scheme trains binary classifiers for every pair of classes. This results in n(n−1)/2 classifiers.
○ Each classifier distinguishes between two classes at a time.
● Prediction:
For a new instance, each classifier votes for one of its two classes. The class with the most votes is the final prediction.
● Example:
For 3 classes A, B, C:
○ Train three classifiers:
■ Classifier 1: A vs. B
■ Classifier 2: A vs. C
■ Classifier 3: B vs. C
● Advantages:
○ Often more accurate as it focuses on simpler binary problems.
○ Handles imbalances between pairs of classes better.
● Disadvantages:
○ Computationally expensive, especially for large n.
○ More complex to implement due to the number of classifiers.
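A minimal sketch of both schemes using scikit-learn's wrappers; the dataset and the base estimator are illustrative assumptions:

```python
# One-vs-Rest trains one classifier per class; One-vs-One trains one per pair of classes.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)          # 3 classes
base = LinearSVC()

ova = OneVsRestClassifier(base).fit(X, y)
ovo = OneVsOneClassifier(base).fit(X, y)

print("OvA classifiers trained:", len(ova.estimators_))   # 3 (one per class)
print("OvO classifiers trained:", len(ovo.estimators_))   # 3 (n(n-1)/2 = 3 pairs)
```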

10. What is regression? Explain in detail.


Assignment
11. Explain performance of regression
Metrics that can be used to check the performance of a regression model are MSE (Mean Squared Error) and RMSE (Root Mean Squared Error).

Mean Squared Error is simply the average of the squared errors, i.e., the squared differences between the actual and predicted values of the dependent variable.
An MSE value of 0 indicates the ideal regression model, which is rare, whereas a higher MSE value indicates poor performance of the regression model.

Root Mean Squared Error is another method of measuring the performance of a regression model, and it is often more useful than MSE.
It is obtained simply by taking the square root of the MSE; because it is expressed in the same units as the target variable, it is easier to interpret than MSE.
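A minimal sketch of computing both metrics; the actual and predicted values below are made-up numbers:

```python
# MSE = mean of squared errors; RMSE = square root of MSE (same units as the target).
import numpy as np

y_actual    = np.array([3.0, 5.0, 7.0, 10.0])
y_predicted = np.array([2.0, 5.0, 8.0, 12.0])

mse  = np.mean((y_actual - y_predicted) ** 2)
rmse = np.sqrt(mse)
print("MSE :", mse)     # 1.5
print("RMSE:", rmse)    # ~1.22
```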
12. Explain polynomial regression in detail.
● Polynomial Regression is a regression algorithm that models the relationship between a dependent(y) and
independent variable(x) as nth degree polynomial. The Polynomial Regression equation is given below:
● y = b0 + b1x1 + b2x1^2 + b3x1^3 + ... + bnx1^n
● It is also called the special case of Multiple Linear Regression in ML. Because we add some polynomial terms to the
Multiple Linear regression equation to convert it into Polynomial Regression.
● It is a linear model with some modification in order to increase the accuracy.
● The dataset used in Polynomial regression for training is of non-linear nature.
● It makes use of a linear regression model to fit the complicated and non-linear functions and datasets.
● Hence, "In Polynomial regression, the original features are converted into Polynomial features of required degree
(2,3,..,n) and then modeled using a linear model."
The need for Polynomial Regression in ML can be understood from the points below:
● If we apply a linear model to a linear dataset, it gives a good result, as we have seen in Simple Linear Regression. But if we apply the same model, without any modification, to a non-linear dataset, it produces drastically poor results: the loss function increases, the error rate is high, and accuracy decreases.
● So for such cases, where data points are arranged in a non-linear fashion, we need the Polynomial Regression model.
● We can understand this better using a comparison diagram of a linear dataset and a non-linear dataset.
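A minimal sketch of polynomial regression with scikit-learn, converting the original feature into polynomial features and fitting a linear model; the quadratic toy data is invented for illustration:

```python
# Polynomial regression = polynomial feature expansion + ordinary linear regression.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.arange(10).reshape(-1, 1)                 # single independent variable
y = 2 + 3 * x.ravel() + 0.5 * x.ravel() ** 2     # non-linear (quadratic) relationship

X_poly = PolynomialFeatures(degree=2).fit_transform(x)   # adds x^0, x^1, x^2 columns
model = LinearRegression().fit(X_poly, y)

print("Coefficients:", model.coef_)      # ~[0, 3, 0.5]
print("Intercept:", model.intercept_)    # ~2
print("Prediction at x=12:",
      model.predict(PolynomialFeatures(degree=2).fit_transform([[12]])))
```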

13. Explain the following terms in brief: i) bias, ii) variance, iii) overfitting, iv) underfitting
1. Bias:
Bias refers to the error introduced by approximating a real-world problem, which may be complex, with a simplified model. A
high-bias model makes strong assumptions about the data, which can lead to systematic errors and underfitting.
● High Bias: The model is too simple and fails to capture the underlying patterns of the data (underfitting).
● Low Bias: The model is complex enough to represent the data accurately.
Example: A linear model trying to fit non-linear data will have high bias because it's oversimplifying the relationships.
2. Variance:
Variance refers to the model's sensitivity to small fluctuations in the training data. A model with high variance will capture noise
in the data, leading to overfitting, where the model fits the training data too closely and fails to generalize well to new data.
● High Variance: The model is too complex and fits the noise of the training data.
● Low Variance: The model is stable and less sensitive to fluctuations in the data.
Example: A very complex model (like a high-degree polynomial) might perform very well on training data but fail to generalize
to unseen data because it fits even the noise.
3. Underfitting:
Underfitting occurs when the model is too simple to capture the underlying patterns in the data. It results from high bias and
leads to poor performance on both the training and test sets.
● Cause: Usually due to a model that is not complex enough, lacks the capacity, or is trained for too few epochs.
● Consequence: The model performs poorly on both the training and test data, as it has not learned enough from the
data.
Example: Using a linear regression model to fit a dataset with clear non-linear relationships.
4. Overfitting:
Overfitting occurs when the model learns not only the underlying patterns but also the noise in the training data. It performs
very well on training data but poorly on unseen test data because it fails to generalize.
● Cause: Usually due to a model that is too complex or trained for too long.
● Consequence: The model has low error on the training set but high error on the test set.
Example: A decision tree with too many branches might perfectly classify the training data but perform poorly on new data.
● Bias: Error due to overly simplistic assumptions in the model (underfitting).
● Variance: Error due to the model being too sensitive to small changes in the training data (overfitting).
● Underfitting: Model is too simple to capture data patterns (high bias).
● Overfitting: Model is too complex, capturing both patterns and noise (high variance).

14. Difference between bias and variance


15. Explain underfitting and overfitting problem in machine learning also explain techniques to overcome it.
Assignment
16. Explain Find-S algorithm with example
Assignment

17. Explain Candidate elimination algorithm.


• The candidate elimination algorithm incrementally builds the version space given a hypothesis space H and a set E of
examples. The examples are added one by one; each example possibly shrinks the version space by removing the hypotheses
that are inconsistent with the example. The candidate elimination algorithm does this by updating the general and specific
boundary for each new example.
• Consider this as an extended form of the Find-S algorithm.
• It considers both positive and negative examples.
• Positive examples are used as in the Find-S algorithm (basically, the specific hypothesis is generalized from them).
• Negative examples are used to specialize the general hypothesis.
• Terms Used:
• Concept learning: Concept learning is basically the learning task of the machine (learn from training data).
• General Hypothesis: does not specify any feature values.
• G = {'?', '?', '?', '?', ...}: one '?' per attribute.
• Specific Hypothesis: specifies feature values.
• S = {'ϕ', 'ϕ', 'ϕ', ...}: the number of 'ϕ' entries depends on the number of attributes.
• Version Space: It is intermediate between the general hypothesis and the specific hypothesis. It is not just one hypothesis but the set of all possible hypotheses consistent with the training dataset.

Algorithm:
• Step 1: Load the dataset.
• Step 2: Initialize the General Hypothesis and the Specific Hypothesis.
• Step 3: For each training example:
• Step 4: If the example is a positive example:
• if attribute_value == hypothesis_value:
• do nothing
• else:
• replace the attribute value with '?' (basically generalizing it)
• Step 5: If the example is a negative example:
• make the general hypothesis more specific.

https://www.youtube.com/watch?v=O2wYwFOMQ24
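A minimal sketch of this simplified candidate-elimination update in Python; the weather-style training examples and attribute values are invented for illustration (not from the source), and G is specialized only against the current specific hypothesis S:

```python
# Simplified candidate elimination: generalize S on positive examples (Find-S style),
# specialize G on negative examples.
def candidate_elimination(examples, n_attrs):
    S = ['phi'] * n_attrs          # most specific hypothesis
    G = [['?'] * n_attrs]          # most general hypothesis (list of hypotheses)

    for attrs, label in examples:
        if label == 'yes':                       # positive example: generalize S
            for i, value in enumerate(attrs):
                if S[i] == 'phi':
                    S[i] = value
                elif S[i] != value:
                    S[i] = '?'
            # drop general hypotheses inconsistent with the positive example
            G = [g for g in G if all(g[i] in ('?', attrs[i]) for i in range(n_attrs))]
        else:                                    # negative example: specialize G
            new_G = []
            for g in G:
                for i in range(n_attrs):
                    if g[i] == '?' and S[i] not in ('?', 'phi') and S[i] != attrs[i]:
                        h = g.copy()
                        h[i] = S[i]
                        new_G.append(h)
            G = new_G or G
    return S, G

data = [
    (['sunny', 'warm', 'normal', 'strong'], 'yes'),
    (['sunny', 'warm', 'high',   'strong'], 'yes'),
    (['rainy', 'cold', 'high',   'strong'], 'no'),
    (['sunny', 'warm', 'high',   'strong'], 'yes'),
]
S, G = candidate_elimination(data, 4)
print('S:', S)   # specific boundary, e.g. ['sunny', 'warm', '?', 'strong']
print('G:', G)   # general boundary, e.g. [['sunny','?','?','?'], ['?','warm','?','?']]
```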

18. Explain logistic regression and it’s type

Logistic regression is a supervised learning algorithm used for classification. It models the probability of the target class by applying the logistic (sigmoid) function to a linear combination of the input features.
The output of the logistic function (a value between 0 and 1) is interpreted as a probability, and a threshold (usually 0.5) is applied to decide the class label. If P(Y=1|X) > 0.5, the instance is classified as 1; otherwise, it is classified as 0.
Types of Logistic Regression
There are three primary types of logistic regression, based on the number of classes in the classification problem:
Binary Logistic Regression
Used for classification tasks with two classes (e.g., Spam vs. Not Spam).
The model predicts the probability of one class (e.g., P(Y=1|X)) and uses a threshold (0.5) to classify the instance.
Example: Predicting whether a patient has a disease (1) or not (0) based on features like age, weight, etc.
Multinomial Logistic Regression (or Multiclass Logistic Regression)
Used when the target variable has more than two categories (classes).
It generalizes binary logistic regression by calculating the probability of each class using multiple equations (one vs all
approach).
The model outputs a probability distribution over all possible classes. The class with the highest probability is selected.
Example: Classifying animals into categories such as Dog, Cat, and Rabbit based on features like size, fur type, etc.

Ordinal Logistic Regression


● Used for classification problems where the target variable has ordered categories (e.g., ratings from 1 to 5).
● Unlike binary and multinomial logistic regression, ordinal logistic regression accounts for the inherent order in the
target categories (but not the exact distance between them).
● It is used when the outcome variable is discrete and ordered (like "low", "medium", "high").
Example: Predicting the severity of a disease as low, medium, or high based on medical data.
Advantages of Logistic Regression
● Simple and easy to implement.
● Works well when the data is linearly separable.
● Provides probabilities, which are useful for decision-making (e.g., risk assessment).
● Can handle both binary and multiclass classification tasks (with extensions).
Limitations of Logistic Regression
● Assumes a linear relationship between the independent variables and the log-odds of the dependent variable (may
not always hold in practice).
● It can underperform if there is a strong non-linear relationship in the data.
● Sensitive to outliers and multicollinearity in the features.
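A minimal sketch of binary logistic regression with scikit-learn; the dataset is an illustrative assumption:

```python
# Binary logistic regression: predicted probabilities plus a 0.5 threshold for the class label.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)          # two classes: 0 and 1
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
probs = model.predict_proba(X_test[:3])              # P(Y=0|X), P(Y=1|X) per instance
print("Probabilities:\n", probs)
print("Predicted labels:", model.predict(X_test[:3]))  # 0.5 threshold applied internally
print("Test accuracy:", model.score(X_test, y_test))
```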

19. Difference between linear and logistic regression


20. Explain importance of classification and regression with suitable example
Classification and Regression are two core types of supervised machine learning tasks that help in predicting outcomes based
on input data. Both are widely used across various domains, but they differ in the nature of the predicted output.
1. Classification
Classification involves predicting a categorical label or class for a given input. The output variable is discrete, and the goal is to
assign the correct class label from a finite set of possibilities. Classification is useful when we want to categorize objects or
events based on certain features.
Importance of Classification
● Categorizes Data: Classification is essential when we need to sort or categorize data into distinct groups.
● Decision-Making: It aids in decision-making by providing clear, discrete outcomes, often in scenarios like fraud
detection, medical diagnosis, and customer segmentation.
● Real-World Applications: It helps in practical problems like email filtering (Spam/Not Spam), sentiment analysis
(positive/negative reviews), and image recognition (identifying objects).
Example:
In a medical diagnosis system, a classification model can be used to predict whether a patient has a certain disease (e.g., "Yes"
or "No") based on medical features such as age, gender, test results, etc. The model would help doctors quickly identify high-
risk patients and recommend treatments.
● Problem: Predict if a person will buy a product or not (Yes/No).
● Model: Logistic regression or Decision Trees can be used.
● Application: E-commerce websites using this model to target potential customers with personalized offers.
2. Regression
Regression involves predicting a continuous numerical value based on input features. The output variable is continuous, and the
goal is to model the relationship between input variables and a continuous output. Regression is useful when we want to
predict a quantity or amount.
Importance of Regression
● Predicts Quantities: Regression helps in forecasting or estimating real-world quantities, such as prices, sales, or
temperatures.
● Trend Analysis: It is used to understand relationships between variables, helping businesses and researchers make
informed predictions or decisions.
● Real-World Applications: Regression is widely used in finance, economics, and engineering for price prediction, stock
market analysis, and risk assessment.
Example:
In real estate, a regression model can predict the price of a house based on features such as location, size, number of
bedrooms, and age of the house. This helps buyers and sellers estimate the market value of properties.
● Problem: Predict the price of a house.
● Model: Linear regression or Random Forest Regression.
● Application: Real estate websites or apps estimating property prices.

Unit III
1. What is least square method? Explain in detail.
Refer notes
2. Explain Univariant Linear regression.
3. Write a short note on Multivariate Linear Regression.
➢ Multivariate Regression is one of the simplest machine learning algorithms. It comes under the class of supervised learning algorithms, i.e., we are provided with a training dataset. Some of the problems that can be solved using this model are:

➢ A researcher has collected data on three psychological variables, four academic variables (standardized test scores),
and the type of educational program the student is in for 600 high school students. She is interested in how the set of
psychological variables is related to the academic variables and the type of program the student is in.

➢ A doctor has collected data on cholesterol, blood pressure, and weight. She also collected data on the eating habits
of the subjects (e.g., how many ounces of red meat, fish, dairy products, and chocolate consumed per week). She
wants to investigate the relationship between the three measures of health and eating habits.

➢ Multivariate Regression is a type of machine learning algorithm that involves multiple data variables for analysis.
➢ It is mostly considered a supervised machine learning algorithm. The steps involved in multivariate regression analysis are feature selection and feature engineering, normalizing the features, selecting the loss function and hypothesis parameters, optimizing the loss function, testing the hypothesis, and generating the regression model.
➢ The major advantage of multivariate regression is to identify the relationships among the variables associated with
the data set. It helps to find the correlation between the dependent and multiple independent variables. Multivariate
linear regression is a commonly used machine learning algorithm.
➢ Multivariate Regression helps us to measure the relationship between more than one independent variable and more than one dependent variable. It finds the (linear) relation between the variables.
➢ It is used to predict the behaviour of the outcome variable and its association with the predictor variables, and how the predictor variables are changing.
➢ It can be applied to many practical fields like politics, economics, medical, research works and many different kinds of
businesses.
➢ Multivariate regression is a simple extension of multiple regression.
➢ Examples of Multivariate Regression
○ If E-commerce Company has collected the data of its customers such as Age, purchased history of a customer, gender and
company want to find the relationship between these different dependents and independent variables.
○ A gym trainer has collected the data of his client that are coming to his gym and want to observe some things of client
that are health, eating habits (which kind of product client is consuming every week), the weight of the client. He wants
to find a relation between these variables.
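A minimal sketch of a linear regression with several independent variables using scikit-learn; the health-style columns and values are invented for illustration, and the fitted coef_ values are the regression coefficients of the multiple linear regression:

```python
# Multiple independent variables -> one dependent variable; coefficients come from least squares.
import numpy as np
from sklearn.linear_model import LinearRegression

# columns: [age, weight_kg, red_meat_servings_per_week]; target: cholesterol level (made-up data)
X = np.array([[25, 60, 1], [40, 80, 4], [35, 70, 2], [50, 90, 5], [30, 65, 1]])
y = np.array([170, 220, 190, 240, 180])

model = LinearRegression().fit(X, y)
print("Coefficients (one per independent variable):", model.coef_)
print("Intercept:", model.intercept_)
print("Prediction for a 45-year-old, 85 kg, 3 servings/week:", model.predict([[45, 85, 3]]))
```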

4. How to find coefficient of Multiple linear regression.


Refer from book

5. Explain the concept of least squares regression to find the line of best fit for the above data.
Step 1: Calculate the slope 'm' using the following formula:

m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)²   (where x̄ and ȳ are the means of x and y)

Step 2: Compute the y-intercept value:

c = ȳ − m·x̄

Step 3: Substitute the values in the final equation:

y = mx + c

Example refer from book
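A minimal sketch of these three steps in Python; the x and y values are made-up toy data:

```python
# Least-squares line of best fit: slope from deviations, intercept from the means.
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

x_mean, y_mean = x.mean(), y.mean()
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)   # Step 1: slope (0.6)
c = y_mean - m * x_mean                                               # Step 2: intercept (2.2)
print(f"Best-fit line: y = {m:.2f}x + {c:.2f}")                       # Step 3: y = mx + c
print("Prediction at x = 6:", m * 6 + c)
```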

6. Explain Artificial neuron with it’s characteristics.


An artificial neuron is a mathematical function based on a model of biological neurons, where each neuron
takes inputs, weighs them separately, sums them up and passes this sum through a nonlinear function to produce
output.
The artificial neuron has the following characteristics:
➔ A neuron is a mathematical function modeled on the working of biological neurons
➔ It is an elementary unit in an artificial neural network
➔ One or more inputs are separately weighted
➔ Inputs are summed and passed through a nonlinear function to produce output
➔ Every neuron holds an internal state called activation signal
➔ Each connection link carries information about the input signal
➔ Every neuron is connected to another neuron via connection link

7. Write a note on perceptron


➔ A single-layer perceptron is the basic unit of a neural network. A perceptron consists of input values, weights and a
bias, a weighted sum and activation function.
➔ A perceptron is a neural network unit (an artificial neuron) that does certain computations to detect features or
business intelligence in the input data.
➔ A Perceptron is an algorithm for supervised learning of binary classifiers. This algorithm enables neurons to learn and
processes elements in the training set one at a time.
➔ Perceptron is usually used to classify the data into two parts. Therefore, it is also known as a linear binary classifier.
➔ A perceptron is a single-layer neural network, and a multi-layer perceptron is called a neural network.
➔ Perceptron is a linear classifier (binary). Also, it is used in supervised learning. It helps to classify the given input data.
But how it works ?
➔ The perceptron works on these simple steps
a. All the inputs x are multiplied with their weights w. Let’s call it k.

b. Add all the multiplied values and call them Weighted Sum.

c. Apply that weighted sum to the correct Activation Function.


For Example: Unit Step Activation Function.
➔ Perceptron Learning Rule
◆ Perceptron Learning Rule states that the algorithm would automatically learn the optimal weight
coefficients. The input features are then multiplied with these weights to determine if a neuron fires or not.
◆ Perceptron is a function that maps its input "x", which is multiplied by the learned weight coefficients, to an output value "f(x)":

f(x) = 1 if w · x + b > 0, and 0 otherwise

In the equation given above:

● “w” = vector of real-valued weights


● “b” = bias (an element that adjusts the boundary away from origin without any dependence on the input value)
● “x” = vector of input x values
● “m” = number of inputs to the Perceptron

◆ The output can be represented as “1” or “0.” It can also be represented as “1” or “-1” depending on which
activation function is used.
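A minimal sketch of a single perceptron trained with the perceptron learning rule on the AND function; the learning rate, number of epochs, and data are illustrative assumptions:

```python
# Perceptron: weighted sum + unit-step activation; weights updated by the perceptron learning rule.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # inputs
y = np.array([0, 0, 0, 1])                       # AND targets

w = np.zeros(2)       # weights
b = 0.0               # bias
lr = 0.1              # learning rate

for epoch in range(10):
    for xi, target in zip(X, y):
        output = 1 if np.dot(w, xi) + b > 0 else 0   # unit-step activation
        error = target - output
        w += lr * error * xi                          # perceptron learning rule
        b += lr * error

print("Weights:", w, "Bias:", b)
print("Predictions:", [1 if np.dot(w, xi) + b > 0 else 0 for xi in X])   # expect [0, 0, 0, 1]
```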
8. Describe Soft Margin SVM.
Support Vector Machines (SVMs) are powerful supervised learning algorithms primarily used for classification tasks. A Soft
Margin SVM is an extension of the standard Hard Margin SVM, designed to handle data that is not perfectly linearly separable.
It allows for some misclassifications by introducing a margin of tolerance, which makes the model more flexible and robust
when dealing with noisy or overlapping data. In a typical Hard Margin SVM, the goal is to find a hyperplane that perfectly
separates the two classes with the largest possible margin, without allowing any misclassification. However, in real-world
scenarios, data is often noisy, and a perfect separation may not be possible. Soft Margin SVM allows for some misclassifications
to achieve a better overall model that generalizes well.
The key concept of Soft Margin SVM is to balance between maximizing the margin and allowing some misclassifications.
Example of Soft Margin SVM
Consider a binary classification problem where we want to classify data points into two classes: positive (+1) and negative (-1).
Some data points may be noisy, and perfect separation is not possible.
Using Soft Margin SVM, the classifier will find a hyperplane that separates the two classes while allowing some data points to fall within the margin or be misclassified. The regularization parameter C controls the trade-off between margin width and the number of misclassifications.
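A minimal sketch with scikit-learn, where the C parameter of SVC controls how soft the margin is; the noisy two-class data is generated for illustration:

```python
# Smaller C -> wider, softer margin (more misclassifications tolerated);
# larger C -> narrower margin that tries harder to classify every point correctly.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, flip_y=0.1, random_state=0)  # noisy, overlapping classes

soft = SVC(kernel="linear", C=0.1).fit(X, y)    # soft margin
hard = SVC(kernel="linear", C=1000).fit(X, y)   # approaches a hard margin

print("Support vectors (C=0.1) :", len(soft.support_))
print("Support vectors (C=1000):", len(hard.support_))
print("Training accuracy C=0.1 :", soft.score(X, y))
print("Training accuracy C=1000:", hard.score(X, y))
```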
Unit IV
1. Explain KNN classification in detail.
Assignment
2. What is clustering ? Types of clustering.
Assignment
3. Explain K-means clustering algorithm with example.
•K-Means Clustering is an Unsupervised Learning algorithm, which groups the unlabeled dataset into different clusters. Here K
defines the number of pre-defined clusters that need to be created in the process, as if K=2, there will be two clusters, and for
K=3, there will be three clusters, and so on.
The working of the K-Means algorithm is explained in the below steps:
Step-1: Select the number K to decide the number of clusters.
Step-2: Select K random points as centroids (they need not be from the input dataset).
Step-3: Assign each data point to its closest centroid, which will form the predefined K clusters.
Step-4: Calculate the variance and place a new centroid for each cluster.
Step-5: Repeat the third step, i.e., reassign each data point to the new closest centroid of its cluster.
Step-6: If any reassignment occurred, go to Step-4; otherwise go to FINISH.
Step-7: The model is ready.
(For an example refer below; if only the algorithm is asked, refer to the steps above.)
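A minimal sketch of K-Means with scikit-learn; the value of K and the sample blob data are illustrative assumptions:

```python
# K-Means: choose K, assign points to the nearest centroid, recompute centroids, repeat.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)   # unlabeled data with 3 natural groups

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("Cluster centroids:\n", kmeans.cluster_centers_)
print("Labels of first 10 points:", kmeans.labels_[:10])
print("New point assigned to cluster:", kmeans.predict([[0.0, 0.0]]))
```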

4. Explain Heirarchical clustering in detail


Assignment
5. Explain single linkage and complete linkage with example
Refer Book
6. How to measure association with respect to Association Rule Mining? Explain in detail.

7. What a short note on decision tree


Decision Tree is a Supervised learning technique that can be used for both classification and Regression problems, but
mostly it is preferred for solving Classification problems. It is a tree-structured classifier, where internal nodes
represent the features of a dataset, branches represent the decision rules and each leaf node represents the
outcome.
In a Decision tree, there are two nodes, which are the Decision Node and Leaf Node. Decision nodes are used to make
any decision and have multiple branches, whereas Leaf nodes are the output of those decisions and do not contain
any further branches.
The decisions or the test are performed on the basis of features of the given dataset.
It is a graphical representation for getting all the possible solutions to a problem/decision based on given conditions.
It is called a decision tree because, similar to a tree, it starts with the root node, which expands on further branches
and constructs a tree-like structure.
In order to build a tree, we use the CART algorithm, which stands for Classification and Regression Tree algorithm.
A decision tree simply asks a question and, based on the answer (Yes/No), further splits the tree into subtrees.
The diagram below explains the general structure of a decision tree:

8. Explain the following terms wrt decision tree:Splitting, decision node, pruning, sub tree
● Root Node: Root node is from where the decision tree starts. It represents the entire dataset, which further gets
divided into two or more homogeneous sets.
● Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated further after getting a leaf node.
● Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according to the given
conditions.
● Branch/Sub Tree: A tree formed by splitting the tree.
● Pruning: Pruning is the process of removing the unwanted branches from the tree.
● Parent/Child node: The root node of the tree is called the parent node, and other nodes are called the child nodes.

9. How does decision tree algorithm work?


● In a decision tree, for predicting the class of the given dataset, the algorithm starts from the root node of the tree.
This algorithm compares the values of root attribute with the record (real dataset) attribute and, based on the
comparison, follows the branch and jumps to the next node.
● For the next node, the algorithm again compares the attribute value with the other sub-nodes and move further. It
continues the process until it reaches the leaf node of the tree.
● The complete process can be better understood using the below algorithm:
○ Step-1: Begin the tree with the root node, says S, which contains the complete dataset.
○ Step-2: Find the best attribute in the dataset using Attribute Selection Measure (ASM).
○ Step-3: Divide the S into subsets that contains possible values for the best attributes.
○ Step-4: Generate the decision tree node, which contains the best attribute.
○ Step-5: Recursively make new decision trees using the subsets of the dataset created in step -3. Continue
this process until a stage is reached where you cannot further classify the nodes and called the final node as
a leaf node.
● Example: Suppose there is a candidate who has a job offer and wants to decide whether he should accept the offer or
Not. So, to solve this problem, the decision tree starts with the root node (Salary attribute by ASM). The root node
splits further into the next decision node (distance from the office) and one leaf node based on the corresponding
labels. The next decision node further gets split into one decision node (Cab facility) and one leaf node. Finally, the
decision node splits into two leaf nodes (Accepted offers and Declined offer). Consider the below diagram:
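A minimal sketch of training a decision tree with scikit-learn's CART implementation; the dataset and the criterion choice are illustrative assumptions:

```python
# CART decision tree: splits are chosen by an attribute-selection measure ("entropy" here
# corresponds to information gain); the fitted tree can be printed as if/else rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(data.data, data.target)

print(export_text(tree, feature_names=list(data.feature_names)))   # the learned decision rules
print("Prediction:", data.target_names[tree.predict([[5.0, 3.5, 1.3, 0.3]])[0]])
```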

10. Explain Attribute selection method a) Information Gain b) Entrophy


Information Gain:
● Information gain is the measurement of changes in entropy after the segmentation of a dataset based on an
attribute.
● It calculates how much information a feature provides us about a class.
● According to the value of information gain, we split the node and build the decision tree.
● A decision tree algorithm always tries to maximize the value of information gain, and a node/attribute
having the highest information gain is split first. It can be calculated using the below formula:
Information Gain = Entropy(S) − [(Weighted Avg) × Entropy(each feature)]

Entropy:
Entropy is a metric to measure the impurity in a given attribute. It specifies randomness in data. Entropy can be
calculated as:

Entropy(s)= -P(yes)log2 P(yes)- P(no) log2 P(no)


Where,
S= Total number of samples
P(yes)= probability of yes
P(no)= probability of no
Refer to the example: calculate the information gain of the Weather attribute.
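A minimal sketch of computing entropy and information gain in Python; the tiny (weather, play?) table is invented for illustration:

```python
# Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no);
# Information Gain = Entropy(S) - weighted average entropy after splitting on the attribute.
from math import log2
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values() if c)

# (weather, play?) pairs -- made-up sample data
data = [("sunny", "no"), ("sunny", "no"), ("overcast", "yes"), ("rainy", "yes"),
        ("rainy", "yes"), ("rainy", "no"), ("overcast", "yes"), ("sunny", "yes")]

labels = [play for _, play in data]
base_entropy = entropy(labels)

weighted = 0.0
for value in set(w for w, _ in data):
    subset = [play for w, play in data if w == value]
    weighted += len(subset) / len(data) * entropy(subset)

print("Entropy(S):", round(base_entropy, 3))
print("Information gain of Weather:", round(base_entropy - weighted, 3))
```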

Unit V
1. Explain Ensemble Method
Ensemble methods are machine learning techniques that combine predictions from multiple individual models to improve
overall performance. The goal of ensemble learning is to reduce errors, improve accuracy, and create a robust model that
outperforms individual models. By aggregating predictions, ensemble methods can better handle variance, bias, and model
overfitting.
Types of Ensemble Methods
Bagging (Bootstrap Aggregating):(refer below)
Boosting:(refer below)
Stacking:
● Definition: Combines multiple base models by training a meta-model (a second-level model) that learns to aggregate
the predictions of base models.
● Example: Combining predictions from logistic regression, decision trees, and SVMs into a single model.
● Advantages: Exploits the strengths of different algorithms.
Voting:
● Definition: Aggregates predictions from multiple models using voting (for classification) or averaging (for regression).
● Types:
○ Hard Voting: Takes the majority class label from all models.
○ Soft Voting: Takes the average of predicted probabilities and selects the class with the highest probability.
● Example: Combining predictions from decision trees, SVM, and k-NN classifiers.
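A minimal sketch of the voting ensemble with scikit-learn; the base models and dataset are illustrative assumptions:

```python
# Hard voting: majority class label; soft voting would average predicted probabilities instead.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
ensemble = VotingClassifier(
    estimators=[("tree", DecisionTreeClassifier()),
                ("svm", SVC()),
                ("knn", KNeighborsClassifier())],
    voting="hard")                      # each model gets one vote per prediction

ensemble.fit(X, y)
print("Ensemble training accuracy:", ensemble.score(X, y))
```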

2. What is Bagging?
● Bagging often considers homogeneous weak learners, learns them independently from each other in parallel, and combines them following some kind of deterministic averaging process.
● Bagging is used when our objective is to reduce the variance of a decision tree. The idea is to create several subsets of data from the training sample, chosen randomly with replacement. Each subset is then used to train its own decision tree, so we end up with an ensemble of different models. The average of the predictions from the numerous trees is used, which is more robust than a single decision tree (a minimal code sketch follows this list).
● Random Forest is an extension of bagging. It takes one additional step: besides sampling a random subset of the data, it also makes a random selection of features rather than using all features to grow each tree. When we have numerous such random trees, the ensemble is called a Random Forest.
● The following steps are taken to implement a Random Forest:
○ Consider X observations and Y features in the training data set. First, a sample from the training data set is taken randomly with replacement.
○ The tree is grown to its maximum depth.
○ The above steps are repeated, and the prediction is given based on the collection of predictions from the n trees.
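A minimal sketch of bagging with scikit-learn's BaggingClassifier, using the built-in breast-cancer dataset only as placeholder data:

```python
# Minimal sketch: bagging decision trees with scikit-learn's BaggingClassifier,
# which draws bootstrap samples (random subsets with replacement) for each tree.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

bagger = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # the homogeneous weak learner
                                         # (named base_estimator in scikit-learn < 1.2)
    n_estimators=50,                     # number of trees trained independently, in parallel
    bootstrap=True,                      # sample the training data with replacement
    random_state=0,
)

print(cross_val_score(bagger, X, y, cv=5).mean())
```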
3. What is Boosting?
Boosting:
● Boosting often considers homogeneous weak learners, learns them sequentially in a very adaptive way (each base model depends on the previous ones), and combines them following a deterministic strategy.
● Boosting is another ensemble procedure for building a collection of predictors. In other words, we fit consecutive trees, usually on random samples, and at each step the objective is to reduce the net error left by the prior trees.
● If a given input is misclassified by a hypothesis, its weight is increased so that the next hypothesis is more likely to classify it correctly; combining the entire set at the end converts weak learners into a better-performing model.
● Gradient Boosting is an extension of the boosting procedure (a minimal code sketch follows this list).
● Gradient Boosting = Gradient Descent + Boosting
● It utilizes a gradient descent algorithm that can optimize any differentiable loss function. An ensemble of trees is constructed one by one, and the individual trees are summed sequentially. Each next tree tries to recover the loss (the difference between the actual and predicted values) left by the previous trees.
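A minimal sketch of gradient boosting with scikit-learn's GradientBoostingClassifier; the dataset and hyperparameter values are illustrative assumptions:

```python
# Minimal sketch: gradient boosting with scikit-learn, where each new tree is fit
# on the residual error left by the trees built before it.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

booster = GradientBoostingClassifier(
    n_estimators=100,    # trees are added one after another (sequentially)
    learning_rate=0.1,   # shrinks each tree's contribution (gradient-descent step size)
    max_depth=3,
    random_state=0,
)

print(cross_val_score(booster, X, y, cv=5).mean())
```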
4. Difference between Bagging and Boosting
Bagging vs Boosting:
● Training data: In bagging, various training data subsets are drawn randomly with replacement from the whole training dataset. In boosting, each new subset contains the examples that were misclassified by previous models.
● Goal: Bagging attempts to tackle the over-fitting issue (it tries to reduce variance). Boosting tries to reduce bias.
● When to use: If the classifier is unstable (high variance), apply bagging. If the classifier is steady and simple (high bias), apply boosting.
● Model weights: In bagging, every model receives an equal weight. In boosting, models are weighted by their performance.
● Objective: Bagging aims to decrease variance, not bias. Boosting aims to decrease bias, not variance.
● Combination: Bagging is the simplest way of combining predictions that belong to the same type. Boosting is a way of combining predictions that belong to different types.
● Dependence: In bagging, every model is constructed independently. In boosting, new models are affected by the performance of previously built models.
5. When to use multi-task learning? Explain Multitask learning in detail.
Multitask Learning (MTL) is a machine learning paradigm where a model is trained to perform multiple tasks simultaneously.
Instead of building separate models for each task, MTL leverages shared knowledge across tasks to improve performance on all
tasks. This approach assumes that tasks are related and can benefit from each other through shared representations.
Key Concepts of Multitask Learning
● Task Relatedness: The tasks in MTL must be related or share some commonalities, such as similar data distributions or
underlying features.
● Shared Representations: MTL promotes learning of shared features across tasks, which helps the model generalize
better.
● Auxiliary Tasks: In some cases, auxiliary tasks (additional tasks) are used to help the model learn better
representations for the main task.
When to Use Multitask Learning
● Related Tasks: When there are multiple tasks that share a significant amount of information (e.g., shared features or
similar labels).
○ Example: Predicting both sentiment polarity and emotion category from the same text.
● Limited Data: When there is insufficient data for some tasks, MTL helps by leveraging data from other related tasks.
○ Example: Training models for multiple languages in NLP using a shared dataset.
● Feature Extraction: When tasks benefit from learning shared lower-level representations.
○ Example: Image classification and object detection tasks on the same images.
● Efficiency: To reduce computational cost by training a single model for multiple tasks instead of training separate models (a minimal architecture sketch follows this list).
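One common way to realise MTL is hard parameter sharing: a shared encoder plus one small head per task. The sketch below assumes PyTorch and invented dimensions for a sentiment-plus-emotion setup; it is an illustration, not a prescribed architecture.

```python
# Minimal sketch (assumed architecture): hard parameter sharing for multitask
# learning in PyTorch -- one shared encoder plus one head per task.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, in_dim, hidden_dim, n_sentiments, n_emotions):
        super().__init__()
        # Shared representation learned from both tasks
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        # Task-specific heads (e.g., sentiment polarity and emotion category)
        self.sentiment_head = nn.Linear(hidden_dim, n_sentiments)
        self.emotion_head = nn.Linear(hidden_dim, n_emotions)

    def forward(self, x):
        shared = self.encoder(x)
        return self.sentiment_head(shared), self.emotion_head(shared)

model = MultiTaskNet(in_dim=300, hidden_dim=128, n_sentiments=2, n_emotions=6)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 300)                 # a dummy batch of text features
y_sent = torch.randint(0, 2, (8,))      # dummy sentiment labels
y_emo = torch.randint(0, 6, (8,))       # dummy emotion labels

sent_logits, emo_logits = model(x)
# The total loss is a (possibly weighted) sum of the per-task losses
loss = loss_fn(sent_logits, y_sent) + loss_fn(emo_logits, y_emo)
loss.backward()
```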
6. Explain Random Forest Algorithm
•Random Forest works in two phases: the first is to create the random forest by combining N decision trees, and the second is to make predictions from each tree created in the first phase.
•The working process can be explained with the steps below and the diagram:
•Step-1: Select K random data points from the training set.
•Step-2: Build the decision trees associated with the selected data points (subsets).
•Step-3: Choose the number N of decision trees that you want to build.
•Step-4: Repeat Steps 1 & 2 until N trees are built.
•Step-5: For a new data point, collect the prediction of each decision tree and assign the new data point to the category that wins the majority vote.
The working of the algorithm can be better understood by the example below (a code sketch follows it):
Example: Suppose there is a dataset that contains multiple fruit images. This dataset is given to the Random Forest classifier. The dataset is divided into subsets and each subset is given to a decision tree. During the training phase, each decision tree produces a prediction result, and when a new data point occurs, the Random Forest classifier predicts the final decision based on the majority of the results. Consider the below image:
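A minimal code sketch of the workflow described above, assuming scikit-learn's RandomForestClassifier and the Iris dataset as stand-in data:

```python
# Minimal sketch: a Random Forest classifier in scikit-learn. n_estimators is the
# number N of decision trees; each tree is trained on a bootstrap sample and a
# random subset of features, and the final class is chosen by majority vote.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)

print(accuracy_score(y_test, forest.predict(X_test)))
```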
7. Write a brief note on Sequence Prediction.
•Sequence prediction involves predicting the next value for a given input sequence (a minimal code sketch is given at the end of this answer).
For example:
•Given: 1, 2, 3, 4, 5
•Predict: 6
Sequence prediction attempts to predict elements of a sequence on the basis of the preceding elements
“A prediction model is trained with a set of training sequences. Once trained, the model is used to perform sequence
predictions. A prediction consists in predicting the next items of a sequence.” This task has numerous applications such as web
page prefetching, consumer product recommendation, weather forecasting and stock market prediction.
Applications
● Time-Series Forecasting:
○ Predicting stock prices, weather, or sales trends based on historical data.
● Natural Language Processing (NLP):
○ Predicting the next word in a sentence (e.g., autocomplete in keyboards).
● Recommender Systems:
○ Suggesting the next movie, song, or product based on user behavior.
● Biological Sequences:
○ Predicting DNA or protein sequences for genetic research.
● Autonomous Systems:
○ Predicting future actions in robotics or navigation tasks.
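A minimal sketch of next-value sequence prediction using a sliding window and a linear model; the window size and the toy sequence are assumptions for illustration:

```python
# Minimal sketch: next-value sequence prediction with a sliding window and a
# linear model. The window size and the data below are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

sequence = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
window = 3

# Turn the sequence into (previous `window` values -> next value) training pairs
X = np.array([sequence[i:i + window] for i in range(len(sequence) - window)])
y = sequence[window:]

model = LinearRegression().fit(X, y)

# Predict the value that follows the last observed window, i.e. after 8, 9, 10
print(model.predict([sequence[-window:]]))   # expected to be close to 11
```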
8. What is Streaming? How does streaming data work?
The term "streaming" is used to describe continuous, never-ending data streams with no beginning or end, that provide a
constant feed of data that can be utilized/acted upon without needing to be downloaded first.
A simple analogy is how water flows through a river or creek. The streams come from various sources, in varying speed and
volumes and flow into a single, continuous, combined stream.Similarly, data streams are generated by all types of sources, in
various formats and volumes. From applications, networking devices, and server log files, to website activity, banking
transactions, and location data, they can all be aggregated to seamlessly gather real-time information and analytics from one
source of truth.
How Streaming Data Works
Data processing is not new. In previous years, legacy infrastructure was much more structured because it only had a handful of sources that generated data, and the entire system could be architected in a way to specify and unify the data and the data structures.
Modern data is generated by an almost infinite number of sources, whether hardware sensors, servers, mobile devices, applications, or web browsers, both internal and external, and it is almost impossible to regulate or enforce the structure, volume, or frequency of the data generated.
Applications that analyze and process data streams need to process one data packet at a time, in sequential order. Each data packet generated includes the source and a timestamp so that applications can work with the stream.
Applications working with data streams always require two main functions: storage and processing. Storage must be able to record large streams of data in a way that is sequential and consistent. Processing must be able to interact with storage, and to consume, analyze, and run computations on the data. A minimal sketch of this storage-and-processing pattern is given below.
This also brings up additional challenges and considerations when working with data streams. Many platforms and tools are now available to help companies build streaming data applications.
Some common examples of streaming data are real-time stock trades, retail inventory management, ride-sharing apps, and multiplayer games.
For example, when a passenger books a cab, real-time streams of data come together to create the best user experience. Through this data, the application pieces together real-time location tracking, traffic statistics, pricing, and historical traffic and pricing data to work out how much the ride should cost, based on both real-time and past data.
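Below is a minimal sketch of the storage-and-processing pattern, simulating a stream with a Python generator; the event fields and the rolling-average computation are assumptions, and a real system would typically use a streaming platform such as Kafka or Kinesis.

```python
# Minimal sketch: the storage + processing pattern for a data stream, simulated
# with a Python generator. Event fields below are assumptions for illustration.
import random
import time
from collections import deque

def ride_events():
    """Continuously yield timestamped events from a (simulated) source."""
    while True:
        yield {"source": "rider-app", "timestamp": time.time(),
               "fare": round(random.uniform(5, 50), 2)}
        time.sleep(0.1)

recent = deque(maxlen=100)        # "storage": keep a sequential, bounded record
for i, event in enumerate(ride_events()):
    recent.append(event)          # record each packet in arrival order
    avg_fare = sum(e["fare"] for e in recent) / len(recent)   # "processing"
    print(f"rolling average fare over last {len(recent)} events: {avg_fare:.2f}")
    if i >= 20:                   # stop the demo after a few packets
        break
```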
9. What is Active Learning? How does active learning work?
Active Learning is a special case of supervised machine learning. This approach is used to construct a high-performance classifier while keeping the size of the labelled training dataset to a minimum by actively selecting the most valuable data points.
Where should we apply active learning?
● We have a very small amount of labelled data, or a huge amount of data that cannot all be labelled.
● Annotation of the unlabeled dataset costs human effort, time, and money.
● We have access to limited processing power.
•On a certain planet, there are various fruits of different sizes (1-5); some of them are poisonous and others are not. The only criterion for deciding whether a fruit is poisonous is its size. Our task is to train a classifier which predicts whether a given fruit is poisonous. The only information we have is that a fruit of size 1 is not poisonous, a fruit of size 5 is poisonous, and beyond a particular size all fruits are poisonous.
•The first approach is to check each and every size of fruit, which consumes time and resources.
•The second approach is to apply binary search and find the transition point (decision boundary). This approach uses fewer labelled data points and gives the same result as the linear search (a minimal sketch is given below).
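A minimal sketch of the binary-search approach to the fruit example; the labelling oracle below stands in for a human annotator, and the threshold value is an assumption.

```python
# Minimal sketch of the fruit example: instead of labelling every size (linear
# search), binary search queries only the most informative sizes to find the
# transition point between "not poisonous" and "poisonous".
def is_poisonous(size, threshold=3):
    """Labelling oracle (assumed): fruits of size >= threshold are poisonous."""
    return size >= threshold

def find_boundary(lo=1, hi=5):
    queries = 0
    # Invariant: a fruit of size `lo` is safe, a fruit of size `hi` is poisonous
    while hi - lo > 1:
        mid = (lo + hi) // 2
        queries += 1                     # each query = one label actively requested
        if is_poisonous(mid):
            hi = mid
        else:
            lo = mid
    return hi, queries

boundary, n_queries = find_boundary()
print(f"smallest poisonous size: {boundary}, labels requested: {n_queries}")
```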
10. What is Deep Learning? Explain in detail.
11. What is reinforcement learning? How does it work?
12. Explain Agent and goals of Agent
Refer Book
13. Explain Intelligent Agent.
14. What are the Categories of Intelligent Systems?
15. Explain a) Agent and Object b) Agent and Expert System