MMDS

Unit-I

SHORT QUESTIONS
1. What is data mining? List a few techniques of data mining.
a) Data mining is the process of searching and analyzing a large batch of raw data in order to
identify patterns and extract useful information.
Companies use data mining software to learn more about their customers. It can help them
develop more effective marketing strategies, increase sales, and decrease costs. Data mining
relies on effective data collection, warehousing, and computer processing.
Techniques of data mining:
1) Association rules
2) Clustering
3) Prediction
4) k-nearest neighbors (k-NN)
5) Decision trees
6) Neural networks
7) Classification

2. Define Hash Functions and Natural Logarithms


a) Hash Functions: In data mining, a hash function is a mathematical function that takes an
input (or "key") and produces a fixed-size output, typically a numerical value or a
hexadecimal string. This output is commonly referred to as the hash code or hash value.
Hash functions are used for various purposes in data mining, including data indexing, data
deduplication, data summarization, and more.
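
As a minimal sketch, here is one way to hash record keys into index buckets using Python's standard hashlib module (the key and bucket count are made up for illustration):

import hashlib

def bucket_for(key: str, num_buckets: int = 16) -> int:
    # Map a record key to one of num_buckets index buckets.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()  # fixed-size hex hash value
    return int(digest, 16) % num_buckets                   # fold the hash into a bucket index

print(bucket_for("customer-42"))  # the same key always maps to the same bucket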

Natural Logarithms: Data transformation using natural logarithms can help normalize data,
making it suitable for various statistical and machine learning techniques. For example, when
working with skewed data distributions, taking the logarithm of the data can make it more
amenable to linear modeling and analysis.
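
For instance, a short sketch with assumed values, showing how the natural logarithm compresses a right-skewed range:

import numpy as np

# Right-skewed values (e.g., purchase amounts); the numbers are made up
amounts = np.array([3.0, 5.0, 8.0, 12.0, 250.0, 900.0])

log_amounts = np.log(amounts)   # the natural log pulls in the large values
print(log_amounts.round(2))     # [1.1  1.61 2.08 2.48 5.52 6.8 ]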

3. What is machine learning?


A) Machine learning is a set of methods, tools, and computer algorithms used to train
machines to analyze, understand, and find hidden patterns in data and make predictions. The
eventual goal of machine learning is to utilize data for self-learning, eliminating the need to
program machines explicitly. Once trained on datasets, machines can apply the learned
patterns to new data and thus make better predictions.

4. Differentiate supervised and unsupervised learning.


A)

- Training data: Supervised learning algorithms are trained using labeled data; unsupervised learning algorithms are trained using unlabeled data.
- Feedback: A supervised learning model takes direct feedback to check whether it is predicting the correct output; an unsupervised learning model does not take any feedback.
- Objective: A supervised learning model predicts the output; an unsupervised learning model finds the hidden patterns in the data.
- Inputs: In supervised learning, input data is provided to the model along with the output; in unsupervised learning, only input data is provided.
- Goal: The goal of supervised learning is to train the model so that it can predict the output when given new data; the goal of unsupervised learning is to find hidden patterns and useful insights in an unknown dataset.
- Supervision: Supervised learning needs supervision to train the model; unsupervised learning does not.
- Problem types: Supervised learning is categorized into classification and regression problems; unsupervised learning is classified into clustering and association problems.
- Applicability: Supervised learning is used where we know the inputs as well as the corresponding outputs; unsupervised learning is used where we have only input data and no corresponding output data.
- Accuracy: A supervised learning model generally produces accurate results; an unsupervised learning model may give less accurate results by comparison.
- Relation to AI: Supervised learning is considered less close to true artificial intelligence, since we first train the model on labeled data before it can predict correct outputs; unsupervised learning is considered closer to true artificial intelligence, as it learns from experience much as a child learns daily routines.
- Algorithms: Supervised learning includes algorithms such as linear regression, logistic regression, support vector machines, multi-class classification, decision trees, and Bayesian logic; unsupervised learning includes algorithms such as clustering, KNN, and the Apriori algorithm.

5. Define feature extraction and list any two feature extraction techniques.


a) Feature extraction is a crucial aspect of data mining, particularly when dealing with large
and complex datasets. In data mining, feature extraction refers to the process of selecting,
transforming, or creating new features from the raw data to prepare it for analysis.
Before data mining can be effectively performed, raw data needs to be preprocessed. This
often involves handling missing values, dealing with outliers, and cleaning the data. Once the
data is prepared, feature extraction comes into play.
Two feature extraction techniques are:
1) Principal Component Analysis (PCA)
2) Linear Discriminant Analysis (LDA)

ESSAY QUESTIONS
1. Explain statistical modeling.

A) Statistical modeling is the process of describing the connections between variables in a
dataset using mathematical equations and statistical approaches. In statistical modeling, we
use a collection of statistical methods to investigate the connections between variables and
uncover patterns in data.

Predicting the number of people who will travel on a specific rail route is an example of
statistical modeling. To develop a statistical model, we would collect data on the number of
passengers who utilize the train route over time, as well as data on variables that might affect
passenger counts, such as time of day, day of the week, and weather.

Then, using statistical approaches such as regression analysis, we can determine the
correlations between these factors and the number of passengers using the railway route.
For example, we might discover that the number of passengers is higher during rush hour
and on weekdays, and lower when it is raining.

We can apply this data to build a statistical model that forecasts the number of people who
will use the railway route depending on the time of day, day of the week, and weather
conditions. This model can then be used to anticipate future passenger numbers and to make
resource-allocation choices, such as adding extra trains during rush hour or offering
promotions during severe weather.

It is essential in statistical modeling to pick an appropriate statistical model that fits the data
and to evaluate the model to ensure accuracy and reliability. This might include running the
model on a new set of data or employing statistical tests to assess the model’s performance.
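
As a minimal sketch of such a model, here is an ordinary least-squares fit in Python on hypothetical ridership data (all numbers are made up):

import numpy as np

# Hypothetical observations: [hour_of_day, is_weekday, is_raining] -> passengers
X = np.array([[8, 1, 0], [9, 1, 0], [14, 1, 0],
              [8, 0, 0], [18, 1, 1], [18, 1, 0]], dtype=float)
y = np.array([520, 480, 210, 150, 390, 450], dtype=float)

# Fit passengers ~ b0 + b1*hour + b2*weekday + b3*rain by least squares
A = np.hstack([np.ones((len(X), 1)), X])    # prepend an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

new_trip = np.array([1.0, 8.0, 1.0, 1.0])   # intercept, 8am, weekday, raining
print(coef, new_trip @ coef)                # fitted coefficients and a forecast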

2. What are the various computational approaches to modeling?


A) In data mining, computational approaches to modeling involve the use of various
algorithms and techniques to analyze and extract valuable patterns, knowledge, and insights
from large and complex datasets. These approaches help in making predictions, identifying
trends, and uncovering hidden relationships within the data. Here are some key computational
approaches to modeling in data mining:
Supervised Learning:
● Classification: Classification models are used to categorize data into
predefined classes or labels. Algorithms like decision trees, support vector
machines, and neural networks are commonly employed for this task.

● Regression: Regression models predict a continuous numerical value or output
based on input features. Linear regression, polynomial regression, and
regression trees are examples of regression techniques.
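
A brief sketch of both supervised tasks with scikit-learn (the training data is toy data, invented for illustration):

from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression

# Classification: predict a discrete label from two features
X_cls = [[0, 0], [1, 1], [0, 1], [1, 0]]
y_cls = ["spam", "ham", "spam", "ham"]
clf = DecisionTreeClassifier().fit(X_cls, y_cls)
print(clf.predict([[1, 1]]))    # -> ['ham']

# Regression: predict a continuous value from one feature
X_reg = [[1], [2], [3], [4]]
y_reg = [2.1, 3.9, 6.2, 8.0]
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[5]]))       # roughly 10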

Unsupervised Learning:
● Clustering: Clustering algorithms group similar data points into clusters or
segments. K-means clustering, hierarchical clustering, and DBSCAN are
popular methods for unsupervised clustering.
● Association Rules: Association rule mining, often used in market basket
analysis, identifies relationships between items in a dataset. Apriori and
FP-growth are well-known algorithms for this purpose.
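
For example, a minimal k-means sketch with scikit-learn on toy 2-D points:

from sklearn.cluster import KMeans

# Six points forming two loose groups (invented for illustration)
points = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(km.labels_)            # cluster assignment for each point
print(km.cluster_centers_)   # the two learned centroids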

Dimensionality Reduction:
● Principal Component Analysis (PCA): PCA is used to reduce the
dimensionality of data while preserving as much variance as possible. It is
valuable for visualizing high-dimensional data and eliminating
multicollinearity.

● t-Distributed Stochastic Neighbor Embedding (t-SNE): t-SNE is a non-linear
dimensionality reduction technique that is especially useful for visualizing and
exploring complex data patterns.
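
A small PCA sketch with scikit-learn; the 5-feature dataset here is random, purely to show the shapes involved:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 5))        # assumed 100-row, 5-feature dataset

pca = PCA(n_components=2)
reduced = pca.fit_transform(data)       # project onto the top 2 components
print(reduced.shape)                    # (100, 2)
print(pca.explained_variance_ratio_)    # variance captured per component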

Feature Selection and Extraction:


● Feature selection methods like mutual information, chi-squared tests, and
recursive feature elimination help identify the most relevant features for
modeling.

● Feature extraction techniques, such as Principal Component Analysis (PCA)
and Linear Discriminant Analysis (LDA), create new features or transform
existing ones to enhance modeling.
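
As an illustration of filter-based selection, a hedged scikit-learn sketch using the chi-squared test on the bundled iris dataset:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)            # 4 non-negative features
selector = SelectKBest(chi2, k=2).fit(X, y)  # keep the 2 most informative features
print(selector.get_support())                # boolean mask over the original features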

Ensemble Learning:
● Ensemble methods combine multiple models to improve predictive accuracy
and robustness. Examples include Random Forests, Gradient Boosting, and
AdaBoost.
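
A minimal random-forest sketch (the dataset is synthetic, generated just for the example):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=8, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(rf.score(X, y))   # training accuracy of the 100-tree ensemble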

Deep Learning:
● Deep neural networks, including convolutional neural networks (CNNs) for
image data and recurrent neural networks (RNNs) for sequential data, are used
for tasks like image recognition, natural language processing, and time series
analysis.

Time Series Analysis:


● Time series forecasting methods, such as autoregressive integrated moving
average (ARIMA) and seasonal decomposition of time series (STL), are
employed for modeling and predicting time-dependent data.

Text Mining and Natural Language Processing (NLP):


● Techniques in NLP are used for sentiment analysis, text classification, topic
modeling, and information retrieval from unstructured text data. Algorithms
like Word2Vec and BERT have shown substantial success in this domain.
Anomaly Detection:
● Anomaly detection models identify unusual or rare instances in the data,
which is valuable for fraud detection, network security, and quality control.
Methods include Isolation Forests and One-Class SVM.
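
A short Isolation Forest sketch on made-up points with two planted outliers:

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(200, 2))         # the bulk of the data
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])   # two obvious anomalies
X = np.vstack([normal, outliers])

iso = IsolationForest(random_state=0).fit(X)
print(iso.predict(outliers))   # -1 marks a predicted anomaly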

Reinforcement Learning:
● Reinforcement learning is applied when modeling agents must learn how to
make sequential decisions by interacting with an environment. It is commonly
used in robotics, game playing, and autonomous systems.

Graph Mining:
● Graph mining and analysis are used for tasks involving networks and
relationships. Algorithms like PageRank, community detection, and graph
neural networks are applied to social networks, recommendation systems, and
network analysis.

Big Data and Distributed Computing:


● When dealing with massive datasets, distributed computing frameworks like
Hadoop and Spark are employed for parallel processing and distributed
machine learning.

Interpretable Models:
● In some cases, interpretable models like decision trees and linear regression
are preferred to gain insights and explainability, especially in regulated
industries.

3. Explain feature extraction and its techniques.


A) Feature extraction is a crucial aspect of data mining, particularly when dealing with large
and complex datasets. In data mining, feature extraction refers to the process of selecting,
transforming, or creating new features from the raw data to prepare it for analysis.
Before data mining can be effectively performed, raw data needs to be preprocessed. This
often involves handling missing values, dealing with outliers, and cleaning the data. Once the
data is prepared, feature extraction comes into play.
One of the primary objectives of feature extraction is dimensionality reduction. Large
datasets with numerous features can be computationally intensive and can lead to overfitting
in data mining models. Feature extraction methods aim to reduce the number of features
while preserving the most critical information.
Feature extraction can be divided into two broad categories: linear and non-linear.
Feature Extraction Techniques:
○ Principal Component Analysis (PCA): PCA is a widely used technique for
linear dimensionality reduction. It is an unsupervised learning algorithm that
identifies orthogonal axes (principal components) in the data that capture the
most significant variance and projects the data onto a lower-dimensional
space.

○ Linear Discriminant Analysis (LDA): LDA is used when dealing with
classification tasks. It aims to find directions that maximize class separability.

○ Feature Selection: Feature selection methods involve choosing a subset of the
original features that are most relevant to the mining task. These methods can
be filter-based, wrapper-based, or embedded in the modeling process.

○ Manifold Learning: Non-linear techniques like t-distributed stochastic
neighbor embedding (t-SNE) and Isomap are useful for capturing complex
data patterns that linear techniques like PCA might miss.
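
To illustrate the supervised variant, a hedged LDA sketch with scikit-learn on the bundled iris dataset:

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)   # directions chosen to separate the 3 classes
print(X_lda.shape)                # (150, 2)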

4. Illustrate the classic problems in machine learning that are highly related to data
mining.
a) Classification
Clustering
Regression
Anomaly detection
Feature reduction

UNIT-II
Short Questions
1. Define Confidence and Support.
a)
- Definition: Support is a measure of the number of times an itemset appears in a dataset; confidence is a measure of the likelihood that an itemset will appear if another itemset appears.
- Calculation: Support is calculated by dividing the number of transactions containing an itemset by the total number of transactions; confidence is calculated by dividing the number of transactions containing both itemsets by the number of transactions containing the first itemset.
- Purpose: Support is used to identify itemsets that occur frequently in the dataset; confidence is used to evaluate the strength of a rule.
- Thresholds: Support is often used with a threshold to identify itemsets that occur frequently enough to be of interest; confidence is often used with a threshold to identify rules that are strong enough to be of interest.
- Interpretation: Support is interpreted as the percentage of transactions in which an itemset appears; confidence is interpreted as the percentage of transactions in which the second itemset appears given that the first itemset appears.
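
A tiny worked example in Python (the five transactions are made up):

# Five assumed market-basket transactions
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk"},
    {"milk", "bread"},
]
n = len(transactions)

sup_milk = sum("milk" in t for t in transactions) / n             # 4/5 = 0.8
sup_both = sum({"milk", "bread"} <= t for t in transactions) / n  # 3/5 = 0.6
conf = sup_both / sup_milk    # confidence(milk -> bread) = 0.6 / 0.8

print(sup_both, conf)         # support = 0.6, confidence = 0.75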

2. Define Frequent itemset, Maximal Frequent Itemset and closed frequent itemset
a) Frequent itemset: Frequent itemsets are a fundamental concept in association
rule mining, a technique used in data mining to discover relationships between
items in a dataset. The goal of association rule mining is to identify
relationships between items in a dataset that occur frequently together.
A frequent item set is a set of items that occur together frequently in a dataset. The
frequency of an item set is measured by the support count, which is the number of
transactions or records in the dataset that contain the item set. For example, if a
dataset contains 100 transactions and the item set {milk, bread} appears in 20 of
those transactions, the support count for {milk, bread} is 20.
Maximal frequent itemset:
A maximal frequent itemset is a frequent itemset for which none of its
immediate supersets is frequent. The itemsets in the lattice are thereby divided
into two groups: those that are frequent and those that are infrequent.
Closed Frequent Itemset:
A closed frequent itemset is a frequent itemset for which there is no other frequent
itemset that has the same support and is a proper superset of it. In other words, it's a
frequent itemset that cannot be "closed" further without losing its support level.
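
As a small worked example with assumed support counts: suppose {A}, {B}, and {A, B} are frequent with supports 6, 5, and 4, and no superset of {A, B} is frequent. Then {A, B} is maximal (and also closed). {A} is not maximal, since its superset {A, B} is frequent, but {A} is closed, because no frequent superset of {A} shares its support of 6.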

3. What are the different ways of improving the efficiency of Apriori algorithm

a) Here are some methods to improve the efficiency of the Apriori algorithm:

1. Hash-Based Technique: This method uses a hash-based structure called a
hash table for generating the k-itemsets and their corresponding counts. It
uses a hash function for generating the table.
2. Transaction Reduction: This method reduces the number of transactions
scanned in iterations. The transactions which do not contain frequent
items are marked or removed.
3. Partitioning: This method requires only two database scans to mine the
frequent itemsets. It says that for any itemset to be potentially frequent in
the database, it should be frequent in at least one of the partitions of the
database.
4. Sampling: This method picks a random sample S from Database D and
then searches for frequent itemset in S. It may be possible to lose a global
frequent itemset. This can be reduced by lowering the min_sup.
5. Dynamic Itemset Counting: This technique can add new candidate
itemsets at any marked start point of the database during the scanning of
the database.

4. What are the three major components of the Apriori algorithm in data mining

a) There are three major components of the Apriori algorithm in data mining
which are as follows.

1. Support
2. Confidence
3. Lift
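
Lift ties the first two together: lift(A -> B) = confidence(A -> B) / support(B). Continuing the worked support/confidence example above (assumed numbers), confidence(milk -> bread) = 0.75 and support(bread) = 0.8, so lift = 0.75 / 0.8 ≈ 0.94; a lift below 1 suggests the two items co-occur slightly less often than they would if they were independent.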

5. What is an FP-tree and the FP-Growth algorithm?


a) The frequent-pattern tree (FP-tree) is a compact data structure that stores
quantitative information about frequent patterns in a database. Each transaction is
read and then mapped onto a path in the FP-tree. This is done until all transactions
have been read. Different transactions with common subsets allow the tree to remain
compact because their paths overlap.

A frequent Pattern Tree is made with the initial item sets of the database. The
purpose of the FP tree is to mine the most frequent pattern. Each node of the FP tree
represents an item of the item set.

The root node represents null, while the lower nodes represent the item sets.

FP-Growth Algorithm:
The FP-Growth algorithm is an alternative way to find frequent itemsets without
using candidate generation, thus improving performance. To do so, it uses a
divide-and-conquer strategy. The core of this method is the use of a special data
structure named the frequent-pattern tree (FP-tree), which retains the itemset
association information.

This algorithm works as follows:

o First, it compresses the input database, creating an FP-tree instance to represent
frequent items.
o After this first step, it divides the compressed database into a set of conditional
databases, each associated with one frequent pattern.
o Finally, each such database is mined separately.
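
A hedged sketch using the third-party mlxtend library (assuming it is installed; the transactions are toy data):

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

transactions = [["milk", "bread"], ["milk", "bread", "butter"],
                ["bread", "butter"], ["milk"], ["milk", "bread"]]

# One-hot encode into the boolean DataFrame mlxtend expects
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

# Mine itemsets with support >= 0.6 via the FP-tree, with no candidate generation
print(fpgrowth(onehot, min_support=0.6, use_colnames=True))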

6. Define association rules for data mining.

a) Association rule learning is a type of unsupervised learning technique that checks
for the dependency of one data item on another data item and maps them accordingly
so that the relationship can be exploited profitably. It tries to find interesting relations
or associations among the variables of a dataset, relying on different rules to discover
them.

Association rule learning is one of the most important concepts of machine
learning, and it is employed in market basket analysis, web usage mining,
continuous production, etc.
Essay Questions
1. Find the frequent itemsets using the Apriori algorithm and generate association rules.
Assume a minimum support threshold (s = 33.33%) and a minimum confidence
threshold (c = 60%) …

2. Explain the working of an FP algorithm with an example.


3. Explain the working of PCY algorithm with an example.
4. Describe the Apriori algorithm with steps to implement it. What are its key
principles and advantages?
5. Compare the Apriori algorithm with the FP-Growth algorithm. What are the key
differences and trade-offs between the two algorithms?
UNIT-III
Short Questions
1. What is clustering? Why do businesses need to do clustering?
2. What are the major clustering methods?
3. When should we use DBSCAN over k-means in clustering analysis?
4. Define Clustering Features in BIRCH algorithm
5. What is CF tree in BIRCH algorithm
Essay Questions
1. Describe the BIRCH clustering technique.
2. With an example describe the k-means algorithm.
3. Describe the DBSCAN clustering technique.
4. Explain CURE algorithm with a suitable example.
UNIT-IV
SHORT
1. Define the characteristics of data streams
ESSAY
1. What are data streams? Discuss the problems associated with data streams. (5 marks)
2. What is the need for Bloom filters? Explain how they work.
OR
Explain the role of a Bloom filter in a data stream.
3. Discuss about Data Stream Management.
4. Illustrate examples of data stream queries.
5. Discuss models of data stream processing.
6. Explain the algorithm used to count distinct elements in a stream.
Unit V
Short
1. How do you mine social network news feeds?
2. What is web mining?
ESSAY
1. What is MapReduce? Illustrate with a simple example how MapReduce works.
2. When we search on the internet, we want to see the most relevant pages. Discuss the
algorithms used to determine which pages are more authoritative on the internet,
based on their popularity, to ensure users see pages that are most likely to be of
use to them……
3. Is the Web a directed graph or an undirected graph? Discuss the two challenges of
web search with examples.
