Data Science
2 marks questions
1) Define data mining.
=Data Mining refers to extracting or mining knowledge from large amounts of data.
2) What is knowledge discovery in databases?
=Knowledge Discovery in Databases is the process of identifying a valid, potentially useful and
ultimately understandable structure in data. This process involves selecting or sampling data from
a data warehouse, cleaning and preprocessing it, transforming and reducing it, applying a data
mining component to produce a structure, and then evaluating the derived structure.
3) List any four stages of KDD.
= Data cleaning, Data integration, Data selection, Data transformation, Data mining, Pattern
evaluation.
4) What is Loosely Coupled DBMS and Tightly Coupled DBMS?
= Tightly Coupled DBMS: In a tightly coupled DBMS, the components or subsystems are highly
integrated and interact closely with each other.
Loosely Coupled DBMS: In a loosely coupled DBMS, the components or subsystems within the
system operate relatively independently and communicate with each other through standardized
interfaces or protocols.
5) What is prediction and description in Data Mining?
= In data mining, description refers to a summary or characterization of patterns, trends, or
relationships discovered in a dataset, while predictive data mining is used to make predictions
about future events based on historical data.
6) List any four discovery driven tasks.
=Clustering Analysis, Association rule mining, Anomaly detection, sequential pattern mining.
7) Define Support and Confidence.
= In data mining, support and confidence are key metrics used in association rule mining
Support :measures the frequency or occurrence of a particular itemset in a dataset.
Confidence :measures the reliability or strength of an association rule between two itemsets.
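The two metrics can be illustrated with a small sketch; the transactions and the rule {milk} -> {bread} below are made-up toy data, not from the notes.

```python
# Toy transactions (each a set of items bought together).
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Strength of the rule A -> B: support(A union B) / support(A)."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print(support({"milk", "bread"}, transactions))       # 0.5
print(confidence({"milk"}, {"bread"}, transactions))  # ≈ 0.667
```

Milk and bread co-occur in 2 of 4 transactions (support 0.5); milk appears in 3, of which 2 also contain bread (confidence 2/3).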
8) Write the objectives of clustering.
= Identifying Natural Groupings
Understanding Data Distribution
Data Reduction
Anomaly Detection
9) Define Rough Set.
= In data mining, rough set theory is a mathematical framework used for data analysis and
knowledge discovery, particularly in dealing with uncertainty and vagueness in data.
10) What is sequence mining and spatial data mining?
= Sequential data mining involves the extraction of patterns or knowledge from sequential data,
where the order of data points is crucial. It's commonly used in fields like natural language
processing
Spatial data mining involves the extraction of patterns or knowledge from spatial data, which
typically includes geographical or spatial information. This can involve analyzing data such as
maps, satellite imagery, GPS data, and geographical databases to discover trends, relationships,
or anomalies within spatially distributed data.
11) Write the subtasks of web mining.
= Web Content Mining:
Web Structure Mining:
Web Usage Mining:
Web Data Integration:
Web Information Retrieval:
12) Expand KDT and IE.
=KDT - Knowledge Discovery in Text (mining knowledge from textual databases)
IE - Information Extraction
13) What is web mining and text mining?
= Web mining is the process of extracting useful information or patterns from the World Wide
Web. It involves techniques from data mining, machine learning, and statistics to analyze web
data, including web content, structure, and usage logs. Web mining can be used for various
purposes such as improving search engine performance, understanding user behavior, and
extracting market intelligence.
Text mining, also known as text analytics, is the process of deriving high-quality information
from textual data. It involves analyzing unstructured text data to discover patterns, trends, and
insights. Text mining techniques typically include natural language processing (NLP), machine
learning, and statistical analysis to extract valuable information from large volumes of text.
14) List any two issues and challenges in data mining.
= Data Quality: Data mining heavily relies on the quality of the data being analyzed. Poor-
quality data, such as missing values, inconsistencies, or inaccuracies, can significantly impact the
results and conclusions drawn from the analysis.
Overfitting: Overfitting occurs when a data mining model captures noise or random fluctuations
in the training data rather than the underlying patterns.
15) List the two application areas of data mining.
= Data mining in Education: ...
Data Mining in Healthcare: ...
Data Mining in Fraud Detection. ...
Data Mining in Lie Detection. ...
Data Mining in Market Basket Analysis.
Unit 2
2 marks questions
1) What is data warehouse?
= A data warehouse is a repository of information collected from multiple sources, stored under a
unified schema and that usually resides at a single site.
2) How are organizations using the information from data warehouses?
= A data warehouse centralizes and consolidates large amounts of data from multiple sources. Its
analytical capabilities allow organizations to derive valuable business insights from their data to
improve decision-making.
11) List any two benefits of Business analyst by having data warehouse.
=Quickly analyze data for various business applications.
Improve decision-making speed and efficiency.
Maintain the accuracy and reliability of data.
Reduce costs related to data storage and management.
12) List any two data warehouse models from the architecture point of view.
=1) Enterprise warehouse
2) Data mart
3) Virtual warehouse
13) List any four back-end tools and utilities included in data warehouse.
=Data extraction, data cleaning, data transformation, and load and refresh utilities.
14) Define data cleaning and data integration.
=Data cleaning is the process of correcting or deleting inaccurate, corrupted, improperly
formatted, duplicated, or incomplete data from a dataset.
Data Integration is a data preprocessing technique that combines data from multiple
heterogeneous data sources into a coherent data store and provides a unified view of the data.
15) Define data transformation and data reduction.
=Data transformation is a technique used to convert the raw data into a suitable format that
efficiently eases data mining and retrieves strategic information. Data transformation includes
data cleaning techniques and a data reduction technique to convert the data into the appropriate
form.
Data reduction is a technique used in data mining to reduce the size of a dataset while still
preserving the most important information.
16) What is Dimensionality Reduction? Mention any one method used for Dimensionality
Reduction.
=Dimensionality reduction technique can be defined as, "It is a way of converting the higher
dimensions dataset into lesser dimensions dataset ensuring that it provides similar information."
Principal component analysis (PCA)
Missing value ratio
17) List any two methods for data discretization.
= There are two forms of data discretization first is supervised discretization, and the second is
unsupervised discretization.
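A common unsupervised method is equal-width binning; this sketch (toy age values, an arbitrary choice of k=3 bins) is illustrative only.

```python
def equal_width_bins(values, k):
    """Assign each value a bin index 0..k-1 using k equal-width intervals.

    Assumes the values are not all identical (width would be zero).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # The maximum value would fall into bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

ages = [18, 22, 25, 30, 41, 45, 60, 64]
print(equal_width_bins(ages, 3))  # [0, 0, 0, 0, 1, 1, 2, 2]
```

With range 18..64 the bin width is about 15.3, so ages up to ~33 land in bin 0, up to ~48 in bin 1, and the rest in bin 2.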
18) Write the difference between Operational database and Data Warehouse?
=
Operational database systems are mostly concerned with current data; data warehousing systems
are typically concerned with historical data.
Relational (operational) databases are built for On-Line Transaction Processing (OLTP); a data
warehouse is designed for On-Line Analytical Processing (OLAP).
Operational database systems are generally application-oriented, while data warehouses are
generally subject-oriented.
There are mainly two approaches to designing data marts: the dependent approach (data marts
built from an existing data warehouse) and the independent approach (stand-alone data marts
built directly from source systems).
Multidimensional histograms are used in data science for analyzing and visualizing the
distribution of multidimensional data.
Unit IV
2 marks questions
1)What is Classification?
=Classification is a process of categorizing data or objects into predefined classes or categories
based on their features or attributes.
2) What is prediction?
=In data science, prediction refers to the process of using data and statistical algorithms to make
informed guesses about future outcomes or trends based on historical data.
3) What is regression Analysis?
=Regression analysis is a statistical method for modelling the relationship between a dependent
(target) variable and one or more independent (predictor) variables.
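For the simplest case (one predictor), the line y = a + b*x can be fitted by ordinary least squares; the data points below are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx  # intercept passes the line through the means
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]  # exactly y = 2x
a, b = fit_line(xs, ys)
print(a, b)  # 0.0 2.0
```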
4) Write steps involved in data classification?
=1. Perform A Risk Assessment For Sensitive Data
2. Establish A Data Classification Policy
3. Categorize The Types Of Data
4. Identify Data Locations
5. Identify And Classify Data
6. Use Results To Improve Security And Compliance
5) What is supervised learning?
=Supervised learning is a type of machine learning in which machines are trained using well
"labelled" training data, and on the basis of that data, machines predict the output. Labelled data
means the input data is already tagged with the correct output.
6) What is unsupervised learning?
=Unsupervised learning is a type of machine learning that learns from unlabeled data. This
means that the data does not have any pre-existing labels or categories. The goal of unsupervised
learning is to discover patterns and relationships in the data without any explicit guidance.
7) What is correlation Analysis?
=Correlation analysis is a statistical method used to measure the strength of the linear
relationship between two variables and compute their association. Correlation analysis calculates
the level of change in one variable due to the change in the other.
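The most common such measure is the Pearson correlation coefficient; this sketch uses made-up toy values (near +1 means strong positive linear association, near -1 strong negative, near 0 none).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0  (perfect positive)
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0 (perfect negative)
```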
8) What is Decision tree induction?
=The goal of decision tree induction is to build a model that can accurately predict the outcome
of a given event, based on the values of the attributes in the dataset.
9) What is Decision tree?
=Decision Tree is a Supervised learning technique that can be used for both classification and
Regression problems, but mostly it is preferred for solving Classification problems. It is a tree-
structured classifier, where internal nodes represent the features of a dataset, branches represent
the decision rules and each leaf node represents the outcome.
10) What is tree pruning?
=Pruning is a process of deleting the unnecessary nodes from a tree in order to get the optimal
decision tree.
11) What is the use of tree selection measures?
=Attribute (tree) selection measures provide a heuristic for choosing the splitting criterion that
best separates a given data partition into individual classes; the attribute that scores best on the
measure is chosen as the test at that node.
12) Expand CART, ID3
=CART refers to Classification And Regression Trees.
ID3 refers to Iterative Dichotomizer 3.
13) What is splitting criterion? List distinct splitting criterion values.
=The splitting criterion indicates which attribute to test at a node and which branches to grow
from it, chosen so that the resulting partitions are as pure as possible. The distinct cases are:
(a) the attribute is discrete-valued, with one branch per known value; (b) the attribute is
continuous-valued, split as A <= split_point and A > split_point; (c) the attribute is
discrete-valued and a binary tree is required, split as A in S_A or A not in S_A.
14) What is attribute selection measure? List popular attribute measures.
=Attribute selection measure (ASM) is a criterion used in decision tree algorithms to evaluate the
usefulness of different attributes for splitting a dataset. Popular measures are Information Gain,
Gain Ratio, and Gini Index.
15) What is information Gain? Write the formula to find expected information needed to
classify a tuple.
= Information Gain is the attribute selection measure that is used to find/select the best attribute in
a dataset or used to find the root node. The attribute with the highest information gain value is
selected as a root node for splitting criteria.
Gain(A) = Info(D) − Info_A(D), where Gain(A) is the information gain of attribute A,
Info(D) = −Σ_{i=1}^{m} p_i log2(p_i) is the entropy (expected information needed to classify a
tuple in D), and Info_A(D) = Σ_{j=1}^{v} (|Dj|/|D|) × Info(Dj) is the expected information still
needed after partitioning D on A.
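The quantities Info(D), Info_A(D) and Gain(A) can be computed directly from class counts. The counts below (9 'yes' / 5 'no', and the three branch partitions) follow the classic play-tennis toy example and are illustrative only.

```python
from math import log2

def info(counts):
    """Entropy Info(D) = -sum p_i * log2(p_i) over class counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

def info_attr(partitions):
    """Info_A(D): entropy after splitting on A, weighted by partition size."""
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * info(p) for p in partitions)

D = [9, 5]                        # 9 'yes' tuples, 5 'no' tuples
parts = [[2, 3], [4, 0], [3, 2]]  # class counts in each branch of attribute A
gain = info(D) - info_attr(parts)
print(round(info(D), 3), round(gain, 3))  # ≈ 0.94 and ≈ 0.247
```

The pure branch [4, 0] contributes zero entropy, which is what drives the gain of about 0.247 bits for this attribute.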
16) What is entropy?
=Entropy is the measure of the degree of randomness or uncertainty in the dataset. In the case of
classifications, It measures the randomness based on the distribution of class labels in the dataset.
17) Define gain ratio. Write formula.
= Gain Ratio is a measure that takes into account both the information gain and the number of
outcomes of a feature to determine the best feature to split on.
GainRatio(A) = Gain(A) / SplitInfo_A(D), where SplitInfo_A(D) = −Σ_{j=1}^{v} (|Dj|/|D|) log2(|Dj|/|D|).
18) Define Gini index. Write formula.
=The Gini Index is a measure of impurity or inequality used in statistical and economic settings.
In machine learning, it is used as an impurity measure in decision tree algorithms (e.g. CART) for
classification tasks. It measures the probability of a randomly chosen tuple being misclassified,
and its value ranges from 0 (perfectly pure) to 1 (perfectly impure).
Gini(D) = 1 − Σ_{i=1}^{m} p_i², where p_i is the probability that a tuple in D belongs to class C_i.
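The formula is a one-liner over class counts; the counts below are made-up examples (for two classes the maximum impurity is 0.5).

```python
def gini(counts):
    """Gini index Gini(D) = 1 - sum p_i^2 over class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

print(gini([10, 0]))  # 0.0   -> perfectly pure partition
print(gini([5, 5]))   # 0.5   -> maximally impure for two classes
print(gini([9, 5]))   # ≈ 0.459
```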
19) List the use of MDL.
=The Minimum Description Length (MDL) principle is used in decision tree pruning: the best
pruned tree is the one requiring the fewest bits to encode both the tree and its exceptions, which
favours simpler trees and reduces overfitting.
20) Write two approaches to tree pruning.
= Pre-pruning Approach
Post-pruning Approach