Sample Questions:
Section I: Subjective Questions
1. From the perspective of data warehouse architecture, there are different data
warehouse models. Explain them.
2. Issues arise both in pre-processing the data and in comparing and
evaluating classification and prediction methods. Discuss.
3. Discuss the clustering and outlier analysis methods of data mining.
1. In this system, the balance is the current outstanding balance in the customer’s
account.
1] Accounts receivable
2] Accounts payable
3] Accounts balance
4] Accounts
2. Data is moved from these, which are used in operational systems, into a data
warehouse staging area, then into a data warehouse, and finally into a set of conformed
data marts.
1] databases
2] records
3] files
4] fields
3. These are intermediate servers that sit between a relational back end server (where
the data in the warehouse is stored) and client front end tools.
1] ROLAP Servers
2] MOLAP Servers
3] OLAP Servers
SYMBIOSIS CENTRE FOR DISTANCE LEARNING (SCDL)
Subject: Data Warehousing and Data Mining
6. This type of learning applies to a dynamic dataset where the class label of the
training data is unknown.
1] Unsupervised
2] Supervised
3] Labelled
4] Unlabelled
7. These databases contain complex texts, graphics, images, video fragments, maps,
voice, music, and other forms of audio/video information.
1] Multimedia
2] Spatial
3] Content
4] Integrated
9. This module is used to analyse and interact with data mining modules to search for
an interesting pattern. It filters data to discover an interesting pattern.
1] Pattern Evaluation Module
2] Search Evaluation Module
3] Filter Evaluation Module
4] Data Discover Module
10. This layer integrates the disparate data sets by transforming the data from the staging
layer, often storing the transformed data in an operational data store (ODS) database.
1] Integration
2] Staging
3] Operational
4] Subject
12. Data mining tasks are generally divided into two categories. These are:
1] Predictive Task
2] Descriptive Task
3] Outlier Task
4] Correlation Task
15. The output of this process is a global logical data model consisting of the following:
1] Entity-Relationship diagram
2] Relational schema
3] Supporting documentation
4] Technical documentation
18. Different types of data can serve as a data source, such as:
1] Operations
2] Web server logs
3] Internal market research data
4] Client data
19. In ____________, data sits prior to being scrubbed and transformed into a data
warehouse / data mart.
1] Staging Area
2] Data Extraction Layer
3] ETL Layer
4] Data Storage Layer
21. ____________ states that every subset of a frequent itemset is also frequent.
1] Apriori property
2] Apriori Set
3] Apriori Algorithm
4] Apriori Item
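Illustration for question 21: the statement can be checked on a small example. The sketch below is only a toy check, not part of the original paper. The transactions, the `min_support` threshold, and the helper names `support` and `nonempty_proper_subsets` are all assumptions made up for this illustration.

```python
from itertools import combinations

# Hypothetical toy transactions, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def nonempty_proper_subsets(itemset):
    """Yield every non-empty proper subset of `itemset`."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


min_support = 0.5  # assumed threshold for calling an itemset "frequent"
frequent = frozenset({"bread", "butter", "milk"})

# The full itemset meets the threshold in this toy data ...
assert support(frequent, transactions) >= min_support

# ... so each of its non-empty subsets must meet it as well.
subsets_ok = all(
    support(s, transactions) >= min_support
    for s in nonempty_proper_subsets(frequent)
)
print(subsets_ok)
```

A subset can never have lower support than the itemset that contains it, because every transaction holding the larger set also holds the smaller one.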
22. The data stored in the warehouse is uploaded from the ____________ systems.
1] operational
2] financial
3] transactional
4] separate
24. In ____________, data sits prior to being scrubbed and transformed into a data
warehouse / data mart.
1] Staging Area
2] Data Extraction Layer
3] ETL Layer
4] Data Storage Layer
26. ____________ helps to identify items that are connected to each other, but it does
not help to find the nature of the connection.
1] Association mining
2] Rule mining
3] Data mining
4] Generalisation mining
27. In ____________, the database contains incomplete data (missing values for some
records) or noisy data, either of which can mislead the data mining process.
1] Noisy data handling
2] Missing data handling
3] Misleading data handling
4] Noisy record handling
29. The relational model discovers the strong entities in terms of business process
execution, whereas the dimensional model discovers the associative entities that represent
the effect of the business process.
30. A data warehouse usually requires integrating data from several heterogeneous
sources.
31. Each cell within a multidimensional structure contains aggregated data related to
elements along each of its dimensions.
32. A data mining system may not operate on all operating systems.
33. One of the leading causes of poor query performance is poor I/O design.
37. Concept hierarchies can be used to derive relationships between spatial and
non-spatial attributes.
38. The best split measures are based on the degree of impurity of the child nodes.
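
Illustration for statement 38: a minimal sketch of comparing candidate splits by the impurity of their child nodes. It assumes Gini impurity as the measure (the paper does not name one). The two candidate splits and the function names `gini` and `weighted_child_impurity` are hypothetical and chosen only for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_child_impurity(children):
    """Impurity of a split: child impurities weighted by child size."""
    total = sum(len(child) for child in children)
    return sum(len(child) / total * gini(child) for child in children)


# Two hypothetical candidate splits of the same six records.
split_a = [["yes", "yes", "yes"], ["no", "no", "no"]]  # both children pure
split_b = [["yes", "no", "yes"], ["no", "yes", "no"]]  # both children mixed

candidates = [("A", split_a), ("B", split_b)]

# Under this measure, the preferred split has the lowest weighted impurity.
best = min(candidates, key=lambda c: weighted_child_impurity(c[1]))
print(best[0])
```

Entropy or classification error could be substituted for `gini` without changing the structure of the comparison.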