Unit 1 Question Bank

This document is a question bank for the Data Warehousing and Data Mining course at Vivekanandha College of Engineering for Women. It includes various questions categorized into three parts: Part A consists of multiple-choice questions, Part B includes short answer questions, and Part C contains detailed discussion prompts. The questions cover topics such as data warehousing concepts, schemas, metadata, and the differences between OLAP and OLTP systems.

VIVEKANANDHA COLLEGE OF ENGINEERING FOR WOMEN

[AUTONOMOUS INSTITUTION AFFILIATED TO ANNA UNIVERSITY, CHENNAI]
Elayampalayam – 637 205, Tiruchengode, Namakkal Dt., Tamil Nadu.

Year & Semester: III & V
Subject Code & Subject Name: U15CSE05 & Data Warehousing and Data Mining
Question Bank
Unit – I
PART – A

1. Which of the following forms the logical subset of the complete data warehouse?
a) Dimensional model
b) Fact table
c) Data Mart
d) Dimensional table
2. An operational database is
a) A measure of the desired maximal complexity of data mining algorithms
b) A database containing volatile data used for the daily operation of an organization
c) Relational database management system
d) None of these
3. Which of the following is true when building a matrix for the data warehouse bus architecture?
a) Data marts as columns and dimensions as rows
b) Dimensions as rows and facts as columns
c) Data marts as rows and dimensions as columns
d) Data marts as rows and facts as columns
4. Data modeling technique used for data marts is
a) Dimensional modeling
b) ER – model
c) Extended ER – model
d) Physical model
5. Which of the following statements is true?
a) A fact table describes the transactions stored in a DWH
b) A fact table describes the granularity of data held in a DWH
c) The fact table of a data warehouse is the main store of descriptions of the transactions stored in a DWH
d) The fact table of a data warehouse is the main store of all of the recorded transactions over time
6. Which of the following team members do not form the audience for data warehousing?
a) Data architects
b) Customers/users.
c) Business Intelligence experts
d) Managers
7. At which level can we create dimensional models?
a) Business requirements level
b) Architecture models level
c) Detailed models level
d) Implementation level
8. The generalization of multidimensional attributes of a complex object class, performed by
examining each attribute, generalizing each attribute to simple-value data and
constructing a multidimensional data cube, is called
a) Object cube
b) Relational cube
c) Transactional cube
d) Tuple
9. ____________ is a repository of information gathered from multiple sources stored under a
unified schema at a single site.
a) Data mining
b) Data warehouse
c) Web server
d) None of the above
10. The full form of OLAP is
a) Online analytical processing
b) Online advanced processing
c) Online advanced preparation
d) Online analytical performance
11. ……………………. is a subject-oriented, integrated, time-variant, nonvolatile collection of
data in support of management decisions.
a) Data Mining
b) Data Warehousing
c) Document Mining
d) Text Mining
12. The data is stored, retrieved and updated in ………………..
a) OLAP
b) OLTP
c) SMTP
d) FTP
13. …………………… is a good alternative to the star schema.
a) Star Schema
b) Snowflake Schema
c) Fact Constellation
d) Star-Snowflake Schema
14. The ………………………. exposes the information being captured, stored, and managed by
operational systems.
a) Top-Down View
b) Data Warehouse View
c) Data Source View
d) Business Query View
15. The type of relationship in star schema is ……………
a) Many To Many
b) One To One
c) One To Many
d) Many To One
16. Which of the following is not a component of a data warehouse?
a) Metadata
b) Current Detail Data
c) Lightly Summarized Data
d) Component Key
17. Which of the following is not a kind of data warehouse application?
a) Information Processing
b) Analytical Processing
c) Data Mining
d) Transaction Processing
18. The important aspect of the data warehouse environment is that data found within the data
warehouse is ___________.
a) Subject-oriented.
b) Time-variant.
c) Integrated.
d) All of the above.
19. __________ describes the data contained in the data warehouse.
a) Relational data.
b) Operational data.
c) Metadata.
d) Informational data.
20. ________________ is the specialized data warehouse database.
a) Oracle.
b) DB2.
c) Informix.
d) Redbrick.
21. _______________ defines the structure of the data held in operational databases and used by
operational applications.
a) user-level metadata.
b) data warehouse metadata.
c) operational metadata.
d) data mining metadata
22. Issues to be considered during data sourcing, cleanup, extraction and transformation are
a) Database heterogeneity
b) Data heterogeneity
c) Both a & b
d) None of the above
23. Data marts present the problem of
a) Scalability
b) Data integration
c) Both a & b
d) None of the above
24. Horizontal parallelism belongs to
a) Intra query parallelism
b) Inter query Parallelism
c) Both a & b
d) None of the above
25. The star schema is composed of __________ fact table.
a) One.
b) Two
c) Three.
d) Four.
26. The process of removing the deficiencies and loopholes in the data is called
a) Aggregation of data
b) Extracting of data
c) Cleaning up of data.
d) Loading of data
27. Metadata is classified into
a) Technical metadata
b) Business metadata
c) Both a & b
d) None of the above
28. Which one manages both current and historic transactions?
a) OLTP
b) OLAP
c) Spreadsheet
d) XML
29. ____________ contains information that gives users an easy-to-understand perspective of
the information stored in the data warehouse.
a) Business metadata.
b) Technical metadata.
c) Operational metadata.
d) Financial metadata.
30. The ASCII file format used to represent the metadata that is being exchanged is used in
_______.
a) The Standard Metadata Model
b) The Standard Access Framework
c) Tool Profile
d) The User Configuration
31. Data marts that incorporate data mining tools to extract sets of data are called ______.
a) Independent data mart.
b) Dependent data marts.
c) Intra-entry data mart.
d) Inter-entry data mart.
32. What are the different types of Data Warehousing?
a) Enterprise Data Warehousing
b) Operational Store
c) Data Mart
d) All of the above.
33. When the database is partitioned across multiple disks and parallel processing occurs within a
specific task that is performed concurrently on different sets of data, it is called
a) Vertical parallelism
b) Horizontal parallelism
c) Interquery parallelism
d) Intraquery parallelism
34. Which software is used for maintaining the knowledge about the environment, data
partitions and parallel query execution?
a) SYBASE MPP
b) DYBASE MPP
c) Split server
d) SQL servers
35. The ............................ exposes the information being captured, stored, and managed by
operational systems.
a) Top-down view
b) Data warehouse view
c) Data source view
d) Business query view
36. An Operational System is which of the following?
a) A system that is used to run the business in real time and is based on historical data.
b) A system that is used to run the business in real time and is based on current data.
c) A system that is used to support decision making and is based on current data.
d) A system that is used to support decision making and is based on historical data.
37. _______________ are either identical or strict mathematical subsets of the most granular,
detailed dimension.
a) Degenerate dimensions
b) Factless fact tables
c) Conformed dimensions
d) Conformed Facts
38. Which of the following statements is true?
a) The data warehouse consists of data marts and operational data
b) The data warehouse is used as a source for the operational data
c) The operational data are used as a source for the data warehouse
d) All of the above
39. How is a database design represented in OLAP systems?
a) Star schema
b) Snowflake schema
c) Fact constellation schema
d) All of the above
40. Which data focuses on transactional functions such as bank card withdrawals and deposits?
a) Operational Data
b) Informational Data
c) Both a & b
d) None of the above
41. Which factors drive you to build and use a data warehouse?
a) Business factors
b) Technological factors
c) Both a & b
d) None of the above
42. What are the three classes of users in user levels?
a) Casual users
b) Power Users
c) Expert users
d) All of the above
43. Placing and locating rows in the partitions according to the value of the partitioning key
is used in
a) Hash partitioning
b) Key range partitioning
c) Schema partitioning
d) All of the above
44. Which model defines the set of objects that the metadata interchange standard can be used to
describe?
a) The application meta model
b) The metadata meta model
c) Both a & b
d) None of the above
45. Which contains information about data warehouse data used by warehouse designers and
administrators to carry out development and management tasks?
a) Technical metadata
b) Business metadata
c) Both a & b
d) None of the above
46. Which contains information that gives users a perspective of the information stored in the data warehouse?
a) Technical metadata
b) Business metadata
c) Both a & b
d) None of the above
47. Tools used to analyze the data in multidimensional and complex views are
a) Data mining tools
b) OLAP Tools
c) Application development tools
d) Managed Query tools
48. In which type of parallelism do different server threads or processes handle multiple requests at the same time?
a) Intra query parallelism
b) Inter query Parallelism
c) Both a & b
d) None of the above
49. Which parallelism decomposes the serial SQL query into lower-level operations such as scan,
join, sort, etc.?
a) Intra query parallelism
b) Inter query Parallelism
c) Both a & b
d) None of the above
50. The disadvantage of shared memory systems for parallel processing is
a) Scalability
b) Multiple PUs share memory
c) Communication between nodes
d) Performance
PART – B
Two Marks

1. Define data warehousing.
2. List the nine decisions in the design of a data warehouse.
3. Define star schema.
4. Compare data marts and data warehouses.
5. List the contents of the metadata repository.
6. What are data marts and data cubes?
7. State the challenges in designing a data warehouse.
8. Give the major characteristics of a data warehouse.
9. What are the two types of data mart?
10. State the benefits of metadata.
11. Give some applications of a data warehouse.
12. Differentiate horizontal parallelism and vertical parallelism.
13. What is data warehouse metadata?
14. List the different types of reporting tools.
15. How is a data warehouse different from a database? How are they similar?
16. What is a data warehouse?
17. What is data warehousing?
18. What is metadata? What are its contents?
19. What are the three kinds of data warehouse applications?
20. What are the four views regarding the design of a data warehouse?
21. Write short notes on bitmap index.
22. What are operational databases?
23. Define OLTP.
24. Define OLAP.
25. Write short notes on the multidimensional data model.
26. What are facts?
27. What are dimensions?
28. Define dimension table.
29. Define fact table.
30. Which is more popular in data warehouse design, the star schema model or the
snowflake schema model?
31. What is a data mart?
32. What is data warehouse metadata?
33. Explain the differences between star and snowflake schemas.
34. List the characteristics of a data warehouse.
35. Differentiate fact table and dimension table.
36. How is a data warehouse different from a database? How are they similar?
37. Describe sourcing, acquisition and transformation tools.
38. Differentiate metadata and data mart.
39. How would you classify the access tools?
40. How would you evaluate the goals of data mining?
41. Can you list the principles of data warehousing?
42. What elements would you use to relate the design of a data warehouse?
43. Define metadata.
44. Select the logical steps needed to build a data warehouse.
45. Define parallel RDBMS features.
46. Describe the alternate technologies used to improve performance in a data
warehouse environment.
47. Distinguish between STARjoin and STARindex.
48. How would you show your understanding of the multidimensional data model?
49. Formulate the conclusions you can draw about the bitmapped index.
50. Explain vendor solutions.
51. Explain the features of the metadata repository in data warehousing.
PART – C
1. Illustrate the mapping of a data warehouse to a multiprocessor architecture with
the concepts of parallelism and data partitioning.
2. Discuss data extraction, cleanup and transformation tools with metadata
management.
3. With a neat architecture diagram, explain the seven components of a data warehouse.
4. Discuss DBMS schemas for decision support. Describe performance
problems with the star schema.
5. Write short notes on
i. Transformation
ii. Metadata
6. Illustrate the various considerations in building a data warehouse.
7. Explain star, snowflake and fact constellation schemas for
multidimensional databases with suitable examples.
8. Briefly discuss the schemas for multidimensional databases.
i) Describe in detail the types of access tools.
ii) Describe the overall architecture of a data warehouse.
9. i) Demonstrate in detail about data marts.
ii) Demonstrate data warehouse administration and management.
10. Explain the functional blocks needed to build a data warehouse.
11. Describe in detail the mapping of the data warehouse to a multiprocessor
architecture.
12. Analyze the information needed to support DBMS schemas for decision
support.
13. Develop a detailed account of vendor solutions.
14. i) Discuss in detail vendor approaches.
ii) Discuss in detail access to legacy data.
15. Describe data extraction in detail.
16. Describe transformation tools in detail.
17. How would you explain metadata implementation with examples?
18. Describe in detail
i) Bitmapped indexing
ii) STARjoin and STARindex
19. a. What is a data warehouse? Diagrammatically illustrate and discuss the data
warehouse architecture.
b. Briefly compare the following concepts. You may use an example to
explain your point(s).
(a) Snowflake schema, fact constellation
(b) Data cleaning, data transformation, refresh
(c) Enterprise warehouse, data mart, virtual warehouse
20. Enumerate the building blocks of a data warehouse. Explain the importance of
metadata in a data warehouse environment.
21. Diagrammatically illustrate and discuss the data warehousing architecture,
and briefly explain the components of a data warehouse.
22. List and discuss the steps involved in mapping the data warehouse to a
multiprocessor architecture.
23. Distinguish between data warehousing and data mining.
24. Describe in detail data extraction and cleanup.
25. List the functions of OLAP servers in the data warehouse architecture.
26. Compare OLTP and OLAP systems.
27. Briefly discuss the schemas for multidimensional databases.
28. How is a data warehouse different from a database? How are they similar?
29. List the functions of OLAP servers in the data warehouse architecture.
30. What do you understand about knowledge discovery?
31. List the characteristics of a data warehouse and explain them.
32. Demonstrate in detail about data marts.
33. Demonstrate data warehouse administration and management.
34. How would you explain metadata implementation with examples?
35. Explain in detail the different vendor solutions.
36. List and discuss the steps involved in mapping the data warehouse to a
multiprocessor architecture. Enumerate the building blocks of a data
warehouse. Explain the importance of metadata in a data warehouse
environment.

37. Suppose that a data warehouse consists of the three dimensions time, doctor,
and patient, and the two measures count and charge, where charge is the fee
that a doctor charges a patient for a visit.
a. Enumerate three classes of schemas that are popularly used for modeling data
warehouses.
b. Draw a schema diagram for the above data warehouse using one of the
schema classes listed in (a).
c. Starting with the base cuboid [day, doctor, patient], what specific OLAP
operations should be performed in order to list the total fee collected by each
doctor in 2004?
d. To obtain the same list, write an SQL query assuming the data are stored in a
relational database with the schema fee (day, month, year, doctor, hospital,
patient, count, charge).
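
One possible sketch for part (d) of question 37 is given below. It assumes an in-memory SQLite database and a hypothetical helper name, total_fee_by_doctor; only the fee schema and the year 2004 come from the question. The column name count is quoted because it is also the name of an SQL aggregate function.

```python
import sqlite3


def total_fee_by_doctor(rows, year=2004):
    """Return (doctor, total charge) pairs for the given year.

    `rows` are tuples in the order of the fee schema from the question:
    (day, month, year, doctor, hospital, patient, count, charge).
    """
    conn = sqlite3.connect(":memory:")  # throwaway database, assumed for the demo
    conn.execute(
        'CREATE TABLE fee (day INTEGER, month INTEGER, year INTEGER, '
        'doctor TEXT, hospital TEXT, patient TEXT, "count" INTEGER, charge REAL)'
    )
    conn.executemany("INSERT INTO fee VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    # Restrict to one year, then aggregate the charge per doctor.
    result = conn.execute(
        "SELECT doctor, SUM(charge) FROM fee "
        "WHERE year = ? GROUP BY doctor ORDER BY doctor",
        (year,),
    ).fetchall()
    conn.close()
    return result
```

The WHERE clause plays the role of a slice on the year, and GROUP BY doctor corresponds to rolling up the remaining dimensions.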

38. Suppose that a data warehouse for Big University consists of the following
four dimensions: student, course, semester, and instructor, and two measures
count and avg_grade. When at the lowest conceptual level (e.g., for a given
student, course, semester, and instructor combination), the avg_grade measure
stores the actual course grade of the student. At higher conceptual levels,
avg_grade stores the average grade for the given combination.
a. Draw a snowflake schema diagram for the data warehouse.
b. Starting with the base cuboid [student, course, semester, instructor], what
specific OLAP operations (e.g., roll-up from semester to year) should one
perform in order to list the average grade of CS courses for each Big
University student?
c. If each dimension has five levels (including all), such as “student < major <
status < university < all”, how many cuboids will this cube contain
(including the base and apex cuboids)?
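
As a hedged illustration for part (c) of question 38: when every dimension has a concept hierarchy, a cuboid picks one level per dimension, so the total number of cuboids is the product of the level counts (each count including the virtual level all). The function name count_cuboids below is hypothetical.

```python
from math import prod


def count_cuboids(levels_including_all):
    """Number of cuboids in a cube, given the level count of each dimension.

    Each entry must already include the top level `all`, so a cuboid is
    one independent choice of level per dimension.
    """
    return prod(levels_including_all)


# Four dimensions (student, course, semester, instructor), five levels each.
print(count_cuboids([5, 5, 5, 5]))  # 5 * 5 * 5 * 5 = 625
```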

39. Suppose that a data warehouse consists of the four dimensions, date,
spectator, location and game, and the two measures, count and charge, where
charge is the fare that a spectator pays when watching a game on a given
date. Spectators may be students, adults, or seniors, with each category
having its own charge rate.
a. Draw a star schema diagram for the data warehouse.
b. Starting with the base cuboid [date, spectator, location, game], what
specific OLAP operations should one perform in order to list the total
charge paid by student spectators at GM Place in 2004?
c. Bitmap indexing is useful in data warehousing. Taking this cube as an
example, briefly discuss advantages and problems of using a bitmap index
structure.
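
For part (c) of question 39, a minimal sketch of the bitmap idea follows. It assumes Python integers as bit vectors and uses hypothetical helper names; the sample rows (including the location "Other") are made up for illustration. Each distinct value of a column gets one bit vector, and bit i is set when row i holds that value.

```python
def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is 1 if row i has it."""
    index = {}
    for row_id, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


def matching_rows(bitmap):
    """Decode a bit vector back into the list of row ids whose bit is set."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Illustrative data: spectator category and location for four fact rows.
category = ["student", "adult", "student", "senior"]
location = ["GM Place", "GM Place", "Other", "GM Place"]
cat_index = build_bitmap_index(category)
loc_index = build_bitmap_index(location)

# A conjunctive selection becomes a single bitwise AND of two bit vectors.
print(matching_rows(cat_index["student"] & loc_index["GM Place"]))  # [0]
```

The sketch hints at the trade-off the question asks about: selections on low-cardinality columns such as spectator category reduce to fast bit operations, while a high-cardinality column such as date would need one bit vector per distinct value.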

40. Design a data warehouse for a regional weather bureau. The weather bureau
has about 1,000 probes, which are scattered throughout various land and
ocean locations in the region to collect basic weather data, including air
pressure, temperature, and precipitation at each hour. All data are sent to the
central station, which has collected such data for over 10 years. Your design
should facilitate efficient querying and on-line analytical processing, and
derive general weather patterns in multidimensional space.

41. Regarding the computation of measures in a data cube:
a. Enumerate three categories of measures, based on the kind of aggregate
functions used in computing a data cube.
b. For a data cube with the three dimensions time, location, and item, which
category does the function variance belong to? Describe how to compute
it if the cube is partitioned into many chunks. (The formula for
computing variance is $\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2$, where
$\bar{x}$ is the average of the N $x_i$s.)
c. Suppose the function is “top 10 sales”. Discuss how to efficiently
compute this measure in a data cube.
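
For part (b) of question 41, one way to see why variance can be computed chunk by chunk is sketched below. It assumes the population variance (1/N) * sum((x_i - mean)^2), which equals sum(x_i^2)/N - mean^2, so each chunk only needs to report its count, sum, and sum of squares. The function names are hypothetical.

```python
def chunk_summary(values):
    """Per-chunk distributive summaries: (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))


def combine_variance(summaries):
    """Population variance of all values, using only the chunk summaries."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    # E[x^2] - (E[x])^2, algebraically equal to the averaged squared deviations.
    return total_sq / n - mean * mean


chunks = [[1, 2], [3, 4, 5]]
print(combine_variance([chunk_summary(c) for c in chunks]))  # 2.0
```

This sum-of-squares form is convenient for illustration but can lose precision when the values are large relative to their spread.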
