
Paper Pattern-Sample Paper - CS614

This document contains a sample paper for the final term of the CS614 Data Warehousing course. It includes 40 multiple choice questions testing knowledge of data warehousing concepts and techniques. There are also 6 subjective questions asking students to discuss topics like keys to successful rapid prototyping, steps in the data warehouse development lifecycle, web warehousing, and applying clustering indexing to a university course database table.

Uploaded by

Hamza Mehmood
Copyright
© All Rights Reserved

Paper Time [90 minutes]

CS614 – Data Warehousing

Final Term Fall 2022 (Sample Paper)

MCQs

1. __________ approach is used to implement the Data Warehouse.


a) Agile Approach
b) Scrum Approach
c) Kimball Approach
d) Waterfall Approach

2. The Kimball approach to implement the Data Warehouse is ________________.


a) Time-driven
b) Resource-driven
c) Goal-driven
d) Cost-driven

3. _______________ is the first stage of the Kimball approach to Data Warehouse implementation.
a) Dimensional Modeling
b) Requirements Gathering
c) Data Warehouse Population
d) Deployment

4. What is the final stage of the Kimball approach to Data Warehouse implementation?
a) Requirements Gathering
b) Ongoing Maintenance
c) ETL Design
d) Data Staging

5. What is the role of the business analyst in the Kimball approach to Data Warehouse implementation?
a) To design the dimensional model
b) To monitor and maintain the data warehouse
c) To gather and document business requirements
d) To develop the ETL process

6. Which of the following is one of the five signs of trouble in a data warehousing project?
a) End users are not involved hands-on from day one throughout the program.
b) IT team members have experience with the access tools.
c) The data design is finished before participants have experimented with the tools and live data.
d) The project has proceeded for only a few days.

7. What is a possible pitfall in the Data Warehousing life cycle and development?
a) Having a weak business sponsor.
b) Having enough time for ETL.
c) Having a subject-matter expert in the data modeling team.
d) Having a single server for the different environments.

8. What is a reason for separating the development and production environments?


a) To prevent the production environment from being affected by a reboot of the development environment server.
b) To avoid interference between the different database environments.
c) To allow the development environment to run faster.
d) To reduce costs.

9. What is the importance of having a subject-matter expert in the Data Modeling team?
a) To get definitive answers to questions.
b) To extend the project timeline.
c) To have someone with industry experience.
d) To have an outside consultant.

10. What is web warehousing?


a) A type of data warehousing that focuses on the World Wide Web as its data source
b) A type of data warehousing that focuses on traditional data sources
c) A type of data warehousing that focuses on data stored in a warehouse
d) A type of web OLTP

11. What are the three reasons for Warehousing Web Data?
a) Searching the web, analyzing web traffic, and archiving the web
b) Storing data, processing data, and retrieving data
c) Cleaning data, transforming data, and loading data
d) None of the above

12. What are the three major approaches to accessing information stored on the web?
a) Keyword-based search, querying deep web sources, and random surfing
b) Indexing, searching, and browsing
c) Sorting, grouping, and aggregating
d) Only star schema

13. __________ is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions.
a) Data Mining.
b) Data Warehousing.
c) Web Mining.
d) Text Mining.

14. The data warehouse is __________.


a) read only.
b) write only.
c) read write only.
d) none.

15. The important aspect of the data warehouse environment is that data found within the data warehouse is ___________.
a) subject-oriented.
b) time-variant.
c) integrated.
d) all of the above.

16. The data is stored, retrieved & updated in ____________.


a) OLAP.
b) OLTP.
c) SMTP.
d) FTP.

17. ____________ predicts future trends & behaviors, allowing business managers to make proactive, knowledge-driven decisions.
a) Data warehouse.
b) Data mining.
c) Data marts.
d) Metadata.

18. The star schema is composed of __________ fact table.


a) one
b) two
c) three
d) four

19. Data can be updated in _____ environment.
a) data warehouse
b) data mining
c) operational
d) information

20. The modern CASE tools belong to _______ category.


a) analysis
b) Development
c) Coding
d) Delivery

21. What class of anomalies do lexical errors belong to?


a) Syntactically Dirty Data
b) Semantically Dirty Data
c) Logical Anomalies
d) Runtime Anomalies

22. The challenge of finding and removing duplicate records in a merged database due to inconsistent or incorrect representation in various data sets is known as ______________.
a) Merge/Purge Problem
b) Cleansing Problem
c) Transformation Problem
d) Quality Problem

23. After completing most of the transformation and cleansing, especially correcting single-source errors and resolving conflicting representations, we perform the _____ task.
a) Duplicate Elimination
b) Duplicate Identification
c) Duplicate Classification
d) Duplicate Recombination

24. In the Information Age, organizations lacking the ability to "learn" are at a disadvantage. This condition is referred to as ----------- functioning.
a) Impaired
b) Slow
c) Active
d) Advance

25. ------------------ is conservative in nature, assigning the dimension an aggregate value that cannot exceed the value of its weakest data quality indicator.
a) Min operator
b) Max operator
c) Average operator
d) Sum operator
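
The conservative aggregation described in question 25 can be illustrated with a minimal sketch. The function name `aggregate_conservative` and the indicator scores are hypothetical, chosen only for illustration; the scores are assumed to lie on a 0 to 1 scale.

```python
def aggregate_conservative(indicators):
    """Return an aggregate that never exceeds the weakest indicator."""
    if not indicators:
        raise ValueError("at least one indicator is required")
    return min(indicators)


# Hypothetical data quality indicator scores for one dimension (0..1 scale).
scores = [0.95, 0.80, 0.99]
print(aggregate_conservative(scores))
```

A single weak indicator pulls the whole dimension down, which is what makes this style of aggregation conservative.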

26. Which operation can be applied to handle dimensions that involve the aggregation of multiple data quality indicators?
a) Average Sum
b) Weighted Ratio
c) Minimum or Maximum
d) Accumulate Ratio

27. UAT stands for ---------------.


a) Universal Acceptance Test
b) Universal Applied Test
c) Undignified Applied Test
d) Uniform Acceptance Test

28. ------------- can enhance system performance on systems that are over-utilized or have limited I/O bandwidth.
a) Parallelism
b) Serialization
c) Mining
d) Thrashing

29. A ------------ is a type of index that contains an entry for every single value in a column of a table.
a) Dense Index
b) Sparse Index
c) Primary Index
d) Hash Index

30. ------------------ indexes are stored in a tree structure that has branch blocks which point to lower-level blocks.
a) Dense
b) Sparse
c) Tree
d) Hash

31. An ---------- is often used for information retrieval and is optimized for fast search and retrieval operations, with updates being a secondary consideration.
a) Inverted index
b) Sparse Index
c) Primary Index
d) Hash Index

32. --------------- is a simple join algorithm that compares every row from one table with every row from another table to find the matching rows.
a) Naive nested-loop join
b) Index nested-loop join
c) Temporary index nested-loop join
d) None of the given options
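
The row-by-row comparison described in question 32 can be sketched as follows. This is an illustrative sketch only: the function name, the sample rows, and the column names (`dept_id`, `id`) are hypothetical and not taken from the course material.

```python
def compare_every_pair_join(outer, inner, outer_key, inner_key):
    """Join two lists of dict rows by comparing every outer row with every inner row."""
    matches = []
    for outer_row in outer:          # one pass over the outer table
        for inner_row in inner:      # full scan of the inner table for each outer row
            if outer_row[outer_key] == inner_row[inner_key]:
                matches.append({**outer_row, **inner_row})
    return matches


# Hypothetical sample tables.
courses = [
    {"course": "Data Warehouse", "dept_id": 1},
    {"course": "Data Science", "dept_id": 2},
]
departments = [
    {"id": 1, "department": "Computer Science"},
    {"id": 2, "department": "Mathematics"},
]

for row in compare_every_pair_join(courses, departments, "dept_id", "id"):
    print(row["course"], "->", row["department"])
```

With M outer rows and N inner rows this performs M × N comparisons, which is why the approach becomes costly on large tables.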

33. Which of the following is not a Nested Loop Join technique?


a) Block nested-loop join
b) Naive nested-loop join
c) Index nested-loop join
d) Temporary index nested-loop join

34. ---------------- are suitable for the VLDB environment as they are useful for joining large data sets or tables.
a) Sort-Merge
b) Hash-Based
c) Naive nested-loop
d) Index nested-loop

35. A -------------- task is one that requires some search or inference, such as finding the global minimum of a complex function or finding the most likely explanation for a set of observations.
a) Non-trivial
b) Multidimensional
c) Trivial
d) Unrelated

36. ------------- is a process of extracting useful and previously unknown information or patterns from large and complex datasets.
a) Data mining
b) Database
c) Data structure
d) Algorithm

37. ----------------- consists of dividing a given data set into groups based on some criteria or rule.
a) Decision trees
b) Clustering
c) Prediction
d) Estimation

38. -------------- consists of examining the properties of a newly presented observation and assigning it to a predefined class.
a) Classification
b) Clustering
c) Prediction
d) Estimation

39. As opposed to the discrete outcome of classification (i.e., YES or NO), the task that deals with continuous-valued outcomes is called ----------.
a) Classification
b) Clustering
c) Prediction
d) Estimation

40. Unlike classification, ------------------- does NOT depend on predefined classes.


a) Classification
b) Clustering
c) Prediction
d) Estimation

Subjective Questions (3 Marks)

Question#1
Discuss any two keys to a successful rapid prototyping methodology.
Question#2
Discuss the first two steps of the DWH development life cycle.
Question#3
What is web warehousing?
Question#4
Which class of anomalies pertains to missing records?
Question#5
Describe the following:
i. Simple ratio
ii. Min or Max operation

Question#6
Apply Run-Length Encoding (RLE) to the following input streams and state the output:

Input Stream-a: 1100001110000111


Input Stream-b: 011100111100000
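
A minimal sketch of run-length encoding is given here for checking an answer. It assumes each run is reported as a (count, symbol) pair; the notation expected in the course may differ, and the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(bits):
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(bits)]


def rle_decode(pairs):
    """Rebuild the original string from (count, symbol) pairs."""
    return "".join(symbol * count for count, symbol in pairs)


stream_a = "1100001110000111"
stream_b = "011100111100000"

for name, stream in (("a", stream_a), ("b", stream_b)):
    encoded = rle_encode(stream)
    # Decoding must give back the input, otherwise the encoding lost information.
    assert rle_decode(encoded) == stream
    print(name, encoded)
```

The counts in each encoded stream should add up to the length of the input, which is a quick sanity check on a hand-worked answer.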

Subjective Questions (5 Marks)

Question#1
What are the signs of trouble during the development of a DWH?

Question#2
Discuss the steps of a smooth DWH implementation process.

Question#3
A jeans factory produces men's jeans pants, and each pair takes 10 minutes to complete cutting, stitching and finishing. The management is considering the introduction of a parallel production pipeline. Your task is to calculate the amount of time that could be saved by implementing this pipeline when manufacturing 100 pants consecutively.
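
The calculation can be sketched as below. The question does not state how the 10 minutes are divided among the stages, so this sketch assumes three stages (cutting, stitching, finishing) of equal duration, 10/3 minutes each; with a different split the figures change. The function names are hypothetical.

```python
from fractions import Fraction


def sequential_time(n_items, stage_times):
    """Each item passes through all stages before the next item starts."""
    return n_items * sum(stage_times)


def pipelined_time(n_items, stage_times):
    """The first item takes the full time; after that one item finishes per slowest-stage interval."""
    if n_items == 0:
        return 0
    return sum(stage_times) + (n_items - 1) * max(stage_times)


# Assumption: three equal stages that together take 10 minutes.
stages = [Fraction(10, 3)] * 3
n = 100

before = sequential_time(n, stages)
after = pipelined_time(n, stages)
print("sequential:", before, "minutes")
print("pipelined: ", after, "minutes")
print("saved:     ", before - after, "minutes")
```

Under the equal-stage assumption the pipeline needs N + 2 stage intervals for N items instead of 3N, so the saving grows with the batch size.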

Question#4
Consider the following table structure for storing course records in a university. You are required to apply Cluster Indexing on the basis of "Department" to the given table.

Course id  Courses               Department        Faculty        Reference_Book
001        Distributed DBMS      Computer Science  Ayesha Ahmad   Database Systems
002        Cloud Computing       Computer Science  Bilal Hashim   Cloud Database Systems
003        Data Warehouse        Computer Science  Bilal Hashim   Big Data
004        Operating System      Computer Science  Ali Asghar     OS Basics
005        Data Science          Mathematics       Essa Khan      Cloud Database Systems
006        Discrete Mathematics  Mathematics       Muhammad Musa  Intro to Discrete Mathematics
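
One way to picture clustering on "Department" is sketched below, as an illustration rather than the course's prescribed answer format. It assumes a cluster index means storing rows with the same Department value next to each other and keeping one index entry per distinct value that points to the first row of that group. The function name `build_cluster_index` is hypothetical.

```python
# Rows from the table: (course id, course, department, faculty, reference book).
rows = [
    ("001", "Distributed DBMS", "Computer Science", "Ayesha Ahmad", "Database Systems"),
    ("002", "Cloud Computing", "Computer Science", "Bilal Hashim", "Cloud Database Systems"),
    ("003", "Data Warehouse", "Computer Science", "Bilal Hashim", "Big Data"),
    ("004", "Operating System", "Computer Science", "Ali Asghar", "OS Basics"),
    ("005", "Data Science", "Mathematics", "Essa Khan", "Cloud Database Systems"),
    ("006", "Discrete Mathematics", "Mathematics", "Muhammad Musa", "Intro to Discrete Mathematics"),
]


def build_cluster_index(table, key):
    """Order rows by the clustering key and record where each key value starts."""
    ordered = sorted(table, key=key)  # stable sort keeps the original order within a group
    index = {}
    for position, row in enumerate(ordered):
        index.setdefault(key(row), position)  # keep only the first position per value
    return ordered, index


ordered_rows, dept_index = build_cluster_index(rows, key=lambda row: row[2])
for department, start in dept_index.items():
    print(department, "-> starts at row position", start)
```

Because the rows are stored grouped by department, a lookup for one department reads a single contiguous range starting at the indexed position.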
