Paper Pattern-Sample Paper - CS614
MCQs
5. What is the role of the business analyst in the Kimball approach to Data Warehouse
Implementation?
a) To design the dimensional model
b) To monitor and maintain the data warehouse
c) To gather and document business requirements
d) To develop the ETL process
6. Which of the following is one of the five signs of trouble in a data warehousing project?
a) End users are not involved hands-on from day one throughout the program.
b) IT team members have experience with the access tools.
c) The data design is finished before participants have experimented with the tools and live data.
d) The project has proceeded for only a few days.
7. What is a possible pitfall in the Data Warehousing life cycle and development?
a) Having a weak business sponsor.
b) Having enough time for ETL.
c) Having a subject-matter expert in the data modeling team.
d) Having a single server for the different environments.
9. What is the importance of having a subject-matter expert in the Data Modeling team?
a) To get definitive answers to questions.
b) To extend the project timeline.
c) To have someone with industry experience.
d) To have an outside consultant.
11. What are the three reasons for Warehousing Web Data?
a) Searching the web, analyzing web traffic, and archiving the web
b) Storing data, processing data, and retrieving data
c) Cleaning data, transforming data, and loading data
d) None of the above
12. What are the three major approaches to accessing information stored on the web?
a) Keyword-based search, querying deep web sources, and random surfing
b) Indexing, searching, and browsing
c) Sorting, grouping, and aggregating
d) Only star schema
15. The important aspect of the data warehouse environment is that data found within the data
warehouse is ___________.
a) subject-oriented.
b) time-variant.
c) integrated.
d) All of the above.
17. ____________ predicts future trends and behaviors, allowing business managers to make
proactive, knowledge-driven decisions.
a) Data warehouse.
b) Data mining.
c) Data marts.
d) Metadata.
22. The challenge of finding and removing duplicate records in a merged database due to
inconsistent or incorrect representation in various data sets is known as ______________ .
a) Merge/Purge Problem
b) Cleansing Problem
c) Transformation Problem
d) Quality Problem
23. After completing most of the transformation and cleansing, especially correcting single-
source errors and resolving conflicting representations, we perform the _____ task.
a) Duplicate Elimination
b) Duplicate Identification
c) Duplicate Classification
d) Duplicate Recombination
24. In the Information Age, organizations lacking the ability to "learn" are at a disadvantage.
This condition is referred to as ----------- functioning.
a) Impaired
b) Slow
c) Active
d) Advance
25. ------------------ is conservative in nature, assigning the dimension an aggregate value that
cannot exceed the value of its weakest data quality indicator
a) Min operator
b) Max operator
c) Average operator
d) Sum operator
26. Which operation can be applied to handle dimensions that involve the aggregation of
multiple data quality indicators?
a) Average Sum
b) Weighted Ratio
c) Minimum or Maximum
d) Accumulate Ratio
28. ------------- can enhance system performance on systems that are over-utilized or have
limited I/O bandwidth.
a) Parallelism
b) Serialization
c) Mining
d) Thrashing
29. A ------------ is a type of index that contains an entry for every single value in a column of a
table.
a) Dense Index
b) Sparse Index
c) Primary Index
d) Hash Index
30. ------------------ indexes are stored in a tree structure that has branch blocks which point to
lower-level blocks.
a) Dense
b) Sparse
c) Tree
d) Hash
31. An ---------- is often used for information retrieval and is optimized for fast search and
retrieval operations, with updates being a secondary consideration.
a) Inverted index
b) Sparse Index
c) Primary Index
d) Hash Index
32. --------------- is a simple join algorithm that compares every row from one table with every
row from another table to find the matching rows.
a) Naive nested-loop join
b) Index nested-loop join
c) Temporary index nested-loop join
d) None of the given options
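The join described in question 32 can be sketched in a few lines. This is a minimal illustration only: the function name, the row format (lists of dictionaries), and the sample tables are hypothetical and do not come from the course material.

```python
def naive_nested_loop_join(outer_rows, inner_rows, key):
    """Join two lists of dict rows by comparing every outer row with every inner row."""
    joined = []
    for outer in outer_rows:          # one pass over the outer table
        for inner in inner_rows:      # full scan of the inner table for each outer row
            if outer[key] == inner[key]:
                joined.append({**outer, **inner})
    return joined


# Hypothetical sample tables, for illustration only.
students = [
    {"dept_id": 1, "student": "Ali"},
    {"dept_id": 2, "student": "Sara"},
    {"dept_id": 3, "student": "Omar"},
]
departments = [
    {"dept_id": 1, "dept": "Computer Science"},
    {"dept_id": 2, "dept": "Mathematics"},
]

matches = naive_nested_loop_join(students, departments, "dept_id")
for row in matches:
    print(row)
```

With 3 outer rows and 2 inner rows the inner comparison runs 3 x 2 = 6 times, which is why the cost grows with the product of the two table sizes.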
34. ---------------- are suitable for the VLDB environment as they are useful for joining large
data sets or tables.
a) Sort-Merge
b) Hash-Based
c) Naive nested-loop
d) Index nested-loop
35. -------------- means that some search or inference is required, such as finding the global
minimum of a complex function or finding the most likely explanation for a set of observations.
a) Non-trivial
b) Multidimensional
c) Trivial
d) Unrelated
36. ------------- is a process of extracting useful and previously unknown information or patterns
from large and complex datasets.
a) Data mining
b) Database
c) Data structure
d) Algorithm
37. ----------------- consists of dividing a given data set into groups based on some criteria or rule.
a) Decision trees
b) Clustering
c) Prediction
d) Estimation
39. As opposed to the discrete outcome of classification (i.e., YES or NO), the technique that deals with
continuous-valued outcomes is called ----------
a) Classification
b) Clustering
c) Prediction
d) Estimation
Question#1
Discuss any two keys to a successful rapid prototyping methodology.
Question#2
Discuss the first two steps of the DWH development life cycle.
Question#3
What is web warehousing?
Question#4
Which class of anomalies pertains to missing records?
Question#5
Describe the following:
i. Simple ratio
ii. Min or Max operation
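The two forms named in this question can be illustrated with a short sketch. It assumes the usual definitions: a simple ratio is 1 minus the fraction of undesirable outcomes, and the min or max operation aggregates several data quality indicators into one dimension value. The function names and all sample numbers are hypothetical.

```python
from fractions import Fraction


def simple_ratio(undesirable, total):
    """1 - (undesirable outcomes / total outcomes); 1 is best, 0 is worst."""
    return 1 - Fraction(undesirable, total)


def min_operator(indicators):
    """Conservative aggregate: cannot exceed the weakest indicator."""
    return min(indicators)


def max_operator(indicators):
    """Liberal aggregate: takes the strongest indicator."""
    return max(indicators)


# Hypothetical example: 5 incomplete records out of 100.
completeness = simple_ratio(5, 100)
print(completeness, float(completeness))   # 19/20 0.95

# Hypothetical indicator scores for one dimension, each on a 0..1 scale.
indicator_scores = [Fraction(9, 10), Fraction(7, 10), Fraction(8, 10)]
print(min_operator(indicator_scores))      # 7/10
print(max_operator(indicator_scores))      # 9/10
```

Fractions are used instead of floats only to keep the arithmetic exact in the printed results.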
Question#6
Apply Run-Length Encoding (RLE) to the following and mention the output:
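Run-length encoding replaces each run of identical consecutive values with the value and the length of the run. The sketch below shows the idea on a hypothetical input string, since no input accompanies the question here; the function name and the (value, count) output format are also assumptions.

```python
from itertools import groupby


def rle_encode(values):
    """Return a list of (value, run_length) pairs for consecutive equal values."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# Hypothetical input, for illustration only.
sample = "AAAABBBCCD"
encoded = rle_encode(sample)
print(encoded)   # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
```

RLE pays off when the data contains long runs, for example a sorted low-cardinality column; on data with no repeats the encoded form is longer than the input.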
Question#1
What are the signs of trouble during the development of a DWH?
Question#2
Discuss the steps of a smooth DWH implementation process.
Question#3
In a jeans factory that produces men's pants, each pair takes 10 minutes to complete cutting,
stitching, and finishing. The management is considering the introduction of a parallel production
pipeline. Your task is to calculate the amount of time that could be saved by implementing this pipeline
when manufacturing 100 pairs consecutively.
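One way to set up this calculation is sketched below. It assumes an ideal pipeline in which cutting, stitching, and finishing are three stages of equal length (10/3 minutes each) with no stalls between stages; the question does not state the stage times, so that split is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction


def sequential_time(n_items, item_minutes):
    """Total time when each item is finished before the next one starts."""
    return n_items * item_minutes


def pipelined_time(n_items, item_minutes, n_stages):
    """Ideal pipeline: the first item takes the full time, then one item
    completes every (item_minutes / n_stages) minutes."""
    stage_minutes = Fraction(item_minutes, n_stages)
    return item_minutes + (n_items - 1) * stage_minutes


# Values from the question; the 3 equal stages are an assumption.
seq = sequential_time(100, 10)        # 100 * 10 = 1000 minutes
pipe = pipelined_time(100, 10, 3)     # 10 + 99 * (10/3) = 340 minutes
saved = seq - pipe
print(seq, pipe, saved)               # 1000 340 660
```

Under these assumptions the pipeline saves 660 minutes. If the three stages take unequal times, the slowest stage sets the rate at which finished items leave the pipeline and the saving is smaller.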
Question#4
Consider the following table structure for storing course records in a university. You are required
to apply Cluster Indexing on the basis of “Department” to the given table.
Course id  Courses               Department        Faculty        Reference_Book
001        Distributed DBMS      Computer Science  Ayesha Ahmad   Database Systems
002        Cloud Computing       Computer Science  Bilal Hashim   Cloud Database Systems
003        Data Warehouse        Computer Science  Bilal Hashim   Big Data
004        Operating System      Computer Science  Ali Asghar     OS Basics
005        Data Science          Mathematics       Essa Khan      Cloud Database Systems
006        Discrete Mathematics  Mathematics       Muhammad Musa  Intro to Discrete Mathematics
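A cluster index on “Department” can be sketched as follows, assuming the common definition: the rows are kept physically ordered by the clustering column, and the index holds one entry per distinct value pointing to the first row of that value's block. The tuple layout and variable names are hypothetical; the rows are taken from the table above.

```python
# Rows as (course_id, course, department, faculty, reference_book).
courses = [
    ("001", "Distributed DBMS", "Computer Science", "Ayesha Ahmad", "Database Systems"),
    ("002", "Cloud Computing", "Computer Science", "Bilal Hashim", "Cloud Database Systems"),
    ("003", "Data Warehouse", "Computer Science", "Bilal Hashim", "Big Data"),
    ("004", "Operating System", "Computer Science", "Ali Asghar", "OS Basics"),
    ("005", "Data Science", "Mathematics", "Essa Khan", "Cloud Database Systems"),
    ("006", "Discrete Mathematics", "Mathematics", "Muhammad Musa", "Intro to Discrete Mathematics"),
]
DEPARTMENT = 2  # position of the clustering column in each row tuple

# Step 1: keep the rows ordered by the clustering column (sorted() is stable,
# so rows within one department keep their original relative order).
clustered_rows = sorted(courses, key=lambda row: row[DEPARTMENT])

# Step 2: one index entry per distinct department, pointing at the first row
# of that department's block.
cluster_index = {}
for position, row in enumerate(clustered_rows):
    cluster_index.setdefault(row[DEPARTMENT], position)

print(cluster_index)   # {'Computer Science': 0, 'Mathematics': 4}
```

A lookup for one department then jumps to the indexed position and reads forward until the department value changes, instead of scanning the whole table.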