Assignment 7 Solution
A. tasks
B. inputs
C. outputs
D. None of these
Ans: B) inputs
Explanation: Map, written by the user, takes an input pair and produces a set of intermediate
key/value pairs.
Q. 2 _________ function is responsible for consolidating the results produced by each of the
Map() functions/tasks.
A. Reducer
B. Map
C. Reduce
D. All of the mentioned
Ans: C) Reduce
Explanation: The MapReduce library groups together all intermediate values associated with
the same intermediate key and passes them to the Reduce function.
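The division of labour between Map and Reduce can be illustrated with a minimal in-memory sketch of word counting. This is plain Python, not the MapReduce library itself; the names map_fn, reduce_fn and run_mapreduce are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_fn(doc_name, text):
    """Map: take one input pair and emit intermediate (word, 1) pairs."""
    return [(word, 1) for word in text.split()]


def reduce_fn(word, counts):
    """Reduce: consolidate all values that share one intermediate key."""
    return word, sum(counts)


def run_mapreduce(inputs):
    """Toy driver (an assumption of this sketch, not a real library call)."""
    # Group intermediate values by key, as the MapReduce library would.
    groups = defaultdict(list)
    for name, text in inputs:
        for key, value in map_fn(name, text):
            groups[key].append(value)
    # Hand each key and its grouped values to the Reduce function.
    return dict(reduce_fn(key, values) for key, values in groups.items())


result = run_mapreduce([("a.txt", "to be or not"), ("b.txt", "to be")])
```

Here result maps each word to its total count across both inputs; the grouping step in the middle is the part the library performs on the user's behalf.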
A. sort
B. shuffle
C. reduce
D. None of these
Ans: A) sort
Explanation: When a reduce worker has read all intermediate data, it sorts it by the
intermediate keys so that all occurrences of the same key are grouped together. The sorting is
needed because typically many different keys map to the same reduce task. If the amount of
intermediate data is too large to fit in memory, an external sort is used.
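The effect of sorting by intermediate key can be sketched in a few lines of Python. The example assumes the data fits in memory, so it does not show the external sort mentioned above; group_sorted is a hypothetical helper name.

```python
from itertools import groupby
from operator import itemgetter


def group_sorted(pairs):
    """Sort intermediate pairs by key, then collect the values for each key."""
    ordered = sorted(pairs, key=itemgetter(0))
    # groupby only merges adjacent items, which is why the sort must come first.
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


# Several different keys arriving interleaved at one reduce task.
pairs = [("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)]
grouped = group_sorted(pairs)
```

After the sort, all occurrences of a key are adjacent, so a single pass is enough to build the value list that each Reduce call receives.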
Q. 4 In the local disk of the HDFS namenode the files which are stored persistently are:
Q. 5 In HDFS, application data are stored on other servers called ___________. All servers
are _____________ and communicate with each other using ______________ protocols.
A. Spark Streaming
B. FlatMap
C. Driver
D. Resilient Distributed Dataset (RDD)
Q. 7 In Spark, a dataset with elements of type A can be transformed into a dataset with
elements of type B using an operation called ____________, which passes each element
through a user-provided function of type A → List[B].
A. Map
B. Filter
C. flatMap
D. Reduce
Ans: C) flatMap
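Explanation: A dataset with elements of type A can be transformed into a dataset with
elements of type B using an operation called flatMap, which passes each element through a
user-provided function of type A → List[B] and concatenates the resulting lists.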
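The behaviour of flatMap can be sketched in plain Python without Spark. The function flat_map below is a hypothetical stand-in for the Spark operation: it applies a function of type A → List[B] to every element and flattens the results into one dataset.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def flat_map(f: Callable[[A], List[B]], dataset: Iterable[A]) -> List[B]:
    """Apply f to each element and concatenate the lists it returns."""
    return [b for a in dataset for b in f(a)]


lines = ["hello world", "", "map reduce spark"]
# Each line (type str) yields zero or more words (a list of str).
words = flat_map(str.split, lines)
```

Unlike a plain map, the output need not have the same number of elements as the input: the empty line contributes nothing, while the other lines contribute several words each.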
Q. 8 Spark also allows programmers to create two restricted types of shared variables to
support two simple but common usage patterns such as ___________ and ____________.
A. Map, Reduce
B. Flatmap, Filter
C. Sort, Shuffle
D. Broadcast, Accumulators
Explanation: Broadcast variables: Spark lets the programmer create a “broadcast variable”
object that wraps the value and ensures that it is copied to each worker only once.
Accumulators: These are variables that workers can only “add” to using an associative
operation, and that only the driver can read.
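The two usage patterns can be modelled with a small single-process sketch. The Broadcast and Accumulator classes below are hypothetical toy classes written for this example, not Spark's API; they only mirror the restrictions described above (read-only shared value, add-only counter read by the driver).

```python
class Broadcast:
    """Toy model: a read-only value shared with every worker."""

    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value


class Accumulator:
    """Toy model: workers may only add; the driver reads the total."""

    def __init__(self, zero=0):
        self._total = zero

    def add(self, amount):
        self._total += amount

    @property
    def value(self):
        return self._total


def worker(partition, stopwords, skipped):
    """Filter one partition, reading the broadcast and adding to the counter."""
    kept = []
    for word in partition:
        if word in stopwords.value:
            skipped.add(1)
        else:
            kept.append(word)
    return kept


# "Driver" side: create the shared variables once, then run the workers.
stopwords = Broadcast(frozenset({"the", "a"}))
skipped = Accumulator()
partitions = [["the", "cat"], ["a", "dog", "the"]]
kept = [w for part in partitions for w in worker(part, stopwords, skipped)]
```

In a real cluster the point of the broadcast is that the lookup set is shipped to each worker once rather than with every task, and the point of the accumulator is that addition is associative, so partial sums from workers can be combined in any order.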
Statement-1: Reactive routing protocols ask each host (or many hosts) to maintain global
topology information, thus a route can be provided immediately when requested.
Statement-2: Proactive routing protocols work on demand: each host computes a
route for a specific destination only when necessary.
Explanation: (i) Proactive routing protocols ask each host (or many hosts) to maintain global
topology information, so a route can be provided immediately when requested. However, a large
number of control messages is required to keep each host updated on the latest topology
changes.
(ii) Reactive routing protocols work on demand: each host computes a route for a
specific destination only when necessary. Topology changes that do not influence active
routes do not trigger any route maintenance function, so communication overhead is lower
than with proactive routing protocols.
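The contrast between the two families can be sketched with a toy model. It does not implement any particular routing protocol: the topology is a plain adjacency dictionary, route computation is a breadth-first search, and the class names ProactiveHost and ReactiveHost are hypothetical.

```python
from collections import deque


def bfs_route(graph, src, dst):
    """Return one minimum-hop path from src to dst, or None if unreachable."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return None


class ProactiveHost:
    """Keeps routes to every destination; must be refreshed on each change."""

    def __init__(self, name, graph):
        self.name = name
        self.update(graph)

    def update(self, graph):
        # Stands in for the control traffic needed after a topology change.
        self.table = {d: bfs_route(graph, self.name, d)
                      for d in graph if d != self.name}

    def route(self, dst):
        return self.table[dst]  # available immediately


class ReactiveHost:
    """Computes a route only when one is actually requested."""

    def __init__(self, name):
        self.name = name
        self.discoveries = 0

    def route(self, graph, dst):
        self.discoveries += 1  # work happens on demand
        return bfs_route(graph, self.name, dst)


graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
proactive = ProactiveHost("A", graph)
reactive = ReactiveHost("A")
on_demand = reactive.route(graph, "D")
```

The proactive host pays up front for routes it may never use, while the reactive host pays a discovery cost at request time and does nothing for destinations nobody asks about.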
A. Broadcast, Flooding
B. Shortest-path, Convergecast
C. Flooding, Broadcast storm
D. Broadcast, Convergecast
Broadcast storm problem refers to the fact that flooding may result in excessive redundancy,
contention, and collision. This causes high protocol overhead and interference to other
ongoing communication sessions.
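The redundancy part of the broadcast storm problem can be made concrete with a small count. The sketch below assumes an idealised model in which every host rebroadcasts a packet exactly once and every neighbour hears each transmission; contention and collision are not modelled. The function name flood is hypothetical.

```python
from collections import deque


def flood(graph, source):
    """Flood one packet; each host rebroadcasts the first time it hears it."""
    seen = {source}
    queue = deque([source])
    transmissions = 0
    receptions = 0
    while queue:
        node = queue.popleft()
        transmissions += 1
        for nbr in graph[node]:
            receptions += 1  # every neighbour hears the broadcast
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    # Only one reception per non-source host is actually needed.
    redundant = receptions - (len(seen) - 1)
    return transmissions, receptions, redundant


# Four hosts that are all within range of each other.
dense = {h: [o for o in "ABCD" if o != h] for h in "ABCD"}
stats = flood(dense, "A")
```

In this dense four-host example only three receptions are useful, yet twelve occur, so nine are redundant; the denser the neighbourhood, the larger this gap becomes.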