HPC Final Merged
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
Q.no 2. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
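A note on Q.no 2 above: DFS on a binary tree that processes a node before descending into its children is exactly pre-order traversal. A minimal host-side C++ sketch (the node type and sample tree are illustrative assumptions, not part of any paper):

    #include <cstdio>

    // Hypothetical node type used only for this sketch.
    struct Node { int value; Node *left, *right; };

    // Visit the root, then the left subtree, then the right subtree:
    // depth-first search on a binary tree is pre-order traversal.
    void dfsPreOrder(const Node *n) {
        if (n == nullptr) return;
        std::printf("%d ", n->value);
        dfsPreOrder(n->left);
        dfsPreOrder(n->right);
    }

    int main() {
        Node l{2, nullptr, nullptr}, r{3, nullptr, nullptr};
        Node root{1, &l, &r};
        dfsPreOrder(&root);   // prints: 1 2 3
        return 0;
    }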
Q.no 3. Regarding implementation of Breadth First Search using queues, what is
the maximum distance between two nodes present in the queue? (considering
each edge length 1)
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
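A note on Q.no 3 above: the answer is "At most 1", because a BFS queue only ever holds nodes from two consecutive levels; a node at distance d is enqueued only while a node at distance d-1 is being processed. A minimal host-side C++ sketch with that invariant asserted (the adjacency-list encoding and names are assumptions for illustration):

    #include <cassert>
    #include <queue>
    #include <vector>

    // BFS with unit edge lengths; dist[v] is the distance of v from src.
    void bfs(const std::vector<std::vector<int>> &adj, int src) {
        std::vector<int> dist(adj.size(), -1);
        std::queue<int> q;
        dist[src] = 0;
        q.push(src);
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            // Front and back of the queue differ by at most one level.
            assert(q.empty() || dist[q.back()] - dist[u] <= 1);
            for (int v : adj[u]) {
                if (dist[v] == -1) {
                    dist[v] = dist[u] + 1;  // children go one level deeper
                    q.push(v);
                }
            }
        }
    }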
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : processing
B : parallel processing
C : serial processing
D : multitasking
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
A : 2
B : 4
C : 6
D : 8
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
B : Dividing the number of processors
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : Management
B : Media mass
C : Business
D : Science
Q.no 18. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
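A note on the kernel-qualifier question above: the correct choice is __global__, which marks a function as a kernel that executes on the device, must return void, and is launched from the host. A minimal CUDA sketch (the kernel name, sizes, and data are illustrative assumptions):

    #include <cuda_runtime.h>

    // __global__ marks a kernel: device code, void return type,
    // invoked from the host with the <<<grid, block>>> syntax.
    __global__ void scaleKernel(float *data, float factor, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;   // one element per thread
    }

    int main() {
        const int n = 256;
        float *d = nullptr;
        cudaMalloc((void **)&d, n * sizeof(float));
        cudaMemset(d, 0, n * sizeof(float));
        scaleKernel<<<(n + 127) / 128, 128>>>(d, 2.0f, n);  // kernel invocation
        cudaDeviceSynchronize();        // wait for the device to finish
        cudaFree(d);
        return 0;
    }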
Q.no 19. The time complexity of a quick sort algorithm which makes use of
median, found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
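A note on Q.no 19 above: with the true median (found by an O(n) selection algorithm) as pivot, every partition splits the array exactly in half, so the running time satisfies

    T(n) = 2T(n/2) + O(n)

which solves to T(n) = O(nlogn); the linear-time selection removes quicksort's O(n^2) worst case at only a constant-factor cost per level.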
Q.no 21. Which of the following is not an application of Depth First Search?
Q.no 22. The logical view of a machine supporting the message-passing paradigm
consists of p processes, each with its own _______
A : program counter
B : stack
A : Merge sort
C : Heap sort
D : Selection sort
Q.no 25. In ………………. only one process at a time is allowed into its critical
section, among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
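A note on Q.no 25 above (answer: Mutual Exclusion): the property is that at most one process or thread occupies the critical section for a given resource at any time. A minimal host-side C++ sketch using a mutex (the names and counts are illustrative assumptions):

    #include <mutex>
    #include <thread>

    std::mutex m;             // guards the shared resource
    int shared_counter = 0;

    void worker() {
        for (int i = 0; i < 1000; ++i) {
            std::lock_guard<std::mutex> lock(m);  // one thread at a time
            ++shared_counter;                     // the critical section
        }
    }

    int main() {
        std::thread t1(worker), t2(worker);
        t1.join();
        t2.join();
        return shared_counter == 2000 ? 0 : 1;    // always 2000 with the lock
    }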
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
Q.no 27. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
A : DMA
B : CPU
C : I/O
D : Memory
Q.no 30. Consider a situation in which the assignment operation is very costly.
Which of the following sorting algorithms should be used so that the
number of assignment operations is minimized in general?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : single threaded
B : multithreaded
Q.no 32. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
A : Counting sort
B : Bucket sort
C : Radix sort
D : Shell sort
A : sequential
B : unique
C : simultaneous
Q.no 36. What happens when the event for which a thread is blocked occurs?
C : thread completes
C : Simultaneous execution
Q.no 38. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
Q.no 39. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
C : share data
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
Q.no 42. The time required to create a new thread in an existing process is
___________
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
Q.no 45. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
Q.no 46. The basic operations in the message-passing programming paradigm are
___
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
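A note on Q.no 46 above: the basic operations of the message-passing paradigm are send and receive. A minimal MPI sketch in host-side C++ (the value and tag are illustrative assumptions; run with at least two processes):

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, value = 42;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  // send
        else if (rank == 1)
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);                         // receive
        MPI_Finalize();
        return 0;
    }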
Q.no 48. Which of the following becomes easier for ceramic components through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
A : multi processing
B : parallel processing
C : serial processing
D : multitasking
A : High, high
B : Low, low
C : High, low
D : Low, high
A : input
B : output
C : operating system
D : memory
Q.no 52. The management of data flow between computers or devices or between
nodes in a network is called
A : Flow control
B : Data Control
C : Data Management
D : Flow Management
Q.no 53. Which of the following are TRUE for direct communication?
C : Exactly N/2 links exist between each pair of processes (N = max. number of
processes supported by the system)
Q.no 55. Which of the following two operations are provided by the IPC facility?
Q.no 56. Which of the following is not a possible way of data exchange?
A : Simplex
B : Multiplex
C : Half-duplex
D : Full-duplex
Q.no 57. Which of the following algorithms has the lowest worst-case time
complexity?
A : Insertion sort
B : Selection sort
C : Quick Sort
D : Heap sort
Q.no 58. A thread shares its resources (like data section, code section, open files,
signals) with ___________
A : Instruction level
B : Thread level
C : Transaction level
D : None of Above
Q.no 60. The register context and stacks of a thread are deallocated when the
thread ___________
A : terminates
B : blocks
C : unblocks
D : spawns
Answer for Question No. 1 is b
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
Q.no 1. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
Q.no 3. The time complexity of a quick sort algorithm which makes use of median,
found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : Management
B : Media mass
C : Business
D : Science
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
A : 2
B : 4
C : 6
D : 8
A : Merge sort
C : Heap sort
D : Selection sort
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
B : Dividing the number of processors
Q.no 15. In ………………. only one process at a time is allowed into its critical
section, among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
A : program counter
B : stack
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
Q.no 18. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
Q.no 19. Which of the following is not an application of Breadth First Search?
D : Path Finding
A : processing
B : parallel processing
C : serial processing
D : multitasking
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
Q.no 25. The decomposition technique in which a function is used several
times is called _________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : θ (n)
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
A : single threaded
B : multithreaded
C : share data
A : DMA
B : CPU
C : I/O
D : Memory
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
A : Quantum mechanics
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
A : Bus based
B : Mesh
C : Linear Array
D : All of above
Q.no 33. The time required to create a new thread in an existing process is
___________
C : Simultaneous execution
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
Q.no 37. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
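A note on Q.no 37 above: standard merge sort does not detect sorted input; it always splits and merges, so the cost is

    T(n) = 2T(n/2) + Θ(n) = Θ(nlogn)

whether or not the array is already sorted, making O(nlogn) the answer.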
Q.no 38. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
Q.no 39. Which of the following becomes easier for ceramic components through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
Q.no 41. What happens when the event for which a thread is blocked occurs?
C : thread completes
A : multi processing
B : parallel processing
C : serial processing
D : multitasking
A : Counting sort
B : Bucket sort
C : Radix sort
D : Shell sort
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
Q.no 45. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
A : sequential
B : unique
C : simultaneous
Q.no 47. Consider a situation in which the assignment operation is very costly.
Which of the following sorting algorithms should be used so that the
number of assignment operations is minimized in general?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 48. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
Q.no 50. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 51. The link between two processes P and Q to send and receive messages is
called __________
A : communication link
B : message-passing link
C : synchronization link
A : Multithreading
B : Cyber cycle
C : Internet of things
D : Cyber-physical system
Q.no 53. The amount of data that can be carried from one point to another in a
given time period is called
A : Scope
B : Capacity
C : Bandwidth
D : Limitation
Q.no 54. An octa-core processor is a processor of the computer system that
contains
A : 2 processors
B : 4 processors
C : 6 processors
D : 8 processors
Q.no 55. Given a number of elements in the range [0…n^3], which of the following
sorting algorithms can sort them in O(n) time?
A : Counting sort
B : Bucket sort
C : Radix sort
D : Quick sort
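A note on Q.no 55 above (answer: Radix sort): written in base n, every value in [0…n^3] has only a constant number of digits (log_n(n^3) = 3), so radix sort needs about three stable counting-sort passes of O(n) each, i.e. O(n) in total; plain counting sort would need a table over the full O(n^3) range.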
Q.no 57. The register context and stacks of a thread are deallocated when the
thread ___________
A : terminates
B : blocks
C : unblocks
D : spawns
Q.no 58. Which of the following two operations are provided by the IPC facility?
Q.no 59. Which of the following is not a possible way of data exchange?
A : Simplex
B : Multiplex
C : Half-duplex
D : Full-duplex
A : Instruction level
B : Thread level
C : Transaction level
D : None of Above
Answer for Question No. 1 is b
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
Q.no 1. The time complexity of a quick sort algorithm which makes use of median,
found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 3. Which one of the following is not shared by threads?
A : program counter
B : stack
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
Q.no 8. The time complexity of heap sort in the worst case is
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
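A note on Q.no 8 above: heap sort builds the heap in O(n) and then performs n extract operations costing O(logn) each, so the worst case is O(n) + n·O(logn) = O(nlogn).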
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
Q.no 10. Which of the following is not an application of Breadth First Search?
D : Path Finding
Q.no 11. Which of the following is not an application of Depth First Search?
Q.no 12. The logical view of a machine supporting the message-passing paradigm
consists of p processes, each with its own _______
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
Q.no 14. In ………………. only one process at a time is allowed into its critical
section, among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
Q.no 15. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
B : Dividing the number of processors
C : Dividing the number of tasks
Q.no 18. The decomposition technique in which a function is used several
times is called _________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
Q.no 19. The decomposition technique in which the input is divided is called
_________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Merge sort
C : Heap sort
D : Selection sort
Q.no 21. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
A : processing
B : parallel processing
C : serial processing
D : multitasking
A : 2
B : 4
C : 6
D : 8
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
A : θ (n)
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
Q.no 29. The time required to create a new thread in an existing process is
___________
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
A : Counting sort
B : Bucket sort
C : Radix sort
D : Shell sort
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
Q.no 35. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
Q.no 36. Which of the following becomes easier for ceramic components through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
Q.no 37. Parallel computing uses _____ execution
A : sequential
B : unique
C : simultaneous
A : single threaded
B : multithreaded
C : Simultaneous execution
Q.no 40. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : High, high
B : Low, low
C : High, low
D : Low, high
Q.no 42. Consider a situation in which the assignment operation is very costly.
Which of the following sorting algorithms should be used so that the
number of assignment operations is minimized in general?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 43. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
C : share data
Q.no 45. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
Q.no 46. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
Q.no 47. The basic operations in the message-passing programming paradigm are
___
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
Q.no 49. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
Q.no 51. Which of the following is not a possible way of data exchange?
A : Simplex
B : Multiplex
C : Half-duplex
D : Full-duplex
Q.no 52. A thread shares its resources (like data section, code section, open files,
signals) with ___________
A : Instruction level
B : Thread level
C : Transaction level
D : None of Above
Q.no 55. Which of the following algorithms has the lowest worst-case time
complexity?
A : Insertion sort
B : Selection sort
C : Quick Sort
D : Heap sort
Q.no 56. The transparency that allows resources and clients to move within a
system is called
A : Mobility transparency
B : Concurrency transparency
C : Performance transparency
D : Replication transparency
A : cost
B : reliability
C : uncertainty
D : scalability
A : input
B : output
C : operating system
D : memory
Q.no 59. The management of data flow between computers or devices or between
nodes in a network is called
A : Flow control
B : Data Control
C : Data Management
D : Flow Management
C : Process
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
A : processing
B : parallel processing
C : serial processing
D : multitasking
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
A : Management
B : Media mass
C : Business
D : Science
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
Q.no 11. Regarding implementation of Breadth First Search using queues, what is
the maximum distance between two nodes present in the queue? (considering
each edge length 1)
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
A : Merge sort
C : Heap sort
D : Selection sort
Q.no 13. Which of the following is not a mapping technique?
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
Q.no 15. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
B : Dividing the number of processors
A : 2
B : 4
C : 6
D : 8
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 21. Which of the following is not an application of Breadth First Search?
D : Path Finding
Q.no 22. In ………………. only one process at a time is allowed into its critical
section, among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
Q.no 23. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
Q.no 24. The decomposition technique in which a function is used several
times is called _________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : program counter
B : stack
Q.no 26. Which of the following becomes easier for ceramic components through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
A : DMA
B : CPU
C : I/O
D : Memory
A : Bus based
B : Mesh
C : Linear Array
D : All of above
Q.no 29. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
A : single threaded
B : multithreaded
Q.no 31. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
C : share data
Q.no 34. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
A : θ (n)
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
C : Simultaneous execution
Q.no 37. What happens when the event for which a thread is blocked occurs?
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
Q.no 39. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
Q.no 41. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
Q.no 43. The basic operations in the message-passing programming paradigm are
___
A : Quantum mechanics
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
A : multi processing
B : parallel processing
C : serial processing
D : multitasking
Q.no 46. The time required to create a new thread in an existing process is
___________
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
A : sequential
B : unique
C : simultaneous
Q.no 50. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 51. The amount of data that can be carried from one point to another in a
given time period is called
A : Scope
B : Capacity
C : Bandwidth
D : Limitation
A : cost
B : reliability
C : uncertainty
D : scalability
C : Process
A : Instruction level
B : Thread level
C : Transaction level
D : None of Above
Q.no 56. NVIDIA thought that the 'unifying theme' of every form of parallelism is the
A : CDA thread
B : PTA thread
C : CUDA thread
D : CUD thread
Q.no 57. The transparency that allows resources and clients to move within a
system is called
A : Mobility transparency
B : Concurrency transparency
C : Performance transparency
D : Replication transparency
Q.no 58. The one that is not a type of multiprocessor of the computer system is
A : dual core
B : blade server
C : clustered system
D : single core
Q.no 59. Which of the following are TRUE for direct communication?
C : Exactly N/2 links exist between each pair of processes (N = max. number of
processes supported by the system)
A : there is another process R to handle and pass on the messages between P and Q
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
A : 2
B : 4
C : 6
D : 8
D : Path Finding
Q.no 3. The time complexity of a quick sort algorithm which makes use of median,
found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
A : processing
B : parallel processing
C : serial processing
D : multitasking
A : Management
B : Media mass
C : Business
D : Science
Q.no 6. In ………………. only one process at a time is allowed into its critical section,
among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
Q.no 7. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
A : program counter
B : stack
Q.no 10. The logical view of a machine supporting the message-passing paradigm
consists of p processes, each with its own _______
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
Q.no 13. Regarding implementation of Breadth First Search using queues, what is
the maximum distance between two nodes present in the queue? (considering
each edge length 1)
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
Q.no 15. The decomposition technique in which a function is used several
times is called _________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : Merge sort
C : Heap sort
D : Selection sort
Q.no 22. The decomposition technique in which the input is divided is called
_________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
Q.no 23. Which of the following is not an application of Depth First Search?
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
Q.no 25. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
Q.no 26. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
Q.no 27. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
A : multi processing
B : parallel processing
C : serial processing
D : multitasking
Q.no 29. The basic operations in the message-passing programming paradigm are
___
A : Quantum mechanics
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
Q.no 33. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
A : single threaded
B : multithreaded
A : Bus based
B : Mesh
C : Linear Array
D : All of above
A : sequential
B : unique
C : simultaneous
A : High, high
B : Low, low
C : High, low
D : Low, high
C : Simultaneous execution
C : share data
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
Q.no 43. What happens when the event for which a thread is blocked occurs?
C : thread completes
Q.no 45. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
Q.no 46. Consider a situation in which the assignment operation is very costly.
Which of the following sorting algorithms should be used so that the
number of assignment operations is minimized in general?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 47. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
Q.no 48. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
Q.no 49. The time required to create a new thread in an existing process is
___________
A : DMA
B : CPU
C : I/O
D : Memory
Q.no 52. Which of the following are TRUE for direct communication?
C : Exactly N/2 links exist between each pair of processes (N = max. number of
processes supported by the system)
Q.no 53. The transparency that allows resources and clients to move within a
system is called
A : Mobility transparency
B : Concurrency transparency
C : Performance transparency
D : Replication transparency
A : there is another process R to handle and pass on the messages between P and Q
Q.no 55. The architecture which can compute several tasks simultaneously at
the processor level itself is called:
D : All of above
Q.no 56. The amount of data that can be carried from one point to another in a
given time period is called
A : Scope
B : Capacity
C : Bandwidth
D : Limitation
A : input
B : output
C : operating system
D : memory
Q.no 58. NVIDIA thought that the 'unifying theme' of every form of parallelism is the
A : CDA thread
B : PTA thread
C : CUDA thread
D : CUD thread
Q.no 59. The transparency that enables accessing local and remote resources
using identical operations is called ____________
A : Access transparency
B : Concurrency transparency
C : Performance transparency
D : Scaling transparency
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
Q.no 3. Most message-passing programs are written using
Q.no 4. In ………………. only one process at a time is allowed into its critical section,
among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Merge sort
C : Heap sort
D : Selection sort
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
A : program counter
B : stack
B : Dividing the number of processors
Q.no 12. Which of the following is not an application of Depth First Search?
Q.no 13. Which of the following is not an application of Breadth First Search?
D : Path Finding
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : Management
B : Media mass
C : Business
D : Science
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
Q.no 18. The logical view of a machine supporting the message-passing paradigm
consists of p processes, each with its own _______
Q.no 19. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
Q.no 20. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
Q.no 23. The time complexity of a quick sort algorithm which makes use of
median, found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
A : processing
B : parallel processing
C : serial processing
D : multitasking
Q.no 25. The decomposition technique in which the input is divided is called
_________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : θ (n)
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
A : Quantum mechanics
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
A : DMA
B : CPU
C : I/O
D : Memory
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
C : Simultaneous execution
Q.no 32. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
A : single threaded
B : multithreaded
C : share data
A : High, high
B : Low, low
C : High, low
D : Low, high
Q.no 36. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 37. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
Q.no 38. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
A : Counting sort
B : Bucket sort
C : Radix sort
D : Shell sort
Q.no 41. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
Q.no 43. The time required to create a new thread in an existing process is
___________
Q.no 44. Which of the following becomes easier for ceramic components through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
A : sequential
B : unique
C : simultaneous
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
Q.no 47. What happens when the event for which a thread is blocked occurs?
A : thread moves to the ready queue
C : thread completes
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
Q.no 50. The basic operations in the message-passing programming paradigm are
___
Q.no 51. Which of the following are TRUE for direct communication?
C : Exactly N/2 links exist between each pair of processes (N = max. number of
processes supported by the system)
D : Exactly two links exist between each pair of processes
Q.no 52. A thread shares its resources (like data section, code section, open files,
signals) with ___________
Q.no 53. The one that is not a type of multiprocessor of the computer system is
A : dual core
B : blade server
C : clustered system
D : single core
A : Instruction level
B : Thread level
C : Transaction level
D : None of Above
Q.no 55. NVIDIA thought that the 'unifying theme' of every form of parallelism is the
A : CDA thread
B : PTA thread
C : CUDA thread
D : CUD thread
A : there is another process R to handle and pass on the messages between P and Q
A : input
B : output
C : operating system
D : memory
Q.no 58. The management of data flow between computers or devices or between
nodes in a network is called
A : Flow control
B : Data Control
C : Data Management
D : Flow Management
C : Process
A : Instruction level
B : Thread level
C : Transaction level
D : None of Above
Answer for Question No. 1 is c
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
Q.no 1. In ………………. only one process at a time is allowed into its critical section,
among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
Q.no 3. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
A : Merge sort
C : Heap sort
D : Selection sort
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
A : Management
B : Media mass
C : Business
D : Science
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
Q.no 11. The decomposition technique in which a function is used several
times is called _________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
Q.no 12. Which of the following is not an application of Breadth First Search?
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
Q.no 14. Regarding implementation of Breadth First Search using queues, what is
the maximum distance between two nodes present in the queue? (considering
each edge length 1)
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
Q.no 15. The logical view of a machine supporting the message-passing paradigm
consists of p processes, each with its own _______
A : processing
B : parallel processing
C : serial processing
D : multitasking
A : program counter
B : stack
C : both program counter and stack
A : 2
B : 4
C : 6
D : 8
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
B : Dividing the number of processors
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
Q.no 23. The decomposition technique in which the input is divided is called
_________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
Q.no 25. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
C : share data
A : High, high
B : Low, low
C : High, low
D : Low, high
Q.no 28. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
Q.no 29. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
C : Simultaneous execution
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
B : CPU
C : I/O
D : Memory
A : multi processing
B : parallel processing
C : serial processing
D : multitasking
Q.no 34. What happens when the event for which a thread is blocked occurs?
C : thread completes
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
Q.no 37. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
A : Bus based
B : Mesh
C : Linear Array
D : All of above
Q.no 39. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
Q.no 40. The time required to create a new thread in an existing process is
___________
A : single threaded
B : multithreaded
Q.no 42. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
Q.no 44. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
Q.no 46. The basic operations in the message-passing programming paradigm are
___
B : unique
C : simultaneous
Q.no 48. Consider a situation in which the assignment operation is very costly.
Which of the following sorting algorithms should be used so that the
number of assignment operations is minimized in general?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : Counting sort
B : Bucket sort
C : Radix sort
D : Shell sort
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
Q.no 51. The link between two processes P and Q to send and receive messages is
called __________
A : communication link
B : message-passing link
C : synchronization link
A : input
B : output
C : operating system
D : memory
Q.no 53. The one that is not a type of multiprocessor of the computer system is
A : dual core
B : blade server
C : clustered system
D : single core
Q.no 54. A thread shares its resources (like data section, code section, open files,
signals) with ___________
Q.no 55. NVIDIA thought that the 'unifying theme' of every form of parallelism is the
A : CDA thread
B : PTA thread
C : CUDA thread
D : CUD thread
A : Counting sort
B : Bucket sort
C : Radix sort
D : Quick sort
Q.no 58. Which of the following two operations are provided by the IPC facility?
A : there is another process R to handle and pass on the messages between P and Q
Q.no 60. An octa-core processor is a processor of the computer system that
contains
A : 2 processors
B : 4 processors
C : 6 processors
D : 8 processors
Answer for Question No. 1 is a
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
A : program counter
B : stack
D : Path Finding
Q.no 3. Regarding implementation of Breadth First Search using queues, what is
the maximum distance between two nodes present in the queue? (considering
each edge length 1)
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
Q.no 4. In ………………. only one process at a time is allowed into its critical section,
among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
Q.no 5. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
A : Management
B : Media mass
C : Business
D : Science
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
Q.no 13. The time complexity of a quick sort algorithm which makes use of
median, found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
Q.no 14. The decomposition technique in which a function is used several
times is called _________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : 2
B : 4
C : 6
D : 8
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : processing
B : parallel processing
C : serial processing
D : multitasking
B : Dividing the number of processors
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 23. The decomposition technique in which the input is divided is called
_________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
Q.no 25. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
A : Quantum mechanics
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
A : High, high
B : Low, low
C : High, low
D : Low, high
A : single threaded
B : multithreaded
Q.no 30. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
Q.no 31. Which of the following becomes easier for ceramic components through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
A : θ (n)
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
Q.no 35. Consider a situation in which the assignment operation is very costly.
Which of the following sorting algorithms should be used so that the
number of assignment operations is minimized in general?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
C : share data
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
Q.no 38. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
Q.no 39. What happens when the event for which a thread is blocked occurs?
C : thread completes
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
Q.no 41. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : multi processing
B : parallel processing
C : serial processing
D : multitasking
Q.no 43. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
A : sequential
B : unique
C : simultaneous
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
C : Simultaneous execution
D : May use networking
A : Centralized memory
B : Shared memory
C : Message passing
D : Both A and B
Q.no 48. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
A : Counting sort
B : Bucket sort
C : Radix sort
D : Shell sort
Q.no 50. The time required to create a new thread in an existing process is
___________
A : cost
B : reliability
C : uncertainty
D : scalability
A : Instruction level
B : Thread level
C : Transaction level
D : None of Above
A : input
B : output
C : operating system
D : memory
Q.no 54. An octa-core processor is a processor of the computer system that
contains
A : 2 processors
B : 4 processors
C : 6 processors
D : 8 processors
Q.no 56. Data access and storage are elements of job throughput, of __________.
A : Flexibility
B : Adaptation
C : Efficiency
D : Dependability
Q.no 58. The link between two processes P and Q to send and receive messages is
called __________
A : communication link
B : message-passing link
C : synchronization link
Q.no 59. Which of the following algorithms has the lowest worst-case time
complexity?
A : Insertion sort
B : Selection sort
C : Quick Sort
D : Heap sort
Q.no 60. The register context and stacks of a thread are deallocated when the
thread ___________
A : terminates
B : blocks
C : unblocks
D : spawns
Answer for Question No. 1 is c
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
A : program counter
B : stack
A : Can be anything
B : 0
C : At most 1
D : Insufficient Information
Q.no 3. The time complexity of heap sort in the worst case is
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
Q.no 5. The time complexity of a quick sort algorithm which makes use of median,
found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
A : 2
B : 4
C : 6
D : 8
Q.no 10. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
Q.no 11. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : processing
B : parallel processing
C : serial processing
D : multitasking
Q.no 14. The decomposition technique in which a function is used several
times is called _________
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Merge sort
C : Heap sort
D : Selection sort
Q.no 16. In ………………. only one process at a time is allowed into its critical
section, among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
B : Dividing the number of processors
C : Dividing the number of tasks
Q.no 18. Which of the following is not an application of Depth First Search?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
Q.no 22. Which of the following is not an application of Breadth First Search?
D : Path Finding
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
Q.no 24. The logical view of a machine supporting the message-passing paradigm
consists of p processes, each with its own _______
A : Management
B : Media mass
C : Business
D : Science
C : Simultaneous execution
A : High, high
B : Low, low
C : High, low
D : Low, high
Q.no 28. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
A : Serialization
B : Parallelism
C : Serial processing
D : Distribution
Q.no 30. The basic operations in the message-passing programming paradigm are
___
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
Q.no 32. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
A : Bus based
B : Mesh
C : Linear Array
D : All of above
C : share data
A : Centralized computing
B : Decentralized computing
C : Parallel computing
D : All of Above
Q.no 36. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
Q.no 37. Consider a situation in which the assignment operation is very costly.
Which of the following sorting algorithms should be used so that the
number of assignment operations is minimized in general?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
A : sequential
B : unique
C : simultaneous
A : DMA
B : CPU
C : I/O
D : Memory
A : Quantum mechanics
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
Q.no 42. What happens when the event for which a thread is blocked occurs?
A : thread moves to the ready queue
C : thread completes
Q.no 43. The time required to create a new thread in an existing process is
___________
Q.no 44. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
A : θ (n)
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
B : parallel processing
C : serial processing
D : multitasking
Q.no 48. Which of the following becomes easier for ceramic components through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
Q.no 49. If the given input array is sorted or nearly sorted, which of the following
algorithms gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 50. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
C : Process
Q.no 52. NVIDIA thought that the 'unifying theme' of every form of parallelism is the
A : CDA thread
B : PTA thread
C : CUDA thread
D : CUD thread
A : there is another process R to handle and pass on the messages between P and Q
Q.no 54. The transparency that enables accessing local and remote resources
using identical operations is called ____________
A : Access transparency
B : Concurrency transparency
C : Performance transparency
D : Scaling transparency
Q.no 55. An octa-core processor is a processor of the computer system that
contains
A : 2 processors
B : 4 processors
C : 6 processors
D : 8 processors
Q.no 56. Given a number of elements in the range [0…n^3], which of the following
sorting algorithms can sort them in O(n) time?
A : Counting sort
B : Bucket sort
C : Radix sort
D : Quick sort
Q.no 57. Which of the following is not a possible way of data exchange?
A : Simplex
B : Multiplex
C : Half-duplex
D : Full-duplex
Q.no 58. The register context and stacks of a thread are deallocated when the
thread ___________
A : terminates
B : blocks
C : unblocks
D : spawns
A : Multithreading
B : Cyber cycle
C : Internet of things
D : Cyber-physical system
A : cost
B : reliability
C : uncertainty
D : scalability
Answer for Question No. 1 is c
1) All questions are Multiple Choice Questions having single correct option.
7) Use only black/blue ball point pen to darken the appropriate circle.
A : Mandatory Instructions/sec
B : Millions of Instructions/sec
C : Most of Instructions/sec
Q.no 2. In ………………. only one process at a time is allowed into its critical section,
among all processes that have critical sections for the same resource.
A : Mutual Exclusion
B : Synchronization
C : Deadlock
D : Starvation
Q.no 3. Which of the following is not an application of Breadth First Search?
D : Path Finding
A : Can be anything
B:0
C : At most 1
D : Insufficient Information
A : Data Decomposition
B : Recursive Decomposition
C : Speculative Decomposition
D : Exploratory Decomposition
A : Merge sort
C : Heap sort
D : Selection sort
A : program counter
B : stack
A : processing
B : parallel processing
C : serial processing
D : multitasking
A : symmetric paradigm
B : asymmetric paradigm
C : asynchronous paradigm
D : synchronous paradigm
A : 2
B : 4
C : 6
D : 8
A : Static Mapping
B : Dynamic Mapping
C : Hybrid Mapping
D : All of Above
Q.no 13. The time complexity of a quick sort algorithm which makes use of
median, found by an O(n) algorithm, as pivot element is
A : O(n^2)
B : O(nlogn)
C : O(nlog(logn))
D : O(n)
A : Data Decomposition
B : Recursive Decomposition
C : Serial Decomposition
D : Exploratory Decomposition
Q.no 16. The logical view of a machine supporting the message-passing paradigm
consists of p processes, each with its own _______
A : O(log n)
B : O(n)
C : O(nlogn)
D : O(n^2)
B : Dividing the number of processors
A : Management
B : Media mass
C : Business
D : Science
Q.no 20. The kernel code is identified by the ________ qualifier with void return type
A : __host__
B : __global__
C : __device__
D : void
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 22. Which of the following is not an application of Depth First Search?
A : kernel thread
B : kernel initialization
C : kernel termination
D : kernel invocation
A : Selection sort
B : Heap sort
C : Quick Sort
D : Merge sort
Q.no 25. Depth First Search is equivalent to which of the traversals in the Binary
Trees?
A : Pre-order Traversal
B : Post-order Traversal
C : Level-order Traversal
D : In-order Traversal
Q.no 26. Cloud computing offers a broader concept than which of the following?
A : Parallel computing
B : Centralized computing
C : Utility computing
D : Decentralized computing
A : Multithreading
B : Cyber cycle
C : Internet of things
D : None of these
Q.no 29. The basic operations in the message-passing programming paradigm are
___
C : Simultaneous execution
A : sequential
B : unique
C : simultaneous
A : single threaded
B : multithreaded
A : DMA
B : CPU
C : I/O
D : Memory
Q.no 34. High-performance computing tasks of the computer system are done by
A : node clusters
B : network clusters
C : both a and b
D : Beowulf clusters
C : share data
Q.no 36. The time required to create a new thread in an existing process is
___________
B : Low, low
C : High, low
D : Low, high
Q.no 38. Which process for ceramic components is easier through nanostructuring?
A : Lubrication
B : Coating
C : Fabrication
D : Wear
Q.no 39. _____ are major issues with non-buffered blocking sends
C : synchronization
D : scheduling
A : θ (n)
B : θ (nlogn)
C : θ (n^2)
D : θ (n(logn)^2)
Q.no 41. What happens when the event for which a thread is blocked occurs?
C : thread completes
Q.no 42. If the given input array is sorted or nearly sorted, which of the following
algorithm gives the best performance?
A : Insertion sort
B : Selection sort
C : Bubble sort
D : Merge sort
Q.no 43. If one thread opens a file with read privileges then ___________
A : other threads in the another process can also read from that file
B : other threads in the same process can also read from that file
A : Bus based
B : Mesh
C : Linear Array
D : All of above
A : multi processing
B : parallel processing
C : serial processing
D : multitasking
A : allows processes to communicate and synchronize their actions when using the
same address space
B : allows processes to communicate and synchronize their actions without using the
same address space
A : Counting sort
B : Bucket sort
C : Radix sort
D : Shell sort
Q.no 48. Running merge sort on an array of size n which is already sorted is
A : O(n)
B : O(nlogn)
C : O(n^2)
D : O(log n)
A : Parallel computation
B : Parallel processes
C : Parallel development
D : Parallel programming
A : Quantum mechanics
B : Newtonian mechanics
C : Macro-dynamic
D : Geophysics
Q.no 51. Given a number of elements in the range [0…n^3], which of the following
sorting algorithms can sort them in O(n) time?
A : Counting sort
B : Bucket sort
C : Radix sort
D : Quick sort
Q.no 52. Thread synchronization is required because ___________
A : there is another process R to handle and pass on the messages between P and Q
Q.no 54. Which of the following is not a possible way of data exchange?
A : Simplex
B : Multiplex
C : Half-duplex
D : Full-duplex
Q.no 55. The link between two processes P and Q to send and receive messages is
called __________
A : communication link
B : message-passing link
C : synchronization link
Q.no 56. An octa-core processor is a processor of the computer system that
contains
A : 2 processors
B : 4 processors
C : 6 processors
D : 8 processors
A : cost
B : reliability
C : uncertainty
D : scalability
Q.no 59. The transparency that enables accessing local and remote resources
using identical operations is called ____________
A : Access transparency
B : Concurrency transparency
C : Performance transparency
D : Scaling transparency
Q.no 60. NVIDIA's 'unifying theme' of every form of parallelism is the
A : CDA thread
B : PTA thread
C : CUDA thread
D : CUD thread
Answer for Question No 1. is b
A. Code compiled for hardware of one compute capability will not need to be re-
compiled to run on hardware of another
B. Different compute capabilities may imply a different amount of local memory per
thread
Answer : B
True or False: The threads in a thread block are distributed across SM units so that each
thread is executed by one SM unit.
A. True
B. False
Answer : B
Answer : C
True or false: Functions annotated with the __global__ qualifier may be executed on the
host or the device
A. True
B. False
Answer : A
Which of the following correctly describes a GPU kernel
B. All thread blocks involved in the same computation use the same kernel
Answer : B
C. Block and grid level parallelism - Different blocks or grids execute different tasks
D. Data parallelism - Different threads and blocks process different parts of data in
memory
Answer :A
What strategy does the GPU employ if the threads within a warp diverge in their execution?
A. Threads are moved to different warps so that divergence does not occur within a
single warp
C. All possible execution paths are run by all threads in a warp serially so that thread
instructions do not diverge
Answer : C
Which of the following does not result in uncoalesced (i.e. serialized) memory access on the
K20 GPUs installed on Stampede
Answer : A
Which of the following correctly describes the relationship between Warps, thread blocks,
and CUDA cores?
A. A warp is divided into a number of thread blocks, and each thread block executes on
a single CUDA core
B. A thread block may be divided into a number of warps, and each warp may execute
on a single CUDA core
C. A thread block is assigned to a warp, and each thread in the warp is executed on a
separate CUDA core
Answer : B
Answer : A
A. CUDA Libraries
B. CUDA Runtime
C. CUDA Driver
D. All Above
Answer : D
A. C
B. C++
C. Fortran
D. All Above
Answer : D
Threads support Shared memory and Synchronization
A. True
B. False
Answer : A
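A minimal CUDA sketch showing both features together, a block-shared buffer and a barrier; the block size of 256 and all names are illustrative assumptions:

```cuda
__global__ void reverse_block(int *d, int n) {
    __shared__ int s[256];          // one copy shared by all threads of the block
    int t = threadIdx.x;
    if (t < n) s[t] = d[t];
    __syncthreads();                // barrier: every thread must finish writing s[]
    if (t < n) d[t] = s[n - 1 - t]; // safe: reads see all writes made before the barrier
}
// Host-side launch for one 256-thread block: reverse_block<<<1, 256>>>(dev_data, 256);
```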
B. Medical Imaging
C. Computational Science
E. All Above
Answer : E
A. True
B. False
Answer : A
What are the issues in sorting?
C. All above
Answer : C
Answer : A
A. Shell sort
B. Quick sort
C. Odd-Even transposition
D. Option A & C
Answer : D
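Of the sorts named above, odd-even transposition is the simplest to sketch. The loop below alternates even and odd phases; the compare-exchanges inside one phase touch disjoint pairs, which is exactly what makes the algorithm parallelizable (a minimal C sketch, names illustrative):

```c
/* Odd-even transposition sort: n phases, each with about n/2 compare-exchanges. */
void odd_even_sort(int *a, int n) {
    for (int phase = 0; phase < n; phase++) {
        /* even phase pairs (0,1),(2,3),...; odd phase pairs (1,2),(3,4),... */
        for (int i = phase % 2; i + 1 < n; i += 2) {
            if (a[i] > a[i + 1]) { int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t; }
        }
    }
}
```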
Answer : A
Formally, given a weighted graph G(V, E, w), the all-pairs shortest paths problem is to
find the shortest paths between all pairs of vertices. True or False?
A. True
B. False
Answer : A
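One standard sequential formulation of the all-pairs problem is Floyd-Warshall, which relaxes every vertex pair through each intermediate vertex k in O(V^3) time; the parallel formulations discussed next partition this work differently. A minimal C99 sketch (the INF sentinel and names are assumptions):

```c
#include <limits.h>
#define INF (INT_MAX / 2)   /* "no edge" sentinel, halved so INF + w cannot overflow */

/* Floyd-Warshall: dist[i][j] starts as w(i,j) (or INF), ends as the shortest i->j distance. */
void floyd_warshall(int V, int dist[V][V]) {
    for (int k = 0; k < V; k++)
        for (int i = 0; i < V; i++)
            for (int j = 0; j < V; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}
```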
A. One approach partitions the vertices among different processes and has each process
compute the single-source shortest paths for all vertices assigned to it. We refer to
this approach as the source-partitioned formulation.
B. Another approach assigns each vertex to a set of processes and uses the parallel
formulation of the single-source algorithm to solve the problem on each set of
processes. We refer to this approach as the source-parallel formulation.
C. Both are true
D. None of these is true
Answer : C
Search algorithms can be used to solve discrete optimization problems. True or False ?
A. True
B. False
Answer : A
C. All of above
Answer : C
List the communication strategies for parallel BFS.
D. All of above
Answer : D
In a compare-split operation
A. Each process sends its block of size n/p to the other process
B. Each process merges the received block with its own block and retains only the
appropriate half of the merged block
C. Both A & B
Answer : C
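A minimal C sketch of that compare-split step, written for one process that already holds the partner's block; m is the block size n/p, and keep_low says whether this process is the lower-ranked partner (all names are illustrative assumptions):

```c
#include <stdlib.h>
#include <string.h>

/* Merge my sorted block with the received one; keep the lower or upper half. */
void compare_split(int *mine, const int *theirs, int m, int keep_low) {
    int *merged = malloc(2 * m * sizeof(int));
    int i = 0, j = 0;
    for (int k = 0; k < 2 * m; k++)   /* standard two-way merge */
        merged[k] = (j >= m || (i < m && mine[i] <= theirs[j]))
                        ? mine[i++] : theirs[j++];
    memcpy(mine, keep_low ? merged : merged + m, m * sizeof(int));
    free(merged);
}
```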
Answer : D
A. Execution Time
B. Total Parallel Overhead
C. Speedup
D. Efficiency
E. Cost
F. All above
Answer : F
The efficiency of a parallel program can be written as: E = Ts / pTp. True or False?
A. True
B. False
Answer : A
Overhead function or total overhead of a parallel system is defined as the total time collectively
spent by all the processing elements over and above that required by the fastest known
sequential algorithm for solving the same problem on a single processing element.
True or False?
A. True
B. False
Answer : A
What is Speedup?
A. A measure that captures the relative benefit of solving a problem in parallel. It is defined as the
ratio of the time taken to solve a problem on a single processing element to the time required to
solve the same problem on a parallel computer with p identical processing elements.
B. A measure of the fraction of time for which a processing element is usefully
employed.
C. None of the above
Answer : A
In an ideal parallel system, speedup is equal to p and efficiency is equal to one. True or
False?
A. True
B. False
Answer : A
A parallel system is said to be ________________ if the cost of solving a problem on a
parallel computer has the same asymptotic growth (in Θ terms) as a function of the input
size as the fastest-known sequential algorithm on a single processing element.
A. Cost optimal
B. Non Cost optimal
Answer : A
Using fewer than the maximum possible number of processing elements to execute a
parallel algorithm is called ______________ a parallel system in terms of the number of
processing elements.
A. Scaling down
B. Scaling up
Answer : A
The __________________ function determines the ease with which a parallel system can
maintain a constant efficiency and hence achieve speedups increasing in proportion to the
number of processing elements.
A. Isoefficiency
B. Efficiency
C. Scalability
D. Total overhead
Answer : A
Minimum execution time for adding n numbers is Tp = n/p + 2 log p. True or False?
A. True
B. False
Answer : A
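The formula follows from the two phases of the algorithm: each processing element first adds its n/p local numbers, and the p partial sums are then combined along a tree in log p steps, each costing one addition plus one communication (hence the factor 2). With Ts ≈ n this also gives the speedup and efficiency definitions used elsewhere in this paper:

```latex
T_P = \frac{n}{p} + 2\log p, \qquad
S = \frac{T_S}{T_P} \approx \frac{n}{\frac{n}{p} + 2\log p}, \qquad
E = \frac{S}{p} = \frac{T_S}{p\,T_P}
```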
Unit I
1. Conventional architectures coarsely comprise of a_
A. A processor
B. Memory system
C Data path.
D All of Above
3. A pipeline is like_
A Latency
B Bandwidth
C Both a and b
D none of above
8. A single control unit that dispatches the same Instruction to various processors is__
A SIMD
B SPMD
C MIMD
D None of above
Unit 2
1. The First step in developing a parallel algorithm is_
A. Granularity
B. Priority
C. Modernity
D. None of above
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
7. The Owner Computes Rule generally states that the process assigned a particular data
item is responsible for_
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
A. True
B. False
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
4. A hypercube has_
A. 2^d nodes
B. 2d nodes
C. 2n nodes
D. n nodes
5. A binary tree in which processors are (logically) at the leaves and internal nodes are
routing nodes.
A. True
B. False
A. True
B. False
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
10. In All-to-All Personalized Communication, each node has a distinct message of size m
for every other node
A. True
B. False
4. Concrete having 28-day compressive strength in the range of 60 to 100 MPa.
a) HPC
b) VHPC
c) OPC
d) HSC
Answer: a
Explanation: High Performance Concrete (HPC) has a 28-day compressive strength in the
range of 60 to 100 MPa.
5. Concrete having 28-day compressive strength in the range of 100 to 150 MPa.
a) HPC
b) VHPC
c) OPC
d) HSC
Answer: b
Explanation: Very High Performance Concrete (VHPC) has a 28-day compressive strength
in the range of 100 to 150 MPa.
7. The choice of cement for high-strength concrete should not be based only on mortar-
cube tests but it should also include tests of compressive strengths of concrete at
___________ days.
a) 28, 56, 91
b) 28, 60, 90
c) 30, 60, 90
d) 30, 45, 60
Answer: a
Explanation: The choice of cement for high-strength concrete should not be based only on
mortar-cube tests but it should also include tests of compressive strengths of concrete at
28, 56, and 91 days.
Unit I
A. A processor
B. Memory system
C Data path.
D All of Above
D None of above
3. A pipeline is like_
B House pipeline
C Both a and b
D A gas line
C Branch Dependency
D All of above
A Latency
B Bandwidth
C Both a and b
D none of above
C none of above
8. A single control unit that dispatches the same Instruction to various
processors is__
A SIMD
B SPMD
C MIMD
D None of above
B Exchanging messages.
C Both A and B
D None of Above
A True
B False
Unit 2
B. Execute directly
C. Execute indirectly
D. None of Above
A. Granularity
B. Priority
C. Modernity
D. None of above
D. None of above
D. None of Above
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
A. recursive decomposition
B. data decomposition
C. exploratory decomposition
D. speculative decomposition
E. All of Above
7. The Owner Computes Rule generally states that the process assigned a
particular data item is responsible for_
D. None of Above
9. Speculative Decomposition consists of _
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
A. Task generation.
B. Task sizes.
D. All of Above
Unit 3
A. True
B. False
A. True
B. False
3. The dual of one-to-all broadcast is_
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
4. A hypercube has_
A. 2^d nodes
B. 2d nodes
C. 2n nodes
D. n nodes
5. A binary tree in which processors are (logically) at the leaves and internal
nodes are routing nodes.
A. True
B. False
A. True
B. False
D. Scatter Kernel
D. None of Above
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
B. False
SN. Question [Option 1 / Option 2 / Option 3 / Option 4]. Correct option. [marks]
1. Any condition that causes a processor to stall is called _____. [Hazard / Page fault / System error / None of the above]. Correct: 1 (Hazard). [1 mark]
2. The time lost due to branch instructions is often referred to as _____. [Latency / Delay / Branch penalty / None of the above]. Correct: 3 (Branch penalty). [1 mark]
3. _____ method is used in centralized systems to perform out-of-order execution. [Scorecard / Scoreboarding / Optimizing / Redundancy]. Correct: 2 (Scoreboarding). [1 mark]
4. The computer cluster architecture emerged as an alternative for ____. [ISA / Workstations / Supercomputers / Distributed systems]. Correct: 3 (Supercomputers). [1 mark]
5. An NVIDIA CUDA warp is made up of how many threads? [512 / 1024 / 312 / 32]. Correct: 4 (32). [1 mark]
6. Out-of-order instructions are not possible on GPUs. [True / False]. Correct: 2 (False). [1 mark]
7. CUDA supports programming in .... [C or C++ only / Java, Python / C, C++, third-party wrappers / Pascal]. Correct: 3. [1 mark]
8. FADD, FMAD, FMIN, FMAX are ----- supported by the scalar processors of NVIDIA GPUs. [32-bit IEEE floating-point instructions / 32-bit integer instructions / both / none of the above]. Correct: 1. [1 mark]
9. Each streaming multiprocessor (SM) of CUDA hardware has ------ scalar processors (SPs). [1024 / 128 / 512 / 8]. Correct: 4 (8). [1 mark]
10. Each NVIDIA GPU has ------ streaming multiprocessors. [8 / 1024 / 512 / 16]. Correct: 4 (16). [1 mark]
11. CUDA provides ------- warp and thread scheduling, and the overhead of thread creation is ------. ["programming-overhead", … / "zero-overhead", 1 clock / 64, 2 clock / 32, 1 clock]. Correct: 2. [2 marks]
12. Each warp of a GPU receives a single instruction and "broadcasts" it to all of its threads; this is a ---- operation. [SIMT / SIMD (single instruction, multiple data) / SISD / SIST]. Correct: 2. [1 mark]
13. Limitations of a CUDA kernel: [recursion, call… / no recursion, no… / recursion, no… / no recursion]. Correct: 2. [2 marks]
14. What is the Unified Virtual Machine? [options truncated]. Correct: 1. [1 mark]
15. _______ became the first language specifically designed by a GPU company to facilitate general-purpose computing on GPUs. [Python, … / C, CPUs / CUDA C, GPUs / Java, CPUs]. Correct: 3 (CUDA C). [1 mark]
16. The CUDA architecture consists of --------- for parallel computing kernels and functions. [RISC instruction set / CISC instruction set / ZISC instruction set / PTX instruction set]. Correct: 4 (PTX). [1 mark]
17. CUDA stands for --------, designed by NVIDIA. [Common Uni… / Complex Unid… / Compute Unified Device Architecture / Complex Unstructured…]. Correct: 3. [1 mark]
18. The host processor spawns multithread tasks (or kernels, as they are known in CUDA) onto the GPU. [True / False]. Correct: 1 (True). [1 mark]
19. The NVIDIA G80 is a ---- CUDA core device, the NVIDIA G200 is a ---- CUDA core device, and the NVIDIA Fermi is a ---- CUDA core device. [128, 256, 512 / 32, 64, 128 / 64, 128, 256 / 256, 512, 1024]. Correct: 1 (128, 256, 512). [3 marks]
20. NVIDIA 8-series GPUs offer -------- . [50-200 GFLOPS / 200-400 GFLOPS / 400-800 GFLOPS / 800-1000 GFLOPS]. Correct: 1 (50-200 GFLOPS). [1 mark]
21. IADD, IMUL24, IMAD24, IMIN, IMAX are ----------- supported by the scalar processors of NVIDIA GPUs. [32-bit IEEE floating-point instructions / 32-bit integer instructions / both / none of the above]. Correct: 2. [1 mark]
22. The CUDA hardware programming model supports: a) fully generally data-parallel architecture; … [a, c, d, f / b, c, d, e / a, d, e, f / a, b, c, d, e, f]. Correct: 4 (all). [2 marks]
23. In the CUDA memory model the following memory types are available: a) Registers; … [a, b, d, f / a, c, d, e, f / a, b, c, d, e, f / b, c, e, f]. Correct: 3. [2 marks]
24. What is the equivalent of the general C program int main(void) { printf(…); } in CUDA C? [options truncated]. Correct: 2. [2 marks]
25. Which function runs on the device (i.e. the GPU): a) __global__ void kernel(void) { }; b) int main(void) { … }? [a / b / both a, b / ---]. Correct: 1 (a). [1 mark]
26. A simple kernel for adding two integers: __global__ void add(int *a, int *b, int *c), where __global__ indicates that add() … [options truncated]. Correct: 1. [1 mark]
27. If a is a host variable and dev_a is a device (GPU) variable, to allocate memory on the device we use: [cudaMalloc(&dev… / malloc(&dev… / cudaMalloc((void **)&dev_a, size) / malloc((void…]. Correct: 3. [1 mark]
28. If a is a host variable and dev_a is a device (GPU) variable, to copy input from host to device we use: [memcpy(dev… / cudaMemcpy(dev_a, &a, size, cudaMemcpyHostToDevice) / memcpy((vo… / cudaMemcpy(…]. Correct: 2. [1 mark]
29. What does the triple-angle-bracket mark <<< >>> in a statement inside the main function indicate? [a call from host code to device code / a call from device code to host code / less-than comparison / greater-than comparison]. Correct: 1. [1 mark]
30. What makes CUDA code run in parallel? [__global__ … / the main() function / the kernel name / the first parameter inside <<< >>>]. Correct: 4. [1 mark]
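Questions 24-30 above all refer to the same canonical two-integer CUDA example; a runnable version consistent with those answers (cudaMalloc taking (void **), cudaMemcpy with an explicit direction flag, and the <<<blocks, threads>>> launch) is sketched below:

```cuda
#include <stdio.h>

__global__ void add(int *a, int *b, int *c) {   /* runs on the device (GPU) */
    *c = *a + *b;
}

int main(void) {
    int a = 2, b = 7, c;
    int *dev_a, *dev_b, *dev_c;
    cudaMalloc((void **)&dev_a, sizeof(int));   /* device allocations (Q27) */
    cudaMalloc((void **)&dev_b, sizeof(int));
    cudaMalloc((void **)&dev_c, sizeof(int));
    cudaMemcpy(dev_a, &a, sizeof(int), cudaMemcpyHostToDevice);  /* Q28 */
    cudaMemcpy(dev_b, &b, sizeof(int), cudaMemcpyHostToDevice);
    add<<<1, 1>>>(dev_a, dev_b, dev_c);         /* host-to-device kernel call (Q29) */
    cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);
    printf("2 + 7 = %d\n", c);
    cudaFree(dev_a); cudaFree(dev_b); cudaFree(dev_c);
    return 0;
}
```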
SN. (marks) Question [A / B / C / D]. Ans: answer
0. (1 mark) Interconnection networks can be classified as? [Both / Direct Network / Dynamic / Static]. Ans: Both (static and dynamic).
1. (1 mark) Parallel computers are used to solve which types of problems? [Algorithmic Problems / Optimization Problems / Both / None]. Ans: not recorded.
2. (1 mark) How many clocks control all the stages in a pipeline? [One / Three / Four / Five]. Ans: One.
3. (1 mark) Main memory in parallel computing is ____? [Shared / Parallel / Fixed / None]. Ans: Shared.
4. (1 mark) Which of these is not a class of parallel computing architecture? [Application Checkpointing / Distributed Computing / Symmetric Multiprocessing / Multicore Computing]. Ans: Application Checkpointing.
5. (1 mark) Parallel computing software solutions and techniques include: [Automatic Parallelization / Application Checkpointing / All / Parallel Programming Languages]. Ans: All.
6. (2 marks) The processors are connected to the memory through a set of? [Switches / Cables / Buses / Registers]. Ans: Switches.
7. (2 marks) Superscalar architecture has how many execution units? [Two / One / Three / Four]. Ans: not recorded.
8. (2 marks) What is used to hold the intermediate output in a pipeline? [Cache / RAM / ROM / Intermediate Register]. Ans: Intermediate Register.
9. (2 marks) Which organization performs sequencing of the Human Genome? [four scrambled variants of the consortium name]. Ans: International Human Genome Sequencing Consortium.
10. (2 marks) There are how many stages in a RISC processor? [Five / Three / Two / Six]. Ans: Five.
11. (2 marks) Over the last decade, DRAM access time has improved at what rate per year? [0.1 / 0.2 / 0.15 / None of the above]. Ans: 0.1 (10% per year).
12. (2 marks) Which memory acts as low-latency, high-bandwidth storage? [Cache / Register / DRAM / EPROM]. Ans: Cache.
13. (2 marks) Which processor architecture is this? (figure not reproduced) [SIMD / MIMD / MISD / MIMD]. Ans: not recorded.
14. (2 marks) Which core processor is this? (figure not reproduced) [Quad-Core / Dual-Core / Octa-Core / Single-Core]. Ans: Quad-Core.
15. (2 marks) Which of these is not a scalable design principle? [Data Caching / Decomposition / Simplification / Parsimony]. Ans: Data Caching.
16. (2 marks) The distance between any two nodes in a bus-based network is? [O(1) / O(n log n) / O(n) / O(n^2)]. Ans: O(1).
17. (2 marks) Early SIMD computers include: [All / MPP / CM-2 / Illiac IV]. Ans: All.
18. (2 marks) This is which configuration in Omega networks? (figure not reproduced) [Pass-through / Cross-Over / Shuffle / None]. Ans: Pass-through.
19. (2 marks) The automatic parallelization technique doesn't include: [Share Memory / Analyse / Schedule / Parse]. Ans: Share Memory (parallelization includes parse, analyse, schedule and code generation).
20. (2 marks) The Pentium 4 (P4) processor has how many pipeline stages? [20 / 15 / 18 / 10]. Ans: 20.
21. (3 marks) Which protocol is not used to remove concurrent writes? [Identify / Priority / Common / Sum]. Ans: Identify (Sum, Priority and Common are used).
22. (3 marks) EREW PRAM stands for? [Exclusive Read and Exclusive Write PRAM / Erasable Read and Erasable Write / Easily Read and Easily Write / None]. Ans: Exclusive Read and Exclusive Write PRAM.
23. (3 marks) During each clock cycle, multiple instructions are piped into the processor in ________? [Parallel / Series / Both a and b / None]. Ans: Parallel.
24. (3 marks) Which interconnection network uses this equation? (equation not reproduced) [Multistage Networks / Cross-Bar / Bus-Staged / Dynamic Networks]. Ans: Multistage Networks.
25. (3 marks) How many types of parallel computing are available from both proprietary and open-source parallel computing vendors? [4 / 2 / 3 / 6]. Ans: 4.
26. (3 marks) If a piece of data is repeatedly used, the effective latency of the memory system can be reduced by the cache. The fraction of data references satisfied by the cache is called? [Hit Ratio / Memory Ratio / Hit Fraction / Memory Fraction]. Ans: Hit Ratio (the cache hit ratio).
27. (3 marks) Superscalar architecture can create problems in? [Scheduling / Phasing / Data Extraction / Data-Compiling]. Ans: Scheduling.
28. (3 marks) In cut-through routing, a message is broken into fixed-size units called? [Flits / Flow Digits / Control Digits / All]. Ans: Flits.
29. (3 marks) The total communication time for cut-through routing is? (formula options not preserved). Ans: not recorded.
30. (1 mark) The disadvantage of the GPU pipeline is? [Load-balancing / Process balancing / Data balancing / All of the above]. Ans: Load-balancing.
31. (1 mark) Examples of GPU processors are: [Both / AMD Processors / NVIDIA / None]. Ans: Both (AMD and NVIDIA).
32. (1 mark) Simultaneous execution of different programs on a data stream is called? [Stream Parallelism / Data Execution / Data-Parallelism / None]. Ans: Stream Parallelism.
33. (1 mark) Early GPU controllers were known as? [Video Shifters / GPU Shifters / Video-Movers / GPU Controllers]. Ans: Video Shifters.
34. (1 mark) _____ development is a critical component of problem solving using computers. [Algorithm / Code / Pseudocode / Problem]. Ans: Algorithm.
35. (1 mark) GPU stands for? [Graphics Processing Unit / Graphical Processing Unit / Gaming Processing Unit / Graph Processing Unit]. Ans: Graphics Processing Unit.
36. (1 mark) What leads to concurrency? [Parallelism / Serial Processing / Decomposition / All]. Ans: Parallelism (e.g. several processes trying to print a file on a single printer).
37. (2 marks) The process of determining which screen-space pixel locations are covered by each triangle is known as? [Rasterization / Pixelisation / Fragmentation / Space-Determining Process]. Ans: Rasterization.
38. (2 marks) The programmable units of the GPU follow which programming model? [SPMD / MISD / MIMD / SIMD]. Ans: SPMD (single program, multiple data).
39. (2 marks) Which address space can ease the programming effort, especially if the distribution of data is different in different phases of the algorithm? [Shared Address / Parallel Address / Series Address / Data Address]. Ans: Shared Address.
40. (2 marks) Which are the hardware units that physically perform computations? [Processor / ALU / CPU / CU]. Ans: Processor.
41. (2 marks) Examples of graphics APIs are? [All / DirectX / CUDA / Open-CL]. Ans: All.
42. (2 marks) The mechanism by which tasks are assigned to processes for execution is called ___? [Mapping / Computation / Process / None]. Ans: Mapping.
43. (2 marks) A decomposition into a large number of small tasks is called __________ granularity. [Fine-grained / Coarse-grained / Vector-grained / All]. Ans: Fine-grained.
44. (2 marks) Identical operations being applied concurrently on different data items is called? [Data-Parallelism / Parallelism / Data Serialism / Concurrency]. Ans: Data Parallelism.
45. (2 marks) Systems which do not have parallel processing capabilities? [SISD / SIMD / MISD / MIMD]. Ans: SISD.
46. (2 marks) The time and the location in the program of a static one-way interaction is known ___. [a priori / Polling / Decomposition / Execution]. Ans: a priori.
47. (2 marks) Memory access in RISC architecture is limited to which instructions? [STA and LDA / CALL and RET / MOV and JMP / PUSH and POP]. Ans: STA and LDA.
48. (2 marks) Which algorithms can be implemented in both shared-address-space and message-passing paradigms? [Data-Parallel Algorithms / Data Algorithms / Quick-Sort Algorithm / Bubble Sort Algorithm]. Ans: Data-Parallel Algorithms.
49. (2 marks) Which type of distribution is this? (figure not reproduced) [Randomized Block Distribution / Block-Cyclic Distribution / Cyclic Distribution / Block Distribution]. Ans: Randomized Block Distribution.
50. (2 marks) An abstraction used to express dependencies among tasks and their relative order of execution is known as __________? [Task-Dependency Graph / Time-Dependency Graph / Dependency Graph / None]. Ans: Task-Dependency Graph.
51. (3 marks) Which is the simplest way to distribute an array and assign uniform contiguous portions of the array to different processes? [Block Distribution / Array Distribution / Process Distribution / All]. Ans: Block Distribution.
52. (3 marks) An example of a decomposition with a regular interaction pattern is? [Image-dithering / 8-Queens problem / Travelling Salesman Problem / Time-complexity problems]. Ans: Image dithering.
53. (3 marks) A feature of a task-dependency graph that determines the average degree of concurrency for a given granularity is? [Critical path / Process path / Granularity / Concurrency]. Ans: Critical path.
54. (3 marks) The shared-address-space programming paradigms can handle which interactions? [Both / Two-way / One-way / None]. Ans: Both (one-way and two-way).
55. (3 marks) Which distribution can result in an almost perfect load balance, due to the extremely fine-grained underlying decomposition? [Cyclic Distribution / Array Distribution / Block-Cyclic Distribution / Block Distribution]. Ans: Cyclic Distribution.
56. (3 marks) Data-sharing interactions can be categorized as __________ interactions. [Both / Read-Write / Read-only / None]. Ans: Both (read-only or read-write).
57. (3 marks) The way of structuring a parallel algorithm by selecting a decomposition and mapping technique and applying the appropriate strategy to minimize interactions is called? [Algorithm Model / Parallel Model / Mapping Model / Data Model]. Ans: Algorithm Model.
58. (3 marks) Which algorithm is this? (figure not reproduced) [Serial Column-based Algorithm / Column Algorithm / Bubble Sort Algorithm / None]. Ans: Serial column-based algorithm.
59. (3 marks) Algorithms based on the task graph model include: [All / Matrix Factorization / Parallel Quicksort / Quicksort]. Ans: All.
60. (1 mark) Which model permits simultaneous communication on all the channels connected to a node? [All-port communication / One-port communication / Dual-port communication / Quad-port communication]. Ans: All-port communication.
61. (1 mark) A process sends the same m-word message to every other process, but different processes may broadcast different messages. This is called? [All-to-All Broadcast / One-to-All Broadcast / All-to-All Reduction / None]. Ans: All-to-All Broadcast.
62. (1 mark) The matrix is transposed using which operation? [All-to-All personalized communication / One-to-All personalized / All-to-One personalized / One-to-One personalized]. Ans: All-to-All personalized communication.
63. (1 mark) Each node in a two-dimensional wraparound mesh has how many ports? [Four / Two / Three / One]. Ans: Four.
64. (1 mark) Circular shift is a member of a broader class of global communication operations known as? [Permutation / Combination / Both a and b / None]. Ans: Permutation.
65. (1 mark) We define _______ as the operation in which node i sends a data packet to node (i + q) mod p in a p-node ensemble (0 < q < p). [Circular q-shift / Linear shift / Circular shift / Linear q-shift]. Ans: Circular q-shift.
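The circular q-shift of question 65 is easy to express as one simultaneous send/receive per node, so that no pair of nodes blocks waiting for the other; a minimal sketch assuming MPI (the wrapper name is illustrative):

```c
#include <mpi.h>

/* Node i sends its m-int block to (i + q) mod p and receives from (i - q + p) mod p. */
void circular_q_shift(const int *sendbuf, int *recvbuf, int m, int q, MPI_Comm comm) {
    int rank, p;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &p);
    int dest = (rank + q) % p;
    int src  = (rank - q + p) % p;
    MPI_Sendrecv(sendbuf, m, MPI_INT, dest, 0,
                 recvbuf, m, MPI_INT, src,  0,
                 comm, MPI_STATUS_IGNORE);
}
```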
66. (1 mark) Parallel algorithms often require a single process to send identical data to all other processes or to a subset of them. This operation is known as? [One-to-All Broadcast / One-to-One Broadcast / All-to-One Broadcast / None]. Ans: One-to-All Broadcast.
67. (1 mark) In which communication does each node send a distinct message of size m to every other node? [All-to-All personalized communication / One-to-One personalized / All-to-One personalized / One-to-All personalized]. Ans: All-to-All personalized communication.
68. (1 mark) The all-to-all personalized communication operation is not used in which of these parallel algorithms? [Quick Sort / Matrix Transpose / Fourier Transformation / Database Join]. Ans: Quick Sort.
69. (1 mark) The dual of one-to-all broadcast is? [All-to-One Reduction / All-to-One Broadcast / One-to-Many Reduction / All-to-All Broadcast]. Ans: All-to-One Reduction.
70. (1 mark) Reduction on a linear array can be performed by _______ the direction and the sequence of communication. [Reversing / Forwarding / Escaping / Widening]. Ans: Reversing.
71. (2 marks) This equation is used to solve which topology's operations in all-to-all communication? (equation not reproduced) [Hypercube / Mesh / Ring / Linear Array]. Ans: not recorded.
72. (2 marks) The communication pattern of all-to-all broadcast can be used to perform ________? [Second Variation of Reduction / Third Variation of Reduction / First Variation of Reduction / Fifth Variation of Reduction]. Ans: not recorded.
73. (2 marks) A single node sends a unique message of size m to every other node. This operation is known as ______? [Scatter / Reduction / Gather / Concatenate]. Ans: Scatter.
74. (2 marks) The algorithm represents which broadcast? (figure not reproduced) [All-to-All Broadcast / All-to-All Broadcast / All-to-All Reduction / One-to-One Reduction]. Ans: not recorded.
75. (2 marks) The message can be broadcast in how many steps? [log p / log p^2 / One / sin p]. Ans: log p.
76. (2 marks) This equation is used to solve which operation? (equation not reproduced) [All-to-All personalized / One-to-All personalized / One-to-One personalized / All-to-One personalized communication]. Ans: not recorded.
77. (2 marks) There are how many computations for n^2 words of data transferred among the nodes? [n^3 / tan n / e^n / log n]. Ans: n^3.
78. (2 marks) The scatter operation is also known as? [One-to-All personalized communication / One-to-One personalized / All-to-One personalized / All-to-All personalized]. Ans: One-to-All personalized communication.
79. (2 marks) A hypercube with 2^d nodes can be regarded as a d-dimensional mesh with ____ nodes in each dimension. [Two / One / Three / Four]. Ans: Two.
80. (2 marks) One-to-all broadcast and all-to-one reduction are used in several important parallel algorithms, including: [All / Gaussian Elimination / Shortest-path algorithms / Matrix-vector multiplication]. Ans: All.
81. (2 marks) Each node of the distributed-memory parallel computer is a ______ shared-memory multiprocessor. [NUMA / UMA / CCMA / None]. Ans: NUMA.
82. (2 marks) To perform a q-shift, we expand q as a sum of distinct powers of ______. [2 / 3 / e / log p]. Ans: 2.
83. (3 marks) In which implementation of circular shift is the entire row of the data set shifted? [Mesh / Hypercube / Ring / Linear]. Ans: Mesh.
84. (3 marks) On a p-node hypercube with all-port communication, the coefficients of tw in the expressions for the communication times of one-to-all and all-to-all broadcast and personalized communication are all smaller than their single-port counterparts by a factor of? [log p / cos p / sin p / e^p]. Ans: log p.
85. (3 marks) The equation represents which analysis in all-to-all broadcast? (equation not reproduced) [Cost Analysis / Data Model Analysis / Space-Time Analysis / Time Analysis]. Ans: Cost Analysis.
86. (3 marks) On a p-node hypercube, the size of each message exchanged in the i-th of the log p steps is? (formula options not preserved). Ans: A.
87. (3 marks) Which broadcast is applied on this 3-D hypercube? (figure not reproduced) [One-to-All Broadcast / One-to-One Broadcast / All-to-One Broadcast / All-to-One Reduction]. Ans: One-to-All Broadcast.
88. (3 marks) The equation represents which analysis in one-to-all broadcast? (equation not reproduced) [Cost Analysis / Time Analysis / Data Analysis / Space Analysis]. Ans: not recorded.
89. (3 marks) The time for circular shift on a hypercube can be improved by almost a factor of ______ for large messages. [log p / cos p / e^p / sin p]. Ans: log p.
90. (1 mark) The execution time of a parallel algorithm doesn't depend upon? [Processor / Input Size / Relative computation speed / Communication speed]. Ans: not recorded (execution time depends on input size, the number of processing elements, and their relative computation and communication speeds).
91. (1 mark) Processing elements in a parallel system may become idle due to many reasons, such as: [Both / Load Imbalance / Synchronization / The processing element does not become idle]. Ans: Both (synchronization and load imbalance).
92. (1 mark) If the scaled-speedup curve is close to linear with respect to the number of processing elements, the parallel system is considered? [Scalable / Iso-scalable / Non-Scalable / Scale-Efficient]. Ans: Scalable.
93. (1 mark) Which system is the combination of an algorithm and the parallel architecture on which it is implemented? [Parallel System / Series System / Data System / Architecture System]. Ans: Parallel System.
94. (1 mark) What is defined as the speedup obtained when the problem size is increased linearly with the number of processing elements? [Scaled Speedup / Unscalable Speedup / Superlinear Speedup / Isoefficiency Speedup]. Ans: Scaled Speedup.
95. (1 mark) The maximum number of tasks that can be executed simultaneously at any time in a parallel algorithm is called its degree of __________. [Concurrency / Parallelism / Linearity / Execution]. Ans: Concurrency.
96. (1 mark) The isoefficiency due to concurrency in 2-D partitioning is: [O(p) / O(n log p) / O(1) / O(n^2)]. Ans: O(p).
97. (2 marks) The total time collectively spent by all the processing elements, over and above that required by the fastest known sequential algorithm for solving the same problem on a single processing element, is known as? [Total Overhead / Parallel Runtime / Overhead / Serial Runtime]. Ans: Total Overhead.
98. (2 marks) Parallel computations involving matrices and vectors readily lend themselves to data ______________. [Decomposition / Composition / Linearity / Parallelism]. Ans: Decomposition.
99. (2 marks) Parallel 1-D partitioning with pipelining is a ___________ algorithm. [Synchronous / Asynchronous / Optimal / Cost-optimal]. Ans: not recorded.
100. (2 marks) The serial complexity of matrix-matrix multiplication is: [O(n^3) / O(n^2) / O(n) / O(n log n)]. Ans: O(n^3).
101. (2 marks) What is the problem size for n x n matrix multiplication? [Θ(n^3) / Θ(n log n) / Θ(n^2) / Θ(1)]. Ans: Θ(n^3).
102. (2 marks) The given equation represents which function? (equation not reproduced) [Overhead Function / Parallel Model / Series Overtime / Parallel Overtime]. Ans: not recorded.
103. (2 marks) The efficiency of a parallel program can be written as: (formula options not preserved; elsewhere in this document, E = Ts / pTp). Ans: A.
104. (2 marks) The total number of steps in the entire pipelined procedure is _______. [Θ(n) / Θ(n^2) / Θ(n^3) / Θ(1)]. Ans: Θ(n).
105. (2 marks) In Cannon's algorithm, the memory used is? [θ(n^2) / θ(n) / θ(n^3) / θ(n log n)]. Ans: θ(n^2).
106. (2 marks) Consider the problem of multiplying two n x n dense, square matrices A and B to yield the product matrix C = : [A x B; A / B; A + B; A - B]. Ans: A x B.
107. (2 marks) The serial runtime of multiplying an n x n matrix with a vector is? (formula options not preserved). Ans: A (cf. question 119: n^2 operations).
108. (2 marks) ________ is a measure of the fraction of time for which a processing element is usefully employed. [Efficiency / Overtime Function / Linearity / Superlinearity]. Ans: Efficiency.
109. (2 marks) When the work performed by a serial algorithm is greater than that of its parallel formulation, or hardware features put the serial implementation at a disadvantage, this phenomenon is known as? [Superlinear Speedup / Linear Speedup / Performance Metrics / Super Linearity]. Ans: Superlinear Speedup.
110. (3 marks) The all-to-all broadcast and the computation of y[i] both take time? [Θ(n) / Θ(n log n) / Θ(n^2) / Θ(n^3)]. Ans: Θ(n).
111. (3 marks) If virtual processing elements are mapped appropriately onto physical processing elements, the overall communication time does not grow by more than a factor of: [n/p; p/n; n+p; n·p]. Ans: n/p.
112. (3 marks) Parallel execution time can be expressed as a function of problem size, overhead function, and the number of processing elements. The formed equation is: (formula options not preserved). Ans: A.
113. (3 marks) In 2-D partitioning, the first alignment takes time = ? [Ts - twn/√p; Ts·twn/√p; Ts/(twn·√p); Ts + twn/√p]. Ans: Ts + twn/√p.
114. (3 marks) Using fewer than the maximum possible number of processing elements to execute a parallel algorithm is called ________. [Scaling Down / Scaling Up / Scaling / Stimulation]. Ans: Scaling Down.
115. (3 marks) Which of the following is a drawback of matrix-matrix multiplication? [Memory Optimal / Efficient / Time-bound / Complex]. Ans: not recorded.
116. (3 marks) Consider the problem of sorting 1024 numbers (n = 1024, log n = 10) on 32 processing elements. The expected speedup is: [p/log n; p·log n; p+log n; n·log p]. Ans: p/log n.
117. (3 marks) Consider the problem of adding n numbers on p processing elements such that p < n and both n and p are powers of 2. The overall parallel execution time is: [Θ((n/p) log p); Θ((n·p) log p); Θ((p/n) log p); Θ(n log p)]. Ans: Θ((n/p) log p).
118. (3 marks) The DNS algorithm has ____ runtime. [Ω(n) / Ω(n^2) / Ω(n^3) / Ω(log n)]. Ans: Ω(n).
119. (3 marks) The serial algorithm requires ______ multiplications and additions in matrix-vector multiplication. [n^2 / n^3 / log n / n log n]. Ans: n^2.
120. (1 mark) The time required to merge two sorted blocks of n/p elements is _________. [θ(n/p); θ(n); θ(p/n); θ(n log p)]. Ans: θ(n/p).
121. (1 mark) In parallel DFS, the stack is split into two equal pieces such that the size of the search space represented by each stack is the same. Such a split is called? [Half-Split / Half-Split / Parallel-Split / None]. Ans: Half-Split.
122. (1 mark) To avoid sending very small amounts of work, nodes beyond a specified stack depth are not given away. This depth is called the _________ depth. [Cut-Off / Breakdown / Full / Series]. Ans: Cut-Off.
123. (1 mark) In sequential sorting algorithms, the input and the sorted sequences are stored in which memory? [Process Memory / Main Memory / Secondary Memory / External Memory]. Ans: Process Memory (the process's memory).
124. (1 mark) Each process sends its block to the other process; each process then merges the two sorted blocks and retains only the appropriate half of the merged block. We refer to this operation as? [Compare-Split / Split / Compare / Exchange]. Ans: Compare-Split.
125. (1 mark) Each process compares the received element with its own and retains the appropriate element. We refer to this operation as _______. [Compare-Exchange / Exchange / Process-Exchange / All]. Ans: Compare-Exchange.
126. (1 mark) Which algorithm maintains the unexpanded nodes in the search graph, ordered according to their l-value? [Parallel BFS / Parallel DFS / Both a and b / None]. Ans: Parallel BFS.
127. (1 mark) The critical issue in parallel depth-first search algorithms is the distribution of the search space among the ____________. [Processors / Space / Memory / Blocks]. Ans: Processors.
128. (2 marks) Enumeration sort uses how many processes to sort n elements? [n^2 / log n / n^3 / n]. Ans: n^2.
129. (2 marks) Which sequence is a sequence of elements <a0, a1, ..., an-1> with the property that either (1) there exists an index i, 0 <= i <= n - 1, such that <a0, ..., ai> is monotonically increasing and <ai+1, ..., an-1> is monotonically decreasing, or (2) there exists a cyclic shift of indices so that (1) is satisfied? [Bitonic Sequence / Acyclic Sequence / Asymptotic Sequence / Cyclic Sequence]. Ans: Bitonic Sequence.
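The bitonic property in question 129 is what the bitonic merge exploits: one sweep of compare-exchanges at distance n/2 splits a bitonic sequence into two bitonic halves, with every element of the low half no larger than every element of the high half. A minimal recursive C sketch (n must be a power of two; dir = 1 sorts ascending; names illustrative):

```c
/* Turn the bitonic sequence a[lo .. lo+n-1] into a sorted one. */
void bitonic_merge(int *a, int lo, int n, int dir) {
    if (n <= 1) return;
    int m = n / 2;
    for (int i = lo; i < lo + m; i++)
        if (dir == (a[i] > a[i + m])) {          /* compare-exchange at distance m */
            int t = a[i]; a[i] = a[i + m]; a[i + m] = t;
        }
    bitonic_merge(a, lo, m, dir);
    bitonic_merge(a, lo + m, m, dir);
}
```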
130. (2 marks) To make a substantial improvement over odd-even transposition sort, we need an algorithm that moves elements long distances. Which of these is such a serial sorting algorithm? [Shell Sort / Linear Sort / Quick Sort / Bubble Sort]. Ans: Shell Sort.
131. (2 marks) Quicksort is a _________ algorithm. [Divide and Conquer / Greedy Approach / Both a and b / None]. Ans: Divide and Conquer.
132. (2 marks) The _______ transposition algorithm sorts n elements in n phases (n is even), each of which requires n/2 compare-exchange operations. [Odd-Even / Odd / Even / None]. Ans: Odd-Even.
133. (2 marks) The average time complexity of bucket sort is? [O(n + k) / O(n log(n + k)) / O(n^3) / θ(n^2)]. Ans: O(n + k).
134. (2 marks) A popular serial algorithm for sorting an array of n elements whose values are uniformly distributed over an interval [a, b] is? [Bucket Sort / Quick Sort / Linear Sort / Bubble Sort]. Ans: Bucket Sort.
135. (2 marks) The best-case time complexity of bubble sort is: [O(n) / O(n^3) / O(n log n) / O(n^2)]. Ans: O(n).
136. (2 marks) When more than one process tries to write to the same memory location, only one arbitrarily chosen process is allowed to write and the remaining writes are ignored; in quicksort this model is called _________. [CRCW PRAM / PRAM / Partitioning / CRCW]. Ans: CRCW PRAM.
137. (2 marks) The average time complexity of quicksort is: [O(n log n) / O(n) / O(n^3) / θ(n^2)]. Ans: O(n log n).
138. (2 marks) The isoefficiency function of Global Round Robin (GRR) is: [O(p^2 log p) / O(p log p) / O(log p) / O(p^2)]. Ans: O(p^2 log p).
139. (2 marks) A _____ is a device with two inputs x and y and two outputs x' and y' in a sorting network. [Comparator / Router / Separator / Switch]. Ans: Comparator.
140. (2 marks) If T is a DFS tree in G, then the parallel implementation of the algorithm runs in ______________ time complexity. [O(t) / O(t log n) / O(log t) / O(1)]. Ans: O(t).
141. (2 marks) In the quest for fast sorting methods, a number of networks have been designed that sort n elements in time significantly smaller than ___. [θ(n log n) / θ(n) / θ(1) / θ(n^2)]. Ans: θ(n log n).
142. (2 marks) The average value of the search overhead factor in parallel DFS is less than ______. [One / Two / Three / Four]. Ans: One.
143. (3 marks) The parallel runtime of bitonic sort on a ring architecture is: [θ(n) / θ(n log n) / θ(n^2) / θ(n^3)]. Ans: θ(n).
144. (3 marks) The sequential complexity of the odd-even transposition algorithm is: [θ(n^2) / θ(n log n) / θ(n^3) / θ(n)]. Ans: θ(n^2).
145. (3 marks) The algorithm represents which bubble sort? (figure not reproduced) [Sequential Bubble Sort / Circular Bubble Sort / Simple Bubble Sort / Linear Bubble Sort]. Ans: not recorded.
146. (3 marks) Enumeration sort uses how much time to sort n elements? [θ(1) / θ(n log n) / θ(n^2) / θ(n)]. Ans: not recorded.
147. (3 marks) The ______ algorithm relies on the binary representation of the elements to be sorted. [Radix Sort / Bubble Sort / Quick Sort / Bucket Sort]. Ans: Radix Sort.
148. (3 marks) The parallel runtime of bitonic sort on a mesh architecture is: [θ(n/log n); θ(n); θ(n^2); θ(n^3)]. Ans: not recorded.
149. (1 mark) The number of threads in a thread block is limited by the architecture to a total of how many threads per block? [512 / 502 / 510 / 412]. Ans: 512.
150. (1 mark) The CUDA architecture is mainly provided by which company? [NVIDIA / Intel / Apple / IBM]. Ans: NVIDIA.
151. (1 mark) In the CUDA architecture, what are subprograms called? [Kernels / Grids / Elements / Blocks]. Ans: Kernels.
152. (1 mark) What is the full form of CUDA? [Compute Unified Device Architecture / Computer Unified Device Architecture / Common USB Device Architecture / Common Unified Disk Architecture]. Ans: Compute Unified Device Architecture.
153. (2 marks) Which of these is not an application of the CUDA architecture? [Thermodynamics / Fluid Dynamics / Neural Networks / VLSI Simulation]. Ans: Thermodynamics.
154. (2 marks) CUDA programming is especially well suited to address problems that can be expressed as __________ computations. [Data-parallel / Task-parallel / Both a and b / None]. Ans: Data-parallel.
155. (2 marks) CUDA C/C++ uses which keyword in programming? [global / kernel / Cuda_void / nvcc]. Ans: global (the __global__ qualifier).
156. (2 marks) CUDA programs are saved with the _____ extension. [.cd / .cx / .cc / .cu]. Ans: .cu.
157. (2 marks) The Kepler K20X chip block contains ____ streaming multiprocessors (SMs). [15 / 8 / 16 / 7]. Ans: 15.
158. (2 marks) The Kepler K20X architecture increases the register file size to: [64K / 32K / 128K / 256K]. Ans: 64K.
159. (2 marks) The register file in a GPU is of what size? [2 MB / 1 MB / 3 MB / 1024 B]. Ans: 2 MB.
160. (2 marks) NVIDIA's GPU computing platform is not enabled on which of the following product families? [AMD / Tegra / Quadro / Tesla]. Ans: AMD.
161. (2 marks) The Tesla K40 has a compute capability of: [3.5 / 3.2 / 3.4 / 3.1]. Ans: 3.5.
162. (2 marks) The SIMD unit creates, manages, schedules and executes _____ threads simultaneously to create a warp. [32 / 16 / 24 / 8]. Ans: 32.
163. (2 marks) Which hardware is used by the host interface to speed up the transfer of bulk data to and from the graphics pipeline? [Direct Memory Access / Memory Hardware / Switch / Hub]. Ans: Direct Memory Access (DMA).
164. (2 marks) A ____ is a collection of thread blocks of the same thread dimensionality which all execute the same kernel. [Grid / Core / Element / Blocks]. Ans: Grid.
165. (2 marks) Active warps can be classified into how many types? [3 / 2 / 4 / 5]. Ans: 3.
166. (2 marks) All threads in a grid share the same _________ space. [Global memory / Local memory / Synchronized memory / All]. Ans: Global memory.
167. (2 marks) CUDA was introduced in which year? [2007 / 2006 / 2008 / 2010]. Ans: 2007.
168. (3 marks) Unlike a C function call, all CUDA kernel launches are: [Asynchronous / Synchronous / Both a and b / None]. Ans: Asynchronous.
169. (3 marks) A warp consists of ____ consecutive threads, and all threads in a warp are executed in Single Instruction Multiple Thread (SIMT) fashion. [32 / 16 / 64 / 128]. Ans: 32.
170. (3 marks) There are how many streaming multiprocessors in the CUDA architecture? [16 / 8 / 12 / 4]. Ans: 16.
171. (3 marks) In CUDA programming, if the CPU is the host, then the device will be: [GPU / Compiler / HDD / GPGPU]. Ans: GPU.
172. (3 marks) Both grids and blocks use the ______ type with three unsigned integer fields. [dim3 / dim2 / dim1 / dim4]. Ans: dim3.
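A short sketch combining the dim3 type from question 172 with the asynchronous launch behaviour from question 168; the vector size and block size are illustrative assumptions:

```cuda
__global__ void scale(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  /* global thread index */
    if (i < n) x[i] *= 2.0f;
}

int main(void) {
    int n = 1 << 20;
    float *d_x;
    cudaMalloc((void **)&d_x, n * sizeof(float));
    dim3 block(256);                        /* dim3: x, y, z fields; y and z default to 1 */
    dim3 grid((n + block.x - 1) / block.x); /* enough blocks to cover all n elements */
    scale<<<grid, block>>>(d_x, n);         /* returns immediately: launch is asynchronous */
    cudaDeviceSynchronize();                /* host blocks here until the kernel finishes */
    cudaFree(d_x);
    return 0;
}
```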
173. (3 marks) The Tesla P100 GPU, based on the Pascal architecture, has 56 streaming multiprocessors (SMs), each capable of supporting up to ____ active threads. [2048 / 512 / 1024 / 256]. Ans: 2048.
174. (3 marks) The maximum size at each level of the thread hierarchy is _____ dependent. [Device / Host / Compiler / Memory]. Ans: Device.
175. (3 marks) The Intel i7 has a memory bus of width: [19 B / 180 B / 152 B / 102 B]. Ans: not recorded.
176. (3 marks) The __________ is the heart of the GPU architecture. [Streaming Multiprocessor / Multiprocessor / CUDA / Compiler]. Ans: Streaming Multiprocessor (SM).
177. (3 marks) A kernel is defined using the _____ declaration specification. [__global__ / __host__ / __device__ / void]. Ans: __global__.
178. (3 marks) The function printThreadInfo() is not used to print out which of the following information about each thread? [Block Index / Matrix Coordinates / Control Index / Memory Allocations]. Ans: Memory Allocations.
Which of the following is an alternative option for latency hiding?
A. Increase CPU frequency
B. Multithreading
C. Increase Bandwidth
D. Increase Memory
ANSWER: B
If there are 6 nodes in a ring topology how many message passing cycles
will be required to complete broadcast process in one to all?
A. 1
B. 6
C. 3
D. 4
ANSWER: C
If there is 4 X 4 Mesh topology network then how many ring operation will
perform to complete one to all broadcast?
A. 4
B. 8
C. 16
D. 32
ANSWER: B
Consider all to all broadcast in ring topology with 8 nodes. How many
messages will be present with each node after 3rd step/cycle of
communication?
A. 3
B. 4
C. 6
D. 7
ANSWER: B
Consider a hypercube topology with 8 nodes; how many message-passing cycles will be
required for an all-to-all broadcast operation?
If there is a 4 X 4 mesh topology, ______ message-passing cycles will be required to
complete an all-to-all reduction.
A. 4
B. 6
C. 8
D. 16
ANSWER: C
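The counts asked for in the ring and hypercube questions above follow from the standard step counts for these topologies:

```latex
\begin{aligned}
&\text{Ring, one-to-all (messages sent in both directions): } \lceil p/2 \rceil \text{ steps; } p = 6 \Rightarrow 3.\\
&\text{Ring, all-to-all: } p - 1 \text{ steps; after } k \text{ steps each node holds } k + 1 \text{ messages; } p = 8,\ k = 3 \Rightarrow 4.\\
&\text{Hypercube, all-to-all broadcast: } \log_2 p \text{ steps; } p = 8 \Rightarrow 3.
\end{aligned}
```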
Which of the following issue(s) is/are true about sorting techniques with parallel
computing?
A. Large sequence is the issue
B. Where to store output sequence is the issue
C. Small sequence is the issue
D. None of the above
ANSWER: B
Suppose there are 16 elements in a series then how many phases will be
required to sort the series using parallel odd-even bubble sort?
A. 8
B. 4
C. 5
D. 15
ANSWER: D
A pipeline implements:
A. fetch instruction
B. decode instruction
C. fetch operand
D. all of above
ANSWER: D
How does the number of transistors per chip increase according to Moore's law?
A. Quadratically
B. Linearly
C. Cubicly
D. Exponentially
ANSWER: D
The Owner Computes Rule generally states that the process assigned a
particular data item is responsible for?
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
ANSWER: A
A simple application of exploratory decomposition is ___?
A. The solution to the 15-puzzle
B. The solution to the 20-puzzle
C. The solution to any puzzle
D. None of Above
ANSWER: A
A hypercube has?
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
ANSWER: A
A pipeline is like?
A. Overlaps various stages of instruction execution to achieve
performance.
B. House pipeline
C. Both a and b
D. A gas line
ANSWER: A
The Owner Computes Rule generally states that the process assigned a
particular data item is responsible for?
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
ANSWER: A
A hypercube has?
A. 2^d nodes
B. 3d nodes
C. 2n Nodes
D. N Nodes
ANSWER: A
Cloud computing offers a broader concept of which of the following?
A. Parallel computing
B. Centralized computing
C. Utility computing
D. Decentralized computing
ANSWER: C
Abbreviation of HPC?
A. High-peak computing
B. High-peripheral computing
C. High-performance computing
D. Highly-parallel computing
ANSWER: C
Answer : D
A. Execution Time
B. Total Parallel Overhead
C. Speedup
D. Efficiency
E. Cost
F. All above
Answer : F
The efficiency of a parallel program can be written as: E = Ts / pTp. True or False?
A. True
B. False
Answer : A
Overhead function or total overhead of a parallel system as the total time collectively
spent by all the processing elements over and above that required by the fastest known
sequential algorithm for solving the same problem on a single processing element.
True or False?
A. True
B. False
Answer : A
What is Speedup?
A. A measure that captures the relative benefit of solving a problem in parallel. It is defined as the
ratio of the time taken to solve a problem on a single processing element to the time required to
solve the same problem on a parallel computer with p identical processing elements.
B. A measure of the fraction of time for which a processing element is usefully
employed.
C. None of the above
Answer : A
In an ideal parallel system, speedup is equal to p and efficiency is equal to one. True or
False?
A. True
B. False
Answer : A
A parallel system is said to be ________________ if the cost of solving a problem on a
parallel computer has the same asymptotic growth (in Θ terms) as a function of the input
size as the fastest-known sequential algorithm on a single processing element.
A. Cost optimal
B. Non Cost optimal
Answer : A
Using fewer than the maximum possible number of processing elements to execute a
parallel algorithm is called ______________ a parallel system in terms of the number of
processing elements.
A. Scaling down
B. Scaling up
Answer : A
The __________________ function determines the ease with which a parallel system can
maintain a constant efficiency and hence achieve speedups increasing in proportion to the
number of processing elements.
A. Isoefficiency
B. Efficiency
C. Scalability
D. Total overhead
Answer : A
Minimum execution time for adding n numbers is Tp = n/p + 2 log p. True or False?
A. True
B. False
Answer : A
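Note (sketch of the derivation, assuming unit cost per addition and per communication): each processing element first adds its n/p local numbers, and the p partial sums are then combined by a reduction tree of log p steps, each step costing one communication plus one addition, giving Tp = n/p + 2 log p.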
A. Shell sort
B. Quick sort
C. Odd-Even transposition
D. Option A & C
Answer : D
Formally, given a weighted graph G(V, E, w), the all-pairs shortest paths problem is to
find the shortest paths between all pairs of vertices. True or False?
A. True
B. False
Answer : A
Which of the following describe parallel formulations of Dijkstra's single-source algorithm for the all-pairs shortest paths problem?
A. One approach partitions the vertices among different processes and has each process
compute the single-source shortest paths for all vertices assigned to it. We refer to
this approach as the source-partitioned formulation.
B. Another approach assigns each vertex to a set of processes and uses the parallel
formulation of the single-source algorithm to solve the problem on each set of
processes. We refer to this approach as the source-parallel formulation.
C. Both are true
D. None of these is true
Answer : C
Search algorithms can be used to solve discrete optimization problems. True or False ?
A. True
B. False
Answer : A
List the communication strategies for parallel BFS.
D. All of above
Answer : D
In a compare-split operation
A. Each process sends its block of size n/p to the other process
B. Each process merges the received block with its own block and retains only the
appropriate half of the merged block
C. Both A & B
Answer : C
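To make the compare-split operation concrete, here is a minimal C sketch of the local half of the step (the function name, the fixed-size buffer, and the k <= 64 limit are illustrative assumptions, not taken from any referenced source): each process already holds a sorted block of k elements and has received its partner's sorted block.

#include <string.h>

/* Illustrative sketch: one side of a compare-split step. `mine` and
   `theirs` are sorted blocks of length k. The lower-ranked process
   keeps the k smallest of the 2k merged elements (keep_low != 0);
   the higher-ranked process keeps the k largest. */
void compare_split(int *mine, const int *theirs, int k, int keep_low)
{
    enum { KMAX = 64 };            /* this sketch assumes k <= KMAX */
    int merged[2 * KMAX];
    int i = 0, j = 0, m = 0;

    /* Standard two-way merge of the two sorted blocks. */
    while (i < k && j < k)
        merged[m++] = (mine[i] <= theirs[j]) ? mine[i++] : theirs[j++];
    while (i < k) merged[m++] = mine[i++];
    while (j < k) merged[m++] = theirs[j++];

    /* Retain the appropriate half of the merged sequence. */
    memcpy(mine, keep_low ? merged : merged + k, (size_t)k * sizeof(int));
}

In a message-passing realization the block exchange itself would be a pairwise send/receive; only the local merge-and-keep step is shown here.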
Which of the following statements are true with regard to compute capability in CUDA?
A. Code compiled for hardware of one compute capability will not need to be re-
compiled to run on hardware of another
B. Different compute capabilities may imply a different amount of local memory per
thread
Answer : B
True or False: The threads in a thread block are distributed across SM units so that each
thread is executed by one SM unit.
A. True
B. False
Answer : B
True or false: Functions annotated with the __global__ qualifier may be executed on the
host or the device
A. True
B. False
Answer : B
Which of the following correctly describes a GPU kernel
B. All thread blocks involved in the same computation use the same kernel
Answer : B
What strategy does the GPU employ if the threads within a warp diverge in their execution?
A. Threads are moved to different warps so that divergence does not occur within a
single warp
B. Threads are allowed to diverge
C. All possible execution paths are run by all threads in a warp serially so that thread
instructions do not diverge
Answer : C
Which of the following does not result in uncoalesced (i.e. serialized) memory access on the
K20 GPUs installed on Stampede
A. Aligned, but non-sequential access
B. Misaligned data access
C. Sparse memory access
Answer : A
Which of the following correctly describes the relationship between Warps, thread blocks,
and CUDA cores?
A. A warp is divided into a number of thread blocks, and each thread block executes on
a single CUDA core
B. A thread block may be divided into a number of warps, and each warp may execute
on a single CUDA core
C. A thread block is assigned to a warp, and each thread in the warp is executed on a
separate CUDA core
Answer : B
The CUDA software stack is composed of which of the following layers?
A. CUDA Libraries
B. CUDA Runtime
C. CUDA Driver
D. All Above
Answer : D
CUDA code can be written in which of the following languages?
A. C
B. C++
C. Fortran
D. All Above
Answer : D
Threads support Shared memory and Synchronization
A. True
B. False
Answer : A
B. Medical Imaging
C. Computational Science
E. All Above
Answer : E
---------------------------------------------------------------------------------------------------------------------
SET 1 (120 MCQs)
---------------------------------------------------------------------------------------------------------------------
1. Conventional architectures coarsely comprise of a_
A. A processor
B. Memory system
C. Data path
D. All of Above
3. A pipeline is like_
A Overlaps various stages of instruction execution to achieve performance.
B House pipeline
C Both a and b
D A gas line
6. Memory system performance is largely captured by_
A Latency
B Bandwidth
C Both a and b
D none of above
7. The fraction of data references satisfied by the cache is called_
A Cache hit ratio
B Cache fit ratio
C Cache best ratio
D none of above
8. A single control unit that dispatches the same Instruction to various processors is__
A SIMD
B SPMD
C MIMD
D None of above
12. The number of tasks into which a problem is decomposed determines its_
A. Granularity
B. Priority
C. Modernity
D. None of above
13. The length of the longest path in a task dependency graph is called_
A. the critical path length
B. the critical data length
C. the critical bit length
D. None of above
14. The graph of tasks (nodes) and their interactions/data exchange (edges)_
A. Is referred to as a task interaction graph
B. Is referred to as a task Communication graph
C. Is referred to as a task interface graph
D. None of Above
15. Mappings are determined by_
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
17. The Owner Computes Rule generally states that the process assigned a particular data
item is responsible for_
21. Group communication operations are built using point-to-point messaging primitives
A. True
B. False
22. Communicating a message of size m over an uncongested network takes time ts + m*tw
A. True
B. False
23. The dual of one-to-all broadcast is_
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
24. A hypercube has_
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
25. A binary tree in which processors are (logically) at the leaves and internal nodes are
routing nodes.
A. True
B. False
26. In All-to-All Broadcast each processor is the source as well as destination.
A. True
B. False
29. The gather operation is exactly the inverse of the_
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
30. In All-to-All Personalized Communication Each node has a distinct message of size m
for every other node
A. True
B. False
31. Computer system of a parallel computer is capable of_
A. Decentralized computing
B. Parallel computing
C. Centralized computing
D. Distributed computing
E. All of these
F. None of these
32. Writing parallel programs is referred to as_
A. Parallel computation
B. Parallel processes
C. Parallel development
D. Parallel programming
E. All of these
F. None of these
A. Multithreading
B. Cyber cycle
C. Internet of things
D. Cyber-physical system
E. All of these
F. None of these
A. HPC
B. HTC
C. HRC
D. Both A and B
E. All of these
F. None of these
A. Adaptivity
B. Transparency
C. Dependency
D. Secretive
E. All of these
F. None of these
37. The architecture in which no special machines manage the network and resources are shared is known as
A. Peer-to-Peer
B. Space based
C. Tightly coupled
D. Loosely coupled
E. All of these
F. None of these
A. Many Server machines
B. 1 Server machine
C. 1 Client machine
D. Many Client machines
E. All of these
F. None of these
A. Business
B. Engineering
C. Science
D. Media mass
E. All of these
F. None of these
41. Virtualization that creates one single address space architecture is called
A. Loosely coupled
B. Peer-to-Peer
C. Space-based
D. Tightly coupled
E. All of these
F. None of these
A. Centralized computing
B. Decentralized computing
C. Parallel computing
D. Both A and B
E. All of these
F. None of these
43. Job throughput, data access, and storage are elements of __________.
A. Flexibility
B. Adaptation
C. Efficiency
D. Dependability
E. All of these
F. None of these
44. The ability to support billions of job requests over massive data sets is known as
A. Efficiency
B. Dependability
C. Adaptation
D. Flexibility
E. All of these
F. None of these
45. Which of the following offers a broader concept than cloud computing?
A. Parallel computing
B. Centralized computing
C. Utility computing
D. Decentralized computing
E. All of these
F. None of these
46. Transparency that allows resources and clients to move within a system is called
A. Mobility transparency
B. Concurrency transparency
C. Performance transparency
D. Replication transparency
E. All of these
F. None of these
A. Distributed process
B. Distributed program
C. Distributed application
D. Distributed computing
E. All of these
F. None of these
A. Grid computing
B. Centralized computing
C. Parallel computing
D. Distributed computing
E. All of these
F. None of these
A. Data
B. Cloud
C. Scalable
D. Business
E. All of these
F. None of these
A. 5C
B. 2C
C. 3C
D. 4C
E. All of these
F. None of these
The abbreviation HPC stands for
A. High-peak computing
B. High-peripheral computing
C. High-performance computing
D. Highly-parallel computing
E. All of these
F. None of these
A. Norming grids
B. Data grids
C. Computational grids
D. Both A and B
E. All of these
F. None of these
A. Management
B. Media mass
C. Business
D. Science
E. All of these
F. None of these
A. 6
B. 3
C. 4
D. 5
E. All of these
F. None of these
A. Adaptation
B. Efficiency
C. Dependability
D. Flexibility
E. All of these
F. None of these
56. Providing Quality of Service (QoS) assurance, even under failure conditions, is the
responsibility of
A. Dependability
B. Adaptation
C. Flexibility
D. Efficiency
E. All of these
F. None of these
63. The efficiency of a parallel program can be written as: E = Ts / pTp. True or False?
A. True
B. False
64. The overhead function or total overhead of a parallel system is defined as the total time
collectively spent by all the processing elements over and above that required by the fastest
known sequential algorithm for solving the same problem on a single processing element. True
or False?
A. True
B. False
66. In an ideal parallel system, speedup is equal to p and efficiency is equal to one. True or
False?
A. True
B. False
68. Using fewer than the maximum possible number of processing elements to execute a
parallel algorithm is called ______________ a parallel system in terms of the number of
processing elements.
A. Scaling down
B. Scaling up
69. The __________________ function determines the ease with which a parallel system can
maintain a constant efficiency and hence achieve speedups increasing in proportion to the
number of processing elements.
A. Isoefficiency
B. Efficiency
C. Scalability
D. Total overhead
70. Minimum execution time for adding n numbers is Tp = n/p + 2 log p. True or False?
A. True
B. False
74. What are the issues in sorting?
A. Where the Input and Output Sequences are Stored
B. How Comparisons are Performed
C. All above
Answer : C
75. The parallel run time of the formulation for Bubble sort is
A. Tp = O((n/p) log(n/p)) + O(n) + O(n)
B. Tp = O((n/p) log(n/p)) + O((n/p) log p) + O(n/p)
C. None of the above
77. What is the overall complexity of the parallel algorithm for quick sort?
A. Tp = O((n/p) log(n/p)) + O((n/p) log p) + O(log^2 p)
B. Tp = O((n/p) log(n/p)) + O((n/p) log p)
C. Tp = O((n/p) log(n/p)) + O(log^2 p)
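Note (a hedged reading of option A, following the usual shared-address-space analysis): the O((n/p) log(n/p)) term is each process's final local sort, the O((n/p) log p) term is data rearrangement across the log p partitioning levels, and the O(log^2 p) term covers pivot selection and prefix-sum steps over those levels.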
78. Formally, given a weighted graph G(V, E, w), the all-pairs shortest paths problem is to
find the shortest paths between all pairs of vertices. True or False?
A. True
B. False
80. Search algorithms can be used to solve discrete optimization problems. True or False ?
A. True
B. False
81. Examples of Discrete optimization problems are:
A. planning and scheduling,
B. The optimal layout of VLSI chips,
C. Robot motion planning,
D. Test-pattern generation for digital circuits, and logistics and control.
E. All of above
87. Bubble sort is difficult to parallelize since the algorithm has no concurrency
A. True
B. False
88. Which of the following statements are true with regard to compute capability in CUDA
A. Code compiled for hardware of one compute capability will not need to be re-compiled to
run on hardware of another
B. Different compute capabilities may imply a different amount of local memory per
thread
C. Compute capability is measured by the number of FLOPS a GPU accelerator can
compute.
87. True or False: The threads in a thread block are distributed across SM units so that each
thread is executed by one SM unit.
A. True
B. False
87. True or false: Functions annotated with the __global__ qualifier may be executed on the
host or the device
A. True
B. False
90.What strategy does the GPU employ if the threads within a warp diverge in their
execution?
A. Threads are moved to different warps so that divergence does not occur within a single
warp
B. Threads are allowed to diverge
C. All possible execution paths are run by all threads in a warp serially so that thread
instructions do not diverge
91. Which of the following does not result in uncoalesced (i.e. serialized) memory access on the
K20 GPUs installed on Stampede
A. Aligned, but non-sequential access
B. Misaligned data access
C. Sparse memory access
92. Which of the following correctly describes the relationship between Warps, thread
blocks, and CUDA cores?
A. A warp is divided into a number of thread blocks, and each thread block executes on a
single CUDA core
B. A thread block may be divided into a number of warps, and each warp may execute
on a single CUDA core
C. A thread block is assigned to a warp, and each thread in the warp is executed on a
separate CUDA core
A. Source register
B. Memory
C. Data
D. Destination register
100. Types of HPC application
A. Mass Media
B. Business
C. Management
D. Science
A. intraprocessor communication
B. intraprocess and intraprocessor communication
C. interprocess and interprocessor communication
D. interprocessor communication
A. Serial computation
B. Excess computation
C. perpendicular computation
D. parallel computing
A. ILP
B. Performance
C. Cost effectiveness
D. delay
104. A tightly coupled set of threads executing a single task is called
A. Multithreading
B. Parallel processing
C. Recurrence
D. Serial processing
B. Bit model
C. Data model
D. Network model
A. reverse message
B. receive message
C. forward message
D. Collect message
A. Binary bit
B. Flag bit
C. Signed bit
D. Unsigned bit
108. In interprocessor communication, the misses that arise are called
A. hit rate
B. coherence misses
C. commit misses
D. parallel processing
A. control unit
B. microprocessor
C. processing unit
D. microprocessor or processing unit
110. _________ gives the theoretical speedup in latency of the execution of a task at fixed
execution time
A. Amdahl's
B. Moore's
C. Metcalfe's
D. Gustafson's law
111. The number and size of tasks into which a problem is decomposed determines the
A. fine granularity
B. coarse granularity
C. sub task
D. granularity
113. Private data are used by a single processor, while shared data are used by a
A. Single processor
B. Multi processor
C. Single tasking
D. Multi tasking
114. The time lost due to the branch instruction is often referred to as ____________
A. Delay
B. Branch penalty
C. Latency
D. control hazard
A. cache
B. shared memory
C. message passing
D. distributed memory
A. Global scheduling
B. Local Scheduling
C. post scheduling
D. pre scheduling
A. CISC
B. RISC
C. ISA
D. IANA
A. Unsign Char
B. Sign character
C. Long Char
D. unsign long char
---------------------------------------------------------------------------------------------------------------------
SET 2 (26 MCQs)
---------------------------------------------------------------------------------------------------------------------
4. In the following code, which particular line is responsible for copying from device to host?
#include <iostream>
__global__ void add( int a, int b, int *c ) {
*c = a + b;
}
int main( void ) {
int c; int *dev_c;
HANDLE_ERROR( cudaMalloc( (void**)&dev_c, sizeof(int) ) );
add<<<1,1>>>( 2, 7, dev_c );
HANDLE_ERROR( cudaMemcpy( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost ) );
printf( "2 + 7 = %d\n", c );
cudaFree( dev_c );
return 0;
}
a. c, dev_c, sizeof(int);
b. HANDLE_ERROR( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost );
c. HANDLE_ERROR( cudaMemcpy( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost ) );
d. cudaMemcpy( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost ) ;
Ans.c
5. What will be the output of the following code?
#include <iostream>
__global__ void add( int a, int b, int *c ) {
*c = a + b;
}
int main( void ) {
int c; int *dev_c;
HANDLE_ERROR( cudaMalloc( (void**)&dev_c, sizeof(int) ) );
add<<<1,1>>>( 2, 7, dev_c );
HANDLE_ERROR( cudaMemcpy( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost ) );
printf( "2 + 7 = %d\n", c );
cudaFree( dev_c );
return 0;
}
a.2
b.9
c.7
d.0
Ans. b
6. The __global__ qualifier ________
a. alerts the compiler that a function should be compiled to run on a device instead of the host
b. alerts the interpreter that a function should be compiled to run on a device instead of the host
c. alerts the interpreter that a function should be interpreted to run on a device instead of the host
d. alerts the interpreter that a function should be compiled to run on a host instead of the device
ans.a
7. The on-chip memory which is local to every multithreaded Single Instruction Multiple Data
(SIMD) Processor is called
a. Local Memory
b. Global Memory
c. Flash memory
d. Stack
Ans. a
8. The machine object that the hardware creates, manages, schedules, and executes is a thread
of
a. DIMS instructions
b. DMM instructions
c. SIMD instructions
d. SIM instructions
Ans. c
10. Which of the following architectures is/are not suitable for realizing SIMD ?
a. Vector Processor
b. Array Processor
c. Von Neumann
d. All of the above
Ans . c
d.nvcc
Ans.d
18. Which function is used to free memory in CUDA?
a.cudaFree()
b.Free()
c.Cudafree()
d.CudaFree()
Ans. a
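As a quick illustration of the allocate/free pairing behind this answer, a minimal sketch using only standard CUDA runtime calls (error checking omitted for brevity; the buffer name and size are arbitrary):

#include <cuda_runtime.h>

int main(void)
{
    int *dev_buf = NULL;
    /* Allocate 256 ints in device (GPU) memory. */
    cudaMalloc((void **)&dev_buf, 256 * sizeof(int));
    /* ... launch kernels / cudaMemcpy using dev_buf here ... */
    /* Every cudaMalloc should be matched by a cudaFree. */
    cudaFree(dev_buf);
    return 0;
}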
23.Which of the following correctly describes the relationship between Warps, thread blocks,
and CUDA cores?
a.A warp is divided into a number of thread blocks, and each thread block executes on a single
CUDA core
b.A thread block may be divided into a number of warps, and each warp may execute on a single
CUDA core
c.A thread block is assigned to a warp, and each thread in the warp is executed on a separate
CUDA core
d. A block index is same as thread index
Ans .b
24. The processor that is assigned a thread block and executes its code is usually called a
A. multithreaded MIMD processor
b. multithreaded SIMD processor
c. multithreaded
D. multicore
Ans. b
25. Threads that are blocked together and executed in sets of 32 threads are called a
a.block of thread
b.thread block
c.thread
d.block
Ans. b
---------------------------------------------------------------------------------------------------------------------
SET 3 (30 MCQs)
---------------------------------------------------------------------------------------------------------------------
Unit I
1. Conventional architectures coarsely comprise of a_
A. A processor
B. Memory system
C. Data path
D. All of Above
2. Data intensive applications utilize_
A. High aggregate throughput
B. High aggregate network bandwidth
C. High processing and memory system performance
D. None of above
4. Scheduling of instructions is determined_
A True Data Dependency
B Resource Dependency
C Branch Dependency
D All of above
6. Memory system performance is largely captured by_
A Latency
B Bandwidth
C Both a and b
D none of above
8. A single control unit that dispatches the same Instruction to various processors is__
A SIMD
B SPMD
C MIMD
D None of above
9. The primary forms of data exchange between parallel tasks are_
A. Accessing a shared data space
B. Exchanging messages
C. Both A and B
D. None of Above
Unit 2
1. The First step in developing a parallel algorithm is_
A. To Decompose the problem into tasks that can be executed concurrently
B. Execute directly
C. Execute indirectly
D. None of Above
2. The number of tasks into which a problem is decomposed determines its_
A. Granularity
B. Priority
C. Modernity
D. None of above
4. The graph of tasks (nodes) and their interactions/data exchange (edges)_
A. Is referred to as a task interaction graph
B. Is referred to as a task Communication graph
C. Is referred to as a task interface graph
D. None of Above
5. Mappings are determined by_
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
7. The Owner Computes Rule generally states that the process assigned a particular data item is
responsible for_
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
9. Speculative Decomposition consist of_
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
10. Task characteristics include:
A. Task generation
B. Task sizes
C. Size of data associated with tasks
D. All of Above
Unit 3
1. Group communication operations are built using point-to-point messaging primitives
A. True
B. False
3. The dual of one-to-all broadcast is_
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
4. A hypercube has_
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
5. A binary tree in which processors are (logically) at the leaves and internal nodes are routing
nodes.
A. True
B. False
6. In All-to-All Broadcast each processor is the source as well as destination.
A. True
B. False
8. In the scatter operation_
A. Single node send a unique message of size m to every other node
B. Single node send a same message of size m to every other node
C. Single node send a unique message of size m to next node
D. None of Above
9. The gather operation is exactly the inverse of the_
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
10. In All-to-All Personalized Communication Each node has a distinct message of size m for
every other node
A. True
B. False
---------------------------------------------------------------------------------------------------------------------
SET 4 ( MCQs)
---------------------------------------------------------------------------------------------------------------------
1.Message passing system allows processes to : a) communicate with one another without
resorting to shared data b) communicate with one another by resorting to shared data c) share
data d) name the recipient or sender of the message
Ans-a
2. An IPC facility provides at least two operations : a) write & delete message b) delete &
receive message c) send & delete message d) receive & send message
Ans- d
3.Messages sent by a process : a) have to be of a fixed size b) have to be a variable size c) can be
fixed or variable sized d) None of the mentioned
Ans- c
4.The link between two processes P and Q to send and receive messages is called : a)
communication link b) message-passing link c) synchronization link d) all of the mentioned
Ans- a
5.In the Zero capacity queue : a) the queue can store at least one message b) the sender blocks
until the receiver receives the message c) the sender keeps sending and the messages don’t wait
in the queue d) none of the mentioned
Ans- b
6. Inter Process Communication (IPC):
a) allows processes to communicate and synchronize their actions when using the same address
space.
b) allows processes to communicate and synchronize their actions without using the same
address space.
c) allows the processes to only synchronize their actions without communication.
d) None of these
Ans- b
11. A single instruction is applied to multiple data items to produce the ___ output(s).
a)multiple
b)different
c)same
Ans- c
18) Which of the following bus is used to transfer data from main memory to peripheral device?
A. DMA bus
B. Output bus
C. Data bus
D.All of the above
Answer: C. Data bus
C. secondary storage
D. control memory
E. cache memory
Answer: D. control memory
B. Dynamic behaviour
C. Static behaviour
D. Speed
E. None of the answers above is correct
Answer: B. Dynamic behaviour
B. Control unit
C. Arithmetic logical unitD. Instruction set
E. None of the answers above is correct
Answer: A. Processor/memory interface
26) How does the number of transistors per chip increase according to Moore's law?
A. Quadratically
B. Linearly
C. Cubically
D. Exponentially
E. None of the answers above is correct
Answer: D. Exponentially
27) Which value has the speedup of a parallel program that achieves an efficiency of 75% on 32
processors?
A. 18
B. 24
C. 16
D. 20
E. None of the answers above is correct
Answer: B. 24
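Worked check: by the definitions used earlier in this bank, S = E * p = 0.75 * 32 = 24.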
29). The concept of pipelining is most effective in improving performance if the tasks being
performed in different stages :
A. require different amount of time
B. require about the same amount of time
C. require different amount of time with time difference between any two tasks being same
D. require different amount with time difference between any two tasks being different
Answer: B. require about the same amount of time
B. memory-monitor communication
C. pipelining
D. none of the above
Answer: C. pipelining
36) In daisy-chaining priority method, all the devices that can request an interrupt are connected
in
A. parallel
B. serial
C. random
D. none of the above
Answer: B. serial
37 ) Which one of the following is a characteristic of CISC (Complex Instruction Set Computer)
A. Fixed format instructions
B. Variable format instructions
C. Instructions are executed by hardware
D. None of the above
Answer: B. Variable format instructions
38). During the execution of the instructions, a copy of the instructions is placed in the ______ .
A. Register
B. RAM
C. System heap
D. Cache
Answer: D. Cache
39 ) Two processors A and B have clock frequencies of 700 Mhz and 900 Mhz respectively.
Suppose A can execute an instruction with an average of 3 steps and B can execute with an
average of 5 steps. For the execution of the same instruction which processor is faster ?
A. A
B. B
C. Both take the same time
D. Insuffient information
Answer: A. A
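Worked check: average time per instruction = steps / clock rate; for A, 3 / 700 MHz ~ 4.3 ns, and for B, 5 / 900 MHz ~ 5.6 ns, so A is faster.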
40 )A processor performing fetch or decoding of different instruction during the execution of
another instruction is called ______ .
A. Super-scaling
B. Pipe-lining
C. Parallel Computation
D. None of these
Answer: B. Pipe-lining
44 )The ultimate goal of a compiler is to,
A. Reduce the clock cycles for a programming task.
B. Reduce the size of the object code.
C. Be versatile.
D. Be able to detect even the smallest of errors.
Answer: A. Reduce the clock cycles for a programming task.
45 )To which class of systems does the von Neumann computer belong?
A. SIMD (Single Instruction Multiple Data)
B. MIMD (Multiple Instruction Multiple Data)
C. MISD (Multiple Instruction Single Data)
D. SISD (Single Instruction Single Data)
E. None of the answers above is correct.
Answer: D. SISD (Single Instruction Single Data)
46). Parallel programs: Which speedup could be achieved according to Amdahl´s law for infinite
number of processors if 5% of a program is sequential and the remaining part is ideally parallel?
A. Infinite speedup
B. 5
C. 20
D. 50
E. None of the answers above is correct.
Answer: C. 20
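Worked check: Amdahl's law gives S(p) = 1 / (s + (1 - s)/p) with sequential fraction s = 0.05; as p grows without bound, S -> 1/s = 1/0.05 = 20.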
48 Which MIMD systems are best scalable with respect to the number of processors?
A. Distributed memory computers
B. ccNUMA systems
C. nccNUMA systems
D. Symmetric multiprocessors
E. None of the answers above is correct
Answer: A. Distributed memory computers
49 ). Cache coherence: For which shared (virtual) memory systems is the snooping protocol
suited?
A. Crossbar connected systems
B. Systems with hypercube network
C. Systems with butterfly network
D. Bus based systems
E. None of the answers above is correct.
Answer: D. Bus based systems
Unit 2
1. The best mode of connection between devices which need to send or receive large amounts of
data over a short distance is _____
a) BUS
b) Serial port
c) Parallel port
d) Isochronous port
Answer: c
Explanation: The parallel port transfers around 8 to 16 bits of data simultaneously over the lines,
hence increasing transfer rates.
8. In the output interface of the parallel port, along with the valid signal ______ is also sent.
a) Data
b) Idle signal
c) Interrupt
d) Acknowledge signal
Answer: b
Explanation: The idle signal is used to check if the device is idle and ready to receive data.
3. Parallel computing uses _____ execution.
a) sequential
b) unique
c) simultaneous
d) none of the above
ans c
4. Heap can be used as ________________
a) Priority queue
b) Stack
c) A decreasing order array
d) None of the mentioned
Answer: a
Explanation: The property of heap that the value of root must be either greater or less than both
of its children makes it work like a priority queue.
6. Which of the following is true about parallel computing performance?
a. Computations use multiple processors.
b. There is an increase in speed.
c. The increase in speed is loosely tied to the number of processor or computers used.
d. All of the answers are correct.
ANS: a
12) The architecture in which no special machines manage the network and resources are shared
is known as
A. Peer-to-Peer B. Space based C. Tightly coupled D. Loosely coupled E. All of these F. None
of these
Answer- A
13) Virtualization that creates one single address space architecture is called
A. Loosely coupled B. Peer-to-Peer C. Space-based D. Tightly coupled E. Loosely coupled F.
All of these G. None of these
Answer- C
22) Data centers and centralized computing cover many ________.
A. Microcomputers B. Minicomputers C. Mainframe computers D. Supercomputers E. All of
these F. None of these
Answer- D
23) Execution of several activities at the same time. a) processing b) parallel processing c) serial
processing d) multitasking
Answer: b
27) A parallelism based on increasing processor word size. a) Increasing b) Count based c) Bit
based d) Bit level
Answer: d
28) A type of parallelism that uses micro architectural techniques. a) instructional b) bit level c)
bit based d) increasing
Answer: A
29) MIPS stands for? a) Mandatory Instructions/sec b) Millions of Instructions/sec c) Most of
Instructions/sec d) Many Instructions / sec
Answer: B
31) Computer has a built-in system clock that emits millions of regularly spaced electric pulses
per _____ called clock cycles. a) second b) millisecond c) microsecond d) minute
Answer: a
32) It takes one clock cycle to perform a basic operation. a) True b) False
Answer: a
33) The operation that does not involves clock cycles is _________ a) Installation of a device b)
Execute c) Fetch d) Decode
Answer: a
34). The number of clock cycles per second is referred to as ________ a) Clock speed b) Clock
frequency c) Clock rate d) Clock timing
Answer: a
35). CISC stands for ____________ a) Complex Information Sensed CPU b) Complex
Instruction Set Computer c) Complex Intelligence Sensed CPU d) Complex Instruction Set CPU
Answer: b
36). Which of the following processor has a fixed length of instructions? a) CISC b) RISC c)
EPIC d) Multi-core
Answer: b
37). Processor which is complex and expensive to produce is ________ a) RISC b) EPIC c)
CISC d) Multi-core
Answer: c
38). The architecture that uses a tighter coupling between the compiler and the processor is
____________ a) EPIC b) Multi-core c) RISC d) CISC
Answer: a
39). MAR stands for ___________ a) Memory address register b) Main address register c) Main
accessible register
d) Memory accessible register
Answer: a.
40). A circuitry that processes that responds to and processes the basic instructions that are
required to drive a computer system is ________ a) Memory b) ALU c) CU d) Processor
Answer: d
41) The graph of tasks (nodes) and their interactions/data exchange (edges)_
A. Is referred to as a task interaction graph
B. Is referred to as a task Communication graph
C. Is referred to as a task interface graph
D. None of Above
Answer: A
Decomposition Techniques are_
A. recursive decomposition
B. data decomposition
C. exploratory decomposition
D. speculative decomposition
E. All of Above
Answer: E
44) The Owner Computes Rule generally states that the process assigned a particular data item is
responsible for_
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
Answer: A
Unit 3
3 Multipoint topology is
a.Bus
b.Star
c.Mesh
d.Ring
Ans:- a
c.WAN
d.Internetwork
Ans:- a
10.In a star-topology Ethernet LAN, _______ is just a point where the signals coming from
different stations collide; it is the collision point.
a.An active hub
b.A passive hub
c.either (a) or (b)
d.neither (a) nor (b)
Ans:- b
11) Group communication operations are built using point-to-point messaging primitives
A. True
B. False
Ans:- A
A hypercube has_
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
Ans:- A
19) In the TV receivers, the device used for tuning the receiver to the incoming signal is
a. Varactor diode
b. High pass Filter
c. Zener diode
d. Low pass filter
Ans: a
20) The modulation technique that uses the minimum channel bandwidth and transmitted power
is
a. FM
b. DSB-SC
c. VSB
d. SSB
Ans: d
21) Calculate the bandwidth occupied by a DSB signal when the modulating frequency lies in
the range from 100 Hz to 10KHz.
a. 28 KHz
b. 24.5 KHz
c. 38.6 KHz
d. 19.8 KHz
Ans: d
23) What is a high-performance multi-core processor that can be used to accelerate a wide
variety of applications using parallel computing?
1. CLU
2. GPU
3. CPU
4. DSP
ANS-2
26. Interprocessor communication takes place via
1. Centralized memory
2. Shared memory
3. Message passing
4. Both A and B
ANS-4
27. Decomposition into a large number of tasks results in coarse -grained decomposition
1. True
2. False
ANS-2
28. The fetch and execution cycles are interleaved with the help of __
1. Modification in processor architecture
2. Clock
3. Special unit
4. Control unit
ANS-2
29. The processor of the system which can read/write GPU memory is known as the
1. kernel
2. device
3. Server
4. Host
ANS-4
30. Increasing the granularity of decomposition and utilizing the resulting concurrency to
perform more tasks in parallel decreases performance.
1. TRUE
2. FALSE
ANS-2
---------------------------------------------------------------------------------------------------------------------
SET 5 (MCQs)
---------------------------------------------------------------------------------------------------------------------
Which of the following are performance metrics for parallel systems?
A. Execution Time
B. Total Parallel Overhead
C. Speedup
D. Efficiency
E. Cost
F. All above
Answer : F
The efficiency of a parallel program can be written as: E = Ts / pTp. True or False?
A. True
B. False
Answer : A
The overhead function or total overhead of a parallel system is defined as the total time
collectively spent by all the processing elements over and above that required by the fastest
known sequential algorithm for solving the same problem on a single processing element.
True or False?
A. True
B. False
Answer : A
What is Speedup?
A. A measure that captures the relative benefit of solving a problem in parallel. It is defined as the
ratio of the time taken to solve a problem on a single processing element to the time required to
solve the same problem on a parallel computer with p identical processing elements.
B. A measure of the fraction of time for which a processing element is usefully
employed.
C. None of the above
Answer : A
In an ideal parallel system, speedup is equal to p and efficiency is equal to one. True or
False?
A. True
B. False
Answer : A
Using fewer than the maximum possible number of processing elements to execute a
parallel algorithm is called ______________ a parallel system in terms of the number of
processing elements.
A. Scaling down
B. Scaling up
Answer : A
The __________________ function determines the ease with which a parallel system can
maintain a constant efficiency and hence achieve speedups increasing in proportion to the
number of processing elements.
A. Isoefficiency
B. Efficiency
C. Scalability
D. Total overhead
Answer : A
Minimum execution time for adding n numbers is Tp = n/p + 2 log p. True or False?
A. True
B. False
Answer : A
Formally, given a weighted graph G(V, E, w), the all-pairs shortest paths problem is to
find the shortest paths between all pairs of vertices. True or False?
A. True
B. False
Answer : A
Search algorithms can be used to solve discrete optimization problems. True or False ?
A. True
B. False
Answer : A
A. Work- Splitting Strategies
B. Load balancing Schemes
C. All of above
Answer : C
In a compare-split operation
A. Each process sends its block of size n/p to the other process
B. Each process merges the received block with its own block and retains only the
appropriate half of the merged block
C. Both A & B
Answer : C
Unit I
2. Data intensive applications utilize_
A High aggregate throughput
B High aggregate network bandwidth
C High processing and memory system performance.
D None of above
3. A pipeline is like_
A Overlaps various stages of instruction execution to achieve performance.
B House pipeline
C Both a and b
D A gas line
8. A single control unit that dispatches the same Instruction to various processors is__
A SIMD
B SPMD
C MIMD
D None of above
16. Switches map a fixed number of inputs to outputs.
A True
B False
Unit 2
7. The Owner Computes Rule generally states that the process assigned a particular data
item is responsible for_
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
Unit 3
4. A hypercube has_
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
5. A binary tree in which processors are (logically) at the leaves and internal nodes are
routing nodes.
A. True
B. False
6. In All-to-All Broadcast each processor is the source as well as destination.
A. True
B. False
10. In All-to-All Personalized Communication Each node has a distinct message of size m
for every other node
A. True
B. False
3. HPC is not used in high span bridges.
a) True
b) False
Answer: b
Explanation: Major applications of high-performance concrete in the field of Civil
Engineering constructions have been in the areas of long-span bridges, high-rise buildings
or structures, highway pavements, etc.
4. Concrete having 28- days’ compressive strength in the range of 60 to 100 MPa.
a) HPC
b) VHPC
c) OPC
d) HSC
Answer: a
Explanation: High Performance Concrete having 28- days’ compressive strength in the
range of 60 to 100 MPa
5. Concrete having 28-days compressive strength in the range of 100 to 150 MPa.
a) HPC
b) VHPC
c) OPC
d) HSC
Answer: b
Explanation: Very high performing Concrete having 28-days compressive strength in the
range of 100 to 150 MPa.
7. The choice of cement for high-strength concrete should not be based only on mortar-cube
tests but it should also include tests of compressive strengths of concrete at
___________ days.
a) 28, 56, 91
b) 28, 60, 90
c) 30, 60, 90
d) 30, 45, 60
Answer: a
Explanation: The choice of cement for high-strength concrete should not be based only on
mortar-cube tests but it should also include tests of compressive strengths of concrete at
28, 56, and 91 days.
Explanation: Many studies have found that 9.5 mm to 12.5 mm nominal maximum size
aggregates gives optimum strength.
10. Due to low w/c ratio _____________
a) It doesn’t cause any problems
b) It causes problems
c) Workability is easy
d) Strength is more
Answer: b
Explanation: Due to the low w/c ratio, it causes problems so superplasticizers are used.
Which of the following statements are true with regard to compute capability in CUDA
A. Code compiled for hardware of one compute capability will not need to be recompiled
to run on hardware of another
B. Different compute capabilities may imply a different amount of local memory per
thread
C. Compute capability is measured by the number of FLOPS a GPU accelerator can
compute.
Answer : B
True or False: The threads in a thread block are distributed across SM units so that each
thread is executed by one SM unit.
A. True
B. False
Answer : B
True or false: Functions annotated with the __global__ qualifier may be executed on the
host or the device
A. True
B. False
Answer : B
What strategy does the GPU employ if the threads within a warp diverge in their execution?
A. Threads are moved to different warps so that divergence does not occur within a
single warp
B. Threads are allowed to diverge
C. All possible execution paths are run by all threads in a warp serially so that thread
instructions do not diverge
Answer : C
Which of the following does not result in uncoalesced (i.e. serialized) memory access on the
K20 GPUs installed on Stampede
A. Aligned, but non-sequential access
B. Misaligned data access
C. Sparse memory access
Answer : A
Which of the following correctly describes the relationship between Warps, thread blocks,
and CUDA cores?
A. A warp is divided into a number of thread blocks, and each thread block executes on
a single CUDA core
B. A thread block may be divided into a number of warps, and each warp may execute
on a single CUDA core
C. A thread block is assigned to a warp, and each thread in the warp is executed on a
separate CUDA core
Answer : B
The GPU executes device code.
A. True
B. False
Answer : A
---------------------------------------------------------------------------------------------------------------------
SET 6 (MCQs)
---------------------------------------------------------------------------------------------------------------------
2)
Parallel processing has single execution flow.
a) True b) False
Ans: b Explanation: The statement is false. Sequential programming specifically has single
execution flow.
3)
A term for simultaneous access to a resource, physical or logical.
a) Multiprogramming b) Multitasking c) Threads d) Concurrency
Ans: d Explanation: Concurrency is the term used for the same. When several things are
accessed simultaneously, the job is said to be concurrent.
4)
______________ leads to concurrency.
a) Serialization b) Parallelism c) Serial processing d) Distribution Ans: b Explanation:
Parallelism leads naturally to Concurrency. For example, Several processes trying to print a file
on a single printer.
5)
A parallelism based on increasing processor word size.
a) Increasing b) Count based c) Bit based d) Bit level
Ans: d Explanation: Bit level parallelism is based on increasing processor word size. It focuses
on hardware capabilities for structuring.
6)
The measure of the “effort” needed to maintain efficiency while adding processors.
a) Maintainability b) Efficiency
c) Scalability d) Effectiveness
Ans: C Explanation: The measure of the “effort” needed to maintain efficiency while adding
processors is called as scalability.
7)
Several instructions execution simultaneously in ________________
a) processing b) parallel processing c) serial processing d) multitasking
Ans: b Explanation: In parallel processing, the several instructions are executed simultaneously.
8)
Conventional architectures coarsely comprise of a_
a) A processor
b) Memory system
c) Data path.
d) All of Above
Ans: d Explanation:
9) A pipeline is like_
a) Overlaps various stages of instruction execution to achieve performance.
b) House pipeline
c) Both a and b
d) A gas line
Ans: a Explanation:
11)
Memory system performance is largely captured by_
a) Latency
b) Bandwidth
c) Both a and b
d) none of above
Ans: c Explanation:
12)
The fraction of data references satisfied by the cache is called_
a) Cache hit ratio
b) Cache fit ratio
c) Cache best ratio
d) none of above
Ans: a Explanation:
13)
A single control unit that dispatches the same Instruction to various processors is__
a) SIMD
b) SPMD
c) MIMD
d) None of above
Ans: a Explanation:
14)
The primary forms of data exchange between parallel tasks are_
a) Accessing a shared data space
b) Exchanging messages.
c) Both A and B
d) None of Above Ans: c Explanation:
16)
Switches map a fixed number of inputs to outputs.
a) True
b) False
Ans: a Explanation:
UNIT-2
1)
The First step in developing a parallel algorithm is_
a) To Decompose the problem into tasks that can be executed concurrently
b) Execute directly
c) Execute indirectly
d) None of Above
Ans: a Explanation:
2)
The number of tasks into which a problem is decomposed determines its_
a) Granularity
b) Priority
c) Modernity
d) None of above
Ans: A Explanation:
3)
The length of the longest path in a task dependency graph is called_
a) the critical path length
b) the critical data length
c) the critical bit length
d) None of above
Ans: a Explanation:
4)
The graph of tasks (nodes) and their interactions/data exchange (edges)_
a) Is referred to as a task interaction graph
b) Is referred to as a task Communication graph
c) Is referred to as a task interface graph
d) None of Above
Ans: a Explanation:
5)
Mappings are determined by_
a) task dependency
b) task interaction graphs
c) Both A and B
d) None of Above
Ans: c Explanation:
6)
Decomposition Techniques are_
a) recursive decomposition
b) data decomposition
c) exploratory decomposition
d) speculative decomposition
e) All of Above
Ans: E Explanation:
7)
The Owner Computes Rule generally states that the process assigned a particular data item is
responsible for_
a) All computation associated with it
b) Only one computation
c) Only two computation
d) Only occasionally computation
Ans: A Explanation:
8)
A simple application of exploratory decomposition is_
a) The solution to a 15 puzzle
b) The solution to 20 puzzle
c) The solution to any puzzle
d) None of Above
Ans: A Explanation:
9)
Speculative Decomposition consist of _
a) conservative approaches
b) optimistic approaches
c) Both A and B
d) Only B
Ans: C Explanation:
10)
task characteristics include:
a) Task generation.
b) Task sizes.
c) Size of data associated with tasks.
d) All of Above
Ans: d Explanation:
UNIT-3
1)
Group communication operations are built using point-to-point messaging primitives
a) True
b) False
Ans: A Explanation:
2)
Communicating a message of size m over an uncongested network takes time ts + m*tw
a) True
b) False
Ans: A Explanation:
3)
The dual of one-to-all broadcast is_
a) All-to-one reduction
b) All-to-one receiver
c) All-to-one Sum
d) None of Above
Ans: A Explanation:
4)
A hypercube has_
a) 2^d nodes
b) 2d nodes
c) 2n Nodes
d) N Nodes
Ans: a Explanation:
5)
A binary tree in which processors are (logically) at the leaves and internal nodes are routing
nodes.
a) True
b) False
Ans: A Explanation:
6)
In All-to-All Broadcast each processor is the source as well as destination.
a) True
b) False
Ans: A Explanation:
7)
The Prefix Sum Operation can be implemented using the_
a) All-to-all broadcast kernel.
b) All-to-one broadcast kernel.
c) One-to-all broadcast Kernel
d) Scatter Kernel
Ans: A Explanation:
8)
In the scatter operation_
a) Single node send a unique message of size m to every other node
b) Single node send a same message of size m to every other node
c) Single node send a unique message of size m to next node
d) None of Above
Ans: A Explanation:
9)
The gather operation is exactly the inverse of the_
a) Scatter operation
b) Broadcast operation
c) Prefix Sum
d) Reduction operation
Ans: A Explanation:
10)
In All-to-All Personalized Communication Each node has a distinct message of size m for every
other node
a) True
b) False
Ans: a Explanation:
UNIT-1
1) Conventional architectures coarsely comprise of a_
a) processor
b) Memory system
c) Datapath.
d) All of Above
Ans: d
Explanation:
2) Data intensive applications utilize______
a) High aggregate throughput
b) High aggregate network bandwidth
c) High processing and memory system performance.
d) None of above
Ans: a
Explanation:
3) A pipeline is like_____
a) Overlaps various stages of instruction execution to achieve performance.
b) House pipeline
c) Both a and b
d) A gas line
Ans: a
Explanation:
4) Scheduling of instructions is determined ____
a) True Data Dependency
b) Resource Dependency
c) Branch Dependency
d) All of above
Ans: d
Explanation:
5) VLIW processors rely on______
Explanation:
6) Memory system performance is largely captured by_____
a) Latency
b) Bandwidth
c) Both a and b
d) none of above
Ans: c
Explanation:
7) The fraction of data references satisfied by the cache is called_
a) Cache hit ratio
b) Cache fit ratio
c) Cache best ratio
d) none of above
Ans: a
Explanation:
8) A single control unit that dispatches the same Instruction to various processors is__
a) SIMD
b) SPMD
c) MIMD
d) None of above
Ans: a
Explanation:
9) The primary forms of data exchange between parallel tasks are_
a) Accessing a shared data space
b) Exchanging messages.
c) Both A and B
d) None of Above
Ans: c
12) The CPU decodes the instructions and generates control words in
a) Prefetch stage
b) D1 (first decode) stage
c) D2 (second decode) stage
d) Final stage
Ans: b
Explanation: In D1 stage, the CPU decodes the instructions and generates control words. For
simple RISC instructions, only single control word is enough for starting the execution.
13) The fifth stage of pipeline is also known as
a) read back stage
b) read forward stage
c) write back stage
d) none of the mentioned
Ans: c
Explanation: The fifth stage or final stage of pipeline is also known as “Write back (WB)
stage”.
14) In the execution stage the function performed is
a) CPU accesses data cache
b) executes arithmetic/logic computations
c) executes floating point operations in execution unit
d) all of the mentioned
Ans: d
Explanation: In the execution stage, known as E-stage, the CPU accesses data cache, executes
arithmetic/logic computations, and floating point operations in execution unit.
15) The stage in which the CPU generates an address for data memory references in this
stage is
a) prefetch stage
b) D1 (first decode) stage
c) D2 (second decode) stage
d) execution stage
Ans: c
Explanation: In the D2 (second decode) stage, CPU generates an address for data memory
references in this stage. This stage is required where the control word from D1 stage is again
decoded for final execution.
c) write back stage
d) none of the mentioned
Ans: c
Explanation: In the two execution stages of X1 and X2, the floating point unit reads the data
from the data cache and executes the floating point computation. In the “write back stage” of
pipeline, the FPU (Floating Point Unit) writes the results to the floating point register file.
19) The floating point multiplier segment performs floating point multiplication in
a) single precision
b) double precision
c) extended precision
d) all of the mentioned
Ans: d
Explanation: The floating point multiplier segment performs floating point multiplication in
single precision, double precision and extended precision.
20) The instruction or segment that executes the floating point square root instructions is
a) floating point square root segment
b) floating point division and square root segment
c) floating point divider segment
d) none of the mentioned
Ans: c
Explanation: The floating point divider segment executes the floating point division and square
root instructions.
21) The floating point rounder segment performs rounding off operation at
2. overflow
3. denormal operand
4. underflow
5. invalid operation.
7. The Owner Computes Rule generally states that the process assigned a particular data item is
responsible for_
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
9. Speculative Decomposition consist of_
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
13. ____________ is due to load imbalance, synchronization, or serial components as parts of
overheads in parallel programs.
a. Interprocess interaction
b. Synchronization
c. Idling
d. Excess computation
14. Which of the following parallel methodological design elements focuses on recognizing
opportunities for parallel execution?
a. Partitioning
b. Communication
c. Agglomeration
d. Mapping
15. Considering to use weak or strong scaling is part of ______________ in addressing the
challenges of distributed memory programming.
a. Splitting the problem
b. Speeding up computations
c. Speeding up communication
d. Speeding up hardware
16. Domain and functional decomposition are considered in the following parallel
methodological design elements, EXCEPT:
a. Partitioning
b. Communication
c. Agglomeration
d. Mapping
17. Synchronization is one of the common issues in parallel programming. The issues related to
synchronization include the followings, EXCEPT:
a. Deadlock
b. Livelock
c. Fairness
d. Correctness
18. Which of the followings is the BEST description of Message Passing Interface (MPI)?
a. A specification of a shared memory library
b. MPI uses objects called communicators and groups to define which collection of
processes may communicate with each other
c. Only communicators and not groups are accessible to the programmer only by a "handle"
d. A communicator is an ordered set of processes
---------------------------------------------------------------------------------------------------------------------
SET 7 MCQs
---------------------------------------------------------------------------------------------------------------------
Which is an alternative option for latency hiding?
A. Increase CPU frequency
B. Multithreading
C. Increase Bandwidth
D. Increase Memory
ANSWER: B
______ Communication model is generally seen in tightly coupled
system.
A. Message Passing
B. Shared-address space
C. Client-Server
D. Distributed Network
ANSWER: B
The principal parameters that determine the communication latency
are as follows:
A. Startup time (ts) Per-hop time (th) Per-word transfer time (tw)
B. Startup time (ts) Per-word transfer time (tw)
C. Startup time (ts) Per-hop time (th)
D. Startup time (ts) Message-Packet-Size(W)
ANSWER: A
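Note: with these parameters, the standard communication cost models are t_comm = ts + (m*tw + th)*l for store-and-forward routing and t_comm = ts + l*th + m*tw for cut-through routing, for a message of m words traversing l links.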
The number and size of tasks into which a problem is decomposed
determines the __
A. Granularity
B. Task
C. Dependency Graph
D. Decomposition
ANSWER: A
Average Degree of Concurrency is...
A. The average number of tasks that can run concurrently over the
entire duration of execution of the process.
B. The average time that can run concurrently over the entire
duration of execution of the process.
C. The average in degree of task dependency graph.
D. The average out degree of task dependency graph.
ANSWER: A
Which task decomposition technique is suitable for the 15-puzzle
problem?
A. Data decomposition
B. Exploratory decomposition
C. Speculative decomposition
D. Recursive decomposition
ANSWER: B
Which of the following method is used to avoid Interaction
Overheads?
A. Maximizing data locality
B. Minimizing data locality
C. Increase memory size
D. None of the above.
ANSWER: A
Which of the following is not a parallel algorithm model?
A. The Data Parallel Model
B. The work pool model
C. The task graph model
D. The Speculative Model
ANSWER: D
Nvidia GPU based on following architecture
A. MIMD
B. SIMD
C. SISD
D. MISD
ANSWER: B
What is Critical Path?
A. The length of the longest path in a task dependency graph is
called the critical path length.
B. The length of the smallest path in a task dependency graph is
called the critical path length.
C. Path with loop
D. None of the mentioned.
ANSWER: A
Which decomposition technique uses the divide-and-conquer strategy?
A. recursive decomposition
B. data decomposition
C. exploratory decomposition
D. speculative decomposition
ANSWER: A
If there are 6 nodes in a ring topology, how many message passing
cycles will be required to complete the broadcast process in one-to-all?
A. 1
B. 6
C. 3
D. 4
ANSWER: C
If there is a 4 X 4 Mesh topology network, then how many ring operations
will be performed to complete a one-to-all broadcast?
A. 4
B. 8
C. 16
D. 32
ANSWER: B
Consider all to all broadcast in ring topology with 8 nodes. How
many messages will be present with each node after 3rd step/cycle of
communication?
A. 3
B. 4
C. 6
D. 7
ANSWER: B
Consider Hypercube topology with 8 nodes; how many message
passing cycles will be required for an all-to-all broadcast operation?
A. The longest path between any pair of finish nodes.
B. The longest directed path between any pair of start & finish
node.
C. The shortest path between any pair of finish nodes.
D. The number of maximum nodes level in graph.
ANSWER: D
Scatter is ____________.
A. One to all broadcast communication
B. All to all broadcast communication
C. One to all personalised communication
D. None of the above.
ANSWER: C
If there is 4X4 Mesh Topology, ______ message passing cycles will be
required to complete all-to-all reduction.
A. 4
B. 6
C. 8
D. 16
ANSWER: C
Which of the following issue(s) is/are true about sorting techniques with
parallel computing?
A. Large sequence is the issue
B. Where to store output sequence is the issue
C. Small sequence is the issue
D. None of the above
ANSWER: B
Partitioning of the series is done after ______________
A. Local arrangement
B. Processess assignments
C. Global arrangement
D. None of the above
ANSWER: C
In Parallel DFS, processes have the following roles. (Select multiple
choices if applicable)
A. Donor
B. Active
C. Idle
D. Passive
ANSWER: A
Suppose there are 16 elements in a series; how many phases will
be required to sort the series using parallel odd-even bubble sort?
A. 8
B. 4
C. 5
D. 15
ANSWER: D
Which are different sources of Overheads in Parallel Programs?
A. Interprocess interactions
B. Process Idling
C. All mentioned options
D. Excess Computation
ANSWER: C
What is Speedup?
A. The ratio of the time taken to solve a problem on parallel processors to the
time required to solve the same problem on a single processor
B. The ratio of the time taken to solve a problem on a single processor to the
time required to solve the same problem on a parallel computer with p
identical processing elements
C. The ratio of number of multiple processors to size of data
D. None of the above
ANSWER: B
Efficiency is a measure of the fraction of time for which a
processing element is usefully employed.
A. TRUE
B. FALSE
ANSWER: A
CUDA helps to execute code in parallel mode using __________
A. CPU
B. GPU
C. ROM
D. Cache memory
ANSWER: B
In thread-function execution scenario thread is a ___________
A. Work
B. Worker
C. Task
D. None of the above
ANSWER: B
In GPU Following statements are true
A. Grid contains Block
B. Block contains Threads
C. All the mentioned options.
D. SM stands for Streaming MultiProcessor
ANSWER: C
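A minimal CUDA sketch of the hierarchy described above (a grid of blocks, blocks of threads, scheduled onto streaming multiprocessors); the kernel, array size and launch shape are illustrative, assuming an nvcc toolchain:

#include <stdio.h>

/* Each thread handles one element; its global index is derived from
   its block and thread coordinates within the grid. */
__global__ void scale(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main(void) {
    const int n = 1024;
    float *x;
    cudaMallocManaged(&x, n * sizeof(float));  /* unified memory for brevity */
    for (int i = 0; i < n; i++) x[i] = 1.0f;
    scale<<<4, 256>>>(x, 2.0f, n);             /* grid = 4 blocks of 256 threads */
    cudaDeviceSynchronize();
    printf("x[0] = %f\n", x[0]);
    cudaFree(x);
    return 0;
}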
The computer system of a parallel computer is capable of _____________
A. Decentralized computing
B. Parallel computing
C. Centralized computing
D. All of these
ANSWER: A
In which application systems can distributed systems run well?
A. HPC
B. Distributed Framework
C. HRC
D. None of the above
ANSWER: A
A pipeline is like .................... ?
A. an automobile assembly line
B. house pipeline
C. both a and b
D. a gas line
ANSWER: A
Pipeline implements ?
A. fetch instruction
B. decode instruction
C. fetch operand
D. all of above
ANSWER: D
A processor performing fetch or decoding of different instruction
during the execution of another instruction is called ______ ?
A. Super-scaling
B. Pipe-lining
C. Parallel Computation
D. None of these
ANSWER: B
In a parallel execution, the performance will always improve as the
number of processors increases?
A. True
B. False
ANSWER: B
VLIW stands for ?
A. Very Long Instruction Word
B. Very Long Instruction Width
C. Very Large Instruction Word
D. Very Large Instruction Width
ANSWER: A
In VLIW the decision for the order of execution of the instructions
depends on the program itself?
A. True
B. False
ANSWER: A
Which one is not a limitation of a distributed memory parallel
system?
A. Higher communication time
B. Cache coherency
C. Synchronization overheads
D. None of the above
ANSWER: B
Which of these steps can create conflict among the processors?
A. Synchronized computation of local variables
B. Concurrent write
C. Concurrent read
D. None of the above
ANSWER: B
Which one is not a characteristic of NUMA multiprocessors?
A. It allows shared memory computing
B. Memory units are placed in physically different location
C. All memory units are mapped to one common virtual global memory
D. Processors access their independent local memories
ANSWER: D
Which of these is not a source of overhead in parallel computing?
A. Non-uniform load distribution
B. Less local memory requirement in distributed computing
C. Synchronization among threads in shared memory computing
D. None of the above
ANSWER: B
Systems that do not have parallel processing capabilities are?
A. SISD
B. SIMD
C. MIMD
D. All of the above
ANSWER: A
How does the number of transistors per chip increase according to
Moore's law?
A. Quadratically
B. Linearly
C. Cubically
D. Exponentially
ANSWER: D
Parallel processing may occur?
A. in the instruction stream
B. in the data stream
C. both[A] and [B]
D. none of the above
ANSWER: C
To which class of systems does the von Neumann computer belong?
A. SIMD (Single Instruction Multiple Data)
B. MIMD (Multiple Instruction Multiple Data)
C. MISD (Multiple Instruction Single Data)
D. SISD (Single Instruction Single Data)
ANSWER: D
Fine-grain threading is considered as a ______ threading?
A. Instruction- level
B. Loop level
C. Task-level
D. Function-level
ANSWER: A
A multiprocessor is a system with multiple CPUs, capable of
independently executing different tasks in parallel. In which
category does every processor and memory module have similar access time?
A. UMA
B. Microprocessor
C. Multiprocessor
D. NUMA
ANSWER: A
The misses that arise from interprocessor communication are called?
A. hit rate
B. coherence misses
C. commit misses
D. parallel processing
ANSWER: B
NUMA architecture uses _______in design?
A. cache
B. shared memory
C. message passing
D. distributed memory
ANSWER: D
A multiprocessor machine which is capable of executing multiple
instructions on multiple data sets?
A. SISD
B. SIMD
C. MIMD
D. MISD
ANSWER: C
In message passing, send and receive message between?
A. Task or processes
B. Task and Execution
C. Processor and Instruction
D. Instruction and decode
ANSWER: A
The First step in developing a parallel algorithm is_________?
A. To Decompose the problem into tasks that can be executed
concurrently
B. Execute directly
C. Execute indirectly
D. None of Above
ANSWER: A
The number of tasks into which a problem is decomposed determines
its?
A. Granularity
B. Priority
C. Modernity
D. None of above
ANSWER: A
The length of the longest path in a task dependency graph is called?
A. the critical path length
B. the critical data length
C. the critical bit length
D. None of above
ANSWER: A
The graph of tasks (nodes) and their interactions/data exchange
(edges)?
A. Is referred to as a task interaction graph
B. Is referred to as a task Communication graph
C. Is referred to as a task interface graph
D. None of Above
ANSWER: A
Mappings are determined by?
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
ANSWER: C
Decomposition Techniques are?
A. recursive decomposition
B. data decomposition
C. exploratory decomposition
D. All of Above
ANSWER: D
The Owner Computes Rule generally states that the process assigned a
particular data item is responsible for?
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
ANSWER: A
A simple application of exploratory decomposition is_?
A. The solution to a 15 puzzle
B. The solution to 20 puzzle
C. The solution to any puzzle
D. None of Above
ANSWER: A
Speculative Decomposition consist of _?
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
ANSWER: C
Task characteristics include?
A. Task generation.
B. Task sizes.
C. Size of data associated with tasks.
D. All of Above
ANSWER: D
Writing parallel programs is referred to as?
A. Parallel computation
B. Parallel processes
C. Parallel development
D. Parallel programming
ANSWER: D
Parallel Algorithm Models?
A. Data parallel model
B. Bit model
C. Data model
D. network model
ANSWER: A
The number and size of tasks into which a problem is decomposed
determines the?
A. fine-granularity
B. coarse-granularity
C. sub Task
D. granularity
ANSWER: D
A feature of a task-dependency graph that determines the average
degree of concurrency for a given granularity is its ___________
path?
A. critical
B. easy
C. difficult
D. ambiguous
ANSWER: A
The pattern of___________ among tasks is captured by what is known
as a task-interaction graph?
A. Interaction
B. communication
C. optimization
D. flow
ANSWER: A
Interaction overheads can be minimized by____?
A. Maximize Data Locality
B. Maximize Volume of data exchange
C. Increase Bandwidth
D. Minimize social media contents
ANSWER: A
Type of parallelism that is naturally expressed by independent tasks
in a task-dependency graph is called _______ parallelism?
A. Task
B. Instruction
C. Data
D. Program
ANSWER: A
Speed up is defined as a ratio of?
A. s=Ts/Tp
B. S= Tp/Ts
C. Ts=S/Tp
D. Tp=S /Ts
ANSWER: A
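As a worked example of this definition (numbers chosen for illustration): if Ts = 100 s on one processor and Tp = 20 s on p = 8 processors, then S = Ts / Tp = 100 / 20 = 5, and the corresponding efficiency is E = S / p = 5 / 8 = 0.625, i.e. each processing element is usefully busy 62.5% of the time.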
Parallel computing means to divide the job into several __________?
A. Bit
B. Data
C. Instruction
D. Task
ANSWER: D
_________ is a method for inducing concurrency in problems that can
be solved using the divide-and-conquer strategy?
A. exploratory decomposition
B. speculative decomposition
C. data-decomposition
D. Recursive decomposition
ANSWER: D
The___ time collectively spent by all the processing elements Tall =
p TP?
A. total
B. Average
C. mean
D. sum
ANSWER: A
Group communication operations are built using point-to-point
messaging primitives?
A. True
B. False
ANSWER: A
Communicating a message of size m over an uncongested network takes
time ts + twm?
A. True
B. False
ANSWER: A
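For context (the standard cost model, stated informally): ts is the startup latency and tw the per-word transfer time, so one m-word message costs ts + twm. Built on this, a naive one-to-all broadcast on a ring of p nodes costs (ts + twm)(p - 1), while the recursive-doubling version costs (ts + twm) log p.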
The dual of one-to-all broadcast is ?
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
ANSWER: A
A hypercube has?
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
ANSWER: A
A binary tree in which processors are (logically) at the leaves and
internal nodes are routing nodes?
A. True
B. False
ANSWER: A
In All-to-All Broadcast each processor is the source as well as
destination?
A. True
B. False
ANSWER: A
The Prefix Sum Operation can be implemented using the ?
A. All-to-all broadcast kernel.
B. All-to-one broadcast kernel.
C. One-to-all broadcast Kernel
D. Scatter Kernel
ANSWER: A
In the scatter operation ?
A. Single node send a unique message of size m to every other node
B. Single node send a same message of size m to every other node
C. Single node send a unique message of size m to next node
D. None of Above
ANSWER: A
The gather operation is exactly the inverse of the ?
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
ANSWER: A
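A minimal MPI sketch in C of the scatter/gather pair just described (assuming mpicc and exactly 4 processes; buffer sizes are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);        /* run with 4 processes */

    int send[16], part[4], back[16];
    if (rank == 0)
        for (int i = 0; i < 16; i++) send[i] = i;  /* one 4-int chunk per process */

    /* Scatter: the root sends a distinct message of size 4 to every process. */
    MPI_Scatter(send, 4, MPI_INT, part, 4, MPI_INT, 0, MPI_COMM_WORLD);
    /* Gather: the exact inverse -- the root concatenates one chunk per process. */
    MPI_Gather(part, 4, MPI_INT, back, 4, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("back[15] = %d\n", back[15]);
    MPI_Finalize();
    return 0;
}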
In All-to-All Personalized Communication Each node has a distinct
message of size m for every other node ?
A. True
B. False
ANSWER: A
Parallel algorithms often require a single process to send identical
data to all other processes or to a subset of them. This operation
is known as _________?
A. one-to-all broadcast
B. All to one broadcast
C. one-to-all reduction
D. all to one reduction
ANSWER: A
In which of the following operation, a single node sends a unique
message of size m to every other node?
A. Gather
B. Scatter
C. One to all personalized communication
D. Both B and C
ANSWER: D
Gather operation is also known as ________?
A. One to all personalized communication
B. One to all broadcast
C. All to one reduction
D. All to All broadcast
ANSWER: A
one-to-all personalized communication does not involve any
duplication of data?
A. True
B. False
ANSWER: A
Gather operation, or concatenation, in which a single node collects
a unique message from each node?
A. True
B. False
ANSWER: A
Conventional architectures coarsely comprise of a?
A. A processor
B. Memory system
C. Data path.
D. All of Above
ANSWER: D
Data intensive applications utilize?
A. High aggregate throughput
B. High aggregate network bandwidth
C. High processing and memory system performance.
D. None of above
ANSWER: A
A pipeline is like?
A. Overlaps various stages of instruction execution to achieve
performance.
B. House pipeline
C. Both a and b
D. A gas line
ANSWER: A
Scheduling of instructions is determined?
A. True Data Dependency
B. Resource Dependency
C. Branch Dependency
D. All of above
ANSWER: D
VLIW processors rely on?
A. Compile time analysis
B. Initial time analysis
C. Final time analysis
D. Mid time analysis
ANSWER: A
Memory system performance is largely captured by?
A. Latency
B. Bandwidth
C. Both a and b
D. none of above
ANSWER: C
The fraction of data references satisfied by the cache is called?
A. Cache hit ratio
B. Cache fit ratio
C. Cache best ratio
D. none of above
ANSWER: A
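For context, with hit ratio h the average memory access time is commonly modeled as t_avg = h * t_cache + (1 - h) * t_memory; e.g. h = 0.9, t_cache = 1 ns and t_memory = 100 ns (illustrative numbers) give t_avg = 0.9 * 1 + 0.1 * 100 = 10.9 ns.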
A single control unit that dispatches the same instruction to
various processors is?
A. SIMD
B. SPMD
C. MIMD
D. None of above
ANSWER: A
The primary forms of data exchange between parallel tasks are?
A. Accessing a shared data space
B. Exchanging messages.
C. Both A and B
D. None of Above
ANSWER: C
Switches map a fixed number of inputs to outputs?
A. True
B. False
ANSWER: A
The First step in developing a parallel algorithm is?
A. To Decompose the problem into tasks that can be executed
concurrently
B. Execute directly
C. Execute indirectly
D. None of Above
ANSWER: A
The number of tasks into which a problem is decomposed determines
its?
A. Granularity
B. Priority
C. Modernity
D. None of above
ANSWER: A
The length of the longest path in a task dependency graph is called?
A. the critical path length
B. the critical data length
C. the critical bit length
D. None of above
ANSWER: A
The graph of tasks (nodes) and their interactions/data exchange
(edges)?
A. Is referred to as a task interaction graph
B. Is referred to as a task Communication graph
C. Is referred to as a task interface graph
D. None of Above
ANSWER: A
Mappings are determined by?
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
ANSWER: C
Decomposition Techniques are?
A. recursive decomposition
B. data decomposition
C. exploratory decomposition
D. All of Above
ANSWER: D
The Owner Computes Rule generally states that the process assigned a
particular data item is responsible for?
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
ANSWER: A
A simple application of exploratory decomposition is?
A. The solution to a 15 puzzle
B. The solution to 20 puzzle
C. The solution to any puzzle
D. None of Above
ANSWER: A
Speculative Decomposition consist of ?
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
ANSWER: C
Task characteristics include?
A. Task generation.
B. Task sizes.
C. Size of data associated with tasks.
D. All of Above.
ANSWER: D
Group communication operations are built using point-to-point
messaging primitives?
A. True
B. False
ANSWER: A
Communicating a message of size m over an uncongested network takes
time ts + tmw?
A. True
B. False
ANSWER: A
The dual of one-to-all broadcast is?
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
ANSWER: A
A hypercube has?
A. 2^d nodes
B. 3d nodes
C. 2n Nodes
D. N Nodes
ANSWER: A
A binary tree in which processors are (logically) at the leaves and
internal nodes are routing nodes?
A. True
B. False
ANSWER: A
In All-to-All Broadcast each processor is the source as well as
destination?
A. True
B. False
ANSWER: A
The Prefix Sum Operation can be implemented using the?
A. All-to-all broadcast kernel.
B. All-to-one broadcast kernel.
C. One-to-all broadcast Kernel
D. Scatter Kernel
ANSWER: A
In the scatter operation?
A. Single node send a unique message of size m to every other node
B. Single node send a same message of size m to every other node
C. Single node send a unique message of size m to next node
D. None of Above
ANSWER: A
The gather operation is exactly the inverse of the?
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
ANSWER: A
In All-to-All Personalized Communication Each node has a distinct
message of size m for every other node?
A. True
B. False
ANSWER: A
The computer system of a parallel computer is capable of?
A. Decentralized computing
B. Parallel computing
C. Centralized computing
D. Distributed computing
ANSWER: A
Writing parallel programs is referred to as?
A. Parallel computation
B. Parallel processes
C. Parallel development
D. Parallel programming
ANSWER: D
Three-tier architecture simplifies application ____________?
A. Maintenance
B. Initiation
C. Implementation
D. Deployment
ANSWER: D
A dynamic network of networks, i.e., a dynamic connection that grows,
is called?
A. Multithreading
B. Cyber cycle
C. Internet of things
D. Cyber-physical system
ANSWER: C
In which application systems can distributed systems run well?
A. HPC
B. HTC
C. HRC
D. Both A and B
ANSWER: D
Which property do HPC and HTC systems desire?
A. Adaptivity
B. Transparency
C. Dependency
D. Secretive
ANSWER: B
The architecture in which no special machines manage the network and
resources are shared is known as?
A. Peer-to-Peer
B. Space based
C. Tightly coupled
D. Loosely coupled
ANSWER: A
Distributed systems have significant characteristics of how many types?
A. 5 types
B. 2 types
C. 3 types
D. 4 types
ANSWER: C
Peer machines are built over?
A. Many Server machines
B. 1 Server machine
C. 1 Client machine
D. Many Client machines
ANSWER: D
Types of HTC applications are?
A. Business
B. Engineering
C. Science
D. Media mass
ANSWER: A
Virtualization that creates a single address space architecture is
called?
A. Loosely coupled
B. Peer-to-Peer
C. Space-based
D. Tightly coupled
ANSWER: C
In cloud computing, we have an internet cloud of resources that form?
A. Centralized computing
B. Decentralized computing
C. Parallel computing
D. All of these
ANSWER: D
Data access and storage are elements of job throughput, a measure
of __________?
A. Flexibility
B. Adaptation
C. Efficiency
D. Dependability
ANSWER: C
The ability to support billions of job requests over massive data
sets is known as?
A. Efficiency
B. Dependability
C. Adaptation
D. Flexibility
ANSWER: C
Cloud computing offers a broader concept of which of the
following?
A. Parallel computing
B. Centralized computing
C. Utility computing
D. Decentralized computing
ANSWER: C
The transparency that allows movement of resources and clients within
a system is called?
A. Mobility transparency
B. Concurrency transparency
C. Performance transparency
D. Replication transparency
ANSWER: A
A distributed program running in a distributed computer is known as?
A. Distributed process
B. Distributed program
C. Distributed application
D. Distributed computing
ANSWER: B
Computing with uniprocessor devices is called __________?
A. Grid computing
B. Centralized computing
C. Parallel computing
D. Distributed computing
ANSWER: B
Utility computing focuses on a______________ model?
A. Data
B. Cloud
C. Scalable
D. Business
ANSWER: D
A CPS merges __________ technologies?
A. 5C
B. 2C
C. 3C
D. 4C
ANSWER: C
Abbreviation HPC stands for?
A. High-peak computing
B. High-peripheral computing
C. High-performance computing
D. Highly-parallel computing
ANSWER: C
Peer-to-Peer leads to the development of technologies like?
A. Norming grids
B. Data grids
C. Computational grids
D. Both A and B
ANSWER: D
Types of HPC applications include?
A. Management
B. Media mass
C. Business
D. Science
ANSWER: D
Computer technology has gone through how many development generations?
A. 6
B. 3
C. 4
D. 5
ANSWER: D
The utilization rate of resources in an execution model is known as
its?
A. Adaptation
B. Efficiency
C. Dependability
D. Flexibility
ANSWER: B
Providing Quality of Service (QoS) assurance even under failure
conditions is the responsibility of?
A. Dependability
B. Adaptation
C. Flexibility
D. Efficiency
ANSWER: A
Interprocessor communication takes place via?
A. Centralized memory
B. Shared memory
C. Message passing
D. Both A and B
ANSWER: D
Data centers and centralized computing cover many __________?
A. Microcomputers
B. Minicomputers
C. Mainframe computers
D. Supercomputers
ANSWER: D
Which of the following is a primary goal of the HTC
paradigm ___________?
A. High ratio Identification
B. Low-flux computing
C. High-flux computing
D. Computer utilities
ANSWER: C
The high-throughput service provided is measured by?
A. Flexibility
B. Efficiency
C. Dependability
D. Adaptation
ANSWER: D
What are the sources of overhead?
A. Essential /Excess Computation
B. Inter-process Communication
C. Idling
D. All above
ANSWER: D
Which are the performance metrics for parallel systems?
A. Execution Time
B. Total Parallel Overhead
C. Speedup
D. All above
ANSWER: D
The efficiency of a parallel program can be written as E = Ts /
(p * Tp). True or False?
A. True
B. False
ANSWER: A
The important feature of the VLIW is ______?
A. ILP
B. Performance
C. Cost effectiveness
D. delay
ANSWER: A
---------------------------------------------------------------------------------------------------------------------
SET 8 (MCQs)
---------------------------------------------------------------------------------------------------------------------
1. Any condition that causes a processor to stall is called as _____.
A. Hazard
B. Page fault
C. System error
D. None of the above
ANSWER: A
2. The time lost due to branch instruction is often referred to as _____.
A. Latency
B. Delay
C. Branch penalty
D. None of the above
ANSWER: C
3. _____ method is used in centralized systems to perform out of order execution.
A. Scorecard
B. Score boarding
C. Optimizing
D. Redundancy
ANSWER: B
4. The computer cluster architecture emerged as an alternative for ____.
A. ISA
B. Workstation
C. Supercomputers
D. Distributed systems
ANSWER: C
5. An NVIDIA CUDA warp is made up of how many threads?
A. 512
B. 1024
C. 312
D. 32
ANSWER: D
6. Out-of-order execution of instructions is not possible on GPUs.
A. TRUE
B. FALSE
ANSWER: B
7. CUDA supports programming in ....
A. C or C++ only
B. Java, Python, and more
C. C, C++, third party wrappers for Java, Python, and more
D. Pascal
ANSWER: C
8. FADD, FMAD, FMIN, FMAX are ----- supported by Scalar Processors of NVIDIA GPU.
A. 32-bit IEEE floating point instructions
B. 32-bit integer instructions
C. both
D. none of the above
ANSWER: A
9. Each streaming multiprocessor (SM) of CUDA hardware has ------ scalar processors (SP).
A. 1024
B. 128
C. 512
D. 8
ANSWER: D
13. Limitations of the CUDA kernel are:
A. recursion, call stack, static variable declaration
B. No recursion, no call stack, no static variable declarations
C. recursion, no call stack, static variable declaration
D. No recursion, call stack, no static variable declaration
ANSWER: B
16. The CUDA architecture consists of --------- for parallel computing kernels and functions.
A. RISC instruction set architecture
B. CISC instruction set architecture
C. ZISC instruction set architecture
D. PTX instruction set architecture
ANSWER: D
17. CUDA stands for --------, designed by NVIDIA.
A. Common Union Discrete Architecture
B. Complex Unidentified Device Architecture
C. Compute Unified Device Architecture
D. Complex Unstructured Distributed Architecture
ANSWER: C
18. The host processor spawns multithread tasks (or kernels as they are known in CUDA) onto the GPU device. State true or false.
A. TRUE
B. FALSE
ANSWER: A
19. The NVIDIA G80 is a ---- CUDA core device, the NVIDIA G200 is a ---- CUDA core device, and the NVIDIA Fermi is a ---- CUDA core device.
A. 128, 256, 512
B. 32, 64, 128
C. 64, 128, 256
D. 256, 512, 1024
ANSWER: A
20. NVIDIA 8-series GPUs offer -------- .
A. 50-200 GFLOPS
B. 200-400 GFLOPS
C. 400-800 GFLOPS
D. 800-1000 GFLOPS
ANSWER: A
21. IADD, IMUL24, IMAD24, IMIN, IMAX are ----------- supported by Scalar Processors of NVIDIA GPU.
A. 32-bit IEEE floating point instructions
B. 32-bit integer instructions
C. both
D. none of the above
ANSWER: B
22. The CUDA hardware programming model supports: a) fully generally data-parallel architecture; b) general thread launch; c) global load-store; d) parallel data cache; e) scalar architecture; f) integers, bit operations.
A. a, c, d, f
B. b, c, d, e
C. a, d, e, f
D. a, b, c, d, e, f
ANSWER: D
24. What is the equivalent of the general C program int main(void) { printf("Hello, World!\n"); return 0; } in CUDA C?
A. int main( void ) { kernel<<<1,1>>>(); printf("Hello, World!\n"); return 0; }
B. __global__ void kernel( void ) { } int main( void ) { kernel<<<1,1>>>(); printf("Hello, World!\n"); return 0; }
C. __global__ void kernel( void ) { kernel<<<1,1>>>(); printf("Hello, World!\n"); return 0; }
D. __global__ int main( void ) { kernel<<<1,1>>>(); printf("Hello, World!\n"); return 0; }
ANSWER: B
28. If variable a is a host variable and dev_a is a device (GPU) variable, to copy input from variable a to variable dev_a select the correct statement:
A. memcpy( dev_a, &a, size );
B. cudaMemcpy( dev_a, &a, size, cudaMemcpyHostToDevice );
C. memcpy( (void*) dev_a, &a, size );
D. cudaMemcpy( (void*) &dev_a, &a, size, cudaMemcpyDeviceToHost );
ANSWER: B
29. The triple angle brackets mark in a statement inside the main function: what does it indicate?
A. a call from host code to device code
B. a call from device code to host code
C. less than comparison
D. greater than comparison
ANSWER: A
30. What makes a CUDA code run in parallel?
A. __global__ indicates parallel execution of code
B. the main() function indicates parallel execution of code
C. the kernel name outside the triple angle brackets indicates execution of the kernel N times in parallel
D. the first parameter value inside the triple angle brackets (N) indicates execution of the kernel N times in parallel
ANSWER: D
34. In sorting networks, for an INCREASING COMPARATOR with input x, y, select the correct output X', Y' from the following options:
A. X' = min{ x, y } and Y' = min{ x, y }
B. X' = max{ x, y } and Y' = min{ x, y }
C. X' = min{ x, y } and Y' = max{ x, y }
D. X' = max{ x, y } and Y' = max{ x, y }
ANSWER: C
36. In sorting networks, for a DECREASING COMPARATOR with input x, y, select the correct output X', Y' from the following options:
A. X' = min{ x, y } and Y' = min{ x, y }
B. X' = max{ x, y } and Y' = min{ x, y }
C. X' = min{ x, y } and Y' = max{ x, y }
D. X' = max{ x, y } and Y' = max{ x, y }
ANSWER: B
38. Which of the following is TRUE for a Bitonic Sequence? a) Monotonically increasing b) Monotonically decreasing c) With cyclic shift of indices d) First increasing then decreasing
A. a) and b)
B. a) and b) and d)
C. a) and b) and c)
D. a) and b) and c) and d)
ANSWER: D
40. Which of the following is NOT a BITONIC sequence?
A. {8, 6, 4, 2, 3, 5, 7, 9}
B. {0, 4, 8, 9, 2, 1}
C. {3, 5, 7, 9, 8, 6, 4, 2}
D. {1, 2, 4, 7, 6, 0, 1}
ANSWER: D
42. The procedure of sorting a bitonic sequence using bitonic splits is called
A. Bitonic Merge
B. Bitonic Split
C. Bitonic Divide
D. Bitonic Series
ANSWER: A
44. While mapping bitonic sort on a hypercube, compare-exchange operations take place between wires whose labels differ in
A. One bit
B. Two bits
C. Three bits
D. Four bits
ANSWER: A
46. Which of the following is NOT a way of mapping the input wires of the bitonic sorting network to a MESH of processes?
A. Row-major mapping
B. Column-major mapping
C. Row-major snakelike mapping
D. Row-major shuffled mapping
ANSWER: B
48. Which sorting algorithm is given by the steps below?
1. procedure X_SORT(n)
2. begin
3. for i := n - 1 downto 1 do
4. for j := 1 to i do
5. compare-exchange(a_j, a_j+1);
6. end X_SORT
A. Selection Sort
B. Bubble Sort
C. Parallel Selection Sort
D. Parallel Bubble Sort
ANSWER: B
50. The odd-even transposition algorithm sorts n elements in n phases (n is even), each of which requires ------------ compare-exchange operations.
A. 2n
B. n^2
C. n/2
D. n
ANSWER: C
54. Which is the fastest sorting algorithm?
A. Bubble Sort
B. Odd-Even Transposition Sort
C. Shell Sort
D. Quick Sort
ANSWER: D
56. Quicksort's performance is greatly affected by the way it partitions a sequence.
A. TRUE
B. FALSE
ANSWER: A
58. The pivot in quick sort can be selected as
A. Always the first element
B. Always the last element
C. Always the middle index element
D. A randomly selected element
ANSWER: D
60. Quick sort uses recursive decomposition.
A. TRUE
B. FALSE
ANSWER: A
62. In the first step of parallelizing quick sort for n elements to get subarrays, which of the following statements is TRUE?
A. Only one process is used
B. n processes are used
C. Two processes are used
D. None of the above
ANSWER: A
64. In the binary tree representation created by the execution of quick sort, the pivot is at
A. a leaf node
B. the root of the tree
C. any internal node
D. None of the above
ANSWER: B
66. What is the worst case time complexity of a quick sort algorithm?
A. O(N)
B. O(N log N)
C. O(N^2)
D. O(log N)
ANSWER: C
68. What is the average running time of a quick sort algorithm?
A. O(N)
B. O(N log N)
C. O(N^2)
D. O(log N)
ANSWER: B
70. Odd-even transposition sort is a variation of
A. Quick Sort
B. Shell Sort
C. Bubble Sort
D. Selection Sort
ANSWER: C
84. Given an array of n elements and p processes, in the message-passing version of the parallel quicksort, each process stores --------- elements of the array.
A. n*p
B. n-p
C. p/n
D. n/p
ANSWER: D
86. In parallel quick sort, the pivot selection strategy is crucial for
A. Maintaining load balance
B. Maintaining uniform distribution of elements in process groups
C. Effective pivot selection in the next level
D. all of the above
ANSWER: D
90. Which parallel formulation of quick sort is possible?
A. Shared-Address-Space Parallel Formulation
B. Message Passing Formulation
C. Hypercube Formulation
D. All of the above
ANSWER: D
92. Which formulation of Dijkstra's algorithm exploits more parallelism?
A. source-partitioned formulation
B. source-parallel formulation
C. partitioned-parallel formulation
D. All of above
ANSWER: B
94. In Dijkstra's all-pairs shortest path, each process computes the single-source shortest paths for all vertices assigned to it in the SOURCE PARTITIONED FORMULATION.
A. TRUE
B. FALSE
ANSWER: A
96. A complete graph is a graph in which each pair of vertices is adjacent.
A. TRUE
B. FALSE
ANSWER: A
98. The space required to store the adjacency matrix of a graph with n vertices is
A. in order of n
B. in order of n log n
C. in order of n squared
D. in order of n/2
ANSWER: C
100. A graph can be represented by
A. Identity Matrix
B. Adjacency Matrix
C. Sparse list
D. Sparse matrix
ANSWER: B
102. To solve the all-pairs shortest paths problem, which algorithm(s) is/are used? a) Floyd's algorithm b) Dijkstra's single-source shortest paths c) Prim's algorithm d) Kruskal's algorithm
A. a) and c)
B. a) and b)
C. b) and c)
D. c) and d)
ANSWER: B
106. Best-first search (BFS) algorithms can search both graphs and trees.
A. TRUE
B. FALSE
ANSWER: A
108. The A* algorithm is a
A. BFS algorithm
B. DFS algorithm
C. Prim's algorithm
D. Kruskal's algorithm
ANSWER: A
110. Identify load-balancing scheme(s):
A. Asynchronous Round Robin
B. Global Round Robin
C. Random Polling
D. All above methods
ANSWER: D
112. An important component of best-first search (BFS) algorithms is the
A. Open List
B. Closed List
C. Node List
D. Mode List
ANSWER: A
114. A CUDA program is comprised of two primary components: a host and a _____.
A. GPU kernel
B. CPU kernel
C. OS
D. none of above
ANSWER: A
115. The kernel code is identified by the ________ qualifier with void return type.
A. _host_
B. __global__
C. _device_
D. void
ANSWER: B
116. The kernel code is only callable by the host.
A. TRUE
B. FALSE
ANSWER: A
117. The kernel code is executable on the device and host.
A. TRUE
B. FALSE
ANSWER: B
118. Calling a kernel is typically referred to as _________.
A. kernel thread
B. kernel initialization
C. kernel termination
D. kernel invocation
ANSWER: D
119. Host codes in a CUDA application can initialize a device.
A. TRUE
B. FALSE
ANSWER: A
131. In CUDA, a single invoked kernel is referred to as a _____.
A. block
B. thread
C. grid
D. none of above
ANSWER: C
132. A grid is comprised of ________ of threads.
A. blocks
B. bunch
C. host
D. none of above
ANSWER: A
133. A block is comprised of multiple _______.
A. threads
B. bunch
C. host
D. none of above
ANSWER: A
134. A solution of the problem in representing the parallelism in an algorithm is
A. CUD
B. PTA
C. CDA
D. CUDA
ANSWER: D
135. ______ is callable from the host.
A. _host_
B. __global__
C. _device_
D. none of above
ANSWER: B
136. ______ is callable from the host.
A. _host_
B. __global__
C. _device_
D. none of above
ANSWER: A
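Tying questions 24, 28 and 29 above together, a minimal compilable CUDA C sketch of the hello-world pattern plus an illustrative host-to-device copy (assuming nvcc; variable names follow question 28):

#include <stdio.h>

__global__ void kernel(void) { }   /* empty device kernel, as in question 24 */

int main(void) {
    int a = 42, *dev_a;
    cudaMalloc((void **)&dev_a, sizeof(int));
    /* Question 28: copy the host variable a into the device variable dev_a. */
    cudaMemcpy(dev_a, &a, sizeof(int), cudaMemcpyHostToDevice);
    /* Question 29: <<<...>>> marks a call from host code to device code. */
    kernel<<<1, 1>>>();
    cudaDeviceSynchronize();
    printf("Hello, World!\n");
    cudaFree(dev_a);
    return 0;
}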
___________________________________________________________________________________
Date: 23/07/2020
4. All-to-all personalized communication is performed independently in
each row with clustered messages of size _______ on a mesh.
A. m
B. p
C. m√p
D. p√m
ANSWER: C
5. In All-to-All Personalized Communication on a Ring, the size of the
message reduces by ______ at each step.
A. m
B. p
C. m-1
D. p-1
ANSWER: A
i) all-to-one reduction in each row
ii) one-to-all broadcast of each vector element among the n processes of each column
iii) one-to-one communication to align the vector along the main diagonal
11. Parallel run time in rowwise 1-D partitioning of matrix-vector
multiplication, where p = n, is ____.
A. Θ(1)
B. Θ(n log n)
C. Θ(n^2)
D. Θ(n)
ANSWER: D
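For context (the standard analysis behind answer D, stated informally): with rowwise 1-D partitioning and p = n, each process first takes part in an all-to-all broadcast of the n vector elements (roughly ts log n + tw(n - 1) on a hypercube), then performs n multiply-adds on its own row, so the parallel time is Θ(n).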
7. Cache memory works on the principle of
A. Locality of data
B. Locality of memory
C. Locality of reference
D. Locality of reference & memory
ANSWER: C
31. Multithreading allows multiple threads to share the functional units of a
A. Multiple processor
B. Single processor
C. Dual core
D. Core i5
ANSWER: B
32. Allowing multiple instructions to issue in a clock cycle is the goal of
A. Single-issue processors
B. Dual-issue processors
C. Multiple-issue processors
D. No-issue processors
ANSWER: C
4. Cost analysis on a ring is
A. (ts + twm)(p - 1)
B. (ts - twm)(p + 1)
C. (tw + tsm)(p - 1)
D. (tw - tsm)(p + 1)
ANSWER: A
5. Cost analysis on a mesh is
A. 2ts(sqrt(p) + 1) + twm(p - 1)
B. 2tw(sqrt(p) + 1) + tsm(p - 1)
C. 2tw(sqrt(p) - 1) + tsm(p - 1)
D. 2ts(sqrt(p) - 1) + twm(p - 1)
ANSWER: D
17. The dual of the scatter operation is the
A. concatenation
B. gather operation
C. Both
D. None
ANSWER: C
10. The DNS algorithm of matrix multiplication uses
A. 1-D partitioning
B. 2-D partitioning
C. 3-D partitioning
D. both A and B
ANSWER: C
14. A parallel algorithm is evaluated by its runtime as a function of
A. the input size
B. the number of processors
C. the communication parameters
D. all
ANSWER: D
18. In CUDA, a single invoked kernel is referred to as a _____.
A. block
B. thread
C. grid
D. none of above
ANSWER: C
19. A grid is comprised of ________ of threads.
A. blocks
B. bunch
C. host
D. none of above
ANSWER: A
20. A block is comprised of multiple _______.
A. threads
B. bunch
C. host
D. none of above
ANSWER: A
21. A solution of the problem in representing the parallelism in an algorithm is
A. CUD
B. PTA
C. CDA
D. CUDA
ANSWER: D
22. ______ is callable from the host.
A. _host_
B. __global__
C. _device_
D. none of above
ANSWER: B
23. ______ is callable from the host.
A. _host_
B. __global__
C. _device_
D. none of above
ANSWER: A
24. A CUDA program is comprised of two primary components: a host and a _____.
A. GPU kernel
B. CPU kernel
C. OS
D. none of above
ANSWER: A
25. The kernel code is identified by the ________ qualifier with void return type.
A. _host_
B. __global__
C. _device_
D. void
ANSWER: B
1. Select the correct statement about one-to-all broadcast.
A. In one to all broadcast initially there will be P (Number of processors) copies of messages and
B. In one to all broadcast initially there will be a single copy of message and after broadcast finally
Answer: B
2. If total 8 nodes are in ring topology after one to all message broadcasting how many
3. message broadcast
A. nearest node
B. longest node
Answer: longest node
4. In All-to-one reduction, after reduction the final copy of message is available on which
node?
A. Source Node
B. Destination Node
D. None of these
Answer: Destination Node
5. If there is a 4 by 4 mesh topology network present (as shown in the video) then in how
16
6. If there are 8 nodes in a ring topology how many message passing cycles will be
7. In One to all broadcast using Hypercube topology how does the source node select the next
destination node?
8. If there are 8 nodes connected in ring topology then ___ number of message passing
9. Consider all to all broadcast in ring topology with 8 nodes. How many messages will be
10. If there are 16 messages in a 4x4 mesh then total how many message passing cycles
11. If there are P messages in an m x m mesh then total how many message passing cycles
2√P - 2
2√P - 1
2√P
Answer: 2√P - 2
12. How many message passing cycles are required for all-to-all broadcasting in an 8-node
hypercube?
13. In scatter operation, after message broadcasting every node avails the same message
copy.
True
False
14. CUDA helps to execute code in parallel mode using __________
CPU
GPU
ROM
Cache memory
Answer: GPU
15. In thread-function execution scenario a thread is a ___________
Work
Worker
Task
Answer: Worker
16. In GPU, which of the following statements are true? (Select multiple choices if applicable)
Answer: "Grid contains Block", "Block contains Threads", "SM stands for Streaming MultiProcessor"
17. Following issue(s) is/are true about sorting techniques with parallel computing.
Answer: "Where to store output sequence is the issue", "Where to store input sequence is the issue"
18. Partitioning of a series is done after ______
Local arrangement
Process assignments
Global arrangement
Answer: Global arrangement
19. In Parallel DFS, processes have the following roles. (Select multiple choices if applicable)
Donor
Active
Idle
Recipient
Answer: "Donor", "Recipient"
20.
15
Answer: 15
21. Which are different sources of overheads in parallel programs? (Select multiple choices if applicable)
Interprocess interactions
Process Idling
Excess Computation
22. Speedup (S) is….
The ratio of the time taken to solve a problem on a parallel processor to the time required to
solve the same problem on a single processor with p identical processing elements
The ratio of the time taken to solve a problem on a single processor to the time required to solve
the same problem on a parallel computer with p identical processing elements
Answer: The ratio of the time taken to solve a problem on a single processor to the time required
to solve the same problem on a parallel computer with p identical processing elements
23. Efficiency is a measure of the fraction of time for which a processing element is
usefully employed.
TRUE
FALSE
Answer: TRUE
Address
Contents
Both a and b
none
Ans:
Both a and b
Bus
Peripheral connection wires
Both a and b
internal wires
Ans:
Bus
Processor
Memory System
Data path
All of the above
Ans:
All of the above
True
False
Ans:
False
6. The access time of memory is …………… the time required for performing any single CPU operation.
longer than
shorter than
negligible than
same as
Ans:
longer than
Latency
bandwidth
both a and b
none of above
Ans:
both a and b
9. A processor performing fetch or decoding of different instruction during the execution of another
instruction is called __ .
Super-scaling
Pipe-lining
Parallel Computation
none of above
Ans:
Pipe-lining
10. For a given FINITE number of instructions to be executed, which architecture of the processor
provides for a faster execution ?
ISA
ANSA
Super-scalar
All of the above
Ans:
Super-scalar
True
false
Ans:
True
12. High Performance Computing of the Computer System tasks are done by
Node Cluster
Network Cluster
Beowulf Cluster
Stratified Cluster
Ans:
Beowulf Cluster
13. Octa Core Processors are the processors of the computer system that contains
2 Processors
4 Processors
6 Processors
8 Processors
Ans:
8 Processors
sequential
unique
simultaneous
None of above
Ans:
simultaneous
Serialization
Parallelism
Serial processing
Distribution
Ans:
Parallelism
Mandatory Instructions/sec
Millions of Instructions/sec
Most of Instructions/sec
Many Instructions / sec
Ans:
Millions of Instructions/sec
19. Which MIMD systems are best scalable with respect to the number of processors
CISC
RISC
ISA
IANA
Ans:
RISC
RISC
CISC
ISA
IANA
Ans:
RISC
23. The computer architecture aimed at reducing the time of execution of instructions is __.
RISC
CISC
ISA
IANA
Ans:
RISC
processor memory
primary memory
secondary memory
All of above
Ans:
All of above
28. A single control unit that dispatches the same Instruction to various processors is__
SIMD
SPMD
MIMD
none of above
Ans:
SIMD
29. The primary forms of data exchange between parallel tasks are_
True
False
Ans:
True
----------------------------------------------------------------------------------------------------------------------------- --
UNIT 2
-------------------------------------------------------------------------------------------------------------------------------
Granularity
Priority
Modernity
None of Above
Ans:
Granularity
Is referred to as a task interaction graph
Is referred to as a task Communication graph
Is referred to as a task interface graph
None of Above
Ans:
Is referred to as a task interaction graph
task dependency
task interaction graphs
Both A and B
None of Above
Ans:
Both A and B
recursive decomposition
data decomposition
exploratory decomposition
speculative decomposition
All of above
Ans:
All of above
7. The Owner Computes rule generally states that the process assigned a particular data item is
responsible for _
conservative approaches
optimistic approaches
Both A and B
only B
Ans:
Both A and B
Task generation.
Task sizes.
Size of data associated with tasks.
All of above
Ans:
All of above
11. What is a high performance multi-core processor that can be used to accelerate a wide variety of
applications using parallel computing.
CLU
GPU
CPU
DSP
Ans:
GPU
32 Thread
32 Block
Unit Block
Thread Block
Ans:
Thread Block
Centralized memory
Shared memory
Message passing
Both A and B
Ans:
Both A and B
True
False
Ans:
False
Task generation.
Task sizes
Size of data associated with tasks
Overhead
both A and B
Ans:
both A and B
17. The fetch and execution cycles are interleaved with the help of __
18. The processor of system which can read /write GPU memory is known as
kernal
device
Server
Host
Ans:
Host
19. Increasing the granularity of decomposition and utilizing the resulting concurrency to perform more
tasks in parallel decreases performance.
TRUE
FALSE
Ans:
FALSE
TRUE
FALSE
Ans:
FALSE
TRUE
FALSE
Ans:
TRUE
-------------------------------------------------------------------------------------------------------------------------------
UNIT 3
----------------------------------------------------------------------------------------------------------------------------- --
Gather-scatter operations
Gather operations
Scatter operations
Gather-scatter technique
Ans:
Gather-scatter operations
Ans:
Unique message from each node
Yes
No
Ans:
No
Yes
No
Ans:
No
Total Exchange
Personal Message
Scatter
Gather
Ans:
Total Exchange
Message size
Number of nodes
Same
None of above
Ans:
Message size
Inverse
Reverse
Multiple
Same
Ans:
Inverse
Scatter operation
Broadcast operation
Prefix Sum
Reduction operation
Ans:
Scatter operation
TRUE
FALSE
Ans:
TRUE
11. A binary tree in which processors are (logically) at the leaves and internal nodes are routing nodes.
TRUE
FALSE
Ans:
TRUE
12. Group communication operations are built using point-to-point messaging primitives
TRUE
FALSE
Ans:
TRUE
13. Communicating a message of size m over an uncongested network takes time ts + twm
True
False
Ans:
True
14. Parallel programs: Which speedup could be achieved according to Amdahl's law for infinite number
of processors if 5% of a program is sequential and the remaining part is ideally parallel?
Infinite speedup
5
20
None of above
Ans:
20
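Worked out: with sequential fraction f = 0.05, Amdahl's law gives S = 1 / (f + (1 - f)/p), and as p grows without bound S approaches 1/f = 1/0.05 = 20.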
Invalid Counter
Valid Counter
Ring
Undefined
Ans:
Ring
2 Registers
4 Registers
6 Registers
8 Registers
Ans:
8 Registers
0
5
10
8
Ans:
10
18. The height of a binary tree is the maximum number of edges in any root to leaf path. The maximum
number of nodes in a binary tree of height h is?
2^h - 1
2^(h-1) - 1
2^(h+1) - 1
2 * (h+1)
Ans:
2^(h+1) - 1
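This follows by summing the levels: a tree of height h has at most 2^i nodes at level i, and 2^0 + 2^1 + ... + 2^h = 2^(h+1) - 1.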
2^d nodes
2d nodes
2n Nodes
N Nodes
Ans:
2^d nodes
Scatter Kernel
Ans:
All-to-all broadcast kernel
22. In All-to-All Personalized Communication Each node has a distinct message of size m for every other
node
True
False
Ans:
True
23. A binary tree in which processors are (logically) at the leaves and internal nodes are
routing nodes.
True
False
Ans:
True
True
False
Ans:
True
----------------------------------------------------------------------------------------------------------------------------- --
UNIT 4
-------------------------------------------------------------------------------------------------------------------------------
1. Mathematically, efficiency is
e=s/p
e=p/s
e*s=p/2
e=p+e/e
Ans:
e=s/p
work
processor time
both
none
Ans:
both
increase
constant
decreases
none
Ans:
constant
increase
constant
decreases
none
Ans:
decreases
5. Speedup obtained when the problem size is ______ linearly with the number of processing elements.
increase
constant
decreases
depend on problem size
Ans:
increase
6. The n × n matrix is partitioned among n processors, with each processor storing a complete ____ of the
matrix.
row
column
both
depend on processor
Ans:
row
1
n
logn
complex
Ans:
1
8. The n × n matrix is partitioned among n^2 processors such that each processor owns a ______ element
n
2n
single
double
Ans:
single
9. how many basic communication operations are used in matrix vector multiplication
1
2
3
4
Ans:
3
1d partition
2d partition
3d partition
both a,b
Ans:
3d partition
normalization
communication
elimination
all
Ans:
all
12. the cost of the parallel algorithm is higher than the sequential run time by a factor of __
3/2
2/3
3*2
2/3+3/2
Ans:
3/2
13. The load imbalance problem in Parallel Gaussian Elimination: can be alleviated by using a __
mapping
acyclic
cyclic
both
none
Ans:
acyclic
15. For a problem consisting of W units of work, p__W processors can be used optimally
<=
>=
<
>
Ans:
<=
16. C(W)__Θ(W) for optimality (necessary condition).
>
<
<=
equals
Ans:
equals
well defined
zig-zac
reverse
straight
Ans:
well defined
performance
communication
algorithm
all
Ans:
performance
development effort
software quality
both
none
Ans:
development effort
point-to-point
one-to-all
all-to-one
none
Ans:
point-to-point
21. one processor has a piece of data and it need to send to everyone is
one -to-all
all-to-one
point -to-point
all of above
Ans:
one -to-all
22. simplest way to send p-1 messages from source to the other p-1 processors
Algorithm
communication
concurrency
receiver
Ans:
concurrency
1
2
8
0
Ans:
0
24. The processors compute __ product of the vector element and the local matrix
local
global
both
none
Ans:
local
recursive doubling
simple algorithm
both
none
Ans:
recursive doubling
recursive order
straight order
vertical order
parallel order
Ans:
recursive order
27. if “X” is the message to broadcast it initially resides at the source node
1
2
8
0
Ans:
0
XOR
AND
both
none
Ans:
both
p+1
p+2
p-1
Ans:
p-1
30. Each node first sends to one of its neighbours the data it needs to ____
broadcast
identify
verify
none
Ans:
broadcast
All-to-all
one -to-all
all-to-one
point-to-point
Ans:
All-to-all
√p
p
p+1
p-1
Ans:
√p
Algorithm
hypercube
both
none
Ans:
Algorithm
error
contention
recursion
none
Ans:
contention
30. In the scatter operation, __ node sends a message to every other node
single
double
triple
none
Ans:
single
scatter operation
recursion operation
execution
none
Ans:
scatter operation
reverse order
parallel order
straight order
vertical order
Ans:
reverse order
----------------------------------------------------------------------------------------------------------------------------- --
UNIT 5
----------------------------------------------------------------------------------------------------------------------------- --
1. In _, the number of elements to be sorted is small enough to fit into the process’s main memory.
internal sorting
internal searching
external sorting
external searching
Ans:
internal sorting
2. __ algorithms use auxiliary storage (such as tapes and hard disks) for sorting because the number of
elements to be sorted is too large to fit into memory.
internal sorting
internal searching
External sorting
external searching
Ans:
External sorting
searching
Sorting
both a and b
none of above
Ans:
Sorting
compare-exchange
searching
Sorting
swapping
Ans:
compare-exchange
TRUE
FALSE
Ans:
TRUE
TRUE
FALSE
Ans:
TRUE
7. Quicksort is one of the most common sorting algorithms for sequential computers because of its
simplicity, low overhead, and optimal average complexity.
TRUE
FALSE
Ans:
TRUE
non-pivote
pivot
center element
len of array
Ans:
pivot
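For reference, a compact C implementation of the recursive quicksort these questions describe; the pivot here is the last element, one of the choices mentioned above, and each recursive half is an independent task (the recursive decomposition exploited by parallel formulations):

#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partition around the last element as pivot; return the pivot's final index. */
static int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) swap(&a[i++], &a[j]);
    swap(&a[i], &a[hi]);
    return i;
}

static void quicksort(int a[], int lo, int hi) {
    if (lo < hi) {
        int p = partition(a, lo, hi);
        quicksort(a, lo, p - 1);   /* the two halves could be sorted in parallel */
        quicksort(a, p + 1, hi);
    }
}

int main(void) {
    int a[] = {5, 2, 9, 1, 7, 3};
    quicksort(a, 0, 5);
    for (int i = 0; i < 6; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}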
TRUE
FALSE
Ans:
TRUE
10. DFS begins by expanding the initial node and generating its successors. In each subsequent step, DFS
expands one of the most recently generated nodes.
TRUE
FALSE
Ans:
TRUE
BFS
DFS
a and b
none of above
Ans:
DFS
BFS
DFS
a and b
none of above
Ans:
BFS
13. If the heuristic is admissible, the BFS finds the optimal solution.
TRUE
FALSE
Ans:
TRUE
14. The search overhead factor of the parallel system is defined as the ratio of the work done by the
parallel formulation to that done by the sequential formulation
TRUE
FALSE
Ans:
TRUE
15. The critical issue in parallel depth-first search algorithms is the distribution of the search space among
the processors.
TRUE
FALSE
Ans:
TRUE
16. Graph search involves a closed list, where the major operation is a _
sorting
searching
lookup
none of above
Ans:
lookup
17. Breadth First Search is equivalent to which of the traversal in the Binary Trees?
Pre-order Traversal
Post-order Traversal
Level-order Traversal
In-order Traversal
Ans:
Level-order Traversal
18. Time Complexity of Breadth First Search is? (V – number of vertices, E – number of edges)
O(V + E)
O(V)
O(E)
O(V*E)
Ans:
O(V + E)
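A minimal BFS sketch in C showing the level-order traversal behind these questions; an adjacency matrix is used for brevity, which costs O(V^2) — the O(V + E) bound above assumes adjacency lists. The graph is illustrative:

#include <stdio.h>

#define V 5

int main(void) {
    int adj[V][V] = {               /* small undirected example graph */
        {0,1,1,0,0},
        {1,0,0,1,0},
        {1,0,0,1,0},
        {0,1,1,0,1},
        {0,0,0,1,0}
    };
    int queue[V], head = 0, tail = 0, visited[V] = {0};

    visited[0] = 1;
    queue[tail++] = 0;              /* start from vertex 0 */
    while (head < tail) {           /* each vertex is enqueued at most once */
        int u = queue[head++];
        printf("%d ", u);
        for (int v = 0; v < V; v++)
            if (adj[u][v] && !visited[v]) { visited[v] = 1; queue[tail++] = v; }
    }
    printf("\n");
    return 0;
}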
In BFS, how many times is a node visited?
Once
Twice
Equivalent to number of indegree of the node
Thrice
Ans:
Equivalent to number of indegree of the node
TRUE
FALSE
Ans:
TRUE
22. The critical issue in parallel depth-first search algorithms is the distribution of the search space among
the processors.
TRUE
FALSE
Ans:
TRUE
23. Graph search involves a closed list, where the major operation is a _
sorting
searching
lookup
none of above
Ans:
lookup
24. Which of the following is not a stable sorting algorithm in its typical implementation.
Insertion Sort
Merge Sort
Quick Sort
Bubble Sort
Ans:
Quick Sort
25. Which of the following is not true about comparison based sorting algorithms?
The minimum possible time complexity of a comparison based sorting algorithm is O(nLogn) for a
random input array
Any comparison based sorting algorithm can be made stable by using position as a criteria when two
elements are compared
Counting Sort is not a comparison based sorting algortihm
Heap Sort is not a comparison based sorting algorithm.
Ans:
Heap Sort is not a comparison based sorting algorithm.
------------------------------------------------------------------------------------------------------- ------------------------
UNIT 6
----------------------------------------------------------------------------------------------------------------------------- --
GPU kernel
CPU kernel
OS
none of above
Ans:
GPU kernel
2. The kernel code is identified by the ________ qualifier with void return type
_host_
__global__
_device_
void
Ans:
__global__
TRUE
FALSE
Ans:
TRUE
TRUE
FALSE
Ans:
FALSE
kernel thread
kernel initialization
kernel termination
kernel invocation
Ans:
kernel invocation
TRUE
FALSE
Ans:
TRUE
TRUE
FALSE
Ans:
TRUE
8. Host codes in a CUDA application can not Invoke kernels
TRUE
FALSE
Ans:
FALSE
TRUE
FALSE
Ans:
TRUE
10. the BlockPerGrid and ThreadPerBlock parameters are related to the __ model supported by CUDA.
host
kernel
thread abstraction
none of above
Ans:
thread abstraction
_host_
__global__
_device_
none of above
Ans:
_device_
__global__
_device_
none of above
Ans:
__global__
13. CUDA supports __ in which code in a single thread is executed by all other threads.
thread division
thread termination
thread abstraction
none of above
Ans:
thread abstraction
block
tread
grid
none of above
Ans:
grid
block
bunch
host
none of above
Ans:
block
TRUE
FALSE
Ans:
FALSE
threads
bunch
host
none of above
Ans:
threads
CUD
PTA
CDA
CUDA
Ans:
CUDA
19. Host codes in a CUDA application can Transfer data to and from the device
TRUE
FALSE
Ans:
TRUE
20. Host codes in a CUDA application can not Deallocate memory on the GPU
TRUE
FALSE
Ans:
FALSE
----------------------------------------------------------------------------------------------------------------------------- --
1. Moore's Law
2. Minsky's conjecture
3. Flynn's Law
4. Amdahl's Law
ANSWER
Amdahl's Law
1. INTR
2. RST 7.5
3. RST 6.5
4. TRAP
ANSWER
-------------
1. Adaptivity
2. Transparency
3. Dependency
4. Secretive
ANSWER
Transparency
Question 4 : When every cache hierarchy level is a subset of the level further away from the
processor
1. Synchronous
2. Atomic synschronous
3. Distrubutors
4. Multilevel inclusion
ANSWER
Multilevel inclusion
1. Serialization
2. cloud computing
3. Distribution
4. Parallelism
ANSWER
Parallelism
Question 6 : The problem where process concurrency becomes an issue is called as ___________
1. Reader-write problem
2. Bankers problem
3. Bakery problem
4. Philosophers problem
ANSWER
Reader-write problem
1. Centralized memory
2. Message passing
3. shared memory
4. cache memory
ANSWER
shared memory
1. 1
2. 2
3. 0
4. 3
ANSWER
0
1. bit based
2. bit level
3. increasing
4. instructional
ANSWER
instructional
Question 10 : MPI_Comm_size
ANSWER
Returns number of processes
Question 11 : High performance computing tasks of the computer system are done by
1. node clusters
2. network clusters
3. Beowulf clusters
4. compute nodes
ANSWER
compute nodes
Question 12 : MPI_Comm_rank
1. returns rank
2. returns processes
3. returns value
4. Returns value of instruction
ANSWER
returns rank
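A minimal sketch in plain C tying together the MPI calls in questions 10 and 12 (assuming mpicc; the output varies with the launch size):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);                   /* initialize the MPI environment */
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* returns rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);     /* returns number of processes */
    printf("process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}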
Question 13 : A processor performing fetch or decoding of different instruction during the
execution of another instruction is called ______ .
1. Super-scaling
2. Pipe-lining
3. Parallel Computation
4. distributed
ANSWER
Pipe-lining
1. page fault
2. system error
3. Hazard
4. execuation error
ANSWER
Hazard
ANSWER
one word instruction
ANSWER
It is costly
Question 17 : A microprogram sequencer
ANSWER
-------------
Question 18 : The___ time collectively spent by all the processing elements Tall = p TP
1. total
2. Average
3. mean
4. sum
ANSWER
total
ANSWER
-------------
Question 20 : The average number of steps taken to execute the set of instructions can be made
to be less than one by following _______ .
1. Sequentional
2. super-scaling
3. pipe-lining
4. ISA
ANSWER
super-scaling
Question 21 : The main difference between the VLIW and the other approaches to improve
performance is ___________
1. increase in performance
2. Lack of complex hardware design
3. Cost effectiveness
4. latency
ANSWER
Lack of complex hardware design
ANSWER
complex
1. lower
2. upper
3. left
4. right
ANSWER
upper
Question 24 : Virtualization that creates one single address space architecture that of, is called
1. Loosely coupled
2. Space based
3. Tightly coupled
4. peer-to-peer
ANSWER
Space based
Question 25 : MPI_Init
1. Close MPI environment
2. Initialize MPI environment
3. start programing
4. Call processes
ANSWER
Initialize MPI environment
Question 26 : Content of the program counter is added to the address part of the instruction in
order to obtain the effective address is called
ANSWER
-------------
Question 27 : The straight-forward model used for the memory consistency, is called
1. Sequential consistency
2. Random consistency
3. Remote node
4. Host node
ANSWER
-------------
Question 28 : Which MIMD systems are best scalable with respect to the number of processors
1. Distributed memory
2. ccNUMA
3. nccNUMA
4. Symmetric multiprocessor
ANSWER
Distributed memory
1. Uniprocessor Computer
2. Computer
3. Processor
4. System
ANSWER
-------------
Question 30 : The___ time collectively spent by all the processing elements Tall = p TP
1. total
2. sum
3. average
4. product
ANSWER
total
1. Source register
2. Memory
3. Data
4. Destination register
ANSWER
Destination register
Question 32 : The situation wherein the data of operands are not available is called ______
1. stall
2. Deadlock
3. data hazard
4. structural hazard
ANSWER
data hazard
1. Mass Media
2. Business
3. Management
4. Science
ANSWER
Science
1. intraprocessor communication
2. intraprocess and intraprocessor communication
3. interprocess and interprocessor communication
4. interprocessor communication
ANSWER
-------------
1. Serial computation
2. Excess computation
3. serial computation
4. parallel computing
ANSWER
-------------
1. ILP
2. Performance
3. Cost effectiveness
4. delay
ANSWER
ILP
Question 37 : The tightly coupled set of threads executing on a single task is called
1. Multithreading
2. Parallel processing
3. Recurrence
4. Serial processing
ANSWER
Multithreading
ANSWER
Data parallel model
1. reverse message
2. receive message
3. forward message
4. Collect message
ANSWER
receive message
1. Binary bit
2. Flag bit
3. Signed bit
4. Unsigned bit
ANSWER
-------------
Question 41 : For inter processor communication the miss arises are called
1. hit rate
2. coherence misses
3. commit misses
4. parallel processing
ANSWER
coherence misses
Question 42 : The interconnection topologies are implemented using _________ as a node.
1. control unit
2. microprocessor
3. processing unit
4. microprocessor or processing unit
ANSWER
-------------
Question 43 : _________ gives the theoretical speedup in latency of the execution of a task at
fixed execution time
1. Amdahl's law
2. Moore's law
3. Metcalfe's law
4. Gustafson's law
ANSWER
Gustafson's law
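For reference, Gustafson's law in its usual form (a standard statement, with the serial fraction of the parallel run written as α): the scaled speedup at fixed execution time on p processors is

S(p) = p - α (p - 1)

while Amdahl's law, which fixes the problem size instead, gives S(p) = 1 / (α + (1 - α)/p).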
Question 44 : The number and size of tasks into which a problem is decomposed determines the
1. fine-granularity
2. coarse-granularity
3. sub Task
4. granularity
ANSWER
granularity
ANSWER
Stops the MPI environment
Question 46 : Private data is data that is used by a
1. Single processor
2. Multi processor
3. Single tasking
4. Multi tasking
ANSWER
Single processor
Question 47 : The time lost due to the branch instruction is often referred to as ____________
1. Delay
2. Branch penalty
3. Latency
4. control hazard
ANSWER
Branch penalty
1. cache
2. shared memory
3. message passing
4. distributed memory
ANSWER
distributed memory
ANSWER
Sequentional algorithm development
1. Global scheduling
2. Local Scheduling
3. post scheduling
4. pre scheduling
ANSWER
Global scheduling
ANSWER
In the data stream
1. CISC
2. RISC
3. ISA
4. IANA
ANSWER
RISC
1. Unsign Char
2. Sign character
3. Long Char
4. unsign long char
ANSWER
Sign character
Question 54 : To increase the speed of memory access in pipelining, we make use of _______
3. Cache
4. Buffer
ANSWER
Cache
Question 55 : If the value V(x) of the target operand is contained in the address field itself, the
addressing mode is
1. Immediate
2. Direct
3. Indirect
4. Implied
ANSWER
Immediate
1. must be same
2. may overlap
3. must be disjoint
4. must be the same as that of host
ANSWER
-------------
1. critical
2. easy
3. difficult
4. ambiguous
ANSWER
critical
1. collect message
2. transfer message
3. send message
4. receive message
ANSWER
send message
1. Instruction set
2. Arithmetic logical unit
3. Processor/memory interface
4. Control unit
ANSWER
Arithmetic logical unit
Question 60 : An interface between the user or an application program, and the system resources
is
1. Microprocessor
2. Microcontroller
3. Multimicroprocessor
4. operating system
ANSWER
operating system
Question 61 : The computer architecture aimed at reducing the time of execution of instructions
is ________.
1. CISC
2. RISC
3. SPARC
4. ISA
ANSWER
RISC
1. Decentalized computing
2. Parallel computing
3. Distributed computing
4. centralized computing
ANSWER
1. parallel
2. pipeline
3. serial
4. distributed
ANSWER
pipeline
Question 64 : The instructions which copy information from one location to another either in the
processor’s internal register set or in the external main memory are called
ANSWER
data transfer instructions
Question 65 : The pattern of___________ among tasks is captured by what is known as a task-
interaction graph
1. interaction
2. communication
3. optimization
4. flow
ANSWER
interaction
Question 66 : In vector processor a single instruction, can ask for ____________ data operations
1. multiple
2. single
3. two
4. four
ANSWER
multiple
1. switching complexity
2. circuit complexity
3. Time Complexity
4. space complexity
ANSWER
Time Complexity
ANSWER
Maximize Data Locality
1. Excess Computation
2. serial computation
3. Parallel Computing
4. cluster computation
ANSWER
Excess Computation
1. Packet
2. Ring
3. Static
4. Switching
ANSWER
Switching
Question 71 : The contention for the usage of a hardware device is called ______
1. data hazard
2. Stall
3. Deadlock
4. structural hazard
ANSWER
structural hazard
1. Small Algorithm
2. Hash Algorithm
3. Merge-Sort Algorithm
4. Quick-Sort Algorithm
ANSWER
Merge-Sort Algorithm
1. Full operation
2. Limited operation
3. reduction operation
4. selected operation
ANSWER
reduction operation
Question 74 : The stalling of the processor due to the unavailability of the instructions is called
as ___________
1. Input hazard
2. data hazard
3. structural hazard
4. control hazard
ANSWER
control hazard
Question 75 : _____processors rely on compile time analysis to identify and bundle together
instructions that can be executed concurrently
1. VILW
2. LVIW
3. VLIW
4. VLWI
ANSWER
VLIW
1. Task
2. Instruction
3. Data
4. Program
ANSWER
Task
1. BHU
2. IITB
3. IITKG
4. IITM
ANSWER
BHU
1. Parallel computation
2. parallel development
3. parallel programing
4. Parallel processing
ANSWER
Parallel computation
1. Super-scaling
2. Pipe-lining
3. Parallel computation
4. serial computation
ANSWER
Pipe-lining
1. RISC architecture
2. CISC architecture
3. Von-Neuman architecture
4. Stack-organized architecture
ANSWER
-------------
Question 81 : An interface between the user or an application program, and the system resources
are
1. microprocessor
2. microcontroller
3. multi-microprocessor
4. operating system
ANSWER
operating system
1. greater throughput
2. enhanced fault tolerance
3. greater throughput and enhanced fault tolerance
4. zero throughput
ANSWER
-------------
1. cache
2. shared memory
3. message passing
4. distributed memory
ANSWER
shared memory
Question 84 : To which class of systems does the von Neumann computer belong
1. SIMD
2. MIMD
3. MISD
4. SISD
ANSWER
SISD
ANSWER
Variable format instruction
1. dot product
2. cross product
3. multiply
4. add
ANSWER
dot product
1. program loops
2. Serial program
3. parallel program
4. long programs
ANSWER
parallel program
Question 88 : What is the execution time per stage of a pipeline that has 5 equal stages and a
mean overhead of 12 cycles
1. 2 cycles
2. 3 cycles
3. 5 cycles
4. 4 cycles
ANSWER
3 cycles
ANSWER
the greedy algorithm never considers the same solution again
Question 90 : If n is a power of two, we can perform this operation in ____ steps by propagating
partial sums up a logical binary tree of processors.
1. log n
2. n log n
3. n
4. n^2
ANSWER
log n
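A serial C sketch of this tree-based summation (the array contents are illustrative): at sweep s, elements 2^s apart are combined, so log2 n sweeps suffice; with one processor per element, each sweep would be a single parallel step.

#include <stdio.h>

int main(void) {
    int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    int n = 8;                                    /* n is a power of two */
    for (int stride = 1; stride < n; stride *= 2) /* log2(n) sweeps */
        for (int i = 0; i + stride < n; i += 2 * stride)
            a[i] += a[i + stride];                /* partial sums move up the tree */
    printf("sum = %d\n", a[0]);                   /* prints 36 */
    return 0;
}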
1. SISD
2. SIMD
3. MIMD
4. MISD
ANSWER
MIMD
Question 92 : Tree networks suffer from a communication bottleneck at higher levels of the tree.
This network, also called a _________ tree.
1. fat
2. binary
3. order static
4. heap tree
ANSWER
fat
1. Multiprogramming
2. Multithreading
3. Multitasking
4. Synchronization
ANSWER
Multiprogramming
Question 94 : Each of the clock cycle from the previous section of execution becomes
1. Previous stage
2. stall
3. previous cycle
4. pipe stage
ANSWER
pipe stage
1. greater throughput
2. enhanced fault tolerance
3. greater throughput and enhanced fault tolerance
4. none of the mentioned
ANSWER
-------------
1. stall
2. write operand
3. Read operand
4. Branching
ANSWER
Read operand
1. Task or processes
2. Task and Execution
3. Processor and Instruction
4. Instruction and decode
ANSWER
Task or processes
1. runtime
2. clock time
3. processor time
4. clock frequency
ANSWER
runtime
Question 99 : Computing with uniprocessor devices is called__________.
1. Grid computing
2. Centralized computing
3. Parallel computing
4. Distributed computing
ANSWER
Centralized computing
Question 100 : The tightly coupled set of threads executing on a single task is called
1. Serial processing
2. parallel processing
3. Multithreading
4. Recurrent
ANSWER
Multithreading
ANSWER
write after read
1. Small Task
2. Large Task
3. Full program
4. group of program
ANSWER
Small Task
1. S = Ts / Tp
2. S = Tp / Ts
3. Ts = S / Tp
4. Tp = S / Ts
ANSWER
S = Ts / Tp
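A worked instance with hypothetical numbers: if the serial run time is Ts = 100 s and the parallel run time on p = 8 processors is Tp = 25 s, then

S = Ts / Tp = 100 / 25 = 4
E = S / p = 4 / 8 = 0.5

which matches the efficiency formula E = Ts / (p Tp) used elsewhere in this bank.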
Question 104 : A processor that continuously tries to acquire the locks, spinning around a loop
till it reaches its success, is known as
1. Spin locks
2. Store locks
3. Link locks
4. Store operational
ANSWER
Spin locks
1. Instruction execution
2. Instruction prefetch
3. Instruction manipulation
4. instruction decoding
ANSWER
Instruction prefetch
Question 106 : Parallel computing means to divide the job into several __________
1. Bit
2. Data
3. Instruction
4. Task
ANSWER
Task
Question 107 : If a piece of data is repeatedly used, the effective latency of this memory system
can be reduced by the __________.
1. RAM
2. ROM
3. Cache
4. HDD
ANSWER
Cache
1. Parallel processing
2. Distributed processing
3. Uni- processing
4. Multi-processing
ANSWER
Multi-processing
Question 109 : The buffer in the instruction execution sequence that holds instruction results is known as
1. Data buffer
2. control buffer
3. reorder buffer
4. ordered buffer
ANSWER
reorder buffer
ANSWER
-------------
1. prefetching
2. pipelining
3. processor-printer communication
4. memory-monitor communication
ANSWER
pipelining
Question 112 : _________ is a method for inducing concurrency in problems that can be solved
using the divide-and-conquer strategy.
1. exploratory decomposition
2. speculative decomposition
3. data-decomposition
4. Recursive decomposition
ANSWER
Recursive decomposition
Question 113 : If no node having a copy of a cache block, this technique is known as
ANSWER
-------------
----------------------------------------------------------------------------------------------------------------------------- --
A. Decentralized computing
B. Parallel computing
C. Centralized computing
D. Decentralized computing
E. Distributed computing
F. All of these
G. None of these
Ans :
A
2: Writing parallel programs is referred to as
A. Parallel computation
B. Parallel processes
C. Parallel development
D. Parallel programming
E. Parallel computation
F. All of these
G. None of these
Ans :
D
A. Multithreading
B. Cyber cycle
C. Internet of things
D. Cyber-physical system
E. All of these
F. None of these
Ans :
C
A. HPC
D. HTC
C. HRC
D. Both A and B
E. All of these
F. None of these
Ans :
D
A. Adaptivity
B. Transparency
C. Dependency
D. Secretive
E. Adaptivity
F. All of these
G. None of these
Ans :
B
7: The architecture in which no special machines manage the network of resources is known as
A. Peer-to-Peer
B. Space based
C. Tightly coupled
D. Loosely coupled
E. All of these
F. None of these
Ans :
A
8: Distributed systems have significant characteristics of
A. 5 types
B. 2 types
C. 3 types
D. 4 types
E. All of these
F. None of these
Ans :
C
9: Peer machines are built over
Ans :
D
10: HTC applications are of type
A. Business
B. Engineering
C. Science
D. Media mass
E. All of these
F. None of these
Ans :
A
11: Virtualization that creates one single address space architecture that of, is called
A. Loosely coupled
B. Peer-to-Peer
C. Space-based
D. Tightly coupled
E. Loosely coupled
F. All of these
G. None of these
Ans :
C
12: In cloud computing, we have an internet cloud of resources that form
A. Centralized computing
B. Decentralized computing
C. Parallel computing
D. Both A and B
E. All of these
F. None of these
Ans :
E
13: Job throughput, data access and storage are elements of __________.
A. Flexibility
B. Adaptation
C. Efficiency
D. Dependability
E. All of these
F. None of these
Ans :
C
14: The ability to support billions of job requests over massive data sets is known as
A. Efficiency
B. Dependability
C. Adaptation
D. Flexibility
E. All of these
F. None of these
Ans :
C
15: Cloud computing offers a broader concept than which of the following?
A. Parallel computing
B. Centralized computing
C. Utility computing
D. Decentralized computing
E. Parallel computing
F. All of these
G. None of these
Ans :
C
16: The transparency that allows movement of resources and clients within a system is called
A. Mobility transparency
B. Concurrency transparency
C. Performance transparency
D. Replication transparency
E. All of these
F. None of these
Ans :
A
17: A program running on a distributed computer is known as
A. Distributed process
B. Distributed program
C. Distributed application
D. Distributed computing
E. All of these
F. None of these
Ans :
B
18: Computing with uniprocessor devices is called__________.
A. Grid computing
B. Centralized computing
C. Parallel computing
D. Distributed computing
E. All of these
F. None of these
Ans :
B
19: Utility computing focuses on a______________ model.
A. Data
B. Cloud
C. Scalable
D. Business
E. All of these
F. None of these
Ans :
D
20: A CPS merges which technologies?
A. 5C
B. 2C
C. 3C
D. 4C
E. All of these
F. None of these
Ans :
C
21: Abbreviation of HPC
A. High-peak computing
B. High-peripheral computing
C. High-performance computing
D. Highly-parallel computing
E. All of these
F. None of these
Ans :
C
A. Norming grids
B. Data grids
C. Computational grids
D. Both A and B
E. All of these
F. None of these
Ans :
D
23: HPC applications are of type
A. Management
B. Media mass
C. Business
D. Science
E. All of these
F.None of these
Ans :
D
24: Computer technology has gone through ______ development generations.
A. 6
B. 3
C. 4
D. 5
E. All of these
F. None of these
Ans :
D
A. Adaptation
B. Efficiency
C. Dependability
D. Flexibility
E. All of these
F. None of these
Ans :
B
26: Even under failure conditions Providing Quality of Service (QoS) assurance is the responsibility of
A. Dependability
B. Adaptation
C. Flexibility
D. Efficiency
E. All of these
F. None of these
Ans :
A
27: Interprocessor communication takes place via
A. Centralized memory
B. Shared memory
C. Message passing
D. Both A and B
E. All of these
F. None of these
Ans :
D
A. Microcomputers
B. Minicomputers
C. Mainframe computers
D. Supercomputers
E. All of these
F. None of these
Ans :
D
29: Which of the following is a primary goal of the HTC paradigm___________.
Ans :
C
30: The high-throughput service provided is measured by
A. Flexibility
B. Efficiency
C. Adaptation
D. Dependability
E. All of these
F. None of these
Ans :
C
----------------------------------------------------------------------------------------------------------------------------- -
A modem is very helpful to link up two computers with the help of?
(A). telephone line
(B). dedicated line
(C). All of these
(D). None of these
Ans : (C)
A whole micro-computer system consists of which of the following?
(A). microprocessor
(B). memory
(C). peripheral equipment
(D). all of these
(E). None of these
Ans : (D).
Which of the following program is a micro-program written in 0 and 1?
(A). binary micro-program
(B). binary microinstruction
(C). symbolic microinstruction
(D). Symbolic microinstruction
(E). None of these
Ans : A
A pipeline is similar to which of the following?
(A). a gas line
(B). house pipeline
(C). both a and b
(D). an automobile assembly line
(E). None of these
Ans : D
A processor performing fetching or decoding of instructions during the execution of another
instruction is commonly known as?
(A). Super-scaling
(B). Parallel Computation
(C). Pipe-lining
(D). None of these
Ans : C
An optimizing compiler performs which of the following?
(A). Better compilation of the given code.
(B). better memory management.
(C). Takes the benefit of processor type and decreases its process time.
(D). Both a and c
(E). None of these
Ans : C
Which of the following wires is a collection of lines that connects several devices?
(A). internal wires
(B). peripheral connection wires
(C). Both a and b
(D). bus
(E). None of these
Ans : (D).
Which of the following is an instruction to give a small delay in the program?
(A). NOP
(B). LDA
(C). BEA
(D). None of these
Ans : A
How to define a peripheral?
(A). any physical device connected to the computer
(B). tape drive connected to a computer
(C). any drives installed in the computer
(D). None of these
Ans : A
----------------------------------------------------------------------------------------------------------------------------- --
UNIT ONE SUB : 410241 HPC
7. Cache memory works on the principle of
a. Locality of data
b. Locality of memory
c. Locality of reference
d. Locality of reference & memory
Ans: c
8. SIMD represents an organization that ______________.
a. refers to a computer system capable of processing several programs at the same time
b. represents organization of a single computer containing a control unit, processor unit and a memory unit
c. includes many processing units under the supervision of a common control unit
d. none of the above
Ans: c
9. A processor performing fetch or decoding of different instruction during the execution of another instruction is called ______.
a. Super-scaling
b. Pipe-lining
c. Parallel Computation
d. None of these
Ans: b
10. General MIMD configuration is usually called
a. a multiprocessor
b. a vector processor
c. an array processor
d. none of the above
Ans: a
11. A Von Neumann computer uses which one of the following?
a. SISD
b. SIMD
c. MISD
d. MIMD
Ans: a
12. MIMD stands for
a. Multiple instruction multiple data
b. Multiple instruction memory data
c. Memory instruction multiple data
d. Multiple information memory data
Ans: a
13. MIPS stands for
a. Memory Instruction Per Second
b. Major Instruction Per Second
c. Main Information Per Second
d. Million Instruction Per Second
Ans: d
14. M.J. Flynn's parallel processing classification is based on
a. Multiple Instructions
b. Multiple data
c. Both (a) and (b)
d. None of the above
Ans: c
26. In a three-cube structure, node 101 cannot communicate directly with node
a. 1
b. 11
c. 100
d. 111
Ans: b
27. Which method is used as an alternative way of the snooping-based coherence protocol?
a. Directory protocol
b. Memory protocol
c. Compiler based protocol
d. None of above
Ans: a
28. Snoopy cache protocol is used in a -----------------based system
a. bus
b. mesh
c. star
d. hypercube
Ans: a
31. Multithreading allows multiple threads to share the functional units of a
a. Multiple processor
b. Single processor
c. Dual core
d. Core i5
Ans: b
32. Allowing multiple instructions to issue in a clock cycle is the goal of
a. Single-issue processors
b. Dual-issue processors
c. Multiple-issue processors
d. No-issue processors
Ans: c
33. OpenGL stands for
a. Open General Liability
b. Open Graphics Library
c. Open Guide Line
d. Open Graphics Layer
Ans: b
4. Cost Analysis on a ring is
a. (ts + tw·m)(p - 1)
b. (ts - tw·m)(p + 1)
c. (tw + ts·m)(p - 1)
d. (tw - ts·m)(p + 1)
Ans: a
5. Cost Analysis on a mesh is
a. 2ts(√p + 1) + tw·m(p - 1)
b. 2tw(√p + 1) + ts·m(p - 1)
c. 2tw(√p - 1) + ts·m(p - 1)
d. 2ts(√p - 1) + tw·m(p - 1)
Ans: d
17. The dual of the scatter operation is the
a. concatenation
b. gather operation
c. Both
d. None
Ans: c
18. In Scatter Operation on Hypercube, on each step, the size of the messages communicated is ____
a. tripled
b. halved
c. doubled
d. no change
Ans: b
19. Which is also called "Total Exchange"?
a. All-to-all broadcast
b. All-to-all personalized communication
c. All-to-one reduction
d. None
Ans: b
20. All-to-all personalized communication can be used in ____
a. Fourier transform
b. matrix transpose
c. sample sort
d. all of the above
Ans: d
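A sketch of where the two cost expressions above come from, assuming the rows refer to all-to-all broadcast under the usual cost model (startup time ts, per-word time tw, message of m words): on a ring, each of the p - 1 steps transfers one m-word message, giving

T = (ts + tw m)(p - 1)

while on a √p × √p mesh a row phase followed by a column phase incurs 2(√p - 1) startups in total, with the transferred data still summing to m(p - 1) words, giving

T = 2 ts (√p - 1) + tw m (p - 1)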
7. Cost-optimal parallel systems have an efficiency of ___
a. 1
b. n
c. log n
d. complex
Ans: a
8. The n × n matrix is partitioned among n² processors such that each processor owns a _____ element.
a. n
b. 2n
c. single
d. double
Ans: c
9. How many basic communication operations are used in matrix vector multiplication?
a. 1
b. 2
c. 3
d. 4
Ans: c
10. In the DNS algorithm of matrix multiplication it uses
a. 1d partition
b. 2d partition
c. 3d partition
d. both a, b
Ans: c
11. In the Pipelined Execution, steps contain
a. normalization
b. communication
c. elimination
d. all
Ans: d
12. The cost of the parallel algorithm is higher than the sequential run time by a factor of __
a. 3/2
b. 2/3
c. 3*2
d. 2/3 + 3/2
Ans: a
13. The load imbalance problem in Parallel Gaussian Elimination can be alleviated by using a ____ mapping
a. acyclic
b. cyclic
c. both
d. none
Ans: b
14. A parallel algorithm is evaluated by its runtime in function of
a. the input size
b. the number of processors
c. the communication parameters
d. all
Ans: d
25. In an eight node ring, node ____ is the source of broadcast
a. 1
b. 2
c. 8
d. 0
Ans: d
26. The processors compute the ______ product of the vector element and the local matrix
a. local
b. global
c. both
d. none
Ans: a
27. One-to-all broadcast uses
a. recursive doubling
b. simple algorithm
c. both
d. none
Ans: a
18. In CUDA, a single invoked kernel is referred to as a _____.
a. block
b. thread
c. grid
d. none of above
Ans: c
19. A grid is comprised of ________ of threads.
a. blocks
b. bunch
c. host
d. none of above
Ans: a
20. A block is comprised of multiple _______.
a. threads
b. bunch
c. host
d. none of above
Ans: a
21. A solution of the problem in representing the parallelism in an algorithm is
a. CUD
b. PTA
c. CDA
d. CUDA
Ans: d
22. ______ is callable from the host.
a. _host_
b. __global__
c. _device_
d. none of above
Ans: b
23. ______ is callable from the host.
a. _host_
b. __global__
c. _device_
d. none of above
Ans: a
24. A CUDA program is comprised of two primary components: a host and a _____.
a. GPU kernel
b. CPU kernel
c. OS
d. none of above
Ans: a
25. The kernel code is identified by the ________ qualifier with void return type.
a. _host_
b. __global__
c. _device_
d. void
Ans: b
26. Host codes in a CUDA application can not Reset a device.
a. TRUE
b. FALSE
Ans: b
27. Host codes in a CUDA application can not Invoke kernels.
a. TRUE
b. FALSE
Ans: b
29. Calling a kernel is typically referred to as _________.
a. kernel thread
b. kernel initialization
c. kernel termination
d. kernel invocation
Ans: d
36. The BlockPerGrid and ThreadPerBlock parameters are related to the ________ model supported by CUDA.
a. host
b. kernel
c. thread abstraction
d. none of above
Ans: c
37. Host codes in a CUDA application can Transfer data to and from the device.
a. TRUE
b. FALSE
Ans: a
38. Host codes in a CUDA application can not Deallocate memory on the GPU.
a. TRUE
b. FALSE
Ans: b
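A minimal CUDA sketch pulling these rows together (the vector-add kernel and the sizes are illustrative): __global__ marks kernel code with void return type, the host invokes it, and the values inside the triple angle brackets set the blocks-per-grid and threads-per-block of the launched grid.

#include <stdio.h>
#include <cuda_runtime.h>

__global__ void add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x; /* global thread index */
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1024;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float)); /* unified memory for brevity */
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }
    int threadsPerBlock = 256;
    int blocksPerGrid = (n + threadsPerBlock - 1) / threadsPerBlock;
    add<<<blocksPerGrid, threadsPerBlock>>>(a, b, c, n); /* kernel invocation: one grid */
    cudaDeviceSynchronize();     /* host waits for the device to finish */
    printf("c[0] = %f\n", c[0]); /* prints 3.000000 */
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}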
UNIT FIVE SUB : 410241 HPC
In sorting networks for an INCREASING COMPARATOR with input x, y, select the correct output X', Y' from the following options
a. X' = min { x , y } and Y' = min { x , y }
b. X' = max { x , y } and Y' = min { x , y }
c. X' = min { x , y } and Y' = max { x , y }
d. X' = max { x , y } and Y' = max { x , y }
Ans: c
In sorting networks for a DECREASING COMPARATOR with input x, y, select the correct output X', Y' from the following options
a. X' = min { x , y } and Y' = min { x , y }
b. X' = max { x , y } and Y' = min { x , y }
c. X' = min { x , y } and Y' = max { x , y }
d. X' = max { x , y } and Y' = max { x , y }
Ans: b
In the first step of parallelizing quick sort for n elements to get subarrays, which of the following statements is true?
a. Only one process is used
b. n processes are used
c. Two processes are used
d. None of the above
Ans: a
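A direct C rendering of the two comparators (a sketch; values are swapped in place through pointers): the increasing comparator emits (min, max) and the decreasing comparator emits (max, min).

/* Increasing comparator: X' = min{x,y}, Y' = max{x,y} */
void compare_exchange_inc(int *x, int *y) {
    if (*x > *y) { int t = *x; *x = *y; *y = t; }
}

/* Decreasing comparator: X' = max{x,y}, Y' = min{x,y} */
void compare_exchange_dec(int *x, int *y) {
    if (*x < *y) { int t = *x; *x = *y; *y = t; }
}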
What is the worst case time complexity of a quick sort algorithm?
a. O(N)
b. O(N log N)
c. O(N²)
d. O(log N)
Ans: c
What is the average running time of a quick sort algorithm?
a. O(N)
b. O(N log N)
c. O(N²)
d. O(log N)
Ans: b
Odd-even transposition sort is a variation of
a. Quick Sort
b. Shell Sort
c. Bubble Sort
d. Selection Sort
Ans: c
What is the average case time complexity of odd-even transposition sort?
a. O(N log N)
b. O(N)
c. O(log N)
d. O(N²)
Ans: d
Shell sort is an improvement on
a. Quick Sort
b. Bubble Sort
c. Insertion sort
d. Selection Sort
Ans: c
In parallel Quick Sort the pivot is sent to processes by
a. Broadcast
b. Multicast
c. Selective Multicast
d. Unicast
Ans: a
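A serial C sketch of odd-even transposition sort (the function name is illustrative): n alternating phases of compare-exchanges on disjoint pairs; because the pairs within a phase are independent, each phase can run fully in parallel, which is why the method is viewed as the parallel variant of bubble sort.

void odd_even_sort(int a[], int n) {
    for (int phase = 0; phase < n; phase++)        /* n phases suffice */
        for (int i = phase % 2; i + 1 < n; i += 2) /* even or odd pairs */
            if (a[i] > a[i + 1]) {
                int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
            }
}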
A graph can be represented by
a. Identity Matrix
b. Adjacency Matrix
c. Sparse list
d. Sparse matrix
Ans: b
7. CUDA supports programming in ....
a. C or C++ only
b. Java, Python, and more
c. C, C++, third party wrappers for Java, Python, and more
d. Pascal
Ans: c
10. Each NVIDIA GPU has ------ Streaming Multiprocessors
a. 8
b. 1024
c. 512
d. 16
Ans: d
20. NVIDIA 8-series GPUs offer -------- .
a. 50-200 GFLOPS
b. 200-400 GFLOPS
c. 400-800 GFLOPS
d. 800-1000 GFLOPS
Ans: a
30. What makes a CUDA code run in parallel?
a. __global__ indicates parallel execution of code
b. main() function indicates parallel execution of code
c. Kernel name outside the triple angle brackets indicates execution of the kernel N times in parallel
d. First parameter value inside the triple angle brackets (N) indicates execution of the kernel N times in parallel
Ans: d
ZEAL EDUCATION SOCIETY’S
ZEAL COLLEGE OF ENGINEERING AND RESEARCH
NARHE │PUNE -41 │ INDIA
DEPARTMENT OF COMPUTER ENGINEERING
UNIT-1
1) Conventional architectures coarsely comprise of a______
a) processor
b) Memory system
c) Datapath.
d) All of Above
Ans: d
Explanation:
2) Data intensive applications utilize______
Ans: a
Explanation:
3) A pipeline is like_____
Ans: a
Explanation:
4) Scheduling of instructions is determined ____
a) True Data Dependency
b) Resource Dependency
c) Branch Dependency
d) All of above
Ans: d
Explanation:
5) VLIW processors rely on______
a) Compile time analysis
b) Initial time analysis
c) Final time analysis
d) Mid time analysis
Ans: a
Explanation:
6) Memory system performance is largely captured by______
a) Latency
b) Bandwidth
c) Both a and b
d) none of above
Ans: c
Explanation:
7) The fraction of data references satisfied by the cache is called_____
Ans: a
Explanation:
8) A single control unit that dispatches the same Instruction to various
processors is__
a) SIMD
b) SPMD
c) MIMD
d) None of above
Ans: a
Explanation:
9) The primary forms of data exchange between parallel tasks are_
Ans: a
Explanation:
11) The stage in which the CPU fetches the instructions from the instruction cache in
superscalar organization is
a) Prefetch stage
b) D1 (first decode) stage
c) D2 (second decode) stage
d) Final stage
Ans: a
Explanation: In the prefetch stage of pipeline, the CPU fetches the instructions from the instruction
cache, which stores the instructions to be executed. In this stage, CPU also aligns the
codes appropriately.
12) The CPU decodes the instructions and generates control words in
a) Prefetch stage
b) D1 (first decode) stage
c) D2 (second decode) stage
d) Final stage
Ans: b
Explanation: In the D1 stage, the CPU decodes the instructions and generates control words. For simple
RISC instructions, only a single control word is enough for starting the execution.
13) The fifth stage of pipeline is also known as
a) read back stage
b) read forward stage
c) write back stage
d) none of the mentioned
Ans: c
Explanation: The fifth stage or final stage of pipeline is also known as “Write back (WB) stage”.
14) In the execution stage the function performed is
a) CPU accesses data cache
b) executes arithmetic/logic computations
c) executes floating point operations in execution unit
d) all of the mentioned
Ans: d
Explanation: In the execution stage, known as E-stage, the CPU accesses data cache, executes
arithmetic/logic computations, and floating point operations in execution unit.
15) The stage in which the CPU generates an address for data memory references in this
stage is
a) prefetch stage
b) D1 (first decode) stage
c) D2 (second decode) stage
d) execution stage
Ans: c
Explanation: In the D2 (second decode) stage, CPU generates an address for data memory
references in this stage. This stage is required where the control word from D1 stage is
again decoded for final execution.
Ans: C
Explanation: In the operand fetch stage, the FPU (Floating Point Unit) fetches the operands from
either floating point register file or data cache.
18) The FPU (Floating Point Unit) writes the results to the floating point register file in
a) X1 execution state
b) X2 execution state
c) write back stage
d) none of the mentioned
Ans: c
Explanation: In the two execution stages of X1 and X2, the floating point unit reads the data from the
data cache and executes the floating point computation. In the “write back stage” of
pipeline, the FPU (Floating Point Unit) writes the results to the floating point register file.
19) The floating point multiplier segment performs floating point multiplication in
a) single precision
b) double precision
c) extended precision
d) all of the mentioned
Ans: d
Explanation: The floating point multiplier segment performs floating point multiplication in single
precision, double precision and extended precision.
20) The instruction or segment that executes the floating point square root instructions is
a) floating point square root segment
b) floating point division and square root segment
c) floating point divider segment
d) none of the mentioned
Ans: c
Explanation: The floating point divider segment executes the floating point division and square root
instructions.
21) The floating point rounder segment performs rounding off operation at
Explanation: The results of floating point addition or division process may be required to be rounded
off, before write back stage to the floating point registers.
21) Which of the following is a floating point exception that is generated in case of integer
arithmetic?
a) divide by zero
b) overflow
c) denormal operand
d) all of the mentioned
Ans: D
Explanation: In the case of integer arithmetic, the possible floating point exceptions in Pentium are:
1. divide by zero
2. overflow
3. denormal operand
4. underflow
5. invalid operation.
UNIT-2
Note: Correct Answers are in Bold Fonts
1. The First step in developing a parallel algorithm is_
A. Granularity
B. Priority
C. Modernity
D. None of above
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
D. speculative decomposition
E. All of Above
7. The Owner Computes Rule generally states that the process assigned a particular data
item is responsible for_
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
d. Pascal's Law
13. ____________ is due to load imbalance, synchronization, or serial components as
parts of overheads in parallel programs.
a. Interprocess interaction
b. Synchronization
c. Idling
d. Excess computation
14. Which of the following parallel methodological design elements focuses on
recognizing opportunities for parallel execution?
a. Partitioning
b. Communication
c. Agglomeration
d. Mapping
15. Considering to use weak or strong scaling is part of ______________ in addressing
the challenges of distributed memory programming.
a. Splitting the problem
b. Speeding up computations
c. Speeding up communication
d. Speeding up hardware
16. Domain and functional decomposition are considered in the following parallel
methodological design elements, EXCEPT:
a. Partitioning
b. Communication
c. Agglomeration
d. Mapping
17. Synchronization is one of the common issues in parallel programming. The issues
related to synchronization include the followings, EXCEPT:
a. Deadlock
b. Livelock
c. Fairness
d. Correctness
18. Which of the followings is the BEST description of Message Passing Interface (MPI)?
a. A specification of a shared memory library
b. MPI uses objects called communicators and groups to define which
collection of processes may communicate with each other
c. Only communicators and not groups are accessible to the programmer only
by a "handle"
d. A communicator is an ordered set of processes
c) Scalability
d) Effectiveness
Ans: C
Explanation: The measure of the “effort” needed to maintain efficiency while adding
processors is called as scalability.
7) Several instructions execution simultaneously in ________________
a) processing
b) parallel processing
c) serial processing
d) multitasking
Ans: b
Explanation: In parallel processing, the several instructions are executed simultaneously.
8) Conventional architectures coarsely comprise of a_
a) A processor
b) Memory system
c) Data path.
d) All of Above
Ans: d
Explanation:
9) A pipeline is like_
a) Overlaps various stages of instruction execution to achieve performance.
b) House pipeline
c) Both a and b
d) A gas line
Ans: a
Explanation:
10) VLIW processors rely on_
a) Compile time analysis
b) Initial time analysis
c) Final time analysis
d) Mid time analysis
Ans: a
Explanation:
11) Memory system performance is largely captured by_
a) Latency
b) Bandwidth
c) Both a and b
d) none of above
Ans: c
Explanation:
12) The fraction of data references satisfied by the cache is called_
a) Cache hit ratio
b) Cache fit ratio
d) None of above
Ans: a
Explanation:
4) The graph of tasks (nodes) and their interactions/data exchange (edges)_
a) Is referred to as a task interaction graph
b) Is referred to as a task Communication graph
c) Is referred to as a task interface graph
d) None of Above
Ans: a
Explanation:
5) Mappings are determined by_
a) task dependency
b) task interaction graphs
c) Both A and B
d) None of Above
Ans: c
Explanation:
6) Decomposition Techniques are_
a) recursive decomposition
b) data decomposition
c) exploratory decomposition
d) speculative decomposition
e) All of Above
Ans: E
Explanation:
7) The Owner Computes Rule generally states that the process assigned a
particular data item is responsible for_
a) All computation associated with it
b) Only one computation
c) Only two computation
d) Only occasionally computation
Ans: A
Explanation:
8) A simple application of exploratory decomposition is_
a) The solution to a 15 puzzle
b) The solution to 20 puzzle
c) The solution to any puzzle
d) None of Above
Ans: A
Explanation:
9) Speculative Decomposition consist of _
a) conservative approaches
b) optimistic approaches
c) Both A and B
d) Only B
Ans: C
Explanation:
10) task characteristics include:
a) Task generation.
b) Task sizes.
c) Size of data associated with tasks.
d) All of Above
Ans: d
Explanation:
UNIT-3
1) Group communication operations are built using point-to-point messaging
primitives
a) True
b) False
Ans: A
Explanation:
2) Communicating a message of size m over an uncongested network takes
time ts + tw·m
a) True
b) False
Ans: A
Explanation:
3) The dual of one-to-all broadcast is_
a) All-to-one reduction
b) All-to-one receiver
c) All-to-one Sum
d) None of Above
Ans: A
Explanation:
4) A hypercube has_
a) 2^d nodes
b) 2d nodes
c) 2^n nodes
d) N nodes
Ans: a
Explanation:
5) A binary tree in which processors are (logically) at the leaves and internal
nodes are routing nodes.
a) True
b) False
Ans: A
Explanation:
6) In All-to-All Broadcast each processor is the source as well as destination.
a) True
b) False
Ans: A
Explanation:
7) The Prefix Sum Operation can be implemented using the_
a) All-to-all broadcast kernel.
b) All-to-one broadcast kernel.
c) One-to-all broadcast Kernel
d) Scatter Kernel
Ans: A
Explanation:
8) In the scatter operation_
a) Single node send a unique message of size m to every other node
b) Single node send a same message of size m to every other node
c) Single node send a unique message of size m to next node
d) None of Above
Ans: A
Explanation:
9) The gather operation is exactly the inverse of the_
a) Scatter operation
b) Broadcast operation
c) Prefix Sum
d) Reduction operation
Ans: A
Explanation:
10) In All-to-All Personalized Communication Each node has a distinct
message of size m for every other node
a) True
b) False
Ans: a
Explanation:
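A minimal MPI sketch of the scatter/gather pair from Questions 8 and 9 (the chunk size M and the buffer bound are illustrative assumptions): MPI_Scatter sends a distinct M-word piece of the root's buffer to every process, and MPI_Gather, its exact inverse, concatenates the pieces back at the root.

#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    enum { M = 4 };                       /* words per process */
    int sendbuf[64], recvbuf[M], out[64]; /* assumes size * M <= 64 */
    if (rank == 0)
        for (int i = 0; i < size * M; i++) sendbuf[i] = i;

    /* Root sends a unique M-word message to every process. */
    MPI_Scatter(sendbuf, M, MPI_INT, recvbuf, M, MPI_INT, 0, MPI_COMM_WORLD);
    /* Gather (concatenation) is the inverse: root collects every chunk. */
    MPI_Gather(recvbuf, M, MPI_INT, out, M, MPI_INT, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}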
___________________________________________________________________________________
Date: 23/07/2020
D. Y. Patil College of Engineering, Akurdi, Pune 411044, Department of Computer Engineering
___________________________________________________________________________________
4. All-to-all personalized communication is performed independently in each row with clustered messages of size _______ on a mesh.
A. m
B. p
C. m√p
D. p√m
Ans: C
5. In All-to-All Personalized Communication on a Ring, the size of the message reduces by ______ at each step.
A. m
B. p
C. m-1
D. p-1
Ans: A
6. All-to-All Broadcast and Reduction algorithm on a Ring terminates in _________ steps.
A. p
B. p+1
C. p-1
D. p*p
Ans: C
10.
i) all-to-one reduction in each row
ii) one-to-all broadcast of each vector element among the n processes of each column
iii) one-to-one communication to align the vector along the main diagonal
11. Parallel time in Rowwise 1-D Partitioning of Matrix-Vector Multiplication where p=n is ____.
A. Θ(1)
B. Θ(n log n)
C. Θ(n²)
D. Θ(n)
Ans: D
10. An interaction pattern is considered to be _______ if it has some structure that can be exploited for efficient implementation.
a. Structured interaction
b. Unstructured interaction
c. Regular interaction
d. Irregular interaction
Ans: C
20. Best-First search can be implemented using the following data structure
a. Queue
b. Stack
c. Priority Queue
d. Circular Queue
Ans: C
HPC MCQ QB for Insem Examination
Unit I
1. Conventional architectures coarsely comprise of a_
A. A processor
B. Memory system
C Data path.
D All of Above
3. A pipeline is like_
A Latency
B Bandwidth
C Both a and b
D none of above
8. A single control unit that dispatches the same Instruction to various processors is__
A SIMD
B SPMD
C MIMD
D None of above
Unit 2
1. The First step in developing a parallel algorithm is_
A. Granularity
B. Priority
C. Modernity
D. None of above
A. task dependency
B. task interaction graphs
C. Both A and B
D. None of Above
7. The Owner Computes Rule generally states that the process assigned a particular data
item is responsible for_
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
A. True
B. False
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
4. A hypercube has_
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
5. A binary tree in which processors are (logically) at the leaves and internal nodes are
routing nodes.
A. True
B. False
A. True
B. False
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
10. In All-to-All Personalized Communication Each node has a distinct message of size m
for every other node
A. True
B. False
4. Concrete having 28- days’ compressive strength in the range of 60 to 100 MPa.
a) HPC
b) VHPC
c) OPC
d) HSC
Answer: a
Explanation: High Performance Concrete having 28- days’ compressive strength in the
range of 60 to 100 MPa.
5. Concrete having 28-days compressive strength in the range of 100 to 150 MPa.
a) HPC
b) VHPC
c) OPC
d) HSC
Answer: b
Explanation: Very high performing Concrete having 28-days compressive strength in the
range of 100 to 150 MPa.
7. The choice of cement for high-strength concrete should not be based only on mortar-
cube tests but it should also include tests of compressive strengths of concrete at
___________ days.
a) 28, 56, 91
b) 28, 60, 90
c) 30, 60, 90
d) 30, 45, 60
Answer: a
Explanation: The choice of cement for high-strength concrete should not be based only on
mortar-cube tests but it should also include tests of compressive strengths of concrete at
28, 56, and 91 days.
B. Multithreading
C. Increase Bandwidth
D. Increase Memory
ANSWER: B
A. Message Passing
B. Shared-address space
C. Client-Server
D. Distributed Network
ANSWER: B
The principal parameters that determine the communication latency are as follows:
A. Startup time (ts) Per-hop time (th) Per-word transfer time (tw)
ANSWER: A
The number and size of tasks into which a problem is decomposed determines the __
A. Granularity
B. Task
C. Dependency Graph
D. Decomposition
ANSWER: A
A. The average number of tasks that can run concurrently over the entire duration of execution of
the process.
B. The average time that can run concurrently over the entire duration of execution of the process.
ANSWER: A
A. Data decomposition
B. Exploratory decomposition
C. Speculative decomposition
D. Recursive decomposition
ANSWER: B
ANSWER: A
ANSWER: D
A. MIMD
B. SIMD
C. SISD
D. MISD
ANSWER: B
A. The length of the longest path in a task dependency graph is called the critical path length.
B. The length of the smallest path in a task dependency graph is called the critical path length.
ANSWER: A
A. recursive decomposition
B. Sdata decomposition
C. exploratory decomposition
D. speculative decomposition
ANSWER: A
If there are 6 nodes in a ring topology how many message passing cycles will be required to
complete broadcast process in one to all?
A. 1
B. 6
C. 3
D. 4
ANSWER: C
If there is 4 X 4 Mesh topology network then how many ring operation will perform to complete one
to all broadcast?
A. 4
B. 8
C. 16
D. 32
ANSWER: B
Consider all to all broadcast in ring topology with 8 nodes. How many messages will be present with
each node after 3rd step/cycle of communication?
A. 3
B. 4
C. 6
D. 7
ANSWER: B
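As a check on the two ring answers above (a sketch of the standard step counts, assuming bidirectional links): in one-to-all broadcast on a p-node ring the message travels at most ⌈p/2⌉ hops in each direction, so p = 6 needs 3 cycles; in all-to-all broadcast on a ring every node starts with one message and receives exactly one new message per step, so after step k each node holds k + 1 messages, i.e. 4 messages after the 3rd step with 8 nodes.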
Consider Hypercube topology with 8 nodes then how many message passing cycles will require in all
to all broadcast operation?
B. The longest directed path between any pair of start & finish node.
ANSWER: D
Scatter is ____________.
ANSWER: C
If there is 4X4 Mesh Topology ______ message passing cycles will require complete all to all
reduction.
A. 4
B. 6
C. 8
D. 16
ANSWER: C
Which of the following issue(s) is/are true about sorting techniques with parallel computing?
A. Large sequence is the issue
ANSWER: B
A. Local arrangement
B. Processess assignments
C. Global arrangement
ANSWER: C
A. Donor
B. Active
C. Idle
D. Passive
ANSWER: A
Suppose there are 16 elements in a series then how many phases will be required to sort the series
using parallel odd-even bubble sort?
A. 8
B. 4
C. 5
D. 15
ANSWER: D
A. Interprocess interactions
B. Process Idling
C. All mentioned options
D. Excess Computation
ANSWER: C
Speedup is ____________.
A. The ratio of the time taken to solve a problem on a parallel processor to the time required to solve
the same problem on a single processor with p identical processing elements.
B. The ratio of the time taken to solve a problem on a single processor to the time required to solve
the same problem on a parallel computer with p identical processing elements.
ANSWER: B
Efficiency is a measure of the fraction of time for which a processing element is usefully employed.
A. TRUE
B. FALSE
ANSWER: A
A. CPU
B. GPU
C. ROM
D. Cash memory
ANSWER: B
A. Work
B. Worker
C. Task
ANSWER: B
In GPU Following statements are true
ANSWER: C
A. Decentralized computing
B. Parallel computing
C. Centralized computing
D. All of these
ANSWER: A
A. HPC
B. Distrubuted Framework
C. HRC
ANSWER: A
A pipeline is like?
A. Overlaps various stages of instruction execution to achieve performance
B. house pipeline
C. both a and b
D. a gas line
ANSWER: A
Pipeline implements ?
A. fetch instruction
B. decode instruction
C. fetch operand
D. all of above
ANSWER: D
A processor performing fetch or decoding of different instruction during the execution of another
instruction is called ______ ?
A. Super-scaling
B. Pipe-lining
C. Parallel Computation
D. None of these
ANSWER: B
In a parallel execution, the performance will always improve as the number of processors will
increase?
A. True
B. False
ANSWER: B
ANSWER: A
In VLIW the decision for the order of execution of the instructions depends on the program itself?
A. True
B. False
ANSWER: A
B. Cache coherency
C. Synchronization overheads
ANSWER: B
B. Concurrent write
C. Concurrent read
ANSWER: B
C. All memory units are mapped to one common virtual global memory
ANSWER: D
ANSWER: B
A. SISD
B. SIMD
C. MIMD
D. All of the above
ANSWER: A
How does the number of transistors per chip increase according to Moore's law?
A. Quadratically
B. Linearly
C. Cubicly
D. Exponentially
ANSWER: D
ANSWER: C
ANSWER: D
A. Instruction-level
B. Loop level
C. Task-level
D. Function-level
ANSWER: A
Multiprocessor is systems with multiple CPUs, which are capable of independently executing
different tasks in parallel. In this category every processor and memory module has similar access
time?
A. UMA
B. Microprocessor
C. Multiprocessor
D. NUMA
ANSWER: A
A. hit rate
B. coherence misses
C. comitt misses
D. parallel processing
ANSWER: B
A. cache
B. shared memory
C. message passing
D. distributed memory
ANSWER: D
A multiprocessor machine which is capable of executing multiple instructions on multiple data sets?
A. SISD
B. SIMD
C. MIMD
D. MISD
ANSWER: C
A. Task or processes
B. Task and Execution
ANSWER: A
B. Execute directly
C. Execute indirectly
D. None of Above
ANSWER: A
A. Granularity
B. Priority
C. Modernity
D. None of above
ANSWER: A
D. None of above
ANSWER: A
D. None of Above
ANSWER: A
A. task dependency
C. Both A and B
D. None of Above
ANSWER: C
A. recursive decomposition
B. data decomposition
C. exploratory decomposition
D. All of Above
ANSWER: D
The Owner Computes Rule generally states that the process assigned a particular data item is
responsible for?
ANSWER: A
D. None of Above
ANSWER: A
Speculative Decomposition consist of _?
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
ANSWER: C
A. Task generation.
B. Task sizes.
D. All of Above
ANSWER: D
A. Parallel computation
B. Parallel processes
C. Parallel development
D. Parallel programming
ANSWER: D
B. Bit model
C. Data model
D. network model
ANSWER: A
The number and size of tasks into which a problem is decomposed determines the?
A. fine-granularity
B. coarse-granularity
C. sub Task
D. granularity
ANSWER: D
A feature of a task-dependency graph that determines the average degree of concurrency for a given
granularity is its ___________ path?
A. critical
B. easy
C. difficult
D. ambiguous
ANSWER: A
The pattern of___________ among tasks is captured by what is known as a task-interaction graph?
A. Interaction
B. communication
C. optimization
D. flow
ANSWER: A
C. Increase Bandwidth
ANSWER: A
A. Task
B. Instruction
C. Data
D. Program
ANSWER: A
A. s=Ts/Tp
B. S= Tp/Ts
C. Ts=S/Tp
D. Tp=S /Ts
ANSWER: A
A. Bit
B. Data
C. Instruction
D. Task
ANSWER: D
_________ is a method for inducing concurrency in problems that can be solved using the divide-
and-conquer strategy?
A. exploratory decomposition
B. speculative decomposition
C. data-decomposition
D. Recursive decomposition
ANSWER: D
The___ time collectively spent by all the processing elements Tall = p TP?
A. total
B. Average
C. mean
D. sum
ANSWER: A
Group communication operations are built using point-to-point messaging primitives?
A. True
B. False
ANSWER: A
A. True
B. False
ANSWER: A
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
ANSWER: A
A hypercube has?
A. 2^d nodes
B. 2d nodes
C. 2n Nodes
D. N Nodes
ANSWER: A
A binary tree in which processors are (logically) at the leaves and internal nodes are routing nodes?
A. True
B. False
ANSWER: A
A. True
B. False
ANSWER: A
D. Scatter Kernel
ANSWER: A
D. None of Above
ANSWER: A
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
ANSWER: A
In All-to-All Personalized Communication Each node has a distinct message of size m for every other
node ?
A. True
B. False
ANSWER: A
Parallel algorithms often require a single process to send identical data to all other processes or to a
subset of them. This operation is known as _________?
A. one-to-all broadcast
C. one-to-all reduction
ANSWER: A
In which of the following operation, a single node sends a unique message of size m to every other
node?
A. Gather
B. Scatter
C. One-to-all personalized communication
D. Both B and C
ANSWER: D
ANSWER: A
A. True
B. False
ANSWER: A
Gather operation, or concatenation, in which a single node collects a unique message from each
node?
A. True
B. False
ANSWER: A
Conventional architectures coarsely comprise of a?
A. A processor
B. Memory system
C. Data path.
D. All of Above
ANSWER: D
D. None of above
ANSWER: A
A pipeline is like?
A. Overlaps various stages of instruction execution to achieve performance
B. House pipeline
C. Both a and b
D. A gas line
ANSWER: A
Scheduling of instructions is determined by?
A. True Data Dependency
B. Resource Dependency
C. Branch Dependency
D. All of above
ANSWER: D
ANSWER: A
A. Latency
B. Bandwidth
C. Both a and b
D. none of above
ANSWER: C
D. none of above
ANSWER: A
A single control unit that dispatches the same Instruction to various processors is?
A. SIMD
B. SPMD
C. MIMD
D. None of above
ANSWER: A
B. Exchanging messages.
C. Both A and B
D. None of Above
ANSWER: C
Switches map a fixed number of inputs to outputs?
A. True
B. False
ANSWER: A
B. Execute directly
C. Execute indirectly
D. None of Above
ANSWER: A
A. Granularity
B. Priority
C. Modernity
D. None of above
ANSWER: A
D. None of above
ANSWER: A
ANSWER: A
A. task dependency
C. Both A and B
D. None of Above
ANSWER: C
A. recursive decomposition
B. data decomposition
C. exploratory decomposition
D. All of Above
ANSWER: D
The Owner Computes Rule generally states that the process assigned a particular data item is
responsible for?
ANSWER: A
D. None of Above
ANSWER: A
Speculative Decomposition consist of ?
A. conservative approaches
B. optimistic approaches
C. Both A and B
D. Only B
ANSWER: C
A. Task generation.
B. Task sizes.
D. All of Above.
ANSWER: D
A. True
B. False
ANSWER: A
A. True
B. False
ANSWER: A
A. All-to-one reduction
B. All-to-one receiver
C. All-to-one Sum
D. None of Above
ANSWER: A
A hypercube has?
A. 2^d nodes
B. 3d nodes
C. 2n Nodes
D. N Nodes
ANSWER: A
A binary tree in which processors are (logically) at the leaves and internal nodes are routing nodes?
A. True
B. False
ANSWER: A
A. True
B. False
ANSWER: A
D. Scatter Kernel
ANSWER: A
D. None of Above
ANSWER: A
The gather operation is exactly the inverse of the?
A. Scatter operation
B. Broadcast operation
C. Prefix Sum
D. Reduction operation
ANSWER: A
In All-to-All Personalized Communication Each node has a distinct message of size m for every other
node?
A. True
B. False
ANSWER: A
A. Decentralized computing
B. Parallel computing
C. Centralized computing
D. Decentralized computing
E. Distributed computing
ANSWER: A
A. Parallel computation
B. Parallel processes
C. Parallel development
D. Parallel programming
ANSWER: D
A. Maintenance
B. Initiation
C. Implementation
D. Deployment
ANSWER: D
A. Multithreading
B. Cyber cycle
C. Internet of things
D. Cyber-physical system
ANSWER: C
A. HPC
D. HTC
C. HRC
D. Both A and B
ANSWER: D
A. Adaptivity
B. Transparency
C. Dependency
D. Secretive
ANSWER: B
The architecture in which no special machines manage the network of resources is known as?
A. Peer-to-Peer
B. Space based
C. Tightly coupled
D. Loosely coupled
ANSWER: A
A. 5 types
B. 2 types
C. 3 types
D. 4 types
ANSWER: C
B. 1 Server machine
C. 1 Client machine
ANSWER: D
A. Business
B. Engineering
C. Science
D. Media mass
ANSWER: A
Virtualization that creates one single address space architecture that of, is called?
A. Loosely coupled
B. Peer-to-Peer
C. Space-based
D. Tightly coupled
ANSWER: C
B. Decentralized computing
C. Parallel computing
D. All of these
ANSWER: D
A. Flexibility
B. Adaptation
C. Efficiency
D. Dependability
ANSWER: C
The ability to support billions of job requests over massive data sets is known as?
A. Efficiency
B. Dependability
C. Adaptation
D. Flexibility
ANSWER: C
Cloud computing offers a broader concept than which of the following?
A. Parallel computing
B. Centralized computing
C. Utility computing
D. Decentralized computing
ANSWER: C
The transparency that allows movement of resources and clients within a system is called?
A. Mobility transparency
B. Concurrency transparency
C. Performance transparency
D. Replication transparency
ANSWER: A
A. Distributed process
B. Distributed program
C. Distributed application
D. Distributed computing
ANSWER: B
A. Grid computing
B. Centralized computing
C. Parallel computing
D. Distributed computing
ANSWER: B
A. Data
B. Cloud
C. Scalable
D. Business
ANSWER: D
A. 5C
B. 2C
C. 3C
D. 4C
ANSWER: C
Abbreviation of HPC?
A. High-peak computing
B. High-peripheral computing
C. High-performance computing
D. Highly-parallel computing
ANSWER: C
A. Norming grids
B. Data grids
C. Computational grids
D. Both A and B
ANSWER: D
A. Management
B. Media mass
C. Business
D. Science
ANSWER: D
A. 6
B. 3
C. 4
D. 5
ANSWER: D
A. Adaptation
B. Efficiency
C. Dependability
D. Flexibility
ANSWER: B
Even under failure conditions Providing Quality of Service (QoS) assurance is the responsibility of?
A. Dependability
B. Adaptation
C. Flexibility
D. Efficiency
ANSWER: A
A. Centralized memory
B. Shared memory
C. Message passing
D. Both A and B
ANSWER: D
A. Microcomputers
B. Minicomputers
C. Mainframe computers
D. Supercomputers
ANSWER: D
B. Low-flux computing
C. High-flux computing
D. Computer utilities
ANSWER: C
The high-throughput service provided is measured by
A. Flexibility
B. Efficiency
C. Dependability
D. Adaptation
ANSWER: D
B. Inter-process Communication
C. Idling
D. All above
ANSWER: D
A. Execution Time
C. Speedup
D. All above
ANSWER: D
The efficiency of a parallel program can be written as: E = Ts / pTp. True or False?
A. True
B. False
ANSWER: A
A. ILP
B. Performance
C. Cost effectiveness
D. delay
ANSWER: A
Which of the following statements are true with regard to compute capability in CUDA
A. Code compiled for hardware of one compute capability will not need to be re-compiled to run on hardware of another
B. Different compute capabilities may imply a different amount of local memory per
thread
Answer : B
True or False: The threads in a thread block are distributed across SM units so that each
thread is executed by one SM unit.
A. True
B. False
Answer : B
Answer : C
True or false: Functions annotated with the __global__ qualifier may be executed on the
host or the device
A. True
B. False
Answer : A
Which of the following correctly describes a GPU kernel
B. All thread blocks involved in the same computation use the same kernel
Answer : B
C. Block and grid level parallelism - Different blocks or grids execute different tasks
D. Data parallelism - Different threads and blocks process different parts of data in
memory
Answer :A
What strategy does the GPU employ if the threads within a warp diverge in their execution?
A. Threads are moved to different warps so that divergence does not occur within a
single warp
C. All possible execution paths are run by all threads in a warp serially so that thread
instructions do not diverge
Answer : C
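A small CUDA sketch of the divergence described above: even and odd lanes of a single 32-thread warp take different branches, so the hardware runs the two paths serially with some lanes masked. The kernel and the values it writes are illustrative assumptions.

#include <cstdio>

__global__ void divergent(int *out) {
    int t = threadIdx.x;
    if (t % 2 == 0) out[t] = t * 2;   // even lanes take this path first...
    else            out[t] = t * 3;   // ...then odd lanes take the other, serially
}

int main() {
    int *d;
    cudaMalloc((void**)&d, 32 * sizeof(int));
    divergent<<<1, 32>>>(d);                 // a single warp of 32 threads
    int h[32];
    cudaMemcpy(h, d, sizeof h, cudaMemcpyDeviceToHost);
    printf("%d %d\n", h[0], h[1]);           // prints: 0 3
    cudaFree(d);
    return 0;
}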
Which of the following does not result in uncoalesced (i.e. serialized) memory access on the
K20 GPUs installed on Stampede
Answer : A
Which of the following correctly describes the relationship between Warps, thread blocks,
and CUDA cores?
A. A warp is divided into a number of thread blocks, and each thread block executes on
a single CUDA core
B. A thread block may be divided into a number of warps, and each warp may execute
on a single CUDA core
C. A thread block is assigned to a warp, and each thread in the warp is executed on a
separate CUDA core
Answer : B
Answer : A
A. CUDA Libraries
B. CUDA Runtime
C. CUDA Driver
D. All Above
Answer : D
A. C
B. C++
C. Fortran
D. All Above
Answer : D
Threads support Shared memory and Synchronization
A. True
B. False
Answer : A
B. Medical Imaging
C. Computational Science
E. All Above
Answer : E
A. True
B. False
Answer : A
What are the issues in sorting?
C. All above
Answer : C
Answer : A
A. Shell sort
B. Quick sort
C. Odd-Even transposition
D. Option A & C
Answer : D
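For option C above, a minimal serial sketch of odd-even transposition sort; the parallel formulation assigns elements (or blocks) to processes and runs the same alternating compare-exchange phases. The array contents are illustrative assumptions.

// n phases of alternating even/odd compare-exchange steps sort n elements.
#include <cstdio>
#include <utility>

void oddEvenSort(int a[], int n) {
    for (int phase = 0; phase < n; phase++)                     // n phases total
        for (int i = (phase % 2 == 0) ? 0 : 1; i + 1 < n; i += 2)
            if (a[i] > a[i + 1]) std::swap(a[i], a[i + 1]);     // compare-exchange
}

int main() {
    int a[] = {5, 1, 4, 2, 3};
    oddEvenSort(a, 5);
    for (int x : a) printf("%d ", x);   // prints: 1 2 3 4 5
    return 0;
}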
Answer : A
Formally, given a weighted graph G(V, E, w), the all-pairs shortest paths problem is to
find the shortest paths between all pairs of vertices. True or False?
A. True
B. False
Answer : A
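One standard way to solve this problem is the Floyd-Warshall recurrence d[i][j] = min(d[i][j], d[i][k] + d[k][j]); below is a minimal serial sketch on an assumed 3-vertex example (the parallel formulations in the next question distribute this work across processes).

#include <cstdio>
#include <algorithm>

const int INF = 1000000000;   // "no edge" sentinel; large but overflow-safe here

int main() {
    int n = 3;
    int d[3][3] = {{0, 4, INF}, {INF, 0, 1}, {2, INF, 0}};
    for (int k = 0; k < n; k++)            // allow vertex k as an intermediate
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                d[i][j] = std::min(d[i][j], d[i][k] + d[k][j]);
    printf("%d\n", d[0][2]);               // 0 -> 1 -> 2 costs 4 + 1 = 5
    return 0;
}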
A. One approach partitions the vertices among different processes and has each process
compute the single-source shortest paths for all vertices assigned to it. We refer to
this approach as the source-partitioned formulation.
B. Another approach assigns each vertex to a set of processes and uses the parallel
formulation of the single-source algorithm to solve the problem on each set of
processes. We refer to this approach as the source-parallel formulation.
C. Both are true
D. None of these is true
Answer : C
Search algorithms can be used to solve discrete optimization problems. True or False ?
A. True
B. False
Answer : A
C. All of above
Answer : C
List the communication strategies for parallel BFS.
D. All of above
Answer : D
In a compare-split operation
A. Each process sends its block of size n/p to the other process
B. Each process merges the received block with its own block and retains only the
appropriate half of the merged block
C. Both A & B
Answer : C
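A minimal sketch of one compare-split step, simulated locally for two processes; in the real operation the blocks travel over the network, and the sample blocks here are illustrative assumptions.

// Each partner merges its own sorted block with the received one and keeps half.
#include <cstdio>
#include <vector>
#include <algorithm>

int main() {
    std::vector<int> mine   = {2, 5, 8, 9};    // this process's sorted block
    std::vector<int> theirs = {1, 3, 7, 10};   // block received from the partner

    std::vector<int> merged(mine.size() + theirs.size());
    std::merge(mine.begin(), mine.end(), theirs.begin(), theirs.end(),
               merged.begin());

    // The "low" partner retains the smaller half; its partner keeps the rest.
    std::vector<int> keepLow(merged.begin(), merged.begin() + mine.size());
    for (int x : keepLow) printf("%d ", x);    // prints: 1 2 3 5
    return 0;
}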
Answer : D
A. Execution Time
B. Total Parallel Overhead
C. Speedup
D. Efficiency
E. Cost
F. All above
Answer : F
The efficiency of a parallel program can be written as: E = Ts / (p * Tp). True or False?
A. True
B. False
Answer : A
The overhead function (or total overhead) of a parallel system is the total time collectively
spent by all the processing elements over and above that required by the fastest known
sequential algorithm for solving the same problem on a single processing element.
True or False?
A. True
B. False
Answer : A
What is Speedup?
A. A measure that captures the relative benefit of solving a problem in parallel. It is defined as the
ratio of the time taken to solve a problem on a single processing element to the time required to
solve the same problem on a parallel computer with p identical processing elements.
B. A measure of the fraction of time for which a processing element is usefully
employed.
C. None of the above
Answer : A
In an ideal parallel system, speedup is equal to p and efficiency is equal to one. True or
False?
A. True
B. False
Answer : A
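A worked example of the two definitions above; the timings Ts = 100, Tp = 30 and p = 4 are assumed numbers, chosen only to show the arithmetic.

#include <cstdio>

int main() {
    double Ts = 100.0;            // serial time (illustrative)
    double Tp = 30.0;             // parallel time on p processing elements
    int    p  = 4;
    double S = Ts / Tp;           // speedup: 3.33
    double E = Ts / (p * Tp);     // efficiency: 0.83 (ideal: S = p, E = 1)
    printf("S = %.2f, E = %.2f\n", S, E);
    return 0;
}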
A parallel system is said to be ________________ if the cost of solving a problem on a
parallel computer has the same asymptotic growth (in Θ terms) as a function of the input
size as the fastest-known sequential algorithm on a single processing element.
A. Cost optimal
B. Non Cost optimal
Answer : A
Using fewer than the maximum possible number of processing elements to execute a
parallel algorithm is called ______________ a parallel system in terms of the number of
processing elements.
A. Scaling down
B. Scaling up
Answer : A
The __________________ function determines the ease with which a parallel system can
maintain a constant efficiency and hence achieve speedups increasing in proportion to the
number of processing elements.
A. Isoefficiency
B. Efficiency
C. Scalability
D. Total overhead
Answer : A
Minimum execution time for adding n numbers is Tp = n/p + 2 log p. True or False?
A. True
B. False
Answer : A
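A quick numeric check of the stated bound for adding n numbers on p processing elements; the values n = 1024 and p = 16 are assumed for illustration.

#include <cstdio>
#include <cmath>

int main() {
    double n = 1024, p = 16;
    double Tp = n / p + 2 * std::log2(p);   // 64 + 2*4 = 72 time units
    printf("Tp = %.0f\n", Tp);
    return 0;
}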
A. 1
B. 6
C. 3
D. 4
ANSWER: C
A. 4
B. 8
C. 16
D. 32
ANSWER: B
Scatter is ____________.
A pipeline implements?
A. fetch instruction
B. decode instruction
C. fetch operand
D. all of above
ANSWER: D
The Owner Computes Rule generally states that the process assigned a
particular data item is responsible for?
A. All computation associated with it
B. Only one computation
C. Only two computation
D. Only occasionally computation
ANSWER: A
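A minimal sketch of the owner computes rule on a block-partitioned array: each process performs all computation for the items it owns. The four serially simulated "processes" and the doubling computation are illustrative assumptions.

#include <cstdio>

void ownerComputes(int rank, int p, int n, double *a) {
    int chunk = n / p;                       // assume p divides n
    int lo = rank * chunk, hi = lo + chunk;  // this process owns a[lo..hi)
    for (int i = lo; i < hi; i++)
        a[i] = a[i] * 2.0;                   // all work on owned items stays local
}

int main() {
    double a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    for (int rank = 0; rank < 4; rank++)     // simulate 4 processes serially
        ownerComputes(rank, 4, 8, a);
    printf("%.0f %.0f\n", a[0], a[7]);       // prints: 2 16
    return 0;
}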
A hypercube has?
A. 2^d nodes
B. 3^d nodes
C. 2^n nodes
D. n nodes
ANSWER: A
The abbreviation HPC stands for?
A. High-peak computing
B. High-peripheral computing
C. High-performance computing
D. Highly-parallel computing
ANSWER: C
1. Which of the following is true about one-to-all broadcast?
A. In one-to-all broadcast, initially there are P (number of processors) copies of the message, and after the broadcast finally a single copy remains.
B. In one-to-all broadcast, initially there is a single copy of the message, and after the broadcast finally every node holds a copy.
Answer: B
2. If a total of 8 nodes are in a ring topology, after one-to-all message broadcasting how many copies of the message will exist?
Answer: 8
3. The current source node selects the _____ node as the next source node in linear/ring one-to-all message broadcast.
A. nearest node
B. farthest node
Answer: farthest node
4. In all-to-one reduction, after the reduction the final copy of the message is available on which node?
A. Source Node
B. Destination Node
D. None of these
Answer: Destination Node
5. If there is a 4 by 4 mesh topology network present (as shown in the video), then in how many message passing cycles will the broadcast complete?
8
4
16
6. If there are 8 nodes in a ring topology, how many message passing cycles will be required?
7. In one-to-all broadcast using hypercube topology, how does the source node select the next destination node?
8. If there are 8 nodes connected in a ring topology, then ___ message passing cycles are required.
9. Consider all-to-all broadcast in a ring topology with 8 nodes. How many message passing cycles will be required?
7
None of the above
10. If there are 16 messages in a 4x4 mesh, then in total how many message passing cycles are required?
11. If there are P messages in an m x m mesh, then in total how many message passing cycles are required?
2 √P - 2
2 √P - 1
2 √P
None of the above
Answer: 2 √P - 2
12. How many message passing cycles are required for all-to-all broadcasting in an 8-node hypercube?
13. In a scatter operation, after the message broadcast every node holds the same message copy.
True
False
Answer: False
CPU
GPU
ROM
Cache memory
Answer: GPU
Work
Worker
Task
Answer: Worker
16. Which of the following statements about GPUs are true?
Answer: “Grid contains Block”, “Block contains Threads”, “SM stands for Streaming MultiProcessor”
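A small CUDA sketch of that hierarchy: a grid of 2 blocks with 4 threads each, where every thread derives its global index from blockIdx, blockDim, and threadIdx. The grid and block sizes are illustrative assumptions.

#include <cstdio>

__global__ void fill(int *out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    out[i] = i;
}

int main() {
    int *d;
    cudaMalloc((void**)&d, 8 * sizeof(int));
    fill<<<2, 4>>>(d);               // grid of 2 blocks, 4 threads per block
    int h[8];
    cudaMemcpy(h, d, sizeof h, cudaMemcpyDeviceToHost);
    printf("%d %d\n", h[3], h[7]);   // prints: 3 7
    cudaFree(d);
    return 0;
}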
17. Which of the following issue(s) is/are true about sorting techniques with parallel computing?
Answer: “Where to store output sequence is the issue”, “Where to store input sequence is the issue”
18. Partitioning of a series is done after __
Local arrangement
Process assignments
Global arrangement
Answer: Global arrangement
19. In parallel DFS, processes have the following roles. (Select multiple choices if applicable)
Donor
Active
Idle
Recipient
Answer: “Donor”, “Recipient”
20. Suppose there are 16 elements in a series; then how many phases will be required to sort them?
15
Answer: 15
Interprocess interactions
Process Idling
Excess Computation
Answer: “Interprocess interactions”, “Process Idling”, “Excess Computation”
22. Speedup (S) is….
The ratio of the time taken to solve a problem on parallel processors to the time required to
solve the same problem on a single processor with p identical processing elements
The ratio of the time taken to solve a problem on a single processor to the time required to solve
the same problem on a parallel computer with p identical processing elements
Answer: The ratio of the time taken to solve a problem on a single processor to the time required to
solve the same problem on a parallel computer with p identical processing elements
23. Efficiency is a measure of the fraction of time for which a processing element is
usefully employed.
TRUE
FALSE
Answer: TRUE
a.CUDA Architecture included a unified shader pipeline, allowing each and every chip to be
marshaled by a program.
b.CUDA Architecture included a unified shader pipeline, allowing each and every unit on the
chip to be marshaled by a program intending to perform general-purpose computations
c.CUDA Architecture included a unified shader pipeline, allowing each and every logic unit
on the chip to be marshaled by a program intending to perform general-purpose computations
d.CUDA Architecture included a unified shader pipeline, allowing each and every arithmetic
logic unit (ALU) on the chip to be marshaled by a program intending to perform general-purpose
computations
Ans.D
a.kernel<1, 1>(1,1);
b.kernel<<<1, 1>>>(1,1);
c.kernel<<<1, 1>>>();
d.kernel<<1, 1>>();
Ans. c
#include <iostream>
__global__ void add( int a, int b, int *c ) {
*c = a + b;
}
add<<<1,1>>>( 2, 7, dev_c );
Ans.b
4. In the following code, which particular line is responsible for copying between device and host?
#include <iostream>
__global__ void add( int a, int b, int *c ) {
*c = a + b;
}
add<<<1,1>>>( 2, 7, dev_c );
a. c, dev_c, sizeof(int);
b. HANDLE_ERROR( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost );
Ans.c
#include <iostream>
__global__ void add( int a, int b, int *c ) {
*c = a + b;
}
add<<<1,1>>>( 2, 7, dev_c );
a.2
b.9
c.7
d.0
Ans. b
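The fragment above omits main(); a completion following the usual cudaMalloc / kernel launch / cudaMemcpy pattern would look like the sketch below. The host scaffolding is an assumption, since the question bank shows only the abridged listing.

#include <iostream>

__global__ void add(int a, int b, int *c) {
    *c = a + b;
}

int main() {
    int c, *dev_c;
    cudaMalloc((void**)&dev_c, sizeof(int));       // device-side result slot
    add<<<1, 1>>>(2, 7, dev_c);                    // one block, one thread
    cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);  // copy result back
    std::cout << "2 + 7 = " << c << std::endl;     // prints 9
    cudaFree(dev_c);
    return 0;
}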
a. alerts the compiler that a function should be compiled to run on a device instead of the host
b. alerts the interpreter that a function should be compiled to run on a device instead of the host
c. alerts the interpreter that a function should be interpreted to run on a device instead of the host
d. alerts the interpreter that a function should be compiled to run on a host instead of the device
Ans. a
7.The on-chip memory which is local to every multithreaded Single Instruction Multiple Data (SIMD)
Processor is called
a.Local Memory
b.Global Memory
c.Flash memory
d.Stack
Ans. a
8. The machine object that the hardware creates, manages, schedules, and executes is a thread of
a.DIMS instructions
b.DMM instructions
c.SIMD instructions
d.SIM instructions
Ans. c
a.Gather-scatter operations
b.Gather operations
c.Scatter operations
d.Gather-scatter technique
Ans. a
10. Which of the following architectures is/are not suitable for realizing SIMD?
a.Vector Processor
b.Array Processor
c.Von Neumann
d.All of the above
Ans. c
Ans. b
Ans. d
Ans. a
Ans. c
Ans. A
Ans. d
a.cudaFree()
b.Free()
c.Cudafree()
d.CudaFree()
Ans. a
a.Vector parallelism - Floating point computations are executed in parallel on wide vector units
b.Thread level task parallelism - Different threads execute different tasks
c.Block and grid level parallelism - Different blocks or grids execute different tasks
d.Data parallelism - Different threads and blocks process different parts of data in memory
Ans . a
Ans. c
Ans .b
Ans.a
23.Which of the following correctly describes the relationship between Warps, thread blocks, and CUDA
cores?
a.A warp is divided into a number of thread blocks, and each thread block executes on a single
CUDA core
b.A thread block may be divided into a number of warps, and each warp may execute on a single
CUDA core
c.A thread block is assigned to a warp, and each thread in the warp is executed on a separate CUDA
core
d. A block index is same as thread index
Ans .b
24. A processor assigned a thread block, which executes the code, is usually called a
a. multithreaded MIMD processor
b. multithreaded SIMD processor
c. multithreaded
d. multicore
Ans. c
25. Threads blocked together and executed in sets of 32 threads are called a
a.block of thread
b.thread block
c.thread
d.block
Ans. b
Ans. d