QCM Bigdata 1 Exampdf
a. Scalar
b. OLAP
c. User defined
d. Built in
You need to create a table that is not managed by the Big SQL database manager. Which keyword
would you use to create the table?
a. boolean
b. string
c. external
d. smallint
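The EXTERNAL keyword tells Big SQL not to manage the table's underlying files, so dropping the table leaves the data in HDFS. As a minimal sketch only, assuming the Python ibm_db driver, a Big SQL head node at a placeholder host and port, Big SQL's Hive-compatible DDL, and an existing comma-delimited directory /user/demo/sales_csv in HDFS:

    import ibm_db

    # All connection values below are placeholders for illustration only.
    conn = ibm_db.connect(
        "DATABASE=BIGSQL;HOSTNAME=bigsql-head.example.com;PORT=32051;"
        "PROTOCOL=TCPIP;UID=bigsql;PWD=passw0rd", "", "")

    # EXTERNAL: Big SQL does not manage the files, so dropping the table
    # leaves the data in HDFS untouched.
    ddl = """
    CREATE EXTERNAL HADOOP TABLE sales_ext (
        id INT,
        amount DOUBLE
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/user/demo/sales_csv'
    """
    ibm_db.exec_immediate(conn, ddl)
    ibm_db.close(conn)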
Which feature allows the Big SQL user to securely access data in Hadoop on behalf of another
user?
a. Schema
b. impersonation
c. rights
d. privilege
What is Big Data?
a. The summarization of large, indexed data stores to provide information about potential
problems or opportunities.
b. Indexed databases containing very large volumes of historical data used for compliance
reporting purposes
c. Non-conventional methods used by businesses and organizations to capture, manage,
process, and make sense of a large volume of data.
d. Structured data stores containing very large data sets such as video and audio streams.
Which two descriptions are advantages of Hadoop?
a. intensive calculations on small amounts of data
b. processing random access transactions
c. processing a large number of small files
d. able to use inexpensive commodity hardware
e. processing large volumes of data with high throughput
Which statement is true about the Hadoop Distributed File System (HDFS)?
a. HDFS is a software framework to support computing on large clusters of computers.
b. HDFS is the framework for job scheduling and cluster resource management.
c. HDFS provides a web-based tool for managing Hadoop clusters.
d. HDFS links the disks on multiple nodes into one large file system.
You need to define a server to act as the medium between an application and a data source. Which statement would you use?
a. SET AUTHORIZATION
b. CREATE WRAPPER
c. CREATE NICKNAME
d. CREATE SERVER
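The four statements listed above build on each other in DB2/Big SQL federation. A rough sketch of the usual order, run here through the ibm_db driver; the wrapper (DRDA), server name, remote database, credentials, and table names are placeholders, and the exact OPTIONS vary by data source:

    import ibm_db

    conn = ibm_db.connect("DATABASE=BIGSQL;HOSTNAME=bigsql-head.example.com;PORT=32051;"
                          "PROTOCOL=TCPIP;UID=bigsql;PWD=passw0rd", "", "")

    federation_ddl = [
        # 1. The wrapper loads the access module for a family of data sources.
        "CREATE WRAPPER DRDA",
        # 2. The server object is the medium between the application and the
        #    remote data source: the CREATE SERVER statement asked about above.
        "CREATE SERVER remsrv TYPE DB2/UDB VERSION 11.5 WRAPPER DRDA "
        "AUTHORIZATION \"remuser\" PASSWORD \"rempwd\" OPTIONS (DBNAME 'SALESDB')",
        # 3. The user mapping supplies credentials for the remote source.
        "CREATE USER MAPPING FOR USER SERVER remsrv "
        "OPTIONS (REMOTE_AUTHID 'remuser', REMOTE_PASSWORD 'rempwd')",
        # 4. The nickname exposes a remote table under a local name.
        "CREATE NICKNAME local_sales FOR remsrv.REMSCHEMA.SALES",
    ]
    for stmt in federation_ddl:
        ibm_db.exec_immediate(conn, stmt)
    ibm_db.close(conn)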
What are three examples of Big Data?
In Big SQL, what is used for table definitions, location, and storage format of input files?
a. Ambari
b. Scheduler
c. Hadoop Cluster
d. The Hive Metastore
What are three examples of "Data Exhaust"?
a. browser cache
b. video streams
c. banner ads
d. log files
e. cookies
f. JavaScript
Which Hortonworks Data Platform (HDP) component provides a common web user interface for
applications running on a Hadoop cluster?
a. Ambari
b. HDFS
c. YARN
d. MapReduce
What Python statement is used to add a library to the current code cell?
a. using
b. pull
c. import
d. load
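For illustration, a minimal notebook-style code cell; pandas is only an example of a third-party library and is assumed to be installed:

    import math                      # standard-library module
    import pandas as pd              # third-party library, aliased by convention
    from collections import Counter  # import a single name from a module

    print(math.sqrt(16))             # 4.0
    print(Counter("hadoop"))         # Counter({'o': 2, 'h': 1, 'a': 1, 'd': 1, 'p': 1})
    print(pd.__version__)            # whatever pandas version is installed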
Which two are examples of personally identifiable information (PII)?
a. Email address
b. Medical record number
c. IP address
d. Time of interaction
Which component of the HDFS architecture manages storage attached to the nodes?
a. NameNode
b. MasterNode
c. DataNode
d. StorageNode
Which component of the HDFS architecture manages the file system namespace and metadata?
a. NameNode
b. SlaveNode
c. WorkerNode
d. DataNode
Which type of foundation does Big SQL build on?
a. RStudio
b. Jupyter
c. Apache HIVE
d. MapReduce
How does MapReduce use ZooKeeper?
a. Coordination between servers.
b. Aid in the high availability of Resource Manager.
c. Server lease management of nodes.
d. Master server election and discovery
What is the default data format Sqoop parses to export data to a database?
a. JSON
b. CSV
c. XML
d. SQL
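As a minimal sketch of exporting comma-delimited HDFS files into a relational table (the delimited-text parsing Sqoop performs by default), invoked from Python; the JDBC URL, credentials, table, and paths are placeholders, and the matching JDBC driver JAR is assumed to already be in $SQOOP_HOME/lib:

    import subprocess

    subprocess.run([
        "sqoop", "export",
        "--connect", "jdbc:mysql://db.example.com/salesdb",
        "--username", "demo",
        "--password-file", "/user/demo/.dbpass",
        "--table", "sales",
        "--export-dir", "/user/demo/sales_csv",
        # Sqoop parses delimited text by default; the delimiter is stated explicitly here:
        "--input-fields-terminated-by", ",",
    ], check=True)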
Under the HDFS storage model, what is the default method of replication?
What is the process of transforming and mapping raw data into another format to make it more appropriate and valuable for a variety of downstream purposes such as analytics?
a. MapReduce
b. Data mining
c. Data munging
d. YARN
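As a small illustration of data munging with pandas (assumed installed), cleaning a raw feed so it is usable for analytics: dropping incomplete records, parsing dates, and normalizing numeric formats.

    import pandas as pd

    raw = pd.DataFrame({
        "date":   ["2024-01-01", "2024-01-02", None],
        "amount": ["10.5", "7,25", "3.0"],   # mixed decimal separators, typical of raw feeds
    })

    clean = (raw.dropna(subset=["date"])     # drop incomplete records
                .assign(date=lambda d: pd.to_datetime(d["date"]),
                        amount=lambda d: d["amount"].str.replace(",", ".").astype(float)))
    print(clean)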
The Big SQL head node has a set of processes running. What is the name of the service ID running
these processes?
a. user1
b. bigsql
c. hdfs
d. Db2
When sharing a notebook, what will always point to the most recent version of the notebook?
Which file format contains human-readable data where the column values are separated by a
comma?
a. Parquet
b. ORC
c. Sequence
d. Delimited
Which file format has the highest performance?
a. ORC
b. Sequence
c. Delimited
d. Parquet
Which two of the following are column-based data encoding formats?
a. ORC
b. JSON
c. Parquet
d. Flat
e. Avro
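For comparison, a minimal pandas sketch (pandas and pyarrow assumed installed) writing the same data as delimited text and as Parquet, the kind of column-oriented binary format the two questions above refer to:

    import pandas as pd

    df = pd.DataFrame({"id": [1, 2, 3], "amount": [10.5, 7.25, 3.0]})

    df.to_csv("sales.csv", index=False)   # human-readable rows, comma-delimited
    df.to_parquet("sales.parquet")        # compressed binary, values stored column by column

    print(open("sales.csv").read())       # the CSV can be read directly as text
    # The Parquet file keeps each column contiguous with its own metadata, which is
    # why columnar formats usually outperform delimited text for analytical queries.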
What is the default number of rows Sqoop will export per transaction?
a. 100,000
b. 1,000
c. 100
Which statement describes the action performed by HDFS when data is written to the Hadoop
cluster?
a. The data is spread out and replicated across the cluster.
b. The MasterNodes write the data to disk.
c. The data is replicated to at least 5 different computers.
d. The FsImage is updated with the new data map.
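A minimal sketch of that behaviour from the client side, assuming the hdfs command-line client is on PATH and /user/demo is writable:

    import subprocess

    # Writing a file: the NameNode records the metadata and the blocks are
    # spread out and replicated across DataNodes (three copies by default).
    subprocess.run(["hdfs", "dfs", "-put", "sales.csv", "/user/demo/sales.csv"], check=True)

    # The listing shows the file's replication factor in the second column.
    subprocess.run(["hdfs", "dfs", "-ls", "/user/demo/sales.csv"], check=True)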
Which two are use cases for deploying ZooKeeper?
a. Managing the hardware of cluster nodes.
b. Storing local temporary data files.
c. Simple data registry between nodes.
d. Configuration bootstrapping for new nodes.
What is one disadvantage to using CSV formatted data in a Hadoop data store?
a. Data must be extracted, cleansed, and loaded into the data warehouse.
b. It is difficult to represent complex data structures such as maps.
c. Fields must be positioned at a fixed offset from the beginning of the record.
d. Columns of data must be separated by a delimiter.
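A short illustration of that disadvantage, using only the standard library: a nested map has to be flattened or stringified by hand before it fits into a CSV row, and the reader has to know how to rebuild it.

    import csv, json, io

    record = {"user": "alice", "prefs": {"lang": "fr", "theme": "dark"}}  # nested map

    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["user", "prefs"])
    writer.writerow([record["user"], json.dumps(record["prefs"])])  # stringify the map by hand

    print(buf.getvalue())
    # user,prefs
    # alice,"{""lang"": ""fr"", ""theme"": ""dark""}"
    # Formats such as Parquet, ORC, and Avro carry nested schemas natively instead.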
Which environmental variable needs to be set to properly start ZooKeeper?
a. ZOOKEEPER_HOME
b. ZOOKEEPER_DATA
c. ZOOKEEPER_APP
d. ZOOKEEPER
What is the primary purpose of Apache NiFi?
a. Identifying non-compliant data access.
b. Finding data across the cluster.
c. Connect remote data sources via WiFi.
d. Collect and send data into a stream
Under the MapReduce v1 architecture, which element of the system manages the map and reduce
functions?
a. TaskTracker
b. JobTracker
c. StorageNode
d. SlaveNode
e. MasterNode
Under the MapReduce v1 (or classic) architecture, the JobTracker is responsible for managing and
coordinating the map and reduce functions. It tracks the progress of all the submitted MapReduce
jobs, schedules tasks on TaskTrackers, and ensures that tasks are executed successfully. The
TaskTrackers are responsible for running individual map and reduce tasks on cluster nodes.
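To make the map and reduce roles concrete, here is a toy, single-process word count; it is only a sketch of the programming model that the JobTracker and TaskTrackers execute at cluster scale.

    from collections import defaultdict

    def map_phase(line):
        # Map step: split the input and emit one (word, 1) pair per occurrence.
        return [(word, 1) for word in line.split()]

    def reduce_phase(pairs):
        # Reduce step: sum the values for each key (after the shuffle has grouped them).
        totals = defaultdict(int)
        for word, count in pairs:
            totals[word] += count
        return dict(totals)

    lines = ["big data on hadoop", "big sql on hadoop"]
    intermediate = [pair for line in lines for pair in map_phase(line)]
    print(reduce_phase(intermediate))  # {'big': 2, 'data': 1, 'on': 2, 'hadoop': 2, 'sql': 1}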
What must be done before using Sqoop to import from a relational database?
Install the JDBC driver JAR for the source database into $SQOOP_HOME/lib.
What is the default number of rows Sqoop will export per transaction?
1,000
Which of the "Five V's"of Big Data describes the real purpose of deriving business insight from Big
Data?
Value
Which Spark RDD operation returns values after performing the evaluations?
Actions
a. avro
b. csv
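A minimal PySpark sketch of the RDD actions question above, assuming pyspark is installed and run locally: transformations such as map are lazy, while actions such as collect and count trigger evaluation and return values to the driver.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").appName("rdd-actions").getOrCreate()
    sc = spark.sparkContext

    rdd = sc.parallelize([1, 2, 3, 4])
    doubled = rdd.map(lambda x: x * 2)   # transformation: lazy, nothing runs yet

    print(doubled.collect())             # action: triggers evaluation -> [2, 4, 6, 8]
    print(doubled.count())               # action: returns 4

    spark.stop()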
Which element of Hadoop is responsible for spreading data across the cluster?
MapReduce
Under the MapReduce v1 programming model, what happens in the "Map" step?
The input data is split into independent chunks that are processed in parallel, with each map task producing intermediate key/value pairs.