
Big Data QCM 1 PDF

Ambari is a tool for managing Hadoop clusters that provides capabilities to monitor, manage, provision and integrate Hadoop systems. It has a user interface that allows checking the versions of installed software. Ambari uses RESTful APIs so that developers can integrate their applications with it. Managing users through the Ambari UI will also create those users on HDFS.
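As a rough illustration of the RESTful integration mentioned above, the sketch below builds (but does not send) an authenticated GET request for Ambari's cluster listing. The host name, the credentials, and the helper name `build_ambari_request` are hypothetical, and port 8080 is assumed to be the server's port; only the `/api/v1/clusters` path and the `X-Requested-By` header come from Ambari's documented REST API.

```python
import base64
import urllib.request


def build_ambari_request(host, user, password, path="/api/v1/clusters", port=8080):
    """Build, without sending, a GET request for an Ambari REST endpoint.

    host, user, password and port are illustrative assumptions.
    """
    url = f"http://{host}:{port}{path}"
    # HTTP Basic authentication: base64-encode "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", f"Basic {token}")
    # Ambari expects this header on modifying requests; it is harmless on a GET.
    req.add_header("X-Requested-By", "ambari")
    return req


# Hypothetical host and credentials, for illustration only.
req = build_ambari_request("ambari.example.com", "admin", "admin")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would perform the call against a reachable Ambari server; the same request can be issued from the command line with curl.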

Uploaded by

Chaimae Khaled
Copyright
© All Rights Reserved

1- Select all the components of HDP that provide data access capabilities.
• Pig
• Sqoop
• Flume
• MapReduce
• Hive

2- Select the components that provide the capability to move data from a relational database into Hadoop.
• Sql
• Sqoop
• Hive
• Kafka
• Flume

3- Managing Hadoop clusters can be accomplished using which component?
• Ambari
• HBase
• Phoenix
• Hive
• Sqoop

4- True or False: The following components are value-add from IBM: Big Replicate, Big SQL, BigIntegrate, BigQuality, Big Match
• TRUE
• FALSE

5- True or False: Data Science capabilities can be achieved using only HDP.
• TRUE
• FALSE

6- True or False: Ambari is backed by RESTful APIs for developers to easily integrate with their own applications.
• TRUE
• FALSE

7- Which Hadoop functionalities does Ambari provide?
• None of the above
• All of the above
• Monitor
• Manage
• Provision
• Integrate

8- Which page from the Ambari UI allows you to check the versions of the software installed on your cluster?
• Monitor page
• Integrate page
• The Admin > Manage Ambari page
• The Admin > Provision page

9- True or False: Creating users through the Ambari UI will also create the user on HDFS.
• TRUE
• FALSE

10- True or False: You can use curl commands to issue commands to Ambari.
• TRUE
• FALSE

11- True or False: Hadoop systems are designed for transaction processing.
This study source was downloaded by 100000813657135 from CourseHero.com on 01-09-2022 16:03:20 GMT -06:00
• TRUE
• FALSE

12- What is the default number of replicas in a Hadoop system?
• 5
• 4
• 3
• 2

13- True or False: One of the driving principles of Hadoop is that the data is brought to the program.
• TRUE
• FALSE

14- True or False: At least 2 Name Nodes are required for a standalone Hadoop cluster.
• TRUE
• FALSE

15- True or False: The phases in an MR job are Map, Shuffle, Reduce and Combiner.
• TRUE
• FALSE

16- Centralized handling of job control flow is one of the limitations of MR v1.
• TRUE
• FALSE

17- The Job Tracker in MR1 is replaced by which component(s) in YARN?
• ResourceMaster
• ApplicationMaster
• ApplicationManager
• ResourceManager

18- What are the benefits of using Spark? (Please select the THREE that apply)
• Generality
• Versatility
• Speed
• Ease of use

19- What are the languages supported by Spark? (Please select the THREE that apply)
• JavaScript
• HTML
• Python
• Java
• Scala

20- Resilient Distributed Dataset (RDD) is the primary abstraction of Spark.
• TRUE
• FALSE

21- What would you need to do in a Spark application that you would not need to do in a Spark shell to start using Spark?
• Extract the necessary libraries to load the SparkContext
• Export the necessary libraries to load the SparkContext
• Delete the necessary libraries to load the SparkContext
• Import the necessary libraries to load the SparkContext

22- True or False: A NoSQL database is designed for those who do not want to use SQL.
• TRUE
• FALSE

23- Which database is a columnar storage database?
• SQL
• Hive
• HBase

24- Which database provides a SQL for Hadoop interface?
• Hive
• Hadoop
• HBase

25- Which Apache project provides coordination of resources?
• Streams
• Spark
• Zeppelin
• ZooKeeper

26- What is ZooKeeper's role in the Hadoop infrastructure?
• Manage the coordination between HBase servers
• None of the above
• Hadoop and MapReduce use ZooKeeper to aid in high availability of Resource Manager
• All of the above
• Flume uses ZooKeeper for configuration purposes in recent releases

27- True or False: Slider provides an intuitive UI which allows you to dynamically allocate YARN resources.
• TRUE
• FALSE

28- True or False: Knox can provide all the security you need within your Hadoop infrastructure.
• TRUE
• FALSE

29- True or False: Sqoop is used to transfer data between Hadoop and relational databases.
• TRUE
• FALSE

30- True or False: For Sqoop to connect to a relational database, the JDBC JAR files for that database must be located in $SQOOP_HOME/bin.
• TRUE
• FALSE

31- True or False: Each Flume node receives data as "source", stores it in a "channel", and sends it via a "sink".
• TRUE
• FALSE

32- Through what HDP component are Kerberos, Knox, and Ranger managed?
• Zookeeper
• Ambari
• Apache Knox

33- Which security component is used to provide peripheral security?
• Apache Ranger
• Apache Camel
• Apache Knox

34- One of the governance issues that Hortonworks DataPlane Service (DPS) addresses is visibility over all of an organization's data across all of their environments (on-prem, cloud, hybrid) while making it easy to maintain consistent security and governance.
• TRUE
• FALSE
35- True or False: The typical sources of streaming data are Sensors, "Data exhaust" and high-rate transaction data.
• TRUE
• FALSE

36- What are the components of Hortonworks Data Flow (HDF)?
• Flow management
• Stream processing
• All of the above
• None of the above
• Enterprise services

37- True or False: NiFi is a disk-based, microbatch ETL tool that provides flow management.
• TRUE
• FALSE

38- True or False: MiNiFi is a complementary data collection tool that feeds collected data to NiFi.
• TRUE
• FALSE

39- What main features does IBM Streams provide as a Streaming Data Platform? (Please select the THREE that apply)
• Flow management
• Analysis and visualization
• Sensors

40- What are the three types of Big Data? (Please select the THREE that apply)
• Natural Language
• Semi-structured
• Graph-based
• Structured
• Machine-Generated
• Unstructured

41- What are the 4Vs of Big Data? (Please select the FOUR that apply)
• Veracity
• Velocity
• Variety
• Value
• Volume
• Visualization

42- What are the most important computer languages for Data Analytics? (Please select the THREE that apply)
• Scala
• HTML
• R
• SQL
• Python

43- True or False: GPUs are special-purpose processors that traditionally can be used to power graphical displays, but for Data Analytics lend themselves to faster algorithm execution because of the large number of independent processing cores.
• TRUE
• FALSE

44- True or False: Jupyter stores its workbooks in files with the .ipynb suffix. These files cannot be stored locally or on a hub server.
• TRUE
• FALSE
45- $BIGSQL_HOME/bin/bigsql start command is used to start Big SQL from the command line?
• TRUE
• FALSE

46- What are the two ways you can work with Big SQL? (Please select the TWO that apply)
• JQuery
• R
• JSqsh
• Web tooling from DSM

47- What is one of the reasons to use Big SQL?
• Want to access your Hadoop data without using MapReduce
• You want to learn new languages like MapReduce
• Has a steep learning curve because Big SQL uses standard 2011 query structure

48- Should you use the default STRING data type?
• Yes
• No

49- The BOOLEAN type is defined as SMALLINT SQL type in Big SQL.
• TRUE
• FALSE

50- Using the LOAD operation is the recommended method for getting data into your Big SQL table for best performance.
• TRUE
• FALSE

51- Which file storage format has the highest performance?
• Delimited
• Sequence
• RC
• Parquet
• Avro

52- What are the two ways to classify functions?
• Built-in functions
• Scalar functions
• User-defined functions
• None of the above

53- True or False: UMASK is used to determine permissions on directories and files.
• TRUE
• FALSE

54- True or False: You can only Kerberize a Big SQL server before it is installed.
• TRUE
• FALSE

55- True or False: Authentication with Big SQL only occurs at the Big SQL layer or the client's application layer.
• TRUE
• FALSE

56- True or False: Ranger and impersonation work well together.
• TRUE
• FALSE

57- True or False: RCAC can hide rows and columns.
• TRUE
• FALSE
58- True or False: Nicknames can be used for wrappers and servers.
• TRUE
• FALSE

59- True or False: Server objects define the properties and values of the connection.
• TRUE
• FALSE

60- True or False: The purpose of a wrapper is to provide a library of routines that doesn't communicate with the data source.
• TRUE
• FALSE

61- True or False: User mappings are used to authenticate to the remote data source.
• TRUE
• FALSE

62- True or False: Collaboration with Watson Studio is an optional add-on component that must be purchased.
• TRUE
• FALSE

63- True or False: Watson Studio is designed only for Data Scientists, other personas would not know how to use it.
• TRUE
• FALSE

64- True or False: Community provides access to articles, tutorials, and even data sets that you can use.
• TRUE
• FALSE

65- True or False: You can import visualization libraries into Watson Studio.
• TRUE
• FALSE

66- True or False: Collaborators can be given certain access levels.
• TRUE
• FALSE

67- True or False: Watson Studio contains Zeppelin as a notebook interface.
• TRUE
• FALSE

68- Spark is developed in which language?
• Java
• Scala
• Python
• R

69- In Spark Streaming, the data can come from which sources?
• Kafka
• Flume
• Kinesis
• All of the above

70- Apache Spark has APIs in
• Java
• Scala
• Python
• All of the above

71- Which of the following is not a component of the Spark ecosystem?
• Sqoop
• GraphX
• MLlib
• BlinkDB
