
Lab 1: Accessing Cloudera Distribution For Hadoop (Vmware & Cluster Environment)

The document provides instructions for accessing a Cloudera Hadoop environment using either a single machine virtual machine (VM) setup or a cluster environment. It describes how to download and run the Cloudera VM, access the Cloudera Manager interface to view YARN applications and HUE to explore HDFS and Hive. For the cluster option, it outlines the server architecture and how to connect to the edge node via Putty to access Cloudera Manager, HUE, HDFS and Hive.

Uploaded by

Ahmad Hazzeem

Lab 1: Accessing Cloudera Distribution for Hadoop (VMware & Cluster Environment)

Read on:

https://www.edureka.co/blog/cloudera-hadoop-tutorial/

Accessing Cloudera Hadoop

Methods:
A: Cloudera Quick Start VM (Single Machine)

B: Cluster Environment (Server BDLab) *

* THIS SEMESTER'S LAB SESSIONS USE THE CLUSTER ENVIRONMENT

A: Cloudera Quick Start VM (Single Machine) – Installation Guide & Downloads

Installation

https://ugc.futurelearn.com/uploads/files/3c/c9/3cc92360-1155-4eee-8d59-4c7c0e3d192c/Instructions_Installing_Cloudera.pdf

Note: You can follow this guide to install Cloudera, except for its VM download link, which is no longer available. Use the links given below to download the respective VM.

The following external links give an overview of the Cloudera VM:

https://www.coursera.org/learn/hadoop/lecture/oPPuR/exploring-the-cloudera-vm-hands-on-part-1
https://www.edureka.co/blog/cloudera-hadoop-tutorial/

Download VM

Note: The size of the VM is approximately 5 GB.

If you choose VirtualBox as the virtualization tool, download the Cloudera VM from:
https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.12.0-0-virtualbox.zip

If you choose VMware as the virtualization tool, download the Cloudera VM from:
https://downloads.cloudera.com/demo_vm/vmware/cloudera-quickstart-vm-5.12.0-0-vmware.zip
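Since the archive is roughly 5 GB and still has to be unpacked, it can help to confirm free disk space before downloading. The following is a minimal Python sketch; the 15 GiB default is an assumed allowance for the zip plus the extracted VM, not a figure from this lab.

```python
import shutil

GIB = 1024 ** 3  # bytes in one GiB


def has_free_space(path: str, needed_gib: float = 15.0) -> bool:
    """Return True if the drive holding `path` has at least `needed_gib` GiB free.

    The 15 GiB default is an assumption (zip + extracted VM), not a lab requirement.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= needed_gib * GIB


if __name__ == "__main__":
    # Check the current directory; pass your download folder instead if different.
    print("Enough space:", has_free_space("."))
```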

Desktop

Once the VM is running in your chosen virtualization tool, you should get the following:

Note: Make sure you have allocated at least 2 processors and at least 8 GB of memory to the VM in VMware or VirtualBox (depending on the virtualization tool you selected).

Terminal

You can open a terminal by clicking the icon at the top right:

After you click the icon, you should see the following:


Browser

You should see the following when you launch the browser:

It states that this Cloudera platform has:

1 NameNode/Manager Node
1 DataNode/Worker Node

Cloudera Home Directory

Accessing Cloudera Manager

Then, type cloudera as both the username and the password:


Starting Hue (Hadoop User Experience) from Cloudera Manager

Check out the Hue interface (use cloudera as both the username and the password):


Viewing HDFS via HUE
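The same directory listing that HUE shows can also be requested over WebHDFS, the REST interface of HDFS. The sketch below builds a LISTSTATUS request and parses its JSON reply. The host name `quickstart.cloudera` and port `50070` used in the example comment are assumptions about the QuickStart VM defaults, so adjust them if your VM differs.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def liststatus_url(host: str, port: int, hdfs_path: str, user: str) -> str:
    """Build a WebHDFS LISTSTATUS URL; `hdfs_path` must start with '/'."""
    return (f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}"
            f"?op=LISTSTATUS&user.name={quote(user)}")


def parse_liststatus(body: str) -> list:
    """Turn a LISTSTATUS JSON reply into (type, name, length) tuples."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(s["type"], s["pathSuffix"], s["length"]) for s in statuses]


def list_hdfs(host: str, port: int, hdfs_path: str, user: str) -> list:
    """Fetch and parse a directory listing (needs a reachable NameNode)."""
    with urlopen(liststatus_url(host, port, hdfs_path, user), timeout=10) as resp:
        return parse_liststatus(resp.read().decode("utf-8"))


# Example call from inside the VM (host and port are assumed defaults):
# list_hdfs("quickstart.cloudera", 50070, "/user/cloudera", "cloudera")
```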

B: Cluster Environment (Server BDLab)

Architecture of Server Environment

In this architecture, six virtual machines form a cluster.
Five virtual machines are dedicated to Cloudera Hadoop.
One virtual machine is dedicated to the edge node (and also hosts the RapidMiner server).
Of the five Hadoop virtual machines, one is the master node and the other four are data nodes.
All the main services, such as Cloudera Manager, HDFS, MariaDB, Hive Server, Hue Server, Spark Server, YARN and Python, are installed on the master node.
The data nodes (worker nodes) run the HDFS DataNode, the YARN NodeManager and Python.
The edge node hosts the RapidMiner server, Radoop, Jupyter Notebook, and the gateway for accessing HDFS, Hive, Spark, Sqoop and YARN.
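The layout described above can be written down as a small data structure, which makes the node counts easy to check. This is only an illustrative sketch: the node names (`master`, `worker1`, `edge`, and so on) are placeholders, and only the roles, counts and service lists come from the description.

```python
from collections import Counter

# Service lists taken from the architecture description.
MASTER_SERVICES = ["Cloudera Manager", "HDFS", "MariaDB", "Hive Server",
                   "Hue Server", "Spark Server", "YARN", "Python"]
WORKER_SERVICES = ["HDFS DataNode", "YARN NodeManager", "Python"]
EDGE_SERVICES = ["RapidMiner Server", "Radoop", "Jupyter Notebook",
                 "Gateway (HDFS, Hive, Spark, Sqoop, YARN)"]


def build_cluster(n_workers: int = 4) -> dict:
    """Return a name -> {role, services} map; node names are placeholders."""
    cluster = {"master": {"role": "master", "services": MASTER_SERVICES}}
    for i in range(1, n_workers + 1):
        cluster[f"worker{i}"] = {"role": "worker", "services": WORKER_SERVICES}
    cluster["edge"] = {"role": "edge", "services": EDGE_SERVICES}
    return cluster


def role_counts(cluster: dict) -> Counter:
    """Count how many nodes play each role."""
    return Counter(node["role"] for node in cluster.values())


if __name__ == "__main__":
    cluster = build_cluster()
    print(len(cluster), "nodes:", dict(role_counts(cluster)))
```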

Getting into the UiTM Network (from outside the UiTM network)

Each student will be given access to a PC in the big data lab.
You will need the AnyDesk software to access the PC.
An ID and password will be given.

Accessing Server (via PuTTY)

Prerequisite: You can access a PC in the big data lab.

Details: Using PuTTY

Step 1: Fill in the host name and click Open.

Step 2: Type in the given password. (Note: the cursor won't move while you are typing the password.)

If you have typed the password correctly, you will be prompted with the following:
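PuTTY is an SSH client, so on a machine with the OpenSSH `ssh` command available the same login can be made from a terminal. The sketch below only assembles the command line and does not run it. The user name and host in the example are hypothetical placeholders, because the actual host name and credentials are handed out in the lab.

```python
import shlex


def ssh_command(user: str, host: str, port: int = 22) -> list:
    """Build the argument list for an OpenSSH login to the given host."""
    return ["ssh", "-p", str(port), f"{user}@{host}"]


if __name__ == "__main__":
    # "student01" and "edge-node.example" are placeholders, not real lab values.
    cmd = ssh_command("student01", "edge-node.example")
    print(shlex.join(cmd))
```

As with PuTTY, `ssh` does not echo the password while you type it.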

Accessing Cloudera Manager

Prerequisite: You can access a PC in the big data lab.

Details: Accessing Cloudera Manager

Step 1: In a browser, go to http://10.5.19.231:7180/cmf/login

Step 2: Enter the given username and password and click Sign In. You should get the following:

Step 3: Check for YARN applications. Go to Clusters > YARN Applications.

If you have executed a MapReduce application earlier (see Executing MapReduce Program), you can view it as follows:
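The list of YARN applications is also exposed by the YARN ResourceManager REST API at `/ws/v1/cluster/apps`. The sketch below builds that URL and summarizes the JSON reply. The ResourceManager host is not given in this lab, so `rm_host` is a placeholder you would need to fill in. Port `8088` is the stock Hadoop default and is assumed here; the cluster may use a different port or require authentication.

```python
import json
from urllib.request import Request, urlopen


def apps_url(rm_host: str, rm_port: int = 8088) -> str:
    """URL of the ResourceManager application list (8088 is an assumed default)."""
    return f"http://{rm_host}:{rm_port}/ws/v1/cluster/apps"


def summarize_apps(body: str) -> list:
    """Return (id, name, state, finalStatus) for each application in the reply."""
    # With no applications the API returns {"apps": null}, hence the `or {}`.
    apps = (json.loads(body).get("apps") or {}).get("app", [])
    return [(a["id"], a["name"], a["state"], a["finalStatus"]) for a in apps]


def fetch_apps(rm_host: str, rm_port: int = 8088) -> list:
    """Fetch and summarize the application list (needs network access to the RM)."""
    req = Request(apps_url(rm_host, rm_port), headers={"Accept": "application/json"})
    with urlopen(req, timeout=10) as resp:
        return summarize_apps(resp.read().decode("utf-8"))
```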
Accessing HUE

Prerequisite: None

Details: Accessing HUE

Step 1: In a browser, go to https://bigdatalab-rm-en1.uitm.edu.my:8889/

You should get:

Step 2: Enter the given username and password. You should get the following:

Step 3: You can view the HDFS contents as follows:

Step 4: You can view a Hive table as follows:
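If one of the web pages does not load, a quick way to tell a network problem from a login problem is to test whether the host and port accept TCP connections at all. The sketch below does this for the Cloudera Manager and HUE addresses given in this lab. It only opens and closes a connection and does not log in; the 3-second timeout is an arbitrary choice.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

# Addresses taken from the steps in this lab.
CLOUDERA_MANAGER_URL = "http://10.5.19.231:7180/cmf/login"
HUE_URL = "https://bigdatalab-rm-en1.uitm.edu.my:8889/"


def host_port(url: str) -> tuple:
    """Extract (host, port) from a URL, falling back to the scheme default."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme]


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: is_reachable(CLOUDERA_MANAGER_URL) is expected to succeed only from
# a machine that can reach the lab network, per the prerequisite above.
```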
