Introduction To HDFS
Contents
Lab 1  Exploring Hadoop Distributed File System ........ 4
    1.1  Getting Started ........ 5
    1.2  Exploring Hadoop Distributed File System (Terminal) ........ 9
        1.2.1  Using the command line Interface ........ 9
    1.3  Exploring Hadoop Distributed File System (Web Console) ........ 15
        1.3.1  Using the Web Console ........ 15
        1.3.2  Working with the Welcome page ........ 16
        1.3.3  Administering BigInsights ........ 18
        1.3.4  Inspecting the status of your cluster ........ 18
        1.3.5  Starting and stopping a component ........ 19
        1.3.6  Working with Files ........ 20
    1.4  Summary ........ 24
IBM Software
Technically, Hadoop consists of two key services: data storage using the Hadoop Distributed File System
(HDFS) and large-scale parallel data processing using a technique called MapReduce.
This version of the lab was designed using the InfoSphere BigInsights 2.1 Quick Start Edition.
Throughout this lab you will use the following account login information:

Username: biadmin
Password: biadmin
__1. Start the VMware image by clicking the Play virtual machine button in VMware Player, if it is not already running.
__2. Log in to the VMware virtual machine using the following credentials.
User: biadmin
Password: biadmin
Hands-on-Lab Page 5
__3. After you log in, your screen should look similar to the one below.
Before we can start working with the Hadoop Distributed File System, we must first start all the BigInsights components. There are two ways of doing this: through the terminal, or by simply double-clicking an icon. Both methods are shown in the following steps.
__4. Now open the terminal by double clicking the BigInsights Shell icon.
__6. Once the terminal has been opened, change to the $BIGINSIGHTS_HOME/bin directory (where $BIGINSIGHTS_HOME by default is /opt/ibm/biginsights):
cd $BIGINSIGHTS_HOME/bin
or
cd /opt/ibm/biginsights/bin
__7. Start the Hadoop components (daemons) on the BigInsights server. You can start all components with the command below. Please note that it will take a few minutes to run.
./start-all.sh
__8. Sometimes certain Hadoop components may fail to start. You can start and stop the failed components one at a time by using start.sh and stop.sh, respectively. For example, to start and stop Hive use:
./start.sh hive
./stop.sh hive
Notice that since Hive did not initially fail, the terminal is telling us that Hive is already running.
__9. Once all components have started successfully you may move on.
__10. If you would like to stop all components, execute the command below. However, for this lab, please leave all components started.
./stop-all.sh
Next, let us look at how to start all the components by double-clicking an icon.
__11. Double-clicking the Start BigInsights icon executes a script that performs the steps described above. Once all components are started, the terminal exits and you are set. Simple.
__12. We can stop the components in a similar manner, by double-clicking the Stop BigInsights icon (to the right of the Start BigInsights icon).
Now that all components are started, you may move on to the next section.
1.2 Exploring Hadoop Distributed File System (Terminal)
1. You can use the command-line approach and invoke the FileSystem (fs) shell using the format:
hadoop fs <args>
2. You can also manipulate HDFS using the BigInsights Web Console.
We will start with the hadoop fs -ls command, which returns the list of files and directories with
permission information.
Ensure the Hadoop components are all started, and from the same terminal window as before (logged in as biadmin), follow these instructions:
hadoop fs -ls
or
hadoop fs -ls /user/biadmin
Note that in the first command no directory was referenced, but it is equivalent to the second command, where /user/biadmin is explicitly specified. Each user gets their own home directory under /user. For example, in the case of user biadmin, the home directory is /user/biadmin. Any command with no explicit directory specified is interpreted relative to the user's home directory. User space in the native file system (Linux) is generally found under /home/biadmin or /usr/biadmin, but in HDFS user space is /user/biadmin (spelled "user" rather than "usr").
__3. To create the directory test you can issue the following command:
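The command itself appeared as a screenshot in the original lab; the standard form, assuming the directory is created under your HDFS home directory (/user/biadmin), is:

```shell
# Create a directory named "test"; the relative path resolves
# against the user's HDFS home directory, /user/biadmin
hadoop fs -mkdir test

# List the home directory to confirm the new "test" entry
hadoop fs -ls
```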
The result of ls here is similar to that found with Linux, except for the second column (in this case
either “1” or “-“). The “1” indicates the replication factor (generally “1” for pseudo-distributed
clusters and “3” for distributed clusters); directory information is kept in the namenode and thus
not subject to replication (hence “-“).
To use HDFS commands recursively, you generally add an "r" to the HDFS command.
__5. For example, to do a recursive listing, use the -lsr command rather than just -ls, as in the example below.
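A recursive listing of the home directory would then look like this (note that in later Hadoop releases -lsr is deprecated in favor of -ls -R):

```shell
# Recursively list all files and directories under /user/biadmin
hadoop fs -lsr
```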
__6. You can pipe (using the | character) any HDFS command to be used with the Linux shell. For
example, you can easily use grep with HDFS by doing the following.
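A minimal sketch of such a pipeline, filtering the listing for entries containing "test":

```shell
# Pipe the HDFS listing through the Linux grep command,
# keeping only the lines that contain the string "test"
hadoop fs -ls | grep test
```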
As you can see, grep returned only the lines containing "test" (removing the "Found x items" line and the other directories from the listing).
__7. To move files between your regular Linux file system and HDFS, you can use the put and get commands. For example, move the text file README to the Hadoop file system:
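The exact paths were shown in the original screenshot; assuming README sits in biadmin's Linux home directory, the command would look like:

```shell
# Copy the local file README into HDFS; the bare destination name
# places it in the user's HDFS home directory, /user/biadmin
hadoop fs -put /home/biadmin/README README
```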
You should now see a new file called /user/biadmin/README listed as shown above.
__8. In order to view the contents of this file use the -cat command as follows:
You should see the output of the README file (that is stored in HDFS). We can also use the Linux diff command to check whether the file we put in HDFS is actually the same as the original on the local filesystem.
cd /home/biadmin/
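The diff invocation itself was not preserved in this copy; one common form streams the HDFS copy and compares it with the local original, using bash process substitution:

```shell
# Compare the HDFS copy of README with the local original;
# no output means the two files are identical
diff <(hadoop fs -cat README) README
```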
Since the diff command produces no output, we know the files are the same (diff prints the lines in the files that differ).
To find the size of files you need to use the -du or -dus commands. Keep in mind that these
commands return the file size in bytes.
__10. To find the size of the README file use the following command.
__11. To find the size of all files individually in the /user/biadmin directory use the following command:
__12. To find the size of all files in total of the /user/biadmin directory use the following command.
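The commands for steps 10 through 12 follow the same pattern (sizes are reported in bytes):

```shell
# Step 10: size of the single README file
hadoop fs -du README

# Step 11: size of each file under /user/biadmin, listed individually
hadoop fs -du /user/biadmin

# Step 12: combined total size of everything under /user/biadmin
hadoop fs -dus /user/biadmin
```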
__13. If you would like to get more information about hadoop fs commands, invoke -help as follows.
hadoop fs -help
__14. For specific help on a command, add the command name after -help. For example, to get help on the dus command you'd do the following.
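For example, help for the dus command:

```shell
# Show usage information for the dus command only
hadoop fs -help dus
```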
We are now done with the terminal section; you may close the terminal.
1.3 Exploring Hadoop Distributed File System (Web Console)
__1. Start the Web Console by double-clicking on the BigInsights WebConsole icon.
__2. Verify that your Web console appears similar to this, and note each section:
Tasks: quick access to popular BigInsights tasks.
Quick Links: internal and external links and downloads to enhance your environment.
Learn More: online resources available to learn more about BigInsights.
This section introduces you to the Web console's main page displayed through the Welcome tab. The
Welcome page features links to common tasks, many of which can also be launched from other areas of
the console. In addition, the Welcome page includes links to popular external resources, such as the
BigInsights InfoCenter (product documentation) and community forum. You'll explore several aspects of
this page.
__3. In the Welcome tab, the Tasks pane allows you to quickly access common tasks. Select the View, start or stop a service task. If necessary, scroll down.
__4. This takes you to the Cluster Status tab. Here you can stop and start Hadoop services, as well as gain additional information, as shown in the next section.
__5. Click on the Welcome tab to return back to the main page.
__6. Inspect the Quick Links pane at top right and use its vertical scroll bar (if necessary) to become
familiar with the various resources accessible through this pane. The first several links simply
activate different tabs in the Web console, while subsequent links enable you to perform set-up
functions, such as adding BigInsights plug-ins to your Eclipse development environment.
__7. Inspect the Learn More pane at lower right. Links in this area access external Web resources that you may find useful, such as the Accelerator demos and documentation, the BigInsights InfoCenter, a public discussion forum, IBM support, and IBM's BigInsights product site. If desired, click on one or more of these links to see what's available to you.
The Web console allows administrators to inspect the overall health of the system as well as perform
basic functions, such as starting and stopping specific servers (or components), adding nodes to the
cluster, and so on. You’ll explore a subset of these capabilities here.
__8. Click on the Cluster Status tab at the top of the page to return to the Cluster Status window.
__9. Inspect the overall status of your cluster. The figure below was taken on a single-node cluster that had several services running. One service, Monitoring, was unavailable. (If you installed and started all BigInsights services on your cluster, your display will show all services running.)
__10. Click on the Hive service and note the detailed information provided for this service in the pane at right. From here, you can start or stop the Hive service (or any service you select) depending on your needs. For example, you can see the URL for Hive's Web interface and its process ID.
__11. Optionally, cut-and-paste the URL for Hive’s Web interface into a new tab of your browser.
You'll see an open source tool provided with Hive for administration purposes, as shown below.
__12. Close this tab and return to the Cluster Status section of the BigInsights Web console.
__14. In the pane at right (which displays the Hive status), click the red Stop button to stop the service.
__15. When prompted to confirm that you want to stop the Hive service, click OK and wait for the operation to complete. The right pane should appear similar to the following image.
__16. Restart the Hive service by clicking on the green arrow just beneath the Hive Status heading.
(See the previous figure.) When the operation completes, the Web console will indicate that
Hive is running again, likely under a process ID that differs from the earlier Hive process ID
shown at the beginning of this lab module. (You may need to use the Refresh button of your
Web browser to reload information displayed in the left pane.)
The Files tab of the console enables you to explore the contents of your file system, create new
subdirectories, upload small files for test purposes, and perform other file-related functions. In this
module, you’ll learn how to perform such tasks against the Hadoop Distributed File System (HDFS) of
BigInsights.
__17. Click on the Files tab of the console to begin exploring your distributed file system.
__18. Expand the directory tree shown in the pane at left (/user/biadmin). If you already uploaded
files to HDFS, you’ll be able to navigate through the directory to locate them.
__19. Become familiar with the functions provided through the icons at the top of this pane, as we'll refer to some of them in subsequent sections of this module. Simply point your cursor at an icon to learn its function. From left to right, the icons enable you to copy a file or directory, move a file, create a directory, rename, upload a file to HDFS, download a file from HDFS to your local file system, delete a file from HDFS, set permissions, open a command window to launch HDFS shell commands, and refresh the Web console page.
__20. Position your cursor on the /user/biadmin directory and click the Create Directory icon to create a subdirectory for test purposes.
__21. When a pop-up window appears prompting you for a directory name, enter ConsoleLab and click OK.
__22. Expand the directory hierarchy to verify that your new subdirectory was created.
__25. Click the Move icon. When the pop-up Move screen appears, select the ConsoleLab directory and click OK.
__26. Using the Set Permissions icon, you can change the permission settings for your directory. When finished, click OK.
__27. With the ConsoleLabTest2 folder highlighted, select the Remove icon and delete the directory.
__28. Remain in the ConsoleLab directory, and click the Upload icon to upload a small sample file for
test purposes.
__29. When the pop-up window appears, click the Browse button to browse your local file system for a
sample file.
__30. Navigate through your local file system to the directory where BigInsights was installed. For the IBM-provided VMware image, BigInsights is installed under /opt/ibm/biginsights. Locate the …/IHC subdirectory and select the CHANGES.txt file. Click Open.
__31. Verify that the window displays the name of this file. Note that you can continue to browse for additional files to upload, and that you can remove files from the displayed upload list. However, for this exercise, simply click OK.
__32. When the upload completes, verify that the CHANGES.txt file appears in the directory tree at left. If it is not immediately visible, click the refresh button. On the right, you should see a subset of the file's contents displayed in text format.
__33. Highlight the CHANGES.txt file in your ConsoleLab directory and click the Download button.
__34. When prompted, click the Save File button. Then select OK.
__35. If Firefox is set as the default browser, the file will be saved to your user's Downloads directory. For this exercise, the default location is fine.
1.4 Summary
Congratulations! You're now familiar with the Hadoop Distributed File System. You now know how to manipulate files within it using both the terminal and the BigInsights Web Console. You may move on to the next unit.