
Run Word Count - Hive Job On EMR - V1 - Reviewed - Sks - Lab Guides

Uploaded by

Aniket Sonale

Big Data

Run Hive Job on EMR – Demo

Table of Contents

Steps to create EMR Cluster – Demo

Step 1: Select the S3 service.
Step 2: Click Create bucket.
Step 3: Write the Bucket name. Click Create.
Step 4: Click the bucket name.
Step 5: Click Create folder.
Step 6: Type the folder name. Click Save.
Step 7: Select the EMR service.
Step 8: Click Create clusters.
Step 9: Type the Cluster name. Click the folder icon.
Step 10: Select the S3 bucket created earlier. Click Select.
Step 11: Choose the latest version.
Step 12: Choose the instance type as “m4.large”. Choose the number of instances as per your requirement. Enter the EC2 key pair. Click Create cluster.
Step 13: Check the cluster status.
Step 14: Go to EC2 service. Three instances are created automatically.
Step 15: Click the master node Security group.
Step 16: Click the Inbound tab. Click Edit.
Step 17: Click Add Rule button.
Step 18: Add “SSH” and make it anywhere. Click Save.
Step 19: SSH your instance.


Steps to run Hive Job on EMR – Demo


Prerequisite: An EMR cluster and an S3 bucket created as described in “Steps to create EMR Cluster – Demo”.

Step 1: Click the cluster you created earlier.

Step 2: Click the Steps tab. Click the “Add Step” button.


Step 3: Select the step type “Hive program” and give the step a name. Enter the Script S3 location and the Input S3 location.
Script location: s3://us-east-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q

Input location: s3://us-east-1.elasticmapreduce.samples

Step 4: Leave the EMR tab open. Open the S3 service in a new browser tab and click the bucket you created earlier.


Step 5: Click Create folder.

Step 6: Type the folder name (“outputs” in this demo). Click Save.
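
The folder from Steps 5 and 6 can also be created from a script. S3 has no real directories; the console shows a "folder" for a zero-byte object whose key ends in a slash. The sketch below is a minimal illustration under that assumption. The helper name `create_s3_folder` is hypothetical, and the client passed in is assumed to be a boto3 S3 client with credentials already configured.

```python
def create_s3_folder(s3_client, bucket, folder):
    """Create a console-style S3 "folder": a zero-byte object whose key ends in "/"."""
    # Normalise the name so "outputs", "/outputs" and "outputs/" all give "outputs/".
    key = folder.strip("/") + "/"
    s3_client.put_object(Bucket=bucket, Key=key, Body=b"")
    return key


# Usage sketch (assumes boto3 is installed; the bucket name is a placeholder):
#   import boto3
#   create_s3_folder(boto3.client("s3"), "my-emr-lab-bucket", "outputs")
```

Passing the client in as a parameter keeps the helper easy to try out with a stand-in object before pointing it at a real bucket.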


Step 7: Return to the EMR service tab. Click the folder icon for the output S3 location.

Step 8: Select the folder “outputs” from the bucket. Click Select.
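
The values entered in Steps 3 to 8 (script, input, and output locations) can also be expressed as a step definition and submitted through the EMR API. The sketch below assumes a boto3 EMR client and an EMR release that runs Hive scripts through `command-runner.jar`; check the argument list against the documentation for your release. The helper names, the step name, and the `CONTINUE` failure action are illustrative choices, not settings taken from this guide.

```python
SCRIPT = "s3://us-east-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q"
INPUT = "s3://us-east-1.elasticmapreduce.samples"


def build_hive_step(output_uri, name="Hive program"):
    """Return a step definition in the shape add_job_flow_steps expects."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",  # assumption: keep the cluster running on failure
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hive-script", "--run-hive-script", "--args",
                "-f", SCRIPT,                    # Script S3 location
                "-d", f"INPUT={INPUT}",          # Input S3 location
                "-d", f"OUTPUT={output_uri}",    # Output S3 location (the outputs folder)
            ],
        },
    }


def submit_hive_step(emr_client, cluster_id, output_uri):
    """Submit the Hive step to an existing cluster and return the new step ID."""
    response = emr_client.add_job_flow_steps(
        JobFlowId=cluster_id, Steps=[build_hive_step(output_uri)]
    )
    return response["StepIds"][0]


# Usage sketch (cluster ID and bucket name are placeholders):
#   import boto3
#   submit_hive_step(boto3.client("emr"), "j-XXXXXXXXXXXXX",
#                    "s3://my-emr-lab-bucket/outputs")
```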


Step 9: Check the cluster status and wait for the Hive step to finish.
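
Instead of refreshing the console, the status check in Step 9 can be done by polling the step through the EMR API. This is a minimal sketch assuming a boto3 EMR client; the helper name `wait_for_step` and the 30-second interval are illustrative, and the cluster and step IDs must come from your own account.

```python
import time

# States after which a step will not change any further.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED", "INTERRUPTED"}


def wait_for_step(emr_client, cluster_id, step_id, poll_seconds=30, sleep=time.sleep):
    """Poll describe_step until the step reaches a terminal state; return that state."""
    while True:
        response = emr_client.describe_step(ClusterId=cluster_id, StepId=step_id)
        state = response["Step"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)  # wait before asking again


# Usage sketch (IDs are placeholders):
#   import boto3
#   wait_for_step(boto3.client("emr"), "j-XXXXXXXXXXXXX", "s-XXXXXXXXXXXXX")
```

Only a returned state of `COMPLETED` means the output folder is ready to inspect; any other terminal state points to the step logs.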

Step 10: In the S3 tab, select the bucket you created earlier. Open the outputs folder.


Step 11: Open the os_requests folder.

Step 12: Download the output file. Open it in a text editor such as Notepad.


Step 13: Check the file. It should list each operating system with its request count.
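
The downloaded file can also be checked from a script rather than by eye. The sketch below assumes the file has two columns (operating system name, request count) separated by Hive's default field delimiter, Ctrl-A (`\x01`), which a plain text editor may not display. The function name `parse_os_requests` and the local file name in the usage comment are hypothetical.

```python
def parse_os_requests(text, delimiter="\x01"):
    """Parse two-column Hive text output (OS name, request count) into a dict."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last delimiter so an OS name containing it stays intact.
        os_name, count = line.rsplit(delimiter, 1)
        counts[os_name] = int(count)
    return counts


# Usage sketch (the file name is a placeholder for the downloaded output file):
#   with open("000000_0", encoding="utf-8") as handle:
#       for os_name, count in sorted(parse_os_requests(handle.read()).items()):
#           print(os_name, count)
```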
