CCS334 BDA Lab Manual
DEPARTMENT OF ARTIFICIAL INTELLIGENCE & DATA SCIENCE
OUTCOMES:
After the completion of this course, students will be able to:
CO1: Describe big data and use cases from selected business domains.
CO2: Explain NoSQL big data management.
CO3: Install, configure, and run Hadoop and HDFS.
CO4: Perform map-reduce analytics using Hadoop.
CO5: Use Hadoop-related tools such as HBase, Cassandra, Pig, and Hive for big data
analytics.
SOFTWARE REQUIRED:
Cassandra, Hadoop, Java, Pig, Hive and HBase.
TABLE OF CONTENTS
S.NO.  TITLE OF THE EXPERIMENTS                                PAGE NO.
5.     Installation of Hive along with Practice Examples.           24
7.     Installation of Thrift.                                       30
EX. No. 1: DOWNLOADING AND INSTALLING HADOOP; UNDERSTANDING
DIFFERENT HADOOP MODES. STARTUP SCRIPTS, CONFIGURATION
FILES.
VIRTUAL BOX (for Linux): used to install and run the guest operating system.
OPERATING SYSTEM: You can install Hadoop on Windows or on Linux-based
operating systems. Ubuntu and CentOS are very commonly used.
JAVA: You need to install the Java 8 (JDK 1.8) package on your system.
HADOOP: You require a recent Hadoop release; this manual uses Hadoop 3.3.0.
1. Install Java
Download the Java JDK from:
https://www.oracle.com/java/technologies/javase-jdk8-downloads.html
Extract and install Java in C:\Java
Open cmd and verify the installation: javac -version
2. Download Hadoop
https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz
Extract it so that Hadoop resides in C:\Hadoop-3.3.0 (the path used in the remaining steps).
3. Set the JAVA_HOME environment variable
4. Set the HADOOP_HOME environment variable
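These variables can also be set from the command line; a minimal sketch, assuming Java is installed in C:\Java and Hadoop is extracted to C:\Hadoop-3.3.0 as above (setx values take effect in a newly opened cmd window):
setx JAVA_HOME "C:\Java"
setx HADOOP_HOME "C:\Hadoop-3.3.0"
Also add C:\Java\bin, C:\Hadoop-3.3.0\bin and C:\Hadoop-3.3.0\sbin to the Path variable (System Properties > Environment Variables).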
5. Configurations
Edit the file C:\Hadoop-3.3.0\etc\hadoop\core-site.xml, paste the following XML and save the file.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
======================================================
Rename "mapred-site.xml.template" to "mapred-site.xml" (if the .template file is present), edit the file C:\Hadoop-3.3.0\etc\hadoop\mapred-site.xml, paste the following XML and save the file.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
======================================================
Create folder “data” under “C:\Hadoop-3.3.0”
Create folder “datanode” under “C:\Hadoop-3.3.0\data”
Create folder “namenode” under “C:\Hadoop-3.3.0\data”
======================================================
Edit file C:\Hadoop-3.3.0/etc/hadoop/hdfs-site.xml,
paste xml code and save this file.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/hadoop-3.3.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/hadoop-3.3.0/data/datanode</value>
</property>
</configuration>
======================================================
Edit file C:/Hadoop-3.3.0/etc/hadoop/yarn-site.xml,
paste xml code and save this file.
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
======================================================
Edit the file C:\Hadoop-3.3.0\etc\hadoop\hadoop-env.cmd and replace the line
set JAVA_HOME=%JAVA_HOME%
with
set JAVA_HOME=C:\Java
then save the file.
6. Hadoop Configurations
Download
https://github.com/brainmentorspvtltd/BigData_RDE/blob/master/Hadoop%20Configuration.zip
or (for hadoop 3)
https://github.com/s911415/apache-hadoop-3.1.0-winutils
Copy the bin folder from the download and replace the existing bin folder at C:\Hadoop-3.3.0\bin
Format the NameNode
Open cmd and type the command: hdfs namenode -format
7. Testing
Open cmd and change directory to C:\Hadoop-3.3.0\sbin
type start-all.cmd
Open: http://localhost:8088 (YARN ResourceManager web UI)
Open: http://localhost:9870 (HDFS NameNode web UI)
EX. No : 2 HADOOP IMPLEMENTATION OF FILE MANAGEMENT TASKS, SUCH AS
ADDING FILES AND DIRECTORIES, RETRIEVING FILES AND
DELETING FILES
Example:
hadoop fs -cat /user/saurzcode/dir1/abc.txt
Example:
hadoop fs -cp /user/saurzcode/dir1/abc.txt /user/saurzcode/dir2
copyFromLocal
Usage:
hadoop fs -copyFromLocal <localsrc> URI
Example:
hadoop fs -copyFromLocal /home/saurzcode/abc.txt /user/saurzcode/abc.txt
Similar to the put command, except that the source is restricted to a local file reference.
copyToLocal
Usage:
hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
Similar to the get command, except that the destination is restricted to a local file reference.
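The experiment also covers adding, retrieving and deleting files and directories; a short sketch of the corresponding commands, assuming the same /user/saurzcode paths used above:
hadoop fs -mkdir /user/saurzcode/dir1
hadoop fs -put /home/saurzcode/abc.txt /user/saurzcode/dir1
hadoop fs -get /user/saurzcode/dir1/abc.txt /home/saurzcode/
hadoop fs -rm /user/saurzcode/dir1/abc.txt
hadoop fs -rm -r /user/saurzcode/dir1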
EX. No : 3 IMPLEMENTATION OF MATRIX MULTIPLICATION WITH HADOOP MAP REDUCE
AIM:-
To write a Map Reduce Program that implements Matrix Multiplication.
PROCEDURE:
We assume that the input matrices are already stored in Hadoop Distributed File System
(HDFS) in a suitable format (e.g., CSV, TSV) where each row represents a matrix element. The
matrices are compatible for multiplication (the number of columns in the first matrix is equal
to the number of rows in the second matrix).
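For instance, with this (row, column, value) layout a small 2 x 2 matrix could be stored one element per line as follows (the values are purely illustrative):
0,0,1
0,1,2
1,0,3
1,1,4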
STEP 1: MAPPER
The mapper will take the input matrices and emit key-value pairs for each element in
the result matrix. The key will be the (row, column) index of the result element, and the value
will be the corresponding element value.
STEP 2: REDUCER
The reducer will take the key-value pairs emitted by the mapper and calculate the partial
sum for each element in the result matrix.
PROGRAM:
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.fs.Path;
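// Note: in Java each public class below must be saved in its own .java file
// (MatrixMultiplicationMapper.java, MatrixMultiplicationReducer.java, MatrixMultiplicationDriver.java).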
public class MatrixMultiplicationMapper extends Mapper<LongWritable, Text, Text, Text>
{
protected void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
// Parse the input line to get row, column, and value of each element in the input matrices
String[] elements = value.toString().split(",");
int row = Integer.parseInt(elements[0]);
int col = Integer.parseInt(elements[1]);
int val = Integer.parseInt(elements[2]);
// Emit key-value pairs where key is (row, column) index of the result element
// and value is the corresponding element value
context.write(new Text(row + "," + col), new Text(String.valueOf(val)));
}
}
public class MatrixMultiplicationReducer extends Reducer<Text, Text, Text, IntWritable> {
protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException,
InterruptedException {
int result = 0;
for (Text value : values) {
// Accumulate the partial sum for the result element
result += Integer.parseInt(value.toString());
}
// Emit the final result for the result element
context.write(key, new IntWritable(result));
}
}
public class MatrixMultiplicationDriver {
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "Matrix Multiplication");
job.setJarByClass(MatrixMultiplicationDriver.class);
job.setMapperClass(MatrixMultiplicationMapper.class);
job.setReducerClass(MatrixMultiplicationReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
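A typical way to compile and submit the job (the jar name and HDFS paths below are assumptions, and HADOOP_CLASSPATH must point to the JDK's tools.jar for the first command):
hadoop com.sun.tools.javac.Main MatrixMultiplication*.java
jar cf matrixmult.jar MatrixMultiplication*.class
hadoop jar matrixmult.jar MatrixMultiplicationDriver /user/input/matrices /user/output/result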
RESULT:
Thus the Map Reduce Program that implements Matrix Multiplication was executed
and verified successfully.
EX. NO: 4 RUN A BASIC WORD COUNT MAP REDUCE PROGRAM TO
UNDERSTAND MAP REDUCE PARADIGM.
AIM:-
To write a Basic Word Count program to understand Map Reduce Paradigm.
PROCEDURE:
The entire MapReduce program can be fundamentally divided into three parts:
Mapper Phase Code
Reducer Phase Code
Driver Code
Input:
The key is nothing but the unique words that have been generated after the sorting
and shuffling phase: Text
The value is a list of integers corresponding to each key: IntWritable
Example – Bear, [1, 1], etc.
Output:
The key is all the unique words present in the input text file: Text
The value is the number of occurrences of each of the unique words: IntWritable
Example – Bear, 2; Car, 3, etc.
We have aggregated the values present in each of the lists corresponding to each key and
produced the final answer.
In general, the reduce() method is called once for each unique word; the number of reduce
tasks (one by default) can be configured in mapred-site.xml.
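Equivalently, the number of reduce tasks can be set programmatically in the driver; a one-line illustration (not part of the listing below):
job.setNumReduceTasks(2);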
PROGRAM:
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.fs.Path;
public class WordCount
{
public static class Map extends Mapper<LongWritable,Text,Text,IntWritable> {
public void map(LongWritable key, Text value,Context context) throws
IOException,InterruptedException{
String line = value.toString();
StringTokenizer tokenizer = new StringTokenizer(line);
while (tokenizer.hasMoreTokens()) {
value.set(tokenizer.nextToken());
context.write(value, new IntWritable(1));
}
}
}
public static class Reduce extends Reducer<Text,IntWritable,Text,IntWritable> {
public void reduce(Text key, Iterable<IntWritable> values,Context context)
throws IOException,InterruptedException {
int sum=0;
for(IntWritable x: values)
{
sum+=x.get();
}
context.write(key, new IntWritable(sum));
}
}
public static void main(String[] args) throws Exception {
Configuration conf= new Configuration();
Job job = Job.getInstance(conf, "My Word Count Program");
job.setJarByClass(WordCount.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
Path outputPath = new Path(args[1]);
//Configuring the input/output path from the filesystem into the job
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
//deleting the output path automatically from hdfs so that we don't have to delete it explicitly
outputPath.getFileSystem(conf).delete(outputPath, true);
//exit with status 0 if the job completes successfully, 1 otherwise
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
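Before running the jar, the input file must already exist in HDFS; a minimal preparation sketch (sample.txt is an assumed local file name):
hadoop fs -mkdir -p /sample/input
hadoop fs -put sample.txt /sample/input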
Run the MapReduce code:
The command for running a MapReduce code is:
hadoop jar hadoop-mapreduce-example.jar WordCount /sample/input /sample/output
OUTPUT:
RESULT:
Thus the Map Reduce Program that implements word count was executed and verified
successfully.
EX. NO : 5 INSTALLATION OF HIVE ALONG WITH PRACTICE EXAMPLES.
PREREQUISITES:
Java Development Kit (JDK) installed and the JAVA_HOME environment variable
set.
Hadoop installed and configured on your Windows system.
STEP-BY-STEP INSTALLATION:
1. Download HIVE:
Visit the Apache Hive website and download the latest stable version of Hive.
Official Apache Hive website: https://hive.apache.org/
2. Extract the Downloaded Hive Archive to a Directory on Your Windows Machine,
e.g., C:\hive.
3. Configure Hive:
Open the Hive configuration file (hive-site.xml) located in the conf folder of the
extracted Hive directory.
Set the necessary configurations, such as Hive Metastore connection settings and
Hadoop configurations. Make sure to adjust paths accordingly for Windows. Here's an
example of some configurations:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/path/to/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore.</description>
</property>
<!-- Other Hive configurations -->
</configuration>
5. Initialize and Start the Hive Metastore:
Initialize the metastore schema with the schematool utility, then start the metastore service.
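With the embedded Derby metastore configured in hive-site.xml above, a typical sequence is (adjust -dbType for other metastore databases):
schematool -dbType derby -initSchema
hive --service metastore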
6. Start Hive:
Open a command prompt or terminal and navigate to the Hive installation directory.
Execute the hive command to start the Hive shell.
EXAMPLES:
1. Create a Database:
To create a new database in HIVE, use the following syntax:
CREATE DATABASE database_name;
Example:
CREATE DATABASE mydatabase;
2. Use a Database:
To use a specific database in HIVE, use the following syntax:
USE database_name;
Example:
USE mydatabase;
3. Show Databases:
To display a list of available databases in HIVE, use the following syntax:
SHOW DATABASES;
4. Create a Table:
To create a table in HIVE, use the following syntax:
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
...
);
Example:
CREATE TABLE mytable (
id INT,
name STRING,
age INT
);
5. Show Tables:
To display a list of tables in the current database, use the following syntax:
SHOW TABLES;
6. Describe a Table:
To view the schema and details of a specific table, use the following syntax:
DESCRIBE table_name;
Example:
DESCRIBE mytable;
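As a further, purely illustrative example (not in the original list), rows can be inserted and queried:
INSERT INTO mytable VALUES (1, 'Alice', 30);
SELECT * FROM mytable;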
RESULT:
Thus the Installation of HIVE was done successfully.
EX. NO : 6 INSTALLATION OF HBASE ALONG WITH PRACTICE EXAMPLES
AIM:
To install HBASE using Virtual Machine and perform some operations in HBASE.
PROCEDURE:
Step 1: Install a Virtual Machine
Download and install a virtual machine software such as VirtualBox
(https://www.virtualbox.org/) or VMware (https://www.vmware.com/).
Create a new virtual machine and install a Unix-based operating system like Ubuntu or
CentOS. You can download the ISO image of your desired Linux distribution from their
official websites.
Move the extracted HBase directory to a desired location:
sudo mv <hbase_extracted_directory> /opt/hbase
Replace <hbase_extracted_directory> with the actual name of the extracted HBase
directory.
Step 3: Create a Table
In the HBase shell, you can create a table with column families.
For example, let's create a table named "my_table" with a column family called "cf":
>> create 'my_table', 'cf'
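A few basic operations can then be tried on the new table (the row key and value below are illustrative):
>> put 'my_table', 'row1', 'cf:name', 'Alice'
>> get 'my_table', 'row1'
>> scan 'my_table'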
RESULT:
Thus the installation of HBase using Virtual Machine was done successfully.
EX. NO : 7 INSTALLATION OF THRIFT
AIM:
To install Apache thrift on Windows OS.
PROCEDURE:
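A minimal outline, assuming the prebuilt Windows compiler (thrift.exe) from https://thrift.apache.org/download is used:
1. Download the prebuilt Windows executable (thrift-x.y.z.exe) and rename it to thrift.exe.
2. Place thrift.exe in a folder such as C:\thrift and add that folder to the Path environment variable.
3. Open cmd and verify the installation: thrift -version
4. (Optional) Generate code from a .thrift definition file, e.g.: thrift --gen java example.thrift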
RESULT:
Thus the installation of Thrift on Windows OS was done successfully.
EX. NO : 8 PRACTICE IMPORTING AND EXPORTING DATA FROM VARIOUS
DATABASES.
AIM:
To import and export data from various Databases using SQOOP.
PROCEDURE:
sqoop export \
--connect jdbc:<DB_TYPE>://<DB_HOST>:<DB_PORT>/<DB_NAME> \
--username <DB_USERNAME> \
--password <DB_PASSWORD> \
--table <TABLE_NAME> \
--export-dir <HDFS_EXPORT_DIR> \
--input-fields-terminated-by '<DELIMITER>'
Replace the placeholders
(<DB_TYPE>, <DB_HOST>, <DB_PORT>, <DB_NAME>, <DB_USERNAME>,
<DB_PASSWORD>, <TABLE_NAME>, <HDFS_EXPORT_DIR>, and
<DELIMITER>) with the appropriate values for your database and Hadoop
environment.
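The import direction follows the same pattern; a sketch using the same placeholders, with <HDFS_IMPORT_DIR> as an additional placeholder for the HDFS target directory:
sqoop import \
--connect jdbc:<DB_TYPE>://<DB_HOST>:<DB_PORT>/<DB_NAME> \
--username <DB_USERNAME> \
--password <DB_PASSWORD> \
--table <TABLE_NAME> \
--target-dir <HDFS_IMPORT_DIR> \
-m 1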
RESULT:
Thus importing and exporting data from various databases using SQOOP was done
successfully.