BDA Unit 5 notes

Big Data Analytics (Anna University)


What is HBase?
HBase is a distributed, column-oriented database built on top of the Hadoop file system. It is an open-source project and is horizontally scalable.
HBase has a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. It leverages the fault tolerance provided by the Hadoop Distributed File System (HDFS).
It is a part of the Hadoop ecosystem that provides random, real-time read/write access to data in the Hadoop file system.
One can store data in HDFS either directly or through HBase. Data consumers read and access the data in HDFS randomly using HBase, which sits on top of the Hadoop file system and provides read and write access.

Features of HBase
o Horizontally scalable: You can add any number of columns at any time.
o Automatic failover: Automatic failover is a facility that allows a system administrator to automatically switch data handling to a standby system in the event of a system failure.
o Integration with the MapReduce framework: All the commands and Java code internally use MapReduce to do the work, and HBase is built over the Hadoop Distributed File System.
o It is a sparse, distributed, persistent, multidimensional sorted map, which is indexed by row key, column key, and timestamp.
o It is often referred to as a key-value store or column-family-oriented database, or as storing versioned maps of maps.


o Fundamentally, it is a platform for storing and retrieving data with random access.
o It does not care about datatypes (you can store an integer in one row and a string in another for the same column).
o It does not enforce relationships within your data.
o It is designed to run on a cluster of computers built using commodity hardware.

HBase Data Model and implementations
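As an illustration of this data model, the sketch below uses the HBase shell to create a table with one column family and address a cell by row key and column family:qualifier; the table, family, and value names are hypothetical:

# create a table 'employee' with a column family 'personal'
create 'employee', 'personal'
# write two cells into row 'row1'; each cell version is stamped with a timestamp
put 'employee', 'row1', 'personal:name', 'raju'
put 'employee', 'row1', 'personal:city', 'hyderabad'
# read back a single row, then scan the whole table
get 'employee', 'row1'
scan 'employee'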

HBase Client API


To perform CRUD operations on HBase tables, we use the Java client API for HBase. Since HBase is written in Java and has a native Java API, it offers programmatic access to the Data Manipulation Language (DML).


i. Class HBaseConfiguration

This class adds HBase configuration files to a Configuration object. It belongs to the org.apache.hadoop.hbase package.
ii. Method

static org.apache.hadoop.conf.Configuration create()


To create a Configuration with HBase resources, we use this method.
Class HTable in HBase Client API
HTable is an HBase-internal class that represents an HBase table. We use this implementation of a table to communicate with a single HBase table. It belongs to the org.apache.hadoop.hbase.client package.
a. Constructors
i. HTable()
ii. HTable(TableName tableName, ClusterConnection connection, ExecutorService pool)
We can create an object to access an HBase table by using this constructor.
b. Methods
i. void close()
This method releases all the resources of the HTable.
ii. void delete(Delete delete)
This method deletes the specified cells/row.
iii. boolean exists(Get get)
This method tests the existence of columns in the table, as specified by the Get.
iv. Result get(Get get)
This method retrieves certain cells from a given row.
v. org.apache.hadoop.conf.Configuration getConfiguration()
This method returns the Configuration object used by this instance.
vi. TableName getName()
This method returns the table name instance of this table.
vii. HTableDescriptor getTableDescriptor()
This method returns the table descriptor for this table.
viii. byte[] getTableName()
This method returns the name of this table.
ix. void put(Put put)
This method inserts data into the table.
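Putting these classes and methods together, here is a minimal sketch of CRUD operations using the classic HTable API. The table name 'employee', column family 'personal', and column 'name' are hypothetical, and newer HBase releases favour the Connection/Table API over HTable:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseCrudExample {
    public static void main(String[] args) throws Exception {
        // Create a Configuration with HBase resources (hbase-site.xml, etc.)
        Configuration conf = HBaseConfiguration.create();

        // Open an existing table named 'employee' (hypothetical)
        HTable table = new HTable(conf, "employee");

        // Create: insert a cell into column family 'personal', column 'name'
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("personal"), Bytes.toBytes("name"), Bytes.toBytes("raju"));
        table.put(put);

        // Read: fetch the row and pull out the cell we just wrote
        Get get = new Get(Bytes.toBytes("row1"));
        Result result = table.get(get);
        byte[] value = result.getValue(Bytes.toBytes("personal"), Bytes.toBytes("name"));
        System.out.println("name = " + Bytes.toString(value));

        // Check the existence of the columns specified by the Get
        System.out.println("exists = " + table.exists(get));

        // Delete: remove the row
        Delete delete = new Delete(Bytes.toBytes("row1"));
        table.delete(delete);

        // Release all resources held by the HTable
        table.close();
    }
}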

Pig
Apache Pig is a high-level data flow platform for executing MapReduce programs on Hadoop. The language used for Pig is Pig Latin.

Pig scripts are internally converted to MapReduce jobs and executed on data stored in HDFS. Apart from that, Pig can also execute its jobs on Apache Tez or Apache Spark.

Pig can handle any type of data, i.e., structured, semi-structured, or unstructured, and stores the corresponding results in the Hadoop Distributed File System. Every task that can be achieved using Pig can also be achieved using Java in MapReduce.

Features of Apache Pig


Let's see the various features of Pig technology.

1) Ease of programming
Writing complex Java programs for MapReduce is quite tough for non-programmers. Pig makes this process easy. In Pig, the queries are converted to MapReduce jobs internally.

2) Optimization opportunities
The way tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.

3) Extensibility
Users can write user-defined functions (UDFs) containing their own logic to execute over the data set.

4) Flexible
It can easily handle structured as well as unstructured data.

5) In-built operators
It contains various types of operators, such as sort, filter, and join (see the short example below).
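As an illustration of these built-in operators, a short Pig Latin sketch that filters, groups, and orders records; the input file 'student_data' and its fields are hypothetical:

-- load, filter, group, and order with built-in operators
students = LOAD 'student_data' USING PigStorage(',')
           AS (name:chararray, age:int, gpa:float);
adults   = FILTER students BY age >= 18;
by_age   = GROUP adults BY age;
counts   = FOREACH by_age GENERATE group AS age, COUNT(adults) AS total;
ordered  = ORDER counts BY total DESC;
DUMP ordered;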

Differences between Apache MapReduce and Pig

Apache MapReduce | Apache Pig
It is a low-level data processing tool. | It is a high-level data flow tool.
It is required to develop complex programs using Java or Python. | It is not required to develop complex programs.
It is difficult to perform data operations in MapReduce. | It provides built-in operators to perform data operations like union, sorting, and ordering.
It does not allow nested data types. | It provides nested data types like tuple, bag, and map.

Advantages of Apache Pig


o Less code - Pig requires fewer lines of code to perform any operation.
o Reusability - Pig code is flexible enough to be reused.
o Nested data types - Pig provides useful nested data types like tuple, bag, and map.

Pig’s Data Model


Before we take a look at the operators that Pig Latin provides, we first need to
understand Pig’s data model. This includes Pig’s data types, how it handles
concepts such as missing data, and how you can describe your data to Pig.

Types
Pig’s data types can be divided into two categories: scalar types, which
contain a single value, and complex types, which contain other types.

Scalar Types
Pig’s scalar types are simple types that appear in most programming
languages. With the exception of bytearray, they are all represented in Pig
interfaces by java.lang classes, making them easy to work with in UDFs:
int

An integer. Ints are represented in interfaces by java.lang.Integer. They store a four-byte signed integer. Constant integers are expressed as integer numbers, for example, 42.
long


A long integer. Longs are represented in interfaces by java.lang.Long. They store an eight-byte signed integer. Constant longs are expressed as integer numbers with an L appended, for example, 5000000000L.
float

A floating-point number. Floats are represented in interfaces by java.lang.Float and use four bytes to store their value. You can find the range of values representable by Java's Float type at http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.3. Note that because this is a floating-point number, in some calculations it will lose precision. For calculations that require no loss of precision, you should use an int or long instead. Constant floats are expressed as a floating-point number with an f appended. Floating-point numbers can be expressed in simple format, 3.14f, or in exponent format, 6.022e23f.
double

A double-precision floating-point number. Doubles are represented in interfaces by java.lang.Double and use eight bytes to store their value. You can find the range of values representable by Java's Double type at http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.3. Note that because this is a floating-point number, in some calculations it will lose precision. For calculations that require no loss of precision, you should use an int or long instead. Constant doubles are expressed as a floating-point number in either simple format, 2.71828, or in exponent format, 6.626e-34.
chararray

A string or character array. Chararrays are represented in interfaces by java.lang.String. Constant chararrays are expressed as string literals with single quotes, for example, 'fred'. In addition to standard alphanumeric and symbolic characters, you can express certain characters in chararrays by using backslash codes, such as \t for Tab and \n for Return. Unicode characters can be expressed as \u followed by their four-digit hexadecimal Unicode value. For example, the value for Ctrl-A is expressed as \u0001.
bytearray


A blob or array of bytes. Bytearrays are represented in interfaces by a Java class, DataByteArray, that wraps a Java byte[]. There is no way to specify a constant bytearray.

Complex Types
Pig has three complex data types: maps, tuples, and bags. All of these types
can contain data of any type, including other complex types. So it is possible
to have a map where the value field is a bag, which contains a tuple where one
of the fields is a map.

Map
A map in Pig is a chararray-to-data-element mapping, where that element can be any Pig type, including a complex type. The chararray is called a key and is used as an index to find the element, referred to as the value.
Because Pig does not know the type of the value, it will assume it is a byte
array. However, the actual value might be something different. If you know
what the actual type is (or what you want it to be), you can cast it; see Casts. If
you do not cast the value, Pig will make a best guess based on how you use
the value in your script. If the value is of a type other than bytearray, Pig will
figure that out at runtime and handle it. See Schemas for more information on
how Pig handles unknown types.
By default there is no requirement that all values in a map be of the same type. It is legitimate to have a map with two keys, name and age, where the value for name is a chararray and the value for age is an int. Beginning in Pig 0.9, a map can declare its values to all be of the same type. This is useful if you know all values in the map will be of the same type, as it allows you to avoid the casting, and Pig can avoid the runtime type-massaging referenced in the previous paragraph.
Map constants are formed using brackets to delimit the map, a hash between
keys and values, and a comma between key-value pairs. For
example, ['name'#'bob', 'age'#55] will create a map with two
keys, “name” and “age”. The first value is a chararray, and the second is an
integer.


Tuple
A tuple is a fixed-length, ordered collection of Pig data elements. Tuples are divided into fields, with each field containing one data element. These elements can be of any type; they do not all need to be the same type. A tuple is analogous to a row in SQL, with the fields being SQL columns. Because tuples are ordered, it is possible to refer to the fields by position; see Expressions in foreach for details. A tuple can, but is not required to, have a schema associated with it that describes each field's type and provides a name for each field. This allows Pig to check that the data in the tuple is what the user expects, and it allows the user to reference the fields of the tuple by name.
Tuple constants use parentheses to indicate the tuple and commas to delimit
fields in the tuple. For example, ('bob', 55) describes a tuple constant with
two fields.
Bag
A bag is an unordered collection of tuples. Because it has no order, it is not
possible to reference tuples in a bag by position. Like tuples, a bag can, but is
not required to, have a schema associated with it. In the case of a bag, the
schema describes all tuples within the bag.
Bag constants are constructed using braces, with tuples in the bag separated by
commas. For example, {('bob', 55), ('sally', 52), ('john', 25)} constructs a
bag with three tuples, each with two fields.

Pig users often notice that Pig does not provide a list or set type that can store
items of any type. It is possible to mimic a set type using the bag, by wrapping
the desired type in a tuple of one field. For instance, if you want to store a set
of integers, you can create a bag with a tuple with one field, which is an int.
This is a bit cumbersome, but it works.

Bag is the one type in Pig that is not required to fit into memory. As you will
see later, because bags are used to store collections when grouping, bags can
become quite large. Pig has the ability to spill bags to disk when necessary,
keeping only partial sections of the bag in memory. The size of the bag is
limited to the amount of local disk available for spilling the bag.
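The sketch below shows how a schema declaring these complex types might be used in a script; the input file 'players' and its fields are hypothetical:

-- a schema containing a bag of tuples and a map
players  = LOAD 'players' AS (name:chararray,
                              position:bag{t:(p:chararray)},
                              stats:map[]);
-- flatten the bag so each position becomes its own row
by_pos   = FOREACH players GENERATE name, FLATTEN(position);
-- dereference the map with # to pull out one value
averages = FOREACH players GENERATE name, stats#'batting_average';
DUMP averages;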


Pig Latin
Pig Latin is a data flow language used by Apache Pig to analyze data in Hadoop. It is a textual language that abstracts the programming from the Java MapReduce idiom into a higher-level notation.

Pig Latin Statements


Pig Latin statements are used to process the data. Each statement is an operator that accepts a relation as input and generates another relation as output.

o A statement can span multiple lines.
o Each statement must end with a semicolon.
o A statement may include expressions and schemas.
o By default, these statements are processed using multi-query execution. A short example follows.
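For instance, a minimal sketch of two statements, the first spanning several lines; the file 'student' and its fields are hypothetical:

-- a statement may span several lines and must end with a semicolon
A = LOAD 'student' USING PigStorage(',')
        AS (name:chararray, age:int, gpa:float);
B = FILTER A BY gpa > 3.5;
DUMP B;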

Pig Latin Conventions

Convention Description

() The parentheses can enclose one or more items. They can also be used
to indicate the tuple data type.
Example - (10, xyz, (3,6,9))

[] The straight brackets can enclose one or more items. They can also
be used to indicate the map data type.
Example - [INNER | OUTER]

{} The curly brackets enclose two or more items. They can also be used to
indicate the bag data type.
Example - { block | nested_block }

... The horizontal ellipsis points indicate that you can repeat a portion
of the code.
Example - cat path [path ...]

Pig Latin Data Types

Simple Data Types

Type Description

int Defines a signed 32-bit integer.
Example - 2

long Defines a signed 64-bit integer.
Example - 2L or 2l

float Defines a 32-bit floating-point number.
Example - 2.5F or 2.5f or 2.5e2f or 2.5E2F

double Defines a 64-bit floating-point number.
Example - 2.5 or 2.5e2 or 2.5E2

chararray Defines a character array in Unicode UTF-8 format.
Example - javatpoint

bytearray Defines the byte array.

boolean Defines boolean values.
Example - true/false

datetime Defines values in datetime order.
Example - 1970-01-01T00:00:00.000+00:00

biginteger Defines Java BigInteger values.
Example - 5000000000000

bigdecimal Defines Java BigDecimal values.
Example - 52.232344535345

Complex Types

Type Description

tuple Defines an ordered set of fields.
Example - (15,12)

bag Defines a collection of tuples.
Example - {(15,12), (12,15)}

map Defines a set of key-value pairs.
Example - [open#apache]


Developing and Testing Pig Latin Scripts

The last few chapters focused on Pig Latin the language. Now we will turn to
the practical matters of developing and testing your scripts. This chapter
covers helpful debugging tools such as describe and explain. It also covers
ways to test your scripts. Information on how to make your scripts perform
better will be covered in the next chapter.

Development Tools

Pig provides several tools and diagnostic operators to help you develop your
applications. In this section we will explore these and also look at some tools
others have written to make it easier to develop Pig with standard editors and
integrated development environments (IDEs).

Syntax Highlighting and Checking


Syntax highlighting often helps users write code correctly, at least
syntactically, the first time around. Syntax highlighting packages exist for
several popular editors. The packages listed in Table 7-1 were created and
added at various times, so how their highlighting conforms with current Pig
Latin syntax varies.

Table 7-1. Pig Latin syntax highlighting packages

Tool URL

Eclipse http://code.google.com/p/pig-eclipse

Emacs http://github.com/cloudera/piglatin-mode, http://sf.net/projects/pig-mode

TextMate http://www.github.com/kevinweil/pig.tmbundle

Vim http://www.vim.org/scripts/script.php?script_id=2186


In addition to these syntax highlighting packages, Pig will also let you check the syntax of your script without running it. If you add -c or -check to the command line, Pig will just parse and run semantic checks on your script. The -dryrun command-line option will also check your syntax, expand any macros and imports, and perform parameter substitution.
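For example, assuming a script file named myscript.pig, the checks described above could be invoked from the command line roughly as follows:

pig -c myscript.pig        # parse and run semantic checks only
pig -dryrun myscript.pig   # also expand macros/imports and substitute parameters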

describe
describe shows you the schema of a relation in your script. This can be very
helpful as you are developing your scripts. It is especially useful as you are
learning Pig Latin and understanding how various operators change the
data. describe can be applied to any relation in your script, and you can have
multiple describes in a script:

--describe.pig
divs = load 'NYSE_dividends' as (exchange:chararray, symbol:chararray,
        date:chararray, dividends:float);
trimmed = foreach divs generate symbol, dividends;
grpd = group trimmed by symbol;
avgdiv = foreach grpd generate group, AVG(trimmed.dividends);
describe trimmed;
describe grpd;
describe avgdiv;


trimmed: {symbol: chararray,dividends: float}
grpd: {group: chararray,trimmed: {(symbol: chararray,dividends: float)}}
avgdiv: {group: chararray,double}

describe uses Pig's standard schema syntax. For information on this syntax, see Schemas. So, in this example, the relation trimmed has two fields: symbol, which is a chararray, and dividends, which is a float. grpd also has two fields, group (the name Pig always assigns to the group by key) and a bag trimmed, which matches the name of the relation that Pig grouped to produce the bag. Tuples in trimmed have two fields: symbol and dividends. Finally, in avgdiv there are two fields, group and a double, which is the result of the AVG function and is unnamed.

Hive Data Types
Hive data types are categorized into numeric types, string types, misc types, and complex types. A list of Hive data types is given below.

Integer Types

Type Size Range

TINYINT 1-byte signed integer -128 to 127

SMALLINT 2-byte signed integer -32,768 to 32,767

INT 4-byte signed integer -2,147,483,648 to 2,147,483,647

BIGINT 8-byte signed integer -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807


Decimal Type

Type Size Range

FLOAT 4-byte Single-precision floating-point number

DOUBLE 8-byte Double-precision floating-point number

Date/Time Types
TIMESTAMP

o It supports the traditional UNIX timestamp with optional nanosecond precision.
o As an integer numeric type, it is interpreted as a UNIX timestamp in seconds.
o As a floating-point numeric type, it is interpreted as a UNIX timestamp in seconds with decimal precision.
o As a string, it follows the java.sql.Timestamp format "YYYY-MM-DD HH:MM:SS.fffffffff" (9 decimal places of precision).

DATES

The Date value is used to specify a particular year, month, and day, in the form YYYY-MM-DD. However, it does not provide the time of day. The range of the Date type lies between 0000-01-01 and 9999-12-31.

String Types
STRING

A string is a sequence of characters. Its values can be enclosed within single quotes (') or double quotes (").

Varchar

Varchar is a variable-length type whose length lies between 1 and 65535, which specifies the maximum number of characters allowed in the character string.

CHAR


The char is a fixed-length type whose maximum length is fixed at 255.
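To tie these types together, a small HiveQL sketch creating a table that uses several of them; the table and column names are hypothetical:

CREATE TABLE employee (
  id        INT,
  name      STRING,
  salary    DOUBLE,
  joined    TIMESTAMP,
  birthday  DATE,
  nickname  VARCHAR(50),
  grade     CHAR(1)
);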

Hive Data Types and File Formats

Apache Hive supports several familiar file formats used in Apache Hadoop. Hive can load and query different data files created by other Hadoop components such as Pig or MapReduce. In this section, we will look at the different Apache Hive file formats, such as TextFile, SequenceFile, RCFile, AVRO, ORC, and Parquet. Cloudera Impala also supports these file formats.

Hive File Formats

Different file formats and compression codecs work better for different data sets in Apache Hive.

The following are the Apache Hive file formats:

 Text File
 Sequence File
 RC File
 AVRO File
 ORC File
 Parquet File
Hive Text File Format

The Hive text file format is the default storage format. You can use the text format to interchange data with other client applications. The text file format is very common in most applications. Data is stored in lines, with each line being a record. Each line is terminated by a newline character (\n).

The text format is a simple plain file format. You can use compression (e.g., BZIP2) on the text file to reduce storage space.

Create a TEXT file by adding the storage option 'STORED AS TEXTFILE' at the end of a Hive CREATE TABLE command.

Hive Text File Format Examples


Below is the Hive CREATE TABLE command with the storage format specification:

CREATE TABLE textfile_table
  (column_specs)
  STORED AS TEXTFILE;


Hive Sequence File Format

Sequence files are Hadoop flat files that store values in binary key-value pairs. The sequence files are in binary format, and these files are splittable. The main advantage of using sequence files is that two or more files can be merged into one file.

Create a sequence file by adding the storage option 'STORED AS SEQUENCEFILE' at the end of a Hive CREATE TABLE command.

Hive Sequence File Format Example


Below is the Hive CREATE TABLE command with the storage format specification:

CREATE TABLE sequencefile_table
  (column_specs)
  STORED AS SEQUENCEFILE;

Hive AVRO File Format

AVRO is an open-source project that provides data serialization and data exchange services for Hadoop. You can exchange data between the Hadoop ecosystem and programs written in any programming language. Avro is one of the popular file formats in Big Data Hadoop-based applications.

Create an AVRO file by specifying the 'STORED AS AVRO' option at the end of a CREATE TABLE command.

Hive AVRO File Format Example


Below is the Hive CREATE TABLE command with the storage format specification:

CREATE TABLE avro_table
  (column_specs)
  STORED AS AVRO;

HiveQL: Data Definition


HiveQL is the Hive query language. Like all SQL dialects in widespread
use, it doesn’t fully conform to any particular revision of the ANSI SQL
standard. It is perhaps closest to MySQL’s dialect, but with significant
differences. Hive offers no support for row-level inserts, updates, and
deletes. Hive doesn’t support transactions. Hive adds extensions to
provide better performance in the context of Hadoop and to integrate
with custom extensions and even external programs.


Still, much of HiveQL will be familiar. This chapter and the ones that
follow discuss the features of HiveQL using representative examples.
In some cases, we will briefly mention details for completeness, then
explore them more fully in later chapters.

This chapter starts with the so-called data definition language parts of
HiveQL, which are used for creating, altering, and dropping databases,
tables, views, functions, and indexes. We’ll discuss databases and
tables in this chapter, deferring the discussion of views until Chapter 7,
indexes until Chapter 8, and functions until Chapter 13.
We’ll also discuss the SHOW and DESCRIBE commands for listing and
describing items as we go.
Subsequent chapters explore the data manipulation language parts of
HiveQL that are used to put data into Hive tables and to extract data to
the filesystem, and how to explore and manipulate data with queries,
grouping, filtering, joining, etc.

Databases in Hive
The Hive concept of a database is essentially just
a catalog or namespace of tables. However, they are very useful for
larger clusters with multiple teams and users, as a way of avoiding
table name collisions. It’s also common to use databases to organize
production tables into logical groups.
If you don’t specify a database, the default database is used.

The simplest syntax for creating a database is shown in the following example:

hive> CREATE DATABASE financials;

Hive will throw an error if financials already exists. You can suppress
these warnings with this variation:

hive> CREATE DATABASE IF NOT EXISTS financials;


While normally you might like to be warned if a database of the same name already exists, the IF NOT EXISTS clause is useful for scripts that should create a database on the fly, if necessary, before proceeding.

You can also use the keyword SCHEMA instead of DATABASE in all the
database-related commands.

At any time, you can see the databases that already exist as follows:

hive> SHOW DATABASES;


default
financials

hive> CREATE DATABASE human_resources;

hive> SHOW DATABASES;


default
financials
human_resources

If you have a lot of databases, you can restrict the ones listed using
a regular expression, a concept we’ll explain in LIKE and RLIKE, if it is
new to you. The following example lists only those databases that start
with the letter h and end with any other characters (the .* part):

hive> SHOW DATABASES LIKE 'h.*';


human_resources
hive> ...

Hive will create a directory for each database. Tables in that database
will be stored in subdirectories of the database directory. The exception
is tables in the default database, which doesn’t have its own directory.


The database directory is created under a top-level directory specified by the property hive.metastore.warehouse.dir, which we discussed in Local Mode Configuration and Distributed and Pseudodistributed Mode Configuration. Assuming you are using the default value for this property, /user/hive/warehouse, when the financials database is created, Hive will create the directory /user/hive/warehouse/financials.db. Note the .db extension.

You can override this default location for the new directory as shown in
this example:

hive> CREATE DATABASE financials
    > LOCATION '/my/preferred/directory';

You can add a descriptive comment to the database, which will be shown by the DESCRIBE DATABASE <database> command.

hive> CREATE DATABASE financials
    > COMMENT 'Holds all financial tables';

hive> DESCRIBE DATABASE financials;

financials   Holds all financial tables
  hdfs://master-server/user/hive/warehouse/financials.db
