
HIVE

Instructor: Oussama Derbel


Introduction

■ What is Hadoop?

– Apache Hadoop is an open-source software framework used to develop data-processing applications that execute in a distributed computing environment.

– Applications built using Hadoop run on large data sets distributed across clusters of commodity computers.

■ Commodity computers are cheap and widely available; they are mainly useful for achieving greater computational power at low cost.


Introduction

■ Core of Hadoop

– HDFS (Hadoop Distributed File System): the storage part

– MapReduce: the processing part


Introduction

■ Apache Hadoop consists of two sub-projects

1. Hadoop MapReduce: MapReduce is a computational model and software framework for writing applications that run on Hadoop. These MapReduce programs can process enormous volumes of data in parallel on large clusters of computation nodes.

2. HDFS (Hadoop Distributed File System): HDFS takes care of the storage part of Hadoop applications.


Note
MapReduce applications consume data from HDFS. HDFS creates multiple replicas of data blocks and distributes them
on compute nodes in a cluster. This distribution enables reliable and extremely rapid computations.
What is Hive?

■ Apache Hive is an open-source data warehouse system built on top of Hadoop, used for querying and analyzing large datasets stored in Hadoop files.

■ Earlier, users had to write complex MapReduce jobs; with Hive, they merely submit SQL-like queries.

■ Hive is mainly targeted towards users who are comfortable with SQL.

■ Hive uses a language called HiveQL (HQL), which is similar to SQL.

■ Hive automatically translates HiveQL queries into MapReduce jobs.


What is Hive?

■ Hive abstracts the complexity of Hadoop; notably, there is no need to learn Java to use Hive.

■ Hive generally runs on your workstation and converts your SQL query into a series of jobs for execution on a Hadoop cluster.

■ Apache Hive organizes data into tables, which provides a means of attaching structure to data stored in HDFS.
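
■ As a minimal sketch of that workflow (the table and query here are illustrative, not from the slides), the statements below are plain HiveQL, and Hive compiles the SELECT into MapReduce jobs behind the scenes:

CREATE TABLE page_hits (url STRING, hits INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- Compiled by Hive into one or more MapReduce jobs
SELECT url, SUM(hits) AS total_hits
FROM page_hits
GROUP BY url;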
Why Apache Hive?

■ Facebook faced a lot of challenges before the implementation of Apache Hive: the size of the data being generated exploded, making it very difficult to handle.

■ The traditional RDBMS could not handle the pressure, so Facebook looked for better options.

■ To overcome this problem, Facebook initially tried using MapReduce.

■ But MapReduce was difficult to program, while most users' knowledge was SQL, which made it an impractical solution.

■ Hence, Apache Hive allowed them to overcome the challenges they were facing.
Why Apache Hive?

■ With Apache Hive, we are now able to perform the following:

– Schema flexibility and evolution

– Tables can be partitioned and bucketed

– Apache Hive tables are defined directly in HDFS

– JDBC (Java Database Connectivity) / ODBC (Open Database Connectivity) drivers are available
Why Apache Hive?

■ Apache Hive saves developers from writing complex Hadoop MapReduce jobs for ad-hoc requirements.

■ Hence, Hive provides summarization, analysis, and querying of data.

■ Hive is scalable and highly extensible. Since HiveQL is similar to SQL, it is very easy for SQL developers to learn and write Hive queries.

■ Hive reduces the complexity of MapReduce by providing an interface where the user can submit SQL queries. Business analysts can now work with Big Data using Apache Hive and generate insights.

■ It also provides file access on various data stores such as HDFS and HBase.

■ The most important feature of Apache Hive is that we don't have to learn Java to use it.
Hive Architecture

■ Metastore:

■ It stores the metadata for each table, such as its schema and location.

■ Hive also stores partition metadata, which helps the driver track the various data sets distributed over the cluster. The metadata is kept in a traditional RDBMS. Because this metadata is crucial for keeping track of the data, a backup server regularly replicates it so that it can be recovered in case of data loss.
Hive Architecture

■ Driver:

■ It acts as a controller that receives the HiveQL statements.

■ The driver starts the execution of a statement by creating sessions, and it monitors the life cycle and progress of the execution.

■ The driver stores the metadata generated during the execution of a HiveQL statement, and it acts as a collection point of the data or query result obtained after the Reduce operation.


Hive Architecture

■ Compiler:

■ It compiles the HiveQL query into an execution plan. The plan contains the tasks, as well as the steps the MapReduce jobs need to perform to produce the output the query asks for.

■ The compiler converts the query to an Abstract Syntax Tree (AST), first checking for compatibility and compile-time errors, and then converts the AST to a Directed Acyclic Graph (DAG).
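
■ You can inspect the plan Hive produces with the EXPLAIN statement; for example (using the nyse table created later in these slides):

EXPLAIN SELECT symbol1, AVG(price_open) FROM nyse GROUP BY symbol1;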


Hive Architecture

■ Optimizer

■ It performs various transformations on the execution plan to produce an optimized DAG.

■ It aggregates transformations together, such as converting a pipeline of joins into a single join, for better performance.

■ The optimizer can also split tasks, such as applying a transformation on data before a reduce operation, to provide better performance.
Hive Architecture

■ Executor

■ Once compilation and optimization are complete, the executor executes the tasks and takes care of pipelining them.

■ CLI, UI, and Thrift Server: the CLI (command-line interface) provides a user interface for an external user to interact with Hive. The Thrift server allows external clients to interact with Hive over a network, similar to the JDBC or ODBC protocols.
Hive Architecture

■ Architecture (diagram)

Hive Components (diagram)
Hive Limitations

• Does not provide row-level updates (in earlier versions)

• Not suitable for OLTP (Online Transaction Processing)

• Queries have higher latency

• Start-up overhead for MapReduce jobs

• Best suited when a large dataset is maintained and mined


Hive – Data Types – Numeric

• TINYINT (1-byte signed integer)

• SMALLINT (2-byte signed integer)

• INT (4-byte signed integer)

• BIGINT (8-byte signed integer)

• FLOAT (4-byte single precision floating point number)

• DOUBLE (8-byte double precision floating point number)

• DECIMAL (User defined precisions)
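
• For example, DECIMAL lets you pin down the precision and scale explicitly (a sketch; the table and column names are illustrative, and the precision/scale arguments are available in Hive 0.13 and later):

CREATE TABLE payments (amount DECIMAL(10,2));  -- 10 digits in total, 2 after the decimal point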


Hive – Data Types – Date/Time

• TIMESTAMP (as of Hive 0.8.0)

• DATE (as of Hive 0.12.0) – YYYY-MM-DD


Hive – Data Types – String

• STRING

• VARCHAR (as of Hive 0.12.0)

• CHAR (as of Hive 0.13.0)


Hive – Data Types – Misc

• BOOLEAN

• BINARY (as of Hive 0.8.0)


Hive – Data Types – Complex

• arrays: ARRAY<data_type>

• maps: MAP<primitive_type, data_type>

• structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>

• union: UNIONTYPE<data_type, data_type, ...> (as of Hive 0.7.0)


Hive – Data Types – Example

CREATE TABLE employees(
  name          STRING,              -- e.g. "John"
  salary        FLOAT,
  subordinates  ARRAY<STRING>,       -- e.g. ["Michael", "Rumi"]
  deductions    MAP<STRING, FLOAT>,  -- e.g. {"Insurance": 500.00, "Charity": 600.00}
  address       STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>,
  auth          UNIONTYPE<INT, INT, STRING>  -- fbid, gid, or email
);
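
Once the table exists, complex-typed columns are read with index, key, and dot notation:

SELECT name,
       subordinates[0],           -- first element of the ARRAY
       deductions['Insurance'],   -- value for one MAP key
       address.city               -- one field of the STRUCT
FROM employees;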
Hive -Metastore

• Stores the metadata of tables into a relational database

• Metadata includes

• Details of databases and tables

• Table definitions: name of table, columns, partitions etc.


Hive -Warehouse

• Hive tables are stored in the Hive warehouse directory, /apps/hive/warehouse on HDFS by default

• Or at the location specified in the table definition


Hive –Getting started – Command Line

• Log in to the CloudxLab Linux console

• Type “hive” to access the Hive shell

• By default, the database named “default” is selected as the current database for the session

• Type “SHOW DATABASES” to see the list of all databases


Hive –Getting started – Command Line

• “SHOW TABLES” will list the tables in the currently selected database, which is the “default” database.

• Create your own database with your login name:

• CREATE DATABASE abhinav9884;

• DESCRIBE DATABASE abhinav9884;

• DROP DATABASE abhinav9884;


Hive –Getting started – Command Line

• CREATE DATABASE abhinav9884;

• USE abhinav9884;

• CREATE TABLE x(a INT);


Hive –Getting started – Hue

• Login to Hue

• Click on “Query Editors” and select “Hive”

• Select your database (abhinav9884) from the list

• SELECT * FROM x;

• DESCRIBE x;

• DESCRIBE FORMATTED x;
Hive –Tables

●Managed tables

●External tables
Hive –Managed Tables

●Also called Internal

●Lifecycle managed by Hive

●Data is stored in the warehouse directory

●Dropping the table deletes data from warehouse


Hive –External Tables

●The lifecycle is not managed by Hive

●Hive assumes that it does not own the data

●Dropping the table does not delete the underlying data

●Metadata will be deleted
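
●The practical difference shows up on DROP; using the nyse and nyse_external tables defined on the next slides:

DROP TABLE nyse;           -- managed: deletes the metadata AND the data in the warehouse
DROP TABLE nyse_external;  -- external: deletes the metadata only; the files stay at LOCATION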


Hive – Managed Tables – Example

CREATE TABLE nyse(
  exchange1        STRING,
  symbol1          STRING,
  ymd              STRING,
  price_open       FLOAT,
  price_high       FLOAT,
  price_low        FLOAT,
  price_close      FLOAT,
  volume           INT,
  price_adj_close  FLOAT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

DESCRIBE nyse;
DESCRIBE FORMATTED nyse;
Hive – Loading Data – From Local Directory

● hadoop fs -copyToLocal /data/NYSE_daily

● Launch Hive

● use yourdatabase;

● load data local inpath 'NYSE_daily' overwrite into table nyse;

● Copies the data from the local file system to the warehouse


Hive – Loading Data – From HDFS

CREATE TABLE nyse_hdfs(
  exchange1        STRING,
  symbol1          STRING,
  ymd              STRING,
  price_open       FLOAT,
  price_high       FLOAT,
  price_low        FLOAT,
  price_close      FLOAT,
  volume           INT,
  price_adj_close  FLOAT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
Hive – Loading Data – From HDFS

●Copy /data/NYSE_daily to your home directory in HDFS

●load data inpath 'hdfs:///user/abhinav9884/NYSE_daily' overwrite into table nyse_hdfs;

●Moves the data from the specified location to the warehouse

●Check whether NYSE_daily is still in your home directory in HDFS


Hive – External Tables

CREATE EXTERNAL TABLE nyse_external(
  exchange1        STRING,
  symbol1          STRING,
  ymd              STRING,
  price_open       FLOAT,
  price_high       FLOAT,
  price_low        FLOAT,
  price_close      FLOAT,
  volume           INT,
  price_adj_close  FLOAT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/abhinav9884/NYSE_daily';

describe formatted nyse_external;
Hive – S3-Based External Table

create external table miniwikistats(
  projcode   string,
  pagename   string,
  pageviews  int,
  bytes      int)
partitioned by (dt string)
row format delimited fields terminated by ' '
lines terminated by '\n'
location 's3n://paid/default-datasets/miniwikistats/';
Hive – Select Statements

● Select all columns

SELECT * FROM nyse;

● Select only required columns

SELECT exchange1, symbol1 FROM nyse;


Hive – Aggregations

●Find the average opening price for each stock

SELECT symbol1, AVG(price_open) AS avg_price FROM nyse
GROUP BY symbol1;

●To improve performance, enable top-level aggregation in the map phase:

SET hive.map.aggr=true;
Hive – Saving Data

●In the local filesystem

insert overwrite local directory '/home/abhinav9884/onlycmc'
select * from nyse where symbol1 = 'CMC';

●In HDFS

insert overwrite directory 'onlycmc' select * from nyse where symbol1 = 'CMC';
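
●Since Hive 0.11 the output delimiter can also be specified when exporting; a sketch based on the same query:

insert overwrite local directory '/home/abhinav9884/onlycmc'
row format delimited fields terminated by ','   -- write CSV instead of the default \001 delimiter
select * from nyse where symbol1 = 'CMC';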
Hive – Tables – DDL – ALTER

● Rename a table

ALTER TABLE x RENAME TO x1;

● Change the datatype of a column

ALTER TABLE x1 CHANGE a a FLOAT;

● Add columns to an existing table

ALTER TABLE x1 ADD COLUMNS (b FLOAT, c INT);
Hive – Partitions

# First name, Department, Year of joining
Mark, Engineering, 2012
Jon, HR, 2012
Monica, Finance, 2015
Steve, Engineering, 2012
Michael, Marketing, 2015
Hive – Partitions– Hands-on

●Data is located at /data/bdhs/employees/ on HDFS

●Copy the data to your home directory in HDFS

hadoop fs -cp /data/bdhs/employees .

●Create the table

CREATE TABLE employees(
  name        STRING,
  department  STRING,
  somedate    DATE
)
PARTITIONED BY (year STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Hive – Partitions– Hands-on

● Load the dataset 2012.csv

load data inpath 'hdfs:///user/sandeepgiri9034/employees/2012.csv' into table employees partition(year='2012');

●Load the dataset 2015.csv

load data inpath 'hdfs:///user/sandeepgiri9034/employees/2015.csv' into table employees partition(year='2015');

● SHOW PARTITIONS employees;

● Check warehouse and metastore


Hive – Partitions– Summary

To avoid a full table scan:

•The data is stored in different files in the warehouse, as defined by the partitions

•Define the partitions using “PARTITIONED BY” in “CREATE TABLE”

•We can also add a partition later (see the sketch after this list)

•Partitioning can happen on multiple columns (year=2012, month=10, day=12)
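
For instance, a later partition can be added like this (a sketch, assuming the employees table above; the year value is illustrative):

ALTER TABLE employees ADD PARTITION (year='2016');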


Hive – Views

●SELECT * FROM employees WHERE department='Engineering';

●Create a view:

CREATE VIEW employees_engineering AS
SELECT * FROM employees WHERE department='Engineering';

●Now query the view:

SELECT * FROM employees_engineering;


Hive – Views- Summary

•Allows a query to be saved and treated like a table

•A logical construct: it does not store data

•Hides the query complexity

•Divides a long, complicated query into smaller, manageable pieces

•Similar to writing a function in a programming language


Hive – Load JSON Data

• Download the JSON SerDe binaries

• ADD JAR hdfs:///data/serde/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar;

• Create the table

CREATE EXTERNAL TABLE tweets_raw (
  ...
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/abhinav9884/senti/upload/data/tweets_raw';
Hive – Sorting & Distributing – Order By

ORDER BY x

●Guarantees global ordering

●Data goes through just one reducer

●This is unacceptable for large datasets, as it overloads that single reducer

●You end up with one sorted file as output


Hive – Sorting & Distributing – Sort By

SORT BY x

●Orders the data at each of N reducers

●The number of reducers is roughly one per 1 GB of input

●You end up with N or more sorted files with overlapping ranges


Hive – Sorting & Distributing – Distribute By

DISTRIBUTE BY x

●Ensures each of N reducers gets non-overlapping ranges of x

●But doesn't sort the output of each reducer

●You end up with N or more unsorted files with non-overlapping ranges


Hive – Sorting & Distributing – Cluster By

CLUSTER BY x

●Gives global ordering

●Is the same as (DISTRIBUTE BY x and SORT BY x)

●CLUSTER BY is basically the more scalable version of ORDER BY
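
As a side-by-side sketch on the nyse table:

SELECT * FROM nyse ORDER BY symbol1;       -- one reducer, one globally sorted file
SELECT * FROM nyse SORT BY symbol1;        -- sorted within each reducer only
SELECT * FROM nyse DISTRIBUTE BY symbol1;  -- same symbol1 always lands on the same reducer
SELECT * FROM nyse CLUSTER BY symbol1;     -- DISTRIBUTE BY symbol1 plus SORT BY symbol1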


Hive – Bucketing

CREATE TABLE page_view(
  viewTime      INT,
  userid        BIGINT,
  page_url      STRING,
  referrer_url  STRING,
  ip            STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
CLUSTERED BY(userid) INTO 32 BUCKETS
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'
  COLLECTION ITEMS TERMINATED BY '\002'
  MAP KEYS TERMINATED BY '\003'
STORED AS SEQUENCEFILE;
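
Bucketing pays off when sampling; assuming the page_view table above, one bucket can be scanned instead of the whole table (older Hive versions also need hive.enforce.bucketing so that INSERTs actually write the 32 bucket files):

SET hive.enforce.bucketing = true;

-- Read roughly 1/32 of the data by sampling a single bucket
SELECT * FROM page_view TABLESAMPLE(BUCKET 1 OUT OF 32 ON userid);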
Hive – ORC files

●Optimized Row Columnar file format

●Provides a highly efficient way to store Hive data

●Improves performance when

○Reading

○Writing

○Processing

●Has a built-in index, min/max values, and other aggregations

●Proven in large-scale deployments

○Facebook uses the ORC file format for a 300+ PB deployment


Hive – ORC Files – Example

CREATE TABLE orc_table(
  first_name STRING,
  last_name  STRING
) STORED AS ORC;

INSERT INTO orc_table VALUES('John', 'Gill');

SELECT * FROM orc_table;

To know more, please visit https://orc.apache.org


Hive – Quick Recap

•Each table has a location

•By default the table is a directory under the location /apps/hive/warehouse

•We can override that location by specifying 'location' in the CREATE TABLE clause

•LOAD DATA copies the data if it is local

•LOAD DATA moves the data if it is on HDFS, for both external and managed tables

•Dropping a managed table deletes the data at the 'location'

•Dropping an external table does not delete the data at the 'location'

•The metadata is stored in a relational database: the Hive metastore


Hive – Connecting to Tableau

•Tableau is a visualization tool

•Tableau allows for instantaneous insight by transforming data into visually appealing, interactive visualizations called dashboards
Hive – Connecting to Tableau - Steps

•Download and install Tableau Desktop from https://www.tableau.com/products/desktop

•Download and install the Hortonworks ODBC driver for Apache Hive for your OS from https://hortonworks.com/downloads/
Hive – Connecting to Tableau – Hands-on

Visualize the top 10 stocks with the highest opening price on Dec 31, 2009
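
The HiveQL behind that dashboard would look something like this (assuming the nyse table loaded earlier, with ymd holding YYYY-MM-DD strings):

SELECT symbol1, price_open
FROM nyse
WHERE ymd = '2009-12-31'
ORDER BY price_open DESC
LIMIT 10;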


Hive – Quick Demo

1. Copy data from /data/ml100k/u.data into our HDFS home

2. Open Hive in Hue and run the following:

CREATE TABLE u_data(userid INT, movieid INT, rating INT, unixtime STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

LOAD DATA INPATH '/user/sandeepgiri9034/u.data' OVERWRITE INTO TABLE u_data;

SELECT * FROM u_data LIMIT 5;

SELECT movieid, AVG(rating) AS ar FROM u_data GROUP BY movieid ORDER BY ar DESC;


Hive – Quick Demo

Join with MovieNames

CREATE VIEW top100m AS
SELECT movieid, AVG(rating) AS ar FROM u_data GROUP BY movieid ORDER BY ar DESC;

CREATE TABLE m_data(movieid INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

LOAD DATA INPATH '/user/sandeepgiri9034/u.item' INTO TABLE m_data;

SELECT * FROM m_data LIMIT 100;

SELECT * FROM m_data, top100m WHERE top100m.movieid = m_data.movieid;


Hive – Lab

1. For each movie, how many users rated it?

2. For movies having more than 30 ratings, what is the average rating?
References

https://data-flair.training/
Thank you
