Sample
JobTracker
TaskTracker
DataNode
Note: HDFS is write-once, read-many: a file can be read any number of
times, but do not try to change the contents of a file in HDFS.
Note: NameNode, Secondary NameNode, and JobTracker are called master
services, master nodes, or master daemons.
Note: DataNode and TaskTracker are called slave nodes or slave daemons.
hdfs-site.xml
Settings:
block size: by default, the block size on a data node is 64 MB
replication factor: 3
Trash
HeartBeat Interval
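As a sketch, these settings are configured in hdfs-site.xml. The property names below follow Hadoop 1.x (fs.trash.interval is often placed in core-site.xml instead), and the values are only illustrative:

```
<configuration>
  <property>
    <name>dfs.block.size</name>
    <value>67108864</value>  <!-- 64 MB, the 1.x default -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>  <!-- number of replicas per block -->
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>60</value>  <!-- minutes deleted files stay in .Trash; 0 disables trash -->
  </property>
  <property>
    <name>dfs.heartbeat.interval</name>
    <value>3</value>  <!-- seconds between DataNode heartbeats to the NameNode -->
  </property>
</configuration>
```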
Vertical Scaling: increasing the RAM and hard disk of an existing data
node is called vertical scaling.
ifconfig
Linux Commands:
Disk Usage:
6. rm (Remove a file)
hadoop fs -rm /home/training/hdfs/file1
Hive Architecture:
Disadvantages:
and string
Load data: data can be loaded from the local file system or from HDFS.
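As a sketch (table and file names here are hypothetical), the LOCAL keyword selects the source: with LOCAL the path is on the Linux file system, without it the path is in HDFS:

```
-- load from the local (Linux) file system; the file is copied in
hive> load data local inpath '/home/training/emp.txt' into table employee;
-- load from HDFS; note the HDFS file is moved into the table's directory
hive> load data inpath '/user/training/emp.txt' into table employee;
```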
hive> create database sample;
Creating table:
Load data:
HDFS mode:
[training@localhost Desktop]$ hadoop fs -ls
/user/hive/warehouse/sample.db
Warning: $HADOOP_HOME is deprecated.
Found 1 items
drwxr-xr-x - training supergroup 0 2018-01-22 17:47 /user/hive/warehouse/sample.db/employee
With Location: here we create a directory first, then create the table
and point it to that directory, i.e., data loading and table creation
are done in a single step.
Without Location: here the table is created under the default warehouse
directory, and data is loaded in a separate step.
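A sketch of the two styles (paths and table names are hypothetical):

```
-- with location: point the table at an existing HDFS directory that holds the data
hive> create table emp_loc(id int, name string)
    > row format delimited fields terminated by ','
    > location '/user/training/empdata';

-- without location: table goes under /user/hive/warehouse; load data separately
hive> create table emp(id int, name string)
    > row format delimited fields terminated by ',';
hive> load data local inpath '/home/training/emp.txt' into table emp;
```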
View:
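The notes give no example for views; a minimal sketch (view, table, and column names are hypothetical):

```
-- a view stores only the query, not the data
hive> create view emp_view as select sname, grade from array_table;
hive> select * from emp_view;
-- dropping the view does not touch the underlying table
hive> drop view emp_view;
```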
Index: 2 types
1. Bitmap
2. Compact
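A sketch of creating both index types (index, table, and column names are hypothetical; note that Hive indexes were removed in Hive 3.0):

```
hive> create index emp_cidx on table employee(id)
    > as 'COMPACT' with deferred rebuild;
hive> create index emp_bidx on table employee(grade)
    > as 'BITMAP' with deferred rebuild;
-- DEFERRED REBUILD means the index data is built only when rebuilt explicitly:
hive> alter index emp_cidx on employee rebuild;
```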
Array ex:
Ramesh,1,maths$physics$chemistry,a
Suresh,2,biology$maths$physics,b
hive> create table array_table(sname string, sid int, sub array<string>, grade string)
    > row format delimited
    > fields terminated by ','
    > collection items terminated by '$';
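Continuing the example (the file path is hypothetical), elements of the array column are accessed by index, starting at 0:

```
hive> load data local inpath '/home/training/students.txt' into table array_table;
hive> select sname, sub[0] from array_table;    -- first subject of each student
hive> select sname, size(sub) from array_table; -- number of subjects per student
```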
Install Linux, Hadoop, and Pig.
Local mode: here the input and output files are in the Linux file
system; in this mode Pig can be installed and run without Hadoop MapReduce.
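A sketch of starting Pig in local mode, in the same transcript style as above (the file name is hypothetical):

```
[training@localhost Desktop]$ pig -x local
grunt> emp = load '/home/training/emp.txt' using PigStorage(',')
>>        as (id:int, name:chararray);
grunt> dump emp;
```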