Pig Program - Odt

The document shows the configuration and usage of Hadoop and Pig on a local system. It starts HDFS daemons like NameNode and DataNode, loads sample data files into HDFS, runs Pig commands to load and dump data, and encounters errors while running Pig commands due to syntax issues.
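
The session suggests two causes for the parse errors: the statement was typed without spaces around `=` (`a1=LOAD`), and in one attempt the path was missing its closing quote. The sketch below is not part of the original session. It puts the statements that eventually worked into a script file instead of typing them at the Grunt prompt. The file names `emp.txt` and `load_emp.pig`, the local relative path, and the exact layout of the sample records are assumptions made for illustration; the records are taken from the DUMP output later in the transcript.

```shell
#!/bin/sh
# Sketch only: emp.txt and load_emp.pig are illustrative names.

# Recreate the four records that the DUMP in this session prints
# (assumed layout: "id, name" per line, 43 bytes in total).
printf '12, Narien\n14, Nandhu\n16, Naveen\n18, Nagul\n' > emp.txt

# Pig Latin with the two fixes the session needed: spaces around '='
# and a closing quote on the path. The relative path assumes local mode.
cat > load_emp.pig <<'EOF'
D1 = LOAD 'emp.txt' USING PigStorage(',') AS (id:int, name:chararray);
idlimit = FILTER D1 BY id > 14;
DUMP idlimit;
EOF

# Run only when Pig is installed; otherwise the script is just left on disk.
if command -v pig >/dev/null 2>&1; then
    pig -x local load_emp.pig
fi
```

Keeping the statements in a file makes a quoting mistake easier to spot and fix than retyping the line at the `grunt>` prompt, where an unclosed quote leaves the shell waiting at `>>` for more input.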

Uploaded by bazeera

rr@ubuntu:~$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
19/09/24 14:25:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-rr-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-rr-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-rr-secondarynamenode-ubuntu.out
19/09/24 14:27:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-rr-resourcemanager-ubuntu.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-rr-nodemanager-ubuntu.out
rr@ubuntu:~$ jps
3536 SecondaryNameNode
3249 NameNode
4099 Jps
3371 DataNode
3757 ResourceManager
3887 NodeManager
rr@ubuntu:~$ hdfs dfs -ls
19/09/24 14:28:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
Found 3 items
drwxr-xr-x - rr supergroup 0 2019-07-23 14:37 dir11
drwxr-xr-x - rr supergroup 0 2019-07-23 15:13 gayathri
drwxr-xr-x - rr supergroup 0 2019-07-30 15:21 testfile.txt
rr@ubuntu:~$ hdfs dfs -cat testfile.txt
19/09/24 14:29:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
cat: `testfile.txt': Is a directory
rr@ubuntu:~$ hdfs dfs -ls /dir11
19/09/24 14:30:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
Found 1 items
-rwx------ 1 rr supergroup 117 2019-03-05 15:05 /dir11/sample2.txt
rr@ubuntu:~$ hdfs dfs -cat /dir11/sample2.txt
19/09/24 14:31:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
it does not support writing of data into the files using multiple writers.the best for reading data either
from end.
rr@ubuntu:~$ gedit emp.txt
rr@ubuntu:~$ hdfs dfs -put empx.txt hdfs://localhost:54310/dir11
19/09/24 14:33:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
put: `empx.txt': No such file or directory
rr@ubuntu:~$ hdfs dfs -put emp.txt hdfs://localhost:54310/dir11
19/09/24 14:33:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
rr@ubuntu:~$ hdfs dfs -ls /dir11
19/09/24 14:34:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r-- 1 rr supergroup 43 2019-09-24 14:33 /dir11/emp.txt
-rwx------ 1 rr supergroup 117 2019-03-05 15:05 /dir11/sample2.txt
rr@ubuntu:~$
rr@ubuntu:~$
rr@ubuntu:~$ pig -x local
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/09/24 14:35:30 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
19/09/24 14:35:30 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2019-09-24 14:35:30,823 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0 (r1746530)
compiled Jun 01 2016, 23:10:49
2019-09-24 14:35:30,830 [main] INFO org.apache.pig.Main - Logging error messages to:
/home/rr/pig_1569364530821.log
2019-09-24 14:35:30,956 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file
/home/rr/.pigbootup not found
2019-09-24 14:35:32,706 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:35:32,714 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:35:32,755 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file
system at: file:///
2019-09-24 14:35:33,386 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:35:33,587 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-865c4ad5-0f14-4317-bd8f-79b37a45ccd6
2019-09-24 14:35:33,594 [main] WARN org.apache.pig.PigServer - ATS is disabled since
yarn.timeline-service.enabled set to false
grunt> a1=LOAD 'hdfs://localhost:54310/dir11/emp.txt USING PigStorage(',') AS
(id:int,name:chararray);
2019-09-24 14:37:21,392 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error
during parsing. Encountered " <PATH> "a1=LOAD "" at line 1, column 1.
Was expecting one of:
<EOF>
"cat" ...
"clear" ...
"fs" ...
"sh" ...
"cd" ...
"cp" ...
"copyFromLocal" ...
"copyToLocal" ...
"dump" ...
"\\d" ...
"describe" ...
"\\de" ...
"aliases" ...
"explain" ...
"\\e" ...
"help" ...
"history" ...
"kill" ...
"ls" ...
"mv" ...
"mkdir" ...
"pwd" ...
"quit" ...
"\\q" ...
"register" ...
"rm" ...
"rmf" ...
"set" ...
"illustrate" ...
"\\i" ...
"run" ...
"exec" ...
"%default" ...
"%declare" ...
"scriptDone" ...
"" ...
"" ...
<EOL> ...
";" ...

Details at logfile: /home/rr/pig_1569364530821.log


grunt> a1=LOAD 'hdfs://localhost:54310/dir11/emp.txt' USING PigStorage(',') AS
(id:int,name:chararray);
2019-09-24 14:37:54,452 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error
during parsing. Encountered " <PATH> "a1=LOAD "" at line 1, column 1.
Was expecting one of:
<EOF>
"cat" ...
"clear" ...
"fs" ...
"sh" ...
"cd" ...
"cp" ...
"copyFromLocal" ...
"copyToLocal" ...
"dump" ...
"\\d" ...
"describe" ...
"\\de" ...
"aliases" ...
"explain" ...
"\\e" ...
"help" ...
"history" ...
"kill" ...
"ls" ...
"mv" ...
"mkdir" ...
"pwd" ...
"quit" ...
"\\q" ...
"register" ...
"rm" ...
"rmf" ...
"set" ...
"illustrate" ...
"\\i" ...
"run" ...
"exec" ...
"%default" ...
"%declare" ...
"scriptDone" ...
"" ...
"" ...
<EOL> ...
";" ...

Details at logfile: /home/rr/pig_1569364530821.log


grunt> A1 = LOAD 'hdfs://localhost:54310/dir11/emp.txt' USING PigStorage(',') AS
(id:int,name:chararray);
2019-09-24 14:41:13,348 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:41:13,349 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:41:13,359 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:41:15,048 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load
native-hadoop library for your platform... using builtin-java classes where applicable
grunt> a1 = LOAD 'hdfs://localhost:54310/dir11/emp.txt USING PigStorage(',') AS
(id:int,name:chararray);
>> dump A1;
>> dump a1;
>> A1 = LOAD 'hdfs://localhost:54310/dir11/emp.txt' USING PigStorage(',') AS
(id:int,name:chararray);
>> pig -x local
>>
>>
>>
>>
>>
>>
>>
[1]+ Stopped pig -x local
rr@ubuntu:~$ ^Zz;
#z: command not found
rr@ubuntu:~$
rr@ubuntu:~$ pig -x local
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/09/24 14:45:54 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
19/09/24 14:45:54 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2019-09-24 14:45:54,472 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0 (r1746530)
compiled Jun 01 2016, 23:10:49
2019-09-24 14:45:54,472 [main] INFO org.apache.pig.Main - Logging error messages to:
/home/rr/pig_1569365154470.log
2019-09-24 14:45:54,592 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file
/home/rr/.pigbootup not found
2019-09-24 14:45:57,354 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:45:57,354 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:45:57,357 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file
system at: file:///
2019-09-24 14:45:57,748 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:45:58,001 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-ad7a727d-02ac-4f9d-8759-2b573fecc5e2
2019-09-24 14:45:58,002 [main] WARN org.apache.pig.PigServer - ATS is disabled since
yarn.timeline-service.enabled set to false
grunt> D1 = LOAD 'hdfs://localhost:54310/dir11/emp.txt' USING PigStorage(',') AS
(id:int,name:chararray);
2019-09-24 14:48:41,171 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:48:41,174 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:48:41,176 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:48:43,809 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load
native-hadoop library for your platform... using builtin-java classes where applicable
grunt> dump D1;
2019-09-24 14:48:59,722 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: UNKNOWN
2019-09-24 14:48:59,864 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:48:59,871 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:48:59,872 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:49:00,229 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 14:49:01,331 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 14:49:01,541 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:49:01,610 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 14:49:01,612 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 14:49:02,294 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:49:02,332 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:49:03,736 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - session.id
is deprecated. Instead, use dfs.metrics.session-id
2019-09-24 14:49:04,364 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM
Metrics with processName=JobTracker, sessionId=
2019-09-24 14:49:04,712 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:49:04,726 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:49:04,726 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:49:04,734 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2019-09-24 14:49:04,880 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:49:04,962 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 14:49:04,966 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 14:49:04,966 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569365344962-0
2019-09-24 14:49:05,283 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-09-24 14:49:05,289 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2019-09-24 14:49:05,391 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:49:05,674 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2019-09-24 14:49:05,858 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:49:05,961 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 14:49:06,007 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:49:06,011 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:49:06,373 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:49:06,596 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:49:07,471 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local1655394155_0001
2019-09-24 14:49:08,230 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2019-09-24 14:49:08,233 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local1655394155_0001
2019-09-24 14:49:08,233 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1
2019-09-24 14:49:08,234 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1] C: R:
2019-09-24 14:49:08,260 [Thread-20] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:49:08,261 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 14:49:08,261 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local1655394155_0001]
2019-09-24 14:49:08,453 [Thread-20] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:49:08,467 [Thread-20] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:49:08,468 [Thread-20] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:49:08,468 [Thread-20] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:49:08,469 [Thread-20] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:49:08,491 [Thread-20] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:49:08,738 [Thread-20] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for
map tasks
2019-09-24 14:49:08,739 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local1655394155_0001_m_000000_0
2019-09-24 14:49:08,973 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:49:09,275 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:49:09,284 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:49:09,419 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 14:49:09,455 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 14:49:09,492 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:49:09,748 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:49:09,752 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 14:49:09,812 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1] C: R:
2019-09-24 14:49:10,292 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:49:10,293 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local1655394155_0001_m_000000_0 is done. And is
in the process of committing
2019-09-24 14:49:10,382 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:49:10,386 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task attempt_local1655394155_0001_m_000000_0 is allowed to
commit now
2019-09-24 14:49:10,390 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local1655394155_0001_m_000000_0' to
file:/tmp/temp1989380716/tmp1997065368/_temporary/0/task_local1655394155_0001_m_000000
2019-09-24 14:49:10,396 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:49:10,396 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local1655394155_0001_m_000000_0' done.
2019-09-24 14:49:10,396 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local1655394155_0001_m_000000_0
2019-09-24 14:49:10,397 [Thread-20] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:49:10,636 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:49:10,641 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:49:10,642 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2019-09-24 14:49:10,642 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2019-09-24 14:49:10,643 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:49:10,788 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 14:49:10,790 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1 0.16.0 rr 2019-09-24 14:49:04 2019-09-24 14:49:10 UNKNOWN

Success!

Job Stats (time in seconds):


JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local1655394155_0001 1 0 n/a n/a n/a n/a 0 0 0 0 D1 MAP_ONLY file:/tmp/temp1989380716/tmp1997065368,

Input(s):
Successfully read 4 records (43 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"

Output(s):
Successfully stored 4 records in: "file:/tmp/temp1989380716/tmp1997065368"

Counters:
Total records written : 4
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local1655394155_0001

2019-09-24 14:49:10,800 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:49:10,800 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:49:10,801 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:49:10,819 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-09-24 14:49:10,828 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:49:10,831 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:49:10,832 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:49:10,838 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:49:10,930 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 14:49:10,932 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(12, Narien)
(14, Nandhu)
(16, Naveen)
(18, Nagul)
grunt> idlimit = FILTER D1 BY id>14;
grunt> dump idlimit;
2019-09-24 14:50:27,302 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: FILTER
2019-09-24 14:50:27,377 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:50:27,386 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:50:27,386 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:50:27,386 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:50:27,387 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 14:50:27,393 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:50:27,399 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 14:50:27,401 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 14:50:27,462 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:50:27,478 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:50:27,483 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:27,496 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:50:27,497 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:50:27,504 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:50:27,505 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 14:50:27,508 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 14:50:27,508 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569365427504-0
2019-09-24 14:50:27,587 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-09-24 14:50:27,596 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:27,644 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:50:27,695 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 14:50:27,727 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:50:27,727 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:50:27,732 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:50:27,816 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:50:28,060 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local1084196401_0002
2019-09-24 14:50:28,384 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2019-09-24 14:50:28,387 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local1084196401_0002
2019-09-24 14:50:28,387 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1,idlimit
2019-09-24 14:50:28,387 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1],idlimit[2,10] C: R:
2019-09-24 14:50:28,389 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 14:50:28,389 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local1084196401_0002]
2019-09-24 14:50:28,390 [Thread-45] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:50:28,434 [Thread-45] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:50:28,435 [Thread-45] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:50:28,435 [Thread-45] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:50:28,438 [Thread-45] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:50:28,439 [Thread-45] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:50:28,439 [Thread-45] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:50:28,495 [Thread-45] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for
map tasks
2019-09-24 14:50:28,496 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local1084196401_0002_m_000000_0
2019-09-24 14:50:28,538 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:50:28,549 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:50:28,555 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:50:28,567 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 14:50:28,568 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 14:50:28,572 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:50:28,627 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:50:28,628 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 14:50:28,654 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1],idlimit[2,10] C: R:
2019-09-24 14:50:28,669 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:50:28,669 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local1084196401_0002_m_000000_0 is done. And is
in the process of committing
2019-09-24 14:50:28,672 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:50:28,673 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task attempt_local1084196401_0002_m_000000_0 is allowed to
commit now
2019-09-24 14:50:28,675 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local1084196401_0002_m_000000_0' to file:/tmp/temp1989380716/tmp-
2053029601/_temporary/0/task_local1084196401_0002_m_000000
2019-09-24 14:50:28,679 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:50:28,679 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local1084196401_0002_m_000000_0' done.
2019-09-24 14:50:28,680 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local1084196401_0002_m_000000_0
2019-09-24 14:50:28,680 [Thread-45] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:50:28,893 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:28,894 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:28,899 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:28,902 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 14:50:28,903 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1 0.16.0 rr 2019-09-24 14:50:27 2019-09-24 14:50:28 FILTER

Success!

Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local1084196401_0002 1 0 n/a n/a n/a n/a 0 0 0 0 D1,idlimit MAP_ONLY file:/tmp/temp1989380716/tmp-2053029601,

Input(s):
Successfully read 4 records (86 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"

Output(s):
Successfully stored 2 records in: "file:/tmp/temp1989380716/tmp-2053029601"

Counters:
Total records written : 2
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local1084196401_0002

2019-09-24 14:50:28,907 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:28,919 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:28,920 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:50:28,967 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-09-24 14:50:28,967 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:50:28,971 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:50:28,971 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:50:28,971 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:50:28,993 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 14:50:28,999 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(16, Naveen)
(18, Nagul)
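
The two tuples above are what is left of the four input records once the filter has run. As a rough illustration only, the same result can be reproduced in plain Python. The file contents below are a hypothetical reconstruction (they agree with the tuples dumped in this session and with the 43-byte split size in the log), and the predicate `id > 14` is an assumption, since the FILTER statement itself is not shown in this part of the session.

```python
# Hypothetical reconstruction of emp.txt; the rows match the tuples dumped in
# the session and the byte count matches "Total Length = 43" in the log.
EMP_TXT = "12, Narien\n14, Nandhu\n16, Naveen\n18, Nagul\n"


def load(text, delimiter=","):
    """Split each line on the delimiter and cast the first field to int,
    mirroring the schema D1: {id: int, name: chararray}."""
    rows = []
    for line in text.splitlines():
        raw_id, name = line.split(delimiter, 1)
        rows.append((int(raw_id), name))  # the name keeps its leading space
    return rows


def filter_by_id(rows, threshold=14):
    """Assumed predicate: keep rows whose id is greater than the threshold."""
    return [row for row in rows if row[0] > threshold]


D1 = load(EMP_TXT)
idlimit = filter_by_id(D1)
for row in idlimit:
    print(row)
```

Because the file is split on a bare comma, each name keeps the space that follows the comma, which is why the dump shows `(16, Naveen)` with a space after the comma.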
grunt> describe D1;
D1: {id: int,name: chararray}
grunt> groupbyid = GROUP D1 BY id;
grunt> dump groupbyid;
2019-09-24 14:52:55,549 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: GROUP_BY
2019-09-24 14:52:55,620 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:52:55,622 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:52:55,623 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:52:55,634 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:52:55,634 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 14:52:55,647 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:52:55,659 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 14:52:55,674 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 14:52:55,701 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:52:55,710 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:52:55,711 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:52:55,712 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:52:55,714 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:52:55,715 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase
detected, estimating # of required reducers.
2019-09-24 14:52:55,727 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using
reducer estimator:
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-09-24 14:52:55,733 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator -
BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=43
2019-09-24 14:52:55,733 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting
Parallelism to 1
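
The parallelism of 1 follows from the three numbers the estimator prints. A minimal sketch, assuming the estimator divides the input size by BytesPerReducer, rounds up, and clamps the result between 1 and maxReducers (the function and parameter names here are illustrative, not Pig's API):

```python
import math


def estimate_reducers(total_input_bytes,
                      bytes_per_reducer=1_000_000_000,
                      max_reducers=999):
    """Input-size-based estimate: one reducer per bytes_per_reducer of input,
    clamped to the range [1, max_reducers]."""
    wanted = math.ceil(total_input_bytes / bytes_per_reducer)
    return min(max_reducers, max(1, wanted))


# Values taken from the log line above: totalInputFileSize=43
print(estimate_reducers(43))
```

With a 43-byte input the rounded-up quotient is 1, so a single reducer is used.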
2019-09-24 14:52:55,739 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:52:55,742 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 14:52:55,742 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 14:52:55,742 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569365575741-0
2019-09-24 14:52:55,934 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-
reduce job(s) waiting for submission.
2019-09-24 14:52:55,943 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:52:56,015 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:52:56,083 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 14:52:56,120 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:52:56,120 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:52:56,123 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:52:56,218 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:52:56,405 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local231953609_0003
2019-09-24 14:52:56,692 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 14:52:56,692 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local231953609_0003
2019-09-24 14:52:56,693 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1,groupbyid
2019-09-24 14:52:56,693 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1],groupbyid[3,12] C: R:
2019-09-24 14:52:56,697 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 14:52:56,697 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local231953609_0003]
2019-09-24 14:52:56,698 [Thread-68] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:52:56,703 [Thread-68] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:52:56,706 [Thread-68] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:52:56,706 [Thread-68] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2019-09-24 14:52:56,706 [Thread-68] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:52:56,706 [Thread-68] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:52:56,706 [Thread-68] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:52:56,707 [Thread-68] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:52:56,740 [Thread-68] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for
map tasks
2019-09-24 14:52:56,740 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local231953609_0003_m_000000_0
2019-09-24 14:52:56,757 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:52:56,758 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:52:56,760 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:52:56,780 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 14:52:56,783 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 14:53:01,004 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
2019-09-24 14:53:01,005 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
2019-09-24 14:53:01,005 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - soft limit at 83886080
2019-09-24 14:53:01,005 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
2019-09-24 14:53:01,005 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
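
The buffer figures in the five MapTask lines above all derive from the 100 MB sort buffer. The arithmetic below reproduces them; the 80% spill threshold and the 16-byte metadata record size are assumptions (they are not printed in the log), and the function name is made up for this sketch.

```python
def sort_buffer_layout(sort_mb=100, spill_percent=80, meta_bytes=16):
    """Reproduce the MapTask buffer figures from the configured sort size.

    spill_percent and meta_bytes are assumed defaults, not logged values.
    """
    bufvoid = sort_mb * 1024 * 1024              # total buffer size in bytes
    soft_limit = bufvoid * spill_percent // 100  # spilling starts past this point
    max_records = bufvoid // meta_bytes          # "length" in the kvstart line
    kv_byte_offset = bufvoid - meta_bytes        # last metadata slot, in bytes
    kv_index = kv_byte_offset // 4               # same slot counted in 4-byte ints
    return {
        "bufvoid": bufvoid,
        "soft_limit": soft_limit,
        "length": max_records,
        "kvstart": kv_index,
        "kvstart_bytes": kv_byte_offset,
    }


for name, value in sort_buffer_layout().items():
    print(name, value)
```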
2019-09-24 14:53:01,068 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2019-09-24 14:53:01,082 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:53:01,083 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 14:53:01,093 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map -
Aliases being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1],groupbyid[3,12]
C: R:
2019-09-24 14:53:01,134 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:53:01,135 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Starting flush of map output
2019-09-24 14:53:01,135 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Spilling map output
2019-09-24 14:53:01,135 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufend = 75; bufvoid = 104857600
2019-09-24 14:53:01,135 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
2019-09-24 14:53:01,273 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Finished spill 0
2019-09-24 14:53:01,287 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local231953609_0003_m_000000_0 is done. And is in
the process of committing
2019-09-24 14:53:01,289 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:53:01,289 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local231953609_0003_m_000000_0' done.
2019-09-24 14:53:01,289 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local231953609_0003_m_000000_0
2019-09-24 14:53:01,290 [Thread-68] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:53:01,309 [Thread-68] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for
reduce tasks
2019-09-24 14:53:01,310 [pool-10-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Starting task: attempt_local231953609_0003_r_000000_0
2019-09-24 14:53:01,347 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:53:01,368 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Task - Using
ResourceCalculatorProcessTree : [ ]
2019-09-24 14:53:01,390 [pool-10-thread-1] INFO org.apache.hadoop.mapred.ReduceTask - Using
ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@703e83d7
2019-09-24 14:53:01,415 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50%
complete
2019-09-24 14:53:01,415 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local231953609_0003]
2019-09-24 14:53:01,486 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - MergerManager:
memoryLimit=709551680, maxSingleShuffleLimit=177387920, mergeThreshold=468304128,
ioSortFactor=10, memToMemMergeOutputsThreshold=10
2019-09-24 14:53:01,550 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - attempt_local231953609_0003_r_000000_0
Thread started: EventFetcher for fetching Map Completion Events
2019-09-24 14:53:01,721 [localfetcher#1] INFO
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher - localfetcher#1 about to shuffle output of map
attempt_local231953609_0003_m_000000_0 decomp: 85 len: 89 to MEMORY
2019-09-24 14:53:01,757 [localfetcher#1] INFO
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput - Read 85 bytes from map-output for
attempt_local231953609_0003_m_000000_0
2019-09-24 14:53:01,779 [localfetcher#1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - closeInMemoryFile -> map-output of
size: 85, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->85
2019-09-24 14:53:01,802 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - EventFetcher is interrupted.. Returning
2019-09-24 14:53:01,806 [pool-10-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:53:01,806 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - finalMerge called with 1 in-memory
map-outputs and 0 on-disk map-outputs
2019-09-24 14:53:01,857 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:53:01,861 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 77 bytes
2019-09-24 14:53:01,865 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merged 1 segments, 85 bytes to disk
to satisfy reduce memory limit
2019-09-24 14:53:01,867 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 1 files, 89 bytes from disk
2019-09-24 14:53:01,868 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 0 segments, 0 bytes from
memory into reduce
2019-09-24 14:53:01,872 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:53:01,873 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 77 bytes
2019-09-24 14:53:01,874 [pool-10-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:53:01,933 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:53:02,099 [pool-10-thread-1] INFO org.apache.hadoop.conf.Configuration.deprecation
- mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
2019-09-24 14:53:02,142 [pool-10-thread-1] INFO org.apache.pig.impl.util.SpillableMemoryManager
- Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 14:53:02,143 [pool-10-thread-1] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:53:02,146 [pool-10-thread-1] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce - Aliases
being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1],groupbyid[3,12] C: R:
2019-09-24 14:53:02,952 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Task -
Task:attempt_local231953609_0003_r_000000_0 is done. And is in the process of committing
2019-09-24 14:53:02,975 [pool-10-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:53:02,975 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Task - Task
attempt_local231953609_0003_r_000000_0 is allowed to commit now
2019-09-24 14:53:03,017 [pool-10-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local231953609_0003_r_000000_0' to file:/tmp/temp1989380716/tmp-
123884449/_temporary/0/task_local231953609_0003_r_000000
2019-09-24 14:53:03,023 [pool-10-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
reduce > reduce
2019-09-24 14:53:03,023 [pool-10-thread-1] INFO org.apache.hadoop.mapred.Task - Task
'attempt_local231953609_0003_r_000000_0' done.
2019-09-24 14:53:03,023 [pool-10-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Finishing task: attempt_local231953609_0003_r_000000_0
2019-09-24 14:53:03,030 [Thread-68] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce task
executor complete.
2019-09-24 14:53:03,297 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:53:03,297 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:53:03,302 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:53:03,353 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 14:53:03,355 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1 0.16.0 rr 2019-09-24 14:52:55 2019-09-24 14:53:03 GROUP_BY

Success!

Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local231953609_0003 1 1 n/a n/a n/a n/a n/a n/a n/a n/a D1,groupbyid GROUP_BY file:/tmp/temp1989380716/tmp-123884449,

Input(s):
Successfully read 4 records (258 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"
Output(s):
Successfully stored 4 records in: "file:/tmp/temp1989380716/tmp-123884449"

Counters:
Total records written : 4
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local231953609_0003

2019-09-24 14:53:03,359 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:53:03,363 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:53:03,368 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:53:03,402 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-09-24 14:53:03,408 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:53:03,408 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:53:03,408 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:53:03,409 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:53:03,592 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 14:53:03,595 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(12,{(12, Narien)})
(14,{(14, Nandhu)})
(16,{(16, Naveen)})
(18,{(18, Nagul)})
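
Each output tuple above pairs a group key with a bag holding every D1 record that has that id. A small Python sketch of the same shape, using the four records from this session as literal data (the helper name is illustrative):

```python
from collections import defaultdict

# The four records of D1 as dumped in this session (names keep a leading space).
D1 = [(12, " Narien"), (14, " Nandhu"), (16, " Naveen"), (18, " Nagul")]


def group_by_id(rows):
    """Collect rows into one bag (list) per id, like GROUP D1 BY id."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return sorted(groups.items())


groupbyid = group_by_id(D1)

# Counting the rows in each bag, like COUNT(D1.id) per group.
counts = [(key, len(bag)) for key, bag in groupbyid]

for key, bag in groupbyid:
    print(key, bag)
print(counts)
```

Since every id in this sample is unique, each bag holds one record and a per-group COUNT is 1. Note that the alias `maxid` used in the next statement holds these counts, not a maximum.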
grunt> maxid = FOREACH groupbyid GENERATE group,COUNT(D1.id);
grunt> dump maxid;
2019-09-24 14:54:55,008 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: GROUP_BY
2019-09-24 14:54:55,077 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:54:55,080 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:54:55,081 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:54:55,081 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key
[pig.schematuple] was not set... will not generate code.
2019-09-24 14:54:55,081 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 14:54:55,127 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:54:55,129 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move
algebraic foreach to combiner
2019-09-24 14:54:55,162 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 14:54:55,183 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 14:54:55,245 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:54:55,247 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:54:55,269 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:55,277 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:54:55,278 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:54:55,278 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase
detected, estimating # of required reducers.
2019-09-24 14:54:55,278 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using
reducer estimator:
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-09-24 14:54:55,281 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator -
BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=43
2019-09-24 14:54:55,281 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting
Parallelism to 1
2019-09-24 14:54:55,283 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:54:55,304 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 14:54:55,304 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 14:54:55,304 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569365695286-0
2019-09-24 14:54:55,431 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-
reduce job(s) waiting for submission.
2019-09-24 14:54:55,434 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:55,477 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:54:55,481 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 14:54:55,498 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:54:55,498 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:54:55,546 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:54:55,600 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:54:55,687 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local234157501_0004
2019-09-24 14:54:55,948 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 14:54:55,948 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local234157501_0004
2019-09-24 14:54:55,948 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1,groupbyid,maxid
2019-09-24 14:54:55,948 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1],maxid[4,8],groupbyid[3,12] C: maxid[4,8],groupbyid[3,12] R:
maxid[4,8]
2019-09-24 14:54:55,952 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 14:54:55,953 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local234157501_0004]
2019-09-24 14:54:55,953 [Thread-96] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:54:55,958 [Thread-96] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:54:55,958 [Thread-96] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:54:55,960 [Thread-96] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2019-09-24 14:54:55,960 [Thread-96] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:54:55,961 [Thread-96] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:54:55,961 [Thread-96] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:54:55,961 [Thread-96] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:54:55,983 [Thread-96] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for
map tasks
2019-09-24 14:54:55,984 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local234157501_0004_m_000000_0
2019-09-24 14:54:55,992 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:54:55,997 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:54:55,999 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:54:56,006 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 14:54:56,011 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 14:54:56,240 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
2019-09-24 14:54:56,240 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
2019-09-24 14:54:56,240 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - soft limit at 83886080
2019-09-24 14:54:56,241 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
2019-09-24 14:54:56,241 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
2019-09-24 14:54:56,245 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2019-09-24 14:54:56,249 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:54:56,250 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 14:54:56,268 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map -
Aliases being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-
1],maxid[4,8],groupbyid[3,12] C: maxid[4,8],groupbyid[3,12] R: maxid[4,8]
2019-09-24 14:54:56,303 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:54:56,304 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Starting flush of map output
2019-09-24 14:54:56,304 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Spilling map output
2019-09-24 14:54:56,304 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufend = 44; bufvoid = 104857600
2019-09-24 14:54:56,304 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
2019-09-24 14:54:56,337 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine - Aliases
being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-
1],maxid[4,8],groupbyid[3,12] C: maxid[4,8],groupbyid[3,12] R: maxid[4,8]
2019-09-24 14:54:56,357 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Finished spill 0
2019-09-24 14:54:56,362 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local234157501_0004_m_000000_0 is done. And is in
the process of committing
2019-09-24 14:54:56,364 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:54:56,364 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local234157501_0004_m_000000_0' done.
2019-09-24 14:54:56,364 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local234157501_0004_m_000000_0
2019-09-24 14:54:56,364 [Thread-96] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:54:56,364 [Thread-96] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for
reduce tasks
2019-09-24 14:54:56,365 [pool-13-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Starting task: attempt_local234157501_0004_r_000000_0
2019-09-24 14:54:56,381 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:54:56,382 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Task - Using
ResourceCalculatorProcessTree : [ ]
2019-09-24 14:54:56,383 [pool-13-thread-1] INFO org.apache.hadoop.mapred.ReduceTask - Using
ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@5f425b3b
2019-09-24 14:54:56,383 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - MergerManager:
memoryLimit=709551680, maxSingleShuffleLimit=177387920, mergeThreshold=468304128,
ioSortFactor=10, memToMemMergeOutputsThreshold=10
2019-09-24 14:54:56,385 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - attempt_local234157501_0004_r_000000_0
Thread started: EventFetcher for fetching Map Completion Events
2019-09-24 14:54:56,386 [localfetcher#2] INFO
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher - localfetcher#2 about to shuffle output of map
attempt_local234157501_0004_m_000000_0 decomp: 54 len: 58 to MEMORY
2019-09-24 14:54:56,391 [localfetcher#2] INFO
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput - Read 54 bytes from map-output for
attempt_local234157501_0004_m_000000_0
2019-09-24 14:54:56,391 [localfetcher#2] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - closeInMemoryFile -> map-output of
size: 54, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->54
2019-09-24 14:54:56,391 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - EventFetcher is interrupted.. Returning
2019-09-24 14:54:56,392 [pool-13-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:54:56,392 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - finalMerge called with 1 in-memory
map-outputs and 0 on-disk map-outputs
2019-09-24 14:54:56,393 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:54:56,393 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 46 bytes
2019-09-24 14:54:56,394 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merged 1 segments, 54 bytes to disk
to satisfy reduce memory limit
2019-09-24 14:54:56,394 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 1 files, 58 bytes from disk
2019-09-24 14:54:56,394 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 0 segments, 0 bytes from
memory into reduce
2019-09-24 14:54:56,394 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:54:56,409 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 46 bytes
2019-09-24 14:54:56,410 [pool-13-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:54:56,416 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:54:56,455 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50%
complete
2019-09-24 14:54:56,455 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local234157501_0004]
2019-09-24 14:54:56,461 [pool-13-thread-1] INFO org.apache.pig.impl.util.SpillableMemoryManager
- Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 14:54:56,461 [pool-13-thread-1] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:54:56,475 [pool-13-thread-1] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce - Aliases
being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-
1],maxid[4,8],groupbyid[3,12] C: maxid[4,8],groupbyid[3,12] R: maxid[4,8]
2019-09-24 14:54:56,479 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Task -
Task:attempt_local234157501_0004_r_000000_0 is done. And is in the process of committing
2019-09-24 14:54:56,482 [pool-13-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:54:56,485 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Task - Task
attempt_local234157501_0004_r_000000_0 is allowed to commit now
2019-09-24 14:54:56,490 [pool-13-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local234157501_0004_r_000000_0' to
file:/tmp/temp1989380716/tmp770682788/_temporary/0/task_local234157501_0004_r_000000
2019-09-24 14:54:56,491 [pool-13-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
reduce > reduce
2019-09-24 14:54:56,498 [pool-13-thread-1] INFO org.apache.hadoop.mapred.Task - Task
'attempt_local234157501_0004_r_000000_0' done.
2019-09-24 14:54:56,498 [pool-13-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Finishing task: attempt_local234157501_0004_r_000000_0
2019-09-24 14:54:56,498 [Thread-96] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce task
executor complete.
2019-09-24 14:54:56,711 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:56,769 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:57,445 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:57,694 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 14:54:57,695 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:

HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
2.7.1          0.16.0      rr      2019-09-24 14:54:55  2019-09-24 14:54:57  GROUP_BY

Success!

Job Stats (time in seconds):


JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime
MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias
Feature Outputs
job_local234157501_0004 1 1 n/a n/a n/a n/a n/a n/a n/a n/a
D1,groupbyid,maxid GROUP_BY,COMBINER file:/tmp/temp1989380716/tmp770682788,

Input(s):
Successfully read 4 records (344 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"

Output(s):
Successfully stored 4 records in: "file:/tmp/temp1989380716/tmp770682788"

Counters:
Total records written : 4
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local234157501_0004

2019-09-24 14:54:57,905 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:57,907 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:57,909 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:54:57,962 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-09-24 14:54:57,962 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:54:57,970 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:54:57,971 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:54:57,971 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:54:58,016 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 14:54:58,016 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(12,1)
(14,1)
(16,1)
(18,1)
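
The next grunt statement, GROUP D1 BY name, produces one tuple per distinct name: the first field is the group key and the second is a bag holding every full D1 tuple that has that key. The sketch below models that behaviour in plain Python rather than Pig Latin. The rows and the (id, name) layout are hypothetical, because the contents and schema of emp.txt are not shown in this part of the log, and the function name group_by_name is made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: the real contents of emp.txt do not appear in this log.
# Each tuple is assumed to be (id, name).
D1 = [(12, "asha"), (14, "ravi"), (16, "asha"), (18, "meena")]


def group_by_name(rows):
    """Model of GROUP D1 BY name: one (group, bag) pair per distinct name."""
    bags = defaultdict(list)
    for row in rows:
        # The bag keeps the whole input tuple, not just the non-key fields.
        bags[row[1]].append(row)
    # Sorted by key only to make the printed order deterministic.
    return sorted(bags.items())


groupbyname = group_by_name(D1)
for group, bag in groupbyname:
    print((group, bag))
```

In Pig itself the bag would be printed in braces and named after the input relation (D1), and the order of groups in the dump is not guaranteed.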
grunt> groupbyname = GROUP D1 BY name;
grunt> dump groupbyname;
2019-09-24 14:56:07,772 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: GROUP_BY
2019-09-24 14:56:08,000 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:56:08,001 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:56:08,002 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:56:08,007 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key
[pig.schematuple] was not set... will not generate code.
2019-09-24 14:56:08,007 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 14:56:08,010 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:56:08,017 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 14:56:08,018 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 14:56:08,335 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:56:08,354 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:56:08,362 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:08,374 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:56:08,385 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:56:08,385 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase
detected, estimating # of required reducers.
2019-09-24 14:56:08,385 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using
reducer estimator:
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-09-24 14:56:08,392 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator -
BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=43
2019-09-24 14:56:08,394 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting
Parallelism to 1
2019-09-24 14:56:08,404 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:56:08,405 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 14:56:08,411 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 14:56:08,412 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569365768405-0
2019-09-24 14:56:08,536 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-
reduce job(s) waiting for submission.
2019-09-24 14:56:08,547 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:08,622 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:56:08,746 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 14:56:08,840 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:56:08,840 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:56:08,900 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:56:09,230 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:56:09,439 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local1713093984_0005
2019-09-24 14:56:09,741 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 14:56:09,741 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local1713093984_0005
2019-09-24 14:56:09,741 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1,groupbyname
2019-09-24 14:56:09,742 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1],groupbyname[5,14] C: R:
2019-09-24 14:56:09,746 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 14:56:09,746 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local1713093984_0005]
2019-09-24 14:56:09,747 [Thread-124] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:56:09,751 [Thread-124] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:56:09,754 [Thread-124] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:56:09,754 [Thread-124] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2019-09-24 14:56:09,754 [Thread-124] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:56:09,754 [Thread-124] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:56:09,755 [Thread-124] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:56:09,755 [Thread-124] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:56:09,774 [Thread-124] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for map tasks
2019-09-24 14:56:09,775 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local1713093984_0005_m_000000_0
2019-09-24 14:56:09,790 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:56:09,794 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:56:09,796 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:56:09,802 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 14:56:09,802 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 14:56:10,366 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
2019-09-24 14:56:10,367 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
2019-09-24 14:56:10,367 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - soft limit at 83886080
2019-09-24 14:56:10,367 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
2019-09-24 14:56:10,367 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
2019-09-24 14:56:10,368 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2019-09-24 14:56:10,372 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:56:10,372 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 14:56:10,376 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map -
Aliases being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-
1],groupbyname[5,14] C: R:
2019-09-24 14:56:10,392 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:56:10,394 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Starting flush of map output
2019-09-24 14:56:10,394 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Spilling map output
2019-09-24 14:56:10,395 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufend = 59; bufvoid = 104857600
2019-09-24 14:56:10,395 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
2019-09-24 14:56:10,396 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Finished spill 0
2019-09-24 14:56:10,403 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local1713093984_0005_m_000000_0 is done. And is
in the process of committing
2019-09-24 14:56:10,413 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:56:10,414 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local1713093984_0005_m_000000_0' done.
2019-09-24 14:56:10,414 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local1713093984_0005_m_000000_0
2019-09-24 14:56:10,414 [Thread-124] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:56:10,419 [Thread-124] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for reduce tasks
2019-09-24 14:56:10,420 [pool-16-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Starting task: attempt_local1713093984_0005_r_000000_0
2019-09-24 14:56:10,433 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:56:10,440 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Task - Using
ResourceCalculatorProcessTree : [ ]
2019-09-24 14:56:10,440 [pool-16-thread-1] INFO org.apache.hadoop.mapred.ReduceTask - Using
ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@51b87349
2019-09-24 14:56:10,441 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - MergerManager:
memoryLimit=709551680, maxSingleShuffleLimit=177387920, mergeThreshold=468304128,
ioSortFactor=10, memToMemMergeOutputsThreshold=10
2019-09-24 14:56:10,442 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher -
attempt_local1713093984_0005_r_000000_0 Thread started: EventFetcher for fetching Map
Completion Events
2019-09-24 14:56:10,443 [localfetcher#3] INFO
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher - localfetcher#3 about to shuffle output of map
attempt_local1713093984_0005_m_000000_0 decomp: 69 len: 73 to MEMORY
2019-09-24 14:56:10,444 [localfetcher#3] INFO
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput - Read 69 bytes from map-output for
attempt_local1713093984_0005_m_000000_0
2019-09-24 14:56:10,444 [localfetcher#3] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - closeInMemoryFile -> map-output of
size: 69, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->69
2019-09-24 14:56:10,445 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - EventFetcher is interrupted.. Returning
2019-09-24 14:56:10,447 [pool-16-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:56:10,447 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - finalMerge called with 1 in-memory
map-outputs and 0 on-disk map-outputs
2019-09-24 14:56:10,449 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:56:10,460 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 58 bytes
2019-09-24 14:56:10,461 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merged 1 segments, 69 bytes to disk
to satisfy reduce memory limit
2019-09-24 14:56:10,461 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 1 files, 73 bytes from disk
2019-09-24 14:56:10,461 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 0 segments, 0 bytes from
memory into reduce
2019-09-24 14:56:10,461 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:56:10,461 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 58 bytes
2019-09-24 14:56:10,462 [pool-16-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:56:10,463 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:56:10,493 [pool-16-thread-1] INFO org.apache.pig.impl.util.SpillableMemoryManager
- Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 14:56:10,493 [pool-16-thread-1] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:56:10,495 [pool-16-thread-1] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce - Aliases
being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1],groupbyname[5,14] C:
R:
2019-09-24 14:56:10,496 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Task -
Task:attempt_local1713093984_0005_r_000000_0 is done. And is in the process of committing
2019-09-24 14:56:10,498 [pool-16-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:56:10,502 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Task - Task
attempt_local1713093984_0005_r_000000_0 is allowed to commit now
2019-09-24 14:56:10,503 [pool-16-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local1713093984_0005_r_000000_0' to
file:/tmp/temp1989380716/tmp261580045/_temporary/0/task_local1713093984_0005_r_000000
2019-09-24 14:56:10,515 [pool-16-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
reduce > reduce
2019-09-24 14:56:10,519 [pool-16-thread-1] INFO org.apache.hadoop.mapred.Task - Task
'attempt_local1713093984_0005_r_000000_0' done.
2019-09-24 14:56:10,519 [pool-16-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Finishing task: attempt_local1713093984_0005_r_000000_0
2019-09-24 14:56:10,521 [Thread-124] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce
task executor complete.
2019-09-24 14:56:10,760 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:10,762 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:10,763 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:10,787 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 14:56:10,792 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features


2.7.1 0.16.0 rr 2019-09-24 14:56:08 2019-09-24 14:56:10 GROUP_BY

Success!

Job Stats (time in seconds):


JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime
MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias
Feature Outputs
job_local1713093984_0005 1 1 n/a n/a n/a n/a n/a n/a n/a n/a
D1,groupbyname GROUP_BY file:/tmp/temp1989380716/tmp261580045,

Input(s):
Successfully read 4 records (430 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"

Output(s):
Successfully stored 4 records in: "file:/tmp/temp1989380716/tmp261580045"

Counters:
Total records written : 4
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local1713093984_0005

2019-09-24 14:56:10,795 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:10,795 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:10,796 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:56:10,809 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-09-24 14:56:10,819 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:56:10,819 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:56:10,820 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:56:10,820 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:56:10,845 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 14:56:10,846 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
( Nagul,{(18, Nagul)})
( Nandhu,{(14, Nandhu)})
( Narien,{(12, Narien)})
( Naveen,{(16, Naveen)})
grunt> groupp = GROUP D1 ALL;
grunt> dump groupp;
2019-09-24 14:57:23,153 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: GROUP_BY
2019-09-24 14:57:23,252 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:57:23,263 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:57:23,264 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:57:23,264 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key
[pig.schematuple] was not set... will not generate code.
2019-09-24 14:57:23,264 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 14:57:23,269 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:57:23,275 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 14:57:23,287 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 14:57:23,306 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:57:23,315 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:57:23,316 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:23,317 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:57:23,317 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:57:23,318 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase
detected, estimating # of required reducers.
2019-09-24 14:57:23,318 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting
Parallelism to 1
2019-09-24 14:57:23,361 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:57:23,366 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 14:57:23,366 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 14:57:23,367 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569365843362-0
2019-09-24 14:57:23,395 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-
reduce job(s) waiting for submission.
2019-09-24 14:57:23,560 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:23,672 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:57:23,677 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 14:57:23,689 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:57:23,689 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:57:23,692 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:57:23,741 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:57:24,459 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local961983538_0006
2019-09-24 14:57:25,851 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 14:57:25,852 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local961983538_0006
2019-09-24 14:57:25,852 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1,groupp
2019-09-24 14:57:25,852 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1],groupp[6,9] C: R:
2019-09-24 14:57:25,859 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 14:57:25,860 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local961983538_0006]
2019-09-24 14:57:25,922 [Thread-152] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:57:25,926 [Thread-152] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:57:25,940 [Thread-152] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:57:25,940 [Thread-152] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2019-09-24 14:57:25,940 [Thread-152] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:57:25,940 [Thread-152] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:57:25,940 [Thread-152] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:57:25,941 [Thread-152] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:57:25,943 [Thread-152] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for map tasks
2019-09-24 14:57:26,016 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local961983538_0006_m_000000_0
2019-09-24 14:57:26,084 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:57:26,089 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:57:26,091 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:57:26,131 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 14:57:26,134 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 14:57:26,703 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
2019-09-24 14:57:26,703 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
2019-09-24 14:57:26,704 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - soft limit at 83886080
2019-09-24 14:57:26,704 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
2019-09-24 14:57:26,704 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
2019-09-24 14:57:26,705 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2019-09-24 14:57:26,721 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:57:26,724 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 14:57:26,747 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map -
Aliases being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1],groupp[6,9] C:
R:
2019-09-24 14:57:26,777 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:57:26,777 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Starting flush of map output
2019-09-24 14:57:26,778 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Spilling map output
2019-09-24 14:57:26,778 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufend = 83; bufvoid = 104857600
2019-09-24 14:57:26,778 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
2019-09-24 14:57:26,779 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Finished spill 0
2019-09-24 14:57:26,781 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local961983538_0006_m_000000_0 is done. And is in
the process of committing
2019-09-24 14:57:26,782 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:57:26,782 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local961983538_0006_m_000000_0' done.
2019-09-24 14:57:26,782 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local961983538_0006_m_000000_0
2019-09-24 14:57:26,783 [Thread-152] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:57:26,783 [Thread-152] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for reduce tasks
2019-09-24 14:57:26,783 [pool-19-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Starting task: attempt_local961983538_0006_r_000000_0
2019-09-24 14:57:26,801 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:57:26,806 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Task - Using
ResourceCalculatorProcessTree : [ ]
2019-09-24 14:57:26,807 [pool-19-thread-1] INFO org.apache.hadoop.mapred.ReduceTask - Using
ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@5ffc30a1
2019-09-24 14:57:26,807 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - MergerManager:
memoryLimit=709551680, maxSingleShuffleLimit=177387920, mergeThreshold=468304128,
ioSortFactor=10, memToMemMergeOutputsThreshold=10
2019-09-24 14:57:26,811 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - attempt_local961983538_0006_r_000000_0
Thread started: EventFetcher for fetching Map Completion Events
2019-09-24 14:57:26,813 [localfetcher#4] INFO
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher - localfetcher#4 about to shuffle output of map
attempt_local961983538_0006_m_000000_0 decomp: 93 len: 97 to MEMORY
2019-09-24 14:57:26,822 [localfetcher#4] INFO
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput - Read 93 bytes from map-output for
attempt_local961983538_0006_m_000000_0
2019-09-24 14:57:26,822 [localfetcher#4] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - closeInMemoryFile -> map-output of
size: 93, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->93
2019-09-24 14:57:26,823 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - EventFetcher is interrupted.. Returning
2019-09-24 14:57:26,824 [pool-19-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:57:26,827 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - finalMerge called with 1 in-memory
map-outputs and 0 on-disk map-outputs
2019-09-24 14:57:26,828 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:57:26,829 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 85 bytes
2019-09-24 14:57:26,830 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merged 1 segments, 93 bytes to disk
to satisfy reduce memory limit
2019-09-24 14:57:26,834 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 1 files, 97 bytes from disk
2019-09-24 14:57:26,834 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 0 segments, 0 bytes from
memory into reduce
2019-09-24 14:57:26,834 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:57:26,835 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 85 bytes
2019-09-24 14:57:26,835 [pool-19-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:57:26,836 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:57:26,848 [pool-19-thread-1] INFO org.apache.pig.impl.util.SpillableMemoryManager
- Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 14:57:26,855 [pool-19-thread-1] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:57:26,856 [pool-19-thread-1] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce - Aliases
being processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1],groupp[6,9] C: R:
2019-09-24 14:57:26,857 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Task -
Task:attempt_local961983538_0006_r_000000_0 is done. And is in the process of committing
2019-09-24 14:57:26,859 [pool-19-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:57:26,869 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Task - Task
attempt_local961983538_0006_r_000000_0 is allowed to commit now
2019-09-24 14:57:26,871 [pool-19-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local961983538_0006_r_000000_0' to file:/tmp/temp1989380716/tmp-
586598137/_temporary/0/task_local961983538_0006_r_000000
2019-09-24 14:57:26,875 [pool-19-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
reduce > reduce
2019-09-24 14:57:26,879 [pool-19-thread-1] INFO org.apache.hadoop.mapred.Task - Task
'attempt_local961983538_0006_r_000000_0' done.
2019-09-24 14:57:26,879 [pool-19-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Finishing task: attempt_local961983538_0006_r_000000_0
2019-09-24 14:57:26,879 [Thread-152] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce
task executor complete.
2019-09-24 14:57:27,024 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:27,025 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:27,026 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:27,032 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 14:57:27,033 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features


2.7.1 0.16.0 rr 2019-09-24 14:57:23 2019-09-24 14:57:27 GROUP_BY

Success!

Job Stats (time in seconds):


JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime
MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias
Feature Outputs
job_local961983538_0006 1 1 n/a n/a n/a n/a n/a n/a n/a n/a
D1,groupp GROUP_BY file:/tmp/temp1989380716/tmp-586598137,
Input(s):
Successfully read 4 records (516 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"

Output(s):
Successfully stored 1 records in: "file:/tmp/temp1989380716/tmp-586598137"

Counters:
Total records written : 1
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local961983538_0006

2019-09-24 14:57:27,038 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:27,044 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:27,048 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:57:27,065 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-09-24 14:57:27,070 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:57:27,070 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:57:27,070 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:57:27,071 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:57:27,090 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 14:57:27,095 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(all,{(18, Nagul),(16, Naveen),(14, Nandhu),(12, Narien)})
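Note on the two dumps above: grouping by name produced four records, one bag per name, while GROUP D1 ALL produced a single record whose key is the literal string "all" and whose bag holds every row. The following is a rough Python sketch of that difference, not Pig code. The helper names group_all and group_by_field are hypothetical, and the field order (id, name) is assumed from the dump output.

```python
# Rows copied from the dump output above; the field order (id, name) is an assumption.
# The leading space in each name is kept because the dump shows it.
rows = [(18, " Nagul"), (16, " Naveen"), (14, " Nandhu"), (12, " Narien")]


def group_all(records):
    """Model GROUP ... ALL: every record lands in one bag keyed by the string 'all'."""
    return [("all", list(records))]


def group_by_field(records, index):
    """Model GROUP ... BY <field>: one (key, bag) pair per distinct key."""
    bags = {}
    for record in records:
        bags.setdefault(record[index], []).append(record)
    return sorted(bags.items())


grouped_all = group_all(rows)
grouped_by_name = group_by_field(rows, 1)

print(len(grouped_all))      # one output record, as in "Successfully stored 1 records"
print(len(grouped_by_name))  # four output records, one per distinct name
```

This only models the shape of the result. Pig does not guarantee the order of tuples inside a bag, so the order shown in the dump should not be relied on.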
grunt> sortbyname = ORDER D1 by id;
grunt> dump sortbyname;
2019-09-24 14:58:35,543 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: ORDER_BY
2019-09-24 14:58:35,602 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:35,603 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:58:35,607 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:35,626 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key
[pig.schematuple] was not set... will not generate code.
2019-09-24 14:58:35,627 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 14:58:35,630 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:58:35,692 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SecondaryKeyOptimizerMR -
Using Secondary Key Optimization for MapReduce node scope-120
2019-09-24 14:58:35,694 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 3
2019-09-24 14:58:35,695 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 3
2019-09-24 14:58:35,764 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:35,773 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:35,777 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:35,782 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:58:35,783 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:58:35,786 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:58:35,800 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 14:58:35,800 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 14:58:35,800 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569365915800-0
2019-09-24 14:58:35,837 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-
reduce job(s) waiting for submission.
2019-09-24 14:58:35,874 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:35,954 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:58:35,957 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 14:58:36,000 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:58:36,000 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:58:36,003 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:58:36,052 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:58:36,101 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local2027325755_0007
2019-09-24 14:58:36,301 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 14:58:36,302 [Thread-180] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:58:36,306 [Thread-180] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:36,306 [Thread-180] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:58:36,306 [Thread-180] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:58:36,306 [Thread-180] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:36,306 [Thread-180] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:36,307 [Thread-180] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:58:36,311 [Thread-180] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for map tasks
2019-09-24 14:58:36,314 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local2027325755_0007_m_000000_0
2019-09-24 14:58:36,324 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:36,340 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:58:36,346 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:
-----------------------

2019-09-24 14:58:36,351 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 14:58:36,352 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 14:58:36,355 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:36,366 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local2027325755_0007
2019-09-24 14:58:36,366 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1
2019-09-24 14:58:36,367 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1] C: R:
2019-09-24 14:58:36,368 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 14:58:36,368 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local2027325755_0007]
2019-09-24 14:58:36,385 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:58:36,386 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 14:58:36,390 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1] C: R:
2019-09-24 14:58:36,401 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:58:36,404 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local2027325755_0007_m_000000_0 is done. And is
in the process of committing
2019-09-24 14:58:36,406 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:58:36,414 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task attempt_local2027325755_0007_m_000000_0 is allowed to
commit now
2019-09-24 14:58:36,416 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local2027325755_0007_m_000000_0' to
file:/tmp/temp1989380716/tmp2118310213/_temporary/0/task_local2027325755_0007_m_000000
2019-09-24 14:58:36,420 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:58:36,420 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local2027325755_0007_m_000000_0' done.
2019-09-24 14:58:36,420 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local2027325755_0007_m_000000_0
2019-09-24 14:58:36,420 [Thread-180] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:58:36,607 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33%
complete
2019-09-24 14:58:36,610 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:36,611 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:36,612 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:36,618 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:58:36,624 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:58:36,630 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase
detected, estimating # of required reducers.
2019-09-24 14:58:36,638 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using
reducer estimator:
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-09-24 14:58:36,648 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator -
BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=63
2019-09-24 14:58:36,649 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting
Parallelism to 1
2019-09-24 14:58:36,650 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:58:36,705 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-
reduce job(s) waiting for submission.
2019-09-24 14:58:36,715 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:58:36,730 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:36,735 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:36,747 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:36,774 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:58:36,808 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:58:36,808 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:58:36,808 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:58:36,879 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:58:36,962 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local2104222054_0008
2019-09-24 14:58:37,164 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 14:58:37,165 [Thread-200] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:58:37,169 [Thread-200] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:37,169 [Thread-200] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:58:37,169 [Thread-200] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2019-09-24 14:58:37,170 [Thread-200] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:58:37,170 [Thread-200] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:37,170 [Thread-200] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:37,170 [Thread-200] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:58:37,173 [Thread-200] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for map tasks
2019-09-24 14:58:37,174 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local2104222054_0008_m_000000_0
2019-09-24 14:58:37,187 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:37,199 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:58:37,201 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 63
Input split[0]:
Length = 63
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:58:37,206 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed file:/tmp/temp1989380716/tmp2118310213/part-m-00000:0+63
2019-09-24 14:58:37,393 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local2104222054_0008
2019-09-24 14:58:37,399 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases sortbyname
2019-09-24 14:58:37,401 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: sortbyname[7,13] C: R:
2019-09-24 14:58:37,564 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
2019-09-24 14:58:37,565 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
2019-09-24 14:58:37,565 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - soft limit at 83886080
2019-09-24 14:58:37,565 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
2019-09-24 14:58:37,565 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
2019-09-24 14:58:37,600 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2019-09-24 14:58:37,608 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:58:37,609 [LocalJobRunner Map Task Executor #0] WARN
org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-09-24 14:58:37,612 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map -
Aliases being processed per job phase (AliasName[line,offset]): M: sortbyname[7,13] C: R:
2019-09-24 14:58:37,615 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:58:37,618 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Starting flush of map output
2019-09-24 14:58:37,618 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Spilling map output
2019-09-24 14:58:37,618 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufend = 64; bufvoid = 104857600
2019-09-24 14:58:37,618 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
2019-09-24 14:58:37,621 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Finished spill 0
2019-09-24 14:58:37,629 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local2104222054_0008_m_000000_0 is done. And is
in the process of committing
2019-09-24 14:58:37,631 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:58:37,631 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local2104222054_0008_m_000000_0' done.
2019-09-24 14:58:37,631 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local2104222054_0008_m_000000_0
2019-09-24 14:58:37,631 [Thread-200] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:58:37,632 [Thread-200] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for reduce tasks
2019-09-24 14:58:37,632 [pool-24-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Starting task: attempt_local2104222054_0008_r_000000_0
2019-09-24 14:58:37,637 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:37,648 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Task - Using
ResourceCalculatorProcessTree : [ ]
2019-09-24 14:58:37,650 [pool-24-thread-1] INFO org.apache.hadoop.mapred.ReduceTask - Using
ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@7665dca7
2019-09-24 14:58:37,652 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - MergerManager:
memoryLimit=709551680, maxSingleShuffleLimit=177387920, mergeThreshold=468304128,
ioSortFactor=10, memToMemMergeOutputsThreshold=10
2019-09-24 14:58:37,656 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher -
attempt_local2104222054_0008_r_000000_0 Thread started: EventFetcher for fetching Map
Completion Events
2019-09-24 14:58:37,659 [localfetcher#5] INFO
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher - localfetcher#5 about to shuffle output of map
attempt_local2104222054_0008_m_000000_0 decomp: 74 len: 78 to MEMORY
2019-09-24 14:58:37,661 [localfetcher#5] INFO
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput - Read 74 bytes from map-output for
attempt_local2104222054_0008_m_000000_0
2019-09-24 14:58:37,661 [localfetcher#5] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - closeInMemoryFile -> map-output of
size: 74, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->74
2019-09-24 14:58:37,662 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - EventFetcher is interrupted.. Returning
2019-09-24 14:58:37,664 [pool-24-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:58:37,668 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - finalMerge called with 1 in-memory
map-outputs and 0 on-disk map-outputs
2019-09-24 14:58:37,675 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:58:37,676 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 61 bytes
2019-09-24 14:58:37,676 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merged 1 segments, 74 bytes to disk
to satisfy reduce memory limit
2019-09-24 14:58:37,677 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 1 files, 78 bytes from disk
2019-09-24 14:58:37,677 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 0 segments, 0 bytes from
memory into reduce
2019-09-24 14:58:37,677 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:58:37,677 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 61 bytes
2019-09-24 14:58:37,678 [pool-24-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:58:37,683 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:37,694 [pool-24-thread-1] INFO org.apache.pig.impl.util.SpillableMemoryManager
- Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 14:58:37,695 [pool-24-thread-1] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:58:37,720 [pool-24-thread-1] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce - Aliases
being processed per job phase (AliasName[line,offset]): M: sortbyname[7,13] C: R:
2019-09-24 14:58:37,731 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Task -
Task:attempt_local2104222054_0008_r_000000_0 is done. And is in the process of committing
2019-09-24 14:58:37,733 [pool-24-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:58:37,734 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Task - Task
attempt_local2104222054_0008_r_000000_0 is allowed to commit now
2019-09-24 14:58:37,735 [pool-24-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local2104222054_0008_r_000000_0' to
file:/tmp/temp1989380716/tmp-523252261/_temporary/0/task_local2104222054_0008_r_000000
2019-09-24 14:58:37,738 [pool-24-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
reduce > reduce
2019-09-24 14:58:37,739 [pool-24-thread-1] INFO org.apache.hadoop.mapred.Task - Task
'attempt_local2104222054_0008_r_000000_0' done.
2019-09-24 14:58:37,741 [pool-24-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Finishing task: attempt_local2104222054_0008_r_000000_0
2019-09-24 14:58:37,741 [Thread-200] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce
task executor complete.
2019-09-24 14:58:37,970 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66%
complete
2019-09-24 14:58:37,974 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:37,975 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:37,976 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:37,996 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 14:58:38,011 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 14:58:38,011 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase
detected, estimating # of required reducers.
2019-09-24 14:58:38,014 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting
Parallelism to 1
2019-09-24 14:58:38,019 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 14:58:38,049 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-
reduce job(s) waiting for submission.
2019-09-24 14:58:38,049 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:58:38,054 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:38,065 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:38,066 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:38,080 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 14:58:38,106 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 14:58:38,118 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 14:58:38,118 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 14:58:38,171 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 14:58:38,211 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local2057231340_0009
2019-09-24 14:58:38,458 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 14:58:38,458 [Thread-226] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 14:58:38,463 [Thread-226] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:38,463 [Thread-226] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 14:58:38,463 [Thread-226] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2019-09-24 14:58:38,464 [Thread-226] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:58:38,464 [Thread-226] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:38,464 [Thread-226] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:38,464 [Thread-226] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 14:58:38,467 [Thread-226] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for map tasks
2019-09-24 14:58:38,467 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local2057231340_0009_m_000000_0
2019-09-24 14:58:38,506 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:38,508 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 14:58:38,513 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 63
Input split[0]:
Length = 63
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 14:58:38,515 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed file:/tmp/temp1989380716/tmp2118310213/part-m-00000:0+63
2019-09-24 14:58:38,682 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local2057231340_0009
2019-09-24 14:58:38,682 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases sortbyname
2019-09-24 14:58:38,682 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: sortbyname[7,13] C: R:
2019-09-24 14:58:38,828 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
2019-09-24 14:58:38,828 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
2019-09-24 14:58:38,829 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - soft limit at 83886080
2019-09-24 14:58:38,829 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
2019-09-24 14:58:38,829 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
2019-09-24 14:58:38,830 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2019-09-24 14:58:38,835 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 14:58:38,835 [LocalJobRunner Map Task Executor #0] WARN
org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-09-24 14:58:38,837 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map -
Aliases being processed per job phase (AliasName[line,offset]): M: sortbyname[7,13] C: R:
2019-09-24 14:58:38,842 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 14:58:38,842 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Starting flush of map output
2019-09-24 14:58:38,842 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Spilling map output
2019-09-24 14:58:38,842 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - bufstart = 0; bufend = 75; bufvoid = 104857600
2019-09-24 14:58:38,842 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
2019-09-24 14:58:38,844 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Finished spill 0
2019-09-24 14:58:38,852 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local2057231340_0009_m_000000_0 is done. And is
in the process of committing
2019-09-24 14:58:38,863 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 14:58:38,863 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local2057231340_0009_m_000000_0' done.
2019-09-24 14:58:38,863 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local2057231340_0009_m_000000_0
2019-09-24 14:58:38,863 [Thread-226] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 14:58:38,864 [Thread-226] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for reduce tasks
2019-09-24 14:58:38,864 [pool-27-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Starting task: attempt_local2057231340_0009_r_000000_0
2019-09-24 14:58:38,882 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:38,883 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Task - Using
ResourceCalculatorProcessTree : [ ]
2019-09-24 14:58:38,887 [pool-27-thread-1] INFO org.apache.hadoop.mapred.ReduceTask - Using
ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@8198439
2019-09-24 14:58:38,888 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - MergerManager:
memoryLimit=709551680, maxSingleShuffleLimit=177387920, mergeThreshold=468304128,
ioSortFactor=10, memToMemMergeOutputsThreshold=10
2019-09-24 14:58:38,890 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher -
attempt_local2057231340_0009_r_000000_0 Thread started: EventFetcher for fetching Map
Completion Events
2019-09-24 14:58:38,891 [localfetcher#6] INFO
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher - localfetcher#6 about to shuffle output of map
attempt_local2057231340_0009_m_000000_0 decomp: 85 len: 89 to MEMORY
2019-09-24 14:58:38,893 [localfetcher#6] INFO
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput - Read 85 bytes from map-output for
attempt_local2057231340_0009_m_000000_0
2019-09-24 14:58:38,894 [localfetcher#6] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - closeInMemoryFile -> map-output of
size: 85, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->85
2019-09-24 14:58:38,894 [EventFetcher for fetching Map Completion Events] INFO
org.apache.hadoop.mapreduce.task.reduce.EventFetcher - EventFetcher is interrupted.. Returning
2019-09-24 14:58:38,895 [pool-27-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:58:38,899 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - finalMerge called with 1 in-memory
map-outputs and 0 on-disk map-outputs
2019-09-24 14:58:38,905 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:58:38,906 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 77 bytes
2019-09-24 14:58:38,906 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merged 1 segments, 85 bytes to disk
to satisfy reduce memory limit
2019-09-24 14:58:38,907 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 1 files, 89 bytes from disk
2019-09-24 14:58:38,907 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl - Merging 0 segments, 0 bytes from
memory into reduce
2019-09-24 14:58:38,907 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Merger - Merging 1
sorted segments
2019-09-24 14:58:38,907 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Merger - Down to the
last merge-pass, with 1 segments left of total size: 77 bytes
2019-09-24 14:58:38,908 [pool-27-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:58:38,911 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 14:58:38,967 [pool-27-thread-1] INFO org.apache.pig.impl.util.SpillableMemoryManager
- Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 14:58:38,978 [pool-27-thread-1] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:58:38,990 [pool-27-thread-1] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce - Aliases
being processed per job phase (AliasName[line,offset]): M: sortbyname[7,13] C: R:
2019-09-24 14:58:38,992 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Task -
Task:attempt_local2057231340_0009_r_000000_0 is done. And is in the process of committing
2019-09-24 14:58:38,994 [pool-27-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - 1 / 1
copied.
2019-09-24 14:58:38,994 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Task - Task
attempt_local2057231340_0009_r_000000_0 is allowed to commit now
2019-09-24 14:58:38,996 [pool-27-thread-1] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local2057231340_0009_r_000000_0' to
file:/tmp/temp1989380716/tmp-1327869809/_temporary/0/task_local2057231340_0009_r_000000
2019-09-24 14:58:38,997 [pool-27-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
reduce > reduce
2019-09-24 14:58:39,001 [pool-27-thread-1] INFO org.apache.hadoop.mapred.Task - Task
'attempt_local2057231340_0009_r_000000_0' done.
2019-09-24 14:58:39,002 [pool-27-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
Finishing task: attempt_local2057231340_0009_r_000000_0
2019-09-24 14:58:39,002 [Thread-226] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce
task executor complete.
2019-09-24 14:58:39,134 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,134 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,138 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,150 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 14:58:39,152 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1 0.16.0 rr 2019-09-24 14:58:35 2019-09-24 14:58:39 ORDER_BY

Success!

Job Stats (time in seconds):


JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_local2027325755_0007	1	0	n/a	n/a	n/a	n/a	0	0	0	0	D1	MAP_ONLY
job_local2057231340_0009	1	1	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a	sortbyname	ORDER_BY	file:/tmp/temp1989380716/tmp-1327869809,
job_local2104222054_0008	1	1	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a	sortbyname	SAMPLER

Input(s):
Successfully read 4 records (301 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"

Output(s):
Successfully stored 4 records in: "file:/tmp/temp1989380716/tmp-1327869809"

Counters:
Total records written : 4
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local2027325755_0007 -> job_local2104222054_0008,
job_local2104222054_0008 -> job_local2057231340_0009,
job_local2057231340_0009

2019-09-24 14:58:39,159 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,164 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,171 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,179 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,182 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,182 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,207 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,215 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,215 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 14:58:39,227 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-09-24 14:58:39,230 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:58:39,231 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:58:39,232 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:58:39,233 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 14:58:39,272 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 14:58:39,273 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(12, Narien)
(14, Nandhu)
(16, Naveen)
(18, Nagul)
grunt> explain D1;
2019-09-24 14:59:09,955 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 14:59:09,955 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 14:59:09,955 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 14:59:09,956 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key
[pig.schematuple] was not set... will not generate code.
2019-09-24 14:59:09,956 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
#-----------------------------------------------
# New Logical Plan:
#-----------------------------------------------
D1: (Name: LOStore Schema: id#441:int,name#442:chararray)
|
|---D1: (Name: LOForEach Schema: id#441:int,name#442:chararray)
| |
| (Name: LOGenerate[false,false] Schema: id#441:int,name#442:chararray)ColumnPrune:InputUids=[441, 442]ColumnPrune:OutputUids=[441, 442]
| | |
| | (Name: Cast Type: int Uid: 441)
| | |
| | |---id:(Name: Project Type: bytearray Uid: 441 Input: 0 Column: (*))
| | |
| | (Name: Cast Type: chararray Uid: 442)
| | |
| | |---name:(Name: Project Type: bytearray Uid: 442 Input: 1 Column: (*))
| |
| |---(Name: LOInnerLoad[0] Schema: id#441:bytearray)
| |
| |---(Name: LOInnerLoad[1] Schema: name#442:bytearray)
|
|---D1: (Name: LOLoad Schema: id#441:bytearray,name#442:bytearray)RequiredFields:null
#-----------------------------------------------
# Physical Plan:
#-----------------------------------------------
D1: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-149
|
|---D1: New For Each(false,false)[bag] - scope-148
| |
| Cast[int] - scope-143
| |
| |---Project[bytearray][0] - scope-142
| |
| Cast[chararray] - scope-146
| |
| |---Project[bytearray][1] - scope-145
|
|---D1: Load(hdfs://localhost:54310/dir11/emp.txt:PigStorage(',')) - scope-141

2019-09-24 14:59:09,960 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 14:59:09,961 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 14:59:09,961 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-150
Map Plan
D1: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-149
|
|---D1: New For Each(false,false)[bag] - scope-148
| |
| Cast[int] - scope-143
| |
| |---Project[bytearray][0] - scope-142
| |
| Cast[chararray] - scope-146
| |
| |---Project[bytearray][1] - scope-145
|
|---D1: Load(hdfs://localhost:54310/dir11/emp.txt:PigStorage(',')) - scope-141--------
Global sort: false
----------------

grunt> Illustrate D1;


2019-09-24 15:00:02,028 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 15:00:02,028 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 15:00:02,029 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 15:00:02,029 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file
system at: file:///
2019-09-24 15:00:02,321 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 15:00:02,322 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 15:00:02,322 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 15:00:02,327 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key
[pig.schematuple] was not set... will not generate code.
2019-09-24 15:00:02,327 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer -
{RULES_ENABLED=[ConstantCalculator, LoadTypeCastInserter, PredicatePushdownOptimizer,
StreamTypeCastInserter], RULES_DISABLED=[AddForEach, ColumnMapKeyPrune,
GroupByConstParallelSetter, LimitOptimizer, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PushDownForEachFlatten, PushUpFilter, SplitFilter]}
2019-09-24 15:00:02,381 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:02,387 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:02,387 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:02,388 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:02,388 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:02,411 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:02,414 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:02,501 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
2019-09-24 15:00:02,502 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:02,503 [main] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 15:00:02,506 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat -
Total input paths to process : 1
2019-09-24 15:00:02,506 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 15:00:03,124 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:03,126 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:03,126 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:03,127 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:03,128 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:03,136 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:03,140 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:03,159 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
2019-09-24 15:00:03,184 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:03,185 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:03,186 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:03,188 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:03,189 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:03,209 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:03,209 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:03,213 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
2019-09-24 15:00:03,214 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:03,215 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:03,216 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:03,217 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:03,217 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:03,249 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:03,253 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:03,254 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
2019-09-24 15:00:03,259 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:03,260 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:03,260 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:03,261 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:03,261 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:03,860 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:03,861 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:03,865 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
(16, Naveen)
2019-09-24 15:00:03,950 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:03,950 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:03,951 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:03,951 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:03,951 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:04,624 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:04,629 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:04,633 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
2019-09-24 15:00:04,640 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:04,641 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:04,641 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:04,643 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:04,647 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:05,285 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:05,289 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:05,293 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
2019-09-24 15:00:05,295 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:05,298 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:05,298 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:05,304 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:05,319 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:08,766 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:08,766 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:08,770 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
2019-09-24 15:00:08,771 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:00:08,772 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:00:08,772 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:00:08,773 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:00:08,774 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:00:11,943 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected
heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752,
usageThreshold = 489350752
2019-09-24 15:00:11,943 [main] WARN org.apache.pig.data.SchemaTupleBackend -
SchemaTupleBackend has already been initialized
2019-09-24 15:00:11,947 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5] C: R:
--------------------------------------------
| D1 | id:int | name:chararray |
--------------------------------------------
| | 16 | Naveen |
--------------------------------------------

grunt> STORE D1 INTO 'hdfs://localhost:54310/dir11/stu';


2019-09-24 15:01:40,498 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 15:01:40,502 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 15:01:40,502 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 15:01:40,560 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.textoutputformat.separator is deprecated. Instead, use
mapreduce.output.textoutputformat.separator
2019-09-24 15:01:40,764 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in
the script: UNKNOWN
2019-09-24 15:01:40,815 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 15:01:40,816 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 15:01:40,817 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 15:01:40,817 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key
[pig.schematuple] was not set... will not generate code.
2019-09-24 15:01:40,817 [main] INFO
org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach,
ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer,
PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter,
StreamTypeCastInserter]}
2019-09-24 15:01:40,820 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation
threshold: 100 optimistic? false
2019-09-24 15:01:40,822 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size before optimization: 1
2019-09-24 15:01:40,826 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan
size after optimization: 1
2019-09-24 15:01:40,847 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 15:01:40,848 [main] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 15:01:40,850 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:40,852 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig
script settings are added to the job
2019-09-24 15:01:40,855 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler -
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-09-24 15:01:40,856 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up
single store job
2019-09-24 15:01:40,860 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key
[pig.schematuple] is false, will not generate code.
2019-09-24 15:01:40,861 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process
to move generated code to distributed cacche
2019-09-24 15:01:40,861 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache
not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp
directory: /tmp/1569366100857-0
2019-09-24 15:01:40,867 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce
job(s) waiting for submission.
2019-09-24 15:01:40,930 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot
initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:41,102 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader -
No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-09-24 15:01:41,105 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using
PigTextInputFormat
2019-09-24 15:01:41,107 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat
- Total input paths to process : 1
2019-09-24 15:01:41,107 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-09-24 15:01:41,110 [JobControl] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to
process : 1
2019-09-24 15:01:41,169 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of
splits:1
2019-09-24 15:01:41,385 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter -
Submitting tokens for job: job_local1377189287_0010
2019-09-24 15:01:41,945 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the
job: http://localhost:8080/
2019-09-24 15:01:41,945 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
HadoopJobId: job_local1377189287_0010
2019-09-24 15:01:41,945 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing
aliases D1
2019-09-24 15:01:41,946 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed
locations: M: D1[1,5],D1[-1,-1] C: R:
2019-09-24 15:01:41,950 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0%
complete
2019-09-24 15:01:41,950 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running
jobs are [job_local1377189287_0010]
2019-09-24 15:01:41,987 [Thread-257] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter set in config null
2019-09-24 15:01:41,991 [Thread-257] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.textoutputformat.separator is deprecated. Instead, use
mapreduce.output.textoutputformat.separator
2019-09-24 15:01:42,000 [Thread-257] INFO org.apache.hadoop.conf.Configuration.deprecation -
fs.default.name is deprecated. Instead, use fs.defaultFS
2019-09-24 15:01:42,000 [Thread-257] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent
2019-09-24 15:01:42,001 [Thread-257] INFO org.apache.hadoop.conf.Configuration.deprecation -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-09-24 15:01:42,001 [Thread-257] INFO org.apache.hadoop.conf.Configuration.deprecation -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2019-09-24 15:01:42,001 [Thread-257] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 15:01:42,001 [Thread-257] INFO org.apache.hadoop.mapred.LocalJobRunner -
OutputCommitter is
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2019-09-24 15:01:42,396 [Thread-257] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting
for map tasks
2019-09-24 15:01:42,422 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Starting task:
attempt_local1377189287_0010_m_000000_0
2019-09-24 15:01:42,513 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 15:01:42,520 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2019-09-24 15:01:42,527 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 43
Input split[0]:
Length = 43
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:

-----------------------

2019-09-24 15:01:42,570 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-09-24 15:01:42,570 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split
being processed hdfs://localhost:54310/dir11/emp.txt:0+43
2019-09-24 15:01:42,571 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm
version is 1
2019-09-24 15:01:42,822 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to
monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2019-09-24 15:01:42,847 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate
code.
2019-09-24 15:01:42,887 [LocalJobRunner Map Task Executor #0] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being
processed per job phase (AliasName[line,offset]): M: D1[1,5],D1[-1,-1] C: R:
2019-09-24 15:01:42,979 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 15:01:43,931 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task:attempt_local1377189287_0010_m_000000_0 is done. And is
in the process of committing
2019-09-24 15:01:43,940 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner -
2019-09-24 15:01:43,940 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task attempt_local1377189287_0010_m_000000_0 is allowed to
commit now
2019-09-24 15:01:44,016 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task
'attempt_local1377189287_0010_m_000000_0' to
hdfs://localhost:54310/dir11/stu/_temporary/0/task_local1377189287_0010_m_000000
2019-09-24 15:01:44,017 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - map
2019-09-24 15:01:44,017 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.Task - Task 'attempt_local1377189287_0010_m_000000_0' done.
2019-09-24 15:01:44,017 [LocalJobRunner Map Task Executor #0] INFO
org.apache.hadoop.mapred.LocalJobRunner - Finishing task:
attempt_local1377189287_0010_m_000000_0
2019-09-24 15:01:44,017 [Thread-257] INFO org.apache.hadoop.mapred.LocalJobRunner - map task
executor complete.
2019-09-24 15:01:44,499 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:44,499 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:44,500 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:44,503 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100%
complete
2019-09-24 15:01:44,503 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats -
Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1 0.16.0 rr 2019-09-24 15:01:40 2019-09-24 15:01:44 UNKNOWN
Success!

Job Stats (time in seconds):


JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local1377189287_0010 1 0 n/a n/a n/a n/a 0 0 0 0 D1 MAP_ONLY hdfs://localhost:54310/dir11/stu,

Input(s):
Successfully read 4 records (387 bytes) from: "hdfs://localhost:54310/dir11/emp.txt"

Output(s):
Successfully stored 4 records (43 bytes) in: "hdfs://localhost:54310/dir11/stu"

Counters:
Total records written : 4
Total bytes written : 43
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local1377189287_0010

2019-09-24 15:01:44,507 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:44,508 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:44,509 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2019-09-24 15:01:44,545 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
grunt>
