Saturday, October 1, 2016

Machine Learning using Apache Spark on top of Apache Hadoop and Kafka

Hadoop batch-processes historical data, while Spark processes real-time IoT data as it arrives to surface trends.
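A sketch of how the streaming half could be wired together, assuming a Spark 2.x install with the Kafka 0.8 integration pulled in via --packages; the application jar, main class, broker address, and topic below are made-up placeholders for illustration, not anything from this post:

# submit a (hypothetical) streaming job that reads IoT sensor readings from Kafka;
# the jar, class, broker, and topic names are placeholders
bin/spark-submit \
  --master 'local[2]' \
  --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.0.0 \
  --class com.example.IoTTrends \
  iot-trends.jar localhost:9092 sensor-readings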

Background reading: Hinton & Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks" (Science, 2006)
http://www.cs.toronto.edu/~hinton/science.pdf

MapReduce Batch Processing
http://hadoop.apache.org/
http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.3/hadoop-2.7.3-src.tar.gz
http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
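The transcript later in this post comes from the standalone (local) walkthrough in the SingleCluster guide above; roughly these steps, with the tarball unpacked under /home as in the shell prompt shown below:

cd /home/hadoop-2.7.3
# set JAVA_HOME in etc/hadoop/hadoop-env.sh, e.g. export JAVA_HOME=/usr/java/latest
vi etc/hadoop/hadoop-env.sh
# copy the bundled config files to use as sample input, then grep them with the examples jar
mkdir input
cp etc/hadoop/*.xml input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'
cat output/*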

Real-time Stream Processing
runs standalone or on top of Hadoop (YARN)
http://spark.apache.org/
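A couple of local smoke tests, assuming a Spark 2.x binary package prebuilt for Hadoop 2.7; the version and install path here are assumptions and do not appear in the transcript below:

cd /home/spark-2.0.0-bin-hadoop2.7
# batch smoke test: estimate Pi with the bundled example
bin/run-example SparkPi 10
# streaming smoke test: word-count lines typed into a local socket
# (run `nc -lk 9999` in a second terminal and type a few lines)
bin/run-example streaming.NetworkWordCount localhost 9999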

Distributed Streaming Platform
written in Scala; runs on top of ZooKeeper for coordination
http://kafka.apache.org/
https://kafka.apache.org/08/quickstart.html
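The linked quickstart boils down to roughly the following, assuming a Kafka binary release (0.8 or later) unpacked locally; script names and flags shift a little between versions, so defer to the version-specific quickstart:

# start the bundled ZooKeeper, then a single Kafka broker
bin/zookeeper-server-start.sh config/zookeeper.properties &
bin/kafka-server-start.sh config/server.properties &
# create a topic with one partition and one replica
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
# type messages into the producer; read them back with the consumer
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning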


Output from running the MapReduce grep example in standalone (local) mode:
[root@nuc12-i7 hadoop-2.7.3]# vi /home/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
[root@nuc12-i7 hadoop-2.7.3]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'
16/10/02 00:16:47 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/10/02 00:16:47 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/10/02 00:16:48 INFO input.FileInputFormat: Total input paths to process : 8
16/10/02 00:16:48 INFO mapreduce.JobSubmitter: number of splits:8
16/10/02 00:16:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1447651192_0001
16/10/02 00:16:48 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/10/02 00:16:48 INFO mapreduce.Job: Running job: job_local1447651192_0001
16/10/02 00:16:48 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Waiting for map tasks
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000000_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/hadoop-policy.xml:0+9683
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.MapTask: Spilling map output
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
16/10/02 00:16:48 INFO mapred.MapTask: Finished spill 0
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000000_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000000_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000000_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000001_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/kms-site.xml:0+5511
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000001_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000001_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000001_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000002_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/capacity-scheduler.xml:0+4436
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000002_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000002_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000002_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000003_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/kms-acls.xml:0+3518
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000003_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000003_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000003_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000004_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/hdfs-site.xml:0+775
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000004_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000004_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000004_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000005_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/core-site.xml:0+774
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000005_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000005_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000005_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000006_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/yarn-site.xml:0+690
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000006_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000006_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000006_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_m_000007_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/input/httpfs-site.xml:0+620
16/10/02 00:16:48 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:48 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:48 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:48 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:48 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:48 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 
16/10/02 00:16:48 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_m_000007_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_m_000007_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_m_000007_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: map task executor complete.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Waiting for reduce tasks
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Starting task: attempt_local1447651192_0001_r_000000_0
16/10/02 00:16:48 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:48 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:48 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@619f906a
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
16/10/02 00:16:48 INFO reduce.EventFetcher: attempt_local1447651192_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000003_0 decomp: 2 len: 6 to MEMORY
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1447651192_0001_m_000003_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->2
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000000_0 decomp: 21 len: 25 to MEMORY
16/10/02 00:16:48 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local1447651192_0001_m_000000_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 2, commitMemory -> 2, usedMemory ->23
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000006_0 decomp: 2 len: 6 to MEMORY
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1447651192_0001_m_000006_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 3, commitMemory -> 23, usedMemory ->25
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000007_0 decomp: 2 len: 6 to MEMORY
16/10/02 00:16:48 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1447651192_0001_m_000007_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 4, commitMemory -> 25, usedMemory ->27
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000004_0 decomp: 2 len: 6 to MEMORY
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1447651192_0001_m_000004_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 5, commitMemory -> 27, usedMemory ->29
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000001_0 decomp: 2 len: 6 to MEMORY
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1447651192_0001_m_000001_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 6, commitMemory -> 29, usedMemory ->31
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000005_0 decomp: 2 len: 6 to MEMORY
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1447651192_0001_m_000005_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 7, commitMemory -> 31, usedMemory ->33
16/10/02 00:16:48 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1447651192_0001_m_000002_0 decomp: 2 len: 6 to MEMORY
16/10/02 00:16:48 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1447651192_0001_m_000002_0
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 8, commitMemory -> 33, usedMemory ->35
16/10/02 00:16:48 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:48 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:48 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:48 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:48 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:48 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 8 / 8 copied.
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: finalMerge called with 8 in-memory map-outputs and 0 on-disk map-outputs
16/10/02 00:16:48 INFO mapred.Merger: Merging 8 sorted segments
16/10/02 00:16:48 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: Merged 8 segments, 35 bytes to disk to satisfy reduce memory limit
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
16/10/02 00:16:48 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
16/10/02 00:16:48 INFO mapred.Merger: Merging 1 sorted segments
16/10/02 00:16:48 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 8 / 8 copied.
16/10/02 00:16:48 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
16/10/02 00:16:48 INFO mapred.Task: Task:attempt_local1447651192_0001_r_000000_0 is done. And is in the process of committing
16/10/02 00:16:48 INFO mapred.LocalJobRunner: 8 / 8 copied.
16/10/02 00:16:48 INFO mapred.Task: Task attempt_local1447651192_0001_r_000000_0 is allowed to commit now
16/10/02 00:16:48 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1447651192_0001_r_000000_0' to file:/home/hadoop-2.7.3/grep-temp-965988443/_temporary/0/task_local1447651192_0001_r_000000
16/10/02 00:16:48 INFO mapred.LocalJobRunner: reduce > reduce
16/10/02 00:16:48 INFO mapred.Task: Task 'attempt_local1447651192_0001_r_000000_0' done.
16/10/02 00:16:48 INFO mapred.LocalJobRunner: Finishing task: attempt_local1447651192_0001_r_000000_0
16/10/02 00:16:48 INFO mapred.LocalJobRunner: reduce task executor complete.
16/10/02 00:16:49 INFO mapreduce.Job: Job job_local1447651192_0001 running in uber mode : false
16/10/02 00:16:49 INFO mapreduce.Job:  map 100% reduce 100%
16/10/02 00:16:49 INFO mapreduce.Job: Job job_local1447651192_0001 completed successfully
16/10/02 00:16:49 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=2892701
FILE: Number of bytes written=5231540
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=745
Map output records=1
Map output bytes=17
Map output materialized bytes=67
Input split bytes=877
Combine input records=1
Combine output records=1
Reduce input groups=1
Reduce shuffle bytes=67
Reduce input records=1
Reduce output records=1
Spilled Records=2
Shuffled Maps =8
Failed Shuffles=0
Merged Map outputs=8
GC time elapsed (ms)=90
Total committed heap usage (bytes)=3163029504
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters 
Bytes Read=26007
File Output Format Counters 
Bytes Written=123
16/10/02 00:16:49 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/10/02 00:16:49 INFO input.FileInputFormat: Total input paths to process : 1
16/10/02 00:16:49 INFO mapreduce.JobSubmitter: number of splits:1
16/10/02 00:16:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1445128842_0002
16/10/02 00:16:49 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/10/02 00:16:49 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/10/02 00:16:49 INFO mapreduce.Job: Running job: job_local1445128842_0002
16/10/02 00:16:49 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:49 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
16/10/02 00:16:49 INFO mapred.LocalJobRunner: Waiting for map tasks
16/10/02 00:16:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1445128842_0002_m_000000_0
16/10/02 00:16:49 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:49 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:49 INFO mapred.MapTask: Processing split: file:/home/hadoop-2.7.3/grep-temp-965988443/part-r-00000:0+111
16/10/02 00:16:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/10/02 00:16:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/10/02 00:16:49 INFO mapred.MapTask: soft limit at 83886080
16/10/02 00:16:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/10/02 00:16:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/10/02 00:16:49 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/10/02 00:16:49 INFO mapred.LocalJobRunner: 
16/10/02 00:16:49 INFO mapred.MapTask: Starting flush of map output
16/10/02 00:16:49 INFO mapred.MapTask: Spilling map output
16/10/02 00:16:49 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600
16/10/02 00:16:49 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
16/10/02 00:16:49 INFO mapred.MapTask: Finished spill 0
16/10/02 00:16:49 INFO mapred.Task: Task:attempt_local1445128842_0002_m_000000_0 is done. And is in the process of committing
16/10/02 00:16:49 INFO mapred.LocalJobRunner: map
16/10/02 00:16:49 INFO mapred.Task: Task 'attempt_local1445128842_0002_m_000000_0' done.
16/10/02 00:16:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1445128842_0002_m_000000_0
16/10/02 00:16:49 INFO mapred.LocalJobRunner: map task executor complete.
16/10/02 00:16:49 INFO mapred.LocalJobRunner: Waiting for reduce tasks
16/10/02 00:16:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1445128842_0002_r_000000_0
16/10/02 00:16:49 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/10/02 00:16:49 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/10/02 00:16:49 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@1673725c
16/10/02 00:16:49 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
16/10/02 00:16:49 INFO reduce.EventFetcher: attempt_local1445128842_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
16/10/02 00:16:49 INFO reduce.LocalFetcher: localfetcher#2 about to shuffle output of map attempt_local1445128842_0002_m_000000_0 decomp: 21 len: 25 to MEMORY
16/10/02 00:16:49 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local1445128842_0002_m_000000_0
16/10/02 00:16:49 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->21
16/10/02 00:16:49 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
16/10/02 00:16:49 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/10/02 00:16:49 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/10/02 00:16:49 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
16/10/02 00:16:49 INFO mapred.Merger: Merging 1 sorted segments
16/10/02 00:16:49 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes
16/10/02 00:16:49 INFO reduce.MergeManagerImpl: Merged 1 segments, 21 bytes to disk to satisfy reduce memory limit
16/10/02 00:16:49 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
16/10/02 00:16:49 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
16/10/02 00:16:49 INFO mapred.Merger: Merging 1 sorted segments
16/10/02 00:16:49 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes
16/10/02 00:16:49 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/10/02 00:16:49 INFO mapred.Task: Task:attempt_local1445128842_0002_r_000000_0 is done. And is in the process of committing
16/10/02 00:16:49 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/10/02 00:16:49 INFO mapred.Task: Task attempt_local1445128842_0002_r_000000_0 is allowed to commit now
16/10/02 00:16:49 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1445128842_0002_r_000000_0' to file:/home/hadoop-2.7.3/output/_temporary/0/task_local1445128842_0002_r_000000
16/10/02 00:16:49 INFO mapred.LocalJobRunner: reduce > reduce
16/10/02 00:16:49 INFO mapred.Task: Task 'attempt_local1445128842_0002_r_000000_0' done.
16/10/02 00:16:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1445128842_0002_r_000000_0
16/10/02 00:16:49 INFO mapred.LocalJobRunner: reduce task executor complete.
16/10/02 00:16:50 INFO mapreduce.Job: Job job_local1445128842_0002 running in uber mode : false
16/10/02 00:16:50 INFO mapreduce.Job:  map 100% reduce 100%
16/10/02 00:16:50 INFO mapreduce.Job: Job job_local1445128842_0002 completed successfully
16/10/02 00:16:50 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=1248270
FILE: Number of bytes written=2321184
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=1
Map output records=1
Map output bytes=17
Map output materialized bytes=25
Input split bytes=121
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=25
Reduce input records=1
Reduce output records=1
Spilled Records=2
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=0
Total committed heap usage (bytes)=695205888
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters 
Bytes Read=123
File Output Format Counters 
Bytes Written=23
[root@nuc12-i7 hadoop-2.7.3]# cat output/*
1 dfsadmin
[root@nuc12-i7 hadoop-2.7.3]# ls output
part-r-00000  _SUCCESS

The grep example found a single match for the pattern 'dfs[a-z.]+' (the string dfsadmin, which appears once in the copied config files); part-r-00000 holds the reducer output and _SUCCESS marks the job as complete.
