  A Brief Analysis of MapReduce Examples (4)

    Published: 2015-07-10  Source: uml.org.cn  Author: open經驗庫  Tags: Database

      14/12/17 23:04:20 INFO mapred.MapTask: record buffer = 262144/327680

      14/12/17 23:04:20 INFO mapred.MapTask: Starting flush of map output

      14/12/17 23:04:20 INFO mapred.MapTask: Finished spill 0

      14/12/17 23:04:20 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting

      14/12/17 23:04:20 INFO mapred.LocalJobRunner:

      14/12/17 23:04:20 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.

      14/12/17 23:04:20 INFO mapred.LocalJobRunner:

      14/12/17 23:04:20 INFO mapred.Merger: Merging 2 sorted segments

      14/12/17 23:04:20 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 90 bytes

      14/12/17 23:04:20 INFO mapred.LocalJobRunner:

      14/12/17 23:04:20 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting

      14/12/17 23:04:20 INFO mapred.LocalJobRunner:

      14/12/17 23:04:20 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now

      14/12/17 23:04:20 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to out

      14/12/17 23:04:20 INFO mapred.LocalJobRunner: reduce > reduce

      14/12/17 23:04:20 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.

      14/12/17 23:04:20 INFO mapred.JobClient: map 100% reduce 100%

      14/12/17 23:04:20 INFO mapred.JobClient: Job complete: job_local_0001

      14/12/17 23:04:20 INFO mapred.JobClient: Counters: 14

      14/12/17 23:04:20 INFO mapred.JobClient: FileSystemCounters

      14/12/17 23:04:20 INFO mapred.JobClient: FILE_BYTES_READ=46040

      14/12/17 23:04:20 INFO mapred.JobClient: HDFS_BYTES_READ=51471

      14/12/17 23:04:20 INFO mapred.JobClient: FILE_BYTES_WRITTEN=52808

      14/12/17 23:04:20 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=98132

      14/12/17 23:04:20 INFO mapred.JobClient: Map-Reduce Framework

      14/12/17 23:04:20 INFO mapred.JobClient: Reduce input groups=3

      14/12/17 23:04:20 INFO mapred.JobClient: Combine output records=0

      14/12/17 23:04:20 INFO mapred.JobClient: Map input records=4

      14/12/17 23:04:20 INFO mapred.JobClient: Reduce shuffle bytes=0

      14/12/17 23:04:20 INFO mapred.JobClient: Reduce output records=4

      14/12/17 23:04:20 INFO mapred.JobClient: Spilled Records=8

      14/12/17 23:04:20 INFO mapred.JobClient: Map output bytes=78

      14/12/17 23:04:20 INFO mapred.JobClient: Combine input records=0

      14/12/17 23:04:20 INFO mapred.JobClient: Map output records=4

      14/12/17 23:04:20 INFO mapred.JobClient: Reduce input records=4

      As the log shows, with everything left at its default setting MapReduce simply passes the input through to the output.

      The following describes some of the classes involved in a MapReduce job and their default settings:

      (1) The InputFormat class

      This class splits the input data into splits, and further parses each split into <key, value> pairs that are fed to the map function.

      (2) The Mapper class

      Implements the map function, which produces intermediate results from the input <key, value> pairs.

      (3) The Combiner class

      Implements the combine function, which merges intermediate key/value pairs that share the same key.

      (4) The Partitioner class

      Implements the getPartition function, which during the Shuffle phase divides the intermediate data by key into R partitions, each handled by one Reduce task (a minimal sketch of such a partitioner follows this list).

      (5) The Reducer class

      Implements the reduce function, which merges the intermediate results into the final output.

      (6) The OutputFormat class

      This class is responsible for writing out the final results.
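      To make item (4) concrete, the sketch below shows a partitioner that behaves like Hadoop's default HashPartitioner: it assigns each intermediate key to one of the R reduce tasks by its hash code. The class name HashLikePartitioner is made up for illustration, and this is a minimal sketch against the org.apache.hadoop.mapreduce API rather than Hadoop's actual source.

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Illustrative partitioner: routes each intermediate <LongWritable, Text> pair
    // to a reduce task by the key's hash code, the same idea as HashPartitioner.
    public class HashLikePartitioner extends Partitioner<LongWritable, Text> {
        @Override
        public int getPartition(LongWritable key, Text value, int numReduceTasks) {
            // Mask off the sign bit so the result is always a valid partition index.
            return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
        }
    }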

      With these defaults in mind, the example code shown earlier can be rewritten as:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;
    import org.apache.hadoop.util.GenericOptionsParser;

    public class LazyMapReduce {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
            if (otherArgs.length != 2) {
                System.err.println("Usage: LazyMapReduce <in> <out>");
                System.exit(2);
            }
            Job job = new Job(conf, "LazyMapReduce");

            // TextInputFormat splits the input and parses each split into <byte offset, line> pairs.
            job.setInputFormatClass(TextInputFormat.class);
            // The base Mapper class is an identity mapper: it passes every pair through unchanged.
            job.setMapperClass(Mapper.class);

            job.setMapOutputKeyClass(LongWritable.class);
            job.setMapOutputValueClass(Text.class);
            // HashPartitioner assigns intermediate keys to reduce tasks by hash code.
            job.setPartitionerClass(HashPartitioner.class);
            // The base Reducer class is an identity reducer: it emits every value as-is.
            job.setReducerClass(Reducer.class);

            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);
            // TextOutputFormat (the default) writes "key<TAB>value" lines; FileOutputFormat
            // itself is abstract, so it cannot be used as the output format class.
            job.setOutputFormatClass(TextOutputFormat.class);

            FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
            FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
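
      Why does a job that only registers the framework's base classes still reproduce its input? Because the base Mapper and Reducer behave as identity operations. The sketch below spells out what those defaults amount to; the class names PassThroughMapper and PassThroughReducer are illustrative, and this is a simplified rendering rather than Hadoop's actual source.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Sketch of the default (identity) behaviour of the base classes,
    // written out as explicit subclasses for illustration.
    public class IdentityPassThrough {

        public static class PassThroughMapper
                extends Mapper<LongWritable, Text, LongWritable, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Emit each <byte offset, line> pair unchanged.
                context.write(key, value);
            }
        }

        public static class PassThroughReducer
                extends Reducer<LongWritable, Text, LongWritable, Text> {
            @Override
            protected void reduce(LongWritable key, Iterable<Text> values, Context context)
                    throws IOException, InterruptedException {
                // Emit the key once for every value grouped under it.
                for (Text value : values) {
                    context.write(key, value);
                }
            }
        }
    }

      With these defaults, every map input record flows straight through to the reduce output, which matches the counters in the log above (Map input records=4, Reduce output records=4).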
    
    

    Original article: http://www.uml.org.cn/sjjm/201501201.asp

