Tuesday, May 17, 2016

hadoop : using my own word count



http://glj8989332.blogspot.tw/2015/09/windows-hadoop-eclipse-mapreduce-wordcount.html


step 1:

Set up Maven.
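The post does not show the actual pom.xml, so the exact dependencies depend on your cluster. A minimal fragment matching the Hadoop 2.6.0 examples jar used below might look like this (version is an assumption taken from that jar):

```xml
<!-- Hadoop client libraries; version should match the cluster (2.6.0 here) -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.6.0</version>
</dependency>
```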


step 2:

Build my word count job.


 public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

  /* Initialize the Hadoop configuration */
  Configuration conf = new Configuration();

  /* Create the MapReduce job; its name is KWordCnt
     (Job.getInstance replaces the deprecated Job constructor) */
  Job job = Job.getInstance(conf, "KWordCnt");

  /* The jar that contains the job's classes is located via KWordCnt */
  job.setJarByClass(KWordCnt.class);
  /* The job's map class is MyMapper */
  job.setMapperClass(MyMapper.class);
  /* The job's reduce class is MyReducer */
  job.setReducerClass(MyReducer.class);

  /* Types of the job's output key/value pairs */
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);

  /* HDFS path of the input data */
  //FileInputFormat.addInputPath(job, new Path("/input02"));
  FileInputFormat.addInputPath(job, new Path("/1.csv"));
  /* HDFS path for the output (must not already exist; path is illustrative) */
  FileOutputFormat.setOutputPath(job, new Path("/out_kwordcnt"));

  /* Submit the job and wait for it to finish */
  System.exit(job.waitForCompletion(true) ? 0 : 1);
 }
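MyMapper and MyReducer are referenced above but not shown. Conceptually, the mapper emits (word, 1) for every token and the reducer sums the counts per word. A plain-Java sketch of that logic, with no Hadoop dependency and illustrative names:

```java
import java.util.HashMap;
import java.util.Map;

public class WordCountSketch {
    // Mimics the map phase (emit one count per token) followed by
    // the reduce phase (sum the counts for each distinct word).
    static Map<String, Integer> countWords(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String token : line.split("\\s+")) {
                if (token.isEmpty()) continue;
                // reduce step: accumulate the count for this key
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = { "hello hadoop", "hello world" };
        Map<String, Integer> counts = countWords(lines);
        System.out.println(counts.get("hello"));  // 2
        System.out.println(counts.get("hadoop")); // 1
    }
}
```

In the real job this split happens in MyMapper.map() and the summation in MyReducer.reduce(); the Hadoop framework performs the grouping by key in between.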


step 3:

Comparing results: the stock examples jar gives the same count (1489928).

yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /1.csv /2



