
1. Configure the Hadoop environment variables. This article uses Hadoop version 2.5.2 as the example.

Download and extract Hadoop 2.5.2, then configure the environment variables as follows (if the settings do not take effect, restart the machine):
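A minimal sketch of the variables involved, assuming Hadoop was extracted to D:\hadoop-2.5.2 (a hypothetical path; substitute your own):

HADOOP_HOME=D:\hadoop-2.5.2
PATH=%PATH%;%HADOOP_HOME%\bin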

Place the winutils.exe file into Hadoop's bin directory.

The Hadoop 2.x releases do not ship winutils.exe; without this file, an error along the lines of "Could not locate executable null\bin\winutils.exe in the Hadoop binaries" is reported:

 

2. Install the Eclipse plugin

Early Hadoop 1.x releases shipped this plugin, but Hadoop 2 does not, so it has to be downloaded from GitHub. This article uses hadoop-eclipse-plugin-2.5.2.jar.

After extracting the download, copy hadoop-eclipse-plugin-2.5.2.jar into Eclipse's dropins directory and restart Eclipse:

 

3. Configure the Hadoop plugin

Set "Hadoop installation directory" to the root directory of the Hadoop installation.

 

Open the Hadoop connection view via Window > Show View > Other > MapReduce Tools, as shown below:

 

Configure the connection to Hadoop:
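The main fields in the plugin's location dialog are roughly the following (the host and ports here are assumptions matching the cluster address used later in this article; adjust them to your own setup):

Location name: any label, e.g. hadoop-2.5.2
Map/Reduce Master: Host 192.168.107.167, Port 9001
DFS Master: Host 192.168.107.167, Port 9000 (must match fs.defaultFS in core-site.xml)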

 

4. Verify the connection to the server

If the files and directories on the Hadoop server can be listed, the connection is working.
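The connection can also be sanity-checked from the command line; a quick test against the same HDFS address used later in this article:

hadoop fs -ls hdfs://192.168.107.167:9000/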

 

5. Create a new MapReduce project

The Hadoop-related jar files are added to the project automatically.

 

6. Run the WordCount program

The WordCount program bundled with the Hadoop 2.5.2 source is located at:

hadoop-2.5.2-src\hadoop-mapreduce-project\hadoop-mapreduce-examples\src\main\java\org\apache\hadoop\examples\WordCount.java

(the main method has been slightly modified)

package mapreduce;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

	// Mapper: splits each input line into tokens and emits a (word, 1) pair per token.
	public static class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {

		private final IntWritable one = new IntWritable(1);
		private Text word = new Text();

		@Override
		public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
			String line = value.toString();
			StringTokenizer token = new StringTokenizer(line);
			while (token.hasMoreTokens()) {
				word.set(token.nextToken());
				context.write(word, one);
			}
		}
	}

	// Reducer: sums the counts for each word and emits (word, total).
	public static class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

		@Override
		public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
			int sum = 0;
			for (IntWritable val : values) {
				sum += val.get();
			}
			context.write(key, new IntWritable(sum));
		}
	}

	public static void main(String[] args) throws Exception {
		Configuration conf = new Configuration();
		// Job.getInstance replaces the Job(conf) constructor, which is deprecated in Hadoop 2.x.
		Job job = Job.getInstance(conf);
		job.setJarByClass(WordCount.class);
		job.setJobName("wordcount");

		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);

		job.setMapperClass(WordCountMap.class);
		job.setReducerClass(WordCountReduce.class);

		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);

		// The input and output paths are hard-coded here; the output directory must not exist yet.
		FileInputFormat.addInputPath(job, new Path("hdfs://192.168.107.167:9000/input/test"));
		FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.107.167:9000/output/test"));

		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}
}
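Before running the job, the input path must exist on HDFS and the output directory must not exist yet. A minimal way to upload a test file (test is a hypothetical local file name; the paths match the ones hard-coded above):

hadoop fs -mkdir -p hdfs://192.168.107.167:9000/input
hadoop fs -put test hdfs://192.168.107.167:9000/input/test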

 

Running the above code on Hadoop from Eclipse reports the following error (typically an UnsatisfiedLinkError in NativeIO$Windows.access0):

 

Copy the source file hadoop-2.5.2-src\hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\nativeio\NativeIO.java

into the project under the package org.apache.hadoop.io.nativeio, locate line 570, and change it to simply return true, as shown below:
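For reference, the change amounts to a one-liner in the Windows inner class; a sketch based on the 2.5.2 source (the exact line number may differ in other versions):

// in org.apache.hadoop.io.nativeio.NativeIO.Windows (the copy placed in the project)
public static boolean access(String path, AccessRight desiredAccess)
    throws IOException {
    // original: return access0(path, desiredAccess.accessRight());
    // access0 is a native call that fails on Windows without hadoop.dll,
    // so skip the permission check when running locally:
    return true;
}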

 

After the modification, the result of running the program is as follows:
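The output can also be viewed from the command line (part-r-00000 is the default name of the first reducer's output file):

hadoop fs -cat hdfs://192.168.107.167:9000/output/test/part-r-00000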

 

 

Postscript

HDFS directories could not be modified from within Eclipse, and running the MapReduce program reported the following error (typically org.apache.hadoop.security.AccessControlException: Permission denied):

The cause is that HDFS permission checking is enabled; it can be disabled by adding the following configuration to hdfs-site.xml:

<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
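Note that in Hadoop 2.x this property has been renamed dfs.permissions.enabled (the old name still works as a deprecated alias), and the NameNode must be restarted for the change to take effect; disabling permission checks is advisable only in a development environment. The equivalent current form:

<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>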
posted on 2015-12-11 11:19 by jackgaolei