MapReduce exception: LongWritable cannot be cast to Text

There is a txt file whose contents are formatted like this:

深圳订做T恤	5729944
深圳厂家t恤批发	5729945
深圳定做文化衫	5729944
文化衫厂家	5729944
订做文化衫	5729944
深圳t恤厂家	5729945


Each line consists of a search keyword followed by the ID of the category it belongs to, separated by a tab. I wanted to count how many keywords fall into each category, so I ran the following MapReduce program:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
import org.apache.hadoop.util.*;

public class ClassCount extends Configured implements Tool
{
	public static class ClassMap 
		extends Mapper<Text ,Text,Text,IntWritable>
	{
		private static final IntWritable one = new IntWritable(1);
		private Text word = new Text();

		public void map(Text key,Text value,Context context)
			throws IOException,InterruptedException
		{
			String eachLine = value.toString();
			StringTokenizer tokenizer = new StringTokenizer(eachLine,"\n");
			while(tokenizer.hasMoreTokens())
			{
				StringTokenizer token = new StringTokenizer(tokenizer.nextToken(),"\t");
				String keyword = token.nextToken(); // the keyword itself is not used for now
				String classId = token.nextToken();
				word.set(classId);
				context.write(word,one);
			}
		}
	}

	public static class Reduce 
		extends Reducer<Text,IntWritable,Text,IntWritable>
	{
		public void reduce(Text key,Iterable<IntWritable> values,Context context)
			throws IOException,InterruptedException
		{
			int sum = 0;
			for(IntWritable val : values)
				sum += val.get();
			context.write(key,new IntWritable(sum));
		}
	}
	public int run(String args[]) throws Exception{
		Job job = new Job(getConf());
		job.setJarByClass(ClassCount.class);
		job.setJobName("classCount");
		
		job.setMapperClass(ClassMap.class);
		job.setReducerClass(Reduce.class);
		
		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);

		FileInputFormat.setInputPaths(job,new Path(args[0]));
		FileOutputFormat.setOutputPath(job,new Path(args[1]));

		boolean success = job.waitForCompletion(true);
		return success ? 0 : 1;
	}

	public static void main(String[] args) throws Exception
	{
		int ret = ToolRunner.run(new ClassCount(),args);
		System.exit(ret);
	}
}


It threw the following exception:

java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text


I had assumed that because the input is text, Text should be used as the input key type, but that is not how it works: with TextInputFormat, the map method receives each line's byte offset as its key, so the input key type has to be LongWritable. A minimal sketch of the corrected mapper declaration (the method body stays the same as above):
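
	// With TextInputFormat, the input key is the line's byte offset
	// (a LongWritable) and the input value is the line itself (a Text).
	public static class ClassMap
		extends Mapper<LongWritable,Text,Text,IntWritable>
	{
		private static final IntWritable one = new IntWritable(1);
		private Text word = new Text();

		public void map(LongWritable key,Text value,Context context)
			throws IOException,InterruptedException
		{
			// ...same body as above...
		}
	}
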
But after making that change, the job reported the following exception:

14/04/25 17:21:15 INFO mapred.JobClient: Task Id : attempt_201404211802_0040_m_000000_1, Status : FAILED
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.IntWritable

This one is more direct. When the map output types are not declared explicitly, the framework assumes they match the job's final output types, and the default output value type is Text, so the IntWritable values written by the mapper fail the type check. The fix is to add the following two lines to the run method to declare the map output types explicitly:

	job.setMapOutputKeyClass(Text.class);
	job.setMapOutputValueClass(IntWritable.class);
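
For reference, this is what the run method looks like with those declarations in place; declaring the reducer's final output types as well is good practice, though not required to fix this particular error (a sketch, otherwise unchanged from the code above):

	public int run(String args[]) throws Exception{
		Job job = new Job(getConf());
		job.setJarByClass(ClassCount.class);
		job.setJobName("classCount");

		job.setMapperClass(ClassMap.class);
		job.setReducerClass(Reduce.class);

		// declare the map output types so the framework's type check passes
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(IntWritable.class);

		// also declare the final (reducer) output types explicitly
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);

		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);

		FileInputFormat.setInputPaths(job,new Path(args[0]));
		FileOutputFormat.setOutputPath(job,new Path(args[1]));

		boolean success = job.waitForCompletion(true);
		return success ? 0 : 1;
	}

With the two fixes applied, the job completes, and for the sample input above it produces:

	5729944	4
	5729945	2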



