MapReduce exception: LongWritable cannot be cast to Text

There is a txt file whose contents are formatted like this:

深圳订做T恤	5729944
深圳厂家t恤批发	5729945
深圳定做文化衫	5729944
文化衫厂家	5729944
订做文化衫	5729944
深圳t恤厂家	5729945


Each line consists of a search keyword followed by the ID of the category it belongs to, separated by a tab. I wanted to count how many keywords fall into each category, so I ran the following MapReduce program:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
import org.apache.hadoop.util.*;

public class ClassCount extends Configured implements Tool
{
	public static class ClassMap 
		extends Mapper<Text ,Text,Text,IntWritable>
	{
		private static final IntWritable one = new IntWritable(1);
		private Text word = new Text();

		public void map(Text key,Text value,Context context)
			throws IOException,InterruptedException
		{
			String eachLine = value.toString();
			StringTokenizer tokenizer = new StringTokenizer(eachLine,"\n");
			while(tokenizer.hasMoreTokens())
			{
				StringTokenizer token = new StringTokenizer(tokenizer.nextToken(),"\t");
				String keyword = token.nextToken(); // the keyword itself is not used for now
				String classId = token.nextToken();
				word.set(classId);
				context.write(word,one);
			}
		}
	}

	public static class Reduce 
		extends Reducer<Text,IntWritable,Text,IntWritable>
	{
		public void reduce(Text key,Iterable<IntWritable> values,Context context)
			throws IOException,InterruptedException
		{
			int sum = 0;
			for(IntWritable val : values)
				sum += val.get();
			context.write(key,new IntWritable(sum));
		}
	}
	public int run(String args[]) throws Exception{
		Job job = new Job(getConf());
		job.setJarByClass(ClassCount.class);
		job.setJobName("classCount");
		
		job.setMapperClass(ClassMap.class);
		job.setReducerClass(Reduce.class);
		
		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);

		FileInputFormat.setInputPaths(job,new Path(args[0]));
		FileOutputFormat.setOutputPath(job,new Path(args[1]));

		boolean success = job.waitForCompletion(true);
		return success ? 0 : 1;
	}

	public static void main(String[] args) throws Exception
	{
		int ret = ToolRunner.run(new ClassCount(),args);
		System.exit(ret);
	}
}


It threw the following exception:

java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text


I had assumed that because the input is text, Text should be used as the input key type, but that is not how it works: with TextInputFormat, the map method receives each line's byte offset as its key, so the input key type has to be LongWritable. A minimal sketch of the corrected mapper declaration (the method body stays the same as above):
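
	// With TextInputFormat, the input key is the line's byte offset
	// (a LongWritable) and the input value is the line itself (a Text).
	public static class ClassMap
		extends Mapper<LongWritable,Text,Text,IntWritable>
	{
		private static final IntWritable one = new IntWritable(1);
		private Text word = new Text();

		public void map(LongWritable key,Text value,Context context)
			throws IOException,InterruptedException
		{
			// ...same body as above...
		}
	}
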
But after making that change, the job reported the following exception:

14/04/25 17:21:15 INFO mapred.JobClient: Task Id : attempt_201404211802_0040_m_000000_1, Status : FAILED
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.IntWritable

This one is more direct. When the map output types are not declared explicitly, the framework assumes they match the job's final output types, and the default output value type is Text, so the IntWritable values written by the mapper fail the type check. The fix is to add the following two lines to the run method to declare the map output types explicitly:

	job.setMapOutputKeyClass(Text.class);
	job.setMapOutputValueClass(IntWritable.class);
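
For reference, this is what the run method looks like with those declarations in place; declaring the reducer's final output types as well is good practice, though not required to fix this particular error (a sketch, otherwise unchanged from the code above):

	public int run(String args[]) throws Exception{
		Job job = new Job(getConf());
		job.setJarByClass(ClassCount.class);
		job.setJobName("classCount");

		job.setMapperClass(ClassMap.class);
		job.setReducerClass(Reduce.class);

		// declare the map output types so the framework's type check passes
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(IntWritable.class);

		// also declare the final (reducer) output types explicitly
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);

		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);

		FileInputFormat.setInputPaths(job,new Path(args[0]));
		FileOutputFormat.setOutputPath(job,new Path(args[1]));

		boolean success = job.waitForCompletion(true);
		return success ? 0 : 1;
	}

With the two fixes applied, the job completes, and for the sample input above it produces:

	5729944	4
	5729945	2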



