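Version:
2.2
Description:
OutputFormat defines how a MapReduce (MR) job writes its results: how to write them and where to write them. In other words, it defines the write rules.
Class code:
Abstract class definition: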
public abstract RecordWriter<K, V> getRecordWriter(TaskAttemptContext context)
    throws IOException, InterruptedException;

public abstract void checkOutputSpecs(JobContext context)
    throws IOException, InterruptedException;

public abstract OutputCommitter getOutputCommitter(TaskAttemptContext context)
    throws IOException, InterruptedException;
getRecordWriter returns a RecordWriter, which defines the concrete write operations. Its abstract methods are:
public abstract void write(K key, V value)
    throws IOException, InterruptedException;

public abstract void close(TaskAttemptContext context)
    throws IOException, InterruptedException;
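These are the actual write operation and the release of resources. LineRecordWriter, for example, writes the key and the value directly, separated by a delimiter.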
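The write/close contract can be illustrated with a minimal sketch that does not depend on Hadoop. SketchRecordWriter, SketchLineWriter and SketchLineWriterDemo are hypothetical names introduced here for illustration. They are not the Hadoop classes, and close() omits the TaskAttemptContext parameter. The tab separator is an assumption of the example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for the write/close contract (not the Hadoop class).
abstract class SketchRecordWriter<K, V> {
    abstract void write(K key, V value) throws IOException;
    abstract void close() throws IOException;
}

// Writes each record as "key<separator>value\n", in the spirit of a line writer.
class SketchLineWriter<K, V> extends SketchRecordWriter<K, V> {
    private final Writer out;
    private final String separator;

    SketchLineWriter(Writer out, String separator) {
        this.out = out;
        this.separator = separator;
    }

    @Override
    void write(K key, V value) throws IOException {
        boolean hasKey = key != null;
        boolean hasValue = value != null;
        if (!hasKey && !hasValue) {
            return; // nothing to write for an empty record
        }
        if (hasKey) {
            out.write(key.toString());
        }
        if (hasKey && hasValue) {
            out.write(separator);
        }
        if (hasValue) {
            out.write(value.toString());
        }
        out.write('\n');
    }

    @Override
    void close() throws IOException {
        out.close(); // release the underlying resource
    }
}

class SketchLineWriterDemo {
    // Writes two records into an in-memory sink and returns the text produced.
    static String run() {
        StringWriter sink = new StringWriter();
        SketchRecordWriter<String, Integer> writer = new SketchLineWriter<>(sink, "\t");
        try {
            writer.write("hadoop", 3);
            writer.write("yarn", 1);
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(run());
    }
}
```

In a real job the framework, not user code, obtains the writer from the OutputFormat and calls write and close.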
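OutputCommitter defines operations tied to the execution state of the MR job, such as job setup and job failure. Its methods are listed below; some are abstract and others have default implementations: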
public abstract void setupJob(JobContext jobContext) throws IOException;

@Deprecated
public void cleanupJob(JobContext jobContext) throws IOException { }

public void commitJob(JobContext jobContext) throws IOException {
    cleanupJob(jobContext);
}

public void abortJob(JobContext jobContext, JobStatus.State state) throws IOException {
    cleanupJob(jobContext);
}

public abstract void setupTask(TaskAttemptContext taskContext) throws IOException;

public abstract boolean needsTaskCommit(TaskAttemptContext taskContext) throws IOException;

public abstract void commitTask(TaskAttemptContext taskContext) throws IOException;

public abstract void abortTask(TaskAttemptContext taskContext) throws IOException;

public boolean isRecoverySupported() {
    return false;
}

public void recoverTask(TaskAttemptContext taskContext) throws IOException { }
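The commit lifecycle can be sketched in memory as follows. SketchCommitter and SketchCommitterDemo are hypothetical stand-ins, not Hadoop classes. They use plain task-id strings instead of JobContext and TaskAttemptContext, and the stage method is an assumption added so the example has something to commit. The idea shown is that each task attempt stages its output privately, and only committed attempts become visible once the job commits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory committer mirroring the setup/commit/abort lifecycle.
class SketchCommitter {
    private final Map<String, List<String>> pending = new HashMap<>();
    private final List<String> committed = new ArrayList<>();
    private boolean jobCommitted = false;

    void setupJob() {
        pending.clear();
        committed.clear();
        jobCommitted = false;
    }

    void setupTask(String taskId) {
        pending.put(taskId, new ArrayList<>()); // private staging area per attempt
    }

    // Assumed helper: records output produced by a task attempt.
    void stage(String taskId, String record) {
        pending.get(taskId).add(record);
    }

    boolean needsTaskCommit(String taskId) {
        List<String> staged = pending.get(taskId);
        return staged != null && !staged.isEmpty();
    }

    void commitTask(String taskId) {
        List<String> staged = pending.remove(taskId);
        if (staged != null) {
            committed.addAll(staged); // promote staged output
        }
    }

    void abortTask(String taskId) {
        pending.remove(taskId); // discard staged output
    }

    void commitJob() {
        jobCommitted = true;
    }

    // Output is visible only after the job as a whole has committed.
    List<String> visibleOutput() {
        return jobCommitted ? new ArrayList<>(committed) : new ArrayList<>();
    }
}

class SketchCommitterDemo {
    static List<String> run() {
        SketchCommitter committer = new SketchCommitter();
        committer.setupJob();
        committer.setupTask("attempt_0");
        committer.stage("attempt_0", "a\t1");
        committer.setupTask("attempt_1");
        committer.stage("attempt_1", "b\t2");
        if (committer.needsTaskCommit("attempt_0")) {
            committer.commitTask("attempt_0");
        }
        committer.abortTask("attempt_1"); // simulate a failed attempt
        committer.commitJob();
        return committer.visibleOutput();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Only the record staged by the committed attempt is returned; the aborted attempt leaves no trace in the final output.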
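Checks that must pass before anything is written, such as whether the required resources are available and whether the output target can be used as specified, are performed in checkOutputSpecs. FileOutputFormat, for example, uses it to verify that the output directory is set and does not already exist.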
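A pre-write validation of this kind might look like the sketch below, written against java.nio.file rather than the Hadoop FileSystem API. SketchOutputSpecs and its isValid helper are hypothetical names, and the two rules shown (the directory must be set and must not already exist) are assumptions chosen for the example.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical validator in the spirit of checkOutputSpecs.
class SketchOutputSpecs {
    // Fails fast, before any task runs, if the output target is unusable.
    static void checkOutputSpecs(Path outputDir) throws IOException {
        if (outputDir == null) {
            throw new IOException("Output directory not set.");
        }
        if (Files.exists(outputDir)) {
            throw new FileAlreadyExistsException(
                outputDir.toString(), null, "Output directory already exists");
        }
    }

    // Convenience wrapper that reports the result as a boolean.
    static boolean isValid(Path outputDir) {
        try {
            checkOutputSpecs(outputDir);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(isValid(tmp));
        System.out.println(isValid(tmp.resolve("sketch-out-" + System.nanoTime())));
    }
}
```

Rejecting an existing directory up front keeps a job from silently overwriting the results of an earlier run.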