Summary: When a MapReduce job starts, Hadoop has to split the input files and read them according to a predefined format. Both operations are handled by the InputFormat, which has three main responsibilities: 1. Validate the input-specification of the job. 2. Split-up the input file(s) into logical InputSplits, each of which is then assigned to an individual Mapper. 3. Provide the RecordReader implementation to be ...
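To make the three responsibilities concrete, here is a minimal sketch in plain Java. It does not use Hadoop's classes: `InputFormatSketch`, `Split`, `validate`, `computeSplits`, and `readRecords` are hypothetical names chosen for illustration, and the input is assumed to be newline-delimited text held in a byte array. The real `InputFormat` and `RecordReader` work on files in a distributed file system and are considerably more involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for the three InputFormat duties. Not Hadoop's API. */
final class InputFormatSketch {

    /** A logical split: a byte range of the input, not a physical copy. */
    static final class Split {
        final int start;
        final int length;

        Split(int start, int length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Duty 1: validate the input specification before any work is scheduled. */
    static void validate(byte[] input, int splitSize) {
        if (input == null || input.length == 0) {
            throw new IllegalArgumentException("input is missing or empty");
        }
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
    }

    /** Duty 2: cut the input into logical byte ranges, one per mapper. */
    static List<Split> computeSplits(int totalLength, int splitSize) {
        List<Split> splits = new ArrayList<>();
        for (int start = 0; start < totalLength; start += splitSize) {
            splits.add(new Split(start, Math.min(splitSize, totalLength - start)));
        }
        return splits;
    }

    /**
     * Duty 3: turn one split into records (here, lines of text).
     * A split boundary may fall in the middle of a line, so every split
     * except the first skips ahead to the next line start, and every split
     * keeps reading lines that begin at or before its end offset. This way
     * each line is produced by exactly one split.
     */
    static List<String> readRecords(byte[] data, Split split) {
        int pos = split.start;
        int end = split.start + split.length;
        if (pos != 0) {
            // The previous split is responsible for this (possibly partial) line.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }
        List<String> records = new ArrayList<>();
        while (pos <= end && pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step past the newline
        }
        return records;
    }
}
```

A caller would run `validate` once, call `computeSplits` to get the byte ranges, and then hand each `Split` to a separate worker that calls `readRecords`. Because the splits are logical ranges, a reader may read slightly past the end of its own range to finish the last line it started.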