Hadoop Notes
The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.
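A minimal sketch of what that contract looks like for a custom key. To keep it compilable without Hadoop on the classpath, it declares a local stand-in interface, SketchWritableComparable, which is an assumption that mirrors the write(DataOutput) / readFields(DataInput) / compareTo shape of org.apache.hadoop.io.WritableComparable. The key type YearStationKey and its fields are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Local stand-in (assumption) for org.apache.hadoop.io.WritableComparable,
// so the sketch runs with the JDK alone.
interface SketchWritableComparable<T> extends Comparable<T> {
    void write(DataOutput out) throws IOException;      // serialize fields
    void readFields(DataInput in) throws IOException;   // deserialize fields
}

// Hypothetical composite key: (year, stationId).
final class YearStationKey implements SketchWritableComparable<YearStationKey> {
    private int year;
    private String stationId = "";

    // No-arg constructor: the object is created empty, then filled by readFields.
    YearStationKey() {}

    YearStationKey(int year, String stationId) {
        this.year = year;
        this.stationId = stationId;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeUTF(stationId);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        year = in.readInt();
        stationId = in.readUTF();
    }

    // Sort order used when keys are sorted: by year, then by station id.
    @Override
    public int compareTo(YearStationKey other) {
        int byYear = Integer.compare(year, other.year);
        return byYear != 0 ? byYear : stationId.compareTo(other.stationId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof YearStationKey)) return false;
        YearStationKey k = (YearStationKey) o;
        return year == k.year && stationId.equals(k.stationId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(year, stationId);
    }

    // Serializes the key to bytes and reads it back into a fresh instance.
    static YearStationKey roundTrip(YearStationKey key) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            key.write(new DataOutputStream(buffer));
            YearStationKey copy = new YearStationKey();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real job the class would implement Hadoop's own WritableComparable instead of the stand-in. equals and hashCode are overridden together because hash-based partitioning of keys relies on a consistent hashCode.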
How Many Maps?
Task setup takes a while, so it is best if the maps take at least a minute to execute.
Reducer
The number of reduces for the job is set by the user via Job.setNumReduceTasks(int).
Reducer has 3 primary phases: shuffle, sort and reduce.
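The three phases, and the effect of the number of reduces, can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not how Hadoop is implemented: the class ReducePhasesSketch and its methods are hypothetical, the reduce step is a word-count style sum, and the partition formula copies the one used by Hadoop's default HashPartitioner. In Hadoop itself the shuffle and sort phases overlap, since map outputs are merged while they are being fetched.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReducePhasesSketch {

    // Same formula as Hadoop's default HashPartitioner: picks the reduce
    // that receives a given key.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Simulates shuffle, sort and reduce over (word, count) map outputs.
    // Returns one result map per reduce, indexed by partition number.
    static List<Map<String, Integer>> run(
            List<Map.Entry<String, Integer>> mapOutput, int numReduceTasks) {

        // Shuffle: route every record to the partition of its key.
        // Sort: a TreeMap keeps each partition's keys ordered and groups
        // all values that share a key.
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new TreeMap<>());
        }
        for (Map.Entry<String, Integer> record : mapOutput) {
            int p = partitionFor(record.getKey(), numReduceTasks);
            partitions.get(p)
                    .computeIfAbsent(record.getKey(), k -> new ArrayList<>())
                    .add(record.getValue());
        }

        // Reduce: called once per key with all of that key's values.
        List<Map<String, Integer>> results = new ArrayList<>();
        for (TreeMap<String, List<Integer>> partition : partitions) {
            Map<String, Integer> output = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> group : partition.entrySet()) {
                int sum = 0;
                for (int value : group.getValue()) {
                    sum += value;
                }
                output.put(group.getKey(), sum);
            }
            results.add(output);
        }
        return results;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        mapOutput.add(new AbstractMap.SimpleEntry<>("b", 1));
        mapOutput.add(new AbstractMap.SimpleEntry<>("a", 1));
        mapOutput.add(new AbstractMap.SimpleEntry<>("a", 1));
        // The second argument plays the role of Job.setNumReduceTasks(int).
        System.out.println(run(mapOutput, 2));
    }
}
```

Every occurrence of a key lands in the same partition, so each reduce sees the complete list of values for the keys it owns, and changing the number of reduces only changes how keys are spread across the outputs.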