Notes on using DBOutputFormat in Hadoop MapReduce
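When the Reducer stage of a DBOutputFormat job has several hundred thousand records to insert, the task fails with an out-of-memory error. A likely cause is that DBOutputFormat's record writer adds every record to a single JDBC batch and only executes that batch when the task closes, so all pending rows sit in the child JVM's heap.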
One thing to try is raising the child JVM heap size in mapred-site.xml:
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx512m</value>
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>
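If 512 MB is still not enough, the same property can be raised further. The fragment below is only a sketch based on the example given in the description above: the 1 GB heap and the GC log path are illustrative values rather than tested settings, and the heap must still fit within the memory available to each task slot on the node.

```xml
<property>
  <name>mapred.child.java.opts</name>
  <!-- Assumed values: 1 GB heap, plus verbose GC logging to /tmp/@taskid@.gc -->
  <value>-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc</value>
</property>
```

With GC logging enabled, the per-task log should show whether the reducer's heap fills steadily as records are written, which helps confirm that the inserts are what exhausts memory.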