Notes on using DBOutputFormat in Hadoop MapReduce

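When the Reducer stage of a job that writes through DBOutputFormat has several hundred thousand records to insert, the task fails with an out-of-memory error. A likely reason is that the DBOutputFormat record writer adds every record to a single JDBC batch and only executes that batch when the task closes, so all pending rows sit in the task's heap at once.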

One thing to try is raising the heap size of the task JVMs in mapred-site.xml:

<property>
<name>mapred.child.java.opts</name>
<value>-Xmx512m</value>
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc

The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>
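After changing the setting, it is worth checking that the task JVM really picked it up. The sketch below is only an illustration and uses nothing but the JDK: `HeapCheck`, `maxHeapMiB` and `describe` are hypothetical names that are not part of Hadoop. The assumption is that you would call `HeapCheck.describe()` from your own Reducer (for example at the start of the task) and write the result to the task log.

```java
// Hypothetical helper (not part of Hadoop): reports the heap limit of the
// JVM it runs in, so you can see whether -Xmx from mapred.child.java.opts
// was applied to the task process.
final class HeapCheck {
    private HeapCheck() {}

    /** Maximum amount of memory the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        // Runtime.maxMemory() returns the limit in bytes.
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Human-readable form, suitable for a log line. */
    static String describe() {
        return "max heap = " + maxHeapMiB() + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

After compiling, running `java -Xmx512m HeapCheck` locally should print a value close to 512; the exact number can be somewhat lower depending on the JVM and garbage collector, so treat it as a sanity check and not an exact measurement.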

posted @ 2013-08-07 16:17 小湖海

奋斗的历程

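Java developer; practitioner of high availability, distributed clusters, and in-memory databases; follower of ODE process-engine research; pseudo-.NET developer; currently working on the XXXX cloud computing platform.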
