$王大少


Compression in Hadoop

1. Setting compression in code

  

Configure compression for the map stage output

 

import org.apache.hadoop.conf.Configuration;

// Compress the intermediate map output with Snappy
Configuration configuration = new Configuration();
configuration.set("mapreduce.map.output.compress", "true");
configuration.set("mapreduce.map.output.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec");

Configure compression for the reduce stage output

 

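// Compress the final job output written by the reducers with Snappy
configuration.set("mapreduce.output.fileoutputformat.compress", "true");
// The compression type applies when the job output is written as SequenceFiles
configuration.set("mapreduce.output.fileoutputformat.compress.type", "RECORD");
configuration.set("mapreduce.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec");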
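The five settings above can also be kept in one place and applied in a single pass. The following is a minimal sketch that uses only the JDK. The class name `CompressionSettings` and its methods are hypothetical names made up for this illustration and are not part of Hadoop. With Hadoop on the classpath, the resulting map could presumably be applied to a job configuration with `CompressionSettings.all().forEach(configuration::set)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): gathers the compression
// properties from this post so a driver can apply them in one pass.
class CompressionSettings {
    static final String SNAPPY = "org.apache.hadoop.io.compress.SnappyCodec";

    // Properties for the intermediate map output.
    static Map<String, String> mapOutput() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("mapreduce.map.output.compress", "true");
        m.put("mapreduce.map.output.compress.codec", SNAPPY);
        return m;
    }

    // Properties for the final job output written by the reducers.
    static Map<String, String> jobOutput() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("mapreduce.output.fileoutputformat.compress", "true");
        m.put("mapreduce.output.fileoutputformat.compress.type", "RECORD");
        m.put("mapreduce.output.fileoutputformat.compress.codec", SNAPPY);
        return m;
    }

    // Map-stage settings first, then job-output settings, in insertion order.
    static Map<String, String> all() {
        Map<String, String> m = new LinkedHashMap<>(mapOutput());
        m.putAll(jobOutput());
        return m;
    }

    public static void main(String[] args) {
        // Print each property as "name = value".
        all().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Keeping the property names in one helper means a typo in a key only has to be fixed once, which matters because Hadoop silently ignores unknown property names.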

2. Configuring MapReduce compression globally

We can edit the mapred-site.xml configuration file and then restart the cluster, so that compression applies to all MapReduce jobs.

Compress the map output data

 

<property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
</property>

<property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

 

 

Compress the reduce output data

 

<property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>true</value>
</property>

<property>
    <name>mapreduce.output.fileoutputformat.compress.type</name>
    <value>RECORD</value>
</property>

<property>
    <name>mapreduce.output.fileoutputformat.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

mapred-site.xml must be modified on every node. Once the changes are made, remember to restart the cluster.
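Because every node needs the same entries, it can help to check a node's mapred-site.xml before restarting. The sketch below uses only the JDK's XML parser to read the `<property>` entries and report which of the five properties from this post are missing. The class `MapredSiteCheck` and its methods are hypothetical helper names, not a Hadoop tool, and the XML is passed in as a string so the example stays self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical checker (not a Hadoop tool) for the entries shown in this post.
class MapredSiteCheck {
    static final String[] REQUIRED = {
        "mapreduce.map.output.compress",
        "mapreduce.map.output.compress.codec",
        "mapreduce.output.fileoutputformat.compress",
        "mapreduce.output.fileoutputformat.compress.type",
        "mapreduce.output.fileoutputformat.compress.codec"
    };

    // Read every <property> element into a name -> value map.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                // Skip entries that lack either a <name> or a <value>.
                if (names.getLength() > 0 && values.getLength() > 0) {
                    props.put(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse mapred-site.xml content", e);
        }
    }

    // List the required property names that the given XML does not define.
    static List<String> missing(String xml) {
        Map<String, String> props = parse(xml);
        List<String> out = new ArrayList<>();
        for (String key : REQUIRED) {
            if (!props.containsKey(key)) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>mapreduce.map.output.compress</name><value>true</value></property>"
                + "</configuration>";
        System.out.println("missing: " + missing(xml));
    }
}
```

In practice the string would come from reading the file on each node, for example with `java.nio.file.Files.readString`, and the location of mapred-site.xml depends on the installation.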

 

 

 

Posted on 2020-03-29 16:14 by $王大少