Post archive "May 2020" - liuluvaliant

Spark job submission parameters

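Summary: Blog posts found online about allocating Executors, Cores, and Memory, recorded here first and to be consolidated later. <1> First post: does Spark need as much memory as the data it processes? Spark does not need 1 TB of memory to process 1 TB of data. How much memory it actually needs is determined by the number of cores per executor and the block size of each read of the dataset. Taking a read from HDFS ...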
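The claim that memory depends on cores and block size can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not taken from the original post: it assumes each executor core runs one task at a time and each task reads one block, and it uses a 128 MiB block size and 4 cores purely as example values. The class and method names are hypothetical, and the estimate ignores deserialization overhead, shuffle, and caching.

```java
// Rough estimate of the raw input one executor reads concurrently.
// Assumption: one task per core, one block per task.
final class ExecutorMemoryEstimate {

    /** Bytes of raw input being read at the same time by a single executor. */
    static long inFlightBytes(int coresPerExecutor, long blockSizeBytes) {
        if (coresPerExecutor <= 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("cores and block size must be positive");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact((long) coresPerExecutor, blockSizeBytes);
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed block size: 128 MiB
        int cores = 4;                       // assumed value for --executor-cores
        long bytes = inFlightBytes(cores, blockSize);
        System.out.println("Raw input in flight per executor: "
                + (bytes / (1024 * 1024)) + " MiB");
    }
}
```

Under these assumptions the amount of data in flight scales with cores times block size, not with the total dataset size, which is why a 1 TB input does not call for 1 TB of executor memory.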

posted @ 2020-05-19 16:08 liuluvaliant Views(631) Comments(0) Recommended(0)

Reading HBase with Java Spark

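Summary: This post records the basic operations for reading HBase from Spark, along with an example of reading multi-version HBase data. The HBase sample data is as follows: The sample code is as follows: package org.HbaseLearn.Util; import org.apache.hadoop.conf.Configuration; import org.apache. ...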

posted @ 2020-05-12 17:53 liuluvaliant Views(2651) Comments(0) Recommended(1)

Integrating Impala with HBase

Summary: Step 1: create an HBase table and add data to it.
create 'test_info', 'info'
put 'test_info','0001','info:birthday','2020-01-01'
put 'test_info','0001','info:gender','m ...

posted @ 2020-05-09 16:41 liuluvaliant Views(773) Comments(0) Recommended(0)

Elasticsearch configuration parameters

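Summary: es.nodes.wan.only (default false): in this mode (=true), the connector disables discovery and connects only through the declared es.nodes during all operations, including reads and writes. Performance is significantly affected in this mode. es.index.read.missing.as.empty (default no ...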
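As a small illustration of the es.nodes.wan.only setting described above, the following sketch builds the connector options as a plain key/value map. Only the two setting names come from the summary; the class name, the helper method, and the host es.example.internal:9200 are hypothetical, and how the map is handed to the connector depends on the integration in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds connector settings for WAN-only mode as plain key/value pairs.
final class EsWanOnlySettings {

    /** Returns settings that restrict the connector to the declared nodes. */
    static Map<String, String> wanOnly(String nodes) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("es.nodes", nodes);           // the only nodes the connector will contact
        settings.put("es.nodes.wan.only", "true"); // default is false; true disables discovery
        return settings;
    }

    public static void main(String[] args) {
        // Hypothetical host, for illustration only.
        Map<String, String> settings = wanOnly("es.example.internal:9200");
        settings.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Because this mode routes every read and write through the declared nodes, it may be worth enabling only when the cluster's other nodes are not reachable directly.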

posted @ 2020-05-08 10:54 liuluvaliant Views(3576) Comments(0) Recommended(0)
