代号菜鸟

August 28, 2019

Summary: /* Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding ...

posted @ 2019-08-28 14:41 代号菜鸟 Views(280) Comments(0) Recommended(0)

August 27, 2019

Custom NiFi Processor

Summary: The following page describes how to build a custom processor: https://www.nifi.rocks/developing-a-custom-apache-nifi-processor-json/ For the detailed development API, see http://nifi.apache.org/developer-guide.h...

posted @ 2019-08-27 16:07 代号菜鸟 Views(635) Comments(0) Recommended(0)

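April 7, 2019

Spark Performance Tuning

Summary: The instructor 石杉 says Spark performance should be optimized from the following aspects, with shuffle tuning as the main focus. Below are several good tuning-related blog posts for reference. Official tuning guide: https://spark.apache.org/docs/latest/tuning.html Serialization: https://stackoverflow.com/qu...

posted @ 2019-04-07 21:11 代号菜鸟 Views(122) Comments(0) Recommended(0)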

April 1, 2019

Spark submit command on YARN

Summary: Below is an example of submitting a Spark job: spark-submit --class HiveColNullRatioStats --master yarn --deploy-mode client --num-executors 3 --executor-memory 6G --executor-cores ...

posted @ 2019-04-01 17:20 代号菜鸟 Views(210) Comments(0) Recommended(0)

March 19, 2019

Two Examples of Creating a Hive UDF

Summary: 1. Full-width to half-width conversion 2. Use hdfs file. Register the function: hadoop fs -put -f full2half-1.0-SNAPSHOT.jar /home/hypers/lib beeline -u jdbc:hive2://******:10000/ -n hdfs -p ...

posted @ 2019-03-19 20:30 代号菜鸟 Views(258) Comments(0) Recommended(0)
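
The summary above names a full-width to half-width conversion but does not show the code. A minimal sketch of that conversion in plain Java follows. It assumes the usual Unicode mapping (full-width forms U+FF01 to U+FF5E sit exactly 0xFEE0 above their ASCII counterparts, and the ideographic space U+3000 maps to an ordinary space). The class name `Full2Half` and the method `toHalfWidth` are hypothetical, and the sketch leaves out the Hive UDF wrapper so that it runs without Hive on the classpath; it is not the code from the post.

```java
// Hypothetical helper: not the code from the original post.
public final class Full2Half {
    private Full2Half() {}

    // Converts full-width ASCII variants (U+FF01..U+FF5E) and the
    // ideographic space (U+3000) to their half-width equivalents.
    // All other characters are copied through unchanged.
    public static String toHalfWidth(String input) {
        if (input == null) {
            return null; // Hive passes NULL column values as null
        }
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\u3000') {
                sb.append(' ');
            } else if (c >= '\uFF01' && c <= '\uFF5E') {
                sb.append((char) (c - 0xFEE0));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Input is the full-width spelling of "HIVE 2019", written with
        // Unicode escapes so the source file stays pure ASCII.
        String fullWidth = "\uFF28\uFF29\uFF36\uFF25\u3000\uFF12\uFF10\uFF11\uFF19";
        System.out.println(toHalfWidth(fullWidth));
    }
}
```

In an actual UDF, a method like this would presumably be called from the function class packaged in the jar and registered through beeline, as the summary describes.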

November 18, 2018

Hive Storage Formats Explained

posted @ 2018-11-18 17:01 代号菜鸟 Views(2345) Comments(0) Recommended(1)

October 8, 2018

Spark Word Count

Summary: spark-submit --class WordCount --master yarn-cluster --num-executors 10 --executor-memory 6G --executor-cores 4 --driver-memory 1G / ...

posted @ 2018-10-08 22:34 代号菜鸟 Views(364) Comments(0) Recommended(0)

August 28, 2017

HBase MapReduce

Summary: 1. HBase to HBase: the Mapper extends TableMapper and takes the row key and Result as input; the Reducer extends TableReducer; Driver. 2. HBase to File: Mapper; No Reducer; Reducer; Driver. 3. File to ...

posted @ 2017-08-28 08:14 代号菜鸟 Views(278) Comments(0) Recommended(0)

August 16, 2017

HBase Operations

Summary: CellCounter: Count cells in HBase table. completebulkload: Complete a bulk data load. copytable: Export a table from local cluster to peer cluster. expo...

posted @ 2017-08-16 20:06 代号菜鸟 Views(588) Comments(0) Recommended(0)

HBase Java API Example

Summary: import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileStatus; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.hbase.*; import...

posted @ 2017-08-16 15:44 代号菜鸟 Views(257) Comments(0) Recommended(0)
