March 2015 Post Archive - 瞌睡中的葡萄虎


posted @ 2015-03-27 13:16 瞌睡中的葡萄虎 Views (167) Comments (0) Recommendations (0)

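Summary: During testing today, a YARN node became Unhealthy; the cause turned out to be insufficient disk space. Searching for files larger than 100 MB turned up many copies of spark-assembly-1.4.0-SNAPSHOT-hadoop2.5.0-cdh5.3.1.jar, each a little over 170 MB; every application submitted to y...

posted @ 2015-03-24 16:22 瞌睡中的葡萄虎 Views (836) Comments (0) Recommendations (0)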
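
The search for files larger than 100 MB mentioned in the summary above could be sketched roughly as follows. The summary does not say which tool the author used, so this is only an illustration in Python; the function name `find_large_files` and its parameters are hypothetical.

```python
import os


def find_large_files(root, min_bytes=100 * 1024 * 1024):
    """Yield (path, size_in_bytes) for regular files under root larger than min_bytes.

    Hypothetical helper for illustration; not taken from the original post.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so that a linked file is not reported twice.
                if os.path.islink(path):
                    continue
                size = os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; ignore it and move on.
                continue
            if size > min_bytes:
                yield path, size
```

Calling `find_large_files` on a node's local directory and sorting the results by size would make it easy to spot many repeated copies of the same large jar.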

Spark 1.3: filtering on any string-typed value fails when using an external data source

Summary: CREATE TEMPORARY TABLE spark_tbls USING org.apache.spark.sql.jdbc OPTIONS (url 'jdbc:mysql://hadoop000:3306/hive?user=root&password=root', dbtable ...

posted @ 2015-03-23 19:01 瞌睡中的葡萄虎 Views (807) Comments (0) Recommendations (0)

A pitfall encountered while compiling Spark 1.3

Summary: When compiling Spark 1.3.0: export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"; mvn clean package -DskipTests -Phadoop-2.4 -Dhadoop.versi...

posted @ 2015-03-18 17:57 瞌睡中的葡萄虎 Views (1257) Comments (0) Recommendations (0)

Using Hive on Spark via hiveserver2

Summary: Start hiveserver2: hiveserver2 --hiveconf hive.execution.engine=spark spark.master=yarn; then connect to hiveserver2 with beeline: beeline -u jdbc:hive2://hadoop000:10000 -n sp...

posted @ 2015-03-12 18:18 瞌睡中的葡萄虎 Views (2132) Comments (0) Recommendations (0)

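Hive on Spark overview

Summary: Hive currently supports two execution engines, mr and tez, with mr as the default. The goal of Hive on Spark is to add a spark execution engine so that Hive can run on top of Spark. Before running a Hive QL script, specify the execution engine, spark.home, and spark.master: set hive.execution.engine=...

posted @ 2015-03-11 18:43 瞌睡中的葡萄虎 Views (1678) Comments (0) Recommendations (0)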

Setting up a Hive on Spark environment

Summary: Building Spark from source and setting up the environment. Note that you must have a version of Spark which does not include the Hive jars. Building Spark: git clone https://github.com/apache/spark.git sp...

posted @ 2015-03-10 18:03 瞌睡中的葡萄虎 Views (3195) Comments (0) Recommendations (0)

Common RDD methods: subtract, intersection, and cartesian

Summary: subtract: Return an RDD with the elements from `this` that are not in `other`. def subtract(other: RDD[T]): RDD[T]; def subtract(other: RDD[T], numParti...

posted @ 2015-03-04 16:17 瞌睡中的葡萄虎 Views (1333) Comments (0) Recommendations (0)
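
To make the three operations named in the title above concrete, here is a small model of their semantics on ordinary Python lists. It is not the Spark API and does not come from the original post: the function names merely mirror the RDD methods. It also fixes an element order for readability, whereas Spark makes no ordering guarantee for these results.

```python
from itertools import product


def subtract(this, other):
    """Elements of `this` that do not occur in `other` (duplicates in `this` are kept)."""
    other_set = set(other)
    return [x for x in this if x not in other_set]


def intersection(this, other):
    """Elements present in both inputs, each reported once even if the inputs repeat it."""
    other_set = set(other)
    seen = set()
    result = []
    for x in this:
        if x in other_set and x not in seen:
            seen.add(x)
            result.append(x)
    return result


def cartesian(this, other):
    """All pairs (a, b) with a taken from `this` and b taken from `other`."""
    return list(product(this, other))
```

For example, `subtract([1, 1, 2, 3], [2])` keeps both copies of 1, while `intersection([1, 2, 2, 3], [2, 3, 3, 5])` reports 2 and 3 only once each.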

SparkSQL DataFrames operations

Summary: The emp and dept tables already exist in Hive: select * from emp; +--------+---------+------------+-------+-------------+---------+---------+---------+| empno | ename | job ...

posted @ 2015-03-03 15:41 瞌睡中的葡萄虎 Views (1917) Comments (0) Recommendations (0)
