GrantYu

May 12, 2014

spark app

Summary: Analyzing Apache access logs with Spark and Scala: http://www.jdon.com/bigdata/analyzing-apache-access-logs-files-spark-scala.html

posted @ 2014-05-12 17:03 GrantYu Views(285) Comments(0) Recommended(0)

source code spark

Summary:
http://blog.csdn.net/pelick/article/category/1556747
http://www.cnblogs.com/hseagle/

posted @ 2014-05-12 17:01 GrantYu Views(259) Comments(0) Recommended(0)

spark dev by IDEA

Summary: Exploring Spark: building a development environment with IntelliJ IDEA

posted @ 2014-05-12 16:51 GrantYu Views(190) Comments(0) Recommended(0)

Compiling spark-0.9.1

Summary: Preparation: note that spark-0.9.1 requires scala-2.10.x and sbt-0.12.4. System: CentOS 6.4 x64, Java 1.7.0 x64.
1. Install scala-2.10.x.
2. Install sbt-0.12.4: download the rpm, http://www.scala...

posted @ 2014-05-12 14:34 GrantYu Views(300) Comments(0) Recommended(0)

May 5, 2014

Git in Diagrams (repost)

Summary: Git in Diagrams (repost): http://nettedfish.sinaapp.com/blog/2013/08/05/deep-into-git-with-diagrams/

posted @ 2014-05-05 10:53 GrantYu Views(107) Comments(0) Recommended(0)

April 24, 2014

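Distributed Spark Installation

Summary: Three servers (n0, n2, n3) running CentOS 6.4 x64, with JDK, Scala 2.11, Hadoop 2.2.0, and spark-0.9.1-bin-hadoop2.tgz. Notes:
1. Install Scala on all machines.
2. Install Spark on all machines; it can be configured on the master machine and then copied to the remaining nodes with scp.

posted @ 2014-04-24 17:10 GrantYu Views(484) Comments(0) Recommended(0)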

April 16, 2014

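Inverted Index

Summary: An inverted index is an index structure that stores a mapping between words and the locations of those words in one or more documents. It is usually implemented with an associative array. It comes in two forms: the inverted file index, of the form {term, IDs of the documents containing the term}, and the full inverted index, whose form is...

posted @ 2014-04-16 17:22 GrantYu Views(1276) Comments(0) Recommended(1)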
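
The inverted file index form described in the summary above, {term, IDs of the documents containing the term}, can be sketched in a few lines of Python. This is only a minimal illustration and is not taken from the original post: the function name `build_inverted_file_index`, the whitespace tokenization, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_file_index(docs):
    """Map each term to the sorted list of IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenization (an assumption): lowercase, then split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    # Convert the sets to sorted lists so lookups return a stable order.
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents, keyed by document ID.
docs = {
    0: "spark reads the access log",
    1: "hadoop stores the log",
    2: "spark and hadoop",
}

index = build_inverted_file_index(docs)
print(index["spark"])  # [0, 2]
print(index["log"])    # [0, 1]
```

A Python `dict` plays the role of the associative array mentioned in the summary; looking up a term returns the documents that contain it without scanning any document text.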

April 9, 2014

Stanford: Creating a Hadoop-2.x project in Eclipse

Summary:
Creating a Hadoop-2.x project in Eclipse: http://snap.stanford.edu/class/cs246-data-2014/hw0.pdf
Hadoop WordCount with new map reduce api: http://codesfusion.blogspot.com/2013/10/hadoop-wordcount-with-new-map-reduce-api.html

posted @ 2014-04-09 23:58 GrantYu Views(243) Comments(0) Recommended(0)

Hadoop – The Definitive Guide Examples, IntelliJ

Summary: IntelliJ Project for Building Hadoop – The Definitive Guide Examples: http://vichargrave.com/intellij-project-for-building-hadoop-the-definitive-guide-examples/

posted @ 2014-04-09 22:51 GrantYu Views(171) Comments(0) Recommended(0)

Creating a Hadoop-2.x project in Eclipse

Summary: Creating a Hadoop-2.x project in Eclipse
hortonworks: MapReduce Ports: http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.0/bk_reference/content/reference_chap2_2.html
hadoop-1.x cluster default configuration and common configuration: http://www.cnblogs.com/ggjucheng/archive/2012/04/17/2454590.html
Setting up a Hadoop-2.x development environment in Eclipse {good}: http://blog.csdn.n

posted @ 2014-04-09 19:16 GrantYu Views(367) Comments(0) Recommended(0)

April 7, 2014

Create a Hadoop Build and Development Environment

Summary:
Create a Hadoop Build and Development Environment: http://vichargrave.com/create-a-hadoop-build-and-development-environment-for-hadoop/
Debugging Hadoop Applications with IntelliJ: http://vichargrave.com/debugging-hadoop-applications-with-intellij/

posted @ 2014-04-07 15:31 GrantYu Views(154) Comments(0) Recommended(0)

April 3, 2014

Compiling the Eclipse Plugin for Hadoop-2.3.0

Summary: Compiling the Eclipse plugin for Hadoop-2.3.0:
# cd /usr/local/src/hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin
# ant jar -Dversion=2.3.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/home/hm/hadoop
The build itself is simple. A common problem: ivy-2.1.0.jar cannot be downloaded because of a proxy issue, so a proxy has to be configured.
Can't get http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.

posted @ 2014-04-03 15:43 GrantYu Views(1408) Comments(3) Recommended(0)

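Storm Cluster Installation and Deployment Steps (Detailed Version)

Summary: Assumptions: 1. jdk, python, and unzip are already installed; 2. a Zookeeper cluster has already been set up.
1. Install the Storm dependencies. They must be installed on the Nimbus and Supervisor machines.
1.1 ZeroMQ
$ ./configure
$ make
$ sudo make install
1.2 JZMQ
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
ZMQ and JZMQ are installed under /usr/local/lib by default.
2. Download and unpack a Storm release: https://github.com/nathanmarz/storm

posted @ 2014-04-03 11:48 GrantYu Views(236) Comments(0) Recommended(0)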

How-to: Use HBase Bulk Loading, and Why

Summary: How-to: Use HBase Bulk Loading, and Why: http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/

posted @ 2014-04-03 11:47 GrantYu Views(125) Comments(0) Recommended(0)

Distributed HBase Installation

Summary: Distributed installation of hbase-0.98.0-hadoop2-bin.tar. Prerequisite: Hadoop and zookeeper are already installed.
hadoop port 9000
zookeeper port 2181, dir /var/lib/zookeeper
[hm@n0 ~]$ tar -zxv...

posted @ 2014-04-03 11:46 GrantYu Views(273) Comments(0) Recommended(0)
