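Deploying the Alluxio In-Memory Storage System
I. Download and Extract
1) Download page: http://www.alluxio.org/download
2) Download and extract the archive with the following commands: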
$ wget http://alluxio.org/downloads/files/1.2.0/alluxio-1.2.0-bin.tar.gz
$ tar xvfz alluxio-1.2.0-bin.tar.gz
$ cd alluxio-1.2.0
II. Configuration File Changes
Only the basic settings are changed for now. The edited alluxio-env.sh is shown below:
#!/usr/bin/env bash
#
# The Alluxio Open Foundation licenses this work under the Apache License, version 2.0
# (the "License"). You may not use this work except in compliance with the License, which is
# available at www.apache.org/licenses/LICENSE-2.0
#
# This software is distributed on an "AS IS" basis, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
# either express or implied, as more fully set forth in the License.
#
# See the NOTICE file distributed with this work for information regarding copyright ownership.
#

# Copy it as alluxio-env.sh and edit that to configure Alluxio for your
# site. This file is sourced to launch Alluxio servers or use Alluxio shell
# commands.
#
# This file provides one way to configure Alluxio options by setting the
# following listed environment variables. Note that, setting this file will not
# affect jobs (e.g., Spark job or MapReduce job) that are using Alluxio client
# as a library. Alternatively, you can edit alluxio-site.properties file, where
# you can set all the configuration options supported by Alluxio
# (http://alluxio.org/documentation/) which is respected by both external jobs
# and Alluxio servers (or shell).

# The directory where Alluxio deployment is installed. (Default: the parent directory of libexec/).
export ALLUXIO_HOME=/data/spark/software/alluxio-1.2.0

# The directory where log files are stored. (Default: ${ALLUXIO_HOME}/logs).
# ALLUXIO_LOGS_DIR

# Hostname of the master.
# ALLUXIO_MASTER_HOSTNAME
export ALLUXIO_MASTER_HOSTNAME=spark29

# This is now deprecated. Support will be removed in v2.0
# ALLUXIO_MASTER_ADDRESS
#export ALLUXIO_MASTER_ADDRESS=spark29

# The directory where a worker stores in-memory data. (Default: /mnt/ramdisk).
# E.g. On linux, /mnt/ramdisk for ramdisk, /dev/shm for tmpFS; on MacOS, /Volumes/ramdisk for ramdisk
# ALLUXIO_RAM_FOLDER
export ALLUXIO_RAM_FOLDER=/data/spark/software/alluxio-1.2.0/ramdisk

# Address of the under filesystem address. (Default: ${ALLUXIO_HOME}/underFSStorage)
# E.g. "/my/local/path" to use local fs, "hdfs://localhost:9000/alluxio" to use a local hdfs
# ALLUXIO_UNDERFS_ADDRESS
export ALLUXIO_UNDERFS_ADDRESS=hdfs://spark29:9000

# How much memory to use per worker. (Default: 1GB)
# E.g. "1000MB", "2GB"
# ALLUXIO_WORKER_MEMORY_SIZE
export ALLUXIO_WORKER_MEMORY_SIZE=12GB

# Config properties set for Alluxio master, worker and shell. (Default: "")
# E.g. "-Dalluxio.master.port=39999"
# ALLUXIO_JAVA_OPTS

# Config properties set for Alluxio master daemon. (Default: "")
# E.g. "-Dalluxio.master.port=39999"
# ALLUXIO_MASTER_JAVA_OPTS

# Config properties set for Alluxio worker daemon. (Default: "")
# E.g. "-Dalluxio.worker.port=49999" to set worker port, "-Xms2048M -Xmx2048M" to limit the heap size of worker.
# ALLUXIO_WORKER_JAVA_OPTS

# Config properties set for Alluxio shell. (Default: "")
# E.g. "-Dalluxio.user.file.writetype.default=CACHE_THROUGH"
# ALLUXIO_USER_JAVA_OPTS
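Before starting anything, it can be worth confirming that the exported variables resolve to what was intended, since a stray space after the = sign silently breaks an export. The following is only a minimal sketch: check_alluxio_env is a hypothetical helper name (it is not shipped with Alluxio), and it checks nothing more than that the five variables exported above are non-empty and free of whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Alluxio): report any of the variables
# exported in alluxio-env.sh that are empty or contain whitespace.
check_alluxio_env() {
  local name value status=0
  for name in ALLUXIO_HOME ALLUXIO_MASTER_HOSTNAME ALLUXIO_RAM_FOLDER \
              ALLUXIO_UNDERFS_ADDRESS ALLUXIO_WORKER_MEMORY_SIZE; do
    value="${!name}"   # indirect expansion: the value of the variable named by $name
    if [ -z "$value" ]; then
      echo "MISSING $name"
      status=1
    elif [[ "$value" =~ [[:space:]] ]]; then
      echo "WHITESPACE $name"
      status=1
    else
      echo "OK $name=$value"
    fi
  done
  return "$status"
}

# Usage, assuming the file was saved as conf/alluxio-env.sh under the
# installation directory used in this post:
#   source /data/spark/software/alluxio-1.2.0/conf/alluxio-env.sh && check_alluxio_env
```

The function returns non-zero as soon as any variable looks wrong, so it can be chained in front of the start command.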
III. Host Configuration Changes
1) In the home directory, edit .bash_profile and add the following:
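IV. Add the Dependency JAR to Spark
1. On every Spark host, go to the conf directory under the Spark installation directory.
Edit spark-env.sh and append the following line at the end:
export SPARK_CLASSPATH="/data/spark/software/spark-1.5.2-bin-hadoop2.6/lib/alluxio-core-client-spark-1.2.0-jar-with-dependencies.jar:$SPARK_CLASSPATH"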
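Because the JAR has to sit at that exact path on every Spark host, a quick check of the classpath entries can save a confusing ClassNotFoundException later. This is a rough sketch only: missing_classpath_entries is a hypothetical helper name, and it simply reports the entries of a colon-separated classpath that do not exist on the local host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every entry of a colon-separated classpath that
# does not exist on this host. Returns non-zero if at least one is missing.
missing_classpath_entries() {
  local entries entry status=0
  IFS=: read -r -a entries <<< "$1"   # split the argument on ':'
  for entry in "${entries[@]}"; do
    if [ -z "$entry" ]; then
      continue                        # skip empty segments such as a trailing ':'
    fi
    if [ ! -e "$entry" ]; then
      echo "missing: $entry"
      status=1
    fi
  done
  return "$status"
}

# Usage, after spark-env.sh has been sourced on the host being checked:
#   missing_classpath_entries "$SPARK_CLASSPATH"
```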
V. Distribute to All Worker Nodes
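The post does not show the command used for this step. One possible way to do it is sketched below, under several assumptions that are not from the original: that it is the configured Alluxio directory being copied, that the worker hostnames are listed one per line in a file (such as the conf/workers file), that passwordless SSH is set up, and that every node uses the same installation path. distribute_alluxio is a hypothetical function name, and by default it only prints the rsync commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: copy an installation directory to every host listed in a workers
# file (one hostname per line; blank lines and '#' comments are skipped).
# With DRY_RUN=1 (the default here) the rsync commands are only printed.
distribute_alluxio() {
  local src_dir="$1" workers_file="$2" host
  while IFS= read -r host || [ -n "$host" ]; do
    host="${host%%#*}"               # drop comments
    host="${host//[[:space:]]/}"     # drop any whitespace
    if [ -z "$host" ]; then
      continue
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "rsync -az ${src_dir}/ ${host}:${src_dir}/"
    else
      # </dev/null keeps rsync/ssh from swallowing the rest of the workers file
      rsync -az "${src_dir}/" "${host}:${src_dir}/" </dev/null
    fi
  done < "$workers_file"
}

# Usage with the paths from this post (prints the commands first):
#   distribute_alluxio /data/spark/software/alluxio-1.2.0 \
#       /data/spark/software/alluxio-1.2.0/conf/workers
# Then run again with DRY_RUN=0 to actually copy.
```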
VI. Format and Start
1. Go to the bin directory under the Alluxio installation directory and run alluxio format to format the in-memory storage.
2. Start the cluster: ./alluxio-start.sh all
VII. Possible Problems
1. A worker fails to start with the error: Pseudo-terminal will not be allocated because stdin is not a terminal.
Fix: edit line 44 of alluxio/bin/alluxio-workers.sh.
The original line is:
nohup ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -t ${worker} ${LAUNCHER} \
Change it to:
nohup ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -tt ${worker} ${LAUNCHER} \
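The same change can be applied with sed instead of editing by hand. The sketch below is an illustration rather than an official fix: patch_workers_script is a hypothetical function name, and it matches the text -t ${worker} instead of relying on the line number, which may differ between releases. A copy of the original is kept next to the script with a .bak suffix.

```shell
#!/usr/bin/env bash
# Sketch: replace "-t ${worker}" with "-tt ${worker}" in the given script,
# keeping the previous version as <script>.bak.
patch_workers_script() {
  local script="$1"
  cp -p "$script" "${script}.bak"
  # \$ matches a literal dollar sign; the braces are literal in basic regex.
  sed 's/ -t \${worker}/ -tt ${worker}/' "${script}.bak" > "$script"
}

# Usage with the installation path from this post:
#   patch_workers_script /data/spark/software/alluxio-1.2.0/bin/alluxio-workers.sh
```

Running it a second time leaves the line unchanged, because -tt no longer matches the pattern, but it does overwrite the .bak copy with the already patched file.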
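2. If startup fails with a sudo-related error, the user starting Alluxio is not in the sudoers file and must be added to it. To do so, find the root entry in the file and add a line for the user right after it.
The result looks like this: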
root ALL=(ALL) ALL
spark ALL=(ALL) ALL
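Also make sure the Defaults requiretty line in this file is commented out: #Defaults requiretty
3. If errors persist, start the master first and then start the workers one node at a time.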
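Starting the workers node by node, as suggested in item 3, could look roughly like the sketch below. The assumptions are not from the original post: worker hostnames are read from a file with one host per line, passwordless SSH works, and every node uses the same installation path. start_workers_one_by_one is a hypothetical function name. Depending on the Alluxio version, alluxio-start.sh may expect a mount option after worker (run it with -h to check), so the full argument string is passed in as the third parameter rather than hard-coded.

```shell
#!/usr/bin/env bash
# Sketch: start workers one host at a time and stop at the first failure, so
# the broken node is easy to identify. DRY_RUN=1 (the default) only prints.
start_workers_one_by_one() {
  local workers_file="$1" alluxio_home="$2" start_args="${3:-worker}" host
  while IFS= read -r host || [ -n "$host" ]; do
    host="${host%%#*}"               # drop comments
    host="${host//[[:space:]]/}"     # drop any whitespace
    if [ -z "$host" ]; then
      continue
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh ${host} ${alluxio_home}/bin/alluxio-start.sh ${start_args}"
    elif ! ssh -o ConnectTimeout=5 "$host" \
           "${alluxio_home}/bin/alluxio-start.sh ${start_args}" </dev/null; then
      echo "worker failed to start on ${host}" >&2
      return 1
    fi
  done < "$workers_file"
}

# Usage (prints the commands; set DRY_RUN=0 to run them):
#   start_workers_one_by_one conf/workers /data/spark/software/alluxio-1.2.0
```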
VIII. Official Installation Guide
Official installation guide: http://www.alluxio.org/docs/master/cn/Running-Alluxio-on-a-Cluster.html (it is available in Chinese and worth a look).
Author: 明翼 (XGogo)