# Big Data: Installing and Debugging Hue
References: http://www.javashuo.com/article/p-rshyvmke-dv.html
https://www.jianshu.com/p/a80ec32afb27
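1. Hue overview
Hue is an open-source UI for Apache Hadoop. It evolved from Cloudera Desktop, was contributed to the open-source community by Cloudera, and is built on the Python web framework Django. With Hue you can interact with a Hadoop cluster from a web console in the browser to analyze and process data, for example working with data on HDFS or running MapReduce jobs. I had long heard how convenient and powerful Hue is but had never tried it myself. To start, here is the feature list from the official site, translated, as a quick overview of what Hue supports:
- Uses a lightweight SQLite database by default to manage session data, user authentication, and authorization; this can be switched to MySQL, PostgreSQL, or Oracle
- File Browser for accessing HDFS
- Hive editor for developing and running Hive queries
- Solr-based search applications with visual data views and dashboards
- Impala-based applications for interactive queries
- Spark editor and dashboard
- Pig editor, with the ability to submit script jobs
- Oozie editor; Workflows, Coordinators, and Bundles can be submitted and monitored from the dashboard
- HBase browser for visualizing, querying, and modifying HBase tables
- Metastore browser for accessing Hive metadata and HCatalog
- Job browser for accessing MapReduce jobs (MR1/MR2-YARN)
- Job designer for creating MapReduce/Streaming/Java jobs
- Sqoop 2 editor and dashboard
- ZooKeeper browser and editor
- Query editors for MySQL, PostgreSQL, SQLite, and Oracle databases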
2. Hue architecture
3. Installation and deployment
3.1 Download
Hue website: http://gethue.com/
Configuration docs: http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6/manual.html#_install_hue
Source code: https://github.com/cloudera/hue
This guide uses Hue 4.6.0. Download: https://cdn.gethue.com/downloads/hue-4.6.0.tgz
3.2 Install system packages
yum install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel libffi-devel python-simplejson maven
Hue also depends on Node.js.
# Add the NodeSource repository
curl -fsSL https://rpm.nodesource.com/setup_14.x | bash -
# Install Node.js 14.x and npm
sudo yum install -y nodejs
# Switch npm to a mirror inside China (registry.npm.taobao.org has since been superseded by https://registry.npmmirror.com)
npm config set registry https://registry.npm.taobao.org
# Check the active registry
npm config get registry
Problem encountered while installing the system packages: sqlite-devel could not be downloaded from the mirror, so I downloaded the source tarball and built it by hand. Download: http://www.sqlite.org/sqlite-autoconf-3070500.tar.gz
tar zxf sqlite-autoconf-3070500.tar.gz
cd sqlite-autoconf-3070500
./configure
make
sudo make install
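3.3 Build Hue
tar -xf hue-4.6.0.tgz
1. Change into the Hue source directory (these notes use hue-release-4.3.0; use whichever directory your archive actually extracted to)
cd /opt/package/hue-release-4.3.0
# Switch the UI to Chinese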
vim desktop/core/src/desktop/settings.py
LANGUAGE_CODE = 'zh_CN'
#LANGUAGE_CODE = 'en-us'
LANGUAGES = [
('en-us', _('English')),
('zh_CN', _('Simplified Chinese')),
]
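2. Build and install into a target directory
PREFIX=/opt/moudle make install
# Hue embeds absolute paths for some Python packages, so if you later move the installation to another location you must run the commands below afterwards.
# Do not run them at this point.
rm app.reg
rm -r build
make apps
Problems encountered while building Hue:
a.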
OpenSSL/crypto/crl.c:6:23: error: static declaration of ‘X509_REVOKED_dup’ follows non-static declaration
static X509_REVOKED * X509_REVOKED_dup(X509_REVOKED *orig) {
^
In file included from /usr/include/openssl/ssl.h:156:0,
from OpenSSL/crypto/x509.h:17,
from OpenSSL/crypto/crypto.h:30,
from OpenSSL/crypto/crl.c:3:
/usr/include/openssl/x509.h:751:15: note: previous declaration of ‘X509_REVOKED_dup’ was here
X509_REVOKED *X509_REVOKED_dup(X509_REVOKED *rev);
^
error: command 'gcc' failed with exit status 1
make[2]: *** [/opt/hue/desktop/core/build/pyopenssl/egg.stamp] Error 1
make[2]: Leaving directory `/opt/hue/desktop/core'
make[1]: *** [.recursive-env-install/core] Error 2
make[1]: Leaving directory `/opt/hue/desktop'
make: *** [desktop] Error 2
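Fix:
In /usr/include/openssl/x509.h, delete the following two lines. They must actually be deleted; commenting them out does not work:
X509_REVOKED *X509_REVOKED_dup(X509_REVOKED *rev);
X509_REQ *X509_REQ_dup(X509_REQ *req);
If the build fails because overseas sites cannot be reached, re-running it a few times usually gets it through.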
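3.4 Configuration
Before editing the configuration, create a database.
MySQL initialization
Create a database named hue on the MySQL server:
# Log in to MySQL
mysql -u root -p
# Create the hue database
create database hue;
# Create the user
create user 'hue'@'%' identified by 'Mysql@1234';
ALTER USER 'hue'@'%' IDENTIFIED BY 'Mysql@1234' PASSWORD EXPIRE NEVER; # make the password never expire
ALTER USER 'hue'@'%' IDENTIFIED WITH mysql_native_password BY 'Mysql@1234'; # switch the authentication plugin to mysql_native_password
# Grant privileges
grant all privileges on hue.* to 'hue'@'%';
FLUSH PRIVILEGES; # reload the grant tables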
cd /opt/hue/desktop/conf
vim hue.ini
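# Enter the Hue configuration directory (relative to the Hue source tree)
cd desktop/conf
# Copy the Hue configuration template and edit the copy
cp pseudo-distributed.ini.tmpl pseudo-distributed.ini
vim pseudo-distributed.ini
Make the following changes:
# [desktop]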
http_host=linux122
http_port=8000
is_hue_4=true
time_zone=Asia/Shanghai
dev=true
server_user=hue
server_group=hue
default_user=hue
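# Around line 211: disable the Solr search app to avoid errors
app_blacklist=search
# [[database]]: Hue stores its metadata in SQLite by default; switch it to MySQL (the values below use the MySQL root account; the hue account created earlier also has full privileges on the hue database)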
engine=mysql
host=linux123
port=3306
user=root
password=12345678
name=hue
# Around line 1003: path to the Hadoop configuration files
hadoop_conf_dir=/opt/moudle/hadoop-2.9.2/etc/hadoop
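Hue's ini files use nested section headers such as [desktop] and [[database]], which Python's standard configparser does not understand. The sketch below is a small hypothetical helper (`parse_hue_ini` is my own name, not part of Hue) that reads such text into nested dictionaries, so the values edited above can be double-checked. It only handles `#` comments, section headers, and `key=value` lines, and makes no attempt to cover Hue's full syntax.

```python
def parse_hue_ini(text):
    """Parse Hue-style nested INI text into nested dicts (simplified sketch)."""
    root = {}
    stack = [root]  # stack[d] is the dict of the open section at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            depth = len(line) - len(line.lstrip("["))
            name = line.strip("[]").strip()
            if depth > len(stack):
                raise ValueError("section nested too deeply: " + line)
            del stack[depth:]  # close deeper and sibling sections
            section = stack[-1].setdefault(name, {})
            stack.append(section)
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1][key.strip()] = value.strip()
    return root


# Sample values copied from the settings in this article.
sample = """
[desktop]
http_host=linux122
http_port=8000
[[database]]
engine=mysql
host=linux123
port=3306
name=hue
"""

config = parse_hue_ini(sample)
print(config["desktop"]["database"]["engine"])  # prints: mysql
```

To check a real file, read it with `open(path).read()` and pass the text to the same function.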
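# On the machine running MySQL:
# create the hue database to hold Hue's metadata (skip this if it was already created above)
mysql -uroot -p12345678
mysql> create database hue;
# From the Hue directory:
# initialize the database; afterwards many tables appear in the hue database in MySQL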
build/env/bin/hue syncdb
build/env/bin/hue migrate
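If syncdb or migrate fails with a connection error, one quick way to rule out network or firewall problems is to test whether the MySQL port is reachable from the Hue host. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name, and it only proves that a TCP connection can be opened, not that the credentials are valid.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

With the settings used in this article, calling `can_connect("linux123", 3306)` on the Hue host should return True once MySQL is listening and reachable.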
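3.5 Start Hue
/opt/hue/build/env/bin/supervisor
Problem encountered at startup:
Couldn't get user id for user hue
This happens when Hue was installed as root and build/env/bin/supervisor is then run as root: supervisor looks up a system user named hue, which has not been created yet.
Fix:
a. Create a regular user and give it a password:
[root@master bin]# useradd hue
[root@master bin]# passwd hue    # then set a password at the prompt
b. Change the owner of the extracted Hue directory with chown -R <user> <path>: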
[root@master bin]# chown -R hue /opt/hue
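The error message indicates that the account is looked up by name. To check ahead of time whether the hue account exists on a host, the same kind of lookup can be done with Python's standard pwd module. This is only a sketch; `user_exists` is a hypothetical helper and not part of Hue.

```python
import pwd


def user_exists(name):
    """Return True if a local account called `name` exists."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        # getpwnam raises KeyError when no such account is found.
        return False
```

Running `user_exists("hue")` before starting supervisor tells you whether the useradd step above is still needed.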
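Then open the web UI at 192.168.48.136:8888 (the port is the http_port value from the Hue configuration; Hue's default is 8888).
Enter a username and password:
3.6 Integrating Hue with HDFS, MySQL, Hive, and ZooKeeper
Hue and ZooKeeper:
In /opt/hue/desktop/conf, edit hue.ini: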
[zookeeper]
[[clusters]]
[[[default]]]
# Zookeeper ensemble. Comma separated list of Host/Port.
# e.g. localhost:2181,localhost:2182,localhost:2183
host_ports=master:2181,slave01:2181,slave02:2181
# The URL of the REST contrib service (required for znode browsing)
## rest_url=http://localhost:9998
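A typo in host_ports (a missing port or a stray separator) is easy to make. The following sketch splits a host_ports value into (host, port) pairs and rejects malformed entries; `parse_host_ports` is a hypothetical helper written for this article, not a Hue function.

```python
def parse_host_ports(value):
    """Split a host_ports string into a list of (host, port) tuples."""
    servers = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError("expected host:port, got %r" % item)
        servers.append((host, int(port)))
    return servers


print(parse_host_ports("master:2181,slave01:2181,slave02:2181"))
# prints: [('master', 2181), ('slave01', 2181), ('slave02', 2181)]
```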
A. Start ZooKeeper on master, slave01, and slave02
zkServer.sh start
B. Start Hue
In /opt/hue/build/env/bin:
./supervisor
C. Open 192.168.200.100:8888 in a browser
Hue and MySQL
In /opt/hue/desktop/conf, edit hue.ini (the [[[mysql]]] block sits under [librdbms] / [[databases]]):
# mysql, oracle, or postgresql configuration.
[[[mysql]]]
# Name to show in the UI.
nice_name="My SQL DB"
# For MySQL and PostgreSQL, name is the name of the database.
# For Oracle, Name is instance of the Oracle server. For express edition
# this is 'xe' by default.
## name=mysqldb
# Database backend to use. This can be:
# 1. mysql
# 2. postgresql
# 3. oracle
engine=mysql
# IP or hostname of the database to connect to.
host=master
# Port the database server is listening to. Defaults are:
# 1. MySQL: 3306
# 2. PostgreSQL: 5432
# 3. Oracle Express Edition: 1521
port=3306
# Username to authenticate with when connecting to the database.
user=root
# Password matching the username to authenticate with when
# connecting to the database.
password=010209
# Database options to send to the server when connecting.
# https://docs.djangoproject.com/en/1.4/ref/databases/
## options={}
Start Hue:
Compare with the MySQL database:
Hue and Hive
A. In /opt/hue/desktop/conf, edit hue.ini:
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=master
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/opt/hive/conf
# Timeout in seconds for thrift calls to Hive service
server_conn_timeout=120
# Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
# If false, Hue will use the FetchResults() thrift call instead.
## use_get_log_api=true
B. Integrating Hue with Hive requires the following HiveServer2 settings in hive-site.xml
(see also https://blog.csdn.net/weixin_43159039/article/details/122080027):
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>master</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.200.100:9083</value>
</property>
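The hive_server_host and hive_server_port values in hue.ini have to agree with hive.server2.thrift.bind.host and hive.server2.thrift.port in hive-site.xml. As a rough way to compare them, the sketch below reads the properties out of Hadoop-style XML with the standard library. `read_hadoop_properties` is a hypothetical helper, and the sample assumes the properties are wrapped in the usual `<configuration>` root element, which the excerpt above omits.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style XML config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


# Sample mirroring the settings in this article.
sample = """<configuration>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>master</value>
  </property>
</configuration>"""

props = read_hadoop_properties(sample)
print(props["hive.server2.thrift.port"])  # prints: 10000
```

For a file on disk, `ET.parse(path).getroot()` can take the place of `ET.fromstring`.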
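C. Startup
1. Start HDFS before starting Hive: start-dfs.sh
2. Start the Hive services
hive --service metastore &
hive --service hiveserver2 &
3. Start Hue
If Hue's bin directory is on your PATH you can start supervisor from anywhere; otherwise run it from that directory:
./supervisor
4. Open the Hue page
Each time a query succeeds, hiveserver2 prints OK.
D. Problems encountered integrating Hue with Hive:
After starting Hive and Hue, connecting to Hive from the Hue page always timed out.
Error message:
Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found
Fix:
Check whether any of the cyrus-sasl-plain, cyrus-sasl-devel, and cyrus-sasl-gssapi RPM packages are missing, and install whichever is absent: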
yum install cyrus-sasl-plain cyrus-sasl-devel cyrus-sasl-gssapi
Hue and HDFS
A. In /opt/hue/desktop/conf, edit hue.ini:
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://master:8020
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://master:50070/webhdfs/v1
hadoop_hdfs_home=/mnt/hadoop
hadoop_bin=/opt/hadoop/bin
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# Default umask for file and directory creation, specified in an octal value.
## umask=022
# Directory of the Hadoop configuration
## hadoop_conf_dir=$HADOOP_CONF_DIR when set or '/etc/hadoop/conf'
hadoop_conf_dir=/opt/hadoop/etc/hadoop
B. Start HDFS and Hue, then open the page:
Hue now lets you work with files on HDFS (delete them and so on) and view their contents directly, for example by clicking sparktest.txt.
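Hue reaches HDFS through the webhdfs_url configured above, so if the File Browser shows errors it can help to query WebHDFS directly. The sketch below builds a LISTSTATUS request URL using the WebHDFS REST conventions (`op=LISTSTATUS` and the `user.name` query parameter for clusters without Kerberos). `webhdfs_liststatus_url` is a hypothetical helper, and the path /user/hue is only an example.

```python
from urllib.parse import quote, urlencode


def webhdfs_liststatus_url(webhdfs_url, path, user):
    """Build the WebHDFS LISTSTATUS URL for an absolute HDFS `path`."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute, got %r" % path)
    base = webhdfs_url.rstrip("/")
    query = urlencode({"op": "LISTSTATUS", "user.name": user})
    return "%s%s?%s" % (base, quote(path), query)


print(webhdfs_liststatus_url("http://master:50070/webhdfs/v1", "/user/hue", "hue"))
# prints: http://master:50070/webhdfs/v1/user/hue?op=LISTSTATUS&user.name=hue
```

Opening the resulting URL with curl or a browser should return a JSON directory listing when WebHDFS is enabled and reachable.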
Troubleshooting
A.
User: root is not allowed to impersonate hue', sqlState=None, infoMessages=['*org.apache.hive.service.cli.HiveSQLException:Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException)
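Fix:
Add the following proxy-user settings to core-site.xml (httpfs-site.xml follows the same pattern with httpfs.proxyuser.root.hosts and httpfs.proxyuser.root.groups), then restart the Hadoop and Hive services so the change takes effect: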
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
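The user name embedded in these property names is the account doing the impersonating, which is root in the error above ("User: root is not allowed to impersonate hue"). If your services run under a different account, the property names change accordingly. The sketch below generates the two core-site.xml entries for any account; `proxyuser_properties` is a hypothetical helper, and the wildcard defaults simply mirror the values used in this article.

```python
from xml.sax.saxutils import escape


def proxyuser_properties(user, hosts="*", groups="*"):
    """Render the two hadoop.proxyuser.<user>.* properties as XML text."""
    entries = [
        ("hadoop.proxyuser.%s.hosts" % user, hosts),
        ("hadoop.proxyuser.%s.groups" % user, groups),
    ]
    blocks = []
    for name, value in entries:
        blocks.append(
            "<property>\n"
            "<name>%s</name>\n"
            "<value>%s</value>\n"
            "</property>" % (escape(name), escape(value))
        )
    return "\n".join(blocks)


print(proxyuser_properties("root"))
```

Restricting hosts and groups to specific values instead of `*` is the tighter choice on a shared cluster.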
This article is from cnblogs (博客园), author: Jerry·. When reposting, please cite the original link: https://www.cnblogs.com/jerry-0910/p/16446636.html