Installing Zeppelin and Integrating It with CDH, Flink, and Iceberg

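Contents

I. Service Installation

1 Download the package

2 Install the service

2.1 Environment configuration

2.2 Node configuration

2.3 Starting the service

2.4 Accessing the service

II. Basic Usage: Flink

1 Configure Interpreters in the web UI

2 Demo test

III. Basic Usage: Iceberg

1 Configure the flink-iceberg jar

2 Demo test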

I. Service Installation

1 Download the package

Download page:

https://zeppelin.apache.org/download.html

2 Install the service

Installation path: /opt/soft/zeppelin-0.10.1-bin-all

cd /opt/soft

tar -zxvf zeppelin-0.10.1-bin-all.tgz

2.1 Environment configuration

cd zeppelin-0.10.1-bin-all/conf

# Create the environment file from the template

cp zeppelin-env.sh.template zeppelin-env.sh

vim zeppelin-env.sh  # add the following settings

export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera

export HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.yarn

export HADOOP_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop

export SPARK_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/spark

export MASTER=yarn-cluster

Note: the Java and Hadoop paths above are specific to this environment; adjust them to match your own installation.
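Because these paths differ between clusters, it can help to confirm that each one exists before starting Zeppelin. The following is a minimal sketch; check_env_dirs is a hypothetical helper written for this article, not something shipped with Zeppelin or CDH.

```shell
# Sketch: report which of the directories exported in zeppelin-env.sh are missing.
# check_env_dirs is a hypothetical helper; it takes variable NAMES as arguments.
# The function body runs in a subshell, so its variables do not leak out.
check_env_dirs() (
  missing=0
  for name in "$@"; do
    eval "dir=\${$name:-}"   # look up the variable whose name is stored in $name
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
      echo "missing: $name=${dir:-<unset>}" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]       # exit status 0 only when every directory exists
)

# Possible usage, from the Zeppelin installation directory:
#   . conf/zeppelin-env.sh
#   check_env_dirs JAVA_HOME HADOOP_CONF_DIR HADOOP_HOME SPARK_HOME
```

The helper only checks that the directories exist; it does not verify that they contain a working JDK, Hadoop, or Spark installation.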

2.2 Node configuration

cp zeppelin-site.xml.template zeppelin-site.xml

vim zeppelin-site.xml

<property>

<name>zeppelin.server.addr</name>

<value>**.**.**.**</value>

<description>Server binding address</description>

</property>

<property>

<name>zeppelin.server.port</name>

<value>****</value>

<description>Server port.</description>

</property>

Note: only the actual IP address and port need to be set here. For cluster mode, configure the zeppelin.cluster.addr property.
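To double-check what was actually written to zeppelin-site.xml, the configured values can be read back from the file. This is a rough sketch under the assumption that each property is laid out as in the snippet above, with the value following the name; get_site_property is a hypothetical helper, and it does not skip properties that are commented out in the XML.

```shell
# Sketch: print the <value> that follows a given <name> in zeppelin-site.xml.
# get_site_property is a hypothetical helper, not part of Zeppelin.
#   $1 = path to zeppelin-site.xml, $2 = property name
get_site_property() {
  awk -v name="$2" '
    # Remember that the requested property name has been seen.
    index($0, "<name>" name "</name>") { found = 1 }
    # Print the first <value> element at or after that line, then stop.
    found && match($0, /<value>[^<]*<\/value>/) {
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Possible usage, from the conf directory:
#   get_site_property zeppelin-site.xml zeppelin.server.addr
#   get_site_property zeppelin-site.xml zeppelin.server.port
```

An empty result means the property was not found in the file.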

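2.3 Starting the service

cd /opt/soft/zeppelin-0.10.1-bin-all  # return to the installation directory

./bin/zeppelin-daemon.sh start  # start the service

./bin/zeppelin-daemon.sh stop  # stop the service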

2.4 Accessing the service

Access URL: http://ip:port
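The web UI may take a little while to come up after the start command returns. A minimal sketch of polling the URL until it answers is shown below; wait_for_zeppelin is a hypothetical helper, the default of 30 attempts with a 2-second pause is an arbitrary choice, and curl is assumed to be installed.

```shell
# Sketch: poll the Zeppelin URL until it returns a successful HTTP response.
# wait_for_zeppelin is a hypothetical helper; the defaults are arbitrary.
#   $1 = URL, $2 = maximum attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_zeppelin() (
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors as well as connection failures.
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      echo "Zeppelin is answering at $url"
      exit 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no response from $url after $attempts attempts" >&2
  exit 1
)

# Possible usage, substituting the address and port set in zeppelin-site.xml:
#   wait_for_zeppelin "http://ip:port"
```

If the URL never answers, `./bin/zeppelin-daemon.sh status` and the files under the logs directory of the installation are reasonable places to look next.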

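II. Basic Usage: Flink

1 Configure Interpreters in the web UI

First, click Interpreters in the upper-right corner.

Then search for flink, edit its settings, and set the following four properties: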

FLINK_HOME=/opt/flink

HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.yarn

HIVE_CONF_DIR=/etc/hive/conf.cloudera.hive

flink.execution.mode=yarn

2 Demo test

Create a new note and enter the following demo to test the interpreter:

%flink

val data=benv.fromElements("Hello Kobe","Hello Jordan","Hello James")

data.flatMap(record=>record.split("\\s"))

.map(word=>(word,1))

.groupBy(0)

.sum(1)

.print()

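III. Basic Usage: Iceberg

1 Configure the flink-iceberg jar

Add the following property to the flink interpreter settings:

flink.execution.jars=/opt/flink/lib/iceberg-flink-runtime-1.13-0.13.1.jar

Without this setting, the following error is reported: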

Could not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath.

Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath.

Available factory identifiers are:

generic_in_memory

at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:319) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]

at org.apache.flink.table.factories.FactoryUtil.getCatalogFactory(FactoryUtil.java:455) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]

at org.apache.flink.table.factories.FactoryUtil.createCatalog(FactoryUtil.java:251) ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]
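One common cause of this error is a wrong or mistyped jar path in flink.execution.jars. As a minimal sketch, the value can be checked on the host that runs the flink interpreter before restarting it; check_execution_jars is a hypothetical helper, and it assumes the property holds a comma-separated list of local file paths.

```shell
# Sketch: verify that every jar listed in flink.execution.jars exists locally.
# check_execution_jars is a hypothetical helper, not part of Zeppelin or Flink.
#   $1 = the property value, a comma-separated list of jar paths
check_execution_jars() (
  set -f                     # disable glob expansion while splitting the list
  IFS=,
  missing=0
  for jar in $1; do
    # Trim stray whitespace around each entry.
    jar=$(printf '%s' "$jar" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    [ -n "$jar" ] || continue
    if [ ! -f "$jar" ]; then
      echo "jar not found: $jar" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]       # exit status 0 only when every jar was found
)

# Possible usage with the path from this article:
#   check_execution_jars "/opt/flink/lib/iceberg-flink-runtime-1.13-0.13.1.jar"
```

The check only confirms that the files exist; it does not confirm that the jar matches the installed Flink version.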

2 Demo test

%flink.bsql

show catalogs;

CREATE CATALOG hive_catalog WITH (

'type'='iceberg',

'catalog-type'='hive',

'uri'='thrift://cdh-test01:9083',

'clients'='5',

'property-version'='1',

'warehouse'='hdfs://hdfsCluster/user/hive/warehouse'

);

use catalog hive_catalog;

select * from `hive_catalog`.`iceberg_db`.`sample`;

Posted 2022-03-17 17:48 by 疯码牛Pro