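Getting started with H2O.ai
1. Download the latest stable release from the official site, https://www.h2o.ai/download/ . If clicking the download button does nothing, try Internet Explorer.
2. Unzip h2o-3.18.0.10.zip into the directory h2o-3.18.0.10.
3. Run the following commands: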
cd h2o-3.18.0.10
java -jar h2o.jar -name clusterName
For the available options, see http://docs.h2o.ai/h2o/latest-stable/h2o-docs/starting-h2o.html#h2o-options
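If you start nodes from a script, it can help to assemble the launch command in one place. The Python sketch below is only an illustration: the helper name build_h2o_command, the 4g heap size, and the explicit port are assumptions for this example. -Xmx is the standard JVM heap flag, and -name and -port are H2O options listed on the page above.

```python
import shlex


def build_h2o_command(cluster_name, port=None, max_heap=None, jar="h2o.jar"):
    """Return the argument list for launching one H2O node.

    Hypothetical helper for illustration; it only builds the command
    and does not start anything.
    """
    cmd = ["java"]
    if max_heap:
        # JVM flags must come before -jar.
        cmd.append("-Xmx" + max_heap)
    cmd += ["-jar", jar, "-name", cluster_name]
    if port is not None:
        # H2O options come after the jar name.
        cmd += ["-port", str(port)]
    return cmd


# Print the command so it can be inspected or pasted into a shell.
print(shlex.join(build_h2o_command("clusterName", port=54321, max_heap="4g")))
```

To actually launch the node, the returned list could be passed to subprocess.run from inside the h2o-3.18.0.10 directory.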
[root@eureka-8810 h2o-3.18.0.10]# java -jar h2o.jar -name clusterName
05-29 16:49:43.010 192.168.0.80:54321 10858 main INFO: Found XGBoost backend with library: xgboost4j_gpu
05-29 16:49:43.020 192.168.0.80:54321 10858 main INFO: XGBoost supported backends: [WITH_GPU, WITH_OMP]
05-29 16:49:43.020 192.168.0.80:54321 10858 main INFO: ----- H2O started -----
05-29 16:49:43.020 192.168.0.80:54321 10858 main INFO: Build git branch: rel-wolpert
05-29 16:49:43.020 192.168.0.80:54321 10858 main INFO: Build git hash: b26ef10d0f1b4dd26b8227c1672ee47e0e893fec
05-29 16:49:43.020 192.168.0.80:54321 10858 main INFO: Build git describe: jenkins-3.18.0.9-19-gb26ef10
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Build age: 7 days, 8 hours and 36 minutes
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Built by: 'jenkins'
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Built on: '2018-05-22 08:13:35'
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Watchdog Build git branch: (unknown)
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Watchdog Build git hash: (unknown)
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Watchdog Build git describe: (unknown)
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Watchdog Build project version: (unknown)
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Watchdog Built by: (unknown)
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: Watchdog Built on: (unknown)
05-29 16:49:43.021 192.168.0.80:54321 10858 main INFO: XGBoost Build git branch: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: XGBoost Build git hash: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: XGBoost Build git describe: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: XGBoost Build project version: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: XGBoost Built by: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: XGBoost Built on: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: KrbStandalone Build git branch: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: KrbStandalone Build git hash: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: KrbStandalone Build git describe: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: KrbStandalone Build project version: (unknown)
05-29 16:49:43.022 192.168.0.80:54321 10858 main INFO: KrbStandalone Built by: (unknown)
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: KrbStandalone Built on: (unknown)
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: Processed H2O arguments: [-name, clusterName]
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: Java availableProcessors: 16
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: Java heap totalMemory: 964.5 MB
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: Java heap maxMemory: 13.95 GB
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: Java version: Java 1.8.0_121 (from Oracle Corporation)
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: JVM launch parameters: []
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: OS version: Linux 3.10.0-693.el7.x86_64 (amd64)
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: Machine physical memory: 62.76 GB
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: X-h2o-cluster-id: 1527583782454
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: User name: 'root'
05-29 16:49:43.023 192.168.0.80:54321 10858 main INFO: IPv6 stack selected: false
05-29 16:49:43.024 192.168.0.80:54321 10858 main INFO: Possible IP Address: ens160 (ens160), fe80:0:0:0:df23:d65d:4aa8:62f5%ens160
05-29 16:49:43.024 192.168.0.80:54321 10858 main INFO: Possible IP Address: ens160 (ens160), 192.168.0.80
05-29 16:49:43.024 192.168.0.80:54321 10858 main INFO: Possible IP Address: lo (lo), 0:0:0:0:0:0:0:1%lo
05-29 16:49:43.024 192.168.0.80:54321 10858 main INFO: Possible IP Address: lo (lo), 127.0.0.1
05-29 16:49:43.024 192.168.0.80:54321 10858 main INFO: H2O node running in unencrypted mode.
05-29 16:49:43.026 192.168.0.80:54321 10858 main INFO: Internal communication uses port: 54322
05-29 16:49:43.026 192.168.0.80:54321 10858 main INFO: Listening for HTTP and REST traffic on http://192.168.0.80:54321/
05-29 16:49:43.027 192.168.0.80:54321 10858 main INFO: H2O cloud name: 'clusterName' on /192.168.0.80:54321, static configuration based on -flatfile null
05-29 16:49:43.027 192.168.0.80:54321 10858 main INFO: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
05-29 16:49:43.027 192.168.0.80:54321 10858 main INFO: 1. Open a terminal and run 'ssh -L 55555:localhost:54321 root@192.168.0.80'
05-29 16:49:43.027 192.168.0.80:54321 10858 main INFO: 2. Point your browser to http://localhost:55555
05-29 16:49:43.735 192.168.0.80:54321 10858 main INFO: Log dir: '/tmp/h2o-root/h2ologs'
05-29 16:49:43.736 192.168.0.80:54321 10858 main INFO: Cur dir: '/opt/software/h2o-3.18.0.10'
05-29 16:49:43.740 192.168.0.80:54321 10858 main INFO: HDFS subsystem successfully initialized
05-29 16:49:43.743 192.168.0.80:54321 10858 main INFO: S3 subsystem successfully initialized
05-29 16:49:43.743 192.168.0.80:54321 10858 main INFO: Flow dir: '/root/h2oflows'
05-29 16:49:43.757 192.168.0.80:54321 10858 main INFO: Cloud of size 1 formed [/192.168.0.80:54321]
05-29 16:49:43.770 192.168.0.80:54321 10858 main INFO: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
05-29 16:49:43.770 192.168.0.80:54321 10858 main INFO: Watchdog extension initialized
05-29 16:49:43.770 192.168.0.80:54321 10858 main INFO: XGBoost extension initialized
05-29 16:49:43.770 192.168.0.80:54321 10858 main INFO: KrbStandalone extension initialized
05-29 16:49:43.770 192.168.0.80:54321 10858 main INFO: Registered 3 core extensions in: 256ms
05-29 16:49:43.770 192.168.0.80:54321 10858 main INFO: Registered H2O core extensions: [Watchdog, XGBoost, KrbStandalone]
05-29 16:49:44.035 192.168.0.80:54321 10858 main INFO: Registered: 165 REST APIs in: 264ms
05-29 16:49:44.035 192.168.0.80:54321 10858 main INFO: Registered REST API extensions: [XGBoost, Algos, AutoML, Core V3, Core V4]
05-29 16:49:44.149 192.168.0.80:54321 10858 main INFO: Registered: 235 schemas in 113ms
05-29 16:49:44.150 192.168.0.80:54321 10858 main INFO: H2O started in 1691ms
05-29 16:49:44.150 192.168.0.80:54321 10858 main INFO:
05-29 16:49:44.150 192.168.0.80:54321 10858 main INFO: Open H2O Flow in your web browser: http://192.168.0.80:54321
05-29 16:49:44.150 192.168.0.80:54321 10858 main INFO:
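4. On another machine, or in a new shell window, run the same command again:
java -jar h2o.jar -name clusterName
The new node uses the -name option to find the existing cluster and joins it automatically.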
05-29 16:45:22.648 192.168.0.166:54323 10408 main INFO: Found XGBoost backend with library: xgboost4j_gpu
05-29 16:45:22.661 192.168.0.166:54323 10408 main INFO: XGBoost supported backends: [WITH_GPU, WITH_OMP]
05-29 16:45:22.661 192.168.0.166:54323 10408 main INFO: ----- H2O started -----
05-29 16:45:22.661 192.168.0.166:54323 10408 main INFO: Build git branch: rel-wolpert
05-29 16:45:22.661 192.168.0.166:54323 10408 main INFO: Build git hash: b26ef10d0f1b4dd26b8227c1672ee47e0e893fec
05-29 16:45:22.661 192.168.0.166:54323 10408 main INFO: Build git describe: jenkins-3.18.0.9-19-gb26ef10
05-29 16:45:22.661 192.168.0.166:54323 10408 main INFO: Build age: 7 days, 8 hours and 31 minutes
05-29 16:45:22.661 192.168.0.166:54323 10408 main INFO: Built by: 'jenkins'
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: Built on: '2018-05-22 08:13:35'
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: Watchdog Build git branch: (unknown)
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: Watchdog Build git hash: (unknown)
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: Watchdog Build git describe: (unknown)
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: Watchdog Build project version: (unknown)
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: Watchdog Built by: (unknown)
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: Watchdog Built on: (unknown)
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: XGBoost Build git branch: (unknown)
05-29 16:45:22.662 192.168.0.166:54323 10408 main INFO: XGBoost Build git hash: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: XGBoost Build git describe: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: XGBoost Build project version: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: XGBoost Built by: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: XGBoost Built on: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: KrbStandalone Build git branch: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: KrbStandalone Build git hash: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: KrbStandalone Build git describe: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: KrbStandalone Build project version: (unknown)
05-29 16:45:22.663 192.168.0.166:54323 10408 main INFO: KrbStandalone Built by: (unknown)
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: KrbStandalone Built on: (unknown)
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: Processed H2O arguments: [-name, clusterName]
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: Java availableProcessors: 4
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: Java heap totalMemory: 88.5 MB
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: Java heap maxMemory: 1.27 GB
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: Java version: Java 1.8.0_77 (from Oracle Corporation)
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: JVM launch parameters: []
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: OS version: Linux 2.6.32-431.el6.x86_64 (amd64)
05-29 16:45:22.664 192.168.0.166:54323 10408 main INFO: Machine physical memory: 5.72 GB
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: X-h2o-cluster-id: 1527583522016
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: User name: 'root'
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: IPv6 stack selected: false
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: Possible IP Address: eth1 (eth1), fe80:0:0:0:20c:29ff:fe29:d906%eth1
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: Possible IP Address: eth1 (eth1), 192.168.0.166
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: Possible IP Address: lo (lo), 0:0:0:0:0:0:0:1%lo
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: Possible IP Address: lo (lo), 127.0.0.1
05-29 16:45:22.665 192.168.0.166:54323 10408 main INFO: H2O node running in unencrypted mode.
05-29 16:45:22.668 192.168.0.166:54323 10408 main INFO: Internal communication uses port: 54324
05-29 16:45:22.668 192.168.0.166:54323 10408 main INFO: Listening for HTTP and REST traffic on http://192.168.0.166:54323/
05-29 16:45:22.669 192.168.0.166:54323 10408 main INFO: H2O cloud name: 'clusterName' on /192.168.0.166:54323, static configuration based on -flatfile null
05-29 16:45:22.669 192.168.0.166:54323 10408 main INFO: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
05-29 16:45:22.669 192.168.0.166:54323 10408 main INFO: 1. Open a terminal and run 'ssh -L 55555:localhost:54323 root@192.168.0.166'
05-29 16:45:22.669 192.168.0.166:54323 10408 main INFO: 2. Point your browser to http://localhost:55555
05-29 16:45:23.457 192.168.0.166:54323 10408 main INFO: Log dir: '/tmp/h2o-root/h2ologs'
05-29 16:45:23.458 192.168.0.166:54323 10408 main INFO: Cur dir: '/opt/software'
05-29 16:45:23.462 192.168.0.166:54323 10408 main INFO: HDFS subsystem successfully initialized
05-29 16:45:23.466 192.168.0.166:54323 10408 main INFO: S3 subsystem successfully initialized
05-29 16:45:23.466 192.168.0.166:54323 10408 main INFO: Flow dir: '/root/h2oflows'
05-29 16:45:23.481 192.168.0.166:54323 10408 main INFO: Cloud of size 1 formed [/192.168.0.166:54323]
05-29 16:45:23.518 192.168.0.166:54323 10408 main INFO: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
05-29 16:45:23.519 192.168.0.166:54323 10408 main INFO: Watchdog extension initialized
05-29 16:45:23.519 192.168.0.166:54323 10408 main INFO: XGBoost extension initialized
05-29 16:45:23.519 192.168.0.166:54323 10408 main INFO: KrbStandalone extension initialized
05-29 16:45:23.519 192.168.0.166:54323 10408 main INFO: Registered 3 core extensions in: 269ms
05-29 16:45:23.519 192.168.0.166:54323 10408 main INFO: Registered H2O core extensions: [Watchdog, XGBoost, KrbStandalone]
05-29 16:45:23.972 192.168.0.166:54323 10408 main INFO: Registered: 165 REST APIs in: 452ms
05-29 16:45:23.980 192.168.0.166:54323 10408 main INFO: Registered REST API extensions: [XGBoost, Algos, AutoML, Core V3, Core V4]
05-29 16:45:24.391 192.168.0.166:54323 10408 main INFO: Registered: 235 schemas in 410ms
05-29 16:45:24.391 192.168.0.166:54323 10408 main INFO: H2O started in 2369ms
05-29 16:45:24.391 192.168.0.166:54323 10408 main INFO:
05-29 16:45:24.400 192.168.0.166:54323 10408 main INFO: Open H2O Flow in your web browser: http://192.168.0.166:54323
05-29 16:45:24.400 192.168.0.166:54323 10408 main INFO:
05-29 16:45:26.961 192.168.0.166:54323 10408 FJ-126-7 INFO: Cloud of size 2 formed [/192.168.0.80:54321, /192.168.0.166:54323]
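The line to look for in this output is "Cloud of size 2 formed", which confirms that both nodes joined the same cluster. As a small illustration, the Python sketch below pulls every such event out of captured log text. The function name cloud_events is an assumption for this example, and the sample text is taken from the log above.

```python
import re

# Matches e.g. "Cloud of size 2 formed [/192.168.0.80:54321, /192.168.0.166:54323]"
CLOUD_RE = re.compile(r"Cloud of size (\d+) formed \[([^\]]*)\]")


def cloud_events(log_text):
    """Return a list of (size, [member, ...]) tuples in log order."""
    events = []
    for size, members in CLOUD_RE.findall(log_text):
        # Members are printed as "/ip:port"; drop the leading slash.
        nodes = [m.strip().lstrip("/") for m in members.split(",") if m.strip()]
        events.append((int(size), nodes))
    return events


sample = (
    "05-29 16:45:23.481 192.168.0.166:54323 10408 main INFO: "
    "Cloud of size 1 formed [/192.168.0.166:54323]\n"
    "05-29 16:45:26.961 192.168.0.166:54323 10408 FJ-126-7 INFO: "
    "Cloud of size 2 formed [/192.168.0.80:54321, /192.168.0.166:54323]\n"
)

for size, nodes in cloud_events(sample):
    print(size, nodes)
```

The last event in the list reflects the current cluster membership as seen by the node that wrote the log.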
5. Open a browser and go to
http://192.168.0.166:54323
The H2O Flow UI appears.
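The same address also serves H2O's REST API, so the cluster can be checked without a browser. The Python sketch below is a rough illustration using only the standard library. It assumes the /3/Cloud endpoint and the response fields cloud_name, cloud_size and cloud_healthy as I recall them from the H2O REST API, so verify them against your version. The helper names are made up for this example.

```python
import json
from urllib.request import urlopen


def fetch_cloud_status(base_url, timeout=5):
    """GET <base_url>/3/Cloud and return the decoded JSON (needs a running node)."""
    with urlopen(base_url.rstrip("/") + "/3/Cloud", timeout=timeout) as resp:
        return json.load(resp)


def summarize_cloud(status):
    """Format the fields of interest; the field names are assumptions, see above."""
    return "{}: {} node(s), healthy={}".format(
        status.get("cloud_name"),
        status.get("cloud_size"),
        status.get("cloud_healthy"),
    )


# Offline example with a hand-written response fragment, so the sketch
# runs without a cluster. Against a live node you would call:
#   summarize_cloud(fetch_cloud_status("http://192.168.0.166:54323"))
print(summarize_cloud({"cloud_name": "clusterName", "cloud_size": 2, "cloud_healthy": True}))
```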
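6. Using the Flow UI
6.1 Importing data
In the help pane on the left, choose importFiles to import data.
An import form appears lower on the page. Type the path of a file on the server into the Search box and click the search button on the right; the page lists every matching file. Click Add all, check the selection, then click Import.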
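The import can also be triggered over REST. The Python sketch below only builds the request URL, as a hedged illustration: it assumes the /3/ImportFiles endpoint with a path query parameter, and both the helper name import_files_url and the file path /data/example.csv are hypothetical.

```python
from urllib.parse import urlencode


def import_files_url(base_url, path):
    """Build the URL for importing a server-side file or directory."""
    # urlencode escapes the path so that "/" and spaces are safe in a query string.
    return base_url.rstrip("/") + "/3/ImportFiles?" + urlencode({"path": path})


# Hypothetical server-side path, for illustration only.
print(import_files_url("http://192.168.0.166:54323", "/data/example.csv"))
```

Opening the resulting URL with urllib.request.urlopen against a running node would ask H2O to import that path. As in the UI, the path is resolved on the server, not on the client.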
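6.2 Importing data from the local client
If the file you want to analyze is not on the server, you can upload it from your own machine instead.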
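6.3 Parsing the imported data
If you refreshed the page, or the page has become too cluttered, you can find the data set you just imported under getFrames.
Find the data set you just imported and click the Parse button.
You can enter column names, choose column data types, and adjust other settings; finally, click Parse to finish formatting the data set.
Click the View button shown in the figure above to open the view shown below, then click View Data to preview the data.
Data preview:
After parsing, the data set's extension has changed from .csv to .hex. Refresh the page and click getFrames to see it.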
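6.4 Splitting the data set
Open the data set and click the Split button.
Enter the split ratios; the data set is split into multiple data sets accordingly.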
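To make the effect of the ratios concrete, here is a minimal pure-Python sketch of a ratio-based split. It is not H2O's implementation, only an illustration of the idea: each ratio claims that share of the shuffled rows, and whatever is left over forms the last part. The function name split_rows and the seed are assumptions for this example.

```python
import random


def split_rows(rows, ratios, seed=42):
    """Split rows into len(ratios) + 1 parts; the last part gets the remainder."""
    if not ratios or any(r <= 0 for r in ratios) or sum(ratios) >= 1:
        raise ValueError("ratios must be positive and sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    parts, start, cumulative = [], 0, 0.0
    for ratio in ratios:
        cumulative += ratio
        # Cut at the cumulative boundary so rounding errors do not pile up.
        end = int(round(cumulative * len(shuffled)))
        parts.append(shuffled[start:end])
        start = end
    parts.append(shuffled[start:])
    return parts


# A ratio of 0.75 yields a 75% training part and a 25% validation part.
train, valid = split_rows(range(100), [0.75])
print(len(train), len(valid))
```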
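6.5 Creating a model
In the top menu, choose 'Model -- K-means'.
Choose the algorithm (k-means here), set the training set (training_frame) and the validation set (validation_frame), and adjust other parameters if needed.
Click the create model button at the bottom; this starts a job.
Click the View button to see the model details, as shown in the figure below.
Click the Predict button to open the form shown below, choose the data set to score, and click Predict to see the results.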
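For readers new to k-means, the following pure-Python sketch shows what training and prediction amount to conceptually. It is a plain Lloyd's-algorithm illustration, not H2O's implementation, and the names fit_kmeans and predict, the toy points, and the default parameters are all assumptions for this example.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(centroids, point):
    """Return the index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], point))


def fit_kmeans(points, k, iterations=20, seed=0):
    """Alternate assignment and centroid updates until nothing changes."""
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[predict(centroids, p)].append(p)
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                # New centroid is the per-coordinate mean of its members.
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


# Toy training data: two well-separated groups of points.
training = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
centroids = fit_kmeans(training, k=2)
print(sorted(centroids))
print(predict(centroids, (0.2, 0.3)), predict(centroids, (10.2, 10.9)))
```

The training_frame corresponds to the points passed to fit_kmeans, and the Predict step corresponds to calling predict on each row of the data set being scored.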