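Enabling Recon, Ozone's Data Inspection Service
Preface
I previously wrote an article about Recon, Ozone's data inspection service: "The Design of a Storage System's 'Eye on the Data': the Data Inspection Service". It outlined how Recon periodically fetches snapshots of the OM in order to run a secondary analysis on the data. That article did not cover how the Recon service is enabled, or what it does internally once it is running. This post fills in that part.
Enabling the Ozone Recon service
The Recon service depends on the OM's metadata, which it obtains by periodically syncing the OM metadata DB. Recon therefore needs an address at which it can reach the OM. Here that is the OM's HTTP address (9874 is the default port):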
<property>
<name>ozone.om.http-address</name>
<value>{om host}:9874</value>
</property>
Next come the storage directories for the Recon DB and the Recon OM DB. We recommend configuring these explicitly as well:
<property>
<name>ozone.recon.db.dir</name>
<value>/path/to/recon.db</value>
</property>
<property>
<name>ozone.recon.om.db.dir</name>
<value>/path/to/recon_om.db</value>
</property>
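Then there is the interval at which Recon syncs the OM DB files. Set it according to your actual needs; the default is one sync every 10 minutes, and the example here uses 3 minutes.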
<property>
<name>recon.om.snapshot.task.interval.delay</name>
<value>3m</value>
</property>
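As a convenience, the four settings above can be collected into one ozone-site.xml style fragment. The following is a minimal sketch rather than an official tool: the function name `build_recon_config` and the host `om-host.example` are hypothetical, and the property names are simply the ones quoted in this article, so they may differ in other Ozone releases.

```python
import xml.etree.ElementTree as ET


def build_recon_config(om_http_address, recon_db_dir, recon_om_db_dir, sync_interval):
    """Return a <configuration> XML string holding the four Recon-related settings."""
    # Property names are copied from the article; the values are supplied by the caller.
    settings = {
        "ozone.om.http-address": om_http_address,
        "ozone.recon.db.dir": recon_db_dir,
        "ozone.recon.om.db.dir": recon_om_db_dir,
        "recon.om.snapshot.task.interval.delay": sync_interval,
    }
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Example call with placeholder values (hypothetical host and paths).
fragment = build_recon_config(
    "om-host.example:9874", "/path/to/recon.db", "/path/to/recon_om.db", "3m"
)
```

The resulting string can be pasted into the existing configuration file after review. Generating it from one place makes it harder to forget `ozone.recon.om.db.dir`, which is the setting the startup log warns about when it is missing.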
Once the settings above are in place, we can start the Recon service with the following command:
~/apache/ozone/bin/ozone --daemon start recon
The Recon log then shows clearly what the service is doing internally:
2019-12-15 07:28:33,697 [main] INFO - Starting Recon server
2019-12-15 07:28:34,921 [main] INFO - Registered task ContainerKeyMapperTask with controller.
2019-12-15 07:28:35,178 [main] INFO - Registered task FileSizeCountTask with controller.
2019-12-15 07:28:35,341 [main] WARN - ozone.recon.om.db.dir is not configured. We recommend adding this setting. Falling back to ozone.metadata.dirs instead.
2019-12-15 07:28:35,539 [main] INFO - Starting ReconOMMetadataManagerImpl
2019-12-15 07:28:35,539 [main] WARN - ozone.recon.om.db.dir is not configured. We recommend adding this setting. Falling back to ozone.metadata.dirs instead.
2019-12-15 07:29:35,541 [pool-8-thread-1] INFO - Syncing data from Ozone Manager.
2019-12-15 07:29:35,542 [pool-8-thread-1] INFO - Obtaining full snapshot from Ozone Manager
2019-12-15 07:29:36,917 [pool-8-thread-1] INFO - Got new checkpoint from OM : /home/hdfs/data/meta/om.snapshot.db_1576420175542
2019-12-15 07:29:37,062 [pool-8-thread-1] INFO - Created OM DB snapshot at /home/hdfs/data/meta/om.snapshot.db_1576420175542.
2019-12-15 07:29:37,342 [pool-8-thread-1] INFO - Calling reprocess on Recon tasks.
2019-12-15 07:29:37,345 [pool-6-thread-1] INFO - Starting a 'reprocess' run of ContainerKeyMapperTask.
2019-12-15 07:29:37,939 [pool-6-thread-1] INFO - Creating new Recon Container DB at /home/hdfs/data/meta/recon.db/recon-container.db_1576420177346
2019-12-15 07:29:37,940 [pool-6-thread-1] INFO - Cleaning up old Recon Container DB at /home/hdfs/data/meta/recon.db/recon-container.db_1576384466698.
2019-12-15 07:29:37,997 [pool-6-thread-1] INFO - Completed 'reprocess' of ContainerKeyMapperTask.
2019-12-15 07:29:37,998 [pool-6-thread-1] INFO - It took me 0.651 seconds to process 0 keys.
201
Because I had not created any keys when the Recon server above was started, no keys were processed. The Recon DB files, however, had already been created.
[hdfs@lyq meta]$ ls -l /home/hdfs/data/meta/recon.db/
total 44
drwxrwxr-x 2 hdfs hdfs 4096 Dec 14 08:42 om.db.checkpoints
-rw-rw-r-- 1 hdfs hdfs 36864 Dec 16 07:19 ozone_recon_sqlite.db
drwxr-xr-x 2 hdfs hdfs 4096 Dec 15 07:38 recon-container.db_1576420725643
The ozone_recon_sqlite.db file above stores the summary tables produced by the data analysis.
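Since ozone_recon_sqlite.db is an ordinary SQLite file, its summary tables can be inspected with any SQLite client. Below is a minimal sketch using Python's standard `sqlite3` module; the helper name `list_tables` is my own, and the sketch makes no assumption about which tables Recon actually creates. The file is opened read-only so that the inspection cannot modify a database Recon is still writing to.

```python
import sqlite3


def list_tables(db_path):
    """Return the sorted names of all tables in the SQLite file at db_path."""
    # mode=ro opens the file read-only and fails if the file does not exist.
    conn = sqlite3.connect("file:" + db_path + "?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `list_tables("/home/hdfs/data/meta/recon.db/ozone_recon_sqlite.db")` on the test host would list whatever summary tables exist there; each can then be queried with a plain `SELECT`.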
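In my test the OM snapshot DB files were kept in a different directory (as the WARN lines in the log above show, ozone.recon.om.db.dir was not configured, so Recon fell back to ozone.metadata.dirs):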
[hdfs@lyq meta]$ ls -l
total 32
drwxrwxr-x 2 hdfs hdfs 4096 Dec 15 07:38 om.snapshot.db_1576420724252
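The numeric suffix on these directory names looks like a Unix timestamp in milliseconds. The sketch below decodes it under that assumption; the helper name `snapshot_time` is hypothetical. For the first snapshot in the log, om.snapshot.db_1576420175542, it yields 14:29:35.542 UTC, which lines up with the 07:29:35,542 "Obtaining full snapshot" log entry if the host clock was seven hours behind UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def snapshot_time(dir_name):
    """Read the digits after the last '_' as epoch milliseconds and return a UTC datetime."""
    millis = int(dir_name.rsplit("_", 1)[1])
    # timedelta arithmetic keeps the millisecond part exact (no float rounding).
    return EPOCH + timedelta(milliseconds=millis)


first_sync = snapshot_time("om.snapshot.db_1576420175542")
```

The same helper works for the recon-container.db_<number> directories, which appear to follow the same naming pattern.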
Next we use the Ozone freon tool to create a small number of random keys:
[hdfs@lyq logs]$ ~/apache/ozone/bin/ozone freon randomkeys --numOfVolumes=1 --numOfBuckets=1 --numOfKeys=2 --keySize=10240
2019-12-15 07:37:57,102 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2019-12-15 07:37:57,158 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2019-12-15 07:37:57,158 INFO impl.MetricsSystemImpl: ozone-freon metrics system started
2019-12-15 07:38:03,571 [main] INFO - Number of Threads: 10
2019-12-15 07:38:03,574 [main] INFO - Number of Volumes: 1.
2019-12-15 07:38:03,575 [main] INFO - Number of Buckets per Volume: 1.
2019-12-15 07:38:03,575 [main] INFO - Number of Keys per Bucket: 2.
2019-12-15 07:38:03,575 [main] INFO - Key size: 10240 bytes
2019-12-15 07:38:03,575 [main] INFO - Buffer size: 4096 bytes
2019-12-15 07:38:03,575 [main] INFO - validateWrites : false
2019-12-15 07:38:03,585 [main] INFO - Starting progress bar Thread.
0.00% |? | 0/2 Time: 0:00:00
2019-12-15 07:38:03,600 [pool-2-thread-2] INFO - Creating Volume: vol-0-85226, with hdfs as owner.
2019-12-15 07:38:04,107 [pool-2-thread-1] INFO - Creating Bucket: vol-0-85226/bucket-0-19333, with Versioning false and Storage Type set to DISK and Encryption set to false
0.00% |? | 0/2 Time: 0:00:01
2019-12-15 07:38:04,839 WARN impl.MetricsSystemImpl: ozone-freon metrics system already initialized!
100.00% |?????????????????????????????????????????????????????????????????????????????????????????????????????| 2/2 Time: 0:00:04
***************************************************
Status: Success
Git Base Revision: e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Number of Volumes created: 1
Number of Buckets created: 1
Number of Keys added: 2
Ratis replication factor: ONE
Ratis replication type: STAND_ALONE
Average Time spent in volume creation: 00:00:00,048
Average Time spent in bucket creation: 00:00:00,006
Average Time spent in key creation: 00:00:00,056
Average Time spent in key write: 00:00:00,441
Total bytes written: 20480
Total Execution time: 00:00:11,299
***************************************************
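The summary block is easy to sanity-check mechanically: with 2 keys of 10240 bytes each, the total written should be 2 × 10240 = 20480 bytes, which matches the report. The following is a small sketch that parses the "Label: value" lines and repeats that check; both function names are hypothetical, and the parser only assumes the layout shown in the output above.

```python
def parse_freon_summary(text):
    """Turn the 'Label: value' lines of a freon summary into a dict of strings."""
    summary = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the asterisk separators, and anything without a label.
        if not line or line.startswith("*") or ":" not in line:
            continue
        label, value = line.split(":", 1)
        summary[label.strip()] = value.strip()
    return summary


def bytes_match(summary, key_size):
    """Check that total bytes written equals keys added times the key size."""
    keys = int(summary["Number of Keys added"])
    return int(summary["Total bytes written"]) == keys * key_size
```

Splitting on the first colon only keeps timing values such as 00:00:00,441 intact as strings.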
In the next OM DB sync cycle, we can then see the newly written keys being processed.
2019-12-15 07:38:45,421 [pool-8-thread-1] INFO - Created OM DB snapshot at /home/hdfs/data/meta/om.snapshot.db_1576420724252.
2019-12-15 07:38:45,642 [pool-8-thread-1] INFO - Calling reprocess on Recon tasks.
2019-12-15 07:38:45,643 [pool-6-thread-1] INFO - Starting a 'reprocess' run of ContainerKeyMapperTask.
2019-12-15 07:38:46,272 [pool-6-thread-1] INFO - Creating new Recon Container DB at /home/hdfs/data/meta/recon.db/recon-container.db_1576420725643
2019-12-15 07:38:46,272 [pool-6-thread-1] INFO - Cleaning up old Recon Container DB at /home/hdfs/data/meta/recon.db/recon-container.db_1576420543845.
2019-12-15 07:38:46,875 [pool-6-thread-1] INFO - Completed 'reprocess' of ContainerKeyMapperTask.
2019-12-15 07:38:46,876 [pool-6-thread-1] INFO - It took me 1.233 seconds to process 2 keys.
2019-12-15 07:38:47,303 [pool-6-thread-1] INFO - Completed a 'reprocess' run of FileSizeCountTask.
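To follow how many keys each sync cycle handles without reading the log by eye, the "It took me ... seconds to process ... keys" lines can be pulled out with a regular expression. This is a minimal sketch that assumes the log format shown in this article, with a 23-character timestamp at the start of each line; the function name `reprocess_stats` is my own.

```python
import re

# Matches the ContainerKeyMapperTask summary line shown in the Recon log.
KEYS_LINE = re.compile(
    r"It took me (?P<seconds>[\d.]+) seconds to process (?P<keys>\d+) keys\."
)


def reprocess_stats(log_text):
    """Return (timestamp, seconds, keys) for every key-processing summary line."""
    results = []
    for line in log_text.splitlines():
        match = KEYS_LINE.search(line)
        if match:
            timestamp = line[:23]  # e.g. "2019-12-15 07:38:46,876"
            results.append(
                (timestamp, float(match["seconds"]), int(match["keys"]))
            )
    return results
```

Run over the two log excerpts in this article, it reports 0 keys for the first sync and 2 keys for the sync that followed the freon run.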
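For now, though, these analysis results are only written to ozone_recon_sqlite.db and are not yet exposed well for external use; this area should improve considerably. The Recon server UI is currently just a simple page, and more data will be integrated into it later. Keep an eye on Recon's progress: the Ozone community is already working on the Recon 2.0 phase. The figure below shows the Recon web UI: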