Using HDFS snapshots

1. Snapshot commands

Allow snapshots to be created on a directory (this makes the directory snapshottable):

hdfs dfsadmin -allowSnapshot <path>

Disallow snapshots on a directory. All snapshots of the directory must be deleted first; only then can snapshots be disallowed.

hdfs dfsadmin -disallowSnapshot <path>

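Create a snapshot. The snapshot name is optional; if it is omitted, HDFS generates a default name from the current timestamp.

hdfs dfs -createSnapshot <path> [<snapshotName>]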
[root@cm1 ~]# hdfs dfs -createSnapshot /data/mytest mytest-snap

Delete a snapshot:

hdfs dfs -deleteSnapshot <path> <snapshotName>
[root@cm1 ~]# hdfs dfs -deleteSnapshot /data/mytest mytest-snap

Rename a snapshot:

hdfs dfs -renameSnapshot <path> <oldName> <newName>
[root@cm1 ~]# hdfs dfs -renameSnapshot /data/mytest mytest-snap mytest-snap-new

List the snapshottable directories on which the current user is allowed to take snapshots:

hdfs lsSnapshottableDir
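
In a script it can be handy to check whether one particular directory is already snapshottable before trying to snapshot it. A minimal sketch that filters the listing: is_snapshottable is a hypothetical helper name, and the sketch assumes that the directory path is the last column of each output line and that paths contain no whitespace.

```shell
# Hypothetical helper: succeed (exit 0) if the directory in $1 is listed as snapshottable.
# Assumption: the path is the last whitespace-separated column of each line.
is_snapshottable() {
    hdfs lsSnapshottableDir |
        awk -v dir="$1" '$NF == dir { found = 1 } END { exit !found }'
}

# Example (requires a running cluster):
#   is_snapshottable /data/mytest || hdfs dfsadmin -allowSnapshot /data/mytest
```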

Get a snapshot difference report:

hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>

2. A worked snapshot example

Create the data directory and write some files into it:

hdfs dfs -mkdir /user/hadoop-cw/data
hdfs dfs -ls /user/hadoop-cw
hdfs dfs -touchz /user/hadoop-cw/data/file1.txt
hdfs dfs -touchz /user/hadoop-cw/data/file2.txt
hdfs dfs -put word.txt test.txt /user/hadoop-cw/data

[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data
Found 4 items
-rw-r--r--   3 hdfs hdfs          0 2020-05-14 18:29 /user/hadoop-cw/data/file1.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-14 18:29 /user/hadoop-cw/data/file2.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-14 18:31 /user/hadoop-cw/data/test.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-14 18:31 /user/hadoop-cw/data/word.txt

Suppose the files above were written yesterday and now need to be backed up with a snapshot. Before /user/hadoop-cw/data can be snapshotted, snapshots must be allowed on it:

[root@cm1 ~]# hdfs dfsadmin -allowSnapshot /user/hadoop-cw/data
Allowing snapshot on /user/hadoop-cw/data succeeded

Create a snapshot of /user/hadoop-cw/data and give it a name:

[root@cm1 ~]# hdfs dfs -createSnapshot /user/hadoop-cw/data data-20200514-snapshots
Created snapshot /user/hadoop-cw/data/.snapshot/data-20200514-snapshots
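
The snapshot name above follows a <prefix>-<date>-snapshots pattern. If the snapshot is taken every day, the name can be generated instead of typed. A small sketch of that idea; daily_snapshot_name is a hypothetical helper, not part of HDFS:

```shell
# Hypothetical helper: build a name such as data-20200514-snapshots.
# $1 = prefix, $2 = date as YYYYMMDD (defaults to today's date).
daily_snapshot_name() {
    printf '%s-%s-snapshots\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Example (requires a running cluster):
#   hdfs dfs -createSnapshot /user/hadoop-cw/data "$(daily_snapshot_name data)"
```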

List the files captured in the snapshot:

[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot/data-20200514-snapshots
Found 4 items
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:29 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/file1.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:29 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/file2.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-15 18:31 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/test.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-15 18:31 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/word.txt

The snapshot can also be viewed in the WebUI.

Today, May 15, more data is written to data:

hdfs dfs -touchz /user/hadoop-cw/data/file3.txt
hdfs dfs -touchz /user/hadoop-cw/data/file4.txt
hdfs dfs -put test2.txt /user/hadoop-cw/data

[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data
Found 7 items
-rw-r--r--   3 hdfs hdfs          0 2020-05-14 18:29 /user/hadoop-cw/data/file1.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-14 18:29 /user/hadoop-cw/data/file2.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:38 /user/hadoop-cw/data/file3.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:38 /user/hadoop-cw/data/file4.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-14 18:31 /user/hadoop-cw/data/test.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-15 18:39 /user/hadoop-cw/data/test2.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-14 18:31 /user/hadoop-cw/data/word.txt

Create today's snapshot (backup):

$ hdfs dfs -createSnapshot /user/hadoop-cw/data data-20200515-snapshots
Created snapshot /user/hadoop-cw/data/.snapshot/data-20200515-snapshots

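Restore the data from the 14th. The copy below places the snapshot contents in a new directory, /user/hadoop-cw/data-20200514-snapshots, and leaves the current data directory untouched; -ptopax preserves timestamps, ownership, permissions, ACLs, and extended attributes: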

$ hdfs dfs -cp -ptopax /user/hadoop-cw/data/.snapshot/data-20200514-snapshots /user/hadoop-cw

Check the restored data:

[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/
Found 2 items
drwxr-xr-x   - hdfs hdfs          0 2020-05-15 18:40 /user/hadoop-cw/data
drwxr-xr-x   - hdfs hdfs          0 2020-05-15 18:33 /user/hadoop-cw/data-20200514-snapshots
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data-20200514-snapshots
Found 4 items
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:29 /user/hadoop-cw/data-20200514-snapshots/file1.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:29 /user/hadoop-cw/data-20200514-snapshots/file2.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-15 18:31 /user/hadoop-cw/data-20200514-snapshots/test.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-15 18:31 /user/hadoop-cw/data-20200514-snapshots/word.txt
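
Often only a single file has to be brought back rather than the whole directory. The same cp approach works on one path inside .snapshot. A sketch under that assumption; snapshot_path and restore_file are hypothetical helper names, and -f is added so that the current copy of the file is overwritten:

```shell
# Hypothetical helper: print the path of a file inside a snapshot.
# $1 = snapshottable directory, $2 = snapshot name, $3 = path relative to the directory.
snapshot_path() {
    printf '%s/.snapshot/%s/%s\n' "${1%/}" "$2" "$3"
}

# Hypothetical helper: copy one file from a snapshot back into the live directory.
# -f overwrites the current copy; -ptopax preserves timestamps, ownership,
# permissions, ACLs and extended attributes.
restore_file() {
    hdfs dfs -cp -f -ptopax "$(snapshot_path "$1" "$2" "$3")" "${1%/}/$3"
}

# Example (requires a running cluster):
#   restore_file /user/hadoop-cw/data data-20200514-snapshots word.txt
```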

Check the current data:

[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data
Found 7 items
-rw-r--r--   3 hdfs hdfs          0 2020-05-14 18:29 /user/hadoop-cw/data/file1.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-14 18:29 /user/hadoop-cw/data/file2.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:38 /user/hadoop-cw/data/file3.txt
-rw-r--r--   3 hdfs hdfs          0 2020-05-15 18:38 /user/hadoop-cw/data/file4.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-14 18:31 /user/hadoop-cw/data/test.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-15 18:39 /user/hadoop-cw/data/test2.txt
-rw-r--r--   3 hdfs hdfs         12 2020-05-14 18:31 /user/hadoop-cw/data/word.txt

Compare the two snapshots:

[root@cm1 ~]# hdfs snapshotDiff /user/hadoop-cw/data data-20200514-snapshots data-20200515-snapshots
Difference between snapshot data-20200514-snapshots and snapshot data-20200515-snapshots under directory /user/hadoop-cw/data:
M	.
+	./file3.txt
+	./file4.txt
+	./test2.txt

Meaning of the symbols in the report:

+ The file/directory has been created.
- The file/directory has been deleted.
M The file/directory has been modified.
R The file/directory has been renamed.
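
For a large directory the report can be long, and a per-symbol count is easier to read. A minimal sketch that summarizes a report read from standard input; summarize_diff is a hypothetical helper, and it assumes that each entry line is the symbol, a tab, and then the path, as in the output above:

```shell
# Hypothetical helper: count the entries of each type in a snapshotDiff report on stdin.
# Lines whose first tab-separated field is not +, -, M or R (such as the header) are ignored.
summarize_diff() {
    awk -F '\t' '
        $1 == "+" { created++ }
        $1 == "-" { deleted++ }
        $1 == "M" { modified++ }
        $1 == "R" { renamed++ }
        END {
            printf "created=%d deleted=%d modified=%d renamed=%d\n",
                   created, deleted, modified, renamed
        }
    '
}

# Example (requires a running cluster):
#   hdfs snapshotDiff /user/hadoop-cw/data data-20200514-snapshots data-20200515-snapshots | summarize_diff
```

Applied to the report above, this would print created=3 deleted=0 modified=1 renamed=0.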

Rename a snapshot:

hdfs dfs -renameSnapshot /user/hadoop-cw/data data-20200514-snapshots data-20200514-snapshot

[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
Found 2 items
drwxr-xr-x   - hdfs hdfs          0 2020-05-15 18:33 /user/hadoop-cw/data/.snapshot/data-20200514-snapshot
drwxr-xr-x   - hdfs hdfs          0 2020-05-15 18:40 /user/hadoop-cw/data/.snapshot/data-20200515-snapshots

Delete the snapshot from the 14th:

[root@cm1 ~]# hdfs dfs -deleteSnapshot /user/hadoop-cw/data data-20200514-snapshot
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2020-05-15 18:40 /user/hadoop-cw/data/.snapshot/data-20200515-snapshots
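
With one dated snapshot per day, old snapshots have to be deleted like this on a regular basis. One way to pick the candidates is to compare the date embedded in the name with a cutoff. A sketch of that selection step only; expired_snapshots is a hypothetical helper, and it assumes names of the form <prefix>-<YYYYMMDD>-<suffix> as used in this example:

```shell
# Hypothetical helper: read snapshot names from stdin and print those whose
# embedded date (the second-to-last '-'-separated field) is earlier than the cutoff.
# $1 = cutoff date as YYYYMMDD.
expired_snapshots() {
    awk -F '-' -v cutoff="$1" '
        NF >= 3 && length($(NF-1)) == 8 && $(NF-1) ~ /^[0-9]+$/ && $(NF-1) + 0 < cutoff + 0
    '
}

# Example (requires a running cluster): print the snapshots older than May 15, 2020.
#   hdfs dfs -ls /user/hadoop-cw/data/.snapshot | awk '/^d/ { n = split($NF, p, "/"); print p[n] }' | expired_snapshots 20200515
```

Each printed name can then be passed to hdfs dfs -deleteSnapshot.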

If snapshots still exist, trying to disallow snapshots fails with an error:

[root@cm1 ~]# hdfs dfsadmin -disallowSnapshot /user/hadoop-cw/data
disallowSnapshot: The directory /user/hadoop-cw/data has snapshot(s). Please redo the operation after removing all the snapshots.
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2020-05-15 18:40 /user/hadoop-cw/data/.snapshot/data-20200515-snapshots

The snapshots under .snapshot have to be deleted first; after that, disallowing succeeds:

[root@cm1 ~]# hdfs dfs -deleteSnapshot /user/hadoop-cw/data data-20200515-snapshots
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
[root@cm1 ~]# hdfs dfsadmin -disallowSnapshot /user/hadoop-cw/data
Disallowing snapshot on /user/hadoop-cw/data succeeded
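
The delete-then-disallow sequence can be wrapped in one function. A minimal sketch, not a hardened tool: purge_and_disallow is a hypothetical helper, it assumes snapshot paths without whitespace, and it irreversibly deletes every snapshot of the directory, so it should be used with care.

```shell
# Hypothetical helper: delete every snapshot of the directory in $1, then disallow snapshots.
purge_and_disallow() {
    dir=${1%/}
    # Entry lines of the listing start with 'd'; the last column is the snapshot path.
    snaps=$(hdfs dfs -ls "$dir/.snapshot" | awk '/^d/ { print $NF }')
    for snap_path in $snaps; do
        # ${snap_path##*/} strips everything up to the last '/', leaving the snapshot name.
        hdfs dfs -deleteSnapshot "$dir" "${snap_path##*/}" || return 1
    done
    hdfs dfsadmin -disallowSnapshot "$dir"
}

# Example (requires a running cluster):
#   purge_and_disallow /user/hadoop-cw/data
```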
posted @ 2020-05-15 15:25  陈小哥cw