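Using HDFS snapshots
1. Snapshot commands
Allow snapshots to be created on a directory:
hdfs dfsadmin -allowSnapshot <path>
Disallow snapshots on a directory. All existing snapshots of the directory must be deleted before snapshots can be disallowed:
hdfs dfsadmin -disallowSnapshot <path>
Create a snapshot. The snapshot name is optional; if it is omitted, HDFS generates a timestamp-based default name:
hdfs dfs -createSnapshot <path> [<snapshotName>]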
[root@cm1 ~]# hdfs dfs -createSnapshot /data/mytest mytest-snap
Delete a snapshot:
hdfs dfs -deleteSnapshot <path> <snapshotName>
[root@cm1 ~]# hdfs dfs -deleteSnapshot /data/mytest mytest-snap
Rename a snapshot:
hdfs dfs -renameSnapshot <path> <oldName> <newName>
[root@cm1 ~]# hdfs dfs -renameSnapshot /data/mytest mytest-snap mytest-snap-new
List the snapshottable directories:
hdfs lsSnapshottableDir
Get a snapshot diff report:
hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
2. A worked example of using snapshots
Create the data directory and put some files in it:
hdfs dfs -mkdir /user/hadoop-cw/data
hdfs dfs -ls /user/hadoop-cw
hdfs dfs -touchz /user/hadoop-cw/data/file1.txt
hdfs dfs -touchz /user/hadoop-cw/data/file2.txt
hdfs dfs -put word.txt test.txt /user/hadoop-cw/data
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data
Found 4 items
-rw-r--r-- 3 hdfs hdfs 0 2020-05-14 18:29 /user/hadoop-cw/data/file1.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-14 18:29 /user/hadoop-cw/data/file2.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-14 18:31 /user/hadoop-cw/data/test.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-14 18:31 /user/hadoop-cw/data/word.txt
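Suppose the files above were written yesterday and now need to be backed up with a snapshot. Before a snapshot of /user/hadoop-cw/data can be taken, snapshots must be allowed on that directory: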
[root@cm1 ~]# hdfs dfsadmin -allowSnapshot /user/hadoop-cw/data
Allowing snapshot on /user/hadoop-cw/data succeeded
Create a named snapshot of /user/hadoop-cw/data:
[root@cm1 ~]# hdfs dfs -createSnapshot /user/hadoop-cw/data data-20200514-snapshots
Created snapshot /user/hadoop-cw/data/.snapshot/data-20200514-snapshots
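The snapshot name in this walkthrough embeds the date. When snapshots are taken from a script, the name can be generated rather than typed by hand. The following is a minimal sketch that assumes the data-YYYYMMDD-snapshots naming convention used here; snapshot_name is a hypothetical helper, not part of the hdfs CLI, and the command is only printed so that it can be reviewed first.

```shell
# snapshot_name PREFIX [YYYYMMDD]
# Hypothetical helper: builds a name of the form <prefix>-<YYYYMMDD>-snapshots.
# When no date is given, today's date is used.
snapshot_name() {
    printf '%s-%s-snapshots\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Print the command for review; drop "echo" to run it against a cluster.
echo hdfs dfs -createSnapshot /user/hadoop-cw/data "$(snapshot_name data 20200514)"
```

Calling snapshot_name data with no second argument yields the name for the current day, which is convenient for a daily scheduled job.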
List the files captured in the snapshot:
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot/data-20200514-snapshots
Found 4 items
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:29 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/file1.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:29 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/file2.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-15 18:31 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/test.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-15 18:31 /user/hadoop-cw/data/.snapshot/data-20200514-snapshots/word.txt
The snapshot can also be inspected through the web UI.
Today (05-15), more data is written to the data directory:
hdfs dfs -touchz /user/hadoop-cw/data/file3.txt
hdfs dfs -touchz /user/hadoop-cw/data/file4.txt
hdfs dfs -put test2.txt /user/hadoop-cw/data
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data
Found 7 items
-rw-r--r-- 3 hdfs hdfs 0 2020-05-14 18:29 /user/hadoop-cw/data/file1.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-14 18:29 /user/hadoop-cw/data/file2.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:38 /user/hadoop-cw/data/file3.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:38 /user/hadoop-cw/data/file4.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-14 18:31 /user/hadoop-cw/data/test.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-15 18:39 /user/hadoop-cw/data/test2.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-14 18:31 /user/hadoop-cw/data/word.txt
Create today's snapshot (backup):
$ hdfs dfs -createSnapshot /user/hadoop-cw/data data-20200515-snapshots
Created snapshot /user/hadoop-cw/data/.snapshot/data-20200515-snapshots
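Restore the data from the 14th by copying that snapshot out to /user/hadoop-cw. The -ptopax option preserves timestamps, ownership, permissions, ACLs and XAttrs. The copy creates a new directory named after the snapshot and does not overwrite the live data directory: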
$ hdfs dfs -cp -ptopax /user/hadoop-cw/data/.snapshot/data-20200514-snapshots /user/hadoop-cw
Check the restored data:
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/
Found 2 items
drwxr-xr-x - hdfs hdfs 0 2020-05-15 18:40 /user/hadoop-cw/data
drwxr-xr-x - hdfs hdfs 0 2020-05-15 18:33 /user/hadoop-cw/data-20200514-snapshots
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data-20200514-snapshots
Found 4 items
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:29 /user/hadoop-cw/data-20200514-snapshots/file1.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:29 /user/hadoop-cw/data-20200514-snapshots/file2.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-15 18:31 /user/hadoop-cw/data-20200514-snapshots/test.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-15 18:31 /user/hadoop-cw/data-20200514-snapshots/word.txt
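The copy above brings back the whole snapshot. Often only one file needs to be recovered, in which case the source is simply the file's path under .snapshot. The sketch below builds that path and prints the matching copy command; snapshot_path is a hypothetical helper, and the choice of word.txt and the -f (overwrite) flag are assumptions made for illustration.

```shell
# snapshot_path DIR SNAPSHOT [RELPATH]
# Hypothetical helper: prints the read-only path of a snapshot, or of one
# file inside it, using the <dir>/.snapshot/<name> layout shown earlier.
snapshot_path() {
    if [ -n "${3:-}" ]; then
        printf '%s/.snapshot/%s/%s\n' "${1%/}" "$2" "$3"
    else
        printf '%s/.snapshot/%s\n' "${1%/}" "$2"
    fi
}

src=$(snapshot_path /user/hadoop-cw/data data-20200514-snapshots word.txt)

# Print the command for review; drop "echo" to run it against a cluster.
# -f overwrites the destination if it already exists.
echo hdfs dfs -cp -f -ptopax "$src" /user/hadoop-cw/data/word.txt
```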
Check the current data:
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data
Found 7 items
-rw-r--r-- 3 hdfs hdfs 0 2020-05-14 18:29 /user/hadoop-cw/data/file1.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-14 18:29 /user/hadoop-cw/data/file2.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:38 /user/hadoop-cw/data/file3.txt
-rw-r--r-- 3 hdfs hdfs 0 2020-05-15 18:38 /user/hadoop-cw/data/file4.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-14 18:31 /user/hadoop-cw/data/test.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-15 18:39 /user/hadoop-cw/data/test2.txt
-rw-r--r-- 3 hdfs hdfs 12 2020-05-14 18:31 /user/hadoop-cw/data/word.txt
Compare the two snapshots:
[root@cm1 ~]# hdfs snapshotDiff /user/hadoop-cw/data data-20200514-snapshots data-20200515-snapshots
Difference between snapshot data-20200514-snapshots and snapshot data-20200515-snapshots under directory /user/hadoop-cw/data:
M .
+ ./file3.txt
+ ./file4.txt
+ ./test2.txt
Meaning of the symbols in the diff report:
Symbol | Meaning |
---|---|
+ | The file/directory has been created. |
- | The file/directory has been deleted. |
M | The file/directory has been modified. |
R | The file/directory has been renamed. |
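For a large directory the diff report can be long, and a per-type count is easier to read. The following is a rough sketch that tallies the report lines by their leading symbol; summarize_diff is a hypothetical helper, not part of the hdfs CLI, and it assumes each entry line starts with one of the four symbols followed by whitespace and a path.

```shell
# summarize_diff is a hypothetical helper, not part of the hdfs CLI.
# It reads a snapshotDiff report on stdin and counts entries per change type.
summarize_diff() {
    awk '
        # Entry lines start with one of the four symbols followed by a path.
        NF >= 2 && $1 ~ /^[+MR-]$/ { count[$1]++ }
        END {
            printf "created=%d deleted=%d modified=%d renamed=%d\n",
                count["+"], count["-"], count["M"], count["R"]
        }'
}

# Demo on the report from this walkthrough.
# On a cluster, pipe the output of hdfs snapshotDiff into the function instead.
summarize_diff <<'EOF'
Difference between snapshot data-20200514-snapshots and snapshot data-20200515-snapshots under directory /user/hadoop-cw/data:
M .
+ ./file3.txt
+ ./file4.txt
+ ./test2.txt
EOF
```

For the report above this prints created=3 deleted=0 modified=1 renamed=0.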
Rename a snapshot:
hdfs dfs -renameSnapshot /user/hadoop-cw/data data-20200514-snapshots data-20200514-snapshot
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
Found 2 items
drwxr-xr-x - hdfs hdfs 0 2020-05-15 18:33 /user/hadoop-cw/data/.snapshot/data-20200514-snapshot
drwxr-xr-x - hdfs hdfs 0 2020-05-15 18:40 /user/hadoop-cw/data/.snapshot/data-20200515-snapshots
Delete the snapshot from the 14th:
[root@cm1 ~]# hdfs dfs -deleteSnapshot /user/hadoop-cw/data data-20200514-snapshot
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
Found 1 items
drwxr-xr-x - hdfs hdfs 0 2020-05-15 18:40 /user/hadoop-cw/data/.snapshot/data-20200515-snapshots
Trying to disallow snapshots while a snapshot still exists produces an error:
[root@cm1 ~]# hdfs dfsadmin -disallowSnapshot /user/hadoop-cw/data
disallowSnapshot: The directory /user/hadoop-cw/data has snapshot(s). Please redo the operation after removing all the snapshots.
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
Found 1 items
drwxr-xr-x - hdfs hdfs 0 2020-05-15 18:40 /user/hadoop-cw/data/.snapshot/data-20200515-snapshots
The remaining snapshots under .snapshot must be deleted before snapshots can be disallowed:
[root@cm1 ~]# hdfs dfs -deleteSnapshot /user/hadoop-cw/data data-20200515-snapshots
[root@cm1 ~]# hdfs dfs -ls /user/hadoop-cw/data/.snapshot
[root@cm1 ~]# hdfs dfsadmin -disallowSnapshot /user/hadoop-cw/data
Disallowing snapshot on /user/hadoop-cw/data succeeded
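When a directory holds many snapshots, deleting them one by one before disallowing is tedious. The sketch below turns a listing of the .snapshot directory into the sequence of commands used above. Both functions are hypothetical helpers, and the script only prints the commands (a dry run) so they can be checked before anything is deleted. It assumes snapshot names contain no whitespace.

```shell
# Both functions are hypothetical helpers, not part of the hdfs CLI.

# Read "hdfs dfs -ls <dir>/.snapshot" output on stdin and print one
# snapshot name per line. Assumes snapshot names contain no whitespace.
list_snapshot_names() {
    awk '$1 ~ /^d/ { n = split($NF, parts, "/"); print parts[n] }'
}

# Read the same listing on stdin and print the cleanup commands for
# directory $1: one deleteSnapshot per snapshot, then disallowSnapshot.
print_cleanup_commands() (
    dir="$1"
    list_snapshot_names | while IFS= read -r name; do
        printf 'hdfs dfs -deleteSnapshot %s %s\n' "$dir" "$name"
    done
    printf 'hdfs dfsadmin -disallowSnapshot %s\n' "$dir"
)

# Demo on the listing from this walkthrough.
print_cleanup_commands /user/hadoop-cw/data <<'EOF'
Found 1 items
drwxr-xr-x - hdfs hdfs 0 2020-05-15 18:40 /user/hadoop-cw/data/.snapshot/data-20200515-snapshots
EOF
```

On a cluster, the listing would come from hdfs dfs -ls /user/hadoop-cw/data/.snapshot piped into print_cleanup_commands, and the printed commands could be piped to sh once they have been reviewed.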