Accessing HDFS with Hadoop WebHDFS
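Author: 尹正杰 (Yin Zhengjie)
Copyright notice: original work. Reproduction is not permitted; violations will be pursued under the law.
WebHDFS and HttpFS are both HTTP/HTTPS REST interfaces to Hadoop. They let you read and write HDFS data and run several HDFS-related administrative commands. You can embed them in programs and scripts, or call them from command-line tools such as curl or wget.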
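WebHDFS REST calls address one specific NameNode directly, so they do not handle a highly available (HA) NameNode pair transparently; HttpFS does support HA.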
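I. WebHDFS overview
Applications that run inside a Hadoop cluster use Hadoop's native client to work with HDFS data. Sometimes, however, HDFS has to be accessed from outside the cluster in order to process, store, and retrieve data. An application that uses the native HDFS protocol needs Hadoop installed on the server it runs on, together with the Java dependencies the application requires.

Hadoop's WebHDFS provides a powerful set of HTTP REST APIs. REST is an architectural style for building large-scale web services, and it lets applications access and use HDFS remotely. Besides making external access easier, WebHDFS is also useful when you work with two Hadoop clusters that run different Hadoop versions. Because WebHDFS uses a REST API, it does not depend on the MapReduce or HDFS version and can be used on both clusters, for example when the DistCp utility copies data between them.

With WebHDFS, no Hadoop installation is needed on the client; well-known tools such as curl and wget can access HDFS data. WebHDFS connects directly to the Hadoop cluster and supports all HDFS operations. It uses the basic HTTP methods GET, PUT, POST, and DELETE to operate on the HDFS file system remotely.

Recommended reading:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html

Note: if Kerberos authentication is enabled on your HDFS cluster, you also need to set the following parameters in hdfs-site.xml:
dfs.web.authentication.kerberos.principal
dfs.web.authentication.kerberos.keytab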
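II. Accessing HDFS through the WebHDFS REST API with the HDFS command-line tool: hands-on examples
Using WebHDFS is simple: replace the HDFS file system URI with a webhdfs:// URI that points at the NameNode's HTTP address. A few examples follow.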
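The URI substitution can be sketched as a small shell helper. This is only an illustration: the function name `to_webhdfs_uri` is made up here, and the default HTTP port 50070 is an assumption that matches the Hadoop 2.x cluster used in this post (Hadoop 3.x uses 9870 by default).

```shell
#!/usr/bin/env bash
# Convert hdfs://host:rpcport/path into webhdfs://host:httpport/path.
# to_webhdfs_uri is a hypothetical helper name, not a Hadoop command.
to_webhdfs_uri() {
    local hdfs_uri="$1"            # e.g. hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie
    local http_port="${2:-50070}"  # NameNode HTTP port (assumed default for Hadoop 2.x)
    local rest="${hdfs_uri#hdfs://}"   # strip the scheme: host:port/path
    local authority="${rest%%/*}"      # host:port
    local host="${authority%%:*}"      # host only
    local path=""
    # Keep the path only when the URI actually contains one.
    case "$rest" in
        */*) path="/${rest#*/}" ;;
    esac
    printf 'webhdfs://%s:%s%s\n' "$host" "$http_port" "$path"
}
```

The result can be passed to the hdfs tool in place of the original URI, for example `hdfs dfs -ls "$(to_webhdfs_uri hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie)"`.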
1>. List all files and directories under the HDFS directory "/yinzhengjie"
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /    # No file system is named on the command line, so the default defined by "fs.defaultFS" in "core-site.xml" is used.
Found 4 items
drwxr-xr-x - root admingroup 0 2020-08-21 16:40 /bigdata
drwxr-xr-x - root admingroup 0 2020-08-20 19:26 /system
drwx------ - root admingroup 0 2020-08-14 19:19 /user
drwxr-xr-x - root admingroup 0 2020-08-21 18:42 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie    # Access HDFS over the webhdfs protocol
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
2>. Upload a local file to the HDFS cluster
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -put /etc/fstab webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/fstab    # Upload the local file "/etc/fstab" to the HDFS directory "/yinzhengjie/"
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
3>. Download a file or directory from HDFS
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -get webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d    # Download a directory
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll yum.repos.d/
total 40
-rw-r--r-- 1 root root 1664 Aug 31 14:32 CentOS-Base.repo
-rw-r--r-- 1 root root 1309 Aug 31 14:32 CentOS-CR.repo
-rw-r--r-- 1 root root 649 Aug 31 14:32 CentOS-Debuginfo.repo
-rw-r--r-- 1 root root 314 Aug 31 14:32 CentOS-fasttrack.repo
-rw-r--r-- 1 root root 630 Aug 31 14:32 CentOS-Media.repo
-rw-r--r-- 1 root root 1331 Aug 31 14:32 CentOS-Sources.repo
-rw-r--r-- 1 root root 5701 Aug 31 14:32 CentOS-Vault.repo
-rw-r--r-- 1 root root 951 Aug 31 14:32 epel.repo
-rw-r--r-- 1 root root 1050 Aug 31 14:32 epel-testing.repo
[root@hadoop105.yinzhengjie.com ~]#

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -get webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz    # Download a file
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 4
-rw-r--r-- 1 root root 69 Aug 31 14:33 wc.txt.gz
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
4>. Delete a file or directory in HDFS
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -rm -r webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d    # Delete a directory
20/08/31 14:38:12 INFO fs.TrashPolicyDefault: Moved: 'webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d' to trash at: webhdfs://hadoop101.yinzhengjie.com:50070/user/root/.Trash/Current/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]#

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -rm webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz    # Delete a file
20/08/31 14:38:28 INFO fs.TrashPolicyDefault: Moved: 'webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz' to trash at: webhdfs://hadoop101.yinzhengjie.com:50070/user/root/.Trash/Current/yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
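5>. Other operations
With the four examples above as a foundation, exploring the remaining commands on your own should be straightforward. Usage is essentially the same as for the hdfs dfs tool I covered earlier; you only need to replace the hdfs scheme with webhdfs.
Recommended reading:
https://www.cnblogs.com/yinzhengjie2020/p/13296680.html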
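III. Accessing HDFS through the WebHDFS REST API with curl: hands-on examples
WebHDFS is a fairly comprehensive tool, with many commands for accessing and working with HDFS data. The examples below use curl to call the WebHDFS REST API. I will not cover curl itself in depth here; the notes linked at the end of this list describe its basic usage. Commonly used curl options:
-A/--user-agent <string>: set the User-Agent sent to the server
-e/--referer <URL>: set the referrer URL
--cacert <file>: CA certificate (SSL)
-k/--insecure: allow SSL connections without verifying the certificate
--compressed: request a compressed response
-H/--header <line>: pass a custom header to the server
-i: show the response headers together with the body
-I/--head: show only the response headers
-D/--dump-header <file>: write the response headers to the given file
--basic: use HTTP Basic authentication
-u/--user <user[:password]>: set the server user and password
-L: follow 3xx redirects to the new location
-O: save the file locally under the name taken from the URL
-o <file>: save the output to the given file
--limit-rate <rate>: limit the transfer rate
-0/--http1.0: (the digit 0) use HTTP 1.0
-v/--verbose: more detailed output
-C <offset>: resume a transfer at the given offset
-c/--cookie-jar <file name>: write received cookies to the given file
-x/--proxy <proxyhost[:port]>: use the given proxy server
-X/--request <command>: send the given request method
-U/--proxy-user <user:password>: proxy user and password
-T <file>: upload the given local file to the server (for example an FTP server)
-d/--data: send data in a POST request
-b name=data: send cookies to the server, for example values received earlier in a Set-Cookie header
Recommended reading:
https://www.cnblogs.com/yinzhengjie/p/7719804.html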
1>. Read a file in HDFS (this example reads "/yinzhengjie/hosts")
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/hosts?op=OPEN&user.name=yinzhengjie"    # op names the operation; user.name names the user making the request
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:39:16 GMT
Date: Mon, 31 Aug 2020 07:39:16 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:39:16 GMT
Date: Mon, 31 Aug 2020 07:39:16 GMT
Pragma: no-cache
Content-Type: application/octet-stream
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=yinzhengjie&p=yinzhengjie&t=simple&e=1598895556829&s=ak8QrD/3I7HowelGDzH9uvnDeAGBihJhCbCm0wVqS2M="; Path=/; HttpOnly
Location: http://hadoop104.yinzhengjie.com:50075/webhdfs/v1/yinzhengjie/hosts?op=OPEN&user.name=yinzhengjie&namenoderpcaddress=hadoop101.yinzhengjie.com:9000&offset=0
Content-Length: 0

HTTP/1.1 200 OK
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Content-Length: 371

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
#Hadoop 2.x
172.200.6.101 hadoop101.yinzhengjie.com
172.200.6.102 hadoop102.yinzhengjie.com
172.200.6.103 hadoop103.yinzhengjie.com
172.200.6.104 hadoop104.yinzhengjie.com
172.200.6.105 hadoop105.yinzhengjie.com
[root@hadoop105.yinzhengjie.com ~]#
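The WebHDFS documentation also lists optional `offset` and `length` parameters for `op=OPEN`, which let a client read only part of a file. The sketch below just assembles such a URL; the function name `webhdfs_open_url` and its argument order are illustrative, and the host and port are the ones used in this post.

```shell
#!/usr/bin/env bash
# Build an op=OPEN URL; offset and length are optional byte positions.
# webhdfs_open_url is a hypothetical helper name, not part of Hadoop.
webhdfs_open_url() {
    local namenode_http="$1"   # NameNode HTTP address, e.g. hadoop101.yinzhengjie.com:50070
    local hdfs_path="$2"       # absolute HDFS path, e.g. /yinzhengjie/hosts
    local user="$3"            # value for user.name
    local offset="${4:-}"      # optional starting byte
    local length="${5:-}"      # optional number of bytes to read
    local url="http://${namenode_http}/webhdfs/v1${hdfs_path}?op=OPEN&user.name=${user}"
    if [ -n "$offset" ]; then url="${url}&offset=${offset}"; fi
    if [ -n "$length" ]; then url="${url}&length=${length}"; fi
    printf '%s\n' "$url"
}

# Example (needs a reachable cluster): read the first 100 bytes of the file.
# curl -L "$(webhdfs_open_url hadoop101.yinzhengjie.com:50070 /yinzhengjie/hosts yinzhengjie 0 100)"
```

As in the transcript above, `-L` matters: the NameNode answers with a 307 redirect, and the file data comes from the DataNode named in the Location header.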
2>. Check the status of an HDFS directory
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x - root admingroup 0 2020-08-21 16:40 /bigdata
drwxr-xr-x - root admingroup 0 2020-08-20 19:26 /system
drwx------ - root admingroup 0 2020-08-14 19:19 /user
drwxr-xr-x - root admingroup 0 2020-08-31 14:38 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=LISTSTATUS"    # Show the status of the "/yinzhengjie" directory
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:51:31 GMT
Date: Mon, 31 Aug 2020 07:51:31 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:51:31 GMT
Date: Mon, 31 Aug 2020 07:51:31 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"FileStatuses":{"FileStatus":[
{"accessTime":1598855175268,"blockSize":536870912,"childrenNum":0,"fileId":16489,"group":"admingroup","length":490,"modificationTime":1598855175823,"owner":"root","pathSuffix":"fstab","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1598859477240,"blockSize":536870912,"childrenNum":0,"fileId":16484,"group":"admingroup","length":371,"modificationTime":1597999554986,"owner":"root","pathSuffix":"hosts","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}
]}}
[root@hadoop105.yinzhengjie.com ~]#
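In a script you usually want only the entry names from a LISTSTATUS response. The following is a rough sketch that pulls the `pathSuffix` values out with grep and sed. It assumes the compact JSON shown in the response above and names without embedded double quotes; a real JSON parser is more robust. The function name `list_path_suffixes` is made up for this example.

```shell
#!/usr/bin/env bash
# Read a LISTSTATUS JSON body on stdin and print one pathSuffix per line.
# list_path_suffixes is a hypothetical helper; it relies on compact JSON.
list_path_suffixes() {
    grep -o '"pathSuffix":"[^"]*"' \
        | sed -e 's/^"pathSuffix":"//' -e 's/"$//'
}

# Example (needs a reachable cluster):
# curl -s "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=LISTSTATUS" | list_path_suffixes
```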
3>. Check the status of an HDFS file
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/hosts?op=GETFILESTATUS" ;echo    # Show the status of the file "/yinzhengjie/hosts"
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:58:53 GMT
Date: Mon, 31 Aug 2020 07:58:53 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:58:53 GMT
Date: Mon, 31 Aug 2020 07:58:53 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"FileStatus":{"accessTime":1598859477240,"blockSize":536870912,"childrenNum":0,"fileId":16484,"group":"admingroup","length":371,"modificationTime":1597999554986,"owner":"root","pathSuffix":"","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}}
[root@hadoop105.yinzhengjie.com ~]#
4>. Create a directory
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x - root admingroup 0 2020-08-21 16:40 /bigdata
drwxr-xr-x - root admingroup 0 2020-08-20 19:26 /system
drwx------ - root admingroup 0 2020-08-14 19:19 /user
drwxr-xr-x - root admingroup 0 2020-08-31 16:17 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -X PUT "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/webHDFS?user.name=root&op=MKDIRS&permissions=751" ;echo    # Create the directory "/yinzhengjie/webHDFS"
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 08:14:10 GMT
Date: Mon, 31 Aug 2020 08:14:10 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 08:14:10 GMT
Date: Mon, 31 Aug 2020 08:14:10 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598897650918&s=rp1JdtIpaV59fm8TFisjCUMH3ARerDWzI4oL+jCezrs="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
drwxr-xr-x - root admingroup 0 2020-08-31 16:14 /yinzhengjie/webHDFS
[root@hadoop105.yinzhengjie.com ~]#
Note: the listing shows the new directory with the default mode rwxr-xr-x (755), not 751. The WebHDFS documentation names this parameter "permission" (singular), so "permissions=751" appears to have been ignored.
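The WebHDFS documentation names the mode parameter for MKDIRS `permission` (singular, an octal value). A minimal sketch of a URL built with that parameter follows; the function name `webhdfs_mkdirs_url` and the argument order are illustrative only.

```shell
#!/usr/bin/env bash
# Build an op=MKDIRS URL that sets the directory mode via "permission".
# webhdfs_mkdirs_url is a hypothetical helper name, not part of Hadoop.
webhdfs_mkdirs_url() {
    local namenode_http="$1"     # NameNode HTTP address, e.g. hadoop101.yinzhengjie.com:50070
    local hdfs_path="$2"         # absolute HDFS path of the directory to create
    local user="$3"              # value for user.name
    local octal_mode="${4:-755}" # octal mode; 755 is the documented default for directories
    printf 'http://%s/webhdfs/v1%s?op=MKDIRS&user.name=%s&permission=%s\n' \
        "$namenode_http" "$hdfs_path" "$user" "$octal_mode"
}

# Example (needs a reachable cluster):
# curl -i -X PUT "$(webhdfs_mkdirs_url hadoop101.yinzhengjie.com:50070 /yinzhengjie/webHDFS root 751)"
```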
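5>. Create a file and write data to it
I am using Hadoop 2.10.0. My attempts to create a file, or to append to an existing one, with the methods in the official WebHDFS documentation all failed. Both official methods require two HTTP requests, and after many tests I still could not create the file. If you have made it work, I would be grateful for your advice.
Reference links:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Append_to_a_File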
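For reference, the two-request flow described in the official documentation can be sketched as follows. The first PUT goes to the NameNode without file data and returns a 307 response whose Location header names a DataNode; the second PUT sends the file to that URL. This is an untested sketch based on the documentation, not a confirmed fix for the failure described above. The function names are made up, and it assumes the client can resolve and reach the DataNode host named in the Location header.

```shell
#!/usr/bin/env bash
# Read HTTP response headers on stdin and print the Location value, if any.
# extract_location is a hypothetical helper name.
extract_location() {
    tr -d '\r' | sed -n 's/^[Ll]ocation: *//p' | head -n 1
}

# Two-step CREATE as described in the WebHDFS documentation (hypothetical wrapper).
webhdfs_create() {
    local namenode_http="$1"   # e.g. hadoop101.yinzhengjie.com:50070
    local hdfs_path="$2"       # absolute HDFS path of the file to create
    local user="$3"            # value for user.name
    local local_file="$4"      # local file whose content is uploaded
    local datanode_url
    # Step 1: ask the NameNode where to write; send no data, do not follow the redirect.
    datanode_url="$(curl -s -i -X PUT \
        "http://${namenode_http}/webhdfs/v1${hdfs_path}?op=CREATE&user.name=${user}&overwrite=false" \
        | extract_location)"
    if [ -z "$datanode_url" ]; then
        echo "no Location header returned by the NameNode" >&2
        return 1
    fi
    # Step 2: send the file body to the DataNode URL from the Location header.
    curl -s -i -X PUT -T "$local_file" "$datanode_url"
}

# Example (needs a reachable cluster):
# webhdfs_create hadoop101.yinzhengjie.com:50070 /yinzhengjie/fstab root /etc/fstab
```

According to the documentation, appending works the same way with `-X POST` and `op=APPEND` in both requests.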
6>. Delete a directory or file
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
drwxr-xr-x - root admingroup 0 2020-08-31 18:07 /yinzhengjie/webHDFS
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -X DELETE "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/webHDFS?op=DELETE&user.name=root";echo    # Delete a directory
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:07:56 GMT
Date: Mon, 31 Aug 2020 10:07:56 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:07:56 GMT
Date: Mon, 31 Aug 2020 10:07:56 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598904476157&s=4aHgz6EwyJfdmjlwOtkXs+8Je94BybNxDUYoon7FIWE="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -X DELETE "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/fstab?op=DELETE&user.name=root";echo    # Delete a file
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:08:52 GMT
Date: Mon, 31 Aug 2020 10:08:52 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:08:52 GMT
Date: Mon, 31 Aug 2020 10:08:52 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598904532486&s=MCjvGp705lVZcZx7hc5UCeERNoRDGC5rsW5E/USXi6c="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 1 items
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
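The directory deleted above was empty. For a directory that still has content, the WebHDFS documentation describes an optional `recursive` parameter (default false). The sketch below builds such a URL and checks the `{"boolean":true}` body that the responses above return; both function names are illustrative.

```shell
#!/usr/bin/env bash
# Build an op=DELETE URL; pass "true" as the fourth argument to delete recursively.
# webhdfs_delete_url and webhdfs_succeeded are hypothetical helper names.
webhdfs_delete_url() {
    local namenode_http="$1"     # e.g. hadoop101.yinzhengjie.com:50070
    local hdfs_path="$2"         # absolute HDFS path to delete
    local user="$3"              # value for user.name
    local recursive="${4:-false}"
    printf 'http://%s/webhdfs/v1%s?op=DELETE&user.name=%s&recursive=%s\n' \
        "$namenode_http" "$hdfs_path" "$user" "$recursive"
}

# Read a response body on stdin; succeed only if it reports {"boolean":true}.
webhdfs_succeeded() {
    grep -q '"boolean"[[:space:]]*:[[:space:]]*true'
}

# Example (needs a reachable cluster):
# curl -s -X DELETE "$(webhdfs_delete_url hadoop101.yinzhengjie.com:50070 /yinzhengjie/webHDFS root true)" \
#     | webhdfs_succeeded && echo "deleted"
```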
7>. Check directory quotas
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -count -h -v -q /yinzhengjie
QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
none   inf        none         inf              1          2           742           /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=GETCONTENTSUMMARY" ;echo
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:30:13 GMT
Date: Mon, 31 Aug 2020 10:30:13 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:30:13 GMT
Date: Mon, 31 Aug 2020 10:30:13 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"ContentSummary":{"directoryCount":1,"fileCount":2,"length":742,"quota":-1,"spaceConsumed":29631,"spaceQuota":-1,"typeQuota":{}}}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfsadmin -setSpaceQuota 10g /yinzhengjie/
[root@hadoop105.yinzhengjie.com ~]# hdfs dfsadmin -setQuota 50 /yinzhengjie/
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -count -h -v -q /yinzhengjie
QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
50     47         10 G         10.0 G           1          2           742           /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=GETCONTENTSUMMARY" ;echo
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:30:52 GMT
Date: Mon, 31 Aug 2020 10:30:52 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:30:52 GMT
Date: Mon, 31 Aug 2020 10:30:52 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"ContentSummary":{"directoryCount":1,"fileCount":2,"length":742,"quota":50,"spaceConsumed":29631,"spaceQuota":10737418240,"typeQuota":{}}}
[root@hadoop105.yinzhengjie.com ~]#
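In the two responses above, a value of -1 for `quota` or `spaceQuota` means that no quota is set. To read a single numeric field from the ContentSummary in a script, something like the following sketch can be used. It assumes the compact JSON shown above; the function name `summary_field` is made up for this example, and a real JSON parser is more robust.

```shell
#!/usr/bin/env bash
# Read a GETCONTENTSUMMARY JSON body on stdin and print one numeric field.
# summary_field is a hypothetical helper; it relies on compact JSON.
summary_field() {
    local field="$1"   # e.g. quota, spaceQuota, length
    grep -o "\"${field}\":-\{0,1\}[0-9]\{1,\}" | sed 's/^.*://'
}

# Example (needs a reachable cluster):
# curl -s "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=GETCONTENTSUMMARY" | summary_field spaceQuota
```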
8>. Other operations
Recommended reading:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html