Hadoop file operations

Create a directory in HDFS - mkdir

The hadoop mkdir command creates directories in HDFS, much like the Unix mkdir command. It takes one or more path URIs as arguments and creates a directory for each. Use the -p option to create any missing parent directories along the path.

hadoop fs -mkdir [-p] <paths>
hadoop fs -mkdir /user/hadoop/corejavaguru
hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -mkdir -p /user/hadoop/corejavaguru/fscommands/demo
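The mkdir usage above can be wrapped in a small script helper. This is a minimal sketch, assuming a reachable HDFS and the `hadoop` client on the PATH; `ensure_hdfs_dir` is a hypothetical name, and `hadoop fs -test -d` (not covered in this post) is used to check whether the directory already exists.

```shell
# ensure_hdfs_dir is a hypothetical helper, not part of Hadoop.
# It reports whether the directory was already there or had to be created.
ensure_hdfs_dir() {
    # -test -d exits with status 0 when the path exists and is a directory
    if hadoop fs -test -d "$1"; then
        echo "exists: $1"
    else
        # -p also creates any missing parent directories
        hadoop fs -mkdir -p "$1" && echo "created: $1"
    fi
}

# Example call (needs a running cluster):
# ensure_hdfs_dir /user/hadoop/corejavaguru/fscommands/demo
```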

List the contents of an HDFS directory - ls

The ls command lists files and directories.

For a file, ls returns stat on the file in the following format:

permissions number_of_replicas userid groupid filesize modification_date modification_time filename

For a directory, it returns a list of its direct children, as in Unix. A directory is listed as:

permissions userid groupid modification_date modification_time dirname

hadoop fs -ls <args>
hadoop fs -ls /user/hadoop/file1
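Because the file listing has fixed columns, it is easy to post-process with standard Unix tools. The sketch below prints the size and name of each file in a directory. `list_file_sizes` is a hypothetical helper; it assumes file lines start with `-` in the permissions column and that paths contain no whitespace.

```shell
# list_file_sizes is a hypothetical helper, not part of Hadoop.
# Assumes the 8-column file format described above:
#   permissions replicas userid groupid filesize date time filename
list_file_sizes() {
    # keep only regular files (permissions start with "-"),
    # then print filesize (column 5) and filename (column 8)
    hadoop fs -ls "$1" | awk '$1 ~ /^-/ { print $5 "\t" $8 }'
}

# Example call (needs a running cluster):
# list_file_sizes /user/hadoop
```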

Upload a file into HDFS - put

The put command copies a single source, or multiple sources, from the local file system to the destination file system. It can also read input from stdin and write it to the destination file system. The put command can be used in the following ways:

hadoop fs -put <localsrc> ... <hdfs_dest_path>
hadoop fs -put /home/hadoop/Samplefile.txt  /user/hadoop/dir3/
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
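The stdin form mentioned above is not shown in the examples: passing `-` as the source makes put read from standard input. A minimal sketch, with `put_from_stdin` as a hypothetical helper name and the destination path chosen only for illustration:

```shell
# put_from_stdin is a hypothetical helper, not part of Hadoop.
# A source of "-" tells put to read the data from stdin,
# so no temporary local file is needed.
put_from_stdin() {
    hadoop fs -put - "$1"
}

# Example call (needs a running cluster; the path is illustrative):
# printf 'line1\nline2\n' | put_from_stdin /user/hadoop/dir3/lines.txt
```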

Download a file from HDFS - get

The hadoop get command copies files from HDFS to the local file system. The syntax of the get command is shown below:

hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>
hadoop fs -get /user/hadoop/file localfile
hadoop fs -get hdfs://nn.example.com/user/hadoop/file localfile
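In scripts it can be handy to avoid re-downloading a file that is already present locally. The following is only a sketch under that assumption; `fetch_once` is a hypothetical helper, and the existence check is done with the local shell, not by Hadoop.

```shell
# fetch_once is a hypothetical helper, not part of Hadoop.
# $1 = HDFS source, $2 = local destination
fetch_once() {
    if [ -e "$2" ]; then
        # the local target is already there, so skip the copy
        echo "skip: $2 already exists"
    else
        hadoop fs -get "$1" "$2"
    fi
}

# Example call (needs a running cluster):
# fetch_once /user/hadoop/file localfile
```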

See contents of a file in HDFS - cat

The cat command prints the contents of a file to stdout.

hadoop fs -cat <path[filename]>
hadoop fs -cat /user/hadoop/dir1/xyz.txt
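Since cat writes to stdout, its output can be piped into ordinary Unix tools. As a small sketch, the hypothetical helper `hdfs_line_count` counts the lines of an HDFS file with awk:

```shell
# hdfs_line_count is a hypothetical helper, not part of Hadoop.
hdfs_line_count() {
    # stream the file to stdout and let awk report the number of lines read
    hadoop fs -cat "$1" | awk 'END { print NR }'
}

# Example call (needs a running cluster):
# hdfs_line_count /user/hadoop/dir1/xyz.txt
```

Keep in mind that the whole file is streamed to the client, so this is best suited to small and medium files.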

Copy a file from source to destination in HDFS - cp

The cp command copies files from source to destination. It also accepts multiple sources, in which case the destination must be a directory.

hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest>
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir
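With multiple sources the destination has to be an existing directory, so a script may want to create it first. A minimal sketch, with `copy_into_dir` as a hypothetical helper that combines mkdir -p and cp:

```shell
# copy_into_dir is a hypothetical helper, not part of Hadoop.
# $1 = destination directory, remaining arguments = source paths
copy_into_dir() {
    dest="$1"
    shift
    # make sure the destination directory exists, then copy every source into it
    hadoop fs -mkdir -p "$dest" && hadoop fs -cp "$@" "$dest"
}

# Example call (needs a running cluster):
# copy_into_dir /user/hadoop/dir /user/hadoop/file1 /user/hadoop/file2
```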

Copy a file from Local file system to HDFS - copyFromLocal

The hadoop copyFromLocal command copies a file from the local file system to HDFS. It is similar to the put command, except that the source is restricted to a local file reference.

hadoop fs -copyFromLocal <localsrc> URI
hadoop fs -copyFromLocal /home/hadoop/xyz.txt  /user/hadoop/xyz.txt

Copy a file from HDFS to Local file system - copyToLocal

The hadoop copyToLocal command copies a file from HDFS to the local file system. It is similar to the get command, except that the destination is restricted to a local file reference.

hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
hadoop fs -copyToLocal /user/hadoop/xyz.txt /home/hadoop/xyz.txt

Move a file from source to destination in HDFS - mv

The mv command moves files from source to destination. It also accepts multiple sources, in which case the destination must be a directory. Note: Moving files across file systems is not permitted.

hadoop fs -mv URI [URI ...] <dest>
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/dir1

Remove a file or directory in HDFS - rm, rmdir


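The rm command deletes the files specified as arguments. To remove a directory together with everything under it, pass -r (or -R); for an empty directory, rmdir is the safer choice. The -f option suppresses the error when a file does not exist, and -skipTrash bypasses the trash, if enabled, and deletes the files immediately.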

hadoop fs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]
hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir


The rmdir command deletes the directories specified as arguments, provided they are empty.

hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]
hadoop fs -rmdir /user/hadoop/emptydir

Options: --ignore-fail-on-non-empty: When using wildcards, do not fail if a directory still contains files.
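Recursive deletion with -r is easy to misuse in scripts, for example when a variable holding the path turns out to be empty. The sketch below adds a small guard in front of `hadoop fs -rm -r`; `remove_tree` is a hypothetical helper and the guard is only an illustration, not a complete safety net.

```shell
# remove_tree is a hypothetical helper, not part of Hadoop.
remove_tree() {
    case "$1" in
        ""|"/")
            # refuse an empty path or the file system root
            echo "refusing to delete '$1'" >&2
            return 1
            ;;
    esac
    # -r deletes the directory and everything under it
    hadoop fs -rm -r "$1"
}

# Example call (needs a running cluster):
# remove_tree /user/hadoop/dir1
```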

Display last few lines of a file in HDFS - tail

Displays the last kilobyte of the file to stdout. The -f option keeps outputting appended data as the file grows, as in Unix.

hadoop fs -tail [-f] URI
hadoop fs -tail /user/hadoop/demo.txt

Print statistics about the file or directory in HDFS - stat

Use stat to print statistics about the file or directory at <path> in the specified format.

hadoop fs -stat [format] <path> ...
hadoop fs -stat /user/hadoop/
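The optional format argument is a string of specifiers. The ones used below (%n name, %F type, %u owner, %g group, %y modification date) are taken from the Hadoop FileSystem shell documentation rather than from this post, so check them against your Hadoop release. `describe_path` is a hypothetical helper.

```shell
# describe_path is a hypothetical helper, not part of Hadoop.
# Assumed format specifiers (verify against your Hadoop version):
#   %n name, %F type, %u owner, %g group, %y modification date
describe_path() {
    hadoop fs -stat "%n %F %u:%g %y" "$1"
}

# Example call (needs a running cluster):
# describe_path /user/hadoop/
```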

Display the size of files and directories in HDFS - du

The du command displays the aggregate length of the files contained in a directory, or the length of the file if the path is just a file.

Usage:
hadoop fs -du URI [URI ...]
hadoop fs -du /user/hadoop/dir1/xyz.txt
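To get a single total for a directory in a script, du can be combined with awk. This sketch assumes the -s option (summary, not mentioned above) is available in your Hadoop release and that the size is the first column of the output; `hdfs_total_bytes` is a hypothetical helper.

```shell
# hdfs_total_bytes is a hypothetical helper, not part of Hadoop.
hdfs_total_bytes() {
    # -s prints one aggregate line for the path; the first column is the size
    hadoop fs -du -s "$1" | awk '{ print $1 }'
}

# Example call (needs a running cluster):
# hdfs_total_bytes /user/hadoop/dir1
```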

Change group of files in HDFS - chgrp

The hadoop chgrp shell command changes the group association of files. The user must be the owner of the files, or else a super-user.

hadoop fs -chgrp [-R] GROUP URI [URI ...]

Change the permissions of files in HDFS - chmod

The hadoop chmod command is used to change the permissions of files. The user must be the owner of the file, or else a super-user.

hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]
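chgrp and chmod are often used together, for example to hand a directory tree to a team. A minimal sketch, assuming the caller owns the files (or is a super-user); `share_with_group` is a hypothetical helper and the mode 750 is only an example.

```shell
# share_with_group is a hypothetical helper, not part of Hadoop.
# $1 = group name, $2 = HDFS path
share_with_group() {
    # -R applies the change recursively through the directory structure
    hadoop fs -chgrp -R "$1" "$2" &&
        # 750 = owner rwx, group r-x, others no access
        hadoop fs -chmod -R 750 "$2"
}

# Example call (needs a running cluster; "analysts" is an illustrative group):
# share_with_group analysts /user/hadoop/dir1
```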

Change the owner of files in HDFS - chown

The hadoop chown command is used to change the ownership of files. The user must be a super-user.

hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]

Help for an individual HDFS command - usage

The command below returns the usage help for an individual command.

hadoop fs -usage command
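To look up several commands at once, the usage call can be put in a loop. `show_usage` is a hypothetical helper; the command names are passed without the leading dash.

```shell
# show_usage is a hypothetical helper, not part of Hadoop.
show_usage() {
    for cmd in "$@"; do
        # prints the usage line for one fs command, e.g. mkdir
        hadoop fs -usage "$cmd"
    done
}

# Example call (needs the hadoop client installed):
# show_usage mkdir ls put
```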


posted @ 2018-09-26 11:18  随笔`