Basic HDFS commands – Demo

Hi BigDs,

I want to end this weekend with a few basic HDFS commands my instructor taught me today. Playing with files is always exciting!

HDFS

Let’s create an input folder on the local file system and put a file in it.

hadoop@gandhari:~$ mkdir input
hadoop@gandhari:~$ cd input
hadoop@gandhari:~/input$ vi rivers.txt
hadoop@gandhari:~/input$ cd ..
hadoop@gandhari:~$ ls input
rivers.txt

We shall use this rivers.txt throughout this assignment.

List

We use the ls command to list the contents of an HDFS folder.

hadoop@gandhari:~$ hadoop fs -ls /
Found 5 items
-rw-r--r--   1 hadoop supergroup       1749 2016-08-24 06:01 /data
drwxr-xr-x   - hadoop supergroup          0 2016-09-02 14:32 /hbase
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:53 /pigdata
drwxrwx---   - hadoop supergroup          0 2016-08-24 16:14 /tmp
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:56 /user

hadoop@gandhari:~$ hadoop fs -ls /user
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2016-08-26 17:40 /user/hadoop
drwxr-xr-x   - hadoop supergroup          0 2016-08-22 15:17 /user/hive
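
A listing like the ones above is easy to post-process with standard tools. The following is only a sketch: hdfs_dirs_only is a made-up helper name, and it assumes that paths contain no spaces, so the path is always the last field. It keeps just the directories (lines whose permission string starts with d).

```shell
#!/bin/sh
# Print only the directory paths from `hadoop fs -ls` output read on stdin.
# Assumption: paths contain no spaces, so the path is the last field.
hdfs_dirs_only() {
    awk '$1 ~ /^d/ { print $NF }'
}

# With a cluster available this would be:  hadoop fs -ls / | hdfs_dirs_only
# Here the listing shown above is fed in directly.
hdfs_dirs_only <<'EOF'
Found 5 items
-rw-r--r--   1 hadoop supergroup       1749 2016-08-24 06:01 /data
drwxr-xr-x   - hadoop supergroup          0 2016-09-02 14:32 /hbase
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:53 /pigdata
drwxrwx---   - hadoop supergroup          0 2016-08-24 16:14 /tmp
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:56 /user
EOF
```

For the sample listing this prints /hbase, /pigdata, /tmp and /user, one per line; /data is skipped because it is a plain file.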

mkdir

As the name implies, mkdir creates directories. Let’s create a few in HDFS.

hadoop@gandhari:~$ hadoop fs -mkdir /user/hadoop/trial1 /user/hadoop/trial2

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/
Found 4 items
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:56 /user/hadoop/output
drwxr-xr-x   - hadoop supergroup          0 2016-08-26 17:40 /user/hadoop/share
drwxr-xr-x   - hadoop supergroup          0 2016-09-03 23:35 /user/hadoop/trial1
drwxr-xr-x   - hadoop supergroup          0 2016-09-03 23:35 /user/hadoop/trial2

Now let’s upload the local rivers.txt into trial1 with put.

hadoop@gandhari:~$ hadoop fs -put input/rivers.txt /user/hadoop/trial1
hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial1
Found 1 items
-rw-r--r--   3 hadoop supergroup        184 2016-09-03 23:37 /user/hadoop/trial1/rivers.txt
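
The create-then-upload steps can be wrapped in a small function. This is a sketch under assumptions: upload_to_hdfs is a hypothetical helper name, and it relies on the -p option of mkdir (create missing parents, no error if the directory exists) and the -f option of put (overwrite an existing file), which are not used elsewhere in this post.

```shell
#!/bin/sh
# upload_to_hdfs LOCAL_FILE HDFS_DIR
# Hypothetical helper: make sure the HDFS directory exists, then upload.
upload_to_hdfs() {
    local_file=$1
    hdfs_dir=$2
    if [ ! -f "$local_file" ]; then
        echo "no such local file: $local_file" >&2
        return 1
    fi
    # -p creates missing parents and does not fail if the directory exists.
    hadoop fs -mkdir -p "$hdfs_dir" || return 1
    # -f overwrites a file of the same name that is already in HDFS.
    hadoop fs -put -f "$local_file" "$hdfs_dir"
}

# Run the example only when hadoop is installed and the file is present.
if command -v hadoop >/dev/null 2>&1 && [ -f input/rivers.txt ]; then
    upload_to_hdfs input/rivers.txt /user/hadoop/trial1
fi
```

Without -f, a second put of the same file would be refused because the target already exists.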

du, df

Here come the commands that report file and disk sizes. du displays the size, in bytes, of the files and directories contained in the given directory, or the size of the file if the path is just a file.

hadoop@gandhari:~$ hadoop fs -du /user/hadoop/trial1
184  /user/hadoop/trial1/rivers.txt

hadoop@gandhari:~$ hadoop fs -du /user/hadoop/
1807       /user/hadoop/output
171430346  /user/hadoop/share
184        /user/hadoop/trial1
0          /user/hadoop/trial2
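
To get one total for a directory, the per-entry sizes can be added up. The sketch below uses a made-up helper name, du_total, and sums the first column of the du output; as far as I know, hadoop fs -du -s reports a similar summary directly.

```shell
#!/bin/sh
# Sum the first column (bytes) of `hadoop fs -du` output read on stdin.
du_total() {
    awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# With a cluster available:  hadoop fs -du /user/hadoop/ | du_total
# Here the output shown above is fed in directly.
du_total <<'EOF'
1807       /user/hadoop/output
171430346  /user/hadoop/share
184        /user/hadoop/trial1
0          /user/hadoop/trial2
EOF
```

For the sample output this prints 171432337, which is 1807 + 171430346 + 184 + 0.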

The df command shows the capacity, used space and free space of the filesystem.

hadoop@gandhari:~$ hadoop fs -df
Filesystem                    Size       Used     Available  Use%
hdfs://gandhari:9000  194200133632  190238720  159151742976    0%
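
The 0% in the last column looks odd next to a non-zero Used value, so here is the arithmetic worked out. It is a sketch: df_breakdown is a hypothetical helper, and reading the leftover bytes as space taken by things other than HDFS data is my assumption, not something df states.

```shell
#!/bin/sh
# df_breakdown SIZE USED AVAILABLE   (all values in bytes)
df_breakdown() {
    size=$1
    used=$2
    avail=$3
    # Percentage used, to two decimals (awk does the floating point).
    pct=$(awk -v u="$used" -v s="$size" 'BEGIN { printf "%.2f", u * 100 / s }')
    # Bytes that are neither HDFS data nor available space.
    other=$((size - used - avail))
    echo "used=${pct}% other=${other}"
}

# Numbers taken from the df output above.
df_breakdown 194200133632 190238720 159151742976
```

This prints used=0.10% other=34858151936. Roughly a tenth of a percent is in use, which df shows as 0%.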

get

get copies (downloads) files from HDFS to the local file system.

Let’s create a local directory to store the downloaded files.

hadoop@gandhari:~$ mkdir downloads
hadoop@gandhari:~$ hadoop fs -get /user/hadoop/trial1/rivers.txt downloads/
hadoop@gandhari:~$ ls -alt downloads/
total 12
drwxrwxr-x  2 hadoop hadoop 4096 Sep  3 23:58 .
-rw-r--r--  1 hadoop hadoop  184 Sep  3 23:58 rivers.txt
drwxr-xr-x 38 hadoop hadoop 4096 Sep  3 23:57 ..
hadoop@gandhari:~$ cat downloads/rivers.txt
Adyar
Amaravati
Arasalar
Bhavani
Bambar
Gomukhi
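
One way to double-check a download is to compare the local copy byte for byte with what HDFS returns. This is a sketch: hdfs_matches_local is a hypothetical helper, and it uses hadoop fs -cat, which streams a file to standard output.

```shell
#!/bin/sh
# hdfs_matches_local HDFS_PATH LOCAL_FILE
# Succeeds when the HDFS file and the local file hold identical bytes.
hdfs_matches_local() {
    hadoop fs -cat "$1" | cmp -s - "$2"
}

# Run the example only when hadoop is installed and the download exists.
if command -v hadoop >/dev/null 2>&1 && [ -f downloads/rivers.txt ]; then
    if hdfs_matches_local /user/hadoop/trial1/rivers.txt downloads/rivers.txt; then
        echo "download matches"
    else
        echo "download differs"
    fi
fi
```

cmp -s prints nothing and only sets the exit status, so the function can be used directly in an if.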

getmerge

getmerge takes a source directory or a set of files in HDFS as input and concatenates them into a single local destination file.

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial1
Found 2 items
-rw-r--r--   3 hadoop supergroup     820459 2016-09-04 00:41 /user/hadoop/trial1/hadoop-hadoop-datanode-gandhari.log
-rw-r--r--   3 hadoop supergroup        184 2016-09-03 23:37 /user/hadoop/trial1/rivers.txt

hadoop@gandhari:~$ hadoop fs -getmerge /user/hadoop/trial1 downloads/mergedContent.txt
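
To picture what ends up in the merged file, here is a purely local stand-in that needs no cluster. It is a sketch, not how Hadoop implements getmerge: merge_dir is a made-up name and the files are appended in shell glob order. With plain concatenation, mergedContent.txt above should come to 820459 + 184 = 820643 bytes.

```shell
#!/bin/sh
# merge_dir SRC_DIR DEST_FILE
# Local stand-in for getmerge: append every regular file in SRC_DIR
# (in shell glob order) to DEST_FILE. DEST_FILE should be outside SRC_DIR.
merge_dir() {
    src_dir=$1
    dest=$2
    : > "$dest"                      # start with an empty destination
    for f in "$src_dir"/*; do
        if [ -f "$f" ]; then
            cat "$f" >> "$dest"
        fi
    done
}

# Demo with two small throwaway files.
work=$(mktemp -d)
mkdir "$work/src"
printf 'Adyar\nAmaravati\n' > "$work/src/a.txt"
printf 'Bhavani\n' > "$work/src/b.txt"
merge_dir "$work/src" "$work/merged.txt"
cat "$work/merged.txt"
rm -rf "$work"
```

The demo prints Adyar, Amaravati and Bhavani on three lines: the contents of a.txt followed by those of b.txt.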

distcp

  • It copies files or directories recursively.
  • It is a tool for large inter-cluster and intra-cluster copies.
  • It uses MapReduce to effect its distribution, error handling and recovery, and reporting.

Before the copy, trial1 holds two files and trial2 is empty.

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial1
Found 2 items
-rw-r--r--   3 hadoop supergroup     820459 2016-09-04 00:41 /user/hadoop/trial1/hadoop-hadoop-datanode-gandhari.log
-rw-r--r--   3 hadoop supergroup        184 2016-09-03 23:37 /user/hadoop/trial1/rivers.txt
hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial2
hadoop@gandhari:~$

Let’s copy the files from trial1 to trial2 with distcp.

hadoop@gandhari:~$ hadoop distcp /user/hadoop/trial1 /user/hadoop/trial2

16/09/04 01:00:30 INFO mapreduce.Job: Job job_local759622795_0001 completed successfully
16/09/04 01:00:30 INFO mapreduce.Job: Counters: 26
        File System Counters
                FILE: Number of bytes read=97844
                FILE: Number of bytes written=357366
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=820643
                HDFS: Number of bytes written=820643
                HDFS: Number of read operations=32
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=5
        Map-Reduce Framework
                Map input records=3
                Map output records=0
                Input split bytes=159
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=0
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
                Total committed heap usage (bytes)=214958080
        File Input Format Counters
                Bytes Read=619
        File Output Format Counters
                Bytes Written=8
        org.apache.hadoop.tools.mapred.CopyMapper$Counter
                BYTESCOPIED=820643
                BYTESEXPECTED=820643
                COPY=3

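Note that the trial1 directory itself was copied into trial2, so the files now live under /user/hadoop/trial2/trial1.

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial2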
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2016-09-04 01:00 /user/hadoop/trial2/trial1

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial2/trial1
Found 2 items
-rw-r--r--   3 hadoop supergroup     820459 2016-09-04 01:00 /user/hadoop/trial2/trial1/hadoop-hadoop-datanode-gandhari.log
-rw-r--r--   3 hadoop supergroup        184 2016-09-04 01:00 /user/hadoop/trial2/trial1/rivers.txt
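
The counters at the end of the job output give a quick sanity check: BYTESCOPIED should equal BYTESEXPECTED (both 820643 here, the combined size of the two files). The sketch below automates that comparison; distcp_bytes_match is a hypothetical helper, and it assumes the counters appear as NAME=value lines as in the log above.

```shell
#!/bin/sh
# distcp_bytes_match: read DistCp job output on stdin and succeed only when
# the BYTESCOPIED and BYTESEXPECTED counters are both present and equal.
distcp_bytes_match() {
    awk -F= '
        /BYTESCOPIED=/   { copied = $2; seen_c = 1 }
        /BYTESEXPECTED=/ { expected = $2; seen_e = 1 }
        END { exit !(seen_c && seen_e && copied == expected) }
    '
}

# Demo using the counter lines from the log above.
if distcp_bytes_match <<'EOF'
                BYTESCOPIED=820643
                BYTESEXPECTED=820643
                COPY=3
EOF
then
    echo "all bytes copied"
else
    echo "byte counts differ or are missing"
fi
```

Against a live job the log would be piped in, for example hadoop distcp SRC DST 2>&1 | distcp_bytes_match. The 2>&1 is there on the assumption that the job log may be written to standard error.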

I’ll see you in another interesting post. Wish you a pleasant week.