HDFS Permissions

The Hadoop Distributed File System (HDFS) implements a permissions model for files and directories that shares much of the POSIX model. Each file and directory is associated with an owner and a group. The file or directory has separate permissions for the user that is the owner, for other users that are members of the group, and for all other users. For files, the r permission is required to read the file, and the w permission is required to write or append to the file. For directories, the r permission is required to list the contents of the directory, the w permission is required to create or delete files or directories, and the x permission is required to access a child of the directory.
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html
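
Permissions can be inspected with ls and changed with chmod, which accepts both octal and symbolic modes. A quick sketch of the idea (the /demo path here is only an illustration, not part of this assignment):

hadoop@gandhari:~$ hadoop fs -chmod 640 /demo/sample.txt
hadoop@gandhari:~$ hadoop fs -chmod g+w /demo
hadoop@gandhari:~$ hadoop fs -ls /demo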

In this assignment we will create a new user and assign that user a folder in HDFS to demonstrate the permission capabilities.


Add a Unix user

hadoop@gandhari:~$ sudo groupadd feeder
hadoop@gandhari:~$ sudo useradd -g feeder -m feeder
hadoop@gandhari:~$ sudo passwd feeder

Create a folder in HDFS and assign permissions

hadoop@gandhari:~$ hadoop fs -mkdir /feeder
hadoop@gandhari:~$ hadoop fs -chown -R feeder:feeder /feeder
hadoop@gandhari:~$ hadoop fs -ls /
Found 6 items
-rw-r--r--   1 hadoop supergroup       1749 2016-08-24 06:01 /data
drwxr-xr-x   - feeder feeder              0 2016-09-05 15:34 /feeder
drwxr-xr-x   - hadoop supergroup          0 2016-09-05 15:15 /hbase
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:53 /pigdata
drwxrwx---   - hadoop supergroup          0 2016-08-24 16:14 /tmp
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:56 /user

We need to enable permissions in hdfs-site.xml:

hadoop@gandhari:~$ vi etc/hadoop/hdfs-site.xml
        <property>
                <name>dfs.permissions</name>
                <value>true</value>
        </property>
        <property>
                <name>dfs.permissions.enabled</name>
                <value>true</value>
        </property>

After this change, we need to restart the DFS daemons.

hadoop@gandhari:~$ stop-dfs.sh
hadoop@gandhari:~$ start-dfs.sh

Let’s test the permissions using another user, kannan, who does not have write permission to /data/feeder.

kannan@gandhari:~$ /opt/hadoop/bin/hadoop fs -put javashine.xml /data/feeder
put: Permission denied: user=kannan, access=EXECUTE, inode="/data":hadoop:supergroup:-rw-r--r--
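
For comparison, the owner feeder should be allowed to write into /feeder, which that user owns. A sketch of the expected commands (not captured output), assuming feeder calls the hadoop binary by its full path just as kannan does:

feeder@gandhari:~$ /opt/hadoop/bin/hadoop fs -put javashine.xml /feeder
feeder@gandhari:~$ /opt/hadoop/bin/hadoop fs -ls /feeder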

See you in another interesting post!

Basic HDFS commands – Demo

Hi BigDs,

I want to end this weekend with a few basic HDFS commands my instructor taught me today. Playing with files is always exciting!


Let’s create an input folder and copy some content into it.

hadoop@gandhari:~$ mkdir input
hadoop@gandhari:~$ cd input
hadoop@gandhari:~/input$ vi rivers.txt
hadoop@gandhari:~$ ls input
rivers.txt

We shall use this rivers.txt for this interesting assignment.

List

We use the ls command to list the contents of an HDFS folder.

hadoop@gandhari:~$ hadoop fs -ls /
Found 5 items
-rw-r--r--   1 hadoop supergroup       1749 2016-08-24 06:01 /data
drwxr-xr-x   - hadoop supergroup          0 2016-09-02 14:32 /hbase
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:53 /pigdata
drwxrwx---   - hadoop supergroup          0 2016-08-24 16:14 /tmp
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:56 /user

hadoop@gandhari:~$ hadoop fs -ls /user
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2016-08-26 17:40 /user/hadoop
drwxr-xr-x   - hadoop supergroup          0 2016-08-22 15:17 /user/hive

mkdir

As the name implies, let’s create a few directories in HDFS.

hadoop@gandhari:~$ hadoop fs -mkdir /user/hadoop/trial1 /user/hadoop/trial2

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/
Found 4 items
drwxr-xr-x   - hadoop supergroup          0 2016-08-24 13:56 /user/hadoop/output
drwxr-xr-x   - hadoop supergroup          0 2016-08-26 17:40 /user/hadoop/share
drwxr-xr-x   - hadoop supergroup          0 2016-09-03 23:35 /user/hadoop/trial1
drwxr-xr-x   - hadoop supergroup          0 2016-09-03 23:35 /user/hadoop/trial2

hadoop@gandhari:~$ hadoop fs -put input/rivers.txt /user/hadoop/trial1
hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial1
Found 1 items
-rw-r--r--   3 hadoop supergroup        184 2016-09-03 23:37 /user/hadoop/trial1/rivers.txt
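
We can also print the uploaded file straight from HDFS with cat; the output should match the local rivers.txt (shown here as a sketch, output omitted):

hadoop@gandhari:~$ hadoop fs -cat /user/hadoop/trial1/rivers.txt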

du, df

Here come the commands to report disk usage and file sizes. du displays the size of the files and directories contained in the given directory, or the size of a single file if the path is just a file.

hadoop@gandhari:~$ hadoop fs -du /user/hadoop/trial1
184  /user/hadoop/trial1/rivers.txt

hadoop@gandhari:~$ hadoop fs -du /user/hadoop/
1807       /user/hadoop/output
171430346  /user/hadoop/share
184        /user/hadoop/trial1
0          /user/hadoop/trial2

The df command shows the capacity, used, and available space of the filesystem.

hadoop@gandhari:~$ hadoop fs -df
Filesystem                    Size       Used     Available  Use%
hdfs://gandhari:9000  194200133632  190238720  159151742976    0%
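
Both du and df accept the -h option for human-readable sizes, and du -s prints a single summary line for the whole path. A brief illustration (output omitted):

hadoop@gandhari:~$ hadoop fs -du -s -h /user/hadoop/
hadoop@gandhari:~$ hadoop fs -df -h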

get

The get command copies (downloads) files from HDFS to the local file system.

Let’s create a local directory to store the downloaded files.

hadoop@gandhari:~$ mkdir downloads
hadoop@gandhari:~$ hadoop fs -get /user/hadoop/trial1/rivers.txt downloads/
hadoop@gandhari:~$ ls -alt downloads/
total 12
drwxrwxr-x  2 hadoop hadoop 4096 Sep  3 23:58 .
-rw-r--r--  1 hadoop hadoop  184 Sep  3 23:58 rivers.txt
drwxr-xr-x 38 hadoop hadoop 4096 Sep  3 23:57 ..
hadoop@gandhari:~$ cat downloads/rivers.txt
Adyar
Amaravati
Arasalar
Bhavani
Bambar
Gomukhi

getmerge

getmerge takes a source directory or file(s) as input and concatenates the files in src into a single local destination file.

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial1
Found 2 items
-rw-r--r--   3 hadoop supergroup     820459 2016-09-04 00:41 /user/hadoop/trial1/hadoop-hadoop-datanode-gandhari.log
-rw-r--r--   3 hadoop supergroup        184 2016-09-03 23:37 /user/hadoop/trial1/rivers.txt

hadoop@gandhari:~$ hadoop fs -getmerge /user/hadoop/trial1 downloads/mergedContent.txt
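
Since getmerge simply concatenates, the merged file should be the sum of the two source files (820459 + 184 = 820643 bytes). This can be checked locally (a sketch, output omitted):

hadoop@gandhari:~$ ls -l downloads/mergedContent.txt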

distcp

  • Copies files or directories recursively
  • It is a tool used for large inter- and intra-cluster copying
  • It uses MapReduce to effect its distribution, error handling and recovery, and reporting

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial1
Found 2 items
-rw-r--r--   3 hadoop supergroup     820459 2016-09-04 00:41 /user/hadoop/trial1/hadoop-hadoop-datanode-gandhari.log
-rw-r--r--   3 hadoop supergroup        184 2016-09-03 23:37 /user/hadoop/trial1/rivers.txt
hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial2
hadoop@gandhari:~$

Let’s copy the files from trial1 to trial2.

hadoop@gandhari:~$ hadoop distcp /user/hadoop/trial1 /user/hadoop/trial2

16/09/04 01:00:30 INFO mapreduce.Job: Job job_local759622795_0001 completed successfully
16/09/04 01:00:30 INFO mapreduce.Job: Counters: 26
        File System Counters
                FILE: Number of bytes read=97844
                FILE: Number of bytes written=357366
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=820643
                HDFS: Number of bytes written=820643
                HDFS: Number of read operations=32
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=5
        Map-Reduce Framework
                Map input records=3
                Map output records=0
                Input split bytes=159
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=0
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
                Total committed heap usage (bytes)=214958080
        File Input Format Counters
                Bytes Read=619
        File Output Format Counters
                Bytes Written=8
        org.apache.hadoop.tools.mapred.CopyMapper$Counter
                BYTESCOPIED=820643
                BYTESEXPECTED=820643
                COPY=3

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial2
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2016-09-04 01:00 /user/hadoop/trial2/trial1

hadoop@gandhari:~$ hadoop fs -ls /user/hadoop/trial2/trial1
Found 2 items
-rw-r--r--   3 hadoop supergroup     820459 2016-09-04 01:00 /user/hadoop/trial2/trial1/hadoop-hadoop-datanode-gandhari.log
-rw-r--r--   3 hadoop supergroup        184 2016-09-04 01:00 /user/hadoop/trial2/trial1/rivers.txt
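
distcp is more commonly used between clusters by giving fully qualified NameNode URIs. A sketch, assuming a second cluster whose NameNode listens at remote-nn:9000 (a hypothetical host):

hadoop@gandhari:~$ hadoop distcp hdfs://gandhari:9000/user/hadoop/trial1 hdfs://remote-nn:9000/user/hadoop/trial1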

I’ll see you in another interesting post. Wish you a pleasant week.