Increase the partition size of a Unix VM in XenServer

Distribution: Ubuntu 18.04

This is how I increased the disk partition size of my Ubuntu virtual machine running in XenServer.

Using fdisk, view the partition information, delete the old partition, add a new partition with the extended size, and write the partition table. This is the workflow followed below. Please note that the changes are not committed to disk until you confirm them with the w (write) command.


# fdisk /dev/xvda
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): p

Disk /dev/xvda: 429.5 GB, 429496729600 bytes, 838860800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b3935

Device Boot Start End Blocks Id System
/dev/xvda1 * 2048 2099199 1048576 83 Linux
/dev/xvda2 2099200 419430399 208665600 8e Linux LVM

Command (m for help): d
Partition number (1,2, default 2): 2
Partition 2 is deleted

Command (m for help): p

Disk /dev/xvda: 429.5 GB, 429496729600 bytes, 838860800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b3935

Device Boot Start End Blocks Id System
/dev/xvda1 * 2048 2099199 1048576 83 Linux

Command (m for help): ^C

(I quit this first session with Ctrl-C without writing the changes, so the deletion was discarded; when fdisk is restarted below, both partitions are still present.)

# fdisk /dev/xvda
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): p

Disk /dev/xvda: 429.5 GB, 429496729600 bytes, 838860800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b3935

Device Boot Start End Blocks Id System
/dev/xvda1 * 2048 2099199 1048576 83 Linux
/dev/xvda2 2099200 419430399 208665600 8e Linux LVM

Command (m for help): m
Command action
a toggle a bootable flag
b edit bsd disklabel
c toggle the dos compatibility flag
d delete a partition
g create a new empty GPT partition table
G create an IRIX (SGI) partition table
l list known partition types
m print this menu
n add a new partition
o create a new empty DOS partition table
p print the partition table
q quit without saving changes
s create a new empty Sun disklabel
t change a partition's system id
u change display/entry units
v verify the partition table
w write table to disk and exit
x extra functionality (experts only)

Command (m for help): p

Disk /dev/xvda: 429.5 GB, 429496729600 bytes, 838860800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b3935

Device Boot Start End Blocks Id System
/dev/xvda1 * 2048 2099199 1048576 83 Linux
/dev/xvda2 2099200 419430399 208665600 8e Linux LVM

Command (m for help): d
Partition number (1,2, default 2):
Partition 2 is deleted

Command (m for help): n
Partition type:
p primary (1 primary, 0 extended, 3 free)
e extended
Select (default p): p
Partition number (2-4, default 2):
First sector (2099200-838860799, default 2099200):
Using default value 2099200
Last sector, +sectors or +size{K,M,G} (2099200-838860799, default 838860799):
Using default value 838860799
Partition 2 of type Linux and of size 399 GiB is set

Command (m for help): p

Disk /dev/xvda: 429.5 GB, 429496729600 bytes, 838860800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b3935

Device Boot Start End Blocks Id System
/dev/xvda1 * 2048 2099199 1048576 83 Linux
/dev/xvda2 2099200 838860799 418380800 83 Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
# fdisk -l

Disk /dev/xvda: 429.5 GB, 429496729600 bytes, 838860800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b3935

Device Boot Start End Blocks Id System
/dev/xvda1 * 2048 2099199 1048576 83 Linux
/dev/xvda2 2099200 838860799 418380800 83 Linux

Disk /dev/mapper/cl_mgr-root: 53.7 GB, 53687091200 bytes, 104857600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/cl_mgr-swap: 3892 MB, 3892314112 bytes, 7602176 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/cl_mgr-home: 156.1 GB, 156086829056 bytes, 304857088 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
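
As the warning says, the kernel keeps using the old table until a reboot. Instead of rebooting, something like the following should apply the change and then grow the LVM physical volume (a sketch, assuming the cl_mgr volumes above live on /dev/xvda2; note also that the recreated partition defaulted to type 83 rather than the original 8e, which LVM tolerates since it finds its physical volume by on-disk signature):

# partprobe /dev/xvda
# pvresize /dev/xvda2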


ftp mput without confirmation in shell script

Hi,

I need to upload the output of my MapReduce job to my web server using FTP. Here is how I uploaded all the files in the output folder to the FTP server with a single script.


NOW=$(date +"%Y-%m-%d")
HTML_OUTPUT_FOLDER=output/output_html/$NOW
mkdir -p "$HTML_OUTPUT_FOLDER"

HOST='myftpsite'
USER='myftpuser'
PASSWD='myftppassword'
REMOTE_FOLDER=public_html/nandu/$NOW

cd "$HTML_OUTPUT_FOLDER" || exit 1
echo "Current directory:"
pwd
echo "Files will be uploaded to $REMOTE_FOLDER"

# -i: transfer multiple files without a per-file prompt
# -n: skip auto-login; we send USER/PASS explicitly below
ftp -i -n "$HOST" <<END_SCRIPT
quote USER $USER
quote PASS $PASSWD
mkdir $REMOTE_FOLDER
cd $REMOTE_FOLDER
mput *.*
quit
END_SCRIPT

exit 0

note – the -i flag makes mput transfer the files without prompting for each one; -n suppresses the automatic login so the script can send USER and PASS itself.

Oozie mkdistro fails with mvn: command not found

Oozie installation is not as straightforward as that of other applications. When I executed the mkdistro script, it failed with the error below:

./mkdistro.sh
./mkdistro.sh: line 71: mvn: command not found

Maven is a build tool for Java, and it was not installed in my Ubuntu VM. Hence we need to install it with the command below to make the script work:

hadoop@gandhari:~/oozie/bin$ sudo apt-get install maven
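
Once the installation finishes, a quick sanity check before rerunning the script (a sketch; the exact version output will vary):

hadoop@gandhari:~/oozie/bin$ which mvn
/usr/bin/mvn
hadoop@gandhari:~/oozie/bin$ mvn -version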


Hadoop – pseudo-distributed mode installation – second time

I had been waiting for a computing machine for Hadoop. Unfortunately, I couldn't get one for the past two months due to multiple commitments. Kannan and I visited one of the local computer stores two weeks ago. I selected a Dell tower desktop, but at the desired configuration (i7/16 GB RAM/500 GB) it went over my budget.

I lost hope and postponed the plan. Then I got an older-model laptop with a high configuration at a local expo. It doesn't have modern features like a touch screen or an SSD, but I'm OK with that. I named it after Jeyamohan's novel on Krishna – Neelam! (Neelam = Blue)


Here are the steps I followed to create the Hadoop environment. This is my second post on the environment setup, and it is more precise than my earlier one.

System Specs

  • OS: Ubuntu 64 bit/VMware Workstation Player
  • RAM: 8 GB
  • CPU: 4
  • Java: 1.8
  • Hadoop: 2.6

Update Ubuntu

Let's update Ubuntu before starting the process. This may take a while, depending on how frequently you update.

The following command will update the package definitions.

pandian@kunthi:~$ sudo apt-get update
...
...
Fetched 1,646 kB in 8s (204 kB/s)
AppStream cache update completed, but some metadata was ignored due to errors.
Reading package lists... Done

The following command will upgrade the installed packages:

pandian@kunthi:~$ sudo apt-get dist-upgrade
...
...
355 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 295 MB/465 MB of archives.
After this operation, 279 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
...
...

<It is time consuming. Take a break.>

Installing JDK

Following http://askubuntu.com/questions/521145/how-to-install-oracle-java-on-ubuntu-14-04, install the JDK:

pandian@kunthi:~$ sudo apt-add-repository ppa:webupd8team/java
pandian@kunthi:~$ sudo apt-get update
pandian@kunthi:~$ sudo apt-get install oracle-java8-installer
pandian@kunthi:~$ java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) Client VM (build 25.101-b13, mixed mode)
pandian@kunthi:~$ whereis java
java: /usr/bin/java /usr/share/java /usr/share/man/man1/java.1.gz

Create User and User Group

Let’s run Hadoop with its own user and user group.

pandian@kunthi:~$ sudo groupadd -g 599 hadoop
pandian@kunthi:~$ sudo useradd -u 599 -g 599 hadoop
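
A quick verification that the user and group were created with the requested IDs (not part of my original run):

pandian@kunthi:~$ getent group hadoop
pandian@kunthi:~$ getent passwd hadoop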

Directory structure

Let’s create the directory structure

pandian@kunthi:~$ sudo mkdir -p /var/lib/hadoop/journaldata
pandian@kunthi:~$ sudo chown hadoop:hadoop -R /var/lib/hadoop/journaldata

User access and sudo privilege

We are still doing Linux tasks; we haven't touched the Hadoop part yet.

pandian@kunthi:~$ sudo passwd hadoop
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
pandian@kunthi:/opt/software/hadoop$ sudo su
root@kunthi:/home/pandian# cp /etc/sudoers /etc/sudoers.20160820
root@kunthi:~# vi /etc/sudoers

I added the hadoop line shown below:

# User privilege specification
root ALL=(ALL:ALL) ALL
hadoop ALL=(ALL:ALL) ALL
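
After editing, it is worth validating the file, since a broken sudoers can lock you out of sudo. A sketch using the stock tools (not from my original run):

root@kunthi:~# visudo -c
root@kunthi:~# sudo -l -U hadoop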

root@kunthi:~# cd /opt

root@kunthi:~# wget http://download.nus.edu.sg/mirror/apache/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

root@kunthi:~# gunzip hadoop-2.6.4.tar.gz

root@kunthi:~# tar -xvf hadoop-2.6.4.tar.gz
root@gandhari:/opt# ln -s /opt/hadoop-2.6.4 hadoop
root@gandhari:/opt# chown hadoop:hadoop hadoop
root@gandhari:/opt# chown hadoop:hadoop -R hadoop-2.6.4
root@gandhari:/opt# usermod -d /opt/hadoop hadoop

root@kunthi:~# exit
pandian@kunthi:~$ su - hadoop
$ pwd
/opt/hadoop
$ bash
hadoop@kunthi:~$ id
uid=1001(hadoop) gid=599(hadoop) groups=599(hadoop)

Hadoop

Let's create the configuration directory for Hadoop.
hadoop@kunthi:~$ sudo mkdir -p /etc/hadoop/conf
Create a soft link for the conf folder:
hadoop@kunthi:~$ sudo ln -s /opt/hadoop/hadoop-2.6.4/etc/hadoop/** /etc/hadoop/conf/

SSH Keys creation

Hadoop needs key-based SSH login.
hadoop@kunthi:~$ mkdir ~/.ssh
hadoop@kunthi:~$ cd ~/.ssh/
hadoop@kunthi:~/.ssh$ touch authorized_keys
hadoop@kunthi:~/.ssh$ touch known_hosts
hadoop@kunthi:~/.ssh$ chmod 700 ~/.ssh/ && chmod 600 ~/.ssh/*
hadoop@gandhari:/opt/hadoop-2.6.4$ ssh gandhari
The authenticity of host 'gandhari (192.168.0.169)' can't be established.
ECDSA key fingerprint is SHA256:Y/ed5Le/5xqY1ImoVZBsSF7irydJRUn2TNwPBow4uSA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'gandhari,192.168.0.169' (ECDSA) to the list of known hosts.
hadoop@gandhari's password:
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-34-generic x86_64)

Bash profile – Environment variables

As I created the home folder of this Unix user manually, I need to create the bash profile myself. I'll copy the working bash profile from another user:
hadoop@kunthi:~$ sudo cp /home/pandian/.bash* .
I'll add the following environment variables to .bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_HOME=/opt/hadoop/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export HADOOP_USER_CLASSPATH_FIRST=true
export HADOOP_PREFIX=$HADOOP_HOME
export JAVA_HOME HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME PATH HADOOP_LOG_DIR

Let's apply the changes to the current session:
hadoop@kunthi:~$ source ~/.bashrc

Hadoop env config

Let’s specify JAVA_HOME
hadoop@kunthi:~/hadoop/etc/hadoop$ cd $HADOOP_HOME/etc/hadoop/
hadoop@kunthi:~/hadoop/etc/hadoop$ cp hadoop-env.sh hadoop-env.sh.20160821

I made the following changes to hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-oracle/

Set up passwordless SSH login

hadoop@kunthi:~/hadoop$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/opt/hadoop/.ssh/id_rsa):
Your identification has been saved in /opt/hadoop/.ssh/id_rsa.
Your public key has been saved in /opt/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:UXGO3tnfK9K8DayD0/jc+T/WgZetCHOuBAcssUw3gBo hadoop@kunthi
The key's randomart image is:
+---[RSA 2048]----+
| .+.o o.. |
| E .o = o + |
| o + + . . |
| . . + . o |
| S o o o o|
| oo o. =o|
| ==o+..=|
| =.+=+=oo|
| +=o+=++|
+----[SHA256]-----+
hadoop@kunthi:~/hadoop$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoop@kunthi:~/hadoop$ sudo /etc/init.d/ssh restart
[ ok ] Restarting ssh (via systemctl): ssh.service.
hadoop@kunthi:~/hadoop$ ssh hadoop@gandhari
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-34-generic x86_64)

Temp folders for Hadoop

hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chmod 750 /var/lib/hadoop/cache/
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/name/
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs

$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/namesecondary

$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/data

hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/mapred/local
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/mapred/local/
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/namesecondary/

hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop -R /etc/hadoop/

Define the slave name

Add the slave hostname. In pseudo-distributed mode the slave is this same machine, so it matches my hostname:
hadoop@kunthi:~/hadoop$ cat /etc/hadoop/conf/slaves
kunthi

core-site.xml

Make the appropriate changes to core-site.xml:

hadoop@kunthi:~/hadoop$ cat etc/hadoop/core-site.xml
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://gandhari:9000</value>
        </property>
</configuration>
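
Since the environment variables are already in place, the value can be confirmed with hdfs getconf (a quick check, not from my original run):

hadoop@kunthi:~/hadoop$ hdfs getconf -confKey fs.defaultFS
hdfs://gandhari:9000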

hadoop executable

Check whether the hadoop command works. It is located in the $HADOOP_HOME/bin folder.

hadoop@kunthi:~$ cd $HADOOP_HOME
hadoop@kunthi:~/hadoop$ hadoop version
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /opt/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar

hdfs-site.xml

hadoop@kunthi:~/hadoop$ cp etc/hadoop/hdfs-site.xml etc/hadoop/hdfs-site.xml.20160820

I made the following changes. (dfs.name.dir and dfs.data.dir are the older property names; Hadoop 2 also accepts dfs.namenode.name.dir and dfs.datanode.data.dir.)

<configuration>
        <property>
                <name>dfs.name.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/data</value>
        </property>
</configuration>

Formatting and starting the namenode

hadoop@kunthi:~/hadoop$ hadoop namenode -format
.....
16/08/20 09:15:09 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = gandhari/192.168.0.169
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.4
....
16/08/20 09:15:10 INFO common.Storage: Storage directory /var/lib/hadoop/cache/hadoop/dfs/name has been successfully formatted.
....
16/08/20 09:15:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at gandhari/192.168.0.169
************************************************************/
hadoop@kunthi:~/hadoop/sbin$ start-dfs.sh
hadoop@kunthi:~/hadoop/sbin$ start-yarn.sh
hadoop@kunthi:~/hadoop/sbin$ jps
6290 DataNode
6707 NodeManager
6599 ResourceManager
6459 SecondaryNameNode
6155 NameNode
7003 Jps
hadoop@kunthi:~/hadoop/sbin$ ./mr-jobhistory-daemon.sh start historyserver
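
With all the daemons up, a small smoke test confirms HDFS accepts commands (a sketch with a hypothetical path, not from my original run):

hadoop@kunthi:~/hadoop/sbin$ hadoop fs -mkdir -p /user/hadoop
hadoop@kunthi:~/hadoop/sbin$ hadoop fs -ls /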

Access the job history server, namenode and datanode web UIs in your browser:

Job History: http://gandhari:19888/

Name Node: http://gandhari:50070/

Data Node: http://gandhari:50075/

All applications: http://gandhari:8088/cluster

Hadoop Pseudo-Distributed Mode – Setup – Ubuntu – old post. Do not use

OLD POST.. DO NOT USE.

USE MY SECOND POST FOR HADOOP INSTALLATION

Here is the summary of Hadoop pseudo-distributed mode installation. This is my 2nd post regarding the environment setup.

System Specs

  • OS: Ubuntu 32 bit/VirtualBox VM
  • RAM: 4 GB
  • CPU: 1
  • Java: 1.8
  • Hadoop: 2.6

Update Ubuntu

Let's update Ubuntu before starting the process. This may take a while, depending on how frequently you update.

The following command will update the package definitions.

pandian@kunthi:~$ sudo apt-get update
...
...
Fetched 1,646 kB in 8s (204 kB/s)
AppStream cache update completed, but some metadata was ignored due to errors.
Reading package lists... Done

The following command will upgrade the installed packages:

pandian@kunthi:~$ sudo apt-get dist-upgrade
...
...
355 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 295 MB/465 MB of archives.
After this operation, 279 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
...
...

<It is time consuming. Take a break.>

Installing JDK

Following http://askubuntu.com/questions/521145/how-to-install-oracle-java-on-ubuntu-14-04, install the JDK:

pandian@kunthi:~$ sudo apt-add-repository ppa:webupd8team/java
pandian@kunthi:~$ sudo apt-get update
pandian@kunthi:~$ sudo apt-get install oracle-java8-installer
pandian@kunthi:~$ java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) Client VM (build 25.101-b13, mixed mode)
pandian@kunthi:~$ whereis java
java: /usr/bin/java /usr/share/java /usr/share/man/man1/java.1.gz

Create User and User Group

Let’s run Hadoop with its own user and user group.

pandian@kunthi:~$ sudo groupadd -g 599 hadoop
pandian@kunthi:~$ sudo useradd -u 599 -g 599 hadoop

Directory structure

Let’s create the directory structure

pandian@kunthi:~$ sudo mkdir -p /opt/hadoop
pandian@kunthi:~$ sudo chown hadoop:hadoop -R /opt/hadoop
pandian@kunthi:~$ sudo mkdir -p /var/lib/hadoop/journaldata
pandian@kunthi:~$ sudo chown hadoop:hadoop -R /var/lib/hadoop/journaldata

User access and sudo privilege

We are still doing Linux tasks; we haven't touched the Hadoop part yet.

pandian@kunthi:~$ sudo passwd hadoop
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
pandian@kunthi:~$ sudo usermod -d /opt/hadoop hadoop
pandian@kunthi:/opt/software/hadoop$ sudo su
root@kunthi:/home/pandian# cp /etc/sudoers /etc/sudoers.20160820
root@kunthi:~# vi /etc/sudoers

I added the hadoop line shown below:

# User privilege specification
root ALL=(ALL:ALL) ALL
hadoop ALL=(ALL:ALL) ALL
root@kunthi:~# exit
pandian@kunthi:~$ su - hadoop
$ pwd
/opt/hadoop
$ bash
hadoop@kunthi:~$ id
uid=1001(hadoop) gid=599(hadoop) groups=599(hadoop)

Hadoop package download

I copied the download link from http://hadoop.apache.org/releases.html. Here is how to download it:

hadoop@kunthi:~$ wget http://download.nus.edu.sg/mirror/apache/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

The downloaded file is saved in the hadoop directory.
hadoop@kunthi:~$ ls -alt
total 24
-rw-rw-r-- 1 hadoop hadoop 15339 Aug 20 07:43 hadoop-2.6.4.tar.gz
hadoop@kunthi:~$ gunzip hadoop-2.6.4.tar.gz
hadoop@kunthi:~$ tar -xvf hadoop-2.6.4.tar

This extracts the tar file to a new location, /opt/hadoop/hadoop-2.6.4. Here is the content of the folder:
hadoop@kunthi:~$ ls -alt hadoop-2.6.4
total 60
drwxr-xr-x 3 hadoop hadoop 4096 Aug 20 07:53 ..
drwxr-xr-x 9 hadoop hadoop 4096 Feb 12 2016 .
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 bin
drwxr-xr-x 3 hadoop hadoop 4096 Feb 12 2016 etc
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 include
drwxr-xr-x 3 hadoop hadoop 4096 Feb 12 2016 lib
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 libexec
-rw-r--r-- 1 hadoop hadoop 15429 Feb 12 2016 LICENSE.txt
-rw-r--r-- 1 hadoop hadoop 101 Feb 12 2016 NOTICE.txt
-rw-r--r-- 1 hadoop hadoop 1366 Feb 12 2016 README.txt
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 sbin
drwxr-xr-x 4 hadoop hadoop 4096 Feb 12 2016 share

Let's create the configuration directory for Hadoop.
hadoop@kunthi:~$ sudo mkdir -p /etc/hadoop/conf
Create a soft link for the conf folder:
hadoop@kunthi:~$ sudo ln -s /opt/hadoop/hadoop-2.6.4/etc/hadoop/** /etc/hadoop/conf/
hadoop@kunthi:~$ ln -s hadoop-2.6.4 hadoop

SSH Keys creation

Hadoop needs key-based SSH login.
hadoop@kunthi:~$ mkdir ~/.ssh
hadoop@kunthi:~$ cd ~/.ssh/
hadoop@kunthi:~/.ssh$ touch authorized_keys
hadoop@kunthi:~/.ssh$ touch known_hosts
hadoop@kunthi:~/.ssh$ chmod 700 ~/.ssh/ && chmod 600 ~/.ssh/*
hadoop@kunthi:~/.ssh$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:Fj6op9qzbfodhsQTmpQJ17G/mcAvu541bTMTb3huhPg.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
hadoop@localhost's password:
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-31-generic i686)

Bash profile – Environment variables

As I created the home folder of this Unix user manually, I need to create the bash profile myself. I'll copy the working bash profile from another user:
hadoop@kunthi:~$ cp /home/pandian/.bash* .
I'll add the following environment variables to .bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_HOME=/opt/hadoop/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export HADOOP_USER_CLASSPATH_FIRST=true
export HADOOP_PREFIX=$HADOOP_HOME
export JAVA_HOME HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME PATH HADOOP_LOG_DIR

Let's apply the changes to the current session:
hadoop@kunthi:~$ source ~/.bashrc

Hadoop env config

Let’s specify JAVA_HOME
hadoop@kunthi:~/hadoop/etc/hadoop$ cd $HADOOP_HOME/etc/hadoop/
hadoop@kunthi:~/hadoop/etc/hadoop$ cp hadoop-env.sh hadoop-env.sh.20160820

I made the following changes to hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-oracle/

Set up passwordless SSH login

hadoop@kunthi:~/hadoop$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/opt/hadoop/.ssh/id_rsa):
Your identification has been saved in /opt/hadoop/.ssh/id_rsa.
Your public key has been saved in /opt/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:UXGO3tnfK9K8DayD0/jc+T/WgZetCHOuBAcssUw3gBo hadoop@kunthi
The key's randomart image is:
+---[RSA 2048]----+
| .+.o o.. |
| E .o = o + |
| o + + . . |
| . . + . o |
| S o o o o|
| oo o. =o|
| ==o+..=|
| =.+=+=oo|
| +=o+=++|
+----[SHA256]-----+
hadoop@kunthi:~/hadoop$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoop@kunthi:~/hadoop$ sudo /etc/init.d/ssh restart
[ ok ] Restarting ssh (via systemctl): ssh.service.
hadoop@kunthi:~/hadoop$ ssh hadoop@kunthi
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-31-generic i686)

Define the slave name

Add the slave hostname. In pseudo-distributed mode the slave is this same machine, so it matches my hostname:
hadoop@kunthi:~/hadoop$ cat /etc/hadoop/conf/slaves
kunthi

core-site.xml

Make the appropriate changes to core-site.xml:
hadoop@kunthi:~/hadoop$ cat etc/hadoop/core-site.xml
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://kunthi:9000</value>
        </property>
</configuration>

hdfs-site.xml

hadoop@kunthi:~$ cd $HADOOP_HOME
hadoop@kunthi:~/hadoop$ hadoop version
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /opt/hadoop/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar
hadoop@kunthi:~/hadoop$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@kunthi:~/hadoop$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@kunthi:~/hadoop$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@kunthi:~/hadoop$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@kunthi:~/hadoop$ cp etc/hadoop/hdfs-site.xml etc/hadoop/hdfs-site.xml.20160820

I made the following changes:
<configuration>
        <property>
                <name>dfs.name.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/data</value>
        </property>
</configuration>

Formatting and starting the namenode

hadoop@kunthi:~/hadoop$ hadoop namenode -format
.....
16/08/20 09:15:09 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = kunthi/192.168.0.159
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.4
....
16/08/20 09:15:10 INFO common.Storage: Storage directory /var/lib/hadoop/cache/hadoop/dfs/name has been successfully formatted.
....
16/08/20 09:15:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at kunthi/192.168.0.159
************************************************************/
hadoop@kunthi:~/hadoop/sbin$ sudo mkdir /logs
hadoop@kunthi:~/hadoop/sbin$ sudo chown hadoop:hadoop /logs/
hadoop@kunthi:~/hadoop/sbin$ start-dfs.sh
hadoop@kunthi:~/hadoop/sbin$ start-yarn.sh
hadoop@kunthi:~/hadoop/sbin$ jps
6290 DataNode
6707 NodeManager
6599 ResourceManager
6459 SecondaryNameNode
6155 NameNode
7003 Jps
hadoop@kunthi:~/hadoop/sbin$ ./mr-jobhistory-daemon.sh start historyserver

Access the job history server, namenode and datanode web UIs in your browser.

[Screenshots: job history, namenode information and datanode information]


Integrating Apache and Tomcat with mod_jk

This post is a continuation of my previous post. Here I'll describe what I did to access my Tomcat apps via Apache.

All the files you edit here are crucial. Take a backup and make a note of every change you make; otherwise, you may break your Apache configuration.

Installing the mod_jk library

$ sudo apt-get install libapache2-mod-jk

Make sure the AJP connector is enabled in tomcat/conf/server.xml:

<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />

Create a new worker file

sudo vi /etc/apache2/workers.properties

# Define 1 real worker using ajp13
worker.list=worker1
# Set properties for worker (ajp13)
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009

Reconfigure jk.conf

Change the JkWorkersFile property to /etc/apache2/workers.properties

sudo vi /etc/apache2/mods-available/jk.conf

# JkWorkersFile /etc/libapache2-mod-jk/workers.properties
JkWorkersFile /etc/apache2/workers.properties

Configure the URLs Apache should pass through to Tomcat

sudo vi /etc/apache2/sites-enabled/000-default

and add the line starting with JkMount to your configuration:

<VirtualHost *:80>
...
...
JkMount /gplus* worker1
</VirtualHost>

Restart Apache
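
On Ubuntu this is something like the following (a sketch; the package usually enables the module already, in which case a2enmod jk is a no-op):

sudo a2enmod jk
sudo service apache2 restart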

I found http://<apacheIP>/gplus worked.

My appreciation and thanks to http://thetechnocratnotebook.blogspot.sg/2012/05/installing-tomcat-7-and-apache2-with.html