Oozie mkdistro fails with mvn: command not found

Unlike many other applications, Oozie installation is not straightforward. When I executed the mkdistro script, it failed with the error below:

./mkdistro.sh
./mkdistro.sh: line 71: mvn: command not found

Maven is a build tool for Java, which was not installed in my Ubuntu VM. Hence we need to install it using the command below to make the script work:

hadoop@gandhari:~/oozie/bin$ sudo apt-get install maven
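Before re-running mkdistro.sh, it is worth confirming that mvn now resolves on the PATH. A small guarded sketch (the install command is the one from this post; nothing else is assumed):

```shell
# Pre-flight check before re-running mkdistro.sh:
# confirm mvn resolves on the PATH and print its version line.
if command -v mvn >/dev/null 2>&1; then
  mvn -version 2>/dev/null | head -n 1
else
  echo "mvn still missing; run: sudo apt-get install maven" >&2
fi
```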

 

Hadoop – pseudo-distributed mode installation – second time

I had been waiting for a computing machine for Hadoop. Unfortunately I couldn't get one for the past two months due to multiple commitments. Kannan and I visited one of the local computer stores two weeks ago. I selected a Dell tower desktop, but at the desired configuration (i7/16 GB RAM/500 GB) it went over my budget.

I lost hope and postponed the plan. Then I found an older-model laptop with a high configuration at a local expo. It doesn't have modern features like a touch screen or an SSD, but I'm OK with that. I named it after Jeyamohan's novel on Krishna – Neelam! (Neelam = Blue)

 

Here are the steps I followed to create the Hadoop environment. This is more precise than my earlier post.

Here is the summary of the Hadoop pseudo-distributed mode installation. This is my second post on the environment setup.

System Specs

  • OS: Ubuntu 64 bit/VMware Workstation Player
  • RAM: 8 GB
  • CPU: 4
  • Java: 1.8
  • Hadoop: 2.6

Update Ubuntu

Let's update Ubuntu before starting. Depending on how recently you last updated, this may take a while.

The following command will update the package definitions.

pandian@kunthi:~$ sudo apt-get update
...
...
Fetched 1,646 kB in 8s (204 kB/s)
AppStream cache update completed, but some metadata was ignored due to errors.
Reading package lists... Done

The following command will upgrade the installed packages.

pandian@kunthi:~$ sudo apt-get dist-upgrade
...
...
355 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 295 MB/465 MB of archives.
After this operation, 279 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
...
...

(This step is time consuming. Take a break.)

Installing JDK

Following http://askubuntu.com/questions/521145/how-to-install-oracle-java-on-ubuntu-14-04, install the JDK with the commands below:

pandian@kunthi:~$ sudo apt-add-repository ppa:webupd8team/java
pandian@kunthi:~$ sudo apt-get update
pandian@kunthi:~$ sudo apt-get install oracle-java8-installer
pandian@kunthi:~$ java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) Client VM (build 25.101-b13, mixed mode)
pandian@kunthi:~$ whereis java
java: /usr/bin/java /usr/share/java /usr/share/man/man1/java.1.gz

Create User and User Group

Let’s run Hadoop with its own user and user group.

pandian@kunthi:~$ sudo groupadd -g 599 hadoop
pandian@kunthi:~$ sudo useradd -u 599 -g 599 hadoop
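To confirm that the uid/gid of 599 took effect, you can query the account database; getent and id are standard on Ubuntu, and the sketch below is guarded so it is safe to dry-run on a machine where the hadoop account was never created:

```shell
# Verify the hadoop group and user created above (guarded: prints a note
# instead of failing on a machine where they don't exist).
getent group hadoop || echo "group hadoop not found"
getent passwd hadoop || echo "user hadoop not found"
id hadoop 2>/dev/null || true
```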

Directory structure

Let’s create the directory structure

pandian@kunthi:~$ sudo mkdir -p /var/lib/hadoop/journaldata
pandian@kunthi:~$ sudo chown hadoop:hadoop -R /var/lib/hadoop/journaldata

User access and sudo privilege

We are still on plain Linux tasks; we haven't touched the Hadoop part yet.

pandian@kunthi:~$ sudo passwd hadoop
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
pandian@kunthi:/opt/software/hadoop$ sudo su
root@kunthi:/home/pandian# cp /etc/sudoers /etc/sudoers.20160820
root@kunthi:~# vi /etc/sudoers

I added the hadoop entry shown below:

# User privilege specification
root ALL=(ALL:ALL) ALL
hadoop ALL=(ALL:ALL) ALL
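A word of caution on editing /etc/sudoers directly with vi: a syntax error there can lock you out of sudo entirely. The conventional tool is sudo visudo, which refuses to save a broken file, and visudo -c runs the same syntax check non-interactively. A guarded sketch (it skips itself where passwordless sudo isn't available):

```shell
# Prefer 'sudo visudo' for editing; 'visudo -c' syntax-checks without editing.
# Guarded so this sketch is a no-op where sudo would prompt for a password.
if command -v visudo >/dev/null 2>&1 && sudo -n true 2>/dev/null; then
  sudo visudo -c
else
  echo "skipping check; edit with 'sudo visudo' on the real machine"
fi
```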

root@kunthi:~# cd /opt

root@kunthi:~# wget http://download.nus.edu.sg/mirror/apache/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

root@kunthi:~# gunzip hadoop-2.6.4.tar.gz

root@kunthi:~# tar -xvf hadoop-2.6.4.tar
root@gandhari:/opt# ln -s /opt/hadoop-2.6.4 hadoop
root@gandhari:/opt# chown hadoop:hadoop hadoop
root@gandhari:/opt# chown hadoop:hadoop -R hadoop-2.6.4
root@gandhari:/opt# usermod -d /opt/hadoop hadoop

root@kunthi:~# exit
pandian@kunthi:~$ su - hadoop
$ pwd
/opt/hadoop
$ bash
hadoop@kunthi:~$ id
uid=1001(hadoop) gid=599(hadoop) groups=599(hadoop)

Hadoop

Let's create the configuration directory for Hadoop.
hadoop@kunthi:~$ sudo mkdir -p /etc/hadoop/conf
Create a softlink for the conf folder
hadoop@kunthi:~$ sudo ln -s /opt/hadoop/hadoop-2.6.4/etc/hadoop/** /etc/hadoop/conf/

SSH Keys creation

Hadoop requires key-based SSH login. Let's prepare the .ssh directory first:
hadoop@kunthi:~$ mkdir ~/.ssh
hadoop@kunthi:~$ cd ~/.ssh/
hadoop@kunthi:~/.ssh$ touch authorized_keys
hadoop@kunthi:~/.ssh$ touch known_hosts
hadoop@kunthi:~/.ssh$ chmod 700 ~/.ssh && chmod 600 ~/.ssh/*
hadoop@gandhari:/opt/hadoop-2.6.4$ ssh gandhari
The authenticity of host 'gandhari (192.168.0.169)' can't be established.
ECDSA key fingerprint is SHA256:Y/ed5Le/5xqY1ImoVZBsSF7irydJRUn2TNwPBow4uSA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'gandhari,192.168.0.169' (ECDSA) to the list of known hosts.
hadoop@gandhari's password:
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-34-generic x86_64)

Bash profile – Environmental variables

Since I created the unix user's home folder manually, I need to create the bash profile myself. I'll take a copy of a working bash profile from another user:
hadoop@kunthi:~$ sudo cp /home/pandian/.bash* .
I then added the following environment variables to .bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_HOME=/opt/hadoop/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export HADOOP_USER_CLASSPATH_FIRST=true
export HADOOP_PREFIX=$HADOOP_HOME
export JAVA_HOME HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME PATH HADOOP_LOG_DIR

Let’s apply the changes to current session
hadoop@kunthi:~$ source ~/.bashrc
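A quick way to confirm the exports landed in the current shell (the variable names are the ones set above):

```shell
# Print each variable exported in .bashrc above, flagging any that are empty.
for v in JAVA_HOME HADOOP_HOME HADOOP_PREFIX HADOOP_LOG_DIR; do
  eval "val=\$$v"
  if [ -n "$val" ]; then
    echo "$v=$val"
  else
    echo "$v is NOT set"
  fi
done
```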

Hadoop env config

Let’s specify JAVA_HOME
hadoop@kunthi:~/hadoop/etc/hadoop$ cd $HADOOP_HOME/etc/hadoop/
hadoop@kunthi:~/hadoop/etc/hadoop$ cp hadoop-env.sh hadoop-env.sh.20160821

I made the following changes to hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-oracle/

Setup passwordless ssh login

hadoop@kunthi:~/hadoop$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/opt/hadoop/.ssh/id_rsa):
Your identification has been saved in /opt/hadoop/.ssh/id_rsa.
Your public key has been saved in /opt/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:UXGO3tnfK9K8DayD0/jc+T/WgZetCHOuBAcssUw3gBo hadoop@kunthi
The key's randomart image is:
+---[RSA 2048]----+
| .+.o o.. |
| E .o = o + |
| o + + . . |
| . . + . o |
| S o o o o|
| oo o. =o|
| ==o+..=|
| =.+=+=oo|
| +=o+=++|
+----[SHA256]-----+
hadoop@kunthi:~/hadoop$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoop@kunthi:~/hadoop$ sudo /etc/init.d/ssh restart
[ ok ] Restarting ssh (via systemctl): ssh.service.
hadoop@kunthi:~/hadoop$ ssh hadoop@gandhari
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-34-generic x86_64)

Temp folders for Hadoop

hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chmod 750 /var/lib/hadoop/cache/
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/name/
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs

$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/namesecondary

$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/data

hadoop@gandhari:/opt/hadoop-2.6.4$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/mapred/local
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/mapred/local/
hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/namesecondary/

hadoop@gandhari:/opt/hadoop-2.6.4$ sudo chown hadoop:hadoop -R /etc/hadoop/

Define the slave name

Add the slave hostname. In pseudo-distributed mode this is the same as the machine's own hostname:
hadoop@kunthi:~/hadoop$ cat /etc/hadoop/conf/slaves
kunthi

core-site.xml

Make the appropriate changes to core-site.xml:

hadoop@kunthi:~/hadoop$ cat etc/hadoop/core-site.xml
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://gandhari:9000</value>
        </property>
</configuration>
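A stray character in core-site.xml only surfaces when the daemons start, so a cheap habit is to grep for the value right after saving. Reproduced here as a self-contained sketch writing to /tmp (the real file lives under $HADOOP_HOME/etc/hadoop/; the value is the one from this post):

```shell
# Write the same core-site.xml via a heredoc and confirm the fs.defaultFS value.
# /tmp is used so this is safe to dry-run on any machine.
cat > /tmp/core-site.xml <<'EOF'
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://gandhari:9000</value>
        </property>
</configuration>
EOF
grep -q 'hdfs://gandhari:9000' /tmp/core-site.xml && echo "fs.defaultFS looks good"
```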

hadoop executable

Check whether the hadoop command works. It is located in the $HADOOP_HOME/bin folder.

hadoop@kunthi:~$ cd $HADOOP_HOME
hadoop@kunthi:~/hadoop$ hadoop version
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /opt/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar

hdfs-site.xml

hadoop@kunthi:~/hadoop$ cp etc/hadoop/hdfs-site.xml etc/hadoop/hdfs-site.xml.20160820

I made the following changes:

<configuration>
        <property>
                <name>dfs.name.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/data</value>
        </property>
</configuration>

Formatting and starting the namenode

hadoop@kunthi:~/hadoop$ hadoop namenode -format
.....
16/08/20 09:15:09 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = gandhari/192.168.0.169
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.4
....
16/08/20 09:15:10 INFO common.Storage: Storage directory /var/lib/hadoop/cache/hadoop/dfs/name has been successfully formatted.
....
16/08/20 09:15:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at gandhari/192.168.0.169
************************************************************/
hadoop@kunthi:~/hadoop/sbin$ start-dfs.sh
hadoop@kunthi:~/hadoop/sbin$ start-yarn.sh
hadoop@kunthi:~/hadoop/sbin$ jps
6290 DataNode
6707 NodeManager
6599 ResourceManager
6459 SecondaryNameNode
6155 NameNode
7003 Jps
hadoop@kunthi:~/hadoop/sbin$ ./mr-jobhistory-daemon.sh start historyserver
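Rather than eyeballing the jps output, the presence of each daemon can be checked in a loop. Guarded so it degrades gracefully where jps (part of the JDK) isn't on the PATH:

```shell
# Check that each expected Hadoop daemon shows up in the jps output.
if command -v jps >/dev/null 2>&1; then
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer; do
    if jps | grep -q "$d"; then echo "$d: up"; else echo "$d: NOT running"; fi
  done
else
  echo "jps not found; make sure the JDK bin directory is on your PATH"
fi
```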

Access the JobHistory server, NameNode, and DataNode from your browser as shown below.

Job History: http://gandhari:19888/

hadoop004 - jobhistory

Name Node: http://gandhari:50070/

hadoop005 - namenode information

 

Data Node: http://gandhari:50075/

hadoop006 - datanode information

All applications: http://gandhari:8088/cluster

hadoop007 - all applications


Hadoop Pseudo-Distributed Mode – Setup – Ubuntu – old post. Do not use

OLD POST.. DO NOT USE.

USE MY SECOND POST FOR HADOOP INSTALLATION

Here is the summary of the Hadoop pseudo-distributed mode installation and environment setup.

System Specs

  • OS: Ubuntu 32 bit/VirtualBox VM
  • RAM: 4 GB
  • CPU: 1
  • Java: 1.8
  • Hadoop: 2.6

Update Ubuntu

Let's update Ubuntu before starting. Depending on how recently you last updated, this may take a while.

The following command will update the package definitions.

pandian@kunthi:~$ sudo apt-get update
...
...
Fetched 1,646 kB in 8s (204 kB/s)
AppStream cache update completed, but some metadata was ignored due to errors.
Reading package lists... Done

The following command will upgrade the installed packages.

pandian@kunthi:~$ sudo apt-get dist-upgrade
...
...
355 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 295 MB/465 MB of archives.
After this operation, 279 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
...
...

(This step is time consuming. Take a break.)

Installing JDK

Following http://askubuntu.com/questions/521145/how-to-install-oracle-java-on-ubuntu-14-04, install the JDK with the commands below:

pandian@kunthi:~$ sudo apt-add-repository ppa:webupd8team/java
pandian@kunthi:~$ sudo apt-get update
pandian@kunthi:~$ sudo apt-get install oracle-java8-installer
pandian@kunthi:~$ java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) Client VM (build 25.101-b13, mixed mode)
pandian@kunthi:~$ whereis java
java: /usr/bin/java /usr/share/java /usr/share/man/man1/java.1.gz

Create User and User Group

Let’s run Hadoop with its own user and user group.

pandian@kunthi:~$ sudo groupadd -g 599 hadoop
pandian@kunthi:~$ sudo useradd -u 599 -g 599 hadoop

Directory structure

Let’s create the directory structure

pandian@kunthi:~$ sudo mkdir -p /opt/hadoop
pandian@kunthi:~$ sudo chown hadoop:hadoop -R /opt/hadoop
pandian@kunthi:~$ sudo mkdir -p /var/lib/hadoop/journaldata
pandian@kunthi:~$ sudo chown hadoop:hadoop -R /var/lib/hadoop/journaldata

User access and sudo privilege

We are still on plain Linux tasks; we haven't touched the Hadoop part yet.

pandian@kunthi:~$ sudo passwd hadoop
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
pandian@kunthi:~$ sudo usermod -d /opt/hadoop hadoop
pandian@kunthi:/opt/software/hadoop$ sudo su
root@kunthi:/home/pandian# cp /etc/sudoers /etc/sudoers.20160820
root@kunthi:~# vi /etc/sudoers

I added the hadoop entry shown below:

# User privilege specification
root ALL=(ALL:ALL) ALL
hadoop ALL=(ALL:ALL) ALL
root@kunthi:~# exit
pandian@kunthi:~$ su - hadoop
$ pwd
/opt/hadoop
$ bash
hadoop@kunthi:~$ id
uid=1001(hadoop) gid=599(hadoop) groups=599(hadoop)

Hadoop package download

I copied the download link from http://hadoop.apache.org/releases.html. Here is how to download it:

hadoop@kunthi:~$ wget http://download.nus.edu.sg/mirror/apache/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

The downloaded file is saved in the hadoop directory.
hadoop@kunthi:~$ ls -alt
total 24
-rw-rw-r-- 1 hadoop hadoop 15339 Aug 20 07:43 hadoop-2.6.4.tar.gz
hadoop@kunthi:~$ gunzip hadoop-2.6.4.tar.gz
hadoop@kunthi:~$ tar -xvf hadoop-2.6.4.tar

This will extract the tar file in a new location /opt/hadoop/hadoop-2.6.4. Here is the content of the folder.
hadoop@kunthi:~$ ls -alt hadoop-2.6.4
total 60
drwxr-xr-x 3 hadoop hadoop 4096 Aug 20 07:53 ..
drwxr-xr-x 9 hadoop hadoop 4096 Feb 12 2016 .
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 bin
drwxr-xr-x 3 hadoop hadoop 4096 Feb 12 2016 etc
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 include
drwxr-xr-x 3 hadoop hadoop 4096 Feb 12 2016 lib
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 libexec
-rw-r--r-- 1 hadoop hadoop 15429 Feb 12 2016 LICENSE.txt
-rw-r--r-- 1 hadoop hadoop 101 Feb 12 2016 NOTICE.txt
-rw-r--r-- 1 hadoop hadoop 1366 Feb 12 2016 README.txt
drwxr-xr-x 2 hadoop hadoop 4096 Feb 12 2016 sbin
drwxr-xr-x 4 hadoop hadoop 4096 Feb 12 2016 share

Let's create the configuration directory for Hadoop.
hadoop@kunthi:~$ sudo mkdir -p /etc/hadoop/conf
Create a softlink for the conf folder
hadoop@kunthi:~$ sudo ln -s /opt/hadoop/hadoop-2.6.4/etc/hadoop/** /etc/hadoop/conf/
hadoop@kunthi:~$ ln -s hadoop-2.6.4 hadoop

SSH Keys creation.

Hadoop requires key-based SSH login. Let's prepare the .ssh directory first:
hadoop@kunthi:~$ mkdir ~/.ssh
hadoop@kunthi:~$ cd ~/.ssh/
hadoop@kunthi:~/.ssh$ touch authorized_keys
hadoop@kunthi:~/.ssh$ touch known_hosts
hadoop@kunthi:~/.ssh$ chmod 700 ~/.ssh && chmod 600 ~/.ssh/*
hadoop@kunthi:~/.ssh$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:Fj6op9qzbfodhsQTmpQJ17G/mcAvu541bTMTb3huhPg.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
hadoop@localhost's password:
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-31-generic i686)

Bash profile – Environmental variables

Since I created the unix user's home folder manually, I need to create the bash profile myself. I'll take a copy of a working bash profile from another user:
hadoop@kunthi:~$ cp /home/pandian/.bash* .
I then added the following environment variables to .bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_HOME=/opt/hadoop/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export HADOOP_USER_CLASSPATH_FIRST=true
export HADOOP_PREFIX=$HADOOP_HOME
export JAVA_HOME HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME PATH HADOOP_LOG_DIR

Let’s apply the changes to current session
hadoop@kunthi:~$ source ~/.bashrc

Hadoop env config

Let’s specify JAVA_HOME
hadoop@kunthi:~/hadoop/etc/hadoop$ cd $HADOOP_HOME/etc/hadoop/
hadoop@kunthi:~/hadoop/etc/hadoop$ cp hadoop-env.sh hadoop-env.sh.20160820

I made the following changes to hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-oracle/

Setup passwordless ssh login

hadoop@kunthi:~/hadoop$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/opt/hadoop/.ssh/id_rsa):
Your identification has been saved in /opt/hadoop/.ssh/id_rsa.
Your public key has been saved in /opt/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:UXGO3tnfK9K8DayD0/jc+T/WgZetCHOuBAcssUw3gBo hadoop@kunthi
The key's randomart image is:
+---[RSA 2048]----+
| .+.o o.. |
| E .o = o + |
| o + + . . |
| . . + . o |
| S o o o o|
| oo o. =o|
| ==o+..=|
| =.+=+=oo|
| +=o+=++|
+----[SHA256]-----+
hadoop@kunthi:~/hadoop$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoop@kunthi:~/hadoop$ sudo /etc/init.d/ssh restart
[ ok ] Restarting ssh (via systemctl): ssh.service.
hadoop@kunthi:~/hadoop$ ssh hadoop@kunthi
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-31-generic i686)

Define the slave name

Add the slave hostname. In pseudo-distributed mode this is the same as the machine's own hostname:
hadoop@kunthi:~/hadoop$ cat /etc/hadoop/conf/slaves
kunthi

core-site.xml

Make the appropriate changes to core-site.xml:
hadoop@kunthi:~/hadoop$ cat etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://kunthi:9000</value>
</property>
</configuration>

hdfs-site.xml

hadoop@kunthi:~$ cd $HADOOP_HOME
hadoop@kunthi:~/hadoop$ hadoop version
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /opt/hadoop/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar
hadoop@kunthi:~/hadoop$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@kunthi:~/hadoop$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/name
hadoop@kunthi:~/hadoop$ sudo mkdir -p /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@kunthi:~/hadoop$ sudo chown hadoop:hadoop /var/lib/hadoop/cache/hadoop/dfs/data
hadoop@kunthi:~/hadoop$ cp etc/hadoop/hdfs-site.xml etc/hadoop/hdfs-site.xml.20160820

I made the following changes:
<configuration>
        <property>
                <name>dfs.name.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/var/lib/hadoop/cache/hadoop/dfs/data</value>
        </property>
</configuration>

Formatting and starting the namenode

hadoop@kunthi:~/hadoop$ hadoop namenode -format
.....
16/08/20 09:15:09 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = kunthi/192.168.0.159
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.4
....
16/08/20 09:15:10 INFO common.Storage: Storage directory /var/lib/hadoop/cache/hadoop/dfs/name has been successfully formatted.
....
16/08/20 09:15:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at kunthi/192.168.0.159
************************************************************/
hadoop@kunthi:~/hadoop/sbin$ sudo mkdir /logs
hadoop@kunthi:~/hadoop/sbin$ sudo chown hadoop:hadoop /logs/
hadoop@kunthi:~/hadoop/sbin$ start-dfs.sh
hadoop@kunthi:~/hadoop/sbin$ start-yarn.sh
hadoop@kunthi:~/hadoop/sbin$ jps
6290 DataNode
6707 NodeManager
6599 ResourceManager
6459 SecondaryNameNode
6155 NameNode
7003 Jps
hadoop@kunthi:~/hadoop/sbin$ ./mr-jobhistory-daemon.sh start historyserver

Access the JobHistory server, NameNode, and DataNode from your browser as shown below.

hadoop001 - jobhistory
hadoop002 - namenode information
hadoop003 - datanode information

 

Integrating Apache and Tomcat with mod_jk

This post is a continuation of my previous post. Here I'll describe what I did to access my Tomcat apps via Apache.

All the files you edit here are crucial. Take a backup and note the changes you make; otherwise you may break your Apache configuration.

Installing the mod-jk library

$ sudo apt-get install libapache2-mod-jk

Make sure the AJP connector port is enabled in tomcat/conf/server.xml:

<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />

Create a new worker file

sudo vi /etc/apache2/workers.properties

# Define 1 real worker using ajp13
worker.list=worker1
# Set properties for worker (ajp13)
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009

Reconfigure jk.conf

Change the JkWorkersFile property to /etc/apache2/workers.properties

sudo vi /etc/apache2/mods-available/jk.conf

# JkWorkersFile /etc/libapache2-mod-jk/workers.properties
JkWorkersFile /etc/apache2/workers.properties

Configure the URLs Apache should pass through to Tomcat

sudo vi /etc/apache2/sites-enabled/000-default

and add the line that starts with JkMount to your configuration:

<VirtualHost *:80>
…………………………………
…………………………………
JkMount /gplus* worker1
</VirtualHost>

Restart Apache
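Spelled out as commands, with a configuration check first so a typo in workers.properties or jk.conf is caught before the running site is touched (apachectl configtest is standard Apache tooling; the init.d path matches the Ubuntu of this post). Guarded so the sketch only prints the commands where sudo or apachectl isn't usable:

```shell
# Syntax-check the Apache config, then restart only if the check passes.
if command -v apachectl >/dev/null 2>&1 && sudo -n true 2>/dev/null; then
  sudo apachectl configtest && sudo /etc/init.d/apache2 restart
else
  echo "run: sudo apachectl configtest && sudo /etc/init.d/apache2 restart"
fi
```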

I found http://<apacheIP>/gplus worked.

Thanks to http://thetechnocratnotebook.blogspot.sg/2012/05/installing-tomcat-7-and-apache2-with.html


Setting up Ubuntu VM in Azure cloud for PHP & Java development

This step-by-step guide explains how to set up LAMP with Tomcat on a fresh Ubuntu Azure VM.

What are we going to do?

  1. Install Apache
  2. Install PHP
  3. Open Azure firewall for port 80
  4. Configure Apache for PHP
  5. Install mysql
  6. Install phpmyadmin
  7. Install FTP Server
  8. Install JDK 8
  9. Install Tomcat 8
  10. Open Azure firewall for port 8080
  11. Deploy a war file to tomcat

All of the instructions below assume you have an SSH connection to your Azure VM.


Apache

sudo apt-get install apache2

Let’s start apache service

pandian@grassfield:~$ sudo /etc/init.d/apache2 start
* Starting web server apache2                                                   *

Is it listening on port 80?

pandian@grassfield:~$ telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
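An equivalent check with curl, which also reports the HTTP status (assumption: curl is installed; it isn't always present on a fresh Ubuntu image). Guarded so it degrades gracefully:

```shell
# Probe port 80 and report the HTTP status; guarded for machines without curl
# or without a listening web server.
if command -v curl >/dev/null 2>&1; then
  curl -s -o /dev/null --max-time 3 -w 'HTTP status: %{http_code}\n' http://localhost:80/ \
    || echo "nothing answered on port 80"
else
  echo "curl not installed; try: sudo apt-get install curl"
fi
```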

Let's open port 80 in the Azure firewall.

Go to Network security groups.


You will find your security group. If you don't find any, add one.

These are your essential settings. Click on Inbound security rules.


You will find the rules created already. Here is the one created for SSH access.


I allowed TCP port 80 with a new inbound rule.


Try to browse your public IP on your browser.


PHP

sudo apt-get install php5

Let’s test php now.

$ cd /var/www/html/

$ sudo mkdir testphp

$ sudo chmod 777 testphp/

$ cd testphp/

$ vi index.php
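The screenshot that followed here showed the test page; the conventional one-liner for such a page is a phpinfo() call. A sketch that writes it via a heredoc (writing to /tmp by default so it is safe to dry-run; the post's real target is /var/www/html/testphp):

```shell
# Create a minimal PHP test page. DOCROOT defaults to /tmp for a dry run;
# on the server it would be /var/www/html/testphp as set up above.
DOCROOT=${DOCROOT:-/tmp/testphp}
mkdir -p "$DOCROOT"
cat > "$DOCROOT/index.php" <<'EOF'
<?php phpinfo(); ?>
EOF
echo "wrote $DOCROOT/index.php"
```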


Try to browse your url.


MySQL


Let's install MySQL and its connectors now:

$ sudo apt-get install mysql-server libapache2-mod-auth-mysql php5-mysql

Let's start it now.

pandian@grassfield:~$ sudo /etc/init.d/mysql start
* Starting MySQL database server mysqld                                 [ OK ]

Check that the port is open:

$ telnet localhost 3306
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
[
5.5.46-0ubuntu0.14.04.2

Let's install phpmyadmin now; otherwise I'd feel handicapped, like many other developers.

sudo apt-get install phpmyadmin

Let's test whether it is accessible.


Java SDK

We could install OpenJDK, but I wanted the Oracle JDK, so I downloaded the .tar.gz file and transferred it using SCP.

$ wget http://xxxxxxxxxxxxxxxxxx/jdk-8u65-linux-x64.tar.gz

$ gunzip jdk-8u65-linux-x64.tar.gz

$ tar -xvf jdk-8u65-linux-x64.tar

$ jdk1.8.0_65/bin/java -version
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)

Tomcat

$ wget http://www.eu.apache.org/dist/tomcat/tomcat-8/v8.0.30/bin/apache-tomcat-8.0.30.tar.gz

$ gunzip apache-tomcat-8.0.30.tar.gz

$ tar -xvf apache-tomcat-8.0.30.tar

$ cd apache-tomcat-8.0.30/bin

$ ./catalina.sh start
Neither the JAVA_HOME nor the JRE_HOME environment variable is defined
At least one of these environment variable is needed to run this program

$ export JAVA_HOME=/home/pandian/jdk1.8.0_65

$ ./catalina.sh start
Using CATALINA_BASE:   /home/pandian/apache-tomcat-8.0.30
Using CATALINA_HOME:   /home/pandian/apache-tomcat-8.0.30
Using CATALINA_TMPDIR: /home/pandian/apache-tomcat-8.0.30/temp
Using JRE_HOME:        /home/pandian/jdk1.8.0_65
Using CLASSPATH:       /home/pandian/apache-tomcat-8.0.30/bin/bootstrap.jar:/home/pandian/apache-tomcat-8.0.30/bin/tomcat-juli.jar
Tomcat started.

Make sure Tomcat has started:

tail ../logs/catalina.out

09-Jan-2016 11:34:16.126 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in 5799 ms

Okay! Ready!!


Connect to Office 365 email, Calendar & Lync 2013 from Ubuntu 14

I have installed Ubuntu 14.04 LTS on a laptop. I was wondering how to connect to my Office365 account. Here you go!


Emails:

Use your thunderbird to connect to the following IMAP server

Server name: outlook.office365.com
Port: 993
Encryption method: SSL

Calendar

Install Lightning plugin for thunderbird from the following location.

https://addons.mozilla.org/en-US/thunderbird/addon/lightning/

Install the thunderbird plugin for Exchange Servers from the following location

http://www.1st-setup.nl/wordpress/?page_id=551

I’ve used the latest beta version.

Outlook 365 Addressbook

The thunderbird addon downloaded from the 1st-setup site given above works!

Lync 2013

Pidgin IM client comes by default in Ubuntu. Install the following plugin for Pidgin from the Ubuntu Software Center.

“Pidgin plugin for MS Office Communicator and MS Lync”

Add an "Office Communicator" account.

Basic Tab

Username: pandian@grassfield.org

Login:pandian@grassfield.org

Password: xxxxx

Advanced Tab

Server [port]: leave this blank

Connection Type: SSL/TLS

User Agent: UCCAPI/15.0.4420.1017 OC/15.0.4420.1017 (Microsoft Lync)

Authentication Scheme: TLS-DSK

Leave the rest.

Good Luck!