Oozie Installation and Configuration

Hi,

Here is the output of my latest lab exercise: Oozie installation. The process was not straightforward, as the steps given by my master did not work as expected. I had to tune the SQL manually to make it work.

Apache Oozie is a server-based workflow scheduling system to manage Hadoop jobs.

Workflows in Oozie are defined as a collection of control flow and action nodes in a directed acyclic graph. Control flow nodes define the beginning and the end of a workflow (start, end and failure nodes) as well as a mechanism to control the workflow execution path (decision, fork and join nodes). Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions including Hadoop MapReduce, Hadoop distributed file system operations, Pig, SSH, and email. Oozie can also be extended to support additional types of actions.
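
To make these node types concrete, here is a minimal sketch of a workflow definition, written out from the shell. It is not part of my lab: the names demo-wf, make-dir and fail, the demo-dir path and the ${nameNode} parameter (which would come from a job.properties file) are all made up for illustration, and I am assuming the 0.4 workflow schema is fine for Oozie 4.0.0.

```shell
#!/usr/bin/env bash
# Write a minimal, hypothetical Oozie workflow definition to workflow.xml.
# Names (demo-wf, make-dir, fail) and the HDFS path are illustrative only.
WF_DIR="${WF_DIR:-$(mktemp -d)}"

# The quoted 'EOF' keeps ${...} literal, so Oozie (not the shell) expands it.
cat > "$WF_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.4" name="demo-wf">
    <!-- control flow: where execution begins -->
    <start to="make-dir"/>

    <!-- action node: an HDFS file system operation -->
    <action name="make-dir">
        <fs>
            <mkdir path="${nameNode}/user/${wf:user()}/demo-dir"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- control flow: failure node and normal end node -->
    <kill name="fail">
        <message>mkdir failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

echo "wrote $WF_DIR/workflow.xml"
```

The start node points at the single action, and the action's ok and error transitions lead to the end and kill nodes, so the graph stays acyclic.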

Here are the steps.

Download and extract

hadoop@gandhari:~$ wget http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.1.0.tar.gz

hadoop@gandhari:~$ gunzip oozie-4.0.0-cdh5.1.0.tar.gz

hadoop@gandhari:~$ tar -xvf oozie-4.0.0-cdh5.1.0.tar

hadoop@gandhari:~$ ln -s oozie-4.0.0-cdh5.1.0/ oozie

hadoop@gandhari:~$ ls oozie
 bin             examples     README.txt          webapp
 builds          hadooplibs   release-log.txt     workflowgenerator
 client          LICENSE.txt  sharelib            work.log
 core            login        source-headers.txt  zookeeper-security-tests
 DISCLAIMER.txt  minitest     src
 distro          NOTICE.txt   tools
 docs            pom.xml      utils
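
The gunzip, tar and ln steps above can also be done in one go. This is only a sketch: extract_and_link is a helper name I made up, and it assumes the archive unpacks into a single top-level directory, as this one does.

```shell
# Unpack a .tar.gz in one step and point a stable symlink at the result.
# Arguments: <tarball> <link-name>. extract_and_link is a hypothetical helper.
extract_and_link() {
    tarball="$1"
    link="$2"
    # First path component of the first archive entry = top-level directory.
    top="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)"
    # -z lets tar decompress on the fly, so no separate gunzip is needed.
    tar -xzf "$tarball" || return 1
    # -n replaces an existing link instead of creating a link inside it.
    ln -sfn "$top" "$link"
}

# Usage (from /opt/hadoop):
#   extract_and_link oozie-4.0.0-cdh5.1.0.tar.gz oozie
```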

Setting up the Oozie MySQL user

hadoop@gandhari:~$ mysql -u root -p

mysql> CREATE DATABASE oozie;

mysql> USE oozie;

mysql> CREATE USER 'oozie' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT SELECT,INSERT,UPDATE,DELETE ON *.* TO 'oozie';

mysql> GRANT ALL ON *.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL ON *.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'192.168.0.169' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'192.168.0.169' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';
 Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> GRANT ALL privileges ON *.* TO '%'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO '*'@'*' IDENTIFIED BY 'P@ssw0rd';

mysql> FLUSH PRIVILEGES;

mysql> exit
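
Most of the GRANT statements above overlap, and the last two create wildcard accounts with full rights on every database. Assuming Oozie only needs its own oozie schema and connects from this machine, a narrower set may be enough. The sketch below only writes the statements to a file and runs nothing against MySQL. The host list is my assumption, and it keeps the same GRANT ... IDENTIFIED BY form used above, which MySQL 8.0 no longer accepts.

```shell
# Generate a narrower grant script; nothing is executed against MySQL here.
# Assumption: Oozie only needs the oozie schema, reached from localhost and gandhari.
DB_PASS="${DB_PASS:-P@ssw0rd}"          # same sample password as above
GRANTS_SQL="${GRANTS_SQL:-$(mktemp)}"

{
    echo "CREATE DATABASE IF NOT EXISTS oozie;"
    for host in localhost gandhari; do
        echo "GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'$host' IDENTIFIED BY '$DB_PASS';"
    done
    echo "FLUSH PRIVILEGES;"
} > "$GRANTS_SQL"

cat "$GRANTS_SQL"
# To apply it:  mysql -u root -p < "$GRANTS_SQL"
```

If the connection is refused afterwards, the host that MySQL sees for the client may differ (for example 192.168.0.169), and that host would need its own line.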

Oozie portal settings

hadoop@gandhari:~$ pwd
 /opt/hadoop

hadoop@gandhari:~$ cd etc/hadoop/

hadoop@gandhari:~/etc/hadoop$ vi core-site.xml

<!-- OOZIE -->
 <property>
 <name>hadoop.proxyuser.oozie.hosts</name>
 <value>*</value>
 </property>
 <property>
 <name>hadoop.proxyuser.oozie.groups</name>
 <value>*</value>
 </property>
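
As far as I understand the Oozie documentation, the user name inside hadoop.proxyuser.<user>.hosts is the Unix user that runs the Oozie server. In this lab everything runs as hadoop, so if impersonation errors show up later, the same two properties may be needed for that user as well. Here is a small sketch that prints the pair for any user; proxyuser_props is a helper name of my own, and the * values are as permissive as the ones above.

```shell
# Print the two proxyuser properties for a Unix user (default: the current user).
# proxyuser_props is a hypothetical helper; paste its output inside
# <configuration> in core-site.xml.
proxyuser_props() {
    user="${1:-$(id -un)}"
    for suffix in hosts groups; do
        printf '<property>\n'
        printf '  <name>hadoop.proxyuser.%s.%s</name>\n' "$user" "$suffix"
        printf '  <value>*</value>\n'
        printf '</property>\n'
    done
}

# Usage:
#   proxyuser_props hadoop
```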

Creating Oozie database

hadoop@gandhari:~/oozie$ cd /opt/hadoop

hadoop@gandhari:~$ cd oozie/bin

hadoop@gandhari:~/oozie/bin$ ./ooziedb.sh create -run
 setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

Validate DB Connection
 DONE
 Check DB schema does not exist
 DONE
 Check OOZIE_SYS table does not exist
 DONE
 Create SQL schema
 DONE
 Create OOZIE_SYS table
 DONE

Oozie DB has been created for Oozie version '4.0.0-cdh5.1.0'

The SQL commands have been written to: /tmp/ooziedb-5275812012387848818.sql

This step threw many errors before it finally succeeded. The default values in the SQL generated by this script are faulty; I had to change every one of them to CURRENT_TIMESTAMP to make it work.
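
I did the change by hand, but it can be scripted. ooziedb.sh accepts -sqlfile to write the SQL to a file instead of running it, so the file can be patched and then fed to mysql. The sketch below is an assumption-heavy illustration: fix_defaults is a name I made up, and the old default has to be supplied by you, because it depends on what your MySQL version rejects.

```shell
# Replace every literal "DEFAULT <old>" with "DEFAULT CURRENT_TIMESTAMP" in a SQL file
# and print the result. fix_defaults is a hypothetical helper; <old> is whatever
# default value your MySQL rejected.
fix_defaults() {
    sql_file="$1"
    old_default="$2"
    awk -v old="DEFAULT $old_default" -v new="DEFAULT CURRENT_TIMESTAMP" '
        {
            out = ""
            # index() is a literal (non-regex) search, so quotes and dots are safe
            while ((pos = index($0, old)) > 0) {
                out = out substr($0, 1, pos - 1) new
                $0 = substr($0, pos + length(old))
            }
            print out $0
        }' "$sql_file"
}

# Usage (from /opt/hadoop/oozie/bin; the old default shown is only an example):
#   ./ooziedb.sh create -sqlfile /tmp/oozie.sql
#   fix_defaults /tmp/oozie.sql "'0000-00-00 00:00:00'" > /tmp/oozie-fixed.sql
#   mysql -u oozie -p oozie < /tmp/oozie-fixed.sql
```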

extjs

The ExtJS library is not bundled with Oozie because of licensing restrictions, so we need to add it separately.

hadoop@gandhari:~/oozie/bin$ mkdir /opt/hadoop/extjs

hadoop@gandhari:~/oozie/bin$ cd /opt/hadoop/extjs

hadoop@gandhari:~/extjs$ wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip

Setting up Oozie portal

Let’s build the war file first.

hadoop@gandhari:~/extjs$ cd /opt/hadoop/oozie/bin/

hadoop@gandhari:~/oozie/bin$ ./addtowar.sh -inputwar ../oozie.war -outputwar ../oozieout.war -extjs /opt/hadoop/extjs/ext-2.2.zip -hadoopJarsSNAPSHOT ../oozie-hadooplibs-4.0.0-cdh5.1.0.tar.gz -hadoop 2.6.4 $HADOOP_HOME ../oozie-sharelib-4.0.0-cdh5.1.0-yarn.tar.gz -jars /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar

...

New Oozie WAR file with added 'Hadoop JARs, ExtJS library, JARs' at ../oozieout.war

hadoop@gandhari:~/oozie/bin$ cd ..
 hadoop@gandhari:~/oozie$ ls *.war
 oozieout.war  oozie.war

Let’s copy the war file to the webapps folder of Oozie’s Tomcat. The MySQL JDBC driver is also needed to connect to the Oozie database.

hadoop@gandhari:~/oozie$ cp oozieout.war /opt/hadoop/oozie/oozie-server/webapps/oozie.war

hadoop@gandhari:~/oozie$ cp /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar /opt/hadoop/oozie/lib

hadoop@gandhari:~/oozie$ cp /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar /opt/hadoop/oozie/libtools/

hadoop@gandhari:~/oozie$ vi conf/oozie-site.xml

<property>
 <name>oozie.service.JPAService.jdbc.driver</name>
 <value>com.mysql.jdbc.Driver</value>
 <description>
 JDBC driver class.
 </description>
 </property>

<property>
 <name>oozie.service.JPAService.jdbc.url</name>
 <value>jdbc:mysql://gandhari:3306/oozie</value>
 <description>
 JDBC URL.
 </description>
 </property>

<property>
 <name>oozie.service.JPAService.jdbc.username</name>
 <value>oozie</value>
 <description>
 DB user name.
 </description>
 </property>
 <property>
 <name>oozie.service.JPAService.jdbc.password</name>
 <value>P@ssw0rd</value>
 <description>
 DB user password.

IMPORTANT: if password is empty leave a 1 space string, the service trims the value,
 if empty Configuration assumes it is NULL.
 </description>
 </property>

ShareLib update

Let’s update the sharelib folder.

hadoop@gandhari:~/oozie$ bin/oozie-setup.sh sharelib create -fs hdfs://gandhari:9000 -locallib oozie-sharelib-4.0.0-cdh5.1.0-yarn.tar.gz -locallib oozie-hadooplibs-4.0.0-cdh5.1.0.tar.gz

....

the destination path for sharelib is: /user/hadoop/share/lib/lib_20160826174043

hadoop@gandhari:~/oozie$ bin/oozied.sh start

hadoop@gandhari:~/oozie$ bin/oozie admin -oozie http://gandhari:11000/oozie -sharelibupdate hdfs://gandhari:9000/user/hadoop/share/lib/lib_20160826174043
 null

hadoop@gandhari:~/oozie$ bin/oozie admin -shareliblist -oozie http://gandhari:11000/oozie
 [Available ShareLib]
 hive
 distcp
 mapreduce-streaming
 oozie
 hcatalog
 hive2
 sqoop
 pig

hadoop@gandhari:~/oozie$ bin/oozied.sh stop

hadoop@gandhari:~/oozie$ bin/oozied.sh start

Point your browser to http://gandhari:11000/oozie/ to get the console.
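
Before opening the browser, the server can also be checked from the command line with oozie admin -status, which prints the system mode. Here is a small sketch around it: oozie_is_up is a helper name of my own, and the two variables simply default to the path and URL used in this post.

```shell
# Succeeds when the Oozie server reports NORMAL mode.
# oozie_is_up is a hypothetical helper; OOZIE_CLI and OOZIE_URL default to
# the values used in this post and can be overridden.
oozie_is_up() {
    cli="${OOZIE_CLI:-bin/oozie}"
    url="${OOZIE_URL:-http://gandhari:11000/oozie}"
    # A healthy server answers with the line "System mode: NORMAL".
    "$cli" admin -oozie "$url" -status 2>/dev/null | grep -q 'System mode: NORMAL'
}

# Usage (from /opt/hadoop/oozie):
#   oozie_is_up && echo "console ready" || echo "server not reachable"
```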
