Oozie job failure – Error: E0501 : E0501: Could not perform authorization operation, User: hadoop is not allowed to impersonate hadoop

Hi hadoopers,

I’m sorry for pausing the tutorials. I need to complete a project first. Tutorial posts will resume after that.

Today, I tried to build a workflow with Oozie. Here is how I ran it.


hadoop@gandhari:/opt/hadoop-2.6.4/workspace/oozie$ ../../oozie/bin/oozie job --oozie http://gandhari:11000/oozie/ -Doozie.wf.application.path=hdfs://gandhari:9000/user/hadoop/feed/myflow.xml -dryrun

Unfortunately, it failed with the following error.


Error: E0501 : E0501: Could not perform authorization operation, User: hadoop is not allowed to impersonate hadoop

hadoop is my OS user. It is also the user running the Oozie daemon. To allow this user to act as a proxy user, core-site.xml should contain the following entries.


<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>gandhari</value>
</property>
</configuration>

hadoop – OS user name

gandhari – hostname
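To check quickly whether a core-site.xml already carries both proxyuser entries for a given user, something like the following may help. It is only a minimal sketch: the function names proxyuser_properties and has_proxyuser are made up for illustration and are not part of Hadoop or Oozie.

```shell
#!/bin/sh
# Hypothetical helpers for illustration (not part of Hadoop or Oozie).

# Print the two proxyuser properties for user $1, restricted to host $2.
proxyuser_properties() {
    user="$1"
    host="$2"
    cat <<EOF
<property>
<name>hadoop.proxyuser.${user}.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.${user}.hosts</name>
<value>${host}</value>
</property>
EOF
}

# Return 0 if file $2 names both proxyuser properties for user $1.
has_proxyuser() {
    user="$1"
    file="$2"
    grep -q "<name>hadoop\.proxyuser\.${user}\.groups</name>" "$file" &&
    grep -q "<name>hadoop\.proxyuser\.${user}\.hosts</name>" "$file"
}
```

For example, `has_proxyuser hadoop /opt/hadoop/etc/hadoop/core-site.xml || proxyuser_properties hadoop gandhari` prints the snippet to paste before `</configuration>` when the entries are missing. Hadoop normally has to re-read core-site.xml (for example after a restart of its daemons) before the change takes effect.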

 

Hadoop Eco System Installation – Contents

Here is the list of pages that can help you install Hadoop and its ecosystem products.

Distributed HBASE & ZooKeeper Installation and Configuration

Hue Installation and Configuration

Oozie Installation and Configuration

Hi,

Here is the output of my latest lab exercise: Oozie installation. The process was not straightforward, as the steps given by master did not work as expected. I had to tune the SQL manually to make it work.

Apache Oozie is a server-based workflow scheduling system to manage Hadoop jobs.

Workflows in Oozie are defined as a collection of control flow and action nodes in a directed acyclic graph. Control flow nodes define the beginning and the end of a workflow (start, end and failure nodes) as well as a mechanism to control the workflow execution path (decision, fork and join nodes). Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions including Hadoop MapReduce, Hadoop distributed file system operations, Pig, SSH, and email. Oozie can also be extended to support additional types of actions.
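To make the node types concrete, here is a small sketch that writes a minimal workflow definition with a start node, one file system action node, a kill (failure) node and an end node. The workflow name demo-wf, the node names, the ${nameNode} parameter and the target directory are all assumptions chosen for illustration, and the schema version may need adjusting for your Oozie release.

```shell
#!/bin/sh
# Write a minimal, illustrative workflow.xml into directory $1.
# write_demo_workflow is a hypothetical helper, not an Oozie tool.
write_demo_workflow() {
    dir="$1"
    mkdir -p "$dir"
    # The quoted 'EOF' keeps ${...} literal so that Oozie, not the shell, resolves it.
    cat > "$dir/workflow.xml" <<'EOF'
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.4">
    <!-- Control flow: where execution begins -->
    <start to="make-dir"/>
    <!-- Action node: an HDFS file system operation -->
    <action name="make-dir">
        <fs>
            <mkdir path="${nameNode}/user/hadoop/demo-out"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <!-- Control flow: failure and normal termination -->
    <kill name="fail">
        <message>Action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF
}
```

Calling `write_demo_workflow /tmp/demo-wf` produces a file that could then be uploaded to HDFS and referenced through oozie.wf.application.path, with nameNode supplied in the job properties.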

Here are the steps.

Download and extract

hadoop@gandhari:~$ wget http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.1.0.tar.gz

hadoop@gandhari:~$ gunzip oozie-4.0.0-cdh5.1.0.tar.gz

hadoop@gandhari:~$ tar -xvf oozie-4.0.0-cdh5.1.0.tar

hadoop@gandhari:~$ ln -s oozie-4.0.0-cdh5.1.0/ oozie

hadoop@gandhari:~$ ls oozie
 bin             examples     README.txt          webapp
 builds          hadooplibs   release-log.txt     workflowgenerator
 client          LICENSE.txt  sharelib            work.log
 core            login        source-headers.txt  zookeeper-security-tests
 DISCLAIMER.txt  minitest     src
 distro          NOTICE.txt   tools
 docs            pom.xml      utils

Setting up the Oozie MySQL user

hadoop@gandhari:~$ mysql -u root -p

mysql> CREATE DATABASE oozie;

mysql> USE oozie;

mysql> CREATE USER 'oozie' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT SELECT,INSERT,UPDATE,DELETE ON *.* TO 'oozie';

mysql> GRANT ALL ON *.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL ON *.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'192.168.0.169' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'192.168.0.169' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';
 Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> GRANT ALL privileges ON *.* TO '%'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO '*'@'*' IDENTIFIED BY 'P@ssw0rd';

mysql> FLUSH PRIVILEGES;

mysql> exit
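Most of the GRANT statements above overlap, and several of them give rights on every database. As a hedged sketch, a narrower setup that only gives the oozie user rights on the oozie database could be generated as below. The function name oozie_db_sql is hypothetical, and whether '%' is an acceptable host pattern depends on your own security requirements.

```shell
#!/bin/sh
# Print SQL that creates the Oozie database and one user limited to it.
# $1 = password for the oozie user. Hypothetical helper, for illustration only.
# Note: a password containing a single quote would break the generated SQL.
oozie_db_sql() {
    password="$1"
    cat <<EOF
CREATE DATABASE IF NOT EXISTS oozie;
CREATE USER 'oozie'@'%' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%';
FLUSH PRIVILEGES;
EOF
}
```

The output can be piped into the client, for example `oozie_db_sql 'P@ssw0rd' | mysql -u root -p`.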

Oozie portal settings

hadoop@gandhari:~$ pwd
 /opt/hadoop

hadoop@gandhari:~$ cd etc/hadoop/

hadoop@gandhari:~/etc/hadoop$ vi core-site.xml

<!-- OOZIE -->
 <property>
 <name>hadoop.proxyuser.oozie.hosts</name>
 <value>*</value>
 </property>
 <property>
 <name>hadoop.proxyuser.oozie.groups</name>
 <value>*</value>
 </property>

Creating Oozie database

hadoop@gandhari:~/oozie$ cd /opt/hadoop

hadoop@gandhari:~$ cd oozie/bin

hadoop@gandhari:~/oozie/bin$ ./ooziedb.sh create -run
 setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

Validate DB Connection
 DONE
 Check DB schema does not exist
 DONE
 Check OOZIE_SYS table does not exist
 DONE
 Create SQL schema
 DONE
 Create OOZIE_SYS table
 DONE

Oozie DB has been created for Oozie version '4.0.0-cdh5.1.0'

The SQL commands have been written to: /tmp/ooziedb-5275812012387848818.sql

This process produced many errors. The default values given in this script are faulty. I had to change all of them to CURRENT_TIMESTAMP to make it work.
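The replacement can be scripted rather than done by hand. The sketch below is an assumption about how that might look: the post does not show what the faulty default was, so the helper (fix_defaults, a made-up name) takes it as a parameter instead of guessing it.

```shell
#!/bin/sh
# Rewrite "DEFAULT <bad>" to "DEFAULT CURRENT_TIMESTAMP" in a SQL file.
# $1 = SQL file, $2 = the faulty default exactly as it appears in the file.
# The result goes to stdout; the input file is left untouched.
fix_defaults() {
    file="$1"
    bad="$2"
    # Escape characters that are special in a basic regular expression.
    escaped=$(printf '%s' "$bad" | sed 's/[][\.*^$/]/\\&/g')
    sed "s/DEFAULT ${escaped}/DEFAULT CURRENT_TIMESTAMP/g" "$file"
}
```

It would be pointed at the SQL file that ooziedb.sh reports, with the output redirected to a new file that is reviewed before being loaded into MySQL.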

extjs

The ExtJS library is not bundled with Oozie because of licensing restrictions. Hence we need to add it separately.

hadoop@gandhari:~/oozie/bin$ mkdir /opt/hadoop/extjs

hadoop@gandhari:~/oozie/bin$ cd /opt/hadoop/extjs

hadoop@gandhari:~/extjs$ wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip

Setting up Oozie portal

Let’s build the war file first.

hadoop@gandhari:~/extjs$ cd /opt/hadoop/oozie/bin/

hadoop@gandhari:~/oozie/bin$ ./addtowar.sh -inputwar ../oozie.war -outputwar ../oozieout.war -extjs /opt/hadoop/extjs/ext-2.2.zip -hadoopJarsSNAPSHOT ../oozie-hadooplibs-4.0.0-cdh5.1.0.tar.gz -hadoop 2.6.4 $HADOOP_HOME ../oozie-sharelib-4.0.0-cdh5.1.0-yarn.tar.gz -jars /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar

...

New Oozie WAR file with added 'Hadoop JARs, ExtJS library, JARs' at ../oozieout.war

hadoop@gandhari:~/oozie/bin$ cd ..
 hadoop@gandhari:~/oozie$ ls *.war
 oozieout.war  oozie.war

Let’s copy the war file to the webapps folder of Oozie’s Tomcat. The MySQL JDBC driver is needed to connect to the Oozie database.

hadoop@gandhari:~/oozie$ cp oozie.war /opt/hadoop/oozie/oozie-server/webapps/

hadoop@gandhari:~/oozie$ cp /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar /opt/hadoop/oozie/lib

hadoop@gandhari:~/oozie$ cp /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar /opt/hadoop/oozie/libtools/

hadoop@gandhari:~/oozie$ vi conf/oozie-site.xml

<property>
 <name>oozie.service.JPAService.jdbc.driver</name>
 <value>com.mysql.jdbc.Driver</value>
 <description>
 JDBC driver class.
 </description>
 </property>

<property>
 <name>oozie.service.JPAService.jdbc.url</name>
 <value>jdbc:mysql://gandhari:3306/oozie</value>
 <description>
 JDBC URL.
 </description>
 </property>

<property>
 <name>oozie.service.JPAService.jdbc.username</name>
 <value>oozie</value>
 <description>
 DB user name.
 </description>
 </property>
 <property>
 <name>oozie.service.JPAService.jdbc.password</name>
 <value>P@ssw0rd</value>
 <description>
 DB user password.

IMPORTANT: if password is empty leave a 1 space string, the service trims the value,
 if empty Configuration assumes it is NULL.
 </description>
 </property>
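Before starting the server it can be worth confirming which values oozie-site.xml really contains. The following is a minimal sketch with a hypothetical helper named get_property. It assumes each name and value tag sits on its own line, as in the snippets above, and it is not a general XML parser.

```shell
#!/bin/sh
# Print the <value> of property $2 from the Hadoop-style XML file $1.
# Assumes <name> and <value> each sit on a single line.
get_property() {
    file="$1"
    name="$2"
    awk -v want="$name" '
        # Remember when the wanted <name> line has been seen.
        index($0, "<name>" want "</name>") { found = 1; next }
        # The first <value> line after it holds the setting.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}
```

For example, `get_property conf/oozie-site.xml oozie.service.JPAService.jdbc.url` should print the JDBC URL configured above. A property that is not present prints nothing.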

ShareLib update

Let’s update the sharelib folder.

hadoop@gandhari:~/oozie$ bin/oozie-setup.sh sharelib create -fs hdfs://gandhari:9000 -locallib oozie-sharelib-4.0.0-cdh5.1.0-yarn.tar.gz -locallib oozie-hadooplibs-4.0.0-cdh5.1.0.tar.gz

....

the destination path for sharelib is: /user/hadoop/share/lib/lib_20160826174043
hadoop@gandhari:~/oozie$ bin/oozied.sh start

hadoop@gandhari:~/oozie$ bin/oozie admin -oozie http://gandhari:11000/oozie -sharelibupdate hdfs://gandhari:9000/user/hadoop/share/lib/lib_20160826174043
 null

hadoop@gandhari:~/oozie$ bin/oozie admin -shareliblist -oozie http://gandhari:11000/oozie
 [Available ShareLib]
 hive
 distcp
 mapreduce-streaming
 oozie
 hcatalog
 hive2
 sqoop
 pig

hadoop@gandhari:~/oozie$ bin/oozied.sh stop

hadoop@gandhari:~/oozie$ bin/oozied.sh start

Point your browser to http://gandhari:11000/oozie/ to get the console.
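The server can also be checked from the command line with the status option of the oozie admin command, which reports the system mode. As a small sketch, the helper below (oozie_is_normal, a made-up name) reads that output on stdin and succeeds only when the server reports NORMAL mode.

```shell
#!/bin/sh
# Read the output of "oozie admin ... -status" on stdin and
# succeed only if the server reports NORMAL mode.
oozie_is_normal() {
    grep -q 'System mode: NORMAL'
}
```

A typical use would be `bin/oozie admin -oozie http://gandhari:11000/oozie -status | oozie_is_normal && echo "Oozie is up"`.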

 

Oozie mkdistro fails with mvn: command not found

Installing Oozie is not as straightforward as installing other applications. When I ran the mkdistro script, it failed with the error given below.

./mkdistro.sh
./mkdistro.sh: line 71: mvn: command not found

Maven is a build tool for Java, and it was not installed in my Ubuntu VM. Hence we need to install it using the command given below to make the script work.

hadoop@gandhari:~/oozie/bin$ sudo apt-get install maven
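To catch this kind of failure before the build starts, a script can test for the tool first. This is a minimal sketch, and require_cmd is a hypothetical helper name, not something shipped with Oozie.

```shell
#!/bin/sh
# Succeed if command $1 is on PATH; otherwise print a hint and fail.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1: command not found; install it first" >&2
    return 1
}
```

With it, `require_cmd mvn && ./mkdistro.sh` only starts the build when Maven is available.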