MySQL error 1449: The user specified as a definer does not exist

I faced this error when exporting a database from one server to another, because the user specified as the definer doesn't exist on the target server. So I changed the incorrect username to the right one as shown below.
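
Before rewriting the views, you can check which definers are actually missing; a query along these lines (it assumes read access to mysql.user) lists the views whose definer has no matching account:

SELECT table_schema, table_name, definer
FROM information_schema.views
WHERE definer NOT IN (SELECT CONCAT(user, '@', host) FROM mysql.user);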

With reference to http://stackoverflow.com/questions/10169960/mysql-error-1449-the-user-specified-as-a-definer-does-not-exist

Execute this query to generate the list of ALTER statements to run:


SELECT CONCAT("ALTER DEFINER=`youruser`@`host` VIEW ",
table_name, " AS ", view_definition, ";")
FROM information_schema.views
WHERE table_schema='your-database-name';

It will give a list of queries like the ones below:

ALTER DEFINER='jessica'@'%' VIEW vw_audit_log AS select `a`.`ID` AS `id`,`u`.`USER_NAME` AS `user_name`,`a`.`LOG_TYPE` AS `log_type`,`a`.`LOG_TIME` AS `log_time`,`a`.`MESSAGE` AS `message`,`a`.`STATUS` AS `status` from (`your-database-name`.`user_info` `u` join `your-database-name`.`audit_log` `a`) where (`u`.`ID` = `a`.`USER_ID`) order by `a`.`ID` desc;

ALTER DEFINER='jessica'@'%' VIEW vw_user_role AS select `ur`.`NAME` AS `ROLE_NAME`,`ur`.`EMAIL_PERMISSION` AS `EMAIL_PERMISSION`,`urm`.`user_id` AS `USER_ID`,
`urm`.`role_id` AS `ROLE_ID` from (`your-database-name`.`user_role` `ur` join `your-database-name`.`user_role_mapping` `urm`) where (`ur`.`ID` = `urm`.`role_id`);

ALTER DEFINER='jessica'@'%' VIEW vw_user_role_mapping AS select `ur`.`ROLE_NAME` AS `ROLE_NAME`,`ur`.`EMAIL_PERMISSION` AS `EMAIL_PERMISSION`,`ur`.`USER_ID` AS `USER_ID`,`ur`.`ROLE_ID` AS `ROLE_ID`,`ui`.`USER_NAME` AS `USER_NAME`,`ui`.`PASSWORD` AS `PASSWORD`,`ui`.`ENABLED` AS `ENABLED` from (`your-database-name`.`vw_user_role` `ur` join `your-database-name`.`user_info` `ui`) where (`ur`.`USER_ID` = `ui`.`ID`);

After executing these queries, the problem was resolved.

img_1522

Export and Import with MySQLDump

Here is the syntax.

To export the database –


C:\xampp7\mysql\bin>mysqldump.exe --databases mydatabase --user myuser --password >mydatabase.dump.sql
Enter password: ************

To import the dump into the database –

First, create the database on the other server:


MariaDB [(none)]> create database mydatabase;
Query OK, 1 row affected (0.00 sec)

From the command prompt, let's import the dump:


D:\Softwares\xampp\mysql\bin>mysql -uroot -p mydatabase <D:\gandhari\documents\projects\jessica\mydatabase.dump.sql
Enter password:

img_1520

Lab 14: Sending MapReduce output to JDBC

Hi Hadoopers,

Unfortunately I couldn't post on time, as I've been down with the flu. Here is the post for today. Let's see how to send the output of the Reducer to JDBC in this post. I'll take the Lab 08 – MapReduce using custom class as Key post and modify it.

logo-mapreduce
Mapper

We have no change in the Mapper. It accepts a LongWritable key and a Text value as input and emits the custom key EntryCategory with an IntWritable value.
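
For reference, the Lab 08 mapper looks roughly like this (a sketch only; the real category-parsing logic is in the Lab 08 post, and the no-arg constructor and setCategory() setter on EntryCategory are assumed here):

package org.grassfield.hadoop;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.grassfield.hadoop.entity.EntryCategory;

/**
 * Sketch of the Lab 08 mapper: emits (EntryCategory, 1) for each category found.
 */
public class FeedCategoryCountMapper
        extends Mapper<LongWritable, Text, EntryCategory, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // The real parsing of the RSS line lives in Lab 08; this only shows the contract.
        EntryCategory category = new EntryCategory();
        category.setCategory(value.toString().trim());
        context.write(category, ONE);
    }
}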

Reducer

The Reducer accepts the output of the Mapper as its input, EntryCategory as the key and IntWritable as the value. It emits the custom class DBOutputWritable as the key and NullWritable as the value.

/**
 * 
 */
package org.grassfield.hadoop;

import java.io.IOException;
import java.util.Date;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.grassfield.hadoop.entity.DBOutputWritable;
import org.grassfield.hadoop.entity.EntryCategory;

/**
 * Reducer for Feed Category reducer
 * @author pandian
 *
 */
public class FeedCategoryReducer extends 
    Reducer<EntryCategory, IntWritable, DBOutputWritable, NullWritable> {

    @Override
    protected void reduce(EntryCategory key, Iterable<IntWritable> values, Context context) {
        int sum=0;
        for (IntWritable value:values){
            sum+=value.get();
        }
        DBOutputWritable db = new DBOutputWritable();
        db.setParseDate(new java.sql.Date(new Date().getTime()));
        db.setCategory(key.getCategory());
        db.setCount(sum);
        try {
            context.write(db, NullWritable.get());
        } catch (IOException | InterruptedException e) {
            System.err.println("Error while updating record in database");
            e.printStackTrace();
        }
    }
}

 

DBOutputWritable

Our bean DBOutputWritable should implement the Writable and DBWritable interfaces so that we can write it to the database.

package org.grassfield.hadoop.entity;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

/**
 * Bean for table feed_analytics
 * @author pandian
 *
 */
public class DBOutputWritable implements Writable, DBWritable {
    private Date parseDate;
    private String category;
    private int count;

    public Date getParseDate() {
        return parseDate;
    }

    public void setParseDate(Date parseDate) {
        this.parseDate = parseDate;
    }

    public String getCategory() {
        return category;
    }

    public void setCategory(String category) {
        this.category = category;
    }

    public int getCount() {
        return count;
    }

    public void setCount(int count) {
        this.count = count;
    }

    @Override
    public void readFields(ResultSet arg0) throws SQLException {
        throw new RuntimeException("not implemented");
    }

    @Override
    public void write(PreparedStatement ps) throws SQLException {
        ps.setDate(1, this.parseDate);
        ps.setString(2, this.category);
        ps.setInt(3, this.count);
    }

    @Override
    public void readFields(DataInput arg0) throws IOException {
        throw new RuntimeException("not implemented");

    }

    @Override
    public void write(DataOutput arg0) throws IOException {
        throw new RuntimeException("not implemented");
    }
}

Driver

The Driver is where I specify my database details. Note the changes in the output key and value classes.

package org.grassfield.hadoop;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.grassfield.hadoop.entity.DBOutputWritable;
import org.grassfield.hadoop.entity.EntryCategory;

/**
 * A driver program to count the categories in an RSS XML file. This may not
 * be the right approach to parse the XML; it is only for demo purposes.
 * 
 * @author pandian
 *
 */
public class FeedCategoryCountDriver extends Configured
        implements Tool {

    @Override
    public int run(String[] args) throws ClassNotFoundException, IOException, InterruptedException {
        Configuration conf = getConf();
        DBConfiguration.configureDB(
                conf, 
                "com.mysql.jdbc.Driver", 
                "jdbc:mysql://localhost:3306/feed_analytics?useUnicode=true&characterEncoding=UTF-8",
                "feed_analytics",
                "P@ssw0rd");
        GenericOptionsParser parser = new GenericOptionsParser(conf,
                args);
        args = parser.getRemainingArgs();

        Path input = new Path(args[0]);

        Job job = new Job(conf, "Feed Category Count");
        job.setJarByClass(getClass());

        job.setMapOutputKeyClass(EntryCategory.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(DBOutputWritable.class);
        job.setOutputValueClass(NullWritable.class);
        job.setOutputFormatClass(DBOutputFormat.class);
        
        job.setMapperClass(FeedCategoryCountMapper.class);
        job.setPartitionerClass(FeedCategoryPartitioner.class);
        job.setCombinerClass(FeedCategoryCombiner.class);
        job.setReducerClass(FeedCategoryReducer.class);
        job.setNumReduceTasks(3);
        
        try {
            FileInputFormat.setInputPaths(job, input);
            DBOutputFormat.setOutput(job, 
                    "feed_category", //table name
                    new String[]{"parseDate", "category", "count"}    //fields
            );
        } catch (IOException e) {
            e.printStackTrace();
        }

        return job.waitForCompletion(true)?0:1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(),
                new FeedCategoryCountDriver(), args));
    }
}

Add the MySQL driver to the Maven dependencies. If you don't use Maven, add the library as an external JAR dependency.

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.6</version>
        </dependency>

Important: copy the JAR to $HADOOP_HOME/lib/native and $HADOOP_HOME/share/hadoop/mapreduce/lib/.

Restart the Hadoop daemons.

Table Structure & DB setup

Let’s create our table first.

mysql-php

CREATE TABLE `feed_category` (
`id` bigint(20) NOT NULL,
`parseDate` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
`category` varchar(100) COLLATE utf8_bin NOT NULL,
`count` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

ALTER TABLE `feed_category`
ADD PRIMARY KEY (`id`);

ALTER TABLE `feed_category`
MODIFY `id` bigint(20) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=151;
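
For context, DBOutputFormat builds its INSERT statement from the field names passed to setOutput(), roughly of the form below, so those names must match the table columns and line up with the setXxx() calls in DBOutputWritable.write():

INSERT INTO feed_category (parseDate, category, count) VALUES (?, ?, ?)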

Execution

Let’s execute now.

hadoop@gandhari:/opt/hadoop-2.6.4/jars$ hadoop jar FeedCategoryCount-14.jar org.grassfield.hadoop.FeedCategoryCountDriver /user/hadoop/feed/2016-09-24

16/09/24 08:35:46 INFO mapreduce.Job: Counters: 40
        File System Counters
                FILE: Number of bytes read=128167
                FILE: Number of bytes written=1162256
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1948800
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=12
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Map-Reduce Framework
                Map input records=1107
                Map output records=623
                Map output bytes=19536
                Map output materialized bytes=4279
                Input split bytes=113
                Combine input records=623
                Combine output records=150
                Reduce input groups=150
                Reduce shuffle bytes=4279
                Reduce input records=150
                Reduce output records=150
                Spilled Records=300
                Shuffled Maps =3
                Failed Shuffles=0
                Merged Map outputs=3
                GC time elapsed (ms)=0
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
                Total committed heap usage (bytes)=1885339648
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=487200
        File Output Format Counters
                Bytes Written=0
        org.grassfield.hadoop.FeedCategoryCountMapper$MapperRCheck
                INVALID=35
                VALID=1072

So, is my table populated?

hadoop035-lab-14-jdbc-output

Yes, it is.

One error to watch out for here is com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure, which means the job could not reach MySQL at the configured JDBC URL.
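
If you run into it, a standalone JDBC check with the same driver, URL and credentials as the driver class above helps confirm whether MySQL is reachable from that machine. A minimal sketch (the class name is just for illustration):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcCheck {
    public static void main(String[] args) throws Exception {
        // Same driver, URL and credentials as configured in FeedCategoryCountDriver
        Class.forName("com.mysql.jdbc.Driver");
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/feed_analytics?useUnicode=true&characterEncoding=UTF-8",
                "feed_analytics", "P@ssw0rd");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM feed_category")) {
            rs.next();
            System.out.println("feed_category rows: " + rs.getInt(1));
        }
    }
}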

Have a good weekend guys. Let me take some rest before moving to MRUnit.

Oozie Installation and Configuration

Hi,

Here is the output of my latest lab exercise – Oozie installation. This process was not straightforward, as the steps given by the master didn't work as expected. I had to manually tune the SQL to get it working.

2000px-wikipedia-logo-v2-en-svg1

Apache Oozie is a server-based workflow scheduling system to manage Hadoop jobs.

Workflows in Oozie are defined as a collection of control flow and action nodes in a directed acyclic graph. Control flow nodes define the beginning and the end of a workflow (start, end and failure nodes) as well as a mechanism to control the workflow execution path (decision, fork and join nodes). Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions including Hadoop MapReduce, Hadoop distributed file system operations, Pig, SSH, and email. Oozie can also be extended to support additional types of actions.
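
To make the node types concrete, a minimal workflow definition with a single MapReduce action looks something like this (a sketch; the action name and the property shown are placeholders, not part of this lab):

<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapreduce.job.queuename</name>
                    <value>default</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>MapReduce action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>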

Here are the steps.

Download and extract

hadoop@gandhari:~$ wget http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.1.0.tar.gz

hadoop@gandhari:~$ gunzip oozie-4.0.0-cdh5.1.0.tar.gz

hadoop@gandhari:~$ tar -xvf oozie-4.0.0-cdh5.1.0.tar

hadoop@gandhari:~$ ln -s oozie-4.0.0-cdh5.1.0/ oozie

hadoop@gandhari:~$ ls oozie
 bin             examples     README.txt          webapp
 builds          hadooplibs   release-log.txt     workflowgenerator
 client          LICENSE.txt  sharelib            work.log
 core            login        source-headers.txt  zookeeper-security-tests
 DISCLAIMER.txt  minitest     src
 distro          NOTICE.txt   tools
 docs            pom.xml      utils

Setting up the Oozie MySQL user

hadoop@gandhari:~$ mysql -u root -p

mysql> CREATE DATABASE oozie;

mysql> USE oozie;

mysql> CREATE USER 'oozie' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT SELECT,INSERT,UPDATE,DELETE ON *.* TO 'oozie';

mysql> GRANT ALL ON *.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL ON *.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'192.168.0.169' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'192.168.0.169' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO 'oozie'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'gandhari' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'P@ssw0rd';
 Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> GRANT ALL privileges ON *.* TO '%'@'%' IDENTIFIED BY 'P@ssw0rd';

mysql> GRANT ALL privileges ON *.* TO '*'@'*' IDENTIFIED BY 'P@ssw0rd';

mysql> FLUSH PRIVILEGES;

mysql> exit

Oozie portal settings

hadoop@gandhari:~$ pwd
 /opt/hadoop

hadoop@gandhari:~$ cd etc/hadoop/

hadoop@gandhari:~/etc/hadoop$ vi core-site.xml

<!-- OOZIE -->
 <property>
 <name>hadoop.proxyuser.oozie.hosts</name>
 <value>*</value>
 </property>
 <property>
 <name>hadoop.proxyuser.oozie.groups</name>
 <value>*</value>
 </property>

Creating Oozie database

hadoop@gandhari:~/oozie$ cd /opt/hadoop

hadoop@gandhari:~$ cd oozie/bin

hadoop@gandhari:~/oozie/bin$ ./ooziedb.sh create -run
 setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

Validate DB Connection
 DONE
 Check DB schema does not exist
 DONE
 Check OOZIE_SYS table does not exist
 DONE
 Create SQL schema
 DONE
 Create OOZIE_SYS table
 DONE

Oozie DB has been created for Oozie version '4.0.0-cdh5.1.0'

The SQL commands have been written to: /tmp/ooziedb-5275812012387848818.sql

This process threw many errors. The default values generated by this script were faulty; I had to change them all to CURRENT_TIMESTAMP to get it working.

extjs

The ExtJS library is not bundled with Oozie due to license restrictions, so we need to add it separately.

hadoop@gandhari:~/oozie/bin$ mkdir /opt/hadoop/extjs

hadoop@gandhari:~/oozie/bin$ cd /opt/hadoop/extjs

hadoop@gandhari:~/extjs$ wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip

Setting up Oozie portal

Let’s build the war file first.

hadoop@gandhari:~/extjs$ cd /opt/hadoop/oozie/bin/

hadoop@gandhari:~/oozie/bin$ ./addtowar.sh -inputwar ../oozie.war -outputwar ../oozieout.war -extjs /opt/hadoop/extjs/ext-2.2.zip -hadoopJarsSNAPSHOT ../oozie-hadooplibs-4.0.0-cdh5.1.0.tar.gz -hadoop 2.6.4 $HADOOP_HOME ../oozie-sharelib-4.0.0-cdh5.1.0-yarn.tar.gz -jars /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar

...

New Oozie WAR file with added 'Hadoop JARs, ExtJS library, JARs' at ../oozieout.war

hadoop@gandhari:~/oozie/bin$ cd ..
 hadoop@gandhari:~/oozie$ ls *.war
 oozieout.war  oozie.war

Let's copy the WAR file to Oozie Tomcat's webapps folder. The MySQL JDBC driver is needed to connect to the Oozie database.

hadoop@gandhari:~/oozie$ cp oozie.war /opt/hadoop/oozie/oozie-server/webapps/

hadoop@gandhari:~/oozie$ cp /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar /opt/hadoop/oozie/lib

hadoop@gandhari:~/oozie$ cp /opt/hadoop/hive/lib/mysql-connector-java-5.1.38.jar /opt/hadoop/oozie/libtools/

hadoop@gandhari:~/oozie$ vi conf/oozie-site.xml

<property>
 <name>oozie.service.JPAService.jdbc.driver</name>
 <value>com.mysql.jdbc.Driver</value>
 <description>
 JDBC driver class.
 </description>
 </property>

<property>
 <name>oozie.service.JPAService.jdbc.url</name>
 <value>jdbc:mysql://gandhari:3306/oozie</value>
 <description>
 JDBC URL.
 </description>
 </property>

<property>
 <name>oozie.service.JPAService.jdbc.username</name>
 <value>oozie</value>
 <description>
 DB user name.
 </description>
 </property>
 <property>
 <name>oozie.service.JPAService.jdbc.password</name>
 <value>P@ssw0rd</value>
 <description>
 DB user password.

IMPORTANT: if password is emtpy leave a 1 space string, the service trims the value,
 if empty Configuration assumes it is NULL.
 </description>
 </property>

ShareLib update

Let’s update the sharelib folder.

hadoop@gandhari:~/oozie$ bin/oozie-setup.sh sharelib create -fs hdfs://gandhari:9000 -locallib oozie-sharelib-4.0.0-cdh5.1.0-yarn.tar.gz -locallib oozie-hadooplibs-4.0.0-cdh5.1.0.tar.gz

....

the destination path for sharelib is: /user/hadoop/share/lib/lib_20160826174043
hadoop@gandhari:~/oozie$ bin/oozied.sh start

hadoop@gandhari:~/oozie$ bin/oozie admin -oozie http://gandhari:11000/oozie -sharelibupdate hdfs://gandhari:9000/user/hadoop/share/lib/lib_20160826174043
 null

hadoop@gandhari:~/oozie$ bin/oozie admin -shareliblist -oozie http://gandhari:11000/oozie
 [Available ShareLib]
 hive
 distcp
 mapreduce-streaming
 oozie
 hcatalog
 hive2
 sqoop
 pig

hadoop@gandhari:~/oozie$ bin/oozied.sh stop

hadoop@gandhari:~/oozie$ bin/oozied.sh start

Point your browser to http://gandhari:11000/oozie/ to get the console.

hadoop008 - oozie

 

Hive installation & configuration

After Hadoop – pseudo-distributed mode installation – second time, this is our next ICT job as part of my course. Let's install and test Hive. This is a continuation of the Hadoop installation, so I'll be following the folder structures and usernames given in the previous post.

Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis. While initially developed by Facebook, Apache Hive is now used and developed by other companies such as Netflix and the Financial Industry Regulatory Authority (FINRA). Amazon maintains a software fork of Apache Hive that is included in Amazon Elastic MapReduce on Amazon Web Services.

2000px-wikipedia-logo-v2-en-svg

Download and Install

hadoop@gandhari:/opt/hadoop-2.6.4$ wget http://download.nus.edu.sg/mirror/apache/hive/hive-2.1.0/apache-hive-2.1.0-bin.tar.gz
hadoop@gandhari:/opt/hadoop-2.6.4$ gunzip apache-hive-2.1.0-bin.tar.gz
hadoop@gandhari:/opt/hadoop-2.6.4$ tar -xvf apache-hive-2.1.0-bin.tar
hadoop@gandhari:/opt/hadoop-2.6.4$ ln -s apache-hive-2.1.0-bin/ hive

Setup Environment – .bashrc changes

Make the following additions to the .bashrc file:
#HIVE VARIABLES
export HIVE_HOME=/opt/hadoop/apache-hive-2.1.0-bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export PATH=$PATH:$HIVE_HOME/bin
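
Reload the shell so the new variables take effect:

hadoop@gandhari:~$ source ~/.bashrc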

Setup Environment – Creating directory structure

hadoop@gandhari:~$ hadoop fs -mkdir /tmp
mkdir: Call From gandhari/192.168.0.169 to gandhari:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

DFS and YARN should be running before we set up Hive.

hadoop@gandhari:~$ start-dfs.sh
hadoop@gandhari:~$ start-yarn.sh
hadoop@gandhari:~$ hadoop fs -mkdir /tmp
mkdir: `/tmp': File exists
hadoop@gandhari:~$ hadoop fs -mkdir /user
hadoop@gandhari:~$ hadoop fs -mkdir /user/hive
hadoop@gandhari:~$ hadoop fs -mkdir /user/hive/warehouse
hadoop@gandhari:~$ hadoop fs -chmod g+w /tmp
hadoop@gandhari:~$ hadoop fs -chmod g+w /user/hive/warehouse

Install MySQL Server

hadoop@gandhari:~$ sudo apt-get install mysql-server
hadoop@gandhari:~$ sudo /etc/init.d/mysql start
[ ok ] Starting mysql (via systemctl): mysql.service.
hadoop@gandhari:~$ sudo apt-get install mysql-client
hadoop@gandhari:~$ sudo apt-get install libmysql-java
hadoop@gandhari:~$ cp /usr/share/java/mysql.jar $HIVE_HOME
hadoop@gandhari:~$ cp /usr/share/java/mysql-connector-java-5.1.38.jar /opt/hadoop/hive/lib/
hadoop@gandhari:~$ /usr/bin/mysql_secure_installation

Creating the Hive database

hadoop@gandhari:~/apache-hive-2.1.0-bin$ mysql -u root -p
Enter password:

mysql> CREATE DATABASE metastore;
Query OK, 1 row affected (0.00 sec)
mysql> USE metastore;
Database changed

mysql> SOURCE /opt/hadoop-2.6.4/hive/scripts/metastore/upgrade/mysql/hive-schema-0.12.0.mysql.sql
mysql> CREATE USER hive@gandhari IDENTIFIED BY 'P@ssw0rd';
mysql> REVOKE ALL PRIVILEGES, GRANT OPTION FROM hive@gandhari;
mysql> GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO hive@gandhari;
mysql> FLUSH PRIVILEGES;
mysql> GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT SELECT,INSERT,UPDATE,SELECT ON *.* TO 'hive' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL ON *.* TO 'hive'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL ON *.* TO 'hive'@'localhost' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL ON *.* TO 'hive'@'%' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hive'@'gandhari' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hive'@'192.168.0.169' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL privileges ON metastore.* TO 'hive'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL privileges ON *.* TO 'hive'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL privileges ON metastore.* TO 'hive'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL privileges ON *.* TO '%'@'%' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL privileges ON *.* TO '*'@'*' IDENTIFIED BY 'P@ssw0rd';
mysql> FLUSH PRIVILEGES;

Grant all permissions to Hive user

mysql> GRANT ALL privileges ON metastore.* TO 'hive'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL privileges ON metastore.* TO 'hive'@'gandhari' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT ALL privileges ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'P@ssw0rd';
mysql> FLUSH PRIVILEGES;
mysql> exit;

Creating Hive config file

hadoop@gandhari:~/hive/conf$ cp hive-default.xml.template hive-site.xml

hadoop@gandhari:~/hive/conf$ vi hive-site.xml

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore</value>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>P@ssw0rd</value>
</property>

<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>

<property>
<name>hive.stats.autogather</name>
<value>false</value>
</property>

<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>

<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp</value>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>/tmp</value>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/tmp/operation_logs</value>
</property>

We'll be launching Hive shortly. Let's make sure the daemons are running:

hadoop@gandhari:~$ jps
7410 ResourceManager
6931 NameNode
7254 SecondaryNameNode
7046 DataNode
7527 NodeManager
7817 Jps

Creating a demo table and testing

hive> CREATE TABLE demo1 (id int, name string);
OK
Time taken: 1.448 seconds
hive> SHOW TABLES;
OK
demo1
Time taken: 0.195 seconds, Fetched: 1 row(s)
hive> select count(*) from demo1;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20160823145925_c4271279-c5c0-4948-a1c3-fb6f79718b5d
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2016-08-23 14:59:29,802 Stage-1 map = 0%,  reduce = 100%
Ended Job = job_local1827679072_0001
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
0
Time taken: 4.525 seconds, Fetched: 1 row(s)
hive>

I faced the following issues –

Error while Starting Hive – DatastoreDriverNotFoundException

Here is a scary exception thrown when I started Hive.

org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

I forgot to copy the MySQL driver to the Hive lib folder. Here is the command to copy it:

cp /usr/share/java/mysql-connector-java-5.1.38.jar /opt/hadoop/hive/lib/

 

org.hibernate.AssertionFailure: null id in entry (don’t flush the Session after an exception occurs)

I'm inserting multiple records into MySQL with Hibernate 5.

After a constraint failure, all subsequent records failed to insert with the error 'org.hibernate.AssertionFailure: null id in entry (don't flush the Session after an exception occurs)'. To get rid of this problem, I clear the Hibernate session with session.clear() when the exception occurs.
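
A minimal sketch of the pattern (the entity type and session handling are placeholders for my actual code):

import java.util.List;

import org.hibernate.Session;
import org.hibernate.Transaction;

public class BatchInsert {

    /**
     * Saves each record in its own transaction. When a constraint violation
     * (or any other failure) occurs, the session is cleared so the failed
     * entity does not break the flush of the records that follow.
     */
    public static void insertAll(Session session, List<?> records) {
        for (Object record : records) {
            Transaction tx = null;
            try {
                tx = session.beginTransaction();
                session.save(record);
                tx.commit();
            } catch (RuntimeException e) {
                // typically a ConstraintViolationException, possibly wrapped
                if (tx != null) {
                    try { tx.rollback(); } catch (RuntimeException ignore) { }
                }
                session.clear();   // the fix described above
            }
        }
    }
}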

Data is getting pumped without any problem now.
