REST in Mule ESB: InitialisationException: No port defined. Set the host attribute either in the request or request-config elements

I got this exception when I deployed a new REST application to Mule ESB today. I was following the instructions at https://docs.mulesoft.com/mule-user-guide/v/3.7/rest-api-examples

The application exposes a REST service on HTTP port 48080 and forwards the parameters to the external website baconipsum.com on port 80. The following was my request config for the external site.


<http:request-config name="HTTP_Request_Configuration" host="baconipsum.com"  doc:name="HTTP Request Configuration" basePath="api" />

The exception below was resolved after I explicitly specified the port for the external website (per the error message, it can be set on either the request or the request-config element).


<http:request-config name="HTTP_Request_Configuration" host="baconipsum.com"  doc:name="HTTP Request Configuration" basePath="api" port="80"/>

 


ERROR 2017-05-23 05:37:09,396 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DefaultArchiveDeployer:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ Failed to deploy artifact 'MuleRESTApp', see below       +
org.mule.module.launcher.DeploymentInitException: InitialisationException: No port defined. Set the host attribute either in the request or request-config elements
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
at org.mule.module.launcher.application.DefaultMuleApplication.init(DefaultMuleApplication.java:212) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.artifact.ArtifactWrapper$2.execute(ArtifactWrapper.java:63) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.artifact.ArtifactWrapper.executeWithinArtifactClassLoader(ArtifactWrapper.java:136) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.artifact.ArtifactWrapper.init(ArtifactWrapper.java:58) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.DefaultArtifactDeployer.deploy(DefaultArtifactDeployer.java:25) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.DefaultArchiveDeployer.deployArtifact(DefaultArchiveDeployer.java:310) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.DefaultArchiveDeployer.deployExplodedApp(DefaultArchiveDeployer.java:297) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.DefaultArchiveDeployer.deployExplodedArtifact(DefaultArchiveDeployer.java:96) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.DeploymentDirectoryWatcher.deployExplodedApps(DeploymentDirectoryWatcher.java:294) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at org.mule.module.launcher.DeploymentDirectoryWatcher.run(DeploymentDirectoryWatcher.java:367) ~[mule-module-launcher-3.8.4.jar:3.8.4]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_80]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) ~[?:1.7.0_80]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) ~[?:1.7.0_80]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.7.0_80]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_80]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_80]
Caused by: org.mule.api.config.ConfigurationException: No port defined. Set the host attribute either in the request or request-config elements (org.mule.api.lifecycle.InitialisationException) (org.mule.api.config.ConfigurationException)
at org.mule.config.builders.AbstractConfigurationBuilder.configure(AbstractConfigurationBuilder.java:49) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractResourceConfigurationBuilder.configure(AbstractResourceConfigurationBuilder.java:69) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory$1.configure(DefaultMuleContextFactory.java:89) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory.doCreateMuleContext(DefaultMuleContextFactory.java:222) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory.createMuleContext(DefaultMuleContextFactory.java:81) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.module.launcher.application.DefaultMuleApplication.init(DefaultMuleApplication.java:203) ~[mule-module-launcher-3.8.4.jar:3.8.4]
... 16 more
Caused by: org.mule.api.config.ConfigurationException: No port defined. Set the host attribute either in the request or request-config elements (org.mule.api.lifecycle.InitialisationException)
at org.mule.config.builders.AbstractConfigurationBuilder.configure(AbstractConfigurationBuilder.java:49) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractResourceConfigurationBuilder.configure(AbstractResourceConfigurationBuilder.java:69) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AutoConfigurationBuilder.autoConfigure(AutoConfigurationBuilder.java:102) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AutoConfigurationBuilder.doConfigure(AutoConfigurationBuilder.java:54) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractConfigurationBuilder.configure(AbstractConfigurationBuilder.java:43) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractResourceConfigurationBuilder.configure(AbstractResourceConfigurationBuilder.java:69) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory$1.configure(DefaultMuleContextFactory.java:89) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory.doCreateMuleContext(DefaultMuleContextFactory.java:222) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory.createMuleContext(DefaultMuleContextFactory.java:81) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.module.launcher.application.DefaultMuleApplication.init(DefaultMuleApplication.java:203) ~[mule-module-launcher-3.8.4.jar:3.8.4]
... 16 more
Caused by: org.mule.api.lifecycle.InitialisationException: No port defined. Set the host attribute either in the request or request-config elements
at org.mule.module.http.internal.request.DefaultHttpRequester.validateRequiredProperties(DefaultHttpRequester.java:197) ~[mule-module-http-3.8.4.jar:3.8.4]
at org.mule.module.http.internal.request.DefaultHttpRequester.initialise(DefaultHttpRequester.java:130) ~[mule-module-http-3.8.4.jar:3.8.4]
at org.mule.processor.chain.AbstractMessageProcessorChain.initialise(AbstractMessageProcessorChain.java:87) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.construct.AbstractFlowConstruct.initialiseIfInitialisable(AbstractFlowConstruct.java:317) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.construct.AbstractPipeline.doInitialise(AbstractPipeline.java:242) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.construct.Flow.doInitialise(Flow.java:75) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.construct.AbstractFlowConstruct$1.onTransition(AbstractFlowConstruct.java:104) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.construct.AbstractFlowConstruct$1.onTransition(AbstractFlowConstruct.java:98) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.lifecycle.AbstractLifecycleManager.invokePhase(AbstractLifecycleManager.java:138) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.construct.FlowConstructLifecycleManager.fireInitialisePhase(FlowConstructLifecycleManager.java:78) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.construct.AbstractFlowConstruct.initialise(AbstractFlowConstruct.java:97) ~[mule-core-3.8.4.jar:3.8.4]
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_80]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_80]
at org.mule.lifecycle.phases.DefaultLifecyclePhase.applyLifecycle(DefaultLifecyclePhase.java:237) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.lifecycle.phases.MuleContextInitialisePhase.applyLifecycle(MuleContextInitialisePhase.java:71) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.lifecycle.RegistryLifecycleCallback.doApplyLifecycle(RegistryLifecycleCallback.java:99) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.lifecycle.RegistryLifecycleCallback.onTransition(RegistryLifecycleCallback.java:71) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.lifecycle.RegistryLifecycleManager.invokePhase(RegistryLifecycleManager.java:155) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.lifecycle.RegistryLifecycleManager.fireLifecycle(RegistryLifecycleManager.java:126) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.registry.AbstractRegistry.fireLifecycle(AbstractRegistry.java:146) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.registry.AbstractRegistry.initialise(AbstractRegistry.java:116) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.spring.SpringXmlConfigurationBuilder.createSpringRegistry(SpringXmlConfigurationBuilder.java:177) ~[mule-module-spring-config-3.8.4.jar:3.8.4]
at org.mule.config.spring.SpringXmlConfigurationBuilder.doConfigure(SpringXmlConfigurationBuilder.java:100) ~[mule-module-spring-config-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractConfigurationBuilder.configure(AbstractConfigurationBuilder.java:43) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractResourceConfigurationBuilder.configure(AbstractResourceConfigurationBuilder.java:69) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AutoConfigurationBuilder.autoConfigure(AutoConfigurationBuilder.java:102) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AutoConfigurationBuilder.doConfigure(AutoConfigurationBuilder.java:54) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractConfigurationBuilder.configure(AbstractConfigurationBuilder.java:43) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.config.builders.AbstractResourceConfigurationBuilder.configure(AbstractResourceConfigurationBuilder.java:69) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory$1.configure(DefaultMuleContextFactory.java:89) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory.doCreateMuleContext(DefaultMuleContextFactory.java:222) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.context.DefaultMuleContextFactory.createMuleContext(DefaultMuleContextFactory.java:81) ~[mule-core-3.8.4.jar:3.8.4]
at org.mule.module.launcher.application.DefaultMuleApplication.init(DefaultMuleApplication.java:203) ~[mule-module-launcher-3.8.4.jar:3.8.4]
... 16 more


Enable JMX in MuleSoft ESB

To enable JMX for MuleSoft ESB:

  • Create a directory and call it MuleJMXApp
  • Drop the following code into it as mule-config.xml
<?xml version="1.0" encoding="UTF-8"?>
<mule xmlns="http://www.mulesoft.org/schema/mule/core"
    xmlns:management="http://www.mulesoft.org/schema/mule/management"
    xmlns:spring="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-current.xsd
        http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
        http://www.mulesoft.org/schema/mule/management http://www.mulesoft.org/schema/mule/management/current/mule-management.xsd">
    <management:jmx-server>
        <management:connector-server url="service:jmx:rmi:///jndi/rmi://127.0.0.1:9000/jmxrmi"/>
    </management:jmx-server>
</mule>
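
Once this app is deployed, you can attach jconsole to the URL above to browse the MBeans. Alternatively, here is a minimal standalone JMX client sketch. This is my own illustration (not from the Mule docs); the only thing it assumes is the connector URL configured in mule-config.xml above.

import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class MuleJmxCheck {
    public static void main(String[] args) throws Exception {
        //must match the connector-server url in mule-config.xml above
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:9000/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            //list the MBean domains exposed by the server; Mule registers its beans here
            for (String domain : mbsc.getDomains()) {
                System.out.println(domain);
            }
        }
    }
}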


Ref:
https://www.ignoredbydinosaurs.com/posts/256-setting-up-jmx-on-mule-esb

Eclipse project dependency and Maven

Hi,

I haven’t been able to schedule posts for the past two days, as I’ve been stuck creating the input data for my ongoing exercise.

Today, let’s talk about updating Maven dependencies when you have project dependencies in Eclipse.

Say I have a project dependency in my Eclipse project.


Eclipse recognizes it well, and your code will not show any errors when you use classes from the dependency.

But Maven doesn’t care about project dependencies unless you instruct it to, so my build failed.

The project I rely on is also a Maven project, with the following identifiers.

groupId: jatomrss
artifactId: jatomrss
version: 0.0.5-SNAPSHOT

I define the same in my pom.

<!-- RSS feed parsing library - local Eclipse project -->
<dependency>
    <groupId>jatomrss</groupId>
    <artifactId>jatomrss</artifactId>
    <version>0.0.5-SNAPSHOT</version>
    <scope>compile</scope>
</dependency>

So what happens? (Note that Maven resolves this from the local repository, not from the Eclipse workspace, so the jatomrss project must have been installed there with mvn install first.)

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4.075 s
[INFO] Finished at: 2016-10-06T05:46:42+08:00
[INFO] Final Memory: 28M/337M
[INFO] ------------------------------------------------------------------------

Please check how to add non-Maven local jars to your Maven projects in my post Adding local libraries to Maven.

Good day.

java.lang.Exception: java.io.IOException: Incorrect string value: '\xE0\xAE\xB5\xE0\xAF\x87...'

Hi Hadoopers,

This is a nasty exception that killed my reducer task, which updates my MySQL table with the reducer output.

The reason behind this is Unicode characters.

The MySQL table was created with a non-Unicode Western encoding, while I was trying to insert multilingual Unicode text. After changing the table collation (and, if needed, the field collation) to utf8_bin, it worked fine.

alter table FeedEntryRecord convert to character set utf8 collate utf8_bin;
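
To sanity-check the fix from Java, here is a minimal JDBC sketch. This is my own illustration: the database name, credentials and the title column are placeholders, and the useUnicode/characterEncoding properties simply tell Connector/J to send UTF-8 to the server.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class UnicodeInsertCheck {
    public static void main(String[] args) throws Exception {
        //older Connector/J versions may need an explicit driver load
        Class.forName("com.mysql.jdbc.Driver");
        //useUnicode/characterEncoding make the driver send UTF-8 to the server
        String url = "jdbc:mysql://localhost:3306/feed_analytics"
                + "?useUnicode=true&characterEncoding=UTF-8";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
                PreparedStatement ps = con.prepareStatement(
                        "insert into FeedEntryRecord (title) values (?)")) {
            ps.setString(1, "வணக்கம்"); //multilingual text that previously failed
            ps.executeUpdate();
        }
    }
}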

 

Lab 14: Sending MapReduce output to JDBC

Hi Hadoopers,

Unfortunately I couldn’t post on time, as I’ve been hit with the flu. Here is the post for today: let’s see how to send the output of the Reducer to JDBC. I’ll take the Lab 08 – MapReduce using custom class as Key post and modify it.

Mapper

We have no change in the Mapper. It accepts a LongWritable and a Text object as input and emits the custom key EntryCategory with an IntWritable value.

Reducer

The Reducer accepts the output of the Mapper as its input: EntryCategory as key and IntWritable as value. It emits DBOutputWritable as key and NullWritable as value.

/**
 * 
 */
package org.grassfield.hadoop;

import java.io.IOException;
import java.util.Date;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.grassfield.hadoop.entity.DBOutputWritable;
import org.grassfield.hadoop.entity.EntryCategory;

/**
 * Reducer for the feed category count
 * @author pandian
 *
 */
public class FeedCategoryReducer extends 
    Reducer<EntryCategory, IntWritable, DBOutputWritable, NullWritable> {

    @Override
    protected void reduce(EntryCategory key, Iterable<IntWritable> values, Context context) {
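        //sum up the partial counts for this category emitted by the mapper and combiner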
        int sum=0;
        for (IntWritable value:values){
            sum+=value.get();
        }
        DBOutputWritable db = new DBOutputWritable();
        db.setParseDate(new java.sql.Date(new Date().getTime()));
        db.setCategory(key.getCategory());
        db.setCount(sum);
        try {
            context.write(db, NullWritable.get());
        } catch (IOException | InterruptedException e) {
            System.err.println("Error while updating record in database");
            e.printStackTrace();
        }
    }
}

 

DBOutputWritable

Our bean DBOutputWritable should implement the Writable and DBWritable interfaces so that it can be written to the database.

package org.grassfield.hadoop.entity;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

/**
 * Bean for table feed_analytics
 * @author pandian
 *
 */
public class DBOutputWritable implements Writable, DBWritable {
    private Date parseDate;
    private String category;
    private int count;

    public Date getParseDate() {
        return parseDate;
    }

    public void setParseDate(Date parseDate) {
        this.parseDate = parseDate;
    }

    public String getCategory() {
        return category;
    }

    public void setCategory(String category) {
        this.category = category;
    }

    public int getCount() {
        return count;
    }

    public void setCount(int count) {
        this.count = count;
    }

    @Override
    public void readFields(ResultSet arg0) throws SQLException {
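        //only needed when reading rows back with DBInputFormat; not used by this job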
        throw new RuntimeException("not implemented");
    }

    @Override
    public void write(PreparedStatement ps) throws SQLException {
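        //parameter order must match the field array passed to DBOutputFormat.setOutput() in the driver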
        ps.setDate(1, this.parseDate);
        ps.setString(2, this.category);
        ps.setInt(3, this.count);
    }

    @Override
    public void readFields(DataInput arg0) throws IOException {
        throw new RuntimeException("not implemented");

    }

    @Override
    public void write(DataOutput arg0) throws IOException {
        throw new RuntimeException("not implemented");
    }
}

Driver

The Driver is where I specify my database details. Note the changes in the output key and value classes.

package org.grassfield.hadoop;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.grassfield.hadoop.entity.DBOutputWritable;
import org.grassfield.hadoop.entity.EntryCategory;

/**
 * A Mapper Driver Program to count the categories in RSS XML file This may not
 * be the right approach to parse the XML. Only for demo purpose
 * 
 * @author pandian
 *
 */
public class FeedCategoryCountDriver extends Configured
        implements Tool {

    @Override
    public int run(String[] args) throws ClassNotFoundException, IOException, InterruptedException {
        Configuration conf = getConf();
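        //register the JDBC driver class, connection URL, username and password with the job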
        DBConfiguration.configureDB(
                conf, 
                "com.mysql.jdbc.Driver", 
                "jdbc:mysql://localhost:3306/feed_analytics?useUnicode=true&characterEncoding=UTF-8",
                "feed_analytics",
                "P@ssw0rd");
        GenericOptionsParser parser = new GenericOptionsParser(conf,
                args);
        args = parser.getRemainingArgs();

        Path input = new Path(args[0]);

        Job job = new Job(conf, "Feed Category Count");
        job.setJarByClass(getClass());

        job.setMapOutputKeyClass(EntryCategory.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(DBOutputWritable.class);
        job.setOutputValueClass(NullWritable.class);
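        //DBOutputFormat sends the reducer output to the database table configured below instead of HDFS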
        job.setOutputFormatClass(DBOutputFormat.class);
        
        job.setMapperClass(FeedCategoryCountMapper.class);
        job.setPartitionerClass(FeedCategoryPartitioner.class);
        job.setCombinerClass(FeedCategoryCombiner.class);
        job.setReducerClass(FeedCategoryReducer.class);
        job.setNumReduceTasks(3);
        
        try {
            FileInputFormat.setInputPaths(job, input);
            DBOutputFormat.setOutput(job, 
                    "feed_category", //table name
                    new String[]{"parseDate", "category", "count"}    //fields
            );
        } catch (IOException e) {
            e.printStackTrace();
        }

        return job.waitForCompletion(true)?0:1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(),
                new FeedCategoryCountDriver(), args));
    }
}

Add the MySQL driver to your Maven dependencies. If you don’t use Maven, add the library as an external jar dependency.

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.6</version>
        </dependency>

Note: copy the jar to HadoopHome/lib/native and HadoopHome/share/hadoop/mapreduce/lib/.

Restart the Hadoop daemons.

Table Structure & DB setup

Let’s create our table first.


CREATE TABLE `feed_category` (
`id` bigint(20) NOT NULL,
`parseDate` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
`category` varchar(100) COLLATE utf8_bin NOT NULL,
`count` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

ALTER TABLE `feed_category`
ADD PRIMARY KEY (`id`);

ALTER TABLE `feed_category`
MODIFY `id` bigint(20) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=151;

Execution

Let’s execute now.

hadoop@gandhari:/opt/hadoop-2.6.4/jars$ hadoop jar FeedCategoryCount-14.jar org.grassfield.hadoop.FeedCategoryCountDriver /user/hadoop/feed/2016-09-24

16/09/24 08:35:46 INFO mapreduce.Job: Counters: 40
        File System Counters
                FILE: Number of bytes read=128167
                FILE: Number of bytes written=1162256
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1948800
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=12
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Map-Reduce Framework
                Map input records=1107
                Map output records=623
                Map output bytes=19536
                Map output materialized bytes=4279
                Input split bytes=113
                Combine input records=623
                Combine output records=150
                Reduce input groups=150
                Reduce shuffle bytes=4279
                Reduce input records=150
                Reduce output records=150
                Spilled Records=300
                Shuffled Maps =3
                Failed Shuffles=0
                Merged Map outputs=3
                GC time elapsed (ms)=0
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
                Total committed heap usage (bytes)=1885339648
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=487200
        File Output Format Counters
                Bytes Written=0
        org.grassfield.hadoop.FeedCategoryCountMapper$MapperRCheck
                INVALID=35
                VALID=1072

So, is my table populated?


Yes, it is. All 150 reducer output records made it into the feed_category table.


Have a good weekend guys. Let me take some rest before moving to MRUnit.

Lab 02 – A Simple Hadoop Mapper with Eclipse and Maven

Hi Hadoopers,

All the tasks given below are done on the Hadoop server. I assume you have downloaded the Eclipse IDE for your platform (I use STS for this demo, as it has the Maven plugin out of the box).

Create a new Java project.


After creating the Java project, convert it into a Maven project.


After adding the Maven capability, add the following Hadoop 2.6.4 dependencies.

<dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.4</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.4</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.6.4</version>
        </dependency>

    </dependencies>

Mapper


Let’s create a mapper to count the words in a file. Refer to https://hadoop.apache.org/docs/r2.7.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v1.0 for more details.

package org.grassfield.hadoop;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * @author Pandian
 *
 */
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    /**
     * This is to store the output string
     */
    private Text word = new Text();
    
    
    /**
     * This is to denote each occurrence of the word
     */
    private IntWritable one = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, IntWritable>.Context context)
            throws IOException, InterruptedException {
        //read line by line
        String line = value.toString();
        
        //tokenize by comma
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            //store the token as the word
            word.set(st.nextToken());
            
            //register the count
            context.write(word, one);
        }
    }

}

Driver

The Mapper class cannot be executed by itself, hence we write a driver.

package org.grassfield.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;

import org.apache.hadoop.io.Text;

/**
 * @author pandian
 *
 */
public class WordCountDriver extends Configured implements Tool {
    public WordCountDriver(Configuration conf){
        //Assign the configuration to super class. 
        //Otherwise you will get null pointer for getConf()
        super(conf);
    }

    @Override
    public int run(String[] args) throws Exception {
        //get the configuration
        Configuration conf = getConf();

        //initiate the parser with arguments and configuration
        GenericOptionsParser parser = new GenericOptionsParser(conf, args);
        args = parser.getRemainingArgs();
        
        //input and output HDFS locations are received as command line argument
        Path input = new Path(args[0]);
        Path output = new Path(args[1]);
        
        //Mapper Job is defined
        Job job = new Job(conf, "Word Count Driver");
        job.setJarByClass(getClass());
        
        //Output format is defined
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        
        //reducer count is defined. We do not have any reducers for this assignment
        job.setNumReduceTasks(0);
        
        //File input and output formats are specified
        FileInputFormat.setInputPaths(job,  input);
        FileOutputFormat.setOutputPath(job, output);
        
        //Set the mapper class and run the job
        job.setMapperClass(WordCountMapper.class);
        //return a non-zero exit code if the job failed
        return job.waitForCompletion(true) ? 0 : 1;
    }
    
    /**
     * @param args input and output files are specified
     * @throws Exception
     */
    public static void main (String[]args) throws Exception{
        Configuration conf = new Configuration();
        WordCountDriver driver = new WordCountDriver(conf);
        driver.run(args);
    }

}

Export

Let’s execute Maven with the clean install target. The jar will be copied to the location given below.

[INFO] --- maven-install-plugin:2.4:install (default-install) @ WordCount ---
[INFO] Installing D:\workspace_gandhari\WordCount\target\WordCount-0.0.1-SNAPSHOT.jar to C:\Users\pandian\.m2\repository\WordCount\WordCount\0.0.1-SNAPSHOT\WordCount-0.0.1-SNAPSHOT.jar

I copy the jar file to the Hadoop server as the hadoop user.

Input file

To execute the mapper, we need a file in HDFS. I have the following file already.

hadoop@gandhari:~/jars$ hadoop fs -cat /user/hadoop/lab01/month.txt
chithirai
vaigasi
aani
aadi
aavani
purattasi
aippasi
karthikai
margazhi
thai
thai
panguni

hadoop@gandhari:~/jars$ hadoop jar WordCount-0.0.1-SNAPSHOT.jar org.grassfield.hadoop.WordCountDriver /user/hadoop/lab01/month.txt /user/hadoop/lab01/output/10
...
16/09/09 23:32:22 INFO mapreduce.JobSubmitter: number of splits:1
16/09/09 23:32:23 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/09/09 23:32:23 INFO mapreduce.Job: Running job: job_local57180184_0001
16/09/09 23:32:23 INFO mapred.LocalJobRunner: Starting task: attempt_local57180184_0001_m_000000_0
16/09/09 23:32:23 INFO mapred.MapTask: Processing split: hdfs://gandhari:9000/user/hadoop/lab01/month.txt:0+90
16/09/09 23:32:23 INFO output.FileOutputCommitter: Saved output of task 'attempt_local57180184_0001_m_000000_0' to hdfs://gandhari:9000/user/hadoop/lab01/output/10/_temporary/0/task_local57180184_0001_m_000000
16/09/09 23:32:24 INFO mapreduce.Job:  map 100% reduce 0%
16/09/09 23:32:24 INFO mapreduce.Job: Job job_local57180184_0001 completed successfully

Output file

Let’s see if our output file was created.

hadoop@gandhari:~/jars$ hadoop fs -ls /user/hadoop/lab01/output/10
Found 2 items
-rw-r--r--   3 hadoop supergroup          0 2016-09-09 23:32 /user/hadoop/lab01/output/10/_SUCCESS
-rw-r--r--   3 hadoop supergroup        114 2016-09-09 23:32 /user/hadoop/lab01/output/10/part-m-00000

The part-m prefix denotes that it is the output of a mapper. Here is the output of our job.

hadoop@gandhari:~/jars$ hadoop fs -cat /user/hadoop/lab01/output/10/part-m-00000
chithirai       1
vaigasi 1
aani    1
aadi    1
aavani  1
purattasi       1
aippasi 1
karthikai       1
margazhi        1
thai    1
thai    1
panguni 1

Interesting, isn’t it?

Have a good week.

Adding local libraries to Maven

How do you add user-defined jar files or in-house libraries to the Maven repository so that your Maven project builds correctly? Here are the steps I followed to add two jar files, c:\eg_agent.jar and c:\eg_util.jar, to the Maven repository.

F:\sts-bundle\apache-maven-3.3.9-bin\apache-maven-3.3.9\bin>mvn install:install-file -Dfile=c:\eg_agent.jar -DgroupId=com.eg -DartifactId=agent -Dversion=6.1.2 -Dpackaging=jar

Once it was added to the repository, I added the dependency as given below.

<dependency>
    <groupId>com.eg</groupId>
    <artifactId>agent</artifactId>
    <version>6.1.2</version>
</dependency>

Here is another one.

F:\sts-bundle\apache-maven-3.3.9-bin\apache-maven-3.3.9\bin>mvn install:install-file -Dfile=c:\eg_util.jar -DgroupId=com.eg -DartifactId=util -Dversion=6.1.2 -Dpackaging=jar

<dependency>
    <groupId>com.eg</groupId>
    <artifactId>util</artifactId>
    <version>6.1.2</version>
</dependency>

Ref: http://stackoverflow.com/questions/29330577/maven-3-3-1-eclipse-dmaven-multimoduleprojectdirectory-system-propery-is-not-s