invalid flag: --release -> [Help 1]

I needed to build a project with JDK 8. When I ran Maven, the build failed with the following error.

invalid flag: --release -> [Help 1]

After removing the version specification below from pom.xml, everything went as expected.

    <configuration>
        <source>1.8</source>
        <target>1.8</target>
    </configuration>
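For context (my reading of the error, not stated above): the --release flag is understood only by JDK 9+ compilers, so a compiler configuration that emits it fails under a JDK 8 javac. If you still want to pin the language level in pom.xml while building on JDK 8, the property form below is a sketch that sets plain source/target levels without that flag:

```xml
<properties>
    <!-- plain source/target levels; no --release flag is passed to javac -->
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
</properties>
```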

Eclipse project dependency and Maven

Hi,

I haven’t been able to schedule posts for the past two days, as I have been stuck creating the input data for my ongoing exercise.

Today, let’s talk about updating Maven dependencies when you have project dependencies in Eclipse.

Say I have project dependencies in my Eclipse project.

hadoop046-distributed-cache-eclipse

Eclipse recognizes them well, and your code will not show any errors when you use classes from those dependencies.

But Maven doesn’t know about Eclipse project dependencies unless you instruct it to. So my build process failed.

The project I rely on is also a Maven project, with the following identifiers.

groupId: jatomrss
artifactId: jatomrss
version: 0.0.5-SNAPSHOT

I define the same coordinates in my pom.xml.

<!-- RSS feed parsing library - local eclipse project -->
<dependency>
    <groupId>jatomrss</groupId>
    <artifactId>jatomrss</artifactId>
    <version>0.0.5-SNAPSHOT</version>
    <scope>compile</scope>
</dependency>
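One thing to note (an assumption about the setup here, not stated above): Maven resolves this coordinate from the local repository, not from the Eclipse workspace, so the jatomrss project must have been installed there first. Assuming its project directory is named jatomrss, something like:

```
cd jatomrss
mvn clean install
```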

So what happens?

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4.075 s
[INFO] Finished at: 2016-10-06T05:46:42+08:00
[INFO] Final Memory: 28M/337M
[INFO] ------------------------------------------------------------------------

Please see my post Adding local libraries to Maven for how to add non-Maven local jars to your Maven projects.

Good day.

Lab 02 – A Simple Hadoop Mapper with Eclipse and Maven

Hi Hadoopers,

All the tasks given below are done on the Hadoop server. I assume you have downloaded the Eclipse IDE for your platform. (I use STS for this demo, as it has the Maven plugin out of the box.)

Create a new Java project.

screenshot-from-2016-09-09-17-54-34

After creating the Java project, convert it into a Maven project.

screenshot-from-2016-09-09-17-57-07

After adding the Maven nature, add the following Hadoop 2.6.4 dependencies.

<dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.4</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.4</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.6.4</version>
        </dependency>

    </dependencies>

Mapper

logo-mapreduce

Let’s create a mapper to count the words in a file. Refer to https://hadoop.apache.org/docs/r2.7.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v1.0 for more details.

package org.grassfield.hadoop;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * @author Pandian
 *
 */
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    /**
     * This is to store the output string
     */
    private Text word = new Text();
    
    
    /**
     * This is to denote each occurrence of the word
     */
    private IntWritable one = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, IntWritable>.Context context)
            throws IOException, InterruptedException {
        //read line by line
        String line = value.toString();
        
        //tokenize by comma
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            //store the token as the word
            word.set(st.nextToken());
            
            //register the count
            context.write(word, one);
        }
    }

}
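To see what the map() body does without running Hadoop, here is a plain-Java sketch of the same tokenize loop (the class name and sample line below are mine, not part of the lab):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeDemo {

    // Mirrors the mapper's loop: split a line on commas, producing
    // one token per (word, 1) pair the mapper would write to the context.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            words.add(st.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        // "thai" appears twice, so the mapper would emit (thai, 1) twice;
        // summing duplicates would be a reducer's job, which this lab omits.
        System.out.println(tokenize("thai,panguni,thai"));
    }
}
```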

Driver

The Mapper class cannot be executed by itself, hence we write a driver for it.

package org.grassfield.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;

import org.apache.hadoop.io.Text;

/**
 * @author pandian
 *
 */
public class WordCountDriver extends Configured implements Tool {
    public WordCountDriver(Configuration conf){
        //Assign the configuration to super class. 
        //Otherwise you will get null pointer for getConf()
        super(conf);
    }

    @Override
    public int run(String[] args) throws Exception {
        //get the configuration
        Configuration conf = getConf();

        //initiate the parser with arguments and configuration
        GenericOptionsParser parser = new GenericOptionsParser(conf, args);
        args = parser.getRemainingArgs();
        
        //input and output HDFS locations are received as command line argument
        Path input = new Path(args[0]);
        Path output = new Path(args[1]);
        
        //Mapper Job is defined
        Job job = Job.getInstance(conf, "Word Count Driver");
        job.setJarByClass(getClass());
        
        //Output format is defined
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        
        //reducer count is defined. We do not have any reducers for this assignment
        job.setNumReduceTasks(0);
        
        //File input and output formats are specified
        FileInputFormat.setInputPaths(job,  input);
        FileOutputFormat.setOutputPath(job, output);
        
        //Set the mapper class and run the job
        job.setMapperClass(WordCountMapper.class);
        boolean b = job.waitForCompletion(true);
        
        return b ? 0 : 1;
    }
    
    /**
     * @param args input and output files are specified
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        WordCountDriver driver = new WordCountDriver(conf);
        System.exit(driver.run(args));
    }

}

Export

Let’s execute Maven with the clean install goals. The jar will be installed to the location given below.

[INFO] --- maven-install-plugin:2.4:install (default-install) @ WordCount ---
[INFO] Installing D:\workspace_gandhari\WordCount\target\WordCount-0.0.1-SNAPSHOT.jar to C:\Users\pandian\.m2\repository\WordCount\WordCount\0.0.1-SNAPSHOT\WordCount-0.0.1-SNAPSHOT.jar

I copy the jar file to the Hadoop server as the hadoop user.

Input file

To execute the mapper, we need a file in HDFS. I have the following file already.

hadoop@gandhari:~/jars$ hadoop fs -cat /user/hadoop/lab01/month.txt
chithirai
vaigasi
aani
aadi
aavani
purattasi
aippasi
karthikai
margazhi
thai
thai
panguni

hadoop@gandhari:~/jars$ hadoop jar WordCount-0.0.1-SNAPSHOT.jar org.grassfield.hadoop.WordCountDriver /user/hadoop/lab01/month.txt /user/hadoop/lab01/output/10
...
16/09/09 23:32:22 INFO mapreduce.JobSubmitter: number of splits:1
16/09/09 23:32:23 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/09/09 23:32:23 INFO mapreduce.Job: Running job: job_local57180184_0001
16/09/09 23:32:23 INFO mapred.LocalJobRunner: Starting task: attempt_local57180184_0001_m_000000_0
16/09/09 23:32:23 INFO mapred.MapTask: Processing split: hdfs://gandhari:9000/user/hadoop/lab01/month.txt:0+90
16/09/09 23:32:23 INFO output.FileOutputCommitter: Saved output of task 'attempt_local57180184_0001_m_000000_0' to hdfs://gandhari:9000/user/hadoop/lab01/output/10/_temporary/0/task_local57180184_0001_m_000000
16/09/09 23:32:24 INFO mapreduce.Job:  map 100% reduce 0%
16/09/09 23:32:24 INFO mapreduce.Job: Job job_local57180184_0001 completed successfully

Output file

Let’s see if our output file was created:

hadoop@gandhari:~/jars$ hadoop fs -ls /user/hadoop/lab01/output/10
Found 2 items
-rw-r--r--   3 hadoop supergroup          0 2016-09-09 23:32 /user/hadoop/lab01/output/10/_SUCCESS
-rw-r--r--   3 hadoop supergroup        114 2016-09-09 23:32 /user/hadoop/lab01/output/10/part-m-00000

part-m denotes that it is the output of a mapper. Here is the output of our job.

hadoop@gandhari:~/jars$ hadoop fs -cat /user/hadoop/lab01/output/10/part-m-00000
chithirai       1
vaigasi 1
aani    1
aadi    1
aavani  1
purattasi       1
aippasi 1
karthikai       1
margazhi        1
thai    1
thai    1
panguni 1

Interesting, isn’t it?

Have a good week.

Adding local libraries to Maven

How do you add user-defined jar files or in-house libraries to the Maven repository, so that your Maven project builds correctly? Here are the steps I followed to add two jar files, c:\eg_agent.jar and c:\eg_util.jar, to my local Maven repository.

F:\sts-bundle\apache-maven-3.3.9-bin\apache-maven-3.3.9\bin>mvn install:install-file -Dfile=c:\eg_agent.jar -DgroupId=com.eg -DartifactId=agent -Dversion=6.1.2 -Dpackaging=jar

Once it is added to repository, I added the dependency as given below.

<dependency>
    <groupId>com.eg</groupId>
    <artifactId>agent</artifactId>
    <version>6.1.2</version>
</dependency>

Here is another one.

F:\sts-bundle\apache-maven-3.3.9-bin\apache-maven-3.3.9\bin>mvn install:install-file -Dfile=c:\eg_util.jar -DgroupId=com.eg -DartifactId=util -Dversion=6.1.2 -Dpackaging=jar

<dependency>
    <groupId>com.eg</groupId>
    <artifactId>util</artifactId>
    <version>6.1.2</version>
</dependency>

Ref http://stackoverflow.com/questions/29330577/maven-3-3-1-eclipse-dmaven-multimoduleprojectdirectory-system-propery-is-not-s

Oozie mkdistro fails with mvn: command not found

Oozie installation is not as straightforward as that of other applications. When I executed the mkdistro script, it failed with the error given below.

./mkdistro.sh
./mkdistro.sh: line 71: mvn: command not found

Maven is a build tool for Java, which was not installed in my Ubuntu VM. Hence we need to install it with the command below to make the script work.

hadoop@gandhari:~/oozie/bin$ sudo apt-get install maven


Maven and jUnit

package junit.framework does not exist


This was another ugly exception I got while building my simple application. Build Failure!


Unfortunately, I had created the test case inside the src/main/java folder, which Maven doesn’t like. After moving the test to src/test/java, it is happy.
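For reference, Maven’s standard directory layout keeps application and test sources apart:

```
src/main/java       application sources
src/main/resources  application resources
src/test/java       test sources (JUnit cases go here)
src/test/resources  test resources
```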



Maven Tutorial – 3

Let’s create a simple application as we did in Tutorial 2.

Execute the following command. Ensure you have the JDK bin and Maven bin folders in your PATH!

$ mvn archetype:generate -DarchetypeGroupId=org.apache.maven.archetypes -DgroupId=org.grassfield.app -DartifactId=my-app

We are generating a simple project based on an archetype, a project template. After issuing the above command, you will see output scrolling on the screen. You will be asked some questions; just press Enter!

maven01
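If you prefer to skip the prompts, the same kind of project can be generated non-interactively by naming the quickstart archetype explicitly (a sketch; adjust the IDs to taste):

```
mvn archetype:generate \
    -DarchetypeGroupId=org.apache.maven.archetypes \
    -DarchetypeArtifactId=maven-archetype-quickstart \
    -DgroupId=org.grassfield.app \
    -DartifactId=my-app \
    -DinteractiveMode=false
```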

You can see a new folder has been created with your artifactId, which is ‘my-app’ here! If you browse through this folder, you will see the following.

$ ls
my-app
$ cd my-app/
$ ls
pom.xml src
$ cd src
$ ls
main test
$ cd main
$ ls
java
$ cd java
$ ls
org
$ cd org/grassfield/app/
$ ls
App.java

App.java – Created by Maven

maven02

Let’s modify it now.

Here it is!

maven03

Let’s see what’s written inside pom.xml.

pandian@pandian-SH560:~/pandian/my-app$ cat pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.grassfield.app</groupId>
    <artifactId>my-app</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>my-app</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

The POM refers to junit alone as its dependency!

Now let’s compile it.

$ mvn compile

maven04

Done!

Let’s test it now! The following is the generated test case.

maven05

$ mvn test

maven06

Let’s package it now!

$ mvn package

maven08

The console is interesting, but you’d love the Eclipse IDE more, right? Let’s create the Eclipse project structure for this artifact.

$ mvn eclipse:eclipse

maven09

Now import the java project from the location highlighted above.

maven10 - Eclipse IDE

Let’s see how to work with Maven in Eclipse in my next post.

Related posts:

Maven – 1

Maven – 2