Lab 11 – Reading from HDFS & charting

Hi Hadoopers,

This is another simple exercise to read a file from HDFS.

As you know from the previous post, Lab 08 – MapReduce using custom class as Key, the reducer produced three output files:

/user/hadoop/lab08/03/part-r-00000
/user/hadoop/lab08/03/part-r-00001
/user/hadoop/lab08/03/part-r-00002

Here is a program that reads one of these files at a time using the HDFS Java API and prepares a chart from it by filling in an HTML template. Here is the flow diagram.

[Figure: hadoop031-lab-10-google-chart]

Here is the code.

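package org.grassfield.hadoop.output;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Reads one reducer output file from HDFS and exports it to the desired format (HTML).
 * Usage: FeedCategoryHtml <HDFS path of reducer output> <local HTML template>
 * @author pandian
 */
public class FeedCategoryHtml {

    public static void main(String[] args) throws IOException {
        // args[0] is an absolute HDFS path such as /user/hadoop/lab08/03/part-r-00000
        Path path = new Path("hdfs://gandhari:9000" + args[0]);
        StringBuilder opBuffer = new StringBuilder();
        String line;
        // Resolve the file system from the path's own URI, so this also works
        // when fs.defaultFS in the loaded configuration points elsewhere
        try (FileSystem fs = path.getFileSystem(new Configuration());
             BufferedReader br = new BufferedReader(
                     new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
            while ((line = br.readLine()) != null) {
                // Each reducer line is key<TAB>value
                StringTokenizer st = new StringTokenizer(line, "\t");
                if (st.countTokens() < 2) {
                    continue; // skip blank or malformed lines
                }
                opBuffer.append("['" + st.nextToken() + "', " + st.nextToken() + "],\n");
            }
        }

        // Read the local HTML template that contains the MROUTPUT placeholder
        StringBuilder htmlBuffer = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader(args[1]))) {
            while ((line = br.readLine()) != null) {
                htmlBuffer.append(line + "\n");
            }
        }
        // replace() substitutes literally; replaceAll() would treat '$' and '\'
        // in the reducer output as regex replacement syntax
        String htmlOut = htmlBuffer.toString().replace("MROUTPUT", opBuffer.toString());
        System.out.println(htmlOut);
    }

}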
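
If you want to try the row formatting and the placeholder substitution without a cluster, here is a minimal sketch that uses only the JDK. The class name ChartTemplateSketch, its two helper methods, the sample lines and the one-line template are all made up for illustration; only the key<TAB>value line format and the MROUTPUT placeholder come from the program above. It also escapes single quotes in the key, which the program above does not do.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: class, method names and sample data are hypothetical. */
public class ChartTemplateSketch {

    /** Turns key<TAB>value lines into rows like ['key', value], skipping malformed lines. */
    public static String toChartRows(List<String> lines) {
        StringBuilder rows = new StringBuilder();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length < 2 || fields[1].isEmpty()) {
                continue; // not a key/value pair, ignore it
            }
            // Escape backslashes and single quotes so the key stays a valid
            // JavaScript string literal inside the generated page
            String key = fields[0].replace("\\", "\\\\").replace("'", "\\'");
            rows.append("['").append(key).append("', ").append(fields[1]).append("],\n");
        }
        return rows.toString();
    }

    /** Substitutes the rows for the literal MROUTPUT placeholder. */
    public static String fillTemplate(String template, String rows) {
        // String.replace treats both arguments as plain text, unlike replaceAll
        return template.replace("MROUTPUT", rows);
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for a reducer output file
        List<String> sample = Arrays.asList("O'Reilly\t12", "tech\t7", "broken line");
        // Hypothetical template fragment; a real template would be a full HTML page
        String template = "var rows = [\nMROUTPUT];";
        System.out.println(fillTemplate(template, toChartRows(sample)));
    }
}
```

The third sample line has no tab, so it is skipped instead of failing the whole run.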

I know you’d be excited to see the results. Here you go.

[Figures: hadoop032-lab-10-google-chart-1, hadoop033-lab-10-google-chart-2, hadoop034-lab-10-google-chart-3]

See you in another interesting post. Bye for now!
