Java program to read a file from Hadoop Cluster 2 (with file seek)


This post builds on the following earlier posts, so read them first if you have not already:

  1. Copying the File to HDFS file system
  2. A java program to read the file from HDFS
  3. A java program to read the file from HDFS – 2

The InputStream we used in example 3 above is actually an FSDataInputStream, which supports random access within the file. Hence you can “seek” to any position you like and continue reading from there.

Beware, seek() is a costly operation, so use it sparingly!
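To get a feel for what seek does without a cluster at hand, here is a small local sketch. It is only a stand-in: it uses the JDK's RandomAccessFile on a temporary local file instead of FSDataInputStream on HDFS, and the names LocalDoubleCat, readTwice and copyAll are made up for this illustration. The idea is the same, though: read to the end, seek back to offset 0, and read again.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Local analogy only: RandomAccessFile stands in for FSDataInputStream.
public class LocalDoubleCat {

    // Reads the whole file, seeks back to offset 0, and reads it again.
    static String readTwice(String fileName) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (RandomAccessFile in = new RandomAccessFile(fileName, "r")) {
            copyAll(in, out); // first pass: the file pointer ends at end of file
            in.seek(0);       // move the file pointer back to the start
            copyAll(in, out); // second pass: the same bytes are read again
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Copies everything from the current file pointer to end of file.
    private static void copyAll(RandomAccessFile in, ByteArrayOutputStream out)
            throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data written to a temporary file.
        Path tmp = Files.createTempFile("double-cat", ".txt");
        Files.write(tmp, "hello\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(readTwice(tmp.toString()));
        Files.delete(tmp);
    }
}
```

Running main prints the file content twice. Without the seek(0) call the second pass would start at end of file and add nothing.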

The code given in example 3 is modified as below. You can get this resource from

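import java.net.URI;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.io.IOUtils;

public class FileSystemDoubleCat {
    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        FSDataInputStream in = null;
        try {
            in = fs.open(new Path(uri)); // open the HDFS file for reading
            IOUtils.copyBytes(in, System.out, 4096, false); // first pass
            in.seek(0); // seek back to the start so the file is printed again
            IOUtils.copyBytes(in, System.out, 4096, false); // second pass
        } finally {
            IOUtils.closeStream(in); // close the stream even if a copy fails
        }
    }
}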

