HDFS & MapReduce – Set of concepts

After writing about the Hadoop ecosystem, I should write about wiring those blocks together to see them working. Before doing that, I want to document the HDFS/MR paradigm quickly.

If we look at Hadoop at a high level, we can separate it into two parts.

1. HDFS

2. Map/Reduce

Nodes in a Hadoop cluster store their data in HDFS, which splits a huge volume of data into smaller blocks. HDFS runs on top of the host's native filesystem (a Unix filesystem, or whatever filesystem the host uses).
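To make the block idea concrete, here is a minimal sketch of how a file could be cut into fixed-size blocks and spread across nodes. The function names `split_into_blocks` and `place_blocks`, the node names, and the round-robin placement are all assumptions made for illustration; real HDFS uses a rack-aware placement policy. The 128 MB block size and replication factor of 3 match the usual defaults in recent Hadoop versions.

```python
def split_into_blocks(file_size, block_size):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_blocks(blocks, nodes, replication=3):
    """Assign each block index to `replication` nodes, round-robin.

    Hypothetical placement for illustration only; it assumes
    replication <= len(nodes) so the replicas land on distinct nodes.
    """
    placement = {}
    for i in range(len(blocks)):
        placement[i] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


MB = 1024 * 1024
blocks = split_into_blocks(300 * MB, 128 * MB)
placement = place_blocks(blocks, ["node1", "node2", "node3", "node4"])

for index, (offset, length) in enumerate(blocks):
    print(index, offset // MB, length // MB, placement[index])
```

A 300 MB file ends up as two full 128 MB blocks plus one 44 MB block, and each block is held by three different nodes, so losing a single node does not lose the data.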

Locating the data across multiple nodes, based on the catalog of where each block is stored, and aggregating it to arrive at the desired result is called MapReduce processing.
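The flow can be sketched in plain Python with the classic word-count example. This is not the Hadoop API; the names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and everything runs in a single process, whereas Hadoop runs the map step on the nodes that hold each block.

```python
from collections import defaultdict


def map_phase(block):
    """Emit a (word, 1) pair for every word in one block of text."""
    for word in block.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Aggregate the values collected for one key."""
    return key, sum(values)


def word_count(text_blocks):
    # Map: each block is processed independently of the others.
    pairs = [pair for block in text_blocks for pair in map_phase(block)]
    # Shuffle: bring together the values that share a key.
    grouped = shuffle(pairs)
    # Reduce: one aggregation per key.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


text_blocks = ["hadoop stores data", "hadoop processes data"]
counts = word_count(text_blocks)
print(counts)
```

Each block is mapped on its own, which is what lets the work be spread over many nodes; only the grouped intermediate pairs need to travel to the reducers.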

I have depicted it in the diagram below.

HDFS, MapReduce paradigm -  Javashine


 
