MapReduce Job Execution Process – Job initialization

Hi Hadoopers,

I wrote about the first step of MR job execution, Job Submission, in my earlier post.

hadoop037-job-submission-1

In this post, we talk about the second circle, which is Job Initialization.

"I got the job. How will I execute it?" This is what the Hadoop elephant is thinking, with a yarn in its trunk!

hadoop039-job-init

  1. Once the job is submitted, it becomes the Job Tracker's responsibility to initialize it.
  2. The job XML was uploaded to the staging directory, as described in my earlier post. The Job Tracker reads it and performs the validation.
  3. Once the XML validation is complete, the job goes to the scheduler for job validation. The scheduler checks whether the user is authorized to run this job, whether the content is allowed, and so on.
  4. If the job validation is also successful, the scheduler adds the job and updates its scheduling information.
  5. The Job Scheduler initializes the job.
  6. It reads the input split information to find out how many splits the job needs in order to run.
  7. Tasks are created to execute the job. One map task is spawned per split, so a job with many splits gets that many map tasks.
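
The steps above can be sketched as a small, self-contained Python model. This is not Hadoop code: `Job`, `Scheduler`, `validate_xml` and `initialize_job` are hypothetical names made up for illustration, the authorization check is reduced to a simple user lookup, and the task ids are just illustrative strings. It only shows the order of the checks and the one-map-task-per-split rule.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a submitted job (not a Hadoop class)."""
    job_id: str
    user: str
    conf_xml: str                 # contents of the job XML from the staging directory
    splits: list                  # input splits recorded at submission time
    map_tasks: list = field(default_factory=list)


def validate_xml(conf_xml):
    """Step 2: check that the job XML parses and has a <configuration> root."""
    try:
        root = ET.fromstring(conf_xml)
    except ET.ParseError:
        return False
    return root.tag == "configuration"


class Scheduler:
    """Hypothetical scheduler: validates, queues and initializes jobs."""

    def __init__(self, authorized_users):
        self.authorized_users = set(authorized_users)
        self.queue = []

    def validate(self, job):
        # Step 3: assumed here to be a plain user lookup.
        return job.user in self.authorized_users

    def add(self, job):
        # Step 4: the job is added and the schedule information is updated.
        self.queue.append(job)

    def initialize(self, job):
        # Steps 5 to 7: read the splits and create one map task per split.
        job.map_tasks = [
            f"{job.job_id}_m_{i:06d}" for i in range(len(job.splits))
        ]
        return job.map_tasks


def initialize_job(job, scheduler):
    """Run the initialization steps in order, failing at the first bad check."""
    if not validate_xml(job.conf_xml):
        raise ValueError("job XML failed validation")
    if not scheduler.validate(job):
        raise PermissionError(f"user {job.user!r} is not authorized")
    scheduler.add(job)
    return scheduler.initialize(job)
```

For example, calling `initialize_job` with a job that has three splits and an authorized user leaves that job in the scheduler's queue with three map tasks, while an unauthorized user is rejected before the job is ever queued.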

 
