Apache Spark Cluster Architecture

Please refer to the Spark cluster diagram below.

[Figure: Spark cluster overview (cluster-overview.png)]

Here we will look at how a Spark cluster works. The cluster manager is responsible for managing all the worker nodes and for allocating resources when the driver program requests them.
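As a minimal sketch of how a driver attaches to a cluster manager, assuming a standalone Master at the placeholder address spark://master-host:7077 (the host and app name here are made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}

object ClusterDemo {
  def main(args: Array[String]): Unit = {
    // The master URL tells the driver which cluster manager to talk to.
    // "spark://master-host:7077" is a placeholder standalone Master address.
    val conf = new SparkConf()
      .setAppName("cluster-demo")
      .setMaster("spark://master-host:7077")

    // Creating the SparkContext registers the application with the
    // cluster manager, which then launches executors on the worker nodes.
    val sc = new SparkContext(conf)

    // ... job code goes here ...

    sc.stop()
  }
}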

The flow in the diagram works like this:
1. The driver submits a request to the cluster manager to run the job. In a standalone cluster, the request goes to the Master.
2. The Master/cluster manager allocates resources (worker nodes) to the driver.
3. The driver program then contacts each worker node directly; each node runs an executor, which is responsible for executing the tasks.
4. The driver sends the application code to each executor in the form of a JAR.
5. Finally, the SparkContext sends tasks to the executors to run (see the sketch after this list).
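Continuing the sketch above, here is a minimal illustration of steps 3 to 5. The data, partition count, and variable names are hypothetical, but any action such as count() makes the SparkContext ship one task per partition to the executors:

// Continuing with the SparkContext `sc` from the sketch above.
// parallelize splits the data into partitions; each partition
// becomes one task when an action runs.
val data = sc.parallelize(1 to 1000, 8) // 8 partitions -> 8 tasks

// count() is an action: the SparkContext schedules one task per
// partition on the executors and merges the partial counts back
// on the driver.
val total = data.count()
println(s"counted $total elements")

In practice, the application JAR from step 4 is supplied with spark-submit or via SparkConf.setJars, and Spark distributes it to the executors for you.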
