Master Pyspark Zero To Hero 1738689679
Driver:-
1. Heart of the Spark Application
2. Manages the information and state of executors
3. Analyzes the work, then distributes and schedules it on the executors
Executor:-
1. Executes the code assigned by the driver
2. Reports the status of execution back to the driver
A user submits a job to the driver. The driver analyzes the job,
works out how to distribute it, breaks it down into stages and
tasks, and assigns the tasks to the executors. Each executor is a
JVM process that runs on a machine in the cluster and has a number
of cores.
NOTE:-
1. Each task can only work on 1 partition of data at a time.
2. Tasks can execute in parallel.
3. Executors are JVM processes running on cluster machines.
4. Executors host cores, and each core can run 1 task at a time.
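
The notes above imply a simple capacity calculation: one task per partition, one task per core at a time. The sketch below is plain Python, not a Spark API; the function name task_waves and the numbers are made up for illustration, and it assumes every task takes roughly the same time.

```python
import math

def task_waves(num_partitions: int, num_executors: int, cores_per_executor: int) -> int:
    """Estimate how many sequential rounds ("waves") of tasks are needed.

    Hypothetical helper: models one task per partition and one running
    task per core, as described in the notes above.
    """
    # Total number of tasks that can run in parallel across the cluster.
    total_slots = num_executors * cores_per_executor
    # Each wave fills every slot; a final partial wave still counts as one.
    return math.ceil(num_partitions / total_slots)

# 10 partitions on 2 executors with 3 cores each: 6 tasks run first, then 4.
print(task_waves(num_partitions=10, num_executors=2, cores_per_executor=3))  # 2
```

With 6 or fewer partitions the same cluster would finish in a single wave, which is why the partition count is usually considered together with the total core count.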
Narrow Transformation:-
After the transformation is applied, each input partition contributes
to at most one output partition.
Wide Transformation:-
After the transformation is applied, one input partition can
contribute to more than one output partition. This type of
transformation leads to a data shuffle.
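
The difference between the two kinds of transformation can be sketched with ordinary Python lists standing in for partitions. This is only a model of the idea, not Spark code: the helper names narrow_map and wide_group_by_key, the sample records, and the routing rule (a stand-in for a hash partitioner) are all assumptions made for illustration.

```python
from collections import defaultdict

# Two input partitions of (key, value) records; the data is made up.
partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
]

def narrow_map(parts, fn):
    """Narrow: each output partition is built from exactly one input partition."""
    return [[fn(record) for record in part] for part in parts]

def wide_group_by_key(parts, num_output_partitions):
    """Wide: records are re-routed by key, so one input partition
    can feed several output partitions (this movement is the shuffle)."""
    outputs = [defaultdict(list) for _ in range(num_output_partitions)]
    for part in parts:
        for key, value in part:
            # Deterministic routing by key, standing in for a hash partitioner.
            target = ord(key[0]) % num_output_partitions
            outputs[target][key].append(value)
    return [dict(out) for out in outputs]

# Narrow: partition boundaries are unchanged, no data moves between partitions.
doubled = narrow_map(partitions, lambda kv: (kv[0], kv[1] * 2))
print(doubled)  # [[('a', 2), ('b', 4)], [('a', 6), ('c', 8)]]

# Wide: input partition 0 sends "b" to output 0 and "a" to output 1.
grouped = wide_group_by_key(partitions, num_output_partitions=2)
print(grouped)  # [{'b': [2]}, {'a': [1, 3], 'c': [4]}]
```

In the wide case, all values for key "a" end up together even though they started in different input partitions, which is the reason the data has to move.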
How does Spark work on data partitions?
- Spark distributes the data across the cluster in the form of partitions.