Big Data Interview Questions - Part 3
Spark Optimisation
Optimize Spark SQL Joins
Sort-Merge Join (when both sides of the join are large datasets): A sort-merge join consists of two steps. The first step sorts both datasets by the join key. The second step merges the sorted data within each partition by iterating over the elements and joining the rows that have the same join-key value. Make sure the…
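The sort and merge steps can be illustrated with a small pure-Python sketch. This is not Spark's implementation; it only mimics what happens inside a single partition, and the function name sort_merge_join and the sample rows are hypothetical, chosen for illustration.

```python
from typing import Any, Callable, List, Tuple


def sort_merge_join(
    left: List[Any],
    right: List[Any],
    key_left: Callable[[Any], Any],
    key_right: Callable[[Any], Any],
) -> List[Tuple[Any, Any]]:
    """Inner join of two in-memory lists using sort then merge."""
    # Step 1: sort both sides by their join key.
    left_sorted = sorted(left, key=key_left)
    right_sorted = sorted(right, key=key_right)

    # Step 2: walk both sorted lists with two cursors.
    result: List[Tuple[Any, Any]] = []
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        lk = key_left(left_sorted[i])
        rk = key_right(right_sorted[j])
        if lk < rk:
            i += 1  # left key too small, advance left cursor
        elif lk > rk:
            j += 1  # right key too small, advance right cursor
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right_sorted) and key_right(right_sorted[j_end]) == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left_sorted) and key_left(left_sorted[i]) == lk:
                for r in right_sorted[j:j_end]:
                    result.append((left_sorted[i], r))
                i += 1
            j = j_end
    return result


# Hypothetical sample data: (join_key, payload) tuples.
orders = [(3, "c"), (1, "a"), (2, "b"), (1, "d")]
users = [(1, "x"), (3, "y"), (4, "z")]
joined = sort_merge_join(orders, users, lambda r: r[0], lambda r: r[0])
```

Because both inputs are sorted first, the merge phase reads each side in a single forward pass and only needs to hold the current run of matching keys, which is why this strategy suits joins where neither side is small.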