One of the most important components of Apache Spark is Spark SQL, which handles SQL queries and the DataFrame API. At the core of Spark SQL is the Catalyst query optimizer. With Catalyst, Spark gets an extensible query optimizer: new optimization rules can be added without rewriting the optimizer itself. Catalyst is built on Scala’s functional programming constructs, such as pattern matching.
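Why an extensible query optimizer is needed:
To make it easy to add new optimization techniques and features that tackle the various problems seen with big data.
To let external developers extend the optimizer with their own rules.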
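To make the idea of a rule-based, extensible optimizer more concrete, here is a minimal sketch of tree transformation in plain Python. It is only an analogy: Catalyst itself is written in Scala and uses pattern matching on its own tree classes. Every name below (Literal, Attribute, Add, transform_up, fold_constants, drop_zero, optimize) is hypothetical and chosen for illustration; none of it is Spark's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Union


# Hypothetical expression tree, loosely modelled on the kind of
# immutable trees a query optimizer works with. Not Spark classes.
@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Attribute:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Attribute, Add]
# A rule returns a replacement node, or None when it does not apply.
Rule = Callable[[Expr], Optional[Expr]]


def transform_up(node: Expr, rule: Rule) -> Expr:
    """Rewrite the children first, then try the rule on the rebuilt node."""
    if isinstance(node, Add):
        node = Add(transform_up(node.left, rule), transform_up(node.right, rule))
    rewritten = rule(node)
    return node if rewritten is None else rewritten


def fold_constants(node: Expr) -> Optional[Expr]:
    """Rule: Add(Literal(a), Literal(b)) becomes Literal(a + b)."""
    if (isinstance(node, Add)
            and isinstance(node.left, Literal)
            and isinstance(node.right, Literal)):
        return Literal(node.left.value + node.right.value)
    return None


def drop_zero(node: Expr) -> Optional[Expr]:
    """Rule: adding the literal 0 to an expression leaves that expression."""
    if isinstance(node, Add):
        if node.left == Literal(0):
            return node.right
        if node.right == Literal(0):
            return node.left
    return None


def optimize(tree: Expr, rules: List[Rule]) -> Expr:
    """Apply every rule repeatedly until the tree stops changing."""
    while True:
        new_tree = tree
        for rule in rules:
            new_tree = transform_up(new_tree, rule)
        if new_tree == tree:
            return tree
        tree = new_tree


# The expression x + (1 + 2)
plan = Add(Attribute("x"), Add(Literal(1), Literal(2)))
optimized = optimize(plan, [fold_constants, drop_zero])
print(optimized)
```

In this sketch, extending the optimizer means appending one more function to the list passed to optimize; the traversal code does not change. That separation between rules and traversal is the property the extensibility goal above refers to.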
Spark SQL uses Catalyst's general tree transformation framework in four phases: analysis, logical optimization, physical planning, and code generation.