Spark SQL includes a cost-based optimizer, columnar storage and code generation to make queries fast.
https://spark.apache.org/sql/
Performance & Scalability
Spark SQL includes a cost-based optimizer, columnar storage and code generation to make queries fast. At the same time, it scales to thousands of nodes and multi hour queries using the Spark engine, which provides full mid-query fault tolerance. Don't worry about using a different engine for historical data.