In Apache Hive, both the ORDER BY and SORT BY clauses return rows in sorted order, but they differ in how strongly that order is guaranteed:
Order By Query
Syntax
SELECT [ALL | DISTINCT] select_expr, select_expr, …
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[HAVING having_condition]
[ORDER BY col_list]
[LIMIT number];
– To guarantee a total order in the output, ORDER BY sends all rows through a single reducer.
– Because a single reducer can be slow on large inputs, adding a LIMIT clause helps keep the sort time down.
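As a minimal sketch of the points above, the query below combines ORDER BY with LIMIT. The table employees and its columns name and salary are hypothetical and used only for illustration.

```sql
-- Hypothetical table: employees(name STRING, salary DOUBLE)
-- ORDER BY produces a total order, so all rows pass through one reducer.
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 10;  -- only the ten highest salaries are returned
```

The ten rows returned are in globally sorted order.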
To learn about ORDER BY in detail, follow the link: HiveQL Select – Order By Query
Sort By Query
– SORT BY may use multiple reducers to produce the final output.
– It guarantees the order of rows only within each reducer.
– As a result, the overall output may be only partially ordered.
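The behaviour described above can be sketched as follows. The table employees and its columns are hypothetical, and the SET line assumes the MapReduce execution engine, where mapreduce.job.reduces controls the number of reducers.

```sql
-- Hypothetical table: employees(name STRING, salary DOUBLE)
-- Request two reducers (assumption: MapReduce execution engine).
SET mapreduce.job.reduces=2;

-- Each reducer sorts only the rows it receives.
SELECT name, salary
FROM employees
SORT BY salary DESC;
```

With two reducers, the result is likely to be two separately sorted runs of rows, so a salary in the second run can be higher than one in the first.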
Similarly, to learn about SORT BY in detail, follow the link: Hive Sort by