SQL Connector for HiveQL
The native query language supported by Spark is HiveQL. For simple queries, HiveQL is a subset of SQL-92. However, the syntax is different enough that most applications do not work with native HiveQL.
To bridge the difference between SQL and HiveQL, the SQL Connector feature translates standard SQL-92 queries into equivalent HiveQL queries. The SQL Connector performs syntactical translations and structural transformations. For example:
- Quoted Identifiers: The double quotes (") that SQL uses to quote identifiers are translated into back quotes (`) to match HiveQL syntax. The SQL Connector needs to handle this translation because even when a connector reports the back quote as the quote character, some applications still generate double-quoted identifiers.
- Table Aliases: Support is provided for the AS keyword between a table reference and its alias, which HiveQL normally does not support.
- JOIN, INNER JOIN, and CROSS JOIN: SQL JOIN, INNER JOIN, and CROSS JOIN syntax is translated to HiveQL JOIN syntax.
- TOP N/LIMIT: SQL TOP N queries are transformed to HiveQL LIMIT queries.
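
The translations listed above can be illustrated with a small sketch. This is a toy model built on regular expressions, not the SQL Connector's actual implementation: the function name translate_to_hiveql is hypothetical, and the patterns assume a simple single-statement query with no double quotes inside string literals and no subqueries.

```python
import re


def translate_to_hiveql(sql: str) -> str:
    """Toy illustration of the rewrites above (hypothetical helper, not the connector's logic)."""
    # Quoted identifiers: "name" -> `name`
    out = re.sub(r'"([^"]*)"', r'`\1`', sql)

    # INNER JOIN and CROSS JOIN -> plain JOIN
    out = re.sub(r'\b(?:INNER|CROSS)\s+JOIN\b', 'JOIN', out, flags=re.IGNORECASE)

    # Table aliases: drop AS between a table reference and its alias.
    # Only AS directly after a FROM/JOIN table is touched; column aliases are left alone.
    out = re.sub(r'\b(FROM|JOIN)\s+(\S+)\s+AS\s+', r'\1 \2 ', out, flags=re.IGNORECASE)

    # TOP N -> trailing LIMIT N
    m = re.match(r'\s*SELECT\s+TOP\s+(\d+)\s+', out, flags=re.IGNORECASE)
    if m:
        body = out[m.end():].rstrip().rstrip(';')
        out = f'SELECT {body} LIMIT {m.group(1)}'
    return out


query = (
    'SELECT TOP 5 o."id" FROM "orders" AS o '
    'INNER JOIN "customers" AS c ON o."cid" = c."id"'
)
print(translate_to_hiveql(query))
```

For the sample query, the sketch produces a statement with back-quoted identifiers, no AS before the table aliases, a plain JOIN, and a trailing LIMIT 5 in place of TOP 5. A real translator would parse the query rather than pattern-match on text, since regular expressions cannot reliably distinguish identifiers from literals or handle nested queries.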