Specifying Column Names
Use of the SQL asterisk (*) selector to return all columns in a table is limited. You must modify your queries to specify the columns to return.
Depending on the schema type or file format, refer to the following syntax guidelines and modify your queries as needed.
Note that when you write queries for Drill, you must enclose all table and schema names in backticks (`).
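For example, the following query encloses the schema, workspace, and table path in backticks (the file path shown is illustrative):
SELECT column1 FROM `dfs`.`default`.`./opt/drill/test.parquet`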
Note:
The Simba ODBC Connector for Apache Drill provides an application called Drill Explorer that helps generate queries for Drill. However, the Simba Drill JDBC Connector currently does not support Drill Explorer. If you would like to simplify the process of formulating valid queries for Drill, consider using Drill Explorer with the ODBC connector.
HBase
The Simba Drill JDBC Connector presents an HBase column-family as a JDBC column. An HBase column is derived using the lookup scheme [ColumnFamily]['Column'], where [ColumnFamily] is an HBase column-family and [Column] is an HBase column contained in [ColumnFamily]. You must specify individual HBase columns, not column-families, so you need some familiarity with the data. For example, the following query might return incorrect results:
SELECT * FROM `hbase`.`students`
When the query is modified to specify individual columns, the query results are correct:
SELECT CAST(account['name'] AS varchar(20)) FROM `hbase`.`students`
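You can select multiple HBase columns, including columns from different column-families, in the same way. For example, assuming the students table also contains a column-family named address with a column named city (a hypothetical example), the following query returns both values:
SELECT CAST(account['name'] AS varchar(20)), CAST(address['city'] AS varchar(20)) FROM `hbase`.`students`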
Parquet
When a query is used as a subquery, such as when you use a SQL view to connect to a data store using SAP Lumira, you must specify column names in the query. For example, SAP Lumira might not connect to your Drill data store successfully when using the following query:
SELECT * FROM `dfs`.`default`.`./opt/drill/test.parquet`
When the query is modified to specify column names, SAP Lumira connects to your Drill data store successfully using the SQL view:
SELECT CAST(column1 AS varchar(20)) FROM `dfs`.`default`.`./opt/drill/test.parquet`
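If you need a reusable SQL view, you can create one in a writable workspace so that it returns only the columns you specify. The following example assumes that dfs.tmp is configured as a writable workspace and uses an illustrative view name:
CREATE VIEW `dfs`.`tmp`.`test_view` AS SELECT CAST(column1 AS varchar(20)) AS column1 FROM `dfs`.`default`.`./opt/drill/test.parquet`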
JSON
When a query is used as a subquery, such as when you use a SQL view to connect to a data store using SAP Lumira, you must specify column names in the query. For example, SAP Lumira might not connect to your Drill data store successfully when using the following query:
SELECT * FROM `dfs`.`default`.`./opt/drill/interval.json`
When the query is modified to specify column names, SAP Lumira connects to your Drill data store successfully using the SQL view:
SELECT column1 FROM `dfs`.`default`.`./opt/drill/interval.json`
To query nested elements, use the following syntax, where menu is a child of column1:
SELECT column1['menu'] FROM `dfs`.`default`.`./opt/drill/interval.json`
You can query elements that are multiple levels deep. Continuing the example, if menuitem is a child of menu, then use the following syntax:
SELECT column1['menu']['menuitem'] FROM `dfs`.`default`.`./opt/drill/interval.json`
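To make deeply nested values easier to consume in a client application, you can also alias the table and the selected element. The alias names used below (tbl and menuitem) are illustrative:
SELECT tbl.column1['menu']['menuitem'] AS menuitem FROM `dfs`.`default`.`./opt/drill/interval.json` AS tbl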
CSV or TSV
Due to a limitation of Drill, you must specify columns using indices, which requires some familiarity with the data. For example, the following query might return incorrect results:
SELECT * FROM `dfs`.`default`.`./root/Test/Table.csv`
When the query is modified to specify columns using indices, the query results are correct:
SELECT columns[0], columns[2] FROM `dfs`.`default`.`./root/Test/Table.csv`
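Because Drill returns delimited text data as a single array of VARCHAR values named columns, you can also cast and alias individual indices to produce typed, named result columns. The following example assumes that columns[2] holds numeric data; the alias names are illustrative:
SELECT columns[0] AS name, CAST(columns[2] AS INT) AS score FROM `dfs`.`default`.`./root/Test/Table.csv`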
Hive
When using a Hive schema, you can use standard queries without modifying them.
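For example, assuming a Hive table named orders in the default Hive database, the following standard query works without modification:
SELECT * FROM `hive`.`default`.`orders`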