Casting Binary Data

Drill can query self-describing data sources such as HBase and file systems without central metadata definitions, so some of these sources do not define data types for their data. HBase, for example, always treats data as binary. Drill provides auxiliary functions to cast (or interpret) such data as specific data types. The following examples show how to cast data to a specific data type. For more information about SQL queries in Drill, see the Apache Drill documentation: http://drill.apache.org/docs/.

Important:

When writing queries for Drill, you must enclose all table and schema names in backticks (`). For examples of this syntax, see the queries below.

HBase and Parquet store data in binary format. To view such data in readable form, you must explicitly cast it to another data type in your SQL statement. For example, the following query displays results from an HBase database in binary format:

SELECT account['name'] FROM `hbase`.`students`

The following query displays the same results in string format:

SELECT CAST(account['name'] AS varchar(20)) FROM `hbase`.`students`
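
As a sketch of an alternative, Drill's CONVERT_FROM function interprets binary data according to a named encoding. The query below assumes that the bytes in account['name'] hold UTF-8 text. The alias student_name is only illustrative.

```sql
-- Assumption: account['name'] holds UTF-8 encoded text.
-- student_name is an illustrative alias, not a column in the source table.
SELECT CONVERT_FROM(account['name'], 'UTF8') AS student_name
FROM `hbase`.`students`
```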

The following query displays results from a Parquet file in binary format:

SELECT column1 FROM `dfs`.`default`.`./opt/drill/test.parquet`

The following query displays the same results in string format:

SELECT CAST(column1 AS varchar(20)) FROM `dfs`.`default`.`./opt/drill/test.parquet`

You can also cast the data to other data types, such as integer or date, as needed.
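
For instance, a value stored as the text form of a number or a date can be cast in two steps: first to a string, then to the target type. This is a sketch only. The columns account['age'] and account['enrolled'] are hypothetical, and the query assumes the bytes hold text such as 21 and 2014-09-01 rather than a binary-encoded number.

```sql
-- Hypothetical columns; assumes each value is stored as text bytes.
-- Dates are assumed to be in yyyy-MM-dd form.
SELECT
  CAST(CAST(account['age'] AS varchar(3)) AS int) AS age,
  CAST(CAST(account['enrolled'] AS varchar(10)) AS date) AS enrolled
FROM `hbase`.`students`
```

If the values are stored in a binary numeric encoding instead, a string-based cast like this one will likely fail or return unexpected results.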