Virtual Tables

One advantage of the DynamoDB design is the ability to store data that is denormalized into a fewer number of tables. However, the JDBC interface does not natively support accessing denormalized data types such as lists and sets. By expanding the data contained within lists and sets into virtual tables, the Simba DynamoDB JDBC Driver allows users to directly interact with the data but leave the storage of the data in its denormalized form in DynamoDB.

If any columns are mapped to a list or set data type during schema generation, then the connector creates the following tables and saves them as part of the schema definition:

  • A base table, which contains all the data from the real table except for lists and sets.
  • A virtual table for each list or set, expanding the nested data.

Virtual tables refer to the data in the real table, enabling the connector to access the denormalized data. By querying the virtual tables, you can access the contents of lists and sets via JDBC.

Important:

For information about the data type changes that might occur when the connector reads DynamoDB data, see Schema Definition.

The base table and virtual tables appear as additional tables in the list of tables that exist in the database, and are named using the following conventions:

  • The base table uses the same name as the real table that it represents.
  • In the JDBC layer, the name of each virtual table is formed using the name of the real table that contains the denormalized data, an underscore (_), and the name of the list or set. If the virtual table represents nested data, then the table name includes a _dim[Number] suffix where [Number] is a number indicating which hierarchy level the data resides in.
  • In the DynamoDB layer, the name of each virtual table is formed using the name of the table that the data comes from, a period (.), and then the name of the list or set. If the virtual table represents nested data, then the table name ends with a pair of closed square brackets ([]) for each hierarchy level in which the list or set is nested.

Example of Renormalization for Lists and Sets

Consider the following DynamoDB table named ExampleTable, which contains lists and sets:

ExampleTable
Partition Sort List StringSet

0

1

{ 1, 2, 3 }

{ "A", "B", "C" }

3

3

{ 100, 101, 102, 105 }

{ "D", "E" }

Multiple virtual tables would be generated for this single source table. The first table is the base table, which is shown below.

ExampleTable
Partition Sort

0

1

3

3

The base table contains all of the data of the original table, except the data from the lists and sets have been omitted and will be expanded in the virtual tables.

The following two tables show the virtual tables that represent the data from the List and StringSet columns.

ExampleTable_List
Partition Sort List_idx List

0

1

0

1

0

1

1

2

0

1

2

3

3

3

0

100

3

3

1

101

3

3

2

102

3

3

3

105

ExampleTable_StringSet
Partition Sort StringSet

0

1

A

0

1

B

0

1

C

3

3

D

3

3

E

The Partition and Sort columns in the virtual tables reference the Partition and Sort columns in the real table, and indicate which real table row the virtual table row corresponds to.

Virtual tables that represent lists contain an index column, which is named using the name of the list and an _idx suffix. Columns that contain the expanded data from a list have the same name as the list.

In virtual tables that represent sets, the columns that contain the expanded data from the set have the same name as the set.

Example of Renormalization for Nested Lists

Virtual tables that represent nested lists include multiple indexes, where each index indicates the position of the data value in each hierarchy level. The names of the virtual table and the data column in the table both include a _dim[Number] suffix where [Number] is a number indicating which hierarchy level the data resides in.

Consider the following DynamoDB table named ExampleTable2, where the value in the third position of the list is itself a list:

ExampleTable2
Partition Sort List Letter

0

1

{ 100, 101, { 300, 301, 302 }, 105 }

A

The schema definition for this table would include the following base table and two virtual tables:

ExampleTable2
Partition Sort Letter

0

1

A

ExampleTable2_List
Partition Sort List_idx List

0

1

0

100

0

1

1

101

0

1

3

105

ExampleTable2_List_dim2
Partition Sort List_idx List_dim2_idx List_dim2

0

1

2

0

300

0

1

2

1

301

0

1

2

2

302