Schema Definition

Couchbase is able to store data that does not follow the rules of data typing and structure that apply to traditional relational tables and columns. Couchbase data is organized into buckets and documents, which can contain nested arrays or arrays of differently-typed elements. Because traditional JDBC toolsets might not support these data structures, the data needs to be mapped to a relational form. To achieve this, the Simba Couchbase JDBC Driver generates a database schema that maps the Couchbase data to a JDBC-compatible format.

Note:

The schema that the connector generates is distinct from the metadata schema that exists as part of the database architecture.

When the connector connects to a database that does not already have the necessary schema, it automatically generates a schema by doing the following:

  1. For each document type identified in the database, the connector samples data from multiple documents to detect the structure of the data.
  2. The connector organizes all the documents into collections based on their type, and saves these collections as part of the schema. Using the schema, the connector exposes collections as tables in JDBC applications.
  3. For each array detected in the database, the connector generates a virtual table to expand the data, and saves these virtual tables as part of the schema. Using the schema, the connector exposes virtual tables as normal tables in JDBC applications.
  4. The connector defines a Couchbase data type for each column and maps each Couchbase data type to the SQL data type that is best able to represent the greatest number of values.

In addition to the automatic schema generation done by the connector, you can also manually edit the schema by using the Schema Editor. The Schema Editor is a Java application that runs on Windows, Linux, and macOS. It is included in the ZIP archive for the connector. For more information about using the Schema Editor, see the Schema Editor User Guide. For detailed information about defining a schema, see Defining a Schema.

Important:

Before generating a schema definition, make sure that primary indexes have been created for all of the buckets in your Couchbase instance.

The tables defined in the schema refer to the actual data in the Couchbase database, allowing you to interact with the data through the JDBC layer while leaving the storage of the data in its denormalized form in the database. By querying the tables, you can access the contents of the database. When you write or modify data in a table, the corresponding data in the Couchbase database is updated.

Base Tables

Base tables represent data from collections of Couchbase documents. Documents appear as rows, and all attributes that are not arrays appear as columns. In each base table, the connector creates a primary key column named PK that identifies which Couchbase document each row comes from.

In the JDBC layer, the name of the base table is the document type that it represents. In the Couchbase layer, the name of the base table is the bucket that the data comes from.

Virtual Tables

Virtual tables provide support for arrays. Each virtual table contains the data from one array, and each row in the table represents an element from the array. If an element contains an array, then the connector creates additional virtual tables as needed to expand the nested data.

In each virtual table, the connector creates a primary key column named PK that identifies the Couchbase document the array comes from and references the PK column from the related base table. The connector also creates an index column (with the suffix _IDX in its name) to indicate the position of the element within the array.

In the JDBC layer, the name of the virtual table is formed using the document type that contains the array, an underscore character (_), the name of the object that contains the array (if any), an underscore character (_), and the name of the array. In the Couchbase layer, the name of the virtual table is formed using the name of the bucket that the data comes from, a period (.), and then the name of the array followed a set of closed square brackets ([]) for each hierarchy level in which the array is nested.

Example of Table Creation during Schema Generation

The following example shows the base tables and virtual tables that the connector would generate if it connected to a Couchbase database named ExampleDatabase, which contains two documents named Customer_123221 and Order_221354.

The Customer_123221 document is of type Customer and contains the following attributes. The SavedAddresses attribute is an array.

{

"Type": "Customer",

"Name": "John Doe",

"SavedAddresses": ["123 Main St.", "456 1st Ave"]

}

The Order_221354 document is of type Order and contains the following attributes. The CreditCard attribute is an object, and the Items attribute is an array of objects.

{

"Type": "Order",

"CustomerID":"Customer_123221",

"CreditCard":

{

"Type":"Visa",

"CardNumber":"4111 1111 1111 1111",

"Expiry":"12/12",

"CVN":"123"

},

"Items":

[

{"ItemID":89123, "Quantity":1},

{"ItemID":92312, "Quantity":5}

]

}

When the connector connects to ExampleDatabase and generates the schema, the connector creates a collection for each document type. The connector exposes these collections as two base tables, which are shown below.

PK Name

"Customer_123221"

John Doe

PK CustomerID CreditCard_Type CreditCard_Number CreditCard_Expiry CreditCard_CVN

"Order_221354"

"Customer_123221"

"Visa"

"4111 1111 1111 1111"

"12/12"

"123"

The SavedAddresses array from the Customer_123221 document and the Items array from the Order_221354 document do not appear in these base tables. Instead, the connector generates a virtual table for each array. The following tables show the virtual tables that represent data from the SavedAddresses and Items arrays.

PK SavedAddresses_IDX SavedAddresses

"Customer_123221"

0

"123 Main St."

"Customer_123221"

1

"456 1st Ave"

PK Items_IDX ItemID Quantity

"Order_221354"

0

89123

1

"Order_221354"

1

92312

5

You can select, insert, and update data in the base tables and virtual tables as if they were standard relational tables, and the connector will handle the storage details within Couchbase.

For example, to append an item to the Items array, where the ItemID is 78123 and the Quantity is 6, execute the following statement:

INSERT INTO "Order_Items" ("PK", "ItemID", "Quantity") VALUES ('Order_221354', '78123', '6')

Some operations might be processed differently or not supported for certain types of data. For more information, see Write-back.