Schema Definitions

To ensure consistent support for your MongoDB data, you must configure the connector to use a schema definition from a JSON file or the database. You can use the Schema Editor application in conjunction with the connector to create a schema definition and then save it in a JSON file or the database.

Note:

For information about how to use the Schema Editor, see the Schema Editor User Guide for JDBC Connections.

When the connector connects to a database without a specified schema definition, it automatically generates a temporary schema definition using the settings defined for the SamplingStrategy, SamplingLimit, and SamplingStepSize properties. However, temporary schema definitions do not persist after the connection is closed, and the connector might generate different schema definitions during subsequent connections to the same database.
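Since these sampling settings are ordinary connection properties, they can be supplied alongside the other connection options. The sketch below only builds the property set; the specific property values chosen here (and the idea of passing them via java.util.Properties to DriverManager) are illustrative assumptions, not values prescribed by this guide:

```java
import java.util.Properties;

public class SamplingConfig {
    // Build the sampling-related connection properties used when the
    // connector generates a temporary schema definition.
    public static Properties samplingProperties() {
        Properties props = new Properties();
        // Assumed value: sampling strategy to use when scanning collections.
        props.setProperty("SamplingStrategy", "Forward");
        // Assumed value: stop after sampling 1000 documents per collection.
        props.setProperty("SamplingLimit", "1000");
        // Assumed value: sample every document rather than skipping any.
        props.setProperty("SamplingStepSize", "1");
        return props;
    }

    public static void main(String[] args) {
        Properties props = samplingProperties();
        System.out.println(props.getProperty("SamplingLimit"));
    }
}
```

In a real application, these properties would be passed to DriverManager.getConnection along with the JDBC URL for the MongoDB database.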

Important:

When creating a schema definition or connecting to the database without specifying one, make sure to configure the connector to sample all the necessary data. Documents that are not sampled are not included in the schema definition, and consequently are not available to JDBC applications. If the schema definition was created through the Schema Editor, you can use the Schema Editor to sample additional documents and add them to the schema definition.

Mapping MongoDB Data

MongoDB is a "schemaless" data store: the databases can store data that does not follow the rules of data typing and structure that apply to traditional relational data, and collections might contain complex data such as nested arrays or arrays of differently typed elements. Because traditional JDBC toolsets might not support these data structures, the Simba MongoDB JDBC Driver generates a schema definition that maps the MongoDB data to a JDBC-compatible format.

The Simba MongoDB JDBC Driver does the following when generating a schema definition:

  1. Samples the data in the database in order to detect its structure and determine the data mappings that best support the data.
  2. Assigns a MongoDB data type to each column.
  3. Maps each MongoDB data type to the SQL data type that is best able to represent the greatest number of values.
  4. Flattens single-level objects into columns.
  5. Generates a virtual table for each array or nested object in the database to expand the data, and saves these virtual tables as part of the schema definition. For more information about virtual tables, see Virtual Tables.
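Step 3 above chooses, for each column, the SQL type that can represent the widest range of sampled values. The promotion rules in this sketch are illustrative assumptions, not the connector's actual mapping table:

```java
import java.sql.Types;
import java.util.Set;

public class TypePromotion {
    // Pick the SQL type able to represent every sampled value in a column.
    // Rules here are assumptions for illustration: a single observed BSON
    // type maps directly; mixed numeric types widen to DOUBLE; any other
    // mix falls back to VARCHAR, which can hold all values as text.
    public static int promote(Set<String> observedBsonTypes) {
        if (observedBsonTypes.size() == 1) {
            switch (observedBsonTypes.iterator().next()) {
                case "int32":  return Types.INTEGER;
                case "int64":  return Types.BIGINT;
                case "double": return Types.DOUBLE;
                case "string": return Types.VARCHAR;
            }
        }
        boolean allNumeric = observedBsonTypes.stream()
            .allMatch(t -> t.equals("int32") || t.equals("int64") || t.equals("double"));
        return allNumeric ? Types.DOUBLE : Types.VARCHAR;
    }

    public static void main(String[] args) {
        // Mixed int32/double column widens to DOUBLE.
        System.out.println(promote(Set.of("int32", "double")) == Types.DOUBLE);   // true
        // A column mixing numbers and strings falls back to VARCHAR.
        System.out.println(promote(Set.of("int32", "string")) == Types.VARCHAR);  // true
    }
}
```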

During this sampling process, the connector defines data types for each column, but does not change the data types of the individual cells in the database. As a result, columns might contain mixed data types. During read operations, values are converted to match the SQL data type of the column so that the connector can work with all the data in the column consistently.
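The read-time conversion described above can be pictured as follows. Suppose a column was typed DOUBLE even though its underlying cells hold mixed numeric BSON types; each cell is converted to the column's SQL type as it is read. This is a minimal sketch of the idea, not the connector's actual conversion code:

```java
public class ColumnRead {
    // Convert a raw cell value to the column's declared SQL type (DOUBLE
    // in this example) so the column behaves uniformly at read time.
    public static double asDouble(Object cell) {
        if (cell instanceof Number) {
            return ((Number) cell).doubleValue();
        }
        throw new IllegalArgumentException("cannot convert " + cell + " to DOUBLE");
    }

    public static void main(String[] args) {
        System.out.println(asDouble(42));   // int32 cell read as 42.0
        System.out.println(asDouble(2.5));  // double cell read as 2.5
    }
}
```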