Using an SDD Schema in Version 2.2.8 and Later

Connector versions 1.8.4 and earlier save schemas in SDD format. Connector versions 2.0.0 and later save schemas in JSON format. Beginning with version 2.2.8, the connector can load and use SDD-formatted schemas in addition to JSON-formatted schemas.

However, be aware of the following:

  • Compared to previous connector versions, the newer connectors report different data types for some column types. Specifically, when reporting the TYPE_NAME values for SQL data types, connector versions 1.8.4 and earlier report the BSON type names while connector versions 2.0.0 and later report the names in SQL-92 format. Also, the following column types are translated differently:
  • MongoDB Column Type SQL Type Reported in Versions 1.8.4 and Earlier SQL Type Reported in Versions 2.0.0 and Later

    Array index

    SQL_BIGINT

    SQL_INTEGER

    Binary

    SQL_BINARY

    SQL_VARBINARY

  • Connector versions 2.0.0 and later use a different data sampling mechanism compared to connector versions 1.8.4 and earlier. It is recommended that you do not re-sample an SDD-formatted schema using a newer connector, as this can result in unusual results such as redundant tables. Instead, create a new JSON-formatted schema by sampling the database directly.
  • If you make any changes to an SDD-formatted schema, it must be saved in JSON format. The connector translates the schema from SDD to JSON while preserving the table and column names defined in the schema.
  • When the connector translates a schema from SDD to JSON, an "invalid source name" error may occur. For detailed information about this issue and potential workarounds for it, see below.

Invalid Source Name Error

This error is known to occur when all of the following conditions are met:

  • You are using an SDD-formatted schema with connector version 2.2.8 or later.
  • The MongoDB table associated with the schema has arrays that share the same name.
  • These arrays contain data at different levels. For example, one is a two-level array while the other is only a single-level array.

Connector versions 2.2.8 and later handle this inconsistency in array levels by pushing the values from the upper array levels down into the bottom array level. For example, given a two-level array and a single-level array that have the same name, the connector pushes the values from the single-level array down one level.

After this push occurs, the columns that the 1.8.x connector originally generated for containing the single-level array values cause an "invalid source name" error to occur when you view the base table.

As a workaround, do one of the following:

  • Re-sample the entire table using connector version 2.2.8 or later.
  • Or, using the Schema Editor, hide the invalid columns.
  • Or, open the schema map in a text editor and then change the source names of the invalid columns to the source names of the columns where the pushed values now reside. You cannot make this change in the Schema Editor, because the application treats this as a naming conflict.

Example

For example, a MongoDB database contains a table named Test2, which contains the following documents:

Document 1:

{

"_id" : "1",

"Col2_String" : "Hello",

"Col3_Array" : [

[

{

"Int" : 01,

"String" : "Col3_Array_1"

},

{

"Int" : 02,

"String" : "Col3_Array_2"

}

],

],

}

Document 2:

{

"_id" : "2",

"Col2_String" : "Hello Again",

"Col3_Array" : [

{

"Int" : 01,

"String" : "Col3_Array_1"

}

],

}

Notice that both documents contain an array named "Col3_Array". However, Col3_Array is a two-level array in Document 1, whereas it is a single-level array in Document 2.

After the connector translates the schema definition of Test2 from SDD to JSON, the resulting base table (still named Test2) contains columns that use the following SQL Names and Source Names:

Column # SQL Name SourceName

1

_id

_id

2

Col2_String

Col2_String

3

Col3_Array

Col3_Array

4

Col3_Array__0

Col3_Array.*0

5

Col3_Array__0__0

Co3_Array.*0.*0

6

Col3_Array__0__0__Int

Col3_Array.*0.*0.Int

7

Col3_Array__0__0__String

Col3_Array.*0.*0.String

8

Col3_Array__0__1

Col3_Array.*0.*1

9

Col3_Array__0__1__Int

Col3_Array.*0.*1_Int

10

Col3_Array__0__1__String

Col3_Array.*0.*1_String

11

Col3_Array__0__Int

Col3_Array.*0.Int

12

Col3_Array__0__String

Col3_Array.*0.String

Columns 5 through 10 (the Col3_Array__0__0, Col3_Array__0__0__Int, Col3_Array__0__0__String, Col3_Array__0__1, Col3_Array__0__1__Int, and Col3_Array__0__1__String columns) were generated to contain the values from the two-level Col3_Array in Document 1.

Columns 11 and 12 (the Col3_Array__0__Int and Col3_Array__0__String columns) were originally generated to contain the values from the single-level Col3_Array in Document 2. Because of how the connector pushed the array values from columns 11 and 12 down one level, these columns now cause an "invalid source name" error when you view the base table.

To resolve this error, do one of the following:

  • Re-sample the table using connector version 2.2.8 or later.
  • Hide columns 11 and 12 in the Schema Editor.
  • Or, use a text editor to change the source name of column 11 to Col3_Array.*0.*0.Int and the source name of column 12 to Col3_Array.*0.*0.String.