Write-Back

The Simba MongoDB JDBC Driver supports Data Manipulation Language (DML) statements such as INSERT, UPDATE, UPSERT, and DELETE. The connector does not support SQL subqueries or JDBC transactions.

Important:

Important:

Writing data to the MongoDB database might change the typing of the data. If the existing data is of a type that is different from the column data type specified in the schema definition, then the existing data will be replaced by new data that is of the column data type.

When the connector samples the data and generates the schema definition, the connector defines a SQL data type and a MongoDB data type for each column in the base tables and virtual tables, but does not change the data types of the individual values in the database. As a result, columns might contain mixed data types. The connector supports DML statements on mixed data types. When you execute a write operation, the connector will attempt to complete the operation using the column data type specified in the schema definition.

The data provided in DML statements must match the column data types. For example, a String value cannot be inserted in a column that is defined as an Integer column in the schema definition.

The connector handles each DML statement as described below:

Important:

Important:

The CREATE, ALTER, and DROP statements are not supported for tables.

INSERT

Each row in MongoDB needs to have a unique ID represented by the _id column. If not provided during insertion, MongoDB auto-generates a unique ID for each row. The _id field is exposed as a valid column in the connector and can be auto-generated when issuing INSERT statements through the connector. For example, consider the following table.

_id sample_column

"517024D6CC79814E3FEBD352"

1

"5170ED77E49CC93A918DE316"

2

To insert a document with an auto-generated value for _id (data type: jstOID), issue the following command:

INSERT INTO sample_table_1(sample_column) VALUES(3)

The following table shows the table after the insertion.

_id sample_column

"517024D6CC79814E3FEBD352"

1

"5170ED77E49CC93A918DE316"

2

"51710FFCE49CC93A918DE322"

3

The value for the _id column can also be inserted using the INSERT statements, as in the following examples:

INSERT INTO sample_table_2(_id, sample_column) VALUES(1,1)

INSERT INTO sample_table_2 VALUES(1,1)

UPDATE

When updating rows, special care needs to be taken to avoid duplicate values for the _id column. As mentioned before, _id needs to be unique across all rows. When an UPDATE statement tries to set a value for the _id column and matches multiple rows, only one of the rows is updated with the new values, and the connector return an error for the remaining rows. UPDATE is not executed atomically.

Note:

You can configure the connector to execute UPDATE statements as UPSERT statements instead. To do this, set the UpsertOnUpdate property to true. For more information, see UpsertOnUpdate.

For information about how the connector handles UPSERT statements, see the following section.

UPSERT

When you execute an UPSERT statement, the connector handles the data differently depending on whether the UPSERT statement specifies a target location that already exists in the database:

  • If the statement specifies an existing row in the MongoDB database, then the row is updated with the new data values.
  • Alternatively, if the statement specifies a row that does not exist, then the data values are inserted as a new row in the database.

The basic syntax of an UPSERT statement is as follows:

UPSERT [TableName] SET [ColumnName] = '[Value]' WHERE _id = '[RowID]'

The SET clause specifies the data that you want to add to the database, while the specified table name and the WHERE clause indicate the location in the MongoDB database where you want this data to be added. The evaluation of the comparison in the WHERE clause determines whether the data is updated in an existing row or inserted as a new row.

Depending on whether you are executing the UPSERT statement on a base table or a virtual table, the SET and WHERE clauses are subject to different syntax rules. For more information, including examples of valid UPSERT statements, see the following sections:

UPSERT Statements for Base Tables

When writing an UPSERT statement to add data to a base table, make sure that the statement complies with all of the following rules:

  • The SET clause cannot specify any _id values. This pertains to values for the _id column in a table, as well as values for _id fields in a DocumentAsJson column.
  • If the SET clause specifies the DocumentAsJson column, then this SET clause cannot specify any other columns.
  • If the table uses _id values that are of a data type other than ObjectID, then you must specify an _id value in the WHERE clause. Otherwise, specifying the _id value is optional, depending on whether or not you want the data to be associated with a specific _id value when it gets inserted. When the _id value is of type ObjectID, the MongoDB database can automatically generate a valid _id value for the data being inserted.
  • When specifying an _id value in the WHERE clause, you must include it as a predicate that defines an equality comparison between the _id column and a literal value.
  • For example, the following WHERE clause is valid for an UPSERT operation:

    WHERE _id = '5db8d144aa075d2990217ee8'

    Additionally, the WHERE clause must comply with these rules:

    • The WHERE clause cannot include more than one predicate that involves the _id column.
    • If the WHERE clause contains more than one predicate, then you must append any additional predicates using the AND operator.

    If the WHERE clause does not follow these rules, and the _id value is of type ObjectID, then the data is inserted using an automatically generated _id value instead of the value that you specified. If the _id value is not of type ObjectID, then the query fails and the connector returns an error.

Example: Base Table with ObjectID _id Values

For example, consider the following base table named Customer, which uses ObjectID values for the_id column:

_id customer_name DocumentAsJson

5db8d144aa075d2990217ee8

John Doe

{"_id": "5db8d144aa075d2990217ee8", "customer_name": "John Doe", "address": ["123 Broadway", "111 Cambie Street"]

6z8guqk7xm8u14mhfw3peqyc

Jane Doe

{"_id": "6z8guqk7xm8u14mhfw3peqyc", "customer_name": "Jane Doe", "address": ["456 Central Avenue", "122 Main Street"]}

The following UPSERT statement complies with all the rules stated above, and specifies the _id of the first table row. If you execute this statement, the connector updates the customer_name value of the first row from "John Doe" to "James Smith":

UPSERT Customer SET customer_name = 'James Smith' WHERE _id = '5db8d144aa075d2990217ee8'

The following UPSERT statement complies with all the rules stated above, but specifies an _id that does not exist in the table. If you execute this statement, the connector inserts a new row with "76ky3d4fex619fbd9uny9cqr" as the _id value, "Joe Smith" as the customer_name value, and a JSON representation of the row as the DocumentAsJson value:

UPSERT Customer SET customer_name = 'James Smith' WHERE _id = '76ky3d4fex619fbd9uny9cqr'

The following UPSERT statement specifies an _id that does not exist in the table, and does not comply with the rule of using the AND operator to append additional WHERE predicates. If you execute this statement, the connector inserts a new row with a randomly generated _id value, “Joe Smith” as the customer_name value, and a JSON representation of the row as the DocumentAsJson value:

UPSERT Customer SET customer_name = 'James Smith' WHERE _id = '5db8d144aa075d2990217ee8' OR customer_name = 'John Doe'

Example: Base Table with Integer _id Values

As another example, consider the following base table named Employees, which uses Integer values for the _id column:

_id customer_name DocumentAsJson

1

Joe Smith

{"_id": 1, "employee_name": "Joe Smith", "address": ["987 Park Lane ", "133 Big Street"]}

2

Jill Scott

{"_id": 2, "employee_name": "Jill Scott", "address": ["543 Central Avenue", "212 Boardwalk Place"]}

The following UPSERT statement complies with all the rules stated above, and specifies an _id that does not exist in the table. If you execute this statement, the connector inserts a new row with "3" as the _id value, "Dale Hill" as the customer_name value, and a JSON representation of the row as the DocumentAsJson value:

UPSERT Employee SET employee_name = 'Dale Hill' WHERE _id = '3'

The WHERE clause in the following UPSERT statement does not refer to an existing row in the table, and does not specify an _id value. Because the Employees table uses Integer values instead of ObjectID values for the _id column, MongoDB cannot generate an appropriate _id value for the row that is being inserted. If you execute this statement, the connector returns an error:

UPSERT Employee SET employee_name = 'Dale Hill' WHERE employee_name = 'Bob Jackson'

UPSERT Statements for Virtual Tables

When writing an UPSERT statement to add data to a virtual table, make sure that the statement complies with all of the following rules:

  • The SET clause cannot specify any _id values or index column values.
  • Note:

    By default, index columns are identified by the "_idx" suffix in the column names.

  • The WHERE clause must specify exactly one _id value, and also one value for each index column except for the bottom-level index column, which is optional. The index column values must be of type Integer. Each of these predicates in the WHERE clause must be an equality comparison with a literal value, and they must be appended using the AND operator.

Example: Virtual Table for a Nested Object

For example, consider the following base table named Company, which has an Employee column that contains nested objects:

_id Employee

76ky3d4fex619fbd9uny9cqr

{"name": "James Smith", "contact": [{"phone" : [{"home": ["604-555-1234","778-555-5678" ]}, {"work": ["604-555-9876"] } ] } ] }

8z6guqk7xm8u14mhfw3peqyc

{"name": "Jill Scott", "contact": [{"phone": [{"home": ["604-555-8989","778-555-5678" ]}, {"work": ["604-555-4345"] } ] } ] }

When this data is expanded into virtual tables, the home phone number values are represented by the following virtual table named Company_Employee:

_id Employee_contact_dim1_idx Employee_contact_phone_dim1_idx Employee_contact_phone_home_dim1_idx Employee_contact_phone_home

76ky3d4fex619fbd9uny9cqr

0

0

0

604-555-1234

76ky3d4fex619fbd9uny9cqr

0 0

1

778-555-5678

8z6guqk7xm8u14mhfw3peqyc

0 0

0

604-555-8989

8z6guqk7xm8u14mhfw3peqyc

0 0

1

778-555-3232

The following UPSERT statement complies with all the rules stated above, and specifies the _id and complete set of index values of the first table row. If you execute this statement, the connector updates the phone number in the first row from "604-555-1234" to "604-555-2424":

UPSERT Company_Employee SET Employee_contact_phone_home = '604-555-2424' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_dim1_idx = 0 AND Employee_contact_phone_dim1_idx = 0 AND Employee_contact_phone_home_dim1_idx = 0

The following UPSERT statement complies with all the rules stated above, specifies an existing _id value, and specifies all of the index values except for the bottom-level one. If you execute this statement, the connector inserts a new row with "76ky3d4fex619fbd9uny9cqr" as the _id value and "604-555-2424" as the phone number. The connector inserts this data at the index point that is next in the sequence, so in this case, the number would be inserted as the third row in the virtual table.

UPSERT Company_Employee SET Employee_contact_phone_home = '604-555-2424' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_dim1_idx = 0 AND Employee_contact_phone_dim1_idx = 0

The following UPSERT statement complies with all the rules stated above, and specifies an existing _id value as well as a complete set of index values. The specified bottom-level index value does not exist in the table. If you execute this statement, the connector inserts a new row with "604-555-2424" as the phone number. The connector inserts this data at the specified index point.

UPSERT Company_Employee SET Employee_contact_phone_home = '604-555-2424' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_dim1_idx = 0 AND Employee_contact_phone_dim1_idx = 0 AND Employee_contact_phone_home_dim1_idx = 2

You can omit indices from the WHERE clause in the following cases: 

  • Only _id is provided in the WHERE clause.
  • The WHERE clause contains _id and one or more non-key columns separated by AND and no index columns.

Note:

  • The conditions above must not include index and _id columns in the SET clause as per UPSERT Statements for Virtual Tables.

  • The following rules still apply:

    • Providing bottom-level index column in the WHERE clause is optional.
    • The index column values must be an integer. Each of these predicates in the WHERE clause must be an equality comparison with a literal value and they must be appended using the AND operator.

Examples

Note:

The following examples are in reference to the virtual table named Company_Employee seen above.

The following UPSERT statement complies with all the rules stated above, and specifies an UPSERT to a virtual table with no index column in the WHERE clause and _id that does not exist in the database:

UPSERT Company_Employee SET Employee_contact_phone_home='778-123-4444' WHERE _id = '6081b6a8a4c3273edc69fe87'

Executing the statement above would insert a new row in Company_Employee:

U6081b6a8a4c3273edc69fe87 0 0 0 778-123-4444

The following UPSERT statement complies with all the rules stated above, and specifies an UPSERT to a virtual table with _id that does not exist and one index column in the WHERE clause: 

UPSERT Company_Employee SET Employee_contact_phone_home ='604-666-1234' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_phone_dim1_idx = 1

The query fails because Rule 2 wasn't respected. The WHERE clause must either have no index columns or all index columns except the bottom level index.

The following UPSERT statement complies with all the rules stated above, and specifies multiple UPSERTs in one batch with repeated _id that does not exist and no indices in the WHERE clause:

UPSERT Company_Employee SET Employee_contact_phone_home = ? WHERE _id = ?

The batch contains:

('604-666-1234', '6081b6a8a4c3273edc69fe87')

('604-777-1234', '6081b6a8a4c3273edc69fe87')

The following rows are inserted:

6081b6a8a4c3273edc69fe87 0 0 0 604-666-2424

6081b6a8a4c3273edc69fe87 1 0 0 604-666-2424

Note:

A batch UPSERT with the same _id repeated results in inserting new rows with level 1 index incremented by 1. This only occurs when the _id value does not already exist in the database.

If the _id value already exists, the UPSERT query will result in update rather than insert as expected with UPSERT statements. For instance, if the query above contained the following batch:

('604-666-1234', '76ky3d4fex619fbd9uny9cqr')

('604-777-1234', '76ky3d4fex619fbd9uny9cqr')

The result is that the Employee_contact_phone_home is updated to 604-777-1234 for all rows with _id 76ky3d4fex619fbd9uny9cqr and therefore the table would look like this:

76ky3d4fex619fbd9uny9cqr 0 0 0 604-777-1234

76ky3d4fex619fbd9uny9cqr 0 0 1 604-777-1234

8z6guqk7xm8u14mhfw3peqyc 0 0 0 604-555-8989

8z6guqk7xm8u14mhfw3peqyc 0 0 1 778-555-3232

The following UPSERT statement complies with all the rules stated above, and specifies multiple UPSERTs in one batch with different _id values that do not exist and no indices in the WHERE clause:

UPSERT Company_Employee SET Employee_contact_phone_home = ? WHERE _id = ?

The batch contains:

('604-666-1234', '6081b6a8a4c3273edc69fe87')

('604-777-1234', '60837a4082eb2ecce38fc837')

The following rows will be inserted:

6081b6a8a4c3273edc69fe87 0 0 0 604-666-1234

60837a4082eb2ecce38fc837 0 0 0 604-777-1234

UPSERT Statements for Non-key Columns in the WHERE Clause

The connector uses the following UPSERT rules in relation to non-key columns in the WHERE clause:

Rule 1: Distinct Non-key Columns in the WHERE Clause

A value can be inserted for non-key columns if: 

  • A non-key column is specified in the WHERE clause as an equality comparison with a literal value.
  • The equality on the non-key column and the rest of the conditions in the WHERE clause is separated by the AND operator.

For example: 

_id employee_number employee_name DocumentAsJson

1

1

Joe Bloggs

{"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}

Examples

Statement:

UPSERT employee SET employee_number = 2 WHERE _id = 2 and employee_name='John Doe'

Result: 

_id employee_number employee_name DocumentAsJson

1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}

2 2 John Doe {"_id": 2, "employee_number": 2, "employee_name": "John Doe"}

Note:

If Rule 1 is met for multiple distinct non-key columns in the WHERE clause, each non-key column will have its value inserted. Previously, the connector inserted null for the employee_name value as the connector did not insert values for non-key columns when present in the WHERE clause.

Statement:

UPSERT employee SET employee_number = 3 WHERE _id = 3 and (employee_name='Jane Doe' or employee_number = 3)

Result: 

_id employee_number employee_name DocumentAsJson

1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}

3 3 <null> {"_id": 3, "employee_number": 3, "employee_name": null}

Note:

No value is inserted for employee_name because the WHERE clause does not comply with the rules in UPSERT Statements for Base Tables.

Rule 2: Same Non-key Column in the WHERE Clause and SET Clause

When a non-key column is specified in the SET clause and the conditions from Rule 1 are met for that same column, the value from the SET clause is inserted.

Examples

Statement:

UPSERT employee SET employee_number = 4 WHERE _id = 4 and employee_name='Jane Doe' and employee_number=1000

Result:

_id employee_number employee_name

1 1 Joe Bloggs

4 4 Jane Doe

Statement:

UPSERT employee SET employee_name = 'John Doe' WHERE _id = 3 and (employee_name='Jane Doe' or employee_number = 3)

Result:

_id employee_number employee_name DocumentAsJson

1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}

3 <null> John Doe {"_id": 3, "employee_number": null, "employee_name": "John Doe"}

Rule 3: Non-key Column in the WHERE Clause and DocumentAsJson in the SET Clause

Rule 1 does not apply to non-key columns when the DocumentAsJson column is in the SET clause.

Examples

Statement:

UPSERT employee SET DocumentAsJson = '{"employee_name": "John Doe", "employee_number": 2}' WHERE _id = 3 and employee_name = 'Jane Doe'

Result:

_id employee_number employee_name DocumentAsJson

1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}

3 2 John Doe {"_id": 3, "employee_number": 2, "employee_name": "John Doe"}

Statement:

UPSERT employee SET DocumentAsJson = '{"employee_name": "John Doe"}' WHERE _id = 3 and employee_name = 'Jane Doe' and employee_number = 3

Result:

_id employee_number employee_name DocumentAsJson

1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}

3 <null> John Doe {"_id": 3, "employee_name": "John Doe"}

Note:

The value of employee_number in the WHERE clause is not inserted because DocumentAsJson is in the set clause.

Rule 4: Duplicate Non-key Columns in the WHERE Clause

If there are multiple equalities with literal predicates for the same non-key column in the WHERE clause and the column is not present in the set clause, the value for that non-key column is set to null.

Example

Statement:

UPSERT employee SET employee_number = 5 WHERE _id = 3 and employee_name='Jane Doe' and employee_name='John Doe'

Result:

_id employee_number employee_name DocumentAsJson

1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}

3 5 <null> {"_id": 3, "employee_number": 5}

Note:

The same rules on the non-key columns apply to virtual tables except DocumentAsJson columns as it is not supported on virtual tables.

DELETE

The DELETE statement is not supported for virtual tables.

The connector considers a table valid as long as the table contains some data. If a table is completely empty, then the connector is not able to access the table. Consider the following example:

DELETE FROM sample_table_3

The command removes all data from sample_table_3. Therefore, sample_table_3 is invalid. Any users attempting to access the table receive an error.