Write-Back
The Simba MongoDB JDBC Driver supports Data Manipulation Language (DML) statements such as INSERT, UPDATE, UPSERT, and DELETE. The connector does not support SQL subqueries or JDBC transactions.
Important:
Writing data to the MongoDB database might change the typing of the data. If the existing data is of a type that is different from the column data type specified in the schema definition, then the existing data will be replaced by new data that is of the column data type.
When the connector samples the data and generates the schema definition, the connector defines a SQL data type and a MongoDB data type for each column in the base tables and virtual tables, but does not change the data types of the individual values in the database. As a result, columns might contain mixed data types. The connector supports DML statements on mixed data types. When you execute a write operation, the connector will attempt to complete the operation using the column data type specified in the schema definition.
The data provided in DML statements must match the column data types. For example, a String value cannot be inserted in a column that is defined as an Integer column in the schema definition.
The connector handles each DML statement as described below:
Important:
The CREATE, ALTER, and DROP statements are not supported for tables.
INSERT
Each row in MongoDB needs to have a unique ID represented by the _id column. If not provided during insertion, MongoDB auto-generates a unique ID for each row. The _id field is exposed as a valid column in the connector and can be auto-generated when issuing INSERT statements through the connector. For example, consider the following table.
_id | sample_column |
---|---|
"517024D6CC79814E3FEBD352" |
1 |
"5170ED77E49CC93A918DE316" |
2 |
To insert a document with an auto-generated value for _id (data type: jstOID), issue the following command:
INSERT INTO sample_table_1(sample_column) VALUES(3)
The following table shows the table after the insertion.
_id | sample_column |
---|---|
"517024D6CC79814E3FEBD352" |
1 |
"5170ED77E49CC93A918DE316" |
2 |
"51710FFCE49CC93A918DE322" |
3 |
The value for the _id column can also be inserted using the INSERT statements, as in the following examples:
INSERT INTO sample_table_2(_id, sample_column) VALUES(1,1)
INSERT INTO sample_table_2 VALUES(1,1)
UPDATE
When updating rows, special care needs to be taken to avoid duplicate values for the _id column. As mentioned before, _id needs to be unique across all rows. When an UPDATE statement tries to set a value for the _id column and matches multiple rows, only one of the rows is updated with the new values, and the connector return an error for the remaining rows. UPDATE is not executed atomically.
Note:
You can configure the connector to execute UPDATE statements as UPSERT statements instead. To do this, set the UpsertOnUpdate
property to true
. For more information, see UpsertOnUpdate.
For information about how the connector handles UPSERT statements, see the following section.
UPSERT
When you execute an UPSERT statement, the connector handles the data differently depending on whether the UPSERT statement specifies a target location that already exists in the database:
- If the statement specifies an existing row in the MongoDB database, then the row is updated with the new data values.
- Alternatively, if the statement specifies a row that does not exist, then the data values are inserted as a new row in the database.
The basic syntax of an UPSERT statement is as follows:
UPSERT [TableName] SET [ColumnName] = '[Value]' WHERE _id = '[RowID]'
The SET clause specifies the data that you want to add to the database, while the specified table name and the WHERE clause indicate the location in the MongoDB database where you want this data to be added. The evaluation of the comparison in the WHERE clause determines whether the data is updated in an existing row or inserted as a new row.
Depending on whether you are executing the UPSERT statement on a base table or a virtual table, the SET and WHERE clauses are subject to different syntax rules. For more information, including examples of valid UPSERT statements, see the following sections:
- UPSERT Statements for Base Tables
- UPSERT Statements for Virtual Tables
- UPSERT Statements for Non-key Columns in the WHERE Clause
UPSERT Statements for Base Tables
When writing an UPSERT statement to add data to a base table, make sure that the statement complies with all of the following rules:
- The SET clause cannot specify any _id values. This pertains to values for the _id column in a table, as well as values for _id fields in a DocumentAsJson column.
- If the SET clause specifies the DocumentAsJson column, then this SET clause cannot specify any other columns.
- If the table uses _id values that are of a data type other than ObjectID, then you must specify an _id value in the WHERE clause. Otherwise, specifying the _id value is optional, depending on whether or not you want the data to be associated with a specific _id value when it gets inserted. When the _id value is of type ObjectID, the MongoDB database can automatically generate a valid _id value for the data being inserted.
- When specifying an _id value in the WHERE clause, you must include it as a predicate that defines an equality comparison between the _id column and a literal value.
- The WHERE clause cannot include more than one predicate that involves the _id column.
- If the WHERE clause contains more than one predicate, then you must append any additional predicates using the AND operator.
For example, the following WHERE clause is valid for an UPSERT operation:
WHERE _id = '5db8d144aa075d2990217ee8'
Additionally, the WHERE clause must comply with these rules:
If the WHERE clause does not follow these rules, and the _id value is of type ObjectID, then the data is inserted using an automatically generated _id value instead of the value that you specified. If the _id value is not of type ObjectID, then the query fails and the connector returns an error.
Example: Base Table with ObjectID _id Values
For example, consider the following base table named Customer, which uses ObjectID values for the_id column:
_id | customer_name | DocumentAsJson |
---|---|---|
5db8d144aa075d2990217ee8 |
John Doe |
{"_id": "5db8d144aa075d2990217ee8", "customer_name": "John Doe", "address": ["123 Broadway", "111 Cambie Street"] |
6z8guqk7xm8u14mhfw3peqyc |
Jane Doe |
{"_id": "6z8guqk7xm8u14mhfw3peqyc", "customer_name": "Jane Doe", "address": ["456 Central Avenue", "122 Main Street"]} |
The following UPSERT statement complies with all the rules stated above, and specifies the _id of the first table row. If you execute this statement, the connector updates the customer_name value of the first row from "John Doe" to "James Smith":
UPSERT Customer SET customer_name = 'James Smith' WHERE _id = '5db8d144aa075d2990217ee8'
The following UPSERT statement complies with all the rules stated above, but specifies an _id that does not exist in the table. If you execute this statement, the connector inserts a new row with "76ky3d4fex619fbd9uny9cqr" as the _id value, "Joe Smith" as the customer_name value, and a JSON representation of the row as the DocumentAsJson value:
UPSERT Customer SET customer_name = 'James Smith' WHERE _id = '76ky3d4fex619fbd9uny9cqr'
The following UPSERT statement specifies an _id that does not exist in the table, and does not comply with the rule of using the AND operator to append additional WHERE predicates. If you execute this statement, the connector inserts a new row with a randomly generated _id value, “Joe Smith” as the customer_name value, and a JSON representation of the row as the DocumentAsJson value:
UPSERT Customer SET customer_name = 'James Smith' WHERE _id = '5db8d144aa075d2990217ee8' OR customer_name = 'John Doe'
Example: Base Table with Integer _id Values
As another example, consider the following base table named Employees, which uses Integer values for the _id column:
_id | customer_name | DocumentAsJson |
---|---|---|
1 |
Joe Smith |
{"_id": 1, "employee_name": "Joe Smith", "address": ["987 Park Lane ", "133 Big Street"]} |
2 |
Jill Scott |
{"_id": 2, "employee_name": "Jill Scott", "address": ["543 Central Avenue", "212 Boardwalk Place"]} |
The following UPSERT statement complies with all the rules stated above, and specifies an _id that does not exist in the table. If you execute this statement, the connector inserts a new row with "3" as the _id value, "Dale Hill" as the customer_name value, and a JSON representation of the row as the DocumentAsJson value:
UPSERT Employee SET employee_name = 'Dale Hill' WHERE _id = '3'
The WHERE clause in the following UPSERT statement does not refer to an existing row in the table, and does not specify an _id value. Because the Employees table uses Integer values instead of ObjectID values for the _id column, MongoDB cannot generate an appropriate _id value for the row that is being inserted. If you execute this statement, the connector returns an error:
UPSERT Employee SET employee_name = 'Dale Hill' WHERE employee_name = 'Bob Jackson'
UPSERT Statements for Virtual Tables
When writing an UPSERT statement to add data to a virtual table, make sure that the statement complies with all of the following rules:
- The SET clause cannot specify any _id values or index column values.
- The WHERE clause must specify exactly one _id value, and also one value for each index column except for the bottom-level index column, which is optional. The index column values must be of type Integer. Each of these predicates in the WHERE clause must be an equality comparison with a literal value, and they must be appended using the AND operator.
Note:
By default, index columns are identified by the "_idx" suffix in the column names.
Example: Virtual Table for a Nested Object
For example, consider the following base table named Company, which has an Employee column that contains nested objects:
_id | Employee |
---|---|
76ky3d4fex619fbd9uny9cqr |
{"name": "James Smith", "contact": [{"phone" : [{"home": ["604-555-1234","778-555-5678" ]}, {"work": ["604-555-9876"] } ] } ] } |
8z6guqk7xm8u14mhfw3peqyc |
{"name": "Jill Scott", "contact": [{"phone": [{"home": ["604-555-8989","778-555-5678" ]}, {"work": ["604-555-4345"] } ] } ] } |
When this data is expanded into virtual tables, the home phone number values are represented by the following virtual table named Company_Employee:
_id | Employee_contact_dim1_idx | Employee_contact_phone_dim1_idx | Employee_contact_phone_home_dim1_idx | Employee_contact_phone_home |
---|---|---|---|---|
76ky3d4fex619fbd9uny9cqr |
0 |
0 |
0 |
604-555-1234 |
76ky3d4fex619fbd9uny9cqr |
0 | 0 |
1 |
778-555-5678 |
8z6guqk7xm8u14mhfw3peqyc |
0 | 0 |
0 |
604-555-8989 |
8z6guqk7xm8u14mhfw3peqyc |
0 | 0 |
1 |
778-555-3232 |
The following UPSERT statement complies with all the rules stated above, and specifies the _id and complete set of index values of the first table row. If you execute this statement, the connector updates the phone number in the first row from "604-555-1234" to "604-555-2424":
UPSERT Company_Employee SET Employee_contact_phone_home = '604-555-2424' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_dim1_idx = 0 AND Employee_contact_phone_dim1_idx = 0 AND Employee_contact_phone_home_dim1_idx = 0
The following UPSERT statement complies with all the rules stated above, specifies an existing _id value, and specifies all of the index values except for the bottom-level one. If you execute this statement, the connector inserts a new row with "76ky3d4fex619fbd9uny9cqr" as the _id value and "604-555-2424" as the phone number. The connector inserts this data at the index point that is next in the sequence, so in this case, the number would be inserted as the third row in the virtual table.
UPSERT Company_Employee SET Employee_contact_phone_home = '604-555-2424' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_dim1_idx = 0 AND Employee_contact_phone_dim1_idx = 0
The following UPSERT statement complies with all the rules stated above, and specifies an existing _id value as well as a complete set of index values. The specified bottom-level index value does not exist in the table. If you execute this statement, the connector inserts a new row with "604-555-2424" as the phone number. The connector inserts this data at the specified index point.
UPSERT Company_Employee SET Employee_contact_phone_home = '604-555-2424' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_dim1_idx = 0 AND Employee_contact_phone_dim1_idx = 0 AND Employee_contact_phone_home_dim1_idx = 2
You can omit indices from the WHERE clause in the following cases:
- Only _id is provided in the WHERE clause.
- The WHERE clause contains _id and one or more non-key columns separated by AND and no index columns.
Note:
-
The conditions above must not include index and _id columns in the SET clause as per UPSERT Statements for Virtual Tables.
-
The following rules still apply:
- Providing bottom-level index column in the WHERE clause is optional.
- The index column values must be an integer. Each of these predicates in the WHERE clause must be an equality comparison with a literal value and they must be appended using the AND operator.
Examples
Note:
The following examples are in reference to the virtual table named Company_Employee seen above.
The following UPSERT statement complies with all the rules stated above, and specifies an UPSERT to a virtual table with no index column in the WHERE clause and _id that does not exist in the database:
UPSERT Company_Employee SET Employee_contact_phone_home='778-123-4444' WHERE _id = '6081b6a8a4c3273edc69fe87'
Executing the statement above would insert a new row in Company_Employee:
U6081b6a8a4c3273edc69fe87 0 0 0 778-123-4444
The following UPSERT statement complies with all the rules stated above, and specifies an UPSERT to a virtual table with _id that does not exist and one index column in the WHERE clause:
UPSERT Company_Employee SET Employee_contact_phone_home ='604-666-1234' WHERE _id = '76ky3d4fex619fbd9uny9cqr' AND Employee_contact_phone_dim1_idx = 1
The query fails because Rule 2 wasn't respected. The WHERE clause must either have no index columns or all index columns except the bottom level index.
The following UPSERT statement complies with all the rules stated above, and specifies multiple UPSERTs in one batch with repeated _id that does not exist and no indices in the WHERE clause:
UPSERT Company_Employee SET Employee_contact_phone_home = ? WHERE _id = ?
The batch contains:
('604-666-1234', '6081b6a8a4c3273edc69fe87')
('604-777-1234', '6081b6a8a4c3273edc69fe87')
The following rows are inserted:
6081b6a8a4c3273edc69fe87 0 0 0 604-666-2424
6081b6a8a4c3273edc69fe87 1 0 0 604-666-2424
Note:
A batch UPSERT with the same _id repeated results in inserting new rows with level 1 index incremented by 1. This only occurs when the _id value does not already exist in the database.
If the _id value already exists, the UPSERT query will result in update rather than insert as expected with UPSERT statements. For instance, if the query above contained the following batch:
('604-666-1234', '76ky3d4fex619fbd9uny9cqr')
('604-777-1234', '76ky3d4fex619fbd9uny9cqr')
The result is that the Employee_contact_phone_home is updated to 604-777-1234 for all rows with _id 76ky3d4fex619fbd9uny9cqr and therefore the table would look like this:
76ky3d4fex619fbd9uny9cqr 0 0 0 604-777-1234
76ky3d4fex619fbd9uny9cqr 0 0 1 604-777-1234
8z6guqk7xm8u14mhfw3peqyc 0 0 0 604-555-8989
8z6guqk7xm8u14mhfw3peqyc 0 0 1 778-555-3232
The following UPSERT statement complies with all the rules stated above, and specifies multiple UPSERTs in one batch with different _id values that do not exist and no indices in the WHERE clause:
UPSERT Company_Employee SET Employee_contact_phone_home = ? WHERE _id = ?
The batch contains:
('604-666-1234', '6081b6a8a4c3273edc69fe87')
('604-777-1234', '60837a4082eb2ecce38fc837')
The following rows will be inserted:
6081b6a8a4c3273edc69fe87 0 0 0 604-666-1234
60837a4082eb2ecce38fc837 0 0 0 604-777-1234
UPSERT Statements for Non-key Columns in the WHERE Clause
The connector uses the following UPSERT rules in relation to non-key columns in the WHERE clause:
- Rule 1: Distinct Non-key Columns in the WHERE Clause
- Rule 2: Same Non-key Column in the WHERE Clause and SET Clause
- Rule 3: Non-key Column in the WHERE Clause and DocumentAsJson in the SET Clause
- Rule 4: Duplicate Non-key Columns in the WHERE Clause
Rule 1: Distinct Non-key Columns in the WHERE Clause
A value can be inserted for non-key columns if:
- A non-key column is specified in the WHERE clause as an equality comparison with a literal value.
- The equality on the non-key column and the rest of the conditions in the WHERE clause is separated by the AND operator.
For example:
_id | employee_number | employee_name | DocumentAsJson |
---|---|---|---|
1 |
1 |
Joe Bloggs |
{"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"} |
Examples
Statement:
UPSERT employee SET employee_number = 2 WHERE _id = 2 and employee_name='John Doe'
Result:
_id employee_number employee_name DocumentAsJson
1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}
2 2 John Doe {"_id": 2, "employee_number": 2, "employee_name": "John Doe"}
Note:
If Rule 1 is met for multiple distinct non-key columns in the WHERE clause, each non-key column will have its value inserted. Previously, the connector inserted null for the employee_name value as the connector did not insert values for non-key columns when present in the WHERE clause.
Statement:
UPSERT employee SET employee_number = 3 WHERE _id = 3 and (employee_name='Jane Doe' or employee_number = 3)
Result:
_id employee_number employee_name DocumentAsJson
1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}
3 3 <null> {"_id": 3, "employee_number": 3, "employee_name": null}
Note:
No value is inserted for employee_name because the WHERE clause does not comply with the rules in UPSERT Statements for Base Tables.
Rule 2: Same Non-key Column in the WHERE Clause and SET Clause
When a non-key column is specified in the SET clause and the conditions from Rule 1 are met for that same column, the value from the SET clause is inserted.
Examples
Statement:
UPSERT employee SET employee_number = 4 WHERE _id = 4 and employee_name='Jane Doe' and employee_number=1000
Result:
_id employee_number employee_name
1 1 Joe Bloggs
4 4 Jane Doe
Statement:
UPSERT employee SET employee_name = 'John Doe' WHERE _id = 3 and (employee_name='Jane Doe' or employee_number = 3)
Result:
_id employee_number employee_name DocumentAsJson
1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}
3 <null> John Doe {"_id": 3, "employee_number": null, "employee_name": "John Doe"}
Rule 3: Non-key Column in the WHERE Clause and DocumentAsJson in the SET Clause
Rule 1 does not apply to non-key columns when the DocumentAsJson column is in the SET clause.
Examples
Statement:
UPSERT employee SET DocumentAsJson = '{"employee_name": "John Doe", "employee_number": 2}' WHERE _id = 3 and employee_name = 'Jane Doe'
Result:
_id employee_number employee_name DocumentAsJson
1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}
3 2 John Doe {"_id": 3, "employee_number": 2, "employee_name": "John Doe"}
Statement:
UPSERT employee SET DocumentAsJson = '{"employee_name": "John Doe"}' WHERE _id = 3 and employee_name = 'Jane Doe' and employee_number = 3
Result:
_id employee_number employee_name DocumentAsJson
1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}
3 <null> John Doe {"_id": 3, "employee_name": "John Doe"}
Note:
The value of employee_number in the WHERE clause is not inserted because DocumentAsJson is in the set clause.
Rule 4: Duplicate Non-key Columns in the WHERE Clause
If there are multiple equalities with literal predicates for the same non-key column in the WHERE clause and the column is not present in the set clause, the value for that non-key column is set to null.
Example
Statement:
UPSERT employee SET employee_number = 5 WHERE _id = 3 and employee_name='Jane Doe' and employee_name='John Doe'
Result:
_id employee_number employee_name DocumentAsJson
1 1 Joe Bloggs {"_id": 1, "employee_number": 1, "employee_name": "Joe Bloggs"}
3 5 <null> {"_id": 3, "employee_number": 5}
Note:
The same rules on the non-key columns apply to virtual tables except DocumentAsJson columns as it is not supported on virtual tables.
DELETE
The DELETE statement is not supported for virtual tables.
The connector considers a table valid as long as the table contains some data. If a table is completely empty, then the connector is not able to access the table. Consider the following example:
DELETE FROM sample_table_3
The command removes all data from sample_table_3. Therefore, sample_table_3 is invalid. Any users attempting to access the table receive an error.
- Data Types
- Write Concern Driver Configuration Options on page 1
- Features
- Installing and Using the Simba MongoDB JDBC Driver
- Configuring the Simba MongoDB JDBC Driver on page 1
- Driver Configuration Options on page 1