Parallel Scan

The Simba DynamoDB JDBC Driver supports parallel scans, which can improve performance by dividing a scan operation across multiple threads. To specify the number of threads that the connector uses to scan a table, open the table properties in the Schema Editor and change the value of the FetchSegments property.
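
To illustrate the idea of dividing one scan across several threads, the following is a minimal Java sketch. It is not the connector's implementation: the class ParallelScanSketch, the methods scanSegment and parallelScan, and the in-memory list that stands in for a table are all hypothetical. Each worker reads only the items that belong to its own segment, and the results are merged at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelScanSketch {

    // Hypothetical stand-in for scanning one segment of a table.
    // Here a segment is every totalSegments-th item, starting at index "segment".
    static List<Integer> scanSegment(List<Integer> table, int segment, int totalSegments) {
        List<Integer> result = new ArrayList<>();
        for (int i = segment; i < table.size(); i += totalSegments) {
            result.add(table.get(i));
        }
        return result;
    }

    // Runs one worker thread per segment and merges the per-segment results.
    static List<Integer> parallelScan(List<Integer> table, int totalSegments)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(totalSegments);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int s = 0; s < totalSegments; s++) {
                final int segment = s;
                Callable<List<Integer>> task = () -> scanSegment(table, segment, totalSegments);
                futures.add(pool.submit(task));
            }
            List<Integer> all = new ArrayList<>();
            for (Future<List<Integer>> f : futures) {
                all.addAll(f.get()); // blocks until that segment is finished
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            table.add(i);
        }
        // Three workers, comparable in spirit to setting FetchSegments to 3.
        System.out.println(parallelScan(table, 3).size());
    }
}
```

Even in this toy version, each additional segment adds the cost of creating and coordinating another thread, which is why a single thread can be the better choice for small tables.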

Before configuring parallel scans, consider the following factors:

  • Parallel scans consume more provisioned throughput units than single-threaded scans.
  • Parallel scans improve the performance of scan operations only on large tables. The overhead associated with running multiple threads often outweighs the performance gain of a parallel scan, so the default number of threads used for each table is 1.
  • The relationship between the number of threads used and the performance level of scan operations is not linear; continuing to increase the number of threads used does not guarantee an equal increase in performance.

Parallel scans are most effective when used on tables with high provisioned throughput values. Because each thread can consume only a certain amount of throughput, tables with large amounts of provisioned throughput can support more threads. If some of the provisioned throughput of a table appears to be unused, consider increasing the number of threads.
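
The reasoning above can be turned into a rough starting estimate. The following Java sketch is an illustration only and is not part of the connector: the class ThreadCountEstimate, the method estimateThreads, and the per-thread consumption figure are all hypothetical, and the throughput that one thread actually consumes must be measured for your own tables.

```java
public class ThreadCountEstimate {

    // Hypothetical rule of thumb: divide the provisioned read throughput of the
    // table by the throughput one scan thread is observed to consume.
    // Never returns less than 1, which matches the default of one thread.
    static int estimateThreads(long provisionedReadUnits, long unitsPerThread) {
        if (provisionedReadUnits < 0 || unitsPerThread <= 0) {
            throw new IllegalArgumentException("throughput values must be positive");
        }
        long threads = provisionedReadUnits / unitsPerThread;
        return (int) Math.max(1, Math.min(threads, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // Assumed figures: 1000 provisioned read units, 250 units used per thread.
        System.out.println(estimateThreads(1000, 250)); // prints 4
    }
}
```

Treat the result as a value to try for FetchSegments and then adjust by measurement, because performance does not scale linearly with the number of threads.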