Sampling Data for DynamoDB

You can use the options in the Sample View to specify how the connector samples data from DynamoDB to generate a schema definition. The connector samples the data in order to detect its structure and determine the data mappings that best support the data.

To sample data for DynamoDB:

  1. Choose one:

    • From the start page, click Create New, provide your connection information, and then click Connect. For detailed information about how to specify connection information, see Connecting to a Data Store.
      The Sample dialog box opens.

    • Or, from the Design View, click the Sample View tab. If you are not already connected to a data store, the Schema Editor prompts you to provide your connection information. For detailed information about how to specify connection information, see Connecting to a Data Store.
      The Schema Editor displays the Sample View.

  2. From the Sampling Method drop-down list, select the direction in which the connector reads data during sampling. For example, if you select Forward, the connector samples data starting with the first record in the data store, then samples the next record, and so on.

  3. In the Sampling Count field, type the maximum number of records that the connector can sample to generate the schema definition. To sample every record in the database, set this option to 0.

    Note:

    Typically, sampling a large number of records results in a schema definition that is more accurate and better able to represent all the data in the database. However, the sampling process might take longer than expected when many records are sampled, especially if the database contains complex, nested data structures.

  4. In the Sampling Interval field, type the interval at which the connector samples a record when scanning through the data store. For example, if you set this option to 2, then the connector samples every second record in the data store.

  5. In the bottom pane, specify the collections that the connector samples records from by selecting the corresponding check boxes in the Selected column. You can select every collection in the database by selecting the check box in the Selected column header.

    Note:

    You can group and sort collections by clicking a column header. For example, to group collections based on the catalogs they belong to and then sort those groupings by catalog name in ascending order, click the Catalog column header. To sort the list in descending order, click the header again. To disable sorting, click the header a third time.

  6. To generate the schema, click Sample.
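
The way the Sampling Method, Sampling Count, and Sampling Interval settings from steps 2 through 4 combine can be pictured with a short Python sketch. This is only an illustration, not the connector's implementation: the function name sample_records and its parameters are hypothetical, and the sketch assumes that an interval of N starts with the first record in the chosen direction and then takes every Nth record after it.

```python
def sample_records(records, count=0, interval=1, forward=True):
    """Return the records that a sampler with these settings would visit.

    records  -- list of records in data store order
    count    -- maximum number of records to sample; 0 means no limit
    interval -- take every interval-th record (assumed to start with the first)
    forward  -- True to read from the first record, False to read from the last
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if count < 0:
        raise ValueError("count must be 0 or greater")

    # Sampling Method: choose the direction in which records are read.
    ordered = records if forward else records[::-1]

    # Sampling Interval: keep every interval-th record.
    stepped = ordered[::interval]

    # Sampling Count: stop at the maximum, or keep everything when count is 0.
    return list(stepped) if count == 0 else list(stepped[:count])


records = list(range(1, 11))  # stand-in for ten records, numbered 1 to 10

print(sample_records(records, count=3, interval=2))                 # [1, 3, 5]
print(sample_records(records, count=0, interval=4))                 # [1, 5, 9]
print(sample_records(records, count=2, interval=3, forward=False))  # [10, 7]
```

In this sketch, a larger interval spreads a fixed sample count across more of the data store, while a count of 0 removes the upper limit entirely.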

The connector samples the data as specified and generates a schema definition, which opens in the Design View in the Schema Editor. If you return to the Sample View, the check boxes in the Sampled column are selected for all the collections that were included in the sampling process.
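
To see why a larger sample tends to produce a more complete schema definition, consider the following minimal Python sketch. It is an assumption-laden illustration rather than the connector's actual algorithm: the function infer_schema and the sample items are hypothetical, and the "schema" here is simply the set of attribute names and Python value types seen in the sampled items.

```python
def infer_schema(items):
    """Map each attribute name to the sorted type names observed for it."""
    schema = {}
    for item in items:
        for name, value in item.items():
            # Record every value type seen for this attribute.
            schema.setdefault(name, set()).add(type(value).__name__)
    return {name: sorted(types) for name, types in schema.items()}


# Hypothetical items; records in a schemaless store need not share attributes.
items = [
    {"id": 1, "name": "pen"},
    {"id": 2, "name": "ink", "tags": ["blue"]},
    {"id": "A3", "price": 4.5},
]

# Sampling only the first item misses the "tags" and "price" attributes
# and the fact that "id" is sometimes a string.
print(infer_schema(items[:1]))

# Sampling every item captures all attributes and both types of "id".
print(infer_schema(items))
```

The trade-off described in the note under step 3 shows up here as well: visiting every item gives the fullest picture, but the work grows with the number of items sampled.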