Creating a Data Source Name

Typically, after installing the Simba Apache Spark ODBC Connector, you need to create a Data Source Name (DSN). A DSN is a data structure that stores connection information so that it can be used by the connector to connect to Spark.

You can specify connection settings in a DSN (in the odbc.ini file), in a connection string, or as connector-wide settings (in the simba.sparkodbc.ini file). Settings in the connection string take precedence over settings in the DSN, and settings in the DSN take precedence over connector-wide settings.
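The precedence order can be illustrated with a minimal Python sketch. This is not the connector's implementation: the function name `resolve_settings` and the sample values are hypothetical, and the sketch assumes that keys match exactly, including case.

```python
def resolve_settings(connector_wide, dsn, connection_string):
    """Merge three layers of settings; later layers override earlier ones."""
    merged = {}
    # Lowest precedence first, highest precedence last.
    for layer in (connector_wide, dsn, connection_string):
        merged.update(layer)
    return merged


# Hypothetical sample values, chosen only to show which layer wins.
connector_wide = {"Port": "10000"}                      # simba.sparkodbc.ini
dsn = {"Host": "192.168.222.160", "Port": "10001"}      # odbc.ini
connection_string = {"Port": "10002", "UID": "jsmith"}  # connection string

effective = resolve_settings(connector_wide, dsn, connection_string)
print(effective)
```

In this sketch, `Port` ends up with the connection-string value, while `Host` comes from the DSN because no higher layer sets it.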

The following instructions describe how to create a DSN by specifying connection settings in the odbc.ini file. If your machine is already configured to use an existing odbc.ini file, then update that file by adding the settings described below. Otherwise, copy the odbc.ini file from the Setup subfolder in the connector installation directory to the home directory, and then update the file as described below.

For information about specifying settings in a connection string, see Configuring a DSN-less Connection and Using a Connection String. For information about connector-wide settings, see Setting Connector-Wide Configuration Options.

To create a Data Source Name:

  1. In a text editor, open the odbc.ini configuration file.

    Note: If you are using a hidden copy of the odbc.ini file, you can remove the period (.) from the start of the file name to make the file visible while you are editing it.

  2. In the [ODBC Data Sources] section, add a new entry by typing a name for the DSN, an equal sign (=), and then the name of the connector.

    For example, on a macOS machine:

    [ODBC Data Sources]
    Sample DSN=Simba Apache Spark ODBC Connector

    For example, for a 32-bit connector on a Linux/AIX/Solaris machine:

    [ODBC Data Sources]
    Sample DSN=Simba Apache Spark ODBC Connector 32-bit

  3. Create a section that has the same name as your DSN, and then specify configuration options as key-value pairs in the section:
    1. Set the Driver property to the full path of the connector library file that matches the bitness of the application.

      For example, on a macOS machine:

      Driver=/Library/simba/spark/lib/libsparkodbc_sbu.dylib

      For example, for a 32-bit connector on a Linux/AIX/Solaris machine:

      Driver=/opt/simba/spark/lib/32/libsparkodbc_sb32.so

    2. Set the SparkServerType property to one of the following values:
      • If you are running Shark 0.8.1 or earlier, set the property to 1.
      • If you are running Shark 0.9 or Spark 1.1 or later, set the property to 3.

      For example:

      SparkServerType=3

    3. Specify whether the connector uses the DataStax AOSS service when connecting to Spark, and provide the necessary connection information:
      • To connect to Spark without using the DataStax AOSS service, do the following:
        1. Set the ServiceDiscoveryMode property to No Service Discovery.
        2. Set the Host property to the IP address or host name of the Spark server.
        3. Set the Port property to the number of the TCP port that the Spark server uses to listen for client connections.

        For example:

        ServiceDiscoveryMode=No Service Discovery
        Host=192.168.222.160
        Port=10000

      • Or, to discover Spark services via the DataStax AOSS service, set properties as described in Configuring DataStax AOSS Service Discovery.
    4. If authentication is required to access the Spark server, then specify the authentication mechanism and your credentials. For more information, see Configuring Authentication.
    5. If you want to connect to the server through SSL, then enable SSL and specify the certificate information. For more information, see Configuring SSL Verification.

      Note: If the AuthMech property is set to 2 or 5, SSL is not available.

    6. If you want to configure server-side properties, then set them as key-value pairs using a special syntax. For more information, see Configuring Server-Side Properties.
    7. Optionally, set additional key-value pairs as needed to specify other optional connection settings. For detailed information about all the configuration options supported by the Simba Apache Spark ODBC Connector, see Driver Configuration Options.
  4. Save the odbc.ini configuration file.

    Note: If you are storing this file in its default location in the home directory, then prefix the file name with a period (.) so that the file becomes hidden. If you are storing this file in another location, then save it as a non-hidden file (without the prefix), and make sure that the ODBCINI environment variable specifies the location. For more information, see Specifying the Locations of the Connector Configuration Files.
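The procedure above can also be scripted. The following Python sketch uses the standard-library `configparser` module to add a DSN to an odbc.ini file; it is one possible approach, not a tool shipped with the connector. The helper names `odbc_ini_path` and `add_dsn` are hypothetical, and the demonstration writes to a temporary directory rather than to a real configuration file. Note that `configparser` discards comments when it rewrites an existing file.

```python
import configparser
import os
import tempfile
from pathlib import Path


def odbc_ini_path():
    """Return the ODBCINI location if set, else the hidden file in the home directory."""
    return Path(os.environ.get("ODBCINI") or Path.home() / ".odbc.ini")


def add_dsn(ini_path, dsn_name, connector_name, settings):
    """List dsn_name under [ODBC Data Sources] and write its own section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case; configparser lowercases keys by default
    parser.read(ini_path)     # a missing file is skipped without error

    if not parser.has_section("ODBC Data Sources"):
        parser.add_section("ODBC Data Sources")
    parser.set("ODBC Data Sources", dsn_name, connector_name)

    if not parser.has_section(dsn_name):
        parser.add_section(dsn_name)
    for key, value in settings.items():
        parser.set(dsn_name, key, str(value))

    with open(ini_path, "w") as f:
        # Write "key=value" without spaces, matching the examples in this section.
        parser.write(f, space_around_delimiters=False)


# Demonstration against a throwaway file, using the macOS values from the steps above.
demo_path = Path(tempfile.mkdtemp()) / "odbc.ini"
add_dsn(
    demo_path,
    "Sample DSN",
    "Simba Apache Spark ODBC Connector",
    {
        "Driver": "/Library/simba/spark/lib/libsparkodbc_sbu.dylib",
        "SparkServerType": 3,
        "ServiceDiscoveryMode": "No Service Discovery",
        "Host": "192.168.222.160",
        "Port": 10000,
    },
)
written = demo_path.read_text()
print(written)
```

To target the real file instead, pass `odbc_ini_path()` as the first argument to `add_dsn`, ideally after making a backup copy.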

For example, the following is an odbc.ini configuration file for macOS containing a DSN that connects to a Spark Thrift Server instance and authenticates the connection using a user name and password:

[ODBC Data Sources]
Sample DSN=Simba Apache Spark ODBC Connector

[Sample DSN]
Driver=/Library/simba/spark/lib/libsparkodbc_sbu.dylib
SparkServerType=3
ServiceDiscoveryMode=No Service Discovery
Host=192.168.222.160
Port=10000
UID=jsmith
PWD=simba123

Similarly, the following is an odbc.ini configuration file for a 32-bit connector on a Linux/AIX/Solaris machine, containing a DSN that connects to a Spark Thrift Server instance and authenticates the connection using a user name and password:

[ODBC Data Sources]
Sample DSN=Simba Apache Spark ODBC Connector 32-bit

[Sample DSN]
Driver=/opt/simba/spark/lib/32/libsparkodbc_sb32.so
SparkServerType=3
ServiceDiscoveryMode=No Service Discovery
Host=192.168.222.160
Port=10000
UID=jsmith
PWD=simba123
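Before using a file like the ones above, it can help to run a quick structural check. The following Python sketch is a hypothetical helper (`check_dsn` is not part of the connector) that verifies only what this section describes: the DSN is listed under [ODBC Data Sources], a section with the same name exists, and that section sets Driver. It also assumes that Port, when present, should be a plain number. It does not confirm that the connector can actually load the file or reach the server.

```python
import configparser


def check_dsn(ini_text, dsn_name):
    """Return a list of structural problems found for dsn_name (empty if none)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(ini_text)

    problems = []
    if not parser.has_option("ODBC Data Sources", dsn_name):
        problems.append(f"'{dsn_name}' is not listed in [ODBC Data Sources]")
    if not parser.has_section(dsn_name):
        problems.append(f"no [{dsn_name}] section found")
        return problems

    section = parser[dsn_name]
    if "Driver" not in section:
        problems.append("Driver is not set")
    port = section.get("Port")
    if port is not None and not port.isdigit():
        problems.append(f"Port is not a number: {port!r}")
    return problems


# The macOS example from this section, used as test input.
SAMPLE = """\
[ODBC Data Sources]
Sample DSN=Simba Apache Spark ODBC Connector

[Sample DSN]
Driver=/Library/simba/spark/lib/libsparkodbc_sbu.dylib
SparkServerType=3
ServiceDiscoveryMode=No Service Discovery
Host=192.168.222.160
Port=10000
UID=jsmith
PWD=simba123
"""

print(check_dsn(SAMPLE, "Sample DSN"))
print(check_dsn(SAMPLE, "Other DSN"))
```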

You can now use the DSN in an application to connect to the data store.
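As one illustration of using the DSN from an application, the following Python sketch builds a connection string that references the DSN and then connects through pyodbc. pyodbc is a third-party ODBC bridge that this guide does not mention, so treat it as an assumption; any ODBC-capable application or library should work in a similar way. The helper names `build_connection_string` and `connect_with_dsn` are hypothetical. Values containing a semicolon or braces are wrapped in braces, following general ODBC connection-string rules.

```python
def build_connection_string(dsn_name, **overrides):
    """Build 'DSN=...;KEY=value' text; overrides take precedence over the DSN's settings."""
    parts = [("DSN", dsn_name)] + list(overrides.items())
    pieces = []
    for key, value in parts:
        value = str(value)
        if any(ch in value for ch in ";{}"):
            # Brace-quote special characters; a literal '}' is doubled.
            value = "{" + value.replace("}", "}}") + "}"
        pieces.append(f"{key}={value}")
    return ";".join(pieces)


def connect_with_dsn(dsn_name, **overrides):
    """Open a connection via pyodbc (assumed to be installed; imported lazily)."""
    import pyodbc  # third-party package, not part of the connector

    return pyodbc.connect(build_connection_string(dsn_name, **overrides), autocommit=True)


conn_str = build_connection_string("Sample DSN", UID="jsmith")
print(conn_str)

# Requires pyodbc, the connector, and a reachable Spark server:
# conn = connect_with_dsn("Sample DSN")
```

Because connection-string settings take precedence over DSN settings, passing `UID` or `PWD` here overrides the values stored in odbc.ini, which is one way to keep credentials out of the file.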