Using SPSS Statistics

You can import Spark data into SPSS Statistics with a standard SQL query. The SPSS Database Wizard can automatically generate and execute an appropriate query based on parameters you specify.

The following procedure is written for SPSS Statistics 23. The Simba Spark ODBC Driver also supports earlier versions of SPSS Statistics.

Important:

Make sure that the bitness of the driver matches the bitness of SPSS Statistics: use the 32-bit driver with 32-bit SPSS Statistics, and the 64-bit driver with 64-bit SPSS Statistics. For more information about diagnosing a bitness mismatch, see "Architecture Mismatch Problems" in Troubleshooting.

To retrieve data from your Spark data store using SPSS:

  1. In SPSS Statistics, select File > Open Database > New Query. The Database Wizard opens at the Welcome page.
  2. From the ODBC Data Source list, select your DSN.
  3. Under Select the Table Types, select the check boxes corresponding to the types of tables you want to retrieve and then click Next.
  4. On the Select Data page, select and order the tables and fields that you want to retrieve and then click Next.
  5. If you selected more than one table, then on the Specify Relationships page, specify how the tables should be joined and then click Next.
  6. On the Limit Retrieved Cases page, specify how to limit the data you retrieve and then click Next.
  7. On the Define Variables page, specify how SPSS Statistics should define the variables for the retrieved fields and then click Next.
  8. On the Results page, review the generated SQL.
  9. Make sure that Retrieve the Data I Have Selected is selected and then click Finish.

Data retrieved from the selected tables is displayed in SPSS Statistics in a new Dataset window. You can now use SPSS Statistics to analyze the data.
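
To make the procedure above more concrete, the following Python sketch assembles the kind of SQL statement that the wizard pages describe (fields and tables, an optional join, an optional filter) and wraps it in SPSS GET DATA syntax. It is an illustration only: the helper functions, the table and column names, and the DSN name "Sample Spark DSN" are hypothetical, and the query and syntax that the Database Wizard actually generates may differ in quoting, join style, and subcommands.

```python
def build_query(fields, tables, join_on=None, where=None):
    """Assemble a SELECT statement from wizard-style choices (hypothetical helper).

    fields  : column names in retrieval order (Select Data page)
    tables  : one or two table names (Select Data page)
    join_on : join condition, required for two tables (Specify Relationships page)
    where   : optional filter condition (Limit Retrieved Cases page)
    """
    if not fields or not tables:
        raise ValueError("at least one field and one table are required")
    if len(tables) > 2:
        raise ValueError("this sketch handles at most two tables")

    sql = "SELECT " + ", ".join(fields) + " FROM " + tables[0]

    # A second table needs an explicit relationship, as in step 5.
    if len(tables) == 2:
        if join_on is None:
            raise ValueError("join_on is required when two tables are selected")
        sql += " INNER JOIN " + tables[1] + " ON " + join_on

    # Limiting the retrieved cases corresponds to a WHERE clause, as in step 6.
    if where:
        sql += " WHERE " + where
    return sql


def to_get_data_syntax(dsn, sql):
    """Wrap a query in SPSS GET DATA syntax (assumed minimal form)."""
    # Single quotes inside the SQL are doubled so that they survive
    # being placed inside a single-quoted SPSS string.
    escaped = sql.replace("'", "''")
    return (
        "GET DATA\n"
        "  /TYPE=ODBC\n"
        "  /CONNECT='DSN=" + dsn + ";'\n"
        "  /SQL='" + escaped + "'.\n"
        "EXECUTE."
    )


# Hypothetical tables and columns, for illustration only.
query = build_query(
    fields=["c.id", "c.name", "o.total"],
    tables=["customers c", "orders o"],
    join_on="c.id = o.customer_id",
    where="o.total > 100",
)
print(query)
print(to_get_data_syntax("Sample Spark DSN", query))
```

Comparing a hand-built statement like this with the SQL shown on the Results page can help confirm that the join and filter you specified in the wizard match what you intended.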

Note:

For more information about connecting to a database in SPSS Statistics, see the SPSS Statistics Help documentation that is provided in the application.