Accessing Spark
This Quickstart Guide for Windows assumes that you already have a Spark database that you can query.
If necessary, you can use the following resources to set up Hadoop and Spark:
- You can download and use one of the preconfigured Hadoop distributions:
- You can download Hadoop from https://hadoop.apache.org/releases.html. For information about installing and configuring Hadoop, see the Hadoop Wiki: http://wiki.apache.org/hadoop/Hadoop2OnWindows.
- You can download Spark from http://spark.apache.org/downloads.html. For information about installing and configuring Spark, see the Apache Spark documentation: http://spark.apache.org/docs/latest/.
- You can download a sample data set named FAA_Spark from http://www.simba.com/wp-content/uploads/2014/11/FAA_Spark.zip.
Note:
The sample data set is a modified version of the FAA data set. The original FAA data set is available for download at http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time.
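After downloading the sample data set, you need to unpack the archive before loading it into Spark. The following is a minimal sketch of that step in Python, using only the standard library. It assumes that the archive has already been saved locally as FAA_Spark.zip. The function name extract_sample_data and the destination folder name are hypothetical and are not part of the guide, and the contents of the archive are not assumed here.

```python
import zipfile
from pathlib import Path


def extract_sample_data(archive_path, dest_dir):
    """Unpack a downloaded ZIP archive and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        # testzip() returns the name of the first corrupt member,
        # or None when every member passes its CRC check.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"Corrupt member in archive: {bad_member}")
        archive.extractall(dest)
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Assumption: FAA_Spark.zip is in the current working directory.
    for name in extract_sample_data("FAA_Spark.zip", "FAA_Spark"):
        print(name)
```

Listing the extracted names is a quick way to confirm that the download completed before you try to load the data.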