Install Apache Spark in a Standalone Mode on Windows

Uploaded by

ranganadh

Apache Spark is a fast, unified analytics engine for cluster computing on
large data sets. It can process data stored in Hadoop and runs programs in
parallel across multiple nodes. It bundles several libraries: Spark SQL and
DataFrames, GraphX, MLlib, and Spark Streaming.
Spark operates in 4 different modes:
1. Standalone Mode: All processes run within a single JVM process.
2. Standalone Cluster Mode: Spark uses its own built-in job-scheduling
framework.
3. Apache Mesos: The worker nodes run on various machines, but the driver
runs only on the master node.
4. Hadoop YARN: The driver runs inside the application master, which YARN
manages on the cluster.
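
In practice, the mode is usually chosen through the master URL given to Spark. The sketch below maps the common master URL forms to the four modes listed above. The function name classify_master is hypothetical, and treating "local" URLs as the single-JVM mode is an assumption about what the first item refers to.

```python
def classify_master(master_url: str) -> str:
    """Return the deployment mode implied by a Spark master URL."""
    url = master_url.strip()
    # "local", "local[4]" and "local[*]" all run Spark inside one JVM.
    if url == "local" or url.startswith("local["):
        return "single JVM (local)"
    # spark://HOST:PORT points at Spark's own standalone cluster manager.
    if url.startswith("spark://"):
        return "standalone cluster"
    if url.startswith("mesos://"):
        return "Apache Mesos"
    if url == "yarn":
        return "Hadoop YARN"
    raise ValueError(f"unrecognized master URL: {master_url!r}")


if __name__ == "__main__":
    for example in ("local[*]", "spark://host:7077", "mesos://host:5050", "yarn"):
        print(example, "->", classify_master(example))
```

The host names and ports in the usage loop are placeholders, not values from this article.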
In this article, we will explore installing Apache Spark in Standalone
mode. Apache Spark is written in the Scala programming language and runs
on the JVM, so a Java installation is mandatory for Spark. Let's start
with the Java installation.

Installing Java:

Step 1: Download the Java JDK.
Step 2: Open the downloaded Java SE Development Kit installer and follow
the installation instructions.
Step 3: Open the Environment Variables dialog by typing “environment
variables” in the Windows search bar.
Set the JAVA_HOME variable:
To set the JAVA_HOME variable, follow the steps below:
• Under User variables, add a new variable JAVA_HOME with the value
C:\Program Files\Java\jdk1.8.0_261.
• Under System variables, add C:\Program Files\Java\jdk1.8.0_261\bin
to the PATH variable.
• Open the Command Prompt and type “java -version”. The command prints
the installed version, which verifies the Java installation.
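
The version check can also be scripted. As a minimal sketch, assuming Python 3.7 or later is available on the machine (Spark itself does not require it for this step), the hypothetical helpers below run "java -version" and extract the quoted version string from its output.

```python
import re
import subprocess


def parse_java_version(output: str):
    """Extract the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', output)
    return match.group(1) if match else None


def installed_java_version():
    """Run `java -version` and return its version string (None if unavailable)."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None  # java was not found on PATH
    # The JVM prints its version banner to stderr, so check both streams.
    return parse_java_version(result.stderr + result.stdout)


if __name__ == "__main__":
    print("Detected Java version:", installed_java_version())
```

A result of None suggests that the bin folder has not been added to PATH yet, or that the Command Prompt was opened before the variables were saved.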

Installing Scala:

To install Scala on your local machine, follow the steps below:
Step 1: Download Scala.
Step 2: Run the .exe file and follow the instructions to customize the
setup according to your needs.
Step 3: Accept the agreement and click the Next button.
Set environment variables:
• Under User variables, add a new variable SCALA_HOME with the value
C:\Program Files (x86)\scala.
• Under System variables, add C:\Program Files (x86)\scala\bin to the
PATH variable.
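
A common slip is to set a home variable but leave its bin folder off PATH. As a minimal sketch, assuming Python is available on the machine, the hypothetical helper check_home below reports both problems for a variable such as SCALA_HOME. It takes the environment as a plain mapping so that it can be tried out with sample values.

```python
import ntpath


def _norm(path: str) -> str:
    # Windows paths are case-insensitive; also drop any trailing backslash.
    return ntpath.normcase(ntpath.normpath(path))


def check_home(env, home_var: str, path_sep: str = ";"):
    """Return a list of problems with home_var and its bin folder on PATH."""
    home = env.get(home_var)
    if not home:
        return [f"{home_var} is not set"]
    bin_dir = ntpath.join(home, "bin")
    entries = [_norm(p) for p in env.get("PATH", "").split(path_sep) if p]
    if _norm(bin_dir) not in entries:
        return [f"{bin_dir} is not on PATH"]
    return []


if __name__ == "__main__":
    import os

    for name in ("JAVA_HOME", "SCALA_HOME"):
        print(name, check_home(os.environ, name) or "looks fine")
```

An empty list means the variable is set and its bin folder appears on PATH. Open a new Command Prompt before running the check, since already open windows keep the old values.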

Verify Scala installation:

In the Command Prompt, use the command below to verify the Scala
installation. If Scala is installed correctly, it starts the interactive
Scala shell:
scala

Installing Spark:

Download a pre-built version of Spark and extract the archive into a
folder on the C drive, such as C:\spark. The pre-built package has no
installer; extracting it is all that is needed to set up Spark.
Set environment variables:
• Under User variables, add a new variable SPARK_HOME with the value
C:\spark\spark-2.4.6-bin-hadoop2.7.
• Under System variables, add %SPARK_HOME%\bin to the PATH variable.
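
The folder name of a pre-built Spark package encodes both the Spark version and the Hadoop version it was built for, which is useful to know when picking a matching winutils.exe. The sketch below parses those two values from a SPARK_HOME path. The function name parse_spark_home is hypothetical, and the pattern assumes the usual "spark-X-bin-hadoopY" naming.

```python
import ntpath
import re


def parse_spark_home(spark_home: str):
    """Return (spark_version, hadoop_version) from the folder name, or None."""
    # Take the last path component, ignoring any trailing backslash.
    folder = ntpath.basename(ntpath.normpath(spark_home))
    match = re.fullmatch(
        r"spark-(\d+(?:\.\d+)*)-bin-hadoop(\d+(?:\.\d+)*)", folder
    )
    return match.groups() if match else None


if __name__ == "__main__":
    print(parse_spark_home(r"C:\spark\spark-2.4.6-bin-hadoop2.7"))
```

A result of None usually means SPARK_HOME points at the parent folder (for example C:\spark) rather than at the extracted package folder.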

Download Windows Utilities:

If you wish to work with Hadoop data, follow the steps below to download
the Windows utility for Hadoop:
Step 1: Download the winutils.exe file.

Step 2: Copy the file to the bin folder of your Spark installation, for
example C:\spark\spark-2.4.6-bin-hadoop2.7\bin.
Step 3: Now run “spark-shell” in the Command Prompt to verify the Spark
installation. If everything is set up correctly, the Spark banner appears,
followed by a scala> prompt.
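
Separately from the spark-shell check, it can help to confirm that winutils.exe actually landed in the bin folder. This is a minimal sketch, assuming Python is available. The helper name has_winutils is hypothetical, and the folder is read from the SPARK_HOME variable set earlier.

```python
import os
from pathlib import Path


def has_winutils(bin_dir) -> bool:
    """Return True if winutils.exe is a file directly inside bin_dir."""
    return (Path(bin_dir) / "winutils.exe").is_file()


if __name__ == "__main__":
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home is None:
        print("SPARK_HOME is not set")
    else:
        bin_dir = Path(spark_home) / "bin"
        print("winutils.exe present:", has_winutils(bin_dir))
```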
