
Install Spark

This document provides instructions for installing Apache Spark using Docker on Windows. It includes the following steps: 1. Install WSL2 and Docker on your Windows system. 2. Pull the jupyter/all-spark-notebook Docker image from Docker Hub, which contains Scala, Spark, and Jupyter Notebook. 3. Run the Spark container with ports exposed and a local folder mounted, and obtain the Jupyter Notebook URL. 4. Use the Jupyter notebook to initialize Spark and run simple Spark code on a test RDD to verify that the installation works locally.

Uploaded by

istumpul5

Apache Spark with Docker

Achmad Ginanjar @2022


To do list

Install WSL2

Install Docker

Install Apache Spark


WSL 2 Installation
• Manual installation steps for older versions of WSL | Microsoft Docs
https://docs.microsoft.com/en-us/windows/wsl/install-manual#step-4---download-the-linux-kernel-update-package
• Follow the steps up to step 4
Install the Linux subsystem at step 4
Install Docker
Choose the options you want
Docker Desktop installation finished
Return to the WSL 2 installation section
Docker up and running
Run Docker
Open a command prompt and make sure the command above runs as shown on the slide
• Windows: “cmd”
• Linux: “terminal”
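The command shown on the slide is not reproduced in this text, so the following is only a minimal sketch of one way to confirm that the Docker CLI is reachable from a Linux or WSL terminal. The function name `check_docker` is hypothetical and not from the slides; it relies only on `command -v` and `docker --version`.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): report whether the docker CLI is on PATH.
check_docker() {
  if command -v docker >/dev/null 2>&1; then
    # docker --version prints the installed client version
    echo "docker found: $(docker --version)"
  else
    echo "docker not found on PATH"
  fi
}

check_docker
```

If the second message appears, Docker Desktop is probably not running or its WSL integration is not enabled for the current distribution.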
Go to:

Docker Hub

https://hub.docker.com/r/jupyter/all-spark-notebook/#!
Installation

docker pull jupyter/all-spark-notebook

Installs Scala, Spark, and Jupyter Notebook
Run the Spark container
docker run --rm -p 4040:4040 -p 8888:8888 -p 8080:8080 -v C:\Users\mambo\Documents\mengajar\sparkFolder:/home/jovyan/work -e GRANT_SUDO=yes --user root jupyter/all-spark-notebook
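To make the flags in the command above easier to read, here is a sketch that assembles the same command from its parts and prints it without running it. The `HOST_DIR` variable and the `build_run_cmd` function are hypothetical names introduced for illustration; the default folder is a placeholder, not the path used in the slides.

```shell
#!/bin/sh
# HOST_DIR is a hypothetical placeholder for the local folder to mount.
HOST_DIR="${HOST_DIR:-$HOME/sparkFolder}"

# Print (do not execute) the docker run command for a given host folder.
build_run_cmd() {
  # --rm                 remove the container when it stops
  # -p host:container    publish a container port on the host
  #                      (8888 serves Jupyter, 4040 the Spark console; the
  #                       slides also publish 8080 without saying what uses it)
  # -v host:container    mount the host folder at /home/jovyan/work
  # -e GRANT_SUDO=yes    together with --user root, as used in the slides
  printf '%s ' docker run --rm \
    -p 4040:4040 -p 8888:8888 -p 8080:8080 \
    -v "$1:/home/jovyan/work" \
    -e GRANT_SUDO=yes --user root \
    jupyter/all-spark-notebook
  printf '\n'
}

build_run_cmd "$HOST_DIR"
```

Printing the command first is a cheap way to check the mount path before starting the container, since a wrong `-v` path is an easy mistake to make here.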
To access the server, open this file in a browser:
• file:///home/jovyan/.local/share/jupyter/runtime/jpserver-8-open.html
• Or copy and paste one of these URLs:
http://b53b3edc1e9b:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
• or
http://127.0.0.1:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
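The container prints the login URL among many other log lines, so it can help to filter for it. The sketch below pulls the 127.0.0.1 URL out of log text; `extract_url` is a hypothetical helper, and the sample line reuses the token shown above (a real run generates a different token each time).

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin, print the first local Jupyter URL.
extract_url() {
  # In a basic regular expression, '?' is a literal character.
  grep -o 'http://127\.0\.0\.1:8888/lab?token=[0-9a-f]*' | head -n 1
}

# Sample log line built from the output shown in the slides.
sample_log='     or http://127.0.0.1:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e'

printf '%s\n' "$sample_log" | extract_url
```

For a container that is already running, the same filter could be fed from `docker logs <container> 2>&1`, where `<container>` stands for the container's name or ID.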
Load the Spark kernel
%%init_spark
# Spark configuration for local mode
launcher.master="local"
Run the following code

val rdd = sc.parallelize(0 to 100)


rdd.sum()
Open the Spark console
http://localhost:4040
val rdd = sc.parallelize(0 to 10000)
rdd.sum()
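As a cross-check that does not depend on Spark, the expected totals for both test RDDs can be computed with the closed form n(n+1)/2. This is only a sanity-check sketch; `expected_sum` is a hypothetical helper, not part of the slides.

```shell
#!/bin/sh
# Sum of the integers 0..n using the closed form n(n+1)/2.
expected_sum() {
  echo $(( $1 * ($1 + 1) / 2 ))
}

expected_sum 100      # 5050
expected_sum 10000    # 50005000
```

Spark's `rdd.sum()` returns a floating-point value, so the notebook will likely show these totals as 5050.0 and 5.0005E7 rather than as plain integers.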
Thank you
