This document provides instructions for installing Apache Spark using Docker on Windows. It includes the following steps:
1. Install WSL2 and Docker on your Windows system.
2. Pull the jupyter/all-spark-notebook Docker image from Docker Hub, which contains Scala, Spark, and Jupyter Notebook.
3. Run the Spark container with ports exposed and a local folder mounted, and obtain the Jupyter notebook URL.
4. Use the Jupyter notebook to initialize Spark and run simple Spark code on a test RDD to verify the installation works locally.
Install Spark
Apache Spark with Docker
Achmad Ginanjar @2022
To do list
Install WSL2
Install Docker
Install Apache Spark
WSL 2 Installation
• Manual installation steps for older versions of WSL | Microsoft Docs: https://docs.microsoft.com/en-us/windows/wsl/install-manual#step-4---download-the-linux-kernel-update-package
• Follow the steps through step 4
Install the Linux subsystem in step 4
Install Docker
Choose the options you want
Docker Desktop installation finished
Return to the WSL 2 installation section
Docker up and running
Run Docker
Open a command prompt and make sure the command above runs as shown on the slide
• Windows: “cmd”
• Linux: “terminal”
Go to:
notebook
Run Spark container
docker run --rm -p 4040:4040 -p 8888:8888 -p 8080:8080 -v C:\Users\mambo\Documents\mengajar\sparkFolder:/home/jovyan/work -e GRANT_SUDO=yes --user root jupyter/all-spark-notebook
To access the server, open this file in a browser
• file:///home/jovyan/.local/share/jupyter/runtime/jpserver-8-open.html
• Or copy and paste one of these URLs: http://b53b3edc1e9b:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
• or http://127.0.0.1:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
Load Spark kernel
%%init_spark
# configure Spark for local mode
launcher.master="local"
Run the following code
val rdd = sc.parallelize(0 to 100)
rdd.sum()
Open the Spark console
http://localhost:4040
val rdd = sc.parallelize(0 to 10000)
rdd.sum()
Thank you
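The two test sums above have a closed form, n(n+1)/2, which gives a quick way to check that the notebook returned the right values. The following is a minimal sketch in plain Scala, assuming it is pasted into a notebook cell or a Scala REPL; it does not use Spark, and the helper name expectedSum is hypothetical, not part of the slides.

```scala
// Hypothetical helper (not from the slides): closed-form sum of the integers 0 to n.
def expectedSum(n: Long): Double = n * (n + 1) / 2.0

// rdd.sum() returns a Double, so the local comparison also uses Doubles.
val small = (0 to 100).map(_.toDouble).sum
val large = (0 to 10000).map(_.toDouble).sum

// Both assertions pass: the values are 5050.0 and 50005000.0.
assert(small == expectedSum(100))
assert(large == expectedSum(10000))

println(s"0 to 100 -> $small, 0 to 10000 -> $large")
```

If the rdd.sum() results in the notebook match these values, the local Spark setup is computing correctly. The larger value is displayed in scientific notation as 5.0005E7.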