3 - PDFsam - Beginner Guide Spark

Uploaded by

mitmak


© 2016 ACADGILD. All rights reserved.


No part of this book may be reproduced, distributed, or transmitted in any form or by any means, electronic or
mechanical methods, including photocopying, recording, or by any information storage retrieval system, without
permission in writing from ACADGILD.

Disclaimer
This material is intended only for the learners and is not intended for any commercial purpose. If you are not the
intended recipient, then you should not distribute or copy this material. Please notify the sender immediately or
click here to contact us.

Published by
ACADGILD,
[email protected]

Become a Big Data & Hadoop Developer 02


In this ebook, we discuss the basics of Spark's functionality
and its installation.

What is Spark?
Apache Spark is a cluster computing framework that can run on Hadoop
and handles different types of data. It is a one-stop solution to many
data processing problems. Spark has rich resources for handling data
and, most importantly, it can be 10-20x faster than Hadoop's MapReduce.
It attains this speed through its in-memory primitives: data is cached
in memory (RAM), and computations are performed in memory instead of
writing intermediate results to disk.
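
The effect of in-memory caching can be illustrated with a small sketch. This is a toy model in plain Python, not Spark's API: the ToyDataset class, its cache() and collect() methods, and the load_numbers function are hypothetical names invented for this illustration. It only shows why keeping a dataset in memory avoids rebuilding it on every use.

```python
import time


class ToyDataset:
    """Toy stand-in for a lazily evaluated dataset (not Spark's API)."""

    def __init__(self, compute):
        self._compute = compute    # function that rebuilds the data from scratch
        self._cached = None        # in-memory copy, filled on first use
        self._use_cache = False
        self.computations = 0      # how many times the data was rebuilt

    def cache(self):
        # Mark the dataset to be kept in memory after it is first computed.
        self._use_cache = True
        return self

    def collect(self):
        # Serve from memory when a cached copy exists.
        if self._use_cache and self._cached is not None:
            return self._cached
        self.computations += 1
        data = self._compute()
        if self._use_cache:
            self._cached = data
        return data


def load_numbers():
    # Hypothetical slow source, standing in for a read from disk.
    time.sleep(0.01)
    return [n * n for n in range(1000)]


uncached = ToyDataset(load_numbers)
cached = ToyDataset(load_numbers).cache()

# Use each dataset five times, as an iterative job would.
for _ in range(5):
    sum(uncached.collect())
    sum(cached.collect())

print(uncached.computations, cached.computations)  # 5 1
```

The uncached dataset is rebuilt on every pass, while the cached one is built once and then read from memory. Iterative jobs that reuse the same data benefit from this in the same way.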

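Spark's rich resources cover almost all the components of the Hadoop
ecosystem. For example, we can perform both batch processing and
real-time data processing in Spark without a separate stream-processing
engine, because Spark has its own streaming engine called Spark
Streaming. Tools such as Kafka or Flume can still feed data into it,
but the stream processing itself needs no additional engine.

[Figure: the Spark stack. Spark SQL + DataFrames, Spark Streaming,
MLlib (machine learning), and GraphX (graph computation) sit on top of
the Spark Core API.]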
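
Spark Streaming treats a stream as a series of small batches, which is why the same processing logic can serve both batch and real-time data. The sketch below imitates that idea in plain Python, assuming a simple word count as the job. It does not use Spark's API, and count_words and micro_batches are hypothetical helper names chosen for this illustration.

```python
from collections import Counter


def count_words(lines):
    """Count words in a list of text lines (the shared processing logic)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def micro_batches(records, size):
    """Split records into small consecutive batches, imitating a stream."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


lines = ["spark runs fast", "spark streams data"]

# Batch processing: the whole dataset is handled in one pass.
batch_counts = count_words(lines)

# Stream processing: the same function runs on each small batch as it
# arrives, and the results are merged into a running total.
running_counts = Counter()
for chunk in micro_batches(lines, size=1):
    running_counts.update(count_words(chunk))

print(batch_counts["spark"], running_counts["spark"])  # 2 2
```

Both paths reach the same counts. The streaming path simply produces them incrementally as each small batch is processed.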
