SPARK and Scala Download Syllabus PDF

Apache Spark is a cluster computing system. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.

Uploaded by shubham phulari
Copyright © All Rights Reserved

A) Scala

• Why Scala?

• What is Scala?

• Introducing Scala

• Installing Scala

• Scala Basics

• Scala Basic Types

• Defining Functions

• If statements

• Scala For Comprehensions

• While Loops

• Do-While Loops

• Conditional Operators

• Basic Object-Oriented Programming in Scala

• Case Objects and Classes
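
As a rough illustration of the Scala topics listed above, the following sketch touches basic types, function definitions, `if` expressions, a for comprehension, a while loop, and case classes with a case object. All names (`ScalaBasics`, `sign`, `evenSquares`, `sumTo`, `Shape`) are hypothetical and not taken from the course material. A do-while loop is left out because Scala 3 dropped that syntax, so the sketch compiles on both Scala 2 and Scala 3.

```scala
// A small tour of the Scala topics in this section. All names are illustrative.
object ScalaBasics {
  // Basic types and immutable values
  val count: Int = 3
  val label: String = "spark"

  // Defining a function; `if` is an expression that yields a value
  def sign(n: Int): String =
    if (n > 0) "positive" else if (n < 0) "negative" else "zero"

  // For comprehension with a guard: squares of the even numbers in 1..n
  def evenSquares(n: Int): List[Int] =
    (for (i <- 1 to n if i % 2 == 0) yield i * i).toList

  // While loop: sum of 1..n using mutable local variables
  def sumTo(n: Int): Int = {
    var i = 1
    var total = 0
    while (i <= n) {
      total += i
      i += 1
    }
    total
  }

  // Case classes and a case object, taken apart with pattern matching
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rect(width: Double, height: Double) extends Shape
  case object Empty extends Shape

  def area(s: Shape): Double = s match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
    case Empty      => 0.0
  }

  def main(args: Array[String]): Unit = {
    println(sign(count))
    println(evenSquares(6))
    println(sumTo(4))
    println(area(Rect(2.0, 3.0)))
  }
}
```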

B) Spark
• Introduction to Big Data

• Challenges with Big Data

• Batch Vs. Real Time Big Data Analytics

• Batch Analytics – Hadoop Ecosystem Overview

• Spark Opportunity and Solution

• In Memory Data – Spark

• What is Spark?

• Modes of Spark

• Spark Installation
• Spark Standalone Cluster

• Capabilities and Ecosystem

• Spark Components vs Hadoop

• Loading a File in Spark Shell

• Performing Some Basic Operations on Files in Spark Shell

C) RDD Fundamentals
• Purpose and Structure of RDDs

• Transformations, Actions, and DAG

• RDD programming API
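
The distinction between transformations and actions in this section can be sketched with a small word count. This is a minimal example, assuming the Spark core library is on the classpath and that local mode (`local[*]`) is acceptable. The object name, method name, and sample input are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddSketch {
  // Counts words in the given lines using the RDD API.
  def wordCounts(sc: SparkContext, lines: Seq[String]): Map[String, Int] = {
    val rdd = sc.parallelize(lines)

    // Transformations: these only extend the lineage (DAG); nothing runs yet
    val counts = rdd
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Action: collect() triggers the job and returns the results to the driver
    counts.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside this JVM; the app name is arbitrary
    val conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val result = wordCounts(sc, Seq("rdds are lazy", "actions run jobs", "rdds are resilient"))
      result.foreach { case (word, n) => println(s"$word -> $n") }
    } finally {
      sc.stop()
    }
  }
}
```

Collecting to the driver is only sensible for small results such as this one; larger outputs would normally be written out with an action like `saveAsTextFile`.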

D) Spark SQL / DataFrames

• Spark SQL and DataFrame Uses

• DataFrame / SQL APIs

• Catalyst Query Optimization
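
To make the DataFrame and SQL APIs in this section concrete, here is a minimal sketch that expresses the same query both ways. It assumes the Spark SQL library is on the classpath and runs in local mode. The sample data, the `people` view name, and the object and method names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object DataFrameSketch {
  // Returns the names of people aged 21 or over, once via the DataFrame API
  // and once via SQL, so the two results can be compared.
  def adultNames(spark: SparkSession): (Seq[String], Seq[String]) = {
    import spark.implicits._

    // Hypothetical sample data
    val people = Seq(("Ana", 34), ("Bo", 19), ("Cy", 27)).toDF("name", "age")

    // DataFrame API
    val viaApi = people.filter($"age" >= 21).select("name")

    // Equivalent SQL against a temporary view
    people.createOrReplaceTempView("people")
    val viaSql = spark.sql("SELECT name FROM people WHERE age >= 21")

    // explain(true) prints the logical plans and the physical plan,
    // which is one way to look at what the Catalyst optimizer produced
    viaSql.explain(true)

    (viaApi.as[String].collect().toSeq.sorted, viaSql.as[String].collect().toSeq.sorted)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("df-sketch").master("local[*]").getOrCreate()
    try {
      val (api, sql) = adultNames(spark)
      println(api)
      println(sql)
    } finally {
      spark.stop()
    }
  }
}
```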

E) Spark Job Execution

• Jobs, Stages, and Tasks

• Partitions and Shuffles

• Data Locality

• Job Performance

F) Spark Streaming
• Streaming Sources and Tasks

• DStream APIs and Stateful Streams

• Reliability and Fault Recovery

G) Spark MLlib
• Classification Algorithm

• Clustering Algorithm

• Sequence Mining Algorithm

• Collaborative Filtering

H) Spark GraphX
• Graph Analysis with Spark

• GraphX for Graphs

• Graph-Parallel Computation
