DVS SPARK Course Content

The DVS SPARK Course covers an introduction to Apache Spark and the Scala programming language, including its functional and object-oriented programming principles. It includes detailed sections on Spark architecture, RDDs, Spark SQL, Spark Streaming, and advanced topics, along with practical workshops and projects. The course aims to equip learners with the necessary skills to work with Spark and related technologies.

Uploaded by

Ipsita Patel

INTRODUCTION AND OVERVIEW OF APACHE SPARK

Scala programming language

1. Scala introduction
✓ Scala as a functional programming (FP) language
✓ Scala as an object-oriented programming (OOP) language
✓ Simple definitions of FP and OOP in the context of Scala
✓ Where Scala is used
✓ History of Scala
✓ Scala program flow
✓ First Scala program
✓ Immutability
✓ Interactive shell REPL

2. Variables

✓ variables
✓ Properties of variable
✓ Creating variable
✓ val keyword
✓ var keyword
✓ Summary: when to use val and when to use var
✓ Expressions
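
As a minimal sketch of the val/var distinction and of expressions listed above (the names `ValVarDemo`, `courseName`, `label`, and `sumTo` are hypothetical and chosen only for illustration):

```scala
object ValVarDemo {
  // val: an immutable binding; reassigning it is a compile-time error.
  val courseName: String = "Spark"

  // if/else is an expression in Scala, so it yields a value directly.
  def label(count: Int): String =
    if (count == 1) "1 topic" else s"$count topics"

  // var: a mutable binding, used here only as a local accumulator.
  def sumTo(n: Int): Int = {
    var total = 0
    for (i <- 1 to n) total += i
    total
  }

  def main(args: Array[String]): Unit = {
    println(courseName)
    println(label(3))
    println(sumTo(4))
  }
}
```

A common rule of thumb is to default to val and reach for var only when local mutation is clearly simpler.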

3. Data types

✓ Types of data types


o Byte
o Short
o Int
o Long
o Float
o Double
o Char
o Boolean

Address: DVS Technologies, Opp Home Town, Beside Biryani Zone, Maratha halli, Bangalore-37
Phone: 9632558585, 8892499499 |E-mail :[email protected] | www.dvstechnologies.in
4. Flow control

✓ if
✓ if else
✓ if else if
✓ while
✓ do while
✓ for
✓ match and case (Scala's counterpart to switch)
✓ pattern matching
✓ return
✓ break
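
Scala has no switch statement or break keyword; a match expression and the standard library's `scala.util.control.Breaks` fill those roles. The sketch below illustrates both, using hypothetical names (`FlowControlDemo`, `describe`, `firstMultipleOf`):

```scala
import scala.util.control.Breaks.{break, breakable}

object FlowControlDemo {
  // Pattern matching: cases are tried top to bottom; the first match wins.
  def describe(x: Any): String = x match {
    case 0               => "zero"
    case n: Int if n < 0 => "negative int"
    case n: Int          => "positive int"
    case s: String       => s"string of length ${s.length}"
    case _               => "something else"
  }

  // break() exits the enclosing breakable block once a multiple is found.
  def firstMultipleOf(k: Int, upTo: Int): Int = {
    var found = -1
    breakable {
      for (i <- 1 to upTo) {
        if (i % k == 0) { found = i; break() }
      }
    }
    found
  }

  def main(args: Array[String]): Unit = {
    println(describe(-5))
    println(firstMultipleOf(7, 20))
  }
}
```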

5. Scala is functional programming

✓ What is a function?
✓ Why function?
✓ Where are functions used?
✓ Two types of functions
o Without parameters
o With parameters
▪ VarArg parameters
✓ Higher order function
✓ Pure functions
✓ Examples on functions
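
The function topics above can be sketched in a few lines. This is an illustrative example only; `FunctionsDemo` and its members are hypothetical names, not course material:

```scala
object FunctionsDemo {
  // A function without parameters.
  def greeting(): String = "Hello, Scala"

  // A function with a parameter. It is pure: the result depends only
  // on the input and there are no side effects.
  def square(x: Int): Int = x * x

  // Varargs: callers may pass zero or more Int arguments.
  def total(xs: Int*): Int = xs.sum

  // Higher-order function: takes another function as an argument.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  def main(args: Array[String]): Unit = {
    println(greeting())
    println(total(1, 2, 3))
    println(applyTwice(square, 3))
  }
}
```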

6. Scala is Object Oriented Programming too

✓ What is OOPs (Object Oriented Programming Principles)


o class
o object
o Why should we create an object for a class?
▪ What is an Object
▪ Characteristics of an object
✓ Data hiding (information hiding)
✓ Abstraction
✓ Encapsulation
✓ Methods
o Types of methods
o Zero parameterized methods
o Parameterized methods
✓ Constructors in Scala
o What is the purpose of constructor?
o When does a constructor execute?
o How many times does a constructor execute?
o Does the developer need to call a constructor explicitly, like a method?
o Types of constructor
o Primary constructor without parameters
o Primary constructor with parameters
o Auxiliary Constructor
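
A short sketch of primary and auxiliary constructors, assuming a hypothetical `Student` class invented for this example:

```scala
// The parameter list after the class name is the primary constructor.
class Student(val name: String, val age: Int) {
  // The class body is the primary constructor's body: it runs once,
  // automatically, each time an instance is created with `new`.
  require(age >= 0, "age must be non-negative")

  // Auxiliary constructor: its first statement must call another
  // constructor of the same class.
  def this(name: String) = this(name, 0)
}

object ConstructorDemo {
  def main(args: Array[String]): Unit = {
    val a = new Student("Asha", 21) // primary constructor
    val b = new Student("Ravi")     // auxiliary constructor
    println(s"${a.name}: ${a.age}, ${b.name}: ${b.age}")
  }
}
```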

✓ Inheritance
o What is inheritance?
o How to implement inheritance?
o Still expecting more explanation then…
o Advantages of Inheritance:
o Types of Inheritance
▪ Single Inheritance
▪ Multi-level Inheritance
▪ Multiple Inheritance
▪ Why is multiple inheritance not supported?

✓ Polymorphism
o What is polymorphism
o Dynamic Polymorphism
o Method Overloading
o Cases in overloading
▪ Difference in the number of parameters.
▪ Difference in the datatype of parameters.
▪ Difference in the order or sequence of parameters.
✓ Can we overload main() method?
✓ Method overriding
o When should we go for overriding?
o Difference between Method overloading and Method overriding
✓ final keyword
o final method
o final class
o Smart question: if we use the final keyword, are we missing out on OOP features?
✓ Abstract class
o Abstract keyword
o Types of methods
▪ Implemented method
▪ Unimplemented method
✓ Abstract method
✓ Abstract class
✓ Abstract variable
✓ If you have time
o Please prepare given scenarios
✓ trait
o trait keyword
o What is trait?
o A single class can extend multiple traits
o If you have time
▪ Please prepare given scenarios
✓ Different types of classes
o Normal
o Singleton
o Standalone
✓ Singleton object
o Purpose of singleton object
o Difference between instance variable and singleton variable
o How to access singleton variable
✓ Companion object
o What is companion object
o Advantage
o Rules to define companion object
✓ Case class
o Case keyword
o Why case class?
o Advantage
o Difference between case class and normal class
✓ Implicits
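
To make several of the OOP topics above concrete (traits, abstract classes, method overriding, case classes, companion objects), here is a minimal sketch. All type names (`Greeter`, `Logger`, `Service`, `Shape`, `Rect`, `Counter`) are hypothetical and exist only for this illustration:

```scala
// Traits can carry implemented methods; a class may mix in several.
trait Greeter { def greet(name: String): String = s"Hello, $name" }
trait Logger  { def log(msg: String): String = s"[log] $msg" }
class Service extends Greeter with Logger

// An abstract class with one unimplemented and one implemented method.
abstract class Shape {
  def area: Double
  def describe: String = s"area = $area"
}

// A case class gets equals, hashCode, toString, copy, and a factory
// apply method generated by the compiler.
case class Rect(w: Double, h: Double) extends Shape {
  override def area: Double = w * h
}

// A class and its companion object share a name and a source file;
// the companion can access the class's private constructor.
class Counter private (val value: Int) {
  def next: Counter = new Counter(value + 1)
}
object Counter {
  def apply(): Counter = new Counter(0)
}

object OopDemo {
  def main(args: Array[String]): Unit = {
    val s = new Service
    println(s.log(s.greet("Spark")))
    println(Rect(2, 3).describe)
    println(Counter().next.value)
  }
}
```

Case classes compare by value, so `Rect(2, 3) == Rect(2, 3)` is true, whereas two instances of a normal class compare by reference unless `equals` is overridden.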

Spark Index

1. Spark Introduction

✓ What is Spark?
✓ Prerequisites to learn spark
✓ Purpose of Spark
✓ Which programming language is Spark written in?
✓ Can Spark integrate with Hadoop?
✓ What kinds of files does Spark support?
✓ Does Spark depend on Hadoop?
✓ History of Spark
✓ Spark features
✓ Introduction to Spark's Scala and Python shells
✓ How to deploy spark applications
o Spark Deployment modes and their usage patterns
✓ Spark Architecture
o Standalone cluster mode
o Spark on YARN mode
✓ Apache Spark Components or modules
o Core
o SQL
o Streaming
o MLlib
o GraphX
o SparkR
✓ Cluster Managers
✓ Storage Layers for Spark
✓ Spark Execution Model
✓ Spark Terminology table
✓ Spark follows…
✓ Driver program
✓ Executors
✓ SparkContext
o How many SparkContext objects can be created for one application?
o Stopping SparkContext object
o SparkContext responsibilities
✓ Spark 1.x version
✓ Solution in Spark 2.x
✓ RDD

2. Programming with RDD (Spark’s Data Abstraction)
✓ Importance of RDD
✓ Partitions in RDD
✓ Different ways of creating RDDs
✓ RDD Lineage and Persistence
✓ RDD Partitioning & How It Helps Achieve Parallelization
✓ Caching
✓ Persistence
✓ Fault-Recovery Mechanism
✓ If RAM is insufficient to hold an RDD, where is it stored?
✓ RDD features
✓ Spark RDD Operations
✓ Transformations
o Types of Transformations
o Narrow Transformations
o Wide Transformations
✓ Actions
✓ Limitation of RDD
✓ RDD Operations
✓ Transformations & Actions
✓ Programs
✓ Coalesce and Repartition
✓ Data Loading and Saving through RDDs
✓ Performing data transformations and aggregations/joins through RDDs
✓ RDD Advanced concepts – Accumulators, Broadcast variables
✓ Internals of Job execution in Spark

3. Spark SQL and DataFrames


✓ Need for Spark SQL, Spark SQL and its features
✓ Spark SQL Architecture
✓ DataFrames – A Spark SQL data abstraction
✓ Connecting to diverse Data Sources (HDFS, Hive, S3, RDBMS and NoSQL etc.)
✓ Loading and writing to different file formats (CSV, XML, JSON, Parquet, ORC)
✓ Interoperating with RDDs
✓ Building ETL pipelines through Spark SQL
✓ Data transformation and aggregation/joins through Spark SQL
✓ Spark SQL User Defined Functions (UDFs)
✓ Spark SQL integration with Hive – Loading and writing to Hive tables in a partitioned and bucketed manner
✓ Real-time challenges and case studies

4. Spark Streaming
✓ Batch vs Streaming, Spark Streaming and its features
✓ Architecture and Abstraction
✓ DStreams, DStreams vs RDD
✓ Spark Streaming workflow, DStream Transformations
✓ Input Streams (Socket, HDFS, Twitter, Kafka)
✓ Kafka and its architecture
✓ Using Kafka as source in Spark Streaming
✓ Fault tolerance through checkpointing, persistence, and caching
✓ Batch and Window Sizes
✓ Aggregations through Stateful operators

5. Upgrading Spark – Spark 2.x, 3.x


✓ Advancements in Spark 2.x and 3.x
✓ New robust data abstraction – Dataset
✓ Unified entry point to all Spark libraries
✓ Introduction to Spark Structured Streaming
✓ Examples running through Spark Structured Streaming

6. Overview sessions on Kafka, Cassandra

7. Project in Spark Streaming with Kafka and Cassandra

8. Advanced Spark topics

9. PySpark workshops once every 2-3 months
