
Apache Spark and Scala

Module 2: Scala – Essentials and Deep Dive

© 2015 BlueCamphor Technologies (P) Ltd.


Course Topics

Module 1: Getting Started / Introduction to Scala
Module 2: Scala – Essentials and Deep Dive
Module 3: Introducing Traits and OOPS in Scala
Module 4: Functional Programming in Scala
Module 5: Spark and Big Data Concepts
Module 6: Advanced Spark
Module 7: Understanding RDDs
Module 8: Shark, SparkSQL and Project Discussion

© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Slide 2


Session Objectives
In this session, you will learn about:

ᗍ Data Types in Scala

ᗍ Variable Types in Scala

ᗍ Lazy Values

ᗍ Control Structures in Scala

ᗍ Functions

ᗍ Procedures

ᗍ Collections

ᗍ Reserved Words

ᗍ Pattern Matching

ᗍ Enumeration

ᗍ Ternary Operators



Data Types in Scala
ᗍ A data type tells the compiler what kind of value is stored in a location
ᗍ Scala comes with built-in data types, such as Byte, Short, Int, Long, Float, Double, Char, String, Boolean and Unit, which you can use for your Scala variables
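
The built-in types can be illustrated with a short sketch. The object and value names below are illustrative assumptions, not taken from the slides:

```scala
object DataTypesDemo {
  val b: Byte = 127                    // 8-bit signed integer
  val s: Short = 32767                 // 16-bit signed integer
  val i: Int = 2147483647              // 32-bit signed integer
  val l: Long = 9223372036854775807L   // 64-bit signed integer (note the L suffix)
  val f: Float = 3.14f                 // 32-bit floating point (note the f suffix)
  val d: Double = 3.14                 // 64-bit floating point
  val c: Char = 'A'                    // 16-bit Unicode character
  val str: String = "Scala"            // sequence of characters
  val flag: Boolean = true             // true or false
  val u: Unit = ()                     // "no meaningful value", similar to void

  def main(args: Array[String]): Unit = {
    println(s"Int range: ${Int.MinValue} to ${Int.MaxValue}")
    println(s"$c has character code ${c.toInt}")
  }
}
```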



Data Types in Scala (cont’d)
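A few examples:

ᗍ A string literal evaluates to a String

ᗍ A whole-number literal evaluates to an Int

ᗍ A decimal literal evaluates to a Double

ᗍ Concatenating two numeric strings results in a String (such as "25"), not the Int sum of the two numbers

ᗍ Concatenating two strings results in a String

ᗍ Adding two integers results in an Int

ᗍ Adding a String and an Int results in a String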
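
The original screenshots for these examples are not available, so the following is a sketch of what such expressions might look like. The literal values and names are assumptions chosen for illustration:

```scala
object ExpressionTypesDemo {
  val text = "Hello"                   // String
  val whole = 25                       // Int
  val fraction = 2.5                   // Double
  val joinedDigits = "2" + "5"         // String concatenation: "25"
  val joinedWords = "Spark" + "Scala"  // String concatenation: "SparkScala"
  val sum = 2 + 5                      // Int addition: 7
  val mixed = "Total: " + 7            // String + Int gives a String: "Total: 7"

  def main(args: Array[String]): Unit = {
    println(joinedDigits)
    println(sum)
    println(mixed)
  }
}
```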



Variable Types in Scala

Variables are simply names that refer to a location in memory, a location that holds the value we are working with

Scala variables come in two kinds: values and variables

Values:

Immutable - “val” (read-only)

ᗍ Similar to final variables in Java

ᗍ Once initialized, a val can’t be reassigned



Variable Types in Scala (Cont’d)
Variables:

Mutable - “var” (read-write) - Similar to non-final variables in Java

When a name such as myVar is declared using the keyword var, it is a variable whose value can change; this is called a mutable variable

Vars can be reassigned
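
A minimal sketch of the difference between val and var. The name myVar comes from the slide text; myVal and the string values are assumptions for illustration:

```scala
object ValVarDemo {
  val myVal: String = "Hello"
  // myVal = "Goodbye"   // does not compile: reassignment to val

  // Declares a var, reassigns it, and returns the new value.
  def reassign(): String = {
    var myVar: String = "Hello"
    myVar = "Goodbye"     // allowed: a var can be reassigned
    myVar
  }

  def main(args: Array[String]): Unit = {
    println(myVal)
    println(reassign())
  }
}
```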



Variable Types in Scala (Cont’d)
Type Inference:

When you assign an initial value to a variable, the Scala compiler can figure out the type of the variable based on the value assigned to it

This is called type inference

Once a type is assigned to a variable, it remains the same for the variable's entire scope

Thus, Scala is a statically typed language
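
Type inference can be sketched as follows. The names and values are illustrative assumptions:

```scala
object TypeInferenceDemo {
  val count = 10        // inferred as Int
  val name = "Scala"    // inferred as String
  val ratio = 2.5       // inferred as Double

  // Shows that the inferred type is fixed for the whole scope of the variable.
  def retotal(): Int = {
    var total = 100     // inferred as Int
    total = 250         // fine: still an Int
    // total = "many"   // does not compile: type mismatch (String is not Int)
    total
  }

  def main(args: Array[String]): Unit = {
    println(count + retotal())
  }
}
```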



Variable Types in Scala (Cont’d)

Assigning Block Expression

ᗍ In Java or C++, a code block is a list of statements in curly braces { }

ᗍ In Scala, a { } block is a list of expressions, and the block itself is also an expression

ᗍ The value of a block is the value of its last expression

Note: You can assign the result of a block, or an anonymous function, to a variable/value in Scala
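
A small sketch of a block used as an expression, plus an anonymous function stored in a val. The names distance and square are hypothetical:

```scala
object BlockExpressionDemo {
  // The block's value is its last expression, here the square root.
  val distance: Double = {
    val dx = 3.0
    val dy = 4.0
    math.sqrt(dx * dx + dy * dy)
  }

  // An anonymous function assigned to a value.
  val square: Int => Int = (n: Int) => n * n

  def main(args: Array[String]): Unit = {
    println(distance)
    println(square(6))
  }
}
```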



Lazy Values

ᗍ One nice feature built into Scala is the "lazy val".
ᗍ Initialization of a lazy value is deferred until it is accessed for the first time
ᗍ For example: if you read a file abc.txt into a regular val and the file does not exist, you get a FileNotFoundException as soon as the val is defined

ᗍ If you declare the value as lazy, you don’t get this error at definition time, because initialization is delayed until the value is first accessed; the exception is raised at that point if the file is still missing



Lazy Values (cont’d)
ᗍ Lazy values are very useful for delaying costly initialization

ᗍ A lazy value does not raise an error when it is defined, whereas a non-lazy value raises any initialization error immediately
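
The deferred initialization can be sketched with a counter that records how often the initializer runs. The names initCount, expensive and fileText are assumptions; abc.txt is the file name used on the slide:

```scala
import scala.io.Source
import scala.util.Try

object LazyValDemo {
  var initCount = 0

  // The initializer runs only on first access, and only once.
  lazy val expensive: Int = {
    initCount += 1
    42
  }

  // Defining this does not touch the file; an error can surface only on first access.
  lazy val fileText: String = Source.fromFile("abc.txt").mkString

  def main(args: Array[String]): Unit = {
    println(expensive)   // first access runs the initializer
    println(expensive)   // second access reuses the stored value
    println(initCount)
    // If abc.txt is missing, the exception is raised here, not at the definition.
    println(Try(fileText).isSuccess)
  }
}
```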



Check your Understanding – 1

If val a = (1, 2, 4, 11, "Robert", 5, 9, 11, 2.5) then a.-5?

a) No value , its wrong syntax


b) 5
c) Nil
d) “Robert”



Check your Understanding – Solution

If val a = (1, 2, 4, 11, "Robert", 5, 9, 11, 2.5) then a.-5?

a) No value, it's wrong syntax
b) 5
c) Nil
d) "Robert"

Answer: a) No value, it's wrong syntax. Tuple elements are accessed with an underscore, as in a._5



Control Structures in Scala

Controlled Flow

ᗍ Control structures control the flow of execution

ᗍ Scala provides various tools to control the flow of a program's execution

ᗍ Some of them are:

• if..else
• while
• do-while
• foreach
• for

[Figure: flowchart. Start, then Check Turn, which branches on Task 1 turn, Task 2 turn or Task 3 turn to doTask1, doTask2 or doTask3]



Control Structures in Scala (cont’d): if-else

An if statement can be followed by an optional else statement, which executes when the boolean expression is false.

ᗍ The if-else syntax in Scala is the same as in Java or C++
ᗍ In Scala, if-else is an expression: its value is the value of the branch that is executed
ᗍ Semicolons are optional in Scala

Every expression in Scala has a type. An if-else whose branches are both Int has the type Int

An if-else whose branches have different types, such as String and Int, has the type Any: the type of a mixed expression is the common supertype of both branches
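
A sketch of if-else used as an expression. The variable names are assumptions; the type of mixed is annotated explicitly because the exact inferred supertype can differ between Scala versions:

```scala
object IfElseDemo {
  val x = 5

  // Both branches are Int, so the whole expression is an Int.
  val sign = if (x > 0) 1 else -1

  // Branches of type String and Int: Scala 2 infers Any, annotated here explicitly.
  val mixed: Any = if (x > 0) "positive" else -1

  def main(args: Array[String]): Unit = {
    println(sign)
    println(mixed)
  }
}
```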



Control Structures in Scala (cont’d) : While Loop
ᗍ A while loop repeatedly executes a target statement as long as a given condition is true
ᗍ In Scala, while and do-while loops work the same way as in Java

Syntax:

while (condition)
{
// Block of code ;
}

Note: The ++i and i++ operators don’t exist in Scala; use i += 1 or i = i + 1 instead
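
A minimal while loop, using i += 1 in place of i++. The function name sumUpTo is a hypothetical choice:

```scala
object WhileDemo {
  // Adds the integers 1..n using a while loop.
  def sumUpTo(n: Int): Int = {
    var i = 1
    var total = 0
    while (i <= n) {
      total += i
      i += 1        // i++ is not available in Scala
    }
    total
  }

  def main(args: Array[String]): Unit = {
    println(sumUpTo(5))
  }
}
```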



Control Structures in Scala (cont’d): do-while Loop

A do...while loop is similar to a while loop, except that a do...while loop is guaranteed to execute at least once. Note that do...while is Scala 2 syntax; Scala 3 no longer supports it

Syntax:

do
{
//Block of code
} while(condition);
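
The "at least once" behaviour can be sketched as below. Because Scala 3 dropped do-while, the runnable code uses an equivalent while form that should compile on both Scala 2 and Scala 3, with the Scala 2 do-while version shown in a comment. The function name countIterations is an assumption:

```scala
object DoWhileDemo {
  // Counts how many times the loop body runs; the body always runs at least once.
  def countIterations(start: Int, limit: Int): Int = {
    var i = start
    var iterations = 0
    // Scala 2 form from the slide:
    //   do { iterations += 1; i += 1 } while (i < limit)
    // Equivalent form: the body runs inside the condition block, then the test.
    while ({ iterations += 1; i += 1; i < limit }) ()
    iterations
  }

  def main(args: Array[String]): Unit = {
    println(countIterations(0, 3))
    println(countIterations(10, 5))  // condition is false from the start, body still runs once
  }
}
```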



Control Structures in Scala (cont’d): foreach Loop
Looping with foreach:
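
The slide's example is not available, so here is a sketch of foreach on a List and on a Range. The collection contents and names are assumptions:

```scala
object ForeachDemo {
  val names = List("Spark", "Scala", "Hadoop")

  // foreach applies a function to every element, for its side effect.
  def totalLength: Int = {
    var total = 0
    names.foreach(n => total += n.length)
    total
  }

  def main(args: Array[String]): Unit = {
    names.foreach(println)
    (1 to 3).foreach(i => println(i * i))
    println(totalLength)
  }
}
```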



Control Structures in Scala (cont’d): for loop
for Loop:
A for loop can execute a block of code a specific number of times.
Scala doesn’t have the for (initialize; test; update) syntax

for (x <- n) {            // here, n is a Range
//Block of statements;    // the <- operator is called a generator
}

Scala: For Loop : to vs. until

You can use either the method to or until when creating a Range object. The difference is that to includes the last value in the range, whereas until leaves it out. Here are two examples:

for (i <- 1 to 5): including 5
for (i <- 1 until 5): excluding 5

The first loop iterates 5 times, from 1 to 5, including 5

The second loop iterates 4 times, from 1 to 4, excluding the upper boundary value 5
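
The difference between to and until can be checked with a short sketch. The value names are illustrative assumptions:

```scala
object RangeDemo {
  val inclusive = (1 to 5).toList      // 1, 2, 3, 4, 5
  val exclusive = (1 until 5).toList   // 1, 2, 3, 4

  def main(args: Array[String]): Unit = {
    for (i <- 1 to 5) println("to: " + i)
    for (i <- 1 until 5) println("until: " + i)
  }
}
```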



Check your Understanding – 2

What is the output of the following program?

for (x <- 'a' until 'f')
print(x)

a) Error
b) abcde
c) abcdef
d) None of these



Check your Understanding – Solution

What is the output of the following program?

for (x <- 'a' until 'f')
print(x)

a) Error
b) abcde
c) abcdef
d) None of these

Answer: b) abcde. The until method excludes the upper bound 'f'



Control Structures in Scala: for Loop (cont’d)
When traversing an array, the following forms can be used:

Advanced for loop: a for loop can have multiple generators
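
A sketch of array traversal by index and by element, followed by a loop with two generators. The array contents and names are assumptions, since the slide's screenshots are not available:

```scala
object ForArrayDemo {
  val langs = Array("Scala", "Java", "Python")

  // Two generators behave like nested loops: 3 * 3 iterations here.
  def pairCount: Int = {
    var count = 0
    for (i <- 1 to 3; j <- 1 to 3) count += 1
    count
  }

  def main(args: Array[String]): Unit = {
    for (i <- 0 until langs.length) println(i + ": " + langs(i))  // by index
    for (lang <- langs) println(lang)                             // by element
    for (i <- 1 to 3; j <- 1 to 3) println("(" + i + ", " + j + ")")
  }
}
```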



Control Structures in Scala: for Loop(cont’d)
We can put conditions (guards) in a for loop with multiple generators

for (i <- 1 to 3; j <- 1 to 3 if i == j) println(5 * i + j + 1)

We can also introduce variables inside the loop header

for (i <- 1 to 3; x = 4 - i; j <- x to 3) println(5 * i + j + 1)



Control Structures in Scala: The for Loop with Yield
If the body of a for loop starts with yield, the loop returns a collection of values

val x = for(i<- 1 to 10) yield i*5

for (i<- x) println(i)
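
As a sketch, yield can also collect the values produced by for loops with a guard or an in-loop definition, which makes their results easy to inspect. The value names are assumptions:

```scala
object ForYieldDemo {
  // Collects i * 5 for i = 1..10.
  val multiplesOfFive = for (i <- 1 to 10) yield i * 5

  // Guard: only pairs with i == j are kept.
  val diagonal = for (i <- 1 to 3; j <- 1 to 3 if i == j) yield 5 * i + j + 1

  // Definition inside the loop header: x is recomputed for each i.
  val withDefinition = for (i <- 1 to 3; x = 4 - i; j <- x to 3) yield 5 * i + j + 1

  def main(args: Array[String]): Unit = {
    println(multiplesOfFive.mkString(", "))
    println(diagonal.mkString(", "))        // 7, 13, 19
    println(withDefinition.mkString(", "))  // 9, 13, 14, 17, 18, 19
  }
}
```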



Functions
A function is a group of statements that together perform a task
A Scala function is a complete object which can be assigned to a variable

The value of the last expression in the function body is the return value; the return keyword is optional

You can create functions with the “def” keyword

Syntax:

def functionName ([list of parameters]) : [return type] =
{
function body
return [expr]
}

Note: In Java, this concept is very close to a method
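
A minimal sketch of the def syntax. The function names are hypothetical; note that a recursive function needs an explicit return type:

```scala
object FunctionDemo {
  // Explicit return type; the last expression is the result.
  def add(a: Int, b: Int): Int = {
    a + b
  }

  // The return type can be inferred for non-recursive functions.
  def square(n: Int) = n * n

  // Recursive functions must declare their return type.
  def factorial(n: Int): Int = if (n <= 1) 1 else n * factorial(n - 1)

  // A function can be stored in a value and called through it.
  val addFn: (Int, Int) => Int = add

  def main(args: Array[String]): Unit = {
    println(add(2, 3))
    println(square(4))
    println(factorial(5))
    println(addFn(10, 20))
  }
}
```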



Functions (cont’d)
Named and Default Arguments

ᗍ We can provide defaults for function arguments, which are used when no value is provided in the function call
ᗍ We can specify argument names in function calls
ᗍ In named invocations, the arguments need not be in declaration order
ᗍ We can mix unnamed and named arguments, provided the unnamed arguments come first

Variable Arguments

ᗍ Scala supports a variable number of arguments to a function
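
The rules above can be sketched with two hypothetical functions, greet and sum, which are not from the slides:

```scala
object ArgumentsDemo {
  // greeting and punctuation have defaults.
  def greet(name: String, greeting: String = "Hello", punctuation: String = "!"): String =
    greeting + ", " + name + punctuation

  // Int* accepts zero or more Int arguments.
  def sum(numbers: Int*): Int = numbers.sum

  def main(args: Array[String]): Unit = {
    println(greet("Asha"))                            // both defaults used
    println(greet("Asha", "Hi"))                      // positional override
    println(greet(punctuation = "?", name = "Asha"))  // named, in any order
    println(greet("Asha", punctuation = "."))         // unnamed first, then named
    println(sum(1, 2, 3))
    println(sum())
  }
}
```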



Check your Understanding – 3

What is the output of the following?

def concatStr(a:String, b:Int=2 , c:String) = {a + b + c}

println(concatStr( "Hi",200, "Welcome"))

a) Hi2Welcome
b) Hi200Welcome
c) Error
d) Hi2200Welcome



Check your Understanding – Solution

What is the output of the following?

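def concatStr(a:String, b:Int=2 , c:String) = {a + b + c}

println(concatStr( "Hi",200, "Welcome"))

a) Hi2Welcome
b) Hi200Welcome
c) Error
d) Hi2200Welcome

Answer: b) Hi200Welcome. All three arguments are passed positionally, so the default value of b is not used. An error would occur only if b were omitted from a positional call, such as concatStr("Hi", "Welcome")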



Procedures
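Scala has a special notation for a function that returns no value
If the function body is enclosed in braces without a preceding = symbol, then the return type is Unit

Such functions are called procedures. A procedure is called for its side effects and returns only the Unit value ()

Note: this procedure syntax is deprecated since Scala 2.13 and is not supported in Scala 3, where the return type is written explicitly as : Unit =

Example:

The same rules for default and named arguments apply to procedures as well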
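
A sketch of a procedure. The runnable code uses the explicit ": Unit =" form so that it also compiles on newer Scala versions; the slide's brace-only notation is shown in a comment. The name printBanner is hypothetical:

```scala
object ProcedureDemo {
  // Procedure notation described on the slide (Scala 2 only):
  //   def printBanner(text: String) { println("*** " + text + " ***") }

  // Equivalent definition with an explicit Unit return type and a default argument.
  def printBanner(text: String, border: String = "***"): Unit = {
    println(border + " " + text + " " + border)
  }

  def main(args: Array[String]): Unit = {
    printBanner("Scala")                        // default border
    printBanner(border = "==", text = "Spark")  // named arguments
  }
}
```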



Scala: Collections
ᗍ Scala has a rich collections library
ᗍ Collections are containers that hold objects
ᗍ Those containers can be sequenced, linear sets of items such as Array, List, Tuple, Option, Map, etc.

Collections hierarchy

Traversable
    Iterable
        Seq
            IndexedSeq
            LinearSeq
        Set
            SortedSet
            BitSet
        Map
            SortedMap



Collections

Collections can be mutable or immutable

Scala collections systematically distinguish between mutable and immutable collections

Mutable collections:

ᗍ A mutable collection can be updated or extended in place

ᗍ This means you can change, add, or remove elements of a collection as a side effect

Immutable collections:

ᗍ By contrast, never change

ᗍ You still have operations that simulate additions, removals, or updates, but those operations will in each case return a new collection and leave the old collection unchanged
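
The contrast can be sketched with an immutable List and a mutable ListBuffer. The choice of ListBuffer and the value names are assumptions for illustration:

```scala
import scala.collection.mutable

object MutabilityDemo {
  // Immutable: :+ returns a new list and leaves the original untouched.
  val original = List(1, 2, 3)
  val extended = original :+ 4

  // Mutable: += changes the buffer in place.
  val buffer = mutable.ListBuffer(1, 2, 3)
  buffer += 4

  def main(args: Array[String]): Unit = {
    println(original)
    println(extended)
    println(buffer)
  }
}
```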



Scala Collections: Array

Arrays are mutable, indexed collections of values


Array[T] is Scala's representation for Java's T[]
Declaring Arrays:

Integer Array

String Array

Accessing arrays :



Scala Collections: Array (cont’d)
Fixed Length Arrays:
Examples:

Accessing Arrays:
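
The array examples on these slides are not available, so the following is a sketch of declaring, filling and accessing arrays. The contents and names are assumptions:

```scala
object ArrayDemo {
  // Integer array created from initial values.
  val numbers = Array(10, 20, 30)

  // Fixed-length String array; elements start as null until assigned.
  val names = new Array[String](3)
  names(0) = "Spark"
  names(1) = "Scala"
  names(2) = "Hadoop"

  // Fixed-length Int array; elements start as 0.
  val zeros = new Array[Int](5)

  def main(args: Array[String]): Unit = {
    println(numbers(1))     // access by index, starting at 0
    numbers(1) = 25         // arrays are mutable
    println(numbers.mkString(", "))
    println(names.mkString(", "))
    println(zeros.length)
  }
}
```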



Scala Collections: ArrayBuffers
An ArrayBuffer holds an array and a size. Most operations on an array buffer have the same speed as for an array, because the operations simply access and modify the underlying array

Array buffers can have data efficiently added to the end



Scala Collections: ArrayBuffers (cont’d)
cars.trimEnd(1): removes the last element

// Adds an element at the 2nd index

// Adds a list



Scala Collections: ArrayBuffers (cont’d)

//Removes an element

//Removes three elements from index 1
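
The operations named on these slides can be sketched on a buffer called cars, the name used on the slide. The element values are assumptions, and trimEnd may produce a deprecation warning on recent Scala versions:

```scala
import scala.collection.mutable.ArrayBuffer

object ArrayBufferDemo {
  // Builds a buffer step by step; comments show the contents after each call.
  def build(): ArrayBuffer[String] = {
    val cars = ArrayBuffer[String]()
    cars += "BMW"                             // BMW
    cars += "Audi"                            // BMW, Audi
    cars ++= List("Toyota", "Honda", "Ford")  // adds a list: BMW, Audi, Toyota, Honda, Ford
    cars.trimEnd(1)                           // removes the last element: BMW, Audi, Toyota, Honda
    cars.insert(2, "Tesla")                   // adds at index 2: BMW, Audi, Tesla, Toyota, Honda
    cars.remove(0)                            // removes one element: Audi, Tesla, Toyota, Honda
    cars.remove(1, 3)                         // removes three elements from index 1: Audi
    cars
  }

  def main(args: Array[String]): Unit = {
    println(build())
  }
}
```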



Scala Collections: Maps
ᗍ A Map is a collection of key/value pairs
ᗍ Any value can be retrieved based on its key
ᗍ Keys are unique in the Map, but values need not be unique

Accessing immutable Maps:

Accessing with keys

Can’t access with values



Scala Collections: Maps (cont’d)
If there is a sensible default value for keys that might be missing from the map, you can use the getOrElse method

You provide the key as the first argument and the default value as the second

It is quite common to use getOrElse with a default of 0
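
A sketch of key-based access and getOrElse on an immutable Map. The keys and values are assumptions for illustration:

```scala
object MapDemo {
  val ages = Map("Robert" -> 30, "Priya" -> 25)

  val robertAge = ages("Robert")               // access by key
  val missing = ages.get("Nobody")             // None: safe lookup for an absent key
  val fallback = ages.getOrElse("Nobody", 0)   // default used for an absent key

  def main(args: Array[String]): Unit = {
    println(robertAge)
    println(missing)
    println(fallback)
    // ages("Nobody") would throw NoSuchElementException
    // There is no direct lookup by value; a map is indexed by its keys only.
  }
}
```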



Scala Collections: Mutable Maps
To create a mutable Map, import it first:

Create a map with initial elements

add elements with +=

remove elements with -=



Scala Collections: Mutable Maps (cont’d)
Update elements by reassigning them
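
The mutable Map steps (import, create, add, remove, update) can be sketched as follows. The map name stock and its entries are assumptions:

```scala
import scala.collection.mutable

object MutableMapDemo {
  // Builds a mutable map and applies each operation in turn.
  def build(): mutable.Map[String, Int] = {
    val stock = mutable.Map("apples" -> 5, "pears" -> 2)  // initial elements
    stock += ("plums" -> 7)                               // add an element
    stock -= "pears"                                      // remove an element by key
    stock("apples") = 10                                  // update by reassigning
    stock
  }

  def main(args: Array[String]): Unit = {
    println(build())
  }
}
```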



Scala Collections: Tuples
A tuple is an ordered container of two or more values of the same or different types
Unlike lists and arrays, however, a tuple is not meant to be iterated over element by element
Its purpose is only to serve as a container for more than one value
You create a tuple by enclosing its elements in parentheses
For example, a tuple can contain an Int, a String and a Double

Accessing the tuple elements:

In tuples, the element positions start at 1 and NOT at 0



Scala Collections: Tuples (cont’d)
Tuples are typically used for functions that return more than one value:
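
A sketch of creating a tuple, accessing its elements, and returning two values from a function. The tuple contents and the function name minMax are assumptions:

```scala
object TupleDemo {
  // A tuple holding an Int, a String and a Double.
  val record = (1, "Robert", 2.5)
  val id = record._1       // positions start at 1
  val name = record._2

  // Returns two values at once as a tuple.
  def minMax(values: Seq[Int]): (Int, Int) = (values.min, values.max)

  // The result can be unpacked into separate values.
  val (lowest, highest) = minMax(Seq(4, 9, 1, 7))

  def main(args: Array[String]): Unit = {
    println(record)
    println(lowest + " to " + highest)
  }
}
```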



Scala Collections: Lists
Lists are quite similar to arrays, but there are two important differences.
First, lists are immutable, i.e., elements of a list cannot be changed.
Second, lists have a recursive structure, whereas arrays are flat.
List is a class for immutable linked lists representing ordered collections of elements of a given type

This class comes with two implementing cases, scala.Nil and scala.::, that implement the abstract members isEmpty, head and tail
Example:



Scala Collections: Lists (cont’d)
The :: operator creates a new List from a given head and tail

We can use an iterator to iterate over a list, but recursion is a preferred practice in Scala

Example:
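
A sketch of building lists with :: and Nil and processing a list recursively through isEmpty, head and tail. The list contents and the function name sum are assumptions:

```scala
object ListDemo {
  val digits = List(1, 2, 3)
  val viaCons = 1 :: 2 :: 3 :: Nil   // same list, built from head and tail
  val extended = 0 :: digits         // new list; digits itself is unchanged

  // Recursive sum: an empty list sums to 0, otherwise head plus the sum of the tail.
  def sum(xs: List[Int]): Int =
    if (xs.isEmpty) 0 else xs.head + sum(xs.tail)

  def main(args: Array[String]): Unit = {
    println(viaCons)
    println(extended)
    println(sum(digits))
  }
}
```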



Reserved Words
ᗍ A reserved keyword (also known as a reserved identifier) is a word that cannot be used as an identifier, such as the name of a variable, function, or label; it is reserved from use

ᗍ A few are listed here:

• abstract case catch class
• def do else extends
• false final finally for
• forSome if implicit import
• lazy match new null
• object override package private
• protected return sealed super

For example, finally is a reserved keyword



Pattern Matching
Scala has a built-in general pattern matching mechanism
It allows matching on any sort of data with a first-match policy
Here is a small example which shows how to match against an integer value:

Here is a second example which matches a value against patterns of different types:
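
The two kinds of match described above can be sketched as follows. The function names and result strings are assumptions, since the slide's examples are not available:

```scala
object PatternMatchDemo {
  // Matching against integer values; _ is the catch-all case.
  def describeNumber(x: Int): String = x match {
    case 1 => "one"
    case 2 => "two"
    case _ => "many"
  }

  // Matching against patterns of different types; the first matching case wins.
  def describeAny(x: Any): String = x match {
    case 1         => "the number one"
    case "two"     => "the string two"
    case n: Int    => "some other Int: " + n
    case s: String => "some other String: " + s
    case _         => "something else"
  }

  def main(args: Array[String]): Unit = {
    println(describeNumber(2))
    println(describeAny(1))
    println(describeAny(42))
    println(describeAny(2.5))
  }
}
```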



Enumeration
An enumeration allows programmers to define their own data type with a fixed set of named values

Often we have a variable that can take one of several values. For instance, a WeekDays field of an object could be either Mon, Tue, Wed, or Thu

In other languages such as C, Java, or Python, it is common to use a small integer to distinguish the possibilities

In Scala, we let the compiler create one object for each possibility, and we use a reference to that object

Here is the somewhat strange syntax to do this:



Enumeration (Cont’d)
Another way:

Gives an error if the value is not found
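
A sketch using scala.Enumeration with the WeekDays values named on the earlier slide. It is an assumption that the "another way" on this slide refers to a lookup by name; withName is used here as one such lookup that fails for unknown names:

```scala
import scala.util.Try

object WeekDays extends Enumeration {
  // Each name becomes a distinct value object with an increasing id.
  val Mon, Tue, Wed, Thu = Value
}

object EnumerationDemo {
  val first = WeekDays.Mon
  val byName = WeekDays.withName("Tue")          // lookup by name
  val unknown = Try(WeekDays.withName("Sun"))    // fails: no such value

  def main(args: Array[String]): Unit = {
    WeekDays.values.foreach(println)
    println(WeekDays.Wed.id)
    println(unknown.isFailure)
  }
}
```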



Ternary Operators
In many other programming languages there is a dedicated ternary operator syntax (condition ? a : b), but in Scala the ternary operator is just the normal if/else expression

Example:

As another example, you can use the if/else expression on the right-hand side of an assignment, as you might be used to doing with the ternary operator in Java:
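
A sketch of if/else standing in for a ternary operator, both as a function result and on the right-hand side of an assignment. The function names and values are assumptions:

```scala
object TernaryDemo {
  // Java: return n < 0 ? -n : n;
  def absolute(n: Int): Int = if (n < 0) -n else n

  // if/else on the right-hand side of an assignment.
  def label(age: Int): String = {
    val category = if (age >= 18) "adult" else "minor"
    category
  }

  def main(args: Array[String]): Unit = {
    println(absolute(-7))
    println(label(21))
    println(label(12))
  }
}
```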



Check your Understanding – 5

What is the output of the following?

val new = List(1,2,3,4) then new._2

a) 3
b) Error
c) 2
d) None of these



Check your Understanding – Solution

What is the output of the following?

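val new = List(1,2,3,4) then new._2

a) 3
b) Error
c) 2
d) None of these

Answer: b) Error. new is a reserved word and cannot be used as an identifier; in addition, _2 is a tuple accessor and is not defined on List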



Questions

