Lecture Notes 1.3.4 and 1.3.5
Course Outcomes
CO1: Understand the components of the Hadoop Ecosystem and Data Science methodology
Variables are reserved memory locations that store values: when you create a variable, you reserve
some space in memory. Scala has two kinds of variables, vals and vars. A val is similar to a final
variable in Java: once initialized, it can never be reassigned. A var, by contrast, is similar to a
non-final variable in Java and can be reassigned throughout its lifetime.
Based on the data type of a variable, the compiler allocates memory and decides what can be stored
in the reserved memory. Therefore, by assigning different data types to variables, you can store
integers, decimals, or characters in these variables.
Variable Declaration
The type of a variable is specified after the variable name and before the equals sign. The general
form is var name: Type = initialValue for a var and val name: Type = initialValue for a val.
Syntax
A variable declared with the keyword var, such as myVariable, can change its value after it has been
initialized. This is called a mutable variable.
A variable declared with the keyword val, such as myVal, cannot be reassigned after it has been
initialized. This is called an immutable variable.
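The two declaration forms can be sketched as follows. This is a minimal illustration, not code from the lecture: the object name VariableDeclarationDemo, the helper declare() and the values "Foo" and "Bar" are assumptions chosen for the example.

```scala
object VariableDeclarationDemo {
  // Declares one var and one val and returns their final values.
  def declare(): (String, String) = {
    // Mutable variable: declared with var, so it may be reassigned.
    var myVariable: String = "Foo"
    myVariable = "Bar"

    // Immutable variable: declared with val, so it cannot be reassigned.
    val myVal: String = "Foo"
    // myVal = "Bar"   // would not compile: reassignment to val

    (myVariable, myVal)
  }

  def main(args: Array[String]): Unit = {
    val (mutableResult, immutableResult) = declare()
    println(mutableResult)   // prints "Bar"
    println(immutableResult) // prints "Foo"
  }
}
```

Uncommenting the reassignment of myVal makes the compiler reject the program, which is the practical difference between the two keywords.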
Identifier
All Scala components require names. Names used for objects, classes, variables and methods are
called identifiers. A keyword cannot be used as an identifier, and identifiers are case-sensitive.
Scala supports four types of identifiers: alphanumeric, operator, mixed and literal. The first two
are described here.
An alphanumeric identifier starts with a letter or an underscore, which can be followed by further
letters, digits, or underscores. The '$' character also counts as a letter, but it is reserved for
identifiers generated by the Scala compiler and should not be used in identifiers you write yourself.
An operator identifier consists of one or more operator characters. Operator characters are printable
ASCII characters such as +, :, ?, ~ or #.
Some legal operator identifiers include +, ++ and :::.
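The sketch below shows alphanumeric and operator identifiers in use. The object IdentifierDemo, the Money class and all value names are hypothetical and exist only for illustration. The ::: operator is the standard list-concatenation method on List.

```scala
object IdentifierDemo {
  // Alphanumeric identifiers: a letter or underscore first,
  // then letters, digits, or underscores.
  val age = 30
  val first_name = "Ada"
  val _count1 = 3

  // Identifiers are case-sensitive: total and Total are different names.
  val total = 1
  val Total = 2

  // Operator identifier: a method named + on a hypothetical Money class.
  final case class Money(cents: Int) {
    def +(other: Money): Money = Money(cents + other.cents)
  }

  // ::: is an operator identifier defined on List (concatenation).
  val joined: List[Int] = List(1, 2) ::: List(3)

  def main(args: Array[String]): Unit = {
    println(Money(150) + Money(50)) // prints "Money(200)"
    println(joined)                 // prints "List(1, 2, 3)"
  }
}
```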
Inference
When you assign an initial value to a variable, the Scala compiler can figure out the type of the
variable based on the value assigned to it. This is called variable type inference. Therefore, you
can omit the type and write declarations such as var myVar = 10 and val myVal = "Hello, Scala!".
Here, myVar is inferred to be of type Int and myVal of type String.
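A short sketch of type inference follows. The values 10 and "Hello, Scala!" and the names checkedInt and checkedString are assumptions for this example. The two annotated vals only compile if the inferred types really are Int and String.

```scala
object TypeInferenceDemo {
  var myVar = 10              // inferred as Int
  val myVal = "Hello, Scala!" // inferred as String

  // Explicit annotations act as a compile-time check of the inferred types.
  val checkedInt: Int = myVar
  val checkedString: String = myVal

  // myVar = "text"   // would not compile: myVar is an Int, not a String

  def main(args: Array[String]): Unit = {
    println(checkedInt)    // prints "10"
    println(checkedString) // prints "Hello, Scala!"
  }
}
```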
Scope
Variables in Scala can have three different scopes depending on the place where they are being used.
They can exist as fields, as method parameters and as local variables. Below are the details about
each type of scope.
Fields
Fields are variables that belong to an object. They are accessible from inside every method in the
object, and they can also be accessible outside the object, depending on the access modifiers they
are declared with. Fields can be mutable or immutable and are defined using either var or val.
Method Parameters
Method parameters are variables used to pass values into a method when the method is called. They
are only accessible from inside the method, but the objects passed in may be accessible from outside
if you hold a reference to them outside the method. Method parameters are always immutable: they
behave like vals and cannot be reassigned inside the method.
Local Variables
Local variables are variables declared inside a method. They are only accessible from inside the
method, but the objects you create may escape the method if you return them from it. Local variables
can be mutable or immutable and are defined using either var or val.
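The three scopes can be seen side by side in one small class. The Counter class, its members and the ScopeDemo object are hypothetical names introduced only for this sketch.

```scala
// Hypothetical class used only to illustrate the three scopes.
class Counter {
  // Fields: belong to the object and are visible to every method in it.
  private var count: Int = 0 // mutable, hidden from outside by private
  val step: Int = 1          // immutable, readable from outside

  // times is a method parameter: visible only inside add.
  def add(times: Int): Int = {
    // times = times + 1   // would not compile: parameters behave like vals

    // increment is a local variable: it exists only within this call.
    val increment = times * step
    count += increment
    count
  }
}

object ScopeDemo {
  def main(args: Array[String]): Unit = {
    val counter = new Counter
    println(counter.add(3)) // prints "3"
    println(counter.add(2)) // prints "5"
    println(counter.step)   // prints "1"
  }
}
```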
Comments
Scala supports single-line and multi-line comments, very similar to Java. Multi-line comments may be
nested, but they must be properly nested. All characters inside a comment are ignored by the Scala
compiler.
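The comment forms look like this in practice. The object CommentDemo and the member answer are placeholder names for the sketch.

```scala
object CommentDemo {
  // A single-line comment runs to the end of the line.

  /* A multi-line comment
     can span several lines. */

  /* Multi-line comments may be nested:
     /* this inner comment is closed first */
     and the outer comment ends here. */

  // None of the comment text above affects this value.
  def answer: Int = 42

  def main(args: Array[String]): Unit = {
    println(answer) // prints "42"
  }
}
```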
Semicolon Inference
In a Scala program, a semicolon at the end of a statement is usually optional. You can type one if you
want, but you don't have to if the statement appears by itself on a single line. A semicolon is
required, however, if you write multiple statements on a single line.
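As an illustration of both cases, the sketch below puts two statements on one line, separated by a semicolon, and leaves the semicolon off a statement that stands alone. The names SemicolonDemo, greeting, s and t are assumptions for the example.

```scala
object SemicolonDemo {
  def greeting(): String = {
    // Two statements on one line: the semicolon between them is required.
    val s = "hello"; val t = s + ", world"
    // One statement on its own line: no semicolon needed.
    t
  }

  def main(args: Array[String]): Unit = {
    println(greeting()) // prints "hello, world"
  }
}
```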
If you want to enter a statement that spans multiple lines, most of the time you can simply enter it
and Scala will separate the statements in the correct place. For example, the following is treated as
one four-line statement:
if (x < 2)
  println("too small")
else
  println("ok")
Occasionally, however, Scala will split a statement into two parts against your wishes:
x
+y
This parses as two statements x and +y. If you intend it to parse as one statement x + y, you can
always wrap it in parentheses:
(x
+ y)
Alternatively, you can put the + at the end of a line. For just this reason, whenever you are chaining
an infix operation such as +, it is a common Scala style to put the operators at the end of the line
instead of the beginning:
x +
y +
z
Rules
The precise rules for statement separation are surprisingly simple for how well they work. In short,
a line ending is treated as a semicolon unless one of the following conditions is true:
1. The line in question ends in a word that would not be legal as the end of a statement, such
as a period or an infix operator.
2. The next line begins with a word that cannot start a statement.
3. The line ends while inside parentheses (...) or brackets [...], because these cannot contain
multiple statements anyway.
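Each of the three rules can be exercised with a multi-line expression. This is a sketch under assumed names (StatementSeparationDemo and its members); the values are arbitrary.

```scala
object StatementSeparationDemo {
  val x = 1
  val y = 2
  val z = 3

  // Rule 1: each line ends in an infix operator, so the statement continues.
  val trailingOperators: Int =
    x +
    y +
    z

  // Rule 2: the next line begins with a period, which cannot start a statement.
  val chained: List[Int] = List(x, y, z)
    .map(_ * 2)
    .filter(_ > 2)

  // Rule 3: the line ends inside parentheses, so no semicolon is inferred.
  val parenthesized: Int = (x
    + y)

  def main(args: Array[String]): Unit = {
    println(trailingOperators) // prints "6"
    println(chained)           // prints "List(4, 6)"
    println(parenthesized)     // prints "3"
  }
}
```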
Here are some recommended basic syntax and coding conventions in Scala programming:
• Case Sensitivity − Scala is case-sensitive, which means the identifiers x and X have different
  meanings in Scala.
• Class Names − For all class names, the first letter should be in upper case. If multiple words
  are used to form the name of the class, each inner word's first letter should be in upper case.
• Method Names − All method names should start with a lower case letter. If multiple words
  are used to form the name of the method, each inner word's first letter should be in upper
  case.
    o Example − def myMethod()
• Program File Name − By convention, the name of the program file should match the object name.
  Unlike Java, the Scala compiler does not enforce this, but following the convention makes
  source files easy to locate.
    o Example − Assume 'HelloWorld' is the object name.
      Then the file should be saved as 'HelloWorld.scala'.
• def main(args: Array[String]) − Scala program processing starts from the main() method. A
  runnable Scala application needs an entry point, and a main() method with this signature is
  the standard one.
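Put together, the conventions above lead to a program like the following minimal sketch, which is assumed to be saved as HelloWorld.scala. The greeting member is a hypothetical helper added for illustration.

```scala
// Saved as HelloWorld.scala, following the file-name convention.
object HelloWorld {
  // Method name starts with a lower case letter.
  def greeting: String = "Hello, World!"

  // Entry point: execution starts here.
  def main(args: Array[String]): Unit = {
    println(greeting) // prints "Hello, World!"
  }
}
```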