Scala, yet another programming language?

Shishir Chandra
Published in Analytics Vidhya
8 min read · Jul 18, 2021

Scala is not new, but it has gained a lot of interest among developers in recent years. Its wide adoption in the big data ecosystem has raised its profile and earned it a place among the most popular languages programmers pick. It is important to understand that the features Scala exposes are not new, and most are available in other languages; what sets Scala apart is the way it lets you express them in code.

Evolution is inevitable, and the world of programming languages is no exception. The bare minimum we expect from a programming language is no longer what the classic texts on third-generation languages described. Ideas such as interoperability and reusability have moved on from our earlier perception of them, and for good reason. Verbosity is no longer accepted as the price of readable code, and our friend for years, Java, has long been notorious for being too verbose.

What is functional in it?

We have been hearing a lot lately that modern programming languages are moving towards the functional paradigm. What does that actually mean? It is a way of structuring code around pure functions, which saves you from the shared state and mutable data that we have been dealing with in the object-oriented world. The focus is on what the outcome is rather than on the steps taken to get there. Functions are first-class values, and the idea is to write the program like a mathematical expression, using conditional expressions to compute a result instead of mutating state.
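As a rough illustration of the difference between shared mutable state and pure functions, here is a minimal sketch. The object and method names (PureFunctionDemo, addToTotal, add) are made up for this example and do not come from any library.

```scala
object PureFunctionDemo {
  // Impure: the result depends on, and changes, state outside the function.
  var total: Int = 0
  def addToTotal(x: Int): Int = { total += x; total }

  // Pure: the result depends only on the arguments and nothing is mutated.
  def add(a: Int, b: Int): Int = a + b

  val prices: List[Int] = List(10, 20, 30)

  // map returns a new list; the original `prices` is left untouched.
  val discounted: List[Int] = prices.map(p => p * 9 / 10)

  // foldLeft combines the elements with the pure `add` function.
  val sum: Int = discounted.foldLeft(0)(add)
}
```

Calling add(1, 2) always returns 3, whereas calling addToTotal(1) twice returns 1 and then 2, which is exactly the kind of hidden dependency the functional style tries to avoid.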

Decompiling: it's on the JVM

Scala is implemented on top of the hugely popular Java Virtual Machine. Yes, you guessed it right: Scala compiles to Java bytecode. The compiler translates Scala source code into efficient JVM bytecode, and this is one of the major contributors to its success. A typical compiler can be broken down into three layers: the front end builds an abstract syntax tree (AST) and enriches it, the middle end performs platform-independent optimisations such as tail-call elimination, and the back end performs CPU-specific optimisations and generates assembly files. The Scala compiler is organised a little differently; for instance, CPU-specific optimisations are delegated to the JVM.

An abstract syntax tree (AST) is a representation of how the compiler understands the code. It is built by the lexer (scanner) and parser stages, which are common components of most compilers. The AST preserves all operations and values in your code.

Phases in compilation and building blocks

The Scala compiler runs about 25 compilation phases (the exact number depends on the compiler version). Let's talk about the most significant ones.

Parse: The first phase, where an untyped AST is created using the scanner and parser. This is where syntax errors are reported.

Type checker: Infers types, checks that types match, resolves references, and reports type violations.

Erasure: Type erasure was introduced together with generics in Java 5 to keep backwards compatibility with older bytecode. When generics or value classes are used in the code, their type information is erased during this phase.
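The effect of erasure can be observed at runtime. The sketch below is only an illustration (ErasureDemo and looksLikeStringList are hypothetical names): two lists with different element types end up with the same runtime class, and a type pattern on the element type cannot tell them apart.

```scala
object ErasureDemo {
  val ints: List[Int] = List(1, 2, 3)
  val strings: List[String] = List("a", "b")

  // Both lists share one runtime class because the element type is erased.
  val sameRuntimeClass: Boolean = ints.getClass == strings.getClass

  // The element type cannot be checked at runtime, so this pattern matches
  // any List. @unchecked silences the compiler warning about that.
  def looksLikeStringList(value: Any): Boolean = value match {
    case _: List[String @unchecked] => true
    case _                          => false
  }
}
```

Here looksLikeStringList(ErasureDemo.ints) returns true even though the list holds integers, which is why the compiler normally warns about such patterns.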

Implicit Conversion: Implicit conversions are methods that the compiler applies when a value of the wrong type is used. They allow the compiler to convert automatically from one type to another.

Implicit conversions are applied in two situations:

  • If an expression e has type A, and A does not conform to the expected type B.
  • In a selection e.m with e of type A, if the selector m does not denote a member of A.

For example, the following operation on two lists xa and ya of type List[Int] is legal:

xa <= ya

provided the implicit methods listorder and intorder defined below are in scope:

implicit def listorder[A](x: List[A])
    (implicit elemorder: A => Ordered[A]): Ordered[List[A]] =
  new Ordered[List[A]] { /* .. */ }

implicit def intorder(x: Int): Ordered[Int] =
  new Ordered[Int] { /* .. */ }
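Both situations can be reproduced with a small self-contained sketch. The Meters type, the intToMeters conversion and the addTo method are invented for this illustration; only the implicit def mechanism itself is part of Scala.

```scala
import scala.language.implicitConversions

object ImplicitConversionDemo {
  // Hypothetical wrapper type used only for this illustration.
  final case class Meters(value: Double) {
    def addTo(other: Meters): Meters = Meters(value + other.value)
  }

  // The conversion the compiler may insert automatically.
  implicit def intToMeters(n: Int): Meters = Meters(n.toDouble)

  // Situation 1: an Int is given where Meters is expected.
  val five: Meters = 5

  // Situation 2: Int has no member `addTo`, so the compiler converts
  // `three` to Meters before selecting the method.
  val three: Int = 3
  val total: Meters = three.addTo(Meters(2.0))
}
```

In both cases the compiler rewrites the expression to call intToMeters explicitly, so `five` is Meters(5.0) and `total` is Meters(5.0).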

Pattern Matching: Pattern matching checks a value against a pattern and, on a successful match, can deconstruct the value into its parts. It is one of the most widely used features in Scala.

The “match” keyword is used in place of a switch statement, and it can be applied to any value. A match expression contains a sequence of alternatives, each starting with the case keyword. Each case includes a pattern and one or more expressions that are evaluated if the pattern matches. The arrow symbol (=>) separates the pattern from the expressions.

object Pattern {

  def main(args: Array[String]): Unit = {
    println(print(1))
  }

  def print(x: Int): String = x match {
    // if the value of x is 0, this case is executed
    case 0 => "Hello!!"
    // if the value of x is 1, this case is executed
    case 1 => "Hello again"
    // if x does not match any earlier case, this one is executed
    case _ => "Bye!!"
  }
}

Output — Hello again
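Patterns are not limited to literal values. As a further sketch (the Shape hierarchy and describe method are hypothetical names chosen for this example), a match can also deconstruct case classes and use guards:

```scala
object ShapeMatchDemo {
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rectangle(width: Double, height: Double) extends Shape

  def describe(shape: Shape): String = shape match {
    // A guard (`if ...`) narrows a pattern further.
    case Circle(r) if r <= 0       => "degenerate circle"
    // The pattern binds the constructor argument to `r`.
    case Circle(r)                 => s"circle with radius $r"
    case Rectangle(w, h) if w == h => s"square with side $w"
    case Rectangle(w, h)           => s"rectangle $w x $h"
  }
}
```

Because Shape is sealed, the compiler can check that the alternatives cover every subtype.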

Optimise Phase and Tail Calls: The tail calls phase optimises tail recursion: in the bytecode the recursive call is replaced with a jump, so the method behaves like an ordinary loop. In Scala, only directly recursive calls to the current function are optimised. Tail recursion is often preferred over head recursion to avoid stack overflow errors in deep recursions. For example, the JVM keeps every pending call on its internal stack, and that stack has limited memory. A non-tail-recursive factorial(10000) can therefore fail with a StackOverflowError, because the recursion is too deep for the JVM stack.

The Scala compiler does a good job of optimising such code, and when a method is annotated with @tailrec it reports an error if the implementation is not tail-recursive.
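A minimal sketch of a tail-recursive factorial is shown below. The helper name loop and the use of BigInt (to avoid numeric overflow for large inputs) are choices made for this example.

```scala
import scala.annotation.tailrec

object TailRecDemo {
  def factorial(n: Int): BigInt = {
    // The recursive call is the last action of `loop`, so the compiler can
    // turn it into a jump. @tailrec makes compilation fail if it cannot.
    @tailrec
    def loop(i: Int, acc: BigInt): BigInt =
      if (i <= 1) acc else loop(i - 1, acc * i)

    loop(n, BigInt(1))
  }
}
```

With the accumulator carried as an argument, factorial(10000) runs in constant stack space instead of building up one stack frame per call.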

Why does it take so long to compile Scala code? The longest phase is the typer phase, i.e. the type checker. Scala has a rich, complicated type system that takes a lot of time to process. Furthermore, libraries such as Shapeless rely heavily on implicits and macros, which are time-consuming to compile.

Lambda Expressions

A lambda expression is an expression that uses an anonymous function in place of a variable or value. Lambdas are also called anonymous functions or function literals in Scala. They are convenient when a simple function is needed in just one place. They are quicker to write and more expressive than defining a whole named function, which improves readability. Lambda expressions can be reused for any kind of transformation, for example by passing one to a collection method to transform every element.

val name = (name_variable: Type) => Todo_Expression

val output = (x: Int) => x + 100

object GetCube {
  val cube = (x: Int) => x * x * x

  def main(args: Array[String]): Unit = {
    val x = 2
    println("The cube of " + x + " is " + cube(x))
  }
}

Output — The cube of 2 is 8

Scala also supports partial functions. A function that cannot produce a result for every possible input is called a partial function: it is defined only for a subset of its possible inputs, and it should be applied only to those inputs.

PartialFunction is a trait that requires two methods, isDefinedAt and apply, to be implemented. It can also be written concisely using case clauses.
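Here is a small sketch of a partial function written with a case clause. The name reciprocal is made up for the example; isDefinedAt, collect and lift are standard library methods.

```scala
object PartialFunctionDemo {
  // Defined only for non-zero integers.
  val reciprocal: PartialFunction[Int, Double] = {
    case n if n != 0 => 1.0 / n
  }

  val definedAtFour: Boolean = reciprocal.isDefinedAt(4)
  val definedAtZero: Boolean = reciprocal.isDefinedAt(0)

  // collect applies the function only where it is defined.
  val results: List[Double] = List(0, 1, 2, 4).collect(reciprocal)

  // lift turns the partial function into a total one returning Option.
  val safe: Option[Double] = reciprocal.lift(0)
}
```

Calling reciprocal(0) directly would throw a MatchError, which is why isDefinedAt, collect or lift are usually used to guard the call.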

Concurrency in Scala

Scala concurrency is built on top of the Java concurrency model. On the JVM, with an I/O-intensive workload, we can run thousands of threads on a single machine.

A Thread takes a Runnable. You have to call start on a Thread in order for it to run the Runnable.

scala> val x = new Thread(new Runnable {
  def run(): Unit = {
    println("Scala")
  }
})
x: java.lang.Thread = Thread[Thread-3,5,main]

scala> x.start
Scala

Futures: A Future represents an asynchronous computation. You can wrap your computation in a Future and, when you need the result, call a blocking method on it (get() on a java.util.concurrent.Future, or Await.result() on a Scala Future). Submitting a task to an ExecutorService returns a Future.

A FutureTask is a Runnable and is designed to be run by an Executor. Futures are a great way to run parallel programs efficiently and without blocking.

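import java.util.concurrent.{Callable, Executors, FutureTask}

// `searcher` and `something` belong to the surrounding application and are not defined here.
// The single-thread executor is an assumption; any Executor would do.
val executor = Executors.newSingleThreadExecutor()
val future = new FutureTask[String](new Callable[String]() {
  def call(): String = searcher.search(something)
})
executor.execute(future)
// FutureTask is a java.util.concurrent type, so the blocking read is get(), not Await.result.
val blockingOutput = future.get()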

Handling thread safety in Scala is similar to Java, with options such as synchronisation, volatile fields, atomic references, read-write locks, countdown latches, and atomic data types.
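As one example of these options, the sketch below uses java.util.concurrent.atomic.AtomicInteger so that several threads can increment a shared counter without losing updates. CounterDemo and countWithThreads are hypothetical names for this illustration.

```scala
import java.util.concurrent.atomic.AtomicInteger

object CounterDemo {
  def countWithThreads(threads: Int, incrementsPerThread: Int): Int = {
    val counter = new AtomicInteger(0)

    // Each worker increments the shared counter atomically.
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit =
          (1 to incrementsPerThread).foreach(_ => counter.incrementAndGet())
      })
    }

    workers.foreach(_.start())
    // join waits for every worker to finish before the total is read.
    workers.foreach(_.join())
    counter.get()
  }
}
```

With a plain var and += instead of the atomic counter, concurrent increments could overwrite each other and the total could come out lower than threads * incrementsPerThread.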

Futures allow us to run computations off the main thread and to handle values that are still being computed, or have yet to be computed, by attaching callbacks to them.

If you come from a Java background, you might be aware of java.util.concurrent.Future. There are several challenges in using this:

  • The calling thread is blocked while it accesses the value.
  • It has to wait until the computation completes.
  • get() is the only way to retrieve the value.

This is not a great way of writing concurrent code. Scala offers better features with scala.concurrent.Future. With Scala Futures, we can achieve:

  1. Real-time non-blocking computations.
  2. Callbacks via onComplete (success or failure); the result passed to the callback is an instance of Try.
  3. Mapping and composition of multiple Futures.

Futures are immutable by nature and cache their result internally. Once a value or exception has been assigned, a Future cannot be modified or overwritten (although referential transparency is hard to achieve with them).

Execution context: Whenever a Future's body or callback has to run, something must decide which thread from the pool executes it. That is the job of the execution context. The default is ExecutionContext.global, which uses threads from a global pool whose size is determined by the number of CPU cores. It is backed by a fork-join thread pool that runs the work in the background.
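The points above about Scala Futures can be sketched in a few lines. This is only an illustration: slowSquare and sumOfSquares are hypothetical names, the sleep stands in for real work, and the global execution context is imported implicitly.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object FutureDemo {
  // Hypothetical slow computation that runs on the global execution context.
  def slowSquare(n: Int): Future[Int] = Future {
    Thread.sleep(50)
    n * n
  }

  def sumOfSquares(a: Int, b: Int): Int = {
    // Start both computations first so that they can run in parallel.
    val fa = slowSquare(a)
    val fb = slowSquare(b)

    // Compose the two Futures without blocking.
    val sum: Future[Int] = for { x <- fa; y <- fb } yield x + y

    // Non-blocking callback: it receives a Try (Success or Failure).
    sum.onComplete {
      case Success(v) => println(s"sum of squares = $v")
      case Failure(e) => println(s"failed: ${e.getMessage}")
    }

    // Blocking is used here only so the demo can return a plain value.
    Await.result(sum, 5.seconds)
  }

  def main(args: Array[String]): Unit = println(sumOfSquares(3, 4))
}
```

In application code the Await.result call would usually be left out and the composed Future passed on to the caller, so that no thread has to wait for the result.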

Lazy evaluation

The bound expression of a lazy val is not evaluated at the point of definition. It is evaluated only on first access.
On that first access, the expression is evaluated and the result is stored in the lazy val. Whenever we access the val later, nothing is executed again and the stored result is returned.

Lazy evaluation, or call-by-need, is an evaluation strategy in which an expression is not evaluated until its first use, i.e. evaluation is postponed until the value is demanded.

With eager evaluation we may waste CPU time on results that are never used, which can be very costly in larger, more complex code. Lazy evaluation helps here by evaluating the expression only when it is needed, avoiding unnecessary overhead.

However, it is not always a win: finding bugs can be tricky, because the programmer has less direct control over when code is executed.
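The evaluate-once behaviour of a lazy val can be made visible with a counter. In this sketch the Config class, its settings map and the loads counter are all invented for the illustration.

```scala
object LazyDemo {
  final class Config {
    // Counts how many times the initializer below has actually run.
    var loads: Int = 0

    // Not evaluated when Config is constructed, only on first access.
    lazy val settings: Map[String, String] = {
      loads += 1
      Map("mode" -> "batch")
    }
  }
}
```

After `val c = new LazyDemo.Config`, the counter c.loads is still 0. The first read of c.settings runs the initializer and sets loads to 1, and every later read returns the stored map without running it again.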

Conclusion

It is pretty evident that Scala is not a brand-new way of doing things but a slightly different, functionally focused way of doing what we had been doing in Java. It interoperates with the JVM seamlessly, and that helps bring out its best. Phases like the type checker may at times make compilation time feel like a pain, but measures such as avoiding cold compilation, using Hydra, and checking GC settings can help. Careful use of macros and implicits can also save a lot of time spent in the typing phase.

Shishir Chandra
Analytics Vidhya

Distributed computing enthusiast, data engineer, system architect, cloud computing