December 2007: Why Scala?

By Tim Dalton, OCI Software Engineer

Introduction

Scala is a programming language that integrates features of object-oriented and functional programming languages. It is statically typed and compiles to JVM bytecode. A scripting interpreter is also part of the Scala language distribution. At the time of this writing, the current release of Scala for the JVM is version 2.6, with a significantly less mature .NET version beginning to get more attention. This document provides an overview of those aspects of Scala that make it an intriguing option for development on the JVM platform and perhaps eventually .NET as well.

Scala was invented at the EPFL (École Polytechnique Fédérale de Lausanne) in Switzerland primarily by Martin Odersky, a professor there. Odersky is the co-designer of Java generics and the original author of the javac reference compiler. The Scala language was first released in 2003.

Scala Basics

The Scala distribution requires version 1.4 or later of the Java Runtime Environment (JRE). It can be downloaded from the Scala language download page. To run examples in this document, ensure that the ./bin directory of the Scala distribution is included on the execution path (PATH environment variable) and that the JAVA_HOME environment variable references the location of a JDK or JRE version 1.4 or later.

The best way to describe a language is to show code and explain what is happening.

The ubiquitous Hello World coded in Scala:

    package HelloWorldDemo

    object Main {
      def main(args: Array[String]) =
        println("Hello World");
    }

Save the above text in a file named HelloWorldDemo.scala, compile via scalac HelloWorldDemo.scala, and execute using scala HelloWorldDemo.Main. That will result in the output, Hello World.

Actually, Scala source files do not need to have any particular name nor conform to any directory structure. The source for multiple Scala classes and other types documented here can be in a single file.

Features demonstrated in "HelloWorldDemo":

Singleton objects versus classes: Scala does not support static methods within classes, but supports singleton objects designated by the keyword object. All methods of a singleton object are the equivalent of Java statics. An object and class of the same name are called "companions". A common idiom in Scala is a singleton object acting as a factory for its companion class.

Arrays are collections: Arrays are implemented in Scala as a collection class with a parameterized type (generic) specified using square brackets. Scala supports generics even when the underlying platform is Java version 1.4, with pretty much the same capabilities as Java 1.5 or later.

Implicit static import of Scala object scala.Predef: The println method is defined in the Scala singleton object scala.Predef, whose members are implicitly imported into every Scala source file, much like the implicit import of java.lang in Java. All classes in the packages java.lang and scala are implicitly imported as well.

A more complex example:

    package SwingDemo

    import javax.swing.{JFrame, JLabel}

    object Main extends Application {
      def getTitle() = "Scala can Swing"

      val frm = new JFrame(getTitle)
      frm.getContentPane.add(new JLabel("Hello World"))
      frm.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE)
      frm.pack
      frm setVisible true
    }

To one accustomed to Java, there appear to be typographical errors in the example above, but there are none and it will compile. Scala has a very flexible syntax compared to Java and other statically typed languages. This allows for a greater degree of conciseness, lets code read more like a "Domain Specific Language" (DSL), and allows new idioms to be invented. Many features of Scala are enabled by this syntactical flexibility.

Like Java, Scala uses curly braces {} to define the scope of an object, class, method, or other block of code. Its import notation differs in that multiple classes from a given package can be specified using curly braces, and _ is used instead of * to import everything from a package.

Extending Application eliminates the need to implement the main method; the code simply goes inside the object definition.

Type inference:

The Scala compiler can often infer the type of an object, so there is no need to specify the type explicitly. In the SwingDemo example, the compiler can figure out that frm is a JFrame. Types can be explicitly specified for clarity: in the example above, the line val frm:JFrame = new JFrame(getTitle) could be used as well. The return type of methods can also be inferred. The getTitle() method has an inferred return type of String. The method declaration with an explicit return type would be def getTitle():String =.

There is a special type, Unit, that is the Scala equivalent of void. A method that does not return a value is inferred to return Unit. The main method of the HelloWorldDemo.Main object earlier has Unit as its inferred return type, which could be explicitly specified as def main(args:Array[String]):Unit =.

Sometimes types need to be explicitly specified to coerce an object into a type other than what would have been inferred. For example, var n=1 will infer n to be an Int. To force n to be a Byte, use var n:Byte=1.

Values versus variables:

Scala uses keywords val and var to indicate a value versus a variable. Values are immutable and are the equivalent of final in Java while variables are mutable. Values and variables defined in the object or class definition context represent properties and are in effect public by default, but can be protected or private just like Java.

Inferred trailing semi-colons:

Though there are some exceptions, the compiler can figure out where a line of code ends.

Optional empty parameter lists:

Provided that no object or class property has the same name, parentheses around empty parameter lists are optional. In the above example, the compiler can determine that getTitle invokes getTitle().

Optional "dot notation" and parentheses around a single parameter in certain cases:

When the invocation is in the form <object> <method> <parameter(s)>, the parentheses around the parameters and the . between the object reference and method name become optional. The line frm setVisible true in the SwingDemo example above demonstrates this. This is valuable for implementing DSLs.

Scala Types

Scala is a pure object-oriented language and does not support primitives. In the scala package, there are classes Char, Byte, Short, Int, Long, Float, Double, and Boolean that correlate to Java primitives. When Scala objects interact with Java, conversions between primitives and those types are implicitly performed.

Non-alphanumeric characters in method names: Scala allows characters in method names that are not allowed by Java. Coupled with the fact that parentheses around single parameters are optional, this enables a form of operator overloading. Here is a demonstration:

    package FooPlusBarDemo

    class Foo(value: Int) {
      def +(bar: Bar) = value + bar.value
    }

    class Bar(val value: Int)

    object Main extends Application {
      val foo = new Foo(13)
      val bar = new Bar(23)
      println(foo + bar)
    }

Running the above application should produce the output 36. The expression foo + bar could also be expressed as foo.+(bar). Scala has rules as to whether a method is usable as an infix, prefix, or postfix operation. Any method that takes a single parameter can be used as an infix operator, like the foo + bar expression above, and any method without parameters can be used as a postfix operator. However, methods whose names end with a colon, :, are right associative instead of left associative. Prefix operations can be defined only for the characters '+', '-', '!', and '~', using the notation def unary_ followed by the character. Here is an example of a unary/prefix operation, a postfix operation, and a right associative infix operation:

    package PreInPostFixDemo

    class Foo(val value: Int) {
      def unary_! = "!!!" + value + "!!!"
      def % = value + "%"
      def *:(multiple: Int) = value * multiple
    }

    object Main extends Application {
      var foo = new Foo(62)
      Console.println(!foo)     // unary (prefix)
      Console.println(foo %)    // postfix
      Console.println(2 *: foo) // infix and right associative
    }

Output:

!!!62!!!
62%
124

The expression !foo is the equivalent of foo.unary_!, foo % is the same as foo.%, and the expression 2 *: foo, being right associative, is the equivalent of foo.*:(2). The List data type, described later, utilizes right associative infix operators.

Inferred return values based on result of last expression in code block

Scala does not require an explicit return in a method, though it is supported. Scala uses the result of the last expression in the code block as the return value for the block. If the last expression is a call to a Java method returning void, then the method is inferred to return Unit. Example:

    def NullSafeToUpper(s: String) = {
      println("in NullSafeToUpper")
      if (s == null) "NULL" else s.toUpperCase
    }

This example demonstrates how the result of the last expression in a block of code is the inferred return value and that if constructs in Scala can also act like the <condition> ? <then-expression> : <else-expression> construct in Java. Since both possible results of if (s == null) are strings, then String is the inferred return type for the method.

Primary constructors

The primary constructor of a Scala class is defined as part of the class definition itself. Object properties can also be defined in the primary constructor. Example:

    package ValVarConstructorDemo

    class ImmutableFoo(val value: Int)

    class MutableFoo(var value: Int)

    object Main extends Application {
      val immutable = new ImmutableFoo(54)
      val mutable = new MutableFoo(32)

      println("Immutable value = " + immutable.value)
      // Compiler won't allow: immutable.value = 65

      println("Mutable value = " + mutable.value)
      mutable.value = 39
      println("Mutated Mutable value = " + mutable.value)
    }

Output:

Immutable value = 54
Mutable value = 32
Mutated Mutable value = 39

Parameters provided to the primary constructor can use val or var to indicate whether properties of the same name should be defined for the object. Specifying a val makes the property accessible by an implicit accessor method, while var makes the property also mutable using an implicit mutator method. These accessors and mutators are implemented in a way that allows access in the same manner as a public field in Java. The value property in ImmutableFoo and MutableFoo classes in the above example demonstrates value and variable properties.

A Scala subclass can pass arguments to a superclass constructor in its extends clause. Example:

    package DogSpeakDemo

    class Mammal(name: String) {
      override def toString() = "Mammal named " + name
    }

    class Dog(name: String) extends Mammal(name) {
      def speak() = println(this.toString + " says Woof")
    }

    object Main extends Application {
      var rinTinTin = new Dog("Rin Tin Tin")
      rinTinTin.speak
    }

The example above will output Mammal named Rin Tin Tin says Woof. The Dog class extends the Mammal class and uses the Mammal(name:String) constructor to pass the name to the superclass. (Note: Scala requires an explicit override modifier when overriding methods.)

Allowing constructors and fields to be specified as part of the class definition itself allows simple classes to be defined using a single line of code.

Secondary constructors can be defined by implementing methods named this. The example below adds a secondary constructor that takes a double value to the AbsoluteNumber class:

    class AbsoluteNumber(num: Int) {
      var value = Math.abs(num)

      def this(dbl: Double) = this(dbl.toInt)
    }

The this(dbl:Double) constructor simply invokes the primary constructor with the double value converted to an integer. Secondary constructor parameters do not become properties for the object.

Ability to specify accessors and mutators and create virtual public fields

Scala provides the ability to define accessor and mutator methods that hide the implementation of an object property. Example:

    package AbsoluteNumberDemo

    class AbsoluteNumber(num: Int) {
      private var _value = Math.abs(num)

      def value = _value                              // "getter" method
      def value_=(num: Int) = _value = Math.abs(num)  // "setter" method
    }

    object Main extends Application {
      var absolute = new AbsoluteNumber(10)
      printf("Absolute = {0}\n", absolute.value)
      absolute.value = -5
      printf("Absolute = {0}\n", absolute.value)
    }

The example produces output:

Absolute = 10
Absolute = 5

To a client object, an AbsoluteNumber object appears to have a public field named value. The def value_=(...) notation specifies a mutator method that is used to perform assignment to the property in question. This allows Scala classes to better conform to the Uniform Access Principle [1], which states that the services of an object should be available through a uniform notation that does not reveal whether they are implemented through storage or through computation. The num parameter provided to the primary constructor becomes a hidden property.

Default apply method

Methods named apply have special meaning for Scala objects and classes. These are methods that are invoked by using a "method-less" expression in the form <object>([<parameters>]):

    package ApplyDemo

    object Foo {
      def apply(n: Int) = printf("FooObject({0})\n", n)
    }

    class Foo {
      def apply(n: Int) = printf("FooClass({0})\n", n)
    }

    object Main extends Application {
      Foo(1)

      var foo = new Foo
      foo(2)
      foo(3)
    }

Output:

FooObject(1)
FooClass(2)
FooClass(3)

The expression Foo(1) is equivalent to Foo.apply(1). Scala arrays and other collection classes use the apply method to provide indexers. Array access in Scala is done with parentheses and not square brackets as in Java.

Type Aliasing

Scala supports type aliasing, which allows types to be defined as aliases to other types. Example:
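A minimal sketch of a type alias, written against current Scala 2 syntax rather than the 2.6 release discussed here. The names TypeAliasSketch, ScoreBoard, and leader are hypothetical and chosen only for illustration:

```scala
object TypeAliasSketch {
  // Alias a verbose generic type to a short, descriptive name (hypothetical name).
  type ScoreBoard = Map[String, Int]

  // The alias can be used anywhere the original type could be used.
  def leader(board: ScoreBoard): String =
    board.maxBy(_._2)._1

  def main(args: Array[String]): Unit = {
    val board: ScoreBoard = Map("Red" -> 3, "Blue" -> 7)
    println(leader(board)) // Blue
  }
}
```

The alias introduces no new type; a ScoreBoard and a Map[String, Int] are interchangeable to the compiler.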

Type aliases can be inherited from super classes or traits. More about traits later.

Functional Programming with Scala

Functional programming is getting more serious consideration outside of academia because of characteristics that are advantageous for concurrent programming. A main reason for this is that a pure functional language maintains no global state. The stack represents the state and therefore function invocations that are not dependent on each other can easily run concurrently.

Though Scala cannot be considered a pure functional language, it borrows many features from popular functional languages. With a little discipline by developers, benefits can be reaped.

First Class Functions

Functions in Scala are objects and can be passed like any other object. Function objects are expressed in the form [(<parameters>)] => <expression>. Example:
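The sketch below is one way to write a doOper method and an add method matching the description that follows. The method bodies and the enclosing object name are assumptions, and the code targets current Scala 2 syntax:

```scala
object FirstClassFunctionSketch {
  // An ordinary method; `add _` below turns it into a function object.
  def add(x: Int, y: Int): Int = x + y

  // Takes two integers and a function object of type (Int, Int) => Int.
  def doOper(a: Int, b: Int, oper: (Int, Int) => Int): Int = oper(a, b)

  def main(args: Array[String]): Unit = {
    // A method converted to a function object.
    println(doOper(3, 4, add _))                         // 7
    // An anonymous inline function object.
    println(doOper(3, 4, (x: Int, y: Int) => x * 2 + y)) // 10
  }
}
```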

The doOper method takes two integers and a function object that takes two integers and returns an integer. The <function or method name> _ expression indicates a partially applied function. This prevents the compiler from interpreting the expression as an attempt to invoke the function. When used with a method, like the add _ expression above, a function object is generated that delegates to the add method. The doOper(3,4, (x:Int, y:Int) => x * 2 + y) expression passes an anonymous inline function object to the doOper method. Inline function objects like this are often referred to as Lambda Expressions.

A function object without parameters is a code block that can be passed around just like any other object. Example:
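A sketch of a timeBlock method and a test method along the lines described next. The bodies are assumptions; in particular, this version returns the elapsed milliseconds so the caller decides how to report them, and it uses a by-name parameter (block: => Unit) in current Scala 2 syntax:

```scala
object TimeBlockSketch {
  // Runs the block (a by-name parameter) and returns the elapsed milliseconds.
  def timeBlock(block: => Unit): Long = {
    val start = System.currentTimeMillis()
    block
    System.currentTimeMillis() - start
  }

  // The block reads `n` and updates `total`, both local to test: a closure.
  def test(n: Int): Int = {
    var total = 0
    val millis = timeBlock {
      for (i <- 1 to n) total += i
    }
    println("Summed 1 to " + n + " in " + millis + " ms")
    total
  }

  def main(args: Array[String]): Unit = println(test(1000))
}
```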

The timeBlock method takes a block as a parameter and reports how many milliseconds it takes to execute it. The test method is used to demonstrate that values and variables in the scope of the method are accessible by the block. The ability to access values and variables in the current scope within code blocks represents a form of lexically scoped closures. (Note: the 1 to n expression uses a to method on the Int class and is equivalent to 1.to(n), which returns a Range object that can be iterated over for each value between 1 and n.)
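A repeat method with multiple parameter lists might look like the sketch below. The signature and the demo helper are assumptions made for illustration, using current Scala 2 syntax:

```scala
object CurryingSketch {
  // Two parameter lists: the count first, then the block to run.
  def repeat(n: Int)(block: => Unit): Unit =
    (1 to n).foreach(_ => block)

  // Hypothetical helper that counts how often the blocks run.
  def demo(): Int = {
    var count = 0
    repeat(3) { count += 1 }     // both parameter lists supplied
    val fiveTimes = repeat(5) _  // partially applied: only the count supplied
    fiveTimes { count += 1 }     // the block completes the application
    count
  }

  def main(args: Array[String]): Unit = println(demo()) // 8
}
```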

The multiple parameter lists of the repeat method are one way that Scala supports currying. Currying is a technique in functional programming of reducing a function call containing multiple parameters to multiple function calls, each usually with a single parameter. Each intermediate function call returns a function representing a partial application of the parameters. The expression repeat(5) _ represents a partially applied function, which is a function object that takes a code block as a parameter to complete the application of repeat. Nested functions provide another form of currying. Example:

    package NestedBlockDemo

    object Main extends Application {
      def repeat(n: Int) = {
        def executeBlock(block: => Unit) = (1 to n).foreach { x =>
          block
        }
        executeBlock _
      }

      val sevenTimes = repeat(7)

      sevenTimes {
        println("hello")
      }
    }

The executeBlock function is nested in the repeat method and it is partially applied as the return value.

Built-in support for List and Tuple Data Types

List and tuple types are very common in functional programming languages and Scala implements them both. Both types are immutable. Example:
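A brief sketch of the list and tuple operations discussed below, in current Scala 2 syntax. The values in list2 and the pair tuple, as well as the odds and doubled names, are hypothetical:

```scala
object ListTupleSketch {
  // Built with the right-associative :: operator, ending in the empty list Nil.
  val list1 = 1 :: 2 :: 3 :: 5 :: 8 :: Nil
  // Built with the List.apply factory method (hypothetical values).
  val list2 = List(13, 21, 34)

  val odds    = list1.filter(_ % 2 == 1) // keeps items for which the function is true
  val doubled = list2.map(_ * 2)         // applies the function to each item

  // A Tuple2[String, Int]; members are read with _1 and _2.
  val pair = ("Scala", 2003)

  def main(args: Array[String]): Unit = {
    println(list1.head) // first item
    println(list1.tail) // the rest of the list
    println(odds)
    println(doubled)
    println(pair._1 + " was first released in " + pair._2)
  }
}
```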

The list1 and list2 values demonstrate two ways to define a list in Scala. The list1 value uses a special instance of List called Nil that represents an empty list and then uses the right-associative operator, ::, repeatedly. An equivalent expression is val list1 = Nil.::(8).::(5).::(3).::(2).::(1).

The implicitly imported scala package defines a List singleton object with an apply method that accepts a variable number of arguments of a parameterized type. The parameterized type usually can be inferred. Scala supports variable arguments like Java does in 1.5 and later. This apply method acts as a factory and returns an implementation of the List class. The assignment expression for list2 uses this method. Both forms of list instantiation demonstrate Scala syntax enabling other features.

Lists in Scala support methods to return the first item, head, or the rest of the list, tail. Other common list operations like map and filter are provided as well. Map methods apply a function object to each item in the list and return a new list with the result from each. Filter methods apply a function object that returns a boolean to each item and return a new list of items that evaluated to true. The _ % 2 expression is shorthand for a function object that could be expressed as x:T => x % 2 where T is the type of the item in the list or as x => x % 2 since the type can be inferred.

Tuples are groupings of objects of differing types. Scala Tuple objects have methods of the form _<n>, where <n> is a number from 1 to the total number of objects in the tuple.

Pattern matching

Another common feature of functional languages is pattern matching. In Scala, pattern matching is not quite as integrated into the language as it is in Erlang or OCaml, where functions themselves have multiple definitions and the one invoked is based on how arguments match patterns. Scala provides a match/case construct. Example:
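The following sketch exercises the kinds of case clauses described below: literal, typed, list, case class, guard, and default. The describe method, its messages, and the Person fields are assumptions, written in current Scala 2 syntax:

```scala
object MatchSketch {
  // A case class gets a companion object with apply and extraction support.
  case class Person(name: String, age: Int)

  def describe(value: Any): String = value match {
    case 1 => "the number one"
    case s: String => "a string: " + s
    // Matches any list whose second item is 10.
    case x :: 10 :: rest => "a list starting with " + x + ", then 10, then " + rest
    // The guard narrows the match to persons under 18.
    case Person(name, age) if age < 18 => name + " is a minor"
    case Person(name, _) => name + " is an adult"
    // Matches anything not matched above.
    case _ => "something else"
  }

  def main(args: Array[String]): Unit = {
    println(describe(List(5, 10, 15)))
    println(describe(Person("John", 6)))
  }
}
```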

The first case that matches the passed value is processed and only that one. There is no need for something like break as in Java.

Case clauses can extract values from the matching pattern and use them as values in the corresponding execute block. For example, the case x::10::rest clause matches any List with a second value equal to 10. The value x is assigned the first item of the list and the rest value will reference a List containing the remainder of items or Nil if there are none.

A case class in Scala is a class that adds functionality that enables pattern matching for that class. Added functionality includes generation of a companion object with a factory apply method that creates an instance of the class. Hence, case classes can be instantiated without new. The Person("John", 6) expression uses the apply method on the Person singleton object as a factory to create an instance of the Person class and is equivalent to new Person("John", 6). Case class companion objects also have methods that allow fields to be extracted and used in the execute block for the case clause. Two case Person clauses in the above example extract the name field.

Case clauses can use if conditionals that are called "guards" to further specify the match.

The case _ clause is a "match anything" clause that acts like a default: clause on a Java switch statement.

One area of Java that represents a simple form of pattern matching is exception handling. Scala uses its own pattern matching constructs to process exceptions. Example:
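A small sketch of exception handling with case clauses, in current Scala 2 syntax. The parse method and its fallback values of -1 and 0 are hypothetical:

```scala
object TryCatchSketch {
  // The catch block uses the same case syntax as a match expression.
  def parse(text: String): Int =
    try {
      text.trim.toInt
    } catch {
      case _: NumberFormatException => -1 // text is not an integer (hypothetical fallback)
      case _: NullPointerException  => 0  // text is null (hypothetical fallback)
    }

  def main(args: Array[String]): Unit = {
    println(parse(" 42 "))
    println(parse("abc"))
  }
}
```

As with match, only the first clause whose pattern fits the thrown exception is executed; an exception that matches no clause propagates to the caller.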

Other Scala Language Features

Scala has many advanced features as compared to Java. Prominent features are outlined below:

Traits

Traits are an improvement over Java interfaces in that they can contain implementation code. They are similar to Ruby mixins and provide a form of multiple inheritance. Scala classes can use many traits. Example:
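A sketch using the class and trait names from the description that follows. The trait members foo and bar and the body of the Now class are assumptions, written in current Scala 2 syntax:

```scala
object TraitSketch {
  // Unlike Java interfaces of the time, traits can carry implementation code.
  trait Foos { def foo: String = "foo" }
  trait Bars { def bar: String = "bar" }

  class FooClass extends Foos
  class BarClass extends Bars
  class FooBarClass extends Foos with Bars

  // A hypothetical plain class, used to show anonymous mix-in composition.
  class Now { override def toString = "now" }

  // An anonymous subclass of Now that has both traits.
  val fooBarNow = new Now with Foos with Bars

  def main(args: Array[String]): Unit = {
    val both = new FooBarClass
    println(both.foo + both.bar)
    println(fooBarNow.toString + " " + fooBarNow.foo + " " + fooBarNow.bar)
  }
}
```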

The FooClass and BarClass have the Foos and Bars traits respectively while FooBarClass has both traits. Traits can override methods on the objects that use them. The expression, new Now with Foos with Bars, demonstrates that an object with traits can be declared anonymously like an inner class in Java. The fooBarNow value is assigned an object that is a subclass of Now that has both the Foos and Bars traits.

Sequence comprehensions

Sequence comprehensions (also called "for comprehensions") provide a syntax to iterate over multiple enumerations, apply conditionals, and either produce a new sequence or execute a function object on each item. Example:
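Two comprehensions in the spirit of the description below, in current Scala 2 syntax. The team names and scores are made up, and this version yields the comparison results as strings before printing them:

```scala
object ForSketch {
  // One generator plus a filter, yielding a new sequence of even numbers.
  val evens = for (i <- 1 to 100 if i % 2 == 0) yield i

  // Hypothetical team scores.
  val scores = List(("Red", 3), ("Blue", 7), ("Green", 5))

  // Two nested generators over the same list; the filter skips
  // self-comparison and keeps only pairs where the first team scored more.
  val results =
    for ((a, sa) <- scores; (b, sb) <- scores if a != b && sa > sb)
      yield a + " beat " + b

  def main(args: Array[String]): Unit = {
    println(evens.take(5))
    results.foreach(println)
  }
}
```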

The first sequence comprehension in the above example iterates over a range of integers from 1 to 100 and returns a new sequence containing even numbers. The second one has two iterations over the same sequence nested to compare the scores of each team to all other teams and produce output indicating the result. Sequence comprehensions provide a sort of query language for sequences.

Implicit Conversion Methods

Scala supports implicit methods that are often used for converting types. If the compiler encounters a type mismatch, it will look for an implicit method that takes the specified type and returns the required type. Example:

    package ImplicitsDemo

    class Foo(val value: Int) {
      override def toString = "Foo(" + value + ")"
    }

    class Bar(val value: String) {
      def printValue = Console.println(value)
    }

    object Main extends Application {
      implicit def Foo2Bar(foo: Foo) = new Bar(foo.value.toString)

      def printBar(bar: Bar) = bar.printValue

      printBar(new Foo(42))
    }

The printBar method expects a Bar object, but is provided a Foo object. The compiler will implicitly insert a call to Foo2Bar since it takes a Foo and returns a Bar. The method name is not significant here. An error will result if the compiler finds multiple matching implicit methods. Explicitly, the last line would be printBar(Foo2Bar(new Foo(42))), which amounts to printBar(new Bar(new Foo(42).value.toString)). The scala.Predef object provides many implicit conversion methods with self-explanatory names like byte2int and int2long to perform various conversions. Similar concepts in Java would be the implicit call to Object.toString() when an object is passed to various print methods and auto-boxing of primitives to objects when needed.

XML Processing

Scala supports XML as a built-in data type and includes operations that are similar to XPath expressions for querying the document object model (DOM). Example:

In the above example, the map and filter functions transform the results of the \\ operations. The _.text expression is shorthand for a function object, x:T => x.text where T is the parameterized type of the list being mapped or filtered. Sequence comprehensions could be used on XML as well for more concise expressions. The following section of code is more readable and could be used in the XMLDemo example above:

Scala code can be embedded in the XML within curly braces {}. Below is an example of a servlet implemented in Scala that uses XML with embedded values:

    package ServletDemo

    import javax.servlet.http.HttpServlet
    import javax.servlet.http.HttpServletRequest
    import javax.servlet.http.HttpServletResponse

    class HelloServlet extends HttpServlet {
      override def doGet(request: HttpServletRequest,
                         response: HttpServletResponse): Unit =
      {
        var user = request.getParameter("user")
        if (user == null) {
          user = ""
        }

        var html =
          <html>
            <head>
            </head>
            <body>
              <h1>Hello { user }</h1>
              <form>
                User:<input type="text" name="user" length="16"/>
              </form>
            </body>
          </html>

        response.getWriter.println(html)
      }
    }

The user value is retrieved from the servlet parameters and is embedded in the HTML to be rendered.

Actors Library

This feature was inspired by a similar concept in the functional language, Erlang. Actors are basically concurrent processes that communicate via message passing. Actors support both synchronous and asynchronous message passing. Example:

    package ActorDemo

    import scala.actors.Actor
    import scala.actors.Actor._

    case class Stop;

    class Initiator(receiver: Actor) extends Actor {
      def act() {
        receiver ! "Can you hear me now?"
        receive {
          case response: String => {
            println("Initiator received response:\n" + response)
          }
        }
        receiver ! Stop
      }
    }

    class Receiver extends Actor {
      def act() {
        while (true) {
          receive {
            case msg: String => {
              println("Receiver received:\n" + msg)
              sender ! "Yes I can."
            }
            case _: Stop => {
              println("Receiver received Stop.")
              exit()
            }
          }
        }
      }
    }

    object Main extends Application {
      val receiver = new Receiver;
      val initiator = new Initiator(receiver)
      receiver.start
      initiator.start
    }

Output:

Receiver received:
Can you hear me now?
Initiator received response:
Yes I can.
Receiver received Stop.

Actors implement the method act which is analogous to the Runnable.run() method in Java. A ! method is used to send a message to the receiving actor. Message queues are used to store messages until the receiver is ready to process them.

Scala actors use a thread pool that initially contains four threads. When an actor blocks via receive, a thread is blocked and is not available to the pool. The actor library will grow the pool if a new actor needs a thread.

The Scala actors library provides a means for an actor to not consume a thread when blocked by using a combination of loop and react methods. Due to the way react is implemented, looping using while does not work. Example of the Receiver class using loop and react:

    class Receiver extends Actor {
      def act() {
        loop {
          react {
            case msg: String => {
              println("Receiver received:\n" + msg)
              sender ! "Yes I can."
            }
            case _: Stop => {
              println("Receiver received Stop.")
              exit()
            }
          }
        }
      }
    }

This form of actor is called an "Event-based actor" and is much more scalable than the thread-based counterpart.

Lazy Evaluation

This feature was new as of Scala version 2.6. It allows an object property not to be evaluated until the value is accessed. Below is a code sample illustrating this:
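A sketch of a Polar class with the two properties named below, in current Scala 2 syntax. Representing the Cartesian point as a (Double, Double) tuple and counting conversions in a conversions variable are assumptions added so the difference is observable:

```scala
object LazySketch {
  // Counts how many times a conversion actually runs (hypothetical instrumentation).
  var conversions = 0

  class Polar(radius: Double, angle: Double) {
    private def toCartesian: (Double, Double) = {
      conversions += 1
      (radius * math.cos(angle), radius * math.sin(angle))
    }

    val eagerCartesianPoint = toCartesian     // evaluated during construction
    lazy val lazyCartesianPoint = toCartesian // evaluated on first access, then cached
  }

  def main(args: Array[String]): Unit = {
    val p = new Polar(1.0, 0.0)
    println(conversions)          // only the eager property has been computed
    println(p.lazyCartesianPoint) // first access triggers the computation
    println(conversions)
  }
}
```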

The property eagerCartesianPoint is instantiated upon construction of the Polar object, while the lazyCartesianPoint property is not instantiated until it is accessed. Lazy evaluation helps avoid wasting work on values that are never used, without requiring any explicit coding.

Anonymous Typing

Sometimes also referred to as "Structural Typing", this is another feature introduced in version 2.6. It allows a form of "duck typing" in which a type can be declared based on the methods implemented. Any classes that implement those methods match that type.
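A sketch of a Duck structural type with two unrelated classes, targeting current Scala 2, where the scala.language.reflectiveCalls import silences the feature warning for reflective calls. The method names quack and waddle, their return values, and the sounds helper are assumptions:

```scala
object DuckSketch {
  import scala.language.reflectiveCalls

  // Structural type: any class with these two methods qualifies.
  type Duck = { def quack: String; def waddle: String }

  // Foo and Bar share no interface or trait.
  class Foo { def quack = "Foo quacks"; def waddle = "Foo waddles" }
  class Bar { def quack = "Bar quacks"; def waddle = "Bar waddles" }

  val ducks: List[Duck] = List(new Foo, new Bar)

  // Members of a structural type are invoked reflectively at runtime.
  def sounds: List[String] = ducks.map(d => d.quack + ", " + d.waddle)

  def main(args: Array[String]): Unit = sounds.foreach(println)
}
```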

In the preceding example, classes Foo and Bar do not share an interface or a trait, but do implement the methods specified in the Duck type alias and therefore can comprise a list of type Duck. In other words, they quack like a Duck and waddle like a Duck and therefore both can be considered a Duck.

Summary

Scala's features make it as close to a dynamic language as a statically typed language can get while maintaining the performance characteristics of compiled Java. Scala approximates features that make dynamic languages like Ruby and Groovy attractive. Functional language features allow Scala programs to follow a more functional style when it is better suited for the task at hand or take a non-functional or imperative approach.

Features of Scala will certainly get consideration for inclusion in future versions of Java, particularly type inference and first class functions in the form of closures. There is a precedent for this in that Martin Odersky provided the basis for Java generics.

The unique characteristics of Scala as compared to other languages for the JVM platform make it a compelling language for Java developers to learn and perhaps come to use.