This is done using a block, delimited by { ... } braces. The last element of a block is an expression that defines its return value.

The definitions inside a block are only visible from within the block. The block can access what’s been defined outside of it, but if it redefines an external definition, the new one shadows the old one inside the block.
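
As a small illustration (the names here are made up), shadowing inside a block:

```scala
val x = 10
val result = {
  val x = 3 // shadows the outer x inside the block
  x * x     // last expression: the block's return value
}
// result is 9; outside the block, x is still 10
```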

Tail recursion

If a function calls itself as its last action, then the function’s stack frame can be reused. This is called tail recursion. In practice, this means that recursion is iterative in Scala, and is just as efficient as a loop.

One can require that a function is tail-recursive using a @tailrec annotation:

```scala
@tailrec
def gcd(a: Int, b: Int): Int = ...
```

An error is issued if gcd isn’t tail recursive.

Higher-Order Functions

Functions that take other functions as parameters or that return functions as results are called higher-order functions, as opposed to first-order functions that act on simple data types.

```scala
// Higher-order function
// Corresponds to the sum of f(n) from a to b
def sum(f: Int => Int, a: Int, b: Int): Int =
  if (a > b) 0 else f(a) + sum(f, a + 1, b)

// Different functions f
def id(x: Int): Int = x
def cube(x: Int): Int = x * x * x

// Calling our higher-order function
def sumInts(a: Int, b: Int): Int = sum(id, a, b)
def sumCubes(a: Int, b: Int): Int = sum(cube, a, b)
```

Anonymous functions

Instead of having to define a cube and id function in the example above, we can just write an anonymous function as such:
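
A sketch of what that looks like, reusing the sum function from above:

```scala
def sum(f: Int => Int, a: Int, b: Int): Int =
  if (a > b) 0 else f(a) + sum(f, a + 1, b)

// x => x and (x: Int) => x * x * x are anonymous id and cube functions
def sumInts(a: Int, b: Int): Int = sum(x => x, a, b)
def sumCubes(a: Int, b: Int): Int = sum((x: Int) => x * x * x, a, b)
```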

Identifier

The identifier is alphanumeric (starting with a letter, followed by letters or numbers) xor symbolic (starting with a symbol, followed by other symbols). We can mix them by using an alphanumeric name, an underscore _ and then a symbol.

Small practical trick: to define a neg function that returns the negation of a Rational, we can write:

```scala
class Rational(x: Int, y: Int) {
  ...
  // space between - and : because : shouldn't be a part of the identifier
  def unary_- : Rational = new Rational(-numer, denom)
}
```

The precedence of an operator is determined by its first character, in the following priority (from lowest to highest):

All letters

|

^

&

< >

= !

:

+ -

* / %

All other symbolic characters

Infix notation

Any method with a parameter can be used like an infix operator:

```scala
r add s   /* in place of */ r.add(s)
r less s  /* in place of */ r.less(s)
r max s   /* in place of */ r.max(s)
```

Constructors

Scala naturally executes the code in the class body as an implicit constructor, but there is a way to explicitly define more constructors if necessary:
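
A sketch of such an auxiliary constructor (the field definitions here are illustrative assumptions about the class body):

```scala
class Rational(x: Int, y: Int) {
  // Auxiliary constructor: must call the primary constructor first
  def this(x: Int) = this(x, 1)

  val numer = x
  val denom = y
}

val half = new Rational(1, 2) // primary constructor
val two  = new Rational(2)    // auxiliary constructor: 2/1
```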

Data abstraction

We can improve Rational by making it an irreducible fraction using the GCD:

```scala
class Rational(x: Int, y: Int) {
  private def gcd(a: Int, b: Int): Int = if (b == 0) a else gcd(b, a % b)
  val numer = x / gcd(x, y) // Computed only once with a val
  val denom = y / gcd(x, y)
  ...
}
```

There are obviously multiple ways of achieving this; the above code just shows one. The ability to choose different implementations of the data without affecting clients is called data abstraction.

Assert and require

When calling the constructor, using a denominator of 0 will eventually lead to errors. There are two ways of imposing restrictions on the given constructor arguments:

require, which throws an IllegalArgumentException if it fails

assert, which throws an AssertionError if it fails

This reflects a difference in intent:

require is used to enforce a precondition on the caller of a function

assert is used to check the code of the function itself

```scala
class Rational(x: Int, y: Int) {
  require(y != 0, "denominator must be non-zero")

  val root = sqrt(this)
  assert(root >= 0)
}
```

Class Hierarchies

Abstract classes

Just like in Java, we can have abstract classes and their implementations:

```scala
abstract class IntSet {
  def incl(x: Int): IntSet
  def contains(x: Int): Boolean
}

class Empty extends IntSet { // Empty binary tree
  def contains(x: Int): Boolean = false
  def incl(x: Int): IntSet = new NonEmpty(x, new Empty, new Empty)
}

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet { // left and right subtrees
  def contains(x: Int): Boolean =
    if (x < elem) left contains x
    else if (x > elem) right contains x
    else true
  def incl(x: Int): IntSet =
    if (x < elem) new NonEmpty(elem, left incl x, right)
    else if (x > elem) new NonEmpty(elem, left, right incl x)
    else this // already in the tree, nothing to add
}
```

Terminology

Empty and NonEmpty both extend the class IntSet

The definitions of incl and contains implement the abstract functions of IntSet

This implies that the types Empty and NonEmpty conform to the type IntSet, and can be used wherever an IntSet is required

IntSet is the superclass of Empty and NonEmpty

Empty and NonEmpty are subclasses of IntSet

In Scala, any user-defined class extends another class. By default, if no superclass is given, the superclass is Object

The direct or indirect superclasses are called base classes

Override

It is possible to redefine an existing, non-abstract definition in a subclass by using override.

```scala
abstract class Base {
  def foo = 1
  def bar: Int
}

class Sub extends Base {
  override def foo = 2 // You need to use override
  def bar = 3
}
```

Overriding something that isn’t overrideable yields an error.

Traits

In Scala, a class can only have one superclass. But sometimes we want several supertypes. To do this we can use traits. It’s declared just like an abstract class, but using the keyword trait:

Classes, objects and traits can inherit from at most one class but arbitrarily many traits.

```scala
class Square extends Shape with Planar with Movable ...
```

Traits cannot have value parameters, only classes can.

Singleton objects

In the IntSet example, one could argue that there really only is a single empty IntSet, and that it’s overkill to have the user create many instances of Empty. Instead we can define a singleton object:
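
A sketch of how this could look for the IntSet hierarchy above (using object instead of class makes Empty a singleton):

```scala
abstract class IntSet {
  def incl(x: Int): IntSet
  def contains(x: Int): Boolean
}

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
  def contains(x: Int): Boolean =
    if (x < elem) left contains x
    else if (x > elem) right contains x
    else true
  def incl(x: Int): IntSet =
    if (x < elem) new NonEmpty(elem, left incl x, right)
    else if (x > elem) new NonEmpty(elem, left, right incl x)
    else this
}

// object instead of class: there is exactly one Empty, no `new` needed
object Empty extends IntSet {
  def contains(x: Int): Boolean = false
  def incl(x: Int): IntSet = new NonEmpty(x, Empty, Empty)
}
```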

Polymorphism

```scala
trait List[T] {
  def isEmpty: Boolean
  def head: T
  def tail: List[T]
}

class Cons[T](val head: T, val tail: List[T]) extends List[T] {
  def isEmpty = false
  // val head: T is a legal implementation of head,
  // and so is val tail: List[T]
  // (they're in the argument list of Cons[T])
}

class Nil[T] extends List[T] {
  def isEmpty = true
  def head = throw new NoSuchElementException("Nil.head")
  def tail = throw new NoSuchElementException("Nil.tail") // returns type Nothing
}
```

Type inference

Type bounds

We can set the types of parameters as either subtypes or supertypes of something. For instance, a method that takes an IntSet and returns it if all elements are positive, or throws an error if not, could be implemented as such:

```scala
// Can either return an Empty or a NonEmpty, depending on what it's given:
def assertAllPos[S <: IntSet](r: S): S = ...
```

Here, <: IntSet is an upper bound of the type parameter S. Generally:

S <: T means S is a subtype of T

S >: T means S is a supertype of T

It’s also possible to mix a lower bound with an upper bound:

```scala
[S >: NonEmpty <: IntSet]
```

This would restrict S to any type on the interval between NonEmpty and IntSet.

Variance

Given NonEmpty <: IntSet, is List[NonEmpty] <: List[IntSet]? Yes!

Types for which this relationship holds are called covariant because their subtyping relationship varies with the type parameter. This makes sense in situations fitting the Liskov Substitution Principle (loosely paraphrased):

If A <: B, then everything one can do with a value of type B one should also be able to do with a value of type A.

In Scala, for instance, Arrays are not covariant.

There are in fact 3 types of variance (given A <: B):

C[A] <: C[B] means C is covariant

C[A] >: C[B] means C is contravariant

Neither C[A] nor C[B] is a subtype of the other means C is nonvariant

Scala lets you declare the variance of a type by annotating the type parameter:
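
For example (illustrative class names), the annotation goes right before the type parameter:

```scala
class Covariant[+A]     // if A <: B then Covariant[A] <: Covariant[B]
class Contravariant[-A] // if A <: B then Contravariant[B] <: Contravariant[A]
class Invariant[A]      // no subtyping relation between different instantiations
```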

Functions are contravariant in their argument types, and covariant in their result type. This allows us to state a very useful and important subtyping relation for functions: A1 => B2 <: A2 => B1 if and only if A1 >: A2 and B1 >: B2.

Note that, in this case, A2 => B2 is unrelated to A1 => B1.

The Scala compiler checks that there are no problematic combinations when compiling a class with variance annotations. Roughly:

Covariant type parameters can only appear in method results

However, covariant type parameters may appear in lower bounds of method type parameters

Contravariant type parameters can only appear in method parameters

However, contravariant type parameters may appear in upper bounds of method type parameters

Invariant type parameters can appear anywhere

The following code, for instance, is correct as the covariant type parameter is a method result, and the contravariant is a parameter:

```scala
package scala
trait Function1[-T, +U] {
  def apply(x: T): U
}
```

Object oriented decomposition

Instead of writing external methods that apply to different types of subclasses, we can write the functionality inside the respective classes.

But this is problematic if we need to add lots of methods but not add many classes, as we’ll need to define new methods in all the subclasses. Another limitation of OO decomposition is that some non-local operations cannot be encapsulated in the method of a single object.

Pattern matching is especially useful when what we do is mainly to add methods (not really changing the class hierarchy). Otherwise, if we mainly create sub-classes, then object-oriented decomposition works best.

Case classes

A case class definition is similar to a normal class definition, except that it is preceded by the modifier case. For example:
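
A sketch using the classic expression hierarchy (Expr, Number and Sum are illustrative names):

```scala
trait Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr

// Case classes come with a companion apply method (no `new` needed)
// and can be decomposed with pattern matching:
def eval(e: Expr): Int = e match {
  case Number(n)   => n
  case Sum(e1, e2) => eval(e1) + eval(e2)
}
```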

As a convention, operators ending in : associate to the right, and are calls on the right-hand operand.

List patterns

It is also possible to decompose lists with pattern matching. Examples:

```scala
Nil                          // Nil constant
p :: ps                      // A pattern matching a list with a head matching p and a tail matching ps
List(p1, ..., pn)            // Same as p1 :: ... :: pn :: Nil
1 :: 2 :: xs                 // Lists that start with 1 then 2
x :: Nil                     // Lists of length 1
List(x)                      // Same as x :: Nil
List()                       // Empty list, same as Nil
List(2 :: xs)                // A list containing as only element another list that starts with 2
x :: y :: List(xs, ys) :: zs // Lists of length >= 3 with a list of 2 elements in 3rd position
```

We can do a really short insertion sort this way (but one that runs in O(n²)).
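
A sketch of that insertion sort, restricted to Int for simplicity:

```scala
def isort(xs: List[Int]): List[Int] = xs match {
  case List()  => List()
  case y :: ys => insert(y, isort(ys))
}

// Insert x into an already sorted list xs
def insert(x: Int, xs: List[Int]): List[Int] = xs match {
  case List()  => List(x)
  case y :: ys => if (x <= y) x :: xs else y :: insert(x, ys)
}
```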

List methods

Sublists and element access

xs.init: A list consisting of all elements of xs except the last one (an error if xs is empty).

xs take n: A list consisting of the first n elements of xs or xs itself if it’s shorter than n

xs drop n: The rest of the collection after taking n elements.

xs(n): The element of xs at index n

Creating new lists

xs ++ ys or xs ::: ys: Concatenation of xs and ys

xs.reverse: The list containing the elements of xs in reversed order

xs updated (n, x): The list containing the same elements as xs, except at index n where it contains x.

Finding elements

xs indexOf x: The index of the first element in xs equal to x, or -1 if x does not appear in xs

xs contains x: same as xs indexOf x >= 0

Higher-order list functions

These are functions that work on lists and take another function as argument. The above examples often have similar structures, and we can identify patterns:

transforming each element in a list in a certain way

retrieving a list of all elements satisfying a criterion

combining the elements of a list using an operator

Since Scala is a functional language, we can write generic functions that implement these patterns using higher-order functions.

Map

The actual implementation of map is a bit more complicated for performance reasons, but follows something along the lines of:

```scala
abstract class List[T] {
  ...
  def map[U](f: T => U): List[U] = this match {
    case Nil     => this
    case x :: xs => f(x) :: xs.map(f)
  }
}

// Multiplies all elements of the list by a factor
def scaleList(xs: List[Double], factor: Double): List[Double] =
  xs map (x => x * factor)

// Squares all elements of the list
def squareList(xs: List[Int]): List[Int] =
  xs map (x => x * x)
```

As a tiny note, it’s usually best to put the function value as the last parameter of a function, because that makes it more likely that the compiler can infer the types of the arguments of the function. E.g. we can write (x, y) => x < y instead of (x: Int, y: Int) => x < y.

How can we make this code nicer? We can use the Ordering type to represent the function, and make it an implicit parameter:

```scala
def msort[T](xs: List[T])(implicit ord: Ordering[T]): List[T] = {
  val n = xs.length / 2
  if (n == 0) xs
  else {
    def merge(xs: List[T], ys: List[T]): List[T] = (xs, ys) match {
      case (Nil, ys) => ys
      case (xs, Nil) => xs
      case (x :: xs1, y :: ys1) =>
        if (ord.lt(x, y)) x :: merge(xs1, ys)
        else y :: merge(xs, ys1)
    }
    val (fst, snd) = xs splitAt n
    merge(msort(fst), msort(snd)) // ord is visible at this scope
  }
}

val nums = List(2, -4, 5, 6, 1)
msort(nums)

// Generalisation:
val fruits = List("apple", "pineapple", "orange", "banana")
msort(fruits)
```

Using the Ordering[T] type means using the predefined default orderings, which we don’t even need to supply to msort, namely Ordering.String and Ordering.Int. See notes on ordering in Java.

When you write an implicit parameter, and you don’t write an actual argument that matches that parameter, the compiler will figure out the right implicit to pass, based on the demanded type.

Rules for implicit parameters

Say that a function takes an implicit parameter of type T. The compiler will search for an implicit definition that:

is marked implicit

has a type compatible with T

is visible at the scope of the function call (as at the recursive msort calls above), or is defined in a companion object associated with T

If there’s a single (most specific) definition, it will be taken as actual argument for the implicit parameter. Otherwise, it’s an error.

For instance, at the recursive calls msort(fst) and msort(snd), the compiler inserts the ord parameter of msort

Proof techniques

Before we can prove anything, we’ll just assert that pure functional languages have a property called referential transparency, since they don’t have side effects. This means that we can use reduction steps as equalities to some part of a term.

Structural induction

The principle of structural induction is analogous to natural induction.

To prove a property P(xs) for all lists xs:

Base case: Show that P(Nil) holds

Induction step: for a list xs and some element x, show that if P(xs) holds then P(x :: xs) also holds.

Instead of constructing numbers and adding 1, we construct lists from Nil and add one element.

Other collections

All the collections we’ll study are immutable. The collection hierarchy is as follows:

Iterable
    Seq
        List
        Vector
        Range
    Set
    Map

Sequences

Vectors

A Vector of up to 32 elements is just an array, but once it grows past that bound, its representation changes; it becomes a Vector of 32 pointers to Vectors (that follow the same rule once they outgrow 32).

Unlike lists, which are linear (access to the end of the list is slower than the start), random access to a certain element in a vector can be done in time log32(n).

Vectors are fairly good for bulk operations that traverse a sequence, such as a map, fold or filter. Also, 32 is a good number since it corresponds to a cache line.

Vectors are created analogously to lists:

```scala
val nums = Vector(1, 2, 3, -88)
val people = Vector("Bob", "James", "Peter")

// Instead of x :: xs we have:
x +: xs // create a new vector with leading element x, followed by xs
xs :+ x // create a new vector with trailing element x, preceded by xs
```

Creating new vectors with these :+ and +: operators works by adding a vector, and recreating parent vectors with pointers to the existing ones. Doing this preserves immutability while still being fairly efficient (log32(n)).

Arrays and Strings

They come from Java, so they can’t be subclasses of Iterable, but they still work just as if they were subclasses of Seq, and we can apply all the same operations.

For-Expressions

Higher order functions and collections in functional languages often replace loops in imperative languages. Programs using many nested loops can therefore often be replaced by a combination of higher order functions.

For example, let’s say we want to find all 1 < i < j < n for which i + j is prime. This would take two loops in an imperative language, but in Scala we can “just” write:
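
A sketch of that for-expression; isPrime is assumed to be defined, so a naive version is included here for illustration:

```scala
// Naive primality test (an assumption, good enough for illustration)
def isPrime(n: Int): Boolean = (2 until n).forall(n % _ != 0)

// All pairs (i, j) with 1 <= j < i < n such that i + j is prime
def primePairs(n: Int) =
  for {
    i <- 1 until n
    j <- 1 until i
    if isPrime(i + j)
  } yield (i, j)
```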

The rest of these notes correspond to the Functional Program Design in Scala course

Querying

Let’s say we want to query the number of authors who have written two or more books.

```scala
{
  for {
    b1 <- books
    b2 <- books
    // Prevent duplicates by using lexicographical order.
    // We could also use if b1 != b2, but this would
    // match the same pair of books twice.
    if b1.title < b2.title
    a1 <- b1.authors
    a2 <- b2.authors
    if a1 == a2
  } yield a1
}.distinct // another way to prevent duplicates
```

The first mechanism to prevent duplicates is to compare titles using lexicographical order instead of a simple !=. Another trick is to use .distinct, which is like a .toSet.

Translation to higher-order functions

The syntax of for is closely related to the higher-order functions map, flatMap, and filter. These functions could be implemented as such:
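
One way to sketch the correspondence is to define map, filter and flatMap variants in terms of for (the function names here are illustrative):

```scala
// A single generator translates to a map:
def mapFun[T, U](xs: List[T], f: T => U): List[U] =
  for (x <- xs) yield f(x)

// A generator with a guard translates to a withFilter:
def filterFun[T](xs: List[T], p: T => Boolean): List[T] =
  for (x <- xs if p(x)) yield x

// Two generators translate to a flatMap over an inner for:
def flatMapFun[T, U](xs: List[T], f: T => Iterable[U]): List[U] =
  for (x <- xs; y <- f(x)) yield y
```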

Interestingly, the translation of for is not limited to lists, sequences, or collections. Since it’s based solely on the presence of the methods map, flatMap and withFilter, we can simply redefine these methods for our own types.

If, for instance, we were to write a database supporting these methods, then as long as these methods are defined, we can use the for syntax for querying the database.

Functional Random Generators

Definition

We could also define these three methods (map, flatMap, withFilter) for a random value generator. Let’s define it as such:

```scala
trait Generator[+T] { self => // an alias for "this"
  def generate: T

  def map[S](f: T => S): Generator[S] = new Generator[S] {
    // we use self instead of this to reference the enclosing
    // Generator[T] and not the anonymous Generator[S]
    def generate = f(self.generate)
  }

  def flatMap[S](f: T => Generator[S]): Generator[S] = new Generator[S] {
    def generate = f(self.generate).generate
  }
}

def single[T](x: T): Generator[T] = new Generator[T] {
  def generate = x // identity
}

def choose(lo: Int, hi: Int): Generator[Int] =
  for (x <- integers) yield lo + x % (hi - lo)

def oneOf[T](xs: T*): Generator[T] = // T* means you can give it as many arguments as you want
  for (idx <- choose(0, xs.length)) yield xs(idx)
```

Usage

Having created a generator, we can use this as a building block for more complex expressions:
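
A sketch, assuming the Generator trait above and a basic integers generator built on java.util.Random:

```scala
trait Generator[+T] { self =>
  def generate: T
  def map[S](f: T => S): Generator[S] = new Generator[S] {
    def generate = f(self.generate)
  }
  def flatMap[S](f: T => Generator[S]): Generator[S] = new Generator[S] {
    def generate = f(self.generate).generate
  }
}

val integers = new Generator[Int] {
  val rand = new java.util.Random
  def generate = rand.nextInt()
}

// Thanks to map and flatMap, for-expressions work on generators:
val booleans: Generator[Boolean] = for (x <- integers) yield x > 0

def pairs[T, U](t: Generator[T], u: Generator[U]): Generator[(T, U)] =
  for (x <- t; y <- u) yield (x, y)
```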

Application: Random Testing

Generators are especially useful for random testing. Obviously it’s hard to predict the result of any random input without running the program, but what we can do is test postconditions, which are properties of the expected result.

We can use a tool called ScalaCheck to do this in a more automated way. Instead of writing tests, with ScalaCheck we write properties that are assumed to hold. ScalaCheck will then try to find good counter-examples if the assertion fails.

```scala
forAll { (l1: List[Int], l2: List[Int]) =>
  l1.size + l2.size == (l1 ++ l2).size
}
```

Monads

Definition

A monad M is a parametric type M[T] with two operations, unit and flatMap (more commonly called bind in the literature):

```scala
trait M[T] {
  def flatMap[U](f: T => M[U]): M[U]
}

def unit[T](x: T): M[T]
```

The unit method returns a monad with the given type:

List is a monad with unit(x) = List(x)

Set is a monad with unit(x) = Set(x)

Option is a monad with unit(x) = Some(x)

Generator is a monad with unit(x) = single(x)

For every monad, map can be defined as a combination of flatMap and unit. All of the following are equivalent.

```scala
m map f
m flatMap (x => unit(f(x)))
m flatMap (f andThen unit)
```

These methods have to satisfy some laws:

Associativity: we can put the parentheses either to the left or the right, so (m flatMap f) flatMap g == m flatMap(x => f(x) flatMap g)

Left unit: unit(x) flatMap f == f(x)

Right unit: m flatMap unit == m

Significance of the laws

Associativity says that one can “inline” nested for-expressions; the following are equivalent:
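
As a concrete sketch with Option as the monad (half is an illustrative function):

```scala
def half(x: Int): Option[Int] = if (x % 2 == 0) Some(x / 2) else None

val m = Some(8)

// Nested for-expression:
val nested = for {
  y <- for (x <- m; y <- half(x)) yield y
  z <- half(y)
} yield z

// "Inlined" version, equivalent by associativity:
val inlined = for {
  x <- m
  y <- half(x)
  z <- half(y)
} yield z
```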

Streams

Sometimes, for performance reasons, we want to avoid computing the tail of a sequence until it is needed for the evaluation result (which might be never). Streams implement this idea while keeping the notation concise. They’re similar to lists, but their tail is evaluated only on demand.

Implementation

All other methods can be defined in terms of these three. The actual implementation of streams is in the Stream companion object, so if we want to define a new type of Stream, we just need to redefine these three methods.
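The three methods in question are isEmpty, head and tail. A minimal sketch of the interface (MyStream is an illustrative name to avoid clashing with the standard library):

```scala
trait MyStream[+A] {
  def isEmpty: Boolean
  def head: A
  def tail: MyStream[A]
}

object MyEmpty extends MyStream[Nothing] {
  def isEmpty = true
  def head = throw new NoSuchElementException("MyEmpty.head")
  def tail = throw new NoSuchElementException("MyEmpty.tail")
}
```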

Lazy Evaluation

The proposed implementation suffers from a serious potential performance problem: if tail is called several times, the corresponding stream will be recomputed each time. To avoid this, we can store the result of the first evaluation of tail and re-use the stored result next time.

This is called lazy evaluation (as opposed to by-name evaluation where everything is recomputed, and strict evaluation for normal parameters and val definitions). Scala uses strict evaluation by default, but allows lazy evaluation:

```scala
lazy val x = expr
```

x is computed only once, when it is needed the first time; since functional programming expressions yield the same result on each call, the result is saved and reused next time.

This means that using a lazy value for tail, Stream.cons can be implemented more efficiently:
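
A sketch of that cons, with a by-name tail parameter and a lazy val (MyStream is an illustrative stand-in for the real Stream trait):

```scala
trait MyStream[+T] {
  def isEmpty: Boolean
  def head: T
  def tail: MyStream[T]
}

val empty: MyStream[Nothing] = new MyStream[Nothing] {
  def isEmpty = true
  def head = throw new NoSuchElementException("empty.head")
  def tail = throw new NoSuchElementException("empty.tail")
}

// tl is a by-name parameter, so it isn't evaluated at the call site;
// the lazy val evaluates it at most once, then caches the result
def cons[T](hd: T, tl: => MyStream[T]): MyStream[T] = new MyStream[T] {
  def isEmpty = false
  def head = hd
  lazy val tail = tl
}
```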

Functions and State

So far we’ve seen that rewriting can be done anywhere in a term, and all rewritings which terminate lead to the same solution. For instance:

```scala
def iterate(n: Int, f: Int => Int, x: Int): Int =
  if (n == 0) x else iterate(n - 1, f, f(x))
def square(x: Int) = x * x

iterate(1, square, 3)

// Can be rewritten as follows:
if (1 == 0) 3 else iterate(1 - 1, square, square(3))
iterate(0, square, square(3))
iterate(0, square, 3 * 3)
iterate(0, square, 9)
if (0 == 0) 9 else iterate(0 - 1, square, 9)
9

// But also:
if (1 == 0) 3 else iterate(1 - 1, square, square(3))
iterate(0, square, square(3))
if (0 == 0) square(3) else iterate(0 - 1, square, square(square(3)))
square(3)
9
```

There are multiple ways to rewrite our way to the solution; this is known as the Church-Rosser Theorem of lambda-calculus.

In this chapter, we’ll look at code that doesn’t satisfy that property. We will say goodbye to the substitution model for code that isn’t purely functional.

Stateful Objects

An object has a state if its behavior is influenced by its history. It is mutable (while everything so far has been immutable).

Mutable states are defined using the var keyword (instead of val), and assigned with =:

```scala
var x: String = "abc"
var count = 111

x = "hi"
count = count + 1
```

An object with var members is a stateful object if the result of calling one of its methods depends on the history of previous method calls, i.e. the result may change over time.

Identity

Mutable state introduces questions about equality and identity between two objects.

If BankAccount is a stateful object (its balance may change), then val x = new BankAccount and val y = new BankAccount aren’t equal. This makes sense, because modifying x doesn’t mean modifying y, and we therefore have two different accounts.

In general, to determine equality, we must first specify what is meant by “being the same”. The precise meaning is defined by the property of operational equivalence: informally, x and y are operationally equivalent if no possible test can distinguish between them. For any arbitrary function f, f(x, y) and f(x, x) must return the same value.

Lisp

I don’t have a whole lot of notes on this, since most of Lisp was seen during lab sessions, and my notes on lambda-calculus are on paper (it wouldn’t have been easy typing it in real time). But for future reference, I’m adding a syntax list of the Lisp dialect seen in class:

(if c a b): special form which evaluates c, and then a if c != 0 and b if c = 0.

(cond (c1 r1) ... (cn rn) (else relse)): special form which evaluates c1, then r1 if c1 is true, or else continues with the other clauses.

(cons first rest): constructs a list equivalent to Scala’s x :: xs. In our interpreter, xs must be a list.