Scala or Java? Exploring myths and facts

The popularization of the Scala programming language, noticeable by the abundance of opinions and criticism on blogs and social networks (like this one by Nikita Ivanov from GridGain and the popular Yammer case), greatly increased the amount of information about the language. However, the quality of such information often leaves much to be desired.

Whether those opinions are favorable or contrary to Scala, they often contain outdated, superficial or biased statements. The goal of this article is to help those learning or evaluating Scala to come to their own conclusions. It presents the most common questions about the language and its environment and, for each one, adds clarifications, examples and links, favoring the formation of a better opinion or a more accurate assessment.

Scala is a compiled language designed to run on a managed environment, most likely the JVM. It combines the functional and object-oriented paradigms with a modern compiler and a type system checked at compile time, as in Java, but with the expressive syntax of (usually) interpreted languages such as Groovy or Ruby. However, the same features that make Scala expressive can also lead to performance problems and complexity. This article details where this balance needs to be considered. It is not an introduction or tutorial, but it does not presume knowledge of Scala.

Is Scala more productive than Java?

As development productivity is a very subjective matter, it is necessary to decompose this issue, evaluating the features of Scala that normally support it.

Functional Programming

The functional paradigm expresses programs as functions in the mathematical sense, mapping from one value to another (f(x) = y) without side effects, such as maintaining the state of objects or performing input/output. This allows several compile-time optimizations and avoids concurrency problems, as will be presented throughout this article.

Functional programming is a software paradigm based on the Lambda Calculus and has long been part of the academic environment in languages like Lisp, Scheme and Haskell. Commercial languages such as Java and C# tend to incorporate features from these languages, especially closures, but this effort is limited by backward compatibility, specifications and conflicts of interest. These limitations open space for new multi-paradigm languages, such as Clojure and Scala, which seek to offer the best of both the object-oriented and functional worlds.

These are the implementations of the iterative algorithm in Java and Scala, remarkably similar. However, a more compact and functional version, using infinite sequences and tuples can be very different:
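As a minimal sketch of that contrast, assuming the algorithm is the Fibonacci sequence (a reader comment on this article quotes a stream-based `fib`, which suggests so) and a Scala 2.13 or later standard library, where `LazyList` is the successor of the older `Stream`. The names `FibonacciSketch`, `fibIterative` and `fibs` are illustrative, not taken from the original listings.

```scala
object FibonacciSketch {
  // Iterative version: mutable variables and a loop, close to the Java style.
  def fibIterative(n: Int): Int = {
    var a = 1
    var b = 1
    var i = 0
    while (i < n) {
      val next = a + b
      a = b
      b = next
      i += 1
    }
    a
  }

  // Functional version: an infinite, lazily evaluated sequence of
  // (current, next) tuples, from which only the first component is kept.
  val fibs: LazyList[Int] =
    LazyList.iterate((1, 1)) { case (a, b) => (b, a + b) }.map(_._1)
}
```

With these definitions, `FibonacciSketch.fibs.take(10).toList` and `(0 until 10).map(FibonacciSketch.fibIterative).toList` should yield the same ten numbers.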

This version can be considered concise or complex, according to individual abilities and preferences. Functional programming is not the only source of complexity in Scala, as will be shown throughout this article, but is relevant to the difference in productivity.

Some Scala features, such as type inference, unchecked exceptions, optional objects and implicit conversions, can greatly reduce the number of statements and checks in a program without changing its meaning. Furthermore, Scala tries to be leaner than Java, removing features that can be expressed in terms of others. For example, there are no static references or primitive types in Scala, because the same effect can be obtained using plain objects.
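A few of these features can be sketched in a handful of lines. This is only an illustration; the names `BoilerplateSketch`, `ports`, `describe` and `Defaults` are hypothetical.

```scala
object BoilerplateSketch {
  // Type inference: the compiler infers Map[String, Int] without annotations.
  val ports = Map("http" -> 80, "https" -> 443)

  // Optional values: Map.get returns an Option, so no null check is needed.
  def describe(protocol: String): String =
    ports.get(protocol)
      .map(port => s"$protocol uses port $port")
      .getOrElse(s"$protocol is unknown")

  // No static members: a singleton object plays the role of a class
  // holding static fields and methods.
  object Defaults {
    val protocol = "http"
    def port: Int = ports(protocol)
  }
}
```

Here `describe("ftp")` falls through to the `getOrElse` branch instead of failing on a missing key, which is the kind of check that would otherwise need an explicit `if` in Java.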

To understand the difference in boilerplate code, consider this program that prints the MAC addresses of all network interfaces in the system. There are several ways to write this code, but that's the way it would probably be written in each language.
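One possible Scala version is sketched below. It is not the original listing; the object name `MacAddresses` and the helper `format` are assumptions, and only `java.net.NetworkInterface` from the Java standard library is used.

```scala
import java.net.NetworkInterface

object MacAddresses {
  // Formats raw bytes as colon-separated lowercase hex pairs.
  def format(mac: Array[Byte]): String =
    mac.map(b => f"${b & 0xff}%02x").mkString(":")

  def main(args: Array[String]): Unit = {
    val interfaces = NetworkInterface.getNetworkInterfaces
    while (interfaces != null && interfaces.hasMoreElements) {
      val nic = interfaces.nextElement()
      // getHardwareAddress returns null when no address is available
      // (for example on a loopback interface), hence the Option wrapper.
      Option(nic.getHardwareAddress).foreach { mac =>
        println(s"${nic.getName}: ${format(mac)}")
      }
    }
  }
}
```

Note that the checked `SocketException` declared by the Java API needs no `try` or `throws` clause in Scala, which is one of the sources of the reduced boilerplate.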

Other features, like implicit conversions and closures, may contribute to the reduction of code. Although these features do not exist in Java, some of them can be simulated or adapted, mostly by creating fluent APIs.

Traits

Traits are type definitions, similar to Java interfaces, but may contain implementation and are cumulative, offering a different approach to code reuse. A trait is similar to the combination of an abstract class and an interface in a single definition. The following example, taken from the shows how to combine traits to double, increment and/or filter a queue of integers:
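A sketch along the lines of the well-known stackable-traits queue from the book Programming in Scala is shown below. The enclosing object `TraitsSketch` is an assumption added to keep the example self-contained.

```scala
import scala.collection.mutable.ArrayBuffer

object TraitsSketch {
  abstract class IntQueue {
    def get(): Int
    def put(x: Int): Unit
  }

  // Plain queue backed by a buffer.
  class BasicIntQueue extends IntQueue {
    private val buf = new ArrayBuffer[Int]
    def get(): Int = buf.remove(0)
    def put(x: Int): Unit = { buf += x; () }
  }

  // Each trait modifies put and delegates to the next one in the stack.
  trait Doubling extends IntQueue {
    abstract override def put(x: Int): Unit = super.put(2 * x)
  }
  trait Incrementing extends IntQueue {
    abstract override def put(x: Int): Unit = super.put(x + 1)
  }
  trait Filtering extends IntQueue {
    // Silently drops negative numbers.
    abstract override def put(x: Int): Unit = if (x >= 0) super.put(x)
  }
}
```

The traits are applied from right to left, so `new BasicIntQueue with Incrementing with Filtering` filters first and increments afterwards, while swapping the two traits increments first and then filters.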

Is Scala complex?

The same features that can make Scala more productive can also make it unreadable. A feature that is often questioned is the use of symbols in method names, such as a method named ++. These methods, disguised as operators, may be useful to represent frequent operations, like list concatenation or addition of complex numbers. However, this nomenclature may easily be abused. When combined with type bounds, variance and partial functions, delicate topics in any language, statements may become very difficult to read, like the one mentioned in the post "Opinion: Scala is the new EJB2?".
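For the simpler case of a symbolic method name used as an operator, a minimal sketch with a hypothetical `Complex` class could look like this (the class and the enclosing object are illustrative only):

```scala
object OperatorSketch {
  final case class Complex(re: Double, im: Double) {
    // "+" and "*" are ordinary methods whose names happen to be symbols.
    def +(that: Complex): Complex = Complex(re + that.re, im + that.im)
    def *(that: Complex): Complex =
      Complex(re * that.re - im * that.im, re * that.im + im * that.re)
  }
}
```

With this definition, `Complex(1, 2) + Complex(3, 4)` is just another way of writing `Complex(1, 2).+(Complex(3, 4))`. The readability problem usually starts when the symbols have no such widely understood meaning.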

Another difficulty is that ScalaDocs, the official documentation of the class library, is incomplete in many aspects. However, there is an effort to improve this documentation and other resources for learning, all being grouped into a documentation site.

This does not mean that there is no inherent complexity in Scala. In the article "True Scala Complexity", Zhang Yang presents a detailed example of how the type system and conversions can get confusing, even for an experienced Scala developer. Some of these complexities can be prevented, alleviated or eliminated in the future, but others shall remain, such as those resulting from the integration of an expressive type system over both functional and object-oriented paradigms.

Does Scala offer better concurrency?

Writing correct and efficient concurrent programs is difficult. Debugging these programs can be even more challenging and unpredictable. Scala offers parallelism at a high level of abstraction, but so does Java, particularly with the concurrency features introduced by Java 7. The concurrency features of the two languages are similar in their purpose, but they are very different in their architecture and a full comparison is beyond the scope of this article. However, this issue becomes clearer when analysing the features of Scala that support it.

Unfortunately, programmers have found it very difficult to reliably build robust multi-threaded applications using the shared data and locks model, especially as applications grow in size and complexity. The problem is that at each point in the program, you must reason about what data you are modifying or accessing that might be modified or accessed by other threads, and what locks are being held. At each method call, you must reason about what locks it will try to hold, and convince yourself that it will not deadlock while trying to obtain them.

Scala attempts to reduce these difficulties using immutability and actors. Furthermore, the use of functional programming facilitates internal parallelism, automated by the compiler or libraries. The Closures proposal for Java 8 attempts to incorporate some of these features in the Java language and its libraries.

Immutability

Synchronizing access to shared mutable objects can result in much complexity in the use of concurrency primitives (locks, semaphores, etc.). Although this is not a common concern for application developers, it can be troublesome for developers of servers and frameworks. Scala tries to mitigate this problem by using immutable objects and pure functions. If an object is immutable, it can be shared or copied without worrying about who is using it, so it is naturally "thread-safe."

Unlike other functional languages, Scala does not force objects to be immutable. Mutable objects are important to implement a number of requirements, functional and nonfunctional. What Scala does is encourage the distinction between mutable and immutable using different packages and statements for each case.
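The distinction can be sketched with the standard collections, which live in separate `immutable` and `mutable` packages. The object and value names below are illustrative.

```scala
import scala.collection.mutable

object ImmutabilitySketch {
  // List is immutable: "adding" an element returns a new list
  // and leaves the original untouched.
  val primes: List[Int] = List(2, 3, 5)
  val morePrimes: List[Int] = primes :+ 7

  // ListBuffer comes from the mutable package and is changed in place.
  val buffer: mutable.ListBuffer[Int] = mutable.ListBuffer(2, 3, 5)
  buffer += 7
}
```

Because `primes` can never change, it can be handed to any number of threads without synchronization, whereas sharing `buffer` would require the usual locking discipline.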

This talk, by Rich Hickey, presents the main ideas behind the immutability and some considerations on how to code with it.

Actors

Low-level parallelism controls, such as locks and synchronized blocks, are sufficient to write concurrent programs correctly, but this task may not be easy. To write this type of program more productively and prevent defects, a high-level concurrency control is very desirable. Such an abstraction can be Fork/Join, Software Transactional Memory or, as featured in Scala, the Actor Model. In this abstraction, parallelism is expressed as actors reacting to messages, rather than as locking and releasing of threads.

The following example demonstrates actors estimating the value of Pi using the Monte Carlo method. This method generates random points in a square and calculates the ratio of those that fall within the inscribed circle, approximating Pi. The following implementation uses a recurring pattern in the actors model: one actor will be the "coordinator", managing several "calculators" who, either alone or cooperatively, progress towards the outcome.
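Independently of actors, the estimate itself can be sketched as a plain function. This is an assumption about how a calculator might compute its result, not the original code; `MonteCarloPi` and `estimate` are illustrative names. Sampling the unit square against a quarter circle gives the same ratio, Pi/4, as a full circle inscribed in a square.

```scala
import scala.util.Random

object MonteCarloPi {
  // Draws `samples` random points in the unit square and counts those
  // falling inside the quarter circle of radius 1; the ratio approximates Pi/4.
  def estimate(samples: Int, random: Random): Double = {
    val inside = Iterator
      .fill(samples)((random.nextDouble(), random.nextDouble()))
      .count { case (x, y) => x * x + y * y <= 1.0 }
    4.0 * inside / samples
  }
}
```

The accuracy grows slowly with the number of samples, which is why running several calculators at once is attractive.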

The "calculator" in this example may receive two messages: "Calculate", which is replied with an estimate of Pi and "ShutDown", which shuts down the actor. As these messages are simple notifications, they can be represented as constant case objects. The calculating actor can be written as follows:

The "coordinator" starts a list of calculators and tells them to calculate until any of them produces an accurate enough estimate. When that happens, it terminates the calculation and prints the value found and the execution time:

Finally, an object with a main method initializes the coordinator with the number of calculators to be used. The higher the number of calculators, the more likely one of them will find the desired value sooner, reducing the total execution time.

object PiActors extends App {
  new Coordinator(2).start
}

The actors model goes beyond the communication between threads. The Akka platform extends the model to support remote actors and adds several tools for developing distributed systems with high scalability and fault tolerance.

Although useful in many cases, the actors model and its implementation in Scala are not free of controversy. Many of its benefits come not from the actors themselves, but from the exchange of messages, which are usually immutable and favor the absence of shared mutable state. The default implementation in Scala, shown above, binds each actor to a native thread, which can be problematic, as shown by Tony Arcieri in the post "Why I Do not Like Scala". There are alternative implementations, using event-based actors for example, but they always come at some cost, at least in complexity.

Parallel Collections

Scala 2.9 introduced parallel collections, which make it easy to parallelize common collection operations such as map(), filter() and foreach(). The par method can be used to obtain a parallel version of a collection, as shown in this example:

Note that the order of the printed elements in the second statement is unpredictable, as it depends on the scheduling of threads by the operating system. Aleksandar Prokopec shows interesting details of how the parallel collections were implemented in his presentation at the Scala Days 2010.

Is Scala extensible?

Scala allows developers to customize the look and feel of the language, creating new languages and altering the compilation process. Such tasks can be challenging or even make the code unreadable, but they greatly extend the possibilities of the language.

Domain-specific languages can also be created when a more abstract and declarative language is needed by developers. For example, Apache Camel offers a Scala DSL to make the configuration of service routes more concise and correct.
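A very small internal DSL can be sketched with an implicit class that adds a symbolic method to `String`. Everything here is hypothetical (`RouteDslSketch`, `Route`, `RouteSource`, the `~>` operator and the endpoint names) and is not the Apache Camel API; it only illustrates the mechanism.

```scala
object RouteDslSketch {
  // A hypothetical route from one endpoint name to another.
  final case class Route(from: String, to: String)

  // Enriches String with a symbolic method so that routes read declaratively.
  implicit class RouteSource(val from: String) extends AnyVal {
    def ~>(to: String): Route = Route(from, to)
  }

  val routes: List[Route] = List("inbox" ~> "audit", "inbox" ~> "archive")
}
```

The expression `"inbox" ~> "audit"` is plain Scala: the compiler wraps the string in `RouteSource` and calls its `~>` method, so the DSL needs no separate parser.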

The development of domain-specific languages is a subject as deep as it is popular. For an introduction, see this presentation by Martin Fowler, who also wrote a book on DSLs.

Changing the compilation

Scala takes the development of languages a step further with parser combinators, allowing the creation of entirely new grammars, as shown by Daniel Spiewak in this article. When even that is not enough, one can still create compiler plugins to change the build process. These plugins could be written, for example, to perform static code analysis and evaluate metrics, like PMD or FindBugs. Another possibility would be to create a plugin to change or optimize the behavior of a library at compile time.

These features can make Scala code look very different from the original language, as shown by Michael Fogus in his implementation of BASIC in Scala. These customizations of the language can be used to solve complex problems elegantly, but can also be abused, alienating developers unfamiliar with the context of the changes.

Are Scala and Java interoperable?

Although it is possible to invoke Java methods in Scala code and vice versa, the interaction between languages is not without complications.

When calling Scala methods from Java, the developer needs to understand how the features that do not exist in Java are transformed into executable objects. For example, methods with non-alphabetic names, or methods receiving functions and tuples as parameters, do work when used in Java code, but need to be written properly, as shown in this article.
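The name translation can be observed from Scala itself. The sketch below uses `scala.reflect.NameTransformer` from the standard library and Java reflection; the `Bag` class and the enclosing object are hypothetical.

```scala
import scala.reflect.NameTransformer

object InteropSketch {
  // A hypothetical class with a symbolic method name.
  class Bag(val items: List[Int]) {
    def ++(that: Bag): Bag = new Bag(items ++ that.items)
  }

  // The encoded name under which the ++ method is visible to Java callers.
  val encodedName: String = NameTransformer.encode("++")

  // Names of the public methods that the compiled class actually exposes.
  val exposedNames: Set[String] = classOf[Bag].getMethods.map(_.getName).toSet
}
```

From Java, the method would therefore be invoked by its encoded name, for example `a.$plus$plus(b)`, which is legal Java but hardly pleasant to read.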

When invoking Java from Scala, the problem lies in the features of Java that were abandoned or implemented differently in Scala, such as interfaces, annotations and collections. This article explains more about these differences and alternatives for various cases, but some of these differences may force the developer to write code in the two languages. For example, interfaces and annotations must be written in Java, since these are usable, but not declarable, in Scala.

The scalac compiler does not compile Java code or invoke the javac compiler to do it, but it analyses the .java

Is Scala tooling bad?

The fifteen years of history and the strength of the Java community are certainly reflected in the abundance and maturity of its tools. The Scala tools are evolving faster, especially with the efforts from Typesafe (the company behind the language and several related projects) and community contributions, such as those from Twitter.

Using Java Libraries and Frameworks in Scala

One of the benefits of using a JVM language is the abundance of libraries and frameworks available for reuse. Considering the limitations of interoperability presented above, any library or framework available for Java can be used in Scala. This includes all Java EE (EJB, JSF, JAX-RS, etc.) and popular libraries, such as Hibernate and JUnit, because Java and Scala classes are virtually indistinguishable once compiled. For example, a servlet that prints the parameters of HTTP requests can be written in Scala as follows:

The problem when using Java libraries in Scala is that they were not designed for the syntax improvements of Scala. In some cases this is just inconvenient, but sometimes it may require the boilerplate code that Scala tries to avoid.

For an extreme example, take the CollectionUtils class from the Apache Commons Collections library. It has methods to filter, transform and iterate over collections, as does the Scala standard library. A program that uses Commons Collections to print the double of each positive element from a list of integers can be written as follows in Scala:

While correct, that code would probably be considered very wordy or repetitive by a Scala developer. However, using native Scala collections, one can write the same program in a much more idiomatic syntax:

myList filter(_>0) map(_*2) foreach println

Although quite obvious, this is the reason why several libraries end up being re-developed in Scala. It is the case, for example, of the Scalaz library, which features several data structures more suitable to functional programming. The same goes for web frameworks (Lift, Play), libraries for TDD and BDD (ScalaTest, Specs) and many others. Some of these libraries are very similar to their Java versions, while others are radically different and inspired by libraries from other languages, such as Scalatra, based on the Sinatra library for Ruby, or ScalaCheck, based on QuickCheck for Haskell.

The abundance of build automation tools can be both positive and confusing. The Simple Build Tool (SBT) is widely used, almost a community standard. However, to integrate with other tools (IDEs, servers, etc.) and support the complete development process (testing, packaging, deployment, etc.), tools such as Maven, Ant or Buildr may be more appropriate. They already support Scala and usually provide a wider variety of plugins and integrations.

Like the libraries, many Scala tools are inspired by languages other than Java. The most popular is the interactive console (REPL), very traditional in scripting languages and useful to test and learn the language and its libraries. Such tools can be unusual to the developer familiar with the Java environment, so the migration of the development environment needs to be analyzed carefully. Not all tools will have an exact parallel, but the differences may be beneficial to the development cycle.

Scala runs on both Java and .NET?

Scala has been designed to be independent of the underlying virtual machine, but the target was clearly the JVM. At the beginning of the project in 2004, a .NET version was developed, but it never gained much popularity and quickly became obsolete. However, in June 2011 a project conducted by the Ecole Polytechnique Fédérale de Lausanne achieved significant results in adapting Scala to the .NET platform, as explained in this article. The .NET version is still in early development and has several limitations, but it is already possible to run Scala programs on both platforms. For more information about the Scala port to .NET, see this interview with Martin Odersky and this page of the Scala website.

Portability to other platforms, even beyond Java and .NET, is an interesting feature of Scala. An innovative project in this sense is scala-llvm, an implementation of the Scala compiler for LLVM. However, such portability is still only a possibility, Java being the only platform that supports all the features, tools and libraries usually expected by enterprise developers.

Is Scala slow?

Leaps in level of abstraction are usually followed by performance criticism and Scala is no exception. In general, Java and Scala systems have similar runtime performance, since both are subject to the costs and benefits of the JVM. At compile time, however, the Scala compiler has much more work to do and is quite slow compared to the Java compiler.

Runtime Performance

Performance differences usually arise from the features of Scala that are not natively supported by the JVM. Some of these features, such as closures, are likely to be supported soon, but many others never will be. Therefore, lean code written in Scala at a high level of abstraction can be compiled to a large amount of bytecode, degrading runtime performance, as shown in this presentation.
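A common illustration of this effect is the `for` loop over a range, which the compiler rewrites into a `foreach` call taking a closure. The sketch below contrasts it with a hand-written `while` loop; the names are illustrative, and whether the difference is measurable depends on the JVM and its JIT compiler.

```scala
object LoopSketch {
  // High-level version: rewritten by the compiler into
  // (0 until n).foreach(i => total += i), i.e. a Range object plus a closure.
  def sumWithFor(n: Int): Long = {
    var total = 0L
    for (i <- 0 until n) total += i
    total
  }

  // Low-level version: compiles to a plain counter loop.
  def sumWithWhile(n: Int): Long = {
    var total = 0L
    var i = 0
    while (i < n) { total += i; i += 1 }
    total
  }
}
```

Both methods return the same result. The point is only that the first, shorter form carries more machinery under the hood than its source suggests.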

This can be a problem for performance sensitive programs, but there are workarounds. The most common is to analyze the generated bytecode using javap to understand what is happening under the hood and become familiar with the performance of the features and libraries. Understanding bytecode and tuning low level performance is not a simple task, but in traditional functional languages like Haskell and Scheme implementations, such analysis is much more complex, if possible.

However, it should be noted that the language is important for the performance of a system, but is not the only factor. The benefits obtained from concurrency utilities and other features may offset or even exceed the penalties from abstraction when measuring the final latency and throughput.

Compilation Performance

Some Scala features, such as the search for implicit conversions, can take a long time to compile, and the compiler still needs to check both the Scala type system and the type system of the underlying platform (JVM, CLR or other). All this makes compilation slower, but does so for the benefit of flexibility and platform independence.

The IDE or build tool can use incremental compilation to alleviate these problems, but for complex compilations or continuous integration, compiler performance can be an issue.

Scala binaries are not backward compatible?

Maintaining binary compatibility is a difficult choice for the developers of a programming language. If there are clear rules for the maintenance of compatibility, such as those in Chapter 13 of the Java Language Specification, then programmers can expect new code to be compatible with old binaries without recompiling everything.

On the other hand, not committing to backward compatibility makes a language much easier to evolve. This can be very important in the early stages of the language, to allow changing design decisions, fixing bugs and tuning features. The problem is that significant changes, as seen in the evolution of Scala, can cause incompatibilities and require libraries and applications to be recompiled for the same version of the language. This problem can be mitigated using bridge methods, migration tools, good documentation and processes, but there is no definitive solution.

For this reason, popular libraries in Scala have different distributions, compiled for different versions of the language. A single version of the BDD library specs, for example, can be found as specs_2.9.0 and specs_2.8.1, targeting these respective versions of Scala. Less popular libraries may need to be recompiled using the same version of the language used by the application. The build automation tools can help greatly in this task. SBT, for example, can easily cross-compile and generate binaries for several versions of Scala in a single run, as well as reference libraries correctly using both the version of the library and the version of the language.
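A build definition along these lines could be sketched as follows, assuming a recent sbt. The organization `org.example`, the library name `some-library` and its version are hypothetical placeholders, and the Scala versions are modern stand-ins for the 2.8.1 and 2.9.0 releases mentioned above.

```scala
// build.sbt (sketch; coordinates below are hypothetical)

// Scala versions to build against when cross-compiling.
crossScalaVersions := Seq("2.12.18", "2.13.12")

// %% appends the Scala binary version to the artifact name
// (for example some-library_2.13), selecting the matching distribution.
libraryDependencies += "org.example" %% "some-library" % "1.0.0"
```

Prefixing a task with `+`, as in `sbt +package`, runs it once for each version listed in `crossScalaVersions`.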

Conclusions

The advantages and disadvantages of Scala are often stated in opinions influenced by either enthusiasm or disappointment. To properly evaluate the technology, it is important to understand the context and the facts behind such opinions, as well as the relevant design choices.

All code examples presented in this article are available in this GitHub repository. Sincere thanks to the members of the Scala users list, which maintains a very active community and greatly helped this writing. If you are or have been a Scala developer, contribute with your opinion in the comments!

About the Author

Julio Faerman is a software engineer, developer and teacher, among other labels. Specialized in developing enterprise software with complex requirements and software process improvement, mostly for government and telecom.

Currently working at Red Hat / JBoss, after Borland, NEC and independent clients of his own company. Interested in a wide range of subjects, from algorithmic game theory to music and gardening, but always leaving time for community events and publications.

A comprehensive collection of references to useful information about Scala (although I've read it all before :) ). It's a good read for the advanced Java developer looking for balanced view of Scala. A refreshing contrast to all opinionated blog posts.

This is a very thorough article with lots of useful links to form one's opinion on the matter. The only thing which I'd like you to update is the references to the specs project. As can be read on the specs home page this project has been superseded by the specs2 project, so I prefer to have people directed to the proper place right away.

It always amazes me why people seem to assume that it is of benefit to save a couple lines of code by converting a function into an unreadable mess of characters. I've been a professional developer for 17 years and started out on super-cryptic languages like APL. I am here to say there is nothing easy to understand about "def fib(a: Int = 1 , b: Int = 1): Stream[Int] = a #:: fib(b, a+b)".

I have just started learning Scala and do not think it is overly complex when compared to other languages. If this is the case and the functional programming is the cause of this complexity then other FP languages should be lumped into this category as well.

The main thing that I have found to be difficult in the transition is to learn to think differently when writing code from OO to FP. Once I get past that paradigm then my code will follow. As far as the comparison to Java. I think Scala needs to stop this comparison as well as pointing out that Java programmers should consider moving to this language as it is a better alternative. If you have invested a large amount of time in Java then stay with Java. If you are looking for another language to add to your toolkit consider Scala or if you want to learn/use a new language add it.

I come from a Perl programming background and have several years of Java JVM support. I wanted to do Scala because I like the power it presented me with in the JVM. I grabbed a book Programming in Scala and am reading it now. I like the book it is clear and has helped me with my learning curve.

The great thing about coming from a Perl background is I know you are not going to know everything about a language in one sitting. So, some of the items in the language are going to seem complex until you learn what and why they are used. In the mean time you do not have to use those items to become productive in that language. Take baby steps using Baby Scala coding then move onto the complex items as you learn more.

Stating that one line is complex to someone who hasn't taken the time to learn the language is like saying the French or Portuguese language is complex after taking one semester of it in college. My point is using power tool sets takes time to learn and master.

I'm honestly perplexed by this kind of talk. Certainly "def" defining a function is no surprise, nor "a: Int" declaring a variable of type Int, nor "= 1" providing a default value, nor "fib(b,a+b)" invoking a function. Scala is following the lead of other quite common languages in *all* of that. Absolutely nothing new. And "Stream" is just a class, implementing the same concept with the same name in languages like Scheme, and "Stream[Int]" is only trivially different from the Java syntax for generics. The only thing remotely novel in all of this is the "#::" operator, but even that is clearly evocative of the "::" operator which performs the analogous function for lists not only in Scala but also the ML family of languages and probably others. How in the world could anybody find such fault with that line as to call it an "unreadable mess of characters"?

I understand that people may need to learn a bit before they can grok code in a new language, but it's quite a different matter to condemn its syntax because you aren't familiar with it. For someone with a reasonable background in computer science, and given that Scala is a statically typed language, that syntax is really quite unsurprising! Does anyone *really* think that after a modest effort to learn Scala the syntax shown above would continue to be an obstacle?

I completely agree. If something is shorter it does not mean it gives more productivity to the rest of the project/team. Imagine a big project written like that => it is hard to read and hard to debug.

That's really why the designers of Java have left it out for so long. Anonymous inner classes are ok because they implement an explicit interface that declares what the code block does. An anonymous function is an unlabeled blackbox. This loss of traceability is a real design concern with a language whose popularity and success stems from strictness. It's the same reason why node.js will not have the same success in enterprise as Java, even if it outperforms it.

I have been a professional programmer for many years, and while acknowledging that I like the compactness put forward by many of Scala's features, I completely agree with Dave here. I am keen to know, if by writing an expression like this, we are gaining anything in the generated Bytecode's execution.