Re: Actors performance versus traditional Java threading

3 replies

Tue, 2010-04-27, 06:15

matlik

The actor model of concurrency in Scala is an abstraction on top of Java threads. There is additional overhead when context switching between actors that wouldn't be there if implemented with vanilla threads. As a result, it is generally better to implement CPU intensive and performance demanding code with the traditional threading model. Actors tend to be a better fit for parallel processing units that aren't as demanding on the CPU, and may also be a better fit for distributed parallel computing (higher latency but more throughput).

I kind of look at actors as the garbage collection of concurrency. There is a performance cost for using them, but your code quality will likely be greater because the system takes care of more for you. Of course, the more you know about how it works under the hood, the better you can make it work.
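
To make "an abstraction on top of Java threads" concrete, here is a minimal sketch in Java. It is not how `scala.actors` is implemented (that library schedules actors on a thread pool); it assumes the simplest possible design of one dedicated thread draining one mailbox. The names `CounterActor`, `send`, `stopAndGetTotal` and `demo` are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal "actor": private state, a mailbox, one thread draining it.
final class CounterActor {
    // Sentinel message that tells the worker loop to finish (an assumption of this sketch).
    private static final long STOP = Long.MIN_VALUE;

    private final BlockingQueue<Long> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker;
    // Only the worker thread touches this field while running, so no lock is needed.
    private long total = 0;

    CounterActor() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    long msg = mailbox.take(); // blocks until a message arrives
                    if (msg == STOP) {
                        return;
                    }
                    total += msg;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
    }

    // Asynchronous send: the caller only enqueues and never touches the state.
    void send(long amount) {
        mailbox.add(amount);
    }

    // Enqueue STOP, wait for the worker to finish, then read the final state.
    long stopAndGetTotal() {
        mailbox.add(STOP);
        try {
            worker.join(); // join makes the worker's writes visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }

    // Several sender threads each send the value 1 to a single actor.
    static long demo(int senders, int messagesPerSender) {
        CounterActor actor = new CounterActor();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < messagesPerSender; j++) {
                    actor.send(1);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return actor.stopAndGetTotal();
    }
}
```

The overhead discussed above shows up here as the queue hand-off on every message; what it buys is that the counter is never shared between threads, so the senders need no locks.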

Sorry if this reopens an old, already-settled issue, but I was reading that the performance of Scala's Actor pattern is not yet able to compete with the classical Java threading model.
Is this a JVM design issue (it being primarily a Java-oriented VM)?
The Actor pattern is without a doubt superior to the Java threading model, so how do we get there? Does this continue to be true?
Thanks in advance,
Fabio Kaminski

On Tue, 2010-04-27 at 01:15 -0400, James Matlik wrote:
> The actor model of concurrency in Scala is an abstraction on top of
> Java threads. There is additional overhead when context switching
> between actors that wouldn't be there if implemented with vanilla
> threads. As a result, it is generally better to implement CPU
> intensive and performance demanding code with the traditional
> threading model. Actors tend to be a better fit for parallel
> processing units that aren't as demanding on the CPU, and may also be
> a better fit for distributed parallel computing (higher latency but
> more throughput).
>
But do not be seduced by the dark side. Threads per se are not a bad
way of accessing concurrency and parallelism, but as soon as you do any
shared memory working (and so need locks, semaphores, monitors...)
the explicit thread programming becomes the enemy of maintainable code.

Experimentation -- not yet turned into statistically significant data,
so just anecdotal evidence -- based on a couple of problems that must be
treated as microbenchmarks indicates that Python-CSP with judicious use
of C or C++ is essentially as fast as pure C/PThreads and
C++/Just::Thread. On the JVM Groovy/GPars/GroovyCSP with judicious use
of Java is as fast as pure Java with threads.

Sequential Scala is basically the same speed as sequential Java.
Currently my code using threads and a SyncVar, actors and Scalaz ParMap
are showing worrying signs of a subtle locking bug so I don't believe
the data from them just now. Once the problem is found, I would expect
there to be such a small overhead to the actor and ParMap solutions
that all thoughts of explicit shared-memory multi-threading will be
washed away completely.

This is an embarrassingly parallel, computation intensive code, so is
only one of a number of different problems that are solved with
concurrency and parallelism. It is though the one where most argument
about "grunt" performance is likely to happen.

> I kind of look at actors as the garbage collection of concurrency.
> There is a performance cost for using it, but your code quality will
> likely be greater because the system takes care of more for you. Of
> course, the more you know about how it works under the hood, the
> better you can make it work.

There may be a cost but anecdotal evidence indicates that cost can be
very small compared to the cost of the computation and so ignorable.
The only way of debating this is, I'm afraid, with actual metrics and
data not with qualitative argument. Sadly, I only have anecdotal
evidence just now, but I am hoping to turn it into statistical evidence
soon.

Data parallel approaches, CSP, dataflow models and actors are always
going to be better ways of building parallel and concurrent applications
compared to using shared-memory multi-threading. So on the moral of the
story I think everyone is going to agree.
>
> > On Apr 26, 2010 10:28 PM, "Fabio Kaminski"
> > wrote:
> >
> > Sorry about maybe reentering an old-patched issue but i was reading
> > about Actor pattern performance of Scala not been there yet for
> > compete against classical java Threading model..
> >
The classic Java Threading model is not for applications development, it
is an implementation infrastructure for more appropriate abstractions.
java.util.concurrent gives us Futures and (JCP willing) Parallel Arrays.
These are far, far better tools of application development than using
shared memory multi-threading. The overhead of using them is present
but it is minimal compared to the resources used for computation, at
least in the computationally intensive little tests I am running. In
non-computationally intensive cases the overhead will be even smaller.
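
As an illustration of the `java.util.concurrent` style described above, here is a minimal sketch in Java that splits an embarrassingly parallel computation across a thread pool and collects the partial results through `Future`s. It is not the benchmark code referred to in this thread; the class name `FutureSum`, the method `sumOfSquares` and the choice of workload are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: sum of i*i for i in 1..n, split into independent tasks.
final class FutureSum {
    static long sumOfSquares(long n, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = (n + tasks - 1) / tasks; // ceiling division
            for (int t = 0; t < tasks; t++) {
                final long lo = t * chunk + 1;
                final long hi = Math.min(n, (t + 1) * chunk);
                // Each task works only on its own locals: no shared mutable state.
                Callable<Long> task = () -> {
                    long s = 0;
                    for (long i = lo; i <= hi; i++) {
                        s += i * i;
                    }
                    return s;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : parts) {
                total += f.get(); // blocks until that partial result is ready
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The only synchronisation is inside `submit` and `get`, so there are no explicit locks in application code, and for a compute-heavy loop the per-task overhead is paid once per chunk rather than once per element.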
> >
> > is this a JVM design issue (been primarily a java perspective Vm) ?
> >
Not really, but I don't have a strong argument to back that up :-(
> >
> > as Actor pattern is without a doubt a superior pattern against java
> > threading model.. how to get there?
> >
Just do it :-) Don't forget data parallelism, dataflow and CSP as
alternate models of avoiding shared-memory multi-threading. The trick
is to study your solution to your problem in the light of there being
more than one possible good architecture, then choosing the architecture
that provides the smallest translation distance between that solution
and its expression in code that is maintainable and fast enough.

Experience in data mining is leading people to stop running SQL queries
against databases and instead dump out the entire database and stream it
through a dataflow or CSP model (I am sure actors could be used as
well). I don't have figures for this, but I know people who do. On 16
and 32 processor systems they are seeing jobs that used to take weeks
now taking a few minutes. This fundamentally changes the way people
work.

The moral of the story is to challenge accepted "dogma". In this case I
challenge that shared-memory multi-threading is the way of writing
concurrent and parallel applications. To date I have very little
evidence that the challenge will fail -- but it is still anecdotal and
not statistical evidence.
> >
> > is this continues to be true?
> >
> >
> > Thanks in advance,
> >
> >
> > Fabio Kaminski
> >
> >
>

On the whole, I agree with Russel, but this is dependent upon the application's requirements. For example, a while back a Scala implementation of a memcached style service was created using Actors. Performance was of critical importance, so as the tool was benchmarked, it evolved away from using Actors due to the overhead they introduce.

For most business applications, Actors should be fast enough, and I believe the improved code quality/correctness far outweighs the performance overhead. This is particularly true when the application can be implemented to be massively parallel. But Actors are just another tool in your toolbox, and may not be the best fit for a solution that is constrained to only a few parallel threads, is heavily CPU bound, and must have the highest performance. It also may not make sense to use Actors if implementing a high performance application for lower end desktop machines that can't effectively take advantage of parallelization (too few cores)... if performance is of primary concern.

Note that there are multiple implementations of Actors available in Scala. From what I've read (no practical experience) Akka has the highest performance Actor implementation, with many other interesting features that lend themselves well to distributed load and high availability.


On a related note, Scala 2.8.0 adds a new `scala.actors.Reactor` trait,
which provides actors that are faster and more lightweight than
instances of `scala.actors.Actor`. To make this possible, `Reactor`s
implement only a subset of the functionality of `Actor`s (for instance,
they do not maintain a dynamic `self` reference, they do not transmit
implicit `sender` references, and they cannot be suspended in thread
mode via `receive`).

Cheers,
Philipp
