How much time do we have to wait to see some parallel processing features in D?
People are getting more and more rabid because they have few ways to use their
2-4 core CPUs.
Classic multithreading is useful, but sometimes it's not easy to use correctly.
There are other ways to write parallel code that D may adopt (more than one
way is probably better; no silver bullet exists in this field). Their point is
to allow using the 2-4+ core CPUs of today (and maybe the 80-1000+ cores of
the future) in non-speed-critical parts of the code, where the programmer wants
to use the other cores anyway, without too much programming effort.
I think Walter wants D to be a multi-paradigm language; one of the best ways
to allow multiprocessing in a simple and safer way is stream processing
(http://en.wikipedia.org/wiki/Stream_Processing ); D's syntax may grow a few
constructs to support that kind of programming in a simple way (C++ has some
such libs, I think).
Another easy way to perform multiprocessing is vectorization: the compiler can
automatically use all the cores to evaluate expressions like
array1 + array2 + array3.
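To make concrete what such compiler-driven vectorization across cores amounts to, here is a hand-written C++ sketch (the function name and thread count are made up for illustration) that evaluates array1 + array2 + array3 by splitting the index range across threads:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch: evaluate dst = a + b + c elementwise, splitting the
// index range across a fixed number of worker threads. A compiler doing the
// vectorization described above could emit code along these lines.
std::vector<int> parallel_sum3(const std::vector<int>& a,
                               const std::vector<int>& b,
                               const std::vector<int>& c,
                               unsigned nthreads = 4) {
    std::vector<int> dst(a.size());
    std::vector<std::thread> workers;
    std::size_t chunk = (a.size() + nthreads - 1) / nthreads;
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t lo = t * chunk;
        std::size_t hi = std::min(lo + chunk, a.size());
        if (lo >= hi) break;
        workers.emplace_back([&, lo, hi] {
            for (std::size_t i = lo; i < hi; ++i)
                dst[i] = a[i] + b[i] + c[i];  // disjoint ranges: no locking needed
        });
    }
    for (auto& w : workers) w.join();
    return dst;
}
```

Because each thread writes a disjoint slice of the output, no synchronization beyond the final joins is needed - which is exactly why elementwise array expressions are such an easy target for automatic parallelization.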
Another way to perform multiprocessing is to add to D's syntax a parallel_for
(plus a few related constructs to merge results back, etc.) like the one
present in the "Parallel Pascal" language. Such constructs are much simpler to
use correctly than threads. Sun's new "Fortress" language shows similar
features, but they are more refined than the Parallel Pascal ones (and they
look more complex to understand and use, so they may be overkill for D, I
don't know. Some of those parallel features of Fortress look quite difficult
to implement, to me).
Some time ago I saw a form of parallel_for and the like in a small, easy
language from MIT, and I think those are simple enough.
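The "merge things back" part of a parallel_for is essentially a reduction. Here is a minimal hand-written C++ sketch of what such a construct could desugar to (the names are illustrative, not taken from Parallel Pascal or any D proposal): each thread reduces its own slice into a private accumulator, and the partial results are combined at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sketch of a parallel_for-with-merge (a parallel reduction): each worker
// sums a contiguous slice into its own accumulator slot, avoiding races,
// and the partial sums are merged sequentially after the joins.
long parallel_reduce(const std::vector<int>& data, unsigned nthreads = 4) {
    std::vector<long> partial(nthreads, 0);
    std::vector<std::thread> workers;
    std::size_t n = data.size();
    std::size_t chunk = (n + nthreads - 1) / nthreads;
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t lo = t * chunk, hi = std::min(lo + chunk, n);
        if (lo >= hi) break;
        workers.emplace_back([&, t, lo, hi] {
            for (std::size_t i = lo; i < hi; ++i)
                partial[t] += data[i];  // private accumulator, no locking
        });
    }
    for (auto& w : workers) w.join();
    return std::accumulate(partial.begin(), partial.end(), 0L);  // merge step
}
```

The pattern only works when the merge operation is associative, which is the kind of constraint a built-in parallel_for would have to document or enforce.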
Other ways to write parallel code are now being pushed by Intel, by OpenMP,
and by Nvidia's hairy but usable CUDA (I am not sure I want to learn CUDA;
it's a C variant, but it seems to require a large human memory and a large
human brain to use, while I think D may offer simpler built-in features. The
more "serious" D programmers may use external libs that allow them any fine
control they want). To me these look too much in flux right now to be copied
too heavily by D.
Bye,
bearophile

How much time do we have to wait to see some parallel processing features in
D? [...] Another way to perform multi processing is to add to the D syntax the
parallel_for (and few related things to merge things back, etc) syntax that
was present in the "Parallel Pascal" language. Such things are quite simpler
to use correctly than threads. [...]

I asked for parallelization support for foreach... well, ages ago. At
the time Walter said no because DMD was years away from being able to do
anything like that, but perhaps with the new focus on multiprogramming
one can argue more strongly that it's important to get something like
this into the spec even if DMD itself doesn't support it. My request was
pretty minimal, and partially a reaction to foreach_reverse. It was:

    foreach( ... )       // defaults to "fwd"
    foreach(fwd)( ... )
    foreach(rev)( ... )
    foreach(any)( ... )

Thus foreach(any) is eligible for parallelization, while fwd and rev are
what we have now. This would be easy enough with templates and another
keyword:

    apply!(fwd)( ... )

etc.
But passing a delegate literal as an argument isn't nearly as nice as
the built-in foreach. And Tom's (IIRC) proposal to clean up the syntax
for this doesn't look like it will ever be accepted.
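As a rough illustration of the template route (rendered in C++ rather than D, with made-up tag names Fwd/Rev/Any), the traversal policy can be a compile-time tag, and only the unordered policy is allowed to run iterations in parallel:

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Illustrative policy tags: only Any permits reordering/parallelism.
struct Fwd {}; struct Rev {}; struct Any {};

template <typename F> void apply(Fwd, std::vector<int>& v, F body) {
    for (std::size_t i = 0; i < v.size(); ++i) body(v[i]);   // in order
}
template <typename F> void apply(Rev, std::vector<int>& v, F body) {
    for (std::size_t i = v.size(); i-- > 0; ) body(v[i]);    // reversed
}
template <typename F> void apply(Any, std::vector<int>& v, F body) {
    // Order unspecified, so we are free to split the range across threads.
    unsigned nthreads = 2;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t)
        workers.emplace_back([&v, body, t, nthreads] {
            for (std::size_t i = t; i < v.size(); i += nthreads) body(v[i]);
        });
    for (auto& w : workers) w.join();
}
```

This captures the complaint above, too: the call site `apply(Any{}, v, lambda)` is noticeably clumsier than a built-in `foreach(any)` would be.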

Other ways to use parallel code are now being pushed by Intel, OpenMP, and the
hairy but usable CUDA by Nvidia [...]

D already has coroutines, DCSP, and futures available from various
programmers (Mikola Lysenko for the first two), so I think the state of
multiprogramming in D is actually pretty good even without additional
language support.
Sean


How much time do we have to wait to see some parallel processing features in
D? People are getting more and more rabid because they have few ways to use
their 2-4 core CPUs.
Classic multithreading is useful, but sometimes it's not easy to use correctly.

Grow a pair and use threads. It's not _that_ hard.

[...] Another easy way to perform multi processing is to vectorize. It means
the compiler can automatically use all the cores to perform operators like
array1+array2+array3.

Patched GDC supports autovectorization with -ftree-vectorize, although that's
single-core.
One of the good things IMHO about D is that its operations are mostly easy to
understand, i.e. there's little magic going on. PLEASE don't change that.

Another way to perform multi processing is to add to the D syntax the
parallel_for (and few related things to merge things back, etc) syntax that
was present in the "Parallel Pascal" language. Such things are quite simpler
to use correctly than threads. [...]

auto tp = new Threadpool(4);
tp.mt_foreach(Range[4], (int e) { });
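For what it's worth, that two-line sketch can be fleshed out in a few lines. Here is a hand-written C++ rendering (Threadpool and mt_foreach follow the sketch's names; this is an illustration, not a real library): workers pull indices from a shared atomic counter, so load balancing is dynamic without per-element locking.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Minimal mt_foreach: N workers repeatedly claim the next unprocessed index
// from an atomic counter and run the body on it, then everything is joined.
struct Threadpool {
    unsigned nthreads;
    explicit Threadpool(unsigned n) : nthreads(n) {}

    void mt_foreach(std::size_t count,
                    const std::function<void(std::size_t)>& body) {
        std::atomic<std::size_t> next{0};
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < nthreads; ++t)
            workers.emplace_back([&] {
                // fetch-and-increment hands each index to exactly one worker
                for (std::size_t i = next++; i < count; i = next++)
                    body(i);
            });
        for (auto& w : workers) w.join();
    }
};
```

The atomic-counter scheme means slow iterations don't stall a whole pre-assigned chunk, at the cost of one atomic operation per element.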

Other ways to use parallel code are now being pushed by Intel, OpenMP, and the
hairy but usable CUDA by Nvidia [...]

Please, no hardware specific features. D is x86 dependent enough as it is, it
would be a bad idea to add dependencies on _graphics cards_.


IMHO what's really needed is good tools to discover interactions between
threads. I'd like a standardized way to grab debug info, like the current
backtrace of a std.thread.Thread. This could be used to implement fairly
sophisticated logging.
Also, something I have requested before: single-statement function bodies
should be able to omit their {}s, to bring them in line with normal loop
statements. This sounds like a hack, but which is better?
    void test()
    {
        synchronized(this)
        {
            ...
        }
    }

or

    void test() synchronized(this)
    {
        ...
    }
--downs

How much time do we have to wait to see some parallel processing
features in D? People are getting more and more rabid because they have
few ways to use their 2-4 core CPUs. Classic multithreading is useful,
but sometimes it's not easy to use correctly.

A very short answer: for true parallel processing, 2-4 processors is
nothing. The success of CFLs (Control-Flow Languages) like C, C++, D,
Pascal, Perl, Python, BASICs, Cobol, Comal, PL/I, whitespace, malbolge,
etc. is that they follow the underlying paradigm of the computer.
There have been many efforts to design languages that are implicitly
parallel. The most used approach is to use DFL (Data-Flow Language)
paradigms, and the most well-known of these is definitely VHDL. Others are
e.g. NESL and ID. Then there are several languages that are either
in-between, like functional programming languages (Haskell, Erlang), or
reductive languages (like make and Prolog).
Short references:
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=714561
http://portal.acm.org/citation.cfm?id=359579&dl=GUIDE
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=630241
Especially Hartenstein's articles are good to read if you are trying to
understand why we are still using CFLs & RASP, and why parallel
architectures have failed.
No, the future will not show us any more parallelism at the source level.
Instead, (a) compilers will start to understand source better, to
parallelize the inner kernels of loops automatically, and (b) there will be
even more layers between the source we write and the instructions/
configurations processors execute, and thus the main purpose of a source
language will no longer be to follow the underlying paradigm of the machine,
but productivity - how easy it is for humans to express things; and in this
area CFL languages are far from their counterparts. For a comparison of
CFL/DFL at the compiler level, see e.g.
http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/fccm/1995/7086/00/7086toc.xml&DOI=10.1109/FPGA.1995.477423
If I were asked to say what the way of writing future programs is, I
would say it is MPS (Message Passing Systems); refer e.g. to Hewitt's
Actor Model (1973). Furthermore, I would predict that processors will
start to do low-level reconfigurations, e.g. the RSIP (Reconfigurable
Instruction Set Processor) paradigm. Google for GARP and the awesome
performance increases it can offer for certain tasks.
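The message-passing style predicted here can be illustrated with a tiny hand-written mailbox - a C++ sketch of the shared-nothing idea, not Hewitt's full actor model (the class and method names are made up): threads communicate only by sending values through locked queues.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// A mailbox: the only channel between threads. send() enqueues a message;
// receive() blocks until one is available. No other state is shared.
template <typename T>
class Mailbox {
    std::queue<T> q;
    std::mutex m;
    std::condition_variable cv;
public:
    void send(T msg) {
        { std::lock_guard<std::mutex> lk(m); q.push(std::move(msg)); }
        cv.notify_one();
    }
    T receive() {  // blocks until a message arrives
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return !q.empty(); });
        T msg = std::move(q.front());
        q.pop();
        return msg;
    }
};
```

A minimal "actor" is then just a thread looping on receive(), e.g. one that doubles each number it is sent and mails the result back on a reply mailbox.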

If I were asked to say what the way of writing future programs is, I
would say it is MPS (Message Passing Systems); refer e.g. to Hewitt's
Actor Model (1973).

I agree completely. MP is easy to comprehend (it's how people naturally
operate) and the tech behind it is extremely well established. I remain
skeptical that we'll see a tremendous amount of automatic parallelization
of ostensibly procedural code by the interpreter (i.e. compiler or VM). For
one thing, it complicates debugging tremendously, not to mention the
error conditions that such translation can introduce.
As a potentially relevant anecdote: after Herb Sutter's presentation on
Concur a year or two ago at SDWest, I asked him what should happen if
two threads of an automatically parallelized loop both throw an exception,
given that the C++ spec dictates that having more than one in-flight
exception per thread should call terminate(). He dodged my question and
turned to talk to someone else - who, interestingly enough, did make an
attempt to ensure that Herb understood what I was asking, but to no avail.
Implications about Herb aside, I do think this suggests that there are
known problems with implicit parallelization that everyone is hoping will
just magically disappear. How can one verify the correctness of code that
may fail if implicitly parallelized but works if not?
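One conventional answer to that question, sketched here by hand in modern C++ (this is not anything Concur actually specified, and the helper name is made up): each worker catches its own exception into a std::exception_ptr, and the spawning thread rethrows one of them after joining. No thread ever has two in-flight exceptions - at the cost of silently discarding all but the first captured error.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <stdexcept>
#include <thread>
#include <vector>

// Run all tasks concurrently. Exceptions never escape a worker thread;
// they are captured per slot and one is rethrown on the calling thread.
void parallel_run(const std::vector<std::function<void()>>& tasks) {
    std::vector<std::exception_ptr> errors(tasks.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < tasks.size(); ++i)
        workers.emplace_back([&, i] {
            try { tasks[i](); }
            catch (...) { errors[i] = std::current_exception(); }
        });
    for (auto& w : workers) w.join();
    for (auto& e : errors)
        if (e) std::rethrow_exception(e);  // rethrow the first captured error
}
```

Even this "solution" illustrates the verification problem above: code that relied on seeing every exception behaves differently once the loop is parallelized this way.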

Furthermore, I would predict that processors will start to do low-level
reconfigurations, e.g. the RSIP (Reconfigurable Instruction Set
Processor) paradigm. Google for GARP and the awesome performance
increases it can offer for certain tasks.

Interestingly, parallel programming is the topic covered by Communications
of the ACM this month, and I believe there is a bit about this sort of
hardware parallelism in addition to transactional memory, etc. The articles
I've read so far have all been well-reasoned and pretty honest about the
benefits and problems of each idea.
Sean

If I were asked to say what the way of writing future programs is, I
would say it is MPS (Message Passing Systems); refer e.g. to Hewitt's
Actor Model (1973).

I agree completely. MP is easy to comprehend (it's how people naturally
operate) and the tech behind it is extremely well established.

I couldn't agree more. MP is a very natural way for us humans to organize
parallel things. But there is even more behind it; the very fundamental
reason that prevents computers from becoming PRAM machines is this world
around us. It restricts all physical machines, including computers, to a
maximum of three spatial dimensions and to inherently neighborhood-connected
models; and those are very, very far from the ideal PRAM model...

I remain
skeptical that we'll see a tremendous amount of automatic
parallelization of ostensibly procedural code by the interpreter (i.e.
compiler or VM). For one thing, it complicates debugging tremendously,
not to mention the error conditions that such translation can introduce.

Another thing I completely agree with. It is not about what would ideally
be best; it is the reality that matters. Debugging a highly parallel
thing, e.g. FPGA hardware, is a very, very time-consuming task.

As an potentially relevant anecdote, after Herb Sutter's presentation on
Concur [...]

Many highly skilled people are very attached to the great ideas they have
in their minds. I'm no exception :)

Furthermore, I would predict that processors will start to do low-level
reconfigurations, e.g. the RSIP (Reconfigurable Instruction Set Processor)
paradigm. Google for GARP and the awesome performance
increases it can offer for certain tasks.

Interestingly, parallel programming is the topic covered by Communications
of the ACM this month, and I believe there is a bit about
this sort of hardware parallelism in addition to transactional memory,
etc. The articles I've read so far have all been well-reasoned and
pretty honest about the benefits and problems of each idea.

If reconfigurable computers - and more or less distributed computing -
do not come as the next major processor architectures, I will go to some
distant place and be ashamed. They are neither ideal nor optimal computers,
far from it - programming one is very laborious and it is very hard for
compilers. But they just work.

How much time do we have to wait to see some parallel processing features in
D? [...] Another easy way to perform multi processing is to vectorize. It
means the compiler can automatically use all the cores to perform operators
like array1+array2+array3. [...]

I'm hoping that the new "pure" stuff Walter is working on will enable
the compiler to automatically parallelize things like foreach. It won't
be as fast as something that's hand-tuned, but it will
be a hell of a lot easier to write.
-Joel
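What purity buys can be shown by hand: if the loop body is a pure function of its element, iterations may run in any order or concurrently and the result matches the sequential loop exactly. A C++ sketch of the transformation such a "pure"-aware compiler might perform (all names here are illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

int square(int x) { return x * x; }  // pure: result depends only on x

// Hand-parallelized equivalent of: foreach i, out[i] = square(in[i]).
// Each thread takes a strided subset of indices; since square() has no
// side effects and each slot is written once, the result is deterministic.
std::vector<int> parallel_map(const std::vector<int>& in,
                              unsigned nthreads = 4) {
    std::vector<int> out(in.size());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t)
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < in.size(); i += nthreads)
                out[i] = square(in[i]);
        });
    for (auto& w : workers) w.join();
    return out;
}
```

The key check a compiler would do is exactly what makes this safe by construction here: the body calls only a pure function and touches only its own element.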
