On Fri, Sep 25, 2009 at 17:08, David A. Black <dblack / rubypal.com> wrote:
> One idiom I've seen is:
>
>   a = [1,2,3].each
>
>   [4,5,6].each do |n|
>     p n + a.next
>   end
>
> i.e., as a way to do parallel iterators. (Of course that particular
> example could be done with zip.)
Regardless of whether this would work on generative streams and
infinite sets, the other problem with zip is you effectively loop the
entire collection twice and possibly create a second collection that
is serving the purpose of temporary data storage. This goes against
the concept of deforestation [0] unless complex stream fusion
optimizations can be made by the compiler (not easy in this case).
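
To make the difference concrete, here is a minimal sketch of my own (not
code from this thread). The names sums_with_zip, sums_with_next and
naturals are made up for illustration. It contrasts the zip form, which
builds a temporary array of pairs and then walks it again, with the
external-enumerator form, which makes a single pass and also works when
the second stream is infinite:

```ruby
# Eager version: zip first builds a temporary array of [x, y] pairs,
# then map walks that temporary array a second time.
def sums_with_zip(xs, ys)
  xs.zip(ys).map { |x, y| x + y }
end

# External-enumerator version: a single pass over xs, pulling one value
# from ys on demand, with no intermediate array of pairs.
def sums_with_next(xs, ys)
  other = ys.each # each without a block returns an Enumerator
  xs.map { |x| x + other.next }
end

# A generative, infinite stream: values are produced only when a
# consumer asks for them via next.
def naturals
  Enumerator.new do |yielder|
    n = 0
    loop do
      yielder << n
      n += 1
    end
  end
end

p sums_with_zip([4, 5, 6], [1, 2, 3])  # => [5, 7, 9]
p sums_with_next([4, 5, 6], [1, 2, 3]) # => [5, 7, 9]
p sums_with_next([4, 5, 6], naturals)  # => [4, 6, 8]
```

Note that sums_with_next raises StopIteration if ys runs out before xs
does, so a real version would need to decide how to handle streams of
different lengths.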
Now it has been pointed out that Fibers are currently really slow. It
is kind of sad that the current implementation has these limitations
but there is no reason that certain platforms could not use much more
efficient code paths for faster fiber operation. Examples of how big
of a difference this can make can be seen in projects like LuaJIT [1].
The fact that garbage may be referenced is really a bad side effect of
keeping the iterator around far too long in some scope. I think this
is a problem of both iterations, though dealing with native threads is
certainly going to make thread management a harder problem.
For JRuby I would consider using a combination of the noted options
rather than just one. First, I would keep optimized Enumerator objects
for native types. This should be doable for most common collections and
would avoid threads and expensive context switching. We all love speed for
the common case. Next, I would consider having a thread pool around
for spinning up new iteration fibers for the cases of non-native
streams. I have the feeling that enumerators will become more
commonplace in Ruby code so this might be a good pattern to support
(generator-consumer pairs are quite useful).
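
As a rough illustration of the generator-consumer pattern itself (not of
how JRuby would or should implement it), here is a small sketch using a
plain Fiber. The names make_generator and consume are hypothetical, and
using nil as the end-of-stream marker is just an assumption that keeps
the example short:

```ruby
# Generator: a Fiber that hands one Fibonacci number at a time back to
# whoever resumes it, suspending itself between values.
def make_generator(limit)
  Fiber.new do
    a, b = 0, 1
    limit.times do
      Fiber.yield a
      a, b = b, a + b
    end
    nil # the block's final value tells the consumer the stream is done
  end
end

# Consumer: resumes the generator until it signals exhaustion with nil,
# doubling each value it receives.
def consume(generator)
  results = []
  while (value = generator.resume)
    results << value * 2
  end
  results
end

p consume(make_generator(5)) # => [0, 2, 2, 4, 6]
```

The consumer stops at the first nil and never resumes the fiber again,
since resuming a finished fiber raises FiberError.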
Longer term, I would find it interesting to discuss ways we can
express multi-stream operations in a safe way that allows the
implementation to perform intelligent fetching and fusion operations.
One thought is a list comprehension form which can help avoid common
problems in reducing dynamically dispatched calls.
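
Ruby has no comprehension syntax, so purely as a thought experiment, a
library-level approximation might look like the sketch below. The helper
lockstep_map is hypothetical (it is not an existing Ruby method). The
point is only that the enumerators live inside the helper and go out of
scope when it returns, and that the caller states what to combine rather
than how to fetch it:

```ruby
# Hypothetical helper: walk several streams in lockstep and collect the
# block's result for each tuple of values. The enumerators are local to
# this method, so none of them outlives the call.
def lockstep_map(*streams)
  enums = streams.map(&:each) # one external enumerator per stream
  results = []
  # loop rescues StopIteration, so iteration ends as soon as the
  # shortest stream is exhausted.
  loop { results << yield(*enums.map(&:next)) }
  results
end

p lockstep_map([4, 5, 6], [1, 2, 3]) { |x, y| x + y } # => [5, 7, 9]
```

One caveat of this sketch: a StopIteration raised inside the caller's
block would also end the loop silently.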
Brian.
[0] http://en.wikipedia.org/wiki/Deforestation_(computer_science)
[1] Coco (not Cocoa) is worth checking out: http://coco.luajit.org/