Does the Spliterator have an internal repository?

Please have a look at the lines of code below and the questions that follow.

The list remains intact even after splitting the spliterator. So, I ask:
Does the spliterator create a collection (repository) to which it copies the content of the list, and then operate on this spliterator-created collection?
Does the spliterator also create a separate collection in which to store the left-hand portion of the split?

I think a spliterator MAY copy the source collection, but I do not think that any actually DO.

Spliterators created by collections have access to the internals of the collection. That means they can directly access the data structure that the collection is based on. A spliterator for an ArrayList can just keep track of the index of the element that it will return next, and when you call trySplit() on it, it may just cut the number of elements it will iterate over in half.

The original collection or its underlying structure is unaffected.

Here is an example of what it MIGHT look like. It probably looks completely different though, because there are a lot of issues to consider when implementing a spliterator:
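A minimal sketch of such a spliterator, assuming a hypothetical class named ArraySpliterator that shares one backing array and only tracks a half-open index range. It is not the JDK's implementation, and the line numbers cited later in this thread refer to the listing as originally posted, not to this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;

public class ArraySpliteratorSketch {

    /** Hypothetical spliterator over a shared array; the array is never copied. */
    static final class ArraySpliterator<E> implements Spliterator<E> {
        private final E[] array;  // the same instance in every split-off spliterator
        private int index;        // next element to hand out (inclusive)
        private int toIndex;      // end of this spliterator's range (exclusive)

        ArraySpliterator(E[] array) {
            this(array, 0, array.length);
        }

        private ArraySpliterator(E[] array, int fromIndex, int toIndex) {
            this.array = array;
            this.index = fromIndex;
            this.toIndex = toIndex;
        }

        @Override
        public boolean tryAdvance(Consumer<? super E> action) {
            if (index >= toIndex) {
                return false;              // range exhausted
            }
            action.accept(array[index++]);
            return true;
        }

        @Override
        public Spliterator<E> trySplit() {
            int splitIndex = index + (toIndex - index) / 2;
            if (splitIndex == index) {
                return null;               // fewer than two elements left
            }
            Spliterator<E> secondHalf = new ArraySpliterator<>(array, splitIndex, toIndex);
            this.toIndex = splitIndex;     // the only change to the original
            return secondHalf;
        }

        @Override
        public long estimateSize() {
            return toIndex - index;
        }

        @Override
        public int characteristics() {
            // ORDERED is deliberately not reported: an ORDERED spliterator must
            // return a prefix from trySplit(), and this sketch returns the suffix.
            return SIZED | SUBSIZED;
        }
    }

    /** Collects whatever the given spliterator still covers. */
    static <E> List<E> drain(Spliterator<E> spliterator) {
        List<E> seen = new ArrayList<>();
        spliterator.forEachRemaining(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        Integer[] data = {2, 4, 6, 8, 10};
        Spliterator<Integer> firstHalf = new ArraySpliterator<>(data);
        Spliterator<Integer> secondHalf = firstHalf.trySplit();
        System.out.println(drain(firstHalf));   // [2, 4]
        System.out.println(drain(secondHalf));  // [6, 8, 10]
    }
}
```

Both spliterators hold a reference to the same array; splitting only changes the index range each one is allowed to visit.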

Biniman Idugboe

Ranch Hand

Posts: 215

3

posted 1 week ago

My questions are in the image insert below.

The-spliterator.PNG

Stephan van Hulst

Saloon Keeper

Posts: 10136

214

posted 1 week ago

Please, in the future post all questions as plain text in your message. Images are hard to work with.

Spliterators don't "contain" any elements at all, nor do they enclose a collection. They're just objects that have access to the internals of a collection that does contain the elements. The two spliterators in the example above reference the SAME array instance. The split simply comes from the fact that the original spliterator promises to iterate no further than the first half, and the new spliterator promises to iterate nothing but the other half. This is implemented using the fromIndex and toIndex fields of the spliterators.

The array is populated by the collection itself, when you call methods on it like add().

Biniman Idugboe

Apologies for using an image. I was just trying to visualize the concept.

Stephan van Hulst wrote: Spliterators don't "contain" any elements at all, nor do they enclose a collection. They're just objects that have access to the internals of a collection that does contain the elements.

Right. The previous example did show that the array and ArraySpliterator are in the same implementation of ArrayList.

It promises to iterate only from splitIndex to toIndex. Where is the promise to iterate only from 0 to splitIndex - 1? Is there automatic coding going on behind the scenes?
Do the two halves actually iterate over the original spliterator, or over the underlying collection?

Stephan van Hulst

The original spliterator iterates from 0 to the end of the array, as you can see in the parameterless constructor.

When trySplit() is called, the toIndex field of the original spliterator is modified, as you can see on line 67.

Biniman Idugboe

Suppose I have the following:

That gives me a spliterator that iterates from 0 to the end of the ArrayList.

That gives me a spliterator that iterates from splitIndex to toIndex. Although toIndex is now set to splitIndex, the trySplit() method ends without executing the new ArraySpliterator<>().spliterator() again. So, how is the promise to only iterate from 0 to toIndex for the first half fulfilled?
Besides, since the secondHalf has already taken the elements from splitIndex to the end of the ArrayList, should line 67 not be this.toIndex = splitIndex - 1?
Bear with me. I'm trying to be sure I understand the concept.

Stephan van Hulst

Biniman Idugboe wrote: Although toIndex is now set to splitIndex, the trySplit() method ends without executing the new ArraySpliterator<>().spliterator() again. So, how is the promise to only iterate from 0 to toIndex for the first half fulfilled?

You can't call spliterator() on a Spliterator because Spliterator has no such method. Why would it, and why would such a method need to be called? Whoever called trySplit() had a reference to a spliterator that covered all elements, and when the call returns they have references to two spliterators that each cover one half of the elements. It's their responsibility to continue using the two spliterators in parallel. Thankfully you don't have to worry about it. This is all handled by the terminal operations.

Biniman Idugboe wrote: Besides, since the secondHalf has already taken the elements from splitIndex to the end of the ArrayList, should line 67 not be this.toIndex = splitIndex - 1?

No. The toIndex is exclusive. On line 47 you can see the spliterator already stops advancing when index is equal to toIndex.

Biniman Idugboe

What would happen if I did the following:

Stephan van Hulst

Biniman Idugboe wrote: What would happen if I did the following:

Absolutely nothing, because you can only return one spliterator from the method and your firstHalf spliterator will be unused. If you give more context to your hypothetical situation (e.g. write an entire example method) then maybe I can give you a more satisfactory answer.

The 0 is not exclusive. You need to get used to the convention that in Java, when specifying a range, the starting index is inclusive and the ending index is exclusive. For another example, take a look at String.substring(int beginIndex, int endIndex).
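A short illustration of that convention, using String.substring() and List.subList() from the standard library; the sample values are arbitrary.

```java
import java.util.List;

public class HalfOpenRanges {
    public static void main(String[] args) {
        String word = "spliterator";
        // beginIndex 0 is included, endIndex 5 is excluded: characters 0 to 4.
        System.out.println(word.substring(0, 5));   // split

        List<Integer> numbers = List.of(2, 4, 6, 8);
        // Same convention: the elements at indices 0 and 1, but not 2.
        System.out.println(numbers.subList(0, 2));  // [2, 4]
        // Because the end is exclusive, the next range starts at that same index.
        System.out.println(numbers.subList(2, 4));  // [6, 8]
    }
}
```

This is why a split at splitIndex needs no "- 1": one range ends at splitIndex (exclusive) and the other begins at splitIndex (inclusive), so no element is skipped or visited twice.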

No. The array is untouched. Nothing is cut away from it. Thinking about spliterators as cutting parts off the array will only get you into trouble when you do other things with spliterators. The range is cut in half, NOT the data source.
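This can be checked against the JDK's own ArrayList. The sketch below splits a spliterator, drains both parts, and prints the list afterwards. Note that the JDK hands out the halves the other way round from the example in this thread: the spliterator returned by trySplit() covers the first half, and the original keeps the second. The point about the data source is the same either way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;

public class SplitLeavesListIntact {

    /** Collects whatever the given spliterator still covers. */
    static List<Integer> drain(Spliterator<Integer> spliterator) {
        List<Integer> seen = new ArrayList<>();
        spliterator.forEachRemaining(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(List.of(2, 4, 6, 8));

        Spliterator<Integer> original = list.spliterator();
        Spliterator<Integer> splitOff = original.trySplit();

        System.out.println(drain(splitOff));  // [2, 4]
        System.out.println(drain(original));  // [6, 8]
        System.out.println(list);             // [2, 4, 6, 8], nothing was removed
    }
}
```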

It's not that complicated. When a stream is created from a collection, it will be wrapped around a spliterator that's responsible for iterating over all the elements in the collection. When trySplit() is called on the spliterator (by other code in the Stream API, not by you) and it succeeds, the original spliterator is modified so it will only be responsible for half of what it did before, and it will return a new spliterator that is responsible for iterating over the remaining elements. The code that called trySplit() will create a task to run the second spliterator concurrently with the original one.
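A rough sketch of what "the code that called trySplit()" could look like, written with a fork/join task. SumTask is a hypothetical class invented for this illustration; the Stream API's internal tasks are more elaborate and are not shown here.

```java
import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SplitAndSum {

    /** Hypothetical task: sums whatever its spliterator covers. */
    static final class SumTask extends RecursiveTask<Long> {
        private final Spliterator<Integer> spliterator;

        SumTask(Spliterator<Integer> spliterator) {
            this.spliterator = spliterator;
        }

        @Override
        protected Long compute() {
            Spliterator<Integer> other = spliterator.trySplit();
            if (other == null) {
                // Cannot split any further: traverse this range sequentially.
                long[] sum = {0};
                spliterator.forEachRemaining(n -> sum[0] += n);
                return sum[0];
            }
            SumTask otherTask = new SumTask(other);
            otherTask.fork();        // the split-off range runs concurrently
            long mine = compute();   // keep working on the now smaller original range
            return mine + otherTask.join();
        }
    }

    static long sum(List<Integer> numbers) {
        return ForkJoinPool.commonPool().invoke(new SumTask(numbers.spliterator()));
    }

    public static void main(String[] args) {
        System.out.println(sum(List.of(1, 2, 3, 4, 5, 6, 7, 8)));  // 36
    }
}
```

The list itself is never divided. Each task only receives a spliterator that is responsible for part of the index range.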

Biniman Idugboe

Stephan van Hulst wrote: ... the convention that in Java, when specifying a range, the starting index is inclusive and the ending index is exclusive.

Stephan van Hulst wrote: The range is cut in half, NOT the data source.

Noted with thanks.

Stephan van Hulst wrote: When trySplit() is called on the spliterator (by other code in the Stream API, not by you) and it succeeds, the original spliterator is modified so it will only be responsible for half of what it did before, and it will return a new spliterator that is responsible for iterating over the remaining elements.

The code that performs the modification is not obvious from the ArrayList example above.
Now, I go to tryAdvance(Consumer<? super E> action). Could you give an example of the lambda expression that implements the consumer?

Stephan van Hulst

Biniman Idugboe wrote: The code that performs the modification is not obvious from the ArrayList example above.

It's line 67. This is the only change that needs to happen in the original spliterator.

Biniman Idugboe wrote: Could you give an example of the lambda expression that implements the consumer?

In the example code that you gave, nothing happens at all because there is no terminal operation. But let's assume that you added the terminal operation .forEachOrdered(System.out::println).

The forEachOrdered() method would call the tryAdvance() method of the map operation, which in turn calls the tryAdvance() method of the filter operation, which in turn calls the tryAdvance() method of the source. The source takes the integer 2 and passes it to the Consumer that the filter operation passed to it. That Consumer checks that 2 is greater than or equal to 3, determines that it's not, and so does nothing more. Control returns to the tryAdvance() method of the source stream, which just returns the value 'true' to indicate to the filter operation that there are more elements remaining, which the filter returns back up to the map operation, which the map operation returns to the forEachOrdered operation.

Because there are more elements remaining, the forEachOrdered operation calls the tryAdvance() method of the map operation a second time. This will call the tryAdvance() method of the filter operation. This will call the tryAdvance() method of the source. The source passes the integer 4 to the Consumer that was passed in by the filter operation. The Consumer checks that 4 is greater than or equal to 3, determines that it is, and so it passes 4 to the Consumer that was passed in by the map operation. The map's Consumer then applies the toString() method on the 4, and passes the String "4" to the Consumer that was passed in by the forEachOrdered operation. The forEachOrdered operation's Consumer then calls System.out.println() on "4".
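The walk-through can be observed with a small trace. This sketch assumes a source list of 2, 4, 1, 5 (only the 2 and the 4 come from the discussion above), the predicate n >= 3, and a mapping to String. Each stage records when it is invoked, and the terminal consumer records instead of printing so the order is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

public class OneElementAtATime {

    /** Runs filter -> map -> forEachOrdered and records the order of the calls. */
    static List<String> trace(List<Integer> source) {
        List<String> events = new ArrayList<>();
        source.stream()
              .filter(n -> { events.add("filter " + n); return n >= 3; })
              .map(n -> { events.add("map " + n); return n.toString(); })
              .forEachOrdered(s -> events.add("print " + s));
        return events;
    }

    public static void main(String[] args) {
        trace(List.of(2, 4, 1, 5)).forEach(System.out::println);
        // filter 2
        // filter 4
        // map 4
        // print 4
        // filter 1
        // filter 5
        // map 5
        // print 5
    }
}
```

The 2 never reaches map, and the 4 travels all the way to the terminal consumer before the source hands out the next element.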

Biniman Idugboe

The understanding I have of the above explanation is that an element is taken from the stream, and the element passes through the pipeline until it reaches the terminal operation. Then the next element is taken from the stream and the cycle repeats until all the elements have been processed. Now, I have difficulty comprehending the consumer because I am struggling to reconcile the following:

I am thinking the filter() operation has already been completed, without waiting for the map() operation.

I am also thinking the map() operation has already been completed, without waiting for the next operation.

Stephan van Hulst

Of course the filter() and map() methods return immediately. All they do is create an instance of Spliterator and wrap a Stream around it. They don't actually run the spliterators.

Take a look at the map() method that I wrote in your other thread. It defines the MappingSpliterator class, but that doesn't actually do anything at runtime. It just tells the compiler that that declaration can only be used inside the method, and nowhere else. The only things the map() method actually does at runtime are creating an instance of this spliterator, wrapping it in a Stream, and returning the stream. That return value represents the chain of operations that is still to be performed.

Every time you call an intermediate stream operation, you just lengthen the chain of operations with another operation. But the operations are not executed. They are only executed when a terminal operation is called. This is called lazy evaluation.
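A small demonstration of lazy evaluation, assuming a two-element source and a log list introduced only for this sketch. The log is still empty after filter() and map() have returned, and fills up only once the terminal operation runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class LazyPipeline {

    /** Returns the log as it looked before and after the terminal operation. */
    static List<List<String>> run() {
        List<String> log = new ArrayList<>();

        // Building the chain: filter() and map() return immediately.
        Stream<String> pipeline = List.of(2, 4).stream()
                .filter(n -> { log.add("filter " + n); return n >= 3; })
                .map(n -> { log.add("map " + n); return n.toString(); });
        List<String> before = List.copyOf(log);

        // Only the terminal operation pulls elements through the chain.
        pipeline.forEachOrdered(s -> log.add("print " + s));
        return List.of(before, List.copyOf(log));
    }

    public static void main(String[] args) {
        List<List<String>> snapshots = run();
        System.out.println(snapshots.get(0));  // []
        System.out.println(snapshots.get(1));  // [filter 2, filter 4, map 4, print 4]
    }
}
```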

Biniman Idugboe

Again I am embarrassed to say that the concept of consumer passed to the tryAdvance() method is eluding me.

1. The terminal operation is responsible for producing the ultimate result of the pipeline operations.
2. The terminal operation does not produce a stream, does not produce a spliterator.
3. The terminal operation creates a consumer, calls the tryAdvance() method of the spliterator created by the preceding operation and passes the consumer to it.
4. The consumer performs action on the element brought forward by the tryAdvance() method.
In fact, I cannot continue to enumerate the steps because the concept starts to blur from this point forward. An intermediate operation is supposed to do something to the element, still the consumer does yet another thing to the element. I just don't get it.

Stephan van Hulst

Okay, let's look at it from a different point of view.

Instead of its actual method signature, what if tryAdvance() looked like this:

The map() operation might then have been implemented like this:
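A sketch of that idea, assuming a hypothetical interface named SimpleSpliterator whose tryAdvance() returns the next element, or null when there is none. Neither the interface nor the of() helper exists in the JDK; they are made up for this illustration.

```java
import java.util.function.Function;

public class NullReturningDesign {

    /** Hypothetical interface, not part of the JDK: null means "no more elements". */
    interface SimpleSpliterator<T> {
        T tryAdvance();
    }

    /** A map() built on that interface: pull one element from upstream, transform it. */
    static <T, R> SimpleSpliterator<R> map(SimpleSpliterator<T> source,
                                           Function<? super T, ? extends R> mapper) {
        return () -> {
            T element = source.tryAdvance();
            return element == null ? null : mapper.apply(element);
        };
    }

    /** Helper for the demo: walks the given elements from left to right. */
    @SafeVarargs
    static <T> SimpleSpliterator<T> of(T... elements) {
        int[] index = {0};
        return () -> index[0] < elements.length ? elements[index[0]++] : null;
    }

    public static void main(String[] args) {
        SimpleSpliterator<Integer> source = of(2, 4);
        SimpleSpliterator<String> mapped = map(source, Object::toString);
        System.out.println(mapped.tryAdvance());  // 2
        System.out.println(mapped.tryAdvance());  // 4
        System.out.println(mapped.tryAdvance());  // null, taken to mean "end of stream"
    }
}
```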

This leads to a problem though: There is no way to distinguish between the end of a stream and a null element. So instead, the designers could have used the same design as with the Iterator interface:
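A sketch of that two-method design. The interface name SimpleSpliterator and the method name canAdvance() are assumptions made for this illustration; advance() is the name used in the discussion. Because "is there another element?" is now a separate question, a null element no longer looks like the end of the stream.

```java
import java.util.function.Function;

public class TwoMethodDesign {

    /** Hypothetical interface, modelled on Iterator's hasNext() and next(). */
    interface SimpleSpliterator<T> {
        boolean canAdvance();
        T advance();
    }

    /** map() simply forwards both questions to the upstream spliterator. */
    static <T, R> SimpleSpliterator<R> map(SimpleSpliterator<T> source,
                                           Function<? super T, ? extends R> mapper) {
        return new SimpleSpliterator<R>() {
            @Override public boolean canAdvance() { return source.canAdvance(); }
            @Override public R advance() { return mapper.apply(source.advance()); }
        };
    }

    /** Helper for the demo: walks the given elements from left to right. */
    @SafeVarargs
    static <T> SimpleSpliterator<T> of(T... elements) {
        return new SimpleSpliterator<T>() {
            private int index = 0;
            @Override public boolean canAdvance() { return index < elements.length; }
            @Override public T advance() { return elements[index++]; }
        };
    }

    public static void main(String[] args) {
        SimpleSpliterator<String> source = of("a", null, "b");
        SimpleSpliterator<String> mapped = map(source, s -> "<" + s + ">");
        while (mapped.canAdvance()) {
            System.out.println(mapped.advance());  // <a>, then <null>, then <b>
        }
    }
}
```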

This works fine. However, since we have lambdas now, they decided to roll those two methods into one: if the spliterator can't advance because there aren't enough elements available, it returns false. If it can advance it will advance not by returning the processed element to the next operation, but by passing it to the next operation through a consumer that the next operation supplied:
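A sketch of a mapping spliterator written against the real java.util.Spliterator signature. The class is a simplified stand-in for the MappingSpliterator mentioned earlier, and the JDK's own stream classes are built differently.

```java
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.Function;

public class ConsumerDesign {

    /** Simplified, hypothetical mapping spliterator. */
    static final class MappingSpliterator<T, R> implements Spliterator<R> {
        private final Spliterator<T> source;
        private final Function<? super T, ? extends R> mapper;

        MappingSpliterator(Spliterator<T> source, Function<? super T, ? extends R> mapper) {
            this.source = source;
            this.mapper = mapper;
        }

        @Override
        public boolean tryAdvance(Consumer<? super R> action) {
            // Hand the source a consumer of our own. If the source has an element,
            // it calls that consumer, which maps the element and passes the result
            // on to the consumer supplied by the next operation.
            return source.tryAdvance(element -> action.accept(mapper.apply(element)));
        }

        @Override
        public Spliterator<R> trySplit() {
            Spliterator<T> split = source.trySplit();
            return split == null ? null : new MappingSpliterator<>(split, mapper);
        }

        @Override
        public long estimateSize() {
            return source.estimateSize();
        }

        @Override
        public int characteristics() {
            // Mapping can destroy these properties, so they are not passed on.
            return source.characteristics() & ~(SORTED | DISTINCT | NONNULL);
        }
    }

    public static void main(String[] args) {
        Spliterator<String> mapped =
                new MappingSpliterator<Integer, String>(List.of(2, 4).spliterator(), Object::toString);
        // A terminal operation boils down to a loop like this one.
        while (mapped.tryAdvance(System.out::println)) {
            // nothing to do here: the consumer has already handled the element
        }
    }
}
```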

The consumer plays the role of the advance() method. However, the advance() method could return the element to the next operation directly, while the tryAdvance() method already returns boolean. It needs another way of returning the processed element to the next operation. It does this through the consumer that the next operation supplies.

If you still don't understand after this explanation, I think you need to study callback functions. Callbacks are an important idiom in functional languages.
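As a starting point, here is a callback in its simplest form: calling tryAdvance() by hand with a lambda. The caller supplies the lambda, and the spliterator decides whether to invoke it and with which element. The log list and the message texts are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;

public class CallbackByHand {

    /** Calls tryAdvance() three times on a two-element source and logs what happens. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Spliterator<Integer> spliterator = List.of(2, 4).spliterator();
        for (int call = 1; call <= 3; call++) {
            // The lambda is the callback. It only runs if an element is available.
            boolean advanced = spliterator.tryAdvance(n -> log.add("callback got " + n));
            log.add("tryAdvance returned " + advanced);
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
        // callback got 2
        // tryAdvance returned true
        // callback got 4
        // tryAdvance returned true
        // tryAdvance returned false
    }
}
```

On the third call there is no element left, so the lambda is never invoked and only the boolean result tells the caller that the source is exhausted.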