On 16/05/2012 16:59, Walter Bright wrote:
> On 5/16/2012 7:38 AM, Steven Schveighoffer wrote:
>> On Wed, 16 May 2012 09:50:12 -0400, Walter Bright <newshound2@digitalmars.com>
>> wrote:
>>
>>> On 5/15/2012 3:34 PM, Nathan M. Swan wrote:
>>>> I do agree for e.g. with binary data some data can't be read with ranges (when
>>>> you need to read small chunks of varying size),
>>>
>>> I don't see why that should be true.
>>
>> How do you tell front and popFront how many bytes to read?
>
> std.byLine() does it.
And is what you want to do with a text file in many cases.
> In general, you can read n bytes by calling empty, front, and popFront n times.
Why would anybody want to read a large binary file _one byte at a time_?
Stewart.

On 5/16/2012 9:41 AM, Stewart Gordon wrote:
> On 16/05/2012 16:59, Walter Bright wrote:
>> On 5/16/2012 7:38 AM, Steven Schveighoffer wrote:
>>> On Wed, 16 May 2012 09:50:12 -0400, Walter Bright <newshound2@digitalmars.com>
>>> wrote:
>>>
>>>> On 5/15/2012 3:34 PM, Nathan M. Swan wrote:
>>>>> I do agree for e.g. with binary data some data can't be read with ranges (when
>>>>> you need to read small chunks of varying size),
>>>>
>>>> I don't see why that should be true.
>>>
>>> How do you tell front and popFront how many bytes to read?
>>
>> std.byLine() does it.
>
> And is what you want to do with a text file in many cases.
>
>> In general, you can read n bytes by calling empty, front, and popFront n times.
>
> Why would anybody want to read a large binary file _one byte at a time_?
You can have that range read from byChunk(). It's really the same thing that C's
stdio does.
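
A minimal sketch of the idea, not taken from the thread: an in-memory array of chunks stands in for what file.byChunk(4) would deliver, std.algorithm's joiner flattens it into a range of bytes, and a hypothetical helper readN (an assumed name, not a Phobos function) pulls n bytes with empty, front, and popFront.

```d
import std.algorithm.iteration : joiner;
import std.range.primitives : empty, front, popFront;

// Hypothetical helper (not part of Phobos): pull up to n bytes out of any
// input range of ubyte by calling empty, front, and popFront n times.
ubyte[] readN(R)(ref R bytes, size_t n)
{
    ubyte[] result;
    for (; n > 0 && !bytes.empty; --n)
    {
        result ~= bytes.front;
        bytes.popFront();
    }
    return result;
}

void main()
{
    // Stand-in for file.byChunk(4): fixed-size chunks, the last one short.
    ubyte[] c1 = [1, 2, 3, 4];
    ubyte[] c2 = [5, 6, 7, 8];
    ubyte[] c3 = [9];
    ubyte[][] chunks = [c1, c2, c3];

    // joiner presents the chunks as one flat range of ubyte.
    auto bytes = chunks.joiner;

    assert(readN(bytes, 3) == [1, 2, 3]);
    assert(readN(bytes, 5) == [4, 5, 6, 7, 8]); // crosses a chunk boundary
    assert(readN(bytes, 10) == [9]);            // fewer bytes left than requested
}
```

The reads are variable-sized, but every byte goes through three range calls and one append, which is the cost the rest of the thread argues about.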

On 5/16/2012 10:18 AM, Steven Schveighoffer wrote:
> On Wed, 16 May 2012 11:59:37 -0400, Walter Bright <newshound2@digitalmars.com>
> wrote:
>
>> On 5/16/2012 7:38 AM, Steven Schveighoffer wrote:
>>> On Wed, 16 May 2012 09:50:12 -0400, Walter Bright <newshound2@digitalmars.com>
>>> wrote:
>>>
>>>> On 5/15/2012 3:34 PM, Nathan M. Swan wrote:
>>>>> I do agree for e.g. with binary data some data can't be read with ranges (when
>>>>> you need to read small chunks of varying size),
>>>>
>>>> I don't see why that should be true.
>>>
>>> How do you tell front and popFront how many bytes to read?
>>
>> std.byLine() does it.
>
> Have you looked at how std.byLine works? It certainly does not use a range
> interface as a source.
It presents a range interface, though. Not a streaming one.
>
>> In general, you can read n bytes by calling empty, front, and popFront n times.
>
> I hope you are not serious! This will make D *the worst performing* i/o language.
You can read arbitrary numbers of bytes by tacking a range on after byChunk().

On Wed, 16 May 2012 13:21:37 -0400, Walter Bright
<newshound2@digitalmars.com> wrote:
> On 5/16/2012 9:41 AM, Stewart Gordon wrote:
>> On 16/05/2012 16:59, Walter Bright wrote:
>>> On 5/16/2012 7:38 AM, Steven Schveighoffer wrote:
>>>> On Wed, 16 May 2012 09:50:12 -0400, Walter Bright
>>>> <newshound2@digitalmars.com>
>>>> wrote:
>>>>
>>>>> On 5/15/2012 3:34 PM, Nathan M. Swan wrote:
>>>>>> I do agree for e.g. with binary data some data can't be read with
>>>>>> ranges (when
>>>>>> you need to read small chunks of varying size),
>>>>>
>>>>> I don't see why that should be true.
>>>>
>>>> How do you tell front and popFront how many bytes to read?
>>>
>>> std.byLine() does it.
>>
>> And is what you want to do with a text file in many cases.
>>
>>> In general, you can read n bytes by calling empty, front, and popFront
>>> n times.
>>
>> Why would anybody want to read a large binary file _one byte at a time_?
>
> You can have that range read from byChunk(). It's really the same thing
> that C's stdio does.
This is very wrong. byChunk doesn't cut it. The number of bytes to
consume from the stream can depend on any number of factors, including the
actual data in the stream. For instance, I challenge you to write an
efficient (meaning no extra buffering) byLine using byChunk as a base.
-Steve
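
One way to see the difficulty, as a sketch only: the function below splits lines out of fixed-size chunks held in memory. linesFromChunks and carry are illustrative names, not Phobos APIs. A line that straddles a chunk boundary has to be copied into carry until its newline arrives, and that copy is the extra buffering the challenge rules out.

```d
import std.algorithm.comparison : min;
import std.algorithm.searching : countUntil;
import std.string : representation;

// Hypothetical helper: split a sequence of chunks into lines.
string[] linesFromChunks(const(ubyte)[][] chunks)
{
    string[] lines;
    ubyte[] carry; // the extra buffer: holds a partial line between chunks

    foreach (chunk; chunks)
    {
        auto rest = chunk;
        for (;;)
        {
            auto nl = rest.countUntil(cast(ubyte) '\n');
            if (nl < 0)
                break;                     // no complete line left in this chunk
            carry ~= rest[0 .. nl];        // finish the line started earlier
            lines ~= cast(string) carry.idup;
            carry.length = 0;
            rest = rest[nl + 1 .. $];
        }
        carry ~= rest; // partial line: must be kept until the next chunk
    }
    if (carry.length)
        lines ~= cast(string) carry.idup;  // final line without a trailing '\n'
    return lines;
}

void main()
{
    // Cut the text into 4-byte chunks, as byChunk(4) would deliver it.
    const(ubyte)[] data = "alpha\nbeta\ngamma\n".representation;
    const(ubyte)[][] chunks;
    for (size_t i = 0; i < data.length; i += 4)
        chunks ~= data[i .. min(i + 4, data.length)];

    assert(linesFromChunks(chunks) == ["alpha", "beta", "gamma"]);
}
```

Here every line is copied at least once on top of the copy into the chunk buffer, and the chunk size has no relation to where the lines end.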

On Wed, 16 May 2012 13:23:07 -0400, Walter Bright
<newshound2@digitalmars.com> wrote:
> On 5/16/2012 10:18 AM, Steven Schveighoffer wrote:
>> On Wed, 16 May 2012 11:59:37 -0400, Walter Bright
>> <newshound2@digitalmars.com>
>> wrote:
>>
>>> On 5/16/2012 7:38 AM, Steven Schveighoffer wrote:
>>>> On Wed, 16 May 2012 09:50:12 -0400, Walter Bright
>>>> <newshound2@digitalmars.com>
>>>> wrote:
>>>>
>>>>> On 5/15/2012 3:34 PM, Nathan M. Swan wrote:
>>>>>> I do agree for e.g. with binary data some data can't be read with
>>>>>> ranges (when
>>>>>> you need to read small chunks of varying size),
>>>>>
>>>>> I don't see why that should be true.
>>>>
>>>> How do you tell front and popFront how many bytes to read?
>>>
>>> std.byLine() does it.
>>
>> Have you looked at how std.byLine works? It certainly does not use a
>> range
>> interface as a source.
>
> It presents a range interface, though. Not a streaming one.
But that is *the point*! The code deciding how much data to read (i.e.
the entity I referenced above that 'tells front and popFront how many
bytes to read') is *not* using a range interface. In other words, ranges
aren't enough.
Ranges can be built on top of streaming interfaces. But there is *still*
a need for a comprehensive streaming toolkit. And C's streaming toolkit
is not as good as a native D toolkit can be.
>>
>>> In general, you can read n bytes by calling empty, front, and popFront
>>> n times.
>>
>> I hope you are not serious! This will make D *the worst performing* i/o
>> language.
>
> You can read arbitrary numbers of bytes by tacking a range on after
> byChunk().
No, this doesn't work in most cases. See my other post. You can't get
everything you want out of just byChunk and byLine.
What about byMySpecificPacketProtocol?
-Steve

On 5/16/12 12:34 PM, Steven Schveighoffer wrote:
> In other words, ranges aren't enough.
This is copiously clear to me, but the way I like to think about it is
by extending the notion of range (with notions such as e.g.
BufferedRange, LookaheadRange, and such) instead of developing an
abstraction independent from ranges and then working on stitching that
with ranges.
Andrei

tbh, I've found byChunk to be less than worthless
in my experience; it's a liability because I still
have to wrap it somehow to read real-world files.
Consider reading a series of strings in the format
<length><data>,[...].
I'd like it to be this simple (neglecting priming the loop):
    string[] s;
    while(!file.eof) {
        ubyte length = file.read!ubyte;
        s ~= file.read!string(length);
    }
The C fgetc/fread interface can do this reasonably
well.
    string[] s;
    for(;;) {
        int length = fgetc(fp); // fgetc returns EOF at end of file
        if(length == EOF) break;
        auto buffer = new char[length];
        if(fread(buffer.ptr, 1, length, fp) != length) break; // truncated record
        s ~= assumeUnique(buffer);
    }
But, doing it with byChunk is an exercise in pain
that I don't even feel like writing here.
Another problem: consider a network interface. You
want to handle the packets as they come in.
byChunk doesn't work at all because it blocks until it
gets the chunk of the requested size.
    foreach(chunk; socket.byChunk(1024))
Suppose you get a packet of length 1000 and you have
to answer it. That will block forever.
So, if you use byChunk as the underlying thing to fill
your buffer... you don't get anywhere.
I think a better input primitive is byPacket(max_size).
This works more like the read primitive on the operating
system.
Moreover, I want it to buffer, and control how much is consumed.
    auto packetSource = socket.byPacket(1024);
    foreach(packet; packetSource) {
        // as soon as some data comes in we can get the length
        if(packet.length < 2) continue;
        auto length = packet.peek!(ushort); // neglect endian for now
        if(packet.length < length + 2) continue; // wait for more data
        packet.consume(2);
        handle(packet.consume(length));
    }
In addition to the byChunk blocking problem...
what if the length straddles the edge?
byChunk is just a huge hassle to work with for every file
format I've tried so far. byLine is a little better
(some file formats are defined as being line based)
but still a bit of a pain for anything that can spill
into two lines.
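
A rough sketch of the peek/consume idea, with no real socket involved. PacketBuffer, feed, peekLength, and drain are hypothetical names made up for this illustration, and the length prefix is assumed to be a big-endian ushort. Incoming reads of any size are appended to one buffer, and nothing is consumed until a whole message is present, so a length field that straddles two reads needs no special handling.

```d
import std.string : representation;

// Hypothetical buffer with the peek/consume semantics described above.
struct PacketBuffer
{
    private ubyte[] data;

    // Append whatever a read returned, however short.
    void feed(const(ubyte)[] incoming) { data ~= incoming; }

    size_t length() const { return data.length; }

    // Look at the 2-byte length prefix without consuming it
    // (big-endian is an assumption made for this sketch).
    ushort peekLength() const
    {
        assert(data.length >= 2);
        return cast(ushort)((data[0] << 8) | data[1]);
    }

    // Remove and return the first n bytes.
    ubyte[] consume(size_t n)
    {
        auto head = data[0 .. n];
        data = data[n .. $];
        return head;
    }
}

// Return every complete <length><payload> message currently buffered.
string[] drain(ref PacketBuffer buf)
{
    string[] messages;
    while (buf.length >= 2)
    {
        size_t len = buf.peekLength();
        if (buf.length < len + 2)
            break; // wait for more data
        buf.consume(2);
        messages ~= cast(string) buf.consume(len).idup;
    }
    return messages;
}

void main()
{
    PacketBuffer buf;
    // The first read holds "hi" plus half of the next length field.
    buf.feed("\x00\x02hi\x00".representation);
    assert(drain(buf) == ["hi"]);
    // The second read completes the length (5) but not the payload.
    buf.feed("\x05he".representation);
    assert(drain(buf).length == 0);
    // The third read completes the message.
    buf.feed("llo".representation);
    assert(drain(buf) == ["hello"]);
}
```

The caller decides how much to consume after looking at the data, which a fixed-size byChunk cannot offer.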

On Wed, 16 May 2012 13:48:49 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail@erdani.org> wrote:
> On 5/16/12 12:34 PM, Steven Schveighoffer wrote:
>> In other words, ranges aren't enough.
>
> This is copiously clear to me, but the way I like to think about it is
> by extending the notion of range (with notions such as e.g.
> BufferedRange, LookaheadRange, and such) instead of developing an
> abstraction independent from ranges and then working on stitching that
> with ranges.
What I think we would end up with is a streaming API with range primitives
tacked on.
- empty is clunky, but possible to implement. However, it may become
invalid (think of reading a file that is being appended to by another
process).
- popFront and front do not have any clear definition of what they refer
to. The only valid thing I can think of is bytes, and then nobody will
use them.
That's hardly saying it's "range based". I refuse to believe that people
will be thrilled by having to 'pre-configure' each front and popFront call
in order to get work done. If you want to try and convince me, I'm
willing to listen, but so far I haven't seen anything that looks at all
appetizing.
-Steve