Timing Text File Reads, Part 3

So, the last two posts suggest there is major overhead to using the lazy iterator approach with .lines. I decided to explore this by rolling my own iterator to read a file. I suspected that gather / take has a big overhead, so I tried a basic custom iterator first.
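As a rough illustration of what "rolling my own iterator" means, here is a minimal sketch in Python rather than Perl 6. It is not the code that was timed, and the timings in this post do not apply to it. The class name `LineIterator` is made up for the example. Each step makes exactly one read on the underlying handle and keeps nothing in reserve.

```python
import io


class LineIterator:
    """Hand-rolled lazy line iterator: one readline() call per step."""

    def __init__(self, handle):
        self.handle = handle

    def __iter__(self):
        return self

    def __next__(self):
        line = self.handle.readline()
        if line == "":  # readline() returns "" only at end of file
            raise StopIteration
        return line.rstrip("\n")


# An in-memory file stands in for a real one.
handle = io.StringIO("alpha\nbeta\ngamma\n")
lines = list(LineIterator(handle))
```

The point of the shape is that there is no generator machinery between the caller and the read, which is the overhead the experiment was trying to isolate.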

The custom iterator takes 21.7 seconds, very slightly more than the standard gather / take version. So much for the theory that gather / take is inefficient here!

I’m told that the spec requires that .lines be strictly lazy so you can mix in calls to .get. I don’t know where, and it seems a bit crazy to me. But anyway, by those lights the following potential optimizations are actually illegal, because they break the strict connection between .lines and .get.
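To make the "strictly lazy" requirement concrete, here is a small Python analogy (again, not Perl 6; `lazy_lines` is a hypothetical name). Because the iterator never reads ahead, a direct read on the handle can be interleaved with iteration and no line is lost or reordered.

```python
import io


def lazy_lines(handle):
    """Yield lines one at a time, reading nothing ahead of the consumer."""
    while True:
        line = handle.readline()
        if line == "":  # end of file
            return
        yield line.rstrip("\n")


handle = io.StringIO("header\nrow 1\nrow 2\n")
it = lazy_lines(handle)

first = next(it)                         # read through the iterator
second = handle.readline().rstrip("\n")  # read directly from the handle
third = next(it)                         # the iterator picks up where the handle is
```

Here the three reads come back in file order, which is the behaviour that mixing .lines and .get relies on.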

Here’s one that does three .gets at a time, cutting the number of iterator objects created by two-thirds.
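The idea can be sketched in Python as a reader that pulls up to three lines per step and hands them out one by one. This is an analogy under stated assumptions, not the Perl 6 code that was timed: `read_ahead_lines` and its `batch_size` parameter are hypothetical names. The usage lines also show why this kind of read-ahead breaks the link with direct reads on the handle.

```python
import io


def read_ahead_lines(handle, batch_size=3):
    """Yield lines singly, but fetch up to batch_size of them per step."""
    while True:
        batch = []
        for _ in range(batch_size):
            line = handle.readline()
            if line == "":  # end of file
                break
            batch.append(line.rstrip("\n"))
        if not batch:
            return
        yield from batch


handle = io.StringIO("a\nb\nc\nd\ne\n")
it = read_ahead_lines(handle)

first = next(it)                         # "a", but "b" and "c" are already buffered
direct = handle.readline().rstrip("\n")  # the handle has moved on to "d"
rest = list(it)                          # buffered lines, then whatever is left
```

The direct read returns "d" even though only "a" has been consumed through the iterator, so the overall order seen by the caller is a, d, b, c, e.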

Clocking in at 15.6 seconds, the three-at-a-time version is significantly better than the current .lines implementation and significantly worse than the .get version. It was actually the best variant I came up with.

I tried upping the count to 8, but it actually ran a touch slower then. And jnthn suggested a version which tried to optimize creation of the iterator objects, but it actually ran significantly slower than the naive version.