Comments

The attached patch addresses the problem identified in comment #22 of the PR.
For character array internal unit reads, eat_spaces must call next_char to
advance every single character until the end of the string is reached. In the
case sited which is very contrived, this amounts to about 100000 calls to next_char.
For clarity, this test case:
character buffer(1)*100000
integer i,j
j = 1234
write(buffer(1),'(i4)') j
DO j=1,9999
! write(*,*) buffer(1)(1:4)
read(buffer,*) i
! write(*,*) i
ENDDO
end
Without the patch takes about 25 seconds to run.
With the patch this takes about 2.8 seconds.
The speedup is accomplished by simply skipping over spaces without calling
next_read, then backing up one character and letting the existing execution path
proceed, preserving all the end of record code needed in next_char.
I also remove some unneeded error checks.
Regression tested on X86_64 gnu. No need for a new test case since no new
functionality is added.
OK for trunk? The PR is marked as a regression, so I think this could be the
last piece and call it done.
Regards,
Jerry
2014-03-08 Jerry DeLisle <jvdelisle@gcc.gnu>
PR libfortran/38199
* io/list_read.c (next_char): Delete unuseful error checks.
(eat_spaces): For character array reading, skip ahead over
spaces rather than call next_char multiple times.

Jerry DeLisle wrote:
> The attached patch addresses the problem identified in comment #22 of the PR.> For character array internal unit reads, eat_spaces must call next_char to> advance every single character until the end of the string is reached. In the> case sited which is very contrived, this amounts to about 100000 calls to next_char.
> Without the patch takes about 25 seconds to run.> With the patch this takes about 2.8 seconds.
Nice!
> The speedup is accomplished by simply skipping over spaces without calling> next_read, then backing up one character and letting the existing execution path> proceed, preserving all the end of record code needed in next_char.
I think that code is okay.
I wonder whether it can happen that we read one character too far: i.e.
stell() == end of string; offset++ -> one behind. As one then
immediately steps back (due to "offset < limit" plus offset--), the code
should be handled just fine - except for the out of bounds read - which
might be detected by, e.g., -fsanitizer=address (if libgfortran and the
program use it). However, I am not sure whether eat_spaces would be
called in that case or whether other conditions error out before.
> I also remove some unneeded error checks.
I have to admit that I do not really understand why the conditions are
unreachable. Is it because "dtp->u.p.current_unit->bytes_left == 0" is
already checked for "is_array_io (dtp)"? If so, can it happen for scalar
internal I/O? I assume, it is obvious, but I don't quickly see it.
Tobias
> 2014-03-08 Jerry DeLisle <jvdelisle@gcc.gnu>>> PR libfortran/38199> * io/list_read.c (next_char): Delete unuseful error checks.> (eat_spaces): For character array reading, skip ahead over> spaces rather than call next_char multiple times.

On Sat, Mar 8, 2014 at 7:38 AM, Jerry DeLisle wrote:
> The speedup is accomplished by simply skipping over spaces without calling> next_read, then backing up one character and letting the existing execution path> proceed, preserving all the end of record code needed in next_char.>> I also remove some unneeded error checks.
Would it be enough to make them "unlikely" instead?
- if (length < 0)
+ if (unlikely(length < 0))
Ciao!
Steven

On 03/08/2014 02:45 AM, Tobias Burnus wrote:
--- snip ---
> > I think that code is okay.> > I wonder whether it can happen that we read one character too far: i.e. stell()> == end of string; offset++ -> one behind. As one then immediately steps back> (due to "offset < limit" plus offset--), the code should be handled just fine -> except for the out of bounds read - which might be detected by, e.g.,> -fsanitizer=address (if libgfortran and the program use it). However, I am not> sure whether eat_spaces would be called in that case or whether other conditions> error out before.>
The strings are internally NULL terminated, which helps. I will do some tests to
see.
>> I also remove some unneeded error checks.> > I have to admit that I do not really understand why the conditions are> unreachable. Is it because "dtp->u.p.current_unit->bytes_left == 0" is already> checked for "is_array_io (dtp)"? If so, can it happen for scalar internal I/O? I> assume, it is obvious, but I don't quickly see it.
The is_array_io (dtp) is checking for an array descriptor. For scalar internal
I/O the descriptor is a NULL, so the new code is not executed in that case. I
ran a test using a scalar string and the new code is not needed.
The errors given are not normal error conditions. If we hit these errors there
is a internal library bug and the program would fail out. Since we have had none
reported ever I don't think we need it. As Steven suggested I can make them
unlikely.
Also I just realized I did not address the kind=4 case, so I will do that and
resubmit.
Thanks for review.
Jerry

Am 08.03.2014 07:38, schrieb Jerry DeLisle:
> The attached patch addresses the problem identified in comment #22 of the PR.> For character array internal unit reads, eat_spaces must call next_char to> advance every single character until the end of the string is reached. In the> case sited which is very contrived, this amounts to about 100000 calls to next_char.>> For clarity, this test case:>> character buffer(1)*100000> integer i,j>> j = 1234> write(buffer(1),'(i4)') j>> DO j=1,9999> ! write(*,*) buffer(1)(1:4)> read(buffer,*) i> ! write(*,*) i> ENDDO> end>> Without the patch takes about 25 seconds to run.>> With the patch this takes about 2.8 seconds.
This is great.
However, this is still 10 times slower than the LEN_TRIM variant:
character buffer(1)*100000
integer i,j
j = 1234
write(buffer(1),'(i4)') j
DO j=1,9999
! write(*,*) buffer(1)(1:4)
read(buffer(1)(1:LEN_TRIM(buffer(1))),*) i
! write(*,*) i
ENDDO
end
which takes 0.23s on my box. So on the on hand the improvement is great,
on the other hand it is really sad, because the user will still
need to do manual LEN_TRIM's when reading larger strings to get
optimal performance...
Thanks,
Manfred
>> The speedup is accomplished by simply skipping over spaces without calling> next_read, then backing up one character and letting the existing execution path> proceed, preserving all the end of record code needed in next_char.>> I also remove some unneeded error checks.>> Regression tested on X86_64 gnu. No need for a new test case since no new> functionality is added.>> OK for trunk? The PR is marked as a regression, so I think this could be the> last piece and call it done.>> Regards,>> Jerry>> 2014-03-08 Jerry DeLisle <jvdelisle@gcc.gnu>>> PR libfortran/38199> * io/list_read.c (next_char): Delete unuseful error checks.> (eat_spaces): For character array reading, skip ahead over> spaces rather than call next_char multiple times.>

Am 08.03.2014 23:37, schrieb Manfred Schwarb:
> Am 08.03.2014 07:38, schrieb Jerry DeLisle:>> The attached patch addresses the problem identified in comment #22 of the PR.>> For character array internal unit reads, eat_spaces must call next_char to>> advance every single character until the end of the string is reached. In the>> case sited which is very contrived, this amounts to about 100000 calls to next_char.>>>> For clarity, this test case:>>>> character buffer(1)*100000>> integer i,j>>>> j = 1234>> write(buffer(1),'(i4)') j>>>> DO j=1,9999>> ! write(*,*) buffer(1)(1:4)>> read(buffer,*) i>> ! write(*,*) i>> ENDDO>> end>>>> Without the patch takes about 25 seconds to run.>>>> With the patch this takes about 2.8 seconds.>> This is great.>> However, this is still 10 times slower than the LEN_TRIM variant:> character buffer(1)*100000> integer i,j>> j = 1234> write(buffer(1),'(i4)') j>> DO j=1,9999> ! write(*,*) buffer(1)(1:4)> read(buffer(1)(1:LEN_TRIM(buffer(1))),*) i> ! write(*,*) i> ENDDO> end>
and also 10 times slower than the scalar variant (which was fixed by Thomas):
character buffer*100000
integer i,j
j = 1234
write(buffer,'(i4)') j
DO j=1,9999
! write(*,*) buffer(1:4)
read(buffer,*) i
! write(*,*) i
ENDDO
end
> which takes 0.23s on my box. So on the on hand the improvement is great,> on the other hand it is really sad, because the user will still> need to do manual LEN_TRIM's when reading larger strings to get> optimal performance...>> Thanks,> Manfred>>>>>>> The speedup is accomplished by simply skipping over spaces without calling>> next_read, then backing up one character and letting the existing execution path>> proceed, preserving all the end of record code needed in next_char.>>>> I also remove some unneeded error checks.>>>> Regression tested on X86_64 gnu. No need for a new test case since no new>> functionality is added.>>>> OK for trunk? The PR is marked as a regression, so I think this could be the>> last piece and call it done.>>>> Regards,>>>> Jerry>>>> 2014-03-08 Jerry DeLisle <jvdelisle@gcc.gnu>>>>> PR libfortran/38199>> * io/list_read.c (next_char): Delete unuseful error checks.>> (eat_spaces): For character array reading, skip ahead over>> spaces rather than call next_char multiple times.>>>>