Hello,
In this problem report, the compiler is fed a (bogus) translation unit
in which some literals contain bytes whose value is zero. The
preprocessor detects that and emits diagnostics for that kind of bogus
literal. But when the diagnostics machinery then re-reads the input
file to display the bogus literals with a caret, it attempts to
calculate the length of each of the lines it got using fgets. The
line length calculation is done using strlen, which doesn't work when
the content of the line can contain zero bytes. The result is that
read_line never sees the end of the line, because strlen repeatedly
reports that the line ends before the end-of-line character; read_line
therefore thinks its buffer for reading the line is too small and
keeps growing it, leading to huge memory consumption, pain and
disaster.
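To make the failure mode concrete, here is a minimal sketch of the kind of
end-of-line test that goes wrong. It is a simplified stand-in and not the
actual read_line code; the helper name line_seems_complete is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not from input.c): decide whether BUF, as filled
   in by fgets, holds a complete line, using strlen to find the last
   byte that was read.  */
static bool
line_seems_complete (const char *buf)
{
  size_t len = strlen (buf);

  /* If the line contains a zero byte, strlen stops there, so the byte
     inspected below is not the real last byte of the line.  */
  return len > 0 && buf[len - 1] == '\n';
}
```

For a buffer holding the bytes 'a', 'b', 0, 'c', 'd', '\n', strlen returns 2,
so the check looks at 'b' instead of the newline and concludes the line was
truncated, even though the newline is in the buffer.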
The patch below introduces a new string_length() function that can
return the length of a string contained in a buffer even if the string
contains zero bytes; it does so by scanning backwards from the end of
the buffer and stopping at the first non-null byte it encounters; for
that to work, the buffer must have been totally zeroed before getting
data. read_line is then modified to return the length of the line
along with the line itself, as the line can now contain zero bytes.
Callers of read_line are adjusted accordingly.
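The backward scan described above can be sketched as follows. This only
illustrates the technique; the signature and body are assumptions and may
differ from the string_length in the actual patch.

```c
#include <stddef.h>

/* Sketch only: return the length of the data held in BUF, which is
   BUF_SIZE bytes long and is assumed to have been entirely zeroed
   before being filled.  Zero bytes embedded in the data are counted;
   only the trailing run of zero bytes is excluded.  */
static size_t
string_length (const char *buf, size_t buf_size)
{
  for (size_t i = buf_size; i > 0; --i)
    if (buf[i - 1] != '\0')
      return i;
  return 0;
}
```

With a zeroed 16-byte buffer that receives the bytes 'a', 'b', 0, 'c', 'd',
'\n', this returns 6 where strlen would return 2. Note that a scan of this
shape cannot tell zero bytes at the very end of the data apart from the
zeroed padding.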
diagnostic_show_locus() is modified to consider that a line can have
characters of value zero, and so just show a white space when
instructed to display one of these characters.
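As an illustration of that display rule, here is a small sketch that walks a
line by its explicit length and substitutes a space for each zero byte. The
helper name render_line and its interface are hypothetical, not taken from
diagnostic.c.

```c
#include <stddef.h>

/* Hypothetical helper: copy the LINE_LEN bytes of LINE into OUT,
   replacing each zero byte with a space, then NUL-terminate OUT.
   OUT must have room for LINE_LEN + 1 bytes.  */
static void
render_line (const char *line, size_t line_len, char *out)
{
  /* Iterate up to the known length instead of stopping at the first
     zero byte.  */
  for (size_t i = 0; i < line_len; ++i)
    out[i] = line[i] == '\0' ? ' ' : line[i];
  out[line_len] = '\0';
}
```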
Tested on x86_64-unknown-linux-gnu against trunk.
I realize this is diagnostics code and I am supposed to be a maintainer
for it, but I'd appreciate a review for it nonetheless.
Thanks.
gcc/ChangeLog:
* input.h (location_get_source_line): Take an additional line_size
parameter by reference.
* input.c (string_length): New static function definition.
(read_line): Take an additional line_length output parameter to be
set to the size of the line. Use the new string_length function
to compute the size of the line returned by fgets, rather than
using strlen. Ensure that the buffer is initially zeroed; ensure
that when growing the buffer too.
(location_get_source_line): Take an additional output line_len
parameter. Update the use of read_line to pass it the line_len
parameter.
* diagnostic.c (adjust_line): Take an additional input parameter
for the length of the line, rather than calculating it with
strlen.
(diagnostic_show_locus): Adjust the use of
location_get_source_line and adjust_line with respect to their new
signature. While displaying a line now, do not stop at the first
null byte. Rather, display the zero byte as a space and keep
going until we reach the size of the line.
gcc/testsuite/ChangeLog:
* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
---
gcc/diagnostic.c | 17 +++---
gcc/input.c | 60 +++++++++++++++++----
gcc/input.h | 3 +-
.../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes
4 files changed, 62 insertions(+), 18 deletions(-)
create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi
literal 0
HcmV?d00001