Comments

Hello, this is a fix for cases like:
program main
implicit none
intrinsic :: real
print *,(/ real(a = 1) /)
end
where `real(a = 1)' is initially parsed as a typespec, creating
a symbol for 'a' along the way. The match fails, and then it is parsed
as a constructor element and accepted that way. However, accepting the
statement implies accepting all the symbols created so far including 'a',
which is wrong and leads to the ICE.
To handle correctly this, we have to remove 'a' before proceeding with
the second parse attempt. However, we can't use gfc_undo_symbols, as
it would also remove 'b' in the following case.
b = (/ real(a = 1) /)
The fix proposed here implements a partial undo framework. It packs the
changed_syms and tentative_tbp variables into a single 'gfc_change_set'
struct, and makes it possible to have more than one of those structs,
organised as a stack. That change makes the current linked list
implementation using in-symbol 'tlink' pointer impractical as it prevents
the same symbol from being in more than one changeset. I don't really know
whether that is a true limitation, but have decided to lift it anyway by
registering the symbols in a vector instead. This makes backporting a bit
more difficult unfortunately; I will submit the (yet nonexisting) backport
patches separately.
The work is divided as follows:
1/5: Pack the changed_syms and tentative_tbp variables in a 'gfc_change_set'
struct and move to the vec API.
2/5: New function restore_old_symbol, extracted from gfc_undo_symbols.
3/5: Fix restore_old_symbol
4/5: Add support for more than one 'gfc_change_set' variable.
5/5: Fix gfc_match_array_constructor using the just introduced functions.
The patches are attached to the follow-up mails; the full diff is also provided
here.
Bootstrap-asan'ed and regression tested on x86_64-unknown-linux-gnu.
OK for trunk?

Mikael Morin wrote:
> Hello, this is a fix for cases like:>> program main> implicit none> intrinsic :: real> print *,(/ real(a = 1) /)> end>> where `real(a = 1)' is initially parsed as a typespec, creating> a symbol for 'a' along the way. The match fails, and then it is parsed> as a constructor element and accepted that way. However, accepting the> statement implies accepting all the symbols created so far including 'a',> which is wrong and leads to the ICE.> [...]> This makes backporting a bit more difficult unfortunately; I will submit the (yet nonexisting) backport> patches separately.
I know that this PR is a 4.6/4.7/4.8 regression and that it presumably
comes from a real-world code; still, given that one can relatively
simple work around the issue and that the patch is not tiny (though not
very complicated either), I wonder whether one should only fix it on the
4.8 trunk.
> Bootstrap-asan'ed and regression tested on x86_64-unknown-linux-gnu.> OK for trunk?
It looks mostly okay.
However, I somehow do not like some of names of the new
procedures/global vars. I find the new "single_undo_checkpoint_p" clear,
but, without the context of this patch, I presumably had no idea what a
"checkpoint" means when reading gfortran.h:
+void gfc_new_checkpoint (gfc_change_set &);
+void gfc_drop_last_checkpoint (void);
+void gfc_restore_last_checkpoint (void);
Similarly:
+static gfc_change_set change_set_var = { vNULL, vNULL, NULL };
+static gfc_change_set *changes = &change_set_var;
"changes" is a bit too vague for me (though it is not bad) – and
"change_set_var" doesn't make it clear enough that it is simply a
variable, which matches the empty default status.
BTW: Can you also change "static .... = {vNULL ...};" into "static const
.... = {vNULL ...};" to make sure the value is not accidentally changed?
Regarding the naming, can you use a bit more speaking names? For
instance – without claiming that the naming choice is best:
"undo_changes" instead of "changes", "emtpy_undo_change_set_var" instead
of "change_set_var", "gfc_new_undo_checkpoint" instead of
"gfc_new_checkpoint". It can be also something different, but it should
imply what they a good for.
To sum up: The patch is okay with the "const" added. I'd prefer some
"speaking names", but if you cannot come up with one, the patch is also
acceptable as is.
Tobias

Le 22/02/2013 16:23, Tobias Burnus a écrit :
> Mikael Morin wrote:>> Hello, this is a fix for cases like:>>>> program main>> implicit none>> intrinsic :: real>> print *,(/ real(a = 1) /)>> end>>>> where `real(a = 1)' is initially parsed as a typespec, creating>> a symbol for 'a' along the way. The match fails, and then it is parsed>> as a constructor element and accepted that way. However, accepting the>> statement implies accepting all the symbols created so far including 'a',>> which is wrong and leads to the ICE.>> [...]>> This makes backporting a bit more difficult unfortunately; I will>> submit the (yet nonexisting) backport>> patches separately.>> I know that this PR is a 4.6/4.7/4.8 regression and that it presumably> comes from a real-world code; still, given that one can relatively> simple work around the issue and that the patch is not tiny (though not> very complicated either), I wonder whether one should only fix it on the> 4.8 trunk.>
Yes we have had two major versions with the bug after all.
Let's go for 4.8 only, that's less work for me. :-)
>> Bootstrap-asan'ed and regression tested on x86_64-unknown-linux-gnu.>> OK for trunk?>> It looks mostly okay.>> However, I somehow do not like some of names of the new> procedures/global vars. I find the new "single_undo_checkpoint_p" clear,> but, without the context of this patch, I presumably had no idea what a> "checkpoint" means when reading gfortran.h:>> +void gfc_new_checkpoint (gfc_change_set &);> +void gfc_drop_last_checkpoint (void);> +void gfc_restore_last_checkpoint (void);>
I have tried to find a good balance between descriptiveness and
verboseness. Before settling on those names, I tried (reading my local
dead branches):
gfc_register_undo_level/gfc_unregister_undo_level/?
gfc_push_undo_level/gfc_pop_undo_level/gfc_undo_one_level
Do you prefer any of them? Otherwise I will just replace "checkpoint"
by "undo_checkpoint" as you suggested.
> Similarly:>> +static gfc_change_set change_set_var = { vNULL, vNULL, NULL };> +static gfc_change_set *changes = &change_set_var;>> "changes" is a bit too vague for me (though it is not bad) – and> "change_set_var" doesn't make it clear enough that it is simply a> variable, which matches the empty default status.>
It's the default status, and it is empty at the beginning. But it's not
constant; changed symbols are added to it by default.
Regarding the name "changes", it is made necessary because the symbol
changes and the tentative_tbp_list are packed together, thus the
variable can't be called "changed_syms" any more. If you don't mind
seeing "changed_syms->syms" in the code we can keep the original name.
Otherwise I'm not very inspired. Would you feel more comfortable with
"latest_undo_changes"?
> Regarding the naming, can you use a bit more speaking names? For> instance – without claiming that the naming choice is best:> "undo_changes" instead of "changes", "emtpy_undo_change_set_var" instead> of "change_set_var",>
As said above, it's not always empty, so I will make it
"default_undo_change_set_var" (and keep it non-const). For the rest, I
will add "undo_" before "change_set" and before "checkpoint". Sounds good?
Mikael