Commit Message

Back to this...
Bernd Schmidt <bernds@codesourcery.com> writes:
>> gcc/>> * reload1.c (reload_regs_reach_end_p): Replace with...>> (reload_reg_rtx_reaches_end_p): ...this function.>> (new_spill_reg_store): Update commentary.>> (emit_input_reload_insns): Don't clear new_spill_reg_store here.>> (emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p>> before setting new_spill_reg_store.>> (emit_reload_insns): Use a separate loop to clear new_spill_reg_store.>> Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.>> Also use reload_reg_rtx_reaches_end_p when recording inheritance>> information for non-spill reload registers.>> Just an update to say that based on our discussion I think the general> approach is OK, but I'm still trying to figure out what exactly this> piece of code is doing, and whether the changes to it make sense:>>> @@ -8329,30 +8329,33 @@ emit_reload_insns (struct insn_chain *ch>> the storing insn so that we may delete this insn with>> delete_output_reload. */>> src_reg = reload_reg_rtx_for_output[r];>> ->> - /* If this is an optional reload, try to find the source reg>> - from an input reload. */>> - if (! src_reg)>> + if (src_reg>> + && reload_reg_rtx_reaches_end_p (src_reg, r))>> + store_insn = new_spill_reg_store[REGNO (src_reg)];>> + else>> {>> + /* If this is an optional reload, try to find the>> + source reg from an input reload. */>> rtx set = single_set (insn);>> if (set && SET_DEST (set) == rld[r].out)>> {>> int k;>> + rtx cand;>> >> src_reg = SET_SRC (set);>> store_insn = insn;>> for (k = 0; k < n_reloads; k++)>> - {>> - if (rld[k].in == src_reg)>> - {>> - src_reg = reload_reg_rtx_for_input[k];>> - break;>> - }>> - }>> + if (rld[k].in == src_reg)>> + {>> + cand = reload_reg_rtx_for_input[k];>> + if (reload_reg_rtx_reaches_end_p (cand, k))>> + {>> + src_reg = cand;>> + break;>> + }>> + }>> }>> }>> - else>> - store_insn = new_spill_reg_store[REGNO (src_reg)];>> if (src_reg && REG_P (src_reg)>> && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)>> {
Yeah, I was in two minds what to do here. AIUI, the code:
if (!HARD_REGISTER_NUM_P (out_regno))
{
rtx src_reg, store_insn = NULL_RTX;
reg_last_reload_reg[out_regno] = 0;
/* If we can find a hard register that is stored, record
the storing insn so that we may delete this insn with
delete_output_reload. */
src_reg = reload_reg_rtx_for_output[r];
/* If this is an optional reload, try to find the source reg
from an input reload. */
if (! src_reg)
{
rtx set = single_set (insn);
if (set && SET_DEST (set) == rld[r].out)
{
int k;
[A] src_reg = SET_SRC (set);
store_insn = insn;
for (k = 0; k < n_reloads; k++)
{
if (rld[k].in == src_reg)
{
[B] src_reg = reload_reg_rtx_for_input[k];
break;
}
}
}
}
else
[C] store_insn = new_spill_reg_store[REGNO (src_reg)];
if (src_reg && REG_P (src_reg)
&& REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
{
...record inheritance for out <- src_reg...
is coping with three cases:
[C] we have an output reload for a non-spill register. E.g. the
pre-reload instruction might be:
(set (reg P1) (.... (reg H1) ...))
REG_DEAD: H1
(H = hard register, P = pseudo register), where the P1 and H1
operands have matching constraints. In this case we might use
H1 as the reload register for (reg P1).
This is the case that really does need reload_reg_rtx_reaches_end_p.
E.g. the SET above could be in parallel with another SET that needs
a secondary output reload. It's possible in principle for H1 to be
used as the secondary reload register there too, so that the final
sequence looks like:
(parallel [(set (reg H1 [orig P1]) (... (reg H1) ...))
(set (reg H2 [orig P2]) (...))]
(set (reg P1) (reg H1))
(set (reg H1) (reg H2))
(set (reg P2) (reg H1))
(I say "in principle" because I don't know whether there is code
to take advantage of this for non-spill registers like H1.
But it would be a valid choice.)
Either way, the idea is that new_spill_reg_store[R] is only valid
for reload registers R that reach the end of the reload sequence.
We really should check that H1 reaches the end before using
new_spill_reg_store[H1].
[A] (but not [B]). I think this is for optional reloads that we
decided not to make, in cases where the source of the set needs
no reloads. E.g. the pre-reload instruction might be:
(set (reg P1) (reg H1))
and P1 has an optional output reload that we decided not to make.
Here we record that H1 holds P1 as a possible inheritance.
If inheritance causes the P1 <- H1 move to become unnecessary,
we can delete it as dead.
There doesn't seem to be any kind of "reaches end" check, but I
suppose the hope is that instructions like this are going to be so
simple that there's no point. Although I'm not sure whether for:
(parallel [(set (reg P1) (reg H1))
(clobber (scratch))])
we'd ever consider using H1 for the scratch in cases where the
clobber isn't an earlyclobber.
Either way, this is unrelated to the original problem of
new_spill_reg_store being out of whack. reload_reg_rtx_reaches_end_p
wouldn't be the right thing to check, since we don't have a reload
register to test in this case. Because there's no "off the shelf"
fix, it's probably best to leave it be until we have a testcase
that exhibits a problem.
The patch doesn't change this case.
[B] I think this is similar to [A], but for cases where the source
is reloaded. E.g. the pre-reload instruction might be:
(set (reg P1) (reg P2))
where P1 has an optional reload that we decided not to make and
P2 is reloaded into H2. The final sequence would look like:
(set (reg H2) (reg P2))
(set (reg P1) (reg H2))
and the code would record P1 <- H2 for inheritance.
There does seem to be a real danger here that H2 could be reused
for clobbers (except again for earlyclobbers), so it seemed best
to test reload_reg_rtx_reaches_end_p. However, this change was
by inspection, and isn't directly related to the new handling
of new_spill_reg_store[], since the inherited "reload" is
actually the reloaded instruction itself.
If H2 doesn't reach the end, the patch falls back to [A].
This means that if the original source is a hard register
of the wrong class (instead of a pseudo as in the example above),
we can continue to record an inheritance for that.
I suppose it could be argued that this in itself has a risk
(although I think a smaller one) because of the aforementioned
lack of a "reaches end" test in [A]. So while the effect of the
reload_reg_rtx_reaches_end_p check is to stop [B] from trumping [A]
in cases where [B] is definitely invalid, it doesn't do anything to
fix any problems that there might be in [A].
Which is why this is a quandry. After the patch for [C],
I think this is the only remaining case in which we record
inheritance for a reload register without checking whether
the reload register reaches the end. (Note that the "i >= 0"
block above this one already applies the check consistently.)
So on the one hand, it seems wrong to leave an obvious hole
(especially one that will lead to silent wrong-code bugs),
but on the other, it's always dangerous changing reload by
inspection.
If you prefer changing as little as possible, then the version below
just does [C]. Also tested on mips64-linux-gnu.
Richard
gcc/
* reload1.c (reload_regs_reach_end_p): Replace with...
(reload_reg_rtx_reaches_end_p): ...this function.
(new_spill_reg_store): Update commentary.
(emit_input_reload_insns): Don't clear new_spill_reg_store here.
(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
before setting new_spill_reg_store.
(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
Also use reload_reg_rtx_reaches_end_p when reading new_spill_reg_store
for non-spill reload registers.