Commit Message

When playing around with hardware mapped, level triggered virtual IRQs,
there is the need to explicitly set the active state of an interrupt at
some point in time.
To prepare the GIC for that, we introduce a set_active_state() function
to let the VGIC manipulate the state of an associated hardware IRQ.
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
---
xen/arch/arm/gic-v2.c | 9 +++++++++
xen/arch/arm/gic-v3.c | 16 ++++++++++++++++
xen/arch/arm/gic.c | 5 +++++
xen/include/asm-arm/gic.h | 5 +++++
4 files changed, 35 insertions(+)

On 12/02/18 17:53, Andre Przywara wrote:
> Hi,
Hi Andre,
> On 12/02/18 13:55, Julien Grall wrote:>> Hi Andre,>>>> On 09/02/18 14:39, Andre Przywara wrote:>>> When playing around with hardware mapped, level triggered virtual IRQs,>>> there is the need to explicitly set the active state of an interrupt at>>> some point in time.>>> To prepare the GIC for that, we introduce a set_active_state() function>>> to let the VGIC manipulate the state of an associated hardware IRQ.>>>>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>>>> --->>> xen/arch/arm/gic-v2.c | 9 +++++++++>>> xen/arch/arm/gic-v3.c | 16 ++++++++++++++++>>> xen/arch/arm/gic.c | 5 +++++>>> xen/include/asm-arm/gic.h | 5 +++++>>> 4 files changed, 35 insertions(+)>>>>>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c>>> index 2e35892881..5339f69fbc 100644>>> --- a/xen/arch/arm/gic-v2.c>>> +++ b/xen/arch/arm/gic-v2.c>>> @@ -235,6 +235,14 @@ static unsigned int gicv2_read_irq(void)>>> return (readl_gicc(GICC_IAR) & GICC_IA_IRQ);>>> }>>> +static void gicv2_set_active_state(int irq, bool active)>>>> I would much prefer to have an irq_desc in parameter. This is matching>> the other interface> > ... and that's why I had it just like this in my first version. However> this proved to be nasty because I now need to get this irq_desc pointer> first, as the caller doesn't have it already. Since all we have and need> is the actual hardware IRQ number, I found it more straight-forward to> just use that number directly instead of going via the pointer and back> (h/w intid => irq_desc => irq).> >> and you could update the flags such as>> _IRQ_INPROGRESS which you don't do at the moment.> > Mmh, interesting point. I guess I should also clear this bit in the new> VGIC. At least once I wrapped my head around what this flag is> *actually* for (in conjunction with _IRQ_GUEST).> Anyway I guess this bit would still be set in our case.
For IRQ routed to the guest, the flag is used to know whether you need
to EOI the interrupt on domain destruction.
In general, I would like to keep desc->status in sync for the guest IRQ.
This is useful for debugging and potentially some ratelimit on interrupt
(I am thinking for ITS).
> >> Also, who is preventing two CPUs to clear the active bit at the same time?> > A certain hardware IRQ is assigned to one virtual IRQ on one VCPU at one> time only. Besides, GICD_ICACTIVERn has wired NAND semantics, so that's> naturally race free (as it was designed to be).> Unless I miss something here (happy to be pointed to an example where it> causes problems).
You could potentially have a race between ICACTIVER an ISACTIVER. This
is very similar to the enable/disable part. This matters a lot when
updating desc->status.
>>> +}>>> +>>> static void gicv2_set_irq_type(struct irq_desc *desc, unsigned int>>> type)>>> {>>> uint32_t cfg, actual, edgebit;>>> @@ -1241,6 +1249,7 @@ const static struct gic_hw_operations gicv2_ops = {>>> .eoi_irq = gicv2_eoi_irq,>>> .deactivate_irq = gicv2_dir_irq,>>> .read_irq = gicv2_read_irq,>>> + .set_active_state = gicv2_set_active_state,>>> .set_irq_type = gicv2_set_irq_type,>>> .set_irq_priority = gicv2_set_irq_priority,>>> .send_SGI = gicv2_send_SGI,>>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c>>> index 08d4703687..595eaef43a 100644>>> --- a/xen/arch/arm/gic-v3.c>>> +++ b/xen/arch/arm/gic-v3.c>>> @@ -475,6 +475,21 @@ static unsigned int gicv3_read_irq(void)>>> return irq;>>> }>>> +static void gicv3_set_active_state(int irq, bool active)>>> +{>>> + void __iomem *base;>>> +>>> + if ( irq >= NR_GIC_LOCAL_IRQS)>>> + base = GICD + (irq / 32) * 4;>>> + else>>> + base = GICD_RDIST_SGI_BASE;>>> +>>> + if ( active )>>> + writel(1U << (irq % 32), base + GICD_ISACTIVER);>>> + else>>> + writel(1U << (irq % 32), base + GICD_ICACTIVER);>>>> Shouldn't you wait until RWP bits is cleared here?> > I don't see why. I think this action has some posted semantics anyway,> so no need for any synchronisation. And also RWP does not track> I[SC]ACTIVER, only ICENABLER and some CTLR bits (ARM IHI 0069D, 8.9.4:> RWP[31]).> >>>>> +}>>>> Why don't you use the function poke?> > Ah, I didn't see this. But then this now does this quite costly RWP> dance now. We could add a check in there to only do this if we change> the affected registers or pass an explicit "bool wait_for_rwp" in there.
I guess this would be useful even for the current code. If I understand
correctly the RWP semantics, it should not be necessary to wait when
write to ISENABLER also.
Cheers,

Hi,
On 13/02/18 12:02, Julien Grall wrote:
> On 12/02/18 17:53, Andre Przywara wrote:>> Hi,> > Hi Andre,> >> On 12/02/18 13:55, Julien Grall wrote:>>> Hi Andre,>>>>>> On 09/02/18 14:39, Andre Przywara wrote:>>>> When playing around with hardware mapped, level triggered virtual IRQs,>>>> there is the need to explicitly set the active state of an interrupt at>>>> some point in time.>>>> To prepare the GIC for that, we introduce a set_active_state() function>>>> to let the VGIC manipulate the state of an associated hardware IRQ.>>>>>>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>>>>> --->>>> xen/arch/arm/gic-v2.c | 9 +++++++++>>>> xen/arch/arm/gic-v3.c | 16 ++++++++++++++++>>>> xen/arch/arm/gic.c | 5 +++++>>>> xen/include/asm-arm/gic.h | 5 +++++>>>> 4 files changed, 35 insertions(+)>>>>>>>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c>>>> index 2e35892881..5339f69fbc 100644>>>> --- a/xen/arch/arm/gic-v2.c>>>> +++ b/xen/arch/arm/gic-v2.c>>>> @@ -235,6 +235,14 @@ static unsigned int gicv2_read_irq(void)>>>> return (readl_gicc(GICC_IAR) & GICC_IA_IRQ);>>>> }>>>> +static void gicv2_set_active_state(int irq, bool active)>>>>>> I would much prefer to have an irq_desc in parameter. This is matching>>> the other interface>>>> ... and that's why I had it just like this in my first version. However>> this proved to be nasty because I now need to get this irq_desc pointer>> first, as the caller doesn't have it already. Since all we have and need>> is the actual hardware IRQ number, I found it more straight-forward to>> just use that number directly instead of going via the pointer and back>> (h/w intid => irq_desc => irq).>>>>> and you could update the flags such as>>> _IRQ_INPROGRESS which you don't do at the moment.>>>> Mmh, interesting point. I guess I should also clear this bit in the new>> VGIC. At least once I wrapped my head around what this flag is>> *actually* for (in conjunction with _IRQ_GUEST).>> Anyway I guess this bit would still be set in our case.> > For IRQ routed to the guest, the flag is used to know whether you need> to EOI the interrupt on domain destruction.
Yeah, I found that. In general I am a bit suspicious of replicating and
tracking the hardware IRQ state in software.
> In general, I would like to keep desc->status in sync for the guest IRQ.> This is useful for debugging and potentially some ratelimit on interrupt> (I am thinking for ITS).> >>>>> Also, who is preventing two CPUs to clear the active bit at the same>>> time?>>>> A certain hardware IRQ is assigned to one virtual IRQ on one VCPU at one>> time only. Besides, GICD_ICACTIVERn has wired NAND semantics, so that's>> naturally race free (as it was designed to be).>> Unless I miss something here (happy to be pointed to an example where it>> causes problems).> > You could potentially have a race between ICACTIVER an ISACTIVER.
I don't see why this would be a problem:
Either you activate the IRQ or you deactivate it. The
wired-OR/wired-NAND semantics makes sure this never gets inconsistent on
the hardware side. If you issue two conflicting requests at the same
time, that's a benign race, which you either don't care about or handle
via locking in the code which triggers these requests.
Besides, we only do one direction in the code at the moment anyway.
And this should be *clearing* the active state, and not setting it,
which is a bug I discovered yesterday.
> is very similar to the enable/disable part. This matters a lot when> updating desc->status.
Which is one reason why I am suspicious of this whole state replication.
But the desc lock should take care of this in general, no?
>>>> +}>>>> +>>>> static void gicv2_set_irq_type(struct irq_desc *desc, unsigned int>>>> type)>>>> {>>>> uint32_t cfg, actual, edgebit;>>>> @@ -1241,6 +1249,7 @@ const static struct gic_hw_operations>>>> gicv2_ops = {>>>> .eoi_irq = gicv2_eoi_irq,>>>> .deactivate_irq = gicv2_dir_irq,>>>> .read_irq = gicv2_read_irq,>>>> + .set_active_state = gicv2_set_active_state,>>>> .set_irq_type = gicv2_set_irq_type,>>>> .set_irq_priority = gicv2_set_irq_priority,>>>> .send_SGI = gicv2_send_SGI,>>>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c>>>> index 08d4703687..595eaef43a 100644>>>> --- a/xen/arch/arm/gic-v3.c>>>> +++ b/xen/arch/arm/gic-v3.c>>>> @@ -475,6 +475,21 @@ static unsigned int gicv3_read_irq(void)>>>> return irq;>>>> }>>>> +static void gicv3_set_active_state(int irq, bool active)>>>> +{>>>> + void __iomem *base;>>>> +>>>> + if ( irq >= NR_GIC_LOCAL_IRQS)>>>> + base = GICD + (irq / 32) * 4;>>>> + else>>>> + base = GICD_RDIST_SGI_BASE;>>>> +>>>> + if ( active )>>>> + writel(1U << (irq % 32), base + GICD_ISACTIVER);>>>> + else>>>> + writel(1U << (irq % 32), base + GICD_ICACTIVER);>>>>>> Shouldn't you wait until RWP bits is cleared here?>>>> I don't see why. I think this action has some posted semantics anyway,>> so no need for any synchronisation. And also RWP does not track>> I[SC]ACTIVER, only ICENABLER and some CTLR bits (ARM IHI 0069D, 8.9.4:>> RWP[31]).>>>>>>>>> +}>>>>>> Why don't you use the function poke?>>>> Ah, I didn't see this. But then this now does this quite costly RWP>> dance now. We could add a check in there to only do this if we change>> the affected registers or pass an explicit "bool wait_for_rwp" in there.> > I guess this would be useful even for the current code. If I understand> correctly the RWP semantics, it should not be necessary to wait when> write to ISENABLER also.
Exactly. I changed poke_irq() to do:
if ( offset == GICD_ICENABLER )
gicv3_wait_for_rwp(irqd->irq);
Does that sound acceptable?
I also just added poke_irq()/peek_irq() to gic-v2.c as well.
Cheers,
Andre.

On 13/02/18 15:01, Andre Przywara wrote:
> Hi,
Hi Andre,
> On 13/02/18 12:02, Julien Grall wrote:>> On 12/02/18 17:53, Andre Przywara wrote:>>> Hi,>>>> Hi Andre,>>>>> On 12/02/18 13:55, Julien Grall wrote:>>>> Hi Andre,>>>>>>>> On 09/02/18 14:39, Andre Przywara wrote:>>>>> When playing around with hardware mapped, level triggered virtual IRQs,>>>>> there is the need to explicitly set the active state of an interrupt at>>>>> some point in time.>>>>> To prepare the GIC for that, we introduce a set_active_state() function>>>>> to let the VGIC manipulate the state of an associated hardware IRQ.>>>>>>>>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>>>>>> --->>>>> xen/arch/arm/gic-v2.c | 9 +++++++++>>>>> xen/arch/arm/gic-v3.c | 16 ++++++++++++++++>>>>> xen/arch/arm/gic.c | 5 +++++>>>>> xen/include/asm-arm/gic.h | 5 +++++>>>>> 4 files changed, 35 insertions(+)>>>>>>>>>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c>>>>> index 2e35892881..5339f69fbc 100644>>>>> --- a/xen/arch/arm/gic-v2.c>>>>> +++ b/xen/arch/arm/gic-v2.c>>>>> @@ -235,6 +235,14 @@ static unsigned int gicv2_read_irq(void)>>>>> return (readl_gicc(GICC_IAR) & GICC_IA_IRQ);>>>>> }>>>>> +static void gicv2_set_active_state(int irq, bool active)>>>>>>>> I would much prefer to have an irq_desc in parameter. This is matching>>>> the other interface>>>>>> ... and that's why I had it just like this in my first version. However>>> this proved to be nasty because I now need to get this irq_desc pointer>>> first, as the caller doesn't have it already. Since all we have and need>>> is the actual hardware IRQ number, I found it more straight-forward to>>> just use that number directly instead of going via the pointer and back>>> (h/w intid => irq_desc => irq).>>>>>>> and you could update the flags such as>>>> _IRQ_INPROGRESS which you don't do at the moment.>>>>>> Mmh, interesting point. I guess I should also clear this bit in the new>>> VGIC. At least once I wrapped my head around what this flag is>>> *actually* for (in conjunction with _IRQ_GUEST).>>> Anyway I guess this bit would still be set in our case.>>>> For IRQ routed to the guest, the flag is used to know whether you need>> to EOI the interrupt on domain destruction.> > Yeah, I found that. In general I am a bit suspicious of replicating and> tracking the hardware IRQ state in software.
This is how Xen has been designed and I am pretty sure Linux does that
same. This makes easier if you want in the future to share interrupts
between multiple domains (think legacy PCI interrupt) and even abstract
the hardware from the virtual GIC.
> >> In general, I would like to keep desc->status in sync for the guest IRQ.>> This is useful for debugging and potentially some ratelimit on interrupt>> (I am thinking for ITS).>>>>>>>>> Also, who is preventing two CPUs to clear the active bit at the same>>>> time?>>>>>> A certain hardware IRQ is assigned to one virtual IRQ on one VCPU at one>>> time only. Besides, GICD_ICACTIVERn has wired NAND semantics, so that's>>> naturally race free (as it was designed to be).>>> Unless I miss something here (happy to be pointed to an example where it>>> causes problems).>>>> You could potentially have a race between ICACTIVER an ISACTIVER.> > I don't see why this would be a problem:> Either you activate the IRQ or you deactivate it. The> wired-OR/wired-NAND semantics makes sure this never gets inconsistent on> the hardware side. If you issue two conflicting requests at the same> time, that's a benign race, which you either don't care about or handle> via locking in the code which triggers these requests.> > Besides, we only do one direction in the code at the moment anyway.> And this should be *clearing* the active state, and not setting it,> which is a bug I discovered yesterday.
If the code do handle only one direction, then you should add an ASSERT
to check the state value rather than making believe the code is safe. It
is only safe for hardware interrupt assigned to a guest (thanks to
locking). It would not be for Xen.
> >> is very similar to the enable/disable part. This matters a lot when>> updating desc->status.> > Which is one reason why I am suspicious of this whole state replication.> But the desc lock should take care of this in general, no?
Yes if you take it.
> >>>>> +}>>>>> +>>>>> static void gicv2_set_irq_type(struct irq_desc *desc, unsigned int>>>>> type)>>>>> {>>>>> uint32_t cfg, actual, edgebit;>>>>> @@ -1241,6 +1249,7 @@ const static struct gic_hw_operations>>>>> gicv2_ops = {>>>>> .eoi_irq = gicv2_eoi_irq,>>>>> .deactivate_irq = gicv2_dir_irq,>>>>> .read_irq = gicv2_read_irq,>>>>> + .set_active_state = gicv2_set_active_state,>>>>> .set_irq_type = gicv2_set_irq_type,>>>>> .set_irq_priority = gicv2_set_irq_priority,>>>>> .send_SGI = gicv2_send_SGI,>>>>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c>>>>> index 08d4703687..595eaef43a 100644>>>>> --- a/xen/arch/arm/gic-v3.c>>>>> +++ b/xen/arch/arm/gic-v3.c>>>>> @@ -475,6 +475,21 @@ static unsigned int gicv3_read_irq(void)>>>>> return irq;>>>>> }>>>>> +static void gicv3_set_active_state(int irq, bool active)>>>>> +{>>>>> + void __iomem *base;>>>>> +>>>>> + if ( irq >= NR_GIC_LOCAL_IRQS)>>>>> + base = GICD + (irq / 32) * 4;>>>>> + else>>>>> + base = GICD_RDIST_SGI_BASE;>>>>> +>>>>> + if ( active )>>>>> + writel(1U << (irq % 32), base + GICD_ISACTIVER);>>>>> + else>>>>> + writel(1U << (irq % 32), base + GICD_ICACTIVER);>>>>>>>> Shouldn't you wait until RWP bits is cleared here?>>>>>> I don't see why. I think this action has some posted semantics anyway,>>> so no need for any synchronisation. And also RWP does not track>>> I[SC]ACTIVER, only ICENABLER and some CTLR bits (ARM IHI 0069D, 8.9.4:>>> RWP[31]).>>>>>>>>>>>> +}>>>>>>>> Why don't you use the function poke?>>>>>> Ah, I didn't see this. But then this now does this quite costly RWP>>> dance now. We could add a check in there to only do this if we change>>> the affected registers or pass an explicit "bool wait_for_rwp" in there.>>>> I guess this would be useful even for the current code. If I understand>> correctly the RWP semantics, it should not be necessary to wait when>> write to ISENABLER also.> > Exactly. I changed poke_irq() to do:> if ( offset == GICD_ICENABLER )> gicv3_wait_for_rwp(irqd->irq);> > Does that sound acceptable?
I would prefer the prototype to be updated with another parameter that
will tell whether we want to wait or not.
> > I also just added poke_irq()/peek_irq() to gic-v2.c as well.> > Cheers,> Andre.>
Cheers,