Comments

Martin Mokrejs wrote:
> Hi everybody,
>
> Bjorn Helgaas wrote:
>> [+cc linux-pci, Sarah, Alan]
>>
>> On Mon, Mar 11, 2013 at 10:02 AM, Martin Mokrejs
>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>> [re-sending to you all three directly, looks the original email did not make it into linux-pci
>>> through some filters]
>>>
>>> I use for my daily work acpiphp to manage express cards in Dell Vostro 3550.
>>> I have never seen something like this before and believe this is some new regression
>>> in 3.8 series. I had in teh a USB3 card and ejected it. Then I inserted a
>>> SATA Sil3132 card but it is not detected and dmesg still ends with last lines
>>> added when the USB card was being removed. The funny thing is that lspci reports
>>> a mixture of USB-card properties with NEC chips along with Silicon Image eSATA card.
>>
>> I don't know anything about the kmemleaks mentioned elsewhere in this
>> thread, but the idea of "stale PCI device info" seems possibly related
>> to some acpiphp issues we've been working on recently.
>>
>> Starting with v3.9, we don't handle ACPI Bus Check notifications to
>> host bridges correctly, and the result is that when we're using
>> acpiphp, we don't notice when PCI devices are added or removed. There
>> are more details in https://bugzilla.kernel.org/show_bug.cgi?id=57961
>
> Looks to me it is already in 3.10-rc4 which I tested now. No, I still do see
> same problem like before: a hotremoval of NEC-based xHCI express card is detected
> on every second eject. But, sometimes it seems it is only delayed by some 25-30
> seconds. Would have to do more testing. However, there are some *new* kmemleaks
> reported by kernel related to acpiphpp bu xhci_hcd. That could a hint why
> the hotremoval sometimes proceeds delayed but sometimes maybe not at all or at
> least not *immediately* like for any other device?
>
> However, the stale sysfs entries for partially removed device SiI3132 (sata_sil24
> driver) are NOT appearing anymore, good. That used to be associated with
> 'sata_sil24: IRQ status == 0xffffffff, PCI fault or device removal?' line.
> Now, I see under 3.10-rc4 the extra message about 'ACPI: Device does not support D3cold'.
> would be nice if it said what device is it talking about? About upstream root port
> or about my end device (express card)? Is it related by pcie_aspm= kernel
> commandline option? If yes, please include the relevant info the message text.
> referring to this being affected by the particular value. At the moment I used:
> pcie_aspm=off
>
> --- dmesg_initial__inserted_eSATA__ejected__inserted__ejected__inserted.txt 2013-06-07 02:53:56.000000000 +0200
> +++ dmesg_initial__inserted_eSATA__ejected__inserted__ejected__inserted__ejected.txt 2013-06-07 02:54:09.000000000 +0200
> @@ -1439,3 +1439,5 @@
> [ 254.317365] ata12: SATA max UDMA/100 host m128@0xf6c04000 port 0xf6c02000 irq 19
> [ 256.400454] ata11: SATA link down (SStatus 0 SControl 0)
> [ 258.493027] ata12: SATA link down (SStatus 0 SControl 0)
> +[ 267.116723] sata_sil24: IRQ status == 0xffffffff, PCI fault or device removal?
> +[ 267.117779] ACPI: Device does not support D3cold
>
>
> So, in my eyes the "stale pci info" issue is fixed in 3.10-rc4 at least under acpiphp and pcie_aspm=off.

And to be even more exact, I had CONFIG_HOTPLUG_PCI_ACPI=y. I mention it because I now see an updated
v2 patch from Yinghai:
[PATCH v3.9 stable] PCI: acpiphp: Re-enumerate devices when host bridge receives Bus Check
Please make sure that whatever I tested in plain 3.10-rc4 is what you had in those bugzilla patches
under https://bugzilla.kernel.org/show_bug.cgi?id=57961 or what Yinghai posted as an update.
Just in case I tested a different version.
Martin
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Martin Mokrejs wrote:
> Martin Mokrejs wrote:
>> [...]
>> So, in my eyes the "stale pci info" issue is fixed in 3.10-rc4 at least under acpiphp and pcie_aspm=off.

No, it is not. :(
Sorry, I was "able" to plug a firewire card into the express card slot faster
than xhci_hcd released the resources of the NEC-based xHCI card that was still
being hot-removed. So, like in older kernels, lspci reports a chimeric entry 11:00 of the
NEC card and of the VIA-based firewire card. Upon eject of the VIA card
xhci_hcd released the resources with the usual messages, including the complaint
that 'xhci_hcd 0000:11:00.0: Host not halted after 16000 microseconds.'
Nothing new in dmesg, I would just say that whatever makes xhci_hcd or pcieport
slow in turning PME# to disabled is effectively blocked if I plug some card back
into the express slot. It seems to me the "conclusion" back in Jan-April
was that pcieport is to blame and not xhci_hcd, and it always seemed to proceed
smoothly once 'PME# disabled' appeared in dmesg.
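
The observation above (removal only completes once 'PME# disabled' shows up in dmesg) could be checked mechanically before plugging the next card in. The following is only a rough sketch under assumptions: the helper names are made up for illustration, the only string taken from the report is the 'PME# disabled' substring, and counting lines as a baseline is fragile if the kernel ring buffer wraps. Reading dmesg may also need root on some systems.

```python
import subprocess
import time

# Substring quoted in the report above; the exact surrounding log text is not assumed.
PME_MARKER = "PME# disabled"


def read_dmesg():
    """Return the current kernel log as a list of lines (may need root)."""
    out = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return out.stdout.splitlines()


def marker_seen_since(lines, baseline, marker=PME_MARKER):
    """True if `marker` occurs in any line added after the first `baseline` lines."""
    return any(marker in line for line in lines[baseline:])


def wait_for_marker(baseline, read_lines=read_dmesg, timeout=60.0, poll=1.0):
    """Poll the log until the marker appears after `baseline`, or `timeout` expires."""
    deadline = time.monotonic() + timeout
    while True:
        if marker_seen_since(read_lines(), baseline):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A possible way to use it: record `baseline = len(read_dmesg())` just before ejecting the card, eject, and only insert the next card once `wait_for_marker(baseline)` returns True. A False result would correspond to the delayed or missing removal described above.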