On Fri, Aug 11, 2017 at 8:07 PM, John Stultz <john.stu...@linaro.org> wrote:
> On Fri, Aug 11, 2017 at 5:31 PM, John Stultz <john.stu...@linaro.org> wrote:
>> On Fri, Aug 11, 2017 at 5:10 PM, Wei Wang <wei...@google.com> wrote:
>>>> If after Cong's fix, the issue still happens, could you help try the
>>>> patch attached and collect all logs when you try to reproduce the
>>>> issue? It would be great to have logs for both success case and the
>>>> failure case.
>>>>
>>>> Thanks so much for your help.
>>>>
>>>
>>> I think we have a potential fix for this issue.
>>> Martin and I found that when addrconf_dst_alloc() creates a rt6, it is
>>> possible that rt6->dst.dev points to loopback device while
>>> rt6->rt6i_idev->dev points to a real device.
>>> When the real device goes down, the current fib6 clean up code only
>>> checks for rt6->dst.dev and assumes rt6->rt6i_idev->dev is the same.
>>> That leaves unreleased refcnt on the real device if rt6->dst.dev
>>> points to loopback dev.
>>>
>>> The attached potential fix has been tested by Martin, who confirmed
>>> that it fixes his issue.
>>>
>>> John,
>>> It would be great if you can also give it a try and see if it fixes the
>>> issue on your side before I submit an official patch.
>>
>> So yes, sorry I haven't been able to get back quicker on the other
>> patches sent, was mucking about in other work.
>>
>> So yea, this patch (potential fix for unregister_netdevice()) seems
>> to avoid the issue.
>>
>> I'm going to do some further testing, but it's looking good so far.
>
> Looks good so far! I've not hit the issue yet.
>
> Thanks so much for sorting out a fix!
>
> If it's useful:
> Tested-by: John Stultz <john.stu...@linaro.org>
>
> thanks again
> -john
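
For readers following the thread, the refcount leak Wei describes above can be illustrated with a toy userspace model. This is only a sketch in Python, not kernel code and not the attached patch: the class and function names (NetDevice, Route, alloc_local_route, cleanup_dst_only, cleanup_checking_idev) are hypothetical, and the "fixed" variant shows just one possible shape of a cleanup that also looks at the idev's device. The only assumption taken from the thread is that a route can have dst.dev pointing at loopback while rt6i_idev->dev points at a real device, and that the cleanup matched on dst.dev alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NetDevice:
    """Toy stand-in for a network device with a reference count."""
    name: str
    refcnt: int = 0

    def hold(self) -> None:
        self.refcnt += 1

    def put(self) -> None:
        self.refcnt -= 1


@dataclass
class Route:
    """Toy stand-in for an rt6 entry (hypothetical, heavily simplified)."""
    dst_dev: NetDevice              # models rt6->dst.dev
    idev_dev: Optional[NetDevice]   # models rt6->rt6i_idev->dev


def alloc_local_route(loopback: NetDevice, real: NetDevice) -> Route:
    """Model the case from the thread: dst.dev is loopback, idev is real."""
    loopback.hold()
    real.hold()
    return Route(dst_dev=loopback, idev_dev=real)


def cleanup_dst_only(routes: List[Route], dev: NetDevice) -> List[Route]:
    """Cleanup that matches only on dst_dev, as the thread describes."""
    kept = []
    for rt in routes:
        if rt.dst_dev is dev:
            # Route is torn down: drop both references it holds.
            rt.dst_dev.put()
            if rt.idev_dev is not None:
                rt.idev_dev.put()
        else:
            # Never inspected further, even if idev_dev is the dying device.
            kept.append(rt)
    return kept


def cleanup_checking_idev(routes: List[Route], dev: NetDevice) -> List[Route]:
    """Hypothetical variant that also releases references held via idev_dev."""
    kept = cleanup_dst_only(routes, dev)
    for rt in kept:
        if rt.idev_dev is dev:
            rt.idev_dev.put()
            rt.idev_dev = None
    return kept


if __name__ == "__main__":
    lo, eth0 = NetDevice("lo"), NetDevice("eth0")
    routes = cleanup_dst_only([alloc_local_route(lo, eth0)], eth0)
    print("dst-only cleanup, eth0 refcnt:", eth0.refcnt)

    lo, eth0 = NetDevice("lo"), NetDevice("eth0")
    routes = cleanup_checking_idev([alloc_local_route(lo, eth0)], eth0)
    print("idev-aware cleanup, eth0 refcnt:", eth0.refcnt)
```

In this model the dst-only cleanup leaves eth0 with a count of 1, which corresponds to the unreleased reference on the real device; the idev-aware variant brings it back to 0.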