*Re: [PATCH net-next 3/3] net: stmmac: Convert to phylink and remove phylib logic
2019-06-25 7:37 ` Jose Abreu@ 2019-06-25 11:10 ` Jon Hunter
2019-06-25 11:25 ` Jose Abreu0 siblings, 1 reply; 29+ messages in thread
From: Jon Hunter @ 2019-06-25 11:10 UTC (permalink / raw)
To: Jose Abreu, linux-kernel, netdev
Cc: Joao Pinto, David S . Miller, Giuseppe Cavallaro,
Alexandre Torgue, Russell King, Andrew Lunn, Florian Fainelli,
Heiner Kallweit, linux-tegra
On 25/06/2019 08:37, Jose Abreu wrote:
> From: Jon Hunter <jonathanh@nvidia.com>
>
>> Any further feedback? I am still seeing this issue on today's -next.
>
> Apologies but I was in FTO.
>
> Is there any possibility you can just disable the ethX configuration in
> the rootfs mount and manually configure it after rootfs is done ?
>
> I just want to make sure in which conditions this is happening (if in
> ifdown or ifup).
I have been looking at this a bit closer and I can see the problem. What
happens is that ...
1. stmmac_mac_link_up() is called and priv->eee_active is set to false
2. stmmac_eee_init() is called but because priv->eee_active is false,
timer_setup() for eee_ctrl_timer is never called.
3. stmmac_eee_init() returns true and so then priv->eee_enabled is set
to true.
4. When stmmac_tx_clean() is called because priv->eee_enabled is set to
true, mod_timer() is called for the eee_ctrl_timer, but because
timer_setup() was never called, we hit the BUG defined at
kernel/time/timer.c:952, because no function is defined for the
timer.
The following fixes it for me ...
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -399,10 +399,13 @@ bool stmmac_eee_init(struct stmmac_priv *priv)
mutex_lock(&priv->lock);
/* Check if it needs to be deactivated */
- if (!priv->eee_active && priv->eee_enabled) {
- netdev_dbg(priv->dev, "disable EEE\n");
- del_timer_sync(&priv->eee_ctrl_timer);
- stmmac_set_eee_timer(priv, priv->hw, 0, tx_lpi_timer);
+ if (!priv->eee_active) {
+ if (priv->eee_enabled) {
+ netdev_dbg(priv->dev, "disable EEE\n");
+ del_timer_sync(&priv->eee_ctrl_timer);
+ stmmac_set_eee_timer(priv, priv->hw, 0, tx_lpi_timer);
+ }
+ mutex_unlock(&priv->lock);
return false;
}
It also looks like you have a potention deadlock in the current code
because in the case of if (!priv->eee_active && priv->eee_enabled)
you don't unlock the mutex. The above fixes this as well. I can send a
formal patch if this looks correct.
Cheers
Jon
--
nvpublic
^permalinkrawreply [flat|nested] 29+ messages in thread

*RE: [PATCH net-next 3/3] net: stmmac: Convert to phylink and remove phylib logic
2019-06-25 11:10 ` Jon Hunter@ 2019-06-25 11:25 ` Jose Abreu0 siblings, 0 replies; 29+ messages in thread
From: Jose Abreu @ 2019-06-25 11:25 UTC (permalink / raw)
To: Jon Hunter, Jose Abreu, linux-kernel, netdev
Cc: Joao Pinto, David S . Miller, Giuseppe Cavallaro,
Alexandre Torgue, Russell King, Andrew Lunn, Florian Fainelli,
Heiner Kallweit, linux-tegra
From: Jon Hunter <jonathanh@nvidia.com>
> I have been looking at this a bit closer and I can see the problem. What
> happens is that ...
>
> 1. stmmac_mac_link_up() is called and priv->eee_active is set to false
> 2. stmmac_eee_init() is called but because priv->eee_active is false,
> timer_setup() for eee_ctrl_timer is never called.
> 3. stmmac_eee_init() returns true and so then priv->eee_enabled is set
> to true.
> 4. When stmmac_tx_clean() is called because priv->eee_enabled is set to
> true, mod_timer() is called for the eee_ctrl_timer, but because
> timer_setup() was never called, we hit the BUG defined at
> kernel/time/timer.c:952, because no function is defined for the
> timer.
>
> The following fixes it for me ...
>
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -399,10 +399,13 @@ bool stmmac_eee_init(struct stmmac_priv *priv)
> mutex_lock(&priv->lock);
>
> /* Check if it needs to be deactivated */
> - if (!priv->eee_active && priv->eee_enabled) {
> - netdev_dbg(priv->dev, "disable EEE\n");
> - del_timer_sync(&priv->eee_ctrl_timer);
> - stmmac_set_eee_timer(priv, priv->hw, 0, tx_lpi_timer);
> + if (!priv->eee_active) {
> + if (priv->eee_enabled) {
> + netdev_dbg(priv->dev, "disable EEE\n");
> + del_timer_sync(&priv->eee_ctrl_timer);
> + stmmac_set_eee_timer(priv, priv->hw, 0, tx_lpi_timer);
> + }
> + mutex_unlock(&priv->lock);
> return false;
> }
>
> It also looks like you have a potention deadlock in the current code
> because in the case of if (!priv->eee_active && priv->eee_enabled)
> you don't unlock the mutex. The above fixes this as well. I can send a
> formal patch if this looks correct.
Thanks for looking into this! The fix looks correct so if you could
submit a patch it would be great!
Thanks,
Jose Miguel Abreu
^permalinkrawreply [flat|nested] 29+ messages in thread

*Re: [PATCH net-next 0/3] net: stmmac: Convert to phylink
2019-07-22 13:58 ` Jose Abreu@ 2019-07-22 14:19 ` Andrew Lunn
2019-07-22 14:26 ` Jose Abreu0 siblings, 1 reply; 29+ messages in thread
From: Andrew Lunn @ 2019-07-22 14:19 UTC (permalink / raw)
To: Jose Abreu
Cc: Ondřej Jirman, linux-kernel, netdev, Joao Pinto,
David S . Miller, Giuseppe Cavallaro, Alexandre Torgue,
Russell King, Florian Fainelli, Heiner Kallweit
On Mon, Jul 22, 2019 at 01:58:20PM +0000, Jose Abreu wrote:
> From: Andrew Lunn <andrew@lunn.ch>
> Date: Jul/22/2019, 14:40:23 (UTC+00:00)
>
> > Does this mean that all stmmac variants support 1G? There are none
> > which just support Fast Ethernet?
>
> This glue logic drivers sometimes reflect a custom IP that's Synopsys
> based but modified by customer, so I can't know before-hand what's the
> supported max speed. There are some old versions that don't support 1G
> but I expect that PHY driver limits this ...
If a Fast PHY is used, then yes, it would be limited. But sometimes a
1G PHY is used because they are cheaper than a Fast PHY.
> > I'm also not sure the change fits the problem. Why did it not
> > negotiate 100FULL rather than 10Half? You are only moving the 1G
> > speeds around, so 100 speeds should of been advertised and selected.
>
> Hmm, now that I'm looking at it closer I agree with you. Maybe link
> partner or PHY doesn't support 100M ?
In the working case, ethtool shows the link partner supports 10, 100,
and 1G. So something odd is going on here.
You fix does seems reasonable, and it has been reported to fix the
issue, but it would be good to understand what is going on here.
Andrew
^permalinkrawreply [flat|nested] 29+ messages in thread