Commit Message

recently many new supports have been added in the stmmac driver w/o taking care
about where each new field had to be placed inside the private structure for
guaranteeing the best cache usage.
This is what I wanted in the beginning, so this patch reorganizes all the fields
in order to keep adjacent fields for cache effect.
I have also tried to optimize them by using pahole.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 70 +++++++++++++-------------
1 files changed, 35 insertions(+), 35 deletions(-)

On 4/3/2013 9:21 AM, Eric Dumazet wrote:
> On Wed, 2013-04-03 at 07:41 +0200, Giuseppe CAVALLARO wrote:>> recently many new supports have been added in the stmmac driver w/o taking care>> about where each new field had to be placed inside the private structure for>> guaranteeing the best cache usage.>> This is what I wanted in the beginning, so this patch reorganizes all the fields>> in order to keep adjacent fields for cache effect.>> I have also tried to optimize them by using pahole.>>>> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>>> --->> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 70 +++++++++++++------------->> 1 files changed, 35 insertions(+), 35 deletions(-)>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h>> index 75f997b..8aa28c5 100644>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h>> @@ -35,36 +35,45 @@>>>> struct stmmac_priv {>> /* Frequently used values are kept adjacent for cache effect */>> - struct dma_desc *dma_tx ____cacheline_aligned; /* Basic TX desc */>> - struct dma_extended_desc *dma_etx; /* Extended TX descriptor */>> - dma_addr_t dma_tx_phy;>> - struct sk_buff **tx_skbuff;>> - dma_addr_t *tx_skbuff_dma;>> + struct dma_extended_desc *dma_etx;>> + struct dma_desc *dma_tx ____cacheline_aligned_in_smp;>> + struct sk_buff **tx_skbuff ____cacheline_aligned_in_smp;>> dma_tx & tx_skbuff are readonly, why put them in separate cache lines ?
I put tx_skbuff in a separate cache line because, when we use extended
descriptors, the driver works with dma_etx instead of dma_tx.
So my idea was to have both dma_etx, dma_tx and tx_skbuff aligned in
any case.
>> It seems there is an abuse of ____cacheline_aligned_in_smp in this> driver (especially if this driver only runs on UP arch)
Yes I know that there is this abuse but why do you see an abuse for UP?
In that case we should fall through ____cacheline_aligned (e.g. it is ok
for SH4).
peppe
>>>> --> To unsubscribe from this list: send the line "unsubscribe netdev" in> the body of a message to majordomo@vger.kernel.org> More majordomo info at http://vger.kernel.org/majordomo-info.html>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Wed, 2013-04-03 at 09:33 +0200, Giuseppe CAVALLARO wrote:
> On 4/3/2013 9:21 AM, Eric Dumazet wrote:> > On Wed, 2013-04-03 at 07:41 +0200, Giuseppe CAVALLARO wrote:> >> recently many new supports have been added in the stmmac driver w/o taking care> >> about where each new field had to be placed inside the private structure for> >> guaranteeing the best cache usage.> >> This is what I wanted in the beginning, so this patch reorganizes all the fields> >> in order to keep adjacent fields for cache effect.> >> I have also tried to optimize them by using pahole.> >>> >> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>> >> ---> >> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 70 +++++++++++++-------------> >> 1 files changed, 35 insertions(+), 35 deletions(-)> >>> >> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h> >> index 75f997b..8aa28c5 100644> >> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h> >> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h> >> @@ -35,36 +35,45 @@> >>> >> struct stmmac_priv {> >> /* Frequently used values are kept adjacent for cache effect */> >> - struct dma_desc *dma_tx ____cacheline_aligned; /* Basic TX desc */> >> - struct dma_extended_desc *dma_etx; /* Extended TX descriptor */> >> - dma_addr_t dma_tx_phy;> >> - struct sk_buff **tx_skbuff;> >> - dma_addr_t *tx_skbuff_dma;> >> + struct dma_extended_desc *dma_etx;> >> + struct dma_desc *dma_tx ____cacheline_aligned_in_smp;> >> + struct sk_buff **tx_skbuff ____cacheline_aligned_in_smp;> >> > dma_tx & tx_skbuff are readonly, why put them in separate cache lines ?> > I put tx_skbuff in a separate cache line because, when we use extended> descriptors, the driver works with dma_etx instead of dma_tx.> So my idea was to have both dma_etx, dma_tx and tx_skbuff aligned in> any case.
[...]
It's generally not that important to put fields at the beginning of a
cache line. (If you've measured with typical systems using stmmac and
found an advantage, then I accept that you have a good reason to do
this. But that advantage is unlikely to be specific to SMP systems.)
I would use ____cacheline_aligned_in_smp to separate fields that are
likely to be changed on different CPUs, so that the cache line doesn't
bounce between their caches. Fields that are rarely modified (such as
these pointers), or are used together on the same CPU should be packed
together into a small number of cache lines.
Ben.

On 4/3/2013 6:09 PM, Ben Hutchings wrote:
> On Wed, 2013-04-03 at 09:33 +0200, Giuseppe CAVALLARO wrote:>> On 4/3/2013 9:21 AM, Eric Dumazet wrote:>>> On Wed, 2013-04-03 at 07:41 +0200, Giuseppe CAVALLARO wrote:>>>> recently many new supports have been added in the stmmac driver w/o taking care>>>> about where each new field had to be placed inside the private structure for>>>> guaranteeing the best cache usage.>>>> This is what I wanted in the beginning, so this patch reorganizes all the fields>>>> in order to keep adjacent fields for cache effect.>>>> I have also tried to optimize them by using pahole.>>>>>>>> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>>>>> --->>>> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 70 +++++++++++++------------->>>> 1 files changed, 35 insertions(+), 35 deletions(-)>>>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h>>>> index 75f997b..8aa28c5 100644>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h>>>> @@ -35,36 +35,45 @@>>>>>>>> struct stmmac_priv {>>>> /* Frequently used values are kept adjacent for cache effect */>>>> - struct dma_desc *dma_tx ____cacheline_aligned; /* Basic TX desc */>>>> - struct dma_extended_desc *dma_etx; /* Extended TX descriptor */>>>> - dma_addr_t dma_tx_phy;>>>> - struct sk_buff **tx_skbuff;>>>> - dma_addr_t *tx_skbuff_dma;>>>> + struct dma_extended_desc *dma_etx;>>>> + struct dma_desc *dma_tx ____cacheline_aligned_in_smp;>>>> + struct sk_buff **tx_skbuff ____cacheline_aligned_in_smp;>>>>>> dma_tx & tx_skbuff are readonly, why put them in separate cache lines ?>>>> I put tx_skbuff in a separate cache line because, when we use extended>> descriptors, the driver works with dma_etx instead of dma_tx.>> So my idea was to have both dma_etx, dma_tx and tx_skbuff aligned in>> any case.> [...]>> It's generally not that important to put fields at the beginning of a> cache line. (If you've measured with typical systems using stmmac and> found an advantage, then I accept that you have a good reason to do> this. But that advantage is unlikely to be specific to SMP systems.)
That is the point. I had seen an improvement when testing on SH4
platforms if these pointers were at the beginning of a cache line.
> I would use ____cacheline_aligned_in_smp to separate fields that are> likely to be changed on different CPUs, so that the cache line doesn't> bounce between their caches. Fields that are rarely modified (such as> these pointers), or are used together on the same CPU should be packed> together into a small number of cache lines.
Thx Ben for this explanation. Let me do some other tests on SMP ARM.
I'll rework this patch trying to balance the "abuse" of
____cacheline_aligned_in_smp and the best performances (I can re-test
on ARM and SH4).
peppe
>> Ben.>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html