Commit Message

On Wed, May 09, 2018 at 04:53:19PM +0200, Allan Sandfeld Jensen wrote:
> > > @@ -2022,8 +2022,9 @@ simplify_vector_constructor (gimple_stmt_iterator> > > *gsi)> > > > elem_type = TREE_TYPE (type);> > > elem_size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));> > > > > > - vec_perm_builder sel (nelts, nelts, 1);> > > - orig = NULL;> > > + vec_perm_builder sel (nelts, 2, nelts);> > > > Why this change? I admit the vec_parm_builder arguments are confusing, but> > I think the second times third is the number of how many indices are being> > pushed into the vector, so I think (nelts, nelts, 1) is right.> > > I had the impression it was what was selected from. In any case, I changed it > because without I get crash when vec_perm_indices is created later with a > possible nparms of 2.
The documentation is apparently in vector-builder.h:
This class is a wrapper around auto_vec<T> for building vectors of T.
It aims to encode each vector as npatterns interleaved patterns,
where each pattern represents a sequence:
{ BASE0, BASE1, BASE1 + STEP, BASE1 + STEP*2, BASE1 + STEP*3, ... }
The first three elements in each pattern provide enough information
to derive the other elements. If all patterns have a STEP of zero,
we only need to encode the first two elements in each pattern.
If BASE1 is also equal to BASE0 for all patterns, we only need to
encode the first element in each pattern. The number of encoded
elements per pattern is given by nelts_per_pattern.
The class can be used in two ways:
1. It can be used to build a full image of the vector, which is then
canonicalized by finalize (). In this case npatterns is initially
the number of elements in the vector and nelts_per_pattern is
initially 1.
2. It can be used to build a vector that already has a known encoding.
This is preferred since it is more efficient and copes with
variable-length vectors. finalize () then canonicalizes the encoding
to a simpler form if possible.
As the vector is constant width and we are building the full image of the
vector, the right arguments are (nelts, nelts, 1) as per 1. above, and the
finalization can perhaps change it to something more compact.
> > (and sorry for missing your patch first, the PR wasn't ASSIGNED and there> > was no link to gcc-patches for it).> > > It is okay. You are welcome to take it over. I am not a regular gcc > contributor and thus not well-versed in the details, only the basic logic of > how things work.
Ok, here is my version of the patch. Bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?
2018-05-10 Allan Sandfeld Jensen <allan.jensen@qt.io>
Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/85692
* tree-ssa-forwprop.c (simplify_vector_constructor): Try two
source permute as well.
* gcc.target/i386/pr85692.c: New test.
Jakub

Comments

On Donnerstag, 10. Mai 2018 09:57:22 CEST Jakub Jelinek wrote:
> On Wed, May 09, 2018 at 04:53:19PM +0200, Allan Sandfeld Jensen wrote:> > > > @@ -2022,8 +2022,9 @@ simplify_vector_constructor> > > > (gimple_stmt_iterator> > > > *gsi)>> > > > > > > > elem_type = TREE_TYPE (type);> > > > elem_size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));> > > > > > > > - vec_perm_builder sel (nelts, nelts, 1);> > > > - orig = NULL;> > > > + vec_perm_builder sel (nelts, 2, nelts);> > > > > > Why this change? I admit the vec_parm_builder arguments are confusing,> > > but> > > I think the second times third is the number of how many indices are> > > being> > > pushed into the vector, so I think (nelts, nelts, 1) is right.> > > > I had the impression it was what was selected from. In any case, I changed> > it because without I get crash when vec_perm_indices is created later> > with a possible nparms of 2.> > The documentation is apparently in vector-builder.h:> This class is a wrapper around auto_vec<T> for building vectors of T.> It aims to encode each vector as npatterns interleaved patterns,> where each pattern represents a sequence:> > { BASE0, BASE1, BASE1 + STEP, BASE1 + STEP*2, BASE1 + STEP*3, ... }> > The first three elements in each pattern provide enough information> to derive the other elements. If all patterns have a STEP of zero,> we only need to encode the first two elements in each pattern.> If BASE1 is also equal to BASE0 for all patterns, we only need to> encode the first element in each pattern. The number of encoded> elements per pattern is given by nelts_per_pattern.> > The class can be used in two ways:> > 1. It can be used to build a full image of the vector, which is then> canonicalized by finalize (). In this case npatterns is initially> the number of elements in the vector and nelts_per_pattern is> initially 1.> > 2. It can be used to build a vector that already has a known encoding.> This is preferred since it is more efficient and copes with> variable-length vectors. finalize () then canonicalizes the encoding> to a simpler form if possible.> > As the vector is constant width and we are building the full image of the> vector, the right arguments are (nelts, nelts, 1) as per 1. above, and the> finalization can perhaps change it to something more compact.> > > > (and sorry for missing your patch first, the PR wasn't ASSIGNED and> > > there> > > was no link to gcc-patches for it).> > > > It is okay. You are welcome to take it over. I am not a regular gcc> > contributor and thus not well-versed in the details, only the basic logic> > of how things work.> > Ok, here is my version of the patch. Bootstrapped/regtested on x86_64-linux> and i686-linux, ok for trunk?>
Looks good to me if that counts for anything.
'Allan