Commit Message

Bug fixed and new patch attached.
Patch also available for review at http://codereview.appspot.com/5752064
Thanks,
-Sri.
On Mon, Jun 4, 2012 at 2:36 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Jun 4, 2012 at 11:59 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Hi,
>>
>> Attaching updated patch for function multiversioning which brings
>> in plenty of changes.
>>
>> * As suggested by Richard earlier, I have made cgraph aware of
>> function versions. All nodes of function versions are chained and the
>> dispatcher bodies are created on demand while building cgraph edges.
>> The dispatcher body will be created if and only if there is a call or
>> reference to a versioned function. Previously, I was maintaining the
>> list of versions separately in a hash map, all that is gone now.
>> * Now, the file multiverison.c has some helper routines that are used
>> in the context of function versioning. There are no new passes and no
>> new globals.
>> * More tests, updated existing tests.
>> * Fixed lots of bugs.
>> * Updated patch description.
>>
>> Patch attached. Patch also available for review at
>> http://codereview.appspot.com/5752064
>>
>> Please let me know what you think,
>>
>
> Build failed in libstdc++-v3:
>
> /export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/locale_classes.h:546:59:
> internal compiler error: tree check: expected function_decl, have
> identifier_node in tourney, at cp/call.c:8498
> for (size_t __i = 0; __ret && __i < _S_categories_size - 1; ++__i)
> ^
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://gcc.gnu.org/bugs.html> for instructions.
> make[5]: *** [x86_64-unknown-linux-gnu/bits/stdc++.h.gch/O2g.gch] Erro
>
> on Linux/x86-64.
>
> --
> H.J.
Overview of the patch which adds front-end support to specify function versions.
Example:
int foo (); /* Default version */
int foo () __attribute__ ((target("avx,popcnt"))); /* Specialized for avx and popcnt */
int foo () __attribute__ ((target("arch=core2,ssse3"))); /* Specialized for core2 and ssse3 */
int main ()
{
  int (*p)() = &foo;
  return foo () + (*p)();
}
int foo ()
{
  return 0;
}
int __attribute__ ((target("avx,popcnt")))
foo ()
{
  return 0;
}
int __attribute__ ((target("arch=core2,ssse3")))
foo ()
{
  return 0;
}
The above example has foo defined 3 times, but all 3 definitions of foo are
different versions of the same function. The calls to foo in main, directly and
via a pointer, are calls to the multi-versioned function foo, which is dispatched
to the right foo at run-time.
What does the patch do?
* Tracking decls that correspond to function versions of function
name, say "foo":
When the front-end sees more than one decl for "foo", with at least one decl
tagged with "target" attributes, it marks them as function versions. To
prevent duplicate definition errors with other versions of "foo", the
"decls_match" function in cp/decl.c is made to return false when 2 decls have
the same signature but different target attributes. This causes all function
versions of "foo" to be added to the overload list of "foo".
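The effect of that check can be pictured with a small model. This is only a
sketch: Decl and add_to_overloads are hypothetical stand-ins, not the
front-end's real data structures, and the real decls_match compares far more
than three strings.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a front-end function decl.
struct Decl {
    std::string name;
    std::string signature;  // e.g. "int()"
    std::string target;     // argument of the "target" attribute, "" if none
};

// Simplified model of the check: same name and signature normally means
// "same function", unless the target attributes differ.
bool decls_match(const Decl& a, const Decl& b) {
    if (a.name != b.name || a.signature != b.signature) return false;
    return a.target == b.target;
}

// Adds `d` to the overload list unless it redeclares an existing entry.
void add_to_overloads(std::vector<Decl>& overloads, const Decl& d) {
    for (const Decl& existing : overloads)
        if (decls_match(existing, d)) return;  // redeclaration, nothing new
    overloads.push_back(d);
}
```

In this model a default foo and a foo tagged "avx,popcnt" do not match each
other, so both land in the overload list, while a second declaration with the
same target string is treated as a redeclaration.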
* Change the assembler names of the function versions.
The front-end changes the assembler names of the function versions by appending
the sorted list of args to "target" to the function name of "foo". For example,
the assembler name of "void foo () __attribute__ ((target ("sse4")))" will
become _Z3foov.sse4.
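A rough sketch of the naming scheme, for illustration only. The helper name
versioned_assembler_name is hypothetical, and the '_' used between multiple
args is an assumption: the description above only shows the single-arg case.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds the assembler name of a function version from
// the mangled base name and the string passed to the "target" attribute.
std::string versioned_assembler_name(const std::string& base,
                                     const std::string& target_args) {
    // Split the comma-separated attribute string into individual args.
    std::vector<std::string> args;
    std::stringstream in(target_args);
    std::string arg;
    while (std::getline(in, arg, ',')) {
        if (!arg.empty()) args.push_back(arg);
    }
    // Sort so that "avx,popcnt" and "popcnt,avx" yield the same name.
    std::sort(args.begin(), args.end());

    std::string name = base;
    for (std::size_t i = 0; i < args.size(); ++i) {
        // '.' introduces the suffix, as in _Z3foov.sse4.
        // ASSUMPTION: '_' separates further args; the patch description
        // does not state the separator.
        name += (i == 0 ? '.' : '_');
        name += args[i];
    }
    return name;
}
```

Sorting before joining is what makes the name independent of the order in
which the user wrote the target args.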
* Overload resolution:
Function "build_over_call" in cp/call.c sees a call to function
"foo", which is multi-versioned. The overload resolution happens in
function "joust" in "cp/call.c". Here, the call to "foo" has all
possible versions of "foo" as candidates. All the candidates of "foo" are
stored in the cgraph data structures. Each version of foo is chained in a
doubly-linked list with the default function as the first element. This allows
any pass to access all the semantically identical versions. Also, a dispatcher
decl is created; calls go to it, and at run-time it dispatches to the right
function version.
Also, in joust, where overload resolution happens, resolution of a
multiversioned function is made to return the most specialized version. This is
the version that will be checked for dispatching first and is determined by the
target. Now, if the caller can inline this function version then a direct call
is made to this function version rather than going through the dispatcher. When
a direct call cannot be made, a call to the dispatcher function is created.
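The version chain can be sketched as follows. VersionNode is a hypothetical
stand-in for a cgraph node; only the two link member names are taken from the
ChangeLog below, and the helper functions are made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a cgraph node describing one function version.
struct VersionNode {
    std::string assembler_name;
    VersionNode* prev_function_version = nullptr;
    VersionNode* next_function_version = nullptr;
};

// Appends `node` to the chain whose first element is `default_version`.
void chain_version(VersionNode& default_version, VersionNode& node) {
    VersionNode* tail = &default_version;
    while (tail->next_function_version != nullptr)
        tail = tail->next_function_version;
    tail->next_function_version = &node;
    node.prev_function_version = tail;
}

// Starting from any version, walks back to the default version and then
// lists every version in the chain.
std::vector<std::string> all_versions(const VersionNode& any) {
    const VersionNode* head = &any;
    while (head->prev_function_version != nullptr)
        head = head->prev_function_version;
    std::vector<std::string> names;
    for (const VersionNode* n = head; n != nullptr; n = n->next_function_version)
        names.push_back(n->assembler_name);
    return names;
}
```

Because the list is doubly linked and the default version is always first, a
pass holding any one version can reach all the others.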
* Creating the dispatcher body.
The dispatcher body, called the resolver, is made only when there is a call to a
multiversioned function dispatcher or the address of such a function is taken.
It is generated during build_cgraph_edges for a call or
cgraph_mark_address_taken_node for a pointer reference.
* Dispatch ordering.
The order in which the function versions are checked during dispatch is based
on a priority value assigned to the ISA that each version caters to. More
specialized versions are checked for dispatching first. This is to mitigate the
ambiguity that can arise when more than one function version is valid for
execution on a particular platform. This is not a perfect solution and in the
future, the user should be allowed to assign a dispatching priority value to
each version.
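The ordering rule can be sketched like this. It is a minimal model, not the
code the patch generates: the Version struct, the priority numbers and the
is_supported callbacks are all hypothetical, and "larger priority means more
specialized" is an assumption made for the example.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of one function version as a dispatcher sees it.
struct Version {
    std::string name;
    int priority;                        // ASSUMPTION: larger = more specialized
    std::function<bool()> is_supported;  // run-time CPU check for this version
};

// Returns the name of the version that would be picked at run-time.
std::string dispatch(std::vector<Version> versions,
                     const std::string& default_name) {
    // Check the most specialized (highest priority) versions first.
    std::stable_sort(versions.begin(), versions.end(),
                     [](const Version& a, const Version& b) {
                         return a.priority > b.priority;
                     });
    for (const Version& v : versions)
        if (v.is_supported()) return v.name;
    return default_name;  // no specialized version applies
}
```

When two versions are both valid on the running CPU, the one with the higher
priority wins; the default version is used only if no check succeeds.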
* doc/tm.texi.in (TARGET_DISPATCH_VERSION): New hook description.
(TARGET_COMPARE_VERSIONS): New hook description.
* doc/tm.texi: Regenerate.
* cgraphbuild.c (build_cgraph_edges): Generate body of multiversion
function dispatcher.
* c-family/c-common.c (handle_target_attribute): Always keep target
attributes tagged.
* target.def (dispatch_version): New target hook.
(compare_versions): New hook.
* cgraph.c (cgraph_mark_address_taken_node): Generate body of multiversion
function dispatcher.
* cgraph.h (cgraph_node): New members dispatcher_fndecl, resolver_fndecl,
prev_function_version, next_function_version, dispatcher_function.
(is_default_function_version): New function.
(mark_function_as_version): New function.
(has_different_version_attributes): New function.
(function_target_attribute): New function.
(build_dispatcher_for_function_versions): New function.
(build_resolver_for_function_versions): New function.
* tree.h (DECL_FUNCTION_VERSIONED): New macro.
(tree_function_decl): New bit-field versioned_function.
* multiversion.c: New file.
* testsuite/g++.dg/mv1.C: New test.
* testsuite/g++.dg/mv2.C: New test.
* testsuite/g++.dg/mv3.C: New test.
* testsuite/g++.dg/mv4.C: New test.
* cp/class.c (add_method): Change assembler names of function versions.
(resolve_address_of_overloaded_function): Save all function
version candidates. Create dispatcher decl and return address of
dispatcher instead.
* cp/decl.c (decls_match): Make decls unmatched for versioned
functions.
(duplicate_decls): Remove ambiguity for versioned functions.
(cxx_comdat_group): Make comdat group of versioned functions be the
same.
* cp/error.c (dump_exception_spec): Dump assembler name for function
versions.
* cp/semantics.c (expand_or_defer_fn_1): Mark as needed versioned
functions that are also marked inline.
* cp/decl2.c (check_classfn): Check attributes of versioned functions
for match.
* cp/call.c (build_new_function_call): Check if versioned functions
have a default version.
(build_over_call): Make calls to multiversioned functions
to call the dispatcher.
(joust): For calls to multi-versioned functions, make the most
specialized function version win.
(tourney): Generate dispatcher decl for function versions.
* cp/mangle.c (write_unqualified_name): Use assembler name for
versioned functions.
* Makefile.in: Add multiversion.o.
* config/i386/i386.c (add_condition_to_bb): New function.
(get_builtin_code_for_version): New function.
(ix86_compare_versions): New function.
(feature_compare): New function.
(ix86_dispatch_version): New function.
(TARGET_DISPATCH_VERSION): New macro.
(TARGET_COMPARE_VERSIONS): New macro.

Comments

On Mon, Jun 4, 2012 at 3:29 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Bug fixed and new patch attached.
>
> Patch also available for review at http://codereview.appspot.com/5752064
>
I think you should also export __cpu_indicator_init in libgcc_s.so.
Also, is this feature C++ only? Can you make it work for C?