Comments

Hello All
Please note that I made a mailing mistake with the previous patch in the series.
The 6th patch [wstate] is in
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01716.html and not in
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01714.html which got the
wrong attachments (no patches there!). Sorry for the noise.
References
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01716.html
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01032.html
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01081.html
The last patch takes into account Laurynas' remarks in
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01081.html and is a
documentation patch. I added some more explanations, so please check my
English.
I still hope that my "thirdround" patch series will be OK, with perhaps
minor changes required.
#################### gcc/ChangeLog entry for documentation
2010-09-21 Basile Starynkevitch <basile@starynkevitch.net>
* gcc/doc/gty.texi:
(Generating GGC code with gengtype): New node.
(GTY Options): Documented that tag should be unique.
Added ptr_alias documentation.
(How to run the gengtype generator): New section.
################################################################
The attached patch is relative to
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01716.html
Ok for trunk? With what changes?
BTW, the dates in the patch series (and notably their ChangeLog entry)
might be wrong. If Ok-ed, I will put the date (as seen from Paris,
France) of the commit.
Cheers

The patch7_doc-relto06.diff file is four times larger than in the
last submission. Are you sure you sent the correct patch?

Patch

--- ../thirdround_06_wstate//gty.texi 2010-09-21 20:21:19.000000000 +0200
+++ gcc/doc/gty.texi 2010-09-21 20:20:20.000000000 +0200
@@ -0,0 +1,552 @@
+@c Copyright (C) 2002, 2003, 2004, 2007, 2008, 2009
+@c Free Software Foundation, Inc.
+@c This is part of the GCC manual.
+@c For copying conditions, see the file gcc.texi.
+
+@node Type Information
+@chapter Memory Management and Type Information
+@cindex GGC
+@findex GTY
+
+GCC uses some fairly sophisticated memory management techniques, which
+involve determining information about GCC's data structures from GCC's
+source code and using this information to perform garbage collection and
+implement precompiled headers.
+
+A full C parser would be too complicated for this task, so a limited
+subset of C is interpreted and special markers are used to determine
+what parts of the source to look at. All @code{struct} and
+@code{union} declarations that define data structures that are
+allocated under control of the garbage collector must be marked. All
+global variables that hold pointers to garbage-collected memory must
+also be marked. Finally, all global variables that need to be saved
+and restored by a precompiled header must be marked. (The precompiled
+header mechanism can only save static variables if they're scalar.
+Complex data structures must be allocated in garbage-collected memory
+to be saved in a precompiled header.)
+
+The full format of a marker is
+@smallexample
+GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
+@end smallexample
+@noindent
+but in most cases no options are needed. The outer double parentheses
+are still necessary, though: @code{GTY(())}. Markers can appear:
+
+@itemize @bullet
+@item
+In a structure definition, before the open brace;
+@item
+In a global variable declaration, after the keyword @code{static} or
+@code{extern}; and
+@item
+In a structure field definition, before the name of the field.
+@end itemize
+
+Here are some examples of marking simple data structures and globals.
+
+@smallexample
+struct GTY(()) @var{tag}
+@{
+ @var{fields}@dots{}
+@};
+
+typedef struct GTY(()) @var{tag}
+@{
+ @var{fields}@dots{}
+@} *@var{typename};
+
+static GTY(()) struct @var{tag} *@var{list}; /* @r{points to GC memory} */
+static GTY(()) int @var{counter}; /* @r{save counter in a PCH} */
+@end smallexample
+
+The parser understands simple typedefs such as
+@code{typedef struct @var{tag} *@var{name};} and
+@code{typedef int @var{name};}.
+These don't need to be marked.
+
+@menu
+* GTY Options:: What goes inside a @code{GTY(())}.
+* GGC Roots:: Making global variables GGC roots.
+* Files:: How the generated files work.
+* Generating GGC code with gengtype:: How to run the gengtype generator.
+* Invoking the garbage collector:: How to invoke the garbage collector.
+@end menu
+
+@node GTY Options
+@section The Inside of a @code{GTY(())}
+
+Sometimes the C code is not enough to fully describe the type
+structure. Extra information can be provided with @code{GTY} options
+and additional markers. Some options take a parameter, which may be
+either a string or a type name, depending on the parameter. If an
+option takes no parameter, it is acceptable either to omit the
+parameter entirely, or to provide an empty string as a parameter. For
+example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
+equivalent.
+
+When the parameter is a string, often it is a fragment of C code. Four
+special escapes may be used in these strings, to refer to pieces of
+the data structure being marked:
+
+@cindex % in GTY option
+@table @code
+@item %h
+The current structure.
+@item %1
+The structure that immediately contains the current structure.
+@item %0
+The outermost structure that contains the current structure.
+@item %a
+A partial expression of the form @code{[i1][i2]@dots{}} that indexes
+the array item currently being marked.
+@end table
+
+For instance, suppose that you have a structure of the form
+@smallexample
+struct A @{
+ @dots{}
+@};
+struct B @{
+ struct A foo[12];
+@};
+@end smallexample
+@noindent
+and @code{b} is a variable of type @code{struct B}. When marking
+@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
+@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
+would expand to @samp{[11]}.
+
+As in ordinary C, adjacent strings will be concatenated; this is
+helpful when you have a complicated expression.
+@smallexample
+@group
+GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
+ " ? TYPE_NEXT_VARIANT (&%h.generic)"
+ " : TREE_CHAIN (&%h.generic)")))
+@end group
+@end smallexample
+
+The available options are:
+
+@table @code
+@findex length
+@item length ("@var{expression}")
+
+There are two places the type machinery will need to be explicitly told
+the length of an array. The first case is when a structure ends in a
+variable-length array, like this:
+@smallexample
+struct GTY(()) rtvec_def @{
+ int num_elem; /* @r{number of elements} */
+ rtx GTY ((length ("%h.num_elem"))) elem[1];
+@};
+@end smallexample
+
+In this case, the @code{length} option is used to override the specified
+array length (which should usually be @code{1}). The parameter of the
+option is a fragment of C code that calculates the length.
+
+The second case is when a structure or a global variable contains a
+pointer to an array, like this:
+@smallexample
+struct gimple_omp_for_iter * GTY((length ("%h.collapse"))) iter;
+@end smallexample
+In this case, @code{iter} has been allocated by writing something like
+@smallexample
+ x->iter = ggc_alloc_cleared_vec_gimple_omp_for_iter (collapse);
+@end smallexample
+and the @code{collapse} provides the length of the field.
+
+This second use of @code{length} also works on global variables, like:
+@verbatim
+static GTY((length("reg_known_value_size"))) rtx *reg_known_value;
+@end verbatim
+
+@findex skip
+@item skip
+
+If @code{skip} is applied to a field, the type machinery will ignore it.
+This is somewhat dangerous; the only safe use is in a union when one
+field really isn't ever used.
+
+@findex desc
+@findex tag
+@findex default
+@item desc ("@var{expression}")
+@itemx tag ("@var{constant}")
+@itemx default
+
+The type machinery needs to be told which field of a @code{union} is
+currently active. This is done by giving each field a constant
+@code{tag} value, and then specifying a discriminator using @code{desc}.
+The value of the expression given by @code{desc} is compared against
+each @code{tag} value, each of which should be different. If no
+@code{tag} is matched, the field marked with @code{default} is used if
+there is one, otherwise no field in the union will be marked.
+A union field can have at most one @code{tag}.
+
+In the @code{desc} option, the ``current structure'' is the union that
+it discriminates. Use @code{%1} to mean the structure containing it.
+There are no escapes available to the @code{tag} option, since it is a
+constant.
+
+For example,
+@smallexample
+struct GTY(()) tree_binding
+@{
+ struct tree_common common;
+ union tree_binding_u @{
+ tree GTY ((tag ("0"))) scope;
+ struct cp_binding_level * GTY ((tag ("1"))) level;
+ @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
+ tree value;
+@};
+@end smallexample
+
+In this example, the value of BINDING_HAS_LEVEL_P when applied to a
+@code{struct tree_binding *} is presumed to be 0 or 1. If 1, the type
+mechanism will treat the field @code{level} as being present and if 0,
+will treat the field @code{scope} as being present.
+
+@findex param_is
+@findex use_param
+@item param_is (@var{type})
+@itemx use_param
+
+Sometimes it's convenient to define some data structure to work on
+generic pointers (that is, @code{PTR}) and then use it with a specific
+type. @code{param_is} specifies the real type pointed to, and
+@code{use_param} says where in the generic data structure that type
+should be put.
+
+For instance, to have a @code{htab_t} that points to trees, one would
+write the definition of @code{htab_t} like this:
+@smallexample
+typedef struct GTY(()) @{
+ @dots{}
+ void ** GTY ((use_param, @dots{})) entries;
+ @dots{}
+@} htab_t;
+@end smallexample
+and then declare variables like this:
+@smallexample
+ static htab_t GTY ((param_is (union tree_node))) ict;
+@end smallexample
+
+@findex param@var{n}_is
+@findex use_param@var{n}
+@item param@var{n}_is (@var{type})
+@itemx use_param@var{n}
+
+In more complicated cases, the data structure might need to work on
+several different types, which might not necessarily all be pointers.
+For this, @code{param1_is} through @code{param9_is} may be used to
+specify the real type of a field identified by @code{use_param1} through
+@code{use_param9}.
+
+@findex use_params
+@item use_params
+
+When a structure contains another structure that is parameterized,
+there's no need to do anything special, the inner structure inherits the
+parameters of the outer one. When a structure contains a pointer to a
+parameterized structure, the type machinery won't automatically detect
+this (it could, it just doesn't yet), so it's necessary to tell it that
+the pointed-to structure should use the same parameters as the outer
+structure. This is done by marking the pointer with the
+@code{use_params} option.
+
+@findex deletable
+@item deletable
+
+@code{deletable}, when applied to a global variable, indicates that when
+garbage collection runs, there's no need to mark anything pointed to
+by this variable, it can just be set to @code{NULL} instead. This is used
+to keep a list of free structures around for re-use.
+
+@findex if_marked
+@item if_marked ("@var{expression}")
+
+Suppose you want some kinds of object to be unique, and so you put them
+in a hash table. If garbage collection marks the hash table, these
+objects will never be freed, even if the last other reference to them
+goes away. GGC has special handling to deal with this: if you use the
+@code{if_marked} option on a global hash table, GGC will call the
+routine whose name is the parameter to the option on each hash table
+entry. If the routine returns nonzero, the hash table entry will
+be marked as usual. If the routine returns zero, the hash table entry
+will be deleted.
+
+The routine @code{ggc_marked_p} can be used to determine if an element
+has been marked already; in fact, the usual case is to use
+@code{if_marked ("ggc_marked_p")}.
+
+@findex mark_hook
+@item mark_hook ("@var{hook-routine-name}")
+
+If provided for a structure or union type, the given
+@var{hook-routine-name} (between double-quotes) is the name of a
+routine called when the garbage collector has just marked the data as
+reachable. This routine should not change the data, or call any ggc
+routine. Its only argument is a pointer to the just marked (const)
+structure or union.
+
+@findex maybe_undef
+@item maybe_undef
+
+When applied to a field, @code{maybe_undef} indicates that it's OK if
+the structure that this fields points to is never defined, so long as
+this field is always @code{NULL}. This is used to avoid requiring
+backends to define certain optional structures. It doesn't work with
+language frontends.
+
+@findex nested_ptr
+@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")
+
+The type machinery expects all pointers to point to the start of an
+object. Sometimes for abstraction purposes it's convenient to have
+a pointer which points inside an object. So long as it's possible to
+convert the original object to and from the pointer, such pointers
+can still be used. @var{type} is the type of the original object,
+the @var{to expression} returns the pointer given the original object,
+and the @var{from expression} returns the original object given
+the pointer. The pointer will be available using the @code{%h}
+escape.
+
+@findex ptr_alias
+@item ptr_alias (@var{type})
+
+When applied to a type, @code{ptr_alias} indicates that the GTY-ed
+type is an alias or synonym of the given @var{type}.
+
+@findex chain_next
+@findex chain_prev
+@findex chain_circular
+@item chain_next ("@var{expression}")
+@itemx chain_prev ("@var{expression}")
+@itemx chain_circular ("@var{expression}")
+
+It's helpful for the type machinery to know if objects are often
+chained together in long lists; this lets it generate code that uses
+less stack space by iterating along the list instead of recursing down
+it. @code{chain_next} is an expression for the next item in the list,
+@code{chain_prev} is an expression for the previous item. For singly
+linked lists, use only @code{chain_next}; for doubly linked lists, use
+both. The machinery requires that taking the next item of the
+previous item gives the original item. @code{chain_circular} is similar
+to @code{chain_next}, but can be used for circular single linked lists.
+
+@findex reorder
+@item reorder ("@var{function name}")
+
+Some data structures depend on the relative ordering of pointers. If
+the precompiled header machinery needs to change that ordering, it
+will call the function referenced by the @code{reorder} option, before
+changing the pointers in the object that's pointed to by the field the
+option applies to. The function must take four arguments, with the
+signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
+The first parameter is a pointer to the structure that contains the
+object being updated, or the object itself if there is no containing
+structure. The second parameter is a cookie that should be ignored.
+The third parameter is a routine that, given a pointer, will update it
+to its correct new value. The fourth parameter is a cookie that must
+be passed to the second parameter.
+
+PCH cannot handle data structures that depend on the absolute values
+of pointers. @code{reorder} functions can be expensive. When
+possible, it is better to depend on properties of the data, like an ID
+number or the hash of a string instead.
+
+@findex variable_size
+@item variable_size
+
+The type machinery expects the types to be of constant size. When this
+is not true, for example, with structs that have array fields or unions,
+the type machinery cannot tell how many bytes need to be allocated at
+each allocation. The @code{variable_size} is used to mark such types.
+The type machinery then provides allocators that take a parameter
+indicating an exact size of object being allocated.
+
+For example,
+@smallexample
+struct GTY((variable_size)) sorted_fields_type @{
+ int len;
+ tree GTY((length ("%h.len"))) elts[1];
+@};
+@end smallexample
+
+Then the objects of @code{struct sorted_fields_type} are allocated in GC
+memory as follows:
+@smallexample
+ field_vec = ggc_alloc_sorted_fields_type (size);
+@end smallexample
+
+@findex special
+@item special ("@var{name}")
+
+The @code{special} option is used to mark types that have to be dealt
+with by special case machinery. The parameter is the name of the
+special case. See @file{gengtype.c} for further details. Avoid
+adding new special cases unless there is no other alternative.
+@end table
+
+@node GGC Roots
+@section Marking Roots for the Garbage Collector
+@cindex roots, marking
+@cindex marking roots
+
+In addition to keeping track of types, the type machinery also locates
+the global variables (@dfn{roots}) that the garbage collector starts
+at. Roots must be declared using one of the following syntaxes:
+
+@itemize @bullet
+@item
+@code{extern GTY(([@var{options}])) @var{type} @var{name};}
+@item
+@code{static GTY(([@var{options}])) @var{type} @var{name};}
+@end itemize
+@noindent
+The syntax
+@itemize @bullet
+@item
+@code{GTY(([@var{options}])) @var{type} @var{name};}
+@end itemize
+@noindent
+is @emph{not} accepted. There should be an @code{extern} declaration
+of such a variable in a header somewhere---mark that, not the
+definition. Or, if the variable is only used in one file, make it
+@code{static}.
+
+@node Files
+@section Source Files Containing Type Information
+@cindex generated files
+@cindex files, generated
+
+Whenever you add @code{GTY} markers to a source file that previously
+had none, or create a new source file containing @code{GTY} markers,
+there are three things you need to do:
+
+@enumerate
+@item
+You need to add the file to the list of source files the type
+machinery scans. There are four cases:
+
+@enumerate a
+@item
+For a back-end file, this is usually done
+automatically; if not, you should add it to @code{target_gtfiles} in
+the appropriate port's entries in @file{config.gcc}.
+
+@item
+For files shared by all front ends, add the filename to the
+@code{GTFILES} variable in @file{Makefile.in}.
+
+@item
+For files that are part of one front end, add the filename to the
+@code{gtfiles} variable defined in the appropriate
+@file{config-lang.in}. For C, the file is @file{c-config-lang.in}.
+Headers should appear before non-headers in this list.
+
+@item
+For files that are part of some but not all front ends, add the
+filename to the @code{gtfiles} variable of @emph{all} the front ends
+that use it.
+@end enumerate
+
+@item
+If the file was a header file, you'll need to check that it's included
+in the right place to be visible to the generated files. For a back-end
+header file, this should be done automatically. For a front-end header
+file, it needs to be included by the same file that includes
+@file{gtype-@var{lang}.h}. For other header files, it needs to be
+included in @file{gtype-desc.c}, which is a generated file, so add it to
+@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.
+
+For source files that aren't header files, the machinery will generate a
+header file that should be included in the source file you just changed.
+The file will be called @file{gt-@var{path}.h} where @var{path} is the
+pathname relative to the @file{gcc} directory with slashes replaced by
+@verb{|-|}, so for example the header file to be included in
+@file{cp/parser.c} is called @file{gt-cp-parser.c}. The
+generated header file should be included after everything else in the
+source file. Don't forget to mention this file as a dependency in the
+@file{Makefile}!
+
+@end enumerate
+
+For language frontends, there is another file that needs to be included
+somewhere. It will be called @file{gtype-@var{lang}.h}, where
+@var{lang} is the name of the subdirectory the language is contained in.
+
+Plugins can add additional root tables. Run the @code{gengtype}
+utility in plugin mode as @code{gengtype -P @var{gt-pluginout.h} -r
+@var{gtype.state} @var{plugin*.c}} with your plugin files
+@var{plugin*.c} using @code{GTY} to generate the @var{gt-pluginout.h}
+file. The @var{gtype.state} is usually in the plugin directory.
+
+
+@node Generating GGC code with gengtype
+@section How to run the gengtype generator.
+@cindex garbage collector, generation, gengtype
+@findex gengtype
+
+The @code{gengtype} utility (which may be installed as
+@code{gcc-gengtype} or some other name on your system) is a C code
+generator program. It generates suitable marking and type-walking C
+routines for GGC. Generated GC marking routines ensure that every
+live data is ultimately marked, so that GGC can delete, at the end of
+its @code{ggc_collect} function, every non-marked data (which is
+unreachable and dead). Generated PCH type-walking routines are used
+for ``compilation'' of header files @var{*.h} into a @var{*.gch}
+file. The @code{gengtype} utility is used when building GCC itself
+from its source tree, and also for plugins dealing with their own
+GTY-ed data.
+
+The @code{gengtype} utility is run once specially when compiling GCC
+itself. It parses many GCC source files (mostly internal GCC headers
+declaring @code{GTY}-ed types and internal GCC files declaring
+@code{GTY}-ed variables). It saves its state in a persistent textual
+file @file{gtype.state}. This file has a format tied to a particular
+GCC release and configuration. This @file{gtype.state} file should be
+parsed only by @code{gengtype} (of the same version which has
+generated it). Following GNU usage, @code{gengtype} accepts the usual
+@code{--version} or @code{-V} and @code{--help} arguments. It also
+accepts @code{--verbose} or @code{-v} to show what it is doing, and
+@code{--debug} or @code{-D} for debugging purposes. It may dump every
+internal data in human readable form with @code{--dump} or @code{-d}.
+The @code{--backupdir} @var{directory} or @code{-B} @var{directory}
+specifies a back-up directory used to store, as e.g. @file{gt-*.h~}
+files, the previous version of generated C files.
+
+During GCC build process, a file @file{gtyp-input.list} is built and
+contains a long list of files to be parsed. Then, @code{gengtype} is
+run once to generate its state with @code{gengtype -S
+@file{source-directory} -I @file{gtyp-input.list} -w
+@file{gtype.state}}. Immediately after, it is re-run to read its just
+generated state and generate @file{gt*.[ch]} files with @code{gengtype
+-r @file{gtype.state}}.
+
+Plugin developers using @code{GTY} in their plugin source code should
+run @code{gengtype -P @var{gt-pluginout.h} -r @file{gtype.state}
+@var{plugin*.c}}
+
+
+@node Invoking the garbage collector
+@section How to invoke the garbage collector
+@cindex garbage collector, invocation
+@findex ggc_collect
+
+The GCC garbage collector GGC is only invoked explicitly. In contrast
+with many other garbage collectors, it is not implicitly invoked by
+allocation routines when a lot of memory has been consumed. So the
+only way to have GGC reclaim storage it to call the @code{ggc_collect}
+function explicitly. This call is an expensive operation, as it may
+have to scan the entire heap. Beware that local variables (on the GCC
+call stack) are not followed by such an invocation (as many other
+garbage collectors do): you should reference all your data from static
+or external @code{GTY}-ed variables, and it is advised to call
+@code{ggc_collect} with a shallow call stack. The GGC is an exact mark
+and sweep garbage collector (so it does not scan the call stack for
+pointers). In practice GCC passes don't often call @code{ggc_collect}
+themselves, because it is called by the pass manager between passes.
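
For readers checking the documentation text, the first use of the length option described in the patch (a structure ending in a variable-length array) can be tried outside of GCC with the small sketch below. It is not part of the patch. It assumes that, for the ordinary C compiler, GTY(x) expands to nothing, so the markers only matter to gengtype. The names int_vec, int_vec_alloc and int_vec_sum are hypothetical, and calloc merely stands in for a GGC allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: when the file is compiled normally, GTY(x) expands to
   nothing; only gengtype interprets the markers.  */
#define GTY(x)

/* Hypothetical type modelled on the rtvec_def example in the patch:
   elem is declared with length 1, and the length option names the
   field holding the real element count.  */
struct GTY(()) int_vec {
  int num_elem;                               /* number of elements */
  int GTY((length ("%h.num_elem"))) elem[1];  /* trailing array */
};

/* Allocate room for N elements.  calloc is only a stand-in for a GGC
   allocator in this sketch.  */
struct int_vec *
int_vec_alloc (int n)
{
  assert (n >= 1);
  size_t size = offsetof (struct int_vec, elem) + (size_t) n * sizeof (int);
  struct int_vec *v = calloc (1, size);
  if (v != NULL)
    v->num_elem = n;
  return v;
}

/* Walk elem[] using the same bound that the length option gives to
   the type machinery.  */
long
int_vec_sum (const struct int_vec *v)
{
  long sum = 0;
  for (int i = 0; i < v->num_elem; i++)
    sum += v->elem[i];
  return sum;
}
```

The point of the sketch is that num_elem, not the declared array size, is the only reliable bound, which is why the marking code needs the length option to be told about it.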