This is a development release for testing purposes only. This document describes version 5.907 of HTML::Element, released September 13, 2013 as part of HTML-Tree.

Methods introduced in version 4.0 or later are marked with the version that introduced them like this: (v4.0).

WARNING: Methods marked (v6.00) are not yet stable. The API for handling encoding might change. If you have comments or suggestions on the new API, please write to the LWP mailing list at <libwww AT perl DOT org>.

Objects of the HTML::Element class can be used to represent elements of HTML document trees. These objects have attributes, notably attributes that designate each element's parent and content. The content is an array of text segments and other HTML::Element objects. A tree with HTML::Element objects as nodes can represent the syntax tree for an HTML document.

This is the traditional way to diagram a tree, with the "root" at the top, and it's this kind of diagram that people have in mind when they say, for example, that "the meta element is under the head element instead of under the body element". (The same is also said with "inside" instead of "under" -- the use of "inside" makes more sense when you're looking at the HTML source.)

The "treeness" of the tree-structure that these elements comprise is not an aspect of any particular object, but is emergent from the relatedness attributes (_parent and _content) of these element-objects and from how you use them to get from element to element.

While you could access the content of a tree by writing code that says "access the 'src' attribute of the root's first child's seventh child's third child", you're more likely to have to scan the contents of a tree, looking for whatever nodes, or kinds of nodes, you want to do something with. The most straightforward way to look over a tree is to "traverse" it. The simplest way to do that is with a recursive function:
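A minimal sketch of such a recursive function, assuming a tree built with HTML::TreeBuilder. The function name outline, its indentation scheme, and the sample markup are illustrative only and are not part of the module:

```perl
use strict;
use warnings;
use HTML::TreeBuilder 5 -weak;

# Visit $node and everything under it, depth-first, collecting an
# indented outline of the tree into the array referenced by $lines.
sub outline {
    my ($node, $depth, $lines) = @_;
    if (ref $node) {
        # An element: record its tag, then recurse into its children.
        push @$lines, ('  ' x $depth) . '<' . $node->tag . '>';
        outline($_, $depth + 1, $lines) for $node->content_list;
    }
    else {
        # A text segment is just a plain string.
        push @$lines, ('  ' x $depth) . qq{"$node"};
    }
    return $lines;
}

my $tree = HTML::TreeBuilder->new_from_content(
    '<html><head><title>Hi</title></head>'
  . '<body><p>One <b>two</b></p></body></html>'
);
my $lines = outline($tree, 0, []);
print "$_\n" for @$lines;
```

The ref test is what separates elements from text segments, since text segments are stored as ordinary strings.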

TL;DR summary: use HTML::TreeBuilder 5 -weak; and forget about the delete method (except for pruning a node from a tree).

Because HTML::Element stores a reference to the parent element, Perl's reference-count garbage collection doesn't work properly with HTML::Element trees. Starting with version 5.00, HTML::Element uses weak references (if available) to prevent that problem. Weak references were introduced in Perl 5.6.0, but you also need a version of Scalar::Util that provides the weaken function.

Weak references are enabled by default. If you want to be certain they're in use, you can say use HTML::Element 5 -weak;. You must include the version number; previous versions of HTML::Element ignored the import list entirely.

To disable weak references, you can say use HTML::Element -noweak;. This is a global setting. This feature is deprecated and is provided only as a quick fix for broken code. If your code does not work properly with weak references, you should fix it immediately, as weak references may become mandatory in a future version. Generally, all you need to do is keep a reference to the root of the tree until you're done working with it.

Because HTML::TreeBuilder is a subclass of HTML::Element, you can also import -weak or -noweak from HTML::TreeBuilder: e.g. use HTML::TreeBuilder 5 -weak;.

This constructor method returns a new HTML::Element object. The tag name is a required argument; it will be forced to lowercase. Optionally, you can specify other initial attributes at object creation time.

Returns (optionally sets) the value of the given attribute of $h. The attribute name (but not the value, if provided) is forced to lowercase. If trying to read the value of an attribute not present for this element, the return value is undef. If setting a new value, the old value of that attribute is returned.

If methods are provided for accessing an attribute (like $h->tag for "_tag", $h->content_list, etc. below), use those instead of calling $h->attr, whether for reading or setting.

Note that setting an attribute to undef (as opposed to "", the empty string) actually deletes the attribute.
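The constructor and the attr behaviors described above can be sketched as follows. The tag, the URLs, and the variable names are illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

# The tag name is forced to lowercase by the constructor.
my $link = HTML::Element->new('A', href => 'http://www.perl.com/');
my $tag  = $link->tag;             # prefer ->tag over ->attr('_tag')

# The attribute name (not the value) is lowercased on lookup.
my $href = $link->attr('HREF');

# Setting a new value returns the old one.
my $old = $link->attr('href', 'http://example.org/');

# Setting to undef deletes the attribute, so a later read gives undef.
$link->attr('href', undef);
my $gone = $link->attr('href');
```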

These are element objects with a $h->tag value of "~comment", and the content of the comment is stored in the "text" attribute ($h->attr("text")). For example, parsing this code with HTML::TreeBuilder...

These objects are not currently produced by HTML::TreeBuilder, but can be used to represent a "super-literal" -- i.e., a literal you want to be immune from escaping. (Yes, I just made that term up.)

That is, this is useful if you want to insert code into a tree that you plan to dump out with as_HTML, where you want, for some reason, to suppress as_HTML's normal behavior of amp-quoting text segments.

This somewhat deprecated method returns the content of this element; but unlike content_list, this returns either undef (which you should understand to mean no content), or a reference to the array of content items, each of which is either a text segment (a string, i.e., a defined non-reference scalar value), or an HTML::Element object. Note that even if an arrayref is returned, it may be a reference to an empty array.

While older code should feel free to continue to use $h->content, new code should use $h->content_list in almost all conceivable cases. It is my experience that in most cases this leads to simpler code anyway, since it means one can say:

@children = $h->content_list;

instead of the inelegant:

@children = @{$h->content || []};

If you do use $h->content (or $h->content_array_ref), you should not use the reference returned by it (assuming it returned a reference, and not undef) to directly set or change the content of an element or text segment! Instead use content_refs_list or any of the other methods under "Structure-Modifying Methods", below.

This is like content (with all its caveats and deprecations) except that it is guaranteed to return an array reference. That is, if the given node has no _content attribute, the content method would return that undef, but content_array_ref would set the given node's _content value to [] (a reference to a new, empty array), and return that.

This returns a list of scalar references to each element of $h's content list. This is useful in case you want to in-place edit any large text segments without having to get a copy of the current value of that segment, modify that copy, then use splice_content to replace the old with the new. Instead, here you can in-place edit:
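A small sketch of in-place editing through content_refs_list, assuming a hand-built paragraph. The sample text and the substitution are illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $p = HTML::Element->new('p');
$p->push_content('In honour of ', ['b', 'honour'], ' itself.');

for my $item_ref ($p->content_refs_list) {
    next if ref $$item_ref;            # skip child elements
    $$item_ref =~ s/honour/honor/g;    # edit the text segment in place
}

my $text = $p->as_text;
```

Only the direct text segments of $p are touched; the text inside the <b> child is left alone because that child is skipped.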

(v6.00) Returns (optionally sets) the "_encoding" attribute. This attribute is normally found only on the root <html> node. Its default value is taken from $HTML::Element::default_encoding, which itself defaults to $ENV{PERL_HTML_TREE_ENCODING}. But that environment variable is read only once when this module is loaded; Perl code should set $HTML::Element::default_encoding directly. Normally, that variable is not set, so the default value is undef (meaning auto-detect the encoding).

The value is a string: the name of a Perl encoding understood by Encode, with :BOM appended if the file began with a Unicode Byte-Order-Mark (U+FEFF). Common values are cp1252 (aka Windows-1252, Microsoft's extended version of ISO-8859-1, and the default encoding recommended by HTML5 for most locales), utf-8-strict, and UTF-16LE:BOM.

There are two special values: the empty string indicates :raw mode, and undef indicates that the encoding should be auto-detected.

You can use the "openw" or "encode_fh" methods to open an output file with the same encoding.

Returns (optionally sets) the "_implicit" attribute. This attribute is a flag that's used for indicating that the element was not originally present in the source, but was added to the parse tree (by HTML::TreeBuilder, for example) in order to conform to the rules of HTML structure.

Returns (and optionally sets) the "_pos" (for "current position") pointer of $h. This attribute is a pointer used during some parsing operations, whose value is whatever HTML::Element element at or under $h is currently "open", where $h->insert_element(NEW) will actually insert a new element.

(This has nothing to do with the Perl function called pos, for controlling where regular expression matching starts.)

If you set $h->pos($element), be sure that $element is either $h, or an element under $h.

If you've been modifying the tree under $h and are no longer sure $h->pos is valid, you can enforce validity with:
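One way to do this, sketched with the pos and is_inside methods described in this document, is to reset the pointer whenever it no longer refers to something at or under $h. The two-element tree here is illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $root = HTML::Element->new('div');
my $para = HTML::Element->new('p');
$root->push_content($para);
$root->pos($para);      # the "open" element is now the <p>

$para->detach;          # the <p> is no longer under $root

# Reset the pointer if it has drifted outside the tree.
$root->pos(undef) unless $root->pos->is_inside($root);

my $pos_now = $root->pos;   # back to $root itself
```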

Returns all this element's attributes and values, as key-value pairs. This will include any "internal" attributes (i.e., ones not present in the original element, and which will not be represented if/when you call $h->as_HTML). Internal attributes are distinguished by the fact that the first character of their key (not value! key!) is an underscore ("_").

Adds the specified items to the end of the content list of the element $h. The items of content to be added should each be either a text segment (a string), an HTML::Element object, or an arrayref. Arrayrefs are fed thru $h->new_from_lol(that_arrayref) to convert them into elements, before being added to the content list of $h. This means you can say concise things like:
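A short sketch of push_content with arrayref arguments. The list items are illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $body = HTML::Element->new('body');

# Each arrayref is converted to an element via new_from_lol.
$body->push_content(
    ['br'],
    ['ul', map(['li', $_], qw(Peaches Apples Pears Mangos))],
);

my @items = $body->find_by_tag_name('li');
```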

Detaches the elements from $h's list of content-nodes, starting at $offset and continuing for $length items, replacing them with the elements of the following list, if any. Returns the elements (if any) removed from the content-list. If $offset is negative, then it starts that far from the end of the array, just like Perl's normal splice function. If $length and the following list are omitted, removes everything from $offset onward.

The items of content to be added (if any) should each be either a text segment (a string), an arrayref (which is fed thru "new_from_lol"), or an HTML::Element object that's not already a child of $h.
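As a rough illustration of splice_content, the following replaces the middle item of a three-item list with two new items. The list contents are made up for the example:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $ul = HTML::Element->new_from_lol(
    ['ul', ['li', 'a'], ['li', 'b'], ['li', 'c']]
);

# Remove one item at offset 1 and put two arrayref-built items there.
my @removed = $ul->splice_content(1, 1, ['li', 'B1'], ['li', 'B2']);

my $now = join ',', map { $_->as_text } $ul->content_list;
```

The removed elements are returned to the caller, so they stay alive for as long as @removed holds them.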

This unlinks $h from its parent, by setting its 'parent' attribute to undef, and by removing it from the content list of its parent (if it had one). The return value is the parent that was detached from (or undef, if $h had no parent to start with). Note that neither $h nor its parent are explicitly destroyed.

This replaces $h in its parent's content list with the nodes specified. The element $h (which by then may have no parent) is returned. This causes a fatal error if $h has no parent. The list of nodes to insert may contain $h, but at most once. Aside from that possible exception, the nodes to insert should not already be children of $h's parent.

Also, note that this method does not destroy $h if weak references are turned off -- use $h->replace_with(...)->delete if you need that.

This replaces $h in its parent's content list with its own content. The element $h (which by then has no parent or content of its own) is returned. This causes a fatal error if $h has no parent. Also, note that this does not destroy $h if weak references are turned off -- use $h->replace_with_content->delete if you need that.
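A sketch of replace_with_content used to strip a wrapper element while keeping what it contains. The markup is illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $p = HTML::Element->new_from_lol(
    ['p', 'some ', ['font', 'big ', ['b', 'bold']], ' text']
);

# Find the wrapper and splice its children into its parent.
my $font = $p->look_down(_tag => 'font');
$font->replace_with_content;

my @children = $p->content_list;   # 'some ', 'big ', <b>, ' text'
```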

Detaches this element from its parent (if it has one) and explicitly destroys the element and all its descendants. The return value is the empty list (or undef in scalar context).

Before version 5.00 of HTML::Element, you had to call delete when you were finished with the tree, or your program would leak memory. This is no longer necessary if weak references are enabled; see "Weak References".

Returns a copy of the element (whose children are clones (recursively) of the original's children, if any).

The returned element is parentless. Any '_pos' attributes present in the source element/tree will be absent in the copy. For that and other reasons, the clone of an HTML::TreeBuilder object that's in mid-parse (i.e., the head of a tree that HTML::TreeBuilder is elaborating) cannot (currently) be used to continue the parse.

You are free to clone HTML::TreeBuilder trees, just as long as: 1) they're done being parsed, or 2) you don't expect to resume parsing into the clone. (You can continue parsing into the original; it is never affected.)

Inserts (via push_content) a new element under the element at $h->pos(). Then updates $h->pos() to point to the inserted element, unless $element is a prototypically empty element like <br>, <hr>, <img>, etc. The new $h->pos() is returned. This method is useful only if your particular tree task involves setting $h->pos(). Otherwise, you should just use "push_content".

(v6.00) Opens $filename for writing, calls "encode_fh" on the resulting $filehandle, and returns the $filehandle. If $encoding is omitted, it defaults to $h->encoding. May be called as a class method if you supply $encoding as a parameter.

The $filehandle will not have the :crlf layer applied, even on platforms where it normally is on by default. If you want :crlf, apply it after calling openw (with binmode $filehandle, ':crlf').

Throws an exception if the file cannot be opened for any reason, or if $encoding is undef.

(v6.00) Applies $encoding to $filehandle and returns $filehandle. If $encoding is omitted, it defaults to $h->encoding. May be called as a class method if you supply $encoding as a parameter.

$filehandle should probably have been opened in :raw mode, especially if you might use an encoding that's not ASCII-compatible (e.g. UTF-16). Applying a UTF-16 encoding on top of the :crlf layer will produce invalid output. On Windows, :crlf is applied by default when you open a file unless you specify :raw. If you want :crlf, you should apply it after calling encode_fh.

$encoding may be any valid value for the _encoding attribute, except undef.

If $encoding ends with :BOM, then \x{FEFF} is printed to $filehandle after the encoding is set.

Throws an exception if the encoding cannot be applied for any reason, or if $encoding is undef.

Returns a string representing in HTML the element and its descendants. The optional argument $entities specifies a string of the entities to encode. For compatibility with previous versions, specify '<>&' here. If omitted or undef, all unsafe characters are encoded as HTML entities. See HTML::Entities for details. If passed an empty string, no entities are encoded.

If $indent_char is specified and defined, the HTML to be output is indented, using the string you specify (which you probably should set to "\t", or some number of spaces, if you specify it).

If \%optional_end_tags is specified and defined, it should be a reference to a hash that holds a true value for every tag name whose end tag is optional. Defaults to \%HTML::Element::optionalEndTag, which is an alias to %HTML::Tagset::optionalEndTag, which, at time of writing, contains true values for p, li, dt, dd. A useful value to pass is an empty hashref, {}, which means that no end-tags are optional for this dump. Otherwise, possibly consider copying %HTML::Tagset::optionalEndTag to a hash of your own, adding or deleting values as you like, and passing a reference to that hash.
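A brief sketch of calling as_HTML with all three arguments, assuming a hand-built paragraph. Here $indent_char is left undefined (no indenting) and the empty hashref asks for every end tag to be written:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $p = HTML::Element->new_from_lol(['p', 'AT&T <rocks>']);

# Encode only <, > and &; do not indent; treat no end tag as optional.
my $html = $p->as_HTML('<>&', undef, {});
print $html, "\n";
```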

(v6.00) Returns a string representing in HTML the element's descendants. Takes the same arguments as "as_HTML". This is like the DOM's innerHTML property, except that it can't be used to set the content. (Use $h->delete_content->push_content(...) for that.)

Returns a string consisting of only the text parts of the element's descendants. Any whitespace inside the element is included unchanged, but whitespace not in the tree is never added. But remember that whitespace may be ignored or compacted by HTML::TreeBuilder during parsing (depending on the value of the ignore_ignorable_whitespace and no_space_compacting attributes). Also, since whitespace is never added during parsing,

HTML::TreeBuilder->new_from_content("<p>a</p><p>b</p>")->as_text;

returns "ab", not "a b" or "a\nb".

Text under <script> or <style> elements is never included in what's returned. If skip_dels is true, then text content under <del> nodes is not included in what's returned.

This is just like as_text(...) except that leading and trailing whitespace is deleted, and any internal whitespace is collapsed.

This will not remove non-breaking spaces, Unicode spaces, or any other non-ASCII whitespace unless you supply the extra characters as a string argument (e.g. $h->as_trimmed_text(extra_chars => '\xA0')). extra_chars may be any string that can appear inside a character class, including ranges like a-z, POSIX character classes like [:alpha:], and character class escapes like \p{Zs}.
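The difference between as_text and as_trimmed_text, with and without extra_chars, can be sketched like this. The sample text, which ends in a non-breaking space followed by an ordinary space, is illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $p = HTML::Element->new_from_lol(
    ['p', "  Hello,\n  ", ['b', 'world'], "\xA0 "]
);

my $raw       = $p->as_text;           # whitespace exactly as stored
my $trimmed   = $p->as_trimmed_text;   # ASCII whitespace only
my $nbsp_too  = $p->as_trimmed_text(extra_chars => '\xA0');
```

Note the single quotes around '\xA0': the escape is passed through as written and interpreted inside the character class, as in the example call above.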

Returns a string representing the complete start tag for the element. I.e., leading "<", tag name, attributes, and trailing ">". All values are surrounded with double-quotes, and appropriate characters are encoded. If $entities is omitted or undef, all unsafe characters are encoded as HTML entities. See HTML::Entities for details. If you specify some value for $entities, remember to include the double-quote character in it. (Previous versions of this module would basically behave as if '&">' were specified for $entities.) If $entities is an empty string, no entity is escaped.

Returns true if the $h element is, or is contained anywhere inside an element that is any of the ones listed, or whose tag name is any of the tag names listed. You can use any mix of elements and tag names.

Returns true if $h has no content, i.e., has no elements or text segments under it. In other words, this returns true if $h is a leaf node, AKA a terminal node. Do not confuse this sense of "empty" with another sense that it can have in SGML/HTML/XML terminology, which means that the element in question is of the type (like HTML's <hr>, <br>, <img>, etc.) that can't have any content.

That is, a particular <p> element may happen to have no content, so $that_p_element->is_empty will be true -- even though the prototypical <p> element isn't "empty" (not in the way that the prototypical <hr> element is).

If you think this might make for potentially confusing code, consider simply using the clearer exact equivalent: not($h->content_list).

In scalar context: returns the node that's the immediate left sibling of $h. If $h is the leftmost (or only) child of its parent (or has no parent), then this returns undef.

In list context: returns all the nodes that're the left siblings of $h (starting with the leftmost). If $h is the leftmost (or only) child of its parent (or has no parent), then this returns an empty list.

In scalar context: returns the node that's the immediate right sibling of $h. If $h is the rightmost (or only) child of its parent (or has no parent), then this returns undef.

In list context: returns all the nodes that're the right siblings of $h, starting with the leftmost. If $h is the rightmost (or only) child of its parent (or has no parent), then this returns an empty list.

The first form (with no parameter) returns a string representing the location of $h in the tree it is a member of. The address consists of numbers joined by a '.', starting with '0', and followed by the pindexes of the nodes in the tree that are ancestors of $h, starting from the top.

So if the way to get to a node starting at the root is to go to child 2 of the root, then child 10 of that, and then child 0 of that, and then you're there -- then that node's address is "0.2.10.0".

As a bit of a special case, the address of the root is simply "0".

I foresee this being used mainly for debugging, but you may find your own uses for it.

$element_or_text = $h->address($address);

This form returns the node (whether element or text-segment) at the given address in the tree that $h is a part of. (That is, the address is resolved starting from $h->root.)

If there is no node at the given address, this returns undef.

You can specify "relative addressing" (i.e., that indexing is supposed to start from $h and not from $h->root) by having the address start with a period -- e.g., $h->address(".3.2") will look at child 3 of $h, and child 2 of that.
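Both forms of address, plus relative addressing, in a small sketch. The tree is hand-built and illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $tree = HTML::Element->new_from_lol(
    ['html',
        ['head', ['title', 'T']],
        ['body', ['p', 'one'], ['p', 'two']],
    ]
);

my $second_p = ($tree->find_by_tag_name('p'))[1];

my $addr     = $second_p->address;        # root, child 1, child 1
my $same     = $tree->address($addr);     # resolves back to $second_p
my $relative = $tree->address('.0.0');    # child 0 of child 0 of $tree
my $missing  = $tree->address('0.5');     # no such node
```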

Returns the list of the tag names of $h's ancestors, starting with its parent, and that parent's parent, and so on, up to the root. If $h is root, this returns an empty list. Example output: ('em', 'td', 'tr', 'table', 'body', 'html')

In list context, returns the list of all $h's descendant elements, listed in pre-order (i.e., an element appears before its content-elements). Text segments DO NOT appear in the list. In scalar context, returns a count of all such elements.

In list context, returns a list of elements at or under $h that have any of the specified tag names. In scalar context, returns the first (in pre-order traversal of the tree) such element found, or undef if none.

This is just an alias to find_by_tag_name. (There was once going to be a whole find_* family of methods, but then look_down filled that niche, so there turned out not to be much reason for the verboseness of the name "find_by_tag_name".)

In a list context, returns a list of elements at or under $h that have the specified attribute, and have the given value for that attribute. In a scalar context, returns the first (in pre-order traversal of the tree) such element found, or undef if none.

This method is deprecated in favor of the more expressive look_down method, which new code should use instead.

This starts at $h and looks thru its element descendants (in pre-order), looking for elements matching the criteria you specify. In list context, returns all elements that match all the given criteria; in scalar context, returns the first such element (or undef, if nothing matched).

Note that (attr_name, attr_value) and (attr_name, qr/.../) criteria are almost always faster than coderef criteria, so should presumably be put before them in your list of criteria. That is, in the example above, the sub ref is called only for elements that have already passed the criteria of having a "_tag" attribute with value "img", and an "alt" attribute with value "pix!". If the coderef were first, it would be called on every element, and then what elements pass that criterion (i.e., elements for which the coderef returned true) would be checked for their "_tag" and "alt" attributes.

Note that comparison of string attribute-values against the string value in (attr_name, attr_value) is case-INsensitive! A criterion of ('align', 'right') will match an element whose "align" value is "RIGHT", or "right" or "rIGhT", etc.

Note also that look_down considers "" (empty-string) and undef to be different things, in attribute values. So this:

$h->look_down("alt", "")

will find elements with an "alt" attribute, but where the value for the "alt" attribute is "". But this:

$h->look_down("alt", undef)

is the same as:

$h->look_down(sub { !defined($_[0]->attr('alt')) } )

That is, it finds elements that do not have an "alt" attribute at all (or that do have an "alt" attribute, but with a value of undef -- which is not normally possible).

Note that when you give several criteria, this is taken to mean you're looking for elements that match all your criteria, not just any of them. In other words, there is an implicit "and", not an "or". So if you wanted to express that you wanted to find elements with a "name" attribute with the value "foo" or with an "id" attribute with the value "baz", you'd have to do it like:
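A sketch of an "or" search written as a single coderef criterion, next to an ordinary "and" search for comparison. The form and its attribute values are illustrative, and the // operator assumes Perl 5.10 or later:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $form = HTML::Element->new_from_lol(
    ['form',
        ['input', { name => 'foo' }],
        ['input', { id   => 'baz' }],
        ['input', { name => 'other' }],
    ]
);

# "or": one coderef that tests both conditions itself.
my @either = $form->look_down(
    sub {
        my $e = shift;
        # lc folds case; // '' guards against a missing attribute
        lc($e->attr('name') // '') eq 'foo'
            or lc($e->attr('id') // '') eq 'baz';
    }
);

# "and": several criteria must all match.
my @both = $form->look_down(_tag => 'input', name => 'foo');
```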

Coderef criteria are more expressive than (attr_name, attr_value) and (attr_name, qr/.../) criteria, and all (attr_name, attr_value) and (attr_name, qr/.../) criteria could be expressed in terms of coderefs. However, (attr_name, attr_value) and (attr_name, qr/.../) criteria are a convenient shorthand. (In fact, look_down itself is basically "shorthand" too, since anything you can do with look_down you could do by traversing the tree, either with the traverse method or with a routine of your own. However, look_down often makes for very concise and clear code.)

In list context, returns a list consisting of the values of the given attribute for $h and for all its ancestors starting from $h and working its way up. Nodes with no such attribute are skipped. ("attr_get_i" stands for "attribute get, with inheritance".) In scalar context, returns the first such value, or undef if none.

...in list context, this will return a list consisting of the values of these attributes which exist in $h and its ancestors. In scalar context, this returns the first value (i.e., the value of the first existing attribute from the first element that has any of the attributes listed). So, in the above example,

Scans across $h and all its descendants, and makes a hash (a reference to which is returned) where each entry consists of a key that's a tag name, and a value that's a reference to a list of all elements that have that tag name. I.e., this method returns:
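A small sketch of reading the hash that tagname_map returns. The tree is illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $ul = HTML::Element->new_from_lol(
    ['ul', ['li', 'a'], ['li', ['b', 'c']]]
);

my $map = $ul->tagname_map;

# Keys are tag names; values are arrayrefs of the matching elements.
for my $tag (sort keys %$map) {
    printf "%s: %d element(s)\n", $tag, scalar @{ $map->{$tag} };
}
```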

Returns links found by traversing the element and all of its children and looking for attributes (like "href" in an <a> element, or "src" in an <img> element) whose values represent links. The return value is a reference to an array. Each element of the array is a reference to an array with four items: the link-value, the element that has the attribute with that link-value, the name of that attribute, and the tagname of that element. (Example: ['http://www.suck.com/',$elem_obj, 'href', 'a'].) You may or may not end up using the element itself -- for some purposes, you may use only the link value.

You might specify that you want to extract links from just some kinds of elements (instead of the default, which is to extract links from all the kinds of elements known to have attributes whose values represent links). For instance, if you want to extract links from only <a> and <img> elements, you could code it like this:
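A sketch of restricting extract_links to <a> and <img> elements and unpacking each four-item result. The tree and URLs are illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $div = HTML::Element->new_from_lol(
    ['div',
        ['a',    { href => 'http://example.com/' }, 'x'],
        ['img',  { src  => 'pic.png' }],
        ['link', { href => 'style.css' }],   # not requested, so skipped
    ]
);

my @seen;
for my $found (@{ $div->extract_links('a', 'img') }) {
    my ($link, $element, $attr, $tag) = @$found;
    push @seen, "$tag $attr $link";
    print "A $tag links to $link in its $attr attribute, at ",
          $element->address, ".\n";
}
```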

Returns true if $h and $i are both elements representing the same tree of elements, each with the same tag name, with the same explicit attributes (i.e., not counting attributes whose names start with "_"), and with the same content (textual, comments, etc.).

Sameness of descendant elements is tested, recursively, with $child1->same_as($child2), and sameness of text segments is tested with $segment1 eq $segment2.
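A sketch of how same_as treats internal and explicit attributes, using a clone as the second tree. The "_note" attribute is a hypothetical internal attribute invented for this example:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $x = HTML::Element->new_from_lol(['p', { class => 'a' }, 'hi']);
my $y = $x->clone;

# Attributes whose names start with "_" are not compared.
$y->attr('_note', 'internal only');
my $before = $x->same_as($y);    # true

# Explicit attributes are compared.
$y->attr('class', 'b');
my $after = $x->same_as($y);     # false
```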

Recursively constructs a tree of nodes, based on the (non-cyclic) data structure represented by each $array_ref, where that is a reference to an array of arrays (of arrays (of arrays (etc.))).

In each arrayref in that structure, different kinds of values are treated as follows:

Arrayrefs

Arrayrefs are considered to designate a sub-tree representing children for the node constructed from the current arrayref.

Hashrefs

Hashrefs are considered to contain attribute-value pairs to add to the element to be constructed from the current arrayref.

Text segments

Text segments at the start of any arrayref will be considered to specify the name of the element to be constructed from the current arrayref; all other text segments will be considered to specify text segments as children for the current arrayref.

Elements

Existing element objects are either inserted into the treelet constructed, or clones of them are. That is, when the lol-tree is being traversed and elements constructed based on what's in it, if an existing element object is found, if it has no parent, then it is added directly to the treelet constructed; but if it has a parent, then $that_node->clone is added to the treelet at the appropriate place.
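The four kinds of values can be seen together in one sketch. The element names, attribute values, and text are illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

# An existing, parentless element: it is inserted directly, not cloned.
my $existing = HTML::Element->new('em');
$existing->push_content('reused');

my $div = HTML::Element->new_from_lol(
    ['div', { id => 'main' },                          # tag name, then a hashref
        'Some text, ',                                 # later strings are text
        ['a', { href => 'http://example.com/' }, 'a link'],  # nested arrayref
        ', and ',
        $existing,                                     # existing element object
    ]
);

my $text = $div->as_text;
```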

This turns any text nodes under $h from mere text segments (strings) into real objects, pseudo-elements with a tag-name of "~text", and the actual text content in an attribute called "text". (For a discussion of pseudo-elements, see the "tag" method, far above.) This method is provided because, for some purposes, it is convenient or necessary to be able, for a given text node, to ask what element is its parent; and clearly this is not possible if a node is just a text string.

Note that these "~text" objects are not recognized as text nodes by methods like "as_text". Presumably you will want to call $h->objectify_text, perform whatever task that you needed that for, and then call $h->deobjectify_text before calling anything like $h->as_text.

This undoes the effect of $h->objectify_text. That is, it takes any "~text" pseudo-elements in the tree at/under $h, and deletes each one, replacing each with the content of its "text" attribute.

Note that if $h itself is a "~text" pseudo-element, it will be destroyed -- a condition you may need to treat specially in your calling code (since it means you can't very well do anything with $h after that). So that you can detect that condition, if $h is itself a "~text" pseudo-element, then this method returns the value of the "text" attribute, which should be a defined value; in all other cases, it returns undef.

(This method assumes that no "~text" pseudo-element has any children.)
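A sketch of the round trip through objectify_text and deobjectify_text. The paragraph is illustrative, and values are copied out of the pseudo-element before it is destroyed:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $p = HTML::Element->new_from_lol(['p', 'Hello ', ['b', 'world']]);

$p->objectify_text;
my ($first)   = $p->content_list;        # now a "~text" pseudo-element
my $tag       = $first->tag;
my $text      = $first->attr('text');
my $parent_ok = $first->parent == $p;    # a string could not answer this

undef $first;           # do not keep using it after deobjectify_text
$p->deobjectify_text;
my ($restored) = $p->content_list;       # a plain string again
```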

For every UL, OL, DIR, and MENU element at/under $h, this sets a "_bullet" attribute for every child LI element. For LI children of an OL, the "_bullet" attribute's value will be something like "4.", "d.", "D.", "IV.", or "iv.", depending on the OL element's "type" attribute. LI children of a UL, DIR, or MENU get their "_bullet" attribute set to "*". There should be no other LIs (i.e., except as children of OL, UL, DIR, or MENU elements), and if there are, they are unaffected.
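A sketch of number_lists on one ordered and one unordered list. The OL has no "type" attribute here, so the default decimal numbering is assumed; the list contents are illustrative:

```perl
use strict;
use warnings;
use HTML::Element 5 -weak;

my $div = HTML::Element->new_from_lol(
    ['div',
        ['ol', ['li', 'x'], ['li', 'y']],
        ['ul', ['li', 'z']],
    ]
);

$div->number_lists;

# Read back the internal "_bullet" attribute of each LI, in document order.
my @bullets = map { $_->attr('_bullet') } $div->find_by_tag_name('li');
print join(' ', @bullets), "\n";
```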

This method is for testing whether this element or the elements under it have linkage attributes (_parent and _content) whose values are deeply aberrant: if there are undefs in a content list; if an element appears in the content lists of more than one element; if the _parent attribute of an element doesn't match its actual parent; or if an element appears as its own descendant (i.e., if there is a cyclicity in the tree).

This returns empty list (or false, in scalar context) if the subtree's linkage methods are sane; otherwise it returns two items (or true, in scalar context): the element where the error occurred, and a string describing the error.

This method is provided mainly for debugging and troubleshooting -- it should be quite impossible for any document constructed via HTML::TreeBuilder to parse into a non-sane tree (since it's not the content of the tree per se that's in question, but whether the tree in memory was properly constructed); and it should be impossible for you to produce an insane tree just thru reasonable use of normal documented structure-modifying methods. But if you're constructing your own trees, and your program is going into infinite loops during calls to traverse() or any of the secondary structural methods, then as part of debugging, consider calling has_insane_linkage on the tree.

(v4.0) This method returns the class which will be used for new elements. It defaults to HTML::Element, but can be overridden by subclassing or esoteric means best left to those that will read the source and then not complain when those esoteric means change. (Just subclass.)

* If you want to free the memory associated with a tree built of HTML::Element nodes, and you have disabled weak references, then you will have to delete it explicitly using the "delete" method. See "Weak References".

* There's almost nothing to stop you from making a "tree" with cyclicities (loops) in it, which could, for example, make the traverse method go into an infinite loop. So don't make cyclicities! (If all you're doing is parsing HTML files, and looking at the resulting trees, this will never be a problem for you.)

* There's no way to represent comments or processing directives in a tree with HTML::Elements. Not yet, at least.

* There's (currently) nothing to stop you from using an undefined value as a text segment. If you're running under perl -w, however, this may make HTML::Element's code produce a slew of warnings.

You are welcome to derive subclasses from HTML::Element, but you should be aware that the code in HTML::Element makes certain assumptions about elements (and I'm using "element" to mean ONLY an object of class HTML::Element, or of a subclass of HTML::Element):

* The value of an element's _parent attribute must either be undef or otherwise false, or must be an element.

* The value of an element's _content attribute must either be undef or otherwise false, or a reference to an (unblessed) array. The array may be empty; but if it has items, they must ALL be either mere strings (text segments), or elements.

* The value of an element's _tag attribute should, at least, be a string of printable characters.

Moreover, bear these rules in mind:

* Do not break encapsulation on objects. That is, access their contents only thru $obj->attr or more specific methods.

* You should think twice before completely overriding any of the methods that HTML::Element provides. (Overriding with a method that calls the superclass method is not so bad, though.)