The reason I need to do this is that some nodes in the tree have the same name, but I need a unique identifier for each leaf node. In the above tree My goal is to have the following naming conventions once I reach a leaf node:

a.b a.c.e a.c.e.h a.c.f a.c.f.g.h a.c.f.g.j a.d

With my current code, I have no problem finding the name of the leaf node (in the example it would be nodes b, e ,h, f ,h,d,j). However, this results in a collision since in the example, there are two leaf nodes named "h". In my tree I also have the situations where the branch nodes have the same name. In the above tree, branch "g" appears twice however the leaf node is still unique. Appending the names is proving to be frustrating and not as trivial as I imagined.

Here is how I am trying to approach this (somewhat pseudo code):

Code

my $node_name = "";

&parse_tree($root_node, $node_name);

sub parse_tree { my ($root_node, $node_name) = $_;

#find sub nodes my @sub_nodes = $root_node->findnodes;

#basically a check for if sub nodes exists, if they don't then the current node is a leaf node if(scalar(@sub_nodes) > 0) {

# now go through each sub node foreach my $sub_node (@sub_nodes) {

#grab the name of node, I don't actually have this written but this will return the name of the current sub node; $node_name = &get_node_name ($sub_node);

#now do the recursion, here I also export the name of the node, so it can be used later. &parse_tree($sub_node, $node_name); } #reading a leaf node } else {

print "name: $node_name name\n"; } }

With the above, the output becomes the names of the leaf nodes (assuming the node names are (a,b,c,d,e,f,g,h) from the above example):

Code

name: b name: e name h name: f name: h name: j name: d

Is there a particular method I need to be using to parse this tree, to get it to output the way I want? I was thinking doing it 'length' wise where I traverse until I reach a leaf node. Once the leaf node is reached go back to the root node and continue parsing. Only issue with this is I would need to keep track of which nodes I have already read. An issue with marking nodes as read is that some nodes are identical (i.e. h, h or g, g) so once I mark one node as "read" it would skip the next one. Another possibility is I need to save the level of the tree I'm in, and always do a comparison with the previous level.

Anyways, writing this recursive function is making me lose my mind, I have spent a long time trying different methods but each method I figure out always has a drawback or case where it will fail. Maybe there is a very easy way of doing this that I am completely missing.

Recursion can be confusing. It helps to pay as much attention to the design of your data structure as you do to the code. Document its design before you attempt to code. I have included the design spec in the comments of the code. Neither Laurent nor I have reproduced your output exactly because you have not made it clear how to choose the nodes to print (They are all leaves except a.c.e). I print only the 'leaves' of the tree. (Move the print statement out of the if-block if you wish to print every node.)

Code

use strict; use warnings;

# Data structure is defined recursively. # TREE is a reference to a node # NODE is a two element list # 0: Value # 1: Ref to an array (possibly empty) of trees.

sub visit_all_nodes { # Visit every node in the tree. # If there are no sub-trees, print the "path" to the current node. use 5.10.0; use feature qw(state); state @path; my ($value, $ref_to_array_of_sub_trees) = @{shift()}; my $number_of_sub_trees = scalar @$ref_to_array_of_sub_trees; push @path, $value;

Thanks for the response Laurent and Bill. Sorry I was not clear in my explanation of the scenario, but I am trying to achieve only paths to leaf nodes, as Bills output shows. My input data structure is XML files.

Anyways I've managed to solve the issue on my end using Bill's state keyword, and using an array to store the name of the current node. My original implementation was similar to Bill's however I was not using an array to store the data.

I did not end up needing the special local variable. Probably best since I still do not quite understand how it works (or the local keyword for that matter).

The special variable $" contains the character (default <space>) which is placed between elements when an array is interpolated into a string. Your example required '.'.

You can think of 'local' as a way to make a copy of a global variable for use in the current scope (block). The rest of the program uses the default. For a more accurate description refer local.

This advice is probably to late for the current version of your project, but you almost certainly will write more versions. Parsing XML is really a job for a module. DIY can fail in ways you cannot even imagine. This issue comes up often. Search the forums for help in choosing which module to use. Good Luck, Bill

Ya I am using XML::LibXML for the main parsing of elements and nodes, however I needed to parse multiple XML files which are in a tree structure themselves. Possibly there is already something built in for this, but its what I could do in as little time as possible.