On Apr 17, 4:03 am, Clarendon <jine... at hotmail.com> wrote:
> Thank you very much for this information. It seems to point me in the
> right direction. However, I do not fully understand the flatten
> function and its output. Some indices seem to be inaccurate. I tried
> to find this function at nltk.tree.Tree.flatten, but it returns a
> flattened tree, not a tuple.
> So your flatten function must be a different one, and it's not one of
> the builtins, either. Could you tell me where I can find the
> documentation about this flatten function?
No, it is a different one. I don't even have it. We'd have to write
it.
The indices weren't included in the flattened tree, but if you're
writing it, they can be.
0: ( 'ROOT', None, <object>, None --no parent--, 0 )
1: ( 'S', None, <object>, 0 --parent is 'ROOT'--, 1 )
2: ( 'NP', None, <object>, 1 --parent is 'S'--, 2 )
3: ( 'PRP', 'I', <object>, 2 --parent is 'NP'--, 3 )
4: ( 'VP', None, <object>, 1 --parent is 'S'--, 2 )
5: ( 'VBD', 'came', <object>, 4 --parent is 'VP'--, 2 )
I screwed up the 'depth' field on #5. It should be:
5: ( 'VBD', 'came', <object>, 4 --parent is 'VP'--, **3** )
Otherwise I'm not sure what you mean by 'indices seem to be
inaccurate'. I can't be completely sure the rest is right, though;
after all, I did it by hand, not by program.
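Doing it by program could look something like the sketch below. It is only a sketch under assumptions: the tree is a made-up nested tuple of the form (label, word, children), not nltk's Tree class, and the name flatten_tree is hypothetical. Each row is ( label, word, object, parent index, depth ), the same fields as in the hand-written listing.

```python
# Hypothetical tree format: (label, word_or_None, [children]).
# This is an assumption for illustration, not nltk's Tree class.
tree = ('ROOT', None, [
    ('S', None, [
        ('NP', None, [('PRP', 'I', [])]),
        ('VP', None, [('VBD', 'came', [])]),
    ]),
])

def flatten_tree(node, parent=None, depth=0, rows=None):
    """Return pre-order rows of (label, word, node, parent_index, depth)."""
    if rows is None:
        rows = []
    label, word, children = node
    index = len(rows)  # this node's position in the flat list
    rows.append((label, word, node, parent, depth))
    for child in children:
        # Children record this node's index as their parent.
        flatten_tree(child, index, depth + 1, rows)
    return rows

for i, (label, word, _obj, parent, depth) in enumerate(flatten_tree(tree)):
    print(i, label, word, parent, depth)
```

The row index is just the position in the list, so it doesn't need to be stored in the tuple itself; enumerate recovers it.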
If your package comes with a flatten function, it would be a good
place to start. Flatten functions can get hairy. What is its code,
and what is its output?
Here's an example:
>>> a= [ 'p', [ [ 'q', 'r' ], 's', 't' ], 'u' ]
>>> a
['p', [['q', 'r'], 's', 't'], 'u']
>>> def flatten( x ):
...     for y in x:
...         if isinstance( y, list ):
...             for z in flatten( y ):
...                 yield z
...         else:
...             yield y
...
>>> list( flatten( a ) )
['p', 'q', 'r', 's', 't', 'u']
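
One place where it gets hairy: the version above only descends into lists. If you generalize it to any iterable, strings need special handling, because a one-character string is itself an iterable containing that same string and the recursion never bottoms out. A possible variant, assuming Python 3.3 or later for "yield from" (flatten_any is a name I made up):

```python
from collections.abc import Iterable

def flatten_any(x):
    """Yield the leaves of any nested iterable, treating str and bytes as leaves."""
    for y in x:
        if isinstance(y, Iterable) and not isinstance(y, (str, bytes)):
            # Descend into lists, tuples, generators, and so on.
            yield from flatten_any(y)
        else:
            yield y

print(list(flatten_any(['p', (('q', 'r'), 's', 't'), 'u'])))
```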