Let $T=(V,E,W)$ be a weighted tree (a connected, undirected, acyclic graph) on $n$ nodes with positive edge weights. The weights define a natural metric on the set $V$: $d(i,j)$ is the weight of the (unique) path between $i$ and $j$ in $T$.

Now suppose that $T$ is unknown and that we have access only to the $n\times n$ distance matrix induced by the tree. My question is: can one learn the structure of $T$ without looking at all ${n\choose 2}$ distances?

Alternatively, it is easy to see that the tree $T$ is the MST of the complete graph on $V$ with weights given by $d(\cdot,\cdot)$. Is there a sublinear algorithm, that is, one reading only $o(n^2)$ of the distances, for finding the MST of this special complete graph?
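To make the MST remark concrete, here is a minimal Python sketch (the function names `tree_distances` and `mst_from_metric` are my own, for illustration only). It builds the distance matrix of a small tree and runs a plain Prim's algorithm on the complete graph. This is the baseline that reads all $\Theta(n^2)$ entries, not a sublinear method. Because every non-tree pair has a distance strictly larger than each edge on its path, the MST is unique and equals $T$.

```python
def tree_distances(n, edges):
    """All-pairs path weights of a tree given as (u, v, w) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [[0] * n for _ in range(n)]
    for s in range(n):
        stack = [(s, -1)]          # depth-first walk from s
        while stack:
            u, parent = stack.pop()
            for v, w in adj[u]:
                if v != parent:
                    dist[s][v] = dist[s][u] + w
                    stack.append((v, u))
    return dist


def mst_from_metric(dist):
    """Prim's algorithm on the complete graph; inspects every entry."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n      # cheapest known link into the tree
    link = [-1] * n                # tree endpoint of that link
    best[0] = 0
    result = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if link[u] != -1:
            a, b = sorted((u, link[u]))
            result.add((a, b, dist[a][b]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                link[v] = u
    return result
```

For instance, with `edges = [(0, 1, 2), (1, 2, 3), (1, 3, 1), (3, 4, 4)]`, calling `mst_from_metric(tree_distances(5, edges))` gives back the same four weighted edges as a set.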

Edit: I should add that I would be happy to restrict attention to certain families of trees. For instance, this can be done when the tree $T$ is a path (a "line graph") and we know this beforehand.
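For the path case, one way to get by with a linear number of measurements can be sketched as follows (a sketch under the assumption that $T$ is known to be a path with positive weights; `learn_path` and the `query` callback are hypothetical names). The vertex farthest from an arbitrary start must be an endpoint, and sorting the other vertices by distance from that endpoint gives the order along the path.

```python
def learn_path(n, query):
    """Recover a hidden path on vertices 0..n-1.

    query(i, j) returns d(i, j). Uses 2(n-1) queries and assumes the
    hidden tree is a path with positive edge weights.
    """
    if n <= 1:
        return []
    # Round 1: distances from vertex 0; the farthest vertex is an endpoint.
    from_start = {v: query(0, v) for v in range(1, n)}
    end = max(from_start, key=from_start.get)
    # Round 2: distances from that endpoint increase strictly along the path.
    from_end = {v: query(end, v) for v in range(n) if v != end}
    order = [end] + sorted(from_end, key=from_end.get)
    offsets = [0] + [from_end[v] for v in order[1:]]
    # Consecutive vertices are adjacent; the edge weight is the gap in offsets.
    return [(order[k], order[k + 1], offsets[k + 1] - offsets[k])
            for k in range(n - 1)]
```

The sort costs $O(n \log n)$ time, but only $2(n-1)$ entries of the distance matrix are ever read.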

Probably not (I recall reading a lower bound involving the average degree of a vertex in a graph).
– Suvrit, Nov 9 '11 at 21:05

What structure of $T$ do you want to learn about?
– Michael Biro, Nov 9 '11 at 21:10

By structure, I mean I want to learn $T$. That is, the edge set.
– Bhakt, Nov 9 '11 at 21:14

The path lengths don't define a tree uniquely if you allow zero weight edges. Do they define a unique tree if all edge weights are positive?
– Michael Biro, Nov 9 '11 at 22:15

@Michael: Sorry, the edge weights have to be positive (I edited the question to reflect this). In this case, the path lengths do define a unique (minimum spanning) tree. This can be proved without too much effort by contradiction. [For instance, see the (famous) paper "Approximating Discrete Probability Distributions with Dependence Trees" by Chow and Liu (1968).]
– Bhakt, Nov 9 '11 at 23:13

2 Answers

Let $T$ be a star with weights 1, 2, 3, 4, ... on its edges. Unless you test the distance between every pair of leaves of $T$, you cannot distinguish it from a different tree in which two leaves whose distance was not tested lie on a single path from the hub of the star. For example, hang the leaf at distance 5 from the hub below the leaf at distance 2 by an edge of weight 3; only the distance between those two leaves changes (from 7 to 3). So there is an $\Omega(n^2)$ lower bound for this problem, matching the upper bound.
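A small Python check of this argument, as a sketch with made-up helper names (`all_pairs`, `differing_pairs`): the hub is vertex 0, leaf $i$ sits on an edge of weight $i$, and the variant tree re-hangs leaf $b$ below leaf $a$ with weight $b-a$. The function lists the vertex pairs whose distance differs between the two trees.

```python
from itertools import combinations


def all_pairs(n, edges):
    """Path weight between every pair of vertices of a tree."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [[0] * n for _ in range(n)]
    for s in range(n):
        stack = [(s, -1)]
        while stack:
            u, parent = stack.pop()
            for v, w in adj[u]:
                if v != parent:
                    dist[s][v] = dist[s][u] + w
                    stack.append((v, u))
    return dist


def differing_pairs(m, a, b):
    """Pairs whose distance differs between the star and its variant.

    Star: hub 0, leaves 1..m, leaf i on an edge of weight i.
    Variant: leaf b hangs below leaf a on an edge of weight b - a (a < b).
    """
    star = [(0, i, i) for i in range(1, m + 1)]
    variant = [e for e in star if e[1] != b] + [(a, b, b - a)]
    d_star = all_pairs(m + 1, star)
    d_var = all_pairs(m + 1, variant)
    return [(i, j) for i, j in combinations(range(m + 1), 2)
            if d_star[i][j] != d_var[i][j]]
```

Calling `differing_pairs(6, 2, 5)` returns `[(2, 5)]`: the two trees agree on every distance except the one between the two leaves involved, so an algorithm that skips that pair cannot tell them apart.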

I agree. This is one of the reasons I added an edit saying that I am happy to look at restricted families of graphs (bounded degree, e.g.). Here is a trivial example: if I restrict my attention to paths ("line graphs"), then $\Theta(n)$ measurements suffice to identify which path is the right one. So the real question is: are there interesting (non-trivial) assumptions one can make about the underlying tree that "generates" the distances so that $o(n^2)$ measurements suffice to identify it? I like your example. Thanks a lot for your answer!
– Bhakt, Nov 11 '11 at 18:17

@David, I think that's not quite right. If you find the center of the star, you can first make sure it is a star... etc. The easiest way I know of making the $\Omega(n^2)$ lower bound work is using "two-level" stars. See Proposition 7 of this paper: cc.gatech.edu/~lreyzin/papers/ReyzinSri07_alt.pdf
– Lev Reyzin, Jan 5 '12 at 23:02

When you say "first make sure it is a star", what do you mean? The two different trees in my answer do not differ from each other in the distances of the leaves from the hub, only in their distances from each other.
– David Eppstein, Jan 10 '12 at 7:25

Sorry, I misread your answer -- I missed that the numbers are weights (I thought they were just labels). The two-level star is needed only in the unweighted case.
– Lev Reyzin, Jan 10 '12 at 14:14