If we have an union-find(disjoint-set) data structure and we are doing an union by rank and path compression for a find operation, how would the depth and height of every node change after the find operation?

I am trying to find the amortized cost of the find operation using the potential function:

$$phi(x) = \sum_{x} height(x) $$

I know this is not the right potential function, however, i am trying to learn more about amortized analysis and i am just practicing by analyzing more data structures.

$\begingroup$Union by rank and path compression!!Is this the answer you are looking for?I am really sorry, i am more than glad to give you more information, but i am not sure what exactly is being asked here!!$\endgroup$
– karthik rameshOct 14 '16 at 1:21

$\begingroup$There are many possible ways to implement union-find. Based on the question, I can guess that you are looking at a tree-based one. But you need to specify that!$\endgroup$
– Raphael♦Oct 14 '16 at 8:57

$\begingroup$oh my god!!Yes!!I am really stupid that i dint mention it!!I am really sorry!!Yes, i was looking at the tree implementation of the data structure!!I wish i could have given you more information in the beginning!!$\endgroup$
– karthik rameshOct 14 '16 at 14:28

1 Answer
1

I assume that the height of a node $\small x$, denoted as $\small h(x)$, is recursively defined as:
$$
\small
h(x) = \begin{cases}
0 & \text{ if $x$ is a leaf} \\
1+\max\{h(y) \mid y\text{ is child of } x\} & \text{ otherwise }
\end{cases}
$$
Let the value of the potential function before calling $\small \texttt{Find}(x)$ be $\small \Phi$ and the value after the call be $\small \bar{\Phi}$. Similarly we can define $\small h(x)$ and $\small \bar{h}(x)$ respectively. Denote as $\small P_x$ the path from $\small x$ to the root $\small r$ of the corresponding tree. By path compression, nodes in $\small P_x$ (except $\small r$) are all set to point to $\small r$. Therefore, generally $\small \bar{h}(y) \leq h(y)$ for $\small y \in P_x$, since $\small y$ may have one child removed.

The amortized cost of $\small \texttt{Find}(x)$ is:
$$
\small
\mathcal{O}(|P_x|) + \bar{\Phi} - {\Phi} = \mathcal{O}(|P_x|) + \sum_{y \in P_x}\{\bar h(y) - {h}(y)\}
$$
where $\small |P_x|$ is the # of nodes in $\small P_x$ and $\small \mathcal{O}(|P_x|)$ is the actual cost of $\small \texttt{Find}(x)$. Now look at term $\small \bar h(y) - {h}(y)$, by the discussion above, its value is no greater than $\small 0$. Unfortunately, there are special cases where $\small \bar h(y) = h(y)$ for all $\small y \in P_x$. Therefore, what we can say about the amortized cost, using sum of heights as potential function, is the amortized cost is $\small \mathcal{O}(|P_x|)$, which is the same as the worse-case bound.