Problem Detail: Is there any formal definition about the average height of a binary tree? I have a tutorial question about finding the average height of a binary tree using the following two methods:
- The natural solution might be to take the average length of all possible paths from the root to a leaf, that is $qquad displaystyle operatorname{avh}_1(T) = frac{1}{text{# leaves in } T} cdot sum_{v text{ leaf of } T} operatorname{depth}(v)$.
- Another option is to define it recursively, that is the average height for a node is the average over the average heights of the subtrees plus one, that is $qquad displaystyle operatorname{avh}_2(N(l,r)) = frac{operatorname{avh}_2(l) + operatorname{avh}_2(r)}{2} + 1$ with $operatorname{avh}_2(l) = 1$ for leafs $l$ and $operatorname{avh}_2(_) = 0$ for empty slots.
Based on my current understanding, for example the average height of the tree $T$
1 / 2 3 / 4
is $operatorname{avh}_2(T) = 1.25$ by the second method, that is using recursion. However, I still don’t quite understand how to do the first one. $operatorname{avh}_1(T) = (1+2)/2=1.5$ is not correct.
Asked By : Timeless
Answered By : Raphael
There is no reason to believe that both definitions describe the same measure. You can write $operatorname{avh}_1$ recursively, too: $qquad displaystyle operatorname{avh}_1(N(l,r)) = frac{operatorname{lv}(l)(operatorname{avh_1}(l) + 1) + operatorname{lv}(r)(operatorname{avh_1}(r) + 1)}{operatorname{lv}(l) + operatorname{lv}(r)}$ with $operatorname{avh}_1(l) = 0$ for leaves $l$. If you don’t believe that this is the same, unfold the definition of $operatorname{avh}_1$ on the right hand side, or perform an induction proof. Now we see that $operatorname{avh}_1$ works quite differently from $operatorname{avh}_2$. While $operatorname{avh}_2$ weighs the recursive heights of a nodes children equally (adding and dividing by two), $operatorname{avh}_1$ weighs them according to the number of leaves they contain. So they are the same (modulo the anchor) for leaf-balanced trees, that is balanced in the sense that sibling trees have equally many leaves. If you simplify the recursive form of $operatorname{avh}_1$ with $operatorname{lv}(l) = operatorname{lv}(r)$ this is immediately apparent. On unbalanced trees, however, they are different. Your calculations are indeed correct (given your definition); note that the example tree is not leaf-balanced.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/2762