Finding the minimum and maximum height in an AVL tree, given a number of nodes?

Is there a formula to calculate the maximum and minimum height of an AVL tree, given a certain number of nodes?

For example:
Textbook question:
What is the maximum/minimum height for an AVL tree of 3 nodes, 5 nodes, and 7 nodes?
Textbook answer:
The maximum/minimum height for an AVL tree of 3 nodes is 2/2, for 5 nodes is 3/3, for 7 nodes is 4/3

I don't know if they figured it out by some magic formula, or if they drew out the AVL tree for each of the given node counts and determined it that way.

5

There are 5 answers

0
River On BEST ANSWER

The solution below is appropriate for working things out by hand and gaining an intuition. For larger trees (54+ nodes), please see the formulas at the bottom of this answer. [1]

The minimum height [2] is easy: just fill each level of the tree with nodes until you run out. That height is the minimum.

To find the maximum, do the same as for the minimum, but then go back one step (remove the last placed node) and see if adding that node to the opposite sub-tree (from where it just was) violates the AVL tree property. If it does, your max height is just your min height. Otherwise this new height (which should be min height+1) is your max height.

If you need an overview of what the properties of an AVL tree are, or just a general explanation of an AVL tree, Wikipedia is a great place to start.

Example:

Let's take the 7 node example case. You fill in all levels and find a completely filled tree of height 3. (1 at level 1, 2 at level 2, 4 at level 3. 1+2+4=7 nodes.) That means 3 is your minimum.

Now find the max. Remove that last node and place it on the left subtree instead of the right. The right subtree still has height 3, but the left subtree now has height 4. However these values differ by less than 2, so it is still an AVL tree. Therefore your max height is 4. (Which is min+1)

All three examples worked out below (note that the numbers correspond to order of placement, NOT value):

[Image: the 3-node, 5-node, and 7-node examples worked out]


Formulas:

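The technique shown above doesn't hold for a tree with a very large number of nodes. In that case, the following formulas give the minimum height [2] exactly and an upper bound on the maximum height.

Given n nodes [3]:

Minimum: ceil(log2(n+1))

Maximum: floor(1.44*log2(n+2)-.328)

Note that the maximum expression can overshoot the true maximum by one for some n: for n=3 it evaluates to 3, while the textbook answer in the question is 2.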

If you're curious, the first time max-min>1 is when n=54.
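As a rough sketch, the two expressions above can be evaluated directly in Python. The function names `min_height` and `max_height_estimate` are made up for illustration, and the maximum is computed exactly as written, so it should be read as an estimate: for n=3 it returns 3 even though the textbook answer in the question is 2.

```python
import math

def min_height(n):
    # Minimum height (counted in levels): ceil(log2(n + 1)).
    return math.ceil(math.log2(n + 1))

def max_height_estimate(n):
    # Maximum-height expression from this answer, evaluated as written.
    # Treat the result as an estimate that may be one too high for some n.
    return math.floor(1.44 * math.log2(n + 2) - 0.328)

for n in (3, 5, 7, 54):
    print(n, min_height(n), max_height_estimate(n))
```

For n=54 the two values are 6 and 8, which matches the remark that the gap first exceeds 1 at that point.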

[1] Thanks to Jamie S for bringing this failure at larger node counts to my attention.

[2] Technically, the height of a tree is the longest path length (in edges) between the root and any leaf node. However the OP's textbook uses a common alternate definition of height as the number of levels in a tree. For consistency with the OP and Wikipedia, we use that definition in this post as well.

[3] These formulas are from the Wikipedia AVL page, with constants plugged in. The original source is Sorting and searching by Donald E. Knuth (2nd Edition).

2
Evan Bechtol On

It's important to note the following defining characteristics of an AVL Tree.

AVL Tree Property

  • The nodes of an AVL tree abide by the BST property
  • AND The heights of the left and right sub-trees of any node differ by no more than 1.

Theorem: The AVL property is sufficient to maintain a worst case tree height of O(log N).

Note the following diagram. [Image: AVL Tree]

- T1 is comprised of a T0 + 1 node, for a height of 1.
- T2 is comprised of T1 and a T0 + 1 node, giving a height of 2.
- T3 is comprised of a T2 for the left sub-tree and a T1 for the right sub-tree + 1 node, for a height of 3.
- T4 is comprised of a T3 for the left sub-tree and a T2 for the right sub-tree + 1 node, for a height of 4.

For the trees T0 to T4 above, taking the ceiling of log2(N), where N is the number of nodes in the tree, gives the height.

Example: T4 contains 12 nodes, and ceil(log2(12)) = 4.

See the pattern developing here?

The worst-case height is: [image: formula not available]

0
Michelle On

http://lcm.csa.iisc.ernet.in/dsa/node112.html

It is roughly 1.44 * log n, where n is the number of nodes.

For a more detailed description of how that was derived, you can refer to this link, starting in the middle of page 13: http://www.compsci.hunter.cuny.edu/~sweiss/course_materials/csci335/lecture_notes/chapter04.2.pdf

0
Naman Choradia On

Let's assume the number of nodes is n.

Trying to find out the minimum height of an AVL tree would be the same as trying to make the tree complete i.e. fill all the possible nodes at each level and then move to the next level.

So at each level the number of eligible nodes increases by 2^(h-1) where h is the height of the tree.

So at h=1, nodes(1) = 2^(1-1) = 1 node

for h=2, nodes(2) = nodes(1)+2^(2-1) = 3 nodes

for h=3, nodes(3) = nodes(2)+2^(3-1) = 7 nodes

So just find the smallest h for which nodes(h) is greater than or equal to the given number of nodes n.
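The level-by-level counting can be sketched as a small loop. The name `min_height_by_levels` is hypothetical, and the loop stops at the first height whose total capacity is at least n, so that 7 nodes fit in 3 levels as in the textbook answer.

```python
def min_height_by_levels(n):
    # Add one full level at a time until the tree can hold n nodes.
    height = 0
    capacity = 0  # nodes(height): total nodes in a full tree of this height
    while capacity < n:
        height += 1
        capacity += 2 ** (height - 1)  # level `height` holds 2^(height-1) nodes
    return height

print([min_height_by_levels(n) for n in (3, 5, 7)])
```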

Now for the problem of the maximum height of an AVL tree:

Let's assume that the AVL tree has height h, with F(h) being the minimum number of nodes in an AVL tree of that height.

For its height to be maximum, let's assume that its left subtree FL and right subtree FR differ in height by 1 (which still satisfies the AVL property).

Now assume FL is a tree with height h-1 and FR is a tree with height h-2.

Then the number of nodes is

F(h)=F(h-1)+F(h-2)+1 (Eq 1)

Adding 1 on both sides :

F(h)+1=(F(h-1)+1)+ (F(h-2)+1) (Eq 2)

So we have reduced the maximum height problem to a Fibonacci sequence. And these trees F(h) are called Fibonacci Trees.

So, F(1)=1 and F(2)=2.

So in order to get the maximum height, just find the largest index h for which F(h) is less than or equal to n.

So applying (Eq 1):

F(3) = F(2) + F(1) + 1 = 4, so if n is at least 4 and less than 7, the tree will have a maximum height of 3.

F(4) = F(3) + F(2) + 1 = 7 and F(5) = F(4) + F(3) + 1 = 12, so similarly if n is at least 7 and less than 12, the tree will have a maximum height of 4.

and so on.
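A minimal sketch of this search, using (Eq 1) to step through the sequence until the next value would exceed n. The function name `max_height` and the starting value F(0)=0 (an empty tree) are assumptions added here for convenience; heights are counted in levels, as in the question.

```python
def max_height(n):
    # Largest h such that F(h) <= n, where F(h) = F(h-1) + F(h-2) + 1.
    if n < 1:
        return 0
    prev, cur = 0, 1  # F(0) = 0 (assumed, empty tree) and F(1) = 1
    height = 1
    while True:
        nxt = prev + cur + 1  # fewest nodes needed for height + 1
        if nxt > n:
            return height
        prev, cur = cur, nxt
        height += 1

print([max_height(n) for n in (3, 5, 7)])
```

This reproduces the textbook maximums of 2, 3 and 4 for 3, 5 and 7 nodes.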

0
hdante On

The formula is not magic: the AVL tree was specifically designed to have logarithmic height, search, insertion and deletion.

The minimum height is the best case for the AVL tree, when it's densely packed. In particular, a perfect and complete AVL tree has all balance factors of all nodes equal to zero and it has a simple exact formula for the height: h = lg(N+1). This can be proved by finite induction, or more simply, by using the formula of the finite sum of a geometric progression, because each level in the tree has twice the number of nodes of the level above, forming a geometric progression of factor 2: 1, 2, 4, 8, 16, ...

1 + 2 + 4 + 8 + ... = ∑ a.qⁱ = a₁ (1 - qⁿ) / (1 - q)

In the equation, a₁ = 1, q = 2 and n is the tree height, h, so:

N = 1.(1 - 2ⁿ)/(1 - 2) = 2ⁿ - 1

h = lg(N+1) ∎

As the AVL tree starts to become less dense, the absolute value of the balance factor starts increasing to 1 in some nodes. Due to the "holes" in the node allocation, the height will start increasing relative to a dense tree with the same number of elements. In the worst case, all internal nodes have a balance factor different from zero and the tree height reaches its maximum when this happens.

It's possible to recursively construct trees from smaller trees, all having internal nodes with balance factor equal to -1 and each time a new tree is constructed, the height increases by 1. We start with trees with height 0 and 1, T0 and T1. For tree T_h, we build it by placing T_h-1 as the left child of the root and T_h-2 as the right child of the root. All such trees T_h have their internal nodes balance factor equal to -1, which is the worst case. Such trees are called Fibonacci trees and the number of elements in them T(h) is given by the following series:

T(0) = 0, T(1) = 1

T(h) = T(h-1) + T(h-2) + 1 (#elements left tree + #elements right tree + root)

This series is very similar to the Fibonacci sequence; the difference is the additional +1 term. It can be shown by finite induction that T(h) = F(h+2) - 1, where F(n) is the nth Fibonacci number. This gives a closed formula for the number of elements in a Fibonacci tree of height h:

N = T(h) = (ϕʰ⁺² - ψʰ⁺²) / √5 - 1

Where ϕ is the golden ratio and ψ its conjugate. This formula already shows an exponential relation between the number of elements in the tree N and its height, but it cannot be solved directly for the height h. However, -ψʰ⁺² / √5 is a small number that tends to zero when h tends to infinity. We can replace this term by a constant ε, chosen as the value of largest magnitude that this term can assume:

N = T(h) = ϕʰ⁺² / √5 + ε - 1

ϕʰ⁺² = √5.(N + 1 - ε)

h+2 = log_ϕ (√5.(N + 1 - ε))

h = log_ϕ (√5.(N + 1 - ε)) - 2
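As a quick numerical sanity check, the closed form for T(h) (with exponent h+2) can be compared against the recurrence T(h) = T(h-1) + T(h-2) + 1. This is only a sketch; the helper names `size_recurrence` and `size_closed_form` are invented for this illustration.

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2  # golden ratio
PSI = (1 - SQRT5) / 2  # its conjugate

def size_recurrence(h):
    # T(0) = 0, T(1) = 1, T(h) = T(h-1) + T(h-2) + 1
    a, b = 0, 1
    for _ in range(h):
        a, b = b, a + b + 1
    return a

def size_closed_form(h):
    # (phi^(h+2) - psi^(h+2)) / sqrt(5) - 1, evaluated in floating point
    return (PHI ** (h + 2) - PSI ** (h + 2)) / SQRT5 - 1

for h in range(9):
    print(h, size_recurrence(h), round(size_closed_form(h)))
```

Both columns agree after rounding away the floating-point error.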

The constants can be rearranged and simplified to give:

h = log_ϕ (N + a) + b

Where a = 1 - ε and b = log_ϕ (√5) - 2. Replacing ε with its extreme value, -ψ² / √5 (attained at h = 0), the upper limit for the height h can be found:

h ≤ log_ϕ (N + a) + b

Where a = ψ² / √5 + 1 and b = log_ϕ (√5) - 2. If desired, the logarithm in base ϕ can be transformed to base 2 by multiplying by a constant:

h ≤ c.lg (N + a) + b ∎

Where a = ψ² / √5 + 1, b = log_ϕ (√5) - 2 and c = 1/lg ϕ. c is approximately equal to 1.44042, which means that, compared to the formula for the minimum height, the worst-case height of the AVL tree is around 44% bigger.
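To see how tight the bound is, the constants a, b and c can be computed numerically and the bound evaluated at the Fibonacci-tree sizes T(h), where the height is largest for the given number of elements. This is a minimal sketch; the names `height_bound` and `fibonacci_tree_sizes` are hypothetical.

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2
PSI = (1 - SQRT5) / 2

A = PSI ** 2 / SQRT5 + 1      # a = psi^2 / sqrt(5) + 1
B = math.log(SQRT5, PHI) - 2  # b = log_phi(sqrt(5)) - 2
C = 1 / math.log2(PHI)        # c = 1 / lg(phi)

def height_bound(n):
    # Upper bound on the height: c * lg(n + a) + b
    return C * math.log2(n + A) + B

def fibonacci_tree_sizes(max_h):
    # T(0) = 0, T(1) = 1, T(h) = T(h-1) + T(h-2) + 1
    sizes = [0, 1]
    while len(sizes) <= max_h:
        sizes.append(sizes[-1] + sizes[-2] + 1)
    return sizes

sizes = fibonacci_tree_sizes(20)
for h, n in enumerate(sizes):
    print(h, n, round(height_bound(n), 4))
```

At each T(h) the bound lands between h and h+1, so taking its floor recovers the worst-case height for those sizes.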