In this post I will review some of the main kinds of self-balancing binary search trees, provide Java implementations, and micro-benchmark their performance under various conditions.

**Why do we need binary search trees?**

Let's say we have a list as input and we need to find some element in it. What is the best way to do that? A naive implementation would simply iterate through all the elements and check each one. That may be fine for small lists and simple use cases, but can we do better? Probably not with a plain unsorted array. However, if we sort it beforehand, we can "divide and conquer": check the element in the middle, and if the one we are looking for is larger, repeat the process on the sub-list from n/2 to n (where n is the number of elements in the list). That cuts the search space considerably, giving O(log n) complexity instead of O(n). This is fine, but it requires sorting the input first, and if we are constantly modifying (inserting into/deleting from) the input and searching through it, it is not practical to re-sort it before each search. This is where binary search trees step in: we change the way the data is kept, storing it not in a flat list/array but in a tree-like structure. Looking at the picture you can probably already see why it is easier to search in such a structure.
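To make the "divide and conquer" idea concrete, here is a minimal sketch of binary search over a sorted array (illustrative names, not part of the tree implementations discussed below):

```java
// Binary search over a sorted array: O(log n) lookups instead of
// the O(n) linear scan described above.
public class BinarySearchDemo {
    // Returns the index of `key` in sorted array `a`, or -1 if absent.
    static int binarySearch(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids overflow for large indices
            if (a[mid] == key) return mid;
            if (a[mid] < key) lo = mid + 1; // key is in the upper half
            else hi = mid - 1;              // key is in the lower half
        }
        return -1;
    }
}
```

Each iteration halves the remaining search space, which is exactly what a balanced binary search tree does on every step down from the root.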

However, a binary search tree can become unbalanced and then lose its efficiency. Self-balancing binary search trees were invented to solve that problem. I am not going to explain in detail how each of them works, just give some general information and ideas on how they can be used. I will also provide my implementations in Java and some benchmarks, so you can start playing with them if you want.

**AVL tree**

One of the oldest, most well known and most popular tree data structures. It stores additional height information in each node and rebalances the tree if the height of one node's subtree exceeds its sibling's by 2. That keeps the tree rigidly balanced, so search is extremely fast on an AVL tree. However, that comes at the cost of rebalancing, which can make insert/delete operations slightly more expensive than in other trees.
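As a minimal sketch of that bookkeeping (illustrative names, not the full implementation from my repo): each node carries its height, the balance factor is the difference between the children's heights, and a rotation fires when it reaches 2.

```java
// Sketch of AVL bookkeeping: per-node height, balance factor,
// and one of the rebalancing rotations.
public class AvlSketch {
    static class Node {
        int key, height = 1;      // a fresh leaf has height 1
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node n) { return n == null ? 0 : n.height; }

    // Positive: left-heavy; negative: right-heavy; |balance| == 2 triggers a rotation.
    static int balance(Node n) {
        return n == null ? 0 : height(n.left) - height(n.right);
    }

    // Single right rotation, applied when a node becomes left-heavy by 2.
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        y.height = 1 + Math.max(height(y.left), height(y.right));
        x.height = 1 + Math.max(height(x.left), height(x.right));
        return x;                 // x is the new subtree root
    }
}
```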

Quick link to my implementation on GitHub

**Red-Black tree**

The most widely used self-balancing binary search tree. It also stores additional information in each node (the color, red or black) and preserves balance based on certain rules about how nodes may be colored. It can be less balanced than an AVL tree (which means slower search) but with more efficient inserts and deletes. Also, because only two colors are required, a red-black tree needs only one extra bit of information per node. All of this makes it the most popular choice for general use cases. Java's *TreeSet* is implemented as a red-black tree.
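The per-node overhead mentioned above can be sketched like this (hypothetical names, not the JDK's internals):

```java
// Sketch of red-black per-node bookkeeping: a single boolean is
// enough to encode the color, versus an integer height in an AVL node.
public class RedBlackNodeSketch {
    static final boolean RED = true, BLACK = false;

    static class Node {
        int key;
        boolean color = RED;     // newly inserted nodes start out red
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    // Null links count as black leaves, which simplifies the coloring rules.
    static boolean isRed(Node n) {
        return n != null && n.color == RED;
    }
}
```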

Quick link to my implementation on GitHub

**Scapegoat tree**

This tree has the advantage that it needs no additional memory per node. Once it finds out that the tree is out of balance, it finds a scapegoat node and rebuilds the subtree rooted there. So it might be a good choice when memory is at a premium. To decide whether the tree is out of balance it uses a parameter alpha. This allows a scapegoat tree to be parametrized with different values of alpha, which means different degrees of balance, which in turn means a choice between faster search and faster inserts. Imagine doing that dynamically: when searches dominate and inserts/deletes are rare, lower alpha, and raise it when the share of inserts/deletes increases.
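The alpha test can be sketched as follows (illustrative names; here subtree sizes are recomputed on the fly for clarity, whereas a real implementation derives them more cheaply during the insert walk):

```java
// Sketch of the alpha-weight-balance check used by scapegoat trees.
// Alpha near 0.5 keeps the tree tight (fast search, frequent rebuilds);
// alpha near 1.0 rebuilds rarely but allows longer paths.
public class ScapegoatSketch {
    static class Node {
        Node left, right;
    }

    static int size(Node n) {
        return n == null ? 0 : 1 + size(n.left) + size(n.right);
    }

    // A node failing this test on the insertion path is a scapegoat
    // candidate: its subtree is flattened and rebuilt perfectly balanced.
    static boolean isAlphaBalanced(Node n, double alpha) {
        int total = size(n);
        return size(n.left) <= alpha * total && size(n.right) <= alpha * total;
    }
}
```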

Quick link to my implementation on GitHub

**Treap**

Basically a hybrid of a randomized binary search tree and a heap. For each node it randomly generates and stores an additional parameter, the *priority*. The tree is then balanced so that the priority values satisfy the heap property. It is easy to implement and, I think, opens the possibility of fast set operations (which I did not implement in my code). Other than that, I don't know if it has any advantage over the other trees. Maybe instead of a random priority we could use a non-random (somehow calculated) importance value, which would keep the more important nodes near the top?
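A minimal treap insert can be sketched like this (illustrative names, not my repo's code): BST ordering on keys, heap ordering on the random priorities, restored by rotations on the way back up the recursion.

```java
import java.util.Random;

// Sketch of a treap insert: keys follow BST order, randomly drawn
// priorities follow max-heap order, enforced by rotations.
public class TreapSketch {
    static final Random RNG = new Random(42); // seeded for reproducibility

    static class Node {
        int key;
        int priority = RNG.nextInt();
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node rotateRight(Node y) {
        Node x = y.left; y.left = x.right; x.right = y; return x;
    }

    static Node rotateLeft(Node x) {
        Node y = x.right; x.right = y.left; y.left = x; return y;
    }

    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) {
            root.left = insert(root.left, key);
            // Restore the heap property if the new child outranks us.
            if (root.left.priority > root.priority) root = rotateRight(root);
        } else {
            root.right = insert(root.right, key);
            if (root.right.priority > root.priority) root = rotateLeft(root);
        }
        return root;
    }
}
```

Because the priorities are random, the resulting shape is that of a random BST, which is balanced in expectation even for sorted input.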

Quick link to my implementation on GitHub

**Splay tree**

The splay tree is pretty interesting. It performs an operation called *splaying* not only on inserts and deletes, but also on searches. This operation performs tree rotations and brings the accessed element up to the root, which means recently accessed elements are always near the top. You can see how this can be very useful in certain use cases, such as some cache implementations. However, the additional rotations on search add overhead, and if every element is equally likely to be searched for, splaying doesn't do much good.
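To illustrate the idea, here is a simplified "move-to-root" sketch (illustrative names, not my repo's code). Real splaying uses zig-zig/zig-zag double rotations, which is what gives the amortized O(log n) guarantee that plain move-to-root lacks, but the effect, hot elements ending up at the root, is the same.

```java
// Simplified move-to-root sketch of the idea behind splaying:
// after an access, the element is rotated up until it is the root.
public class MoveToRootSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Rotates `key` to the root of the subtree; assumes the key is present.
    static Node moveToRoot(Node root, int key) {
        if (root == null || root.key == key) return root;
        if (key < root.key) {
            root.left = moveToRoot(root.left, key);
            Node x = root.left;      // right rotation: x becomes the new root
            root.left = x.right;
            x.right = root;
            return x;
        } else {
            root.right = moveToRoot(root.right, key);
            Node y = root.right;     // left rotation: y becomes the new root
            root.right = y.left;
            y.left = root;
            return y;
        }
    }
}
```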

Quick link to my implementation on GitHub

**Short summary**

The first choice for a BST is a *Red-Black tree*, as it is the most universal data structure. However, if your tree will be used mostly for search, take a look at the *AVL tree*. If mostly the same elements are retrieved again and again, a *Splay tree* can be even better. If the load is sometimes read-heavy and sometimes write-heavy, a *Scapegoat tree* with different alpha parameters might be helpful. And if for some reason you need to implement a tree yourself and don't yet feel very confident about your coding skills, a Treap might be your choice.

**Benchmarks**

Here I have prepared some simple benchmarks for my own implementations of each tree. Please keep in mind that I might have made some mistakes or my implementations may be inefficient, so let me know if you find that to be the case. Note that ms/op in the units column means milliseconds for the whole 100K-element test. I've used the JMH framework for benchmarking.

This first benchmark inserts elements from 1 to 100K in order. We see that all scapegoat trees performed very poorly; I am not sure why. The splay tree performed phenomenally, but that is less impressive once you realize the resulting tree is not balanced at all and is in fact basically a linked list. Interestingly, the AVL tree was much better than the red-black tree. The treap was quite good too (maybe even the best choice, as it is much simpler than the only slightly faster AVL tree). Another interesting observation is that the less balanced scapegoat tree (alpha 0.9) had slower inserts than the 0.75-alpha one. With sorted input this is no surprise: rebalancing too rarely in this situation increases the cost of finding the next element's place once the tree grows larger.

| Insert sorted elements benchmark | Score | Score error | Units |
|---|---|---|---|
| AVL tree | 9.420 | 0.270 | ms/op |
| JDK TreeSet | 13.981 | 0.479 | ms/op |
| Red-Black tree | 17.430 | 0.917 | ms/op |
| Scapegoat tree (alpha 0.6) | 131.432 | 4.159 | ms/op |
| Scapegoat tree (alpha 0.75) | 76.769 | 2.538 | ms/op |
| Scapegoat tree (alpha 0.9) | 77.999 | 1.775 | ms/op |
| Splay tree | 1.229 | 0.039 | ms/op |
| Treap | 9.576 | 1.102 | ms/op |

Once again let's insert 100K elements, but this time random rather than sorted. My AVL tree implementation did quite well, and my red-black tree implementation was much slower than one would expect (remember, it is supposed to be faster on inserts). Maybe my implementation is not very efficient or I made a mistake somewhere. The red-black tree implementation from the JDK was the fastest here (though only barely faster than my unoptimized AVL tree) and the splay tree was the slowest, so no more surprises.

| Insert random elements benchmark | Score | Score error | Units |
|---|---|---|---|
| AVL tree | 39.051 | 1.998 | ms/op |
| JDK TreeSet | 38.811 | 4.356 | ms/op |
| Red-Black tree | 59.052 | 7.871 | ms/op |
| Scapegoat tree (alpha 0.6) | 50.738 | 4.400 | ms/op |
| Scapegoat tree (alpha 0.75) | 44.077 | 1.192 | ms/op |
| Scapegoat tree (alpha 0.9) | 44.862 | 2.168 | ms/op |
| Splay tree | 76.457 | 4.114 | ms/op |
| Treap | 47.978 | 1.656 | ms/op |

This next benchmark inserts 100K elements and then deletes them one by one. Once more my red-black tree implementation underperforms (any hints as to why would be appreciated!). Interestingly, the more rigidly balanced scapegoat trees couldn't even complete this test (even after I rewrote the tree-flattening algorithm from recursive to iterative). My explanation is that this is not a good scenario for a scapegoat tree: constant deletion triggers rebalancing at the root too often (a scapegoat tree rebalances at the root node on deletion), and that is an expensive operation, as it has to flatten and then rebuild the whole tree (not just a scapegoat subtree, as on inserts). It should do fine, however, when there are not too many deletes in a row. The loosely balanced scapegoat tree did quite well. The JDK red-black tree was the fastest and the splay tree was the slowest once again.

| Delete elements benchmark | Score | Score error | Units |
|---|---|---|---|
| AVL tree | 88.266 | 4.713 | ms/op |
| JDK TreeSet | 67.010 | 5.138 | ms/op |
| Red-Black tree | 114.527 | 12.326 | ms/op |
| Scapegoat tree (alpha 0.9) | 75.266 | 1.758 | ms/op |
| Splay tree | 151.600 | 8.451 | ms/op |
| Treap | 97.886 | 7.516 | ms/op |

Now let's look at searching in trees of 100K elements. Surprisingly, the JDK red-black tree slightly outperformed the also very fast AVL tree. The low-alpha scapegoat tree was only slightly slower than the AVL tree. Because of the splaying overhead on every search operation, the splay tree performs very poorly in this situation.

| Search elements benchmark | Score | Score error | Units |
|---|---|---|---|
| AVL tree | 29.360 | 1.719 | ms/op |
| JDK TreeSet | 26.630 | 4.192 | ms/op |
| Red-Black tree | 45.811 | 4.770 | ms/op |
| Scapegoat tree (alpha 0.6) | 30.151 | 1.285 | ms/op |
| Scapegoat tree (alpha 0.75) | 34.001 | 2.171 | ms/op |
| Scapegoat tree (alpha 0.9) | 31.732 | 1.892 | ms/op |
| Splay tree | 91.072 | 5.093 | ms/op |
| Treap | 31.564 | 1.688 | ms/op |

This benchmark was created specially for the splay tree. It is the same as the previous search benchmark, except now we constantly search for only the same 100 elements. The splay tree splays those to the top after each search, which should have given it superior performance! However, my benchmark or my implementation is probably flawed in some way, because the splay tree underperformed.

| Search same few elements benchmark | Score | Score error | Units |
|---|---|---|---|
| AVL tree | 2.301 | 0.192 | ms/op |
| JDK TreeSet | 2.667 | 0.182 | ms/op |
| Red-Black tree | 15.921 | 4.166 | ms/op |
| Scapegoat tree (alpha 0.6) | 1.993 | 0.192 | ms/op |
| Scapegoat tree (alpha 0.75) | 2.029 | 0.110 | ms/op |
| Scapegoat tree (alpha 0.9) | 1.870 | 0.196 | ms/op |
| Splay tree | 19.690 | 0.761 | ms/op |
| Treap | 7.887 | 0.392 | ms/op |

That is it. As always, you can check out all the source code from the GitHub repo. It is pretty clear that my red-black tree is not implemented correctly and is inefficient, and probably the splay tree too. It would be great if someone could take a look at them (and the other implementations). I hope you liked this little benchmark and that it helped you choose your favorite binary search tree 🙂 Happy coding!

**EDIT:** check out an AVL vs. red-black tree performance comparison here

Interesting post! I’m working on implementing my own balanced BSTs as well, and I found that red-black trees are quite tricky to get working correctly. I liked your comparisons though, great job!

Thanks! Yeah, red-black trees are probably the most complicated of them all to implement, but also the most common in real-world usage.

Try implementing delete in AVL trees; that is probably the most difficult, since red-black implementations are so common that for those you can just copy an existing solution.

It is not clear that your data supports your conclusions. Is it worth considering that AVL trees often outperform red-black trees, but people still choose red-black because they listen to their peers instead of looking at the results?

The structure of my writing was not perfect. Basically, I just introduced some theory and then gave a short summary of that theory. Then I ran the benchmarks without drawing any conclusion.

Having said that, the AVL tree performed surprisingly well, and maybe it could be optimized further to beat TreeSet in every test. It is also easier to implement (less time, less testing, fewer bugs). One definite disadvantage is that it requires more memory, but nowadays maybe that's not so important.

Could it be that AVL is a better choice as an all-around tree data structure? Maybe, but I want to believe that library developers (those few who actually implement this kind of thing professionally) test and compare different algorithms for the most common use cases in practice. If you really need to optimize something and want to implement your own data structures, it's probably a bad idea to choose based only on reading or on listening to your peers, without testing how it actually behaves in your own situation.

Yes, and AVL trees requiring "more memory" is easy to quantify: red-black trees require each node to store 1 of 2 possible values (red or black), while AVL trees require each node to store 1 of 3 possible values (-1, 0, 1). Another consideration is that the library designers may have been correct in the 90s, but those choices may no longer be correct today, since hardware has changed.

It could be; unfortunately I do not know the intricate implementation details and various optimizations that go into each of these data structures for professional use (e.g. in a library). I guess one way for you to find out is to turn on your profiler, take my AVL tree implementation and try to make it faster than TreeSet on all benchmarks 🙂

Done with my AVL implementation, except I'm using TreeMap. One thing that took me a bit to implement was an AVL tree with one extra bit per node, which beats TreeMap in all the tests I have. I'm checking the code in now, if you are interested in AVL implemented 10 different ways: https://github.com/dmcmanam/bbst-showdown

Eventually the textbooks and standard implementations will catch up to the research on AVL trees.

Great job! Did you try deletion or mixed insert/delete tests? Delete is where my AVL implementation lost pretty significantly to Java's TreeSet. It seems AVL trees can hold up pretty well in actual benchmarks, despite the theory that they should have slower insert/delete operations.

Here is another guy trying some tests: http://nathanbelue.blogspot.lt/2012/05/red-black-versus-avl.html

There were so many different ways to implement insert for AVL trees that it took me a while to compare them all. This week I'll add more tests, including deletes, and I expect that practice and theory will demonstrate AVL trees are faster: https://refactoringlightly.wordpress.com/

You left out one: the weight-balanced binary search tree. These are commonly used by functional programmers because their rebalancing strategy can be easily implemented via recursion. https://en.wikipedia.org/wiki/Weight-balanced_tree

I was looking for a fast BST for dealing with very small data payloads (32- or 64-bit chunks), and the memory overhead of most other BST strategies is completely ridiculous. I realised that I could instead use a quickly derived structure that only uses bit shifts (doubling/halving), additions and subtractions to walk the tree. By eliminating the usual bidirectional links (yes, you could make them one-directional, but that only saves the parent reference), there are no references at all, and the structure of the tree is encoded in a simple formula you can derive from the index of the node being examined (namely 2*n+1 or 2*n+2 depending on whether you want to walk down left or right, and n/2… something something… I haven't worked out that one yet).

I could be wrong in thinking that nobody has invented this variant of the WBBST, but it seems like a perfect way to deal with relatively small data structures, since you can fit the whole table, 16 levels of the tree from the root with a 32-bit payload, into the 6 MB cache of an i5-7600. A database index table, for example. I conceived this for the purpose of implementing a proof-of-work algorithm that uses a partial hash collision search (based on Cuckoo Cycle).

I must admit I didn't know about those, as they don't seem that popular. They look quite similar to AVL trees to me.

As for your optimization, I am not sure I follow it correctly after a quick read. As I understand it, you want to get rid of references (something like a heap implemented on top of a plain array?), but I think you still need to keep the size of the subtree in each node. Or don't you? If so, maybe a scapegoat tree is even more suitable, as it does not require any additional information per node.

Anyway, once you are done feel free to share it here (or link to your own post with the details, or even write a guest post here if you don't blog). Seems interesting 🙂 Good luck!