
\chapter{Self-Balancing Search Trees}\label{chap:sb-bst}

This chapter briefly discusses the properties and fundamental ideas behind the self-balancing search trees most commonly used in standard libraries, to give an overview of the current options and of how WAVL fits among them.

\section{Red-black trees}

As mentioned previously, red-black trees are among the most popular implementations in standard libraries. As always, we start with a binary search tree, in which each node is coloured either \textit{red} or \textit{black}. A red-black tree is kept balanced by enforcing the following set of rules~\cite{rbtree}:

\begin{enumerate}
    \item External nodes are black; internal nodes may be red or black.
    \item For each internal node, all paths from it to external nodes contain the same number of black nodes.
    \item No path from an internal node to an external node contains two red nodes in a row.
    \item External nodes do not hold any data.
    \item The root is black. This rule is optional: colouring a red root black only increases the count of black nodes on the paths from the root to each of the external nodes by one, so it cannot break any of the other rules. However, it may be beneficial during insertion.
\end{enumerate}

From these rules we can deduce the following relation between the height $h$ of a red-black tree and the number $n$ of nodes stored in it~\cite{cormen2009introduction}:
\[
\log_2{(n + 1)} \leq h \leq 2 \cdot \log_2{(n + 2)} - 2
\]\label{rb-height}

The lower bound is attained by a perfect binary tree, and the upper bound by a minimal red-black tree, i.e. one with the fewest nodes possible for its height.
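
The upper bound can be sketched as follows. The sketch assumes that $h$ counts the internal nodes on the longest path from the root to an external node and that the root is black; neither convention is fixed by the text above, and $b$ and $M(b)$ are notation introduced only for this illustration. Let $b$ be the number of black internal nodes on every path from the root to an external node. Since no two red nodes may follow each other, the longest path has at most $2b$ nodes, so a tree of even height $h$ needs $b = h / 2$. The fewest nodes are obtained when exactly one path alternates between black and red and every subtree hanging off that path is a perfect tree of black nodes. Writing $M(b)$ for this minimum, we get
\[
M(0) = 0, \qquad M(b) = M(b - 1) + 2 + 2 \cdot \left( 2^{b - 1} - 1 \right) = M(b - 1) + 2^{b},
\]
and therefore $M(b) = 2^{b + 1} - 2$. Substituting $b = h / 2$ gives $n \geq 2^{h / 2 + 1} - 2$, which rearranges to $h \leq 2 \cdot \log_2{(n + 2)} - 2$. For example, for $n = 6$ the two bounds give $\log_2{7} \approx 2.81 \leq h \leq 2 \cdot \log_2{8} - 2 = 4$, so such a tree has height $3$ or $4$.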

There are also other variants of the red-black tree that are considered simpler to implement, e.g. the left-leaning red-black tree described by \textit{Sedgewick}~\cite{llrb}.

Red-black trees are used to implement sets in C++~\cite{llvm}, Java and C\#.

\section{AVL tree}

The AVL tree is considered to be the oldest self-balancing binary search tree. For clarity, we define the following function:

\[
\mathit{BalanceFactor}(n) := \mathit{height}(\mathit{right}(n)) - \mathit{height}(\mathit{left}(n))
\]

A binary search tree is then an AVL tree if, for every node $n$ in the tree, the following holds:

\[
\mathit{BalanceFactor}(n) \in \{ -1, 0, 1 \}
\]

In other words, the heights of the left and right subtrees of each node differ by at most 1~\cite{avl}.
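
As a small illustration, assume that the empty tree has height $0$ and a single node has height $1$ (the text above does not fix this convention), and consider inserting the keys $1$, $2$, $3$ in this order into an empty binary search tree without any rebalancing. The result is a path in which every node has only a right child, so for the root we get
\[
\mathit{BalanceFactor}(1) = \mathit{height}(\mathit{right}(1)) - \mathit{height}(\mathit{left}(1)) = 2 - 0 = 2,
\]
and the tree is not an AVL tree. A single left rotation at the root makes $2$ the new root with children $1$ and $3$, after which $\mathit{BalanceFactor}(1) = \mathit{BalanceFactor}(2) = \mathit{BalanceFactor}(3) = 0$.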

Similarly, we can bound the height of the AVL tree; from the original paper by \textit{Adelson-Velsky and Landis}~\cite{avl} we get:

\[
\left( \log_2{(n + 1)} \leq \right) h < \log_{\varphi}{(n + 1)} < \frac{3}{2} \cdot \log_2{(n + 1)}
\]\label{avl-height}
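
A common way to arrive at the upper bound, given here only as a sketch and not necessarily the argument of the original paper, is to count the fewest nodes $N(h)$ an AVL tree of height $h$ can have. The notation $N(h)$ and the convention that the empty tree has height $0$ are assumptions of this sketch. A minimal tree of height $h$ consists of a root, a minimal subtree of height $h - 1$ and a minimal subtree of height $h - 2$, hence
\[
N(0) = 0, \qquad N(1) = 1, \qquad N(h) = N(h - 1) + N(h - 2) + 1.
\]
By induction, $N(h) + 1 = F_{h + 2}$, where $F_k$ is the $k$-th Fibonacci number with $F_1 = F_2 = 1$. Because $\varphi^2 = \varphi + 1$, induction also gives $F_{h + 2} > \varphi^{h}$ for every $h \geq 1$, so
\[
n + 1 \geq N(h) + 1 = F_{h + 2} > \varphi^{h},
\]
which yields $h < \log_{\varphi}{(n + 1)}$. The last inequality in the relation above follows from $\log_{\varphi}{x} = \log_2{x} / \log_2{\varphi} \approx 1.44 \cdot \log_2{x}$.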

If we compare the upper bounds for the height of red-black trees and AVL trees, we can see that the AVL rules are stricter than the red-black rules, which yields a shallower tree at the cost of more rebalancing. However, in both cases rebalancing after an update still takes $\mathcal{O}(\log{n})$ time.
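
To put the two height bounds side by side, take $n = 10^6$ as an illustrative value (the numbers below are rounded). For a red-black tree we get
\[
\log_2{(10^6 + 1)} \approx 19.93 \leq h \leq 2 \cdot \log_2{(10^6 + 2)} - 2 \approx 37.86,
\]
so the height lies between $20$ and $37$. For an AVL tree we get
\[
h < \log_{\varphi}{(10^6 + 1)} \approx 1.4404 \cdot 19.93 \approx 28.71,
\]
so the height lies between $20$ and $28$.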

As for implementations, AVL trees can be found in the standard libraries of Agda and Coq.

\section{B-tree}

\textit{To keep or not to keep…}

B-trees are used in the Rust standard library (\texttt{BTreeMap} and \texttt{BTreeSet}).