Quick Navigation

🌳 B-Tree Proofs 🔄 Splay Tree Proofs 📡 Rsync Proofs 🔄 Circular Buffer Splay Tree Proofs 🔒 Thread-Safe Proofs

🌳 B-Tree Proofs

Theorem: B-Tree Height

Statement: For a B-Tree of order $t$ (min degree) with $n$ keys, the height $h$ satisfies:

$$h \leq \log_t \frac{n+1}{2}$$

Proof:

A B-Tree of height $h$ has at least:

  • Root: 1 node with at least 1 key
  • Level 1: at least 2 nodes with at least $(t-1)$ keys each
  • Level 2: at least $2t$ nodes with at least $(t-1)$ keys each
  • In general, level $i$ (for $1 \leq i \leq h$): at least $2t^{i-1}$ nodes with at least $(t-1)$ keys each

Total minimum keys:

\begin{align}
n &\geq 1 + 2(t-1) + 2t(t-1) + \ldots + 2t^{h-1}(t-1) \\
n &\geq 1 + 2(t-1)(1 + t + t^2 + \ldots + t^{h-1}) \\
n &\geq 1 + 2(t-1)\frac{t^h - 1}{t - 1} \\
n &\geq 1 + 2(t^h - 1) \\
n &\geq 2t^h - 1 \\
t^h &\leq \frac{n+1}{2} \\
h &\leq \log_t \frac{n+1}{2}
\end{align}

Corollary: Height is $O(\log_t n) = O(\log n)$ when $t$ is constant.
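
The bound can be sanity-checked numerically. A small Python sketch (function names are hypothetical, chosen for illustration) computes the minimum key count $2t^h - 1$ from the derivation above and confirms that a tree holding exactly that many keys meets the height bound with equality:

```python
import math

def min_keys(t: int, h: int) -> int:
    """Minimum number of keys in a B-Tree of minimum degree t and height h.

    From the derivation above: n >= 2*t**h - 1.
    """
    return 2 * t**h - 1

def height_bound(t: int, n: int) -> float:
    """The theorem's bound: log_t((n + 1) / 2)."""
    return math.log((n + 1) / 2, t)

# A tree built with the minimum key count meets the bound with equality.
for t in (2, 3, 10):
    for h in (1, 2, 5):
        n = min_keys(t, h)
        assert abs(height_bound(t, n) - h) < 1e-9
```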

Theorem: B-Tree Search Complexity

Statement: Searching in a B-Tree with $n$ keys takes $O(\log n)$ time.

Proof:

  1. Search traverses from root to leaf: $O(\log_t n)$ levels
  2. At each level, perform binary search on at most $(2t-1)$ keys: $O(\log t)$, which is $O(1)$ if $t$ is constant
  3. Number of levels: $O(\log_t n) = O(\log n)$
  4. Total time: $O(\log n) \times O(1) = O(\log n)$

Conclusion: B-Tree search has $O(\log n)$ time complexity.
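
The two costs in the proof (per-level binary search, one descent per level) can be made concrete. A minimal sketch follows; the node layout and names are hypothetical, not taken from any particular implementation:

```python
from bisect import bisect_left

class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted, at most 2t - 1 keys
        self.children = children or [] # len(keys) + 1 children, or [] for a leaf

def btree_search(node, key):
    """Root-to-leaf descent with a binary search inside each node."""
    while node is not None:
        i = bisect_left(node.keys, key)        # O(log t) within the node
        if i < len(node.keys) and node.keys[i] == key:
            return (node, i)                   # found at position i
        if not node.children:                  # leaf reached: key absent
            return None
        node = node.children[i]                # one of O(log_t n) levels
    return None

root = BTreeNode([10], [BTreeNode([2, 5]), BTreeNode([12, 17])])
assert btree_search(root, 17) is not None
assert btree_search(root, 4) is None
```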

Theorem: B-Tree Insert Complexity

Statement: Inserting into a B-Tree with $n$ keys takes $O(\log n)$ time.

Proof:

  1. Find insertion point: $O(\log n)$ (same as search)
  2. Insert into leaf: $O(1)$
  3. If node splits:
    • Split operation: $O(t) = O(1)$ if $t$ is constant
    • Propagate split upward: at most $O(\log n)$ levels
  4. Total time: $O(\log n) + O(\log n) \times O(1) = O(\log n)$

Conclusion: B-Tree insert has $O(\log n)$ time complexity.
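
Step 3's split can be sketched in isolation: a full node with $2t - 1$ keys is divided around its median, a slice costing $O(t)$. The helper name is hypothetical:

```python
def split_full_node(keys, t):
    """Split a full node's key list (len == 2t - 1) around the median.

    Returns (left_keys, median, right_keys); each side keeps t - 1 keys
    and the median is pushed up to the parent, so the split costs O(t).
    """
    assert len(keys) == 2 * t - 1, "only full nodes are split"
    mid = t - 1
    return keys[:mid], keys[mid], keys[mid + 1:]

left, median, right = split_full_node([1, 2, 3, 4, 5], t=3)
assert (left, median, right) == ([1, 2], 3, [4, 5])
```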

Theorem: B-Tree Delete Complexity

Statement: Deleting from a B-Tree with $n$ keys takes $O(\log n)$ time.

Proof:

  1. Find node to delete: $O(\log n)$
  2. If internal node, find predecessor/successor: $O(\log n)$
  3. Delete from node: $O(1)$
  4. Handle underflow:
    • Borrow from sibling: $O(1)$
    • Merge with sibling: $O(t) = O(1)$
    • Propagate upward: at most $O(\log n)$ levels
  5. Total time: $O(\log n) + O(\log n) = O(\log n)$

Conclusion: B-Tree delete has $O(\log n)$ time complexity.
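
Step 4's $O(1)$ borrow can be sketched as a single key rotation through the parent separator. The function and its key-list interface are hypothetical, shown only to make the constant-time claim concrete:

```python
def borrow_from_left_sibling(parent_keys, sep_idx, left_keys, node_keys):
    """Fix an underflow by rotating one key through the parent: O(1).

    The separator key moves down into the deficient node, and the left
    sibling's largest key moves up to become the new separator.
    """
    node_keys.insert(0, parent_keys[sep_idx])
    parent_keys[sep_idx] = left_keys.pop()
    return parent_keys, left_keys, node_keys

parent, left, node = borrow_from_left_sibling([10], 0, [3, 7], [12])
assert (parent, left, node) == ([7], [3], [10, 12])
```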

Theorem: B-Tree Space Complexity

Statement: A B-Tree with $n$ keys uses $O(n)$ space.

Proof:

  • Each key is stored exactly once: $n$ keys
  • Each node has at most $(2t-1)$ keys and $2t$ children pointers
  • Number of nodes: every node except possibly the root holds at least $(t-1)$ keys, so there are at most $\frac{n}{t-1} + 1 = O(n)$ nodes
  • Total space:
    \begin{align}
    \text{Space} &= n \text{ keys} + O(n) \text{ pointers} \\
    &= O(n) + O(n) \\
    &= O(n)
    \end{align}

Conclusion: B-Tree space complexity is $O(n)$.

🔄 Splay Tree Proofs

Theorem: Splay Tree Amortized Complexity

Statement: The amortized time per operation in a splay tree is $O(\log n)$.

Proof (Potential Method):

Define potential function:

$$\Phi(T) = \sum_{v \in T} \log(\text{size}(v))$$

where $\text{size}(v)$ = number of nodes in subtree rooted at $v$.

For operation with actual cost $c$, amortized cost:

$$\hat{c} = c + \Delta\Phi$$

Splay Operation Analysis:
  • Zig step: amortized cost $\leq 3(r'(x) - r(x)) + 1$
  • Zig-Zig step: amortized cost $\leq 3(r'(x) - r(x))$
  • Zig-Zag step: amortized cost $\leq 2(r'(x) - r(x)) \leq 3(r'(x) - r(x))$

where $r(x) = \log(\text{size}(x))$ and $r'(x)$ is rank after operation.

Summing over all splay steps:

$$\text{Total amortized cost} \leq 3(r(\text{root}) - r(x)) + 1$$

Since $r(\text{root}) = \log(n)$ and $r(x) \geq 0$:

$$\text{Amortized cost} \leq 3\log(n) + 1 = O(\log n)$$

Conclusion: Splay tree operations have $O(\log n)$ amortized complexity.
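
The potential $\Phi$ can be computed directly on a toy tree. The sketch below (tuple-encoded trees and helper names are hypothetical) checks that for a single zig step (a right rotation bringing the root's left child up), the amortized cost $1 + \Delta\Phi$ respects the $3(r'(x) - r(x)) + 1$ bound:

```python
import math

def size(t):
    """Number of nodes in a tree encoded as (left, key, right) or None."""
    return 0 if t is None else 1 + size(t[0]) + size(t[2])

def phi(t):
    """Potential: sum of log2(size(v)) over all nodes v."""
    if t is None:
        return 0.0
    return math.log2(size(t)) + phi(t[0]) + phi(t[2])

def zig_right(t):
    """Rotate the root's left child x up to the root: one zig step."""
    x_left, x_key, x_right = t[0]
    return (x_left, x_key, (x_right, t[1], t[2]))

# A left-leaning chain: x (key 2) is the root's left child.
tree = (((None, 1, None), 2, None), 3, None)
r_before = math.log2(size(tree[0]))        # rank of x before the zig
rotated = zig_right(tree)
r_after = math.log2(size(rotated))         # x is now the root
amortized = 1 + (phi(rotated) - phi(tree)) # actual cost 1, plus ΔΦ
assert amortized <= 3 * (r_after - r_before) + 1 + 1e-9
```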

Theorem: N-Way Splay Tree Search Complexity

Statement: Searching in an N-way splay tree with $n$ nodes takes $O(\log n)$ amortized time.

Proof:

  1. Find node: $O(\log n)$ worst case (tree height)
  2. Splay to root: $O(\log n)$ amortized (from the splay tree amortized-complexity theorem above)
  3. Total: $O(\log n)$ amortized

Conclusion: N-way splay tree search has $O(\log n)$ amortized time complexity.

Theorem: N-Way Splay Tree Insert Complexity

Statement: Inserting into an N-way splay tree with $n$ nodes takes $O(\log n)$ amortized time.

Proof:

  1. Find insertion point: $O(\log n)$
  2. Insert node: $O(1)$
  3. Splay new node to root: $O(\log n)$ amortized
  4. Adjust branching if needed: $O(1)$ amortized (infrequent)
  5. Total: $O(\log n)$ amortized

Conclusion: N-way splay tree insert has $O(\log n)$ amortized time complexity.

Theorem: N-Way Splay Tree Delete Complexity

Statement: Deleting from an N-way splay tree with $n$ nodes takes $O(\log n)$ amortized time.

Proof:

  1. Splay node to root: $O(\log n)$ amortized
  2. Delete root: $O(1)$
  3. If needed, splay successor/predecessor: $O(\log n)$ amortized
  4. Total: $O(\log n)$ amortized

Conclusion: N-way splay tree delete has $O(\log n)$ amortized time complexity.

Theorem: Dynamic Branching Adjustment Complexity

Statement: Adjusting branching factor in an N-way splay tree takes $O(1)$ amortized time per operation.

Proof:

  1. Branching adjustment triggered when: $|\text{children}| > \text{maxChildren}$ or optimal branching changes
  2. Split operation: $O(t)$ where $t$ is current branching factor
  3. Since $t \leq \sqrt{n}$ (by design), split cost: $O(\sqrt{n})$
  4. Split occurs at most once per $O(\sqrt{n})$ insertions
  5. Amortized cost: $\frac{O(\sqrt{n})}{O(\sqrt{n})} = O(1)$

Conclusion: Dynamic branching adjustment has $O(1)$ amortized time complexity.

Theorem: N-Way Splay Tree Space Complexity

Statement: An N-way splay tree with $n$ nodes uses $O(n)$ space.

Proof:

Each node stores:

  • 1 key
  • 1 value
  • At most $\text{maxChildren}$ pointers to children

Number of nodes: $n$

Total pointers: at most $n \times \text{maxChildren}$

Worst Case Analysis:

In worst case, $\text{maxChildren} = O(\sqrt{n})$ (by design constraint):

\begin{align}
\text{Total space} &= n \text{ keys} + n \text{ values} + n \times O(\sqrt{n}) \text{ pointers} \\
&= O(n) + O(n) + O(n\sqrt{n}) \\
&= O(n\sqrt{n})
\end{align}

Average Case Analysis:

With dynamic branching adjustment, average branching factor is $O(1)$:

\begin{align}
\text{Amortized space} &= n \text{ keys} + n \text{ values} + n \times O(1) \text{ pointers} \\
&= O(n) + O(n) + O(n) \\
&= O(n)
\end{align}

Conclusion: N-way splay tree has $O(n)$ average space complexity with dynamic branching, and $O(n\sqrt{n})$ worst-case space complexity.

📡 Rsync Proofs

Theorem: Block Matching Complexity

Statement: Finding matching blocks using rolling checksum in splay tree takes $O(m \log n)$ time where $m$ is number of blocks to match and $n$ is number of stored blocks.

Proof:

For each of $m$ blocks:

  1. Compute rolling checksum: $O(1)$
  2. Search in splay tree: $O(\log n)$ amortized
  3. Verify strong hash: $O(1)$

Total: $m \times O(\log n) = O(m \log n)$

With Splay Optimization:

Frequently matched blocks move to root, reducing average search time for common patterns.

Conclusion: Block matching has $O(m \log n)$ time complexity, with better average case for common patterns.
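
Step 1's $O(1)$ cost comes from the rolling property of the weak checksum. Below is a minimal sketch of an rsync-style rolling checksum (a simplified Adler-like variant; the modulus and helper names are illustrative assumptions): sliding the window one byte updates $(a, b)$ in $O(1)$ without rescanning the block.

```python
M = 1 << 16

def weak_checksum(block: bytes):
    """Compute the (a, b) pair over a block from scratch: O(len(block))."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte to the right: O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b

# Rolling across the data must agree with recomputing from scratch.
data = b"abcdefgh"
n = 4
a, b = weak_checksum(data[0:n])
for i in range(1, len(data) - n + 1):
    a, b = roll(a, b, data[i - 1], data[i - 1 + n], n)
    assert (a, b) == weak_checksum(data[i:i + n])
```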

Theorem: Rsync Tree Construction Complexity

Statement: Building splay tree from file with $b$ blocks takes $O(b \log b)$ time.

Proof:

For each of $b$ blocks:

  1. Compute checksums: $O(1)$
  2. Insert into tree: $O(\log i)$ where $i$ is current tree size

Sum: $\sum_{i=1}^{b} \log i = \log(b!) = \Theta(b \log b)$ (by Stirling's approximation)

Conclusion: Rsync tree construction has $O(b \log b)$ time complexity.
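
The Stirling step can be checked numerically; `math.lgamma(b + 1)` evaluates $\ln(b!)$ without overflow. A quick sketch (helper name hypothetical):

```python
import math

def sum_log_inserts(b: int) -> float:
    """sum_{i=1}^{b} ln(i) = ln(b!): the total insertion cost in the proof."""
    return sum(math.log(i) for i in range(1, b + 1))

b = 1000
total = sum_log_inserts(b)
assert abs(total - math.lgamma(b + 1)) < 1e-6   # equals ln(b!)
# Stirling: ln(b!) = b ln b - b + O(ln b), so the sum is Theta(b log b).
assert b * math.log(b) - b <= total <= b * math.log(b)
```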

🔄 Circular Buffer Splay Tree Proofs

Theorem: Circular Buffer Splay Tree Search Complexity

Statement: Searching in a Circular Buffer Splay Tree with $n$ nodes takes $O(\log n)$ amortized time.

Proof:

The search operation consists of:

  1. Finding the node: $O(\log n)$ worst case (tree height)
  2. Splay operation: $O(\log n)$ amortized (from standard splay tree analysis)
  3. Buffer lookup: $O(1)$ (direct index access)

\begin{align}
T(n) &= \text{Find node} + \text{Splay} + \text{Buffer operations} \\
&= O(\log n) + O(\log n) + O(1) \\
&= O(\log n) \text{ amortized}
\end{align}

Conclusion: Circular Buffer Splay Tree search has $O(\log n)$ amortized time complexity.

Theorem: Circular Buffer Splay Tree Insert Complexity

Statement: Inserting into a Circular Buffer Splay Tree with $n$ nodes takes $O(\log n)$ amortized time.

Proof:

The insert operation consists of:

  1. Find insertion point: $O(\log n)$
  2. Allocate node from buffer: $O(1)$
  3. Insert node: $O(1)$
  4. Splay new node to root: $O(\log n)$ amortized
  5. Handle buffer overflow (if needed): $O(1)$ amortized

\begin{align}
T(n) &= \text{Find} + \text{Allocate} + \text{Insert} + \text{Splay} + \text{Overflow} \\
&= O(\log n) + O(1) + O(1) + O(\log n) + O(1) \\
&= O(\log n) \text{ amortized}
\end{align}

Conclusion: Circular Buffer Splay Tree insert has $O(\log n)$ amortized time complexity.
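
Step 2's $O(1)$ allocation relies on the circular buffer acting as a fixed-capacity node pool with a free list. A minimal sketch (the class and its API are hypothetical, not the document's actual implementation):

```python
class NodePool:
    """Fixed-capacity pool standing in for the circular buffer.

    allocate/deallocate are O(1): free slots are kept on a stack.
    """
    def __init__(self, capacity: int):
        self.slots = [None] * capacity
        self.free = list(range(capacity - 1, -1, -1))  # free slot indices

    def allocate(self, payload):
        if not self.free:
            raise MemoryError("buffer full: overflow policy applies")
        idx = self.free.pop()        # O(1)
        self.slots[idx] = payload
        return idx

    def deallocate(self, idx):
        self.slots[idx] = None
        self.free.append(idx)        # O(1)

pool = NodePool(4)
i = pool.allocate(("key", "value"))
assert pool.slots[i] == ("key", "value")
pool.deallocate(i)
assert pool.slots[i] is None
```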

Theorem: Circular Buffer Splay Tree Delete Complexity

Statement: Deleting from a Circular Buffer Splay Tree with $n$ nodes takes $O(\log n)$ amortized time.

Proof:

The delete operation consists of:

  1. Find node to delete: $O(\log n)$
  2. Splay node to root: $O(\log n)$ amortized
  3. Delete node: $O(\log n)$ worst case
  4. Deallocate from buffer: $O(1)$

\begin{align}
T(n) &= \text{Find} + \text{Splay} + \text{Delete} + \text{Deallocate} \\
&= O(\log n) + O(\log n) + O(\log n) + O(1) \\
&= O(\log n) \text{ amortized}
\end{align}

Conclusion: Circular Buffer Splay Tree delete has $O(\log n)$ amortized time complexity.

Theorem: Circular Buffer Splay Tree Sort Complexity

Statement: Sorting a Circular Buffer Splay Tree with $n$ nodes visits each node exactly once, for both ascending and descending order; total time is $O(n)$ when each comparison is $O(1)$ (e.g., numeric keys) and $O(n \cdot k)$ for variable-size keys of average length $k$ (lexicographic or semantic comparison).

Proof:

The sort operation performs an in-order traversal:

  1. Visit all nodes: $O(n)$
  2. Comparison per node: $O(1)$ for fixed-size keys, $O(k)$ for variable-size keys
  3. Build result vector: $O(n)$

For fixed-size keys or $O(1)$ comparators:

\begin{align}
T(n) &= \text{Traversal} + \text{Comparisons} + \text{Result building} \\
&= O(n) + O(n) + O(n) \\
&= O(n)
\end{align}
For variable-size keys with lexicographic comparison:

$T(n) = O(n \cdot k)$ where $k$ is the average key length.

Sort Order:
  • Ascending: Traverse left $\rightarrow$ node $\rightarrow$ right: $O(n)$
  • Descending: Traverse right $\rightarrow$ node $\rightarrow$ left: $O(n)$
Comparison Modes:
  • Lexicographic: $O(n \cdot k)$ where $k$ is key length
  • Numeric: $O(n)$
  • Semantic: Depends on comparator, typically $O(n)$ or $O(n \cdot k)$

Conclusion: Circular Buffer Splay Tree sort has $O(n)$ time complexity for fixed-size keys, and $O(n \cdot k)$ for variable-size keys where $k$ is the key length.
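
The two traversal orders can be sketched on a plain binary node structure (field and function names are hypothetical): descending order simply swaps which child is visited first, so both directions make one $O(n)$ pass.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def in_order(node, descending=False, out=None):
    """Collect keys in sorted order with a single O(n) traversal.

    Ascending visits left -> node -> right; descending flips the children.
    """
    if out is None:
        out = []
    if node is None:
        return out
    first, second = (node.right, node.left) if descending else (node.left, node.right)
    in_order(first, descending, out)
    out.append(node.key)
    in_order(second, descending, out)
    return out

root = Node(2, Node(1), Node(3))
assert in_order(root) == [1, 2, 3]
assert in_order(root, descending=True) == [3, 2, 1]
```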

🔒 Thread-Safe Proofs

Theorem: Thread-Safe Operation Overhead

Statement: Thread-safe operations add $O(1)$ overhead per operation in the uncontended case.

Proof:

Thread-safe operations involve:

  • Mutex lock/unlock: $O(1)$ in uncontended case
  • Atomic operations: $O(1)$
  • Thread pool enqueue: $O(1)$ amortized
Uncontended Case:

\begin{align}
\text{Overhead} &= \text{lock} + \text{operation} + \text{unlock} \\
&= O(1) + O(1) + O(1) \\
&= O(1)
\end{align}
Contended Case:

When $k$ threads are waiting, lock acquisition is $O(k)$ worst case. However, because each operation holds the lock only briefly and the thread pool bounds the number of concurrent waiters, the total locking overhead across $n$ operations is $O(n)$, so the average overhead remains $O(1)$:

\begin{align}
\text{Amortized overhead} &= \frac{\text{Total overhead}}{\text{Number of operations}} \\
&= \frac{O(n)}{n} \text{ (for $n$ operations)} \\
&= O(1)
\end{align}

Conclusion: Thread-safe operations add $O(1)$ overhead per operation in the average case. For tree operations with $O(\log n)$ complexity, the $O(1)$ thread-safety overhead is dominated by the operation itself.
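
The uncontended-case claim can be illustrated with one coarse mutex guarding each operation. A minimal Python sketch (the wrapper class is hypothetical; a dict stands in for the underlying tree):

```python
import threading

class ThreadSafeTree:
    """Guards an (assumed) underlying tree with a single mutex.

    Each public call pays one lock/unlock pair: O(1) overhead on top of
    the O(log n) tree operation it protects.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}             # stand-in for the underlying tree

    def insert(self, key, value):
        with self._lock:             # O(1) in the uncontended case
            self._store[key] = value

    def search(self, key):
        with self._lock:
            return self._store.get(key)

tree = ThreadSafeTree()
threads = [threading.Thread(target=tree.insert, args=(i, i * i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert tree.search(3) == 9
```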