Fast sub-tree containment checks

Posted on February 21, 2021 · Tagged with algorithms, data-structures

There’s a problem on leetcode called "Subtree of Another tree". It requires to check whether one binary tree is a subtree of another binary tree.

There are two official solutions presented, one involving a pre-order traversal which is $O(m^2 + n^2+m\cdot n)$ and another one that checks does a traversal for every subtree of the bigger tree, and checks if it’s equal to the smaller tree, element-by-element and this is $O(m\cdot n)$.

It’s possible to write a different solution using Merkle trees that has $O(m+n)$ time complexity (if we consider $O(1)$ the time to compute one hash).

The Merkle trees can be computed bottom-up and the hashes for each node can be stored. Then the containment check only requires checking if a certain hash is present in a dictionary.

This will pass all the tests, but if we want to avoid false positives completely, once a hash match is found we need to do one tree-equality check. Merkle trees can generate false positives because hash collisions are possible.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
 #!/usr/bin/python3

 import hashlib

 class HashUtil:
     def __init__(self):
         self.m = hashlib.md5()

     def add_int(self,val):
         self.m.update(str(val).encode('ASCII'))

     def add_str(self,val):
         self.m.update(val.encode('ASCII'))

     def get_hash(self):
         return self.m.hexdigest()

 class MerkleTree:
     def __init__(self, x):
         self.hashes = {}
         self.build(x)

     def contains(self, H):
         return H in self.hashes

     def build(self, x):
         h = HashUtil()

         if not x.left and not x.right:
             h.add_int(x.val)
             x.hash = h.get_hash()
             self.hashes[x.hash]=1
             return x.hash

         h.add_int(x.val)

         if x.left:
             h.add_str("L"+self.build(x.left))

         if x.right:
             h.add_str("R"+self.build(x.right))

         x.hash = h.get_hash()
         self.hashes[x.hash]=1
         return x.hash

 class Solution:
     def isSubtree(self, s: TreeNode, t: TreeNode) -> bool:
         M_s = MerkleTree(s)
         M_t = MerkleTree(t)

         return M_s.contains(t.hash)
If you liked this article and would like to discuss more about optimizing and fine-tuning slow code feel free to reach out at stefan.petrea@gmail.com