Problem Statement
Pattern:
O(n^2) Solution
List<TreeNode> res;
HashMap<String, Integer> serialCount;
// post order serialise
String pos(TreeNode root){
if(root == null) return "#";
// serialise curr subtree
String serial = root.val + "," + pos(root.left) + pos(root.right);
// update serialCount and res
int freq = serialCount.getOrDefault(serial, 0);
serialCount.put(serial, ++freq);
if (freq == 2) res.add(root);
return serial;
}
public List<TreeNode> findDuplicateSubtrees (TreeNode root){
res = new LinkedList<>();
serialCount = new HashMap<>();
pos(root);
return res;
}Notes
- For every node, serialise its subtree convert its
root, left, rightvalues into a string. and save this serials frequency in a hashmap. - since we only have to return a single instance of duplicate subtrees, check if
freq == 2, and add to result - effectively postorder traverse and serialise
Now, Why O(N^2)
- at the bottom left most node/subtree , the size of the
serialstring is1 - at the root the size of the
serialstring is >nwhere n is number of nodes. - why does this matter?
- because concatenation, and hashing operations complexity increases linearly with the size of the string
- so at the root the concatenation has to concatenate a string of characters for all nodes = n, its child n-1, for its child n-2, so on and so forth
- therefore O(n^2)
O(n) Solution
if we could instead of appending all the values, could somehow ‘cache’ the serial, effectively assign the serial an id, so we didnt have to concatenate, and append these serial s over and over again, and we could simply identify , concatenate and hash a serial based off if its id
int uniqueSerials = 1;
HashMap<String, Integer> serialID;
HashMap<Integer, Integer> idCount;
List<TreeNode> res;
int getSerial(String serial) {
// get serialID if exists
if (serialID.containsKey(serial)) return serialID.get(serial);
// else get unique serial
serialID.put(serial, ++uniqueSerials);
return uniqueSerials;
}
// post order serialise
int pos(TreeNode root) {
if (root == null) return 0;
// serialise curr subtree
String serial = pos(root.left) + "," + root.val + "," + pos(root.right);
// get an ID for serial
int srID = getSerial(serial);
// update idCount and res
int freq = idCount.getOrDefault(srID, 0);
idCount.put(srID, ++freq);
if (freq == 2) res.add(root);
return srID;
}
public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
serialID = new HashMap<>();
idCount = new HashMap<>();
res = new LinkedList<>();
pos(root);
return res;
}-
In the first approach, imagine that from left side you got a string like “2,3,4,5” etc while from right you got “7,8,9,10”, then you will be concatenating 4 + 4 operations(excluding commas and if you include them its even more).
-
Whereas with ids, this is always bounded to an integer at most from left side you will get 10 chars(-2147483647 or 2147483647) and from right you will get again max 10 chars. So at most your String concatenation will be 10 chars + 10 chars.