The algorithm question was to compute Edit Distance (Levenshtein Distance). I had happened to encounter edit distance before while reading Grokking Algorithms, and it can be solved with DP. But this problem was a bit more complex: among many strings, find the string with the smallest distance. It can be understood as “Fuzzy matching”. The rough statement was:
1 2 3 4 5 6 7
莱文斯坦距离,又称 Levenshtein 距离,是编辑距离的一种。指两个字串之间,由一个转成另一个所需的最少编辑操作次数。 允许的编辑操作包括: 插入一个字符 删除一个字符 将一个字符替换成另一个字符 需要你编写一个程序,实现以下功能: 给定一个字符串集合 S 以及一个模板串 P,从 S 中找出与 P 莱文斯坦距离最小的字符串 T,输出 T 以及其对应的编辑距离 D。如果 S 中出现多个满足条件的字符串,则取按字典序排列的第一个。
I did not think of a good solution. With brute force, comparing every string with P has time complexity: O(P.size() * sum(S_i.size())).
Later, I searched online for solutions, and they were not hard to find. Use a Trie to avoid repeated DP computation among different strings. The time complexity becomes O(P.size() * number of Trie nodes). Although the worst-case complexity does not improve, it is a real optimization. This should be the solution the interviewer wanted.