I am currently trying to solve the problem Add and Search Words Data Structure on leetcode. The question is as follows:
Design a data structure that supports adding new words and finding if a string matches any previously added string.
Implement the WordDictionary class:
WordDictionary()Initializes the object.
void addWord(word)Addswordto the data structure, it can be matched later.
bool search(word)Returnstrueif there is any string in the data structure that matcheswordorfalseotherwise. word may contain dots.where dots can be matched with any letter.
My Strategy:
My strategy involves representing a trie with a hashmap instead of a traditional linked-list-based tree structure, aiming for better performance and lower complexity. By using a hashmap, we can quickly access the next node without traversing through unnecessary nodes, making operations faster especially with large datasets.
For example, when inserting words like apple and application into this structure, it's organized as nested hashmaps where each character in a word points to another hashmap representing the next character. The end of a word is marked with a special key-value pair {'end': {}}. This way, we efficiently store and search for words with minimal space and time complexity.
My Code:
class WordDictionary(object):
def __init__(self):
self.map = {}
def addWord(self, word):
"""
:type word: str
:rtype: None
"""
current = self.map
for i in word:
if i in current:
current = current[i]
else:
current[i] = {}
current = current[i]
current['end'] = {}
return
def search(self, word):
"""
:type word: str
:rtype: bool
"""
current = self.map
for i in word:
if i in current:
current = current[i]
elif i == '.':
current = {key: value for d in current.values() for key, value in d.items()}
else:
return False
if 'end' in current:
return True
return False
The solution seems to be effective for the majority of cases, but I've hit a error with test case 16, where it's not giving the right outcome. The length of test case 16 makes it particularly challenging to pinpoint where the mistake is happening. I'm in need of some guidance to track down and fix this logical error. Would you be able to lend a hand in sorting this out?
One thing searching in a trie shares with binary search is that it keeps shrinking the possibilities between a range; except instead of binary, it's character-by-character, and instead of dividing down the binary middle, it uses the position in the word. (It still performs with
log niid, see Tong2016Smoothed.)However, when it gets to a wildcard (
.), one can't add to the possibilities and do them in parallel. In general, we will have holes in the output such that it is not represented in a single range. For example, when searching forc.tin{cat, cel, cut},catandcutare included butcel, in the middle, is not. I think I've modified this code to do this by taking multiple paths.The
matchfunction is a (recursive) generator for all the words that could be possible using the string with wildcards. Theself.nodestring is a more expressive form of theendsentinel;endcould be used, but the way I've writtensearchrequires that any words are returned frommatch. (Sorry for thepython3instead ofpython, this is not my usual language at all.)