I'm trying to write code to help me with crossword puzzles. I'm running into the following problems.
1. When I try to use the much larger text file as my word list, I get no output; only the small three-word list works.
2. The match tests positive for the first two strings of my test word list. I need it to test true only for entire words in my word list. [ SOLVED: solution in the code below ]
lex.txt contains
dad
add
test
I call the code using the following.
./cross.py dad
[ SOLVED ] This solution works, but it is really slow.
#!/usr/bin/env python
import itertools, sys, re
sys.dont_write_bytecode = True
original_string=str(sys.argv[1])
lenth_of_string=len(original_string)
string_to_tuple=tuple(original_string)
with open('wordsEn.txt', 'r') as inF:
    for line in inF:
        for a in set (itertools.permutations(string_to_tuple, lenth_of_string)):
            joined_characters="".join(a)
            if re.search('\\b'+joined_characters+'\\b',line):
                print joined_characters
Let's take a look at your code. You take the input string, you create all possible permutations of it and then you look for these permutations in the dictionary.
The most significant speed impact from my point of view is that you create the permutations of the word over and over again, for every word in your dictionary. This is very time consuming.
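To illustrate that point, here is a minimal sketch that builds the set of permutations once, before the loop over the dictionary, and then does a plain set lookup per line. The function name permutation_matches is only illustrative, and the sketch assumes one word per line:

```python
import itertools

def permutation_matches(letters, lines):
    """Return the dictionary words that are permutations of `letters`.

    `lines` is any iterable of lines (assumed: one word per line).
    """
    # Build the permutations a single time, outside the dictionary loop.
    candidates = set("".join(p) for p in itertools.permutations(letters))
    matches = []
    for line in lines:
        word = line.strip()  # drop the trailing newline
        if word in candidates:  # whole-word match, no regex needed
            matches.append(word)
    return matches

# Small in-memory stand-in for lex.txt
print(permutation_matches("dad", ["dad\n", "add\n", "test\n"]))
```

This still generates every permutation of the input, so it stays expensive for long inputs, but the cost no longer grows with the size of the dictionary.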
Besides that, you don't even need the permutations. Two words can be "converted" into each other by permuting exactly when they have the same letters, with the same counts. So your piece of code can be reimplemented by sorting the letters of the input once and comparing them with the sorted letters of each dictionary word.
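A minimal sketch of the sorted-letters idea, assuming one word per line in the dictionary (the function name find_anagrams is hypothetical, chosen for this example):

```python
def find_anagrams(letters, lines):
    """Return the words in `lines` that use exactly the letters in `letters`.

    `lines` is any iterable of lines (assumed: one word per line).
    """
    key = sorted(letters)  # sort the input letters once
    matches = []
    for line in lines:
        word = line.strip()  # drop the trailing newline
        # Same sorted letters means one word is a permutation of the other.
        if sorted(word) == key:
            matches.append(word)
    return matches

# Small in-memory stand-in for lex.txt
print(find_anagrams("dad", ["dad\n", "add\n", "test\n"]))
```

Because iterating over an open file yields its lines, a file object such as the one returned by open('wordsEn.txt') can be passed as lines directly.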
For a dictionary with 100 words of 8 letters, the output is :
The time consumed by the original implementation for 10000 records in the dictionary is unbearable.