I have a file with preprocessed German text. All 39 lines have a dot at the end.
To get rid of some nulls in the text, I use this code:
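import re

# Copy the text to a new file, keeping only letters, digits, umlauts, ß and dots.
with open('lemmatizeAFD', 'r') as text_with_nulls, \
        open('lemmatizeAFD_without_nulls', 'w') as text_without_nulls:
    for line in text_with_nulls:
        # Each match is one run of allowed characters; everything else is dropped.
        res = re.findall(r'[a-zA-Z0-9äöüÄÖÜß\.]+', line)
        for token in res:
            text_without_nulls.write(token)
            text_without_nulls.write(' ')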
The output file, however, has only 33 dots, although all the lemmatized words are in place.
Why did some dots disappear? I need them to split the sentences into separate lines later.
I am not very experienced with re, so I assume that something is wrong with the pattern passed to re.findall: r'[a-zA-Z0-9äöüÄÖÜß\.]+'
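
As a rough way to narrow this down, the sketch below lists the non-whitespace characters that the pattern throws away on each line. It rests on an assumption I cannot confirm from the file itself: that the missing dots might be characters that look like a dot but are not the ASCII '.', such as an ellipsis character. The helper names dropped_characters and report_dropped are made up for illustration.

```python
import re

# Same character class as in the question.
PATTERN = re.compile(r'[a-zA-Z0-9äöüÄÖÜß\.]+')


def dropped_characters(line):
    """Return the non-whitespace characters of `line` that PATTERN does not keep."""
    # Removing every match leaves exactly the characters findall would skip.
    leftover = PATTERN.sub('', line)
    return [ch for ch in leftover if not ch.isspace()]


def report_dropped(path):
    """Print, for each line with dropped characters, their code points in hex."""
    with open(path, 'r') as source:
        for number, line in enumerate(source, start=1):
            dropped = dropped_characters(line)
            if dropped:
                print(number, [hex(ord(ch)) for ch in dropped])
```

Calling report_dropped('lemmatizeAFD') prints the line number and the code points of whatever gets removed on that line. The nulls should show up as 0x0; any other entry is a candidate for a dot that the character class does not cover.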