Python walk through directory and open a txt file

I'm trying to open and process thousands of text files that I have downloaded from Wikipedia using SPARQL queries. I use the following code:

list_words=[]
for roots, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".txt"):
           with open(file, 'r') as f:
                content= f.read()

                #remove the punct
                table=string.maketrans(string.punctuation,' '*len(string.punctuation)) 
                s= content.translate(table)


                #remove the stopwords
                text= ' '.join([word for word in s.split() if word not in stopwords])
                alfa= " ".join(text.split())

                #remove the verbs
                for word, pos in tag(alfa): # find all the verbs
                    if pos != "VB": 
                        lower= word.lower()
                        lower_2= unicode(lower, 'utf-8', errors='ignore')
                        list_words.append(lower_2)

                #remove numbers 
                testo_2 = [item for item in list_words if not item.isdigit()]

print set(list_words)           

The problem is that the script opens some of the text files, but for others it gives me the error: "No such file or directory: blablabla.txt"

Does anybody know why this happens and how I can deal with it?

Thanks!

There is 1 answer

anthonybell (best answer)

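The file name is relative: os.walk yields bare file names, so open(file) looks in the current working directory rather than in the directory being walked. You have to join the root and the file name to get the full path, like this: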

absolute_filename = os.path.join(roots, file)
with open(absolute_filename, 'r') as f:
   .... rest of code

(The variable should be named root instead of roots, since it holds a single directory path on each iteration.)
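
For completeness, here is a minimal, self-contained sketch of the same fix. It is written for Python 3, whereas the question uses Python 2, and the helper names iter_text_files and read_all are hypothetical, chosen only for illustration. The utf-8 encoding with errors="ignore" is an assumption that mirrors the unicode(...) call in the question.

```python
import os


def iter_text_files(path):
    """Yield the full path of every .txt file under `path` (hypothetical helper)."""
    for root, dirs, files in os.walk(path):
        for name in files:
            if name.endswith(".txt"):
                # `name` is only the bare file name, so prefix it with the
                # directory that os.walk is currently visiting.
                yield os.path.join(root, name)


def read_all(path):
    """Return a dict mapping each .txt file's full path to its contents."""
    contents = {}
    for filename in iter_text_files(path):
        # Assumption: files are UTF-8; undecodable bytes are skipped.
        with open(filename, "r", encoding="utf-8", errors="ignore") as f:
            contents[filename] = f.read()
    return contents
```

Calling read_all(path) with the same path as in the question should then open files in subdirectories as well, because each name is resolved against the directory it was found in. Note that the result is an absolute path only if path itself is absolute.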