Using Python to read through a directory of JSON files and run a reversing Python script on each file


I have a folder called Userss with ~100 individual JSON files. Each JSON file holds data about a user in the format:

{"cX": 298, "cY": 492, "time": 1420209750422, "y": 492, "x": 298, "type": "mousemove", "name": "Anthony Coleman"}
{"cX": 653, "cY": 57, "time": 1420209753241, "y": 57, "x": 653, "type": "mousemove", "name": "Anthony Coleman"}
{"cX": 646, "cY": 53, "time": 1420209753244, "y": 53, "x": 646, "type": "mousemove", "name": "Anthony Coleman"}
{"cX": 640, "cY": 50, "time": 1420209753250, "y": 50, "x": 640, "type": "mousemove", "name": "Anthony Coleman"}

(all names are made up)

Most of the files are pretty large, so doing this manually is not an option.

I am trying to reverse the content of these individual files and write this reversed data to a new 'reversed file' so that the above JSON snippet would appear as

{"cX": 640, "cY": 50, "time": 1420209753250, "y": 50, "x": 640, "type": "mousemove", "name": "Anthony Coleman"}
{"cX": 646, "cY": 53, "time": 1420209753244, "y": 53, "x": 646, "type": "mousemove", "name": "Anthony Coleman"}
{"cX": 653, "cY": 57, "time": 1420209753241, "y": 57, "x": 653, "type": "mousemove", "name": "Anthony Coleman"}
{"cX": 298, "cY": 492, "time": 1420209750422, "y": 492, "x": 298, "type": "mousemove", "name": "Anthony Coleman"}

in a new file, essentially reverse sorting them by the Unix timestamp.

The files in the Userss folder are named as follows:

firstname-secondname1.json

firstname-secondname2.json

firstname-secondname3.json

...

FYP is the folder in which the script (test.py) to run is saved and Userss is the folder in which the user data is saved. Userss is a sub-folder of FYP.

My approach is to use os.walk() on the Userss directory and carry out a reversing script which I have come up with on each file. My problem is actually iterating through the directory and reading the files in, in the first place.

The following code is what I have:

test.py

import os
from operator import itemgetter, attrgetter, methodcaller
import json

rootdir = './Userss'

fileHandles = {}
count = 0
totalfilelines = 0
filenum = 0
lastName=None
handle=None

for files in os.walk(rootdir):
    #print files
    #print "---------"
    #print len(files)
    #for file in files:
    filenum += 1

    with open(files) as infile:
        #for line in sortedpython(infile, key=itemgetter(2), reverse=True):
        for line in infile:
        '''
        reversing script here
        '''

The commented-out lines are where I tried some different things; I chose to leave them in to give an idea of my approach.

Running this gives me the following error:

Traceback (most recent call last):
  File "test.py", line 37, in <module>
    with open(files) as infile:
TypeError: coercing to Unicode: need string or buffer, tuple found

From my understanding of what I'm trying to do, os.walk() should walk through the Userss directory, and as it 'walks over' each user file I'm trying to pass each of these files to the with open() method to open it so that I can do some work on it.

Where am I going wrong here?

Answer by Joran Beasley (best answer)

reversing a single file

with open("oldFile.txt", "rb") as old:
    lines = old.readlines()  # each line keeps its trailing newline
with open("newFile.txt", "wb") as f:
    f.writelines(reversed(lines))

I guess?
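Plain reversal only gives a reverse sort by timestamp if the lines are already in chronological order. If they might not be, here is a minimal sketch that sorts on the "time" field instead. It assumes each non-blank line is one JSON object with a numeric "time" key, as in the question; the function name reverse_sort_by_time and the file names are made up for illustration.

```python
import json

def reverse_sort_by_time(in_path, out_path):
    """Write the JSON lines of in_path to out_path, newest "time" first."""
    with open(in_path) as infile:
        # Skip blank lines and keep the original text of each record.
        lines = [line.rstrip("\n") for line in infile if line.strip()]
    # Parse each line only to read its timestamp; the text itself is unchanged.
    lines.sort(key=lambda line: json.loads(line)["time"], reverse=True)
    with open(out_path, "w") as outfile:
        for line in lines:
            outfile.write(line + "\n")
```

Called as reverse_sort_by_time("oldFile.txt", "newFile.txt"), this should produce the same result as the plain reversal above whenever the input is already sorted by time.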

iterating over files

os.walk yields a tuple of (current_directory, directories_in_cwd, files_in_cwd) for each directory it visits, not just a file path. The individual entries in files are only filenames, not paths to the files (neither absolute nor relative), so each one has to be joined with the current directory before it can be opened.

for current_directory, directories, files in os.walk(rootdir):
    for file in files:
        # files holds bare names, so build the full path first
        filePath = os.path.join(current_directory, file)
        with open(filePath, "rb") as oldFile:
            pass  # work on oldFile here
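Putting the two pieces together might look like the sketch below. It is only one possible arrangement, not a definitive implementation: the function name reverse_all, the separate output folder, and the "_reversed" suffix are all assumptions made for illustration.

```python
import os

def reverse_all(rootdir, outdir):
    """Reverse the line order of every .json file under rootdir into outdir."""
    if not os.path.isdir(outdir):
        os.makedirs(outdir)
    for current_directory, directories, files in os.walk(rootdir):
        for name in files:
            if not name.endswith(".json"):
                continue  # ignore anything that is not a user file
            in_path = os.path.join(current_directory, name)
            with open(in_path) as infile:
                lines = [line.rstrip("\n") for line in infile]
            # Hypothetical naming scheme: foo.json -> foo_reversed.json
            out_name = name[:-len(".json")] + "_reversed.json"
            with open(os.path.join(outdir, out_name), "w") as outfile:
                for line in reversed(lines):
                    outfile.write(line + "\n")
```

For the layout in the question this could be called as reverse_all('./Userss', './Reversed'), where Reversed is a made-up folder name. The output folder should sit outside Userss so that os.walk does not pick up the files being written, and files with the same name in different sub-folders would overwrite each other in this sketch.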

Alternatively, it's probably easier to use glob:

import glob
for filePath in glob.glob("/path/to/*.json"):
    with open(filePath, "rb") as oldFile:
        pass  # do something with oldFile here

That may address your question, although really this is more about debugging your program. Adding a simple print(files) would have shown you that what you were expecting os.walk to return was not actually what you were getting back from it. Actually, it looks like you did that, but then commented it out. Why did you think that passing a tuple to open was the correct thing to do?
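To see this for yourself, a small experiment along these lines prints what os.walk actually hands back. It is only a sketch: the throwaway directory and the file names a.json and b.json are invented so the result is predictable.

```python
import os
import tempfile

# Build a throwaway directory holding two empty files.
root = tempfile.mkdtemp()
for name in ("a.json", "b.json"):
    open(os.path.join(root, name), "w").close()

for entry in os.walk(root):
    # Each item is a 3-tuple, which is why open(entry) raises a TypeError.
    print(type(entry).__name__, len(entry))
    current_directory, directories, files = entry
    print(sorted(files))  # bare filenames, without any directory part
```

Unpacking the tuple in the for statement itself, as in the loop shown earlier, avoids the intermediate entry variable.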