How to save an incoming file in a Bottle API to HDFS


I am defining a Bottle API where I need to accept a file from the client and then save that file to HDFS on the local system.

The code looks something like this.

import os

import hadoopy
from bottle import route, request

@route('/upload', method='POST')
def do_upload():
    upload = request.files.upload
    name, ext = os.path.splitext(upload.filename)

    save_path = "/data/{user}/{filename}".format(user=USER, filename=name)

    hadoopy.writetb(save_path, upload.file.read())
    return "File successfully saved to '{0}'.".format(save_path)

The issue is that request.files.upload.file is an object of type cStringIO.StringO, whose .read() method returns its contents as a str. But hadoopy.writetb(path, content) expects content in some other format, and the server gets stuck at that call. It raises no exception, prints no error, and returns no result; it just hangs as if it were in an infinite loop.

Does anyone know how to write incoming file in bottle api to HDFS?
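For context, Bottle buffers small uploads in memory; on Python 2 the upload's .file attribute is a cStringIO.StringO, which for this purpose behaves like the stdlib io.BytesIO in the sketch below. The buffer contents here are made up for illustration; note that the buffer can only be drained once, which matters if you retry the write.

```python
import io

# Stand-in for request.files.upload.file (a cStringIO.StringO on
# Python 2); io.BytesIO behaves the same way for this purpose.
buf = io.BytesIO(b"hello hdfs")

data = buf.read()  # drains the whole buffer into a single bytes/str object
rest = buf.read()  # a second read returns b"" -- the buffer is exhausted
```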


1 Answer

Answered by ron rothman

From the hadoopy documentation, it looks like the second parameter to writetb is supposed to be an iterable of pairs; but you're passing in bytes.

...the hadoopy.writetb command which takes an iterator of key/value pairs...

Have you tried passing in a key/value pair? Instead of what you're doing,

hadoopy.writetb(save_path, upload.file.read())  # 2nd param is wrong

try this:

hadoopy.writetb(save_path, [(path, upload.file.read())])  # a list containing one (key, value) pair

(I'm not familiar with Hadoop so it's not clear to me what the semantics of path are, but presumably it'll make sense to someone who knows HDFS.)
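To make the expected shape concrete, here is a minimal stdlib-only sketch (no HDFS or hadoopy needed) of wrapping a single in-memory upload into the one-element list of key/value pairs that writetb's documentation describes. The helper name to_writetb_pairs, the filename, and the file contents are all invented for illustration; they are not part of hadoopy or Bottle.

```python
import io

def to_writetb_pairs(filename, fileobj):
    # hadoopy.writetb wants an iterable of (key, value) pairs, so wrap
    # the single upload in a one-element list rather than passing raw bytes
    return [(filename, fileobj.read())]

# Simulate the in-memory buffer Bottle hands us for a small upload
fake_upload = io.BytesIO(b"file contents")
pairs = to_writetb_pairs("report.txt", fake_upload)
# pairs is now [("report.txt", b"file contents")], a shape suitable for
# hadoopy.writetb(save_path, pairs)
```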