I need to download a binary file in a proprietary format, convert it, and then move the converted file back to storage.
I create a directory/file under /tmp using Java's Files.createTempDirectory and perform the conversion there. I have tried running the code on both the driver and the workers.
When I run Spark locally, it works. But on a managed cluster on Dataproc, I get a FileNotFoundException.
Is there a recommended way to process a binary file on Spark? Or is there another temporary location where I should save the intermediate files?
PS: Using a binary data source or a stream does not work in my case, since I rely on an external library that accepts file paths only.
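For reference, this is roughly the pattern I'm following. The `convert` method here is a hypothetical stand-in for the external library call (which only accepts paths), and the download/upload steps are simulated with local byte writes; in the real job this body runs inside a task (e.g. `mapPartitions`) on an executor:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ConvertLocally {

    // Stand-in for the external library, which accepts file paths only.
    static void convert(Path in, Path out) throws IOException {
        Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Per-task scratch space on the local filesystem.
        Path tmpDir = Files.createTempDirectory("conversion-");
        Path input = tmpDir.resolve("input.bin");
        Path output = tmpDir.resolve("output.bin");

        // Download step (simulated here by writing sample bytes).
        Files.write(input, new byte[] {1, 2, 3});

        // Conversion via the path-based external library.
        convert(input, output);

        // In the real job, the converted file is uploaded back to storage here.
        System.out.println(Files.exists(output));

        // Clean up the scratch files.
        Files.deleteIfExists(input);
        Files.deleteIfExists(output);
        Files.deleteIfExists(tmpDir);
    }
}
```

This works as expected in local mode; the FileNotFoundException only appears on the Dataproc cluster.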