I'm looking for a way to keep a filename associated with the tuples/data that originate from that particular file. I've searched around and found that hfs-wholefile works really well for getting filenames, but it then returns a large chunk of binary data. Is it possible to take this binary data and turn it back into tuples that I can then process as if I had gotten them from hfs-textline?
(defn file-name-with-data
  "Process a file and associate a filename with it"
  [file]
  (<- [?file-name ?data1 ?data2 ?data3 ?data4]
      ((hfs-wholefile file) ?file-name ?binary-data)
      (function-that-im-looking-for ?binary-data :> ?data1 ?data2 ?data3 ?data4)))
Ideally, I would like to use something like the example above to process this information. In Cascalog/Cascading, is there a way to turn the bytes into regular variables I can use in queries?