Preface
I am working on a platform-independent media database written in Java, in which the media files are identified by a file hash. Users should be able to move the files around, so I do NOT want to rely on any file path. Once a file is imported, I store its path and hash in my database. I developed a fast file-hash-id algorithm based on a tradeoff between accuracy and performance, but fast is not always fast enough. :)
In order to update and import media files, I need to (re)create the file hashes of all files in my library. My idea is to calculate the hash just once and store it in the file's metadata (extended attributes), which should boost performance on file systems that support extended file attributes (NTFS, HFS+, ext3...). I have already implemented this, and you can find the current source here: archimedesJ.io.metadata
Attempts
At first glance, Java 1.7 offers a nice way to handle metadata with UserDefinedFileAttributeView. This works on most platforms. Sadly, UserDefinedFileAttributeView does not work on HFS+. I do not understand why the HFS+ file system in particular is not supported, since it is one of the leading formats for metadata. (See the related question, which does not provide any solution.)
How to store extended file attributes on OS X with Java?
In order to work around this Java limitation, I decided to use the xattr command-line tool present on OS X and to invoke it through Java's Process handling, reading its output. My implementation works, but it is very slow. (Recalculating the file hash is faster, how ironic! I am testing on a MacBook Pro Retina with an SSD.)
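For comparison, here is a minimal sketch of the UserDefinedFileAttributeView approach mentioned above, for file stores where the JDK reports support. The class name HashAttributeSketch and its method names are made up for illustration and are not taken from archimedesJ; the attribute name vidada.hash is the one used with xattr in this post. Storing the hash as UTF-8 text is an assumption.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;

// Hypothetical helper, not part of archimedesJ.
class HashAttributeSketch {
    // Attribute name as used in the xattr calls in this post.
    static final String ATTRIBUTE = "vidada.hash";

    // True if the file store holding this file supports user-defined attributes.
    static boolean isSupported(Path file) throws IOException {
        return Files.getFileStore(file)
                .supportsFileAttributeView(UserDefinedFileAttributeView.class);
    }

    // Stores the hash as UTF-8 text in the user-defined attribute.
    static void writeHash(Path file, String hash) throws IOException {
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        view.write(ATTRIBUTE, StandardCharsets.UTF_8.encode(hash));
    }

    // Returns the stored hash, or null if the attribute has not been set.
    static String readHash(Path file) throws IOException {
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        if (!view.list().contains(ATTRIBUTE)) {
            return null;
        }
        ByteBuffer buffer = ByteBuffer.allocate(view.size(ATTRIBUTE));
        view.read(ATTRIBUTE, buffer);
        buffer.flip();
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }
}
```

A caller would check isSupported(file) first and fall back to recalculating the hash (or to another mechanism) when it returns false.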
It turned out that the xattr tool itself is quite slow. (Writing is very slow, but more importantly, reading an attribute is slow as well.) To show that the problem is the tool rather than Java, I created a simple bash script that runs xattr on several files that carry my custom attribute:
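# Glob over the entries in the directory (without the *, the loop runs only once, on the directory itself)
FILES=/Users/IsNull/Pictures/*
for f in $FILES
do
  # Quote the path so file names containing spaces are passed as one argument
  xattr -p vidada.hash "$f"
done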
When I run it, the lines appear "fast" one after another, but I would expect the output to appear immediately, within milliseconds. A small delay is clearly visible, so I guess the tool is simply not that fast. Using it from Java adds the overhead of creating a process and parsing its output, which makes it even slower.
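To make the per-call overhead concrete, this is roughly what reading one attribute through the command-line tool looks like from Java. It is only a sketch of the technique, not my actual implementation; the class name XattrCliSketch and its methods are made up for illustration. Every call starts a new process, reads its output, and waits for it to exit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating the process-per-call approach.
class XattrCliSketch {
    // Builds the command line for "xattr -p <attribute> <file>".
    static List<String> readCommand(String attribute, String file) {
        return Arrays.asList("xattr", "-p", attribute, file);
    }

    // Runs a command and returns its output, or null on a non-zero exit code.
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        StringBuilder output = new StringBuilder();
        // Drain the output before waiting, so the child cannot block on a full pipe.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (output.length() > 0) {
                    output.append('\n');
                }
                output.append(line);
            }
        }
        return process.waitFor() == 0 ? output.toString() : null;
    }

    // Reads one attribute of one file; spawns one process per call.
    static String readAttribute(String attribute, String file)
            throws IOException, InterruptedException {
        return run(readCommand(attribute, file));
    }
}
```

Calling readAttribute("vidada.hash", path) once per file in a large library means one process launch per file, on top of whatever time the tool itself takes.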
Is there a better way to access the extended attributes on HFS+ with Java? What is a fast way to work with the extended attributes on OS X with Java?
I have now created a JNI wrapper that accesses the extended attributes directly through the C API. It is an open-source Java Maven project, available on GitHub/xattrj
For reference, I post the interesting source pieces here. For the latest sources, please refer to the above project page.
Xattrj.java
org_securityvision_xattrj_Xattrj.cpp
Now the makefile, which gave me quite a bit of trouble before I got things working: