Scalable directory structure for user submissions


Our team is in the process of switching our media library storage engine from database BLOB storage to the file system (we are on the LAMP stack with PHP 5.3). Virtually all of the stored content is image data that will be pulled into the application, and the most processing it will receive is some resizing/resampling with GD. The database storage is an artifact of a previous build, and we want to abandon it to reduce the strain on the database server.

I have built a few filesystem-based image libraries like this before, but I'd like to solidify some best practices, since this is going to get large and once it is filled with user data it will be very difficult to modify.

In my previous builds, I created a 'resources' folder with read/write privileges. Within that directory there was a layer of directories named for the "section" of the site the content belonged to, usually the name of the model or controller that used the files. Under that layer were folders named after user/profile ids, or after whichever database primary key determined ownership. In this kind of deployment these were typically the ids of the gallery the images came from, since ownership of a gallery by a specific user could be handled through the database/object models.
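To make the layout above concrete, here is a minimal sketch of how a path of the form resources/<section>/<owner id>/<file> could be assembled. It is written in Python for brevity, and the same logic should port directly to PHP. The function name `build_media_path`, the allowed-character rule for section names, and the example section "gallery" are all assumptions made for illustration; none of them come from the builds described above.

```python
import re
from pathlib import PurePosixPath

# Assumption: section names are simple identifiers (letters, digits, "_" or "-").
_SAFE_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def build_media_path(root, section, owner_id, filename):
    """Return root/section/owner_id/filename as a PurePosixPath.

    Hypothetical helper: mirrors the resources/<section>/<id>/ layout
    described in the question and rejects inputs that could escape it.
    """
    if not _SAFE_SEGMENT.fullmatch(section):
        raise ValueError("invalid section name: %r" % (section,))

    # The owner folder is named after a database primary key,
    # so it must be a positive integer.
    owner = int(owner_id)
    if owner <= 0:
        raise ValueError("owner id must be a positive integer")

    # Keep only the final path component so a user-supplied name
    # cannot introduce extra directories or "../" segments.
    name = PurePosixPath(filename).name
    if name in ("", ".", ".."):
        raise ValueError("invalid file name: %r" % (filename,))

    return PurePosixPath(root) / section / str(owner) / name


print(build_media_path("resources", "gallery", 42, "photo.jpg"))
```

Keeping path construction in one helper like this means the layout can later be changed in a single place rather than in every model that touches the files.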

What kinds of approaches has the community used in this situation, and which were the most scalable? Is there any software for Apache that could handle this kind of organization more effectively than manually coding it into the models? I searched SO and Google for similar threads on filesystem media storage, but found little beyond advice along the lines of "don't use BLOBs", which we have already more or less established. Are there any hard and fast do-nots?

Thanks for your guidance!

There is 1 answer

Alexey Lebedev (best answer)

Check out MogileFS; it's a distributed, parallel, fault-tolerant file system.

It provides automatic replication and namespaces, and it can be integrated with nginx (i.e., no intermediate script is required to serve the content). In our project it proved more reliable and scalable than a plain filesystem for storing millions of photos.