Reinventing the Photo Album (Part 2)

Aug 30, 2006

I would like to expand a little on yesterday's discussion of the photo album software I am in the process of developing. Specifically, I'd like to talk about the server-side changes that I'm planning on making in relation to the Plogger software that I currently use. Suggestions and additional ideas would be great, so please share anything that comes to mind.

Plogger allows users to add images to an album in two ways: either by uploading images one at a time, or by importing a number of images at once. I never use the upload feature, opting instead to import one or more albums at a time via SFTP. Plogger handles this process via an upload folder on the server side. One may either place images directly in this folder, or create sub-folders to better organize things during the import process. I tend to do the latter, creating a sub-folder for each album that I create. At import time, Plogger scans the upload folder for items to be added, presenting the user with a list of the folders containing items to be imported. After selecting the desired folder to import, the user is given the opportunity to caption the pictures and move them into an album. Physically, the files get moved from the upload folder into an images folder. The resulting file structure looks something like this:

images/
 +-- collection_1/
      |-- album_1/
      |    |-- image_1.jpg
      |    |-- image_2.jpg
      |    +-- image_3.jpg
      +-- album_2/

There are a few problems that I can see with this system. First, thumbnail images aren't placed in the images folder, but in a thumbs folder instead (which is up at the same level as the images folder). Unlike its sibling, the thumbs folder has absolutely no organization whatsoever. All of the thumbnail images are simply placed in this one folder, ad hoc. I'm not entirely sure what happens if two different images in two separate albums have the same filename. I wouldn't be surprised if the name collision is not resolved cleanly.

The second issue is that the sub-folders one creates within the upload folder don't get cleaned up when the images get moved during the import step. As a result, a bunch of outdated, empty folders build up over time. Highly annoying for an obsessive-compulsive organizer like myself.

Finally, what happens if two albums have the same name? Again, I'm not sure that the collision is handled cleanly. The results could be potentially disastrous. Separating two intertwined albums in the database would most likely cause a great deal of headache, and is something I'd rather not have to deal with.

So here are the changes that I am proposing for my new, custom system. First of all, thumbnail images will be placed with each corresponding album, in a nested "thumbs" folder (to keep the root album folder as clean as possible). Second, the upload folder will be properly cleaned out when importing images. Empty upload sub-folders will be discovered and removed as necessary. Third, albums will be date stamped. For example, if I uploaded an album today using a folder name of "eno_river", the resulting album name in the images folder would be something like "20060829_eno_river". This would help prevent name collisions on the album level, and would provide a nice chronological ordering on disk (not that that really matters).
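To make the three proposals concrete, here is a rough sketch of what the import step might look like. It is written in Python purely for illustration (the real package may well look different), and the names import_album, stamped_album_name, and IMAGE_SUFFIXES are hypothetical, as is the list of accepted file types.

```python
import shutil
from datetime import date
from pathlib import Path

# Assumed set of importable file types; not taken from Plogger.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def stamped_album_name(folder_name: str, when: date) -> str:
    """Prefix the upload folder name with a YYYYMMDD date stamp."""
    return f"{when:%Y%m%d}_{folder_name}"


def import_album(upload_dir: Path, images_root: Path, when: date) -> Path:
    """Move images out of one upload sub-folder into a date-stamped album."""
    album_dir = images_root / stamped_album_name(upload_dir.name, when)
    # exist_ok=False makes a same-day, same-name collision fail loudly
    # instead of silently merging two albums.
    album_dir.mkdir(parents=True, exist_ok=False)
    # Thumbnails live in a nested folder beside the images they belong to.
    (album_dir / "thumbs").mkdir()

    for item in sorted(upload_dir.iterdir()):
        if item.is_file() and item.suffix.lower() in IMAGE_SUFFIXES:
            shutil.move(str(item), str(album_dir / item.name))

    # Clean up: remove the upload sub-folder only if nothing is left in it.
    if not any(upload_dir.iterdir()):
        upload_dir.rmdir()
    return album_dir
```

Calling import_album(Path("upload/eno_river"), Path("images"), date(2006, 8, 29)) would produce images/20060829_eno_river/ with an empty thumbs/ folder inside it. In this sketch, an upload folder that still holds non-image files is left in place rather than deleted.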

This is how I am planning on proceeding with my new album package. Thoughts? Suggestions? Both are most welcome. This project is still in the early stages of development, and things are still quite malleable.

Update: I want to emphasize that I will not be using the filesystem to do logical organization and naming for each album and image. My album package will use a MySQL database to accomplish this, storing caption data, album data, and EXIF data as necessary. The organization on disk is simply a convenience, to help keep things orderly. Kip makes some good points in the comments on this post, some of which may result in modifications to my current plan.

3 Comments

kip

3:33 AM on Aug 30, 2006
Well I'll tell you how I did it for my photos page. I started out with something kinda similar to what you describe--files were arranged in folders on the server side, each folder would have an "info.txt" file with the name and description of the folder. Then that folder would either have subfolders, or an "images" and "thumbs" folder. If it was the latter, the thumbnails would be shown, and when you clicked on one, the timestamp and caption would be parsed out of the filename. This quickly grew to be a hassle, and produced non-persistent links if I wanted to change the name of a directory, so I rewrote the whole thing to use a database backend.

Now I just have /photos/upload, /photos/thumbs, and /photos/images. I put new images in /photos/upload, and when I log into the admin page PHP looks for any files there and adds them to the database with no parent album (so they won't show up just yet). Every photo has a unique 32-bit (8 hex digit) id, so the filename is something like /photos/images/9236f21d.jpg, and the thumbnail is similarly named in the thumbs directory. I can then create directories via my admin page, and move any of the unassigned photos into the new directory.

This has some advantages:

* Naming collisions are avoided. In the event an existing id is computed, I just compute another one until it is unique (although the birthday paradox tells me that I'd need 9291 id's before the chance of a collision is greater than 1 percent!).
* It is easy to update photo and directory titles and captions without breaking links.
* It is easy to back up and restore all my data (1 directory and a database export/import vs. a recursive directory copy).

And of course some disadvantages:

* Having hundreds of files in a directory can make retrieving the file a little slow, but I imagine once this starts being an issue I could easily break files out across multiple directories.
* File names won't mean anything to users if they want to save the files locally, or find them in their history/cache.
* A lot more front-end programming was required than for the old method. But since writing code for my website is a hobby of mine, I considered this kind of an advantage. :)

If you're interested:

1) I use this Perl script on my machine to rename my photos based on the "modified" timestamp of the file (my camera sets this so I've never had to look at the metadata of the JPG itself). Then PHP uses that timestamp when grabbing files out of the upload folder.

2) I generate the unique key like this, and it has proved pretty effective:

    do {
        // the id is given by the last 8 hex digits from an md5 hash of a random value
        $id = intval(hexdec(substr(md5(uniqid(mt_rand(), true)), -8)));
    } while ($id == 0 || id_exists($id, $table_name));

Sorry I'm so long-winded... #1 criticism I got from English teachers... :)
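For anyone who wants to check the birthday figure or try the same retry-until-unique idea outside PHP, here is a small Python sketch. It is not the code described above: it swaps the md5-of-a-random-value trick for secrets.randbits, and the function names and the in-memory set standing in for the database lookup are assumptions made for illustration.

```python
import math
import secrets


def new_photo_id(existing: set) -> int:
    """Draw random non-zero 32-bit ids until one is unused."""
    while True:
        candidate = secrets.randbits(32)
        if candidate != 0 and candidate not in existing:
            return candidate


def photo_filename(photo_id: int) -> str:
    """Render an id as 8 lowercase hex digits plus a .jpg extension."""
    return f"{photo_id:08x}.jpg"


def ids_for_collision_chance(p: float, bits: int = 32) -> float:
    """Birthday approximation: n ~ sqrt(2 * N * ln(1 / (1 - p)))."""
    space = 2 ** bits
    return math.sqrt(2 * space * math.log(1 / (1 - p)))
```

With p = 0.01 and a 32-bit id space, the approximation lands between 9291 and 9292, which agrees with the number quoted above.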

kip

11:26 AM on Aug 30, 2006
This morning I remembered what was the most annoying part of a system like you're describing: each time you create an album, you have to give the thumbs directory write permission so that PHP can store the thumbnails. This was really annoying to me, especially the time I restored a backup (deleted the directory on the server and uploaded my local copy) and I had to go back and chmod each thumbs directory. For a while I had an intermediate step, where I still used a directory structure (i.e. no database backend), but a single thumbnails folder, and the thumbnails were named something like md5($img_path . $img_name) . '.jpg'.
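A rough Python equivalent of that single-folder naming scheme might look like the following. The function name flat_thumb_name and the example paths are made up for illustration; only the md5-of-path-plus-name idea comes from the description above.

```python
import hashlib


def flat_thumb_name(img_path: str, img_name: str) -> str:
    """Name a thumbnail after the md5 of the image's path plus its file name."""
    digest = hashlib.md5((img_path + img_name).encode("utf-8")).hexdigest()
    return digest + ".jpg"
```

Because the directory path feeds into the hash, two images with the same file name in different albums get different thumbnail names, so a single writable thumbs folder can hold them all.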

Jonah

12:20 PM on Aug 30, 2006
Just to clarify, I won't be using the file system to name albums and the like. Instead, I'll use a MySQL database to do it (relying on the file system is always a big mistake). The naming convention I'm using on disk is simply a convenience to keep things as orderly as possible. Renaming stuff might present a problem, so that's something I'll need to consider. Backing up and restoring is one important case that I hadn't even considered, so I'm glad you mentioned it! I'm not worried about thumbnail images (they can easily be recreated), but I do need to keep track of the core images and their resulting folder structure. Restoring things from a backup is the tricky part in this case. Perhaps this would be a manual step, rather than something that the photo album package automates. You've given me good food for thought. I'll have to think carefully about the direction I'm heading. Perhaps I need to alter my course.
