OTFG Step 2: Thinking about Photo URLs

My next step in moving my photos from Flickr to my own server was thinking about where I would store the photo files. Flickr assigns every photo a numeric ID which is available in every photo URL. For example, here's the URL of one of my photos hosted on Flickr:

http://farm1.static.flickr.com/124/359119647_4874f02815_o.jpg

The URL doesn't give much information about the photo. We know that the photo is at a flickr.com server, and that the photo is a user's original photo (note the _o at the end of the filename). Other than that, pretty anonymous.

Since I'm hosting my own photos, I thought I'd put a bit more information into the photo URLs. I decided to go with this format for original, unresized images:

http://example.com/[year]/[month]/[photo title].jpg

This means the same photo on my server will have a URL like this:

http://example.com/2007/01/beach-dogs.jpg

Though I don't expect my photo URLs to be exposed in the wild very much, I like this structure because it provides a bit of context. And because I'll be using actual directories in the filesystem named /2007 and /01, for example, the filesystem should scale well. I won't have hundreds and hundreds of photos in one folder. On the other hand, it will make running batch operations on all of the photos a bit tougher because I'll have to recurse through the directories—but that shouldn't be a big deal. (Especially since all of the file locations will be stored in the db.)
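As a rough illustration of the recursion mentioned above, here is a minimal sketch using PHP's built-in SPL directory iterators. The function name findPhotos and the choice to match only .jpg files are assumptions made for the example, not part of the import script.

```php
<?php
// Hypothetical helper: collect every .jpg file under a base directory,
// descending into the year/month subdirectories.
function findPhotos(string $baseDir): array
{
    $photos = [];
    // SKIP_DOTS leaves out the "." and ".." entries; the default iteration
    // mode of RecursiveIteratorIterator visits files, not directories.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($baseDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Compare the extension case-insensitively so .JPG is matched too.
        if ($file->isFile() && strtolower($file->getExtension()) === 'jpg') {
            $photos[] = $file->getPathname();
        }
    }
    sort($photos); // Directory order isn't guaranteed, so sort for stable output.
    return $photos;
}
```

Calling findPhotos('/www/photos') would return a flat list of paths that a batch script could loop over, assuming that directory exists and is readable.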

The Flickr API provides the date and time a photo was added to their system in Unix time, and the PHP date() function converts that to any format. So as my import script grabs photos from the Flickr server, it puts the image in the local filesystem based on the time it was added to Flickr originally.
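To make the timestamp conversion concrete, here is a small sketch. The function name photoDatePath and the sample timestamp are made up for illustration. It also uses gmdate() rather than date() so the result doesn't depend on the server's timezone setting; date() takes the same format string.

```php
<?php
// Hypothetical helper: turn a Unix timestamp (such as a photo's upload time)
// into the [year]/[month] part of the photo path.
function photoDatePath(int $uploadedAt): string
{
    // 'Y' is the four-digit year, 'm' is the month with a leading zero.
    // gmdate() formats in UTC; date() would use the server's timezone.
    return gmdate('Y/m', $uploadedAt);
}

// 1168819200 is January 15, 2007 at 00:00:00 UTC.
echo photoDatePath(1168819200), "\n"; // prints 2007/01
```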

I simply set a starting directory in my import script that's available through the web server, say, /www/photos/ or c:\\www\\photos\\ in Windows, and it will create the necessary local directories as it pulls in photos from Flickr.
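Creating the directories on demand could look something like the sketch below. The function name ensurePhotoDir, the 0755 permissions, and the temporary base directory are assumptions for the example. The key piece is the third argument to mkdir(), which tells PHP to create any missing parent directories.

```php
<?php
// Hypothetical helper: make sure the year/month directory exists under
// the base directory, and return its path.
function ensurePhotoDir(string $baseDir, int $uploadedAt): string
{
    // Forward slashes work as path separators in PHP on Windows as well.
    $dir = rtrim($baseDir, '/\\') . '/' . gmdate('Y/m', $uploadedAt);

    // mkdir() with recursive = true creates /2007 and /2007/01 in one call.
    // The second is_dir() check covers another process creating it first.
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        throw new RuntimeException("Could not create directory: $dir");
    }
    return $dir;
}

// Example run against a throwaway directory in the system temp folder.
$base = sys_get_temp_dir() . '/photos-demo';
echo ensurePhotoDir($base, 1168819200), "\n"; // path ends in 2007/01
```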

Using the title of a photo as the file title is a bit tricky, because the titles are meant to be read by humans, not used in the filesystem. Photo titles contain punctuation and spaces, so I just strip all of that out with some regular expressions. I'm sure this could be improved, but I'm using:

$photoTitle_f = preg_replace('/\s+/', '-', $photoTitle_f);
$photoTitle_f = preg_replace('/[^-\w]/', '', $photoTitle_f);


Basically, this bit of code replaces each run of whitespace in the title with a single dash, and then removes any character that isn't a dash, a letter, a number, or an underscore (\w matches underscores too). A bit rough, but it should handle most standard English titles.
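To see what those two substitutions do to real titles, here they are wrapped in a function with a few sample inputs. The function name titleToFilename and the sample titles are invented for the example; the regular expressions are the same two shown above.

```php
<?php
// Hypothetical wrapper around the two substitutions from the post.
function titleToFilename(string $title): string
{
    // Step 1: each run of whitespace becomes a single dash.
    $name = preg_replace('/\s+/', '-', $title);
    // Step 2: drop everything that isn't a dash or a \w character
    // (letters, digits, underscore).
    return preg_replace('/[^-\w]/', '', $name);
}

echo titleToFilename('Beach Dogs'), "\n";             // Beach-Dogs
echo titleToFilename('Sunset,  Oregon coast!'), "\n"; // Sunset-Oregon-coast
echo titleToFilename('Me & the dog'), "\n";           // Me--the-dog
```

The last example shows one of the rough edges: stripping the ampersand leaves two dashes in a row. These two lines also don't change case, so a lowercase name like beach-dogs would need a lowercase title or an extra strtolower() call.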

With the photo-URL planning out of the way, it was time to set up Flickr API access for my import script. I'll show how that works in Step 3.

Comments

Are you planning on using Imagemagick or something similar to resize the images, do watermarking, etc?
Yep, I'll use ImageMagick. I'm planning to resize all of my images on my server instead of grabbing the resized versions at Flickr. But that'll be quite a few steps down the road.
And I'm not planning on watermarking my images. But that'd be easy to add into any automated resizing functions with ImageMagick.
Did you think about using a top level '/photos' folder in the URI? In a dynamic blog publishing scenario, using your solution, you'd now be creating a whole bunch of year/month/date directories that ordinarily don't exist for individual entry archives.
Good point, Joost, I might do that. I was thinking about hosting the photo files on a static image server separate from the interface, but I haven't decided yet. So the images might end up at 'images.onfocus.com' instead of 'www.onfocus.com'. But I agree if everything's going to be at the same domain, it'd be good to have a top level folder so your blog and photos aren't competing for virtual space. This is a good reminder that designing your image file URLs is an important step.
Apparently I can't spell my own name. It's Joost (like the video site, I know).
Fixed!
I've been following along as you head off the grid. Will you be storing all of the photos and their resized versions in the same directory? I gather that you will be. I'm in the middle of going off the grid as well. I think I've got more photos than you do and I'm worried about having too many photos in one directory. Any thoughts on a max number of photos to hold in one directory?
No, I've separated the thumbnails and original-sized photos into separate directories. I have a /photos directory and a /thumbs directory. For the "originals" I separate photos by year and month. For me, that's less than a dozen photos per directory. For thumbnails, I'm only separating by year, which ends up as hundreds of thumbnails at different sizes in each directory. I think you only need to be worried about the number of files in a directory when you're going to have to go in there as a human with your own eyes and dig around. All of the thumbnails in my system are generated by scripts based on the original photos. So I definitely view the thumbnails as "machine" photos and I'm not too worried about digging around in those directories by hand. For thumbnails I figure I can usually write scripts to manage things, because any changes will likely be across every thumbnail.

I talk a little more about my thumbnail setup in Step 8:

http://www.onfocus.com/2007/02/3932