Scaling a photo sharing site

A question from comp.lang.php:

I am writing my own family photo sharing site that I hope to take public (like so many others). Anyway, currently, when the user uploads a picture, I store the picture outside my htdocs folder and record the image details in a MySQL db. When you browse the picture, I read the record and build the image by sending an image/jpeg header.

Seriously though, if I take this public and get extremely lucky and millions of photos are uploaded, would this be the best method?

I’ve read pros and cons of storing images in a database. I’ve read about Flickr, SmugMug, Photobucket having HUNDREDS of millions to over a BILLION images stored!

Obviously, load balancing plays into this but what other secrets do you think they use?

One answer is to separate the (static) pictures from the other (dynamic) content. Say you have two servers: one running PHP/MySQL (let’s call it www.yoursite.com) and another running nothing but Apache (content.yoursite.com), optimized for serving static images. The application residing on www.yoursite.com saves images onto content.yoursite.com and records their full URLs (http://content.yoursite.com/path/file.jpg) in its database. When content.yoursite.com gets low on disk space, you put up a new server (content2.yoursite.com) for writing and start filling it with pictures, while content.yoursite.com remains accessible for reading. You can continue to add new content*.yoursite.com servers as you go. Dynamically generated HTML is served from www.yoursite.com (which may eventually outgrow a single server and morph into a server cluster), and static images are served from content*.yoursite.com.
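The write-server rotation described above can be sketched roughly as follows. This is only an illustration, written in Python for brevity (the same logic ports directly to PHP). The `ContentServer` class, the `MIN_FREE_BYTES` threshold, and the free-space figures are all assumptions made up for the example; how you actually measure free disk space on each content server is left open.

```python
from dataclasses import dataclass


@dataclass
class ContentServer:
    host: str        # e.g. "content.yoursite.com"
    free_bytes: int  # assumed to come from some monitoring source


# Hypothetical threshold: stop writing to a server below 10 GiB free.
MIN_FREE_BYTES = 10 * 1024**3


def pick_write_server(servers):
    """Return the first server (in the order added) that still has room."""
    for server in servers:
        if server.free_bytes >= MIN_FREE_BYTES:
            return server
    raise RuntimeError("all content servers are full; add a new one")


def image_url(server, path):
    """Build the full URL that the application records in its database."""
    return f"http://{server.host}/{path.lstrip('/')}"


servers = [
    ContentServer("content.yoursite.com", 1 * 1024**3),     # nearly full
    ContentServer("content2.yoursite.com", 500 * 1024**3),  # new server
]
target = pick_write_server(servers)
url = image_url(target, "/path/file.jpg")
```

Because the full URL is stored with each image record, older pictures keep pointing at content.yoursite.com even after new uploads have moved on to content2.yoursite.com.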

A slight variation of this approach is to keep multiple servers open for writing at any given time and write each image onto a randomly chosen one. This helps ensure that highly popular content is spread across multiple servers and can thus be served faster.
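A minimal sketch of the random-placement variation, again in Python and again under assumptions: the host names in `WRITABLE_HOSTS` and the helper `store_url` are hypothetical, and the actual file transfer to the chosen server is omitted.

```python
import random

# Hypothetical set of content servers currently open for writing.
WRITABLE_HOSTS = [
    "content3.yoursite.com",
    "content4.yoursite.com",
    "content5.yoursite.com",
]


def store_url(hosts, path, rng=random):
    """Pick a writable host at random and return the URL to record.

    Copying the file to the chosen host is not shown here.
    """
    host = rng.choice(hosts)
    return f"http://{host}/{path.lstrip('/')}"


# A seeded generator makes the choice repeatable, which is handy in tests.
rng = random.Random(0)
url = store_url(WRITABLE_HOSTS, "path/file.jpg", rng)
```

Uniform random choice is the simplest option; weighting the choice by free disk space would be a natural refinement once the servers start to fill unevenly.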

Yet another possibility is to hide your application behind a layer of caching proxies…

* * * * *

A later addition: Cal Henderson, one of Flickr’s architects, actually wrote a book on the subject (Building Scalable Web Sites: Building, scaling, and optimizing the next generation of web applications).
