SPARQL for Flickr: Picture the Possibilities

Posted in photos, semantic web, technology at 2:49 am by wingerz


I recently purchased my first DSLR camera. It wasn’t an easy decision, and at some point I was looking for sample photos taken by a non-DSLR under a certain condition (wide aperture). I started with the Flickr Camera Finder. There is so much wonderful data on pages like the list of Canon cameras and the individual camera pages. The data can be viewed in several ways, but it all just leaves me wanting more. Sure, for a particular camera I can search for pictures tagged with “food”, but what if I want to specify photos with a wide aperture that were taken on November 23, 2005?

Flickr is sitting on a gold mine of data, but the only way to get at it is through the web API (the advanced search is not very powerful). It’s possible to get at some of the EXIF data (photo metadata), but only if you have the ID for a photo; there’s no way to search across all of the images. Even if they managed to implement this particular interface, what if I want to search for photos that satisfy these restrictions and that were posted by users within three friend-links of me?
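To make that concrete, here is a rough sketch of what such a query might look like, built as a string in Python. Everything under the ex: prefix (ex:owner, ex:camera, ex:tag, ex:dateTaken, ex:fNumber) is a made-up vocabulary for illustration; Flickr publishes nothing of the sort. The friend-links part assumes SPARQL 1.1 property paths over foaf:knows, and the camera name in the usage note is only a placeholder.

```python
from datetime import date

# Hypothetical vocabulary: none of the ex: terms are real Flickr properties.
QUERY_TEMPLATE = """\
PREFIX ex:   <http://example.org/photo#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?photo ?fNumber
WHERE {{
  # Owners reachable from me in one, two, or three foaf:knows hops.
  <{me}> (foaf:knows | foaf:knows/foaf:knows | foaf:knows/foaf:knows/foaf:knows) ?owner .
  ?photo ex:owner     ?owner ;
         ex:camera    "{camera}" ;
         ex:tag       "{tag}" ;
         ex:dateTaken "{taken}"^^xsd:date ;
         ex:fNumber   ?fNumber .
  # A wide aperture means a small f-number.
  FILTER (?fNumber <= {max_f_number})
}}
"""


def _literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_photo_query(me, camera, tag, taken, max_f_number):
    """Return a SPARQL query string for the search described above."""
    return QUERY_TEMPLATE.format(
        me=me,
        camera=_literal(camera),
        tag=_literal(tag),
        taken=taken.isoformat(),
        max_f_number=max_f_number,
    )
```

Calling build_photo_query("http://example.org/people/me", "Canon PowerShot S2 IS", "food", date(2005, 11, 23), 2.8) gives a query for food photos shot at f/2.8 or wider on that day by anyone within three friend-links of me.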

If Flickr slapped a SPARQL endpoint on its data, it would open up all sorts of amazing possibilities. Using API keys, they could sell access to the data to photo equipment sellers (and give free access to web hackers). The sellers would be able to offer their customers the ability to find pictures taken with particular cameras and lenses and the people who own them (possibly restricting this set of people to friends or foafs). Of course, Flickr could put together a proprietary web API and do this now, but then they would have to code up every newly requested API method themselves rather than letting data subscribers write their own queries. And SPARQL-able data has the additional benefit of being easier to integrate with other sources.
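As a sketch of how a keyed endpoint might be called: the SPARQL protocol lets a client send a query as the query parameter of an HTTP GET. The endpoint URL and the api_key parameter name below are assumptions made up for illustration, since no such Flickr service exists.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; Flickr offers no SPARQL service at this or any URL.
ENDPOINT = "https://sparql.example.org/flickr"


def build_request_url(query, api_key, endpoint=ENDPOINT):
    """Build a GET URL carrying the query and a (hypothetical) API key."""
    # "query" is the standard SPARQL protocol parameter;
    # "api_key" is an assumed name for the per-subscriber key.
    params = {"query": query, "api_key": api_key}
    return endpoint + "?" + urlencode(params)
```

With a key attached to every request, the service could meter and bill usage per subscriber while subscribers write whatever queries they like.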


  1. Dan Brickley said,

    December 17, 2006 at 3:33 pm

    Exposing an unconstrained SPARQL interface could be computationally rather costly, especially if it got heavily used. But I think you’re right in general: SPARQL over such photo data would be great. I started some experiments in this direction, but based around the idea of doing it against a local cache of personal or group data, not against the entire mega dataset.

    See http://spypixel.com/2006/kml/photos/about.html

    This is currently built by taking results from the Flickr API and reshaping them to look like the results of SPARQL queries expressed in JSON. I want to move to using the Perl CPAN Net::Flickr::Backup libraries instead, but they don’t yet run against a group pool. I have swapped mails with the module author. If you or anyone you know would be interested in pushing this idea further, it could be a fun testbed to collaborate around…

  2. wingerz said,

    December 17, 2006 at 4:07 pm

    Yes, I definitely agree that it would be costly and it probably wouldn’t be practical to deploy a completely open SPARQL endpoint on any site. With keys you could restrict the number of requests over a given time period and track usage patterns. I’m sure it would be valuable from a business standpoint, which could drive interest (and maybe result in more optimization research).

    I’m sure IBM won’t complain if all of a sudden there is a huge need for server cycles – will be interesting to see whether exposing this sort of functionality is cost-effective. I guess that’s part of our (the SW community’s) job.

    Will take a look at the link; it sounds very interesting.

  3. inkel said,

    December 19, 2006 at 1:14 pm

    I was wondering the same thing a long time ago, and I even started to write some scripts that transform Flickr data into RDF (mostly for FOAF purposes). A SPARQL endpoint would be wonderful. Lots of friends of mine love to take and search for photos on Flickr, but as you’ve said, it’s quite difficult to find proper results.

    In case you’re interested, the rough English translation of my rants is at http://f14web.com.ar/inkel/2005/08/25/flickr2foaf.en.html

    I’m looking at Dan’s link now, seems very interesting.

  4. ~wingerz » Thoughts on DSLR ownership said,

    January 27, 2007 at 1:04 am

    […] Post-processing takes time. I’ve been shooting in RAW and tweaking each image a bit before converting to JPEG. Also, I need to do a better job of keeping track of metadata so that later on I can export to RDF and query my images with SPARQL. […]

  5. ~wingerz » Text indexing and query in Boca said,

    February 6, 2007 at 10:15 am

    […] This powerful feature allows SPARQL-aware developers to roll their own APIs. It’s easy to whip up a search across all the literals for traditional text search behavior. With a little more work, you can craft more sophisticated searches, like one for authors of a paper that mentions a specific search term in the abstract (say, “march madness”). […]
