Mash-ups and Mash-ins

Posted in development, semantic web, technology at 1:04 am by wingerz


With more and more websites providing access to data via public APIs, mash-ups have become quite popular. The canonical mash-up takes data from one or more sources and plots it on a Google Map; three of our four summer project demos included a map component. MIT Simile’s Timeline provides a similarly draggable interface for time-based data. In most cases, a mash-up involves creating a new page or set of pages for displaying data.

Lately at work we’ve been tossing around the idea of a mash-in. Rather than creating a new page for displaying data, a mash-in supplements an existing page with additional data. For example, our summer demo involved supplementing corporate directory profiles with bookmarks from an intranet social bookmarking system and a map showing a business address. Profiles normally display several tabs containing various categories of information; without any access to the corporate directory webserver, we added new tabs containing our information. This is one way to add new functionality without forcing the user to adapt to a new interface.

We did this by setting up a proxy server to add a few JavaScript includes to the page. The included code effectively adds a tab to the page, grabs an email address from the contents of the page, and populates the tab with the results of an asynchronous data fetch based on the email address. We also could have written a Firefox extension or Greasemonkey script to do the includes instead.
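As a rough illustration of the two steps described above, here is a minimal Python sketch of what the proxy does to each page (inject the script includes) and of the primitive email match. In our demo the email lookup ran in the browser as JavaScript; Python is used here only to show the idea. The script URLs and the function names are made up for this example.

```python
import re

# Hypothetical script URLs; these are placeholders, not the real includes.
MASHIN_SCRIPTS = ["/mashin/tabs.js", "/mashin/bookmarks.js"]

# Deliberately loose pattern: "something that looks like an email address".
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
BODY_CLOSE_RE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_includes(html, script_urls):
    """Insert <script> tags just before </body>, or append them if there is none."""
    tags = "".join(f'<script src="{url}"></script>' for url in script_urls)
    match = BODY_CLOSE_RE.search(html)
    if match is None:
        return html + tags
    return html[:match.start()] + tags + html[match.start():]


def find_email(html):
    """Return the first string that looks like an email address, or None."""
    match = EMAIL_RE.search(html)
    return match.group(0) if match else None
```

A proxy would call something like inject_includes(page, MASHIN_SCRIPTS) on every profile page it forwards; the included scripts then do the equivalent of find_email on the rendered page and use the result as the key for the asynchronous fetch.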

Scraping the page for an email address is quite primitive; we envision that one day pages will be written in RDFa, which allows the embedding of RDF triples in XHTML. Rather than matching something that looks like an email address, we can run an RDFa parser to find RDF triples, then get the object of a triple with predicate “hasEmailAddress” or something to that effect. Instead of just getting a single value, it would also be easy to check to see if the page contains RDF describing an event, an address, a book, or something else. We could choose widgets to display based on the content of the page.
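To make the difference from scraping concrete, here is a small Python sketch that pulls triples out of RDFa-style attributes and picks widgets based on what it finds. It is not a conformant RDFa parser: it only understands the about, typeof, property, and content attributes, and it ignores prefixes, links, and datatypes. The "ex:" vocabulary, the widget names, and all function names are assumptions made up for this example; only “hasEmailAddress” comes from the discussion above.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked as open.
VOID_TAGS = {"meta", "link", "br", "img", "hr", "input"}

# Hypothetical mapping from RDF types to the widgets a mash-in might show.
WIDGETS_BY_TYPE = {
    "ex:Person": "bookmarks-tab",
    "ex:Event": "calendar-tab",
    "ex:Address": "map-tab",
}


class TripleExtractor(HTMLParser):
    """Collects (subject, predicate, object) triples from a subset of RDFa."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._stack = []  # one frame per open element: [subject, predicate, text]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        inherited = self._stack[-1][0] if self._stack else ""
        subject = attrs.get("about", inherited)
        if attrs.get("typeof"):
            self.triples.append((subject, "rdf:type", attrs["typeof"]))
        predicate = attrs.get("property")
        if predicate and attrs.get("content") is not None:
            # The content attribute overrides the element text as the object.
            self.triples.append((subject, predicate, attrs["content"]))
            predicate = None
        if tag not in VOID_TAGS:
            self._stack.append([subject, predicate, []])

    def handle_data(self, data):
        # Text belongs to every open element that is waiting for a literal.
        for frame in self._stack:
            if frame[1]:
                frame[2].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        subject, predicate, parts = self._stack.pop()
        if predicate:
            self.triples.append((subject, predicate, "".join(parts).strip()))


def extract_triples(html):
    parser = TripleExtractor()
    parser.feed(html)
    parser.close()
    return parser.triples


def objects_of(triples, predicate):
    """Return the objects of all triples with the given predicate."""
    return [obj for (_, pred, obj) in triples if pred == predicate]


def choose_widgets(triples):
    """Pick widgets for every RDF type on the page that we know how to handle."""
    types = objects_of(triples, "rdf:type")
    return [WIDGETS_BY_TYPE[t] for t in types if t in WIDGETS_BY_TYPE]
```

With this in place, getting the email address becomes objects_of(triples, "ex:hasEmailAddress") rather than a pattern match over the whole page, and choose_widgets shows how the same triples could drive which tabs get added.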

It gets even more interesting when you throw Queso into the mix as a backend storage system. Queso can store any sort of structure you’d like (without having to define the structure beforehand). This makes it easy to store user-specific information. In our demo we used Queso to store employee home addresses (for calculating map directions) and del.icio.us usernames (for fetching non-intranet bookmarks). These were two simple, but illustrative, examples.
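The appeal of schema-free storage is easier to see with a toy example. The Python sketch below is an in-memory stand-in for a triple store; it is not Queso's API, and the class, the "ex:" predicates, and the sample values are all made up for illustration. The point is only that a new kind of fact can be stored without defining a table or column for it first.

```python
class TripleStore:
    """Toy in-memory triple store; a stand-in, not Queso's actual interface."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        # No schema to update: any predicate can be used at any time.
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
user = "mailto:jane.doe@example.com"  # hypothetical user identifier
store.add(user, "ex:homeAddress", "1 Example Street, Springfield")
store.add(user, "ex:deliciousUsername", "janedoe")
```

Looking up a user's del.icio.us username is then store.match(user, "ex:deliciousUsername"), and adding a third kind of user-specific fact later requires no change to the store.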

It’s difficult to go beyond simple mash-ups and mash-ins without a flexible storage system. Either you have to set up your own database to store a particular data structure (like storing gas prices and locations in gasbuddy), or you have to work with siloed data that has no simple means of being brought together (for example, if you wanted to compare a particular user’s usage of tags in both flickr and del.icio.us, never mind that they’re both owned by Yahoo). In both cases you’d be able to get something working, but you’d probably end up writing custom database tables and code to do it. Using Queso, you could store the data that you wanted without having to do this (of course, you would have to become familiar with RDF, RDFa, and SPARQL).

We are continuing work on mash-ins. Although they require more trust from the user (proxy / Greasemonkey / FF extension), we believe that this is offset by adding value to existing applications already familiar to users. And as RDFa becomes more pervasive, we’ll be able to add even more interesting functionality after analyzing the contents of a page.
