To centralise or not to centralise

There seems to be a simultaneous movement toward both centralising and decentralising services.

Web apps like Gmail, Flickr and del.icio.us, ever more popular, are inherently centralised: everyone’s email, and access to that email, is kept on one network. In a similar vein, the rise of things like REST and the Google/Yahoo APIs means that we’re relying on centralised ‘web services’ for our data.

Centralisation makes it easier for a web service to provide statistics on everyone’s data. If the service provides an API, anyone can get access to those statistics, which opens up some far-reaching possibilities.

On the flip side we have microformats. The microformat principle is to make it easy to publish data, but do so in a standard way. In an ideal, microformat-filled world, anyone wanting to aggregate everyone’s data can just build a crawler and parser. Compare that to querying one centralised service’s API, and you have the essential difference between centralisation and decentralisation.

Initially, it seems that the microformat guys have got things the wrong way round: querying one service is a lot easier than writing a crawler. The justification for the decentralised model is that there are a lot of people writing content and far fewer people writing parsers, so they’ve made microformats easier to write than to parse.

However, they’ve missed something here. Even though only a minority of people write parsers, the reading audience is generally much larger than the producing one, so we should make it as easy as possible for people to get access to as much data as possible. Centralised websites like Odeo do this: put the content in one place and it’s easier for people to get at it.

But then the web wasn’t built that way. Imagine if all the web’s content were centralised into one giant index. Creating a new page wouldn’t be as easy as write, upload; it would be write, inform the index, wait, wait a bit longer, upload. Hmm. I don’t think so. The model the web has followed is a decentralised network: to find anything, you use a crawler. We could use this model for podcasts as well: write a microformat for saying ‘this is a podcast’, then Google writes a ‘Podcast search’ and that becomes everyone’s front door.
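To make the contrast concrete, here’s a minimal sketch in Python of what the aggregator side of that model could look like. The class name `podcast`, the markup and the URL are all hypothetical, purely for illustration (there is no such agreed microformat here); a real crawler would also fetch pages over HTTP before parsing them.

```python
# Sketch of the decentralised model: each site marks up its data with an
# agreed class name, and an aggregator parses any page it crawls for it.
# The "podcast" class below is a hypothetical microformat, for illustration.
from html.parser import HTMLParser

class MicroformatParser(HTMLParser):
    """Collects href values from <a> tags carrying the agreed class name."""
    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.target_class in classes and "href" in attrs:
            self.found.append(attrs["href"])

# Any page on any site can publish in this format; no central registry needed.
page = '''
<p>My shows:
  <a class="podcast" href="http://example.com/show.mp3">Episode 1</a>
  <a href="http://example.com/about">About me</a>
</p>
'''

parser = MicroformatParser("podcast")
parser.feed(page)
print(parser.found)  # ['http://example.com/show.mp3']
```

The point is that the publisher only had to add one class attribute; all the aggregation effort lives in the (far rarer) parser.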

I mentioned in the last paragraph that centralising all content in an index makes it slower and more difficult to publish. This isn’t necessarily true: Flickr thrives because it’s easier to upload your photos to Flickr than it is to set up your own website and publish them there. The web simply wasn’t designed for sharing photos, so you’d need to write software to share them on your own site; Flickr is a very good system that’s already in place.

So centralisation and decentralisation both have their advantages. For sharing bookmarks and photos, a centralised system is generally easier and more natural; for generic informational content, the decentralised system that is the web is the way forward.

Let me just finish by mentioning that I’ve avoided using blogging as an example thus far because it’s an interesting case in point. Blogging has traditionally been available in two flavours: centralised services like Blogger (and now wordpress.com), and decentralised software like Movable Type and, of course, WordPress. Who knows where this will end up?