We’ve already seen how Web 2.0 has brought about a paradigm of tagged and commented-upon content: photos, bookmarks, events, videos, and blog posts. Blog posts are usually only tagged on the blog itself by the post creator, using free-text keywords such as “scotland”, “movies”, etc. (unless they are bookmarked and tagged by others using social bookmarking services like del.icio.us or personal aggregators like Gregarius). Technorati, the blog search engine, aims to use these keywords to build a “tagged web”. Both tags and hierarchical categorisations of blog posts can be further enriched using the SKOS framework. However, there is often much more to say about a blog post than simply what category it belongs in…
So let’s move on to semantic blogging (some ideas here are from Knud Moeller, who is working on semiBlog). Traditional blogging is aimed at what can be called the “eyeball Web” – i.e. text, images or video content that is targeted mainly at people. Semantic blogging aims to enrich traditional blogging with metadata about the structure (what relates to what and how) and the content (what is this post about – a person, event, book, etc.). In this way, metadata-enriched blogging can be better understood by computers as well as people.
Last time I talked about structured blogging, where microcontent such as microformats is positioned inline in the HTML (and subsequent syndication feeds) and can be rendered via CSS. Structured blogging and semantic blogging do not compete, but rather offer metadata in slightly different ways (using microcontent / microformats and RDF respectively). There are already mechanisms such as GRDDL which can be used to move from one to the other.
So why would one choose to enhance their blogs and posts with semantics? Current blogging offers poor query possibilities (except for searching by keyword or seeing all posts labelled with a particular tag). There is little or no reuse of data offered (apart from copying URLs or text from posts). Some linking of posts is possible via direct HTML links or trackbacks, but again, nothing can be said about the nature of those links (are you agreeing with someone, linking to an interesting post, or are you quoting someone whose blog post is directly in contradiction with your own opinions?). Semantic blogging aims to tackle some of these issues, by facilitating better (i.e. more precise) querying when compared with keyword matching, by providing more reuse possibilities, and by creating “richer” links between blog posts.
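To make the idea of “richer” links a bit more concrete, here is a minimal sketch in Python. It assumes each link between posts is stored as a (subject, predicate, object) statement. The predicate names (agreesWith, disagreesWith, quotes) and the example.org URLs are made up for illustration and do not come from any particular vocabulary or tool.

```python
# Each statement is a (subject, predicate, object) triple.
# The predicate names and URLs are hypothetical, for illustration only.
LINKS = [
    ("http://example.org/blog-a/post1", "agreesWith", "http://example.org/blog-c/post7"),
    ("http://example.org/blog-b/post3", "disagreesWith", "http://example.org/blog-c/post7"),
    ("http://example.org/blog-a/post2", "quotes", "http://example.org/blog-b/post3"),
]


def posts_linking_to(target, relation, links=LINKS):
    """Return the posts that link to `target` with the given kind of link."""
    return [s for (s, p, o) in links if p == relation and o == target]


if __name__ == "__main__":
    # Which posts disagree with blog-c's post7?
    print(posts_linking_to("http://example.org/blog-c/post7", "disagreesWith"))
```

With plain HTML links or trackbacks, all three statements would collapse into “post X links to post Y”. Once the kind of link is recorded, a question like “who disagrees with this post?” becomes a simple filter.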
It is not simply a matter of adding semantics for the sake of creating extra metadata, but rather a case of being able to reuse what data a person already has in their desktop or web space and making the resulting metadata available to others. People are already (sometimes unknowingly) collecting and creating large amounts of structured data on their computers, but this data is often tied into specific applications and locked within a user’s desktop (e.g. contacts in a person’s addressbook, events in a calendaring application, author and title information in documents, audio metadata in MP3 files). Semantic blogging can be used to “lift” or release this data onto the Web.
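As a rough sketch of what “lifting” might look like, the Python below turns an address book entry into RDF statements using the FOAF vocabulary and prints them as N-Triples. The dictionary layout, the function names and the example.org addresses are my own assumptions for illustration. This is not how semiBlog or any particular tool does it.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def lift_contact(person_uri, contact):
    """Turn a (hypothetical) address book entry into RDF-style triples."""
    triples = [(person_uri, RDF_TYPE, FOAF + "Person")]
    if "name" in contact:
        triples.append((person_uri, FOAF + "name", contact["name"]))
    if "email" in contact:
        triples.append((person_uri, FOAF + "mbox", "mailto:" + contact["email"]))
    if "homepage" in contact:
        triples.append((person_uri, FOAF + "homepage", contact["homepage"]))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples lines (simplified: no newline escaping)."""
    lines = []
    for s, p, o in triples:
        # Objects that look like URIs become resources; anything else a plain literal.
        if o.startswith(("http://", "https://", "mailto:")):
            obj = "<%s>" % o
        else:
            obj = '"%s"' % o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)


if __name__ == "__main__":
    entry = {"name": "John Example", "email": "john@example.org"}
    print(to_ntriples(lift_contact("http://example.org/people/john", entry)))
```

The point is only that the data already sits in the address book in a structured form, so publishing it alongside a blog post is a matter of mapping fields to a shared vocabulary.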
Looking at the picture on the right, Aidan writes a blog post which he annotates using content from his desktop calendaring and addressbook applications. He publishes this post onto the Web, and John, reading this post, can reuse the embedded metadata in his own desktop applications.
The next picture is from a semantic blogging application called semiBlog. In this picture, a semantic blog post is being created by annotating a part of the post text about John with an address book entry that has extra metadata describing John. Once a blog has semantic metadata, it can be used to perform queries such as “which blog posts talk about papers by Stefan Decker?”; it can be used for browsing not only across blogs but also other kinds of discussion methods; or it can be used by blog readers for importing metadata into desktop applications (using the Web as a clipboard).
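A query like “which blog posts talk about papers by Stefan Decker?” is really a two-step join over the metadata: first find the papers by that author, then find the posts that mention those papers. Here is a minimal Python sketch of that join. The identifiers (post:1, paper:A) and the predicate names talksAbout and creator are placeholders I made up, and the data is invented purely to show the mechanics.

```python
# Placeholder data: identifiers and predicate names are hypothetical.
TRIPLES = [
    ("post:1", "talksAbout", "paper:A"),
    ("post:2", "talksAbout", "paper:B"),
    ("post:3", "talksAbout", "paper:A"),
    ("paper:A", "creator", "Stefan Decker"),
    ("paper:B", "creator", "Someone Else"),
]


def posts_about_papers_by(author, triples=TRIPLES):
    """Return, in sorted order, the posts that talk about any paper by `author`."""
    # Step 1: collect the papers created by the author.
    papers = {s for (s, p, o) in triples if p == "creator" and o == author}
    # Step 2: collect the posts that talk about one of those papers.
    return sorted(s for (s, p, o) in triples if p == "talksAbout" and o in papers)


if __name__ == "__main__":
    print(posts_about_papers_by("Stefan Decker"))
```

A keyword search for “Stefan Decker” would miss any post that mentions a paper only by its title. The join works because the link from paper to author lives in the metadata rather than in the post text.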
As well as semiBlog, other semantic blogging systems have been developed by HP, the National Institute of Informatics in Japan, and MIT. But it’s not just blog posts that are being enhanced by structured metadata and semantics – it’s happening in many other Web 2.0 application areas. Wikis such as Wikipedia have contained structured metadata in the form of templates for some time now, and at least twenty “semantic wikis” have also appeared to address a growing need for more structure in wikis. I’ll talk about semantic wikis next time, and in the meantime look forward to your comments…