Tales From the SIOC-O-Sphere #10

SIOC is a Social Semantic Web project that originated at DERI, NUI Galway (funded by SFI) and which aims to interlink online communities with semantic technologies. You can read more about SIOC on the Wikipedia page for SIOC or in this paper. But in brief, SIOC provides a set of terms that describe the main concepts in social websites: posts, user accounts, thread structures, reply counts, blogs and microblogs, forums, etc. It can be used for interoperability between social websites, for augmenting search results, for data exchange, for enhanced feed readers, and more. It’s also one of the metadata formats used in the forthcoming Drupal 7 content management system, and has been deployed on hundreds of websites including Newsweek.com.
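To make the "set of terms" a little more concrete, here is a minimal sketch of what a SIOC description of a single forum post might look like when written out as Turtle. The helper names (`turtle_literal`, `describe_post`) and all of the example.org URIs are made up for illustration; only the `sioc:` class and property names and the namespace come from the SIOC Core vocabulary. A real exporter would normally use an RDF library rather than string formatting.

```python
SIOC_NS = "http://rdfs.org/sioc/ns#"  # SIOC Core namespace


def turtle_literal(text):
    """Escape a string for use as a double-quoted Turtle literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def describe_post(post_uri, forum_uri, account_uri, content, reply_uris=()):
    """Return a Turtle snippet describing one post with core SIOC terms.

    Hypothetical helper: the URIs passed in are assumed to be plain,
    already-valid absolute URIs.
    """
    reply_uris = list(reply_uris)
    lines = [
        f"@prefix sioc: <{SIOC_NS}> .",
        "",
        f"<{forum_uri}> a sioc:Forum .",
        f"<{account_uri}> a sioc:UserAccount .",
        "",
        f"<{post_uri}> a sioc:Post ;",
        f"    sioc:has_container <{forum_uri}> ;",
        f"    sioc:has_creator <{account_uri}> ;",
    ]
    # One sioc:has_reply statement per reply gives the thread structure.
    for reply_uri in reply_uris:
        lines.append(f"    sioc:has_reply <{reply_uri}> ;")
    lines.append(f"    sioc:num_replies {len(reply_uris)} ;")
    lines.append(f"    sioc:content {turtle_literal(content)} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        describe_post(
            "http://example.org/forum/1/post/1",
            "http://example.org/forum/1",
            "http://example.org/user/alice",
            "Has anyone tried exporting forum data as RDF?",
            reply_uris=["http://example.org/forum/1/post/2"],
        )
    )
```

Because every site that exports data this way uses the same terms for posts, containers and creators, a consumer such as a feed reader or search engine can process posts from different platforms without site-specific code.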

As part of our dissemination activities, I’ve tried to regularly summarise recent developments in the project so as to give an overview of what’s going on and also to help in connecting interested parties. It’s been much too long (over a year) since my last report, so this will be a long one! In reverse chronological order, here’s a list of recent applications and websites that are using SIOC:

  • SMOB Version 2. As you may have read on Y Combinator Hacker News yesterday, a re-architected and re-coded version of SMOB (Semantic Microblogging) has been created by Alex Passant. As with our original SMOB design, a user’s SMOB site stores and shares tweets and user information using SIOC and FOAF, but the new version also exposes data via RDFa and additional vocabularies (including the Online Presence Ontology, MOAT, Common Tag). The new SMOB suggests relevant URIs from DBpedia and Sindice when #hashtags are entered, and has moved from a client-server model to a set of distributed hubs. Contact @terraces.
  • on-the-wave. This script creates an enhanced browsing experience (that is SIOC-enabled) for the popular PTT bulletin board system. Contact kennyluck@csail.mit.edu.
  • Newsweek.com. American news magazine Newsweek are now publishing RDFa on their main site, including DC, CommonTag, FOAF and SIOC. Contact @markcatalano.
  • Linked Data from Picasa. OpenLink Software’s URI Burner can now provide Linked Data views of Google Picasa photo albums. See an example here. Contact @kidehen.
  • Facebook Open Graph Protocol. Facebook recently announced its Open Graph Protocol (OGP), which allows any web page to become a rich object in their social graph. While OGP defines its own set of classes and properties, the RDF schema contains direct mappings to existing concepts in FOAF, DBpedia and BIBO, and indirect mappings to concepts in Geo, vCard, SIOC and GoodRelations. OpenLink also have a data dictionary meshup of some OGP and SIOC terms (ogp:Blog is mapped to sioct:Weblog). Contact @daveman692.
  • Linked Data from Slideshare. A service to produce Linked Data from the popular Slideshare presentation sharing service has been created, and is available here. Data is represented in SIOC and DC. Contact @pgroth.
  • FanHubz. FanHubz supports community building and discovery around BBC content items such as TV shows and radio programmes. It reuses the sioct:MicroblogPost term and also has some interesting additional annotation terms for in-show tweets (e.g. twitterSubtitles). Contact @ldodds.
  • RDFa-enhanced FusionForge. An RDFa-enhanced version of FusionForge, a software project management and collaboration system, has been created that generates metadata about projects, users and groups using SIOC, DOAP and FOAF. You can look at the Forge ontology proposal, and also view a demo site. Contact @olberger.
  • Falconer. Falconer is a Semantic Web search engine application enhanced with SIOC. It allows newly-created Social Web content to be represented in SIOC, but it also allows this content to be annotated with any semantic statements available from Falcons, and all of this data can then be indexed by the search engine to form an ecosystem of semantic data. Contact wu@seu.edu.cn.
  • Django to RDF. A script is available here to turn Django data into SIOC RDF and JSON. View the full repository of related scripts on github. Contact @niklasl.
  • SIOC Actions Module. A new SIOC module has been created to describe actions, with potential applications ranging from modelling actions in a developer community to tracing interactions in large-scale wikis. There is a SIOC Actions translator site for converting Activity Streams, Wikipedia interactions and Subversion actions into RDF. Contact @pchampin.
  • SIOC Quotes Module. Another SIOC module has been developed for representing quotes in e-mail conversations and other social media content. You can view a presentation on this topic. Contact @terraces.
  • Siocwave. Siocwave is a desktop tool for viewing and exploring SIOC data, and is based on Python, RDFLib and wxWidgets. Contact vfaronov@gmail.com.
  • RDFa in Drupal 7. Following the Drupal RDF code sprint in DERI last year, RDFa support (FOAF, SIOC, SKOS, DC) in Drupal core was committed to version 7 in October, and work has continued apace on refining this module. Drupal 7 is currently on its fifth alpha version, and a full release candidate is expected later this summer. Find out more about the RDFa in Drupal initiative at semantic-drupal.com. Contact @scorlosquet.
  • Omeka Linked Data Plugin (Forthcoming). A plugin to produce Linked Data from the Omeka web publishing platform is in progress that will generate data using SIOC, FOAF, DOAP and other formats. Contact @patrickgmj.
  • Boeing inSite. inSite is an internal social media platform for Boeing employees that provides SIOC and FOAF data services as part of its architecture. Contact @adamboyet.
  • Virtuoso Sponger. Virtuoso Sponger is a middleware component of Virtuoso that generates RDF Linked Data from a variety of data sources (working as an “RDFizer”). It supports SIOC as an input format, and also uses SIOC as its data space “glue” ontology (view the slides). Contact @kidehen.
  • SuRF. SuRF is a Python library for working with RDF data in an object-oriented way, with SIOC being one of the default namespaces. Contact basca@ifi.uzh.ch.
  • Triplify phpBB 3. A Triplify configuration file for phpBB 3 has been created that allows RDF data (including SIOC) to be generated from this popular bulletin board system. Various other Triplify configurations are also available. Contact auer@informatik.uni-leipzig.de.
  • SiocLog. SiocLog is an IRC logging application that provides discussion channels and chat user profiles as Linked Data, using SIOC and FOAF respectively. You can see a deployment and view our slides. Contact @tuukkah.
  • myExperiment Ontology. myExperiment is a collaborative environment where scientists can publish their workflows and experiment plans, share them with groups and find those of others. In their model, myExperiment reuses ontologies like DC, FOAF, SIOC, CC and OAI-ORE. Contact drn@ecs.soton.ac.uk.
  • aTag. The aTag generator produces snippets of HTML enriched with SIOC RDFa and DBpedia-linked tags about highlighted items of interest on any web page, but aiming at the biomedical domain. Contact @matthiassamwald.
  • ELGG SID Module. A Semantically-Interlinked Data (SID) module for the ELGG educational social network system has been described that allows UGC and tags from ELGG platforms to become part of the Linked Data cloud. Contact @selvers.
  • Liferay Linked Data Module. The Linked Data module for Liferay, an enterprise portal solution, supports mapping of data to the SIOC, MOAT and FOAF vocabularies. Contact @bryan_.
  • ourSpaces. ourSpaces is a VRE enabling online collaboration between researchers from various disciplines. It combines FOAF and SIOC with data provenance ontologies for sharing digital artefacts. Contact r.reid@abdn.ac.uk.
  • Good Relations and SIOC. This post describes nicely how the Good Relations vocabulary for e-commerce can be combined with SIOC, e.g. to link a gr:Offering (either being offered or sought by a gr:BusinessEntity) to a natural-language discussion about that thing in a sioc:Post. Contact sdmonroe@gmail.com.
  • Debian BTS to RDF. Discussions from the Debian bug-tracking system (BTS) can be converted to SIOC and RDF and browsed or visualised in interesting ways, e.g. who replied to whom. Contact quang_vu.dang@it-sudparis.eu.
  • RDFex. For those wishing to reuse parts of popular vocabularies in their own Semantic Web vocabularies, RDFex is a mechanism for importing snippets from other namespaces without having to copy and paste them. RDFex can be used as a proxy for various ontologies including DC, FOAF and SIOC. Contact holger@knublauch.com.
  • IRC Logger with RDFa and SIOC. A fork of Dave Beckett’s IRC Logger has been created to include support for RDFa and SIOC by Toby Inkster. Contact mail@tobyinkster.co.uk.
  • mbox2rdf. A mbox2rdf script has been created that converts a mailing list in an mbox file to RDF (RSS, SIOC and DC). Contact mail@tobyinkster.co.uk.
  • Chisimba SIOC Export Module. A SIOC Export module for the Chisimba CMS/LMS platform has been created, which allows various Chisimba modules (CMS, forum, blog, Jabberblog, Twitterizer) to export SIOC data. Contact @paulscott56.
  • vBulletin SIOC Exporter. Omitted from the last report, the vBulletin SIOC plugin generates SIOC and FOAF data from vBulletin discussion forums. It includes a plugin that allows users to opt to export the SHA1 of their e-mail address (and other inverse functional properties) and their network of friends via vBulletin’s user control panel. Contact @johnbreslin.
  • Discuss SIOC on Google Wave. You can now chat about SIOC on our Google Wave.
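For readers wondering what "the SHA1 of their e-mail address" in the vBulletin entry above means in practice, here is a small sketch. It follows the FOAF definition of foaf:mbox_sha1sum, where the hash is taken over the full mailto: URI rather than the bare address. The function name and the example address are made up, and this is not taken from the vBulletin plugin's own code.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA1 hex digest of the mailto: URI for an address.

    This matches how foaf:mbox_sha1sum is defined: the input to the
    hash includes the "mailto:" prefix.
    """
    mailto = email.strip()
    if not mailto.lower().startswith("mailto:"):
        mailto = "mailto:" + mailto
    return hashlib.sha1(mailto.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical address, for illustration only.
    print(mbox_sha1sum("alice@example.org"))
```

Publishing the hash instead of the address lets two sites work out that their accounts belong to the same person (the hashes match) without exposing the e-mail address itself to harvesters.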

Book launch for "The Social Semantic Web"

We had the official book launch of “The Social Semantic Web” last month in the President’s Drawing Room at NUI Galway. The book was officially launched by Dr. James J. Browne, President of NUI Galway. The book was authored by myself, Dr. Alexandre Passant and Prof. Stefan Decker from the Digital Enterprise Research Institute at NUI Galway (sponsored by SFI). Here is a short blurb:

Web 2.0, a platform where people are connecting through their shared objects of interest, is encountering boundaries in the areas of information integration, portability, search, and demanding tasks like querying. The Semantic Web is an ideal platform for interlinking and performing operations on the diverse data available from Web 2.0, and has produced a variety of approaches to overcome limitations with Web 2.0. In this book, Breslin et al. describe some of the applications of Semantic Web technologies to Web 2.0. The book is intended for professionals, researchers, graduates, practitioners and developers.

Some photographs from the launch event are below.

Another successful defense by Uldis Bojars in November

Uldis Bojars submitted his PhD thesis entitled “The SIOC MEthodology for Lightweight Ontology Development” to the University in September 2009. We had a nice night out to celebrate in one of our favourite haunts, Oscars Bistro.

Jodi, John, Alex, Julie, Liga, Sheila and Smita

This was followed by a successful defense at the end of November 2009. The examiners were Chris Bizer and Stefan Decker. Uldis even wore a suit for the event (see below).

I will rule the world!

Uldis established a formal ontology design process called the SIOC MEthodology, based on an evolution of existing methodologies that have been streamlined, experience developing the SIOC ontology, and observations regarding the development of lightweight ontologies on the Web. Ontology promotion and dissemination is established as a core part of the ontology development process. To demonstrate the usage of the SIOC MEthodology, Uldis described the SIOC project case study which brings together the Social Web and the Semantic Web by providing semantic interoperability between social websites. This framework allows data to be exported, aggregated and consumed from social websites using the SIOC ontology (in the SIOC application food chain). Uldis’ research work has been published in 4 journal articles, 8 conference papers, 13 workshop papers, and 1 book chapter. The SIOC framework has also been adopted in 33 third-party applications. The Semantic Radar tool he initiated for Firefox has been downloaded 24,000 times. His scholarship was funded by Science Foundation Ireland under grant numbers SFI/02/CE1/I131 (Líon) and SFI/08/CE/I1380 (Líon 2).

We wish Uldis all the best in his future career, and hope he will continue to communicate and collaborate with researchers in DERI, NUI Galway in the future.

Haklae Kim and his successful defense in September

This is a few months late, but better late than never! We said goodbye to PhD researcher Haklae Kim in May of this year when he returned to Korea, and he took up a position with Samsung Electronics soon afterward. We had a nice going-away lunch for Haklae with the rest of the team from the Social Software Unit (picture below).

Sheila, Uldis, John, Haklae, Julie, Alex and Smita

Haklae returned to Galway in September to defend his PhD entitled “Leveraging a Semantic Framework for Augmenting Social Tagging Practices in Heterogeneous Content Sharing Platforms”. The examiners were Stefan Decker, Tom Gruber and Philippe Laublet. Haklae successfully defended his thesis during the viva, and he will be awarded his PhD in 2010. We got a nice photo of the examiners during the viva which was conducted via Cisco Telepresence, with Stefan (in Galway) “resting” his hand on Tom’s shoulder (in San Jose)!

Philippe Laublet, Haklae Kim, Tom Gruber, Stefan Decker and John Breslin

Haklae created a formal model called SCOT (Social Semantic Cloud of Tags) that can semantically describe tagging activities. The SCOT ontology provides enhanced features for representing tagging and folksonomies. This model can be used for sharing and exchanging tagging data across different platforms. To demonstrate the usage of SCOT, Haklae developed the int.ere.st open tagging platform that combined techniques from both the Social Web and the Semantic Web. The SCOT model also provides benefits for constructing social networks. Haklae’s work allows the discovery of social relationships by analysing tagging practices in SCOT metadata. He performed these analyses using both Formal Concept Analysis and tag clustering algorithms. The SCOT model has also been adopted in six applications (OpenLink Virtuoso, SPARCool, RelaxSEO, RDFa on Rails, OpenRDF, SCAN), and the int.ere.st service has 1,200 registered members. Haklae’s research work was published in 2 journal articles, 15 conference papers, 3 workshop papers, and 2 book chapters. His scholarship was funded by Science Foundation Ireland under grant numbers SFI/02/CE1/I131 (Líon) and SFI/08/CE/I1380 (Líon 2).

We wish Haklae all the best in his future career, and hope he will continue to communicate and collaborate with researchers in DERI, NUI Galway in the future.

Call for bid proposals for hosting BlogTalk 2010 / 2011: The International Conference on Social Software

BlogTalk, the International Conference on Social Software, is designed to allow dialogue between practitioners, developers and academics who are involved in the area of social software (blogs, wikis, forums, IM, social networks, microblogging, etc.). As well as a programme of peer-reviewed presentations, BlogTalk features prominent speakers from successful social media companies, research organisations, etc. Typical attendance figures are over 100 people.

The BlogTalk steering committee encourages you to submit a preliminary bid to host the International Conference on Social Software in 2010 or 2011. The annual conference includes a combination of formal talks, workshops, breakout sessions, networking opportunities, and social events. We seek to hold our annual conference in a diverse range of localities (previous countries were Austria, Australia, Ireland and Korea). Each conference involves a working partnership between the BlogTalk steering committee, the host organisers, and a programme committee of expert reviewers.

Conference schedules have typically followed the pattern of having two full days of talks, with interleaved discussion panels, birds of a feather sessions, etc. although each host has flexibility about when to hold certain extra events, or sometimes, whether to hold them at all. We recommend that the dinner event be held on the first night, in the middle of the conference. There is also an option to have a day of workshops prior to the main conference talks, and a welcome reception the night before the main conference.

Each host takes a lead role in gathering sponsorship for its conference. Usually, tickets account for about $15,000 – 20,000 and the host is responsible for raising at least $20,000 – 35,000 in sponsorship. The combined funds go a long way toward making the conference budget manageable. A small portion of the conference budget will also go into a central BlogTalk fund for aiding with publications and future events.

Sponsorship includes the placement of a logo on materials such as the attendee’s pack, t-shirts, and the conference website. It may include free registration for two attendees, and a guaranteed slot for a product demo during the conference’s demonstrations session. The conference’s main event, the dinner, can also be sponsored. As well as a placard at the entrance to the event, the sponsor will be acknowledged on the website, during the programme chair’s speeches, and in conference materials.

With your help, the steering committee will also help market the event in a variety of ways, through targeted emails and social media distribution channels.

To be considered as a host for BlogTalk 2010 or BlogTalk 2011, please fill out the attached preliminary bid proposal and return it to us (blogtalk2010@gmail.com) by January 18, 2010. The steering committee will consider all proposals and notify bidders within two weeks of the closing date.

Bid Proposal for BlogTalk 2010 or 2011

Contact Person:
Organisation:
Address:
Telephone:
Email:

Which year are you bidding for (2010 or 2011)?

Proposed Hotel / Venue Name:
Location/Address:
Distance from Major Airport (Miles):
Distance from Major Airport (Minutes):

Describe potential keynote speakers you would intend to have speak at the event:

Give details of previous conferences and workshops that you and your team have organised:

Describe available transportation modes and costs between major airport and preferred conference venue hotel (shuttle, taxi, etc.):

Describe the preferred conference venue / hotel’s accommodations (lodging and meeting rooms, public areas):

Give details of any possible social events that could be held:

Describe the restaurants, shopping, and night life close to the preferred conference hotel:

Some of my (very) preliminary opinions on Google Wave

I was interviewed by Marie Boran from Silicon Republic recently for an interesting article she was writing entitled “Will Google Wave topple the e-mail status quo and change the way we work?”. I thought my longer answers might be of interest, so I am pasting them below.

Disclaimer: My knowledge of Google Wave is second hand through various videos and demonstrations I’ve seen… Also, my answers were written pretty quickly!

As someone who is both behind Ireland’s biggest online community boards.ie and a researcher at DERI on the Semantic Web, are you excited about Google Wave?

Technically, I think it’s an exciting development – commercially, it obviously provides potential for others (Google included) to set up a competing service to us (!), but I think what is good is the way it has been shown that Google Wave can integrate with existing platforms. For example, there’s a nice demo showing how Google Wave plus MediaWiki (the software that powers the Wikipedia) can be used to help editors who are simultaneously editing a wiki page. If it can be done for wikis, it could aid with lots of things relevant to online communities like boards.ie. For example, moderators could see what other moderators are online at the same time, communicate on issues such as troublesome users, posts with questionable content, and then avoid stepping on each other’s toes when dealing with issues.

Does it have potential for collaborative research projects? Or is it heavyweight/serious enough?

I think it has some potential when combined with other tools that people are using already. There’s an example from SAP of Google Wave being integrated with a business process modelling application. People always seem to step back to e-mail for doing various research actions. While wikis and the like can be useful tools for quickly drafting research ideas, papers, projects, etc., there is that element of not knowing who is doing stuff at the same time as you. Just as people are using Gtalk to augment Gmail by being able to communicate with contacts in real time when browsing e-mails, Google Wave could potentially be integrated with other platforms such as collaborative work environments, document sharing systems, etc. It may not be heavyweight enough on its own but at least it can augment what we already use.

Where does Google Wave sit in terms of the development of the Semantic Web?

I think it could be a huge source of data for the Semantic Web. What we find with various social and collaborative platforms is that people are voluntarily creating lots of useful related data about various objects (people, events, hobbies, organisations) and having a more real-time approach to creating content collaboratively will only make that source of data bigger and hopefully more interlinked. I’d hope that data from Google Wave can be made available using technologies such as SIOC from DERI, NUI Galway and the Online Presence Ontology (something we are also working on).

If we are to use Google Wave to pull in feeds from all over the Web will both RSS and widgets become sexy again?

I haven’t seen the example of Wave pulling in feeds, but in theory, what I could imagine is that real-time updating of information from various sources could allow that stream of current information to be updated, commented upon and forwarded to various other Waves in a very dynamic way. We’ve seen how Twitter has already provided some new life for RSS feeds in terms of services like Twitterfeed automatically pushing RSS updates to Twitter, and this results in some significant amounts of rebroadcasting of that content via retweets etc.

Certainly, one of the big things about Wave is its integration of various third-party widgets, and I think once it is fully launched we will see lots of cool applications building on the APIs that they provide. There have been a few basic demonstrator gadgets shown already like polls, board games and event planning, but it’ll be the third-party ones that make good use of the real-time collaboration that will probably be the most interesting, as there’ll be many more people with ideas compared to some internal developers.

Is Wave the first serious example of a communications platform that will only be as good as the third-party developers that contribute to it?

Not really. I think that title applies to many of the communications platforms we use on the Web. Facebook was a busy service but really took off once the user-contributable applications layer was added. Drupal was obviously the work of a core group of people but again the third-party contributions outweigh those of the few that made it.

We already have e-mail and IM combined in Gmail and Google Docs covers the collaborative element so people might be thinking ‘what is so new, groundbreaking or beneficial about Wave?’ What’s your opinion on this?

Perhaps the real-time editing and updating process. Often, it’s difficult to go back in a conversation and add to or fix something you’ve said earlier. But it’s not just a matter of rewriting the past – you can also go back and see what people said before they made an update (“rewind the Wave”).

Is Google heading towards unified communications with Wave, and is it possible that it will combine Gmail, Wave and Google Voice in the future?

I guess Wave could be one portion of a UC suite but I think the Wave idea doesn’t encompass all of the parts…

Do you think Google is looking to pull in conversations the way FriendFeed, Facebook and Twitter do? If so, will it succeed?

Yes, certainly Google have had interests in this area with their acquisition of Jaiku some time back (everyone assumed this would lead to a competitor to Twitter; most recently they made the Jaiku engine available as open source). I am not sure if Google intends to make available a single entry point to all public waves that would rival Twitter or Facebook status updates, but if so, it could be a very powerful competitor.

Is it possible that Wave will become as widely used and ubiquitous as Gmail?

It will take some critical mass to get it going; integrating it into Gmail could be a good first step.

And finally – is the game changing in your opinion?

Certainly, we’ve moved from frequently updated blogs (every few hours/days) to more frequently updated microblogs (every few minutes/seconds) to being able to not just update in real-time but go back and easily add to / update what’s been said any time in the past. People want the freshest content, and this is another step towards not just providing content that is fresh now but a way of freshening the content we’ve made in the past.

New version of Recovery.gov launches

The new version 2.0 of the Recovery.gov site was launched today. I’ve been tracking recent happenings on Twitter and elsewhere, so here are some recent developments:

  • The new site is available here.
  • Rusty Talbot from Synteractive, the developers of Recovery.gov version 2.0, has posted a thread on the Sunlight Labs discussion forum asking for input from citizen developers regarding ways to make data available from Recovery.gov.
  • Nextgov have a great summary article about Recovery.gov’s call for data provision ideas with some interesting quotes from the individuals concerned.
  • Raymond Yee, a colleague of Erik Wilde and Eric Kansa at Berkeley, has published an interesting blog post with advice for Recovery.gov. The three of them co-authored the report “Proposed Guideline Clarifications for American Recovery and Reinvestment Act of 2009” earlier this year.
  • You can now follow Recovery.gov on Twitter.
  • The Recovery Accountability and Transparency Board (RATB) also has a YouTube account. The first video message was posted featuring RATB Chairman Earl Devaney.
  • From a SIOC perspective, I thought this quote from Nextgov referencing Chairman Devaney’s statement was interesting, as there is an opportunity to semantically link the social media contributions from many users to the financial grants in question:
    Board Chairman Earl Devaney will appeal to his so-called citizen inspectors general — or anyone interested in rooting out fraud, waste and abuse — through social media outlets, including the video-sharing site YouTube. Individuals who would like to broadcast miniblog entries about the site through Twitter can do so using hash tag #ARRA. “Our goal here is to provide the facts and the tools for the public to decide whether that is a good use of the public’s money,” Devaney said in an interview with Nextgov earlier in September. “We’re going to put the facts and the tools up so that people can mash it up.” The functions should allow citizens to draw useful observations, such as, “That’s the mayor’s brother in law — I’m going to call the Recovery Board,” he said.


I previously gave some initial ideas about how grant feed data (following the Wilde / Kansa / Yee model) can be linked with user contributions using SIOC and FOAF. See this picture for an example. We also have a recently-created Linked Government Data initiative at DERI, NUI Galway carrying out research in this area.

John Breslin's Blog
