This work is licensed under a Creative Commons License. This copyright applies to the SIOC Ontology: Applications and Implementation Status and accompanying documentation and does not apply to SIOC data formats, ontology terms, or technology.

Regarding underlying technology, SIOC relies heavily on W3C's RDF technology, an open Web standard that can be freely used by anyone.

Abstract

The SIOC (Semantically-Interlinked Online Communities) Core Ontology provides the main concepts and properties required to describe information from online communities (e.g., message boards, wikis, weblogs, etc.) on the Semantic Web. This document contains a brief overview of various SIOC implementations and applications.

Status of this document

NOTE: This section describes the status of this document at the time of its publication. Other documents may supersede this document.

1. Introduction

All SIOC data uses RDF as an underlying data format, and can be created and processed as such. Various applications have been designed to use SIOC by taking some of its unique aspects into account. In this document, we outline concrete implementations and applications that use SIOC data. (A complete, up-to-date list of SIOC implementations is maintained at the SIOC applications page.)

SIOC data can also be processed and used by many generic Semantic Web applications capable of using RDF. A full list of these applications is outside the scope of this document. For more information about Semantic Web applications and libraries, please see the "Where do I find tools for Semantic Web development?" section of the Semantic Web FAQ.

2. Creating SIOC data

SIOC is designed to export information about the content and structure of online community websites in a machine-readable form. Thus, various tools, exporters and services have been created to expose SIOC data from existing online communities.

2.1 SIOC APIs

SIOC Export API for PHP. In order to help people to write SIOC exporters, a SIOC Export API for PHP has been designed, offering an easy way to manipulate SIOC data through PHP objects and methods, and rendering content in an RDF/XML file. The API creates and exports SIOC concepts about the authors (sioc:User plus foaf:Person), posts and comments (sioc:Post and sioc_t:Comment), and the structure of the website (sioc:Site and sioc:Forum).
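To make the shape of such an export concrete, the following is a minimal sketch in Python rather than PHP. It does not reproduce the SIOC Export API for PHP; the function name `export_post`, the example URIs and the choice of `dcterms:title` for the post title are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SIOC = "http://rdfs.org/sioc/ns#"
DCTERMS = "http://purl.org/dc/terms/"

# Ask ElementTree to use readable prefixes when serialising.
ET.register_namespace("rdf", RDF)
ET.register_namespace("sioc", SIOC)
ET.register_namespace("dcterms", DCTERMS)


def export_post(post_uri, title, creator_uri, forum_uri):
    """Serialise one post as RDF/XML (hypothetical helper, not the PHP API)."""
    root = ET.Element(f"{{{RDF}}}RDF")
    post = ET.SubElement(root, f"{{{SIOC}}}Post", {f"{{{RDF}}}about": post_uri})
    ET.SubElement(post, f"{{{DCTERMS}}}title").text = title
    # Link the post to its author and to the forum that contains it.
    ET.SubElement(post, f"{{{SIOC}}}has_creator", {f"{{{RDF}}}resource": creator_uri})
    ET.SubElement(post, f"{{{SIOC}}}has_container", {f"{{{RDF}}}resource": forum_uri})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_post(
        "http://example.org/blog/1",
        "Hello",
        "http://example.org/user/alice",
        "http://example.org/blog",
    ))
```

A real exporter would also describe the site, the forum and the user themselves; the sketch only shows how one post is linked to them.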

SIOC API for Java. A SIOC API for Java has been created, based on semweb4j. For each object in the SIOC ontology, this API generates classes with links between the objects realised as Java properties.

SIOC API for Perl. Version 1.0 of a SIOC API for Perl has been released on CPAN. Both the CPAN page for the API and a description of the project itself are available online.

2.2 Weblog, forum and CMS exporters

Different SIOC exporters have been written for a number of popular weblogs, forums and content management systems (CMS). All of these exporters feature RDF auto-discovery links for SIOC data, and are available via open-source licences.

WordPress SIOC Exporter. WordPress is a popular blogging platform based on PHP/MySQL. The WordPress SIOC Exporter allows the production of SIOC metadata from WordPress-based blogs, by simply installing two plugin files in the plugins folder and enabling the SIOC plugin from the WordPress control panel. This plugin is the most widely used SIOC exporter.

Dotclear SIOC Exporter. Dotclear is a widely-used French blogging platform. The Dotclear SIOC Exporter produces SIOC metadata using the SIOC export API for PHP, and exports information about the blog itself, the blog users, posts and comments.

b2evolution SIOC Exporter. b2evolution is a multi-blog platform that evolved from the same roots as WordPress (from b2/cafelog). An early version of a b2evolution SIOC Exporter has been built upon the SIOC export API for PHP.

Drupal SIOC Exporter. There is also a Drupal SIOC Exporter, which can be used to export SIOC data from Drupal sites, including blogs and forums. As Drupal can be used as a multi-user blogging platform, the plugin exports all blogs and all user accounts, so that each post can be clearly attributed to its author.

phpBB SIOC Exporter. phpBB is one of the most used open-source message board platforms. A phpBB SIOC Exporter has been written that produces SIOC metadata about forums, posts and the users that created them.

vBulletin SIOC Exporter. The vBulletin SIOC Exporter exports SIOC and FOAF data from vBulletin discussion forums. It includes a plugin that allows users to opt to export the SHA1 of their e-mail address (and other inverse functional properties) and their network of friends via vBulletin's user control panel.
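As a rough illustration of what exporting the SHA1 of an e-mail address involves, the sketch below assumes the exporter follows the FOAF convention for foaf:mbox_sha1sum, in which the hash is computed over the full mailto: URI rather than the bare address. The function name `mbox_sha1sum` is hypothetical, and this is not the plugin's own code.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA1 hex digest of the mailto: URI for an e-mail address.

    Assumption: the hash covers the "mailto:" prefix, as in FOAF's
    foaf:mbox_sha1sum. The address is not case-normalised here.
    """
    mailbox_uri = "mailto:" + email.strip()
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(mbox_sha1sum("alice@example.org"))
```

Publishing the digest instead of the address lets two data sources recognise the same person without revealing the mailbox itself.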

2.3 Other exporters

OpenLink Data Spaces Modules. There are a number of modules for the OpenLink Data Spaces (ODS) platform that each export SIOC metadata, including ODS-Blog, ODS-Wiki, ODS-Bookmarks, ODS-AddressBook, ODS-Calendar, ODS-Polls, ODS-Gallery (for photos), ODS-Feeds (for feed aggregation and exposure via SIOC), and ODS-Discussion (for comments across blogs, wikis or any other data space that supports some form of commenting).

Talk Digger. Talk Digger is a web service that helps people to find, follow and enter conversations on the Web, in order to see who is linking to a specific web page. Users can create a personal profile, define their interests, make new friends, track conversations, leave comments in conversations, etc. All data from this service is exported in RDF/XML using SIOC.

SWAML. SWAML is an exporter for mailing list content in Semantic Web format. SWAML reads a collection of e-mail messages stored in a mailbox (from a mailing list compatible with RFC 4155) and generates an RDF description of it. It is written in Python, using SIOC as the main ontology to represent a mailing list in RDF. SWAML is also available as a Debian package (in testing).
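The general idea of turning a mailbox into SIOC descriptions can be sketched with Python's standard `mailbox` module. This is not SWAML's code: the URI scheme in `message_uri`, the base URI, and the decision to map the In-Reply-To header to sioc:reply_of are assumptions made for illustration, and triples are returned as plain tuples rather than serialised RDF.

```python
import mailbox
from urllib.parse import quote

SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def message_uri(base, message_id):
    """Mint a URI for a message from its Message-ID (hypothetical scheme)."""
    return base + quote(message_id.strip("<> \t\r\n"), safe="")


def mbox_to_triples(path, base="http://example.org/list/"):
    """Read an mbox file and return (subject, predicate, object) tuples."""
    triples = []
    box = mailbox.mbox(path)
    try:
        for msg in box:
            message_id = msg["Message-ID"]
            if not message_id:
                continue  # no stable identifier, so skip the message
            post = message_uri(base, message_id)
            triples.append((post, RDF_TYPE, SIOC + "Post"))
            parent = msg["In-Reply-To"]
            if parent:
                # Thread structure: this post replies to its parent.
                triples.append((post, SIOC + "reply_of", message_uri(base, parent)))
    finally:
        box.close()
    return triples
```

Calling `mbox_to_triples("archive.mbox")` on a mailing list archive would yield one sioc:Post per message plus a sioc:reply_of link for each reply.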

IRC2RDF. An RDF converter for IRC has been created that exports metadata in Turtle format, and SIOC is being used as one of the main representation formats.

Sioku. Jaiku is a microblogging site for which the Sioku Jaiku2RDF service has been created using Ruby on Rails. SIOC and FOAF are used as the main vocabularies for representing streams of microblog entries and for describing people and their contacts respectively.

OpenQabal. OpenQabal, an open source social networking and collaboration platform, is to include SIOC support, allowing Roller, JavaBB and other packaged component applications to become part of the SIOC-o-sphere.

memoQ. SIOC data is now being produced by memoQ from the National Institute of Informatics, Japan. memoQ allows conference participants to more easily ask their questions at academic conferences, by inputting their memos on a web page set up specifically for each presentation. SIOC terms such as Forum, Post and User are being used to export this data.

3. Using SIOC data

3.1 Querying SIOC data

All SIOC data can be queried using SPARQL, once the SIOC Core Ontology and Module Namespaces are defined in the SPARQL query.
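As a minimal sketch of what defining the namespaces in a query looks like, the Python snippet below prepends PREFIX declarations to a query body. The helper `with_prefixes` and the example query are illustrative assumptions; the namespace URIs are those of the SIOC Core Ontology, the SIOC Types module and Dublin Core Terms.

```python
PREFIXES = {
    "sioc": "http://rdfs.org/sioc/ns#",
    "sioc_t": "http://rdfs.org/sioc/types#",
    "dcterms": "http://purl.org/dc/terms/",
}


def with_prefixes(body, prefixes=PREFIXES):
    """Prepend one PREFIX declaration per namespace to a SPARQL query body."""
    header = "\n".join(f"PREFIX {name}: <{uri}>" for name, uri in prefixes.items())
    return header + "\n" + body.strip() + "\n"


# Example query: posts with their creators and, where present, their titles.
QUERY = with_prefixes("""
SELECT ?post ?title ?creator
WHERE {
  ?post a sioc:Post ;
        sioc:has_creator ?creator .
  OPTIONAL { ?post dcterms:title ?title }
}
LIMIT 10
""")

if __name__ == "__main__":
    print(QUERY)
```

The resulting string can be sent to any SPARQL endpoint that holds SIOC data.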

OpenLink Data Spaces. As mentioned in section 2.3, ODS exposes all its data as real or virtual RDF graphs via its Virtuoso-based quad store. The ODS SIOC reference wiki page describes how various application realms are mapped to SIOC, along with an extensive collection of SPARQL query examples and live demonstration links for interacting with the SIOC instance data.

3.2 Crawling SIOC data

SIOC Crawler. SIOC data can be collected by a crawler that traverses the Web and retrieves any SIOC data it finds. The crawler starts with a list of "seed" SIOC URLs and follows the rdfs:seeAlso links that point to more SIOC and RDF data. This is a generic principle for crawling RDF documents, so a generic RDF crawler could be used. The SIOC Crawler, however, has additional knowledge about the structure of SIOC data, which allows it to offer more advanced functionality, e.g., incremental retrieval of new SIOC data in threads.
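The generic seed-and-follow principle can be sketched as a breadth-first traversal. This is not the SIOC Crawler itself: the function names are hypothetical, documents are assumed to be RDF/XML, the `fetch` callable is supplied by the caller and returns the document text or None, and relative URLs are not resolved.

```python
import xml.etree.ElementTree as ET
from collections import deque

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def see_also_links(rdf_xml):
    """Return the targets of all rdfs:seeAlso links in an RDF/XML document."""
    root = ET.fromstring(rdf_xml)
    targets = []
    for element in root.iter(f"{{{RDFS}}}seeAlso"):
        target = element.get(f"{{{RDF}}}resource")
        if target:
            targets.append(target)
    return targets


def crawl(seeds, fetch, limit=100):
    """Breadth-first crawl from the seed URLs, following rdfs:seeAlso links."""
    documents = {}
    seen = set()
    queue = deque(seeds)
    while queue and len(documents) < limit:
        url = queue.popleft()
        if url in seen:
            continue  # already visited, avoids loops between documents
        seen.add(url)
        body = fetch(url)
        if body is None:
            continue  # unreachable document
        documents[url] = body
        queue.extend(see_also_links(body))
    return documents
```

In practice `fetch` would perform an HTTP request; passing it in as a parameter keeps the traversal logic independent of the network layer.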

3.3 Browsing SIOC data

SIOC Browser. The SIOC Browser allows people to browse and receive additional information from SIOC data sources or data stores. Browsers can work in two modes - on-the-fly mode and crawler mode - or can use a combination of both (Bojars et al., 2006). The on-the-fly or live browser is a simple and effective way to explore community information available in SIOC. It gives a user-friendly look at the internal structure of the data without requiring the viewers to dive into the more complex RDF/XML syntax. A triple-store interface - that can be plugged into any triple store that offers a SPARQL endpoint - has also been written for browsing crawled SIOC data, providing methods to visualise this data in both textual and graphical ways.

Buxon. Buxon, a sioc:Forum browser, was released as a part of SWAML 0.0.3 and is now available as an independent package. Written in PyGTK, it reads sioc:Forum information from RDF files and shows it as a tree of message threads. It is available as a Debian package.

SIOC Explorer. The SIOC Explorer is a web application which can aggregate posts from community web sites publishing SIOC data. The SIOC Explorer allows you to view and navigate based on all exported RDF data, not just SIOC, by utilising a domain-independent faceted-browsing approach. It has been implemented in Ruby on Rails and the ActiveRDF / SWORD Semantic Web application framework for Rails.

BAETLE. BAETLE (Bug And Enhancement Tracking LanguagE) aims to create a software bug ontology that can be used by various repositories to enable people to query for bugs across these repositories. SIOC is being used to define some of the required terms.

RDFa on Rails. RDFa on Rails is a library of helper methods that assist Ruby on Rails developers in producing RDFa data. SIOC terms are used to describe blog posts in this library.

IkeWiki. IkeWiki is a semantic wiki for knowledge engineering. IkeWiki allows discussions (following a forum style with threaded views) to be attached to wiki pages. These discussions are represented using the SIOC ontology, which allows one to use semantic queries to investigate the structure of any discussion.

int.ere.st. int.ere.st provides metadata creation and sharing support across online communities that use tag data. int.ere.st aims to build a tag-mediated society based on Semantic Web technologies, and resources in the site are based on RDF vocabularies including SIOC, FOAF, and SCOT.

OpenLink Virtuoso AMI. OpenLink have released an EC2 / S3 Amazon Image-version of their Virtuoso product, which includes SIOC support: "your blogs, wikis, bookmarks, etc. are based on the SIOC ontology (think open social graph++)".

SIOC Comments Widget. A JavaScript SIOC-flavoured comments widget has been described by Talis that allows comments functionality to be added to a section of any page identified with a "sioc:has_reply" property, using a short piece of embedded widget code. This is powered by Talis' Convert service, which enables the creation of client-side Semantic Web applications.

3.5 Reusing SIOC data

IKHarvester. IKHarvester, a component for the Didaskon curriculum assembly framework, collects data from semantic social spaces (wikis, blogs, etc.) and provides it to Didaskon as informal learning objects (LOs). SIOC data exported from blogs and wikis is gathered and mapped to learning object metadata (LOM) with IKHarvester.

notitio.us and JeromeDL. notitio.us, a social bookmarking and knowledge harvesting system, provides SIOC metadata support through SSCF (social semantic collaborative filtering). The SSCF functionality can be seen in action at notitio.us/bookmarks, which can also display the associated SIOC data from bookmarked sites, forums and posts. This functionality is also implemented in the JeromeDL semantic digital library system.

4. SIOC utilities

Semantic Radar. To facilitate end-user access to SIOC data, the Semantic Radar - a Firefox browser extension - detects the presence of SIOC, FOAF and DOAP data in a web page and alerts the user, who can then browse the data in an online SIOC browser.
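Detection of this kind can be sketched by scanning a page's HTML for RDF auto-discovery links. The sketch below is not Semantic Radar's code (which is a browser extension); it assumes that auto-discovery links take the form of a `link` element with `rel="meta"` and `type="application/rdf+xml"`, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collect (title, href) pairs for RDF auto-discovery <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        # Assumed convention: rel="meta" with an RDF/XML media type.
        if "meta" in rel and mime == "application/rdf+xml" and href:
            self.links.append((attributes.get("title") or "", href))


def find_rdf_links(html):
    """Return the RDF auto-discovery links found in an HTML document."""
    finder = RdfLinkFinder()
    finder.feed(html)
    return finder.links
```

The `title` attribute, where present, gives a hint as to whether the linked document holds SIOC, FOAF or DOAP data, although only fetching and parsing the document can confirm it.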

PingTheSemanticWeb. The Semantic Radar application can also ping the PingTheSemanticWeb (PTSW) website, an online service that collects, stores and distributes links to RDF documents for every ping, and this is an efficient way to find and index SIOC data over the Web (Bojars et al., 2007). Through this index, external services such as doap:store or Sindice can use the PTSW service to find data.