The Semantic Web: Web 3.0?
A key feature of Web 2.0 sites is community-contributed content that may be tagged and can be commented on by others. That content can be virtually anything: blog entries, board posts, videos, audio, images, wiki pages, user profiles, bookmarks, events, etc. I fully expect to see a site with live multiplayer video games appearing in little browser-embedded windows just as we already have YouTube for videos, with running commentaries going on about the games in parallel. Tagging is common to many Web 2.0 sites – a tag is a keyword that acts like a subject or category for the associated content. Then we have folksonomies: collaboratively generated, open-ended labeling systems that enable Web 2.0 users to categorise content using tags, and thereby to visualise popular tag usage via “tag clouds” (visual depictions of the tags used on a particular website, like a weighted list in visual design).
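To make the “weighted list” idea concrete, here is a minimal sketch of how tag-cloud sizes might be derived from tag counts. The function name, the five-step size scale and the sample tags are all assumptions made for illustration; real sites use their own scaling rules.

```python
from collections import Counter


def tag_cloud_weights(tagged_items, min_size=1, max_size=5):
    """Map each tag to a display size between min_size and max_size.

    tagged_items is a list of tag lists, one per piece of content.
    The linear scaling used here is an illustrative choice only.
    """
    counts = Counter(tag for tags in tagged_items for tag in tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    weights = {}
    for tag, n in counts.items():
        # If every tag is equally popular, fall back to the smallest size.
        scale = (n - lo) / span if span else 0.0
        weights[tag] = min_size + round(scale * (max_size - min_size))
    return weights


# Hypothetical tags attached to three blog posts.
posts = [
    ["semanticweb", "foaf"],
    ["semanticweb", "web2.0"],
    ["semanticweb", "web2.0", "tagging"],
]
weights = tag_cloud_weights(posts)
```

The most-used tag gets the largest size and the least-used tags the smallest; a page template would then map these sizes to font sizes.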
Folksonomies are one step in the same direction as what some have termed Web 3.0, or the Semantic Web. (The Semantic Web often uses top-down controlled vocabularies to describe various domains, but can also utilise folksonomies and therefore develop more quickly, since folksonomies are a great big distributed classification system with low entry costs.) As Tim Berners-Lee et al. said in Scientific American in 2001, the Semantic Web is “an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation”. You probably know that the word “semantic” relates to meaning, and therefore the Semantic Web is one that is able to describe things in a way that computers can better understand (yes, computers are just like Ginger in the Far Side). Some of the more popular Semantic Web vocabularies include FOAF (Friend-of-a-Friend, for social networks) and Geo (for geographic locations).
The Semantic Web consists of metadata that is associated with web resources, together with vocabularies or “ontologies” that describe what this metadata is and how it all relates. SEO experts have known that adding metadata to their websites can often improve the percentage of relevant document hits in search engine result lists, but it is hard to persuade web authors to add metadata to their pages in a consistent, reliable manner (either due to perceived high entry costs or because it is too time consuming). For example, few web authors make use of the simple Dublin Core metadata system, even though the use of DC meta tags can increase their pages’ prominence in search results.
The main power of the Semantic Web lies in interoperability and in combinations of vocabulary terms. Interoperability and increased connectivity are possible through a commonality of expression, and vocabularies can be combined and used together: e.g. a description of a book using Dublin Core metadata can be augmented with specifics about the book author using the FOAF vocabulary. Vocabularies can also be easily extended (modules, etc.). Through this, truly intelligent search with more granularity and relevance is possible: e.g. a search can be personalised to an individual by making use of their identity profile and relationship information.
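As a rough illustration of combining vocabularies, the sketch below describes a book with Dublin Core terms and its author with FOAF terms, using plain Python tuples as subject-predicate-object statements. The two namespace URIs are the published ones for Dublin Core and FOAF; the example.org identifiers, the names and the helper functions are hypothetical, and a real application would more likely use an RDF library than hand-rolled tuples.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical identifiers for a book and its author.
book = "http://example.org/books/1"
author = "http://example.org/people/alice#me"

# Each statement is a (subject, predicate, object) triple.
triples = {
    # Dublin Core describes the book...
    (book, DC + "title", "An Example Book"),
    (book, DC + "creator", author),
    # ...and FOAF adds detail about the person who wrote it.
    (author, FOAF + "name", "Alice Example"),
    (author, FOAF + "homepage", "http://example.org/people/alice"),
}


def objects(subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def author_names(book_uri):
    """Follow dc:creator from the book, then foaf:name from each creator."""
    names = []
    for creator in objects(book_uri, DC + "creator"):
        names.extend(objects(creator, FOAF + "name"))
    return names
```

Because both vocabularies refer to the same author identifier, `author_names(book)` can hop from the book description to the person description without either vocabulary knowing about the other.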
The challenge for the Semantic Web is related to the chicken-and-egg problem: it is difficult to produce data without interesting applications, and vice versa. The Semantic Web can’t work all by itself, because if it did it would be called the “Magic Web”. For example, it is not very likely that you will be able to sell your car just by putting a Semantic Web file on the Web. Society-scale applications are required, i.e. consumers and processors of Semantic Web data, Semantic Web agents or services, and more advanced collaborative applications that make real use of shared data and annotations.
The Semantic Web effort is mainly directed towards producing standards and recommendations that will interlink applications, while the primary Web 2.0 meme, as already discussed, is about providing user applications. These are not mutually exclusive: with a little effort, many Web 2.0 applications can and do use Semantic Web technologies to great benefit, and this picture from Nova Spivack shows some evolving areas where these two streams have come together and will continue to do so: semantic blogging, semantic wikis, semantic social networks and the Semantic Desktop all fall in the realm of what he terms the Metaweb, or “social semantic information spaces”. Semantic MediaWiki, for example, has already been commercially adopted by Centiare.

There are also great opportunities for mashing together both Web 2.0 data or applications and Semantic Web technologies – just use your imagination! Dermod Moore wrote of one such Web 2.0 application mashing for a hobby: a Scuttle + Gregarius + Feedburner + Grazr hybrid that allows one to aggregate one’s favourite blogs or other content on a particular topic and then to annotate bookmarks to the most interesting content found. Bringing this a step further, we could have a “semantic social collaborative resource aggregator”. Okay, it needs a better name, like “scraggy” or something. In this hypothetical system:
- Social network members specify their favourite content sources
- You and your friends specify any topics of interest
- You specify friends whose topic lists you value
- A metadata aggregator collects content from sites you and your friends like (which may be human tagged, or could be auto-tagged)
- It highlights content that may be of interest to you or your friends
- If nothing of interest is currently available, content sources may have semantically-related sources in other communities for secondary content acquisition and highlighting
- You bookmark and tag the interesting content, and share!
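The core of the steps above can be sketched in a few lines of Python. Everything here is hypothetical: the `Member` and `Item` classes, the field names and the matching rule (an item is highlighted if it comes from a source that you or a valued friend likes and shares at least one tag with your combined topic lists). The secondary acquisition step and the bookmarking step are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    sources: set = field(default_factory=set)   # favourite content sources
    topics: set = field(default_factory=set)    # topics of interest
    trusted_friends: list = field(default_factory=list)  # friends whose topic lists you value


@dataclass
class Item:
    url: str
    source: str
    tags: set  # human-assigned or auto-generated tags


def topics_of_interest(member):
    """Combine a member's own topics with those of their valued friends."""
    topics = set(member.topics)
    for friend in member.trusted_friends:
        topics |= friend.topics
    return topics


def highlight(member, items):
    """Return URLs from liked sources whose tags match any topic of interest."""
    sources = set(member.sources)
    for friend in member.trusted_friends:
        sources |= friend.sources
    wanted = topics_of_interest(member)
    return [i.url for i in items if i.source in sources and i.tags & wanted]


# Hypothetical members and aggregated content.
mary = Member("Mary", sources={"blogB"}, topics={"foaf"})
john = Member("John", sources={"blogA"}, topics={"semanticweb"}, trusted_friends=[mary])
items = [
    Item("http://example.org/1", "blogA", {"semanticweb"}),
    Item("http://example.org/2", "blogB", {"foaf", "tagging"}),
    Item("http://example.org/3", "blogC", {"semanticweb"}),
    Item("http://example.org/4", "blogA", {"cooking"}),
]
highlighted = highlight(john, items)
```

In this toy data, the third item is skipped because nobody in John’s circle follows its source, and the fourth because its tags match nobody’s topics.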
That’s all for now; next time I’ll be talking about the evolution from blogging to structured and semantic blogging.
Comments
6 Responses to “The Semantic Web: Web 3.0?”
[...] The second part of my IIA guest blog posts has now been published: “The Semantic Web: Web 3.0“. [...]
Cracking post John. Have you any opinion of timelines for adoption of the semantic web seeing as it has been talked about for a long time?
It’s hard to say that the Semantic Web will be here in X years time, but rather that parts of it are already here, in different forms.
Microformats are also part of the Semantic Web, in that they are adding semantics to web pages (in a slightly different way to the metadata-ontologies I outlined above), and are now being picked up by browsers like Firefox 2 or 3 (directly or via the Operator plugin). There are other Semantic Web vocabularies like RSS 1.0 that are already widely used, and with FOAF being produced by Opera and LiveJournal there’s a bunch of interesting Semantic Web data currently being generated.
The key is to do this without end-user participation – you’ll create something and in the background some associated metadata (that may not be necessary for display purposes) will be generated, so that you can then reuse that data or connect your page with other resources. I’ll talk about this some more in the blogging post I’ll make soon, but the idea is that you could be browsing someone’s blog post and start reusing things from their blog post in your own applications (user metadata to address book entries, event metadata to your calendaring application, etc.).
I should have given an example of the finely-grained search I hinted at above. At the moment, you can’t answer a question like “find me everything that any person John knows has written on the topic of the Semantic Web” using a conventional search engine. Through a combination of FOAF descriptions of users and their social networks, and of documents created by all those users with associated subjects, such a query becomes possible…
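A minimal sketch of that query, using plain Python tuples as statements, might look like the following. The FOAF and Dublin Core namespace URIs are the published ones; the `ex:` identifiers, the sample data and the function name are made up for illustration, and a real system would more likely run a query language such as SPARQL over an RDF store.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical statements gathered from FOAF profiles and document metadata.
triples = [
    ("ex:john", FOAF + "knows", "ex:mary"),
    ("ex:john", FOAF + "knows", "ex:tom"),
    ("ex:post1", DC + "creator", "ex:mary"),
    ("ex:post1", DC + "subject", "Semantic Web"),
    ("ex:post2", DC + "creator", "ex:tom"),
    ("ex:post2", DC + "subject", "Cooking"),
    ("ex:post3", DC + "creator", "ex:anne"),
    ("ex:post3", DC + "subject", "Semantic Web"),
]


def written_by_friends_on(person, topic):
    """Find documents on a topic created by anyone the person knows."""
    # Step 1: who does this person know (foaf:knows)?
    friends = {o for s, p, o in triples if s == person and p == FOAF + "knows"}
    # Step 2: which documents have the requested subject (dc:subject)?
    on_topic = {s for s, p, o in triples if p == DC + "subject" and o == topic}
    # Step 3: keep on-topic documents whose creator is one of those friends.
    return sorted(
        s for s, p, o in triples
        if p == DC + "creator" and o in friends and s in on_topic
    )


results = written_by_friends_on("ex:john", "Semantic Web")
```

Here only `ex:post1` qualifies: `ex:post2` is by a friend but on another topic, and `ex:post3` is on topic but written by someone John does not know.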
Following a comment from Paul Walsh at Segala, I’ve updated the table to show some actual Semantic Web application names rather than generic categorisations like “Semantic Wikis”:
Semantic Blogs: semiBlog, Haystack, Semblog, Structured Blogging
Semantic Wikis: Semantic MediaWiki, SemperWiki, Platypus, dbpedia, Rhizome
Semantic Search: SWSE, Swoogle, Intellidimension
Semantic Digital Libraries: JeromeDL, BRICKS, Longwell
Semantic Forums and Community Portals: SIOC, OpenLink DataSpaces
Semantic Social Networks: FOAF, PeopleAggregator
Semantic Social Information Spaces: Nepomuk, Gnowsis
[...] After all, there is one thing that computers and I share in common, neither of us understand the language so well… I found it amusing that JohnBreslin, in his blog entry on the Irish Internet Association site, uses the above cartoon to depict the lack of interoperability between certain software agents (see his blog entry here). The notion of better defining information so that the computer has a more precise understanding thereby enabling human and computers to work even more effectively together is novel to me. [...]