Introduction to web standards

February 27, 2007 · Posted in Uncategorized · Comment 

Hello again. I’m Laurence Veale, senior usability analyst with iQ Content. This is my second post as guest blogger here on the IIA blog. Last week, I posted an introduction to accessibility. This time around I’m talking web standards.

Rather than go into the history and talk about the beginnings of HTML, the browser wars and the start of the web standards movement, I’ll direct you over to the comprehensive article, “Developing with Web Standards” on 456bereastreet.com, one of my favourite blogs on web standards and accessibility.

What are web standards?

Essentially, web standards are the best way to code the front-end of a web site. Starting with your HTML, it should be semantic. This means that any tags in the HTML should convey document structure or meaning only, and little or nothing else.

The principle behind this is that your website needs to be perceived by non-visual devices. For this reason, the meaning is far more important than the appearance. Appearance can then be easily applied later using Cascading Style Sheets (CSS).

Semantics for dummies

To explain the difference between presentation and structure, I sometimes use Microsoft Word. Within Word, I wanted to make some headings using the font size tool.

MS Word Content
changing the appearance

adding a table of contents

error in table of contents

Above, I’ve just changed the appearance of the text using the font size menu.

Then, in the third screenshot, I’ve tried to create a table of contents. However, Microsoft Word has thrown a wobbly because it can’t interpret any meaning from the appearance of my text.

So, while some of the text may look like a heading, it isn’t one, and Word can’t generate a table of contents as a result.

Alternatively, I can apply “meaning” to the same piece of text. Instead of merely making the text bigger, I’ve applied the “Heading 1” style. Then, when it comes to building a table of contents within my Word document, it works.

adding headings

successful addition of a table of contents
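
The same idea carries over to HTML. As a rough sketch (not any particular tool's implementation), the Python below uses the standard library's `html.parser` to build an outline from heading tags. The two HTML snippets are made-up examples: one fakes a heading with font markup, the other uses real `h1` and `h2` elements. Only the second gives a program anything to build a table of contents from.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element in a document."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a heading element.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def build_outline(html):
    """Return the heading outline of an HTML string."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical snippets: same appearance in a browser, very different meaning.
presentational = '<p><font size="6"><b>Our services</b></font></p><p>Some text.</p>'
semantic = "<h1>Our services</h1><p>Some text.</p><h2>Usability testing</h2>"

print(build_outline(presentational))  # []
print(build_outline(semantic))        # [(1, 'Our services'), (2, 'Usability testing')]
```

A screen reader or search engine is in much the same position as this script: it can only work with the structure that the markup actually declares.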

In addition to structure and style, there is a third element to an individual web page: behaviour, most commonly implemented using JavaScript. The web standards approach, commonly known as unobtrusive JavaScript, means keeping JavaScript out of your HTML.

separation of structure, appearance and behaviour

The end result of web standards: less code

Less code is better code, plain and simple. Less code to write means fewer bugs. Less code to download means quicker downloads and lower bandwidth costs. Less, in this case, is definitely more and can lead to tangible and quantifiable benefits.

Clear business benefits of web standards

There are clear business benefits in adopting web standards. Indeed, more and more of the RFPs (requests for proposals) that we read are asking for web standards explicitly, and for good reason. But what are the real benefits?

  • SEO: Better placement in search engine results. Search engines like more meaningful code: the better structured your document, the better it will rank (all other factors being equal)
  • Quicker downloads for your users: less code is generally quicker code which fits nicely into the principles of universal design. Remember, not everyone is on broadband or on a PC/laptop for that matter
  • Reduced bandwidth costs for your business: less code means less load on your servers and your bandwidth. As I’ll discuss next time, even a small saving in code can save hugely on heavily trafficked websites.
  • Cheaper maintenance: Faster and easier changes can be made to the look and feel of an entire website.
  • Closer to an “accessible” website: while coding with web standards doesn’t guarantee an accessible web site, it can get you a good deal of the way towards one.

What’s next?

Next time, I’ll try and quantify the benefits of web standards, using the homepage of a major Irish institution as an example. Stay tuned.

Hopefully, I’ve whetted your appetite on web standards. If you want to find out more, then check out some of the resources and books listed below.

Further resources on web standards

Books on web standards

Introduction to web accessibility and universal design

February 23, 2007 · Posted in Uncategorized · 2 Comments 

Hi, I’m Laurence Veale. I’m a senior usability analyst at iQ Content. I’ll be posting a series on accessibility. First up, an introduction to accessibility.

Before I try and define accessibility, let’s have a go at defining disability, something I find very difficult to do. According to the CSO, in 2002, “disabled” people accounted for 8.3% of the population. But is that the full picture?

Let’s take three categories:

  • Vision
  • Mobility
  • Cognitive

Vision impairments range from mild impairment, colour blindness and glaucoma through to full blindness. Similarly, there are those who have little or no mobility at all, and those who may be temporarily immobile due to repetitive strain injury or a sporting injury.

“I’ve spent the past few weeks trying to use my computer mostly via keyboard and voice control, trying to avoid touching my mouse (recurring overuse injury in my elbow)” Donna Maurer, Australia

Already it’s becoming quite clear that it’s not a black and white issue and it’s quite difficult to label groups of people on whether they are “abled” or “disabled”.

So there’s a spectrum, and to some degree we’re all on it. The main disability we all suffer from is ageing. And it’s terminal. But before we get to the terminal stage, for most of us our eyesight will start to deteriorate, we’ll have reduced mobility through arthritis, and we’ll lose some of our cognitive capacity. Not a lot to look forward to!

Tim Berners-Lee, credited with inventing the web, had this to say in an interview on the British Computer Society website:

“Another important area of professionalism is accessibility awareness. Everyone should be accommodated, especially when around 20 per cent of the population have special requirements.

In fact, Microsoft said recently that nearly 50 per cent of people need to make some sort of adjustment to their system to interact with it. Having turned 50, I’m very aware of receiving email with very small fonts – people don’t want to use their spectacles to look at a Web page!”

Where does web accessibility fit in to all this?

There is one school of thought that web accessibility is all about catering for disabled people, but as I mentioned, “disabled” is very hard to define as a single category or demographic. So how do you cater for everybody?

Universal design

However, there’s another school of thought, called universal design, the idea (or some would argue “ideal”) of designing for everybody.

In practical terms and from the perspective of websites, in addition to incorporating the needs of disabled people, universal design could include, amongst others:

Your website has the potential to take away the physical barriers that exist in the world of bricks and mortar. Take one simple example: thanks to the web, blind visitors can read the newspaper on the day it is published (provided the website is designed correctly).

Dr. Mark Magennis, director of the Centre for Inclusive Technology (CFIT), part of the National Council for the Blind of Ireland, spoke at our Boot Camp last year and described universal design in the simple terms of:

“Accessible design is good design”

Finally, no introduction to accessibility would be complete without the ubiquitous quote from Tim Berners-Lee:

“The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect”

What’s next in the series?

In my next few posts, I hope to cover

  • Web standards & accessibility: better for everyone
  • Web standards case study: review of a major Irish institution’s homepage and the business case for accessibility
  • Accessibility & the Law
  • Rapid-fire accessibility audits: how to assess the accessibility of your own site
  • And anything you’d like me to cover? Email me at laurence.veale@iqcontent.com or leave a comment below.

Usability basic: Fitts’s Law

February 22, 2007 · Posted in Uncategorized · Comment 

Howdy IIA readers. This is the first guest post from iQ Content. We’re a web usability and design consultancy based in Dublin. For the next couple weeks we’ll be posting about usability and accessibility. Our blog is a group effort, and this guest blogging will be, too. So you’ll be hearing from me (Brian), Laurence (our blogging champ) and hopefully a few others.

If you find any of this interesting, have a gawk at our regular blog — iqcontent.com/blog.

This first post is about a nice little tutorial on Fitts’s Law. If you haven’t heard of it before, Fitts’s Law was established back in 1954 and predicts how quickly you can point at and click an onscreen element, based on how far away it is and how big it is. Anyone in web design should be at least superficially familiar with this law.

The tutorial actually includes interactive “experiments” where you get to click on a bunch of circles. True, clicking on circles really loses its appeal quickly, even immediately. But if you stick with it a couple minutes, you get to actually experience what Fitts’s Law describes. If you’re a bit impatient, just click ahead and read about what you were supposed to experience.


I think the tutorial is nice because by experiencing it, you’re more likely to actually remember it, and because it points out some insights I hadn’t realised before. Like why the Mac menu bars are always at the top of the screen, regardless of whether the application window is full screen.


And though the basics of Fitts’s Law seem plainly obvious, the tutorial also points out where Fitts’s Law is actually counter-intuitive: “The opposite corner of the screen may be easier to target than a spot three pixels away!”
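
For the curious, here is a small sketch of the law in Python. It uses the widely quoted Shannon formulation, MT = a + b · log2(D / W + 1), where D is the distance to the target and W is its width. This is one common form rather than necessarily the one the tutorial uses, and the constants `a` and `b` below are placeholder assumptions (in practice they are measured for a given person and pointing device).

```python
import math


def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts's index of difficulty, in bits."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)


def movement_time(distance, width, a=0.1, b=0.15):
    """Predicted time (seconds) to hit a target.

    a and b are illustrative placeholders, not measured values.
    """
    return a + b * index_of_difficulty(distance, width)


# Same distance, different target sizes: the bigger target is quicker to hit.
small_button = movement_time(distance=800, width=20)
large_button = movement_time(distance=800, width=200)
print(small_button > large_button)     # True

print(index_of_difficulty(300, 100))   # 2.0
```

The usual explanation for the screen-corner quote is that the pointer stops at the edge of the screen, so a corner behaves like a target with a very large effective width, which drives the index of difficulty down even when the distance is large.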

So if you’re looking for a 5 or 10 minute diversion, it’s worth a gander.

What Next for “Yet Another Social Network”?

February 18, 2007 · Posted in Uncategorized · 6 Comments 

Social networking services (SNS) allow a user to create and maintain an online network of close friends or business associates for social and professional reasons. There has been an explosion in the number of online social networking services in the past four years, so much so that the terms YASN and YASNS (Yet Another Social Network[ing Service]) have become commonplace. But these sites do not usually work together and therefore require you to re-enter your profile and redefine your connections when you register for each new site. Let me start with an overview of SNSs.

You may be familiar with the Irish phrase “dúirt bean liom go ndúirt bean léi” (“a woman told me that a woman told her”), used when someone tells someone something and they then tell you – the friend-of-a-friend effect – or with the theory that anybody is connected to everybody else (on average) by no more than six degrees of separation. Where did this number of six degrees come from? A social psychologist called Stanley Milgram conducted an experiment in the late 1960s. Random people from Nebraska and Kansas were told to send a letter (via intermediaries) to a stock broker in Boston. However, they could only give the letter to someone that they knew on a first-name basis. Amongst the letters that found their target, the average number of links was around 5.5 (rounded up to 6).

Some other related ideas include the Erdős number (the number of links required to connect scholars to mathematician Paul Erdős, a prolific writer who co-authored over 1500 papers with more than 500 authors), and the Kevin Bacon game (the goal is to connect any actor to Kevin Bacon, by linking actors who have acted in the same movie). The six degrees idea is nicely summed up by this quote from a film called “Six Degrees of Separation” written by John Guare:

“I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet. The President of the United States, a gondolier in Venice, just fill in the names. [...] It’s not just big names – it’s anyone. A native in a rain forest, a Tierra del Fuegan, an Eskimo. I am bound – you are bound – to everyone on this planet by a trail of six people.”

You’ll often find that even though you follow one route to get in contact with a particular person, when you start talking to them there is another obvious connection between you and them that you didn’t know about previously. This is part of the small-world network theory, which says that most nodes in a network exhibiting small-world characteristics (such as a social network) can be reached from every other node by a small number of hops or steps.

Now we have websites acting as social networking services. The idea behind such services is to make people’s real-world relationships explicitly defined online – whether they be close friends, professional colleagues or just people with common interests. Most SNSs allow you to surf from your list of friends to find friends-of-friends, or friends-of-friends-of-friends for various purposes. SNSs have become the new digital public places of Web 2.0 – just look at the huge takeup of sites such as MySpace, LinkedIn, Bebo and Facebook. Most SNSs allow content generation and sharing, and there is also a gradual transformation of SNSs into public e-markets – either through product promotions or targeted ads.

Social networking services usually offer the same basic functionalities: network of friends listings (showing a person’s “inner circle”), person surfing, private messaging, discussion forums or communities, events management, blogging, commenting (sometimes as endorsements on people’s profiles), and media uploading. Some motivations for SNSs include building friendships and relationships, arranging offline meetings, curiosity (nosiness!) about others, arranging business opportunities, or job hunting. People may want to meet with local professionals, create a network for parents, network for social (dating) purposes, get in touch with a venture capitalist, or find out if they can link to any famous people via their friends.

Before 2002, most people networked using services such as OneList, ICQ or eVite. The first big SNS, in 2002, was Friendster; in 2003, LinkedIn (an SNS for professionals) and MySpace (target audience 20-30 years) appeared; then in 2004 we had orkut (Google’s SNS) and Facebook (by a college student for college students); these were followed by Bebo (target audience 10-20 years) in 2005. The graph on the right shows the growth of these sites over the past few years, according to Alexa. As of today, Bebo is ranked at #162 (even though it has only been around for about a year and a half), Facebook at #34, orkut at #8, MySpace at #6, LinkedIn at #174, and Friendster at #36. I produced the top SNS table (in terms of membership) below from Wikipedia’s list of social networking websites; I can only describe it as indicative, as some of the references for the figures are outdated.

[Image: table of the top social networking services by membership]

There has been plenty of venture capital investment in SNSs, and some big sales as well. Friendster raised $13 million in its early years, Tribe.net got $6.3 million, LinkedIn $4.7 million, and Bebo $15 million. MySpace was sold to News Corporation for $580 million, Friends Reunited to ITV for £120 million, and Facebook received a purported $1 billion offer from Yahoo!; leaked papers suggest that there was actually $1.6 billion available for the deal, but the founder wanted $2 (billion, that is).

Even in a small-sized SNS (the picture on the right is a part of the boards.ie friends network), there can be a lot of links available for analysis, and this data is usually meaningless when viewed as a whole, so one needs to apply some social network analysis (SNA) techniques. Apart from textbooks, there are many academic resources for social networks and SNA. For example, the tool Pajek can be used to drill down into various networks. A common method is to reduce the amount of relevant social network data by clustering. You could choose to cluster people by common friends, by shared interests, by geography, by tags, etc.

In social network analysis, people are modelled as nodes or “actors”. Relationships (such as acquaintanceship, co-authorship, friendship, etc.) between actors are represented by lines or edges. This model allows analysis using existing tools from mathematical graph theory and mapping, with target domains such as movie actors, scientists and mathematicians (as already mentioned), sexual interaction, phone call patterns or terrorist activity. There are some nice tools for visualising these models, such as Vizster by Heer and Boyd, based on the Prefuse open-source toolkit. Others have combined SNA with Semantic Web technologies to determine social behaviour patterns, and MIT Media Lab are conducting mobile SNA research via their “reality mining” project.
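
To make the nodes-and-edges model a little more concrete, here is a minimal Python sketch. It is not how Pajek or Vizster work internally; it simply stores friendships as an undirected graph, counts the hops between two people with a breadth-first search (the “degrees of separation”), and finds common friends, which is one simple basis for clustering. The names and friendships are invented for illustration.

```python
from collections import deque


def build_graph(friendships):
    """Turn (person, person) pairs into an undirected adjacency map."""
    graph = {}
    for a, b in friendships:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degrees_of_separation(graph, start, target):
    """Fewest hops between two people, or None if they are not connected."""
    if start not in graph or target not in graph:
        return None
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in graph[person]:
            if friend == target:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


def common_friends(graph, a, b):
    """People who are directly linked to both a and b."""
    return graph.get(a, set()) & graph.get(b, set())


# Invented example network.
friendships = [
    ("Aoife", "Brian"), ("Brian", "Ciara"), ("Ciara", "Declan"),
    ("Aoife", "Eimear"), ("Eimear", "Ciara"), ("Fionn", "Grainne"),
]
graph = build_graph(friendships)

print(degrees_of_separation(graph, "Aoife", "Declan"))  # 3
print(degrees_of_separation(graph, "Aoife", "Fionn"))   # None
print(sorted(common_friends(graph, "Aoife", "Ciara")))  # ['Brian', 'Eimear']
```

Breadth-first search is used here because it visits people in order of distance, so the first time it reaches the target it has found the shortest chain of acquaintances.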

With all such online interactions, people should limit the amount of personal information they put up (see my previous article on cyberstalking) as they are revealing more and more information on SNSs and other social software sites. There can be personal privacy issues, where sensitive information is revealed unknowingly. Depending on the signup agreements, advertisers and marketers can gain a better understanding of customer behavioural patterns by analysing masses of social network information, using “topic clouds” to show the overall picture (this may be good or bad from your point of view – maybe you want targeted ads showing you offers in areas you are interested in). On the security front, the NSA are using social network analysis technologies for homeland security, and there have been reports from the New Scientist of “automated intelligence profiling” from sites like MySpace based on potentially unreliable information.

So what does the future hold for SNS sites? It has been theorised that many sites only work where there is some “object-centered sociality” in networks, i.e. users are connected via a common object, e.g. their job, university, hobby, etc. In this way, it is probable that people’s SNS methods will move closer towards simulating their real-life social interaction, so that people will meet others through something they have in common, not by randomly approaching each other. In the future, we will no doubt see better interaction methods with friends à la Second Life.

But the main interest I see is in terms of distributed social networks and reusable profiles. There have been a lot of complaints about the walled gardens that are social network sites (and a recent balanced analysis from Danah Boyd). Some of the best SNSs out there would not exist without the walled garden approach, so it’s not all bad, but some flexibility would be nice. Users may have many identities on different social networks, where each identity was created from scratch. A reusable profile would allow a user to import their existing identity and connections (from their own homepage or from another site they are registered on), thereby forming a single global identity with different views (e.g. there is Videntity which works with OpenID and FOAF).

For those who are interested in setting up their own social network, I can suggest the following. First of all, you can try the open source AroundMe and Yogurt systems. Secondly, there are two books of general interest (i.e. not too scientific): “Linked” by Albert-Laszlo Barabasi and “Six Degrees” by Duncan J. Watts (one of the formalisers of the small-world network theory).

That’s all from me for the IIA Blog. I hope you’ve enjoyed my two-week guest series of posts on Web 2.0, and I’ll be back to my own Cloudlands blog again tomorrow… Bye!

To Wikis and Beyond

February 16, 2007 · Posted in Uncategorized · 1 Comment 

Last time, I talked about semantic blogging and how the blogging experience can be augmented by adding structure and metadata about the things you’re blogging about. Today, I’m going to talk about wikis and how they too can benefit from such structure.

Firstly, some history. Many people are familiar with the Wikipedia, but fewer know exactly what a wiki is. In short, a wiki is an “information space” (web or desktop application) that allows users to easily add and edit content, and is especially suited for collaborative writing. Wikis rely on cooperation, on checks and balances of the wiki site members, and a belief in the sharing of ideas. The name comes from a Hawaiian phrase, “wiki wiki”, which means to hasten or go quickly. Ward Cunningham, who now works for Microsoft, created the first wiki in 1995, and I had the pleasure of meeting both Ward and Jimmy Wales (who set up the Wikipedia in 2001) at the first Wikimedia conference. Apart from the Wikipedia, wikis are being used for free dictionaries, book repositories, event organisation, and software development. They have become increasingly used in enterprise environments for collaborative purposes: research projects, papers and proposals, coordinating meetings, etc. Ross Mayfield’s SocialText produced the first commercial open source wiki solution, and many companies now use wikis as one of their main intranet collaboration tools.

There is a plethora (hundreds) of wiki software systems now available, ranging from MediaWiki, the software used on the Wikimedia family of sites, and Eugene Eric Kim’s PurpleWiki, where fine-grained elements on a wiki page are referenced by purple numbers, to Alex Schröder’s OddMuse, a single Perl script wiki install, and WikidPad, a desktop-based wiki for managing personal information. Many are open source, free, and will often run on multiple operating systems. The differences between wikis are usually quite small but can include the development language used (Java, PHP, Python, Perl, Ruby, etc.), the database required (MySQL, flat files, etc.), whether attachment file uploading is allowed or not, spam prevention mechanisms, page access controls, RSS feeds, etc.

The Wikipedia project consists of 250 different wikis, corresponding to a variety of languages. The English-language one is currently the biggest, with over 1.5 million pages, but there are wikis in languages ranging from Irish to Arabic to Chinese (and even in constructed languages such as Esperanto and Klingon!). A typical wiki page will have two buttons of interest: “Edit” and “History”. Normally, anyone can edit an existing wiki article, and if the article does not exist on a particular topic, you can create it. If someone messes up an article (either deliberately or erroneously), there is a revision history so that you can fix or revert the contents. There is a certain amount of ego-related motivation in contributing to a wiki – people like to show that they know things, to fix mistakes and fill in gaps in underdeveloped articles (stubs), and to have a permanent record of what they have contributed via their registered account. By providing a template structure to input facts about certain things (towns, people, etc.), wikis also facilitate this user drive to populate wikis with information.

For some time on the Wikipedia and in other wikis, templates have been used to provide a consistent look to the content placed within article texts. They can also be used to provide a structure for entering data, so that it is easy to extract metadata about the topic of an article (e.g. from a template field called “population” in an article about Galway). Semantic wikis bring this to the next level by allowing users to create semantic annotations anywhere within a wiki article text for the purposes of structured access and finer-grained searches, inline querying, and external information reuse. There are already about 20 semantic wikis in existence, and one of the largest ones is Semantic MediaWiki, based on the popular MediaWiki system.

Let’s take some examples of providing structured access to information in wikis. At the moment, there may be a page about John Grisham that has a link to the Pelican Brief (and to other books that he has written), to Mississippi because he lives there, and to Random House, his publisher (thanks to Eyal for this example). But you cannot perform fine-grained searches on the Wikipedia dataset such as “show me all the books written by John Grisham”, or “show me all authors that live in the US”, or “what authors are signed to Random House”, because the type of each link (i.e. the relationship type) between wiki pages is not defined. In Semantic MediaWiki, you can do this by linking with [[author of::Pelican Brief]] rather than just [[Pelican Brief]]. There may also be some attribute such as [[birthdate:=1955-02-08]] which is defined in the John Grisham article. Such attributes could be used for answering questions like “show me authors over 50” or for sorting articles.
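
As a rough illustration of why typed links make these queries possible, here is a toy Python sketch. It is not Semantic MediaWiki’s actual parser: it just uses regular expressions to pull the two annotation forms quoted above out of some made-up article text and stores them as (page, property, value) triples. The property names “lives in” and “published by” are my own assumptions for the example.

```python
import re
from datetime import date

# [[property::Target page]] and [[attribute:=value]], as in the examples above.
RELATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")
ATTRIBUTE = re.compile(r"\[\[([^\[\]:|]+):=([^\[\]|]+)\]\]")


def extract_triples(page_title, wikitext):
    """Return (page, property, value) triples for every typed annotation."""
    triples = []
    for name, target in RELATION.findall(wikitext):
        triples.append((page_title, name.strip(), target.strip()))
    for name, value in ATTRIBUTE.findall(wikitext):
        triples.append((page_title, name.strip(), value.strip()))
    return triples


def values_of(triples, page, prop):
    """All values of one property on one page."""
    return [o for s, p, o in triples if s == page and p == prop]


def age_on(birthdate_iso, today):
    """Whole years between an ISO birthdate and a reference date."""
    born = date.fromisoformat(birthdate_iso)
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)


# Made-up article text using the annotation styles from the post.
article = (
    "John Grisham wrote [[author of::Pelican Brief]]. "
    "He lives in [[lives in::Mississippi]] and is published by "
    "[[published by::Random House]]. He was born on [[birthdate:=1955-02-08]]. "
    "See also [[Legal thriller]]."
)
triples = extract_triples("John Grisham", article)

print(values_of(triples, "John Grisham", "author of"))  # ['Pelican Brief']

# "Show me authors over 50", evaluated as of the date of this post.
birthdate = values_of(triples, "John Grisham", "birthdate")[0]
print(age_on(birthdate, date(2007, 2, 16)) > 50)        # True
```

Note that the plain [[Legal thriller]] link produces no triple at all, which is exactly the limitation of untyped wiki links described above.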

Some semantic wikis also provide what is called inline querying. The screenshot on the right (from another system called SemperWiki) gives an example of this. The text in red (which says find me all pages where the creator is Eyal Oren) is processed as a query when the page is viewed and the results are shown at the bottom. Other wikis will process the query and show the results as part of the article text itself. [The green text here defines some relationships and attributes, and for each of these, articles with matching properties are shown on the right-hand side.]

Finally, just as in the semantic blogging scenario, wikis can enable the Web to be used as a clipboard, by allowing readers to drag structured information from wiki pages into other applications (for example, geographic data about locations on a wiki page could be used to annotate information on an event or a person in your calendar application or address book software respectively).

My next (and final) guest blog post will be on social network services and connecting them all together. See you then!

Semantic Blogging

February 11, 2007 · Posted in Uncategorized · 8 Comments 

We’ve already seen how Web 2.0 has brought about a paradigm of tagged and commented-upon content: photos, bookmarks, events, videos, and blog posts. Blog posts are usually only tagged on the blog itself by the post creator, using free-text keywords such as “scotland”, “movies”, etc. (unless they are bookmarked and tagged by others using social bookmarking services like del.icio.us or personal aggregators like Gregarius). Technorati, the blog search engine, aims to use these keywords to build a “tagged web”. Both tags and hierarchical categorisations of blog posts can be further enriched using the SKOS framework. However, there is often much more to say about a blog post than simply what category it belongs in…

So let’s move on to semantic blogging (some ideas here are from Knud Moeller who is working on semiBlog). Traditional blogging is aimed at what can be called the “eyeball Web” – i.e. text, images or video content that is targeted mainly at people. Semantic blogging aims to enrich traditional blogging with metadata about the structure (what relates to what and how) and the content (what is this post about – a person, event, book, etc.). In this way, metadata-enriched blogging can be better understood by computers as well as people.

Last time I talked about structured blogging, where microcontent such as microformats is positioned inline in the HTML (and subsequent syndication feeds) and can be rendered via CSS. Structured blogging and semantic blogging do not compete, but rather offer metadata in slightly different ways (using microcontent / microformats and RDF respectively). There are already mechanisms such as GRDDL which can be used to move from one to the other.

So why would one choose to enhance their blogs and posts with semantics? Current blogging offers poor query possibilities (except for searching by keyword or seeing all posts labelled with a particular tag). There is little or no reuse of data offered (apart from copying URLs or text from posts). Some linking of posts is possible via direct HTML links or trackbacks, but again, nothing can be said about the nature of those links (are you agreeing with someone, linking to an interesting post, or are you quoting someone whose blog post is directly in contradiction with your own opinions?). Semantic blogging aims to tackle some of these issues, by facilitating better (i.e. more precise) querying when compared with keyword matching, by providing more reuse possibilities, and by creating “richer” links between blog posts.

It is not simply a matter of adding semantics for the sake of creating extra metadata, but rather a case of being able to reuse what data a person already has in their desktop or web space and making the resulting metadata available to others. People are already (sometimes unknowingly) collecting and creating large amounts of structured data on their computers, but this data is often tied into specific applications and locked within a user’s desktop (e.g. contacts in a person’s addressbook, events in a calendaring application, author and title information in documents, audio metadata in MP3 files). Semantic blogging can be used to “lift” or release this data onto the Web.

Looking at the picture on the right, Aidan writes a blog post which he annotates using content from his desktop calendaring and addressbook applications. He publishes this post onto the Web, and John, reading this post, can reuse the embedded metadata in his own desktop applications.

The next picture is from a semantic blogging application called semiBlog. In this picture, a semantic blog post is being created by annotating a part of the post text about John with an address book entry that has extra metadata describing John. Once a blog has semantic metadata, it can be used to perform queries such as “which blog posts talk about papers by Stefan Decker?”; it can be used for browsing not only across blogs but also other kinds of discussion methods; or it can be used by blog readers for importing metadata into desktop applications (using the Web as a clipboard).
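
To give a feel for what such a query involves, here is a small Python sketch. It is not semiBlog’s data model or a real RDF store: the post and paper identifiers, the second author’s name and the predicate names (“talksAbout”, “author”, “disagreesWith”) are all invented for illustration. The point is that once posts carry typed statements, the question becomes a simple two-step lookup rather than a keyword search.

```python
# Each statement is a (subject, predicate, object) triple; all values are made up.
triples = [
    ("post:1", "talksAbout", "paper:A"),
    ("post:2", "talksAbout", "paper:B"),
    ("post:3", "talksAbout", "event:X"),
    ("paper:A", "author", "Stefan Decker"),
    ("paper:B", "author", "Example Author"),
    ("post:2", "disagreesWith", "post:1"),   # a "richer" link between two posts
]


def subjects(triples, predicate, obj):
    """Everything that has the given predicate pointing at the given object."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def posts_about_papers_by(triples, author):
    """Which posts talk about a paper written by this author?"""
    papers = subjects(triples, "author", author)
    return sorted(s for s, p, o in triples if p == "talksAbout" and o in papers)


print(posts_about_papers_by(triples, "Stefan Decker"))  # ['post:1']
print(sorted(subjects(triples, "disagreesWith", "post:1")))  # ['post:2']
```

A keyword search for “Stefan Decker” would miss a post that mentions only the paper’s title, whereas the typed statements connect the post to the paper and the paper to its author.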

As well as semiBlog, other semantic blogging systems have been developed by HP, the National Institute of Informatics, Japan and MIT. But it’s not just blog posts that are being enhanced by structured metadata and semantics – it’s happening in many other Web 2.0 application areas. Wikis such as the Wikipedia have contained structured metadata in the form of templates for some time now, and at least twenty “semantic wikis” have also appeared to address a growing need for more structure in wikis. I’ll talk about semantic wikis next time, and in the meantime look forward to your comments…

Adding Structure to Blog Posts

February 9, 2007 · Posted in Uncategorized · 7 Comments 

As you probably know (since you’re reading this!), blogs are [usually open access] websites which contain periodic time-stamped posts (in reverse chronological order) about a particular genre or touching on a number of topics of interest. They range from individuals’ online diaries or journals to promotional tools used by companies or political campaigns, and many allow public commenting on their posts. They are also starting to cross the generation gap – your kids might have a blog on Bebo, you may blog yourself and your parents could be reading or commenting on your posts.

The growth and takeup of blogs over the past four years has been dramatic, with a doubling in the size of the blogosphere every six or so months (according to statistics from Technorati). Over 100,000 blogs are created every day, working out at about one a second. Nearly 1.5 million blog posts are being made each day, with over half of bloggers contributing to their sites three months after the blog’s creation.

Similar to accidentally wandering onto message boards and web-enabled mailing lists, when you’re searching for something on the Web, you may often happen across a relevant entry on someone’s blog. RSS feeds are also a useful way of accessing information from your favourite blogs, but they are usually limited to the last 15 entries, and don’t provide much information on exactly who wrote or commented on a particular post, or what the post is talking about. Some approaches like SIOC aim to enhance the semantic metadata provided about blogs, forums and posts, but there is also a need for more information about what exactly a person is writing about. If you’re searching for particular information in or across blogs, it’s often not that easy to get it because of “splogs” (spam blogs) and the fact that the virtue of blogs so far has been their simplicity – apart from the subject field, everything and anything is stored in one big text field for content. Keyword searches may give some relevant results, but useful questions such as “find me all the restaurants that bloggers reviewed in Dublin with a rating of at least 5 out of 10” cannot be posed, and you cannot easily drag-and-drop events or people or anything (apart from URLs) mentioned in blog posts into your own applications.

I’m going to talk about two approaches to tackle this issue of adding more information to posts, so that queries can be made and the things that people talk about can be reused in other posts or applications, because not everyone is being served well by the lowest common denominator that we currently have in blogs. The first is called structured blogging and the second semantic blogging. (I’ll cover semantic blogging in my next installment…)

“Structured blogging” is an open source community effort that has created tools to provide microcontent (including microformats like hReview) from popular blogging platforms such as WordPress and Movable Type. In structured blogging, packages of structured data are becoming post components. Sometimes (not all of the time) you will have a need for more structure in your posts – if you know a subject deeply, or if your observations or analyses recur in a similar manner throughout your blog – then you may best be served by filling in a form (which has its own metadata and model) during the post creation process. For example, you may be writing a review of a film you went to see, or a report on a sports game you attended, or a guide to tourist attractions you saw on your travels. Not only do people get to express themselves more clearly, but blogs can start to interoperate with enterprise applications through the microcontent that is being created in the background.

Let’s say that someone (or a group of people) is reviewing some soccer games that they watched. Their after-game soccer reports will typically include information on which teams played, where the game was held and when, who were the officials, what were the significant game events (who scored, when and how, or who received penalties and why, etc.) – it’d be great if these blog posters could use a tool that would understand this structure, presenting an editing form with the relevant fields and creating both HTML and RSS with this structure embedded in it. Then other people reading these posts could say, “hey, I want to reuse this structure in my own posts” and their blog reader / creator could make this structure available when the blogger is ready to write. As well as this, reader applications could begin to answer questions based on the form fields available – “show me all the matches from Germany with more than two goals scored”, etc.
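
As a rough sketch of the soccer example, assuming a reader application has already extracted the form fields from a set of posts, the “matches from Germany with more than two goals” question reduces to a simple filter. The `MatchReport` and `Goal` structures below are hypothetical and are not taken from any actual structured blogging tool.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    scorer: str
    minute: int


@dataclass
class MatchReport:
    # Hypothetical form fields for an after-game report.
    home: str
    away: str
    country: str  # where the game was held
    goals: list = field(default_factory=list)


# Made-up reports, for illustration only.
reports = [
    MatchReport("Team A", "Team B", "Germany",
                [Goal("Player 1", 10), Goal("Player 2", 55), Goal("Player 3", 80)]),
    MatchReport("Team C", "Team D", "Germany", [Goal("Player 4", 30)]),
    MatchReport("Team E", "Team F", "Ireland",
                [Goal("Player 5", 5), Goal("Player 6", 44), Goal("Player 7", 90)]),
]


def matches_with_goals(reports, country, more_than):
    """Return (home, away) pairs for games in `country` with more than `more_than` goals."""
    return [
        (r.home, r.away)
        for r in reports
        if r.country == country and len(r.goals) > more_than
    ]


print(matches_with_goals(reports, "Germany", 2))
```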

At the moment, the structured blogging tools do provide a fixed set of forms that bloggers can fill in (see the WordPress restaurant review form on the right) – for things like reviews, events, audio, video and people – but there is no reason that people couldn’t create custom structures, and news aggregators or readers could auto-discover an unknown structure, notify a user that a new structure is available, and learn the structure for reuse in the user’s future posts.

There have been some other past efforts with similar aims to the structured blogging community, including Qlogger, the Lafayette project, and JemBlog. And in the future, Semantic Web technologies could be used to ontologise any available post structures for more linkage and reuse… This neatly brings me on to semantic blogging, which I’ll discuss in the next post!

The Semantic Web: Web 3.0?

February 4, 2007 · Posted in Uncategorized · 6 Comments 

A key feature of Web 2.0 sites is community-contributed content that may be tagged and can be commented on by others. That content can be virtually anything: blog entries, board posts, videos, audio, images, wiki pages, user profiles, bookmarks, events, etc. I fully expect to see a site with live multiplayer video games appearing in little browser-embedded windows just as we already have YouTube for videos, with running commentaries going on about the games in parallel. Tagging is common to many Web 2.0 sites – a tag is a keyword that acts like a subject or category for the associated content. Then we have folksonomies: collaboratively generated, open-ended labeling systems that enable Web 2.0 users to categorise content using the tags system, and to thereby visualise popular tag usages via “tag clouds” (visual depictions of the tags used on a particular website, like a weighted list in visual design).
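
The “weighted list” idea behind a tag cloud can be sketched in a few lines of Python. This is only one plausible weighting (a linear scale from tag count to font size); the function name, the size range and the sample tags are assumptions made for illustration.

```python
from collections import Counter


def tag_cloud(tagged_items, min_size=10, max_size=30):
    """Map each tag to a font size scaled linearly by how often it is used."""
    counts = Counter(tag for tags in tagged_items for tag in tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all tags are equally popular
    return {
        tag: min_size + (n - lo) * (max_size - min_size) // span
        for tag, n in counts.items()
    }


# Each inner list holds the tags on one (made-up) piece of content.
items = [["web2.0", "blog"], ["blog"], ["blog", "semanticweb"]]
print(tag_cloud(items))
```

Here the most-used tag gets the largest size and the least-used tags get the smallest, which is all a basic tag cloud needs in order to render.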

Folksonomies are one step in the same direction as what some have termed Web 3.0, or the Semantic Web. (The Semantic Web often uses top-down controlled vocabularies to describe various domains, but can also utilise folksonomies and therefore develop more quickly since folksonomies are a great big distributed classification system with low entry costs.) As Tim Berners-Lee et al. said in Scientific American in 2001, the Semantic Web is “an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation”. You probably know that the word “semantic” stands for “the meaning of”, and therefore the Semantic Web is one that is able to describe things in a way that computers can better understand (yes, computers are just like Ginger in the Far Side). Some of the more popular Semantic Web vocabularies include FOAF (Friend-of-a-Friend, for social networks) and Geo (for geographic locations).

The Semantic Web consists of metadata that is associated with web resources, together with associated vocabularies or “ontologies” that describe what this metadata is and how it all relates. SEO experts have known that adding metadata to their websites can often improve the percentage of relevant document hits in search engine result lists, but it is hard to persuade web authors to add metadata to their pages in a consistent, reliable manner (either due to perceived high entry costs or because it is too time consuming). For example, few web authors make use of the simple Dublin Core metadata system, even though the use of DC meta tags can increase their pages’ prominence in search results.
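
For readers who haven’t seen DC meta tags, the sketch below generates them from a simple dictionary, following the common convention of `DC.`-prefixed names in HTML meta elements plus a schema link to the Dublin Core element set. The helper function and the sample values are hypothetical.

```python
from html import escape


def dc_meta_tags(metadata):
    """Render Dublin Core elements (e.g. title, creator) as HTML meta tags."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in metadata.items():
        # escape() guards against quotes or ampersands breaking the attribute.
        lines.append(
            f'<meta name="DC.{escape(element)}" content="{escape(value)}">'
        )
    return "\n".join(lines)


# Made-up page metadata, for illustration only.
page = {"title": "Blogs & the Semantic Web", "creator": "A. Blogger"}
print(dc_meta_tags(page))
```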

The main power of the Semantic Web lies in interoperability and in combinations of vocabulary terms. Interoperability and increased connectivity are possible through a commonality of expression, and vocabularies can be combined and used together: e.g. a description of a book using Dublin Core metadata can be augmented with specifics about the book author using the FOAF vocabulary. Vocabularies can also be easily extended (modules, etc.). Through this, true intelligent search with more granularity and relevance is possible: e.g. a search can be personalised to an individual by making use of their identity profile and relationship information.
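
The book example can be sketched with plain Python tuples standing in for subject-predicate-object statements. The Dublin Core and FOAF namespace addresses are the real ones, but the book, the author and the helper functions are invented for illustration, and a real application would use a proper RDF library rather than lists of tuples.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical resources under example.org.
book = "http://example.org/books/1"
author = "http://example.org/people/jane"

triples = [
    # The book is described with Dublin Core terms...
    (book, DC + "title", "An Example Book"),
    (book, DC + "creator", author),
    # ...and its author is described with FOAF terms.
    (author, FOAF + "name", "Jane Example"),
    (author, FOAF + "homepage", "http://example.org/~jane"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def author_names(triples, book):
    """Follow dc:creator from the book, then foaf:name from each creator."""
    names = []
    for creator in objects(triples, book, DC + "creator"):
        names.extend(objects(triples, creator, FOAF + "name"))
    return names


print(author_names(triples, book))
```

Because both vocabularies describe things by their addresses, the book’s creator and the FOAF person are the same node, which is what lets the query hop from one vocabulary to the other.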

The challenge for the Semantic Web is related to the chicken-and-egg problem: it is difficult to produce data without interesting applications, and vice versa. The Semantic Web can’t work all by itself, because if it did it would be called the “Magic Web”. For example, it is not very likely that you will be able to sell your car just by putting a Semantic Web file on the Web. Society-scale applications are required, i.e. consumers and processors of Semantic Web data, Semantic Web agents or services, and more advanced collaborative applications that make real use of shared data and annotations.

The Semantic Web effort is mainly towards producing standards and recommendations that will interlink applications, and the primary Web 2.0 meme as already discussed is about providing user applications. These are not mutually exclusive: with a little effort, many Web 2.0 applications can and do use Semantic Web technologies to great benefit, and this picture from Nova Spivack shows some evolving areas where these two streams have come and will come together: semantic blogging, semantic wikis, semantic social networks and the Semantic Desktop all fall in the realm of what he terms the Metaweb, or “social semantic information spaces”. Semantic MediaWiki, for example, has already been commercially adopted by Centiare.


There are also great opportunities for mashing together of both Web 2.0 data or applications and Semantic Web technologies – just use your imagination! Dermod Moore wrote of one such Web 2.0 application mashing for a hobby: a Scuttle + Gregarius + Feedburner + Grazr hybrid that allows one to aggregate one’s favourite blogs or other content on a particular topic and then to annotate bookmarks to the most interesting content found. Bringing this a step further, we could have a “semantic social collaborative resource aggregator”. Okay, it needs a better name, like “scraggy” or something :) . In this hypothetical system:

  • Social network members specify their favourite content sources
  • You and your friends specify any topics of interest
  • You specify friends whose topic lists you value
  • A metadata aggregator collects content from sites you and your friends like (which may be human tagged, or could be auto-tagged)
  • The aggregator highlights content that may be of interest to you or your friends
  • If nothing of interest is currently available, content sources may have semantically-related sources in other communities for secondary content acquisition and highlighting
  • You bookmark and tag the interesting content, and share!
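
The core of such a system can be sketched as follows, assuming content has already been collected and tagged. Everything here (the `Member` and `Item` structures, the `highlights` function and the sample data) is hypothetical, and the sketch leaves out the secondary-source fallback and the bookmarking step.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    sources: set = field(default_factory=set)    # favourite content sources
    topics: set = field(default_factory=set)     # topics of interest
    trusted: list = field(default_factory=list)  # friends whose topic lists you value


@dataclass
class Item:
    source: str
    title: str
    tags: set  # human-assigned or auto-generated tags


def highlights(member, items):
    """Return titles from liked sources whose tags match your or your friends' topics."""
    sources = set(member.sources)
    topics = set(member.topics)
    for friend in member.trusted:
        sources |= friend.sources
        topics |= friend.topics
    return [i.title for i in items if i.source in sources and i.tags & topics]


# Made-up members and content, for illustration only.
friend = Member("friend", sources={"blog-b"}, topics={"semanticweb"})
me = Member("me", sources={"blog-a"}, topics={"web2.0"}, trusted=[friend])
items = [
    Item("blog-a", "Post 1", {"web2.0"}),
    Item("blog-b", "Post 2", {"semanticweb", "rdf"}),
    Item("blog-c", "Post 3", {"web2.0"}),
    Item("blog-a", "Post 4", {"cooking"}),
]
print(highlights(me, items))
```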

That’s all for now; next time I’ll be talking about the evolution from blogging to structured and semantic blogging.

From Web 1.0 to 2.0…

February 2, 2007 · Posted in Uncategorized · 1 Comment 

Hello and welcome to the first of my guest posts for the Irish Internet Association’s blog. For the next two weeks, I’ll be talking about matters Web 2.0 related – hopefully with enough material to pique the interest of those who are either new to or already involved in this and related areas.

About me: I’m a researcher at the Digital Enterprise Research Institute at NUI Galway, and co-founder of boards.ie. Some more information about myself can be found on my personal and work pages. In parallel to this guest blogging session, I’m teaching a new module in “Emerging Web Media” to Masters in Digital Media students at the Huston Film School, and some of the topics being covered in that will overlap with these entries.

First off, I will mention Web 1.0. The structural / syntactic web put in place in the early 90s is still much the same as what we use today: resources (web pages, files, etc.) connected by untyped hyperlinks. By untyped, I mean that there is no easy way for a computer to figure out what a link between two pages means – for example, on the IIA website, there are hundreds of links to the various organisations that are registered members of the association, but there is nothing explicitly saying that the link is to an organisation that is a “member of” the IIA or what type of organisation is represented by the link. On my work page, I link to many papers I’ve written, but I haven’t said that I am the author of those papers or that I wrote such-and-such when I was working at NUI Galway.

In fact, the Web was envisaged to be much more, as you’ll see from the image on the right which is taken from Tim Berners-Lee’s original outline for the Web in 1989, entitled “Information Management: A Proposal”. In this, all the resources are connected by links describing the type of relationships, e.g. “wrote”, “describe”, “refers to”, etc. This is a precursor to the Semantic Web which I’ll come back to…
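
The difference between untyped and typed links is easy to see in a small Python sketch. An untyped link only records that one resource points at another, while a typed link also names the relationship, so a program can ask questions of it. The resource names and the helper function below are made up for illustration.

```python
# Untyped: all a computer can tell is that one page points at another.
untyped_links = [
    ("my-work-page", "paper-1"),
    ("iia-website", "example-company"),
]

# Typed: each link also says what the relationship is.
typed_links = [
    ("researcher", "wrote", "paper-1"),
    ("researcher", "wrote", "paper-2"),
    ("paper-2", "refers to", "paper-1"),
    ("example-company", "member of", "iia"),
]


def linked_from(links, source, relation):
    """Return the targets that `source` is connected to by `relation`."""
    return [target for s, r, target in links if s == source and r == relation]


print(linked_from(typed_links, "researcher", "wrote"))
```

With only the untyped pairs there is no way to write `linked_from` at all, because the “wrote” or “member of” information simply isn’t recorded.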

Now to Web 2.0, a term made popular by Tim O’Reilly and explained here. But what exactly is it? I’m sure if you ask 10 different people you’ll come up with at least five answers. (Here are a Web 2.0 meme cloud, meme map and an elements picture. Any clearer as to what it is?!). The global brain, or as it likes to call itself, “Wikipedia”, says in one place that “Web 2.0 … has … come to refer to what some people describe as a second phase of architecture and application development for the World Wide Web.” I like to think of it as a web where “ordinary” users can meet, collaborate, and share [content] using social software applications on the Web – via tagged items, social bookmarking, AJAX functionality, etc. And there are many popular examples that work along this collaboration and sharing meme: Bebo, del.icio.us, digg, Flickr, UseAMap.com, Technorati, orkut, 43 Things, Wikipedia, and so on.

Over the last 13 years, there’s been a shift from just ‘existing’ on the Web to participating on the Web. Web 2.0 is a platform for social and collaborative exchange with reusable community contributions, where anyone can mass-publish using web-based social software and others can subscribe to desired information, news, data flows, or other services. It is “social software” that is being used for this communication and collaboration, software that “lets people rendezvous, connect or collaborate by use of a computer network. It results in the creation of shared, interactive spaces…” Examples include instant messaging, IRC, forums, blogs, wikis, SNS (social network services), social bookmarking, podcasts, and MMOGs / MMORPGs.

O’Reilly wrote a long article on the seven features or principles of Web 2.0, to which some have added an eighth: the long tail phenomenon. But in short, Web 2.0 is all about being more open, more social, and through user-created content, cheaper!


Tomorrow I’ll talk about the move from Web 2.0 towards what has been termed Web 3.0, or the “Semantic Web”.