Guide to the semantic web
After two days of evaluating my last post about the semantic web, I have come to the conclusion that it is a very bad idea to blog when irritated about something. I’ll take the advice of Rob and Simo and simply not do it from now on. The funny thing is that I had tried for a long time to convince people to write semantic mark-up without much luck, so I decided to change tactics. Pissing people off was not the plan, but it sure drew a lot of heat, both in the comments and by e-mail. I’m sorry about that.
On the positive side, it did lead to some interesting questions about why we should care about the semantic web. It also showed that this is new territory to a lot of people, which was news to me. In this post I’ll try to answer some of the questions, get beneath the philosophy and give some examples of how to implement it.
What is the semantic web?
This question is better answered by the W3C and here is what they have to say about it:
The Semantic Web is a web of data. There is lots of data we all use every day, and it’s not part of the web. I can see my bank statements on the web, and my photographs, and I can see my appointments in a calendar. But can I see my photos in a calendar to see what I was doing when I took them? Can I see bank statement lines in a calendar?
Why not? Because we don't have a web of data. Because data is controlled by applications, and each application keeps it to itself.
The Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources, where on the original Web mainly concentrated on the interchange of documents. It is also about language for recording how the data relates to real world objects. That allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing.
This may be a little abstract or difficult to grasp, but let’s look at some examples where the semantic web can make our lives easier.
The bank example
Josh wrote in his comments:
The vast majority of ASP.NET web applications serve a business purpose. Lots of these apps don't even get public exposure (authentication required). […] For instance, what good would XFN and FOAF do for a banking application?
This is a good question that I get a lot, and it is based on a common misunderstanding of how semantic mark-up is used. As I normally explain it, you have to think of the semantic web as three things: a database, an enabler and some glue.
Let’s start with the semantic database. Whenever you mark up a web page with microformats (I’ll get back to those) you make parts of that page machine readable. It could be contact information or calendar events or some other structured data. A machine can query that data and aggregate that information with thousands of other microformatted web pages. That’s what Yahoo is experimenting with in their new search engine and Google is utilizing in their Social Graph API.
The enabler could be an online banking site. The pages are not public because only authenticated customers have access, so it cannot be used as a database. It can however enable you to query relevant information from the semantic database. Here is a scenario I got from a video with Tim Berners-Lee, retold from memory:
Imagine reading your bank statement in your online banking application and seeing a transaction you don’t remember making. Each transaction in the statement has a date and time that could be used to figure out what you did that day. If the bank statement is marked up with semantic meaning about the date, then your browser can recognize it as a date. Otherwise it can’t.
This brings us to the last part – the glue. In the bank scenario the glue is the browser or a browser plug-in since no existing browser supports microformats natively yet.
Now that the browser can recognize the dates on the bank statement, it should have no problem looking in your Outlook or Google calendar to see exactly what you did that day and present it to you at the click of a button.
Because your friends are using microformats on their website or profile page, the browser can also tell you exactly who you saw that day. You took some photos as well and uploaded them to Flickr, so now you also have photos associated with that particular day. Some of your friends are tagged on those photos with a link to their photos, so now you can associate their photos of you that day too. All with a click of a button in the browser – the glue.
It sounds very futuristic, but the technology for this has been around for years.
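The first step of that scenario, recognizing the dates on the statement and looking them up in a calendar, can be sketched in a few lines of Python. This is only an illustration under some loud assumptions: there is no microformat for bank transactions that I know of, so the transaction-date class name is made up, the statement HTML and the calendar entries are invented, and the date is stored in the title attribute of an <abbr> element in the same way hCalendar does it.

```python
from datetime import date
from html.parser import HTMLParser

# Invented bank statement. The "transaction-date" class is hypothetical;
# the machine-readable ISO date sits in the title attribute of <abbr>.
STATEMENT_HTML = """
<table>
<tr><td><abbr class="transaction-date" title="2008-03-14">14 March</abbr></td>
    <td>Cafe Example</td><td>-12.50</td></tr>
<tr><td><abbr class="transaction-date" title="2008-03-15">15 March</abbr></td>
    <td>Example Books</td><td>-30.00</td></tr>
</table>
"""

# Invented calendar entries standing in for Outlook or Google Calendar.
CALENDAR = {
    date(2008, 3, 14): ["Lunch with Melissa"],
}


class StatementDateParser(HTMLParser):
    """Collects every date marked up with the hypothetical class name."""

    def __init__(self):
        super().__init__()
        self.dates = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "abbr" and "transaction-date" in classes and attrs.get("title"):
            # The title holds an ISO 8601 date such as 2008-03-14.
            self.dates.append(date.fromisoformat(attrs["title"]))


def events_for_statement(html, calendar):
    """Maps each transaction date to the calendar entries for that day."""
    parser = StatementDateParser()
    parser.feed(html)
    parser.close()
    return {day: calendar.get(day, []) for day in parser.dates}
```

Calling events_for_statement(STATEMENT_HTML, CALENDAR) pairs 14 March with the lunch entry and 15 March with an empty list. The point is that the "glue" only needs to know one naming convention to find the dates, no matter which bank produced the page.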
If browsers don’t support it, why should I?
This is the classic question of the chicken and the egg. If browsers don’t support it why should you, and if you don’t publish semantic mark-up why should the browser vendors waste their time on it? No one takes the first step and we end up getting nowhere. That is why we haven’t seen any killer applications that utilize the semantic web yet.
Lucky for us, this is an exciting time to play around with semantic formats because new services and applications that utilize it are starting to pop up like never before. We’re still waiting for the killer application, but that won’t happen before the database is big enough and there is only one way for that to happen. We need to start marking up our pages. If we don’t start then we stay in limbo and the bank scenario gets pushed further and further into the future. I for one have a hard time ignoring this chicken/egg situation – especially when so little is required to get started.
How to start
The easiest way to start is by choosing one or more microformats that make sense to use on your web application (I'll get back to that in a bit). Let’s take a look at microformats. Put simply, a microformat is a standard naming convention for classes on HTML elements. Here is an example of a very simple hCard microformat marked up in existing HTML. hCard is used to mark up a person and is equivalent to the old vCard standard used by Outlook and other address books.
<div class="vcard">
<span class="fn">John Doe</span>
<a href="http://example.com" class="url">My website</a>
</div>
Notice the class names vcard, fn and url. The names of those classes come from the hCard standard defined at microformats.org. This is basic HTML and that is the whole idea with microformats. It’s easy for humans to implement and it’s easy for machines to read. You don’t have to change the layout of your page and you can use the HTML elements that are already there.
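To show what "easy for machines to read" means in practice, here is a minimal Python sketch that pulls the name and URL out of the hCard above using only the standard library. It is not a full hCard parser: the HCardParser class and parse_hcards function are names I made up, it only understands the fn and url properties, and it assumes the fn element contains plain text with no nested tags.

```python
from html.parser import HTMLParser

# The hCard snippet from this post.
HCARD_HTML = """
<div class="vcard">
<span class="fn">John Doe</span>
<a href="http://example.com" class="url">My website</a>
</div>
"""


class HCardParser(HTMLParser):
    """Collects the fn (formatted name) and url of each hCard on a page.

    Simplification: properties are attached to the most recently opened
    vcard, so nesting and the end of a card are not tracked.
    """

    def __init__(self):
        super().__init__()
        self.cards = []       # one dict per class="vcard" element
        self._in_fn = False   # True while reading text inside class="fn"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "vcard" in classes:
            self.cards.append({})
        if not self.cards:
            return  # ignore properties that appear before any vcard
        if "fn" in classes:
            self._in_fn = True
            self.cards[-1]["fn"] = ""
        if "url" in classes and attrs.get("href"):
            self.cards[-1]["url"] = attrs["href"]

    def handle_data(self, data):
        if self._in_fn:
            self.cards[-1]["fn"] += data

    def handle_endtag(self, tag):
        if self._in_fn:
            # Assumes fn holds plain text, so the first end tag closes it.
            self.cards[-1]["fn"] = self.cards[-1]["fn"].strip()
            self._in_fn = False


def parse_hcards(html):
    """Returns a list of dicts, one per hCard found in the HTML."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards
```

For the snippet above, parse_hcards(HCARD_HTML) returns a single card with fn set to "John Doe" and url set to "http://example.com". Run the same function over thousands of pages and you have the beginnings of the semantic database.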
Microformats are the best way to start because they can easily be added to existing web pages with little effort. Another example is the XFN (XHTML Friends Network) microformat. It is used to describe a person’s relations to other people. It could be family, co-workers, friends or other contacts. This is probably the easiest microformat and it uses the rel attribute of the <a> element like so:
<a href="http://johndoe.com" rel="friend co-worker">John Doe</a>
<a href="http://melissa.com" rel="spouse">Melissa Smith</a>
<a href="http://britney.com" rel="muse">Britney Spears</a>
<a href="http://madskristensen.dk" rel="me">Mads Kristensen</a>
In case you’re wondering, the rel attribute is valid XHTML. Here is a list of valid XFN relations you can use. The purpose is to make social relations machine readable, which would be useful to social networks like Facebook and LinkedIn. Imagine signing up for Facebook for the first time, giving them just your URL, and letting Facebook find your friends from your XFN links and connect you to them automatically.
FOAF is the next step. Like XFN, it can contain information about your friends and contacts, and that’s why Josh was right. XFN and FOAF (in most cases) are meant for public consumption and thereby contribute to the semantic database. An online bank site is not public and therefore XFN and FOAF aren’t suited for it.
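Here is a rough Python sketch of how a social network could read those relations. It uses the same four links as above, written with straight quotes, and keeps only rel values that belong to the XFN 1.1 vocabulary. The XFNParser class and find_relations function are my own names for illustration, not part of any library, and how Facebook or anyone else would actually do this is pure guesswork on my part.

```python
from html.parser import HTMLParser

# The four XFN links from this post.
XFN_HTML = """
<a href="http://johndoe.com" rel="friend co-worker">John Doe</a>
<a href="http://melissa.com" rel="spouse">Melissa Smith</a>
<a href="http://britney.com" rel="muse">Britney Spears</a>
<a href="http://madskristensen.dk" rel="me">Mads Kristensen</a>
"""

# Relationship values defined by XFN 1.1.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class XFNParser(HTMLParser):
    """Maps each linked URL to the XFN relations declared on the link."""

    def __init__(self):
        super().__init__()
        self.relations = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        # rel is a space-separated list; drop non-XFN values like "nofollow".
        rels = [r for r in (attrs.get("rel") or "").split() if r in XFN_VALUES]
        if href and rels:
            self.relations[href] = rels


def find_relations(html):
    """Returns a dict of URL -> list of XFN relations found in the HTML."""
    parser = XFNParser()
    parser.feed(html)
    parser.close()
    return parser.relations
```

With the links above, find_relations(XFN_HTML)["http://johndoe.com"] gives ["friend", "co-worker"], and filtering the result for "friend" leaves just John Doe. That is all the information a site would need to suggest a connection.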
I won’t go into details about FOAF because it deserves a post of its own.
Getting started
This is always the hard part when faced with something new. You saw the simplicity of the hCard and XFN microformats and you can rest assured that the other microformats are just as simple. To make it even easier to get started, I’ve listed different types of web applications and the microformats that might make sense to implement on them. They are listed in order of priority under each type. Just pick your type and follow the links to the implementation guides.
Personal website or blog
Company website
Webshop
Calendar and events
A good tip is to use the Operator Toolbar for Firefox when adding microformats to a page. It can show you how it looks as you code along. That way you know if you are doing it correctly.
I hope this will inspire you to get started using semantic mark-up on your existing and new web projects. Another day I’ll get to some other semantic formats and technologies such as FOAF, OWL, SIOC and APML.
Here are some links to earlier posts I’ve written with how-to’s and code samples.