ben.eficium is the blog of Ben Slack, citizen and principal consultant at Systems Xpert.
All posts are Copyright © Ben Slack on the date of publishing.


10 May 2010

Be prepared, Web 3.0 is coming

Web 3.0 is a new buzzword for Web content that allows greater self-description, indexing and interactivity than is currently possible. The idea has been around since the late 1990s in the form of Sir Tim Berners-Lee’s Semantic Web, and internet enthusiasts have long known the benefits of parts of it, such as Dublin Core and RDF meta-tagging. But when a non-geek friend of mine, @Control-Edit, tweeted about Web 3.0 the other day, I knew its time was nearing.
In brief, the Semantic Web is a loose group of standards, data formats and functionality designed to make content on the World Wide Web more meaningful and understandable to the automated systems run by search providers, news agencies, content aggregators and content specialists. More information can be found at the W3C Web site and on the ever-reliable Wikipedia. The video my friend tweeted is also highly recommended.
In part, the slow take-up of Semantic Web standards has been due to the power of search engines, such as Google, in extracting meaning from plain-language text. However, most serious commentators (e.g. Florian Cramer) seem to think that as the Web grows, plain-language indexing and Google’s PageRank algorithm will become increasingly unable to provide relevant results. Some sort of content semantics will therefore be necessary to index Web pages in the near future.
So, what does this mean for business and Web site owners? Primarily, it means that content published to the Web or online systems needs to be published with meta-data, both in the formats that the Semantic Web already dictates and in any new formats that arise. Publishers will have to manually enter things like keywords, summaries and indexing information according to any number of information schemas that are going to become standard for particular kinds of content and industries. Automatic keyword extraction tools may help in this process, but either way it means more work for content publishers in order to stay relevant.
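To give a feel for what an automatic keyword extraction tool does, here is a minimal sketch in Python. It simply counts the most frequent words that are not common filler words. The function name `extract_keywords` and the short stop-word list are my own inventions for illustration; real tools are considerably more sophisticated.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list (an assumption for this sketch);
# a real tool would use a much fuller one.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
    "for", "on", "that", "this", "with", "as", "by", "it", "be",
}


def extract_keywords(text, limit=5):
    """Return up to `limit` of the most frequent non-stop-words in `text`."""
    # Lower-case the text and pull out words of two or more characters,
    # allowing internal hyphens and apostrophes (e.g. "meta-data").
    words = re.findall(r"[a-z][a-z'-]+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(limit)]


if __name__ == "__main__":
    sample = ("The Semantic Web makes Web content meaningful. "
              "Web content needs meta-data.")
    print(extract_keywords(sample, limit=2))
```

Candidate keywords produced this way still need a human eye before they are published as meta-data, which is why I say the work does not go away.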
A good first step for any organization is to ensure that all Web-published content is moved to the XHTML format. This will enable you to include XML-formatted meta-data with your content that will be ignored by browsers but picked up by Web 3.0 applications. The next step is to include meta-data in Resource Description Framework (RDF) format. This is the primary means of adding Semantic Web friendly meta-data at the moment, and it will become more common and necessary in the near future. At the same time, including Dublin Core meta-tags in your content is also worthwhile. Taking these three steps will make you a pioneer in Web 3.0 and will give you the business knowledge you need to understand and implement future movements in the Web 3.0 world.
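As a rough illustration of the second and third steps, the Python sketch below generates Dublin Core meta-tags for the head of an XHTML page, and the same information as a small RDF/XML description. The function names `dublin_core_meta` and `dublin_core_rdf` are hypothetical, and the page address is a placeholder. The `schema.DC` link and the `DC.` prefix follow the usual Dublin Core convention for HTML, but treat this as a starting point under those assumptions rather than a complete implementation.

```python
import xml.etree.ElementTree as ET
from html import escape

# Standard namespace URIs for RDF and the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_meta(fields):
    """Build Dublin Core <meta> tags for the <head> of an XHTML page."""
    # The <link> declares which schema the "DC." prefix refers to.
    lines = ['<link rel="schema.DC" href="%s" />' % DC_NS]
    for element, value in fields.items():
        # escape() makes the values safe to place inside quoted attributes.
        lines.append('<meta name="DC.%s" content="%s" />'
                     % (escape(element), escape(value)))
    return "\n".join(lines)


def dublin_core_rdf(url, fields):
    """Describe the page at `url` as RDF/XML using Dublin Core elements."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: url})
    for element, value in fields.items():
        ET.SubElement(description, "{%s}%s" % (DC_NS, element)).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    post = {
        "title": "Be prepared, Web 3.0 is coming",
        "creator": "Ben Slack",
        "date": "2010-05-10",
    }
    print(dublin_core_meta(post))
    # The address below is a placeholder, not the real location of this post.
    print(dublin_core_rdf("http://example.com/web-3-0", post))
```

Keeping the meta-data in one place and generating both forms from it means the two cannot drift apart as content is edited.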

1 comment:

IainS said...

Ben,

I agree. I have noticed over the last few years just how difficult it is to find anything on Google without getting swamped with irrelevant material.

Tagging, the great exciting part of Web 2, of course confuses more than it helps unless it is systematic, so that it can be searched and all the relevant material can be found. I think we will return to the library use of keywords as tags.

It is remarkable how difficult it is to find a simple program to edit metadata though!