Wednesday, March 15, 2006

Tim Berners-Lee on the Semantic Web

Just back from Oxford. What's up with drunk people on trains? Sheesh. One on the way up, one on the way down. How unpleasant. However, casting aside the negativity, I had the opportunity to see Web creator Sir Tim Berners-Lee discuss what he views as the future of the Web at a special OII event: the Semantic Web. From what I gather, the key element of this new approach to filing technology online is that people have URIs, not just webpages. So while the contemporary Web has addresses for content and data, the Semantic Web has addresses for documents and concepts.

The full presentation of his talk is here (currently forbidden). There is rumour that it will be webcast in the near future. Links to follow.

I also had the pleasure of having dinner afterwards with Ralph Schroeder.

Here are my (occasionally cryptic) notes:


Intro: Berners-Lee graduated from Oxford University in 1976. He invented the WWW in 1989 while working at CERN. He developed it as a tool to help scientists collaborate. Also director of the World Wide Web Consortium (W3C). Senior Researcher at MIT CSAIL and Professor at Southampton. Wrote the first web client and server in 1990.

This talk: The future of the web and the future of our thinking about it.
Building a web of data rather than a world of hypertext.

Philosophical engineering (When we take computers and networks and build things like the web, we have a tremendous medium where we can play God. We design the rules.)

The web is defined by simple protocols. What is more interesting is the gap between the macroscopic and the microscopic rules (blogosphere vs. computing). Emerging are new types of communication in society. Apply the system (i.e., all of us), then macroscopic phenomena occur. The connections between the elements in this process are more interesting.

Rules are social as well as technical. We can change social rules more easily. We need to reengineer social rules. The technology which is going to happen will need new social rules.

Examples:

Email. Tech rules: store and forward, no trust infrastructure. Social rules: don’t bother people.

Email scaled really well until the same microscopic rules were run in a commercial environment. From academic to commercial. Now the email system is in a heap. People are giving up email and moving back to the telephone.

Web rules.

Technical: use the URIs (e.g., http) to anchor and document. Allows for hyperlinking.

Ladder of authority to interpret. (The web browser has the authority to interpret the URIs. It then looks at the HTTP spec, goes to the registry, which goes to another registry, which points to another place, which points to MIME, which points to the HTML registry, which says this is content, which is passed back to the user.) Must keep the engineering structure there.

Must use standards (e.g., HTTP, HTML, CSS, XML, etc.)

Provides the infrastructure for what happens when you poke a computer.
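
To make the ladder of authority above a bit more concrete, here is a rough sketch in Python. The two "registries" are hypothetical, heavily simplified dictionaries I made up for illustration (the real registries and specs are far larger), and the function name is my own. It only shows the idea that each step hands interpretation over to the next specification in the chain.

```python
from email.message import Message
from typing import List
from urllib.parse import urlsplit

# Hypothetical, simplified stand-ins for the real registries.
SCHEME_REGISTRY = {
    "http": "HTTP specification",
    "https": "HTTP specification",
}
MEDIA_TYPE_REGISTRY = {
    "text/html": "HTML specification",
    "text/css": "CSS specification",
    "application/rdf+xml": "RDF/XML specification",
}


def ladder_of_authority(uri: str, content_type_header: str) -> List[str]:
    """Return the chain of specifications consulted to interpret a response."""
    steps = []

    # Step 1: the URI scheme says which protocol spec has authority.
    scheme = urlsplit(uri).scheme
    steps.append(f"URI scheme '{scheme}' -> {SCHEME_REGISTRY[scheme]}")

    # Step 2: the protocol's Content-Type header names a media (MIME) type.
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()
    steps.append(f"Content-Type header -> media type '{media_type}'")

    # Step 3: the media type registry points at the spec for the content.
    steps.append(f"media type '{media_type}' -> {MEDIA_TYPE_REGISTRY[media_type]}")
    return steps


for step in ladder_of_authority("http://example.org/page", "text/html; charset=utf-8"):
    print(step)
```

The URI and header value in the last lines are placeholders, not anything from the talk.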

Social Rules: You should provide useful stuff. URIs should actually link to what’s publicised.

Make useful links. (Google only works so well because it reads the carefully made network of links).

IP laws, fraud laws, libel laws. (E.g., saying that an email is from someone else is a lie and you can be liable, especially if you earn money from it.)

Thank goodness for Google. It identifies vectors of links which identify topics of human interest. If you look at those clusters and those vectors, you can find the important topics.

We’re not yet at the place where we can develop software as subtle as conversational systems.

Wikis.

Take web rules and add 2 more rules: simple editor (micro rules) and citizen’s responsibility (micro social). Macro – wikipedia.

Blogs.

Micro: trackback – the ability to make a blog point to other blogs which mention it.

Macro: the blogosphere.

Semantic web.

The rules for the Semantic Web (SW) are basically similar. Technical: use URIs for documents and concepts.

People have URIs, not just webpages

It still has the same ladder of authority and standards (e.g., RDF).

Social: serve useful stuff, make useful stuff – serendipitous use. Put data out there for one purpose and find that it’s used for something never intended. Follow links through the data.

Share ontologies – make sure we’re all talking the same kind of meaningful language. Ontologies are published on the web, and you can find them.

Agree on ontologies – surely this is difficult?

Semantic web: everything has a URI.

Don’t say “colour”, say “http://example.com...”. Don’t say “hydrogen”, either: give it a URI too. One click on “hydrogen” will lead you to everything out there on hydrogen.

Relational database – subject, property, value. Expressing knowledge about things rather than expressing knowledge about tables.
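
A minimal sketch of the subject/property/value idea, in Python. Everything here is my own illustration rather than anything shown in the talk: the `http://example.org/` namespace, the property names and the `match` helper are all hypothetical. The point is just that each fact is a single statement about a thing named by a URI, rather than a cell in a table.

```python
from collections import namedtuple

# One statement about one thing: (subject, property, value).
Triple = namedtuple("Triple", ["subject", "prop", "value"])

EX = "http://example.org/"  # hypothetical namespace for illustration

triples = [
    Triple(EX + "hydrogen", EX + "atomicNumber", 1),
    Triple(EX + "hydrogen", EX + "symbol", "H"),
    Triple(EX + "helium", EX + "atomicNumber", 2),
    Triple(EX + "helium", EX + "symbol", "He"),
]


def match(store, subject=None, prop=None, value=None):
    """Return the triples fitting a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (prop is None or t.prop == prop)
        and (value is None or t.value == value)
    ]


# Everything the store says about hydrogen.
for t in match(triples, subject=EX + "hydrogen"):
    print(t.prop, "=", t.value)
```

Because every subject and property is a URI, two stores published by different people can be merged by simply concatenating their lists, which is roughly what "follow links through the data" relies on.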

“The web is a graph. Life is a graph”

Communities and vocabularies.

A universal WWW must include communities on many scales.

Google Maps is like the semantic web.

The mainlines are concepts which are shared through relationships.

There are advantages to having local standards and advantages to having universal standards.

To cope with all of the data, we must build a fractal system – optimal for communicating across complicated networks. The W3 is aiming to build a fractal system. It’s reliant upon trust.

SW works at a conceptual level. We want to develop a unifying language to translate all of the others which people use to create and distribute content (e.g., urls, xml, rdf etc).

Why isn’t the semantic web going to have as fast an uptake as the WWW? That spread in 5 years. Just as it’s now difficult to describe what the world was like before the web, it’s difficult to explain what the world will be like after the Semantic Web. The SW is therefore a paradigm shift all over again.

The value of your bit depends on the value of what’s out there. This needs to be learned all over again.

Data is trickier, especially to design logic languages.

Need for a smaller incubator population.

The web took off so quickly because the need was there. There was a group which needed international collaboration and had access to appropriate technology. It was a small, enthusiastic group of people who had a problem, “a great Petri dish”. Potential incubator communities for the SW are the life sciences and drug discovery communities. If they all work together, they will find a cure for everything: “Get the genomics out there so we can all use it”. At the moment, genomics data is in the genomics department. Particle physics has its own server in another building. Never the twain shall meet. (E.g., clinical trial data.) These people are bright and intelligent; early adopters with quite a lot of money.

Data is less exciting without a browser (less woo hoo about trading budgets than trading info about film).

Make your data browsable. There are plenty of computer programs which will do it for you with simple queries. They’ll do it automatically.

In New York after 9/11, the authorities used geospatial data to map the city, which flagged up a powerful tool to use in emergencies. We can do that now with friends and maps. The Semantic Web aims to expand Google Maps (by having access to data) so the limitations that plague the current application aren’t there.

There is a fear of having to make ontologies (total cost is actually finite and very small).

Convert your data to RDF and put it on the web!
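
As a rough idea of what converting data to RDF might look like, here is a small Python sketch that turns one table row into N-Triples lines (one of the plain-text RDF formats). The base URI, the `item/` and `vocab/` paths, the function names and the sample film row are all assumptions of mine for illustration; a real conversion would reuse a shared ontology for the property URIs rather than minting them from column names.

```python
from typing import Dict, List
from urllib.parse import quote


def escape_literal(text: str) -> str:
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(row: Dict[str, object], key: str, base: str) -> List[str]:
    """Turn one table row into N-Triples lines, one per non-key column."""
    # The key column identifies the thing the row is about.
    subject = f"<{base}item/{quote(str(row[key]))}>"
    lines = []
    for column, cell in row.items():
        if column == key:
            continue
        # Hypothetical property URI minted from the column name.
        predicate = f"<{base}vocab/{quote(column)}>"
        lines.append(f'{subject} {predicate} "{escape_literal(str(cell))}" .')
    return lines


film = {"id": "42", "title": "Blade Runner", "year": 1982}
for line in row_to_ntriples(film, key="id", base="http://example.org/"):
    print(line)
```

All values are written as plain string literals here to keep the sketch short; typed literals (numbers, dates) would need a datatype suffix.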

The SW aims to make information as easy to consume and view as iTunes, iPhoto etc. in a tabular/database form.

Building new systems requires the knowledge of the social rules.


I'm pleased to see how intertwined the social and technical are.

1 Comment:

Anonymous said...

haha well this is something!

Wed Sep 20, 08:33:00 AM  
