
Sunday, November 12, 2006



You are totally missing the point here. First, Web 3.0 is just another buzzword, invented not by tech people but by sales/marketing people.

Second, right now we're stuck in an era where loads of people are still writing relatively low-level data by hand.

Stuff like RDF will be generated from existing data sources, and most people who publish that stuff will likely not have to worry about the syntax, in the same way people publish .mp3s, .docs, .pdfs or whatever.

You can also compare it with IP/TCP/HTTP: protocols we use every day and never have to worry about. Of course there will always be buggy generators, but what's the point of writing one if you can't interface with proper parsers?

Sadly, incompatibilities happen, like with SOAP, but RDF is stable and well defined as of right now. We should not worry about the amateur writing shitty markup, but about the big vendors that have the actual power to turn an incompatibility into a semi-standard.

Evert, what you're suggesting is that future tools will generate valid syntax, yet past experience has proved this wrong (people said that about HTML, and then XML). What is really needed is more tools that can read invalid syntax.

TCP/IP works because geeks such as myself have agreed to try to follow the rules. But once a technology grows beyond the geekosphere, it's unreasonable to assume that it can remain syntactically valid.

I missed the hype, which is what I'm sure it is. I thought that was what Web 2.0 was all about.

I think that systems, and users on systems, that produce broken code will be increasingly invisible, or rather unfindable and unsearchable. I have a feeling the usefulness of the semantic web will aid its own proliferation, and you just won't hear about those who do not conform unless they shout and spend lots of cash getting themselves noticed.


Your arguments are flawed.

As stated in an earlier comment, much of the data published is formatted automatically using tools, which eliminates much of the invalid use of markup.

You also equate invalid markup with not being able to determine if a person is lying. By any stretch, there's no way to see the logical connection between these two. And no one promised the semantic web would be a mind reader.

Shelley, the fact that much of the data will be formatted by tools doesn't mean it will be valid. People made the same arguments about HTML, but for years we've dealt with web authoring tools that generate invalid markup. And of course, right now we're dealing with tons of invalid RSS feeds despite similar arguments that we could rely on tools to generate well-formed XML files.

Also, I wasn't trying to equate invalid markup with being able to determine whether a person is lying, but I can see how my imprecise writing could be interpreted that way. I just think that the Semantic Web assumes too much about the quality and reliability of data.

Don't count on Google to validate anything. It's not in their genes.

Have you ever run a validator over the pages they produce??

Ok, my turn -

re. "The Semantic Web may happen, but if it does, it's going to be a helluva lot messier than the architects would like."

I believe it's starting to happen, and it certainly is messy.

Evert got a key point in early - there's all this stuff already in databases, moved around by software. Why should expressing it in a slightly different fashion make it any the less reliable?

After having to trawl through thousands of feeds, dealing with all the 'intricacies' (too polite) to 'sanitise' them for presentation, I have to heartily agree.

Call the XML Police!! ;)


What an interesting report about Web 3.0.

The whole world is talking about Web 2.0 and Bubble 2.0, because nobody knows exactly what Web 2.0 actually means.

Amid all this confusion, I read an article about Web 3.0 in a German newspaper a few days ago. It was very amusing.

Best wishes from Germany

Could not resist sharing this link, which I read just after Nick's.
Seems Nick has a point.

I've had this out with some semwebbers recently. There's an entire layer missing in the semantic web, which is reverse engineering structured information out of semi-structured and ill-formed nonsense. You're right: the semweb is making lab-level assumptions about data quality that don't hold up for minutes in the field. It's a very GOFAI way to think. The winners here will be those who parse at any cost.
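
To make "parse at any cost" a bit more concrete, here is a minimal sketch in Python of the strict-then-lenient approach. It is only an illustration, not how any particular aggregator works: the function name extract_titles and the helper class are made up for this example, and it only looks for plain, un-namespaced title elements. It tries a strict XML parse first and, if the feed is not well-formed, falls back to the standard library's forgiving HTML tokenizer to salvage what it can.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Hypothetical fallback: collects the text inside every <title> tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.titles = []
        self._buf = None  # None means we are not inside a <title>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._buf is not None:
            self.titles.append("".join(self._buf).strip())
            self._buf = None


def extract_titles(feed_text):
    """Return (titles, mode), where mode is 'strict' or 'lenient'."""
    try:
        # Strict path: the feed must be well-formed XML.
        root = ET.fromstring(feed_text)
        titles = [el.text.strip() for el in root.iter("title") if el.text]
        return titles, "strict"
    except ET.ParseError:
        # Lenient path: tokenize without enforcing well-formedness.
        collector = TitleCollector()
        collector.feed(feed_text)
        collector.close()
        return collector.titles, "lenient"


if __name__ == "__main__":
    # Unescaped ampersand and a missing </item> and </rss>: not valid XML.
    broken = "<rss><channel><title>A & B</title><item><title>C</title></channel>"
    print(extract_titles(broken))
```

The trade-off is the one this thread keeps circling: the lenient path recovers something from junk, but it can no longer promise that what it recovered is what the publisher meant.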

Shelley: "Your arguments are flawed."

Really? Look around you. We're drowning in junk markup.

Danny: "there's all this stuff already in databases, moved around by software. Why should expressing it in a slightly different fashion make it any the less reliable?"

This doesn't make any sense: it already *is* unreliable. Data in databases is engineered or validated to be reliable, sure. Yet lots of the malformed junk on the web comes straight from a DB.
