Categories
RDF Writing

Ooo boy

I sent a note to the RDF Interest Group today, telling them about the draft of the book. I’ve already received some excellent feedback.

Some people climb mountains. Others scale rock cliffs, or dive the deepest depths of the ocean. Still others race cars at 180 MPH, ride bulls, or sail across the ocean in a dinghy.

Me? I write a book about a specification that’s the combined genius of several really scary-smart people, most of whom, if not all, are PhDs, and then throw the rough draft into their midst, in its unpolished, unedited, defenseless nakedness.

I win.

Categories
RDF Writing

It feels so good when you stop

I am so burned out from the push to finish the draft this last week. It got to the point that I was coding PHP into a Java class, and I kept looking at some Python, trying to figure out why it looked funny (it’s Python, it’s supposed to look funny).

And then I had to install .NET to finish the review of the C# API, and that hosed my W2K system up for a bit.

Have I mentioned how grateful I am for such a patient editor? Simon St. Laurent is every tech writer’s dream. And Dorothea Salo has been my content and “interested but RDF naive” tech editor, and has been doing a splendid job. I’d link to them, but even that’s too much technology at the moment.

Dorothea asked me if I was upgrading to Movable Type 2.6. I’m not sure if I replied to her email (Dorothea, if I didn’t, sorry, but this last week has been a mess.) However, I think I’m at my limit of tweaking right at this moment. The thought of going out to my server and playing around with Perl modules, well, it makes me want to stick my head in a snow bank outside, and just leave it there.

It seriously does.

Categories
RDF

Grand idea #102

Thanks to Sean McGrath I found out about this discussion thread over at the Tag Soup discussion forum.

It starts off with Tim Berners-Lee basically asserting that a URI represents a web page, or at least a physical resource:

We know. Your “resource” is a vague thing which can be
a robot or a web page. That vagueness makes a system
which is less useful than a system in which the URI
identifies specifically the web page.

Luckily Tim Bray came along and hollered “Hold on Partner!”:

 

Once again, no matter how hard I try, it’s easy to believe that XML
Namespaces are resources, but really hard to believe that they’re web pages.

Concluding notes:
(a) In both of my examples, the resources identified by the URI map
fairly nicely onto the actual meaning of the English word “resource” –
one of Antarctica’s maps is a resource in human-speak (that’s why
people pay for the software), and if an XML Namespace (typically a
pre-cooked XML vocabulary with pre-cooked semantics) isn’t a resource
as the word is normally used, I don’t know what is. My point is not
only is the Fielding formalism useful to programmers and
self-consistent, the terminology is useful to ordinary people.

(b) In my vision of the semantic web, it makes all sorts of sense to
package up RDF assertions about Antarctica’s maps or XML namespaces and
these could be really useful without pretending, against the evidence,
that either kind of URI actually points at a “web page”.

In RDF/XML, URIs are used to identify a specific resource, but there is no assumption that this resource is actually accessible on the web as a hard and stable entity.
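
A minimal sketch of that idea in Python, with statements held as plain (subject, predicate, object) tuples. The `example.org` URIs and the `describe` helper are made up for illustration; only the Dublin Core title URI is a real vocabulary term. Nothing here ever fetches a URI: each one is used purely as a name to hang assertions on.

```python
# Hypothetical URIs: none of these is assumed to resolve to a web page.
MAP = "http://example.org/maps/antarctica#map42"
NAMESPACE = "http://example.org/ns/vocab#"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# An RDF statement is a (subject, predicate, object) triple.
triples = [
    (MAP, DC_TITLE, "A map of Antarctica"),
    (NAMESPACE, DC_TITLE, "An XML namespace"),
]


def describe(subject, graph):
    """Collect every property/value pair asserted about a subject URI."""
    return {p: o for s, p, o in graph if s == subject}


# The URI is compared as an opaque string; no network access is involved.
print(describe(MAP, triples))
```

Whether either URI could be dereferenced makes no difference to the statements made about it.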

After reading a bit of this discussion thread, my head is bleeding, too, primarily because of my work with RDF, the darling daughter of the W3C. Seems to me that Tim BL’s interpretation just kicked dearest in the butt.

I really respect the W3C people, but this is frustrating. There does seem to be more and more of a disconnect between the W3C folks and us out here in the world trying to use the W3C products. I respect the W3C because without this organization, you wouldn’t be reading this. But at the same time, I kind of wish they would give their brains a rest every once in a while. I imagine Mark Pilgrim feels that way, too, because of XHTML 2.0 (which isn’t open source, BTW, Mark).

What’s needed at the W3C is someone to come in and keep these white-coat people in line. Someone experienced, practical, down to earth, and very easy going. I think I fit this description, ahem, and it just so happens that I’m looking for a job. Fancy that.

Yes, that’s my grand idea for today. I think the W3C should hire me as the Enforcer: the person who goes around and whacks a white coat in the head every time they start to get a Grand Idea.

What think?

Categories
RDF Writing

Final TOC and home stretch

Next week I’m delivering to my editor the complete first draft of “Practical RDF” for O’Reilly. Yeah, finally. No one has seen the complete TOC, including the tools, APIs and whatever used in the book and I thought I would provide a heads up before the book is released for public review.

If you’re interested, the TOC is duplicated below. If you have concerns about the technology used, are curious as to why I’m covering one tool over another, or have suggestions about tools/APIs/topics you feel I should have covered, please leave a comment or send me an email.

Once my editor has looked the book over, I’ll be posting OpenOffice versions of each chapter for chapter-by-chapter review at http://rdf.burningbird.net.

The book ended up featuring over 50 different tools and APIs, in eight different languages (Perl, PHP, Python, C, LISP, Java, C#, and even JavaScript), on three different databases; most of the APIs and tools are currently in alpha/beta state, not to mention the RDF spec itself, now heading towards last call. This was a challenging and rather frustrating experience at times.

Grr.

But, most of the tools and APIs were freely given and open source, supported by people who want nothing more than to provide nifty technologies for people to use.

Grr-eat.

TOC:

 

Chapter 1. Introduction
This chapter will introduce the book, as well as provide a brief history of RDF, including current efforts as of the date of the first draft of the book.

What exactly is RDF?
A Brief History
RDF and the Semantic Web
Current Specification Efforts
The RDF Specifications
When to use and not use RDF
RDF Controversies
Related Technologies
The RDF Primer

Chapter 2. RDF: Heart and Soul
Focuses on the Concepts and Semantics specifications

The Search for Knowledge
The RDF Triple
The RDF Graph
The URI
RDF Serialization: N-Triples
Datatypes
Talking RDF: Lingo and Vocabulary
Sub-Graphs
Graph and Not Ground
Entailment
Equality
Assertions

Chapter 3. Basics of RDF/XML
The major elements of the RDF syntax are introduced and discussed. Covers the syntax and test cases docs

Serializing RDF to XML
Nodes
Stripped Syntax
Properties
URIs, Qnames, and Abbreviations
The Type Property
RDF Blank Nodes
More on RDF Data Types
RDF Shortcuts
The RDF Test Cases

Chapter 4. Specialized RDF Relationships: Reification, Collections, and Containers
More complex constructs with some semantic challenges.

RDF Containers
Basic Container syntax
Typed node emulation
RDF Collections
What Containers and Collections ‘mean’
Reified Statements
An Example of Reification
The Necessity of Reification and Higher-Order Statements
A Shorthand Reification Syntax
Why Big Ugly?
Why Reify?

Chapter 5. Important concepts from the RDF Vocabulary
The RDF Schema provides the roadmap to creating an RDF vocabulary. The “rules” are covered, with examples to clarify the more complex topics.

RDF Schema: Defining the Metadata
Metadata’s Role in Existing Applications
RDF Schema: Metadata Repository
Core RDF Schema Elements
Overview of the RDF Core Classes
Demonstrations of the RDF Core Classes
Refining RDF Vocabularies with Constraints
RDF Schema Alternatives

Chapter 6. Defining RDF Data Schemas
This chapter provides coverage of defining a custom vocabulary for RDF. Discussion will also cover PICS, as an example, as well as other examples.

What do we mean by Vocabulary
Defining the Vocabulary Business and Scope
Defining the Vocabulary Elements
The PostCon Elements
Prototyping the Vocabulary
Adding in Repeating Values and a container
Formalizing the Vocabulary with RDFS
Another Example: The Dublin Core
An overview of the Dublin Core MetaData Element Set
Dublin Core in RDF/XML
The Qualified Dublin Core elements
Mixing Vocabularies
Using DC-dot to generate DC RDF
When Precision isn’t enough

Chapter 7. Ontologies: RDF Business Models
Why Ontology?
DAML+OIL
RDF and OWL

Section II – RDF Tools
Now that we know what it is, how can we work with it?

Chapter 8. Merging RDF with Other Technologies
Using RDF with other applications.

RDF and Links
RDF and SOAP
Generating RDF with XSLT
RDF and UML
RDF and SVG

Chapter 9. Editing, Parsing, Generating, Converting, and Browsing RDF
Browsers
BrownSauce
Parsers
ARP
Raptor RDF/XML Parser
ICS-FORTH Validating RDF Parser
Javascript RDF Parser
Wilbur
Editors
SMORE — Semantic Markup, Ontology, and RDF Editor
RDF Editor written in Java
Converters
Grove’s ConvertToRDF
Convert RDF to iCalendar (Dan Connolly) – RDF Calendar task force
DMOZ RDF Parser for MySQL

Chapter 10. Jena: A Java-Based RDF API
Overview of the Classes
The Underlying Parser
The Model
The Query
The Iterators
DAML+OIL
Creating and Serializing a model
Very Quick Simple Look
Encapsulating the Vocabulary in a Java Wrapper Class
Adding in more complex structures
Creating a Typed node
Creating a container
Parsing and Querying an RDF Document
Just doing a basic dump
Accessing specific values
In Memory versus Persistent Memory Model Storage
A Brief look at DAML+OIL in Jena

Chapter 11. RDF and the Three P’s

RDF/XML and Perl
Ginger Alliance PerlRDF
Model Persistence and Basic Querying
Serializing RDF/XML
Examining the Schema
RDFStore
The PHP XML RDF Classes
RDF-API
Class Overview
Creating an RDF Model
Parsing and Querying an RDF Model
PHP Classes for XML
Class overview
Rdql
Persistent RDF – rdql db
Python Support
RDFlib
Building a basic Model and Serializing
Parsing a model and queries
TripleStore and ZODB

Chapter 12. Querying RDF: RDF as Data
Basic relational syntax of RDF query languages
Querying with Jena
The Query Language
RDF Query-o-Matic
Querying with PHP
The Query Language
RDF Query-o-Matic light
Inkling–Querying RDF Data using SquishQL
Sesame
RDF Server (rdftp)
Versa RDF Querying Language

Chapter 13. A Brief look at other RDF Application Environments
Whatever works with XML, works with RDF/XML
Overview of Redland, a multi-language RDF Framework
Working with the Redland Framework
Redland’s language du jour – C
Using the Language APIs
Perl and Python
Redfoot
RDF and .NET
C# RDF Parser
4Suite

Section III – RDF Goes to Work
We know what it is, we know how to use it, now list some of the uses.

Chapter 14. Subscription and Aggregation with RDF/RSS (RSS 1.0)
This chapter focuses on RSS, including how to expose content through Userland and other sources. The chapter also covers Meerkat.

RSS: A quick History
RSS 1.0: A quick introduction
A Detailed Look at the Specification
Channel
Title, Link, Description
Items
Image
Textinput
Item
Extending the Specification through Modules
The RSS Modules
Core: Dublin Core, Syndication, Content
Extended
Brief look at RDF/RSS Aggregators
AmphetaDesk
Meerkat
Aggregating on a Mac
Creating your own RDF/RSS Content
(RDF/RSS isn’t only for news feeds)
Build your own RDF/RSS Consumer
PHP – using an XML API
Python – using an RDF API
Java – using a specialized RSS API
Perl – Ditto
Validating and Converting to RDF/RSS

Chapter 15. Mozilla: User Interface Development with XUL and RDF
Covers Mozilla’s use of RDF to process template data within XUL. Strong enough and significant enough to leave as a separate chapter.
The Concepts behind XUL
A Brief Review of the XUL User Interfaces
Dynamic Table of Contents using XUL/RDF
Nested TOC Data

Chapter 16. A World of Other Uses
FOAF: Friend-of-a-Friend
DMOZ Directory Outputs and the DMOZ parser
RDF Gateway, a commercial RDF Database
Chandler: RDF within an Open Source PIM
RDF and Adobe: XMP
Creative Commons license
Tucana KnowledgeStore (TKS)
A look at the RDF projects underway at Sourceforge

Appendix A. A Detailed Look at the RDF Grammar

Get permission from W3C to duplicate the RDF Grammar and productions

Appendix B. RDF Resources

URLs and notes to as many RDF resources as we can scrape together

Categories
Semantics

Good Enough

Recovered from the Wayback Machine.

Mark Pilgrim does not believe in the Semantic Web. He believes Semantics is hard; that the syntax for the Semantic Web is laughably complex. Mark wants to stay with the “…simple but relatively well-defined semantics of HTML.”

HTML is good enough for Mark, and I say that’s great, because no one wants to force the Semantic Web on Mark.

But HTML is not ‘good enough’ for me. HTML has pre-defined elements and I can’t add to these. HTML comes with a lot of baggage from the past, and I don’t want this. And HTML is primarily about presentation, and I’m not necessarily interested in this outside of my own web pages. Don’t mistake me: I’m not out to re-create the world, or provide tools that allow one to cut through the bullshit and drill directly to the truth. All I want is a way of defining data that is consistent, using a commonly occurring syntax with pre-existing tools that can parse that syntax.

I’ve worked with data since day one of my professional life. I wrote applications that traversed billions of lines of code from Peace Shield in order to populate a data dictionary. I was lead Information Repository modeler for Boeing Commercial. I helped the old Oracle Case tools people design their products. I’ve worked with PDES and POSC and other organizations, to find a way to define data so that it was interoperable between organizations without having to re-negotiate protocols. And I was looking for a magic interoperability protocol long before the web. It started with EDI, but EDI wasn’t good enough.

SGML didn’t work because, bluntly, we didn’t think about using it. HTML didn’t work because HTML was/is about web pages. XML didn’t work because there was no meta-data structure associated with the markup language. Even within RDF there are other serialization formats that aren’t ‘good enough’ for me. Mark points to Aaron Swartz’s RDF Primer that focuses on N3 notation. Aaron really doesn’t care for RDF/XML; N3 notation is ‘good enough’ for Aaron. But it’s not good enough for me.

RDF/XML, with its metadata structure (RDF) paired with a common syntax (XML), is a start on being ‘good enough’ for my needs.
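
As a rough illustration of the "pre-existing tools" point, here is a sketch that reads a tiny RDF/XML document with nothing but the stock XML parser in Python's standard library. The document, its `example.org` subject URI, and the `simple_triples` helper are all invented for the example, and the helper understands only the plainest `rdf:Description` form with literal property values. It is nowhere near a conforming RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A made-up RDF/XML document using two Dublin Core properties.
DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/articles/good-enough">
    <dc:title>Good Enough</dc:title>
    <dc:creator>A. Writer</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""


def simple_triples(xml_text):
    """Return (subject, predicate, literal) tuples from basic rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    found = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells a qualified name as "{namespace}local";
            # dropping the braces gives the full property URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            found.append((subject, predicate, (prop.text or "").strip()))
    return found


for triple in simple_triples(DOC):
    print(triple)
```

The XML parser does the syntax work; the RDF model is what says each child element is a statement about the subject.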

The point, though, is that for each of us there are technologies that aren’t ‘good enough’, and you spend your time finding ways to improve or expand or correct the technology until it is ‘good enough’. To Mark, this is improving how we use HTML, which is commendable. But to me, it’s finding ways to use RDF/XML and in the process explain RDF/XML so that others might also find some uses for it. I hope this is also seen as commendable.

Mark’s discussion about the Semantic Web and HTML is a response, in part, to Dare Obasanjo, who writes:

 

Given that the W3C thinks XML is the basis for RDF and the Semantic Web it seems the general direction going forward is to move towards replacing a WWW full of HTML documents to one full of XML documents.

If you are for the Semantic Web, you are for an XML Web not for an HTML one.

 

(I sometimes think that the W3C is its own worst enemy. So many noble goals, based on so many impracticable ideas. We keep telling them and telling them: webbies just want to have fun, but they keep pushing back with the search for truth, and a better way of life.)

Reading Dare’s comment, I can see why Mark feels that technologies such as RDF/XML are being pushed on him. I can see why he pushes back with:

 

RSS 0.91 is the simplest and most popular of all the RSS formats, it’s one of the simplest XML-based formats you’ll ever find, and 10% of the world’s RSS feeds are still invalid—mostly due to XML formatting rules (escaping ampersands, character encoding issues) that aren’t even RSS-specific. And you want to “move towards replacing a WWW full of HTML documents to one full of XML documents”? Are you sure? Because realistically, all you’ll manage to do is replace a morass of bloated, poorly written, invalid HTML documents with a morass of bloated, poorly written, invalid XML documents. And to tease any meaning at all out of these “semantic” documents, you’ll spend your days writing ultra-liberal parsers to parse invalid XML (or, God help you, invalid RDF/XML), and you’ll spend your nights and weekends decrying “the new generation of tag soup” on XML-DEV.

 

Dare’s comment, and the W3C’s esoteric ideals aside, isn’t that what the move towards XHTML is all about? Moving towards valid and well written XML documents that are based on the HTML vocabulary? Isn’t that the whole point of technologies such as XHTML and CSS: to replace those …bloated, poorly written, invalid HTML documents? To realize the full potential that started with HTML, before we got sloppy?
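
The escaped-ampersand problem Mark mentions is easy to reproduce. A small sketch, assuming a made-up feed item title and a hypothetical `is_well_formed` helper built on Python's standard XML parser:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Report whether the standard XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


title = "Tag Soup & Other Stories"  # made-up feed item title

# Dropping raw text into markup leaves a bare "&", which XML forbids.
broken = f"<item><title>{title}</title></item>"

# escape() rewrites &, < and > as entity references before the text is inserted.
fixed = f"<item><title>{escape(title)}</title></item>"

print(is_well_formed(broken))  # False
print(is_well_formed(fixed))   # True
```

One call at the point where text meets markup is the difference between a feed that parses and one that needs an ultra-liberal parser.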

Innovation and improvements in technology don’t come about because technology is ‘good enough’. They come about because technology is full of holes and no matter what we do we’ll never plug all of them. But we’ll keep trying and we’ll keep improving and in the process, we’ll discover new and exciting technologies, and we’ll start the process all over again.