Categories
RDF Semantics

And it jiggles, too

I’ve been playing with mash-ups lately for the book, and at one point had to slap myself in the face to make myself stop: Stop! Stop! Not another service!

straup at Flickr’s announcement of “machine tags” is significant, because, as he demonstrated, it really is the same as RDF, except without the scary name (and we’ll shoot the first person who mentions reification). Of course, now I’m looking at my mash-up examples for the book and thinking, like jello, there’s always room for more.

Speaking of integrating services and data, I still like RDF as XML. I can do things with it, such as load it into the browser’s XML parser and manipulate the data using DOM methods. Unfortunately, I have to copy the RDF file, such as Dan Brickley’s FOAF file, to my home directory before using Ajax–it’s not packaged correctly for cross-domain browser access. It wouldn’t be difficult to package any RDF/XML source as an endpoint for cross-domain access, leaving aside issues of trust.
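
For the curious, here’s roughly what that looks like in practice. A sketch only: the file name is a made-up local copy of a FOAF file, and it assumes a browser with a native XMLHttpRequest and a namespace-aware DOM:

    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/foaf.rdf", true);   // hypothetical local copy of a FOAF file
    xhr.onreadystatechange = function () {
      if (xhr.readyState === 4 && xhr.status === 200) {
        // responseXML is populated as long as the file is served with an XML content type
        var doc = xhr.responseXML;
        var FOAF = "http://xmlns.com/foaf/0.1/";
        var names = doc.getElementsByTagNameNS(FOAF, "name");
        var found = [];
        for (var i = 0; i < names.length; i++) {
          found.push(names[i].firstChild.nodeValue);
        }
        alert("foaf:name values found: " + found.join(", "));
      }
    };
    xhr.send(null);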

Danny, who points to a nice semantic/scripting challenge (but…iPod?), asked about RDF Turtle notation and Ajax, and sure, you could use Turtle in XMLHttpRequest (XHR) requests, or as endpoints with dynamic scripting. All you have to do is either return it as text for XHR, or as a valid parameter in an endpoint (wrapped in a function call, and used with dynamic scripting). What we need is a transformation between Turtle and JSON, so an endpoint can return Turtle formatted as JSON (and we have it). But I like RDF/XML because I can just cram it into the browser’s parser and use the DOM. Either XML or JSON works for me.
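
The XHR half of that is about as small as Ajax gets. A sketch, with a made-up same-domain Turtle file standing in for a real endpoint:

    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/data/foaf.ttl", true);   // hypothetical same-domain Turtle file
    xhr.onreadystatechange = function () {
      if (xhr.readyState === 4 && xhr.status === 200) {
        var turtle = xhr.responseText;   // the raw Turtle, as a plain string
        // from here, hand the string to a Turtle parser, or have the server
        // do the Turtle-to-JSON conversion and work with an object instead
        alert(turtle.substring(0, 200));
      }
    };
    xhr.send(null);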

The Flickr API’s “machine tags” work, too, basically flattening triples and squeezing them skinny thangs into a JSON response. The API provides an endpoint, too, so that you can call it from the browser. If you’re as curious as I am about who would use the dc: namespace at Flickr, click the dc: button in the example page I linked earlier, and you’ll see the most recent cases. From the pictures, it looks pretty much like everyone.
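
If you want to poke at the machine tags search yourself, something like the following should do it from the browser, via dynamic scripting. Treat it as a sketch: the API key is a placeholder, and the machine_tags wildcard syntax is as I read it in the Flickr API docs, so double-check there before leaning on it.

    // callback the Flickr endpoint will wrap the JSON response in
    function showPhotos(response) {
      if (response.stat !== "ok") return;
      var photos = response.photos.photo;
      var titles = [];
      for (var i = 0; i < photos.length; i++) {
        titles.push(photos[i].title);
      }
      alert(titles.join("\n"));
    }

    var script = document.createElement("script");
    script.src = "http://api.flickr.com/services/rest/" +
      "?method=flickr.photos.search" +
      "&api_key=YOUR_API_KEY" +                                // placeholder key
      "&machine_tags=" + encodeURIComponent("dc:subject=") +   // any photo with a dc:subject
      "&format=json&jsoncallback=showPhotos";
    document.getElementsByTagName("head")[0].appendChild(script);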

Let me say that in the crowded field of photo services, Flickr just got all pretty and sparkly, and is still *Queen of the game.

Sparkly…sparqly…say…that gives me an idea

Yup. There’s always room for more.

Update: Try out the end button, which pulls in the dc:subject values from my RSS 1.0 file. Click on an option, and it searches for all the matching photos in Flickr. Of course, I’m the only one who has used dc:subject with Flickr…still.

Quick note: The example application I linked works on most browsers, but this is just a quick hack, for fun. It hasn’t been heavily tested beyond my own playing around, nor have I optimized the code. I haven’t tried it on IE 6.x or IE7 yet, me having ‘fun’ being the operative concept in this paragraph.

Bonus points: Kingsley Idehen: SPARQL, Tagging, Ajax,…

*What, you thought I was going to say King? Don’t know me well, do you?

Categories
Semantics

Honest Cruft

When I went looking for a FOAF file to copy for my playing around with Ajax, RDF, Flickr, and so on, I immediately thought of Dan Brickley’s FOAF file, and once I had copied it locally, I just plugged it into my application, without validating the RDF/XML first. I did so with confidence because I knew that, if there was one FOAF file guaranteed to be cruft-free, it was Dan’s.

There’s more to ‘trust’ on the internet than is covered by OpenID: a person can create cruft and still be honest. What we need, as the number of services and data endpoints expands, is a way of attaching trust to the quality of a service–not to mention trust as to whether the service can be hacked, putting us at risk when we use its data in our Ajax applications.

I have a great deal of trust in Flickr, but even when I was working on the book, one of their services went out, just for a few minutes, just as I was testing something. Still, I knew it would come back. Why? It was Flickr. The entire site would most likely be taken down before the API would be stripped–or Stewart Butterfield would be fired before he’d let it be stripped.

This is a measure of trust associated with how long a service will be available. If a service is pretty stable, such as Google Maps, or Flickr, or others of that sort, we can integrate it more heavily into our work. However, if the company is a startup, or in trouble financially, well then, we’d better keep any integration at a surface level, ready to cut loose at any moment.

There are issues associated with whether a service was meant for internal or external access. The recent del.icio.us JSON endpoint service, the Tagometer, wasn’t necessarily meant for completely open-ended use. I’m sure the organization won’t yank it, but…I’d only moderately integrate it into my applications, and keep a replacement handy.
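
By ‘keep a replacement handy’ I mean something like the following: wrap the remote call behind a local function with a timeout, and fall back to canned data if the service never answers. Everything here is hypothetical (it is not the actual Tagometer API), just the shape of the defense:

    function loadTagCounts(callback) {
      var answered = false;

      // our callback name, handed to the remote endpoint
      window.handleTagData = function (data) {
        answered = true;
        callback(data);
      };

      var script = document.createElement("script");
      script.src = "http://example.com/tag-service?callback=handleTagData"; // invented URL
      document.getElementsByTagName("head")[0].appendChild(script);

      // if the service is gone or slow, use a canned local fallback instead
      setTimeout(function () {
        if (!answered) {
          callback({ counts: {}, source: "local-fallback" });
        }
      }, 5000);
    }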

How about ads? Payment? Google has always kept the door open for adding ads to Maps, but the company has said it would provide several days’ notice. Still: if our mashups, widgets, what have you, become dependent on Google Maps, what happens when the ads drop in?

We’ve focused so much on people and trust that we’ve forgotten how much we’re putting our applications, our widgets, our web sites, and even our businesses at risk because of the services and data we’re tying into. What we need is an OpenID for data services: can this data be trusted, is this data trustworthy, is this data coming from the correct spot–hey, is this company going belly up? Does it have dangerous elements? Perhaps what we need is a trust scale we can apply to a service, to determine how much we want to depend on it. ProgrammableWeb has a rating system, but let’s face it, that’s more a ranking of the ‘coolness’ factor than of stability, trust, and general warm and fuzziness.

Then there is the issue of our service requests: how about a ‘signature’ we can attach to our requests? Hi, this is Shelley passing through. No worries, I’m not a spammer. Looks like this one has been asked at the OpenID forum. It would be nice to have an API key that I could use with all services. More importantly, though, I’d like to establish a level of trust, so that when I hammer a service, those who are monitoring it can see it’s only me, and I wouldn’t hurt a fly.

Categories
JavaScript RDF

To JSON or not to JSON

Recovered from the Wayback Machine.

Dare Obasanjo may be out of some Ajax developers’ spheres…actually, *I’m probably out of most Ajax developers’ spheres…but just in case you haven’t seen his recent JSON/XML posts, I would highly recommend them:

The GMail Security Flaw and Canary Values, which provides some sound advice for those happily exposing all their vulnerable applications to GET requests with little consideration of security. I felt, though, that the GMail example was way overblown for the consternation it caused.

JSON vs. XML: Browser security models. This gets into the cross-domain issue, which helped increase JSON’s popularity. Before you jump in with “But, but…” let me finish the list.

JSON vs. XML: Browser Programming Model, on JSON being an easier programming model. Before you jump in with “But, but…” let me finish the list.

XML has too many Architect Astronauts. Yeah, if you didn’t recognize a Joel Spolskyism in that title, you’re not reading enough Joel Spolsky.

In the comments associated with this last post, a note was made to the effect that the cross-domain solution that helped make JSON more popular doesn’t require JSON. All it requires is wrapping whatever data is returned in the given callback function call. You could pass any number of parameters in any number of formats, including XML, as long as it’s framed correctly as a function parameter list.
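
To make that concrete: a hypothetical endpoint could return an XML string wrapped in the callback, and JSON never enters into it. A sketch (the endpoint and element names are invented, and DOMParser means this particular version won’t run in IE):

    // suppose the endpoint replies with:
    //   handleData("<items><item>first</item><item>second</item></items>");
    function handleData(xmlString) {
      // turn the string back into a document and use the DOM as usual
      var doc = new DOMParser().parseFromString(xmlString, "application/xml");
      var items = doc.getElementsByTagName("item");
      alert(items.length + " items returned");
    }

    var script = document.createElement("script");
    script.src = "http://example.com/service?format=xml&callback=handleData";
    document.getElementsByTagName("head")[0].appendChild(script);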

As for the security issues, JSON has little to do with those, either. Again, if you’re providing a solution where people can call your services from external domains, you’d better make sure you’re not giving away vital information (and that your server can handle the load, and that you ensure some nasty bit of text can’t get through and cause havoc).

I’ve seen this in multiple places, so apologies if you’ve said this and I’m not quoting you directly, but one thing JSON provides is more efficient data access than many browsers’ XML parsers. Even then, unless you’re making a lot of calls, with a lot of data, and for a lot of people, most applications could use either JSON or XML without any impact on the user or the server. I, personally, have not found the XML difficult to process, and if I wanted really easy data returns, I’d use formatted HTML–which is another format that can be used.
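
To show the difference in miniature, here’s the same made-up record accessed both ways; both work, one is simply less typing (the field names are invented for illustration):

    // JSON: the callback already hands us a JavaScript object
    function fromJson(data) {
      alert(data.photo.title + " by " + data.photo.owner);
    }

    // XML: same data, pulled from the responseXML document of an XHR request
    function fromXml(doc) {
      var photo = doc.getElementsByTagName("photo")[0];
      alert(photo.getAttribute("title") + " by " + photo.getAttribute("owner"));
    }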

You could also use Turtle, the newly favored RDF format.

You could use comma separated values.

You could use any of these formats with either the cross-domain solution, or using XMLHttpRequest. Yes, really, really.

As was commented at Dare’s, the cross-domain issue is not dependent on JSON. HOWEVER, and this one is worthy of capitals: most people ASSUME that JSON is used, so if you’re not returning JSON, you’d better make sure that a) people can choose the return format (which is the superior option), and/or b) people are aware that you’re not using JSON by default with your callback functions.

As for using JSON for all web service requests, give us a break, mate. Here’s a story:

When the new bankruptcy laws were put into effect in the year 2005, Congress looked around to find some standard from which to derive ‘reasonable’ living costs for people who have to take the new means test. Rather than bring in experts and ask for advice, their eyes landed on the “standards of living expenses” defined by the IRS to determine who could pay what on their income tax.

The thing is, the IRS considers payment to itself to be probably about as important as buying food, and more important than paying a doctor. The IRS also did not expect that its means test would be used by any other agency, including Congress, to define standards for bankruptcy. The IRS was very unhappy when it discovered as much.

In other words, just because it ‘works’ in one context doesn’t mean it works well in all contexts: something that works for one type of application shouldn’t be used for all types of applications. Yes, ECMAScript provides data typing information, but that’s not a reason to use JSON in place of XML. Repeat after me: JavaScript/ECMAScript is loosely typed. I’m not sure I’d want to model a data exchange with ‘built-in typing’ based on a loosely typed system.

Consumers of JSON or XML (or comma separated values for that matter) can treat the data they receive in any way they want, including parsing it as a different data type than what the originator intended. Yes, JSON brings a basic data typing, and enforces a particular encoding, but for most applications, we munge the returned data to ensure it fits within our intended environment, anyway.
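
That munging usually looks something like this: a small, hypothetical sketch of coercing whatever comes back into what our own code expects, no matter how the sender typed it.

    function normalize(response) {
      return {
        count: parseInt(response.count, 10) || 0,     // "42" or 42 both become the number 42
        updated: new Date(response.updated),          // a string becomes a Date
        tags: String(response.tags || "").split(",")  // whatever it was, now it is an array
      };
    }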

What’s more important to consider is: aren’t we getting a little old to continually toss out ‘old reliables’ just because a new kid comes along? I look at the people involved in this discussion and I’m forced to ask: is this a guy thing? Toss out the minivan and buy the red Ferrari? Toss out the ‘old’ wife for a woman younger than your favorite shirt? Toss out old data formats? Are the tools one uses synonymous with the tools we have?

Snarky joking aside, and channeling Joel Spolsky, who was spot on in his writing: just because a new tech is sexy for its ‘newness’ doesn’t mean that it has to be used as a template for all that we do.

The biggest hurdle RDF has faced has been its implementation in XML. It’s taken me a long time to be willing to budge on only using RDF/XML, primarily because we have such a wealth of tools to work with XML, and one can keep one’s RDF/XML cruft-free and still meaningful and workable with these same tools. More importantly, RDF/XML is the ‘formal’ serialization technique, and there are advantages to knowing what you’re going to get when working with any number of RDF APIs. However, I have to face the inevitable in that people reject RDF because of RDF/XML. If accepting Turtle is the way to get acceptance of RDF, then I must. I’d rather take another shot at cleaning up RDF/XML, but I don’t see this happening, so I must bow to the inevitable (though I only use RDF/XML for my own work).
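
For anyone who hasn’t seen the two side by side, here’s a single made-up statement, first in RDF/XML (the resource URL is invented; the namespaces are the standard RDF and Dublin Core ones):

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://www.example.com/photos/1">
        <dc:subject>osprey</dc:subject>
      </rdf:Description>
    </rdf:RDF>

and then the same statement in Turtle:

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    <http://www.example.com/photos/1> dc:subject "osprey" .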

We lose a lot, though, going with Turtle. We lose the tools, the understanding, the validators, the peripheral technologies, and so on. This is a significant loss, and I’m sometimes unsure if the RDF community really understands what it’s doing by embracing yet another serialization format for yet another data model.

Now we’re doing the same with JSON. JSON works in its particular niche, and does really well in that niche. It’s OK to use JSON; no one is going to think we’re only web developers and not real programmers if we do. We don’t have to make it bigger than it is. If we do, we’re just going to screw it up, and then it won’t work well even within that niche.

Flickr and other web services let us pick the format of the returned data. Frankly, applications that can serve multiple formats should do just that, and let people pick the one they use. That way, everyone is happy.

Ajaxian: Next up: CSV vs. Fixed Width Documents. *snork*

*most likely having something to do with my sense of humor and ill-timed levity.

Categories
RDF

May the source be with you

Danny, suffering from a cold leading to procrastinitis (I hear you on this one), latched on to a port of the WP 2813 theme to MT, LiveJournal, and TypePad to create an XSLT transform of the stylesheet items into RDF. This is based on the continuing effort to add more microformat labeling of page contents in order to enhance discoverability.

It’s a nice bit of code, but it strikes me as a less than efficient method when it comes to providing semantic information about the contents of a page.

The processes that deliver the page for human consumption are the same processes that provide the data for syndication. It’s only a small step to then take the same information and provide it in an already formatted RDF document, accessible just by tacking either /rdf or /meta onto the end of the document’s URL.

If the issue is then one of static pages, such as those provided by Movable Type, couldn’t one generate static meta pages just as easily?

I’m not pushing against microformats. To me it makes sense to use ‘intelligent’ CSS class names for the different constructs contained within the page, because it’s more consistent and makes it easier to move templates between tools. Besides, might as well start smart rather than dumb.
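
A small, hypothetical illustration of what I mean, using the hAtom microformat’s class names as both styling hooks and semantic labels (the content and date are made up):

    <div class="hentry">
      <h2 class="entry-title">May the source be with you</h2>
      <div class="entry-content">...</div>
      <abbr class="published" title="2007-02-01">February 1, 2007</abbr>
    </div>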

But shouldn’t the approach be to generate all the content–human-readable content using semantic markup and smart CSS labels, syndication feeds, and RDF–dynamically, rather than generate one and then use XSLT to ‘transform’ it into the other? Or is the bigger issue: let’s all start being consistent with our CSS–make it do double duty. Start bringing presentation, format, layout, and semantics into a cohesive whole.

Of course, I could have completely misread Danny’s intentions, too.

Regardless, I need to clean up my own CSS files. After I finish the Adding Ajax book, first.

Categories
RDF

In front of one’s face

Recovered from the Wayback Machine.

From Planet RDF today, Leo Sauermann points to Zack Rosen who writes of a flawed research/implementation paradigm with regards to RDF. He states that researchers interested in RDF aren’t keeping up with today’s web implementations, such as weblogging software. They’re building ‘widgets’ rather than useful content, and so on.

One specific complaint:

Researchers are not moving at the pace the web is currently developing, instead they are attempting to leap-frog it. A good example of this is the Structured Blogging and Microformats initiatives. Why are semantic web researchers not collaborating with the teams pursuing these projects?

I don’t know that either of these initiatives is going anywhere. The developers behind WordPress are inserting microformats into WordPress, but doing so without the interest or even the compliance of most users of the product. That’s the problem: the semantic web is not an accidental web, and requires some input from the user–not just geek to geek. There’s been little effort to reach beyond the geek with Structured Blogging, and I think that microformats have hit the limit of their reach.

I agree that the inner core of those associated with the semantic web do need to connect with real-world implementations. I think, for the most part, they are attempting to do so. Where the failure is happening is that they want to work on Big things, and change starts small.