Categories
RDF Technology Weblogging

Technology to enable community

Recovered from the Wayback Machine.

Serendipity is such a major component of my life, never more so than when I read Gary’s attempt to manually connect the multiple threads of the whole discussion about Identity.

While I’m on my long journey through distance and time, I’m working on a new application that will provide a means to track cross-blog discussions, such as those my own virtual neighborhood (and others) participate in. The specs for the application are:

 

Project is called Thread the Needle, or “Needley” for short. Its purpose is to track cross-blogging threads.

How it works:

You register your weblog once with an online application I’ll provide (i.e., you provide your weblog location, the name of the weblog, and an email address). Frequently throughout the day, the Needle service bot will visit the weblog looking for RDF (an XML meta-language, used for RSS and other applications) embedded within the weblog page. Note that this may change: the bot might instead scan weblogs.com for registered weblogs that have changed, or be triggered the first time a person clicks the link, or use some other procedure – I’m testing these out as you read this.

For now, the RDF will be generated by the service and copied and pasted into the posting; hopefully, someday it will be generated automatically by the weblogging tools.

The RDF either starts a new weblogging subject thread or continues an existing one. The bot pulls this information in, and when someone clicks on a small graphic/link attached to the posting, a page opens showing all related threads and their association with each other.

Example:

AKMA writes a posting on Identity. Because he starts the discussion thread, he creates and embeds “thread start” RDF/XML into the posting (generated by the tool using a very simple form, with the results cut and pasted into the posting). Included in this RDF are the thread title, a brief description, the posting permalink, the weblog name, and the posting category, selected from a pulldown list.

The generated code also contains a small graphic and link that a person clicks to get to the Needley page. Clicking another small graphic/link opens up a second form for a person who wants to respond to this posting, with key information already filled in.

The posting would look like:

 

This is posting stuff, posting stuff, words, more words more words
more words and so on.

link/graphic to view the Needle thread page,
link/graphic to respond to the current posting

Posted by person, date, comment

 

The embedded RDF is invisible.
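What might that invisible block look like? Here is a rough sketch of the generator, in Python; the needle: namespace URI, the element names, and the trick of hiding the RDF inside an HTML comment are all my assumptions for illustration, since the actual vocabulary isn’t settled yet.

# Hypothetical sketch only: the namespace URI and element names are
# placeholders, not the final Needle vocabulary. Wrapping the RDF in an
# HTML comment keeps it invisible when the posting is rendered.
def thread_start_rdf(title, description, permalink, weblog, category):
    return f"""<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:needle="http://burningbird.net/needle/elements/">
  <rdf:Description rdf:about="{permalink}">
    <needle:threadTitle>{title}</needle:threadTitle>
    <needle:description>{description}</needle:description>
    <needle:weblog>{weblog}</needle:weblog>
    <needle:category>{category}</needle:category>
  </rdf:Description>
</rdf:RDF>
-->"""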

David Weinberger creates his own posting related to AKMA’s, clicks AKMA’s “respond” link, and a form opens with pre-filled fields. He adds his own permalink info and pushes a button; a second page opens with generated RDF that David then embeds into his posting.
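The response RDF would presumably differ from the thread-start block only in pointing back at what it answers. Continuing the earlier sketch, with an invented needle:respondsTo property; again an assumption, not settled vocabulary:

# Continuing the hypothetical sketch: a response block adds a pointer back
# to the posting (or postings) it answers. needle:respondsTo is invented.
def thread_response_rdf(permalink, weblog, responds_to):
    links = "\n".join(
        f'    <needle:respondsTo rdf:resource="{parent}"/>'
        for parent in responds_to)
    return f"""<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:needle="http://burningbird.net/needle/elements/">
  <rdf:Description rdf:about="{permalink}">
    <needle:weblog>{weblog}</needle:weblog>
{links}
  </rdf:Description>
</rdf:RDF>
-->"""

Passing more than one parent in responds_to handles the case, described below, of a single posting that responds to multiple postings.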

Stavros comes along wanting to continue David’s discussion and follows the same process. Jeneane responds directly to AKMA, Jonathon responds to Stavros, Mike responds to David, Steve responds to Jeneane, and AKMA responds to David and Steve, who responds back to AKMA.

The Needle page for this thread shows:

AKMA
  David
    Stavros
      Jonathon
    AKMA
    Mike
  Jeneane
    Steve
      AKMA

Each of the above names is a hypertext link to the discussion posting. Some visual cue will probably be added to assist in the reading of the hierarchy of discussion. (I’ll also work to make sure that this page and its contents are fully accessible.)

If a person is responding to two or more of the threaded postings, they can add the generated RDF for each posting they’re responding to – there’s no limit. So Dorothea responds to Jonathon’s posting and AKMA’s original posting:

AKMA
  David
    Stavros
      Jonathon
        Dorothea*
    AKMA
    Mike
  Jeneane
    Steve
      AKMA
  Dorothea*

The asterisk shows that a single posting responds to multiple postings.

It will take approximately 30 seconds to click, complete, generate, cut and paste the RDF for a response; about 1 minute for starting a thread.

The results can be ordered either by response hierarchy or by time. The thread page starts with the thread title, category, description, date started, and date of last update, and each weblog entry is associated with a link that will take a person directly to the specific posting.
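For the technically curious, here is a minimal sketch of how the thread page might assemble that hierarchy from (posting, responds-to) pairs pulled out of the scraped RDF; the data structure and function are my illustration, not a committed design:

from collections import defaultdict

# Sketch: build and print the response hierarchy from (posting, parent)
# pairs scraped from the embedded RDF. A posting with no parent starts a
# thread; a posting may respond to several parents (Dorothea's case), in
# which case it prints under each parent with an asterisk.
def print_thread(pairs):
    children = defaultdict(list)
    parent_count = defaultdict(int)
    posts, responders = set(), set()
    for post, parent in pairs:
        posts.add(post)
        if parent:
            posts.add(parent)
            children[parent].append(post)
            parent_count[post] += 1
            responders.add(post)

    def walk(post, depth):
        mark = "*" if parent_count[post] > 1 else ""
        print("  " * depth + post + mark)
        for kid in children[post]:
            walk(kid, depth + 1)

    for root in sorted(posts - responders):
        walk(root, 0)

# e.g. print_thread([("David", "AKMA"), ("Stavros", "David"),
#                    ("Jonathon", "Stavros"), ("Dorothea", "Jonathon"),
#                    ("Dorothea", "AKMA")])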

With this, people can see all those who’ve responded, can reply with a new posting, and the conversation can continue cross-blog, many-threaded.

I’ll probably try to add graphics to create a flow diagram, similar to the RDF validation tool (see http://www.w3.org/RDF/Validator/, and use http://burningbird.net/example12f.rdf as a test RDF file to demonstrate).

Discussion thread titles and associated descriptions and categories will go on a main page that is continuously updated, with a link to the main thread page for each discussion. I’d like to add search capability by category, weblog, and keyword.

(e.g. “Show me all discussions that AKMA has originated that feature Identity”)

 

I’ve already incorporated RDF into Movable Type postings and have been able to successfully scrape and process the information.
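For the curious, the scraping can be as simple as pulling the commented-out RDF back out of the page. A rough sketch, assuming the RDF travels inside HTML comments the way Movable Type embeds its trackback RDF; the pattern and function name are mine, not the actual bot code:

import re
import urllib.request

# Sketch: fetch a weblog page and extract any RDF blocks embedded in HTML
# comments. Illustrative only; the real Needle bot may differ.
RDF_COMMENT = re.compile(r"<!--\s*(<rdf:RDF.*?</rdf:RDF>)\s*-->", re.DOTALL)

def scrape_rdf(url):
    with urllib.request.urlopen(url) as page:
        html = page.read().decode("utf-8", errors="replace")
    return RDF_COMMENT.findall(html)

for block in scrape_rdf("http://burningbird.net/"):
    print(block)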

I’ll be asking for beta testers of this new technology in July, and will be hosting the discussion server at first. My wish is to distribute this application rather than centralize it, and I’ll look at ways this can occur (one major reason why I went with embedded RDF).

Update: AKMA and Gary Turner are collecting suggestions and requirements from the weblogging community for this application. A basic infrastructure is in place, but the user community needs to provide information about how this product will work, and what it will do. Please see AKMA’s posting to get additional information.


 

Just read Meg’s What we’re doing when we blog article. Though I can agree with many of her sentiments, I totally disagree with her philosophy that the weblogging format is the key to weblogging. Last time I looked, I thought it was the people. Meg truly missed the boat on this one. In fact, she wasn’t even at the dock to wave her handkerchief good-bye when the boat left.

The Thread the Needle application will help weblogger discussions, but it’s just an enabler – weblogging discussions can continue without it. We are connecting because of what we say, not the technology we use. Weblogging tools help, but they don’t create community.

Another instance of serendipity: the same day Meg’s article appeared, I stated in the Pixelview interview:

 

Too many people focus on the technology of the web, forgetting that technology is nothing more than a gateway to wondrous things. The web introduces us to beauty, creativity, truth, new people, and new ideas. I genuinely believe there are no limits to what we can accomplish given this connectivity.

Categories
Technology

P2P Discovery

What kind of core do KaZaA and its Supernodes have? Is it iron? Gold? Or is it more of an aluminum core, because the cloud that supports the KaZaA P2P network is still malleable — the Supernodes that provide the cloud services are fluid, and can change as well as go offline with little or no impact to the system.

I imagine, without going into the architecture of the system, that more than one Supernode is assigned to any particular subnet, the others acting as backups, most likely pinging the primary Supernode to see if it’s still in operation. If the primary goes out of operation, a backup Supernode takes over, and a signal is sent to the P2P nodes to get services from this IP address rather than that one. The original Supernode machine may even detect a shutdown and send a signal to the secondaries to take over.

Or perhaps the Supernode IPs are chained, and the software on each P2P node checks the first IP and, if no response occurs, automatically goes to the second within the Supernode list, continuing on until an active Supernode is found. This would take very little time and would, for the most part, be transparent to the users.
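A toy sketch of that chained fallback. The KaZaA code isn’t public, so the port (1214, the one KaZaA is known to use), the timeout, and the logic are all guesses on my part:

import socket

# Toy sketch of the chained-Supernode fallback guessed at above. KaZaA's
# code isn't public; the port, timeout, and logic are all assumptions.
def find_active_supernode(supernode_ips, port=1214, timeout=2.0):
    for ip in supernode_ips:
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return ip  # this Supernode answered; use it
        except OSError:
            continue  # dead or unreachable; try the next in the chain
    return None  # whole chain is down: no entry point into the network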

Again without access to any of the code, or even any architecture documentation (which means there’s some guesswork here): the algorithm behind the Supernode selection looks for nodes that have the bandwidth, persistent connectivity, and CPU to act as Supernodes with little impact on the computer’s original use. The member nodes of each KaZaA sub-net — call it a circle — perform searches against the circle’s Supernode, which is, in turn, connected to a group of Supernodes from other circles, so that if the information sought can’t be found in the first circle, it will most likely be found through the next Supernode, and so on. This is highly scalable.
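Continuing the guesswork, the search path might look something like this sketch; the class shape, hop limit, and names are mine, not KaZaA’s:

# Continuing the guesswork: a circle's Supernode answers a search from its
# own index first, then forwards to neighboring circles' Supernodes. The
# hop limit keeps a miss from flooding the whole network.
class Supernode:
    def __init__(self, index, neighbors=()):
        self.index = index                # {title: node address} for this circle
        self.neighbors = list(neighbors)  # Supernodes of adjacent circles

    def search(self, title, max_hops=3):
        if title in self.index:
            return self.index[title]
        if max_hops > 0:
            for neighbor in self.neighbors:
                hit = neighbor.search(title, max_hops - 1)
                if hit:
                    return hit
        return None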

So far so good — little or no iron in the core, because no one entity, including KaZaA or the owners behind KaZaA, can control the existence and termination of the Supernodes. Even though KaZaA is yet another file-sharing service rather than a services brokering system, the mechanics would seem to meet our definition of a P2P network. Right?

Wrong.

What happens when a new node wants to enter the KaZaA network? What happens if KaZaA — the corporate body — is forced offline, as it was on January 31st because of legal issues? How long will the KaZaA P2P network survive?

In my estimation, a P2P network with no entry point will cease to be a viable entity within one to two weeks, unless the P2P node owners make a determined effort to keep the network running by designating something to be an entry point — something with a known IP address. Connectivity to the P2P circle is the primary responsibility of a P2P cloud. KaZaA’s connectivity is based on a hard-coded IP. However small it is, this is still a kernel of iron.

We need a way for our machines to find not just one but many P2P circles of interest using approaches that have worked effectively for other software services in the past:

We need a way to have these P2P circles learn about each other whenever they accidentally bump up against each other — just as webloggers find each other when their weblogging circles bump up against each other because a member of two circles points out a weblog of interest from one circle to the other.

We need these circles to perform an indelible handshake and exchange of signatures that become part of the makeup of each circle touched, so that an entire P2P circle can disappear but still be recreated, because its “genetic” makeup is stored in one, two, many other circles. All it would take to restart the original circle is two nodes expressing an interest. (A sketch of this exchange follows the list.)

We need a way to propagate the participation information, or the software, or both, to support circles that can persist regardless of whether the original source of said software or information is still operating — just as software viruses have been propagated for years. Ask yourselves this: has the fact that the originator of a virus has gone offline impacted the spread of said virus? We’ve been harmed by the technology for years; it’s time to use the concepts for good.

We need a way to discover new services using intelligent searches that are communicated to our applications using a standard syntax and meta-language, through a standard communication protocol, and collected with intelligent agents, as Google and other search engines have been doing for years. What needs to change is for the agents to find the first participating circle on the Internet and ask for directions to points of interest from there.
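To make the handshake-and-signature idea a little more concrete, here is the sketch promised above. Everything in it (the structure, the hashing, the names) is speculative design for illustration, not working code from any existing system:

import hashlib
import json

# Speculative sketch of the circle "DNA" exchange: when two circles bump
# into each other, each stores enough of the other's makeup (its members,
# plus a signature over them) to recreate it later. All names invented.
class Circle:
    def __init__(self, name, members):
        self.name = name
        self.members = sorted(members)  # node addresses in this circle
        self.known_circles = {}         # name -> stored "genetic" record

    def dna(self):
        record = {"name": self.name, "members": self.members}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        return {**record, "signature": digest}

    def handshake(self, other):
        # The indelible exchange: each circle keeps the other's record.
        self.known_circles[other.name] = other.dna()
        other.known_circles[self.name] = self.dna()

    def recreate(self, name):
        # Any two interested nodes can restart a vanished circle from
        # the record a surviving circle holds.
        record = self.known_circles[name]
        return Circle(record["name"], record["members"])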

A standard communication protocol, meta-language, syntax. Viral methods of software and information propagation. Circles of interest with their own DNA that can be communicated to other circles when they bump in the night, so to speak. Internet-traversing agents that only have to be made slightly smarter — given the ability to ask for directions.

Web of discovery. Doesn’t the thought of all this excite you?

Categories
Technology

Iron Clouds

A true P2P cloud does not have a core of iron. By this I mean that there can be no static IP or server providing the gateway or facilitating the communication between nodes within a distributed application.

You can argue this one with me for years and you won’t convince me otherwise. I know that Groove has an iron core cloud. I know that Userland is thinking of an iron core cloud that can move about the nodes. UDDI is based on the premise of a centralized source of information about services that just happens to get striped and mirrored. Striped — chunked off. Mirrored — distributed to different servers. And don’t focus on the distributed in the latter; keep your eye on the server.

Server == iron

iron == control

Freenet comes closest to being the truest form of a cloud, but there is an assumption that the gateway to the cloud must be known in some way — a pre-known entrance. According to Ian Clarke’s Freenet: A Distributed Anonymous Information Storage and Retrieval System, “A new node can join the network by discovering the address of one or more existing nodes through out-of-band means, then starting to send messages”.

Can we have P2P clouds without some touch of iron? Can we have transient gateways into P2P networks without relying on some form of pre-knowledge, such as a static IP?

Ask yourselves this — I’m looking for information about C#, specifically about the CLR (Common Language Runtime) and the Common Language Infrastructure (CLI).

Keys are: C# CLR CLI

Go to Google, enter the words, click on I’m Feeling Lucky — and say hi to me in passing.

We don’t need P2P clouds with cores of iron; what we need is new ways of looking at existing technologies.

Categories
Technology

P2P Services

The Don Box discussion about HTTP was a good read with valid points.

From a P2P, not a web services perspective, we need to guarantee certain capabilities in P2P services that we take for granted in more traditional client/server environments. This includes the following:

 

  • Transaction reliability — the old two-phase commit of database technology appears again, but this time in a more challenging guise (a sketch of the classic protocol follows this list).
  • Transaction auditing — a variation of the two-phase commit, except that auditing is, in some ways, more of the business aspect of the technology.
  • Transaction security — we need to ensure that no one can snoop at the transaction contents, or otherwise violate the transaction playing field.
  • Transaction trust — not the same thing as security. Transaction trust means that we have to ensure that the P2P service we’re accessing is the correct one, the valid one, and that the service meets some business trust criteria (the latter being outside the technology realm).
  • Service or Peer discovery — still probably one of the more complicated issues about P2P. How do we find services? How do we find P2P circles? How do we market our services?
  • Peer rediscovery — this is where the iron hits the cloud in all P2P applications I know of. You start a communication with another peer, but that peer goes offline. How do you take up the conversation again without the use of some centralized resource? The same could also be applied to services.
  • Bi-directional communication — this is Don’s reference to HTTP’s asymmetric nature. Peers share communication; otherwise you’re only talking about the traditional web services model.
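Here is the sketch promised in the first bullet: a minimal classic two-phase commit. The class names are generic illustration, not any particular P2P framework’s API.

# Minimal sketch of the classic two-phase commit named in the first bullet.
# Generic illustration only, not any particular P2P framework's API.
class Participant:
    def can_commit(self, txn):
        return True  # stand-in for real resource and lock checks

    def prepare(self, txn):
        # Phase 1: vote yes only if this peer can guarantee the commit.
        return self.can_commit(txn)

    def commit(self, txn):
        pass  # make the change durable

    def rollback(self, txn):
        pass  # undo any tentative work

def two_phase_commit(txn, participants):
    # Phase 1: collect votes from every peer.
    if all(p.prepare(txn) for p in participants):
        # Phase 2: unanimous yes, so everyone commits.
        for p in participants:
            p.commit(txn)
        return True
    # Any no vote (or, in a real network, silence) aborts everywhere.
    for p in participants:
        p.rollback(txn)
    return False

The “more challenging guise” in P2P is that the coordinator running two_phase_commit is itself just another peer, one that can vanish mid-transaction.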

The file-transfer nature of Napster or Freenet, and the IM nature of Jabber, don’t necessarily exercise all of these aspects of P2P applications, so they haven’t necessarily pushed the P2P bubble to the max. However, when we start talking about P2P services — a variation of web services, one could say — then we know we’re going to be stretching both our technology capabilities and our trust of the same.

Fun!

Categories
Technology

Next Generation P2P?

John Robb at Userland has defined a set of constraints for what he considers to be the next generation of P2P. I appreciate that he’s put Userland’s architecture interests online — it generates conversation. However, I am concerned about the interpretation of “P2P” for what is, essentially, a lightweight server system.

Requirement one: The ability for individual users to create subnets where authorization is required before use is enabled.

It’s interesting that people talk about sub-nets and authorization. For true P2P security, the same rules of trust and security must be established with all peers, sub-net participants or not. Rather than creating new authentication and security for each individual sub-net, the same security mechanisms and trust definitions must apply to all P2P nodes. Otherwise, any one P2P node that’s on a wire with physical access to the secure sub-net is a point of vulnerability. And I guarantee that there will be one node that’s connected to the Internet, making all nodes insecure.

However, applying security measures across all possible P2P nodes is going to be a burden on a system — security takes bandwidth. And that’s not the biggest issue: security within P2P nodes implies control, and most forms of authentication and authorization are based on these functions being provided by a central server.

As we’ve seen recently with Morpheus, central points of entry make a P2P system vulnerable.

If this issue is straight user sign-on and authorization to access services, then you’re not talking about P2P — you’re talking about a more traditional client/server application. A true P2P system must have a way for each peer to establish a secure connection and determine identity and accessibility without reliance on any specific server.

Yeah. “Gack” is right.

Requirement two: The ability to publish structured content such as a complete web site or web app to a multi-million person network without flooding the publisher’s PC.

I know where this one is going, and I’m sorry, but it’s based on a flawed vision: pushing content out to individual clients rather than having the clients connect to a centralized source. In addition, this isn’t really a requirement for P2P, but a specific application’s functional need. It’s important to keep the two separate as we discuss the requirement in more detail.

At its simplest, published content is nothing more than files, and any P2P file system will work, including Freenet and Gnutella. But in reality, with published content we’re talking about structure as well as files. In addition, published content also implies an ability to access and re-access the same publication source again and again in order to get fresh content.

Traditional P2P file transfer systems are based on the concept that you’re after a specific resource, a single item — you don’t care where you get it. For published content, the source is a key factor in the peer connection.

As for the issues of scalability, again, traditional P2P networks don’t have an answer that will work for this requirement, because of that single point of content. This would be equivalent to a Gnutella network in which only one node has Michael Jackson’s Thriller. As relieved as we are about this, it does put some serious limitations on a P2P-based resource system.

However, once we get beyond the stretch to the P2P paradigm this requirement necessitates, the same store-and-forward concepts of Freenet could work here, except that you’re not talking about intermediate nodes storing an MP3 file — you’re talking about the possibility of massive amounts of information being dumped on each individual intermediate node.

The only way for this to work would be to stripe the material and distribute the content across several nodes, basically creating a multi-dimensional store and forward. Ugh. Now, what was the problem with the web?
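To show what that striping implies, a bare-bones sketch; the chunk size and round-robin placement are arbitrary choices of mine, just to expose the bookkeeping:

# Bare-bones sketch of striping published content across nodes. The chunk
# size and round-robin placement are arbitrary; the point is the map a
# reader would need to put the publication back together.
def stripe(content: bytes, nodes, chunk_size=64 * 1024):
    chunks = [content[i:i + chunk_size]
              for i in range(0, len(content), chunk_size)]
    return [(nodes[i % len(nodes)], chunk) for i, chunk in enumerate(chunks)]

def reassemble(stripe_map):
    return b"".join(chunk for _, chunk in stripe_map)

And notice that the map itself has to live somewhere readers can find, which is the centralization problem all over again.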

Requirement three: The ability to connect subscribed users in a given subnet to each other via Web Services in order to enable a new class of applications that share information (but don’t utilize centralized resources).

The whole principle behind P2P is connecting peers to each other. However, maintaining a true connection long enough to successfully conduct a transaction — that’s the key. I once wrote the following functionality for a P2P transaction:

Transaction reliability — the old two-phase commit of database technology appears again, but this time in a more challenging guise.

Transaction auditing — a variation of the two-phase commit, except that auditing is, in some ways, more of the business aspect of the technology.

Transaction security — we need to ensure that no one can snoop at the transaction contents, or otherwise violate the transaction playing field.

Transaction trust — not the same thing as security. Transaction trust means that we have to ensure that the P2P service we’re accessing is the correct one, the valid one, and that the service meets some business trust criteria (the latter being outside the technology realm).

Service or Peer discovery — still probably one of the more complicated issues about P2P. How do we find services? How do we find P2P circles? How do we market our services?

Peer rediscovery — this is where the iron hits the cloud in all P2P applications I know of. You start a communication with another peer, but that peer goes offline. How do you take up the conversation again without the use of some centralized resource? The same could also be applied to services.

Bi-directional communication — this is a reference to HTTP’s asymmetric nature. Peers share communication; otherwise you’re only talking about the traditional web services model.

Interesting challenge. As far as I know, no one has met it yet…at least nothing that can handle complex data with a single point of origin.

Outside of the listed requirements, John writes that next-generation P2P systems need some form of development environment. He states, “Notice, that in this system, the P2P transport is important but generic — it is just a pipe.” He also says, “… this system it doesn’t have to be completely decentralized to avoid legal action.”

Last time I looked, decentralization was the basis of P2P. And can we all forget the damn copyright issues for once and focus on what P2P was meant to be: total enablement of each node within the Internet?

John, you have specified requirements of which some, but not all, can be met by P2P-based functionality. Let me emphasize that “some but not all” response again.

You’re really not packaging requirements for the next generation of P2P systems; what you’re packaging is the requirements for “Next Generation Radio”. It’s important not to confuse this with what’s necessary for P2P systems.

I am Superwoman. What makes me Superwoman? Because I meet all the requirements for being Superwoman. And what are the requirements for being Superwoman?

Being me.

It just doesn’t work that way.