Moving to a new weblogging tool

Recovered from the Wayback Machine.

As you’ve seen, I am moving Burningbird to a new weblogging tool this week. After testing several, and after long deliberation, I decided to move to WordPress. Once that decision was made, I spent the last couple of days making some extensive template and code changes to WordPress 1.02 to match what I had with my original Movable Type-based weblog. And I’ve made some important discoveries about what to do, and not do, when moving to WordPress. However, before getting into the details of the migration process, here is a quick overview of why I went with WordPress over the other fine products easily available.

When shopping for a new tool, I had some specific requirements in mind. The first was that the tool must support MySQL. If the tool’s storage is based in MySQL, I can access the weblog data directly using SQL and add my own innovations to my pages, such as the recent comments list on this page. I had no interest in working with what are more primitive text and file based tools to manipulate either the post text or comments/trackbacks–or both. Nor am I that comfortable with the other personal database systems out on the market, such as Berkeley DB.
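As a rough illustration of what direct SQL access makes possible, here is a minimal sketch of a recent-comments query. It is written in Python against an in-memory SQLite database purely so that it runs anywhere; the table and column names (comments, author, post_title, posted_at) are invented for the example and are not the schema of WordPress or any other tool mentioned here.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with a hypothetical comments table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments (author TEXT, post_title TEXT, posted_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?, ?)",
        [
            ("Ann", "First post", "2004-03-01 10:00:00"),
            ("Bob", "Second post", "2004-03-02 09:30:00"),
            ("Cy", "First post", "2004-03-03 18:45:00"),
        ],
    )
    return conn


def recent_comments(conn, limit=5):
    """Return (author, post_title) tuples for the newest comments, newest first."""
    cur = conn.execute(
        "SELECT author, post_title FROM comments "
        "ORDER BY posted_at DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    for author, title in recent_comments(make_demo_db(), limit=2):
        print(f"{author} on {title}")
```

The same ORDER BY and LIMIT query would work much the same way against MySQL; only the connection code would differ.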

A second requirement is that the tool supports PHP. I can work with Perl or PHP, or even Python for that matter (but not as comfortably), but the challenge with Perl is that it is a rather cryptic (though powerful) language. In addition, effective use of Perl or Python requires a great deal of object-oriented design, and the problem with OO is that it can be very difficult to make customizations to code–especially if you’re not a programmer by either vocation or interest.

Frankly, PHP is a looser language. By this I mean, though it’s generally not considered as powerful as Perl, it isn’t as cryptic, either, or as easily obfuscated with object-oriented design. PHP (as with ASP, a similar type of environment supported by Microsoft) is also, in my opinion, a more comfortable language for non-geeks to learn. By using PHP and supporting PHP, I hope to be able to work less with the geeks, and more with the rest of the world–people who don’t write code either for a living or as a passionate hobby. It’s not that I don’t like working with geeks; after all, I’ve worked with geeks most of my programming life. It’s just that at this time, I’m enjoying the opportunity to work with people who have a different perspective about code; who code for a specific reason, rather than code for the sheer joy of it. (Or because they get paid to code.)

Besides–most of my in-page hacks have been with PHP. By using a PHP-based weblogging tool, I can focus on one language, and probably incorporate my in-page hacks directly into the tool infrastructure itself if it supports a customization (i.e. hack or plugin) type of interface.

A final major requirement, and somewhat supported by my choice of PHP, is one I’ve gone back and forth on for some time now: statically generated pages as compared to dynamic ones. There are advantages and disadvantages to both.

Static versus Dynamic Pages

Statically generated pages are generally less burdensome than dynamic pages, because extra resources are necessary to serve up a page each time it’s accessed. For instance, if a post is output to an individual page, a dynamic system would mean that the contents for that post would have to be accessed from the database, formatted, and then output as HTML (or XHTML) each time the page is accessed (read). The database access and formatting would only be done once with a static page.

However, there are shortcuts and optimizations that one can build into a dynamic system, such as caching the post contents so that frequent access of more recent posts does not require round trip access to the database each time the page is accessed. Since database access is generally the most costly aspect of dynamic pages, caching recently accessed post data should improve the performance of each page access.

(By caching what I mean is that the data is stored locally, such as in memory, for faster access.)
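To make the idea concrete, here is a minimal sketch of that kind of in-memory cache, written in Python. It is an illustration only, not how WordPress or any other tool actually caches; the class name PostCache, the fetch_from_db callback, and the max_items limit are all made up for the example.

```python
from collections import OrderedDict


class PostCache:
    """Keep recently read posts in memory so repeat reads skip the database."""

    def __init__(self, fetch_from_db, max_items=50):
        self._fetch = fetch_from_db      # hypothetical function: post_id -> post text
        self._max = max_items
        self._items = OrderedDict()      # least recently used entry comes first
        self.db_hits = 0                 # how many reads actually went to the database

    def get(self, post_id):
        if post_id in self._items:
            # Cache hit: mark as most recently used and return without a round trip.
            self._items.move_to_end(post_id)
            return self._items[post_id]
        # Cache miss: pay for one database read, then remember the result.
        self.db_hits += 1
        body = self._fetch(post_id)
        self._items[post_id] = body
        if len(self._items) > self._max:
            self._items.popitem(last=False)  # drop the least recently used post
        return body

    def invalidate(self, post_id):
        """Forget a post, for example after it has been edited and saved."""
        self._items.pop(post_id, None)
```

The invalidate step matters: without it, a corrected post would keep showing its old text until it fell out of the cache.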

Before comments and trackback, it made sense to statically generate pages. Now, though, what we’re finding is that page rebuilding has become all too common, and a burden all of its own within a static system. And when you add in rebuilding to syndication files (RSS and/or Atom), especially if you support comments syndication feeds, the overall use of system resources tends to level out over time.

An even bigger factor in my choice, though, is my own past experience. I have, too many times, published a post only to find an error in it once uploaded. I then have to re-edit the post and re-generate the static page, and this can be a frustrating experience at times. With a dynamic system, I can make the change and once it has been saved to the database, the change is now visible. This is a faster process, and one that overall should cut down on my own personal frustration with static page systems.

Of course, if I have a page that’s Slashdotted, and hit with thousands of requests at once, a dynamic page will fail much more quickly than one that’s statically generated. However, since I’ve only been Slashdotted twice in my entire weblogging career, I’m not using this as a criterion for choosing a product. Besides–few systems can survive Slashdotting easily, even with static pages.

Once I defined my basic requirements for a system–MySQL, PHP-based, and dynamic–then it was a matter of looking at the many different tools.

LAMP Weblogging Tools

To all intents and purposes, the weblogging tools I was most interested in were those that could be considered LAMP tools. By this I mean that the tools run within a Linux environment, on the Apache web server, using MySQL for data access, and based on PHP.

(Note that most of these tools will also work in a Windows environment.)

I wrote once before about looking at a couple of different tools: pMachine’s ExpressionEngine and WordPress. I can’t afford ExpressionEngine, but I could afford the original pMachine, the free version and the Pro.

I also evaluated b2evolution, which, like WordPress, is open source; and Textism’s new Textpattern.

After overcoming a couple of mistakes in the installation of Textpattern, I was able to get a weblog up and running. Textpattern has an elegant editing interface consisting of layered tabs. In addition, much of the product’s effort is focused on simplifying template editing, with template tag generation for every aspect of a page.

In addition, Textpattern also features a plug-in called Textile whereby you can insert special characters into the text for specific HTML tags, and the product then converts these characters into their HTML equivalent.

With Textile, an underscore around text would automatically enclose it in the appropriate HTML tag, after first annotating the text with paragraph markings. Thus the following:

_this is a test_

Would become:

<p><em>this is a test</em></p>
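The conversion above can be approximated in a few lines of Python. This is a toy sketch that handles only the underscore case and paragraph wrapping; it is not Textile’s actual implementation, and the function name toy_textile is invented for the example.

```python
import re


def toy_textile(text):
    """Wrap each blank-line-separated block in <p> and turn _text_ into <em>text</em>."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    html_blocks = []
    for block in blocks:
        # Only treat underscores as emphasis when they sit at word boundaries,
        # so identifiers such as snake_case_name are left alone.
        emphasized = re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"<em>\1</em>", block)
        html_blocks.append(f"<p>{emphasized}</p>")
    return "\n".join(html_blocks)


if __name__ == "__main__":
    print(toy_textile("_this is a test_"))
```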

When registering new users, you can designate whether they are a specific type of user. For instance, I can add a new user who is a Freelancer, as compared to a new person who is a Designer, and each, we presume, has different authorizations. However, when I logged in as Designer, I found I could post new essays but not access Presentation; whereas I could access Presentation and post when logged in as a Freelancer. To me, the semantics of each role doesn’t seem to match the access granted.

One widely used weblogging mechanism that Textpattern doesn’t support is trackbacks or pingbacks. I searched in the Textpattern forum, and found this from Dean Allen, who is the Papa of Textpattern:

There’s a mechanism built in to Txp called ‘mentions’, which harvests the url, page title and an excerpt of referrers pointing to specific articles. It still needs some work, but the idea is you’ll be able to output a list or summary beneath the relevant article -’This article was mentioned on the following pages’ or some such.

I think Trackback and Pingback are good ideas, but difficult to grasp. As such I don’t have plans to implement them in Txp.

Nothing to prevent them from appearing in a plugin down the road, however.

Unfortunately, though, the power of trackback isn’t in capturing referrer information, which I’ll get to in more detail in a later posting.

Textpattern is an attractive interface, with its elegant tabs and good use of white space. One quibble I have is that the editing page shares some of the same problems of Movable Type in that elegance can actually overcome usability by making the editing field itself too small. However, the biggest drawback I saw with the tool is, like pMachine, Textpattern is a proprietary tool. Though there are advantages to a proprietary tool, I wanted to go open source for my next weblog lifetime.

Proprietary versus Open Source

A major decision to make when picking a weblogging tool is whether to go with a proprietary product, or whether to go open source. All the tools I’ve used in the past–Blogger, Radio, and Movable Type–have all been proprietary, or closed source.

The strongest advantage to a proprietary product, especially a commercial one, is that there is ownership of modifications to the system, and even accountability if you pay for a product. With applications like pMachine and ExpressionEngine, or even Movable Type for that matter, you can expect that the company is going to pay attention to problems with their code, and if they want to stay in business, fix them in a timely manner.

In addition, there is greater control over what modifications get incorporated into the product with each release, and how the documentation is managed. Even with source code control systems like CVS, open source applications tend to be marked by rather frenzied and even chaotic activity at times, unless certain people have been designated as administrators of the product’s direction.

An advantage, though, to open source applications is that when you see something that needs to be fixed, and you come up with a good fix, chances are you can incorporate that fix directly within the code for the next release of the product. You may have to work with the product’s administrative team to ensure the code meets the standards of the development effort–but anyone can contribute, if they can prove that they know what they’re doing.

A good demonstration of the types of problems the power of the open source development could solve more quickly can be found in the events surrounding the comment spam problems we’ve had with Movable Type in the last year.

The team working on Movable Type was fairly small in the beginning, and could only work on so much code at a time. Several people uninvolved with Six Apart, the parent company for Movable Type, did write code that handled individual aspects of it, but there wasn’t an easy way to incorporate all these changes because they couldn’t be incorporated back into the product in an interim release. I actually had to write a post describing the approach to take to integrate the code, like a human CVS system.

If MT had been open source, the code changes could have been incorporated into the product code itself, and then rolled out in the next bug release. More hands would have been involved in the development, and consequently, the fix would have rolled out sooner.

But there’s another reason why I went with open source this time around, and it has a long history.

Hard to believe now, but once upon a time I was almost a pure Windows developer, starting with the very first beta of Windows (and IBM’s OS/2) years ago when I worked at Boeing. I took, and passed, several of the Microsoft certification exams when they were still beta. I attended WinDev in Boston. I learned about the early versions of COM directly from Kraig Brockschmidt, before he went new age and became known as “Satyaki” (code will do that to you after a while). I learned about the Windows API from Charles Petzold.

My bestselling book for O’Reilly was “Developing ASP Components”.

However, all this changed when Microsoft came out with .NET. I took a look at .NET and knew that this was _not_ a direction I was comfortable with, and felt that Microsoft was heading in the opposite direction from the rest of the technical community–going with a larger, intertwined, proprietary infrastructure rather than smaller, open, component-based systems. I liked COM and COM+, but was less sanguine about .NET. I was frustrated that I had no impact on the direction of this product, and all my work towards COM+ (not to mention my newest edition of my book) was now abrogated in favor of a ‘new deal’. I was disappointed, and basically quit working within the Windows environment.

This decision didn’t come without cost. Professionally, I moved from one environment to something completely new (Java, though I had been working with C and C++ in Unix parallel to working in Windows), and had to become established all over again (including taking Java certification tests). In addition, it’s taken me years, but I’m now at the point that when Win2K is no longer supported, I can move off of Windows completely; to either Linux on my PC, or to the Mac.

Still: spending years pulling out of a large, complex, and troublesome (to me at least) proprietary system, in favor of something that’s either non-proprietary, such as Linux, or at least more liquid, such as OS X, is not a lesson you forget quickly, or easily.

Now I’m faced with the open source decision, as we await a major new release of Movable Type: either move to open source or stay with proprietary code. Frankly if I wanted to stay proprietary, I’d probably stay with Movable Type.

Ultimately, though, staying with proprietary code makes no sense considering that my focus is now so strongly towards an open source environment. If I’m going to make a move to open source, now is the time to do it–before getting more entangled with a MT specific way of doing things with the release of MT 3.0.

Once my non-proprietary source decision was made, my choices were narrowed down to two products (at this time): b2evolution and WordPress.

WordPress versus b2evolution

Both WordPress and b2evolution have an extensive codebase; both are supported by a group of people with enough ownership of the process to ensure it doesn’t falter or break down. Each has a documentation wiki, b2evolution’s here and WordPress’ here. Each has unique features that make it stand out, but both support most of what I would look for in a tool.

b2evolution has the more polished interface. It supports multiple blogs as well as hierarchical categories. In addition, the current editing screen provides post preview, as well as spell checking. It also provides smileys, but I swear that’s not the reason I didn’t use it.

b2evolution screenshot

Both WordPress and b2evolution share the same heritage, being forks off an original open source weblogging tool, b2 (cafelog), which came to a full development stop a year ago.

With the same heritage, meeting my initial requirements and a more polished interface, why did I choose WordPress over b2evolution? The simple answer lies in the code, and in the direction that I see both tools going.

WordPress has been focusing more on an infrastructure than an interface, from my initial review. Because of this, it minimizes the amount of PHP code in any of the templates, as compared to b2evolution. In other words, there’s a lot more going on behind the scenes to set up a good, extensible environment than what I’m seeing with b2evolution (though b2evolution is, I believe, the newer tool and may not have as much time in).

Multiple blogs, hierarchical categories, and spell check are great–but I’d rather see more effort put into keeping the templates as clean of PHP code as possible; and creating a clean, extensible interface.

Once you have a good foundation, then you can add the goodies, including spell checking and post preview (which, like comment preview, is planned for the next major release–1.2).

In addition, I am seeing a broader community support within the WordPress community, and that’s essential for an open source product; otherwise the members will burn out and the tool will go by the wayside. At times the community energy may seem chaotic, but there is a method to the madness, as I found after only a couple of days ‘messing’ around, and getting help from community members.

There was, also, one other major factor that decided me on WordPress; a feature I consider so important to my own weblogging efforts, that it overcomes limitations on not yet having Post Preview. This feature is one you’ve seen used in this post: multiple page weblog entries.

WordPress has several quicktags that can be used to insert commonly used XHTML tags into a document, including EM for emphasis and IMG for images. However, it also has a specialized tag called “nextpage” that looks like the following:

<!--nextpage-->

When you insert this into the document, WordPress knows to split the content into a separate page, which can then be accessed by its specific page number from any of the entry pages. Unlike the concept of more.., which just basically takes you to the individual page entry and to a specific anchor point in the page, the nextpage feature from WordPress literally gives you separate URLs for each section.
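The behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not WordPress’s own code: the marker is written here as the HTML comment <!--nextpage-->, and the ?page=N URL form and the function names are assumptions made up for the example.

```python
NEXTPAGE = "<!--nextpage-->"


def split_pages(content):
    """Split an entry into its pages wherever the nextpage marker appears."""
    return [part.strip() for part in content.split(NEXTPAGE)]


def get_page(content, page_number):
    """Return the requested page (1-based), clamped to the pages that exist."""
    pages = split_pages(content)
    index = min(max(page_number, 1), len(pages)) - 1
    return pages[index]


def page_urls(permalink, content):
    """Build one URL per page; the ?page=N form is a hypothetical URL scheme."""
    count = len(split_pages(content))
    return [f"{permalink}?page={n}" for n in range(1, count + 1)]


if __name__ == "__main__":
    entry = "Why I moved<!--nextpage-->Static versus dynamic<!--nextpage-->The choice"
    for url in page_urls("/archives/moving", entry):
        print(url)
```

Because each page has its own address, another weblog can link to a single section rather than to the whole entry.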

If you believe that weblogs should be short posts and links, you will never use this. However, if you believe that weblogs are nothing more than a loose framework and that a person should write as they want, how they want, and when they want, then the concept of nextpage should have immediate appeal.

For instance, I am the type of writer that pulls together separate topics within one writing, each of which can be accessed individually in a meaningful way; all forming a comprehensive whole. There is no way to do this elegantly with existing weblog tools, at least those I’m familiar with. What I end up doing is splitting the sections into multiple postings, with a ‘part number’ to each; or I smoosh the sections into one, long, jumbled, breathless entry.

Writings with different part numbers don’t work in weblogging because the entries are posted chronologically. This means, then, that if you want Part 1, you have to move backwards in a page, rather than moving from top to bottom for oldest to newest, which is more natural for the majority of us. In addition, you know how difficult some of my long writings can be to read at times–not to mention how long the pages are, and how slow to load if I use a lot of images.

Neither works for me–not after seeing nextpage. It could have been created just for people like me. And it’s not something that would be easily created as a plug-in or hack, either.

Now, with this very long posting, you can read all the sections, or just those of interest. (Note to WordPress: an attribute on the tag that allows one to label each page with a topic heading would be mighty nice. Maybe I’ll write this.) In addition, if you want to comment on the document from your own weblog, you can link to the entire document, or link to the specific section that you’re addressing! For instance, if you want to post about this essay, but specifically the section on nextpage, you can do so by grabbing the URL for just this section, and then linking directly to it.

Once discovered, I cannot live without this feature. No matter what’s missing in WordPress now (and most of what I want is planned for release 1.2), nothing compares to having this simple little tag–and what it means to me as a writer.

Quite simply, I’ve fallen in love with a tag. Those of you who may yearn for me from afar, alas, but my heart is given to another–and it’s markup. Choice was made: Burningbird is moving to WordPress.

Well, choosing the tool was the easy part. Now it’s time to go to work, and begin the migration process from Movable Type to WordPress. More so, now it’s time to move a weblog that was heavily integrated into Movable Type into a completely different product.

However, this will go into a separate post. Don’t want to wear my new little buddy, nextpage, out. And I’m still in the midst of tweaking.

Later today.
