Categories
Technology

Survival guide to LAMP: Ten basic commands of Linux

L is for Linux

Before continuing the Survival tutorials, you’ll need some basic Unix commands. I’ve recovered the following, which provides a good intro to most of the commands you’ll need, from the now defunct weblog, Linux for Poets.

Once upon a time Unix used to be for geeks only – the platform of choice for godlike SysAdmins and obsessed hackers who muttered strange phrases and giggled over inside jokes, as they swigged gallon after gallon of Mountain Dew. Unix neophytes were faced with a blank screen and an uncompromising command line along with dire warnings about what not to do … or else. Extending the basic computer, adding in such esoteric devices as printers or modems, required recompilation of the kernel, ominous sounding words intimidating enough to send all but the most brave, or foolish, running for the safety of Windows.

Then a strange thing happened: Unix started to get friendlier. First, commercial versions of Linux such as Red Hat came along with easier installation instructions, integrated device support, and lovely graphical desktops (not to mention a host of fun and free games). Open source Unix developers started drinking microbrews and fancy cocktails instead of caffeine and realized that they had to make their software easier to install and well documented in addition to being powerful and freely available. Alternatives to powerhouse commercial applications, such as OpenOffice’s challenge to Microsoft’s Office, minimized the cost of switching to desktop Unix platforms. Finally, that bastion of the Hide the Moving Parts Club, Apple, broke all tradition and built a lovely and sophisticated operating system, Mac OS X, on top of a Unix platform.

Today’s Unix: slicker, safer, smaller, better…but push aside the fancy graphics and built-in functionality and simple installation, and you’re still going to be faced, at one time or another, with a simple command line and dire warnings about what not to do. Before you contemplate drinking the Code Red kool-aid, take a deep breath, relax, and familiarize yourself with the Ten Basic Commands of Unix.


First Command: List the Contents

You have a brand new Unix site to host your weblog. You’re given shell access, which means that you can actually log into the operating system directly, rather than access the site contents through a browser or via FTP. You’ll access the site through SSH, or Secure Shell, because you’ve been told that it’s more secure. To do so, you’ll install an SSH application recommended by your friends, or use one provided by your hosting service. Up to this point, you’re in familiar territory – start an application and provide your username and password. Simple.

However, once you log on to the operating system, you’re faced with a cryptic bit of writing on the left side of the screen, such as “host%” or some variation thereof, with the cursor located just to the right, waiting to reflect whatever you type. At this point, your mouse, which has been your friend and companion, sits idle, useless, because you’re now in the Unix command line interface, and you haven’t the foggiest what to do next.

Your direction at this point depends on what you hope to accomplish, but chances are, you’re going to be interested in knowing what’s installed in the space you’ve just been given. To do this, you use the Unix List directory contents command, ‘ls’ as it’s abbreviated, to list the contents of the current directory. You can issue the command by typing the letters ‘ls’ followed by pressing the Enter key:

host% ls

What results is a listing of all the files and directories located directly in your current location, which is likely to be the topmost directory of your space on the machine. Depending on the host and what you have installed, this listing could include a directory for all CGI applications, cgi-bin. If your site is web-enabled, it could also include web pages, such as an index.html or index.php file, depending on what you’re using for web pages. If you have an email box attached to your account, you might also see a directory labeled “mail”, or another labeled “mbox”.

This one simple command is highly useful, but there are parameters you can pass to the list command to see more detailed information. For instance, you can see the owner, permissions, and size of files by passing the -l parameter to the command:

host% ls -l

The results you’ll get back can vary slightly based on version of Unix, but the following from my forpoets directory is comparable to what you’ll see:

drwxr-xr-x 3 shelleyp shelleyp 4096 Jul 20 18:09 flavours
-rw-r--r-- 1 shelleyp shelleyp 5255 Aug 16 16:28 forpoets.css
-rw-r--r-- 1 shelleyp shelleyp 6064 Aug 10 15:14 index.php
-rw-r--r-- 1 shelleyp shelleyp 1319 Aug 10 15:00 index.rdf
-rw-r--r-- 1 shelleyp shelleyp 789 Aug 10 15:00 index.xml
drwxr-xr-x 10 shelleyp shelleyp 4096 Sep 25 16:21 internet
-rw-r–r– 1 shelleyp shelleyp 27638 Jul 23 00:06 jaggedrocksml.jpg
drwxr-xr-x 9 shelleyp shelleyp 4096 Sep 25 16:23 linux

In this output, the first column shows the permissions for each file or directory; the owner and group associated with each is ‘shelleyp’; the size in bytes follows the group name, then the date and time the object was last changed, and finally its name. If the permission string begins with the character ‘d’, the object is another directory. Easy.
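If you’d like to try reading a long listing without poking at real files, here’s a small sketch you can paste into a shell. It assumes the mktemp utility is available (it is on most Linux hosts), and the names demo.txt and images are made up for the example:

```shell
# Build a small scratch directory so the listing is predictable.
# (The names "demo.txt" and "images" are invented for this example.)
scratch=$(mktemp -d)
cd "$scratch"
touch demo.txt
mkdir images

# Long listing: one line per entry, permissions first.
ls -l

# The first character of each line gives the type:
# 'd' for a directory, '-' for a regular file.
dir_type=$(ls -ld images | cut -c1)
file_type=$(ls -l demo.txt | cut -c1)
echo "images starts with: $dir_type"
echo "demo.txt starts with: $file_type"
```

The owner, size, and dates you see will differ from mine, but the leading ‘d’ and ‘-’ characters should match.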

Of course, at this point you might be saying to yourself that I find Unix easy because I’m aware of what the commands are and what all the different parameters mean and do, as well as how to read the results. I’m a geek. I’ve visited the caffeine fountains and drunk deep; I’ve wandered the halls and muttered arcane curses and behold, there is light but not smoke from the tiny little boxes. But how can you, the creative master behind the sagas recorded on the web pages and the color captured in the images and the sounds recorded in the song files, learn these mystical secrets without having to apprentice yourself to the SysAdmin?

That leads us to the second command, whereby you, the seeker, find the Alexandrian Library embedded within the heart of most Unix installations.

Second Command: Seek Knowledge

Cryptic as Unix is, there is an amazing amount of documentation installed within the operating system, accessible if you use the right magic word. Originally, this word was man, for manual pages; more recently, on Linux and other GNU-based systems, man has been supplemented by info, and most of these systems provide support for both.

Want to discover what all the parameters are for the list command? Type in the word man, followed by the command name:

host% man ls

What returns is a wealth of information such as more detailed information about the command itself, as well as a listing of optional parameters, and how each impacts on the behavior of the Unix command. Additionally, documentation for some commands may actually contain examples of how to use the command.

Nice, but what if you don’t know what a command is in the first place? After all, Unix is a rich environment; we can assume that one does more than just list directory contents.

To provide a more introductory approach to Unix, the info command, and the associated Info documents for the Unix system provide detailed information about specific commands, and can be used in a manner similar to man:

host% info ls

What follows is less cryptic information about the command, written more in the nature of true user documentation rather than arising from the ‘less is more’ school of annotation. Still, you have to know about the command first to use the system. Or do you?

If you type ‘info’ without a command, you’ll be introduced into the Info system top level node, which provides a listing of commands and utilities and a brief description of each. Pressing the space bar allows you to scroll through this list until you find a utility or built-in Unix command that seems to provide what you need. At this point, you can usually type ‘m’ to enter menu item mode, and then type the command name to get more detailed information. For instance, if I’m looking for a way to list directory contents, scrolling through Info on my server the first command that seems to match what I want is ‘dir’ not ‘ls’. By typing ‘m’ while still in Info, and then ‘dir’, I find out that ‘dir’ is a shortcut, an alias for a specific use of ‘ls’ that provides certain parameters by default:

`dir’ (also installed as `d’) is equivalent to `ls -C -b’; that is,
by default files are listed in columns, sorted vertically, and special
characters are represented by backslash escape sequences.

Suddenly, Unix doesn’t seem as cryptic or as mysterious as you once thought. Still, it helps to know some basic commands before diving into it headfirst, and we’ll continue with our basic Commands of Unix by exploring how to traverse directories, next.
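Another way to hunt for a command you can’t name is a keyword search of the manual pages with ‘man -k’. This is only a sketch: it assumes man is installed and that your host has built the manual index, which isn’t always the case on a bare-bones system, so the script checks first. The keyword itself is just an example.

```shell
# Search manual page names and short descriptions for a keyword.
# ("directory" is only an example keyword.)
keyword="directory"

if command -v man >/dev/null 2>&1; then
    # Show the first few matches; there may be none if the
    # manual index has not been built on this machine.
    man -k "$keyword" 2>/dev/null | head -n 5
else
    echo "man is not installed on this system"
fi
```

Each match lists a command name and a one-line description, which is usually enough to decide which manual page to read in full.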

Third Command: Move About

Unix systems, as with most operating systems including Windows, are based on a hierarchy of directories following from some topmost directory, represented by a single slash ‘/’. However, unlike a graphical, window-based environment where you click the directory name to open it and continue your exploration, in a command line environment you have to traverse the directories via command. The command you use is the Unix ‘Change directory’ command, or ‘cd’.

For instance, if you have a directory called cgi-bin located in your current directory, you can change to this directory by using the following:

host% cd cgi-bin

Typing the ‘ls’ command displays the contents of the cgi-bin directory, if any.

To return to the directory you started from you can use the ‘..’ value, which tells the cd command to move up one directory:

host% cd ..

You can chain your movement requests to move up several directories with one command by using the slash character between the ‘..’ values. The following moves up two levels in the directory hierarchy:

host% cd ../..

Additionally, you can move down many levels by typing the names of directories you want to traverse, again separated by the slash:

host% cd shelleyp/forpoets/cgi-bin

Of course, you have to be directly in the directory path of a target directory to be able to use these shortcuts, and you have to know where you are relative to the target directory. But what if you want to reach a directory directly, without messing with relative locations? Let’s say you’re in ‘/home/shelleyp/forpoets/cgi-bin’ (assuming your home directory is /home/shelleyp) and you want to move to ‘/home/shelleyp/web/weblog/logs’. The key to directly accessing a directory no matter where you are is to specify the complete directory path, including the beginning slash:

host% cd /home/shelleyp/web/weblog/logs

Once you’ve discovered the power of directory traversal, you’ll go crazy, winging your way among directories, yours and others, exploring your environment, and generally snooping about. At some point, you’ll get lost and wonder where you are. You’re at X. Now, what is X?

Fourth Command: Find yourself

In Unix, to paraphrase Buckaroo Banzai, no matter where you go, there you are. To find your location within the Unix filesystem of your machine, just type in the Unix Print Working Directory command, ‘pwd’:

host% pwd

Your current directory location will be revealed, and then you can continue your search for truth, and that damn graphic you need for your new page but you placed somewhere and can’t remember where now.
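To see ‘cd’ and ‘pwd’ working together, here’s a short sketch that builds a throwaway directory tree and walks around in it. It assumes mktemp is available, and the directory names simply echo the ones used earlier:

```shell
# Create a throwaway tree: <scratch>/forpoets/cgi-bin
scratch=$(mktemp -d)
mkdir -p "$scratch/forpoets/cgi-bin"

# Absolute path: works no matter where you currently are.
cd "$scratch/forpoets/cgi-bin"
deep=$(pwd)
echo "Now in: $deep"

# Relative path: up two levels with one command.
cd ../..
top=$(pwd)
echo "Now in: $top"

# Relative path: back down two levels with one command.
cd forpoets/cgi-bin
back=$(pwd)
echo "Now in: $back"
```

The first and last locations printed should be identical, which is a handy way to convince yourself that relative and absolute paths lead to the same place.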

Of course, to traverse to a directory in order to place a graphic in it, the location of which you’ll then promptly forget, you have to create the directory, first.

Fifth Command: Grow your Space

Directories are wondrous things, a way of managing your resources in such a way that you can easily find one JPEG file without having to search through 1000 different items. With this simple hierarchical, labeled system, you can put your images in a directory labeled ‘image’, or put your weblog pages in a directory labeled ‘weblog’, and add an ‘archives’ directory underneath that for archive pages.

You can go mad, insane, with the impulse to organize – organizing your pages by topic, and then by month, and then by date, and then…well, the limits of your creativity will most likely be exhausted before the system’s ability to support your passionate embrace of your own self geekness.

Making a new directory is quite simple using the Make Directory command, ‘mkdir’. At the command line, you specify the command followed by the name of the directory:

host% mkdir image

When next you list the contents of the current directory, you’ll now see the new directory, ready for you to traverse and fill with all your bits of wisdom and art. Of course, there is a caveat. This is Unix – there is always a caveat.
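If you’re building several levels at once, such as the ‘weblog’ directory with ‘archives’ underneath it, the -p option saves a step by creating any missing parent directories along the way. A minimal sketch, run in a scratch directory (assuming mktemp is available) so nothing real is touched:

```shell
# Work in a scratch directory so nothing real is affected.
scratch=$(mktemp -d)
cd "$scratch"

# One directory, as in the example above.
mkdir image

# Nested directories in one step: -p creates 'weblog' first,
# then 'archives' inside it.
mkdir -p weblog/archives

# List everything, including the contents of subdirectories.
ls -R
```

Without -p, ‘mkdir weblog/archives’ fails when ‘weblog’ doesn’t exist yet.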

Before you can create a directory or even move a file to an existing directory you have to either own the directory, and/or have permissions to write to the directory. It wouldn’t be feasible, in fact it would be downright rude, if you could create a directory in someone else’s space, or worse, in the core operating system directories.

We’re assuming for the nonce that you’re the owner of your domain, as far as your eye can see (as allowed by the operating system) and that you can create things as needed. But what if you want to magnanimously change the permissions of files or directories to allow others to run applications, access pages, or create their own directories?

Sixth Command: Grant Divine Rights

Earlier when playing around with the ‘ls’ command, we looked at more detailed output from the command that showed a set of permissions for the directory contents. The output looked similar to:

-rw-r--r-- 1 shelleyp shelleyp 789 Aug 10 15:00 index.xml
drwxr-xr-x 10 shelleyp shelleyp 4096 Sep 25 16:21 internet

In the leftmost portion of the output, following the first character, which specifies whether the object is a directory or not, the remaining values specify the permissions for each object listed by owner of the object (the first set of triple characters), the group the owner belongs to (the second set of triples), and basically the world. Each triple permission states whether the person accessing the object has read access, write access, or can execute (run) the object – or all three.

In the first line, I as owner had read and write access to the file, but not execute because the file was not an executable. Any member of the group I belong to (the same name as my user name in this example, though on most systems, this is usually a different name), would have read access to the file, only. The same applies to the world, not surprising since this is a web accessible XML file. For the second line, the primary difference is that all three entities – myself, group, and the world – have execute permission for the object, in this case a directory.

What if you want to change this, though? In particular, for weblog use, you’ll most likely need to change permissions for directories to allow weblogging tools to work properly. To change permissions for a file or a directory, you’ll use the Change Mode command, ‘chmod’.

There are actually two ways you can use the chmod command. One uses an octal value to specify the permission for owner, group, and world. For instance, to give all permissions to the owner, but only read and execute permission to the group and the world, you would use:

host% chmod 755 somefile

The first value sets the permissions for the owner. In this case, the value of ‘7’ states that the owner has read, write, and execute permission for the object, somefile; each ‘5’ gives the group and the world read and execute permission:

-rwxr-xr-x 1 shelleyp shelleyp 122 Sep 27 17:48 somefile

If I wanted to grant read and write permission, but not execute, to owner, group, and world, I would use ‘chmod 666 somefile’. To grant all permissions to owner, read and write to group, and read only to world, I would use ‘chmod 764 somefile’.

To recap the numbers used in these examples:

4 – read only
5 – read and execute only
6 – read and write only
7 – read, write, and execute

The first number is for the owner, the second for the group, the final for the world.
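The digits aren’t arbitrary: read counts as 4, write as 2, and execute as 1, and each digit is the sum of the permissions being granted. Here’s a small sketch that does the arithmetic and then applies the three values used above to a scratch file (assuming mktemp is available):

```shell
# Work on a scratch file so nothing real is affected.
scratch=$(mktemp -d)
cd "$scratch"
touch somefile

# read = 4, write = 2, execute = 1; add them up for each digit.
owner=$((4 + 2 + 1))   # read + write + execute = 7
group=$((4 + 1))       # read + execute = 5
world=$((4 + 1))       # read + execute = 5

chmod "$owner$group$world" somefile
mode_755=$(ls -l somefile | cut -c1-10)
echo "$owner$group$world gives $mode_755"

chmod 666 somefile
mode_666=$(ls -l somefile | cut -c1-10)
echo "666 gives $mode_666"

chmod 764 somefile
mode_764=$(ls -l somefile | cut -c1-10)
echo "764 gives $mode_764"
```

Reading the permission string three characters at a time, after the leading type character, gives you back the owner, group, and world digits.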

Another approach that’s a bit more explicit and a little less mystical than working with octal values, is to use a version of chmod that associates permission with a specific group or member, without having to provide permissions for all three entities. In this case, the use of the plus sign (‘+’) sets a permission, the use of the subtraction sign (‘-’) removes it. The groups are identified by ‘u’ for user (owner), ‘g’ for group, and ‘o’ for others. To apply a permission to all three, use ‘a’, which is what’s assumed when no entity is specified.

This sounds suspiciously similar to the instructions for that simple-to-put-together table you bought at the cheap furniture place, but all’s clear when you see an example. To add read, write, and execute permission for the owner, read and execute for the group, and execute for the world, use the following:

host% chmod u+rwx,g+rx,o+x somefile

In this example, the owner’s permissions are set first, followed by the permissions for the group and then ‘others’, or the rest of the world.

To remove permission, such as removing write capability for owner, use the following:

host% chmod u-w somefile

Though a bit more complex and less abbreviated than using the octal values, the latter method for chmod is actually more precise and controlled and should be the method you use generally.
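One detail worth seeing for yourself: ‘+’ and ‘-’ only add to or take away from whatever permissions are already there. chmod also accepts ‘=’, which sets a group’s permissions to exactly what you list. The sketch below starts a scratch file from no permissions at all so each step is visible (mktemp is assumed to be available):

```shell
# Work on a scratch file so nothing real is affected.
scratch=$(mktemp -d)
cd "$scratch"
touch somefile

# Start from no permissions at all so each change is visible.
chmod 000 somefile

# '+' adds the listed permissions to what is already there.
chmod u+rwx,g+rx,o+x somefile
after_add=$(ls -l somefile | cut -c1-10)
echo "after adding:   $after_add"

# '-' removes only the listed permission; the rest stay put.
chmod u-w somefile
after_remove=$(ls -l somefile | cut -c1-10)
echo "after removing: $after_remove"

# '=' sets each group to exactly what is listed, no more, no less.
chmod u=rw,g=r,o=r somefile
after_set=$(ls -l somefile | cut -c1-10)
echo "after setting:  $after_set"
```

On a file that already has permissions, ‘+’ leaves the existing ones in place, so reach for ‘=’ when you want an exact result.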

(Of course, there’s a lot more to permissions and chmod than explained in this essay, but we’ll leave this for a future Linux for Poets writing.)

Once you’ve created your lovely new directory, and made sure the permissions are set accordingly, the next thing you’ll want to do is fill it up.

Seventh Command: Be fruitful, copy

One way you’ll add content to your directories is to create new files, or to FTP files from another server. However, if you’re in the midst of reorganizing your directories, you’ll most likely be copying files from an existing directory to a new one. The command to copy files is, as you’ve probably guessed by now, Copy, or ‘cp’.

To copy a file from a current directory to another, use the following:

host% cp somefile /home/shelleyp/forpoets

With this the source file, somefile, is copied to the new destination, in this case the directory at /home/shelleyp/forpoets. Instead of copying the file to another location, you can copy it in the same directory, but use a different name:

host% cp somefile newfile

Now you have two files where before there was one, both with identical content.

You can copy directories as well as files by using optional parameters such as -a, -r, or -R. For the most part, and for most uses, you’ll use -R when you copy a directory. The -R option instructs the operating system to recursively enter the directory, and each directory in that directory and so on copying contents, and to preserve the nature of certain special files such as symbolic links and device files (though for the most part you shouldn’t have these types of files in your space unless you’ve come over to the geek side of the force):

host% cp -R olddir newdir

The -a option instructs the operating system to copy the files and directories as near as possible to the state of the existing objects, and the -r option is recursive but can fail and hang with special files.

(Before using any of the optional flags with copy, it’s a good idea to use the previously mentioned ‘info’ command to see exactly what each flag does, and does not do.)
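Here’s a sketch of both kinds of copy in a scratch directory (assuming mktemp is available; the file and directory names are made up, apart from the ones used above). One thing to watch: if ‘newdir’ already exists, ‘cp -R olddir newdir’ places the copy inside it, as ‘newdir/olddir’, rather than alongside.

```shell
# Work in a scratch directory so nothing real is affected.
scratch=$(mktemp -d)
cd "$scratch"

# Something to copy: one file, and one directory with nested content.
echo "hello" > somefile
mkdir -p olddir/sub
echo "nested" > olddir/sub/page.html

# Copy a file within the same directory under a new name.
cp somefile newfile

# Copy a whole directory tree. 'newdir' does not exist yet,
# so it becomes a copy of 'olddir'.
cp -R olddir newdir

# The originals are untouched and the copies sit beside them.
ls -R
```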

When you’re reorganizing your site, copying is a safe approach to take but eventually you might want to commit to your new structure and that’s when you make your move. Literally.

Eighth Command: Be Conservative, Commit

Instead of copying files or directories, you can move them using the Unix Move command, abbreviated as ‘mv’.

To move a file to a new location, use the command as follows:

host% mv filename /home/shelleyp/forpoets

Just as with copy, the first parameter in this example is the source object, the second the new destination or new object name – you can rename a file or directory by using the ‘mv’ command with a new name rather than a destination. You can also move a directory but, unlike ‘cp’, you don’t have to specify an optional parameter, or flag, to instruct the command to move all the contents:

host% mv olddir newdirlocation
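The three uses of ‘mv’, renaming, moving a file, and moving a directory, can be tried safely in a scratch directory. This is a sketch that assumes mktemp is available; the names ‘renamed’ and ‘a.jpg’ are made up for the example:

```shell
# Work in a scratch directory so nothing real is affected.
scratch=$(mktemp -d)
cd "$scratch"

echo "draft" > filename
mkdir forpoets olddir
touch olddir/a.jpg

# Rename a file in place: the second argument is a new name.
mv filename renamed

# Move a file: the second argument is an existing directory.
mv renamed forpoets

# Move a whole directory and its contents: no extra flag needed.
mv olddir forpoets

ls -R
```

Unlike ‘cp’, nothing is left behind at the old location.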

Up to this point, you’ve created, and you’ve copied, and you’ve moved and over time you’re going to find your space becoming cluttered, like Aunt Minnie’s old Victorian house filled with dusty lace doilies and oddities like Avon bottles, forming canyons of brightly colored glass for the 20, or so, cats wandering about.

It’s then that you realize: something’s got to go.

Ninth Command: Behold, the Destroyer

There is a rite of passage for those who seek to enter geekhood. It’s not being able to sit at a keyboard and counter the efforts of someone trying to crack your system; it’s not being able to create a new user or manage multiple devices. The rite of passage for geek candidates is the following:

host% rm *

Most geeks, at one time or another, have unintentionally typed this simple, innocuous phrase in a location that will cause them some discomfort. It’s through this experience that the geek receives a demonstration of the greatest risk to most Unix systems…ourselves.

The simple ‘rm’ is the Unix Remove command and is used to remove a file or directory from the filesystem. It’s essential for keeping a directory free of no longer wanted files or directories, and without it, eventually you’ll use up all your space and not be able to add new and more exciting material. However, it is also the command that most people use incorrectly at some point, much to their consternation.

To remove a specific file, type ‘rm’ with the filename following:

host% rm filename

To remove an entire directory, use the following, the -r flag signaling to recursively traverse the directories removing the contents in each:

host% rm -r directoryname

When removing an entire directory, you may be prompted for each item to remove (for instance, when your host has aliased ‘rm’ to ‘rm -i’, or when files are write-protected). This prompt can be suppressed using the -f option, as in:

host% rm -rf directoryname

So far, the use of remove is fairly innocuous, as long as you’re sure you want to remove the file or directory contents. It’s when remove is combined with Unix wildcards that warning signs of ‘Ware, there be dragons here should be entering your thoughts.

For instance, to remove all JPEG files from a directory, instead of removing each individually, you can use a wildcard:

host% rm *.jpg

This command will remove any file in a directory that has a .jpg extension. Any file. Simple enough, and as long as that’s your intent, no harm.

However, it’s a known axiom that people work on their web sites in the dead of night, when they’re exhausted or have had one too many microbrews. Our minds are befuddled and confused and tired and not very alert. We’re impatient and want to just finish so we can go to bed. So we enter the following to remove all JPEG files from a directory:

host% rm * .jpg

A simple little space, the result of a slight twitch of the thumb, and not seen because we’re tired – but the result is every file in that directory is removed, not just the JPEG files. And the only way to recover is to access your backups, or seek the nearest Unix geek and ask them to please, pretty please, help you recover files you accidentally removed.

And they’ll look at you with a knowing eye and say, “You used rm with a wildcard, didn’t you?”
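One habit that can spare you that conversation is to preview a wildcard before handing it to ‘rm’. The shell expands the wildcard before the command ever runs, so putting ‘echo’ in front shows exactly which files would be removed without touching any of them. A sketch in a scratch directory (assuming mktemp is available; the file names are made up):

```shell
# Work in a scratch directory so nothing real is affected.
scratch=$(mktemp -d)
cd "$scratch"
touch one.jpg two.jpg notes.txt

# Preview: echo prints the command exactly as the shell expands it,
# and deletes nothing.
preview=$(echo rm *.jpg)
echo "Would run: $preview"

# Only after the preview looks right, run the real command.
rm *.jpg

# Everything that wasn't a .jpg file is still here.
remaining=$(ls)
echo "Left behind: $remaining"
```

Had the preview been typed with the stray space, it would have listed every file in the directory followed by a lone ‘.jpg’, which is your cue to stop. When you’re typing at the prompt yourself, ‘rm -i’ is another safeguard: it asks for confirmation before removing each file.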

Which leads us to our last Command, and the most important…

Tenth Command: Do Nothing

You can’t hurt anything if you don’t touch it. If you’re unsure of what a command will do, read more about it first, don’t type it and hope for the best. If you’re tired and you’re removing files, wait until you’re more rested. If something isn’t broken, don’t fix it. If your site is running beautifully, don’t tweak it. If you’re trying something new, back your files up first.

Unless you’re a SysAdmin and need to maintain a system, in which case you don’t need this advice anyway, you can’t hurt yourself in Unix unless you do something, so if all else fails, Do Nothing.

The easiest mistake to recover from in Unix is the one that’s not made.


Survival guide to LAMP: Basic ingredients

Recovered from the Wayback Machine.

I’ve heard from several people interested in moving to an open source weblogging tool like WordPress or Textpattern, but they’re concerned about having to manipulate PHP code in order to modify the templates or make the changes to the tool to incorporate modifications some of us have created.

What happens, then, is these folks stay with a hosted and/or proprietary tool, even if they aren’t happy with it, and this bothers me. I’m cool with folks remaining with other tools because they’re not interested in tweaking an open source product, or because they like the tools; but I think those of us who support the open source effort have an obligation to make the products of open source obtainable to everyone interested, tech or not. I have never felt that there should be a division between the techies and the non-techies when it comes to our physical environments. A better divider, to me, is interest: there are those who are interested in using open source products, or tweaking the product to fit their needs; and those who just want to use a weblogging tool, as is, no mods. Packaged product, end of story. Regardless of technical skill.

I’ll get into this further, but first a little digression:

Today’s Web environments are different from the environments we endured back in the Web stone age. Years ago I remember that the only thing you could depend on within a web environment was that a web server was installed. There might, just might, be other software installed that you could use–usually Perl for CGI applications. If you hosted with a Windows ISP, you also had access to ASP, and sometimes even SQL Server. If you were hosted on Unix, you had Perl and some limited ability to install a rather intimidating array of programs that usually caused havoc with your system at one time or another.

If you had your own server–and this wasn’t that common long ago–you had to maintain everything yourself, and you had to know what you were doing. Even today, with the increased sophistication of hosting options and environments, an independent environment does require a great deal of expertise with Internet and usually Linux facilities. Still, long ago, this was the only option for people who wanted to run a database, applications, and various languages.

Today the web hosting environments are different. First, the reduced cost of co-location or virtual servers has increased the number of people running independent servers. However, most of us have a hosted site on an ISP, like my own on Hosting Matters, where the cost is less and, more importantly, much of the server administration is handled by the ISP. Considering the number of security releases for even a reliable product such as Red Hat Linux, having that aspect of administration handled by the ISP is worth the loss of some independence.

However, just because we don’t have total server independence doesn’t mean we don’t have access to a great deal of software. Most of us are served through the Apache web server, with several modules compiled in, increasing our ability to customize our own environment. In addition, most ISPs now offer database access, usually MySQL, but other databases are supported. As for code, I don’t know of any ISP that doesn’t have PHP and Python installed in addition to Perl, and with today’s beta release of Mono for Linux, we should soon see support for C#, too. Personally, I’m looking forward to this one–I’ve always liked C#.

We also have access to applications that help us manage our environment. My own ISP provides cPanel, a very sophisticated interface that allows us to easily add and remove users, databases, files, security, and even applications. Integrated with this control panel is a host of other tools such as PHPMyAdmin, which greatly increases the simplicity of working with the MySQL database.

There are some limitations, of course. For instance, if you want to run a beta release of a product, you might have to negotiate with the ISP to have it installed, and if it proves a problem, they’ll yank it. And forget asking to install an EJB (Enterprise Java Bean) framework on most systems. Frameworks tend to be notorious CPU hogs, not to mention temperamental and easily crashed.

However, for the most part, this shared hosting environment, with the administrative tasks handled by the ISP but with access to a plethora of tools for us to do dama.., urh, interesting stuff is an optimum solution for most of us, and is how many of us are currently hosted. At a minimum, these hosts provide what most of us ask for, which is LAMP.

O’Reilly long ago coined the term LAMP to encapsulate the web environments used, incidentally, for most weblogging tools, and hosted by most ISPs. LAMP stands for a set of open source products encompassing Linux (or one of the BSD Unix variants) as operating system; Apache as web server; MySQL as database; and one or more of PHP/Perl/Python as scripting language.

I am inordinately fond of LAMP, though most of my development expertise in the past has been non-LAMP. For instance, I’ve developed with Visual Basic/C++, within ASP and not; I’ve also worked with Delphi and Powerbuilder, in addition to spending a number of years working with C, C++, and Java on Unix boxes. There was some Perl here and there, but for the most part, much of my experience has been Windows development or Java and the aforementioned EJBs.

However, over time I’ve come to appreciate the cleanliness of LAMP–the simplicity of the development process; how lightweight and accessible it is. There is no initial investment of thousands to install the infrastructure, as is necessary (usually) with EJBs; there isn’t even a significant learning burden put on the developer to obtain expertise in LAMP. After all, anyone can install Linux on their PCs, and once that’s installed, download and install Apache, MySQL, and PHP–for free.

Okay, okay–free as in beer.

Bluntly, aside from my interest in Mono, I have no interest in returning to work with Java or huge infrastructure environments. Give me simple. Give me clean. Give me open source, and make it free.

What does this mean to you, the innocent weblog writer? Well, nothing much if you’re using a tool such as Blogger that manages your weblog for you. However, if you have to install your weblogging tool, you and your environment are going to, at some point, have to meet and greet. I know enough about weblogging tools to know that most are LAMP-based (okay, okay, .NET folks, don’t crawl all over my case with this one–I know you’re out there, in numbers too big to ignore).

How far you take the greeting, though, is up to you. That’s where we go back to webloggers who are interested in open source and tweaking, as compared to webloggers who just want to push a button and have a weblog instantly created. For most folks, LAMP exists but they don’t have the training to take advantage of it. This, then, makes installing something like WordPress, with its PHP exposure, intimidating. But there’s nothing about these tools that requires a person to have a technical background in order to be comfortable with installing products like WordPress or Textpattern, running updates on MySQL, or modifying an .htaccess file in order to redirect web pages.

I haven’t done a tech series for a long time, and I think I’m overdue. So, for the next few weeks, I’m going to write a number of tutorials about LAMP, targeted for the non-tech, to see if I can’t increase the comfort level with those of you using Linux, or Apache, or PHP, or MySQL. I will be focusing on PHP, of the three P’s: PHP, Perl, and Python. Personal preference you might say.

During these, I’ll install and configure WordPress 1.2 beta for one of my web sites, and then walk through modifications I’ll make with the basic product to fit my needs. In addition, I’ll also install Textpattern and do the same.

(Previously I had thought that Textpattern was proprietary, only to find out it is open source, and so to make amends, I’ll work with both products.)

Even if you aren’t interested in WordPress or Textpattern, if your site is hosted on Linux, you’ll benefit from the tutorials covering the simple line commands, not to mention SSH access. If your weblog tool uses MySQL, or you’re just interested in it, you’ll also benefit from the essays discussing this popular database. Same with PHP and Apache.

All entries in the Survivor Guide to LAMP are designated with the lava lamp icon (I thought this was apropos), and are featured in a separate category for easier access. If you have anything specific you’d like covered, drop me a note or put in a comment.

Oh, and these tutorials are also open–I’ll be attaching a Creative Commons license to these essays (but only these essays, specific to the posting only, if I can figure out how to do this). Hopefully these tutorials will be incorporated into WordPress’s documentation wiki, and elsewhere, to continue being useful long after I’m gone.

Categories
Connecting Technology

Googled to death

For someone interested in this sort of thing–I am interested in this sort of thing, aren’t I?–I never did make a statement about Google’s Gmail.

Gmail is a centralized email system with virtually unlimited space to hold your messages, and with the ability to use Google’s search algorithms to research your email. Bells and whistles aside, that’s all it is. I have a Yahoo email account I maintain for emergencies–where’s the beef?

There was some discussion about privacy concerns because after all, Google does hold your messages, your data, and does search your email to post targeted ads. I agree with Tim O’Reilly on this issue in his writing The Fuss about Gmail and Privacy: Nine Reasons Why It’s Bogus. Some folks think that Big Brother will force Google to allow him to peek into our private emails. However, for the terrorists among you, I would suggest that you consider not using a centralized email system to exchange words about your plans to take over the world.

What Tim says about privacy and email is spot on: we’ve had centralized email systems like Gmail before, and email itself is notoriously easily compromised. Never assume privacy with email and if you want to say something in private, I suggest lunch in a quiet bistro somewhere.

However, I do disagree with O’Reilly’s elation over the benefits of Gmail. He envisions a great global social software network that would allow our email systems to link us to appropriate people based on need, and someday we’ll move all our data to the Core and access it with our data ports that also double as tie clips, cellphones, and nose rings.

Yeah, and when that happens to the general populace and not just the Network junkies, pigs will fly on pretty, pretty dragonfly wings.

Do you really want who you know and who you like and who you trust mapped into some universal algorithm so that your system can tell Person B who knows you who knows Person A and then Person B pushes you to connect them up with Person A, when maybe you don’t necessarily agree with Person B about what all this knowing really means?

Do you really want to be that wired?

Even if I and most people could get over our repugnance with this type of overall encompassing and non-directed and pervasive/invasive social software network, this type of overall, all inclusive centralized core of data is anathema to those of us who believe in decentralized systems.

O’Reilly writes:

Storage of my critical data on one of the largest, most reliable data storage banks in the world. As Rich Skrenta made so clear in his recent weblog posting, Google is the shape of the future. Forget Moore’s Law and Metcalfe’s Law. Storage is getting cheaper faster than any other part of the technology infrastructure. I remember Bob Morris, head of IBM’s Storage Division and the Almaden Research Labs, telling me a couple of years ago, that before too long, storage would be cheap enough and small enough that someone who wanted to do so could film every moment of his life, and carry the record around in a pocket. Scary? Maybe. But the future is always scary to those who cling to the past. It is enormously exciting if you focus on the possibilities. Just think how much value Google and other online information providers have already brought to all of our lives – the ability to find facts, in moments, from a library larger than any of us could have imagined a decade ago.

Anybody who would store their critical data on a system in which they have no control, and one which is used by millions of people is, well, to put it kindly, caught up in the enthusiasm of the moment.

I guess I am old, and too set in my ways. Too many ghosts in the machine for my liking now, without opening the door to hordes more of them, just because somebody dangles a new pretty in front of my face, and I like the sparkle.

Besides, I am getting weary to death of Google. First it was Google buying Blogger, then Orkut, and then Gmail, and now it’s the IPO where we get to hear about how some very rich people are going to get even richer. It’s just a software company. It’s big machines with lots of data and some good developers and some interesting algorithms, some of which don’t always work as well as we would like. I agree with William Grosso: Anyone Else Bored by the 24 x 7 Google Watch?

(Well, that was good–I managed to fulfill my Google quota for an entire year with one posting. Good, economic writing, that.)

Categories
Technology Weblogging

Move: Halfway there

Recovered from the Wayback Machine.

I am about halfway in my move to WordPress. Have run into some interesting challenges along the way, but also have discovered a couple of nifty things that help compensate for the more interesting of the interesting challenges. There’s one feature, in particular, for those of us who like to write long posts that makes up for most of the problems; a feature I think will end up being standard in all weblog tools–once the tool makers finally decide that we’re not all link-and-short-comment webloggers.

It’s a groovy feature.

For those thinking of moving to WordPress from Movable Type, I have a lot to talk about, which you might be interested in–especially if you have a lot of posts, comments, trackbacks, and have made extensive use of MT tags. Tomorrow, though, because I’m about exhausted. However, as an early hint: clean up your old postings that are still in draft mode. You don’t want to have a lot of drafts.

And publishing is sooooo fast now. Unbelievable. We’ll see how general page access goes.

In addition, I had looked at two additional PHP/MySQL weblogging tools during this time, and I have some notes on these that I’ll also provide. In case you’re wondering, these were b2evolution and Textpattern. I’ll give the whys and wherefores of going with WordPress…tomorrow.

For now, help me test. Ping me. Comment. Push buttons and let me know what breaks.

Also, notice how I managed the categories? WordPress doesn’t have a primary category–rightfully so, the concept isn’t particularly meaningful–which did cause some challenges with my category icon next to my posting title.

But then I thought: why just list the one category? So, there you have it – all categories are now listed. I had to hack into the WordPress code to write a function for this, but I think I can break this out into what is known as a WordPress Hack, which is code that extends WordPress but isn’t part of the official build. I still have to see how this works. This extended function would work as a plugin, too, but the plugin architecture for WordPress isn’t out until version 1.2 (in the works).

Regardless, I rather like the multi-category icon list. Sometimes change is good.

Update: Unfortunately, I made the mistake of trying to make a change in Movable Type, and it overwrote the index.php page, which just happens to be the code and the template for WordPress. I am now attempting to recover half a day’s work.

So first thing we learn: Don’t touch Movable Type. Second thing: copy index.php to a safe location once you get the template fixed. WordPress does not maintain a copy of the template in the database. Don’t make this mistake at home, kiddies.
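For those who'd rather not trust themselves to remember, the copy can be scripted. What follows is only a minimal sketch in Python, not anything that ships with WordPress: the function name backup_template and both paths are my own assumptions for illustration. It copies the template into a backup folder and stamps the copy with the current date and time, so an older backup is never overwritten by a newer one.

```python
import shutil
import time
from pathlib import Path


def backup_template(template_path, backup_dir):
    """Copy a template file into backup_dir, adding a timestamp to its name.

    Hypothetical helper for illustration; not part of WordPress.
    """
    src = Path(template_path)
    dest_dir = Path(backup_dir)
    # Create the backup folder if it does not exist yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    # copy2 copies the contents and preserves the modification time.
    shutil.copy2(src, dest)
    return dest
```

Calling backup_template("index.php", "template-backups") from the weblog directory would, under these assumptions, leave a dated copy of the template in a template-backups folder alongside it.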

Categories
Technology Weblogging

TypeKey: Final act

Recovered from the Wayback Machine.

Six Apart has released its pre-launch FAQ about TypeKey, and everything I expected about the service has been confirmed. I have no doubts that when MT 3.0 releases, we’ll see masses of people rush to enable TypeKey in their weblogs so they can rest assured at night that only the proper sort of comments need appear when they are not there to maintain the necessary vigilance to protect their weblogging homes from dastardly intruders.

Discussing the issues of registration and centralization, comment spam prevention, centralization and performance, privacy, baby squirrels, and social issues, in turn:

Registration and centralization

If you want comment registration with Movable Type or TypePad, you will have to use TypeKey. As the FAQ says, if we want comment registration without TypeKey, then we’ll have to …build our own authentication system. The problem with building authentication, as with any other security aspect of an application, is that it needs to be designed and incorporated right from the start; an add-on registration system for a tool built to use something else is not something I want to contemplate having to maintain as Movable Type goes through new variations, open APIs or not.

If Movable Type were open source, I could understand this. And before you point out the nature of Perl, open code is not open source.

The reasons for having a centralized registration system, frankly, don’t make a lot of sense. Six Apart states that:

TypeKey takes care of the hassle of running an authentication service: building the service itself; keeping it running; dealing with users who have forgotten their username or password; verifying the email address of new users; etc. All of these tasks are managed for you by TypeKey.

I imagine if you’re a weblog that gets hundreds of new commenters a day, having a service take care of authenticating an email address would be valuable. Now, those of you who get hundreds of new commenters a day, raise your hand?

Other than that, the aspects of registration that Six Apart mentions for TypeKey are built into other products, quite simply, and this includes WordPress and a host of other weblogging tools. The commenter may on occasion have to give an answer to a question to recover a password; or if the tool doesn’t provide an automated registration recovery procedure (which it should, that’s not difficult to add in), we may have to reset a person’s password manually for them, but frankly, software that manages registration locally has been around on the Web since it was not much beyond a twinkle in Tim Berners-Lee’s eye.

(And an added benefit with local registration – break into a local system, and you compromise one weblog or site; break into a global system, and you compromise everyone’s.)

As for managing multiple usernames and passwords from weblog to weblog, well, please. We use our email addresses for each username, and we use the same password at each, or a variation of a password based on the weblog name and our naming scheme. Not as secure to use the same password? Well, sure, but we’re not talking about our bank accounts here – we’re just talking about comment systems and keeping comment spammers out.

Comment spam prevention

Authentication and registration is not an infallible solution to comment spammers. Just think of the new offshore possibilities – hire people in other countries to sign up for email addresses, get authenticated in the TypeKey system, place innocuous comments at sites until they’re allowed in, and then one fine night – blitz ‘em!

No comment registration system, TypeKey or otherwise, will be able to deliberately keep out all spammers. Fortunately Movable Type does have better comment management, with being able to delete comments by name, IP address, and URL, and this is good, this is something we have been asking for. However, I also see no evidence that throttles have been incorporated into the code to prevent trackback and comment DoS (Denial of Service) attacks, so this will continue to be a problem, even with Movable Type 3.0. Unless we hack the code, and the thought of having to hack the code before the product is even out is just too much at this point.
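A throttle of this sort doesn't have to be elaborate. Here's a minimal sketch, in Python rather than Movable Type's Perl, and not based on anything in Movable Type or from Six Apart: the class name CommentThrottle and its limit and window settings are assumptions of mine for illustration. It keeps a sliding window of recent comment times per IP address and refuses a new comment once that address has used up its allowance.

```python
import time
from collections import defaultdict, deque


class CommentThrottle:
    """Hypothetical sliding-window throttle.

    Allows at most `limit` comments per key (for example, an IP address)
    within any `window` seconds.
    """

    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the throttle can be tested
        self._hits = defaultdict(deque)

    def allow(self, key):
        """Return True and record the comment if `key` is under its limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The comment or trackback handler would call allow() with the sender's IP address before saving anything, and turn the request away when it returns False. A real tool would also want to clear out idle addresses and share the counts across server processes, which this sketch leaves out.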

By the way, what about trackback?

Centralization and performance

Having a centralized registration system for a centralized weblogging tool makes sense. After all, if the weblog posts, comment builds, and every other aspect of the weblog are managed centrally, why not the comment registration? But there is no good technical reason for going with a centralized service for what are distributed weblogs. There are probably good commercial reasons, but none from a technical or even individual user’s point of view.

We who went to Movable Type or another product that we host on our own servers did so specifically because we did NOT want to have any form of dependency on a centralized system. We did so, for the most part, because we have been burned on either performance or access because of the centralization and scaling problems. TypeKey is no different, and in some ways, potentially worse than any of the other centralized tools that we use.

Think of it: for web sites that use centralized comment registration, every comment has to be authenticated with TypeKey. Now think about how many comments are being written at any moment in time.

Six Apart mentions the performance aspect of TypeKey, saying:

We are committed to offering a solution that has as little customer-facing downtime as possible. Of course, we can never guarantee 100% uptime. It’s in Six Apart’s best interest to keep TypeKey up and functioning and to keep our users happy. In the case of downtime, there will be fall-back options in place to help guarantee a fairly seamless commenting process. That means downtime of the TypeKey service would not necessarily mean that spammers and abusive comments could get through nor that commenters would not be able to comment. We’ll have more information about how this will work nearer to the release.

Which frankly tells me they haven’t worked through a solution on this aspect yet, and that doesn’t bode well for the use of this service.

When building a new web-enabled application with any of the clients I had when I was a technical architect, the first aspect we would build into the system was security. You have to build security from the ground up. It must be incorporated into the very design of the product, from its first conceptualization; it can never be an ‘add-on’. Added security never works as efficiently, or as effectively, as security integrated deeply with the product.

Mark Pilgrim came out with a weakly satirical rant making fun of what several of us have had to say about TypeKey (after first making disparaging ethnocentric comments about our writing to our weblogs during the ‘weekends’ based on his own interpretation of same; in an international environment, no less)– including Six Apart’s own announcement of Movable Type 3.0.

(I can see, in all seriousness, why Mark would make fun of us for spending time talking about this. After all, it’s just technology. Why get worked up over technology? We never get worked up over technology such as RSS and Atom and RDF and the Semantic Web, that sort of thing.)

The only technical aspect I can pull out of his writing to address is that he lists several centralized systems that he believes do scale well and serve the community, and it’s true these have managed to scale and are useful, but each and every one has failed when I’ve tried to access it at least once a week.

Blogdex was inaccessible off and on this weekend, and Technorati was hard to access last night, and I couldn’t access Bloglines two or three times last week, and I got some kind of odd error with Radio comments a couple of weeks ago, too, and, well, the list goes on. The problem with centralized systems is not that they fail completely and break down permanently; it’s that they behave oddly or inconsistently, or poorly under load.

Time out. Ever get a time out when accessing a centralized system?

But the thing with Technorati or Blogdex or Bloglines (I haven’t used Feedster) is that I’m not dependent on them to write to my weblog, or for my commenters to respond, or for my pages to be accessed. Only my own system resources, or the Internet in general between my server and each of us can impact on this. With TypeKey, though, that’s changed.

Now, not only will we have to write our blog posts in Notepad or some other local application to prevent losing them when we can’t access our hosted or remote weblogging applications; we’ll have to do the same with comments, too.

(Though I imagine that Six Apart will create a caching subsystem that will cache authenticated comments for publishing when the TypeKey system is accessible again – you’ll just have to wait for the remote system to continue your discussion is all.)

Why would we go through all the hassle to have a distributed application if we’re going to tie into a centralized authentication system? Might as well go to TypePad for the rest of our weblogging needs.

(But I don’t want Mark to go away thinking that we’re not appreciative of his efforts. As he said in one weblog comment: what’s the fuss? After all, it’s just a public announcement of a new technology. Sure, I can agree with that – and Atom is just another alpha release of yet another syndication format. No big deal.)

Privacy

I have no doubts that Six Apart won’t publish my personal information, just as I have no doubts that they won’t do something with the aggregate data. All that juicy information about which sites are getting how many comments; and then there’s plenty of ego-stroking aspects to the application. If we think that we’re too fixated on buzzsheets such as Technorati 100, wait until we see what can be done with comments.

I’m not particularly concerned about the system being hacked into to get my individual information, though I imagine email spammers will attempt to do so to farm all of the email addresses contained in the system. However, from a security standpoint, this is a bright red target in a field of beige – I have no doubts that crackers will be at that system to crack just so they can flood our comments with a crapflood of bogus comments, using our login information, as they use our IP addresses as proxy for their attacks today.

And won’t that be a hell of a mess to clean up?

Baby Squirrels

Aside from these specific technical issues, one other issue I have is the trust relationship we have established with Ben and Mena Trott, our friends and neighbors, being carried over into our dealings with Six Apart, the company.

What’s important to remember when judging whether to buy into the use of TypeKey for your site is that Six Apart is no longer ‘Ben and Mena’. It is an international company, with international investors and multiple employees, and business concerns that influence the company’s direction. This isn’t to knock that Six Apart has become successful – more power to the company! This is to make a point that we can no longer judge the use of any product, even the ‘free’ ones, from Six Apart, as if they are given to us by Ben and Mena, sitting in their apartments, writing the code in their spare time.

I have heard some good and valid defenses of TypeKey from sites who plan on using it, or some other form of comment registration and authentication because of the nature of topics covered at their sites. These people can attract all sorts of racist and bigoted people, and they want to ensure that if a person is going to make comments such as these, they can at least be authenticated to an email address.

But much of the pushback against those of us raising technical and social concerns has been based on a personalization of the technology and the Six Apart company.

In the MeFi thread on TypeKey, one person wrote:

SixApart is the Apple of the blog world–they take the time during development to make robust, stable apps (TypePad and MT are both solid, and both spreading like wildfire as a result) and they do it with enough style and digital sex appeal to make it consistently-appealing (if not downright Pavlovian) to the crucial early adopter set.

So naturally, let the chorus of haters begin.

Just so long as the haters are Typekey-authenticated, of course.

Another wrote:

Really, who can argue that a centralized, secured, open registration system for weblogs is better than distributing a registation system into thousands of individual weblogs that never update their software? It just doesn’t make sense. Think of all the fun customer support issues that could arise from handing loud bloggers a complicated registation system. Besides, everyone loves typing their information into weblogs over and over again.

Of course, it’s not like there blogging systems out there that are focused on small closed communtities. Well, there’s livejournal, but they don’t meet my exact needs either. I mean, why should I have to switch blogging software or do any work when Six Apart should be reading my mind and meeting my needs exactly for free.

Don’t they realize that the people that read my site are so dumb that though they can use a computer, check email, and surf the web, there is no possible way they could remember a username and password. No other website makes people remember a username and password!

What is this world coming to when companies try to plan ahead and think broadly instead of catering to the loudest whiner? Egads, you’d think that I’m not the most important person in the world.

Why the sarcasm? Why the issues of hatred?

The problem is that we can’t discuss this from a technical perspective because we’re talking “Ben and Mena” here, and there are a lot of complicated factors at work. There’s Six Apart’s support for Atom when others have supported RSS; there’s the fact that Ben and Mena are, were, are webloggers just like the rest of us; that they provided Movable Type for free, and did start out by coding in their home, in their spare time; there’s the fact that a lot of people have met Ben and Mena, and like them, and I’m sure they are very nice, and personable.

But Six Apart is not ‘Ben and Mena’. Being critical of TypeKey is not attacking Ben and Mena. And choosing to use TypeKey should not be based on trusting Ben and Mena.

Personalizing the Tech: the social in social software

It’s not surprising that a personalization of TypeKey has entered our discussions. The thing with social software, such as weblogging software, is that personalization will always be one of the factors in its design, no matter how much we try to ‘de-personalize’ the tech.

With TypeKey enabled at a weblog for all comments, either you register with this centralized service, or you don’t comment. But if we have good comment management and good throttles enabled to prevent comment spam, why would we use comment registration such as TypeKey? I’ve read that it’s to prevent comment spammers, but we know with current workarounds that we don’t need registration to manage comment spammers.

From what I’m hearing, now, that’s not the issue for registration. People are talking about filtering out ‘negative’ comments, and commenters who say ‘hateful’ things. The Six Apart FAQ talks about this:

Now Alice goes to Carol’s weblog. Carol also allows comments by registered users only. Alice signs in using her existing TypeKey account and posts a comment to Carol’s weblog, which goes into Carol’s moderation queue, because this is Alice’s first comment on his weblog.

But Alice hates Carol, so she left a nasty comment! Carol receives the comment via email, doesn’t like the tone of it. So she logs into Movable Type and bans Alice from posting comments to his weblog.

Damn, folks, but nasty is relative. Since the beginning of this year I have been labeled as vicious, nasty, rude, negative, and about everything you can think of. Not because I’m using names, or even personal attacks, but because I have used a specific tone of voice, or an abrupt way of speaking; I have used sarcasm and satire in responding; I have said negative things about what a person has written. Recently, a tone was even implied because I used the person’s last name, rather than their first to address them!

(As a personal aside, am I getting tired of passive aggressive types chastising my behavior, as if they were Mom or Dad, and I the wayward child? You god damn right I’m getting tired of it. More on this in a later writing.)

Now any comment registration system will keep me out of a weblog, and TypeKey is no different than a local system. I’m not making a statement against TypeKey, now, as much as I am against comment registration; against a growing trend that I’m seeing within the weblogging world to put up barriers and filters around our spaces so that we may control not only what’s discussed within our writing, but within the comments we attach to our spaces.

Combine this with never linking to contrary viewpoints, or disparaging same based on some group affiliation or at the behest of some A-lister who we’re sucking up to, and eventually we can still the voices and if we’re successful enough, the people speaking will lose heart and just go away and leave us alone.

Is this where we want to go with this brave new world?

Never say never

In my one weblogging post, I deliberately used a provocative title of “Patriot Act of Weblogging” to discuss TypeKey, and I received criticism for this, as I expected. However, for the most part, the reason why I used this title seems to have been lost.

In my opinion, the Patriot Act was an overcompensation based on fear and a reaction to being attacked. Through it our freedoms have been curtailed, though many people feel that the added security is worth it. To me, TypeKey is based on the same principles, though of course the events themselves are far, far different. There is no horrible and sudden loss of life, and no frightening and insidious curtailment of civil rights, and my use of this term should be, rightfully, called on because of this.

There is a hint, though, of the same overcompensation – a reaction against being ‘attacked’, a pulling in of our heads, like the turtle into its shell, an all or nothing to both events, so that when I first read the TypeKey announcement, my initial reaction was that it was the Patriot Act of weblogging.

(All or nothing. Hmmm. Sounds like a good title for an essay on communication and barriers, doesn’t it?)

TypeKey is all or nothing. Not using TypeKey in my weblog doesn’t end TypeKey’s influence on me. I said I would never register with TypeKey, which means never commenting at TypeKey enabled sites. Never say never, the saying goes, but for me, never means just that – never.

Feel free to TypeKey protect your comment systems and know that I for one will not be commenting there, and perhaps that makes you even happier about TypeKey. Of course, I’ve also instigated lively discussions in your comments at times, or about your posts, but that’s beside the point. The important thing is that you have complete and utter control over who says what in your space, and that’s all that matters.

Be nice, or be gone

Be nice, or be gone someone said to me recently.

Odd thing, weblogs and comments. We say to each other, “Our weblogs are our homes and we should be able to control what’s said in them”. Yet, they aren’t our homes, are they? You don’t keep your door open for anyone to just walk in to your home, do you? Weblogs are published online supposedly because we want a broader audience for our thoughts and writing than just our friends and family.

They aren’t really our ‘homes’, and the analogy fails in so many ways, but they are our spaces, so we have a right to control them and hold people who comment accountable, don’t we?

But who holds us accountable? I’ve seen again and again, the weblogger write the most inflammatory material in an essay, and when you respond to the tone they set in their writing, or to their responses to your earlier comments, you’re told to be nice, or be gone.

We say, commenters should be held accountable for what they say. I say, but then, who holds the weblogger accountable?

Be nice, or be gone.

I guess I and all the other troublesome, negative, critical, contrary, rude, nasty, vicious, and dissenting voices that you see as graffiti on the wall will be gone, and though we can write in our own weblogs, we’ll never be part of the conversations. Free to speak, true; but not to be part of a discussion; on the outside looking in through the window at the party, trying to be heard through the thick panes. After a while though, shouting in the street gets discouraging and disheartening, and perhaps some day we’ll just be gone for good.

Just think, though: when we’re gone, you won’t need TypeKey. That’s great, isn’t it?