Categories
Web

Web 9.75

“The precision of naming takes away from the uniqueness of seeing.” (Pierre Bonnard)

Nick Carr comments on Google’s Web 3.0, pointing out that Web 3.0 was supposed to be about the Semantic Web or, as he puts it, the first step in the Machine’s Grand Plan to take over.

For all the numbers we flash about, there really are only so many variations of data, data annotation, data access, and data persistence, and every version of “web” features the same concepts, rearranged. Perhaps instead of numbers, we should use descriptive terminology when naming each successive generation of the web, starting with the architecture of the webs.

Application Architectures

thin client

This type of application is old. Older than dirt. A thin client is nothing more than an access point to a server, typically managing protocols but not storing data or installing applications locally. All the terminal traditionally does is capture keystrokes and pass them along to the server-based application. The old mainframe applications were, and many still are, thin clients.

There was a variation of the thin client a while back when the web was really getting hot: the network computer. Oracle did not live up to its name when it invested in this functionality, long ago. The network computer was a machine created to do little more than access the internet and display pages. In a way, it’s very similar to what we have with the iPhone and other handheld devices: there is no way to add third-party functionality to the interface device, and any functionality, at all, comes in through the network.

Is a web application a thin client? Well, yes and no. For something like an iPhone or Apple TV, I would say yes, it is a thin client. For most uses, though, web applications require browsers and plug-ins and extensions, all of which do something unique, and require storage and the ability to add third-party applications, as well as processing functionality on the client. I would say that a web application where most of the processing is done on the server, and little or none in the browser, is a thin client. Beyond that, though, the web application would be…

client/server

A client/server application typically has one server or group of servers managed as one, and many clients. The client could be a ‘thin’ client, but when we talk about client/server, we usually mean that there is an application, perhaps even a large application, on the client.

In a client/server application, the data is traditionally stored and managed on the server, while much of the business processing as well as user interface is managed on the client. This isn’t a hard and fast separation, as data can be cached on the client, temporarily, in order to increase performance or work offline. Updates, though, typically have to be made, at some point, back to the server.

The newest incarnation of web applications, the Rich Internet Applications (RIA), are, in my opinion, a variation of client/server applications. The only difference between these and applications built with something like Visual Basic is that we’re using the same technologies we use to build more traditional web applications. We may or may not move the application out of the browser, but the infrastructure is still the same: client/server.

However, where RIA applications may differ from the more traditional web applications is that RIA apps could be a variation of client/server–a three-tier client/server application…

n-tier

In a three-tier or, more properly, n-tier client/server application, there is separation between the user interface and the business logic, and between the business logic and the data, creating three levels of control rather than two. The reasoning is that changes in the interface between the business layer and the data don’t necessarily impact the UI, and vice versa. To match the architecture, the UI can be on one machine, the business logic on a second, and the data on a third, though the latter isn’t a requirement.
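To make the layering concrete, here’s a minimal sketch in PHP (all class and method names are invented for illustration): the UI tier talks only to the business tier, and the business tier is the only thing that talks to the data tier.

<?php
// Data tier: the only layer that knows where the data actually lives.
class OrderData {
    public function fetchOrders($customerId) {
        // Imagine SQL here; canned data keeps the sketch self-contained.
        return array(
            array('id' => 1, 'overdue' => true),
            array('id' => 2, 'overdue' => false),
        );
    }
}

// Business tier: applies the rules, knows nothing about storage.
class OrderLogic {
    private $data;
    public function __construct(OrderData $data) {
        $this->data = $data;
    }
    public function overdueOrders($customerId) {
        $overdue = array();
        foreach ($this->data->fetchOrders($customerId) as $order) {
            if ($order['overdue']) {
                $overdue[] = $order;
            }
        }
        return $overdue;
    }
}

// UI tier: sees only the business layer. Swap the data tier for a
// different database, and this code never changes.
$logic = new OrderLogic(new OrderData());
print count($logic->overdueOrders(42)) . " overdue order(s)\n";
?>

Move each class to its own machine behind a service interface and you have the physical three-tier split; keep them on one box and you still get the maintenance benefit of the separation.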

Some RIA applications can fit this model, because many do incorporate the concept of a middleware component. As an example, the newer Flex infrastructure can be built as a three-tier application with the addition of a Flex server.

Some web applications, whether RIA or not, can also make use of another variation of client/server…

distributed client/server

Traditional client/server: many clients working against one set of business logic mapped to a database server, running serially. It’s the easiest type of application to create, but also the one least likely to scale, and from this limitation arises the concept of a distributed client/server, or distributed computing, architecture.

The ‘distributed’ in this title comes from the fact that the application functionality can be split into multiple objects, each operating on possibly different machines at the same time. It’s the parallel nature of the application that tends to set this type of architecture apart, and which allows it to more easily scale.

J2EE applications fit the distributed computing environment, as does anything running CORBA, the older COM, or the newer .NET. It is not a trivial architecture, and it needs the support of infrastructure components such as WebLogic or JBoss.

This ‘distributed parallel’ functionality sounds much like today’s widget-bound sidebars, wherein a web page can have many widgets, each performing a small amount of functionality on a specific piece of data at the same time (or as parallel as can be, considering that the page is probably not running in a true multi-threaded space).

Remember, though, that widgets tend to operate as separate and individual applications, each with its own API (Application Programming Interface) and data. Now, if all the widgets were front ends to backend processes running in parallel, and working together to solve a problem, then the distributed architecture shoe fits.

There’s a variation of distributed computing–well, sort of–which is…

Service Oriented Applications

Service Oriented Applications (SOA), better known as ‘web services’. These are the APIs, the RESTful service requests, and the other services that run the web we seem to become more dependent on every day. Web services are created completely independent of the clients, supporting a specific protocol and interface that make the web services accessible regardless of the characteristics of the client.

The client then invokes these services, sending data and getting data back, and does so without having any idea of how the web services are developed or what language they’re developed in, other than knowing the prototype and the service.
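To show just how little the client needs to know, here’s about all a RESTful service call amounts to from the client’s side, sketched in PHP (the endpoint URL and response shape are made up; substitute any real service):

<?php
// The client knows the service URL and the shape of the response,
// and nothing about the implementation behind it.
$response = file_get_contents('http://example.com/services/photos?user=bb&format=xml');
if ($response === false) {
    die("Service request failed\n");
}

// The service could be Java, Python, or COBOL for all we know or care;
// all we agreed on is the XML coming back.
$photos = simplexml_load_string($response);
foreach ($photos->photo as $photo) {
    echo $photo['title'], "\n";
}
?>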

Clean and elegant, and increasingly what runs today’s web. The interesting thing about web services is that they can be anything from almost trivially easy to tortuously complex to implement. And no, I didn’t specifically mention the WS-* stack.

Of course, all things being equal, there’s no simpler architecture than…

A stand alone application

A stand alone application is one where no external service is necessary for accessing data or processes. Think of something like Photoshop, and you get a stand alone application.

The application may have internet capabilities, but typically these are incidental. In addition, the data may not always be on the same machine, but it doesn’t matter. For instance, I run Photoshop on one Mac, but many of my images are on another Mac that I’ve connected to through networking. However, though I may be accessing the data over the ‘net, the application treats the data as if it is local.

The key characteristic of a stand alone application is that you can’t split the application up across machines; it’s all or nothing. It’s also the only architecture that can’t ‘speak’ web, so we can’t look for Web 3.0 among the stand alones.

Alone again, naturally…

No joy in being alone; what we need is a little help from our friends.

P2P

P2P, or peer-to-peer, applications are built in such a way that once multiple peers have discovered each other through some intermediary, they communicate directly–sharing process, data, or both. A client can become a server, and a server can become a client.

Joost is an example of a P2P application, as is BitTorrent. There is no centralized place for data, and the same piece of data is typically duplicated across a network. Using a P2P application, I may get data from one site, which is then stored locally on my machine. Another person logging on to the P2P network can then get that same piece of data from me.
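The mechanics are easier to see in code than in prose. Here’s a toy simulation in PHP, with the peer names and chunk numbers invented and all the real-world plumbing (discovery, verification, networking) waved away:

<?php
// Toy P2P swarm: every peer is both client and server. Once a peer
// holds a chunk, it becomes a source for every other peer.
$allChunks = array(1, 2, 3, 4);
$peers = array(
    'alice' => array(1 => true, 2 => true), // starts with chunks 1 and 2
    'bob'   => array(3 => true),
    'carol' => array(4 => true),
);

$moved = true;
while ($moved) {
    $moved = false;
    foreach (array_keys($peers) as $name) {
        foreach ($allChunks as $chunk) {
            if (isset($peers[$name][$chunk])) {
                continue;
            }
            // Fetch from any peer that has the chunk; no central server.
            foreach ($peers as $other => $otherHave) {
                if ($other !== $name && isset($otherHave[$chunk])) {
                    $peers[$name][$chunk] = true; // the 'download'
                    $moved = true;
                    break;
                }
            }
        }
    }
}

echo count($peers['bob']) . " of 4 chunks at bob\n"; // all 4, eventually
?>

Notice that no peer ever needed a single complete source once the chunks were scattered: the file comes from everywhere and nowhere in particular.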

The power of this environment is that it can really scale. No one machine is burdened with all data requests, and a resource can be downloaded from many sources rather than just one. It is not a trivial application to build, though, and it requires careful management to ensure that any one participant’s machine isn’t made overly vulnerable to hacking, that downloads are complete, that data doesn’t get released into the wild, and so on. Communication and network management is a critical aspect of a P2P application.

These are the architectures, at least, the ones I can think of off the top of my head. Which, then, becomes the ‘next’ Web, the Web 3.0 we seem to be reaching for?

Web 3.0

Ew Ew Ew! The next generation of the web must be Google’s cloud thing, right? So that makes Web 3.0 a P2P application, and we call it “Google’s P2P Web” or “MyData2MyData”?

Ah, no.

The concept of ‘cloud’ comes from P2P (*correct?). It is a lyrical description of how data appears to a P2P application…coming from a cloud. When we make a request for a specific file, we don’t know the exact location the file is pulled from; chances are, it’s coming from multiple machines. We don’t see any of this, though, hence the term ‘cloud’. Personally, I prefer void, but that’s just semantics.

The term cloud has been adopted for other uses. Clouds are used with ‘tags’ to describe keyword searches, the size of the word denoting the number of requests. I read once where a writer called the entire internet a cloud, which seems too generic to be useful. Dare Obasanjo wrote recently on the discussions surrounding OS clouds, which, frankly, don’t make any sense at all and, methinks, use cloud in the poetic sense: artful rather than factual.

The use of ‘cloud’ also occurs with SOA, which probably explains Google’s use of the term. And Microsoft’s. And Apple’s, if they wanted, but they didn’t–being Apple (Stickers on our machines? We don’t need no stinking stickers!). Is the next web then called “BigCo SOA P2P Web”?

Let’s return to Google CEO Schmidt’s use of the cloud, as copied from Carr’s post, mentioned earlier:

My prediction would be that Web 3.0 would ultimately be seen as applications that are pieced together [and that share] a number of characteristics: the applications are relatively small; the data is in the cloud; the applications can run on any device – PC or mobile phone; the applications are very fast and they’re very customizable; and furthermore the applications are distributed essentially virally, literally by social networks, by email. You won’t go to the store and purchase them. … That’s a very different application model than we’ve ever seen in computing … and likely to be very, very large. There’s low barriers to entry. The new generation of tools being announced today by Google and other companies make it relatively easy to do. [It] solves a lot of problems, and it works everywhere.

With today’s announcement of Google shared space, we’re assuming that Google thinks of third-party storage as ‘cloud’, similar to Microsoft with its Live SkyDrive or Apple with its .Mac. It’s the concept of putting either data or processes out on third-party systems so that we don’t have to store them on our local machines or lease server space to manage such on our own.

In Google’s view, Web 3.0 is more than ‘just’ the architecture: it’s small, fast applications built on an existing infrastructure (think Mozilla, Silverlight, Flex, etc.) that can run locally or remotely; on phones, handhelds, and/or desktop or laptop computers; that store data locally and remotely; built on web services run on one or many machines, created by one or more than one company. I guess we could call Google’s web the Small, Fast, Device Independent, Remote Storage, SOA P2P Web, which I will admit would not fit easily on a button, nor look all that great with ‘beta’ stuck to its ass.

Not to mention that it doesn’t incorporate all that neat ‘social viral’ stuff. (I knew I forgot something.)

The social viral stuff

Whatever makes people think that Facebook or MySpace or any of the like is ‘new’? Since the very first release of the internet we’ve had sites that have enabled social gathering of one form or another. The only thing the newer forms of technology provide is a place where one can hang one’s hat without having to have one’s own server or domain. That’s not ‘social’–that’s positional.

Google mentions how we won’t be buying software at the store. I had to check the date on the talk, because we’ve been ‘spreading’ software through social contact for years. Look in the Usenet groups and you’ll see recommendations for software or links to download applications. Outside of an operating system and a couple of major applications, I imagine most of us download our software now.

What Google’s Schmidt is talking about isn’t downloaded software so much as software that has a small installation footprint or doesn’t even need to be installed at all. Like, um, just like the software it provides. (Question: What is Web 3.0? Answer: What we’re selling.)

Anyone who has ported applications is aware of what a pain this is, but the idea of a ‘platformless’ application has been around as long as Java has, which is longer than Google. It’s an attractive concept, but the problem is you’re more or less tied into the company, and that tends to wear the shininess off ‘this’ version of the web–not to mention all that ‘not knowing exactly what Google is recording about us, as we use the applications’ thing that keeps coming up in the minds of us paranoid few.

Is the next web then, the Small, Fast, Device Independent, Remote Storage, SOA P2P, Proprietary Web? God, I hope not.

Though Schmidt’s bits and cloudy pieces are a newer arrangement of technology, the underlying technology and the architectures have been around for some time: the only thing that really differs is the business model, not the tech. In this case, then, ‘cloud’ is more marketing than making. Though the data could end up on multiple sites, hosted through many companies, the Google cloud lacks both the flexibility and the freedom of the P2P cloud, because at the heart of the cloud is…Google. I’ve said it before and will say it again: you can’t really have a cloud with a solid iron core.

Though ‘cloud’ is used about as frequently as lipstick at a prom, I don’t see the next generation of the web being based on Google’s cloud, or Microsoft’s. Or Adobe’s or Mozilla’s or Amazon’s or any single organization’s.

If Google’s Web 3.0, or, more properly, Small, Fast, Device Independent, Remote Storage, SOA P2P, Proprietary, Web with an Iron Butt, is a bust, does this mean, then, that the Semantic Web is the true Web 3.0 after all?

Semantic Web Clouds…and stuff

Trying on for size: a Semantic Client/Server Web. Nope. Nope, nope, nope. Doesn’t work. There is no such thing as a semantic client/server. Or a semantic thin client, or even distributed semantics, or SOA RDF, though this last comes closest, while managing to sound like something that belongs on a Boy Scout badge.

Semantics on the web is basically about metadata–data about data. Our semantic efforts are focused on how metadata is recorded and made accessible. Metadata can be recorded or provided as RDF, embedded in a web page as a microformat, or even found within the blank spaces of an image.
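To make that concrete: a microformat is nothing more exotic than agreed-upon class names layered over ordinary markup. A minimal hCard, for example (the name and URL here are placeholders):

<div class="vcard">
  <a class="url fn" href="http://example.com/">Jane Webfoot</a>
</div>

A parser that knows the hCard vocabulary can lift a contact out of that; every other reader just sees a link.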

We all like metadata. Metadata makes for smarter searches, more effective categorization, better applications, findability. If data is one dimension of the web, then metadata is another, equally important.

The semantic web means many things, but “semantic web” is not an application architecture, or a profoundly new way of doing business. Saying Web 3.0 is the Semantic Web implies that we’ve never been interested in metadata in the past, or that we’ve been waiting for some kind of solar congruence to bring together the technology needed.

We’ve been working with metadata since day one. We’ve always been interested in getting more information about the stuff we find online. The only difference now from the good old days of web 1.0 is we have more opportunities, more approaches, more people are interested, and we’re getting better when it comes to collecting and using the metadata. Then again, we’re also getting better with just the plain data, too.

Web 3.0 isn’t Google’s cloud and it isn’t the Semantic Web and it certainly isn’t the Small, Fast, Device Independent, Remote Storage, Viral, SOA, P2P, Proprietary, Smart Web with an Iron Butt. Heck, even Web 3.0 isn’t Web 3.0. So what is the next great Web, and what the devil are we supposed to call it?

Web 9.75

It is a proprietary thing, this insistence on naming things. “From antiquity, people have recognized the connection between naming and power”, Casey Miller and Kate Swift wrote.

We can talk about Web 1.0, or 2.0, or 3.0, but my favorite is Web 9.75, or Web Nine and Three-Quarters. It reminds me of the train platform in the Harry Potter books, which could only be found by wizards. In other words, only found by the people who need it, while the rest of the world thinks it’s rubbish.

There are as many webs as there are possible combinations of all technologies. Then again, there are as many webs as there are people who access them, because we all have our own view of what we want the web to be. Thinking of the web this way keeps it a marvelously fluid and ever changing platform from which to leap, unknowing and unseeing.

When we name the web, however, give it numbers and constrain it with rigid descriptions and manufactured requirements, then we really are putting the iron into the cloud; clipping our wings, forcing our feet down paths of others’ making. That’s not the way to open doors to innovation; that’s just the way to sell more seats at a conference.

Instead, when someone asks you what the next Web is going to be, answer Web 9.75. Then, when we hear it, we’ll all nudge each other, wink, and giggle, because we know it’s nonsense, but no more nonsense than Web 1.0, Web 2.0, Web 3.0, or even Google’s Web-That-Must-Not-Be-Named.

*As reminded in comments, network folks initially used ‘cloud’ to refer to that section of the network labeled “…and then a miracle happens…”

Categories
Web

Controlling your data

Popular opinion is that once you publish any information online, it’s online forever. Yet the web was never intended to be a permanent snapshot, embedding past, present, and future in unbreakable amber, preserved for all time. We can control what happens to our data once it’s online, though it’s not always easy.

The first step is, of course, not to publish anything online that we really want to keep personal. However, times change, and we may decide that we don’t want an old story to appear in search engines, or MySpace users hotlinking our images.

I thought I would cover some of the steps and technologies you can use to control your online data and media files. If I’ve missed any, please add them in the comments.

Robots.txt

The granddaddy of online data control is robots.txt. With this you can control which search engine webbot can access which directory. You can even remove your site entirely from all search engines. Drastic? Unwelcome? As time goes on, you may find that pulling your site out of the mainstream is one way of keeping what you write both timely and intimate.

I discussed the use of robots.txt years ago, before the marketers discovered weblogging, when most people were reluctant to cut themselves off from the visitors arriving from the major search engines. We used to joke about the odd search phrases that brought unsuspecting souls to our pages.

Now, weblogging is much more widely known, and people arrive at our pages through all forms of media and contact. In addition, search engines no longer send unsuspecting souls to our pages as frequently as they once did. They are beginning to understand and manage the ‘blogging phenomenon’, helped along by webloggers and our use of ‘nofollow’ (note from author: don’t use, bad web use). Even now, do we delight in the accidental tourists as much as we once did? Or is that part of a bygone, innocent era?

A robots.txt file is a text file with entries like the following:

User-agent: *
Disallow: /ajax/
Disallow: /alter/

This tells all webbots not to traverse the ajax or alter subdirectories. All well-behaved bots follow these rules, and that includes the main search engines: Yahoo, Google, MSN, Ask, and that other guy, the one I can never remember.

The place to learn more about robots.txt is, naturally enough, the robots.txt web site.

If you don’t host your own site, you can achieve the same effect using a META element in the head section of your web page. If you’re not sure where this section is, use your browser’s View Source capability: anything between opening and closing “head” tags is the head section. Open mine and you can see the use of a META element. Another example is:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

This tells web bots to not index the site and not harvest links from the site.

Another resource you might want to protect is your images. You can tell search engines to bypass your images subdirectory if you don’t want the images picked up in image search. This technique doesn’t stop people from copying your images, which you really can’t prevent without using Flash or some other strange web-defying move. You can, however, stop people from embedding your images directly in their web pages, a concept known as hotlinking.
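For instance, to keep just Google’s image crawler out of an images subdirectory (Googlebot-Image is the name that crawler actually announces itself with; your directory name may differ), an entry along these lines does the job:

User-agent: Googlebot-Image
Disallow: /images/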

There are good tutorials on how to prevent hotlinking, so I won’t cover it here. Search on “preventing hotlinking” and you’ll see examples, both in PHP code and in .htaccess.
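Just to give the flavor, though, the .htaccess version usually looks something like this, assuming Apache with mod_rewrite (swap in your own domain and whatever file extensions you serve):

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com/ [NC]
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]

The first condition lets through requests with no referrer at all, so you don’t lock out people behind privacy proxies; the last line returns Forbidden for image requests arriving from anywhere but your own pages.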

Let’s say you want to have the search engines index your site, but you decide to pull a post. How can you pull a post and tell the search engines you really mean it?

410 is not an error

There is no such thing as a permanent, fixed web. It’s as fluid as the seas, as changeable as the weather. That’s what makes this all fun.

A few years back, editing or deleting a post was considered ‘bad form’. Of course, we now realize that we all change over time and a post that seemed like a good idea at one time may seem a terrible idea a year or so later. Additionally, we may change the focus of our sites: go from general to specific, or specific back to general. We may not want to maintain old archives.

When we delete a post, most content management tools return a “404” when the URL for the page is accessed. This is unfortunate, because a 404 tells a web agent that the page “was not found”. An assumption could be made that it’s only temporarily gone: the server is having a problem, or a redirect is not working right. Regardless, a 404 carries the assumption of a condition that may be cured at some point.

Another 4xx HTTP status is 410, which means that whatever you tried to access is gone. Really gone. Not just on vacation. Not just a bad redirect, or a problem with the domain–the resource at this spot is gone, g-o-n-e. Google considers these an error, but don’t let that big bully fool you: this is a perfectly legitimate status and state for a resource. In fact, when you delete a post in your weblog, you should consider adding an entry to your .htaccess file to note that the resource is now 410.

I pulled a complete subdirectory and marked it as gone with the following entry in .htaccess:

Redirect gone /category/justonesong/
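If you can’t touch .htaccess, or your host won’t let you, the same status can be sent from a placeholder page left at the old URL; a minimal sketch in PHP:

<?php
// Send a 410 for a retired page; drop this file in at the old URL.
header('HTTP/1.1 410 Gone');
echo 'This resource is gone. Really gone.';
?>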

I tried the .htaccess approach on an older post and, sure enough, all of the search engines pulled their references to the item. It is, to all intents and purposes, gone from the internet. Except…

Except there can be a period where the item is gone but cache still remains. That’s the next part of the puzzle.

Search Engine Cache and the Google Webmaster Toolset

Search on a term and most results have a couple of links in addition to the link to the main web page. One such link is for the cache of the site: a snapshot from the last time the webbot stopped by.

Caching is a handy thing if you want to ensure people can access your site. However, caching can also perpetuate information that you’ve pulled or modified. Depending on how often the search engine refreshes the snapshot, it could reflect a badly out of date page. It could also reflect data you’ve pulled, and for a specific reason.

Handily enough, as I was writing this, I received an email from a person who had written a comment on my weblog in 2003 and who had typed out his then-current URL and an email address. When he searched on his name, his comment in my space showed on the second page of results. He asked if I could remove his email address from the comment, which was simple enough to do.

If this item had still been cached, though, his comment would have remained in cache, email address and all, until that page was refreshed. As it was, it was gone instantly, as soon as I made the change.

How frequently older pages such as these are accessed by the bots really does depend, but when I tested with some older posts on other weblogs, most of the cached entries were a week old. Not that big a deal, but if you really want control over your space, you’re going to want to consider eliminating caching.

To prevent caching, add the NOARCHIVE META tag to your header:

<META NAME="ROBOTS" CONTENT="NOARCHIVE">

To have better control of caching with Google, you need to become familiar with the Google Webmaster tools. I feel like I’ve been really picking on Google lately. I’m sure such will adversely impact the share price, and bring down searching as we know it today–bad me. However, I was pleased to see Google’s addition of a cache management tool within the Google Webmaster tool set. This is a useful tool, and since there are a lot of people who have their own sites and domains but aren’t ‘techs’–they don’t do tech for a living or necessarily follow sites that discuss such tools–I thought I’d walk through the steps in how to control search engine caching of your data.

So….

To take full advantage of the caching tool, you’ll need a Google account, and access to the Webmaster tools. You can create an account from the main Google page, clicking the sign in link in the upper right corner.

Once you have created the account and signed in, from the Google main page you’ll see a link that says “My Account”. Click on this. In the page that loads, you can edit your personal information, as well as access GMail, Google Groups, and, for the purposes of this writing, the Webmaster toolset.

In the Webmaster page, you can access domains already added, or add new domains. For instance, I have added burningbird.net, shelleypowers.com, and missourigreen.com.

Once added, you’ll need to verify that you own the domain. There are a couple of approaches: add a META tag to your main web page, or create a file given the same name as a key generated for you by Google. The first approach is the one you want if you don’t provide your own hosting, such as if you’re hosted at Blogger, Typepad, or WordPress.com. Edit the header template and add the tag, as Google instructs. To see the use of a META tag, you can view source for my site and you’ll see several in use.

If you do host your site and would prefer the other approach, create a text file with the same name as the key that Google generates for you when you select this option. That’s all you need: a file given the name Google provides–it can be completely empty. Once created, FTP it, or use whatever technique you normally use, to upload it to the site.

After you make either of these changes, click the verify link in the Webmaster tools to complete the verification. Now you have established with Google that you are, indeed, the owner of the domain. Once you’ve verified the site, clicking on each domain URL opens up the toolset. The page that opens has four tabs: Diagnostic, Statistics, Links, and Sitemaps. The first three will most likely have useful information for you right from the start.

Play around with all of the tabs later; for now, access Diagnostic, and then click the “URL Removal” link on the left side of the page. In the page that opens, you’re given a chance to remove links to your files, subdirectories, or your entire site at Google, including removing the associated cache. You can also use the resource to add items back.

You’ve now prevented webbots from accessing a subdirectory, told the webbots a file is gone, and cleaned out your cache. Whatever you wrote and wish you didn’t is now gone. Except…

Removing a post from aggregation cache

Of course, just because a post is removed from the search engines doesn’t mean that it’s gone from public view. If you supply a syndication feed, aggregators will persist feed content for some period of time (or some number of posts). Bloglines persists the last 100 entries in a feed, and I believe that Google Reader persists even more.

If you delete a post, to ensure the item is removed from aggregator cache, what you really need to do is delete the content of the item and then re-publish it. This ‘edit’ then overwrites the existing entry in the aggregator cache.

You’ll need to make sure the item has the same URL as the original posting. If you want, you can write something like, “Removed by author” or some such thing — but you don’t have to put out an explanation if you don’t want to. Remember: your space, your call. You could, as easily, replace the contents with a pretty picture, poem, or fun piece of code.

Once the item is ‘erased’ from aggregation, you can then delete it entirely and create a 410 entry for the item. This will ensure the item is gone from aggregators AND from the search engines. Except…

That pesky except again.

This is probably one of the most critical issues in controlling your data, and no one is going to be happy with it. If you publish a fullcontent feed, your post may be picked up by public aggregators or third-party sites that replicate it in its entirety. Some sites duplicate and archive your entries, and allow both traversal and indexing of their pages. If you delete a post that would no longer be in your syndication feed (it’s too old), there’s no way to effectively ‘delete’ the entry at these sites. From my personal experience, you might as well forget asking them not to duplicate your feeds–with many, the only way to prevent such is to either complain to their hosting company or ISP, or to bring in a lawyer.

The only way to truly have control over your data is not to provide fullcontent feeds. I know, this isn’t a happy choice, but as more and more ‘blog pirates’ enter the scene, it becomes a more viable option.

Instead of fullcontent, provide an excerpt, as much of an excerpt as you wish to persist permanently. Of course, people can manually copy the post in its entirety, but most people don’t. Most people follow the ‘fair use’ aspect of copyright and quote part of what you’ve written.

There you go: some approaches to controlling your data. You may not have control over what’s quoted on other web sites under fair use, but that’s life in the internet lane, which returns us to item number one in controlling your data–don’t publish it online.

Categories
Diversity Technology Web

Speak softly

Recovered from the Wayback Machine.

Interesting writing and discussion on another perspective about women in technology. This is from the DevChix group, and though I really dislike the use of ‘chix’ and ‘grrl’ when referencing professional women, it’s a good site to discover women working in the newer Web 2.0 technologies.

In the essay, the writer, who goes by gloriajw, believes that one of the reasons women have been dropping out of the field is the hostile nature of most tech environments. She addresses this from the perspective of what makes women-only groups more approachable:

The material for this article came about through my participation in both women-only and mixed gender groups of many kinds. When I wonder why tech groups aren’t tolerable for many women, I look at the inverse of the problem: What makes women-only tech groups more tolerable for women?

Of the behavioral patterns she’s identified in said groups, she mentions a strong sense of community, and in particular how communication is managed:

Destructive criticism is the best way to keep a site predominantly male. It implies that there is no concern about whether a person can learn from a response or not, or whether they would find offense. It is an outward display of ego, a territorial “pissing rite” in which most women do not and will not participate.

In such groups, the author states, bad behavior is seldom called out and is typically ignored. Contrastingly, in women’s groups:

If you do something awful to one woman in a women-only community, all will hear and know about it, and you are ousted. Most of the time this is first discussed and voted on by many group members. Many times the women’s group will even make an effort to explain the offense to the oblivious offender. But if the offender is still oblivious and/or offending, the offender is out. This is done to protect the interests and goals of the group. Many male dominated online groups don’t run this way. Most if not all women’s groups run this way, whether online or off.

There is a reason why I won’t join such women’s groups, and this paragraph more or less sums it up for me. This ‘group think’ way of dealing with difficulty I find, frankly, repugnant. I happen to agree that ignoring a person who exhibits ‘bad’ behavior is one of the better approaches to take. I’ve seldom seen a troll continue when no one responds to what they say.

And what is ‘bad’ behavior? When does such voting take place? In the last week I’ve been called both mean and vicious because of my criticism of a company and a company’s actions. Is it then that one must preface all criticism with something sweet and fluffy in order to ease the difficulty of the words? I can’t think of any better approach to shutting down all discussion than having to struggle through some inner debate about how to couch criticism in ‘nice’ terms before expressing it. Weblogging has demonstrated that nice is relative–having to do with popularity, as much as tone and word usage.

gloriajw has four suggestions for online discussion areas to make them more inviting for women:

  1. Immediately delete offending and off-topic comments
  2. Return aggressive or overly harsh comments to the writer and have them re-phrase
  3. Treat the space like a community, which I presume means to monitor it
  4. Explicitly state the site is ‘woman friendly’

She also has approaches for men to take when communicating with women:

  • Don’t assume that when a woman is enthusiastic about her work, she’s hitting on you or that it has anything to do with you
  • Leave your libido at the door
  • Women aren’t dressing the way they do because they’re sending you signals
  • Something about guy humor can be OK if the first three items are kept in mind

There is some of this I agree with, but I have to ask the question: do women spend all day running from the men in their groups? I’ve rarely had issues with being hit on, even when I was younger and considered ‘purty’. I’ve rarely seen this happen to other women. Is this happening, now, among the younger men? Younger tech guys, do you spend all your time hitting on the women at work?

There’s too much emphasis lately on women being perceived as sexual objects or victims, and way too much emphasis on how the problems women are having in technology are because men see us as sex objects. I’m sorry, this is not my perception. I’ve been in the industry 25 years, and I’ve rarely seen women hit on at work, nor do I see such behavior in most of the discussions I get involved in.

Does it exist? I imagine so, but I seriously doubt this is the reason women are not joining and are leaving the tech field. Why? Because such behavior is everywhere–it’s not unique to Web 2.0 environments. The feel of titanium or the glow of an LCD does not trigger men into being primal savages.

As for the aggressive nature of the discussions: again, considering that I’m also seen as an ‘aggressive’ communicator, I don’t know that communication style is the problem so much as lack of respect, which the communication only reflects. To me, the larger issue is that women in tech are not as respected as the men, and hence our work is more easily discredited or ignored, our contributions downplayed, our participation compromised. Worse, when we do get into passionate discussion, our arguments tend to be discredited using the too-typical ‘shrill’ or, my personal favorite, ‘hysterical’.

What concerns me about writings such as gloriajw’s is that they can actually make things worse, rather than better.

The first writing I ever did on sexism in this weblog was related to Doc Searls–yes, beloved, gentle Doc Searls. Doc Searls is a nice man, and yes, he does reference and link to women–more than a lot of other guys. But he’ll never get into a discussion with a woman. He will never debate a woman. In close to seven years of off-and-on reading of his site, I’ve never seen him have a truly engaged discussion with a woman. To this day, I don’t know if it’s because he doesn’t respect us professionally, or if it’s because he doesn’t know how to have such a discussion without coming across as a bully or being abusive. By not engaging with women, though, he does us more harm than if he wrote that we’re all skanky bitches.

If we keep emphasizing how women need ‘safe’ places, we’re going to get exactly what we’re asking for: safe, isolated, segregated spaces where we never have to worry about harsh words. We’ll also never have to worry about reaching the top positions in our fields, becoming as well known, being invited to conferences, and so on, either.

Respect is the key, not tone of voice or words used. If a person respects you, it comes across in how they respond to what you say. They may get angry, they may tell you you’re dead wrong, and they may even say you’re being an idiot in this situation. However, if the overall interaction is one of respect, the tone of any particular discussion doesn’t matter. That’s the real problem we women have: we don’t have the respect that, frankly, we deserve.

Case in point is the DevChix site itself. The site has been around almost a year, and covers all sorts of topics, including those of interest to the Ajaxian set. Yet I don’t think I’ve seen any of this site’s writings linked by sites such as Ajaxian. In fact, the same looks to be true for each individual contributor’s weblog–I can’t see that any of these women have been linked by the more dominant or well-known tech weblogs.

I first found out about this writing at Simon Willison‘s weblog. Yet this is the first time (that I can find through the search engines) that Simon has ever referenced a writing from the site. Or, from what I can see, the individual weblogs of the authors. Yet they write on many topics related to the tech that Simon is interested in.

Simon does point to the Reddit thread as confirmation of the writing, but really, doesn’t this just confirm that discussions at Reddit (and Digg) tend to degenerate into three-year-olds flinging shit, no matter what the topic? Frankly, as much as I’d like to blame Reddit or Digg or even Slashdot for women leaving tech, I find it unlikely.

Men and women are both equally capable of being aggressive and mean, and though society has educated each sex to express such in differing ways, we need to stop painting women as fragile flowers who can’t handle strong disagreement, while all men do is go toe to toe and spit at each other. What we need to question is: when there are women in the field, writing on the topics, speaking on them, going to the conferences, why aren’t we given the acknowledgment? Why aren’t we given either the respect due us as professionals, or the attention we deserve as active participants? At a minimum, in this supposedly equal world where no one knows you’re a man, woman, or dog, why aren’t we given the links?

For all that I disagree with gloriajw, I appreciate her post. At a minimum, it highlights yet more women who are working with the Web 2.0 technologies, and such attention is a good thing. I just wish that when members of the site write on technology, they would be equally noticed.

Categories
Web

Web 2.0 way of running a web application

Recovered from the Wayback Machine.

Question: How do you turn a Web 1.0 application into a Web 2.0 one?

Answer: Pull the plug.

I’ve been involved in comments over at Robert Scoble’s on the Zooomr crash and burn, and make no mistake: the application is down and out, and there is no plan in place to get it back up other than asking for donations. Donations for what? Well, we heard about server problems, but then we heard about RAID controller problems; discussion about the database server failing was followed by discussion that a 10TB server is not big enough. This was then followed by the problem isn’t CPU: it’s I/O, because of the static pictures being served. Kristopher Tate doesn’t seem to have a consistent idea of what the problem is. Other than, we think the controller on our main database server is bad.

What?

This is what I dislike about Web 2.0. Kristopher Tate and Thomas Hawk are darlings of the A List set, so these can’t be two men running an application, neither of whom really has a clue how to keep an application of this magnitude up and going. No, these are two brilliant, far-thinking futurists who only need money and help from someone to keep this going, as the troops rally around with “Way to go, guys!”

This application has been down over a week, after it went down once before with a promised rollout, after missing its initial rollout at its own startup party, following what sounds like other downtime problems. Do you think system users should be concerned? Perhaps even, dare I say it, critical? Not on your life. Being critical is not the Web 2.0 way.

There isn’t a note on the front page about the server being down for the count. No, they’ve posted some UStream videos where people can watch Hawk and Tate stare at a computer terminal, drink wine with ice, and talk to people who are asking them questions via some form of chat. However, when the server failed this last time, Thomas Hawk did write about the little train that could.

(I would pay money, real money, for one of you developers at, oh, Citibank, Boeing, John Hancock insurance, or so on to go into your bosses office next time you have problems with your applications, and tell him or her about the little train that could.)

I have had more than my share of problems with hardware in the past, and normally I would be much more sympathetic. In fact, since the application is database-driven and PHP, I believe, I might even have offered help–I’m better at troubleshooting problems than building apps from scratch. Not for an application and a company, though, that keeps billing itself as the place for photo sharing, with such grandiose pronouncements that one is forced to imagine that Flickr is the po’dunk site, while Zooomr is the tasty noodle waiting to be snapped up.

So now people are being asked to give, Kristopher has bills to pay, and Zooomr needs a new server. Personally, before I started throwing money at the site, I’d ask to see a business AND technical plan upfront, including a detailed estimate and listing of hardware Tate and Hawk need, now and for the next six months to a year. But, as Thomas Hawk has pointed out, I’m not really a part of that whole Flickr/Zooomr scene, so I don’t really understand.

True, I don’t. I don’t understand solving server problems by doubling or more the burden on the server; throwing new features on when the server can’t reliably support existing ones; and not putting an honest, blunt, note in the front of the site telling people exactly what’s happened, and how the site, and their photos and data, will be recovered. However, before other startups think to follow this as an example, all I have to say is: don’t try this at home, kiddies. Not unless you’re good buds with the Big Names.

I’m sure the company will get the support they need. After all, they have all sorts of friends among the movers and shakers. That’s the only thing that really matters in a Web 2.0 universe.

Speaking of movers and shakers, I have an idea for the company: Get Mike Arrington and Techcrunch to fund the equipment they need. After all, Techcrunch has been pushing the company as a Flickr buster since the very beginning. As much as Scoble and Arrington have been touting this site, you’d think they’d both be fighting each other for the honor of investing.

Categories
Web

A succinct look at web generations

Web 0.0

brink

Web 1.0

blink

Web 2.0

link

Web 3.0

think