Categories
Semantics Specs

Opinion: Australian Censorship Bill Could Impact P2P

Recovered from Wayback Machine.

Australia’s been in the news before about Net censorship legislation, but the South Australian Parliament may have gone a little extreme even for this Net-conservative country.

A bill introduced in November would make it illegal for content providers to post material that is considered “objectionable viewing material” for children. What’s objectionable viewing material? Anything that the police — the police, mind you — would consider as falling within the R, NC, or X ratings categories of the film industry. Ostensibly this would cover material such as child pornography or content advocating breaking the law. However, the bill is general enough that it could also cover material on topics such as abortion, suicide, drug use, sexual behavior and other sensitive topics that could be termed “adult topics” and therefore R-rated.

Even more alarmingly, under this bill posting this material is illegal even if access to the material is restricted or password-protected. Compounding the problem, content providers would have no way of knowing whether their material would fall under one of the prohibited classifications before posting it; if the material is judged by the police to be within the parameters of this bill, you’d be charged. No warning and no second chance. And the fines aren’t cheap: as much as 10,000 (Australian) dollars per offense.

According to an alert issued by Electronic Frontiers Australia, this bill would actually make material that’s legal offline, illegal once posted online.


The impact of this bill on Web-based businesses is obvious: the level of censorship implied would give even the most conservative businesses pause when it comes to posting content on their Australian-based Web sites. What may not be so noticeable, though, is the impact of this bill on peer-to-peer applications and services. You see, the wording of the bill doesn’t focus on Web-based content; it concerns content distributed via the Internet.

Consider the following scenario: You’re a subscriber to a file-sharing P2P service such as Napster. You make a request for material that could be considered “objectionable” because of the language used — for instance one of the more explicit songs from Alanis Morissette’s album “Jagged Little Pill,” or practically anything from Guns N’ Roses or Eminem. Once you’ve downloaded an “objectionable” song, it’s now on your machine for your personal use. However, in this process, you’ve also “posted” this content for access by other clients through the Internet: P2P is based on the fact that any node within the network can be both a client and server. According to this bill, you would be in violation of the law.

If you’re a subscriber to a decentralized service such as Freenet or Gnutella, the potential problems with this type of bill are even more extreme. With these types of P2P networks, if a file request is made from node A to node B, and then from node B to node C, that file is returned to node B as the intermediary first, and finally to node A. Now, not only is the peer located at C in violation of the law, so are A, who originally requested the file, and B, who did nothing more than subscribe to the conditions of the P2P service, which state that files may be stored on the client’s machine as a method of disseminating popular files throughout the network.
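
To make the A-to-B-to-C scenario concrete, here is a minimal Python sketch of a request being forwarded along a chain of peers, with each intermediary keeping a copy on the way back. The Node class, the single-neighbor chain, and the file name are assumptions made for illustration; this is not how Freenet or Gnutella actually route requests, only the caching behavior described above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A hypothetical peer that can both request and serve files."""
    name: str
    store: Dict[str, bytes] = field(default_factory=dict)
    neighbor: Optional["Node"] = None  # next hop in the request chain

    def request(self, key: str) -> Optional[bytes]:
        # Serve from the local store if this node already holds the file.
        if key in self.store:
            return self.store[key]
        # Otherwise forward the request to the next hop, if there is one.
        if self.neighbor is None:
            return None
        data = self.neighbor.request(key)
        if data is not None:
            # Keep a copy on the way back; this node now holds the file too.
            self.store[key] = data
        return data


c = Node("C", store={"song.mp3": b"audio bytes"})
b = Node("B", neighbor=c)
a = Node("A", neighbor=b)

a.request("song.mp3")
holders = [n.name for n in (a, b, c) if "song.mp3" in n.store]
print(holders)  # ['A', 'B', 'C']
```

After a single request, all three peers hold the file, which is exactly why the question of who "posted" it becomes so murky.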

By its very nature, Freenet hides the identity of nodes supplying or requesting files, making it difficult to ascertain who was the originator of the material or the request. Because of this, it becomes difficult to ascertain who is legally responsible for “posting” the file if it is deemed to fall within the parameters of this censorship bill. So, what could happen is that the intermediary node containing the file is the one charged with violating the law, rather than the originator, regardless of the technical and legal semantics that form the basis of anonymity within a Freenet network.

At the very least, applying this censorship law to the Freenet or Gnutella network would become a legal nightmare for the South Australian court system. All it would take to demonstrate the infeasibility of the law is to introduce one highly popular but objectionable file to Freenet, potentially turning all or most South Australian Freenet users into criminals. This issue goes beyond considerations of copyright law.

According to the UK-based Register, South Australia’s politicians must have gone “barking mad”; in other words, the bill’s sponsors may want to reconsider the bill on its own merits.

Read the pertinent sections of the censorship bill at Electronic Frontiers and then join discussions at Slashdot and South Australia’s Talking Point.

Categories
Specs Technology

Browser, Browser Not

Originally published at O’Reilly

Recently, O’Reilly published a set of articles (Netscape Navigator 6.0 to Fail Standards Compliance, An Update, and Netscape 6.0 Released), written by the popular author David Flanagan, about the release of Netscape 6.0, Netscape’s newest entry in the browser marketplace.

David presented several valid concerns about bugs still present in the release of Netscape 6.0. And it is true, Netscape 6.0 did release with several unfixed bugs, many of which will have an impact on support for W3C specifications.

Our reaction to the release, however, was somewhat different. Along with other application developers, we’ve been waiting for the public release of an application that uses Mozilla’s XPToolkit, a set of software components from which Netscape 6.0 and the upcoming Mozilla 1.0 were built. Now that Netscape 6.0, which uses this framework, has been publicly released, we’re delighted: testing of XPToolkit may begin in earnest.

While many are focused on the release of Netscape 6.0, some of us aren’t. We’re more interested in the application environment created by the Mozilla team to support the implementation of browsers in general. To us, this framework is more important than the release of a new browser will ever be.

The reason for this is the changing face of the Internet itself.

The Changing Face of Internet Applications

Current Internet applications rely on a centrally located Web server to distribute HTML over HTTP to clients. Each client, or Web browser, renders the source and displays a human-readable page.

This architecture has become so popular that you can’t pick up a magazine or a newspaper without hearing about Web servers or the new business models based on them. Although this architecture is based around universally located resources, most application-level resources are centralized and many other resources are hard to find. Some Web sites help you find other Web sites or “resources.” Others go so far as to offer completely centralized applications, as Application Services Providers (ASPs).

New technologies will soon force us to rethink the way we use the Internet. Distributed systems, mobile agents, and peer-to-peer (P2P) applications may completely undermine the need for browser-based Internet access.

P2P applications are already stepping around the browser. The next step will be around the Web server.

Consider this: a P2P application that locates and downloads a new function. The simplest example here may be provided by a P2P execution framework that uses XML-based remote procedure calls between peers to marshal XML-encoded functions. Instead of hitting Web pages, each peer locates and accesses both data and functions among a network of peers. No Web servers.
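
As a rough illustration of what an XML-based remote procedure call looks like on the wire, the following Python sketch uses the standard library's xmlrpc.client module to marshal a call into XML and unmarshal it again, as a receiving peer would. The method name catalog.lookup and its arguments are hypothetical, and no network connection is made; the point is only that the request is plain XML that any peer could produce or consume.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote function named "catalog.lookup".
payload = xmlrpc.client.dumps(("spellcheck", 3), methodname="catalog.lookup")

# A receiving peer decodes the same XML back into arguments and a method name.
params, method = xmlrpc.client.loads(payload)

print(payload)
print(method, params)
```

Nothing in this exchange requires a Web server or a browser; either side of the call could be any peer that understands the XML.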

This scenario is not going to be best served by the traditional browser. Why?

The Limitations of Browsers

The things that made the Web browser a success in the beginning are the things that make it ineffective for new application models.

The browser was built to render files stored on Internet sites so we didn’t have to muck about with FTP. As soon as content became more visible, people started publishing yet more content, so browsers rendered HTML, then XML, formatted with CSS or XSLT. However, the browser itself has a very limited interface, even with new advances in W3C specifications. Sophisticated browser pages mean using either complicated object models–leading to cross-platform and cross-browser idiosyncrasies that are usually the result of standards initiatives–or using page-embedded applications, such as Java applets and plug-ins.

Even when the browser follows standard specifications, working within a browser page to create a sophisticated interface isn’t a simple or uncomplicated task.

In addition to the browser becoming increasingly complex as the nature of content becomes so, use of it implies that applications ought to be served from one location, and in one manner. To do something such as make a remote procedure call, you would need to use a digitally signed Java applet or some other browser-specific and limited technique. This is something that won’t bother newer P2P applications.

Finally, browsers were designed to be safe, and operate in a protective sandbox. Web-based applications served via a browser have difficulty getting at the user’s machine. Though safe, this restriction also prevents behaviors that would have the application modify its user interface. And this dynamism is going to be necessary in an environment where new services require new application interfaces that can be downloaded as data.

An Internet Application Framework?

Mozilla made a tough decision a few years ago–to scrap the Netscape 4.x architecture in favor of one built from the ground up. In the process, this open source team created an application environment based on reusable and interchangeable components.

With this application environment in place, the team then proceeded to build a sophisticated browser. They threw in Internet Chat, a Web page composer, and other complex things, all of which were released recently as Netscape 6.0. Often forgotten is that a powerful application environment came with it. This environment is now usable by developers of other Internet applications.

What types of applications? Well, ActiveState, the company that provides popular implementations of Perl and Python for various operating systems, used Mozilla to create its Komodo product, a visual IDE for working with Python and Perl code. The user interface provides, among other things, colored syntax, syntax checking, and source-level debugging.

So, we have a browser and an application that can be used to create and test Perl and Python applications, all built from the same application architecture.

This is exciting stuff! Much has been written about reusable code and component-based design, and now we have an open source application environment we can all use to build our own applications.

Even more exciting is the extensible user-interface language from Mozilla called XUL (pronounced “zool”). It’s based on XML, which means you can use XML to create a user interface. Combine this with the ability to make remote procedure calls, and you have a perfect place from which to commence building a bunch of P2P applications, based on the scenario mentioned above.
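
To give a feel for what "using XML to create a user interface" means, here is a small Python sketch that reads a XUL-style fragment and lists the menu entries it declares. The window title and menu labels are invented for illustration, and the script only inspects the markup; it does not render anything, which is the job of the Mozilla framework itself.

```python
import xml.etree.ElementTree as ET

XUL_NS = "http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul"

# A hypothetical XUL fragment describing one menu; the labels are made up.
XUL_SOURCE = f"""
<window xmlns="{XUL_NS}" title="Peer Console">
  <menubar>
    <menu label="Services">
      <menupopup>
        <menuitem label="Find peers"/>
        <menuitem label="Fetch function"/>
      </menupopup>
    </menu>
  </menubar>
</window>
"""

root = ET.fromstring(XUL_SOURCE)
ns = {"xul": XUL_NS}

# The interface is ordinary XML, so the menu entries can be read as data.
labels = [item.get("label") for item in root.iterfind(".//xul:menuitem", ns)]
print(labels)  # ['Find peers', 'Fetch function']
```

Because the interface description is just data, adding a menu entry for a new service amounts to delivering a slightly different XML file.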

Now, instead of opening a browser, you can open an application built on the same framework as your browser, but with a sophisticated interface of dropdown menus and tabbed pages–all created using XML. You can access remote procedure calls at the touch of a button and when you’re ready to access a new service, click another button, and in a couple of minutes restart your application. New entries will be added to new or existing menus providing access to the new service. All this is accomplished without Java bytecode, a new plug-in, or a DLL.

You’ve just downloaded XML.

When you explore the possibilities of the XPToolkit from Mozilla, maybe you’ll agree that Netscape 6.0 is more than just a standards-based, better-than-Navigator-4.x browser. It’s the start of a new way of doing things on the Internet.

Categories
Specs

The Tyranny of Standards

Originally published at O’Reilly

Before proceeding into the core of this article, I want to say one thing to you: challenge your assumptions.

Challenge your assumption that all Internet services are provided by a Web server and consumed by a browser. Challenge your assumption that chaos within a development environment is a bad thing. And challenge your assumption that standards must take precedence over innovation.

Several years ago, when the concepts of Web server and browser were first implemented, the Internet was introduced to a new state of chaos and, as the explosive growth of technologies that are “Web-enabled” demonstrates, innovation was not only the rule, it was the norm.

Over time, people decided that standards were a necessary adjunct to the growth of the Web, something with which I completely agree. Enter the W3C, the World Wide Web Consortium.

As the W3C organization will attest, they are not a standards body. As such, they don’t issue “standards” per se. Instead, the W3C issues recommended specifications. The only enforcement of these specifications has been through voluntary compliance on the part of the technology providers, and demand for said compliance on the part of technology consumers.

Thanks to the efforts of the W3C, we have specifications for HTML, XML, CSS, HTTP, and a host of other Web-enabling technologies. Thanks to those following the specifications, we have Web pages that can be viewed by different browsers and served by different servers.

Somewhere along the way, however, standards became less of a means for providing stability and more a means of containment. In some cases, standards have become a weapon used to bludgeon organizations for practicing the very thing that started the growth of Web applications in the first place: innovation.

The Importance of Innovation

Innovation is the act of improving what exists and creating something new. Though innovation does not always lead to something better (Remember push technology?), it is the thing that keeps us moving forward, always searching for a better way of doing things.

Innovation can work comfortably with standards; new XML-based specifications, such as MathML, are a case in point. There are also times when innovation actually bucks the standards.

For instance, Microsoft has been long criticized for adding its own “innovations” to a specification, particularly with its popular Web browser, Internet Explorer. One innovation was the support of a property called innerHTML that is used to access or easily replace the contents of a specific HTML element. Though innerHTML is not part of any of the W3C specifications, its use is so popular that Mozilla, the open source effort behind the new Netscape 6.0 browser, has adopted the use of innerHTML within its own layout engine.

Should Microsoft and Mozilla be bashed for lack of standards compliance because innerHTML is not a property supported by the W3C? Or should both organizations be commended for providing a useful tool that has become very popular with Web developers?

This leads to an additional question: How does one measure standards compliance? For example, if Internet Explorer and Mozilla both supported CSS attributes such as font size and color, and they also supported new attributes and properties like innerHTML, would both browsers be compliant? Or are they noncompliant because they’ve added new features to the underlying CSS/DOM/XML/HTML specifications? How exactly do we define “standards compliance,” especially when there are groups like the WSP (Web Standards Project) enforcing this compliance?

The WSP

I’ve long been a fan of the W3C, and I think that the Web and the Internet would be a much more chaotic environment without this organization. However, my fondness for the W3C does not necessarily extend itself to the WSP.

If you haven’t heard of the WSP, it is an example of what happens when standards enforcement is left to the masses. This organization’s intentions are pure: It’s a nonprofit organization of Web developers, designers, and artists who encourage browsers to support standards equally and completely. However, somewhere along the way, the WSP took on the aspect of a holy war, a Web jihad.

The WSP’s behavior is tantamount to lynch mob justice. After all, there are no gray areas of justice: only black and white, right or wrong. The same can be said of support for the enforcement of standards: A company supports standards 100 percent, or the company is noncompliant and, therefore, evil.

Note that I agree with the WSP in spirit: Our lives would be much easier if Microsoft and Mozilla and Netscape would support the W3C specifications fully and equally. I’m more than aware of the cost of having to write different Web pages for different browsers because each has implemented technologies in a different way. I’ve been doing this for years.

However, I’ve also benefited when an organization has expressed an innovation that exists outside of a specification, such as the aforementioned innerHTML, or Mozilla’s support for XUL (Extensible User Interface Language). If having all browsers be 100 percent standards compliant means not having access to these innovations, then I’ll take noncompliance even if it does mean extra effort to compensate for differences.

I encourage Microsoft and Mozilla and Netscape to support the W3C specifications and other standards, but I also encourage these same organizations to continue their innovative efforts, even if the result is a bit of chaos in a world that would otherwise run smoothly, and without a wrinkle.

And who’s to say that a little chaos is such a bad thing?

The Chaos of Innovation or the Sameness of Compliance

In August 2000, CNET.com featured an article titled Why Open Standards are a Myth. The author of the article, Paul Festa, wrote that open standards only work when a company has a lead in a technology and then uses the standard as a means of ensuring that its competition doesn’t exceed its own ability. The support for standards, then, becomes a means of disabling a competitor’s innovation.

In this context, the sameness of compliance to standards becomes less a tool to help developers and businesses and more a weapon against competition. The sameness of compliance also becomes a measure of ensuring that all participants reach one level, are kept on this level, and that there are no bumps in the road of compatibility.

Is this smooth path of total compliance the Internet of the past? And is this the Internet we want in the future?

In the End

Standards are essential to doing business between companies. They are necessary to ensure that, for example, CD players can play all CDs, and elevators don’t crash to the first floor from the tenth. Our lives are protected by standards and our laws are based on them.

However, standards were never meant to be a weapon against innovation, or a tool for beating a company into submission, particularly within the free-spirited environment of the Internet.

Should we encourage the adoption of standards? A resounding yes! But not at the expense of what makes working on the Internet so challenging and exciting: The promise of something new coming through the router. Something different. Something interesting. Something innovative.

Categories
Specs Technology

XML Expectations

Originally published at Netscape Enterprise Developer, now archived at Wayback Machine

Extensible Markup Language is a language that defines other languages. It also has the potential to give structure and meaning to the information contained in HTML documents or any other data form, making such information naturally as searchable and structured as the information locked into a database. Such capabilities mean that XML can turn our current view of data upside down: instead of a static, impenetrable lump of information, a file that uses XML suddenly has a logical structure that can be manipulated, queried, and changed without delving down into the data itself. The potential of such a meta-language is huge, provided it’s implemented as an open standard. Right now, XML remains such a standard, and if it continues to evolve along open lines, it could drastically improve Web-based development.

XML is similar to SQL in a number of ways; SQL is also an example of a multi-purpose language used to define data structures and query those same data structures without concern as to how the information is displayed or used. The only guarantees are that the information is defined in structures, the structures follow certain rules, and the information contained within the structures can be accessed automatically or manually. Both SQL and XML define structures for information in the form of elements, element attributes, and element content. The main difference is that instead of defining data that is stored in a physical storage medium usually only accessible by a database engine, XML describes data that is stored and accessed from within documents.

XML’s Parent: SGML

XML is a subset of SGML (Standard Generalized Markup Language), a generalized markup language that was passed as an ISO standard in the 1980s. Rather than specifying a language’s elements directly, SGML is used to define the rules that constrain the elements of a specific language.

SGML grew out of the need to define a document’s structure and to define rules used to determine whether the document is valid and well formed. The document’s structure is defined through the use of markup tags, which delimit the elements, and Document Type Definition (DTD) files that define each element’s structure and content, providing a sort of grammar for the document.

For example, using SGML, a customer element within a document could have the following structure:

<CUSTOMER name="Shelley Powers" id="CUST011A1">
<PO id="PO23349008">
<POITEM id="POI1">
<ITEM id="14453">
Item ID: 14453
Item Desc: some description
</ITEM>
</POITEM>
</PO>
</CUSTOMER>

To validate the markups used to define the structure of the document, an associated DTD would have to be created, with statements similar to the following:

<!ELEMENT customer - - (PO)+>
<!ATTLIST customer
    name CDATA #REQUIRED
    id   CDATA #REQUIRED>

This extremely simplified and abbreviated DTD uses an Extended Backus-Naur Form (EBNF) syntactic notation to create the grammar.

Using a standardized meta-language to define entities within a document allows SGML parsers to pull out the individual entities (such as the customer entity just described) and any associated attributes and content. An application can then use that information for a number of purposes, including the following possibilities:

  • To define information in a database-neutral format for transport between unlike databases.
  • To provide a search engine that allows a person to query on the entity type as well as the data.
  • For report generation, or even an online hypertext order processing form that allows the reader to drill down within the document to find the desired information.
  • To define a standard language for a specific industry or science, such as the petroleum industry or chemistry, which includes special notational conventions.
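
As a small illustration of a parser pulling out entities by type, the sketch below runs the customer document shown earlier through Python's ElementTree. This is an assumption of convenience: Python's standard library has no SGML parser, but the sample happens to be well-formed XML as well, so an XML parser can read it. The ids_of helper is a name made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

DOC = """<CUSTOMER name="Shelley Powers" id="CUST011A1">
<PO id="PO23349008">
<POITEM id="POI1">
<ITEM id="14453">
Item ID: 14453
Item Desc: some description
</ITEM>
</POITEM>
</PO>
</CUSTOMER>"""

customer = ET.fromstring(DOC)


def ids_of(entity_type: str) -> List[str]:
    """Return the id attribute of every element of the given type."""
    return [el.get("id") for el in customer.iter(entity_type)]


print(customer.get("name"))  # Shelley Powers
print(ids_of("PO"))          # ['PO23349008']
print(ids_of("ITEM"))        # ['14453']
```

The query is made against the entity type (PO, ITEM) rather than against the text of the document, which is the kind of search the second bullet describes.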

The concept of SGML is very attractive: Define a language that in turn defines a document structure used for a specific group of documents and which can be extended without impacting the underlying language generation mechanism. Unfortunately, the downside to SGML is that it is far from trivial to define the DTD for a language. SGML is a complex standard that is difficult to implement.

HTML, a derivative of SGML

SGML did, however, provide the roots of the first Web document specification, HTML. HTML was derived from SGML, except that it predefined a group of elements that controlled the delivery of a Web page’s content. In addition, the original HTML elements were expanded to include suggested presentation elements that controlled the appearance of the Web page. SGML does not control the presentation of elements, only the element structure and semantics.

The following code defines an HTML unordered list element, which is defined by the DTD associated with the HTML 4.0 specification as having a start and end tag and containing at least one list item:

<!ELEMENT UL - - (LI)+>

According to the EBNF associated with SGML, this DTD states that UL is an element, the double dashes assert that the element requires a start and end tag, and the element consists of at least one, and possibly more than one, list item (LI). When a user agent such as a browser parses an HTML element, it knows to look for both beginning and ending UL tags and at least one LI element contained within those tags.
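
The "at least one LI" rule can be mimicked in a few lines of Python. The sketch below is a simplification under stated assumptions: it parses the markup with an XML parser (so every tag must be explicitly closed, which SGML and HTML do not always require), and it stands in for the content model (LI)+ with a regular expression over the child element names. The children_match helper is hypothetical, not part of any library.

```python
import re
import xml.etree.ElementTree as ET


def children_match(element: ET.Element, model: str) -> bool:
    """Check child tag names against a regex standing in for a content model."""
    child_names = "".join(f"<{child.tag}>" for child in element)
    return re.fullmatch(model, child_names) is not None


AT_LEAST_ONE_LI = r"(<LI>)+"  # plays the role of (LI)+ in the DTD

good = ET.fromstring("<UL><LI>one</LI><LI>two</LI></UL>")
empty = ET.fromstring("<UL></UL>")

print(children_match(good, AT_LEAST_ONE_LI))   # True
print(children_match(empty, AT_LEAST_ONE_LI))  # False
```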

Associated with the DTD for HTML 4.0 is an implied visual presentation of an unordered list, which is that each list item has a specified list graphic, each list item is on a separate line, and each list item lines up beneath the previous list item. However, not all user agents (such as browsers) are visual, so presentation can only be a suggestion, not a mandate.

After the first releases of the HTML specification, new elements crept into the language to provide control over page presentation. One such element is the FONT element, which controls the size, color, type, and font family of any text the element contains. The problem with using a specific tag like FONT, however, is that non-standardized tags can lead to different Web page presentations depending on the user agents.

To help differentiate an element’s structure and its presentation in HTML, the W3C issued a recommendation for CSS1, or Cascading Style Sheets, Level 1, a specification that provides presentation information for HTML elements.

The real advantage of HTML was that it was relatively easy to code and display, even in different browsers. The ease of HTML was directly responsible for the massive growth of the Web. If Web document access had begun with XML, chances are you wouldn’t be reading this article right now and Web access would probably be limited to the scientific community. We initially needed a simple mechanism to create Web documents, and HTML was it. The very lack of flexibility was the language’s strength.

Now that the Web and Internet-based technologies have matured to a certain extent, enterprise developers are increasingly demanding a way to build flexibility into documents like Web pages in order to increase their effectiveness and ease of access.

Enter XML

XML arose from a need to create more generalized markup languages without having to follow the large and complex SGML standard. The XML standard still demands that a document be well-formed, but it makes the validation step optional, which means that an associated DTD is not required (though one can be included). Additionally, XML uses only a subset of the rules for SGML, letting developers understand the principles and implementation of the technology more quickly.

Like SGML, XML is a meta-language that provides rules to define a set of tags that can be used within a document. These tags are then used to delimit an XML entity, its attributes, and its contents, and to define the elements’ syntax. These tags are read by an XML processor, which in turn provides an application with access to the entities. The application can then perform one or more actions on the XML entities.

XML processors can either be validating, which means that they make use of an associated DTD in order to ensure valid structures, or non-validating. Regardless of whether or not they are validated, XML documents can be considered to be well-formed as long as they match the XML syntax overall and as long as each entity within the document meets the syntax for a well-formed XML entity.

The main requirements for a well-formed XML document include the following:

  • The document may begin with a valid XML declarative statement or prolog.
  • There is one element that acts as the root element and which acts as parent to all other elements.
  • Elements are either not empty or, if they are empty, they have a “hint” encoded within the element that defines this information to the XML parser.
  • Non-empty elements must have start and ending tags.
  • All elements except the root element are contained within some element, referred to as the element’s parent; all contained elements are referred to as the parent element’s children.
  • Elements can contain character data, other elements, CDATA sections, processing instructions or comments.
  • Each parsed element within the document is well-formed.
  • Character data that would otherwise be read as markup is enclosed within CDATA sections.
  • Documents can include comments, white space, and processing instructions.
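
Several of these requirements can be seen in action with a non-validating parser. The following Python sketch treats a successful parse by the standard library's ElementTree as a rough stand-in for "well-formed"; the is_well_formed helper and the sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if a non-validating parse of the text succeeds."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "single root, empty element": '<ARTICLE name="XML"/>',
    "missing end tag": "<ARTICLE><TITLE>XML</ARTICLE>",
    "two root elements": "<ARTICLE/><ARTICLE/>",
    "markup inside CDATA": "<CODE><![CDATA[<not a tag>]]></CODE>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

The first and last samples parse; the two in the middle are rejected, one for a start tag without its matching end tag and one for having more than one root element. No DTD is consulted at any point.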

Consider that a valid and well-formed XML document consists of the following EBNF format non-terminal symbols (non-terminal meaning that the symbols are themselves expanded elsewhere):

document::= prolog element Misc*

A complying document could be as simple as:

<?xml version="1.0" encoding="UTF-8"?>
<ARTICLE name="XML" author="Shelley Powers"/>

This document consists of the prolog section, which includes the XML declaration (“<?xml”) with the version number of the XML definition and the encoding declaration. It also contains one element, ARTICLE, which has two attributes, name and author. Since the element is an empty element, it ends with a forward slash before the closing angle bracket to signal to the processor that the element contains no other content. This is necessary for a non-DTD (non-validating) document. Otherwise the XML processor would not know when to look ahead in the parsing for required element content. This is one of the key features of XML: Forward processing information is embedded directly within the document, negating the necessity of creating an associated DTD.

The example just provided is a well-formed document, but not a valid one, since no DTD is provided for validation. The example also demonstrates the simplicity of XML. An even simpler version of the document would be:

<ARTICLE name="XML" author="Shelley Powers"/>

To make the document a valid one, I could have added a DTD for the ARTICLE element directly into the document, or linked to an external DTD file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ARTICLE SYSTEM "article.dtd">

<ARTICLE name="XML" author="Shelley Powers"/>

XML in action

Though the standard is relatively new, there are several XML parsers that validate whether an XML document and its associated DTD fit the rules for a valid XML document. In addition, these same parsers may return the elements within a document exposed in their tree-like form, which applications can then use.

XML is being used in the real world already. For example, Microsoft has defined an XML application it terms “Channel Definition Format,” or CDF. CDF files contain entities that describe the contents of an active channel. Following the accepted technique for XML, CDF files do not contain a reference to a DTD file and instead use clues embedded within the tags and tag definitions to provide forward-looking information for the XML parser.

CDF’s purpose is to provide a document that defines the use of push technology at a specific Web site, including which pages are to be displayed as channels, what icons to display, what the update schedules are, etc. With this information, the XML processor provides the key elements that a channels-based application can use to control channel access on the Web site.

The following code shows the CDF file I have defined for use at my personal Web site. The root element for the file is the CHANNEL element. It is the parent element for several other elements, such as an ICON element, an ITEM element, and an ABSTRACT element. Each of the elements within the document may or may not have attributes, and a child element may in turn be the parent for another element:

<?XML VERSION="1.0" ENCODING="UTF-8"?>
<CHANNEL HREF="http://www.yasd.com/plus/index.htm" 
        BASE="http://www.yasd.com/plus/">
    <TITLE>YASD+</TITLE>
    <ABSTRACT>YASD+ pages, using the newest technologies</ABSTRACT>
    <LOGO HREF="http://www.yasd.com/mm/wide_logo.gif" STYLE="IMAGE-WIDE"/>
    <LOGO HREF="http://www.yasd.com/mm/logo.gif" STYLE="IMAGE"/>
    <LOGO HREF="http://www.yasd.com/mm/icon.gif" STYLE="ICON"/>
    <SCHEDULE>
        <INTERVALTIME DAY="1"/>
        <EARLIESTTIME HOUR="0"/>
        <LATESTTIME HOUR="12"/>
    </SCHEDULE>
    <ITEM HREF="http://www.yasd.com/samples/bytes/daily.htm">
        <LOGO HREF="http://www.yasd.com/mm/icon.gif" STYLE="ICON"/>
        <ABSTRACT>YASD Code Byte</ABSTRACT>
    </ITEM>
    <ITEM HREF="http://www.yasd.com/samples/bytes/cheap.htm">
        <LOGO HREF="http://www.yasd.com/mm/icon.gif" STYLE="ICON"/>
        <ABSTRACT>Cheap Page Tricks</ABSTRACT>
    </ITEM>
</CHANNEL>

Notice that the first line contains the XML declaration, which includes a version number and an encoding declaration. The main element within the document is the CHANNEL element, enclosing other elements such as TITLE, ITEM, ABSTRACT, and LOGO. Each of these elements falls within the allowable XML definition for elements:

element ::= EmptyElemTag | STag content ETag 
EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
STag ::= '<' Name (S Attribute)* S? '>'
ETag ::= '</' Name S? '>'
content ::= (element | CharData | Reference | CDSect | PI | Comment )*

Without continuing to resolve the nonterminal symbols, what the syntax just shown states is that each element is either an empty element, in which case it ends with a slash/angle bracket combination (‘/>’), or it has start and end tags which enclose content. A “well-formed” constraint is placed on the start and end tags in that the Name used in both is the same. The enclosed content can include other elements, comments, processing instructions, or other well-formed XML constructs. Both empty and non-empty elements can have zero or more attributes, as the following demonstrates:

<CHANNEL HREF="http://www.yasd.com/plus/index.htm" 
        BASE="http://www.yasd.com/plus/">
...
</CHANNEL>

or

<INTERVALTIME DAY="1"/>
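The productions above can be approximated in a few lines of script. What follows is only a rough sketch, not a conforming parser: the NAME and ATTRIBUTE patterns are simplifications I am assuming for illustration, and the function names classifyTag and tagsMatch are hypothetical.

```javascript
// Simplified stand-ins for the Name and Attribute productions.
// These are approximations, not the full XML grammar.
const NAME = "[A-Za-z_:][A-Za-z0-9_:.-]*";
const ATTRIBUTE = `\\s+${NAME}\\s*=\\s*(?:"[^"]*"|'[^']*')`;

const EMPTY_TAG = new RegExp(`^<(${NAME})(?:${ATTRIBUTE})*\\s*/>$`);
const START_TAG = new RegExp(`^<(${NAME})(?:${ATTRIBUTE})*\\s*>$`);
const END_TAG = new RegExp(`^</(${NAME})\\s*>$`);

// Classify a single tag string as EmptyElemTag, STag, or ETag.
function classifyTag(tag) {
  let match;
  if ((match = EMPTY_TAG.exec(tag))) return { kind: "EmptyElemTag", name: match[1] };
  if ((match = START_TAG.exec(tag))) return { kind: "STag", name: match[1] };
  if ((match = END_TAG.exec(tag))) return { kind: "ETag", name: match[1] };
  return { kind: "unknown", name: null };
}

// The well-formedness constraint: a start tag and its end tag
// must use the same Name.
function tagsMatch(startTag, endTag) {
  const start = classifyTag(startTag);
  const end = classifyTag(endTag);
  return start.kind === "STag" && end.kind === "ETag" && start.name === end.name;
}
```

Passing the INTERVALTIME tag shown above to classifyTag reports an empty element, while the CHANNEL start and end tags satisfy tagsMatch.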

Internet Explorer 4.0 has an associated XML parser that pulls the element information out of the document. IE 4.0 uses this parsed element information to create the channel for the Web site, including the two sub-channel items, as shown below:

This site has a main Web page channel, denoted by the top-level graphic, and two sub-channels, with the second sub-channel loaded into the browser.

Accessing the CDF file directly with IE 4.0 opens a dialog box asking the individual how they would like to subscribe to the site’s channel, and allowing the reader to determine how and when the channel contents are downloaded to their client machine.

Other uses of XML
In addition to CDF, Microsoft and Marimba, Inc. have also proposed an XML-based technology called the Open Software Description (OSD) format, which can be used to control software downloads and installations over a corporate network. A major IS expense for larger corporations, especially those that are geographically distributed, is installing and maintaining software upgrades on employees’ desktops. One small upgrade to a popular piece of software can take days of planning and weeks of actual implementation (i.e., walking around to each person’s desk and installing the upgrade). During the upgrade rollout, employees will have different versions of the same software, which can create problems. With OSD, software upgrades can be handled automatically using push technology, reducing both IS staff hours and logistical problems.

SGML and XML have both been used to create a Chemical Markup Language (CML) for the chemistry community. With the CML vocabulary, molecular structures can be defined within a document and the information can be either posted or transmitted. XML processors can pull out the CML elements and pass these to applications that perform actions like preparing a print-out of the information, either textually or graphically, or creating an online three-dimensional model of the information using VRML or some other 3D technology.

Netscape, Apple, and others have proposed a Meta Content Framework (MCF) created in XML that can expose a Web site structure for navigation or online exploration. MCF can be used to do such things as generating a three-dimensional site map which can be used for Web site publication and administration. The technology is currently used by Apple’s ProjectX/HotSauce browser, and “Xspace”-compatible content can also be viewed using a plug-in available from Apple.

XML can also be used to define a relational database meta-language, which can then be used to describe documents containing relational database information. These same documents can be easily generated from the relational database dictionaries, which are repositories of information about the information stored in the database. The extensible markup language can then be used to create context-centered documents like “all information pertaining to any purchases, week of January 16 through January 23,” rather than using the context-neutral database format. In addition, supporting information that is not part of the data in the database, like images or reference material, can be pulled into the document.
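To make the idea of generating such documents a little more concrete, here is a minimal sketch that turns rows of data into a small XML document. It is not the representation referenced in the Resources section; the ROW element name, the one-attribute-per-column layout, and the function names rowsToXml and escapeXml are all assumptions made for illustration.

```javascript
// Escape characters that may not appear literally in attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn an array of row objects into an XML document string.
// tableName becomes the root element; each row becomes an empty
// ROW element (a hypothetical name) with one attribute per column.
function rowsToXml(tableName, rows) {
  const lines = [`<${tableName}>`];
  for (const row of rows) {
    const attributes = Object.keys(row)
      .map((column) => `${column}="${escapeXml(row[column])}"`)
      .join(" ");
    lines.push(`    <ROW ${attributes}/>`);
  }
  lines.push(`</${tableName}>`);
  return lines.join("\n");
}
```

A call such as rowsToXml("PURCHASE_ORDER", rows) would produce a document that any XML processor could then walk, independent of the database that supplied the rows.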

An XML processor can process this context-based data document and use the information therein to present reports, perform online research and queries, or even to create interactive three-dimensional models of the data. Instead of issuing a SQL statement such as:

select customer_name, customer_address, city, state, zip_code 
from customer, purchase_order 
where purchase_order.order_id = 32245 and 
customer.customer_id = purchase_order.customer_id;

I could enter a three-dimensional VRML world at a purchase_order portal and scan a virtual filing cabinet for my purchase order number. Once I find it, I can open the door into another room with doors labeled “Purchase Order items” and “Customer” and open the Customer door into another room containing the information I am looking for. Best of all, the documents containing the context-based data could be generated automatically, processed automatically, and presented automatically. This means a change in the database table could be handled automatically.

Besides three-dimensional database applications, defining data in an XML document could be used as a method to convert database data in one format, such as relational data, into another format, such as object-based database records. The resources section at the end of this article has a reference to a preliminary XML representation of a relational database.

In addition, with XML processors (or XML parsers, if you prefer), the most difficult aspect of XML has already been implemented: pulling the elements out of the document.

Returning to the CDF example, not only can the XML document be used by Internet Explorer 4.0 to provide information about the structure of a Web site’s channels, I can also access the XML elements using JavaScript, C++, or Java and use the information for other purposes. For example, the following JavaScript functions open a CDF file, pull out information about the elements contained within the CDF file, and print out this information in a newly opened window.

<script language="jscript">
<!--
var doc = new ActiveXObject("msxml");
var wndw = null;

// display elements in CDF file
// file reference must be fully resolved Internet reference
function DisplayElements(cdffile)
{
// Display this with an appropriate message in a popup window
wndw = window.open("","CDFFile",
"resizable,scrollbars=yes");
wndw.document.open();
doc.URL = cdffile;

// begin displaying elements at root
displayElement(doc.root);

wndw.document.write("</body>");
wndw.document.close();

}

// display element tagname, if any
// and information about element such as any attributes (even 
// if undefined for element) and text and element type
function displayElement(elem) {
if (elem == null) return;
wndw.document.writeln("<p>");
if (elem.type == 0)
    wndw.document.writeln("Document contains element with tagname: " + 
                           elem.tagName);
else
    wndw.document.writeln("Document contains element with no tagname");
wndw.document.writeln("<br>Element is of type: " + 
                                GetType(elem.type) +"<br>");
wndw.document.writeln("Element text: " 
                                + elem.text + "<br>");
wndw.document.writeln("Element href: " 
                                + elem.getAttribute("href") + "<br>");
wndw.document.writeln("Element base: " 
                                + elem.getAttribute("base") + "<br>");
wndw.document.writeln("Element style: " 
                                + elem.getAttribute("style") + "<br>");
wndw.document.writeln("Element day: " 
                                + elem.getAttribute("day") + "<br>");
wndw.document.writeln("Element hour: " 
                                + elem.getAttribute("hour") + "<br>");
wndw.document.writeln("Element minute: " 
                                + elem.getAttribute("min") + "<br>");

// check to see if element has children
var elem_children = elem.children;
if (elem_children != null)
   for (var i = 0; i < elem_children.length; i++) {
      var element_child = elem_children.item(i);
      displayElement(element_child);
   }

}

// element type
function GetType(type) { 
if (type == 0) 
        return "ELEMENT"; 
if (type == 1) 
        return "TEXT"; 
if (type == 2) 
        return "COMMENT"; 
if (type == 3) 
        return "DOCUMENT"; 
if (type == 4) 
        return "DTD"; 
else 
        return "OTHER";
}

//-->
</script>

See the Resources section for a pointer to an XML demonstration.
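The recursive walk at the heart of displayElement does not depend on the msxml object itself. As a rough sketch, the same traversal can be written against plain objects; the node shape used here (type, tagName, children) and the names collectTagNames and sampleTree are assumptions standing in for whatever a given parser returns.

```javascript
// Hypothetical stand-in for a parsed tree: each node has a numeric
// type, an optional tagName, and an optional array of children.
const ELEMENT = 0;
const TEXT = 1;

// Walk the tree depth-first and collect element names,
// indented two spaces per level of nesting.
function collectTagNames(node, depth = 0, out = []) {
  if (node == null) return out;
  if (node.type === ELEMENT) {
    out.push("  ".repeat(depth) + node.tagName);
  }
  for (const child of node.children || []) {
    collectTagNames(child, depth + 1, out);
  }
  return out;
}

// A small tree shaped like part of the CDF file shown earlier.
const sampleTree = {
  type: ELEMENT, tagName: "CHANNEL", children: [
    { type: ELEMENT, tagName: "TITLE", children: [{ type: TEXT, text: "YASD+" }] },
    { type: ELEMENT, tagName: "SCHEDULE", children: [
      { type: ELEMENT, tagName: "INTERVALTIME", children: [] },
    ] },
  ],
};
```

Returning the collected names, rather than writing them to a window as the original does, keeps the traversal separate from the presentation.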

Creating an XML document
A key to the true usefulness of XML is that once an XML parser has been created to process an XML document, you can use it to parse out element information from any document containing any well-formed XML content.

In the last section, I used Internet Explorer’s ability to parse XML elements, attributes, and content to create a Web page that listed the elements, their attributes, and some content. An interesting example, but not really useful. But what if I were to define my own XML document, including my own XML elements and attributes, and then use IE’s built-in XML parser to create my own graphic menu Web page application? This is fairly simple and only took a couple of hours of playing around to accomplish.

First, I defined my own CDF file and created my own elements, as shown here:

<?XML VERSION="1.0" ENCODING="UTF-8"?>

<DOCUMENT >
    <TITLE>YASD+</TITLE>
    <STYLESHEET HREF="http://www.yasd.com/css/daily.css" />
    <ITEM HREF="http://www.yasd.com/plus/plus.htm">
        <IMAGE HREF="http://www.yasd.com/plus/logo.jpg">
        <ALT>YASD+ Main Page</ALT>
        </IMAGE>
    </ITEM>
    <ITEM HREF="http://www.yasd.com/samples/bytes/daily.htm">
        <IMAGE HREF="http://www.yasd.com/plus/logo.jpg">
        <ALT>YASD Code Byte</ALT>
        </IMAGE>
    </ITEM>
    <ITEM HREF="http://www.yasd.com/samples/bytes/cheap.htm">
        <IMAGE HREF="http://www.yasd.com/plus/logo.jpg">
        <ALT>YASD Cheap Page Tricks</ALT>
        </IMAGE>
    </ITEM>
</DOCUMENT>

I redefined what ITEM is, created a new root element called “DOCUMENT,” and added the new elements IMAGE, STYLESHEET, and ALT. I followed the XML conventions for well-formed elements, so opening this document for parsing within IE 4.0 generates no errors.

I then created an application, consisting of two frames, that uses the images associated with the items to create a graphical menu bar in the top frame of the window and set the link associated with each image to open in the bottom frame of the window. The window originally opens with the form to access the CDF file and process its contents. This form is then overwritten with the processing results. The code for the form and to process the form contents is as follows:

 
<script language="jscript">
<!--
var doc = new ActiveXObject("msxml");
var wndw = null;

var title = "";
var stylesheet = "";
var items = new Array();
var itemimages = new Array();
var itemalts = new Array();
var ct = -1;

function createWindow(cdffile)
{
doc.URL = cdffile;

// find main document and any associated item documents
findElements(doc.root);

// if at least one ITEM was found (ct is the index of the last item)
if (ct >= 0) {
  var strng = "<HTML><HEAD><TITLE>" + title + 
        "</TITLE><LINK REL=STYLESHEET TYPE='text/css'" +  
        " HREF='" + stylesheet + "'></HEAD><BODY>";
  for (var i = 0; i <= ct; i++) 
     strng+="<a href='" + items[i] + 
                "' target='Body'><IMG src='" + itemimages[i] + "' ALT='" + 
                itemalts[i] + "' border=0>" + 
                "</a>"; 
  strng+="</BODY></HTML>";
  document.open();
  document.writeln(strng);
  document.close();
  }
}

// collect the title, the stylesheet, and, for each ITEM, its link,
// image, and alt text, then recurse into any child elements
function findElements(elem) {
if (elem == null) return;
if (elem.type == 0) {
    if (elem.tagName == "TITLE")
        title = elem.text;
    if (elem.tagName == "STYLESHEET")
        stylesheet = elem.getAttribute("href");
    if (elem.tagName == "ITEM") {
        ct++;
          items[ct] = elem.getAttribute("href");
        }
    if (elem.tagName == "ALT") 
        itemalts[ct] = elem.text;
    if (elem.tagName == "IMAGE")
        itemimages[ct] = elem.getAttribute("href");
    }
        
// check to see if element has children
var elem_children = elem.children;
if (elem_children != null)
   for (var i = 0; i < elem_children.length; i++) {
      element_child = elem_children.item(i);
        findElements(element_child);
   }
}
//-->
</script>

I could have defined any elements within the XML document as long as I used well-formed XML entities, and I could process the results in virtually any way I desired just by using simple scripting techniques.
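The script above tracks its results in parallel arrays and a counter. As an alternative sketch of the same idea, the results can be gathered into a single object and rendered separately. The node shape (tagName, text, attributes, children) is assumed for illustration and is not the msxml object model, and the names collectMenu and renderMenu are hypothetical.

```javascript
// Gather the title, stylesheet, and menu items from a parsed tree.
// Assumed node shape: { tagName, text, attributes: {...}, children: [...] }.
function collectMenu(node, menu = { title: "", stylesheet: "", items: [] }) {
  if (node == null) return menu;
  const attributes = node.attributes || {};
  // IMAGE and ALT belong to the most recently seen ITEM.
  const current = menu.items[menu.items.length - 1];
  switch (node.tagName) {
    case "TITLE": menu.title = node.text; break;
    case "STYLESHEET": menu.stylesheet = attributes.HREF; break;
    case "ITEM": menu.items.push({ href: attributes.HREF, image: "", alt: "" }); break;
    case "IMAGE": if (current) current.image = attributes.HREF; break;
    case "ALT": if (current) current.alt = node.text; break;
  }
  for (const child of node.children || []) collectMenu(child, menu);
  return menu;
}

// Render one linked image per item, targeting the frame named "Body".
function renderMenu(menu) {
  return menu.items
    .map((item) =>
      `<a href="${item.href}" target="Body">` +
      `<img src="${item.image}" alt="${item.alt}" border="0"></a>`)
    .join("");
}
```

Keeping collection and rendering apart means the same collected data could feed a different presentation without touching the tree walk.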

Linking and style information
In addition to the XML specification, other efforts are currently underway to add supporting specifications. The first is XML part 2, which includes linking. Another is XSL, the Extensible Style Language, which defines an XML stylesheet.

Linking has been extended considerably with XML. You can specify an attribute that determines how a resource is displayed, specify whether the resource is displayed automatically, and even specify multiple layers of linkage. Of particular interest is the capability to define a group of links, associating documents together in such a way that the person following the links does not have to hunt around for related documents. If you have ever jumped to a Web site page by following a link from another site, you know how frustrating it can be to try to establish the context of the link in order to find related documents.

XSL would be specified using XML and would provide a way to define presentation elements, such as those used currently in HTML. For example, HTML includes the Emphasis element, delimited with <EM> </EM> tags, the Strong element, delimited with <STRONG> </STRONG>, and others. With XSL, you could create styles to provide recommendations for how an XML element is rendered.

The downside to XML
While XML’s implementation-neutral technique allows parsed information to be used for multiple purposes in multiple applications, it is this same flexibility that may cause problems.

Returning one last time to my CDF example, I created a simple JavaScript application that opens the main channel page and all the associated pages into a frames-based Web page. The main page opens into the top-most frame, and each individual CDF ITEM element opens into one of the smaller frames located along the bottom of the document.

This isn’t a problem for my own CDF file, which is relatively simple. Applying the same application to another CDF file, however, one I neither created nor control, creates a Web page that probably does not meet the expectations of the page’s designer. The following screen shot shows the result of using the frames-generation application on the IDG.net channel:

To create this page, I used a publicly accessible file, IDG.net’s CDF file, and exposed the XML elements to create a presentation neither Microsoft nor IDG.net intended. Even with the new effort on XSL, currently only a W3C proposal, there is no guarantee that the information exposed with XML will be used for anything approaching the intended purpose of the XML document’s original creator.

Another potential problem area with XML is the CDF specification. CDF’s potential is great; you could use it to build an XML-based document that could be used by different push technology vendors with relatively comparable results. But what happens if a vendor supports channels but doesn’t want to use CDF? Do we end up with different “flavors” of channels? Does the W3C then create a different standards specification for channels, another for chemistry, another for math, another for finance, and so on in order to ensure that only one specification for each “topic” or “business” is created? Or can we design tools for translating between each of the XML document definitions?

In conclusion
Even with these issues at stake, XML is a terrific addition to Web and other application development. One of the most difficult aspects in application programming is extracting the structure as well as the contents of documents. XML has made this process a whole lot easier.

During the recent XML/SGML conference in Washington DC (December 10-12), XML became a proposed recommendation of the W3C, the last remaining step before becoming a real recommendation. It may be only a matter of time before XML is just as common as SQL is today.

Categories
Specs

Getting started with cascading style sheets

Originally appeared in Netscape World, now archived at Wayback Machine

Web page authors want to control more than what basic HTML provides, yet they also want their pages to display in the same manner across multiple browsers and multiple platforms. HTML provides the tools that allow us to create hypertext links, frames, tables, lists, or forms, but it does not provide fine control over how each object is displayed.

As an example, to create a hypertext link in a page, the Web page author would use the following syntax:

<A HREF="http://www.somecompany.com/index.html"> link page </A>

Interpreted by Netscape’s Navigator and Microsoft’s IE (Internet Explorer), the link would display on the Web page in whatever manner is determined by the browser.

To address this problem, Navigator and IE both support an extension to the <BODY> HTML tag that lets a Web page author change the color of unvisited, visited, and active links, as shown in the following statement:

<BODY link="#ff0000" alink="#00ff00" vlink="#0000ff">

This statement would display unvisited links in red, active links (clicking on a link makes it active) in green, and visited links in blue, unless the user overrides this in their browser. Many Web pages now define the color of the links for a page using this technique. However, the technique of providing specific display attributes for a tag becomes less workable when we consider tags such as the paragraph tag, which can be used many times in one page.

Following the trend set by the <BODY> link attribute, we would need to create display attributes for the paragraph (<P>) tag and then apply these attributes whenever we wish to display text in some manner other than the default. This, again, is workable, though the concept starts to become much more involved.

Where the idea breaks down is when the page author wants to change the color attribute of the paragraph and has to search through every Web page to find where the attribute has been applied and then make the modification. Additionally, if a company would like to provide standard formatting for all paragraph tags across all Web site pages, each Web page creator would need to be aware of the standard and remember to apply it consistently. A better solution would be to change the attribute once per page, or even once in a separate document, and have it work on many pages.

Another option to apply formatting would be for the browser creators to create new HTML extensions to be used for presentation, such as Netscape did with the <FONT> element (see “What’s wrong with FONT”). This idea breaks down when one considers that other browsers viewing any material formatted by the new tag will not be able to see the material, or will see it in a manner that may make it illegible.

What is needed is a general formatting tag that one can use to create format definitions, which can be applied to one or more HTML elements. This technique of using one tag to cover extensions was used with the scripting (<SCRIPT>) tag and has worked fairly well.

Enter the concept of style sheets. Style sheets are methods to define display characteristics that can then be applied to all or some instances of an element, or multiple elements. Specifically, the W3C has recommended the adoption of CSS1 (Cascading Style Sheets).

What is CSS1
Style sheets provide formatting definitions that can be applied to one or more HTML elements. An example of a style sheet would be the following, which sets all occurrences of the <H1> header tag to blue font:

H1 { color: blue }

CSS1 extends this by defining style sheets that can be merged with the preferences set by both the browser and the user of the browser, or other style settings that occur in the page. The style effect cascades between the different definitions, with the last definition of a style overriding a previous definition for an element.

As an example, the following will redefine the header tag <H1> to be blue, with a font of type “Arial” and a size of 24 points:

H1 { color: blue; font-family: Arial ; font-size: 24pt }

With this specification, any time the <H1> tag is used in a page, the text will be displayed in Arial, 24pt, blue font. We define a second style for the <H1> tag using an inline definition that will override the first:

<H1 STYLE="color: red">

The difference between this and the first definition is that the latter redefines the formatting for the <H1> tag, but only for that specific use of the tag. This will not affect any other uses of the H1 tag in the document. If the style definition in the HEAD section had attached a weight to the style, using the important keyword, the original specification would have taken precedence over the second one:

H1 { color: blue ! important; 
         font-family: Arial ; font-size: 24pt }
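The override behavior just described can be sketched in script. This is a deliberate simplification, assuming that all declarations for a property arrive in document order and ignoring the other factors CSS1 weighs (such as who wrote the style sheet and how specific the selector is); the function name resolveProperty and the { value, important } shape are hypothetical.

```javascript
// Resolve one property from declarations listed in document order.
// Each declaration is { value: string, important: boolean }.
// A later declaration replaces the current winner unless the winner
// is marked important and the later declaration is not.
function resolveProperty(declarations) {
  let winner = null;
  for (const declaration of declarations) {
    const beatsWinner =
      winner === null || declaration.important || !winner.important;
    if (beatsWinner) winner = declaration;
  }
  return winner ? winner.value : undefined;
}

// The HEAD rule sets blue; the inline STYLE attribute sets red.
const plainCascade = resolveProperty([
  { value: "blue", important: false },
  { value: "red", important: false },
]);

// The same pair, but the HEAD rule is marked important.
const weightedCascade = resolveProperty([
  { value: "blue", important: true },
  { value: "red", important: false },
]);
```

Here plainCascade resolves to "red", because the later inline definition wins, while weightedCascade resolves to "blue", because the important keyword protects the earlier rule.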

Styles can be nested, as follows:

H1 EM { color: red }

Now, if the EM tag is used within an H1 header, the EM specification will apply to the text in addition to any other style specification given for the H1 tag. This type of style property is referred to in the CSS1 style guide as a contextual selector. Each element referenced in the line is analogous to an element within a pattern list, and the browser applies the style to the last element in the list that successfully matches the pattern it is processing.
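One way to picture the pattern matching is as a check of the element's chain of ancestors against the names in the selector. The sketch below assumes selectors made up only of element names separated by spaces, and the function name matchesContextualSelector is hypothetical; it is not how any particular browser implements the match.

```javascript
// selector:  a contextual selector made of element names, e.g. "H1 EM".
// ancestors: tag names from the document root down to the element
//            being styled, e.g. ["BODY", "H1", "EM"].
function matchesContextualSelector(selector, ancestors) {
  const pattern = selector.trim().split(/\s+/).map((name) => name.toUpperCase());
  const path = ancestors.map((name) => name.toUpperCase());

  // The last name in the selector must be the element itself.
  if (pattern[pattern.length - 1] !== path[path.length - 1]) return false;

  // The remaining names must appear, in order, among the ancestors.
  // Other elements may sit in between.
  let remaining = pattern.length - 2;
  for (let i = path.length - 2; i >= 0 && remaining >= 0; i--) {
    if (path[i] === pattern[remaining]) remaining--;
  }
  return remaining < 0;
}
```

With this check, an EM inside an H1 matches "H1 EM", while an EM inside an ordinary paragraph does not, so the red color would apply only to the former.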

W3C has recommended two levels of compliance for CSS1: core and extended. The standard can be seen at this W3C Web page.

At this time, only IE has partially implemented CSS1 in version 3. However, both Netscape and Microsoft have committed to implementing at least the core specification of CSS1 in version 4.0 of their browsers.

The rest of this article will give examples of using CSS1 that will display using IE 3.x. Unix users can download a testbed client, named Amaya, that will allow them to see the results of the style sheets. Amaya was created by the W3C and can be downloaded from this W3C Web page.

How CSS1 works
As shown in the previous section, formatting information can be defined for an existing element and this formatting will apply to all uses of the element unless it is overridden or modified by other definitions.

The definition can be inserted into the <HEAD> section of the page, delimited by the new HTML tags <STYLE> and </STYLE>; placed in a separate document; or inserted in-line into the element itself.

As an example of embedding style information into the header of a document, the next bit of code will create a style sheet that will modify how paragraphs display in a page:

<STYLE type="text/css">
	P { margin-left: 0.5in; margin-right: 0.5in; margin-top: 0.1in; color: red }
</STYLE>

Now, with this definition, any paragraph on the page will have a margin of half an inch for both the left and right margins, a margin of one-tenth of an inch for the top, and will display its text in red. No other formatting is necessary to apply this style to every use of the paragraph tag in the entire document.

Another method allows the Web page author to define style sheets in a separate document that is then imported into or linked to a Web page. To import a style sheet, the import keyword is used, as shown in the following syntax:

<STYLE type="text/css">
	@import url(http://someloc);
</STYLE>

The imported style sheet will merge with any styles defined directly in the existing page, or by the browser/user, and the resultant combined styles will influence page presentation. Note that IE 3.x does not support the import keyword, though this should be implemented in version 4.0.

The second method of including a style sheet file is using the LINK tag:

<LINK REL=STYLESHEET HREF="standard.css"
TYPE="text/css">

Using this type of tag will insert a style sheet into the existing Web page that overrides any other style definition for the page, unless style sheets have been turned off for the page. It is an especially effective approach to use when a company may require that all Web pages follow specific formatting.

Let’s see CSS work
Granted, if you go a little crazy using CSS1, your page is going to end up looking like something that will land you in a Federal prison if you sent it through the US Mail. An example of this can be seen in a page I call “expressionism with an attitude.”

Impartial observers would call it “the ugliest page they’ve ever seen on the Web.” However, with a little restraint (and of course, we all use restraint in our Web pages), CSS1 can turn a bland page into a grabber.

I have a Web page on my Scenarios site that uses a combination of display properties as defined by Netscape, style sheets as defined by Microsoft, and HTML tables.

Stripping away all but the most basic HTML tags leaves a page that has a lot of content, but without formatting is cold and not very interesting.

Unless the viewer was highly motivated to view the contents, chances are they would skip the page.

The first change to make is to add both a background image and background color to brighten the document up a bit. The full implementation of CSS1 allows the Web page author to specify whether a background image should repeat, and if it does, whether it will repeat horizontally or vertically. This is welcome news for those who have created really long, thin graphics to be able to give that attractive sidebar look to a page. Unfortunately, IE 3.0 does not implement this attribute, nor is it implemented with Preview Release 2 of Netscape Navigator. Instead, the image used in the example is one that can repeat gracefully. The style sheet is:

<STYLE TYPE="text/css">
	BODY { background-image: URL(snow.jpg) ; 
		background-color: silver }
</STYLE>

With the image, the style sheet also adds a default color in case the person accessing the page has turned off image downloading.

Adding the background image is a start, but the text is still a bit overwhelming and rather dull looking (but not reading, of course).

It would be nice to add a margin to the document, as well as changing the overall font to Times 12pt. In addition, modifying the formatting for both the <H1> header and the <STRONG> tags would help add a bit of color and contrast to the document:

<STYLE TYPE="text/css">
	BODY { background-image: URL(snow.jpg) ; 
		background-color: silver ; 
		font-size: 12pt ; font-family: Times;
		margin-left: 0.5in ; margin-right: 0.5in ;
		margin-top: 0in }
	H1 { font-size: 25pt ; line-height: 28pt ; color: navy ; 
		margin-top: -.05in ; margin-left: 1.0in } 
	STRONG { font-size: 20pt ; line-height: 22pt ; font-weight: bold ; 
		color: maroon ; font-family: Helvetica ; font-style: italic }
</STYLE>

At the time this was written, Netscape Navigator Preview release 2, in Windows 95, only implemented size units of em and ex. The first unit is the height defined for the font; the second is the height of the lowercase letter ‘x’ in the font. The results are an improvement, but you still can’t easily spot two sidebars that are inserted into the document.

To correct this, a generic class named “sides” is created, which contains a formatting definition that can be applied to any element. This is done by prefixing the class name with a period (‘.’):

.sides { background-color: white ; margin-left: 0.5in; 
	margin-right: 1.0in ;
	text-align: left ; font-family: "Courier New" ; 
	font-size: 10pt }

Looking at the page now, the sidebars stand out from the rest of the document.

The class is used with the <DIV> tag, which allows the formatting to span multiple elements until an ending </DIV> tag is reached:

<DIV CLASS=sides>
</DIV>

The class could also have been used in each individual paragraph tag that makes up the sidebar:

<P CLASS=sides>

Additionally, instead of a class, we could have created an ID attribute for the style:

#sides { background: white ; margin-left: 0.5in; 
	margin-right: 1.0in ;
	text-align: left ; font-family: "Courier New" ; 
	font-size: 10pt }

Using the identifier would be:

<DIV ID=sides>
</DIV>

W3C wants to discourage use of the ID attribute. The W3C wants people to provide classes for an existing HTML element that only apply to that element. Then, if people wish to cascade the effect, they use the parent-child style specification as stated earlier with the <H1> header and <EM> tags.

Notice from the example, and only if you are using IE 3.01, that the background color for the class is only applied to the contents and not to the area represented by a rectangle that would enclose the contents. As this looks a bit odd in the example, it is removed from the definition.

In addition to removing the background color from the “sides” class, the next change to the document will add another definition for the STRONG tag to be used in the sidebars and formatting definitions for the hypertext links.

A hypertext link is referred to in the CSS1 standard as a pseudo-class because browsers will usually implement a different look for a visited link than one that has not been visited. This type of element can take a class style specification, but the browser is not required to implement the specification.

Another change will be a specific class definition of “sides” that differs from the original class definition and which will be used for specific paragraph tags:

.sides { margin-left: 0.5in; 
	margin-right: 1.0in ;
	text-align: left ; font-family: "Courier New" ; 
	font-size: 10pt }
 
STRONG { font-size: 22pt; color: maroon ; 
	font-family: Helvetica ; font-weight: bold}
STRONG.extended { font-size: 18pt ; line-height: 20pt ; font-weight: bold ; 
		color: red ; background-color: silver ; font-style: italic }
 
P.sides { margin: 0.25in 0in 0in }
A:link { color : red }
A:visited { color : teal }

Using the <STRONG> tag with the extended style would look like:

<STRONG class=extended>

To use the original formatting, no class name is given.

The page is definitely improving.

The sidebars stand out and spacing has improved the ease with which the page can be read. Unvisited links stand out with the bright use of color, yet blend in to be non-obtrusive after the link has been visited.

A final change is made, which is to add formatting to the lists contained in the page. The Web page has both an ordered list, where the elements are numbered, and an unordered list, where the elements are bulleted. Styles are added to each of these list types to display them more effectively. Formatting is also added to the generic paragraph tag to indent the start of every paragraph:

OL { margin: 0in 0.5in 0in; font-size: 10pt }
UL { margin: 0in 0.2in 0in }
 
P { text-indent: 0.2in }

The lists now have new formatting, and all paragraphs are indented. With the cascading nature of CSS1, the paragraphs that are defined with the “sides” style also pick up the indentation from the generic paragraph style, which is denoted by the use of the ‘P’ selector without any specific class or identifier.
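The cascade can be seen in a small fragment that reuses the two paragraph rules already given in this article. This is only a sketch; the paragraph text is made up for illustration:

```html
<STYLE type="text/css">
<!--
/* Generic rule: every paragraph gets a first-line indent */
P { text-indent: 0.2in }

/* Class rule: adds margins for "sides" paragraphs, leaving text-indent alone */
P.sides { margin: 0.25in 0in 0in }
-->
</STYLE>

<P>Indented by the generic P rule.</P>
<P class="sides">Indented by the generic P rule, with the top margin from P.sides.</P>
```

Because the P.sides rule never mentions text-indent, the value from the generic P rule still applies to the second paragraph.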

The displayed Web page also makes use of several in-line style definitions, strategically placed to override some of the generic formatting options. There are a few paragraphs whose first line should not be indented. An in-line style that sets the indent to ‘0’ overrides the original paragraph specification:

<P STYLE="text-indent: 0in">

This turns off the text indentation.

The paragraphs that label the two figures included in the document are defined to increase the left margin by another half-inch. Because this margin is added to the half-inch margin already specified for the entire document, the figure paragraphs end up one inch from the left edge rather than a half:

<P STYLE="margin-left: 0.5in; color: green; font-weight: bold">

The font for the figure paragraphs is also changed to be green and bold.

The paragraphs at the end of the document that contain the trademark and copyright information are also modified with an in-line style:

<P STYLE="text-indent: 0in ; font-size: 8pt; font-style: italic">

This style sets the font to be smaller, and italic.

Positioning the elements

One improvement that would have helped the page is the ability to position the sidebars to the side of the document and have the rest of the document “flow” around them, as happens with print magazines. Another would be the ability to specify a background color for the sidebars that “fills” the rectangle enclosing the contents, not just the contents themselves.

CSS1 defines formatting of elements but does not define positioning of them. To this end Netscape and Microsoft have collaborated (yes, you read that right) on a proposed extension to CSS1 that would provide a standard specification for how elements can be positioned on the page.

The W3C proposal, “Positioning HTML elements with Cascading Style Sheets”, provides the ability to define areas for the content to flow into. These areas can then be positioned relative to each other, using “relative positioning,” or at absolute positions on the page using, what else, “absolute positioning.”

From the proposal, an example of absolute positioning could be:

#outposition {position:absolute; top: 100px; left: 100px }

Using this style sheet in the document as follows:

<P> some contents
<span id=outposition> some contents defined for a different position</span>
</P>

This code results in the contents enclosed in the SPAN tag being positioned in an absolute space that begins 100 pixels from the left and 100 pixels from the top. The enclosing rectangle extends until it hits the right margin of the parent element, in this case the document, and is tall enough to enclose all the contents. However, both the width and height of the element could also have been defined.
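As a sketch of that last point, the same rule can be given an explicit size. The 200-pixel width and 50-pixel height are hypothetical values picked for illustration, not figures from the proposal:

```html
<STYLE type="text/css">
<!--
/* Same absolute position as before, now with an explicit width and height */
#outposition { position: absolute; top: 100px; left: 100px;
	width: 200px; height: 50px }
-->
</STYLE>

<P> some contents
<SPAN id="outposition"> some contents defined for a different position</SPAN>
</P>
```

With a width defined, the enclosing rectangle should no longer stretch to the right margin of the parent element.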

Relative positioning allows elements to be positioned relative to each other, even if this means the elements overlap:

#newpos {position: relative; top: -12px }

The contents formatted by this style sheet are shifted 12 pixels above the position they would normally occupy. The rest of the contents stay where they are, so the shifted contents may overlap whatever sits above them.
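A minimal sketch of this rule in use follows; the sentence and the choice of a SPAN tag are assumptions made for illustration:

```html
<STYLE type="text/css">
<!--
/* A negative top value offsets the element upward from its normal position */
#newpos { position: relative; top: -12px }
-->
</STYLE>

<P>Regular text with a <SPAN id="newpos">raised phrase</SPAN> in the middle of the line.</P>
```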

In addition to positioning along the X- and Y-axes (horizontally and vertically on the web page), the elements can also be positioned relative to each other on the Z-axis. This means that web developers will be able to layer elements on top of each other. An example pulled directly from the positioning paper is:

<STYLE type="text/css">
<!--
.pile { position: absolute; left: 2in; top: 2in; width: 3in; height: 3in; }
-->
</STYLE>
 
<IMG SRC="butterfly.gif" CLASS="pile" ID="image" STYLE="z-index: 1">
 
<DIV CLASS="pile" ID="text1" STYLE="z-index: 3">
This text will overlay the butterfly image.
</DIV>
 
<DIV CLASS="pile" ID="text2" STYLE="z-index: 2">
This text will underlay text1, but overlay the butterfly image
</DIV>

With this, the order of the elements would be the image on the bottom, then the contents of the DIV identified as “text2”, and finally the contents of the DIV identified as “text1”. The elements are transparent, meaning that the bottom elements will show through to the top, though this can also be changed using style sheet settings.

Another proposal is the ability to define whether an element is visible, which hides the element while still maintaining its position in the document, and whether the element is displayed at all, which removes it from the display along with the space reserved for it.
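The difference between the two settings can be sketched as follows, assuming the property names visibility (from the positioning proposal) and display (from CSS1). The class names "hidden" and "gone" are hypothetical:

```html
<STYLE type="text/css">
<!--
/* Not drawn, but the space it would occupy is still reserved */
.hidden { visibility: hidden }

/* Not drawn, and no space is reserved for it */
.gone { display: none }
-->
</STYLE>

<P>Before <SPAN class="hidden">invisible, but still takes up room</SPAN> after.</P>
<P>Before <SPAN class="gone">removed along with its space</SPAN> after.</P>
```

In a browser that supports both properties, the first paragraph should show a gap between the two words, while the second should close up.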

The ability to position HTML elements, to control their visibility, and to finally control how they overlap is a revolutionary change to HTML document design.

What’s next?

With Microsoft and Netscape both committed to the support of CSS1, and both participating in an extension to the CSS1 proposal to provide for positioning of HTML elements, creating HTML pages that display effectively in both browsers should be a snap. However, there is one element that was not discussed in this article and which can tear down the browser truce flag: dynamic movement of HTML elements.

As can be seen with the release of Navigator 4.0, Netscape supports script-based movement of elements with their LAYER tag and with a style sheet concept they call JavaScript Style Sheets (JSSS). With the release of Internet Explorer Preview in March, Microsoft supports dynamic content through their own version of Dynamic HTML, which uses CSS1 elements directly. Unfortunately, neither method will work with the other browser.

As with the problems that have been faced with JavaScript, mentioned in the Digital Cats’ article “Whose JavaScript is it, anyway?”, until Microsoft and Netscape agree on a standard scripting Object Model, you and I will continue to have to work around browser differences if we want dynamic content. Or use Java applets, and forgo all uses of scripting.