Hypermedia Style Sheets on the World Wide Web

Jacco van Ossenbruggen (*), Lynda Hardman (**), Lloyd Rutledge (**)
(*) Vrije Universiteit, Fac. of Mathematics and Computer Sciences
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
(**) CWI
P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
email: jrvosse@cs.vu.nl

Abstract:

Cascading style sheets can be used on the World Wide Web for separating out style descriptions of an HTML document's appearance from the document structure. These allow a uniform set of styles to be applied to multiple documents. Style sheets used for the Web do not, however, fully support the description of styles for hypermedia documents -- that is, documents which include multiple, synchronized media elements and links. This has recently become an urgent issue with the introduction of the multimedia document description language SMIL (Synchronized Multimedia Integration Language).

In this paper we consider existing style sheet mechanisms and conclude that they lack a sufficiently powerful output document description for hypermedia documents. We derive the requirements for such an output format description. We also discuss existing style sheet standards, including their ability to describe structural as well as style transformations, and list a number of areas which need to be supported for hypermedia document styles: temporal constraints, spatial constraints, hyperlinking among multiple and continuous objects, and adaptive environments.

Introduction

In the first years after the introduction of HTML, document modeling development focussed on the appearance rather than on the content of HTML pages. A plethora of new tags was added to the original HTML specification and, to further enhance the layout capabilities, Java applet support and plug-ins became standard features of Web browsers. More recent developments focus on the structure of Web pages and, more importantly, the separate definition of the structure (in the document itself) and its layout (in a style sheet). As a consequence, we can now separate layout from an HTML page by using the Cascading Style Sheets (CSS1 [9]) W3C Recommendation, which has been implemented by all major browser vendors.

HTML markup, however, remains oriented towards appearance and has limited facilities for a richer structural markup, which is needed to convert the displayed content of today's Web pages to machine-readable and interchangeable structured content [8]. The eXtensible Markup Language (XML [3]) provides a standard, open format to define such documents with a rich and machine-readable structure. Because a browser generally has no prior knowledge about the tags in an arbitrary XML document, it depends completely on an associated style sheet which defines how to process the markup and content of the document. This requires a style sheet language which is much more powerful and expressive than CSS1. The eXtensible Style Language (XSL [1]) provides such a language for page-based documents and is based on the ISO standard DSSSL [7].

In this paper, we look at the application of style sheets to true hypermedia presentations, i.e. presentations that combine synchronized multimedia with hyperlinking. We claim that such presentations require a style language with a number of significant extensions to the current XSL proposal. We discuss the existing use of style sheets as applied to text and discuss their application to the case of hypermedia, and in particular how they need to be extended. A preliminary version of this paper was presented in [10].

Use of Style Sheets in Hypermedia

From an abstract point of view, style sheets specify the mapping of a model describing the logical structure of the document onto an output model. The output model may also be a logical structure model, or may describe a presentation structure. For page-based documents (e.g. traditional print or HTML) a clear abstraction of such a presentation structure has been developed. This is based on a sequence of pages consisting of columns, headers and footers, etc. Such abstractions allow the specification of style sheets which map a variety of logical document models (including encyclopedia, novels, articles and letters) onto the same presentation structure -- this technique has long been commonplace.

Style sheets for traditional printed documents associate individual document structures with presentation attributes of the medium such as font type, character style and font size, and also describe aspects inherent to the presentation structure, such as page size, margins and headers. Work has been carried out on style sheets for HTML documents on the World Wide Web [9]. HTML is an extension of text including anchors and links, and as such the style sheet options have been extended to include the visual styles of anchors.

In order to take style definitions further for application to hypermedia we need to include not only the anchor and linking aspects of hypertext, but also the layout and scheduling aspects of multimedia. The most important of these is the temporal specification of the document, given that spatial aspects have been partially addressed for the text case in terms of pages.

To illustrate possible uses of a hypermedia style sheet we present a simple scenario. A document describing a slide show as an ordered collection of slides may be presented in several ways. For example, each slide can be presented for a fixed time interval (e.g. 10 seconds). Here the style sheet need only associate a duration with each slide. An alternative manner of presentation is that the slide is left on display until the user selects a link to the following slide. Here the style sheet has to specify the addition of a hyperlink for each slide plus some means of activating the link, for example introducing a "next" text button. For a more advanced audience the author may wish that a number of slides are skipped, or the ordering changed. The style sheet would specify the structural changes and the output document could then be given presentation characteristics by either of the previous two style sheets.
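
The three style sheets of this scenario can be sketched as small functions over a list of slides. This is only an illustration in Python, not a proposed style sheet syntax: the class names, the `introductory` flag and the function names are hypothetical and do not come from any of the languages discussed in this paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slide:
    """One slide in the logical document (illustrative model)."""
    name: str
    introductory: bool = False  # hypothetical flag: skipped for advanced audiences


@dataclass
class PresentedSlide:
    """One slide in the output document, with presentation properties."""
    name: str
    duration_s: Optional[float] = None  # None: stay on display until the user acts
    next_link: Optional[str] = None     # target of a "next" button, if any


def timed_style(slides: List[Slide], seconds: float = 10.0) -> List[PresentedSlide]:
    """First style: associate a fixed duration with every slide."""
    return [PresentedSlide(s.name, duration_s=seconds) for s in slides]


def linked_style(slides: List[Slide]) -> List[PresentedSlide]:
    """Second style: add a "next" link from each slide to its successor."""
    out = []
    for i, s in enumerate(slides):
        target = slides[i + 1].name if i + 1 < len(slides) else None
        out.append(PresentedSlide(s.name, next_link=target))
    return out


def skip_introductory(slides: List[Slide]) -> List[Slide]:
    """Third style: a purely structural change that drops some slides."""
    return [s for s in slides if not s.introductory]
```

Because the third function returns logical slides rather than presented ones, its result can be passed to either of the first two, which mirrors the composition described in the scenario.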

Current Style Sheet Languages

Currently, three style sheet languages are important with respect to their use in hypermedia, especially on the World Wide Web. These languages are CSS, DSSSL and XSL.

CSS

The World Wide Web Consortium defined a number of formats based on the idea of Cascading Style Sheets (CSS). CSS level 1 (CSS1, [9]) is an official W3C Recommendation and defines how multiple style sheets can be applied to determine how an HTML document is to be displayed on the screen. A number of extensions to CSS1 are currently being developed. These include constructs which allow high-quality printing of HTML documents, positioning of HTML elements on the screen and speech-synthesis of web documents (e.g. for visually impaired users). All of these extensions are part of CSS level 2 (CSS2, [2]) which is currently still a working draft.

As its name suggests, one of the advantages of CSS is the concept of cascading, which allows multiple style sheets to be applied to the same document. Other advantages of CSS are:

The main limitation of CSS, that it handles only HTML documents, is due to its simplicity. While CSS is in principle applicable to document formats other than HTML, in practice it is only used in combination with HTML documents. A CSS style sheet is unable to change the structure of, or the elements within, the document tree -- it only modifies their associated style properties. For example, a horizontal rule in the input document stays a horizontal rule in the output document. A CSS style sheet cannot, for instance, map it to a page break in a printed version of the document. To be applicable to a wide variety of document formats, a style sheet needs to be able to transform the elements of a foreign document type, i.e. those not recognized by the browser, to those elements which are supported by the browser.
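
The difference between attaching style properties and transforming elements can be made concrete with a small sketch. The `Node` class, both function names and the `page-break` element are assumptions made for this illustration; neither function reproduces the actual processing model of CSS, DSSSL or XSL.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A minimal document tree node (illustrative model)."""
    tag: str
    style: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def apply_properties(node: Node, rules: Dict[str, Dict[str, str]]) -> Node:
    """CSS-like step: attach style properties, leaving every tag unchanged."""
    return Node(
        node.tag,
        {**node.style, **rules.get(node.tag, {})},
        [apply_properties(c, rules) for c in node.children],
    )


def transform_elements(node: Node, mapping: Dict[str, str]) -> Node:
    """Transformation step: map elements onto ones the renderer supports."""
    return Node(
        mapping.get(node.tag, node.tag),
        dict(node.style),
        [transform_elements(c, mapping) for c in node.children],
    )
```

With the first function a horizontal rule can change colour but remains a horizontal rule. Only the second can turn it into, say, a page break for a printed version.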

DSSSL

The Document Style Semantics and Specification Language (ISO/IEC 10179:1996, [7]), is a language based on Scheme for describing how text and graphics should be presented to users in a two-dimensional environment. The Scheme code is embedded in an SGML document which conforms to the SGML architecture defined in the DSSSL standard. DSSSL consists of two parts, a tree transformation process that can be used to reorder structured documents prior to presentation, and a formatting process that associates formatting instructions with specific nodes in the target representation -- the flow object tree. DSSSL specifications are device independent pieces of information that can be interchanged between different platforms. The back-end formatters needed to generate the final form of document (e.g. a PDF or RTF file, or a presentation on a computer display) are not standardized by DSSSL. Advantages of DSSSL include:

Important disadvantages of DSSSL are:

XSL

The eXtensible Style Language (XSL, [1]) is a proposed style sheet language for XML documents. Because the tags used in an XML document may vary from application to application, a style sheet should provide a mapping from the set of elements used in the XML document to the set of elements known to the browser. CSS is not sufficiently powerful to provide such a mapping. While DSSSL can provide these mappings, it is considered to be too complex, especially in an environment where most XML applications will originate from an HTML background.

XSL is therefore based on DSSSL, but has some important differences:

With these extensions, XSL has several advantages over the current DSSSL standard:

The extensions XSL makes to DSSSL are expected to be incorporated into DSSSL during the next corrigendum, which would make XSL a proper subset of DSSSL.

Limitations of Style Sheet Models

A hypermedia style sheet language needs to be able to define a mapping from a hypermedia document model onto the presentation model of the playout environment. This mapping includes the specification of temporal and spatial constraints, and hyperlink behavior in documents containing multiple synchronized media streams. The style sheet languages described above, and the models on which these languages are based, are currently unable to support this mapping. The following gives a brief characterization of these problems.

Temporal Constraints

In contrast to the spatial dimensions of text, hypermedia contains another dimension: time. This dimension has to be reflected in the final presentation to ensure the precise scheduling and synchronization of the media items involved. Often, the logical document structure also reflects some of the important temporal constraints, i.e. hypermedia authors tend to organize their documents in a way that reflects the temporal structure. An example is the logical grouping of elements which have to be played in sequence or in parallel. Just as a text author wants to attach spatial layout properties to logical elements in a text document (e.g. to chapters and paragraphs), a multimedia author may want to attach synchronization and scheduling properties to the logical elements constituting a multimedia presentation. Style sheet languages do not support the specification of such temporal constraints because these cannot be expressed within the page-based output models, which include only spatial dimensions.
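
As an illustration of what an output model with a temporal dimension has to express, the following sketch derives start and end times from a tree of sequential and parallel groups. It is a simplification under stated assumptions: every media item has a known duration, a parallel group ends when its longest child ends, and the names `TimeNode` and `schedule` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TimeNode:
    """A node in a temporal composition tree (illustrative model)."""
    name: str
    kind: str = "media"      # "media", "seq" or "par"
    duration: float = 0.0    # seconds; only used for media leaves
    children: List["TimeNode"] = field(default_factory=list)


def schedule(node: TimeNode, start: float = 0.0,
             table: Optional[Dict[str, Tuple[float, float]]] = None
             ) -> Dict[str, Tuple[float, float]]:
    """Return {name: (start, end)} for the node and all its descendants."""
    if table is None:
        table = {}
    if node.kind == "media":
        end = start + node.duration
    elif node.kind == "seq":
        # Each child starts when the previous one has ended.
        end = start
        for child in node.children:
            schedule(child, end, table)
            end = table[child.name][1]
    elif node.kind == "par":
        # All children start together; assume the group ends with the longest.
        end = start
        for child in node.children:
            schedule(child, start, table)
            end = max(end, table[child.name][1])
    else:
        raise ValueError(f"unknown kind: {node.kind}")
    table[node.name] = (start, end)
    return table
```

A page-based output model has no place to record the resulting intervals, which is the limitation described above.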

Spatial constraints

In contrast to spatial layout based on text-flow used in word-processors and hypertext browsers, hypermedia applications require different mechanisms to specify their spatial layout. In general, resizing the browser's window should not result in re-formatting the document to generate a new text-flow, but result in resizing of the media items playing. This may involve constraints to preserve aspect ratios or cropping of media items which do not support scaling. Because the output models are based on a flow model, the layout requirements of hypermedia documents cannot be specified by CSS1, DSSSL or XSL.
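
The resizing behavior described here amounts to a simple computation per media item. The sketch below assumes a rectangular display region and a hypothetical function name, `fit_media`; it scales uniformly to preserve the aspect ratio and falls back to cropping for items that cannot be scaled.

```python
from typing import Tuple


def fit_media(media_w: float, media_h: float,
              region_w: float, region_h: float,
              scalable: bool = True) -> Tuple[float, float]:
    """Return the displayed (width, height) of a media item in a region."""
    if media_w <= 0 or media_h <= 0:
        raise ValueError("media dimensions must be positive")
    if scalable:
        # One scale factor for both axes keeps the aspect ratio intact.
        factor = min(region_w / media_w, region_h / media_h)
        return (media_w * factor, media_h * factor)
    # Items that do not support scaling are cropped to the region instead.
    return (min(media_w, region_w), min(media_h, region_h))
```

No text is re-flowed when the window changes size; only the scale factor or the crop rectangle changes.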

A proposal for explicitly positioning HTML elements on the browser window is currently being developed as part of CSS2 [2]. This proposal allows both absolute positioning (i.e. positioning with respect to the browser window) and relative positioning (i.e. positioning relative to the "default" position the element would have in the document flow) of HTML elements.

To avoid requiring a style sheet to specify the position of every individual HTML element in the document, the positioning is inherited down the document tree. For example, if a paragraph element is positioned, all text and other elements constituting the content of the paragraph will be rendered relative to their parent's position. This inheritance mechanism works for HTML documents because the logical hierarchy specified by the document tree (more or less) mirrors the spatial layout hierarchy of the renderer. For multimedia, the assumption that the logical grouping of a document is based on the spatial hierarchy of the presentation does not hold. Multimedia documents often base the logical grouping on the temporal relations of their elements, e.g. several shots of a movie may be grouped into one scene. Inheritance of spatial layout properties down a tree which reflects the temporal structure leads to unexpected (and usually unwanted) results.

Hyperlinking continuous media objects

Continuous media objects introduce additional complexity to the specification of hyperlinks among these objects. Anchors may be active only for a certain amount of time, or be connected to different links during different time intervals. Video anchors probably move during the presentation.

Additionally, link traversal might lead users to a point somewhere in the middle of another document, as is the case in HTML. For hypermedia documents, this might involve fast-forwarding the presentation to start it at the right moment. Because their output models do not account for temporal constraints, this type of link behavior cannot be specified by current style languages.
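
A minimal sketch of time-dependent anchors follows, assuming each anchor carries an activity interval and an offset into its destination. The names `TimedAnchor`, `active_link` and `target_offset` are hypothetical, and moving anchors in video are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimedAnchor:
    """An anchor that is only active during [begin, end) (illustrative model)."""
    begin: float                 # seconds from the start of the media object
    end: float
    target: str                  # hypothetical destination identifier
    target_offset: float = 0.0   # where playback of the destination starts


def active_link(anchors: List[TimedAnchor], now: float) -> Optional[TimedAnchor]:
    """Return the anchor that is active at time `now`, or None."""
    for anchor in anchors:
        if anchor.begin <= now < anchor.end:
            return anchor
    return None
```

The same screen area can thus lead to different destinations at different moments, and following a link with a non-zero offset corresponds to fast-forwarding the destination presentation to the right moment.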

Hyperlinking within multiple active media streams

In hypertext, a window usually displays a single document (HTML frames are a common exception) where, on traversal, the destination of a link replaces the complete source document, or is displayed in a new window. In hypermedia documents, multiple streams of media items might be active simultaneously, and link traversal should not necessarily affect all of them. As a consequence, links have to define what their context is in terms of their source (i.e. which of the currently active streams are affected) and in terms of their destination (i.e. which of the streams of the destination will be activated). Synchronization constraints between objects belonging to the streams involved might further complicate the link processing. The facilities for defining link behavior are very limited in most style sheet languages, and few models support the concept of link context [6].
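
The notion of link context can be illustrated with sets of stream names. This sketch is a simplification: `ContextLink` and `traverse` are hypothetical names, and the synchronization constraints between streams mentioned above are ignored.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class ContextLink:
    """A link with explicit source and destination contexts (illustrative)."""
    source_context: FrozenSet[str]  # active streams replaced on traversal
    dest_context: FrozenSet[str]    # streams of the destination to activate


def traverse(active: Set[str], link: ContextLink) -> Set[str]:
    """Return the set of active streams after following `link`."""
    # Streams outside the source context keep playing undisturbed.
    return (active - link.source_context) | link.dest_context
```

Replacing the whole document, as in hypertext, is the special case in which the source context contains every active stream.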

Adaptive environments

In adaptive environments, the style sheet conversion might also be based on information about how to adapt to user characteristics (e.g. level of expertise) or changing system resources (e.g. network bandwidth). The hypermedia style sheet could be used to indicate how to deal with limited resources (by specifying alternatives or QoS negotiation protocols) on different platforms, thus making the document source independent of platform specific details.
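
One way to read "specifying alternatives" is as a selection among variants of the same content, driven by a measured resource. The sketch below assumes bandwidth is the only resource considered; the class, the function and the file names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alternative:
    """One variant of the same content (illustrative model)."""
    uri: str                  # illustrative file name
    min_bandwidth_kbps: int   # bandwidth assumed necessary to play this variant


def choose_alternative(alternatives: List[Alternative],
                       available_kbps: int) -> Optional[Alternative]:
    """Pick the most demanding variant the environment can still play."""
    playable = [a for a in alternatives if a.min_bandwidth_kbps <= available_kbps]
    if not playable:
        return None
    return max(playable, key=lambda a: a.min_bandwidth_kbps)
```

Keeping this choice in the style sheet rather than in the document is what keeps the document source free of platform-specific details.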

Hypermedia Presentations on the World Wide Web

Before we can define how style sheet languages need to be extended for hypermedia documents on the World Wide Web, we first describe how hypermedia can be presented. Because HTML and CSS1 do not have any synchronization primitives, they cannot be used for integrating a set of independent multimedia objects into a synchronized multimedia presentation. To distribute such presentations over the World Wide Web, authors need to use either vendor and platform specific plug-ins or use low-level scripting to define their multimedia presentations.

In order to provide a declarative, open and platform independent format to disseminate multimedia presentations over the Web, the W3C Working Group on Synchronized Multimedia (SYMM [5]) developed the Synchronized Multimedia Integration Language (SMIL [4]). SMIL allows an integrated presentation of multimedia objects over the Web, using existing transport protocols such as HTTP and RTP. It also uses existing formats to describe the contents of the individual media objects, which will typically include HTML pages, audio and video fragments, still images etc. While it is expected that SMIL documents will be generated using dedicated editors, SMIL documents can be edited by an ordinary text editor, and because SMIL is an XML application, its syntax is comparable to HTML.

The focus of SMIL is on the temporal scheduling of the individual media objects. In contrast to HTML, whose document model is based on a spatial hierarchy, the document model of SMIL is based on temporal composition. A SMIL document describes a tree of parallel and sequential elements, which can have additional attributes to define more precise synchronization constraints.

For example, SMIL may be used to describe a simple slide show. The following document plays two slides in sequence. The duration of each slide depends on the length of the associated audio track (since the audio element is the only media object with a fixed duration, the image and text end when the audio ends).

<smil> 
  <head>
    <layout type="text/smil-basic">
      <tuner id="slides"     left="20"  top="50"/>
      <tuner id="slide-text" left="20"  top="120"/>
    </layout>
  </head>
  <body>
    <seq>
      <par id="slide1">
        <img   href="slide1.gif"  loc="slides"/>
        <text  href="slide1.html" loc="slide-text"/>
        <audio href="slide1.mp3"/>
      </par>
      <par id="slide2">
        <img   href="slide2.gif"  loc="slides"/>
        <text  href="slide2.html" loc="slide-text"/>
        <audio href="slide2.mp3"/>
      </par>
    </seq>
  </body>
</smil>

The example only shows the basics of the two synchronization primitives of SMIL (parallel and sequential composition). Both primitives can have many attributes to provide more complex types of synchronization (including "hard" synchronization as provided by the lipsync attribute).

The precise layout mechanisms of SMIL are still under development. However, the format will provide hooks for style sheets to add additional layout information, and currently has a built-in layout mechanism which implements a functionally equivalent subset of CSS2.

Additionally, the initial version of SMIL provides limited hyperlinking functionality, which will likely be extended in future versions of SMIL. An interesting feature of SMIL is that its documents may display linking behavior at more than one level. For example, a SMIL document which sequentially displays HTML pages, may have links defined by the SMIL document as well as by the respective HTML documents.

Extensions to Hypermedia Style Sheet Languages

CSS is a style language tailored to HTML documents, and we consider it too limited to be applicable to more complex hypermedia documents. Despite their limitations, we think both XSL and DSSSL can provide the fundamentals for a future hypermedia style sheet language. To overcome the limitations described above, we propose a number of extensions.

First, a hypermedia object tree needs to be developed as an alternative to the DSSSL and XSL flow object tree. This tree should be able to model the temporal structure of a hypermedia presentation (e.g. it should support the scheduling of media streams and be able to synchronize parallel media streams) as well as its spatial layout (e.g. it should provide alternatives to the page and flow-based layout of DSSSL and XSL). Additionally, it should extend the linking output model of XSL in order to provide for hyperlinking within multiple synchronized streams of continuous media.

Secondly, we need an extension to the DSSSL and XSL query language which provides access not only to the input document, but also to the run-time play-out environment and to a profile of the current user. Such information is needed by the style sheet to adapt its mapping to the capabilities of the user's platform and to the user's current task, as described in the slide show scenario earlier in this paper.

To provide an initial version of such an XSL extension, we propose to add the core elements of the new SMIL format to the DSSSL and HTML/CSS core objects of XSL. This will provide a simple hypermedia output model for SGML and XML document types other than SMIL, which will allow them to be played on future Web browsers.

This approach, however, has several disadvantages. First, both the XSL and SMIL formats are in a very early stage of development, and it will take a considerable amount of time before the specifications of these formats stabilize and the formats themselves are implemented. The same holds for the features which will be used for defining the spatial layout mechanisms. Secondly, the current SMIL proposal provides only limited support for hyperlinking. Explicit specification of link contexts, as advocated in [6], is not yet supported.

Conclusions and Future Work

Existing style sheets allow the specification of structural transformations and style specifications for text-based and HTML documents. There is no single standard, however, that allows the specification of structural transformations for a general hypertext document model. For presentations that include synchronized media elements a style sheet language is required that, in addition, is able to take into account the following aspects: temporal layout, spatial layout that is not text-flow based, hyperlinking to and from continuous media objects, hyperlinking to a point within a temporal presentation, hyperlinking to a subset of a presentation, and adaptive environments. We recommend the extension of DSSSL or XSL, since these standards feature both a powerful query language and a language for expressing structural transformations.

We discussed the practical aspects of developing a style sheet language for the newly introduced SMIL multimedia description language for the Web. This would require extensions to the CSS2 proposal to include temporal aspects -- in particular parallel, sequential and relative synchronization constraints. CSS2 already contains a sufficiently rich spatial layout model for multimedia layout. Linking support in SMIL is currently limited, so that the hyperlinking aspects in CSS2 are also sufficient but would have to be extended for more complex hyperlinking.

As an initial step, we propose to extend XSL to include the basic SMIL elements. Although this would not provide a general solution to style sheets for hypermedia, it would provide sufficient support for the SMIL language.

References

1
Sharon Adler, Anders Berglund, James Clark, Istvan Cseri, Paul Grosso, Jonathan Marsh, Gavin Nicol, Jean Paoli, David Schach, Henry S. Thompson, and Chris Wilson. A Proposal for XSL, August 1997. W3C Note for discussion only. Available at http://www.w3.org/Submission/1997/13/Overview.html.

2
Bert Bos, Håkon Wium Lie, Chris Lilley, and Ian Jacobs. Cascading Style Sheets, level 2, November 1997. Work in progress. W3C Working Drafts are available at http://www.w3.org/TR.

3
Tim Bray, Jean Paoli, and C. M. Sperberg-McQueen. Extensible Markup Language (XML): Part I. Syntax, March 1997. Work in progress. W3C Working Drafts are available at http://www.w3.org/TR.

4
Stephan Bugaj, Dick Bulterman, Lynda Hardman, Jack Jansen, Rob Lanphier, Nabil Layaida, Jonathan Marsh, Anup Rao, Warner ten Kate, Jacco van Ossenbruggen, Michael Vernick, and Jin Yu. Synchronized Multimedia Integration Language (SMIL), November 1997. Edited by Philipp Hoschka. Work in progress. W3C Working Drafts are available at http://www.w3.org/TR.

5
W3C SYMM Working Group. W3C Activity on Synchronized Multimedia. More information at http://www.w3.org/AudioVideo/.

6
L. Hardman, D.C.A. Bulterman, and G. van Rossum. Links in hypermedia: the requirement for context. In Proceedings of ACM Hypertext '93 (Seattle), pages 183-191. ACM, November 1993.

7
International Organization for Standardization/International Electrotechnical Commission. Information technology -- Processing languages -- Document Style Semantics and Specification Language (DSSSL), 1996. International Standard ISO/IEC 10179:1996.

8
Rohit Khare and Adam Rifkin. XML: A Door to Automated Web Applications. IEEE Internet Computing, 1(4), July - August 1997.

9
Håkon W. Lie and Bert Bos. Cascading Style Sheets, level 1, December 1996. W3C Recommendation; Available at http://www.w3.org/TR/REC-CSS1-961217.

10
Jacco van Ossenbruggen, Lynda Hardman, Lloyd Rutledge, and Anton Eliëns. Style Sheet Support for Hypermedia Documents. In Hypertext'97 -- The Eighth ACM Conference on Hypertext, pages 216-217, April 1997.