Lange's Basic Datamodel of Hypertext

The following datamodel is developed through stepwise refinement. Initially the meaning of a hyperdocument is defined as a database that has active cross-references, allowing the user to have nonsequential access to a text, thereby making the reading process nonlinear. A hyperdocument, in its simplest form, can be modeled as a set of nodes and a collection of links, where the nodes are documents and the links are cross-references.

Nodes - Units of Information

An information fragment in a hyperdocument is called a node. Thus, hypertext is made up of a collection of distinct, named information fragments. The names may be assigned explicitly by the user or they can be assigned automatically. In some hypertext systems it may be necessary to divide the nodes into several different types: document, illustration, annotation, etc. Thus it must be possible to add attributes and attribute values to nodes.

Links - the Glue that holds Hyperdocuments together

A link in its simplest form is a connection between two nodes. A hypertext network is made up of a collection of uniquely named links. Links can be used to transfer the reader to a new topic, provide access to an annotation or footnote, show a reference and so on. Conceptually a link is directed, i.e. it points from one node to another, having an origin called the (source) anchor and an end point called the destination (anchor). (The words between parentheses are not used by Lange.) This does not mean that links can only be traversed in one direction. One can always pose the question: who points to me?

A hypertext system may allow different types of links. The link type can reflect the type of information it is pointing to, making it possible for the user to only view links of a certain type. Different types of links in a document could be references to related articles or reviewers' annotations. To represent this variety of link types, links can have attributes just like nodes.
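
The node and link concepts described so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lange's formal specification: the class names Node, Link and Hyperdocument, the attribute key "type", and the method names links_from and links_to are all hypothetical. The sketch shows uniquely named nodes and links, attributes on both, filtering links by type, and answering the question "who points to me?".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                                       # unique within the hyperdocument
    attributes: dict = field(default_factory=dict)  # e.g. {"type": "annotation"}


@dataclass
class Link:
    name: str                                       # unique within the network
    anchor: str                                     # name of the origin node
    destination: str                                # name of the end-point node
    attributes: dict = field(default_factory=dict)  # e.g. {"type": "reference"}


class Hyperdocument:
    """A set of named nodes plus a network of named, directed links."""

    def __init__(self):
        self.nodes = {}
        self.links = {}

    def add_node(self, node):
        if node.name in self.nodes:
            raise ValueError(f"duplicate node name: {node.name}")
        self.nodes[node.name] = node

    def add_link(self, link):
        if link.name in self.links:
            raise ValueError(f"duplicate link name: {link.name}")
        for end in (link.anchor, link.destination):
            if end not in self.nodes:
                raise KeyError(f"unknown node: {end}")
        self.links[link.name] = link

    def links_from(self, node_name, link_type=None):
        """Names of links anchored at node_name, optionally filtered by type."""
        return [l.name for l in self.links.values()
                if l.anchor == node_name
                and (link_type is None or l.attributes.get("type") == link_type)]

    def links_to(self, node_name, link_type=None):
        """Answer 'who points to me?' for node_name."""
        return [l.name for l in self.links.values()
                if l.destination == node_name
                and (link_type is None or l.attributes.get("type") == link_type)]
```

Storing the direction on the link while still offering links_to reflects the point made above: a link is conceptually directed, yet it can be traversed from either end.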

An important point in hypertext is the support for collaborative work. If several people are reviewing and annotating the same hypertext, they all use the common network made by the author of the document. To this common network each individual can add a personal subnetwork reflecting their own need for referencing across the common network and including references for their annotations. (This is exactly how annotations work in the WWW-browser Mosaic for X.)

Slots - the Interior of the Node

The structure of a node can be compared to a record datatype in programming languages. A node has a collection of uniquely named slots, each having some kind of textual content. Slots can be used to represent structured information, as in a relational database. (But the fields may contain any kind of information, unlike in a relational database.) Like anchors and destinations, slots are identified by the node in which they are embedded together with their name.
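
The record analogy can be made concrete with a small sketch. It assumes that a slot identifier is simply the pair (node name, slot name); the names Node, add_slot and slot_content are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    slots: dict = field(default_factory=dict)   # slot name -> textual content

    def add_slot(self, slot_name, text):
        # Slot names must be unique within their node.
        if slot_name in self.slots:
            raise ValueError(f"duplicate slot name in {self.name}: {slot_name}")
        self.slots[slot_name] = text


def slot_content(nodes, slot_id):
    """Resolve a slot identifier (node name, slot name) to its text."""
    node_name, slot_name = slot_id
    return nodes[node_name].slots[slot_name]
```

Two nodes may both have a slot called "title" without conflict, because the node name is part of the identifier.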

Buttons and Fields - the Referential Mechanism

In addition to links that connect entire nodes and slots, links may have connection points within the nodes. Pragmatically this covers the referential use of links in a hypertext.

A handle is a part of the text inside the slot to which a link can be attached. Handles make it possible to establish connections between the contents of one node and another node. A handle is defined as a consecutive sequence of characters in the textual contents of the slot. It is identified by (the node id and by) its character position in the text and its span in number of characters.

When a link is anchored to a handle, that is, there is an outgoing link from a handle, the text span specified by the handle is called a button. When the end-point of a link is a handle it is called a field. The difference between buttons and fields is purely a conceptual one.
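
A handle can be sketched as a small value object. The sketch below assumes that a handle also records the name of the slot it lies in, that positions are zero-based, and that nodes are plain dictionaries mapping slot names to text; the names Handle, handle_text and roles are hypothetical. It also illustrates that "button" and "field" are roles a handle plays in a link, not different kinds of object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handle:
    node: str        # name of the node
    slot: str        # name of the slot inside that node (assumed here)
    position: int    # zero-based character position in the slot text
    span: int        # number of characters covered


def handle_text(nodes, handle):
    """Return the consecutive characters a handle refers to."""
    text = nodes[handle.node][handle.slot]
    end = handle.position + handle.span
    if handle.position < 0 or handle.span <= 0 or end > len(text):
        raise ValueError("handle lies outside the slot text")
    return text[handle.position:end]


def roles(handle, links):
    """Roles a handle plays, given links as (anchor, destination) pairs."""
    found = set()
    for anchor, destination in links:
        if anchor == handle:
            found.add("button")   # a link leaves from this handle
        if destination == handle:
            found.add("field")    # a link arrives at this handle
    return found
```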

N-ary links, 2nd Order Links and Active Links

So far only binary links have been treated. Binary links are characterized by one link anchor and one destination point. They match the concept of navigating in a hypertext very well. For structural reasons it may be more appropriate to consider a more general concept of links. N-ary links have one or more link anchors and one or more destination points. In the model this means that a set of link anchors and destination points is bound to the same link. Apart from having several sources (buttons) leading to the same destination, n-ary links can also be used to implement following links in multiple steps. An n-ary link leading to an article might first direct you to an entry in an annotated reference list, for reading an abstract, and then to the article itself. A link may also lead to a short description of the destination, before actually arriving there. This behavior resembles the few words you get from Hyperties or the complete URL you get from WWW-browsers such as Mosaic or Netscape Navigator.

Nodes, slots and fields have been discussed as destination points for links. Links pointing at links, called second order links, can be used to point at a collection of connections. Such a link might reflect that a link itself is of special interest, and that the reader, after being guided to the link, can choose to study the anchor or destination of the link. Links are identified as connection points by the name of the network in which they are embedded, together with their own name.
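
An n-ary link can be sketched as one name bound to several anchors and several destinations. The sketch assumes that the destinations are kept in order so that they can be visited in steps (abstract first, then article); the text above only speaks of a set of destination points, so the ordering is a modelling choice made here. The names NaryLink and follow are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaryLink:
    name: str
    anchors: frozenset      # one or more anchor identifiers
    destinations: tuple     # one or more destinations, in visiting order (assumed)

    def __post_init__(self):
        if not self.anchors or not self.destinations:
            raise ValueError("a link needs at least one anchor and one destination")

    def is_binary(self):
        """A binary link is the special case of one anchor and one destination."""
        return len(self.anchors) == 1 and len(self.destinations) == 1


def follow(link, anchor):
    """Return the destinations reached, step by step, from one of the anchors."""
    if anchor not in link.anchors:
        raise KeyError(f"{anchor!r} is not an anchor of link {link.name!r}")
    return list(link.destinations)
```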

Active links are links that have anchors or destinations that are function denotations. That is, instead of having links pointing at fragments of text they contain a function. This function is to be interpreted when you follow the link. This kind of link can be used to generate a view of the data it is anchored to. That could be the generation of a graphical representation of the data each time one is following the link. A function signature is added to the domain of anchors and destinations. The domains of the arguments and the results of the function are not specified in any further detail within the reference model.
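
The idea of an active link can be sketched as a destination that is either a node name or a function evaluated at traversal time. The sketch covers only functions in the destination position and assumes the function receives the text of the anchor node; the reference model deliberately leaves argument and result domains open, so this signature is an assumption. The names Link, follow and bar_chart are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Link:
    name: str
    anchor: str                                     # name of the anchor node
    destination: Union[str, Callable[[str], str]]   # node name, or a function


def bar_chart(text):
    """Render lines of 'label value' as a textual bar chart."""
    rows = []
    for line in text.splitlines():
        label, value = line.split()
        rows.append(f"{label} {'#' * int(value)}")
    return "\n".join(rows)


def follow(link, nodes):
    """Follow a link; an active link computes its result on every traversal."""
    if callable(link.destination):
        return link.destination(nodes[link.anchor])
    return nodes[link.destination]
```

Because the function runs each time the link is followed, the generated view always reflects the current contents of the anchor node.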

Structures - the Organizers of Hypertext

In addition to the simple graph of nodes and links, additional structures may be desirable. A hierarchy of nodes in a hyperdocument is another primitive example of organizing a hyperdocument. It is a way of organizing information into meaningful parts, e.g. documents into sections and subsections, much like in a book. Users are usually free to define information structures in hypertext systems as they are needed; but the novice user sometimes may require guidance by the hypertext system itself, or one may find ad hoc organization of hyperdocuments potentially dangerous. The problem can be solved by using structures.

The use of the set-structure fits well into card-like hypertexts. The map-structure can extend this unordered collection of cards with a facility of direct access by user-defined names. Sequences can be used to express interrelationships between nodes as the sequence in which they should be visited, e.g. chapters in a book. Defining these structures recursively makes it possible to make tree structures of nodes.

It should be emphasized that it is not the nodes and networks themselves that are organized in these structures. The structures contain only the names of the nodes and networks. Hence it is possible to reuse nodes and networks in several structures. E.g., one can think of a section or figure appearing in more than one book, and thus in several structures.

Structures can be interpreted by filters, to make linear representations of the hyperdocument, e.g. on paper. A tree structure of a book should intuitively be interpreted by a filter in a top-down, left-to-right manner, so that chapter one and the subsections of this chapter are written out before chapter two and so on.
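
A filter of this kind can be sketched as a depth-first, left-to-right walk. The sketch assumes that every structure is a sequence of names and that a name refers either to a node or to another structure; the function name linearize and the two dictionaries are hypothetical. The guard against a structure containing itself is an addition for safety and is not discussed in the model.

```python
def linearize(name, structures, nodes, _path=()):
    """Flatten a structure into a list of node texts, top-down, left-to-right.

    structures: structure name -> sequence of member names
    nodes:      node name -> text
    """
    if name in _path:
        raise ValueError(f"structure {name!r} contains itself")
    output = []
    for member in structures[name]:
        if member in structures:
            # A substructure: write out everything below it before moving on.
            output.extend(linearize(member, structures, nodes, _path + (name,)))
        else:
            # A plain node name: look up its text.
            output.append(nodes[member])
    return output
```

Because the structures hold only names, the same node can appear in several structures and is written out wherever it is referenced.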

Structures are uniquely identified by their name. Each structure is characterized by having a collection of substructures, each organizing destinations into sets, sequences or maps. The substructures themselves have unique identities and can be destinations, thus making it possible to build more complicated structures. A structure has a root that can identify one of the substructures as being the root of the structure.
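
The description above can be sketched as a record with named substructures and a root. The sketch assumes that a set is represented by a frozenset, a sequence by a tuple, and a map by a dictionary from user-defined names to destination names; the class name Structure and the method members are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Structure:
    name: str             # unique name of the structure
    substructures: dict   # substructure name -> frozenset, tuple or dict of destination names
    root: str             # name of the substructure acting as root

    def __post_init__(self):
        if self.root not in self.substructures:
            raise ValueError(
                f"root {self.root!r} is not a substructure of {self.name!r}")

    def members(self, sub_name):
        """Destination names organized by one substructure."""
        sub = self.substructures[sub_name]
        if isinstance(sub, dict):      # map: user-defined name -> destination
            return list(sub.values())
        if isinstance(sub, tuple):     # sequence: visiting order matters
            return list(sub)
        return sorted(sub)             # set: unordered, sorted only for stable output
```

Since a substructure name may itself appear as a destination in another substructure, nesting of this kind yields the tree structures mentioned earlier.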