User-Tailored Hypermedia Explanations

Fiorella de Rosis, Berardina De Carolis, Sebastiano Pizzutilo

Dipartimento di Informatica, Università di Bari, Via Amendola 173, 70126 Bari, ITALY

email: derosis, nadja, pizzutil@gauss.uniba.it


Abstract

A good explanation facility should adapt itself to the user's needs and offer a clear alternative if a given explanation is not clear or complete. This paper describes how concepts are explained in an intelligent interface to a statistical package by accessing information in a knowledge base. The facility is based on a combination of user modeling, natural language generation and hypermedia techniques. The advantage of this approach is that it reduces the difficulties of user modeling and of interpreting requests for further information. In addition, explicit knowledge representation makes it possible to modify the facility according to the results of evaluation studies.

Keywords: explanations, user models, hypermedia, natural language generation.

1. Introduction

This paper discusses how explanations can be provided in a system whose expected users belong to heterogeneous categories, with different characteristics and needs. The specific problem examined is that of explaining concepts by accessing information in a large knowledge base. Relevant information has to be extracted from the knowledge base and presented to the user in an appropriate form. What is relevant and which form is appropriate depend on the user's characteristics. Modeling these characteristics can therefore help in producing non-redundant, understandable explanations based on hypotheses about which information the user needs to receive. However, if these hypotheses are not correct, and the explanation provided does not fully correspond to the user's expectations, users must be able to access easily the information they need. This paper describes how this purpose may be attained by combining user modeling, adaptive message generation and hypertext/hypermedia techniques. The context in which this research was developed is the design of EPIAIM, an intelligent interface to a statistical package that supports people with limited experience of epidemiology and statistics in the correct analysis of a data set.

A consultation session with EPIAIM starts with the formulation of a hypothesis (e.g. about the relationship between a risk factor and a health event) and proceeds iteratively through various phases. In each phase, a method of data analysis is suggested, a statistical procedure is applied accordingly, and results are presented and interpreted interactively with the user. Intermediate hypotheses can be formulated at the end of each phase, to be verified in the next one. At any time, the user may ask for an explanation of a concept mentioned in a message. Potential EPIAIM users are general practitioners, hospital doctors, health administrators, people working in community health services and so on. A sample of these users was examined in a set of experimental studies [1], which revealed that users' knowledge of the domain varies considerably according to their background and job. Users cannot be categorized into a few classes, such as beginner, intermediate or advanced, according to their level of experience, but have to be modeled in a more complex way. These studies also showed that the possibility of receiving adequate explanations is essential to ensure that users understand and accept the suggestions provided by the consultation system. However, the need for explanations (information content, order, and presentation) is not homogeneous across users, but varies according to their experience in the domain. Therefore, explanations have to be tailored to the user characteristics described in the user model.

2. Related Work

The characteristics of a "good" explanation facility have been defined recently, based on long experience in this field [7]. These characteristics include the ability to adapt itself to the user's knowledge and goals and to offer an alternative in case a given explanation is not understood or is not sufficiently complete. This requirement is especially important when the expected users are not homogeneous and when their needs can be foreseen only in part. In order to build a user-adapted interaction style, one has to make some conjectures about the user's goals, plans, knowledge and preferences. These default assumptions are stored in a model of the user, which is accessed any time a message has to be provided. Reflections on the content of such a user model can be found in many papers (see, e.g., [9]). Many proposals also exist on the way in which this information can be organized, and especially on how default reasoning can be applied to infer a lot of information from a few questions to the user. However, independently of the adopted approach, the image of the user that is provided by the model is approximate. Most approaches propose to assign to each user attribute a degree of belief, defined according to one of the existing theories for dealing with uncertainty in knowledge-based systems. A further source of uncertainty in user-tailored message generation is that it is not entirely clear which strategy is most appropriate for producing messages, once the user characteristics have been established. Research in this field is based on the analysis of naturally occurring texts. This analysis, in our experience, gives very good cues on how explanation strategies are related to the audience's characteristics. However, in formalizing these strategies, one always has to introduce some approximation.
These two sources of uncertainty (in the characteristics of the user who is interacting with the system, and in the message-production strategies best suited to these characteristics) combine to produce a situation in which the risk of errors in the production of messages is high. An alternative to one-shot explanations based on very sophisticated and expensive user models is to increase cooperativity in the interaction. To this end, discourse strategies are modeled by planning operators [5], in order to monitor the effects of explanations on the user and to recover if feedback indicates that the user is not satisfied with the response. Other authors propose to model the dialogue, in addition to the discourse, in order to interpret follow-up questions and answer them [3]. However, using discourse and dialogue models to produce cooperative explanations requires a deep understanding of several issues: how communicative goals are related to the user characteristics, which alternative rhetorical strategies may be used to achieve these goals, and how different communicative acts (linguistic or visual) may be combined in these strategies. This approach can be complex and expensive. In addition, it has been shown that users do have questions, but that their requests for further information may consist of a "vaguely articulated mumble or sentence fragment" [5]. As a result, adjusting a message when the user model was incorrect may produce long dialogues that risk provoking a negative reaction from the user. Combining an initial, one-shot, user-tailored explanation with the possibility of obtaining more details, if needed, through a hypertext facility may help to overcome these difficulties. This possibility has been explored by other authors, who used hypertexts to allow users to ask for clarifications [6] or to issue new queries on the text provided initially [8].

In this paper, we describe how we applied this approach to produce follow-up elaborations of the first explanation. To this end, the user's knowledge is modeled in a rather detailed way, the first explanation and the hypertext exploration of the database are tailored to the user's needs, and mixed-media communication is employed. A schema of the EPIAIM architecture is shown in Figure 1.

Figure 1: The Architecture of EPIAIM

3. Knowledge Base

The section of the EPIAIM knowledge base that is relevant to the explanation facility is made up of three components: a Dictionary of Concepts, a Library of Examples and a User Model.

3.1 The Dictionary of Concepts

The Dictionary of Concepts is organized as a database in which concepts are classified according to their theoretical meaning. Examples of classes are: methods, indices, tests. Each class is characterized by a set of attributes. For example, the class of indices includes items like rate ratio, adjusted rate, relative risk, crude odds ratio, Mantel-Haenszel odds ratio and so on. These concepts are described by the following attributes: purpose (which says which phenomenon the index measures), basic-idea (which outlines the computational criterion behind the index without going into mathematical details), formula, interpretation, and so on.

Two short texts are attached to each of these attributes: a nucleus, which gives the most essential description of the attribute, and a satellite, which provides additional details. Items in the same class are linked by a relation of analogy. In addition, concepts are linked by a relation of order: each concept has a set of "underlying concepts" associated with it, which correspond to those terms of epidemiology and statistics whose knowledge is considered a prerequisite for understanding the explanation of the concept itself. Therefore, relations Ci to Cj and Ch to Cj (where Ci, Cj, Ch denote concepts) indicate that Ci and Ch are underlying concepts of Cj. This order relation organizes the concepts in the knowledge base in the form of a Directed Acyclic Graph. For example, the underlying concepts of the Mantel-Haenszel odds ratio are confounding, odds ratio and stratified analysis, as the Mantel-Haenszel odds ratio is an index that allows the odds ratio to be estimated in a stratified analysis when confounding factors are present.
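The organization described above can be sketched as a small data structure. This is a minimal illustration, not the EPIAIM implementation: the names `Attribute`, `Concept`, `all_prerequisites` and the `"unspecified"` default class are assumptions introduced here, and only the Mantel-Haenszel example and its three underlying concepts come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """Texts attached to one attribute of a concept (hypothetical layout)."""
    nucleus: str          # most essential description
    satellite: str = ""   # additional details


@dataclass
class Concept:
    name: str
    concept_class: str = "unspecified"               # e.g. "indices", "methods", "tests"
    attributes: dict = field(default_factory=dict)   # attribute name -> Attribute
    underlying: list = field(default_factory=list)   # names of prerequisite concepts


def all_prerequisites(dictionary, name):
    """Collect every direct or indirect underlying concept of `name`.

    The order relation forms a directed acyclic graph, so a plain
    traversal with a visited list terminates.
    """
    seen = []
    stack = list(dictionary[name].underlying)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(dictionary[current].underlying)
    return sorted(seen)


# Only the classes stated in the text are filled in; the rest stay "unspecified".
dictionary = {
    "confounding": Concept("confounding"),
    "odds ratio": Concept("odds ratio", "indices"),
    "stratified analysis": Concept("stratified analysis"),
    "Mantel-Haenszel odds ratio": Concept(
        "Mantel-Haenszel odds ratio",
        "indices",
        underlying=["confounding", "odds ratio", "stratified analysis"],
    ),
}

print(all_prerequisites(dictionary, "Mantel-Haenszel odds ratio"))
```

Storing only the direct underlying concepts and computing the indirect ones on demand keeps each dictionary entry local, which is one plausible way to let the graph grow without touching existing entries.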

3.2 The Library of Examples

The Library of Examples collects commented studies taken from the scientific literature in the domain of epidemiology. Examples are also organized in classes, according to the research area to which they refer (e.g. infectious diseases, cancer, etc.). Each example is related to several concepts, because a single study can illustrate more than one concept. Each concept, in turn, can be related to examples in different classes, as studies from several research areas can help to explain it (see Figure 2). All examples are described by the same set of attributes: a description of the study (hypothesis, population examined, variables considered and so on), the study results and the interpretation of these results. Each attribute has an information item associated with it: some of these items are textual; others (e.g. the results) can take different visual forms, such as tables, histograms, graphs and maps.
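The many-to-many relation between examples and concepts can be sketched as follows. This is an illustrative assumption about the layout, not the actual library: the `Example` record, the `examples_for` function and the two placeholder studies ("study-A", "study-B") are hypothetical, and only the attribute names and the research areas are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One commented study (hypothetical record layout)."""
    title: str
    research_area: str        # class of the example, e.g. "cancer"
    concepts: tuple           # concepts the study illustrates
    description: str = ""     # textual item
    results: str = ""         # reference to a table, histogram, graph or map
    interpretation: str = ""  # textual item


def examples_for(library, concept, research_area=None):
    """Return the examples linked to `concept`, optionally within one research area."""
    return [
        ex for ex in library
        if concept in ex.concepts
        and (research_area is None or ex.research_area == research_area)
    ]


# Placeholder entries, invented for illustration only.
library = [
    Example("study-A", "infectious diseases", ("odds ratio", "confounding")),
    Example("study-B", "cancer", ("odds ratio",)),
]

print([ex.title for ex in examples_for(library, "odds ratio")])
print([ex.title for ex in examples_for(library, "odds ratio", "cancer")])
```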

Figure 2

3.3 The User Model

The User Model describes the characteristics of the user that are relevant for the functioning of EPIAIM. The way in which this model is built up and updated is described elsewhere [2]. We only mention here the aspects that are of interest to the explanation facility. The user model is a stereotype whose main component is the body, a collection of sentences about the user knowledge of concepts of the type: (KNOW-ABOUT (user, concept-i))

KNOW-ABOUT is a predicate that summarizes the knowledge of a person about an abstract object. It takes two arguments: the first denotes the user, the second the concept. The sentence indicates that the user has the concept in his or her mind as something learned or understood, and therefore has cognition of its properties. A numerical value is attached to each sentence, which measures the probability that the sentence is true. The body also includes sentences about other characteristics of the user, such as the main research areas: (PRACTICE-IN (user, subject-area-j)).

The stereotype is activated by a set of general (trigger) questions on the user's curriculum, e.g. university degree, years of experience, job. The probabilities of sentences are computed and revised by exploiting knowledge about the process of learning concepts, which is represented in a belief network. This is the inference component of the stereotype. Each node in the belief network represents a concept. The links among nodes correspond to the "order relation" mentioned previously. The strength of these relations is measured in terms of conditional probabilities attached to the nodes. The theory of belief networks is applied to propagate and update the uncertainties attached to nodes.
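To make the role of the conditional probabilities concrete, the following sketch works through the simplest possible case: one underlying concept Ci linked to one concept Cj. It is only an illustration of standard probability calculus on a two-node network; the function names and all numerical values are assumptions, and the structure and parameters of the actual EPIAIM network are described in [2], not here.

```python
def marginal_know(p_parent, p_child_given_parent, p_child_given_not_parent):
    """P(KNOW-ABOUT(user, Cj)) obtained by summing over both states of Ci."""
    return (p_child_given_parent * p_parent
            + p_child_given_not_parent * (1.0 - p_parent))


def posterior_parent(p_parent, p_child_given_parent, p_child_given_not_parent):
    """P(KNOW-ABOUT(user, Ci)) after observing that the user knows Cj (Bayes' rule)."""
    evidence = marginal_know(p_parent, p_child_given_parent, p_child_given_not_parent)
    return p_child_given_parent * p_parent / evidence


# Hypothetical numbers, chosen only to show the arithmetic.
p_ci = 0.5                 # prior belief that the user knows Ci
p_cj_given_ci = 0.8        # chance of knowing Cj when Ci is known
p_cj_given_not_ci = 0.1    # chance of knowing Cj when Ci is not known

p_cj = marginal_know(p_ci, p_cj_given_ci, p_cj_given_not_ci)
p_ci_revised = posterior_parent(p_ci, p_cj_given_ci, p_cj_given_not_ci)
print(round(p_cj, 3), round(p_ci_revised, 3))
```

With these assumed values the belief that the user knows Cj is 0.8 x 0.5 + 0.1 x 0.5 = 0.45, and learning that the user does know Cj raises the belief in Ci from 0.5 to 0.4 / 0.45, about 0.89. A full belief network repeats this kind of computation across all linked nodes.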

4. Explanation Facility

4.1 User-Tailored Message Generation

The generation of messages is tailored to the user characteristics by a schema-based approach [4]. This approach was preferred to plan-based representation techniques because it is much less complex and nevertheless suited to our needs. The messages to be produced are short and the communication goals are the same for all users, so patterns of discourse structure can be standardized. In our case a schema is just a combination of concept attributes. For each attribute, one may specify whether only the nucleus (N) or also the satellite text (N+S) has to be included in the message. Production rules establish the relationship among the concept class, the user characteristics and the schema to be selected for producing the message. For example:

SCHEMA-1: purpose(N), basic-idea(N), interpretation(N)

SCHEMA-2: basic-idea(N+S), formula(N)

schema-selection rule-1: if (Class(?concept, INDICES))AND(Less-than(P(KNOW-ABOUT(user, ?concept)), .33)) then (Apply(SCHEMA-1, ?concept))

schema-selection rule-2: if (Class(?concept, INDICES))AND(Greater-than(P(KNOW-ABOUT(user, ?concept)), .66)) then (Apply(SCHEMA-2, ?concept))

The condition sides of these two rules say that they can be applied to concepts belonging to the class of indices. The first one applies to the case of a user who probably does not know the concept (p<.33). The second one applies to the case of a user who probably knows it (p>.66). The term Apply(SCHEMA-i, ?concept) designates the action of applying the message generation procedure to the element ?concept in the knowledge base, with the information content and order specified in SCHEMA-i. The effect of this action is a message that combines the texts attached to the attributes mentioned in SCHEMA-i. The hypothesis (which results from the experimental studies mentioned in the Introduction) is that, as the probability that the user is familiar with a concept increases, the aspects of the concept that have to be clarified after a request for explanation change, and the details needed for each aspect change as well. The explanation of a concept can be supplemented, on user request, by an example. This example is selected among those linked to the concept, according to the user's research or work experience. This experience corresponds to the research area, that is, the value of the second variable in the PRACTICE-IN sentence in the user model. Examples are generated as multimedia messages, by combining linguistic and visual acts. They are made up of three components, each going into a separate window: a paragraph about the study description; an image showing the main results in visual form (a table, a histogram, a graph, a geographical map); and a paragraph about how the results can be interpreted.
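The two schemas and the two selection rules given above can be sketched in executable form as follows. This is a simplified reading, not the EPIAIM code: the function names `select_schema` and `apply_schema`, the bracketed placeholder texts, and the choice to return `None` when neither rule fires (the text does not state what happens for probabilities between .33 and .66) are all assumptions.

```python
# Schemas as ordered lists of (attribute, mode); mode is "N" or "N+S".
SCHEMAS = {
    "SCHEMA-1": [("purpose", "N"), ("basic-idea", "N"), ("interpretation", "N")],
    "SCHEMA-2": [("basic-idea", "N+S"), ("formula", "N")],
}


def select_schema(concept_class, p_know):
    """Mirror schema-selection rule-1 and rule-2 for the class of indices."""
    if concept_class == "indices" and p_know < 0.33:
        return "SCHEMA-1"
    if concept_class == "indices" and p_know > 0.66:
        return "SCHEMA-2"
    return None  # assumption: no rule is given in the text for other cases


def apply_schema(schema_name, attributes):
    """Concatenate nucleus (and satellite, when requested) texts in schema order."""
    parts = []
    for attribute, mode in SCHEMAS[schema_name]:
        nucleus, satellite = attributes[attribute]
        parts.append(nucleus)
        if mode == "N+S" and satellite:
            parts.append(satellite)
    return " ".join(parts)


# Placeholder texts standing in for the real nucleus and satellite entries.
attributes = {
    name: (f"[{name} nucleus]", f"[{name} satellite]")
    for name in ("purpose", "basic-idea", "formula", "interpretation")
}

schema = select_schema("indices", 0.9)
print(schema, "->", apply_schema(schema, attributes))
```

Keeping the schemas as data and the selection rules as a separate function reflects the point made in the text that schemas and selection criteria can be changed independently.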

4.2 Hypermedia Follow-up

The user-tailored message generation facility of EPIAIM is employed only to produce a first, short explanation of the concept, according to the probability that the user knows the concept itself and to the application area with which the user is most familiar. However, as discussed in Section 2, these assumptions are subject to various sources of uncertainty, and may therefore be incorrect. The user is therefore given access to other possibly important pieces of information through a hypertext facility. With this facility, users may obtain more detailed explanations in the following directions:

In the last two cases, the user-tailored generation facility is invoked again, and the new message is tailored to the probability that the user knows the underlying or the analogous concepts. These facilities are accessed by clicking on various icons, which change the information content of the windows. In addition, as in any hypertext system, the user may browse the Dictionary of Concepts and the Library of Examples, to examine their content and to add personal comments if needed.

5. Conclusions

A prototype of EPIAIM, with the explanation facility described in this paper, has been implemented in Smalltalk-80 on a SUN workstation. Although EPIAIM is designed to work in the specific application domain of epidemiology, its methods and programs can be transferred to other, non-medical application domains of statistics or to other fields, such as image analysis. We are convinced that combining natural language generation with hypermedia techniques has several advantages. Natural language generation makes it possible to extract the relevant information from the knowledge base to produce different messages, an approach that is much more convenient, in terms of memory space, than attaching canned text to every possible type of explanation. At the same time, hypermedia follow-ups leave users free to depart from the system's assumptions about their information needs. This reduces the well-known difficulty of building correct and complete user models. As a result, focusing on the appropriate message is rapid and easy for users with any type of experience. This approach may be especially convenient in knowledge-based systems, where various aspects of the interaction have to be adapted to the user, and a user model can therefore serve several purposes. This was the case for EPIAIM, where strategies of data analysis, as well as explanations, had to be changed according to the user's background. We now have to evaluate our explanation facility, in order to assess how frequently the first message responds to the user's needs and how easy it is for the user to find the information needed when this message is not adequate. Modifying the explanation facility according to the results of these studies will not be very complex. The explicit representation of message generation strategies and of their linkage with the user characteristics, which is typical of knowledge-based systems, makes the system very flexible.
A few schemas can be introduced to define strategies corresponding to all concept classes and user types. Schemas and the criteria for selecting the most appropriate of them, according to the user's needs, can be modified independently. Therefore, strategies of message generation can be changed according to the results of evaluation studies, and can be diversified over more than one level of user experience, if needed.

References

[1] Berry D.C., de Rosis F.: Designing an adaptive interface to EPIAIM. Lecture Notes in Medical Informatics n.44, Stefanelli M., Hasman A., Fieschi M., Talmon J. (Eds). Springer Verlag, 1991, 306-316.

[2] Berry D.C., de Rosis F., Nicolau Molina J., Pizzutilo S., Russo A.: Modeling the user knowledge by belief networks. User Modeling and User Adapted Interaction, 2, Kluwer Ac. Publ., 1992, 367-388.

[3] Cawsey A.: Generating Interactive Explanations. In the Proceedings of the 9th National Conference on Artificial Intelligence, 1991, 86-91.

[4] McKeown K.R.: Discourse strategies for generating natural language text. Artificial Intelligence, 27, 1985, 1-41.

[5] Moore J.D. and Swartout W.R.: A reactive approach to explanations. 4th International Workshop on Natural Language Generation, July 1988, 1-35.

[6] Moore J.D. and Swartout W.R.: Pointing: A way toward explanation dialogue. In the Proceedings of the 8th National Conference on Artificial Intelligence, 1990, 457-464.

[7] Paris C.: The EES explanation facility. Workshop on Task Based Explanation, Samos, June 1992.

[8] Reiter E., Mellish C., Levine J.: Automatic generation of on-line documentation in the IDAS project. 3rd Conference on Applied Natural Language Processing, Trento, 1992, 64-71.

[9] Wahlster W. and Kobsa A.: User Models in Dialog Systems. In User Models in Dialog Systems, A. Kobsa and W. Wahlster (Eds), Springer Verlag, Berlin, 1989, 4-34.