[Ada Information Clearinghouse]
Ada '83 Rationale, Sec 6.2: Overview of the Issues

"Rationale for the Design of the
Ada® Programming Language"

[Ada '83 Rationale, HTML Version]

Copyright ©1986 owned by the United States Government. All rights reserved.
Direct inquiries to the Ada Information Clearinghouse at adainfo@sw-eng.falls-church.va.us.

CHAPTER 6: Access Types

6.2 Overview of the Issues

The main problems usually encountered with access types fall into two categories: We first discuss these problems and then define the desirable goals for a formulation of access types.

In this section...

6.2.1 Conceptual Aspects
6.2.2 Reliability, Efficiency, and Implementation Issues
6.2.3 Goals for a Formulation of Access Types


6.2.1 Conceptual Aspects

The objects of a program can be classified into two categories: static objects and dynamically allocated objects.

Static objects are declared in a program and are containers for values. Each static object has a simple name that is used to denote either the container or the value, depending on the context where the name appears. The simple name of a static object is defined by its declaration. For example, an object declared within a procedure is created by the elaboration of the object declaration and exists until the end of the procedure. Such objects are said to be static since their lifetime is determined by the static (textual) structure of the program.

In contrast, the creation of dynamically allocated objects occurs dynamically, by the execution of so-called allocators, and is not directly related to the program structure. In general, the number of dynamically allocated objects is not easy to predict and it will not be possible to define their names by declarations. Hence what is returned by the execution of an allocator is an internal name - not an identifier - and therefore it cannot be used explicitly in the text to denote the newly allocated object.

To deal with this problem, one usually defines by declaration a number of indirect names that may be used to access the different dynamically allocated objects at successive stages of execution. The internal names of the allocated objects will be the values of these declared indirect names. For this reason, indirect names are called access objects in Ada and throughout the remainder of this chapter (they have been called pointers or reference variables in other languages). An access object can be a declared variable or a component of a declared variable (and hence static); but it can also be a component of some dynamically created object. Internal names are called access values in Ada. To emphasize the difference between names and internal names, we say that a name denotes an object whereas we say that an access value designates an object.

Four important consequences follow from the fact that access objects contain internal names:

  1. The same internal name may be contained in several access objects, with the consequence that they provide access to the same dynamically allocated object.

  2. Access objects may be used to describe relations, in particular, relations that change over time.

  3. Since the internal name contained in an access variable may vary with successive stages of the program execution, a given dynamically allocated object may become inaccessible: A dynamically allocated object is accessible as long as its internal name is contained by a declared access object, or if its internal name is contained by an access object that is itself a component of a dynamically allocated object that is still accessible, and so on.

  4. Since an access object does not contain any internal name prior to its first allocation or assignment, there must be a special null value corresponding to no internal name ( none in Simula, nil in Algol 68 and Lisp, null in Ada). This value is also required for describing partial relations.
Sharing and the possibility of inaccessibility are thus two of the classical difficulties of access types. A third classical difficulty is the well-known problem of dereferencing: Considering the name of an access object, this name may stand for several different things: > The first two possibilities (name or content) also exist for static objects. Most languages (Bliss being an exception) have the same notation in the two cases, and make a distinction by context. The third possibility, however, only exists for access objects, and the solutions offered by programming languages are very diverse.

Two issues arise:

The diversity of the solutions adopted by several languages is a clear indication of the conceptual difficulties involved. We illustrate this diversity with the example below, where X and Y are access variables and AGE is a component of the dynamically allocated record object (For the Algol 68 formulation, T is assumed to be the mode of the record values; the Simula example extends the possibilities offered for texts).

languageaccess assignmentvalue assignmentcomponent selection
Simula
Algol 68
Pascal
Ada
X :- Y;
X := Y;
X := Y;
X := Y;
X := Y;
T(X) :=Y;
X^ := Y^;
X.all := Y.all;
X.AGE
X.AGE
X^.AGE
X.AGE

A final conceptual difficulty in defining access types is the notion of constant access objects. Suppose an access object is declared to be constant. Several alternative interpretations could be given for such a declaration.

  1. The access value (an internal name) is constant. This means that it always designates the same dynamically allocated object. The value of the latter, however, could vary.

  2. The access value can vary, but it may only be used to read the components of a designated object.

  3. The access value is constant and it may only be used to read the components of the designated object. Note, however, that we cannot infer that the dynamically allocated object designated by such a constant is itself constant, since other variables may designate the same dynamically allocated object.
Some languages, including Mary and Lis, have provided different syntaxes for all three forms of constant semantics. The first meaning, however, is the one that is most consistent with what is done for other types. Consider the analogy with an index:

subtype INDEX is INTEGER range 1 .. 9;
MEDIAN  :  constant INDEX  :=  (INDEX'FIRST + INDEX'LAST)/2;
TABLE   :  array (INDEX) of COLOR  :=  (others => WHITE);

In this formulation we use the index MEDIAN to refer to the median value of the table: TABLE(MEDIAN). Now, the fact that MEDIAN is constant only means that we always refer to this component; it does not mean that assignment to this component is forbidden.


6.2.2 Reliability, Efficiency, and Implementation Issues

When a dynamically allocated object becomes inaccessible, the corresponding space may (at least theoretically) be reclaimed for other uses without any risk. This operation, classically called garbage-collection, has been used in languages such as Lisp, Simula, and Algol 68.

Unfortunately, there is no method of garbage collection that is generally suitable to real-time applications. The method used in most Lisp implementations is to allocate storage continuously until the available space is exhausted, and then reclaim inaccessible objects by a complete traversal of all allocated structures. This implies that the execution of an allocator can suddenly initiate garbage collection, thereby causing a large and unpredictable overhead. Moreover, as the available storage becomes increasingly fragmented by accessible objects, garbage collection could be triggered ever more frequently, causing rapid degradation of performance.

Another method is to maintain reference counts with each allocated object: an object is inaccessible if its reference count is zero. This fails with cyclic structures, where a non-zero reference count does not necessarily imply accessibility. It also causes implementation problems, since the reference count of an accessed object must be decremented whenever a declared access object that designates it passes out of scope - either in the normal course of execution or as a result of an exception. Access objects are therefore associated with finalization actions, with all their attendant difficulties. However, even if this method were fully implemented it would not solve the problem: the unpredictable overhead has merely been transferred to the deallocation operation.

A third method is to perform garbage collection periodically by a parallel process of lower priority. Provided the synchronization problems can be solved, this provides the least unsatisfactory solution for real-time use. Its major defect is that, under conditions where the transaction rate is high, the lower-priority processes may become starved, so garbage collection might not be done often enough to maintain a satisfactory pool of free storage.

For these reasons several languages in the systems programming area (including Lis and Euclid) try to achieve better control over storage management for dynamically allocated objects. This means that such languages offer the opportunity to define the workings of object allocation within the language itself. Similarly they admit an explicit deallocation statement which can also be defined within the language itself.

Such operations usually cannot be written with the full degree of compilation-time checking that is provided by types, though the Ada generics facility permits a greater degree of safety than is found in many other languages. In addition, the availability of explicit deallocation introduces the possibility of dangling access values: the program might deallocate a dynamically allocated object that is still accessible by other paths - its internal name still being contained by other access variables.

Confronted with this dilemma between reliability and efficiency, a possible answer is to choose reliability and accept the possibility that access types might not be used in programs that are time- critical. However, there are cases where access types should be used, precisely because the application considered is time-critical. We illustrate this point with the following example:

Assume that we need to compute the sum of the elements of a circular list. A formulation using an array might look as follows:

type INDEX is range 1 .. 1000;

type ITEM is
  record
    SUCC, PRED :  INDEX;
    CONTENT    :  INTEGER;
  end record;

TABLE :  array (INDEX) of ITEM;
HEAD, NEXT :  INDEX;
SUM :  INTEGER;

The algorithm for adding the contents of the successors of HEAD may be written as a while loop:

SUM    :=  0;
NEXT   :=  TABLE(HEAD).SUCC;
while NEXT /=  HEAD loop
  SUM   :=  SUM + TABLE(NEXT).CONTENT;
  NEXT  :=  TABLE(NEXT).SUCC;
end loop;

Clearly, the above formulation attempts to use index values in order to express relations, and does not achieve this with quite the elegance and readability offered by access variables. The main point, however, is that the index computation involved in accessing the array element TABLE(NEXT) at each iteration may be a drawback, especially on small computers where multiplication is rather slow.

The alternative formulation with access objects (declarations omitted) is given below:

SUM   :=  0;
NEXT  :=  HEAD.SUCC;
while NEXT /=  HEAD loop
  SUM   :=  SUM + NEXT.CONTENT;
  NEXT  :=  NEXT.SUCC;
end loop;

This solution is more readable - it does not require mention of names such as TABLE that are irrelevant to the logic of the algorithm - and also more efficient since no index calculation is involved.

In general, when access variables are used, address computations will be done only once, at the time of dynamic allocation. Thereafter access values can only be assigned to access objects or used to access the dynamically allocated objects. This however does not involve address computations. On the other hand, when indices are used, address computations must be redone for every access.


6.2.3 Goals For a Formulation of Access Types

As shown by the previous example, one of the advantages of access variables is efficiency. As a consequence we must be able to use them in time-critical applications. In this case, however, we must provide a form of access variable that does not result in garbage collection with the associated costs and unpredictability. Naturally this does not exclude the possibility of more elaborate storage management strategies in applications that are not time-critical.

The needs of efficiency being thus satisfied, it remains that reliability should be a major goal in the formulation of access types, especially in view of the conceptual difficulties they raise. Hence a safe formulation of access types should have several important properties:


NEXTPREVIOUSUPTOCINDEX
Address any questions or comments to adainfo@sw-eng.falls-church.va.us.