[Ada Information Clearinghouse]

Ada '83 Quality and Style:

Guidelines for Professional Programmers

Copyright 1989, 1991,1992 Software Productivity Consortium, Inc., Herndon, Virginia.

CHAPTER 3: Readability

3.3 Comments

Ada comments can be either beneficial or harmful to software maintainers. They can be beneficial by explaining aspects of the code that are otherwise not readily apparent. They can be harmful by containing inaccurate information and by being too numerous and not visually distinct enough, which can cause them to obscure the structure of the code.

Comments should be minimized. They should provide needed information that cannot be expressed in the Ada language, emphasize the structure of code, and draw attention to deliberate and necessary violations of the guidelines. Comments are present either to draw attention to the real issue being exemplified or to compensate for incompleteness in the example program.

Maintenance programmers need to know the causal interaction of noncontiguous pieces of code to get a global, more or less complete sense of the program. They typically acquire this kind of information from mental simulation of parts of the code. Comments should be sufficient enough to support this process (Soloway et al. 1986).

This section presents general guidelines about how to write good comments. It then defines several different classes of comments with guidelines for the use of each. The classes are: file headers, program unit specification headers, program unit body headers, data comments, statement comments, and marker comments.

Language Ref Manual references: 2.7 Comments


3.3.1 General Comments

In this section...
3.3.1 General Comments
3.3.2 File Headers
3.3.3 Program Unit Specification Header
3.3.4 Program Unit Body Header
3.3.5 Data Comments
3.3.6 Statement Comments
3.3.7 Marker Comments
Summary of Guidelines from this section

guideline

rationale

The structure and function of well written code is clear without comments. Obscure or badly structured code is hard to understand, maintain, or reuse regardless of comments. Bad code should be improved, not explained. Reading the code itself is the only way to be absolutely positive about what the code does. Therefore, the code should be made as readable as possible.

Using comments to duplicate information in the code is a bad idea for several reasons. First, it is unnecessary work that decreases productivity. Second, it is very difficult to correctly maintain the duplication as the code is modified. When changes are made to existing code, it is compiled and tested to make sure that it is once again correct. However, there is no automatic mechanism to make sure that the comments are correctly updated to reflect the changes. Very often, the duplicate information in a comment becomes obsolete at the first code change and remains so through the life of the software. Third, when comments about an entire system are written from the limited point of view of the author of a single subsystem, the comments are often incorrect from the start.

Comments are necessary to reveal information difficult or impossible to obtain from the code. Subsequent sections of this book contain examples of such comments. Completely and concisely present the required information.

The purpose of comments is to help readers understand the code. Misspelled, ungrammatical, ambiguous, or incomplete comments defeat this purpose. If a comment is worth adding, it is worth adding correctly in order to increase its usefulness.

Making comments visually distinct from the code, by indenting them, grouping them together into headers, or highlighting them with dashed lines is useful because it makes the code easier to read. Subsequent sections of this book elaborate on this point.

automation note

The guideline about storing redundant information in comments applies only to manually generated comments. There are tools which automatically maintain information about the code (e.g., calling units, called units, cross-reference information, revision histories, etc.), storing it in comments in the same file as the code. Other tools read comments, but do not update them, using the information from the comments to automatically generate detailed design documents and other reports.

The use of such tools is encouraged, and may require that you structure your header comments so they can be automatically extracted and/or updated. Beware that tools which modify the comments in a file are only useful if they are executed frequently enough. Automatically generated obsolete information is even more dangerous than manually generated obsolete information, because it is more trusted by the reader.

Revision histories are maintained much more accurately and completely by configuration management tools. With no tool support, it is very common for an engineer to make a change and forget to update the revision history. If your configuration management tool is capable of maintaining revision histories as comments in the source file, then take advantage of that capability, regardless of any compromise you might have to make about the format or location of the revision history. It is better to have a complete revision history appended to the end of the file than to have a partial one formatted nicely and embedded in the file header.


3.3.2 File Headers

guideline

instantiation

example

------------------------------------------------------------------------ 
--      Copyright (c) 1991, Software Productivity Consortium, Inc. 
--      All rights reserved.

-- Author: J. Smith

-- Department:System Software Department

-- Revision History: 
--   7/9/91 J. Smith 
--     - Added function Size_Of to support queries of node sizes. 
--     - Fixed bug in Set_Size which caused overlap of large nodes. 
--   7/1/91 M. Jones 
--     - Optimized clipping algorithm for speed. 
--   6/25/91 J. Smith 
--     - Original version. 
------------------------------------------------------------------------

rationale

Ownership information should be present in each file if you want to be sure to protect your rights to the software. Furthermore, for high visibility, it should be the very first thing in the file.

Responsibility and revision history information should be present in each file for the sake of future maintainers, this is the header information most trusted by maintainers because it accumulates. It does not evolve. There is no need to ever go back and modify the author's name or the revision history of a file. As the code evolves, the revision history should be updated to reflect each change. At worst, it will be incomplete, it should rarely be wrong. Also, the number and frequency of changes and the number of different people who made the changes over the history of a unit can be good indicators of the integrity of the implementation with respect to the design.

Information about how to find the original author should be included in the file header, in addition to the author's name, to make it easier for maintainers to find the author in case questions arise. However, detailed information like phone numbers, mail stops, office numbers, and computer account usernames are too volatile to be very useful. It is better to record the department for which the author was working when the code was written. This information is still useful if the author moves offices, changes departments, or even leaves the company, because the department is likely to retain responsibility for the original version of the code.

Language Ref Manual references: 10.1 Compilation Units - Library Units


3.3.3 Program Unit Specification Header

guideline

instantiation

example

------------------------------------------------------------------------ 
-- AUTOLAYOUT

-- Purpose: 
--   This package computes positional information for nodes and arcs 
--   of a directed graph.  It encapsulates a layout algorithm which is 
--   designed to minimize the number of crossing arcs and to emphasize 
--   the primary direction of arc flow through the graph.

-- Effects: 
--   - The expected usage is: 
--     1. Call Define for each node and arc to define the graph. 
--     2. Call Layout to assign positions to all nodes and arcs. 
--     3. Call Position_Of for each node and arc to determine the 
--        assigned coordinate positions. 
--   - Layout can be called multiple times, and recomputes the 
--     positions of all currently defined nodes and arcs each time. 
--   - Once a node or arc has been defined, it remains defined until 
--     Clear is called to delete all nodes and arcs.

-- Performance: 
--   This package has been optimized for time, in preference to space. 
--   Layout times are on the order of N*log(N) where N is the number 
--   of nodes, but memory space is used inefficiently. 
------------------------------------------------------------------------

package Autolayout is

   ...
   
   --------------------------------------------------------------------- 
   -- Define
   
   -- Purpose: 
   --   This procedure defines one node of the current graph. 
   -- Exceptions: 
   --   Node_Already_Defined 
   --------------------------------------------------------------------- 
   procedure Define 
         (New_Node : in     Node);
   
   --------------------------------------------------------------------- 
   -- Layout
   
   -- Purpose: 
   --   This procedure assigns coordinate positions to all defined 
   --   nodes and arcs. 
   -- Exceptions: 
   --   None. 
   --------------------------------------------------------------------- 
   procedure Layout;
   
   --------------------------------------------------------------------- 
   -- Position_Of
   
   -- Purpose: 
   --   This function returns the coordinate position of the 
   --   specified node.  The default position (0,0) is returned if no 
   --   position has been assigned yet. 
   -- Exceptions: 
   --   Node_Not_Defined 
   --------------------------------------------------------------------- 
   function Position_Of (Current : in     Node) 
         return Position;
   
   ...
   
end Autolayout;

rationale

The purpose of a header comment on the specification of a program unit is to help the user understand how to use the program unit. From reading the program unit specification and header, a user should know everything necessary to use the unit. It should not be necessary to read the body of the program unit. Therefore, there should be a header comment on each program unit specification, and each header should contain all usage information not expressed in the specification itself. Such information includes the units' effect on each other and on shared resources, exceptions raised, and time/space characteristics. None of this information can be determined from the Ada specification of the program unit.

When you duplicate information in the header that can be readily obtained from the specification, the information tends to become incorrect during maintenance. For example, do not make a point of listing all parameter names, modes, or types when describing a procedure. This information is already available from the procedure specification. Similarly, do not list all subprograms of a package in the header unless this is necessary to make some important statement about the subprograms.

Do not include information in the header which the user of the program unit doesn't need. In particular, do not include information about how a program unit performs its function or why a particular algorithm was used. This information should be hidden in the body of the program unit to preserve the abstraction defined by the unit. If the user knows such details and makes decisions based on that information, the code may suffer when that information is later changed.

When describing the purpose of the unit, avoid referring to other parts of the enclosing software system. It is better to say "this unit does ..." than to say "this unit is called by Xyz to do ...." The unit should be written in such a way that it does not know or care which unit is calling it. This makes the unit much more general purpose and reusable. In addition, information about other units is likely to become obsolete and incorrect during maintenance.

Include information about the performance (time and space) characteristics of the unit. Much of this information is not present in the Ada specification, but it is required by the user. To integrate the unit into a system, the user needs to understand the resource usage (CPU, memory, etc.) of the unit. It is especially important to note when a subprogram call causes activation of a task hidden in a package body, the task may continue to consume resources after the subroutine ends.

exception

Where a group of program units are closely related or simple to understand, it is acceptable to use a single header for the entire group of program units. For example, it makes sense to use a single header to describe the behavior of Max and Min functions; Sin, Cos, and Tan functions; or a group of functions to query related attributes of an object encapsulated in a package. This is especially true when each function in the set is capable of raising the same exceptions.

Language Ref Manual references: 6.1 Subprogram Declarations, 7.2 Package Specifications and Declarations


3.3.4 Program Unit Body Header

guideline

instantiation

example

------------------------------------------------------------------------ 
-- Autolayout

-- Implementation Notes: 
--   - This package uses a heuristic algorithm to minimize the number 
--     of arc crossings.  It does not always achieve the true minimum 
--     number which could theoretically be reached.  However it does a 
--     nearly perfect job in relatively little time.  For details about 
--     the algorithm, see ...

-- Portability Issues: 
--   - The native math package Math_Lib is used for computations of 
--     coordinate positions. 
--   - 32-bit integers are required. 
--   - No operating system specific routines are called.

-- Anticipated Changes: 
--   - Coordinate_Type below could be changed from integer to float 
--     with little effort.  Care has been taken to not depend on the 
--     specific characteristics of integer arithmetic. 
------------------------------------------------------------------------ 
package body Autolayout is

   ...
   
   --------------------------------------------------------------------- 
   -- Define
   
   -- Implementation Notes: 
   --   - This routine stores a node in the general purpose Graph data 
   --     structure, not the Fast_Graph structure because ... 
   --------------------------------------------------------------------- 
   procedure Define 
         (New_Node : in     Node) is 
   begin 
      ... 
   end Define;
   
   --------------------------------------------------------------------- 
   -- Layout
   
   -- Implementation Notes: 
   --   - This routine copies the Graph data structure (optimized for 
   --     fast random access) into the Fast_Graph data structure 
   --     (optimized for fast sequential iteration), then performs the 
   --     layout, and copies the data back to the Graph structure.  This 
   --     technique was introduced as an optimization when the algorithm 
   --     was found to be too slow, and it produced an order of 
   --     magnitude improvement. 
   --------------------------------------------------------------------- 
   procedure Layout is 
   begin 
      ... 
   end Layout;
   
   --------------------------------------------------------------------- 
   -- Position_Of 
   --------------------------------------------------------------------- 
   function Position_Of (Current : in     Node) 
         return Position is 
   begin
   
      ... 
   end Position_Of;
   
   ...
   
end Autolayout;

rationale

The purpose of a header comment on the body of a program unit is to help the maintainer of the program unit to understand the implementation of the unit, including tradeoffs among different techniques. Be sure to document all decisions made during implementation to prevent the maintainer from making the same mistakes you made. One of the most valuable comments to a maintainer is a clear description of why a change being considered will not work.

The header is also a good place to record portability concerns. The maintainer may have to port the software to a different environment and will benefit from a list of nonportable features. Furthermore, the act of collecting and recording portability issues focuses attention on these issues and may result in more portable code from the start.

Summarize complex algorithms in the header if the code is difficult to read or understand without such a summary, but do not merely paraphrase the code. Such duplication is unnecessary and hard to maintain. Similarly, do not repeat the information from the header of the program unit specification.

note

It is often the case that a program unit is self-explanatory enough that it requires no body header to explain how it is implemented or why. In such a case, omit the header entirely, as in the case with Position_Of above. Be sure, however, that the header you omit truly contains no information. For example, consider the difference between the two header sections:

-- Implementation Notes: None.

and

-- NonPortable Features: None.

The first is a message from the author to the maintainer saying "I can't think of anything else to tell you," while the second may mean "I guarantee that this unit is entirely portable."

Language Ref Manual references: 6.3 Subprogram Bodies, 7.3 Package Bodies


3.3.5 Data Comments

guideline

example

Objects can be grouped by purpose and commented as:
...

--------------------------------------------------------------------- 
-- Current position of the cursor in the currently selected text 
-- buffer, and the most recent position explicitly marked by the 
-- user.

-- Note:  It is necessary to maintain both current and desired 
--        column positions because the cursor cannot always be 
--        displayed in the desired position when moving between 
--        lines of different lengths. 
---------------------------------------------------------------------

Desired_Column : Column_Counter; 
Current_Column : Column_Counter; 
Current_Row    : Row_Counter; 
Marked_Column  : Column_Counter; 
Marked_Row     : Row_Counter;

The conditions under which an exception is raised should be commented:
--------------------------------------------------------------------- 
-- Exceptions 
--------------------------------------------------------------------- 
Node_Already_Defined : exception;   -- Raised when an attempt is made 
                                    --|   to define a node with an 
                                    --|   identifier which already 
                                    --|   defines a node. 
Node_Not_Defined     : exception;   -- Raised when a reference is 
                                    --|   made to a node which has 
                                    --|   not been defined.

Here is a more complex example, involving multiple record and access types which are used to form a complex data structure:
--------------------------------------------------------------------- 
-- These data structures are used to store the graph during the 
-- layout process. The overall organization is a sorted list of 
-- "ranks," each containing a sorted list of nodes, each containing 
-- a list of incoming arcs and a list of outgoing arcs.

-- The lists are doubly linked to support forward and backward 
-- passes for sorting. Arc lists do not need to be doubly linked 
-- because order of arcs is irrelevant.

-- The nodes and arcs are doubly linked to each other to support 
-- efficient lookup of all arcs to/from a node, as well as efficient 
-- lookup of the source/target node of an arc. 
---------------------------------------------------------------------

type Arc; 
type Arc_Pointer is access Arc;

type Node; 
type Node_Pointer is access Node;

type Node is 
   record 
      Id       : Node_Pointer;-- Unique node ID supplied by the user. 
      Arc_In   : Arc_Pointer; 
      Arc_Out  : Arc_Pointer; 
      Next     : Node_Pointer; 
      Previous : Node_Pointer; 
   end record;
   
type Arc is 
   record 
      ID     : Arc_ID;        -- Unique arc ID supplied by the user. 
      Source : Node_Pointer; 
      Target : Node_Pointer; 
      Next   : Arc_Pointer; 
   end record;
   
type Rank; 
type Rank_Pointer is access Rank;

type Rank is 
   record 
      Number     : Level_ID;  -- Computed ordinal number of the rank. 
      First_Node : Node_Pointer; 
      Last_Node  : Node_Pointer; 
      Next       : Rank_Pointer; 
      Previous   : Rank_Pointer; 
   end record;
   
First_Rank : Rank_Pointer; 
Last_Rank  : Rank_Pointer;

rationale

It is very useful to add comments explaining the purpose, structure, and semantics of the data structures. Many maintainers look at the data structures first when trying to understand the implementation of a unit. Understanding the data which can be stored, along with the relationships between the different data items, and the flow of data through the unit is an important first step in understanding the details of the unit.

In the first example above, the names Current_Column and Current_Row are relatively self-explanatory. The name Desired_Column is also well chosen, but it leaves the reader wondering what the relationship is between the current column and the desired column. The comment explains the reason for having both.

Another advantage of commenting on the data declarations is that the single set of comments on a declaration can replace multiple sets of comments that might otherwise be needed at various places in the code where the data is manipulated. In the first example above, the comment briefly expands on the meaning of "current" and "marked." It states that the "current" position is the location of the cursor, the "current" position is in the current buffer, and the "marked" position was marked by the user. This comment, along with the mnemonic names of the variables, greatly reduces the need for comments at individual statements throughout the code.

It is important to document the full meaning of exceptions and under what conditions they can be raised, as shown in the second example above, especially when the exceptions are declared in a package specification. The reader has no other way to find out the exact meaning of the exception (without reading the code in the package body).

Grouping all the exceptions together, as shown in the second example, can provide the reader with the effect of a "glossary" of special conditions. This is useful when many different subprograms in the package can raise the same exceptions. For a package in which each exception can be raised by only one subprogram, it may be better to group related subprograms and exceptions together.

When commenting exceptions, it is better to describe the exception's meaning in general terms than to list all the subprograms that can cause the exception to be raised; such a list is harder to maintain. When a new routine is added, it is likely that these lists will not be updated. Also, this information is already present in the comments describing the subprograms, where all exceptions that can be raised by the subprogram should be listed. Lists of exceptions by subprogram are more useful and easier to maintain than lists of subprograms by exception.

In the third example, the names of the record fields are short and mnemonic, but they are not completely self-explanatory. This is often the case with complex data structures involving access types. There is no way to choose the record and field names so that they completely explain the overall organization of the records and pointers into a nested set of sorted lists. The comments shown are useful in this case. Without them, the reader would not know which lists are sorted, which lists are doubly linked, or why. The comments express the intent of the author with respect to this complex data structure. The maintainer still has to read the code if he wants to be sure that the double links are all properly maintained. Keeping this in mind when reading the code makes it much easier for him to find a bug where one pointer is updated and the opposite one is not.

Language Ref Manual references: 3.3.1 Type Declarations, 11.1 Exception Declarations


3.3.6 Statement Comments

guideline

example

The following is an example of very poorly commented code:
...

-- Loop through all the strings in the array Strings, converting 
-- them to integers by calling Convert_To_Integer on each one, 
-- accumulating the sum of all the values in Sum, and counting them 
-- in Count.  Then divide Sum by Count to get the average and store 
-- it in Average. Also, record the maximum number in the global 
-- variable Max_Number. 
for I in Strings'Range loop 
   -- Convert each string to an integer value by looping through 
   -- the characters which are digits, until a nondigit is found, 
   -- taking the ordinal value of each, subtracting the ordinal value 
   -- of '0', and multiplying by 10 if another digit follows.  Store 
   -- the result in Number. 
   Number := Convert_To_Integer(Strings(I)); 
   -- Accumulate the sum of the numbers in Total. 
   Sum := Sum + Number; 
   -- Count the numbers. 
   Count := Count + 1; 
   -- Decide whether this number is more than the current maximum. 
   if Number > Max_Number then 
      -- Update the global variable Max_Number. 
      Max_Number := Number; 
   end if;
   
end loop; 
-- Compute the average. 
Average := Sum / Count;

The following is improved by not repeating things in the comments which are obvious from the code, not describing the details of what goes in inside of Convert_To_Integer, deleting an erroneous comment (the one on the statement which accumulates the sum), and making the few remaining comments more visually distinct from the code.
Sum_Integers_Converted_From_Strings: 
   for I in Strings'Range loop 
      Number := Convert_To_Integer(Strings(I)); 
      Sum := Sum + Number; 
      Count := Count + 1;
      
      -- The global Max_Number is computed here for efficiency. 
      if Number > Max_Number then 
         Max_Number := Number; 
      end if;
      
   end loop Sum_Integers_Converted_From_Strings;
   
Average := Sum / Count;

rationale

The improvements shown in the example are not improvements merely by reducing the total number of comments; they are improvements by reducing the number of useless comments.

Comments that paraphrase or explain obvious aspects of the code have no value. They are a waste of effort for the author to write and the maintainer to update. Therefore, they often end up becoming incorrect. Such comments also clutter the code, hiding the few important comments.

Comments describing what goes on inside of another unit violate the principle of information hiding. The details about Convert_To_Integer (deleted above) are irrelevant to the calling unit, and they are better left hidden in case the algorithm ever changes. Examples explaining what goes on elsewhere in the code are very difficult to maintain and almost always become incorrect at the first code modification.

The advantage of making comments visually distinct from the code is that it makes the code easier to scan, and the few important comments stand out better. Highlighting unusual or special code features indicates that they are intentional. This assists maintainers by focusing attention on code sections that are likely to cause problems during maintenance or when porting the program to another implementation.

Comments should be used to document code that is nonportable, implementation-dependent, environment-dependent, or tricky in any way. They notify the reader that something unusual was put there for a reason. A beneficial comment would be one explaining a work-around for a compiler bug. If you use a lower level (not "ideal" in the software engineering sense) solution, comment on it. Information included in the comments should state why you used that particular construct. Also include documentation on the failed attempts, e.g., using a higher level structure. This type of comment is useful to maintainers for historical purposes. You show the reader that a significant amount of thought went into the choice of a construct.

Finally, comments should be used to explain what is not present in the code as well as what is present. If you make a conscious decision to not perform some action, like deallocating a data structure with which you appear to be finished, be sure to add a comment explaining why not. Otherwise, a maintainer may notice the apparent omission and "correct" it later, thus introducing an error.

note

Further improvements can be made on the above example by declaring the variables Count and Sum in a local block so that their scope is limited and their initializations occur near their usage, e.g., by naming the block Compute_Average or by moving the code into a function called Average_Of. The computation of Max_Number can also be separated from the computation of Average. However, those changes are the subject of other guidelines; this example is only intends to illustrate the proper use of comments.

Language Ref Manual references: 5 Statements


3.3.7 Marker Comments

guideline

example

if    A_Found then 
   ... 
elsif B_Found then 
   ...
   
else  -- A and B were both not found 
   ...
   
   if Count = Max then 
      ...
      
   end if;
   
   ... 
end if;  -- A_Found

------------------------------------------------------------------------ 
package body Abstract_Strings is 
   ...
   
   --------------------------------------------------------------------- 
   procedure Catenate (...) is 
   begin 
      ... 
   end Catenate; 
   ---------------------------------------------------------------------
   
   ... 
begin  -- Abstract_Strings 
   ... 
end Abstract_Strings; 
------------------------------------------------------------------------

rationale

Marker comments emphasize the structure of code and make it easier to scan. They can be lines that separate sections of code or descriptive tags for a construct. They help the reader resolve questions about the current position in the code. This is more important for large units than for small ones. A short marker comment fits on the same line as the reserved word with which it is associated. Thus, it adds information without clutter.

The if, elsif, else, and end if of an if statement are often separated by long sequences of statements, sometimes involving other if statements. As shown in the first example, marker comments emphasize the association of the keywords of the same statement over a great visual distance. Marker comments are not necessary with the block statement and loop statement because the syntax of these statements allows them to be named with the name repeated at the end. Using these names is better than using marker comments because the compiler verifies that the names at the beginning and end match.

The sequence of statements of a package body is often very far from the first line of the package. Many subprogram bodies, each containing many begin lines, may occur first. As shown in the second example, the marker comment emphasizes the association of the begin with the package.

note

Repeating names and noting conditional expressions clutters the code if overdone. It is visual distance, especially page breaks, that makes marker comments beneficial.

Language Ref Manual references: 5.3 If Statements, 5.4 Case Statements, 5.6 Block Statements, 6.3 Subprogram Bodies, 9.1 Abort Statements


Back to document index