!topic LSN on Edited Output
!key LSN-1078 on Edited Output
!reference RM9X-J.3;4.0
!from Ben Brosgol $Date: 94/01/31 12:04:59 $ $Revision: 1.3 $
!discussion

1 Summary of Issues

"Edited output" is the mapping from a numeric value to a character string that includes not only the digits, sign, and decimal point, but also additional "editing characters" such as a currency symbol, digits separator, and "check protection" fill character. The Ada 9X provision for edited output is defined for decimal fixed-point types in the IS Annex of RM9X V4.0 (J.3) and is based heavily on both the current (COBOL-85) version and the in-progress revision of the COBOL standard.

This LSN addresses several issues that have emerged from comments on RM9X V4.0:

* How to specify the dependence on the COBOL 85 semantics
* How to handle localization of the currency symbol to a multi-character string
* Whether to allow the currency symbol to appear to the right of the edited output number

The LSN also proposes some minor corrections to the rules for picture string validity, since the current rules disallow some useful cases permitted in COBOL, while they allow some nonsensical picture strings to be treated as valid.

2 Dependence on COBOL 85 Semantics

RM9X J.3.1(4) and J.3.2(4) give Ada rules by citing the relevant portions of the ISO COBOL 85 standard. This has several major advantages:

* It makes it clear that, aside from a small number of explicitly specified exceptions, the rules for picture string formation and interpretation in Ada 9X and COBOL are identical. (The reasons for the deviations are explained in the Ada 9X Rationale.)
* Making the Ada 9X description self-contained would require reformulating a set of COBOL rules that are extremely complicated. This would certainly raise questions about how to be sure that the Ada 9X and COBOL rules were the same, and in fact there would be the distinct danger that the reformulation could unintentionally introduce differences.
Note that simply copying the text from the COBOL standard into the Ada 9X standard is not a solution; the relevant COBOL section covers not only edited output but also other rules for picture strings. Moreover, the terminology in the COBOL standard is based on COBOL terms that would need to be defined.

On the other hand, reviewers have noted that "pointing to" another standard has several disadvantages:

* A reader of the Ada 9X standard with no knowledge of COBOL will have no idea what the edited output rules are. Moreover, the cited reference to COBOL (ISO 1989-1985) might not be readily available.
* What happens when COBOL 85 is obsolesced by the new COBOL standard?

The first issue can be addressed via additional detail on several of the concepts derived from COBOL (for example, 'P' for "implicit zeroes" in picture strings), as well as non-normative text that provides a summary of the edited output facilities and some examples of their usage.

When the new COBOL standard is issued, several approaches are feasible. One would be to retain the reference to the older COBOL standard. Another approach, under the almost certain assumption that the changes to COBOL edited output will be upward compatible (that is, that all legal picture strings in COBOL 85 will continue to be legal and have the same effect in the revised COBOL standard), would be to issue an update to the relevant clauses of the IS Annex so that the appropriate paragraphs in the new COBOL standard are referenced. If WG4 makes incompatible changes to the picture string rules then things would obviously become more complicated, and in such a situation we would likely need to re-express the COBOL 85 rules in Ada terms. But it seems best to undertake that effort only if and when it is needed.

Based on these considerations, the MRT recommends that the current approach to portraying the dependence on COBOL picture strings be retained; namely, via citations to the relevant clauses of the COBOL standard.
However, we also recommend that sufficient non-normative text be added to clarify the intent of the edited output facility for readers not necessarily familiar with COBOL.

3 Currency Symbol Localization

COBOL 85 supports currency symbol localization in a rather restrictive fashion: the replacement symbol must be just a single character. This is too limited for international usage, and in fact the COBOL revision group (ISO/IEC JTC1/SC22 WG4) is planning, as a generalization, to allow the currency symbol in the picture string to expand to a multi-character currency string in the edited output. A specific proposal has been sent to ANSI X3J4: namely, to have the (one-character) currency symbol in the picture string expand to the (possibly multi-character) currency string in the edited output.

As a simple example, if the picture string is "$ZZ9.99" and the localization is to "KD" ("Klingon Dollars"), then the edited output of the decimal value 12.34 would be "KDb12.34" (where 'b' signifies a blank character). This applies also when the currency symbol "floats" in the picture string; if the picture is "$$$9.99" then the edited output string for 12.34 would be "bKD12.34".

It may be easier to see these effects when the picture and edited output strings are aligned with respect to the radix mark:

   Picture:    "$ZZ9.99"     "$$$9.99"
   Result:    "KDb12.34"    "bKD12.34"

Thus in the first example the single '$' expands to "KD", the 1st (leftmost) 'Z' becomes a blank, and the 2nd 'Z' is replaced by the digit '1'. In the 2nd example the rightmost '$' becomes the '1', the middle '$' expands to "KD", and the leftmost '$' becomes a blank.

The benefit of this approach is simplicity of description. Even if there are floating currency symbols in the picture string, the transformation is straightforward: one currency symbol expands to yield the localized currency string, and each of the other currency symbols corresponds to exactly one position in the edited output.
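The expansion described above can be sketched in a few lines. The following Python fragment is only an illustration, not part of the proposal: the function name edit_simple is hypothetical, and it assumes a tiny subset of picture strings (a leading run of '$', then 'Z' and '9' positions, then an optional '.' followed by '9' positions) and non-negative values. It reproduces the two examples from this section and makes no claim to implement the COBOL rules in general.

```python
import re
from decimal import Decimal


def edit_simple(picture: str, value: Decimal, currency: str = "$") -> str:
    """Edit a non-negative value under a tiny, assumed subset of pictures.

    Hypothetical helper for illustration: supports only a leading run of
    '$', then 'Z'/'9' positions, then an optional '.' and '9' positions.
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    if not re.fullmatch(r"\$+Z*9*(\.9+)?", picture):
        raise ValueError("picture is outside the subset handled here")

    int_pic, _, frac_pic = picture.partition(".")
    int_digits, _, frac_digits = f"{value:.{len(frac_pic)}f}".partition(".")
    tail = "." + frac_digits if frac_pic else ""

    floating = int_pic.count("$") > 1
    slots = int_pic[1:]  # the leftmost '$' never holds a digit
    if len(int_digits) > len(slots):
        raise ValueError("value does not fit the picture")
    int_digits = int_digits.rjust(len(slots), "0")

    # Suppress leading zeroes in '$' and 'Z' positions; '9' ends suppression.
    chars = []
    suppressing = True
    for pic_char, digit in zip(slots, int_digits):
        if suppressing and digit == "0" and pic_char in "$Z":
            chars.append(" ")
        else:
            suppressing = False
            chars.append(digit)
    body = "".join(chars)

    if floating:
        # The currency string sits immediately left of the first digit;
        # the remaining suppressed positions stay blank.
        lead = len(body) - len(body.lstrip(" "))
        text = " " * lead + currency + body[lead:]
    else:
        # A fixed '$' expands in place at the left edge.
        text = currency + body
    return text + tail


print(repr(edit_simple("$ZZ9.99", Decimal("12.34"), "KD")))
print(repr(edit_simple("$$$9.99", Decimal("12.34"), "KD")))
```

In both cases the result is one character longer than the picture, since exactly one '$' expands to the two-character currency string.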
The drawback is that with such a transformation the length of the edited output string is not necessarily the same as the length of the picture string. To prepare a picture string for a report item, the programmer will not be able simply to align the picture string to overlap the range of columns, since the expansion may yield a string longer than the picture. Moreover, it will be tricky to reuse the same picture strings for different currencies, since the lengths of the edited output strings will depend on the length of the currency string.

(Actually there is another situation where the edited output string has a different length than the picture string; namely if there is a "V" in the picture. In this case the edited output is shorter than the picture, since there is nothing in the edited output string at the position of the "V" in the picture. However, the usage of "V" in picture strings is likely to arise more with file or database output than with human-readable report output, hence the problem of aligning picture strings with report fields is not so much an issue.)

These drawbacks have prompted some reviewers to suggest an alternate scheme, namely one where, except for the just-described situation with "V", the length of the edited output string is always the same as the length of the (expanded) picture string. The problem is how to define these semantics precisely, given the complexity of the COBOL picture string formation rules and the fact that a repeated currency symbol in the picture could be either a floating currency symbol or a fixed-position currency symbol depending on the length of the currency string. This causes complications since COBOL allows just one symbol to float.

For example, consider the picture string "$$$ZZZ9.99". This is an invalid picture string in COBOL, since it contains two "floating" symbols ('$' and 'Z').
Now we might take one of several approaches to this in Ada 9X:

(1) Permit it, floating the currency string to the right of the '$' positions if the currency string length is 1 or 2.
(2) Allow it only if the length of the currency string is 3.
(3) Reject it.

Alternative (1), although perhaps the most appealing intuitively, would require a detailed analysis of the COBOL picture string formation rules because of all the interactions. This would be a major effort, would raise the issues mentioned earlier in 2 (Dependence on COBOL 85 Semantics), and would risk diverging from the approach that is presently the one most likely to be adopted for the COBOL revision.

Alternative (2) would introduce an interaction between picture string validity and the (run-time) value of the current locale, would inhibit potential compile-time optimizations of edited output, and would not allow reuse of picture strings for currency strings of different lengths.

Alternative (3) would interfere with international usage, since a picture string would be valid if the currency string has length 1 (i.e., "$ZZZ9.99"), but the corresponding picture string would be invalid when the currency string has length greater than 1.

As a result of these difficulties with trying to ensure that the edited output string has the same length as the picture string, we recommend retaining the current RM9X approach to currency symbol expansion.

4 Position of Currency Symbol in Picture String

An inadvertent omission from the Ada 9X edited output facilities was the provision of a fixed-position currency symbol to the right of the numeric quantity. This is extremely useful in international applications, since currency values are sometimes expressed with the currency string on the right; for example, "1,234 USD". Such a facility is absent from COBOL 85, but it is possible to add the capability to Ada 9X without undue complexity.
The following are the revised versions of the relevant clauses of J.3:

J.3.1(4) should be reworded as follows:

   The string obtained by the following transformation of Item satisfies the conditions specified in ISO 1989-1985 (COBOL), Section VI, Subclause 5.9.4, Paragraphs (6)a.1 and (7):

   * Each occurrence of `_' is replaced by `,'
   * Each occurrence of `<' is replaced by `-'
   * The occurrence of `>', if any, is replaced by `B'
   * If there is a single occurrence of `$' somewhere to the right of the rightmost `9', or, if there are no `9's but there is a single occurrence of `$' somewhere to the right of the rightmost `+' or `-' in a sequence of two or more `+' or `-' characters, then the occurrence of `$' is replaced by `B'

The enhancement to allow a trailing currency symbol is captured in the last rule above.

The beginning of J.3.2(4) should be reworded as follows:

   Let COBOL_Pic be the string obtained from To_String(Pic) by performing the substitutions defined in J.3.1(4), and let COBOL_Output be the string that results from the COBOL statement MOVE literal TO COBOL_Output.

The bulleted list in J.3.2(5..9) should be extended with one further rule:

   * If COBOL_Pic contains an occurrence of `B' that corresponds to a `$' in To_String(Pic), then this `B' is replaced by Symbols.Currency

We considered allowing a floating currency symbol to the right of the numeric quantity, but the resulting complexity in the formulation of the rules seemed out of proportion to its benefit. The capability for a fixed-position currency symbol on the right is sufficient.

5 Other Issues

Although COBOL allows a numeric edited move to an item with a PICTURE lacking a numeric-edited-specific character (e.g., "999"), this is not taken into account in the rules.
Actually the rewording of J.3.1(4) above (replacing the reference to 5.9.4, Paragraph (6)a, with a reference to 5.9.4, Paragraph (6)a.1) solves this problem, since the revised version no longer references the COBOL paragraph that would have made a picture string such as "999" invalid.

The rules for picture string validity in J.3.1 need to state that `<' is mutually exclusive with `-', to avoid a nonsensical picture string like "---<--9.99>".
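The substitutions proposed for J.3.1(4) in section 4, together with the mutual-exclusion check just mentioned, can be sketched as follows. This Python fragment is an illustration only: the function name to_cobol_picture is hypothetical, it reads "a single occurrence of `$'" as "exactly one `$' to the right of the anchor position", it looks for the sign sequence after `<' has already been replaced by `-', and it does not check the result against the COBOL picture string formation rules.

```python
import re


def to_cobol_picture(ada_pic: str) -> str:
    """Apply the J.3.1(4) substitutions to an Ada picture string.

    Hypothetical helper; the handling of ambiguous cases is an
    assumption of this sketch, not wording from the proposal.
    """
    # Proposed validity rule: '<' and '-' must not appear together.
    if "<" in ada_pic and "-" in ada_pic:
        raise ValueError("'<' is mutually exclusive with '-'")

    pic = ada_pic.replace("_", ",").replace("<", "-").replace(">", "B")

    # Anchor: the rightmost '9', or, if there is no '9', the rightmost
    # character of the last run of two or more '+' or '-' characters.
    anchor = pic.rfind("9")
    if anchor < 0:
        runs = list(re.finditer(r"\+{2,}|-{2,}", pic))
        anchor = runs[-1].end() - 1 if runs else -1

    # A single '$' to the right of the anchor is a trailing currency
    # symbol; it becomes 'B' in the COBOL picture.
    if anchor >= 0 and pic.count("$", anchor + 1) == 1:
        pic = pic[: anchor + 1] + pic[anchor + 1 :].replace("$", "B")
    return pic


print(to_cobol_picture("Z_ZZ9.99$"))
print(to_cobol_picture("<ZZ9.99>"))
```

A picture such as "---<--9.99>" raises ValueError under this sketch, while a leading currency symbol as in "$ZZ9.99" is left untouched.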