UNCLASSIFIED
A Survey of Computer Programming Languages
Currently Used in the Department of Defense
Audrey A. Hook, Task Leader
Bill Brykczynski
Catherine W. McDonald
Sarah H. Nash
Christine Youngblut
May 2, 1995
This document is still undergoing review and is subject to modification or withdrawal. It should not be referenced in other publications.
INSTITUTE FOR DEFENSE ANALYSES
1801 N. Beauregard Street, Alexandria, Virginia 22311
Prepared for
Defense Information Systems Agency
PREFACE
This paper was prepared by the Institute for Defense Analyses (IDA) for the Defense Information Systems Agency under the task order, Ada Technology Insertion. It fulfills the objective of performing a survey of high order languages currently used in the Department of Defense.
This paper was reviewed by the following IDA research staff members: Dr. Alfred E. Brenner, Dr. Dennis W. Fife, Dr. Richard J. Ivanetich, Dr. John F. Kramer, and Dr. Dale E. Lichtblau.
The authors would like to acknowledge Ms. Jean Sammet for providing many suggestions on improving the data collection form. Ms. Sammet's knowledge of programming languages and their versions was most helpful. Ms. Linda Brown, Ms. Joan McGarity, and Mr. Don Reifer also provided guidance for conducting the survey. The survey respondents should also be thanked for taking time to complete and return the data collection form.
3. RESPONDENT AND PROGRAMMATIC PROFILE
APPENDIX A. SURVEY INSTRUMENT
APPENDIX B. SURVEY DATA
LIST OF REFERENCES
LIST OF ACRONYMS
Table ES-1. Total SLOC by Language Generation for Weapon System Responses
Table ES-2. Total SLOC by Language Generation for AISs Responses
Table ES-3. Total SLOC by General Purpose 3GL for Weapon Systems
Table ES-4. Total SLOC by 3GL for AISs
Table 1. Values Assigned to SLOC Range Estimates
Table 2. Values Assigned to Language Percentage Estimates
Table 3. Total SLOC by Language Generation for Weapon System Responses
Table 4. Total SLOC by General Purpose 3GL for Weapon System Responses
Table 5. Third Generation Special Purpose Languages
Table 6. Third Generation "Other" Languages
Table 7. Total SLOC by Language Generation for AISs
Table 8. Total SLOC by 3GL for AISs
Table B-1. Weapon Program/System Names
Table B-2. AIS Program/System Names
Table B-3. Weapon System Survey Data
Table B-4. AIS Survey Data
Background and Purpose
In June 1994 the Assistant Secretary of Defense for Command, Control,
Communications and Intelligence commissioned a programming language
survey of the Department of Defense (DoD). The purpose was to identify
the number of programming languages being used today in the DoD as
compared to 20 years ago when the DoD first began developing the Ada
programming language.
A 1977 study, "A Common Programming Language for the Department of Defense - Background, History and Technical Requirements," identified "450" as the probable minimum number of general purpose languages and dialects used in the DoD, but noted that the actual number was not known. How this estimate was derived, and what method was used to count root languages, versions, and dialects, remains unclear. For
this survey, as part of establishing a strong methodology, counting the
number of languages used today required input from the organizations
developing or maintaining automated information systems (AISs) and
weapon systems. A census sample would include new systems, those being
modernized, and those being maintained. For this study, a judgement
sample of weapon systems was identified from the 1994 Presidential
Budget requests for Research, Development, Test and Evaluation (RDT&E)
programs exceeding $15 million and Procurement programs exceeding $25
million. Of the 1,300 programs identified, 423 programs were selected because they included
software applications. The current DoD list of 53 major AISs was used as a sample population for non-weapon systems.
Experts in the field of programming languages have differed
dramatically in classifying programming languages for counting
purposes, particularly in defining the terms "dialect" and "version."
For this paper, we use the term "dialect" to indicate a relatively
minor change in a language whereas "version" indicates a larger change
and usually has a different "name" although the new "name" may only be
the concatenation of a different year or number to the baseline name
(e.g., Jovial, Jovial 73). We counted a "version" of a root language as
a distinct language. The methodology and data collection approach are explained in detail in this report to allow further expansion of the sample population.
Findings and Conclusions
Table ES-1. Total SLOC by Language Generation
for Weapon System Responses
Table ES-2. Total SLOC by Language Generation for AISs Responses
Table ES-3. Total SLOC by General Purpose 3GL for Weapon Systems
Table ES-4. Total SLOC by 3GL for AISs
Recommendation
Accepting the number of 450 or more general purpose programming
languages in use in the 1970s, we can see considerable progress has
been made by the Military Departments and Agencies in reducing the
number to 37 in major systems that are new or being modernized. Yet the survey indicates that a substantial legacy of applications remains that use older versions of programming languages, vendor-unique languages, and military-defined languages. The maintenance costs for these
applications could be reduced and their reliability increased by
converting these applications to a current version of a Federal
Information Processing Standard language. Automated conversion methods
should offer a cost-effective technology to facilitate this conversion.
Re-engineering these applications in another language is also a cost
reduction opportunity. Redundant code can be eliminated, software
components can be re-used, and modern off-the-shelf programming tools
can be used to improve maintainability and reliability.
Consequently, we recommend that Service and Defense Agency Program Managers regularly review their software applications to identify a migration strategy and plan for upgrading them to current versions of standards-based languages and to modern labor-saving tools. The progress in reducing the number of languages used, as shown in this survey, indicates that further reduction should be possible. Indeed, we recognize that several migration efforts are already under way.
This paper reports the results of a programming language survey
commissioned in June 1994 by the Honorable Emmett Paige, Jr., Assistant
Secretary of Defense for Command, Control, Communications and
Intelligence, and funded by the Defense Information Systems Agency,
Center for Software, DoD Software Initiatives Department. The
motivation for the survey was a desire to know how many programming
languages are being used in the Department of Defense (DoD) today as
compared to 20 years ago when the DoD began development of the Ada
language.
We reviewed studies that preceded and followed the formation of the DoD
High Order Language Working Group (HOLWG) in the mid-1970s to locate a
primary source for a list of languages then in use within DoD. Two
major software problems were under study at that time. The first was
the trend toward unaffordable costs for DoD embedded systems software
and the second was the potential proliferation of Service-unique
programming languages. Software cost studies of this period did not
reference specific programming languages, presumably because software
development costs did not appear to vary as a function of the specific
programming language being used [AF-CCIP 1973, Fisher 1974]. These
studies extrapolated total and projected costs based upon other factors
(e.g., labor rates, purchase price, and maintenance costs for hardware
and system software used to develop embedded systems).
In 1974, each Military Department independently proposed the adoption
of a common programming language for use in the development of its own
major weapon systems. The then-Director of Defense Research and
Engineering (DDR&E), Malcolm R. Currie, called upon the Military
Departments to "... immediately formulate a program to assure maximum useful software commonality in the DoD" [Fisher 1977, p. 7]. The
establishment of the HOLWG was the Services' response to DDR&E. The
Technical Advisor to the HOLWG, Dr. David Fisher, and the Defense
Advanced Research Projects Agency sponsor, Colonel William A. Whitaker,
have written historical accounts of HOLWG activities but these
published papers do not document a list of programming languages in use
while the HOLWG effort proceeded [Fisher 1977, Whitaker 1993]. However,
Fisher's paper, which summarizes the technical requirements for a
common programming language, contains the following reference to
languages in use:
There are at least 450 general-purpose languages and dialects currently
used in the DoD, but it is not known whether the actual number is 500
or 1500. With few exceptions, the only languages used in data
processing and scientific applications are, respectively, Cobol and
Fortran. A larger number of programming languages are used in embedded
computer systems applications. [Fisher 1976, p. 6]
As part of the present study, Dr. Fisher was contacted concerning the
origin of the oft-quoted number of 450 languages being used. He did not
recall that a systematic count of languages and versions had been done
by the HOLWG. Although there may be papers or reports containing a list
of programming languages used by DoD, we were unable to locate them
through the open literature resources for use in this study. The
analytical method used in the study of DoD software costs approximated
the number of compilers installed on general purpose computers.
Software cost estimates were derived from analysis of data that the
Services were required to report to the General Services Administration
under the requirements of the Brooks Act (1965). This data included the
numbers, configurations, models, locations, initial cost, and
utilization of computer systems. Questions remain about the 450
estimate, including the following:
This study began with the identification of data elements needed for an
analysis of programming language usage in the development or
maintenance of DoD weapon systems and Automated Information Systems
(AISs). The 1994 Presidential Budget was used to select a sample of
weapon systems to survey. The current DoD list of major AISs was used
to select a sample to survey.
Service and DoD program offices provided the data on the programming
languages being used to develop or maintain their operational and
support software. The primary data reported included the generations
and names of the programming languages being used and the amount
(source lines) of software written in each programming language
expressed as a percentage of the total system. Additional data reported included the acquisition category and life-cycle phase of the program.
A data collection form was designed to record the data elements
identified by the survey respondents. Potential respondents were
contacted by telephone to get their agreement to participate in the
survey. The data collection form was then faxed to each participant and
responses were analyzed to extract the information reported in this
study.
The classification of programming languages for counting purposes has
always been, and continues to be, a highly debated subject on which
experts differ in definitions and philosophy. Even when definitions are
generally agreed upon, the application of the definition in a
particular case is often difficult, with results depending on the
judgement of a person.
For the purposes of this report, the key issue is the difference
between "version" and "dialect." We use the term "dialect" to indicate
a relatively minor change in a language whereas "version" indicates a
larger change and usually has a different "name" although the new
"name" may only be the concatenation of a different year or number to
the baseline name (e.g., Jovial, Jovial 73). While these definitions
may appear to be abstract issues of interest only to language
specialists, they actually have a profound effect on portability,
interoperability, and counting. If a dialect (involving small changes)
is involved, training and portability may be easier than with a new
"version." A dialect would normally not be considered a separate
language. A version may or may not be considered a separate language,
depending on the purposes of the counting. In this report, we counted versions as distinct languages, dividing them between historical (pre-current) and current version years.
Because the practical usage of programming languages is generally at
the third generation level, this survey concentrates on this level
while still collecting some minimal data for other generations of
languages. Consequently, the results from this survey can be compared
only in a general way with the historical assertion about "450" general
purpose languages as a practical illustration of what is happening in
the DoD environment.
The results of this survey are drawn from a limited sample of DoD
weapon systems and AISs; therefore, the survey does not provide an
exact and detailed record of computer programming language usage in the
DoD. Several constraints affected the precision of the results:
A description of the methods used to identify the survey population and
sample is found in Section 2. A profile of the survey respondents is
presented in Section 3. Analysis of the programming language data
obtained by the survey is provided as findings in Section 4. Section 5
summarizes the conclusions drawn from survey results. Section 6
contains the recommendation. Appendix A contains the survey instrument
and Appendix B provides the data obtained during the survey. We have
provided as much detail as possible about the method and response data
with the intent of providing a documented baseline for future language
studies.
Several approaches to conducting the survey were initially considered.
These approaches are briefly discussed below before describing in
detail the selected approach.
A comprehensive DoD data call was considered, involving a formal
request for specific data elements throughout the DoD. This approach
was rejected because it would have encompassed a great deal of effort
on the part of operational organizations whose primary mission is
readiness. Historically, the response rate has been low to data calls
for information that is not directly related to assigned missions.
Another approach involved reviewing several automated databases that
contain programming language information on DoD systems. Several of
these databases were examined as part of this study, but none were able
to provide the information required. It was also difficult to determine
the lineage and accuracy of the data. Therefore, these databases were
not used as part of the present study.
The approach that was chosen involved direct contact with the
organizations responsible for developing or maintaining systems that
contain software. This section provides a detailed description of this
approach, including the survey populations and samples, trade-offs made
in designing the data collection form, the method used in contacting
potential respondents, the methods for handling erroneous response data
values, and the methods for analyzing the survey results.
We recognize that a census population of software would include systems
that are new or undergoing major modernization and software in a steady
state of maintenance. Software being maintained is a collection of
applications that are difficult to identify because they are aggregated
under operational costs. After a trial effort, we could see clearly
that the estimated time and effort to approximate a census population
would exceed the targets agreed for this survey effort. Consequently,
we identified a judgement population as described in the next
sections.
2.1.1 Weapon Systems Population
Weapon systems include aircraft, ships, tanks, tactical and strategic
missiles, smart munitions, space launch and space-based systems,
command and control (C2), and command, control, communications (C3),
and intelligence (C3I) systems. For the purposes of this survey, weapon
system software is considered to comprise embedded, C3, and C3I
systems, as well as any other software that directly supports or is
critical to a weapon system's mission [STSC 1994].
Four acquisition categories (ACAT) are defined for weapon systems by
DoD Instruction 5000.2 [DoDI 1991, pp. 2-2 to 2-4]:
An Automated Information System (AIS) can be functionally described as
follows:
A combination of computer hardware and computer software, data and/or
telecommunications, that performs functions such as collecting,
processing, transmitting, and displaying information. Excluded are
computer resources, both hardware and software, that are: physically
part of, dedicated to, or essential in real time to the mission
performance of weapon systems; used for weapon system specialized
training, simulation, diagnostic test and maintenance, or calibration;
or used for research and development of weapon systems. [DoDI 1993]
These systems are often categorized as automatic data processing
systems that are designed to meet specific user requirements for
business functions (e.g., transaction processing, accounting,
statistical analysis, or record keeping) and they are implemented on
general purpose computers, including personal computers.
An authoritative source for a complete inventory of existing AISs could
not be identified. Given the time and effort constraints placed on this
study, the list of 53 designated major AISs was used as the AIS survey
population [OASD 1994]. A major AIS is defined as one that is not a
highly sensitive, classified program (as determined by the Secretary of
Defense), and that according to DoDI 8120.1, the instruction on life
cycle management of AISs [DoDI 1993], is characterized by the
following:
The approach used in selecting the sample from the population of weapon
systems and AISs is described in the next section.
2.2.1 Weapon Systems Sample
A close approximation of the population of existing weapon systems was
found in a commercially available publication [Carroll 1994]. This
publication provided a list of over 1,300 RDT&E and procurement
programs for all Services and DoD Agencies. The list, called the
Program Management Index (PMI), was based on the President's 1994
budget request and identifies all RDT&E programs with current or future
fiscal budgets exceeding $15 million and procurement programs with
total budgets of more than $25 million.
The PMI contains a number of programs that do not develop or maintain
software for a weapon system (e.g., ammunition programs, medical
research, biodegradable packaging technology) and lacks some programs
that would have been of interest such as intelligence systems, highly
classified programs, and programs below the budgetary thresholds cited.
The PMI was then reviewed to eliminate programs that were obviously
outside of the population of interest. For example, programs such as
25MM Ammunition Development, Health Hazards of Military Material, and
Petroleum Distributions were eliminated from the population. Also
eliminated were basic and applied research programs that involve
technology years away from being fielded. While these programs often
involve small amounts of prototype software development, the scope of
the survey constrained the size of the survey sample.
Each of the programs remaining in the PMI list was briefly examined to
characterize the likelihood of being a weapon system. Weapon systems
such as aircraft, ships, and tanks were (usually) easily identifiable.
However, many of the programs required additional effort to determine
their relevance to the population. For example, the AN/BSY-2 is an
RDT&E project. Unless one is familiar with the AN/BSY-2 project, it is
not immediately clear that it is the combat system for the Seawolf
submarine and contains an aggregate of several million lines of
software.
Of the 423 programs selected from the PMI list to form the survey
sample, 142 were eliminated from the sample after we found that they
had been cancelled or were combined with another program, or contained
no software. The remaining 281 programs included most of the typical
weapon platforms (e.g., aircraft, ships, submarines, tanks) and many of
the sensors, communication systems, and weapon subsystems.
2.2.2 Automated Information Systems Sample
Of the 53 AISs on the original list, 2 have been cancelled, 4 were
primarily acquisitions for hardware and commercial off-the-shelf (COTS)
software, 5 have not begun to develop software, and 4 programs had no
current program manager name and telephone number. The survey sample of
AISs for this study, therefore, consists of the remaining 38 major
AISs.
A data collection form was designed for this survey to reduce
respondent error and to present technically accurate language choices.
Because data was to be collected on five different programming language
generations, definitions of these language generations were adapted
from the ANSI/IEEE Standard Glossary of Software Engineering
Terminology [ANSI/IEEE 1990] with advice from Ms. Jean Sammet, language
historian. These definitions were provided on the form as follows:
An overriding concern for the data collection form was to keep it as
simple as possible. Data collection forms that are lengthy or require a
great deal of effort to complete are less likely to be completed and
returned. Thus, the following design decisions were made with respect
to the data collection form:
The process for contacting potential survey respondents for weapon
systems and AISs differed only in the means by which telephone numbers
were obtained. For weapon systems, the PMI list provided the name and
telephone number of each weapon system program manager. For AISs, the
Office of the Secretary of Defense official responsible for oversight
of that AIS was contacted to provide the name and telephone number of
the AIS program manager.
The purpose of the survey was described upon contacting each potential
respondent. Suggestions for filling out the form were provided and the
form was then faxed to the potential respondent. If a response was not
received after three weeks, a follow-up call was placed.
Some data collection forms were not completely or accurately filled out
by survey respondents. For example, respondents may have omitted the
Acquisition Category because it was not known to the respondent or was
overlooked. The most common instance of inaccurate responses was that
two different programming languages were listed as being used for over
75% of the system. If the correct data was not immediately obvious, the
respondent was either contacted for the correct data or the values
reported for the data element were excluded from our analysis and
logged as a non-response. Graphic displays of survey results in the
next section show these errors as "data not available."
The process for estimating the total number of SLOC addressed by this
survey is now described. As discussed in Section 2.3, respondents were
not requested to provide an exact SLOC count for their response.
Rather, they were asked to select from a range of "Total Source Lines
of Code." A uniform procedure for estimating the SLOC represented by
each survey response form was developed. Table 1 provides the Total SLOC ranges on the response form and the corresponding SLOC count assigned to each range. The midpoint was used for the interior ranges; values for the top and bottom ranges were subjectively assigned. For example, if the "100-500K" range was checked on the response form, 300K (the midpoint) was used as the total SLOC covered by the response form. However, if an exact SLOC count was provided on the response form, that count was used in place of an estimate. The total SLOC addressed by this survey was therefore derived by summing the estimated SLOC (or, in some cases, the exact SLOC) from each response form.
Table 1. Values Assigned to SLOC Range Estimates
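The estimation rule described above can be sketched in a few lines. This is a minimal illustration rather than the report's actual procedure: only the "100-500K" to 300K example appears in the text, so the second response and the exact count below are hypothetical.

```python
def midpoint_sloc(low, high):
    """Midpoint value assigned to an interior Total SLOC range."""
    return (low + high) // 2

def estimate_response_sloc(range_bounds=None, exact=None):
    """SLOC assigned to one response form: an exact count, if provided,
    overrides the range-based estimate."""
    if exact is not None:
        return exact
    low, high = range_bounds
    return midpoint_sloc(low, high)

# The report's example: a "100-500K" response is assigned 300K SLOC.
assert estimate_response_sloc(range_bounds=(100_000, 500_000)) == 300_000

# Survey total = sum of per-response estimates (or exact counts).
# Both responses below are hypothetical.
responses = [
    {"range_bounds": (100_000, 500_000)},  # checked "100-500K" -> 300K
    {"exact": 1_250_000},                  # exact SLOC count supplied
]
total_sloc = sum(estimate_response_sloc(**r) for r in responses)
print(total_sloc)  # prints 1550000
```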
Respondents were also requested to provide the percentage of the total
system written in each applicable language. Ranges were available to
identify this percentage. Table 2 provides the "% of Total" ranges on
the response form and the corresponding percentages assigned to each
range. For example, if "5-25%" was checked for Jovial 73, 15% was used
as the percentage of the total system written in Jovial 73. If an exact
percentage was provided on the response form, that percentage was used
in place of an estimate. For each response, the SLOC for each language was derived by multiplying the total SLOC count (see Table 1) by the estimated percentage of the total system written in that language.
Table 2. Values Assigned to Language Percentage Estimates
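The per-language computation described above is a simple product. The following sketch assumes the Table 2 values (not reproduced here) are range midpoints, consistent with the report's "5-25%" to 15% example; the 300K total used below is hypothetical.

```python
def language_sloc(total_sloc, pct_range=None, exact_pct=None):
    """SLOC attributed to one language on a response form.

    total_sloc: estimated (or exact) total SLOC for the response.
    pct_range:  (low, high) bounds of the checked "% of Total" range;
                the midpoint is used, matching the 5-25% -> 15% example.
    exact_pct:  an exact percentage, which overrides the range estimate.
    """
    if exact_pct is not None:
        pct = exact_pct
    else:
        low, high = pct_range
        pct = (low + high) / 2
    return int(total_sloc * pct / 100)

# Report's example: "5-25%" checked for Jovial 73 -> 15% of the total.
# With a hypothetical estimated total of 300K SLOC, that language
# accounts for 45K SLOC.
print(language_sloc(300_000, pct_range=(5, 25)))  # prints 45000
```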
The problems in using SLOC as a means of measuring the amount of
software are well publicized [Jones 1991]. It is unlikely that
respondents would have provided much data had specific methods for
counting SLOC been required. Therefore, survey respondents were allowed
to provide SLOC range estimates using their method for counting SLOC.
Clearly, non-uniform methods for counting SLOC reduce the precision of
the SLOC-related portions of the survey. However, this trade-off does
not detract from the primary purpose of the survey (i.e., to produce a
count of programming languages being used in the DoD today).
Before presenting the survey results, it is important to realize that
the level of abstraction of survey responses varies (see Section 2.6 to
understand the rationale for this decision). For example, some
responses describe an entire weapon system (e.g., the V-22 Osprey),
other responses describe different versions of a weapon system (e.g.,
the Standoff Land Attack Missile (SLAM) Baseline and the SLAM Upgrade),
while other responses describe major subsystems resident within a
weapon system (e.g., seven subsystems on the C/KC-135). Consequently,
there is not a one-to-one mapping between a survey response and a
single weapon system. Therefore, survey results are presented in terms
of responses, not "programs" or "systems".
The survey data collection form was structured to provide the Service
and Agency distribution of respondents as the demographic data of
interest to DoD. Attributes being surveyed included the acquisition
cost category and the life-cycle phase. This section presents
observations from the weapon system and AIS responses.
The distribution of the weapon system responses in terms of Service participation, acquisition category, and acquisition phase is presented for information purposes only.
3.1.1 Services
Figure 1 presents the distribution of responses by Services. The sample
of programs selected was not evenly distributed among Army (19%), Navy
(50%), and Air Force (26%); consequently, nearly half of the responses
were from the Navy. The "Other" category represents responses from the
Ballistic Missile Defense Organization, Defense Logistics Agency, and
Defense Information Systems Agency.
3.1.2 Acquisition Category
Figure 2 presents the distribution of acquisition categories for the
weapon system responses. The largest share of responses was from ACAT I programs, with ACAT III close behind.
3.1.3 Acquisition Phase
Figure 3 presents the distribution of acquisition phases for the weapon
system responses. The Engineering & Manufacturing Development and
Production & Deployment phases combine to represent 79% of the total
number of responses.
The distribution of the AIS responses in terms of Service participation and acquisition phase is presented for information purposes only.
Acquisition category is not defined by the same rules as for weapon
systems. The data collected from the survey forms has been omitted here
because it was considered unreliable (e.g., over half of the
respondents did not report acquisition cost category).
3.2.1 Services
Figure 4 presents the distribution of Services contributing to the
major AIS survey. The "Other" category includes the Defense Information
Systems Agency and Defense Logistics Agency. There were no Marine Corps
AISs in the survey samples.
3.2.2 Acquisition (Life-Cycle) Phase
Life-cycle phases for AISs are defined by DoD Instruction 8120.1 [DoDI
1993]. Figure 5 presents the distribution of life-cycle phases reported
by the major AISs surveyed.
Figure 1. Distribution by Service for Weapon System Responses (Not Shown)
Figure 2. Distribution by Acquisition Category for Weapon System
Responses (Not Shown)
Figure 3. Distribution by Acquisition Phase for Weapon System Responses (Not Shown)
Figure 4. Distribution by Service for AIS Responses (Not Shown)
Figure 5. Distribution by Acquisition Phase for AIS Responses (Not Shown)
Finding 1: Most weapon system software is being written and maintained
in (general and special purpose) third generation languages.
More than 150 million SLOC (i.e., 81%) of the weapon system software
surveyed is written in third generation languages. Without historical
data similar to Figure 6, trends such as the changing emphasis on
particular language generations cannot be adequately identified.
However, it is very likely that over the past 20 years there has been a
gradual decline in the use of machine and assembly languages and a
corresponding increase in third generation languages.
Table 3 provides a numerical presentation of the same data as Figure 6. Table 4 lists the estimated total surveyed SLOC for each third generation language. The Total SLOC Reported column in Table 3 and Table 4 has been rounded to the nearest million.
Table 3. Total SLOC by Language Generation for Weapon System Responses
Table 4. Total SLOC by General Purpose 3GL for Weapon System Responses
The following special purpose third generation languages were also
reported (Table 5).
Table 5. Third Generation Special Purpose Languages
Respondents were provided space on the data collection form to identify
any programming languages being used that were not already listed.
These languages formed the "Other 3GLs" noted in Table 4, and included the languages listed in Table 5 and Table 6.
Table 6. Third Generation "Other" Languages
Finding 2: Ada is the leading third generation language in terms of
existing weapon system source lines of code.
Figure 7 presents the top five third generation languages in terms of
estimated total SLOC surveyed. Survey responses reported an estimated
49+ million SLOC in Ada and 32+ million SLOC in C. These five languages
represent about 84% of the total estimated third generation SLOC
reported.
Finding 3: Ada is the leading third generation language in terms of
number of weapon system responses indicating usage.
Figure 8 presents the top five third generation languages in terms of
the number of responses reporting specific language use. As can be
seen, 143 responses indicated the use of Ada and 122 responses
indicated the use of C. In comparing Figure 7 and Figure 8, the key
difference is the more frequent reported use of C++, albeit with fewer
total estimated surveyed SLOC. Note that the data presented in Figure 7
do not represent a uniform population (i.e., survey responses address
varying levels of abstraction). See Section 2.6 for details.
Finding 4: Two-thirds of the weapon system responses reported on
application systems of 500,000 or less SLOC.
Figure 9 presents the distribution of responses in terms of the Total
SLOC range selected on the response form. The large number of 1-499K responses is due, in part, to responses at the subsystem level.
Finding 5: Over 70% of the weapon system responses indicated the use of
more than one programming language from all five generations.
Figure 10 presents the distribution of responses in terms of the number
of languages reported on a response form (single subsystem or system).
Finding 6: Multiple versions of third generation languages are being
used in weapon systems.
The 1970s goal of language commonality within the weapon system
community has not yet been reached, even for military standards such as
Jovial and CMS-2 (Figure 11). In addition, at least two versions are in
use for most Federal Information Processing Standards (FIPS).
Different versions of a language are almost always incompatible.
Dialects of a version present subtle but not inconsequential porting
problems, particularly when they are based upon older versions of the
language. For example, 10 or more different dialects of pre-J73 Jovial
are still in use.
4.2 AIS Findings
Finding 7: Most AIS software is being written and maintained in third
generation languages.
Figure 12 shows the SLOC distribution across all generations of
languages used in AIS applications. Table 7 presents the same data
numerically.
The use of first generation language (machine language) is limited to
only one of the AISs. The use of assembly (including proprietary macro
languages) is inconsequential when compared to weapon system
applications.
Table 7. Total SLOC by Language Generation for AISs
Table 8 gives the SLOC estimates, in millions, for third generation
languages.
Table 8. Total SLOC by 3GL for AISs
Finding 8: Cobol is the leading third generation language in terms of
existing AIS source lines of code.
Figure 13 presents the top five third generation languages in terms of
estimated total SLOC reported. Survey responses reported an estimated
22 million SLOC in two versions of Cobol and about 8 million SLOC in
Ada. These five languages represent about 89% of the total estimated
third generation SLOC reported.
Finding 9: Ada is the leading third generation language in terms of
number of AIS responses indicating usage.
Figure 14 shows that the use of Ada was reported by more respondents,
although the number of lines of source code written in Ada is less than
for Cobol.
Finding 10: Most of the AIS responses reported on application systems
in the range of 100K-5,000K SLOC.
Figure 15 shows that 85% of the responses are evenly distributed across
the mid-size range of applications.
Finding 11: Ninety percent of the AISs surveyed indicated the use of
one or more third generation programming languages.
The first column in Figure 16, showing no use of third generation
languages, indicates that some applications are developed only with
fourth generation languages. Fourth generation languages for such
applications as database query, report writing, and screens are not
applicable to weapon system applications except in the support
activities required to construct or maintain applications.
Finding 12: Multiple versions of third generation languages are being
used in AISs.
Figure 17 indicates that Cobol 85, the current FIPS version, has not
had a significant effect on AIS applications, and that older versions
of Fortran exceed the number of applications written in the current
version.
Figure 7. Top Five 3GLs by Total SLOC for Weapon System Responses (Not Shown)
Figure 8. Top Five 3GLs by Reported Usage for Weapon System Responses (Not Shown)
Figure 9. Distribution of Total SLOC Size for Weapon System Responses (Not Shown)
Figure 10. Distribution of Number of Languages Reported by Weapon System Responses (Not Shown)
Figure 11. Comparison of 3GLs with Multiple Versions for Weapon System Responses (Not Shown)
Figure 12. Total SLOC by Language Generation for AIS Responses (Not Shown)
Figure 13. Top Five 3GLs by Total SLOC for AIS Responses (Not Shown)
Figure 14. Top Five 3GLs Reported by AIS Responses (Not Shown)
Figure 15. Distribution of Total SLOC Size for AIS Responses (Not Shown)
Figure 16. Distribution of Number of 3GLs Reported by AIS Responses (Not Shown)
Figure 17. Comparison of 3GLs with Multiple Versions for AIS Responses (Not Shown)
This survey is not a universal census of weapon systems and AISs but
the results reported do represent a substantial and visible portion of
the population. Even though the sample size was constrained by
available time and resources, a systematic method was used and
documented so that others who care to extend the sample size at a later
date will be able to obtain results that are consistent with the
language counting method used in this survey. The responses received
represent over 60% of the programs contacted. We have drawn the
following conclusions about programming languages currently used in the
DoD, based upon findings from the survey:
Conclusion 1:
The issue of how to count languages makes this conclusion open to some
level of debate. There are many dialects of a language version that
some may choose to count as unique languages. If we accept the
historical assertion that at least 450 third generation languages were
used in the late 1970s, we can see that considerable progress has been
made toward reducing the number of programming languages used in DoD.
Conclusion 2:
The fact that Ada usage is not greater in DoD could be due to several
factors. First, production quality Ada compilers and development tools
were not available immediately after the language was adopted as a
standard. There was a lag time of four to five years before compiler
vendors could offer choices of Ada environments for high performance
host/target machines. Second, there is always inertia to overcome
before change can occur, and the DoD software development community's
resistance to DoD policy on the use of Ada perpetuated that inertia.
Third, it takes time to educate and train software engineers and
managers to understand the language and to use it effectively.
There is an unknown quantity of legacy software being maintained by
software support activities that modify code and/or provide data
processing service. Many of these software applications were developed
by contractors and are being maintained by the government using the
language versions and dialects chosen by the development contractor.
The constraints on this survey precluded our systematically collecting
a sample from the software maintained under O&M budgets. However, we
speculate that languages used in the maintenance
community include more use of second generation languages (assembly)
and older versions of third generation languages.
Conclusion 3:
The existence of first generation (machine) language is almost
certainly due to the continued maintenance of fairly old legacy
hardware and software. It is highly unlikely that future new software
will be written in first generation languages, considering the target
computer systems which will be candidates for modernization.
Conclusion 4:
To some extent, the use of second generation languages (assembly) is
also due to the continued maintenance of legacy software. However,
there are specific reasons, other than historical ones, that have
necessitated the use of second generation languages. One of these
reasons is special purpose hardware and, in this case, the need for
second generation languages will almost certainly continue. Another
reason is performance. Ten or more years ago, many systems used second
instead of third generation languages for those parts of the system
that were time critical. Although modern third generation languages,
such as Ada or C, can now meet many such performance requirements, it
is likely that minimal use of assembly language will
continue for some time for its real or perceived performance
properties. However, this will become less of a problem as better
software engineering techniques are used in code generation.
Conclusion 5:
AIS applications have used fourth generation languages as database
management products, graphical user interfaces, and shrink-wrapped
tools have been acquired to improve user services. The SQL standard has
not only promoted relational database products but has provided an
alternative to the continued use of proprietary languages for data
access. The modest use of fourth generation languages by the weapon
system community could indicate that COTS products are seldom used to
develop software or that the respondents did not consider the
development environment as appropriate for this survey.
Conclusion 6:
There are several reasons for the very low usage of fifth generation
languages. One reason is that the immaturity of fifth generation AI
languages argues against their use in operational weapon systems.
Other reasons could be the lack of exploratory R&D programs in the
sample or that many AI problems are being solved with third generation
languages.
Conclusion 7:
For example, the continued use of several versions of CMS-2, Jovial,
Fortran, Cobol, and platform/vendor unique languages may be motivated
by short-term economic views. Re-engineering aids and conversion tools
now make reimplementing existing software more feasible and practical
than continuing to maintain this multi-version software.
Conclusion 8:
Even if only one language were used, software commonality, portability,
and interoperability would be imperfect. With modern programming
languages and compilers, increased use of COTS products and re-use of
software components, it is possible to produce applications with
components written in different languages. Ada, with its specified
pragma interfaces, is a language that is well suited to being used with
other languages in multi-language applications.
EXECUTIVE SUMMARY
Language Generation Total SLOC Reported
(in millions)
------------------- -------------------
First 3.90
Second 26.30
Third:
General Purpose 148.38
Special Purpose 3.70
Fourth 5.00
Fifth 0.29
Language Generation Total SLOC Reported
(in millions)
------------------- -------------------
First 0.30
Second 0.63
Third:
General Purpose 38.24
Special Purpose 0.00
Fourth 10.81
Fifth 0.05
Third Generation Language Total SLOC Reported
and Version (in millions)
------------------------- -------------------
Ada 83 49.70
C 89 32.50
Fortran pre-91/92 18.55
CMS-2 Y 14.32
Jovial 73 12.68
C++ 5.15
CMS-2 M 4.23
Pascal pre-90 3.62
Other 3GLs 3.38
Jovial pre-J73 1.12
Fortran 91/92 1.00
PL/I 87/93 subset 0.64
Basic 87/93 (full) 0.48
PL/I 76/87/93 0.36
Pascal 90 (extended) 0.29
Basic 78 (minimal) 0.17
LISP 0.10
Cobol pre-85 0.09
Cobol 85 0.00
========================= =======
Total 148.38
Third Generation Total SLOC Reported
Language and Version (in millions)
-------------------- ---------------------
Cobol 85 14.06
Cobol pre-85 8.59
Ada 83 8.47
Basic 87/93 2.18
C++ 2.05
C 89 1.55
Fortran 91/92 0.87
Fortran pre-91/92 0.47
================= =====
Total 38.24
1. INTRODUCTION
1.1 Purpose
1.2 Background
The DoD does not maintain "corporate level" information on programming
languages used in contemporary software projects. Therefore, gaining a
reasonably accurate understanding of programming languages being used
in the DoD required input from the organizations responsible for
developing or maintaining individual systems. Accordingly, these
organizations are the primary source for this survey data.
1.3 Approach
1.4 Language Counting Issues
1.5 Scope
1.6 Organization
2. SURVEY METHOD
2.1 Population Identification
2.1.2 Automated Information Systems Population
2.2 Sample Selection
2.3 Data Collection Form
Languages were grouped on the data collection form by these generations
and listed by name and version within the third generation languages
category. We decided not to ask for name and version of first, second,
fourth, and fifth generation languages because that type of data would
require an inordinate amount of research effort for respondents to
provide and for us to validate.
The key information desired from each survey respondent included the
following items:
Secondary information desired from each survey respondent included the
following items:
A pilot survey was conducted using a preliminary version of the data
collection form. Improvements were made according to suggestions made
by several respondents as well as by analysis of their responses.
Appendix A provides a copy of the final data collection form.
2.4 Contact Process
2.5 Respondent Errors
2.6 Analysis Process
"Total SLOC" Range Value Assigned
Marked on Response Form
----------------------- --------------
1-100K 75K
100-500K 300K
500-1,000K 750K
1,000-5,000K 3,000K
5,000+K 6,000K
"% of Total" System Value Assigned
Marked on Response Form
----------------------- --------------
<5% 2.5%
5-25% 15.0%
25-50% 37.5%
50-75% 62.5%
>75% 87.5%
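Taken together, Tables 1 and 2 imply how a per-language SLOC estimate is derived for each response: the value assigned to the "Total SLOC" range is multiplied by the value assigned to the language's percentage range. The multiplication step is our reading of the method rather than a verbatim statement from the text; a minimal sketch:

```python
# Values assigned to "Total SLOC" ranges (Table 1), in KSLOC.
TOTAL_SLOC = {
    "1-100K": 75,
    "100-500K": 300,
    "500-1,000K": 750,
    "1,000-5,000K": 3000,
    "5,000+K": 6000,
}

# Values assigned to "% of Total" ranges (Table 2).
PERCENT = {
    "<5%": 0.025,
    "5-25%": 0.15,
    "25-50%": 0.375,
    "50-75%": 0.625,
    ">75%": 0.875,
}

def language_sloc(total_range: str, percent_range: str) -> float:
    """Estimated KSLOC for one language on one survey response."""
    return TOTAL_SLOC[total_range] * PERCENT[percent_range]

# A response of 500-1,000K total SLOC with one language marked 25-50%
# is counted as 750K * 0.375 = 281.25 KSLOC for that language.
print(language_sloc("500-1,000K", "25-50%"))
```

Summing such per-response estimates across all responses yields the language totals reported in Section 4.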
3. RESPONDENT AND PROGRAMMATIC PROFILE
3.1 Weapon System Responses
3.2 AIS Responses
4. LANGUAGE USAGE FINDINGS
4.1 Weapon System Findings
Language Generation Total SLOC Reported
(in millions)
------------------- -------------------
First 3.90
Second 26.30
Third:
General Purpose 148.38
Special Purpose 3.70
Fourth 5.00
Fifth 0.29
Third Generation Total SLOC Reported
Language and Version (in millions)
-------------------- -------------------
Ada 83 49.70
C 89 32.50
Fortran pre-91/92 18.55
CMS-2 Y 14.32
Jovial 73 12.68
C++ 5.15
CMS-2 M 4.23
Pascal pre-90 3.62
Other 3GLs 3.38
Jovial pre-J73 1.12
Fortran 91/92 1.00
PL/I 87/93 subset 0.64
Basic 87/93 (full) 0.48
PL/I 76/87/93 0.36
Pascal 90 (extended) 0.29
Basic 78 (minimal) 0.17
LISP 0.10
Cobol pre-85 0.09
Cobol 85 0.00
==================== ======
Total 148.38
Language Purpose Total SLOC Reported
(in millions)
-------- ------- -------------------
ATLAS Equipment Checkout 1.38
VHDL Hardware Description 0.18
CDL Hardware Description 0.22
GPSS Simulation 0.04
Simulink Simulation 0.06
CSSL Simulation 0.01
ADSIM Simulation 0.02
SPL/1 Signal Processing 1.62
SPL Space Programming 0.01
Language Purpose (unverified)
-------- --------------------
DTC
LISA Language for Systolic Array Processor
PIL HARM Program Implementation Language
PLM
PLM-51
PLM-86
Pspice
REXX HOL
TACL TSC
VTL
Language Generation Total SLOC Reported
(in millions)
------------------- -------------------
First 0.30
Second 0.63
Third:
General Purpose 38.24
Special Purpose 0.00
Fourth 10.81
Fifth 0.05
Third Generation Total SLOC Reported
Language and Version (in millions)
-------------------- -------------------
Cobol 85 14.06
Cobol pre-85 8.59
Ada 83 8.47
Basic 87/93 2.18
C++ 2.05
C 89 1.55
Fortran 91/92 0.87
Fortran pre-91/92 0.47
================= =====
Total 38.24
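As a quick consistency check, the Table 8 entries can be summed and compared against the reported total:

```python
# Table 8 third generation AIS SLOC estimates, in millions.
table_8 = [14.06, 8.59, 8.47, 2.18, 2.05, 1.55, 0.87, 0.47]

total = round(sum(table_8), 2)
print(total)  # 38.24, matching the Total row
```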
5. CONCLUSIONS AND DISCUSSION
Language Type | Language Name and Version | % of Total | |||||
. | <5% | 5-25% | 25-50% | 50-75% | >75% | |
First Generation | Machine | . | . | . | . | . | |
Second Generation | Assembly (Provide Count of Distinct Versions Being Used): ___________ | . | . | . | . | . | |
Third Generation | Ada 83 | . | . | . | . | . | |
ALGOL | ALGOL 60 | . | . | . | . | . | |
ALGOL 68 | . | . | . | . | . | ||
APL 89 | . | . | . | . | . | ||
BASIC | BASIC 78 (minimal) | . | . | . | . | . | |
BASIC 87/93 (full) | . | . | . | . | . | ||
C 89 | . | . | . | . | . | ||
C++ (identify version on page 4) | . | . | . | . | . | ||
CHILL 89 | . | . | . | . | . | ||
COBOL | COBOL pre-85 | . | . | . | . | . | |
COBOL 85 | . | . | . | . | . | ||
CMS-2 | CMS-2 Y | . | . | . | . | . | |
CMS-2 M | . | . | . | . | . | ||
FORTRAN | FORTRAN pre-91/92 | . | . | . | . | . | |
FORTRAN 91/92 | . | . | . | . | . | ||
JOVIAL | JOVIAL pre-J73 | . | . | . | . | . | |
JOVIAL J73 | . | . | . | . | . | ||
LISP (identify version on page 4) | . | . | . | . | . | ||
MUMPS | MUMPS pre-90 | . | . | . | . | . | |
MUMPS 90 | . | . | . | . | . | ||
Pascal | Pascal pre-90 | . | . | . | . | . | |
Pascal 90 (extended) | . | . | . | . | . | ||
PL/I | PL/I 76/87/93 | . | . | . | . | . | |
PL/I 87/93 subset | . | . | . | . | . | ||
PROLOG (identify version on page 4) | . | . | . | . | . | ||
SIMULA | SIMULA pre-67 | . | . | . | . | . | |
SIMULA 67 | . | . | . | . | . | ||
Smalltalk (identify version on page 4) | . | . | . | . | . | ||
TACPOL | . | . | . | . | . | ||
Others: list and identify on page 4 | . | . | . | . | . | ||
Fourth Generation | e.g., SQL, RPG, Clipper, Visual BASIC | . | . | . | . | . | |
Fifth Generation | e.g., Knowledge/rule base shells | . | . | . | . | . |
Application Area | Generic Language Name | Version Name and/or Number | % of Total | ||||
. | <5% | 5 - 25% | 25 - 50% | 50 - 75% | >75% | |
Equipment Checkout | ATLAS | . | . | . | . | . | . |
Hardware Description | VHDL | . | . | . | . | . | . |
CDL | . | . | . | . | . | . | |
Simulation | GPSS | . | . | . | . | . | . |
SIMSCRIPT | . | . | . | . | . | . | |
CSSL | . | . | . | . | . | . | |
Signal Processing | SPL/1 | . | . | . | . | . | . |
Space Programming | SPL | . | . | . | . | . | . |
Statistics | SPSS | . | . | . | . | . | . |
SAS | . | . | . | . | . | . | |
Robotics Languages | AL | . | . | . | . | . | . |
AML | . | . | . | . | . | . | |
KAREL | . | . | . | . | . | . | |
Expert System Languages | KRL | . | . | . | . | . | . |
OPS5 | . | . | . | . | . | . |
The following definitions are provided for language generation: