1. INTRODUCTION

1.1 Purpose

This paper reports the results of a programming language survey commissioned in June 1994 by the Honorable Emmett Paige, Jr., Assistant Secretary of Defense for Command, Control, Communications and Intelligence, and funded by the Defense Information Systems Agency, Center for Software, DoD Software Initiatives Department. The motivation for the survey was a desire to know how many programming languages are being used in the Department of Defense (DoD) today as compared to 20 years ago when the DoD began development of the Ada language.

1.2 Background

We reviewed studies that preceded and succeeded formation of the DoD High Order Language Working Group (HOLWG) in the mid-1970s to locate a primary source for a list of languages then in use within DoD. Two major software problems were under study at that time. The first was the trend toward unaffordable costs for DoD embedded systems software and the second was the potential proliferation of Service-unique programming languages.

Software cost studies of this period did not reference specific programming languages, presumably because software development costs did not appear to vary as a function of the specific programming language being used [AF-CCIP 1973, Fisher 1974]. These studies extrapolated total and projected costs based upon other factors (e.g., labor rates, purchase price, and maintenance costs for hardware and system software used to develop embedded systems).

In 1974, each Military Department independently proposed the adoption of a common programming language for use in the development of its own major weapon systems. The then-Director of Defense Research and Engineering (DDR&E), Malcolm R. Currie, called upon the Military Departments to "- immediately formulate a program to assure maximum useful software commonality in the DoD" [Fisher 1977, p. 7]. The establishment of the HOLWG was the Services' response to DDR&E.

The Technical Advisor to the HOLWG, Dr.
David Fisher, and the Defense Advanced Research Projects Agency sponsor, Colonel William A. Whitaker, have written historical accounts of HOLWG activities but these published papers do not document a list of programming languages in use while the HOLWG effort proceeded [Fisher 1977, Whitaker 1993]. However, Fisher's paper, which summarizes the technical requirements for a common programming language, contains the following reference to languages in use:

    There are at least 450 general-purpose languages and dialects currently used in the DoD, but it is not known whether the actual number is 500 or 1500. With few exceptions, the only languages used in data processing and scientific applications are, respectively, Cobol and Fortran. A larger number of programming languages are used in embedded computer systems applications. [Fisher 1976, p. 6]

As part of the present study, Dr. Fisher was contacted concerning the origin of the oft-quoted number of 450 languages being used. He did not recall that a systematic count of languages and versions had been done by the HOLWG. Although there may be papers or reports containing a list of programming languages used by DoD, we were unable to locate them through the open literature resources for use in this study.

The analytical method used in the study of DoD software costs approximated the number of compilers installed on general purpose computers. Software cost estimates were derived from analysis of data that the Services were required to report to the General Services Administration under the requirements of the Brooks Act (1965). This data included the numbers, configurations, models, locations, initial cost, and utilization of computer systems.

Questions remain about the 450 estimate, including the following:

- How was the estimate of programming languages being used in weapon systems derived? These systems were not subject to reporting under the Brooks Act.
- How many of the 450 programming languages were special purpose languages?
- How many of the 450 programming languages were minor dialects of major versions?

The DoD does not maintain "corporate level" information on programming languages used in contemporary software projects. Therefore, gaining a reasonably accurate understanding of programming languages being used in the DoD required input from the organizations responsible for developing or maintaining individual systems. Accordingly, these organizations are the primary source for this survey data.

1.3 Approach

This study began with the identification of data elements needed for an analysis of programming language usage in the development or maintenance of DoD weapon systems and Automated Information Systems (AISs). The 1994 Presidential Budget was used to select a sample of weapon systems to survey. The current DoD list of major AISs was used to select a sample to survey. Service and DoD program offices provided the data on the programming languages being used to develop or maintain their operational and support software.

The primary data reported included the generations and names of the programming languages being used and the amount (source lines) of software written in each programming language expressed as a percentage of the total system. Additional data reported includes the acquisition category and life-cycle phase of the program.

A data collection form was designed to record the data elements identified by the survey respondents. Potential respondents were contacted by telephone to get their agreement to participate in the survey. The data collection form was then faxed to each participant and responses were analyzed to extract the information reported in this study.

1.4 Language Counting Issues

The classification of programming languages for counting purposes has always been, and continues to be, a highly debated subject on which experts differ in definitions and philosophy.
Even when definitions are generally agreed upon, the application of the definition in a particular case is often difficult, with results depending on the judgement of a person.

For the purposes of this report, the key issue is the difference between "version" and "dialect." We use the term "dialect" to indicate a relatively minor change in a language whereas "version" indicates a larger change and usually has a different "name" although the new "name" may only be the concatenation of a different year or number to the baseline name (e.g., Jovial, Jovial 73).

While these definitions may appear to be abstract issues of interest only to language specialists, they actually have a profound effect on portability, interoperability, and counting. If a dialect (involving small changes) is involved, training and portability may be easier than with a new "version." A dialect would normally not be considered a separate language. A version may or may not be considered a separate language, depending on the purposes of the counting. In this report we counted historical versions that divide conveniently between pre- and current version years.

Because the practical usage of programming languages is generally at the third generation level, this survey concentrates on this level while still collecting some minimal data for other generations of languages. Consequently, the results from this survey can be compared only in a general way with the historical assertion about "450" general purpose languages as a practical illustration of what is happening in the DoD environment.

1.5 Scope

The results of this survey are drawn from a limited sample of DoD weapon systems and AISs; therefore, the survey does not provide an exact and detailed record of computer programming language usage in the DoD. Several constraints affected the precision of the results:

- The study's sponsors were primarily interested in knowing the primary languages being used in DoD.
  A detailed, comprehensive inventory of computer programming language usage in the DoD was not called for. Therefore, the following types of software were partially or wholly excluded from the survey:
  - Software being developed at Service and DoD research laboratories
  - Software being developed for highly classified systems
  - Commercially purchased software
  - Firmware
  - Software funded by Operations and Maintenance (O&M)
  - Software below the funding level for Presidential budget-line identification
- The effort required by respondents to complete the survey form was to be minimized. Therefore, trade-offs were made in the amount and detail of information requested.
- The resources available for the conduct of the survey were limited.

1.6 Organization

A description of the methods used to identify the survey population and sample is found in Section 2. A profile of the survey respondents is presented in Section 3. Analysis of the programming language data obtained by the survey is provided as findings in Section 4. Section 5 summarizes the conclusions drawn from survey results. Section 6 contains the recommendation. Appendix A contains the survey instrument and Appendix B provides the data obtained during the survey. We have provided as much detail as possible about the method and response data with the intent of providing a documented baseline for future language studies.