Computer Sciences Corporation
Module 7, Endeavour House
Fourth Avenue, Technology Park
The Levels, SA 5095 Australia
kgramp@csaadel.adl.csa.oz.au
Abstract - Advances in computing power are being achieved by increasing the number of processors as well as increasing individual processor speeds. Concurrently, operating systems and languages are being enhanced with tasking facilities including lightweight threads.
This paper discusses the use of tasks in real-time systems. It uses three mobile-satellite-based telecommunication software designs to compare different approaches to using tasks. The first two projects have been written in Ada and the last has been prototyped using Ada. The first is the ground station software for the Optus MobileSat system in Australia. It essentially uses single-tasked programs. The second is a simulator for the ground station hardware, satellite, air interface and mobile terminals. It uses a task for each hardware component, involving hundreds of tasks. The last is for a future mobile satellite system. It uses individual tasks for each action or operation, including individual calls, leading to tens of thousands of tasks.
The basic requirements of a mobile satellite system are outlined, highlighting the features which make a design with tens of thousands of tasks feasible. The task structure for each design is covered in detail. Advantages and disadvantages are discussed, covering design, testing, performance, prediction, scalability and how tasks allow the designer to take advantage of the current trends in computer architecture and computer languages. The problems of designing, predicting, developing and debugging programs with thousands of tasks are discussed.
INTRODUCTION
Today's performance requirements for real time systems are demanding higher levels of processing power. The requirements of these real time systems have varying degrees of inherent concurrency. Increases in computer processing power are being achieved both by the increased speed of processors and by the increased number of processors within a computer.
A task (also called a thread) is an execution of a program with its own context. Many tasks can share the same address space providing an efficient concurrency solution. Tasks are becoming commonly used in all forms of software, with real time software having always used multi-tasked designs. Operating systems are now changing [1] to provide thread or task facilities which allow the efficient use of multiprocessors by a group of tasks sharing the same address space.
Three designs are discussed: two current implementations, the first for a mobile satellite communication system and the second for a simulator of its hardware, and a proposed future design for a mobile satellite communication system. Each design has a different task architecture. The advantages and disadvantages of the different designs are reviewed.
COMPUTER HARDWARE ARCHITECTURES
The most common configuration for computer systems today is a single processor or execution unit addressing a block of main memory (Figure 1). Over the past years improvements in processing power have been achieved by increasing processing speed of the single processor and increasing the size of memory to reduce I/O accesses.
Figure 1 - Single Processor
Figure 2 - Distributed Computers
Figure 3 - Multiprocessor Computer
OPERATING SYSTEM ARCHITECTURES
For most of the evolution of operating system design there has been a need to provide some form of concurrency. This concurrency can be conceptual or real, depending on whether there is a single processing unit or multiple processing units.
Operating systems today provide a facility for running many processes concurrently. Each process is provided with its own address space and is managed and protected by the operating system (Figure 4). Communication between the processes is restricted and again managed by the operating system. To provide the protection between processes all operations by the operating system involve processor mode changes and protection checks. In many cases these mode changes and checks dominate the cost of the operations being requested by the processes.
Figure 4 - Processes within one Computer
Figure 5 - Distributed processes across Computers
Figure 6 - Threads within one Process
There are emerging standards for tasks or lightweight threads, and most proprietary operating systems are providing some form of thread management. The POSIX operating system standard [1] now includes a set of library routines for managing threads. This management includes thread creation and deletion, thread synchronization and thread communication. The Ada programming language has always included threads (called tasks), and Ada kernels are now starting to use the thread facilities provided by the underlying operating system, thus taking advantage of non-blocking I/O calls and multiprocessor capabilities [2].
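The three thread-management facilities named above (creation, synchronization and communication) can be illustrated with a minimal sketch. It uses Python's standard `threading` and `queue` modules as a stand-in, not the POSIX C interface or Ada tasking, and the names `run_workers` and `worker` are hypothetical.

```python
import queue
import threading


def run_workers(n_workers, items):
    """Sum `items` using n_workers threads that share one address space."""
    work = queue.Queue()      # communication: thread-safe message queue
    total = 0                 # shared data in the common address space
    lock = threading.Lock()   # synchronization: protects `total`

    def worker():
        nonlocal total
        while True:
            item = work.get()
            if item is None:  # sentinel: no more work for this thread
                break
            with lock:        # critical region around the shared data
                total += item

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:         # thread creation
        t.start()
    for item in items:
        work.put(item)
    for _ in threads:         # one sentinel per worker
        work.put(None)
    for t in threads:         # wait for every thread to terminate
        t.join()
    return total
```

Because all workers live in one process, the queue and the running total are passed by reference rather than copied between address spaces, which is the efficiency argument made for threads above.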
DESIGN EXAMPLES
The following are three examples of different software architecture designs. Each design has used tasks to varying degrees. All designs are real-time systems for parts of a mobile satellite communication system. Ada83 has been used for the first two examples and, based on the experience gained, Ada95 will be used for future development. The tasking facilities in Ada allowed a controlled and well defined use of tasks with extensive error checking both at compile and run time. Ada95's improved tasking facilities and protected types will allow less and possibly no deviation from the language, thus improving error checking and detection.
SINGLE TASK DESIGN
MobileSat is a mobile satellite communications system, run by Optus, an Australian telecommunications service provider. Mobile terminals communicate with the terrestrial telephone system via geostationary satellites and two ground stations positioned on the west and east coasts of Australia. The coverage includes all of Australia, extending 200 miles out from the coast. The current MobileSat ground station software [3] has been developed by CSC Australia for Optus. It is a fault tolerant system using a number of redundant networked single processor computers. There is a requirement to process up to 1000 calls simultaneously at a call rate of 17 calls per second. Figure 7 shows the general architecture of MobileSat. It includes mobile terminals, geostationary satellites and two ground stations.
Figure 7 - MobileSat Architecture
Figure 8 - MobileSat Process Architecture
The advantages of this architecture include:
* No knowledge of task management and communication is required by developers. Since all tasks are restricted to infrastructure and only one task is handling the application code, most developers do not need to directly handle or understand task management, communication and synchronization.
* No data protection is required for the majority of code and data. No critical regions need to be implemented. Since there is only one task processing all application code, data is inherently protected since only one part of the application code can be running at any one time.
* There is implicit protection between processes on the one computer. Where the design uses concurrency, it is restricted to separate processes which have separate address spaces and therefore have implicit protection.
The disadvantages of this architecture include:
* Processes cannot take advantage of a multiprocessor computer's performance. Since there is only one task performing the majority of the work, only one processor can be used at any one time by a process and scalability is limited.
* There is complex routing and distribution of messages to application code. All messages need to pass through the distributor into the correct application areas. This leads to complex routing which is difficult to code, debug and test. This is further complicated by the need for the application code to cooperatively preempt itself when computation is long, by returning to the distributor and continuing at some later time. The continuation of the application code is achieved by the code setting either a timer or sending a message to itself. In both cases the context of the cooperative preempt must be saved.
* Timing facilities are complex. Management of delays and time-outs needs to go back through the distributor, leading to complex handling and routing of timers both at the distributor and in the application code.
* Hiding of task management from developers. This has been stated as an advantage, but it can also be a disadvantage with developers not having to understand task design and therefore not appreciating some of the performance problems certain coding techniques can produce.
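The distributor and the cooperative preemption described in the list above can be sketched as follows. This is an illustration only, not the MobileSat implementation: the `Distributor` class, the message kinds and the `sum_in_slices` handler are hypothetical, and the long computation continues by sending a message to itself with its saved context.

```python
from collections import deque


class Distributor:
    """Single-task message loop that routes messages to application code."""

    def __init__(self):
        self.handlers = {}       # message kind -> application handler
        self.messages = deque()  # pending messages in arrival order

    def register(self, kind, handler):
        self.handlers[kind] = handler

    def send(self, kind, payload):
        self.messages.append((kind, payload))

    def run(self):
        # Only one handler runs at a time, so application data needs no locks.
        while self.messages:
            kind, payload = self.messages.popleft()
            self.handlers[kind](self, payload)


def sum_in_slices(dist, context):
    """Long computation that preempts itself after every slice of work."""
    start = context["next"]
    end = min(start + context["slice"], len(context["data"]))
    context["total"] += sum(context["data"][start:end])
    context["next"] = end
    if end < len(context["data"]):
        # Save the context in a message to itself, then return to the distributor.
        dist.send("sum", context)
    else:
        context["done"] = True


def demo():
    log = []
    dist = Distributor()
    dist.register("sum", sum_in_slices)
    dist.register("call", lambda d, payload: log.append(payload))
    ctx = {"data": list(range(10)), "slice": 4, "next": 0,
           "total": 0, "done": False}
    dist.send("sum", ctx)
    dist.send("call", "setup call 1")
    dist.run()
    return ctx, log
```

Even in this small form the application code has to carry its own context (`next`, `total`) between slices, which is the bookkeeping the list above identifies as hard to code, debug and test.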
MANY TASKS DESIGN
To test updated versions of the MobileSat software a simulator [4] of the MobileSat hardware has been produced. This simulator has been developed with a very different design to the MobileSat software. It was decided that each component of the simulator would be self-contained and based on a task or a number of tasks (Figure 9). Each hardware item being simulated is represented by a task. All shared data is protected inside a task. The number of tasks in the simulator is dynamic, ranging from hundreds to thousands of tasks, depending on the amount of hardware being simulated.
Figure 9 - MobileSat Simulator Architecture
The advantages of this architecture include:
* Design matches the problem domain. Many problem domains, including this one, are naturally concurrent and therefore a design which uses explicit concurrency is a natural solution.
* Explicit task use leads to a design which is clearer and easier to understand. Using tasks provides a natural form of modularisation, with communication and synchronization of tasks handled by the Ada kernel. This has led to simpler and clearer code. With well defined and consistent interfaces between tasks and groups of tasks the design is easy to expand and modify. For example, the air task implements the delay in signal propagation to and from a geostationary satellite. Modifying this task to include signal propagation errors has been considered.
* There is efficient use of processing power. While some tasks are waiting on external operations to finish (e.g. I/O) other tasks can execute. This capability is based on the assumption that tasks performing I/O do not block other tasks within the same process from execution, a feature now found in most multiprocessor Ada kernels and the underlying operating systems.
* The solution is scalable. If many tasks are used then increased throughput can be obtained by increasing the number of processors in a multiprocessor system.
* There is efficient context switching and communication. Modern kernels or operating systems handle task context switching in libraries which execute as part of the process. This provides very efficient context switching compared to the context switching of processes, where processor mode changes are required. The common address space of tasks provides efficient communication between tasks, with no address translation or context switching required.
* Off the shelf scheduling algorithms are available. The tasks are managed by a kernel which usually provides a number of scheduling choices, the main ones being priority based and time slicing with combinations of both. This provides flexibility in design by setting priorities according to the critical requirements of each task and allowing preemption of computationally long tasks. This should be compared with the explicit handling of scheduling, timing and context switching required in the previous architecture.
* Simple and efficient timing facilities. By restricting each task to one function, time-outs or delays can be handled simply by delaying in the task and, if necessary, waiting for some event to occur. This is efficient since the kernel is more likely to be optimized to handle delays than explicit code produced by the developer.
* With high level languages like Ada, compile and run time checking is performed on the task constructs, leading to early and efficient error detection.
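The air task and the simple timing facilities mentioned in the list above can be sketched as one thread per simulated component. The code below is an assumption-laden illustration in Python rather than the simulator's Ada code: `air_task`, `simulate` and the 0.05 second delay are hypothetical and do not reflect the real propagation delay.

```python
import queue
import threading
import time


def air_task(inbox, outbox, delay_s):
    """Forward each message after a fixed propagation delay."""
    while True:
        item = inbox.get()
        if item is None:           # sentinel: shut the task down
            outbox.put(None)
            break
        sent_at, message = item
        remaining = sent_at + delay_s - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)  # the task simply delays; no timer routing
        outbox.put(message)


def simulate(messages, delay_s=0.05):
    """Send messages through the air task and collect what arrives."""
    uplink, downlink = queue.Queue(), queue.Queue()
    air = threading.Thread(target=air_task, args=(uplink, downlink, delay_s))
    air.start()
    start = time.monotonic()
    for message in messages:
        uplink.put((time.monotonic(), message))
    uplink.put(None)
    received = []
    while True:
        message = downlink.get()
        if message is None:
            break
        received.append(message)
    air.join()
    return received, time.monotonic() - start
```

The delay is a single blocking call inside the task. Extending the component, for example to corrupt some messages, would only change the body of `air_task` and leave its queue interfaces untouched.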
The disadvantages of this architecture include:
* Too many tasks can lead to overheads in scheduling and synchronization. As the number of tasks increases the cost of scheduling may increase at a greater rate. A good kernel implementation should not allow this to happen, though an increase in the number of delaying tasks will increase the searching time when inserting new delaying tasks. Even in this case if the tasks were not used, explicit code would need to be produced to handle the many delays, leading to similar or greater costs. In a good design the number of tasks should match the concurrency of the problem. In this case the cost of the scheduling by the kernel will be the same or more likely less than the cost of scheduling being performed explicitly by the developer when few tasks are used.
* Many tasks need to share common data. Common data needs to be protected, leading to potential access contention, which may affect performance. If shared data access becomes a major computational problem then the scalability advantages of multiprocessor computers are lost. A good design that minimizes the cost of access and the number of accesses is essential to obtaining high performance.
* Developers need to understand the use of tasks. As mentioned earlier with the single task architecture this can be considered an advantage as well as a disadvantage.
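One way to protect common data, in the spirit of the simulator's rule that shared data lives inside a task, is a server thread that owns the data and is reached only through messages. The sketch below is hypothetical (`CallCounter` and `exercise` are illustrative names) and uses Python queues to imitate a rendezvous. It is not the Ada code used in the simulator.

```python
import queue
import threading


class CallCounter:
    """Server thread owning a counter; clients reach it only by messages."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve)
        self._thread.start()

    def _serve(self):
        active = 0                      # touched by this thread only
        while True:
            op, reply = self._requests.get()
            if op == "stop":
                reply.put(active)
                break
            active += 1 if op == "setup" else -1
            reply.put(active)

    def _call(self, op):
        reply = queue.Queue(maxsize=1)  # caller blocks here for the answer
        self._requests.put((op, reply))
        return reply.get()

    def setup(self):
        return self._call("setup")

    def clear(self):
        return self._call("clear")

    def stop(self):
        result = self._call("stop")
        self._thread.join()
        return result


def exercise(n_clients=8, calls_each=50):
    """Hammer the counter from several client threads; return the final count."""
    counter = CallCounter()

    def client():
        for _ in range(calls_each):
            counter.setup()
            counter.clear()

    clients = [threading.Thread(target=client) for _ in range(n_clients)]
    for c in clients:
        c.start()
    for c in clients:
        c.join()
    return counter.stop()
```

Every access is serialized through one thread, so the data is safe, but the same serialization is the contention point described above: if clients spend most of their time waiting on `_call`, adding processors will not help.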
THOUSANDS OF TASKS DESIGN
MobileSat has performance requirements of 1000 simultaneous telephony calls with a peak rate of call establishment and clear down of 17 calls / second. Future mobile communication systems being considered have a maximum number of simultaneous calls in the region of 10,000, with call rates of 100 calls / second. To handle this ten-fold increase in load using faster uniprocessor computers is not economical. The trend to multiprocessor computers lends itself to the use of many tasks in a design.
Current proposed software architecture solutions to these increased performance requirements involve using tasks for each concurrent activity in the system. This includes individual tasks for all devices and calls. A task per device can lead to thousands of tasks with a task per call leading to tens of thousands of tasks (Figure 10). The simulator described above has been a prototype of the proposed solution and has shown that not only is performance achieved, but development has been both more efficient and more robust.
Figure 10 - Future Mobile Communications Architecture
These tests also measured the actual cost of context switching. As pointed out with the single task design, context storage and switching effectively need to be performed by the application code to allow concurrent calls to be handled. Having a task per call adds no more context switches and improves performance by allowing the kernel to handle the context switching more efficiently.
An enormous amount of research effort has been, and is being, expended on scheduling problems when dealing with many tasks [6][7]. Running very large numbers of tasks, as outlined above, seems to point to potential scheduling problems. The difference between the above architectures and general tasking architectures is the homogeneity of the tasks. The large numbers of tasks are all clones of a small set of tasks. Further, the single task architecture has thousands of concurrent activities being handled explicitly by the application code. With the many task solution the concurrency structure is still the same, with the scheduling being handled by the Ada kernel.
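The task-per-call idea, with every call task a clone of one small task body, can be sketched as below. This is a hedged illustration using Python threads, which are heavier than Ada tasks, so the count is kept small. The names `call_task` and `run_calls` and the three call states are assumptions made for the example.

```python
import queue
import threading


def call_task(call_id, events, histories, lock):
    """One task per call; the call's context is held in local variables."""
    state = "setup"
    history = [state]
    while state != "cleared":
        event = events.get()          # block until this call's next event
        if state == "setup" and event == "answer":
            state = "connected"
        elif event == "hangup":
            state = "cleared"
        else:
            continue                  # ignore events that do not apply
        history.append(state)
    with lock:
        histories[call_id] = history


def run_calls(n_calls):
    """Create n_calls identical call tasks and drive each through its life."""
    histories = {}
    lock = threading.Lock()
    inboxes = [queue.Queue() for _ in range(n_calls)]
    tasks = [
        threading.Thread(target=call_task,
                         args=(i, inboxes[i], histories, lock))
        for i in range(n_calls)
    ]
    for task in tasks:
        task.start()
    for inbox in inboxes:
        inbox.put("answer")
    for inbox in inboxes:
        inbox.put("hangup")
    for task in tasks:
        task.join()
    return histories
```

While a call waits for its next event, its state stays in the blocked thread's local variables. No explicit context save or self-addressed message is needed, and the scheduling of the identical tasks is left to the runtime.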
CONCLUSION
Two quite different software architectures for a real time system have allowed the advantages and disadvantages of task usage in a design to be evaluated. With the need for more processing power, and the trend to multiprocessor computers, using threads or tasks as a fundamental and visible part of the design allows scalability in performance by increasing the number of processors in a computer.
The MobileSat simulator design has shown that as well as performance advantages, when the use of tasks matches the concurrency of the problem, there are real benefits in the development process. The design is inherently modular, functional code becomes simpler through use of the kernel facilities and the code is much easier to expand. Developers need to become knowledgeable of task development but this does provide them with a better understanding and appreciation of the performance and concurrency problems of a real time design.
The experience gained from both the single-tasked design of MobileSat and the multi-tasked design of its simulator has allowed the use of an expanded simulator design for a proposed higher performance mobile satellite communication system. This design will be more robust and will take advantage of multiprocessor computers to meet the higher performance requirements.
DEFINITIONS
Application Code - The software code performing the functions required. For example, connecting a call.
Computer - A physical entity consisting of processors, memory, mass storage and I/O interfaces. A computer can have one or more processors.
Cooperative Preemption - When a task or process preempts itself by returning control to the kernel or operating system.
Distributed System - A group of Computers connected by a communication network.
Distributor - The software that routes messages to specific application code. It includes management of timers, prioritizing of messages and handling the context of each piece of application code.
Infrastructure Code - The software supporting the application code.
Kernel - Refers to the small operating system managing the scheduling, communication and synchronization of tasks within a process.
Operating System - The software managing the processes running on a computer.
Preemption - When a task or process is stopped from running on a processor.
Process - An execution of a program with its own address space. Managed by the operating system. A process is made up of one or more threads of execution.
Processor - An execution unit of a computer.
Router - A simplified distributor. It only routes messages to the appropriate thread. The other distributor functions are handled by the thread kernel.
Task - Same as Thread.
Thread - A sequential execution of part of a program. There is no concurrency in a thread.
Task Synchronization - When two tasks synchronize in time. Usually occurs by one task calling an entry point in another task. At the time of synchronization information can be passed.
REFERENCES
[1] Technical Committee on Operating Systems and Application Environments of the IEEE. Portable Operating System Interface (POSIX) - Part 1: System Application Program Interface (API) - Amendment 2: Threads Extension [C Language] (Draft 7), October 1993. P1003.4a/D8.
[2] Karen L. Sielski, "Implementing Ada Tasking in a Multiprocessing, Multithreaded Unix Environment", Proceedings of the TRI-Ada'92 Convention, pp. 512-517, 1992.
[3] H. Nguyen, "Software for the MobileSat network management", Proceedings of the 2nd Australian Conference on Telecommunication Software, pp. 53-60, 1993.
[4] Hardware Simulator Operator's Manual for the MobileSat II Project, CSC Australia, Sept 1995, CSC-50037-00-D-00-0011.
[5] K. J. Gramp, Scalability of Base 21, CSC Australia, 11 Jan 1995, CSC-5213-00-D-01-0001.
[6] L. Sha, R. Rajkumar and J.P. Lehoczky, "Priority Inheritance Protocols: An Approach to Real-Time Synchronisation", IEEE Transactions on Computers, Vol. 39, No. 9, pp. 1175-1185, September 1990.
[7] J. Stankovic, M. Spuri, M. Di Natale and G. Buttazzo, "Implications of Classical Scheduling Results for Real-Time Systems", IEEE Computer, pp. 16-25, June 1995.