Monday, December 12, 2011

Chapter 28

Part 5     Advanced Topics in Software Engineering

Chapter 28
Formal Methods


CHAPTER OVERVIEW AND COMMENTS

This chapter presents an introduction to the use of formal methods in software engineering. The focus of the discussion is on why formal methods allow software engineers to write better specifications than can be done using natural language. Students without precious exposure to set theory, logic, and proof of correctness (found in a discrete mathematics course) will need more instruction on these topics than is contained in this chapter.  The chapter contains several examples of specifications that are written using various levels of rigor. However, there is not sufficient detail for a student to learn the language (supplementary materials will be required).
If your students are familiar with a specification language (like OCL or Z) they should be encouraged to use it to write formal specifications for at least one of their own functions. If they are not familiar with a specification language, it may be worthwhile to teach them one (if your course is two semesters long).  Having students write correctness proofs for their function specifications will be difficult and you may need to review proof techniques and heuristics if you wish students to write proofs.

28.1  Basic Concepts

This section discusses the benefits of using formal specification techniques and the weaknesses of informal specification techniques. Many of the concepts of formal specification are introduced (without mathematics) through the presentation of three examples showing how formal specifications would be written using natural language. It may be worthwhile to revisit these examples after students have completed the chapter and have them write these specifications using mathematical notation or a specification language (like OCL or Z).



28.2  Mathematical Preliminaries

A review of the mathematics needed for the remainder of the chapter appears in this section. If your students have completed a good course in discrete mathematics, this review should be adequate. If they have not, some additional (and more concrete) examples will need to be presented to your students. Constructive set specification writing is a very important concept for your students to understand, as is work with predicate calculus and quantified logic. Formal proofs of set theory axioms and logic expressions is not necessary, unless you plan to have your students do correctness proofs for their specifications. Work with sequences may be less familiar to your students, if they have not worked with files and lists at an abstract level.

28.3  Applying Mathematical Notation for Formal Specification

This section uses mathematical notation to refine the block handler specification from Section 28.1.3. It may be desirable to refine the other two specification examples from Section 28.1 using similar notation. If your students are comfortable with mathematical proofs, you may wish to present an informal correctness proof for these three specifications. Having students write specifications for some of their own functions, using notation similar to that used in this section may be desirable.

28.4  Formal Specification Languages

This section discusses the properties of formal specification languages from a theoretical perspective. The next two sections use OCL and the Z specification language to rewrite the block handler specification more formally. You might have students try writing the specifications for their own functions using a pseudocode type notation embellished with comments describing semantic information.

28.5  Object Constraint Language (OCL)

This section presents a brief overview of OCL syntax and semantics and then applies OCL to the block handler example. The intent is to give the student a feel for OCL without attempted to teach the language. If time and inclination permit, the material presented here can be supplemented with additional OCL information from the UML specification or other sources.

28.6  Using Z to Represent and Example Software Component

This section presents a brief overview of Z syntax and semantics and then applies Z to the block handler example. The intent is to give the student a feel for Z without attempted to teach the language. If time and inclination permit, the material presented here can be supplemented with additional Z information.

25.7  The Ten Commandments of Formal Methods

Class discussion of the 10 commandments listed in this section would be beneficial to students. They need to be able to articulate in their own words, why these commandments are important when using formal methods in software engineering.

25.8  Formal Methods—The Road Ahead

This section mentions several obstacles to using formal methods on large real world several development projects. Students should acknowledge that formal methods require discipline and hard work. They should also recognize that the there are times when the risks associated with using informal specification techniques make the use of formal methods necessary.


A Detailed Example of the Z Language

To illustrate the practical use of a specification language, Spivey [1] considers a real-time operating system kernel and represents some of its basic characteristics using the Z specification language [SPI88]. The remainder of this section has been adapted from his paper (with permission of the IEEE).

*************

Embedded systems are commonly built around a small operating-system kernel that provides process-scheduling and interrupt-handling facilities. This article reports on a case study made using  Z notation,  a mathematical specification language, to specify the kernel for a diagnostic X-ray machine.
        Beginning with the documentation and source code of an existing implementation, a mathematical model, expressed in Z, was constructed of the states that the kernel could  occupy and the events that could take it from one state to another. The goal was a precise specification that could be used as a basis for a new implementation on different hardware.
        This case study in specification had a surprising by-product. In studying one of the kernel's operations, the potential for deadlock was discovered: the kernel would disable interrupts and enter a tight loop, vainly searching for a process ready to run.
        This flaw in the kernel's design was reflected directly in a mathematical property of its specification, demonstrating how formal techniques can help avoid design errors. This help should be especially welcome in embedded systems, which are notoriously difficult to test effectively.
        A conversion with the kernel designer later revealed that, for two reasons, the design error did not in fact endanger patients using the X-ray machine. Nevertheless, the error seriously affected the X-ray machine's robustness and reliability because later enhancements to the controlling software might reveal the problem with deadlock that had been hidden before.
        The specification presented in this article has been simplified by making less use of the schema calculus, a way of structuring Z specifications. This has made the specification a little longer and more repetitive, but perhaps a little easier to follow without knowledge of Z.

About the Kernel
       
The kernel supports both background processes and interrupt handlers. There may be several background processes, and one may be marked as current. This process runs whenever no interrupts are active, and it remains current until it explicitly releases the processor, the kernel may then select another process to be current. Each background process has a ready flag, and the kernel chooses the new current process from among those with a ready flag set to true.
        When interrupts are active, the kernel chooses the most urgent according to a numerical priority, and the interrupt handler for that priority runs. An interrupt may become active if it has a higher priority than those already active and it becomes inactive again when its handler signals that it has finished. A background process may become an interrupt handler by registering itself itself as the handler for a certain priority.

Documentation

Figures 9.15 and 9.16 are diagrams from the existing kernel documentation, typical of the ones used to describe kernels like this. Figure 9.15 shows the kernel data structures. Figure 9.16 shows the states that a single process may occupy and the possible transitions between them, caused either by a kernel call from the process itself or by some other event.


        In a way, Figure 9.16 is a partial  specification of the kernel as a set of finite-state machines, one for each process. However, it gives no explicit information about the interactions between processes—the very thing the kernel is required to manage. Also, it fails to show several possible states of a process. For example , the current background process may not be ready if it has set its own ready flag to false, but the state "current but not ready" is not shown in the diagram. Correcting this defect would require adding two more states and seven more transitions. This highlights another deficiency of state diagrams like this: their size tends to grow exponentially as a system complexity increases.




Kernel States

Like most Z specifications, the kernel model begins with a description of the state space, the collection of variables that determine what state the kernel is in and the invariant relationships that always hold between these variables' values.
        In this article, the kernel state space is described in several parts, corresponding to the background processes, the interrupt handlers, and the state of the processor on which the kernel runs. Each piece is described by a schema, the  basic unit of specification in Z. Table 9.1 {Table 25.2 in SEPA, 4/e] describes the Z notation used in this case study.
        The state space of the whole kernel is obtained by putting together these pieces and adding more invariants, among them the static policy for allocating the processor to a background process or interrupt handler. Processes are named in the kernel by  process identifiers. In the implemented kernel, these are the addresses of process-control blocks, but this detail is irrelevant to a programmer using the kernel, so they are introduced as a basic type PID:

        [PID]

        This declaration introduces PID as the name of a set, without giving any information about its members. From the specification's point of view, the members are simply atomic objects.
        For convenience, a fictitious process identifier, none, was introduces. none is not the name of any genuine process. When the processor is idle, the current process is none. The set PID1 contains all process identifiers except none:

        none: PID
        PID1: P PID
————————————
        PID1 = PID\{none}

        The part of the kernel state concerned with background processes is described by this schema:

————Scheduler——————————
        background: P PID1
        ready: P PID1
            current: PID
——————————————————
        ready   background
        current   background  {none}
——————————————————

Like all schema, this one declares some typed variables and states a relationship between them. Above the horizontal dividing line, in the declaration part, the three variables are declared:

    background is the set of processes under control of the scheduler;
    ready is the set of processes that may be selected for execution when the processor becomes free;
     current is the process selected for execution in the background.

        Below the line, in the predicate part, the schema states two relationships that always hold. The set ready is always a subset of background, and current is either a member of background or the fictitious process none. This schema lets the current process be not ready, because no relationship between current and ready is specified.
        This schema does not reflect the full significance of the set ready, but its real meaning will be shown later in the Select operation, where it forms the pool from which a new current process is chosen.
        Interrupts are identified by their priority levels, which are small positive integers. The finite set ILEVEL includes all the priorities:

        ILEVEL: F N
————————————
        0    ILEVEL

        Zero is not one of the interrupt priority levels, but it is the priority associated with background processes.
        The state space for the interrupt-handling part of the kernel is described like this:
       
——————IntHandler——————
        handler: ILEVEL   PID1
        enabled, active:  P ILEVEL
—————————————————
        enabled   active   dom handler
—————————————————

This schema declares the three variables:

    handler is a function that associates certain priority levels with processes, the interrupt handlers for those priorities;
    enabled is the set of priority levels that are enabled, so an interrupt can happen at that priority;
    active is the set of priority levels for which an interrupt is being handled.

        The predicate part of this schema says that each priority level that is either enabled or active must be associated with a handler. An interrupt may be active without being enabled if, for example, it has been disabled by the handler itself since becoming active. The declaration

        handler: ILEVEL  PID1

declares handler to be a partial injection. It is partial in that not every priority level need be associated with an interrupt handler and it is an injection in that no two distinct priority levels may share the same handler.
        Information like this — that interrupts may be active without being enabled and that different priority levels may not share a handler— is vital to understanding how the kernel works. It is especially valuable because it is static information about the states the kernel may occupy, rather than about what happens as the system moves from state to state. Such static information is often absent from the text of the program, which consists mostly of dynamic, executable code.
        The state space of the whole kernel combines the background and interrupt parts:


—————Kernel——————————
        Scheduler
        IntHandler
———————————
        background   ran handler =  
——————————————————

        The declarations Scheduler and IntHandler in this schema implicitly include above the line all the declarations from those schemas and implicitly include below the line all their invariants. So the schema Kernel has six variables: background, ready, and current from Scheduler, and handler, enabled, and active for IntHandler.
        The additional invariant has been added so that no process can be both background process and an interrupt handler at the same time.
        The main job of the kernel is to control which process the  processor runs and at what priority. Therefore, the running process and the processor priority have been made part of the state of the system. Here is a schema that declares them:
       
—————CPU————————
        running: PID
        priority: ILEVEL  {0}
———————————————

        This schema has an empty predicate part, so it places no restriction on the values of its variables, except that each must be a member of its type. The variable running takes the value none when no process is running.
        Of course, there are many other parts of the processor state, including the contents of registers and memory, condition codes, and so on, but they are irrelevant in this context.
        Because the kernel always uses the same scheduling policy to select the running process and the CPU priority, this policy is another invariant of the system. It is stated in the schema State, which combines the kernel and CPU parts of the system state:

————State—————————————
        Kernel
        CPU
———————————————————
priority= max (active {0})
priority  =  0  running  = current
priority  >  0  running  = handler (priority)
———————————————————

        If any interrupts are active, the processor priority is the highest priority of an active interrupt, and the processor runs the interrupt handler for that priority. Otherwise, the processor runs the current background process at priority zero.
        The invariant part of this schema uniquely  determines priority and running in terms of active, current, and handler, three variables of the schema Kernel. This fact will be exploited when the events that change the system state are described. With the description of the kernel's and processor's state space complete, the next step is to look at the operations and events that can change the state.

Background Processing

        Some kernel operations affect only the background processing part of the state space. They start background processes, set and clear their ready flags, let them release the processor temporarily or permanently, and select a new process to run when the processor is idle.
        A process enters the control of the scheduler through the operation Start, described by this schema:
——————Start————————
        _State
        p?: PID1
————————————————
p?   ran handler
background' = background   {p?}
ready' = ready    {p?}
current' = current
IntHandler' =   IntHandler
————————————————

        LIke all schemas describing operations, this one includes the declaration _State, which implicitly declares two copies of each variable in the state space State, one with a prime (') and one without. Variables like background and ready without a prime refer to the system state before the operation has happened, and variables like background' and ready' refer to the state afterward.
        The declaration _State also implicitly constrains these variables to obey the invariant relationships that were documented in defining the schema State—including the scheduling policy—so they hold both before and after the operation.
        In addition to theses state variables, the Start operation has an input p?, the identifier of the process to be started, By convention, inputs to operations are given names that end in a ?.
        The predicate part of an operation schema lists the precondition that must be true when the operation is invoked and postcondition that must be true afterward. In this case, the precondition is explicitly stated: that the process p? being started must not be an interrupt handler (because that would violate the invariant that background processes are disjoint from interrupt handlers).
        In general, an operation's precondition is that a final state exists that satisfies the predicates written in the schema. Part of the precondition may be implicit in the predicates that relate the initial and final states. If an operation is specified be the schema Op, its precondition can be calculated as
               
         State Op

        If the precondition is true, the specification requires that the operation should terminate in a state satisfying the postcondition. On the other hand, if the precondition is false when the operation is invoked, the specification says nothing about what happens. The operation may fail to terminate, or the kernel may stop working completely.
        For the Start operation, the postcondition says that the new process is added to the set of background processes and marked as ready to run. The new process does not start to run immediately, because current is unchanged; instead, the processor continues to run the same process as before.
        The final equation in the postcondition
       
          Inthandler' =   Inthandler                                                                                               

means that the part of the state described by the schema IntHandler is the same after the operation as before it.
        The equation in this schema determines the final values of the six variables in the kernel state space in terms of their initial values and the inputs p?,  but they say nothing about the final values of the CPU variables running and priority. These are determined by the requirements, implicit in the declaration _State, that the scheduling policy be obeyed after the operation has finished. Because the values of active, handler, and current do not change in the operation, neither does the CPU state.
        The current background process may release the processor by calling the Detach operation, specified like this:

——————Detach————————
        _State
—————————————————
        running background                                                                                                            
        background' = background
        ready' = ready 
        current' = none
        IntHandler' =    IntHandler
—————————————————
               
        Again, this operation is described using _State variables before and after the operation has happened. The precondition is that the process is running a background process. The only change specified in the postcondition is that the current process changes to none, meaning that the processor is now idle. The next event will be either an interrupt or the selection of a new background process or run.
        After a call to Detach—and after other operations described later— current has values none, indicating that no background process has been selected for execution. If no interrupts are active, the processor is idle, and the Select operation may happen spontaneously. It is specified like this:

——————Select————————
        _State
—————————————————
        running =none                                                                                                                            
        background' = background
        ready' = ready 
        current' ready
        IntHandler' =   IntHandler
—————————————————

        Rather than a part of the interface between the kernel and an application, Select is an internal operation of the kernel that can happen whenever its precondition is true. The precondition is

        running = none   ready _    
       
The processor must be idle, and at least one background process must be ready to run. The first part of this precondition is stated explicitly, and the second part is implicit in the predicate

        current'     ready

        The new value of current is selected from ready, but the specification does not say how the choice is made—it is nondeterministic. This nondeterminism lets the specification say exactly what programmers may rely on the kernel to do: there is no guarantee that processes will be scheduled in a  particular order.
        In fact, the nondeterminism is a natural consequence of the abstract view taken in the specification. Although the program that implements this specification is deterministic—if started with the ring of processes in a certain state, it will always select the same process—it appears to be nondeterministic if the set of processes that are ready are considered, as has been done in the specification.
        However, the kernel selects the new current process, the specification says that it starts to run, because of the static scheduling policy, which determines that after the operation, running is current and priority is zero.
        A background process may terminate itself using the Stop operation:
       
——————Stop————————
        _State
—————————————————
        running  background                                                                                                           
        background' = background   {current}
        ready' = ready   {current}
        current' = none
        IntHandler' =   IntHandler
—————————————————

        For this operation to be permissible, the processor must be running a background process. This process is removed from background and ready, and the current process becomes none, so the next action will be to select another process.
        A final operation, SetReady, sets or clears a process's ready flag. It has two inputs, the process identifier and a flag , which takes one of the values set or clear:
       
        FLAG :: = set \ clear

The SetReady operation is:

——————SetReady———————
        _State
        p?: PID 
        flag?: FLAG
—————————————————
        p?  background
        flag? =set  ready' = ready   {p?}
        flag? = clear  ready' = ready {p?}                                                                                
        background' = background
        current' = current
        IntHandler' = IntHandler
—————————————————

        The precondition is that p? is a background process, according to the value of flag?, it is either inserted in ready or removed from it. The scheduling parameters  do not change, so there is no change in  the running process.

Interrupt handling

                Other operations affect the kernel's interrupt-handling part. A background process may register itself as the handler for a certain priority level by calling the operation IEnter:

——————IEnter———————
        _State
        i?: ILEVEL   
—————————————————
        running background
        background' = background   {current}
        ready' = ready   {current}
        current' = none
        handler' = handler     {i?  current}
        enabled' = enabled   {i?}
        active' = active
—————————————————

        This operation may be called only by a background process. The operation removes the calling process from background and ready, and the new value of current is none, just as in  the Stop operation. Also, the calling process becomes an interrupt handler for the priority level i?, given as an input to the operation, and that priority level becomes enabled. The expression

        handler      {i?  current}

denotes a function identical to handler, except that i? is mapped to current, This function is an injection, because current, a background process, cannot already be the handler for any other priority level. The new handler supersedes any existing handler for priority i?, which can never run again unless restarted with the Start operation.
        Once a process has registered itself as an interrupt handler, the scheduler chooses a new process to run in the background, and the new interrupt handler waits for an interrupt to happen:

——————Interrupt———————
        _State
        i?: ILEVEL   
—————————————————
        i? enabled    i?   > priority
         Scheduler' = Scheduler
        handler' = handler   
        enabled' = enabled
        active' = active   {i?}
—————————————————
       
        The process hardware ensures that interrupts happen only when they are enabled and have priority greater than the processor priority. If these conditions are satisfied, the interrupt can happen and the kernel then adds the interrupt can happen and the kernel then adds the interrupt to active.
        The scheduling  policy ensures that the associated interrupt handler starts to run. In this calculation of the processor priority, each step is justified by the comment in brackets:

                Priority'
        = [scheduling  policy]
                max(active'  {0})
        = [postcondition]
                max((active  {i?})  {0})
        = [  assoc. and comm.]
                max((active  {0})  {i?})
        = max dist. over  ]
                max{max(active  {0}), i?}
        = [scheduling policy]
                max{priority, i?}
        = [ i? > priority]
                i?

        So priority' =  i? > 0 and the other part of the scheduling policy ensures that running' equals handler (i?).
        After the interrupt handler has finished the processing associated with the interrupt, it calls the kernel operation IWait and suspends itself until another interrupt arrives. IWait is specified as

——————IWait———————
        _State
—————————————————
        priority  > 0
         Scheduler' = Scheduler
        handler' = handler   
        enabled' = enabled
        active' = active    {priority}
—————————————————

        The precondition priority > 0 means that the processor must be running an interrupt handler. The current priority level is removed from active, and as for the Interrupt operation, the scheduling policy determines what happens  next. If any other interrupts are active, the processor returns to the interrupt handler with the next highest priority. Otherwise, it returns to the current background process.
        Another kernel operation, IExit, lets an interrupt handler cancel its registration:

——————IExit———————
        _State
—————————————————
        priority  > 0
        background' = background {handler (priority)}
        ready' = ready  {handler (priority)}
        current' = current
        handler' = {priority}   handler   
        enabled' = enabled     {priority}
        active' = active    {priority}
—————————————————

        Again, the processor must be running an interrupt handler. This handler leaves the interrupt-handling part of the kernel and becomes a background process again, marked as ready to run. As with IWait, the processor returns to the interrupt handler with the next highest priority or to the current background process. The process that called IWait is suspended until the scheduler selects it for execution. In this schema, the expression

        {priority}   handler   

denotes a function identical to handler except that priority has been removed from its domain; it is an injection provided that handler is one.
        Two more kernel operations, Mask and Unmask, let interrupt priorities be selectively disabled and enabled. Their  specifications are like SetReady, so have been omitted. The kernel specification is now complete.





[1]          Spivey, J.M., “Specifying a Real-Time Kernel,” IEEE Software, September, 1990, pp. 21 - 28.

No comments:

Post a Comment