FAULT DETECTION: STATE BEHAV!OR, MACHlNE DECOMPOSITION AND ECONOMICS Crissertation for the Degree of Ph D. MECHE CAN STATE UHE EVERS {TY LARRY LEE OROVER 1975 llllllll 3129 10766 ease M l {K 49‘ This is to certify that the thesis entitled FAULT DETECTION: STATE BEHAVIOR, MACHINE DECOMPOSITION AND ECONOMICS presented by Larry Lee Grover has been accepted towards fulfillment of the requirements for _Ph. D. degree in Computerfic 1ence (3,41% Major profe Date Aggust 9, 1976 D 0-7639 m- \‘ A. at:— ~ . ”DING BY ~ HMS 8: SONS' ‘ 800K mum INC. ; LIBRARY BINDERS ‘ mmmr, menu; h: ‘1 __ 4 EVEESI_J - RETURNING MATERIALS: Place in book drop to LJBRARJES remove this checkout from .—;—. your record. FINES will - » be charged if book is returned after the date stamped below. p LIBRARY Michigan Sm: !' I t Unch rsity n. .m. a 2 :2 .72 mmmm mam a mmamma ,mw an maacmanomeaaaaagaana 2::-::L:.:L:L ummmmuouammumammmmuw mwmnm mnmmmmmmnmmmn vch-vvv:;~.vwcvvv mannnnmnmnmnnnnnmnnm NNNNNNNN--NNNNNNNN ::___::::::_ a. uh up - u~ a. q” H_ N. —. n» a. .a p. a. mm .o no N. _. :aaeeaa-aaocaaaaaae an ‘- (-1 a an of I .—. .11 u A.“ m .41 «v 4'. I! 0 v. r1! 4 - v | H 0‘) an en 6" 65 .‘D O. a) :7) m U) m C. ) (:1 - "I- ) m zrmammcmmmmmmmwwum _~2~I,_F.~_~heh222 22222222222222222 mfimmmwwmnmmmnmmmna .,.e.........vw.vv rflennnnmmmnanmmmm NNNNNNNNNNNNNNNNNN ____ .— 222: 2:“ acemaa _::—::: 1.; v 41 3 -r —r ,v. - I :. am mm 2 : “3:32:35“. :92: _ n =========Q=Q=Z=a== rues—2:; ., my mm mm ‘2 3»on ~43 4- m r" «I we r‘l I‘I w n l". Lu U) L.) O: . CI: mmmmm (.0 c0 "napnnmnmm N N _ 2 2 2 '— .r 2 2 z a 2 z a‘ «'1 mama»? m: N---- :Z: ' m~ ”N ”a an ‘4 .1 rm H rd 6’; on m wag a: we. :— 1: ABSTRACT FAULT DETECTION: STATE BEHAVIOR, MACHINE DECOMPOSITION AND ECONOMICS By Larry Lee Grover Conducting checking experiments to detect faults in sequential machines inherently requires the use of spe- cific sub-experiments. In this dissertation, these sub- experiments, or state identification sequences, are used to verify the (indirectly observable) transition function. Further, an algorithm for determining state identification sequences is constructed from a rigorous treatment of state distinguishability and equivalence. This algorithm detects the shortest state identification sequence for any state of a machine for which state identification sequences exist. For machines lacking the desired behavior, procedures are derived such that these machines can be designed to possess the required behavior. The notion of inaccessible states is given serious attention to provide a means of detecting faults which increase the number of states in these machines. The number of states may be increased by Larry Lee Grover faults creating access to the inaccessible states. Design procedures are given to make possible the detection of this type of fault. A machine notation is introduced to detect and elimi- nate redundancies in checking experiments by determining when sub-experiments overlap. This notation and reduction technique is used in an algorithm to detect faults which can occur in a sequential machine. In this algorithm, the use of state identification sequences which satisfy spe- cific conditions assists in making the checking experi- ments as concise as possible. Machines which can be decomposed into serial or parallel connections of submachines possess certain structural properties which are beneficial to fault de- tection. These properties and their relationship to fault detection are presented and investigated to yield a fault detection method for decomposable machines. As a result of this method, both a savings in the amount of work re- quired and other benefits are realized. The cost (in a global sense) of fault detection is presented. Through a partial ordering of events which are encountered in fault detection, a cost model results. The .use of this cost model establishes the overall fault de- tection program which has minimal cost. FAULT DETECTION: STATE BEHAVIOR, MACHINE DECOMPOSITION AND ECONOMICS By Larry Lee Grover A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Computer Science 1976 ACKNOWLEDGMENTS My special thanks to Dr. Carl V. Page whose enthusi- astic guidance made this work pleasurable. I have enjoyed the opportunity to freely exchange ideas with a person of such keen ability. My thanks also to Dr. Richard C. Dubes, Dr. Harry G. Hedges, Dr. Richard J. Reid, Dr. James H. Stapleton and Dr. Bernhard Weinberg for their service in reviewing and guiding my work. TABLE OF CONTENTS Chapter Page 1 0 INTRODUCTION 0 O O O O O O O O O O O O O O O O O O 1.1 SEQUENTIAL MACHINE NOTATION . . 1.2 THE PROBLEM OF FAULT DETECTION 1.5 BACKGROUND AND HISTORY . . . . 1.4 DISSERTATION ORGANIZATION . . . o o o o o o o o o o o o o o o o o o o o o o o o o o o 0 (DOWN l-' 2. FORMALIZATION OF FAULT DETECTION PROPERTIES OF SEQUENTIAL MACHINES . . . . . . . . . . . . . . 11 2.1 STATE EQUIVALENCE AND DISTINGUISHABILITY . . 12 2.2 STATE BEHAVIOR SEQUENCES . . . . 14 2.5 PROPERTIES REGARDING STATE IDENTIFICATION SEQUENCES . . . . . 18 2.4 DESIGN PROVISIONS TO ENSURE A COMPLETE IS SET 0 o o o o o o o o 24 2.5 CONCERNING INACCESSIBLE STATES . . . . . . . 28 2.6 INITIALIZING THE MACHINE . . . . . . . . . . 51 2.7 CHAHER SUMMARY 0 o o o o o o o o o o o o o o 31 5. HOW TO MAKE A MACHINE MORE EASILY TESTABLE BY DESIGN00000000000000.0000...55 5.1 DETERMINING STATE IDENTIFICATION SEQUENCES . 55 5.1.1 STATE IDENTIFICATION SEQUENCE ALGORITHM . . . . . . . . 56 5. 2 ALGORITHM FOR ADDITIONAL OUTPUTS . . . . . . 41 5. 5 ADDITIONAL INPUT ALGORITHM . . . . . 46 5. 4 INACCESSIBLE STATE ASSIGNMENT ALGORITHM . . . 47 5. 5 CHAPTER SUMMARY . . . . . . . . . . . . . . . 52 4. AN ALGORITHM FOR DETECTING FAULTS IN SEQUENTIAL MACH INE O O O O O O C C O O O O O O C O C O C O O 54 4.1 STRONGLY CONNECTED AND REDUCED MACHINES . 4. 2 WORKING FORM NOTATION . . . . . . . . 4.2.1 WORKING FORM ALGORITHM . . . . . . . 4. 5 TRANSFER SEQUENCES . . . 4.5.1 TRANSFER SEQUENCE MATRIX ALGORITHM . 4.4 SUB-EXPERIMENT OVERLAPPING AND CONCATENATION 64 4.5 FAULT DETECTION ALGORITHM . . . . . . . . . . 68 83 iii Chapter Page 4.6 EXAMPLES . . . . 4.7 BOUNDS ON EXPERIMENT LENGTH . . . . . . . . . 75 4.8 CHAPTER SUMMARY . . . . . . . . . . . . . . . 77 5. FAULT DETECTION THROUGH ABSTRACT REPRESENTATIONS . 79 5.1 SERIAL AND PARALLEL DECOMPOSITION . . . . . 79 5.2 PARALLEL DECOMPOSITION AND FAULT DETECTION . 91 5.5 SERIAL DECOMPOSITION AND FAULT DETECTION . . 102 5.4 THE BENEFITS OF MACHINE DECOMPOSITION . . . . 111 5. 5 CHAPTER SUMMARY . . . . . . . . . . . . . . . 115 6. ECONOMICS OF CHECKING EXPERIMENTS . . . . . . . . . 117 6.1 EXPERIMENT COST . . . . . . . 118 6.1.1 MACHINE DESIGN AND PREPARATION FOR TESTING o o o o o o o o o o o o o 120 6.1. 2 EXPERIMENT DESIGN . . . . . . . . . 121 6.1. 5 PERFORMING THE EXPERIMENT . . . . . . 121 6. 2 COST MODEL . . . . . . . 125 6. 5 MULTIPLE USE OF A CHECKING EXPERIMENT . . . . 155 6.40HAPI‘HSUMMARYOO...000000.00-0157 7. CONCLUSIONS AND SUGGESTIONS FOR FUTUREWORK . . . . 159 7.1 CONCLUSIONS . . . . . . . . . . . 159 7. 2 SUGGESTIONS FOR FUTURE WORK . ... . . . . . . 141 BIBLIOGRAHIY O O O O C O O O O O O O O O O O O O O O I 1“} iv CHAPTER 1 INTRODUCTION This dissertation is concerned with the detection of faults in sequential machines. Definition 1.1 A sequential machine fault is any alteration which permanently changes the machine's logical operation. Thus, failures which affect shapes of pulses, delay times or physical amplitudes, in the machine, but do not change its logical description, will not be considered. Also excluded are failures due to power sources and clock pulses. Like other theories in science and engineering, the. fault detecting theory presented here is mathematically based on the phenomena which characterize the subject. Thus, the theory and algorithms herein developed, apply to any entity which can be represented as a sequential ma- chine. The remainder of this chapter is used to define the sequential machine mathematical structure used in this 2 dissertation, define the problem, give its history and state the organization of this dissertation. 1.1 Sequential Machine Notation The Mealy sequential machine model will be used in this dissertation (37). Definition 1.2 A Mealy type sequential machine, M, is defined as a quintuple M = (s, I, 0,7),p) where l) S is a finite nonempty set of states; 2) I is a finite nonempty set of inputs; 5) O is a finite nonempty set of outputs; 107): S x I -> S is called a transition function; 5) P: S x I --O is called the output function. Let a single input symbol be denoted xi 6 I. A con- catenation of k input symbols is denoted 5k and (ik)J im- plies the jth string of k inputs. A single output re- sponse is denoted as rh’é 0, while Bk denotes a string of k responses. A transfer sequence from state 81 to 53 is a sequence of inputs denoted as TKsi, 33). The next state for a machine in state 81 given the input xj, is 77(31, x3), while the output response is P(si’ 1:3). The concatenation 5 of responses for a machine in state 51 given the input string 32k, is p(si, 311‘). This dissertation is concerned only with synchronous sequential machines. Sequential machines are classified as synchronous or asynchronous depending on whether or not the machine is operating under the control of clock pulses. In synchronous operation at most one internal state tran- sition occurs in conjunction with the controling clock pulse. The output pulses are also in synchronism with the clock pulses. Let x(t) and s(t) be the input and state of a synchro- nous sequential machine at time t. The state at time t. = t+At is O s(t ) = 7760507). X(t))c +775(6(t). X(t))(1-C). where c is the synchronizing clock pulse train and 1, clock pulse present 0, otherwise. The logical description of a synchronous sequential machine is typically given showing the c-l condition. When c-O, all states in the machine are stable for every x1 6 I. Thus, 773““). X(t)) = s(t). 4 Suppose 773(83’ xm) is faulty such that 773(539 xm) " 8f, and 775(81" Km) 8 8f. Then the machine in any state 51 such that 72c(sl. xm) = 83. will have 5.3. as a transient state on its way to sf when the clock pulse returns to zero. Thus, 776(81’ xm) 3 Sf because when the machine is exposed to the next clock pulse it will be in s Hence, faults in the 778 portion f. of 7) appear as faults in 77°. Consequently, this disser- tation is only concerned with the machine's description when the clock is present, following the tradition es- tablished in previous fault detection and automate theory. Therefore, any references to 7) will mean 77° in the re- mainder of this work. The methods given in this dissertation can be ex- tended to asynchronous machines, Friedman and Menon (15). 5 1.2 The Problem of Fault Detection We shall assume that only the output function, p, is observable; thettransition function, 7), is not. Conse- quently all information concerning a machine's behavior must come through the output. It is therefore necessary to conduct experiments using only the terminals of a ma- chine to determine its behavior. The experiments are de- rived such that inputs are applied and outputs observed to determine correspondence between the machine and its flow table description. The entire experiment is referred to as a checking experiment. Definition 1.5 A checking experiment (CE) is-a sequence of input symbols and their related output responses for the purpose of determining whether or not a sequential machine is functioning correctly. A checking experiment may be either reset, in which the output reaponse does not determine the next input symbol, or ada tive, in which the next input is determined from the present and past outputs. Solving the fault detection problem is then a matter of devising a theory and algorithms by which checking experiments can be constructed such that they are concise and complete in detecting faults. 1.5 Background and History The first work in sequential machine fault detection was by Moore (59). He showed that any sequential machine with n states, p input symbols and q output symbols, can be distinguished, up to equivalence, from all other (n, p, q)-machines by a simple checking experiment. The approach is to construct a machine which is the direct sum of all possible (n, p, q)-machines and then find an experi- ment which will distinguish the object machine from all others. Because Moore's procedure requires prohibitively long experiments, Poage and McCluskey (42), Roth et a1. (44) and Seshu and Freeman (46) greatly restrict the number of machines in the direct sum to reduce experiment length. This is accomplished by selecting a fault or small set of faults to determine the machines for the direct sum. This, however, restricts the diagnostic capability of the experi- ment. This approach is referred to as the "circuit- testing approach". Hennie (25) seeks to devise experiments which identi- fy a correct machine. The application of his experiment to any other machine of the same number of states or less produces a different response and hence, the experiment is a checking experiment. Hennie's work on machine identification in fault de- tection is the basis for efforts by Boute (4), Farmer (15), Gdnenc (19), Hsieh (27), Kime (51) and Kohavi et a1. (54). 7 The chief result of their work has been to refine Hennie's procedures to make the resulting checking experiments more concise. The areas of refinement have been the removal of redundancy in the experiment and reduction in the length of certain input sequences used in Hennie's procedures. Recently, work has been directed toward restricting the type of faults allowed e.g., Boute (5), Boute et a1. (6), and Friedman et a1. (14). Certain constraints are placed on the type of machine and internal logic inter- action (states and output logic cannot be faulty simulta- neously). The restrictions make techniques devised in this manner less potent than the machine identification approach but more general than the circuit-testing approach. Aside from these works, fault detection has received much attention as evidenced by the literature. Five isSues of the IEEE Transactions on Computers’ have been devoted exclusively to Fault-Tolerant Computing of which fault detection is a major area Carter (8). Five symposia on Fault-Tolerant Computing" have been conducted and at least two texts (9), (15) dealing with fault detection and diagnosis in digital circuits have been written. Other texts which consider machine experiments and fault de- tection in their scope also exist (17), (24), (52) and ‘ Vol. 0-25, July 1976, vol. 0-24, May 1975, vol. 0-25, July 1974, vol. 0-22, March 1975, vol. 0-20, Nov. 1971. " IEEE International Symposium on Fault-Tolerant Computing, June 1975, June 1974, June 1975. June 1972, March 1971. 8 (48). Research in fault detection at Michigan State Uni- versity has resulted in two publications, (10) and (29). 1.4 Dissertation Organization The following algorithm exhibits both the organi- zation of this dissertation and the use of this thesis in fulfilling one's needs concerning fault detection. In the flowchart following the algorithm, the numbers in the de- cision operations refer to steps in the algorithm and the numbers in the process operations refer to sections of the dissertation: 1. If machine decomposition or partitioning is a strong consideration, go to 15; otherwise continue. 2. Refer to Sections 5.1 and 5.1.1 to determine whether or not a state identification sequence (Def. 2.6) exists for each state of the machine. 5. If the machine has a state identification sequence for each of its states, go to 8; otherwise continue. 4. If the machine is already constructed, go to 7; other- wise continue. 5. Refer to Sections 2.4, 5.2 and 5.5 for theory and design procedures which ensure an identification sequence for each state. 6. Consider adding inputs, outputs and testpoints and de- termine the machines for all three and go to 8. 7. If the machine can be altered for fault detection 9 through redesign, go to 5; otherwise go to 15. 8. Given that the machine has n states and v internal state variables which can have any of r values, if n-rv, go to 11; otherwise continue. 9. If faults which can cause the rv-n inaccessible states to be entered are not of concern, go to 11; otherwise con- tinue. 10. Refer to Sections 2.5 and 5.4 for the theory and design procedures to ensure that faults causing entry to inaccess- ible states are detected. 11. Refer to Sections 2.6 and 4.1-4.5 to determine a checking experiment for the machine using state identifi- cation sequences. 12. Determine the costs associated with fault detection (Section 6.1). 15. Refer to Sections 6.1-6.5 for economical analysis. 14. Construct cost model and find the best set of admissi- ble costs and go to 22. 15. Refer to Section 5.1 and perform machine decomposition (Hartmanis and Stearns (22)). 16. If a nontrivial decomposition exists, go to 18; other- wise continue. 17. Refer to Section 1.5 for a guide to methods of deriving checking experiments using other than state identification sequences and go to 12. 18. Refer to Sections 5.1 and 5.1.1 to determine whether 01- trot each submachine has a complete set of state 10 identification sequences. 19. If all submachines have a complete set of state identi- fication sequences, go to 21; otherwise continue. 20. Refer to Section 1.5 for a guide to methods of de- riving checking experiments for those machines not having a complete set of state identification sequences. 21. Refer to Sections 5.2, 5.5 and 7.2 for theory relating faults to machine decomposition and go to 12. 22. End. 1 14 ..1 Construct Cost Determinee Model and find Costs CHAPTER 2 FORMALIZATION OF FAULT DETECTION PROPERTIES OF SEQUENTIAL MACHINES The transition function, 7}, is not directly observe- able by terminal measurements. However, the manner in which a machine responds to an input sequence does reveal informationabout its states. It is this state behavior that is of concern in this chapter. Definition 2.1 An 110 sequence is an input sequence (1) and its re- lated output sequence (0). From Definition 1.5, a checking experiment (CE) is an I/O sequence which serves to detect faults in the machine. The particular sequences which are defined in this chapter may eventually be used as part of a checking experiment. Hence, these sequences will constitute a sub-experiment for a checking experiment. Definition 2.2 Any I/O sequence incorporated in the construction of 11 12 a CE is a sub-experiment. 2.1 State Equivalence and Distinguishabiligy To devise a sequence (experiment) which will de- termine the behavior of the machine initially in a par- ticular state, it is necessary to find some input sequence which will distinguish one state from the others. This becomes impossible if two or more states of the machine are equivalent. Definition 2.5 State 31 and $3 of a sequential machine are said to be equivalent, if P i, the machine is in sk and the response is k-i w's. The next input will produce a y response. Hence, in- itially in si, the machine will produce a y response on 27 the k-i+l input. The following results are observed for an input sequence of k-l h's. State hhh ... hhh ... hhh 81 WW .0. m coo m S2 M... W coo m 83 M 000 WW .0. m Si WW 0.. m 0.. m It is certainly evident that each state of this ma- chine has an identification sequence. The state identifi- cation sequence for state 83 is k-j+1 k-J+1 r—A—fif—A—fi ISS 3 hh coo hh/W 000 “y, 3 where j i 1. The state identification sequence for s1 is k-l k-l A,“ 18$ 3 hh coo hh/ww so. We 1 28 It is also noted that the input string of k-l h's is a distinguishing sequence. Thus, adding h as one addition- al member to the input alphabet and assigning it the entries contained in Mh yields a complete IS set for any sequential machine. 2.5 Concerning Inaccessible States For every n-state sequential machine having v r-ary state variables where rv-l < n < rv, there are rv-n inaccessible states. Inaccessible states do not appear as next states in the n-state machine. Therefore, occurence of faults can produce paths to previ- ously inaccessible states and consequently checking experi- ments should be able to detect these faults. It would be advantageous to assign to inaccessible states, next states and responses suitable for detecting entry into them. If faults create paths to inaccessible states, any checking experiment which relies on verification of the flow table requires alteration of the machine structure to provide entry to the inaccessible states. Because faults in inaccessible states are possible also, there exists the possibility or certainty for each fault which causes entry to an inaccessible state, a fault in that inaccessible state which masks the fault causing entry. 29 Suppose that in the checking experiment, a fault causes 77(31, a) to become sm (sm inaccessible) instead of 8.. Now 77(sm, b) is s:] and p(sm, b) is g. Since 7)(sJ., b) J is s. and p(sa., b) is f, the fault causing entry into s m woulg be detected by an input of b. If p(sm, b) is faulty and becomes f, the machine will be in the correct state and no fault is detected. If the checking experiment sub- jects state 31 to an a input only once, the fault is not detected by the checking experiment. It is necessary to verify the transitions in the inaccessible states as well. This can only be done if access, in some form, to the in- accessible states is available. One way to solve this access problem would be to pro- vide an additional member to the input alphabet which is used only during testing. This then would allow the pre- viously inaccessible states to be entered as long as it was not faulty. In assigning entries to the rv-n states, a primary goal is that each state have an identificatiOn sequence to ensure a complete IS set. From Theorem 2.2, it is neces- sary to assign unique entries under every input for the inaccessible states. The next state entries for these states are states contained in the original n. The reasoning will become evident in ensuing arguments. Theorem 2.8: For an n-state, p-input, q-output sequential machine 50 which has v internal r-ary state variables, there are rv-n possible distinct entries in the flow table available for each input symbol when n ) rv/2, using only the original n states. Proof: There are a maximum of nq possible distinct entries for each input symbol using the original n states. Since a maximum of n are used for an input, n(q-l) remain. Hence, n(q-l) must be greater than or equal to rv-n in order to supply the unique entries which will satisfy Theorem 2.2. Thus, n(q-1) a rv-n or q é rv/n. For nontrivial sequential machines, q § 2 ) rv/n. Theorem 2.8 guarantees that sufficient distinct entries are available. It remains to find and assign these entries. The procedure which will accomplish the desired assignments is given in Chapter 5. 51 2.6 Initializing the Machine Given a sequential machine for testing, there may be no way of knowing its present state. Since only terminal experiments are allowed here, it is necessary to force the machine into a known state by terminal experiments. Ma- neuvering the machine into a known state is referred to as initialization. Once initialized, the future behavior of the machine can be predicted. Synchronizing sequences and adaptive homing techniques are useful for initialization and their derivation is given in Gill (17). 2.7 Chapter Summary The necessity of experimentally observing the tran- sition function has resulted in a study of sequential ma- chine state behavior. The primary concern in this chapter was to define and detect particular state behavior which could identify either the initial or final state of a ma- chine with respect to the application of an input sequence. The means by which states could be identified has stemmed from the notions of state equivalence and dis- tinguishability. These notions have been formally de- veloped and used extensively in the derivation of-algo- rithms (Section 5.1.1) which generate the desired input sequences that identify states through output observations. This chapter presents particular machine character- istics which are useful in machine design as applied to 52 fault detection. A discussion of unused or inaccessible states is also given in Section 2.5. CHAPTER 5 HOW TO MAKE A MACHINE MORE EASILY TESTABLE BY DESIGN Not all sequential machines possess a complete state identification sequence set or a distinguishing sequence. Consequently one must rely on testing techniques intro- duced by Hennie (25), which result in checking experiments of great length, unless the machine can be designed to yield additional information about its states and tran- sition function. This chapter will consider the design of machines which are more readily testable. In particular, faults which increase the number of accessible states can be made testable by design. The design criteria presented in this chapter can be used in conjunction with usual de- sign practices resulting in a single design phase. 5.1 Determining State Identification Sequences Although state identification sequences are discussed in the literature, efficient means for computing them are not, although there may be proprietary programs to do this. One way to solve the problem of finding state 55 34 identification sequences for a finite state sequential ma- chine would be to determine if the machine, as an encoder, encodes (state, input) information into output sequences which are-uniquely decodable (Abramson (2)). Thus, ) ’P(Biv SEIS ) i Si (Si’ xIs8 would represent an encoding which is uniquely decodable. In this dissertation we present a means of de- termining state identification sequences for any sequential machine by using the successor tree. Gill (17) explains the successor tree and how he uses it to find dis- tinguishing and homing sequences. The successor tree is, in itself, infinite and of no practical use. To make the successor tree useful in the derivation of machine experiments, termination rules must be established such that branches in the tree are termi- nated when determinated useless or the tree growth itself is to be halted for some specified reason. In the process of determining possible state identi- fication sequences, particular input sequences may be found which will not assist in finding an IS for a state not having one already. Once found, these input sequences (tree paths) are omitted from further use. An input sequence which is retained offers potential value in 55 determining the desired identification sequences and is referred to as a potential input sequence. The procedure to be developed here does not use the tree structure in the graphical sense but, uses a com- pressed representation of the tree to facilitate the tree searching. This representation consists of an array (G) acting as successor tree levels. Each row of G corre- sponds to a different state in the machine. The columns of G correspond to input sequences (tree paths to the par- ticular tree level G represents). Given that the ma- chine's state set is S 3 {$1, $2, 000, Sn), th assign 31 to the 1 row and (im)3 to the 3th column. Now the entries (tree nodes for the level) in G are (6, h)i,j g 8131 hij .77(Si’ (xm),j)’ P(819 (xm)j)° The array at each successive tree level is referred to as a generation. The zeroth-generation is simply the origi- nal flow table. The input sequences for the next gener- ation are determined by concatenating all members of the input alphabet with the prior generation's potential input sequences. For the prior generation input (im)3’ the new generation input is (xm)3xk, xk 6 I. Let 56 (xm)jxk = (xm+l)p for some p, then Sip 3 ”(8139 xk) hip ‘ hij “313’ xx) and each generation is computed using only 77, p and the most recent generation. 5.1.1 State Identification Sequence Algorithm A partition on the machine's state set is a grouping of all states into disjoint subsets called blocks. A par- tition in which two states 31 and 83 are in the same block, if and only if p(si’ (im)k) 'P(339 (im)k) i.e., h s h ik jk is said to be the partition induced by (im)k and denoted 1T((im)k). A partition in which two states 31 and s3 are in the same block, if and only if 57 ”(319 (im)k)’ p<519 (im)k) 77(339 (im)k)’ P0531 (im)k) i.e., (89 h)ik (59 h>jk is denoted(0((xm)k). The state identification sequence algorithm follows. 1. The machine flow table is the zeroth-generation. Set k equal to zero. 2. For each (Sim)J of the present generation, determine ‘fl1(§h)j) and(0((im)j). Increment k by 1. 5. If any”fl((xm)3) contains a singleton block, then that state (si) in the block has a state identification sequence which is (im)345(si, (xm)3). Indicate that si has a state identification sequence by a 8 and also the sequence. 4. If all states have been found to have state identifi- cation sequences, terminate with success; otherwise con- tinue. A 5. If all states which do not have state identification sequences are contained in nonsingleton blocks ofC0(im)J), then place a star over column (in)a indicating that it is of no further use. 6. If all inputs (columns) have been marked with a star, terminate without a complete IS set or no state 58 identification sequences at all; otherwise continue. 7. If k - (n-1)2, terminate without a complete IS set; otherwise continue. 8. Construct the next generation i.e., hip = hicj P<8139 xk)’ for all xkie I and potential (Sim)J of the previous gener- ation and go to step 2. Example 5.1 This example illustrates the algorithm using machine M which is defined by G 2 0' . 0 1 Go 1 1,0 13,0 B 1,0 0,0 0 1,0 11,0 8]) 5,; 11,0 . 10 11 61 1 1,00 0,00 B 1,00 D,OO 0 1,01 11,00 11 1,01 11,00 59 s s ' 110 111 G 81 1,000 11,000 B 1,001 11,000 0 1,001 11,000 11 1,001 11,000 Since all remaining inputs have stars, the algorithm terminates with a partial IS set. ISA = 110/000 IS a O/1 D There are several aspects of this algorithm which are of interest to checking experiment theory. Theorem 5.1: The state identification sequence algorithm will de- termine a state identification sequence for each state of a sequential machine for which a state identification sequence exists. Proof: If a state does possess an identification sequence, then an input sequence exists for this state which will produce a unique response. Since an input's use is never terminated until it is known that all states without an identification sequence are (im)J-prefix equivalent to at 40 least one other state for each input (step 6), then the existence of such an input sequence ceases once the algo- rithm terminates. Theorem 5.2: If an identification sequence exists for a state of a sequential machine, the state identification sequence algo- rithm will determine the shortest such sequence. Proof: Prior to the construction of a new generation, the present generation is checked for all possible state identification sequences. Hence, prior to increasing an input's length, all possible state identification sequences are determined and noted. Theorem 5.53 The state identification sequence algorithm will terminate for any finite state machine. Proof: If the algorithm has not found an identification sequence within (n-l)2 generations, none exist by Theorem 2.5 and the algorithm terminates in step 7. 41 5.2 Algorithm For Additional Outputs The notion of adding outputs to machines to render more efficient fault detection has been considered e.g., Kohavi and Lavellee (55), Sheppard and Vranesic (47). This additional output assignment problem is considered in this dissertation and expanded such that when applied to sequential machines, complete IS sets are the result. The existence of such a design procedure follows from Theorem 2.6. The process of providing additional outputs is, in effect, that of embedding the original machine in a new machine which provides additional state information. Let (2019 '“9 203) = P0(81’ X1) denote the original output vector and 2m 3 F%(si’ x1) th denote the m additional output. Then the output for the machine is a vector represented as Z = (201, coo, 203, 21, 22, coo, Zn). In order to simplify the notation, base 10 values of these vectors are used. Hence, 42 _ m+j-1 m m-l R — 201b + ... + zon + zlb + ... + zm where b is the base of the number system for the internal variables written in base 10. Consider the following in base 2. Z = (1, O, 1) R-S Now R is used as the output and the flow table entry is Si’(l’ O, 1) 2 81,5. Theorems 2.2 and 2.5 suggest a design procedure for adding outputs to ensure that a complete IS set exists. An algorithm for this design process first considers whether or not the machine does have a complete IS set. In the event that it doesn't, an additional output is cre- ated to attempt completion of the IS set. The new output is specified such that it will make states previously (im)j-prefix equivalent no longer equivalent. Therefore, identical 7)(si, x1), p(si, x1) entries in the flow table for some input are made different by assigning the added output to make them different. Thus, these states become distinguishable at the earliest possible time. 45 The addition of outputs to the machine doesn't alter state identification sequences which existed prior to the addition. Existing output vector components are never changed. Thus, these components partition the machine's state set into blocks as they previously did. The added components simply break up nonsingleton blocks of€O((im)j) into smaller blocks. Additional outputs for states which possess identification sequences are free and may be used as don't care conditions (dc). In many cases, identifi- cation sequences for states which previously exhibited them become shorter. The added outputs create state in- formation and Theorem 2.6 shows that if carried far enough, all states would have a state identification sequence of length one for every 1:16 I. In example 5.2, state A re- sults with a state identification sequence whose length is one less than without the output. This algorithm makes use of(0((ih)3) as defined in Section 5.1.1. Breaking up the blocks ofCO((im)d), through additional outputs, results in state identifi- cation sequences for states which do not have them. The algorithm follows. 1. Initially set the index m to zero. 2. Apply the state identification sequence algorithm and increment m by 1. 5. Partition the states of the machine into two blocks V 1 and DE such that 31 is in La, iff it has an IS. 44 4. If té is empty, step; otherwise continue. 5. For each x1 E I, determine co(xl). 6. Construct an array (A), one row for each state in b’ 2 and one column for each x1 6 I. The entries for A are aij =‘ A,z 0,2 A,y Application of the state identification sequence algorithm returns the complete IS set. IS . a/y ISB ab/zy, ba/zy, hb/zy ISC a b/y IS D'h/y 52 No criteria has been given for choosing the unique entries for inaccessible states. The question of which decision is best and how decisions affect the length of state identification sequences is left to future work. 5.5 Chapter Summary The purpose of this chapter was to investigate how to make a machine more easily testable by design. It has been shown that machines can be designed using additional outputs or an additional input to generate a machine of higher dimensionality which has a complete IS set and a distinguishing sequence. The question of whether it is better to add one additional input or additional outputs must be answered by the designer. Adding one input re- quires checking n more transitions in an n-state machine. Additional outputs may require more than one terminal being added to the machine. The additional implementation which must be added in either case may also be a deciding factor. The greatest depth that must be completed for the successor tree in the state identification sequence algo- rithm is (n-l)2. The maximum depth required for other sequences discussed in this dissertation according to Gill (1?). Ginsburg (18), Hibbard (25) and Kohavi (32) are: SS: (n-1)2n/2; HS: n(n-1)/2; 55 DS: (n-l)nn. In comparison to the distinguishing sequence, computation of a state identification sequence is shorter and requires less memory. Since a state identification sequence is the shortest sequence which can be used to identify the state of a ma- chine, the use of such sequences in checking experiments will result in checking experiments of shorter length than experiments using other sequences to identify states. CHAPTER 4 AN ALGORITHM FOR DETECTING FAULTS IN SEQUENTIAL MACHINES It has been shown that the present state of a machine can be verified if the machine has a complete IS set. . Means of modifing machines which do not possess a complete IS set have been shown. The ability to verify the state of a machine is the basic tool used in an algorithm de- veloped in this chapter. The machines to be considered in this fault detection algorithm must be completely speci- fied, finite state and deterministic. ‘This chapter also presents methods by which two classes of machines (non- strongly connected and nonreduced) can also be tested. To make checking experiments produced by this algo- rithm efficient, a notational representation for the Mealy type sequential machine is introduced. This notation as- sists in eliminating redundancies in the checking experi- ment. Examples demonstrate the algorithm. To establish apriori information regarding the length of checking experiments, bounds on experiment-length are established. 55 4.1 Strongly_Connected and Reduced Machines In any fault detecting experiment which verifies the flow table transitions and associated responses, the ex- periment must be capable of maneuvering the machine into each state. Definition 4.1 If for every pair of states si, 33 of a machine M there exists an input sequence which takes M from S1 to 33, then M is said to be strongly connected. Thus, non-strongly connected machines contain states which cannot be entered from other states. Non-strongly connected machines are those which have persistent, iso- lated or transient submachines. The following machine, M4, exhibits these submachines most readily by means of a state diagram. 0 1 1 1,0 3,0 3 0,0 1,1 0 1,1 3,0 D 3,0 0,0 3 3,1 3,0 o/t c/o transient isolated perSistent State Diagram Because this machine has inaccessible states, the argu- ments used for faults which increase the number of acces- sible states are applicable. Machines which are not reduced (Definition 2.10) can be represented by machines having fewer states. Suppose a fault in a non-reduced sequential machine causes state 7](si, xm) to be equivalent to the correct state. Then a fault exists and the machine performance is unchanged. We express this point in the following lemma which will not be proved. Lemma 4.1: In a non-reduced machine, a fault causing a tran- sition to a different member of an equivalent set of states is in effect not a fault in that it does not change the behavior. 57 To derive a checking experiment for a non—reduced machine, reduce the machine flow table. Then use the state identi- fication sequence found for all states, representing an equivalent set of states, as the identification sequence for each state of the set of states in which these repre- sentatives are contained. 4.2 Workinngorm Notation A flow table representation called the "working form notation" is defined below. This notation is advantageous in fault detecting experiments because information about the machine is displayed in the experiment. The standard flow table notation, where columns represent inputs and rows represent states of the machine, uses 77(339 x1), P<339 x1) as the row 33 column xl entry. In working form notation 77(83. x1). 10(33. 1:1) is replaced by Wi, where W is 77(33, x1), 2 is p(sj, x1), and i is a unique pointer to each entry. An algorithm for going from flow table to working form notation is given in the following steps. 4.2.1 Working Form Algorithm 58 1. For each state of the machine, determine W? for all xl in I, where: W =77(sj. x1); Z ‘p(sa'9 x1); 1 = c-l+(r-l)m+l, where r and c are row and column number and m is the maximum number 0f columns (inputs). The index i is unique for each entry. 2. Replace each 77(sj, x1), P n, go to step 2; otherwise continue. 62 7. If 3 = i or tji = fl, go to step 5; otherwise continue. 8. If m > n, go to step 5; otherwise continue. 9. Left concatenate tji with tin and assign t. as follows. rtam, if L(t31t im ) < L(t3m) tjm = ( (t:jm u tjitim), 1r L(tdit 1m) - L(tjm) tjit im’ if L L 3&3 10,11 10,11 10,11 0,1 transfer sequence matrix 3> 3'83 0,1 h» $88 65 (I! 3033 tn ”808.5. 00 000,100 C> EB 53 c D 0 1 0 1 o 1 00,10 01,11 0 D 0 1 0 1 00 1 00,10 01,11 0 D 0 1 1 00 1 00,10 01,11 0 D 0 1 0 1 00 1 00,10 01,11 i=1 i=2 i 185 4 algorithm reveals 64 the shortest sequence of input symbols required to take the machine from one state to another. For machine M1, the transfer sequence from state C to state A is either 10 or 11 and represents the shortest path. The process of finding the transfer sequence matrix is equivalent to that of finding the shortest path between any two nodes in a graph. 4.4 Sub-experiment Overlapping and Concatenation The algorithm which will be presented in the next section is dependent on a "checking table", defined below. The checking table consists of 4 columns and up rows for an n-state, p-input sequential machine. Column 1 contains the indices for the working form representation in numeri- cal order. The column labelled si,sa has in row m the th index in the present state and the next state for the m working form representation. The column labelled T(si, 83) has in row m the working form representation having the index m. The final column in the table is the working form representation of the identification sequences for the left entries in column 31,33. The following illus- trates the checking table. 0) \fl 0 H a» u: U U > w 0 U QH UH“ W0 HO 3. U U O 000 (DO 430 NO H (D index 81,83 (Si’ sj) 8:] 1 1,3 32 12323% 2 1,3 32 0212323212 5 3,1 12 323212 4 3,3 32 12323% 5 0,3 3% 0212323212 6 0,1 12 323212 7 3,3 3% 0212323212 8 3,0 02 12323212 Checking Table for M5 Using the checking table, each TTsi, s3) concatenated with ISs constitutes a sub-experiment. Overlapping is J commenced by assuming an initial index. The sub-experiment corresponding to this index begins the CE. The next 66 candidate for the CE is the sub-experiment whose index is the subscript of the second member in the CE. If all of the candidate's members coincide with the CE's members, when positioned over the CE member whose subscript nominated the candidate, the two overlap and any members of the can- didates extending beyond the CE are retained. The next rightmost subscript is attempted to find a candidate and if its members do not coincide, the next rightmost sub- script is attempted until no next right subscript exists. Any index which has been used previously is never reused. This process is illustrated in the following assuming the initial index 2. index T(si, sJ)ISs 0 0 0 0 o 0 2 320816313413 0 0 0 0 0 8 0816313213 0 0 0 0 0 0 0 1 4 34153237 7 320212323212 2, 8, 6, 4, 7 320212323212323§c212323212 Each experiment generated through overlapping is now a new sub-experiment. Suppose the overlapping and merging of the previous sub-experiment resulted in the following list of sub-experiments. 2, 8, 6, 4, 7 32021232321232320212323212 S. 5 3%0212323212323212 0 0 0 1 1 31153237 Since each new sub-experiment indicates which state the machine will be in after its completion, any further concatenation of them requires that the next sub-experi- ment begins from that state for the sake of continuity. In this example, sequence "2, 8, 6, 4, 7" ends in state A and since sequence "1" begins with a transition from state A, the two sequences can be concatenated. In the event that a sequence does not begin with a transition from the final state of the previous sequence, a transfer sequence must be found such that the two sequences are properly connected. Hence, the transfer sequence matrix becomes useful. To concatenate "l" with "5, 5", the transfer sequence 7(D, C) is required. A B C D O 0 O 0 0 O A BIA5 B1 D208 D2 0 O 0 O 0 0 O B A3 B4 A5D208 A3D2 O O O 1 0 1 C A6 A6B1 D508 D5 0 O O O 0 O 1 D C8A6 CBA6B1 C8 D7 J ; M5 Transfer Sequence Matrix 68 Thus, 021s the correct transfer sequence which will per- form the concatenation and would appear as "1" C2 "5, 5". The final sequence is The overlapping and concatenating procedure is not fully deve10ped in algorithmic form. The procedure is outlined to illustrate how it is done. Basically, it is a covering problem where each sub-experiment in the checking table must be covered in the CE. To find the shortest CE possible, the branching method (56) is utilized in the covering. 4.5 Fault Detection Algorithm The capability of state identification sequences to verify the present state of a sequential machine suggests a procedure which can be used to detect faults in a sequential machine. This procedure is contingent on one condition, Boute (4). Condition I For all 81‘ 83 and Sit’ sjt state pairs (t refers to a machine being tested for faults) which have identifi- cation sequences such that 69 Fxsit’ Issi) 815(81’ Issi) and P\. 82 Definition 5.5 Machine M. is said to realize the gtgtg,behavior of machine M iff M. realizes M with an assignment (L, K} )0 such that t maps each state of M onto a single state of M. and t is one-to-one. In this case the relations in Definition 5.1 reduce to the following: 1) 7f(b(s), K(x3)) = LCO(s, xj)) for all s in S and all x.j in I, 2) )x(p'(t(s), K(xj))) = p(s, x3) for all s in S and x. in I. J Definition 2.4 The serial connection of two machines Ma = (889 Ia, 0a, 728' Pa) and-AMI) ‘ (Sb' Ib’ 013,771), Pb) for which is the machine M = Mam Mb = (88 X Sb, Ia, Ob, n, P) 85 where 77((59 13), xk) ‘3 (778(8’ xk)’77b(t’ Pa(ss Xk))) and P((Ss t)9 xk) =Pb(t’pa(s’ Xk)). Definition 5.5 The parallel connection of two machines Ma = (Sa’ Ia’ Oa’ 77a’ Pa) and IWb 3 (Sb’ Ib’ Ob’nb’ Pb) is the machine M = Ma// Mb = (88 x Sb, Ia x lb, 08. X Ob, 779 P) with 77((59 t), (x89 xb)) (778(8’ xa)"77b(t’ xb)) and P((S’ t)! (xas xb)) (Pa(39 xa)’Pb(t’ xb))° Assuming that a machine is prOperly decomposed as Hartmanis and Stearns prescribe, then we have the 84 following definitions. Definition 5.6 The machine MamMb (Mafl’Mb) is a serial (parallel) decomposition of a machine M iff MacDMb (Mafl’Mb) realizes M. Definitiong§.Z If MamMb (MaA/Mb) is a state behavior realization of a machine M, then we say that Mame (Ma// Mb) is a serial (parallel) decomposition g£_the state behavior of M. We say that the state behavior decomposition of M into MamMb (MaA’Mb) is nontrivial iff Ma and Mb each have fewer states than M. To illustrate these notions of ma- chine decomposition, consider M6 and its serial decompo- sition which is given in Hartmanis and Stearns. 0 1 1 5,1 3,0 2 3,0 4,0 3 1,0 5,1 4 2,0 3,0 5 1,0 4,0 "6 After decomposition we have: 85 O l W 3' A B,w B,y a c,l a,0 B A,y B,w b a,0 b,0 b 0' a O M C 9 9 6a M 6b where the two machines are connected as follows. (M ___. 6a'————‘W 6g O The composite machine is M6 and = (S x S a b 6a 6b 9 1689 .06b’ 779 P) where 77 and p are given in the following table. O I (A, a) (B, c), 1 (B, a), O (A, b) (B, a), O (B, b), O (A, c) (B, b), 0 (B, a), 0 (B, a) (A, a), 0 (B, c), l (B, b) (A, b), O (B, a), O (B, c) (A, a), O (B, b), 0 86 The mapping L is: l - (A, a), ¢ (A, b), ¢ (B, a), (B. b). 9 (B, c). \n-Pwm 0 The mapping K is the same as >\ which is: 090, 1"]... We can now show that testing the network of sub- machines does test the original machine. Theorem,§.2: Given a serial (parallel) decomposable machine M whose state behavior is realized by M. = MaQDMb (M. = Mafl’Mb) with M8 and Mb the components of decompo- sition, if Ma and Mb are tested and found to be free of faults, then M is free of faults. Proof: Suppose Ma and Mb have been tested and are free of | .faults but, M is not. Then the composite machine M (MacDMb 0r Mafl’Mb) does not have a fault and M does. 87 t We know that the composite machine M is derived from Ma and M by Definition 5.4 or 5.5. Since Theorem 5.1 b states that M. behaves the same as M for all finite input sequences and L maps states of M one—to-one and onto states of M', then the determination of M8 and Mb to be free of faults also determines M to be free of faults is proven by contradiction. The significant feature in Theorem 5.2 is that the interconnection of M8 and Mb behaves just as the original machine M does. This is true whether or not the decompo- sition represents internal assemblies or not. Since checking experiments are concerned with measurements on the machine's terminals, only the behavior of the machine is observed and hence, the testing of an abstract repre- sentation (MamMb or MaA’Mb) yields the same results as testing the machine M itself. In a fundamental theorem of machine interconnections, Hartmanis and Stearns prove that if a given machine M has certain structural properties, than there exists a network of interconnected machines, N, such that the network real- izes the state behavior of M and is a natural decompo- sition of M. Thus, for some machines, decomposition into a network of submachines is an actual representation of the machine's construction. This also states that a net- work of machines used to construct a given machine M is also a decomposition of M. 88 Now we are able to identify a particular structure or system which previously was not evident in the machine. This increase in information concerning the machine shows us either how to design complex machines from an inter- connected set of simpler machines or how to represent a given complex machine as a system of simpler machines. The remainder of this section is used to discuss some of the structural characteristics of decomposable machines and introduce the notion of component machine testing. Definition 5.8 A partition'U on the set of states of M is said to have the substitution progerty iff r = t (U) implies that 77(1‘. x3) =77(t. x3) (U) for all xj e I. Definitiong5gfl If‘U1 and Uh are partitions on S, then U’l'U2 is the partition on S such that r = t (L5"U2) iff r = t (U1) and r . t (U2). 89 Definition 5.10 All finite state sequential machines are either: a) Class I; machines whose decompositions actually represent their physical construction, b) Class II; machines whose decompositions are not representative of their physical construction, 0) Class III; machines which cannot be decomposed. In parallel decomposition, the next state of Ma (Mb) is independent of the conditions in Mb (Ma)’ i.e., the next state for Ma (Mb) depends only on the present state of Ma (Mb) and its input. Thus, the state variables of M8 and Mb are independent of each other. Hartmanis and Stearns have proven that a nontrivial parallel decompo- sition of a machine's state behavior exists iff two par- titions (U1,'U2) with substitution pr0perty such that Uh“Ué s 0 exist on the state set of the machine. We assume that parallel decomposable machines are of Class I as L maps state variables of M one-to-one and onto state variables of M. = Ma// Mb. In serial decomposition, Hartmanis and Stearns prove that a machine has a nontrivial decomposition of its state behavior iff a partition.L’with substitution property exists on the state set of the machine. Their proof gives a method of constructing a second partition p. such that tl‘fL = 0. Thus, the serial decomposition of M yields two 90 independent machines Ma and Mb whose state variables are independent. In each type of decomposition Ma computes the conditions for one set of state variables of M and Mb computes the conditions for the remaining set of state variables. Since ”‘0 is constructed from [..L which is not necessarily unique, then we assume that serial decompo- sitions are generally Class II. If M. is (Ma/l' Mb or Mam Mb) and "a and Mb are actual machines which have been interconnected to produce a larger more complex machine (design process), then we can determine that Ma A/Mb 0r MamMb is definitely Class I and the decomposition is actually the construction. It has been proven that testing the network of sub- machines will test the machine this network realizes. How does one test the submachines? This is the problem that needs answering. The answer is evident from the dis- cussion which follows. Machine partitioning and decomposing into a network of submachines makes available checking experiments for known portions of a machine which was initially not prac- tical to test. However, this action does not necessarily result in trivial solutions. We must not forget that the submachines for which checking experiments are found exist within another larger machine. The total machine is the immediate environment for this machine, which could con- ceiveably change conditions in that environment through its responses during the experiment. 91 From knowledge of the overall machine, access to the. submachine terminals must be ensured. Path sensitizing, Roth et a1. (44), which has been developed for combin- atorial circuits can be incorporated into the fault de- tection scheme. This would ensure that each time an input is to be applied to the submachine the configuration of the overall machine iii such that the submachine is: l) in the correct state, 2) its terminals are sensitized, 3) its outputs are observable. If the inputs could not be sensitized and/or outputs observed, then a necessary set of testpoints and their locations could be defined as an alternative. Since parallel and serial decomposable machines are represented by different abstract models we use two different sections to investigate faults in these networks. Thus, the follow- ing two sections present the problem of faults in parallel and serial decomposable machines respectively. 5.2 Parallel Decomposition and Fault Detection The application of a fault detection experiment to a machine which we consider to be the parallel connection of two machines (Ma and Mb) is accomplished differently than testing the machine which was originally decomposed into Ma and Mb. The structure we consider is as follows. 92 Input { The box labeled A is the combinatorial logic used to accomplish mapping 0a x 0b to the original output. Each machine is to be tested separately and thus, Theorem 5.2 ensures that the original machine is tested. Therefore, this section considers the problem of how to test each ma- chine by examining faults and the role they play in parallel decomposition. Das et al. (11) discuss a method for fault detection in parallel decomposable machines. Their method imposes many restrictions on the characteristics of the component machines and possible faults. Although they state that the restrictions are not stringent, the class of machines and the types of faults which can be tested greatly limit the use of their procedure or its results. The re- strictions they impose are: 1) It is assumed that the component not being tested and the combinatorial circuit are free of faults, 93 2) It is assumed that each component machine possesses a component distinguishing sequence, 5) The machine is strongly connected, 4) No fault results in an increase in the number of states, 5) Each component machine can be placed in the desired reference state by a £2253 adaptive experiment, a synchronizing Sequence, or a special reset input. It is true that the state variables for the component machines are independent but, when laying out the masks for integrated circuits or circuit boards one does not make a special effort to separate the state variables to- pographically. It is therefore not generally assumed that they cannot short together. Also, since the state vari- ables usually act as inputs for the output logic, we would not want to assume that a fault in the output logic could not also be a fault in a component machine. Since distinguishing sequences are not generally pos- sessed by all sequential machines, the probability of a component not possessing such a sequence exists and thus,“ this procedure would be useless. The limitations imposed by restriction 4 suffers from the usual critical limi- tations of faults which do not increase the number of states. In restriction 5 the emphasis is placed on ghggt sequences. The upper bound on a synchronizing sequence 94 for an n-state machine, if one does exist, is (n-l)2n/2 or on the order of n5. Since the sequence of concern here is to be applied many times in their procedure, this is very limiting. One is therefore required to rely on adaptive experiments or special reset inputs. To overcome the very limiting restrictions of their method, the following theory is given which enables one to accomplish fault detection in both parallel and serial de- composable machines without the constraints imposed by Das et al.. Rather than attempt a general procedure, we dis— cuss how faults in the machine map to the component ma- chines and the inverse mapping. Theorem 5.5: Given that M. = Ma//Mb and the only fault in the com- ponent machines is a single next state fault in Ma, then there are IsblAIbl next state faults in the table of com- I posite machine M . Proof: From Definition 5.5 we have: i s =saxsb, I I =IaxIb, 77((1'9 17): (xi’ 3(3)) = (7780?, xi)’77b(t’ 13))- I Now in S there are lel pairs representing states where r 95 is the first element and there are lIbl pairs representing inputs for M. where x1 is the first element. Thus, in lIbl columns having xi as the first element, there is an incorrect next state entry for each of the Isbl rows having r as the first entry. When machines Ma and Mb have their inputs physically tied together, then the (xi, x3) pairs, where i f j, are not allowed. Hence, the only input pairs that are allowed are (x1, xi). Given this arrangement, a next state fault in M8 results in Isbl faults in M. as the IIbl terms in Theorem 5.5 are dropped. Lemma 5.1: Given that M. = Ma// Mb and the only fault in the com- ponent machines is a single output fault in M8, then there are Sb Ib output faults in the composite machine M.. Proof: This lemma is proved in a manner similar to the proof for Theorem 5.3. Theorem 5.4: If there is a single next state fault in machine M I whose parallel decomposition (M - MaA’Mb) realizes the state behavior of M, then M has a possible IIaIISal+|Ib||Sbl-2 additional next state faults. 96 Proof: Given that the next state for state s1 and input xk is a fault, then ncsi. xx) - n'ucap. Keck» - 77((r. t). (xm. tn)) - (nae. xm).77b(t. 1:11)). There are three possible cases and they are: a)‘fig(r, xm) is a fault, b) Db(t, xn) is a fault, c) both a and b are true. If a (b) is true, then there are a total of 'Iblisbl (llallsal) next state faults in the composite machine as given in Theorem 5.5. Now if c is the case at hand, there are |Sb| states in M. which have r as the first element and lSaI in which t is the second element. For the lel (lSal) states which have r as the first element (t as the second element) there are lIbl (IIaI) inputs in M. where xm is the first element (xh as the second element). Now consider the state (r, t) and input (xm, xn). In this situation two of the next state faults occur in the same next state of M. and thus, we have (lsb||1b|+|sallla|-l) total faults in M . Omit the original fault and there are (|sb||1b|+|sa||1a|-2) additional faults in M'. Since 97 Ma//Mb is a state behavior realization of M, t maps each I state of M onto a single state of M and b is one-to-one. Therefore there are a possible (lellIb|+ISa||Ia|-2) add- itional faults in M. For machines which are actually constructed from com- ponent machines, the number of faults are those given in Theorem 5.4. The significance here is that decomposable machines exhibit patterns of faults associated with the independent state variables. This is the type of inform- ation that was previously not available in the machine M. By considering M to be realized by M. = MaA/Mb we gain from structural properties otherwise hidden. There are two ways in which we can check for faults in sequential machines. These two methods are essentially the same, only the precedure is different. These methods are: 1) Apply checking experiments to the submachines Ma and Mb to determine if they are free of faults, 2) Apply a checking experiment to the machine using knowledge of the underlying structure to derive the checking experiment. A suggested method for fault detection in Class I ma- chines which have a parallel decomposition is to spot check transitions in M (procedure 2). The decomposition of M 98 yields the necessary information on the spots (transitions) which are to be checked. The transitions which are checked are those which represent a set of transitions which would be faulty if the representative is faulty. Theorem 5.5 proves that due to the structure of the machine, there are a set of transitions which are faulty given a fault in a component machine. Thus, we perform a systematic spgt checking experiment on the machine which is realized by Ma// Mb using information from Ma// Mb. The procedure utilizes a transition covering table which is constructed as follows. 1. Construct a k x k array (D) where k is the number of transitions available for the composite machine, i. e., k = '8' III. 2. Construct the flow table for the composite machine Ma//Mb. 5. Label the transitions in the composite machine such that the transition for row 1 column 3 of the flow table is tm where m = J-l+(i-l)|I|+l. 4. Now the entries in the array are computed using the following. f a, if t3 is faulty when ti is because of Ma’ dij . ( b, if tJ is faulty when ti is because of Mb, \ blank, otherwise. 99 We demonstrate this construction through the following example. Consider the machine M7 and its decomposition. 0 l A F,O C,O B E,O D,1 C B,O E,l D A,O D,1 E D,1 A,l F 0,1 3,1 M7 0 l O l a c,O b,O g h,O g,O b a,0 c,l h g,O h,l c b,1 a,l M 7b “7 a Now the composite machine M; is M7 A’M7 which is as b a follows. »M 7 Input LA} Output M " (a: " (a: (be MMUOWP I a) h) s) h) s) h) (a. (a: (b, (b. (c: (c: As an example of transition t1, where 131 '77((39 S): O) " (7)3(39 0)177b(b9 0)) ' (c: h)- Now if t1 is a fault because of M7 , then every state pair a which has an "a" as its first element has a faulty tran- sition when given the "0” input. a) h) a) h) s) h) 100 o 1 (e. h). 0 (b. s). 0 (c. a). 0 (b. h). 1 (a. h). 0 (c. a). l (a, 8). 0 (c, h), 1 (b. h). 1 (a. s). 1 (b. s). 1 (a. h), 1 n; 0 l . t1, 0 t2, 0 t5, 0 t4, 1 t5, 0 t6, 1 t7, 0 t8, 1 t9, 1 t10,]. tn, 1 1:12, 1 Labeled Transitions computing entries in D consider Thus, in row t1 of D an 101 "a" entry is made in column t1 and t5. If t1 is a fault because of M7b, then every state pair which has "g" as its second entry has a faulty transition when given the "0" input. We see that in row tl "b" entries exist for columns t1, t5 and t9. The completed array D for M7 follows. t1 t2 t5 t4 t5 t6 t7 t8 t9 t1o tll t12 tl ab a b b t2 ab a b b t3 a ab b b t4 a ab b b t5 b ab a b t6 b ab a b t7 b a ab b t8 b a ab b t9 'b b ab a t10 b b ab a t11 b b a ab t12 b b a ab Transition Covering Table for M7 If the set of transitions {t2,'t3, t5, t8, th’ tll} are used, then each transition in M7 has been checked in both M7 and M7 . Thus, to check all 12 transitions for b a faults we need only check this set of 6 transitions. 102 For larger machines, the savings are more impressive as we are concerned with checking two smaller machines rather than their product. Definition 5.11 A key transition is a transition which belongs to the smallest set of transitions which cover all transitions in the transition covering table and the smallest set or sets is called a set 2; keys. There is much more work which is suggested by the procedure which has been initiated here. The object of this section was to uncover a correspondence between parallel decomposition and faults in machines which have a parallel decomposition. This correspondence has been found and we now investigate the notion of serial decompo- sition and faults in machines which have a serial decompo- sition. 5.5 Serial Decomposition and Fault Detection This section is concerned with faults in machines which have a serial decomposition. The structure con- sidered in this section is the following. Input { 105 The strategy is, as in parallel decomposition, to test each submachine separately to fulfill the fault de- tection criteria given in Theorem 5.2. A special case which fits into the realm of serial decomposition is that of iterative arrays of machines. An iterative agggy‘gf machines is the interconnection of identical machines used to form a more complex structure. Breuer (7) considers a linear cascade of identical machines which appears as the following. x1 x2 . xn I I . 1 y ‘ y y —L! I m l m 2 n-l m L_’ y1-~ 1*-—‘-* 27—"'—--1 n yn+1 :72 T y5 y1'1 1 z1 22 Zn Breuer demonstrates how this network of identical machines when connected as a parallel adder can be checked for faults by two experiments of length 8. Thus, one does not need to check 22n additions and resets to verify that an n bit adder is free of faults. Although this network is a serial connection of identical machines, there is the added feature-that out- puts (zi) of each machine are accessible and each machine has inputs (xi) which can be applied directly. These added features make testing for faults simpler than testing the general serial decomposition, MaG)Mb. Kautz (50) and Menon and Friedman (58) have also investigated 104 faults in iterative arrays of machines. Rather than examine serial decompositions which are of a special nature, we consider MacDMb in the general sense to illuminate possible fault detection methods which encompass a larger class of machines. Theorem 5.5: Given M. a MamMb and the only fault in the component machines is a single next state fault in Ma, then there are le| next state faults in the composite machine M'. Proof: From Definition 5.4 we have: S 2 Sa x Sb’ 77((1'1 15), xk) 3 (778(1" xk)’77b(t’ pa(r9 1k)))0 In S. there are |Sb| pairs representing states where r is the first element. Given that‘fiL(r, xk) is the single next state fault in M8, each of the leI state pairs having r as the first element would have faulty first element entries for column xk in M'. From Theorem 5.5 we see that a single next state fault in Ma causes M. to experience a set of faults. This is the same result which was noted in the case of parallel decomposition. The situation where next state faults 105 occur in Mb is different, as there are two characteristics in serial decomposition which are quite different than the parallel decomposition and they are: 1) Mb is not necessarily unique, 2) lb . 0a. Theorem 5.6: Given M's MamMb and the only fault in the component machines is a single next state fault in Mb, then there are [Sallll possible next state faults in the composite I .machine M . Proof: Let 77,)(t, pa(r, xk)) be the next state fault in MD. Since 77((1‘9 13): xk) " (778(1'9 xk)’7?b(t’ Pa(r9 xk)))’ then every next state entry (p, q) in M. which has q =77b(t, pa(r, xk)) is a fault. There are [88' states in Sa x Sb where t is the second element in a state pair. The possibility exists that Ma produces the same output for all inputs xk in I regardless of the state of Ma' Since 08 a Ib, every input xk in I will transfer the [Sal states of M having t as the second element to a state pair in which the second element is incorrect, because 106 ‘fl%(t, F%(r, xk)) is incorrect. The number lsa||1| yielded in Theorem 5.6 is derived on the basis that 08 has only one member in its alphabet. A machine which always responds with the same output re- gardless of the input is not very interesting and thus, it is assumed that a practical number of next state faults in M. due to a fault in Mb is ISaIIII-l. We concern our- selves only with nontrivial submachines. Because the next state function for M. is dependent on the output of Ma, the occurence of output faults in M8 I can cause next state faults in M . Theoremg5.2; Given M. s MaQ)Mb and the only fault in the component machines is a single output fault in Ma, then there exist leI possible next state faults in the composite machine . M . Proof: There are 'Sbl states in Sa x Sb where r, a state in * Ma’ is the first element in a state pair. If F%(r, xk) is the given single output fault, each of the lsbl states in M. having r as the first element of a state pair will have a next state fault in column xk whenever 77b(t' Pa(r’ xk)) “7713(179 R;(r, Xk)). 107 In Mb it is possible that 77.00:. pee. xk» mbw. ,cgcr. x,» for all leI states in Mb. Hence, it is possible to have [Sbl next state faults in M. when a single output fault exists in Ma' Theorem .8: If all transitions in M. = MamMb are tested for faults due to next state faults in Ma, then it is not necessary to test for faulty transitions in M. due to output faults in Ma' Proof: Consider the next state function in M. which is 7)((r. 15). xk) - (773(1'. xk).77b(t. paw. xk))). Now if‘flg(r, xk) is faulty we know from Theorem 5.5 that the lsbl state pairs in M. which have r as the first ele- ment will have next state faults for input Faults due xk. to F%(r, xk) can cause transitions in M. to be faults only when MA is in a state pair which has r as the first ele- ment and xk is the input. Hence, the faulty transitions in M. which are caused by“flk(r, xk) being faulty would in- clude the transitions which would be faulty due to a fault 108 in Fg(r, xk). The only difference might be the faulty next state. Thus, if all transitions in M. are tested for faults due to next state faults in Ma, we need not check for next state faults in the transitions of M. due to output faults in Ma. The significance in what has been given here is that in M. = MamMb which realizes the state behavior of machine M, the underlying structure of M results in a given fault in M being a member of a set of faults. Hence, when testing one transition we actually check portions of other transitions and it is possible to find a set of keys which allow systematic spot checking in machines having a serial decomposition. Theorem*5.9: If there is a single next state fault in machine M whose serial decomposition (M. a Maa)Mb) realizes the state behavior of M, then M has ISaAII|+|sb|'1 possible next state faults in addition. Proof: Given that the next state for state 31 under input xk is a fault, then 109 77(2),. xk> a 77'. Keck» - vim. t). x3) = ma(r. X3),77b(t, pa(r9 xj))) results in seven possible situations which are: a)‘flL(r, x3) is a fault, b) 7],)“, pa(r, xa.)) is a fault, c) F%(r, x3) is a fault, d) a and b are true, a) a and c are true, f) b and c are true, g) a, b and c are true. From Theorem 5.8 it is evident that c and e can be omitted as the number of faults due to a will cover c and e. It can be concluded that the number of faults is greatest when d is the situation as d covers a, b and g. There are lelstate pairs in M. which have r as the first element and IS state pairs where t is the second element. When aJ ‘flL(r, xj) is a fault, Theorem 5.5 yields the result that ISbl faults are possible and when 77b(t, pa(r, x3)) is a fault, Theorem 5.6 yields the result that lSal|I| next state faults are possible in M'. Since one state in M. has both r and t as first and second elements respectively, there are ISb|+|Sa|lIl-l possible next state faults in M'. Since MamMb is a state behavior realization of M, the llO assignment (1., K, A) maps each state of M one-to-one and I onto states of M . Therefore, there are a possible Isb'+lSaIIIl-l next state faults in M. The results obtained here are essentially the same as those obtained for parallel decomposition with the ex- ception that the number of faults is different. We can therefore use the same methods as used in parallel decom- position. To illustrate this, consider M6 and its serial decomposition. From M6 we compute the following tran- sition covering table. t1 t2 t3 t4 t5 t6 t7 t8 t9 th tll t12 tl ab a a b t2 ab a, a b t3 a ab a. b t4 a ab a b t5 a a ab b t6 a a ab b t7 b ab a a t8 k) ab a a t9 b a ab a t10 b a ab a t11 b a a ab t12 b a a ab Transition C0vering Table for M; lll Investigation of the covering table yields the following set of keys. {t1’ t2' t9' “10’ tll’ t12} Now that it has been shown that the structure of a machine holds information which allows it to be tested by syste- matic spot checking or by testing the component machines, we investigate and discuss how this is beneficial. 5.4 The Benefits of Machine Decomposition Machine decomposition and partitioning provides bene- fits which make fault detection more feasible for some ma- chines. These benefits are given in the following. 1) Fault detection experiments can be derived for ma- chines whose flow table definition is not well de- fined. This is accomplished when the machine can be partitioned into a network of known submachines or is constructed with known submachines. Suppose three machines are connected in parallel and each machine has 16 states, than the composite machine has 4096 states. Although we could define a flow table for the composite, it would be impractical since we know that the composite can be checked for faults by checking each submachine. 112 2) Since the number of states in the composite ma- chine is the product of the number of states of each component machine in machine decomposition, fault detection experiments are derived for smaller machines. 5) Some machines which do not possess a complete state identification sequence set can be decomposed into a system of machines where each submachine does have a complete state identification sequence set. We shall show in a later example a machine of this type. 4) Decomposition into a network of submachines may identify some submachines which are not neces- sarily important to test. Thus, in the event that it had been decided to test some of the submachines a savings in resourses is the result. To illustrate the use of machine decomposition in fault detection experiments, consider the following Moore sequential machine M8 and its decomposition which is given in Kohavi (52). Column 21 represents the output for ma- chine i. 113 x state 0 l 28 A D G O B C E O C H F O D F F O E B B O F G D O G A B O H E C 1 Ma The parallel-serial decomposition of M8 results in three component machines M9, M10 and M11. X 291 0 1 z9 00 01 10 11 210 Q Q 0 T T U U U 0 P 1 ”9 ”10 29x 00 01 10 11 211 Y 0 11 114 The composite structure of these three machines is as follows. 10 :t-——4~«w M9 £::::}————28 ‘ —_1 M ___J 11 Assume that the length of a checking eXperiment is proportional to n2p, where n is the number of states and p is the number of inputs. Now for the composite machine M8 the experiment length would be 128 input symbols. For the three component machines we have for M9, M10 and M11 that the experiment length is 8, l6 and 16 input symbols re- spectively. As another illustration of benefit from decompo- sition, consider M6 and its decomposition. Application of the State Identification Sequence Algorithm (Section 5.1.1) and the working form notation (Section 4.2) yields the following incomplete set of state identification sequences. IS . 51 . 0/1 1 1 _ 0 1 ‘ IS - 51 a 1/1 5 6 ' 115 = 000/000, 001/001, 001/000 The two component machines M6 and M6 do have a complete a b set of state identification sequences and they are the following. - s w y s ISB 8 A593 . O/y, l/wo . 1 "sh. Isa 3 Cl 3 "/1, 0 O 5.5 Chapter Summary In this chapter it has been shown that through the use of abstract representations of a machine's structure it is possible to provide fault detection, based on that structure, which is beneficial. The benefits which can be realized have been itemized in Section 5.4. Section 5.1 views the structure theory of sequential machines with the theme that fault detection experiments are the goal. Sections 5.2 and 5.5 consider the global character- istics of decomposition to ensure that a general theoretic approach is the result. The result is that procedures 116 have been outlined which are general and which can be ap- plied to any machine which has a serial or parallel decom- position. CHAPTER 6 ECONOMICS OF CHECKING EXPERIMENTS Knowing how to construct checking experiments does not guarantee that fault detection theory can be economi— cally applied. The following sections provide notions which are concerned with checking experiment application that the literature has not provided in the past. Sections 6.1 and 6.2 are concerned with the definition of costs associated with fault detection, the relationship between these costs and a cost model based on the re- lationship between the costs. Section 6.5 presents a pro— cedure by which one may compute the cost of each appli- cation of a checking experiment. The goal of this chapter is to provide criteria which consider the global analysis of fault detection experi- ments and thus, enable decisions concerning using fault detection or not. This chapter is concerned with Oper- ations Research in the sense of the definition of Oper- ations Research by Stoller (49). Operations Research is the use of the scientific method to provide criteria for decisions concerning man-machine systems involving repeat- able operations. 117 118 The scientific methods are generally used in Oper- ations Research to determine the probability of meeting specified time deadlines within cost limitations (O'Brien (41) and Hillier (26)). In this chapter we use lattice theory and provide a model of fault detection economics that defines a lattice. The fault detection economics model is new and the techniques, although different than the general methods of Operations Research, are of general interest. 6.1 Experiment Cost Application of a checking experiment to a sequential machine is by no means free of cost. Indeed cost may make it not feasible to consider fault detection experiments in some instances. Thus, a critical step in practical fault detection is to consider experiment costs associated with: 1) machine design and preparation for testing, 2) test equipment design or purchase, 5) performing the experiment. Along with these costs, it should be determined whether or not one would need or even want fault detection. Perhaps faults can be tolerated to some extent. Certainly, some entities which can be modeled as sequential machines have a well known behavior and hence, when the behavior changes it is not difficult to infer that a fault has 119 occured. There is also the situation where faults are correctable within the machine itself (Dauber (l2) and Harrison (21)). The fault correcting capacity arises from the fact that once a fault causes the machine to enter an incorrect state, an input sequence exists which brings the machine back to the correct state. Thus, for some period of time the machine goes off track but, at a future time may get back on track. Perhaps these "memory lapses" can be overlooked. On the other hand, it may be absolutely necessary to know that the machine is functioning cor- rectly. Since reliability may or may not be necessary, one determines the cost of providing fault detection and then decides whether or not it would be profitable. If relia- bility is absolutely required or not required at all, the decision is trivial. In other cases a decision can be wisely made on the basis of the cost model which is pro— vided in the following sections. Environmental factors may or may not influence the reliability. When the sensitivity to fail is very great with regard to its environment, a machine will have a high mortality rate. When the sensitivity is small, normal failure rates due to degradation are the result. As a re— sult of sensitivity to environment, the value of a checking experiment becomes greater and hence, its cost should be depreciated in a like manner. 120 6.1.1 Machine Design and Preparation for Testing Given in this section are the particular factors which contribute to the total cost because of special ma- chine design and preparations which must be accomplished. To facilitate fault detection, it is sometimes necessary to: 1) add one or more inputs, 2) add one or more outputs, 5) add testpoints (Williams et al. (51)). Adding an input to the machine involves one terminal and hardware associated with the internal logic. Denote this cost as 01. Additional outputs also require as a supplement, added terminals and hardware associated with internal logic. Denote the cost for additional outputs as 02. For testpoints, their cost is ca. These costs will vary with the type of hardware used in the construction of sequential machines. For large scale integrated circuitry, additional terminals and their access to the internal structure will cost more than wired circuit boards. The associated circuitry for wired boards may cost more than that of an integrated circuit since the integrated circuit needs only to have its "masks" changed. The wired circuitry requires additional hardware for each board. Thus, these costs could be significant in regards to fault detection experiments. 121 6.1.2 Experiment Design If a person is planning on using fault detection ex- periments, someone must write the necessary software to achieveifluadesign. For small machines the cost will be less than for large machines where only components are to be tested. The latter type of machine requires more effort and software engineering resulting in an increase in time spent performing the design. Experiment design cost can be computed in the form of man-hours necessary to take a sequential machine which has been prepared for fault detection and yield the desired experiment. Without the use of algorithmic procedures to aid in experiment design, the man-hours required are great. Usu- ally, a designer applies an input, notes the response, applies another input and observes another response etc.. All of this is recorded and the I/O sequence is then con- sidered a checking experiment. This method is, on the average, time consuming and not always complete. Experi- ment design cost is denoted on. 6.1.5 Performing the Experiment Once the experiment is designed and machines are available for testing, the need for additional hardware to conduct the experiment enters the cost picture. If a com— puter is available, then some type of interactive device is necessary for the computer to cause inputs to the 122 machine under test and observe its outputs. We will call the interconnecting hardware costs c5. In the event that a computer is not available, then the cost for a device which can produce inputs and observe outputs is necessary and its cost is c6. If a device pp perform the checking experiment is not to be purchased but designed and built, this cost is also considered and is denoted 07. Now suppose a sequential machine is to be tested periodically, during its Operation, then the time required for testing is important. This is because, during testing, the machine may not be used for other purposes. It may be that the machine is periodically free and available for testing in which case the cost would not be very important. The cost for "down time" is directly related to the length of the experiment. Since cost must be considered prior to actually deriving the experiment, one must estimate this cost according to a bound which is determined on the maxi- mum length of an experiment. Thus, this "down time" cost can be computed as the amount of real time Operation which is lost to running the checking experiment and is denoted ea. As an example of "down time" cost, suppose a sequential machine is actually controllinga large system. Then since the system function is dependent on the machine, the sys- tem must be made inert during the experiment. For a large chemical processing plant this inert time may be costly if 125 no means of controlling the system during tests exist or else the cost of a replacement is introduced. 6.2 Cost Model From a practical standpoint we are concerned with the efficient use of available resourses to meet fault de- tection requirements. As always, there are certain trade- off situations which are available. As an example, if the decision is made not to add an input, then a checking ex- periment of greater length must be used. This decreases one cost but causes others to increase. To obtain a mini- mal cost for a checking experiment, one must consider all trade-off combinations. The usefulness of the costs which have been defined is dependent on the ability to define how each cost re- lates to the others. The manner in which the defined costs interact is given in the following. Definition 6.1 In the process of making decisions which concern the costs associated with fault detection, if all decisions necessary for the computation of c3 are all necessary or properly contained in the set of decisions needed for the computation of oi, then 01 # 03. Let cO denote the cost of machine preparation when not adding inputs, outputs or testpoints. Further, all L 124 i é 8) are in units of dollars such that all 01 = O. Certainly, c.j # 03 for all j (0 é j é 8). Thus, the relationship (#) is reflexive. cases. (a) (b) (e) (d) (e) (f) (a) (h) c. # c0, for c.j # 01, for c. # c2, for c. # C}, for 0k # 04, for all all all all all i (l 3(4 3(4 3(4 k(5 II\ m m: at c1 am Now consider the following II\ II\ II\ ll\ 8). 8). 8). 8). 8). Given in the following are the details for the re- lationships (a)-(h). (h) Cost 08 cannot be computed prior to c5 as the (1’) amount of "down time" depends on hardware used to conduct the experiment and the interconnecting hardware affects the speed at which the testing equipment can interact with the machine being tested. and (g) The interconnecting hardware cannot be specified until the nature of the device which conducts the experiment is known. (e) Until the experiment has been defined, the amount 125 of memory and computational ability of the testing equipment cannot be specified and hence, the interconnecting hardware and "down time". (b), (c) and (d) The choice of inputs, outputs or testpoints affects the design of the experiment. If inputs are added to a machine to render it more easily testable, then this machine is different than one which had outputs added. In each situation a different machine is the result and the choice affects on and all other 0i (i > 4). (a) We cannot compute 01, 02 or c3 until it has been decided that we are not going to test the machine as it exists. We observe that (#) is anti-symmetric, i.e., (ci # c3) and (c.j # oi), iff c1 = 03. Since (#) is also transitive, we have that (#) defines a partial ordering. The collection of defined costs C = :{ci: O 4 i 4 8} and the partial ordering (#) previously given constitutes a lattice . To show that is a lattice we consider the following two definitions by Berztiss (5). 126 Definition 6.2 Let set A be partially ordered by # and let 0 C B C A (0 is the null set). Then an element a E A is an upper bound (lower bound) of B iff, for all b 6 B, b # a (a # b). The least (greatest) element of the set of all upper (lower) bounds of B is the least upper bound (greatest lower bound) or supremum (infimum) of B, symbolized lub B (glb B) or sup B (inf B). Definition 6.5 Let be a partially ordered set. The system is a lattice iff each pair of elements a1 and a.j of A has a supremum and an infimum in A. Denote sup {a1,a3} by ai o a. and inf {ai,a3} by ai ‘ as. J Thus, is a lattice as the following shows that each pair of elements in C has a supremum and an infimum. 127 This lattice clearly shows how the costs are related. Using this representation, , we are able to construct a cost model such that one can determine the cost of ap- plying fault detection experiments. To proceed in con- structing a cost model we extract from rules to en- sure that the cost model is representative. The following definitions are fundamental in the extraction of the rules we seek and are given in Abbot (1). Definition 6.4 A partially ordered set ’is called totally ordered if for all pi, p5 E P either Pi # P3 or P3 # P10 Definition 6.5 If is totally ordered, then P is called a chain. The lattice ’contains eight chains which span the lattice from 00 to °8' Each chain contains a set of costs and these eight sets are: 128 1) {CO’ 04, c5, °6’ 08} 2) {OO’ 04, c5, c7, 03} 5) 00, c1, c4, c5, 06, c8} 4) {00, 01, 04, c5, c7, c8} 5) { 6) {00, 02, c4, c5, c7, 08} 7) { 8) Consider the pairs of chains (1, 2), (5, 4), (5, 6) and (7, 8). In each pair (3, k) we observe that J and k are the same except that 3 contains c6 and not 07 whereas k contains 07 and not c6. It happens that each pair represents a different possible direction that one would take under machine preparation for fault detection. Also, within each pair the difference is that of how a device to perf0rm the checking experiment is acquired. Thus, we de- note each pair to be a case and the choice within each pair to be an option. This is further defined in the following definitions. Definition 6.6 Four cases exist under machine preparation. Case 1 The machine requires no special preparation for fault detection, i.e., no inputs, no outputs or testpoints are added. 129 Case 2 The machine is to have symbols assigned to its input alphabet only. Case 5 The machine is to have symbols assigned to its output alphabet only. Case 4 The machine will have only testpoints added to facilitate fault detection. In case 2 we have seen (Section 5.5) how fault de- tection can be facilitated through the addition of one input. Fujiwara et al. (16) demonstrates how an easily testable machine can be constructed by using two additional inputs. Case 5 is covered by the material presented in Section 5.2 and Williams et al. (51) discusses a method of adding testpoints. To each of the cases defined above, there are two Options allowed. Definition 6.7 Option 1 The experiment will be performed by use of purchased equipment. Option 2 The experiment will be performed by’use of equipment which will be designed and built. Given the four cases and their options, the °i are computed in the following manner. 1. Select case k (l 4 k é 4). 150 2. Compute ck_1. 5. Compute 04 based on the chosen k. 4. Select an option. 5. Compute the cost of the chosen Option. 6. Compute c5 based on the Option chosen. 7. Compute 08. Definition 6.8 An admissable cost set is a set of Ci (0 4 i 6 8) which have been determined by the above procedure. There are eight possible admissable cost sets. Let each admissable cost set be represented by a vector A(k), where A(k) = (akl’ ak2, ..., ak8) and k = 2(1-1) + 3, where i and 3 refer to the case and option respectively. Thus, for case 4, option 1, k . 7 and A(7) ' (3719 B729 0009 378)- The entries for the kth vector, l 4 m 4 8), are akm ( 151 the particular costs (cm) for the case i, option 3 situ- ation. As an example, consider the situation where case 5,_ option 1, is to be investigated. We can make the following statement: 851 = 853 ‘ a57 ‘ 0' This results because no inputs or testpoints are added and no device for performing the experiment will be built re- sulting in Thus, A(S) = (O, 02, 0, C4, c5, 06’ O, 08). Consideration of all admissable cost sets yields a matrix CM. This matrix represents the cost model and ap- pears as follows where row k is admissable cost set A(k). 0 ° 0 a14 a15 a16 0 818 O O 0 a24 a25 0 a27 a28 a31 O 0 a34 a55 a56 0 a38 a41 O 0 a44 a45 0 847 848 0 a52 0 a5“ a55 a56 0 a58 ° 362 ° 364 365 0 ‘167 868 0 0 a75 a74 a75 a76 0 a78 _ ° ° 385 384 385 ° 987 aas J CM Given a column matrix [I] having eight entries and each entry equal to l, a minimal admissable cost set can be determined. Perform the following multiplication [on] [I] . [COST]. The entry of the resultant column matrix, [COST], which is smallest represents the minimal admissable cost set and is denoted Cmin' Example 6.1 As an example of how this cost model is generated and used to reach a cost decision, consider the following list of given information: a) 31200 to add an input, 153 b) 3650 to add an output, c) 3780 to add testpoints, d) 36450 to design testing equipment, e) $5550 to purchase testing equipment, f) 31670 to design the experiment with no additions, g) 3490 to design the experiment with an added input, h) 8520 to design the experiment with an added output,‘ 1) $660 to design the experiment with testpoints, 3) 81100 for interconnecting hardware case 1, Opt. 1, k) $400 for interconnecting hardware case 1, opt. 2, 1) $1400 for interconnecting hardware case 2, Opt. 1, m) 3570 for interconnecting hardware case 2, opt. 2, n) $1210 for interconnecting hardware case 5, Opt. 1, 0) $260 for interconnecting hardware case.5, opt. 2, P) $1550 for interconnecting hardware case 4, opt. l, q) 3600 for interconnecting hardware case 4, opt. 2, r) $5000 "down time" case 1, options 1 and 2, a) $1000 "down time" case 2, Options 1 and 2, t) 81250 "down time" case 5, options 1 and 2, u) $1500 "down time" case 4, Options 1 and 2. This list yields the following CM. 134 F 0 0 0 1670 1100 5550 o 5000 1 0 O 0 1670 400 O 6450 3000 1200 0 0 490 1400 5550 o 1000 1200 O O 490 570 0 6450 1000 0 550 0 520 1210 5530 0 1250 0 650 0 520 260 0 5450 1250 0 0 780 660 1550 5550 0 1500 b 0 730 660 500 0 6450 1500 J Multiplication by [I] yields the following cost matrix, [COST] . 9300 11520 7620 9510 7160 9130 7820 9990 From this we see that the row 5 entry is smallest and hence, case 5, Option 1 will yield the smallest cost. 155 6.5 Multiple Use of a Checking experiment Through consideration of all trade-off conditions we are able to minimize the cost of a checking experiment. However, the cost which has been computed may seem ex- tremely large if one were to purchase a mini-computer to perform the experiment. The situation could possibly be that even though a checking experiment is absolutely re- quired, the cost which has been computed is prohibitive. To achieve a true accounting of how much a checking experiment will cost, one must consider how many times the checking experiment will be used. Thus, a certain por- tion of the cost is distributed to each use of the experi- ment. First of all, the number of machines to be examined by the checking experiment (N, N e 1) is important as the checking experiment will be used on each machine. Next, -the number of times each machine will be tested (f1, f1 § 1), where fi refers to the ith machine. Now we have as the total number of uses of the checking experiment (T), for N machines is Thus, the total cost CT per use of the checking experiment is CT I Omin/T’ 136 where the same checking experiment is used throughout. Example 6.2 Consider the situation in which the Quikey Keyboard Company which manufactures pocket calculator keyboards de- cides to enter the calculator market. It has been deter- mined that with a minimal expansion, the company can make the entire calculator. An investigation of the market reveals that dealers will not handle pocket calculators unless each one is checked for faults after assembly and prior to shipment. It is further required that one sample in each 1000 units produced be used for reduced life testing. The reduced life test requires that each sample be tested for faults once each hour for a period of 90 days. The members of the board had decided on producing 500,000 units and felt that the company cannot enter the market as the additional financial burden of fault detection will make the product to expensive to compete. Prior to making any decision they decide to study the economics of fault detection further. With the material given in this chapter it is con- cluded that Cmin is 325,720. For each reduced life test sample :21 . (24)(90) - 2160 157 and for each regular unit 3 20 Therefore, with N a 500,000 N T = 2 f1 = 500(2160) + (N-500)(2) i=1 or T - 2,079,000. Now the resultant cost per use of the checking experiment is CT = 0min/T . 25,720/2,079.000 a 3.0114. The resultant cost per fault detection test is 8.0114 which will not affect the calculator price enough to make it noncompetitive. Thus, it is decided to enter the pocket calculator business. 6.4 Chapter Summapy The cost of fault detection eXperiments is considered in this chapter. The particular costs involved in fault detection are defined and their relationships established to construct a lattice . This lattice represents the 158 logical flow of events which are actually encountered in the overall fault detection scheme. From this representa- tion, , a cost model is derived in which one is able to determine an admissable cost set which requires minimal cost. The final section of this chapter considers the cost per individual use of a checking experiment. This notion is useful as it allows one to determine how to defray the total cost which in some cases would otherwise seem pro- hibitive. CHAPTER 7 CONCLUSIONS AND SUGGESTIONS FOR FUTURE WORK 7.1 Conclusions The problem of detecting faults in sequential ma- chines has been considered. The task of verifying the transition function,‘fih presents the greatest barrier to a general solution of the problem. This barrier was found to exist because experimental fault detection cannot di- rectly observe the transition function. Thus, one must make decisions, concerning faults, by the manner in which the machine responds to experiments. An algorithm for finding state identification sequences has been developed. Since some machines do not possess an identification sequence for each of their states, means of augmenting these machines are necessary. The methods of accomplishing this have been presented. Thus, the means of establishing all shortest, existing identi- fication sequences for a state and the means by which a state can be given an identification sequence have been formally developed. The problem of faults which can cause inaccessible states to become accessible has been treated. The general 139 140 problem remains Open but, the case where the sum of acces- sible and inaccessible states remain constant has been solved in the sense that the inaccessible state assignment algorithm will yield state identification sequences for inaccessible states. A procedure for using state identification sequences to detect faults in a sequential machine has been sug- gested which yields substantial results. The special Mealy machine notation, introduced in Chapter 4, contri- butes greatly to this procedure through reduction of ex- periment length by elimination of redundancies. In Chapter 5, the correspondence which exists between faults and the structure of decomposable machines has been presented., This correSpondence is of significant import- ance as indicated by the benefits listed in Section 5.4. It is shown that because of the underlying structure of the decomposable machine, a "systematic spot checking" procedure can be used to verify the machine's transition function. A procedure which determines which spots to check is given in Section 5.2. The importance of the economic considerations pre- sented in Chapter 6 is notable. With the cost model given in this chapter, decisions can be made regarding the feasibility of fault detection. Thus, Chapter 6 makes known the alternatives available to the global fault de- tection scheme and how to accomplish fault detectibn most economically. 141 2.2 Suggestions For Future Work The situation in which faults can cause the sum of accessible and inaccessible states to increase needs at- tention. The possibility of state variables increasing and hence, the total number of accessible and inaccess- ible states increasing does exist and therefore must be considered. Although the fault detecting procedure suggested in this work represents a substantial result it is not com- plete. To illustrate this, M3 had a checking experiment constructed for it which has a length of l2.input symbols and the following experiment 3 7 z z y y 2 z y A1A1B2030606B5A4A1 whose length is 9 is Just as capable in detecting faults. It is suggested that there is a procedure which can decide which state identification sequences to select for the checking table such that the optimum in conciseness re- sults for the checking experiment. Another area that needs work is that of defining, ex- actly, the criteria which state identification sequences must satisfy to qualify them for Condition I. The example worked for M5 shows that the state identification sequences chosen do not test the machine completely. A different choice has been found which does work but, no 142 rules for finding them are available as of now. It is suggested that a portion of the problem rests in the fact that this machine has an input, xm, which has 77(319 Km): P(3i9 Km) 377(339 Km): P<339 x31): where i f 3. This situation yields identical sub-experi- ments in the checking table and thus, the possibility of not checking one of the transitions. In Section 5.2, two ways of determining faults in the class of machines which are decomposable were listed. We pursued one of these two resulting in "systematic spot checking". This "systematic spot checking" approach has only been initiated here. There is more work which needs to be accomplished to make it completely functional. It is suggested that the other avenue to fault de- tection for decomposable machines (verification of sub- machines) merits further exploration.' Two notions to consider are: 1) make the outputs of each submachine observable for fault detection, . 2) make the fault detection scheme an integral part of a design procedure used to construct large com- plex machines from simpler ones. BIBLIOGRAPHY (1) (2) (5) (4) (5) (6) (7) (8) (9) (10) (11) BIBLIOGRAPHY Abbot, J. 0., Sets, Lattices, and Boolean Algebras, Allyn and Bacon, Bostbn, 19697 Abramson, N., Information Theor and Coding, McGraw- Hill, New YorE, 1965. Berztiss, A. T., Data Structures Theopy and Practice, Academic Press, ew or an n on, I971. Boute, R. T., "Distinguishing Sets for Optimal Identi- fication in Checking Experiments," IEEE Trans. Com ut., Vol. C-25, pp. 874-877, Aug. I974. , "Optimal and Near Optimal Checking Experi- ments fer Output Faults in Sequential Machines," IEEE Trans. Comput., Vol. C—25, pp. 1207-1215, ov. . Boute, R. T.; and McCluskey, E. J., "Fault Equiva- lence in Sequential Machines," Proc. Sypposium Computers and Automata, 1971. Breuer, M. A., "Fault Detection in a Linear Cascade of Identical Machines," Proc. Ninth Annual S - posium on Switching and Automate Theory, pp. 5- 245, 1968. Carter, W. 0., "Fault-Tolerant Computing: An Intro- duction and A Viewpoint," IEEE Trans. Comput., Vol. C-22, pp. 225-229, March 1975. Chang, H. J.; Manning, E.; and Metze,G., Fault Dia - nosis of Digital Systems, New York: WiIey-Inter— SC ance, o Chang,’L., "Fault Analysis of Combinational Logic Networks," De t. Com uter Science Michi an State University, TecHnIcIEI Rpt. 75:0I, UuIy I975. Das, P.; and Farmer, D. E., "Fault-Detection Experi- ments for Parallel-Decomposable Sequential Machines," IEEE TranS. Comput., Vol. C-24, pp. 1104-1109, Nov. 1975. 144 (12) (15) (14) (15) (16) (17) (18) (19) (2O) (21) (22) (25) (24) 145 Dauber, P. S., "An Analysis of Errors in Finite Automata," Information and Control, Vol. 8, pp. 295-505. 1965- Farmer, D. E., "Algorithm for Designing Fault De- tection Experiments," IEEE Trans. Comput., Vol. C-22, pp. 159-167, Feb. 1975. Friedman, A. D.; and Menon, P. R., "Restricted Checking Sequences for Sequential Machines, " IEEE Trans. Comput., Vol. C-22, PP. 597-599: April I975. , Fault Detection in Digital Circuits, EngIewood-CIITTB, N. J. *PrefiticeJHall, 51971. Fuijwara, H.; Nagao, Y.; Sasao, T.; and Kinoshita, K., "Easily Testable Sequential Machines with Extra Inputs," IEEE Trans. Comput., Vol. C-24, pp. 821- 827, Augu§t 1975} Gill A., , - Introduction to the Theory of Finite State Machines, 0 raw— , ew or , 1962. Ginsburg, S., "On the Length of the Smallest Uniform Experiment Which Distinguishes the Terminal States of a Machine," J. Assoc. Computing Machinery, Vol. 9113' 9 GOnenc, G., "A Method for the Design of Fault De- tection EXperiments," IEEE Trans. Comput., Vol. C-l9, pp. 551- -558, June 1970. Harrison, M.. A.,, "On Asymptotic Estimates in Switching and Automata Theory," J. Assoc. Com utin Machinepy, Vol. 15, pp. ISI-IE7, I966. "On the Error Corecting Capacity of Finite ‘Automata," Information and Control, Vol. 8, pp. 450-450,1965. Hartmanis, J.; and Stearns, R. E., Algebraic Structure Theory of Sepuential Machines, g ewoo s, o 0: en 06 a , O Hennie, F. 0., "Fault Detecting Experiments for Sequential Circuits," Proc. Fith Annual Sypposium on Switching Circuit T eory an 0g ca eslgn, PP. " 9 0 , Finite State Models for Logical Machines, New York: Wiley, 1968. (25) (26) (27) (28) (29) (50) (51) (52) (55) (54) (55) (56) (57) 146 Hibbard, T. N., "Least Upper Bounds on Minimal Terminal State Experiments for Two Classes of Sequential Machines," J. Assoc. Com utin Machinepy, Vol. 8, pp. EDI-612, Oct. 196%. Hillier, F. S.; and Lieberman, G. J., Introduction to Operations Research, Holden-Day, 197K} Hsieh, E. P., "Checking EXperiments for Sequential Machines," IEEE Trans. Com ut., Vol. C-20, pp. 1152—1166, Oct. I971. Kane, J. R.; and Yan, S. S., "On the Design of Easily Testable Machines," Proc IEEE 12th Annual Symposi- um Switching and Automa a copy, pp. - 2, Oct. 1971. Kamal, S., "Diagnosis of Intermittent Faults in Dig- ital Systems," Ph.D. Thesis, Michigan State Uhi- versity, Jan. 1975. Kautz, w. H., "Testing for Faults in Cellular Logic Arrays," Proc. 8th Annual S oosium on Switchin and Automata Theor , pp.’16 —I7E, I967. Kime, C. R., "An Organization for Checking Experi- ments on Sequential Circuits," IEEE Trans. Elec. Com ut., Vol. EC-15, pp. 115-115, F95. I966. Kohavi, Z., Switching and Finite Automata Theory, McGraw-Hill,*l970. Kohavi, Z.; and Lavallee, P., "Design of Sequential Machines with Fault Detection Capabilities," IEEE Trans. Comput., Vol. EC-l6, pp. 475-484, Aug. I967. Kohavi, I.; and Kohavi, Z., "Variable Distinguishing Sequences and Their Application to the Design of Fault-Detection Experiments," IEEE Trans. Comput. Vol. C-l7, pp. 792-795. Aug. 1968. Martin, R. L., "The Design of DiagnOsable Sequential Machines," Proc. Hawaii Internat. Conf. Sys. Sci., 1968. McCluskey, E. J., "Minimization of Boolean Functions," Bell Systems Tech. J., Vol. 55. PP. 1417-1444, Nov. 1956. Mealy, G. H., "A Method for Synthesizing Sequential Machines," Bell S stems Tech. J., Vol. 54, pp. 1045-1080, ep . 9557* 147 (58) Menon, P. R.; and Friedman, A. D., "Fault Detection In Iterative Logic Arrays," IEEE Trans. Comput., V010 0‘20, pp. 524-555, May 19%. (59) Moore, E. P., "Gedanken-Experiments on Sequential Machines," Automata Studies, Princeton Universit Press, Princéton, New Jersy, pp. 129-155, I956. (40) Murakami, S.; Kinoshita, K.; and Ozaki, H., "Se- quential Machines Capable of Fault Diagnosis," IEEE Trans. Comput., Vol. 0-19, pp. 1079-1085, (41) O'Brien, J. J., Scheduling Handbook, McGraw—Hill, New York, 1969. (42) Poage, J. E.; and McCluskey, E. J., "Derivation of Optimum Test Sequences for Sequential Machines," Proc. Fifth Annual Sypposium on Switching Circuit 6013;? an OLca 33 gig pp. -' g o (45) Roth, J. P., "Diagnosis of Automata Failures: A Cal- culus and A Method," IBM J. Res. Develop., Vol. 10, pp. 278-291, July 1966} a (44) Roth, J. P.; Wagner, E. G.; and Patzolu, G. R., "A Formal Theory of Cubical Complexes," IBM T. J. Watson Res. Cen., Rep. N69-55559, AprII I969. (45) Salomaa, A., Theory of Automata, Pergamon Press, 1969. (46) Seshu, S.; and Freeman, D. N., "The Diagnosis of Asynchronous Sequential Systems," IRE Trans. Elec. Com ut., Vol. EC-ll, pp. 459-465, ug. . (47) Sheppard, D. A.; and Vranesic, Z. G., "Fault De- tection of Binary Sequential Machines Using R- Valued Test Machines," IEEE Trans. Comput., Vol. 0-239 pp. 552-5589 ApriI 19710 (48) Stark, P. H., Abstract Automata, North Holland Pub- lishing 00., 1972. (49) Stoller, D. S., O erations Research: Process and Strategy, UniversIty of CaIIfornIa Pfess, Berkely, (50) Warshall, S., "A Theorem on Boolean Matrices," J. Assoc. ComputingMachinery, Vol. 9, pp. 11-127 Jan. 1962. 148 (51) Williams, M. J. Y.; and Angell, J. B., "Enhancing Testability of Large-Scale Integrated Circuits via Test Points and Additional Logic," IEEE Trans. Com ut., Vol. C-22, pp. 46-60, Jan. . "I71111147441111“