«flaw? . 3' v. 1.. 1.31 :1 .u‘...‘ .. 1;: .. .. 1.1.5. , a A!» x («it tin-IF, . .11)... 11.; in)”. no.3... v u”1‘ll 1‘ .a—‘nllcl'A-‘l In}?! 1/ 0“. a i 5? I4 . . . ‘ firstvtozpiiibltfzv 1:... 6 a. ‘ 0.. p30 5;??? 4: , V «S.- Klb.rprl..f. 1» u... . V . gift-15 of? . . . a! .3 , z 3... r! 3!“. (:1 .4! .l! .I I556, vpgl'iavlkt 06):; . . 3:13.... §.=rpt ...o;... J . .Irr..vut tifu . »>5z.. 1v: ~..:5. . :1 [.13. (-5.! .53l.l,u.l.rf‘rl..>.) air lit-ll}: E. 3... 2), .. x... i 9 FI.IA hirliuitpf. a 3,-3‘17rlvil. ’53:...LE'PT!‘ It I .1} 9‘11» (.5 AI.3.!:..' 2.... I€\\Iar\ , uni-3.1.2.35. 1.93.62. .sl: . Ito-.9 I v vb 1.... . . .. 1.. kl-.!-3001’.{7!: .IVAIL'IV ctnfiiv . Cr! ..r1:?..\i L..v ,v... I...Ve¥.Ii-. .‘ . .. 5.} a l. ..|.'|.l . ‘ . , n|rs v.0vu.l: :u , I... ‘ . wt ., ‘ . . litigant OCHIT q ._ . .. 5.1éqsguxkuéwxwfié a 1‘31! y ‘ ' I. ||Il|| . ' lllllllllllHIIHIIIIll!IIIIIIIIHIHIIIllllllllllllllllllill 31293 00885 7249 This is to certify that the dissertation entitled Design and Synthesis of On-Line Testable Sequential Circuits presented by Chia-Shun Lai has been accepted towards fulfillment of the requirements for Ph.D. degree in Electrical Engineering Major professo{ Date 471,97 0 -9527 v MSU is an Affirmative Action/Equal Opportunity Institution 0-12771 LIBRARY 2* Michigan State University r-—— _ PLACE lN RETURN BOX to remove this checkout from your record. TO AVOID FINES return on or before date due. DATE DUE DATE DUE DATE DUE —T r: JLL T TV—i MSU Is An Afiirmetive Action/Equal Opportunity Institution encirchma-pJ _.—._—.._._—fi,._ _ ___l _ DESIGN AND SYNTHESIS OF ON-LINE TESTABLE SEQUENTIAL CIRCUITS By Cilia-Shun Lai A DISSERTATION submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Electrical Engineering 1993 ABSTRACT DESIGN AND SYNTHESIS OF ON-LINE TESTABLE SEQUENTIAL CIRCUITS By Chin-Shun Lai With ever-increasing complexity of digital applications, the issue of reliability has become very important in today's VLSI designs. Concurrent error detection schemes using redundancy design approaches have been sucessfully implemented to enhance chip yield and system reliability. Redundancy design approaches may include hardware redundancy, time redundancy, and information redundancy. The simplest concurrent error detection scheme is the duplication with comparison. Any mismatch between two identical blocks will indicate an error. However, this involves more than a 100% increase in hardware overhead cost. Information redundancy involves the use of coding techniques that enhance circuit capability for reliable operation. Berger codes are least redundant separable codes, they have been implemented for fault-tolerant, fail-safe, and concurrent error testable digital circuits. The features of high speed and low hardware cost are highly desirable especially for a checker design. In response to these needs, a design methodology using a partitioning and folding scheme is deve10ped for the design of fast, yet low hardware cost, Berger code checkers with self-testing capability. the it is ( 5m: :38 sq- is is. (its lzyc Since the hardware cost is increased almost exponentially as the code length of the checker increases, the use of many smaller checkers will require much less hardware cost than that of a larger checker. In this study, an efficient output function partitioning scheme is developed to partition the set of output functions into many smaller subsets so that smaller checkers can be employed. Based on the delay constraint derived from a set of design specification, the checkers which achieve minimal hardware overhead are chosen. Experimental results show that, with the developed partitioning scheme, the hardware cost may be reduced considerably. With such a low hardware cost checker, on-line testability becomes very promising and practical. In this research, a system, SOLiT, for automated synthesis of on-line testable sequential circuits with multi-level logic implementation has been developed. The system is implemented on Sun/4 workstation in the C-language. The system receives a behavioral description of finite state machines in kissz format and automatically generates the physical layout for a self-checking circuit. Copyright by Chia-Shun Lai 1993 To my wife: Kai-Ling Hsu Er n'c C0l heh Cncc CnCo caller and e. ACKNOWLEDGEMENTS I would like to thank Professor Chin-Long Wey, my thesis adviser, for his leading me not in the path of ease and comfort but under the stress and spur of difficulties and challenge. I am inspired by his valuable guidance throughout my graduate study. I would like to thank members of my thesis committee, Professors P. David Fisher, Michael Shanblatt, and Jonathan 1. Hall, for their meaningful comments and remarkable ideas during the course of my dissertation research. I do very appreciate Professor Erik D. Goodman for his encouragement, support, and assistance. My learning life is richer for having crossed his. I would also like to thank Professor Kun—Mu Chen for his cordial assistance. I am benefited from his provident thinking and effective attitude. I would like to acknowledge all the faculty member and students who gave me help during my study at Michigan State University, and many friends who showed their inspiration and benevolent feeling. I am very grateful to my parents and my mother-in-law for their years of concern, encouragement, and support, and to my wife, Kai-Ling, for her wild imagining, significant encouragement and help, and persistent thoughtfulness. She insists in taking care of all things considerately, and keeps learning diligently even though under treatment for her cancer. My latent potentialities have encouraged by her great composure, persevering will, and endless love. vi TABLE OF CONTENTS LIST OF TABLES ............................................................................................... LIST OF FIGURES .............................................................................................. LIST OF SYMBOLS .............................................................................................. Chapter 1: Introduction ..................................................................................... 1.1 Previous Work ................................................................................ 1.2 Problem Statement ........................................................................ 1.3 Research Tasks ................................................................................. 1.4 Thesis Organization ...................................................................... Chapter 2: On—line Self-Testable Circuit and System ..................................... 2.1 Terminology ................................................................................. 2.2 Self-Testing Circuit Structures ....................................................... 2.2.1 Checker Design .............................................................. 2.2.2 Functional Circuits ......................................................... 2.2.3 Logic Synthesis Procedure ............................................. Chapter 3: Efficient Self-Testin g Berger Code Checker Design ................... 3.1 Maximal Length Berger (MLB) Code Checker Design ............... 3.1.1 Berger Code Partitioning Scheme ................................. 3.1.2 Partitioning and Folding Scheme ................................... 3.2 Non-Maximal Length Berger Code (NMLB) Checker Design 3.2.1 STC Design with Partitioning Scheme .................. 3.2.2 STC Design with Partitioning and Folding Scheme vii \ldthL/thN 1 1 1 1 23 26 29 29 3o 33 47 47 53 3.3 Design Alternative ........................................................................ 62 3.4 Summary ........................................................................................ 68 Chapter 4: Output Partitioning Algorithm ...................................................... 70 4.1 Problem Statement ....................................................................... 70 4.2 Problem Formulation ..................................................................... 72 4.3 Algorithms .................................................................................... 75 4.4 Discussion ..................................................................................... 79 4.5 Experimental Results ................................................................... 81 4.6 Summary ....................................................................................... 87 Chapter 5: SOLiTE A System for Automated Synthesis of On-Line Testable Sequential Circuits ....................................................... 89 5.1 Synthesis of Self-Checkin g Functional Circuits ......................... 90 5.2 The Automated Synthesis System SOLiT ................................... 91 5.3 Implementation .............................................................................. 93 5.4 Synthesis Example ...................................................................... 95 Chapter 6: Summary and Conclusion ............................................................. 107 6.1 Summary ........................................................................................ 107 6.2 Contribution and Future Research ............................................... 109 Bibliography ....................................................................................................... l l 1 viii Ta' Tal Tat Tat Tab Tab Tab] Tabl Tabb Table 2.1 Table 3.1 Table 3.2 Table 3.3 Table 3.4 Table 3.5 Table 3.6 Table 3.7 Table 3.8 Table 3.9 Table 4.1 Table 4.2 Table 4.3 LIST OF TABLES Berger Code B(7,3). Subsets of B(7,3). B(7,3) with Partitioning Scheme. Logic Expressions for Checker in B(7,3). Berger Code B(6,3). Left-justified Encoding Scheme. .................................. Berger Code B(11,4). Berger Code Encoding Scheme for B(11,4). Reduced Check Part for B(3l,5). Comparisons for Various Checker Designs. Experimental Results. Experimental Results with Output Partitioning Scheme. Comparisons. OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 2 1 34 36 45 48 54 55 67 69 82 84 86 Fig: Egt F1 g. Flgl big big Flt Fig F13 Flt Fig Fig Flt Figure 1.1 Figure 2.1 Figure 2.2 Figure 2.3 Figure 2.4 ‘ Figure 2.5 Figure 2.6 Figure 2.7 Figure 2.8 Figure 2.9 Figure 2.10 Figure 2.11 Figure 2.12 Figure 2.13 Figure 3.1 Figure 3.2 Figure 3.3 Figure 3.4 Figure 3.5 Figure 3.6 LIST OF FIGURES A self-checking circuit. Self-testing and fault-secure circuits. Totally self-checking circuit. Typical structure of STC for normal Berger code checker. General self-testing checker for mln checker with n at 2m. Test patterns for mln checker. Disjoint 2-1evel realization for C3/5. 3-1evel realization for C35. mln code checker. 1/n code checker. TSC systems: (a) with TSC subsystems; and (b) with TSC and CD subsystems. SFS systems: (a) with SFS subsystems; and (b) with SFS and CD subsystems. A sequential circuit (Moore type). A logic synthesis example. Internal structure of a STC design of MLB code B(I,K) (Circuit CH). Circuit CS for B(I,K) STC design. Circuit CH for B(7,3) STC design [30]. Circuit CS for B(7,3) STC design. STC for modified Berger code with 1:21“. Function G for B(15,4) and B(9,4) STC designs. 12 14 l4 16 16 17 18 24 24 26 27 3O 38 39 42 49 Figure 3.7 Figure 3.8 Figure 3.9 Figure 3.10 Figure 3.11 Figure 3.12 Figure 3.13 Figure 3.14 Figure 4.1 Figure 4.2 Figure 5.1 Figure 5.2 Figure 5.3 Figure 5.4 Figure 5.5 Figure 5.6 Figure 5.7 Figure 5.8 Figure 5.9 Figure 5.10 Figure 5.11 Figure 5.12 Figure 5.13 Flow chart for SOLiT. Circuit CW for B(I,K) STC design. Circuit CW for B(9,4) STC design. Function G for (a) B(11,4); and (b) B(15,4) STC design. Circuit CL for B(I,K) STC design with even u. Circuit CL for B(11,4) STC design. Circuit CL for B(I,K) STC design with odd u. Circuit CL for B(9,4) STC design. Circuit CS with XOR code complementer. A FSM, ex4.kissZ. The output matrix and Rj-sets for ex1.kissz. A Moore-type self-checking sequential circuit. Cell-level implementation of 217 code checker. A FSM example - mark1.kis.s'2. Output of Procedure out_ part. Input file to Pmcedure out_encode. Encoded output functions. OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO Resultant state encoding with assigner. Resultant optimized network. Netlist generated by sis technology mapper. Physical layouts. Layout of an on-line testable sequential circuit, mark1.kissz. Layout of mark1.kissz with (a) the conventional synthesis procedure; and (b) encoded functional circuit. 50 51 56 57 59 61 63 65 74 77 92 95 96 98 98 100 101 103 104 105 106 symbol xeA xeA axe A VxeA AUB AnB IRI f: A —)B {x | x is ...} {l,2,...} ie I" LIST OF SYMBOLS meaning empty, or null set element x is a member of set A element x is not a member of set A there exists an element x in set A for every element x in set A unionofsetsAandB; AuB={xlxe Aorxe B} intersectionofsetsAandB; AnB= {xlxe Aandxe B} union of sets Q1, Q2, ..., and Qn intersection of sets Q1, Q2, ..., and Q1] differenceofsetsAandB,A-B = {xlxe Aandxe B} cardinality of set R, the number of elements in set R f is a function mapping from A to B set builder notation elements in a set index set i = l, 2, ..., v ordered sequence containing f1, f2, ..., and fn logic OR Operation logic exclusive OR operation l'y'l bitwise complement of a binary vector A concatenation of two code, S and T code vector 11! - (n - m) !m! combinations of m out of n; (1:) m-out-of—n m-out-of-n code Berger code with I information bits and K check bits floor function of x; the greatest integer less than or equal to the real number x ceiling function of x; the smallest integer greater than or equal to the real number y xiii hav SOP ttai me: bef the 53‘s tha “P rel} mi CHAPTER 1 INTRODUCTION With ever-increasing complexity of digital applications, the issues of reliability have become very important in today's VLSI designs. Reliability can be improved by sophisticated testing schemes to weed out faulty circuits [1]. However, such offline or static tests can identify permanent faults, but not transient faults. It is obvious that a mechanism for concurrent error detection (CED) must be installed to detect such faults before they cause undesirable results [2]. All concurrent error detection schemes detect errors through conflicting results generated from operations on the same operands. Concurrent error detection can be achieved through space redundancy, time redundancy, and information redundancy [3]. The conflicting results are compared and errors caused by faults are detected by a hardware check circuit, i.e., the checker. Checkers can no longer be assumed to be error-free in digital systems, because they are constructed from the same types of components as the circuits that perform the Operation and hence are subject to the same type of failures. This brings up to the question: Who is checking the checker? The answer is clearly a checker that is self-checking. It would be preferable for the circuits to be designed such that they will indicate any malfunction during the normal operation and will not produce an erroneous result without ch 1C 1.] CITE erTC Cod, an error indication. In these circuits, any fault from a specified set of faults will cause a detectable erroneous output during normal operation, and each fault must not cause erroneous outputs without also producing an error signal [4,5]. Redundancy design [3] approaches have been successfully implemented in today's VLSI designs for enhancing chip yield and system reliability [6]. However, due to speed performance degradation, time redundancy is not suitable for on-line testing. On other hand, for the hardware redundancy, the simplest concurrent error detection in multi-level circuits involves duplication of the circuit and comparing the outputs of the two blocks. Any mismatch between them will indicate an error. The equality of two sets of outputs is checked by a totally self-checking equality comparator [7,8]. This involves more than 100% redundancy (100% due to the duplicate circuitry and some extra for the checker) in the circuit. This is rather a high cost to pay for fault tolerance. Information redundancy involves the use of coding techniques that enhance circuit capacity for reliable operation. It has been implemented for fault-tolerant, fail-safe, and concurrent error detection designs of digital circuits. The features of high speed and low hardware cost are highly desirable especially for a checker design. Therefore, this has motivated the development of fast, yet low hardware overhead cost, checker circuits and design methodologies for self-checking circuits or systems. 1.1 PREVIOUS WORK Extensive research has shown that the errors in VLSI circuits are of a unidirectional error type [9.10]. A unidirectional error model assumes that even though both 0-to-l and l-to-O errors can occur, only one type of error occurs in a particular data word [11]. Such errors have been observed in modern digital devices such as PLAs, ROMs, and compact laser disks [12]. Numerous coding techniques, such as rn-out-of-n codes [13-25], Berger codes [26-32], Borden codes [12,33], Burst detecting codes [34.35.36], Residue codes, con Che. [93 Pro; Prac and AN codes [37,38], have been pr0posed to detect such errors. Among these coding techniques, Berger codes are the least redundant separable codes [26-29] for All- Unidirectional codes. A self-checking circuit [6], as shown in Figure 1.1, consists of a functional circuit and a self-checking checker. The functional circuit can be either a combinational or a sequential circuit. The functional circuit is generally designed such that its primary outputs and some encoded outputs are able to produce an erroneous result in the presence of a fault The error indicator must be designed to produce an error signal for some normal circuit inputs whenever a fault from a specified set of faults occurs within the circuit Input Output ——> Functional Circuit y—-> LSelf-Checking Checker lo— Error Indicator Figure 1.1 A self-checking circuit. Considerable research efi‘orts have been devoted to the design of self-checking combinational circuits [39-43] and self-checking sequential circuits [44-54]. The self- checkirrg concept has been applied to microprogram control units and PLAs [9,39,52,55,56]. In addition, numerous self-checking checker designs have also been proposed [l4,42,49,57,58]. Little emphasis, however, has been devoted to evaluate the practical designs in terms of area overhead and speed performance degradation. Elli Sin VET: attic 1.2 PROBLEM STATEMENT In general, the code length of a checker grows linearly with the number of outputs of a functional circuit, while the complexity of the checker may grow rapidly. As a result, a self-checking circuit with a large amount of outputs requires a huge checker, and the time required for the checker to detect errors may exceed the clock cycle time of the functional circuit [59]. This results in an increase in the system cycle time required to capture the error signal. To alleviate such a problem, it is highly desirable to develop a checker having the features of high speed and low hardware cost. Since the hardware cost of the checkers increase almost exponentially as the code length increases, the use of many smaller code length checkers require less hardware cost than that of a larger code length checker. Thus, it is desirable to develop a partitioning scheme which partitions the output of a given circuit into many smaller groups so that smaller checkers can be used to reduce both hardware overhead and performance degradation. The number of smaller checkers can be generally determined based on the performance measurement A'Ik, where A is the gate count, T is the gate delay, and k is a measurement parameter determined by the applications. In this thesis, the optimization is achieved by minimizing the gate count under delay constraint. In other words, based on the delay constraint derived from a set of design specifications, the checkers which achieve minimal hardware overhead are chosen. As sophisticated techniques for the synthesis of the logic emerge, automated synthesis systems are becoming popular. The synthesis system such as sis [60] generates very good results for multi-level logic implementation. It targets optimizing either chip area or speed performance. However, it is also desirable to address the reliability issue, in automated synthesis system. 1.3 dc: if be 01': 1.4 (its. 1.3 RESEARCH TASKS Due to the salient feature of least redundant separable codes, Berger codes have been implemented for fault-tolerant and fail-safe designs of digital circuits to detect all unidirectional errors. The features of high speed and lower hardware cost are highly desirable especially in the checker design. Thus, the first task in this research is to develop a fast, yet low hardware cost Berger code checker. A partitioning and folding scheme is developed to design a self-testing checker for Berger codes in which both hardware overhead and speed degradation are improved significantly. In order to reduce the code lengths of the checkers and to further reduce the hardware cost, the second task of this research is to develop an efficient output partitioning scheme. The scheme partitions the output into many smaller groups which employ small sized checkers. The chromatic partitioning of a graph is used to formulate the problem. The number of groups is equivalent to the chromatic number. In this research, a partitioning algorithm is presented. Finally, in addition to taking the area and speed as design objectives, reliability should be also a design objective of an automated synthesis system. Thus, the third task is to develop a system for automated synthesis of on-line testable sequential circuits for multi-level logic implementation. The three tasks presented in this research are summarized as follows: (1) develop a fast, yet low hardware cost, Berger code checker; (2) develop an efficient output partitioning algorithm; and (3) deve10p an automated synthesis system for on-line testable circuits. 1.4 THESIS ORGANIZATION This thesis is organized as follows: Chapter 2 reviews the previous work in the design of both checkers and functional circuits. Chapter 3 describes the developed partitioning and folding scheme for Berger code checkers. An output function partitioning algorithm is presented in Chapter 4 with some experimental results. Chapter 5 illustrates a system, namely, SOLi T, for automated synthesis of on-line testable sequential circuits with multi-level logic implementation. The system takes a behavioral description as its input and produce a physical layout for chip fabrication. Finally, conclusions and future research are given in Chapter 6. Ci CHAPTER 2 ON-LINE SELF-TESTABLE CIRCUIT AND SYSTEM This chapter summarizes the terminology used in this thesis, and briefly describes the characteristics of checker and functional circuits for on-line testing circuits and sys- terns. 2.1 TERMINOLOGY The following notation and definitions used in this study can be found in [4]: N: input code space. 8: output code space. x: an input vector. f: a fault in the fault set F. A: an output vector in the correct output space. Y: mapping function of the circuit W: A circuit L is Self-Testing (ST) for an input set N and a fault set F if for every fault f in F there is some input x in N such that Y(x,f) is not in S. m: A circuit L is Fault-Secure (F8) for an input set N and a fault set F if for any input x in N and for any fault r in F, Y(x,f) = Y(x,it), or Y(x,f) e s. erF Y(x,7t)esforallxeN Y(x,2.) es forsrme x eN x e N Circ °t L l e F —. ur __> Input Space Output Space YOU») Y(x,f) e . (a) Self-testing circuit. Vf e F Y(X.0 = Yul) » or Vx e N , , “"0 e S Crtcurt L Input Space Output Space You“) ’ e N i’r 3 Vx E N Not Permissible (b) Fault-secure circuit. Figure 2.1 Self-testing and fault—secure circuits. 9 W: A circuit is Totally Self-Checking (TSC) if it is both fault-secure and self-testing. All Faults Input Space Y(X1.fy) Output Spec m e / Y(X1,f1) e / Y(x29f 1) , 0 Y(Xz.fa) 4’ “12.1) your» ’0 Figure 2.2 Totally self-checking circuit. W: A circuit is Code-Disjoint (CD) if the input code space maps to the output code space and the input noncode space maps to the output noncode space. W: A checker is a Self- Testing Checker (STC) if it is self-testing and code-disjoint. mm: For a fault sequence . let k be the smallest integer for which the combination of f}, l S i S k, does not produce the correct code output for at least one code input in a circuit G. If there is no such k, set k = n. Then G is Strongly Fault Secure (SFS). In In er" an. SEC 5X3 the 11181 1111 lO Wm: A circuit is Strongly Code Disjoint (SCD) for a fault set F if, for every fault f in F, either (1) the circuit is code-disjoint and self-testing for the fault f, or (2) the circuit is code-disjoint and if f occurs then the resultant circuit is still SCD for the fault set F excluding the fault f. W: A circuit is Strongly-Self-Checking (SSC) for a set of faults F if before the occurrence of any fault, the circuit is code-disjoint, and for every fault in F, either: (a) the circuit is self-testing and fault-secure, or (b) the circuit is fault-secure and always maps noncode words at the inputs to non- codewords at the outputs, and if another fault from F occurs, then either property (a) or (b) is true for the fault sequence. We: In a rn-out-of-n (mln) code, denoted as Cm, all code words have exactly m 1’s and (n-m) 0’s. It is also called m-hot code. W: A Berger code B(I,K) of length L has an I-bit information part and a K-bit check part, where K = llogza + 1)] and L = 1+ K. The check part is the bit by bit comple- mented binary representation of the number of 1’s in the information part. W: A code is Systemch if it has K check bits appended to 1 information bits and all the 21 possible information I-tuple are assumed to occur. W: If all the 2I possible I-tuple do not occur and it is known which one is presented, a code derived for this case is referred to as a Separuble Code. mm: A minterrn m; for a single output function f is called a True Minterrn if f= 1 when mi is applied. CC If we “It The Wor for 1 been 1We] 11 W: A minterrn m for a multi-output function, fi’ 1 S i S n, is called a False Minternr if all fi = 0 when m is applied. W: Complete Covering requires the covering of l-cells as usual, but in addition, each O-cell must be covered by some product terms. Wm: A product term pi is a Due Product Term (False Product Term) if all the minterms it covered are true minterms (false minterrn). 2.2 SELF-TESTING CIRCUIT STRUCTURES A general structure of a totally self-checking (TSC) circuit is shown in Figure 1.1. During the normal operation, code inputs are applied to the functional circuits and the coded outputs are produced. Both the functional circuits and the checkers should possess some special properties. This section reviews some existing STC designs for checkers and TSC design for functional circuits. 2.2:1 CHECKER DESIGN Two coding techniques, the rn-out-of-n (tn/n) code and the Berger code, have been successfully implemented in VLSI design for detecting all unidirectional errors. Basically, a rn/n checker consists of two independent sets of subcircuits, each having a single output The mln code checker produces the output (0,1) or (1,0) for the application of a rn/n code- word input, i.e., the number of 1’s at the checker input is m, and the output (0,0) or (1,1) for non-codeword inputs. Considerable design alternatives for rrr/n code checkers have been studied in [13-25]. The circuits for the rn/n checker may be implemented with two- level, three-level, or multi-level logic. Numerous self-checking checkers for Berger codes side: 5815. Where 12 have been presented in [26-32]. The basic design principle for a Berger code checker is to generate the replicated check bits of binary value complementary to the original ones and to compare them using a two-rail comparator. A general structure is shown in Figure 2.3. Information N Bits 6 1 ted om 6111811 Chgck Bits —’ N2 ——" 1 Ge STC for l-out-of-Z ner 3(01' , K-pairs Code ,/ p 2-1211 Code ——> Check Bits K Figure 2.3 Typical structure of STC for normal Berger code checker. A. mln Checkers A rn/n checker design with two-level logic realization was presented in [15]. Con- sider a rn-out-of-Zrn code checker, where the 2m bits are partitioned into two distinct sub- setsA = {inOS i Sm-l} and B = {xj I m Sj S 2m-l}. Ifthe two outputs ofthechecker are designed as q f = 2 Tot. zi) o T(k,, 2 m-i) i is odd, (2-1) is}, 8 = i T(k. 2 i) 0 T(k. 2 tn-i) i is even, (2.2) 1.1) where p = max (m-nb, -l), q = min(na, m+l), jorit the 1 has 1 k. + 1 COdeu the su functi- dant 1 test"! 1101]“ 13 na = the number of bits in group A, ab = the number of bits in group B, m = the number of 1’s in the information bits, ka, kb = the number of 1’s occurring in each group. The notation X represents the OR operation, and 0 represents the AND operation. The ma- jority function T(ka 2 i) represents the Boolean function that has the value 1 if and only if the number of 1’s in subset A is greater than or equal to the value i; similarly, T(kb 2 m - i) has the value 1 if and only if the number of 1’s in the subset B is greater than or equal to the value m - i. The term T(ka 2 i). T(kb 2 m - i) = 1 is tested by making T(ka 2 i) and T(kb 2 m-i) simultaneously be equal to 1. This is done with codewords containing i 1’s in subset A and (m - i) 1’s in subset B. These codewords exist because 0 S i S m and 11a = nb = m. Further- more, T(lra 2 i) may be implemented by the AND of each different combination of i bits from the it, available; the results are ORed together. Each combination must be tested sep- arately, and this requires (“i") = (T) code words, each containing exactly i 1’s in A. Sirni- larly, T(kb 2 m - i) requires (‘2‘) = (J: ‘.) = (T) codewords, each containing exactly (m - i) 1’s in B. Since the bits in A and B are independent except for the constraint that ka + kg = m for codewords, these factors can be completely tested with the same (T) codewords. To test all such terms for 0 S i S m, it requires a total of 2m test patterns, i.e., thesum of (T) forOSiSm. If such a test set supplies all 2m input combinations to A and B, then the majority functions, such as T(lra 2 i) or T(kb 2 m - i), are exhaustively tested, and any non-redun- dant implementation of them is completely diagnosed. Therefore, the checkers are selfi- testing for single faults or unidirectional faults if these 2m codewords are given during normal operation. the 14 A rn/n code checker, with n at 2m, as shown in Figure 2.4, is generally realized by translating the given code to a l-out-of- (3) (ll (:1 )) code, which is then converted to a p-out-of-2p (p/2p) code via a TSC translator [15]. In this case, p should satisfy the follow- ing relation: (2?) .>. (“ J22? (2.3) p m The codewords should be selected so that the left-most p bits are complementary to the right-most p hits as shown in Figure 2.5. The two-level realization obtained is in- deed a minimum level TSC circuit However, when the number of inputs become larger, the fan-in of every OR gate grows fast making it impractical for implementing. o.— l a 1 OR *—° Z0 mln Translato l/ (1:) Array 9/29 P/Zp 1’2 Code C°d° mum Code Checker Code 0— r —° 21 Figure 2.4 General self-testing checker for mln checker with n at 2m. XO X1 XZ X3 X4 X5 0 0 0 l 1 1 1 0 0 0 1 1 0 1 0 1 0 1 0 0 1 1 1 0 1 l 0 0 0 1 0 1 1 1 0 0 1 0 1 0 1 0 1 l 1 0 0 0 Figure 2.5 Test patterns for mln checker. wt 21 M to £3 mc 3.1 [113 15 An alternative rn/n TSC design using a 3-leve1 realization has been presented in [17] for n 2 4. For n < 2m and n > 2m, the procedure gives AND-AND-OR and OR-OR- AND 3-level realizations, respectively, while for n = 2m, it results in an AND-OR (or du- ally OR-AND) 2-level realization which is equivalent to the design provided in [15]. The design procedure consists of the following two stages: (1) Construct two 2-leve1 AND-OR (OR-AND) circuits with no shared gates, which correspond to blocks of the partition. (2) Provide AND (OR) gates, which are shared by two 2-level AND-OR (OR-AND) circuits, to form AND-AND-OR (OR-OR-AND) realization. Basically, a mln code is comprised of a set of nr/n code vector x, and it can be par- titioned into two disjoint subsets Ga and G1,, where ka and kb are defined in Equations (2.1) and (2.2). Ga = {xl x 6 mln, w(ka) is odd} (2.4) Gb = {xl x 6 mln, war.) is even} (2.5) where w(ka) and w(kb) are the number of 1’s in subset Ga and Gb, respectively. A disjoint 2-level AND-OR realization based on the partition is a pair of 2-level AND-OR circuits g8 and gb as shown in Figure 2.6, where there exists a one-to-one correspondence between AND gates in circuit g“ and codewords in G,, and between AND gates in circuit g, and codewords in GI, Exactly m bits of value 1 in codeword x are used as inputs to the AND gates for x. A disjoint two-level OR-AND realization can be defined in a dual way. Such a 3-level configuration with shared gates is illustrated in Figure 2.7. AND gates at the left- most stage are referred to as shared ANDs and those at the middle stage are called 3-level AND-AND-OR realization (a 3-1evel OR-OR-AND realization is defined in a dual manner). l6 123 1452 65 3‘5 12 4 134 234 135135 235 Figure 2.6 Disjoint 2-level realization for C35. 3‘ 2‘35 Figure 2.7 3-level realization for C35. 17 Similar to two-level, the three-level implementation also has the large fan-in prob- lem. In order to alleviate this problem, a cost-effective mln checker design with multi-lev- el logic implementation has been presented for m 2 3, 4m 2 n > 2m [20]. Basically, any rn/n code can be represented as a sum of concatenated partitioned codes i/na and (m-i)/nb for 0 S i S m, and 113 + 11., = n. Thus, the set of all rrr/n codes can be formally written with the union operator as Cm/n = iQCi/n' XC (m+i)/n. (26) where U is a union representation and x is a concatenation operator. The set Cm,n can be partitioned into the following three disjoint subsets: CAB = Cir_/n.xcr,/n.3 (2.7) r, -1 CA = ‘KJO Ci/n.xc(m-i)/u.; (2-8) 1,-1 C3 = .kJoCUn-DIMXCi/h' (2-9) I. wherek,=Lm/21,k.,=l'm/2'L 1 Sk,Sna, 1 Skanb. Both setsCA andCB canbefurther partitioned by the same manner. With this partition scheme and algorithm presented in [20], the min checkers, as shown in Figure 2.8, can be obtained by using a translator H1 of the min code into 2/4 code, and a rsc 2/4 checker, or H2. The block H2 will produce a 1/2 code as the error signal. For demonstrating this algorithm [20], an example is given below. mln l | 2/4 | 1/2 HI I .l H2 Figure 2.8 mln code checker. 18 W: The following equations are used to design a TSC checker for 3/8 code [20]. hi =x1+(x2+x3) hj =(x4+x5)+(x6+x7) h] =x1(x2+X3)+(x4-i-x5)(x6+x7) hi =X2X3+X415+16X7 h} =(h[+h])(h} +111) hi =h1hi +h1h1 Numerous TSC lln code checker designs have been developed [21,22,23]. Similar to Figure 2.8, a lln TSC checker, as shown in Figure 2.9, can be obtained by using a trans- lator H1 to translate the lln code into 2an code, then into 1/2 code for indication [20]. At the first stage, no should be selected such that [n 0; I) s n s (“20) and follow the algorithm in [20] to obtain the partitioned subsets h? , 1 Si 5 no. A Zlno code results. Then, if no is large, a further reduction of no will be necessary to obtain a smaller 11], and then a 1/2 two- rail checker is appended to the last stage. lln lln to mo 11‘ c rn'ln’ m’ln’ Code -—> TransTator b Chedéer —-> H H Figure 2.9 l/n code checker. 19 W: [25] Consider the design of a 1/20 code checker. It first needs a H1 translator which translates the 1/20 code to a 2/7 code with the functions h? , l S i S 7, Then the 2!? code is translated to a 2/4 code as shown in Example 2.1. The following equations summarize the ll20 code checker design. h? =x1+x7+x3+x14+x15 hg =x1+x2+x11+x12+x19+x20 hg =x2+x3+x8+x9+x17+x13 112 =X3+X4+X12+X13+X15+X16 h? =x4+x5+x9+x10+x20 his) =X5+Xo+xr3+xr4+xrs+xr9 h? =x6+x7+x10+x11+x16+x17 h} =h‘}+(h3+h%) h} =aa+h9)+(h2+h9) h] =h‘} (h3+h3)+(h3 +11% x119, +h9) hi =18 h3+h3 12min} 1:2 = 11% =h}h§ +h5hi B. Berger Code Checkers Figure 2.3 shows the general structure of a Berger code checker. The complement- ed check bits generator is used to generate the binary value corresponding to the number of 1’s in the information bits. In general, a set of full adder modules that add the inforrna- tion bits in parallel and produce the binary number corresponding to the number of 1’s in the information bits has been implemented [29]. Recently, based on a completely diflerent Sill; 5'30 L the 20 structure, a Berger code partitioning scheme has been presented for the checker design [30]. Consider a Berger code checker design [29] using a set of full adder modules for the complemented check bit generator. The design procedure is summarized as follows. (1) Let s = {xi} 1 S i S 2k-1} be the set of all information bits, and set m = k andj = 1. (2) Partition the set S into three groups Aj, Bi, and 53, which respectively consist of the left-most2m‘1-1 bits, the next 2m'1-1 bits, and the right-most bit Let ai = (9,11,13ij ,...,ail ), bi = (bim-1 'bjma ,...,bil ), and ej be the binary representations of the number of ones occurring in the information bits of groups Al, Bi, and E3, re- spectively. (3) The binary representation of the number of ones occurring in S, gj = (gim ,gimq, gi. ) is given by the following addition: aill-l ajm-2 31 bin-l bin-2 bi ei gjm Sin-l Sim-2‘" Bil A ripple carry adder with m - 1 stages is used to generate the vector gj. (4)Ifm>2,then:(i)m=m-1,L=j;(ii)letS=Aj,j=j+1andrepeatstepsZand3 to generate the vector aL = g'L. The vector bL is generated by a circuit identical to that which generates aL. (5) End. Consider the Berger code checker based on partitioning scheme in [30]. The design proceeds from the idea that any Berger code can be constructed from u = [(1 + 11/21 rn/n codes, where I is the number of information bits, m = 2i-1, 1 S i S u, and n = I + 1. Table 2.1 (a) lists the B(7,3) codewords. Each subset Di consists of seven information bits {in1 S i S 7} and three check bits {x3,x9,xlo}. The subset Di is a collection of Berger 21 W (a) expanded reduced Subset information part check part x1 x2 x3 x41‘51‘61‘71‘10 X8 X9 D1 0 0 O 0 0 0 0 l 11 D2 0 0 0 0 0 0 l 0 11 D3 0 0 0 0 0 l 1 1 10 D4 0 0 0 0 l 1 1 0 10 D5 0 0 0 l l l 1 1 01 D6 0 0 1 1 1 l 1 0 01 D7 0 l 1 l 1 l l 1 00 D3 1 l 1 1 1 l 1 0 00 (b) expanded reduced Subset information part check part partitioning X1 X2X3 X4XonX7 X10 X8 X9 E1 00000001 11 meal) F4 00000111 10 C3,3x(10) E5 0 0 011111 01 C5,3x(01) F4 0 1 1 l 1 1 l l 00 C7,3x(00) par SELS the subs 22 codewords in which the number of 1’s in the corresponding information part is i - 1, and Di = C(i-1)/(I+1)x Pi, where x is a concatenation operator, Pi is the check part of Di. The Berger code B(7,3) can be generally represented as a union of eight nr/n codes Cin, 0 S i S 7, as follows B(7,3) = Conx(111) u Clnx(110) u me(101)u C3nx(100) u C4/7X(011) U C5/7X(010) U CmeOl) U C7nX(000). The partitioning scheme first expands the information part I by adding the least significant bit (LSB) of the check part Table 2.1 (b) illustrates the expanded information part 1' = [ x1,x2,...,x7,x10} and the reduced check part P. = {x3,x9}. As a result, the sub- sets DZi-l and D2i not only have the same number of 1’3, but also have the same reduced check part Thus, the adjacent pair of subsets D2“ and D3 are grouped and replaced by a subset F4“, as shown in Table 2.1 (b). The Berger code is written as B(7,3) = C1/gX(11) U C3/8X(10) U C5/3X(Ol) U C7/3X(00). i.e., the implementation requires only four checkers for Cm, C313, cm, and Cm. This results in reducing the number of checkers by half with the partitioning scheme. In [30], two design alternatives were presented: a minimal-level design and a cost- effective design. The former uses the minimal-level STCs for rn/n codes to reduce the num- ber of gate levels, while the latter employs multi-level logic implementation to reduce the gate count Results have shown that the STC design for B(7 ,3), for example, requires 5 gate levels and 137 gates for minimal-level implementation and 8 levels and 87 gates for cost- effective implementation [30]. However, it needs 17 gate levels and 40 gates in [28] and 11 gate levels and 58 gates in [29]. The B(15,4) checker takes 11 gate levels and 315 gates for cost efl‘ective implementation in [30], but requires 35 gate levels and 149 gates in [28] and 17 61‘ Oil 604 2.2 seq ext H01 Inf; also mus (2) 23 17 gate levels and 150 gates in [29]. Compared to the ST C designs presented in [28], the partitioning scheme Offers an improvement in delay but with a great cost in hardware. How- ever, it should be mentioned that the implementation of the partitioning scheme is available only for B(I,K) with I = 2K - l or 2K - 2, i.e., the design is not STC for any other Berger code lengths. 2.2.2 FUNCTIONAL CIRCUITS A system may consist of several subsystems which can be either combinational or sequential, interconnected with each other via certain interfaces. TWO extreme approaches exist for such a system to be TSC [56]. (1) If all the subsystems are TSC and all the interfaces are monitored by checkers which are CD and ST, then the system is TSC, as shown in Figure 2.10 (a). (2) If all the systems are CD in addition to being TSC, then the system is TSC with no checker used at the embedded interfaces except for the primary output of the sys- tem, as shown in Figure 2.10 (b). A similar argument holds for SFS systems as shown in Figure 2.11 (a) and (b). However, in approach (1), the use of a large number of checkers for embedded partial in- terfaces may incur not only an intolerable increase in the amount of checker hardware, but also an unacceptable delay in error indication [52]. This is because all the checker outputs must be reduced to a single pair in a self-testing manner. On the other hand, in approach (2), it may be difficult for subsystems to be either CD, SCD. Therefore, a trade-Ofl’ be- tween these two approaches should be made. 24 TSC . TSC Subsystem Subsystem i ,.- Checker Checker Checker (a) —.l TSC&CD TSC&CD Subsystem Subsystem i Checker (b) Figure 2.10 TSC systems: (a) with TSC subsystems; and (b) with TSC and CD subsystems. srs SFS Subsystem Subsystem l SCD I SCD SCD Checker V Checker Checker (a) _._.l SFS&CD |_—.| SFS&CD Subsystem * Subsystem SCD Checker (b) Figure 2.11 SFS systems: (a) with SFS subsystems; and (b) with SFS and CD subsystems. k0 Semi self-c [39]. my 11 done uct ti outpl SFS 1y, th 1C 131 the 1 the 1 lowi Dari. eids got. her I1101 “111 Stat 25 A. Combinational Circuits The concept of a dynamically checked system was first introduced by Carter and Schneider [61] and later formalized and extended by defining both totally and partially self-checking notions for combinational circuits [50]. Consider the two-level PLA design [39]. The conventional design of PLAs generates all the true product terms in the AND ar- ray without considering any false product terms, but the design of SFS PLAs should be done differently. It is proven that if a PLA includes each false mintenn in some false prod- uct terms in the AND array, then any unidirectional error in the PLA will propagate to the output. If the external output is encoded ill an unidirectional error-detecting code and a SFS checker is used to monitor the outputs, the error in the PLA will be detected. Sirnilar- ly, this concept can be extended to the multi-level combinational circuit With the algebra- ic factorization of two-level logic a SC multi-level combination circuit will be Obtained. B. Sequential Circuits Consider a Moore type sequential machine, as shown in Figure 2.12, where Q is the set of states, 2 is the set of output vectors, T is the state transition function, and G is the output function. It has been shown that a sequential machine is ST and FS if the fol- lowing conditions are held [50]: (a) The combinational function T is SC; (b) The combi- national function G is SC; (c) G is CD, even in the presence Of faults in G; and ((1) There exists a sequence i applied during normal functioning such that the next state function goes through all the transitions of the transition diagram (before a second fault occurs), then the sequential machine is SC for a fault set Er. Mukai and Tohma [51] have proved that, using a state assignment which leads to monotonic next state equations and output equations with respect to the primary inputs, will cause unidirectional errors at the primary output when a single fault in the circuit oc- curs. Any code in which the codewords have no ordering among them can be used for the state assignment Figure 2.12 A sequential circuit (Moore type). 2.2.3 LOGIC SYNTHESIS PROCEDURE A logic synthesis procedure for finite-state machines (FSMs) includes three major steps: state assignment, logic optimization, and logic implementation. In order to synthe- size FSMs with the tools developed at the University of California at Berkeley, the standard kiss2 format is used to describe the behavior of a FSM, as shown in Figure 2.13 (a), where ”i” is the number of inputs; "0" is the number of outputs; ”s” is the number Of states, "p” is the number Of product terms. The state assignment tool, mustang [62], is employed to encode internal state variables, as shown in Figure 2.13(b). The resultant logic is optimized by either espresso [63], for two-level logic minimization, misII [63] or sis, for multi-level Optimization. Figure 2.13(c) lists the multi-level logic optimization using sis. A test pattern generation tool, stallion [63], can be used to generate test patterns to evaluate the synthe- sized circuits. Finally, the synthesized circuit can be implemented with either standard cells or gate arrays. Figure 2.13(d) lists the netlist generated by the technology mapper in sis, where standard cells in the sis cell library are used. .11111111111-41. ........... 27 \IQUINW bwoomm 110000000 000000000 001000000 000000000 000000000 100110000 100100000 000001100 000000100 110000000 110000000 110000000 000000010 000000000 001000010 000000000 000000000 100110000 100100000 110000101 110000100 (a) standard kiss2 format 21 (0'0 T 1 0010 3 1010 2 1000 S 0110 7 0111 11 0100 12 1011 8 0000 4 1111 13 1001 14 1100 6 0011 9 0101 10 0001 .i 10 .o 13 .type fr #**********#**=fl= 1 ----- 0010 0010 1010 1000 0110 0111 0100 1011 0000 1111 1001 1100 0011 0101 0001 1---0- 1111 1001 1 ----- 1001 1100 1 ----- 1100 0011 10---- 0011 0011 11---- 0011 0101 1 ----- 0101 0001 1----1 0001 1010 1----0 0001 1111 ATE ASSIGNED FINITE AUTOMATON 110000000 000000000 001000000 000000000 000000000 100110000 100100000 000001100 000000100 110000000 110000000 110000000 000000010 000000000 001000010 000000000 000000000 100110000 100100000 110000101 110000100 (b) state assignment with mustang. Figure 2.13 A logic synthesis example. 28 Figure 2.13 Cont’d INORDER = v0 v1 v2 v3 v4 v5 v6 v7 v8 v9; OUTORDER = v10.0 v10.1 v10.2 v10.3 v10.4 v10.5 v10.6 v10.7 v10.8 v10.9 v10.10 v10.11 v10.12; v10.0 = v8*v10.8 + [13] + v10.5; v10.1 = v9*!v10.0*!v10.10 + lv3*!v9*v10.5 + !v2*!v9*v10.5 + v6*v10.10 + v7*!v10.7 + [13] + v10.12; v10.2 = v6*v10.0*!v10.12 + v7*!v9*!v10.6 + v8*!v10.0*v10.3 + !v10.1*v10.5; v10.3 = £v10.8*[15] + !v7*v10.7 + v6*v9 + v10.6; v10.4 = v10.7 + v10.S; v10.5 = lv6*!v7*!v10.10 + !v8*v10.10; v10.6 = v7*[13]; v10.7 = !v9*!v10.6*[15] + v6*!v7*!v8 + v10.8; v10.8 = v1*v9*[15]; v10.9 = v2*v8*v10.10; v10.10 = !v7*v9; v10.11 = v4*v6*[13] + v6*v10.6; v10.12 = v5*!v8*v10.10; [13] = v8*!v9; [15] : 1V6*V7; (c) resultant output network from sis. .model ex4 p esp .names v6 v7 v8 v10.10 v10.5 .inputs v0 v1 v2 v3 v4 v5 v6 v7 --01 1 v8 v9 oo-o 1 .outputa v10.0 v10.1 v10.2 v10.3 .names v7 [13] v10.6 v10.4 v10.5 v10.6 v10.7 v10.8 11 1 v10-9 v10-10 v10-11 v10-12 .narnes v6 v7 v8 v9 v10.6 v10.8 .names v8 v10.5 v10.8 [13] v10.0 [15] v10.7 -1-- 1 ----- 1- 1 ---1 1 1oo---- 1 1-1- 1 ---oo-1 1 .names v2 v3 v6 v7 v9 v10.0 v10.5 .namea v1 v9 [15] v10.8 v10.7 v10.10 111 1 V10-12 [13] V10-1 .names v2 v8 v10.10 v10.9 --------- 1- 1 111 1 """"" 1 1 .names v7 v9 v10.10 "'1"‘0"' 1 01 1 --1 ----- 1-- 1 .names v4 v6 v10.6 [13] v10.11 0---0-1---- 1 -11- 1 -0--0-1---- 1 11-1 1 ----10--0-- 1 .names v5 v8 v10.10 v10.12 .names v6 v7 v8 v9 v10.0 v10.1 101 1 v10.3 v10.5 v10.6 v10.12 v10.2 .names v8 v9 [13] '----0-1-- 1 10 1 “'1'0'1"' 1 .names v6 v7 [15] -1-0----0- 1 01 1 1"‘1""0 1 .end .names v6 v7 v9 v10.6 v10.7 v10.8 [15] v10.3 ---1--- 1 1-1---- 1 -0--1-- 1 ----- 01 1 .names v10.5 v10.7 v10.4 1- 1 -1 1 (d) netlist. CHAPTER 3 EFFICIENT SELF-TESTING BERGER CODE CHECKER DESIGN This chapter presents the design of self-testing Berger code checkers with the developed partitioning and folding scheme. For a Berger code B(I,K), if I = 2k - 1 or 2k -2, then it is called a maximal length Berger (MLB) code. Otherwise, it is a nan-maximal length Berger (NMLB) code. The designs of self-testing checkers for MLB and NMLB codes are respectively discussed in Sections 3.1 and 3.2. 3.1 MAXIMAL LENGTH BERGER (MLB) CODE CHECKER DESIGN This first section more specifically discusses the Berger code checker design with the partitioning scheme presented in [30], and then presents an improved version with a partitioning and folding scheme for MLB codes. 29 30 3.1.1 BERGER CODE PARTITIONING SCHEME Figure 3.1 shows the internal structure of a STC design of MLB code implemented with the partitioning scheme presented in [30]. The checker, referred to as Circuit CH, is comprised of four major blocks: BlOck H includes the checkers for C(2i-1)I(I+1)’ 1 S i S u. where u =RI+1)/21 Each checker takes 1+ 1 bits from the expanded information part It as its input and produces two check output lines; Block G consists of (2 K" - K) AND gates which have as inputs the check bits from the reduced check part Pt; Block Q consists of u - 1 pairs of AND gates; and Block 8 contains two u-input OR gates. The l-out-of-2 code in the outputs of Block 8 indicates an error, if it exists. I! Q , ‘51 q] 3’ STCFor tan-own) b1 Code , , s w 0 O s an ' 3 M 0‘: l _ s ' hump-km 3‘ 31cm _ ' +1 (2.13-occur) b; q} _ s, : Q. 8 3 s: ' bl ' r u- STCFor -- mined-0+1) 1; Code -- m and Flt HIM-"Mil _ zit-3w __ :smm Figure 3.1 lntemal structure of a STC design of MLB code B(I,K) (Circuit CH). 31 111mm Circuit CH is a STC Checker. Prior to proving the properties of self-testing and code-disjoint for Theorem 3.1, the following lemmas are needed. mm: The totally self-checking checker Hr produces the following code output when a code in a subset Eli-1 is applied to Circuit CH. (01)or(10) ifi=r. (111,115): (00) ifir. M Consider the outputs hi and b5 of the checker H'. Let Nx be the number of 1’s in an input codeword. Since h’l and h’z are the majority functions as defined in Equation (2.1) and (2.2) of the inputs, the value of (hi, hi ) is equal to (1,1) if Nx > r; (1,0) or (0,1) if N" = r; or (0,0) if N" < r. For any code in 523-1, its number of 1’s is i, i.e., NR = i. Thus, the lemma results. 0 Consider the construction of Block G which consists of ZK'1 - K AND gates Gi, 1 S i S u. The output of Gi is denoted as gi and its inputs are the l-value reduced check bits inP;.IncaseofN1(P;)= l,sayxj= l inP; thengi=xjandnoANDgateisnecessary, where N1(A), (N0(A)) denotes the number of 1’s (0’s) in A. For i = u, N1(P; ) = 0, it is assumed that g‘l = 1 [30]. Let X and Y be two N-tuple. We say that X covers Y if and only if X has 1’s everywhere Y has 1’s. If neither Y covers X, nor X covers Y, then we say X and Y are unordered. mm: (a) P; never covers P; if j < i; (b)Ifgi= l and 1’; covers P; ,thengj= l;and (c) Ifgi = land P; and P; are unordered, then gi = 0. m (a) If j < i, then Nlaj) < N109, or N105) > N1(Pi), i.e., P; has more 1’s than Pf . Thus, 1’; never covers P; ; (b) By definition, P; covers P; means that all inputs of Gi are ger ind inE 5/0 the l<1 Out; 32 also inputs of 0‘. Thus, if gi = 1, Le. the input variables of (3‘ are all 1’s, then gi = 1; and (c) Since P; and P; are unordered, there exists, at least, one input variable of 61, say x,, which is not an input of Gi, where x, = 0 (because gi = 1). Thus xt = 0 causes the AND gate Gi to produces gi = 0. o W: When a code including the reduced check part P; is applied to Block G, the outputs are {I forj=i;orj>iandP; caversP}. o for j < i; or j > i and P; and P; are unordered. m: When a code including P; is applied to Circuit CH, the AND gate Gi produces gi=1,i.e.,gj = l ifj=i.Forthecaseofj >iand P; covers Pj',by Lemma 3.2 (b),gj= 1; If j < i, by Lemma 3.2 (a), P; never covers Pf , therefore gi = 0. Finally, if P; and P; are unordered, and j > i, by Lemma 3.2 (c), gi = o. o W: Circuit CH is self-testing for all unidirectional faults. 2mm: Two types of faults are identified: stuck-at-l (s/ 1) and stuck-at-O (s/O) faults. In general, a noncode output (1,1) in Qi results in a noncode output (1,1) in Block S which indicates an error. On the other hand, if all Q’s produce the same noncode output (0,0), this results in a noncode output (0,0) in S that indicates an error. Thus, the application of a code in 821.1 can detect not only the sll fault(s) at the output(s) of Hi, Q1, and/or S, but also the s/O faults at the output(s) of (3‘, 11‘, Q‘, and/or 3. Finally, the sll faults at is can be detected by applying a code in F404. When this code is applied to Circuit CH, all checkers produce the same noncode output (1,1), except that Hr generates a code output. Since gI = 0, for r < u, and g“ = 1 for fault-free circuit. the SI] faulty g‘ will force Qi to produce a noncode output (1,1). Thus, Circuit CH is self-testing for all unidirectional errors. 6 W: Circuit CH is code-disjoint. M3 When a code in subset hid is applied to Circuit CH, the code input produces a 33 code output ill Block S, i.e., all code inputs produce all code outputs. Consider a noncode, without loss of generality, consisting of I; and P: , where r at q. When the noncode input is applied to Circuit CH, by Lemma 3.1, the checker Hi produces a noncode (0,0), for i > q, or (1,1), for i < q. Since gi = 0, for i > r, and g‘= 1, a noncode output (1,1) is produced in Q', if r > q, and further generates a noncode output in Block S. On the other hand, if r < q, all outputs in Block Q are (0,0)’s and result in a noncode output (0,0) in Block S. This Concludes that the noncode inputs produce the noncode outputs. 0 Proof of Theorem 3.1 Based on Lemmas 3.4 and 3.5, Circuit CH is a STC checker. 3.1.2 PARTITIONING AND FOLDING SCHEME The basic concept behind the Berger code partitioning scheme is that the LSB of the check part is grouped with the information part to reduce the number of checkers used in a STC design. This section presents an alternative structure that further reduces the number of checkers. We first consider the MLB code with I = 2K - l. The same concept can be extended for l = 2K - 2 and 2‘“. A. B(I,K)withl=2K-l As defined in Table 2.1(b), E‘, t = l, 3, 5, and 7, is a subset of B(I,K) that replaces the adjacent pair of subset D1 and DM in Table 2.1(a). In general, a B(I,K) is composed of subsets End’s, IS t S 11. Note that N100 = t and N00.) = 2n - t. W: Let Es and Ft be two subsets of B(I,K), the pair of E, and E1 is dual if and only if the bit-complement of any codeword in E8 is a codeword in BI and vice versa. Thus, we say thatEs=EtandEt=Es. co im £1 cor pal Call X1181 3 bit, Consider the MLB codes B(I,K) with I = 2K - l, i.e., u = 2*“. Let Em and 52“-“, be 34 any two subsets of B(I,K). Unless otherwise stated, it is assumed that m < u. M: The pair of subsets Em and EQu-rn is dual. m: For any codeword cIn in'Em, N1(cm) = m and N0(cm) = 2n - 111. Let op be the bit- complement of cm, then we have N1(cp) = N0(cm) = 2u - m and N0(cp) = N1(cm) = m. This implies that Cl, is a codeword in Fawn. Similarly, the complement of any codeword in 520,", is a codeword in Em. Thus, the pair of Em and 52“,“, is dual. 9 Since the subset If in Table 2.1(b) is a collection of l-out-of-8 codewords and J7. consists of all 7-out-of-8 codewords, the pair of subsets El and E; is dual; Similarly, the pair of E3 and 135 is also dual. “Without loss of generality, subsets 55 and F4 in Table 2.1(b) can be re-written as shown in Table 3.1. W expanded reduced Subset information part check part partitioning x1 x2x3 x4x5x45x7 x10 x8 X9 55 0 0 011111 01 C5,8x(01) E7 0 1 l l l 1 l 1 00 C7,3x(00) Let 1: and P: be the expanded information part and the reduced check part of 13‘, respectively, and xMSBm denotes the MSB of 9:“. It is obvious that xMSBm = 1 and XMSBQIH) = 0, if! < 11; 311d XMSBQ) = 0 311d xMSBQu-t) = 1, if! > 1.1. Let A be a 86! and X be abit, then we define the set A$x= {y l y= z$x,forallze A}. 35 m: If P1 = P’lu-texMSBa)’ then Ft = FZu-t- M: For t < 11, since XMSB(t) = 1 and nugget”) = 0, we have F2“ = E1$XMSB(2u-t)= 151 and Ft = EQu-teXMSBa) = 92“-, = Et, i.e., F2“ = Ft. On the other hand, for t > 11, since xMSB(2u-t) = 1 and Mason) = 0» hence» Ft = Elu-t and qu-t = Et = Fan-tr i.e., Ft = FZu-t- 0 Since, by Lemma 3.6, 52“ = E, the condition in Lemma 3.7 can be written as Ft = Faut$XMSB i, or (1,1), if r< i. M: This can be easily seen from the construction of TSC checkers as proof in Lemma 3.1. 0 Recall that the checkers in Circuit CS are the TSC checkers that detect all unidirectional faults. When a code in F4“ is applied to the fault-free Circuit CH, by Lemma 3.9, Hi produces a code output (0,1) or (1,0), and 11“ generates a noncode output (0,0), ifr> i, or(1,l), ifr< i. By Lemma 3.1, gr: 0, forr l-out-of-Z Code I _ 1 L831) —>’ Figure 3.5 STC for modified Berger code with 1:21“. 47 3.2 NON-MAXIMAL LENGTH BERGER CODE (NMLB) CHECKER DESIGN The partitioning scheme in [30] offers an improvement in delay. However, the implementation is available only for MLB codes. This section presents an extension to the NMLB codes. We first show that a NMLB code B(I,K) with the partitioning scheme in [30] is not a STC checker, and then present a STC checker design for NMLB codes. In addition, the partitioning and folding scheme presented in the previous section can also be applied to further reduce hardware cost. 3.2.1 STC DESIGN WITH PARTITIONING SCHEME Consider Circuit CK for the NMLB code B(I,K) with u < 21“. For the reduced check parts, let SPC = {93,4 1 1 Si 5 u-l} be the set of code vectors and SPNC = {93“ I u S i S 2K‘2} be the non-codewords. Consider a noncode input which is comprised of 12“.] and P: , where P: e SPNC. By Lemma 3.1, the checker Hi, i < u, produces a noncode output (1,1); and the checker Hu generates a code output. Since g1 = 0, for all i < u, and g“ = 1, all Qi’s have the non-code output (0,0), except that Qn produces a code output. As a result, Block 8 produces a code output. In other words, a noncode input may result in a code output. ’Ihus, circuit CK is not code-disjoint. This implies that circuit CK is not a STC. The code-disjointness of circuit CK can be improved by adding some circuits that detect the noncode P: e SPNC and then connecting the output signal of these circuits to both on gates s1 and 32. If P; e SPC, then gI = 0 does not affect the outputs of the on gates. On the other hand, if P: e SPNC, then g’ = 1 results in a noncode output (1,1) in s, i.e., a noncode output results when a noncode input is applied. However, such circuits cannot be realized to be code-disjoint. 'Ihus, circuit CK is not a STC. Moreover, the problem can be resolved by the folding encoding scheme, where the check part Pt is the binary encoding of (I - N100). A Berger code B(I,K) can be encoded as either 48 P, = (2K - 1) - N10,), or P, = 1 - N10,), where 1 s t 5 1+1. Both encoding schemes are to map the (I + 1) codes for the check parts into some of the 2K K-bit codes. Let a K-bit code sequence dK-l,2K-2,...,0> where 0 and 2K - 1 are the rightmost and leftmost codes of the sequence, respectively. For simplicity, the former encoding scheme is referred to as left- justified mapping approach, while the later scheme as right-justified. Since all K-bit codes are included in a MLB code, i.e., l = 2K - 1, both left-justified and right-justified schemes produce the same MLB codes. Table 3.5 shows the Berger codes B( 15,4) and B(9,4) that are encoded by the right- justified scheme. Let v = 2K - u and Iv = {1,2,...,v} be an index set. Based on the right- justified scheme, the reduced check part have SPNC = {Pita l i 6 IV} and SPC = {PZj-l' j = i- v and v+1 s i s 11}. Note that gi is the output of the AND gate (3i which is constructed from P111 or P; , for simplicity, let SGNC = {gil i e I" } and SGC = {gil v+1 S i S u-l }. Figure 3.6 illustrates the circuit for Block G for B(15,4) and B(9,4), where SGNC = {$1. 2.23} and SGC = (3435.36.37) in MM) B(15,4) B(9,4) N101.) Pt. N105) Pt. 1 l 1 1 l 100 3 1 10 3 01 1 5 101 5 010 7 100 7 001 9 01 1 9 000 11 010 13 001 15 000 49 (a) 305.4) (b) B(9.4) Figure 3.6 Function G for B(15,4) and B(9,4) STC design. Figure 3.7 shows the internal structure, referred to as Circuit CW, of a STC design of Berger code B(I,K) with any code length. The structure is similar to that in Figure 3.1. Block H includes checkers for Camp“), 1S i s u; Block 6 consists of 21"1-K AND gates; Block Q consists of u-l pairs of AND gates; and Block S contains two (2K'1)-input OR gates. Figure 3.8 illustrates the STC design for B(9,4). It should be mentioned that, for u < 2K1 SGNC is not a null set. As shown in Figure 3.7, every gi e SGC is connected to Block Q, while every gj e SGNC is simultaneously connected to the OR gates S1 and 82. 50 a H 1 fl '1 smart IF.‘n 1-0:;::(1+1) lb. new”... 5 fl _ 55 .. ”-lfltlzv-W n' m,“ IF. ——— n 1 . We .rs'u -_E n . i M -‘ 2.4 1" ...... 'Ii'a WIN-09001) [=- Retired meet part P-(tlelwllex-ll — 531-81.” — :SluleV/lro Figure 3.7 Circuit CW for B(I,K) STC design. 51 $110 Checker Figure 3.8 Circuit CW for B(9,4) STC design. 52 W10: Circuit cw for u = 2‘"1 is a 31C. m: Circuit cw with u = 2‘“1 is virtually the same as Circuit CH. By Theorem 3.3, the circuit is a STC. 0 mm: Circuit cw for u < 2’"1 is code-disjoint. mt: When a code in subset 521-1 is applied to Circuit CW. the code input produces a code output in Block S. Consider a noncode which consists of J; and P; , where r at 1. If P; e SPC, similar to the proof in Lemma 3.5, the application of such a code produces a noncode output in Block S. On the other hand, for P: e SPNC, since g’ = 1, where gr 6 SGNC, is simultaneously connected to both S1 and 82, a noncode output (1,1) in 8 results. Thus, the circuit is code-disjoint. 0 Similar to the proof in Lemma 3.4, all unidirectional faults in Blocks H, Q, S, and all gi e soc in circuit cw are detectable. Since g1 e SGNC is connected to both 31 and 82, a s/l fault at gi forces Block S to produce a noncode output (1,1) with the application of any code input. For s/O faults at all gj e SGNC, they are redundant, but will not affect the circuit performance. More specifically, in a fault-free circuit, g‘ = 0, for every gi e SGNC. On the other hand, in the presence of 8/0 faults, or l-to-O faults, the reduced check part P; e SPC will never be in SPNC. In other words, applying either a code or noncode input to the circuit will never produce a non-codeword output. Therefore, if we consider a fault set F which consists of all multiple unidirectional faults except the redundant fault, the following lemma results. W: Circuit CW for u < 21“] is self-testing for the fault set F. 0 mm: Circuit CW for u < 2‘“ is a STC checker. m: By Lemma 3.11 and Lemma 3.12, the circuit is a ST C checker. O 53 WA: Circuit CW for any Berger code length is a STC checker. m: By Lemma 3.10 and Lemma 3.13, Circuit CW is a ST C checker. 0 3.2.2 STC DESIGN WITH PARTITIONING AND FOLDING SCHEME Recall that the necessary condition for implementing the partitioning and folding scheme is that every pair of subsets Em and E2“, must be dual. Consider a non-maximal- length Berger (NMLB) code B( 11,4), as listed in Table 3.6. Table 3.6(b) shows that the expanded information parts If and 1:1 are dual, but the corresponding reduced check parts PI and P11 are not dual. Thus, the pair of subsets E1 and E11 are not dual. This motivates the development of the following new encoding scheme that makes all pairs of subsets to be dual. To distinguish the previous left-justified and right-justified mapping schemes, the scheme presented here is referred to as center-justified mapping scheme. For simplicity, we first consider the Berger codes B(I,K) with any odd number I and the even number 11. The center-justified encoding scheme encodes the check part as: P, = (21"1 + 20 - l) - N10,), where 1 s t 5 1+1, i.e., P,=[2K°1+(I+ 1)/2- 1]-N1(J,)=2K‘1+(1+1)/2-t, P1=2K'1+20- 1=2K'1+(1+1)/2- l,and PM = 2‘"1 - 20 = 2‘91 - (I + 1)/2. Table 3.7 shows the Berger code B(11,4) encoded by three encoding schemes. Since all K-bit codes are used to encode the check parts of the MLB codes, the encoding schemes produce the same MLB code. For the case where both I and u are even, as discussed in Section 3.1.2 (B), an extra bit, x0 = 0, is added as the MSB of the information part to make I as an odd number. Thus, the above encoding scheme applies. 54 W (a) Information Check Subset pang) pal-[(13) *1 ‘2 ‘3 *4 ‘5 ‘6 ‘7 ‘0 x9 ‘10 ‘11 ‘12’13‘14‘15 D1 00000000000 1111 D2 00000000001 1110 D3 00000000011 1101 D4 00000000111 1100 D5 00000001111 1011 D6 00000011111 1010 D7 00000111111 0111 D3 00001111111 0110 D9 00011111111 0101 D10 00111111111 0100 D11 01111111111 0111 D12 11111111111 0110 (b) _ Irl'ltlfxpended Réguced orma on ec Subset Part (19) Part (P15) 12*: N ‘5 16 *1 ‘0 '9 ‘10 111315 112313114 E1 000000000001 111 E3 000000000111 110 E5 000000011111 101 F4 000001111111 100 E9 000111111111 011 E11 011111111111 010 (c) Expended Reduced Information Check Subset Part (1.) P 8110’.) xr II: we x: at Ir It 1&1 tutu IIranrsllrt E1 000000000001 110 E3 000000000111 101 E5 000000011111 100 E7 000001111111 011 E9 000111111111 010 (d) Eu 011111111111 001 Expended Reduced Information Check -- - Subset Part (1‘) Part0”) Partitioning x1 32 13h ‘3 ‘0 '7 ‘0 ‘9 ‘10 111115 312113114 ¥ Zl 000000000001 111 szx(lll) Z3 000000000111 110 C3,lzx(110) Z5 000000011111 101 C5,lzx(101) 55 WW B(11,4) Left-justfied Right-justfied Center-justfied P, 1111 1011 1101 P2 1110 1010 1100 P3 1101 1001 1011 P4 1100 1000 1010 P5 1011 0111 1001 P5 1010 0110 1000 P7 1001 0101 0111 P8 1000 0100 0110 P9 0111 0011 0101 Fm 0110 0010 0100 P" 0101 0001 0011 P12 0100 0000 0010 Based on this encoding scheme and partitioning and folding scheme, a Berger code B(11,4), as shown in Table 3.6, can be constructed by B(11,4) = C1/12X(10) U C3/12X(01) U €5,12X(00), (3.8) where u = 6 and u = 3. In a general case, a Berger code B(I,K) with any even number u can be constructed by ‘D’s (2m-1)/(I+1) codes, where 1 S m S e. Let v = 2‘"2 - o. For the reduced check parts, the set of non-codewords SPNC :- (F; l i e 1,} and the set ofcodewords SPC = {Ff l v+1S i So}. Since gi is the output of the AND gate Gi which is constructed from Pf , for simplicity, let SGNC = {gil i 6 IV} and SGC = {git v+lS i 5 '0-1 1. Figure 3.9 illustrates the circuit for Block G for B(15,4) and B(11,4), where SGNC = {g1} and SGC = {g2, g3} in B(11,4). Figure 3.10 shows the internal structure, referred to as Circuit CL, of a STC design for B(I,K) with any even number n. Circuit CL consists of a code-complementer and a checker. The checker is comprised of the following four blocks: .56 SGC (a) A Y1? g‘ _ 43—- ‘F:D_:53_ " y“ r g2 Y1: Xi 9 X15. 1: 17.18. (b) Figure 3.9 Function G for (a) B(11,4); and (b) B(15,4) STC design. 57 l a s For l-out-of-(hl) Code (21-1)—out—of-(I¢l) Code STC For (11 -l)-out-of-('l+l) Note: u = ruff]. _ :Bul Signal V = 211-2.. u. —. : Single wtre x“10433) Figure 3.10 Circuit CL for B(I,K) STC design with even 11. 58 Block H: Checkers for C(Zi-l)/(I+1)t 1 S i S 1); Block G: 21"2 - (K - 1) AND gates; Block Q: (t) - 1) pairs of AND gates; and Block S: Two 2K'2-input OR gates. Circuit CL is identical to Circuit CS for the Berger codes with u = 2‘“. The same Blocks G and S are used for all STC designs for B(I,K), where 2’"1 s l 5 2‘1 However, the structure in Blocks H and Q depends on the length of I. Since SGNC is not a null set for any even number u < 21"], every gi e SGC is connected to Block Q, while every gj e SGNC is simultaneously connected to the OR gates S1 and 82. Figure 3.11 illustrates the STC design for B(11,4). W: Circuit CL is code-disjoint. m: When a code in subset 132“ is applied to Circuit CL, the code input produces a code output in Block S. Consider the application of a noncode which consists of I; and P; , where r at t, to Circuit CL. If P; e SPC, by Lemma 3.3, the checker Hi produces a noncode (0,0), for i > q, or (1,1), fori < q. Since gi = 0, for i > r, and gr = l, a noncode output (1,1) is produced in Q', if r > q, and further generates a noncode output in Block S. 0n the other hand, for P: e SPNC, since gr = 1, where gr is connected to both 81 and 82, and gr = 1 and g’ e SGNC a noncode output (1,1) in S results. Thus, the circuit is code-disjoint. 0 Similar to the proofs of Lemma 3.11 and Lemma 3.12, Circuit CL can be shown as code-disjoint and self-testing for the fault set F. Thus, it is a STC checker. The previous discussion assumes that u is even. For the odd number u, it is necessary to convert it to a even number. More specifically, as shown in Table 3.7 for B(11,4), where u = 6, the check parts P5 = (1001), P5 = (1000), P7 = (0111), and P8 = (0110). In general, for an even number u, the check parts P ,1 = 21"1 + l, Pu = 21"], 59 ‘12 Figure 3.11 Circuit CL for B(11,4) STC design. 60 Fm, = 2‘"1 - 1, and PM = 2‘“1 - 2. The pairs (Pu-1,Pu) and (90,1,ng are referred to as center pairs. If u is an odd number, then 21) = u + 1. For example, for B(9,4), u = 5 and u = 3, there exist 5 reduced check parts PI, P3, P3, P1 and P9. Among them, Pf can be paired with F; , and F; can be paired with F; , but F; cannot be paired with anyone. In general, for any odd number 11, P; will not be paired with any other reduced check parts available in a B(I,K). Thus, the codes for the center pair (Pmermz) are not assigned to any check part, or they are defined as noncode. Mathematically, the check parts are encoded as follows: F,= (21"1 + 20 - 1) - N10,) +w, whereW = 2, irtz u + 2; w = 0, otherwise. Thus, the presented encoding scheme can be restated as follows: Modified Berger Code Encoding Scheme: For a Berger code B(I,K), the check part F, is the binary encoding of (21"1 + 2p - 1) - N10,) + w, where {0 ifuiseven,oruisoddandt<(00)- (39) Similar to the previous discussion, if v = 21"1 + 20 - 1, for the reduced check parts, the set of non-codewords SPNC -_- {Fin i e 1,) and the set of codewords SPC = {Ffl v+1 Si 5 11}. Also, SGNC = {gin i e 1,) and SGC = {g‘I v+1 Si 5 u-l}. For example, SGNC = {g1} and SGC = {g2,g3} in B(9,4). This shows that the STC structure for B(I,K) with an even number 11 is almost identical to that with an odd number u, except an AND gate is added, as shown in Figure 3.12, for detecting the noncodes, 2‘"l - 1 and 2K-1 - 2 for the center pair (Pmlrpu+2)r More specifically, consider the Berger code B(9,4), for example, the codewords for the reduced check parts are {(110),(101),(100), (010),(001)} and the non-codewords are {(111),(011),(000)}. For a noncode which consists of J; and a 61 ‘1 ‘2 ll Ecuador: d d m For # (21-l)-out-or-a+l) ,4 ”-111 Code 3" MC Pt! (11 -l)-oul-of-(I+l) Code Retired Checker Put Fuhrer-31.521 K-2 ‘19! Note: 1) I [tr/21. v = 2"2- 1). — :BtlSlnll _ :Shdeme Hoke: Figure 3.12 Circuit CL for B(I,K) STC design with odd u. 62 reduced check part (011), since XMSB(t) = 0 and P3 = (11), the code-complementer will produce an alternative code 1; to Block H and a code (00) to Block G. Thus, a code output results in Block S. This implies that, without the extra AND gate, the circuit in Figure 3.13 is not code-disjoint. Since the code (x1,x1+1,...,x1+x_1) = (11...1) is always a non-codeword in a B(I,K) with an odd number u, and the code (011...l) is assigned as a non-codeword, hence, combining these two non-codewords as (x1+1,x1+2wx1+x,1) = (11... l), where x1 is a "don’t care" term. Thus, an AND gate which takes (x1+1,x1+2,...,x1+1(-1) as its input can detect the non-codewords and makes the circuit in Figure 3.12 be code-disjoint and further be a STC checker. In summary, the STC structure in Figure 3.10 is implemented for the Berger codes with an even number u, while the STC structure in Figure 3.12 is employed for the Berger codes with an odd number u. The only difference is in the exua AND gate. Thus, the presented STC structure can be implemented with any Berger code length. 3.3 Design Alternative Previous sections have presented the STC design for both MLB and NMLB with partitioning and folding scheme. Results show that the designs need only half of the checkers required in [30]. The significant reduction is achieved by XNORing each input bit with the MSB of the reduced check part P", as indicated in Figure 3.2. As shown in Table 3.2, there are two ways to fold the subsets apps, i.e. we may fold the subsets either upward, i.e., choose the pair (21,23), or downward, i.e., choose the pair (25,24). Circuit CS, as shown in Figure 3.2, was generated based on the upward folding. Thus, it is also possible to generate an alternative design using the downward folding. Similar to Theorem 3.2, the following Corollary results. ‘13 ‘12 63 Figure 3.13 Circuit CL for B(9,4) STC design. 64 mm: If m < u, then any unidirectional errors in a subset Zm = Em$xMSB(m) can be detected by the checker C(zu-m)/(I+1). Therefore, in Table 3.2(d) if the pair (25,27) is chosen, then the Berger code B(7,3) can be expressed as B(7,3) = C5/8X(01) U C7/3X(00) (3.10) In general, if the downward folding is chosen, a Berger code B(I,K) can be expressed as u # B(I,K) = LuJ/ZCai—U/(IH) poi-I (3°11) Figure 3.14 shows the internal structure of Circuit CS with the downward folding scheme, where the code complementer is comprised of XOR gates. The above results conclude that, based on the MSB of P‘, the subsets 5214's can be folded by choosing either the upper half or lower half of these subsets. Both implementations have similar structures as shown in Figures 3.2 and 3.14. The former uses XNOR gates while the latter employs XOR gates. In practice, however, any bit of p* can also be used to fold the E subsets. More specifically, consider P" = (x1+1,x1+2,...,x1+k.1). Let yi = x144“, for simplicity, i.e., P"'= (yo,y1,...,yk.p), where yo and er.2 are the MSB and LSB of PI, respectively. W: Let SE be a collection of subsets E"s in B(I,K). Se1(p) and Seo(p) are the two yp-partitioned subsets of SE if Se1(P) (Seo(p)) is a collection of El 6 SE whose yp-bit of PI is 1 (0). W: Let Sc be the checker set for B(I,K), where Sc = { C(Zt-l)l(1+l)' I S t S u}. Sc1(p) and Sco(P) are the yp-partitioned checker subsets of Sc if Sc1(p) = {Ct/(1+l)' Et 5 Sel(P)}r Sc0(13)= {Ct/(1+1) I Et 6 Seo(P)l. and Sc1(13)u Sc0(P) = Sc- 65 BlockH ‘1 STC for (Hunt!) Code *2 STC for E | I . (”3m“) Code hfamution : Part 1‘ 3 I _ :Bus Signal x...(MSB) _ :Single Wire Figure 3.14 Circuit CS with XOR code complementer. 66 W: Let Ic0(p) (Ic1(p)) be the index set of the yp-partitioned subset Sco(p) (Sc1(p)). Then, Ic1(p) = {2K‘P’1j + (2t - I) 10 Sj fill and l StS 2K1"2 ); and Ic0(P) = Ie - Icl(P)- (3.12) Table 3.8(a) lists the subsets 521-1 and the reduced check part of B(3l,5) and the corresponding index sets are shown in Table 3.8(b). It shows that there exist eight possible STC designs of Berger code B(3l,5). The choice is determined by which checker subset requires the least gate count. W: Let Mip be a measure of the implementation with the yp-partitioned subset Sci(p), where i = 0 or I, and 0 S p S K-2. Mip << qu means the implementation with the yp-partitioned subset Sci(p) is better than that with SCI-(q). The measure in Definition 3.5 can be either the speed performance and/or the total gate count for a yp partitioned checker subset. However, since it has been shown that the partitioning scheme offers an improvement in speed performance, the measurement here, thus, concentrates on the total gate count. 2mm: Let Ma(q) = min{M¢i(P) Ii: 0 or I, and 0 S p SK-Z} be the best measure among Md(p)’s. A Berger code B(I,K) can be implemented by 0 a B(I,K) = iK_JIC(2i-l)/(l+l) xp2i-l (3.13) which requires least gate count, where P3 = PI - [yq}. 67 Ian]: 3 8' Reduced Cher]: Bart to: 13131 3) (o Subsets YOYIYZY3 El 1111 E3 1110 55 1101 E7 1100 E9 1011 EH 1010 1313 1 0 01 EH 1000 E” 0111 Em 0110 En 0101 53 0100 E3 0011 en 0010 Be 0001 EH 0000 0) Reduced ' Check Bit Ict(p) Ieo(p) o) 0 { l,3,5,7,9,11,l3,15} { 17,19,21,23,25,27,29,31} I {I,3,5,7,17,I9,21,23} {9,11,13,15,25.27,29,31} 2 {I,3,9,ll,17,19,25,27} {5,7,13,15,21,23,29,31} 3 {1,5,9,13,l7,21,25,29} {3,7,11,15,19,23,27,31} 68 Theoretically speaking, any Berger code STC checker can be designed either with the partitioning scheme, or the partitioning and folding scheme. Note that the STC design with either scheme requires a set of mln code checkers. In practice, not all mln code checkers can be efficiently designed with the TSC property. This implies that not all Berger code checkers can be designed efficiently with the partitioning scheme. However, among the various alternative designs presented in this section, it is possible to find a folding which employs those existing mln code checkers to achieve the best solution. 3.4 SUMMARY Based on the partitioning scheme presented in [30], both information part and check part of a Berger code are expanded and reduced, respectively. The scheme offers a significant improvement in speed performance, but requires a great hardware cost. This chapter presents a partitioning and folding scheme to improve the hardware cost and speed performance as well. Results have shown that the partitioning and folding scheme reduces the hardware cost nearly by half compared to the scheme in [30]. In order to demonstrate the effectiveness of the presented scheme, Table 3.9 summarizes various existing design schemes. Results show that the partitioning and folding scheme indeed provides an efficient, yet low hardware cost, ST C design for Berger codes. However, the table also shows that the code length of the checker grows linearly with the number of outputs of the functional circuit, but the complexity of the checker grows rapidly and the hardware cost increases nearly exponentially as the code length increases. Interestingly, the table shows that the use of many smaller code checkers may take much less hardware cost than that of a larger checker for the same total information bits. Therefore, if the output bits of a given functional circuit can be partitioned into smaller groups which employ smaller checkers, then the hardware cost can be further reduced. This has motivated the development of an output partitioning algorithm that achieves this objective in the next chapter. 69 CE :cost-effective design ML : minimal-level design * : with the developed partitioning and folding scheme CHAPTER 4 OUTPUT PARTITIONIN G ALGORITHM This chapter describes an efficient output partitioning algorithm. The algorithm partitions the output of a given functional circuit into many smaller groups and each group employs a smaller checker. We first describe the problem to be solved. Then, the problem is formulated in Section 4.2, and an algorithm is given in Section 4.3. Section 4.4 discusses some practical design considerations. Experimental results on MCNC benchmark finite- state machines (FSMs) are shown in Section 4.5. 4.1 PROBLEM STATEMENT Consider the following matrix, referred to as an essential output matrix, which represents all possible outputs of a given circuit, 123456789 (—colurnnindex 000000000 100110000 000001100 (4.1) 001000010 110000101 70 71 A 4/13 code checker is generally chosen for a self-checking circuit design. The checker circuit can be realized either by 721 gates with three-level implementation [17], or by 105 gates and 7 levels with the cost-efi'ective implementation [20]. The output matrix may be partitioned into either the following two submatrices 12348 5679 00000 0000 10010 1000 00000 0110 M2) 00101 0000 11000 0011 or the following three submatrices 2346 589 I7 0000 000 00 0010 100 10 0001 000 01 (43 0100 010 00 1000 001 11 The former requires two checkers for C2” and C715, which can be respectively realized by 26 and 20 gates with 3-level implementation [17], or 46 gates and 3 levels in total excluding a checker for the final result. (In general, a two-rail checker with 6 gates and 2 levels can be used as the final checker.) On the other hand, the circuit for Matrix (4.3) needs three checkers for Cm, C1 ,4, and C214, which can be respectively realized by 10, 8, and 6 gates with 3-level implementation. The above example demonstrates that the use of a larger code checker may require a larger gate count and gate level than that of many smaller ones in total. Table 3.9 also endorses the above finding. Now, the question that naturally arises is how to partition the outputs of a given circuit so that smaller code checkers can be employed to reduce hardware overhead and increase speed performance. 72 4.2 PROBLEM FORMULATION Consider an output matrix C = [c;;]m, where the entry c;; is either 0 or I, and its column index set F = {l,2,..., m}. Let R; denote the set of column indices in the i-th row with c;; = l, i.e., R; = { j | c;; = I}, and IR;| be its cardinality. A matrix is called a k-set matrix if it has at most k 1's in its rows, i.e., lRmxl = max{lR;l I S i S n} = 1L Suppose that the k-Set matrix C is partitioned into t submatrices whose column index sets are denoted as G;, 1 S i S t, where (J G; = l" , (‘1 G; = Q, and lG;| S k. For simplicity, we will first assume that all sub‘rn-altlices are r-St'ets.l (The constraint will be relaxed later.) Therefore, our problem is to develop an efficient algorithm that partitions a given k-set output matrix into a minimal number of r-set submatrices, where r S k. This problem can be formulated as follows. W: To minimize t l l Subjectto UG; = F, (16;: @,and lG;nR;lSr,foralliandj. i = 1 i = 1 The condition, IG; n le S r for all i and j, implies that a set G; cannot contain more than r elements in a set R;. In other words, the set G; should be constructed such that any (r + 1) elements of a set R; cannot be in the same 0;. Thus, the theorem defines the lower bound on the number of the partitioned submatrices. W: Ule/rjs t. m: Since lle = max{lR;l 1 Si S n} and at most r elements in Rum can be in the same 0;, this implies that t will not be less than LllelrJ, i.e., LlRmVrJ s t, where the equality possibly holds only when IRMI = rt. 0 73 Note that, for IRMI = m or IRmax | S r, the given output matrix can be arbitrarily partitioned into any rm/r1 r-set submatrices. Therefore, no partition is needed in our implementation. As mentioned, we are dealing with the output matrix C which consists of all possible outputs of a given circuit. The question is whether or not we should deal with all possible outputs in that matrix. Rs c; R; denotes that the j-th row covers the s-th row. W2: Let G;'s be some sets that satisfy the conditions 5910;- = F, £01 G; = E, and lG;nleSr,forl SiSt.IfRsch,thean;nRs|Srfor I SiSt. 2:991: SinceRsch, we haveG;nRsc;G;nRJ-. Thus, lG;nRs|SlG;nRJ-|Sr. O A row is essential if it is not covered by any other row. Thus, the covered outputs and the duplicated outputs are not necessarily included. In other words, we are only dealing with the matrix, referred to as reduced matrix, which consists of all essential rows. Consider the MCNC benchmark FSM, ex4.kiss2, as shown in Figure 4.1(a), and its output matrix, Figure 4.1(b). There exist four essential rows which form the reduced matrix in Figure 4.1(c). The reduction in dimension is Significant. Based on Theorem 2, Problem P can be re-formulated as follows. Emblem: Tominimizet o l l subjectto UG; = F, (KG; = @,anle;nleSr,foralliandessential i=1 i=1 rowsj. 74 Column Index .i 6 .o 9 Column Index 4,21 Row123456'789 .s 14 N0- 1 ----- 13110000000 1110000000 1 ----- 32000000000 2000000000 1 ----- 25001000000 3001000000 1 ----- 57000000000 4000000000 10---- 7 7 000000000 5 0 0 ° 0 0 0 0 0 0 11---- 7 11 100110000 5 1 ° 0 1 1 ° 0 0 0 1 ----- 1112100100000 7100100000 1-1--- 12 8 000001100 3 0 0 0 ° 0 1 1 0 0 1-0--- 12 8 000000100 9 ° 0 0 0 0 0 1 0 0 1-0--- 8 3 110000000 10 1 1 ° 0 0 0 0 0 0 1-10-- 8 3 110000000 11 1 1 0 0 0 0 0 0 0 1-11-- 8 4 110000000 12 1 1 0 0 ° 0 0 0 0 1---1- 4 13 000000010 13 ° 0 0 0 0 0 0 1 0 1---0- 4 13 000000000 14 ° ° 0 0 0 0 0 0 0 1 ----- 1314001000010 15 001000010 1 ----- 14 6000000000 15 000000000 10---- 6 6 000000000 17 ° ° ° ° 0 0 ° 0 0 11---- 6 9 100110000 13 1 ° 0 1 1 ° ° 0 0 1 ----- 910100100000 19100100000 1----1 10 3 110000101 20 1 1 ° ° 0 0 1 0 1 1----0 10 4 110000100 21 1 1 0 0 0 0 1 0 0 (a) the original kissZ file. (b) output matrix. Figure 4.1 A FSM, ex4.kiss2. Row 123456789 No. 6 100110000 8 000001100 15 001000010 20 110000101 (0) reduced output. 75 4.3 ALGORITHNIS For simplicity, we first present the algorithm for r = 1, and then for r = 2 and other r. A. Case of r = 1 For r = l, by Property 1, a Gi contains at most one element of a Rj set. The question that arises is which column should be selected first when a Gi set is constructed. In this development, a simple rule is presented to establish the selection priority. We first record the column count, i.e., the number of 1's in each column, of the reduced matrix. A column with a higher count means that it most likely belongs to more R,- sets and it will be more difficult to satisfy the condition B, n le S 1. Thus, we apply a heuristic that the column with a higher count should have a lower selection priority, and if two columns have the same count, the lower column index is defined to have the higher priority. Let I“, = 41,72...” ym>, a permutation of 1", denote the descending priority set, i.e., the priority of 71 is higher than that of 75 if i < j. The priority set Pp is used to construct Gi's. Once the priority set 1"p is generated, a selection process follows. Let Sc be the ordered set that consists of the column indices to be selected, where Sc = I“p = 41.72,... ym> initially. We start constructing the set 61 by selecting the leading element, 73, of the set Sc, where 78 = 71 and 01 = {71}. Since each Rj contains at most one element in a GI, those Rj sets which contain 71 must be excluded from Sc, i.e., Sc = Sc - U [Rj I 71 e Rj}. The same process is carried out repeatedly by including the leading element of the set Sc into the 61 set until Sc = 0. Once 61 is generated, the construction of 62 proceeds. In order to keep the disjointedness, 61 must be excluded from 8,, i.e., Sc = Sc - GI. The above procedure is then repeatedly applied for generating the (32 set and other 6, sets. The process is finally terminated when all m columns are selected, i.e., \‘J G, = F. Of course, all Gi’s are disjoint, i.e., (IN G, = Q. i: ! i = l The above procedure is summarized in Algorithm 1. 76 Algorithm 1. (r = 1) Step 1. Generate the reduced output matrix and derive the R]- sets. Step 2. 1“le = max{ IRil 1 Si 5 n} = m or IRMI S r. then STOP; /* No partition is needed. *I Step 3. Establish the selection priority and generate the set Pp = <71.72,...,‘ym>. Step 4. Selection Processes u = 0; Repeat 4.1.u=u+ l;Gu=Q;Sc=I‘p; Repeat 75 = the leading element of Sc; Gu = Gu U {Vs}; Sc=Sc-u {Rj lyse Rj}; Until Sc = G; 4.2. I‘p = I“, - G“; Until Fp = G; W: Consider the reduced output matrix of Figure 4.1(b). The Rj-sets are generated as follows: R6 = {1.4.5}, R3 = {6.7}. R15 = {3.8}. and R20 = {1.2.7.9}. Since Column land 7 have 2-count. while the remaining columns have only l-count. the priority set is Pp = d,3.4.5,6.8.9.1.7>. By Algorithm 1, we first select 73 = 2. where 2 6 R20 = { 1.2.7.9}. Thus. Sc = Sc - R20 = <3.4.5.6.8> and GI = {2}. This is followed by selecting 18 = 3. This results in G] = {2.3} and Sc = Sc - R15 = {4,5,6}. After subsequently selecting 4 and 6. we obtain 01 = {2.3.4.6} and Sc = E. Thus. the construction of GI set completes. For constructing the 62 set. Sc = I“, = I“p - GI = <5.8.9,l.7>. Repeating Step 4.1 we obtain 62 = {5,8,9}. G3 = {l}, and G4: {7}. Examnlm 77 Consider the MCNC benchmark FSM. ex1.kr'ss2. consisting of 9 inputs, l9 outputs. 20 states, and 138 product terms. Figure 4.2 shows the reduced output matrix which contains only 10 essential rows. Based on the Rj sets listed in Figure 4.2, the priority set op: <6,8.11,14.15.16.17.18.19.1.12,2,3,9.13.7.5.10,4>. Repeating Step 4 of Algorithm I. we obtain GI:{6,8,11,14,16.17.19.3};02 = {15.18.1};G3 = {12.2}; G4 = {91:65 = {131;06 = {7}; G7 = {51:63 = {101;69 =14}- Row Column Index Index 1234567890123456789 ll 3 9 20 22 32 39 53 99 127 B. Caseofrzz 1111100000000000000 1000011000000000000 0111101010000000000 0101100101000000000 0001101001111000000 0000000001000110000 0001101011001001000 0001101011011000110 0001000001001000001 0011100001000000000 .— 579 8 0 10111213 9101316 791012131718 11319 4510 CM Figure 4.2 The output matrix and Rj-sets for ex1.kissZ. For r = 2, the condition. 16, n le S 2 for all i and essential rows j, implies that a Gi cannot include more than two elements of Rj. The priority set is generated as follows. Consider a pair set Qj which consists of all. pairs of the elements in Rj. i.e.. Q = {(u1.u7)l 78 all mm; 5 Rj}. Based on these pair sets. the descending priority set 1'}, = <71.72,....yd>. I! U Q14 is the number of distinct pairs. 1' = 1 . \Vrthout loss of generality, let yq = (uq1.uq2) and 7p = (upbupz). The priority of yq is higher where each 'yq is a pair of elements in a Rj and d = than that of 7p if (1) the number of occurrences of 7p is higher than that of yq, or (2) when both have the same number of occurrences. uql < upl. or uql = up] and uqz < upz. Note that QJ = Rj for r = 1. The same concept can be extended for any r, where the r-tuple sets Q = {(u1,u2,...,u,)l all u1,u2,....ur e Rj}. and the priority set I“p = 41,72....,yd>. where each yq is a n-tuple elements of a Rj and d= ll U Q4 . The priority is defined in the same manner as for 1' = 1 r = 2. Similar to Algorithm 1. Algorithm W is employed for r 2 2. Algorithm W Step 1. Generate the reduced output matrix. derive the R]- sets, and the r-tuple sets Q = {(u1.u2,...,u,)l all u1.u2,...,trt e Rj}, for all 1 S j S n. Step 2. If lle = max{lRil 1 Si S n} = m or IRMI S r. then STOP; 1* No partition is needed. *I Step 3. Establish the selection priority and generate the set I“p = <71.72,...,7d>, where d = Q Q J . Step 4. Selection Pjrgcl u = 0; Repeat 4.1.u=u+ l;Gu=Q; Sc=sz Repeat 7,. = the first element of Sc; 6,, = G“ U {73}; Sc=Sc-u {Q 1736 Qj}; Until Sc = Q; 4.2. Fp=Fp-Gu; Until I‘p=@; 79 W: Consider the reduced matrix of Figure 4.1(c). For r = 2. the QJ sets are generated as follows: Q1 = {(1.4).(1,5),(4.5)}; 02 = {(6.7)}; 03 = {(3.8)}; and Q4 = 1(1.2).(1.7).(1.9).(2.7).(2.9).(7.9)}- Thus, the priority set Cp = <(l,2).(1.4).(1.5).(l.7),(l,9),(2.7).(2,9).(3.8),(4,5).(6.7).(7.9)>. Similar to Example 4.1, we obtain 01 = { 1.2.3.4.8} and 62 = {5.6.7.9}. Similarly, following Algorithm W, the following five 2-set submatrices are generated for ex1.kis.s'2 of Figure 4.2 in Example 4.2 G1: {l.2.6,8.9.11,12.14,15.l6.19}; 62 = {3.7.10}; G3 = {4.17}; G4 = {5.18}; and 05 ={13}. 4.4 DISCUSSION The presented algorithm partitions a given k-set matrix C into t submatrices which were assumed to be r-sets. i.e.. each submatrix has the same dimension. The question that arised is whether or not this assumption is necessary. First. a r-set submatrix can be obtained by combining r l-set submatrices. For example. consider the l-set submatrices in Example 4.1. the submatrices GI and G3 can be combined as a 2-set submatrix. so do 62 and G4. Secondly. the total hardware cost of Up]. l/p2. ..., l/pt code checkers is generally much less than that of V(p1+p2+...+po. For example, the gate counts for C1 ,2, C1 ,4. and C1 ,4 checkers are respectively 2, 8, and 8, with 3-level implementation. The total is 18 gates which is much less than 127 gates for a 3/10 (=(l+1+l)/(2+4+4)) code checker with the same 3-level implementation. Finally. the most important fact is that Algorithm I for generating 1-setsubmatrices is simple, yet efficient. Based on the above observation. generating l-set submatrices for such an implementation is adequate. In other words. we should implement with all l-set submatrices. 80 However, implementing many l-set submatrices is not free of penalty as far as the overall hardware cost is concerned. Consider the structure of a self-checking sequential circuit. where the next state functions are encoded in the mln code and the output functions are encoded in either mln or Berger code [54]. Two self-checking checkers are employed to monitor the next state functions and the output functions. The outputs of these two checkers are then fed to a two-rail. or C2“. checker to produce the final checker output. Suppose that the output functions have been partitioned into t l-set submatrices. where each submatrix needs a code checker. Then. the outputs of these tcheckers and the outputs of the next state function checker are fed to a (t+l)l2(t+l) code checker to produce the final checker output. (Such an implementation generally results in less performance degradation.) Apparently. the hardware cost. for (t+1)/2(t+1) code checker. increases as t increases. For example. exI.kissZ in Example 4.2 requires a ang checker for its output functions and a two-rail checker for the final checker output. With the presented partitioning algorithm. however. the output functions are partitioned into 9 l-set submatrices and the circuit needs 3 checkers for C1/9, C114, and Cm. for lGll = 8, IGZI = 3. and IGgl = 2. respectively, no checkers for lGil = 1. 4 S i S 9. and a C1000 checker for the final checker output. In summary, each l-set submatrix needs a very low cost checker to implement with. but the use of many l-set submatrices may require a larger checker for the final checker output. Thus. in order to attain the optimal result. it is necessary to generate all 1-set submatrices. However. some of these submatrices with lower cardinality lGil should be combined into a larger r-set submatrix to reduce the size of the checker for the final checker output. Experimental results presented in the next section will discuss the trade-om 81 4.5 EXPERINIENTAL RESULTS This section describes the implementation of the partitioning algorithm in the design of self-checking circuits for reducing hardware overhead and performance degradation. In order to demonstrate the efl’ectiveness of such an implementation. experimental results on MCNC benchmark FSMs are presented. Table 4.1 shows the experimental results for MCNC benchmark FSMs without implementing the developed partitioning algorithm. where only those FSMs with more than 5 output variables are tested. The columns "#i", ”#p", ”#8". and "#0" represent the number of inputs. product terms. state variables. and outputs. respectively. The column "max #l's" lists the maximal number of 1's. k, in the rows of each output matrix. i.e., the output matrix is a k-set matrix. The output matrix may be encoded in either mln codes or Berger codes. Since none of existing checker design approaches can be universally, optimally implemented for any code length. the experimental results listed in Table 4.1 are obtained by implementing the design approaches which are selected with the following priority: For mln code checkers. (1) if m 2 3, 4m 2 n > 2m. the efficient checker design [20] is applied; (2) if n is small, the checker design with 3-level implementation [17] or 2-level implementation is employed [15]; (3) if n = 2m or n = 2m i l. the efficient checker designs in [14,15] are applied. respectively; and (4) the checker design in [13] is implemented for the remaining cases. On the other hand, for Berger code checkers. among the design approaches [28-31] the choice is determined by whichever provides the optimal results. Note that mln code checker is generally realized with less gate level than the Berger code checker. but it is achieved at the cost of increasing gate count. However. the Berger code checker requires much less check bits in the function circuit than the mln code checker. Thus. the mln code checker is implemented when gate level is of concern. 82 Checker new] Berger code 3/10 gate 127, level 3 code B(7,3) gate 76“ 3110 127, 3 B(7,3) 76.. 9/28 2043: 309.5) 241. 4113 105, B(9.4) 120. 5/13 110, B(8.4) 5/21 2486. 306.4) 9/28 2043. 809.5) -2.0x10’. B(56.6) 283; B( 10.4) 91, B(6.3) 91, B(6,3) B(9,4) B(7,3) %: [14]; +: [151;3: [17]; ‘: [18]; l: [20]; @z [25]; #: [29 ]; &: [30]; u: [31]. 83 Table 4.2 lists the experimental results for FSMs with the partitioning algorithm. In this implementation. we first apply Algorithm 1. for r = 1. to generate the partitionedl-set submatrices. For instance. the output matrix of ex4.kiss2. as shown in Example 4.1. is partitioned into four l-set submatrices. where lGll = 4, lel = 3. and IG3| = |G4| = 1. Thus. "4.3.2“ is recorded in the column ”Groups" of Table 4.2 for ex4.kiss2. where the asterisk indicates the number of l-set submatrices with lGil = 1. Once all l-set submatrices are generated. the submatrices with lGil S 2 are combined and formed into a larger submatrix. Thus. the checkers required for ex4.kiss2 are C1/5. C1/4. and C214. As indicated in Table 4.2. based on the design approaches in [15.17]. the checkers need 23 gates and 3 levels. Consider the FSM, ex1.kiss2, as discussed in Example 4.2. its output matrix is partitioned into nine l-set submatrices. where lGll = 8. |G2l = 3. IG3I = 2, and |G4l = |65| = lGél = |G7l = ngl = ngl = 1. Thus. "8.3.2.6“ is recorded in column "Groups”. The checkers for G1 and G2 are C1 ,9 and cm. respectively. Combining submatrices G3 through G9 into a larger submatrix. one may need either a B(8.4) or C-m5 checker. As indicated in Table 4.2. the implementation of the checkers for C119. cm. and B(8.4) requires 116 gates with 12 levels, while the checkers for C119, C1/4, and C7,” need 304 gates with 3 levels. It should be mentioned that the former implementation requires only 6 check bits in the functional circuit. while the latter one needs 9 check bits. It is obvious that the hardware reduction in the former implementation is achieved at the cost of requiring more gate levels. In our implementation, an mln checker with n 210 should be partitioned further in order to eliminate the fan-in constraint for the efficient 2- or 3-level implementation. For example. the 7/15 code in exIJtissZ is partitioned into 3/7 and 418 codes, where the efficient mln checker design with n = 2m [15] and n = 2m 1 1 [14] can be employed. As indicated in Table 4.2. the implementation of Cup. Cm, C3”, and C4,; checkers requires 73 gates and 3 levels. Compared with 116 gates and 12 levels for C1 ,9. Cm. and B(8.4) checkers and with 304 gates and 3 levels for C1/9, Cm. and C7,” checkers. the reductions in both gate count and level are significant. 84 Checker 1/6,2!4 l/4.2/6 1:9.1/4.B(8,4) 1/9,1I4.7/15 ll9.1/4.3fl ,4/8 1/5. 114.214 l/4.B(5.3) 1/4.5/10 1/4.2l4,3l6 9.23.2.2" lllO.1/4.3/5 6.3.2.2.6“ 1/7,1/4,4/9.4l9 17.9,5,4,4,3, 2.2.2.245" 1118,1110.ll6.1/5.1l5,1/4.B(14.4) lll8, l/10.116.IIS.115.l/4.1W U18,1I10. ll6, 115. 115. U4.2/4.4x(2/5) 52.3" 116,419 1/4,3I6 ll5,5/10 “6.2/4 85 Sirrrilarly. in addition to the checkers for Cmg. (31/10, Cm. C15. C15. and Cm. the schcissZ may be implemented with either a B(14,4) or CID/24 checker. However, due to the property of "2.2,2.2.6*" in the partitioned submatrices. the CID/24 can be partitioned into a C2“ and four Cys’s. Compared with 2110 gates and 6 levels for a CID/24 checker, it requires only 62 gates and 3 levels for the smaller checkers. As a result. the implementation is even better than that with Berger code in both gate count and level. In order to demonstrate the effectiveness of the partitioning algorithm. Table 4.3 compares the experimental results for MCNC benchmark FSMs with and without implementing the partitioning algorithm. As mentioned. without the partitioning algorithm. a 2/4 code checker, realized by 6 gates and 2 levels. is needed for the final checker output. while a (t+1)/2(t+l) code checker is required for implementing with the partitioning algorithm. where t is the number of the smaller checkers which constitute the self-checking checker. Thus. a C35 checker. realized by 20 gates and 2 levels, is employed for those MCNC benchmark FSMs. bbsse, cse. styr, opus. s1. sand, and sse. For scf. it requires a C1234 checker for the final checker output. In order to reduce the hardware cost. while still keeping less gate levels. the C1204 is partitioned into four C3/5’s and their outputs are checked by a Cm checker. As a result. these smaller checkers requires only 85 gates and 6 levels. or 160 gates and 4 levels. For any circuit realization. there exists an inverse proposition relationship between area and delay. The reduction in gate count is generally achieved at the cost of increasing gate level, and vice versa. In Table 4.3. the columns "tn/n code" and "Berger code” list the gate counts and gate levels of various circuits without applying the partitioning scheme. The mln code checker takes less gate level than the corresponding Berger code checker, but it requires more gate count. The remaining columns indicate the experimental results of the circuits with the partitioning scheme. In our partitioning scheme, the optimization may be - achieved by optimizing gate count, or gate level, or gate count under gate level constraint. The gate count optimization attempts to obtain the minimal gate count regardless of gate 86 b14.' m 'n O‘MOKMMGUIMUI 87 level. Similarly, the gate level optimization targets to reduce gate level without considering gate count. Both optimization results are listed in Table 4.3. Interestingly. the gate count and gate level in both optimizations are much better than those without the partitioning scheme listed in the second and third columns. It should be mentioned that the use of Berger code generally requires less check bits than that of mln code. This implies that the reduction in gate level of the checker circuits may be achieved at the cost of increasing the number of check bits. and. thus. increasing both gate level and gate count in the functional circuit. Therefore, for a practical design, the circuit should be synthesized such that the total gate count in both functional circuit and checker circuits is optimized under gate level constraint. This leads to the development of an automatic synthesis system which optimizes the gate count for both functional circuit and checker circuit and keeps a reasonably low gate level. 4.6 SUMMARY This chapter has presented an efficient output matrix partitioning algorithm for r = l and its extension for r 2 2. The presented algorithm has been implemented in the design of self-checking circuits and systems for MCNC Benchmark FSMs for reducing both hardware overhead and performance degradation of the required checkers. The significant reduction has been illustrated from the experimental results. The algorithm particularly benefits those circuits with more output functions. It should be mentioned that the problem of partitioning a k-set output matrix into a minimal number of l-set submatrices is similar to the problem of chromatic partitioning of a graph. where lle = k is the chromatic number. Since the chromatic partitioning problem is NP-complete [65]. any efficient heuristic algorithm deve10ped for the chromatic partitioning problem can also be implemented to improve our experimental results. 88 However. since not all rrr/n and Berger code checkers have been efficiently designed for any code length. the size of the partitioned submatrices should be determined by the availability of the checker design. As shown in Figure 4.1. or in Example 1, the output portion of a FSM is used as the output matrix for generating l-set submatrices. A self-checking FSM is generally synthesized in such a way that each state is assigned a code. referred to as codeword, and the outputs corresponding to the non-codewords are assigned to all 0's. Based on this state assignment. any two product terms will be disjoint. In other words. only one product term is enabled at a time when an input is applied. Thus. the output portion of a FSM can be used as the output matrix. CHAPTER 5 SOLiT: A SYSTEM FOR AUTOMATED SYNTHESIS OF ON-LINE TESTABLE SEQUENTIAL CIRCUITS This chapter presents a system, namely. SOLiT, for automated Synthesis of On- Line Testable sequential circuits with multi-level logic implementation. A sew-checking circuit is comprised of a functional circuit and a self-checking checker. The functional circuit. either a combinational or a sequential circuit. is generally designed such that its primary outputs and some encoded outputs are able to produce an erroneous result in the presence of fault(s). The self-checking checker is designed to produce an error signal'for some normal circuit inputs whenever a fault from a specified set of faults occurs within the circuit. As described in the previous chapters. numerous designs of self-checking checkers have been presented. The synthesis of self-testing functional circuits is presented in Section 5.1. Section 5.2 describes the major components in SOLIT and its implementation is discussed in Section 5.3. In order to demonstrate the automated synthesis system, a complete example from design specification to the physical layout is presented in Section 5.4. 89 90 5.1 SYNTHESIS OF SELF -CHECKING FUNCTIONAL CIRCUITS Consider a Mealy1type sequential circuit. as shown in Figure 5.1. (The same concept can be easily extended for Moore-type sequential circuits.) The combinational part is realized with multi-level logic implementation. The state is encoded in the mln code. while the output functions are encoded in either Berger codes or rrr/n codes for detecting unidirectional errors. It has been shown that. with the above encoding. the circuit output and the next state output functions are unate in state variables [53]. Thus. unidirectional errors can be detected. Note that the inputs need not be encoded if they are obtained from a preceding functional circuit. As shown in Figure 5.1, the circuit requires two TSC checkers to monitor the circuits outputs: one checker (mln checker) checks the next state functions. and the other (either mln checker or Berger code checker) checks the next state functions. The outputs of these two checkers are then fed to a two-rail checker to produce the final checker output. X X "i171 AND/OR “—13 Circuits xt+l . i 8 XP Figure 5.1 A Moore-type self-checking sequential circuit. 91 For detecting multiple unidirectional errors. the encoded output functions of the functional circuit must be realized with inverter-free circuits. The combinational part must be inverter-free. i.e.. with AND and OR gates. and the only inverters allowed are at the primary inputs. Based on this realization. the output functions are binate in terms of inputs xi, 1 S i S t. and unate in terms of inputs x-. t+l S j S p. If the primary inputs are connected from previous stages which are encoded and have been detected as fault-free. then the multiple unidirectional errors can be detected. Otherwise. the circuit can be guaranteed only to detect single stuck-at faults instead of multiple unidirectional stuck-at faults. 5.2 THE AUTOMATED SYNTHESIS SYSTEM SOLiT The conventional synthesis procedure for sequential circuits includes the following three steps: ( 1) state assignment (or encoding); (2) logic minimization (two-level or multi- level); and (3) technology mapping. The state assignment process is to encode each state with a binary vector. Once the states are encoded. logic synthesis including logic minimization and technology mapping are used to optimally realize the circuit with either gate arrays or standard cells available in a cell library. SOUT is a system for automated synthesis of on-line testable sequential circuits, with multi-level logic implementation. Figure 5.2 illustrates the flow chart of 8011’ T which is comprised of the following five major components: (1) output function partitioning and encoding; (2) state encoding; (3) logic minimization and technology mapping; (4) checker library; and (5) layout synthesis. The output function partitioning procedure partitions the output functions of a circuit during synthesizing into many smaller sub-functions and the resultant output sub-functions are encoded with either mln or Berger codes. The checker library is comprised of all possible TSC mln code checkers and ST Berger code checkers. The layout synthesis includes the placement and routing of the functional circuits. checkers. and interconnections. 92 Design Specification 1 Behavioral Description in FSM kiss2 Format Description (Standard PLA Form) 1 Output Groups Output Partitioning Analysis |-——> Designer’s Selection Output Encoding State Assignment I I Checker Multi-level Minimization (SIS) L'b Technology Mapping f 1 rary J Selected Checkers Connect the Functional Circuit and Required Checker Spec. | Placement and Routing l t Figure 5.2 Flow chart for SOUT. 93 5.3 IMPLEMENTATION SOUT has been implemented on Sun/4 workstation in the C-language. Since the finite state machines (FSMs) in the MCNC benchmarks are represented in either kiss2 or blif format. the system takes the FSMs described in kiss2 format as input. However, it can be easily modified for any other format. W The output portion of the kiss2 file is partitioned by the algorithms developed in Chapter 4. A procedure, namely. out _part. is developed to partition the set of output functions into many independent subsets. Each independent set requires a checker. Thus, the number of independent sets determines the number of checkers required in a realized circuit. As described in Section 4.4. there exists a trade-off between the. size of the independent sets and the hardware cost. The strategy presented in Section 4.5 has been taken into account in this program development. The experimental results have been shown in Table 4.2 (in Section 4.5). Once the set of output functions are partitioned, each subset is encoded individually. The output functions are encoded in either Berger or rrr/rr codes depending upon whichever provides an optimal solution. (Based on the designer’s choice in either area-optimum or time-optimum.) The procedure, namely. out_encode. receives the input information which includes the partitioned groups in the output functions. selected encoding scheme. and area- or time-optimum, and produces the appropriate encoded output sub-functions for that circuit. Wu: The VLSI design tool. mustang, is generally used for state encoding targeting multi-level logic implementation. However. the program only provides lln codes. but not 94 for mln codes. In order to assign rn/n codes to the next state functions. a program. namely assigner. is deve10ped in which each state can be assigned sequentially or randomly. In this implementation. the states are assigned sequentially. for example, for a 25 code. State 1 is assigned as (00011). State 2 as (00101). State 3 as (00110). and etc. In order to make the next state functions unate. the present states are modified by changing "O" to "-" ("don’t care"). For example, if the present state is (00101). it is modified as (--1-1). The resultant next state functions are referred to as modified encoded state functions. Note that. as shown in Figure 5.1. the combinational part of the circuit must be realized with AND and OR gates The functional circuit is realized with multi-level logic. In order to keep the output functions and the next state functions unate. the functions are optimized by a multi-level log mizer. sis. with algebraic decompositions. Then. the technology mapper in sis is used to map the resultant circuit network to AND and OR gates in a cell library which is generated from the cell library in sis. W In order for checkers to check their own faults. they must also be inverter—free circuits. As discussed in Chapter 2. numerous Berger and rrr/n code checkers have been developed. but not for all possible code lengths. The checker library is a collection of modules which implement the existing checker designs with AND and OR gates. The checker library provides the information of the size and delays for each checker. For example. a gate-level 2/7 code checker is shown in Example 2.2. and the cell-level is illustrated in Figure 5.3. The modules are designed such that it has the same height as the AND and OR cells. 95 Figure 5.3 Cell-level implementation of 2/7 code checker. I IS ll '-21 I II! I' Once the net-list of the self-checking circuit is available. the VLSI tool implementing Tirnberwolf placement algorithm, twp [66]. is employed for placing the components in the functional circuits and the checkers. Note that the placement of the functional circuits and the checker circuits are performed simultaneously, where the checker circuits are pre-designed as macro modules. The program generates a cif file for routing. where a software program. namely, aisce [67], a modified version of the layout editor, magic, is employed. 5.4 SYNTHESIS EXAMPLE In order to demonstrate the developed synthesis procedure in SOLiT. the MCNC benchmark FSM. mark1.kiss2, as shown in Figure 5.4. is generated. First. based on the 1 5 o 16 p 22 .s 15 0---- 1---- 1---- 1---- 1-111 1-110 1-10- 1-011 1-010 1-001 1-000 1---- 1---- 1---- 1---- 1-.....- 1---- 10--- 11--- 1---- 1---- 1---- * statel statez state3 state4 state4 state4 state4 state4 state4 state4 states state6 state? state8 state9 statelo statell statell state12 state13 state14 statel state3 stateO state4 state13 statelo state9 state8 state? state6 states state14 state14 state14 state14 state14 statell state13 state12 state13 state14 state3 —11---1-00 ------ -11---1-00 ------ -11---1-00 ------ 101---1-01 ------ -11---1-00 ------ -11---1-00 ------ -11---1-00 ------ -11---1-00 ------ -11---1-00 ------ -11---1-00 ------ -11---1-00 ------ 0011--1—00 ------ 00100-0-00000011 001---1100 ------ 010---1-00 ------ 001---1010000101 —11---1-00100000 -11---1-00 ------ -11---1-00 ------ -110110-00 ------ -11---1-00 ------ -110110-00 ------ Figure 5.4 A FSM example - mark1.kiss2. 97 output portion of mark] .kissz, the output partitioning procedure. out _part produces three partitioned independent sets. as shown in Figure 5.5. where the algorithm first generates five independent sets which contain 9. 3. 2. l. and 1 column(s). respectively. The procedure suggests three groupings as shown. In addition, the procedure also generates an output file. as shown in Figure 5.6, which is used as an input for procedure out_encode for the output function encoding. The first line in Figure 5.6 indicates the number of groupings and area- optimum option. where the options "a" and "t" indicate area-optimum and time-optimum. respectively. Each group is described by two lines with the information of the encoding scheme ("m" for m/n code and "b" for Berger code), 111 and n for mln code or I and K for Berger code. the number of columns. and the column indices. As the second line indicates in Figure 5.6. the first group is encoded in m/n code. where m = 1 and n = 10. and contains 9 columns which are l. 4, 5, 8. 9. ll. 12. 13, and 15. The second group is encoded in 1/4 code which contains three columns. 6, 10. and 14; Finally. the third group is encoded in 1/5 code and has four columns. 2. 3, 7. and 16. This concludes that the output sub-functions are encoded in l/ 10. 1/4. and 3/5 codes. respectively and require three extra columns for the check part. In this procedure. the designer is allowed to modify the selected code scheme and grouping scheme. The input file (in Figure 5.6) is applied to Procedure out_encode to encode the output function. as the output portion shown in Figure 5.7. Once the output functions are encoded. the program. assigner. is executed for encoding states. As shown in Figure 5.7 . the states are encoded in 3/6 codes. where the "0"s in the present state are changed to "-."s The encoded states and output functions are then optimized by the multi-level optimizer. sis. and the resultant network is shown in Figure 5.8, where only AND and OR gates are realized in the combinational part and the inverters appear only in the primary inputs. Thus. the output functions are unate in terms of the inputs v1. v5. V6, and v9. and binate in terms of the remaining primary inputs. According to Figure 5.9. the inverters are also needed for the functions v11.1 and 11.3 which are the output of the state variables. Since a D flip-flop provides Q and O outputs. and V] 1.1 and V] 1.3 are the outputs of flip-flops. the inverting functions can be absorbed by using the O outputs. 98 The product term under consideration Row# # of 1’s 4 1010001001000000 12 0011001000000000 13 0010000000000011 14 0010001100000000 16 0010001010000101 17 0110001000100000 20 0110110000000000 bufiWWWWtfi Occurence Table Occ. Column numbers 1 1 4 5 6 8 9 10 11 14 15 2 2 16 5 7 7 3 Selected Indepentent Sets: 1 4 5 8 9 11 12 13 15 6 10 14 2 16 7 3 Suggested Groupings 1 4 5 8 9 11 12 13 15 6 10 14 2 3 7 16 Figure 5.5 Output of Procedure out_part. 3 a m 1 10 9 1 4 5 8 9 11 12 13 15 m 1 4 3 6 10 14 m 1 S 4 2 3 7 16 Figure 5.6 Input frle to Procedure out_encode. .i 5 .o 19 .p 22 .s 15 0---- 1---- 1---- 1---- 1-111 1-110 1-10- 1-011 1-010 1-001 1-000 1---- 1---- 1---- 1---- 1---- 1---- 10--- 11--- 1---- 1---- 1---- * statel stateZ state3 state4 state4 state4 state4 state4 state4 state4 states state6 state? state8 state9 statelo statell statell state12 state13 state14 statel state3 stateO state4 state13 statelo state9 state8 state? state6 states state14 state14 state14 state14 state14 statell state13 state12 state13 state14 state3 0110001000000000110 0110001000000000110 0110001000000000110 1010001001000000001 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0011001000000000011 0010000000000011011 0010001100000000011 0100001000000000111 0010001010000101000 0110001000100000010 0110001000000000110 0110001000000000110 0110110000000000001 0110001000000000110 0110110000000000001 Figure 5.7 Encoded output functions. MD .p 22 statel state3 state2 stateO state4 state13 statelo state9 state8 state? state6 states state14 statell state12 *******=fl=*=fl=###*# ---111 --1-11 --11-1 --111- -1--11 -1-1-1 -1-11- -11--1 -11-1- -111-- 1---11 1--1-1 1--11- 1-1--1 1-1-1- 000111 001011 001101 001110 010011 010101 010110 011001 011010 011100 100011 100101 100110 101001 101010 .i 11 .o 25 000111 001011 001110 010011 010101 010110 011001 011010 011100 100011 100101 100110 100110 100110 100110 100110 101001 010101 101010 010101 100110 001011 0110001000000000110 0110001000000000110 0110001000000000110 1010001001000000001 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0110001000000000110 0011001000000000011 0010000000000011011 0010001100000000011 0100001000000000111 0010001010000101000 0110001000100000010 0110001000000000110 0110001000000000110 0110110000000000001 0110001000000000110 0110110000000000001 Figure 5.8 Resultant state encoding with assigner. 101 INORDER = v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10; OUTORDER = v11.0 v11.1 v11.2 v11.3 v11.4 v11.S v11.6 v11.? v11.8 v11.9 v11.10 v11.11 v11.12 v11.13 v11.14 v11.15 v11.16 v11.17 v11.18 v11.19 v11.20 v11.21 v11.22 v11.23 v11.24; v11.0 = v0*v11.22*[151] + v0*!v8*v11.22 + v0*!v10*!v11.1 + [163] + [162]; v11.1 = [149]*[152]*{15S] + lv4*[149]*[155]; v11.2 = v4*v11.1*!v11 3 + v0*!v5*v10 + v9*[162] + v0*[151] + v2*v11.1 + [160]; v11.3 = lv2*!v3*v4*v11.1 + v8*!v11.1*[155] + !v7*v10*[150] + v3*!v4*v11.1 + [157] + [10]; v11.4 = v2*v4*v8*[155] + lv4?v11.1*[152] + lvl*lv8*[150] + v9*[151] + v8*[10] + [157] + v11.16; v11.5 = v8*v9*[154] + 1v2*v11.1*1v11.3 + !v3*v11.1*v11.3 + v1*!v8*v11.3 + v?* [10] + [160] + [148]; v11.6 = [15]; v11.7 = [158] + [10]; v11.8 = [160] + [157] + [150] + v11.16 + [10]; v11.9 = v9*v10*[163]; [10] = v0*v5*[151] + v10*[162]; v11.10 = [10]; v11.11 = [10]; v11.12 = [158] + [153] + [15] + [14]; v11.13 = v9*[164]; [14] = [1551*[163]; v11.14 = [14]; [15] = [1491*[151]; v11.15 - [15]; v11.16 = v7*[164]; v11.17 = ; v11.18 = 0; v11.19 = [14]; v11.20 = v10*[164]; v11.21 = v11.20 + [14]; v11.22 = v6*[151] + [150] + !v0; v11.23 = [158] + [153] + v11.20; v11.24 = 0; [129] = v8*[154] + v5*v10 + v7*v8 + [151]; [148] = v6*v7*[129] + lvO; [149] = v0*v8; [150] = v9*[129]; [151] = v?*v10; [152] = lv3 + !v2; [153] = v11.13 + v11.9; [154] = v10 + v5; [155] = v?*v9; [15?] = [15] + lvO; [158] = [150] + [148]; [160] = [153] + v11.21; [162] = v5*[149]; [163] = v0*v6; [164] = v6*[l49]; Figure 5.9 Resultant optimized network. 102 The sis technology mapper is employed to map the resultant optimized network (in Figure 5.9) to the AND and OR gates of the modified cell library. Figure 5.10 shows the partial netlist generated from the sis technology mapper. Finally. the placement program, 09p. and the routing program. aisce. are executed to generate the physical layout for the functional circuits and the checker circuit which are shown in Figures 5.11(a) and 5.11(b). respectively. The functional circuit has a core dimension of 616x1455 (=896280)2.2. while the checker circuit takes the area of 368x1550(=570400) 2.2. Figure 5.12 illustrates the physical layout of the complete example. The complete circuit is comprised of the encoded functional circuit and the checker circuits. In this example, the II 10, 1/4. and 3/5 code checkers are used for the output functions. The states are encoded in 3/6 codes and thus need a 3/6 code checker, referred to as the state checker. The outputs of the state checker and the output function checker are fed to three 2/4 checkers as a final checker. The total dimension is 896280+570400(=1466680)A.2. For the purpose of comparison, Figure 5.13 illustrates the layout of the same FSM which is generated by using the conventional synthesis procedure. It takes the area of 551x1350(=743850)}t.2 which is smaller than the area of the layout in Figure 5. 12 with the ore-line testable capability. Ifthe circuit in Figure 5.13 is to achieve the same degree of testability and reliability. one may apply the duplication with comparison approach described in Chapter 1. This implies that the circuit requires a area of 2x551n1350(=1487700)1.2 excluding the area of the self-checking comparator. In general. the self-checking comparator takes a considerably large amount of chip area. Thus, the total area of the circuit presented in Figure 5.12 is much smaller than the circuit in Figure 5.13, the duplication with comparison scheme. 103 .model markl .inputs v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 .outputs v11.0 v11.1 v11.2 v11.3 v11.4 v11.5 v11.6 v11.7 v11.8 v11.9 v11.10 v11.11 v11.12 v11.13 v11.14 v11.15 v11.16 v11.17 v11.18 v11.19 v11.20 v11.21 v11.22 v11.23 v11.24 .default_input_arriva1 0.00 0.00 .default_output_required 0.00 0.00 .defau1t_input_drive 0.10 0.10 .defau1t_output_load 2.00 .1atch v5 v11.0 1 .latch v6 v11.1 O .latch v7 v11.2 1 .latch v8 v11.3 0 .latch v9 v11.4 0 .latch v10 v11.S 1 .names v5 [655] 0 1 .names v0 [656] O 1 .names v8 [657] 0 1 .names [656] [657] [658] 1- 1 -1 1 .names [655] [658] [659] 1- 1 -1 1 .names [659] [954] 0 1 .names v0 v5 v6 v7 v8 v9.18 0---- 1 -0--- 1 ---1- 1 --1-1 1 .names v0 v5 v6 v7 v8 v9.19 0---- 1 -0--- 1 ---1- 1 -—1-1 1 .end Figure 5.10 Netlist generated by sis technology mapper. (Only partial list is shown). {,3fo I w" 4 wherein. a , . 65 27,52er , my»; (b) checker circuit. Figure 5.11 Physical layouts. 105 9. ‘5“ ~ , . 953‘s"; ~ '. '16. .. were. .1 t x z tweeter 3:15.: um" an _wn a. t .. «Seat. 1111 a 1.541129% an ' r ‘ r- W458. ‘ " * ”are are gearing: ' v“- 4 _ . a w» *2 2‘ ti . ‘4 REE: ‘2' c ‘3 a.“ 22 .4 ext, 595%) 5 48 '3 ”$4 95 ~:~ We‘éii‘ its}; 1*: 1115:). ..- 8811-1“ >133. $4348” “131443;?“ wgifitm-‘mu her” tenure: - : 3' .. 43.4 12132133 81‘ ‘ . gg‘i' - -.< v remaking News.“ . ..a was sewera- . .~5’M.r...~ioc-:v .' Figure 5.12 Layout of an on-line testable sequential circuit. mark1.kiss2. 106 S. meme-«2%» «A m 1 X 6 1 6 ) b , ( and o 9 th (a) the conventional synthesis procedure .kl'ss2 wi (b) encoded functional circuit. 13 Layout of mark] 5 Figure CHAPTER 6 SUMMARY AND CONCLUSION This chapter summarizes the results concluded in this study and identifies the contributions and impacts of this research. Finally. some interesting problems are identified for future research. 6.1 Summary With ever-increasing complexity of digital applications. the issue of reliability has become very important in today's VLSI designs. Concurrent error detection schemes using redundancy design approach have been successfully implemented to enhance chip yield and system reliability. The redundancy design approaches may include hardware redundancy. time redundancy. and information redundancy. The simplest concurrent error detection scheme involves duplication of circuit and comparing the outputs of the two blocks. referred to as duplication with comparison scheme. Any mismatch between them will indicate an error. However. this involves more than 100% hardware. Information redundancy involves the use of coding techniques that enhance circuit capacity for reliable operation. Due to the salient feature of least redundant separable codes. Berger codes have been implemented for fault-tolerant. fail-safe. and concurrent error detection designs of digital circuits. The features of high speed and low hardware cost are 107 108 highly desirable especially for a checker design. In response to these needs, a design methodology using the partitioning and folding scheme is developed for the design of fast. yet low hardware cost Berger code checkers with self-testing capability. Table 3.9 has summarized the significance of the developed checker design comparing it with existing designs. As shown in Table 3.9. the code length of a checker grows linearly with the number of outputs in a functional circuit. while the complexity of the checker may grow rapidly. As a result, a functional circuit may be optimized in its hardware overhead at the cost of requiring a larger checker. As such. the overall hardware cost for the entire self-checking circuit is not reduced through the optimization procedure. In addition. if the speed of the larger checker exceeds the clock cycle time for the functional circuit. then the system cycle time must be increased to capture the error signals. As shown in Table 3.9. the hardware cost of the checkers increases almost exponentially as the code length increases. Thus. the use of many smaller code checkers may take much less hardware cost and delay time than those of a larger checker with the same information bits. Thus, in Chapter 4, an efficient output function partitioning scheme is developed to partition the set of output functions to many smaller subsets so that smaller checkers can be employed. Experimental results listed in Table 4.3 have shown that. with the developed partitioning scheme, the hardware cost can be reduced considerably. With such low hardware cost checkers, on-line testable design becomes very promising and practical. In this research. a system. SOUT, for automated synthesis of on-line testable sequential circuits with multi-level implementation has been developed and illustrated in Figure 5.2. The system is implemented on Sun/4 workstation in the C-language. The system receives a behavioral description of finite state machines in kiss2 format. and automatically generates a physical layout for a self-checking circuit. Results show that the circuits generated by this system take much smaller chip area and delay time than those with the duplication with comparison scheme. 109 6.2 CONTRIBUTIONS AND FUTURE RESEARCH The contributions and impacts of this research can be summarized as follows: (1) developed a fast. yet low hardware cost Berger code checker; (2) developed an efficient output partitioning algorithm for reducing hardware cost and speed degradation of checker circuits; and (3) developed an automated synthesis system for on-line testable sequential circuits with multi-level logic implementation. Based on the results of this study. in addition to hardware cost and delay time. reliability now can also be one of the major design objectives. Several improvements that can be done for upgrading the synthesis system. SOUT, are summarized as follows: (1) Design Specification In SOUT. it receives the input file of a FSM in Itsz format. This is mainly because the existing benchmarks are in such a format. The system is readily adapted for the specification in any other format. (2) Output Partitioning Algorithm In procedure out_part of SOUT, the independent sets of a given output function are generated. The program provides the suggested output groupings. The optimal choice of output groupings should be measured by the overall hardware cost and delay time. Thus, developing an efficient decision algorithm for global optimization can further improve the results. 110 (3) State Encoding As mentioned in Section 5.3. the state variables are encoded sequentially. It did not take the circuit behavior. not structure, into account. The synthesized results would be improved significantly if an efficient state assignment for multi-level logic implementation is developed. BIBLIOGRAPHY 111 BIBLIOGRAPHY [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] B. W. Johnson, Design and Analysis of Fault-Tolerant Digital Systems. Addison- Wesley Publishing Co.. Reading. Mass, 1989. C. L. Wey, “Concurrent Error Detection in Array Dividers by Alternating Input Data.” IEE Proceedings-E. Vol. 139, pp.123-130. March 1992. P. K. Lala. Fault Tolerant & Fault Testable Hardware Design, Prentice-Hall. Englewood Ciffs. N. J.. 1985. T. R. N. Rao and E. Fujiwara. Error-Control Coding for Computer Systems. Prentice- Hall. Englewood Ciffs. N. J., 1989. M. M. Yen. W. K. Fuchs. and I. A. Abraham, “Designing for Concurrent Error Detection in VLSI: Application to Microprogram Control Unit.” IEEE Journal of Solid-State Circuits. Vol. SC-22. pp.595-605, August 1987. C. L. Wey, “On Yield Consideration for the Design of Redundant Programmable Logic Arrays,” IEEE Trans. on Computer-Aided—Design, Vol. 7, pp.528-535. April 1988. Y. Tarnir and C. H. Sequin, “Design and Application of Self-Testing Comparators Implergented with MOS PLA’s,” IEEE Trans. on Computers. Vol. C-33. pp.493~506. June 1 84. J. L. A. Hughes. E. J. McCluskey. and D. J. Lu. “Design of Totally Self-Checking Computers with an Arbitrary Number of Inputs,” IEEE Trans. on Computers. Vol. C- 33. pp.546-550. June 1984. R. W. Cook. W. H. Sisson. T. F. Storey. and W. N. Toy. “Design of a Self-Checking Microprogram Control,” IEEE Trans. on Computers. Vol. C-22. pp.255-262. March 1973. D. K Pradhan and J. I. Stiffier, “Error-Correcting Codes and Self-Checking Circuits.” IEEE Trans. on Computers, Vol. 13. pp.27-37, March 1980. N. K. Jha and M. B. Vora. “A Systematic Code for Detecting t-Unidirectional Errors.” Proc. International Symp. on F ault-Tolerant Computing, pp.96-101, July 1987. S. J. Piestrak. “Design of a Self-Testing Checker for Borden Code.” Free. gtéergcatiorgrgl Conference on Computer Design: VLSI in Computers. pp.582-585, to r1 1. [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [261 [27] 112 M. A. Marouf and A. D. Friedman, “Efficient Design of Self-Checking Checker for Any m-out-of-n Code.” IEEE Trans. on Computers, Vol. 02?. pp.482-490. June 1978. C. Halatsis, N. Gaitanis. and M. Sigala, “Fast and Efficient Totally Self-Checking Checkers for m-out-of-(Zmil) Codes.” IEEE Trans. on Computers, Vol. C-32. pp.507-511. May 1983. D. A. Anderson and G. Metze. “Design of Totally Self-Checking Check Circuits for m-out-of-n Codes.” IEEE Trans. on Computers. Vol. C-22, pp.263-269. March 1973. S. M. Reddy. “A Note on Self-Checking Checkers.” IEEE Trans. on Computers, Vol. C-23, pp.1100-1102. October 1974. T. Nanya and Y. Tohma. “A 3-Level Realization of Totally Self-Checking Checkers for m-out-of-n Codes.” Proc. International Symp. on Fault-Toleth Computing. pp.173-l76. 1983. M. A. Marouf and A. D. Friedman. “Efficient Design of Self-Checking Checker for Any m-out-of-n Code.” IEEE Trans. on Computers, Vol. C-27. pp.482-490, June 1978. S. M. Reddy and J. R. Wilson, “Easily Testable Cellular Realizations for the (exactly p)~out-of-n and (p or more)-out-of-n Logic Functions.” IEEE Trans. on Computers, Vol. C-23. pp.98-100. January 1974. S. J. Piestrak. “Design of Self-Testing Checkers for m-out-of-n Codes,” Proc. International Symp. on Fault-Tolerant Computing, pp.l73-176, June 1983. J. Khakbaz. "Totally Self-Checking Checker for l-out-of-n Code Using TWO-Rail Codes,” IEEE Trans. on Computers. Vol. C-31. pp.677-68l, July 1982. G. Laskaris. T. Harriotakis, A. Paschalis, and D. Nikolos. “New Design Method for Low-Cost TSC Checkers for l-out-of-n and (n-l)-out—of-n Codes in MOS Implementation,” International Journal of Electonics. Vol. 69, pp.805—817. 1990. M. Kotocova. “Design of Totally Self-Testing Check Circuits for Some lln Codes.” Proc. International Conference on Fault-Toleth System Diagnostics. pp.241-245, September 1981. G. Laskaris. T. Haniotakis. A. Paschalis, and D. Nikolos. “Efficient Design of TSC Checker for l-out-of-n Codes in MOS Transistor Implementation.” Proc. International Conference on F ault-Tolerant System Diagnostics, pp. 805-8 17. September 1989. S. J. Piestrak. “Design of Self-Testing Checkers for l-out-of-n Codes.” Proc. International Conference on Fault-Toleth System Diagnostics, pp.57-63. September 1983. H. Dong. “Modified Berger Codes for Detection of Unidirectional Errors,” Proc. International Symp. on Fault-Tolerant Computing, pp.3l7-320, June 1982. N. K. Jha, “Separable Codes for Detecting Unidirectional Errors.” IEEE Trans. on Computer-Aided-Design. Vol. 8. pp.571-574. May 1989. [28] [29] [30] [31] [32] [33] [341 [35] [36] [371 [33] [39] [40] [41] [42] [43] 113 M. J. Ashjaee and S. M. Reddy. “On Totally Self-Checking Checkers for Separable Codes.” IEEE Trans. on Computers, Vol. C-26. pp.737-744. August 1977. M. A. Marouf and A. D. Friedman. “Design of Self-Checking Checkers for Berger Codes.” Proc. International Symp. on Fault-Tolerant Computing. pp.179-184, June 1978. S- I. Piestrak. “Design of Fast Self-Testing Checkers for a Class of Berger Codes.” IEEE Trans. on Computers. Vol. C-36. pp.629-634. May 1987. J. C. Lo and S. Thanawastien, "The Design of Fast Totally Self-Checking Berger Code Checkers Based on Berger Code Partitioning.” Proc. International Symp. on Fault-Tolerant Computing, pp.226-231, 1988. M. J. Ashjaee and S. M. Reddy. “Totally Self-Checking Checkers for a Class of Separable Codes.” Proc. Allerton Conference Circuits System Theory. pp.238-242. 1974. N. K. Jha, “A Totally Self-Checking Checker for Borden’s Code,” IEEE Trans. on Computer-Aided-Design, Vol. 8, pp.731-736, July 1989. B. Bose and D. J. Lin. “Systematic Unidirectional Error-Detecting Codes.” IEEE Trans. on Computers, Vol. C-34, pp.1026-1032, November 1985. B. Bose. “Burst Unidirectional Error Detecting Codes.” IEEE Trans. on Computers. Vol. C-35. pp.350-353. April 1986. M. Blaum. “On Systematic Burst Unidirectional Error Detecting Codes,” IEEE Trans. on Computers. Vol. C-37, pp.453-457, April 1988. Y. Tohma. “Coding Techniques in Fault-Tolerant. Self-Checking and Fail-Safe Circuits.” in Fault-tolerant ompuring: Theory and Techniques. edited by D. K. Pradhan. Prentice-Hall. 1986. D. K. Prahan,“A New Class of Error Correcting/Detecting Codes for Fault Tolerant Applications,” IEEE Trans. on Computers. Vol. C-29. pp.471-481. June 1980. G. P. Mak. J. A. Abraham, and E. S. Davidson, “The Design of PLAs with Concurrent Error Detection.” Proc. International Symp. on Fault-Tolerant Computing. pp.303- 310. June 1982. S. L. Wang and A. Avizienis. “The Design of Totally Self-Checking Circuits Using Programmable Logic Arrays.” Proc. International Symp. on Fault-Tolerant Computing, pp.173-180. June 1979. J. E. Smith and G. Metze. “The Design of Totally Self-Checking Combinational Cgir7c‘;lits.” Proc. International Symp. on F ault-Tolerant Computing. pp. 130-134. June J. F. Wakerly. “Partially Self-Checking Circuits and Their Use in Performing Logical Operations,” IEEE Trans. on Computers, Vol. C-23. pp.658-666, July 1974. J. E. Smith and G. Metze. “Strongly Fault Secure Logic Networks.” IEEE Trans. on Computers. Vol. C-27. pp.49l-499, June 1978. [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55] [56] [57] [58] [59] 114 M. Diaz. “Design of Totally Self-Checking Asynchronous Sequential Machines.” Proc. International Symp. on Fault-Tolerant Computing. pp.3-9 to 3-24, June 1974. F. Ozguner. “Design of Totally Self-Checking Asynchronous Sequential Machines.” Proc. International Symp. on Fault-Tolerant Computing. pp.124—l29. June 1977. J. Viaud and R. David, “Sequential Self-Checking Circuits.” Proc. International Symp. on Fault-Tolerant Computing, pp.263-268, June 1980. S. Devadas. H. K. T. Ma. A. R. Newton and. A. L. Sangiovanni-Vincentelli, “Synthesis and Optimization Procedure for Fully and Easily Testable Sequential Machines,” Proc. International Test Conference. pp.621-630. September. 1988. H. K. T. Ma, S. Devada, A. R. Newton. and A. L. Sangiovanni-Vincentelli, “Test Generation for Sequential Finite State Machines.” International Conference on Computer-Aided-Design. pp.288-291, November 1987. J. E. Smith and P. Lam. “A Theory of Totally Self-Checking System Design.” IEEE Trans. on Computers, Vol. C-32. pp.831-843. September 1983. M. Diaz. P. Azema. and J. M. Ayache, “Unified Design of Self-Checking and Fail- Safe Combinational Circuits and Sequential Machine.” IEEE Trans. on Computers. Vol. C-28. pp.276-28l. March 1979. Y. Mukai and Y. Tohma. “A Method for the Realization of Fail-Safe Asynchronous Sequential Circuits.” IEEE Trans. on Computers, Vol. C-23. pp.736-739, July 1974. D. K. Pradhan. “Asynchronous State Assignments with Unateness Properties and Fault-Secure Design,” IEEE Trans. on Computers, Vol. C-27. pp.396o404. May 1978. G. Mago, “Monotone Function in Sequential Circuits.” IEEE Trans. on Computers Vol. C-22. pp.928-933. October 1973. N. K. Jha and S. J. Wang, “Design and Synthesis of Self-checking VLSI Circuits and Systems.” 1991 IEEE International Conference on Computer Design: VLSI in Computers & Processors. pp.578-581. October 1991. A L. Hopkins and T. B. Smith, "The Architectural Elements of a Symmetric Fault- Tolerant Multiprocessor.” Proc. International Symp. on F ault-Tolerant Computing, pp.42-46, June 1974. T. Nanya. “Error Secure Propagating Concept and its Application to the Design of Strongly Fault-Secure Processor.” IEEE Trans. on Computers, Vol. C-37. pp.14-24. January 1988. E. J. McCluskey. “Design Techniques for Testable Embedded Error Checkers,” Computer. pp.84-88, July 1990. H. B. Bakoglu, Circuits, Interconnections, and Packing for VLSI, Addison-Wesley Publishing Co.. Reading, Mass. 1990. W. K. Fuchs and J. A. Abraham. “A Unified Approach to Concurrent Error Detection in Highly Structured Logic Arrays.” Proc. International Symp. on Fault-Toleth Computing. pp.4-9, June 1984. [60] [61] [62] [63] [641 [65] [66] [67] 115 W. Baker, “Oct. Tools Distribution.3.0 Vol. 6: Light/Oct/Vemlisp Implementation,” Electronics Research Laboratory. University of California at Berkeley, 1989. W. C. Carter and P. R. Schneider. “Design of Dynamically Checked Computers.” Proc. International Federation for Information Processing, Vol. 2, pp.878-883. August 1968. S. Devadas. H. K. Ma. and A. R. Newton. “MUSTANG: Targeting Multilevel Logic Implementations,” IEEE Trans. on Computer-Aided-Design. Vol. 7, pp.1290-1300. December 1988. R. Spickelrnier, “Oct. Tools Distribution 2.1.” Electronics Reserach Laboratory. University of California at Berkeley, 1988. N. Weste and K. Eshraghian. Principle of CMOS VLSI Design, Addison-Wesley Publishing Co.. Reading, Mass. 1985. G. Chartrand and L. Lesniak. Graphs & Diagraphs. Second Edition. Wadsworth & Brooks/Cole Advanced Books & Software. Pacific Grove. Ca.. 1986. C. Sechen and A. Sangiovanni-Vincentelli. “The Trrnberwolf Placement and Routing Package,” IEEE Journal of Solid-State Circuits. Vol. C-20, pp.510-522. April 1985. H. Li and W. Li. “AISCE: A Layout S thesis System for ASIC Design.” Proc. International Conference on Circuits an Systems. 911419-422, June 1991. "I11111111111110