DESIGN OF FAULT-TOLERANT PROGRAMMABLE LOGIC ARRAYS FOR YIELD ENHANCEMENT

presented by

Tsin-Yuan Chang

has been accepted towards fulfillment of the requirements for the Ph.D. degree in Electrical Engineering

Michigan State University

MSU is an Affirmative Action/Equal Opportunity Institution
DESIGN OF FAULT-TOLERANT PROGRAMMABLE LOGIC ARRAYS FOR YIELD ENHANCEMENT

By

Tsin-Yuan Chang

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Electrical Engineering

1989

ABSTRACT

The yield (expected percentage of good chips out of a wafer) of integrated circuits (ICs) has always been crucial to the commercial success of their manufacture. IC technology has evolved from LSI through VLSI to ULSI in the past two decades. Multiple layers and scaling techniques make it possible to put more than 10^6 transistors on a single chip. However, as the complexity of digital devices increases and geometry shrinks, the probability of having faulty components also increases, thereby lowering the chip yield.

One solution to the low-yield problem is to improve the manufacturing and testing processes, but this is very costly and quite difficult to implement within a short time. Another practical way is the use of fault-tolerant structures, which has been demonstrated in practice for high-density memory chips. Fault-tolerant memory design reduces the capital required to achieve a desired level of shippable product, and redundancy typically improves yields by a factor of 1.5 to 5.

Programmable Logic Arrays (PLAs) have the advantages of regular structure, design simplicity, and fast turnaround time. PLAs are becoming increasingly popular for implementing Boolean logic functions and control blocks in integrated circuit design. Because complex chips (and in particular microprocessors) can be efficiently implemented using PLAs, a trend towards manufacturing larger programmable chips is expected.
In this dissertation, a fault-tolerant design for large PLAs is proposed. The fault-tolerant design achieves full diagnosability of single and multiple stuck-at faults, bridging faults, and crosspoint faults. During the manufacturing process, faults in the PLA can be detected, located, and repaired with the spare lines. When the PLA is used in the field, the structure remains easily testable. An automatic layout generator, MRPLA, has also been developed and implemented on a Sun 3/160 for generating the physical layout of the proposed fault-tolerant PLA. In addition, some important issues such as die size, speed, and yield enhancement are also addressed in this study. The results of this study show that the yield can be enhanced significantly. A simple yet efficient optimization method is presented to determine the optimal redundancy for PLAs of various sizes.

This study also introduces a PLA structure based on memory cells. The RAM-Based PLA (RBPLA) allows designers to reprogram the PLA as many times as needed. A fault-tolerant RBPLA is also presented that electrically repairs faults both in the manufacturing process and in field use.

ACKNOWLEDGMENTS

The author wishes to express his sincere appreciation to his major advisor, Dr. Chin-Long Wey, for the guidance and encouragement given in the course of this graduate study. He also wishes to thank the dissertation committee members, Dr. Donnie Reinhard, Dr. Michael Shanblatt, and Dr. Byron Drachman, for their valuable suggestions and comments on his dissertation research. The author also gives thanks to Dr. Harriett Rigas, and regrets her passing.

The author is especially grateful for the financial support from Dr. Wey and the National Science Foundation under grant No. MIP-8700880. Without this support, this research effort would not have been possible.
He would like to acknowledge all the faculty and staff members, and the students who gave him help and assistance during his studies in the Electrical Engineering Department at Michigan State University, and many friends, especially from the Church in Lansing, who showed their support and concern.

Finally, he is very grateful to his family for years of concern, encouragement and support.

TABLE OF CONTENTS

List of Tables
List of Figures
Chapter 1 Introduction
    1.1 Problem Statement
    1.2 Objectives
    1.3 Physical Failures in VLSI Circuits
    1.4 Redundancy Architectures
    1.5 Thesis Organization
Chapter 2 Fault-Tolerant Semiconductor Memories
    2.1 On-Chip Redundancy
    2.2 Repair Techniques
    2.3 Fault Analysis
    2.4 Fault-Tolerant RAM Design Examples
        2.4.1 A Fault-Tolerant Dynamic RAM
        2.4.2 A Fault-Tolerant Static RAM
    2.5 Discussion and Summary
Chapter 3 Fault-Tolerant Programmable Logic Arrays
    3.1 Programmable Logic Arrays
        3.1.1 PLA Structure and Notation
        3.1.2 Fault Models
    3.2 Design of the Repairable PLA
        3.2.1 Repair Rules
        3.2.2 Repairable PLA
            3.2.2.1 SISC and Spare Input Bit Lines
            3.2.2.2 SOSC and Spare Output Lines
            3.2.2.3 Spare Product Lines
        3.2.3 Automatic Layout Generator
        3.2.4 Performance
            3.2.4.1 Chip Area
            3.2.4.2 Propagation Delay Time
    3.3 Design of the Diagnosable PLA
        3.3.1 Augmented Circuits
            3.3.1.1 Product Lines' Shift Register (PSR)
            3.3.1.2 Input Lines' Shift Register (ISR)
            3.3.1.3 Extra Power Line Vdd1
        3.3.2 Design Evaluation
    3.4 Summary
Chapter 4 Fault Diagnosis and Repair Process
    4.1 Locate and Repair Faults in Manufacturing Process
        4.1.1 Detect Faults in Augmented Circuits
        4.1.2 Identify and Repair Faults in the AND plane
        4.1.3 Identify and Repair Faults in the OR plane
        4.1.4 Repair Crosspoint Faults
    4.2 Fault Diagnosis and Repair Algorithm
        4.2.1 Example 1
        4.2.2 Example 2
        4.2.3 Discussion
    4.3 Test Chip in Field Use
    4.4 Summary
Chapter 5 Yield Analysis
    5.1 Yield Model
        5.1.1 Correctable Random Effect Yield, YCRD
        5.1.2 Uncorrectable Random Effect Yield, YURD
    5.2 Yield Simulation
    5.3 Optimal Redundancy
    5.4 Summary
Chapter 6 Fault-Tolerant RAM-Based PLAs
    6.1 Basic Structure of an RBPLA
        6.1.1 A DRBPLA Structure
        6.1.2 An SRBPLA Structure
    6.2 A Fault-Tolerant SRBPLA Design
    6.3 Fault Diagnosis and Repair Process
        6.3.1 Fault Models
        6.3.2 Fault Diagnosis and Repair Algorithm
    6.4 Summary
Chapter 7 Conclusions
    7.1 Summary of Major Contributions
    7.2 Directions for Future Research
        7.2.1 Fault-Tolerant Design of Folded PLAs
        7.2.2 Fault-Tolerant Design of VLSI/ULSI/WSI Array Structures
        7.2.3 New Yet Low-Yield Technologies
Appendices
Bibliography

LIST OF TABLES

Table 2.1 Memories Built with Redundancy
Table 2.2 The Comparison Between Laser and Electrical Programming
Table 3.1 Cubic Notation
Table 3.2 Repair Rules
Table 3.3 Area Overhead in RPLAs
Table 3.4 Delay Time Penalty of the RPLAs
Table 3.5 Operations of the PSR
Table 3.6 Operations of the ISR
Table 3.7 Area Overhead of FTPLAs
Table 5.1 Yield Simulation for (50,190,67)-PLA
Table 5.2 The Effective Yields for (50,190,67)-PLA
Table 5.3 Yield Simulation for (100,400,100)-PLA
Table 7.1 Yields of LSI GaAs Circuits

LIST OF FIGURES

Figure 1.1 Learning Curves
Figure 1.2 Implant Mask Defects
Figure 1.3 Significant and Insignificant Defects
Figure 1.4 Reconfiguration Architectures
Figure 2.1 Spare Allocation of Redundant Elements
Figure 2.2 Fault-Tolerant 64K DRAM
Figure 2.3 Standard and Spare Row Decoders
Figure 2.4 Block Diagram of Major Hardware Components of Laser Programming System
Figure 2.5 Block Diagram of the 8K x 8 Bit Static RAM
Figure 2.6 Block Diagram of the Redundancy Control Circuit
Figure 2.7 Laser Diffusion Programmable Devices
Figure 3.1 Programmable Logic Array
Figure 3.2 Crosspoint Faults
Figure 3.3 Stuck-at Faults
Figure 3.4 Schematic Diagram of a Repairable PLA
Figure 3.5 The Repair of S-fault with a Spare Product Line
Figure 3.6 The Repair of G-fault with a Spare Product Line
Figure 3.7 Spare Input Selector Circuit (SISC)
Figure 3.8 The Programming Procedure of SISC and Spare Input Lines
Figure 3.9 The Programming Procedure of SOSC and Spare Output Lines
Figure 3.10 The Programming Procedure of Spare Product Lines
Figure 3.11 A Sample RPLA Template
Figure 3.12 MRPLA
Figure 3.13 Floor Plan of an (sn,sp,sm)-RPLA
Figure 3.14 Floor Plan of the PLA
Figure 3.15 Different Allocation Scheme of Spare Lines
Figure 3.16 A Schematic Diagram of a Fault-Diagnosable PLA
Figure 3.17 A Product Lines' Shift Register (PSR) Cell
Figure 3.18 The Function of Shift Register Cell in Testable PLA Design
Figure 3.19 An Input Lines' Shift Register (ISR) Cell
Figure 3.20 Fault-Diagnosable PLA
Figure 3.21 An Easily Testable PLA Modified from the FDPLA
Figure 4.1 A Simplified Diagram for Fault Diagnosable PLA
Figure 4.2 Identify Input Line Stuck-at Faults and Bridging Faults
Figure 4.3 Identify Faults at Product Lines as well as G- and S- Faults
Figure 4.4 Identify s-a-1 Faults at Output Lines
Figure 4.5 Identify Bridging Faults; Outputs s-a-1 Faults; and A- and D- faults
Figure 4.6 Examples in the Fault-Diagnosable PLA Design
Figure 5.1 Yields for (50,190,67)-PLA with Redundancy (sn,sp,sm)=(3,4,2)
Figure 5.2 Yield Analysis for (50,190,67)-PLA
Figure 6.1 The Structure of a DRBPLA
Figure 6.2 SRBPLA Structure
Figure 6.3 Control Circuit in SRBPLA
Figure 6.4 Fault-Tolerant SRBPLA Scheme
Figure 6.5 SI-Cell
Figure 6.6 P-Cell
Figure 6.7 Control Signals of the Fault-Tolerant SRBPLA
CHAPTER 1

Introduction

As the complexity of digital devices increases and the geometry shrinks, the probability of having faulty components also increases. The yield of integrated circuits (expected percentage of good chips from a wafer) has always been crucial to the commercial success of their manufacture. One solution to the low-yield problem is to improve the manufacturing and testing processes, but this is very costly and quite difficult to implement within a short time [57]. Another practical way is the use of fault-tolerant structures [39], which has been demonstrated in practice for high-density memory chips.

The only integrated circuits so far to have exploited fault-tolerant techniques commercially have been memory chips. This is because memory chips are particularly densely packed and therefore increasingly vulnerable to defects, and also because a regular memory array lends itself to a variety of efficient fault-tolerant designs. Fault-tolerant memory design reduces the capital required to achieve a desired level of shippable product, and redundancy typically improves yields by a factor of 1.5 to 5 [56].

During the past few years, Programmable Logic Arrays (PLAs) have become increasingly common for implementing Boolean logic functions in Very Large Scale Integration (VLSI) chips. The advantages of regular structure, design simplicity, and fast turnaround time have played significant roles in the manufacturing of high-density PLAs. Because complex chips (and in particular microprocessors) can be efficiently implemented using PLAs, a trend towards manufacturing larger programmable chips is expected. The probability of having faulty PLA chips therefore also increases, and PLA chips face the same scenario as memory chips. Low yield, then, is a potential problem in the manufacturing of PLA chips.

In recent years, research has dealt extensively with fault detection and test generation for PLAs [6,9,25,33,54,59].
In particular, redundancy techniques have been successfully applied to PLA testing. While extra logic has been added to PLAs to implement function-independent tests [9,59] so that the complexity of PLA test generation can be reduced, other approaches have implemented the added redundancy with coding techniques, such as m-out-of-n codes and Berger codes [33], to design totally self-checking (TSC) PLAs. Little emphasis, however, has been placed on using redundancy to repair PLAs.

1.1 Problem Statement

New devices traditionally push against the current technological limits and often have a very low yield. This situation is only sensible when continuing advances in processing techniques are likely to ensure a profitable yield level by the time the device is in volume production. When a new-generation fabrication process is being developed, the rate of climbing the learning curve is relatively slow, and the initial yield values are typically quite low (Curve 1 in Figure 1.1) [57]. The slow learning is due to the following facts: (1) the immature process design and the technology development vehicle are not centered with respect to process variation; and (2) significant yield drops can be experienced even in mature processes, because it may take a long time before the causes are diagnosed and the problems are corrected. The time necessary to bring the yield above the economically acceptable yield (Yacc) can be on the order of several months, resulting in a loss of revenue and competitive edge.

The learning curve may be improved by (1) increasing the initial yield; (2) minimizing process and circuit sensitivity to process variation; and (3) characterizing the most likely failure modes to speed up diagnosis. As a result, the yield Yacc can be reached in a shorter period of time, as shown by the desired learning curve (Curve 2 in Figure 1.1) [57]. In other words, the yield can be enhanced significantly if the manufacturing process and testing process are improved.
However, this improvement requires better factory equipment and better knowledge of the design and testing of the chips, which are very costly and quite difficult to obtain. Recently, fault-tolerant techniques have been widely applied to newly developed fabrication processes.

Figure 1.1 Learning Curves [57].

It should be noted that the entire manufacturing process may consist of three major yield steps that affect the total number of functional integrated circuit products that are realized [47]. The major steps are wafer processing yield, probe yield, and final test yield. Wafer processing yield is defined as the percentage of good wafers that survive the manufacturing process. Probe yield is defined as the percentage of good chips out of a wafer. Final test yield is the percentage of devices that pass a final test program, which occurs after the die have been wire bonded to a lead frame and placed inside a package.

Fault-tolerant techniques have been used extensively by semiconductor manufacturers [39]. The use of redundancy for yield enhancement is not new; the first scheme for redundancy implementation on core memories was published in 1964 [44], and the first practical application in redundant memory design was proposed in 1979 [7]. Currently, more than 15 semiconductor memory manufacturers have commercially produced various redundant memory chips [27,39]. As the technique of fault-tolerant memory design matures, the next logical step is to apply it to PLAs.

1.2 Objectives

The motivation for incorporating fault tolerance into PLAs is twofold: yield enhancement in the manufacturing phase and fault tolerance in the field. Both are achieved by restructuring the links so as to isolate the faulty lines.

This work is centered on the study of the design of fault-tolerant PLAs. The issues include design-for-repairability, design-for-diagnosability, and design-for-manufacturability/yield.
Design-for-repairability is the design of a repairable PLA that implements a reconfiguration scheme to replace faulty lines by spare lines. Reconfiguration is defined as an operation for replacing faulty components with spares while maintaining the original interconnection structure.

Before the partially defective PLA chips can be repaired, the types and locations of faults must be precisely identified, so that the repair process can be properly and efficiently performed. The need for locating and identifying faults leads to design-for-diagnosability.

Design-for-manufacturability/yield is aimed at achieving manufacturable, high-yield chips. To enhance the chip yield of PLAs, spare lines and reconfiguration circuitry are built into the chip so that partially defective chips can be repaired. Since spare lines and reconfiguration circuitry are also susceptible to defects, too much redundancy may have a "diminishing return" effect on the chip. Therefore, the amount of redundancy added to the PLA is best kept as low as possible. However, if the amount of redundancy is insufficient, high yield cannot be reached.

Two aspects of fault tolerance can be identified: (1) techniques to tolerate manufacturing defects; and (2) techniques to tolerate failures in the field. In this work, fault-tolerant PLA designs that fulfill the above design aspects are investigated. A fault-tolerant PLA design using laser programming techniques is implemented to tolerate manufacturing faults, while a fault-tolerant RAM-based PLA design is implemented using electrical programming techniques to tolerate both manufacturing defects and failures in the field.

1.3 Physical Failures in VLSI Circuits

VLSI systems have the following two classes of failures [60]: manufacturing failures and long-term failures. Manufacturing failures are caused by defects that depend on the processes and materials, while long-term failures are caused by wear-out in the field.
Long-term failure mechanisms include breaks in lines, shorts between lines, and degradation or breakdown of active devices.

The manufacturing defects can be divided into two groups: those that affect a relatively large (global) area of the wafer and those that affect a relatively small (local) area [48]. Examples of global defects include cracks or scratches in the material, photolithographic mask misalignment, line dislocations, and major fabrication process control errors. These defects usually have global and prominent effects on the circuit behavior and can be detected easily early in the manufacturing phase. Furthermore, for a finely tuned and mature fabrication line, major process control errors, and hence global defects, can be detected easily and minimized. For these reasons, localized spot or point defects are the primary targets for fabrication testing.

Point defects can be classified into three categories: silicon substrate inhomogeneities, local surface contaminations, and photolithography-related point defects. The origin of defects from each of these categories involves distinct, usually complicated, and frequently uncontrollable processes. Complete and accurate physical modeling of point defects inherent in the fabrication process is difficult [48].

Depending on its location, size, and type, a defect may or may not have any effect on the circuit. Only significant defects, that is, those that cause faults, are considered. For example [48], Figure 1.2 shows that a small point defect in the implant window of a depletion-mode MOS transistor may or may not have a significant effect, depending on its location. Figure 1.3 illustrates how a missing element of a polysilicon path may or may not have any significant effect at the circuit level. Point defects will cause extra or missing spots of metal, polysilicon, or diffusion layouts.
Extra spots may cause shorts between two layers (metal, polysilicon, or diffusion), degradation of elements, or extra devices. On the other hand, missing spots may cause a break in a line (metal, polysilicon, or diffusion line), degradation of elements, or missing devices.

Fault models are extracted from significant physical failures, and serve two purposes: test generation and fault coverage evaluation. A good fault model is one that is simple to analyze and yet closely represents the behavior of physical faults.

Figure 1.2 Implant Mask Defects [48]: (a) Non-significant Defect; (b) Significant Effect.

Figure 1.3 Significant and Insignificant Defects [48]: (a) Defect-free Poly Path; (b) Insignificant Missing Poly; (c) Insignificant Missing Poly; (d) Significant Missing Poly.

1.4 Redundancy Architectures

The important criteria for evaluating a reconfiguration scheme are hardware overhead, reconfiguration effectiveness (the probability that an array with a given number of faulty cells is reconfigurable), wiring length after reconfiguration, time required for the reconfiguration procedure, and overall yield and reliability [61].

The redundant designs of VLSI array structures can be classified by the following four reconfiguration schemes [61]: (1) the whole row (and/or column) bypass (WRB/WCB); (2) single-cell bypass (SCB); (3) interstitial scheme; and (4) duplicated cell scheme.

In the first scheme, a single faulty cell causes its whole row or column to be bypassed, as shown in Figure 1.4 (a). The control circuitry in the WRB/WCB scheme is simpler than that of the other schemes. However, the utilization of spare cells is inefficient. Choosing the minimum number of spare rows and/or columns that cover all the faulty cells is an NP-complete problem [30]. This has led to the various heuristic reconfiguration algorithms that have been proposed [5,13,20,58,63].
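The row/column covering problem described above can be illustrated with a small greedy heuristic. The sketch below is only an illustration under stated assumptions: the function name greedy_spare_allocation, the fault representation as (row, column) pairs, and the "most faults first" selection rule are all hypothetical and are not taken from the cited algorithms [5,13,20,58,63]. Because the greedy rule is not exhaustive, it may report failure on arrays that an exact search could still repair.

```python
from collections import Counter


def greedy_spare_allocation(faults, spare_rows, spare_cols):
    """Greedy sketch: cover faulty cells with spare rows and columns.

    faults     -- iterable of (row, col) positions of faulty cells
    spare_rows -- number of spare rows available
    spare_cols -- number of spare columns available

    Returns (rows, cols), the lists of rows and columns to replace,
    or None if this greedy pass runs out of spares. A None result does
    not prove the array is unrepairable; the heuristic is not exact.
    """
    remaining = set(faults)
    rows, cols = [], []
    while remaining:
        # Count how many uncovered faults lie on each row and column.
        row_counts = Counter(r for r, _ in remaining)
        col_counts = Counter(c for _, c in remaining)
        best_row = None
        best_col = None
        if len(rows) < spare_rows:
            best_row = max(row_counts.items(), key=lambda kv: kv[1])
        if len(cols) < spare_cols:
            best_col = max(col_counts.items(), key=lambda kv: kv[1])
        if best_row is None and best_col is None:
            return None  # no spares left, but faults remain
        # Replace whichever line removes the most remaining faults.
        if best_col is None or (best_row is not None
                                and best_row[1] >= best_col[1]):
            rows.append(best_row[0])
            remaining = {(r, c) for r, c in remaining if r != best_row[0]}
        else:
            cols.append(best_col[0])
            remaining = {(r, c) for r, c in remaining if c != best_col[0]}
    return rows, cols


if __name__ == "__main__":
    # Three faults share row 0 and two share column 4.
    faulty = {(0, 1), (0, 3), (0, 5), (2, 4), (7, 4)}
    print(greedy_spare_allocation(faulty, spare_rows=1, spare_cols=1))
```

In this example one spare row and one spare column suffice, because the faults cluster on a single row and a single column. Three faults on a diagonal with the same spares cannot be covered, and the function returns None.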
Figure 1.4 Reconfiguration Architectures [61]: (a) WCB/WRB Scheme; (b) SCB Scheme; (c) Interstitial Scheme; (d) Duplicated Cell Scheme.

To increase the utilization rate of spare cells, the single-cell bypass (SCB) scheme, as shown in Figure 1.4 (b), allows individual faulty cells to be bypassed. However, the utilization rate depends on the complexity of the control circuits, which include switches and interconnecting wires. As a result, long interconnection wires after reconfiguration are possible if higher utilization is attained.

The interstitial scheme, as shown in Figure 1.4 (c), has spare cells uniformly distributed throughout the array, and a faulty cell can only be replaced by one of its neighboring spare cells. Since spare cells are adjacent to regular cells, the length of connecting wires is limited, which results in a low time overhead. However, the drawback is that an array may fail due to the lack of spare cells in one area while there are unused spares in another area.

The last scheme, shown in Figure 1.4 (d), is the duplicated cell scheme, in which each regular cell has its own spare cell. It requires only a simple reconfiguration algorithm and has low time overhead, but the area overhead is large.

Since each cell of the array in both memories and PLAs takes up a very small portion of the entire array, the use of schemes (2)-(4), which require either highly complex control circuits or a high percentage of area overhead, is not practical. In this study, the WRB/WCB scheme is implemented in the design of the fault-tolerant PLA. That is, the faulty lines are repaired by replacing them with spare lines.

1.5 Thesis Organization

This dissertation is organized as follows.

Chapter 2 reviews the design of fault-tolerant semiconductor memories. Two commercial memory chips that implement fault-tolerant design are presented. The laser programming techniques developed in these two examples can be applied to the proposed fault-tolerant PLA design.
In Chapter 3, a fault-tolerant design of PLAs is proposed. The fault-tolerant design achieves full diagnosability of single and multiple stuck-at faults, bridging faults, and crosspoint faults. During the manufacturing process, faults in the PLA can be detected, located, and repaired with the spare lines. When the PLAs are used in the field, the structure remains easily testable. In addition to the fault-tolerant structure, an automatic layout generator, called MRPLA, is presented to generate the physical layout of the proposed fault-tolerant design. Some important issues in a redundant design, such as chip area and propagation delay time, are also addressed.

Chapter 4 describes the fault diagnosis and repair process for the proposed fault-tolerant PLA. Two examples are given to demonstrate that the proposed fault-diagnosable PLA achieves full diagnosability. In addition, a simple test process is presented for detecting faults after the chip is packaged and used in the field.

Chapter 5 analyzes the effects of adding redundancy to the design of fault-tolerant PLAs. A yield model for this design is presented and simulated. Based on the yield model, a simple yet efficient optimization method is proposed to determine the optimal redundancy for PLAs of various sizes.

Chapter 6 illustrates a RAM-based PLA (RBPLA) structure that allows designers to change the design as many times as needed. In addition, a fault-tolerant design of the RBPLA is presented. Faults that occur during either the manufacturing process or field use can be detected, located, and repaired.

Finally, the last chapter summarizes the work of this dissertation research and presents suggestions for related future research.

CHAPTER 2

Fault-Tolerant Semiconductor Memories

Semiconductor memory has made tremendous contributions to the revolutionary growth of digital electronics.
The cost and space effectiveness of MOS DRAMs (Dynamic Random Access Memories) has permitted their use in today's computers, for example, more than 100M bytes for mainframes and even 1M bytes for personal computers. MOS SRAM (Static RAM), with low stand-by power, has been used in small, portable, battery-backed systems [4]. Nonvolatile memories such as EPROMs (Erasable Programmable Read-Only Memories) and EEPROMs (Electrically Erasable PROMs) have opened up new areas of application such as field-programmable microcomputers. The varied needs of different system applications constitute the driving force toward improved performance/cost and enhanced functions of semiconductor memories.

Throughout the short history of semiconductor memories, the number of memory cells in a device has quadrupled approximately every four years [4]. The device density has increased from 64K to 256K to 1M, and will soon reach 4M and 16M on the market. As device density has increased, improved design and fabrication methods have been introduced to maintain an adequate yield of good devices per wafer.

On-chip redundancy techniques have been used commercially to eliminate the large number of chip failures due to local defects, and have offered yield improvement in the manufacturing of commercial memory chips [39]. The result is a reduction in the capital required for wafer fabrication to achieve a desired level of shippable product. Instead of the 1% or 2% of good dice per wafer seen in early chip yields, the right combination of spare bits per die can suddenly make half the wafer good [42].

Basically, on-chip redundancy techniques use extra memory cells as spares. Each device is tested at wafer probe, and if non-functional cells are found, the device is repaired by replacing the non-functional cells with the spares. One of the biggest controversies surrounding on-chip redundancy is whether to make the replacements by blowing fuses electrically or by laser techniques.
This controversy is discussed in Section 2.2.

Before the defective memory cells can be repaired, techniques for diagnosing the location of the defective cells and efficient spare allocation strategies are needed. The repair of a device with on-chip redundancy is generally divided into two phases: diagnosis, to detect and locate all faulty cells, and repair, to allocate spares for all faulty cells. In Section 2.3, existing fault analysis and repair algorithms are reviewed.

Finally, two commercial memory products with on-chip redundancy are illustrated in Section 2.4: the fault-tolerant 64K DRAM developed by Bell Laboratories [7], and the 8K x 8 high-performance CMOS SRAM developed by Hitachi Ltd., Japan [38].

2.1 On-Chip Redundancy

The only integrated circuits so far to have exploited on-chip redundancy techniques commercially have been memory chips. At the 64K level, devices are beginning to appear with on-chip redundancy to increase yields and maintain reasonable manufacturing costs. As memory density increases and geometries shrink (via device scaling and circuit innovations), the die size must remain constant for producibility and yield considerations. As such, defect density becomes a much more important factor than with lower density devices, since a single defect can wipe out a major section of memory. Process cleanliness requirements become more stringent as well [27]. To offset this, on-chip redundancy allows the defective memory cells to be replaced by spare cells. Redundancy will become more important, and probably mandatory, at the 256K DRAM level and at even higher levels of device density.

Table 2.1 summarizes the memories that utilize on-chip redundant circuitry [27]. On-chip redundancy has been commercially implemented in 64K and 256K DRAMs, in 16K, 32K, and 64K SRAMs, and in some other devices. The table also shows that the number of spares is relatively small compared with the device size.
On a memory with redundancy, incoming addresses are compared with the locations of faulty bits; when a match is found, spare bits take over. Substitutions can be made for individual bits, small clusters or large blocks, or rows or columns. Spare rows and columns have become the most popular approach because they represent a reasonable trade-off between yield enhancement and the number of required elements and associated circuitry.

On-chip redundancy techniques are not free of penalties: spare elements increase the chip area, and result in performance degradation and productivity loss. The important questions in on-chip redundant circuit design are how much redundancy to employ, how to apply it, and how much it will affect performance, die size, and yield. The number of spare rows or columns is subject to several considerations, since spare cells in any form inflate die size and reduce the number of chips per wafer. Furthermore, each spare element demands extra support circuitry which cannot itself be repaired. Consequently, too much redundancy reduces overall repair efficiency.

The yield improvement factor, the ratio of the yield with redundancy to that without redundancy, can be plotted as a function of the yield without redundancy for different numbers of spare elements. As more spares are added, the improvement factor rises at first and then reaches a point of diminishing returns. Further increases in the number of spare elements start to reduce the yield improvement factor.

Table 2.1 Memories Built with Redundancy [27]

[Table body not legibly recoverable. Column headings: Manufacturer; Type of Memory; part number; Type of Redundancy; Programming Techniques; Note. The entries cover 256K, 288K, and 64K DRAMs, 64K, 32K, and 16K SRAMs, EPROMs, EEPROMs, and bipolar PROMs from Oki Electric, Siemens AG, NTT Musashino, Bell Labs, Hitachi, IBM, Inmos, Intel, Mostek, Toshiba, Seeq, and Motorola. The redundancy listed consists mainly of spare rows and/or spare columns, programmed by laser pulses or high-voltage pulses applied to polysilicon fuses at wafer sort.]

Notes to Table 2.1:
(1) It is said that redundancy increases the yield by a factor of 5 to 30, depending on the maturity of the fab line.
(2) Most magnetic bubble memories also use redundancy to increase manufacturing yields by means of a "boot loop".
(3) See "A 4Mb Full Wafer ROM" by Y. Kitano et al., 1980 IEEE ISSCC Digest of Technical Papers, Feb. 13-15, 1980.
NCA = not commercially available.

2.2 Repair Techniques

As mentioned previously, the number of programming elements required is an important consideration in the final choice of the optimal number of spare elements. On-chip storage of the information that identifies the defective cell locations is a key issue in redundancy. The programming elements used for this purpose fall into two categories [53]: laser programmable and electrically programmable. Table 2.2 compares laser programming with electrically fusible links.

Table 2.2 The Comparison between Laser and Electrical Programming [53]

Feature: Circuit layout
  Laser approach: Links are placed anywhere.
  Electrical fuses: Links must be accessible to external drives via bonding pads or additional on-chip circuitry.

Feature: Performance
  Laser approach: Access times of programmed and nonprogrammed devices are indistinguishable.
  Electrical fuses: Speed is generally adversely affected, particularly if both row and column redundancy are used.

Feature: Reliability
  Laser approach: Since exploded links are covered with the final nitride passivation layer, reliability is extremely high.
  Electrical fuses: High reliability requires guard rings around link regions.

Feature: Area penalty
  Laser approach: Area increase for redundancy is slight; the increase will scale down with finer design rules in future devices.
  Electrical fuses: Area increase is also slight, but may not scale down as easily because of layout and reliability concerns.

Feature: Flexibility
  Laser approach: Performance margins are easily tailored with "quick fixes".
  Electrical fuses: Layout is not adaptable to unforeseen circuit needs.

Feature: Equipment costs
  Laser approach: Software development requirements, and hence costs, are large.
  Electrical fuses: Initial costs are lower due to relaxed software demands.

The major advantages of electrical fuse blowing are that the redundancy can be implemented with minimal initial capital expenditure and that existing test equipment may be used [1]. Electrical fuses also offer the simplicity of using an unmodified wafer sort machine, but at the cost of requiring each fuse to be connected to a bulky driver transistor. This extra transistor costs area and limits the number of fuses that can be used, thereby complicating the circuit design. Electrical fuses could conceivably be blown inside the memory's package, opening up the possibility of field repair.

Laser programming presents the obvious advantage of conserving valuable silicon real estate by eliminating the circuitry associated with blowing electrical fuses, but it requires the addition of a costly laser to the sort equipment, as well as precise alignment. However, lasers allow a wider choice of potential fuse materials.

Trade-offs between ease of design implementation, up-front capital investment, and final product cost determine whether laser or electrical programming is best for a particular memory product [1]. As shown in Table 2.1, most redundant memory designs implement laser programming techniques that perform "cut" and "patch" operations on polysilicon links or fuses. Recently, Sandia National Laboratories [3] has also devised a speedy method of on-chip repair that uses low-power lasers to cut and patch the metal lines.
2.3 Fault Analysis

Before the repair process is performed, a fault analysis algorithm is first called upon to determine whether there are any catastrophic problems on the chip or more defects than the spare elements can cover. If not, the faults are analyzed further, and the sites of the faulty cells are logged into a fault map. According to the fault map, an efficient allocation of spare rows and columns is applied to provide the repair solution.

The problem of spare allocation of redundant elements can be specified as follows. Consider a rectangular array that consists of $M \times N$ cells, as shown in Figure 2.1, where the dots in the array represent the faulty cells, and 2 spare rows ($S_r = 2$) and 3 spare columns ($S_c = 3$) are assigned. A partially defective chip is said to be repairable if the spare elements can completely cover all faulty cells; otherwise, it is unrepairable. Therefore, the objective of the fault analysis algorithm is either to quickly establish unrepairability or to provide repair solutions for the repairable devices. More specifically, if the unrepairability of a device can be determined quickly, then the costly repair process can be terminated early. On the other hand, if spare elements can be utilized efficiently to cover the faults, then more devices can be claimed as good ones.

Figure 2.1 Spare Allocation of Redundant Elements. [An array with columns numbered 1 to 7, dots marking the faulty cells, and $S_r = 2$, $S_c = 3$.]

Optimal spare allocation has been shown to be an NP-complete problem [30]. As a result, several heuristic algorithms have been proposed; they are summarized in [20,58,63]. These heuristic algorithms can be classified into two categories: row/column selection and unrepairability checking.

The following heuristics are used to select rows or columns for repair: broadside [58], repair-most [58], and fault-driven [13]. The broadside approach employs a crude technique that locates each faulty bit and immediately repairs it. No optimization is used.
Spares are allocated in a very inefficient fashion, since the overall distribution of faulty bits is not considered. This can result in a failure to identify a potentially repairable device.

A limited use of optimization techniques can be found in repair-most [58]. In this technique, row and column fault counts are employed to determine spare allocations. Repair-most is implemented as a two-stage algorithm: must-repair and final-repair. Must-repair determines either a row or a column that must be replaced by a fault-free spare to repair the maximum number of faulty bits. This process is repeated until no faulty bits are left uncovered by spares. This corresponds to a maximization criterion in fault selection; a minimization in the allocation of spares can be accomplished by an initial covering of faulty bits. This information is supplied to final-repair to find a balanced time allocation for the desired repair solution, which is accomplished by considering processing time, laser repair time, and spare utilization [58]. Although this approach gives better results than the broadside approach, it retains undesirable features, such as an inability to provide repair solutions for certain devices and no provision for user-defined preferences.

Fault-driven [13] partially avoids the drawbacks of the repair-most approach. In fault-driven, repair solutions are generated according to user-defined preferences. Repair is implemented using a two-stage analysis: forced-repair and sparse-repair. Fault counters are still employed. Forced-repair determines specific rows or columns that must be replaced by redundant copies; sparse-repair determines repair solutions for all remaining faulty bits once forced-repair is complete.

The following heuristics are used to check unrepairability: Diagonal-test [5], Maximum-matching [30], Total-faults [20], Fault-count after Must-repair [58], and Leading-element-test [63].
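Before turning to the individual checks, the following sketch contrasts a greedy, fault-count-driven selection in the spirit of repair-most with an exhaustive repairability check. It is only an illustration under stated assumptions, not the algorithm of [58]: the function names, the tie-breaking rule, and the fault map `example_faults` are hypothetical, and the exhaustive check is practical only for small numbers of faulty rows.

```python
from itertools import combinations

def repair_most(faults, spare_rows, spare_cols):
    """Greedy sketch: repeatedly replace the row or column with the most
    uncovered faults. Returns (rows, cols) used, or None if spares run out."""
    remaining = set(faults)
    used_rows, used_cols = [], []
    while remaining:
        row_counts, col_counts = {}, {}
        for r, c in remaining:
            row_counts[r] = row_counts.get(r, 0) + 1
            col_counts[c] = col_counts.get(c, 0) + 1
        # A line type is a candidate only while spares of that type remain.
        best_row = (max(row_counts, key=row_counts.get)
                    if len(used_rows) < spare_rows else None)
        best_col = (max(col_counts, key=col_counts.get)
                    if len(used_cols) < spare_cols else None)
        if best_row is None and best_col is None:
            return None  # faults remain but no spares are left
        # Assumed tie-break: prefer the row when the counts are equal.
        take_row = best_col is None or (
            best_row is not None
            and row_counts[best_row] >= col_counts[best_col])
        if take_row:
            used_rows.append(best_row)
            remaining = {(r, c) for r, c in remaining if r != best_row}
        else:
            used_cols.append(best_col)
            remaining = {(r, c) for r, c in remaining if c != best_col}
    return used_rows, used_cols

def exact_repairable(faults, spare_rows, spare_cols):
    """Exhaustive check: try every choice of rows to replace, then see
    whether the faults left over fit in the spare columns."""
    faulty_rows = sorted({r for r, _ in faults})
    for k in range(min(spare_rows, len(faulty_rows)) + 1):
        for rows in combinations(faulty_rows, k):
            cols_needed = {c for r, c in faults if r not in rows}
            if len(cols_needed) <= spare_cols:
                return True
    return False

# Hypothetical fault map, given as (row, column) pairs.
example_faults = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 1), (3, 2), (5, 7), (5, 8)]
greedy_result = repair_most(example_faults, spare_rows=1, spare_cols=3)
is_repairable = exact_repairable(example_faults, spare_rows=1, spare_cols=3)
```

With one spare row and three spare columns, the greedy pass spends its only spare row on row 0 (the line with the highest fault count) and then runs out of columns, whereas the exhaustive check finds the cover "columns 0, 1, 2 plus row 5". This is one way a count-driven heuristic can fail to provide a repair solution for a repairable device.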
The diagonal-test approach is a fast test performed on the bits along the major diagonal line of the memory. Since no two faulty bits on a diagonal line can be repaired by the same spare row or column, the memory is unrepairable if the number of faulty bits on a diagonal line is greater than the total number of spare rows and columns. The maximum-matching approach draws on graph theory: if the size of the matching found in the graph is greater than the total number of spare rows and columns, the memory is unrepairable. The total-faults approach exploits the fact that the maximum number of faulty bits that can be repaired by $S_r$ spare rows and $S_c$ spare columns is $M \times S_c + N \times S_r - S_r \times S_c$.

Figure 3.2 Crosspoint Faults: (a) Growth Fault; (b) Shrinkage Fault; (c) Disappearance Fault; and (d) Appearance Fault.

Figure 3.3 Stuck-at Faults: (a) Input Bit Line Stuck-at-0; (b) Product Line Stuck-at-0; (c) Input Bit Line Stuck-at-1; and (d) Product Line Stuck-at-1.

3.2 Design of the Repairable PLAs

Figure 3.4 shows a schematic diagram of a repairable PLA (RPLA), which is designed to repair a faulty PLA while avoiding complex routing. In this design, two spare selector circuits are added internally to control the reconfiguration of the input/output signal lines: the Spare Input Selector Circuit (SISC) and the Spare Output Selector Circuit (SOSC). In addition, several spare lines are added to the design.
To repair a faulty RPLA, a set of repair rules based on the fault models discussed in the previous section must be established. The repair rules are summarized in Table 3.2.

Figure 3.4 Schematic Diagram of a Repairable PLA. [Labels: Inputs; Outputs; Product lines; SISC; SOSC; Normal-OFF Programmable Link; Normal-ON Programmable Link.]

Table 3.2 Repair Rules

Fault type                              Spare line          Faulty line
Stuck-at fault
  Input bit line                        Input bit line      remark 1
  Product line                          Product line        remark 2
  Output line                           Output line         remark 3
Crosspoint fault
  Growth                                Input bit line      remark 1
                                        Product line        remark 2
  Shrinkage                             Input bit line      remark 1
                                        Product line        don't care
  Disappearance                         Product line        don't care
                                        Output line         remark 3
  Appearance                            Product line        remark 2
                                        Output line         remark 3
Bridging fault
  Adjacent, input bit lines             Input bit lines     remark 1
  Adjacent, product lines               Product lines       remark 2
  Adjacent, output lines                Output lines        remark 3
  Crossing, input and product lines     Input bit line      remark 1
                                        Product line        remark 2
  Crossing, product and output lines    Product line        remark 2
                                        Output line         remark 3

Remarks:
1. Faulty bit line is disconnected from SISC, and is connected to GND.
2. Faulty product line is disconnected from the pull-up transistor, and connected to GND.
3. Faulty output line is disconnected from the pull-up transistor and from SOSC, and connected to GND.

3.2.1 Repair Rules

When a stuck-at fault occurs in an input bit line, the line is forced to be either 1 (for an s-a-1 fault) or 0 (for an s-a-0 fault). A spare bit line programmed with the appropriate crosspoints is selected to replace the faulty line, and the faulty line is then disconnected from the SISC circuit shown in Figure 3.4. However, disconnecting the faulty line may leave it at a "floating" logic level. For the sake of safety, the disconnected faulty line must be connected to the Ground line. Similarly, a stuck-at faulty output line replaced by a spare output line is disconnected from the SOSC circuit and connected to the Ground line. In addition, the faulty output line must be disconnected from the pull-up transistor, because a malfunctioning pull-up transistor may cause a short between the power line and the grounded faulty output line.

An s-a-0 faulty product line does not affect the output functions realized by the other product terms of the PLA. However, an s-a-1 faulty product line may significantly interfere with those functions if the faulty line is not repaired. In addition to the use of a spare product line to repair an s-a-1 faulty line, the faulty line must be disconnected from the pull-up transistor and connected to the Ground line for safety.

For the repair of crosspoint faults, G- and S-faults are repaired by either spare input lines or spare product lines. Similarly, D- and A-faults are repaired by either spare output lines or spare product lines. In other words, the spare product lines can repair all four types of crosspoint faults. If the crosspoint faults are repaired by spare input bit lines or spare output lines, the repair process is the same as the procedure for repairing stuck-at faults. On the other hand, if the crosspoint faults are repaired by spare product lines, two cases can be identified: (1) the repair of S- and D-faults; and (2) the repair of G- and A-faults.

An S-fault, with an extra crosspoint in a product line of the AND plane, causes the function realized by the product line to shrink.
For example, as shown in Figure 3.5, an S-fault changes the function from A to AB due to an extra crosspoint at the true bit line of B. A spare product line programmed with the appropriate crosspoints can repair this fault. Since the function realized by the faulty line is included in that of the spare line, the former is redundant and does not affect the overall function. Therefore, the faulty line can be retained in the array. However, in order to remove the possible redundancy and keep fault coverage high, we suggest that the faulty line be disconnected and connected to the Ground line. D-faults are repaired in the same manner.

Figure 3.5 The Repair of S-fault with a Spare Product Line. [Spare product line Ps1 = A; faulty line P1 = AB (extra crosspoint); Output = Ps1 + P1 = A + AB = A.]

Figure 3.6 shows that the function Ps1 = AB, realized by a spare product line programmed with the appropriate crosspoints, is included in P1 = A, which is realized by the product line with a G-fault. Adding the spare product line alone therefore cannot correct the output function. The faulty line must be disconnected, i.e., disconnected from the pull-up transistor and connected to the Ground line. A-faults are repaired in the same way.

Figure 3.6 The Repair of G-fault with a Spare Product Line. [Spare product line Ps1 = AB; faulty line P1 = A (missing crosspoint); Output = Ps1 + P1 = A (incorrect).]

Bridging faults force the bridged lines to have the same logic value. In general, adjacent bridging faults are repaired in the same way as stuck-at faults. For crossing bridging faults, each bridged line is repaired by the same type of spare line, as illustrated in Table 3.2.
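The difference between the two product-line repair cases can be checked with a small truth-table comparison. The sketch below is an illustration only; the helper `equivalent` and the variable names are hypothetical, and the functions are the two-input examples of Figures 3.5 and 3.6, with the OR plane modeled as a logical OR of the product lines that remain connected.

```python
from itertools import product

def equivalent(f, g, n_inputs=2):
    """True if two Boolean functions agree on every input combination."""
    return all(f(*bits) == g(*bits)
               for bits in product([False, True], repeat=n_inputs))

# S-fault (Figure 3.5): the intended term is A, the faulty line realizes AB.
intended_s = lambda a, b: a
faulty_s = lambda a, b: a and b
spare_s = intended_s  # spare product line programmed with the intended term

# G-fault (Figure 3.6): the intended term is AB, the faulty line realizes A.
intended_g = lambda a, b: a and b
faulty_g = lambda a, b: a
spare_g = intended_g

# Output when the faulty line is left connected alongside the spare.
s_fault_repaired_without_disconnect = equivalent(
    lambda a, b: spare_s(a, b) or faulty_s(a, b), intended_s)
g_fault_repaired_without_disconnect = equivalent(
    lambda a, b: spare_g(a, b) or faulty_g(a, b), intended_g)

# Output when the faulty line is grounded, so it contributes nothing.
g_fault_repaired_after_disconnect = equivalent(
    lambda a, b: spare_g(a, b) or False, intended_g)
```

For the S-fault the faulty term AB is absorbed by the spare term A, so leaving it connected is harmless; for the G-fault the faulty term A absorbs the spare term AB, so the output is wrong until the faulty line is disconnected.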
3.2.2 Repairable PLA

A repairable PLA is built by adding spare lines and two control circuits, SISC and SOSC, to a conventional PLA. In addition, two types of programmable links are employed: the Normal-on link and the Normal-off link. As suggested by their names, the Normal-on (Normal-off) link remains in the ON (OFF) state until the link is programmed; it then changes state. The programming techniques for Normal-on and Normal-off links have been discussed in Section 2.4 for the designs of the fault-tolerant 64K DRAMs [7] and the Hi-CMOS 8K x 8 SRAMs [38].

3.2.2.1 SISC and Spare Input Bit Lines

The SISC is added to the input portion of the conventional PLA, between the input decoder and the AND plane. The SISC, shown in Figure 3.7 (a), consists of programmable links and connecting lines with associated circuits. The SISC circuit operates as follows. Prior to the programming of the links, the input signal line connects to the column line through the Normal-on link, as in the regular operation of a PLA. Since the Normal-off link is in the OFF state, there is no connection between the input line and the spare input line. When faults are detected and the faulty lines are located, these faulty lines are disconnected from the inputs by opening the Normal-on links. The inputs are then switched to the spare input lines by programming the Normal-off links to the ON state.

More precisely, the mechanism of the line reconfiguration is described by the switches shown in Figure 3.7 (b), where the S1 switch (equivalent to the Normal-on link) is closed and the S2 switches (equivalent to the Normal-off links) are open during normal operation. Suppose, for example, that the input line b1 is found to be faulty and that the spare input bit line Ps1 is assigned to repair it. This can be accomplished as shown in Figure 3.7 (c), where the S1 switch is opened and the S2 switch in the lower connecting line is closed.
In this case, the path is formed by connecting the spare input line Ps1 instead of the faulty input line b1.

[Figure 3.7: (a) the SISC, with spare bit lines, input bit lines, Normal-on links, Normal-off links, and input signals; (b) and (c) switch-level views showing the spare line Ps1, the input line b1, and the AND plane.]

$$P_m(s_m) = \sum_{i=0}^{s_m} \binom{m+s_m}{i}\, q_m^{\,i}\, (1-q_m)^{m+s_m-i} \tag{5.11}$$

Therefore, substituting the array yield of Equation (5.7) into Equation (5.4), the correctable random effect yield $Y_{CRD}$ is

$$Y_{CRD} = P_n(s_n) \times P_p(s_p) \times P_m(s_m) \times A_{NR}/A_R \tag{5.12}$$

5.1.2 Uncorrectable Random Effect Yield, $Y_{URD}$

The percentage of uncorrectable defect area is one of the key factors in determining the effectiveness of redundancy. The uncorrectable yield is defined as [12]

$$Y_{URD} = \left(1 + \frac{\lambda\,(A_{UNC}/A_{SUS})}{\alpha}\right)^{-\alpha} \tag{5.13}$$

where $A_{UNC}$ and $A_{SUS}$ are the uncorrectable defect-susceptible area and the total defect-susceptible area, respectively. In general, the random defect yield is very sensitive to the percentage of the uncorrectable defect area. A low percentage of uncorrectable defect area allows one to go to higher levels of integration before the yield term falls off significantly.

5.2 Yield Simulation

Although the proposed fault-tolerant PLA has not yet been fabricated, and experimental processing data are not available to determine the failure rates precisely, we may employ the experimental data studied in [48]. That experiment predicts all faults that are likely to occur in a MOS integrated circuit or subcircuit. The study shows that, of the original 4800 defects, only 476 actually produce significant faulty behavior at the circuit level. They can be classified as follows: 72 crosspoint faults, 388 stuck-at or bridging faults, and 16 power line faults. Since the power line faults do not affect the calculation of array yield, they are excluded. According to our repair rules, most crosspoint faults are efficiently repaired by spare product lines.
Thus, it is reasonable to assume that the number of crosspoint faults repaired by spare product lines is twice the number repaired by spare input lines, and twice the number repaired by spare output lines (that is, 18, 36, and 18 of the 72 crosspoint faults for input, product, and output lines, respectively). We also assume that the stuck-at and bridging faults are uniformly distributed over the three types of line. Based on these assumptions, of the 460 faults, 147 are attributed to bit lines, 166 to product lines, and another 147 to output lines. In other words, bit lines and output lines have the same failure rate because their structures are virtually the same, i.e., $q_n = q_m$, but the failure rate $q_p$ is nearly 12% higher than $q_n$, i.e., $q_p = 1.12\,q_n$. Let $q_n = q_m = q$, and thus $q_p = 1.12\,q$.

The failure rate $q$ is generally obtained from statistics gathered in the fabrication and manufacturing processes. In this study, however, the failure rate is roughly estimated from the following calculation. The basic concept is that a non-redundant PLA design is conceptually identical to the redundant PLA design with the spares $(s_n,s_p,s_m) = (0,0,0)$. Therefore, the probability of having no failures should be the same in both designs. The former probability can be obtained from Equation (5.3) with $x = 0$, i.e.,

$$P_{NR} = P(x=0) = \left(1 + \lambda/\alpha\right)^{-\alpha} \tag{5.14}$$

and the latter probability is the product of $Y_{URD}$ in Equation (5.13) and $Y_{CRD}$ in Equation (5.12) with $(s_n,s_p,s_m) = (0,0,0)$, i.e.,

$$P_R = Y_{CRD}\big|_{(s_n,s_p,s_m)=(0,0,0)} \times Y_{URD} = \left[P_n(0)\,P_p(0)\,P_m(0) \times A_{NR}/A_R\right] \times \left[1 + (\lambda/\alpha)(A_{UNC}/A_{SUS})\right]^{-\alpha} \tag{5.15}$$

For a redundant design with no redundancy, $A_{NR} = A_R$ and, by Equations (5.9)-(5.11), $P_n(0) = (1-q)^{2n}$, $P_m(0) = (1-q)^{m}$, and $P_p(0) = (1-1.12q)^{p}$. Equation (5.15) can then be written as

$$P_R = \left[(1-q)^{2n+m}(1-1.12q)^{p}\right] \times \left[1 + (\lambda/\alpha)(A_{UNC}/A_{SUS})\right]^{-\alpha} \tag{5.16}$$

By equating $P_{NR}$ in Equation (5.14) and $P_R$ in Equation (5.16), we obtain

$$\left(1 + \lambda/\alpha\right)^{-\alpha} = \left[1 + (\lambda/\alpha)(A_{UNC}/A_{SUS})\right]^{-\alpha} \left[(1-q)^{2n+m}(1-1.12q)^{p}\right] \tag{5.17}$$

If the parameters $\alpha$ and $\lambda$ and the area ratio $A_{UNC}/A_{SUS}$ are given, then Equation (5.17) can be solved for $q$.
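Equation (5.17) has no closed-form solution for q, but it is easy to solve numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, bisection is simply one convenient root finder (the text does not say which method was used), and the parameter values are those of the (50,190,67)-PLA example that follows (area ratio 0.2038, clustering parameter 2, and an average of 4 faults).

```python
import math

def log_gap(q, n, p, m, lam, alpha, area_ratio):
    """log(P_R) - log(P_NR) from Equations (5.16) and (5.14).
    The solution of Equation (5.17) is the q at which this is zero."""
    log_p_nr = -alpha * math.log(1 + lam / alpha)
    log_p_r = ((2 * n + m) * math.log(1 - q)
               + p * math.log(1 - 1.12 * q)
               - alpha * math.log(1 + (lam / alpha) * area_ratio))
    return log_p_r - log_p_nr

def solve_failure_rate(n, p, m, lam, alpha, area_ratio, tol=1e-12):
    """Bisection on q. The gap is non-negative at q = 0 (area_ratio <= 1)
    and decreases monotonically as q grows, so there is a single root."""
    low, high = 0.0, 0.5
    while high - low > tol:
        mid = (low + high) / 2
        if log_gap(mid, n, p, m, lam, alpha, area_ratio) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2

# Parameters of the (50,190,67)-PLA example.
q = solve_failure_rate(n=50, p=190, m=67, lam=4, alpha=2, area_ratio=0.2038)
q_p = 1.12 * q
non_redundant_yield = (1 + 4 / 2) ** -2  # Equation (5.14)
```

The computed q is close to 0.00398 and the non-redundant yield is 1/9, about 11.1%, in line with the values quoted for this example.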
For example, consider a (50,190,67)-PLA. According to the floor plan shown in Figure 3.13, the area ratio is $A_{UNC}/A_{SUS} = 0.2038$. Let $\alpha = 2$ [62] and consider the case of $\lambda = 4$; Equation (5.17) then yields $q = 0.00398$, i.e., the failure rates are $q_n = q_m = 0.00398$ and $q_p = 0.00446$.

The failure rates depend on the average number of faults. For various average numbers of faults, Figure 5.1 illustrates the correctable random effect yield $Y_{CRD}$, the uncorrectable random effect yield $Y_{URD}$, and the effective yield $Y_{eff}$ for the (50,190,67)-PLA with $(s_n,s_p,s_m) = (3,4,2)$. For the case of $\lambda = 4$, $Y_{CRD} = 88.79\%$, $Y_{URD} = 54.13\%$, and $Y_{eff} = 48.06\%$. This shows that the chip yield for the redundant design is much higher than the 11.1% yield for the nonredundant design.

Figure 5.1 Yields for (50,190,67)-PLA with Redundancy (sn,sp,sm)=(3,4,2). [Plot of yield (%) versus the average number of faults $\lambda$ for $\alpha = 2$, with curves for $Y_{CRD}$, $Y_{URD}$, $Y_{eff}$, and the non-redundant yield.]

Figure 5.2 illustrates the effects of adding redundancy. Figure 5.2 (a) plots the correctable yield $Y_{CRD}$ versus the number of spare product lines, where $s_n = 3$, $s_m = 2$, and $s_p$ is varied from 0 to 10. The plot shows that, as the number of spare product lines increases, the array yield $Y_{array}$ increases, but the area ratio $A_{NR}/A_R$ decreases. As a result, the overall $Y_{CRD}$ increases initially, but then decreases as the number of spare product lines grows further. For example, $Y_{CRD} = 86.10\%$ for $s_p = 2$ and $Y_{CRD} = 89.00\%$ for $s_p = 3$, but $Y_{CRD} = 88.79\%$ for $s_p = 4$.

Figure 5.2 (b) plots various yield simulations versus the number of spare product lines. The plots show that the yield $Y_{eff} = 47.93\%$ for $(s_n,s_p,s_m) = (3,3,2)$ increases to 48.06% for $(s_n,s_p,s_m) = (3,4,2)$, but drops to 47.82% for $(s_n,s_p,s_m) = (3,5,2)$. The results show that additional redundancy may not improve the overall yield.
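As a quick consistency check on the figures quoted above, and assuming the effective yield is the product of the two random effect yields (the form used in the optimization objective of Section 5.3), the (3,4,2) design at an average of 4 faults gives

$$Y_{eff} = Y_{URD} \times Y_{CRD} = 0.5413 \times 0.8879 \approx 0.4806,$$

which agrees with the quoted 48.06%, while Equation (5.14) with $\lambda = 4$ and $\alpha = 2$ gives the non-redundant yield

$$P_{NR} = \left(1 + \tfrac{4}{2}\right)^{-2} = \tfrac{1}{9} \approx 0.111 .$$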
Figure 5.2 (c) plots the effective yields for the spare assignments (4,6,4), (3,4,2), and (2,2,1) versus the average number of faults. For $\lambda = 0$, the effective yields for (4,6,4), (3,4,2), and (2,2,1) are 83.74%, 89.23%, and 93.81%, respectively. (It should be noted that the yield does not reach 100% because of the area penalty.) For $\lambda = 1$, the yields for the three spare assignments are 71.08%, 75.11%, and 77.57%, respectively, but for $\lambda = 10$, the yields are 24.36%, 24.33%, and 20.29%, respectively. This implies that less redundancy is better if the average number of faults is small, but that, for larger numbers of faults, more redundancy may produce a higher chip yield.

Figure 5.2 has provided significant evidence that additional redundancy may not improve the yield. This motivates the search for the optimal redundancy in the proposed fault-tolerant PLA design.

Figure 5.2 Yield Analysis for (50,190,67)-PLA with $\alpha = 2$: (a) Correctable Yield ($\lambda = 4$); (b) Effective Yield ($\lambda = 4$); and (c) Yields for Various Spare Line Assignments. [Panels (a) and (b) plot yield (%) versus $s_p$; panel (c) plots effective yield (%) versus $\lambda$ for (2,2,1), (3,4,2), and (4,6,4); each panel also shows the non-redundant yield.]

5.3 Optimal Redundancy

Integrated circuit manufacturers find it highly desirable to be able to predict the yield loss before a chip is fabricated, so that they can maximize the probe yield and thus maximize profits. In this section, an efficient way to determine the optimal redundancy in the proposed fault-tolerant PLA design is presented.
As more redundancy is added to redundant PLAs, both yield and productivity increase; however, the redundant spare lines inflate die size and reduce the number of chips per wafer. Figure 5.2 has shown that, as the number of spare lines increases, the array yield increases, but the area ratio $A_{NR}/A_R$ drops. As the redundancy increases, a point is eventually reached where the optimum yield is obtained. The optimal redundancy problem can be expressed as the following nonlinear integer optimization problem:

Maximize $Y_{eff} = Y_{URD} \times \left[\, P_n(s_n)\,P_p(s_p)\,P_m(s_m) \times A_{NR}/A_R \,\right]$

Subject to 0