At IBM, I became fascinated with finding defects in semiconductor devices. For most of my professional life I have focused on testing for defects in a manufacturing setting. In that setting, you want a simple yes-or-no answer, often referred to as "go/no-go" testing. The test is run: if the device passes, you "go" on to the next test; if it fails, you stop. You simply don't care why it failed, what caused it to fail, or where in the circuit it failed. Other engineers, however, do care about these questions, and answering them requires more data. In addition, given a finite set of test data, one needs the ability to sort through a myriad of possibilities.
The topic of test diagnostics came up during the 12-week introduction to test engineering in which I also learned about Weighted Random Patterns. Jim led that training session; he taught that Stop On First Error (SOFE) did not provide enough data for diagnosing the location of a logic failure in a Very Large Scale Integration (VLSI) device. Ideally, you wanted to run all the logic tests, because knowing which tests passed and which failed helps narrow down the set of possible failures. Each logic test can detect multiple failures, and the sets of failures detected by different tests overlap. Hence, if two tests fail and their detection sets overlap, you have narrowed the suspects; if a test passes, you have eliminated possible suspects. Note: in class we learned the general assumption that only one failure exists; not 100% true, but a handy assumption when sorting through thousands of possibilities.
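The narrowing idea above can be sketched in a few lines of Python. This is only an illustration of the set logic, not how TICDS actually worked: the test names, fault sites, and detection sets here are all hypothetical, and real fault dictionaries are vastly larger.

```python
# Hypothetical fault dictionary: for each logic test, the set of
# candidate fault sites that test is able to detect.
detects = {
    "T1": {"A", "B", "C"},
    "T2": {"B", "C", "D"},
    "T3": {"C", "E"},
    "T4": {"A", "F"},
}

def diagnose(results):
    """Narrow the suspect set under the single-fault assumption.

    results maps test name -> True (pass) / False (fail).
    A failing test narrows suspects to the overlap with its
    detection set; a passing test eliminates every fault it
    would have caught.
    """
    # Start with every known fault site as a suspect.
    suspects = set().union(*detects.values())
    for test, passed in results.items():
        if passed:
            suspects -= detects[test]   # passing test clears its faults
        else:
            suspects &= detects[test]   # failing test keeps the overlap
    return suspects

# Two overlapping fails plus one pass shrink six candidates to two.
print(diagnose({"T1": False, "T2": False, "T4": True}))
```

This also shows why SOFE is a poor basis for diagnosis: with only the first failing test logged, the suspect set stays as large as that one test's detection set, and none of the eliminating power of the later passes and fails is available.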
However, the realities of a manufacturing test setting often do not permit logging of all failure data. Compromises are made, such as stopping after ten fails. With SOFE only, there remained the option of taking the failing device to non-manufacturing ATE (automatic test equipment) and capturing all failures there, but the logistics of sorting through the failed devices made this an unattractive option.
In the mid-1980s a VLSI device typically had 100K transistors; today the count can be a staggering one billion. Our training session included an introduction to IBM's software that helped with the diagnosis: TICDS, the Technology Independent Chip Diagnosis System. I would learn later that John Waicukauski had a role in creating this system. As documented in a previous post about Weighted Random Patterns, I had an opportunity to engage with John on that project. Some months after the training I became involved with running TICDS and worked with others on training; more on that in a future post.
Have a productive day,
Dear Reader, what memory or question does this piece spark in you? Have you worked on diagnosis of an error or debugged a system that has not worked? Please share your comments or stories below. You, too, can write for the Engineers’ Daughter–see Contribute for more information.
Here’s a paper co-authored by John Waicukauski and JJ Curin on this very topic: Multi-Chip Module Test and Diagnostic Methodology