Signs the problem is very hard:
- Before you have worked on it, an IBM engineer asks you a vague question about it during a research review.
- The solution results in a 10X decrease in test time.
- The solution can be applied to every single product with that circuitry at your company.
- Patents are filed.
- People who can’t make your talk at the conference seek you out to shake your hand.
- The conference paper is awarded best paper.
- Years after that conference paper, an IBM design engineer shakes your hand in thanks.
In February 1994, I began work in a brand new department, Sort Test Technology Development, at Intel Corporation, as an individual contributor. Ken McQuhae, my boss and department head, arranged for me to work on loan to a microprocessor design team led by Joe Schutz. The P54CS would be the product ramping the 864 process (64 nanometer process node). My primary role: implement a design for test (DFT) based test method for static random access memory (SRAM). The final design was incorporated into five on-die cache memories. In addition, the matrix manager assigned me to shepherd the TCACHE through the design validation methodology. After several months I fully comprehended the attractiveness of solving this test problem. The list I started this story with spans eight years; below, I’ll delve into it a bit further.
In a competitive world there exists an art to asking a question so as not to reveal everything you know. CMU’s Electrical and Computer Engineering VLSI CAD Center held a yearly research review. I stood beside a poster describing my research and answered questions from industrial sponsors. My research examined the faulty behaviors of analog circuits, and while it focused on operational amplifiers (op-amps), SRAMs also contain analog circuitry. An IBM engineer asked in an oblique manner: how would I take this point of view of defect behavior to solve the testing of SRAMs for data retention faults?
SRAMs store addresses or data that a computer’s computing circuitry (e.g. the Arithmetic Logic Unit, ALU) may need at some point while executing a program. The SRAM stores the data as 1’s and 0’s in bit cells, and, as long as the device is powered, a cell’s state holds until it is rewritten. Certain defects in these bit cells cause the state to flip over a long period of time. The standard way to test for this faulty behavior follows:
- Write a 1 into all SRAM bit cells
- Pause for a long time, 500 milliseconds
- Read the SRAM bit cells; if the state of any bit has changed to a 0, mark the integrated circuit (IC) as bad
- Write a 0 into all SRAM bit cells
- Pause for 500 milliseconds
- Read the SRAM bit cells; if the state of any bit has changed to a 1, mark the IC as bad
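The six steps above can be sketched in a few lines of Python. This is only a toy simulation, not tester code: the `BitCell` class, its `weak_value` and `retention_ms` fields, and the `retention_test` function are all names I made up for illustration, and time is simulated rather than actually waited out.

```python
from dataclasses import dataclass
from typing import List, Optional

PAUSE_MS = 500  # pause length taken from the steps above


@dataclass
class BitCell:
    """Toy bit cell. `weak_value` and `retention_ms` are hypothetical:
    a defective cell loses `weak_value` after holding it `retention_ms`."""
    state: int = 0
    weak_value: Optional[int] = None  # None means the cell is defect-free
    retention_ms: float = 0.0
    held_ms: float = 0.0              # simulated time since the last write

    def write(self, value: int) -> None:
        self.state = value
        self.held_ms = 0.0

    def wait(self, ms: float) -> None:
        self.held_ms += ms
        if self.weak_value == self.state and self.held_ms >= self.retention_ms:
            self.state ^= 1  # the defect flips the stored bit


def retention_test(cells: List[BitCell], pause_ms: float = PAUSE_MS) -> bool:
    """Return True if the array passes, False if the IC is marked bad."""
    for value in (1, 0):              # test a stored 1 first, then a stored 0
        for cell in cells:
            cell.write(value)
        for cell in cells:
            cell.wait(pause_ms)       # simulated pause
        if any(cell.state != value for cell in cells):
            return False
    return True


good_array = [BitCell() for _ in range(8)]
leaky_array = [BitCell() for _ in range(7)] + [BitCell(weak_value=0, retention_ms=200)]

good_result = retention_test(good_array)
leaky_result = retention_test(leaky_array)
# With too short a pause, the same leaky cell escapes detection.
short_pause_result = retention_test(leaky_array, pause_ms=100)
```

In this toy model the leaky array fails with the 500 ms pause but passes with a 100 ms pause, which is why the standard test cannot simply be shortened.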
You can test all on-die SRAMs in parallel to save time. Still, one second consumes a significant amount of test time for one type of faulty behavior. When you produce millions of microprocessors a month, reducing one second to 100 ms results in significant cost savings. You need less test equipment, and hence less factory space. Now translate that into every product with on-die memories for the next ten years. Millions and millions of dollars can be saved.
A solution of this magnitude needs to be protected. Intel submitted patent applications on the DFT circuitry. Sharing the results with the industry at large is not always done; remember, ’tis a competitive world. With patents filed, Intel had protection, and because Intel was not solely in the SRAM business, competitive advantage was not a key concern. In sharing the results at an external conference, Intel benefited by adding to its established reputation as an innovator in the semiconductor industry. In the paper I described the P54CS experience, highlighting the reduced test time and improved product quality. At the 1996 International Test Conference people sought me out to congratulate me; well, that’s a true compliment. Being awarded best paper, that’s icing on the cake. I’ll elaborate in a later story on this feat.
Circling back to IBM, they faced a challenge with SRAM bit-cell stability failures in the early 1990s. Hence, the question that an IBM engineer asked me. Weak Write Test Mode (WWTM) enhanced the ability to test cell stability by focusing not on the functional behavior that misbehaved but on how defects can be detected (to be described in an upcoming post). In 2001 an IBM SRAM designer, Harry Pilo, sought me out at the International Test Conference to thank me for “saving their butts.” He paid me the highest compliment an engineer can receive by acknowledging that I had solved a sexy hard problem and that I had shared the solution with my peers.
Have a productive day,
Dear Reader, what memory or question does this piece spark in you? Have you worked on a problem that meets any of the criteria in the list above? What other criteria would you suggest for identifying a sexy hard problem? Please share your comments or stories below. You too can write for the Engineers’ Daughter; see Contribute for more information.