This week, I’m off to present a paper on Critical Area Analysis and Memory Redundancy. It’s at the 2010 IEEE North Atlantic Test Workshop in Hopewell Junction, NY, just up the road from Fishkill. IBM is in Fishkill. IBM invented CAA in what, the 1960’s? Venturing into IBM country to speak on Critical Area Analysis is kind of like being the court jester. I just hope they don’t say, “Off with his head.”
But seriously, it amazes me how little is known about this topic. There have been other papers on the subject. I’m merely bringing the topic up to date. I did come up with a way of writing out the formula that appears new, though the underlying principle is the same. Memory redundancy is about having repair resources available to fix an embedded RAM, typically an SRAM. Whatever structure you have repair resources for can be thought of as a unit. You have to calculate the unrepaired yield for all units, then adjust for repair resources. You have to add in the probabilities of having all units good, one unit bad, and so on until there are not enough repair resources to make repairs. It can make a dramatic difference in yield if the memories are large enough or defect rates are high enough. For details, see the Conference Proceedings of the 2010 NATW when they become available.
There are two extreme schools of thought on memory redundancy. One says, “Why bother? I can’t fix the logic, so don’t bother on the memories. Just pressure the foundry to reduce defect rates.” The other extreme says, “Redundancy is good. Put redundancy everywhere.” In between, the designer is either taking an educated guess, or just following the memory IP provider’s guidelines. Those guidelines may have been published when the process was new, and may be pessimistic. The only way to know for sure if adding redundancy helps yield significantly, or is just a waste of chip area and tester time, is to do memory redundancy analysis tied to current foundry defect rates. If a design is going to go through a re-spin, either to a half-node, or to add functionality, it may be the ideal time to ask, “Is the current memory redundancy scheme adequate, or is it overkill?” Calibre YieldAnalyzer has this capability. If you analyze the design with redundancy in mind, the redundancy configuration can be adjusted and the yield rapidly recalculated to facilitate a what-if analysis. It’s the best way to determine the optimal redundancy scheme based on actual foundry defect rates.
The downside of overdesign in this area is very real. Let’s say a large SOC is 50% embedded SRAM. If you add 2% to the area of each SRAM for redundancy, you just increased the chip area 1%. That’s 1% fewer die per wafer over the entire life of the design. It better be worth doing. There’s also tester time to consider. A chip tester is a large, expensive piece of hardware. Every millisecond it spends testing chips is accounted for. If you factor the depreciation cost of that hardware over its lifetime, every extra millisecond that tester spends testing embedded memories and applying the repair resources adds to chip cost. Again, that cost is over the entire life of the chip. Designers may have some idea of how much area they are adding, but the impact on good die vs. gross die may be missed without analysis. Designers probably have much less information about how redundancy overdesign impacts tester time and back-end manufacturing cost. I hope the NATW attendees can give me some perspective on that issue.