
Sun Verifies Breakthrough Multi-Clock Design Using Unique Mentor Graphics Questa Clock Domain Crossing Verification Solution

The innovative multi-clock architecture of the Sun® UltraSPARC™ T1 processor required a unique clock-domain crossing verification solution that only Questa® could deliver. The close cooperation between Sun’s verification team and the Questa support and R&D staff helped Sun ensure that there were no clock-domain crossing errors at first silicon.

“Questa CDC had a positive impact on the quality of our design. It found a handful of critical bugs that may not have been found in the lab because of their intermittent nature.”

Eugena Talvola, Senior Verification Engineer, Formal Technologies Group, Sun Microsystems

In the new era of network computing, the desire for faster and faster data delivery is peaking while the demand for higher and higher throughput is just beginning to make itself heard.

“Once web transactions hit 250 milliseconds, nobody is that interested in going any faster. But if you can increase the volume they can process, people take notice,” observes Shrenik Mehta, Director of Front-End Technology for the Scalable Systems Group at Sun. “For example, if you tell your customers, ‘We can complete a transaction in 100 milliseconds rather than 250,’ it’s not very interesting. But if I tell them we can handle a million customers in 250 milliseconds, it becomes very interesting.”

This ability to handle a large number of transactions at the same time, anytime, is the driving idea behind throughput computing. Throughput computing improves upon the inefficient performance characteristics of traditional single-threaded processors, which have worsened as the gap between processor and memory speeds has widened.

Sun recognized that these inefficiencies exacerbated the natural by-products of the high-technology age: massive power consumption and prohibitive physical storage costs due to an ever-growing number of servers. The answer was a design that increases throughput, uses less power, and has a smaller silicon footprint.

The UltraSPARC T1 processor is the genesis of Sun’s OpenSPARC initiative. Through this initiative, Sun is releasing the UltraSPARC T1 RTL to the public as OpenSPARC T1.

Sun was able to achieve this low-power, high-throughput chip by connecting its low-frequency cores to an onboard memory subsystem via a high-speed crossbar switch. This innovative approach resulted in different clock speeds within the chip. Multiple clock domains enable each core to run faster than the memory interface logic and the I/O pins, optimizing performance and power consumption.

The growing gap between memory and processor speeds exacerbates the inefficiencies of traditional single-threaded processor performance.

Multiple clock domains required Sun to find a tool that could check for clock-domain crossing (CDC) issues, which may occur when signals pass from one frequency domain to another. At the time, only Questa, now a part of Mentor Graphics®, had a tool with this capability. As part of an overall verification methodology, which included the Questa Formal Verification tool, the Questa CDC solution contributed to the delivery of a high-quality, groundbreaking product within schedule.

A Multithreaded Processor

In order to improve upon the performance characteristics of traditional single-threaded processors, the Sun Microsystems team in Sunnyvale, California developed the UltraSPARC T1, which took advantage of Sun’s patented CoolThreads™ multithreading technology and established a pioneering design strategy.

The UltraSPARC T1 processor combines eight cores, each having four threads, with simple and efficient SPARC® V9 instruction pipelines. A single UltraSPARC T1 processor delivers 32 active threads on a 90 nm die and requires only 70 watts to operate. The eight cores share an onboard memory controller and I/O subsystem through a 134 GB per second crossbar switch, enabling blistering communication between the cores and memory. The eight cores run at 1.2 GHz, the memory interface logic at 300 MHz, and the I/O at 150 MHz.

The UltraSPARC T1 cores also share a four-bank, 3 MB L2 cache, further reducing processor cost and complexity. Four on-chip DDR2 channels deliver a bandwidth of 25.6 GB per second between the processor and memory, providing high throughput with low memory latency and low power.

“We kept the power consumption down by running at a lower frequency, according to the principle that power is directly proportional to frequency,” Mr. Mehta explains. “The question we faced was, could we get adequate performance at 1.2 GHz? The answer was ‘yes,’ because of the way we designed the chip. We achieved a balanced design by using a large crossbar interface to connect the cores with the caches and memory.”

The Only Tool for the Job

Faced with this innovative design, the Sunnyvale team realized they needed innovative verification solutions. They decided they wanted to use formal verification techniques and knew they needed a tool that could make sure there were no problems arising from the use of multiple clocks. These problems could include metastability and other CDC-related issues.

“We looked at alternative solutions, but only Questa CDC could deliver the features we needed at that time,” recalls Mr. Mehta. “It was the only viable solution. This also meant that only Questa could fulfill our strategy to work with one vendor.”

“I don’t think there was anything comparable in the market at that time,” agrees Eugena Talvola, Senior Verification Engineer for the Formal Technologies Group at Sun. “Everybody had standard asynchronous clock domain checking, but we had needs for something above and beyond that—chiefly, ratioed synchronous clocks and lock-up latches.”

The RTL designers attached a very strict methodology and rigorous rules to the clock-domain crossings. These requirements had to be met within a short time frame because the RTL designers handed off the design to the formal verification group fairly late in the product development cycle. Through the close cooperation between the two teams, Questa was able to enhance its CDC tool, and Sun was able to attain verification closure on schedule.

“We only had two or three months to formally verify the crossings,” Ms. Talvola continues. “Adding to the pressure was the fact that CDC verification was a new area for us. We needed certain features that nobody had. Questa implemented checking for ratioed synchronous clocks and lock-up latches for us. They worked with our R&D staff to implement it on fairly short notice, in time to check the chip. The cooperation was amazing. I had Questa R&D sitting with me, giving me documentation, and attending regular meetings. Not many companies will do that.”

Designers typically ensure that signals cross asynchronous clock domains correctly by placing a double flip-flop synchronizer in the design. A CDC verification tool makes sure that the synchronizer is not missing. The Questa CDC tool also checks other requirements, such as that logic was not introduced immediately before a synchronizer, since this may result in a glitch being propagated.
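The two structural checks described above can be illustrated with a small sketch. It is not how Questa CDC works internally; the netlist model, the `Cell` class, and the function names are all hypothetical and exist only for this illustration. The sketch assumes a design reduced to flops and combinational cells with no combinational loops, and it treats a two-flop synchronizer as a receiving flop whose only load is a second flop in the same domain.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

COMB_BEFORE_SYNC = "combinational logic before synchronizer"
MISSING_SYNC = "missing synchronizer"


@dataclass(frozen=True)
class Cell:
    """Hypothetical netlist cell: a flop in a clock domain, or combinational logic."""
    name: str
    kind: str                      # "flop" or "comb"
    clock: Optional[str]           # clock domain for flops, None for comb cells
    inputs: Tuple[str, ...] = ()   # names of the driving cells


def trace_sources(netlist, name, via_comb=False):
    """Yield (source_flop, passed_through_comb) for every flop that feeds `name`."""
    cell = netlist[name]
    if cell.kind == "flop":
        yield cell, via_comb
    else:
        for inp in cell.inputs:
            yield from trace_sources(netlist, inp, True)


def check_cdc(netlist):
    """Return sorted (receiving_flop, reason) pairs for suspicious crossings."""
    fanout = {name: [] for name in netlist}
    for cell in netlist.values():
        for inp in cell.inputs:
            fanout[inp].append(cell)

    violations = set()
    for dest in netlist.values():
        if dest.kind != "flop":
            continue
        crossings = [
            (src, via_comb)
            for inp in dest.inputs
            for src, via_comb in trace_sources(netlist, inp)
            if src.clock != dest.clock
        ]
        if not crossings:
            continue
        # Logic between the source flop and the receiver can glitch.
        if any(via_comb for _, via_comb in crossings):
            violations.add((dest.name, COMB_BEFORE_SYNC))
        # The receiver must feed exactly one second-stage flop in its own domain.
        loads = fanout[dest.name]
        synchronized = (
            len(loads) == 1
            and loads[0].kind == "flop"
            and loads[0].clock == dest.clock
            and loads[0].inputs == (dest.name,)
        )
        if not synchronized:
            violations.add((dest.name, MISSING_SYNC))
    return sorted(violations)


cells = [
    Cell("a_reg", "flop", "clk_a"),
    Cell("b_reg", "flop", "clk_a"),
    Cell("sync1", "flop", "clk_b", ("a_reg",)),       # clean two-flop synchronizer
    Cell("sync2", "flop", "clk_b", ("sync1",)),
    Cell("and1", "comb", None, ("a_reg", "b_reg")),   # logic ahead of a synchronizer
    Cell("bad1", "flop", "clk_b", ("and1",)),
    Cell("bad2", "flop", "clk_b", ("bad1",)),
    Cell("raw", "flop", "clk_b", ("b_reg",)),         # crossing with no second stage
    Cell("use_logic", "comb", None, ("raw", "sync2")),
]
NETLIST = {c.name: c for c in cells}

for flop, reason in check_cdc(NETLIST):
    print(f"{flop}: {reason}")
```

In this toy netlist the crossing into `sync1` passes, while `bad1` and `raw` are flagged for the two bug classes the article mentions.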

The UltraSPARC T1 had a few hundred ratioed synchronous clocks, which are synchronous clocks with different frequencies. Ratioed synchronous clocks allow the cores to be much more portable because the memory controller and the I/O subsystem can be quickly modified to meet new requirements or standards without changing the core. The RTL designers used a handshake scheme that sent gated, timed synchronization pulses. The scheme consisted of a block that generated the synchronization pulses, which in turn signaled to the rest of the design when the data should be sampled. Whenever there was a synchronous clock crossing, there always had to be a synchronization pulse. If not, the faster clock would allow the data to go through too soon, and the slower clocks would not catch up.
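A toy cycle-level model shows why the synchronization pulse matters. This is a sketch under stated assumptions, not Sun's actual handshake: the function names are hypothetical, the 4:1 ratio comes from the 1.2 GHz core and 300 MHz memory interface frequencies quoted earlier, and the pulse is assumed to enable the sender only on the fast-clock edge that lines up with a slow-clock edge.

```python
CORE_MHZ = 1200      # core clock from the article
MEM_IF_MHZ = 300     # memory interface clock from the article
RATIO = CORE_MHZ // MEM_IF_MHZ   # 4 fast cycles per slow cycle


def sync_pulse(cycle, ratio):
    """Assumed timing: enable only on the fast edge aligned with a slow edge."""
    return cycle % ratio == 0


def simulate(values, ratio, gated):
    """Send `values` from the fast domain to the slow domain, one register deep.

    Returns the values the slow domain actually captured.
    """
    pending = list(values)
    crossing_reg = None          # register driven by the fast domain
    received = []
    for cycle in range((len(values) + 1) * ratio + 1):
        slow_edge = cycle % ratio == 0
        # The slow domain samples the value launched before this edge.
        if slow_edge and crossing_reg is not None:
            received.append(crossing_reg)
        # The fast domain updates every cycle unless gated by the pulse.
        if not gated or sync_pulse(cycle, ratio):
            crossing_reg = pending.pop(0) if pending else None
    return received


data = list(range(1, 9))
print("gated:  ", simulate(data, RATIO, gated=True))
print("ungated:", simulate(data, RATIO, gated=False))
```

With the pulse gating the sender, every value is held for a full slow-clock period and captured once. Without it, the fast domain overwrites the register before the slow domain samples, and most values are lost.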

The second feature added by Questa checked lock-up latches used for production testing. Lock-up latches were inserted to delay certain branches of the clock tree by half a cycle. They are usually used to make sure that, despite clock skew, a scan chain does not shift data through more than one bit per shift cycle. Clock domains can be defined by the user, so clock A and clock B may be branches of the same clock tree if necessary. Because the Questa CDC tool could detect where the clock domain crossings were, it was able to check whether the lock-up latches were in place.
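The lock-up latch check can be sketched in the same spirit. The `ScanElement` class and the function below are hypothetical and only illustrate the idea. The sketch assumes a scan chain given as an ordered list, with user-defined clock domain labels, and it only requires that some lock-up latch sits between two consecutive scan flops in different domains. It does not check the latch's clock phase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanElement:
    """Hypothetical scan chain element in shift order."""
    name: str
    kind: str    # "flop" or "lockup_latch"
    clock: str   # user-defined clock domain label


def missing_lockup_latches(chain):
    """Return (launch_flop, capture_flop) pairs that cross domains with no latch."""
    missing = []
    prev_flop = None
    latch_seen = False
    for elem in chain:
        if elem.kind == "lockup_latch":
            latch_seen = True
            continue
        crosses_domain = prev_flop is not None and prev_flop.clock != elem.clock
        if crosses_domain and not latch_seen:
            missing.append((prev_flop.name, elem.name))
        prev_flop = elem
        latch_seen = False
    return missing


CHAIN = [
    ScanElement("f0", "flop", "clk_a"),
    ScanElement("f1", "flop", "clk_a"),
    ScanElement("lat0", "lockup_latch", "clk_a"),  # protects the clk_a to clk_b hop
    ScanElement("f2", "flop", "clk_b"),
    ScanElement("f3", "flop", "clk_b"),
    ScanElement("f4", "flop", "clk_c"),            # clk_b to clk_c hop is unprotected
]

for launch, capture in missing_lockup_latches(CHAIN):
    print(f"no lock-up latch between {launch} and {capture}")
```

Because domains are just labels here, two branches of the same clock tree can be given different labels to force the check between them, matching the user-defined domains described above.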

Finding Intermittent Bugs

“Questa CDC had a positive impact on the quality of our design,” says Ms. Talvola. “Questa CDC found a handful of critical bugs that may not have been found in the lab because of their intermittent nature. When you’re in the lab, things may fail, but they may not fail every time.”

It took about five minutes to run the clock domain crossing and ratioed synchronous clock tests, and a couple of hours for the lock-up latches. Sun ran Questa CDC on four separate clusters. Questa CDC verified approximately 390 clock domain crossings and approximately 300 ratioed synchronous clocks. A few design bugs were found and corrected, including combinational logic before a synchronizer and a missing synchronizer. When Sun ran the silicon, they validated that Questa CDC did not miss any problems with the clock domains.

“It’s easy to use, to run, and to generate a report,” Ms. Talvola observes. “It was much easier to find these bugs using the Questa CDC reports than to try to find them in the lab. This was especially true with metastability effects. Once Questa CDC found a bug, it put you pretty close to where the bug occurred. It seems easy to me, and if I have to explain it to people here, it doesn’t take that long.”

Expanding CDC Verification

The success of the tool, aided by its ease of use and excellent support, has led the Sun formal verification team to expand the use of Questa CDC to multiple projects. Their goal is to eventually have the design groups using the solution themselves, earlier in the product development flow.

“Anytime a project is launched, we have meetings with verification and RTL managers; we tell them this is available, and it’s up to them to schedule who is going to work with me,” explains Ms. Talvola. “I don’t think there are any Sun chips with multiple clock domains that are not utilizing this tool; definitely not the big processors. If there are multiple clock domains, they’ll want to run the tool.”

“Building large designs of this kind, we need specific tools for specific checks. Questa CDC did the job. We are happy with that. Questa CDC was able to handle the number of clock domain crossings on the chip and provide accurate results. The support of the Questa team was also excellent,” Mr. Mehta concludes.



“We looked at alternative solutions, but only Questa CDC could deliver the features we needed at that time. It was the only viable solution.”

Shrenik Mehta, Director Front-End Technology, Scalable Systems Group, Sun Microsystems
