Hyperstone: ModelSim with SystemVerilog DPI Speeds Simulation and Debug of C Models
“ModelSim makes the task of integrating C code in the RTL simulation a very straightforward, three step process. Many helpful features make it easy to learn.”
Arthur Freitas, Development Engineer, Hyperstone AG
Founded in 1990 and headquartered in Konstanz, Germany, with subsidiaries in Taiwan and the U.S., Hyperstone has developed a proprietary single microprocessor core architecture, combining a high-performance 32-bit RISC processor with a powerful DSP unit. Used to develop ASSPs for the consumer electronics, network communications, storage media, and industrial control markets, Hyperstone’s RISC/DSP processor is powerful, yet small, with low power consumption. It is designed to provide the increasing functionality required to execute DSP tasks, such as digital filtering, image processing, echo cancellation, and other related algorithms.
During the last couple of years, Hyperstone has gained a position as a leader in flash memory card controllers. Its recently taped-out Hyperstone S4 flash memory controller is among the most powerful single-chip controllers targeted for SD and MMC cards. The required external component count is reduced to a bare minimum of passive components, enabling the design of very low-cost, high-performance flash memory-based SD/MMC cards for use in digital still cameras, mobile phones, PDAs, portable multimedia players, digital voice recorders, and the like.
The Hyperstone S4 flash memory controller can operate with flash memory devices that use a NAND-type interface at either 3.3 or 1.8 volts. Through its sophisticated flash memory interface, it accommodates up to four directly connected flash memory chips that can achieve a superior data transfer burst rate of up to 40 Mbytes/s. An on-chip error correction code unit generates the required code bytes for error detection and the correction of up to three random bytes per 512-byte data sector.
Original Test Environment
The Hyperstone S4 comprises the E1-32X microprocessor, SD/MMC interface logic, and other components. The E1-32X is a well-established design, mass-produced in several ASICs. It is available as an extremely low-power, compact, latch-based hard macro well suited to the highly competitive design requirements of SD/MMC designs. These advantages compensate for the disadvantage that the hard macro has to be simulated at the gate level. When ease of integration is a priority, and for larger ASIC and SoC designs where low power and gate area are less critical, the processor is also available as RTL.
Since the E1-32X microprocessor is considered to be bug-free, the verification of the Hyperstone S4 is focused on the logic surrounding the E1-32X. In the verification environment, the E1-32X runs a simplified version of the firmware, acts as an SD/MMC host, and analyzes results.
Because of the gate-level abstraction of the microprocessor, this approach is flexible but very slow. A test that fully stresses the surrounding logic requires an extremely long runtime.
“To achieve a high level of confidence in code and functional coverage,” explains Arthur Freitas, a development engineer at Hyperstone, “we would have had to spend several weeks, identifying the coverage holes and writing the appropriate tests. It was very difficult to generate a good regression test set with the netlist. Every simulation took hours, and it took more than a day to run a regression with the netlist. Plus, I had to spend hours just to get to the point of running the regression every time. Typically I had to do this repeatedly. This was not feasible.”
Therefore, Hyperstone decided to replace the gate-level representation of the microprocessor with its C model, which was already available in-house. This is an increasingly common strategy, as designers use more and more C code as a golden reference, external model, or testbench stimuli. Yet, how to get there in an efficient and low-risk manner remains a tricky problem.
Mr. Freitas knew he needed a mechanism that allowed Verilog code to call functions written in the C programming language. But he wasn’t sure how to go about it. The first thing that came to mind was to use the Verilog PLI. However, Verilog’s PLI is a complex interface and often does not offer high simulation performance.
“It’s very cumbersome to come up with interfacing for a C model with Verilog using plain Verilog PLI,” says Mr. Freitas. “For example, one has to define all the system tasks and then associate a calltf C function with that task name.”
Straightforward C Model Integration
Realizing the drawbacks of the Verilog PLI, Mr. Freitas began looking for an alternative methodology to accelerate his RTL simulation. Because Hyperstone was already using ModelSim, it was natural for him to ask his Mentor Graphics contact at TRIAS Mikroelektronik GmbH for ideas.
“TRIAS gave us important advice, which helped us in our decision to use SystemVerilog,” Mr. Freitas recalls. “They helped us choose an appropriate way of integrating our C model into the RTL simulation, using the SystemVerilog DPI. This was very beneficial to me. The SystemVerilog DPI was definitely better than the Verilog PLI for this application, and you can achieve things faster. It was very good support from TRIAS. I could have spent several weeks going in a different direction.”
The Direct Programming Interface is a new, standardized way in SystemVerilog to link a C model into a Verilog RTL model. The SystemVerilog DPI enables Verilog code to call C functions as if they were native Verilog functions, without the complexity of defining a system task and the associated calltf routine. With ModelSim, Hyperstone was able to quickly get the C integration and simulation up and running.
“ModelSim makes the task of integrating C code in the RTL simulation a very straightforward, three step process,” Mr. Freitas explains. “First you compile the SystemVerilog; next you compile the C to build a UNIX shared object or a Windows .DLL; then you run the simulation. It’s easy to learn and very well explained in the user manual. Many helpful features make it easy to use. For example, the vlog option automatically generated the DPI header file, creating the interfaces between SystemVerilog and C.”
To interface the C model in the RTL simulation a SystemVerilog shell was created. It had exactly the same I/Os as the actual Verilog file it replaced. Internally, instead of the gate-level description of the logic, it consisted of the functions and tasks needed to interface the C code. The C model itself had to be adapted, with additional functions created in order to pass parameters to the Verilog simulation. In the same way that a Verilog task can be exported to C code using a simple export statement, a C function can be imported into Verilog using a simple import statement:
import "DPI" context task ProcessorCall (input int reset, input int interrupts, …);
Because a task can consume simulation time, a DPI C function can synchronize with the simulator.
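The C side of such an import might look like the following minimal sketch. It is not Hyperstone's code: the article elides the rest of the ProcessorCall argument list, so only the two arguments shown are used, and ModelState, model_step, model_pc, and model_pending_irq are hypothetical names introduced for illustration. It relies on the SystemVerilog convention that an imported task maps to a C function returning int, and it leaves out svdpi.h so that it compiles outside a simulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model state; the real Hyperstone C model is not shown in the article. */
typedef struct {
    uint32_t pc;          /* program counter of the modeled processor */
    uint32_t cycles;      /* number of calls since the last reset */
    uint32_t pending_irq; /* interrupt lines latched so far */
} ModelState;

static ModelState g_model;

/* Advance the (toy) processor model by one call from the simulator. */
static void model_step(ModelState *m, int reset, int interrupts)
{
    assert(m != NULL);
    if (reset) {
        m->pc = 0;
        m->cycles = 0;
        m->pending_irq = 0;
        return;
    }
    m->pending_irq |= (uint32_t)interrupts; /* latch any new interrupt lines */
    m->pc += 4;                             /* assumed fixed 4-byte step, for illustration only */
    m->cycles += 1;
}

/* C function matching the imported task name. Only the two arguments
 * visible in the article are modeled; the elided ones are omitted.
 * Returning 0 tells the simulator the task completed normally. */
int ProcessorCall(int reset, int interrupts)
{
    model_step(&g_model, reset, interrupts);
    return 0;
}

/* Hypothetical accessors so a testbench or unit test can inspect the model. */
uint32_t model_pc(void) { return g_model.pc; }
uint32_t model_pending_irq(void) { return g_model.pending_irq; }
```

Keeping the model logic in a plain C function such as model_step, with ProcessorCall as a thin wrapper, lets the same code be unit tested without the simulator and then compiled into the shared object that the simulation loads.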
Mr. Freitas continues, “It took me a few hours to begin compiling the SystemVerilog shell and to link the C model DLL. Then it took me a week to write the C test programs for the C model processor simulation and start getting good results. In a couple of weeks, I was able to get all of the bugs out of the functions I had written to interface the C model with the SystemVerilog DPI. Overall, this was faster than I had expected.”
By enabling simulation at a higher level of abstraction, Hyperstone realized an enormous speedup in simulation runtimes. To demonstrate this, the Hyperstone team ran a test case.
“When we used the C model to substitute the netlist, there was a huge difference in simulation performance,” Mr. Freitas reports. “Our benchmark study revealed that by simulating the C model, we gained runtime improvements of more than 200 times over the actual netlist simulation. Designs compiled with ModelSim’s vlog -fast option also showed interesting speed improvements: for this test case, about 1.6 times.”
Integrated C Debugger
The ModelSim integrated C debugger gave Hyperstone visibility into their entire C code, including the ability to set breakpoints and to see whether functions were being called. Because the debugger is integrated, C function calls could be debugged interactively from within the simulation, making debugging more efficient and effective.
“The ability to interactively debug C code using the ModelSim integrated GDB debugger was another speed up factor,” Mr. Freitas recalls. “It took me only a couple weeks to get the C model bug free.”
For example, because Hyperstone had modified their processor, the C model had to be altered to match. After simulating both the C model and the actual netlist, Mr. Freitas found differences between them. The ModelSim C debugger helped him track down the problems in the C code by allowing him to set breakpoints and see how the C model was executed, and he then fixed them.
Mr. Freitas also found that the ModelSim vpi_printf function facilitated debugging. “I had some variables that I wanted to pass as parameters, but I didn’t know if they were right or not. Whenever I wanted to see a variable from C in ModelSim, I would use the VPI print test variable, and the value would be printed out in the ModelSim tcl interface where I could see it. That was very helpful for debug.”
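The kind of print helper described above could be sketched as follows. This is an illustration, not code from the article: trace_param, MODEL_LOG, and the USE_VPI build switch are hypothetical names. vpi_printf is the standard VPI routine declared in vpi_user.h; when the switch is not defined, the sketch falls back to printf so it can be compiled and tried outside the simulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* USE_VPI is a hypothetical build switch: define it when compiling the
 * shared object for the simulator so messages go to the simulator's
 * transcript through vpi_printf; otherwise plain printf is used. */
#ifdef USE_VPI
#include <vpi_user.h>
#define MODEL_LOG vpi_printf
#else
#define MODEL_LOG printf
#endif

/* Print one named C variable in decimal and hex.
 * Returns the number of characters written, as printf does. */
int trace_param(const char *name, int value)
{
    assert(name != NULL);
    return MODEL_LOG("[cmodel] %s = %d (0x%x)\n", name, value, (unsigned)value);
}
```

A call such as trace_param("reset", reset) placed at the top of an imported function shows the value the C model actually received from the SystemVerilog side.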
Additional Advantages of Abstraction and Standards
Because of the enormous speed improvement gained by working at a higher level of abstraction, Hyperstone can realize the advantages of using an additional processor instantiated in their testbench to act as an SD/MMC host and run the stimuli program. The on-chip Hyperstone E1-32X microprocessor would simply be used to run the actual firmware, or a simplified version of it.
Additionally, the available C model of the flash controller could be integrated, making the whole system available for simulation. This could help Hyperstone to more effectively debug problems that have been identified in real-world systems using FPGA prototyping. They could then emulate flash misbehavior and see how the firmware reacts to it.
Due to the success of the SystemVerilog DPI solution, Mr. Freitas sees the benefits of moving toward a single standard-language workflow, such as that supported by Questa™ AFV from Mentor Graphics.
“SystemVerilog covers a lot of things, including assertions. If you can use one language to do many things, then a standard language is much better than using a proprietary one to do the stimuli, functional coverage, and assertions. Then you could write in one language that covers everything. I see a huge advantage there.”
Questa’s full native support of SystemVerilog and its built-in advanced functional verification technologies have also inspired other ways to enhance Hyperstone’s verification flow.
“There are other SystemVerilog capabilities we’re not using, such as assertions, constrained random stimuli, and testbench automation capabilities,” says Mr. Freitas. “It would be nice in the future to check these out, because right now I’m writing direct tests in C, and I also have the possibility of randomizing in C. But it is hard work to use plain C to create random stimuli that still make sense, and SystemVerilog does this for you.”