HLS Fundamentals: Loop Unrolling and Loop Pipelining
Posted Apr 19, 2011
by Thomas Bollaert
The dust has settled and four winners have emerged from the HLS Bluebook contest. This week, as promised, I will discuss the question that proved to be the most challenging in the third and final round of the contest.
The culprit was the following question, which only 15% of the contenders answered correctly:

[Figure: HLS Contest - Round 3, Question 1]
This simple C code reads 8 input values (din), sums them in an intermediate variable (tmp) and returns the final result (dout).
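The contest listing itself is not reproduced in this text, so the following is only a reconstruction from the description above, not the original code. The names din, tmp, dout and LOOP0 come from the post; the function name `accumulate`, the `int` data type and the use of a pointer for the output are assumptions made for illustration.

```c
#define N 8  /* number of input values, per the description above */

/* Hypothetical reconstruction of the contest code: read N inputs,
 * sum them in tmp, and write the final result to dout. */
void accumulate(const int din[N], int *dout)
{
    int tmp = 0;

LOOP0:
    for (int i = 0; i < N; i++) {
        tmp += din[i];  /* one read (RDi) and one accumulation (ACC) */
    }

    *dout = tmp;
}
```

Calling `accumulate` with the values 1 through 8 would leave 36 in `*dout`. The loop label matters because HLS constraints are usually attached to a named loop.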
Before going over the various answers and explaining the correct one, let’s review two important concepts covered in this question:
- Loop pipelining provides a way to increase the throughput of a loop by initiating the (i+1)th iteration of the loop before the ith iteration has completed. Overlapping the execution of subsequent iterations of a loop exploits parallelism across loop iterations. The number of cycles between iterations of the loop is called the “Initiation Interval” (II).
- Loop unrolling provides a way to reduce the latency of a loop by reducing the number of its iterations. When a loop is unrolled, either fully or partially, the loop body is duplicated as many times as the loop is unrolled. This exposes parallelism that exists across subsequent iterations of the loop. The number of times a loop is unrolled is called the “Unroll Factor”.
If these definitions sound a bit theoretical, the following explanations and charts should help make them clearer. An earlier blog post on this topic is also still available here for reference.
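To make the unrolling definition concrete, here is a minimal sketch of the same accumulation written rolled and then unrolled by hand with an unroll factor of 2. In an HLS flow the tool would normally do this through a constraint rather than a source rewrite; the manual version only illustrates what "duplicating the loop body" means. The function names and the `int` type are assumptions, not taken from the contest code.

```c
#define N 8  /* number of input values */

/* Rolled loop: 8 iterations, one accumulation per iteration. */
int sum_rolled(const int din[N])
{
    int tmp = 0;
    for (int i = 0; i < N; i++) {
        tmp += din[i];
    }
    return tmp;
}

/* Unroll factor 2, written out by hand: the body is duplicated,
 * so only 4 iterations remain. N must be a multiple of 2. */
int sum_unrolled_by_2(const int din[N])
{
    int tmp = 0;
    for (int i = 0; i < N; i += 2) {
        tmp += din[i];
        tmp += din[i + 1];
    }
    return tmp;
}
```

Both functions compute the same sum. The unrolled version simply exposes two reads and two accumulations per iteration, which a scheduler may be able to place in parallel.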
- Default behavior
Assuming that the accumulation takes one clock cycle, a single iteration of LOOP0 would look as follows:
|RDi|ACC|
This notation indicates that in the first cycle an input is read and in the second cycle, the data is accumulated.
If no special synthesis constraints are applied to this design, loops are neither unrolled nor pipelined. This means that the next loop iteration will start right after the previous loop iteration completes. In this case the schedule of the design would look as follows:
|RD0|ACC|RD1|ACC|RD2|ACC|RD3|ACC|RD4|ACC|RD5|ACC|RD6|ACC|RD7|ACC|
In other words, a very serial design is built: in the first cycle din[0] is read, in the second cycle the data is accumulated, in the third cycle din[1] is read, and so on. With 8 iterations of 2 cycles each, a new output is produced every 16 cycles.
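The serial schedule can be reproduced with a small model. This is only a sketch of the counting argument, assuming (as the post does) one cycle for the read and one for the accumulation; the function name is made up for illustration and nothing here reflects how a synthesis tool works internally.

```c
#include <stdio.h>

#define ITERATIONS 8  /* LOOP0 trip count */

/* Print the default (rolled, not pipelined) schedule and return
 * the total number of cycles needed to produce one result. */
int print_serial_schedule(void)
{
    int cycles = 0;

    for (int i = 0; i < ITERATIONS; i++) {
        printf("|RD%d|ACC", i);  /* read cycle, then accumulate cycle */
        cycles += 2;
    }
    printf("|\n");

    return cycles;  /* 8 iterations * 2 cycles each */
}
```

The function prints the same chart as above and returns 16, the number of cycles between outputs in the default case.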
- Analysis of Answer #2
Let’s now see what happens in the case of answer 2, when we leave LOOP0 rolled and pipeline the design with II=3.
As in the default case, the loop is kept rolled, therefore one iteration would look like this:
|RDi|ACC|
But now we pipeline the design with II=3, which means we are building a design where each new iteration of LOOP0 starts 3 cycles after the beginning of the previous iteration. The chart is schematic: the shift from one row to the next stands for that 3-cycle interval.
|RD0|ACC| | | | | | | | |
| |RD1|ACC| | | | | | | |
| | |RD2|ACC| | | | | | |
| | | |RD3|ACC| | | | | |
| | | | |RD4|ACC| | | | |
| | | | | |RD5|ACC| | | |
| | | | | | |RD6|ACC| | |
| | | | | | | |RD7|ACC| |
|<-3cycles->| | | | | | | |
Consequently, the next design iteration (to process a new set of inputs) would start 3 cycles after the start of the last LOOP0 iteration. There are 8 iterations of LOOP0 and a new iteration starts every 3 clock cycles, so the design accepts new inputs and produces a new result every 8 × 3 = 24 cycles.
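The arithmetic behind this result can be sketched as two small helpers. They model only the reasoning given above (one new LOOP0 iteration every II cycles, and the next set of inputs starting one II after the last iteration begins); the function names are hypothetical.

```c
/* Cycle in which LOOP0 iteration i starts when a new iteration
 * is launched every ii cycles (iteration 0 starts at cycle 0). */
int iteration_start(int i, int ii)
{
    return i * ii;
}

/* Cycles between two successive results: the next set of inputs
 * starts ii cycles after the start of the last loop iteration. */
int design_interval(int iterations, int ii)
{
    return iteration_start(iterations - 1, ii) + ii;
}
```

For answer 2, `iteration_start(7, 3)` gives 21 for the last iteration and `design_interval(8, 3)` gives 24, matching the figure in the text.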
So answer 2 is not the correct one. Have you found the correct answer yet? In my next blog post, I’ll explain what happens when you start unrolling loops.