A yellow background color for the slide release date means
either that the version for the current academic year is available
(when the date mentions 2025) or that the slides have not been
updated for the current academic year. A pink background means
that only last year's slides are available for the moment.
February 7, 2025 | Organization | - | Organization | February 9, 2024 |
February 7, 2025 | Introduction, Models of computation | [Par09] | Introduction | February 9, 2024 |
February 14, 2025 | Architecture synthesis and scheduling | [Ger99] | Architectural Synthesis | February 16, 2024 |
February 14, 2025 | Overlapped scheduling | [Ger98] | | |
February 21, 2025 | No lecture, holiday week (Spring Break) | | | |
February 28, 2025 | Fixed-point design | [Bou08] | Fixed-Point Design | March 1, 2024 |
February 28, 2025 | The Arx RTL Language and Toolset | [Hof07] | Arx Version with audio | March 1, 2024 |
March 7, 2025 | Algorithm transformations | [Par95] | Transformations Addendum | March 14, 2022 |
March 7 + March 14, 2025 | The CORDIC Algorithm | [And98] and [Loe00] | CORDIC | March 8, 2024 |
March 14, 2025 | Polyphase implementation of multirate filters | [Lan02] and [Vai90]. The part of the theory on downsampling is compulsory, the part on upsampling is optional. | Polyphase implementation | March 16, 2018 |
March 14 + March 21, 2025 | Multiplierless filter design | [Hew00], [Vor07], [Aks14] and [Kot03] | Multiplierless Filter Design | March 25, 2023 |
March 21, 2025 | Software synthesis | Sections I and II of [Bha00] | Software Synthesis | March 25, 2022 |
March 28, 2025 | Code generation | Sections III and IV of [Bha00] + [Goo05] + [Kes19] | Code Generation | March 26, 2021 |
March 28, 2025 | Modern DSP Architectures | [Anj11] | DSP Architectures | April 3, 2022 |
April 4, 2025 | To be announced |
Most of the material can only be accessed through the collective subscriptions of LISA, the library services of the University of Twente. Such access is automatic when on campus. For off-campus access, please consult the information page of LISA on this topic. I can recommend the lean-library plug-in.
Caption of Figure 6: last subscript of y should be n-1 instead of n.
Right column of Page 856: Read Figure 9(b) where 9(a) is mentioned and vice versa.
Contents of Figure 17: In order to be consistent with next figures, rewrite "x = a - b" and "y = a - b + c * d".
Those interested in a detailed analysis of the probability density function of the truncation error after multiplication can consult the following non-compulsory paper:
Ahmadi, A. and M. Zwolinski, Fixed-Point Multiplication: A Probabilistic Bit-Pattern View, Microelectronics Reliability, Vol. 51(4), pp. 790-796 (April 2011). Online copy (only in UT domain).
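To make the notion of truncation error after multiplication concrete, here is a minimal sketch that enumerates it exhaustively for small word lengths. It is not the analysis from the paper above. It assumes unsigned operands with n fractional bits and a product truncated back to n fractional bits; the function name and the choice n = 4 are illustrative only.

```python
from collections import Counter


def truncation_error_histogram(n):
    """Count how often each truncation error occurs over all operand pairs.

    Assumption: operands are unsigned fractions i / 2**n and j / 2**n.
    The exact product i * j / 2**(2 * n) is truncated to n fractional bits,
    so the error equals the discarded low n bits of i * j, scaled by
    2**-(2 * n).
    """
    mask = (1 << n) - 1
    counts = Counter()
    for i in range(1 << n):
        for j in range(1 << n):
            counts[(i * j) & mask] += 1
    return counts


n = 4
counts = truncation_error_histogram(n)
total = sum(counts.values())
for low_bits in sorted(counts):
    error = low_bits / 2 ** (2 * n)
    print(f"error {error:.6f}: probability {counts[low_bits] / total:.4f}")
```

Under these assumptions the error always lies in the interval [0, 2**-n), and a zero error occurs more often than a uniform error model would predict.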
You can skip Section 11.3 (2D FIR filters).
Page 203, halfway through the bottom paragraph: add a minus sign to the exponent of 2 in both occurrences (so 2**n should become 2**-n).
Page 204, Equation 11.10: the closing parenthesis with exponent 2 should move to the end of the equation.
You can skip Section 6.5.3 on the efficient computation of the iteration-period bound.
Erratum: On Page 138, in the second-to-last sentence of the second-to-last paragraph, the range [-11, -4] should be corrected to [-13, -4].
You can skip Section 12.4.3 on force-directed scheduling.
Sections 1 and 2 are compulsory, the remaining sections are optional.
Optional text!
Correction for Equation 6: there should not be a factor 2 in front of h_LP(m).
You can skip Sections 6 (folding) and 8 (relaxed look-ahead).
Comments on Figure 6. The issue is that unfolding can improve the processor utilization. The explanation in the paper is not correct.
The schedule shown in Figure 6(b) is rate optimal, i.e. it repeats at the iteration-period bound (T0min) of 3. In this period, the total computation to be performed is 9 time units (4 operations of 2 time units and 1 of 1 time unit). The lower bound on the number of processors is 3 (=9/3). However, this bound cannot be met. The reason is that the schedule needs to repeat every 3 time units. This means that a separate processor is necessary for each of the operations A to D that take two time units (a processor that would execute two of them would require an iteration period of 4). With these 4 processors, the average processor utilization is 75% (9/12).
Figure 6(c) shows a schedule of the graph after 2-unfolding. The unfolded graph contains 2 iterations of the original graph. This schedule is also rate optimal, which means that the 2 iterations are executed in 6 time units. The optimal number of processors in this situation would again be 3 (=18/6). There now exists a schedule that reaches 100% processor utilization (the available 6 time units per processor can now be filled optimally with operations of 2 time units).
In Figure 6(b), the operations A0, B0, C0, D0 and E0 belong to one iteration. The schedule has an iteration period of 3 (A1 starts 3 time units after A0, etc.), a latency of 7 (output on E0) and a span of 8 (end of D0).
In Figure 6(c), the operations A0/A1, B0/B1, C0/C1, D0/D1 and E0/E1 belong to one iteration. The schedule has an iteration period of 6 (A2 starts 6 time units after A0, etc.) and a latency and span of 12 (output on E1).
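The arithmetic in these comments on Figure 6 can be checked with a minimal sketch. The execution times (2 time units for A to D, 1 for E), the iteration periods and the processor counts are taken from the text above; the function names are hypothetical and chosen for illustration only.

```python
def processor_lower_bound(exec_times, iteration_period):
    """Lower bound on processors: ceil(total work / iteration period)."""
    total = sum(exec_times)
    return -(-total // iteration_period)  # integer ceiling division


def utilization(exec_times, iteration_period, processors):
    """Fraction of the available processor time that is spent computing."""
    return sum(exec_times) / (iteration_period * processors)


# Original graph: A to D take 2 time units each, E takes 1.
original = [2, 2, 2, 2, 1]
bound_original = processor_lower_bound(original, 3)
# The bound of 3 cannot be met: A to D each need their own processor.
util_original = utilization(original, 3, processors=4)

# After 2-unfolding: two iterations of the graph in a period of 6.
unfolded = original * 2
bound_unfolded = processor_lower_bound(unfolded, 6)
util_unfolded = utilization(unfolded, 6, processors=bound_unfolded)

print(bound_original, util_original)  # 3 0.75
print(bound_unfolded, util_unfolded)  # 3 1.0
```

The sketch only reproduces the counting argument. It does not construct the schedules of Figure 6(b) and 6(c), so the claim that 3 processors suffice after unfolding still rests on the schedule shown in the figure.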
Comments on Figure 9(a). In my opinion, two inequalities are incorrect: r(A2) - r(M1) <= 2 and r(M4) - r(A3) <= -1.
Optional text.
Only study Section 1 (until page 6); the rest is optional.
The students concerned do not need to carry out all exercises. It is left up to them to spend as much time as needed to achieve the goals mentioned above. Depending on the students' background, 10 to 20 hours should be sufficient. No reports need to be delivered. This activity does not contribute to the grading of this course.
Before doing any exercise, you need to set up a connection from your PC to server xoc2.ewi.utwente.nl following the infrastructure-guidelines page.
Go (back) to Sabih's Home Page.