

MODERN DSP ARCHITECTURES

March 31, 2022

## PARALLEL PROCESSING

- Central question:
  - How to increase the performance?
- Increasing the clock frequency:
  - Leads to excessive power consumption, overheating, etc.

© Sabih H. Gerez, University of Twente, The Netherlands

- Parallel processing is the solution:
  - Not only for computations
  - Also for data transport, memories, etc.


## VECTOR PROCESSING, SIMD (1)

- One way to introduce parallelism without changing a processor's architecture too much is to apply the same instruction to multiple data items:
  - Single Instruction Multiple Data (SIMD)
  - Also called: vector processing
- Think of computations that are repeated on multiple data and are mutually independent:
  - Taps in an FIR filter
  - Butterflies in the same stage of an FFT
  - Etc.
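The FIR case can be sketched in plain Python. This is only an illustrative model, not how a DSP implements it: the helper names `simd_mul` and `fir_output`, the 4-lane width, and the tap and sample values are all assumptions made for this example. One "instruction" (`simd_mul`) multiplies every tap by its sample at once, which is allowed because the products are mutually independent. Only the final summation combines the lanes.

```python
LANES = 4  # assumed vector width for this sketch


def simd_mul(a, b):
    """Model one SIMD multiply: the same operation applied to every lane."""
    assert len(a) == len(b) == LANES
    return [x * y for x, y in zip(a, b)]


def fir_output(coeffs, window):
    """One FIR output sample: lane-wise products, then a reduction (sum)."""
    products = simd_mul(coeffs, window)  # independent per-tap multiplies
    return sum(products)                 # the only step that combines lanes


coeffs = [1, 2, 3, 4]      # hypothetical filter taps
window = [10, 20, 30, 40]  # most recent input samples, aligned with the taps
y = fir_output(coeffs, window)
```

On real hardware the list comprehension corresponds to a single vector instruction. The reduction usually needs extra steps, which is why the independent multiplies are the part that maps most directly onto SIMD.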


IMPLEMENTATION OF DSP

## VECTOR PROCESSING, SIMD (2)

- Example: a 32-bit adder re-used as 2 16-bit adders.

[Figure: ADD2, two 16-bit additions on one 32-bit adder, split between bits 15 and 16. © Kessler 2019, Figure 11.]

## VERY-LARGE-INSTRUCTION WORD: VLIW (1)

- Multiple FUs, also called EXUs (execution units)
- Load-store architecture:
  - Communication with memory is always via register files.
  - Register files are multi-ported.
- Each FU can receive an instruction every clock cycle
- Each RISC instruction = one issue slot
- No dependencies between different RISC instructions
  - Orthogonal microcode
  - Compiler friendly
- One instruction = many RISC instructions

© Jef van Meerbergen (TUE/Philips)
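Returning to the ADD2 example (a 32-bit adder re-used as two 16-bit adders), the effect can be emulated in software. Hardware would typically suppress the carry between bits 15 and 16 directly. The sketch below instead uses a well-known masking trick to get the same result with ordinary integer arithmetic. The names `pack`, `unpack`, and `add2` are assumptions for this illustration.

```python
MASK32 = 0xFFFFFFFF
LOW_BITS = 0x7FFF7FFF   # lower 15 bits of each 16-bit lane
HIGH_BITS = 0x80008000  # top bit of each 16-bit lane


def pack(hi, lo):
    """Place two 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack(word):
    """Split a 32-bit word into its two 16-bit lanes (hi, lo)."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def add2(x, y):
    """Two independent 16-bit additions (modulo 2**16) in one 32-bit word."""
    # Add the lower 15 bits of each lane: a carry can reach the lane's top
    # bit but can never cross into the neighbouring lane.
    partial = (x & LOW_BITS) + (y & LOW_BITS)
    # Top bit of each lane = x_top XOR y_top XOR carry-in; carry-out is dropped.
    return (partial ^ ((x ^ y) & HIGH_BITS)) & MASK32


x = pack(1, 0xFFFF)
y = pack(2, 1)
lanes = unpack(add2(x, y))         # low lane wraps to 0, high lane is 1 + 2
leaky = unpack((x + y) & MASK32)   # plain 32-bit add lets the carry leak
```

Comparing `lanes` with `leaky` shows why the carry chain has to be broken: with a plain 32-bit addition, the overflow of the low lane corrupts the high lane.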

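The VLIW idea of one instruction holding several independent RISC-style operations, one per issue slot, can be modelled as follows. This is a toy sketch under stated assumptions: three FUs, the operation names `add`, `sub`, and `mul`, the register names, and the function `execute_bundle` are all hypothetical. Every slot reads the register file as it was at the start of the cycle, so the slots cannot depend on each other.

```python
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def execute_bundle(regs, bundle):
    """Execute one VLIW instruction (a list of issue slots) in one 'cycle'.

    Each slot is (op, dst, src1, src2). All slots read the old register
    values; results are written back only after every slot has computed.
    """
    results = []
    for op, dst, src1, src2 in bundle:
        if op == "nop":  # an unused issue slot
            continue
        results.append((dst, OPS[op](regs[src1], regs[src2])))
    new_regs = dict(regs)
    for dst, value in results:
        new_regs[dst] = value
    return new_regs


regs = {"r0": 2, "r1": 3, "r2": 4, "r3": 5, "r4": 0, "r5": 0, "r6": 0}
bundle = [
    ("add", "r4", "r0", "r1"),  # slot for FU 0
    ("mul", "r5", "r2", "r3"),  # slot for FU 1
    ("sub", "r6", "r3", "r0"),  # slot for FU 2
]
after = execute_bundle(regs, bundle)
```

In this model, finding operations that can share a bundle is left entirely to whoever builds `bundle`. That mirrors the role of the compiler on a VLIW machine.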





## COARSE-GRAIN RECONFIGURABLE

- FPGAs are *fine-grain* reconfigurable:
  - Roughly speaking, one builds digital systems by connecting bit-level building blocks such as AND and OR gates (in practice, by configuring look-up tables and interconnections).
- *Coarse-grain reconfigurable* architectures have building blocks at the level of ALUs, multipliers, etc.
  - For example, a suitable configuration creates a datapath that can compute an entire FFT butterfly.
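As a rough illustration of such a datapath, the sketch below wires three word-level blocks (one multiplier, one adder, one subtractor) into a radix-2 decimation-in-time butterfly. The block functions and the name `butterfly` are assumptions for this example. A real coarse-grain fabric would configure hardware units rather than call Python functions.

```python
def multiplier(a, b):
    """Word-level building block: complex multiply."""
    return a * b


def adder(a, b):
    """Word-level building block: complex add."""
    return a + b


def subtractor(a, b):
    """Word-level building block: complex subtract."""
    return a - b


def butterfly(a, b, w):
    """Radix-2 butterfly: returns (a + w*b, a - w*b) for twiddle factor w."""
    t = multiplier(w, b)           # one multiplier, its result shared
    return adder(a, t), subtractor(a, t)


top, bottom = butterfly(1 + 0j, 1 + 0j, 1 + 0j)  # a 2-point DFT of [1, 1]
```

The "configuration" here is just the wiring inside `butterfly`: which block feeds which. On an FPGA the same function would be assembled from many bit-level look-up tables instead.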




## DSP FOR SOFTWARE-DEFINED RADIO

- Check the following paper:
  - Anjum, O., T. Ahonen, F. Garzia, J. Nurmi, C. Brunelli and H. Berg, *State-of-the-Art Baseband DSP Platforms for Software-Defined Radio: A Survey*, EURASIP Journal on Wireless Communications and Networking, Vol. 2011(5).
- The paper presents several ICs proposed for *software-defined radio* (SDR):
  - SDR: an approach that realizes radio functions (mixing, filtering, etc.) in software running on processors.
- Check the references in the paper to understand specific solutions in detail.
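To make "radio functions on processors" concrete, here is a minimal sketch of two such functions in software: mixing (multiplying by a complex exponential to shift a signal in frequency) and filtering (here a simple moving-average FIR). The function names, the 8 kHz sample rate, and the 1 kHz tone are assumptions chosen for the example and are not taken from the paper.

```python
import cmath


def mix(samples, freq, sample_rate):
    """Shift the spectrum down by `freq` Hz using a complex exponential."""
    return [s * cmath.exp(-2j * cmath.pi * freq * n / sample_rate)
            for n, s in enumerate(samples)]


def moving_average(samples, length):
    """Very simple low-pass FIR filter: `length` taps, each equal to 1/length."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - length + 1): n + 1]
        out.append(sum(window) / length)
    return out


sample_rate = 8000.0  # Hz, assumed
freq = 1000.0         # Hz, assumed carrier

# A complex tone at the carrier frequency stands in for the received signal.
received = [cmath.exp(2j * cmath.pi * freq * n / sample_rate) for n in range(16)]

baseband = mix(received, freq, sample_rate)  # tone moves to 0 Hz (a constant)
filtered = moving_average(baseband, 4)       # low-pass keeps the 0 Hz component
```

After the first few samples, `filtered` settles at a constant close to 1. Both stages are loops of multiply and accumulate operations, the kind of workload the DSP platforms in the survey target.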

- Platforms are a mixture of generic processors and dedicated co-processors (e.g. for LDPC decoding: I […] check).
- Often also a mix of SIMD and VLIW.
- Next to DSPs a RISC-style processor is available for overall control and control-dominated parts of […]

- Programming such platforms is very complex, and considerable effort goes into compilers and other programming aids.