Integrated Circuits and Systems group, IIT Madras

Design and Implementation of a Two Dimensional Data Access Architecture for Digital Signal Processors

By Srikanth B.

Abstract

Image processing has so far not benefited as much as speech and audio processing from the development of Digital Signal Processors (DSPs). This is because most existing DSP architectures are optimised to handle one dimensional data efficiently. Hence, most programmers rely on factorisation techniques to decompose the two dimensional data array into separable row and column vectors before processing them on such DSPs. Such methods, however, impose a heavy penalty on the system in terms of execution time and structural resources, thereby degrading performance. This wide gap between two dimensional and one dimensional architectures can be bridged if the processor is able to access two dimensional data directly, without the program having to translate the two indices into a linear memory pointer.

To achieve this objective without incurring huge overheads or losing programming flexibility, a hardware solution for address translation was proposed (Srinivasan et al., 1991). In this scheme, the conventional one dimensional address computation hardware is replaced with a simple and compact two dimensional unit that translates two dimensional references to a linear address transparently, without unduly burdening the processor. This unit should also be capable of computing the memory addresses of one dimensional references, without compromising performance.
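
The kind of translation such a unit performs can be illustrated with a minimal Python sketch. It assumes a row-major memory layout; the function names, the `row_stride` and `elem_size` parameters and the layout itself are assumptions made for illustration and are not taken from the cited scheme or from the hardware described in this work.

```python
def linear_address(base, row, col, row_stride, elem_size=1):
    """Translate a (row, col) reference to a linear address.

    Assumes row-major storage: consecutive rows are row_stride
    elements apart, and each element occupies elem_size address units.
    """
    return base + (row * row_stride + col) * elem_size


def window_addresses(base, top, left, height, width, row_stride):
    """List the linear addresses touched by a height x width window,
    visited row by row, as a 2D kernel such as convolution would."""
    return [
        linear_address(base, top + r, left + c, row_stride)
        for r in range(height)
        for c in range(width)
    ]
```

With `row` fixed at zero the same expression reduces to `base + col`, which is the ordinary one dimensional case. On a conventional DSP the multiply and add in `linear_address` must be issued as program instructions for every two dimensional reference, which is the overhead a dedicated address unit is meant to remove.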

This work deals with the design, modelling and FPGA synthesis of the two dimensional address computation hardware (2D-ACH). This includes the design of the I/O connectivity and the instruction set for the address computation hardware. The performance of a DSP employing the 2D-ACH is compared with that of conventional DSP architectures. A speedup factor of 2.5 is achieved for typical two dimensional algorithms such as 2D-Convolution. The performance of the DSP is, however, not compromised when running one dimensional algorithms. The increase in the size of the address computation hardware is 34% on Xilinx XC4000 FPGA technology. This increase would translate to a much smaller percentage increase in the size of the DSP as a whole. A 2.5 times improvement in performance, for a very small increase in the size of the processor, can be regarded as a considerable improvement in the design of DSPs.