# Prototype board for the test of self-timed circuits developed in FPGAs<sup>1</sup>

M. Sanchez Raya<sup>\*a</sup>, R. Jimenez Naharro<sup>a</sup>, J. Castro Ramírez<sup>b</sup> <sup>a</sup>Escuela Politécnica Superior, Campus Universitario de La Rabida, Huelva, Spain; <sup>b</sup>Instituto de Microelectronica de Sevilla, <sup>2</sup>CNM-CSIC, Sevilla, Spain

## ABSTRACT

A large number of commercial FPGA prototype boards are available today. However, these boards are mainly oriented to the functional verification of synchronous designs. It would therefore be useful to include test modules dedicated to the characterization of digital circuits. These circuits should not be limited to synchronous ones; asynchronous circuits are also considered because of their potential advantages.

The parameters to be characterized include latency, throughput, power consumption and noise. One purpose of this characterization is the comparison between synchronous and asynchronous implementations. On most existing boards, measuring certain figures of merit of a design is hindered by the impossibility of varying the clock frequency, or by the lack of measuring points for power consumption or high-speed signals. The main novelty contributed by the design of this board is the possibility of extracting dynamic parameters of the operation of a design implemented on an FPGA.

In this work, an FPGA-based prototype board is proposed. One of its main novelties is the inclusion of an autonomous test system that permits functional verification and characterization of the implemented designs. As an application, a test bench has been developed to compare and validate several arithmetic circuits, including synchronous and asynchronous implementations.

Keywords: FPGA, Self-timed, Power, Noise, Latency, Throughput.

## 1. INTRODUCTION

During the last decade, research interest in asynchronous circuits has increased. The main motivations for this revival stem from the differences between synchronous and asynchronous operation, that is, control by a global signal (the clock) or by a communication protocol, as shown in Figure 1.



Figure 1: Difference between operation of synchronous (a) and asynchronous (b) systems.

<sup>1</sup> This work has been partially sponsored by the project UHU2004-06

<sup>2\*</sup> msraya@diesia.uhu.es; phone +34 959217661; fax +34 959217348

This difference in operation affects the main design parameters. Among them, we find the following differences (all of them due to the synchronization or desynchronization of circuit signals):

- Operation in the worst case or in the average case.
- Power consumption in all blocks or only in the operating blocks.
- Noise due to all or only a few signal transitions.
- A worse or better behavior with respect to changes in operating conditions.

These advantages of asynchronous circuits can be quantified by means of a series of parameters. Among these parameters, we can distinguish timing parameters, such as latency and throughput, and voltage and current parameters, such as power consumption and noise.

Although FPGA devices have traditionally been oriented to synchronous circuits, the application of asynchronous techniques to FPGAs has increased recently [1, 2]. However, empirical characterization techniques are not well developed, and techniques to measure the main design parameters would be desirable. These techniques will be used to complement simulation measurements.

This work therefore pursues two main objectives. The first is to develop a framework that performs the empirical measurements in as automated a way as possible. The second is to develop techniques to implement asynchronous circuits in conventional FPGA devices.

The paper is organized as follows. First, the parameters included in the framework, and their measurement techniques, are considered. Second, we show the architecture of the prototype board, with the additional blocks that automate the parameter measurements. After that, the results obtained are shown. Finally, we present the conclusions of this work.

## 2. MEASURED PARAMETERS

The framework must contain the elements necessary to measure the main parameters. The parameters considered are power consumption, noise, latency and throughput.

## 2.1. Power consumption

In a digital circuit, the dynamic power consumption follows the equation below [3, 4, 5]. It depends on the operating frequency (f), the switching activity ( $\alpha$ ), the load capacitance (C<sub>LOAD</sub>) and the square of the supply voltage (V<sub>DD</sub>).

$$P = \frac{1}{2} \cdot \alpha \cdot f \cdot C_{\text{LOAD}} \cdot V_{\text{DD}}^{2}$$
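As a minimal sketch, the dependence on each parameter can be explored numerically. The function below uses the standard CMOS form of the expression, with the supply voltage squared. The function name and the operating point (activity, frequency, capacitance, voltage) are hypothetical values chosen for illustration; they are not taken from the board.

```python
def dynamic_power(alpha: float, f_hz: float, c_load_f: float, v_dd: float) -> float:
    """Dynamic power in watts: P = 1/2 * alpha * f * C_LOAD * V_DD**2."""
    return 0.5 * alpha * f_hz * c_load_f * v_dd ** 2


# Hypothetical operating point (assumed values, not measurements).
p_full = dynamic_power(alpha=0.2, f_hz=50e6, c_load_f=100e-12, v_dd=3.3)

# Halving the switching activity halves the dynamic power,
# since frequency, capacitance and supply voltage are unchanged.
p_half_activity = dynamic_power(alpha=0.1, f_hz=50e6, c_load_f=100e-12, v_dd=3.3)

print(f"{p_full * 1e3:.3f} mW vs {p_half_activity * 1e3:.3f} mW")
```

On an FPGA the supply voltage is fixed by the device, so in practice the switching activity and the load capacitance are the terms a designer can influence.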

Since we are going to use an FPGA device, some of these parameters, such as the supply voltage, are fixed by the device, while the other parameters (mainly the switching activity) determine the power dissipated in the different parts of the FPGA [6]:

- In the case of routing channels, the use of long channels has two different effects. First, more routing devices, such as buffers, are used, so the number of signals with activity is also higher. Second, long channels prevent forks from being isochronic, and hence the number of transitions increases.
- In the case of CLBs, when the implemented circuits show high activity, the power consumption is higher.
- In the case of I/O blocks, the power consumption depends on the devices external to the FPGA through the load capacitance.

The measurements of power consumption found in the literature have mainly been limited to statistical studies [7] or simulation results. One CAD tool specifically oriented to power analysis is the PowerMill package from Synopsys [8]. A model of power distribution has also been derived in earlier studies [9].

Some authors have measured the power consumption directly or indirectly [10]. The approach chosen depends on how the measuring equipment affects the measurement. The measurement consists of obtaining the waveform  $i_{DD}(t)$  of the supply source.

Other solutions obtain the  $v_{DD}(t)$  waveform through devices connected between the FPGA and the supply source, so that the power consumption can be measured through the voltage drop. The main problem of this solution is that the supply voltage must not fall more than 10% below its nominal value, to guarantee correct operation of the FPGA device.

The method employed in this work is based on these last solutions. The power supply is temporarily disconnected, and the drop in the supply voltage (held by a capacitor) during the disconnection is related to the power consumption. The disconnection is performed with a voltage-controlled switch, as shown in Figure 2.



Figure 2: Measurement of power consumption

So, the power consumption is  $P = E/\Delta T$ , where  $E$  is the energy consumed during the disconnection and  $\Delta T$  is the time during which the supply source is disconnected from the FPGA.

The advantages of this measurement method are the ability to measure the energy consumed by a circuit in a single input change and the simple mathematical processing of the information. Its main disadvantages are possible problems due to sensitivity to EMI and the need to calibrate the capacitor  $C_{\rm DD}$  to obtain an absolute measurement.
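The text does not state how the energy $E$ is obtained from the voltage drop. The sketch below assumes $E = \frac{1}{2} C_{DD} \Delta V^2$, an inferred form chosen because it reproduces the 0.54 mW figure quoted for Figure 8 in Section 4 (0.8 µF, 400 mV, 118 µs). The function names are hypothetical.

```python
def energy_from_droop(c_dd_f: float, delta_v: float) -> float:
    """Energy estimate in joules, ASSUMING E = 1/2 * C_DD * dV**2.

    This form is inferred from the numbers reported in the results section;
    it is not given explicitly in the text.
    """
    return 0.5 * c_dd_f * delta_v ** 2


def average_power(c_dd_f: float, delta_v: float, delta_t_s: float) -> float:
    """Average power in watts over the disconnection window: P = E / dT."""
    return energy_from_droop(c_dd_f, delta_v) / delta_t_s


# Values reported for Figure 8: C_DD = 0.8 uF, dV = 400 mV, dT = 118 us.
p_fig8 = average_power(c_dd_f=0.8e-6, delta_v=0.400, delta_t_s=118e-6)
print(f"{p_fig8 * 1e3:.2f} mW")  # prints "0.54 mW"
```

A stricter energy balance on the capacitor would use $\frac{1}{2} C_{DD} (V_1^2 - V_2^2)$ with the voltages before and after the disconnection. That form needs the absolute supply voltage, which is not given with the Figure 8 values, so it is not used in this sketch.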

## 2.2. Noise

The noise generated by a digital circuit is due to the influence of signal transitions on the supply and ground lines [9, 11]. The coupling to the internal lines of the circuit is produced by the parasitic inductances present in the device. Although all lines in the circuit are affected, the most sensitive ones are the supply and ground lines, because they reach all parts of the device. For these lines, the major component of the parasitic inductance is due to the pads, so it is useful to have a model of the pads in an FPGA. Figure 3 shows two possible pad models.



Figure 3: Models of FPGA pad

The measurement method consists of copying the supply line and amplifying the voltage peak due to the inductance so that it can be detected by the microcontroller. The microcontroller then measures the maximum peak, which is taken as the measurement of noise. The blocks that implement this measurement method are shown in Figure 4.



Figure 4: Blocks of the noise measurement unit



First, a logarithmic amplifier and detector transforms the magnitude into decibels. After that, an amplifier and an analog-to-digital converter are needed so that the noise value can be read by the microcontroller.
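The chain described above (logarithmic detector, amplifier, converter) can be modeled roughly as follows. This is a sketch under stated assumptions: the 1 mV reference level, the detector slope of 25 mV/dB, the 3.3 V full scale and the 10-bit resolution are all hypothetical, since the text does not give the characteristics of the components.

```python
import math


def peak_to_db(v_peak: float, v_ref: float = 1e-3) -> float:
    """Peak voltage in dB relative to v_ref (assumed reference: 1 mV)."""
    return 20.0 * math.log10(v_peak / v_ref)


def detector_output(v_peak: float, slope_v_per_db: float = 0.025) -> float:
    """Output voltage of an idealized log detector (assumed slope: 25 mV/dB)."""
    return slope_v_per_db * peak_to_db(v_peak)


def adc_code(v_in: float, v_full_scale: float = 3.3, bits: int = 10) -> int:
    """Quantize a voltage with an ideal ADC (assumed 10 bits, 3.3 V full scale)."""
    v_clamped = min(max(v_in, 0.0), v_full_scale)
    return round(v_clamped / v_full_scale * (2 ** bits - 1))


# Two example supply peaks in volts: a larger peak yields a larger ADC code.
code_small = adc_code(detector_output(25.6e-3))
code_large = adc_code(detector_output(34.4e-3))
print(code_small, code_large)
```

The logarithmic stage compresses the dynamic range, so small and large peaks both fit within the input range of the converter.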

## 2.3. Throughput

The throughput is characterized here by the time necessary to introduce a new data item into the circuit. In a self-timed circuit, this is the time between the rising edges of the two protocol signals, request and acknowledge. It obviously depends on the degree of occupation of the circuit pipeline, so we use a test bench that maximizes the speed of the circuit. This test bench is shown in Figure 5. The maximum speed is achieved by connecting the request and acknowledge signals (a data item is introduced into the circuit as soon as the first pipeline stage is free) and by connecting the complete and acknowledge signals (a data item is taken out of the last pipeline stage as soon as it is generated).



Figure 5: Test bench to measure timing parameters

The measurement block must determine the time between both transitions. With this maximum-speed scheme, however, this time equals the time during which the request signal is high, as shown in Figure 6, because the rising transition of the acknowledge signal coincides with the falling transition of the request signal.



Figure 6: Measurement of time of (a) throughput and (b) latency.

In the case of a synchronous circuit, the throughput is the minimum period of the clock signal, that is, the delay of the slowest digital block.
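Both definitions can be sketched in a few lines. For the self-timed case, the helper below extracts the width of each high pulse of the request signal from a list of edges. For the synchronous case, the minimum clock period is the largest block delay. The edge-list representation, the function names and all numeric values are assumptions made for illustration; the text does not describe the internals of the measurement block.

```python
def high_pulse_widths(edges):
    """Widths (ns) of the high pulses of a signal.

    `edges` is a time-ordered list of (time_ns, level) pairs, where level
    is 1 after a rising edge and 0 after a falling edge.
    """
    widths = []
    rise_ns = None
    for t_ns, level in edges:
        if level == 1 and rise_ns is None:
            rise_ns = t_ns                    # start of a high pulse
        elif level == 0 and rise_ns is not None:
            widths.append(t_ns - rise_ns)     # end of the pulse
            rise_ns = None
    return widths


def sync_min_period(block_delays_ns):
    """Minimum clock period of a synchronous pipeline: the slowest block."""
    return max(block_delays_ns)


# Hypothetical request waveform: two handshakes of different duration.
request_edges = [(0, 1), (40, 0), (55, 1), (97, 0)]
widths_ns = high_pulse_widths(request_edges)

# Hypothetical delays of four pipeline blocks.
period_ns = sync_min_period([12, 18, 15, 9])
print(widths_ns, period_ns)
```

The self-timed pulse widths can differ from one data item to the next, whereas the synchronous period is fixed by the slowest block regardless of the data.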

## 2.4. Latency

The latency is the time necessary to perform an operation, that is, the time between a rising transition of the request signal and the corresponding complete signal. Again, this time depends on the degree of occupation of the pipeline, and hence the same framework as for throughput is employed.

To reuse the throughput measurement unit, a new measurement signal must be generated. This signal holds the same level for the duration of the latency, as shown in Figure 6(b).

In the case of a synchronous circuit, the latency is the product of the throughput and the number of pipeline stages.
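The two latency definitions can be compared with a short sketch. It assumes that results leave the pipeline in the order in which the requests entered, so the i-th request is paired with the i-th complete event. The function names and the timestamps are hypothetical.

```python
def async_latencies(request_rise_ns, complete_rise_ns):
    """Per-operation latency (ns) of a self-timed pipeline.

    Assumes in-order completion: the i-th request rise is paired with
    the i-th complete event.
    """
    return [c - r for r, c in zip(request_rise_ns, complete_rise_ns)]


def sync_latency(clock_period_ns, pipeline_stages):
    """Latency (ns) of a synchronous pipeline: period times number of stages."""
    return clock_period_ns * pipeline_stages


# Hypothetical event times for three operations in a self-timed pipeline.
latencies_ns = async_latencies([0, 55, 110], [70, 128, 175])

# Hypothetical 20 ns clock period on a four-stage pipeline.
latency_sync_ns = sync_latency(clock_period_ns=20, pipeline_stages=4)
print(latencies_ns, latency_sync_ns)
```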

## 3. BOARD ARCHITECTURE

The prototype is based on an earlier implementation [12]. The main motivation for a new board is the change of FPGA device: the earlier board used a device of the 4000 family, while the new board uses a device of the Spartan family.

To automate the measurements described above, a prototype board has been designed that integrates all the measurement blocks. However, the framework that obtains the latency and throughput is implemented in the FPGA device, so its units are not present on the board. The board is shown in Figure 7, in which the following blocks can be distinguished:

- An FPGA.
- A RISC microcontroller.
- Two memory devices: compact flash and SDRAM.
- A codec chip.
- Three interfaces: Ethernet, USB and expansion socket.
- Measurement units.

This board must be able to implement purely digital systems and systems on chip. The role of each part is as follows. The FPGA device implements the digital circuits and the framework that obtains the timing parameters.

The microcontroller performs the calculations corresponding to the measurements. It also loads the circuit into the programmable device. For this reason, an operating system had to be implemented in order to use the instructions of the microcontroller.

The memory devices store the file corresponding to the circuit. They also serve as the memory system of the microprocessor in the case of systems on chip.



Figure 7: Board architecture

The interfaces and the codec basically have two roles. First, they provide communication with the host when the FPGA is programmed. Second, they are used when a system on chip is implemented. The codec is only used in audio and video circuits.

## 4. RESULTS

Several digital circuits have been implemented on the prototype board in order to test the measurement units. The circuits considered are both synchronous and asynchronous, with different complexity levels: a multiplier (synchronous and asynchronous versions) and a VGA generator.

The synchronous version of the multiplier is a 4-bit multiplier with four pipeline stages, one for each partial sum. The asynchronous multiplier has the same characteristics as the synchronous one. The asynchronous architecture is based on micropipelines [13], and the capture-pass latch is implemented with the scheme found in [reference not found].

To show the behavior of the power measurement unit, the waveforms of the supply voltage and the control line are shown in Figure 8. They were obtained with a Promax OD-581 digital storage oscilloscope. Two operating zones can be distinguished. When the control signal is low, the FPGA supply line is connected to the supply source; this is the normal zone of operation. When the control signal is high, the switch cuts this connection, and hence the FPGA supply is provided by the capacitor. The voltage decreases as the capacitor discharges. To avoid erroneous operation of the FPGA, the control signal must reconnect the supply voltage before the 10% limit (indicated by the manufacturer) is exceeded.

This control signal is supplied by the microcontroller depending on the supply line of the FPGA. The microcontroller is responsible for avoiding a malfunction of the FPGA due to a supply voltage that is too low.

In the case of Figure 8, the power consumption is 0.54 mW, with  $C_{DD}$ ,  $\Delta V$  and  $\Delta T$  equal to 0.8 $\mu$F, 400 mV and 118 $\mu$s, respectively.
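The supervision performed by the microcontroller can be sketched as a simple threshold check on sampled supply voltages. This is an illustrative model, not the firmware of the board. It assumes that the 10% limit means a drop of 10% from the nominal voltage, it adds a hypothetical 2% guard band, and it uses a hypothetical 3.3 V nominal supply with made-up samples.

```python
def reconnect_point(samples, v_nominal, max_drop=0.10, guard=0.02):
    """Return (delta_t_us, delta_v) at the moment the supply is reconnected.

    `samples` is a time-ordered list of (time_us, v_dd) pairs taken while
    the supply is disconnected. The supply is reconnected at the first
    sample at or below the threshold, placed `guard` above the assumed
    limit of a `max_drop` fraction below nominal.
    """
    threshold = v_nominal * (1.0 - max_drop + guard)
    t_start, v_start = samples[0]
    for t_us, v_dd in samples[1:]:
        if v_dd <= threshold:
            return t_us - t_start, v_start - v_dd
    # Threshold never reached: use the whole observation window.
    t_last, v_last = samples[-1]
    return t_last - t_start, v_start - v_last


# Hypothetical discharge samples (time in us, supply voltage in V).
samples = [(0, 3.30), (20, 3.22), (40, 3.14), (60, 3.06), (80, 2.98), (100, 2.90)]
delta_t_us, delta_v = reconnect_point(samples, v_nominal=3.3)
print(delta_t_us, round(delta_v, 2))
```

The returned pair is the $\Delta T$ and $\Delta V$ needed for the power calculation. The guard band leaves some room for the sampling interval, since the voltage keeps falling between two samples.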



Figure 8: Waveform of the signals in the unit of power consumption measurement

With respect to the noise measurement unit, the waveform of the supply plane, obtained with the oscilloscope, is shown in Figure 9. The peak of the supply voltage, one of the measurements of noise, can be seen in it.



Figure 9: Noise in supply plane

The results obtained for the test bench implementations are shown in Table 1. Note the increase in hardware resources in the asynchronous version of the multiplier. This increase is due to the implementation of the communication protocol, but the use of this protocol means that the increase in noise is not high. The power consumption, however, increases along with the hardware resources.

| Circuit                 | No. of CLBs | No. of FFs | No. of LUTs | Peak of supply<br>voltage (mV) | Power consumption<br>(mW) |
|-------------------------|--------|-------|--------|---------------------------------|----------------------------|
| Synchronous multiplier  | 49     | 64    | 41     | 25.6                            | 38                         |
| Asynchronous multiplier | 69     | 81    | 66     | 34.4                            | 87                         |

Table 1: Results of testbench
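The relative overhead of the asynchronous multiplier can be computed directly from Table 1. The short script below is only an arithmetic aid; the dictionary keys are hypothetical names for the table columns.

```python
# Values copied from Table 1.
table_1 = {
    "Synchronous multiplier": {
        "clb": 49, "ff": 64, "lut": 41, "peak_mv": 25.6, "power_mw": 38,
    },
    "Asynchronous multiplier": {
        "clb": 69, "ff": 81, "lut": 66, "peak_mv": 34.4, "power_mw": 87,
    },
}


def relative_increase(sync_row, async_row):
    """Percentage increase of each column, asynchronous over synchronous."""
    return {
        key: (async_row[key] - sync_row[key]) / sync_row[key] * 100.0
        for key in sync_row
    }


increase = relative_increase(
    table_1["Synchronous multiplier"], table_1["Asynchronous multiplier"]
)
for key, pct in increase.items():
    print(f"{key}: +{pct:.1f}%")
```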

## 5. CONCLUSIONS

This work concludes a first phase: the implementation of a framework to automate the characterization of FPGA-based digital circuits. Automation has been achieved for the characterization of power consumption and noise. For the timing parameters, automation requires implementing additional digital circuits that make the circuit operate at maximum speed.

This prototype board has been tested using several circuits as a test bench, including synchronous and asynchronous versions. In these cases, the synchronous version shows better behavior than the asynchronous one.

## REFERENCES

- 1. QT Ho, JB Rigaud, L Fesquet, M Renaudin, R Rolland, "Implementing Asynchronous Circuits on LUT Based FPGAs", Proceedings of FPL, 2002.
- 2. AH Jackson, AM Tyrrell, "Asynchronous Embryonics", Proceedings of 3rd NASA/DoD Workshop on Evolvable Hardware, 2001.

- 3. F. N. Najm, S. Goel, and I. N. Hajj, "Power estimation in sequential circuits," ACM/IEEE Design Automation Conference, pp. 635-640, 1995.
- 4. "A Simple Method of Estimating Power in XC4000XL/EX/E FPGAs", Xilinx Application Brief XBRF 014, June 1997.
- 5. H. Belhadj, B. Zahiri, A. Tai, "Power-sensitive design techniques on FPGA devices", Proceedings of the International IC Conference China 2001.
- 6. E. Kusse, J. Rabaey, "Low-Energy Embedded FPGA Structures", in Proceedings of ISLPED 98.
- 7. E. Todorovich, M. Gilabert, G. Sutter, S. Lopez-Buedo, and E. Boemo, "A Tool for Activity Estimation in FPGAs", Proc. of the 12th Field Programmable Logic and Application Conference (FPL 2002), Montpellier, France, September 2002.
- 8. J. McCardle, D. Chester, "Measuring an asynchronous processor's power and noise", Synopsys User Group Conference (SNUG), Boston, 2001.
- 9. L. Smith, R. Anderson, D. Forehand, T. Pelc, T. Roy, "Power Distribution System Design Methodology and Capacitor Selection for Modern CMOS Technology", in IEEE Transactions on Advanced Packaging, August 1999, pp. 284-291.
- 10. J. Rius, A. Peidro, S. Manich, R. Rodriguez, "Power and Energy Consumption of CMOS Circuits: Measurement Methods and Experimental Results" in Proceedings of the DCIS 2003 Conference.
- 11. L. Smith, "Simultaneous Switch Noise and Power Plane Bounce for CMOS Technology", in IEEE Electrical Performance of Electronic Packaging (EPEP) Conference, San Diego, CA, Oct 17-25, 1999.
- 12. M. Sanchez, J. Naharro; "Placa Prototipo para el Desarrollo de Test Automatizados de Circuitos en FPGAs" in III Workshop on Reconfigurable Computing and Applications, Madrid, September 2003.
- 13. I.E. Sutherland, "Micropipelines", Communications of the ACM, vol. 32, no. 6, June 1989, pp. 720-738.