THEORY

DESIGN AND DEVELOPMENT OF DIGITAL PULSE COMPRESSION MATCHED FILTER

A dissertation submitted in partial fulfilment of the Bachelor of Engineering in Electronics and Communication Engineering from Mangalore University

Submitted by KIRAN P BRINDA GANESH MALI MANISH

KARNATAKA REGIONAL ENGINEERING COLLEGE

Table of Contents

ABSTRACT

1. THEORY

1.1 Time Delay Estimation (TDE)

1.2 Radar overview

1.3 Principle of Pulse Compression

1.4 Matched Filter

2. MATCHED FILTER DESIGN

3. IMPLEMENTATION AND TOOLS USED

3.1 Chip description

3.2 Tools Utilised

4. SIGNIFICANCE AND FUTURE WORK
References

Appendix A-Map Report

Appendix B-Pin Report

Appendix C-Post Layout Timing Report

List of Figures

Figure 1-1: Generalised Cross Correlator for Passive TDE. For active TDE H₁=H₂=1.

Figure 1-2: Matched Filter

Figure 2-1: Design Hierarchy

Figure 3-1: Pin Diagram of the chip

Figure 3-2: Internal structure of the chip

Abstract

The choice for a radar signal has been governed by various factors including power considerations, maximum range and resolution distance. The search for a waveform that satisfies these criteria in an optimal fashion has always been on. Pulse compressed coding has emerged as one such solution. Pulse compressed code signals are detected on return with the help of matched filters. This project is an implementation of a matched filter stage for a pulse compressed and coded radar signal detector using Xilinx FPGAs.

THEORY

Time Delay Estimation (TDE)

Time delay estimation (TDE) or time of arrival (TOA) is a basic tool in statistical signal processing. Applications of TDE follow from the simple relationship given by

R= v t

Equation —1

where

R, is the distance of the object,

v, the velocity of the wavefield sent to the object, and

t, the time taken for the wave to reach the object.

For example, in range measurements for radar or sonar, v is assumed known and the target's range is determined by measuring t, the time required for the transmitted signal to propagate to a target and be reflected back to the point of transmission. Conversely, in velocity measurements, as in biomedical or nuclear engineering applications, R is assumed known and t, the time required for a signal to travel the distance R, is measured.

In practice one seeks to measure the delay between two noisy versions of a signal. Unfortunately there is no single measurement procedure appropriate for all TDE scenarios. This fact, combined with practical importance of measuring time delay in so many different applications, is why TDE has received so much attention over the last three decades.

TDE types

There are two types of TDE. In both, x₁(t) and x₂(t) denote the two observed signals, s(t) the underlying signal, a an amplitude scale factor, n₁(t) and n₂(t) the additive noise in the two observations, and t" the estimate of the delay.

Active TDE: x₁(t) and x₂(t) correspond to the transmitted and received signals, respectively. It is apt to assume that n₁(t) = 0 and that s(t) is a known deterministic signal. For the nominal active scenario, where n₂(t) is a realization of a white, Gaussian random process, the (asymptotically optimum) maximum likelihood (ML) TDE processor is the matched filter, which cross-correlates x₁(t) and x₂(t). The estimate t" is the time that corresponds to the maximum of the matched filter output.
Passive TDE: x₁(t) and x₂(t) are two versions of the received signal. It is usually assumed that a = 1, that s(t) is a realization of a stationary, Gaussian random process, and that n₁(t) and n₂(t) are realizations of mutually uncorrelated, zero mean, white, stationary, Gaussian random processes that are also uncorrelated with the signal. The ML processor for this scenario is the generalized cross correlator (GCC). As shown in Figure 1-1, the GCC cross-correlates appropriately prefiltered versions of the sensor outputs, forming t" as the time corresponding to the maximum GCC output. Moreover, the GCC has been shown to be optimum even in nonasymptotic conditions.

Intuitively, an estimator for t should seek the best ‘match’ between x₂(t) and a delayed version of x₁(t). In both active and passive TDE, some form of cross correlation has been proven to be the optimum measure for matching under Gaussian conditions. Another possible measure for matching could be the ‘error signal’:

e(t) = x₂(t) - x₁(t-t’)

Equation —4

An appropriate estimate of time delay, t", would be the t’ that minimizes this error in some sense. It can be shown that for the Gaussian scenarios described above the optimum TDE processor minimizes the mean square error (MSE), E[|e(t)|²]. However, straightforward manipulation of the expression for the MSE shows that the minimum MSE and the correlator based processors are equivalent: expanding the square leaves the energies of the two signals plus a term proportional to the negative of their cross correlation, so minimizing the MSE maximizes the correlation.
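The equivalence between the correlator and the minimum MSE processor can be checked numerically on sampled signals. The sketch below is only an illustration under stated assumptions: the Barker-13 sequence stands in for s(t), and the delay of 9 samples, the noise level, the signal length and all function names are hypothetical choices that do not come from the project.

```python
import random

# Barker-13 code as +1/-1 values, used here as the known signal s(t).
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def make_signals(true_delay, length=40, noise_std=0.2, seed=1):
    """Build x1 (clean reference) and x2 (delayed, noisy copy). Illustrative values."""
    rng = random.Random(seed)
    x1 = BARKER_13 + [0.0] * (length - len(BARKER_13))
    x2 = [rng.gauss(0.0, noise_std) for _ in range(length)]
    for n, value in enumerate(BARKER_13):
        x2[n + true_delay] += value
    return x1, x2

def delay_by_correlation(x1, x2, max_lag):
    """Return the lag that maximizes the cross correlation of x2 with x1."""
    def correlation(lag):
        return sum(x2[n] * x1[n - lag] for n in range(lag, len(x2)))
    return max(range(max_lag + 1), key=correlation)

def delay_by_mse(x1, x2, max_lag):
    """Return the lag that minimizes the mean square of e = x2[n] - x1[n - lag]."""
    def mse(lag):
        errors = [(x2[n] - x1[n - lag]) ** 2 for n in range(lag, len(x2))]
        return sum(errors) / len(errors)
    return min(range(max_lag + 1), key=mse)

x1, x2 = make_signals(true_delay=9)
print(delay_by_correlation(x1, x2, max_lag=20), delay_by_mse(x1, x2, max_lag=20))
```

With these settings both estimators select the same lag, the 9-sample delay that was inserted, which is the behaviour the equivalence argument predicts.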

Radar overview

An elementary form of radar consists of a transmitting antenna emitting electromagnetic radiation generated by an oscillator, a receiving antenna, and an energy detecting device or receiver. A portion of the transmitted signal is intercepted by a target, and reflected in all directions. The energy that is re-radiated in the direction of the radar is of prime importance. The receiving antenna collects the returned energy and delivers it to the receiver, where it is processed to detect the presence of a target and to determine its position and its relative velocity. The distance of the target is determined by measuring the time taken for the signal to travel to the target and back. The direction, or angular position, of the target may be determined by the direction of arrival of the reflected wave-front.

The most common radar waveform is a train of narrow rectangular shaped pulses modulating a sine wave carrier. The distance is measured as a function of the time taken by the transmitting pulse to travel to the target and back. Since electromagnetic waves travel at the velocity of light, the distance is given by

R = c T/2

Equation —5

where

R is the range of the target,

T is the time taken by the transmitted pulse to travel to the target and return,

c is the velocity of the radar signal in space.
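As a quick numerical check of Equation 5, the following sketch converts a round-trip time into a range. The 100 microsecond echo delay is a hypothetical value chosen only for illustration, and the function name is not taken from the project.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def range_from_round_trip(round_trip_s, c=C):
    """Target range in metres from the round-trip time T, R = c T / 2."""
    return c * round_trip_s / 2.0

# Hypothetical echo received 100 microseconds after transmission.
print(f"{range_from_round_trip(100e-6) / 1000.0:.2f} km")
```

The factor of two accounts for the pulse covering the radar-to-target distance twice.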

Radar Receiver Operations

Broadly, the radar problem can be stated as:

Detection: To detect the presence or absence of a target signal in the presence of background noise.
Estimation: To determine the parameters of the target, mainly range, azimuth, elevation, velocity and acceleration.

The maximum range detection capability of a radar depends directly on the transmitter power and the duty cycle of the transmitted wave, as given in the equation below:

Maximum range = k * P_avg

Equation —6

where

P_avg = P_p * duty ratio

= P_p * t / T

= P_p * t * F

Equation —7

where

P_p is the peak transmitted power,

t is the sub pulse width of the transmitted wave,

T is the pulse repetition period,

F = 1/T is the pulse repetition frequency (PRF).

Hence the maximum detectable range can be improved by increasing P_p, t or F. Let us consider the three possibilities in detail.

Increasing P_p: High voltages cause insulation breakdown and also complicate transmitter design.
Increasing t: The echoes of two targets are separated by a time delay proportional to the distance between the two targets. When the targets are closely spaced, a long pulse causes the two echoes to merge, so that the two targets are mistaken for one. The resolution, or ability to distinguish two closely spaced targets, could be improved by decreasing t, but this would in turn decrease the maximum detectable range. Improving range at the cost of resolution is therefore an unacceptable proposition.
Increasing PRF: The PRF is inversely related to the maximum unambiguous range, so doubling the PRF halves the maximum unambiguous range. Hence increasing the PRF is of no help as far as improving the range of the radar goes.
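The trade-off for the PRF can be made concrete with a short calculation. The sketch below evaluates Equation 7 together with the standard textbook relation for the maximum unambiguous range, c / (2 F), which is not stated explicitly in this report. The peak power, pulse width and PRF values are hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s

def average_power(peak_power_w, pulse_width_s, prf_hz):
    """Equation 7: P_avg = P_p * t * F."""
    return peak_power_w * pulse_width_s * prf_hz

def max_unambiguous_range(prf_hz, c=C):
    """Textbook relation c / (2 F): an echo must return before the next pulse is sent."""
    return c / (2.0 * prf_hz)

# Hypothetical transmitter: 1 MW peak power, 1 microsecond pulses.
for prf in (1_000.0, 2_000.0):
    p_avg = average_power(peak_power_w=1e6, pulse_width_s=1e-6, prf_hz=prf)
    r_km = max_unambiguous_range(prf) / 1000.0
    print(f"PRF {prf:.0f} Hz: P_avg = {p_avg:.0f} W, unambiguous range = {r_km:.1f} km")
```

Doubling the PRF doubles the average power but halves the unambiguous range, which is why raising the PRF alone does not extend the usable range.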

Researchers therefore arrived at a solution that improves the detection range without sacrificing range resolution or putting undue constraints on transmitter peak power. This solution involves sending out a coded signal, the coding being done with the help of pulse compression.

Principle of Pulse Compression

Nature of waveform

Pulse compression involves the transmission of a long coded pulse and processing of the received echo to obtain a relatively narrow pulse. A long pulse may be obtained from a narrow pulse. A narrow pulse contains a large number of frequency components with a precise phase relationship between them. If the relative phases are changed by a phase-distorting filter, the frequency components combine to produce a stretched, or expanded, pulse. The expanded pulse is then transmitted. The received echo is processed in the receiver by a compression filter. The compression filter readjusts the relative phases of the frequency components so that a narrow, or compressed, pulse is again produced.

An example of pulse compression is phase-coded pulse compression. In a phase-coded waveform the long pulse is subdivided into a number of shorter subpulses of equal duration. Each subpulse is then transmitted with a particular phase in accordance with a phase code (usually binary). The phase of the transmitted signal alternates between 0 and 180 degrees in accordance with the sequence of elements, 1s and 0s (+1s and -1s), in the phase code. The phase code used is generally a standard code that has proved to provide the best resolution and least ambiguity in determining the target parameters. The codes used can be either Barker codes (listed below) or some form of pseudo-random code. The former are restricted to a maximum length of 13 bits, while the latter can be of any length. Commercial radars use codes of length nearly 50 to 60 bits.

Code Length	Code Elements	Sidelobe level (dB)
2	10, 11	-6.0
3	110	-9.5
4	1101,1110	-12.0
5	11101	-14.0
7	1110010	-16.9
11	11100010010	-20.8
13	1111100110101	-22.3

Table -1 Barker Codes
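The sidelobe levels in the table can be reproduced by computing the aperiodic autocorrelation of each code and comparing the largest sidelobe with the main peak. The sketch below does this for the 13-bit code. The function names are illustrative, and the mapping of code element 1 to +1 and 0 to -1 follows the convention described above.

```python
import math

BARKER_13 = "1111100110101"

def to_bipolar(bits):
    """Map code elements '1' and '0' to +1 and -1."""
    return [1 if b == "1" else -1 for b in bits]

def aperiodic_autocorrelation(seq):
    """Autocorrelation for shifts 0 .. len(seq) - 1 with no wrap-around."""
    n = len(seq)
    return [sum(seq[i] * seq[i + k] for i in range(n - k)) for k in range(n)]

def peak_sidelobe_db(bits):
    """Largest sidelobe magnitude relative to the main peak, in dB."""
    acf = aperiodic_autocorrelation(to_bipolar(bits))
    worst_sidelobe = max(abs(v) for v in acf[1:])
    return 20.0 * math.log10(worst_sidelobe / acf[0])

print(aperiodic_autocorrelation(to_bipolar(BARKER_13)))
print(round(peak_sidelobe_db(BARKER_13), 1))
```

For the 13-bit code the main peak equals the code length, 13, and no sidelobe exceeds magnitude 1, giving 20 log10(1/13), about -22.3 dB, in agreement with the last row of the table.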

Correlation : Detection of the waveform

This brings up the question of how the time of arrival of the reflected signal is determined. The basis of this determination involves the computation of the correlation between the two signals, the outgoing and incoming. Correlation is a measure of the similarity or relatedness between two waveforms. For two waveforms v₁ and v₂ the correlation is mathematically given by the following equation

R(t) = \lim_{T \to \infty} \int_{-T/2}^{T/2} v_1(t') \, v_2(t + t') \, dt'

Equation —8

If the two waveforms have the same fundamental period T₀, then T can be replaced by T₀ and the average cross correlation computed over one period. Note that the correlation depends on the time shift t given to the waveforms: the shift results in maximum correlation at some points and zero at others. Two waveforms are considered coherent if they are related, and uncorrelated or incoherent if there is no match between them at any time shift. Auto correlation is the measure of the coherence of a waveform with itself. It is maximum when the time shift is zero or, for a periodic waveform, a multiple of its period. In radar signalling the received and transmitted signals are basically the same, so it is the auto correlation that is computed. The two signals, the one broadcast and the one received, are matched continuously. The instant at which they match, that is, at which the auto correlation is maximum, is taken as the instant at which the signal arrived. This computation is done with the help of a matched filter.

Matched Filter

A pulse compression radar is a practical implementation of a matched filter system. A matched filter is the part of the receiver that is specifically designed to maximize the output signal-to-noise ratio. The block diagram of the matched filter is shown in Figure 1-2.

Figure 1-2: Matched Filter

The length M of the sequence is equal to the number of subpulses in the sequence. The sequence is incorporated into the signal by means of phase coding.

The phase coded received signal enters the taps one by one. The signal is unchanged if the coefficient of the tap is 1 and inverted in phase if it is -1. The multiplication coefficients are the code in reverse order. The products are summed and the output obtained has maximum signal to noise ratio.
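The tapped delay line described above can be modelled in a few lines. This is a behavioural sketch, not the project's VHDL: the function name, the noise-free input and the echo position are assumptions made for illustration.

```python
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def matched_filter(samples, code):
    """FIR filter whose tap coefficients are the code in reverse order."""
    taps = code[::-1]
    delay_line = [0] * len(taps)
    output = []
    for x in samples:
        # The newest sample enters tap 0; older samples move one tap along.
        delay_line = [x] + delay_line[:-1]
        output.append(sum(c * d for c, d in zip(taps, delay_line)))
    return output

# Hypothetical noise-free echo that starts at sample index 5.
echo_start = 5
received = [0] * echo_start + BARKER_13 + [0] * 5
output = matched_filter(received, BARKER_13)
peak_index = max(range(len(output)), key=lambda n: output[n])
print(peak_index, output[peak_index])
```

The output peaks when the last subpulse has entered the delay line, that is at index echo_start + M - 1, and the peak value equals the code length M. Reversing the code in the taps is what turns the convolution performed by an FIR filter into a correlation.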

MATCHED FILTER DESIGN

The matched filter used for time delay estimation is a generalized cross correlator. It is in the form of a simple FIR filter, with no feedback path, as shown in Figure 1-2. It correlates the samples with the code at each clock edge. This has to be done within 200 ns, which is the sub-pulse width of the radar signal.

A matched filter can be implemented as a software program or as a hardware device. A software implementation can be developed sooner, but in real-time operation the embedded system executing the correlation takes longer to execute, which limits the performance of the radar signal processor of which this matched filter is a component. Our matched filter was therefore designed as a single chip using the Very High Speed Integrated Circuit Hardware Description Language [VHDL] and implemented on a Xilinx Field Programmable Gate Array [FPGA], the XC4010PC84. FPGAs are a class of reconfigurable hardware devices generally used for prototyping Application Specific Integrated Circuits [ASICs] or for developing hardware with low Non-Recurring-Engineering [NRE] costs. These are the reasons for choosing an FPGA implementation.

Figure 2-1: Design Hierarchy

The structure of the matched filter was conceptualized before the implementation stage, which made a bottom-up design procedure more appropriate. The design consists of the following components, as shown in the design hierarchy in Figure 2-1.

MACHFLT: This is the top-level entity, in which all the components are combined. It consists of four components: SHIFTREG, DATPROC, CSASTG and CDREG.

SHIFTREG: This is a twelve-bit, sixteen-stage shift register, i.e. it takes in 12 bits of data at each clock pulse and shifts them right by one stage on the next clock. The sixteen stages serve as the tap delays. It is designed as a behavioral model.

DATPROC: This processes the outputs of the taps. It multiplies the samples by the code bits and masks the unused tap outputs.

CSASTG: This component is the adder stage, which adds all sixteen twelve-bit numbers. Using carry-propagate adders to add all the numbers would limit the speed, so the adder stage is implemented using carry-save adders. Each carry-save adder outputs a sum vector and a shifted carry vector, which are treated as separate numbers in the next stage and are again added with carry-save adders. The final carry and sum vectors are added with a carry-propagate adder.

CDREG: These are 16-bit registers used for storing the code bits and masking bits.
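The carry-save principle used in CSASTG can be illustrated with a small integer model. The sketch assumes non-negative operands and one possible grouping of the operands into threes at each level. The exact arrangement of adders in the chip is not given here, so the tree below is hypothetical.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: return (sum_vector, shifted_carry_vector) with a + b + c == s + cy."""
    sum_vector = a ^ b ^ c                               # bitwise sum, no carry propagation
    carry_vector = ((a & b) | (b & c) | (a & c)) << 1    # majority bits, shifted left by one
    return sum_vector, carry_vector

def csa_tree_sum(values):
    """Reduce the operands with carry-save adders, then do one carry-propagate add."""
    operands = list(values)
    while len(operands) > 2:
        next_level = []
        full = len(operands) - len(operands) % 3
        for i in range(0, full, 3):
            next_level.extend(carry_save_add(*operands[i:i + 3]))
        next_level.extend(operands[full:])   # leftovers pass unchanged to the next level
        operands = next_level
    return sum(operands)                     # the single carry-propagate addition

# Sixteen hypothetical 12-bit tap values.
tap_values = list(range(100, 4100, 250))
print(csa_tree_sum(tap_values), sum(tap_values))
```

Each carry-save level has the delay of a single full adder regardless of word length, because no carry ripples across bit positions. Only the final two-operand addition pays the carry-propagation delay.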

The input to the implemented matched filter, which uses a 16-bit code, is the 12-bit digital samples from the Analog to Digital Converter. Each sample is shifted through a set of sixteen 12-bit parallel register stages (the output of one stage is given to the input of the next stage). At any instant, the sample corresponding to each bit of the pulse code is available in one register stage. These shift registers act as the taps in the block diagram. The outputs of the taps are multiplied by the code (+1s and -1s). As the numbers involved are large, multiplying by -1 is approximated by complementing the 12-bit sample. This multiplication is done with an array of 12 XNOR gates (one for each bit of the sample), with the output determined by the code bit at that stage: if the code bit is ‘0’ the sample is complemented (multiplied by -1), otherwise it is passed unchanged. The outputs of these 16 XNOR arrays, or multipliers, are added at the next level to obtain the auto correlation function. The adder must add 16 numbers of 12-bit length, which is done with the carry-save adder (CSA) stage. The final result is the auto correlation function at the sampled instant.
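The XNOR approximation of multiplication by -1 can be checked with a bit-level model. The sketch below makes several assumptions that are not stated in the text: samples are interpreted as 12-bit two's complement numbers, a mask bit of 1 means the tap is used, and the sample values are invented. Under the two's complement assumption, complementing a sample x yields -x - 1, so each active tap with code bit 0 contributes an error of exactly one least significant bit.

```python
WIDTH = 12
WORD_MASK = (1 << WIDTH) - 1

def xnor_multiply(sample_word, code_bit):
    """XNOR every bit of a 12-bit word with the code bit (1 keeps, 0 complements)."""
    replicated = WORD_MASK if code_bit else 0
    return ~(sample_word ^ replicated) & WORD_MASK

def to_signed(word):
    """Interpret a 12-bit word as a two's complement number (an assumption)."""
    return word - (1 << WIDTH) if word & (1 << (WIDTH - 1)) else word

def correlate(samples, code_bits, mask_bits):
    """Sum the XNOR products of all unmasked taps."""
    total = 0
    for sample, code_bit, mask_bit in zip(samples, code_bits, mask_bits):
        if mask_bit:  # hypothetical convention: 1 means the tap is in use
            total += to_signed(xnor_multiply(sample & WORD_MASK, code_bit))
    return total

# Hypothetical tap contents, a 13-bit Barker code and a mask for the 3 unused taps.
samples = [100, -200, 300, -400, 50, -60, 70, -80, 9, -10, 11, -12, 13, 0, 0, 0]
code_bits = [1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0]
mask_bits = [1] * 13 + [0] * 3

exact = sum(s if c else -s for s, c, m in zip(samples, code_bits, mask_bits) if m)
approx = correlate(samples, code_bits, mask_bits)
print(exact, approx, exact - approx)
```

In this example the difference between the exact and approximate sums equals the number of active taps whose code bit is 0, which is small compared with the full-scale 12-bit sample values. This is the sense in which the approximation is acceptable when the numbers involved are large.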

IMPLEMENTATION AND TOOLS USED

The project was implemented on an XC4010E, which has 10000 equivalent gates. The design consumed 9018 equivalent gates and utilised 388 out of 400 CLBs and 61 out of 61 IOBs.

Chip description

The pin diagram of the implemented chip is shown in Figure 3-1.

Figure 3-1: Pin Diagram of the chip

Pin locations for the chip implemented on the FPGA are given in the Pin Report [Appendix B]. Pin descriptions are as follows:

SMPL_0-11: These are the 12-bit input samples to the chip. They enter the shift register [taps].

CDIN_0-15: These are the pins through which the code is input to the chip.

MA_0-15: These are the masking bits. Both mask and code bits are input to a parallel register.

CDLD: This is the enabling signal for the code and mask inputs.

RST: This is the system reset, which resets all the flip-flops in the chip.

OP_0-13: These are the output pins.

The internal structure of the chip is shown in Figure 3-2.

Figure 3-2: Internal structure of the chip

The various modules are as explained in the design (Chap. 2) and the input/output lines are as explained in the chip diagram.

Tools Utilised

The tools used for various stages of the chip development are as follows:

Synopsys VHDLANALYSER and DEBUGGER - Functional simulation
Synopsys FPGA EXPRESS - Synthesis
Xilinx back-end tools - Implementation

Testing

The design was implemented and downloaded onto the FPGA demo board. One of the FPGAs on the board was configured with the design, and the accompanying FPGA was configured to generate test vectors for the design FPGA. An X-checker cable was used to download the design and also for single stepping and readback. The LEDs on the board were used to indicate the output signals. The chip was tested with known input vectors supplied by the test-vector generator FPGA, and the design was found to work satisfactorily.

Significance and scope for Future Work

The algorithm used for developing the time delay estimation system is an old one, but it retains its significance in radar signal processing. The system was modelled in Matlab and its performance evaluated for signals in the presence of different levels of random noise. The simulation results are in the appendix. There are many other algorithms that give improved performance in the presence of both Gaussian and non-Gaussian noise. However, these algorithms require the use of active filters to modify the code with an error signal that depends on the noise signal. This requires multipliers at two levels of the design, which adds a further delay of around 120-150 ns (an estimate based on fast-multiplier FPGA design projects undertaken by undergraduate students at KREC). The sub-pulse width would therefore have to be at least 120 ns longer than that achievable in this project, where the multipliers are optimised to XNOR gates. As the sub-pulse width increases, the resolution capacity of the radar decreases.

The estimated performance achievable with this matched filter is a resolution capacity of 60 m. As given in the theory, the longest Barker code available is 13 bits. This matched filter supports all the Barker codes. Features included in the chip, such as masking and code load, make the filter programmable for different codes of different lengths.

The range of the radar can be further increased without forgoing resolution capacity by using longer codes with the same sub-pulse width. Such codes established for radar TDE are pseudo-random codes, typically 54 bits long. This provides scope for future improvement upon this project.

References

1. M. Skolnik, "Introduction to Radar Systems", McGraw-Hill, 2nd edition, 1980.

2. "Special issue on Time delay estimation", IEEE Transactions on ASSP, Vol. ASSP-29, 1981.

3. John Villasenor and Brad Hutchings, "The Flexibility of Configurable Computing", IEEE Signal Processing, vol. 15, no. 5, Sept 1998.

4. Mazor & Langstrat, "A guide to VHDL", Kluwer Academic Publisher, 1st edition, 1995.

5. Douglas Perry, "VHDL", McGraw-Hill, 2nd edition, 1995.

Appendix A

MAP REPORT

Design Information

------------------

Target Device : x4010e

Target Package : pc84

Target Speed : -1

Mapper Version : xc4000e -- M1.5.19

Design Summary

--------------

Number of errors: 0

Number of warnings: 1

Number of CLBs: 388 out of 400 97%

CLB Flip Flops: 224

4 input LUTs: 629

3 input LUTs: 216

Number of bonded IOBs: 61 out of 61 100%

IOB Flops: 0

IOB Latches: 0

Number of clock IOB pads: 2 out of 8 25%

Number of primary CLKs: 2 out of 4 50%

13 unrelated functions packed into 13 CLBs.

(3% of the CLBs used are affected.)

Total equivalent gate count for design: 6090

Appendix B

PIN REPORT

Pinout by Pin Name:

+--------------------------------------------------------------------------------------------

| Pin Name | Direction|Pin Number|

+--------------------------------------------------------------------------------------------

| CDIN<0> | INPUT | P18 |

| CDIN<10> | INPUT | P48 |

| CDIN<11> | INPUT | P51 |

| CDIN<12> | INPUT | P39 |

| CDIN<13> | INPUT | P38 |

| CDIN<14> | INPUT | P68 |

| CDIN<15> | INPUT | P67 |

| CDIN<1> | INPUT | P19 |

| CDIN<2> | INPUT | P46 |

| CDIN<3> | INPUT | P47 |

| CDIN<4> | INPUT | P28 |

| CDIN<5> | INPUT | P26 |

| CDIN<6> | INPUT | P10 |

| CDIN<7> | INPUT | P9 |

| CDIN<8> | INPUT | P81 |

| CDIN<9> | INPUT | P79 |

| CDLD | INPUT | P13 |

| CLK | INPUT | P35 |

| MA<0> | INPUT | P14 |

| MA<10> | INPUT | P69 |

| MA<11> | INPUT | P71 |

| MA<12> | INPUT | P61 |

| MA<13> | INPUT | P62 |

| MA<14> | INPUT | P59 |

| MA<15> | INPUT | P66 |

| MA<1> | INPUT | P16 |

| MA<2> | INPUT | P44 |

| MA<3> | INPUT | P45 |

| MA<4> | INPUT | P27 |

| MA<5> | INPUT | P29 |

| MA<6> | INPUT | P8 |

| MA<7> | INPUT | P7 |

| MA<8> | INPUT | P83 |

| MA<9> | INPUT | P82 |

| OP<0> | OUTPUT | P65 |

| OP<10> | OUTPUT| P58 |

| OP<11> | OUTPUT | P57 |

| OP<12> | OUTPUT | P49 |

| OP<13> | OUTPUT | P50 |

| OP<1> | OUTPUT | P23 |

| OP<2> | OUTPUT | P20 |

| OP<3> | OUTPUT | P60 |

| OP<4> | OUTPUT | P84 |

| OP<5> | OUTPUT | P80 |

| OP<6> | OUTPUT | P78 |

| OP<7> | OUTPUT | P70 |

| OP<8> | OUTPUT | P72 |

| OP<9> | OUTPUT | P56 |

| RST | INPUT | P77 |

| SMPL<0> | INPUT | P24 |

| SMPL<10> | INPUT | P36 |

| SMPL<11> | INPUT | P37 |

| SMPL<1> | INPUT | P25 |

| SMPL<2> | INPUT | P17 |

| SMPL<3> | INPUT | P15 |

| SMPL<4> | INPUT | P5 |

| SMPL<5> | INPUT | P6 |

| SMPL<6> | INPUT | P4 |

| SMPL<7> | INPUT | P3 |

| SMPL<8> | INPUT | P40 |

| SMPL<9> | INPUT | P41 |

+------------------------------------------------------------------------------------------

| Dedicated or Special Pin Name | Pin Number |

+------------------------------------------------------------------------------------------

| /PROG | P55 |

| CCLK | P73 |

| DONE | P53 |

| GND | P21 |

| GND | P64 |

| GND | P43 |

| GND | P52 |

| GND | P12 |

| GND | P31 |

| GND | P1 |

| GND | P76 |

| M0 | P32 |

| M1 | P30 |

| M2 | P34 |

| TDO | P75 |

| VCC | P63 |

| VCC | P2 |

| VCC | P22 |

| VCC | P54 |

| VCC | P74 |

| VCC | P33 |

| VCC | P11 |

| VCC | P42 |

+--------------------------------------------------------------------+--------------+

Appendix C

POST LAYOUT TIMING REPORT

Timing summary:

---------------

Timing errors: 0 Score: 0

Constraints cover 186795 paths, 805 nets, and 2159 connections (100.0% coverage)

Design statistics:

Minimum period:14.607ns (Maximum freq:68.460MHz)

Maximum combinational path delay: 81.383ns

Maximum net delay: 13.628ns

---------------------------------------------------------