A Prototype FPGA Implementation of a Real-Time Delay-and-Sum Beamformer



A Prototype FPGA Implementation of a Real-Time Delay-and-Sum Beamformer

Graduation Project 1303

Angela P. Cuadros Castiblanco

December 11, 2013

Advisor: Dr. Cesar L. Nino

PONTIFICIA UNIVERSIDAD JAVERIANA BOGOTA

School of Engineering

Electronics Engineering Department


Contents

Dedication 2

Introduction 3

1 Delay-and-Sum Beam-Forming 4

2 Beam-Former Architecture Overview 6
  2.1 Data Acquisition Module 6
  2.2 ADC 6
  2.3 FPGA System 7
    2.3.1 Data Acquisition Controller 7
    2.3.2 Delay-and-Sum Beam-Forming Algorithm 7
    2.3.3 Low-Pass Filter and Down-Sampler 7
    2.3.4 Display System 7
  2.4 System Controller 8

3 FPGA Implementation 9
  3.1 Delay-and-Sum Beamforming Algorithm Implementation 9
  3.2 Pipeline Architecture Overview 11
    3.2.1 Data Acquisition 11
    3.2.2 Data Processing 11
    3.2.3 Display 12

4 Results 13
  4.1 MATLAB Implementation 13
  4.2 FPGA Implementation 15
    4.2.1 Prototype Specifications 16
    4.2.2 Fixed Signals Results 16
    4.2.3 Prototype Implementation Results 19

5 Conclusion 22

Glossary 23

List of Figures 25

List of Tables 26

Appendix 27


Dedication

The road that ends today with this graduation project has not been an easy one, and it would not have been possible without the support of those I mention in this dedication.

To You, God, for giving me the opportunity to live, and a wonderful family and friends. To those who supported me throughout my degree, with each of my projects, and especially with my graduation project; who fight day after day to make it possible for my brother and me to achieve our dreams. You know that this achievement, and every one I have obtained, is dedicated to you, my parents. Thank you for your unconditional support, your affection, your understanding (especially during the long days I spent at the university) and for the opportunity to fulfill the dream of becoming a Javeriana Electronics Engineer. I love you very much; I have no words to express how much I love and admire you.

To my brother, for taking an interest in my work, for listening to me, for making me laugh and for being present in my life. I will always be very proud of you, and I know that the road you are beginning at the Javeriana will be full of great achievements.

To my aunts and uncle Inés, Luz, Teresa and Jesús: without you everything would have been different. Thank you for the opportunity to study at what I consider the best university in the country. Your unconditional support throughout my life has been essential to achieving my dreams.

To my mother's parents, who are still living: thank you for your love and prayers, which I consider indispensable in my life, and thank you for being present in the happy moments and the difficult ones. I love you very much; grandmother, thank you. To my father's parents, who saw me begin this road and were proud of their grandchildren: sadly, you are no longer here, but you remain in my heart and I remember you often.

To my dear friend Johan: thank you so much for believing in me, for your affection and constant support, for long days of work and days of rest and joy, for teaching me so many things, for being you. I love you.

To my professors, thank you for trusting me; to my thesis advisor, Engineer Cesar Niño, thank you for your unconditional support in the development of my graduation project. To Engineer Jairo Hurtado, thank you very much for believing in me.

To my best friend Jhon: I never thought I would find a person like you, nor at this moment of my life, but you have become an important part of it. Thank you for understanding, for being there, for helping me with everything in my life. Thank you for long hours of nonsense talking and long hours of just getting to know each other; this has been an awesome year. Thank you for helping me get a little (rather, a lot) closer to my dreams; you were like an angel, as my mum calls you. Thank you!

To my family, my aunts and uncles, my friends Viviana, Ivan, Carolina, Julian, Sara, Angie, Sebastian and Pipe, and to everyone who, although not mentioned here, stood beside me through this process. Thank you very much.


Introduction

In applications such as directional transmission and reception in antennas [1], air traffic control [1], localization and classification of sources [1], fetal heart monitoring [2] and acoustic cameras [3], spatial filtering is essential to estimate parameters, measurements or the waveform of specific signals. A beamformer estimates a signal that comes from a specific direction in the presence of noise by means of spatial filters, sampling the wave field both in space and time [1]. In the current prototype, a delay-and-sum beamformer is implemented, since it is simpler and faster than other beamforming algorithms [4], important properties for a real-time implementation.

Acoustic cameras are essentially used to estimate the location, origin and intensity of sound sources. Among their main applications are noise reduction in a specific place, and quality control and detection of defects in machines, automobiles and planes [3]. The sensor array is an array of microphones used to acquire sound signals that, after being processed with sophisticated algorithms, result in an estimation of sound intensity. In the current beamformer implementation, the processing is applied to an acoustic camera because the test system and the sensor array can be easily implemented.

The spatial algorithms needed to implement the beamformer are highly complex and need high memory bandwidth, which is why they are often implemented off-line [5]. However, given the requirements of the applications already mentioned, where adjustments must be made based upon the results obtained, it becomes important to gather these results in real time, making an on-line, portable implementation necessary. As one of several alternatives, implementation on FPGAs allows the developer to design a system that is on-line, fast and portable [6]. Furthermore, FPGAs are a flexible tool that allows diverse changes and tests of the design being implemented on them; as a result, the prototype can be easily modified and adapted to different applications.

The general objective of the project discussed here is to implement spatial filtering algorithms on an FPGA, applied to the development of a beamformer prototype that displays sound intensity images. To this end, the following specific objectives are set: implement the delay-and-sum beamforming algorithm on an FPGA; implement the algorithm in MATLAB to obtain a sound intensity image; qualitatively evaluate the prototype implementation through the identification of an acoustic scene with 3 sound sources; and, by means of pre-established signals, verify the effectiveness of the implemented algorithm.

The remainder of this report is organized as follows. In Chapter 1, a theoretical description of a delay-and-sum beamformer is given. In Chapter 2, an overview of the prototype architecture is discussed. In Chapter 3, the FPGA implementation architecture is described. The results and specifications of the implementation are presented in Chapter 4. Chapter 5 presents the conclusion. Finally, the Appendix presents block diagrams, data sheets, state diagrams, peripheral configuration descriptions and the VHDL code used for the implementation.


Chapter 1

Delay-and-Sum Beam-Forming

The beamformer described here is based on a spatio-temporal filter integrated with an array of microphones used to obtain the signals to be processed. Assuming that the array is located in the far field of the acoustic source, the wave arriving at the array can be considered a plane wave. Under this assumption, the algorithm consists in focusing the array of sensors consecutively on different points of a delimited area in space, and then superimposing the signals obtained from each sensor.

Figure 1.1: Estimation of the intensity of the signal at a fixed point, by means of the delay-and-sum beamforming algorithm.

As shown in Fig. 1.1, m0, m1 and mn correspond to the microphones of the sensor array of the beamformer, s0[k], s1[k] and sn[k] are the digitized signals of each microphone, and n is the total number of microphones. The observed signal is the one obtained at a fixed point in space; for each microphone, it presents a delay in time, since the microphones are not located in the same position with respect to the position of the source. Therefore, in order to estimate the signal at this point, the signals obtained by each microphone are delayed a particular number of samples (delays δ0, δ1, ..., δn) and then superimposed, in order to obtain the aforementioned constructive interference.

The result of a beamformer is a function of position and frequency. The position is related to a 3-coordinate system that defines the wavefront. However, in the system developed, two coordinates define the wavefront and the third remains fixed. Each microphone i = 1, ..., n is a source of the sampled audio signal si[k], where k corresponds to the index of the sample. In order to compute the delay for each of the microphones at some point p, Equation (1.1) is used:

δi(p) = (α · fs / c) · ∥p − mi∥    (1.1)


where δi(p) is measured in samples and denotes the delay associated with the ith microphone and the fixed point p, which is the point in space being analyzed, fs is the sampling frequency, c is the speed of sound1, α is a constant that allows the designer to focus the array of microphones, p = [px, py, pz] are the coordinates of any point in the wavefront, and mi = [mx, my, mz] are the coordinates of the ith microphone in the array. In order to obtain constructive interference, a delayed version of the signals obtained from each microphone must be summed, as shown in Equation (1.2):

s(p)[k] = Σ_{i=1}^{nmic} si[k − δi(p)]    (1.2)

When the delay δi(p) is not an integer, it is rounded to the nearest integer. Finally, the estimation of sound intensity is obtained by multiplying the samples captured for a single point p in the wavefront by the corresponding coefficients hl of an Lth-order FIR filter, as shown in Equation (1.3):

I(p)[k] = Σ_{l=0}^{L−1} |s(p)[k − l]|² · hl    (1.3)

As can be noted, calculating the delays δi(p) requires high memory bandwidth and long, complex calculations.

1 The speed of sound is assumed to be approximately 340 m/s.
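Equations (1.1)-(1.3) can be sketched numerically as follows. This is an illustrative Python sketch (the thesis itself uses MATLAB and VHDL); the function names and the test geometry are hypothetical:

```python
import math

C = 340.0  # speed of sound in m/s, as assumed in the text


def delay_samples(p, m, fs, alpha):
    # Eq. (1.1): delta_i(p) = (alpha * fs / c) * ||p - m_i||
    return alpha * fs / C * math.dist(p, m)


def delay_and_sum(signals, delays):
    # Eq. (1.2): superimpose the delayed channel signals.
    # Non-integer delays are rounded to the nearest integer.
    d = [round(x) for x in delays]
    n = len(signals[0])
    start = max(d)  # first index for which every delayed sample exists
    return [sum(s[k - di] for s, di in zip(signals, d))
            for k in range(start, n)]


def intensity(sp, h):
    # Eq. (1.3): FIR-weighted sum of squared magnitudes, one output sample
    return sum(abs(sp[-1 - l]) ** 2 * hl for l, hl in enumerate(h))
```

With two in-phase channels, `delay_and_sum` doubles the amplitude, which is the constructive interference this chapter describes.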


Chapter 2

Beam-Former Architecture Overview

In order to implement the prototype of an acoustic camera, the architecture shown in Fig. 2.1 is proposed. The data acquisition was done using 4 microphones fixed in a square array. The sound waves are pre-amplified and then filtered with an anti-aliasing filter for further processing. At that point, the signals are digitized using a 4-channel SPI ADC, and then, through the implementation of the delay-and-sum beamforming algorithm, the estimation of the sound intensity of a certain point in space is obtained in real time. In order to visualize the result, the estimation of the sound intensity at every point in the wavefront is represented as one of the pixels of a TFT display.

Figure 2.1: System architecture: data acquisition module with 4 microphones and filters, a 4-channel ADC with SPI interface, beamformer modules, and a TFT display to visualize the estimation of the intensity of the sound waves.

2.1 Data Acquisition Module

It consists of 4 electret microphones fixed in a square array whose sides are 50 cm in length. Each of them is followed by an amplifier stage and an anti-aliasing filter, which is an 8th-order Butterworth low-pass filter with a cut-off frequency of 750 Hz. The geometry of the array can be modified if needed, since the array is detachable from the prototype.

2.2 ADC

The analog-to-digital conversion of the signals is done using a 4-channel A/D converter (ADC084S021), which supports sample rates up to 200 ksps with a resolution of 8 bits per channel. The output of the converter is serial and is compatible with several standards, such as SPI, QSPI™ and other DSP serial interfaces. In the current implementation, the sample rate was fixed to 3.2 ksps.

2.3 FPGA System

The FPGA used to implement the beamformer was a Xilinx Spartan-3AN XC3S1400AN-4FGG676C, running at a frequency of 50 MHz. This FPGA is integrated in an Altium NanoBoard 3000 FPGA development board. The 4-channel ADC and the TFT display, used to acquire the data and display the sound intensity images respectively, are also peripherals located on the board. The system is formed by four modules that perform all processing and control tasks, besides controlling the peripherals for data acquisition and visualization of results.

2.3.1 Data Acquisition Controller

The data acquisition controller is always acquiring and storing data and, according to the fixed fs, generates the signals to control the SPI ADC available on board. The storage of data is done using 4 double buffers of size N, one for each channel.
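The ping-pong behavior of these double buffers can be sketched in software as follows. This is a Python sketch for illustration only; the class and method names are hypothetical, not taken from the VHDL design:

```python
class DoubleBuffer:
    """Two banks of N samples: the ADC writes into one bank while the
    processing stage reads the other; roles swap when a bank fills."""

    def __init__(self, n):
        self.banks = [[0] * n, [0] * n]
        self.write_bank = 0   # bank currently being written by the ADC
        self.idx = 0

    def push(self, sample):
        self.banks[self.write_bank][self.idx] = sample
        self.idx += 1
        if self.idx == len(self.banks[self.write_bank]):
            self.idx = 0
            self.write_bank ^= 1   # swap: the full bank becomes readable

    def readable(self):
        # The bank not currently being written, safe for processing
        return self.banks[self.write_bank ^ 1]
```

The key property is that `readable()` is never the bank being written, so acquisition and processing can overlap, as in the pipeline described in Chapter 3.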

2.3.2 Delay-and-Sum Beam-Forming Algorithm

The implementation of the algorithm is discussed in Chapter 3.

2.3.3 Low-Pass Filter and Down-Sampler

The sampled signals are obtained from the ADC at a sample rate of 3.2 kHz. In order to obtain a frame rate greater than 24 frames per second, which is sufficient to give a sense of motion and a real-time response, the size of the buffer has to be fixed taking into account Equation (2.1):

1 / frame rate = (1 / fs) · N    (2.1)

In this case fs is 3.2 kHz and the frame rate is 26 fps, so the size of the buffer should be N = 122. As a consequence, a down-sampler with decimation factor 122 is implemented.

The anti-aliasing filter of the down-sampler is a FIR low-pass filter of order L. Even though in [3] the authors have used a simple averaging system with success, we have designed an anti-aliasing filter with a truncated response using a Hamming window. Theoretically, the cut-off frequency for the filter would be fs/(2·N). The coefficients of this filter were obtained with FDAtool in MATLAB and then used in the implementation.
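The thesis obtains the coefficients with MATLAB's FDAtool; a rough equivalent of a Hamming-windowed truncated (windowed-sinc) design with cut-off fs/(2N) can be sketched as follows. This Python sketch is illustrative only and will not reproduce FDAtool's exact coefficients:

```python
import math


def hamming_lowpass(num_taps, fc, fs):
    """Truncated ideal low-pass response multiplied by a Hamming window,
    normalized for unity DC gain."""
    m = num_taps - 1  # filter order
    h = []
    for n in range(num_taps):
        x = n - m / 2.0
        if x == 0:
            ideal = 2.0 * fc / fs
        else:
            ideal = math.sin(2.0 * math.pi * fc / fs * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / m)
        h.append(ideal * window)
    gain = sum(h)                      # normalize so DC gain is 1
    return [c / gain for c in h]


fs, n_down = 3200.0, 122
fc = fs / (2 * n_down)                 # theoretical cut-off, ~13.1 Hz
taps = hamming_lowpass(46, fc, fs)     # L = 46, as in the implementation
```

The resulting taps are symmetric (linear phase) and sum to one, the two properties that matter for the down-sampler's anti-aliasing role.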

2.3.4 Display System

The video controller controls an external double buffer implemented with the 2 external SRAM memories (IS61LV25616AL) available on the Altium NanoBoard 3000. To this end, it generates the signals to control the write and read cycles of each of the memories. Thanks to the double-buffer implementation, writing the pixel values of a sound intensity image can be done in parallel with reading the other memory.

The display controller generates the control signals for the TFT display controller (ILI9320 LCD driver) available on board.


2.4 System Controller

Each of the main tasks previously described is controlled by a different state machine, as in Fig. 2.2; however, there is a system controller that manages these state machines in order to meet the timing constraints of the system and execute the tasks in their designed order.

Figure 2.2: Overall system architecture and external peripherals; the blocks in blue correspond to FPGA-implemented blocks, and the blocks in green correspond to external peripherals.

In Appendix A, a schematic showing the architecture in detail can be found, and in Appendix C the general state diagram is shown.


Chapter 3

FPGA Implementation

The implementation was done taking into account 3 main tasks: data acquisition, data processingand image display.

Figure 3.1: Implementation of the beamforming algorithm. sn[k] are the sampled data obtained from the corresponding channel n of the ADC, and δ0i is the delay that indexes, over the buffer, the datum to be processed. Once s(p)[k] is obtained, it is multiplied by the corresponding coefficient hi of the filter and then accumulated until i = L−2, when the 9 most significant bits address the corresponding color in the color look-up table, to store it in an external double buffer.

3.1 Delay-and-Sum Beamforming Algorithm Implementation

Given the high complexity of the operations required to calculate the delays needed to process the signals, their computation is done off-line using MATLAB. Equation (1.1) is used to calculate the delays. The parameters needed to compute them are fs, p, m and α.

The sample frequency of the system is fixed taking into account the maximum frequency that needs to be processed, which is 1 kHz for the current application. Therefore, fs has to be at least 2 kHz plus a guard band for roll-off, so fs = 3.2 kHz. The variable p corresponds to equidistant points in the wavefront where the sound intensity is estimated. This wavefront plane is located 1.5 meters away from the microphone array, and its sides are 2 meters in width and 1.5 meters in length; this configuration can be seen in Fig. 3.2. Since each pixel corresponds to one point in the wavefront, the maximum number of points is determined by the resolution of the TFT display, which is 320x240. The position of each microphone mn is fixed taking into account the geometry discussed in the previous section, locating the origin of the coordinate system in the center of the array. The constant α is fixed to 5/3, a value determined experimentally through trials.

Figure 3.2: Space configuration for the acoustic camera implementation: in the figure, z1 corresponds to 1.5 m, where the wavefront plane is located.

The result in MATLAB is a 320x240x4 matrix where the delays are stored. However, the resources available in the FPGA are not enough to store the 307200 values; as a solution, the matrix is resized to 160x120x4 by averaging the previously obtained values in the respective positions.
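The 320x240 to 160x120 reduction by averaging can be sketched as a 2x2 block average. This is a Python sketch with a hypothetical helper name, illustrating the operation applied per channel:

```python
def average_2x2(grid):
    """Halve both dimensions of a 2D map by averaging each 2x2 block,
    as done to shrink the 320x240 delay matrix to 160x120 per channel."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, 2):
        out.append([
            (grid[r][c] + grid[r][c + 1] +
             grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ])
    return out
```

Applied to each of the 4 channel planes, a 240-row by 320-column map becomes 120x160, cutting ROM usage by a factor of four.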

The obtained delays are stored in an internal ROM in the FPGA. They are stored in the order in which the TFT display memory is written, that is, from left to right and top to bottom. Each position of the memory corresponds to one pixel and has 28 bits, 7 bits for each microphone (channel) of the array, where the 7 MSB correspond to the delay for channel 0 and the 7 LSB correspond to channel 3.
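The 28-bit ROM word layout described above can be expressed as a simple pack/unpack pair (a Python sketch; the function names are hypothetical):

```python
def pack_word(d0, d1, d2, d3):
    """Pack four 7-bit channel delays into one 28-bit ROM word:
    channel 0 in the 7 MSB, channel 3 in the 7 LSB."""
    for d in (d0, d1, d2, d3):
        assert 0 <= d < 128          # 7 bits per channel
    return (d0 << 21) | (d1 << 14) | (d2 << 7) | d3


def unpack_word(word):
    """Recover the per-channel delays from a 28-bit ROM word."""
    return [(word >> shift) & 0x7F for shift in (21, 14, 7, 0)]
```

Since the measured delays range from 35 to 60 samples, they fit comfortably in the 7 bits allotted to each channel.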

The double buffer for each of the channels consists of 2 banks of 122 8-bit registers. Each of the registers has an enable signal for writing and a signal that enables its output for further processing. The samples obtained from the ADC are placed in order in one of the buffers of the channel; once N samples are placed, the ADC starts to store data in the other buffer. The maximum value obtained for the delays is 60 and the minimum is 35, so the minimum size of the buffer should be 26, a requirement met with the actual 122-position double buffer.

The combination of the samples coming from the buffers provides a stream of data whose energy is proportional to the sound intensity at each point in the wavefront. Hence, the process of obtaining a pixel in the image is essentially to low-pass filter and down-sample this energy signal. The combination of the 4 samples is squared and filtered by multiplying it with the corresponding coefficient of a low-pass digital filter of order L = 46.

In order to speed up the filtering process, two values are operated on in parallel and summed with the value stored in an accumulator register, as can be seen in Fig. 3.1. After filtering, the 9 MSB of the sound intensity estimation obtained for that pixel are used to obtain the corresponding color from the color look-up table, calculated using the jet colormap available in MATLAB to get 512 RGB values. Each color value is then fixed to the format specified for the TFT display, that is, 5 bits for red and blue and 6 bits for green.
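The 5-6-5 pixel format and the 9-bit LUT indexing can be sketched as bit operations. This is an illustrative Python sketch; the 18-bit intensity word width used in `lut_index` is an assumption for illustration, not a figure from the thesis:

```python
def rgb565(r, g, b):
    """Pack 8-bit R, G, B components into the 16-bit 5-6-5 TFT format
    (5 bits red, 6 bits green, 5 bits blue)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def lut_index(value, width=18):
    """Take the 9 MSB of an intensity word (assumed width in bits)
    to address the 512-entry color look-up table."""
    return (value >> (width - 9)) & 0x1FF
```

With 9 index bits, the LUT has exactly 2^9 = 512 entries, matching the 512 jet-colormap RGB values mentioned above.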

Each TFT pixel is then stored at its corresponding pixel address in an external double buffer, implemented using 2 external SRAM memories available on board. An image frame is complete when the 19200 (160x120) values are calculated.


3.2 Pipeline Architecture Overview

In order to do the parallel processing described throughout the document, a 3-stage pipeline was implemented. The pipeline architecture is shown in Fig. 3.3.

3.2.1 Data Acquisition

In the first stage of the pipeline, data is acquired and stored in the double buffers for each channel.From Equation (2.1), it can be inferred that the duration of this stage is 38.46 ms.

3.2.2 Data Processing

The second task, data processing, consists of a pipeline with the following 2 stages: delay-and-sumand color storage.

The first stage corresponds to the arithmetic operations needed to compute s(p)[k] (the sum of the four channels, the square of the obtained value and its multiplication by the corresponding coefficient); this process takes 4 clock cycles. It has to be repeated L/2 times to obtain the sound intensity estimation of one pixel (divided by 2 because two values of s(p)[k] are obtained at the same time), and this in turn has to be repeated for each of the 19200 pixels to calculate the complete sound intensity image. To obtain the maximum value of L/2, the processing time should equal the time taken by the data acquisition stage. The time is fixed to 35 ms to meet the timing constraints, as in Equation (3.1):

19201 · 4 · (L/2) · (1 / clock frequency) = 35 ms    (3.1)

Figure 3.3: Pipeline architecture of the system. The system is implemented using a 3-stage pipeline to compute the sound intensity images. The processing stage is composed of a 2-stage pipeline, whose first stage is itself composed of a 2-stage pipeline to compute the algorithm.

Substituting the known values in (3.1) gives L/2 = 22.7, which is rounded up to 23; therefore, in the current system L = 46. This first stage is itself composed of a pipeline that performs all necessary operations without implementing redundant hardware. The delay δ0 is read from the internal ROM in the first stage during the first interval. In the second interval, the accumulator takes the value shown in Fig. 3.3 while the other stage calculates δ2 for each of the four microphones. Each of these stages takes 4 clock cycles, that is, 80 ns. Therefore, with L = 46, the delay-and-sum stage takes 1.84 µs in total.
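The timing budget of Equation (3.1) can be verified numerically (a minimal Python sketch of the arithmetic in the text):

```python
f_clk = 50e6        # FPGA clock in Hz
budget = 35e-3      # processing-time budget in seconds
intervals = 19201   # 19200 pixels plus one priming interval

# Eq. (3.1): intervals * 4 cycles * (L/2) / f_clk = 35 ms
half_l = budget * f_clk / (intervals * 4)   # ~22.78
L = 2 * round(half_l)                       # 46, as implemented

# One full delay-and-sum pass: (L/2) intervals of 4 cycles each
stage_time = (L // 2) * 4 / f_clk           # 23 * 80 ns = 1.84 us
```

This confirms both figures quoted in the text: L = 46 and a 1.84 µs delay-and-sum stage.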

The color storage stage takes 1.84 µs as well. During this time, the 9 MSB of the result obtained in the previous delay-and-sum stage are used to obtain the corresponding color from the color look-up table; thereafter, this value is written in one of the SRAM memories. This is done 19200 times which, together with the time taken by the first delay-and-sum interval, makes the processing stage last a total of 35.33 ms.

3.2.3 Display

The display stage writes the values already stored in the external SRAM memory into the TFT display memory. The system is designed to resize the image to 320x240 pixels. Resizing is performed by a simple nearest-neighbor operation. Each pixel takes 15 clock cycles to be read from the SRAM and written into the TFT memory. This stage takes a total elapsed time of 23.04 ms.
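The nearest-neighbor upscale and the display-stage timing can be sketched as follows (an illustrative Python sketch; the helper name is hypothetical):

```python
def upscale_2x(img):
    """Nearest-neighbor resize from 160x120 to 320x240:
    each source pixel is replicated into a 2x2 block."""
    out = []
    for row in img:
        doubled = [v for v in row for _ in (0, 1)]
        out.append(doubled)
        out.append(list(doubled))
    return out


# Display-stage timing: 320*240 pixels, 15 clock cycles each, at 50 MHz
display_time = 320 * 240 * 15 / 50e6   # 0.02304 s = 23.04 ms
```

Replicating pixels instead of interpolating keeps the display stage to simple reads and writes, which is why it fits in 15 cycles per pixel.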


Chapter 4

Results

This chapter provides information regarding the specifications of the developed prototype and the fulfillment of the specific objectives set for the project.

4.1 MATLAB Implementation

Prior to the implementation of the algorithm on the FPGA development board, off-line tests were performed in MATLAB. To this end, a database available online from Carnegie Mellon University [8] is used. The signals provided correspond to signals acquired with a 15-microphone linear array; the microphones are not located uniformly along a line, but as Fig. 4.1 shows.

Figure 4.1: Linear microphone array: in the figure, the first microphone is located at the origin of the 3-coordinate plane, and the eighth microphone corresponds to the center of the array.

The wavefront to be analyzed corresponds to 50 points sampled along a line of 5 meters; the values of the y and z coordinates remain constant, with z fixed to 0.8 m. The configuration can be seen in Fig. 4.2.

Figure 4.2: Space configuration: in the figure, z1 corresponds to the distance between the wavefront and the array; in the tests made in MATLAB, the wavefront is one-dimensional, which means the only component that varies for the point p is px.

The test signals used correspond to two different scenarios. In the first one, there is a person talking located 1 meter away from the center of the array, and a radio is located 45 degrees from the same point. Using the equations introduced in Chapter 1 (for the current test, fs = 16 kHz and α = 1) and the signals from this first scenario, the graph of sound intensity vs. x position shown in Fig. 4.3 is obtained.

Figure 4.3: First Scenario Test: Using the data corresponding to the first scenario, the graph of the soundintensity vs the x position in meters is obtained

A total of 47616 samples are available for processing; however, Fig. 4.3 is the result of using only samples 1000 to 1100. As can be seen, the sound intensity is greater in the center of the array, which is the expected result, since that is where the speaker is located. However, the sound intensity is similar along the array; this can be caused by the short distance between the array and the wavefront being analyzed. Therefore, the next test is performed with the signals obtained from the second scenario, where both the person and the radio are located 3 meters away from the array. The results of this test are shown in Fig. 4.4.

Figure 4.4: Second scenario test: the magenta graph corresponds to the result obtained using samples 30000 to 30100, and the green one was obtained using samples 26000 to 26100.

Although the results improve, showing a clear maximum in the middle of the array, the contrast between the graphs can still be enhanced. To this end, a low-pass digital filter is implemented, designed taking into account the frequency band of the human voice, from 80 Hz to 1100 Hz. Additionally, to obtain a better result, more samples are taken into account for the computation of the sound intensity. The result is shown in Fig. 4.5.

Figure 4.5: Second scenario, test 2: it can be noticed that the maximum sound intensity value is clearly in the center of the array. For both graphs, samples 26000 to 32001 are used; however, they differ in the value of α used: 2 for the magenta graph and 1 for the blue graph.

Subsequently, an acoustic image is calculated over time using windows of 100 samples, starting from the 245th sample due to the implemented filter. The algorithm is calculated over groups of 100 samples using all the samples obtained from the signal, in order to see the evolution in time of the sound intensity in the designated wavefront. The result shows the speaker in the center of the array. In Fig. 4.6, the y axis corresponds to the x position, the x axis corresponds to the samples in time, blue represents low sound intensity and red represents high sound intensity; the values in between correspond to the colors of the jet colormap in MATLAB.

Figure 4.6: Behavior of the sound intensity over time: it can be observed that the maximum sound intensity is again located in the center of the array, as expected.

The final test done in MATLAB is the acquisition of an intensity image that shows the dependency between the parameter α and the sound intensity. This is done using only 4 microphones, as in the implementation of the acoustic camera on the FPGA. The results are shown in Fig. 4.7, which indicates that α must be a value between 1 and 5, because in this range the expected results are obtained: those in which the highest sound intensity values are located in the center of the wavefront, the position where the speaker is located.

From the results obtained in MATLAB, it is concluded that the delays must be calculated off-line, as already discussed in previous chapters, and that the value of α needs to be adjusted to focus the array and obtain the expected results in pre-established tests.

Figure 4.7: α vs. x-axis position: sound intensity values.

4.2 FPGA Implementation

Once the prototype has been implemented, it is tested by means of fixed signals and real sound sources. The latter is done using a 50 cm square microphone array and a 15 cm square array; for each test, a different set of delays is loaded into the respective ROM memory, and the test that uses fixed signals uses the delays calculated for the 50x50 cm array. In this section, the specifications of the prototype are stated and the results of each of the tests are presented.

4.2.1 Prototype Specifications

Table 4.1: FPGA IMPLEMENTATION RESULTS ON XILINX SPARTAN 3AN (XC3S1400AN)

Device Resources     Usage
Slices               9553 (84%)
Slice Flip-Flops     8362 (37%)
4-Input LUTs         11032 (48%)
Block RAMs           32 (100%)
Multiplier Blocks    6 (18%)

The FPGA implementation results are stated in Table 4.1. As can be seen, due to the decoders used for delay indexing, more than 50% of the slices are used. The multiplier blocks available on the FPGA were used to implement the squaring and the coefficient multiplications. The multiplier blocks are 18-bit; therefore, one of them is used for each of the squares, and 2 of them had to be used for each coefficient multiplication. As stated, two values are obtained at the same time, giving a total of 6 multipliers, which are used iteratively to obtain the final value. It is important to point out that additional hardware had to be implemented in order to control the on-board peripherals, such as the TFT display and the SPI interface for the ADC, which plays an important role in the usage of the device resources.

The final prototype of the acoustic camera consists of the Altium NanoBoard 3000 development board and the array of microphones. On the development board, the peripherals used are the external SRAM memories, the 4-channel ADC and the TFT display. The features of the prototype are shown in Table 4.2.

4.2.2 Fixed Signals Results

The first test is done using a function generator. When the same signal is applied to each channel of the analog-to-digital converter, the expected sound intensity image is a beam corresponding to high sound intensity (near red in color) in the center of the display, indicating that the source is located at the center of the wavefront and therefore all channels receive the same signal. Fig. 4.8 presents the sound intensity image obtained from this test. In it, and in all the sound intensity images shown in this section, the x axis corresponds to the spatial x dimension in meters, the y axis corresponds to the spatial y dimension in meters, and the color of each pixel varies with the intensity of the corresponding point in space. The wavefront is the one described in Chapter 3.

Table 4.2: ACOUSTIC CAMERA PROTOTYPE FEATURES

Features             Specifications
Microphones          4
Sampling Rate        3.2 kHz
Resolution           8 bit
Frequency Response   0-750 Hz
Video Resolution     320x240
Intensity Levels     512
Frame Rate           26 fps

Figure 4.8: Sound intensity image obtained when a sine wave of 200 Hz is applied to each of the channels of the ADC

As can be seen, although not red, a clear point of high intensity is located in the center of the image, as expected. The second test is done by digitally generating the signals to be stored in the double buffers; since the delays of each pixel are known, a desired sound intensity image can be produced. In the first scenario, the function generator setup is replicated: the signal values are generated using MATLAB to obtain the sine wave of 350 Hz shown in Fig. 4.9b, producing the image shown in Fig. 4.9a.

As can be seen, the image is very similar to the one previously obtained experimentally; the main difference is the color intensity, which is mainly due to differences between the maximum values of the signals used in each case. In the second scenario, a delayed version of a sine wave, first of 200 Hz and then of 350 Hz, is generated for each channel as shown in Fig. 4.10a and Fig. 4.11a. The delays between these signals coincide with the delays stored in the delay ROM memory for the upper left corner of the TFT display, so the expected image is a beam of more intense color in that direction. The results are shown in Fig. 4.10b and Fig. 4.11b, for 200 Hz and 350 Hz respectively.
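This digital fixed-signal scenario can be reproduced in software to check the expected behavior: when the steering delays match the per-channel signal delays, the channels add coherently and the summed power is maximal. A hedged sketch of that check (the buffer length, delay values, and function name `das_power` are illustrative, not the exact ROM contents):

```python
import math

FS = 3200.0   # sampling rate in Hz (Table 4.2)
F = 350.0     # test-tone frequency in Hz
N = 256       # samples per buffer (illustrative)

# Four channels carrying the same tone, each shifted by a known
# per-channel delay in samples, as in the digital fixed-signal test.
true_delays = [0, 2, 5, 7]
channels = [[math.sin(2 * math.pi * F * (n - d) / FS) for n in range(N)]
            for d in true_delays]

def das_power(channels, steer_delays):
    """Delay-and-sum output power: advance each channel by its
    steering delay so that matched channels align sample-for-sample."""
    hi = N - max(steer_delays)
    total = 0.0
    for n in range(hi):
        s = sum(ch[n + d] for ch, d in zip(channels, steer_delays))
        total += s * s
    return total / hi

matched = das_power(channels, true_delays)      # steering matches the scene
mismatched = das_power(channels, [0, 0, 0, 0])  # wrong steering direction
print(matched > mismatched)  # -> True
```

The pixel whose ROM delays match the generated delays receives the coherent (high-power) sum, which is why the beam appears at the corresponding corner of the display.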


(a) Sound intensity image obtained when a sine wave of 200 Hz is digitally generated and stored in the 4 double buffers   (b) Sine wave of 350 Hz, fixed signals test 2

Figure 4.9: Fixed signal results, test 2

(a) 200 Hz test signals   (b) 200 Hz fixed signal results

Figure 4.10: Fixed signal results, test 3 - 200 Hz

(a) 350 Hz test signals   (b) 350 Hz fixed signal results

Figure 4.11: Fixed signal results, test 3 - 350 Hz


Both of the obtained images show the beam in the expected direction; however, the results are not identical because of the different frequencies of the processed signals.

4.2.3 Prototype Implementation Results

The acoustic camera prototype was tested both in open and enclosed spaces, obtaining the expected results. Fig. 4.12, for instance, shows a sound intensity image obtained in an open space with 3 loudspeakers at different locations, all of them 1 meter away from the array (50x50 cm scenario). Each of them was generating a 350 Hz sine wave. As can be seen, the 3 sources can be identified.

Figure 4.12: Sound Intensity Image obtained with the acoustic camera prototype, 3 sources acoustic scene.

The sound intensity image in Fig. 4.13 was obtained in an open space with 1 loudspeaker on the right of the microphones, also located 1 meter away from the array and generating a 350 Hz sine wave. However, another beam appears at the bottom of the display; this is possibly caused by echo coming from the floor.

Figure 4.13: Sound Intensity Image obtained with the acoustic camera prototype, 1 source acoustic scene.

In Fig. 4.14 the same test is repeated, but this time the sound source is located on the left of the microphone array.

When the tests were performed in an enclosed space, echoes coming from the walls, floor, and windows caused an undesired effect. However, the sound sources can still be identified, as can be observed in the following images. In this test, the 350 Hz sine wave was generated with a laptop located on the left of the microphone array. The sound intensity image obtained is shown in Fig. 4.15.


Figure 4.14: Sound Intensity Image obtained with the acoustic camera prototype, 1 source acoustic scene(sound source located on the left of the array).

Figure 4.15: Enclosed space test 1: Sound Intensity Image obtained with the acoustic camera prototype, 1source acoustic scene (sound source located on the left of the array).

The test is repeated, but this time a loudspeaker acts as a second sound source. The acoustic scene for this test is the laptop located in front of the bottom of the array and the loudspeaker located on the right of the array; Fig. 4.16 shows the result of this test.

As can be seen, the two sources are located, but other spurious beams are also shown. The array is then reduced to a 15x15 cm square, and the results are shown in Fig. 4.17. In this test, the acoustic scenes set up for the former configuration are repeated: first using a single source on the left, as in Fig. 4.17a, and then using the same source on the right, as in Fig. 4.17b. For both tests the signals were 350 Hz sine waves.

With this configuration the echo effect is reduced and the beam points to the location of the sound source; this can also be seen in Fig. 4.18, a test done using 2 sound sources.

From the sound intensity images presented in this section, we can conclude that the developed acoustic camera prototype estimates the origin of a sound source, pointing to its location in space. The main difficulty to overcome in the tests is that the signals received from the microphones do not present the gain expected by the ADC; that is, the signal peaks do not reach the maximum value the ADC can convert, and if the gain of the pre-amplifier is increased, the analog circuit becomes unstable. However, by adjusting the signal frequency and the configuration of the array, as shown in the images, the results can be improved.


Figure 4.16: Enclosed space test 2: Sound Intensity Image obtained with the acoustic camera prototype, 2sources acoustic scene (as described before).

(a) 15x15 cm square array results - 1 source left   (b) 15x15 cm square array results - 1 source right

Figure 4.17: Enclosed space test 3

Figure 4.18: Enclosed space test 4: Sound Intensity Image obtained with the acoustic camera prototype, 2sources acoustic scene (as described before).


Chapter 5

Conclusion

In this work, a prototype acoustic camera has been presented in which all the digital processing is implemented on an FPGA. The design was simulated using Xilinx tools and implemented using Altium tools. Sound intensity images are obtained in real time through parallel signal processing on the FPGA. Resources are optimized using a 3-stage pipeline architecture that, at the same time, implements the computation and processing with a 2-stage pipeline. The resulting system is not only portable but also versatile, given that the delays loaded in the ROM memory can take different values if the geometry of the microphone array changes.

Working with a different FPGA or a different microphone array geometry would leave more room either to implement a low-pass anti-aliasing filter with a higher order L, by computing more samples in parallel, or to implement a bigger color look-up table, so as to divide by a lower value when indexing this memory. On the other hand, the current system does not have enough resolution, since the number of microphones is small; implementing a system with more microphones and changing the structure of the array to avoid echo effects would provide sound intensity images with higher resolution. These are aspects that could be improved in subsequent developments.
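The color look-up-table trade-off can be made concrete: with a LUT of 2^k entries, a wide intensity value is divided (right-shifted) down to a k-bit index, and a larger LUT means dividing by a smaller value, i.e. preserving more intensity resolution. A hedged sketch follows; the 18-bit intensity width and the function name `color_index` are assumptions, while the 512 intensity levels come from Table 4.2:

```python
def color_index(intensity, intensity_bits=18, lut_bits=9):
    """Map an `intensity_bits`-wide sound intensity value to a color
    LUT index by a power-of-two division (a right shift in hardware)."""
    shift = intensity_bits - lut_bits  # divide by 2**shift
    return intensity >> shift

# With a 512-entry LUT (Table 4.2 lists 512 intensity levels), an
# 18-bit intensity is divided by 2**9:
print(color_index(0x3FFFF))               # -> 511 (top of the LUT)
# Doubling the LUT to 1024 entries halves the divisor, keeping one
# extra bit of intensity resolution:
print(color_index(0x3FFFF, lut_bits=10))  # -> 1023
```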


Glossary

DSP    Digital Signal Processing

FPS    Frames per Second

QSPI   Queued Serial Peripheral Interface

SPI    Serial Peripheral Interface

SRAM   Static Random Access Memory

TFT    Thin-Film Transistor


Bibliography

[1] Van Veen, B.D.; Buckley, K.M., "Beamforming: A versatile approach to spatial filtering," IEEE ASSP Magazine, vol. 5, no. 2, pp. 4-24, April 1988.

[2] Widrow, B.; Glover, J.R., Jr.; McCool, J.M.; Kaunitz, J.; Williams, C.S.; Hearn, R.H.; Zeidler, J.R.; Eugene Dong, Jr.; Goodlin, R.C., "Adaptive noise cancelling: Principles and applications," Proceedings of the IEEE, vol. 63, no. 12, pp. 1692-1716, Dec. 1975.

[3] Zimmermann, B.; Studer, C., "FPGA-based real-time acoustic camera prototype," Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS), p. 1419, May 30-June 2, 2010.

[4] Escobar, F.A.; Chang, X.; Ibala, C.; Valderrama, C., "Fast Accurate Hybrid Algorithm for Sound Source Localization in Real-Time," International Journal of Sensor and Related Networks (IJSRN), vol. 1, issue 1, Feb. 2013.

[5] Gfai tech GmbH, 12489 Berlin, Germany, "Acoustic camera: listening with your eyes," available at http://www.acoustic-camera.com.

[6] Chang-Hong Hu; Xiao-Chen Xu; Cannata, J.M.; Yen, J.T.; Shung, K.K., "Development of a real-time, high-frequency ultrasound digital beamformer for high-frequency linear array transducers," IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control, vol. 53, no. 2, pp. 317-323, Feb. 2006. doi: 10.1109/TUFFC.2006.1593370.

[7] Bischof, G., "Acoustic imaging of sound sources - a junior year student research project," Frontiers in Education Conference (FIE 2008), 38th Annual, pp. S2C-1-S2C-6, 22-25 Oct. 2008.

[8] Sullivan, T., 1996, Pittsburgh, USA, CMU Audio Databases. Available at http://www.speech.cs.cmu.edu/databases/micarray/


List of Figures

1.1 Sound intensity estimation diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2.1 System Architecture Overview . . . . 6
2.2 System Controller . . . . 8

3.1 Beamforming algorithm implementation overview . . . . 9
3.2 Space configuration for the acoustic camera implementation . . . . 10
3.3 Pipeline architecture . . . . 11

4.1 Linear Microphone Array . . . . 13
4.2 Space configuration Matlab Test . . . . 13
4.3 First Scenario Test . . . . 14
4.4 Second Scenario Test . . . . 14
4.5 Second Scenario Test 2 . . . . 15
4.6 Behavior of the sound intensity over time . . . . 15
4.7 α vs X axis position, sound intensity values . . . . 16
4.8 Fixed Signal results Test 1 . . . . 17
4.9 Fixed signal results Test 2 . . . . 18
4.10 Fixed signal results test 3 - 200 Hz . . . . 18
4.11 Fixed signal results test 3 - 350 Hz . . . . 18
4.12 3 sources acoustic scene . . . . 19
4.13 1 source acoustic scene-1 . . . . 19
4.14 1 source acoustic scene-2 . . . . 20
4.15 Enclosed space test 1 . . . . 20
4.16 Enclosed space test 2 . . . . 21
4.17 Enclosed space test 3 . . . . 21
4.18 Enclosed space test 4 . . . . 21


List of Tables

4.1 Implementation Results . . . . 16
4.2 Acoustic camera prototype features . . . . 17


Appendix

Appendix A: Schematics

1. FPGA prototype general schematic

Appendix B: Datasheets

1. ADC Datasheet
2. DAC Datasheet
3. ILI9320 Datasheet
4. SRAM Datasheet

Appendix C: State Diagrams

1. General State Diagram

Appendix D: Peripheral Configuration Descriptions

1. TFT-Display Configuration (Register Values)
2. Waveforms of the SRAM signals
3. Waveforms of the ADC-DAC signals

Appendix E: VHDL Code

1. Altium Project FPGA Prototype
2. Altium Project Objective
3. Altium Project ADC-DAC test
4. Xilinx Final Project with Simulation files
5. Xilinx ADC-DAC with Simulation files
6. ROM configuration files for the Delays Memory (15 cm square array)
