accelerating real-time processing of the atst adaptive ... · vivek venugopal • atmospheric...

Post on 16-Aug-2020

1 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Accelerating Real-time Processing of the ATST Adaptive Optics SystemVivek Venugopal

• Atmospheric turbulence distorts the wavefront by generating phase variations in the incoming light and limits the resolution of large solar telescopes such as the four meter solar telescope, Advanced Technology Solar Telescope (ATST) now beginning construction at Maui's Haleakala.

HOAO Real-time system

[1] S. L. Keil, T. R. Rimmele, J. Wagner, and ATST team. Advanced Technology Solar Telescope: A status report. Astronomische Nachrichten, 331:609–615, 2010. [2] V. Venugopal, et. al. Accelerating Real-time processing of the ATST Adaptive Optics System using Coarse-grained Parallel Hardware Architectures. In International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA 2011), pages 296–301, Las Vegas, USA, July 2011.[3] Nvidia Inc. (Last Accessed: February 2012) Nvidia Tesla C2050 GPU Computing Processor. [Online]. Available: http://www.nvidia.com/object/product_tesla_c2050_us.html

Implementation platforms and Results

• The high speed camera sends 1750 20x20 pixel raw sub-aperture images to the processing system. • The sub-apertures undergo a dark field correction followed by a flat field correction, which is the equivalent to correcting the images for zero level and gain equalization. • The 2D cross-correlation step determines the shift in the x and y direction of each sub-aperture, as compared to the reference image. • The wavefront reconstruction step consists of a precomputed 3500x1900 reconstruction matrix, which is multiplied with the x and y shifts.

• Nvidia's GPUs provide computational speedup as compared to the CPU implementation. • Although DSPs are faster, using 48 DSPs are more expensive than 3 GPUs.• GPUs provide flexibility in terms of problem scalability as it can handle the computational complexity and at the same time provides more FLOPS/$ as compared to DSPs.

Introduction

Advanced Technology Solar Telescope

FPGA-DSP solution

Conclusion

References

vivek@vivekvenugopal.net

Processors

Uncorrected light

Corrected light

Tip/Tilt Mirror

DeformableMirror (DM)

Beamsplitter

Shack-Hartmann Lenslet Array

CCD Camera

DM drive signal

Tilt drive signal

Adaptive Optics system

•The adaptive optics (AO) system senses the wavefront aberrations and applies the corresponding correction to the adjustable deformable mirror to improve the resolution of the telescope.

WFS Camera

Cross-correlation

slope computation

Average slope

Offscale slope

detectionMatrix

multiplyActuator servos

Data collection

Zernike offload process

Tip/Tilt servos

X X

Dark field Flat field Reference

imageSlope offsets

Recon-struction matrix

Actuator offsets

Actuator gains

Servo parameters

Offscale slope

tolerance

Tip/Tilt mirror

Deformable mirror

Dark pixels20x20

Raw pixels20x20

flat correction

Flat pixels20x20

2D cross-correlation find maximum

dark correction

Reference pixels20x20

x and y interpolation

GPUFPGA

reconstruction

FFTFFT

Complex conjugate Multiplication

IFFT

reference image

flat corrected

imageoriginal reference

image 26x26 pixels

precomputed reference

(20x20 pixels)

Precomputed Reference pixels 20x20 (49 regions)

precomputed reference

(20x20 pixels)

Region 1

Region 2

precomputed reference

(20x20 pixels)Region 49

Camera data half

Camera data half

FPGA 1

FPGA 2

12 optical fiber

channels

12 optical fiber

channels

PCI-e

bus

GPU/CPU

Camera data half

Camera data half

FPGA 1

FPGA 2

12 optical fiber

channels

48 DSPs

12 channels

12 channels

12 optical fiber

channels

FPGA-GPU solution

x y

1750 1750x

x and y shifts for 1750 sub-aperture images

1900

3500

reconstruction matrix 1900x3500

1900

accumulated values for 1900 actuators

Process partitioning and mapping

FFT correlation

7x7 correlation

1900x3500 reconstruction matrix

0

550

1100

1650

2200

1 50

1889

15101619

1188

Tim

e in

us

No. of images

Tesla C1060Tesla C2050

FFT correlation results

0

100

200

300

400

1 50 584

300.93307.49312.9281.39279.35278.36

Tim

e in

us

No. of images

Tesla C1060Tesla C2050

7x7 correlation results

1

10

100

1000

10000

100000 46769

229

956964

Tim

e in

us

Tesla C1060Tesla C2050DSPCPU

Wavefront reconstruction results

• Test platform: 2 quad-core 2.6GHz AMD processors running Ubuntu Linux OS with Nvidia Tesla C1060 and C2050 • Implementation strategy: 48 DSPs or GPUs?• 7x7 correlation faster than FFT correlation on GPUs

Compute-intensive

top related