Real-Time Implementation of Digital Signal Processing for Coherent Optical Digital Communication Systems
IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS, VOL. 16, NO. 5, SEPTEMBER/OCTOBER 2010 1227
Andreas Leven, Senior Member, IEEE, Noriaki Kaneda, Senior Member, IEEE, and Stephen Corteselli
Abstract: Digital signal-processing-based coherent optical communication systems are widely viewed as the most promising next-generation long-haul transport systems. One of the biggest challenges in building these systems is the implementation of signal processors that are able to deal with signaling rates of a few tens of gigasamples per second. In this paper, we discuss implementation options and design considerations with respect to hardware realization and DSP implementation.
Index Terms: Digital signal processors, optical fiber communication, quadrature phase-shift keying.
I. INTRODUCTION
COHERENT communication systems have dominated the world of wireless communication almost since its beginnings. Coherent systems, or more exactly phase-coherent systems, offer a number of benefits over noncoherent systems. Nonetheless, practical optical coherent communication systems became feasible only recently.
Coherent optical communication systems were a subject of intense research in the 1980s and early 1990s. At that time, the main motivation was the higher sensitivity that coherent receivers promised. Technical difficulties inhibited a rapid transition into commercial systems. With the advent of the erbium-doped fiber amplifier, which offered comparable sensitivities with direct-detection systems, the main driving force for developing optical coherent systems disappeared.
Today's motivation to revive coherent concepts in optical communication is twofold. First, coherent receivers enable reliable data transmission with much higher spectral efficiency than conventional direct-detection systems. Second, coherent receivers can compensate for linear impairments, most notably polarization-mode dispersion (PMD), to a degree that is out of reach for conventional systems.
Manuscript received January 15, 2010; revised February 12, 2010; accepted February 28, 2010. Date of publication May 18, 2010; date of current version October 6, 2010.
A. Leven is with Alcatel-Lucent Bell Laboratories, 70435 Stuttgart, Germany (e-mail: firstname.lastname@example.org).
N. Kaneda and S. Corteselli are with Alcatel-Lucent Bell Laboratories, Murray Hill, NJ 07974 USA (e-mail: email@example.com; firstname.lastname@example.org).
Digital Object Identifier 10.1109/JSTQE.2010.2044977
Also, the technical difficulties that the first generation of coherent systems in optical communications faced have been lessened. This is due to two developments. First, the symbol-rate-to-carrier-frequency ratio of modern optical communication systems approaches the ratio that is commonly used in wireless systems. For a system that transmits at a data rate of 100 Gb/s in two polarization orientations utilizing QPSK signaling, the symbol rate is 25 GBd. With a carrier frequency of roughly 200 THz, the symbol-rate-to-carrier-frequency ratio is 1.25 × 10^-4. This indicates that it is possible for optical systems to achieve phase-noise-to-symbol-rate ratios similar to those in wireless systems.
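The arithmetic behind these figures can be checked with a short sketch. The variable names are illustrative; the numbers are the ones quoted in the text (100 Gb/s, two polarizations, QPSK with 2 bits per symbol, a carrier of roughly 200 THz).

```python
# Worked arithmetic for the symbol rate and the symbol-rate-to-carrier ratio.
data_rate_bps = 100e9      # total data rate, 100 Gb/s
polarizations = 2          # polarization-division multiplexing
bits_per_symbol = 2        # QPSK carries 2 bits per symbol
carrier_hz = 200e12        # optical carrier, roughly 200 THz

# Each polarization carries half the bits; each QPSK symbol carries 2 of them.
symbol_rate_bd = data_rate_bps / (polarizations * bits_per_symbol)
ratio = symbol_rate_bd / carrier_hz

print(f"symbol rate = {symbol_rate_bd / 1e9:.0f} GBd")
print(f"symbol rate / carrier frequency = {ratio:.3e}")
```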
Second, the performance of digital signal processing (DSP) equipment has improved dramatically over the past two decades, which makes it feasible to implement the complex signal-processing steps required to synchronize to the received signal in the digital domain. Implementations of optical coherent receivers have been demonstrated in CMOS-based application-specific ICs (ASICs) and in field-programmable gate arrays (FPGAs).
Although a coherent optical communication system can utilize a single-carrier or multiple-carrier [e.g., orthogonal frequency-division multiplexing (OFDM)] transmitter and any modulation format, with QPSK being the most popular and higher order quadrature-amplitude modulation (QAM) and phase-shift keying (PSK) systems under investigation, this paper will concentrate on single-carrier frequency-domain-equalized systems, which have become more popular in the wireless domain as well. The modulation format discussed here will be QPSK. Phase coherence between a data signal and the reference is typically established at the receiver side. As this paper is concerned with the implementation of coherent systems, it will concentrate on receiver design.
The paper is organized as follows. After a short review of the basic architecture of an optical coherent system, we discuss hardware implementation options. Then, we discuss specific challenges for the implementation of signal processing at multiple gigabit per second signaling rates. In Section IV, some of the algorithms and their implementation are described as examples. Finally, some measurement results of a real-time coherent receiver are discussed.
II. OPTICAL COHERENT-SYSTEM ARCHITECTURE
Fig. 1 shows a generic block diagram of a coherent system. The transmitter on the left side of the diagram consists of a data source, digital-to-analog converters (DACs), and driver amplifiers. Coherent systems often use polarization-division multiplexing. In this case, the continuous-wave signal of the transmit laser is split into two parts, which are then modulated independently by two optical in-phase/quadrature (I/Q) modulators. The two signals are then combined in two orthogonal polarization orientations.
Fig. 1. Generic coherent system.
In Fig. 1, the data source is simply displayed as a black box. In a real system, this black box comprises a number of complex functions. Besides the functions that are also performed in a classical system, such as data aggregation, coding, and framing, additional steps need to be performed in a transmitter for complex modulation formats, as are typically used in a coherent system. First, the data have to be mapped to constellation points and, in the case of multiple-carrier systems, to frequencies. Often, the data are also differentially precoded to cope with phase slips during receiver-side carrier synchronization. In a second step, the mapped data might be processed digitally; for instance, they might be predistorted to compensate for the nonlinearity of amplifiers and modulators, or they might be precompensated for deterministic fiber effects such as chromatic dispersion (CD).
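To illustrate why differential precoding helps with phase slips, the following is a minimal sketch of differential QPSK mapping and demapping. The Gray-coded assignment of bit pairs to phase increments and the function names diff_encode and diff_decode are assumptions made for this illustration; the paper does not specify a particular mapping.

```python
import cmath
import math

# Assumed Gray-coded mapping of bit pairs to phase increments (in quarter turns).
QUADRANT_STEP = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}
STEP_TO_BITS = {step: bits for bits, step in QUADRANT_STEP.items()}

def diff_encode(bit_pairs):
    """Map bit pairs to QPSK symbols whose phase change carries the data."""
    quadrant = 0
    symbols = []
    for pair in bit_pairs:
        quadrant = (quadrant + QUADRANT_STEP[pair]) % 4
        symbols.append(cmath.exp(1j * (math.pi / 4 + quadrant * math.pi / 2)))
    return symbols

def diff_decode(symbols):
    """Recover bit pairs from the phase difference of consecutive symbols."""
    prev = cmath.exp(1j * math.pi / 4)  # reference phase for the first symbol
    bit_pairs = []
    for sym in symbols:
        # Phase difference in quarter turns, wrapped to 0..3.
        step = round(cmath.phase(sym * prev.conjugate()) / (math.pi / 2)) % 4
        bit_pairs.append(STEP_TO_BITS[step])
        prev = sym
    return bit_pairs

data = [(0, 1), (1, 1), (1, 0), (0, 0), (0, 1)]
tx = diff_encode(data)
# Emulate a 90-degree phase slip that persists from the third symbol onward.
slipped = tx[:2] + [s * 1j for s in tx[2:]]
decoded = diff_decode(slipped)
```

In this sketch the slip corrupts only the symbol at which it occurs; all later symbols decode correctly because both the symbol and its predecessor are rotated by the same amount.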
The processing described above results in four digital data streams that subsequently need to be converted into analog signals. In the case of single-carrier QPSK signaling, each data stream carries only a single bit per symbol and, therefore, does not require a DAC. This significantly reduces the complexity and, therefore, the power consumption of the transmitter. But even for multicarrier systems or modulation formats of higher order than QPSK, the performance requirements with respect to resolution and conversion speed are typically less restrictive for the transmit DAC than for the receive analog-to-digital converter (ADC). For instance, for a 16-QAM transmitter without preprocessing, only 2 bits (four levels) at a conversion speed equivalent to the symbol rate are required, while at the receiver side, typically 6–8 bits at a sampling rate of twice the symbol rate are needed. Technology and architecture choices are similar to those of the ADC, which will be discussed later.
The I/Q modulator most widely used today is a double-nested Mach–Zehnder modulator based on LiNbO3. However, other more compact solutions are in development, based, for instance, on electroabsorption modulator structures.
The receiver consists of a local oscillator (LO) laser, an optical hybrid, a photoreceiver array, an ADC array, the digital signal processor (DSP), and a data sink, which typically comprises a decoder and a client interface.
The 90° optical hybrid mixes the received signal with the signal of the LO laser and a 90° phase-shifted copy of the LO laser signal. The mixing of the signal with the LO reference is performed for each polarization separately. Preferably, the output of the hybrid provides differential signals for suppression of the direct-detection terms.
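The suppression of the direct-detection terms by differential outputs can be illustrated with an idealized numerical model. This is a sketch under stated assumptions: lossless couplers with exact splitting ratios, an exact 90° phase shift, unit photodiode responsivity, and a single polarization. The function name balanced_hybrid is hypothetical.

```python
import cmath

def balanced_hybrid(signal, lo):
    """Ideal 90-degree hybrid followed by two balanced photodiode pairs.

    Each photodiode detects |field|^2; each balanced pair outputs the
    difference of its two photocurrents.
    """
    def power(field):
        return abs(field) ** 2

    # In-phase branch: signal combined with +LO and with -LO.
    i_out = power((signal + lo) / 2) - power((signal - lo) / 2)
    # Quadrature branch: signal combined with the LO shifted by +/-90 degrees.
    q_out = power((signal + 1j * lo) / 2) - power((signal - 1j * lo) / 2)
    return i_out, q_out

sig = 0.3 * cmath.exp(1j * 0.7)   # weak received field (arbitrary example values)
lo = 2.0 * cmath.exp(1j * 0.2)    # strong local oscillator field
i_out, q_out = balanced_hybrid(sig, lo)
beat = sig * lo.conjugate()       # pure signal-LO beat term
```

The |signal|^2 and |LO|^2 terms appear in both photodiodes of a pair and cancel in the difference, so the outputs equal the real and imaginary parts of the beat term signal × conj(LO).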
Optical hybrids have been demonstrated in different designs and technology platforms. Design-wise, optical hybrids can be grouped into actively controlled devices that require a phase control to maintain the 90° phase difference and passive devices that assure a 90° phase difference by design. Active devices typically consist of two splitters, one for the received signal and one for the LO signal, and two signal combiners, one for the in-phase component and one for the quadrature component. The phase of the signal in one arm of the LO splitter needs to be adjusted by a tunable phase shifter to be 90° out of phase with respect to the signal in the second arm. The phase shift can be controlled by utilizing thermal tuning or electrooptic tuning.
Passive hybrids are designed such that the signals always interfere with a phase difference of about 90°. These can be implemented, for instance, as a Michelson interferometer or a multimode interference coupler (MMI). The advantage of passive hybrids is obvious: no control signal has to be generated and distributed to the hybrid device. Nevertheless, the phase accuracy is often not sufficient, so that a phase correction needs to be implemented in the digital signal processor.
After photodetection and linear amplification, the signals need to be converted from the analog to the digital domain. The ADC performance is still one of the bottlenecks that determine the total data rate of DSP-based optical coherent systems. ADCs with sampling rates of more than 20 GSample/s have been reported. These devices have been realized in CMOS technology as well as in SiGe BiCMOS. SiGe BiCMOS devices typically offer slightly better performance at the expense of increased power consumption, while CMOS devices offer the possibility of integrating the ADC functionality and the DSP on one chip.
SiGe ADCs are typically implemented as flash ADCs, where a bank of comparators is utilized to convert the analog signal into a digital thermometer-coded signal. In a second step, this thermometer-coded signal is converted into a binary or Gray-encoded signal.
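The two conversion steps can be sketched in a few lines. This is a behavioral illustration only: the uniform thresholds, the 3-bit resolution, and the ones-counting decoder are assumptions, and the function name flash_adc is hypothetical. Practical encoders typically add measures against comparator errors that are not modeled here.

```python
def flash_adc(sample, n_bits=3, full_scale=1.0):
    """Quantize `sample` in [0, full_scale) with 2**n_bits - 1 comparators."""
    n_comp = 2 ** n_bits - 1
    # Uniformly spaced comparator thresholds (assumed ideal).
    thresholds = [full_scale * (k + 1) / (n_comp + 1) for k in range(n_comp)]
    # Step 1: comparator bank produces a thermometer code, lowest threshold first.
    thermometer = [1 if sample >= t else 0 for t in thresholds]
    # Step 2: count the tripped comparators to obtain the binary value ...
    binary = sum(thermometer)
    # ... and optionally convert binary to Gray code.
    gray = binary ^ (binary >> 1)
    return thermometer, binary, gray

thermometer, binary, gray = flash_adc(0.40)
```

For the input 0.40, the three lowest thresholds (0.125, 0.25, 0.375) are exceeded, so the binary value is 3 and the Gray code is 2 (binary 010).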
CMOS ADCs take advantage of the higher integration density available in CMOS by utilizing a time-interleaved architecture. Here, multiple ADCs are employed in parallel, with each ADC sampling at a fraction of the total sampling rate. For instance, n ADCs each sample at a rate of R/n, with a time offset of 1/R between adjacent ADCs, to achieve a total sampling rate of R. Gain and sampling-point mismatch between the individual sub-ADCs require calibration to avoid performance loss.
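A behavioral sketch of time interleaving with gain mismatch is given below. The sub-ADC gains, the test tone, and the function name interleaved_sample are illustrative assumptions; quantization and sampling-point (timing) mismatch are not modeled, and the calibration shown assumes the gains are already known.

```python
import math

def interleaved_sample(signal, total_rate, n_adcs, n_samples, gains=None):
    """Sample `signal(t)` with n sub-ADCs, each running at total_rate / n_adcs."""
    gains = gains or [1.0] * n_adcs
    out = []
    for k in range(n_samples):
        adc = k % n_adcs       # sub-ADC that takes sample k (round robin)
        t = k / total_rate     # sampling instant; sub-ADC offset is adc / total_rate
        out.append(gains[adc] * signal(t))
    return out

def tone(t):
    return math.sin(2 * math.pi * 1e9 * t)   # 1 GHz test tone (assumed)

rate = 20e9                                   # total sampling rate R
gains = [1.0, 1.05, 0.95, 1.0]                # assumed sub-ADC gain mismatch
reference = [tone(k / rate) for k in range(16)]
ideal = interleaved_sample(tone, rate, 4, 16)
raw = interleaved_sample(tone, rate, 4, 16, gains)
# Calibration: divide each sample by the known gain of the sub-ADC that took it.
calibrated = [v / gains[k % 4] for k, v in enumerate(raw)]
```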
The digital signal-processing functionality has been demonstrated in ASICs as well as in FPGAs. An ASIC not only offers the possibility of integrating the ADCs on the same chip, but also allows for an optimized circuit design specifically tailored for the application. This results in receivers with higher speed, lower power consumption, and higher functionality and complexity than a realization in an FPGA. For large volumes, ASICs also tend to be cheaper than FPGA implementations. Unfortunately, an ASIC development is very costly and requires a lot of resources in the design process. Furthermore, an ASIC cannot be reconfigured, so the processing algorithms have to be fully developed during the design phase. Therefore, an ASIC implementation is preferable for a commercial development, while FPGAs are the platform of choice for research and prototyping.
FPGAs are ICs that are built with a set of configurable blocks that can be interconnected with a reconfigurable set of wires. The majority of the configurable blocks are so-called logic blocks, but modern FPGAs typically also offer a number of memory and multiplier blocks. FPGAs are designed to be useful in a number of applications, which in turn means that the resources available in a given FPGA are not necessarily a perfect fit for the application at hand. Resource constraints often lead to adaptation or modification of optimum signal-processing algorithms, which in turn leads to a reduction in performance (implementation penalty).
III. IMPLEMENTATION CONSIDERATIONS
Today's FPGAs offer processing speeds on the order of a few hundred megahertz. The achievable processing speed for an ASIC using the same CMOS generation is typically higher by a factor of about two to three. Nevertheless, the processing speeds available in today's technologies are about two to three orders of magnitude smaller than the data rates in optical communication systems. The maximum achievable processing speed of a digital circuit is given by the longest time a signal needs to travel between two clocked storage elements (e.g., flip-flops). The path between these two storage elements is called the critical path. There are two commonly used techniques to reduce the length of the critical path and, therefore, increase processing speed: pipelining and parallel processing. Pipelining reduces the critical path by inserting additional retiming elements along the signal path in a manner that does not alter the result of the processing, at the cost of additional latency. This only allows an increase of processing speed up to the maximum speed of a single element (a gate in an ASIC or a lookup table (LUT) in an FPGA).
To process data at multiple gigabits per second, up to 100 Gb/s and beyond, a parallel processing structure has to be implemented. Unfortunately, not all algorithms can be parallelized without modifications and loss of performance. In general, all structures that can be pipelined can also be processed in parallel.
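The required degree of parallelism follows from the ratio of sampling rate to clock rate. The following sketch works through one example; the clock frequencies of 250 MHz and 625 MHz and the factor of two samples per symbol are assumed values chosen to fall within the ranges mentioned in the text, not measured figures, and parallel_lanes is a hypothetical helper.

```python
import math

def parallel_lanes(sample_rate_hz, clock_hz):
    """Smallest number of samples that must be processed per clock cycle."""
    return math.ceil(sample_rate_hz / clock_hz)

symbol_rate = 25e9              # 25 GBd, as for 100 Gb/s dual-polarization QPSK
sample_rate = 2 * symbol_rate   # assumed two samples per symbol
fpga_clock = 250e6              # assumed FPGA clock ("a few hundred megahertz")
asic_clock = 625e6              # assumed ASIC clock, 2.5 times the FPGA clock

fpga_lanes = parallel_lanes(sample_rate, fpga_clock)
asic_lanes = parallel_lanes(sample_rate, asic_clock)
print(f"FPGA: {fpga_lanes} samples per clock cycle")
print(f"ASIC: {asic_lanes} samples per clock cycle")
```

Under these assumptions, each ADC output stream has to be processed 200 samples at a time in the FPGA and 80 samples at a time in the ASIC.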
Algorithms that are time invariant can simply be parallelized without loss of performance by instantiating the circuitry that implements the algorithm multiple times. Often, the complexity can be reduced by sharing resources between multiple instances. An example of a structure that can easily be parallelized is a finite-impulse response (FIR) filter with constant coefficients. If the filter coefficients are not constant, e.g., within an adaptive filter structure, there might not be an equivalent parallel structure, e.g., when the update of the filter coefficients is performed once per sampling period. If one is willing to compromise on update speed by accepting an update rate of once every clock cycle, with the clock cycle being n times the sampling period and n the parallelization factor, an equivalent structure can be implemented.
Fig. 2. Signal-processing steps.
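The equivalence of serial and parallel processing for an FIR filter with constant coefficients can be demonstrated with a small software model. This is a sketch, not a hardware description: each iteration of the outer loop stands for one clock cycle in which n filter instances each produce one output, and the function names fir_serial and fir_parallel as well as the example data are assumptions for illustration.

```python
def fir_serial(x, h):
    """Reference FIR filter: one output per sampling period."""
    return [sum(h[j] * x[k - j] for j in range(len(h)) if k - j >= 0)
            for k in range(len(x))]

def fir_parallel(x, h, n):
    """Block-parallel FIR: n filter instances produce n outputs per clock cycle."""
    taps = len(h)
    history = [0.0] * (taps - 1)           # samples carried over from earlier blocks
    y = []
    for start in range(0, len(x), n):      # one iteration models one clock cycle
        block = x[start:start + n]
        window = history + block
        # Each lane applies the same constant coefficients to its own input window.
        for lane in range(len(block)):
            pos = taps - 1 + lane          # index of this lane's newest sample
            y.append(sum(h[j] * window[pos - j] for j in range(taps)))
        # Keep the last taps - 1 samples for the next clock cycle.
        history = window[len(window) - (taps - 1):] if taps > 1 else []
    return y

x = [1.0, 4.0, -2.0, 7.0, 0.0, 3.0, 5.0, -1.0, 2.0, 6.0]
h = [0.5, -1.0, 0.25]
serial = fir_serial(x, h)
parallel = fir_parallel(x, h, n=4)
```

Because the coefficients never change, the lanes are independent of each other and the parallel output matches the serial output for any parallelization factor n. An adaptive filter breaks this independence, which is why its coefficient update has to be relaxed to once per clock cycle.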
IV. SIGNAL-PROCESSING ALGORITHMS AND THEIR IMPLEMENTATION
Fig. 2 shows a possible flow of signal-processing stepsfor a single-carrier PSK or QAM receiver. After analog-to-digital conversion, imperfections of the receiver frontend needto be corr...