initial cell serch paper
TRANSCRIPT
-
8/22/2019 Initial Cell Serch Paper
1/63
1
Comparison of Initial Cell Search Algorithms for W-CDMA Systems
by
Sanat Kamal Bahl
Thesis submitted to the Faculty of the Graduate School
of the University of Maryland in partial fulfillment
of the requirements for the degree of Master of Science
2002
Title of Thesis: Comparison of Initial Cell Search Algorithms for
W-CDMA Systems
Sanat Kamal Bahl, Master of Science, 2002
Thesis directed by: James F. Plusquellic
Assistant Professor
Dept. of Computer Science and Electrical Engineering
ABSTRACT
In this thesis, an Improved Cell Search Design (Improved CSD) using cyclic codes is
compared with the 3GPP Cell Search Design using comma free codes (3GPP-comma free
CSD) in terms of (1) hardware utilization on a field programmable gate array (FPGA) and
(2) acquisition time for different probabilities of false alarm rates. Our results indicate
that for a channel whose signal-to-noise ratio is degraded with additive white gaussian
noise (AWGN), the Improved CSD achieves faster synchronization with the base station
and has lower hardware utilization when compared with the 3GPP-comma free CSD
scheme under the same design constraints.
Table of Contents
1.0 Introduction
2.0 Background
3.0 Cell Search Algorithm
3.1 Synchronization Channels in W-CDMA
3.2 Cell Search Algorithm
3.2.1 Stage 1: Slot Synchronization
3.2.2 Stage 2: Frame Synchronization and Code Group Identification
3.2.3 Stage 3: Scrambling Code Identification
4.0 Improved Cell Search Design
4.1 Stage 1: Slot Synchronization
4.2 Stage 2: Frame Synchronization and Code Group Identification
4.3 Stage 3: Scrambling Code Identification
4.3.1 Scrambling Code Generator
4.3.2 Descrambler
5.0 3GPP-comma free Cell Search Design
5.1 Stage 2 of 3GPP-comma free Cell Search Design
5.2 Reduced Length FHT Design
6.0 Experimental Method and Results
6.1 Experimental Method
6.1.1 FPGA Design Process
6.2 Experimental Results
7.0 Summary, Conclusions and Future Work
7.1 Summary
7.2 Conclusions
7.3 Future Work
8.0 References
List of Abbreviations
AMPS Advanced Mobile Phone Service
ASIC Application Specific Integrated Circuit
A/D Analog-to-Digital
AWGN Additive White Gaussian Noise
BS Base Station
Cp Primary Synchronization Code
Cssc Secondary Synchronization Code
Cs Cyclic Hierarchical Sequence
CLB Configurable Logic Block
CPICH Common Pilot Channel
D/A Digital-to-Analog
DFT Discrete Fourier Transform
DSP Digital Signal Processing
DS-CDMA Direct Sequence-Code Division Multiple Access
FHT Fast Hadamard Transformer
FPGA Field Programmable Gate Array
GIC Group Indicator Code
GPS Global Positioning System
GSM Global System for Mobile communication
LC Logic Cell
LFSR Linear Feedback Shift Register
LUT Look-Up Table
MS Mobile Station
PSC Primary Synchronization Code
P-SCH Primary Synchronization Channel
SSC Secondary Synchronization Code
SNR Signal-to-Noise Ratio
SCH Synchronization Channel
S-SCH Secondary Synchronization Channel
3G Third Generation
3GPP Third Generation Partnership Project
TIA Telecommunications Industry Association
W-CDMA Wideband-Code Division Multiple Access
List of Figures
1 DS-CDMA Transmitter-Receiver Block Level Diagram
2 Synchronization Channels in Cell Search
3 Hierarchical Matched Filter (64-chip and 4-symbol accumulation)
4 Hierarchical Matched Filter (16-chip and 16-symbol accumulation)
5 Slot Boundary Detection
6 Frame Synchronization and Code Group Identification
7 Scrambling Code Generator
8 Multiple Scrambling Code Generator
9 Scrambling Code Identification
10 Individual Stage of FHT
11 16 chip FHT
12 Hadamard Code Metrics (Butterfly Operation)
13 2-Slice Virtex-E CLB
14 Detailed View of Virtex-E Slice
15 Comparison of Improved CSD and 3GPP-comma free CSD, PFA=10^-3
16 Comparison of Improved CSD and 3GPP-comma free CSD, PFA=10^-4
List of Tables
1 Hierarchical Matched Filter (16 and 64-chip Accumulation)
2 Sequences X1,i and X2,i for Code Groups 1 to 32
3 Masking Functions used in Stage 3: Scrambling Code Generator
4 Allocations of SSCs for Secondary SCH
5 Timing Diagram of Inputs to FHT
6 Reduced Length Walsh Sequences (256 chip sequence to 16 chip sequence)
7 Hardware Specifications of System: Quantization 4 Input Data Bits
8 Hardware Specifications of FHT: 16 and 256 chip sequence
Chapter 1
Introduction
1.0 Introduction
First generation (1G) mobile communications systems were based on analog technol-
ogy and started in the early to mid 1980s. These 1G systems had a number of limitations
which included (1) low quality voice service, (2) limited capacity and (3) inability to pro-
vide global roaming.
Digital second generation (2G) systems were then developed in Europe and the US. The
various 2G systems included (1) the Global System for Mobile communication (GSM), which
uses time division multiple access (TDMA), where each user is assigned a particular time
slot; (2) the TDMA/136 specification, defined in the US in 1988 by the Telecommunications
Industry Association (TIA) with the aim of digitizing the analog Advanced Mobile Phone
Service (AMPS); and (3) IS-95, proposed in the US to provide better voice quality and
higher capacity, and based on CDMA technology. However, the different 2G technologies
were not interoperable and were not available across geographic areas. In addition, the low
bit rate of 2G systems could not meet subscriber demands for multimedia services. Third
generation (3G) systems aim to solve these problems by promising
global roaming across 3G standards, higher data rates, improved quality of service and
support for multimedia applications. The most popular candidates for 3G cellular systems
are CDMA2000 and Wideband-Code Division Multiple Access (W-CDMA) [1] [2]. Both
of these schemes are based on Direct Sequence-Code Division Multiple Access (DS-
CDMA) technology. In DS-CDMA, the data signals are directly modulated by a digital
code signal.
In a spread spectrum CDMA system, the transmitted signal is spread over a wide fre-
quency band that is wider than the minimum bandwidth required to transmit the informa-
tion being sent. In a typical scenario where there are multiple users or mobile stations
(MSs) in a cell, each user has a unique scrambling code. This scrambling code should be
such that it has low cross correlation properties with the other user codes. The signal
received by the MS from the transmitting base station (BS) is correlated with the user's
scrambling code. This despreads only the signal of that particular user whereas the other
spread spectrum signals will remain spread. A block diagram of a DS-CDMA transmitter
and receiver is shown in Figure 1. Spreading consists of multiplying the input data by a
scrambling code sequence whose bit rate is much higher than the data bit rate. At the
receiving side the signal is multiplied with the same scrambling code sequence that is
exactly synchronized to the received code sequence. The Encoding block shown in Figure
1 is used to add error correcting bits and to perform interleaving in order to protect infor-
mation bits from channel noise and interference. The reverse operations are performed in
the Decoding stage at the receiver.
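The spreading and despreading steps described above can be illustrated with a minimal sketch. It assumes two users with hypothetical random 64-chip codes of +1/-1 values, a noiseless channel that simply adds the two spread signals, and perfect code synchronization at the receiver; the encoding, D/A and A/D blocks of Figure 1 are left out, and none of the names below come from the thesis.

```python
import random

def spread(symbols, code):
    """Multiply each data symbol (+1/-1) by every chip of the scrambling code."""
    return [s * c for s in symbols for c in code]

def despread(chips, code):
    """Correlate each code-length block with the code and keep the sign."""
    n = len(code)
    decided = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        corr = sum(x * c for x, c in zip(block, code))
        decided.append(1 if corr >= 0 else -1)
    return decided

rng = random.Random(0)  # fixed seed so the run is repeatable
code_a = [rng.choice((1, -1)) for _ in range(64)]  # hypothetical code of user A
code_b = [rng.choice((1, -1)) for _ in range(64)]  # hypothetical code of user B
data_a = [1, -1, -1, 1]
data_b = [-1, -1, 1, 1]

# Both spread signals share the channel, so the receiver sees their sum.
channel = [a + b for a, b in zip(spread(data_a, code_a), spread(data_b, code_b))]

# Correlating with one user's code despreads that user only; the other
# user's signal stays spread and contributes a small cross-correlation term.
recovered_a = despread(channel, code_a)
recovered_b = despread(channel, code_b)
print(recovered_a, recovered_b)
```

Each decision is the sum of a full-strength term (64 times the wanted symbol) and the cross correlation of the two codes, which is why codes with low cross correlation are required.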
[Figure 1: DS-CDMA Transmitter-Receiver Block Level Diagram. Blocks shown: Data, Encoding, Scrambling Code Generator, D/A and Baseband in the transmitter; Baseband, A/D, Scrambling Code Synchronization, Scrambling Code Generator, Decoding and Data in the receiver.]
The main difference between W-CDMA and CDMA2000 is that W-CDMA supports asynchronous
BSs whereas CDMA2000 relies on synchronized BSs. Synchronous CDMA
systems need an external time reference. A Global Positioning System (GPS) clock can
be used by all BSs to synchronize their operations. This allows the MS to use different
phases of the same scrambling code to distinguish between adjacent BSs. In an asynchronous
CDMA system, each BS has an independent time reference, and the MS does not
have prior knowledge of the relative time difference between various BSs. The advantage
of asynchronous operation is that it eliminates the need to synchronize the BSs to an
accurate external timing source. However, since there is no external time synchronization
between the adjacent BSs, different phases of the same code cannot be used to distinguish
adjacent BSs. Thus, in an asynchronous CDMA system, adjacent BSs can only be identified
by using distinct scrambling codes. Consequently, cell search, which involves the
process of achieving code, time and frequency synchronization of the MS with the BS,
takes longer in comparison to a synchronous CDMA system. Cell search is complicated by
the presence of signals which are intended for other mobile systems within a cell as well
as signals from other BSs. Thus, it is very important to develop algorithms and hardware
implementations that perform cell search with lower acquisition time and minimum hardware
resources for asynchronous CDMA systems.
Cell search is performed according to the algorithm proposed by Wang et al. [3]. In the
proposed cell search algorithm, code and time synchronization is achieved assuming a
large frequency error and after achieving code and time synchronization, frequency syn-
chronization is performed. In this study we consider the problem of achieving code and
time synchronization. The process of achieving code and time synchronization in the cell
search algorithm for W-CDMA systems is divided into three stages (1) slot synchroniza-
tion, (2) frame synchronization and code group identification, and (3) scrambling code
identification. This thesis presents a 3G Partnership Project (3GPP) cell search design
using cyclic codes (Improved CSD) to achieve faster synchronization at lower hardware
complexity. The second part of this thesis compares the two design algorithms for per-
forming initial cell search: the Improved CSD and the 3GPP cell search design using
comma free codes (3GPP-comma free CSD) in terms of (1) acquisition time measure and
(2) hardware specifications on a Xilinx Virtex-E XCV1000E field programmable gate
array (FPGA). The thesis also proposes design improvements in stage 2 of the 3GPP-
comma free CSD beyond those proposed by Li et al. [4]. The 3GPP-comma free CSD
proposed in this thesis uses a Fast Hadamard Transformer (FHT) in stage 2 that achieves
lower hardware complexity and faster decoding. Furthermore, masking functions are used
in stage 3 of both the Improved CSD and the 3GPP-comma free CSD to reduce the num-
ber of scrambling code generators required as described in previous work [4]. This results
in a reduction in the ROM size required to store the initial phases of the scrambling code
generators in stage 3. The Improved CSD proposed in this thesis aims to achieve faster
synchronization between the MS and the BS and thus improves system performance. The
experiments carried out using accumulation over multiple slots in stage 1 indicate that for
an additive white gaussian noise (AWGN) channel in a high signal-to-noise ratio the
Improved CSD achieves faster synchronization with the BS and has lower hardware utili-
zation when compared with the 3GPP-comma free CSD scheme under the same design
constraints.
The thesis is organized as follows. Work done by other research groups and suggestions
by the 3GPP working group are presented in Chapter 2. Chapter 3 describes the synchro-
nization channels in W-CDMA cell search and introduces the three step cell search algo-
rithm used in W-CDMA for synchronization between the MS and the BS. Chapter 4
describes the Improved cell search design using cyclic codes proposed as a means of
achieving faster synchronization. Chapter 5 discusses the 3GPP cell search design using
comma free codes. Chapter 6 presents the experimental method and results of the compar-
ison of the two cell search algorithms on a Xilinx Virtex-E XCV1000E FPGA. Chapter 7
is a summary, discussion, and an overview of future directions of this research.
Chapter 2
Background
Cell search design is critical as it impacts the system performance and there is a need to
design efficient receiver structures and algorithms to reduce the cell search time. This
Chapter summarizes efforts by research groups and the 3GPP working groups to design
efficient schemes and algorithms for each of the three stages of the cell search algorithm.
2.0 Background
Wang et al. propose a pipelined process to be used in the first three stages of the cell search
algorithm [3]. The cell search scenarios considered in their study are (1) initial cell
search: when a mobile is switched on and (2) target cell search: during idle and active
modes of the MS. Instead of the serial cell search sequentially searching through code,
time and frequency, their method first acquires code and time synchronization assuming a
larger frequency error and then performs frequency synchronization [3] [5].
The synchronization code sequences used in stage 1 and stage 2 of the cell search algo-
rithm are made up of bits called "chips" which can be either +1 or -1. The synchronization
code sequences are 256 chips in length. If a traditional matched filter is used then a huge
adder circuit (256 input adder) will be required to sum up the correlation results. This will
lead to wastage of hardware resources. Hence, Siemens and Texas Instruments in their
working group draft have suggested a hierarchical matched filter design which uses two
matched filters to reduce the hardware complexity significantly [6]. The details of the
hierarchical matched filter design will be presented in Chapter 4.
The 3GPP specification uses comma free codes in stage 2 of the cell search algorithm
[7] [8]. Nortel Networks, in their working group proposal, have suggested the use of cyclic
codes in the SCHs [9]. The use of cyclic codes for generating the synchronization codes
will be explained in more detail in Chapter 4. These cyclic codes can reduce hardware uti-
lization and acquisition time if the receiver is properly designed.
To reduce the complexity of searching through all 512 scrambling codes, the concept of
code grouping and group indicator codes (GIC) was introduced [10]. The scrambling code
is identified by first detecting the code group. Once the code group is detected, the
scrambling code used by the cell can be easily identified, as there are a limited number
of codes in each code group. This reduces the cell search time significantly. This idea
was accepted in the 3GPP specifications. To further reduce cell search time, frame
boundary synchronization is also achieved in stage 2 after
identifying the code group and slot ID [11].
Ericsson in their working group draft have proposed increasing the number of code
groups in stage 2 of the cell search [12]. Increasing the number of code groups reduces
the number of scrambling codes in a code group. Their proposed scheme uses either 256,
128 or 64 code groups in stage 2 of the cell search. They claim that the scheme using 256
code groups is the preferred scheme as it requires only two scrambling code correlators in
stage 3 of initial cell search and achieves reduced hardware complexity.
In stage 2 of the 3GPP-comma free CSD presented in this thesis, an FHT design is proposed
as a replacement for the Golay correlator presented by Li et al. [4]. An FHT provides an
efficient technique to detect the code group and slot ID in stage 2. Previous FHT designs
[13] [14] use a large amount of hardware resources; hence, a fast and efficient Hadamard
transformer is needed to reduce the hardware utilization and to perform faster decoding.
A compact and efficient FHT design will also draw less power in the handset.
Siemens in their working group draft have suggested the use of masking functions in
stage 3 to reduce the design complexity for generating the scrambling codes in parallel
[15]. The use of masking functions reduces the number of scrambling code generators
required to generate the codes in parallel. Any masking functions can be selected by the
designer as long as they generate codes with minimum overlap. The use of masking functions
reduces the hardware significantly as compared to the previous design by Li et al.
[4].
Li et al. have designed an application specific integrated circuit (ASIC) for performing
cell search in W-CDMA systems [4]. In stage 1 and stage 2 of their cell search design the
authors use a correlator structure to detect the code group and slot ID. The correlator
structure used is a Golay correlator [16]. In stage 3 of the cell search algorithm, 16 scram-
bling code generators are used for generating the codes in parallel.
In summary, most of the literature in this area presents simulation results of the proposed
algorithms and does not investigate the hardware complexity of the design schemes, with
the exception of the work presented by Li et al. [4]. The designs used by mobile
manufacturers are proprietary, and there are very few documents which describe their
actual design schemes. It is critical to consider a practical hardware implementation of the
cell search algorithm, especially because chip area and power utilization are the two most
important factors in a mobile handset.
Chapter 3
Cell Search Algorithm
3.0 Cell Search Algorithm
This Chapter describes the synchronization channels in W-CDMA cell search and intro-
duces the cell search algorithm used in the synchronization of the MS with the BS for W-
CDMA systems.
3.1 Synchronization Channels in W-CDMA
In CDMA systems, spreading codes are used to differentiate physical channels from the
same transmitter, and scrambling codes are used to differentiate transmitters. The MS
needs to achieve code and time synchronization with the BS before any communication
with the BS can start. The process of searching for a code and achieving synchronization
with the BS is called cell search. Cell search is performed in two scenarios: when a MS is
switched on (initial cell search) and during active or idle mode (target cell search). Target
cell search is used to find handover candidates during a call. Cell search design is important
because the search must complete with minimum delay, as it impacts the system performance.
Each cell in a CDMA system is identified by its downlink scrambling code which is of
length 38,400 chips. The 38,400 chips form a radio frame which is divided into 15 slots.
Each slot in the radio frame is of 2,560 chips [7].
Figure 2 shows the slot and frame structure of the three synchronization channels used
in cell search: the Primary-Synchronization Channel (P-SCH), Secondary-Synchroniza-
tion Channel (S-SCH) and the Common Pilot Channel (CPICH) [7] [17]. The P-SCH
together with the S-SCH are also called Synchronization Channel (SCH). In the P-SCH, a
256 chip sequence is transmitted at the start of each slot. The same P-SCH sequence is
used by all the BSs and is transmitted once every slot. As the same sequence is used by all
the transmitting stations, only one matched filter is sufficient to detect the slot boundary
value. To reduce the complexity of the matched filter implementation, a hierarchical
scheme is used as will be explained in detail in Chapter 4. The S-SCH is used for carrying
15 different sequences, one in each slot, for the different code groups and is repeated after
every frame. These sequences are used in identifying the code group. The CPICH is used
to carry the downlink common pilot symbols scrambled by the scrambling code of the BS.
Each slot of this channel is divided into 10 symbols, each of 256 chips in length.
[Figure 2: Synchronization Channels in Cell Search. One frame = 38,400 chips = 15 slots (10 msec); one slot = 2,560 chips (0.67 msec) = 10 CPICH symbols; synchronization code = 256 chips (0.067 msec). Channels shown: P-SCH, S-SCH, CPICH.]
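The frame, slot and symbol figures quoted above are related by simple arithmetic, sketched below. The constant names are illustrative; the 10 msec frame duration is taken from Figure 2, and the chip rate is derived from it rather than quoted from the text.

```python
CHIPS_PER_FRAME = 38_400
SLOTS_PER_FRAME = 15
SYMBOLS_PER_SLOT = 10
FRAME_MS = 10.0  # frame duration in milliseconds, from Figure 2

chips_per_slot = CHIPS_PER_FRAME // SLOTS_PER_FRAME    # chips in one slot
chips_per_symbol = chips_per_slot // SYMBOLS_PER_SLOT  # chips in one CPICH symbol
chip_rate_mcps = CHIPS_PER_FRAME / FRAME_MS / 1000.0   # chips per microsecond
slot_ms = FRAME_MS / SLOTS_PER_FRAME                   # slot duration
symbol_ms = slot_ms / SYMBOLS_PER_SLOT                 # 256-chip symbol duration

print(chips_per_slot, chips_per_symbol, chip_rate_mcps)
print(round(slot_ms, 2), round(symbol_ms, 3))
```

The derived values agree with the 2,560-chip slot, the 256-chip symbol and the 0.67 msec and 0.067 msec durations shown in Figure 2.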
To reduce the complexity of synchronizing to the BSs in W-CDMA, the concept of code
grouping and the use of group indicator codes (GIC) were introduced [10]. The 512
scrambling codes used in W-CDMA are divided into code groups. After the code group is
identified, only the scrambling code used by the cell needs to be detected. The number
of candidate scrambling codes from which one code needs to be identified depends on
how many code groups are selected in stage 2 of the design. For example, if 32 code
groups are used in stage 2 then the number of scrambling codes in stage 3 is 16. Similarly,
if 64 code groups are used then there will be 8 possible scrambling codes. Although
the number of scrambling codes is fixed at 512, the number of code groups can be
increased from 32 to 256 [12]. The complexity is further reduced by combining frame
synchronization and code group identification in stage 2 of the cell search algorithm [11].
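The trade-off between the number of code groups and the number of candidate scrambling codes left for stage 3 can be tabulated with a short sketch. The function name is illustrative, and the sketch assumes the 512 codes are split evenly across the groups.

```python
TOTAL_SCRAMBLING_CODES = 512

def codes_per_group(num_groups):
    """Candidate scrambling codes left for stage 3 once the group is known."""
    if TOTAL_SCRAMBLING_CODES % num_groups != 0:
        raise ValueError("codes must divide evenly across the groups")
    return TOTAL_SCRAMBLING_CODES // num_groups

# Group counts mentioned in the text: 32 and 64 here, 128 and 256 in [12].
candidates = {groups: codes_per_group(groups) for groups in (32, 64, 128, 256)}
for groups, codes in candidates.items():
    print(f"{groups:3d} code groups -> {codes:2d} candidate codes in stage 3")
```

With 256 code groups only two candidates remain, which matches the two scrambling code correlators cited for that scheme in Chapter 2.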
3.2 Cell Search Algorithm
The process of achieving code and time synchronization in the cell search algorithm is
divided into three stages (1) slot synchronization, (2) frame synchronization and code
group identification, and (3) scrambling code identification [3] [7] [8] [18].
3.2.1 Stage 1: Slot Synchronization
During stage 1 of the cell search procedure the MS uses the SCH's Primary Synchronization
Code (PSC) to acquire slot synchronization to a cell. This is typically done with a
single matched filter matched to the PSC which is common to all cells. The slot timing of
the cell can be obtained by detecting peak values in the matched filter output. The starting
position of the synchronization code may be determined from observations over one slot
duration. However, decisions based on observations over a single slot may be unreliable
when the signal-to-noise ratio (SNR) is low or if fading is severe. Reliable slot
synchronization is required to minimize cell search time. In order to increase reliability,
observations are made over multiple slots and the results are then combined. This makes it
much more likely that the correct slot boundary is identified.
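The benefit of combining observations over multiple slots can be sketched as follows. This is a toy model, not the thesis design: it assumes a 40-sample slot instead of 2,560 chips, real-valued matched-filter outputs with unit-variance Gaussian noise, and a fixed peak of height 2 at the true boundary; all names and parameter values are hypothetical.

```python
import random

def accumulate_slots(slot_outputs):
    """Add matched-filter outputs position by position across slots."""
    acc = [0.0] * len(slot_outputs[0])
    for slot in slot_outputs:
        for i, value in enumerate(slot):
            acc[i] += value
    return acc

def detect_boundary(accumulated):
    """Return the position of the largest accumulated value."""
    return max(range(len(accumulated)), key=lambda i: accumulated[i])

rng = random.Random(1)   # fixed seed so the run is repeatable
SLOT_LEN = 40            # toy slot length (assumption)
TRUE_OFFSET = 17         # position of the slot boundary in the toy model
NUM_SLOTS = 64           # number of slots combined

slots = []
for _ in range(NUM_SLOTS):
    output = [rng.gauss(0.0, 1.0) for _ in range(SLOT_LEN)]
    output[TRUE_OFFSET] += 2.0   # PSC correlation peak buried in noise
    slots.append(output)

estimate = detect_boundary(accumulate_slots(slots))
print("estimated boundary:", estimate, "true boundary:", TRUE_OFFSET)
```

The peak grows in proportion to the number of slots while the noise grows only with its square root, so the accumulated peak stands out even when a single slot would be ambiguous.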
3.2.2 Stage 2: Frame Synchronization and Code Group Identification
During stage 2 of the cell search procedure, the MS uses the SCH's Secondary Synchronization
Code (SSC) to achieve frame synchronization and identify the code group of the
cell found in stage 1. This is done by correlating the received signal with all possible SSC
sequences and identifying the maximum correlation value. Since the cyclic shifts of the
sequences are unique, the code group as well as the frame synchronization is determined.
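The way unique cyclic shifts yield both the code group and the frame timing can be sketched with a toy example. The two 15-slot sequences below are hypothetical and are not the 3GPP or Nortel allocation tables; the sketch also replaces correlation of the received signal with a simple count of agreeing per-slot decisions.

```python
SLOTS = 15

# Hypothetical per-slot SSC indices for two code groups (illustration only).
# Every cyclic shift of either sequence differs from every other shift.
CODE_GROUPS = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
    "B": [1, 3, 5, 7, 9, 11, 13, 15, 2, 4, 6, 8, 10, 12, 14],
}

def identify(observed):
    """Try every group and cyclic shift; return (score, group, shift) of the best."""
    best = None
    for group, seq in CODE_GROUPS.items():
        for shift in range(SLOTS):
            score = sum(
                1 for i in range(SLOTS) if observed[i] == seq[(i + shift) % SLOTS]
            )
            if best is None or score > best[0]:
                best = (score, group, shift)
    return best

# The MS starts observing at slot 4 of a frame sent by a group "B" cell.
start_slot = 4
observed = [CODE_GROUPS["B"][(i + start_slot) % SLOTS] for i in range(SLOTS)]
observed[7] = 1  # one slot decided wrongly, for example because of noise

score, group, shift = identify(observed)
print(group, shift, score)
```

The winning shift gives the slot number at which observation began, which fixes the frame boundary, and the winning sequence gives the code group.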
3.2.3 Stage 3: Scrambling Code Identification
During stage 3 of the cell search procedure, the MS determines the exact primary scram-
bling code used by the cell. The primary scrambling code is typically identified through
symbol-by-symbol correlation over the CPICH with all codes within the code group iden-
tified in stage 2. In this stage, a threshold value is used to decide whether the code has
been identified. The threshold value can be predetermined using a parameter called prob-
ability of false alarm rate [19].
This three stage cell search algorithm helps in simplifying the synchronization process
of the MS with the BS. Each stage and its hardware implementation will be explained
in the following Chapters.
Chapter 4
Improved Cell Search Design
4.0 Improved Cell Search Design
This Chapter describes the Improved CSD using a set of cyclic codes. The cyclic codes
were proposed by Nortel Networks to be used on the Secondary SCH [9]. These cyclic
codes allow very efficient detection and improve the cell search in terms of acquisition
time and hardware utilization. The three stages of the cell search design and their hardware
implementation are explained in Sections 4.1, 4.2 and 4.3.
4.1 Stage 1: Slot Synchronization
The MS first needs to acquire the PSC which is common to all the BSs. These codes are
of length 256 chips. The matched filter output is given by
$$Y = \sum_{j=0}^{255} R_j \, C_{p,j} \qquad (1)$$
where $R_j$ is the jth sample of the received complex signal, and
$C_{p,j}$ is the jth bit of the PSC.
Hence, a traditional matched filter implementation would require 256 taps and a large
adder circuit. This would increase the delay as well as power consumption at the receiver
which is not desirable. Thus, a hierarchical structure is proposed for performing the
matched filter operations which will need fewer taps, reduced circuitry and
lower power consumption [6]. The PSC consists of an unmodulated hierarchical sequence
of length 256 chips, transmitted once every slot. The PSC is the same for every BS in the
system and is transmitted time aligned with the slot boundary. The PSC is chosen to have
good auto-correlation properties. This means that when the PSC sequence is correlated
with itself, the interference from adjacent BSs is minimized and a high peak value is
obtained.
The hierarchical sequences used for generating the PSC are constructed from two con-
stituent sequences X1 and X2 of length n1 and n2, respectively, using the following equa-
tion
Cp(n)=X1(n mod n2)+X2(n div n1) modulo 2, n=0,1,..,(n1*n2)-1 (2)
where n1=n2=16.
The constituent sequences X1 and X2 are both defined as:
X1=X2=(1,1,-1,-1,-1,-1,1,-1,1,1,-1,1,1,1,-1,1) [9].
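Equation (2) can be checked with a short sketch that builds the 256-chip sequence from the constituent sequences given above. One assumption is made explicit: because X1 and X2 are listed as +1/-1 values, the modulo 2 addition of Equation (2) is carried out as a product of chips, which is equivalent under the mapping +1 to bit 0 and -1 to bit 1. The names `psc` and `to_bit` are illustrative.

```python
X1 = [1, 1, -1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, 1]
X2 = list(X1)   # the text defines X1 = X2
N = 16          # n1 = n2 = 16

# Chip form: product of the two constituent chips (assumed equivalent of
# the modulo 2 addition in Equation (2)).
psc = [X1[n % N] * X2[n // N] for n in range(N * N)]

def to_bit(chip):
    """Map +1 to bit 0 and -1 to bit 1."""
    return (1 - chip) // 2

# Bit form: literal modulo 2 addition as written in Equation (2).
psc_bits = [(to_bit(X1[n % N]) + to_bit(X2[n // N])) % 2 for n in range(N * N)]

# Both forms describe the same sequence.
same = all(1 - 2 * b == c for b, c in zip(psc_bits, psc))

# Zero-lag autocorrelation: every chip agrees with itself.
peak = sum(c * c for c in psc)
print(len(psc), same, peak)
```

The result is X1 repeated 16 times, with each repetition multiplied by the corresponding chip of X2, and the zero-lag autocorrelation equals the sequence length of 256.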
There are different techniques in which the hierarchical matched filter can be designed
as shown in Table 1.
Table 1: Hierarchical Matched Filter (16 and 64 chip Accumulation)
                 16 chip        16 symbol      64 chip        4 symbol
                 Accumulator    Accumulator    Accumulator    Accumulator
Register Taps    16             16             64             4
Adder Length     16             16             64             4
The hierarchical matched filter consists of two concatenated matched filter blocks. The
design using 64 taps is shown in Figure 3. This solution is not ideal for the following
reasons. First, the matched filter design requires 64 taps. Second, the design needs a
64-input adder as shown in Figure 3. A better solution is to use the design shown in
Figure 4. Hence, in stage 1 of both the Improved CSD and the 3GPP-comma free CSD the
hierarchical matched filter using 16 chip and 16 symbol accumulation is used.
[Figure 3: Hierarchical Matched Filter (64 chip and 4 symbol accumulation). Labels in the diagram: InData, Shift Register 1, PSCH Code, Adder Tree 1 (5 levels of adders), Shift Register 2, PSCH Code, Adder Tree 2, Result.]
In this design, the first matched filter receives the input signal serially from the BS.
Correlation over X1 (16 chip accumulation) is performed before correlation over X2 (16
symbol accumulation). However, the two matched filters can be interchanged; the order is
an implementation option. After 16 clock cycles, when shift register 1 is filled, the data
stored in shift register 1 is matched in parallel with the code applied to the taps of the
matched filter (the tap coefficients). The tap coefficients are the PSC sequence, which is
the same for all BSs. Hence, the same matched filter structure can be used for all BSs.
The adder circuit is implemented as a tree structure with the 16 inputs applied in
parallel. If the data in shift register 1 matches the tap coefficients, the result of the
adder tree is the highest value possible (16 or greater). The second matched filter has a
shift register 2 of 256 registers. Only 16 taps are needed, to match every sixteenth value
of shift register 2. The result from the first adder tree is stored in shift register 2 of
the second matched filter. After 256 clock cycles, shift register 2 is filled with the
results from the first matched filter. The data in shift register 2 is then matched in
parallel with the tap coefficients. The tap coefficients are the same as the PSC sequence.
If the data matches the code sequence, the result of the second adder tree is 256 or
greater in magnitude, corresponding to the peak value. An advantage of this scheme is that
no multiplier circuit is needed, as the correlations can be performed using an
adder/subtractor circuit.

[Figure 4: Hierarchical Matched Filter (16 chip and 16 symbol accumulation). Blocks: InData, ShiftRegister 1 (positions 1 to 16), PSCH Code, Adder Tree 1 (3 levels of adders), ShiftRegister 2 (positions 1 to 256, tapped every sixteenth position), PSCH Code, Adder Tree 2 (3 levels of adders), Result.]
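
The equivalence between the two-step correlation and a single 256-tap correlation can be checked with a minimal Python sketch. The two 16-element sequences below are random placeholders, not the real PSC constituent sequences, and the function names are hypothetical; the sketch only assumes that the 256-chip code is the 16-chip sequence repeated 16 times and weighted by the 16-symbol sequence.

```python
import random

def hierarchical_code(chip_seq, symbol_seq):
    """256-chip code in +/-1 form: chip_seq repeated 16 times, weighted by symbol_seq."""
    return [symbol_seq[n // 16] * chip_seq[n % 16] for n in range(256)]

def direct_correlation(samples, code):
    """Conventional matched filter: a single 256-tap correlation."""
    return sum(s * c for s, c in zip(samples, code))

def hierarchical_correlation(samples, chip_seq, symbol_seq):
    """Two concatenated 16-tap correlations (16 chip, then 16 symbol accumulation)."""
    partial = [
        sum(samples[16 * block + t] * chip_seq[t] for t in range(16))
        for block in range(16)
    ]
    return sum(p * w for p, w in zip(partial, symbol_seq))

rng = random.Random(0)
chip_seq = [rng.choice((1, -1)) for _ in range(16)]    # placeholder, not the real PSC
symbol_seq = [rng.choice((1, -1)) for _ in range(16)]  # placeholder, not the real PSC
samples = [rng.randint(-8, 7) for _ in range(256)]     # 4-bit signed A/D samples
```

Because the taps are +1 or -1, every product in the sketch is an addition or a subtraction, which is why no multiplier is needed in hardware.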
Each memory cell in shift register 1 is 4 bits wide, assuming that the signal at the input
to the digital receiver is sampled with a 4-bit analog-to-digital (A/D) converter. Shift
register 2 is 8 bits wide to store the result from the first adder tree block. To perform
the correlation, it is not necessary to perform 16*16 operations but only 16+16
accumulation operations, which leads to a considerable reduction in hardware complexity.
The hardware complexity of the hierarchical matched filter is calculated as follows. In
one slot period (2,560 chips), the receiver has to perform at least 81,920 complex
additions (2,560*(16+16)). The traditional matched filter implementation without the
hierarchical structure would require 256 complex additions per chip, i.e. 655,360 complex
additions per slot (2,560*256). Thus, the hierarchical matched filter achieves a saving of
a factor of 8 in terms of complex additions. From Figure 2, each slot has a duration of
0.67 msec (670 µsec). The complexity of stage 1 in terms of real additions per second is
therefore 245 Madds/sec (81,920*2/670 µsec). The factor of 2 accounts for the two branches
of the incoming complex signal, which is divided into two components: the cosine part,
called the "in-phase" (I-phase) component, and the sine part, called the
"quadrature-phase" (Q-phase) component. Thus, stage 1 of the initial cell search needs
81,920 complex additions per slot and a computing power of 245 Madds/sec.
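
The arithmetic behind these figures can be reproduced in a few lines of Python. The variable names are illustrative only; the numbers are the ones quoted in the text, with the conventional filter counted as 256 additions per chip position.

```python
CHIPS_PER_SLOT = 2560
SLOT_DURATION_S = 670e-6  # 0.67 msec per slot

# Hierarchical filter: 16 + 16 accumulations per chip position.
hier_adds_per_slot = CHIPS_PER_SLOT * (16 + 16)

# Conventional 256-tap filter: 256 accumulations per chip position.
direct_adds_per_slot = CHIPS_PER_SLOT * 256

saving_factor = direct_adds_per_slot / hier_adds_per_slot

# Factor of 2 for the I and Q branches of the complex signal.
real_adds_per_second = hier_adds_per_slot * 2 / SLOT_DURATION_S
```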
There are two such hierarchical matched filters, one each for the I and Q channels of the
received complex signal, as shown in Figure 5. The correlation results over the I and Q
channels are combined non-coherently over one slot duration, and the result is stored in
an accumulator which is implemented as a shift register. The output of the accumulator is
given to a comparator block to detect the peak value, which corresponds to the slot
boundary of the closest BS. The MS needs to synchronize with this BS. As the code can be
affected by AWGN and fading, accumulation over multiple slots is needed to correctly
identify the slot boundary. Correct identification matters because passing the wrong slot
boundary to stage 2 increases the acquisition time.
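
The non-coherent combining and multi-slot accumulation described above can be sketched as follows. This is a simplified software model, not the hardware of Figure 5; the function names and the input layout (one (I, Q) correlation pair per chip offset per slot) are assumptions made for illustration.

```python
def accumulate_energy(slots):
    """Sum I^2 + Q^2 per chip offset over all observed slots.

    slots: list of slots, each a list of (i_corr, q_corr) pairs,
           one pair per chip offset within the slot.
    """
    totals = [0] * len(slots[0])
    for slot in slots:
        for offset, (i_corr, q_corr) in enumerate(slot):
            totals[offset] += i_corr * i_corr + q_corr * q_corr
    return totals

def detect_slot_boundary(slots):
    """Comparator: return the chip offset with the largest accumulated energy."""
    totals = accumulate_energy(slots)
    return max(range(len(totals)), key=totals.__getitem__)
```

Squaring and adding the two branches removes the dependence on the unknown carrier phase, and summing over several slots averages out noise peaks that appear in only one slot.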
[Figure 5: Slot Boundary Detection. Blocks: InData, I-Phase and Q-Phase hierarchical matched filters (ShiftRegister 1, Adder Tree 1, ShiftRegister 2, Adder Tree 2, PSCH Code, 3 levels of adders each), Non-Coherent Detection Block, Accumulator, Comparator; outputs: Slot Boundary Value, Stage 1 Complete.]
4.2 Stage 2: Frame Synchronization and Code Group Identification
The Secondary SCH consists of 15 sequences belonging to a family of cyclic codes
(SSCs), each of length 256 chips. These SSCs are transmitted repeatedly in parallel with
the Primary SCH. The procedure for constructing the cyclic codes is similar to that of the
hierarchical sequence (equation 2) for the Primary SCH except that it uses specific
sequences of length 16 from Table 2 for each code group.
The procedure for constructing the cyclic hierarchical sequence $C_s^{i,1}$ for slot 1 is
exactly the same as constructing the hierarchical sequence $C_p$ for the Primary SCH. The
sequence $C_s^{i,1}$ for slot 1 will be referred to as the zero cyclic shift sequence, as
no shift is applied to the constituent sequence $X_1^{i}$. For slots 2 to 15, the cyclic
codes are constructed from the two constituent sequences $X_1^{i,k-1}$ and $X_2^{i,k-1}$
of length $n_1$ and $n_2$ respectively using the following formula

$$C_s^{i,k}(n) = X_2^{i,k-1}(n \bmod n_2) + X_1^{i,k-1}(n \,\mathrm{div}\, n_1) \pmod 2, \quad n = 0, 1, \ldots, (n_1 n_2) - 1 \tag{3}$$

where i is the code group number,
k=2,3,..,15 is the slot number,
n is the chip number in the slot, $n_1 = n_2 = 16$, and
the constituent sequences $X_1^{i,k-1}$ and $X_2^{i,k-1}$ in each code group i are chosen
to be the following sequences from Table 2 [9].
The constituent sequence $X_2^{i,k-1}$ (inner sequence) is exactly equal to the base
sequence $X_2^{i}$ in every slot, i.e. $X_2^{i,k-1} = X_2^{i}$ for all k. The constituent
sequence $X_1^{i,k-1}$ (outer sequence) is formed from the base sequence $X_1^{i}$ by a
cyclic right shift of $X_1^{i}$ by k-1 positions (from 0 to 14) for each slot number k
from 1 to 15. The generation of the cyclic codes can be understood clearly by considering
the following example.
For the first code group the sequences are given by
$X_1^{1,0}$ = (1,1,1,-1,-1,-1,1,-1,-1,1,1,-1,1,-1,1,1), k=1 for slot 1, no cyclic shift
$X_1^{1,1}$ = (1,1,1,1,-1,-1,-1,1,-1,-1,1,1,-1,1,-1,1), k=2 for slot 2, cyclic right shift by 1 position
$X_1^{1,14}$ = (1,-1,-1,-1,1,-1,-1,1,1,-1,1,-1,1,1,1,1), k=15 for slot 15, cyclic right shift by 14 positions.
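
The example can be reproduced with a short Python sketch. The helper names are hypothetical. The last function applies equation 3 in the +1/-1 domain, where the modulo 2 sum of two bits becomes a product of two signs; that mapping is an assumption of the sketch, since Table 2 lists the sequences in +1/-1 form.

```python
X1_GROUP1 = [1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1]

def cyclic_right_shift(seq, positions):
    """Rotate seq to the right by the given number of positions."""
    positions %= len(seq)
    if positions == 0:
        return list(seq)
    return seq[-positions:] + seq[:-positions]

def outer_sequence(base, slot):
    """Outer constituent sequence for slot k = 1..15 (shift by k-1)."""
    return cyclic_right_shift(base, slot - 1)

def slot_code(x1_base, x2_base, slot):
    """256-chip cyclic code for one slot, equation 3 in +/-1 form."""
    outer = outer_sequence(x1_base, slot)
    return [x2_base[n % 16] * outer[n // 16] for n in range(256)]
```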
Table 2: Sequences X1i and X2i for Code Groups 1 to 32
Code Group   Sequence
 1            1  1  1 -1 -1 -1  1 -1 -1  1  1 -1  1 -1  1  1
 2            1 -1  1  1 -1  1  1  1 -1 -1  1  1  1  1  1 -1
 3            1  1 -1  1 -1 -1 -1  1 -1  1 -1  1  1 -1 -1 -1
 4            1 -1 -1 -1 -1  1 -1 -1 -1 -1 -1 -1  1  1 -1  1
 5            1  1  1 -1  1  1 -1  1 -1  1  1 -1 -1  1 -1 -1
 6            1 -1  1  1  1 -1 -1 -1 -1 -1  1  1 -1 -1 -1  1
 7            1  1 -1  1  1  1  1 -1 -1  1 -1  1 -1  1  1  1
 8            1 -1 -1 -1  1 -1  1  1 -1 -1 -1 -1 -1 -1  1 -1
 9            1  1 -1  1 -1 -1 -1  1  1 -1  1 -1 -1  1  1  1
10            1 -1 -1 -1 -1  1 -1 -1  1  1  1  1 -1 -1  1 -1
11           -1  1 -1 -1 -1 -1 -1  1  1  1  1 -1  1 -1  1 -1
12           -1 -1 -1  1 -1  1 -1 -1  1 -1  1  1  1  1  1  1
13            1 -1 -1 -1  1 -1 -1  1 -1 -1 -1  1  1  1  1 -1
14            1  1 -1  1  1  1 -1 -1 -1  1 -1 -1  1 -1  1  1
15            1 -1 -1 -1 -1  1  1 -1  1  1  1 -1  1  1  1 -1
16            1  1 -1  1 -1 -1  1  1  1 -1  1  1  1 -1  1  1
17            1 -1  1  1 -1  1 -1  1  1  1 -1  1  1  1 -1  1
18            1  1  1 -1 -1 -1 -1 -1  1 -1 -1 -1  1 -1 -1 -1
19            1 -1 -1 -1  1 -1 -1  1  1  1  1 -1 -1 -1 -1  1
20            1  1 -1  1  1  1 -1 -1  1 -1  1  1 -1  1 -1 -1
21           -1 -1 -1  1  1 -1 -1  1  1 -1  1 -1  1  1 -1 -1
22           -1  1 -1 -1  1  1 -1 -1  1  1  1  1  1 -1 -1  1
23           -1 -1  1 -1  1 -1  1 -1 -1  1  1 -1 -1 -1 -1 -1
24           -1  1  1  1  1  1  1  1 -1 -1  1  1 -1  1 -1  1
25           -1  1  1  1 -1 -1  1  1  1 -1 -1 -1 -1 -1 -1  1
26           -1 -1  1 -1 -1  1  1 -1  1  1 -1  1 -1  1 -1 -1
27           -1  1  1  1  1  1 -1 -1  1 -1 -1 -1  1  1  1 -1
28           -1 -1  1 -1  1 -1 -1  1  1  1 -1  1  1 -1  1  1
29           -1  1 -1 -1  1  1  1  1  1 -1  1  1  1  1 -1  1
30           -1 -1 -1  1  1 -1  1 -1  1  1  1 -1  1 -1 -1 -1
31           -1  1  1  1 -1 -1  1  1 -1  1  1  1  1  1  1 -1
32           -1 -1  1 -1 -1  1  1 -1 -1 -1  1 -1  1 -1  1  1
The same procedure for forming the cyclic codes is used for the other code groups. A base
sequence of length 16 admits 16 cyclic shifts (0 to 15), so each of the 32 code groups has
16 cyclic codes, giving 512 (32X16) different cyclic codes with a length of 256 chips
each. The 15 slots of one frame use the shifts 0 to 14. This set of 512 cyclic codes has
good correlation properties that make the codes good candidates for the SSCs. Many pairs
of cyclic codes are fully orthogonal, as their cross correlation is zero, and some pairs
have small cross correlation. The cross correlation of each cyclic hierarchical sequence
$C_s^{i,k}$ with the $C_p$ code of the Primary SCH is also small. The cyclic codes are
unique for each code group/slot location pair. Thus, it is possible to uniquely determine
both the scrambling code group and the frame timing in the second stage of the initial
cell search.
By identifying the code group/slot location pair that gives the maximum correlation
value, the code group as well as the frame synchronization is determined. The output
from the matched filter is given to a non-coherent block which computes the energy over I
and Q channels and then gives the result to the comparator module as shown in Figure 6.
One slot search period time (2,560 chips) is enough to uniquely identify the correct code
group and the frame timing in the second stage of acquisition when the signal-to-noise
ratio is high. This is one major difference from the 3GPP-comma free CSD, where at least
three slots are necessary to uniquely identify the correct code group and frame timing.
The Improved CSD also uses a smaller ROM (32X16) to store the cyclic codes, compared with
the 3GPP-comma free CSD, which uses a ROM of size 32X60 to store the comma free codes.
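
The maximum-correlation search over code group/slot pairs can be sketched in Python as below. This is a toy model under stated assumptions: only the first three rows of Table 2 are used, the inner sequence is assumed to equal the listed base sequence of the group, the received slot is real-valued instead of complex, and `slot_code` and `identify` are hypothetical names.

```python
BASE = {  # first three rows of Table 2
    1: [1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1],
    2: [1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1, 1, 1, 1, 1, -1],
    3: [1, 1, -1, 1, -1, -1, -1, 1, -1, 1, -1, 1, 1, -1, -1, -1],
}

def slot_code(group, slot):
    """256-chip cyclic code for a code group and slot number (1..15)."""
    base = BASE[group]
    shift = slot - 1
    outer = base[-shift:] + base[:-shift] if shift else base
    inner = base  # assumption: inner sequence taken equal to the base sequence
    return [inner[n % 16] * outer[n // 16] for n in range(256)]

def identify(received):
    """Return the (code group, slot) pair with the largest correlation."""
    candidates = [(g, k) for g in BASE for k in range(1, 16)]
    return max(
        candidates,
        key=lambda gk: sum(r * c for r, c in zip(received, slot_code(*gk))),
    )
```

One received slot of 256 chips is enough for the search in this noise-free toy setting because every (group, slot) pair maps to a different 256-chip code.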
The input data samples for the Secondary SCH are stored in an input buffer with 256
complex memory cells called the Secondary Buffer, as shown in Figure 6. These input data
samples are produced after waveform matched filtering and sampling at the chip rate. The
result from the hierarchical matched filter design is then given to a non-coherent module,
which calculates the energy over the I and Q channels and passes it to a comparator block.

[Figure 6: Frame synchronization and Code Group Identification. Blocks: Sampling Counter, Secondary Buffer (positions 1 to 256, used to fill the data register of Matched Filter 1), Matched Filter 1 and Matched Filter 2 (Shift Register 1, Shift Register 2, Code Register 1, Code Register 2, two 3-level adder trees, clocked at 5X SysClock), ROM 32 X 16 holding the Cyclic Codes, Non-coherent Detection Block for the I-Phase and Q-Phase, Comparator; inputs: Slot Boundary Value, Enable (Stage 1 Complete); outputs: Code Group, Slot ID, Stage 2 Complete.]

The ROM-stored code sequences given in Table 2 are each tried in succession before the
data from the next slot comes in. The data in the shift register is latched until all
these sequences have been correlated. This is achieved in stage 2 of the Improved CSD
scheme using two clocks: a slow clock, called the system clock in the design, and a fast
clock which runs at 5X the system clock. The sampling is performed at the slow clock rate
(system clock). Once the data is latched in the buffer, the fast clock (5X system clock)
is used to perform the correlations.
The comparator block gives the code group from Table 2 that correlates most highly with
the data sequence, together with the number of shifts that have been applied to the code
group sequence. The number of shifts is the same as the slot ID. From the slot ID the
frame boundary can easily be identified because the number of slots in a frame is fixed
at 15.
4.3 Stage 3: Scrambling Code Identification
After achieving code group and frame synchronization, the scrambling code is identified
by correlating the symbols in the CPICH with all possible scrambling codes in the code
group. The codes are generated using a scrambling code generator and the descrambling
operation is carried out using a descrambler. The details of the scrambling code generator
and the descrambler used in stage 3 of the cell search are explained in Sections 4.3.1 and
4.3.2 respectively.
4.3.1 Scrambling Code Generator
Each cell is allocated one and only one primary scrambling code. The scrambling code
sequences are constructed by combining two real sequences into a complex sequence [7].
Each of the two real sequences is constructed as the position wise modulo 2 sum of
38,400 chip segments of two binary sequences generated by means of two generator
polynomials of degree 18. Let x and y be the two sequences respectively. The resulting
sequences constitute segments of a set of Gold sequences. The x sequence is constructed
using the primitive polynomial $1+X^7+X^{18}$. The y sequence is constructed using the
polynomial $1+X^5+X^7+X^{10}+X^{18}$. The sequence depending on the chosen scrambling
code number n is denoted as $z_n$. Furthermore, let $x(i)$, $y(i)$ and $z_n(i)$ denote
the ith symbol of the sequence x, y, and $z_n$, respectively. The sequences x and y are
constructed as

$$x(i+18) = x(i+7) + x(i) \pmod 2, \quad i = 0, 1, \ldots, 2^{18}-20 \tag{4}$$

$$y(i+18) = y(i+10) + y(i+7) + y(i+5) + y(i) \pmod 2, \quad i = 0, 1, \ldots, 2^{18}-20 \tag{5}$$

The nth Gold code sequence $z_n$, $n = 0, 1, \ldots, 2^{18}-2$, is then defined as

$$z_n(i) = x((i+n) \bmod (2^{18}-1)) + y(i) \pmod 2, \quad i = 0, 1, \ldots, 2^{18}-2 \tag{6}$$

Finally, the nth complex scrambling code sequence $s_n$ is defined as

$$s_n(i) = z_n(i) + j\, z_n((i+131{,}072) \bmod (2^{18}-1)), \quad i = 0, 1, \ldots, 38{,}399 \tag{7}$$

The pattern from phase 0 up to the phase of 38,399 is repeated for every radio frame.
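
Equations 4 to 7 can be exercised with a minimal Python sketch. Two points are assumptions, not statements from the text: the initial register contents (x starts as 1 followed by seventeen 0s, y as eighteen 1s, which is the choice made in 3GPP TS 25.213) and the mapping of binary 0/1 to the levels +1/-1 before forming the complex chip. The function names are hypothetical.

```python
PERIOD = 2**18 - 1  # length of the x and y sequences

def m_sequences():
    """Generate x and y over one full period using equations 4 and 5."""
    x = [1] + [0] * 17  # assumed initial condition (not given in the text)
    y = [1] * 18        # assumed initial condition (not given in the text)
    for i in range(PERIOD - 18):  # i = 0 .. 2^18 - 20
        x.append(x[i + 7] ^ x[i])
        y.append(y[i + 10] ^ y[i + 7] ^ y[i + 5] ^ y[i])
    return x, y

def gold_bit(x, y, n, i):
    """z_n(i) from equation 6."""
    return x[(i + n) % PERIOD] ^ y[i]

def scrambling_chip(x, y, n, i):
    """s_n(i) from equation 7, with bits mapped 0 -> +1 and 1 -> -1 (assumed)."""
    z_i = gold_bit(x, y, n, i)
    z_q = gold_bit(x, y, n, (i + 131072) % PERIOD)
    return complex(1 - 2 * z_i, 1 - 2 * z_q)
```

In this model the Q branch reads the same Gold sequence 131,072 chips ahead, so a single pair of registers serves both branches.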
The scrambling code generator used to generate the long codes is shown in Figure 7. A
total of $2^{18}-1$ = 262,143 scrambling codes, numbered 0,1,..,262,142, can be generated
using the code generator. However, not all the scrambling codes are used. The scrambling
codes are divided into 512 sets, each consisting of a primary scrambling code and 15
secondary scrambling codes. The primary scrambling codes consist of scrambling codes
n=16*i where i=0,1,..,511. The ith set of secondary scrambling codes consists of
scrambling codes 16*i+k, where k=1,2,..,15. There is a one-to-one mapping between each
primary scrambling code and the 15 secondary scrambling codes in a set, such that the ith
primary scrambling code corresponds to the ith set of secondary scrambling codes. The set
of primary scrambling codes is further divided into 32 scrambling code groups, each
consisting of 16 primary scrambling codes. The jth scrambling code group consists of
primary scrambling codes 16*16*j+16*k, where j=0,1,..,31 and k=0,1,..,15.
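
The numbering scheme can be checked with a small Python sketch. The function names are hypothetical, and the sketch takes each code group to hold 16 primary codes, as stated above, so the index k inside a group runs over 16 values.

```python
def primary_code(i):
    """Scrambling code number of the ith primary code, i = 0..511."""
    return 16 * i

def secondary_codes(i):
    """The 15 secondary scrambling codes attached to the ith primary code."""
    return [16 * i + k for k in range(1, 16)]

def group_primary_codes(j):
    """Primary scrambling codes of code group j = 0..31 (16 per group)."""
    return [16 * 16 * j + 16 * k for k in range(16)]
```

For example, code group 1 holds the primary codes 256, 272, .., 496, and primary code 256 owns the secondary codes 257 to 271.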
[Figure 7: Scrambling Code Generator. Two 18-stage shift registers (stages 17 down to 0) with modulo 2 adders, producing the I Channel Code and the Q Channel Code.]
In stage 3, 16 scrambling codes need to be generated in parallel. If the scrambling code
generator shown in Figure 7 is used to generate the codes then 16 such code generators
would be required. However, generating the codes in parallel using 16 code generators
could be expensive as a huge ROM would be required to store the initial phases for all the
16 code generators.
Table 3: Masking Functions used in Stage 3: Scrambling Code Generator
          Masking Function For I Channel    Masking Function For Q Channel
          Code in LFSR 1                    Code in LFSR 1
Code1     000000000000000001                001000000001010000
Code2     000000000000000010                010000000010100000
Code3     000000000000000100                100000000101000000
Code4     000000000000001000                000000001000000001
Code5     000000000000010000                000000010000000010
Code6     000000000000100000                000000100000000100
Code7     000000000001000000                000001000000001000
Code8     000000000010000000                000010000000010000
[Figure 8: Multiple Scrambling Code Generator. LFSR 1 and LFSR 2, each with masking functions for the I Channel and the Q Channel, producing the I Channel Code and Q Channel Code for each generated code; the initial phases for the code generator are read from a ROM of size 32 X 18.]
In order to reduce the hardware utilization, stage 3 of both designs uses only one
scrambling code generator to generate 16 codes in parallel when 32 code groups are used,
as shown in Figure 8. Sixteen masking functions are used to generate the codes in
parallel [15]. Masking functions can generate codes which have minimum overlap and reduce
the hardware circuitry to a single scrambling code generator at the expense of a few
logic gates. The masking functions used for generating the codes are given in Table 3.
The masking functions for the I and Q Channel Code in linear feedback shift register
(LFSR) 2 were kept fixed at 000000000000000001 and 001111111101100000 respectively.
Besides reducing the hardware from 16 code generators to one code generator, the design
also reduces the ROM size to 32X18 from the 512X18 that would be needed if 16 code
generators were used.
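
The entries of Table 3 appear to follow a simple rule: each mask is the previous one advanced by one chip, with a bit that overflows past stage 17 folded back into stages 7 and 0 according to the x recurrence x(i+18) = x(i+7) + x(i). The Python sketch below regenerates the LFSR 1 masks on that assumption. It is an illustration of the rule, not the author's generation method, and the names are hypothetical.

```python
def shift_mask(mask):
    """Advance a masked output of LFSR 1 by one chip (multiply by X mod the polynomial)."""
    mask <<= 1
    if mask & (1 << 18):
        # X^18 = X^7 + 1 for the x sequence, so fold the overflow back.
        mask ^= (1 << 18) | (1 << 7) | 1
    return mask

def mask_series(first, count=16):
    """Masks for Code1..Code16, starting from the Code1 mask."""
    masks = [first]
    for _ in range(count - 1):
        masks.append(shift_mask(masks[-1]))
    return masks

def masked_output(state, mask):
    """Output bit: modulo 2 sum of the register stages selected by the mask."""
    return bin(state & mask).count("1") & 1

def as_bits(mask):
    """18-character binary string, stage 17 on the left."""
    return format(mask, "018b")

I_MASKS = mask_series(0b000000000000000001)
Q_MASKS = mask_series(0b001000000001010000)
```

With this rule only the Code1 masks need to be stored; the remaining fifteen pairs follow by repeated shifting.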
Table 3 (continued): Masking Functions used in Stage 3: Scrambling Code Generator
          Masking Function For I Channel    Masking Function For Q Channel
          Code in LFSR 1                    Code in LFSR 1
Code9     000000000100000000                000100000000100000
Code10    000000001000000000                001000000001000000
Code11    000000010000000000                010000000010000000
Code12    000000100000000000                100000000100000000
Code13    000001000000000000                000000001010000001
Code14    000010000000000000                000000010100000010
Code15    000100000000000000                000000101000000100
Code16    001000000000000000                000001010000001000

4.3.2 Descrambler
Descrambling is carried out using data over the CPICH and the codes generated by the
scrambling code generator and masking functions. Counters are used, as shown in Figure 9,
to keep track of the votes obtained after the descrambling and the comparison operations.
After these operations are completed, the final step is to decide whether cell search has
been successful and a code has been found. For this purpose a parameter called the
probability of false alarm (PFA) is used to predefine the threshold value (VTH) [19]. The
relation can be expressed by the following equation

$$P_{FA} = e^{-V_{TH}/V} \tag{8}$$

where V is twice the variance of the I and Q components.
If the counter exceeds $V_{TH}$ then the cell search operation is declared a success and
the particular long code is identified.
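
Equation 8 can be inverted to obtain the threshold for a target false alarm probability, V_TH = -V ln(P_FA). A minimal Python sketch, with hypothetical function names and an arbitrary example value for V:

```python
import math

def threshold_from_pfa(p_fa, v):
    """Threshold V_TH that gives false alarm probability p_fa (inverse of equation 8)."""
    return -v * math.log(p_fa)

def pfa_from_threshold(v_th, v):
    """Equation 8: P_FA = exp(-V_TH / V)."""
    return math.exp(-v_th / v)

# Example: V = 2.0 is an arbitrary illustrative value, not a figure from the thesis.
example_threshold = threshold_from_pfa(1e-3, 2.0)
```

A smaller P_FA gives a larger threshold, so fewer false detections are traded against a longer acquisition time.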
[Figure 9: Scrambling Code Identification. Blocks: Multiple Scrambling Code Generator (ROM 32X18 holding the initial phases, masking functions for the I Channel and Q Channel, outputs 1 to 16), Descrambler 1 to Descrambler 16 operating on the I Channel and Q Channel data, First Comparator Block, counters 1 to 16 (Increment Counter), Threshold Value, Second Comparator Block; outputs: Long Code, Code Found.]
Chapter 5
3GPP-comma free Cell Search Design
5.0 3GPP-comma free Cell Search Design
This Chapter discusses stage 2 of the 3GPP cell search design using comma free codes.
Stage 1 and stage 3 for the 3GPP-comma free CSD design were kept the same as the
Improved CSD to compare stage 2 of both the designs. A Fast Hadamard Transformer
(FHT) is proposed to be used in stage 2 of the cell search algorithm. To reduce the hard-
ware utilization of the FHT design, reduced length Walsh sequences are proposed as
explained in Section 5.1.
5.1 Stage 2 of 3GPP-comma free Cell Search Design
In CDMA systems, the BS identifies each user in a cell by a unique scrambling code. In
order to minimize the interference in a cell when two users transmit at the same time,
orthogonal (Walsh) codes are used. The Walsh codes are generated using a Walsh-Had-
amard function. When these Walsh codes are transmitted by the BS, they are affected by
interference, fading and noise which may be AWGN. At the receiver, a decoding logic is
required to correctly determine which of the Walsh codes was the most likely to have been
sent. A FHT can be used to provide such a decoding circuitry.
The table provided in the 3GPP specifications for the comma free codes is for 64 code
groups. For comparison with the Improved CSD scheme, which uses 32 code groups, only 32 of
the possible 64 code groups are used. The 32 secondary SCH sequences are constructed such
that their cyclic shifts are unique, i.e., a non-zero cyclic shift less than 15 of any of
the 32 sequences is not equivalent to any cyclic shift of any other of the 32 sequences.
Also, a non-zero cyclic shift less than 15 of any of the sequences is not equivalent to
the sequence itself with any other cyclic shift less than 15. Table 4 lists the sequences
of SSCs used to encode the 32 different scrambling code groups [7].
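
The uniqueness property can be illustrated with a short Python sketch that uses only the first three rows of Table 4 (Groups 0 to 2). The lookup structure and the function names are assumptions for illustration; the real detector works on all 32 rows.

```python
TABLE4_SUBSET = {  # Groups 0 to 2 of Table 4
    0: [1, 1, 2, 8, 9, 10, 15, 8, 10, 16, 2, 7, 15, 7, 16],
    1: [1, 1, 5, 16, 7, 3, 14, 16, 3, 10, 5, 12, 14, 12, 10],
    2: [1, 2, 1, 15, 5, 5, 12, 16, 6, 11, 2, 16, 11, 15, 12],
}

def cyclic_shifts(seq):
    """All cyclic shifts of seq; shift s starts reading at slot s."""
    return [tuple(seq[s:] + seq[:s]) for s in range(len(seq))]

def build_lookup(table):
    """Map every cyclic shift to its (code group, starting slot) pair."""
    lookup = {}
    for group, seq in table.items():
        for start_slot, shifted in enumerate(cyclic_shifts(seq)):
            if shifted in lookup:
                raise ValueError("cyclic shifts are not unique")
            lookup[shifted] = (group, start_slot)
    return lookup
```

Because no two shifts coincide, an observed run of 15 SSC indices identifies both the code group and the slot in which the observation started.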
Table 4: Allocation of SSCs for Secondary SCH
Scrambling    Slot number
Code Group      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Group 0         1   1   2   8   9  10  15   8  10  16   2   7  15   7  16
Group 1         1   1   5  16   7   3  14  16   3  10   5  12  14  12  10
Group 2         1   2   1  15   5   5  12  16   6  11   2  16  11  15  12
Group 3         1   2   3   1   8   6   5   2   5   8   4   4   6   3   7
Group 4         1   2  16   6   6  11  15   5  12   1  15  12  16  11   2
Group 5         1   3   4   7   4   1   5   5   3   6   2   8   7   6   8
Group 6         1   4  11   3   4  10   9   2  11   2  10  12  12   9   3
Group 7         1   5   6   6  14   9  10   2  13   9   2   5  14   1  13
Group 8         1   6  10  10   4  11   7  13  16  11  13   6   4   1  16
Group 9         1   6  13   2  14   2   6   5   5  13  10   9   1  14  10
Group 10        1   7   8   5   7   2   4   3   8   3   2   6   6   4   5
Group 11        1   7  10   9  16   7   9  15   1   8  16   8  15   2   2
Group 12        1   8  12   9   9   4  13  16   5   1  13   5  12   4   8
Group 13        1   8  14  10  14   1  15  15   8   5  11   4  10   5   4
Group 14        1   9   2  15  15  16  10   7   8   1  10   8   2  16   9
Group 15        1   9  15   6  16   2  13  14  10  11   7   4   5  12   3
Group 16        1  10   9  11  15   7   6   4  16   5   2  12  13   3  14
Group 17        1  11  14   4  13   2   9  10  12  16   8   5   3  15   6
Group 18        1  12  12  13  14   7   2   8  14   2   1  13  11   8  11
Group 19        1  12  15   5   4  14   3  16   7   8   6   2  10  11  13
Group 20        1  15   4   3   7   6  10  13  12   5  14  16   8   2  11
Group 21        1  16   3  12  11   9  13   5   8   2  14   7   4  10  15
Group 22        2   2   5  10  16  11   3  10  11   8   5  13   3  13   8
Group 23        2   2  12   3  15   5   8   3   5  14  12   9   8   9  14
Group 24        2   3   6  16  12  16   3  13  13   6   7   9   2  12   7
Group 25        2   3   8   2   9  15  14   3  14   9   5   5  15   8  12
Group 26        2   4   7   9   5   4   9  11   2  14   5  14  11  16  16
Group 27        2   4  13  12  12   7  15  10   5   2  15   5  13   7   4
Table 4 (continued): Allocation of SSCs for Secondary SCH
Scrambling    Slot number
Code Group      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Group 28        2   5   9   9   3  12   8  14  15  12  14   5   3   2  15
Group 29        2   5  11   7   2  11   9   4  16   7  16   9  14  14   4
Group 30        2   6   2  13   3   3  12   9   7  16   6   9  16  13  12
Group 31        2   6   9   7   7  16  13   3  12   2  13  12   9  16   6

The 16 SSCs, ($C_{ssc,1}$,..,$C_{ssc,16}$), are complex-valued with identical real and
imaginary components, and are constructed from position wise multiplication of a Hadamard
sequence and a sequence z, defined as z=(b,b,b,-b,b,b,-b,-b,b,-b,b,-b,-b,-b,-b,-b), where
b=(1,1,1,1,1,1,-1,-1,1,-1,1,-1,1,-1,-1,1). The Hadamard sequence is obtained from one of
the rows of a Hadamard matrix, which consists of +1 and -1 entries. The rows and columns
of the Hadamard matrix are mutually orthogonal. The following examples show how to
construct a Hadamard matrix

$$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad H_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

In general the Hadamard matrix can be defined recursively as

$$H_{2N} = \begin{bmatrix} H_N & H_N \\ H_N & -H_N \end{bmatrix} \tag{9}$$

where $H_N$ is a matrix of size N X N.
If a vector X with length N is an input then the vector Y obtained as a result of the
Hadamard transform is equal to

$$Y = H_N X \tag{10}$$
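
The recursive construction and the transform of equation 10 can be written directly in Python. This is a minimal sketch with hypothetical function names, using plain lists and assuming the size is a power of two.

```python
def hadamard(n):
    """Hadamard matrix of size n x n (n a power of two), built by the recursion of equation 9."""
    if n == 1:
        return [[1]]
    half = hadamard(n // 2)
    top = [row + row for row in half]
    bottom = [row + [-v for v in row] for row in half]
    return top + bottom

def hadamard_transform(x):
    """Equation 10: Y = H_N * X, computed as N dot products."""
    h = hadamard(len(x))
    return [sum(a * b for a, b in zip(row, x)) for row in h]
```

Transforming a row of the matrix gives a single non-zero output at the index of that row, which is what makes the transform usable as a Walsh code detector.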
The entries in Table 4 denote what SSC to use in the different slots for the different
scrambling code groups, e.g. the entry "5" means that SSC $C_{ssc,5}$ shall be used for
the corresponding scrambling code group and slot. The kth SSC, $C_{ssc,k}$, k=1,2,..,16,
can be calculated using the following expression:

$$C_{ssc,k} = (1+j)\,\big(H_m(0)z(0),\ H_m(1)z(1),\ H_m(2)z(2),\ \ldots,\ H_m(255)z(255)\big) \tag{11}$$

where m=16(k-1).
As each element of the Hadamard matrix is either +1 or -1, the multiplication operation
used in equation 11 can be reduced to a series of addition/subtraction operations. In
general, for an N-point input sample, the FHT algorithm needs to perform $N \log_2 N$
addition and subtraction operations.
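
Equation 11 can be sketched in Python as follows. The sequence b is copied as printed in the text, the Hadamard entry is computed in closed form for the recursive ordering of equation 9 (assumed to be the 256 x 256 matrix), and the function names are hypothetical.

```python
B = [1, 1, 1, 1, 1, 1, -1, -1, 1, -1, 1, -1, 1, -1, -1, 1]  # b as printed in the text
SIGNS = [1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, -1, -1, -1, -1, -1]
Z = [s * b for s in SIGNS for b in B]  # z = (b, b, b, -b, ...), 256 chips

def hadamard_entry(m, n):
    """Entry (m, n) of the recursively built Hadamard matrix: (-1)^popcount(m AND n)."""
    return -1 if bin(m & n).count("1") % 2 else 1

def ssc(k):
    """kth secondary synchronization code, k = 1..16, per equation 11."""
    m = 16 * (k - 1)
    return [(1 + 1j) * hadamard_entry(m, n) * Z[n] for n in range(256)]
```

Multiplying every Hadamard row by the same sequence z leaves the rows mutually orthogonal, so the 16 SSCs remain orthogonal to each other.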
Figure 10 shows an individual stage of the FHT. Each stage has an upper and a lower
input terminal. The upper input terminal is configured to receive multiple input signals
which are either Walsh chips (if the stage is the first stage of the FHT) or intermediate cor-
relation coefficients (if the stage is not the first stage of the FHT). If an input of N-Walsh
chips is to be processed then the upper input terminal receives N/2 input signal bits and the
lower input terminal receives the other N/2 input bits.
[Figure 10: Individual Stage of FHT. Upper Input Terminal, Lower Input Terminal, adder, adder/subtractor, shift registers, multiplexers, Enable, Output to Next Stage of FHT.]

[Figure 11: 16 chip FHT. Sampling Counter, Slot Boundary Value and Enable (Stage 1 Complete) from Stage 1, Buffer holding the input data bits, Data to FHT, four FHT stages (Phase 1 to Phase 4), each with an adder, an adder/subtractor and shift registers, controlled by a 3-bit counter (C0 = LSB, C2 = MSB), Hadamard Code Metrics, Register to Store Hadamard Row Ids (Slot 1 to Slot 15), Detector (Phase 5), ROM 32X60 holding the Comma Free Codes (Table 4, 3GPP 25.213 v4.0), Comparator; outputs: Code Group, Slot ID.]
Figure 11 shows the design for a FHT structure which is used for decoding a 16 chip
sequence. The design proposed is a very compact and efficient implementation as com-
pared to previous designs [13] [14]. The inputs to the FHT are applied according to the
timing diagram as shown in Table 5. The inputs are applied in a non-sequential order and
hence a buffer is required to initially store the vectors before passing them to the FHT
structure. If a 16 chip sequence needs to be decoded then a buffer of length 16 registers is
required to initially store the vectors. The addition and subtraction operations in the FHT
algorithm are used to generate correlation coefficients for the received Walsh code. The
correlation coefficients express the likelihood that a received codeword is the correct
Walsh code.
Table 5: Timing Diagram of Inputs to FHT
Phase 1 Upper Input    0  1  2  3  4  5  6  7
Phase 1 Lower Input    8  9 10 11 12 13 14 15
Phase 2 Upper Input    0  1  2  3
Phase 2 Lower Input    4  5  6  7
Phase 3 Upper Input    0  1
Phase 3 Lower Input    2  3
Phase 4 Upper Input    0
Phase 4 Lower Input    1
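
The pattern in Table 5 is regular: each phase splits its inputs into a first half for the upper terminal and a second half for the lower terminal, and the number of inputs halves from one phase to the next. A small Python sketch (the function name is hypothetical) regenerates the table for a 16 chip sequence:

```python
def fht_input_schedule(n):
    """Upper and lower input indices per phase for an n-chip FHT (n a power of two)."""
    schedule = {}
    length, phase = n, 1
    while length >= 2:
        half = length // 2
        schedule[phase] = (list(range(half)), list(range(half, length)))
        length, phase = half, phase + 1
    return schedule
```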
[Figure 12: Hadamard Code Metrics (Butterfly Operation). Starting from the inputs y1 to y16, Phase 1 forms the pairwise sums and differences (y1+y2, y1-y2, .., y15+y16, y15-y16). Phases 2, 3 and 4 repeat the sum/difference step on groups of 4, 8 and 16 inputs, ending with the 16 code metrics, which range from ((y1+y2)+(y3+y4))+((y5+y6)+(y7+y8))+((y9+y10)+(y11+y12))+((y13+y14)+(y15+y16)) to ((y1-y2)-(y3-y4))-((y5-y6)-(y7-y8))-((y9-y10)-(y11-y12))-((y13-y14)-(y15-y16)).]
The correlation coefficients are also called the Hadamard code metrics and are generated
as shown in Figure 12 for a 16-point FHT. This operation is also called the butterfly
operation. The butterfly operation is also used in other digital signal processing (DSP)
applications such as calculating the discrete Fourier transform (DFT). The Walsh code
having the largest metric is then selected as the code most likely to have been
transmitted.
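
The butterfly operation can be sketched in software as an in-place fast Hadamard transform. This is a generic textbook formulation, not the pipelined hardware of Figure 11, and the order in which the metrics come out may differ from the order drawn in Figure 12. The function names are hypothetical.

```python
def fht(values):
    """Fast Hadamard transform by butterflies; returns (metrics, add/subtract count)."""
    y = list(values)
    n = len(y)  # assumed to be a power of two
    span = 1
    operations = 0
    while span < n:
        for start in range(0, n, 2 * span):
            for j in range(start, start + span):
                a, b = y[j], y[j + span]
                y[j], y[j + span] = a + b, a - b  # one butterfly
                operations += 2
        span *= 2
    return y, operations

def walsh_row(m, n):
    """Row m of the n x n Hadamard matrix in the recursive ordering."""
    return [-1 if bin(m & k).count("1") % 2 else 1 for k in range(n)]

def detect_walsh(received):
    """Index of the Walsh code with the largest metric."""
    metrics, _ = fht(received)
    return max(range(len(metrics)), key=metrics.__getitem__)
```

For a 16-point input the sketch performs 16 * 4 = 64 additions and subtractions, in line with the N log2 N count, compared with 16 * 16 for the direct matrix product.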
It is the job of the detector to find which code group and slot ID are being used, by
looking up the three Hadamard rows (Walsh codes) in the table provided in the 3GPP
specifications [7]. The detector needs to identify the code group in the minimum amount
of time, which requires a lot of hardware resources. Also, if the correct sequence of
Hadamard rows is not identified and given to the detector, additional clock cycles are
wasted while the detector tries to find the sequence in the table. The detection
circuitry is used to locate the sequence in the table and hence find the code group and
slot ID. Also, in the 3GPP-comma free CSD implementation, two clocks are not needed. Even
if two clocks were used, only a marginal gain would be achieved, and only in the
detection phase (phase 5) shown in Figure 11. This is because detection of the code group
and slot ID cannot start until at least three slots have been identified by phases 1 - 4.
There are a number of stages in the FHT design depending on the length of the Walsh
sequence. Each subsequent stage receives an input from the previous stage in half the
number of clock cycles required for the previous stage. This is achieved by reducing the
length of shift register by a factor of two for each subsequent stage of the FHT.
A counter is used as a clock to determine the time interval at which each successive pair
of input signals is received by the FHT. The upper shift registers in each of the stages are
always enabled whereas the lower shift registers are enabled by the bits of the counter.
The counter bit C0 is the LSB and C2 is the MSB. Counter bit C2 is low for four clock
cycles and then high for four clock cycles (000...011, 100...111). The bit
C0 toggles on every clock cycle (000, 001, ... etc.). The number of bits in
the counter depends on the number of stages, which in turn depends on the length of the
Walsh-Hadamard sequence to be used. If there are $N$ Walsh chips then the counter length
must be $\log_2 N$ bits. The length of the shift register in stage $s$ of the design is
given by the relation $(N/4)/2^s$. For example, the length of the shift registers used in the
first stage of the FHT is $(16/4)/2^0 = 4$. Similarly, the length of the registers used in
the other stages can be calculated.
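The register-length relation and the counter bit patterns can be checked with a short sketch. It assumes that stages are numbered from s = 0, as in the worked example above, and that shift registers are only needed while the computed length is at least one; the function names are introduced here for illustration.

```python
from typing import List


def shift_register_lengths(n_chips: int) -> List[int]:
    """Shift register length (N/4)/2^s for stages s = 0, 1, 2, ..."""
    if n_chips < 4 or n_chips & (n_chips - 1):
        raise ValueError("n_chips must be a power of two and at least 4")
    lengths = []
    length = n_chips // 4  # stage s = 0
    while length >= 1:     # assumption: stop once the length drops below one
        lengths.append(length)
        length //= 2       # each later stage halves the register length
    return lengths


def counter_bit(cycle: int, bit: int) -> int:
    """Value of counter bit C<bit> on a given clock cycle (cycle 0 is first)."""
    return (cycle >> bit) & 1


if __name__ == "__main__":
    print(shift_register_lengths(16))
    print([counter_bit(c, 2) for c in range(8)])  # C2 over eight cycles
    print([counter_bit(c, 0) for c in range(8)])  # C0 over eight cycles
```

For the 16-point FHT this gives register lengths of 4, 2 and 1, with C2 low for the first four cycles and high for the next four.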
In the first stage, the input signals corresponding to Walsh chips 0 to 7 arrive at the
upper adder whereas the Walsh chips from 8 to 15 are applied to the adder/subtractor
circuit in the lower half of stage 1. During the first four clock cycles, the data bits from
the adder unit are selected by multiplexer 1 in stage 1. The lower shift register of stage 1
is enabled to store the outputs from the adder/subtractor unit. Thus at the end of four
clock cycles, the upper shift register stores the result of addition of the first four pairs
whereas the lower shift register stores the result of subtraction. In the fifth clock cycle, C2
goes high, which disables the lower shift register in stage 1. The contents of the upper shift
register in stage 1 and the adder output from stage 1, which gives the sum of a new
pair of inputs, are then passed on to the adder and adder/subtractor unit in stage 2. Thus,
each subsequent stage receives its input from the previous stage. This process is then
repeated for each of the other stages in the FHT. At the end of eight clock cycles, all
16 correlation coefficients are generated and the largest coefficient is selected as the most
likely Walsh-Hadamard codeword to have been transmitted. The design is flexible and can
be easily modified to incorporate any chip sequence whose length is a power of two.
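The arithmetic performed by the butterfly stages can be sketched in software. The following is a plain sequential fast Hadamard transform, not a model of the pipelined shift-register hardware: it assumes natural (Sylvester) ordering of the Walsh-Hadamard rows, and the names `fht`, `walsh_row` and `most_likely_row` are introduced here for illustration. As in the description above, the first stage pairs chip i with chip i + N/2, and each later stage halves the distance between paired values.

```python
from typing import List, Sequence


def fht(samples: Sequence[int]) -> List[int]:
    """Fast Hadamard transform using add/subtract butterflies."""
    data = list(samples)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    half = n // 2
    while half >= 1:
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                upper, lower = data[i], data[i + half]
                data[i] = upper + lower          # adder output
                data[i + half] = upper - lower   # subtractor output
        half //= 2
    return data


def walsh_row(index: int, n: int) -> List[int]:
    """Row `index` of the n x n Sylvester Hadamard matrix (entries +1/-1)."""
    return [-1 if bin(index & k).count("1") % 2 else 1 for k in range(n)]


def most_likely_row(samples: Sequence[int]) -> int:
    """Index of the largest correlation coefficient."""
    coefficients = fht(samples)
    return max(range(len(coefficients)), key=lambda k: coefficients[k])


if __name__ == "__main__":
    received = walsh_row(5, 16)
    print(fht(received))
    print(most_likely_row(received))
```

For a noise-free input equal to one Hadamard row, a single coefficient equals N and all others are zero, so the selection of the largest coefficient recovers the transmitted row. The transform needs N log2 N additions and subtractions, compared with N squared for a direct correlation against every row.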
5.2 Reduced Length FHT Design
A careful look at the 256 x 256 matrix shows that each 256 chip sequence
can be identified by one of the 16 chip sequences shown in Table 6.
Table 6: Reduced Length Walsh Sequences (256 chip sequence to 16 chip sequence)
Row 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1
3 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1
4 1 -1 -1 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 1
5 1 1 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 -1 -1
6 1 -1 1 -1 -1 1 -1 1 1 -1 1 -1 -1 1 -1 1
7 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 -1 -1 1 1
8 1 -1 -1 1 -1 1 1 -1 1 -1 -1 1 -1 1 1 -1
9 1 1 1 1 1 1 1 1 -1 -1 -1 -1 -1 -1 -1 -1
10 1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1
11 1 1 -1 -1 1 1 -1 -1 -1 -1 1 1 -1 -1 1 1
12 1 -1 -1 1 1 -1 -1 1 -1 1 1 -1 -1 1 1 -1
13 1 1 1 1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 1 1
14 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 1 -1 1 -1
15 1 1 -1 -1 -1 -1 1 1 -1 -1 1 1 1 1 -1 -1
16 1 -1 -1 1 -1 1 1 -1 -1 1 1 -1 1 -1 -1 1
Thus in a CDMA receiver, only the first 16 chips of the entire Walsh sequence need to be
used. The buffer, which is used to store the input values, is also reduced in length
from 256 to 16 registers. These design ideas lead to considerable savings in hardware
resources, and the reduced length Walsh sequence helps in achieving faster decoding.
The two designs were synthesized and the hardware resources utilized were compared on
a Xilinx Virtex-E XCV1000E FPGA.
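The truncation idea can be illustrated with a small sketch. It rests on two assumptions that this section does not state explicitly: that the 256 x 256 matrix is built with the Sylvester construction, and that the sequences of interest are its first 16 rows. Under those assumptions, the first 16 chips of each such row reproduce the rows of Table 6, and the 16 truncated sequences remain distinct. The function name `hadamard` is introduced here for illustration.

```python
from typing import List


def hadamard(n: int) -> List[List[int]]:
    """n x n Hadamard matrix by the Sylvester construction (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    rows = [[1]]
    while len(rows) < n:
        # H_2m = [[H_m, H_m], [H_m, -H_m]]
        top = [row + row for row in rows]
        bottom = [row + [-chip for chip in row] for row in rows]
        rows = top + bottom
    return rows


if __name__ == "__main__":
    h16 = hadamard(16)
    h256 = hadamard(256)
    # Assumption: only the first 16 rows of the 256 x 256 matrix are considered.
    truncated = [row[:16] for row in h256[:16]]
    print(truncated == h16)
    print(len({tuple(row) for row in truncated}))
```

This only demonstrates the property for the rows named in the assumption; for other rows of the 256 x 256 matrix the first 16 chips are not guaranteed to be distinct.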
Chapter 6
Experimental Method and Results
6.0 Experimental Method and Results
This chapter explains the method used to measure the acquisition time for both of the
cell search designs, the Improved CSD and the 3GPP-comma free CSD. Section 6.1.1 provides
details of the FPGA used for prototyping the algorithms and for comparing the hardware
specifications of both designs. Section 6.2 presents the results of the acquisition time
measurements and the hardware comparison. Section 6.2 also compares the hardware
utilization of the FHT design using 256 and 16 chip sequences.
6.1 Experimental Method
The acquisition time was measured by counting the number of clock cycles used by the
RTL simulation. Since the input chip rate is given by the 3GPP specifications, the cycle
count can be converted into an acquisition time. For comparing the hardware specifications
and the maximum frequency of operation of both designs on the FPGA, the Xilinx Foundation
ISE software was used to generate the bit map file for programming the FPGA. The details
of the FPGA and the design process used for the hardware comparison are explained in Section
6.1.1.
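The conversion from a cycle count to an acquisition time can be sketched as below. The W-CDMA chip rate of 3.84 Mcps and the slot length of 2560 chips come from the 3GPP specifications; the assumption that the simulation clock advances once per chip, and the function name `cycles_to_msec`, are introduced here and are not stated in the text.

```python
CHIP_RATE_HZ = 3_840_000   # W-CDMA chip rate (3.84 Mcps)
CHIPS_PER_SLOT = 2560      # chips in one slot
SLOTS_PER_FRAME = 15       # slots in one 10 msec frame


def cycles_to_msec(cycles: int, clock_hz: float = CHIP_RATE_HZ) -> float:
    """Convert a clock cycle count to milliseconds.

    Assumption: one clock cycle per input chip unless clock_hz says otherwise.
    """
    return 1000.0 * cycles / clock_hz


if __name__ == "__main__":
    one_slot = cycles_to_msec(CHIPS_PER_SLOT)
    one_frame = cycles_to_msec(CHIPS_PER_SLOT * SLOTS_PER_FRAME)
    print(f"slot: {one_slot:.4f} msec, frame: {one_frame:.1f} msec")
```

Under this assumption one slot lasts about 0.667 msec and one frame lasts 10 msec, which gives a sense of scale for the acquisition times reported in Section 6.2.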
6.1.1 FPGA Design Process
The FPGA used for prototyping the designs is a Xilinx Virtex-E XCV1000E BG560
with a speed grade of 6. As the name suggests, FPGAs are capable of being reconfigured
to implement any desired digital circuit. This is made possible by having a large number
of small configurable logic blocks (CLBs) and a connection mechanism between these
blocks which is used to interconnect the CLBs according to the design. The basic building
block of the Virtex-E CLB is the logic cell (LC). Each Virtex-E CLB contains four LCs,
organized in two similar slices, as shown in the block diagram in Figure 13 [20]. An LC
includes a 4-input function generator, carry logic, and a storage element. Virtex-E function
generators are implemented as 4-input look-up tables (LUTs). Along with the LUTs, the CLB
also contains D flip-flops for storing data. The output from the function generator in each
LC drives both the CLB output and the D input of the flip-flop. The detailed view of a
Virtex-E slice is shown in Figure 14 [20].
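How a 4-input LUT acts as a function generator can be shown with a small software model. This is a generic sketch of a look-up table, not a model of the Virtex-E slice itself; the names `make_lut4` and `table_from_function` and the input-to-address bit ordering are assumptions made here. The 16-bit truth table plays the role of the configuration bits: any Boolean function of four inputs is realized simply by choosing those 16 bits.

```python
from typing import Callable

Lut4 = Callable[[int, int, int, int], int]


def make_lut4(truth_table: int) -> Lut4:
    """Return a 4-input function defined by a 16-bit truth table.

    Bit i of truth_table is the output for the input pattern whose
    address is i, with input a as address bit 0 and d as address bit 3.
    """
    if not 0 <= truth_table < (1 << 16):
        raise ValueError("truth table must fit in 16 bits")

    def lut(a: int, b: int, c: int, d: int) -> int:
        address = (d << 3) | (c << 2) | (b << 1) | a
        return (truth_table >> address) & 1

    return lut


def table_from_function(func: Lut4) -> int:
    """Compute the 16 configuration bits that implement func."""
    table = 0
    for address in range(16):
        a, b, c, d = ((address >> k) & 1 for k in range(4))
        if func(a, b, c, d):
            table |= 1 << address
    return table


if __name__ == "__main__":
    xor_bits = table_from_function(lambda a, b, c, d: a ^ b ^ c ^ d)
    xor4 = make_lut4(xor_bits)
    print(hex(xor_bits), xor4(1, 0, 1, 1))
```

Because every 4-input function costs exactly one LUT regardless of its logical complexity, the number of 4-input LUTs reported by the synthesis tool is a natural measure of combinational logic usage.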
Figure 13: 2-Slice Virtex-E CLB
Figure 14: Detailed View of Virtex-E Slice
The entire design was coded in Verilog at the Register Transfer Level (RTL). The RTL
design was then synthesized using the Synopsys FPGA Express synthesis tool available
with the Foundation ISE software. The bit map generated was then used to program the
FPGA using the JTAG cable.
6.2 Experimental Results
To compare the acquisition time between the Improved CSD and the 3GPP-comma free
CSD, experiments were carried out using input vectors generated in Matlab. The threshold
values determined for the two false alarm probabilities ($P_{FA} = 10^{-3}$ and
$P_{FA} = 10^{-4}$) were 28 and 37 respectively. The number of clock cycles between the
start of the system and the point when the counter in stage 3 exceeds the computed
threshold value was determined. The equivalent gate count and maximum frequency of
operation were compared for both designs using a 256 chip sequence in stage 2 and the same
design constraints in the FPGA Express synthesis tool on a Xilinx Virtex-E XCV1000E FPGA.
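The measurement itself amounts to finding the first clock cycle on which the stage 3 counter exceeds the threshold. A minimal sketch is given below; the counter trace is made up for illustration, the function name `cycles_to_acquire` is an assumption, and only the threshold values 28 and 37 are taken from the text.

```python
from typing import Iterable, Optional


def cycles_to_acquire(counter_values: Iterable[int], threshold: int) -> Optional[int]:
    """Number of clock cycles until the counter first exceeds the threshold.

    counter_values holds the stage 3 counter value on each clock cycle,
    starting with the first cycle after system start. Returns None if the
    threshold is never exceeded.
    """
    for cycle, value in enumerate(counter_values, start=1):
        if value > threshold:
            return cycle
    return None


if __name__ == "__main__":
    trace = [0, 10, 20, 28, 29, 40]  # hypothetical counter values per cycle
    print(cycles_to_acquire(trace, 28), cycles_to_acquire(trace, 37))
```

A higher threshold, corresponding to a lower false alarm probability, can only delay or leave unchanged the cycle on which acquisition is declared for a given counter trace.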
From the experiments conducted, it was observed that the Improved CSD uses fewer
slots to achieve synchronization than the 3GPP-comma free CSD in
stage 2. The results obtained indicate that when averaging is carried out over 15 slots in
stage 1 of both designs ($P_{FA1} = 10^{-3}$ and $V_{TH1} = 28$), the Improved CSD has an
acquisition time of 13.66 msec as compared to 14.53 msec for the 3GPP-comma free CSD. Thus,
the Improved CSD achieves an improvement of 0.87 msec for an AWGN channel (Figure
15). Similarly, an improvement of 0.87 msec was observed when $P_{FA2} = 10^{-4}$ and
$V_{TH2} = 37$. Figures 15 and 16 show the acquisition time measures for 2, 4, 8 and 15
slots in stage 1 of the design. The number of slots in the other stages, as discussed in
previous chapters, was kept fixed: one slot in stage 2 of the Improved CSD, three slots in
stage 2 of the 3GPP-comma free CSD, and 15 slots in stage 3 of both designs.
Figure 15: Comparison of Improved CSD and 3GPP-comma free CSD, $P_{FA} = 10^{-3}$
Figure 16: Comparison of Improved CSD and 3GPP-comma free CSD, $P_{FA} = 10^{-4}$
[Both plots are titled "Acquisition Time Measures: Quantization 4 Input Data Bits"; x-axis: Number of Slots in Stage 1; y-axis: Acquisition Time (in msec); curves: Improved CSD and 3GPP-comma free CSD]
As seen from Table 7, the Improved CSD had a lower equivalent gate count (136,297 versus
144,180) and a higher maximum frequency of operation (22.066 MHz versus 12.887 MHz) on a
Xilinx Virtex-E XCV1000E FPGA than the 3GPP-comma free CSD when the same constraints
were used in the synthesis of both designs.
In the FHT design, the input Walsh sequence length can be reduced from 256 chips to
16 chips to reduce the hardware utilization. The buffer, which is used to store the input
values, is reduced in length from 256 to 16 registers, which leads to considerable savings
in hardware resources. The reduced length Walsh sequence also helps in achieving
faster decoding. The FHT designs using 16 and 256 chip sequences were synthesized and
the hardware resources utilized were compared using a Xilinx Virtex-E XCV1000E
FPGA. The hardware utilization of the two FHT designs is compared in Table 8.
Table 7: Hardware Specifications of System: Quantization 4 Input Data Bits
FPGA XCV1000E BG560     Number of Slice   Number of 4   Equivalent   Max. Frequency of Operation
Speed Grade 6           Registers         Input LUTs    Gate Count   (Post Route Timing)
Improved CSD            9086              7354          136297       22.066 MHz
3GPP-comma free CSD     10141             7777          144180       12.887 MHz
Table 8: Hardware Specifications of FHT: 16 and 256 chip sequence
FPGA XCV1000E BG560     Number of Slice   Number of 4   Equivalent   Max. Frequency of Operation
Speed Grade 6           Registers         Input LUTs    Gate Count   (Post Route Timing)
FHT 16 chips            71                173           1591         35.769 MHz
FHT 256 chips           1070              1370          17191        16.025 MHz
The results of the reduced length sequence indicate that the FHT design using a 16 chip
sequence achieves a 90% reduction in hardware resources (equivalent gate count) as
compared to the design which uses a 256 chip sequence. Also, the maximum frequency of
operation of the 16 chip FHT (35.679 MHz) is more than double that of the 256 chip FHT
(16.025 MHz).
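The percentages quoted above can be reproduced from the tabulated values. The short sketch below copies the equivalent gate counts and maximum frequencies from Tables 7 and 8; the helper names `percent_reduction` and `speedup` are introduced here for illustration.

```python
# (equivalent gate count, maximum frequency in MHz), copied from Tables 7 and 8
SYSTEM = {
    "Improved CSD": (136297, 22.066),
    "3GPP-comma free CSD": (144180, 12.887),
}
FHT = {
    "FHT 16 chips": (1591, 35.769),
    "FHT 256 chips": (17191, 16.025),
}


def percent_reduction(smaller: float, larger: float) -> float:
    """Reduction of `smaller` relative to `larger`, in percent."""
    return 100.0 * (larger - smaller) / larger


def speedup(faster_mhz: float, slower_mhz: float) -> float:
    """Ratio of two maximum operating frequencies."""
    return faster_mhz / slower_mhz


if __name__ == "__main__":
    gates_16, mhz_16 = FHT["FHT 16 chips"]
    gates_256, mhz_256 = FHT["FHT 256 chips"]
    print(f"FHT gate reduction: {percent_reduction(gates_16, gates_256):.1f}%")
    print(f"FHT frequency ratio: {speedup(mhz_16, mhz_256):.2f}")

    gates_imp, mhz_imp = SYSTEM["Improved CSD"]
    gates_3gpp, mhz_3gpp = SYSTEM["3GPP-comma free CSD"]
    print(f"System gate reduction: {percent_reduction(gates_imp, gates_3gpp):.1f}%")
    print(f"System frequency ratio: {speedup(mhz_imp, mhz_3gpp):.2f}")
```

With the tabulated values this gives a gate count reduction of roughly 90.7% and a frequency ratio of about 2.23 for the FHT, and roughly 5.5% and 1.71 for the complete cell search system.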
Chapter 7
Summary, Conclusions and Future Work
7.0 Summary, Conclusions and Future Work
In this chapter, the conclusions drawn from the experimental results are summarized
and the scope for future work is outlined.
7.1 Summary
In Chapter 2, we discussed some of the previous work done by other research groups
and also the 3GPP working group suggestions. Chapter 3 introduced the cell search
algorithm, which is divided into three stages to simplify the synchronization between the MS
and the BS. Chapter 4 discussed the Improved CSD, which is the proposed design scheme
to perform initial cell search. The hierarchical matched filter design proposed by Siemens
and Texas Instruments was used in stage 1 of both the cell search designs [6]. In stage 2 of
the initial cell search algorithm, two possible design schemes were compared: the
Improved CSD, which uses cyclic codes, and the 3GPP-comma free CSD, which uses comma
free codes. The details of the Improved CSD are described in Chapter 4. In stage 3 of
both the cell search designs, masking functions are proposed to reduce the hardware
utilization as compared to the previous design described by Li et al. [4]. Chapter 5 described
the 3GPP-comma free CSD using an FHT design in stage 2 of the cell search algorithm.
Further design improvements are suggested in the FHT design by reducing the length of
the input Walsh sequence from 256 chips to 16 chips. Chapter 6 discussed the
experimental method and presented the results in terms of acquisition time and hardware
utilization for both the Improved CSD and the 3GPP-comma free CSD. The hardware
utilization of the FHT design using 256 chip sequences and the reduced length (16 chip)
sequences is also presented.
7.2 Conclusions
For an AWGN channel model in a high signal-to-noise ratio environment, it was found
that accumulation over one slot in the Improved CSD scheme and accumulation over three
slots in the 3GPP-comma free CSD scheme in stage 2 of the cell search algorithm give
correct code group and slot boundary identification. Due to the reduction in the required
number of slots, the Improved CSD uses fewer clock cycles in stage 2 than the
3GPP-comma free CSD to detect the code group and slot ID. This reduction
in the number of clock cycles leads to faster acquisition, fewer dropped calls and
lower power consumption during the synchronization between the MS and the BS. The
Improved CSD, which uses cyclic codes, also has lower hardware utilization and a higher
maximum frequency of operation than the 3GPP-comma free CSD. In conclusion,
the Improved CSD is a better cell search design than the 3GPP-comma free
CSD since it has a faster acquisition time and lower hardware utilization.
7.3 Future Work
This thesis investigates code and time synchronization of the cell search algorithm. In
addition to code and time synchronization, frequency synchronization between the MS
and the BS needs to be achieved. The receiver design presented in this thesis would need
to include another module to achieve frequency synchronization. Also, the cell search
considered in this thesis is initial cell search. Another type of cell search, called target
cell search, needs to be performed during a call when an MS is in motion and moves
from one cell to another. VLSI implementations to perform target cell search efficiently
need to be investigated.
Kiessling et al. [21] suggest performance enhancements to the W-CDMA initial cell search
algorithm. The authors consider the advantages of oversampling and passing multiple
candidates in the cell search stages instead of one candidate to reduce the cell search time.
Passing multiple candidates in each of the stages will reduce the cell search time but
increase the design complexity a