
Ericsson Confidential

GSM NETWORKS KPI ROOT CAUSE ANALYSIS

1 (50)

Prepared (also subject responsible if other) No.

ESA/SK Subhash Panikar ESA/SK 06:0027

Approved Checked Date Rev Reference

Amos Phahla 2006-10-17 PA1 REP00271_A

Root Cause Analysis for Key Performance Indicators (KPI) for GSM Networks [1]

[1] This document can be used as a guideline for GSM network performance measurements and root cause analysis of network performance issues, and as a reference document (along with ALEX) for activities spanning from a Basic Audit of Radio Network (BARN) to a Radio Network Optimisation (RNO) contract.



1. Structure of the document

Contents of this document are divided into 10 core parts, 9 of which are the KPIs recommended by Ericsson to measure performance of a GSM network.

Each of these parts is further divided into three sub-parts:

• Define: this section defines the KPI component under discussion.

• Measure: this section describes the stats and STS counters available for measuring the performance of each of these KPIs.

• Analyse: this section describes the “Root Cause Analysis” process for each of these KPIs.


2. Random Access (RACH) Success

2.1 Define

There are in all 12 types of logical channels in GSM: two are used for traffic, nine for “control signalling” [2] and one for message distribution (CBCH).

These 9 control channels are subdivided into 3 groups:

• Broadcast Channel (BCH, Downlink Only): FCCH, SCH and BCCH.

• Common Control Channel (CCCH): PCH, RACH and AGCH.

• Dedicated Control Channel (DCCH): SDCCH, SACCH and FACCH.

Random Access Channel (RACH) is used by the MS on the “uplink” to request allocation of an SDCCH. The request is sent either as a page response (the MS being paged by the BSS for an incoming call) or because the user is trying to access the network to establish a call. Availability of SDCCH at the RBS has no impact on the Random Access Success.

In the transceiver, the timeslot handler in charge of the RACH channel listens for access bursts from mobiles (on the timeslot that transmits BCCH). These bursts contain a check sequence (8 bits) that is used to determine if the message is valid.

The number of times an MS tries to access the network (repeated access when the BSS does not respond with an immediate assignment or immediate assignment reject on AGCH) is decided by the BSS parameter MAXRET (maximum number of retransmissions), and the randomness in the time interval between successive access requests is defined by the parameter TX.

[2] Signalling between RBS and BSC (LAPD) is handled by the TRHs (Transceiver Handlers, RPG3A) in the BSC. Between BSC and MSC, signalling is handled by the C7 (A interface) signalling terminals (also RPG3A). This is Common Channel Signalling, and the BSSAP protocol is used there.

2.2 Measure

A failed Random Access burst does not necessarily lead to a call setup failure, as the MS sends several RA (Random Access) bursts each time it tries to connect to the network.

2.2.1 STS Counters and stats for monitoring Random Access Failure

Random Access Success Rate = (CNROCNT)/ (CNROCNT+RAACCFA) * 100

STS Object Type: RANDOMACC and RNDACCEXT

RAACCFA: Failed Random Access

CNROCNT: All accepted Random Access

RATRHFAEMCAL: this counter is stepped for every rejected CHANNEL REQUIRED in TRH [3] with establishment cause “Emergency Call”.

RATRHFAREG: this counter is stepped for every rejected CHANNEL REQUIRED in TRH with establishment cause “Location Update”.

RATRHFAOTHER: this counter is stepped for every rejected CHANNEL REQUIRED in TRH with all other establishment causes.
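The Random Access Success Rate formula above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the counter values are hypothetical, and only the formula CNROCNT / (CNROCNT + RAACCFA) * 100 is taken from the text.

```python
def ra_success_rate(cnrocnt: int, raaccfa: int) -> float:
    """Random Access Success Rate in percent.

    cnrocnt: accepted random accesses (STS counter CNROCNT)
    raaccfa: failed random accesses (STS counter RAACCFA)
    """
    attempts = cnrocnt + raaccfa
    if attempts == 0:
        # No random access attempts in the period: the rate is undefined.
        return float("nan")
    return 100.0 * cnrocnt / attempts


# Hypothetical counter values for one measurement period.
print(ra_success_rate(cnrocnt=9500, raaccfa=500))
```

Returning NaN for an empty period is a design choice of this sketch; it keeps idle cells from showing up as either 0% or 100%.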

2.2.2 STS Counters for monitoring the RA request type

• RAANPAG: This counter is incremented when the MS responds to an incoming page (Mobile Terminated Call).

• RAEMCAL: This counter is incremented when the MS sends an emergency call request.

• RACALRE: Call Re-establishment, or if TCH/F needed, or Originating Call.

• RAOSREQ: Other Service Request.

• All other cases: RAOTHER.

2.3 Analyze

2.3.1 Fish Bone diagram for the root cause analysis of poor Random Access Success

[3] TRH: Transceiver Handler, the component in the BSC which handles the radios (TRXs). A TRH based on RPG2 (RPG is the Regional Processor that controls the TRH) can handle up to 24 TRXs, and a TRH based on RPG3 can handle up to 32 TRXs. The TRH, which is controlled by the RPG, is directly responsible for supporting LAPD “signalling” (the ABIS interface), whereas the TRAU is the component within the BSC which supports “speech” on the ABIS.


Figure 1: Fish Bone diagram for root cause analysis of poor Random Access Success Rate. [Diagram; causes shown: 1. Poor BSIC Plan; 2. BCCH Plan; 3. Poor Coverage/Spill; 4. Phantom RACH; 5. Faulty Antenna/Cable; 6. CRO and ACCMIN; MAXRET and TX.]

2.3.2 Description of the root causes for poor RA Success Rate

• Poor BSIC Plan: A high number of RA failures is often caused by a bad BSIC plan. If a BSC shows poor RA Success and none of the other root causes fits the “profile” of the BSC, check for excess reuse of the same BCCH/BSIC combination (co-BCCH/BSIC) within the BSC and in cells of the immediately surrounding BSCs. Minimize co-BCCH/BSIC reuse within the BSC and with close-in cells of the surrounding BSCs.

• Poor BCCH plan: When an MS accesses the network, either in response to a page from the network or for a mobile originating call, it does so on the timeslot of the carrier that carries the BCCH. Excess interference (co-channel and adjacent channel) on the BCCH carrier can cause a poor RA Success Rate.

• Poor Coverage / Spillage: Poor signal strength coverage within a cell can cause a poor RA Success Rate, especially for accesses from cell boundaries where the received signal strength on the UL fades rapidly. If cells within the BSC show symptoms of poor coverage (check the stats for a high percentage contribution of TCH Drop due to weak DL/UL coverage, and confirm the findings using MRR [4] recordings and trace result graphs for coverage and Timing Advance for MS and BS), work to improve coverage in hot spot areas (antenna tilt, azimuth and height, transmit power setting revisions for both MS and BS, power control setting revisions for both MS and BS, use of TMA etc.).

• Phantom RACH: In the transceiver, the timeslot handler in charge of the RACH channel constantly listens for access bursts from MSs. These access bursts contain a check sequence (8 bits) that is used to determine if the message is valid. When the traffic in a cell is low, few access bursts reach the RBS. In such cases most of the received signal is noise, and if the receiver at the RBS is sensitive, some of this noise will be interpreted as an access burst (when a bit pattern in the noise happens to match the 8 bit check sequence). This is called Phantom RACH and is unavoidable (unless you diminish the receiver sensitivity, introduce a more powerful filtering function in the signal processor of the RBS and/or optimize CRO [5] and ACCMIN for the cell). When the traffic is high this problem is less disturbing, as most of the noise is covered by genuine access bursts. With no traffic in the cell, about 0.02% of the incoming noise can be interpreted as an access burst. Since there are 217 bursts per second on an air timeslot, 217 * 0.0002 = 0.0434 access bursts per second could be phantom random accesses for a cell that carries very little traffic; that is one phantom access about every 23 seconds, each of which is counted as a call setup failure. The more traffic there is on the cell, the less noise is received by the RACH channel handler (because the signal strength of the traffic is higher than that of the noise), and the fewer phantom RACHs are reported by the BTS. In conclusion, for BSCs with cells carrying very low traffic it is possible to receive up to 120 phantom RACHs per hour per cell. This is a normal phenomenon and is in accordance with GSM specification 05.05 (it should therefore not be taken into account when measuring the call setup performance of the Ericsson System).

[4] MRR (Measurement Report Recording) is usually initiated for a period of time at either cell level or BSC level. The data collected and graphically presented are RF measurements such as signal strength UL/DL, Timing Advance, RxQual UL/DL etc. To measure GSM Layer 3 messages (cell level) use CTR (Cell Traffic Recording).

[5] CRO (Cell Reselection Offset): a lower value of CRO will favour the neighbouring cells on idle mode reselection by a phase 2 MS (C2 reselection), whereas a higher setting of ACCMIN (say -104 instead of the default -110) will make the surrounding cells more favourable on idle mode reselection.

• ACCMIN [6] and CRO [7]: These are idle mode cell reselection parameters that directly control the way the MS does cell reselection in idle mode. The default value for ACCMIN is -110; this can cause excess call setup failures, especially where there is a high traffic distribution across weak cell boundaries. A higher setting for ACCMIN (-104) ensures that the MS camps on a cell only if the received signal strength on the downlink is good enough (this works in favour of a weaker uplink). The default value for CRO is 0, and each increment in CRO represents 2 dB. If a cell is suspected of poor RA Success Rate due to excess traffic from a cell boundary with poor signal strength, setting CRO higher in its neighbouring cells than in the cell itself makes the neighbours more attractive for idle mode reselection across the cell boundary. This approach can improve the RA Success Rate for cells with excess traffic on a weak cell boundary (with good neighbours).

• Faulty Antenna / Cable: If none of the other root causes fits the profile, check for a faulty antenna, loose or damaged jumper/feeder cables, VSWR alarms etc.

• MAXRET and TX: for cells with ABIS over VSAT, it is recommended that MAXRET be set to 4 and TX be set to 32. This limits the number of retransmissions from the MS. When the ABIS is over VSAT, the response from the BSS to a RACH received from the MS can be sluggish; that is the reason for keeping the number of RACH retransmissions low and the repeat interval a bit longer.

• Impact on RA Success Rate due to Location Updating failures: Use the formula [(RATRHFAREG)/(RATRHFAREG+RATRHFAEMCAL+RATRHFAOTHER)]*100; it gives the percentage contribution of Random Access failures due to the Location Area Updating process to the net Random Access Failure Rate. This stat can also be used when analysing the cause of excess Location Area Updating, and it gives an approximate value for the load on SDCCH for the cell due to Location Updating Requests.
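Two of the calculations in the list above lend themselves to a short worked example: the phantom RACH arithmetic and the Location Updating share of Random Access failures. The sketch below assumes the figures quoted in the text (217 bursts per second, 0.02% of noise bursts decoded as access bursts); 217 * 0.0002 gives about 0.0434 phantom accesses per second, which agrees with the quoted figure of one every 23 seconds. The function names and the counter values are hypothetical.

```python
BURSTS_PER_SECOND = 217        # bursts per second on one air timeslot (from the text)
PHANTOM_PROBABILITY = 0.0002   # 0.02% of noise bursts decoded as access bursts (from the text)


def phantom_rach_per_second(bursts_per_second: float = BURSTS_PER_SECOND,
                            probability: float = PHANTOM_PROBABILITY) -> float:
    """Expected phantom random accesses per second in a cell with no traffic."""
    return bursts_per_second * probability


def lu_share_of_ra_failures(ratrhfareg: int, ratrhfaemcal: int, ratrhfaother: int) -> float:
    """Percentage of TRH-rejected random accesses caused by Location Updating."""
    total = ratrhfareg + ratrhfaemcal + ratrhfaother
    if total == 0:
        return float("nan")  # no rejected CHANNEL REQUIRED in the period
    return 100.0 * ratrhfareg / total


rate = phantom_rach_per_second()
print(f"phantom accesses per second: {rate:.4f}")
print(f"seconds between phantom accesses: {1 / rate:.1f}")

# Hypothetical counter values for one cell.
print(lu_share_of_ra_failures(ratrhfareg=30, ratrhfaemcal=5, ratrhfaother=65))
```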

3. Paging (PCH) Success

3.1 Define

3.1.1 Paging Process

• Paging Success is by far the most complex KPI to deal with, as the process of paging touches almost all the nodes in the GSM system and is influenced by the performance of each of them. That is why this write-up on paging is so interwoven and cross-refers to so many things. The good news is that by the time the paging success rate in a network has been improved, almost all the other KPIs have improved too.

• In response to an incoming call, the MSC initiates the paging process by broadcasting a “paging request” message on the paging sub channel (IMSI or TMSI of the MS and its Paging Group) and starts timer T3113 [8].

[6] ACCMIN: minimum received signal strength at the MS (DL) to allow access to the cell. ACCMIN controls the C1 reselection (phase 1 and phase 2 phones) as per the equation C1 = (Received_RxLevDL - ACCMIN) - max(CCHPWR - P, 0).

[7] C2 = C1 + CRO - TO: apart from ACCMIN, CRO controls the idle mode reselection of a cell by a phase 2 MS.

[8] GSM Timer T3113 at the Ericsson MSC is PAGING_timer (guard time for paging initiated from the MSC).


• A “paging message” consists of the mobile identity (IMSI or TMSI) of the MS being paged and its “paging group number”.

• A Paging Request Message may include more than one MS identification. The maximum number of paged MS per message is 4 when using “TMSI” for identification of the MS (maximum number of paged MS per message is 2 when using IMSI).

• The BSC receives and processes the paging request and schedules it for transmission on the PCH at the appropriate time.

• The MS on its part will analyse the paging messages (and immediate assignment messages) sent on the paging sub channel corresponding to its paging group.

• Upon receipt of a “paging request” message, MS will initiate within 0.7s an immediate assignment procedure.

• Upon receipt of a page, the MS responds by transmitting a channel request on the RACH.

• The BSS, in response to the received “channel request”, processes it and immediately assigns the MS an SDCCH (immediate assignment / assignment reject; done over AGCH).

• MS Paging response: after receiving the immediate assignment command, the MS switches to the assigned “SDCCH” and transmits a “Paging Response”.

• The establishment of the main signalling link is then initiated (E1) with the information field containing the “PAGING RESPONSE” message, and the “paging response” is sent to the MSC.

• Upon receipt of the “Paging Response”, the MSC stops the timer T3113.

• If the timer T3113 expires and a “Paging Response” message has not been received, the MSC may repeat the “Paging Request” message and start T3113 all over again. The number of successive paging attempts is a network dependent choice.
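As a small illustration of the per-message limits mentioned in the list above (at most 4 mobiles per paging message with TMSI, at most 2 with IMSI), the following sketch counts how many paging messages are needed for a batch of mobiles. It assumes, for simplicity, that one message carries identities of a single type only; the function name and the batch size are hypothetical.

```python
import math

# Maximum number of paged mobiles per paging message, as stated in the text.
MAX_IDS_PER_MESSAGE = {"TMSI": 4, "IMSI": 2}


def paging_messages_needed(n_mobiles: int, identity_type: str) -> int:
    """Number of paging messages needed to page n_mobiles mobiles.

    Assumption of this sketch: every message is filled with identities
    of one type (no mixing of TMSI and IMSI in the same message).
    """
    per_message = MAX_IDS_PER_MESSAGE[identity_type]
    return math.ceil(n_mobiles / per_message)


for identity_type in ("TMSI", "IMSI"):
    print(identity_type, paging_messages_needed(10, identity_type))
```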

3.1.2 Paging Capacity and Paging Group at the RBS

• One control channel Multi Frame is made of 51 TDMA frames [9] with a time duration of 235 ms.

• Each 51 TDMA frame Multi Frame has 9 Common Control Channel (CCCH) blocks.

• Each of these 9 CCCH blocks is made of 4 [10] TDMA timeslots.

• Each CCCH block can carry Paging Messages for 2 MSs if IMSI based paging is used or 4 MSs if TMSI [11] based paging is used.

• Thus the paging capacity for one 51 TDMA frame Multi Frame [12] is 9 (number of CCCH blocks available per Multi Frame) * 4 (when TMSI based paging is used) = 36 mobiles per 235 ms, or 9 * 2 = 18 mobiles per 235 ms when IMSI based paging is used.

• Thus the paging capacity of a cell is about 153 mobiles per second when TMSI based paging is used and about 77 mobiles per second when IMSI based paging is used.

• This means we can improve the “paging bandwidth” for a cell (if there are too many “paging discards at the cell level”) by using TMSI based paging rather than IMSI based (at the expense of increased processor load at the BSC and MSC).

• When the rate of “paging load” at the RBS becomes higher than what the RBS is able to handle (paging capacity of RBS), RBS will start discarding pages (check for high “page discard” stats at the cell level).

[9] One TDMA frame = 8 radio timeslots, with a duration of 4.615 ms.

[10] Signalling requirement in GSM: one complete “Signalling Message” in the GSM system requires 464 bits (or 4*116 bits or 58 bytes). One TDMA burst on one timeslot can code 58+58=116 bits; hence it requires 4 TDMA frames to complete one message (e.g. one MR or Measurement Report, or one Paging Message).

[11] At the expense of increased processing load at the MSC and BSC.

[12] With the cell parameter AGBLK = 0 (here all the 9 CCCH blocks will be available for PCH, with preference for AGCH).


• If the “page discard” at the cell is very high, increase the paging bandwidth at the cell level by setting AGBLK=0 (all the 9 CCCH blocks in the 51 frame multi frame available for PCH, with preference for AGCH [13]) and use TMSI based paging instead of IMSI based paging.

• If the “PAGE MODE ELEMENT” is set to “extended paging” , the MS will not only listen to its own paging group but also to its “next but one page group” (example , say an MS is assigned 23 as its paging group , with extended paging MS will also listen to the page group 25). Configuration parameters for extended paging are hard coded in the BSC and it is not usually modified during optimization.

• The cell parameters MFRMS and AGBLK together define the “number” of paging groups available at the cell level.

• Number of Paging Groups at cell = MFRMS*(9-AGBLK) for a “non-combined BCCH” [14].

• MFRMS decides the repeat interval of paging messages for MSs that belong to the same paging group across multiple 51 frame multi frames (for example, if an MS belongs to paging group 23 and MFRMS is set to 4, pages for all the MSs in paging group 23 are broadcast every 4th 51 frame multi frame).

• Once an MS has worked out its paging group, in idle mode it tunes in and checks for an incoming page only during the broadcast time of its paging group. So the further apart the paging groups are placed across multiple 51 frame multi frames (say MFRMS=9), the less frequently the MS tunes in to check for an incoming page and the longer its battery life. The drawback for a cell with high paging load is higher paging discards. For cells with high paging load it is recommended to keep MFRMS between 4 and 6.

• If MFRMS<= 3, then the IMSI/TMSI is discarded if it has not been retransmitted within two scheduling of its paging group. If MFRMS>3, then the IMSI/TMSI is discarded if it has not been retransmitted within one scheduling of its paging group. That’s the reason why for a site with ABIS over VSAT it is recommended that MFRMS be set to 2 (to compensate for the delay on satellite system).

• One “Paging Queue” per “Paging Group” is available at the cell level (that means when we decrease the number of paging groups for the cell by reducing MFRMS to lower numbers, we actually reduce the number of available paging queue).

• Paging Queue Length = 14 - (Number of Paging Groups/10). The higher the number of paging groups in a cell (that is, the higher the MFRMS setting), the shorter each Paging Queue; conversely, the longer queues at lower MFRMS compensate for the lower number of Paging Queues available.

• MFRMS does not affect the processor load of BSC and MSC too much (though very low settings of MFRMS are seen to increase the BSC and MSC load slightly).

• Also call setup time is seen to be shorter for the cell when MFRMS is set low as compared to high value of MFRMS.

• MFRMS also influences the downlink signalling failure rate in idle mode, which leads to cell reselection. High values of MFRMS can lead to a high rate of (unnecessary) cell reselection. (When an MS camps on a cell, the counter DSC is initialised to 90/MFRMS; each time the MS successfully decodes a page, DSC is incremented by 1 up to a maximum of 90/MFRMS, and each time the MS fails to decode a page, DSC is decremented by 4. If DSC reaches 0, a downlink signalling failure is flagged and the MS does cell reselection.)
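The cell level paging arithmetic in this section can be sketched as follows. The constants (9 CCCH blocks per 235 ms multiframe, 4 TMSI or 2 IMSI identities per block) and the formulas for the number of paging groups, the paging queue length and the DSC counter are taken from the text; by the same arithmetic, 9 * 2 = 18 IMSI identities fit in one multiframe, or roughly 77 per second. The function names are hypothetical, the rounding of 90/MFRMS to the nearest integer is an assumption, and the text does not say how the queue length is rounded, so it is left unrounded here.

```python
MULTIFRAME_SECONDS = 0.235  # duration of one 51 frame control multiframe
CCCH_BLOCKS = 9             # CCCH blocks per multiframe, non-combined BCCH


def paging_capacity_per_second(ids_per_block: int, agblk: int = 0) -> float:
    """Mobiles that can be paged per second in one cell.

    ids_per_block: 4 for TMSI based paging, 2 for IMSI based paging.
    agblk: CCCH blocks reserved for AGCH (cell parameter AGBLK).
    """
    return (CCCH_BLOCKS - agblk) * ids_per_block / MULTIFRAME_SECONDS


def paging_groups(mfrms: int, agblk: int = 0) -> int:
    """Number of paging groups: MFRMS * (9 - AGBLK)."""
    return mfrms * (CCCH_BLOCKS - agblk)


def paging_queue_length(mfrms: int, agblk: int = 0) -> float:
    """Paging queue length: 14 - (number of paging groups / 10), unrounded."""
    return 14 - paging_groups(mfrms, agblk) / 10


def dsc_after(mfrms: int, decoded) -> int:
    """Follow the DSC counter over a sequence of paging block decodes.

    decoded: iterable of booleans, True when the MS decoded its paging block.
    Returns 0 as soon as a downlink signalling failure is flagged,
    otherwise the DSC value at the end of the sequence.
    """
    dsc_max = round(90 / mfrms)  # assumption: nearest integer
    dsc = dsc_max
    for ok in decoded:
        dsc = min(dsc + 1, dsc_max) if ok else dsc - 4
        if dsc <= 0:
            return 0  # downlink signalling failure, MS reselects
    return dsc


print(f"TMSI: {paging_capacity_per_second(4):.0f} mobiles/s")
print(f"IMSI: {paging_capacity_per_second(2):.0f} mobiles/s")
print("paging groups at MFRMS=4:", paging_groups(4))
print("queue length at MFRMS=4:", paging_queue_length(4))
print("DSC after 3 failed decodes at MFRMS=9:", dsc_after(9, [False] * 3))
```

With MFRMS=9 the counter starts at 10, so three consecutive failed decodes are enough to trigger a reselection, whereas MFRMS=2 starts at 45 and tolerates many more; this illustrates why high MFRMS values make unnecessary reselections more likely.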

3.1.3 Paging Capacity at the BSC

• The key bottleneck in paging performance is the Location Area dimensioning (as the first page usually gets done to Location Area) and BSC capacity.

[13] AGCH has to have higher priority than PCH, as AGCH deals with immediate assignment for either a successfully paged MS or an MS that has made a successful Random Access.

[14] In a “combined” BCCH cell, 4 SDCCH sub channels share the same timeslot with the BCCH; this drastically reduces the SDCCH capacity and proportionately reduces the cell’s paging response capacity. For cells with high paging load always use the non-combined mode of BCCH/SDCCH, in other words use SDCCH/8 mode (at least one dedicated timeslot for SDCCH, where one dedicated SDCCH timeslot gives 8 SDCCH sub channels) rather than SDCCH/4 mode as in the combined BCCH case.


• The number of cells in a Location Area ranges from as low as 10 to more than 100. Once a page reaches the BSC from the MSC, the BSC sends it to “all the cells within the BSC” [15]. Hence an incoming page to a BSC gives rise to a considerably larger number of outgoing paging commands from the BSC (point to multipoint). This is the reason why the BSC is more likely than the MSC to be the unit “limiting” the paging rate.

• All the RBSs within this BSC will now broadcast this page at least once; this means the RBS sets the limit on overall paging capacity (about 153 mobiles per second if TMSI based paging is used or about 77 mobiles per second if IMSI based paging is used).

• The paging bottleneck at the BSC is usually the number of RP signals that can be sent from the CP to the RPDs in the TRHs [16].

• Calculations for number of pages per second:

$$ \mathrm{NO\_PAGE} = \frac{\mathrm{RPSIG}}{\mathrm{NO\_TRH} \cdot \mathrm{RPp}} \qquad (1) $$

$$ \mathrm{RPp} = 1 - \left(1 - \frac{\mathrm{NO\_CELLS}}{\mathrm{NO\_TRX} \cdot \mathrm{NO\_LA}}\right)^{\mathrm{TRXpTRH}} \qquad (2) $$

Where:

RPSIG [17]: Maximum number of RP signals per second
TRXpTRH [18]: Average number of TRXs per TRH
NO_LA: Number of Location Areas
NO_TRH: Number of TRHs belonging to the BSC
NO_TRX: Number of TRXs belonging to the BSC
NO_CELLS: Number of cells
RPp: Probability that an RP signal is sent to a TRH
NO_PAGE: Number of pages per second

• With “more” TRHs the paging capacity for the BSC decreases.

• With “more” LACs within a BSC the paging capacity for the BSC increases.

• In the BSC there also exists a “paging queue” with 32 slots (i.e. 32 TMSIs or IMSIs can be queued at the BSC). This means the supervision timer for the “page response” to the first page, PAGTIMEFRST1LA, should not be set too short (not shorter than 6 seconds).
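The BSC paging capacity calculation can be sketched as below. The sketch assumes that equations (1) and (2) read NO_PAGE = RPSIG / (NO_TRH * RPp) and RPp = 1 - (1 - NO_CELLS / (NO_TRX * NO_LA)) ^ TRXpTRH; this reading is a reconstruction and should be checked against the original Ericsson documentation. All input values are hypothetical, and RPSIG in particular is release dependent.

```python
def rp_probability(no_cells: int, no_trx: int, no_la: int, trx_per_trh: float) -> float:
    """RPp: probability that an RP signal is sent to a TRH (equation 2 as assumed).

    NO_CELLS / (NO_TRX * NO_LA) is the chance that a given TRX is a paging
    target for one page; a TRH is signalled if any of its TRXs is a target.
    """
    p_trx = no_cells / (no_trx * no_la)
    return 1 - (1 - p_trx) ** trx_per_trh


def pages_per_second(rpsig: float, no_trh: int, rpp: float) -> float:
    """NO_PAGE: pages per second the BSC can handle (equation 1 as assumed)."""
    return rpsig / (no_trh * rpp)


# Hypothetical BSC: 100 cells, 400 TRXs, 16 TRHs, 25 TRXs per TRH on average.
RPSIG = 1000  # hypothetical value; the real limit depends on the BSS release
for no_la in (1, 2, 4):
    rpp = rp_probability(no_cells=100, no_trx=400, no_la=no_la, trx_per_trh=25)
    print(no_la, "LA:", round(pages_per_second(RPSIG, no_trh=16, rpp=rpp), 1), "pages/s")
```

Under these assumptions the loop shows the trend stated in the list above: splitting the BSC into more Location Areas lowers RPp and raises the number of pages per second the BSC can handle.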

3.1.4 Paging Strategies and MSC paging timers

[15] Unfortunately, unlike UMTS, there is no localised paging (page to registered cell) in GSM.

[16] CP is the central processor of the BSC. The CP controls numerous Regional Processors (RP); each of these RPs can control up to 16 EMs (Equipment Modules, the hardware/software functions responsible for a particular task or tasks). “RP signals” refers to the number of messages that the CP can pass on to the RPs for control as well as messaging purposes.

[17] “Maximum number of RP signals per second” is BSS release version (R#) dependent; please refer to the release notes.

[18] Again release version dependent; for R11 with RPG3, one TRH can handle up to 32 TRXs.

Figure 2: Paging Strategies and MSC paging timers. [Flow chart; elements shown: decision “LAI info in VLR?” (yes/no); first page to 1 Loc Area (TMSI/IMSI) or Global; second page options 1 Loc Area (TMSI/IMSI), 1 Loc Area (IMSI), Global or No second page; selection parameters PAGREP1LA (values 0 to 3) and PAGREPGLOB (values 0 to 1); timers PAGTIMEFRST1LA, PAGTIMEFRSTGLOB, PAGTIMEREP1LA and PAGTIMEREPGLOB.]

• It is more efficient to have the first page based on Location Area. At the MSC, the chance of knowing the correct location area of the mobile at any moment is always very high [19]. Hence basing the first page on Location Area reduces the paging load at the BSC and MSC that would be caused by a global first page (the MS being paged across all the LACs falling under the MSC).

• Second page can be Local or Global, but if the paging success rate of second page is seen to be low then trying out global second paging can be a good option.

• Usually it’s seen to be a good strategy to have the first page to be TMSI based (due to higher paging bandwidth through and through) and the second page to be IMSI or TMSI based. This becomes a good option if the paging load / paging discard at the cell level are high and MSC/ BSC processor load is not a major concern.

3.2 Measure

3.2.1 STS counters and stats for measuring Paging performance

3.2.2 MSC Paging Counters for a Location Area

[19] When a mobile is switched ON, it does a location updating; this process is called IMSI attach. In idle mode, the MS also does “periodic location updating” at an interval defined by the timer T3212, and it does a location updating every time it crosses into a new Location Area. If the parameter ATT=1 (recommended), then each time the MS is switched off or goes out of coverage it does an IMSI detach to inform the MSC that it is no longer available.

Counter-name | Description
NLAPAG1LOTOT | Number of first page attempts to a LA
NLAPAG2LOTOT | Number of repeated page attempts to a LA
NLAPAG1RESUCC | Number of page responses to first page to a LA
NLAPAG2RESUCC | Number of page responses to repeated page to a LA
NLAPAGERR | Number of unsuccessful page responses to a LA
NLALOCSSRFLT | Number of location updating rejections due to the CSS restriction
NLALOCTOT | Total number of location updating attempts
NLALOCSUCC | Number of successful location updating

Table 1: MSC Paging Counters for a Location Area

• To check the percentage of first page attempts that are eventually re-paged due to lack of response, use the formula (NLAPAG2LOTOT/NLAPAG1LOTOT)*100. To check the paging success rate of the second page to a Location Area, use (NLAPAG2RESUCC/NLAPAG2LOTOT)*100. The results can be used to measure the extent to which second paging is needed for a Location Area. Usually this figure averages around 20 to 25%, and it can be used to decide upon the second paging strategy. If the figures for second paging are high, try TMSI based second paging (if there are high paging discards due to excess paging load and high mobile terminated traffic) and/or global second paging, provided the processor loads at the MSC and BSC are not high. If the second paging intensity is very high, optimizing the page repeat timers (PAGTIMEREP1LA and PAGTIMEREPGLOB) might also be an option (always make sure that none of the paging timers exceeds the MSC page response timer PAGING_timer).

• CSS restriction deals with C7 signalling capacity.

• Use counters NPAG1GLTOT and NPAG2GLTOT to measure the first and repeated global paging attempts (if global paging is used).
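The two Location Area formulas above, the re-page percentage and the second page success rate, can be sketched as follows. The formulas are taken from the text; the function names and the counter values are hypothetical.

```python
def percent(numerator: float, denominator: float) -> float:
    """Ratio in percent, NaN when the denominator is zero."""
    if denominator == 0:
        return float("nan")
    return 100.0 * numerator / denominator


def repage_rate(nlapag2lotot: int, nlapag1lotot: int) -> float:
    """Share of first page attempts to a LA that had to be repeated."""
    return percent(nlapag2lotot, nlapag1lotot)


def second_page_success_rate(nlapag2resucc: int, nlapag2lotot: int) -> float:
    """Share of repeated page attempts to a LA that were answered."""
    return percent(nlapag2resucc, nlapag2lotot)


# Hypothetical counter values for one Location Area and one measurement period.
print(f"re-paged: {repage_rate(2200, 10000):.1f}%")
print(f"second page success: {second_page_success_rate(900, 2200):.1f}%")
```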

3.2.3 Stats for BSC Paging Success Rate Measure


Stats | Formulae | What is it? | Comments
P_TOT | NPAG1GLTOT+NPAG1LOTOT | Total Page Attempts (MSC level) | To get pages/s, divide by the measurement period
LU_TOT | (NLOCOLDTOT+NLOCNRGTOT) | Total LU (MSC level) | To get LU/s, divide by the measurement period
LU_SUC_TOT | (NLALOCSUCC/NLALOCTOT)*100 % | LU Succ Rate (LA level) | Successful of total attempts (LA level)
S_TRAFF | (CTRALACC/CNSCAN) in Erl | Average SDCCH Traffic | Cell level
S_CONG | {(CCONGS+CCONGSSUB)/(CCALLS)}*100 % | SDCCH Congestion Rate | Cell level
CP_LOAD | (ACCLOAD/NSCAN)*100 % | CP Load | BSC or MSC
RA_OTHER | (RAOTHER/CNROCNT)*100 % | RA with all other causes (LA, Detach, Attach etc.) of total attempts | BSC level
RA_TOT | (CNROCNT+RAACCFA) | Total Number of Random Access Attempts | BSC level
LU_IMSI_AT | (NLOCATTTOT/NLOCOLDTOT)*100 | IMSI attach attempts of total number of LU attempts from already registered subs (MSC level) | This is at the MSC level
LU_PERIOD | (NLOCPERTOT/NLOCOLDTOT)*100 | Periodic LU attempts of total number of LU attempts from already registered subs (MSC level) | This is at the MSC level
LU_SUC | {(NLOCOLDSUCC+NLOCNRGSUCC)/(NLOCOLDTOT+NLOCNRGTOT)}*100 | Successful LU attempts of total number of LU attempts (MSC) | This is at the MSC level
LU_R | {(NLOCOLDTOT)/(NLOCOLDTOT+NLOCNRGTOT)}*100 % | LU attempts from registered subs of total number of LU | This is calculated at the MSC level
PL_SUC-1 | {(NLAPAG1RESUCC+NLAPAG2RESUCC)/(NLAPAG1LOTOT)}*100 | Successful first and repeated page attempts of total number of first page attempts (LA level) | Paging Success Rate (LA level)
P_1_SUC-1 | {(NPAG1RESUCC)/(NPAG1GLTOT+NPAG1LOTOT)}*100 | Successful first page attempts of total number of first page attempts (MSC level) | First Page Success Rate
P_12_SUC-1 | {(NPAG1RESUCC+NPAG2RESUCC)/(NPAG1GLTOT+NPAG1LOTOT)}*100 | Successful first and repeated page attempts of total number of first page attempts (MSC level) | Paging Success Rate (MSC level)

Table 2: Stats for BSC Paging Success Rate Measure
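A few of the paging stats in Table 2 can be computed from a dictionary of counter values, as in the sketch below. The formulas follow the table; the function name, the dictionary layout, the measurement period and all counter values are hypothetical. Note that the stats mix MSC level and LA level counters, so in practice the inputs would come from different STS object types.

```python
def percent(numerator: float, denominator: float) -> float:
    """Ratio in percent, NaN when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else float("nan")


def paging_stats(c: dict, period_seconds: float) -> dict:
    """Compute P_TOT, pages/s, P_1_SUC, P_12_SUC and PL_SUC from counters."""
    first_attempts = c["NPAG1GLTOT"] + c["NPAG1LOTOT"]
    return {
        "P_TOT": first_attempts,
        "PAGES_PER_S": first_attempts / period_seconds,
        "P_1_SUC": percent(c["NPAG1RESUCC"], first_attempts),
        "P_12_SUC": percent(c["NPAG1RESUCC"] + c["NPAG2RESUCC"], first_attempts),
        "PL_SUC": percent(c["NLAPAG1RESUCC"] + c["NLAPAG2RESUCC"], c["NLAPAG1LOTOT"]),
    }


# Hypothetical counters for a one hour measurement period.
counters = {
    "NPAG1GLTOT": 1000, "NPAG1LOTOT": 9000,
    "NPAG1RESUCC": 8000, "NPAG2RESUCC": 1000,
    "NLAPAG1LOTOT": 9000, "NLAPAG1RESUCC": 7200, "NLAPAG2RESUCC": 900,
}
for name, value in paging_stats(counters, period_seconds=3600).items():
    print(name, round(value, 2))
```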

3.2.4 Calculation of Paging discard rate for cells in a BSC

Use the following equations to calculate paging discard rate in a cell:

Equation 1

If the BSC has more than one LA, paging ratio between LAs is evaluated using establishment cause (answer) to paging in the channel request message for each LA in the BSC:

Equation 2

For such cases (multiple LACs within a BSC) page discard rate per cell is calculated by:

Equation 3

• PAGETOOOLD (cell): number of discarded paging messages (sent out on the PCH) due to being too old.
• PAGPCHCONG (cell): number of discarded paging messages due to full paging queue.
• TOTPAG (BSC): number of paging messages received at the BSC from the MSC.
• PAGCSBSC (BSC): number of paging messages received from SGSN (circuit switched data).
• PAGPSBSC (BSC): number of paging messages received from SGSN (packet switched data).
• RAAPAG1 (cell): number of RA, answer to paging (channel required is a TCH/F with dual rate MS).
• RAAPAG2 (cell): number of RA, answer to paging (CR is TCH/F or TCH/H with dual rate capable MS).

3.2.5 BSC parameters affecting Paging Success Rate

Table 3: BSC Parameters affecting paging success rate

• Cell parameters MFRMS and AGBLK, as explained earlier, impact paging performance.

• For better paging performance BCCHTYPE is recommended to be non-combined.

• T3212, the periodic location updating timer, is usually recommended to be set between 2 and 4 hours. Very high values of T3212 adversely affect paging success rate, whereas too low values for the same timer put unnecessary load on SDCCH (due to more frequent periodic location updating from mobiles in idle mode).

• MSC parameters BTDM + GTDM should not be lower than T3212. BTDM + GTDM determines the implicit detach time at the MSC; that is, if the mobile has not done a location update within the time specified by BTDM + GTDM, it will be taken out of the VLR. That is the reason why the implicit detach time should be greater than the periodic location updating time.

• ATT: this parameter (listed in Table 3 as a cell parameter set with RLSBC) allows IMSI detach when an MS is switched off (or has gone out of coverage). This parameter should always be set to 1 (enabled). This prevents unnecessary paging of mobiles which are either switched off or out of coverage.

Parameters | Explanation | MML
MFRMS | Multiframes period | RLDEC
AGBLK | No. of reserved access grant blocks | RLDEC
BCCHTYPE | Type of BCCH | RLDEC
T3212 [deci hours] | Time-out, MS periodic LU | RLSBC
ATT | Attach-detach allowed | RLSBC
MAXRET | Max. retransmission at access | RLSBC
CRH [dB] | Cell reselect hysteresis | RLSSC

• If high paging failure is suspected from a cell due to poor RF environment (weak coverage / high interference on BCCH radio) , then at times increasing MAXRET (number of retransmission on random access channel) can improve page response from a mobile suffering from poor RF environment.

• ACCMIN, CRH and CRO; these are cell parameters that could be optimized to make an MS select the best possible candidate (from an RF point of view) when in idle mode so that probability of a successful page response from an MS is high.
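The timer rules above can be checked mechanically. The sketch below is a minimal illustration, assuming T3212 is given in deci-hours (units of 6 minutes, as in Table 3) and BTDM and GTDM are given in minutes (as in Table 4). The function names are hypothetical and not part of any Ericsson tool.

```python
def t3212_minutes(t3212_decihours):
    """Convert T3212 from deci-hours (units of 6 minutes) to minutes."""
    return t3212_decihours * 6


def implicit_detach_ok(t3212_decihours, btdm_min, gtdm_min):
    """True when BTDM + GTDM is not lower than the periodic LU time."""
    return btdm_min + gtdm_min >= t3212_minutes(t3212_decihours)


def t3212_in_recommended_range(t3212_decihours):
    """True when T3212 lies within the 2 to 4 hour range recommended above."""
    return 120 <= t3212_minutes(t3212_decihours) <= 240


# Hypothetical settings: T3212 = 40 (4 hours), BTDM = 240 min, GTDM = 10 min.
print(implicit_detach_ok(40, 240, 10))
print(t3212_in_recommended_range(40))
```

A case where BTDM is set to OFF is not modelled here; it would need separate handling.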

3.3 Analyze

3.3.1 Fish bone diagram for the root cause analysis of poor Paging Success Rate

Figure 3: Root Cause for Poor Paging Succ Rate (1). Causes shown on the fish bone:
1. Incorrect Cell Parameters
2. Excess Paging Discards
3. Incorrect MSC Parameters
4. Poor RF
5. Poor Paging Strategy

Figure 4: Root Cause for Poor Paging Succ Rate (2). Causes shown on the fish bone:
6. SDCCH Congestion
7. Combined BCCH
8. Incorrect LAC Dimension
9. ABIS, A interface Congestion
10. ABIS, A interface fluctuations, Errors
11. Decrease signalling load on CCCH

• Incorrect Cell Parameters: T3212 determines the periodic registration time for the MS; if it is too large it can lead to poor paging performance, whereas if it is too short it can overload the SDCCH. It is recommended to set this timer between 2 and 4 hours. ACCMIN, CRO and CRH define the idle mode reselection criteria; use these parameters to make the idle mode behaviour of an MS as close as possible to its active mode behaviour. Incorrect settings of these parameters make the MS reselect inappropriate cells in idle mode.

MFRMS and AGBLK: if paging discards at the cell level are too high, setting AGBLK to 0 can increase the paging bandwidth; lower values of MFRMS (4 to 6) are also recommended to offload a very high paging load at a cell.

• Excess paging discards: Use the PAGPCHCONG (footnote 20) and PAGETOOOLD (footnote 21) counters to measure excess paging discards. If the discard rate is too high, try TMSI based paging; it can dramatically increase paging bandwidth (at the expense of increased processor load at the BSC and MSC). Another method to tackle excess paging discards is to increase the “Paging Queue Length” (footnote 22). If PAGETOOOLD dominates the sum of PAGPCHCONG and PAGETOOOLD, find ways to offload pages faster from the concerned cell (shorten the interval between broadcasts of the same paging group by lowering MFRMS). The acceptable rate of discarded pages in a cell is ideally 0%, but reaching this rate may require increasing the CCCH capacity or decreasing the LA size. These measures may reduce the number of available TCHs or increase the CP load in the BSC due to the increase in LA updates. When redesigning a LAC or re-dimensioning CCCH bandwidth to reduce page discards, use 0.1% or less as the acceptable figure for “paging discard” at a cell.

• Incorrect MSC Parameters:

MSC Parameters | Default | Range
BTDM | OFF | 6 to 1530 (minutes) or OFF
GTDM | | 0 to 255 minutes
TDD | OFF | 1 to 255 days or OFF
PAGTIMEFRST1LA | 4 | 2 to 10 sec
PAGETIMEFRSTGLOB | 4 | 2 to 10 sec
PAGEREP1LA | 2 | 0 to 3
PAGEREPGLOB | 0 | 0 to 1
PAGTIMEREP1LA | 7 | 2 to 10 sec
PAGTIMEREPGLOB | 7 | 2 to 10 sec
TMSIPAR | 0 | 0 to 2
TMSILAIMSC | 0 | 0 to 1
LATAUSED | 0 | 0 to 1
PAGLATA | 0 | 0 to 1
PAGREPCT1LA | 2 | 0 to 3
PAGTIMEREPLATA | 7 | 2 to 10 sec

Table 4: MSC Parameters effecting paging success rate

BTDM is the implicit detach supervision timer (footnote 23) and should be set equal to or longer than T3212, whereas GTDM is the guard time added to BTDM. If the MS does not perform a location update before BTDM + GTDM elapses, the subscriber is set to detached. TDD is the time in days that an inactive IMSI is stored in the VLR before it is removed. PAGTIMEFRST1LA is the time supervision for the response to the first page; if set too low, the paging timer expires before the page response from the MS reaches the MSC. Similarly, PAGETIMEFRSTGLOB sets the time supervision for the first global page (used if LAI information does not exist in the VLR). Parameter PAGEREP1LA decides how the second page is sent (footnote 24); if processor load at the MSC and the BSC is low and the paging discard rate is very high for the BSCs under the concerned MSC, change this parameter to 1 (repeated with either TMSI or IMSI). Parameter PAGEREPGLOB defines how the global paging is repeated (0 = not repeated, 1 = repeated with IMSI). Parameters PAGTIMEREP1LA and PAGTIMEREPGLOB define the time supervision for the second page to a Location Area or globally, respectively. TMSIPAR indicates whether TMSI should be used for paging (0 = TMSI not used, 1 = TMSI used only on encrypted connection, 2 = TMSI always used). Parameter TMSILAIMSC determines if a new TMSI shall be allocated at change of LAI within the MSC/VLR; this parameter is only valid if PAGLATA=1. Parameter PAGREPCT1LA defines how the paging in one location area is repeated; use value 1 if paging discard is too high for the BSCs in an MSC.

20 Number of paging message discards due to full paging queue.
21 Page discarded due to page being in the queue too long.
22 Paging Queue Length = [14 - {MFRMS*(9-AGBLK)}/10]
23 Detach the subscriber.
24 0 = paging in LA is not repeated; 1 = paging is repeated in LA either with TMSI or IMSI; 2 = paging is repeated in LA with IMSI; 3 = paging is repeated as global paging with IMSI.

• Poor RF: Interference on the BCCH / SDCCH carrier and/or poor coverage.

• Poor Paging Strategy: Refer to Figure 2 for various paging strategies. Using a Location Area based first page is usually recommended to bring down the amount of paging needed (as compared to a global first page). If MSC and BSC processor loads are well within acceptable limits and at the same time the “paging discard rate” at the BSC is too high, use TMSI based paging strategies. Also check for page response timers that are too short.

• SDCCH Congestion Rate: High SDCCH congestion rate at cell level can cause high page response failures and hence low paging success rate. In these cases re-dimension the number of SDCCH.

• Combined BCCH: Do not mix combined and non-combined BCCH configurations within the same BSC. Use combined BCCH (SDCCH/4) only in remote BSCs carrying very little traffic.

• Incorrect LAC dimensioning: Use the LAC dimensioning guidelines in section 4 as a reference. If none of the other root causes is responsible for poor paging success rate, either cut over the busy cells at the LAC border to the neighbour LAC (provided the neighbour LAC has good paging performance) or split the LAC. Increasing the number of LACs within a BSC dramatically increases the paging capacity of the BSC (refer to equation 1 and 2 on page 9). As an example, consider a BSC with 100 cells, an average of 3 TRXs per cell and 1 LAC; if the number of LACs is changed to 3 while all other variables stay the same, the paging capacity of the BSC goes up by 67%. This means that if the paging load for a BSC is too high (high volume of mobile terminated traffic within the BSC), splitting the LAC within the BSC could be a good idea (at the expense of increased load on SDCCH for cells falling under the new Location Area and increased signalling traffic on the A interface for the BSC under discussion). If the “paging intensity” at the BSC is found to be higher than the paging capacity of the BTS (this topic is discussed in detail in the next segment, which deals with Location Area), decrease the LA size to reduce “paging intensity”.

• A and ABIS interface congestions, Bit Error Rate and link fluctuations: Link performance (both A and ABIS) directly impacts paging performance. Monitor for excess link congestions, high BER and lack of link availability of the A and ABIS interface.

• Decrease signalling load on CCCH: Set ATT to 1, set a lower value for T3212 (recommended is 40, very short values for this timer can increase load on SDCCH) and allocate enough SDCCH.

• High CP load at the BSC and MSC

4. Location Area updating (success rate)

4.1. Define

4.1.1. Location Area updating process

• A Location Area (LA) is a group of cells in which “paging” will be broadcast.

• The most important rule when dimensioning a Location Area is not to exceed the “paging capacity of the BTS or BSC”.

• The upper boundary for the size of a Location Area is the “service area of an MSC”. In general LAs are much smaller than that, mainly due to the excessive “paging load” that a large LA would have to handle.

• There are three types of LU: Normal, Periodic and IMSI attach.

• A cell constantly broadcasts the LAI (Location Area Identification) as “system information” on the BCCH. An MS camped on the cell reads this broadcast LAI, stores it in its non-volatile memory and then periodically compares it with the LAI received on the camped cell's BCCH. If the LAI broadcast on the BCCH differs from the stored one, the MS will perform a “Normal Location Updating”.

• Periodic LU is governed by the timer T3212, an MS timer, on expiry of which the MS will do “periodic Location Updating”.

• IMSI attach/detach (provided the cell parameter ATT is set to 1) is the activity done by the MS to inform the MSC that it has been switched ON/OFF or has moved into or out of the coverage area (this process prevents unnecessary paging of mobiles no longer available in the network).

• Location area updating process takes place on the SDCCH (average duration of “SDCCH held” time for a Location Updating is around 3.5s).

4.2. Measure

4.2.1. Location Area Dimensioning (from paging capacity point)

• Approximate size of a Location Area is a direct function of the “maximum paging rate” that the BSC and BTSs can sustain without “paging discards”.

• Smaller LAs reduce the paging load in the BSC as well as the BTS; however, smaller LAs also lead to a large number of LA border sites and increased signalling load on the links due to more location updating.

• Calculate the “paging intensity” in “pages per second” for the BSC and then compare it to the “paging capacity” at the cell level for the BSC to decide on the need for LA re-dimensioning.

• The paging handling capacity of the BSC CP is not likely to be a bottleneck; instead, possible bottlenecks in the BSC (for handling excess page load) are too few TRHs (footnote 25) and A-bis signalling capacity (footnote 26).

• Counters to be collected to measure the outgoing “paging commands” from the BSC:

NLAPAG1LOTOT: MSC counter for first paging messages sent to a LA
NLAPAG2LOTOT: MSC counter for repeated paging messages to a LA
PAGPSBSC: BSC counter for “packet switched” paging commands sent out
PAGCSBSC: BSC counter for “circuit switched” paging commands sent out

25 Paging bottleneck at the BSC (one TRH, which is controlled by RPG-3, can handle up to a maximum of 32 TRXs).
26 Paging bottleneck at the BSC (A-bis and A link capacity, especially MTP utilization figures, impact the paging performance).

4.2.1.2. Paging Intensity (pages per second) Calculations for a BSC

When all the cells in the BSC belong to the same LA use:

Equation 4

Use the peak traffic hour in the network for these measurements, normalise the “Pag_Int” figure to pages per second, and then compare it with the paging capacity of the BTS.

4.2.1.3 Paging Capacity of a BTS (Paging Commands / second)

As mentioned previously, of BSC and BTSs, BTS is usually seen to be “the real bottleneck” in paging bandwidth. Hence it’s usually a good idea to measure the “paging commands / second” released from the BSC to the BTSs. If this paging intensity (refer to Equation 4) exceeds paging capacity of the BTS, then LAC re-dimensioning becomes a good option.

Paging Commands (footnote 27) per second (including 25% IMSI based second pages) that the BTS can handle, allowing a maximum of 50% (recommended) of the maximum “theoretical allowable paging load”:

Combined BCCH/SDCCH
• With AGBLK=0: 21.2 paging commands / second
• With AGBLK=1: 14.4 paging commands / second

Non-Combined BCCH/SDCCH
• With AGBLK=0: 63.8 paging commands / second
• With AGBLK=1: 57.7 paging commands / second

When the Location Areas are dimensioned to meet the above design criteria for all the cells in the BSC, paging failures due to “paging discards” are seen to be at a bare minimum.

Theoretical maximum for allowable paging load at the BTS

Combined BCCH/SDCCH
• With AGBLK=0: 42.5 paging commands / second
• With AGBLK=1: 28.75 paging commands / second

Non-Combined BCCH/SDCCH
• With AGBLK=0: 129 paging commands / second
• With AGBLK=1: 115 paging commands / second

The reason for taking 50% of the maximum theoretical paging capacity of the BTS as the recommended value for maximum allowable paging load per BTS is to build in plenty of room for “all BTS originated retransmissions” (footnote 28).
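The comparison between paging intensity and BTS paging capacity can be sketched as follows. This is a minimal illustration that applies the 50% rule to the theoretical maxima listed above, so its results can differ slightly from the rounded recommended figures. The function names and the busy-hour page count are hypothetical inputs; the page count would come from the paging intensity calculation for the BSC.

```python
# Theoretical maximum paging commands per second per BTS, as listed above,
# keyed by (BCCH type, AGBLK).
THEORETICAL_MAX = {
    ("combined", 0): 42.5,
    ("combined", 1): 28.75,
    ("non-combined", 0): 129.0,
    ("non-combined", 1): 115.0,
}


def allowed_paging_load(bcch_type, agblk, margin=0.5):
    """Recommended paging load: a fraction (default 50%) of the theoretical maximum."""
    return THEORETICAL_MAX[(bcch_type, agblk)] * margin


def needs_la_redimensioning(pages_in_busy_hour, bcch_type, agblk, period_s=3600):
    """True when the busy-hour paging intensity exceeds the allowed BTS load."""
    intensity = pages_in_busy_hour / period_s  # pages per second
    return intensity > allowed_paging_load(bcch_type, agblk)


# Hypothetical example: 90000 paging commands in the busy hour (25 per second).
print(needs_la_redimensioning(90000, "combined", 1))
print(needs_la_redimensioning(90000, "non-combined", 1))
```

Keeping the margin as a parameter makes it easy to see how sensitive the decision is to the 50% headroom assumption.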

27 Assumption: 1 “paging attempt” = 1 TMSI + ¼ IMSI (that is, the first page is TMSI based and 25% of the pages are assumed to be retransmitted as a second page).
28 BTS retransmission of paging requests occurs only when there are free resources on CCCH.

4.2.1.4 LA dimensioning strategy

• One LA per BSC is a good rule of thumb, but if the paging load on the BSC and the BTSs exceeds paging capacity for the BSC and BTSs, either migrate cells from the LAC border or do a LAC split.

• Regardless of whether the LAC is rural or urban, choose the LAC border cells from low traffic zones. This reduces excess load on SDCCH due to Location Area Updating.

4.2.1.5 TRXs / BSC and Paging Commands

• Increasing the TRXs / BSC increases the paging commands per second.
• The higher the number of TRXs / cell, the lower the paging commands per second.

4.3 Analyze

4.3.1 Fish bone diagram for the root cause analysis of poor Location Area Updating

Figure 5: Fish bone diagram for root cause of poor LAU success rate (1). Causes shown on the fish bone:
1. SDCCH Congestion
2. Combined BCCH/SDCCH Cells
3. SD drops due to poor RF & hardware issues
4. Incorrect LAC border
5. Too short T3212

Figure 6: Fish bone diagram for root cause of poor LAU success rate (2). Causes shown on the fish bone:
6. CRH, ACCMIN and CRO
7. High SDCCH mean hold time and excess T200 timer

• SDCCH Congestion: Location Area Updating is done by the MS over SDCCH, so lack of available SDCCH, whether due to congestion or to incorrect SDCCH dimensioning, affects the success of a Location Area Update. Apart from Location Updating and call setup, SDCCH is also used for SMS (unlike speech, which uses TCH); so excess SMS load on a cell (bulk SMS activity at the LAC boundaries) can also increase SDCCH congestion and thus reduce LAU success rate. Add more SDCCH in these cases.

• Combined BCCH/SDCCH: In a non-combined BCCH/SDCCH configuration, one timeslot on the TRX corresponds to 8 SDCCH sub channels. A cell can also use a combined BCCH/SDCCH, in which TS 0 on the BCCH radio carries both the BCCH and an SDCCH/4 (4 SDCCH sub channels). Combined BCCH/SDCCH is usually used in cells that carry very little traffic (e.g. remote highway sites). Using a combined BCCH/SDCCH can adversely impact LAU success rate, especially if used for cells at the LA boundary.

• Excess SDCCH drop due to poor RF: Interference, incorrect SDCCH power control settings and poor coverage often adversely affect the Location Area Updating process. SDCCH drop is taken up as a KPI in a later section of this document (including the topic of SDCCH drop due to hardware faults).

• Incorrect LAC border: The best designed LAC borders consist of cells that carry low traffic and see very little MS movement across the LAC border.

• Too short T3212 timer: If the periodic registration timer T3212 is set too short, it leads to unnecessary location updating requests all across the cell. This not only increases the load on SDCCH but also increases the load on the E1 links, leading to ABIS or A-interface congestion. The recommended setting for T3212 is 40 (4 hours, as the unit is deci-hours).

• CRH, ACCMIN and CRO: CRH is the cell reselection hysteresis; this parameter defines the offset in dB (default 4) that applies when an MS in idle mode reselects a cell across a LAC border. Setting CRH too low can cause unnecessary “ping pong” location updating. The impact of ACCMIN and CRO on LAU success rate is as described in the RACCH success rate part of this document.

• High SDCCH mean hold time and excess T200 timer expiry: Average SDCCH mean hold time varies from cell to cell based on the type of traffic carried by the cell (on average, LAU, IMSI attach and periodic registration hold the SDCCH for 3.5 s, IMSI detach takes around 2.9 s, whereas MO and MT calls hold the SDCCH for 2.7 s and 2.9 s respectively). SMS holds the SDCCH for the longest time, on average 6.2 s. A cell usually has a mix of these different types of traffic, and the average SDCCH mean hold time (measured by STS counter SDCCH Mean holding Time) for a cell should be less than 6 seconds (with the exception of sites with ABIS on VSAT). If the SDCCH mean hold time is seen to be too long, check for stuck (sleeping) SDCCH timeslots within the cell (a TRX reset usually clears this problem) or excess transmission fluctuations / excess BER on the E1. This can be checked by running a CTR (Cell Traffic Recording) and analysing the Layer 3 messages. If the CTR report shows excess T200 timer expiry, then unstable transmission could be the cause of poor LAU success rate. Confirm this finding by checking for excess “drops due to other reason” and “drops due to sudden lost connection”; apart from faulty transcoder hardware, one of the causes of drops due to other reason is transmission fluctuations.
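To judge whether a measured SDCCH mean hold time is plausible for a given traffic mix, the per-event hold times quoted in this document can be combined into a weighted average. The sketch below is illustrative only; the event names and the traffic mix are hypothetical, and the 6 second limit is the guideline stated above.

```python
# Typical SDCCH hold times in seconds per event type, as quoted in this document.
HOLD_TIME_S = {
    "location_update": 3.5,
    "imsi_attach": 3.5,
    "imsi_detach": 2.9,
    "mo_call": 2.7,
    "mt_call": 2.9,
    "sms": 6.2,
}


def expected_mean_hold_time(event_counts):
    """Weighted average SDCCH hold time for a mix of events (counts per event type)."""
    total = sum(event_counts.values())
    if total == 0:
        return 0.0
    weighted = sum(HOLD_TIME_S[event] * n for event, n in event_counts.items())
    return weighted / total


def hold_time_suspicious(measured_s, limit_s=6.0):
    """True when the measured mean hold time reaches or exceeds the guideline limit."""
    return measured_s >= limit_s


# Hypothetical busy-hour mix for one cell.
mix = {"location_update": 400, "mo_call": 300, "mt_call": 200, "sms": 100}
print(expected_mean_hold_time(mix))
print(hold_time_suspicious(7.4))
```

A measured value well above the expected figure for the cell's mix is the trigger for the stuck timeslot and transmission checks described above.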

5. SDCCH Congestion Rate

5.1. Define

In GSM, channels are divided into “Traffic Channels” and “Control/Signalling Channels”. Traffic Channels carry the actual “Payload” (footnote 29), whereas Control/Signalling channels are used for precisely that purpose: control and signalling. They control the mobile's behaviour, both in “active mode” while a call is in progress (e.g. transmit power, handover, coding schemes being used) and in idle mode (e.g. cell reselection), and they carry the signalling for setup, registration, handover etc.

SDCCH is a control channel and is used for system signalling (UL/DL) during Location Updating (mean hold time of SDCCH=3.5s), IMSI Attach (3.5s), IMSI Detach (2.9s) , Mobile Originated Calls (2.7s) , Mobile Terminated Calls (2.9s), SMS (6.2s) , MS Originated FAX Call Setup (2.7s) , MS Terminated FAX Call Setup (2.9s) and False Access or phantom RACCH access due to radio disturbances (1.8s). In short SDCCH plays a big part in call setup process as the call setup attempts go from a successful Random Access to TCH via SDCCH.

• SDCCH can either be configured on a dedicated radio timeslot or be combined with the BCCH.

• With a dedicated SDCCH timeslot we get 8 SDCCH sub channels (footnote 30) per radio timeslot.

• When configured as combined BCCH/SDCCH we get 4 SDCCH sub channels on the BCCH timeslot.

• It is usually recommended to equip at most 2 SDCCH/8 channels per TRX (or a maximum of 32 SDCCH/8 channels per cell). This keeps the signalling load on the ABIS manageable. When using the “increased SDCCH feature” this rule no longer applies at the cell.

• In most networks the acceptable Grade of Service for TCH is 2% and that for SDCCH is 0.5%.

5.1.1 Load in mErlang on SDCCH due to different events

• Location Updating: Load on SDCCH due to location updating differs between an inner cell and a cell at the LAC border. On average this load is considered to be 1 mE per subscriber (with 3 mE per subscriber for a border cell).

• IMSI Attach-Detach: 1.8 mE per subscriber.

• Periodic Registration: 0.5 mE per subscriber.

• Call Setup: 0.9 mE per subscriber.

• SMS: 1.7 mE per subscriber.
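The per-subscriber figures above can be turned into an estimated SDCCH load for a cell. The sketch below assumes the individual event loads simply add up per subscriber, which is an assumption made for illustration rather than a rule stated in this document; the function name and subscriber count are hypothetical.

```python
# SDCCH load per subscriber in mErlang for each event type, as listed above.
LOAD_ME_PER_SUB = {
    "imsi_attach_detach": 1.8,
    "periodic_registration": 0.5,
    "call_setup": 0.9,
    "sms": 1.7,
}
LU_ME_INNER = 1.0   # location updating load for an average (inner) cell
LU_ME_BORDER = 3.0  # location updating load for a cell at the LAC border


def sdcch_load_erlang(subscribers, border_cell=False):
    """Estimated SDCCH traffic in Erlang, assuming the event loads are additive."""
    lu = LU_ME_BORDER if border_cell else LU_ME_INNER
    per_sub_me = lu + sum(LOAD_ME_PER_SUB.values())
    return subscribers * per_sub_me / 1000.0


# Hypothetical cell serving 1000 subscribers: about 5.9 Erl inner, about 7.9 Erl at the border.
print(sdcch_load_erlang(1000))
print(sdcch_load_erlang(1000, border_cell=True))
```

The difference between the two results shows why LAC border cells are the first candidates for extra SDCCH resources.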

29 All the TDMA timeslots carry 148 bits. The traffic (speech, data or control/signalling information) is structured within this TDMA frame as two separate 57-bit encrypted and interleaved blocks; 26 of the available 148 bits are used for the training sequence, which is used to adjust the automatic gain controller at the receivers. 3 bits are used as tail bits at the beginning and at the end of the TDMA frame. At the end of the first 57-bit information block and at the beginning of the second information block there is a flag bit which defines the type of content in the 57-bit block. If this flag is set to 0, the content of that particular block is traffic; if it is set to 1, the content is signalling.
30 Up to 8 mobiles can share the same radio timeslot for SDCCH usage.

5.2. Measure

5.2.1 Traffic Theory

The purpose of dimensioning of traffic and signalling channels is to choose the correct amount of channels and hardware to meet the Grade of Service requirements. Over dimensioning is cost inefficient to the customer and will lead to inefficient use of the equipment where as under dimensioning will lead to “congestion”, delays and deterioration of service performance. In an “AXE application” there are two types of system: loss systems and delay systems. In a loss system the user is disconnected if an idle device cannot be found where as in a delay system the user is put in a queue, in similar situation. SDCCH assignment process for Ericsson system is a loss system.
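Blocking in a loss system is commonly estimated with the Erlang B formula. The document does not name the formula it uses, so the sketch below is an assumption about the traffic model, shown only to illustrate how a Grade of Service target translates into a number of SDCCH sub channels. The function names are hypothetical.

```python
def erlang_b(traffic_erl, channels):
    """Blocking probability for offered traffic (Erlang) on a number of channels.

    Uses the standard recursion B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1)).
    """
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = traffic_erl * blocking / (n + traffic_erl * blocking)
    return blocking


def subchannels_needed(traffic_erl, gos=0.005, max_channels=256):
    """Smallest number of sub channels whose blocking does not exceed the GoS target."""
    for n in range(1, max_channels + 1):
        if erlang_b(traffic_erl, n) <= gos:
            return n
    return None  # target not reachable within max_channels


# Hypothetical example: 2 Erl of offered SDCCH traffic with a 0.5% GoS target.
print(subchannels_needed(2.0))
```

The recursive form avoids the large factorials of the closed-form expression and stays numerically stable for any realistic channel count.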

5.2.2 STS counters for measuring SDCCH congestion Rate

Counter | Meaning | Comment
S_EST | S_EST = {(CMSESTAB) / [CCALLS - (CCONGS+CCONGSSUB)]} * 100 % | SD Access Succ Rate
S_CONG | S_CONG = (CCONGS / CCALLS) * 100 % | SD Congestion Rate
CCALLS | Call attempt counter (call attempt on the SDCCH) |
CMSESTAB | Successful MS channel establishment on SDCCH |
CCONGS | Congestion counter for underlaid subcell. |
CCONGSSUB | Congestion counter for overlaid subcell. |
CCONGS | Stepped each time an allocation attempt in an underlaid cell fails due to SD congestion | SD Congestion Underlaid Cell
CTCONGS | SD Congestion counter for overlaid cell | SD Congestion Overlaid Cell
CNRELCONG | # of released connections on SD due to TCH or Transcoder congestion; CNDROP is stepped at the same time. | Both Over Laid and Under Laid cell; use CNRELCONGSUB for OL subcell. If TCH is not congested check for XCDR cong.

Table 5: STS counters for measuring SDCCH congestion
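The two stats in Table 5 can be computed from the raw counters as sketched below. The function name and sample values are hypothetical. The congestion rate here adds the overlaid counter CCONGSSUB to CCONGS, following the S_CONG definition in Table 2; pass 0 for cells without an overlaid subcell.

```python
def sdcch_kpis(ccalls, cmsestab, ccongs, ccongssub=0):
    """Return (S_EST, S_CONG) in percent from SDCCH STS counters.

    S_CONG: congested attempts over all SDCCH call attempts.
    S_EST:  successful establishments over the attempts that were not congested.
    Either value is None when its denominator is zero.
    """
    congested = ccongs + ccongssub
    s_cong = 100.0 * congested / ccalls if ccalls > 0 else None
    served = ccalls - congested
    s_est = 100.0 * cmsestab / served if served > 0 else None
    return s_est, s_cong


# Hypothetical counter values for one cell and one measurement period.
s_est, s_cong = sdcch_kpis(ccalls=1000, cmsestab=931, ccongs=20)
print(s_est, s_cong)  # 95.0 2.0
```

Reading the two values together matters: a low S_CONG combined with a low S_EST points away from dimensioning and towards other causes.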

• If for a cell SDCCH congestion rate is low and SDCCH Access Success Rate is low it could be hardware related issue.

• Another network service that can put pressure on SDCCH performance is Cell Broadcast (footnote 31). If the Cell Broadcast service is active in a cell, one signalling “sub channel” is replaced with one Cell Broadcast Channel (CBCCH); that leaves 7 SDCCH sub channels available for call setup in the case of an SDCCH/8 (and 3 SDCCH sub channels in the case of an SDCCH/4).

• When Half Rate is used, two call connections are permitted on the same timeslot. This is done by making each mobile use every alternate TDMA frame for TX/RX; that is, if MS1 uses TDMA frame 0 it will transmit/receive on TDMA frames 0, 2, 4, 6, 8 etc., and the odd TDMA frames, i.e. 1, 3, 5, 7 etc., will be used by MS2. Half Rate theoretically doubles the number of available traffic channels, but most often the increase in traffic is limited to a maximum of 20%. This means that usage of half rate TCH will affect the SDCCH dimensioning.

• On the contrary, when the extended cell feature is used in a cell, the needed number of SDCCH is halved; this is because two TDMA timeslots are now used for one call.

31 Cell Broadcast service provide the transmission of SMS from a message handling centre to all the MSs in the serving area of the BTS.

5.3 Analyze

5.3.1 Fish bone diagram for the root cause analysis of high SDCCH congestion rate

Figure 7: Fish bone diagram for root cause analysis of high SDCCH congestion rate. Causes shown on the fish bone:
• High Volume of SMS
• Adaptive Configuration of Logical Channel
• High Volume of LAU
• SDCCH Dimension
• Immediate Assignment on TCH
• RF Spillage

• Adaptive Configuration of Logical Channel: The purpose of this feature is to dynamically reconfigure idle TCH channels to SDCCH channels when there is a high load on SD. When the SDCCHs in a cell are congested, no new call setup will be accepted even if TCHs are available (unless the feature “Immediate Assignment on TCH” is enabled). In such cases Adaptive Configuration of Logical Channels will dynamically assign more (or fewer) SDCCHs. It is highly recommended to use this feature. The main cell parameters controlling this feature are SLEVEL and STIME. SLEVEL defines the number of remaining SDCCH sub channels at which an attempt to reconfigure an idle TCH to an SDCCH/8 will take place; the default value is 0 and the range is 0 to 2. If the SDCCH congestion rate for a cell is seen to be too high, increase SLEVEL to 2. STIME defines the minimum time interval before an SDCCH/8 added by the feature can be reconfigured back to TCH; the default value is 20 s, the recommended value is 40 s and the range is 15 s to 3600 s. For cells that exhibit a high SDCCH congestion rate at peak traffic hour with low TCH congestion figures, increase the value set for this timer.

• SDCCH Dimension: The Grade of Service that most operators work towards globally is 2% on TCH and 0.5% on SDCCH. Use the network stats to decide whether more resources are needed for SDCCH (equip more timeslots for SDCCH). Measure daily averages of the SDCCH congestion rate across a month to study the variation of congestion figures on the SDCCH. It is recommended not to exceed 2 SDCCH/8 channels per TRX; this keeps the signalling load on the E1 links and the CP load at the BSC below congestion figures (though the feature “increased SDCCH” removes this upper limit). Before increasing the number of SDCCH to unusually large numbers, study the SDCCH usage. If the cell takes a high number of Location Updating requests, try to modify the LAC boundaries instead or make use of the feature Immediate Assignment on TCH.

• High volume of LAU: Move the LAC border away from this cell to a cell that carries less traffic. Optimise the periodic registration timer T3212 (higher values of T3212 reduce the SDCCH load from periodic location updating by mobiles in idle mode, but can adversely affect paging performance).

• High volume of SMS: SMS traffic has grown globally; in some networks close to 17% of the net traffic carried is SMS. SMS therefore contributes a considerable share of the revenue generated, and in some cases it justifies re-dimensioning the SDCCH.

• Immediate Assignment on TCH: This feature allows signalling to be carried on a TCH as well as on an SDCCH (see footnote 32). When a TCH is used for immediate assignment and the MS requests a speech or data channel type in the Channel Request, the connection changes from signalling to traffic after the assignment stage, so the same timeslot is retained for traffic. There are three possible strategies for this feature: first, immediate assignment on TCH not permitted; second, “immediate assignment on TCH as last preference”, where a TCH is allocated at immediate assignment only when no idle SDCCHs are available; and third, immediate assignment on TCH as first preference, where an SDCCH is allocated only when no idle TCHs are available (this increases the load on TCH). The cell parameter that determines the strategy is CHAP, with a valid range of 0 to 10. CHAP = 0, 5, 7, 8, 9 and 10 do not allow immediate assignment on TCH at all; CHAP = 1 or 6 uses TCH as the last resort, i.e. when all the SDCCHs are busy; CHAP = 2, 3 or 4 sets TCH as the first option for immediate assignment. The recommended value for CHAP is 1, whereas the default value is 0. Use of immediate assignment on TCH can bring down the SDCCH Congestion Rate.

• RF Spillage: Unnecessary RF spillage leads to a cell picking up traffic from unintended areas (the coverage area of other cells), which can cause unwanted load on the control channels. In such cases it is better to run an MRR collection at the cell level, check the percentage of traffic coming in at unwanted high timing advance values, and then design appropriate antenna down tilts or antenna height reductions.

• Another good parameter to use in case of excess congestion on SDCCH is SCHO (SCHO=ON; the default is OFF). When SCHO=ON, handover on SDCCH is enabled at congestion (but this can increase the SDCCH drop rate).
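The CHAP value groups described in the Immediate Assignment on TCH bullet can be summarised in a small lookup. This is only an illustrative sketch in Python: the function name and the strategy labels are hypothetical, and the value groups are taken directly from the text above rather than from any Ericsson tool or API.

```python
# Illustrative only: maps a CHAP value to the immediate-assignment strategy
# described in the text. Names and labels are hypothetical.
NOT_PERMITTED = "immediate assignment on TCH not permitted"
TCH_LAST = "TCH as last preference (only when no idle SDCCH)"
TCH_FIRST = "TCH as first preference (SDCCH only when no idle TCH)"


def immediate_assignment_strategy(chap: int) -> str:
    """Return the strategy the text associates with a CHAP value (0 to 10)."""
    if not 0 <= chap <= 10:
        raise ValueError("CHAP must be in the range 0 to 10")
    if chap in (1, 6):
        return TCH_LAST
    if chap in (2, 3, 4):
        return TCH_FIRST
    # CHAP = 0, 5, 7, 8, 9 and 10
    return NOT_PERMITTED


if __name__ == "__main__":
    for value in (0, 1, 3):
        print(value, "->", immediate_assignment_strategy(value))
```

With the recommended setting CHAP=1 the lookup returns the "last preference" strategy, which is the one that relieves SDCCH congestion without loading the TCH first.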

32 In general, a TCH is never allocated at “immediate assignment” if the BSC has enough information to know that the channel request from the mobile can be completely supported by SDCCH (LAU, SMS, etc.). The Channel Request message sent by the MS contains the “establishment cause”, which tells the BSC the type of connection the MS is requesting. (This is not exactly true for phase 1 MSs, for which the BSC struggles quite a bit to determine the request type. The Channel Request of a phase 2 MS has more information fields, so it is easier for the BSC to determine the request type.)

6. SDCCH Drop Rate

6.1 Define

6.2 Measure

6.2.1 SDCCH Drop Rate

The correct formula for measuring SDCCH Drop Rate is

SDCCH Drop Rate = [(CNDROP-CNRELCONG)/ (CMSESTAB)]*100 % Equation 5

Some documentation gives the formula for SDCCH Drop Rate as (CNDROP/CMSESTAB)*100. The problem with this equation is that it does not exclude SDCCH connections released due to TCH or Transcoder congestion. Alternatively, you can use both formulae together to measure how much TCH or Transcoder congestion contributes to a high reported SDCCH drop rate.

CNDROP: total number of dropped SDCCH channels in a cell (for the measurement period).
CNRELCONG: total number of dropped (released) connections on SDCCH due to TCH or Transcoder congestion.
CMSESTAB: total number of successful MS channel establishments on SDCCH.
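As a worked illustration of Equation 5 and of the alternative formula, the sketch below computes both rates and the share of drops attributable to TCH or Transcoder congestion. It is a minimal Python sketch: the function names and the counter values in the example are hypothetical, and only the three counters defined above are used.

```python
def sdcch_drop_rate(cndrop: int, cnrelcong: int, cmsestab: int) -> float:
    """Equation 5: SDCCH drop rate in %, excluding congestion releases."""
    if cmsestab <= 0:
        raise ValueError("CMSESTAB must be positive")
    return (cndrop - cnrelcong) / cmsestab * 100.0


def sdcch_drop_rate_incl_congestion(cndrop: int, cmsestab: int) -> float:
    """Alternative formula (CNDROP/CMSESTAB)*100, congestion releases included."""
    if cmsestab <= 0:
        raise ValueError("CMSESTAB must be positive")
    return cndrop / cmsestab * 100.0


def congestion_share_of_drops(cndrop: int, cnrelcong: int) -> float:
    """Percentage of the counted drops that are TCH/Transcoder congestion releases."""
    if cndrop <= 0:
        return 0.0
    return cnrelcong / cndrop * 100.0


if __name__ == "__main__":
    # Hypothetical counter values for one cell and one measurement period.
    cndrop, cnrelcong, cmsestab = 120, 30, 6000
    print(sdcch_drop_rate(cndrop, cnrelcong, cmsestab))
    print(sdcch_drop_rate_incl_congestion(cndrop, cmsestab))
    print(congestion_share_of_drops(cndrop, cnrelcong))
```

Comparing the two rates for the same cell shows directly how much of a high reported drop rate is really a TCH or Transcoder capacity problem.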

6.2.2 SDCCH Drop Reason

There are 6 different stats available in the Ericsson system that give the individual percentage contribution of each factor to the net SDCCH drop:

• Drop Reason, Low Signal Strength Uplink (%).
• Drop Reason, Low Signal Strength Downlink (%).
• Drop Reason, Bad Quality Uplink (%).
• Drop Reason, Bad Quality Downlink (%).
• Drop Reason, Excess TA (%).
• Drop Reason, Other (%).

[Pie chart: BSC 1 SDCCH Drop Cause. Legend: Drop Reason, Low SS (%); Drop Reason, Bad Quality (%); Drop Reason, Excessive TA (%); Drop Reason, Other (%). Values shown: 19.07, 3.38, 0.3, 77.26.]

Figure 8 : SDCCH Drop Cause at BSC level

For the BSC shown above, 77.26% of the drops on SDCCH fall in the category Other Reasons. Once the major contributing factor to the SDCCH drops is known, either at the BSC level or at the cell level, it becomes easier to use this information to find the root cause of the excess SDCCH drops.

6.3 Analyse

6.3.1 Fish bone diagram for the root cause analysis for high SDCCH drop rate

[Fish bone diagram: SDCCH Drop Rate. Branches: High TA; Signal Strength; TCH / Transcoder Congestion; Interference; Other Reasons (Power Control Settings, Adaptive Configuration, DIP Status).]

Figure 9: Fish bone diagram for the root cause analysis of high SDCCH drop rate

• Signal Strength: An excess percentage of SDCCH drops due to low signal strength on the uplink and downlink usually points to poor RF coverage. Options here include revising MS and BS power control (the power control settings, see footnote 33, may be too aggressive), antenna down tilts and antenna height reductions (footprint reduction), reviewing MS and BS transmit power (if set low, this causes a link imbalance), use of a TMA, or planning a new site to take care of the weak coverage area.

• Interference: An excess percentage of SDCCH drops due to bad quality on the uplink and downlink points to co-channel and adjacent channel interference as the root cause. Isolate the interferer (see footnote 34) and change frequency; alternatively, optimise the coverage of either the serving cell or the interferer.

33 Power control is covered in detail under the section on TCH Drop Rate.
34 Either by a frequency scan using a drive test tool or (preferably) by using the record test neighbour frequency signal strength feature (both these methods are covered in the TCH Drop Rate section of this document).

• TCH / Transcoder Congestion: In one of the stats used by Ericsson to measure SDCCH Drop Rate, [(CNDROP/CMSESTAB)*100], an SDCCH released due to TCH or Transcoder resource congestion is also treated as a dropped SDCCH connection. Technically this is not a dropped SDCCH connection: the SDCCH was successfully assigned and successfully used to reach the TCH stage, and the connection was “released as a normal network assisted release due to lack of availability of further resources”, not because of issues on the SDCCH. Use TCH congestion optimisation methods to bring down a high SDCCH drop rate due to TCH congestion.

• High TA: Use down tilts, height reduction or BS power reduction (see footnote 35) to remove the high TA coverage of the cell.

• Power Control Settings: Poor power control settings for SDCCH can lead to excess drops. All the cell parameters that control MS and BS power control (covered under the section on TCH Drop Rate) apply equally to SDCCH power control. One cell parameter is unique to SDCCH power control: SDCCHREG, which enables (SDCCHREG=1) or disables (SDCCHREG=0) power control on SDCCH. By default it is disabled, but it is recommended to enable power control on SDCCH. A second BSS power control parameter that affects SDCCH alone (and not TCH) is INIDES, which defines the desired initial signal strength for the SDCCH on DL and UL; its default value is -70 dBm (with a range of -100 dBm to -47 dBm). Optimising INIDES, especially on the uplink, has a direct impact on the SDCCH drop rate, because the UL is the weaker of the two links and is extremely sensitive to interference. If a high percentage of traffic is close to the cell (and need not transmit at very high power to communicate with the BS), INIDES can be brought down from -70 dBm to, say, -85 dBm. This reduces the cumulative power emitted by mobiles close to the base station and thus lowers the “high noise floor” seen by mobiles accessing the same cell from the far off zones of its coverage area, which reduces the number of interference related drops on SDCCH from those zones. UPDWNRATIO controls the rate at which the MS and BS are powered up and down; its default setting is 200, which means that the power up step size in dB for UL/DL is twice the power down step size (if the power down step per power control interval is 2 dB on the UL, the power up step is 4 dB).

It is always a good strategy to have “aggressive settings for power control from a signal level point of view and make both UL and DL very sensitive to quality issues”. This ensures that the MS and BS transmit at optimum power at all times but power up immediately on quality issues caused by interference. This approach is usually seen to bring down both the SDCCH and the TCH drop rate in a BSC. More on effective power control strategy is covered in a later part of this document.

• Adaptive Configuration of Logical Channel: In some of the earlier software release versions, excess SDCCH drop due to “other reason” was at times (especially during peak busy traffic hours with high CP load at the BSC) seen to be due to the feature “Adaptive Configuration of Logical Channel”. This issue seems to be minimised on the later generation BSS hardware / Software release versions. If all other causes for excess “SDCCH drop due to other reason” don’t appear to be the root cause, try disabling this feature from a test cell to measure the impact on SDCCH drop due to other reason.

• DIP Status: The Digital Path (DIP), i.e. the E1 connection to the site as well as the E1 connection to the BSC, is often a cause of SDCCH drops due to “other reason”. A high BER (Bit Error Rate) or high frame loss caused by unstable transmission (check for high T200 expiry in the layer 3 messages of CTR measurements or drive tests) is often seen to be the cause of high SDCCH drops due to other reasons.

• Other Reasons: Possible causes of high SDCCH drops due to other reasons include incorrect power control settings, Adaptive Configuration of Logical Channels, DIP status, hardware faults at the BTS, frequency interference problems (causing sudden drops), or congestion or stability issues on the C7 link between the BSC and the MSC.
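The UPDWNRATIO arithmetic from the Power Control Settings bullet can be checked with a one-line calculation. The sketch below assumes, as the worked example in the text suggests, that UPDWNRATIO expresses the power up step as a percentage of the power down step; the function name is hypothetical.

```python
def power_up_step_db(power_down_step_db: float, updwnratio: int = 200) -> float:
    """Power up step in dB, assuming UPDWNRATIO is a percentage of the down step.

    With the default of 200 the up step is twice the down step.
    """
    if updwnratio <= 0:
        raise ValueError("UPDWNRATIO must be positive")
    return power_down_step_db * updwnratio / 100.0


if __name__ == "__main__":
    # Example from the text: a 2 dB power down step on the UL.
    print(power_up_step_db(2.0))
```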

35 Use the cell parameter BSCTXPWR

7. TCH Congestion Rate

7.1 Define

One TDMA frame has 8 timeslots (one radio) and each timeslot can act as TCH channel (traffic channel carrying the payload).

There are two types of TCHs:

• Full rate channel, which is used for full rate speech at 13 kbps or for data rates up to 14.4 kbps.
• Half rate channel, which is used for half rate speech at 6.5 kbps or for data rates up to 4.8 kbps.

7.2 Measure

The primary cause of TCH congestion is a lack of TCH resources (see footnote 36) to cater for the offered traffic.
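Whether the available TCHs are enough for the offered traffic is commonly estimated with the Erlang B formula. The sketch below is an illustration under that assumption (Erlang B is a standard dimensioning model, but this document does not name it); the function names are hypothetical, and the 2% target is the TCH Grade of Service quoted earlier in this document.

```python
def erlang_b(offered_erlangs: float, channels: int) -> float:
    """Blocking probability for the offered traffic on a given number of channels."""
    if offered_erlangs < 0 or channels < 0:
        raise ValueError("traffic and channel count must be non-negative")
    blocking = 1.0  # blocking with zero channels
    for n in range(1, channels + 1):
        # Standard numerically stable Erlang B recursion.
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def channels_needed(offered_erlangs: float, target_gos: float = 0.02) -> int:
    """Smallest number of channels whose blocking does not exceed the target GoS."""
    channels = 0
    while erlang_b(offered_erlangs, channels) > target_gos:
        channels += 1
    return channels


if __name__ == "__main__":
    # Hypothetical cell offered 2.9 Erlangs, dimensioned for 2% blocking.
    print(channels_needed(2.9))
    print(round(erlang_b(2.9, 7), 4))
```

If the measured busy hour traffic needs more channels than the cell has TCHs at the target Grade of Service, the congestion is a genuine capacity shortfall rather than a parameter problem.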

The following STS counters and stats could be used to measure TCH congestion figures.

7.2.1 STS counters and stats for TCH Congestion Rate

36 Transcoder Capacity: to measure TCH signalling released (during setup) due to transcoder congestion (despite the availability of TCH), use the counters TFNRELCONG (full rate) and THNRELCONG (half rate).

Counter | Meaning | Comment
TFTRALACC | Traffic level accumulator for full rate TCH | Use these counters to check half rate utilization in a cell
THTRALACC | Traffic level accumulator for half rate TCH | Use these counters to check half rate utilization in a cell
TFNSCAN | Total number of accumulations of the counter for full rate TCH (use THNSCAN for half rate) |
TF_TRAFFIC | TF_TRAFFIC = (TFTRALACC/TFNSCAN) Erlangs (use H instead of F for half rate) |
TAVASCAN | Counter value for number of available Basic Physical Channels for traffic channel |
TF_MEANH | TF_MEANH = (TF_TRAFFIC*PERLEN*60)/TFMSESTB seconds | TCH Mean Hold time
PERLEN | Measurement period length used in STS (min) |
TFMSESTB | Successful MS establishments on TCH full rate |
T_AVAIL | T_AVAIL = {(TAVAACC) / (TAVASCAN*TNUCHCNT)}*100 | TCH Availability %
TNUCHCNT | Number of defined TCHs |
T_DWN | T_DWN = (TDWNACC/TDWNSCAN)*100 | TCH downtime %
TDWNACC | This counter is incremented every 10th second if there are no TCHs in IDLE or BUSY mode with cell state ACTIVE |
TDWNSCAN | This counter is stepped every 10th second when the cell state is active |
TFCASSALL | Total number of assignment complete messages for all MSs in the underlaid subcell (FR) | Similar counters for HR too
TFCASSALLSUB | Total number of assignment complete messages for all MSs in the overlaid subcell (FR) | Similar counters for HR too
TASSATT | Number of "first assignment" attempts on TCH (successful + unsuccessful), both counted in the TARGET cell |
TASSALL | Number of "first assignment" attempts on TCH. Successful attempts are counted in the serving cell (the serving cell being the cell where the MS was tuned in to an SD or TCH for signalling) |
T_AS_SUC | T_AS_SUC = (TFCASSALL+TFCASSALLSUB+THCASSALL+THCASSALLSUB)/(TASSALL)*100 % | TCH Assignment Success Rate, i.e. successful change from SD to TCH
TFCONGSAS | Number of congestions at assignment or immediate assignment in the underlaid (UL) subcell (use TH for HR) |
TFCONGSHO | Number of congestions at incoming handover in the underlaid (UL) subcell (use TH for HR) |
TFTCONGS | Soft congestion "time" counter for the UL subcell; the counter starts when a channel request is made and no idle channels are available (use H for half rate) |
TFTHARDCONGS | Hard congestion time counter for the UL subcell. The counter starts to increment only when it has not been possible to allocate a channel without any type of pre-emption |
T_CONG | T_CONG = [(CNRELCONG+TF_REL_C+TH_REL_C)/(TASSALL)]*100 | TCH Congestion Rate

Table 6: Counters and stats for TCH Congestion Rate
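The derived stats in Table 6 are simple ratios of STS counters. The following Python sketch restates those formulas as functions so they can be applied to exported counter values; the function names and the sample numbers are hypothetical, and the formulas are copied from the table rather than from any Ericsson tool.

```python
def tf_traffic(tftralacc: float, tfnscan: float) -> float:
    """TF_TRAFFIC in Erlangs (use the TH counters for half rate)."""
    return tftralacc / tfnscan


def tf_meanh(traffic_erlangs: float, perlen_min: float, tfmsestb: float) -> float:
    """TCH mean hold time in seconds."""
    return traffic_erlangs * perlen_min * 60.0 / tfmsestb


def t_avail(tavaacc: float, tavascan: float, tnuchcnt: float) -> float:
    """TCH availability in %."""
    return tavaacc / (tavascan * tnuchcnt) * 100.0


def t_dwn(tdwnacc: float, tdwnscan: float) -> float:
    """TCH downtime in %."""
    return tdwnacc / tdwnscan * 100.0


def t_as_suc(tfcassall: float, tfcassallsub: float,
             thcassall: float, thcassallsub: float, tassall: float) -> float:
    """TCH assignment success rate in %."""
    return (tfcassall + tfcassallsub + thcassall + thcassallsub) / tassall * 100.0


def t_cong(cnrelcong: float, tf_rel_c: float, th_rel_c: float, tassall: float) -> float:
    """TCH congestion rate in %."""
    return (cnrelcong + tf_rel_c + th_rel_c) / tassall * 100.0


if __name__ == "__main__":
    # Hypothetical counters for one cell over a 60 minute period.
    traffic = tf_traffic(tftralacc=3600, tfnscan=360)
    print(traffic)
    print(tf_meanh(traffic, perlen_min=60, tfmsestb=800))
    print(t_cong(cnrelcong=10, tf_rel_c=5, th_rel_c=5, tassall=1000))
```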

7.3 Analyse

7.3.1 Fish bone diagram for the root cause analysis for high TCH Congestion Rate

[Fish bone diagram: TCH Congestion Rate. Branches (features): Half Rate; Multiband; HCS; CLS; Assignment to another cell.]

Figure 10: TCH Congestion Rate and Traffic Features

[Fish bone diagram: TCH Congestion Rate. Branches (capacity solutions): Micro Cell Layer; TRU Upgrade Plan; Site Density; RF Optimization (offload traffic); Indoor Solutions.]

Figure 11: TCH Congestion Rate and Capacity Solutions

• Half Rate: GSM half rate is a feature that relieves congestion using the existing radio resources. Speech and data rates for half rate are 6.5 kbps and 4.8 kbps respectively. This is achieved by making two mobiles share the same TDMA timeslot: if the first mobile uses timeslot number 3 of TDMA frame 0, it will TX/RX again on timeslot number 3 of TDMA frame 2, leaving timeslot number 3 of TDMA frame 1 for the second mobile that shares this timeslot. GSM half rate is a good capacity solution, at the expense of the inferior speech quality inherent in half rate coding. The table given below describes the parameters controlling half rate.

7.3.2 Half Rate Parameters

Parameter Name | Default Value | Recommended Value | Range | Comment
DHA | OFF | ON | ON/OFF | Turns the dynamic half rate feature ON or OFF at the cell level.
DTHAMR | 30 | 30 | 0 to 100 | Threshold (%) for an AMR HR capable MS at channel allocation, below which an AMR MS will be allocated on an HR channel.
DTHNAMR | 15 | 15 | 0 to 100 | Threshold (%) for a non AMR HR capable MS at channel allocation, below which a non AMR MS will be allocated on an HR channel.
DMQB | OFF | ON | ON/OFF | Switches ON or OFF quality based switching from HR to FR (move an existing HR call to FR if the quality reported is below the threshold).
DMQBAMR | 50 | 50 | 0 to 100 | Switching threshold for triggering a quality based handover from HR to FR for an AMR capable MS (based on UL and DL reported RxQual).
DMQBNAMR | 45 (RxQual of 4.5) | 45 | 0 to 100 | Switching threshold for triggering a quality based handover from HR to FR for a non AMR MS (based on UL and DL reported RxQual).
DMQG | OFF | ON | | Switches ON or OFF quality based switching from FR to HR (move an existing FR call to HR if the quality reported is below the threshold).
DMQGAMR | 35 | 35 | 0 to 100 | Switching threshold for triggering a quality based handover from FR to HR for an AMR capable MS (based on UL and DL reported RxQual); < DMQBAMR.
DMQGNAMR | 30 | 30 | 0 to 100 | Switching threshold for triggering a quality based handover from FR to HR for a non AMR MS (based on UL and DL reported RxQual); shld ...

Table 7: Parameters controlling Half Rate

The problem with excessive use of Half Rate in a network is speech quality: HR has inferior speech quality compared to FR, and it is strongly recommended to keep HR usage in a network below 20%.
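One way to track the 20% guideline is to compare the half rate and full rate carried traffic derived from the traffic level accumulators. The sketch below is an assumption about how "HR usage" might be measured (as a share of carried Erlangs); this document does not define the exact formula, and the function names and sample counter values are hypothetical.

```python
def half_rate_usage_percent(tftralacc: float, tfnscan: float,
                            thtralacc: float, thnscan: float) -> float:
    """Share of carried TCH traffic (Erlangs) that is half rate, in %.

    Assumes usage is measured on traffic, with traffic = accumulator / scans
    for each of the full rate and half rate counter pairs.
    """
    full_rate_erlangs = tftralacc / tfnscan
    half_rate_erlangs = thtralacc / thnscan
    total = full_rate_erlangs + half_rate_erlangs
    if total == 0:
        return 0.0
    return half_rate_erlangs / total * 100.0


def exceeds_hr_guideline(usage_percent: float, limit_percent: float = 20.0) -> bool:
    """True when HR usage is at or above the recommended ceiling."""
    return usage_percent >= limit_percent


if __name__ == "__main__":
    usage = half_rate_usage_percent(2880, 360, 720, 360)
    print(usage, exceeds_hr_guideline(usage))
```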

7.3.3 Mean Opinion scores for Half Rate

[Chart: Mean Opinion Scores for Half Rate, Experiment 1b - Test Results. Vertical axis: MOS (1.0 to 5.0). Horizontal axis: conditions (No Errors, C/I = 19 dB, 16 dB, 13 dB, 10 dB, 7 dB, 4 dB). Series: EFR; AMR codec range 7.95, 7.4, 6.7, 5.9, 5.15 and 4.75; FR; HR. The chart shows call quality measured against a scale of 0-5 (where ‘5’ is perfect and ‘0’ is unacceptable); the C/I scale shows how the different speech codecs perform as the user experiences high levels of interference.]

MOS values from the chart's data table, listed left to right as extracted (rows with fewer than seven values cannot be aligned to the condition columns with certainty):

EFR: 4.21 4.21 3.74 3.34 1.58
7.95: 4.11 4.04 3.96 3.37 2.53 1.60
7.4: 3.93 3.93 3.95 3.52 2.74 1.78
6.7: 3.94 3.90 3.53 3.10 2.22 1.21
5.9: 3.68 3.82 3.72 3.19 2.57 1.33
5.15: 3.70 3.60 3.60 3.38 2.85 1.84
4.75: 3.59 3.46 3.42 3.30 3.10 2.00
FR: 3.50 3.50 3.14 2.74 1.50
HR: 3.35 3.24 2.80 1.92

Figure 12: Mean Opinion Scores for Half Rate

It is interesting to see that HR consistently performs worse than FR, but towards very low C/I zones HR shows slightly better speech quality. That possibly explains why Ericsson also introduced a quality based trigger from FR to HR.

7.3.4 Why is Quality so important?

[Pie chart: Current Carrier X subscribers who switched within the past 12 months, by reason for switching.]

Better service prices or phone prices: 21%
Better service quality/quality of carrier's network: 19%
To get additional product features or services: 11%
To take advantage of a promotion or sale: 8%
Better local calling area coverage: 6%
They had particular brand or type of phone I wanted: 6%
Better customer service: 6%
Other: 23%

25% of current Carrier X customers who changed providers report switching due to network performance factors (Network Performance Factors = 25%).

Figure 13: Network Quality affecting customer satisfaction

• Cell Load Share (CLS): CLS makes it possible to offload a “highly loaded” cell before it becomes congested. This is done by triggering congestion based handovers to neighbouring cells (if the neighbouring cell allows incoming CLS handovers) when the TCH channel utilization within the cell reaches a definable percentage.

7.3.5 Parameters controlling CLS

Parameter Name | Default Value | Recommended Value | Value Range | Comment
CLSLEVEL | 20% | | 0 to 99 | % of idle TCHs at or below which calculations for congestion relief handover take place.
CLSACC | 40% | | 1 to 100 | % of idle TCHs at or below which no handover due to CLS is accepted by a target cell.
CLSRAMP | 5 s | 8 | 0 to 30 | Time interval during which "handover hysteresis to the target cell is reduced" when the cell is hit ...
HOCLSACC | OFF | ON | ON, OFF | Enables or disables incoming handover due to load sharing.
RHYST | 75 | 100 | 0 to 100 | Handover hysteresis (dB margin between source and target cell) reduction parameter (final reduction of hysteresis as ...
CLSTIMEINTERVAL | 100 ms | 100 | 100 to 1000 | A BSC parameter; determines the "cycle time" for "load monitoring".
EBANDSINCLUDED | OFF | | ON, OFF | BSC parameter; include or exclude E-GSM cells.

Table 8: Parameters Controlling CLS
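The interaction between CLSLEVEL, CLSACC and HOCLSACC can be sketched as two simple checks: does the loaded cell start evaluating CLS handovers, and does a neighbour accept them. This is a simplified Python illustration based only on the parameter comments in Table 8 (with the idle-TCH comments read as belonging to CLSLEVEL and CLSACC); the class and function names are hypothetical, and the real BSC algorithm also involves RHYST, CLSRAMP and the locating ranking, which are not modelled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    name: str
    tch_defined: int
    tch_idle: int
    clslevel: float = 20.0   # % idle TCH at or below which CLS evaluation starts
    clsacc: float = 40.0     # % idle TCH at or below which no CLS handover is accepted
    hoclsacc: bool = False   # incoming load sharing handovers allowed

    def idle_percent(self) -> float:
        return self.tch_idle / self.tch_defined * 100.0


def wants_to_offload(cell: Cell) -> bool:
    """Source side: idle TCH share has fallen to CLSLEVEL or below."""
    return cell.idle_percent() <= cell.clslevel


def accepts_cls_handover(target: Cell) -> bool:
    """Target side: HOCLSACC is ON and idle TCH share is above CLSACC."""
    return target.hoclsacc and target.idle_percent() > target.clsacc


def cls_targets(source: Cell, neighbours: List[Cell]) -> List[str]:
    """Names of neighbours that could take load sharing handovers right now."""
    if not wants_to_offload(source):
        return []
    return [n.name for n in neighbours if accepts_cls_handover(n)]


if __name__ == "__main__":
    source = Cell("A1", tch_defined=16, tch_idle=2)
    neighbours = [
        Cell("B1", 16, 10, hoclsacc=True),
        Cell("B2", 16, 4, hoclsacc=True),
        Cell("B3", 16, 12, hoclsacc=False),
    ]
    print(cls_targets(source, neighbours))
```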

Figure 14: Uneven distribution of TCH congestion due to uneven distribution of traffic

Traffic distribution density is never uniform. The most effective way to use congestion relief features like CLS is to identify high congestion density sites within a BSC and, where there are uncongested and reliable first tier neighbours (good RF transition at the cell boundary) in the near vicinity, make use of them. If the transition boundary between the source and target cells is weak, excessive use of CLS can increase dropped calls and “ping pong” handovers.

• Hierarchical Cell Structure (HCS) (see footnote 37): With this feature, cells (layers) can be given priority over stronger cells. It is based on the theory that “cells with comparatively weaker signal strength may provide valuable capacity while the interference level is lower”. For example, 1800 band cells, which have a larger spectrum width but inherently higher propagation loss, can be preferred (given higher priority for setting up the call) over 900 band cells, which suffer from higher interference levels due to limited frequency spectrum width, and so be used as the capacity layer in dense urban to urban clutter classes. HCS can also be used to create a preferred layer for localised capacity solutions such as a micro cell layer, which usually provides localised capacity within a macro layer. HCS can be used to create up to 8 such capacity layers with varying “preferred status”.

Figure 15: Layering of cells with traffic preference for the lowest layer

In this figure, 1800 band micros are given the highest preference for carrying traffic (layer 3), followed by 900 band micros, 1800 band macros and finally 900 band macros. This ensures that the micro layer and the 1800 band macro layer form the capacity layer, carrying say 60 to 70% of the net traffic, while the spectrum limited (interference/quality prone) 900 band macros carry only 30 to 40% of the net traffic.

This layering also allows the 1800 macros, which have a higher propagation path loss than 900 band cells and therefore inherently show a weaker (cleaner) footprint, to carry more traffic than in a non layered RF architecture.

37 Two HCS options are available: one with 8 HCS bands and 8 layers (also called the full HCS feature), or a stripped down version with no HCS bands and 3 layers.

7.3.6 Parameters controlling HCS

Parameter Name | Default Value | Recommended Value | Value Range | Comment
LAYER | 2 | | 1 to 7 | The lower the layer number of the cell, the higher its preference within the same band.
LAYERTHR | 75 | | 150 to 0 (-dBm) | Threshold (RxLev DL) at which transfer of a call happens across different layers.
LAYERHYST | 2 | 2 | 0 to 63 (dB) | Hysteresis (handover margin in dB) between two cells from different layers.
HCSBAND | 2 | | 1 to 8 | Band number; the lower the band number, the higher the band preference.
HCSBANDTHR | 95 | | 150 to 0 (-dBm) | Threshold (RxLev DL) for transferring a call across two different bands.
HCSBANDHYST | 2 | 2 | 0 to 63 (dB) | Hysteresis (handover margin in dB) between two cells from different bands.
PSSTEMP | 0 | 0 | 0 to 63 (dB) | Threshold applied to the lower numbered (preferred) band/cell; specifically for fast moving traffic into smaller cells.
PTIMTEMP | 0 | 0 | 0 to 600 | Duration for which PSSTEMP is applied to the smaller neighbour, from the moment it got reported as one of the ...
FASTMSREG | OFF | | ON/OFF |
THO | 30 | | 10 to 100 (s) |
NHO | 3 | | 2 to 10 | Forces fast moving mobiles to a higher layer when the "number of handovers" exceeds NHO for the last ...

Table 9: Parameters controlling HCS

Keep micros in the lowest layer. If it is an external micro with a sizeable fraction of fast moving mobiles in its footprint, make use of the parameters that penalise lower layered cells (PSSTEMP, PTIMTEMP, FASTMSREG, THO and NHO) to prevent drag and drop scenarios by fast moving mobiles. Keep 1800 band cells at a lower layer than 900 band cells. Cells within the same band that suffer from excess TCH congestion should be kept at a higher layer than neighbouring cells that do not show high TCH congestion.

• Assignment to another cell: This feature operates at call setup, immediately after the SDCCH is assigned. It takes the call from the SDCCH of the source cell to a TCH of a target cell, thus bypassing TCH congestion at the source. This process is called assignment to “worse cell”, the target cell being identified by the locating algorithm in the BSC from the best reported (signal strength) neighbours with an available TCH. Another application of assignment to another cell (not dealing with TCH congestion) is assignment to “better cell”, where the locating algorithm kicks in at a very early stage, even before the MS has landed on a TCH. At times an MS camps on an incorrect cell (especially mobiles at weak cell boundaries), either because idle mode reselection is slow due to incorrectly set parameters or because the MS is physically moving fast; if it makes a random access in such a situation, it can land on an inappropriate cell. With the feature assignment to another cell, such an MS is handed over directly to the better cell (this part of the feature has little impact on TCH congestion, though it can bring down the TCH drop rate at the source cell).

7.3.7 Parameters controlling Assignment to another cell

Parameter | Default Value | Recommended Value | Value Range | Comment
CAND | BOTH | BOTH | AWN, NHN, BOTH | Per neighbour (permission) setting: AWN = neighbour is a possible candidate for better or worse cell and not for normal HO; NHN = better and normal (no worse); ...
AWOFFSET | 3 | 10 (dB) | 0 to 63 | Congestion relief handover margin from the source cell to the target cell.
HNDSDCCHTCH | 1 | 1 | 0, 1 | Determines if inter-MSC and inter-BSC SDCCH to TCH handovers are permitted.
IBHOASS | OFF | ON | ON, OFF | BSC parameter; determines if assignment to another cell (ASSOC) can be done towards another BSC.
AW | OFF | ON | ON, OFF | Enables/disables Assignment to Worse Cell (if ASSOC=ON).
ASSOC | OFF | ON | ON, OFF | Turns the feature "Assignment to another cell" ON or OFF.

Table 10: Parameters controlling Assignment to another cell

High values of AWOFFSET can trigger congestion relief handovers to neighbour TCHs with very weak signal strength, which can result in a TCH drop on the target (neighbour) cell. However, if the RF transition between the source and the neighbour is reliable (check the per neighbour handover success rate; the higher this figure, the more reliable the neighbour), higher values such as 10 dB can be used for the per neighbour definition of AWOFFSET. This brings down the TCH congestion rate at the source cell.

• Multiband Operation: The feature “Multiband cell” creates a “cost efficient high capacity network”, often consisting of a 900 band coverage layer and an 1800 band capacity layer. Assignment to the higher band (1800 MHz) is based on the Class Mark 3 (CM3) information (footnote 38) about the MS that is available at the BSC and MSC. In both idle and active mode the mobile measures its neighbours; in idle mode the MS uses the BA list sent down in system information to learn which neighbours it should be monitoring (MBCCHNO is the cell parameter that gives the frequencies of the neighbouring cells). One of the critical parameters when using the multiband cell feature is MBCR; in active mode MBCR determines how many of the decoded neighbours from the “currently non operational band” must be reported in the Measurement Report (SACCH) for the neighbour “locating” process. The recommended setting for MBCR is 2, which means that an MS served by a 900 band cell will report 2 possible neighbours from the 1800 band irrespective of their signal strength w.r.t. the available 900 band neighbours. All in all, in active mode the MS reports the six best (signal strength based ranking) possible neighbours for the handover decision (locating algorithm) by the BSC. With MBCR=2 we make sure that at all times in a dual band network 33% of the potential neighbours reported to the BSC for handover decision making are higher band cells. This approach is needed because higher band cells are more prone to RF propagation loss (free space propagation loss is proportional to the square of the operating frequency, which amounts to about 6 dB more loss at 1800 MHz than at 900 MHz), so in far off zones a higher band cell will be seen to be weaker (signal strength) than a co-located lower band cell. 1800 band cells are seen to be 6 to 12 dB weaker than 900 band cells in far off zones. The difference in signal strength depends a lot on clutter class and distance from the cell site (conversely, in dense urban clutter with a site to site spacing of 450 m to under 1 km, it is usually seen that the 1800 band exhibits better signal strength than the 900 band; this is because 1800 band frequencies have a smaller wavelength than 900 band frequencies and hence show better indoor penetration, and precisely the same reason also makes 1800 band cells lossier in other clutter classes).

38 If the BSC parameter ECSC=YES, then during a “call setup attempt” or a “location area updating process” the MS will send its frequency band and transmit power capabilities to the BSC. With ECSC=YES the MS will also send its “Class Mark Change” (CM2) capability as soon as possible after network access; CM2 defines the MS's capability to change bands while in a call. Depending on another BSC parameter, CLMRKMSG, the BSC will send the CM2 information to the MSC; the MSC in turn uses this information if the MS makes an inter-MSC or inter-BSC handover and transfers the CM2 capabilities to the new BSC. One drawback of sending the CM information of all active mode mobiles in a BSC back to the MSC is the “increase in load” on the A-interface (E1 connection/s from BSC to MSC). The load on the A-interface can be considerably reduced by not permitting the BSC to send the CM information to the MSC (CLMRKMSG=1), but in that case, when an MS makes an inter-BSC handover, the new BSC will treat it as “single band” capable and will not change its operational band until the end of the call. CLMRKMSG settings: =0 (BSC will always transfer CM information to MSC); =1 (will not transfer CM information to MSC); =2 (BSC delays the transfer of CM information to MSC until an inter-BSC handover takes place; if EGSM is not used in the network, 2 is the recommended setting); =3 (do not send the CM message for a location updating, but send it to the MSC for a call setup).
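The effect of MBCR on the measurement report can be illustrated with a small sketch. It assumes a simplified rule based only on the description above: up to MBCR of the strongest neighbours from the non-serving band are reported first, and the remaining positions (six in total) are filled with the strongest neighbours left over. The function name, the tuple layout and the sample levels are hypothetical and are not taken from any Ericsson tool.

```python
def reported_neighbours(neighbours, serving_band, mbcr=2, max_reported=6):
    """Pick the neighbours an MS would report under the simplified MBCR rule.

    neighbours: list of (name, band_mhz, rxlev_dbm) tuples (hypothetical layout).
    """
    by_level = sorted(neighbours, key=lambda n: n[2], reverse=True)
    # Reserve up to `mbcr` positions for the strongest cells of the other band.
    other_band = [n for n in by_level if n[1] != serving_band][:mbcr]
    # Fill the remaining positions with the strongest cells left over.
    rest = [n for n in by_level if n not in other_band]
    return other_band + rest[: max_reported - len(other_band)]


# Hypothetical measurements for an MS served by a 900 band cell.
neighbours = [
    ("A900", 900, -70), ("B900", 900, -72), ("C900", 900, -75),
    ("D900", 900, -78), ("E900", 900, -80), ("F900", 900, -82),
    ("G1800", 1800, -90), ("H1800", 1800, -95), ("I1800", 1800, -99),
]
report = reported_neighbours(neighbours, serving_band=900)
print([name for name, _, _ in report])
```

In this sample the two 1800 band cells are reported although six 900 band cells are stronger, so two of the six reported neighbours (33%) are higher band cells.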

To build an efficient capacity layer that brings down high TCH Congestion on 900 band cells in the network, make use of Idle Mode Reselection, HCS, CLS and the Locating Algorithm.

Idle Mode Reselection: make use of ACCMIN and CRO to give preferred status to the 1800 band cells w.r.t 900 band neighbours. (E.g.: 900band /1800 band; ACCMIN=104, CRO=0 / 2).

HCS: If the network uses the “full HCS feature”, place 900 band and 1800 band cells in different HCS bands and set HCSBAND to a lower value for 1800 band cells than for 900 band cells (this gives preferred status to 1800). Also set HCSBANDTHR to 95 (-95 dBm) for the 1800 HCS band; with such a low band threshold for the 1800 band, make sure that the signal strength filters are set to 3 to 5 s at most. This is most relevant for fast moving MSs, where the 1800 signal at cell edges is seen to fade at a rapid rate (more than 10 dB in 5 seconds); in such cases HCSBANDTHR is recommended to be 85 (-85 dBm).

If full HCS feature is not available, differentiate these two bands by defining different LAYER settings and a lower LAYERTHR and LAYERHYST settings for 1800 band.

Bottom line is, creating an 1800 capacity layer is one of the best available capacity solutions to keep TCH congestion in control, but on the negative side, if the 1800 band to 900 band transitions and 900 to 1800 band transitions are not optimised properly, it can lead to increased TCH drop rate in a network.

Counters that could be used to measure dual band performance:

• TFDUALTRALACC: Traffic level accumulator, number of seized channels for dual band.
• TFDUALCASSALL: Number of assignment “COMPLETE” for all MS power classes.
• TFDUALASSALL: Number of assignment “ATTEMPT” for all MS power classes.
• TFDUALNDROP: Number of abnormally dropped dual band connections.


8. TCH Drop Rate

8.1 Define

TCH drop (or a dropped call) can be broadly classified into 3 sub classes:

1. Degradation of the links (Uplink and Downlink): either degradation of signal strength, which falls near or below the sensitivity of the base station (around -110 dBm) or of the mobile (around -104 dBm), or degradation of the quality of the links (Uplink and Downlink), often due to interference.

2. Excess TA (TA>63 or excess path imbalance due to high TA).

3. Other Reasons.

[Pie chart: BSC 1 TCH Drop Cause. Slices (legend order): Drop Reason Low SS DL (%), Low SS UL (%), Low SS UL/DL (%), Bad Quality DL (%), Bad Quality UL (%), Bad Quality UL/DL (%), Suddenly Lost Connections (%), Excessive TA (%), Other (%). Data labels (%): 1.25, 36.56, 23.75, 1.56, 1.56, 0.63, 32.19, 0, 2.5.]

Figure 16: an example of distribution of cause of TCH Drop in a BSC

8.2 Measure

Use the following available stats to measure the nature of the drop call rate in a cell / BSC:

• Drop Reason: Low Signal Strength DL %.
• Drop Reason: Low Signal Strength UL %.
• Drop Reason: Bad Quality DL %.
• Drop Reason: Bad Quality UL %.
• Drop Reason: Sudden Lost Connections %.
• Drop Reason: Excess TA %.
• Drop Reason: Other %.


The raw counters given in table 11 can also be used to further analyse the cause of TCH drop. Always rank the highest droppers (TNDROP) in a network by traffic and not just by drop call rate. As an example, take two cells in a network: Cell A dropping at a rate of 20% with a net carried traffic of 30 calls a day (say 0.025 Erlang) and Cell B dropping at a rate of 3.5% with a net carried traffic of 10,000 calls a day (say 42 Erlang). Although the TCH Drop Rate for Cell B is only 3.5% compared to 20% for Cell A, Cell B drops 350 calls a day whereas Cell A drops only 6 calls a day. Thus fixing the TCH drop issue on Cell B will show a larger percentage of improvement at the BSC level than fixing Cell A.
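The arithmetic behind this ranking can be checked with a short sketch. It uses only the two example cells and their figures from the paragraph above (20% of 30 calls, 3.5% of 10,000 calls); the dictionary layout and function name are illustrative assumptions.

```python
# Example cells from the text: daily calls and TCH drop rate in percent.
cells = {
    "Cell A": {"calls_per_day": 30, "drop_rate_pct": 20.0},
    "Cell B": {"calls_per_day": 10_000, "drop_rate_pct": 3.5},
}


def dropped_calls(calls_per_day, drop_rate_pct):
    """Absolute number of dropped calls per day."""
    return calls_per_day * drop_rate_pct / 100.0


# Rank by absolute drops (traffic weighted), not by drop rate alone.
ranking = sorted(cells, key=lambda c: dropped_calls(**cells[c]), reverse=True)
for name in ranking:
    print(name, dropped_calls(**cells[name]))
```

Ranked this way Cell B (350 dropped calls a day) comes ahead of Cell A (6 dropped calls a day), even though its drop rate is much lower.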

8.2.1 STS counters for TCH drop

| Counter | Meaning | Comment |
| TF_TRAFFIC | TF_TRAFFIC = TFTRALACC / TFNSCAN | In Erlangs |
| T_DR_S | T_DR_S = (TN_DROP / N_CALLS) * 100 | % TCH drop rate (DCR) |
| N_CALLS | N_CALLS = I_CALLS + Inc(HO-AB-AW) - Outg(HO-AB-AW) | Net sum of terminated calls in cell |
| TNDROP | TN_DROP = TFNDROP + TFNDROPSUB + THNDROP + THNDROPSUB | Total number of drops on TCH |
| TF_REL_C | TF_REL_C = TFNRELCONG + TFNRELCONGSUB | Number of dropped TCH connections due to transcoder resource congestion (on TCH FR) at immediate assignment |
| TH_REL_C | TH_REL_C = THNRELCONG + THNRELCONGSUB | Same, on TCH HR, in UL and OL both |
| TFNRELCONG | Number of released TCH signalling connections due to transcoder resource congestion during the immediate assignment on TCH | Transcoder congestion. TFNDROP is incremented at the same time. |
| THNRELCONG | Same as TFNRELCONG, for HR | |
| TFNRELCONGSUB | Same as TFNRELCONG, for overlaid subcell FR | |
| THNRELCONGSUB | Same as TFNRELCONG, for overlaid subcell HR | |

Terms used in N_CALLS:

I_CALLS = number of initiated calls in a cell (sum of four CASSALL for TCH or CMSESTAB for SD)
Inc = sum of all incoming handovers to a cell from all its neighbours
Outg = sum of all outgoing handovers from a cell to all its neighbours
HO = number of successful handovers on TCH = HOVERSUC
AW = number of successful assignments to worse cell = HOSUCWCL
AB = number of successful assignments to better cell = HOSUCBCL

Table 11: STS raw counters for TCH drop
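As a worked illustration of the Table 11 formulas, the sketch below computes N_CALLS and T_DR_S from raw counter values. It assumes that Inc(HO-AB-AW) and Outg(HO-AB-AW) mean "successful handovers minus assignments to better and worse cells" in the incoming and outgoing direction respectively; the function names and all counter values are hypothetical.

```python
def net_calls(i_calls, inc, outg):
    """N_CALLS = I_CALLS + Inc(HO-AB-AW) - Outg(HO-AB-AW)."""
    def net_handovers(c):
        return c["HO"] - c["AB"] - c["AW"]
    return i_calls + net_handovers(inc) - net_handovers(outg)


def tch_drop_rate(counters, n_calls):
    """T_DR_S = (TN_DROP / N_CALLS) * 100, with TN_DROP summed over FR/HR, UL/OL."""
    tn_drop = sum(counters[k] for k in
                  ("TFNDROP", "TFNDROPSUB", "THNDROP", "THNDROPSUB"))
    return 100.0 * tn_drop / n_calls if n_calls else 0.0


# Hypothetical counter values for one cell and one measurement period.
incoming = {"HO": 400, "AB": 20, "AW": 10}
outgoing = {"HO": 350, "AB": 15, "AW": 5}
drops = {"TFNDROP": 12, "TFNDROPSUB": 1, "THNDROP": 7, "THNDROPSUB": 0}

n = net_calls(1000, incoming, outgoing)
rate = tch_drop_rate(drops, n)
print(n, round(rate, 2))
```

The guard on a zero N_CALLS is a design choice for cells with no traffic in the period; a real report may prefer to flag such cells instead.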


8.3 Analyze

8.3.1 Fish bone diagram for the root cause analysis for high TCH Drop Rate

[Fish bone diagram. Effect: TCH Drop Rate. Causes: Low Signal Strength DL, Low Signal Strength UL, Bad Quality DL, Bad Quality UL, High TA / RF Spillage / Path Imbalance, External Interference.]

Figure 17: Fish bone diagram for the root cause analysis for high TCH Drop Rate

[Fish bone diagram. Effect: TCH Drop Rate. Causes: Hardware Faults, Drops due to Other Reason, Power Control, Sudden Lost Connection, Handover Failures, HCS, CLS, Assignment to another cell.]

Figure 18: Fish bone diagram for the root cause analysis for high TCH Drop Rate

• Low Signal Strength UL: On average the UL is the weaker of the two links in a network; the uplink is typically seen to be about 3 dB weaker than the DL. Hence the contribution of “weak signal on the UL” to the TCH drop call rate will usually be far greater than that of “weak signal strength on the DL” (see figure 16: for BSC1, weak DL signal strength accounts for 1.25% of TCH drops whereas weak UL signal strength accounts for 36.56%). This also means that the UL in any network is more prone to interference: with I the same on both DL and UL and C lower on the UL, C/I is naturally lower on the UL.

Optimisation of the UL can be done through RF coverage optimization, hardware fault checks or parameter optimisation.


1. RF Coverage Optimization for UL: Antenna heights, tilts and azimuths, weak coverage cluster optimization, addition of more sites, use of micro cells to cover hotspots, use of TMA to give additional gain on the UL etc.

2. Hardware fault checks: Faulty antenna, high VSWR due to incorrect termination of RF cables, RF jumpers and antenna feed mechanisms, damaged connectors, damaged cables, water in external RF feed cables, faulty TRX (footnote 39) etc.

3. Parameter Optimization: In idle mode, ACCMIN and CRO can be optimised to make sure that the MS makes the correct cell reselection; also make use of the feature “Assignment to Another Cell” (the assignment to better cell part of this feature) for cases of incorrect cell reselection in idle mode. MSTXPWR defines the maximum permitted transmit power of the MS; make sure it is set to 33 (about 2 Watts) for the 900 band and 30 (1 Watt) for the 1800 band. Another set of cell parameters that influences UL signal strength performance is the UL power control settings: make sure that SSDESUL is not set too low (-95 dBm or weaker for a cell that drops excessively on the UL), that the LCOMPUL and QCOMPUL settings are not too high (increasing these two parameters leads to aggressive power control on the UL), and that the filter lengths for uplink signal strength (SSLENUL) and uplink quality (QLENUL) are set between 3 and 5 seconds for faster power revision commands from the BSS, especially for mobiles in the weak coverage areas of the cell, where the signal strength fading rate can be very rapid. Also, for cells with a high percentage of drops due to weak uplink levels, make sure that the desired quality for the UL is set very sensitive; use the parameter QDESUL to control this (keep it at 0 instead of the default 20). Power control is covered in more detail later in this document.

• Low Signal Strength on DL: The reasons for excess drops on weak DL are similar to those for drops on weak UL level. Drops due to weak levels on the DL for any cell should always be lower than those due to weak levels on the uplink; if it is the other way round, this is either because of hardware issues similar to those discussed above or due to incorrect settings of BSTXPWR. For cells with a wide footprint and a high density of traffic from far-off zones (high TA and a high share of traffic at far-off zones seen on MRR recordings), excess drops are seen due to path imbalance: idle mode reselection and the eventual channel request are based on DL measurements, which means that in such cases an MS might camp on a cell with a reliable DL but an unreliable UL. This only becomes apparent either during the RACH, AGCH and SDCCH signalling process or when the MS lands on the TCH, where it drops due to the weak and unreliable UL (rather than DL related issues). That is why power control optimization of the UL or use of TMA usually brings much better performance improvements in a network than similar work done on the DL.

• Bad Quality UL: The reasons for excess drop due to bad UL quality could be classified as Interference related (Co Channel interference , Adjacent Channel interference or External Interference) , coverage issues, handover issues or incorrect cell parameter settings.

Once again, correct power control settings at the BSC/cell level, together with a good frequency plan and the use of synthesizer frequency hopping, can bring down excess drops on the UL due to bad quality.

1. Interference: Check for co-channel, adjacent channel interference and make the relevant frequency changes.

2. Coverage: discussed in the last section.

3. Parameters: Correct settings for idle mode reselection parameters (ACCMIN and CRO), power control settings (covered later), use of DTX (footnote 40), MSTXPWR, QLIMUL (footnote 41).

4. Handovers: missing neighbours, incorrect settings for locating algorithm etc can lead to drag and drop scenarios.

• External Interference: Sources external to the GSM system but operating in the GSM frequency band, or with harmonics falling within the GSM band. Use a spectrum analyser in the field to measure and identify the source.

39 Excess VSWR is a commonly observed problem in networks with a weak maintenance plan. High reflected power back to the radios often damages the sensitive Low Noise Amplifiers in the Rx path, causing excess UL related drops.

40 Use of Discontinuous Transmission on the UL (DTXU=0, MS may use discontinuous transmission) brings down interference on the UL.

41 QLIMUL determines the urgency condition for UL quality based handover; the default setting is 55 (i.e. an RxQual of 5.5 for 4 SACCH periods leads to an urgency based handover). If quality drops on the UL are seen to be very high, reducing QLIMUL is seen to bring down the TCH Drop Rate.


• High TA / RF Spillage and Path imbalance: Covered under the section of Low Signal Strength UL.

• Drops due to Other Reason: This is often a difficult one to crack; excess TCH drop due to other reason could be due to the following reasons:

1. Transcoder synchronization fault: the counter TRASYNCCOUNT is incremented when a TRA sync fault is reported by the BTS on any of the timeslots within the TG (footnote 42).

[Chart: Relationship between TCH Drop Rate and Transcoder Synchronization Faults. Daily values from 08-Nov to 03-Dec; series TCH_DROP and TRASYNC plotted on two vertical axes (0 to 13 and 0 to 260, one of them labelled %).]

Figure 19: Relationship between increase of TCH Drop Rate and the increase of transcoder synchronization fault (TRASYNCCOUNT from STS MOTG object type)

2. Faulty transcoder (footnote 43).

3. TRAB (vocoders) congestion; use the counters TFNRELCONG and THNRELCONG to check this.

4. Blocked A-interface (RALT) and ABIS (RBLT) devices causing A and ABIS interface problems.

5. C7 link problems (link unavailability, high BER).

6. LAPD problems (link failures, high BER).

7. If LAPD problems (excess T200 timer expiry, footnote 44) are not due to transmission related issues (congestion, availability or high BER due to interference), reset the DXUs on the TG that is showing LAPD problems.

42 TG or Transceiver Group is an MO (managed object) within the RBS.

43 If a transcoder card is seen to be faulty, use the “transcoders in a pool” feature to limit localised high TCH drop in one BSC by distributing the issue across cells from more than one BSC until the faulty transcoder card is replaced.

44 The T200 timer comes into the picture mainly during call setup, assignment and handover (signalling part); it monitors the “decodability” of the signalling information (from SDCCH, FACCH and SACCH) on the signalling frames. If the information in the frames is not decodable the T200 timer is decremented, and if it is continuously decremented by a figure specified by N200 (23 for SDCCH or approximately 220 ms, 34 for FACCH used for handover or approximately 115 ms, or 5 for SACCH or 900 ms), the BTS will send a Layer 3 message “Error indication - abnormal release”, reason unspecified (cause value = 1), to both MS and BSC along with the T200 expiry. Both these messages can be read from the Layer 3 messages on the MS (TEMS) or from CTR measurements at the BTS. The BSC in turn will drop this call/setup attempt, increment the counter CNDROP and treat it as a drop due to “other reason”. Interference or an incorrect ACCMIN setting is often the cause of excess T200 expiry in a cell; but if all three sectors show excess T200 expiry (T200 expiry common to the site), then the cause is often transmission related.


8. Intermittent digital path (DIP) quality problems on transmission networks caused by interference on the transmission networks.

9. Caused by usage of features like BTS/MS power control (high discrepancy in power control settings in a BSC causing excessive CP load) or Adaptive Configuration of Logical Channels.

10. Very high load of Location Updating Requests within the BSC.

11. LAPD Concentration factor (footnote 45) set to too high a value, resulting in LAPD congestion.

• TCH drop due to Sudden Lost Connection: This can be due to almost all the reasons from TCH drops due to other reason plus RF coverage issues. An example of TCH drop due to sudden lost connection could be a subscriber entering a lift while in conversation; apart from these reasons check also for loose feeder cable/jumper cable connections to the antenna, connectors and to the RBS.

• Hardware faults: Faulty antenna, RF cabling issues, VSWR, BTS faults (check alarms).

• Handover: Missing neighbours, swapped sectors, locating algorithm needing corrections, high number of co-BSIC/BCCH combinations in a cell, RF coverage issues etc. This topic is covered in detail in the next section.

• CLS: Aggressive use of CLS, especially for cells with weak or interfered neighbours, can cause high TCH Drop Rate (check the per neighbour handover success rate and only then optimise CLS settings).

• HCS: In a dual band network, HCS parameters are often set too sensitive in order to make the 1800 band carry more traffic, leading to drops on the higher band. Very low multiband parameters such as LAYERTHR and HCSBANDTHR often lead to a high TCH drop rate on the 1800 layer.

• Assignment to Worse Cell: This feature brings down congestion, but in cases where the neighbours are weak or interfered it can do so at the expense of a higher TCH Drop Rate. Check the per neighbour handover success rate before implementing this feature.

• MS and BS Power Control: Power control is usually seen to have a huge impact on both SDCCH and TCH drops, especially on the UL, improving both the call setup success rate and the drop call rates on SDCCH and TCH. Excess transmitted power, on the downlink by the BS as well as on the uplink by the MS, often causes proportional excess interference on the links.

The adverse impact of excess transmit power from calls originating close to the site on calls originating far from the site is usually acute; the reasoning is that the received signal strength at the MS for far-off traffic is weak, due to excess fading over the longer propagation distance, and hence more vulnerable to interference. Controlling the high percentage of excess transmit power close to the site can bring down the interference levels on the uplink (and downlink).

45 LAPD Concentration factor: The main components of an RBS are the TRUs (the radios that set up and sustain a call from the MS), which are connected to a DXU that acts as the interface/switching unit between the RBS and the E1 transmission connection to the BSC (this E1 connection is also referred to as a DIP or a PCM). One E1 connection to an RBS (terminated on the DXU of the RBS) has 32 PCM timeslots in all, each of which can carry the speech information of 4 air speech timeslots from a TRU; that is, each timeslot on the E1 has a bandwidth of 64 kbps (and hence 4 air speech timeslots at 16 kbps are mapped onto one PCM timeslot). Each TRU also needs a whole 64 kbps timeslot on this E1 link for FACCH, BCCH, SCH, CCCH (paging), SDCCH and SACCH; this is because only TCH information is sent at the sub rate of 16 kbps, whereas signalling is not sub rated and does not pass through the transcoder. One whole PCM timeslot for signalling per TRU reduces the bandwidth available for TCH traffic on the E1 connection between RBS and BSC. That is why a component called CON (concentrator) is used in the RBS. CON is connected directly to the DXU and all the signalling timeslots from the TRUs are routed through CON to the DXU. CON defines the priority of the signalling information and makes “sharing of a PCM timeslot between TRUs” possible; this is controlled by the cell parameter CONFACT (default value = 1, range = 1 to 4). With a high setting, say CONFACT=4, FOUR TRUs from the RBS share ONE PCM timeslot on the E1 instead of using 4 PCM timeslots. Such settings increase the number of TRUs that can be equipped on one E1 between the BSC and the RBS, but at the same time put immense load on the signalling bandwidth (bringing down signalling performance, increasing TCH drops due to other reason, lowering paging success rate etc.), especially for cells that carry a lot of traffic.
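The trade-off described in footnote 45 can be made concrete with a rough dimensioning sketch. It rests on assumptions that are not stated in this document: each TRU carries 8 air timeslots, 31 of the 32 PCM timeslots are usable (one being reserved for E1 framing), and nothing else is carried on the link. The function names are hypothetical, and real dimensioning rules may differ.

```python
import math


def pcm_timeslots_needed(trus, confact, air_ts_per_tru=8, air_ts_per_pcm_ts=4):
    """PCM timeslots needed on the E1 for traffic plus LAPD signalling."""
    # Four 16 kbps air timeslots share one 64 kbps PCM timeslot.
    traffic = math.ceil(trus * air_ts_per_tru / air_ts_per_pcm_ts)
    # CONFACT TRUs share one signalling timeslot.
    signalling = math.ceil(trus / confact)
    return traffic + signalling


def max_trus_per_e1(confact, usable_pcm_ts=31):
    """Largest TRU count that fits on one E1 (assumed 31 usable timeslots)."""
    trus = 0
    while pcm_timeslots_needed(trus + 1, confact) <= usable_pcm_ts:
        trus += 1
    return trus


for confact in (1, 2, 4):
    print(confact, max_trus_per_e1(confact))
```

Under these assumptions CONFACT=1 fits 10 TRUs on one E1 and CONFACT=4 fits 13, which shows the capacity gained in exchange for the heavier load on each shared signalling timeslot.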


8.3.2 Power control parameters for MS

| Parameters | Default | Recommended | Range | Unit | Comment |
| REGINTUL | 5 | 1 | 1 to 30 | SACCH periods | Time interval (min) between two power orders |
| INIDES | -70 | -80 | -100 to -47 | dBm | SDCCH: initial desired signal strength for SDCCH (keeping it on -70 can cause drop on ... |
| QDESUL | 20 | 0, 10 or 20 | 0 to 70 | dtqu | Desired UL quality; if aggressive power control of UL SS is used, keep it at 0 (sensitive on ... |
| QLENUL | 8 | 3 | 1 to 20 | SACCH periods | Length of quality filter on UL; keep it low to react quickly (increase power) to degradation of RxQual |
| INILEN | 2 | 2 | 0, 2 to 5 | SACCH periods | Filter length for SDCCH power control; for faster power adjustments on SD keep it low |
| SSLENUL | 5 | 3 | 3 to 15 | SACCH periods | Length of signal strength filter; for cells with fast moving traffic keep it low |
| QCOMPUL | 30 | 75 | 0 to 60 | % | UL quality compensation factor; high values of QCOMPUL: faster response to degradation of ... |
| LCOMPUL | 70 | 6 | 0 to 100 | % | UL path loss compensation factor; increasing LCOMP will increase aggressiveness of UL Pcon |
| SSDESUL | -85 | -95 | -110 to -47 | dBm | Desired signal on the UL if all other criteria (path balance and interference) are met |

Table 12: Power control parameters for UL

• SSDESUL: keeping it low makes sure that when the UL path loss (controlled by LCOMPUL) is low (that is, the MS is very close to the site and need not transmit at very high power to reach the BS comfortably) and the UL quality is very good (QDESUL=0, RxQual UL is 0; QCOMPUL=75), the MS is powered down to a very low transmit power, thus decreasing the overall noise floor for the BSC. This strategy of aggressive power reduction on the MS, while at the same time making the uplink extremely sensitive to high path loss and interference (power up immediately on interference or higher path loss), is seen to bring down TCH drop and SDCCH drop by a huge factor.

• At the same time, keeping the filter lengths low ensures that power change commands are sent to the MS at a faster rate, protecting a link that is under high interference due to low levels or high path loss.

• UPDWNRATIO: This parameter controls the “ratio between up and down power regulation speed”. The default value for UPDWNRATIO is 200. When using the above strategy for UL power control, keep this parameter at 300.

8.3.3 Power control parameters for BS

| Parameters | Default | Recommended | Range | Unit | Comment |
| SSDESDL | -70 | -90 | -110 to -47 | dBm | |
| QDESDL | 20 | 30 | 0 to 70 | dtqu | |
| LCOMPDL | 70 | 5 | 0 to 100 | % | |
| QCOMPDL | 30 | 55 | 0 to 100 | % | |
| REGINTDL | 5 | 1 | 1 to 10 | SACCH periods | |
| SSLENDL | 5 | 3 | 3 to 15 | SACCH periods | |
| QLENDL | 8 | 3 | 1 to 20 | SACCH periods | |
| SDCCHREG | OFF | ON | ON, OFF | | Enables power control on SDCCH |

Table 13: Power control parameters for BS
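A parameter audit against the recommended values in Tables 12 and 13 can be sketched as follows. The recommended values are copied from the two tables (QDESUL accepts 0, 10 or 20); the function name, the dictionary layout and the sample cell are hypothetical, and the sketch does not read parameters from a live BSC.

```python
# Recommended power control settings from Tables 12 (UL) and 13 (DL).
RECOMMENDED = {
    "SSDESUL": {-95}, "QDESUL": {0, 10, 20}, "LCOMPUL": {6}, "QCOMPUL": {75},
    "REGINTUL": {1}, "SSLENUL": {3}, "QLENUL": {3},
    "SSDESDL": {-90}, "QDESDL": {30}, "LCOMPDL": {5}, "QCOMPDL": {55},
    "REGINTDL": {1}, "SSLENDL": {3}, "QLENDL": {3}, "SDCCHREG": {"ON"},
}


def audit_power_control(cell_params):
    """Return {parameter: actual value} for settings that deviate from the
    recommended value(s). Parameters missing from the input are skipped."""
    return {
        name: cell_params[name]
        for name, accepted in RECOMMENDED.items()
        if name in cell_params and cell_params[name] not in accepted
    }


# Hypothetical cell that still runs mostly default values.
cell = {"SSDESUL": -85, "QDESUL": 20, "LCOMPUL": 70,
        "QCOMPUL": 75, "SDCCHREG": "OFF"}
deviations = audit_power_control(cell)
print(deviations)
```

For this sample cell the audit flags SSDESUL, LCOMPUL and SDCCHREG; whether to change them still depends on the drop cause distribution of the cell.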


9. Handover

9.1 Define

Handover in the Ericsson BSS system is controlled by the locating algorithm in the BSC. The locating algorithm operates on the basis of Measurement Reports (MR) (footnote 46) sent in by the MS on SACCH.

The inputs that the BSC uses for making a handover decision from the MRs received from the MS are the DL signal strength, the DL quality and the signal strength of the six best reported neighbours. From the serving BTS, for the same MS, the BSC uses UL signal strength, UL quality and TA.

There are two locating algorithms: E1, and the signal strength based E3. The E1 algorithm uses path loss measurements and received signal strength measurements to come to a handover decision. For both E1 and E3 the BSC goes through the same set of steps before deciding that a handover is needed; these steps are MR filtering, urgency condition detection (quality urgency or TA urgency), radio network function evaluations and basic ranking of the serving cell and the six best reported neighbours.

9.2 Measure

9.2.1 STS counters and stats for handover performance

| Counter | Meaning | Comment |
| HOVERCNT | Number of handover commands sent to the MS | |
| HOVERSUC | Number of successful handovers to the neighbouring cell | |
| HODUPFT | Number of successful handovers back to old cell within 10 seconds | |
| HOTOKCL | Handover attempt made to better K-cell; the corresponding L-cell counter is HOTOLCL, for the Ericsson 1 algorithm | |
| HOTOHCS | Handover attempt due to HCS | |
| HODWNQA | Number of handover attempts due to bad downlink quality; the HO counter for bad UL quality is HOUPLQA | |
| HOEXCTA | Number of handover attempts due to excessive timing advance | |
| HOASBCL | Number of assignment attempts to better cell | |
| HOASWCL | Number of assignment attempts to worse cell | |
| HOSUCBCL | Number of successful assignment attempts to better cell | |
| HOSUCWCL | Number of successful assignments to worse cell | |
| HOATTLSS | Number of handover attempts when serving cell is a low signal strength cell | |
| HOATTHSS | Number of handover attempts EVEN when serving cell is a high signal strength cell | |
| HOATTHR | Number of handover attempts at high handover rate | |
| HOSUCHR | Number of successful handovers at high handover rate | |
| H_LOST | H_LOST = [(HOVERCNT - HOVERSUC - HORTTOCH) / HOVERCNT] * 100 | % HO lost; HO success rate |

Table 14: Counters and stats available for Handover
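The H_LOST formula from Table 14 can be evaluated as in the sketch below. HORTTOCH is used exactly as it appears in the formula; the success rate shown alongside (HOVERSUC / HOVERCNT) is an assumed definition added for comparison, and the function names and counter values are hypothetical.

```python
def h_lost(hovercnt, hoversuc, horttoch):
    """H_LOST = [(HOVERCNT - HOVERSUC - HORTTOCH) / HOVERCNT] * 100."""
    if hovercnt == 0:
        return 0.0
    return 100.0 * (hovercnt - hoversuc - horttoch) / hovercnt


def ho_success_rate(hovercnt, hoversuc):
    """Assumed definition: successful handovers as a share of commands sent."""
    return 100.0 * hoversuc / hovercnt if hovercnt else 0.0


# Hypothetical counters for one cell: 1000 commands, 950 successes,
# 30 handovers counted by HORTTOCH.
lost = h_lost(1000, 950, 30)
success = ho_success_rate(1000, 950)
print(lost, success)
```

The same two functions can be applied to the per neighbour counters to find the cell relations that contribute most to lost handovers.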

Apart from these counters, STS also keeps a “per neighbour handover success rate” for all the available neighbours of a cell. The per neighbour handover success rate gives an immense amount of information on the performance between two neighbours, the strength/weakness of the RF overlap between two cells, interference issues at the border between two cells etc., and can be used to identify weak neighbours for RF and handover optimisation, so as to reduce TCH drops caused by handover failures.

46 Measurement Reports (MR) are sent up by the MS in every SACCH interval (480 ms) when in active mode. The BSC primarily uses the information in the MRs sent up on SACCH to make power control and handover decisions. If the BSS cannot decode MRs for the time period set by the radio link timeout counter (usually set to 16 SACCH periods, or 8 seconds), it will drop the call. A Measurement Report usually consists of the following information: BA used, DTX used, RxLev Full and Sub for the serving cell and the 6 best neighbours, RxQual Full and Sub of the serving cell, TA of the serving cell and the decoded BCCH-BSIC of the six best neighbours.

9.3 Analyze

9.3.1 Fish bone diagram for the root cause analysis for high handover failure rate

[Fish bone diagram. Effect: Poor HO Success Rate. Causes: Co-BCCH/BSIC, Missing Neighbours, TCH Congestion on target, Swapped Sectors, Weak Cell Boundary, TCH interference, Too many neighbours, Incorrect Parameters.]

Figure 20: Fish bone diagram for the root cause analysis for high handover failure rate

• Missing Neighbours: make use of a planning tool, NOX (footnote 47) and drive tests to identify and add missing neighbours.

• Co-BCCH/BSIC: Often a major contributor to handover failures in a network. An MS in active mode reports back the six best detected neighbours. (In active mode the MS measures only those BCCH frequencies supplied to it in system information from the current serving cell; this list is the neighbour list of the serving cell, also called the SACCH list of the serving cell.) The MS sorts the six best strictly on the DL signal strength of the measured neighbours and reports them to the BSC in the MR sent up on SACCH. The information on each measured neighbour consists of its RxLev and its reported BCCH and BSIC. If, within the coverage boundary of a cell, there exists another cell (spillage) with the same BCCH as the reported neighbour, it can often cause incorrect decoding of the relevant neighbour's BSIC (or no decoding, depending on the interference level from the spiller). Such a situation can lead to a handover failure and then to a TCH drop. Audit the co-BCCH/BSIC allocation within the BSC and the surrounding BSCs, and change the BSIC for cells that share the same BCCH and BSIC.

• TCH Congestion at Target: Handover signalling between two cells is performed over FACCH. FACCH operates by stealing capacity from the TCH channel that is currently serving the MS (this means the whole signalling process of handover happens on the TCH itself). When the BSC initiates a handover over the FACCH of the serving cell directly to a TCH channel of the best reported neighbour (once again using the FACCH of the neighbour) and the neighbour is congested (TCH congestion at target), the handover fails at the source cell with the cause “target cell congested”. If such a signalling failure occurs during a handover, the BSC applies a penalty of PSSHF (default setting 63, which is 63 dB, range 0 to 63 dB) for a time period of PTIMHF (default setting 5 seconds, range 0 to 600 seconds). This penalty is applied to handovers for the mobile under consideration only, and not to other mobiles in the same source cell or in other cells trying to hand over to the neighbour in question. After applying the penalty the BSC runs the locating process again, and if the same congested neighbour is still reported as the best neighbour after the period set by PTIMHF, the MS will retry the handover. Excess congestion at a target cell with a lot of inward mobility can lead to a high handover failure rate at the source cells and often to TCH drops due to a drag and drop scenario.

47 NOX: Neighbour Cell List Optimization Expert is an OSS feature from Ericsson which can detect missing neighbours by rotating test neighbours in a cell across the legal frequency band and measuring the signal strength reported by the MS for these test neighbours (BCCH/BSIC combinations). If any of the reported BCCH/BSIC combinations shows very good levels (thresholds can be preset), NOX can either add it as a neighbour or create a missing neighbour report.

• Swapped Sectors: Antenna implementation fault (with un-swapped neighbour definition).

• BCCH / TCH interference: Interference at the cell boundaries causes either difficulty in decoding corrupt MRs or signalling failures due to high interference on the TCH (on FACCH).

• Weak Cell Boundary: Poor RF coverage across the cell boundary makes decoding of SACCH and FACCH difficult, leading to handover failures or TCH drops.

• Too Many Neighbours: In a GSM system a cell can have up to 32 defined neighbours. For a BSC, however, the combination of a high traffic density, a high percentage of mobile traffic needing handovers, and a majority of cells with a very high neighbour count means “too much processing”. This can lead either to a too high CP load or to a very high amount of signalling on the transmission network (especially on the Abis), causing link congestion. For such BSCs, bringing down the average neighbour count per cell can marginally improve handover performance across the BSC.

• Incorrect Handover Parameters: Optimization of handover parameters (for both the E1 and E3 algorithms) can improve the HO success rate in a network.
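The handover-failure penalty described under "TCH Congestion at Target" can be illustrated with a minimal sketch. It assumes the penalty is simply subtracted from the signal strength used for ranking; the class and method names are hypothetical, and only the PSSHF and PTIMHF defaults come from the text.

```python
from dataclasses import dataclass, field

PSSHF_DB = 63   # penalty in dB (default quoted in the text)
PTIMHF_S = 5    # penalty duration in seconds (default quoted in the text)


@dataclass
class ConnectionPenalties:
    """Penalties held for ONE connection (one MS); other mobiles are unaffected.

    Hypothetical helper for illustration, not an actual BSC data structure.
    """
    # target cell name -> time (in seconds) at which its penalty expires
    expiry: dict = field(default_factory=dict)

    def handover_failed(self, target: str, now: float) -> None:
        """Record a failed handover attempt towards `target` at time `now`."""
        self.expiry[target] = now + PTIMHF_S

    def ranking_level(self, target: str, rxlev_dbm: float, now: float) -> float:
        """Level used for ranking: reduced by PSSHF while the penalty runs."""
        if now < self.expiry.get(target, float("-inf")):
            return rxlev_dbm - PSSHF_DB
        return rxlev_dbm


penalties = ConnectionPenalties()
penalties.handover_failed("congested_cell", now=0.0)
# Within PTIMHF the neighbour is ranked 63 dB lower for this MS only.
print(penalties.ranking_level("congested_cell", -70.0, now=2.0))
# After PTIMHF the penalty is gone and the MS may retry the same target.
print(penalties.ranking_level("congested_cell", -70.0, now=5.0))
```

With the default PSSHF of 63 dB the penalised neighbour effectively drops out of the candidate list until PTIMHF expires.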

9.3.2 E1 Algorithm for Locating (EVALTYPE=1)

| Parameter | Default | Recommended Value | Value Range | Unit | Comment |
|---|---|---|---|---|---|
| LOFFSET | 0 | 0 | -63 to +63 | dB | Link balance criterion; a positive value makes the source cell more attractive for retaining the call (stretches the call into the target), delay the … |
| LHYST | 3 | 3 | 0 to 63 | dB | Hysteresis applied to LOFFSET, to prevent ping-pong handovers. |
| TROFFSET | 0 | 0 | -63 to +63 | dB | Factor by which a cell boundary is displaced away from the target (for handover); delays handover (ping-pong reduction). |
| TRHYST | 2 | 2 | 0 to 63 | dB | Similar to KHYST, but applied over the link balance criterion of E1; reduces unnecessary handovers. |
| KOFFSET | 0 | 0 | -63 to +63 | dB | Handover margin based on the signal strength criterion; the neighbour needs to be stronger than the server by this factor to be considered in the handover calculation. |
| KHYST | 3 | 3 | 0 to 63 | dB | Used to decrease (apply a penalty to) the ranking value of a neighbouring cell (reduces ping-pong handovers). |
| BSRXSUFF | | | -150 to 0 | dBm | Sufficient received level on the UL from a neighbour for it to be considered as a candidate for handover evaluation. |
| MSRXSUFF | | | -150 to 0 | dBm | Sufficient received level on the DL from a neighbour for it to be considered as a candidate for handover evaluation. |

Table 15: E1 Locating algorithm parameters

In the E1 algorithm, the BSC uses the parameters listed above to rank handover candidates into two lists: the L-list (link balance list) and the K-list (signal strength list). This ranking is done as shown in Figure 21.

[Figure 21 labels: the L-list runs from the lowest L-rank (best cell) to the highest L-rank; the K-list runs from the highest K-rank to the lowest K-rank (worst cell).]

Figure 21: Basic Candidate List by Locating Algorithm
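The two-list ranking can be sketched in simplified form. The sketch below rests on several assumptions that the text does not spell out: neighbours whose downlink level reaches MSRXSUFF go to the L-list and are ordered by path loss (lowest first), all others go to the K-list and are ordered by signal strength (highest first), and the L-list is placed ahead of the K-list. Offsets, hystereses and the uplink criterion (BSRXSUFF) are left out, and all names are hypothetical.

```python
from typing import NamedTuple


class Neighbour(NamedTuple):
    name: str
    rxlev_dl_dbm: float   # downlink level reported by the MS
    path_loss_db: float   # estimated path loss towards this cell


def basic_candidate_list(neighbours, msrxsuff_dbm):
    """Return cell names ordered from best to worst candidate.

    Simplified illustration: no LOFFSET/LHYST/KOFFSET/KHYST corrections.
    """
    # L-list: sufficient level, ranked by path loss (lowest L-rank is best).
    l_list = sorted(
        (n for n in neighbours if n.rxlev_dl_dbm >= msrxsuff_dbm),
        key=lambda n: n.path_loss_db,
    )
    # K-list: insufficient level, ranked by signal strength (highest is best).
    k_list = sorted(
        (n for n in neighbours if n.rxlev_dl_dbm < msrxsuff_dbm),
        key=lambda n: n.rxlev_dl_dbm,
        reverse=True,
    )
    return [n.name for n in l_list + k_list]


cells = [
    Neighbour("A", -70.0, 110.0),
    Neighbour("B", -75.0, 105.0),
    Neighbour("C", -95.0, 130.0),
    Neighbour("D", -90.0, 128.0),
]
print(basic_candidate_list(cells, msrxsuff_dbm=-85.0))  # ['B', 'A', 'D', 'C']
```

In this example B outranks A despite its lower level because, once the level is sufficient, the assumed L-ranking looks at path loss only.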

This kind of listing by the E1 algorithm ensures that a handover decision takes place only when the mobile is far away from the serving cell: the handover is delayed until the link budget becomes an issue, or until an emergency handover trigger has occurred, either in terms of bad quality (QLIMUL or QLIMDL) or TA (TALIM).

9.3.3 E3 algorithm for Locating (EVALTYPE=3)

| Parameter | Default Value | Recommended Value | Value Range | Unit | Comment |
|---|---|---|---|---|---|
| OFFSET | 0 | 0 | -63 to 63 | dB | Handover margin (the neighbour should be stronger than the server by OFFSET + hysteresis to be considered for a … |
| HIHYST | 5 | 3 | 0 to 63 | dB | Handover hysteresis (the neighbour should be stronger by this factor) applied to the serving cell if it is tagged as a "high signal strength cell". |
| LOHYST | 3 | 3 | 0 to 63 | dB | Handover hysteresis (the neighbour should be stronger by this factor) applied to the serving cell if it is tagged as a "low signal strength cell". |
| HYSTSEP | -90 | | -150 to 0 | dBm | The threshold below which the serving cell gets tagged as a "low signal strength cell". |

Table 16: E3 algorithm for locating

Ericsson Algorithm 3 (the algorithm type is determined by the BSC parameter EVALTYPE) is easy to implement and optimize.
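The way the four E3 parameters interact can be sketched as follows, using the default values from Table 16. This is a minimal illustration of the rule stated in the comments (neighbour stronger than the server by OFFSET plus the applicable hysteresis); the function name, the strict inequality and the treatment of a serving level exactly equal to HYSTSEP are assumptions.

```python
def e3_is_candidate(serving_dbm, neighbour_dbm,
                    offset_db=0, hihyst_db=5, lohyst_db=3, hystsep_dbm=-90):
    """Return True if the neighbour qualifies as a handover candidate.

    Hypothetical helper: the serving cell is tagged "low signal strength"
    when its level is below HYSTSEP, and LOHYST is used; otherwise HIHYST.
    """
    if serving_dbm < hystsep_dbm:
        hysteresis_db = lohyst_db   # low signal strength cell
    else:
        hysteresis_db = hihyst_db   # high signal strength cell
    return neighbour_dbm - serving_dbm > offset_db + hysteresis_db


# Serving at -80 dBm (high signal): a 4 dB advantage is not enough, 6 dB is.
print(e3_is_candidate(-80, -76))  # False
print(e3_is_candidate(-80, -74))  # True
# Serving at -95 dBm (low signal): a 4 dB advantage already qualifies.
print(e3_is_candidate(-95, -91))  # True
```

A smaller hysteresis in the low signal region lets the MS leave a weak serving cell sooner, while the larger hysteresis in the high signal region suppresses unnecessary handovers.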

• Urgency Condition in handover: Urgency conditions are special conditions where a handover is a must even if the locating algorithm has not found a need to perform a handover to the best neighbour located in the candidate list. Bad Quality issues and excess TA issues trigger an urgency condition handover. Parameters controlling urgency condition handovers are listed in table 17.

9.3.4 Urgency condition handover trigger parameters

| Parameter | Default | Recommended | Range | Unit | Comment |
|---|---|---|---|---|---|
| TALIM (normal range) | 62 | 62 | 0 to 63 | bit period | Urgency condition due to high TA. |
| TALIM (extended range) | 62 | | 0 to 219 | bit period | |
| EXTPEN | OFF | OFF | ON, OFF | | Allows or disallows urgency condition inter-BSC handovers. |
| PTIMTA | 30 | 30 | 0 to 600 | s | Time period for which PSSTA is applied. |
| PSSTA | 63 | 63 | 0 to 63 | dB | Penalty given to the cell abandoned on high TA (for that particular connection, that MS). |
| PTIMBQ | 15 | 15 | 0 to 600 | s | Time period for which PSSBQ is applied. |
| PSSBQ | 10 | 7 | 0 to 63 | dB | Penalty given to the cell abandoned on bad quality (for that particular connection, that MS). |
| BQOFFSET | 3 | 3 | 0 to 63 | dB | Cell-to-cell symmetrical relation; the negative offset in level acceptable for an urgency handover (to a worse neighbour) on … |
| QLIMDL | 55 | 55 | 0 to 100 | dtqu | Threshold for DL quality emergency (triggers a handover if DL quality reaches an RxQual of 5.5 or worse for quality filter length … |
| QLIMUL | 55 | 55 | 0 to 100 | dtqu | Threshold for UL quality emergency (triggers a handover if UL quality reaches an RxQual of 5.5 or worse for quality filter length … |

Table 17: Urgency condition handover trigger parameters
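The urgency triggers in Table 17 can be summarised in a short sketch. It assumes that the filtered quality values are already expressed in dtqu (so that 55 corresponds to an RxQual of 5.5, as in the table comments) and that a trigger fires when the measured value reaches or exceeds the threshold; the function name and the returned cause strings are hypothetical.

```python
def urgency_causes(qual_ul_dtqu, qual_dl_dtqu, timing_advance,
                   qlimul=55, qlimdl=55, talim=62):
    """Return the list of urgency conditions raised for one connection.

    Illustrative only: inputs are assumed to be filtered values, and the
    comparison operators (>=) are an assumption.
    """
    causes = []
    if qual_ul_dtqu >= qlimul:
        causes.append("bad quality UL")
    if qual_dl_dtqu >= qlimdl:
        causes.append("bad quality DL")
    if timing_advance >= talim:
        causes.append("excessive TA")
    return causes


# Downlink quality of 6.0 (60 dtqu) exceeds QLIMDL; uplink and TA are fine.
print(urgency_causes(20, 60, 10))   # ['bad quality DL']
# Uplink exactly at the threshold and TA at TALIM both raise a condition.
print(urgency_causes(55, 10, 62))   # ['bad quality UL', 'excessive TA']
```

An empty list means no urgency condition applies and only the normal locating ranking decides whether a handover takes place.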