Maintenance Experience, Issue 252 (CDMA Network Products)


Maintenance Experience
Bimonthly for CDMA Products
No. 2, Issue 252, February 2011

Maintenance Experience Editorial Committee

Director: Qiu Weizhao

Deputy Director: Huang Dabin

Editors: Fang Xi, Wang Zhaozheng, Xu Xinyong, Xiao Shuqing, Zhang Jian, Zhang Jiebin, Zhou Guifeng, Zhao Cen, Zhao Haitao, Jiang Haijun, Xu Zhijun, Huang Ying, Ge Jun, Dong Yemin, Dong Wenbin

Technical Senior Editors: Gao Xiaoxia, Chen Zhiliang, Xue Xiaoxing

Executive Editor: Chen Lin

Maintenance Experience Newsroom

Address: ZTE Plaza, No. 55, Hi-tech Road South, ShenZhen, P.R.China

Postal code: 518057

Contact: Ning Jiating

Email: [email protected]

Tel: +86-755-26771195

Fax: +86-755-26772236

Document support mail box: [email protected]

Technical support website: http://ensupport.zte.com.cn

Preface

In pursuit of the idea “First-class service, Customer service”, and in order to make more profit and provide better service for our customers, we have edited Maintenance Experience (Issue 252) for CDMA products of ZTE. It includes 14 articles about maintenance cases of CDMA products of ZTE (MSS and ZXC10 3GCN) and special technology topics on troubleshooting common problems.

With the wide application of CDMA products of ZTE, routine maintenance of the products, a major job of the technical maintenance personnel of operators, has become more and more important. This issue, which collects articles written by CDMA technical personnel of ZTE, describes practical cases that are often found during the commissioning and maintenance, as well as the solutions. We hope it will be helpful to your routine maintenance work.

For more articles of Maintenance Experience and related technical materials, please log on to the ZTE technical support website, http://ensupport.zte.com.cn.

If you have any requirement, question, or suggestion, please feel free to contact us. Your attention and support are greatly appreciated.

Maintenance Experience Editorial Committee
ZTE Corporation
February, 2011

Contents

Analysis of High Utilization of the IP Address Pool and Low Success Ratio of PPP Connection
Analysis of A Trunk Circuit Encountering Signaling Blocking After Disaster-Recovery Changeover
Analysis of Two Outgoing H3 Messages Being Generated During Interception on A Called Subscriber in One Office
Analysis of A Failure in Suspension and Resumption for Accounting
Analysis of DNS Error During the xGW-PDSN Wap Service
Analysis of A Roaming Subscriber Failing to Send A Short Message
Analysis of the Time Recorded in CDR Suddenly Jumping About One Hour
Analysis of the SDP Decoding Failure of Incoming SIP Calls from an Interconnected IMS Office
Analysis of AAA Sending Unrecognizable User Name to PPS
Analysis of the Problem of MSCe Being Directly Interconnected With SCP


Analysis of High Utilization of the IP Address Pool and Low Success Ratio of PPP Connection

⊙ Qian Wei / ZTE Corporation

Fault Phenomenon

Figure 1 shows the alarms appearing in the system.

Figure 1. Alarm Information

Problem Analysis

(1) Single subscriber

After performing an all-subscriber signaling trace and failure observation, an engineer found that some subscribers had the failure records shown in Table 1.

psPPPGetIPRepFromDBS() Err:DB returned IPAddress is: [0],release user. IMSI = 262199506317659 4

psPPPDown() Err:User's PCF[192.168.50.90],Nego Status[2],Fail Reason[0],blRecvLCPPkt[1],blRecvIPCPPkt[1],ucType[30].IMSI = 262199506317659 4

Table 1. Failure Observation

From the first row of Table 1, the engineer inferred that the system had failed to allocate an address to the terminal. He then examined the signaling trace for the subscriber with IMSI 262199506317659, and found that the IPCP negotiation took place before the PDSN sent an LCP Termination Request message, as shown in Figure 2.


Figure 2. PDSN initiating LCP Termination Request After the failure of PPP Connection

The engineer checked the last IPCP message. When a PPP connection is established successfully, the IPCP connection request from the terminal contains the allocated IP address at the end of the IPCP negotiation. In this flow, however, the IP address was empty, as shown in Figure 3.

Figure 3. No Valid IP Address Contained in IPCP Connection Request


Figure 4. Statistics of IP Address Pool Utilization of SMP on September 10, 2010

Figure 5. Statistics on Success Ratio of the PPP Connection on September 8, 2010

(2) Performance statistics

The engineer viewed the historical records of the IP address pool utilization of the SMP in the performance statistics of the PDSN OMC, as shown in Figure 4.

As shown in Figure 4, the IP address pool utilization of CPU 4 of the SMP module (the blue line) reached 100% from 10:00 till 18:00.

The engineer then viewed the statistical records of the PPP connection success ratio on the same day, as shown in Figure 5.

As shown in Figure 5, the connection success ratio of CPU 4 of the SMP module (the red line) declined greatly in the same period, to a low of about 15%. By comparing the IP address pool utilization of CPU 4 in Figure 4 with the PPP connection success ratio of the same CPU in Figure 5, the engineer found that, between 10:00 and 18:00, the success ratio was 100% whenever the pool utilization was below 100%, and fell below 100% whenever the pool utilization reached 100%. The size of the drop depended on the number of subscribers attempting to dial up at that time.

The engineer observed the same pattern in the performance statistics of other days. Therefore, he determined that the system exception was due to exhaustion of the IP address pool of CPU 4 of the SMP module.

(3) Configuration

As shown in Figure 6, the terminals were allocated to different CPUs for processing according to the last two digits of their IMSIs. Terminals whose last two digits ranged from 00 to 49 were allocated to CPU 3, while those ranging from 50 to 99 were allocated to CPU 4. Statistically, the subscriber IMSIs can be regarded as evenly distributed. With the configuration shown in Figure 6, the subscribers were therefore allocated evenly to the two CPUs, so this configuration was correct.

Figure 6. Configuring the Relationship Between the IMSI and the Module

Figure 7. Configuration of the IP Address Pool


Figure 8. PDSN Configuration After Adding New IP Address Pools

As shown in Figure 7, the two CPUs of the SMP had different numbers of IP addresses. CPU 3 could use 126 IP addresses while CPU 4 could use only 62.

Because subscribers were evenly allocated to the two CPUs according to the IMSI, CPU 4, with fewer IP addresses, was the first to exhaust its addresses. This analysis is also supported by Table 1 and Figure 2, where the CPU numbers are all 4.
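The reasoning above can be illustrated with a minimal Python sketch. It is not PDSN code: the function names and the idea of a single "sessions per CPU" figure are assumptions made for illustration, while the IMSI split (00 to 49 to CPU 3, 50 to 99 to CPU 4) and the pool sizes (126 and 62) come from the article.

```python
# Illustrative sketch only: names and structure are assumptions, not PDSN code.

# Usable IP addresses per SMP CPU, as stated for Figure 7.
POOL_SIZE = {3: 126, 4: 62}


def cpu_for_imsi(imsi: str) -> int:
    """Map an IMSI to an SMP CPU by its last two digits (00-49 -> 3, 50-99 -> 4)."""
    return 3 if int(imsi[-2:]) <= 49 else 4


def exhausted_cpus(pool_size: dict, sessions_per_cpu: int) -> list:
    """Return the CPUs whose pool is too small for the given number of sessions.

    Assumes the load is split evenly, so every CPU sees the same session count.
    """
    return [cpu for cpu, size in sorted(pool_size.items()) if sessions_per_cpu > size]


if __name__ == "__main__":
    # The IMSI from Table 1 ends in 59, so it is handled by CPU 4.
    print(cpu_for_imsi("262199506317659"))
    # With a hypothetical 100 concurrent sessions per CPU, only CPU 4 runs out.
    print(exhausted_cpus(POOL_SIZE, 100))
```

With an even subscriber split, any load between 63 and 126 sessions per CPU exhausts only the smaller pool, which matches the pattern seen in the statistics.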

Troubleshooting

The engineer added new IP address pools. As mentioned previously, the IP address pool of CPU 3 had 64 more IP addresses than that of CPU 4 (126 versus 62). The purpose of adding new IP address pools was to give the two CPUs a similar number of IP addresses and so maximize the utilization of the IP addresses. Figure 8 shows the configuration of the new IP address pools.

IP address pools ippool3 and ippool4 were added, as shown in Figure 8. After the configuration was updated, CPU 3 and CPU 4 each had 157 IP addresses.

After the configuration of the PDSN system was updated, the alarms about high utilization of the IP address pool and the low success ratio of the PPP connection no longer occurred. The engineer checked the performance statistical data of the system.


Figure 10. Statistics on Success Ratio of the PPP Connection After Updating the IP-Address-Pool Configuration

Figure 9. Statistics on Utilization of the IP Address Pool of SMP After Updating the IP-Address-Pool Configuration

As shown in Figure 9, the peak IP address pool utilization was about 50% during the period from 10:00 to 18:00, in which the utilization had previously reached 100%.

As shown in Figure 10, the success ratio of the PPP connection was 100% all day long after the IP address pool configuration of the system was updated. This confirms that the analysis of the problem was correct. The proposed solution settled the problem of high utilization of IP address pools and low success ratio of the PPP connection. ■


Analysis of A Trunk Circuit Encountering Signaling Blocking After Disaster-Recovery Changeover

⊙ Zhou Fei / ZTE Corporation

Fault Phenomenon

An office adopted the 1+1 disaster recovery mode, but a disaster-recovery changeover had never been performed. During a disaster recovery drill, an engineer changed over the services from the active office to the standby office, and then found that many circuits were not running properly. He queried these circuits in Daily Maintenance > Dynamic Management, and found that they were in “signaling blocking” status. He then queried the office direction and found that it was in the normal “connected” status. At that time, all services appeared abnormal, and no call could be connected. After all services were changed back to the active office, all the trunk circuits and services recovered.

Problem Analysis

The engineer analyzed the configuration, but found nothing unusual. Why did the trunk circuits encounter signaling blocking?

The engineer later found that not all services had become abnormal. The intra-office calls could be put through.

After viewing the on-site networking diagram and the test result, the engineer found that all the trunk circuits over the LSTP encountered this problem. Was it possible that the LSTP data was wrong? Figure 1 shows the networking diagram. According to the requirements for the disaster recovery drill, the LSTP should be configured as follows.

Figure 1. Networking Model

(1) SCCP layer

● MSCeIN1 was directed to office MSCe 1, without backup.
● MSCeIN2 was directed to office MSCe 2, without backup.
● MSCeIN2 was directed to office MSCe 3, without backup.

(2) MTP3 layer

LSTP 1 to MSCe 1, MSCe 2, and MSCe 3:

● Active route: link A from LSTP 1 to active MSCe 1, MSCe 2 and MSCe 3
● Alternate route 1: link B from LSTP 1 to MSCe 4 (standby)
● Alternate route 2: link C from LSTP 1 to LSTP 2


A signaling link was enabled between LSTP 1 and MSCe 4 (standby). LSTP 1 only sent signaling maintenance messages to the signaling point of MSCe 4 (standby). Whenever LSTP 1 sent service messages, the DPC was always the signaling point of an active MSCe.

The local LSTP was not manufactured by ZTE. Before the drill, the engineer had confirmed with the vendor's maintenance engineer that this device was configured with links A, B, and C. The engineer then worked together with the vendor's engineer to analyze the device configuration, and found that these three links were not actually configured in the alternate route mode.

The networking diagram is shown in Figure 2.

● Signaling link A was enabled from LSTP 1 to MSCe 1, and signaling link A’ was enabled from LSTP 2 to MSCe 1.
● Signaling link B was enabled from LSTP 1 to MSCe 2, and signaling link B’ was enabled from LSTP 2 to MSCe 2.
● Signaling link C was enabled between LSTP 1 and LSTP 2.

On the SCCP layer, take LSTP 1 as an example. When all services fell on MSCe 1, the first-choice route was signaling link A, and the second-choice route ran over signaling link C and then signaling link A’.

If both of these routes were obstructed, the LSTP would move all services to MSCe 2. In this case only signaling links A and B were configured, which did not fulfill the requirement: signaling link B was not configured as the second alternate route to MSCe 1, as shown in Figure 2.
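The effect of a missing alternate route can be sketched as a simple priority lookup. This is a simplified model, not the LSTP vendor's routing logic: the function, the route lists, and the link states are assumptions chosen for illustration, and a route is treated as unavailable if any link along it is down.

```python
# Simplified model of priority-based route selection (illustrative assumption,
# not the LSTP vendor's implementation).

def select_route(routes, route_up):
    """Return the first route in priority order that is available, else None."""
    for route in routes:
        if route_up.get(route, False):
            return route
    return None


# Route list required for the drill: active route A, then alternates B and C.
REQUIRED_ROUTES = ["A", "B", "C"]
# Hypothetical list in which link B is not configured as an alternate route.
ROUTES_WITHOUT_B = ["A", "C"]

# Hypothetical state after changeover: the active MSCe is out of service, so
# route A and the path through C (which also ends at the active MSCe) are
# unusable, while link B to the standby MSCe is up.
after_changeover = {"A": False, "B": True, "C": False}

if __name__ == "__main__":
    print(select_route(REQUIRED_ROUTES, after_changeover))
    print(select_route(ROUTES_WITHOUT_B, after_changeover))
```

Under these assumptions the required configuration falls back to link B, while the list without B finds no usable route, which is consistent with the trunk circuits staying in signaling blocking status.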

Figure 2. Networking Model

Troubleshooting

The engineer asked the LSTP vendor's engineer to configure links A, B, and C according to the requirements of ZTE. After that, the engineer carried out the disaster recovery drill again. During the drill, all the trunk circuits over the LSTP and the services ran properly. ■


Analysis of Two Outgoing H3 Messages Being Generated During Interception on A Called Subscriber in One Office

⊙ Xu Wei / ZTE Corporation

Fault Phenomenon

A lawful interception center reported that two identical interception records were received when the intercepted MS acted as a called party, while only one record was received when this MS acted as a calling party. Figure 1 shows the flow of a SIP WIN call.

Analyzing and Processing

After collecting the BCM and Proxy signaling messages, an engineer found that SIP-related messages appeared during the call process.

On the surface, the call in which the called number was intercepted was an intra-office call, but it actually consisted of an “incoming” sub-call and an “outgoing” sub-call. The calling subscriber was a PPS subscriber.

This project required that the 3GCN be interconnected with the Prepaid Server over the SIP protocol to implement the PPS service, which was fundamentally different from the traditional CDMA WIN implementation mode. It was necessary for the product to implement this function. The core network devices should support the SIP protocol for interworking with the Prepaid Server, the application server, to implement the intelligent services and value-added services.

Figure 1. Flow of A SIP WIN Call

Figure 2. Networking Diagram

Figure 2 shows the networking diagram.

By analyzing the problem, the engineer drew the following conclusion:

This symptom occurs because a PPS call is equivalent to an outgoing call plus an incoming call. In this case, interception was triggered twice for the intercepted called number, because two calls were involved.


Two schemes are available to avoid this situation.

(1) When the security department is setting surveillance, implement interception by IMSI instead of MDN. That is to say, set Target Type to 1-imsi rather than 0-mdn, as shown in Table 1.

TT | Target Type | Integer: 0–254 | N/A | Lawful interception number type: 0-mdn, 1-imsi, 2-esn, 3-isdn, 4-min, 254-all. Lawful interception can be triggered normally by different number types.

Table 1. Description of Surveillance Setting

Note: This solution requires that the security department knows the MIN/IMSI of the subscriber, and that the mapping between MDN and IMSI is provided to the security department in real time.

(2) When the security department is setting surveillance, use trace type 1 instead of trace type 2, as shown in Table 2.

TRACETYPE | Trace type | Integer: 0–1 | 1 | 0: Trace type one. LI can be triggered only when the LI users are local ones. 1: Trace type two. LI can be triggered whether the LI users are registered locally or remotely. Trace type two is used for LI at GMSCe.

Table 2. Description of Surveillance Setting

● 0: Trace type 1 indicates that interception is triggered only when the intercepted subscriber is a local-office subscriber.
● 1: Trace type 2 indicates that interception is triggered no matter whether the intercepted subscriber is a local-office subscriber. This mode is used for interception at a gateway office.

Note: Under this solution, the system cannot intercept the called subscriber in an outgoing call or the calling subscriber in an incoming call. The system can only intercept a subscriber in his home office. For example, when subscriber A calls subscriber B, who is not a local-office subscriber, the system cannot intercept subscriber B in office A. Subscriber B can only be intercepted in his home office. ■
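The two surveillance parameters can be written down as enumerations to make the value encodings explicit. This is only a sketch: the class and function names are hypothetical, the numeric values are those listed in Table 1 and Table 2, and the helper merely checks whether a setting matches one of the two schemes described in this article.

```python
from enum import IntEnum


class TargetType(IntEnum):
    """Values of the TT (Target Type) parameter listed in Table 1."""
    MDN = 0
    IMSI = 1
    ESN = 2
    ISDN = 3
    MIN = 4
    ALL = 254


class TraceType(IntEnum):
    """Values of the TRACETYPE parameter listed in Table 2."""
    LOCAL_ONLY = 0        # "trace type 1": only local-office subscribers
    LOCAL_AND_REMOTE = 1  # "trace type 2": used at a gateway office


def matches_article_scheme(target: TargetType, trace: TraceType) -> bool:
    """Return True if the setting follows scheme (1) or scheme (2) of the article.

    Hypothetical helper: a False result only means that neither scheme is in
    use, not that duplicate records are certain to occur.
    """
    return target == TargetType.IMSI or trace == TraceType.LOCAL_ONLY


if __name__ == "__main__":
    print(matches_article_scheme(TargetType.MDN, TraceType.LOCAL_AND_REMOTE))
    print(matches_article_scheme(TargetType.IMSI, TraceType.LOCAL_AND_REMOTE))
```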


Analysis of A Failure in Suspension and Resumption for Accounting

⊙ Li Hua / ZTE Corporation

Fault Phenomenon

After a number segment was added in an office, error code 200051 “Fail to get HLR office attribute by user” was returned during the suspension and resumption operations performed for a subscriber, indicating that the system failed to get the attribute of the HLR office according to the subscriber.

Analyzing and Processing

An engineer searched for this error in app_20101109.log, the reported accounting log. This log covered two cases in which the error code was returned: queries for illegal MDN subscribers, and suspension/resumption operations for legal subscribers. The engineer extracted all the records in which this error code was returned for legal subscribers.

These records are as follows:

==============================================================================
2010-11-09 10:44:38:326, 11091044383262000<FROM BOSS-SOAP>Mod CUser:MDN=8618956227086,MSStat=1
2010-11-09 10:44:38:326, <MI Prompt Info>Assigned WorkSpace_Index = 0!
2010-11-09 10:44:38:326, 11091044383262000<SEND DBIO>Mod CUser:MDN=8618956227086,MSStat=1
2010-11-09 10:44:38:326, 11091044383262000<FROM DBIO-1>ACK:Mod CUser: RETN=200051, DESC=Fail to get hlr office attribute by user;
2010-11-09 10:44:38:326, 11091044383262000<SEND BOSS-SOAP>ACK:Mod CUser: RETN=200051, DESC=Fail to get hlr office attribute by user;
2010-11-09 10:44:38:326, <MI Prompt Info>Released WorkSpace_Index = 0!
… …
2010-11-09 11:03:24:326, 11091103243267600<FROM BOSS-SOAP>Mod CUser:MDN=8618956209892,MSStat=233
2010-11-09 11:03:24:326, <MI Prompt Info>Assigned WorkSpace_Index = 0!
2010-11-09 11:03:24:326, 11091103243267600<SEND DBIO>Mod CUser:MDN=8618956209892,MSStat=233
2010-11-09 11:03:24:326, 11091103243267600<FROM DBIO-1>ACK:Mod CUser: RETN=200051, DESC=Fail to get hlr office attribute by user;
2010-11-09 11:03:24:326, 11091103243267600<SEND BOSS-SOAP>ACK:Mod CUser: RETN=200051, DESC=Fail to get hlr office attribute by user;
2010-11-09 11:03:24:326, <MI Prompt Info>Released WorkSpace_Index = 0!
==============================================================================


The system started to record this error at 10:44 and stopped at 11:03, a span of about 20 minutes. The engineer did not find this error in other intervals.

Why did this error occur? Following this clue, the engineer checked whether any unexpected alarms had appeared during this interval, but found no useful information, because the system had not generated any alarms related to this error during this interval.

Did the subscriber perform relevant operations? The engineer found a clue in the log file in the trace directory on server 134, which had been returned from the field. The log file recorded data synchronization that was carried out on the background subsystem during this interval.

======================================
2010-11-09 10:44:35 ======= Begin of DataSync ======
2010-11-09 10:44:36 RxCommBeginReq: Sender's dbVersion is 256, dbVersionRef is 36! [DBS_P_SSyncRx.c](490)
2010-11-09 10:44:36 RxCommBeginReq: Receiver's dbVersion is 256, dbVersionRef is 35! [DBS_P_SSyncRx.c](492)
2010-11-09 10:44:36 RxCommBeginReq: _ODC_S10_MEMUSE changed to 1. [DBS_P_SSyncRx.c](527)
… …
2010-11-09 11:03:26 Saving DataBase to Path ..\DATA\DATA1 End.
2010-11-09 11:03:26 Saving DataBase to Path ..\DATA\DATA1 Succeeded!
======================================

Data synchronization started at 10:44, and storage of all data was not completed until 11:03.

This interval matched the interval in which the accounting system returned this error. According to the processing mechanism of the system, no node works during data synchronization, whether it is synchronization of incremental lists or synchronization of all tables. As a result, the DBIO did not perform service provisioning for either new or old subscribers.

Another question is why the influence of data synchronization for a newly added number segment persisted for 20 minutes. The engineer carefully compared the accounting provision log with the synchronization log recorded by DBIO, and found that this error did not last that long. The start time and the end time of this error being recorded are as follows:

==============================
2010-11-09 10:44:38:326, 11091044383262000<FROM DBIO-1>ACK:Mod CUser: RETN=200051, DESC=Fail to get hlr office attribute by user;
… …
2010-11-09 10:44:50:763, 11091044507632400<SEND BOSS-SOAP>ACK:Mod CUser: RETN=200051, DESC=Fail to get hlr office attribute by user;

The duration of this record is less than one minute. The start time and the end time of data synchronization being recorded are as follows:

2010-11-09 10:44:35 ====== Begin of DataSync =======
… …
2010-11-09 10:44:53 Saving DataBase to Path ..\DATA\DATA1 End.
2010-11-09 10:44:53 Saving DataBase to Path ..\DATA\DATA1 Succeeded!
2010-11-09 10:44:53 RxSaveTblOver: _ODC_S10_MEMUSE changed to 0. [DBS_P_SSyncRx.c](305)
2010-11-09 10:44:53 RxSaveTblOver: Notifying Application of Configuration Changing! [DBS_P_SSyncRx.c](322)
==============================

The start times and the end times recorded in these two logs match each other. The accounting provision log shows that this error was generated in three different intervals. Provisioning worked properly between the first and the second occurrence of this error. Accounting provisioning was affected only in the interval from the time when synchronization started to the time when storage completed.

The engineer later confirmed that data synchronization had been carried out three times in this interval, resulting in this error being reported during each period of data synchronization.

In the interface specification of the version, this error is explained as follows:

(1) There are two cases for error code 200051:

I. There is no corresponding configuration in the network management system, so the configuration cannot be acquired.

II. The network management system is synchronizing data with DBIO, so the user fails to acquire the configuration.

(2) For error codes 201310, 210278 and 210183, the instruction should be retransmitted every 10 seconds.

(3) For error code 208200, the accounting system needs to retransmit the instruction after adjustment of the node capacity.

This interface specification also explains other error codes, and provides helpful guidance on handling the errors returned by the accounting instructions.

Summary

When the accounting system returns an error code, first handle it according to the interface specification. This may reduce the difficulty of troubleshooting and improve efficiency. ■
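The comparison between the two logs amounts to checking whether each error timestamp falls inside a data synchronization window. The following Python sketch illustrates that check using only the timestamps quoted in this article; the function names are hypothetical, and the two timestamp formats are inferred from the log excerpts.

```python
from datetime import datetime


def parse_provision_ts(text: str) -> datetime:
    """Parse an accounting provision log timestamp, e.g. '2010-11-09 10:44:38:326'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S:%f")


def parse_sync_ts(text: str) -> datetime:
    """Parse a DBIO synchronization log timestamp, e.g. '2010-11-09 10:44:35'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def in_window(ts: datetime, begin: datetime, end: datetime) -> bool:
    """Return True if ts lies within the closed interval [begin, end]."""
    return begin <= ts <= end


# First synchronization window quoted in the article.
sync_begin = parse_sync_ts("2010-11-09 10:44:35")
sync_end = parse_sync_ts("2010-11-09 10:44:53")

# First and last error records of the first error interval.
error_times = ["2010-11-09 10:44:38:326", "2010-11-09 10:44:50:763"]

all_inside = all(
    in_window(parse_provision_ts(t), sync_begin, sync_end) for t in error_times
)

if __name__ == "__main__":
    print(all_inside)
```

Both error records fall inside the synchronization window, which agrees with the engineer's conclusion that error 200051 was reported only while data synchronization was in progress.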


Analysis of DNS Error During the xGW-PDSN Wap Service

⊙ Wu Zhibin / ZTE Corporation

Fault Phenomenon

A subscriber dialed up with a wap account. After the xGW-PDSN allocated an IP address, the DNS was displayed abnormally. The subscriber was able to access the wap website. The DNSs of the net service for Internet access over the CDMA network were 202.101.224.68 and 202.101.224.69, and the DNSs of the wap service were 220.192.8.58 and 220.192.32.103. As shown in Figure 1, a public network DNS was allocated for the wap service, resulting in abnormal access to web pages through the wap service. Figure 2 shows the onsite networking diagram.
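The check performed by eye on the ipconfig output can be expressed as a small lookup. This is a sketch for illustration only: the function and dictionary names are hypothetical, and the DNS addresses are the ones listed in this article.

```python
# Expected DNS servers per service, as listed in the article.
EXPECTED_DNS = {
    "net": {"202.101.224.68", "202.101.224.69"},
    "wap": {"220.192.8.58", "220.192.32.103"},
}


def dns_matches_service(service: str, allocated: list) -> bool:
    """Return True if every allocated DNS server belongs to the given service.

    An empty allocation is treated as a mismatch.
    """
    return bool(allocated) and set(allocated) <= EXPECTED_DNS[service]


if __name__ == "__main__":
    # Hypothetical case: a wap dial-up that received a net-service DNS server.
    print(dns_matches_service("wap", ["202.101.224.68"]))
    # A wap dial-up that received the wap DNS servers.
    print(dns_matches_service("wap", ["220.192.8.58", "220.192.32.103"]))
```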

Analyzing and Processing

According to the standards, the AAA delivers the DNS of the wap service, which is contained in the Radius Access-Accept message. The engineer used the LMT to trace the signaling messages of the PDSN during dialing with a wap account, but did not find the DNS parameter in the Radius Access-Accept message.

The Radius Access-Accept message captured from the AAA is shown in Figure 3.

The engineer found that this message did contain the DNS parameter, but it was carried in Cisco-AVPair, which is a private (vendor-specific) attribute of Cisco. After the engineer modified the AAA configuration so that a standard DNS attribute was delivered, the problem was solved.

The engineer then found the AAA-delivered DNS attribute value in the Radius Access-Accept message traced on the PDSN, as shown in Figure 4. The mobile phone could access the Internet properly. ■

Figure 1. ipconfig/all Displaying after Dialing With a wap Account

Figure 2. Networking Model

Figure 3. Radius Access Accept Message

Figure 4. DNS Attribute Value


Analysis of A Roaming Subscriber Failing to Send A Short Message

⊙ Xue Xiaoxing / ZTE Corporation

Fault Phenomenon

After roaming to the ZTE service zone, a subscriber was unable to send short messages over a control channel. After tracing the signaling messages, an engineer found that the MSCe sent an authentication request message to the home HLRe of the subscriber, and that this HLRe returned a GERROR signaling message. The subscriber failed to send short messages because of the authentication failure. The signaling trace result is shown in Figure 1.

Figure 1. Signaling Message (1)

Analyzing and Processing

Because this fault was due to rejection from the called HLRe, the engineer first analyzed the configuration of the called HLRe. After analyzing the home HLRe configuration of the subscriber, the engineer confirmed that TermType was a mandatory parameter in an authentication request. The authentication request from the MSCe, however, did not carry this parameter. As a result, the called HLRe considered that the request message did not contain all necessary parameters, and directly returned a GERROR message.

The analysis is as follows:

Parameter TermType is extracted from parameter Clsmark2, which the radio side sends to the switching side. If the radio side does not report it to the MSCe, the MSCe cannot know the terminal type.

As shown in Figure 2, the SmsTrans message did not contain parameter Clsmark2 during short message origination.

Figure 2. Signaling Message (2)


The CmServReq message carried parameter Clsmark2 during voice call origination, as shown in Figure 3.

Consequently, the engineer drew the following conclusion:

(1) When the caAddsTransfer_T message is transmitted for short message origination without parameter btClsmark2, the MSCe sends the authentication request without parameter termtype.

(2) When the caCMServReq_T message is transmitted for voice call origination with parameter tClsmark2, the MSCe sends the authentication request with parameter termtype.

As for whether parameter tClsmark2 is necessary, the IOS5.0 protocol specifies that parameter CIT in the CM request message for voice call origination is mandatory, whereas parameter CIT in the ADDS Transfer message over a control channel is optional.

In a radio environment, there is no problem when parameter tClsmark2 is not carried during short message origination. Some network operators explicitly define termtype as an optional parameter for authentication.

MAP 41D judges parameter TermType when the AC receives an AUTHREQ request message. If the AUTHREQ request message does not carry parameter TermType and the AC does not support the TSB51 authentication flow, the AC returns Error because of unsupported operation. If the operation is supported, processing continues.
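The interaction described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not MSCe or AC code: the class, the function names, the dictionary representation of Clsmark2 with a `term_type` key, and the placeholder IMSI are all hypothetical. Only the decision rule (no TermType and no TSB51 support leads to Error) is taken from the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuthRequest:
    """Simplified AUTHREQ: only the fields needed for this illustration."""
    imsi: str
    term_type: Optional[int] = None


def build_auth_request(imsi: str, clsmark2: Optional[dict]) -> AuthRequest:
    """MSCe side: TermType is available only if the radio side reported Clsmark2.

    The dict layout with a 'term_type' key is an assumption for illustration.
    """
    term_type = clsmark2.get("term_type") if clsmark2 is not None else None
    return AuthRequest(imsi=imsi, term_type=term_type)


def ac_handle(request: AuthRequest, supports_tsb51: bool) -> str:
    """AC side: return 'Error' when TermType is missing and TSB51 is unsupported."""
    if request.term_type is None and not supports_tsb51:
        return "Error"
    return "continue"


if __name__ == "__main__":
    placeholder_imsi = "0" * 15  # placeholder, not a real subscriber
    # Short message origination: Clsmark2 not reported, AC without TSB51 support.
    print(ac_handle(build_auth_request(placeholder_imsi, None), supports_tsb51=False))
    # Voice call origination: Clsmark2 reported, so TermType is present.
    print(ac_handle(build_auth_request(placeholder_imsi, {"term_type": 1}), supports_tsb51=False))
```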

As for the suggestion that the MSCe be made compatible with informal processing flows, no protocol requires the MSCe to construct TermType when the radio side does not report parameter tClsmark2. A constructed TermType neither conforms to the requirements of the specification nor reflects the real type of the terminal currently accessing the network. The purpose of requiring a terminal to carry this parameter on each access is for the network side to check its validity and so block illegal terminals. If the MSCe were allowed to construct TermType at random, checking this parameter would be meaningless. ■

Figure 3. Signaling Message (3)


Analysis of the Time Recorded in CDR Suddenly Jumping About One Hour

⊙ Sun Zuotao / ZTE Corporation

Fault Phenomenon

The time of some conversations in a CDR file generated by an MSCe suddenly jumped about one hour, as shown in Figure 1.

In this CDR file, the answer time and the release time of some conversations were both an hour later than those of other conversations. This CDR file was named UN10110911407116GCDR.

The daylight saving time ended on October 31, 24:00.

Figure 1. CDR File (UN10Files110911407116GCDR)


Analyzing and ProcessingT h e e n g i n e e r o b s e r v e d t h e a n s w e r

time and the release time listed in the CDR

f i les generated before and af ter CDR f i le

UN10110911407116GCDR, as shown in Figure 2

and Figure 3.

UN10110911307115GCDR:

UN10110911507117GCDR:

The engineer found that all conversation

records in CDR file UN10110911307115GCDR

were generated at about 10:00, whi le the

majority of the conversation records in CDR file

UN10110911507117GCDR were generated at

about 11:00, only several records were generated

at about 10:00.

The engineer determined that a jump in the fore-end time had caused the jump in the records of CDR file UN10110911407116GCDR, whose times were one hour later than the actual conversation time.
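The comparison that the engineer made by eye can be sketched as a scan over consecutive answer times. This is only an illustration and not the billing system's logic: the sample timestamps, the 30-minute threshold and the function name `find_time_jumps` are assumptions introduced here.

```python
from datetime import datetime, timedelta


def find_time_jumps(answer_times, threshold=timedelta(minutes=30)):
    """Return (index, delta) for each record whose answer time differs from
    the previous record by more than the threshold, in either direction."""
    jumps = []
    for i in range(1, len(answer_times)):
        delta = answer_times[i] - answer_times[i - 1]
        if abs(delta) > threshold:
            jumps.append((i, delta))
    return jumps


# Hypothetical answer times, in the order the records appear in the CDR file.
FMT = "%Y-%m-%d %H:%M:%S"
sample = [datetime.strptime(s, FMT) for s in (
    "2010-11-09 10:41:02",
    "2010-11-09 10:43:15",
    "2010-11-09 11:44:30",  # first record written after the clock moved
    "2010-11-09 11:46:01",
)]

jumps = find_time_jumps(sample)
for index, delta in jumps:
    print(f"record {index}: answer time jumped by {delta}")
```

A jump reported in the middle of a file, rather than at a file boundary, suggests that the clock changed while the file was being written.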

Why did time jump? The engineer first checked

whether the system had used an external clock.

If the system used an external clock source such

as NTP clock or GPS synchronization clock, the

system time varied with the time of the external

clock source.

After checking the NTP log, the engineer found

that the NTP clock was not enabled in the field.

Figure 2. CDR File (UN10110911307115GCDR)

Figure 3. CDR File (UN10110911507117GCDR)

====================================
2010-10-31 23:40:38 From server: 0.0.0.0, Time before adjust: 0.0, Time after adjust: 0.0, Adjust Fail
2010-10-31 23:50:38 From server: 0.0.0.0, Time before adjust: 0.0, Time after adjust: 0.0, Adjust Fail
2010-11-01 00:00:38 From server: 0.0.0.0, Time before adjust: 0.0, Time after adjust: 0.0, Adjust Fail
2010-11-01 00:10:38 From server: 0.0.0.0, Time before adjust: 0.0, Time after adjust: 0.0, Adjust Fail
====================================
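When the NTP log is long, the same check can be scripted. The following sketch assumes that every entry follows the layout of the four entries in the log excerpt; the regular expression is derived only from that excerpt, and any result string other than "Adjust Fail" is an assumption.

```python
import re

# Pattern derived from the entries shown in the NTP log excerpt.
ENTRY = re.compile(
    r"(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"From server: (?P<server>[\d.]+), "
    r"Time before adjust: (?P<before>[\d.]+), "
    r"Time after adjust: (?P<after>[\d.]+), "
    r"(?P<result>Adjust \w+)"
)


def failed_adjustments(log_text):
    """Return one dict per log entry whose result is 'Adjust Fail'."""
    # Collapse line breaks first, because entries may be wrapped over lines.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in ENTRY.finditer(flat)
            if m["result"] == "Adjust Fail"]


sample_log = """
2010-10-31 23:40:38 From server:
0.0.0.0, Time before adjust: 0.0, Time
after adjust: 0.0, Adjust Fail
2010-10-31 23:50:38 From server:
0.0.0.0, Time before adjust: 0.0, Time
after adjust: 0.0, Adjust Fail
"""

failures = failed_adjustments(sample_log)
for entry in failures:
    print(entry["when"], entry["server"], entry["result"])
```

If every entry fails against server 0.0.0.0, as in this excerpt, the log matches the engineer's finding that the NTP clock was not enabled.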


The engineer used the SHOW GPSTIME

command on the MML Terminal window to

check the GPS time.

==============================
NO.129 : SHOW GPSTIME;
------ 17 MSCE -----------
No. |Result            |Year |Month |Day |Hour |Minute |Second
-------------------------
1    Operation Success  2010  11     17   13    49      36
------------------------------
Rows: 1
==============================

The result shows that the GPS clock of the

BSC was used in the field.

By analyzing the collected notification messages, the engineer learned that the GPS time of the BSC was quite different from the field time. In this case, the system did not allow automatic time update. This notification message, however, no longer appeared after November 10. A manual update had probably been performed on November 9, resulting in the time jump.

Figure 4 shows the notification messages.

The engineer checked the operation logs and confirmed that a manual GPS update was performed at 11:37:28 on November 9, as shown in Figure 5.

The GPS time was also updated manually at 10:04:34 on October 29, as shown in Figure 6.

Figure 4. Notification Messages

Figure 5. Operation Log (1)

Figure 6. Operation Log (2)


According to the local engineer, daylight saving time ended at 24:00 on October 31. The problem was therefore probably that daylight saving time was enabled on the MSCe but not on the BSC, so the MSCe clock was an hour ahead of the BSC clock. When the GPS time was updated manually on October 29, 2010, at 10:11:19, the MSCe system time jumped back an hour, which was recorded in the fore-end log.

======================================
###:2010-10-29 10:11:19 01008 SCH8, OSS_Config:
The Error-value between set-time and OSS-time Larger than 300 sec,
after system start 3310122 sec,
OSS-current-time(Greenwich) is 341633479 sec,
set-time(Greenwich) is 341629879 sec,
time-zone is 420, summerTOffset is60.
======================================

==============================
###:2010-11-09 10:44:10 01008 SCH8, OSS_Config:
The Error-value between set-time and OSS-time Larger than 300 sec,
after system start 4269648 sec,
OSS-current-time(Greenwich) is 342589450 sec,
set-time(Greenwich) is 342593049 sec,
time-zone is 420, summerTOffset is60.
==============================

The MSCe left daylight saving time on October 31, so its system time rolled back another hour. The MSCe was then an hour behind the BSC.

The GPS time was updated manually again on November 9, and the MSCe time jumped once more, this time forward by an hour. At this moment, the system time returned to its normal state. The fore-end log entry dated 2010-11-09 also recorded this jump.
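The size and direction of each jump can be read directly from the two fore-end log entries by subtracting OSS-current-time from set-time. The sketch below does this arithmetic with the values copied from the logs. The conversion to calendar time rests on two assumptions that the source does not state: that the counters are seconds since 2000-01-01 00:00:00 Greenwich time, and that time-zone and summerTOffset are expressed in minutes. Under these assumptions the results match the timestamps printed in the log headers.

```python
from datetime import datetime, timedelta

# (OSS-current-time, set-time) in seconds, copied from the two log entries.
OCT_29 = (341633479, 341629879)
NOV_09 = (342589450, 342593049)

# ASSUMPTION: the counters start at 2000-01-01 00:00:00 Greenwich time.
ASSUMED_EPOCH = datetime(2000, 1, 1)


def adjustment(entry):
    """Seconds by which the manual update moved the clock (negative = back)."""
    current, new = entry
    return new - current


def to_local(seconds, offset_minutes):
    """Convert a counter value to local time using an offset in minutes."""
    return ASSUMED_EPOCH + timedelta(seconds=seconds, minutes=offset_minutes)


oct_delta = adjustment(OCT_29)  # clock moved back by one hour
nov_delta = adjustment(NOV_09)  # clock moved forward by about one hour

# October 29: daylight saving time still applied (time-zone 420 + offset 60).
oct_local = to_local(OCT_29[0], 420 + 60)
# November 9: daylight saving time had ended (time-zone 420 only).
nov_local = to_local(NOV_09[0], 420)

print(oct_delta, oct_local)
print(nov_delta, nov_local)
```

The two deltas have opposite signs, which agrees with the analysis: the first manual update set the MSCe back an hour and the second moved it forward again.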

The same symptom also appears when daylight saving time is enabled on the BSC but the corresponding GPS time is filled in as standard time rather than daylight saving time. Laboratory verification confirmed that, in this case, the problem was caused by daylight saving time not being enabled on the BSC.

Summary

Time jumps in CDR files are caused by time jumps on the fore end. To handle such a problem, focus on why the time jumped on the fore end. When an external NTP or GPS time source is enabled, analyze the changes in the external clock. In addition, analyze the problem by viewing the notification messages, the fore-end operation logs and the back-end operation logs. ■


Analysis of the SDP Decoding Failure of Incoming SIP Calls from an Interconnected IMS Office

⊙Zhu Xuefeng / ZTE Corporation

Fault Phenomenon

Figure 1 shows the networking diagram.

Figure 1. Traffic Model of the Fault

Figure 2. Call Signaling Messages

When a fixed network subscriber under

IMS calls a CDMA subscriber, a fault

occurs at the CDMA end office, prompting

“SDP decoding failure”.

Analyzing and Processing

(1) Trace the SIP signaling messages at the CDMA end office. Figure 2 shows the signaling messages.

On receipt of the INVITE message from the IMS, the CDMA end office returns a 400 response with the following content:

------------------------------
SIP/2.0 400 sdp decode fail
Via: SIP/2.0/UDP 5.0.128.86:5060;branch=z9hG4bK-4cd11898-84
To: <sip:[email protected]:6001;user=phone>;tag=022-39895630022e22e2a57b3-247e6d49
From: <sip:[email protected]>;tag=768
Call-ID: 1288771736-86@local
CSeq: 1 INVITE
User-Agent: ZTE-MSCe
Content-Length: 0
--------------------------------------

The message prompts “sdp decode fail” (indicating SDP decoding failure). This prompt typically indicates a mismatch between the coding/decoding type carried by the INVITE message and that configured at the CDMA end office.

(2) Analyze the received INVITE message, which has the following content:

--------------------------------------
INVITE sip:[email protected]:6001;user=phone SIP/2.0
Via: SIP/2.0/UDP 5.0.128.86:5060;branch=z9hG4bK-4cd11898-84
To: <sip:[email protected]:6001;user=phone>
From: <sip:[email protected]>;tag=768
Call-ID: 1288771736-86@local
CSeq: 1 INVITE
Max-Forwards: 70
Contact: <sip:[email protected]>
Supported: timer
Require: 100rel
Expires: 330
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, INFO, PRACK, REFER, SUBSCRIBE, NOTIFY, UPDATE, REGISTER
Session-Expires: 1800
Min-SE:90
P-Asserted-Identity: <sip:[email protected];user=phone>
MIME-Version: 1.0
Content-Type: multipart/mixed;boundary=tekBoundary
Content-Length: 445

--tekBoundary
Content-Type: application/sdp

v=0
o=MxSIP 0 1416567652 IN IP4 5.0.129.4
s=SIP Call
c=IN IP4 5.0.129.4
t=0 0
m=audio 20534 RTP/AVP 8 0 18 101
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=silenceSupp:off - - - -
--tekBoundary
Content-Type: application/isup; version=itu-t92+

01 00 00 00 00 02
08 81 10 51 43 48 06 06 01
07 81 13 58 33 41 06 0f 00
--tekBoundary--
------------------------------

The message shows that no blank line separates the first encapsulated part (the SDP message) from the second encapsulated part (the ISUP message). The format of the message encountering the problem is as follows:

------------------------------
a=silenceSupp:off - - - -
--tekBoundary
Content-Type: application/isup; version=itu-t92+
------------------------------

The correct one should be as follows, with a blank line before the boundary:

------------------------------
a=silenceSupp:off - - - -

--tekBoundary
Content-Type: application/isup; version=itu-t92+
------------------------------

Without the blank line, the system treats the encapsulated ISUP message as part of the content of the first encapsulated SDP message, so decoding of the SDP message fails. As a result, the system prompts “sdp decode fail”.
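The condition described above can be checked mechanically before a dialing test. The following sketch only models that one condition, namely whether each boundary line after the first is preceded by an empty line. It is not the decoder of the CDMA end office, and the abbreviated two-part body and the function names are hypothetical.

```python
def boundaries_missing_blank_line(body, boundary):
    """Return the line indexes of boundary delimiters, other than the first,
    that are not preceded by an empty line."""
    marker = "--" + boundary
    lines = body.splitlines()
    missing = []
    seen_first = False
    for index, line in enumerate(lines):
        if line.rstrip() not in (marker, marker + "--"):
            continue
        if seen_first and lines[index - 1].strip():
            missing.append(index)
        seen_first = True
    return missing


def build_body(blank_before_second_part):
    """Build an abbreviated, hypothetical two-part body for the check."""
    lines = [
        "--tekBoundary",
        "Content-Type: application/sdp",
        "",
        "v=0",
        "a=silenceSupp:off - - - -",
    ]
    if blank_before_second_part:
        lines.append("")
    lines += [
        "--tekBoundary",
        "Content-Type: application/isup; version=itu-t92+",
        "",
        "01 00 00 00 00 02",
        "",
        "--tekBoundary--",
    ]
    return "\r\n".join(lines)


faulty = boundaries_missing_blank_line(build_body(False), "tekBoundary")
fixed = boundaries_missing_blank_line(build_body(True), "tekBoundary")
print("faulty body, boundaries without a blank line:", faulty)
print("corrected body, boundaries without a blank line:", fixed)
```

Running such a check on a captured INVITE body shows quickly whether the peer office sends the format that the end office fails to decode.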

(3) Analysis and verification: Ask the IMS maintenance personnel to cancel encapsulation of the ISUP message in the INVITE message, and then perform a dialing test. The service returns to normal. ■


Analysis of AAA Sending Unrecognizable User Name to PPS

⊙Zhu Wei / ZTE Corporation

Fault Phenomenon

When AAA was interconnected with PPS, the engineer at the PPS side reported that something was wrong with the user name that AAA sent to PPS. Only the first Request message carried the correct user name. The subsequent Request messages carried an unrecognizable user name.

Analyzing and Processing

An engineer checked the interaction messages between AAA and PPS to locate the problem.

The Request message sent to PPS for the first time carried the correct user name, 0012CFE777BF, as shown in Figure 1.

When the subscriber quota reached the threshold, the Request message sent to PPS for the second time carried an unrecognizable user name, as shown in Figure 2.

Figure 1. Interaction Message (1)

Figure 2. Interaction Message (2)


With packet capture, the engineer detected

that something was wrong with the user name

that AAA sent to PPS. AAA matches user names

according to Wimax-Session-ID. If a user name

is not matched, AAA generates a random value

for attribute User-Name and fills in this value, and

then sends it out. The unrecognizable user name

probably was due to Wimax-Session-ID. The

engineer then analyzed Wimax-Session-ID.

He viewed the first message, that is,

the first Request message that AAA sent

to PPS, as shown in Figure 3.

Figure 3 shows the Wimax-Session-ID

parameter that AAA sent to PPS.

With packet capture, the engineer

found that the parameter that PPS

returned to AAA was generated by PPS

itself, instead of the parameter that AAA

delivered to PPS, as shown in Figure 4.

Figure 3. Interaction Message (3)

Figure 4. Interaction Message (4)


As shown in the captured packets, PPS delivered a wrong Wimax-Session-ID to AAA, which forwarded it directly to the AGW. When the subscriber quota reached the threshold, the AGW sent a Request message containing this wrong Wimax-Session-ID to ask for more quota. Because AAA had not generated this Wimax-Session-ID, it was unable to match a user name to it. When no user name was matched, AAA generated a random value and sent this value to PPS. This value was the unrecognizable user name seen at the PPS side.
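The behavior described above can be modeled in a few lines. The sketch below is a toy model and not the AAA implementation: the class `AaaSessionTable`, its methods, and the way session IDs and random values are generated are all assumptions made for illustration. Only the user name 0012CFE777BF comes from the case.

```python
import secrets


class AaaSessionTable:
    """Toy model: AAA remembers which user name belongs to each
    Wimax-Session-ID that it generated itself."""

    def __init__(self):
        self._user_by_session = {}

    def open_session(self, user_name):
        # AAA generates the session ID and records the owner.
        session_id = secrets.token_hex(8)
        self._user_by_session[session_id] = user_name
        return session_id

    def user_name_for(self, session_id):
        # An unknown session ID yields a random value instead of a user name.
        user_name = self._user_by_session.get(session_id)
        if user_name is None:
            return secrets.token_hex(6)
        return user_name


aaa = AaaSessionTable()
aaa_session_id = aaa.open_session("0012CFE777BF")

# PPS answers with a session ID of its own (hypothetical value), and the
# later quota request carries that ID back to AAA.
pps_session_id = "generated-by-pps"

first_request_user = aaa.user_name_for(aaa_session_id)
second_request_user = aaa.user_name_for(pps_session_id)
print(first_request_user, second_request_user)
```

The second lookup fails only because the session ID changed on the way, which is why the fix belongs on the PPS side.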

Troubleshooting

PPS should return a message that either contains the Wimax-Session-ID generated by AAA or does not contain this parameter at all. The problem was solved after the response message no longer contained Wimax-Session-ID.

Summary

According to the NGW protocol, parameter Wimax-Session-ID is generated by AAA, not by PPS. In this case, PPS delivered a Wimax-Session-ID that it had generated itself to AAA, which caused AAA to send an unrecognizable user name to PPS.

AAA can handle the response message that PPS sends to it when the message either carries the Wimax-Session-ID that AAA sent to PPS or does not carry this parameter. ■


Analysis of the Problem of MSCe Being Directly Interconnected With SCP

⊙Chen Jiansheng / ZTE Corporation

Fault Phenomenon

After the MSCe was interconnected with the SCP by using the DPC+SSN addressing mode, the office was accessible, but the MSCe failed to send signaling messages to the SCP during call origination and termination.

Problem Analysis

(1) The engineer viewed the MAP signaling trace result of the MSCe and the failure observation records of the SPS.

Figure 1 shows a call-termination flow, where

the MSCe sent an AnlyzdReq message to the SCP,

but received an Abort message. The call failed.

The engineer checked the failure

observation, as shown in Figure 2. He

found that the lower layer discarded the

AnlyzdReq message.

(2) The engineer continued to trace the SCCP-layer signaling messages and performed the failure observation.

An ERROR message was received at the MTP3 layer, as shown in Figure 3.

(3) The engineer checked the signaling point connecting the adjacent office and verified with the opposite end that it was correct. He checked other configurations of the adjacent office, including the network type, adjacent office attributes and protocol type, but did not find any problems. The opposite-end office was accessible.

Figure 1. Signaling Trace Messages

Figure 2. Failure Observation

Figure 3. Signaling Messages

(4) The engineer checked the failure observation records. The first record showed that the failure cause was “SPC inaccessible”, and the detailed information was “SPC :0-4-250,net 1 inaccessible”. Why did the system prompt "net 1 inaccessible" when the SPC was correct? The adjacent office configuration showed that the MSCe was interconnected with the SCP by using Net 5.

(5) When sending the message, the MTP layer looked for the signaling point route of the SCP in Net 1, did not find it, and therefore discarded the message. The adjacent office of the SCP, however, was configured in Net 5. In other words, the MSCe did not perform addressing according to the MML configuration.

(6) The engineer suspected that a security variable affected this flow, because security variables take priority over the MML configuration during the internal processing of the system. Therefore, the engineer searched for the possible security variables and found security variable “SCP network type” under Other Function. Its default value is 1, indicating that the network type of the SCP has been set to 1. Therefore, Net=5 in the MML configuration was invalid.
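The effect of the security variable on addressing can be pictured with a small lookup. This is a simplified model under stated assumptions and not the MSCe's internal logic: the routing table, the function `route_for_scp` and its parameters are hypothetical, and only the SPC, the network types and the failure text come from the case.

```python
# Hypothetical routing table of the MTP layer, keyed by (network type, SPC).
ROUTES = {(5, "0-4-250"): "route to SCP"}


def route_for_scp(spc, mml_net, scp_network_type=1):
    """Look up the SCP route the way the case describes it: the security
    variable 'SCP network type' decides the network, not the MML setting."""
    # mml_net is deliberately ignored to model the priority of the variable.
    net = scp_network_type
    route = ROUTES.get((net, spc))
    if route is None:
        return f"SPC :{spc},net {net} inaccessible"
    return route


# Default security variable (1): the route configured in Net 5 is not found.
before = route_for_scp("0-4-250", mml_net=5)
# Security variable set to 5: the lookup uses the network of the SCP.
after = route_for_scp("0-4-250", mml_net=5, scp_network_type=5)
print(before)
print(after)
```

The model makes the symptom easy to recognize: a failure record that names a network type different from the configured one points to an overriding setting rather than to a wrong SPC.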

Troubleshooting

The engineer set this security variable to 5, as shown in Figure 4, and then transferred the configuration. After that, the MSCe could properly send the AnlyzdReq message to the SCP. The problem was solved. ■

Figure 4. Security Variable Setting
