Commands for checking STP status


Upload: phamtrunghieu1985

Post on 05-Feb-2016


Here are the items still pending from ISTT and VietTel on the HCM-NEW site:

1. Downgrade MD from version 13.3 to version 13.1 to align it with the NX version.
   a. The required file is available on FTP:
      ftp://ftp.allot.com/Previous_Versions/MNG_server/MD/SMP-DC.13.1.10_B8/md-13.1.10-08.i386.rpm
   b. Remove the currently installed version with the command rpm -e md-13.3… (use the TAB key to fill in the right file name).
   c. Install the correct version (13.1.10) as described in the NX installation guide (https://c.eu1.visual.force.com/apex/KB?KBID=36962719):
      i. Install the software: rpm -ivh md-13.1.10-08.i386.rpm
      ii. Once finished, set it as STC: dev_setup.sh -m stc

2. Upgrade the SGSE to the same build as the other sites.
   a. Copy the contents of /home/sysadmin/AOS/ from HNI to HCM-NEW (or locate the files in the HCM SGSE).
   b. Run the upgrade with the command ./aos-install.sh; the script runs through all the cards and upgrades them one by one.

3. After the upgrade, you can distribute the policy from HCM to HCM-NEW so that the devices are aligned.

4. VietTel needs to connect HCM-NEW to the asymmetric control network.
   - SFB_7_L1 is already configured.
   - HCM-NEW has already been added to the asymmetric group.

5. VietTel needs to connect traffic to the new device.

6. Regarding the power issue observed in HCM-NEW (2 CBs are off and one PEM shows red lights):
   - Check with VietTel's electrician that the PEM is connected properly and all CBs are OK.
   - If the issue persists, verify that the PEM is properly fitted into the SGSE.
   - If it still persists, please open a case for S/N SGSE1406000956 (the HCM-NEW SGSE; it has valid support).

7. Please send me the snapshot from the SGSE at the HNI site (/opt/allot/snapshots/snapshot.system.22.10.14_03.24.00.tgz) so that I can verify that the new cards are listed properly in the CRM.
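For step 1, a quick check after the reinstall can confirm that MD ended up on the 13.1 line. The helper below is only a sketch: the function name is_md_13_1 is made up for illustration, and it assumes the installed package is called "md", so that rpm -q md reports a string beginning with md-13.1.10-08 as in the rpm file name.

```shell
# Return success when a package string belongs to the MD 13.1 line.
# Assumption: the string looks like "md-13.1.10-08" (optionally followed by ".i386").
is_md_13_1() {
    case "$1" in
        md-13.1.*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Hypothetical usage on the MD server (assumes the package name is "md"):
#   is_md_13_1 "$(rpm -q md)" && echo "MD is on 13.1" || echo "MD still needs the downgrade"
```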

Check logging

sysadmin@EXC-SBH[7/14]:~$ history | more
    2  Oct/16 - 00:03:55 | ssh [email protected]
    3  Oct/16 - 00:04:16 | cd -
    4  Oct/16 - 00:04:24 | cd /home/sysadmin/
    5  Oct/16 - 00:04:25 | ls -l
    6  Oct/16 - 00:04:39 | acmon
    7  Oct/16 - 00:04:49 | acstat
    8  Oct/16 - 00:18:36 | ls -l
    9  Oct/16 - 00:18:49 | cd AOS131

/opt/allot/logs/rsyslog.secondary.log

Check port status

sysadmin@HHT9402SGSE14-SBH[7/14]:~$ go config view nic

Interface SB_6_L1 :
    Mode full
    Speed 10000 Mbps
    MAC 00:09:38:50:32:21
    Admin enable
    Status disable
    Action on Failure none
    Supported Actions none,

go config view network
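When many interfaces are listed, the ones that are administratively enabled but report Status disable (like SB_6_L1 above) are the ones worth a closer look. The filter below is a rough sketch: the function name nic_down_ports is hypothetical, and it relies only on the keywords Interface, Admin and Status seen in the sample output, with each keyword followed by its value on the same line.

```shell
# List interfaces that are Admin "enable" but Status "disable".
# Reads the text of `go config view nic` on stdin.
# Assumption: every keyword is immediately followed by its value, as in the sample.
nic_down_ports() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "Interface")   { name = $(i + 1); admin = "" }
            else if ($i == "Admin")  { admin = $(i + 1) }
            else if ($i == "Status") {
                if (admin == "enable" && $(i + 1) == "disable") print name
            }
        }
    }'
}

# Hypothetical usage on the blade:
#   go config view nic | nic_down_ports
```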

Show system/module temperatures (CC, SFB, ...), fan and power status:

1. Connect to the SMC (from the SGSV blade): ssh [email protected] (no password, just press Enter)

clia sel > clia_sel.txt
clia sel -v > clia_sel_v.txt
clia fru -v > clia_fru_v.txt
clia board -v > clia_board_v.txt
clia shelf pd > clia_shelf_pd.txt
clia shelf pm > clia_shelf_pm.txt
clia shelf at > clia_shelf_at.txt
clia fans > clia_fans.txt
clia sensordata > clia_sensordata.txt
clia alarm > clia_alarm.txt
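Typing ten redirections by hand makes it easy to send an output to the wrong file. One way to avoid that is to derive each file name from the clia arguments, as in the sketch below. The function names clia_outfile and collect_clia are made up for illustration, and the sketch assumes the SMC shell supports functions and has sed available.

```shell
# Build the output file name from the clia arguments,
# e.g. "shelf pd" -> clia_shelf_pd.txt and "sel -v" -> clia_sel_v.txt.
clia_outfile() {
    printf 'clia_%s.txt\n' "$(printf '%s' "$1" | sed -e 's/-//g' -e 's/  */_/g')"
}

# Run every clia query from the list above and save each output to its own file.
# Must be run on the SMC, where the clia command is available.
collect_clia() {
    for args in "sel" "sel -v" "fru -v" "board -v" "shelf pd" "shelf pm" \
                "shelf at" "fans" "sensordata" "alarm"; do
        # $args is left unquoted on purpose so "sel -v" splits into two arguments.
        clia $args > "$(clia_outfile "$args")"
    done
}
```

Calling collect_clia in an empty directory leaves one text file per query, ready to be attached to a support case.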

Recreate LTC procedure

The entire procedure must be run as the root user.

1. First, check that all processes on the NX HAP are working as expected:

a. Active NX

i. HeartBeat

[root@nx1 ~]# service heartbeat status
heartbeat OK [pid 7038 et al] is running on nx1.viettel [nx1.viettel]...

ii. Mount

[root@nx1 ~]# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda2 on /opt type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/dm-1 on /opt/sybase/data type ext3 (rw) --> this is mounted to storage

iii. Netxplorer

[root@nx1 ~]# service netxplorer status
swKeeper (pid 7363) is running...

b. Standby (SB) NX

i. HeartBeat

[root@nx2 ~]# service heartbeat status
heartbeat OK [pid 15303 et al] is running on nx2.viettel [nx2.viettel]...

ii. Mount

[root@nx2 ~]# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda2 on /opt type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

iii. Netxplorer

[root@nx2 ~]# service netxplorer status
swKeeper is stopped

c. You can always run crm_mon to see that NX HAP works properly on both NX servers:

Connect as the root user and run the command crm_mon. This is the result:

============
Last updated: Tue Jun 24 15:15:09 2014
Current DC: nx1.viettel (66b948b6-f1ff-4381-a6fb-3c1fdb7f2e87)
2 Nodes configured.
1 Resources configured.
============

Node: nx2.viettel (ad505531-4280-4f5e-b721-64cb137a7819): online
Node: nx1.viettel (66b948b6-f1ff-4381-a6fb-3c1fdb7f2e87): online

Resource Group: nx_ha
    vip (ocf::heartbeat:IPaddr2): Started nx1.viettel
    db (ocf::heartbeat:Filesystem): Started nx1.viettel
    nx (lsb:netxplorer): Started nx1.viettel
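The crm_mon result can also be checked mechanically: both nodes should be online and all three resources of nx_ha should be started on the same node. The sketch below assumes crm_mon prints one node or resource per line. The function name crm_summary is hypothetical, and the "Stopped" wording for a resource that is not running is an assumption, since the notes only show the healthy case.

```shell
# Summarise crm_mon output read from stdin.
# Prints "OK <node>" when every node is online and all resources are started
# on a single node; prints "FAIL" otherwise.
crm_summary() {
    awk '
        /^Node:/        { nodes++; if ($NF == "online") online++ }
        /\): Started /  { started++; owner[$NF] = 1 }
        /\): Stopped/   { stopped++ }   # assumed wording for a stopped resource
        END {
            n = 0
            for (h in owner) { n++; host = h }
            if (nodes > 0 && nodes == online && started > 0 && stopped == 0 && n == 1) {
                print "OK " host
            } else {
                print "FAIL"
            }
        }'
}

# Possible usage as root on either NX server (crm_mon -1 prints once and exits):
#   crm_mon -1 | crm_summary
```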

2. Netxplorer must be running all of the processes below, as listed by the command ps -ef | grep opt. If one of them is missing, there are logs that can be viewed under /opt/allot/log/.

dbsrv12 (3 instances: cfg, stc, ltc)
swkeeper
Poller
Converter
Loader
ltc_poller
ltc_Loader
java

3. Check the disk space on dm-1 and confirm that it is full: df -h
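Checks 2 and 3 can be scripted so that nothing is missed by eye. The sketch below makes a few assumptions: the function names are hypothetical, each process name is assumed to appear as a whole word in the ps -ef command column, and matching is case-insensitive because the notes spell swkeeper and swKeeper both ways.

```shell
# Print the Use% figure (without the % sign) for a mount point.
# Reads `df -P` output on stdin; -P keeps each filesystem on a single line.
mount_use_pct() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { sub(/%/, "", $5); print $5 }'
}

# Print the expected NetXplorer processes that are missing from `ps -ef` output (stdin).
# dbsrv12 must appear three times (cfg, stc, ltc); every other name at least once.
missing_nx_processes() {
    ps_out=$(cat)
    n=$(printf '%s\n' "$ps_out" | grep -c -i -w 'dbsrv12' || true)
    [ "$n" -ge 3 ] || echo "dbsrv12 ($n of 3 instances)"
    for p in swkeeper Poller Converter Loader ltc_poller ltc_Loader java; do
        # -w stops "Loader" from matching inside "ltc_Loader".
        printf '%s\n' "$ps_out" | grep -q -i -w "$p" || echo "$p"
    done
}

# Possible usage as root on the Active NX:
#   df -P | mount_use_pct /opt/sybase/data
#   ps -ef | grep opt | missing_nx_processes
```

An empty result from missing_nx_processes means every expected process was found.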

Only after checking all of the above, and seeing that the disk is full while all other NX HAP functionality is fine, can you proceed with the recreate procedure:

A. On the standby (SB) NX, stop heartbeat: service heartbeat stop

B. On the Active NX:

a. Stop heartbeat: service heartbeat stop

b. Unmount the storage: umount /opt/sybase/data/

c. Stop Netxplorer: service netxplorer stop

C. Recreate the LTC:

a. Mount the storage: mount /dev/dm-1 /opt/sybase/data

b. Recreate the LTC DB: /opt/allot/bin/recreate_db.sh ltc

D. Once it has finished, start the NX HAP again on the Active NX:

a. Start heartbeat: service heartbeat start

b. Start Netxplorer: service netxplorer start

c. Check that all Netxplorer processes are up: ps -ef | grep opt

d. When everything is up on the Active NX, start heartbeat on the SB NX as well: service heartbeat start
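Because the order matters and the steps alternate between the two servers, it can help to keep the procedure as a printable checklist. The sketch below only prints commands and executes nothing. The function names are hypothetical, and the data path is written as /opt/sybase/data, matching the mount output shown earlier.

```shell
# Print the recreate procedure as "step  role  command" lines, in execution order.
recreate_ltc_plan() {
    printf '%s\n' \
        'A    standby  service heartbeat stop' \
        'B.a  active   service heartbeat stop' \
        'B.b  active   umount /opt/sybase/data' \
        'B.c  active   service netxplorer stop' \
        'C.a  active   mount /dev/dm-1 /opt/sybase/data' \
        'C.b  active   /opt/allot/bin/recreate_db.sh ltc' \
        'D.a  active   service heartbeat start' \
        'D.b  active   service netxplorer start' \
        'D.c  active   ps -ef | grep opt' \
        'D.d  standby  service heartbeat start'
}

# Print only the commands for one server role ("active" or "standby").
commands_for() {
    recreate_ltc_plan | awk -v role="$1" '$2 == role { sub(/^[^ ]+ +[^ ]+ +/, ""); print }'
}
```

For example, commands_for standby prints the two heartbeat commands for the SB NX. The first is run before anything is touched on the Active NX and the second only after the Active NX is fully up.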

According to the logs, my suspicion is correct: it seems that the system went into BYPASS mode and back to ACTIVE mode because the connection between the primary SB#7 and SB#6 was lost.

Let me explain:

On the sigma you have 4 CCs and 3 SBs:

Cards list :

|Slot |Card Type |SMC State |Card Status
--------------------------------------------
|1    |EXC-CC    |ON        |ACTIVE
--------------------------------------------
|2    |EXC-CC    |ON        |ACTIVE
--------------------------------------------
|3    |EXC-CC    |ON        |ACTIVE
--------------------------------------------
|4    |EXC-CC    |ON        |ACTIVE
--------------------------------------------
|6    |EXC-SB    |ON        |ACTIVE
--------------------------------------------
|7    |EXC-SB    |ON        |ACTIVE
--------------------------------------------
|8    |EXC-SB    |ON        |ACTIVE

According to the configuration, if fewer than 3 SBs or fewer than 3 CCs are working, the system goes into BYPASS mode:

Minimum number of Core Controllers 3

Number of active Core Controllers 4

Minimum number of Switch Balancers 3
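The rule can be written as a small function, which makes it easy to see why losing a single SB is enough to trigger BYPASS on this system. This is only an illustration of the thresholds quoted above, not the device's own logic; the function name system_mode is hypothetical.

```shell
# Decide the system mode from the active card counts and the configured minimums.
# Arguments: active_cc min_cc active_sb min_sb
system_mode() {
    if [ "$1" -lt "$2" ] || [ "$3" -lt "$4" ]; then
        echo BYPASS
    else
        echo ACTIVE
    fi
}
```

With 4 CCs and 3 SBs active and a minimum of 3 each, system_mode 4 3 3 3 prints ACTIVE. With one SB lost, system_mode 4 3 2 3 prints BYPASS. Losing one CC (system_mode 3 3 3 3) still leaves the system ACTIVE.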

When checking the logs on the system, I could see that in both cases the transfer to BYPASS and back to ACTIVE was due to the lost connection to SB#6:

2014-07-08 09:15:51 .306 notice systemMgr_____[5685]: Event received from FB 6, card system status changed to BYPASS

2014-07-08 09:15:51 .307 notice systemMgr_____[5685]: Update System Status on card status change, there are 2 active FBs, min=3

2014-07-08 09:15:51 .307 notice systemMgr_____[5685]: Switch System to Bypass mode

When I checked the SB#6 logs to understand the root cause, I could see that link #8 of SB#6 went down:

2014-07-08 09:15:51 .301 info evMngr________[4818]: New event from CPU=HOST: src=FM3224, type=link state, reaction=SEND_EVENT, count=577, tSmp=1743017212

2014-07-08 09:15:51 .302 info evMngr________[4818]: FM3224 port[8] link is DOWN, speed[10G], duplex[FULL]

2014-07-08 09:15:51 .302 info evMngr________[4818]: - FM3224/8/down: SERDES_STATUS[0000001F]: signal detected[1], symbol lock[1 1 1 1]

2014-07-08 09:15:51 .302 info evMngr________[4818]: - FM3224/8/down: SERDES_CTRL_2[00000000]: lane power down[0 0 0 0], lane reset[0 0 0 0]

2014-07-08 09:15:51 .302 info evMngr________[4818]: - FM3224/8/down: PCS_IP [00003FD4]: LF[0], RF[0], lanes misaligned[1], link[1]

2014-07-08 09:15:51 .302 info evMngr________[4818]: - FM3224/8/down: MAC_CFG_2 [0007217C]: DisableRx_MAC[0], DisableTx_MAC[0]

2014-07-08 09:15:51 .302 info evMngr________[4818]: - FM3224/8/down: PCS_CFG_1 [00640C0A]: speed[10G]

2014-07-08 09:15:51 .303 info LTM_SB________[4847]: FP:Local event dispatch (4832):eventHandler:3585:port event: switch[0] port[8] is down

2014-07-08 09:15:51 .303 info LTM_SB________[4832]: Received port status event (from keeper) - FABRIC_PORT_8 is DOWN [ethLinkStatus=2]

2014-07-08 09:15:51 .303 info LTM_SB________[4832]: FABRIC_PORT_8 is DOWN - notify keeper if needed
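To see how often this happens, the link-down lines can be pulled out of a saved copy of the SB log. The filter below is a sketch based only on the line format quoted above ("port[n] link is DOWN"). The function name link_down_events and the file name sb6.log are placeholders.

```shell
# Extract fabric link-down events from SB log text read on stdin.
# Prints "<date> <time> port=<n>" for every "port[n] link is DOWN" line.
link_down_events() {
    sed -n 's/^\([0-9-]* [0-9:]*\) .*port\[\([0-9]*\)] link is DOWN.*/\1 port=\2/p'
}

# Hypothetical usage on a saved copy of the log:
#   link_down_events < sb6.log | sort | uniq -c
```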

If I am right, this is a rare issue that does not happen often. If the issue repeats, I recommend restarting the sigma, because I can see that it has been up since the last upgrade, 262 days ago:

Version AOS.SGSE14.13.1.32 Build 89

Tue Sep 24 12:07:15 EEST 2013

14:13:54 up 262 days, 14:12, 1 user, load average: 2.60, 2.20, 1.81
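If the uptime needs to be tracked across several sites, the number of days can be extracted from the uptime line with a one-line filter. This is a small sketch; the function name uptime_days is hypothetical, and it prints nothing when the system has been up for less than a day, since the line then has no "days" field.

```shell
# Print the number of whole days from an `uptime` line read on stdin.
uptime_days() {
    sed -n 's/.* up \([0-9][0-9]*\) days\{0,1\},.*/\1/p'
}

# Possible usage:
#   uptime | uptime_days
```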

If there are any further questions or relevant information requested, please do not hesitate to ask.