
Upload: skamalarajan

Post on 28-Apr-2015


How To Debug CCI Issues - Version 1.3

Introduction

This short document describes some of the many ways CCI can break. It should be used as a list of the things to check with a customer whenever CCI problems are raised.

Check: This tag is used to highlight what to look for.

Installation and Environment Variables

UNIX

You would think it is easy to get this right, but we see many errors caused by bad installation. Here is what should be done. Let us assume a non-default installation, as that is the easiest for the user to get wrong.

Let us assume we want to install CCI in /opt/HORCM.

1 Copy the installation file to the hard disk somewhere. It is called RMHORC. Let's copy it to /var/tmp.

2 cd /opt
Change directory to where you want /HORCM created.

3 cpio -idmu < /var/tmp/RMHORC
This will copy all the files in the RMHORC "package" to /opt/HORCM.

Check: We have seen cases where users build their own installation "packages" for HORCM and then copy the files from one host to another. In UNIX particularly this is dangerous. CCI needs a "hidden" directory called .uds or it will not start.

In CCI 01-16-03 and below it was in /var/tmp. In CCI 01-17-03 and above it is in /yourdirectory/HORCM.

This directory contains UNIX "pipes" when the instances are started. The "pipes" are deleted when the instance stops. Thus you will see this:

root@SYD-E250-1:/opt/HORCM/.uds# ls -al
total 4
drwxrwxrwx 2 root sys 512 Feb 22 15:26 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..

root@SYD-E250-1:/opt/HORCM/.uds# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully

root@SYD-E250-1:/opt/HORCM/.uds# ls -al
total 6
drwxrwxrwx 3 root sys 512 Feb 22 16:29 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxrwxrwx 2 root other 512 Feb 22 16:29 lcmcl04
srwxrwxrwx 1 root other 0 Feb 22 16:29 lcmep04

Mike Le Voi Page 1 11/04/2023

Always check for this directory if you have a case where CCI does not start.

4 ln -s /opt/HORCM /HORCM

Check: You must create a link or the install in the next step will fail.

5 /HORCM/horcminstall.sh

Check: You must do this on UNIX to create links to the CCI commands.

6 raidqry -h
Here is what you see if the user has done everything right:

root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC
 -h Help/Usage
 -I[#] Set to HORCMINST#
 -IH[#] or -ITC[#] Set to HORC mode [and HORCMINST#]
 -IM[#] or -ISI[#] Set to MRCF mode [and HORCMINST#]
 -z Set to the interactive mode
 -zx Set to the interactive mode and HORCM monitoring
 -q Quit(Return to main())
 -g Specify for getting all group name on local
 -l Specify the local query
 -r <group> Specify the remote query
 -f Specify display for floatable host

Check: Always get the user to run this command and send you the output.

If the user is using an old version of CCI, ask why. 01-19-03/04 or later is preferred for many reasons, which will be discussed later.

In some special cases, like HPtM, a specific level of CCI may be stated in the ECN or Release Notes. In this case it may be advisable to stick with that level. Also, the microcode ECNs for 9900V and USP always recommend a CCI level.

However, in my experience CCI is always backwards compatible, and the developer has confirmed this, so one should always use at least the minimum level stated in the ECNs.

Any command this user issues will be assumed to be TrueCopy (refer to "for HORC" in the output above). If the user is trying to perform a ShadowImage operation, you now know why it is failing.

No instance has been set. Here is what you see if the instance variable has been set:

root@SYD-E250-1:/opt/HORCM/.uds# export HORCMINST=4
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC[4]
 -h Help/Usage

If the user is trying to control Instance 1, now you know why it is failing.

Windows

Installation is easier. Double-click the EXE and follow the bouncing ball. I always recommend taking the default of C:\HORCM. The directory is only about 10 MB in size, so we are not likely to fill the drive.

Check: The same rule applies as for UNIX. Always ask for raidqry output.

Environment Variables

Here is a ShadowImage example:

C:\HORCM\ETC>set horcminst=4

C:\HORCM\ETC>set horcc_mrcf=1

C:\HORCM\ETC>raidqry -h
Model : RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HOMRCF[4]
 -h Help/Usage
 -I[#] Set to HORCMINST#
 -IH[#] or -ITC[#] Set to HORC mode [and HORCMINST#]
 -IM[#] or -ISI[#] Set to MRCF mode [and HORCMINST#]
 -z Set to the interactive mode

Check: horcminst is case insensitive on Windows; it is case sensitive on UNIX.

Change mode of operation to TrueCopy:

C:\HORCM\ETC>set horcc_mrcf=

C:\HORCM\ETC>raidqry -h
Model : RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC[4]

Now try this on UNIX. Change mode of operation:

root@SYD-E250-1:/opt/HORCM/.uds# export HORCC_MRCF=1
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HOMRCF[4]

root@SYD-E250-1:/opt/HORCM/.uds# export HORCC_MRCF=
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HOMRCF[4]

Check: This does not work. You must do this:

root@SYD-E250-1:/opt/HORCM/.uds# unset HORCC_MRCF
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC[4]
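The UNIX behaviour above comes down to the difference between an empty variable and an unset one. Here is a minimal sketch of that test in plain sh. The function name cci_mode is made up for illustration and is not part of CCI; it only mirrors what the raidqry output above shows, namely that any set HORCC_MRCF, even an empty one, gives MRCF mode.

```shell
# cci_mode: hypothetical helper, not a CCI command.
# Prints the mode the raidqry output above suggests CCI will assume on UNIX.
cci_mode() {
    # ${VAR+set} expands to "set" when VAR exists, even if it is empty.
    if [ "${HORCC_MRCF+set}" = "set" ]; then
        echo "HOMRCF"   # ShadowImage mode
    else
        echo "HORC"     # TrueCopy mode
    fi
}
```

Run it in the user's shell before they issue a failing command. If it prints HOMRCF when they expect TrueCopy, get them to unset the variable.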

Sending logs to GSC

If you have to escalate the problem to GSC, we will need the complete set of HORCM logs and all the HORCM CONF files. In general the preferred method of doing this is to run "getconfig". These executables/scripts are available on the TUF web site (https://tuf.hds.com).

If for any reason you do not run these scripts, you must zip up all the LOG directories underneath the HORCM directory. Never pick and choose which log to upload: many of them have the same name, and GSC may need to refer to all of them.
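If the scripts cannot be run on UNIX, something like the following sketch will pick up every log directory in one go. It is only an illustration: it uses tar rather than zip, the function name collect_horcm_logs is made up, and it assumes the log directories are named log, log0, log1 and so on directly under the HORCM directory.

```shell
# collect_horcm_logs HORCM_DIR OUT_TAR   (hypothetical helper, not part of CCI)
# Archives every log* directory under HORCM_DIR into OUT_TAR.
# OUT_TAR must be an absolute path because the function changes directory.
collect_horcm_logs() {
    horcm_dir=$1
    out_tar=$2
    # Use a subshell so the caller's working directory is left alone.
    (
        cd "$horcm_dir" || exit 1
        # Take all the log directories, never a hand-picked subset.
        tar cf "$out_tar" log*
    )
}
```

For the installation used in this document that would be: collect_horcm_logs /opt/HORCM /var/tmp/horcm_logs.tar. The HORCM CONF files still have to be sent separately.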

In addition, the factory always asks for the output from these commands (Windows only):

inqraid -CLI -fgx $Phys
inqraid -CLI -fgvx $Vol
inqraid -CLI -fgx $LETALL

Finding Command Devices

You cannot create a HORCM CONF file or check it for accuracy without doing INQRAID commands for UNIX/Windows and RAIDSCAN commands for Windows.

UNIX

Check: Get the user to send you the result of this command:

root@SYD-E250-1:/opt/HORCM/.uds# ls /dev/rdsk/* | inqraid -CLI -fxg
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
c2t0d16s2 CL1-A-4 10111 0 - - - - OPEN-V-CM
c2t2d36s2 - - - - - - - -
c2t4d0s2 CL1-A-20 20169 43 - - - - OPEN-V-CM
c2t6d0s2 CL1-A-11 80025 31F - - - - OPEN-V-CM
c2t6d14s2 - - - - - - - -
c3t2d128s2 CL2-A-6 10262 2180 - - - - OPEN-V-CM
c3t3d4s2 CL2-A-7 3157 1 - - - - DF600F-CM

Here are 5 command devices. Two of them are used as examples: c2t6d0s2 and c3t3d4s2. The first is a USP; the second is a 9570V. If the user wants to use the first one, they need to code

/dev/rdsk/c2t6d0s2

in the HORCM CONF file.

Windows


Check: Get the user to send you the result of these commands:

C:\HORCM\ETC>raidscan -x findcmddev h0,20
cmddev of Ser# 10111 = \\.\PhysicalDrive2
cmddev of Ser# 10111 = \\.\PhysicalDrive5
cmddev of Ser# 41 = \\.\PhysicalDrive7
cmddev of Ser# 10262 = \\.\PhysicalDrive8
cmddev of Ser# 80025 = \\.\PhysicalDrive10
cmddev of Ser# 20169 = \\.\PhysicalDrive11
cmddev of Ser# 20169 = \\.\Volume{3c107ab6-7dbf-11db-a1ed-000e0c6abf1d}

Check: Do not use ANY of these names. If you find a user using this syntax, ask that it be changed. See INQRAID output below.

Harddisk numbers can change after a reboot. GUID numbers can change in an MS Cluster environment after reboot. Do yourself a favour: do not use these names.

C:\HORCM\ETC>inqraid $Phys -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
Harddisk0 - - - - - - - 00JS-22MHB0
Harddisk1 - - - - - - - 00JS-22MHB0
Harddisk2 CL1-A 10111 515 - - - - OPEN-V-CM
Harddisk3 CL1-A 10111 1920 - s/s/ss 9997 5:03-02 OPEN-V
Harddisk4 CL1-A 10111 768 - s/s/ss 9993 5:06-02 OPEN-V
Harddisk5 CL1-A 10111 1856 - - - - OPEN-V-CM
Harddisk6 CL1-A 10111 2632 - P/s/ss 999A 5:06-02 OPEN-V
Harddisk7 CL1-A 41 0 - - - - DF600F-CM
Harddisk8 CL1-A 10262 16 - - - - OPEN-V-CM
Harddisk9 CL1-A 10262 8320 - s/s/ss 2000 5:01-05 OPEN-V
Harddisk10 CL1-A 80025 784 - - - - OPEN-V-CM
Harddisk11 CL1-A 20169 13 - - - - OPEN-V-CM

CMD syntax has been around since 01-17-03/05. There is no reason not to use it. If the user is running 01-17-03/05 or below, get them to use 01-19-03/04 or higher.

In this case, for Harddisk8/USP 10262, the correct syntax in the HORCM CONF file is:

\\.\CMD-10262-16 or even
\\.\CMD-10262-16-CL1-A-12 if you know this is HSD 12, or
\\.\CMD-10262-16-CL1-A or, for slack people,
\\.\CMD-10262

\\.\CMD-10262-16 is my preferred coding technique, as this takes care of multipath environments as well.
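Working out the CMD name from the INQRAID listing is mechanical, so here is a rough sketch that does it with awk. The function name cmddev_names is made up for illustration. It assumes the column order of the "inqraid -CLI" listing shown above, a decimal LDEV column (that is, no -fx option), and that a command device is any row whose PRODUCT_ID ends in -CM. It prints only the CMD-serial-ldev part of the name.

```shell
# cmddev_names: hypothetical helper, not a CCI command.
# Reads "inqraid -CLI" output on stdin and prints, for every command device,
# the device name followed by its CMD-<serial>-<ldev> name.
cmddev_names() {
    # NR > 1 skips the header row; $NF is the PRODUCT_ID column.
    awk 'NR > 1 && $NF ~ /-CM$/ { printf "%s CMD-%s-%s\n", $1, $3, $4 }'
}
```

Save the user's INQRAID output to a file and feed it in on stdin, for example: cmddev_names < inqraid_output.txt (the file name is just an example).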

Starting HORCM Instances

There are so many ways for this to fail that I could write a book on this topic.

So always take the easy way out. Send the user a deck that is bound to work. If it does not, then you have very little to debug. Here is such a deck: HORCM4.CONF

UNIX


HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
10129253 11004 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
/dev/rdsk/c2t6d0s2

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

There are only 3 things to check:

Is the IP address correct? Note: You can use "localhost" here, but this will not work for TC environments using 2 different CCI servers.

Is 11004 a "free" UDP port? Almost certainly it is.

Is the CMDDEV right? You can tell that from the commands we have already issued.

UNIX HORCM CONF files are kept in /etc.

Windows

Here is HORCM8.CONF for Windows:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11008 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
\\.\CMD-10262-16

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

Use the same logic as for UNIX. Windows HORCM CONF files are in C:\WINDOWS.

Other recommendations

HDvM uses HORCM CONF files called HORCM900.CONF to HORCM988.CONF for temporary HORCM CONF files. Do not use these numbers yourself.

I suggest that you use 0-799 for user created files and 800-899 for HDvM created permanent HORCM CONF files.

I also suggest a numbering convention of 1100x, where x is the number in HORCMx.CONF. This means that you will need to "reserve" UDP ports 11000 to 11899 for HORCM CONF usage.
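The convention is simple arithmetic: port = 11000 + instance number. A small sketch that applies it and refuses numbers outside 0-899 follows. The function name horcm_port is made up, and the range check simply encodes the suggestion above; CCI itself enforces none of this.

```shell
# horcm_port N: hypothetical helper, not a CCI command.
# Prints the UDP port for HORCMN.CONF under the 11000 + N convention.
horcm_port() {
    n=$1
    # Accept plain decimal numbers only (no signs, no leading zeros).
    case $n in
        ''|*[!0-9]*|0?*) echo "not a valid instance number: $n" >&2; return 1 ;;
    esac
    # 900-988 are used by HDvM for temporary files, so stop at 899.
    if [ "$n" -gt 899 ]; then
        echo "instance $n is outside 0-899" >&2
        return 1
    fi
    echo $((11000 + $n))
}
```

For the two example decks in this document, horcm_port 4 gives 11004 and horcm_port 8 gives 11008, which match the service values coded in HORCM_MON.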

Updating the ldquoServicesrdquo file


Many people code HORCM CONF files like this:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 horcm8 1000 3000

In this case the UDP port, horcm8, must be defined in the "Services" file:

Windows: C:\WINDOWS\system32\drivers\etc\services
UNIX: /etc/services

Like this:

horcm0 11000/udp
horcm1 11001/udp
…
horcm8 11008/udp
horcm9 11009/udp
"blank line"

Check: Under Windows, if there is no blank line after horcm9 (in this example), that definition will be ignored. PS: No blank lines at the end of the HORCM CONF file, please.

Check: If you have 2 CCI servers using horcm8 and horcm9, for example, then both horcm8 and horcm9 have to be defined on both servers.
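A quick way to confirm a definition on a UNIX server is to look the name up in the services file. The sketch below does that with awk. The function name service_port is made up for illustration, and it assumes the "name port/udp" layout shown above.

```shell
# service_port NAME FILE: hypothetical helper, not a CCI command.
# Prints the UDP port defined for NAME in FILE, or nothing if it is missing.
service_port() {
    awk -v name="$1" '
        $1 == name && $2 ~ /\/udp$/ {
            sub(/\/udp$/, "", $2)   # strip the protocol suffix
            print $2
            exit
        }' "$2"
}
```

Run service_port horcm8 /etc/services and service_port horcm9 /etc/services on both servers. An empty answer on either server means the definition is missing there.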

Reading the LOGS

Windows

Let's start with Windows first this time.

In our example we used Instance 8, so you will find the log here:

C:\HORCM\log8\curlog\horcm_ml_acer510_log.txt

because this server is called ml_acer510.

Let us examine it in detail

- HORCM STARTUP LOG - Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- horcmgr started on Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- _spawnvp() horcmd_08 using horcmgr [CWD=C:\HORCM\ETC]
18:03:08-3d090-07240- Fibre address conversion TBL has been set to 2

PP : RAID Manager for WindowsNT
Model : RAID-Manager/WindowsNT
Ver&Rev : 01-19-03/04
Release : Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 18:03:08 2007

Lots of useful information here: the instance number, the CCI version, the server name and the user who started HORCM.

18:03:08-3d090-07240- horcmd_08 started on Thu Feb 22 18:03:08 2007
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
18:03:08-3d090-07240- [horcmcfgetent] fseek(top) OK
18:03:08-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where CMD syntax is converted to a physical drive number.

18:03:08-40b28-07240- [horcmcfgetent] read(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] close(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] check(conf) OK
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number.

[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR is supported
18:03:08-40b28-07240- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012db18]0000 80000000 00000000 00000000 00000000
[0x0012db28]0010 00000000 00000000 00000000 00000000
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start


180308-40b28-07240- [horcread] cmddevopen() finished180308-40b28-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- 
[HORCMCFGRDF] SLPR bitmap is checked180308-40b28-07240- [horcmcfgrdf] horccmddev(0) OK180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start180308-40b28-07240- [horcread] cmddevopen() finished180308-449a8-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256


18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MONHORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5T=1] port=CL1-A targ=1 lun=12

Here are the AL-PA for the port, plus the port, target ID and LUN.

18:03:08-449a8-07240- MON(HORC)number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF)number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user log with this one.
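Comparing two logs by eye is tedious because every line starts with a different time stamp and process number. One way round that, sketched below, is to strip the stamp from each line first and then compare. The function name strip_log_stamps is made up, and the pattern assumes each stamp looks like HH:MM:SS-hex-pid- (the colons are optional in the pattern, in case they were lost when the log was pasted).

```shell
# strip_log_stamps FILE: hypothetical helper, not a CCI command.
# Prints FILE with the leading "HH:MM:SS-xxxxx-pid- " stamp removed from each
# line, so that two startup logs can be compared line by line.
strip_log_stamps() {
    sed -E 's/^[0-9]{2}:?[0-9]{2}:?[0-9]{2}-[0-9a-f]+-[0-9]+- //' "$1"
}
```

Strip both the known-good log and the user log into temporary files, then run diff on the two results. Lines without a stamp, such as the version banner, pass through unchanged.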

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP : RAID Manager for Solaris
Model : RAID-Manager/Solaris
Ver&Rev : 01-19-03/04
Release : Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

16:30:00-11d9d-11272- horcmd_04 started on Wed Mar 7 16:30:00 2007
16:30:00-17e65-11272- [horcmcfgrdf] access(conf_file) OK
16:30:00-1c076-11272- [horcmcfgrdf] access(check) OK
16:30:00-1e127-11272- [horcmcfgrdf] open(conf_file) OK
16:30:00-29cf3-11272- [horcmcfgetent] fseek(top) OK
16:30:00-31d0e-11272- [horcmcfgetent] read(conf_file) OK
16:30:00-34856-11272- [horcmcfgrdf] close(conf_file) OK
16:30:00-389cb-11272- [horcmcfgrdf] check(conf) OK
16:30:00-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:00-5ac7f-11272- [horcread] cmddevopen() start
16:30:00-63837-11272- [horcread] cmddevopen() finished
16:30:00-6e384-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025.

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:01-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:01-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:01-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:01-c2226-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2

Here is the CMDDEV.

163001-c636e-11272- [HORCMCFGRDF] SLPR is supported163001-ca4bf-11272- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfce08]0000 80000000 00000000 00000000 00000000 [0xffbfce18]0010 00000000 00000000 00000000 00000000 163001-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163001-deb6b-11272- [horcread] cmddevopen() start163001-e2d12-11272- [horcread] cmddevopen() finished163001-e7502-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff


[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163002-77659-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163002-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked163002-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK163002-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163002-89c66-11272- [horcread] cmddevopen() start163002-8de05-11272- [horcread] cmddevopen() finished163002-925ff-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 
8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163003-07ece-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163003-0e0d4-11272- [horcmcfgrdf] seldevdata() OK163003-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes163003-16392-11272- MONHORCM_CMD=devrdskc2t6d0s2[Fibre][AL-PA=0xb2 -gt C=2T=32] port=CL1-A targ=32 lun=42

Here are the AL-PA for the port, plus the port, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC)number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF)number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)

Audit Logging


Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, and then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be what you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

@echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h0,20
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

@echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A


...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
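When a customer sends a large horcc_*.log, it can help to reduce it to "what was typed and how it ended". The following is only a minimal sketch: it assumes each CMDLINE record and each [command][exit(n)] record sits on its own line, and the function name summarize and the sample text are illustrative, not part of CCI.

```python
import re

# Matches the "[raidscan][exit(0)]" style marker seen in the command log.
EXIT_RE = re.compile(r"\[(\w+)\]\[exit\((\d+)\)\]")

def summarize(log_text):
    """Return (command line, exit code) pairs found in a command log text."""
    results = []
    pending = None
    for line in log_text.splitlines():
        if "CMDLINE" in line:
            # Keep everything after the CMDLINE tag, minus separator characters.
            pending = line.split("CMDLINE", 1)[1].lstrip(" :\t").strip()
        match = EXIT_RE.search(line)
        if match and pending is not None:
            results.append((pending, int(match.group(2))))
            pending = None
    return results

# Hypothetical sample assembled from the two log excerpts in this document.
sample = """\
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
CMDLINE : paircreate -g VG01 -vl
17:02:30-9e728-12452- [paircreate][exit(221)]
"""

for command, exit_code in summarize(sample):
    print(exit_code, command)
```

A non-zero exit code, such as 221 here, is the entry to look up in the matching horcm log.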

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example.

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port TargetID LU MU
VG01 LDEV49 CL1-A-1 1 7 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port TargetID LU MU
VG01 LDEV49 CL1-A-1 1 8 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11008


Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is best practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT ALPACTID LUNum(LDEV)PS Status LDEVP-SeqP-LDEV
CL1-A-1ef 5 1 0-0 1(13)S-VOL PAIR 13 ----- 10
CL1-A-1ef 5 1 1-0 1(29)P-VOL PSUS 29 977 309
CL1-A-1ef 5 1 2-0 1(48)P-VOL PSUS 48 977 300
CL1-A-1ef 5 1 3-0 1(309)S-VOL SSUS 309 ----- 29
CL1-A-1ef 5 1 4-0 1(310)S-VOL SSUS 310 ----- 29
CL1-A-1ef 5 1 5-0 1(308)S-VOL SSUS 308 ----- 24
CL1-A-1ef 5 1 6-0 1(305)S-VOL SSUS 305 ----- 1
CL1-A-1ef 5 1 7-0 1(49)SMPL ---- ----- ----- -----
CL1-A-1ef 5 1 8-0 1(50)SMPL ---- ----- ----- -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(LR) (PortTID LU-M) SeqLDEVPSStatus SeqP-LDEV M
VG01 LDEV49(L) (CL1-A-1 1 7-0 ) 977 49SMPL --------- ----- -
VG01 LDEV49(R) (CL1-A-1 1 8-0 ) 977 50SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR : EUserId for HOMRCF[8] : Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE : paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR:cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ] An order to the command(control) device failed, or was rejected.
[Action] Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, the error numbers -35 and 221 are meaningless on their own. If this is a RAID subsystem, check the SSB logs on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------


[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

The timestamp prefix 17:02:30-9a8a8 is the cross-check between the command log and this log. Next, it is not obvious, but the error code is

961C 000D

Now get hold of the latest AMS CCI manual which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
Code | Error Contents | Recommended Action
961C 000C | The S-VOL is a Sub LU of a unified LU. | Check the status of the LU.
961C 000D | The default controllers controlling the P-VOL and S-VOL are not the same. | ...
961C 000E | The P-VOL is a Cache Residency LU. | Check the status of ...

In this case the PVOL and SVOL default controllers are not the same
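The step from the hex dump to "961C 000D" can be sketched in code. This assumes the dump is standard fixed-format SCSI sense data (sense key in byte 2, ASC and ASCQ in bytes 12 and 13), and that the manual's two-part code is ASC/ASCQ followed by the low 16 bits of the SSB word in bytes 8 to 11. That pairing is inferred from this one example, and decode_sense is an illustrative name, not a CCI tool.

```python
def decode_sense(hex_words):
    """Decode the first 16 bytes of fixed-format SCSI sense data."""
    data = bytes.fromhex("".join(hex_words.split()))
    if len(data) < 14 or (data[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    skey = data[2] & 0x0F                     # sense key (SKEY in the log)
    ssb = int.from_bytes(data[8:12], "big")   # SSB word as printed in the log
    asc, ascq = data[12], data[13]            # additional sense code/qualifier
    # Assumed layout of the manual's lookup key: ASC+ASCQ, then SSB low half.
    code = f"{asc:02X}{ascq:02X} {ssb & 0xFFFF:04X}"
    return skey, asc, ascq, ssb, code

# First row of the sense data shown above.
skey, asc, ascq, ssb, code = decode_sense("70000500 00000038 8400000d 961c0000")
print(f"SKEY=0x{skey:02x} ASC=0x{asc:02x} SSB=0x{ssb:08x} code={code}")
```

The three decoded values match the SKEY, ASC and SSB lines that HORCM prints under the dump, which is a useful sanity check before looking the code up in the manual.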

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With 9900V, USP etc. the LUNs are normally considered to be attached to "logical" ports, which are called HSDs or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added and deleted from various HSDs. Here is an example.

Imagine that 3 HSDs are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later: HSD 2 is deleted, and HSD 4 is added with LUNs 0 and 1.

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that


HSD 1 LUN 3 was "absolute" LUN 5 and HSD 3 LUN 3 was "absolute" LUN 9.

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.
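The HSD example can be replayed in a few lines. The sketch below assumes the subsystem always hands out the lowest free "absolute" LUN number on the port. That rule is inferred from the numbers in the example, not taken from any Hitachi documentation, and the function names are illustrative.

```python
def allocate(table, hsd, rel_lun):
    """Give (hsd, rel_lun) the lowest absolute LUN number not in use."""
    used = set(table.values())
    absolute = 0
    while absolute in used:
        absolute += 1
    table[(hsd, rel_lun)] = absolute
    return absolute

def delete_hsd(table, hsd):
    """Free every absolute LUN that belonged to the given HSD."""
    for key in [k for k in table if k[0] == hsd]:
        del table[key]

table = {}
for hsd in (1, 2, 3):            # three HSDs, each with LUNs 0, 1 and 2
    for lun in (0, 1, 2):
        allocate(table, hsd, lun)

delete_hsd(table, 2)             # some time later: delete HSD 2
allocate(table, 4, 0)            # add HSD 4 with LUNs 0 and 1
allocate(table, 4, 1)

a = allocate(table, 1, 3)        # then allocate LUN 3 to HSD 1 and HSD 3
b = allocate(table, 3, 3)
print("HSD 1 LUN 3 -> absolute", a)
print("HSD 3 LUN 3 -> absolute", b)
```

Under that assumption the replay lands on absolute LUNs 5 and 9, the same values as in the example, which is why nobody can guess them without knowing the full history of the port.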

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F 0 45)32179 10b5S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C 0 4)32208 1003P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F 0 46)32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C 0 5)32208 1004P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F 0 47)32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C 0 6)32208 1005P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F 0 48)32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C 0 7)32208 1006P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F 0 49)32179 108aS-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C 0 8)32208 1007P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches, in yellow. What is less obvious is that the turquoise and green pairs are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).

Here is an excerpt from the "old" HORCM CONF file, using "absolute" LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and compare with Storage Navigator.

By the way, here is how you find out the "absolute" and "relative" LUN numbers:

raidscan -p CL2-F -fx
CL2-F 88 3 0 49.1(108a)S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50.1(108b)S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51.1(108c)S-VOL PAIR ASYNC 108c ----- 1005


raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6.1(108a)S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7.1(108b)S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8.1(108c)S-VOL PAIR ASYNC 108c ----- 1005

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV you can control CCI for any LUNs on any host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a "normal" CMDDEV, and you always give normal users access to a secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10*)
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(LR) (PortTID LU)SeqLDEVPSStatusFenceSeqP-LDEV M
VG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 410S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 411S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 412S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D 1 413)77010027 413P-VOL PAIR NEVER 75010010 413 -


VG01 d3(R) (CL1-A 1 413)75010010 413S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D 1 414)77010027 414P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 414S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(LR) (PortTID LU)SeqLDEVPSStatusFenceSeqP-LDEV M
VG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(LR) (PortTID LU)SeqLDEVPSStatusFenceSeqP-LDEV M
VG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG HM12 SSID RGroup PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - Psss 0000 A07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - Psss 0000 A07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - Psss 0000 502-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - Psss 0000 A07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - Psss 0000 502-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - Psss 0000 A07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - Psss 0000 A07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The lines for LDEVs 413 and 414 (bold in the original) show that they are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
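The range check behind this result is simple enough to script when you are reviewing a customer's permission file. The sketch below handles only the single "hdN-M" form shown in this document. Other entry forms may exist and are not modelled, and parse_hd_range is an illustrative helper, not part of CCI.

```python
import re

def parse_hd_range(entry):
    """Turn an entry such as 'hd0-56' into the range of disk numbers it covers."""
    match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", entry.strip())
    if not match:
        raise ValueError(f"unsupported entry: {entry!r}")
    first = int(match.group(1))
    last = int(match.group(2) or match.group(1))
    return range(first, last + 1)

allowed = parse_hd_range("hd0-56")

# Disk numbers taken from the Dsk column of the inqraid output above.
for disk in (54, 55, 56, 57, 58):
    status = "allowed" if disk in allowed else "NOT allowed"
    print(f"Harddisk{disk}: {status}")
```

Disks 57 and 58 fall outside the range, which matches the two local LDEVs that disappeared from the pairdisplay.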

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(LR) (PortTID LU)SeqLDEVPSStatusFenceSeqP-LDEV M
VG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414P-VOL PAIR NEVER 75010010 414 -


VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get.

Windows

[System Call Error]
SysCall: bind
WSAerr: 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the WSAerr line above it. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049): Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.
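The arithmetic from the WSAerr value back to a symbolic name is easy to get wrong by eye, so here is a small sketch of it. The table holds only the two codes that appear in this document, with their texts as quoted here. The function explain is illustrative and does not call Winsock.

```python
WSABASEERR = 10000

# Offsets from WSABASEERR, as defined in winsock2.h and quoted in this document.
KNOWN = {
    49: ("WSAEADDRNOTAVAIL", "Cannot assign requested address"),
    13: ("WSAEACCES", "Permission denied"),
}

def explain(wsaerr):
    """Describe a WSAerr value from a HORCM [System Call Error] block."""
    offset = wsaerr - WSABASEERR
    name, text = KNOWN.get(offset, ("unknown", "not in this table"))
    return f"{wsaerr} (0x{wsaerr:08x}) = WSABASEERR+{offset}: {name}, {text}"

print(explain(10049))
print(explain(int("0x0000271d", 16)))   # the hex form printed in the log works too
```

For any code not in the table, look up the offset in winsock2.h or on the web page above.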


So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmcc
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

EADDRNOTAVAIL
AIX 4351: 68 Can't assign requested address
HP-UX 1122: 227 Can't assign requested address
Solaris 910: 126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread(): cannot open command device: CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to "horcm42".

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr: 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013): Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State
UDP ml_acer510:microsoft-ds *:*
UDP ml_acer510:isakmp *:*
UDP ml_acer510:1030 *:*
...
UDP ml_acer510:54323 *:*

It is not likely that you will be this lucky.
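If the customer has a scripting language on the server, a direct test is quicker than reading netstat output. The sketch below simply tries to bind a UDP socket and reports whether that worked. It assumes the address to test is the one coded in HORCM_MON (127.0.0.1 is only a placeholder default), and udp_port_is_free is an illustrative helper, not part of CCI.

```python
import socket

def udp_port_is_free(port, host="127.0.0.1"):
    """Return True if a UDP socket can be bound to host:port right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError:
        # Covers "address in use", "permission denied" and
        # "cannot assign requested address" alike.
        return False
    finally:
        sock.close()
    return True

print("port 11042 free:", udp_port_is_free(11042))
```

Run it while HORCM is stopped. A False result means the bind HORCM performs at start-up would fail for the same address and port.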


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


drwxrwxrwx 2 root other 512 Feb 22 16:29 lcmcl04
srwxrwxrwx 1 root other 0 Feb 22 16:29 lcmep04

Always check for this directory if you have a case where CCI does not start.

4 ln -s /opt/HORCM /HORCM

Check: You must create a link or the install in the next step will fail.

5 /HORCM/horcminstall.sh

Check: You must do this on UNIX to create links to the CCI commands.

6 raidqry -h Here is what you see if the user has done everything right:

root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC
 -h Help/Usage
 -I[#] Set to HORCMINST#
 -IH[#] or -ITC[#] Set to HORC mode [and HORCMINST#]
 -IM[#] or -ISI[#] Set to MRCF mode [and HORCMINST#]
 -z Set to the interactive mode
 -zx Set to the interactive mode and HORCM monitoring
 -q Quit(Return to main())
 -g Specify for getting all group name on local
 -l Specify the local query
 -r <group> Specify the remote query
 -f Specify display for floatable host

Check: Always get the user to run this command and send you the output.

If the user is using an old version of CCI, ask why. 01-19-03/04 or later is preferred for many reasons, which will be discussed later.

In some special cases, like HPtM, a specific level of CCI may be stated in the ECN or Release Notes. In this case it may be advisable to stick with that level. Also, the microcode ECNs for 9900V and USP always recommend a CCI level.

However, in my experience CCI is always backwards compatible, and the developer has confirmed this, so one should always use at least the minimum level stated in ECNs.

Any command this user issues will be assumed to be TrueCopy (see "for HORC" in the Usage line above). If the user is trying to perform a ShadowImage operation, you now know why it is failing.

No instance has been set. Here is what you see if the instance variable has been set:

root@SYD-E250-1:/opt/HORCM/.uds# export HORCMINST=4
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC[4]
 -h Help/Usage


If the user is trying to control Instance 1, now you know why it is failing.

Windows

Installation is easier. Double-click the EXE and follow the bouncing ball. I always recommend taking the default of C:\HORCM. The directory is only about 10 MB in size, so we are not likely to fill the drive.

Check: The same rule applies as for UNIX. Always ask for raidqry output.

Environment Variables

Here is a ShadowImage example:

C:\HORCM\ETC>set horcminst=4

C:\HORCM\ETC>set horcc_mrcf=1

C:\HORCM\ETC>raidqry -h
Model : RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HOMRCF[4]
 -h Help/Usage
 -I[#] Set to HORCMINST#
 -IH[#] or -ITC[#] Set to HORC mode [and HORCMINST#]
 -IM[#] or -ISI[#] Set to MRCF mode [and HORCMINST#]
 -z Set to the interactive mode

Check: horcminst is case-insensitive on Windows. It is case-sensitive on UNIX.

Change mode of operation to TrueCopy:

C:\HORCM\ETC>set horcc_mrcf=

C:\HORCM\ETC>raidqry -h
Model : RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC[4]

Now try this on UNIX. Change mode of operation:

root@SYD-E250-1:/opt/HORCM/.uds# export HORCC_MRCF=1
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HOMRCF[4]

root@SYD-E250-1:/opt/HORCM/.uds# export HORCC_MRCF=
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HOMRCF[4]


Check: This does not work. Exporting an empty value leaves the variable defined. You must do this:

root@SYD-E250-1:/opt/HORCM/.uds# unset HORCC_MRCF
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage : raidqry [options] for HORC[4]
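The difference between "set to empty" and "unset" is the whole story here. The sketch below models it with a plain dictionary standing in for the process environment. It assumes, based only on the raidqry output above, that CCI on UNIX tests whether HORCC_MRCF exists rather than what it contains. The function cci_mode is illustrative, not part of CCI.

```python
def cci_mode(env):
    """Model of the observed UNIX behaviour: presence of HORCC_MRCF selects HOMRCF."""
    return "HOMRCF" if "HORCC_MRCF" in env else "HORC"

env = {"HORCMINST": "4", "HORCC_MRCF": "1"}
print("export HORCC_MRCF=1 ->", cci_mode(env))

env["HORCC_MRCF"] = ""      # what "export HORCC_MRCF=" does: defined but empty
print("export HORCC_MRCF=  ->", cci_mode(env))

del env["HORCC_MRCF"]       # what "unset HORCC_MRCF" does: removed entirely
print("unset HORCC_MRCF    ->", cci_mode(env))
```

On Windows, "set horcc_mrcf=" removes the variable altogether, which is why the same-looking command behaves differently there.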

Sending logs to GSC

If you have to escalate the problem to GSC, we will need the complete set of HORCM logs and all the HORCM CONF files. In general, the preferred method of doing this is to run "getconfig". These executables/scripts are available on the TUF web site (https://tuf.hds.com).

If for any reason you do not run these scripts, you must zip up all the LOG directories underneath the HORCM directory. Never pick and choose which log to upload: many of them have the same name, and GSC may need to refer to all of them.

In addition, the factory always asks for the output from these commands (Windows only):

inqraid -CLI -fgx $Phys
inqraid -CLI -fgvx $Vol
inqraid -CLI -fgx $LETALL

Finding Command Devices

You cannot create a HORCM CONF file or check it for accuracy without doing INQRAID commands for UNIX/Windows and RAIDSCAN commands for Windows.

UNIX

Check: Get the user to send you the result of this command:

root@SYD-E250-1:/opt/HORCM/.uds# ls /dev/rdsk/* | inqraid -CLI -fxg
DEVICE_FILE PORT SERIAL LDEV CTG HM12 SSID RGroup PRODUCT_ID
c2t0d16s2 CL1-A-4 10111 0 - - - - OPEN-V-CM
c2t2d36s2 - - - - - - - -
c2t4d0s2 CL1-A-20 20169 43 - - - - OPEN-V-CM
c2t6d0s2 CL1-A-11 80025 31F - - - - OPEN-V-CM
c2t6d14s2 - - - - - - - -
c3t2d128s2 CL2-A-6 10262 2180 - - - - OPEN-V-CM
c3t3d4s2 CL2-A-7 3157 1 - - - - DF600F-CM

Here are 5 command devices; 2 are in bold. The first is a USP and the second is a 9570V. If the user wants to use the first one, they need to code

/dev/rdsk/c2t6d0s2

in the HORCM CONF file.

Windows


Check: Get the user to send you the result of these commands:

C:\HORCM\ETC>raidscan -x findcmddev h020
cmddev of Ser 10111 = PhysicalDrive2
cmddev of Ser 10111 = PhysicalDrive5
cmddev of Ser 41 = PhysicalDrive7
cmddev of Ser 10262 = PhysicalDrive8
cmddev of Ser 80025 = PhysicalDrive10
cmddev of Ser 20169 = PhysicalDrive11
cmddev of Ser 20169 = Volume{3c107ab6-7dbf-11db-a1ed-000e0c6abf1d}

Check: Do not use ANY of these names. If you find a user using this syntax, ask that it be changed. See the INQRAID output below.

Harddisk numbers can change after a reboot. GUID numbers can change in an MS Cluster environment after a reboot. Do yourself a favour: do not use these names.

C:\HORCM\ETC>inqraid $Phys -CLI
DEVICE_FILE PORT SERIAL LDEV CTG HM12 SSID RGroup PRODUCT_ID
Harddisk0 - - - - - - - 00JS-22MHB0
Harddisk1 - - - - - - - 00JS-22MHB0
Harddisk2 CL1-A 10111 515 - - - - OPEN-V-CM
Harddisk3 CL1-A 10111 1920 - ssss 9997 503-02 OPEN-V
Harddisk4 CL1-A 10111 768 - ssss 9993 506-02 OPEN-V
Harddisk5 CL1-A 10111 1856 - - - - OPEN-V-CM
Harddisk6 CL1-A 10111 2632 - Psss 999A 506-02 OPEN-V
Harddisk7 CL1-A 41 0 - - - - DF600F-CM
Harddisk8 CL1-A 10262 16 - - - - OPEN-V-CM
Harddisk9 CL1-A 10262 8320 - ssss 2000 501-05 OPEN-V
Harddisk10 CL1-A 80025 784 - - - - OPEN-V-CM
Harddisk11 CL1-A 20169 13 - - - - OPEN-V-CM

CMD syntax has been around since 01-17-03/05. There is no reason not to use it. If the user is running 01-17-03/05 or below, get them to use 01-19-03/04 or higher.

In this case, for Harddisk8 (USP 10262), the correct syntax in the HORCM CONF file is:

CMD-10262-16 or even
CMD-10262-16-CL1-A-12 if you know this is HSD 12 - or
CMD-10262-16-CL1-A or, for slack people,
CMD-10262

CMD-10262-16 is my preferred coding technique, as this takes care of multipath environments as well.
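The four forms above follow one pattern: serial number, then optionally the LDEV, then optionally the port and host group. A small sketch that builds the name from the SERIAL, LDEV and PORT columns of inqraid is shown below. It produces only the part of the name shown in this document; any platform-specific device prefix is outside its scope, and cmd_name is an illustrative helper, not part of CCI.

```python
def cmd_name(serial, ldev=None, port=None, host_group=None):
    """Build a CMD-style command device name from inqraid columns."""
    parts = ["CMD", str(serial)]
    if ldev is not None:
        parts.append(str(ldev))
        if port:
            parts.append(port)
            if host_group is not None:
                parts.append(str(host_group))
    return "-".join(parts)

# Harddisk8 from the inqraid output above: serial 10262, LDEV 16, port CL1-A.
print(cmd_name(10262, 16))
print(cmd_name(10262, 16, "CL1-A"))
print(cmd_name(10262, 16, "CL1-A", 12))   # host group 12 is the example value used above
print(cmd_name(10262))
```

Building the name from the serial and LDEV rather than from a Harddisk number is what keeps it stable across reboots.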

Starting HORCM Instances

There are so many ways for this to fail that I could write a book on the topic.

So always take the easy way out. Send the user a deck that is bound to work. If it does not, then you have very little to debug. Here is such a deck, HORCM4.CONF.

UNIX


HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
10129253 11004 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
/dev/rdsk/c2t6d0s2

HORCM_DEV
#dev_group dev_name port TargetID LU MU

HORCM_INST
#dev_group ip_address service

There are only 3 things to check:

Is the IP address correct? Note: You can use "localhost" here, but this will not work for TC environments using 2 different CCI servers.

Is 11004 a "free" UDP port? Almost certainly it is.

Is the CMDDEV right? You can tell that from the commands we have already issued.

UNIX HORCM CONF files are kept in /etc.

Windows

Here is HORCM8.CONF for Windows:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11008 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
CMD-10262-16

HORCM_DEV
#dev_group dev_name port TargetID LU MU

HORCM_INST
#dev_group ip_address service

Use the same logic as for UNIX. Windows HORCM CONF files are in C:\WINDOWS.

Other recommendations

HDvM uses HORCM CONF files called HORCM900.CONF to HORCM988.CONF for temporary HORCM CONF files. Do not use these numbers yourself.

I suggest that you use 0-799 for user-created files and 800-899 for HDvM-created permanent HORCM CONF files.

I also suggest a port numbering convention of 1100x, where x is the number in HORCMx.CONF. This means that you will need to "reserve" UDP ports 11000 to 11899 for HORCM CONF usage.
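The convention is simply "port = 11000 + instance number", restricted to instances 0-899. A sketch of it, including the matching services-file entry, might look like this. The helper names are illustrative; the limits come from the recommendations above.

```python
BASE_PORT = 11000
MAX_INSTANCE = 899   # 0-799 user files, 800-899 HDvM permanent files

def horcm_port(instance):
    """UDP port for HORCM<instance>.CONF under the 1100x convention."""
    if not 0 <= instance <= MAX_INSTANCE:
        raise ValueError(f"instance {instance} is outside 0-{MAX_INSTANCE}")
    return BASE_PORT + instance

def services_line(instance):
    """Matching entry for the services file, e.g. 'horcm8 11008/udp'."""
    return f"horcm{instance} {horcm_port(instance)}/udp"

for instance in (0, 4, 8, 9, 410):
    print(services_line(instance))
```

With this convention anyone reading a HORCM_MON or HORCM_INST line can tell at a glance which instance a port belongs to.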

Updating the "Services" file


Many people code HORCM CONF files like this:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 horcm8 1000 3000

In this case the UDP port name, horcm8, must be defined in the "Services" file:

Windows: C:\WINDOWS\system32\drivers\etc\services
UNIX: /etc/services

Like this:

horcm0 11000/udp
horcm1 11001/udp
...
horcm8 11008/udp
horcm9 11009/udp
"blank line"

Check: Under Windows, if there is no blank line after horcm9 (in this example), that definition will be ignored. PS: No blank lines at the end of the HORCM CONF file please.

Check: If you have 2 CCI servers using horcm8 and horcm9, for example, then both horcm8 and horcm9 have to be defined on both servers.
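Both checks can be automated against a copy of the customer's services file. The sketch below parses "name port/udp" entries, reports which required names are missing, and reports whether the text ends with a line terminator. That last test is only a simple stand-in for the "blank line after the last entry" rule, and check_services is an illustrative helper, not part of CCI.

```python
def check_services(text, names):
    """Return (missing service names, whether the text ends with a newline)."""
    defined = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()     # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[1].endswith("/udp"):
            defined[fields[0]] = int(fields[1].split("/")[0])
    missing = [name for name in names if name not in defined]
    return missing, text.endswith("\n")

# Hypothetical file content: horcm9 is the last line and has no terminator.
sample = "horcm0 11000/udp\nhorcm8 11008/udp\nhorcm9 11009/udp"

missing, terminated = check_services(sample, ["horcm8", "horcm9"])
print("missing:", missing)
print("ends with newline:", terminated)
```

Run the same check on both CCI servers, passing the service names used in the HORCM_MON and HORCM_INST sections of each CONF file.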

Reading the LOGS

Windows

Let's start with Windows this time.

In our example we used Instance 8, so you will find the log here:

C:\HORCM\log8\curlog\horcm_ml_acer510_log.txt

because this server is called ml_acer510.

Let us examine it in detail.

- HORCM STARTUP LOG - Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- horcmgr started on Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- _spawnvp() horcmd_08 using horcmgr [CWD=C:\HORCM\ETC]
18:03:08-3d090-07240- Fibre address conversion TBL has been set to 2

PP : RAID Manager for WindowsNT
Model : RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Release: Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006 Hitachi Ltd.


HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 180308 2007

Lots of useful information here, such as the CCI version, the server name and the user who started HORCM.

18:03:08-3d090-07240- horcmd_08 started on Thu Feb 22 18:03:08 2007
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
18:03:08-3d090-07240- [horcmcfgetent] fseek(top) OK
18:03:08-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where CMD syntax is converted to a physical drive number.

18:03:08-40b28-07240- [horcmcfgetent] read(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] close(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] check(conf) OK
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number.

[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR is supported
18:03:08-40b28-07240- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012db18]0000 80000000 00000000 00000000 00000000
[0x0012db28]0010 00000000 00000000 00000000 00000000
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start

18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR bitmap is checked
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-449a8-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256

18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MONHORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5T=1] port=CL1-A targ=1 lun=12

Here is the AL-PA for the Port, and the Port, target ID and LUN.

18:03:08-449a8-07240- MON(HORC)number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF)number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user log with this one.
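Comparing a customer log with a known-good one by eye is tedious, because every line starts with a different timestamp and process ID. The Python sketch below strips that prefix and lists the "OK" steps of the reference log that never appear in the customer log. The helper names and the customer log line are invented for the example; the prefix pattern is inferred from the log above.

```python
import re

# hh:mm:ss-xxxxx-ppppp- prefix; the colons are optional so that logs pasted
# without them still match.
PREFIX = re.compile(r"^\d{2}:?\d{2}:?\d{2}-[0-9a-f]{5}-\d{5}-\s*")


def steps(log_text):
    """Return the message part of every prefixed log line, in order."""
    out = []
    for line in log_text.splitlines():
        m = PREFIX.match(line)
        if m:
            out.append(line[m.end():].strip())
    return out


def missing_ok_steps(reference_log, user_log):
    """List 'OK' steps present in the reference log but absent from the user log."""
    seen = set(steps(user_log))
    return [s for s in steps(reference_log) if s.endswith(" OK") and s not in seen]


REFERENCE = """\
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
"""

# Hypothetical customer log that stops after the first step.
USER = "09:15:00-1a2b3-04321- [horcmcfgrdf] access(conf_file) OK\n"

if __name__ == "__main__":
    for step in missing_ok_steps(REFERENCE, USER):
        print("missing:", step)
```

The first missing step is usually the place to start looking.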

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP: RAID Manager for Solaris
Model: RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Release: Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006, Hitachi, Ltd.

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

16:30:00-11d9d-11272- horcmd_04 started on Wed Mar 7 16:30:00 2007
16:30:00-17e65-11272- [horcmcfgrdf] access(conf_file) OK
16:30:00-1c076-11272- [horcmcfgrdf] access(check) OK
16:30:00-1e127-11272- [horcmcfgrdf] open(conf_file) OK
16:30:00-29cf3-11272- [horcmcfgetent] fseek(top) OK
16:30:00-31d0e-11272- [horcmcfgetent] read(conf_file) OK
16:30:00-34856-11272- [horcmcfgrdf] close(conf_file) OK
16:30:00-389cb-11272- [horcmcfgrdf] check(conf) OK
16:30:00-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:00-5ac7f-11272- [horcread] cmddevopen() start
16:30:00-63837-11272- [horcread] cmddevopen() finished
16:30:00-6e384-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM

[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

This is an NSC55 with a Serial Number of 80025.

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:01-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:01-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:01-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:01-c2226-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2

Here is the CMDDEV.

16:30:01-c636e-11272- [HORCMCFGRDF] SLPR is supported
16:30:01-ca4bf-11272- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfce08]0000 80000000 00000000 00000000 00000000
[0xffbfce18]0010 00000000 00000000 00000000 00000000
16:30:01-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:01-deb6b-11272- [horcread] cmddevopen() start
16:30:01-e2d12-11272- [horcread] cmddevopen() finished
16:30:01-e7502-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff

[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:02-77659-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:02-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked
16:30:02-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:02-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:02-89c66-11272- [horcread] cmddevopen() start
16:30:02-8de05-11272- [horcread] cmddevopen() finished
16:30:02-925ff-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:03-07ece-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:03-0e0d4-11272- [horcmcfgrdf] seldevdata() OK
16:30:03-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
16:30:03-16392-11272- MONHORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2T=32] port=CL1-A targ=32 lun=42

Here is the AL-PA for the Port, and the Port, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC)number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF)number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)
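In both logs the array serial number sits in the hex dump row at offset 0030, stored as EBCDIC digits (f0 to f9), which is why it is easy to miss if the CHAR column is damaged. The short Python sketch below decodes it from the hex words. The function name is made up for this example, and the assumption that the first three words of that row always hold the serial is taken from the two dumps shown here, nothing more.

```python
# Decode the serial number from the 0030 row of a horcread hex dump.
# Assumption: the first three 32-bit words of the row are 12 EBCDIC digits.

def serial_from_row(row):
    """Return the serial number encoded in a dump row such as
    '[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 ...'."""
    words = row.split()[1:4]                      # skip the address/offset field
    digits = bytes.fromhex("".join(words)).decode("cp037")  # cp037 = EBCDIC
    return int(digits)


if __name__ == "__main__":
    windows_row = "[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262"
    solaris_row = "[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025"
    print(serial_from_row(windows_row))
    print(serial_from_row(solaris_row))
```

Decoding from the hex words is handy when a customer sends a log in which the CHAR column has been mangled by an e-mail client or a word processor.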

Audit Logging

Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be the ones you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

@echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

@echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A

...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of LOG file horcc_SYD-E250-1.log:

COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:50:36 2007
CMDLINE raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:51:53 2007
CMDLINE /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
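With full logging enabled, the command log becomes a simple audit trail: each command line is followed by a line carrying its exit code. The Python sketch below pairs them up. It is a hypothetical helper written against the log excerpts in this document; the two regular expressions are assumptions about the layout and may need adjusting for other CCI levels.

```python
import re

# "CMDLINE" line (an optional colon after the keyword is tolerated) and the
# "[command][exit(n)]" line that follows it.
CMDLINE = re.compile(r"^CMDLINE\s*:?\s*(.+)$")
EXIT = re.compile(r"\[(\w+)\]\[exit\((\d+)\)\]")


def commands(log_text):
    """Return [(command line, exit code), ...] in the order they were logged."""
    result = []
    pending = None
    for line in log_text.splitlines():
        m = CMDLINE.match(line.strip())
        if m:
            pending = m.group(1).strip()
            continue
        m = EXIT.search(line)
        if m and pending is not None:
            result.append((pending, int(m.group(2))))
            pending = None
    return result


def failed(log_text):
    """Return only the commands that ended with a non-zero exit code."""
    return [(cmd, rc) for cmd, rc in commands(log_text) if rc != 0]


SAMPLE = """\
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:50:36 2007
CMDLINE raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:51:53 2007
CMDLINE /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""

if __name__ == "__main__":
    for cmd, rc in commands(SAMPLE):
        print(rc, cmd)
```

Remember that this only recovers what was typed and how it ended; the command output itself is not in the log.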

Command Device Reject

Most CCI errors are self-explanatory; however, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example:

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000
HORCM_CMD
CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000
HORCM_CMD
CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11008
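The two files have to agree with each other: the service in each HORCM_INST must be the one the other instance listens on in its HORCM_MON, and each group must exist on both sides. The Python sketch below checks exactly that for a pair of files laid out one entry per line. The parser and the function names are invented for this example, and it compares services as plain strings, so it will not notice that a service name and a port number refer to the same port.

```python
SECTIONS = ("HORCM_MON", "HORCM_CMD", "HORCM_DEV", "HORCM_INST")


def parse_conf(text):
    """Split a HORCM CONF file into {section: [fields, ...]}, skipping comments."""
    conf = {name: [] for name in SECTIONS}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line in SECTIONS:
            current = line
        elif current is not None:
            conf[current].append(line.split())
    return conf


def peer_problems(local, remote):
    """Cross-check one parsed HORCM CONF file against its peer."""
    problems = []
    remote_service = remote["HORCM_MON"][0][1]
    remote_groups = {row[0] for row in remote["HORCM_DEV"]}
    for row in local["HORCM_INST"]:
        group, service = row[0], row[2]
        if service != remote_service:
            problems.append("%s points at service %s but the peer listens on %s"
                            % (group, service, remote_service))
        if group not in remote_groups:
            problems.append("%s is not defined in the peer HORCM_DEV" % group)
    for row in local["HORCM_DEV"]:
        if len(row) < 6:                      # no MU# column
            problems.append("%s %s has no MU#" % (row[0], row[1]))
    return problems


HORCM8 = """\
HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000
HORCM_CMD
CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11009
"""

HORCM9 = """\
HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000
HORCM_CMD
CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11008
"""

if __name__ == "__main__":
    print(peer_problems(parse_conf(HORCM8), parse_conf(HORCM9)))
```

Run the check in both directions, since each file names the other as its peer.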

Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT# ALPA C TID# LU# Num(LDEV#) P/S Status LDEV# P-Seq# P-LDEV#
CL1-A-1 ef 5 1 0-0 1(13) S-VOL PAIR 13 ----- 10
CL1-A-1 ef 5 1 1-0 1(29) P-VOL PSUS 29 977 309
CL1-A-1 ef 5 1 2-0 1(48) P-VOL PSUS 48 977 300
CL1-A-1 ef 5 1 3-0 1(309) S-VOL SSUS 309 ----- 29
CL1-A-1 ef 5 1 4-0 1(310) S-VOL SSUS 310 ----- 29
CL1-A-1 ef 5 1 5-0 1(308) S-VOL SSUS 308 ----- 24
CL1-A-1 ef 5 1 6-0 1(305) S-VOL SSUS 305 ----- 1
CL1-A-1 ef 5 1 7-0 1(49) SMPL ---- ----- ----- -----
CL1-A-1 ef 5 1 8-0 1(50) SMPL ---- ----- ----- -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID,LU-M) Seq# LDEV# P/S Status Seq# P-LDEV# M
VG01 LDEV49(L) (CL1-A-1, 1, 7-0) 977 49 SMPL ---- ----- ----- -
VG01 LDEV49(R) (CL1-A-1, 1, 8-0) 977 50 SMPL ---- ----- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ] An order to the command(control) device failed, or was rejected.
[Action] Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, meaningless error numbers like -35 and 221. If this is a RAID subsystem, check the SSB logs on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------

[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

The timestamp 17:02:30-9a8a8 is the cross-check: the same value appears in the command log entry for the failing paircreate. Next, it is not obvious, but the error code is:

961C 000D
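To see where 961C 000D comes from, read the first row of the sense data as standard fixed-format SCSI sense: the sense key is the low nibble of byte 2, bytes 8-11 hold the SSB, and bytes 12 and 13 hold the ASC and ASCQ. The Python sketch below does that decoding. Combining ASC/ASCQ with the low 16 bits of the SSB to form the manual's lookup code is inferred from this single example, so treat that last step as an assumption.

```python
# Decode the first 16 bytes of the SCSI sense data shown in the HORCM log.

def decode_sense(hex_words):
    """Return (sense key, ASC, SSB, lookup code) from the 0000 row of the dump."""
    data = bytes.fromhex("".join(hex_words))
    skey = data[2] & 0x0F                      # sense key
    ssb = int.from_bytes(data[8:12], "big")    # logged as SSB
    asc, ascq = data[12], data[13]
    # Assumption: the manual's code is ASC+ASCQ followed by the low half of the SSB.
    code = "%02X%02X %04X" % (asc, ascq, ssb & 0xFFFF)
    return skey, asc, ssb, code


if __name__ == "__main__":
    row = ["70000500", "00000038", "8400000d", "961c0000"]
    skey, asc, ssb, code = decode_sense(row)
    print("SKEY = 0x%02x ASC = 0x%02x SSB = 0x%08x" % (skey, asc, ssb))
    print("Look up:", code)
```

The first three values match the SKEY, ASC and SSB lines that CCI prints under the dump, which is a useful sanity check before trusting the derived code.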

Now get hold of the latest AMS CCI manual, which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection:
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
           Error Contents                                                              Recommended Action
961C 000C  The S-VOL is a Sub LU of a unified LU.                                      Check the status of the LU.
961C 000D  The default controllers controlling the P-VOL and S-VOL are not the same.   ...
961C 000E  The P-VOL is a Cache Residency LU.                                          Check the status of ...

In this case, the PVOL and SVOL default controllers are not the same.

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With 9900V, USP etc. the LUNs are normally considered to be attached to "logical" ports, which are called HSDs or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSDs. Here is an example.

Imagine that 3 HSDs are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later: delete HSD 2, then add HSD 4 with LUNs 0 and 1.

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that:

HSD 1 LUN 3 was "absolute" LUN 5
HSD 3 LUN 3 was "absolute" LUN 9
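The example is easier to follow if you replay it. The Python sketch below models a port that gives each new LUN the lowest free "absolute" number, which is the behaviour the example above implies; that allocation rule is an assumption drawn from this document, not a statement about the microcode. The class and method names are made up.

```python
# Replay of the HSD example, assuming "lowest free absolute LUN" allocation.

class Port:
    def __init__(self):
        self.absolute = {}          # absolute LUN -> (HSD, relative LUN)

    def add_lun(self, hsd, relative):
        """Allocate a relative LUN in an HSD; return its absolute LUN number."""
        n = 0
        while n in self.absolute:
            n += 1
        self.absolute[n] = (hsd, relative)
        return n

    def delete_hsd(self, hsd):
        """Remove every LUN that belongs to the given HSD."""
        for n in [n for n, (h, _) in self.absolute.items() if h == hsd]:
            del self.absolute[n]


def replay():
    port = Port()
    for hsd in (1, 2, 3):                 # HSD 1-3, LUNs 0-2 each
        for lun in (0, 1, 2):
            port.add_lun(hsd, lun)
    port.delete_hsd(2)                    # frees absolute LUNs 3-5
    port.add_lun(4, 0)                    # HSD 4 reuses 3 and 4
    port.add_lun(4, 1)
    return port.add_lun(1, 3), port.add_lun(3, 3)


if __name__ == "__main__":
    print(replay())
```

The two numbers returned are the "absolute" LUNs for HSD 1 LUN 3 and HSD 3 LUN 3. Nothing in the relative numbering hints at them, which is the whole problem with the "old" syntax.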

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F, 0, 45) 32179 10b5 S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C, 0, 4) 32208 1003 P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F, 0, 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C, 0, 5) 32208 1004 P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F, 0, 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C, 0, 6) 32208 1005 P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F, 0, 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C, 0, 7) 32208 1006 P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F, 0, 49) 32179 108a S-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C, 0, 8) 32208 1007 P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches (shown in yellow in the original document). What is less obvious is that the pairs marked (1), (2) and (3), shown in turquoise and green, are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).

Here is an excerpt from the "old" HORCM CONF file, using "absolute" LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the "absolute" and "relative" LUN numbers:

raidscan -p CL2-F -fx
CL2-F 88 3 0 49 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51 1(108c) S-VOL PAIR ASYNC 108c ----- 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8 1(108c) S-VOL PAIR ASYNC 108c ----- 1005
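Since the LDEV number is the one value that appears in both scans, it can be used to join them into a table of "absolute" versus "relative" LUNs. The Python sketch below does this for the two scans above. It assumes a simplified, whitespace-separated layout of the raidscan rows (real output carries extra punctuation, so the regular expression would need adapting), and the function names are made up.

```python
import re

# port, ALPA, C, TID, LU, Num(LDEV) ... in a whitespace-separated row.
ROW = re.compile(r"^(\S+)\s+\w+\s+\d+\s+\d+\s+(\d+)\s+\d+\((\w+)\)")


def lun_by_ldev(raidscan_text):
    """Return {LDEV: LUN number} for every data row in a raidscan listing."""
    table = {}
    for line in raidscan_text.splitlines():
        m = ROW.match(line.strip())
        if m:
            table[m.group(3)] = int(m.group(2))
    return table


def absolute_vs_relative(physical_scan, hsd_scan):
    """Join the two scans on LDEV: {LDEV: (absolute LUN, relative LUN)}."""
    absolute = lun_by_ldev(physical_scan)
    relative = lun_by_ldev(hsd_scan)
    return {ldev: (absolute[ldev], relative[ldev])
            for ldev in absolute if ldev in relative}


PHYSICAL = """\
raidscan -p CL2-F -fx
CL2-F 88 3 0 49 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51 1(108c) S-VOL PAIR ASYNC 108c ----- 1005
"""

HSD = """\
raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8 1(108c) S-VOL PAIR ASYNC 108c ----- 1005
"""

if __name__ == "__main__":
    for ldev, (a, r) in sorted(absolute_vs_relative(PHYSICAL, HSD).items()):
        print("LDEV %s: absolute LUN %d, relative LUN %d" % (ldev, a, r))
```

A table like this is what you need when converting an "old" syntax HORCM CONF file to HSD syntax.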

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV, you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason, you normally only let the Storage Administrator have access to a "normal" CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID,LU) Seq# LDEV# P/S Status Fence Seq# P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 410 S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 411 S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 412 S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -

VG01 d3(R) (CL1-A, 1, 413) 75010010 413 S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 414 S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (S/N begins with 770x) are paired with LDEVs 410-414 on an AMS500 (S/N begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID,LU) Seq# LDEV# P/S Status Fence Seq# P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D , 1, 410)77010027 410.P-VOL PAIR NEVER ,75010010 410 -
VG01 d0(R) (CL1-A , 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D , 1, 411)77010027 411.P-VOL PAIR NEVER ,75010010 411 -
VG01 d1(R) (CL1-A , 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D , 1, 412)77010027 412.P-VOL PAIR NEVER ,75010010 412 -
VG01 d2(R) (CL1-A , 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D , 1, 413)77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A , 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D , 1, 414)77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A , 1, 414)75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines show what has changed (they were set in bold in the original). Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The lines for LDEVs 413 and 414 (bold in the original) show that they are Physical Drives 57 and 58 (Dsk57 and Dsk58). As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
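The range check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how CCI itself reads the file: the function names are hypothetical, and the only syntax assumed is the hdN-M form shown in the HORCMPERM410.CONF example.

```python
import re


def parse_hd_ranges(text):
    """Parse entries such as 'hd0-56' or 'hd57' into inclusive (first, last) pairs."""
    ranges = []
    for token in text.split():
        match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", token, flags=re.IGNORECASE)
        if match is None:
            continue  # this sketch ignores anything it does not understand
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        ranges.append((first, last))
    return ranges


def drive_is_permitted(drive, ranges):
    """True when the physical drive number falls inside one of the ranges."""
    return any(first <= drive <= last for first, last in ranges)


permitted = parse_hd_ranges("hd0-56")
for drive in (54, 55, 56, 57, 58):
    print(drive, drive_is_permitted(drive, permitted))
```

Drives 54 to 56 fall inside hd0-56 and drives 57 and 58 do not, which matches the two LDEVs that disappeared from the pairdisplay output.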

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command, as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D , 1, 410)77010027 410.P-VOL PAIR NEVER ,75010010 410 -
VG01 d0(R) (CL1-A , 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D , 1, 411)77010027 411.P-VOL PAIR NEVER ,75010010 411 -
VG01 d1(R) (CL1-A , 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D , 1, 412)77010027 412.P-VOL PAIR NEVER ,75010010 412 -
VG01 d2(R) (CL1-A , 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D , 1, 413)77010027 413.P-VOL PAIR NEVER ,75010010 413 -
VG01 d3(R) (CL1-A , 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D , 1, 414)77010027 414.P-VOL PAIR NEVER ,75010010 414 -
VG01 d4(R) (CL1-A , 1, 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - \\.\CMD-10111-4
\\.\CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall bind
WSAerr 10049(0x00002741) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 12:43:03 2008
SrcFile shorcmcc
SrcLine 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above it: the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
…
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information from it:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049): Cannot assign requested address.

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is that not many people know where to look.
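The lookup above is simple enough to script. The following is a minimal sketch, assuming the log line carries the decimal code followed by the same value in hex, as in the WSAerr line shown earlier. The function name and the one-entry table are hypothetical; only the WSABASEERR arithmetic comes from winsock2.h.

```python
import re

WSABASEERR = 10000

# Only the code discussed in this section; extend the table as needed.
WINSOCK_ERRORS = {
    WSABASEERR + 49: ("WSAEADDRNOTAVAIL", "Cannot assign requested address"),
}


def describe_wsaerr(log_line):
    """Pull the decimal and hex codes out of a WSAerr log line and name the error."""
    match = re.search(r"WSAerr\s*:?\s*(\d+)\(0x([0-9a-fA-F]+)\)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    if code != int(match.group(2), 16):
        return None  # decimal and hex values disagree, so do not trust the line
    name, text = WINSOCK_ERRORS.get(
        code, ("WSABASEERR+%d" % (code - WSABASEERR), "look it up in winsock2.h")
    )
    return "%d %s: %s" % (code, name, text)


print(describe_wsaerr("WSAerr 10049(0x00002741) (See winsock2.h)"))
```

The hex comparison is just a sanity check on the log line; 0x2741 and 10049 are the same number.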

UNIX

UNIX error messages are not only different from Windows, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall bind
Errorno 126 (Cannot assign requested address)
ErrInfo Internal Error
ErrTime Tue Sep 2 11:45:40 2008
SrcFile shorcmcc
SrcLine 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page

httpwwwioplexcom~miallenerrcmpphtml

The relevant line for this error says:

Error          AIX 4351   HP-UX 1122   Solaris 910
EADDRNOTAVAIL  68         227          126

The message text is the same on all three platforms: Can't assign requested address

Once again, this is not the most intuitive error I have seen.
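Because the number differs per platform, it can help to keep the three values from the table in one place and look the error up by number. This is a small sketch using only the values quoted above; the dictionary and function names are hypothetical. The last two lines print whatever number and text the machine running the script uses, which will usually differ again.

```python
import errno
import os

# errno values for EADDRNOTAVAIL as quoted in the table above.
EADDRNOTAVAIL_BY_PLATFORM = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def platforms_reporting(number):
    """Return the platforms on which this errno number means EADDRNOTAVAIL."""
    return sorted(p for p, n in EADDRNOTAVAIL_BY_PLATFORM.items() if n == number)


print(platforms_reporting(126))

# The local value and message, for comparison (platform dependent).
print(errno.EADDRNOTAVAIL, os.strerror(errno.EADDRNOTAVAIL))
```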

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here, I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to "horcm42":

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall bind
WSAerr 10013(0x0000271d) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 17:39:46 2008
SrcFile shorcmcc
SrcLine 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013): Permission denied.

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

  Proto  Local Address            Foreign Address  State
  UDP    ml_acer510:microsoft-ds  *:*
  UDP    ml_acer510:isakmp        *:*
  UDP    ml_acer510:1030          *:*
  …
  UDP    ml_acer510:54323         *:*

It is not likely that you will be this lucky.
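When netstat output is not to hand, the same question ("can anything bind this UDP port?") can be asked directly. The sketch below simply tries the bind and reports whether it worked. It is an illustration, not part of CCI. The function name and the default address 127.0.0.1 are assumptions; in practice you would pass the address coded in HORCM_MON.

```python
import socket


def udp_port_is_free(port, host="127.0.0.1"):
    """Return True if a UDP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers "address in use", "permission denied" and
            # "cannot assign requested address" alike.
            return False
    return True


print(udp_port_is_free(11042))
```

A False result does not say why the bind failed, so it is a quick first check rather than a diagnosis. The socket is closed again immediately, so run it while HORCM is stopped.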


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008



If the user is trying to control Instance 1, now you know why it is failing.

Windows

Installation is easier. Double click the EXE and follow the bouncing ball. I always recommend taking the default of C:\HORCM. The directory is only about 10 MB in size, so we are not likely to fill the drive.

Check: The same rule applies as for UNIX. Always ask for raidqry output.

Environment Variables

Here is a ShadowImage example:

C:\HORCM\ETC>set horcminst=4

C:\HORCM\ETC>set horcc_mrcf=1

C:\HORCM\ETC>raidqry -h
Model: RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Usage: raidqry [options] for HOMRCF[4]
 -h Help/Usage
 -I[#] Set to HORCMINST#
 -IH[#] or -ITC[#] Set to HORC mode [and HORCMINST#]
 -IM[#] or -ISI[#] Set to MRCF mode [and HORCMINST#]
 -z Set to the interactive mode

Check: horcminst is case insensitive on Windows; it is case sensitive on UNIX.

Change mode of operation to TrueCopy:

C:\HORCM\ETC>set horcc_mrcf=

C:\HORCM\ETC>raidqry -h
Model: RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Usage: raidqry [options] for HORC[4]

Now try this on UNIX. Change mode of operation:

root@SYD-E250-1:/opt/HORCM/.uds# export HORCC_MRCF=1
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model: RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage: raidqry [options] for HOMRCF[4]

root@SYD-E250-1:/opt/HORCM/.uds# export HORCC_MRCF=
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model: RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage: raidqry [options] for HOMRCF[4]


Check: This does not work. Setting the variable to an empty value still leaves it defined. You must do this:

root@SYD-E250-1:/opt/HORCM/.uds# unset HORCC_MRCF
root@SYD-E250-1:/opt/HORCM/.uds# raidqry -h
Model: RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Usage: raidqry [options] for HORC[4]
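The difference between an empty variable and an unset one is easy to model. The sketch below is a toy model of the UNIX behaviour seen in the raidqry output above, not CCI's actual code: the function name is hypothetical, and a plain dictionary stands in for the shell environment.

```python
def copy_mode(env):
    """Model of the UNIX behaviour shown above: HOMRCF is selected whenever
    HORCC_MRCF exists in the environment, even with an empty value."""
    return "HOMRCF" if "HORCC_MRCF" in env else "HORC"


env = {"HORCC_MRCF": "1"}   # like: export HORCC_MRCF=1
print(copy_mode(env))

env["HORCC_MRCF"] = ""      # like: export HORCC_MRCF=   (still defined)
print(copy_mode(env))

del env["HORCC_MRCF"]       # like: unset HORCC_MRCF
print(copy_mode(env))
```

On Windows, "set horcc_mrcf=" removes the variable altogether, which is why the same command is enough there.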

Sending logs to GSC

If you have to escalate the problem to GSC, we will need the complete set of HORCM logs and all the HORCM CONF files. In general, the preferred method of doing this is to run "getconfig". These executables/scripts are available on the TUF web site (https://tuf.hds.com).

If for any reason you do not run these scripts, you must zip up all the LOG directories underneath the HORCM directory. Never pick and choose which log to upload: many of them have the same name, and GSC may need to refer to all of them.

In addition, the factory always asks for the output from these commands (Windows only):

inqraid -CLI -fgx $Phys
inqraid -CLI -fgvx $Vol
inqraid -CLI -fgx $LETALL

Finding Command Devices

You cannot create a HORCM CONF file, or check it for accuracy, without doing INQRAID commands for UNIX/Windows and RAIDSCAN commands for Windows.

UNIX

Check: Get the user to send you the result of this command:

root@SYD-E250-1:/opt/HORCM/.uds# ls /dev/rdsk/* | inqraid -CLI -fxg
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
c2t0d16s2 CL1-A-4 10111 0 - - - - OPEN-V-CM
c2t2d36s2 - - - - - - - -
c2t4d0s2 CL1-A-20 20169 43 - - - - OPEN-V-CM
c2t6d0s2 CL1-A-11 80025 31F - - - - OPEN-V-CM
c2t6d14s2 - - - - - - - -
c3t2d128s2 CL2-A-6 10262 2180 - - - - OPEN-V-CM
c3t3d4s2 CL2-A-7 3157 1 - - - - DF600F-CM

Here are 5 command devices. Two of them (c2t6d0s2 and c3t3d4s2) were shown in bold in the original. The first is a USP; the second is a 9570V. If the user wants to use the first one, they need to code

/dev/rdsk/c2t6d0s2

in the HORCM CONF file.

Windows


Check: Get the user to send you the result of these commands:

C:\HORCM\ETC>raidscan -x findcmddev h020
cmddev of Ser# 10111 = \\.\PhysicalDrive2
cmddev of Ser# 10111 = \\.\PhysicalDrive5
cmddev of Ser# 41 = \\.\PhysicalDrive7
cmddev of Ser# 10262 = \\.\PhysicalDrive8
cmddev of Ser# 80025 = \\.\PhysicalDrive10
cmddev of Ser# 20169 = \\.\PhysicalDrive11
cmddev of Ser# 20169 = \\.\Volume{3c107ab6-7dbf-11db-a1ed-000e0c6abf1d}

Check: Do not use ANY of these names. If you find a user using this syntax, ask that it be changed. See the INQRAID output below.

Harddisk numbers can change after a reboot. GUID numbers can change in an MS Cluster environment after a reboot. Do yourself a favour: do not use these names.

C:\HORCM\ETC>inqraid $Phys -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
Harddisk0 - - - - - - - 00JS-22MHB0
Harddisk1 - - - - - - - 00JS-22MHB0
Harddisk2 CL1-A 10111 515 - - - - OPEN-V-CM
Harddisk3 CL1-A 10111 1920 - s/s/ss 9997 5:03-02 OPEN-V
Harddisk4 CL1-A 10111 768 - s/s/ss 9993 5:06-02 OPEN-V
Harddisk5 CL1-A 10111 1856 - - - - OPEN-V-CM
Harddisk6 CL1-A 10111 2632 - P/s/ss 999A 5:06-02 OPEN-V
Harddisk7 CL1-A 41 0 - - - - DF600F-CM
Harddisk8 CL1-A 10262 16 - - - - OPEN-V-CM
Harddisk9 CL1-A 10262 8320 - s/s/ss 2000 5:01-05 OPEN-V
Harddisk10 CL1-A 80025 784 - - - - OPEN-V-CM
Harddisk11 CL1-A 20169 13 - - - - OPEN-V-CM

CMD syntax has been around since 01-17-03/05. There is no reason not to use it. If the user is running 01-17-03/05 or below, get them to use 01-19-03/04 or higher.

In this case, for Harddisk8 / USP 10262, the correct syntax in the HORCM CONF file is

\\.\CMD-10262-16 or even
\\.\CMD-10262-16-CL1-A-12 if you know this is HSD 12, or
\\.\CMD-10262-16-CL1-A or, for slack people,
\\.\CMD-10262

\\.\CMD-10262-16 is my preferred coding technique, as this takes care of multipath environments as well.
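The four forms above follow one pattern: serial number, then optionally the LDEV, then optionally the port, then optionally the host group. A minimal sketch of a parser for that pattern follows. The function and field names are hypothetical, and the only forms covered are the ones listed above, with or without the Windows device prefix.

```python
import re

CMD_PATTERN = re.compile(
    r"(?:\\\\\.\\)?"            # optional \\.\ device prefix
    r"CMD-(?P<serial>\d+)"      # array serial number
    r"(?:-(?P<ldev>\d+)"        # optional LDEV number
    r"(?:-(?P<port>CL\d+-[A-Z])"  # optional port, e.g. CL1-A
    r"(?:-(?P<host_group>\d+))?)?)?"  # optional host group number
)


def parse_cmd_name(name):
    """Split a CMD-style command device name into its parts, or return None."""
    match = CMD_PATTERN.fullmatch(name.strip())
    if match is None:
        return None
    parts = match.groupdict()
    for key in ("serial", "ldev", "host_group"):
        if parts[key] is not None:
            parts[key] = int(parts[key])
    return parts


for name in ("CMD-10262", "CMD-10262-16", "CMD-10262-16-CL1-A", "CMD-10262-16-CL1-A-12"):
    print(name, parse_cmd_name(name))
```

A check like this is handy for spotting a HORCM CONF file that still uses a Harddisk or Volume GUID name, because those return None.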

Starting HORCM Instances

There are so many ways for this to fail that I could write a book on this topic.

So always take the easy way out. Send the user a deck that is bound to work. If it does not, then you have very little to debug. Here is such a deck, HORCM4.CONF:

UNIX


HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
10129253 11004 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
/dev/rdsk/c2t6d0s2

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

There are only 3 things to check:

Is the IP address correct? Note: You can use "localhost" here, but this will not work for TC environments using 2 different CCI servers.

Is 11004 a "free" UDP port? Almost certainly it is.

Is the CMDDEV right? You can tell that from the commands we have already issued.

UNIX HORCM CONF files are kept in /etc.

Windows

Here is HORCM8.CONF for Windows:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11008 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
\\.\CMD-10262-16

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

Use the same logic as for UNIX. Windows HORCM CONF files are in C:\WINDOWS.

Other recommendations

HDvM uses HORCM CONF files called HORCM900.CONF to HORCM988.CONF for temporary HORCM CONF files. Do not use these numbers yourself.

I suggest that you use 0-799 for user created files and 800-899 for HDvM created permanent HORCM CONF files.

I also suggest a port numbering convention of 1100x, where x is the number in HORCMx.CONF. This means that you will need to "reserve" UDP ports 11000 to 11899 for HORCM CONF usage.
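The convention is just "port = 11000 + instance number", with the instance ranges split by who owns them. A small sketch follows, with hypothetical function names; the ranges are the ones suggested above and are a local convention, not something CCI enforces.

```python
def instance_owner(instance):
    """Classify an instance number using the ranges suggested above."""
    if 0 <= instance <= 799:
        return "user"
    if 800 <= instance <= 899:
        return "HDvM permanent"
    if 900 <= instance <= 988:
        return "HDvM temporary (do not use)"
    return "outside the convention"


def horcm_udp_port(instance):
    """UDP port under the 11000 + instance convention (instances 0-899 only)."""
    if not 0 <= instance <= 899:
        raise ValueError("instance %d is outside the 0-899 range" % instance)
    return 11000 + instance


for instance in (4, 8, 42, 410, 899):
    print(instance, instance_owner(instance), horcm_udp_port(instance))
```

Instances 4, 8 and 42 give ports 11004, 11008 and 11042, the same values used in the example HORCM CONF files in this document.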

Updating the "Services" file


Many people code HORCM CONF files like this:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 horcm8 1000 3000

In this case the UDP port, horcm8, must be defined in the "Services" file:

Windows: C:\WINDOWS\system32\drivers\etc\services
UNIX: /etc/services

Like this:

horcm0 11000/udp
horcm1 11001/udp
…
horcm8 11008/udp
horcm9 11009/udp
"blank line"

Check: Under Windows, if there is no blank line after horcm9 (in this example), that definition will be ignored. PS: No blank lines at the end of the HORCM CONF file, please.

Check: If you have 2 CCI servers using horcm8 and horcm9, for example, then both horcm8 and horcm9 have to be defined on both servers.
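Both checks can be scripted against a copy of the services file. The following is a rough sketch with a hypothetical function name. It reads the "blank line" rule as "the last entry must be terminated by a newline", which is an assumption about what Windows actually requires.

```python
def services_problems(text, names):
    """Return a list of problems found for the given service names."""
    problems = []
    defined = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, split columns
        if len(fields) >= 2 and "/" in fields[1]:
            port, proto = fields[1].split("/", 1)
            if port.isdigit():
                defined[(fields[0], proto.lower())] = int(port)
    for name in names:
        if (name, "udp") not in defined:
            problems.append("%s is not defined as a udp service" % name)
    if text and not text.endswith("\n"):
        problems.append("last line is not terminated; the final entry may be ignored")
    return problems


sample = "horcm8 11008/udp\nhorcm9 11009/udp"
for problem in services_problems(sample, ["horcm8", "horcm9"]):
    print(problem)
```

Run it with the same list of names against the services file from each CCI server, since both servers need both definitions.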

Reading the LOGS

Windows

Let's start with Windows this time.

In our example we used Instance 8, so you will find the log here:

C:\HORCM\log8\curlog\horcm_ml_acer510_log.txt

because this server is called ml_acer510.

Let us examine it in detail.

- HORCM STARTUP LOG - Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- horcmgr started on Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- _spawnvp() horcmd_08 using horcmgr [CWD=C:\HORCM\ETC]
18:03:08-3d090-07240- Fibre address conversion TBL has been set to 2

PP RAID Manager for WindowsNT
Model RAID-Manager/WindowsNT
Ver&Rev 01-19-03/04
Release Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006, Hitachi, Ltd.


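HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 18:03:08 2007

Lots of useful information here, for example the CCI version, the host name and the user who started HORCM. (The original highlighted the key data in bold.)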

18:03:08-3d090-07240- horcmd_08 started on Thu Feb 22 18:03:08 2007
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
18:03:08-3d090-07240- [horcmcfgetent] fseek(top) OK
18:03:08-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where CMD syntax is converted to a physical drive number.

18:03:08-40b28-07240- [horcmcfgetent] read(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] close(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] check(conf) OK
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number (the last row above).

[0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- [HORCMCFGRDF] SLPR is supported180308-40b28-07240- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012db18]0000 80000000 00000000 00000000 00000000 [0x0012db28]0010 00000000 00000000 00000000 00000000 180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start


180308-40b28-07240- [horcread] cmddevopen() finished180308-40b28-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- 
[HORCMCFGRDF] SLPR bitmap is checked180308-40b28-07240- [horcmcfgrdf] horccmddev(0) OK180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start180308-40b28-07240- [horcread] cmddevopen() finished180308-449a8-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256


18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MON HORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5T=1] port=CL1-A targ=1 lun=12

Here is the AL-PA for the Port, and the Port, target ID and LUN.

18:03:08-449a8-07240- MON(HORC) number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF) number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user log with this one.
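One detail in the dump is worth decoding by hand when comparing logs: the serial number. The twelve bytes at offset 0030 (f0f0f0f0 f0f0f0f1 f0f2f6f2) read as 000000010262 in the CHAR column. The bytes F0 to F9 look like EBCDIC digits. That reading is an assumption, not something the log states, but it reproduces the serial shown. A minimal sketch follows, with a hypothetical function name and Python's cp037 codec standing in for EBCDIC.

```python
def decode_serial(hex_words):
    """Decode the hex words of the serial field, treating them as EBCDIC (cp037)."""
    raw = bytes.fromhex("".join(hex_words))
    return raw.decode("cp037")


windows_log = decode_serial(["f0f0f0f0", "f0f0f0f1", "f0f2f6f2"])
print(windows_log, int(windows_log))
```

The same call on the Solaris log further on (f0f0f0f0 f0f0f0f8 f0f0f2f5) gives serial 80025, which is a quick way to confirm which array a user's command device really belongs to.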

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=/]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP RAID Manager for Solaris
Model RAID-Manager/Solaris
Ver&Rev 01-19-03/04
Release Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006, Hitachi, Ltd.

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

163000-11d9d-11272- horcmd_04 started on Wed Mar 7 163000 2007163000-17e65-11272- [horcmcfgrdf] access(conf_file) OK163000-1c076-11272- [horcmcfgrdf] access(check) OK163000-1e127-11272- [horcmcfgrdf] open(conf_file) OK163000-29cf3-11272- [horcmcfgetent] fseek(top) OK163000-31d0e-11272- [horcmcfgetent] read(conf_file) OK163000-34856-11272- [horcmcfgrdf] close(conf_file) OK163000-389cb-11272- [horcmcfgrdf] check(conf) OK163000-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK163000-5ac7f-11272- [horcread] cmddevopen() start163000-63837-11272- [horcread] cmddevopen() finished163000-6e384-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM


[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163001-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256163001-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163001-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163001-c2226-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2

Here is the CMDDEV

163001-c636e-11272- [HORCMCFGRDF] SLPR is supported163001-ca4bf-11272- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfce08]0000 80000000 00000000 00000000 00000000 [0xffbfce18]0010 00000000 00000000 00000000 00000000 163001-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163001-deb6b-11272- [horcread] cmddevopen() start163001-e2d12-11272- [horcread] cmddevopen() finished163001-e7502-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff


[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163002-77659-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163002-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked163002-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK163002-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163002-89c66-11272- [horcread] cmddevopen() start163002-8de05-11272- [horcread] cmddevopen() finished163002-925ff-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 
8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163003-07ece-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163003-0e0d4-11272- [horcmcfgrdf] seldevdata() OK163003-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes163003-16392-11272- MONHORCM_CMD=devrdskc2t6d0s2[Fibre][AL-PA=0xb2 -gt C=2T=32] port=CL1-A targ=32 lun=42

Here are the AL-PA for the port, the port target ID and the LUN.

163003-1a4ba-11272- MON(HORC)number of Mus = 0163003-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes163003-2275a-11272- MON(HOMRCF)number of Mus = 0163007-b3adf-11271- horcmgr executed system(binls devrdsk |

HORCMusrbinraidscan -find inst)

Audit Logging


Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This makes the LOGx files much easier to read, as the only messages in the logs will be the ones you want to look at.

Windows

TSTART.BAT - BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT - BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output. You need to know what they typed and what the result was.

rootSYD-E250-1optHORCMlog4curloghorcmstartsh 4starting HORCM inst 4

HORCM inst 4 starts successfullyrootSYD-E250-1optHORCMlog4curlogexport HORCC_LOGSZ=2048rootSYD-E250-1optHORCMlog4curlograidscan -p CL1-A


helliprootSYD-E250-1optHORCMlog4curloghorcmshutdownsh 4inst 4HORCM Shutdown inst 4

rootSYD-E250-1optHORCMlog4curlogcd rootSYD-E250-1optHORCMlog4ls -altotal 10drwxr-xr-x 4 root other 512 Mar 7 1650 dr-xr-xr-x 12 root sys 512 Feb 22 1504 drwxr-xr-x 3 root other 512 Mar 7 1649 curlog-rw-r--r-- 1 root other 289 Mar 7 1651 horcc_SYD-E250-1logdrwxr-xr-x 3 root other 512 Mar 7 1629 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 165036 2007
CMDLINE raidscan -p CL1-A
165037-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 165153 2007
CMDLINE /usr/bin/horcctl -S
165154-0f8cf-11376- [horcctl][exit(0)]

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example.

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port TargetID LU MU
VG01 LDEV49 CL1-A-1 1 7 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port TargetID LU MU
VG01 LDEV49 CL1-A-1 1 8 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11008


Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is best practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT    ALPA C TID LU  Num(LDEV) P/S   Status LDEV P-Seq P-LDEV
CL1-A-1 ef   5 1   0-0 1(13)     S-VOL PAIR   13   ----- 10
CL1-A-1 ef   5 1   1-0 1(29)     P-VOL PSUS   29   977   309
CL1-A-1 ef   5 1   2-0 1(48)     P-VOL PSUS   48   977   300
CL1-A-1 ef   5 1   3-0 1(309)    S-VOL SSUS   309  ----- 29
CL1-A-1 ef   5 1   4-0 1(310)    S-VOL SSUS   310  ----- 29
CL1-A-1 ef   5 1   5-0 1(308)    S-VOL SSUS   308  ----- 24
CL1-A-1 ef   5 1   6-0 1(305)    S-VOL SSUS   305  ----- 1
CL1-A-1 ef   5 1   7-0 1(49)     SMPL  ----   ----- ----- -----
CL1-A-1 ef   5 1   8-0 1(50)     SMPL  ----   ----- ----- -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port TID LU-M)   Seq LDEV P/S  Status    Seq   P-LDEV M
VG01  LDEV49(L)    (CL1-A-1 1 7-0 )  977 49   SMPL --------- ----- -
VG01  LDEV49(R)    (CL1-A-1 1 8-0 )  977 50   SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 170230 2007
CMDLINE paircreate -g VG01 -vl
170230-9a8a8-12452- ERROR cm_sndrcv[rc < 0 from HORCM]
170230-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
170230-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ] An order to the command(control) device failed, or was rejected.
[Action] Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, meaningless error numbers like -35 and 221. If this is a RAID subsystem, check the SSBLOGS on the SVP. However, for DF, the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

170230-9a8a8-14140- SCSI Check Condition170230-9a8a8-14140- SCSI SENSE DATA ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------


[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
170230-9a8a8-14140- SKEY = 0x05
170230-9a8a8-14140- ASC = 0x96
170230-9a8a8-14140- SSB = 0x8400000d

170230-9a8a8 is the cross-check: the same prefix appears in the command log and in the HORCM log. Next, although it is not obvious, the error code is:

961C 000D
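How the two halves of that code fall out of the dump can be sketched in Python. This is only an illustration, assuming the dump follows the standard SCSI fixed-format sense layout (sense key in the low nibble of byte 2, ASC and ASCQ in bytes 12 and 13) and that the detail code is the low 16 bits of the SSB word in bytes 8-11, which is what the log above suggests. The function name decode_sense is hypothetical and is not part of CCI.

```python
def decode_sense(words):
    """Decode the first 16 bytes of a sense dump given as 32-bit hex words.

    Assumption: fixed-format SCSI sense data, with the SSB word in bytes 8-11.
    """
    data = bytes.fromhex("".join(words))
    skey = data[2] & 0x0F                    # sense key (SKEY in the log)
    asc, ascq = data[12], data[13]           # additional sense code and qualifier
    ssb = int.from_bytes(data[8:12], "big")  # SSB word as printed in the log
    sense_code = f"{asc:02X}{ascq:02X}"      # e.g. 961C
    detail_code = f"{ssb & 0xFFFF:04X}"      # low half of the SSB, e.g. 000D
    return skey, ssb, f"{sense_code} {detail_code}"


# First line of the sense dump from the HORCM log above
skey, ssb, code = decode_sense(["70000500", "00000038", "8400000d", "961c0000"])
print(hex(skey), hex(ssb), code)  # prints: 0x5 0x8400000d 961C 000D
```

The decoded values agree with the SKEY, ASC and SSB lines that CCI prints under the dump, so the function is mainly useful when only the raw hex is available.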

Now get hold of the latest AMS CCI manual which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
…
Error      Contents                                                                     Recommended Action
961C 000C  The S-VOL is a Sub LU of a unified LU.                                       Check the status of the LU.
961C 000D  The default controllers controlling the P-VOL and S-VOL are not the same.   …
961C 000E  The P-VOL is a Cache Residency LU.                                           Check the status of …

In this case, the PVOL and SVOL default controllers are not the same.

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With 9900V, USP etc., the LUNs are normally considered to be attached to "logical" ports, which are called HSDs or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSDs. Here is an example.

Imagine that 3 HSDs are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later: delete HSD 2, then add HSD 4 with LUNs 0 and 1.

And then you allocate LUN 3 to HSD 1 and to HSD 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that:

HSD 1 LUN 3 was "absolute" LUN 5
HSD 3 LUN 3 was "absolute" LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.
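The HSD example can be replayed with a toy model in Python. This is a sketch only: it assumes that each new LUN simply takes the lowest free "absolute" number on the port, which is an inference from the numbers in the example and not a statement about the subsystem microcode. The Port class and its methods are hypothetical names.

```python
class Port:
    """Toy model of a port that hands out the lowest free "absolute" LUN."""

    def __init__(self):
        self.absolute = {}  # absolute LUN -> (HSD number, relative LUN)

    def add_lun(self, hsd, relative):
        n = 0
        while n in self.absolute:  # find the lowest unused absolute number
            n += 1
        self.absolute[n] = (hsd, relative)
        return n

    def delete_hsd(self, hsd):
        # Deleting an HSD frees every absolute LUN it was using
        self.absolute = {a: v for a, v in self.absolute.items() if v[0] != hsd}


port = Port()
for hsd in (1, 2, 3):          # HSD 1, 2 and 3, each with LUNs 0, 1 and 2
    for lun in (0, 1, 2):
        port.add_lun(hsd, lun)
port.delete_hsd(2)             # frees absolute LUNs 3-5
port.add_lun(4, 0)             # HSD 4 LUN 0 takes absolute LUN 3
port.add_lun(4, 1)             # HSD 4 LUN 1 takes absolute LUN 4
hsd1_lun3 = port.add_lun(1, 3)
hsd3_lun3 = port.add_lun(3, 3)
print(hsd1_lun3, hsd3_lun3)    # prints: 5 9
```

Under that assumption the model reproduces the "absolute" LUNs 5 and 9 from the example, and it shows why the history of the port matters: the same two allocations on a fresh port would give different numbers.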

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F 0 45) 32179 10b5 S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C 0 4)  32208 1003 P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F 0 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C 0 5)  32208 1004 P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F 0 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C 0 6)  32208 1005 P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F 0 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C 0 7)  32208 1006 P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F 0 49) 32179 108a S-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C 0 8)  32208 1007 P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches (in yellow). What is less obvious is that the turquoise and green pairs are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).

Here is an excerpt from the "old" HORCM CONF file, using "absolute" LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the "absolute" and "relative" LUN numbers:

raidscan -p CL2-F -fx
CL2-F 88 3 0 49 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51 1(108c) S-VOL PAIR ASYNC 108c ----- 1005


raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8 1(108c) S-VOL PAIR ASYNC 108c ----- 1005

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV, you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason, you normally only let the Storage Administrator have access to a "normal" CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10*)
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk after the device name means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port TID LU)  Seq      LDEV P/S   Status Fence Seq      P-LDEV M
VG01  d0(L)        (CL2-D 1 410)  77010027 410  P-VOL PAIR   NEVER 75010010 410    -
VG01  d0(R)        (CL1-A 1 410)  75010010 410  S-VOL PAIR   NEVER -----    410    -
VG01  d1(L)        (CL2-D 1 411)  77010027 411  P-VOL PAIR   NEVER 75010010 411    -
VG01  d1(R)        (CL1-A 1 411)  75010010 411  S-VOL PAIR   NEVER -----    411    -
VG01  d2(L)        (CL2-D 1 412)  77010027 412  P-VOL PAIR   NEVER 75010010 412    -
VG01  d2(R)        (CL1-A 1 412)  75010010 412  S-VOL PAIR   NEVER -----    412    -
VG01  d3(L)        (CL2-D 1 413)  77010027 413  P-VOL PAIR   NEVER 75010010 413    -
VG01  d3(R)        (CL1-A 1 413)  75010010 413  S-VOL PAIR   NEVER -----    413    -
VG01  d4(L)        (CL2-D 1 414)  77010027 414  P-VOL PAIR   NEVER 75010010 414    -
VG01  d4(R)        (CL1-A 1 414)  75010010 414  S-VOL PAIR   NEVER -----    414    -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port TID LU)  Seq      LDEV P/S   Status Fence Seq      P-LDEV M
VG01  d0(L)        (CL2-D 1 410)  77010027 410  P-VOL PAIR   NEVER 75010010 410    -
VG01  d0(R)        (CL1-A 1 410)  75010010 ---- ---- ----------- ----- -
VG01  d1(L)        (CL2-D 1 411)  77010027 411  P-VOL PAIR   NEVER 75010010 411    -
VG01  d1(R)        (CL1-A 1 411)  75010010 ---- ---- ----------- ----- -
VG01  d2(L)        (CL2-D 1 412)  77010027 412  P-VOL PAIR   NEVER 75010010 412    -
VG01  d2(R)        (CL1-A 1 412)  75010010 ---- ---- ----------- ----- -
VG01  d3(L)        (CL2-D 1 413)  77010027 413  P-VOL PAIR   NEVER 75010010 413    -
VG01  d3(R)        (CL1-A 1 413)  75010010 ---- ---- ----------- ----- -
VG01  d4(L)        (CL2-D 1 414)  77010027 414  P-VOL PAIR   NEVER 75010010 414    -
VG01  d4(R)        (CL1-A 1 414)  75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

110148-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
110150-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port TID LU)  Seq      LDEV P/S   Status Fence Seq      P-LDEV M
VG01  d0(L)        (CL2-D 1 410)  77010027 410  P-VOL PAIR   NEVER 75010010 410    -
VG01  d0(R)        (CL1-A 1 410)  75010010 ---- ---- ----------- ----- -
VG01  d1(L)        (CL2-D 1 411)  77010027 411  P-VOL PAIR   NEVER 75010010 411    -
VG01  d1(R)        (CL1-A 1 411)  75010010 ---- ---- ----------- ----- -
VG01  d2(L)        (CL2-D 1 412)  77010027 412  P-VOL PAIR   NEVER 75010010 412    -
VG01  d2(R)        (CL1-A 1 412)  75010010 ---- ---- ----------- ----- -
VG01  d3(L)        (CL2-D 1 413)  77010027 ---- ---- ----------- ----- -
VG01  d3(R)        (CL1-A 1 413)  75010010 ---- ---- ----------- ----- -
VG01  d4(L)        (CL2-D 1 414)  77010027 ---- ---- ----------- ----- -
VG01  d4(R)        (CL1-A 1 414)  75010010 ---- ---- ----------- ----- -

The bold lines (the d3(L) and d4(L) entries) show what has changed. Here is the bottom of the start-up log file:

110803-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE    PORT  SERIAL   LDEV CTG HM12 SSID RGroup PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410  -   Psss 0000 A07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411  -   Psss 0000 A07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157     169  -   Psss 0000 502-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412  -   Psss 0000 A07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157     170  -   Psss 0000 502-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413  -   Psss 0000 A07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414  -   Psss 0000 A07-00 DF600F
J:\Vol2\Dsk0   -     -        -    -   -    -    -      ST336754LC

The bold lines show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
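The range check behind this result can be sketched in Python. It is an illustration only: it assumes the "hdN-M" form shown in the HORCMPERM410.CONF listing above and does not attempt to cover any other entry formats the real file may accept. The function name parse_hd_range is hypothetical.

```python
import re


def parse_hd_range(entry):
    """Turn an entry such as "hd0-56" into a range of physical drive numbers.

    Assumption: only the "hdN" and "hdN-M" forms seen in this document.
    """
    match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", entry.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised entry: {entry!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    return range(first, last + 1)


allowed = parse_hd_range("hd0-56")
for drive in (56, 57, 58):
    print(drive, drive in allowed)
```

Drive 56 (LDEV 412) falls inside the permitted range, while drives 57 and 58 (LDEVs 413 and 414) do not, which matches the dashes shown for d3(L) and d4(L).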

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command, as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT  TARG LUN M SERIAL   LDEV
Harddisk57  VG01  d3      CL2-D 1    413 0 77010027 413
Harddisk57  VG01  d3      CL2-D 1    413 - 77010027 413
Harddisk58  VG01  d4      CL2-D 1    414 0 77010027 414
Harddisk58  VG01  d4      CL2-D 1    414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port TID LU)  Seq      LDEV P/S   Status Fence Seq      P-LDEV M
VG01  d0(L)        (CL2-D 1 410)  77010027 410  P-VOL PAIR   NEVER 75010010 410    -
VG01  d0(R)        (CL1-A 1 410)  75010010 ---- ---- ----------- ----- -
VG01  d1(L)        (CL2-D 1 411)  77010027 411  P-VOL PAIR   NEVER 75010010 411    -
VG01  d1(R)        (CL1-A 1 411)  75010010 ---- ---- ----------- ----- -
VG01  d2(L)        (CL2-D 1 412)  77010027 412  P-VOL PAIR   NEVER 75010010 412    -
VG01  d2(R)        (CL1-A 1 412)  75010010 ---- ---- ----------- ----- -
VG01  d3(L)        (CL2-D 1 413)  77010027 413  P-VOL PAIR   NEVER 75010010 413    -
VG01  d3(R)        (CL1-A 1 413)  75010010 ---- ---- ----------- ----- -
VG01  d4(L)        (CL2-D 1 414)  77010027 414  P-VOL PAIR   NEVER 75010010 414    -
VG01  d4(R)        (CL1-A 1 414)  75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall bind
WSAerr 10049(0x00002741) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 124303 2008
SrcFile shorcmcc
SrcLine 2405

ERRORcmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
…
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049): Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different from Windows, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall bind
Errorno 126 (Cannot assign requested address)
ErrInfo Internal Error
ErrTime Tue Sep 2 114540 2008
SrcFile shorcmcc
SrcLine 2427

ERRORcmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

EADDRNOTAVAIL
AIX 4351      68   Can't assign requested address
HP-UX 1122    227  Can't assign requested address
Solaris 910   126  Can't assign requested address

Once again, this is not the most intuitive error I have seen.
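Because the numeric code differs by platform, it can help to reproduce the bind outside CCI and let the OS supply the symbolic name. The following Python sketch does that with the standard socket and errno modules; the function name try_udp_bind is hypothetical, and the check is only an approximation of what HORCM does at start-up, since it is not known from this document which socket options HORCM sets.

```python
import errno
import socket


def try_udp_bind(ip_address, port):
    """Return None if a UDP socket can be bound, else the symbolic error name."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip_address, port))
        return None
    except OSError as exc:
        # errno.errorcode maps the platform-specific number to its name,
        # falling back to the raw number if the platform has no name for it
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


# Port 0 asks the OS for any free port, so this should succeed on most hosts
print(try_udp_bind("127.0.0.1", 0))
```

Run on the CCI server with the ip_address and service port from HORCM_MON while HORCM is stopped. A result of None means the address and port are usable; a name such as EADDRNOTAVAIL (or its WSA-prefixed form on Windows) points at the IP address, and an "in use" or "access" style name points at the port.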

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

125223-16b48-04004- horcread() cannot open command device CMD-10111-42
125223-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
125223-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
125223-16b48-04004- ERROR horcm_cfg_create
125228-0b3b0-01136- horcmgr Failed to connect to HORCM

Here, I think, it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to "horcm42", a service name that has not been defined.

172902-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
172902-d59f8-02260- ERROR A wrong ipaddr or servicename line exists in HORCM_MON line 4
172902-d59f8-02260- 101293127 horcm42 1000 3000
172902-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
172902-d59f8-02260- ERROR horcm_cfg_create

Once again, it is more obvious what is wrong.

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number; it was chosen to cause an error.

[System Call Error]
SysCall bind
WSAerr 10013(0x0000271d) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 173946 2008
SrcFile shorcmcc
SrcLine 2405

ERRORcmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013): Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State
UDP ml_acer510:microsoft-ds
UDP ml_acer510:isakmp
UDP ml_acer510:1030
…
UDP ml_acer510:54323

It is not likely that you will be this lucky: customers rarely send this output unless you ask for it.


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


Check: This does not work. You must do this:

rootSYD-E250-1optHORCMudsunset HORCC_MRCF
rootSYD-E250-1optHORCMudsraidqry -h
Model RAID-Manager/Solaris
Ver&Rev 01-19-03/04
Usage raidqry [options] for HORC[4]

Sending logs to GSC

If you have to escalate the problem to GSC, we will need the complete set of HORCM logs and all the HORCM CONF files. In general, the preferred method of doing this is to run "getconfig". These executables/scripts are available on the TUF web site (https://tuf.hds.com).

If for any reason you do not run these scripts, you must zip up all the LOG directories underneath the HORCM directory. Never pick and choose which log to upload: many of them have the same name, and GSC may need to refer to all of them.

In addition, the factory always asks for the output from these commands (Windows only):

inqraid -CLI -fgx $Phys
inqraid -CLI -fgvx $Vol
inqraid -CLI -fgx $LETALL

Finding Command Devices

You cannot create a HORCM CONF file or check it for accuracy without doing INQRAID commands for UNIX/Windows and RAIDSCAN commands for Windows.

UNIX

Check: Get the user to send you the result of this command:

rootSYD-E250-1optHORCMudsls devrdsk | inqraid -CLI -fxg
DEVICE_FILE PORT     SERIAL LDEV CTG HM12 SSID RGroup PRODUCT_ID
c2t0d16s2   CL1-A-4  10111  0    -   -    -    -      OPEN-V-CM
c2t2d36s2   -        -      -    -   -    -    -      -
c2t4d0s2    CL1-A-20 20169  43   -   -    -    -      OPEN-V-CM
c2t6d0s2    CL1-A-11 80025  31F  -   -    -    -      OPEN-V-CM
c2t6d14s2   -        -      -    -   -    -    -      -
c3t2d128s2  CL2-A-6  10262  2180 -   -    -    -      OPEN-V-CM
c3t3d4s2    CL2-A-7  3157   1    -   -    -    -      DF600F-CM

Here are 5 command devices, 2 of which are in bold. The first is a USP; the second is a 9570V. If the user wants to use the first one, they need to code

/dev/rdsk/c2t6d0s2

in the HORCM CONF file.

Windows


Check: Get the user to send you the result of these commands:

C:\HORCM\ETC>raidscan -x findcmddev h020
cmddev of Ser 10111 = PhysicalDrive2
cmddev of Ser 10111 = PhysicalDrive5
cmddev of Ser 41 = PhysicalDrive7
cmddev of Ser 10262 = PhysicalDrive8
cmddev of Ser 80025 = PhysicalDrive10
cmddev of Ser 20169 = PhysicalDrive11
cmddev of Ser 20169 = Volume{3c107ab6-7dbf-11db-a1ed-000e0c6abf1d}

Check: Do not use ANY of these names. If you find a user using this syntax, ask that it be changed. See the INQRAID output below.

Harddisk numbers can change after a reboot. GUID numbers can change in an MS Cluster environment after a reboot. Do yourself a favour: do not use these names.

C:\HORCM\ETC>inqraid $Phys -CLI
DEVICE_FILE PORT  SERIAL LDEV CTG HM12 SSID RGroup PRODUCT_ID
Harddisk0   -     -      -    -   -    -    -      00JS-22MHB0
Harddisk1   -     -      -    -   -    -    -      00JS-22MHB0
Harddisk2   CL1-A 10111  515  -   -    -    -      OPEN-V-CM
Harddisk3   CL1-A 10111  1920 -   ssss 9997 503-02 OPEN-V
Harddisk4   CL1-A 10111  768  -   ssss 9993 506-02 OPEN-V
Harddisk5   CL1-A 10111  1856 -   -    -    -      OPEN-V-CM
Harddisk6   CL1-A 10111  2632 -   Psss 999A 506-02 OPEN-V
Harddisk7   CL1-A 41     0    -   -    -    -      DF600F-CM
Harddisk8   CL1-A 10262  16   -   -    -    -      OPEN-V-CM
Harddisk9   CL1-A 10262  8320 -   ssss 2000 501-05 OPEN-V
Harddisk10  CL1-A 80025  784  -   -    -    -      OPEN-V-CM
Harddisk11  CL1-A 20169  13   -   -    -    -      OPEN-V-CM

CMD syntax has been around since 01-17-03/05. There is no reason not to use it. If the user is running 01-17-03/05 or below, get them to use 01-19-03/04 or higher.

In this case, for Harddisk8/USP 10262, the correct syntax in the HORCM CONF file is:

CMD-10262-16 or even
CMD-10262-16-CL1-A-12 if you know this is HSD 12 - or
CMD-10262-16-CL1-A or, for slack people,
CMD-10262

CMD-10262-16 is my preferred coding technique, as this takes care of multipath environments as well.
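The four forms above all follow the same pattern, which can be sketched as a small Python helper that builds the name from the SERIAL, LDEV and PORT columns of the inqraid output. The function name cmddev_name is hypothetical, the rule that a port is only accepted together with an LDEV is an assumption drawn from the four examples, and any platform-specific device prefix is deliberately left out because this document does not show one.

```python
def cmddev_name(serial, ldev=None, port=None):
    """Build a CMD-style command device name as shown in this document.

    Assumption: the port suffix is only valid when an LDEV is also given.
    """
    if port is not None and ldev is None:
        raise ValueError("a port can only be given together with an LDEV")
    parts = ["CMD", str(serial)]
    if ldev is not None:
        parts.append(str(ldev))
    if port is not None:
        parts.append(port)
    return "-".join(parts)


# Values taken from the Harddisk8 line of the inqraid output above
print(cmddev_name(10262, 16))              # prints: CMD-10262-16
print(cmddev_name(10262, 16, "CL1-A-12"))  # prints: CMD-10262-16-CL1-A-12
print(cmddev_name(10262))                  # prints: CMD-10262
```

Building the name from the inqraid columns rather than typing it by hand avoids transposing the serial number and the LDEV number.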

Starting HORCM Instances

There are so many ways for this to fail that I could write a book on this topic.

So always take the easy way out. Send the user a deck that is bound to work. If it does not, then you have very little to debug. Here is such a deck: HORCM4.CONF

UNIX


HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
10129253 11004 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
/dev/rdsk/c2t6d0s2

HORCM_DEV
#dev_group dev_name port TargetID LU MU

HORCM_INST
#dev_group ip_address service

There are only 3 things to check:

Is the IP address correct? Note: You can use "localhost" here, but this will not work for TC environments using 2 different CCI servers.

Is 11004 a "free" UDP port? Almost certainly it is.

Is the CMDDEV right? You can tell that from the commands we have already issued.

UNIX HORCM CONF files are kept in /etc.

Windows

Here is HORCM8.CONF for Windows:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11008 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
CMD-10262-16

HORCM_DEV
#dev_group dev_name port TargetID LU MU

HORCM_INST
#dev_group ip_address service

Use the same logic as for UNIX. Windows HORCM CONF files are in C:\WINDOWS.

Other recommendations

HDvM uses HORCM CONF files called HORCM900.CONF to HORCM988.CONF for temporary HORCM CONF files. Do not use these numbers yourself.

I suggest that you use 0-799 for user-created files and 800-899 for HDvM-created permanent HORCM CONF files.

I also suggest a port numbering convention of 1100x, where x is the number in HORCMx.CONF. This means that you will need to "reserve" UDP ports 11000 to 11899 for HORCM CONF usage.
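The numbering convention above amounts to "port = 11000 + instance number". A minimal Python sketch of it follows; the function names are hypothetical, and the ranges are simply the ones suggested in this section.

```python
def horcm_port(instance):
    """UDP port for HORCM<instance>.CONF under the 1100x convention."""
    if not 0 <= instance <= 899:
        raise ValueError("this convention only covers instances 0-899")
    return 11000 + instance


def horcm_owner(instance):
    """Who should create HORCM<instance>.CONF, following the ranges above."""
    if 0 <= instance <= 799:
        return "user"
    if 800 <= instance <= 899:
        return "HDvM permanent"
    if 900 <= instance <= 988:
        return "HDvM temporary"
    return "outside the suggested ranges"


print(horcm_port(4), horcm_port(8))        # prints: 11004 11008
print(horcm_owner(8), "/", horcm_owner(900))  # prints: user / HDvM temporary
```

Instances 4 and 8 map to ports 11004 and 11008, which are the values used in the HORCM4.CONF and HORCM8.CONF examples.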

Updating the "Services" file


Many people code HORCM CONF files like this:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 horcm8 1000 3000

In this case, the UDP port (horcm8) must be defined in the "Services" file:

Windows: C:\WINDOWS\system32\drivers\etc\services
UNIX: /etc/services

Like this:

horcm0 11000/udp
horcm1 11001/udp
…
horcm8 11008/udp
horcm9 11009/udp
"blank line"

Check: Under Windows, if there is no blank line after horcm9 (in this example), that definition will be ignored. PS: No blank lines at the end of the HORCM CONF file, please.

Check: If you have 2 CCI servers using horcm8 and horcm9, for example, then both horcm8 and horcm9 have to be defined on both servers.
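Both checks can be run against a copy of the customer's services file. The Python sketch below is a minimal illustration: the function names are hypothetical, it only looks at horcm* entries with a /udp suffix, and it interprets the Windows "blank line" rule as "the last entry must be terminated by a newline", which is an assumption.

```python
def horcm_services(text):
    """Map horcmN service names to UDP port numbers found in services-file text."""
    found = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split columns
        if len(fields) < 2 or not fields[0].startswith("horcm"):
            continue
        port, _, proto = fields[1].partition("/")
        if proto == "udp" and port.isdigit():
            found[fields[0]] = int(port)
    return found


def missing_services(text, needed):
    """Return the names in `needed` that have no UDP definition in the text."""
    found = horcm_services(text)
    return [name for name in needed if name not in found]


def last_entry_terminated(text):
    """True if the file ends with a newline (assumed meaning of "blank line")."""
    return text.endswith("\n")


sample = "horcm8 11008/udp\nhorcm9 11009/udp\n"
print(horcm_services(sample))
print(missing_services(sample, ["horcm8", "horcm9"]))
print(last_entry_terminated(sample))
```

Run the same check on the services file from each CCI server; an empty list from missing_services on both servers is what you want to see.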

Reading the LOGS

Windows

Let's start with Windows this time.

In our example we used Instance 8, so you will find the log here:

C:\HORCM\log8\curlog\horcm_ml_acer510_log.txt

because this server is called ml_acer510.

Let us examine it in detail.

- HORCM STARTUP LOG - Thu Feb 22 180308 2007
180308-39210-05000- horcmgr started on Thu Feb 22 180308 2007
180308-39210-05000- _spawnvp() horcmd_08 using horcmgr [CWD=C:\HORCM\ETC]
180308-3d090-07240- Fibre address conversion TBL has been set to 2

PP RAID Manager for WindowsNT
Model RAID-Manager/WindowsNT
Ver&Rev 01-19-03/04
Release Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006, Hitachi, Ltd.


HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 180308 2007

Lots of useful information here: the CCI version, the server name and the user who started HORCM. See the data in bold.

180308-3d090-07240- horcmd_08 started on Thu Feb 22 180308 2007
180308-3d090-07240- [horcmcfgrdf] access(conf_file) OK
180308-3d090-07240- [horcmcfgrdf] access(check) OK
180308-3d090-07240- [horcmcfgrdf] open(conf_file) OK
180308-3d090-07240- [horcmcfgetent] fseek(top) OK
180308-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where CMD syntax is converted to a physical drive number.

180308-40b28-07240- [horcmcfgetent] read(conf_file) OK180308-40b28-07240- [horcmcfgrdf] close(conf_file) OK180308-40b28-07240- [horcmcfgrdf] check(conf) OK180308-40b28-07240- [horcmcfgrdf] horccmddev(0) OK180308-40b28-07240- [horcread] cmddevopen() start180308-40b28-07240- [horcread] cmddevopen() finished180308-40b28-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number (10262), at the end of the CHAR column.

[0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- [HORCMCFGRDF] SLPR is supported180308-40b28-07240- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012db18]0000 80000000 00000000 00000000 00000000 [0x0012db28]0010 00000000 00000000 00000000 00000000 180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start


180308-40b28-07240- [horcread] cmddevopen() finished180308-40b28-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- 
[HORCMCFGRDF] SLPR bitmap is checked180308-40b28-07240- [horcmcfgrdf] horccmddev(0) OK180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start180308-40b28-07240- [horcread] cmddevopen() finished180308-449a8-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256


180308-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-449a8-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-449a8-07240- [horcmcfgrdf] seldevdata() OK180308-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes180308-449a8-07240- MONHORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -gt C=5T=1] port=CL1-A targ=1 lun=12

The line above shows the AL-PA for the port (0xef), followed by the port (CL1-A), target ID (1) and LUN (12).

180308-449a8-07240- MON(HORC)number of Mus = 0180308-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes180308-449a8-07240- MON(HOMRCF)number of Mus = 0180310-d1b78-05000- horcmgr executed CreateProcess(raidscanexe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

I have quoted this in full for a reason: this is what you should expect to see if it all works. If it does not work, at least you can compare the user’s log with this one.
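When comparing a user's log with this one, the first thing to confirm is usually the serial number in the command device read. The sketch below decodes the row at offset 0030 of the dump, on the assumption that the first three words are EBCDIC digits (0xF0 to 0xF9), which is what the CHAR column of the log suggests. The function name `decode_serial` is hypothetical and not part of CCI.

```python
def decode_serial(hex_words):
    """Decode EBCDIC digit bytes from a horcread dump row.

    hex_words: the first three 4-byte words of the row at offset 0030,
    exactly as printed in the log (assumption: they hold the serial number).
    Returns the digit string and the serial number as an integer.
    """
    raw = bytes.fromhex("".join(hex_words))
    # cp037 is an EBCDIC code page; 0xF0-0xF9 decode to '0'-'9'
    digits = raw.decode("cp037")
    return digits, int(digits)


# Row 0030 from the Windows log quoted above
print(decode_serial(["f0f0f0f0", "f0f0f0f1", "f0f2f6f2"]))
```

If the decoded serial number does not match the array the user believes they are talking to, check the HORCM_CMD entry before looking at anything else.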

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 162959 2007162959-cac9d-11271- horcmgr started on Wed Mar 7 162959 2007162959-cd940-11271- execvp() horcmd_04 using etchorcmgr [CWD=]162959-e99c5-11272- Fibre address conversion TBL has been set to 1

PP: RAID Manager for Solaris
Model: RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Release: Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006, Hitachi, Ltd.

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 163000 2007

163000-11d9d-11272- horcmd_04 started on Wed Mar 7 163000 2007163000-17e65-11272- [horcmcfgrdf] access(conf_file) OK163000-1c076-11272- [horcmcfgrdf] access(check) OK163000-1e127-11272- [horcmcfgrdf] open(conf_file) OK163000-29cf3-11272- [horcmcfgetent] fseek(top) OK163000-31d0e-11272- [horcmcfgetent] read(conf_file) OK163000-34856-11272- [horcmcfgrdf] close(conf_file) OK163000-389cb-11272- [horcmcfgrdf] check(conf) OK163000-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK163000-5ac7f-11272- [horcread] cmddevopen() start163000-63837-11272- [horcread] cmddevopen() finished163000-6e384-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM


[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

This is an NSC55 with a serial number of 80025, read from the row at offset 0030 above.

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163001-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256163001-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163001-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163001-c2226-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2

Here is the CMDDEV: the device at the end of the line above, /dev/rdsk/c2t6d0s2.

163001-c636e-11272- [HORCMCFGRDF] SLPR is supported163001-ca4bf-11272- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfce08]0000 80000000 00000000 00000000 00000000 [0xffbfce18]0010 00000000 00000000 00000000 00000000 163001-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163001-deb6b-11272- [horcread] cmddevopen() start163001-e2d12-11272- [horcread] cmddevopen() finished163001-e7502-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff


[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163002-77659-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163002-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked163002-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK163002-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163002-89c66-11272- [horcread] cmddevopen() start163002-8de05-11272- [horcread] cmddevopen() finished163002-925ff-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 
8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163003-07ece-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163003-0e0d4-11272- [horcmcfgrdf] seldevdata() OK163003-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes163003-16392-11272- MONHORCM_CMD=devrdskc2t6d0s2[Fibre][AL-PA=0xb2 -gt C=2T=32] port=CL1-A targ=32 lun=42

The line above shows the AL-PA for the port (0xb2), followed by the port (CL1-A), target ID (32) and LUN (42).

163003-1a4ba-11272- MON(HORC)number of Mus = 0163003-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes163003-2275a-11272- MON(HOMRCF)number of Mus = 0163007-b3adf-11271- horcmgr executed system(binls devrdsk |

HORCMusrbinraidscan -find inst)
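The HORCM_CMD line in the start-up log is the quickest way to confirm which port, target and LUN the command device sits on. A minimal sketch for pulling those fields out of the line is shown below. It relies only on the `AL-PA=0x..`, `port=`, `targ=` and `lun=` tokens visible in the logs above; the function name `parse_cmd_line` is hypothetical.

```python
import re


def parse_cmd_line(line):
    """Extract AL-PA, port, target ID and LUN from a HORCM_CMD log line."""
    alpa = re.search(r"AL-PA=0x([0-9a-fA-F]+)", line)
    # key=value tokens at the end of the line, e.g. "port=CL1-A targ=32 lun=42"
    fields = dict(re.findall(r"\b(port|targ|lun)=(\S+)", line))
    return {
        "alpa": int(alpa.group(1), 16) if alpa else None,
        "port": fields.get("port"),
        "targ": int(fields["targ"]) if "targ" in fields else None,
        "lun": int(fields["lun"]) if "lun" in fields else None,
    }


solaris_line = ("HORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2,T=32] "
                "port=CL1-A targ=32 lun=42")
print(parse_cmd_line(solaris_line))
```

Running this against both the user's log and a known-good log makes a mismatch in port or LUN easy to spot.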

Audit Logging


Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.
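Because the CCI command log records only what was typed, one way to keep the output as well is to run each command through a small wrapper that writes both to a transcript file. The sketch below is one possible approach and not a CCI feature: `run_logged` and the transcript layout are hypothetical, and the only CCI-specific detail is the HORCC_LOGSZ variable described above.

```python
import os
import subprocess
from datetime import datetime


def run_logged(argv, transcript_path, logsz="2048"):
    """Run a command with HORCC_LOGSZ set and append input and output to a file."""
    # Copy the current environment and switch on full CCI command logging
    env = dict(os.environ, HORCC_LOGSZ=logsz)
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    with open(transcript_path, "a", encoding="utf-8") as fh:
        fh.write(f"--- {datetime.now().isoformat()} $ {' '.join(argv)}\n")
        fh.write(result.stdout)
        fh.write(result.stderr)
        fh.write(f"--- exit {result.returncode}\n")
    return result.returncode


# Example (needs CCI installed and an instance running):
# run_logged(["raidscan", "-p", "CL1-A"], "cci_session.txt")
```

The transcript can then be sent to GSC together with the LOGx directories.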

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This makes the LOGx files much easier to read, as the only messages in the logs will be the ones you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to “cut and paste” the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
…
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
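With full logging on, the command log becomes a record of what was typed and how each command ended. The sketch below lists each command with its exit code. It assumes one CMDLINE line per command, followed later by a line containing `[exit(n)]`, as in the log above; `parse_command_log` is a hypothetical helper name.

```python
import re


def parse_command_log(text):
    """Return a list of {'cmdline': ..., 'exit': ...} entries from a horcc log."""
    entries, current = [], None
    for line in text.splitlines():
        cmd = re.match(r"\s*CMDLINE\s*:?\s*(.+)", line)
        if cmd:
            current = {"cmdline": cmd.group(1).strip(), "exit": None}
            entries.append(current)
            continue
        done = re.search(r"\[exit\((\d+)\)\]", line)
        if done and current is not None:
            # Attach the exit code to the most recent command
            current["exit"] = int(done.group(1))
    return entries


sample = """COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
"""
for entry in parse_command_log(sample):
    print(entry["exit"], entry["cmdline"])
```

Any entry with a non-zero exit code is the place to start reading the detailed logs.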

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example.

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group dev_name port TargetID LU MU
VG01 LDEV49 CL1-A-1 1 7 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group dev_name port TargetID LU MU
VG01 LDEV49 CL1-A-1 1 8 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11008
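A pair of CONF files like these only works if each instance's HORCM_INST entry points at the service that the other instance listens on (11008 and 11009 here). The sketch below checks that cross-reference. It assumes the conventional layout of section keywords on their own lines and header lines starting with `#`; the function names are hypothetical and the parser handles only what is needed for this check.

```python
def parse_horcm_conf(text):
    """Split a HORCM CONF file into {section: [list of fields per line]}."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("HORCM_"):
            current = sections.setdefault(line, [])
        elif current is not None:
            current.append(line.split())
    return sections


def points_at_each_other(conf_a, conf_b):
    """True if each file's HORCM_INST service is the other's HORCM_MON service."""
    def local(conf):
        return conf["HORCM_MON"][0][1]

    def remote(conf):
        return {row[2] for row in conf["HORCM_INST"]}

    return remote(conf_a) == {local(conf_b)} and remote(conf_b) == {local(conf_a)}


conf8 = parse_horcm_conf("""HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000
HORCM_INST
#dev_group ip_address service
VG01 localhost 11009
""")
conf9 = parse_horcm_conf("""HORCM_MON
localhost 11009 1000 3000
HORCM_INST
VG01 localhost 11008
""")
print(points_at_each_other(conf8, conf9))
```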


Check: Is the user using “good syntax”?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT#    ALPA  C  TID#  LU#-M  Num(LDEV#)  P/S    Status  LDEV#  P-Seq#  P-LDEV#
CL1-A-1  ef    5  1     0-0    1(13)       S-VOL  PAIR    13     -----   10
CL1-A-1  ef    5  1     1-0    1(29)       P-VOL  PSUS    29     977     309
CL1-A-1  ef    5  1     2-0    1(48)       P-VOL  PSUS    48     977     300
CL1-A-1  ef    5  1     3-0    1(309)      S-VOL  SSUS    309    -----   29
CL1-A-1  ef    5  1     4-0    1(310)      S-VOL  SSUS    310    -----   29
CL1-A-1  ef    5  1     5-0    1(308)      S-VOL  SSUS    308    -----   24
CL1-A-1  ef    5  1     6-0    1(305)      S-VOL  SSUS    305    -----   1
CL1-A-1  ef    5  1     7-0    1(49)       SMPL   ----    -----  -----   -----
CL1-A-1  ef    5  1     8-0    1(50)       SMPL   ----    -----  -----   -----

C:\HORCM\ETC>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID,LU-M)  Seq#  LDEV#  P/S   Status  Seq#   P-LDEV#  M
VG01   LDEV49(L)     (CL1-A-1 1 7-0)   977   49     SMPL  ----    -----  -----    -
VG01   LDEV49(R)     (CL1-A-1 1 8-0)   977   50     SMPL  ----    -----  -----    -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR : EUserId for HOMRCF[8] : Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE : paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR:cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR : rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ]: An order to the command(control) device failed, or was rejected.
[Action]: Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, meaningless error numbers like -35 and 221. If this is a RAID subsystem, check the SSBLOGS on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

17:02:30-9a8a8 is the cross-check: the same timestamp and ID appear against the failing command in the command log. Next, it is not obvious, but the error code is:

961C 000D

961C comes from bytes C-D of the sense data (the ASC of 0x96 followed by 0x1c), and 000D is the low half of the SSB (0x8400000d).

Now get hold of the latest AMS CCI manual that contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
…
Error      Contents                                                                   Recommended Action
961C 000C  The S-VOL is a Sub LU of a unified LU.                                     Check the status of the LU.
961C 000D  The default controllers controlling the P-VOL and S-VOL are not the same.  …
961C 000E  The P-VOL is a Cache Residency LU.                                         Check the status of …

In this case, the PVOL and SVOL default controllers are not the same.
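The lookup key for the manual can be derived mechanically from the first row of the SCSI SENSE DATA dump. The sketch below assumes standard fixed-format sense data (response code 0x70), where the sense key is the low nibble of byte 2 and the ASC and ASCQ are bytes 12 and 13. Treating bytes 8 to 11 as the SSB follows what the CCI log itself prints; `decode_sense` is a hypothetical helper name.

```python
def decode_sense(hex_words):
    """Decode the first 16 bytes of a sense data dump into a manual lookup key."""
    data = bytes.fromhex("".join(hex_words))
    sense_key = data[2] & 0x0F                 # low nibble of byte 2
    asc, ascq = data[12], data[13]             # additional sense code / qualifier
    ssb = int.from_bytes(data[8:12], "big")    # reported by CCI as SSB
    detail = ssb & 0xFFFF                      # low half of the SSB
    return {
        "skey": sense_key,
        "asc": asc,
        "ascq": ascq,
        "ssb": ssb,
        "lookup": f"{asc:02X}{ascq:02X} {detail:04X}",
    }


# Row 0000 of the sense data quoted above
print(decode_sense(["70000500", "00000038", "8400000d", "961c0000"])["lookup"])
```

The `lookup` string is in the same form as the codes in the manual's table, so it can be searched for directly.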

“Old Syntax” HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the “real” port. With 9900V, USP etc. the LUNs are normally considered to be attached to “logical” ports, which are called HSDs or Host Groups.

However, it is still possible to use the “old” syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSDs. Here is an example.

Imagine that 3 HSDs are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has “absolute” LUNs 0-2, HSD 2 has “absolute” LUNs 3-5 and HSD 3 has “absolute” LUNs 6-8.

Now imagine that the following actions have been performed some time later:
Delete HSD 2
Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and to HSD 3. If you did not know that the previous changes had been made, it would be impossible for you to “guess” that:
HSD 1 LUN 3 was “absolute” LUN 5
HSD 3 LUN 3 was “absolute” LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows “relative” LUN numbers.
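The numbers in this example can be reproduced with a small model. The sketch below assumes that each new LUN takes the lowest free “absolute” number on the port. That rule is an assumption made for illustration, chosen because it yields the figures quoted above; it is not a documented statement of how the array allocates. The `Port` class is hypothetical.

```python
class Port:
    """Toy model of absolute LUN numbering on one physical port."""

    def __init__(self):
        self.absolute = {}  # absolute LUN -> (hsd, relative LUN)

    def add_lun(self, hsd, relative):
        # Assumed rule: take the lowest absolute number not in use
        slot = 0
        while slot in self.absolute:
            slot += 1
        self.absolute[slot] = (hsd, relative)
        return slot

    def delete_hsd(self, hsd):
        self.absolute = {a: v for a, v in self.absolute.items() if v[0] != hsd}


port = Port()
for hsd in (1, 2, 3):          # three HSDs, LUNs 0-2 each, added in sequence
    for rel in (0, 1, 2):
        port.add_lun(hsd, rel)

port.delete_hsd(2)             # frees absolute LUNs 3-5
port.add_lun(4, 0)             # HSD 4 reuses absolute LUN 3
port.add_lun(4, 1)             # and absolute LUN 4
hsd1_lun3 = port.add_lun(1, 3)
hsd3_lun3 = port.add_lun(3, 3)
print(hsd1_lun3, hsd3_lun3)
```

The point of the model is that the “absolute” number depends on the whole history of the port, which is exactly what a reader of an old-syntax CONF file cannot see.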

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the “same order”. However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F 0 45) 32179 10b5 S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C 0 4)  32208 1003 P-VOL PAIR ASYNC 0 108a -  (1)
TC-WRP 1004-108B(L) (CL2-F 0 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C 0 5)  32208 1004 P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F 0 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C 0 6)  32208 1005 P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F 0 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C 0 7)  32208 1006 P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F 0 49) 32179 108a S-VOL PAIR ASYNC 0 1003 -  (2)
TC-WRP 1007-108E(R) (CL1-C 0 8)  32208 1007 P-VOL PAIR ASYNC 0 108e -  (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the “DR” CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches, highlighted in yellow. What is less obvious is that the turquoise and green pairs, marked (1), (2) and (3), are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).
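This kind of cross-check can be done mechanically when the dev_name encodes the intended pair, as it appears to here (for example, 1003-108A for P-VOL 1003 and S-VOL 108A). The sketch below flags rows whose LDEV and partner LDEV do not match the dev_name. The naming convention is an assumption drawn from this one example, the rows are typed in by hand from the display above, and `check_rows` is a hypothetical helper.

```python
def check_rows(rows):
    """Flag pairdisplay rows that disagree with the pair named in dev_name.

    rows: tuples of (dev_name, side, ldev, role, partner_ldev), hex as text.
    """
    problems = []
    for name, side, ldev, role, partner in rows:
        p_expected, s_expected = name.lower().split("-")
        if ldev is None:
            problems.append((name, side, "no pair information"))
            continue
        if role == "P-VOL":
            expected = (p_expected, s_expected)
        else:
            expected = (s_expected, p_expected)
        if (ldev, partner) != expected:
            problems.append((name, side, f"LDEV {ldev} paired with {partner}"))
    return problems


rows = [
    ("1003-108A", "L", "10b5", "S-VOL", "102e"),
    ("1003-108A", "R", "1003", "P-VOL", "108a"),
    ("1004-108B", "L", None, None, None),
    ("1004-108B", "R", "1004", "P-VOL", "108b"),
    ("1007-108E", "L", "108a", "S-VOL", "1003"),
    ("1007-108E", "R", "1007", "P-VOL", "108e"),
]
for problem in check_rows(rows):
    print(problem)
```

Every (L) row in this sample is flagged, while every (R) row is consistent on its own, which is why the pairs look valid unless each (L) row is compared against its dev_name.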

Here is an excerpt from the “old” HORCM CONF file, using “absolute” LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the “absolute” and “relative” LUN numbers:

raidscan -p CL2-F -fx
CL2-F 88 3 0 49 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51 1(108c) S-VOL PAIR ASYNC 108c ----- 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8 1(108c) S-VOL PAIR ASYNC 108c ----- 1005

For the same LDEVs, the LUN column shows 49-51 when the real port is scanned (“absolute”) and 6-8 when the HSD is scanned (“relative”).

Secured CMDDEV and HORCMPERM Implications

If you use a “normal”, i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone’s data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a “normal” CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = \\.\PHYSICALDRIVE1

This is a “normal” CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(\\.\PHYSICALDRIVE1 -> \\.\PHYSICALDRIVE10)
C:\HORCM\ETC>horcctl -D
Current control device = \\.\PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = \\.\PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID,LU)  Seq#      LDEV#  P/S    Status  Fence  Seq#      P-LDEV#  M
VG01   d0(L)         (CL2-D 1 410)   77010027  410    P-VOL  PAIR    NEVER  75010010  410      -
VG01   d0(R)         (CL1-A 1 410)   75010010  410    S-VOL  PAIR    NEVER  -----     410      -
VG01   d1(L)         (CL2-D 1 411)   77010027  411    P-VOL  PAIR    NEVER  75010010  411      -
VG01   d1(R)         (CL1-A 1 411)   75010010  411    S-VOL  PAIR    NEVER  -----     411      -
VG01   d2(L)         (CL2-D 1 412)   77010027  412    P-VOL  PAIR    NEVER  75010010  412      -
VG01   d2(R)         (CL1-A 1 412)   75010010  412    S-VOL  PAIR    NEVER  -----     412      -
VG01   d3(L)         (CL2-D 1 413)   77010027  413    P-VOL  PAIR    NEVER  75010010  413      -
VG01   d3(R)         (CL1-A 1 413)   75010010  413    S-VOL  PAIR    NEVER  -----     413      -
VG01   d4(L)         (CL2-D 1 414)   77010027  414    P-VOL  PAIR    NEVER  75010010  414      -
VG01   d4(R)         (CL1-A 1 414)   75010010  414    S-VOL  PAIR    NEVER  -----     414      -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID,LU)  Seq#      LDEV#  P/S    Status  Fence  Seq#      P-LDEV#  M
VG01   d0(L)         (CL2-D 1 410)   77010027  410    P-VOL  PAIR    NEVER  75010010  410      -
VG01   d0(R)         (CL1-A 1 410)   75010010  ----   ----   -----------            -----    -
VG01   d1(L)         (CL2-D 1 411)   77010027  411    P-VOL  PAIR    NEVER  75010010  411      -
VG01   d1(R)         (CL1-A 1 411)   75010010  ----   ----   -----------            -----    -
VG01   d2(L)         (CL2-D 1 412)   77010027  412    P-VOL  PAIR    NEVER  75010010  412      -
VG01   d2(R)         (CL1-A 1 412)   75010010  ----   ----   -----------            -----    -
VG01   d3(L)         (CL2-D 1 413)   77010027  413    P-VOL  PAIR    NEVER  75010010  413      -
VG01   d3(R)         (CL1-A 1 413)   75010010  ----   ----   -----------            -----    -
VG01   d4(L)         (CL2-D 1 414)   77010027  414    P-VOL  PAIR    NEVER  75010010  414      -
VG01   d4(R)         (CL1-A 1 414)   75010010  ----   ----   -----------            -----    -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let’s start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let’s stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>

HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID,LU)  Seq#      LDEV#  P/S    Status  Fence  Seq#      P-LDEV#  M
VG01   d0(L)         (CL2-D 1 410)   77010027  410    P-VOL  PAIR    NEVER  75010010  410      -
VG01   d0(R)         (CL1-A 1 410)   75010010  ----   ----   -----------            -----    -
VG01   d1(L)         (CL2-D 1 411)   77010027  411    P-VOL  PAIR    NEVER  75010010  411      -
VG01   d1(R)         (CL1-A 1 411)   75010010  ----   ----   -----------            -----    -
VG01   d2(L)         (CL2-D 1 412)   77010027  412    P-VOL  PAIR    NEVER  75010010  412      -
VG01   d2(R)         (CL1-A 1 412)   75010010  ----   ----   -----------            -----    -
VG01   d3(L)         (CL2-D 1 413)   77010027  ----   ----   -----------            -----    -
VG01   d3(R)         (CL1-A 1 413)   75010010  ----   ----   -----------            -----    -
VG01   d4(L)         (CL2-D 1 414)   77010027  ----   ----   -----------            -----    -
VG01   d4(R)         (CL1-A 1 414)   75010010  ----   ----   -----------            -----    -

The bold lines show what has changed: d3(L) and d4(L) no longer show any pair information. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE     PORT   SERIAL    LDEV  CTG  H/M/12  SSID  R:Group  PRODUCT_ID
E:\Vol13\Dsk54  CL2-D  77010027  410   -    P/s/ss  0000  A:07-00  DF600F
F:\Vol14\Dsk55  CL2-D  77010027  411   -    P/s/ss  0000  A:07-00  DF600F
Q:\Vol11\Dsk12  CL1-B  3157      169   -    P/s/ss  0000  5:02-00  DF600F
G:\Vol15\Dsk56  CL2-D  77010027  412   -    P/s/ss  0000  A:07-00  DF600F
R:\Vol12\Dsk13  CL1-B  3157      170   -    P/s/ss  0000  5:02-00  DF600F
H:\Vol16\Dsk57  CL2-D  77010027  413   -    P/s/ss  0000  A:07-00  DF600F
I:\Vol17\Dsk58  CL2-D  77010027  414   -    P/s/ss  0000  A:07-00  DF600F
J:\Vol2\Dsk0    -      -         -     -    -       -     -        ST336754LC

The bold lines show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
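A quick way to confirm this kind of mismatch is to expand the ranges in the permission file and test the disk numbers reported by inqraid against them. The sketch below handles only the `hdN` and `hdN-M` forms seen in this example; whether other forms are valid in HORCMPERM files is not covered here, and `allowed_disks` is a hypothetical helper name.

```python
import re


def allowed_disks(perm_text):
    """Expand 'hdN' / 'hdN-M' lines into a set of physical drive numbers."""
    allowed = set()
    for line in perm_text.splitlines():
        m = re.fullmatch(r"hd(\d+)(?:-(\d+))?", line.strip(), re.IGNORECASE)
        if m:
            first = int(m.group(1))
            last = int(m.group(2) or first)
            allowed.update(range(first, last + 1))
    return allowed


permitted = allowed_disks("hd0-56\n")
for disk in (54, 55, 56, 57, 58):   # Dsk numbers from the inqraid output
    print(disk, "allowed" if disk in permitted else "NOT allowed")
```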

Note that it is possible to ldquofixrdquo this ldquomistakerdquo by manual use of the raidscan command as follows

CHORCMetcgtecho hd57-58 | raidscan -find instDEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEVHarddisk57 VG01 d3 CL2-D 1 413 0 77010027 413Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM.CONF and then stop and restart horcm.
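The cross-check above (which LDEVs sit on physical drives outside the permitted range) is easy to script when the listing is long. The following is only a sketch: the function name is hypothetical, the rows are the inqraid rows quoted above, and the 0-56 range is the one used in this example.

```python
import re


def disks_outside_range(rows, first_allowed, last_allowed):
    """Return sorted (disk_number, ldev) pairs for array LUNs outside the allowed disk range."""
    blocked = []
    for row in rows:
        fields = row.split()
        match = re.search(r"Dsk(\d+)$", fields[0])
        if match is None or fields[1] == "-":
            continue  # not a DskNN device name, or not an array LUN
        disk = int(match.group(1))
        if not first_allowed <= disk <= last_allowed:
            blocked.append((disk, int(fields[3])))  # fields[3] is the LDEV column
    return sorted(blocked)


# Device name, port, serial and LDEV columns from the inqraid output above.
rows = [
    r"E:\Vol13\Dsk54 CL2-D 77010027 410",
    r"F:\Vol14\Dsk55 CL2-D 77010027 411",
    r"Q:\Vol11\Dsk12 CL1-B 3157 169",
    r"G:\Vol15\Dsk56 CL2-D 77010027 412",
    r"R:\Vol12\Dsk13 CL1-B 3157 170",
    r"H:\Vol16\Dsk57 CL2-D 77010027 413",
    r"I:\Vol17\Dsk58 CL2-D 77010027 414",
    r"J:\Vol2\Dsk0 - - -",
]

print(disks_outside_range(rows, 0, 56))  # [(57, 413), (58, 414)]
```

It only reports the mismatch. The fix is still to correct the permission file and restart horcm.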

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall bind
WSAerr 10049(0x00002741) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 12:43:03 2008
SrcFile shorcmcc
SrcLine 2405

ERRORcmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above it: the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information from it:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is that not many people know where to look.
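If you see these Winsock numbers often, a small lookup saves a trip to the header file. This is a sketch only: the function name is hypothetical, the regular expression assumes the "WSAerr" label is followed by the decimal code as in the log above, and the table holds just the two codes quoted in this document.

```python
import re

WSABASEERR = 10000

# Only the two codes quoted in this document; winsock2.h defines many more.
WSA_NAMES = {
    WSABASEERR + 13: ("WSAEACCES", "Permission denied"),
    WSABASEERR + 49: ("WSAEADDRNOTAVAIL", "Cannot assign requested address"),
}


def explain_wsaerr(log_text):
    """Find the WSAerr number in a HORCM error block and look it up."""
    match = re.search(r"WSAerr\s*:?\s*(\d+)", log_text)
    if match is None:
        return None
    code = int(match.group(1))
    name, description = WSA_NAMES.get(code, ("unknown", "not in this table"))
    return code, name, description


print(explain_wsaerr("SysCall bind WSAerr 10049(0x00002741) (See winsock2.h)"))
```

The hexadecimal value printed in the log is the same number: 0x2741 is 10049.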

UNIX

UNIX error messages are not only different from the Windows ones, they are also different on each platform. Here is the same error on Solaris:

[System Call Error]
SysCall bind
Errorno 126 (Cannot assign requested address)
ErrInfo Internal Error
ErrTime Tue Sep 2 11:45:40 2008
SrcFile shorcmcc
SrcLine 2427

ERRORcmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

                AIX 4.3,5.1                        HP-UX 11.22                         Solaris 9,10
EADDRNOTAVAIL   68 Can't assign requested address  227 Can't assign requested address  126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
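Because the number differs per platform, it helps to keep the values from the table above in one place. A minimal sketch follows. The dictionary contains only the three values quoted in this document, the function names are hypothetical, and the second function simply asks the local Python runtime what the number and text are on the machine where it runs.

```python
import errno
import os

# EADDRNOTAVAIL values quoted in the table above.
EADDRNOTAVAIL_BY_PLATFORM = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def platforms_for_errno(number, table=EADDRNOTAVAIL_BY_PLATFORM):
    """Return the platforms on which this errno number means EADDRNOTAVAIL."""
    return sorted(name for name, value in table.items() if value == number)


def local_eaddrnotavail():
    """Return (number, message) for EADDRNOTAVAIL on the machine running this script."""
    return errno.EADDRNOTAVAIL, os.strerror(errno.EADDRNOTAVAIL)


print(platforms_for_errno(126))  # ['Solaris']
print(local_eaddrnotavail())
```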

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread()cannot open command deviceCMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERRORhorcm_cfg_create
12:52:28-0b3b0-01136- horcmgrFailed to connect to HORCM

Here I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to "horcm42":

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERRORhorcm_cfg_create

Once again, it is fairly obvious what is wrong.
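You can test a service entry before HORCM ever sees it. The sketch below is not how HORCM itself resolves the name (I have no visibility of that). It only uses the standard socket library to ask the operating system the same question, and the function name is hypothetical.

```python
import socket


def resolve_service(service, protocol="udp"):
    """Return the port number for a HORCM_MON service entry, or None if it cannot be resolved."""
    if service.isdigit():
        port = int(service)
        return port if 0 < port < 65536 else None
    try:
        # Looks the name up in the system services database.
        return socket.getservbyname(service, protocol)
    except OSError:
        return None


print(resolve_service("11042"))    # 11042
print(resolve_service("horcm42"))  # None unless horcm42 is defined for UDP on this machine
```

A result of None for a name is a strong hint that the name is missing from the services file.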

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall bind
WSAerr 10013(0x0000271d) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 17:39:46 2008
SrcFile shorcmcc
SrcLine 2405

ERRORcmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

  Proto  Local Address            Foreign Address  State
  UDP    ml_acer510:microsoft-ds  *:*
  UDP    ml_acer510:isakmp        *:*
  UDP    ml_acer510:1030          *:*
  ...
  UDP    ml_acer510:54323         *:*

It is not likely that you will be this lucky.
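If the customer can run a script on the CCI server, the two bind failures in this section (wrong IP address, port already in use) can be reproduced without HORCM. This is a sketch under assumptions: the function name is hypothetical, it does a plain UDP bind with the standard socket library, and the error name returned depends on the operating system, so do not expect exactly the codes shown above.

```python
import errno
import socket


def try_bind_udp(ip_address, port):
    """Try to bind a UDP socket; return None on success or the error name on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip_address, port))
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
    return None


if __name__ == "__main__":
    # Port 11042 is the one used in the example HORCM_MON section.
    print(try_bind_udp("127.0.0.1", 11042))
```

The socket is closed again straight away, so the check does not hold the port. Run it while HORCM is stopped, otherwise HORCM itself will be the process using the port.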


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


Check: Get the user to send you the result of these commands:

C:\HORCM\ETC>raidscan -x findcmddev h020
cmddev of Ser 10111 = PhysicalDrive2
cmddev of Ser 10111 = PhysicalDrive5
cmddev of Ser 41 = PhysicalDrive7
cmddev of Ser 10262 = PhysicalDrive8
cmddev of Ser 80025 = PhysicalDrive10
cmddev of Ser 20169 = PhysicalDrive11
cmddev of Ser 20169 = Volume3c107ab6-7dbf-11db-a1ed-000e0c6abf1d

Check: Do not use ANY of these names. If you find a user using this syntax, ask that it be changed. See the INQRAID output below.

Harddisk numbers can change after a reboot. GUID numbers can change in an MS Cluster environment after a reboot. Do yourself a favour: do not use these names.

C:\HORCM\ETC>inqraid $Phys -CLI
DEVICE_FILE  PORT   SERIAL  LDEV  CTG  H/M/12  SSID  R:Group  PRODUCT_ID
Harddisk0    -      -       -     -    -       -     -        00JS-22MHB0
Harddisk1    -      -       -     -    -       -     -        00JS-22MHB0
Harddisk2    CL1-A  10111   515   -    -       -     -        OPEN-V-CM
Harddisk3    CL1-A  10111   1920  -    ssss    9997  503-02   OPEN-V
Harddisk4    CL1-A  10111   768   -    ssss    9993  506-02   OPEN-V
Harddisk5    CL1-A  10111   1856  -    -       -     -        OPEN-V-CM
Harddisk6    CL1-A  10111   2632  -    Psss    999A  506-02   OPEN-V
Harddisk7    CL1-A  41      0     -    -       -     -        DF600F-CM
Harddisk8    CL1-A  10262   16    -    -       -     -        OPEN-V-CM
Harddisk9    CL1-A  10262   8320  -    ssss    2000  501-05   OPEN-V
Harddisk10   CL1-A  80025   784   -    -       -     -        OPEN-V-CM
Harddisk11   CL1-A  20169   13    -    -       -     -        OPEN-V-CM

CMD syntax has been around since 01-17-03/05. There is no reason not to use it. If the user is running 01-17-03/05 or below, get them to use 01-19-03/04 or higher.

In this case, for Harddisk8 (USP 10262), the correct syntax in the HORCM CONF file is:

CMD-10262-16
or even
CMD-10262-16-CL1-A-12 (if you know this is HSD 12)
or
CMD-10262-16-CL1-A
or, for slack people,
CMD-10262

CMD-10262-16 is my preferred coding technique, as this takes care of multipath environments as well.
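Building the preferred CMD-serial-LDEV name from the inqraid output is mechanical, so it can be scripted. The sketch below is hypothetical helper code, not part of CCI. It assumes the nine-column "inqraid -CLI" layout shown above, treats any PRODUCT_ID ending in -CM as a command device, and emits the name exactly in the form used in this document.

```python
def cmddev_names(inqraid_rows):
    """Return (device_file, 'CMD-serial-ldev') for every command device row."""
    names = []
    for row in inqraid_rows:
        fields = row.split()
        # Columns: DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
        if len(fields) >= 9 and fields[-1].endswith("-CM"):
            serial, ldev = fields[2], fields[3]
            names.append((fields[0], f"CMD-{serial}-{ldev}"))
    return names


rows = [
    "Harddisk2 CL1-A 10111 515 - - - - OPEN-V-CM",
    "Harddisk3 CL1-A 10111 1920 - ssss 9997 503-02 OPEN-V",
    "Harddisk7 CL1-A 41 0 - - - - DF600F-CM",
    "Harddisk8 CL1-A 10262 16 - - - - OPEN-V-CM",
]

for device, name in cmddev_names(rows):
    print(device, name)
```

For Harddisk8 this prints CMD-10262-16, which is the name recommended above.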

Starting HORCM Instances

There are so many ways for this to fail that I could write a book on this topic.

So always take the easy way out. Send the user a deck that is bound to work. If it does not, then you have very little to debug. Here is such a deck: HORCM4.CONF

UNIX


HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
10129253 11004 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
/dev/rdsk/c2t6d0s2

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

There are only 3 things to check:

1. Is the IP address correct? Note: You can use "localhost" here, but this will not work for TC environments using 2 different CCI servers.
2. Is 11004 a "free" UDP port? Almost certainly it is.
3. Is the CMDDEV right? You can tell that from the commands we have already issued.

UNIX HORCM CONF files are kept in /etc.

Windows

Here is HORCM8.CONF for Windows:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11008 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
CMD-10262-16

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

Use the same logic as for UNIX. Windows HORCM CONF files are in C:\WINDOWS.

Other recommendations

HDvM uses HORCM CONF files called HORCM900.CONF to HORCM988.CONF for temporary HORCM CONF files. Do not use these numbers yourself.

I suggest that you use 0-799 for user-created files and 800-899 for HDvM-created permanent HORCM CONF files.

I also suggest a port numbering convention of 11000 + x, where x is the number in HORCMx.CONF (so HORCM8.CONF uses 11008). This means that you will need to "reserve" UDP ports 11000 to 11899 for HORCM CONF usage.
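The convention is simple enough to write down as code, which is handy if you generate HORCM CONF files from a script. The function names below are hypothetical and the ranges are just the ones suggested above.

```python
def horcm_port(instance):
    """UDP port for HORCMx.CONF under the 11000 + x convention (instances 0-899)."""
    if not 0 <= instance <= 899:
        raise ValueError("instance outside the 0-899 range covered by the convention")
    return 11000 + instance


def instance_owner(instance):
    """Say who should be using a given HORCM instance number."""
    if 0 <= instance <= 799:
        return "user"
    if 800 <= instance <= 899:
        return "HDvM permanent"
    if 900 <= instance <= 988:
        return "HDvM temporary"
    return "unassigned"


print(horcm_port(8))        # 11008
print(instance_owner(900))  # HDvM temporary
```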

Updating the "Services" file


Many people code HORCM CONF files like this:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 horcm8 1000 3000

In this case the UDP port name, horcm8, must be defined in the "Services" file:

Windows: C:\WINDOWS\system32\drivers\etc\services
UNIX: /etc/services

Like this:

horcm0 11000/udp
horcm1 11001/udp
...
horcm8 11008/udp
horcm9 11009/udp
"blank line"

Check: Under Windows, if there is no blank line after horcm9 (in this example), that definition will be ignored. PS: No blank lines at the end of the HORCM CONF file, please.

Check: If you have 2 CCI servers using horcm8 and horcm9, for example, then both horcm8 and horcm9 have to be defined on both servers.
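If you hand the user the exact lines to append, the blank-line problem cannot happen. A minimal sketch, assuming the 11000 + x port convention suggested earlier. The function name is hypothetical and the tab separator is my choice (the services file accepts any whitespace between name and port).

```python
def services_entries(instances, base_port=11000):
    """Build services-file lines for the given HORCM instances, ending with a blank line."""
    lines = [f"horcm{number}\t{base_port + number}/udp" for number in instances]
    # The trailing "\n\n" terminates the last entry and adds the blank line after it.
    return "\n".join(lines) + "\n\n"


print(services_entries([8, 9]), end="")
```

Send the same output to both CCI servers so that horcm8 and horcm9 are defined on each.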

Reading the LOGS

Windows

Let's start with Windows this time.

In our example we used Instance 8, so you will find the log here:

C:\HORCM\log8\curlog\horcm_ml_acer510_log.txt

because this server is called ml_acer510.

Let us examine it in detail.

- HORCM STARTUP LOG - Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- horcmgr started on Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- _spawnvp() horcmd_08 using horcmgr [CWD=C:\HORCM\ETC]
18:03:08-3d090-07240- Fibre address conversion TBL has been set to 2

PP RAID Manager for WindowsNT
Model RAID-Manager/WindowsNT
Ver&Rev 01-19-03/04
Release Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 18:03:08 2007

Lots of useful information here, such as the CCI version and release, the server name, and the user who started HORCM.

18:03:08-3d090-07240- horcmd_08 started on Thu Feb 22 18:03:08 2007
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
18:03:08-3d090-07240- [horcmcfgetent] fseek(top) OK
18:03:08-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where CMD syntax is converted to a physical drive number.

18:03:08-40b28-07240- [horcmcfgetent] read(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] close(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] check(conf) OK
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number: 10262, shown as 000000010262 in the CHAR column of the last row.

[0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- [HORCMCFGRDF] SLPR is supported180308-40b28-07240- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012db18]0000 80000000 00000000 00000000 00000000 [0x0012db28]0010 00000000 00000000 00000000 00000000 180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start


180308-40b28-07240- [horcread] cmddevopen() finished180308-40b28-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- 
[HORCMCFGRDF] SLPR bitmap is checked180308-40b28-07240- [horcmcfgrdf] horccmddev(0) OK180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start180308-40b28-07240- [horcread] cmddevopen() finished180308-449a8-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256


18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is donePhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MONHORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5T=1] port=CL1-A targ=1 lun=12

Here are the AL-PA for the port, the port name, the target ID and the LUN.

18:03:08-449a8-07240- MON(HORC)number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF)number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user's log with this one.
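The serial number in the hex dump is not ASCII like the "HITACHI OPEN REM..." text before it. The bytes f0 to f9 look like EBCDIC digits, which is my inference from the CHAR column rather than anything documented here. Under that assumption, a short sketch to decode them (the function name is hypothetical, cp037 is the EBCDIC codec that ships with Python):

```python
def decode_serial(hex_words):
    """Decode EBCDIC digit bytes, written as space-separated hex words, into a string."""
    raw = bytes.fromhex("".join(hex_words.split()))
    return raw.decode("cp037")


windows_log = decode_serial("f0f0f0f0 f0f0f0f1 f0f2f6f2")
print(windows_log, int(windows_log))  # 000000010262 10262
```

The same call on the first three words of the 0030 row in a customer's log tells you which subsystem the command device really belongs to.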

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP RAID Manager for Solaris
Model RAID-Manager/Solaris
Ver&Rev 01-19-03/04
Release Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

16:30:00-11d9d-11272- horcmd_04 started on Wed Mar 7 16:30:00 2007
16:30:00-17e65-11272- [horcmcfgrdf] access(conf_file) OK
16:30:00-1c076-11272- [horcmcfgrdf] access(check) OK
16:30:00-1e127-11272- [horcmcfgrdf] open(conf_file) OK
16:30:00-29cf3-11272- [horcmcfgetent] fseek(top) OK
16:30:00-31d0e-11272- [horcmcfgetent] read(conf_file) OK
16:30:00-34856-11272- [horcmcfgrdf] close(conf_file) OK
16:30:00-389cb-11272- [horcmcfgrdf] check(conf) OK
16:30:00-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:00-5ac7f-11272- [horcread] cmddevopen() start
16:30:00-63837-11272- [horcread] cmddevopen() finished
16:30:00-6e384-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163001-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256163001-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163001-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163001-c2226-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2

Here is the CMDDEV

163001-c636e-11272- [HORCMCFGRDF] SLPR is supported163001-ca4bf-11272- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfce08]0000 80000000 00000000 00000000 00000000 [0xffbfce18]0010 00000000 00000000 00000000 00000000 163001-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163001-deb6b-11272- [horcread] cmddevopen() start163001-e2d12-11272- [horcread] cmddevopen() finished163001-e7502-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff


[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163002-77659-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163002-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked163002-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK163002-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163002-89c66-11272- [horcread] cmddevopen() start163002-8de05-11272- [horcread] cmddevopen() finished163002-925ff-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 
8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163003-07ece-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163003-0e0d4-11272- [horcmcfgrdf] seldevdata() OK163003-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes163003-16392-11272- MONHORCM_CMD=devrdskc2t6d0s2[Fibre][AL-PA=0xb2 -gt C=2T=32] port=CL1-A targ=32 lun=42

Here are the AL-PA for the port, the port name, the target ID and the LUN.

16:30:03-1a4ba-11272- MON(HORC)number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF)number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)
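When you have to compare several customer logs, the HORCM_CMD line is the one worth pulling out, because it names the command device, AL-PA, port, target ID and LUN in one place. The sketch below is hypothetical helper code. The pattern relies only on the "HORCM_CMD=", "AL-PA=", "port=", "targ=" and "lun=" markers visible in the logs above, so it should not care how the rest of the line is punctuated.

```python
import re

HORCM_CMD_PATTERN = re.compile(
    r"HORCM_CMD=(?P<device>[^\[\s]+)\[.*?AL-PA=(?P<alpa>0x[0-9a-fA-F]+)"
    r".*?port=(?P<port>\S+)\s+targ=(?P<targ>\d+)\s+lun=(?P<lun>\d+)"
)


def parse_horcm_cmd_line(line):
    """Extract the command device details from a 'MON ... HORCM_CMD=' log line."""
    match = HORCM_CMD_PATTERN.search(line)
    if match is None:
        return None
    found = match.groupdict()
    found["targ"] = int(found["targ"])
    found["lun"] = int(found["lun"])
    return found


line = ("16:30:03-16392-11272- MON HORCM_CMD=/dev/rdsk/c2t6d0s2"
        "[Fibre][AL-PA=0xb2 -> C=2 T=32] port=CL1-A targ=32 lun=42")
print(parse_horcm_cmd_line(line))
```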

Audit Logging


Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be what you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:50:36 2007
CMDLINE raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:51:53 2007
CMDLINE /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
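With HORCC_LOGSZ set, the command log becomes a usable audit trail, and a few lines of script will turn it into a list of what was typed and how each command ended. The sketch below is hypothetical helper code. It relies only on the "CMDLINE" and "[exit(n)]" markers visible in the log above, and it tolerates a colon after CMDLINE in case the real file has one.

```python
import re

EXIT_PATTERN = re.compile(r"\[exit\((\d+)\)\]")


def commands_with_exit_codes(log_text):
    """Return a list of (command line, exit code) pairs from a horcc command log."""
    results = []
    pending = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("CMDLINE"):
            pending = line[len("CMDLINE"):].lstrip(" :").strip()
            continue
        match = EXIT_PATTERN.search(line)
        if match and pending is not None:
            results.append((pending, int(match.group(1))))
            pending = None
    return results


sample = """\
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:50:36 2007
CMDLINE raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:51:53 2007
CMDLINE /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""

for command, code in commands_with_exit_codes(sample):
    print(code, command)
```

Any command with a non-zero exit code is where to start reading.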

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example:

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11008


Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.
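A missing MU is easy to overlook by eye, so here is a sketch of a check you could run against a user's CONF file. It is hypothetical helper code, not a full HORCM CONF parser. It assumes the four section names used in this document, treats lines starting with # as comments, and expects six fields per HORCM_DEV entry (dev_group, dev_name, port, TargetID, LU, MU).

```python
SECTION_NAMES = {"HORCM_MON", "HORCM_CMD", "HORCM_DEV", "HORCM_INST"}


def parse_horcm_conf(text):
    """Split a HORCM CONF file into sections, each a list of whitespace-separated rows."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line in SECTION_NAMES:
            current = sections.setdefault(line, [])
        elif current is not None:
            current.append(line.split())
    return sections


def devices_missing_mu(text):
    """Return the dev_name of every HORCM_DEV entry that has no MU field."""
    devices = parse_horcm_conf(text).get("HORCM_DEV", [])
    return [fields[1] for fields in devices if 1 < len(fields) < 6]


conf = """\
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0
VG01 LDEV50 CL1-A-1 1 8
"""
print(devices_missing_mu(conf))  # ['LDEV50']
```

LDEV50 here is a made-up second entry, added only to show what a missing MU looks like.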

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT# /ALPA/C,TID#,LU#.Num(LDEV#....)...P/S, Status,LDEV#,P-Seq#,P-LDEV#
CL1-A-1ef 5 1 0-0 1(13)S-VOL PAIR 13 ----- 10
CL1-A-1ef 5 1 1-0 1(29)P-VOL PSUS 29 977 309
CL1-A-1ef 5 1 2-0 1(48)P-VOL PSUS 48 977 300
CL1-A-1ef 5 1 3-0 1(309)S-VOL SSUS 309 ----- 29
CL1-A-1ef 5 1 4-0 1(310)S-VOL SSUS 310 ----- 29
CL1-A-1ef 5 1 5-0 1(308)S-VOL SSUS 308 ----- 24
CL1-A-1ef 5 1 6-0 1(305)S-VOL SSUS 305 ----- 1
CL1-A-1ef 5 1 7-0 1(49)SMPL ---- ----- ----- -----
CL1-A-1ef 5 1 8-0 1(50)SMPL ---- ----- ----- -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU-M) ,Seq#,LDEV#.P/S,Status, Seq#,P-LDEV# M
VG01 LDEV49(L) (CL1-A-1 1 7-0 ) 977 49SMPL --------- ----- -
VG01 LDEV49(R) (CL1-A-1 1 8-0 ) 977 50SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details

And in the log we see this:

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERRORcm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ] An order to the command(control) device failed, or was rejected
[Action] Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel
(1) Check if the HORC or HOMRCF function is installed in the RAID
(2) Check if the RCP and LCP are installed in the RAID
(3) Check if the path between the RAID CUs is established by using the SVP
(4) Check if the pair target volume is an appropriate status

Yes, meaningless error message numbers like -35 and 221. If this is a RAID subsystem, check the SSB logs on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

17:02:30-9a8a8 is the cross-check: the same timestamp appears in both the command log and the HORCM log. Next, it is not obvious, but the error code is:

961C 000D
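The code can be read straight from the first row of the sense dump. The following is a minimal Python sketch, assuming the dump is standard SCSI fixed-format sense data (response code 0x70): the sense key is the low nibble of byte 2, the ASC and ASCQ are bytes 12 and 13, and the four bytes at offset 8 are what CCI logs as the SSB. The rule used to build the lookup key (ASC and ASCQ, followed by the low 16 bits of the SSB) is inferred from this one example only, and the function name decode_sense is hypothetical.

```python
def decode_sense(hex_words):
    """Decode the first 16 bytes of SCSI fixed-format sense data."""
    data = bytes.fromhex("".join(hex_words))
    if data[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    skey = data[2] & 0x0F                      # sense key
    ssb = int.from_bytes(data[8:12], "big")    # logged by CCI as SSB
    asc, ascq = data[12], data[13]
    # Assumption: the manual's lookup key is ASC+ASCQ plus the low 16 bits of the SSB.
    code = f"{asc:02X}{ascq:02X} {ssb & 0xFFFF:04X}"
    return skey, asc, ssb, code


# First row of the sense dump shown above.
row = ["70000500", "00000038", "8400000d", "961c0000"]
skey, asc, ssb, code = decode_sense(row)
print(f"SKEY = 0x{skey:02x}  ASC = 0x{asc:02x}  SSB = 0x{ssb:08x}  code = {code}")
```

The printed SKEY, ASC and SSB values can be compared with the three summary lines that CCI writes after the dump.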

Now get hold of the latest AMS CCI manual which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
Error      Contents                                                                   Recommended Action
961C 000C  The S-VOL is a Sub LU of a unified LU.                                     Check the status of the LU.
961C 000D  The default controllers controlling the P-VOL and S-VOL are not the same.  ...
961C 000E  The P-VOL is a Cache Residency LU.                                         Check the status of ...

In this case, the PVOL and SVOL default controllers are not the same.

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With 9900V, USP etc., the LUNs are normally considered to be attached to "logical" ports, which are called HSD or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added and deleted from various HSD. Here is an example.

Imagine that 3 HSD are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered as 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later: Delete HSD 2. Add HSD 4 with LUNs 0 and 1.

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that:

HSD 1 LUN 3 was "absolute" LUN 5
HSD 3 LUN 3 was "absolute" LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.
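The numbers in this example can be reproduced with a small simulation. This is only a sketch: it assumes that each new LUN on a port takes the lowest free "absolute" number, which is a rule inferred from the figures above rather than a documented behaviour, and the Port class is hypothetical.

```python
def lowest_free(used):
    """Return the smallest non-negative integer not in the set."""
    n = 0
    while n in used:
        n += 1
    return n


class Port:
    """Toy model of one physical port carrying several HSDs."""

    def __init__(self):
        self.absolute = {}  # (hsd, relative LUN) -> absolute LUN

    def add_lun(self, hsd, rel):
        # Assumption: a new LUN gets the lowest absolute number not in use.
        used = set(self.absolute.values())
        self.absolute[(hsd, rel)] = lowest_free(used)
        return self.absolute[(hsd, rel)]

    def delete_hsd(self, hsd):
        self.absolute = {k: v for k, v in self.absolute.items() if k[0] != hsd}


port = Port()
for hsd in (1, 2, 3):          # three HSDs, relative LUNs 0-2 each
    for rel in (0, 1, 2):
        port.add_lun(hsd, rel)
port.delete_hsd(2)             # frees absolute LUNs 3-5
port.add_lun(4, 0)             # HSD 4 reuses the freed numbers
port.add_lun(4, 1)
hsd1_lun3 = port.add_lun(1, 3)
hsd3_lun3 = port.add_lun(3, 3)
print(hsd1_lun3, hsd3_lun3)
```

Under that assumption the two new LUNs land on absolute LUNs 5 and 9, which is exactly the result nobody could guess without knowing the history of the port.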

In a recent case, 47 S-VOL LUNs were deleted by mistake from a HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F, 0, 45)32179 10b5.S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C, 0, 4)32208 1003.P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F, 0, 46)32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C, 0, 5)32208 1004.P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F, 0, 47)32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C, 0, 6)32208 1005.P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F, 0, 48)32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C, 0, 7)32208 1006.P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F, 0, 49)32179 108a.S-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C, 0, 8)32208 1007.P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches, in yellow. What is less obvious is that the turquoise and green pairs are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).
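The same cross-check can be done mechanically. The sketch below assumes this customer's naming convention, in which each device name has the form "P-VOL LDEV, hyphen, S-VOL LDEV" in hex (for example 1003-108A). That convention is specific to this configuration, the rows are typed in by hand from the display above, and the helper names are hypothetical.

```python
# (dev_name, side, LDEV, role, partner LDEV); None where pairdisplay showed only dashes
rows = [
    ("1003-108A", "L", "10b5", "S-VOL", "102e"),
    ("1003-108A", "R", "1003", "P-VOL", "108a"),
    ("1004-108B", "L", None, None, None),
    ("1004-108B", "R", "1004", "P-VOL", "108b"),
    ("1007-108E", "L", "108a", "S-VOL", "1003"),
    ("1007-108E", "R", "1007", "P-VOL", "108e"),
]


def row_matches(name, role, ldev, partner):
    """True if the reported LDEVs agree with the P-VOL/S-VOL pair named in dev_name."""
    pvol, svol = name.lower().split("-")
    if role == "P-VOL":
        return (ldev, partner) == (pvol, svol)
    if role == "S-VOL":
        return (ldev, partner) == (svol, pvol)
    return False  # no pair information reported for this side


def bad_entries(rows):
    """Return dev_names for which at least one side does not match its name."""
    bad = []
    for name, _side, ldev, role, partner in rows:
        if not row_matches(name, role, ldev, partner) and name not in bad:
            bad.append(name)
    return bad


print(bad_entries(rows))
```

An entry is reported as bad if either side disagrees with its name, which is why 1007-108E is flagged even though its (R) line looks correct on its own.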

Here is an excerpt from the "old" HORCM CONF file, using "absolute" LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the "absolute" and "relative" LUN numbers:

raidscan -p CL2-F -fx
CL2-F /88/ 3, 0, 49.1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F /88/ 3, 0, 50.1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F /88/ 3, 0, 51.1(108c) S-VOL PAIR ASYNC 108c ----- 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 /88/ 3, 0, 6.1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 /88/ 3, 0, 7.1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 /88/ 3, 0, 8.1(108c) S-VOL PAIR ASYNC 108c ----- 1005

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a "normal" CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10*)
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 410.S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 411.S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 412.S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D, 1, 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413)75010010 413.S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D, 1, 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414)75010010 414.S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414)75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt to do any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413)77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A, 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414)77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A, 1, 414)75010010 ---- ---- ----------- ----- -

The bold lines, d3(L) and d4(L), show what has changed. Here is the bottom of the start up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE     PORT   SERIAL    LDEV CTG H/M/12  SSID R:Group  PRODUCT_ID
E:\Vol13\Dsk54  CL2-D  77010027  410  -   P/s/ss  0000 A:07-00  DF600F
F:\Vol14\Dsk55  CL2-D  77010027  411  -   P/s/ss  0000 A:07-00  DF600F
Q:\Vol11\Dsk12  CL1-B  3157      169  -   P/s/ss  0000 5:02-00  DF600F
G:\Vol15\Dsk56  CL2-D  77010027  412  -   P/s/ss  0000 A:07-00  DF600F
R:\Vol12\Dsk13  CL1-B  3157      170  -   P/s/ss  0000 5:02-00  DF600F
H:\Vol16\Dsk57  CL2-D  77010027  413  -   P/s/ss  0000 A:07-00  DF600F
I:\Vol17\Dsk58  CL2-D  77010027  414  -   P/s/ss  0000 A:07-00  DF600F
J:\Vol2\Dsk0    -      -         -    -   -       -    -        ST336754LC

The bold lines show that LDEVs 413 and 414 are Physical Drives 57 and 58, and as we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
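The effect of the permission file can be checked offline before horcm is restarted. The following Python sketch assumes that the file holds one "hdN-M" range per line, as in the single example above; accepting a lone "hdN" entry and skipping lines that start with "#" are assumptions of this sketch, not documented syntax. The LDEV to drive mapping is copied from the inqraid output.

```python
import re


def parse_hd_ranges(text):
    """Return the set of physical drive numbers named by lines such as 'hd0-56'."""
    drives = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # assumption: blank/comment lines are skipped
            continue
        m = re.fullmatch(r"hd(\d+)(?:-(\d+))?", line, re.IGNORECASE)
        if not m:
            raise ValueError(f"unrecognised entry: {line!r}")
        first = int(m.group(1))
        last = int(m.group(2) or first)
        drives.update(range(first, last + 1))
    return drives


permitted = parse_hd_ranges("hd0-56\n")
# LDEV -> physical drive number, taken from the DskNN column of inqraid
ldev_to_drive = {410: 54, 411: 55, 412: 56, 413: 57, 414: 58}
hidden = [ldev for ldev, drive in ldev_to_drive.items() if drive not in permitted]
print("LDEVs hidden by the permission file:", hidden)
```

Any LDEV listed as hidden will show dashes on the local side of pairdisplay, as d3 and d4 do here.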

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT  TARG LUN M SERIAL   LDEV
Harddisk57  VG01  d3      CL2-D 1    413 0 77010027 413
Harddisk57  VG01  d3      CL2-D 1    413 - 77010027 413
Harddisk58  VG01  d4      CL2-D 1    414 0 77010027 414
Harddisk58  VG01  d4      CL2-D 1    414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials, and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
#CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get.

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address.

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is, not many people know where to look.
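As a quick aid when reading these logs, a WSAerr value can be split into WSABASEERR plus an offset and looked up by name. This Python sketch is only illustrative: the table holds just the two codes discussed in this document, with the names and texts quoted from winsock2.h and the web page above, and describe_wsaerr is a hypothetical helper.

```python
WSABASEERR = 10000

# Only the codes that appear in this document; extend from winsock2.h as needed.
WINSOCK_NAMES = {
    13: ("WSAEACCES", "Permission denied"),
    49: ("WSAEADDRNOTAVAIL", "Cannot assign requested address"),
}


def describe_wsaerr(code):
    """Format a Winsock error the way the HORCM log prints it, plus its symbolic name."""
    name, text = WINSOCK_NAMES.get(code - WSABASEERR, ("unknown", "not in this table"))
    return f"{code}(0x{code:08x}) {name}: {text}"


print(describe_wsaerr(10049))
```

The hexadecimal form is included so that the result can be matched against the WSAerr line in the log without converting by hand.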

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmcc
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

               AIX 4351                          HP-UX 1122                         Solaris 910
EADDRNOTAVAIL  68 Can't assign requested address  227 Can't assign requested address  126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
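Because the number differs per platform, it helps to look it up rather than remember it. This Python sketch uses only the three values quoted in the table above; the dictionary and function names are hypothetical. The last line prints the value used by whatever machine runs the script, which is generally different again.

```python
import errno
import os

# EADDRNOTAVAIL values as quoted in the table above.
EADDRNOTAVAIL_BY_PLATFORM = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def platforms_for(number):
    """Return the platforms on which this errno number means EADDRNOTAVAIL."""
    return [p for p, n in EADDRNOTAVAIL_BY_PLATFORM.items() if n == number]


print(platforms_for(126))
# The local operating system's own number and message for the same error:
print(errno.EADDRNOTAVAIL, os.strerror(errno.EADDRNOTAVAIL))
```
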

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread(): cannot open command device: CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here, I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to "horcm42".

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.
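Whether a service name will resolve can be checked on the server before HORCM is started. This is a minimal Python sketch using the standard socket.getservbyname call, which consults the same services database as any other program on the host; the wrapper resolve_service is hypothetical and simply returns None where the lookup fails.

```python
import socket


def resolve_service(service, proto="udp"):
    """Return the port for a HORCM_MON service field, or None if it cannot be resolved."""
    if service.isdigit():
        return int(service)          # a plain port number needs no lookup
    try:
        return socket.getservbyname(service, proto)
    except OSError:
        return None                  # name is not defined for this protocol


for field in ("11042", "horcm42"):
    print(field, "->", resolve_service(field))
```

A result of None for a name used in HORCM_MON corresponds to the "wrong ipaddr or servicename" error shown in the log.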

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr : 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied.

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address            Foreign Address State
UDP   ml_acer510:microsoft-ds  *:*
UDP   ml_acer510:isakmp        *:*
UDP   ml_acer510:1030          *:*
...
UDP   ml_acer510:54323         *:*

It is not likely that you will be this lucky.
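Where a script can be run on the CCI server, both the address and the port can be tested directly by attempting the same bind that HORCM performs. This is a sketch under that assumption; try_bind_udp is a hypothetical helper built on the standard socket module, and the exact error number returned depends on the operating system.

```python
import socket


def try_bind_udp(ip, port):
    """Try to bind a UDP socket; return None on success or the OSError on failure."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind((ip, port))
        return None
    except OSError as exc:
        return exc
    finally:
        s.close()


# Port 0 asks the OS for any free port; substitute the HORCM_MON address and port.
err = try_bind_udp("127.0.0.1", 0)
print("ok" if err is None else f"bind failed: errno {err.errno} ({err.strerror})")
```

Run this with HORCM stopped, otherwise the instance's own port will be reported as in use.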


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
10129253 11004 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
/dev/rdsk/c2t6d0s2

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

There are only 3 things to check:

Is the IP address correct? Note: You can use "localhost" here, but this will not work for TC environments using 2 different CCI servers.

Is 11004 a "free" UDP port? Almost certainly it is.

Is the CMDDEV right? You can tell that from the commands we have already issued.

UNIX HORCM CONF files are kept in /etc.

Windows

Here is HORCM8.CONF for Windows:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11008 1000 3000

HORCM_CMD
#dev_name dev_name dev_name dev_name
CMD-10262-16

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#

HORCM_INST
#dev_group ip_address service

Use the same logic as for UNIX. Windows HORCM CONF files are in C:\WINDOWS.

Other recommendations

HDvM uses HORCM CONF files called HORCM900.CONF to HORCM988.CONF for temporary HORCM CONF files. Do not use these numbers yourself.

I suggest that you use 0-799 for user created files and 800-899 for HDvM created permanent HORCM CONF files.

I also suggest a numbering convention of 1100x, where x is the number in HORCMx.CONF. This means that you will need to "reserve" UDP ports 11000 to 11899 for HORCM CONF usage.

Updating the "Services" file


Many people code HORCM CONF files like this:

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 horcm8 1000 3000

In this case the UDP port, horcm8, must be defined in the "Services" file.

Windows: C:\WINDOWS\system32\drivers\etc\services
UNIX: /etc/services

Like this:

horcm0 11000/udp
horcm1 11001/udp
...
horcm8 11008/udp
horcm9 11009/udp
"blank line"

Check: Under Windows, if there is no blank line after horcm9 (in this example), that definition will be ignored. PS: No blank lines at the end of the HORCM CONF file, please.

Check: If you have 2 CCI servers using horcm8 and horcm9, for example, then both horcm8 and horcm9 have to be defined in both servers.
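The numbering convention and the Services entries can be generated rather than typed, which avoids the typos this section warns about. This Python sketch assumes the 1100x convention means port = 11000 + instance number, which fits the suggested 11000 to 11899 reservation; the function names are hypothetical.

```python
BASE_PORT = 11000


def horcm_port(instance):
    """UDP port for HORCMx.CONF, assuming the convention port = 11000 + x."""
    if not 0 <= instance <= 899:
        raise ValueError("instance is outside the 0-899 range suggested above")
    return BASE_PORT + instance


def services_entries(instances):
    """Build Services file lines, ending with the blank line Windows needs."""
    lines = [f"horcm{i} {horcm_port(i)}/udp" for i in instances]
    return "\n".join(lines) + "\n\n"


print(services_entries([8, 9]), end="")
```

The same generated text can be appended to the Services file on both CCI servers so that the two definitions cannot drift apart.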

Reading the LOGS

Windows

Let's start with Windows first this time.

In our example we used Instance 8, so you will find the log here:

C:\HORCM\log8\curlog\horcm_ml_acer510_log.txt

because this server is called ml_acer510.

Let us examine it in detail.

- HORCM STARTUP LOG - Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- horcmgr started on Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- _spawnvp() horcmd_08 using horcmgr [CWD=C:\HORCM\ETC]
18:03:08-3d090-07240- Fibre address conversion TBL has been set to 2

PP : RAID Manager for WindowsNT
Model : RAID-Manager/WindowsNT
Ver&Rev : 01-19-03/04
Release : Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 18:03:08 2007

Lots of useful information here. See the data in bold.

18:03:08-3d090-07240- horcmd_08 started on Thu Feb 22 18:03:08 2007
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
18:03:08-3d090-07240- [horcmcfgetent] fseek(top) OK
18:03:08-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where CMD syntax is converted to a physical drive number.

18:03:08-40b28-07240- [horcmcfgetent] read(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] close(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] check(conf) OK
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number.

[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR is supported
18:03:08-40b28-07240- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012db18]0000 80000000 00000000 00000000 00000000
[0x0012db28]0010 00000000 00000000 00000000 00000000
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR bitmap is checked
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-449a8-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MON HORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5,T=1] port=CL1-A targ=1 lun=12

Here is the AL-PA for the Port, and the Port, target ID and LUN.

18:03:08-449a8-07240- MON(HORC) number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF) number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user log with this one.
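When comparing a user log with this one, the serial number is the quickest field to check. The twelve bytes at offset 0x30 of the horcread dump all lie in the range 0xF0 to 0xF9, so each low nibble is one decimal digit. The Python sketch below relies on that observation from this dump only; it is not taken from a published layout of the command device data, and serial_from_dump is a hypothetical helper.

```python
def serial_from_dump(hex_words):
    """Decode the digit bytes at offset 0x30 of the horcread dump."""
    data = bytes.fromhex("".join(hex_words))
    # Every byte is expected to be 0xF0-0xF9: high nibble F, low nibble a decimal digit.
    if any(b >> 4 != 0xF or b & 0x0F > 9 for b in data):
        raise ValueError("expected bytes in the range 0xF0-0xF9")
    digits = "".join(str(b & 0x0F) for b in data)
    return digits, int(digits)


# First three words of row 0030 in the Windows log above.
print(serial_from_dump(["f0f0f0f0", "f0f0f0f1", "f0f2f6f2"]))
```

The integer result should agree with the serial number embedded in the CMDDEV name from the CONF file, here CMD-10262-16.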

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP : RAID Manager for Solaris
Model : RAID-Manager/Solaris
Ver&Rev : 01-19-03/04
Release : Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

16:30:00-11d9d-11272- horcmd_04 started on Wed Mar 7 16:30:00 2007
16:30:00-17e65-11272- [horcmcfgrdf] access(conf_file) OK
16:30:00-1c076-11272- [horcmcfgrdf] access(check) OK
16:30:00-1e127-11272- [horcmcfgrdf] open(conf_file) OK
16:30:00-29cf3-11272- [horcmcfgetent] fseek(top) OK
16:30:00-31d0e-11272- [horcmcfgetent] read(conf_file) OK
16:30:00-34856-11272- [horcmcfgrdf] close(conf_file) OK
16:30:00-389cb-11272- [horcmcfgrdf] check(conf) OK
16:30:00-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:00-5ac7f-11272- [horcread] cmddevopen() start
16:30:00-63837-11272- [horcread] cmddevopen() finished
16:30:00-6e384-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025.

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:01-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:01-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:01-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:01-c2226-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2

Here is the CMDDEV.

16:30:01-c636e-11272- [HORCMCFGRDF] SLPR is supported
16:30:01-ca4bf-11272- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfce08]0000 80000000 00000000 00000000 00000000
[0xffbfce18]0010 00000000 00000000 00000000 00000000
16:30:01-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID 0 /dev/rdsk/c2t6d0s2
16:30:01-deb6b-11272- [horcread] cmddevopen() start
16:30:01-e2d12-11272- [horcread] cmddevopen() finished
16:30:01-e7502-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff


[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:02-77659-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:02-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked
16:30:02-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:02-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID 0 /dev/rdsk/c2t6d0s2
16:30:02-89c66-11272- [horcread] cmddevopen() start
16:30:02-8de05-11272- [horcread] cmddevopen() finished
16:30:02-925ff-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:03-07ece-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:03-0e0d4-11272- [horcmcfgrdf] seldevdata() OK
16:30:03-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
16:30:03-16392-11272- MON HORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2 T=32] port=CL1-A targ=32 lun=42

Here is the AL-PA for the port, followed by the port name, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC) number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF) number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)
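The serial number quoted above can be read straight out of the hex dump. The following is a minimal sketch, assuming that the first 12 bytes of the row at offset 0030 are EBCDIC digits (which is consistent with the CHAR column and with the serial number 80025 quoted above). The function name decode_serial is hypothetical, not part of CCI.

```python
def decode_serial(hex_words: str) -> int:
    """Decode EBCDIC digit bytes, given as space-separated hex words."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    # cp037 is an EBCDIC code page; 0xF0..0xF9 map to '0'..'9'.
    return int(raw.decode("cp037"))


# First three words of the row at offset 0030 in the Solaris log.
print(decode_serial("f0f0f0f0 f0f0f0f8 f0f0f2f5"))  # 80025
```

The same call on the corresponding row of a Windows startup log (f0f0f0f0 f0f0f0f1 f0f2f6f2) gives 10262.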

Audit Logging


Check: Always enable full logging if possible. It was introduced with CCI 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This makes the LOGx files much easier to read, as the only messages in the logs will be the ones you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h0,20
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to “cut and paste” the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully.
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4:
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x   4 root  other  512 Mar  7 16:50 .
dr-xr-xr-x  12 root  sys    512 Feb 22 15:04 ..
drwxr-xr-x   3 root  other  512 Mar  7 16:49 curlog
-rw-r--r--   1 root  other  289 Mar  7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x   3 root  other  512 Mar  7 16:29 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
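When a customer sends a long command log, it can help to reduce it to a list of what was typed and how each command ended. The following is a minimal sketch, assuming the layout of the excerpt above: a "CMDLINE : ..." line followed later by a "[command][exit(n)]" line. The function name parse_command_log and the SAMPLE text are illustrative only.

```python
import re

SAMPLE = """\
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar  7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar  7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""

# Matches e.g. "[raidscan][exit(0)]" and captures the exit code.
EXIT_RE = re.compile(r"\[\w+\]\[exit\((\d+)\)\]")


def parse_command_log(text):
    """Return a list of (command line, exit code) pairs."""
    results = []
    cmdline = None
    for line in text.splitlines():
        if line.startswith("CMDLINE"):
            # Everything after the first colon is the command as typed.
            cmdline = line.split(":", 1)[1].strip()
            continue
        match = EXIT_RE.search(line)
        if match and cmdline is not None:
            results.append((cmdline, int(match.group(1))))
            cmdline = None
    return results


for cmd, code in parse_command_log(SAMPLE):
    print(f"exit {code}: {cmd}")
```

A non-zero exit code in this list points to the command whose section of the log deserves a closer look.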

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example.

HORCM8.CONF

HORCM_MON
#ip_address  service  poll(10ms)  timeout(10ms)
localhost    11008    1000        3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group  dev_name  port#    TargetID  LU#  MU#
VG01        LDEV49    CL1-A-1  1         7    0

HORCM_INST
#dev_group  ip_address  service
VG01        localhost   11009

HORCM9.CONF

HORCM_MON
#ip_address  service  poll(10ms)  timeout(10ms)
localhost    11009    1000        3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group  dev_name  port#    TargetID  LU#  MU#
VG01        LDEV49    CL1-A-1  1         8    0

HORCM_INST
#dev_group  ip_address  service
VG01        localhost   11008


Check: Is the user using “good syntax”?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is best practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT#   /ALPA/C, TID#, LU#.Num(LDEV#)  P/S,   Status, LDEV#, P-Seq#, P-LDEV#
CL1-A-1 /ef/ 5,  1,    0-0.1(13)       S-VOL  PAIR    13     -----   10
CL1-A-1 /ef/ 5,  1,    1-0.1(29)       P-VOL  PSUS    29     977     309
CL1-A-1 /ef/ 5,  1,    2-0.1(48)       P-VOL  PSUS    48     977     300
CL1-A-1 /ef/ 5,  1,    3-0.1(309)      S-VOL  SSUS    309    -----   29
CL1-A-1 /ef/ 5,  1,    4-0.1(310)      S-VOL  SSUS    310    -----   29
CL1-A-1 /ef/ 5,  1,    5-0.1(308)      S-VOL  SSUS    308    -----   24
CL1-A-1 /ef/ 5,  1,    6-0.1(305)      S-VOL  SSUS    305    -----   1
CL1-A-1 /ef/ 5,  1,    7-0.1(49)       SMPL   ----    -----  -----   -----
CL1-A-1 /ef/ 5,  1,    8-0.1(50)       SMPL   ----    -----  -----   -----

C:\HORCM\ETC>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID, LU-M)   ,Seq#, LDEV#.P/S, Status,  Seq#, P-LDEV# M
VG01   LDEV49(L)     (CL1-A-1, 1, 7-0 )  977    49.SMPL    ----,    -----  -----  -
VG01   LDEV49(R)     (CL1-A-1, 1, 8-0 )  977    50.SMPL    ----,    -----  -----  -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR : EUserId for HOMRCF[8] : Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE : paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR:cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ]: An order to the command(control) device failed, or was rejected.
[Action]: Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, meaningless error numbers like -35 and 221. If this is a RAID subsystem, check the SSB logs on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

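17:02:30-9a8a8 is the cross-check: the same timestamp and ID appear in the command log and in the HORCM log. Next, although it is not obvious, the error code is built from the sense data (961c at offset 0xC, which is the ASC of 0x96 plus the byte that follows it) and from the low half of the SSB (0x8400000d):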

961C 000D

Now get hold of the latest AMS CCI manual, which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes

and this subsection:

A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
Error      Contents                                                                    Recommended Action
961C 000C  The S-VOL is a Sub LU of a unified LU.                                      Check the status of the LU.
961C 000D  The default controllers controlling the P-VOL and S-VOL are not the same.  ...
961C 000E  The P-VOL is a Cache Residency LU.                                          Check the status of ...

In this case, the PVOL and SVOL default controllers are not the same.
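The lookup key for the manual can be extracted mechanically from the first row of the SCSI sense data. The following is a minimal sketch, assuming standard fixed-format sense data (sense key in byte 2, command-specific information in bytes 8 to 11, ASC and ASCQ in bytes 12 and 13). Treating bytes 12 and 13 as the sense code and the low 16 bits of the SSB as the detail code is an inference from this one example, not something stated in a manual. The function name decode_sense is hypothetical.

```python
def decode_sense(hex_words: str) -> dict:
    """Decode the first 16 bytes of fixed-format SCSI sense data."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    sense_key = raw[2] & 0x0F                      # SKEY in the log
    ssb = int.from_bytes(raw[8:12], "big")         # SSB in the log
    sense_code = int.from_bytes(raw[12:14], "big")  # ASC followed by ASCQ
    detail_code = ssb & 0xFFFF                     # assumed: low half of the SSB
    return {
        "skey": sense_key,
        "asc": raw[12],
        "ssb": ssb,
        "lookup": f"{sense_code:04X} {detail_code:04X}",
    }


info = decode_sense("70000500 00000038 8400000d 961c0000")
print(hex(info["skey"]), hex(info["asc"]), hex(info["ssb"]))  # 0x5 0x96 0x8400000d
print(info["lookup"])  # 961C 000D
```

The first three values match the SKEY, ASC and SSB lines that CCI prints after the dump, which is a useful sanity check before looking up the code in the table.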

“Old Syntax” HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the “real” port. With 9900V, USP etc., the LUNs are normally considered to be attached to “logical” ports, which are called HSDs or Host Groups.

However, it is still possible to use the “old” syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSDs. Here is an example.

Imagine that 3 HSDs are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has “absolute” LUNs 0-2, HSD 2 has “absolute” LUNs 3-5 and HSD 3 has “absolute” LUNs 6-8.

Now imagine that the following actions have been performed some time later:
Delete HSD 2
Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and HSD 3. If you did not know that the previous changes had been made, it would be impossible for you to “guess” that:

HSD 1 LUN 3 was “absolute” LUN 5
HSD 3 LUN 3 was “absolute” LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows “relative” LUN numbers.
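The numbers in this example can be reproduced with a small simulation. The following is a minimal sketch, assuming that each new LUN on a port takes the lowest free “absolute” LUN number. That rule is an assumption chosen because it reproduces the figures above; it is not taken from any Hitachi documentation. The Port class and its methods are hypothetical.

```python
class Port:
    """Tracks the absolute LUN number behind each (HSD, relative LUN)."""

    def __init__(self):
        self.abs_of = {}  # (hsd, relative_lun) -> absolute LUN

    def add_lun(self, hsd, relative_lun):
        used = set(self.abs_of.values())
        absolute = 0
        while absolute in used:  # assumed rule: lowest free number wins
            absolute += 1
        self.abs_of[(hsd, relative_lun)] = absolute
        return absolute

    def delete_hsd(self, hsd):
        self.abs_of = {k: v for k, v in self.abs_of.items() if k[0] != hsd}


port = Port()
for hsd in (1, 2, 3):          # three HSDs, each with LUNs 0, 1 and 2
    for lun in (0, 1, 2):
        port.add_lun(hsd, lun)

port.delete_hsd(2)             # frees absolute LUNs 3-5
port.add_lun(4, 0)             # HSD 4 reuses absolute LUNs 3 and 4
port.add_lun(4, 1)

print(port.add_lun(1, 3))      # 5
print(port.add_lun(3, 3))      # 9
```

The point of the sketch is that the final numbers depend on the full history of the port, which is exactly what nobody remembers six months later.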

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the “same order”. However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F, 0, 45) 32179 10b5.S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C, 0,  4) 32208 1003.P-VOL PAIR ASYNC 0 108a -    (1)
TC-WRP 1004-108B(L) (CL2-F, 0, 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C, 0,  5) 32208 1004.P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F, 0, 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C, 0,  6) 32208 1005.P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F, 0, 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C, 0,  7) 32208 1006.P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F, 0, 49) 32179 108a.S-VOL PAIR ASYNC 0 1003 -    (2)
TC-WRP 1007-108E(R) (CL1-C, 0,  8) 32208 1007.P-VOL PAIR ASYNC 0 108e -    (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the “DR” CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches (highlighted in yellow in the original document). What is less obvious is that the turquoise and green pairs, the rows marked (1), (2) and (3), are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).
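This kind of cross-check can be automated when the device names encode the intended pairing. The following is a minimal sketch, assuming the naming convention visible in this case, where the dev_name is “P-VOL LDEV, hyphen, S-VOL LDEV” (for example 1003-108A). The ROWS list is typed in by hand from the pairdisplay above (None means the row showed only dashes), and the function name find_mismatches is hypothetical.

```python
# (dev_name, side, LDEV reported by pairdisplay)
ROWS = [
    ("1003-108A", "L", "10b5"), ("1003-108A", "R", "1003"),
    ("1004-108B", "L", None),   ("1004-108B", "R", "1004"),
    ("1005-108C", "L", None),   ("1005-108C", "R", "1005"),
    ("1006-108D", "L", None),   ("1006-108D", "R", "1006"),
    ("1007-108E", "L", "108a"), ("1007-108E", "R", "1007"),
]


def find_mismatches(rows, local_is_svol=True):
    """Return rows whose LDEV does not match the LDEV implied by the name."""
    bad = []
    for name, side, ldev in rows:
        pvol, svol = name.lower().split("-")
        # On the DR server the local (L) side is the S-VOL side.
        on_svol_side = (side == "L") == local_is_svol
        expected = svol if on_svol_side else pvol
        if ldev is None or ldev.lower() != expected:
            bad.append((name, side, ldev, expected))
    return bad


for name, side, ldev, expected in find_mismatches(ROWS):
    print(f"{name}({side}): found {ldev}, expected {expected}")
```

On this data every (L) row is reported and no (R) row is, which matches the conclusion that the S-VOL side of the configuration is the one that went wrong.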

Here is an excerpt from the “old” HORCM CONF file, using “absolute” LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the “absolute” and “relative” LUN numbers:

raidscan -p CL2-F -fx
CL2-F   /88/ 3, 0, 49.1(108a)  S-VOL PAIR ASYNC  108a  -----  1003
CL2-F   /88/ 3, 0, 50.1(108b)  S-VOL PAIR ASYNC  108b  -----  1004
CL2-F   /88/ 3, 0, 51.1(108c)  S-VOL PAIR ASYNC  108c  -----  1005

raidscan -p CL2-F-2 -fx
CL2-F-2 /88/ 3, 0, 6.1(108a)   S-VOL PAIR ASYNC  108a  -----  1003
CL2-F-2 /88/ 3, 0, 7.1(108b)   S-VOL PAIR ASYNC  108b  -----  1004
CL2-F-2 /88/ 3, 0, 8.1(108c)   S-VOL PAIR ASYNC  108c  -----  1005

Secured CMDDEV and HORCMPERM Implications

If you use a “normal”, i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone’s data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a “normal” CMDDEV, and you always give normal users access to a secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a “normal” CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10*
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID, LU)  ,Seq#,    LDEV#.P/S, Status, Fence,  Seq#,     P-LDEV# M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410.P-VOL  PAIR    NEVER   75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  410.S-VOL  PAIR    NEVER   -----     410     -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411.P-VOL  PAIR    NEVER   75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  411.S-VOL  PAIR    NEVER   -----     411     -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412.P-VOL  PAIR    NEVER   75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  412.S-VOL  PAIR    NEVER   -----     412     -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  413.P-VOL  PAIR    NEVER   75010010  413     -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  413.S-VOL  PAIR    NEVER   -----     413     -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  414.P-VOL  PAIR    NEVER   75010010  414     -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  414.S-VOL  PAIR    NEVER   -----     414     -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID, LU)  ,Seq#,    LDEV#.P/S, Status, Fence,  Seq#,     P-LDEV# M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410.P-VOL  PAIR    NEVER   75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  ---- ---- ----------- ----- -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411.P-VOL  PAIR    NEVER   75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  ---- ---- ----------- ----- -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412.P-VOL  PAIR    NEVER   75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  ---- ---- ----------- ----- -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  413.P-VOL  PAIR    NEVER   75010010  413     -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  ---- ---- ----------- ----- -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  414.P-VOL  PAIR    NEVER   75010010  414     -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let’s start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the startup log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let’s stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID, LU)  ,Seq#,    LDEV#.P/S, Status, Fence,  Seq#,     P-LDEV# M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410.P-VOL  PAIR    NEVER   75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  ---- ---- ----------- ----- -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411.P-VOL  PAIR    NEVER   75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  ---- ---- ----------- ----- -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412.P-VOL  PAIR    NEVER   75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  ---- ---- ----------- ----- -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  ---- ---- ----------- ----- -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  ---- ---- ----------- ----- -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  ---- ---- ----------- ----- -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original document) show what has changed. Here is the bottom of the startup log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE     PORT   SERIAL    LDEV  CTG  H/M/12  SSID  R:Group  PRODUCT_ID
E:\Vol13\Dsk54  CL2-D  77010027  410   -    P/s/ss  0000  A:07-00  DF600F
F:\Vol14\Dsk55  CL2-D  77010027  411   -    P/s/ss  0000  A:07-00  DF600F
Q:\Vol11\Dsk12  CL1-B  3157      169   -    P/s/ss  0000  5:02-00  DF600F
G:\Vol15\Dsk56  CL2-D  77010027  412   -    P/s/ss  0000  A:07-00  DF600F
R:\Vol12\Dsk13  CL1-B  3157      170   -    P/s/ss  0000  5:02-00  DF600F
H:\Vol16\Dsk57  CL2-D  77010027  413   -    P/s/ss  0000  A:07-00  DF600F
I:\Vol17\Dsk58  CL2-D  77010027  414   -    P/s/ss  0000  A:07-00  DF600F
J:\Vol2\Dsk0    -      -         -     -    -       -     -        ST336754LC

The Dsk57 and Dsk58 lines (bold in the original document) show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
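The reasoning above can be checked mechanically before a permission file is handed to a user. The following is a minimal sketch, assuming that an entry has the form hdN or hdN-M as in the hd0-56 example. Other entry forms may well be valid in a real HORCMPERM*.CONF but are not handled here. The function names and the LDEV_TO_DRIVE table (typed in from the inqraid output) are illustrative only.

```python
import re


def parse_perm_entry(entry):
    """Turn 'hd0-56' (or 'hd57') into an inclusive range of drive numbers."""
    match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", entry.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"unsupported entry: {entry!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    return range(first, last + 1)


def is_permitted(drive, entries):
    """True if the physical drive number is covered by any entry."""
    return any(drive in parse_perm_entry(entry) for entry in entries)


# LDEV -> physical drive number, from the inqraid output.
LDEV_TO_DRIVE = {410: 54, 411: 55, 412: 56, 413: 57, 414: 58}

blocked = [ldev for ldev, drive in LDEV_TO_DRIVE.items()
           if not is_permitted(drive, ["hd0-56"])]
print(blocked)  # [413, 414]
```

Changing the entry to hd0-58 makes the blocked list empty, which is the check to run before restarting horcm.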

Note that it is possible to “fix” this “mistake” by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE  Group  PairVol  PORT   TARG  LUN  M  SERIAL    LDEV
Harddisk57   VG01   d3       CL2-D  1     413  0  77010027  413
Harddisk57   VG01   d3       CL2-D  1     413  -  77010027  413
Harddisk58   VG01   d4       CL2-D  1     414  0  77010027  414
Harddisk58   VG01   d4       CL2-D  1     414  -  77010027  414

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port#,TID, LU)  ,Seq#,    LDEV#.P/S, Status, Fence,  Seq#,     P-LDEV# M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410.P-VOL  PAIR    NEVER   75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  ---- ---- ----------- ----- -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411.P-VOL  PAIR    NEVER   75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  ---- ---- ----------- ----- -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412.P-VOL  PAIR    NEVER   75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  ---- ---- ----------- ----- -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  413.P-VOL  PAIR    NEVER   75010010  413     -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  ---- ---- ----------- ----- -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  414.P-VOL  PAIR    NEVER   75010010  414     -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

“Basic” HORCM CONF problems

When HORCM will not start, strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address   service  poll(10ms)  timeout(10ms)
10.129.3.127  11042    1000        3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - \\.\CMD-10111-4
\\.\CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 10.129.3.127 to 10.129.2.127. A simple typo, but here is what you get.

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmc.c
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the “Internal Error” that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049): Cannot assign requested address.

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is that not many people know where to look.
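Turning the WSAerr field into a name is a simple table lookup. The following is a minimal sketch that covers only the two Winsock codes discussed in this document; the table is deliberately incomplete, and the function name describe_wsaerr is hypothetical. It also checks that the decimal and hexadecimal forms printed by CCI agree.

```python
WSABASEERR = 10000

# Only the codes discussed in this document; see winsock2.h for the rest.
WSA_NAMES = {
    WSABASEERR + 13: ("WSAEACCES", "Permission denied"),
    WSABASEERR + 49: ("WSAEADDRNOTAVAIL", "Cannot assign requested address"),
}


def describe_wsaerr(field):
    """Decode a field such as '10049(0x00002741)' from the error block."""
    decimal_part, _, rest = field.partition("(")
    code = int(decimal_part)
    hex_code = int(rest.rstrip(")"), 16)
    if code != hex_code:
        raise ValueError(f"decimal and hex values disagree in {field!r}")
    name, text = WSA_NAMES.get(code, ("unknown", "not in this table"))
    return f"{code} = {name}: {text}"


print(describe_wsaerr("10049(0x00002741)"))
# 10049 = WSAEADDRNOTAVAIL: Cannot assign requested address
```

With the name in hand, the wrong ip_address in HORCM_MON is the first thing to check.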

UNIX

UNIX error messages are not only different from the Windows ones, they are different on each platform. Here is the same error on Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmc.c
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

               AIX 4.3, 5.1                        HP-UX 11.22                          Solaris 9, 10
EADDRNOTAVAIL  68 Can't assign requested address   227 Can't assign requested address   126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
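Because the number differs per platform, it is safer to work from the symbolic name than from the number. The following is a minimal sketch: the table holds only the three values quoted above, and the helper name platforms_for is hypothetical. The last lines show the value on whatever machine runs the script, using the standard errno and os modules.

```python
import errno
import os

# Values for EADDRNOTAVAIL quoted in the comparison table above.
EADDRNOTAVAIL_BY_PLATFORM = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def platforms_for(errnum):
    """Return the platforms on which errnum means EADDRNOTAVAIL."""
    return [name for name, value in EADDRNOTAVAIL_BY_PLATFORM.items()
            if value == errnum]


print(platforms_for(126))  # ['Solaris']

# The same symbolic error on the local machine.
local = errno.EADDRNOTAVAIL
print(local, os.strerror(local))
```

If the customer's Errorno does not appear in the table for their platform, look it up in that platform's own errno.h rather than guessing.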

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to \\.\CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr Failed to connect to HORCM

Here I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to “horcm42”.

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 10.129.3.127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is fairly obvious what is wrong: the log names the section, the line number and the offending line.

4. UDP port which is in use

Change 11042 to 1030. This is not a “sensible” port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr : 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmc.c
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013): Permission denied.

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

  Proto  Local Address            Foreign Address  State
  UDP    ml_acer510:microsoft-ds  *:*
  UDP    ml_acer510:isakmp        *:*
  UDP    ml_acer510:1030          *:*
  ...
  UDP    ml_acer510:54323         *:*

It is not likely that you will be this lucky.
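If netstat output is not available, the same question can be answered by simply trying to bind the port. The following is a minimal sketch using the standard socket module. The function name udp_port_is_free is hypothetical, and it tests the loopback address by default, which is an assumption: pass the address from HORCM_MON as host to test the interface that HORCM will actually use.

```python
import socket


def udp_port_is_free(port, host="127.0.0.1"):
    """Return True if a UDP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Covers "address in use" as well as the permission-style
            # error (WSAEACCES) seen in the Windows example.
            return False
        return True


if __name__ == "__main__":
    for port in (1030, 11042):
        state = "free" if udp_port_is_free(port) else "NOT available"
        print(f"UDP port {port}: {state}")
```

The probe socket is closed as soon as the check finishes, so running this just before horcmstart does not itself block the port.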


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


Many people code HORCM CONF files like this:

HORCM_MON
#ip_address   service  poll(10ms)  timeout(10ms)
10.129.3.127  horcm8   1000        3000

In this case the UDP port name, horcm8, must be defined in the “Services” file:

Windows: C:\WINDOWS\system32\drivers\etc\services
UNIX: /etc/services

Like this:

horcm0 11000/udp
horcm1 11001/udp
...
horcm8 11008/udp
horcm9 11009/udp
“blank line”

Check: Under Windows, if there is no blank line after horcm9 (in this example), that definition will be ignored. PS: No blank lines at the end of the HORCM CONF file, please.

Check: If you have 2 CCI servers using horcm8 and horcm9, for example, then both horcm8 and horcm9 have to be defined on both servers.
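Both checks can be run against a copy of the customer's services file. The following is a minimal sketch, assuming the usual “name port/protocol” layout with # comments. The function names are hypothetical, and the test for a missing final newline is only an approximation of the “blank line after the last entry” rule described above.

```python
def parse_services(text):
    """Return {(name, protocol): port} for every well-formed entry."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, protocol = fields[1].split("/", 1)
        if port.isdigit():
            entries[(fields[0], protocol.lower())] = int(port)
    return entries


def check_services(text, required_names):
    """Return a list of problems found for the given HORCM service names."""
    entries = parse_services(text)
    problems = [f"{name} has no udp entry"
                for name in required_names if (name, "udp") not in entries]
    if not text.endswith("\n"):
        problems.append("file does not end with a newline")
    return problems


good = "horcm8 11008/udp\nhorcm9 11009/udp\n"
bad = "horcm8 11008/udp"
print(check_services(good, ["horcm8", "horcm9"]))  # []
print(check_services(bad, ["horcm8", "horcm9"]))
```

Run it with the same list of names against the services file from each CCI server, since both servers need both definitions.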

Reading the LOGS

Windows

Let’s start with Windows this time.

In our example we used instance 8, so you will find the log here:

C:\HORCM\log8\curlog\horcm_ml_acer510_log.txt

because this server is called ml_acer510.

Let us examine it in detail.

- HORCM STARTUP LOG - Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- horcmgr started on Thu Feb 22 18:03:08 2007
18:03:08-39210-05000- _spawnvp() horcmd_08 using horcmgr [CWD=C:\HORCM\ETC]
18:03:08-3d090-07240- Fibre address conversion TBL has been set to 2

PP     : RAID Manager for WindowsNT
Model  : RAID-Manager/WindowsNT
Ver&Rev: 01-19-03/04
Release: Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006, Hitachi, Ltd.

HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 18:03:08 2007

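Lots of useful information here: the fibre address conversion table setting, the CCI version and release, the server name and the user who started HORCM.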

18:03:08-3d090-07240- horcmd_08 started on Thu Feb 22 18:03:08 2007
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
18:03:08-3d090-07240- [horcmcfgetent] fseek(top) OK
18:03:08-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where the CMD syntax is converted to a physical drive number.

18:03:08-40b28-07240- [horcmcfgetent] read(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] close(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] check(conf) OK
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number (10262).

[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR is supported
18:03:08-40b28-07240- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012db18]0000 80000000 00000000 00000000 00000000
[0x0012db28]0010 00000000 00000000 00000000 00000000
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID 0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start

180308-40b28-07240- [horcread] cmddevopen() finished180308-40b28-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256180308-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-40b28-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-40b28-07240- 
[HORCMCFGRDF] SLPR bitmap is checked180308-40b28-07240- [horcmcfgrdf] horccmddev(0) OK180308-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is releasedID0PhysicalDrive8180308-40b28-07240- [horcread] cmddevopen() start180308-40b28-07240- [horcread] cmddevopen() finished180308-449a8-07240- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262 [0x0012d914]0040 50090100 00040000 00040004 00040004 P [0x0012d924]0050 ffffffff ffffffff 00060006 00060006 [0x0012d934]0060 00070007 00070007 000f0c00 00000000 [0x0012d944]0070 00000000 ef00e011 08030100 01004000 [0x0012d954]0080 38000400 04400100 01000400 00ff0100 8 [0x0012d964]0090 80000000 00000000 00000000 00000000 [0x0012d974]00a0 00000000 00000000 00000000 00000000 [0x0012d984]00b0 00800012 000e0002 00000000 00000000 [0x0012d994]00c0 00000000 00000000 00000000 00000000 [0x0012d9a4]00d0 00000000 00000000 00000000 00000000 [0x0012d9b4]00e0 00000000 00000000 00000000 00000000 [0x0012d9c4]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b [0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013 [0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b [0x0012da04]0130 001c001d 001e001f 00200021 00220023 [0x0012da14]0140 20002001 00260027 00280029 002a002b amp()+ [0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123 [0x0012da34]0160 00340035 00360037 00380039 003a003b 456789 [0x0012da44]0170 003c003d 003e003f 00400041 00420043 lt=gtABC 180308-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256

180308-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1180308-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1180308-449a8-07240- [HORCREAD] execute-test read is donePhysicalDrive8180308-449a8-07240- [horcmcfgrdf] seldevdata() OK180308-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes180308-449a8-07240- MONHORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -gt C=5T=1] port=CL1-A targ=1 lun=12

Here is the AL-PA for the port, followed by the port, target ID and LUN of the CMDDEV.

180308-449a8-07240- MON(HORC)number of Mus = 0180308-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes180308-449a8-07240- MON(HOMRCF)number of Mus = 0180310-d1b78-05000- horcmgr executed CreateProcess(raidscanexe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user's log with this one.

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 162959 2007162959-cac9d-11271- horcmgr started on Wed Mar 7 162959 2007162959-cd940-11271- execvp() horcmd_04 using etchorcmgr [CWD=]162959-e99c5-11272- Fibre address conversion TBL has been set to 1

PP      : RAID Manager for Solaris
Model   : RAID-Manager/Solaris
Ver&Rev : 01-19-03/04
Release : Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 163000 2007

163000-11d9d-11272- horcmd_04 started on Wed Mar 7 163000 2007163000-17e65-11272- [horcmcfgrdf] access(conf_file) OK163000-1c076-11272- [horcmcfgrdf] access(check) OK163000-1e127-11272- [horcmcfgrdf] open(conf_file) OK163000-29cf3-11272- [horcmcfgetent] fseek(top) OK163000-31d0e-11272- [horcmcfgetent] read(conf_file) OK163000-34856-11272- [horcmcfgrdf] close(conf_file) OK163000-389cb-11272- [horcmcfgrdf] check(conf) OK163000-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK163000-5ac7f-11272- [horcread] cmddevopen() start163000-63837-11272- [horcread] cmddevopen() finished163000-6e384-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM

[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

This is an NSC55 with a serial number of 80025.
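The serial number can be read mechanically from the dump. The following is a minimal sketch, assuming that the first three words at offset 0030 hold the serial number as EBCDIC digits (0xF0 to 0xF9), which is what both dumps above suggest. The function names are illustrative only and are not part of CCI.

```python
def decode_serial(hex_words: str) -> str:
    """Decode EBCDIC bytes copied from the horcread dump into text.

    Assumption: the words passed in are the first three words at offset
    0030 of the command device read, as seen in the logs above.
    """
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    # cp037 is an EBCDIC code page; 0xF0..0xF9 map to the digits 0..9.
    return raw.decode("cp037")


def serial_number(hex_words: str) -> int:
    """Return the serial number without its leading zeros."""
    return int(decode_serial(hex_words))


if __name__ == "__main__":
    print(decode_serial("f0f0f0f0 f0f0f0f8 f0f0f2f5"))  # Solaris log (NSC55)
    print(serial_number("f0f0f0f0 f0f0f0f1 f0f2f6f2"))  # Windows log (USP)
```

This is handy when the CHAR column of a customer log has been mangled in transit and only the hex words are readable.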

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163001-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256163001-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163001-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163001-c2226-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2

Here is the CMDDEV (/dev/rdsk/c2t6d0s2).

163001-c636e-11272- [HORCMCFGRDF] SLPR is supported163001-ca4bf-11272- SLPR bitmap ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfce08]0000 80000000 00000000 00000000 00000000 [0xffbfce18]0010 00000000 00000000 00000000 00000000 163001-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163001-deb6b-11272- [horcread] cmddevopen() start163001-e2d12-11272- [horcread] cmddevopen() finished163001-e7502-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff

[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163002-77659-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163002-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked163002-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK163002-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is releasedID0devrdskc2t6d0s2163002-89c66-11272- [horcread] cmddevopen() start163002-8de05-11272- [horcread] cmddevopen() finished163002-925ff-11272- horcread ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM [0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM [0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01 [0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025 [0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P [0xffbfcbe4]0050 00040004 00040004 00060006 00060006 [0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000 [0xffbfcc04]0070 00000000 b200e00c 08030100 01004000 [0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8 [0xffbfcc24]0090 80000000 00000000 00000000 00000000 [0xffbfcc34]00a0 00000000 00000000 00000000 00000000 [0xffbfcc44]00b0 0080000e 00080002 00000000 00000000 [0xffbfcc54]00c0 00000000 00000000 00000000 00000000 [0xffbfcc64]00d0 00000000 00000000 00000000 00000000 [0xffbfcc74]00e0 00000000 00000000 00000000 00000000 [0xffbfcc84]00f0 00000000 00000000 00000000 00000000 ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 
8-9-A-B- C-D-E-F- ------CHAR------[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff ` [0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff [0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff [0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff [0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9 [0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (- [0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23 [0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a 163002-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256163002-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1163002-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1163003-07ece-11272- [HORCREAD] execute-test read is donedevrdskc2t6d0s2163003-0e0d4-11272- [horcmcfgrdf] seldevdata() OK163003-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes163003-16392-11272- MONHORCM_CMD=devrdskc2t6d0s2[Fibre][AL-PA=0xb2 -gt C=2T=32] port=CL1-A targ=32 lun=42

Here is the AL-PA for the port, followed by the port, target ID and LUN of the CMDDEV.

163003-1a4ba-11272- MON(HORC)number of Mus = 0163003-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes163003-2275a-11272- MON(HOMRCF)number of Mus = 0163007-b3adf-11271- horcmgr executed system(binls devrdsk |

HORCMusrbinraidscan -find inst)

Audit Logging

Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be the ones you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to “cut and paste” the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully.
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A

…
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4:
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
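When a customer sends a long horcc log, it can help to reduce it to a list of commands and exit codes. The following is a rough sketch only: it assumes the layout shown above (a CMDLINE line followed later by a `[command][exit(n)]` line), and the function name and sample text are illustrative rather than anything CCI provides.

```python
import re

# Assumed layout, taken from the sample log above: a "CMDLINE" line,
# then a line ending in "[command][exit(n)]".
CMDLINE_RE = re.compile(r"^CMDLINE\s*:?\s*(?P<cmd>.+)$")
EXIT_RE = re.compile(r"\[(?P<name>\w+)\]\[exit\((?P<code>\d+)\)\]")


def summarize(log_text):
    """Return a list of (command line, exit code) pairs found in the log."""
    results = []
    pending = None
    for line in log_text.splitlines():
        line = line.strip()
        match = CMDLINE_RE.match(line)
        if match:
            pending = match.group("cmd")
            continue
        match = EXIT_RE.search(line)
        if match and pending is not None:
            results.append((pending, int(match.group("code"))))
            pending = None
    return results


SAMPLE = """\
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""

if __name__ == "__main__":
    for command, code in summarize(SAMPLE):
        print(code, command)
```

Filtering the result for non-zero exit codes gives a quick list of the commands that failed.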

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example:

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11008

Check: Is the user using “good syntax”?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT#   /ALPA/C,TID#,LU#.Num(LDEV#....)...P/S, Status,LDEV#,P-Seq#,P-LDEV#
CL1-A-1 /ef/ 5,  1,  0-0.1(13)............S-VOL PAIR,    13,  -----,    10
CL1-A-1 /ef/ 5,  1,  1-0.1(29)............P-VOL PSUS,    29,    977,   309
CL1-A-1 /ef/ 5,  1,  2-0.1(48)............P-VOL PSUS,    48,    977,   300
CL1-A-1 /ef/ 5,  1,  3-0.1(309)...........S-VOL SSUS,   309,  -----,    29
CL1-A-1 /ef/ 5,  1,  4-0.1(310)...........S-VOL SSUS,   310,  -----,    29
CL1-A-1 /ef/ 5,  1,  5-0.1(308)...........S-VOL SSUS,   308,  -----,    24
CL1-A-1 /ef/ 5,  1,  6-0.1(305)...........S-VOL SSUS,   305,  -----,     1
CL1-A-1 /ef/ 5,  1,  7-0.1(49)............SMPL  ----, -----,  -----, -----
CL1-A-1 /ef/ 5,  1,  8-0.1(50)............SMPL  ----, -----,  -----, -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU-M) ,Seq#,LDEV#.P/S,Status, Seq#,P-LDEV# M
VG01  LDEV49(L)    (CL1-A-1, 1, 7-0 )  977    49.SMPL  --------- ----- -
VG01  LDEV49(R)    (CL1-A-1, 1, 8-0 )  977    50.SMPL  --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR : EUserId for HOMRCF[8] : Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE : paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR:cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR :rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ]: An order to the command(control) device failed, or was rejected.
[Action]: Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, meaningless error message numbers like -35 and 221. If this is a RAID subsystem, check the SSBLOGS on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------

[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

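17:02:30-9a8a8 is the cross-check: the same time stamp appears in the command log. Next, it is not obvious, but the error code is built from the sense data. 961C is the pair of bytes at offset 0x0C (the ASC, 0x96, and the byte that follows it) and 000D is the low half of the SSB (0x8400000d):

961C 000D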
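The same extraction can be scripted. This is a sketch under stated assumptions: the sense data uses the standard SCSI fixed format (sense key in the low nibble of byte 2, ASC and ASCQ in bytes 12 and 13), the SSB occupies bytes 8 to 11, and the detail code is the low 16 bits of the SSB. The last two points are inferred from this single example, not from a specification. The function name is illustrative.

```python
def decode_sense(first_row_words: str):
    """Derive (sense key, sense code, detail code) from the first 16 bytes
    of the SCSI SENSE DATA dump in horcm_log.txt.

    Assumptions (see the lead-in): fixed-format sense data, SSB in
    bytes 8-11, detail code = low 16 bits of the SSB.
    """
    data = bytes.fromhex(first_row_words.replace(" ", ""))
    sense_key = data[2] & 0x0F            # SKEY
    asc, ascq = data[12], data[13]        # ASC and the following byte
    ssb = int.from_bytes(data[8:12], "big")
    sense_code = f"{asc:02X}{ascq:02X}"
    detail_code = f"{ssb & 0xFFFF:04X}"
    return sense_key, sense_code, detail_code


if __name__ == "__main__":
    key, sense, detail = decode_sense("70000500 00000038 8400000d 961c0000")
    print(f"SKEY=0x{key:02x} code={sense} {detail}")
```

The resulting pair is what you look up in the manual's table of sense codes and detailed codes.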

Now get hold of the latest AMS CCI manual, which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
…
Error      Contents                                                                    Recommended Action
961C 000C  The S-VOL is a Sub LU of a unified LU.                                      Check the status of the LU.
961C 000D  The default controllers controlling the P-VOL and S-VOL are not the same.  …
961C 000E  The P-VOL is a Cache Residency LU.                                          Check the status of …

In this case the PVOL and SVOL default controllers are not the same.

“Old Syntax” HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the “real” port. With 9900V, USP etc., the LUNs are normally considered to be attached to “logical” ports, which are called HSD or Host Groups.

However, it is still possible to use the “old” syntax. This always causes confusion after a while, as LUNs get added and deleted from various HSD. Here is an example.

Imagine that 3 HSD are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered as 0, 1 and 2.

If this is done in sequence, HSD 1 has “absolute” LUNs 0-2, HSD 2 has “absolute” LUNs 3-5 and HSD 3 has “absolute” LUNs 6-8.

Now imagine that the following actions have been performed some time later:
Delete HSD 2
Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to “guess” that

HSD 1 LUN 3 was “absolute” LUN 5 and HSD 3 LUN 3 was “absolute” LUN 9.
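The numbering in this example can be replayed in a few lines. The sketch below assumes that the port always hands out the lowest free “absolute” LUN number, which reproduces the numbers quoted in the text; the actual allocation rule inside the subsystem is not documented here, and the class and function names are illustrative only.

```python
class Port:
    """Toy model of one physical port and its "absolute" LUN numbers."""

    def __init__(self):
        self.absolute = {}  # absolute LUN -> (HSD number, relative LUN)

    def add_lun(self, hsd, relative):
        """Assign the lowest free absolute LUN (assumed rule) and return it."""
        number = 0
        while number in self.absolute:
            number += 1
        self.absolute[number] = (hsd, relative)
        return number

    def delete_hsd(self, hsd):
        """Remove every LUN of one HSD, freeing its absolute numbers."""
        self.absolute = {a: v for a, v in self.absolute.items() if v[0] != hsd}


def replay():
    """Replay the sequence of changes described in the text."""
    port = Port()
    for hsd in (1, 2, 3):          # three HSD, LUNs 0-2 each, in sequence
        for relative in (0, 1, 2):
            port.add_lun(hsd, relative)
    port.delete_hsd(2)             # some time later: delete HSD 2
    port.add_lun(4, 0)             # add HSD 4 with LUNs 0 and 1
    port.add_lun(4, 1)
    hsd1_lun3 = port.add_lun(1, 3)  # then allocate LUN 3 to HSD 1 and HSD 3
    hsd3_lun3 = port.add_lun(3, 3)
    return hsd1_lun3, hsd3_lun3


if __name__ == "__main__":
    print(replay())
```

The point of the model is that the result depends on the whole history of the port, which is exactly why “absolute” LUN numbers cannot be guessed from the current configuration.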

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows “relative” LUN numbers.

In a recent case, 47 S-VOL LUNs were deleted by mistake from a HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the “same order”. However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F, 0, 45) 32179 10b5.S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C, 0, 4)  32208 1003.P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F, 0, 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C, 0, 5)  32208 1004.P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F, 0, 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C, 0, 6)  32208 1005.P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F, 0, 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C, 0, 7)  32208 1006.P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F, 0, 49) 32179 108a.S-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C, 0, 8)  32208 1007.P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the “DR” CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches (highlighted in yellow in the original document). What is less obvious is that the turquoise and green pairs, whose rows are marked (1) to (3), are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).

Here is an excerpt from the “old” HORCM CONF file, using “absolute” LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the “absolute” and “relative” LUN numbers:

raidscan -p CL2-F -fx
CL2-F /88/ 3, 0, 49.1(108a) S-VOL PAIR ASYNC, 108a, -----, 1003
CL2-F /88/ 3, 0, 50.1(108b) S-VOL PAIR ASYNC, 108b, -----, 1004
CL2-F /88/ 3, 0, 51.1(108c) S-VOL PAIR ASYNC, 108c, -----, 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 /88/ 3, 0, 6.1(108a) S-VOL PAIR ASYNC, 108a, -----, 1003
CL2-F-2 /88/ 3, 0, 7.1(108b) S-VOL PAIR ASYNC, 108b, -----, 1004
CL2-F-2 /88/ 3, 0, 8.1(108c) S-VOL PAIR ASYNC, 108c, -----, 1005
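Because both raidscan listings report the LDEV number, the LDEV is the key that ties an “absolute” LUN to its “relative” LUN. The following is a small sketch of that join; the (LUN, LDEV) pairs are typed in by hand from the two listings above, and the function name is illustrative, not a CCI facility.

```python
def absolute_to_relative(absolute_rows, relative_rows):
    """Map absolute LUN -> relative LUN by matching on the LDEV number.

    Each row is a (lun, ldev) pair copied from raidscan output.
    """
    relative_by_ldev = {ldev: lun for lun, ldev in relative_rows}
    return {
        lun: relative_by_ldev[ldev]
        for lun, ldev in absolute_rows
        if ldev in relative_by_ldev
    }


# Values taken from "raidscan -p CL2-F -fx" and "raidscan -p CL2-F-2 -fx".
ABSOLUTE = [(49, "108a"), (50, "108b"), (51, "108c")]
RELATIVE = [(6, "108a"), (7, "108b"), (8, "108c")]

if __name__ == "__main__":
    for absolute, relative in absolute_to_relative(ABSOLUTE, RELATIVE).items():
        print(f"CL2-F LUN {absolute} -> CL2-F-2 LUN {relative}")
```

An LDEV that appears in the port listing but in none of the HSD listings is simply left out of the result, which is itself worth investigating.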

Secured CMDDEV and HORCMPERM Implications

If you use a “normal”, i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a “normal” CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a “normal” CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 410.S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 411.S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 412.S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D, 1, 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -

VG01 d3(R) (CL1-A, 1, 413)75010010 413.S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D, 1, 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414)75010010 414.S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414)75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>

HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413)77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A, 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414)77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A, 1, 414)75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original document) show what has changed. Here is the bottom of the start up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE    PORT  SERIAL   LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410  -   P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411  -   P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157     169  -   P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412  -   P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157     170  -   P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413  -   P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414  -   P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0   -     -        -    -   -      -    -       ST336754LC

The Dsk57 and Dsk58 lines (bold in the original document) show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
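The comparison between the permission file and the inqraid output can be checked with a few lines of code. This is a sketch only: it assumes that the file contains nothing but entries of the form hdN or hdN-M, as in the one-line example above (the real file may accept other forms), and the function names are illustrative.

```python
import re

ENTRY_RE = re.compile(r"hd(\d+)(?:-(\d+))?", re.IGNORECASE)


def allowed_drives(perm_text):
    """Expand hdN and hdN-M entries into a set of physical drive numbers.

    Assumption: any line in another format is ignored by this sketch.
    """
    allowed = set()
    for line in perm_text.splitlines():
        match = ENTRY_RE.fullmatch(line.strip())
        if not match:
            continue
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        allowed.update(range(first, last + 1))
    return allowed


def blocked(perm_text, drives):
    """Return the drives from `drives` that the file does not permit."""
    allowed = allowed_drives(perm_text)
    return [drive for drive in drives if drive not in allowed]


if __name__ == "__main__":
    # Dsk numbers of LDEVs 410-414, read from the inqraid output above.
    print(blocked("hd0-56", [54, 55, 56, 57, 58]))
```

With the file from this example, the drives reported as blocked are the ones whose (L) lines turned to dashes in the pairdisplay.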

Note that it is possible to “fix” this “mistake” by manual use of the raidscan command, as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT  TARG LUN M SERIAL   LDEV
Harddisk57  VG01  d3      CL2-D 1    413 0 77010027 413
Harddisk57  VG01  d3      CL2-D 1    413 - 77010027 413
Harddisk58  VG01  d4      CL2-D 1    414 0 77010027 414
Harddisk58  VG01  d4      CL2-D 1    414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -

Mike Le Voi Page 20 11042023

How To Debug CCI Issues ndash Version 13

VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM.CONF and then stop and restart horcm.
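The comparison between inqraid output and the permitted drive range can be scripted. The following is a minimal Python sketch, not part of CCI. It assumes the inqraid -CLI rows have been captured as text with the device file in the first column and the LDEV in the fourth, and that the permitted range of Physical Drive numbers (0-56 in this example) is already known. The name drives_outside_range and the SAMPLE_ROWS list are hypothetical.

```python
import re

# Rows copied from the inqraid example in this document (assumed layout:
# device file, port, serial, LDEV, ...).
SAMPLE_ROWS = [
    r"E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F",
    r"G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F",
    r"H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F",
    r"I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F",
    r"J:\Vol2\Dsk0 - - - - - - - ST336754LC",
]


def drives_outside_range(rows, first, last):
    """Return (drive number, LDEV) pairs whose drive number is not in first..last."""
    outside = []
    for row in rows:
        fields = row.split()
        match = re.search(r"Dsk(\d+)$", fields[0])
        if match is None or fields[3] == "-":
            continue  # not a RAID volume, for example a local disk
        drive = int(match.group(1))
        if not first <= drive <= last:
            outside.append((drive, int(fields[3])))
    return outside


print(drives_outside_range(SAMPLE_ROWS, 0, 56))
```

Any pair reported here is a candidate explanation for a volume that has disappeared from pairdisplay.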

"Basic" HORCM CONF problems

When HORCM will not start, strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address  service  poll(10ms)  timeout(10ms)
101293127    11042    1000        3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
\\.\CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERRORcmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above it. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049): Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.

So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different from the Windows ones, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmcc
SrcLine: 2427

ERRORcmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

               AIX 4351                            HP-UX 1122                           Solaris 910
EADDRNOTAVAIL  68 Can't assign requested address   227 Can't assign requested address   126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
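Since the meaningful number is buried in the "[System Call Error]" block, a small lookup can save time. The sketch below is only an illustration in Python: it contains just the values quoted in this document (Winsock 10049, and the AIX, HP-UX and Solaris numbers for EADDRNOTAVAIL), and the function names are hypothetical. The regular expression accepts the label with or without a colon, because copied logs vary.

```python
import re

# Only the values quoted in this document; extend the tables as needed.
WINSOCK_NAMES = {10049: "WSAEADDRNOTAVAIL"}
UNIX_EADDRNOTAVAIL = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def extract_error_number(log_text):
    """Return the number following 'WSAerr' or 'Errorno', or None if absent."""
    match = re.search(r"(?:WSAerr|Errorno)\s*:?\s*(\d+)", log_text)
    return int(match.group(1)) if match else None


def explain(number):
    """Map an error number to a symbolic name using the tables above."""
    if number in WINSOCK_NAMES:
        return "Windows: " + WINSOCK_NAMES[number]
    platforms = [name for name, value in UNIX_EADDRNOTAVAIL.items() if value == number]
    if platforms:
        return "/".join(platforms) + ": EADDRNOTAVAIL"
    return "unknown"


print(explain(extract_error_number("WSAerr : 10049(0x00002741) (See winsock2.h)")))
print(explain(extract_error_number("Errorno: 126 (Cannot assign requested address)")))
```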

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here I think it is pretty obvious what the problem is.

3. Invalid service name

Change 11042 to "horcm42":

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.
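A service field can be checked before HORCM is started. The following Python sketch assumes that a non-numeric service name is resolved through the operating system's services database (the services file), which is what the standard library call socket.getservbyname consults. The function name resolve_service is hypothetical.

```python
import socket


def resolve_service(service, proto="udp"):
    """Return the port for a HORCM_MON service field, or None if it cannot be resolved."""
    if service.isdigit():
        port = int(service)
        return port if 0 < port < 65536 else None
    try:
        # Looks the name up in the local services database.
        return socket.getservbyname(service, proto)
    except OSError:
        return None


print(resolve_service("11042"))
print(resolve_service("horcm42"))
```

A result of None for a name such as horcm42 suggests the name has not been added to the services file on that host.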

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr : 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmcc
SrcLine: 2405

ERRORcmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013): Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto  Local Address            Foreign Address  State
UDP    ml_acer510:microsoft-ds  *:*
UDP    ml_acer510:isakmp        *:*
UDP    ml_acer510:1030          *:*
...
UDP    ml_acer510:54323         *:*

It is not likely that you will be this lucky.
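If netstat is not available but a scripting language is, the same question ("is this UDP port already taken?") can be answered by simply trying to bind to it. This is a minimal Python sketch using only the standard socket module; the function name udp_port_is_free is hypothetical, and the check should be run while HORCM itself is stopped, otherwise HORCM's own port will be reported as in use.

```python
import socket


def udp_port_is_free(port, address="0.0.0.0"):
    """Return True if a UDP socket can be bound to (address, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
    except OSError:
        # Covers "address in use", "permission denied" and similar bind failures.
        return False
    else:
        return True
    finally:
        sock.close()


for candidate in (1030, 11042):
    print(candidate, udp_port_is_free(candidate))
```

The check fails for the same reasons the HORCM bind call would fail, so a False result points at the port number rather than at the rest of the CONF file.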

Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008

HORCM(ml_acer510 7240) started by Administrator (0) on Thu Feb 22 18:03:08 2007

Lots of useful information here. The key items (shown in bold in the original) are pointed out in the notes between the log excerpts.

18:03:08-3d090-07240- horcmd_08 started on Thu Feb 22 18:03:08 2007
18:03:08-3d090-07240- [horcmcfgrdf] access(conf_file) OK
18:03:08-3d090-07240- [horcmcfgrdf] access(check) OK
18:03:08-3d090-07240- [horcmcfgrdf] open(conf_file) OK
18:03:08-3d090-07240- [horcmcfgetent] fseek(top) OK
18:03:08-40b28-07240- converted CMDDEV filename CMD-10262-16 to PhysicalDrive8

Here is where CMD syntax is converted to a physical drive number.

18:03:08-40b28-07240- [horcmcfgetent] read(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] close(conf_file) OK
18:03:08-40b28-07240- [horcmcfgrdf] check(conf) OK
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262

Here is the USP serial number

[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR is supported
18:03:08-40b28-07240- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012db18]0000 80000000 00000000 00000000 00000000
[0x0012db28]0010 00000000 00000000 00000000 00000000
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start

18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR bitmap is checked
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-449a8-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256

18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MON HORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5 T=1] port=CL1-A targ=1 lun=12

Here are the AL-PA for the port, and the port, target ID and LUN.

18:03:08-449a8-07240- MON(HORC) number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF) number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user log with this one.
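When comparing a user log with this one, it helps to confirm that the command device belongs to the array you expect. In the dump above, the twelve bytes at offset 0x30 (f0f0f0f0 f0f0f0f1 f0f2f6f2) carry the serial number as one decimal digit per byte, with the high nibble set to F. The Python sketch below decodes such a field by masking off the high nibble. The function name decode_serial is hypothetical, and the field layout is inferred from this log rather than taken from a specification.

```python
def decode_serial(hex_words):
    """Decode a run of 0xF0-0xF9 bytes (one digit per byte) into an integer."""
    raw = bytes.fromhex("".join(hex_words))
    if any(byte >> 4 != 0xF or (byte & 0x0F) > 9 for byte in raw):
        raise ValueError("field does not look like one-digit-per-byte data")
    digits = "".join(str(byte & 0x0F) for byte in raw)
    return int(digits)


# First three words of the row at offset 0030 in the Windows log.
print(decode_serial(["f0f0f0f0", "f0f0f0f1", "f0f2f6f2"]))
```

The result should match the serial number embedded in the CMDDEV name (CMD-10262-16 in the Windows log).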

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other   512 Mar 7 16:29 .
drwxr-xr-x 4 root other   512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other   512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP     : RAID Manager for Solaris
Model  : RAID-Manager/Solaris
Ver&Rev: 01-19-03/04
Release: Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006, Hitachi, Ltd.

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

16:30:00-11d9d-11272- horcmd_04 started on Wed Mar 7 16:30:00 2007
16:30:00-17e65-11272- [horcmcfgrdf] access(conf_file) OK
16:30:00-1c076-11272- [horcmcfgrdf] access(check) OK
16:30:00-1e127-11272- [horcmcfgrdf] open(conf_file) OK
16:30:00-29cf3-11272- [horcmcfgetent] fseek(top) OK
16:30:00-31d0e-11272- [horcmcfgetent] read(conf_file) OK
16:30:00-34856-11272- [horcmcfgrdf] close(conf_file) OK
16:30:00-389cb-11272- [horcmcfgrdf] check(conf) OK
16:30:00-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:00-5ac7f-11272- [horcread] cmddevopen() start
16:30:00-63837-11272- [horcread] cmddevopen() finished
16:30:00-6e384-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:01-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:01-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:01-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:01-c2226-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2

Here is the CMDDEV

16:30:01-c636e-11272- [HORCMCFGRDF] SLPR is supported
16:30:01-ca4bf-11272- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfce08]0000 80000000 00000000 00000000 00000000
[0xffbfce18]0010 00000000 00000000 00000000 00000000
16:30:01-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:01-deb6b-11272- [horcread] cmddevopen() start
16:30:01-e2d12-11272- [horcread] cmddevopen() finished
16:30:01-e7502-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:02-77659-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:02-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked
16:30:02-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:02-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:02-89c66-11272- [horcread] cmddevopen() start
16:30:02-8de05-11272- [horcread] cmddevopen() finished
16:30:02-925ff-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:03-07ece-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:03-0e0d4-11272- [horcmcfgrdf] seldevdata() OK
16:30:03-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
16:30:03-16392-11272- MON HORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2 T=32] port=CL1-A targ=32 lun=42

Here are the AL-PA for the port, and the port, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC) number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF) number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)

Audit Logging

Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be the ones you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

@echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

@echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully.
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4:
HORCM Shutdown inst 4 !!!

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x  4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys   512 Feb 22 15:04 ..
drwxr-xr-x  3 root other 512 Mar 7 16:49 curlog
-rw-r--r--  1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x  3 root other 512 Mar 7 16:29 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
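When a user sends a long horcc log, a quick summary of "what was typed and what it returned" is often all you need. The Python sketch below pairs each CMDLINE entry with the exit code that follows it. It is only an illustration: the layout is assumed from the excerpt above, the colon after CMDLINE is treated as optional because copied logs vary, and the function name summarize_command_log is hypothetical.

```python
import re

CMDLINE = re.compile(r"^CMDLINE\s*:?\s*(.+)$")
EXIT = re.compile(r"\[(\w+)\]\[exit\((\d+)\)\]")


def summarize_command_log(text):
    """Return a list of (command line, exit code) pairs found in a horcc log."""
    results = []
    pending = None
    for line in text.splitlines():
        line = line.strip()
        command = CMDLINE.match(line)
        if command:
            pending = command.group(1)
            continue
        finished = EXIT.search(line)
        if finished and pending is not None:
            results.append((pending, int(finished.group(2))))
            pending = None
    return results


SAMPLE = """COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""

for command, code in summarize_command_log(SAMPLE):
    print(code, command)
```

Remember that this only recovers the input and the exit code. The command output is not in the log, so the cut-and-paste session is still needed.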

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example.

HORCM8.CONF

HORCM_MON
#ip_address  service  poll(10ms)  timeout(10ms)
localhost    11008    1000        3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group  dev_name  port#    TargetID  LU#  MU#
VG01        LDEV49    CL1-A-1  1         7    0

HORCM_INST
#dev_group  ip_address  service
VG01        localhost   11009

HORCM9.CONF

HORCM_MON
#ip_address  service  poll(10ms)  timeout(10ms)
localhost    11009    1000        3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group  dev_name  port#    TargetID  LU#  MU#
VG01        LDEV49    CL1-A-1  1         8    0

HORCM_INST
#dev_group  ip_address  service
VG01        localhost   11008

Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT#    /ALPA/C,TID#,LU#.Num(LDEV#)  P/S    Status  LDEV#  P-Seq#  P-LDEV#
CL1-A-1  /ef/ 5, 1, 0-0.1(13)         S-VOL  PAIR    13     -----   10
CL1-A-1  /ef/ 5, 1, 1-0.1(29)         P-VOL  PSUS    29     977     309
CL1-A-1  /ef/ 5, 1, 2-0.1(48)         P-VOL  PSUS    48     977     300
CL1-A-1  /ef/ 5, 1, 3-0.1(309)        S-VOL  SSUS    309    -----   29
CL1-A-1  /ef/ 5, 1, 4-0.1(310)        S-VOL  SSUS    310    -----   29
CL1-A-1  /ef/ 5, 1, 5-0.1(308)        S-VOL  SSUS    308    -----   24
CL1-A-1  /ef/ 5, 1, 6-0.1(305)        S-VOL  SSUS    305    -----   1
CL1-A-1  /ef/ 5, 1, 7-0.1(49)         SMPL   ----    -----  -----   -----
CL1-A-1  /ef/ 5, 1, 8-0.1(50)         SMPL   ----    -----  -----   -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU-M) ,Seq#,LDEV#.P/S,Status, Seq#,P-LDEV# M
VG01  LDEV49(L)  (CL1-A-1, 1, 7-0 )  977  49.SMPL ----,----- ----- -
VG01  LDEV49(R)  (CL1-A-1, 1, 8-0 )  977  50.SMPL ----,----- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR : EUserId for HOMRCF[8] : Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE : paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR:cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ]: An order to the command(control) device failed, or was rejected.
[Action]: Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, meaningless error message numbers like -35 and 221. If this is a RAID subsystem, check the SSBLOGS on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

The time stamp 17:02:30-9a8a8 is the cross-check with the command log. Next, it is not obvious, but the error code is:

961C 000D

Now get hold of the latest AMS CCI manual, which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection:
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
            Error Contents                                                             Recommended Action
961C 000C   The S-VOL is a Sub LU of a unified LU.                                     Check the status of the LU.
961C 000D   The default controllers controlling the P-VOL and S-VOL are not the same.  ...
961C 000E   The P-VOL is a Cache Residency LU.                                         Check the status of ...

In this case, the PVOL and SVOL default controllers are not the same.
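The step from the sense data dump to "961C 000D" can be reproduced mechanically. The Python sketch below reads the first row of the dump as SCSI fixed-format sense data (response code 0x70): the sense key is the low nibble of byte 2, bytes 8-11 hold the value the log prints as SSB, and bytes 12 and 13 hold the ASC and ASCQ. Combining ASC/ASCQ with the low 16 bits of the SSB to form the manual's "sense code / detailed code" pair is an inference from this one example, not something stated in the log, and the function name decode_sense is hypothetical.

```python
def decode_sense(hex_words):
    """Decode the first 16 bytes of fixed-format sense data given as hex words."""
    data = bytes.fromhex("".join(hex_words))
    sense_key = data[2] & 0x0F
    asc, ascq = data[12], data[13]
    ssb = int.from_bytes(data[8:12], "big")
    # Assumed mapping: sense code = ASC + ASCQ, detailed code = low half of SSB.
    code = f"{asc:02X}{ascq:02X} {ssb & 0xFFFF:04X}"
    return sense_key, asc, ssb, code


skey, asc, ssb, code = decode_sense(["70000500", "00000038", "8400000d", "961c0000"])
print(f"SKEY = 0x{skey:02x}  ASC = 0x{asc:02x}  SSB = 0x{ssb:08x}  lookup = {code}")
```

The first three values should agree with the SKEY, ASC and SSB lines that CCI prints under the dump, which is a useful sanity check before looking the code up in the manual.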

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With 9900V, USP etc., the LUNs are normally considered to be attached to "logical" ports, which are called HSDs or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSDs. Here is an example.

Imagine that 3 HSDs are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later:
Delete HSD 2
Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and HSD 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that:

HSD 1 LUN 3 was "absolute" LUN 5
HSD 3 LUN 3 was "absolute" LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.
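The example above can be replayed in a few lines of code. The Python sketch below is a toy model, not the array's real algorithm: it assumes that each new LUN on a port simply takes the lowest free "absolute" number, which is the behaviour the example implies. The class name Port and its methods are hypothetical.

```python
class Port:
    """Toy model of absolute LUN numbering on one physical port."""

    def __init__(self):
        self.absolute = {}  # absolute LUN -> (HSD number, relative LUN)

    def add_lun(self, hsd, relative):
        """Give the new LUN the lowest free absolute number and return it."""
        number = 0
        while number in self.absolute:
            number += 1
        self.absolute[number] = (hsd, relative)
        return number

    def delete_hsd(self, hsd):
        """Free every absolute LUN that belonged to the given HSD."""
        self.absolute = {
            number: owner for number, owner in self.absolute.items() if owner[0] != hsd
        }


port = Port()
for hsd in (1, 2, 3):
    for relative in (0, 1, 2):
        port.add_lun(hsd, relative)

port.delete_hsd(2)
port.add_lun(4, 0)
port.add_lun(4, 1)
print("HSD 1 LUN 3 ->", port.add_lun(1, 3))
print("HSD 3 LUN 3 ->", port.add_lun(3, 3))
```

The point of the exercise is that the absolute numbers depend on the full history of changes to the port, which is exactly the information you rarely have when debugging.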

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F 0 45)32179 10b5 S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C 0 4)32208 1003 P-VOL PAIR ASYNC 0 108a -   (1)
TC-WRP 1004-108B(L) (CL2-F 0 46)32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C 0 5)32208 1004 P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F 0 47)32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C 0 6)32208 1005 P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F 0 48)32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C 0 7)32208 1006 P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F 0 49)32179 108a S-VOL PAIR ASYNC 0 1003 -   (2)
TC-WRP 1007-108E(R) (CL1-C 0 8)32208 1007 P-VOL PAIR ASYNC 0 108e -   (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches (highlighted in yellow in the original). What is less obvious is that the turquoise and green pairs in the original, marked (1) to (3) here, are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).
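In this customer's configuration the pair volume names encode the intended P-VOL and S-VOL LDEVs (for example 1007-108E), so the mismatch can be detected automatically. The Python sketch below relies entirely on that naming convention, which is specific to this case and is an assumption anywhere else. The function name check_pair is hypothetical.

```python
def check_pair(dev_name, role, ldev, partner_ldev):
    """Return True if the displayed LDEVs agree with a 'PVOL-SVOL' style name."""
    pvol, svol = dev_name.lower().split("-")
    if role == "P-VOL":
        expected = (pvol, svol)
    else:
        expected = (svol, pvol)
    return (ldev.lower(), partner_ldev.lower()) == expected


# Values taken from the pairdisplay above: (name, role, LDEV, partner LDEV).
rows = [
    ("1003-108A", "S-VOL", "10b5", "102e"),
    ("1003-108A", "P-VOL", "1003", "108a"),
    ("1007-108E", "S-VOL", "108a", "1003"),
    ("1007-108E", "P-VOL", "1007", "108e"),
]
for row in rows:
    print(row[0], row[1], "OK" if check_pair(*row) else "MISMATCH")
```

Note that the P-VOL rows pass on their own. It is the S-VOL side that exposes the problem, which is why the display must be read pair by pair.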

Here is an excerpt from the ldquooldrdquo HORCM CONF file ndash using ldquoabsoluterdquo LUN numbers

TC-WRP 1003-108A CL2-F 0 45TC-WRP 1004-108B CL2-F 0 46TC-WRP 1005-108C CL2-F 0 47TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax

TC-WRP 1003-108A CL2-F-2 0 6TC-WRP 1004-108B CL2-F-2 0 7TC-WRP 1005-108C CL2-F-2 0 8TC-WRP 1006-108D CL2-F-2 0 9

As you can the new HORCM CONF file is easier to understand and compare with Storage Navigator

By the way here is how you find out the ldquoabsoluterdquo and ldquorelativerdquo LUN numbers

raidscan -p CL2-F -fxCL2-F 88 3 0 491(108a)S-VOL PAIR ASYNC 108a ----- 1003CL2-F 88 3 0 501(108b)S-VOL PAIR ASYNC 108b ----- 1004CL2-F 88 3 0 511(108c)S-VOL PAIR ASYNC 108c ----- 1005


raidscan -p CL2-F-2 -fx
CL2-F-2 /88/ 3, 0, 6.1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 /88/ 3, 0, 7.1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 /88/ 3, 0, 8.1(108c) S-VOL PAIR ASYNC 108c ----- 1005
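To line up the two raidscan listings, it helps to join them on the LDEV number. The sketch below assumes the LUN and LDEV columns have already been copied out by hand into two small tables; `lun_map` is a hypothetical helper, not part of CCI.

```python
def lun_map(absolute, relative):
    """Join two {LDEV: LUN} tables on LDEV -> {LDEV: (absolute LUN, relative LUN)}."""
    return {ldev: (absolute[ldev], relative[ldev])
            for ldev in absolute if ldev in relative}


# LUN numbers copied by hand from the two raidscan listings above.
absolute = {"108a": 49, "108b": 50, "108c": 51}  # raidscan -p CL2-F -fx
relative = {"108a": 6, "108b": 7, "108c": 8}     # raidscan -p CL2-F-2 -fx

for ldev, (abs_lun, rel_lun) in lun_map(absolute, relative).items():
    print(f"LDEV {ldev}: CL2-F LUN {abs_lun} = CL2-F-2 LUN {rel_lun}")
```

Joining on LDEV rather than on position means the result does not depend on the order in which LUNs were added back to the HSD.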

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a "normal" CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 0 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 410.S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 411.S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 412.S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 413.S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 414.S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
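The reasoning above (which LDEVs fall outside the permitted drive range) can be checked mechanically. The following is a minimal sketch, assuming the permission file holds a single `hdA-B` entry as in this example; `parse_hd_range` and `split_by_permission` are hypothetical helpers, and the disk-to-LDEV table is copied by hand from the inqraid output.

```python
import re


def parse_hd_range(spec):
    """Turn an entry such as 'hd0-56' into the range of drive numbers it covers."""
    match = re.fullmatch(r"hd(\d+)-(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"unsupported entry: {spec!r}")
    return range(int(match.group(1)), int(match.group(2)) + 1)


def split_by_permission(disk_to_ldev, spec):
    """Return (permitted LDEVs, blocked LDEVs) for one permission entry."""
    allowed = parse_hd_range(spec)
    permitted = sorted(l for d, l in disk_to_ldev.items() if d in allowed)
    blocked = sorted(l for d, l in disk_to_ldev.items() if d not in allowed)
    return permitted, blocked


# Physical drive number -> LDEV, copied from the AMS1000 lines of the inqraid output.
disk_to_ldev = {54: 410, 55: 411, 56: 412, 57: 413, 58: 414}
print(split_by_permission(disk_to_ldev, "hd0-56"))
```

With the `hd0-56` entry, LDEVs 413 and 414 land in the blocked list, which matches the two lines that turned to dashes in the pairdisplay.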

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
10.129.3.127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 10.129.3.127 to 10.129.2.127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall bind
WSAerr 10049(0x00002741) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 12:43:03 2008
SrcFile shorcmcc
SrcLine 2405

ERROR cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
…
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall bind
Errorno 126 (Cannot assign requested address)
ErrInfo Internal Error
ErrTime Tue Sep 2 11:45:40 2008
SrcFile shorcmcc
SrcLine 2427

ERROR cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

Error           AIX 4.3, 5.1                        HP-UX 11.22                          Solaris 9, 10
EADDRNOTAVAIL   68 Can't assign requested address   227 Can't assign requested address   126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
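The lookups described above are simple enough to script. The sketch below is illustrative only: the tables hold just the values quoted in this section (the winsock2.h offset and the per-platform EADDRNOTAVAIL numbers), and `decode_wsa` and `platforms_for_errno` are hypothetical helper names.

```python
WSABASEERR = 10000

# Offset from WSABASEERR -> symbolic name, taken from the winsock2.h excerpt above.
WINSOCK_OFFSETS = {49: "WSAEADDRNOTAVAIL"}

# errno value of EADDRNOTAVAIL on each platform, taken from the table above.
EADDRNOTAVAIL_BY_PLATFORM = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def decode_wsa(code):
    """Map a WSAerr number from the HORCM log to its winsock2.h name."""
    return WINSOCK_OFFSETS.get(code - WSABASEERR, "unknown")


def platforms_for_errno(errno_value):
    """List the platforms on which this errno value means EADDRNOTAVAIL."""
    return [p for p, v in EADDRNOTAVAIL_BY_PLATFORM.items() if v == errno_value]


print(hex(10049), decode_wsa(10049))
print(platforms_for_errno(126))
```

The hexadecimal form is printed as well because the HORCM log shows both, and customers sometimes quote only one of them.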

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr Failed to connect to HORCM

Here, I think, it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to "horcm42":

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 10.129.3.127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR horcm_cfg_create

Once again, it is more obvious what is wrong.
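A quick way to see whether a service name would resolve on a given host is to ask the operating system directly. The sketch below assumes that CCI looks up a non-numeric service through the host's services database (for example /etc/services); that assumption, and the helper name `resolve_udp_service`, are not taken from the log above.

```python
import socket


def resolve_udp_service(service):
    """Return the UDP port for a HORCM_MON service field, or None if unknown.

    A purely numeric field is used as the port itself; anything else is
    looked up in the operating system's services database.
    """
    if service.isdigit():
        return int(service)
    try:
        return socket.getservbyname(service, "udp")
    except OSError:
        return None


for field in ("11042", "horcm42"):
    print(field, "->", resolve_udp_service(field))
```

If the second lookup prints None on the customer's server, the name is simply not defined there, which would be consistent with the "wrong ipaddr or servicename" message.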

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall bind
WSAerr 10013(0x0000271d) (See winsock2.h)
ErrInfo Internal Error
ErrTime Mon Sep 08 17:39:46 2008
SrcFile shorcmcc
SrcLine 2405

ERROR cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State
UDP ml_acer510:microsoft-ds
UDP ml_acer510:isakmp
UDP ml_acer510:1030
…
UDP ml_acer510:54323

It is not likely that you will be this lucky.
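When netstat output is not available, the same question ("is this UDP port already taken?") can be answered by trying to bind to it. This is a minimal sketch using only the Python standard library; `udp_port_is_free` is a hypothetical helper, and the address and port in the usage line are just the values from the sample CONF file. Run it while HORCM is stopped, otherwise HORCM itself will be holding the port.

```python
import socket


def udp_port_is_free(host, port):
    """Try to bind a UDP socket to (host, port); True means the bind succeeded."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers "address in use", "permission denied" and
            # "cannot assign requested address" alike.
            return False
        return True


print(udp_port_is_free("127.0.0.1", 11042))
```

A False result does not say which of the bind errors occurred, so it narrows the problem down rather than replacing the HORCM log.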


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-40b28-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-40b28-07240- [HORCREAD] maxldev = 16384 unitnum = 256
18:03:08-40b28-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-40b28-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-40b28-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-40b28-07240- [HORCMCFGRDF] SLPR bitmap is checked
18:03:08-40b28-07240- [horcmcfgrdf] horccmddev(0) OK
18:03:08-40b28-07240- [HORCRELOWNLBA] floatable LBA(e011) is released ID0 PhysicalDrive8
18:03:08-40b28-07240- [horcread] cmddevopen() start
18:03:08-40b28-07240- [horcread] cmddevopen() finished
18:03:08-449a8-07240- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d8d4]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0x0012d8e4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0x0012d8f4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0x0012d904]0030 f0f0f0f0 f0f0f0f1 f0f2f6f2 00040d09 000000010262
[0x0012d914]0040 50090100 00040000 00040004 00040004 P
[0x0012d924]0050 ffffffff ffffffff 00060006 00060006
[0x0012d934]0060 00070007 00070007 000f0c00 00000000
[0x0012d944]0070 00000000 ef00e011 08030100 01004000
[0x0012d954]0080 38000400 04400100 01000400 00ff0100 8
[0x0012d964]0090 80000000 00000000 00000000 00000000
[0x0012d974]00a0 00000000 00000000 00000000 00000000
[0x0012d984]00b0 00800012 000e0002 00000000 00000000
[0x0012d994]00c0 00000000 00000000 00000000 00000000
[0x0012d9a4]00d0 00000000 00000000 00000000 00000000
[0x0012d9b4]00e0 00000000 00000000 00000000 00000000
[0x0012d9c4]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012d9d4]0100 0004ffff 00060007 ffffffff ffff000b
[0x0012d9e4]0110 ffff000d 000e000f 00100011 00120013
[0x0012d9f4]0120 00140015 00160017 9914ffff 001a001b
[0x0012da04]0130 001c001d 001e001f 00200021 00220023
[0x0012da14]0140 20002001 00260027 00280029 002a002b &()+
[0x0012da24]0150 002c002d ffff002f 00300031 00320033 -0123
[0x0012da34]0160 00340035 00360037 00380039 003a003b 456789
[0x0012da44]0170 003c003d 003e003f 00400041 00420043 <=>ABC
18:03:08-449a8-07240- [HORCREAD] maxldev = 16384 unitnum = 256


18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MON HORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5 T=1] port=CL1-A targ=1 lun=12

The last line shows the AL-PA for the Port, followed by the Port, target ID and LUN.

18:03:08-449a8-07240- MON(HORC) number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF) number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user's log with this one.

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP RAID Manager for Solaris
Model RAID-Manager/Solaris
Ver&Rev 01-19-03/04
Release Production(GA)

ALL Rights Reserved Copyright (c) 1998-2006 Hitachi Ltd

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

16:30:00-11d9d-11272- horcmd_04 started on Wed Mar 7 16:30:00 2007
16:30:00-17e65-11272- [horcmcfgrdf] access(conf_file) OK
16:30:00-1c076-11272- [horcmcfgrdf] access(check) OK
16:30:00-1e127-11272- [horcmcfgrdf] open(conf_file) OK
16:30:00-29cf3-11272- [horcmcfgetent] fseek(top) OK
16:30:00-31d0e-11272- [horcmcfgetent] read(conf_file) OK
16:30:00-34856-11272- [horcmcfgrdf] close(conf_file) OK
16:30:00-389cb-11272- [horcmcfgrdf] check(conf) OK
16:30:00-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:00-5ac7f-11272- [horcread] cmddevopen() start
16:30:00-63837-11272- [horcread] cmddevopen() finished
16:30:00-6e384-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:01-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:01-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:01-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:01-c2226-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2

Here is the CMDDEV

16:30:01-c636e-11272- [HORCMCFGRDF] SLPR is supported
16:30:01-ca4bf-11272- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfce08]0000 80000000 00000000 00000000 00000000
[0xffbfce18]0010 00000000 00000000 00000000 00000000
16:30:01-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:01-deb6b-11272- [horcread] cmddevopen() start
16:30:01-e2d12-11272- [horcread] cmddevopen() finished
16:30:01-e7502-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:02-77659-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:02-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked
16:30:02-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:02-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:02-89c66-11272- [horcread] cmddevopen() start
16:30:02-8de05-11272- [horcread] cmddevopen() finished
16:30:02-925ff-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:03-07ece-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:03-0e0d4-11272- [horcmcfgrdf] seldevdata() OK
16:30:03-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
16:30:03-16392-11272- MON HORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2 T=32] port=CL1-A targ=32 lun=42

The last line shows the AL-PA for the Port, followed by the Port, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC) number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF) number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)
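The serial number annotation in the dump above can be reproduced from the hex words themselves. The sketch below relies on two observations that are not stated in the log: the banner rows decode as plain ASCII, and the serial field uses the bytes 0xF0 to 0xF9, which are the EBCDIC digits 0 to 9 (Python's standard `cp037` codec). `row_bytes` is a hypothetical helper.

```python
def row_bytes(hex_words):
    """Convert one dump row such as '48495441 43484920' into raw bytes."""
    return bytes.fromhex(hex_words.replace(" ", ""))


# Offset 0000 of the command device data: plain ASCII.
banner = row_bytes("48495441 43484920 4f50454e 2052454d").decode("ascii")

# First twelve bytes at offset 0030: EBCDIC digits holding the serial number.
serial_field = row_bytes("f0f0f0f0 f0f0f0f8 f0f0f2f5").decode("cp037")

print(banner)
print(int(serial_field))
```

The same two lines applied to a customer's dump give a quick confirmation that HORCM opened the command device on the array you expected.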

Audit Logging


Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be what you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

@echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

@echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4

HORCM inst 4 starts successfully
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
…
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of LOG file horcc_SYD-E250-1.log:

COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:50:36 2007
CMDLINE raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:51:53 2007
CMDLINE /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
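When a customer sends a large command log, it can be handy to pull out just the commands that were typed. The sketch below is a hypothetical helper: it assumes each command appears on a line beginning with `CMDLINE`, optionally followed by a colon, as in the excerpt above; real logs may differ in spacing or punctuation.

```python
import re


def logged_commands(log_text):
    """Return the command lines recorded in a horcc command log, in order."""
    commands = []
    for line in log_text.splitlines():
        match = re.match(r"\s*CMDLINE\s*:?\s*(.+)", line)
        if match:
            commands.append(match.group(1).strip())
    return commands


sample = """COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:50:36 2007
CMDLINE raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL EUserId for HORC[4] root (0) Wed Mar 7 16:51:53 2007
CMDLINE /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""
print(logged_commands(sample))
```

Remember that this recovers only what was typed; the output of each command still has to come from the user's cut-and-paste session.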

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example:

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000

HORCM_CMD
CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11008
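One thing worth checking in a pair of CONF files like these is that each instance's HORCM_INST entry points at the other instance's HORCM_MON service. The sketch below is a rough illustration, not a full CONF parser: it assumes section names start with `HORCM_`, that `#` starts a comment, and that fields are whitespace-separated. Both function names are hypothetical.

```python
def parse_horcm_conf(text):
    """Split a HORCM CONF file into {section name: [list of field lists]}."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        if line.startswith("HORCM_"):
            current = line
            sections[current] = []
        elif current is not None:
            sections[current].append(line.split())
    return sections


def instances_point_at_each_other(conf_a, conf_b):
    """True if each file's HORCM_INST service is the other file's HORCM_MON service."""
    a, b = parse_horcm_conf(conf_a), parse_horcm_conf(conf_b)
    a_service, b_service = a["HORCM_MON"][0][1], b["HORCM_MON"][0][1]
    a_remote = {row[2] for row in a["HORCM_INST"]}
    b_remote = {row[2] for row in b["HORCM_INST"]}
    return a_remote == {b_service} and b_remote == {a_service}


HORCM8 = """HORCM_MON
localhost 11008 1000 3000
HORCM_CMD
CMD-977-5
HORCM_DEV
VG01 LDEV49 CL1-A-1 1 7 0
HORCM_INST
VG01 localhost 11009
"""
HORCM9 = """HORCM_MON
localhost 11009 1000 3000
HORCM_CMD
CMD-977-5
HORCM_DEV
VG01 LDEV49 CL1-A-1 1 8 0
HORCM_INST
VG01 localhost 11008
"""
print(instances_point_at_each_other(HORCM8, HORCM9))
```

This only covers the service cross-reference; it does not validate the HORCM_DEV entries against the array.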


Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax: where is the cross-check? Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT# /ALPA/C,TID#,LU#.Num(LDEV#....)...P/S, Status,LDEV#,P-Seq#,P-LDEV#
CL1-A-1 /ef/ 5, 1, 0-0.1(13) S-VOL PAIR 13 ----- 10
CL1-A-1 /ef/ 5, 1, 1-0.1(29) P-VOL PSUS 29 977 309
CL1-A-1 /ef/ 5, 1, 2-0.1(48) P-VOL PSUS 48 977 300
CL1-A-1 /ef/ 5, 1, 3-0.1(309) S-VOL SSUS 309 ----- 29
CL1-A-1 /ef/ 5, 1, 4-0.1(310) S-VOL SSUS 310 ----- 29
CL1-A-1 /ef/ 5, 1, 5-0.1(308) S-VOL SSUS 308 ----- 24
CL1-A-1 /ef/ 5, 1, 6-0.1(305) S-VOL SSUS 305 ----- 1
CL1-A-1 /ef/ 5, 1, 7-0.1(49) SMPL ---- ----- ----- -----
CL1-A-1 /ef/ 5, 1, 8-0.1(50) SMPL ---- ----- ----- -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU-M) ,Seq#,LDEV#.P/S,Status, Seq#,P-LDEV# M
VG01 LDEV49(L) (CL1-A-1 1 7-0 ) 977 49.SMPL --------- ----- -
VG01 LDEV49(R) (CL1-A-1 1 8-0 ) 977 50.SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1, LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details

And in the log we see this:

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ] An order to the command(control) device failed or was rejected
[Action] Please confirm the following items. If this trouble doesn't resolve then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs and send them to service personnel
(1) Check if the HORC or HOMRCF function is installed in the RAID
(2) Check if the RCP and LCP are installed in the RAID
(3) Check if the path between the RAID CUs is established by using the SVP
(4) Check if the pair target volume is an appropriate status

Yes, meaningless error message numbers like -35 and 221. If this is a RAID subsystem, check the SSBLOGS on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

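17:02:30-9a8a8 is the cross-check: the same timestamp appears in the command log and in the HORCM log. Next, it is not obvious, but the error code is made up of the 961c word from the sense data followed by the last four digits of the SSB: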

961C 000D
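The lookup above can be sketched in a few lines of Python. This is only an illustration, not anything CCI ships: it assumes the first row of the dump is standard fixed-format SCSI sense data (sense key in byte 2, the value CCI prints as SSB in bytes 8-11, ASC and ASCQ in bytes 12 and 13), which matches the SKEY, ASC and SSB lines in this log. The function name decode_sense is made up for the example.

```python
def decode_sense(hex_words):
    """Decode the first 16 bytes of fixed-format SCSI sense data."""
    data = bytes.fromhex("".join(hex_words))
    sense_key = data[2] & 0x0F        # byte 2, low nibble
    ssb = data[8:12].hex()            # bytes 8-11, printed by CCI as SSB
    asc, ascq = data[12], data[13]    # bytes 12 and 13
    # Code as looked up in the manual: ASC+ASCQ, then the low half of the SSB
    error_code = f"{asc:02X}{ascq:02X} {ssb[4:].upper()}"
    return {"skey": sense_key, "asc": asc, "ssb": ssb, "error_code": error_code}


# First row of the sense data dump from the HORCM log
first_row = ["70000500", "00000038", "8400000d", "961c0000"]
result = decode_sense(first_row)
print(result["error_code"])
```

If the values it prints do not agree with the SKEY, ASC and SSB lines in the same log, the dump was probably copied incorrectly.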

Now get hold of the latest AMS CCI manual which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes

and this subsection:

A.4.4 Sense Code and Detail Code

Beware – some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
…
           Error Contents                                                              Recommended Action
961C 000C  The S-VOL is a Sub LU of a unified LU                                       Check the status of the LU
961C 000D  The default controllers controlling the P-VOL and S-VOL are not the same   …
961C 000E  The P-VOL is a Cache Residency LU                                           Check the status of …

In this case, the P-VOL and S-VOL default controllers are not the same.

“Old Syntax” HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the “real” port. With 9900V, USP etc. the LUNs are normally considered to be attached to “logical” ports – which are called HSD or Host Groups.

However, it is still possible to use the “old” syntax. This always causes confusion after a while, as LUNs get added and deleted from various HSD. Here is an example.

Imagine that 3 HSD are created on an empty port – HSD 1, 2 and 3. Each HSD has 3 LUNs added – numbered as 0, 1 and 2.

If this is done in sequence, HSD 1 has “absolute” LUNs 0-2, HSD 2 has “absolute” LUNs 3-5 and HSD 3 has “absolute” LUNs 6-8.

Now imagine that the following actions have been performed some time later:
Delete HSD 2
Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to “guess” that:

HSD 1 LUN 3 was “absolute” LUN 5
HSD 3 LUN 3 was “absolute” LUN 9
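The numbering in this example can be reproduced with a toy model. It rests on one assumption that the text implies but does not state: each new LUN takes the lowest free “absolute” LUN on the port. The class and method names are invented for the illustration.

```python
class Port:
    """Toy model of one physical port: absolute LUN -> (HSD, relative LUN)."""

    def __init__(self):
        self.absolute = {}

    def add_lun(self, hsd, relative_lun):
        # Assumption: the lowest free absolute LUN is handed out
        n = 0
        while n in self.absolute:
            n += 1
        self.absolute[n] = (hsd, relative_lun)
        return n

    def delete_hsd(self, hsd):
        # Deleting an HSD frees all of its absolute LUNs
        self.absolute = {a: v for a, v in self.absolute.items() if v[0] != hsd}


port = Port()
for hsd in (1, 2, 3):              # three HSD, each with LUNs 0, 1 and 2
    for lun in (0, 1, 2):
        port.add_lun(hsd, lun)

port.delete_hsd(2)                 # some time later: delete HSD 2
port.add_lun(4, 0)                 # add HSD 4 with LUNs 0 and 1
port.add_lun(4, 1)

hsd1_lun3 = port.add_lun(1, 3)     # then allocate LUN 3 to HSD 1 and HSD 3
hsd3_lun3 = port.add_lun(3, 3)
print(hsd1_lun3, hsd3_lun3)
```

The point of the model is that the result depends on the whole history of additions and deletions, which is exactly what the person reading the “old” syntax file does not have.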

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows “relative” LUN numbers.

In a recent case, 47 S-VOL LUNs were deleted by mistake from a HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the “same order”. However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F, 0, 45) 32179 10b5 S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C, 0, 4)  32208 1003 P-VOL PAIR ASYNC 0 108a -   (1)
TC-WRP 1004-108B(L) (CL2-F, 0, 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C, 0, 5)  32208 1004 P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F, 0, 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C, 0, 6)  32208 1005 P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F, 0, 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C, 0, 7)  32208 1006 P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F, 0, 49) 32179 108a S-VOL PAIR ASYNC 0 1003 -   (2)
TC-WRP 1007-108E(R) (CL1-C, 0, 8)  32208 1007 P-VOL PAIR ASYNC 0 108e -   (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the “DR” CCI server – as (L) refers to the S-VOL. Next, we have obvious mismatches – in yellow. What is less obvious is that the turquoise and green pairs are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3)
(1) This is the associated P-VOL for (2)
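This cross-check can be done mechanically. The sketch below is only an illustration: it assumes, as the names in this example suggest, that each dev_name is written as “P-VOL LDEV”-“S-VOL LDEV”, and it uses hand-copied values from the display rather than parsing real pairdisplay output. None stands for a row that showed only dashes.

```python
# (dev_name, side, LDEV shown, partner LDEV shown)
rows = [
    ("1003-108A", "L", "10b5", "102e"),
    ("1003-108A", "R", "1003", "108a"),
    ("1004-108B", "L", None, None),
    ("1004-108B", "R", "1004", "108b"),
    ("1005-108C", "L", None, None),
    ("1005-108C", "R", "1005", "108c"),
    ("1006-108D", "L", None, None),
    ("1006-108D", "R", "1006", "108d"),
    ("1007-108E", "L", "108a", "1003"),
    ("1007-108E", "R", "1007", "108e"),
]


def find_mismatches(rows):
    """Return the rows whose LDEVs do not match what the dev_name promises."""
    problems = []
    for name, side, ldev, partner in rows:
        pvol, svol = name.lower().split("-")
        # In this display (L) is the S-VOL side and (R) is the P-VOL side
        expected = (svol, pvol) if side == "L" else (pvol, svol)
        if (ldev, partner) != expected:
            problems.append((name, side, ldev, partner))
    return problems


problems = find_mismatches(rows)
for entry in problems:
    print(entry)
```

Every (L) row is reported, including the two that show PAIR status, while every (R) row is consistent with its name.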

Here is an excerpt from the “old” HORCM CONF file – using “absolute” LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and compare with Storage Navigator.

By the way, here is how you find out the “absolute” and “relative” LUN numbers:

raidscan -p CL2-F -fx
CL2-F 88 3 0 49 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51 1(108c) S-VOL PAIR ASYNC 108c ----- 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6 1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7 1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8 1(108c) S-VOL PAIR ASYNC 108c ----- 1005
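Joining the two scans on the LDEV number gives the translation between the two numbering schemes. The sketch below uses values copied by hand from the two outputs above; it does not parse raidscan output, and the function name is made up.

```python
# LDEV -> LUN, copied from the two raidscan outputs
absolute = {"108a": 49, "108b": 50, "108c": 51}   # raidscan -p CL2-F -fx
relative = {"108a": 6, "108b": 7, "108c": 8}      # raidscan -p CL2-F-2 -fx


def lun_map(absolute, relative, hsd):
    """Map each absolute LUN to (HSD number, relative LUN) via the shared LDEV."""
    return {
        abs_lun: (hsd, relative[ldev])
        for ldev, abs_lun in absolute.items()
        if ldev in relative
    }


mapping = lun_map(absolute, relative, hsd=2)
for abs_lun, (hsd, rel_lun) in sorted(mapping.items()):
    print(f"absolute LUN {abs_lun} = HSD {hsd} LUN {rel_lun}")
```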

Secured CMDDEV and HORCMPERM Implications

If you use a “normal”, i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone’s data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a “normal” CMDDEV – and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a “normal” CMDDEV. For this test I also had access to a secured CMDDEV – and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.
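If you have to check this on many servers, the test can be scripted. This is a minimal sketch that relies only on the rule stated above (a trailing asterisk on the “Current control device” line means secured); the function name is invented, and the text passed in is assumed to be captured horcctl -D output.

```python
def is_secured(horcctl_output):
    """Return True when the 'Current control device' line ends with '*'."""
    for line in horcctl_output.splitlines():
        if line.startswith("Current control device"):
            return line.rstrip().endswith("*")
    raise ValueError("no 'Current control device' line found")


print(is_secured("Current control device = PHYSICALDRIVE1"))
print(is_secured("Current control device = PHYSICALDRIVE10*"))
```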

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port, TID, LU)  Seq       LDEV  P/S    Status  Fence  Seq       P-LDEV  M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410   P-VOL  PAIR    NEVER  75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  410   S-VOL  PAIR    NEVER  -----     410     -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411   P-VOL  PAIR    NEVER  75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  411   S-VOL  PAIR    NEVER  -----     411     -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412   P-VOL  PAIR    NEVER  75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  412   S-VOL  PAIR    NEVER  -----     412     -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  413   P-VOL  PAIR    NEVER  75010010  413     -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  413   S-VOL  PAIR    NEVER  -----     413     -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  414   P-VOL  PAIR    NEVER  75010010  414     -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  414   S-VOL  PAIR    NEVER  -----     414     -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port, TID, LU)  Seq       LDEV  P/S    Status  Fence  Seq       P-LDEV  M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410   P-VOL  PAIR    NEVER  75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  ---- ---- ----------- ----- -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411   P-VOL  PAIR    NEVER  75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  ---- ---- ----------- ----- -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412   P-VOL  PAIR    NEVER  75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  ---- ---- ----------- ----- -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  413   P-VOL  PAIR    NEVER  75010010  413     -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  ---- ---- ----------- ----- -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  414   P-VOL  PAIR    NEVER  75010010  414     -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt to do any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process, but do not wish to process.

How does this work? Let’s start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV.
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives) – thus allowing all LUNs on this server to be accessed.

Now let’s stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port, TID, LU)  Seq       LDEV  P/S    Status  Fence  Seq       P-LDEV  M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410   P-VOL  PAIR    NEVER  75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  ---- ---- ----------- ----- -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411   P-VOL  PAIR    NEVER  75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  ---- ---- ----------- ----- -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412   P-VOL  PAIR    NEVER  75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  ---- ---- ----------- ----- -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  ---- ---- ----------- ----- -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  ---- ---- ----------- ----- -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  ---- ---- ----------- ----- -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  ---- ---- ----------- ----- -

The bold lines (d3(L) and d4(L)) show what has changed. Here is the bottom of the start up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE     PORT   SERIAL    LDEV  CTG  HM12  SSID  RGroup  PRODUCT_ID
E:\Vol13\Dsk54  CL2-D  77010027  410   -    Psss  0000  A07-00  DF600F
F:\Vol14\Dsk55  CL2-D  77010027  411   -    Psss  0000  A07-00  DF600F
Q:\Vol11\Dsk12  CL1-B  3157      169   -    Psss  0000  502-00  DF600F
G:\Vol15\Dsk56  CL2-D  77010027  412   -    Psss  0000  A07-00  DF600F
R:\Vol12\Dsk13  CL1-B  3157      170   -    Psss  0000  502-00  DF600F
H:\Vol16\Dsk57  CL2-D  77010027  413   -    Psss  0000  A07-00  DF600F
I:\Vol17\Dsk58  CL2-D  77010027  414   -    Psss  0000  A07-00  DF600F
J:\Vol2\Dsk0    -      -         -     -    -     -     -       ST336754LC

The bold lines show that LDEVs 413 and 414 are Physical Drives 57 and 58 – and as we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
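The same reasoning can be written down as a small check. It is a sketch only: it assumes the permission file contains entries of the form hdN or hdN-M, as in the example above, and the LDEV to drive mapping is copied by hand from the Dsk column of the inqraid output. The function name is invented.

```python
import re


def allowed_drives(perm_lines):
    """Expand entries such as 'hd0-56' into a set of physical drive numbers."""
    drives = set()
    for line in perm_lines:
        m = re.fullmatch(r"hd(\d+)(?:-(\d+))?", line.strip())
        if m:
            first = int(m.group(1))
            last = int(m.group(2)) if m.group(2) else first
            drives.update(range(first, last + 1))
    return drives


# LDEV -> physical drive number, from the inqraid output
ldev_to_drive = {410: 54, 411: 55, 412: 56, 413: 57, 414: 58}

permitted = allowed_drives(["hd0-56"])
hidden = sorted(ldev for ldev, drive in ldev_to_drive.items() if drive not in permitted)
print(hidden)
```

Running the check before restarting horcm shows which local LDEVs will turn into dashes in pairdisplay.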

Note that it is possible to “fix” this “mistake” by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE  Group  PairVol  PORT   TARG  LUN  M  SERIAL    LDEV
Harddisk57   VG01   d3       CL2-D  1     413  0  77010027  413
Harddisk57   VG01   d3       CL2-D  1     413  -  77010027  413
Harddisk58   VG01   d4       CL2-D  1     414  0  77010027  414
Harddisk58   VG01   d4       CL2-D  1     414  -  77010027  414

C:\HORCM\etc>pairdisplay -g VG01
Group  PairVol(L/R)  (Port, TID, LU)  Seq       LDEV  P/S    Status  Fence  Seq       P-LDEV  M
VG01   d0(L)         (CL2-D, 1, 410)  77010027  410   P-VOL  PAIR    NEVER  75010010  410     -
VG01   d0(R)         (CL1-A, 1, 410)  75010010  ---- ---- ----------- ----- -
VG01   d1(L)         (CL2-D, 1, 411)  77010027  411   P-VOL  PAIR    NEVER  75010010  411     -
VG01   d1(R)         (CL1-A, 1, 411)  75010010  ---- ---- ----------- ----- -
VG01   d2(L)         (CL2-D, 1, 412)  77010027  412   P-VOL  PAIR    NEVER  75010010  412     -
VG01   d2(R)         (CL1-A, 1, 412)  75010010  ---- ---- ----------- ----- -
VG01   d3(L)         (CL2-D, 1, 413)  77010027  413   P-VOL  PAIR    NEVER  75010010  413     -
VG01   d3(R)         (CL1-A, 1, 413)  75010010  ---- ---- ----------- ----- -
VG01   d4(L)         (CL2-D, 1, 414)  77010027  414   P-VOL  PAIR    NEVER  75010010  414     -
VG01   d4(R)         (CL1-A, 1, 414)  75010010  ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

“Basic” HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials – and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - \\.\CMD-10111-4
\\.\CMD-10111-4

The above file is correct – let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get.

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the “Internal Error” that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
…
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page – and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is, not many people know where to look.

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmcc
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

EADDRNOTAVAIL
AIX 4351      68   Can't assign requested address
HP-UX 1122    227  Can't assign requested address
Solaris 910   126  Can't assign requested address

Once again, this is not the most intuitive error I have seen.
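The arithmetic behind the Windows number is easy to script when you are reading many logs. The sketch below is an illustration only: the two Winsock names and the three UNIX errno values are the ones quoted in this document, and the function name decode_wsaerr is invented.

```python
import re

WSABASEERR = 10000
# Offsets quoted from winsock2.h in this document
WINSOCK_NAMES = {49: "WSAEADDRNOTAVAIL", 13: "WSAEACCES"}
# EADDRNOTAVAIL errno values quoted for the UNIX platforms
EADDRNOTAVAIL_ERRNO = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def decode_wsaerr(line):
    """Pull the code out of a 'WSAerr' log line and name it if it is known."""
    m = re.search(r"WSAerr\D*(\d+)\(0x([0-9a-fA-F]+)\)", line)
    if m is None:
        raise ValueError("not a WSAerr line")
    code = int(m.group(1))
    if code != int(m.group(2), 16):
        raise ValueError("decimal and hex values disagree")
    return code, WINSOCK_NAMES.get(code - WSABASEERR, "unknown")


print(decode_wsaerr("WSAerr : 10049(0x00002741) (See winsock2.h)"))
```

The hex value in brackets is the same number again, so the function uses it as a sanity check on the copied line.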

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to \\.\CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device \\.\CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(\\.\CMD-10111-42) is not ready for receiving a command.
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD.
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM.

Here I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to “horcm42”.

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.
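A quick way to test a service field before blaming anything else is to ask the operating system to resolve it. This sketch assumes that CCI looks a non-numeric service name up in the normal services database (the services file), which is how such names are usually registered; the function name is invented.

```python
import socket


def resolve_service(service, proto="udp"):
    """Return the port number for a HORCM_MON service field, or None."""
    if service.isdigit():
        port = int(service)
        return port if 0 < port < 65536 else None
    try:
        # Looks the name up in the OS services database
        return socket.getservbyname(service, proto)
    except OSError:
        return None


print(resolve_service("11042"))
print(resolve_service("horcm42"))
```

On a server where horcm42 has not been added to the services file, the second call returns None, which corresponds to the “wrong ipaddr or servicename” message above.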

4. UDP port which is in use

Change 11042 to 1030. This is not a “sensible” port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr : 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

  Proto  Local Address            Foreign Address  State
  UDP    ml_acer510:microsoft-ds  *:*
  UDP    ml_acer510:isakmp        *:*
  UDP    ml_acer510:1030          *:*
  …
  UDP    ml_acer510:54323         *:*

It is not likely that you will be this lucky.
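If Python happens to be available on the server, the same question can be answered without reading netstat output by simply trying to bind the port. This is a generic socket test, not a CCI facility, and the function name is made up; run it with HORCM stopped, using the port from HORCM_MON.

```python
import socket


def udp_port_free(port, host=""):
    """Try to bind a UDP socket; True means nothing else holds the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # 11042 is the service port used in the example CONF file
    print(udp_port_free(11042))
```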


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


Page 10: How to Debug CCI Issues 1.3


18:03:08-449a8-07240- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
18:03:08-449a8-07240- [HORCREAD] Number of used instance(s) = 17 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
18:03:08-449a8-07240- [HORCREAD] execute-test read is done PhysicalDrive8
18:03:08-449a8-07240- [horcmcfgrdf] seldevdata() OK
18:03:08-449a8-07240- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
18:03:08-449a8-07240- MON HORCM_CMD=PhysicalDrive8[Fibre][AL-PA=0xef -> C=5 T=1] port=CL1-A targ=1 lun=12

Here is the AL-PA for the Port, and the Port, target ID and LUN.

18:03:08-449a8-07240- MON(HORC) number of Mus = 0
18:03:08-449a8-07240- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
18:03:08-449a8-07240- MON(HOMRCF) number of Mus = 0
18:03:10-d1b78-05000- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

I have quoted this in full for a reason. This is what you should expect to see if it all works. If it does not work, at least you can compare the user log with this one.

UNIX

Here is the output for a Solaris server called SYD-E250-1. This is the log for instance 4. As we installed CCI in /opt/HORCM, the log is here:

root@SYD-E250-1:/opt/HORCM/log4/curlog# ls -al
total 28
drwxr-xr-x 3 root other 512 Mar 7 16:29 .
drwxr-xr-x 4 root other 512 Mar 7 16:29 ..
-rw-r--r-- 1 root other 10274 Mar 7 16:30 horcm_SYD-E250-1.log
drwxr-xr-x 2 root other 512 Mar 7 16:29 horcmlog_SYD-E250-1

- HORCM STARTUP LOG - Wed Mar 7 16:29:59 2007
16:29:59-cac9d-11271- horcmgr started on Wed Mar 7 16:29:59 2007
16:29:59-cd940-11271- execvp() horcmd_04 using /etc/horcmgr [CWD=]
16:29:59-e99c5-11272- Fibre address conversion TBL has been set to 1

PP      : RAID Manager for Solaris
Model   : RAID-Manager/Solaris
Ver&Rev : 01-19-03/04
Release : Production(GA)

ALL Rights Reserved, Copyright (c) 1998-2006 Hitachi, Ltd.

HORCM(SYD-E250-1 11272) started by root (0) on Wed Mar 7 16:30:00 2007

16:30:00-11d9d-11272- horcmd_04 started on Wed Mar 7 16:30:00 2007
16:30:00-17e65-11272- [horcmcfgrdf] access(conf_file) OK
16:30:00-1c076-11272- [horcmcfgrdf] access(check) OK
16:30:00-1e127-11272- [horcmcfgrdf] open(conf_file) OK
16:30:00-29cf3-11272- [horcmcfgetent] fseek(top) OK
16:30:00-31d0e-11272- [horcmcfgetent] read(conf_file) OK
16:30:00-34856-11272- [horcmcfgrdf] close(conf_file) OK
16:30:00-389cb-11272- [horcmcfgrdf] check(conf) OK
16:30:00-4a34c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:00-5ac7f-11272- [horcread] cmddevopen() start
16:30:00-63837-11272- [horcread] cmddevopen() finished
16:30:00-6e384-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025
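The serial number is not readable in the CHAR column because those twelve bytes are EBCDIC digits, not ASCII. As an illustration (this is plain Python, nothing to do with CCI itself), the first and fourth rows of the dump can be decoded like this, using the standard cp037 EBCDIC codec:

```python
# Rows 0000 and 0030 of the command device dump, copied from the log
row_0000 = "48495441 43484920 4f50454e 2052454d"
row_0030 = "f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09"

# The first row is plain ASCII
vendor = bytes.fromhex(row_0000.replace(" ", "")).decode("ascii")

# The first 12 bytes of row 0030 are EBCDIC digits (0xF0-0xF9 are '0'-'9')
serial_field = bytes.fromhex(row_0030.replace(" ", ""))[:12].decode("cp037")
serial = int(serial_field)

print(vendor)
print(serial_field, serial)
```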

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:01-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:01-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:01-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:01-c2226-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2

Here is the CMDDEV

16:30:01-c636e-11272- [HORCMCFGRDF] SLPR is supported
16:30:01-ca4bf-11272- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfce08]0000 80000000 00000000 00000000 00000000
[0xffbfce18]0010 00000000 00000000 00000000 00000000
16:30:01-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:01-deb6b-11272- [horcread] cmddevopen() start
16:30:01-e2d12-11272- [horcread] cmddevopen() finished
16:30:01-e7502-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:02-77659-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:02-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked
16:30:02-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:02-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:02-89c66-11272- [horcread] cmddevopen() start
16:30:02-8de05-11272- [horcread] cmddevopen() finished
16:30:02-925ff-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:03-07ece-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:03-0e0d4-11272- [horcmcfgrdf] seldevdata() OK
16:30:03-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
16:30:03-16392-11272- MON HORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2 T=32] port=CL1-A targ=32 lun=42

Here is the AL-PA for the Port, and the Port, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC) number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF) number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)

Audit Logging


Check: Always set full logging if possible. This was introduced with 01-17-03/05 – but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well – very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, and then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be what you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to “cut and paste” the command line input and output – you need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4

HORCM inst 4 starts successfully.
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
…
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4
HORCM Shutdown inst 4

root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of LOG file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
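When a user sends a long command log, it helps to reduce it to a list of commands and exit codes. The following is a rough sketch, not a CCI tool: it assumes each command appears on a CMDLINE line and is followed by a line containing [exit(N)], as in the excerpt above, and the function name is invented.

```python
import re


def parse_command_log(text):
    """Pair each CMDLINE entry with the exit code that follows it."""
    entries, current = [], None
    for line in text.splitlines():
        m = re.match(r"CMDLINE\s*:?\s*(.+)", line)
        if m:
            current = m.group(1).strip()
            continue
        m = re.search(r"\[exit\((\d+)\)\]", line)
        if m and current is not None:
            entries.append((current, int(m.group(1))))
            current = None
    return entries


sample = """COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""

entries = parse_command_log(sample)
for command, code in entries:
    print(code, command)
```

Any entry with a non-zero exit code, such as the 221 seen with EX_CMDRJE, is where to start reading the HORCM log.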

Command Device Reject

Most CCI errors are self explanatory – however, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example.

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000

HORCM_CMD
\\.\CMD-977-5

HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0

HORCM_INST
#dev_group ip_address service
VG01 localhost 11008


Check: Is the user using “good syntax”?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not to use Port-LDEV syntax – where is the cross-check? Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT     ALPA  C  TID  LU-M  Num(LDEV)  P/S    Status  LDEV   P-Seq  P-LDEV
CL1-A-1  ef    5  1    0-0   1(13)      S-VOL  PAIR    13     -----  10
CL1-A-1  ef    5  1    1-0   1(29)      P-VOL  PSUS    29     977    309
CL1-A-1  ef    5  1    2-0   1(48)      P-VOL  PSUS    48     977    300
CL1-A-1  ef    5  1    3-0   1(309)     S-VOL  SSUS    309    -----  29
CL1-A-1  ef    5  1    4-0   1(310)     S-VOL  SSUS    310    -----  29
CL1-A-1  ef    5  1    5-0   1(308)     S-VOL  SSUS    308    -----  24
CL1-A-1  ef    5  1    6-0   1(305)     S-VOL  SSUS    305    -----  1
CL1-A-1  ef    5  1    7-0   1(49)      SMPL   ----    -----  -----  -----
CL1-A-1  ef    5  1    8-0   1(50)      SMPL   ----    -----  -----  -----

C:\HORCM\ETC>pairdisplay -g VG01
Group  PairVol(L/R)  (Port, TID, LU-M)  Seq  LDEV  P/S   Status  Seq    P-LDEV  M
VG01   LDEV49(L)     (CL1-A-1, 1, 7-0)  977  49    SMPL  ----    -----  -----   -
VG01   LDEV49(R)     (CL1-A-1, 1, 8-0)  977  50    SMPL  ----    -----  -----   -

Check the P-VOL and S-VOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1 – LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 170230 2007CMDLINE paircreate -g VG01 -vl170230-9a8a8-12452- ERRORcm_sndrcv[rc lt 0 from HORCM]170230-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35170230-9e728-12452- [paircreate][exit(221)][EX_CMDRJE] An order to the controlcommand device was rejected[Cause ] An order to the command(control) device failedor was rejected[Action]Please confirm the following itemsIf this trouble doesnt resolvethen collect HORCM error logs(HORCM_LOG=CHORCMlog8curlog) and Remote HORCM logsand send them to service personnel(1) Check if the HORC or HOMRCF function is installed in the RAID(2) Check if the RCP and LCP are installed in the RAID(3) Check if the path between the RAID CUs is established by using the SVP(4) Check if the pair target volume is an appropriate status

Yes, meaningless error numbers like -35 and 221. If this is a RAID subsystem, check the SSB logs on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

17:02:30-9a8a8 is the cross-check: the same timestamp and ID appear in the command log and in the HORCM log. Next, it is not obvious, but the error code is

961C 000D

961C comes from the last word on the first line of the sense data (961c0000), and 000D is the low half of the SSB (0x8400000d).

Now get hold of the latest AMS CCI manual which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection
A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
Error       Contents                                                                    Recommended Action
961C 000C   The S-VOL is a Sub LU of a unified LU.                                      Check the status of the LU.
961C 000D   The default controllers controlling the P-VOL and S-VOL are not the same.   ...
961C 000E   The P-VOL is a Cache Residency LU.                                          Check the status of ...

In this case the PVOL and SVOL default controllers are not the same.
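Pulling the "961C 000D" pair out of the sense dump can be sketched in a few lines. The byte positions follow standard fixed-format SCSI sense data: sense key in the low nibble of byte 2, ASC in byte 12, ASCQ in byte 13. Treating bytes 8-11 as the SSB, and its low 16 bits as the detail code, is an assumption inferred from this one log, not taken from the manual. `decode_sense` is a hypothetical helper name.

```python
def decode_sense(hex_words):
    """Decode the first 16 bytes of sense data given as four 8-digit hex words."""
    data = bytes.fromhex("".join(hex_words))
    ssb = int.from_bytes(data[8:12], "big")       # assumed location of the SSB
    return {
        "skey": data[2] & 0x0F,                   # sense key
        "sense_code": f"{data[12]:02X}{data[13]:02X}",  # ASC followed by ASCQ
        "ssb": f"0x{ssb:08x}",
        "detail_code": f"{ssb & 0xFFFF:04X}",     # assumed: low half of the SSB
    }


if __name__ == "__main__":
    # First row of the SCSI SENSE DATA dump shown above.
    print(decode_sense(["70000500", "00000038", "8400000d", "961c0000"]))
```

The decoded values agree with the SKEY, ASC and SSB lines that CCI prints under the dump, which is a useful sanity check before looking the pair up in Table A.5.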

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from the 9900V onwards. With the 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With the 9900V, USP, etc., the LUNs are normally considered to be attached to "logical" ports, which are called HSDs or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSDs. Here is an example.

Imagine that 3 HSDs are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later:
Delete HSD 2
Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and to HSD 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that:

HSD 1 LUN 3 was "absolute" LUN 5
HSD 3 LUN 3 was "absolute" LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.
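The HSD example can be replayed with a small simulation. It assumes that the port always hands out the lowest free "absolute" LUN number. That rule is an assumption, chosen because it reproduces the numbers in the example; it is not a documented description of the array. The `Port` class is purely illustrative.

```python
class Port:
    """Toy model of one physical port and its absolute LUN table."""

    def __init__(self):
        self.absolute = {}  # absolute LUN -> (HSD, relative LUN)

    def add_lun(self, hsd, relative):
        """Map a relative LUN of an HSD to the lowest free absolute LUN."""
        n = 0
        while n in self.absolute:
            n += 1
        self.absolute[n] = (hsd, relative)
        return n

    def delete_hsd(self, hsd):
        """Free every absolute LUN that belonged to the HSD."""
        self.absolute = {a: v for a, v in self.absolute.items() if v[0] != hsd}


def replay_example():
    port = Port()
    for hsd in (1, 2, 3):                 # three HSDs, LUNs 0-2 each
        for lun in (0, 1, 2):
            port.add_lun(hsd, lun)
    port.delete_hsd(2)                    # frees absolute LUNs 3-5
    for lun in (0, 1):                    # HSD 4 reuses absolute LUNs 3 and 4
        port.add_lun(4, lun)
    return port.add_lun(1, 3), port.add_lun(3, 3)


if __name__ == "__main__":
    print(replay_example())
```

Under that assumption the absolute number of a LUN depends on the whole history of the port, which is why a HORCM CONF file written in the old syntax cannot be checked by inspection alone.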

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F, 0, 45) 32179 10b5.S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C, 0, 4) 32208 1003.P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F, 0, 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C, 0, 5) 32208 1004.P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F, 0, 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C, 0, 6) 32208 1005.P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F, 0, 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C, 0, 7) 32208 1006.P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F, 0, 49) 32179 108a.S-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C, 0, 8) 32208 1007.P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server, as (L) refers to the S-VOL. Next, we have obvious mismatches (in yellow). What is less obvious is that the turquoise and green pairs, the lines marked (1) to (3), are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).

Here is an excerpt from the "old" HORCM CONF file, using "absolute" LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the "absolute" and "relative" LUN numbers:

raidscan -p CL2-F -fx
CL2-F /88/ 3, 0, 49.1(108a)............S-VOL PAIR ASYNC, 108a, -----, 1003
CL2-F /88/ 3, 0, 50.1(108b)............S-VOL PAIR ASYNC, 108b, -----, 1004
CL2-F /88/ 3, 0, 51.1(108c)............S-VOL PAIR ASYNC, 108c, -----, 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 /88/ 3, 0, 6.1(108a)............S-VOL PAIR ASYNC, 108a, -----, 1003
CL2-F-2 /88/ 3, 0, 7.1(108b)............S-VOL PAIR ASYNC, 108b, -----, 1004
CL2-F-2 /88/ 3, 0, 8.1(108c)............S-VOL PAIR ASYNC, 108c, -----, 1005
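With many LUNs it helps to join the two raidscan listings on the LDEV number, so that each LDEV shows its "absolute" and "relative" LUN side by side. The sketch below assumes each data row contains a token of the form `LU.Num(LDEV)`, for example `49.1(108a)`. The sample rows are typed in by hand for illustration, and `luns_by_ldev` and `join_luns` are hypothetical helper names.

```python
import re

# Matches "49.1(108a)": LUN number, number of LDEVs, then the LDEV in hex.
LUN_LDEV = re.compile(r"(\d+)\.\d+\(([0-9a-fA-F]+)\)")

ABSOLUTE = """\
CL2-F /88/ 3, 0, 49.1(108a)............S-VOL PAIR ASYNC, 108a, -----, 1003
CL2-F /88/ 3, 0, 50.1(108b)............S-VOL PAIR ASYNC, 108b, -----, 1004
CL2-F /88/ 3, 0, 51.1(108c)............S-VOL PAIR ASYNC, 108c, -----, 1005
"""

RELATIVE = """\
CL2-F-2 /88/ 3, 0, 6.1(108a)............S-VOL PAIR ASYNC, 108a, -----, 1003
CL2-F-2 /88/ 3, 0, 7.1(108b)............S-VOL PAIR ASYNC, 108b, -----, 1004
CL2-F-2 /88/ 3, 0, 8.1(108c)............S-VOL PAIR ASYNC, 108c, -----, 1005
"""


def luns_by_ldev(raidscan_output):
    """Map LDEV (lower-case hex string) -> LUN number for each data row."""
    result = {}
    for line in raidscan_output.splitlines():
        match = LUN_LDEV.search(line)
        if match:
            result[match.group(2).lower()] = int(match.group(1))
    return result


def join_luns(absolute_output, relative_output):
    """Return {ldev: (absolute LUN, relative LUN)} for LDEVs present in both."""
    absolute = luns_by_ldev(absolute_output)
    relative = luns_by_ldev(relative_output)
    return {ldev: (absolute[ldev], relative[ldev])
            for ldev in absolute if ldev in relative}


if __name__ == "__main__":
    for ldev, (abs_lun, rel_lun) in sorted(join_luns(ABSOLUTE, RELATIVE).items()):
        print(ldev, abs_lun, rel_lun)
```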

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a "normal" CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10)
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 410.S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 411.S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 412.S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413) 75010010 413.S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 414.S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV.
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL):exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>

HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413) 77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A, 1, 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414) 77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A, 1, 414) 75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL):exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
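The same conclusion can be reached mechanically by expanding the permission file into a set of disk numbers and comparing it with the Dsk numbers that inqraid reports. This is a minimal sketch. It assumes the file holds only `hdN` or `hdN-M` entries, one per line, as in the example above. `allowed_disks` and `hidden_ldevs` are hypothetical helper names, and the LDEV-to-disk table is copied by hand from the inqraid output.

```python
import re


def allowed_disks(perm_text):
    """Expand 'hdN' and 'hdN-M' lines into a set of physical drive numbers."""
    allowed = set()
    for raw in perm_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", line, re.IGNORECASE)
        if match:
            first = int(match.group(1))
            last = int(match.group(2)) if match.group(2) else first
            allowed.update(range(first, last + 1))
    return allowed


def hidden_ldevs(ldev_to_disk, perm_text):
    """Return the LDEVs whose physical drive is not covered by the file."""
    allowed = allowed_disks(perm_text)
    return sorted(ldev for ldev, disk in ldev_to_disk.items() if disk not in allowed)


# LDEV -> physical drive number, taken from the inqraid listing above.
LDEV_TO_DISK = {410: 54, 411: 55, 412: 56, 413: 57, 414: 58}

if __name__ == "__main__":
    print(hidden_ldevs(LDEV_TO_DISK, "hd0-56"))
```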

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get.

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmc.c
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.

So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmc.c
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

Platform          EADDRNOTAVAIL
AIX 4.3, 5.1      68 Can't assign requested address
HP-UX 11.22       227 Can't assign requested address
Solaris 9, 10     126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
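The arithmetic behind these numbers is simple enough to script. The sketch below decodes only the two Winsock codes and the three UNIX values quoted in this document. It is not a complete error table, and `describe_wsa` is a hypothetical helper name. The last function uses Python's standard `errno` and `os.strerror` to show what the same condition is called on whatever host runs the script.

```python
import errno
import os

WSABASEERR = 10000  # from winsock2.h, as quoted above

# Only the Winsock codes discussed in this document.
WINSOCK_ERRORS = {
    49: ("WSAEADDRNOTAVAIL", "Cannot assign requested address"),
    13: ("WSAEACCES", "Permission denied"),
}

# EADDRNOTAVAIL values per platform, as quoted from the comparison page.
EADDRNOTAVAIL_BY_PLATFORM = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def describe_wsa(code):
    """Translate a WSAerr value such as 10049 into (name, description)."""
    return WINSOCK_ERRORS.get(code - WSABASEERR,
                              ("unknown", "not listed in this document"))


def local_eaddrnotavail():
    """Return (number, message) for EADDRNOTAVAIL on the host running this."""
    return errno.EADDRNOTAVAIL, os.strerror(errno.EADDRNOTAVAIL)


if __name__ == "__main__":
    print(describe_wsa(int("0x00002741", 16)))  # the hex value from the log
    print(EADDRNOTAVAIL_BY_PLATFORM["Solaris"])
    print(local_eaddrnotavail())
```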

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command.
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD.
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr:Failed to connect to HORCM.

Here I think it is pretty obvious what the problem is.

3. Invalid service name

Change 11042 to "horcm42".

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON. line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is fairly obvious what is wrong.
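A quick way to see whether a service name would resolve on a given host is to ask the operating system's services database directly. The sketch below uses Python's standard `socket.getservbyname`. It assumes CCI resolves the name the way an ordinary sockets program would, which this document does not confirm. `resolve_service` is a hypothetical helper name.

```python
import socket


def resolve_service(service, protocol="udp"):
    """Return the port for a numeric or named service, or None if unknown."""
    if service.isdigit():
        return int(service)
    try:
        return socket.getservbyname(service, protocol)
    except OSError:
        # The name is not defined in the local services database.
        return None


if __name__ == "__main__":
    for candidate in ("11042", "horcm42"):
        print(candidate, resolve_service(candidate))
```

If the name comes back as None on the CCI server, either use the port number in HORCM_MON or define the name in the services file first.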

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr : 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmc.c
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State
UDP ml_acer510:microsoft-ds *:*
UDP ml_acer510:isakmp *:*
UDP ml_acer510:1030 *:*
...
UDP ml_acer510:54323 *:*

It is not likely that you will be this lucky.
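When netstat output is not to hand, the port can be probed by simply trying to bind to it. This is a minimal sketch using only the Python standard library. `udp_port_is_free` is a hypothetical helper name, and the loopback address is used only to keep the example self-contained; on a real CCI server you would probe the address from HORCM_MON. Any bind failure is reported as "not free", whatever error number the platform chooses.

```python
import socket


def udp_port_is_free(port, host="127.0.0.1"):
    """Return True if a UDP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    for port in (11042, 1030):
        print(port, "free" if udp_port_is_free(port) else "in use or not permitted")
```

Run this while HORCM is stopped, otherwise the instance's own port will naturally show as in use.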

Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008

[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025

NSC55 with a Serial Number of 80025.

[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:01-ae6ea-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:01-b1cea-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:01-b5e34-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:01-c2226-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2

Here is the CMDDEV.

16:30:01-c636e-11272- [HORCMCFGRDF] SLPR is supported
16:30:01-ca4bf-11272- SLPR bitmap
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfce08]0000 80000000 00000000 00000000 00000000
[0xffbfce18]0010 00000000 00000000 00000000 00000000
16:30:01-dad71-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:01-deb6b-11272- [horcread] cmddevopen() start
16:30:01-e2d12-11272- [horcread] cmddevopen() finished
16:30:01-e7502-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:02-77659-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:02-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked
16:30:02-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:02-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:02-89c66-11272- [horcread] cmddevopen() start
16:30:02-8de05-11272- [horcread] cmddevopen() finished
16:30:02-925ff-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:03-07ece-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:03-0e0d4-11272- [horcmcfgrdf] seldevdata() OK
16:30:03-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
16:30:03-16392-11272- MON:HORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2 T=32] port=CL1-A targ=32 lun=42

Here is the AL-PA for the Port, and the Port, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC)number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF)number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)
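The serial number annotation can be checked from the dump itself. The first 44 bytes are plain ASCII, and the 12 bytes at offset 0x30 (f0f0...f2f5) decode as EBCDIC digits. That reading is an inference from the fact that it yields the annotated serial number 80025; it is not a documented layout of the command device header. `decode_cmddev_header` is a hypothetical helper name, and Python's standard `cp037` codec supplies the EBCDIC table.

```python
# Rows 0000-0030 of the horcread dump, four 32-bit words per row.
HEADER_ROWS = [
    "48495441 43484920 4f50454e 2052454d",
    "4f544520 434f5059 20535953 54454d20",
    "44415441 20545950 45203031 00000000",
    "f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09",
]


def decode_cmddev_header(rows):
    """Return (product string, serial number) from the first 64 dump bytes."""
    raw = bytes.fromhex("".join(rows).replace(" ", ""))
    product = raw[0x00:0x2C].decode("ascii")      # 44 bytes of ASCII text
    serial = int(raw[0x30:0x3C].decode("cp037"))  # 12 EBCDIC digits (assumed)
    return product, serial


if __name__ == "__main__":
    print(decode_cmddev_header(HEADER_ROWS))
```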

Audit Logging

Check: Always set full logging if possible. This was introduced with 01-17-03/05, but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well, which is very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, and then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be what you want to look at.

Windows

TSTART.BAT: BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT: BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output. You need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully.
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4:
HORCM Shutdown inst 4 !!!
root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of LOG file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
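Once HORCC_LOGSZ is set, the command log becomes a usable audit trail, and a few lines of script will turn it into a list of what was run and how each command ended. This is a minimal sketch. It assumes every command is logged as a `CMDLINE : ...` line followed later by a line containing `[exit(N)]`, as in the sample above. `parse_command_log` is a hypothetical helper name.

```python
import re

SAMPLE_LOG = """\
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""

EXIT_CODE = re.compile(r"\[exit\((\d+)\)\]")


def parse_command_log(text):
    """Return [(command line, exit code), ...] in the order they were logged."""
    commands, pending = [], None
    for line in text.splitlines():
        if line.startswith("CMDLINE"):
            pending = line.split(":", 1)[1].strip()
            continue
        match = EXIT_CODE.search(line)
        if match and pending is not None:
            commands.append((pending, int(match.group(1))))
            pending = None
    return commands


if __name__ == "__main__":
    for command, code in parse_command_log(SAMPLE_LOG):
        print(f"{code:3d}  {command}")
```

A non-zero exit code, such as the 221 seen with EX_CMDRJE, marks the command whose timestamp should then be looked up in the HORCM log.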

Command Device Reject

Most CCI errors are self-explanatory. However, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example

HORCM8CONF

HORCM_MONip_address service poll(10ms) timeout(10ms) localhost 11008 1000 3000 HORCM_CMDCMD-977-5HORCM_DEVdev_group dev_name port TargetID LU MUVG01 LDEV49 CL1-A-1 1 7 0HORCM_INSTdev_group ip_address service VG01 localhost 11009

HORCM9CONF

HORCM_MONip_address service poll(10ms) timeout(10ms) localhost 11009 1000 3000 HORCM_CMDCMD-977-5HORCM_DEVdev_group dev_name port TargetID LU MUVG01 LDEV49 CL1-A-1 1 8 0HORCM_INSTdev_group ip_address service VG01 localhost 11008

Mike Le Voi Page 14 11042023

How To Debug CCI Issues ndash Version 13

Check Is the user using ldquogood syntaxrdquo

Even though this is a 9500V users should always use Port-HSD-LUN syntax I strongly recommend not to use Port-LDEV syntax ndash where is the cross-check Is MU specified for ShadowImage On some levels of CCI this is mandatory

However you should specify it anyway as this is Best Practice

How to check if the HORCM CONF files are correct

CHORCMETCgtraidscan -p CL1-A-1 -m 0PORT ALPACTID LUNum(LDEV)PS Status LDEVP-SeqP-LDEVCL1-A-1ef 5 1 0-0 1(13)S-VOL PAIR 13 ----- 10CL1-A-1ef 5 1 1-0 1(29)P-VOL PSUS 29 977 309CL1-A-1ef 5 1 2-0 1(48)P-VOL PSUS 48 977 300CL1-A-1ef 5 1 3-0 1(309)S-VOL SSUS 309 ----- 29CL1-A-1ef 5 1 4-0 1(310)S-VOL SSUS 310 ----- 29CL1-A-1ef 5 1 5-0 1(308)S-VOL SSUS 308 ----- 24CL1-A-1ef 5 1 6-0 1(305)S-VOL SSUS 305 ----- 1CL1-A-1ef 5 1 7-0 1(49)SMPL ---- ----- ----- -----CL1-A-1ef 5 1 8-0 1(50)SMPL ---- ----- ----- -----

CHORCMETCgtpairdisplay -g VG01Group PairVol(LR) (PortTID LU-M) SeqLDEVPSStatus SeqP-LDEV MVG01 LDEV49(L) (CL1-A-1 1 7-0 ) 977 49SMPL --------- ----- -VG01 LDEV49(R) (CL1-A-1 1 8-0 ) 977 50SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50) Check the PortHSDLUN (in this case CL1-A-1 ndash LUNs 7 and 8)

But now the failure

CHORCMETCgtpaircreate -g VG01 -vlpaircreate [EX_CMDRJE] An order to the controlcommand device was rejectedRefer to the command log(CHORCMlog8horcc_hp2k5_logtxt) for details

And in the log we see this

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 170230 2007CMDLINE paircreate -g VG01 -vl170230-9a8a8-12452- ERRORcm_sndrcv[rc lt 0 from HORCM]170230-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35170230-9e728-12452- [paircreate][exit(221)][EX_CMDRJE] An order to the controlcommand device was rejected[Cause ] An order to the command(control) device failedor was rejected[Action]Please confirm the following itemsIf this trouble doesnt resolvethen collect HORCM error logs(HORCM_LOG=CHORCMlog8curlog) and Remote HORCM logsand send them to service personnel(1) Check if the HORC or HOMRCF function is installed in the RAID(2) Check if the RCP and LCP are installed in the RAID(3) Check if the path between the RAID CUs is established by using the SVP(4) Check if the pair target volume is an appropriate status

Yes meaningless error message numbers like -35 and 221 If this is a RAID subsystem check the SSBLOGS on the SVP However for DF the SSB is logged in CCI

Contents of CHORCMlog8curloghorcmlog_servernamehorcm_logtxt

170230-9a8a8-14140- SCSI Check Condition170230-9a8a8-14140- SCSI SENSE DATA ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------

Mike Le Voi Page 15 11042023

How To Debug CCI Issues ndash Version 13

[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8 [0x0012f2c4]0010 00001000 00000000 00000000 00000000 [0x0012f2d4]0020 00000000 00000000 00000000 00000000 [0x0012f2e4]0030 00000000 00000000 00000000 00000000 [0x0012f2f4]0040 00000000 00000000 00000000 00000000 [0x0012f304]0050 00000000 00000000 00000000 00000000 170230-9a8a8-14140- SKEY = 0x05170230-9a8a8-14140- ASC = 0x96170230-9a8a8-14140- SSB = 0x8400000d

17:02:30-9a8a8 is the cross-check: the same time stamp appears against the failed paircreate in the command log. Next, it is not obvious, but the error code is:

961C 000D
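A minimal Python sketch of how 961C 000D can be read off the first row of the sense data. It assumes, purely from the example above and not from the manual, that the sense code is the upper half of the fourth 32-bit word and the detail code is the lower half of the SSB word; the function name is hypothetical.

```python
def sense_and_detail(first_dump_row: str) -> str:
    """Return 'SENSE DETAIL' (e.g. '961C 000D') from the first sense-data row.

    Assumption: the row is laid out as in the example dump, so the third
    word is the SSB and the fourth word starts with the sense code.
    """
    hex_digits = "0123456789abcdefABCDEF"
    # Skip the '[address]offset' prefix; keep only 8-digit hex words.
    words = [w for w in first_dump_row.split()
             if len(w) == 8 and all(c in hex_digits for c in w)]
    ssb, sense_word = words[2], words[3]
    sense_code = sense_word[:4].upper()   # 961c0000 -> 961C
    detail_code = ssb[4:].upper()         # 8400000d -> 000D
    return f"{sense_code} {detail_code}"


row = "[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000"
print(sense_and_detail(row))  # prints 961C 000D
```

The result is the key to look up in the sense code table of the manual.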

Now get hold of the latest AMS CCI manual which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection
A.4.4 Sense Code and Detail Code

Beware - some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes

...         Error Contents                                                              Recommended Action
961C 000C   The S-VOL is a Sub LU of a unified LU.                                      Check the status of the LU.
961C 000D   The default controllers controlling the P-VOL and S-VOL are not the same.   ...
961C 000E   The P-VOL is a Cache Residency LU.                                          Check the status of ...

In this case, the PVOL and SVOL default controllers are not the same.

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With 9900V, USP etc. the LUNs are normally considered to be attached to "logical" ports - which are called HSD or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added and deleted from various HSD. Here is an example.

Imagine that 3 HSD are created on an empty port - HSD 1, 2 and 3. Each HSD has 3 LUNs added - numbered as 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later:
Delete HSD 2
Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that:

HSD 1 LUN 3 was "absolute" LUN 5
HSD 3 LUN 3 was "absolute" LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.
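The example above can be replayed with a small Python model. It is only a sketch: it assumes that a port always hands out the lowest unused "absolute" LUN number, which is inferred from the numbers in the example rather than taken from any product documentation, and the class and method names are made up for illustration.

```python
class Port:
    """Toy model of one physical port carrying several HSDs."""

    def __init__(self):
        self.absolute = {}  # absolute LUN -> (HSD, relative LUN)

    def add_lun(self, hsd: int, relative_lun: int) -> int:
        # Assumption: the lowest unused absolute number is handed out.
        n = 0
        while n in self.absolute:
            n += 1
        self.absolute[n] = (hsd, relative_lun)
        return n

    def delete_hsd(self, hsd: int) -> None:
        # Removing an HSD frees all of its absolute LUN numbers.
        self.absolute = {a: v for a, v in self.absolute.items() if v[0] != hsd}


port = Port()
for hsd in (1, 2, 3):          # HSD 1, 2 and 3, each with LUNs 0, 1 and 2
    for lun in (0, 1, 2):
        port.add_lun(hsd, lun)
port.delete_hsd(2)             # frees absolute LUNs 3-5
port.add_lun(4, 0)             # HSD 4 LUN 0 takes absolute LUN 3
port.add_lun(4, 1)             # HSD 4 LUN 1 takes absolute LUN 4
print(port.add_lun(1, 3))      # HSD 1 LUN 3 -> prints 5
print(port.add_lun(3, 3))      # HSD 3 LUN 3 -> prints 9
```

The same relative LUN number ends up on two unrelated absolute numbers, and the result depends entirely on the history of the port.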

In a recent case, 47 S-VOL LUNs were deleted by mistake from a HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F, 0, 45) 32179 10b5.S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C, 0, 4) 32208 1003.P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F, 0, 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C, 0, 5) 32208 1004.P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F, 0, 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C, 0, 6) 32208 1005.P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F, 0, 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C, 0, 7) 32208 1006.P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F, 0, 49) 32179 108a.S-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C, 0, 8) 32208 1007.P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server - as (L) refers to the S-VOL. Next, we have obvious mismatches - in yellow. What is less obvious is that the turquoise and green pairs are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).
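This kind of check can be automated. The Python sketch below assumes the site's naming convention seen in this case, where the pair name is "P-VOL LDEV, hyphen, S-VOL LDEV" (for example 1003-108A); that convention is specific to this customer, and the function and the row layout are hypothetical. The rows are typed in by hand from the pairdisplay above.

```python
# (pair name, side, LDEV shown by pairdisplay); None means the row showed dashes.
rows = [
    ("1003-108A", "L", "10b5"), ("1003-108A", "R", "1003"),
    ("1004-108B", "L", None),   ("1004-108B", "R", "1004"),
    ("1005-108C", "L", None),   ("1005-108C", "R", "1005"),
    ("1006-108D", "L", None),   ("1006-108D", "R", "1006"),
    ("1007-108E", "L", "108a"), ("1007-108E", "R", "1007"),
]


def mismatches(rows, local_is_svol=True):
    """List rows whose LDEV does not agree with the LDEV encoded in the name."""
    bad = []
    for name, side, ldev in rows:
        pvol, svol = name.lower().split("-")
        on_svol_side = (side == "L") == local_is_svol
        expected = svol if on_svol_side else pvol
        if ldev is None or ldev.lower() != expected:
            bad.append((name, side, ldev, expected))
    return bad


for name, side, ldev, expected in mismatches(rows):
    print(f"{name}({side}): found {ldev}, expected {expected}")
```

With the data above, every (L) row is reported and every (R) row passes, which matches the conclusion that the damage is entirely on the S-VOL side.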

Here is an excerpt from the "old" HORCM CONF file - using "absolute" LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the "absolute" and "relative" LUN numbers:

raidscan -p CL2-F -fx
CL2-F /88/ 3, 0, 49.1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F /88/ 3, 0, 50.1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F /88/ 3, 0, 51.1(108c) S-VOL PAIR ASYNC 108c ----- 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 /88/ 3, 0, 6.1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 /88/ 3, 0, 7.1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 /88/ 3, 0, 8.1(108c) S-VOL PAIR ASYNC 108c ----- 1005
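The two listings can be joined on the LDEV number to get an absolute-to-relative LUN table. The Python sketch below does this with values typed in by hand, reading the LU column of the two raidscan listings as 49, 50, 51 for CL2-F and 6, 7, 8 for CL2-F-2; it does not parse real raidscan output, and the function name is hypothetical.

```python
# LDEV -> LUN, read from the two raidscan listings above.
absolute = {"108a": 49, "108b": 50, "108c": 51}   # raidscan -p CL2-F -fx
relative = {"108a": 6, "108b": 7, "108c": 8}      # raidscan -p CL2-F-2 -fx


def lun_map(absolute, relative, hsd_port):
    """Join on LDEV: absolute LUN -> (HSD port, relative LUN)."""
    return {absolute[ldev]: (hsd_port, relative[ldev])
            for ldev in absolute if ldev in relative}


for abs_lun, (port, rel_lun) in sorted(lun_map(absolute, relative, "CL2-F-2").items()):
    print(f"absolute LUN {abs_lun} = {port} LUN {rel_lun}")
```

A table like this is what you need when converting an "old syntax" HORCM CONF file to HSD syntax.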

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a "normal" CMDDEV - and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV - and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10*)
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 410.S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 411.S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 412.S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413) 75010010 413.S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 414.S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt to issue any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives) - thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>

HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413) 77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A, 1, 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414) 77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A, 1, 414) 75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE     PORT   SERIAL    LDEV  CTG  H/M/12  SSID  R:Group  PRODUCT_ID
E:\Vol13\Dsk54  CL2-D  77010027  410   -    P/s/ss  0000  A:07-00  DF600F
F:\Vol14\Dsk55  CL2-D  77010027  411   -    P/s/ss  0000  A:07-00  DF600F
Q:\Vol11\Dsk12  CL1-B  3157      169   -    P/s/ss  0000  5:02-00  DF600F
G:\Vol15\Dsk56  CL2-D  77010027  412   -    P/s/ss  0000  A:07-00  DF600F
R:\Vol12\Dsk13  CL1-B  3157      170   -    P/s/ss  0000  5:02-00  DF600F
H:\Vol16\Dsk57  CL2-D  77010027  413   -    P/s/ss  0000  A:07-00  DF600F
I:\Vol17\Dsk58  CL2-D  77010027  414   -    P/s/ss  0000  A:07-00  DF600F
J:\Vol2\Dsk0    -      -         -     -    -       -     -        ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58 - and as we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
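The filtering effect can be sketched in a few lines of Python. This is an illustration only: it handles just the "hdN-M" form seen in this example (the real file may accept other entries), the drive-to-LDEV table is typed in from the inqraid output above, and the function name is hypothetical.

```python
def parse_perm_line(line: str) -> range:
    """Parse an entry such as 'hd0-56' into range(0, 57)."""
    spec = line.strip().lower()
    if not spec.startswith("hd"):
        raise ValueError(f"unsupported entry: {line!r}")
    first, _, last = spec[2:].partition("-")
    return range(int(first), int(last or first) + 1)


# Physical drive -> LDEV on the AMS1000, from the inqraid output above.
drives = {54: 410, 55: 411, 56: 412, 57: 413, 58: 414}

allowed = parse_perm_line("hd0-56")
visible = sorted(ldev for drive, ldev in drives.items() if drive in allowed)
hidden = sorted(ldev for drive, ldev in drives.items() if drive not in allowed)
print("visible:", visible)   # prints visible: [410, 411, 412]
print("hidden: ", hidden)    # prints hidden:  [413, 414]
```

The file can only remove drives from the set that the CMDDEV already permits; it never adds any.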

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE  Group  PairVol  PORT   TARG  LUN  M  SERIAL    LDEV
Harddisk57   VG01   d3       CL2-D  1     413  0  77010027  413
Harddisk57   VG01   d3       CL2-D  1     413  -  77010027  413
Harddisk58   VG01   d4       CL2-D  1     414  0  77010027  414
Harddisk58   VG01   d4       CL2-D  1     414  -  77010027  414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D, 1, 410) 77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A, 1, 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D, 1, 411) 77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A, 1, 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D, 1, 412) 77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A, 1, 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D, 1, 413) 77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A, 1, 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D, 1, 414) 77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A, 1, 414) 75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials - and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
\\.\CMD-10111-4

The above file is correct - let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page - and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address.

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is, not many people know where to look.
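As a small aid, the WSAerr value can be decoded mechanically. The Python sketch below only knows the two offsets quoted from winsock2.h in this document; the table and function name are illustrative and not part of CCI.

```python
WSABASEERR = 10000

# Only the two offsets quoted from winsock2.h in this document.
WINSOCK_NAMES = {49: "WSAEADDRNOTAVAIL", 13: "WSAEACCES"}


def decode_wsaerr(code: int) -> str:
    """Turn a WSAerr number into 'code (hex) = WSABASEERR+offset = name'."""
    offset = code - WSABASEERR
    name = WINSOCK_NAMES.get(offset, "unknown")
    return f"{code} (0x{code:08x}) = WSABASEERR+{offset} = {name}"


print(decode_wsaerr(10049))  # prints 10049 (0x00002741) = WSABASEERR+49 = WSAEADDRNOTAVAIL
```

The hexadecimal form matches the value in brackets in the HORCM error text, which is a quick way to confirm you are reading the right number.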

UNIX

UNIX error messages are not only different from Windows, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmcc
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

                AIX 4351                            HP-UX 1122                           Solaris 910
EADDRNOTAVAIL   68 Can't assign requested address   227 Can't assign requested address   126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
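If Python is available on the host in question, the local value can be checked directly instead of relying on a table. This is a minimal sketch using only the standard errno and os modules; the QUOTED dictionary simply repeats the three values from the comparison above.

```python
import errno
import os

# Values quoted in the comparison above.
QUOTED = {"AIX": 68, "HP-UX": 227, "Solaris": 126}


def local_eaddrnotavail():
    """Return (number, message) for EADDRNOTAVAIL on the machine running this."""
    return errno.EADDRNOTAVAIL, os.strerror(errno.EADDRNOTAVAIL)


number, message = local_eaddrnotavail()
print(f"this host: EADDRNOTAVAIL = {number} ({message})")
for platform, value in QUOTED.items():
    print(f"{platform}: EADDRNOTAVAIL = {value}")
```

The number printed for "this host" is the one to look for after "Errorno" in the HORCM error text on that platform.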

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread(): cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here, I think, it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to "horcm42".

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.
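A quick way to test a service name before putting it in HORCM_MON is to ask the operating system to resolve it. The Python sketch below assumes that a non-numeric service field has to be resolvable through the system services database; that is an assumption about CCI's behaviour, not something stated in this document, and the function name is hypothetical.

```python
import socket


def resolve_service(service: str, proto: str = "udp"):
    """Return the port for a HORCM_MON service field, or None if unresolvable."""
    if service.isdigit():
        return int(service)          # numeric ports need no lookup
    try:
        return socket.getservbyname(service, proto)
    except OSError:
        return None                  # name is not in the services database


for candidate in ("11042", "horcm42"):
    print(candidate, "->", resolve_service(candidate))
```

A result of None for a name means the name has to be added to the services file, or a port number used instead.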

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr : 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied.

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

  Proto  Local Address            Foreign Address  State
  UDP    ml_acer510:microsoft-ds  *:*
  UDP    ml_acer510:isakmp        *:*
  UDP    ml_acer510:1030          *:*
  ...
  UDP    ml_acer510:54323         *:*

It is not likely that you will be this lucky.
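Where netstat is not to hand, the same question can be answered by simply trying to bind the port. The Python sketch below probes the loopback address by default; the address to test is an assumption, and in practice it should be the ip_address from HORCM_MON. The function name is hypothetical, and the probe must be run while HORCM is stopped, otherwise HORCM itself holds the port.

```python
import socket


def udp_port_is_free(port: int, address: str = "127.0.0.1") -> bool:
    """Try to bind a UDP socket; a failure means the port cannot be used."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((address, port))
        except OSError as exc:
            print(f"bind({address}, {port}) failed: {exc}")
            return False
    return True


print(udp_port_is_free(11042))
```

The probe socket is closed again straight away, so a free port is not left occupied by the test.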


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


Page 12: How to Debug CCI Issues 1.3


[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-62fd9-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-6712a-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-6b268-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:02-77659-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:02-7b7d2-11272- [HORCMCFGRDF] SLPR bitmap is checked
16:30:02-7f90c-11272- [horcmcfgrdf] horccmddev(0) OK
16:30:02-85faf-11272- [HORCRELOWNLBA] floatable LBA(e00c) is released ID0 /dev/rdsk/c2t6d0s2
16:30:02-89c66-11272- [horcread] cmddevopen() start
16:30:02-8de05-11272- [horcread] cmddevopen() finished
16:30:02-925ff-11272- horcread
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcb94]0000 48495441 43484920 4f50454e 2052454d HITACHI OPEN REM
[0xffbfcba4]0010 4f544520 434f5059 20535953 54454d20 OTE COPY SYSTEM
[0xffbfcbb4]0020 44415441 20545950 45203031 00000000 DATA TYPE 01
[0xffbfcbc4]0030 f0f0f0f0 f0f0f0f8 f0f0f2f5 00070d09 000000080025
[0xffbfcbd4]0040 50090500 00020000 00050005 00050005 P
[0xffbfcbe4]0050 00040004 00040004 00060006 00060006
[0xffbfcbf4]0060 00070007 00070007 000f2a00 00000000
[0xffbfcc04]0070 00000000 b200e00c 08030100 01004000
[0xffbfcc14]0080 f8000400 04400100 01000400 00ff0100 8
[0xffbfcc24]0090 80000000 00000000 00000000 00000000
[0xffbfcc34]00a0 00000000 00000000 00000000 00000000
[0xffbfcc44]00b0 0080000e 00080002 00000000 00000000
[0xffbfcc54]00c0 00000000 00000000 00000000 00000000
[0xffbfcc64]00d0 00000000 00000000 00000000 00000000
[0xffbfcc74]00e0 00000000 00000000 00000000 00000000
[0xffbfcc84]00f0 00000000 00000000 00000000 00000000
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffbfcc94]0100 00050004 00060007 00080009 0060ffff `
[0xffbfcca4]0110 ffffffff ffffffff ffff000d ffffffff
[0xffbfccb4]0120 0020ffff ffffffff ffffffff ffffffff
[0xffbfccc4]0130 ffffffff ffffffff ffffffff ffffffff
[0xffbfccd4]0140 0039ffff ffffffff ffffffff ffffffff 9
[0xffbfcce4]0150 0028ffff ffffffff ffff002d ffffffff (-
[0xffbfccf4]0160 ffffffff 00320033 ffffffff ffffffff 23
[0xffbfcd04]0170 ffffffff ffffffff ffffffff ffff000a
16:30:02-e7a8a-11272- [HORCREAD] maxldev = 16384 unitnum = 256
16:30:02-ebbdb-11272- [HORCREAD] maxhorc = 4 maxmrcf = 64 maxlun = 1024 maxctg = 256 maxjnlg = 256 mixport =1 slprflag = 1
16:30:02-efd23-11272- [HORCREAD] Number of used instance(s) = 13 Number of attached instance(s) = UNKNOWN Number of same instance(s) = 1
16:30:03-07ece-11272- [HORCREAD] execute-test read is done /dev/rdsk/c2t6d0s2
16:30:03-0e0d4-11272- [horcmcfgrdf] seldevdata() OK
16:30:03-12354-11272- MON(HORC) Size of memory allocation for CONFIG_DB = 64 bytes
16:30:03-16392-11272- MON HORCM_CMD=/dev/rdsk/c2t6d0s2[Fibre][AL-PA=0xb2 -> C=2 T=32] port=CL1-A targ=32 lun=42

Here is the AL-PA for the Port, and the Port, target ID and LUN.

16:30:03-1a4ba-11272- MON(HORC) number of Mus = 0
16:30:03-1e633-11272- MON(HOMRCF) Size of memory allocation for CONFIG_DB = 0 bytes
16:30:03-2275a-11272- MON(HOMRCF) number of Mus = 0
16:30:07-b3adf-11271- horcmgr executed system(/bin/ls /dev/rdsk/* | /HORCM/usr/bin/raidscan -find inst)

Audit Logging


Check: Always set full logging if possible. This was introduced with 01-17-03/05 - but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well - very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories and then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be what you want to look at.

Windows

TSTART.BAT - BAT file to start CCI and set the correct options for TC

@echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT - BAT file to stop CCI

@echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output - you need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully.
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4:
HORCM Shutdown inst 4 !!!
root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x   4 root other 512 Mar  7 16:50 .
dr-xr-xr-x  12 root sys   512 Feb 22 15:04 ..
drwxr-xr-x   3 root other 512 Mar  7 16:49 curlog
-rw-r--r--   1 root other 289 Mar  7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x   3 root other 512 Mar  7 16:29 tmplog

Here are the contents of LOG file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
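When a customer sends a long command log, it helps to reduce it to "what was typed and how it ended". The Python sketch below does that for the sample above. It assumes each command is logged as a CMDLINE line followed later by a line containing [exit(N)]; the exact separators in the sample are an assumption, and the function name is hypothetical.

```python
import re

SAMPLE = """\
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar  7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar  7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""


def commands(log_text: str):
    """Yield (command line, exit code) pairs from a command log."""
    cmdline = None
    for line in log_text.splitlines():
        if line.startswith("CMDLINE"):
            # Drop the keyword and any separator characters after it.
            cmdline = line[len("CMDLINE"):].lstrip(" :").strip()
        else:
            match = re.search(r"\[exit\((\d+)\)\]", line)
            if match and cmdline is not None:
                yield cmdline, int(match.group(1))
                cmdline = None


for cmd, code in commands(SAMPLE):
    print(f"exit {code}: {cmd}")
```

Any command with a non-zero exit code, such as the exit(221) seen with EX_CMDRJE, stands out immediately in the resulting list.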

Command Device Reject

Most CCI errors are self-explanatory - however, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example:

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000
HORCM_CMD
\\.\CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000
HORCM_CMD
\\.\CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11008


Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax - where is the cross-check?
Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT# /ALPA/C,TID#,LU#.Num(LDEV#....)...P/S, Status,LDEV#,P-Seq#,P-LDEV#
CL1-A-1 /ef/ 5, 1, 0-0.1(13) S-VOL PAIR 13 ----- 10
CL1-A-1 /ef/ 5, 1, 1-0.1(29) P-VOL PSUS 29 977 309
CL1-A-1 /ef/ 5, 1, 2-0.1(48) P-VOL PSUS 48 977 300
CL1-A-1 /ef/ 5, 1, 3-0.1(309) S-VOL SSUS 309 ----- 29
CL1-A-1 /ef/ 5, 1, 4-0.1(310) S-VOL SSUS 310 ----- 29
CL1-A-1 /ef/ 5, 1, 5-0.1(308) S-VOL SSUS 308 ----- 24
CL1-A-1 /ef/ 5, 1, 6-0.1(305) S-VOL SSUS 305 ----- 1
CL1-A-1 /ef/ 5, 1, 7-0.1(49) SMPL ---- ----- ----- -----
CL1-A-1 /ef/ 5, 1, 8-0.1(50) SMPL ---- ----- ----- -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU-M) ,Seq#,LDEV#.P/S,Status, Seq#,P-LDEV# M
VG01 LDEV49(L) (CL1-A-1, 1, 7-0 ) 977 49.SMPL --------- ----- -
VG01 LDEV49(R) (CL1-A-1, 1, 8-0 ) 977 50.SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50) Check the PortHSDLUN (in this case CL1-A-1 ndash LUNs 7 and 8)

But now the failure

CHORCMETCgtpaircreate -g VG01 -vlpaircreate [EX_CMDRJE] An order to the controlcommand device was rejectedRefer to the command log(CHORCMlog8horcc_hp2k5_logtxt) for details

And in the log we see this

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 170230 2007CMDLINE paircreate -g VG01 -vl170230-9a8a8-12452- ERRORcm_sndrcv[rc lt 0 from HORCM]170230-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35170230-9e728-12452- [paircreate][exit(221)][EX_CMDRJE] An order to the controlcommand device was rejected[Cause ] An order to the command(control) device failedor was rejected[Action]Please confirm the following itemsIf this trouble doesnt resolvethen collect HORCM error logs(HORCM_LOG=CHORCMlog8curlog) and Remote HORCM logsand send them to service personnel(1) Check if the HORC or HOMRCF function is installed in the RAID(2) Check if the RCP and LCP are installed in the RAID(3) Check if the path between the RAID CUs is established by using the SVP(4) Check if the pair target volume is an appropriate status

Yes meaningless error message numbers like -35 and 221 If this is a RAID subsystem check the SSBLOGS on the SVP However for DF the SSB is logged in CCI

Contents of CHORCMlog8curloghorcmlog_servernamehorcm_logtxt

170230-9a8a8-14140- SCSI Check Condition170230-9a8a8-14140- SCSI SENSE DATA ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------

Mike Le Voi Page 15 11042023

How To Debug CCI Issues ndash Version 13

[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8 [0x0012f2c4]0010 00001000 00000000 00000000 00000000 [0x0012f2d4]0020 00000000 00000000 00000000 00000000 [0x0012f2e4]0030 00000000 00000000 00000000 00000000 [0x0012f2f4]0040 00000000 00000000 00000000 00000000 [0x0012f304]0050 00000000 00000000 00000000 00000000 170230-9a8a8-14140- SKEY = 0x05170230-9a8a8-14140- ASC = 0x96170230-9a8a8-14140- SSB = 0x8400000d

170230-9a8a8 is the cross-check Next it is not obvious but the error code is

961C 000D

Now get hold of the latest AMS CCI manual which contains Appendix A4

A4 How to Read Detailed Error Log Codesand this subsectionA44 Sense Code and Detail Code

Beware ndash some versions of this manual do not contain these sections Find one that does

Table A5 Sense Codes and Detailed CodeshellipError Contents Recommended Action961C 000C The S-VOL is a Sub LU of a unified LU Check the status of the LU961C 000D The default controllers controlling the P-VOL and S-VOL are not the samehellip961C 000E The P-VOL is a Cache Residency LU Check the status of hellip

In this case the PVOL and SVOL default controllers are not the same

ldquoOld Syntaxrdquo HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards With 7700E and 9900 there were no Host Storage Domains (HSD) so all LUNs were on the ldquorealrdquo port With 9900V USP etc the LUNs are normally considered to be attached to ldquologicalrdquo ports ndash which are called HSD or Host Groups

However it is still possible to use the ldquooldrdquo syntax This always causes confusion after a while as LUNs get added and deleted from various HSD Here is an example

Imagine that 3 HSD are created on an empty port ndash HSD 12 and 3 Each HSD has 3 LUNs added ndash numbered as 0 1 and 2

If this is done in sequence HSD 1 has ldquoabsoluterdquo LUNs 0-2 HSD 2 has ldquoabsoluterdquo LUNs 3-5 and HSD 3 has ldquoabsoluterdquo LUNs 6-8

Now imagine that the following actions have been performed some time later Delete HSD 2 Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and 3 If you did not know that the previous changes had been made it would be impossible for you to ldquoguessrdquo that

Mike Le Voi Page 16 11042023

How To Debug CCI Issues ndash Version 13

HSD 1 LUN 3 was ldquoabsoluterdquo LUN 5 HSD 3 LUN 3 was ldquoabsoluterdquo LUN 9

Even worse you have no way of looking at the LUN allocations via Storage Navigator as that only shows ldquorelativerdquo LUN numbers

In a recent case 47 S-VOL LUNs were deleted by mistake from a HSD When the mistake was noticed the same 47 S-VOL LUNs were added back in the ldquosame orderrdquo However a subsequent pairdisplay showed the following

TC-WRP 1003-108A(L) (CL2-F 0 45)32179 10b5S-VOL PAIR ASYNC 0 102e TC-WRP 1003-108A(R) (CL1-C 0 4)32208 1003P-VOL PAIR ASYNC 0 108a - (1)TC-WRP 1004-108B(L) (CL2-F 0 46)32179 --------- ---- ----------- ----- -TC-WRP 1004-108B(R) (CL1-C 0 5)32208 1004P-VOL PAIR ASYNC 0 108b -TC-WRP 1005-108C(L) (CL2-F 0 47)32179 --------- ---- ----------- ----- -TC-WRP 1005-108C(R) (CL1-C 0 6)32208 1005P-VOL PAIR ASYNC 0 108c -TC-WRP 1006-108D(L) (CL2-F 0 48)32179 --------- ---- ----------- ----- -TC-WRP 1006-108D(R) (CL1-C 0 7)32208 1006P-VOL PAIR ASYNC 0 108d -TC-WRP 1007-108E(L) (CL2-F 0 49)32179 108aS-VOL PAIR ASYNC 0 1003 - (2)TC-WRP 1007-108E(R) (CL1-C 0 8)32208 1007P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above Firstly the pairdisplay was issued by the ldquoDRrdquo CCI server ndash as (L) refers to the S-VOL Next we have obvious mismatches ndash in yellow What is less obvious is that the turquoise and green pairs are also invalid Indeed

(3) P-VOL is 1007 and the associated S-VOL is 108E(2) S-VOL is 108A and the associated P-VOL is 1003

This entry does not go with (3)(1) This is the associated P-VOL for (2)

Here is an excerpt from the ldquooldrdquo HORCM CONF file ndash using ldquoabsoluterdquo LUN numbers

TC-WRP 1003-108A CL2-F 0 45TC-WRP 1004-108B CL2-F 0 46TC-WRP 1005-108C CL2-F 0 47TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax

TC-WRP 1003-108A CL2-F-2 0 6TC-WRP 1004-108B CL2-F-2 0 7TC-WRP 1005-108C CL2-F-2 0 8TC-WRP 1006-108D CL2-F-2 0 9

As you can the new HORCM CONF file is easier to understand and compare with Storage Navigator

By the way here is how you find out the ldquoabsoluterdquo and ldquorelativerdquo LUN numbers

raidscan -p CL2-F -fxCL2-F 88 3 0 491(108a)S-VOL PAIR ASYNC 108a ----- 1003CL2-F 88 3 0 501(108b)S-VOL PAIR ASYNC 108b ----- 1004CL2-F 88 3 0 511(108c)S-VOL PAIR ASYNC 108c ----- 1005

Mike Le Voi Page 17 11042023

How To Debug CCI Issues ndash Version 13

raidscan -p CL2-F-2 -fxCL2-F-2 88 3 0 61(108a)S-VOL PAIR ASYNC 108a ----- 1003CL2-F-2 88 3 0 71(108b)S-VOL PAIR ASYNC 108b ----- 1004CL2-F-2 88 3 0 81(108c)S-VOL PAIR ASYNC 108c ----- 1005

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a "normal" CMDDEV - and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = \\.\PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV - and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(\\.\PHYSICALDRIVE1 -> \\.\PHYSICALDRIVE10
C:\HORCM\ETC>horcctl -D
Current control device = \\.\PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = \\.\PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 410 S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 411 S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 412 S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D 1 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413) 75010010 413 S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D 1 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414) 75010010 414 S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414) 75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands, such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives) - thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>

HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413) 77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A 1 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414) 77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A 1 414) 75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58 - and, as we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
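When a customer sends you a HORCMPERM file, it can help to expand the ranges and see at a glance which Physical Drives fall outside them. The following is only a minimal sketch: it assumes the file contains nothing but entries of the form hdN or hdN-M, as in the example above (the real file may accept other forms), and permitted_drives is a hypothetical helper name, not part of CCI.

```python
import re


def permitted_drives(perm_lines):
    """Expand entries such as 'hd0-56' or 'hd57' into a set of drive numbers."""
    drives = set()
    for line in perm_lines:
        match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", line.strip())
        if match:
            first = int(match.group(1))
            last = int(match.group(2) or first)
            drives.update(range(first, last + 1))
    return drives


# The single entry from the HORCMPERM410.CONF example
allowed = permitted_drives(["hd0-56"])

# Physical Drive numbers taken from the inqraid output (Dsk54 to Dsk58)
print([n for n in (54, 55, 56, 57, 58) if n not in allowed])   # prints: [57, 58]
```

Drives 57 and 58 are reported as not permitted, which matches the LDEVs (413 and 414) that disappeared from the pairdisplay.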

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414) 75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials - and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - \\.\CMD-10111-4
\\.\CMD-10111-4

The above file is correct - let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the WSAerr line above it. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page - and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049): Cannot assign requested address.

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.

So the error is obvious when you know where to look. The problem is that not many people know where to look.
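If you do not have winsock2.h to hand, the arithmetic is easy to reproduce. The sketch below is an illustration only: the table holds just the two offsets quoted from winsock2.h in this document (49 and 13), and describe_wsa_error is a hypothetical helper name, not a CCI or Windows function.

```python
WSABASEERR = 10000

# Offsets taken from the winsock2.h excerpts quoted in this document.
WSA_NAMES = {
    13: "WSAEACCES",
    49: "WSAEADDRNOTAVAIL",
}


def describe_wsa_error(code):
    """Return 'NAME (code, 0xhex)' for a Winsock error code from the log."""
    name = WSA_NAMES.get(code - WSABASEERR, "unknown")
    return f"{name} ({code}, 0x{code:08x})"


print(describe_wsa_error(10049))   # prints: WSAEADDRNOTAVAIL (10049, 0x00002741)
```

The hexadecimal value is the same one CCI prints in brackets after the WSAerr number, so either form can be used for the lookup.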

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmcc
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

Platform       EADDRNOTAVAIL   Description
AIX 4351       68              Can't assign requested address
HP-UX 1122     227             Can't assign requested address
Solaris 910    126             Can't assign requested address

Once again, this is not the most intuitive error I have seen.

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to \\.\CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device \\.\CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(\\.\CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here, I think, it is pretty obvious what the problem is.

3. Invalid service name

Change 11042 to "horcm42":

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is fairly obvious what is wrong.

4. UDP port which is in use

Change 11042 to 1030. This is not a "sensible" port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr : 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013): Permission denied.

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State
UDP ml_acer510:microsoft-ds
UDP ml_acer510:isakmp
UDP ml_acer510:1030
...
UDP ml_acer510:54323

It is not likely that you will be this lucky.
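If the customer has a scripting language on the server, a quick probe can answer the question without reading netstat output. This is a sketch under assumptions, not a CCI tool: udp_port_is_free is a hypothetical helper, it only tests whether a bind succeeds at that moment, and the address and port in the last line are placeholders to be replaced with the values from the HORCM_MON section.

```python
import socket


def udp_port_is_free(ip_address, port):
    """Return True if a UDP socket can be bound to ip_address:port right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip_address, port))
        return True
    except OSError:
        # Covers both "port in use" and "address not on this machine".
        return False
    finally:
        sock.close()


# Placeholder values: substitute the ip_address and service from HORCM_MON.
print(udp_port_is_free("127.0.0.1", 11042))
```

Because the same bind call fails for a wrong IP address and for a port that is already taken, a False result here corresponds to either problem 1 or problem 4 above. Run it while HORCM is stopped, otherwise HORCM itself will be holding the port.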

Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008

Page 13: How to Debug CCI Issues 1.3

Check: Always set full logging if possible. This was introduced with 01-17-03/05 - but it is disabled by default. The environment variable is HORCC_LOGSZ. If this environment variable is not set, only errors are logged. With this variable set, successful commands are logged as well - very useful if you need to know what was typed and when.

However, only the input is logged, not the output. So always cut and paste the entire Command Prompt session and send that to GSC as well.

Check: Can the user reproduce this problem at will? If so, get them to stop CCI, delete the LOGx directories, and then start CCI and issue the command that fails. This will make reading the LOGx files much easier, as the only messages in the logs will be the ones you want to look at.

Windows

TSTART.BAT - BAT file to start CCI and set the correct options for TC

echo off
rem
rem Batch file to start HORCM for TrueCopy operations
rem
rem turn on CCI logging for 01-17-03/05 or later
set HORCC_LOGSZ=2048
rem
raidscan -x findcmddev h020
set horcmfctbl=2
rem set instance to match your naming convention for the PVOL instance
set horcminst=0
rem next line with a value for SI only
set horcc_mrcf=
horcmstart 0 1

TSTOP.BAT - BAT file to stop CCI

echo off
rem
rem Batch file to stop HORCM after TrueCopy operations
rem
horcmshutdown 0 1
set horcmfctbl=
set horcminst=
set horcc_mrcf=
set HORCC_LOGSZ=

UNIX

Check: Always ask the user to "cut and paste" the command line input and output - you need to know what they typed and what the result was.

root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmstart.sh 4
starting HORCM inst 4
HORCM inst 4 starts successfully.
root@SYD-E250-1:/opt/HORCM/log4/curlog# export HORCC_LOGSZ=2048
root@SYD-E250-1:/opt/HORCM/log4/curlog# raidscan -p CL1-A
...
root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
inst 4:
HORCM Shutdown inst 4 !!!
root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
root@SYD-E250-1:/opt/HORCM/log4# ls -al
total 10
drwxr-xr-x 4 root other 512 Mar 7 16:50 .
dr-xr-xr-x 12 root sys 512 Feb 22 15:04 ..
drwxr-xr-x 3 root other 512 Mar 7 16:49 curlog
-rw-r--r-- 1 root other 289 Mar 7 16:51 horcc_SYD-E250-1.log
drwxr-xr-x 3 root other 512 Mar 7 16:29 tmplog

Here are the contents of LOG file horcc_SYD-E250-1.log:

COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
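When a customer sends a large horcc log, a short script can list what was typed and how each command ended. The sketch below is illustrative only. It assumes each command is logged as a "CMDLINE : ..." line followed later by a line containing "[exit(N)]", as in the two entries above; the colon separators in SAMPLE_LOG are an assumption about the layout, and summarize is a hypothetical helper name.

```python
import re

# Sample text modelled on the horcc_SYD-E250-1.log entries shown above.
SAMPLE_LOG = """\
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:50:36 2007
CMDLINE : raidscan -p CL1-A
16:50:37-450c6-11368- [raidscan][exit(0)]
COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar 7 16:51:53 2007
CMDLINE : /usr/bin/horcctl -S
16:51:54-0f8cf-11376- [horcctl][exit(0)]
"""


def summarize(log_text):
    """Pair every CMDLINE entry with the exit code logged after it."""
    results = []
    command = None
    for line in log_text.splitlines():
        if line.startswith("CMDLINE"):
            command = line.split(":", 1)[1].strip()
        else:
            match = re.search(r"\[exit\((\d+)\)\]", line)
            if match and command is not None:
                results.append((command, int(match.group(1))))
                command = None
    return results


for command, code in summarize(SAMPLE_LOG):
    print(code, command)
```

A non-zero exit code in the first column is the place to start reading. Remember that only the input is logged, so you still need the cut-and-pasted session for the output.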

Command Device Reject

Most CCI errors are self-explanatory - however, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example:

HORCM8.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11008 1000 3000
HORCM_CMD
\\.\CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 7 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11009

HORCM9.CONF

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
localhost 11009 1000 3000
HORCM_CMD
\\.\CMD-977-5
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
VG01 LDEV49 CL1-A-1 1 8 0
HORCM_INST
#dev_group ip_address service
VG01 localhost 11008

Check: Is the user using "good syntax"?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax - where is the cross-check? Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
PORT# /ALPA/C,TID#,LU#.Num(LDEV#....)...P/S, Status,LDEV#,P-Seq#,P-LDEV#
CL1-A-1 ef 5 1 0-0 .1(13) S-VOL PAIR 13 ----- 10
CL1-A-1 ef 5 1 1-0 .1(29) P-VOL PSUS 29 977 309
CL1-A-1 ef 5 1 2-0 .1(48) P-VOL PSUS 48 977 300
CL1-A-1 ef 5 1 3-0 .1(309) S-VOL SSUS 309 ----- 29
CL1-A-1 ef 5 1 4-0 .1(310) S-VOL SSUS 310 ----- 29
CL1-A-1 ef 5 1 5-0 .1(308) S-VOL SSUS 308 ----- 24
CL1-A-1 ef 5 1 6-0 .1(305) S-VOL SSUS 305 ----- 1
CL1-A-1 ef 5 1 7-0 .1(49) SMPL ---- ----- ----- -----
CL1-A-1 ef 5 1 8-0 .1(50) SMPL ---- ----- ----- -----

C:\HORCM\ETC>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU-M),Seq#,LDEV#.P/S,Status,Seq#,P-LDEV# M
VG01 LDEV49(L) (CL1-A-1 1 7-0) 977 49 SMPL --------- ----- -
VG01 LDEV49(R) (CL1-A-1 1 8-0) 977 50 SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1 - LUNs 7 and 8).

But now the failure:

C:\HORCM\ETC>paircreate -g VG01 -vl
paircreate: [EX_CMDRJE] An order to the control/command device was rejected
Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

COMMAND ERROR : EUserId for HOMRCF[8] : Administrator (0) Wed Mar 07 17:02:30 2007
CMDLINE : paircreate -g VG01 -vl
17:02:30-9a8a8-12452- ERROR:cm_sndrcv[rc < 0 from HORCM]
17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
17:02:30-9e728-12452- [paircreate][exit(221)]
[EX_CMDRJE] An order to the control/command device was rejected
[Cause ]: An order to the command(control) device failed, or was rejected.
[Action]: Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
(1) Check if the HORC or HOMRCF function is installed in the RAID.
(2) Check if the RCP and LCP are installed in the RAID.
(3) Check if the path between the RAID CUs is established by using the SVP.
(4) Check if the pair target volume is an appropriate status.

Yes, meaningless error message numbers like -35 and 221. If this is a RAID subsystem, check the SSBLOGS on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

17:02:30-9a8a8-14140- SCSI Check Condition
17:02:30-9a8a8-14140- SCSI SENSE DATA
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p......8........
[0x0012f2c4]0010 00001000 00000000 00000000 00000000 ................
[0x0012f2d4]0020 00000000 00000000 00000000 00000000 ................
[0x0012f2e4]0030 00000000 00000000 00000000 00000000 ................
[0x0012f2f4]0040 00000000 00000000 00000000 00000000 ................
[0x0012f304]0050 00000000 00000000 00000000 00000000 ................
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

17:02:30-9a8a8 is the cross-check: it is the same time stamp and ID as the cm_sndrcv error in the command log. Next, it is not obvious, but the error code is:

961C 000D

Now get hold of the latest AMS CCI manual, which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes
and this subsection:
A.4.4 Sense Code and Detail Code

Beware - some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
Sense code | Detailed code | Error Contents | Recommended Action
961C | 000C | The S-VOL is a Sub LU of a unified LU. | Check the status of the LU.
961C | 000D | The default controllers controlling the P-VOL and S-VOL are not the same. | ...
961C | 000E | The P-VOL is a Cache Residency LU. | Check the status of ...

In this case, the PVOL and SVOL default controllers are not the same.
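Since it is "not obvious" where 961C 000D comes from, here is a small sketch of the extraction. It assumes the dump is standard fixed-format SCSI sense data (sense key in the low nibble of byte 2, ASC and ASCQ in bytes 12 and 13), that the SSB is the four bytes at offsets 8-11, and that the detailed code is the low 16 bits of the SSB. The last two points are inferred from this one example rather than taken from a manual, and decode_sense is a hypothetical helper name.

```python
def decode_sense(hex_words):
    """Decode the first row of a sense data dump given as 32-bit hex words."""
    data = bytes.fromhex("".join(hex_words))
    sense_key = data[2] & 0x0F                   # byte 2, low nibble (SKEY)
    ssb = int.from_bytes(data[8:12], "big")      # bytes 8-11, logged as SSB
    asc, ascq = data[12], data[13]               # bytes 12 and 13
    sense_code = f"{asc:02X}{ascq:02X}"
    detail_code = f"{ssb & 0xFFFF:04X}"          # assumption: low 16 bits of SSB
    return sense_key, sense_code, detail_code


# First row of the dump in horcm_log.txt
first_row = ["70000500", "00000038", "8400000d", "961c0000"]
print(decode_sense(first_row))   # prints: (5, '961C', '000D')
```

The three values agree with the SKEY, ASC and SSB lines that CCI prints under the dump, and the last two are the pair to look up in the Sense Codes table.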

"Old Syntax" HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the "real" port. With 9900V, USP etc. the LUNs are normally considered to be attached to "logical" ports - which are called HSD or Host Groups.

However, it is still possible to use the "old" syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSD. Here is an example.

Imagine that 3 HSD are created on an empty port - HSD 1, 2 and 3. Each HSD has 3 LUNs added - numbered as 0, 1 and 2.

If this is done in sequence, HSD 1 has "absolute" LUNs 0-2, HSD 2 has "absolute" LUNs 3-5 and HSD 3 has "absolute" LUNs 6-8.

Now imagine that the following actions have been performed some time later: delete HSD 2, then add HSD 4 with LUNs 0 and 1.

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to "guess" that HSD 1 LUN 3 was "absolute" LUN 5 and HSD 3 LUN 3 was "absolute" LUN 9.
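The numbers in this example can be reproduced with a few lines of code. This is a model only: it assumes that each new LUN on a port receives the lowest free "absolute" LUN number, which is consistent with the figures above but is my assumption about the subsystem, and allocate and delete_hsd are hypothetical helper names.

```python
def allocate(absolute_map, hsd, relative_luns):
    """Give each relative LUN of an HSD the lowest free absolute LUN number."""
    used = set(absolute_map.values())
    for rel in relative_luns:
        absolute = 0
        while absolute in used:
            absolute += 1
        absolute_map[(hsd, rel)] = absolute
        used.add(absolute)


def delete_hsd(absolute_map, hsd):
    """Remove every LUN of an HSD, freeing its absolute LUN numbers."""
    for key in [k for k in absolute_map if k[0] == hsd]:
        del absolute_map[key]


port = {}                              # (HSD, relative LUN) -> absolute LUN
for hsd in (1, 2, 3):
    allocate(port, hsd, [0, 1, 2])     # absolute LUNs 0-2, 3-5 and 6-8
delete_hsd(port, 2)                    # frees absolute LUNs 3-5
allocate(port, 4, [0, 1])              # HSD 4 takes absolute LUNs 3 and 4
allocate(port, 1, [3])                 # HSD 1 LUN 3
allocate(port, 3, [3])                 # HSD 3 LUN 3

print(port[(1, 3)], port[(3, 3)])      # prints: 5 9
```

The point is that the result depends on the whole history of the port, which is exactly what you do not have when you are looking at a customer's configuration.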

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows "relative" LUN numbers.

In a recent case, 47 S-VOL LUNs were deleted by mistake from a HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the "same order". However, a subsequent pairdisplay showed the following:

TC-WRP 1003-108A(L) (CL2-F 0 45) 32179 10b5 S-VOL PAIR ASYNC 0 102e
TC-WRP 1003-108A(R) (CL1-C 0 4) 32208 1003 P-VOL PAIR ASYNC 0 108a - (1)
TC-WRP 1004-108B(L) (CL2-F 0 46) 32179 --------- ---- ----------- ----- -
TC-WRP 1004-108B(R) (CL1-C 0 5) 32208 1004 P-VOL PAIR ASYNC 0 108b -
TC-WRP 1005-108C(L) (CL2-F 0 47) 32179 --------- ---- ----------- ----- -
TC-WRP 1005-108C(R) (CL1-C 0 6) 32208 1005 P-VOL PAIR ASYNC 0 108c -
TC-WRP 1006-108D(L) (CL2-F 0 48) 32179 --------- ---- ----------- ----- -
TC-WRP 1006-108D(R) (CL1-C 0 7) 32208 1006 P-VOL PAIR ASYNC 0 108d -
TC-WRP 1007-108E(L) (CL2-F 0 49) 32179 108a S-VOL PAIR ASYNC 0 1003 - (2)
TC-WRP 1007-108E(R) (CL1-C 0 8) 32208 1007 P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the "DR" CCI server - as (L) refers to the S-VOL. Next, we have obvious mismatches (highlighted in yellow in the original). What is less obvious is that the pairs highlighted in turquoise and green in the original, the entries marked (1) to (3), are also invalid. Indeed:

(3) P-VOL is 1007 and the associated S-VOL is 108E.
(2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
(1) This is the associated P-VOL for (2).
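In this customer's files the pair name itself encodes the intended LDEVs (1003-108A means P-VOL 1003 and S-VOL 108A), so the mismatch can be checked mechanically. The sketch below relies on that site-specific naming convention, which is an assumption that will not hold for every customer, and check_row is a hypothetical helper name.

```python
def check_row(pair_vol, side, ldev, partner_ldev):
    """Compare one pairdisplay row with the LDEVs encoded in its pair name.

    pair_vol is a name such as '1003-108A' (P-VOL LDEV, then S-VOL LDEV);
    side is 'P' if the row reports a P-VOL and 'S' if it reports an S-VOL.
    """
    pvol, svol = (part.lower() for part in pair_vol.split("-"))
    expected = (pvol, svol) if side == "P" else (svol, pvol)
    return (ldev.lower(), partner_ldev.lower()) == expected


rows = [
    ("1003-108A", "S", "10b5", "102e"),   # first (L) row of the display
    ("1003-108A", "P", "1003", "108a"),   # row marked (1)
    ("1007-108E", "S", "108a", "1003"),   # row marked (2)
    ("1007-108E", "P", "1007", "108e"),   # row marked (3)
]
for row in rows:
    print(row[0], row[1], "ok" if check_row(*row) else "MISMATCH")
```

Both S-VOL rows are flagged: the (L) side of 1003-108A points at LDEV 10b5, and the (L) side of 1007-108E is really the S-VOL that belongs to P-VOL 1003.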

Here is an excerpt from the "old" HORCM CONF file - using "absolute" LUN numbers:

TC-WRP 1003-108A CL2-F 0 45
TC-WRP 1004-108B CL2-F 0 46
TC-WRP 1005-108C CL2-F 0 47
TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax:

TC-WRP 1003-108A CL2-F-2 0 6
TC-WRP 1004-108B CL2-F-2 0 7
TC-WRP 1005-108C CL2-F-2 0 8
TC-WRP 1006-108D CL2-F-2 0 9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the "absolute" and "relative" LUN numbers:

raidscan -p CL2-F -fx
CL2-F 88 3 0 49.1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50.1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51.1(108c) S-VOL PAIR ASYNC 108c ----- 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6.1(108a) S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7.1(108b) S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8.1(108c) S-VOL PAIR ASYNC 108c ----- 1005

Secured CMDDEV and HORCMPERM Implications

If you use a "normal", i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a "normal" CMDDEV - and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = \\.\PHYSICALDRIVE1

This is a "normal" CMDDEV. For this test I also had access to a secured CMDDEV - and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(\\.\PHYSICALDRIVE1 -> \\.\PHYSICALDRIVE10
C:\HORCM\ETC>horcctl -D
Current control device = \\.\PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = \\.\PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 410 S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 411 S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 412 S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D 1 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413) 75010010 413 S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D 1 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414) 75010010 414 S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414) 75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands, such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives) - thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>

HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413) 77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A 1 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414) 77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A 1 414) 75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58 - and, as we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.

Note that it is possible to "fix" this "mistake" by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410) 77010027 410 P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410) 75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411) 77010027 411 P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411) 75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412) 77010027 412 P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412) 75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413) 77010027 413 P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413) 75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414) 77010027 414 P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414) 75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

"Basic" HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials - and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - \\.\CMD-10111-4
\\.\CMD-10111-4

The above file is correct - let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall: bind
WSAerr : 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the "Internal Error" that confuses most people here. The real error is in the WSAerr line above it. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page - and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049): Cannot assign requested address.

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.

So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX errors messages are not only different they are different on each platform Here is the same error for Solaris

[System Call Error]SysCall bindErrorno 126 (Cannot assign requested address)ErrInfo Internal ErrorErrTime Tue Sep 2 114540 2008SrcFile shorcmccSrcLine 2427

ERRORcmr_repcre[scmcrepcr fail]

Here is a useful web page

httpwwwioplexcom~miallenerrcmpphtml

The relevant line for this error says

AIX 4351 HP-UX 1122 Solaris 910

EADDRNOTAVAIL 68 Cant assign requested address

227 Cant assign requested address

126 Cant assign requested address

Once again this is not the most intuitive error I have seen

2 Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42

125223-16b48-04004- horcread()cannot open command deviceCMD-10111-42125223-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command125223-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD125223-16b48-04004- ERRORhorcm_cfg_create125228-0b3b0-01136- horcmgrFailed to connect to HORCM

Here I think it is pretty obvious what the problem is

Mike Le Voi Page 22 11042023

How To Debug CCI Issues ndash Version 13

3 Invalid service name

Change 11042 to ldquohorcm42rdquo

172902-d59f8-02260- [horcmcfgrdf] open(conf_file) OK172902-d59f8-02260- ERROR A wrong ipaddr or servicename line exists in HORCM_MON line 4172902-d59f8-02260- 101293127 horcm42 1000 3000172902-d59f8-02260- [horcmcfgrdf] close(conf_file) OK172902-d59f8-02260- ERRORhorcm_cfg_create

Once again it is more obvious what is wrong

4 UDP port which is in use

Change 11042 to 1030 This is not a ldquosensiblerdquo port number It was chosen to cause an error

[System Call Error]SysCall bindWSAerr 10013(0x0000271d) (See winsock2h)ErrInfo Internal ErrorErrTime Mon Sep 08 173946 2008SrcFile shorcmccSrcLine 2405

ERRORcmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2h

define WSAEACCES (WSABASEERR+13)

The following web page has more information

httpwwwsocketscomerr_lst1htm

WSAEACCES (10013) Permission denied

Berkeley description An attempt was made to access a file in a way forbidden by its file access permissions

However in this case that is hardly descriptive of the problem Of course if one had access to a command prompt one could do this

CHORCMETCgtnetstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State UDP ml_acer510microsoft-ds UDP ml_acer510isakmp UDP ml_acer5101030 hellip UDP ml_acer51054323

It is not likely that you will be this lucky

Mike Le Voi Page 23 11042023

How To Debug CCI Issues ndash Version 13

Comments

This is a work in progress If you would like to see anything else let me know

Mike Le VoiSoftware Technical SpecialistAPAC Global Support Centre8th September 2008

Mike Le Voi Page 24 11042023


    …
    root@SYD-E250-1:/opt/HORCM/log4/curlog# horcmshutdown.sh 4
    inst 4:
    HORCM Shutdown inst 4 !!!

    root@SYD-E250-1:/opt/HORCM/log4/curlog# cd ..
    root@SYD-E250-1:/opt/HORCM/log4# ls -al
    total 10
    drwxr-xr-x   4 root  other  512 Mar  7 16:50 .
    dr-xr-xr-x  12 root  sys    512 Feb 22 15:04 ..
    drwxr-xr-x   3 root  other  512 Mar  7 16:49 curlog
    -rw-r--r--   1 root  other  289 Mar  7 16:51 horcc_SYD-E250-1.log
    drwxr-xr-x   3 root  other  512 Mar  7 16:29 tmplog

Here are the contents of the log file horcc_SYD-E250-1.log:

    COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar  7 16:50:36 2007
    CMDLINE : raidscan -p CL1-A
    16:50:37-450c6-11368- [raidscan][exit(0)]
    COMMAND NORMAL : EUserId for HORC[4] : root (0) Wed Mar  7 16:51:53 2007
    CMDLINE : /usr/bin/horcctl -S
    16:51:54-0f8cf-11376- [horcctl][exit(0)]

Command Device Reject

Most CCI errors are self-explanatory – however, this one is usually impossible for the user to debug.

Here is a simple ShadowImage example.

HORCM8.CONF

    HORCM_MON
    #ip_address   service   poll(10ms)   timeout(10ms)
    localhost     11008     1000         3000

    HORCM_CMD
    \\.\CMD-977-5

    HORCM_DEV
    #dev_group   dev_name   port#     TargetID   LU#   MU#
    VG01         LDEV49     CL1-A-1   1          7     0

    HORCM_INST
    #dev_group   ip_address   service
    VG01         localhost    11009

HORCM9.CONF

    HORCM_MON
    #ip_address   service   poll(10ms)   timeout(10ms)
    localhost     11009     1000         3000

    HORCM_CMD
    \\.\CMD-977-5

    HORCM_DEV
    #dev_group   dev_name   port#     TargetID   LU#   MU#
    VG01         LDEV49     CL1-A-1   1          8     0

    HORCM_INST
    #dev_group   ip_address   service
    VG01         localhost    11008
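A quick sanity check on a pair of CONF files like these is that each instance's HORCM_INST service points at the other instance's HORCM_MON service. The following is only a minimal Python sketch of that cross-check, not part of CCI. The function names are made up for illustration, and the parser assumes the simple layout shown above (section names starting with HORCM_, comment lines starting with #).

```python
# Minimal sketch: cross-check two HORCM CONF files against each other.
# parse_horcm_conf() and instances_point_at_each_other() are hypothetical
# helpers written for this note; they are not CCI commands or APIs.

HORCM8_CONF = r"""
HORCM_MON
#ip_address   service   poll(10ms)   timeout(10ms)
localhost     11008     1000         3000
HORCM_CMD
\\.\CMD-977-5
HORCM_DEV
#dev_group   dev_name   port#     TargetID   LU#   MU#
VG01         LDEV49     CL1-A-1   1          7     0
HORCM_INST
#dev_group   ip_address   service
VG01         localhost    11009
"""

HORCM9_CONF = r"""
HORCM_MON
#ip_address   service   poll(10ms)   timeout(10ms)
localhost     11009     1000         3000
HORCM_CMD
\\.\CMD-977-5
HORCM_DEV
#dev_group   dev_name   port#     TargetID   LU#   MU#
VG01         LDEV49     CL1-A-1   1          8     0
HORCM_INST
#dev_group   ip_address   service
VG01         localhost    11008
"""


def parse_horcm_conf(text):
    """Return {section name: [list of whitespace-split data rows]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        if line.startswith("HORCM_"):
            current = line
            sections[current] = []
        elif current is not None:
            sections[current].append(line.split())
    return sections


def instances_point_at_each_other(conf_a, conf_b):
    """True if each file's HORCM_INST service is the other's HORCM_MON service."""
    a = parse_horcm_conf(conf_a)
    b = parse_horcm_conf(conf_b)
    a_service = a["HORCM_MON"][0][1]
    b_service = b["HORCM_MON"][0][1]
    a_remote = {row[2] for row in a["HORCM_INST"]}
    b_remote = {row[2] for row in b["HORCM_INST"]}
    return a_remote == {b_service} and b_remote == {a_service}


print(instances_point_at_each_other(HORCM8_CONF, HORCM9_CONF))
```

For the two files above the check passes, which is consistent with the rest of this example: the CONF files themselves are fine and the problem lies elsewhere.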


Check: Is the user using “good syntax”?

Even though this is a 9500V, users should always use Port-HSD-LUN syntax. I strongly recommend not using Port-LDEV syntax – where is the cross-check? Is MU# specified for ShadowImage? On some levels of CCI this is mandatory.

However, you should specify it anyway, as this is Best Practice.

How to check if the HORCM CONF files are correct:

    C:\HORCM\ETC>raidscan -p CL1-A-1 -m 0
    PORT#    ALPA  C  TID#  LU#  Num(LDEV#)  P/S    Status  LDEV#  P-Seq#  P-LDEV#
    CL1-A-1  ef    5  1     0-0  1(13)       S-VOL  PAIR    13     -----   10
    CL1-A-1  ef    5  1     1-0  1(29)       P-VOL  PSUS    29     977     309
    CL1-A-1  ef    5  1     2-0  1(48)       P-VOL  PSUS    48     977     300
    CL1-A-1  ef    5  1     3-0  1(309)      S-VOL  SSUS    309    -----   29
    CL1-A-1  ef    5  1     4-0  1(310)      S-VOL  SSUS    310    -----   29
    CL1-A-1  ef    5  1     5-0  1(308)      S-VOL  SSUS    308    -----   24
    CL1-A-1  ef    5  1     6-0  1(305)      S-VOL  SSUS    305    -----   1
    CL1-A-1  ef    5  1     7-0  1(49)       SMPL   ----    -----  -----   -----
    CL1-A-1  ef    5  1     8-0  1(50)       SMPL   ----    -----  -----   -----

    C:\HORCM\ETC>pairdisplay -g VG01
    Group   PairVol(L/R) (Port#,TID, LU-M) ,Seq#,LDEV#.P/S,Status, Seq#,P-LDEV# M
    VG01    LDEV49(L)    (CL1-A-1 , 1, 7-0 )  977    49.SMPL --------- ----- -
    VG01    LDEV49(R)    (CL1-A-1 , 1, 8-0 )  977    50.SMPL --------- ----- -

Check the P-VOL and S-VOL LDEV numbers (in this case 49 and 50). Check the Port/HSD/LUN (in this case CL1-A-1 – LUNs 7 and 8).

But now the failure:

    C:\HORCM\ETC>paircreate -g VG01 -vl
    paircreate: [EX_CMDRJE] An order to the control/command device was rejected
    Refer to the command log(C:\HORCM\log8\horcc_hp2k5_log.txt) for details.

And in the log we see this:

    COMMAND ERROR : EUserId for HOMRCF[8] : Administrator (0) Wed Mar 07 17:02:30 2007
    CMDLINE : paircreate -g VG01 -vl
    17:02:30-9a8a8-12452- ERROR:cm_sndrcv[rc < 0 from HORCM]
    17:02:30-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35
    17:02:30-9e728-12452- [paircreate][exit(221)]
    [EX_CMDRJE] An order to the control/command device was rejected
    [Cause ]: An order to the command(control) device failed, or was rejected.
    [Action]: Please confirm the following items. If this trouble doesn't resolve, then collect HORCM error logs(HORCM_LOG=C:\HORCM\log8\curlog) and Remote HORCM logs, and send them to service personnel.
    (1) Check if the HORC or HOMRCF function is installed in the RAID.
    (2) Check if the RCP and LCP are installed in the RAID.
    (3) Check if the path between the RAID CUs is established by using the SVP.
    (4) Check if the pair target volume is an appropriate status.

Yes, meaningless error numbers like -35 and 221. If this is a RAID subsystem, check the SSB logs on the SVP. However, for DF the SSB is logged in CCI.

Contents of C:\HORCM\log8\curlog\horcmlog_servername\horcm_log.txt:

    17:02:30-9a8a8-14140- SCSI Check Condition
    17:02:30-9a8a8-14140- SCSI SENSE DATA
    ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
    [0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p......8........
    [0x0012f2c4]0010 00001000 00000000 00000000 00000000 ................
    [0x0012f2d4]0020 00000000 00000000 00000000 00000000 ................
    [0x0012f2e4]0030 00000000 00000000 00000000 00000000 ................
    [0x0012f2f4]0040 00000000 00000000 00000000 00000000 ................
    [0x0012f304]0050 00000000 00000000 00000000 00000000 ................
    17:02:30-9a8a8-14140- SKEY = 0x05
    17:02:30-9a8a8-14140- ASC = 0x96
    17:02:30-9a8a8-14140- SSB = 0x8400000d

17:02:30-9a8a8 is the cross-check against the command log. Next, it is not obvious, but the error code is:

    961C 000D
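Where does 961C 000D come from? Looking at the dump, 96 1c sits at offsets 0x0C and 0x0D of the sense data (the ASC/ASCQ position in fixed-format SCSI sense data), and 000d is the low-order half of the SSB value. That reading is inferred from this one log rather than taken from the manual, so treat the Python sketch below as an assumption. The function name decode_df_error is invented for this note.

```python
# Minimal sketch: pull the "sense code / detail code" pair out of the
# first row of the SCSI sense dump and the SSB value logged by CCI.
# Assumption (inferred from this log only): sense code = ASC+ASCQ bytes,
# detail code = low 16 bits of the SSB.

FIRST_ROW = ["70000500", "00000038", "8400000d", "961c0000"]  # offset 0000 row
SSB = 0x8400000D


def decode_df_error(first_row_words, ssb):
    """Return (sense_key, sense_code, detail_code) as (int, str, str)."""
    sense = bytes.fromhex("".join(first_row_words))
    sense_key = sense[2] & 0x0F          # byte 2, low nibble
    asc, ascq = sense[12], sense[13]     # bytes 12 and 13
    sense_code = f"{asc:02X}{ascq:02X}"
    detail_code = f"{ssb & 0xFFFF:04X}"  # low-order 16 bits of the SSB
    return sense_key, sense_code, detail_code


print(decode_df_error(FIRST_ROW, SSB))
```

The sense key comes out as 5, which agrees with the SKEY = 0x05 line in the log, and the other two values give the 961C 000D pair used for the manual lookup.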

Now get hold of the latest AMS CCI manual which contains Appendix A.4:

    A.4 How to Read Detailed Error Log Codes

and this subsection:

    A.4.4 Sense Code and Detail Code

Beware – some versions of this manual do not contain these sections. Find one that does.

    Table A.5 Sense Codes and Detailed Codes
    …
    Code        Error Contents                                                             Recommended Action
    961C 000C   The S-VOL is a Sub LU of a unified LU.                                     Check the status of the LU.
    961C 000D   The default controllers controlling the P-VOL and S-VOL are not the same.  …
    961C 000E   The P-VOL is a Cache Residency LU.                                         Check the status of …

In this case the P-VOL and S-VOL default controllers are not the same.

“Old Syntax” HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the “real” port. With 9900V, USP etc. the LUNs are normally considered to be attached to “logical” ports – which are called HSD or Host Groups.

However, it is still possible to use the “old” syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSD. Here is an example.

Imagine that 3 HSD are created on an empty port – HSD 1, 2 and 3. Each HSD has 3 LUNs added – numbered 0, 1 and 2.

If this is done in sequence, HSD 1 has “absolute” LUNs 0-2, HSD 2 has “absolute” LUNs 3-5 and HSD 3 has “absolute” LUNs 6-8.

Now imagine that the following actions have been performed some time later:

    Delete HSD 2
    Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to “guess” that:

    HSD 1 LUN 3 was “absolute” LUN 5
    HSD 3 LUN 3 was “absolute” LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows “relative” LUN numbers.
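The numbers in this example can be reproduced with a small simulation. The Python sketch below assumes that each new LUN takes the lowest free “absolute” number on the port; that rule is inferred from the example itself and is not documented behaviour. The Port class is purely illustrative.

```python
# Minimal sketch: replay the HSD example, assuming (not documented!) that
# every LUN added to a port gets the lowest free "absolute" LUN number.

class Port:
    def __init__(self):
        self.absolute = {}  # absolute LUN -> (HSD number, relative LUN)

    def add_lun(self, hsd, relative_lun):
        """Allocate the lowest free absolute LUN and return it."""
        number = 0
        while number in self.absolute:
            number += 1
        self.absolute[number] = (hsd, relative_lun)
        return number

    def delete_hsd(self, hsd):
        """Free every absolute LUN that belongs to the given HSD."""
        self.absolute = {n: owner for n, owner in self.absolute.items()
                         if owner[0] != hsd}


port = Port()
for hsd in (1, 2, 3):            # HSD 1, 2 and 3, each with LUNs 0, 1 and 2
    for lun in (0, 1, 2):
        port.add_lun(hsd, lun)

port.delete_hsd(2)               # some time later: delete HSD 2 ...
port.add_lun(4, 0)               # ... and add HSD 4 with LUNs 0 and 1
port.add_lun(4, 1)

lun_hsd1 = port.add_lun(1, 3)    # allocate LUN 3 to HSD 1
lun_hsd3 = port.add_lun(3, 3)    # allocate LUN 3 to HSD 3
print(lun_hsd1, lun_hsd3)
```

Under that assumption HSD 1 LUN 3 lands on absolute LUN 5 (the last slot freed by HSD 2) and HSD 3 LUN 3 lands on absolute LUN 9, matching the text.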

In a recent case, 47 S-VOL LUNs were deleted by mistake from a HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the “same order”. However, a subsequent pairdisplay showed the following:

    TC-WRP  1003-108A(L) (CL2-F , 0, 45)32179  10b5.S-VOL PAIR ASYNC ,0  102e
    TC-WRP  1003-108A(R) (CL1-C , 0,  4)32208  1003.P-VOL PAIR ASYNC ,0  108a -   (1)
    TC-WRP  1004-108B(L) (CL2-F , 0, 46)32179  --------- ---- ----------- ----- -
    TC-WRP  1004-108B(R) (CL1-C , 0,  5)32208  1004.P-VOL PAIR ASYNC ,0  108b -
    TC-WRP  1005-108C(L) (CL2-F , 0, 47)32179  --------- ---- ----------- ----- -
    TC-WRP  1005-108C(R) (CL1-C , 0,  6)32208  1005.P-VOL PAIR ASYNC ,0  108c -
    TC-WRP  1006-108D(L) (CL2-F , 0, 48)32179  --------- ---- ----------- ----- -
    TC-WRP  1006-108D(R) (CL1-C , 0,  7)32208  1006.P-VOL PAIR ASYNC ,0  108d -
    TC-WRP  1007-108E(L) (CL2-F , 0, 49)32179  108a.S-VOL PAIR ASYNC ,0  1003 -   (2)
    TC-WRP  1007-108E(R) (CL1-C , 0,  8)32208  1007.P-VOL PAIR ASYNC ,0  108e -   (3)

What can you tell from the display above? Firstly, the pairdisplay was issued by the “DR” CCI server – as (L) refers to the S-VOL. Next, we have obvious mismatches – in yellow. What is less obvious is that the turquoise and green pairs are also invalid. Indeed:

    (3) P-VOL is 1007 and the associated S-VOL is 108E.
    (2) S-VOL is 108A and the associated P-VOL is 1003. This entry does not go with (3).
    (1) This is the associated P-VOL for (2).
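The check being done by eye here is: within one PairVol, the LDEV on the (L) line must be the partner LDEV on the (R) line, and vice versa. Below is a minimal Python sketch of that rule, using three of the pairs from the display above typed in by hand. The function name and row layout are made up for illustration; nothing here parses real pairdisplay output.

```python
# Minimal sketch: flag PairVols whose (L) and (R) lines do not describe
# the same pair. Rows are (PairVol, side, LDEV, partner LDEV), copied by
# hand from the display above; None stands for a "----" field.

ROWS = [
    ("1003-108A", "L", "10b5", "102e"),
    ("1003-108A", "R", "1003", "108a"),
    ("1004-108B", "L", None, None),
    ("1004-108B", "R", "1004", "108b"),
    ("1007-108E", "L", "108a", "1003"),
    ("1007-108E", "R", "1007", "108e"),
]


def mismatched_pairs(rows):
    """Return the PairVol names whose L and R lines are not each other's partner."""
    by_name = {}
    for name, side, ldev, partner in rows:
        by_name.setdefault(name, {})[side] = (ldev, partner)

    bad = []
    for name, sides in by_name.items():
        local, remote = sides.get("L"), sides.get("R")
        consistent = (
            local is not None
            and remote is not None
            and None not in local + remote
            and local[0] == remote[1]   # my LDEV is the other side's partner
            and local[1] == remote[0]   # my partner is the other side's LDEV
        )
        if not consistent:
            bad.append(name)
    return bad


print(mismatched_pairs(ROWS))
```

All three PairVols are flagged, including 1007-108E, where both lines show PAIR status and look healthy at first glance.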

Here is an excerpt from the “old” HORCM CONF file – using “absolute” LUN numbers:

    TC-WRP   1003-108A   CL2-F   0   45
    TC-WRP   1004-108B   CL2-F   0   46
    TC-WRP   1005-108C   CL2-F   0   47
    TC-WRP   1006-108D   CL2-F   0   48

And here is the same excerpt after the file has been changed to use HSD syntax:

    TC-WRP   1003-108A   CL2-F-2   0   6
    TC-WRP   1004-108B   CL2-F-2   0   7
    TC-WRP   1005-108C   CL2-F-2   0   8
    TC-WRP   1006-108D   CL2-F-2   0   9

As you can see, the new HORCM CONF file is easier to understand and to compare with Storage Navigator.

By the way, here is how you find out the “absolute” and “relative” LUN numbers. The fifth column is the LUN number: 49, 50 and 51 on the “real” port, 6, 7 and 8 in the HSD.

    raidscan -p CL2-F -fx
    CL2-F    88  3  0  49  1(108a)  S-VOL  PAIR  ASYNC  108a  -----  1003
    CL2-F    88  3  0  50  1(108b)  S-VOL  PAIR  ASYNC  108b  -----  1004
    CL2-F    88  3  0  51  1(108c)  S-VOL  PAIR  ASYNC  108c  -----  1005

    raidscan -p CL2-F-2 -fx
    CL2-F-2  88  3  0  6   1(108a)  S-VOL  PAIR  ASYNC  108a  -----  1003
    CL2-F-2  88  3  0  7   1(108b)  S-VOL  PAIR  ASYNC  108b  -----  1004
    CL2-F-2  88  3  0  8   1(108c)  S-VOL  PAIR  ASYNC  108c  -----  1005

Secured CMDDEV and HORCMPERM Implications

If you use a “normal”, i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone’s data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a “normal” CMDDEV – and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

    C:\HORCM\ETC>horcmstart 0
    starting HORCM inst 0
    HORCM inst 0 starts successfully.
    C:\HORCM\ETC>set horcminst=0
    C:\HORCM\ETC>horcctl -D
    Current control device = \\.\PHYSICALDRIVE1

This is a “normal” CMDDEV. For this test I also had access to a secured CMDDEV – and it is possible to swap between them as follows:

    C:\HORCM\ETC>horcctl -C
    Changed control device(\\.\PHYSICALDRIVE1 -> \\.\PHYSICALDRIVE10*
    C:\HORCM\ETC>horcctl -D
    Current control device = \\.\PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

    C:\HORCM\etc>set HORCMPROMOD=1
    C:\HORCM\ETC>horcmstart 410
    starting HORCM inst 410
    HORCM inst 410 starts successfully.
    C:\HORCM\ETC>set horcminst=410
    C:\HORCM\etc>horcctl -D
    Current control device = \\.\PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

    C:\HORCM\etc>pairdisplay -g VG01
    Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
    VG01    d0(L)        (CL2-D , 1, 410)77010027   410.P-VOL PAIR NEVER ,75010010   410 -
    VG01    d0(R)        (CL1-A , 1, 410)75010010   410.S-VOL PAIR NEVER ,-----      410 -
    VG01    d1(L)        (CL2-D , 1, 411)77010027   411.P-VOL PAIR NEVER ,75010010   411 -
    VG01    d1(R)        (CL1-A , 1, 411)75010010   411.S-VOL PAIR NEVER ,-----      411 -
    VG01    d2(L)        (CL2-D , 1, 412)77010027   412.P-VOL PAIR NEVER ,75010010   412 -
    VG01    d2(R)        (CL1-A , 1, 412)75010010   412.S-VOL PAIR NEVER ,-----      412 -
    VG01    d3(L)        (CL2-D , 1, 413)77010027   413.P-VOL PAIR NEVER ,75010010   413 -
    VG01    d3(R)        (CL1-A , 1, 413)75010010   413.S-VOL PAIR NEVER ,-----      413 -
    VG01    d4(L)        (CL2-D , 1, 414)77010027   414.P-VOL PAIR NEVER ,75010010   414 -
    VG01    d4(R)        (CL1-A , 1, 414)75010010   414.S-VOL PAIR NEVER ,-----      414 -

As you can see, LDEVs 410-414 on an AMS1000 (S/N begins with 770x) are paired with LDEVs 410-414 on an AMS500 (S/N begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

    C:\HORCM\etc>set HORCMPROMOD=1
    C:\HORCM\etc>horcmstart 410
    starting HORCM inst 410
    HORCM inst 410 starts successfully.
    C:\HORCM\ETC>set horcminst=410
    C:\HORCM\etc>pairdisplay -g VG01
    Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
    VG01    d0(L)        (CL2-D , 1, 410)77010027   410.P-VOL PAIR NEVER ,75010010   410 -
    VG01    d0(R)        (CL1-A , 1, 410)75010010  ---- ---- ----------- ----- -
    VG01    d1(L)        (CL2-D , 1, 411)77010027   411.P-VOL PAIR NEVER ,75010010   411 -
    VG01    d1(R)        (CL1-A , 1, 411)75010010  ---- ---- ----------- ----- -
    VG01    d2(L)        (CL2-D , 1, 412)77010027   412.P-VOL PAIR NEVER ,75010010   412 -
    VG01    d2(R)        (CL1-A , 1, 412)75010010  ---- ---- ----------- ----- -
    VG01    d3(L)        (CL2-D , 1, 413)77010027   413.P-VOL PAIR NEVER ,75010010   413 -
    VG01    d3(R)        (CL1-A , 1, 413)75010010  ---- ---- ----------- ----- -
    VG01    d4(L)        (CL2-D , 1, 414)77010027   414.P-VOL PAIR NEVER ,75010010   414 -
    VG01    d4(R)        (CL1-A , 1, 414)75010010  ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands such as pairsplit, the following will happen:

    C:\HORCM\etc>pairsplit -g VG01
    pairsplit: [EX_ENPERM] Permission denied with the LDEV
    Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let’s start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

    11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV.
    11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives) – thus allowing all LUNs on this server to be accessed.

Now let’s stop horcm and define a file as follows:

    C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
    hd0-56

    C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

    C:\HORCM\etc>pairdisplay -g VG01
    Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
    VG01    d0(L)        (CL2-D , 1, 410)77010027   410.P-VOL PAIR NEVER ,75010010   410 -
    VG01    d0(R)        (CL1-A , 1, 410)75010010  ---- ---- ----------- ----- -
    VG01    d1(L)        (CL2-D , 1, 411)77010027   411.P-VOL PAIR NEVER ,75010010   411 -
    VG01    d1(R)        (CL1-A , 1, 411)75010010  ---- ---- ----------- ----- -
    VG01    d2(L)        (CL2-D , 1, 412)77010027   412.P-VOL PAIR NEVER ,75010010   412 -
    VG01    d2(R)        (CL1-A , 1, 412)75010010  ---- ---- ----------- ----- -
    VG01    d3(L)        (CL2-D , 1, 413)77010027  ---- ---- ----------- ----- -
    VG01    d3(R)        (CL1-A , 1, 413)75010010  ---- ---- ----------- ----- -
    VG01    d4(L)        (CL2-D , 1, 414)77010027  ---- ---- ----------- ----- -
    VG01    d4(R)        (CL1-A , 1, 414)75010010  ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

    11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL) exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

    C:\HORCM\etc>inqraid $LETALL -CLI
    DEVICE_FILE      PORT    SERIAL     LDEV  CTG  H/M/12  SSID  R:Group  PRODUCT_ID
    E:\Vol13\Dsk54   CL2-D   77010027   410   -    P/s/ss  0000  A:07-00  DF600F
    F:\Vol14\Dsk55   CL2-D   77010027   411   -    P/s/ss  0000  A:07-00  DF600F
    Q:\Vol11\Dsk12   CL1-B   3157       169   -    P/s/ss  0000  5:02-00  DF600F
    G:\Vol15\Dsk56   CL2-D   77010027   412   -    P/s/ss  0000  A:07-00  DF600F
    R:\Vol12\Dsk13   CL1-B   3157       170   -    P/s/ss  0000  5:02-00  DF600F
    H:\Vol16\Dsk57   CL2-D   77010027   413   -    P/s/ss  0000  A:07-00  DF600F
    I:\Vol17\Dsk58   CL2-D   77010027   414   -    P/s/ss  0000  A:07-00  DF600F
    J:\Vol2\Dsk0     -       -          -     -    -       -     -        ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58 – and as we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
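The same reasoning can be written down as a small check: expand the hd range from the HORCMPERM file and see which disks from the inqraid output fall outside it. This Python sketch only understands the single hdN or hdN-M form used in this example; any other syntax the file may accept is not handled, and the helper name allowed_disks is invented here.

```python
# Minimal sketch: which LDEVs drop out of sight for a given HORCMPERM
# entry? Only the "hdN" / "hdN-M" form seen in this example is handled.
import re

# Physical drive number -> LDEV, copied from the inqraid output above.
DISK_TO_LDEV = {54: 410, 55: 411, 56: 412, 57: 413, 58: 414}


def allowed_disks(spec):
    """Expand 'hd0-56' into the set {0, 1, ..., 56}."""
    match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", spec.strip())
    if match is None:
        raise ValueError(f"unsupported entry: {spec!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) is not None else first
    return set(range(first, last + 1))


allowed = allowed_disks("hd0-56")
hidden = sorted(ldev for disk, ldev in DISK_TO_LDEV.items()
                if disk not in allowed)
print(hidden)
```

With hd0-56 the LDEVs left over are 413 and 414, the same two that show dashes on the (L) side of the pairdisplay.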

Note that it is possible to “fix” this “mistake” by manual use of the raidscan command as follows:

    C:\HORCM\etc>echo hd57-58 | raidscan -find inst
    DEVICE_FILE   Group   PairVol   PORT    TARG   LUN   M   SERIAL     LDEV
    Harddisk57    VG01    d3        CL2-D   1      413   0   77010027   413
    Harddisk57    VG01    d3        CL2-D   1      413   -   77010027   413
    Harddisk58    VG01    d4        CL2-D   1      414   0   77010027   414
    Harddisk58    VG01    d4        CL2-D   1      414   -   77010027   414

    C:\HORCM\etc>pairdisplay -g VG01
    Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
    VG01    d0(L)        (CL2-D , 1, 410)77010027   410.P-VOL PAIR NEVER ,75010010   410 -
    VG01    d0(R)        (CL1-A , 1, 410)75010010  ---- ---- ----------- ----- -
    VG01    d1(L)        (CL2-D , 1, 411)77010027   411.P-VOL PAIR NEVER ,75010010   411 -
    VG01    d1(R)        (CL1-A , 1, 411)75010010  ---- ---- ----------- ----- -
    VG01    d2(L)        (CL2-D , 1, 412)77010027   412.P-VOL PAIR NEVER ,75010010   412 -
    VG01    d2(R)        (CL1-A , 1, 412)75010010  ---- ---- ----------- ----- -
    VG01    d3(L)        (CL2-D , 1, 413)77010027   413.P-VOL PAIR NEVER ,75010010   413 -
    VG01    d3(R)        (CL1-A , 1, 413)75010010  ---- ---- ----------- ----- -
    VG01    d4(L)        (CL2-D , 1, 414)77010027   414.P-VOL PAIR NEVER ,75010010   414 -
    VG01    d4(R)        (CL1-A , 1, 414)75010010  ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

“Basic” HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials – and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

    HORCM_MON
    #ip_address   service   poll(10ms)   timeout(10ms)
    101293127     11042     1000         3000

    HORCM_CMD
    #dev_name
    # CMDDEV0 - USP600 - SN 10111 - \\.\CMD-10111-4
    \\.\CMD-10111-4

The above file is correct – let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get.

Windows

    [System Call Error]
    SysCall: bind
    WSAerr: 10049(0x00002741) (See winsock2.h)
    ErrInfo: Internal Error
    ErrTime: Mon Sep 08 12:43:03 2008
    SrcFile: shorcmcc
    SrcLine: 2405

    ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the “Internal Error” that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

    #define WSABASEERR 10000
    …
    #define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page – and some useful information:

    http://www.sockets.com/err_lst1.htm

    WSAEADDRNOTAVAIL (10049) Cannot assign requested address.

    Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.
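If you have to do this lookup often, the arithmetic is easy to script. The Python sketch below pulls the WSAerr number out of a log line and turns it back into the winsock2.h name. The table only contains the two codes that appear in this document, and the function name decode_wsaerr is made up; it is not a CCI or Winsock API.

```python
# Minimal sketch: turn "WSAerr 10049(0x00002741)" from a HORCM log into
# the symbolic winsock2.h name. Only the two codes discussed in this
# document are in the table; extend it from winsock2.h as needed.
import re

WSABASEERR = 10000
WINSOCK_NAMES = {
    13: "WSAEACCES",         # WSABASEERR+13
    49: "WSAEADDRNOTAVAIL",  # WSABASEERR+49
}


def decode_wsaerr(log_text):
    """Return (code, name) for the first WSAerr entry, or None if absent."""
    match = re.search(r"WSAerr\s*:?\s*(\d+)\(0x([0-9a-fA-F]+)\)", log_text)
    if match is None:
        return None
    code = int(match.group(1))
    if code != int(match.group(2), 16):
        raise ValueError("decimal and hex values in the log line disagree")
    return code, WINSOCK_NAMES.get(code - WSABASEERR, "unknown")


print(decode_wsaerr("SysCall: bind WSAerr: 10049(0x00002741) (See winsock2.h)"))
```

The hex value in brackets is just the same number again (0x2741 is 10049), so the sketch uses it as a consistency check.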


So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

    [System Call Error]
    SysCall: bind
    Errorno: 126 (Cannot assign requested address)
    ErrInfo: Internal Error
    ErrTime: Tue Sep  2 11:45:40 2008
    SrcFile: shorcmcc
    SrcLine: 2427

    ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

    http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

                     AIX 4.3,5.1                         HP-UX 11.22                          Solaris 9,10
    EADDRNOTAVAIL    68 Can't assign requested address   227 Can't assign requested address   126 Can't assign requested address

Once again, this is not the most intuitive error I have seen.
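Because the number differs per platform, it helps to keep the values from the table above in one place and to know how to ask the local system for its own value. This Python sketch does both. The three table entries are the ones quoted above; the function names are invented for this note.

```python
# Minimal sketch: map a platform-specific errno number back to its
# symbolic name, and show the value used on the machine running this.
import errno
import os

# Values quoted in the comparison table above.
PLATFORM_ERRNO = {
    "AIX": {68: "EADDRNOTAVAIL"},
    "HP-UX": {227: "EADDRNOTAVAIL"},
    "Solaris": {126: "EADDRNOTAVAIL"},
}


def symbolic_name(platform, number):
    """Return the symbolic errno name, or None if it is not in the table."""
    return PLATFORM_ERRNO.get(platform, {}).get(number)


def local_value(name="EADDRNOTAVAIL"):
    """Return (number, message) for the platform this script runs on."""
    number = getattr(errno, name)
    return number, os.strerror(number)


print(symbolic_name("Solaris", 126))
print(local_value())
```

The first call maps the Solaris log value back to EADDRNOTAVAIL; the second prints whatever number and text the local OS uses, which will usually differ again.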

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to \\.\CMD-10111-42:

    12:52:23-16b48-04004- horcread() cannot open command device \\.\CMD-10111-42
    12:52:23-16b48-04004- [WARNING] This device(\\.\CMD-10111-42) is not ready for receiving a command.
    12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD.
    12:52:23-16b48-04004- ERROR:horcm_cfg_create
    12:52:28-0b3b0-01136- horcmgr:Failed to connect to HORCM.

Here I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to “horcm42”.

    17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
    17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4.
    17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
    17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
    17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.
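A service name only works if the OS can translate it into a port number. The Python sketch below asks the operating system's services database for a UDP service, on the assumption that this is roughly the lookup HORCM relies on when the service field is a name rather than a number. The function name resolve_udp_service is made up for this note.

```python
# Minimal sketch: can the OS resolve a HORCM_MON service entry to a
# UDP port? A numeric entry is returned as-is; a name is looked up in
# the system services database.
import socket


def resolve_udp_service(service):
    """Return the UDP port for a service entry, or None if it is unknown."""
    if service.isdigit():
        return int(service)
    try:
        return socket.getservbyname(service, "udp")
    except OSError:
        return None  # name is not defined for UDP on this machine


for entry in ("11042", "horcm42"):
    print(entry, "->", resolve_udp_service(entry))
```

If the name comes back as None on the CCI server, either add it to the services file or use the port number directly in HORCM_MON.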

4. UDP port which is in use

Change 11042 to 1030. This is not a “sensible” port number. It was chosen to cause an error.

    [System Call Error]
    SysCall: bind
    WSAerr: 10013(0x0000271d) (See winsock2.h)
    ErrInfo: Internal Error
    ErrTime: Mon Sep 08 17:39:46 2008
    SrcFile: shorcmcc
    SrcLine: 2405

    ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

    #define WSAEACCES (WSABASEERR+13)

The following web page has more information:

    http://www.sockets.com/err_lst1.htm

    WSAEACCES (10013) Permission denied.

    Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt one could do this:

    C:\HORCM\ETC>netstat -a -p UDP

    Active Connections

      Proto  Local Address             Foreign Address  State
      UDP    ml_acer510:microsoft-ds   *:*
      UDP    ml_acer510:isakmp         *:*
      UDP    ml_acer510:1030           *:*
      …
      UDP    ml_acer510:54323          *:*

It is not likely that you will be this lucky.
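If netstat is not to hand but a scripting language is, you can simply try the bind yourself before starting HORCM. The Python sketch below attempts to bind a UDP socket to the port and reports whether that worked. The default address 127.0.0.1 is an assumption made for the example; to mirror HORCM you would test the ip_address from HORCM_MON instead. The function name is invented.

```python
# Minimal sketch: is a UDP port free to bind on this host? Run it on
# the CCI server while HORCM is stopped.
import socket


def udp_port_is_free(port, host="127.0.0.1"):
    """Try to bind a UDP socket to (host, port); True means the bind worked."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # in use, or not permitted for this user
    return True


for port in (11042, 1030):
    print(port, "free" if udp_port_is_free(port) else "NOT available")
```

A False result does not say why the bind failed (in use, reserved, or not permitted), but it is enough to tell you to pick another port for the HORCM_MON service.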


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008

Mike Le Voi Page 24 11042023

Page 15: How to Debug CCI Issues 1.3

How To Debug CCI Issues ndash Version 13

Check Is the user using ldquogood syntaxrdquo

Even though this is a 9500V users should always use Port-HSD-LUN syntax I strongly recommend not to use Port-LDEV syntax ndash where is the cross-check Is MU specified for ShadowImage On some levels of CCI this is mandatory

However you should specify it anyway as this is Best Practice

How to check if the HORCM CONF files are correct

CHORCMETCgtraidscan -p CL1-A-1 -m 0PORT ALPACTID LUNum(LDEV)PS Status LDEVP-SeqP-LDEVCL1-A-1ef 5 1 0-0 1(13)S-VOL PAIR 13 ----- 10CL1-A-1ef 5 1 1-0 1(29)P-VOL PSUS 29 977 309CL1-A-1ef 5 1 2-0 1(48)P-VOL PSUS 48 977 300CL1-A-1ef 5 1 3-0 1(309)S-VOL SSUS 309 ----- 29CL1-A-1ef 5 1 4-0 1(310)S-VOL SSUS 310 ----- 29CL1-A-1ef 5 1 5-0 1(308)S-VOL SSUS 308 ----- 24CL1-A-1ef 5 1 6-0 1(305)S-VOL SSUS 305 ----- 1CL1-A-1ef 5 1 7-0 1(49)SMPL ---- ----- ----- -----CL1-A-1ef 5 1 8-0 1(50)SMPL ---- ----- ----- -----

CHORCMETCgtpairdisplay -g VG01Group PairVol(LR) (PortTID LU-M) SeqLDEVPSStatus SeqP-LDEV MVG01 LDEV49(L) (CL1-A-1 1 7-0 ) 977 49SMPL --------- ----- -VG01 LDEV49(R) (CL1-A-1 1 8-0 ) 977 50SMPL --------- ----- -

Check the PVOL and SVOL LDEV numbers (in this case 49 and 50) Check the PortHSDLUN (in this case CL1-A-1 ndash LUNs 7 and 8)

But now the failure

CHORCMETCgtpaircreate -g VG01 -vlpaircreate [EX_CMDRJE] An order to the controlcommand device was rejectedRefer to the command log(CHORCMlog8horcc_hp2k5_logtxt) for details

And in the log we see this

COMMAND ERROR EUserId for HOMRCF[8] Administrator (0) Wed Mar 07 170230 2007CMDLINE paircreate -g VG01 -vl170230-9a8a8-12452- ERRORcm_sndrcv[rc lt 0 from HORCM]170230-9e728-12452- [paircreate] L_CMD(CREATEPAIR) ERROR rc = -35170230-9e728-12452- [paircreate][exit(221)][EX_CMDRJE] An order to the controlcommand device was rejected[Cause ] An order to the command(control) device failedor was rejected[Action]Please confirm the following itemsIf this trouble doesnt resolvethen collect HORCM error logs(HORCM_LOG=CHORCMlog8curlog) and Remote HORCM logsand send them to service personnel(1) Check if the HORC or HOMRCF function is installed in the RAID(2) Check if the RCP and LCP are installed in the RAID(3) Check if the path between the RAID CUs is established by using the SVP(4) Check if the pair target volume is an appropriate status

Yes meaningless error message numbers like -35 and 221 If this is a RAID subsystem check the SSBLOGS on the SVP However for DF the SSB is logged in CCI

Contents of CHORCMlog8curloghorcmlog_servernamehorcm_logtxt

170230-9a8a8-14140- SCSI Check Condition170230-9a8a8-14140- SCSI SENSE DATA ---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------

Mike Le Voi Page 15 11042023

How To Debug CCI Issues ndash Version 13

[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8 [0x0012f2c4]0010 00001000 00000000 00000000 00000000 [0x0012f2d4]0020 00000000 00000000 00000000 00000000 [0x0012f2e4]0030 00000000 00000000 00000000 00000000 [0x0012f2f4]0040 00000000 00000000 00000000 00000000 [0x0012f304]0050 00000000 00000000 00000000 00000000 170230-9a8a8-14140- SKEY = 0x05170230-9a8a8-14140- ASC = 0x96170230-9a8a8-14140- SSB = 0x8400000d

170230-9a8a8 is the cross-check Next it is not obvious but the error code is

961C 000D

Now get hold of the latest AMS CCI manual which contains Appendix A4

A4 How to Read Detailed Error Log Codesand this subsectionA44 Sense Code and Detail Code

Beware ndash some versions of this manual do not contain these sections Find one that does

Table A5 Sense Codes and Detailed CodeshellipError Contents Recommended Action961C 000C The S-VOL is a Sub LU of a unified LU Check the status of the LU961C 000D The default controllers controlling the P-VOL and S-VOL are not the samehellip961C 000E The P-VOL is a Cache Residency LU Check the status of hellip

In this case the PVOL and SVOL default controllers are not the same

ldquoOld Syntaxrdquo HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards With 7700E and 9900 there were no Host Storage Domains (HSD) so all LUNs were on the ldquorealrdquo port With 9900V USP etc the LUNs are normally considered to be attached to ldquologicalrdquo ports ndash which are called HSD or Host Groups

However it is still possible to use the ldquooldrdquo syntax This always causes confusion after a while as LUNs get added and deleted from various HSD Here is an example

Imagine that 3 HSD are created on an empty port ndash HSD 12 and 3 Each HSD has 3 LUNs added ndash numbered as 0 1 and 2

If this is done in sequence HSD 1 has ldquoabsoluterdquo LUNs 0-2 HSD 2 has ldquoabsoluterdquo LUNs 3-5 and HSD 3 has ldquoabsoluterdquo LUNs 6-8

Now imagine that the following actions have been performed some time later Delete HSD 2 Add HSD 4 with LUNs 0 and 1

And then you allocate LUN 3 to HSD 1 and 3 If you did not know that the previous changes had been made it would be impossible for you to ldquoguessrdquo that

Mike Le Voi Page 16 11042023

How To Debug CCI Issues ndash Version 13

HSD 1 LUN 3 was ldquoabsoluterdquo LUN 5 HSD 3 LUN 3 was ldquoabsoluterdquo LUN 9

Even worse you have no way of looking at the LUN allocations via Storage Navigator as that only shows ldquorelativerdquo LUN numbers

In a recent case 47 S-VOL LUNs were deleted by mistake from a HSD When the mistake was noticed the same 47 S-VOL LUNs were added back in the ldquosame orderrdquo However a subsequent pairdisplay showed the following

TC-WRP 1003-108A(L) (CL2-F 0 45)32179 10b5S-VOL PAIR ASYNC 0 102e TC-WRP 1003-108A(R) (CL1-C 0 4)32208 1003P-VOL PAIR ASYNC 0 108a - (1)TC-WRP 1004-108B(L) (CL2-F 0 46)32179 --------- ---- ----------- ----- -TC-WRP 1004-108B(R) (CL1-C 0 5)32208 1004P-VOL PAIR ASYNC 0 108b -TC-WRP 1005-108C(L) (CL2-F 0 47)32179 --------- ---- ----------- ----- -TC-WRP 1005-108C(R) (CL1-C 0 6)32208 1005P-VOL PAIR ASYNC 0 108c -TC-WRP 1006-108D(L) (CL2-F 0 48)32179 --------- ---- ----------- ----- -TC-WRP 1006-108D(R) (CL1-C 0 7)32208 1006P-VOL PAIR ASYNC 0 108d -TC-WRP 1007-108E(L) (CL2-F 0 49)32179 108aS-VOL PAIR ASYNC 0 1003 - (2)TC-WRP 1007-108E(R) (CL1-C 0 8)32208 1007P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above Firstly the pairdisplay was issued by the ldquoDRrdquo CCI server ndash as (L) refers to the S-VOL Next we have obvious mismatches ndash in yellow What is less obvious is that the turquoise and green pairs are also invalid Indeed

(3) P-VOL is 1007 and the associated S-VOL is 108E(2) S-VOL is 108A and the associated P-VOL is 1003

This entry does not go with (3)(1) This is the associated P-VOL for (2)

Here is an excerpt from the ldquooldrdquo HORCM CONF file ndash using ldquoabsoluterdquo LUN numbers

TC-WRP 1003-108A CL2-F 0 45TC-WRP 1004-108B CL2-F 0 46TC-WRP 1005-108C CL2-F 0 47TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax

TC-WRP 1003-108A CL2-F-2 0 6TC-WRP 1004-108B CL2-F-2 0 7TC-WRP 1005-108C CL2-F-2 0 8TC-WRP 1006-108D CL2-F-2 0 9

As you can the new HORCM CONF file is easier to understand and compare with Storage Navigator

By the way here is how you find out the ldquoabsoluterdquo and ldquorelativerdquo LUN numbers

raidscan -p CL2-F -fxCL2-F 88 3 0 491(108a)S-VOL PAIR ASYNC 108a ----- 1003CL2-F 88 3 0 501(108b)S-VOL PAIR ASYNC 108b ----- 1004CL2-F 88 3 0 511(108c)S-VOL PAIR ASYNC 108c ----- 1005

Mike Le Voi Page 17 11042023

How To Debug CCI Issues ndash Version 13

raidscan -p CL2-F-2 -fxCL2-F-2 88 3 0 61(108a)S-VOL PAIR ASYNC 108a ----- 1003CL2-F-2 88 3 0 71(108b)S-VOL PAIR ASYNC 108b ----- 1004CL2-F-2 88 3 0 81(108c)S-VOL PAIR ASYNC 108c ----- 1005

Secured CMDDEV and HORCMPERM Implications

If you use a “normal” (i.e. non-secured) CMDDEV, you can control CCI for any LUNs on any Host. This also means that you can destroy anyone's data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a “normal” CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a “normal” CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10)
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 410.S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 411.S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 412.S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 413.S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 414.S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.
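With a large group it helps to list the hidden volumes automatically. Here is a rough sketch that flags every pairdisplay row whose LDEV column is nothing but dashes. The row layout is assumed from the listing above, and the function name inaccessible_volumes is invented for this example.

```python
import re

# Assumed row layout: Group PairVol(L|R) (Port TID LU)Seq# LDEV#.P/S ...
ROW = re.compile(r"^(\S+)\s+(\S+)\((L|R)\)\s+\(([^)]*)\)(\S+)\s+(\S+)")


def inaccessible_volumes(pairdisplay_text):
    """Return (group, pair_vol, side) for rows whose LDEV column is all dashes."""
    hidden = []
    for line in pairdisplay_text.splitlines():
        match = ROW.match(line.strip())
        if match and set(match.group(6)) == {"-"}:
            hidden.append((match.group(1), match.group(2), match.group(3)))
    return hidden


SAMPLE = """\
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
"""

for group, pair_vol, side in inaccessible_volumes(SAMPLE):
    print(f"{group} {pair_vol}({side}) is not accessible from this CCI instance")
```

The header line is skipped naturally because its PairVol(L/R) field does not match the (L) or (R) pattern.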

If you attempt to run any commands such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let's start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let's stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>

HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.

Note that it is possible to “fix” this “mistake” by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.
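Before restarting horcm it can be worth checking that the permission file really covers the drives you expect. The sketch below expands entries of the form hdN or hdN-M, which are the only forms shown in this document; real files may allow other device specifications that this does not handle. The function name allowed_drives is hypothetical.

```python
import re

# Only the "hdN" and "hdN-M" forms seen in this document are recognised.
ENTRY = re.compile(r"^hd(\d+)(?:-(\d+))?$", re.IGNORECASE)


def allowed_drives(perm_lines):
    """Expand permission-file entries into a set of physical drive numbers."""
    drives = set()
    for line in perm_lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comments
        match = ENTRY.match(entry)
        if match is None:
            raise ValueError(f"unrecognised entry: {entry!r}")
        first = int(match.group(1))
        last = int(match.group(2) or first)
        drives.update(range(first, last + 1))
    return drives


current = allowed_drives(["hd0-56"])
for drive in (56, 57, 58):
    state = "allowed" if drive in current else "NOT allowed"
    print(f"PhysicalDrive{drive}: {state}")
```

Run against the file shown earlier, this reports drives 57 and 58 as not allowed, which matches the missing d3(L) and d4(L) entries.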

“Basic” HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct. Let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get:

Windows

[System Call Error]
SysCall: bind
WSAerr: 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmc.c
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the “Internal Error” that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
...
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.

So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different, they are different on each platform. Here is the same error for Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmc.c
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error says:

                AIX 4351    HP-UX 1122    Solaris 910
EADDRNOTAVAIL   68          227           126

The message text is “Can't assign requested address” on all three platforms.

Once again, this is not the most intuitive error I have seen.
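Since the number is the only reliable clue, a small lookup saves a trip to the header files. This sketch contains only the values quoted in this document (the two Winsock offsets and the three UNIX errno values for EADDRNOTAVAIL); it is not a complete table, and the function names are made up for illustration.

```python
WSABASEERR = 10000

# Offsets from WSABASEERR, as quoted from winsock2.h in this document.
WINSOCK_NAMES = {
    13: "WSAEACCES",
    49: "WSAEADDRNOTAVAIL",
}

# errno values for EADDRNOTAVAIL on each UNIX platform, from the table above.
EADDRNOTAVAIL_BY_PLATFORM = {
    "AIX": 68,
    "HP-UX": 227,
    "Solaris": 126,
}


def winsock_name(wsa_error):
    """Translate a WSAerr number from the HORCM log into its winsock2.h name."""
    return WINSOCK_NAMES.get(wsa_error - WSABASEERR, "unknown (see winsock2.h)")


def is_address_not_available(platform, errno_value):
    """True if errno_value means EADDRNOTAVAIL on the given UNIX platform."""
    return EADDRNOTAVAIL_BY_PLATFORM.get(platform) == errno_value


print(winsock_name(10049))
print(is_address_not_available("Solaris", 126))
```

The same errno number means something different on another platform, which is why the lookup is keyed on the platform name as well as the number.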

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread() cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr:Failed to connect to HORCM

Here I think it is pretty obvious what the problem is.

3. Invalid service name

Change 11042 to “horcm42”.

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is fairly obvious what is wrong.
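You can check a service entry on the host before starting HORCM. The sketch below treats the value either as a literal port number or as a name to be looked up in the operating system's services database through Python's socket.getservbyname. Whether HORCM resolves names in exactly the same way is an assumption here, and resolve_service is a made-up helper name.

```python
import socket


def resolve_service(service, protocol="udp"):
    """Return the port for a HORCM_MON service value, or None if it cannot be resolved."""
    if service.isdigit():
        port = int(service)
        return port if 0 < port < 65536 else None
    try:
        # Looks the name up in the OS services database (e.g. the services file).
        return socket.getservbyname(service, protocol)
    except OSError:
        return None


for value in ("11042", "horcm42"):
    port = resolve_service(value)
    if port is None:
        print(f"{value}: not a port number and not a known {'udp'} service name")
    else:
        print(f"{value}: resolves to UDP port {port}")
```

If the name is reported as unknown, adding it to the services file (or using the number directly) is the usual fix.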

4. UDP port which is in use

Change 11042 to 1030. This is not a “sensible” port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr: 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmc.c
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State
UDP ml_acer510:microsoft-ds
UDP ml_acer510:isakmp
UDP ml_acer510:1030
...
UDP ml_acer510:54323

It is not likely that you will be this lucky.
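If netstat is not available but Python is, a bind test answers the same question directly. This is only a sketch: it tries to bind a UDP socket to the port and treats any OS error as "not usable", which covers both the "permission denied" result shown above and the more usual "address in use". The loopback address is an assumption; substitute the address from HORCM_MON to test the real interface.

```python
import socket


def udp_port_is_free(port, address="127.0.0.1"):
    """Return True if a UDP socket can be bound to (address, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((address, port))
        except OSError:
            # Covers "in use", "permission denied" and "address not available".
            return False
    return True


if __name__ == "__main__":
    for candidate in (1030, 11042):
        state = "free" if udp_port_is_free(candidate) else "NOT usable"
        print(f"UDP port {candidate}: {state}")
```

Run it with HORCM stopped, otherwise the instance's own port will of course be reported as not usable.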

Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008

[0x0012f2b4]0000 70000500 00000038 8400000d 961c0000 p8
[0x0012f2c4]0010 00001000 00000000 00000000 00000000
[0x0012f2d4]0020 00000000 00000000 00000000 00000000
[0x0012f2e4]0030 00000000 00000000 00000000 00000000
[0x0012f2f4]0040 00000000 00000000 00000000 00000000
[0x0012f304]0050 00000000 00000000 00000000 00000000
17:02:30-9a8a8-14140- SKEY = 0x05
17:02:30-9a8a8-14140- ASC = 0x96
17:02:30-9a8a8-14140- SSB = 0x8400000d

17:02:30-9a8a8 is the cross-check. Next, it is not obvious, but the error code is made up of the 961c at the start of the fourth word in the first dump line and the 000d at the end of the SSB value:

961C 000D

Now get hold of the latest AMS CCI manual, which contains Appendix A.4:

A.4 How to Read Detailed Error Log Codes

and this subsection:

A.4.4 Sense Code and Detail Code

Beware: some versions of this manual do not contain these sections. Find one that does.

Table A.5 Sense Codes and Detailed Codes
...
Code        Error Contents                                                              Recommended Action
961C 000C   The S-VOL is a Sub LU of a unified LU.                                      Check the status of the LU.
961C 000D   The default controllers controlling the P-VOL and S-VOL are not the same.   ...
961C 000E   The P-VOL is a Cache Residency LU.                                          Check the status of ...

In this case the P-VOL and S-VOL default controllers are not the same.
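The lookup key can be pulled out of the dump mechanically. The sketch below assumes the first dump line is SCSI fixed-format sense data (sense key in byte 2, ASC and ASCQ in bytes 12 and 13) with the SSB in bytes 8 to 11, and that the manual's detailed code is the low 16 bits of the SSB. Both assumptions are inferred from this one example rather than taken from the manual, and decode_sense is a made-up name.

```python
def decode_sense(dump_words):
    """Decode the first 16 bytes of a HORCM sense dump given as four hex words."""
    raw = bytes.fromhex("".join(dump_words))
    if len(raw) < 16:
        raise ValueError("need at least 16 bytes of sense data")
    sense_key = raw[2] & 0x0F                  # SKEY in the log
    asc, ascq = raw[12], raw[13]               # ASC in the log, plus its qualifier
    ssb = int.from_bytes(raw[8:12], "big")     # SSB in the log
    # Assumed: manual lookup key = ASC+ASCQ followed by the low 16 bits of SSB.
    lookup = f"{asc:02X}{ascq:02X} {ssb & 0xFFFF:04X}"
    return {"skey": sense_key, "asc": asc, "ssb": ssb, "lookup": lookup}


decoded = decode_sense(["70000500", "00000038", "8400000d", "961c0000"])
print(f"SKEY = 0x{decoded['skey']:02x}")
print(f"ASC  = 0x{decoded['asc']:02x}")
print(f"SSB  = 0x{decoded['ssb']:08x}")
print(f"Look up {decoded['lookup']} in Table A.5")
```

The three decoded values agree with the SKEY, ASC and SSB lines that HORCM prints under the dump, which is a useful sanity check that the right dump line was used.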

“Old Syntax” HORCM CONF Files

This problem only applies to RAID subsystems from 9900V onwards. With 7700E and 9900 there were no Host Storage Domains (HSD), so all LUNs were on the “real” port. With 9900V, USP etc. the LUNs are normally considered to be attached to “logical” ports, which are called HSD or Host Groups.

However, it is still possible to use the “old” syntax. This always causes confusion after a while, as LUNs get added to and deleted from various HSD. Here is an example.

Imagine that 3 HSD are created on an empty port: HSD 1, 2 and 3. Each HSD has 3 LUNs added, numbered as 0, 1 and 2.

If this is done in sequence, HSD 1 has “absolute” LUNs 0-2, HSD 2 has “absolute” LUNs 3-5 and HSD 3 has “absolute” LUNs 6-8.

Now imagine that the following actions have been performed some time later: delete HSD 2, then add HSD 4 with LUNs 0 and 1.

And then you allocate LUN 3 to HSD 1 and 3. If you did not know that the previous changes had been made, it would be impossible for you to “guess” that:

HSD 1 LUN 3 was “absolute” LUN 5
HSD 3 LUN 3 was “absolute” LUN 9

Even worse, you have no way of looking at the LUN allocations via Storage Navigator, as that only shows “relative” LUN numbers.
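The example can be replayed with a toy model. The sketch below assumes that every LUN added to any HSD on a port takes the lowest free “absolute” number on that port. That rule reproduces the numbers in the example, but it is a simplification made for illustration and not a description of the subsystem's real allocator. The class and method names are made up.

```python
class Port:
    """Toy model of one physical port shared by several HSDs."""

    def __init__(self):
        self.absolute = {}  # (hsd, relative_lun) -> absolute_lun

    def _lowest_free(self):
        used = set(self.absolute.values())
        candidate = 0
        while candidate in used:
            candidate += 1
        return candidate

    def add_lun(self, hsd, relative_lun):
        """Add a LUN to an HSD and return the absolute number it received."""
        number = self._lowest_free()
        self.absolute[(hsd, relative_lun)] = number
        return number

    def delete_hsd(self, hsd):
        """Remove an HSD, freeing all the absolute numbers it was using."""
        self.absolute = {key: value for key, value in self.absolute.items() if key[0] != hsd}


port = Port()
for hsd in (1, 2, 3):              # three HSDs, each with LUNs 0, 1 and 2
    for lun in (0, 1, 2):
        port.add_lun(hsd, lun)

port.delete_hsd(2)                 # frees absolute LUNs 3-5
port.add_lun(4, 0)                 # HSD 4 reuses absolute LUN 3
port.add_lun(4, 1)                 # ... and absolute LUN 4

print("HSD 1 LUN 3 -> absolute LUN", port.add_lun(1, 3))
print("HSD 3 LUN 3 -> absolute LUN", port.add_lun(3, 3))
```

Changing the order of the last few calls changes the answers, which is exactly why “absolute” numbers cannot be guessed after the fact.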

In a recent case, 47 S-VOL LUNs were deleted by mistake from an HSD. When the mistake was noticed, the same 47 S-VOL LUNs were added back in the “same order”. However, a subsequent pairdisplay showed that the LUNs no longer lined up with their original pairs.


Page 17: How to Debug CCI Issues 1.3

How To Debug CCI Issues ndash Version 13

HSD 1 LUN 3 was ldquoabsoluterdquo LUN 5 HSD 3 LUN 3 was ldquoabsoluterdquo LUN 9

Even worse you have no way of looking at the LUN allocations via Storage Navigator as that only shows ldquorelativerdquo LUN numbers

In a recent case 47 S-VOL LUNs were deleted by mistake from a HSD When the mistake was noticed the same 47 S-VOL LUNs were added back in the ldquosame orderrdquo However a subsequent pairdisplay showed the following

TC-WRP 1003-108A(L) (CL2-F 0 45)32179 10b5S-VOL PAIR ASYNC 0 102e TC-WRP 1003-108A(R) (CL1-C 0 4)32208 1003P-VOL PAIR ASYNC 0 108a - (1)TC-WRP 1004-108B(L) (CL2-F 0 46)32179 --------- ---- ----------- ----- -TC-WRP 1004-108B(R) (CL1-C 0 5)32208 1004P-VOL PAIR ASYNC 0 108b -TC-WRP 1005-108C(L) (CL2-F 0 47)32179 --------- ---- ----------- ----- -TC-WRP 1005-108C(R) (CL1-C 0 6)32208 1005P-VOL PAIR ASYNC 0 108c -TC-WRP 1006-108D(L) (CL2-F 0 48)32179 --------- ---- ----------- ----- -TC-WRP 1006-108D(R) (CL1-C 0 7)32208 1006P-VOL PAIR ASYNC 0 108d -TC-WRP 1007-108E(L) (CL2-F 0 49)32179 108aS-VOL PAIR ASYNC 0 1003 - (2)TC-WRP 1007-108E(R) (CL1-C 0 8)32208 1007P-VOL PAIR ASYNC 0 108e - (3)

What can you tell from the display above Firstly the pairdisplay was issued by the ldquoDRrdquo CCI server ndash as (L) refers to the S-VOL Next we have obvious mismatches ndash in yellow What is less obvious is that the turquoise and green pairs are also invalid Indeed

(3) P-VOL is 1007 and the associated S-VOL is 108E(2) S-VOL is 108A and the associated P-VOL is 1003

This entry does not go with (3)(1) This is the associated P-VOL for (2)

Here is an excerpt from the ldquooldrdquo HORCM CONF file ndash using ldquoabsoluterdquo LUN numbers

TC-WRP 1003-108A CL2-F 0 45TC-WRP 1004-108B CL2-F 0 46TC-WRP 1005-108C CL2-F 0 47TC-WRP 1006-108D CL2-F 0 48

And here is the same excerpt after the file has been changed to use HSD syntax

TC-WRP 1003-108A CL2-F-2 0 6TC-WRP 1004-108B CL2-F-2 0 7TC-WRP 1005-108C CL2-F-2 0 8TC-WRP 1006-108D CL2-F-2 0 9

As you can the new HORCM CONF file is easier to understand and compare with Storage Navigator

By the way, here is how you find out the “absolute” and “relative” LUN numbers:

raidscan -p CL2-F -fx
CL2-F 88 3 0 49.1(108a)S-VOL PAIR ASYNC 108a ----- 1003
CL2-F 88 3 0 50.1(108b)S-VOL PAIR ASYNC 108b ----- 1004
CL2-F 88 3 0 51.1(108c)S-VOL PAIR ASYNC 108c ----- 1005

raidscan -p CL2-F-2 -fx
CL2-F-2 88 3 0 6.1(108a)S-VOL PAIR ASYNC 108a ----- 1003
CL2-F-2 88 3 0 7.1(108b)S-VOL PAIR ASYNC 108b ----- 1004
CL2-F-2 88 3 0 8.1(108c)S-VOL PAIR ASYNC 108c ----- 1005

Secured CMDDEV and HORCMPERM Implications

If you use a “normal”, i.e. non-secured, CMDDEV you can control CCI for any LUNs on any Host. This also means that you can destroy anyone’s data by using SI to copy your LUNs over the top of their LUNs.

For this reason you normally only let the Storage Administrator have access to a “normal” CMDDEV, and you always give normal users access to a Secured CMDDEV.

You can tell if a CMDDEV is secured as follows:

C:\HORCM\ETC>horcmstart 0
starting HORCM inst 0
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=0
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE1

This is a “normal” CMDDEV. For this test I also had access to a secured CMDDEV, and it is possible to swap between them as follows:

C:\HORCM\ETC>horcctl -C
Changed control device(PHYSICALDRIVE1 -> PHYSICALDRIVE10*
C:\HORCM\ETC>horcctl -D
Current control device = PHYSICALDRIVE10*

The asterisk after the device name means that the CMDDEV is secured. Normally, of course, you would not give any user access to different types of CMDDEV, as that will cause problems.
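If you collect horcctl -D output from many servers, the asterisk check is easy to automate. The following Python sketch is an illustration only: `parse_control_device` is a hypothetical helper, and it assumes the line has the form “Current control device = NAME” with a trailing asterisk for a secured CMDDEV, as described above.

```python
from typing import Tuple


def parse_control_device(line: str) -> Tuple[str, bool]:
    """Return (device name, secured?) from one line of 'horcctl -D' output."""
    prefix = "Current control device ="
    if not line.startswith(prefix):
        raise ValueError(f"not a 'horcctl -D' line: {line!r}")
    device = line[len(prefix):].strip()
    secured = device.endswith("*")   # trailing asterisk marks a secured CMDDEV
    return device.rstrip("*"), secured


for sample in ("Current control device = PHYSICALDRIVE1",
               "Current control device = PHYSICALDRIVE10*"):
    name, secured = parse_control_device(sample)
    print(name, "secured" if secured else "normal")
```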

To test what will happen before giving a secured CMDDEV to a user, you can set the HORCMPROMOD environment variable as follows:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\ETC>horcmstart 410
starting HORCM inst 410
HORCM inst 0 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>horcctl -D
Current control device = PhysicalDrive53

Note, however, that this does not affect the horcctl display.

Here is some pairdisplay output when HORCMPROMOD is not set on any CCI server:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 410.S-VOL PAIR NEVER ----- 410 -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 411.S-VOL PAIR NEVER ----- 411 -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 412.S-VOL PAIR NEVER ----- 412 -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 413.S-VOL PAIR NEVER ----- 413 -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 414.S-VOL PAIR NEVER ----- 414 -

As you can see, LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x).

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers:

C:\HORCM\etc>set HORCMPROMOD=1
C:\HORCM\etc>horcmstart 410
starting HORCM inst 410
HORCM inst 410 starts successfully.
C:\HORCM\ETC>set horcminst=410
C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

As you can see, the local CCI instance (L) has access to all its LUNs/LDEVs. However, the DR CCI server (R) has no access to LDEVs 410-414.

If you attempt any commands, such as pairsplit, the following will happen:

C:\HORCM\etc>pairsplit -g VG01
pairsplit: [EX_ENPERM] Permission denied with the LDEV
Refer to the command log(C:\HORCM\log410\horcc_Verdande_log.txt) for details.

You can use HORCMPERM*.CONF (* is the instance number) to further limit CCI access. HORCMPERM*.CONF does not give you access to LDEVs that you are not allowed to process. It removes access to LDEVs that you are allowed to process but do not wish to process.

How does this work? Let’s start instance 410 with HORCMPROMOD=1 and no HORCMPERM*.CONF file. At the bottom of the start-up log you will see this:

11:01:48-518b0-02092- HORCM has been set to the PROTECT MODE on ENV
11:01:50-e2900-01428- horcmgr executed CreateProcess(raidscan.exe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives), thus allowing all LUNs on this server to be accessed.

Now let’s stop horcm and define a file as follows:

C:\HORCM\etc>type C:\WINDOWS\HORCMPERM410.CONF
hd0-56

C:\HORCM\etc>


HORCMPERM410.CONF contains a list of every device that we wish to be able to access via CCI. Here is the resultant pairdisplay after a restart of horcm:

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 ---- ---- ----------- ----- -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 ---- ---- ----------- ----- -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

The d3(L) and d4(L) lines (bold in the original) show what has changed. Here is the bottom of the start-up log file:

11:08:03-7d3e8-02408- horcmgr executed CreateProcess(raidscan.exe -find inst -z0r=C:\WINDOWS\horcmperm410.conf -z2w=NUL -z1w=NUL)exit = 0

As you can see, raidscan has been called internally and is using the list of disks in HORCMPERM410.CONF to determine which LDEVs can be accessed. Here is some inqraid output:

C:\HORCM\etc>inqraid $LETALL -CLI
DEVICE_FILE PORT SERIAL LDEV CTG H/M/12 SSID R:Group PRODUCT_ID
E:\Vol13\Dsk54 CL2-D 77010027 410 - P/s/ss 0000 A:07-00 DF600F
F:\Vol14\Dsk55 CL2-D 77010027 411 - P/s/ss 0000 A:07-00 DF600F
Q:\Vol11\Dsk12 CL1-B 3157 169 - P/s/ss 0000 5:02-00 DF600F
G:\Vol15\Dsk56 CL2-D 77010027 412 - P/s/ss 0000 A:07-00 DF600F
R:\Vol12\Dsk13 CL1-B 3157 170 - P/s/ss 0000 5:02-00 DF600F
H:\Vol16\Dsk57 CL2-D 77010027 413 - P/s/ss 0000 A:07-00 DF600F
I:\Vol17\Dsk58 CL2-D 77010027 414 - P/s/ss 0000 A:07-00 DF600F
J:\Vol2\Dsk0 - - - - - - - ST336754LC

The Dsk57 and Dsk58 lines (bold in the original) show that LDEVs 413 and 414 are Physical Drives 57 and 58. As we only allowed access to Physical Drives 0-56, this explains why the pairdisplay has changed.
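The same reasoning can be written down as a short Python sketch. It is not how CCI implements the check; `expand_hd` is a hypothetical helper that only understands the “hdN” and “hdN-M” forms seen in this example, and the drive-to-LDEV mapping is copied by hand from the inqraid output above.

```python
import re
from typing import Dict, Iterable, Set


def expand_hd(entries: Iterable[str]) -> Set[int]:
    """Expand entries such as 'hd0-56' or 'hd57' into a set of drive numbers."""
    allowed: Set[int] = set()
    for entry in entries:
        match = re.fullmatch(r"hd(\d+)(?:-(\d+))?", entry.strip())
        if match is None:
            continue  # blank lines and unsupported forms are ignored in this sketch
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) is not None else first
        allowed.update(range(first, last + 1))
    return allowed


# Physical Drive number -> LDEV, taken from the inqraid output (serial 77010027 only).
DRIVE_TO_LDEV: Dict[int, int] = {54: 410, 55: 411, 56: 412, 57: 413, 58: 414}

allowed_drives = expand_hd(["hd0-56"])
permitted = sorted(ldev for drive, ldev in DRIVE_TO_LDEV.items() if drive in allowed_drives)
blocked = sorted(ldev for drive, ldev in DRIVE_TO_LDEV.items() if drive not in allowed_drives)

print("permitted LDEVs:", permitted)
print("blocked LDEVs:  ", blocked)
```

Changing the entry to hd0-58 in the sketch puts LDEVs 413 and 414 back in the permitted list, which mirrors fixing the file and restarting horcm.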

Note that it is possible to “fix” this “mistake” by manual use of the raidscan command as follows:

C:\HORCM\etc>echo hd57-58 | raidscan -find inst
DEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEV
Harddisk57 VG01 d3 CL2-D 1 413 0 77010027 413
Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413
Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414
Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

C:\HORCM\etc>pairdisplay -g VG01
Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M
VG01 d0(L) (CL2-D 1 410)77010027 410.P-VOL PAIR NEVER 75010010 410 -
VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -
VG01 d1(L) (CL2-D 1 411)77010027 411.P-VOL PAIR NEVER 75010010 411 -
VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -
VG01 d2(L) (CL2-D 1 412)77010027 412.P-VOL PAIR NEVER 75010010 412 -
VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -
VG01 d3(L) (CL2-D 1 413)77010027 413.P-VOL PAIR NEVER 75010010 413 -
VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -
VG01 d4(L) (CL2-D 1 414)77010027 414.P-VOL PAIR NEVER 75010010 414 -
VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

Of course, you are unlikely to fix such an issue with raidscan. You would normally fix HORCMPERM*.CONF and then stop and restart horcm.

“Basic” HORCM CONF problems

When HORCM will not start, you strip the CONF file back to the bare essentials and then change one thing at a time. Sometimes even this fails. Here are the most common reasons.

HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
101293127 11042 1000 3000

HORCM_CMD
#dev_name
# CMDDEV0 - USP600 - SN 10111 - CMD-10111-4
CMD-10111-4

The above file is correct, so let us make some simple changes to break it.

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get.

Windows

[System Call Error]
SysCall: bind
WSAerr: 10049(0x00002741) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 12:43:03 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the “Internal Error” that confuses most people here. The real error is in the line above. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
…
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.


So the error is obvious when you know where to look. The problem is that not many people know where to look.
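The arithmetic behind the WSAerr value is simple enough to script. The Python sketch below is only an illustration: `decode_wsaerr` and the small `KNOWN_OFFSETS` table are hypothetical names, and the table holds just the two Winsock codes that appear in this document (10049 and 10013), not the full list from winsock2.h.

```python
WSABASEERR = 10000  # base value defined in winsock2.h

# Only the two offsets discussed in this document; extend from winsock2.h as needed.
KNOWN_OFFSETS = {
    13: "WSAEACCES",
    49: "WSAEADDRNOTAVAIL",
}


def decode_wsaerr(code: int) -> dict:
    """Break a WSAerr value into its hex form, offset from WSABASEERR and name."""
    offset = code - WSABASEERR
    return {
        "hex": f"0x{code:08x}",
        "offset": offset,
        "name": KNOWN_OFFSETS.get(offset, "unknown"),
    }


print(decode_wsaerr(10049))
```

The hex form matches the value in brackets in the HORCM message (10049 is 0x00002741), so either number can be used to look the error up.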

UNIX

UNIX error messages are not only different from the Windows ones, they also differ on each platform. Here is the same error on Solaris:

[System Call Error]
SysCall: bind
Errorno: 126 (Cannot assign requested address)
ErrInfo: Internal Error
ErrTime: Tue Sep 2 11:45:40 2008
SrcFile: shorcmcc
SrcLine: 2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant line for this error (EADDRNOTAVAIL) says:

Platform      Value   Message
AIX 4351      68      Can't assign requested address
HP-UX 1122    227     Can't assign requested address
Solaris 910   126     Can't assign requested address

Once again, this is not the most intuitive error I have seen.
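Because the number changes from platform to platform, it helps to look errors up by name rather than by value. Here is a minimal Python sketch of such a lookup. The table only contains the EADDRNOTAVAIL values quoted in this document (plus the Winsock value from the Windows example); `platforms_for` and `local_value` are hypothetical helper names.

```python
import errno
import os

# EADDRNOTAVAIL as reported on each platform, values taken from this document.
EADDRNOTAVAIL_BY_PLATFORM = {
    "AIX": 68,
    "HP-UX": 227,
    "Solaris": 126,
    "Windows (Winsock)": 10049,
}


def platforms_for(value: int) -> list:
    """Return the platforms on which this number means EADDRNOTAVAIL."""
    return sorted(p for p, v in EADDRNOTAVAIL_BY_PLATFORM.items() if v == value)


def local_value() -> int:
    """Return the EADDRNOTAVAIL number used by the machine running this script."""
    return errno.EADDRNOTAVAIL


print("126 means EADDRNOTAVAIL on:", platforms_for(126))
print("this machine uses", local_value(), "-", os.strerror(local_value()))
```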

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42:

12:52:23-16b48-04004- horcread(): cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here I think it is pretty obvious what the problem is.


3. Invalid service name

Change 11042 to “horcm42”.

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.
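A quick way to pre-check the service field of HORCM_MON before starting HORCM is sketched below in Python. This is an assumption about what a sensible check looks like, not a description of CCI internals: `resolve_service` is a hypothetical helper that accepts either a numeric port or a name that the operating system's services database can resolve for UDP.

```python
import socket
from typing import Optional


def resolve_service(service: str, proto: str = "udp") -> Optional[int]:
    """Return the port for a numeric or named service, or None if it is unusable."""
    if service.isdecimal():
        port = int(service)
        return port if 0 < port < 65536 else None
    try:
        # Looks the name up in the local services database (e.g. the services file).
        return socket.getservbyname(service, proto)
    except OSError:
        return None


for candidate in ("11042", "horcm42"):
    port = resolve_service(candidate)
    if port is None:
        print(f"{candidate}: not a valid port and not a known service name")
    else:
        print(f"{candidate}: UDP port {port}")
```

On a machine where “horcm42” has not been added to the services file, the second candidate is reported as unknown.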

4. UDP port which is in use

Change 11042 to 1030. This is not a “sensible” port number. It was chosen to cause an error.

[System Call Error]
SysCall: bind
WSAerr: 10013(0x0000271d) (See winsock2.h)
ErrInfo: Internal Error
ErrTime: Mon Sep 08 17:39:46 2008
SrcFile: shorcmcc
SrcLine: 2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

Proto Local Address Foreign Address State
UDP ml_acer510:microsoft-ds *:*
UDP ml_acer510:isakmp *:*
UDP ml_acer510:1030 *:*
…
UDP ml_acer510:54323 *:*

It is not likely that you will be this lucky.
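If Python happens to be available on the server, both the wrong-address case and the port-in-use case can be reproduced outside HORCM by attempting the same kind of UDP bind. The sketch below is illustrative only: `try_bind_udp` is a hypothetical helper, the loopback address is a stand-in for the ip_address from HORCM_MON, and the error code you get back depends on the platform (the Windows example above reported WSAEACCES rather than an “address in use” code).

```python
import socket
from typing import Optional


def try_bind_udp(ip: str, port: int) -> Optional[OSError]:
    """Try to bind a UDP socket to ip:port; return None on success, else the error."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip, port))
        return None
    except OSError as exc:
        return exc
    finally:
        sock.close()


# Stand-in values: replace with the ip_address and service port from HORCM_MON.
error = try_bind_udp("127.0.0.1", 0)   # port 0 lets the OS pick any free port
if error is None:
    print("bind succeeded")
else:
    print("bind failed:", error.errno, error.strerror)
```

A failure here with the real address and port points at the network configuration or at another process holding the port, rather than at HORCM itself.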


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008


Page 19: How to Debug CCI Issues 1.3

How To Debug CCI Issues ndash Version 13

VG01 d3(R) (CL1-A 1 413)75010010 413S-VOL PAIR NEVER ----- 413 -VG01 d4(L) (CL2-D 1 414)77010027 414P-VOL PAIR NEVER 75010010 414 -VG01 d4(R) (CL1-A 1 414)75010010 414S-VOL PAIR NEVER ----- 414 -

As you can see LDEVs 410-414 on an AMS1000 (SN begins with 770x) are paired with LDEVs 410-414 on an AMS500 (SN begins with 750x)

Here is the same pairdisplay output when HORCMPROMOD has been set on both CCI servers

CHORCMetcgtset HORCMPROMOD=1CHORCMetcgthorcmstart 410starting HORCM inst 410HORCM inst 410 starts successfullyCHORCMETCgtset horcminst=410CHORCMetcgtpairdisplay -g VG01Group PairVol(LR) (PortTID LU)SeqLDEVPSStatusFenceSeqP-LDEV MVG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -VG01 d3(L) (CL2-D 1 413)77010027 413P-VOL PAIR NEVER 75010010 413 -VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -VG01 d4(L) (CL2-D 1 414)77010027 414P-VOL PAIR NEVER 75010010 414 -VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

As you can see the local CCI instance (L) has access to all its LUNsLDEVs However the DR CCI server (R) has no access to LDEVs 410-414

If you attempt to do any commands such as pairsplit the following will happen

CHORCMetcgtpairsplit -g VG01pairsplit [EX_ENPERM] Permission denied with the LDEVRefer to the command log(CHORCMlog410horcc_Verdande_logtxt) for details

You can use HORCMPERMCONF ( is the instance number) to further limit CCI access HORCMPERMCONF does not give you access to LDEVs that you are not allowed to process It removes access to LDEVs that you are allowed to process but do not wish to process

How does this work Letrsquos start instance 410 with HORCMPROMOD=1 and no HORCMPERMCONF file At the bottom of the start up log you will see this

110148-518b0-02092- HORCM has been set to the PROTECT MODE on ENV110150-e2900-01428- horcmgr executed CreateProcess(raidscanexe -pi $PhysicalDrive -find inst -z2w=NUL -z1w=NUL)exit = 0

As you can see raidscan is called internally with an argument of -pi $PhysicalDrive (all physical drives) ndash thus allowing all LUNs on this server to be accessed

Now letrsquos stop horcm and define a file as follows

CHORCMetcgttype CWINDOWSHORCMPERM410CONFhd0-56

CHORCMetcgt

Mike Le Voi Page 19 11042023

How To Debug CCI Issues ndash Version 13

HORCMPERM410CONF contains a list of every device that we wish to be able to access via CCI Here is the resultant pairdisplay after a restart of horcm

CHORCMetcgtpairdisplay -g VG01Group PairVol(LR) (PortTID LU)SeqLDEVPSStatusFenceSeqP-LDEV MVG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -VG01 d3(L) (CL2-D 1 413)77010027 ---- ---- ----------- ----- -VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -VG01 d4(L) (CL2-D 1 414)77010027 ---- ---- ----------- ----- -VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

The bold lines show what has changed Here is the bottom of the start up log file

110803-7d3e8-02408- horcmgr executed CreateProcess(raidscanexe -find inst -z0r=CWINDOWShorcmperm410conf -z2w=NUL -z1w=NUL)exit = 0

As you can raidscan has been called internally and is using the list of disks in HORCMPERM410CONF to determine which LDEVs can be accessed Here is some inqraid output

CHORCMetcgtinqraid $LETALL -CLIDEVICE_FILE PORT SERIAL LDEV CTG HM12 SSID RGroup PRODUCT_IDEVol13Dsk54 CL2-D 77010027 410 - Psss 0000 A07-00 DF600FFVol14Dsk55 CL2-D 77010027 411 - Psss 0000 A07-00 DF600FQVol11Dsk12 CL1-B 3157 169 - Psss 0000 502-00 DF600FGVol15Dsk56 CL2-D 77010027 412 - Psss 0000 A07-00 DF600FRVol12Dsk13 CL1-B 3157 170 - Psss 0000 502-00 DF600FHVol16Dsk57 CL2-D 77010027 413 - Psss 0000 A07-00 DF600FIVol17Dsk58 CL2-D 77010027 414 - Psss 0000 A07-00 DF600FJVol2Dsk0 - - - - - - - ST336754LC

The bold lines show that LDEVs 413 and 414 are Physical Drives 57 and 58 ndash and as we only allowed access to Physical Drives 0-56 this explains why the pairdisplay has changed

Note that it is possible to ldquofixrdquo this ldquomistakerdquo by manual use of the raidscan command as follows

CHORCMetcgtecho hd57-58 | raidscan -find instDEVICE_FILE Group PairVol PORT TARG LUN M SERIAL LDEVHarddisk57 VG01 d3 CL2-D 1 413 0 77010027 413Harddisk57 VG01 d3 CL2-D 1 413 - 77010027 413Harddisk58 VG01 d4 CL2-D 1 414 0 77010027 414Harddisk58 VG01 d4 CL2-D 1 414 - 77010027 414

CHORCMetcgtpairdisplay -g VG01Group PairVol(LR) (PortTID LU)SeqLDEVPSStatusFenceSeqP-LDEV MVG01 d0(L) (CL2-D 1 410)77010027 410P-VOL PAIR NEVER 75010010 410 -VG01 d0(R) (CL1-A 1 410)75010010 ---- ---- ----------- ----- -VG01 d1(L) (CL2-D 1 411)77010027 411P-VOL PAIR NEVER 75010010 411 -VG01 d1(R) (CL1-A 1 411)75010010 ---- ---- ----------- ----- -VG01 d2(L) (CL2-D 1 412)77010027 412P-VOL PAIR NEVER 75010010 412 -VG01 d2(R) (CL1-A 1 412)75010010 ---- ---- ----------- ----- -VG01 d3(L) (CL2-D 1 413)77010027 413P-VOL PAIR NEVER 75010010 413 -VG01 d3(R) (CL1-A 1 413)75010010 ---- ---- ----------- ----- -VG01 d4(L) (CL2-D 1 414)77010027 414P-VOL PAIR NEVER 75010010 414 -

Mike Le Voi Page 20 11042023

How To Debug CCI Issues ndash Version 13

VG01 d4(R) (CL1-A 1 414)75010010 ---- ---- ----------- ----- -

Of course you are unlikely to fix such an issue with raidscan You would normally fix HORCMPERMCONF and then stop and restart horcm

ldquoBasicrdquo HORCM CONF problems

When HORCM will not start you strip the CONF file back to the bare essentials ndash and then change one thing at a time Sometimes even this fails Here are the most common reasons

HORCM_MONip_address service poll(10ms) timeout(10ms) 101293127 11042 1000 3000

HORCM_CMDdev_name CMDDEV0 - USP600 - SN 10111 - CMD-10111-4CMD-10111-4

The above file is correct ndash let us make some simple changes to break it

1. Wrong IP Address

Change 101293127 to 101292127. A simple typo, but here is what you get.

Windows

[System Call Error]
SysCall:  bind
WSAerr:   10049(0x00002741) (See winsock2.h)
ErrInfo:  Internal Error
ErrTime:  Mon Sep 08 12:43:03 2008
SrcFile:  shorcmcc
SrcLine:  2405

ERROR:cmr_repcre[scmcrepcr fail]

Of course, it is the “Internal Error” that confuses most people here. The real error is in the line above it. It is the result of a standard call to an OS socket service, in this case Winsock.

Here is the relevant section from winsock2.h:

#define WSABASEERR 10000
…
#define WSAEADDRNOTAVAIL (WSABASEERR+49)

Here is a useful web page, and some useful information from it:

http://www.sockets.com/err_lst1.htm

WSAEADDRNOTAVAIL (10049) Cannot assign requested address

Berkeley description: Normally results from an attempt to create a socket with an address not on this machine.
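The arithmetic behind the WSAerr value can be checked with a few lines of Python. This is a minimal sketch and not part of CCI: the function name decode_wsa_error and the lookup table are illustrative assumptions, and the table holds only the two Winsock codes that appear in this document.

```python
# Winsock error codes are defined as offsets from WSABASEERR (see winsock2.h).
WSABASEERR = 10000

# Only the two codes discussed in this document; this is not a complete table.
WSA_ERRORS = {
    WSABASEERR + 13: ("WSAEACCES", "Permission denied"),
    WSABASEERR + 49: ("WSAEADDRNOTAVAIL", "Cannot assign requested address"),
}


def decode_wsa_error(code):
    """Return (name, description) for a WSAerr value, or None if it is not in the table."""
    return WSA_ERRORS.get(code)


if __name__ == "__main__":
    for code in (10049, 10013):
        name, text = decode_wsa_error(code)
        # Show the decimal value, the hex form used in the log, and the offset.
        print(f"{code} (0x{code:08x}) = WSABASEERR+{code - WSABASEERR}: {name}, {text}")
```

The hex form printed here is the same one that appears in brackets after the WSAerr number in the log, which makes it easy to match a log entry against the header file.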


So the error is obvious when you know where to look. The problem is that not many people know where to look.

UNIX

UNIX error messages are not only different from the Windows ones; they also differ on each platform. Here is the same error on Solaris.

[System Call Error]
SysCall:  bind
Errorno:  126 (Cannot assign requested address)
ErrInfo:  Internal Error
ErrTime:  Tue Sep 2 11:45:40 2008
SrcFile:  shorcmcc
SrcLine:  2427

ERROR:cmr_repcre[scmcrepcr fail]

Here is a useful web page:

http://www.ioplex.com/~miallen/errcmpp.html

The relevant entry for this error (EADDRNOTAVAIL) says:

Platform        Errno   Message
AIX 4.3, 5.1    68      Can't assign requested address
HP-UX 11.22     227     Can't assign requested address
Solaris 9, 10   126     Can't assign requested address

Once again, this is not the most intuitive error I have seen.
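Because the number differs by platform, it is usually safer to look up the symbolic name on the host that produced the log. The following is a small sketch using Python's standard errno and os modules; the helper name describe_errno is an assumption made for illustration, and the number it reports depends on the operating system it runs on.

```python
import errno
import os


def describe_errno(number):
    """Return (symbolic_name, message) for an errno value on the local platform."""
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    # The numeric value of EADDRNOTAVAIL is platform specific, so print it
    # rather than assuming 68, 227 or 126.
    number = errno.EADDRNOTAVAIL
    name, message = describe_errno(number)
    print(f"{name} = {number} on this platform: {message}")
```

Run on the affected host, passing the Errorno value from the log to describe_errno gives the symbolic name without needing a cross-platform table.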

2. Invalid CMDDEV

Here is what you get if you change the CMDDEV to CMD-10111-42.

12:52:23-16b48-04004- horcread(): cannot open command device CMD-10111-42
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
12:52:23-16b48-04004- No device is ready for receiving a command in 1 line from HORC_CMD
12:52:23-16b48-04004- ERROR:horcm_cfg_create
12:52:28-0b3b0-01136- horcmgr: Failed to connect to HORCM

Here I think it is pretty obvious what the problem is.
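When the start-up log is long, it can help to pull out only the lines that report a problem. The sketch below is an illustration, not a CCI tool. It assumes each log line starts with a prefix of the form HH:MM:SS, a hexadecimal field and a numeric field, as in the samples in this document; the meaning of the two middle fields is not stated here, so they are treated as opaque. The names LINE_RE, KEYWORDS and problem_lines are hypothetical.

```python
import re

# Assumed prefix layout: HH:MM:SS-<hex field>-<numeric field>- message
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})-([0-9a-f]+)-(\d+)- (.*)$")
KEYWORDS = ("ERROR", "WARNING")


def problem_lines(log_text):
    """Yield (time, message) for log lines whose message contains a keyword."""
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and any(word in match.group(4) for word in KEYWORDS):
            yield match.group(1), match.group(4)


# Sample lines taken from the log extracts in this document.
SAMPLE = """\
17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
12:52:23-16b48-04004- [WARNING] This device(CMD-10111-42) is not ready for receiving a command
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
"""

if __name__ == "__main__":
    for when, message in problem_lines(SAMPLE):
        print(when, message)
```

The keyword list is deliberately short. Extend it to suit the messages you see in practice.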


3. Invalid service name

Change 11042 to “horcm42”.

17:29:02-d59f8-02260- [horcmcfgrdf] open(conf_file) OK
17:29:02-d59f8-02260- ERROR: A wrong ipaddr or servicename line exists in HORCM_MON line 4
17:29:02-d59f8-02260- 101293127 horcm42 1000 3000
17:29:02-d59f8-02260- [horcmcfgrdf] close(conf_file) OK
17:29:02-d59f8-02260- ERROR:horcm_cfg_create

Once again, it is more obvious what is wrong.
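A service entry can be sanity-checked before HORCM is started. The sketch below assumes that a service name in HORCM_MON must be resolvable through the operating system's services database (for example /etc/services on UNIX), which is how such names are normally resolved; that CCI performs exactly this lookup is an assumption. The function name resolve_udp_service is hypothetical.

```python
import socket


def resolve_udp_service(service):
    """Return the UDP port for a numeric string or a registered service name, else None."""
    if service.isdigit():
        port = int(service)
        # Accept only values in the valid port range.
        return port if 0 < port < 65536 else None
    try:
        # Looks the name up in the local services database.
        return socket.getservbyname(service, "udp")
    except OSError:
        return None


if __name__ == "__main__":
    for entry in ("11042", "horcm42"):
        print(entry, "->", resolve_udp_service(entry))
```

If the second lookup prints None, the name is not registered for UDP on that host, which would be consistent with the error shown above.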

4. UDP port which is in use

Change 11042 to 1030. This is not a “sensible” port number. It was chosen to cause an error.

[System Call Error]
SysCall:  bind
WSAerr:   10013(0x0000271d) (See winsock2.h)
ErrInfo:  Internal Error
ErrTime:  Mon Sep 08 17:39:46 2008
SrcFile:  shorcmcc
SrcLine:  2405

ERROR:cmr_repcre[scmcrepcr fail]

Here is the relevant section from winsock2.h:

#define WSAEACCES (WSABASEERR+13)

The following web page has more information:

http://www.sockets.com/err_lst1.htm

WSAEACCES (10013) Permission denied

Berkeley description: An attempt was made to access a file in a way forbidden by its file access permissions.

However, in this case that is hardly descriptive of the problem. Of course, if one had access to a command prompt, one could do this:

C:\HORCM\ETC>netstat -a -p UDP

Active Connections

  Proto  Local Address            Foreign Address  State
  UDP    ml_acer510:microsoft-ds  *:*
  UDP    ml_acer510:isakmp        *:*
  UDP    ml_acer510:1030          *:*
  …
  UDP    ml_acer510:54323         *:*

It is not likely that you will be this lucky.
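Where netstat output is not available but a script can be run on the host, the same question can be asked directly: can a UDP socket be bound to this address and port right now? The following is a minimal sketch using Python's standard socket module. The function name udp_port_is_free is hypothetical, and the check is only a snapshot, since another process may take the port immediately afterwards.

```python
import socket


def udp_port_is_free(host, port):
    """Return True if a UDP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # bind failed: port in use, address not local, or not permitted.
            return False
    return True


if __name__ == "__main__":
    # Occupy an ephemeral port, then show that the probe reports it as taken.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as holder:
        holder.bind(("127.0.0.1", 0))
        taken = holder.getsockname()[1]
        print(taken, udp_port_is_free("127.0.0.1", taken))
```

A False result does not say why the bind failed. As the two bind errors in this document show, a wrong IP address and a port already in use both surface as a failed bind call, so the errno or WSAerr value is still needed to tell them apart.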


Comments

This is a work in progress. If you would like to see anything else, let me know.

Mike Le Voi
Software Technical Specialist
APAC Global Support Centre
8th September 2008






