Operator Messages Manual

Chapter 92 SMN (External ServerNet SAN Subsystem) Messages

The messages in this chapter are generated by the HP NonStop™ External ServerNet System Area Network (SAN) subsystem. The subsystem ID displayed by these messages includes SMN as the subsystem name. The External ServerNet SAN subsystem (SMN) is managed and monitored by an External ServerNet SAN manager ($ZZSMN, sometimes referred to as SANMAN) process that runs in every system connected to a ServerNet Cluster.

NOTE: Negative-numbered messages are common to most subsystems. If you receive a negative-numbered message that is not described in this chapter, see Chapter 15.


4001

The External ServerNet SAN Manager process, process-name, has started in processor cpunum. Program file: filename Priority: pri Autorestart count: count Processor list: (first-processor-in-list [next-processor-in-list, ..., last-processor-in-list])

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

cpunum

is the number of the processor in which the primary External ServerNet SAN manager process has started.

filename

is the name of the program file for the External ServerNet SAN manager process.

pri

is the priority at which the External ServerNet SAN manager process is running.

count

is the autorestart count configured for the External ServerNet SAN manager process.

first-processor-in-list ... last-processor-in-list

is the processor list configured for the External ServerNet SAN manager process.

Cause   The External ServerNet SAN manager process has started.

Effect  The External ServerNet SAN Manager process is running.

Recovery  This is an informational message. No corrective action is required.



4002

The External ServerNet SAN Manager process, process-name, has terminated. Reason: reason

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

reason

indicates the reason the process terminated:

111  Process stopped by operator
112  Environmental problem

Cause  The External ServerNet SAN manager process terminated voluntarily. Either it was terminated by an operator command, or an environmental problem caused it to self-terminate. If this event is due to self-termination, an associated SMN 4010 (SANMAN-Additional-Information) message reports the environmental problem found by the External ServerNet SAN manager process.

Effect  The External ServerNet SAN manager process is no longer running.

Recovery  If this event is due to self-termination, follow recovery instructions for the SMN 4010 event. After correcting any environmental problems, restart the External ServerNet SAN manager process with an operator command.



4003

Process process-name: Primary processor cpunum.

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

cpunum

is the number of the processor in which the primary External ServerNet SAN manager process is running.

Cause  Either the External ServerNet SAN manager process was initialized for the first time, or a backup process has become the primary process.

Effect  The External ServerNet SAN manager process is running in the indicated processor.

Recovery  This is an informational message. No corrective action is required.



4004

Process process-name: Backup process created in processor cpunum.

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

cpunum

is the number of the processor in which the backup External ServerNet SAN manager process is running.

Cause  The External ServerNet SAN manager process has successfully created a backup process.

Effect  The External ServerNet SAN manager process is no longer vulnerable to a single failure.

Recovery  This is an informational message. No corrective action is required.



4005

Process process-name: Unable to create backup in processor cpunum. Process creation error: errnum Error detail: err-detail

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

cpunum

is the number of the processor in which the backup process-creation attempt was made.

errnum

is the Guardian process-creation error number.

err-detail

is the error detail subcode returned with the Guardian process-creation error.

Cause  An attempt to create an External ServerNet SAN manager backup process has failed. For information on process-creation errors and error detail subcodes, see the Guardian Procedure Errors and Messages Manual.

Effect  Until a backup process is started, External ServerNet SAN management is vulnerable to a single failure. The External ServerNet SAN manager process attempts to start a backup process immediately if any processor in its processor list other than that used by the primary process is running. The External ServerNet SAN manager process makes two restart attempts in each processor eligible to contain the backup process. Each failed attempt results in an SMN 4005 message. If all restart attempts fail, an SMN 4007 message is generated.

Recovery  This is an informational message. Although no corrective action is required, the message might provide information for recovery in the event of an SMN 4007 message.
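
The backup-creation policy described under Effect for this message (and repeated for SMN 4006 and SMN 4008) can be pictured with a small simulation. This is a minimal sketch under stated assumptions, not SANMAN's implementation: `start_backup` and the `try_create_backup` callback are hypothetical names standing in for Guardian process creation, and the event tuples are illustrative only. The sketch does not model the reason code carried by SMN 4007.

```python
from typing import Callable, List, Optional, Set, Tuple

ATTEMPTS_PER_PROCESSOR = 2  # "two restart attempts in each processor" (from the text)

Event = Tuple[int, Optional[int]]  # (SMN message number, processor or None)


def start_backup(processor_list: List[int],
                 primary_cpu: int,
                 running: Set[int],
                 try_create_backup: Callable[[int], bool]
                 ) -> Tuple[List[Event], Optional[int]]:
    """Simulate backup creation; return (events, backup processor or None)."""
    events: List[Event] = []
    # Eligible processors: configured in the processor list, currently
    # running, and not the processor used by the primary process.
    eligible = [cpu for cpu in processor_list
                if cpu != primary_cpu and cpu in running]
    for cpu in eligible:
        for _ in range(ATTEMPTS_PER_PROCESSOR):
            if try_create_backup(cpu):       # hypothetical creation call
                events.append((4004, cpu))   # backup process created
                return events, cpu
            events.append((4005, cpu))       # one event per failed attempt
    events.append((4007, None))              # running without a backup
    return events, None
```

For example, with processor list 0, 1, 2, the primary in processor 0, and creation succeeding only in processor 2, the sketch yields two SMN 4005 events for processor 1 followed by an SMN 4004 event for processor 2.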



4006

Process process-name: Backup process in processor cpunum failed.

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

cpunum

is the number of the processor in which the backup External ServerNet SAN manager process had been running.

Cause  The backup process of the External ServerNet SAN manager process pair failed.

Effect  Until a backup process is started, External ServerNet SAN management is vulnerable to a single failure. The External ServerNet SAN manager process attempts to start a new backup process immediately if any processor in its processor list other than that used by the primary process is running. The External ServerNet SAN manager process makes two restart attempts in each processor eligible to contain the backup process. Each failed attempt results in an SMN 4005 message. If all restart attempts fail, an SMN 4007 message is generated.

Recovery  This is an informational message. Although no corrective action is required, the message might provide information for recovery in the event of an SMN 4007 message.



4007

The External ServerNet SAN Manager process, process-name, is running without a backup. Reason: reason.

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

reason

indicates why there is no backup process:

101  No Processor Available
102  Excess Failed Start Attempts
103  Backup Creation Failure

Cause  Either there is no processor available for running the backup process, or there have been multiple failures of the backup process or of attempts to create a backup process. If this event is due to backup process-creation failures, there are associated SMN 4005 messages. Other possible precursors are SMN 4006 and SMN 4008 messages.

Effect  The External ServerNet SAN manager process runs without a backup, and the External ServerNet SAN subsystem is vulnerable to a single failure. Whenever a processor in its processor list is reloaded, the External ServerNet SAN manager process attempts to create a backup there.

If this event is caused by repeated backup failures or backup process-creation failures, the External ServerNet SAN manager process periodically attempts to create a backup.

Recovery  Either reload the processors on the processor list for the External ServerNet SAN manager process, or use the information in the associated SMN 4005, 4006, or 4008 messages to determine the cause and recovery actions for backup process-creation failures. To list the processors that have been configured in the processor list for the External ServerNet SAN manager process, issue an SCF INFO command.



4008

The External ServerNet SAN Manager process, process-name, backup process terminated. Reason: reason.

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

reason

indicates why the backup process was terminated:

100  Process stopped by operator
101  Backup Processor Is Down
102  Checkpoint error

Cause  The backup process failed, or it was terminated by the primary (for example, if the primary found a fatal error when checkpointing to the backup).

Effect  Until a backup process is started, External ServerNet SAN management is vulnerable to a single failure. The External ServerNet SAN manager process attempts to start a new backup process immediately if any processor in its processor list other than that used by the primary process is running. The External ServerNet SAN manager process makes two restart attempts in each processor eligible to contain the backup process. Each failed attempt results in an SMN 4005 message. If all restart attempts fail, an SMN 4007 message is generated.

Recovery  This is an informational message. Although no corrective action is required, the message might provide information for recovery in the event of an SMN 4007 message.



4009

External ServerNet SAN Manager internal trace entry trace-entry.

trace-entry

contains an internally defined trace record in hexadecimal format.

Cause  An internal trace was initiated on the External ServerNet SAN manager process.

Effect  Trace data has been dumped into the EMS log. The External ServerNet SAN manager state is unchanged.

Recovery  This is an informational message. No corrective action is required.



4010

External ServerNet SAN Manager process, process-name, reports problem.

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

problem

describes the environmental problem. Possible values are:

1  Wrong Process Name
2  Bad CPU-LIST
3  Wrong Processor
4  Not Running As SUPER.SUPER
5  Internal Error
6  Nested Signal
8  Unsupported System Topology

Cause  The External ServerNet SAN manager process found an environmental problem. Possibly the processor list in the startup message is invalid, the process name is wrong (not $ZZSMN), the process was not started under the SUPER.SUPER user ID, an internal coding error occurred, or the process was inadvertently started in a system type that does not require or support the External ServerNet SAN manager process. This message is usually followed by a termination message.

Effect  This is an informational message, but normally the External ServerNet SAN manager process will terminate after detecting an environmental problem.

Recovery  Corrective action might be required to correct the environmental problem:

  • If problem is “Wrong Process Name” or “Bad CPU-LIST,” alter the External ServerNet SAN manager process startup parameters (generic process configuration under SCF).

  • If problem is “Wrong Processor,” correct the External ServerNet SAN manager process startup parameters, or configure and start the External ServerNet SAN manager process through SCF. This error occurs if the External ServerNet SAN manager process was started manually from a TACL prompt on a processor that is not in its processor list.

  • If problem is “Not Running As SUPER.SUPER,” configure and start the External ServerNet SAN manager process under $ZZKRN as a generic process through SCF. This error occurs if the External ServerNet SAN manager process was started manually from a TACL prompt by a user other than SUPER.SUPER.

  • If problem is “Internal Error” or “Nested Signal,” the External ServerNet SAN manager process terminates and is restarted automatically. Submit the ZZSA* savefile to your service provider for analysis of this problem.

  • If problem is “Unsupported System Topology,” abort and delete the External ServerNet SAN manager process by issuing the SCF ABORT PROCESS $ZZKRN.#ZZSMN and SCF DELETE PROCESS $ZZKRN.#ZZSMN commands. This error occurs if $ZZSMN is started in a system type that does not require or support the External ServerNet SAN manager process.
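
The problem values and recovery actions above lend themselves to a simple lookup, for example in a script that annotates EMS output. The following is a minimal sketch: the codes, names, and actions are taken from this message description, while the `PROBLEMS` table and the `recovery_for` function are hypothetical names introduced only for illustration.

```python
# Problem code -> (name, condensed recovery action) for SMN 4010.
# The wording of each action is abbreviated from the bullet list above.
PROBLEMS = {
    1: ("Wrong Process Name",
        "Alter the startup parameters (generic process configuration under SCF)."),
    2: ("Bad CPU-LIST",
        "Alter the startup parameters (generic process configuration under SCF)."),
    3: ("Wrong Processor",
        "Correct the startup parameters, or configure and start the process through SCF."),
    4: ("Not Running As SUPER.SUPER",
        "Configure and start the process under $ZZKRN as a generic process through SCF."),
    5: ("Internal Error",
        "Restart is automatic. Submit the ZZSA* savefile to your service provider."),
    6: ("Nested Signal",
        "Restart is automatic. Submit the ZZSA* savefile to your service provider."),
    8: ("Unsupported System Topology",
        "Abort and delete the process: SCF ABORT PROCESS $ZZKRN.#ZZSMN, "
        "then SCF DELETE PROCESS $ZZKRN.#ZZSMN."),
}


def recovery_for(code: int) -> str:
    """Return 'name: action' for a problem code, or a placeholder if unlisted."""
    name, action = PROBLEMS.get(
        code, ("Unlisted problem", "No recovery is documented for this value."))
    return f"{name}: {action}"
```

Value 7 is not listed in this chapter, so the sketch reports it as unlisted rather than guessing a meaning.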



4011

External ServerNet SAN Manager process, process-name, version is not compatible with the current version of the driver-type driver.

process-name

is the name of the External ServerNet SAN manager process ($ZZSMN).

driver-type

is an enumerated value designating the driver that SANMAN has determined incompatible with its current version. Possible values are IBC and SMC.

Cause  SANMAN made a comparison of its own version and that of the NonStop Kernel IBC and SMC drivers that it is using. One or both of the versions was determined to be incompatible.

Effect  The External ServerNet SAN Manager process terminates.

Recovery  Check SPR requisites for NonStop Kernel (T9050) and SANMAN (T0502). The operator needs to ensure that the versions of SANMAN and the NonStop Kernel IBC and SMC drivers are compatible.



4012

DSM Trace error err, err-detail. Operation op.

err

is the error code returned by the DSM trace routine.

err-detail

is the error detail returned by the DSM trace routine.

op

is the operation that was being performed when the error was encountered. Possible values are:

Code  op       Description
1     Init     The error occurred in the DSM_TRACE_INIT_ function call.
2     Start    The error occurred in the DSM_TRACE_NEW_ (start trace) function call.
3     Version  The error occurred in the DSM_TRACE_NEW_ (set version) function call.
5     Record   The error occurred in the DSM_TRACE_NEW_ (add record) function call.
6     Stop     The error occurred in the DSM_TRACE_NEW_ (stop trace) function call.

Cause  The External ServerNet SAN manager process encountered an error using the DSM trace routines.

Effect  Any pending trace is terminated.

Recovery  Investigate the cause of the error. Reissue the SCF TRACE command. If the problem persists, contact your service provider.



4101

External ServerNet fabric fabric not found due to cause [ {MSEB | Cluster Connectivity CRU} Location group.module.slot. ]

fabric

identifies the ServerNet fabric (X or Y) that was not found:

1  None
2  X
3  Y

cause

contains the cause code identifying why the external ServerNet fabric was not found. Possible values are:

-1  Undefined Cause
0   No Error
1   MSEB, Cluster ConnCRU, or Module missing
2   Link Dead
3   No Response
4   Processor Fabric Down
5   CRU Type Not MSEB or Wrong CRU Type
6   No NNA Plug-In Card or Cluster Conn PIC Missing
7   Wrong External Fabric
9   Bad Switch Port Number
10  Bad SCB Loaded
11  Bad Switch PIC Type
12  Bad Switch GUID
16  Node Number Mismatch
17  NNA Verify Fail
18  SP I/O Library Call Error
19  Power Up
20  MSEB Config Record Not Found or Cluster Conn CRU Cfg Rec Not Found
21  Bad MSEB Config Record or Bad Cluster Conn CRU Cfg Record
22  MSEB Config Record Fetch Error or Cluster Conn CRU Cfg Rec Fetch Err
23  Internal System Fabric Down
24  Both Fabric LEDs Are Set
25  TNet Initialization Error
26  Invalid IBCD Fabric Parameter
27  IBCD Switch Limit Exceeded
28  Bad Packetizer
29  Both Configurations Invalid
30  Read Pointer Directory Error
31  Wrong Configuration Version ID
32  Bad Node ID Bit Mask
33  Bad Node Routing ID
34  Invalid Pointer Directory Address
35  Invalid Barrier Address
36  Invalid Packetizer ServerNet ID
37  Invalid ServerNet Speed
38  Switch Object Not Found
39  Bad RDMA Switch State
40  Link Bundle Already In Use
41  SMC Return No space
42  SMC Return Error
43  Bad Numeric Selector Value
44  Invalid Firmware Response
45  MSEB disabled
46  Router 3 Port Not Found

group.module.slot

is the slot location in which the Modular ServerNet Expansion Board (MSEB) is configured.

Cause  This message is generated by the External ServerNet SAN manager when there is an attempt to discover an external fabric, but the discovery fails. Discovery of an external fabric may fail due to the following causes:

  • SANMAN detected a configuration error in the local node (for example, an MSEB, NNA PIC, Cluster Connectivity CRU, or Cluster Connectivity PIC is missing) that prevents it from sending an In-Band Control (IBC) request to discover an external fabric

  • SANMAN detected an initialization error when installing a logical device for a nearest switch, and consequently could not send an IBC request to discover an external fabric

  • SANMAN detected an error when sending an IBC request to discover an external fabric

  • An IBC response has not been received from an external ServerNet fabric despite several retries

  • An IBC response was received from the external ServerNet fabric, but SANMAN detected an incompatible or incorrect value in one of the response data fields

SANMAN tries to discover an external fabric in the following cases:

  • During External ServerNet SAN manager initialization

  • When the backup External ServerNet SAN manager takes over

  • When the External ServerNet SAN manager perceives an environmental change that warrants an attempt to discover an external fabric (for example, an MSEB CRU or Cluster Connectivity CRU insertion event, or the return of link alive on the cable that connects the system to an external fabric)

  • Periodically

Effect  The external ServerNet fabric discovery fails. The External ServerNet SAN manager process will not attempt to bring up the physical ServerNet connection between the node and the external fabric. The node will not be able to communicate with other remote nodes via that fabric. SANMAN will continue to perform periodic attempts to discover the fabric, but in most cases these will succeed only after the condition that caused the external fabric discovery failure is corrected.

Recovery  Recovery is dependent on the cause:

cause  Recovery
MSEB Missing (1)  Install an MSEB in slot 1.1.51 or 1.1.52.
Cluster Connectivity CRU Missing (1)  Install a p-switch logic board in slot 100.2.14 or 100.3.14. Install a cluster switch PIC in the p-switch in slot 100.2.2 or 100.3.2.
Link Dead (2)  Check the fiber-optic cable from the local node to the nearest switch.
No Response (3)  Contact your service provider.
Processor Fabric Down (4)  Use the OSM action to switch the SANMAN primary and backup processors. Alternatively, check the status of the internal fabric and bring up the fabric if necessary.
CRU Type Not MSEB (5)  Install an MSEB in slot 1.1.51 or 1.1.52.
Wrong CRU Type (5)  Slot 14 must contain an LB, and slot 2 must contain an SMF or MMF PIC.
No NNA Plug-In Card (6)  Install an MSEB with a Node Numbering Agent (NNA) plug-in card (PIC) in port 6 of the MSEB in slot 51 or 52 of group 01.
Cluster CRU PIC Missing (6)  An SMF or MMF PIC is needed in slot 2 of the p-switch, and a transceiver is needed in the port of the CRU in slot 2.
Wrong External Fabric (7)  Move the fiber-optic cable to the nearest switch on the other fabric, using the same switch port.
Bad Switch Port Number (9)  Move the fiber-optic cable to a supported port on the nearest switch (0 through 7 for an HP NonStop™ Cluster Switch (model 6770), or slots 6 through 9 for an HP NonStop™ ServerNet Switch (model 6780)).
Bad SCB Loaded (10)  Contact your service provider.
Bad Switch PIC Type (11)  A DUAL SMF or QUAD MMF PIC is needed in slot 2 of the p-switch.
Bad Switch GUID (12)  Unused in the current product version.
Node Number Mismatch (16)  Check the fiber-optic cables from the local node to the nearest switches, and make sure the same port is used on both switches.
NNA Verify Fail (17)  Use the OSM action to switch the SANMAN primary and backup processors.
SP I/O Library Call Error (18)  Recovery is automatic.
Power Up (19)  Recovery is automatic.
MSEB Config Record Not Found (20)  Install an MSEB containing an NNA PIC in slot 51 or 52 of group 01.
Cluster Conn CRU Cfg Rec Not Found (20)  LB in slot 14 or PIC in slot 2 (SMF or MMF).
Bad MSEB Config Record (21)  Install an MSEB containing an NNA PIC in slot 51 or 52 of group 01.
Bad Cluster Conn CRU Cfg Record (21)  LB in slot 14 or SMF/MMF in slot 2.
MSEB Config Record Fetch Error (22)  Unused in the current product version.
Cluster Conn CRU Cfg Rec Fetch Err (22)  Unused in the current product version.
Internal System Fabric Down (23)  Unused in the current product version.
Both Fabric LEDs Are Set (24)  Use the OSM Set LED action to alter the fabric setting of the nearest 6770 switch so that it matches the fabric the switch is on (X or Y).
TNet Initialization Error (25)  Use the OSM action to switch the SANMAN primary and backup processors.
IBC Driver Bad Fabric Parameter (26)  Use the OSM action to switch the SANMAN primary and backup processors.
IBC Driver Switch Limit (27)  Use the OSM action to switch the SANMAN primary and backup processors.
Bad Packetizer (28)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Both Configurations Invalid (29)  Contact your service provider.
Read Pointer Directory Error (30)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Wrong Configuration Version ID (31)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Bad Node ID Bit Mask (32)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Bad Node Routing ID (33)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Invalid Pointer Directory Address (34)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Invalid Barrier Address (35)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Invalid Packetizer ServerNet ID (36)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Invalid ServerNet Speed (37)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
Switch Object Not Found (38)  Contact your service provider.
Bad RDMA Switch State (39)  Contact your service provider.
Link Bundle Already In Use (40)  Unused in the current product version.
SMC Return No Space (41)  Use the OSM action to switch the SANMAN primary and backup processors.
SMC Return Error (42)  Contact your service provider.
Bad Numeric Selector Value (43)  Ensure that the numeric selector value configured in the nearest 6780 switch is valid for the topologies supported by the T0502 SPR running on the node and the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the switch.
Invalid Firmware Response (44)  Ensure that the T0502 SPR running on the node is compatible with the firmware (T2789), configuration (T2790), and FPGA microcode (T2819) SPRs running on the nearest 6780 switch.
MSEB Disabled (45)  Remove the MSEB from slot 51 or 52 of group 01 and reinsert it. If reinsertion does not help, the MSEB might need to be replaced.
Router3 Port Not Found (46)  Recovery is automatic.
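
Many of the cause codes in the recovery table share the same action, so a first-pass triage can group them. The sketch below is one possible grouping, assuming the table above is the only source: the class labels and the `recovery_class` function are hypothetical names invented for this example, and placing code 43 with the SPR compatibility checks is a judgment call.

```python
# Cause codes for SMN 4101 grouped by the kind of recovery the table lists.
# Class labels are illustrative; they are not product terminology.
RECOVERY_CLASSES = {
    "contact-service-provider": {3, 10, 29, 38, 39, 42},
    "switch-sanman-primary-and-backup": {4, 17, 25, 26, 27, 41},
    "check-spr-compatibility": {28, 30, 31, 32, 33, 34, 35, 36, 37, 43, 44},
    "automatic": {18, 19, 46},
    "unused-in-current-version": {12, 22, 23, 40},
    "local-hardware-or-cabling": {1, 2, 5, 6, 7, 9, 11, 16, 20, 21, 45},
    "switch-led-setting": {24},
}


def recovery_class(cause: int) -> str:
    """Return the triage class for a cause code, or 'unknown' if unlisted."""
    for label, codes in RECOVERY_CLASSES.items():
        if cause in codes:
            return label
    return "unknown"  # for example -1 (Undefined Cause) or 0 (No Error)
```

A script could use `recovery_class(33)` to route a Bad Node Routing ID event to whoever verifies SPR levels, while the full recovery text in the table remains the reference for the actual steps.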



4102

External ServerNet fabric fabric found. Node number assigned to the system on the fabric fabric: node. Nearest switch GUID: GUID. Nearest switch configuration tag: tag.

fabric

identifies the ServerNet fabric (X or Y) that was discovered:

1  None
2  X
3  Y

node

is the ServerNet node number.

GUID

is the globally unique ID (GUID) of the nearest switch.

tag

is the configuration tag of the nearest switch.

Cause  External ServerNet fabric discovery was successful.

Effect  SANMAN proceeds and tries to bring up the physical ServerNet connection between the node and the external fabric by programming the NNA (Node Numbering Agent) for that connection.

Recovery  This is an informational message. No corrective action is required.



4103

Connection to external ServerNet fabric fabric failed due to cause. [ ServerNet node number assigned by the X fabric: node-X. ServerNet node number assigned by the Y fabric: node-Y. Switch port number on the X fabric: port-X. Switch port number on the Y fabric: port-Y. Switch position ID on the X fabric: tag-X. Switch position ID on the Y fabric: tag-Y. ] [ Cluster connectivity CRU location: group.module.slot]

fabric

identifies the external ServerNet fabric (X or Y) to which the connection failed:

1  None
2  X
3  Y

cause

contains the cause code identifying why the connection to the external ServerNet fabric failed. Possible values are:

-1  Undefined Cause
0   No Error
1   MSEB Missing
2   Link Dead
3   No Response
4   Processor Fabric Down
5   CRU Type Not MSEB or Wrong CRU Type
6   No NNA Plug-In Card or Cluster Conn PIC Missing
7   Wrong External Fabric
9   Bad Switch Port Number
10  Bad SCB Loaded
11  Bad Switch PIC Type
12  Bad Switch GUID
16  Node Number Mismatch
17  NNA Verify Fail
18  SP I/O Library Call Error
19  Power Up
20  MSEB Config Record Not Found or Cluster Conn CRU Cfg Rec Not Found
21  Bad MSEB Config Record or Bad Cluster Conn CRU Cfg Record
22  MSEB Config Record Fetch Error or Cluster Conn CRU Cfg Rec Fetch Err
23  Internal System Fabric Down
24  Both Fabric LEDs Are Set
25  TNet Initialization Error
26  Invalid IBCD Fabric Parameter
27  IBCD Switch Limit Exceeded
28  Bad Packetizer
29  Both Configurations Invalid
30  Read Pointer Directory Error
31  Wrong Configuration Version ID
32  Bad Node ID Bit Mask
33  Bad Node Routing ID
34  Invalid Pointer Directory Address
35  Invalid Barrier Address
36  Invalid Packetizer ServerNet ID
37  Invalid ServerNet Speed
38  Switch Object Not Found
39  Bad RDMA Switch State
40  Link Bundle Already In Use
41  SMC Return No space
42  SMC Return Error
43  Bad Numeric Selector Value
44  Invalid Firmware Response

node-X

is the ServerNet node number assigned to the X fabric. This value is present only if cause = 16 (Node Number Mismatch).

node-Y

is the ServerNet node number assigned to the Y fabric. This value is present only if cause = 16 (Node Number Mismatch).

port-X

is the switch port number on the X fabric. This value is present only if cause = 16 (Node Number Mismatch).

port-Y

is the switch port number on the Y fabric. This value is present only if cause = 16 (Node Number Mismatch).

tag-X

is the switch position ID on the X fabric. This value is present only if cause = 16 (Node Number Mismatch).

tag-Y

is the switch position ID on the Y fabric. This value is present only if cause = 16 (Node Number Mismatch).

Cause  The External ServerNet SAN manager process failed to activate the physical ServerNet connection to an external fabric:

  • If this event is due to an NNA verification error, an SMN 4202 message reports the NNA verification error detected by the External ServerNet SAN manager process.

  • If this event is due to Service Processor I/O Library (SPIOLIB) call errors, one or more SMN 4204 messages report the SPIOLIB call errors detected by the External ServerNet SAN manager process.

Effect  The physical ServerNet connection between the node and the external fabric is down. The node is not able to communicate with other remote nodes via that fabric.

Recovery  Recovery depends on the cause:

cause  Recovery
Node Number Mismatch (16)  Correct the cabling that connects the node to the external fabrics so that the node numbers match.
NNA Verify Fail (17)  You can reset the Service Processor or upgrade the version of the Service Processor firmware, but it is recommended that you contact your service provider for additional instructions.
SPIO Library Call Error (18)  Contact your service provider.
Power Up (19)  Recovery is automatic.
MSEB Config Record Not Found (20)  Install an MSEB containing an NNA PIC in slot 51 or 52 of group 01.
Cluster Conn CRU Cfg Rec Not Found (20)  LB in slot 14 or PIC in slot 2 (SMF or MMF).
Bad MSEB Config Record (21)  Install an MSEB containing an NNA PIC in slot 51 or 52 of group 01.
Bad Cluster Conn CRU Cfg Record (21)  LB in slot 14 or SMF/MMF in slot 2.
MSEB Config Record Fetch Error (22)  Unused in the current product version.
Cluster Conn CRU Cfg Rec Fetch Err (22)  Unused in the current product version.
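
When cause is 16 (Node Number Mismatch), the per-fabric values in the event can be compared directly to see what differs between the X and Y connections. The following is a minimal sketch: the field names mirror the message description, while the dictionary layout and the `mismatched_fields` function are hypothetical. It compares only node numbers and switch port numbers, because those are the values the recovery text says must match. The position IDs are not compared.

```python
from typing import Dict, List

NODE_NUMBER_MISMATCH = 16  # cause code from the SMN 4103 cause list


def mismatched_fields(cause: int, fields: Dict[str, int]) -> List[str]:
    """Return which X/Y value pairs differ in an SMN 4103 event."""
    if cause != NODE_NUMBER_MISMATCH:
        # The per-fabric values are present only for cause 16.
        return []
    pairs = [("node", "node-X", "node-Y"),
             ("port", "port-X", "port-Y")]
    return [label for label, x, y in pairs if fields[x] != fields[y]]
```

If the result includes "port", the node is likely cabled to different port numbers on the two nearest switches, which matches the recovery advice for Node Number Mismatch under SMN 4101.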



4104

Connection to external ServerNet fabric fabric brought up successfully. ServerNet node number assigned to the system: node.

fabric

identifies the external ServerNet fabric (X or Y) to which the connection was made.

node

is the ServerNet node number.

Cause  The physical connection between the node and an external ServerNet fabric was brought up successfully after the External ServerNet SAN manager process programmed the Node Numbering Agent (NNA) with the node number assigned by that fabric.

Effect  Physical access to the external ServerNet fabric indicated in the event is now enabled for the node.

Recovery  This is an informational message. No corrective action is required.



4105

Connection to the external ServerNet fabric fabric was lost. Reason: cause [Cluster connectivity location: group.module.slot]

fabric

identifies the external ServerNet fabric (X or Y) to which the connection failed:

1  None
2  X
3  Y

cause

contains the cause code identifying why access to the external ServerNet fabric failed. Possible values are:

1   MSEB Missing            The MSEB CRU for that fabric was removed.
2   Link Dead               The Link Alive was lost in the connection to the external fabric.
4   Processor Fabric Down
19  Power Up
30  Read Pointer Directory
45  MSEB Disabled           The MSEB CRU for that fabric was disabled.

Cause  A previously active connection to an external ServerNet fabric was downed.

Effect  The physical ServerNet connection between the node and the external fabric is down. The node cannot communicate with other remote nodes via that fabric.

In the case of a power failure, if both external fabric connections were active before the power outage, the External ServerNet SAN manager process generates one event for the X fabric and one event for the Y fabric.

Recovery  Recovery depends on the cause.



4201

SANMAN failed to register with Configuration Services. Error returned by Configuration Services: err Error detail returned by Configuration Services: err-detail

err

is the error code returned from NonStop Kernel Configuration Services (CONFIG_SPEV_EVENTTYPE_REGISTER_ routine).

err-detail

is the error detail code returned from NonStop Kernel Configuration Services (CONFIG_SPEV_EVENTTYPE_REGISTER_ routine).

Cause  The External ServerNet SAN manager process was not able to register with NonStop Kernel Configuration Services.

Effect  The External ServerNet SAN manager process relies on NonStop Kernel Configuration Services for fast notification of certain configuration and status changes, such as Cluster Connectivity CRU insertion and removal events, and link-alive status changes in the connection to an external fabric. If the External ServerNet SAN manager fails to register with NonStop Kernel Configuration Services, it can still detect configuration and status changes by means of periodic checks although a delay might occur in detecting configuration and status changes. It might take longer for the External ServerNet SAN manager to bring up the connection to an external fabric after a repair.

Recovery  Stopping one or more subsystems that register with NonStop Kernel Configuration Services should allow the External ServerNet SAN manager to register itself. Registration occurs when the External ServerNet SAN manager is initialized. The recommended long-term solution is to request a NonStop Kernel Configuration Services (T6586) SPR capable of supporting a larger number of registered subsystems.

If the external ServerNet SAN manager appears to be slow in detecting configuration changes, and you want to accelerate recovery of the external fabric connection after a repair, use the following workaround. Issue the SCF PRIMARY PROCESS $ZZSMN command to force the primary and backup external ServerNet SAN manager processes to switch roles. Upon takeover, the new primary immediately checks the external ServerNet SAN subsystem for configuration and status changes.

This list includes all possible errors returned by the NonStop Kernel Configuration Services CONFIG_SPEV_EVENTTYPE_REGISTER_ routine:

Error    Description    Error Detail
1    Configuration Services error other than success, parameter error, or bounds error.    6010: cannot add more entries to the tables used to keep track of registrants. 564: the calling process is not named. 6006: the calling process is not named.
2    Parameter error: a parameter was missing, or a combination of parameters was invalid.    Error detail indicates the number of the missing parameter or the number of the first parameter that was in conflict with some other parameter.
3    A reference parameter has an out-of-bounds value.    Error detail indicates the parameter number of the reference parameter with the out-of-bounds value.
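
For log triage, the error table above can be expressed as a small lookup. The following Python sketch is illustrative only and is not part of any HP tooling. The codes and texts come from the table; the function name describe_register_error and the dictionary layout are assumptions.

```python
# Error codes returned by CONFIG_SPEV_EVENTTYPE_REGISTER_, as listed in the table above.
ERRORS = {
    1: "Configuration Services error other than success, parameter error, or bounds error.",
    2: "Parameter error: a parameter was missing, or a combination of parameters was invalid.",
    3: "A reference parameter has an out-of-bounds value.",
}

# Error-detail values documented for error 1.
ERROR_1_DETAILS = {
    6010: "cannot add more entries to the tables used to keep track of registrants",
    564: "the calling process is not named",
    6006: "the calling process is not named",
}


def describe_register_error(err, err_detail):
    """Return a one-line description for an (err, err-detail) pair (hypothetical helper)."""
    base = ERRORS.get(err, "Undocumented error code.")
    if err == 1:
        detail = ERROR_1_DETAILS.get(err_detail, "undocumented error detail")
    elif err in (2, 3):
        # For errors 2 and 3 the detail is a parameter number.
        detail = "parameter number {}".format(err_detail)
    else:
        detail = "error detail {}".format(err_detail)
    return "{} ({})".format(base, detail)
```

A call such as describe_register_error(1, 6010) returns the error 1 description followed by the registrant-table detail in parentheses.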



4202

SANMAN detected a verification error when programming a Node Numbering Agent (NNA) PIC for the external ServerNet fabric fabric connection. NNA PIC location: CRU in slot group.module.slot, port port

fabric

identifies the external ServerNet fabric (X or Y) of the target ServerNet switch:

1    None
2    X
3    Y

group.module.slot

is the slot location in which the Cluster Connectivity CRU is configured.

port

is a port number on the Cluster Connectivity CRU.

Cause  The External ServerNet SAN manager attempted to configure the ServerNet node number by writing to the internal registers of a Node Numbering Agent (NNA). After the write operation completed, the External ServerNet SAN manager read the register contents back for verification purposes, compared the contents of the NNA registers with the intended values, and found that at least one of the register values did not match the intended value.

Effect  The External ServerNet SAN manager automatically retries programming the NNA and might reset the NNA during the retry. Until the NNA is programmed correctly, the physical ServerNet connection to the external fabric indicated by the event remains down. If the retries fail, an SMN 4103 message is generated.

Recovery  This is an informational message. Although no corrective action is required, information for recovery might be available in the SMN 4103 message.
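
The write, read-back, and retry sequence described in the Cause and Effect above can be sketched as follows. This is a minimal Python model, not SANMAN's implementation: the register dictionary, the flaky_writer test helper, and the default of three attempts are all assumptions, because the manual does not state how many retries are made.

```python
def program_with_verify(registers, intended, write, max_attempts=3):
    """Write intended values, read them back, and retry on mismatch.

    registers: dict standing in for the NNA registers (hypothetical model).
    intended: dict of register name -> value that should be programmed.
    write: callable(registers, name, value) that performs one register write.
    Returns the 1-based attempt number that verified, or None if all failed.
    """
    for attempt in range(1, max_attempts + 1):
        for name, value in intended.items():
            write(registers, name, value)
        # Read-back verification: every register must hold its intended value.
        mismatches = [n for n, v in intended.items() if registers.get(n) != v]
        if not mismatches:
            return attempt
        # A real manager would report the verification error (SMN 4202) here
        # and might reset the NNA before the next attempt.
    return None  # all retries failed; this is where SMN 4103 would be reported


def flaky_writer(fail_times):
    """Build a write function whose first fail_times writes store a corrupted value."""
    state = {"remaining": fail_times}

    def write(registers, name, value):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            registers[name] = value ^ 0x1  # simulate a bad write
        else:
            registers[name] = value

    return write
```

With a writer that corrupts only its first write, program_with_verify succeeds on the second attempt; with a writer that always corrupts, it returns None after the last attempt.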



4203

ServerNet node number changed from old-node to new-node.

old-node

is the old ServerNet node number.

new-node

is the new ServerNet node number.

Cause  The External ServerNet SAN manager detected that the ServerNet node number assigned by the external fabrics to the node has changed, because the node was moved to a different position in the ServerNet Cluster topology. The ServerNet node number that the External ServerNet SAN manager had previously configured in the Node Numbering Agents (NNAs) is no longer valid.

Effect  The External ServerNet SAN manager configures the new ServerNet node number in the Node Numbering Agents (NNAs).

Recovery  This is an informational message. No corrective action is required.



4204

A call to Service Processor (SP) I/O library routine routine failed.

routine

identifies the SP I/O library routine. These routines are possible:

  • SP-Session-Create

  • SP-Session-Destroy

  • Set-OLAP

  • Get-OLAP

  • OLAP-Block-Write

  • HW-SSPI-Operation

  • CRU-Handle-Get

  • Device-Handle-Get

  • Hardware-Olap-MOp

  • Cluster-Config-GET

  • NNA-Set

  • Router3-Point-Set

  • Router3-Complex-Stat-Get

  • CRU-Entro-Get

Cause  The External ServerNet SAN manager made a call to an SP I/O library routine, but it failed.

Effect  The External ServerNet SAN manager automatically retries the SP I/O library routine, possibly after destroying its previous session with the Service Processor and starting a new session. If the retries fail, and the SP I/O library routine had been called by the External ServerNet SAN manager to program an NNA, an SMN 4103 event occurs.

Recovery  The External ServerNet SAN manager automatically retries the SP I/O library routine. If the errors persist (as indicated by additional SMN 4204 messages), an SMN 4103 message is generated. The Service Processor might have to be initialized, or the Service Processor firmware might have to be upgraded. Contact your service provider for assistance.



4205

This event has three forms.

For the 6770 switch, when neither a switch tag nor a switch type is supplied, the message has this form:

A status change has been detected in the ServerNet switch on the external fabric fabric. Port port-number status changed to: status1 [ Port number status changed to: status1 ]

For the 6770 switch when a switch tag value is supplied the message has this form:

A status change has been detected in ServerNet switch switchID on the external fabric fabric. Port port-number status changed to: status1 [ Port number status changed to: status1 ]

For a 6780 switch the message has this form:

A status change has been detected in ServerNet switch fabric.zone.layer, in group group, module module, slot slot. Port port-number status changed to: status2 Port connects to location location-number [ Port number status changed to: status2, Port connects to location location-number, ... ... ]

NOTE: This event applies to the 6770 switch and the 6780 switch.

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch:

1    None
2    X
3    Y

port-number

is the port number of one of the twelve ports on the ServerNet switch.

status1

Possible values are:

-1    Undefined Port Status
0    Reset
1    Uninstalled
2    Link Dead
3    Link Alive, Enabled
4    Link Alive, Disabled

status2

Possible values are:

-2    Undefined Port Status
-1    Unknown
0    Reset
1    Reset - Lost Optical Signal
2    Reset - Transceiver Absent
3    Reset - CRU Absent
4    No Link Alive
1792    Link Alive, Disabled
5888    Nbr Chk In Progress
9984    Nbr Chk OK, Waiting For Manage Port Cmd
9985    Nbr Chk Failed - No Link Alive
9986    Invalid Neighbor
9987    Invalid Configuration Tag
9988    Invalid Configuration Version ID
9989    Unmatched Connector Number
9990    Unmatched GUID
9991    Invalid Switch Configuration Revision
9992    Nbr Chk Failed - No Rsp To Mng Port Cmd
9993    Nbr Chk Failed - Manage Port Cmd NACKed
16128    Enabled

switchID

has these possible values:

-2    Blank Mfg Dflt Switch Position ID
1    Cluster Switch Position ID 1
2    Cluster Switch Position ID 2
3    Cluster Switch Position ID 3

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

group, module, slot

are the locations of the group, module, and slot, respectively.

location

is one of:

zone

appears if the port connects to a zone.

layer

appears if the port connects to a layer.

node

appears if the port connects to a node.

location-number

is the zone, layer, or node number.

Cause  SANMAN has detected at least one external port status change in a ServerNet switch.

Effect  The status of the indicated switch port or ports (including neighbors) has changed.

Recovery  Depends on the kind of switch. When the switch is a 6770 switch:

status    Recovery
Reset    This is an informational status and might occur transiently after the 6770 switch is power cycled or hard reset or after a link failure. Due to the transient nature of this status, it should rarely be detectable and reported by SANMAN via an SMN 4205 event. No corrective action is required for this status change.
Link Alive, Enabled    An informational status. No corrective action is required for this status change.
Uninstalled    The PIC (Plug-In Card) for a port is not installed. This status is unexpected for a 6770 switch, which normally has PICs installed in all 12 ports. This status could signify that the PIC at the port is faulty. Contact your service provider for assistance if an Uninstalled status is reported for a 6770 switch port.
Link Dead    No link-alive signals are being detected at the port. Determine what is preventing signals from arriving at this port and correct the condition. When signals are once again arriving, the port recovers automatically.
Link Alive, Disabled    Link-alive signals are being detected at the port, but the port is disabled for regular ServerNet data traffic. The likely cause of this condition is a neighborhood check error. The service logs should have an SMN 4212 ("Switch-Neighbor-Status-Change") event reporting the specific neighborhood check error for the port. Optionally, the neighborhood check for the port can be determined via the SCF STATUS SWITCH $ZZSMN command. For specific recovery instructions for ServerNet II Switch neighbor check errors, see the SMN 4212 event documentation.

Recovery   When the switch is a 6780 switch:

status    Recovery
Reset    This informational status might occur transiently after the 6780 switch is power cycled or hard reset or after a link failure. Due to the transient nature of this status, it should rarely be detectable and reported by SANMAN via an SMN 4205 event. No corrective action is required for this status change.
Reset - Transceiver Absent    The port has been reset because the presence of a transceiver cannot be detected. A hardware error probably occurred. See the OSM procedure for replacing a PIC, and follow its suggestions.
Reset - CRU Absent    The port has been reset because the presence of a PIC can no longer be detected. A hardware error probably occurred. See the OSM procedure for replacing a PIC, and follow its suggestions.
Nbr Chk In Progress; Nbr Chk OK, Waiting For Manage Port Cmd    These informational status codes might occur transiently after the 6780 switch is power cycled or hard reset or after a link failure. These status codes indicate that neighborhood checks for the port are currently in progress. Each status code represents a different state of the ongoing neighbor checks. Due to the transient nature of these status codes, they should rarely be detected and reported by SANMAN via an SMN 4205 event. No corrective action is required for these status codes.
Enabled    An informational code. No corrective action is required for this status change.

When the new status is:

status    Recovery
Reset - Lost Optical Signal    The port has been reset due to loss of optical signal. Determine what caused the signal to be lost (fiber disconnection, transceiver failure, and so on) and correct the condition. When the optical signal is restored, the port recovers automatically.
Reset - Transceiver Absent    The port has been reset because the presence of a transceiver cannot be detected. A hardware error probably occurred. See the OSM procedure for replacing a PIC, and follow its suggestions.
Reset - CRU Absent    The port has been reset because the presence of a PIC can no longer be detected. A hardware error probably occurred. See the OSM procedure for replacing a PIC, and follow its suggestions.
No Link Alive    No link-alive symbols are being detected at the port. Determine what is preventing signals from arriving at this port and correct the condition. When link-alive symbols are once again arriving, the port recovers automatically.
Link Alive, Disabled    Link-alive symbols are being detected at the port, but the port is disabled for regular ServerNet data traffic. On a switch-to-switch port, the likely cause of this condition is that the link-alive signals have been detected at the port, but neighborhood checks have not been initiated for the port yet. On a switch-to-node port, the likely cause of this condition is that the link-alive signals have been detected at the port, but the node connected to that port has not enabled the port for data traffic yet (for example, due to an absent SANMAN process in that node). Check if OSM is reporting an alarm for the port, and follow the repair actions for the alarm.
Nbr Chk Failed - No Link Alive    The port is not connected and cannot run neighbor checks. See ZSMN-ENM-PORT-STATE-DOWN.

The following codes specify conditions detected during neighbor checking by the firmware in a switch. In general, they indicate a conflict between configuration expectations in one switch and actual configuration in a second switch.

Invalid Neighbor    A request for neighbor information was sent via this port but no response has been received. The most likely cause is that the configuration loaded in the switch is expecting the neighbor to be another switch, but the port is connected to a NonStop Kernel node. See if OSM is reporting an alarm for the port, and follow the repair actions for the alarm.
Invalid Configuration Tag; Invalid Configuration Version ID; Unmatched Connector Number; Unmatched GUID; Invalid Switch Configuration Revision    The data received from a switch in response to a request for neighbor information did not match the expectations of the configuration file loaded in the reporting switch. A possible cause of these codes is that the wrong configuration file has been loaded into a switch. It is also possible that the switch-to-switch fiber optic cables have been misrouted, so that the reporting switch is connected to the wrong neighbor or to the wrong port of a neighbor. See if OSM is reporting an alarm for the port, and follow the repair actions for the alarm. The repair actions may include verifying and correcting configuration files loaded on a switch, switch numeric selector settings, and cabling. When the repair actions indicated by OSM have been performed, the port recovers automatically.
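
The status2 values and the 6780 recovery tables above can be combined into a simple triage lookup. The sketch below is a hedged Python illustration: the numeric codes and names come from the status2 list, while the three category labels and the function name triage_status2 are assumptions introduced here. Statuses that appear in the status2 list but in none of the recovery tables are reported as having no recovery listed rather than guessed at.

```python
# Numeric status2 values for 6780 switch ports, from the status2 list in this message.
STATUS2 = {
    -2: "Undefined Port Status",
    -1: "Unknown",
    0: "Reset",
    1: "Reset - Lost Optical Signal",
    2: "Reset - Transceiver Absent",
    3: "Reset - CRU Absent",
    4: "No Link Alive",
    1792: "Link Alive, Disabled",
    5888: "Nbr Chk In Progress",
    9984: "Nbr Chk OK, Waiting For Manage Port Cmd",
    9985: "Nbr Chk Failed - No Link Alive",
    9986: "Invalid Neighbor",
    9987: "Invalid Configuration Tag",
    9988: "Invalid Configuration Version ID",
    9989: "Unmatched Connector Number",
    9990: "Unmatched GUID",
    9991: "Invalid Switch Configuration Revision",
    9992: "Nbr Chk Failed - No Rsp To Mng Port Cmd",
    9993: "Nbr Chk Failed - Manage Port Cmd NACKed",
    16128: "Enabled",
}

# Statuses the recovery tables describe as informational (no corrective action).
INFORMATIONAL = {0, 5888, 9984, 16128}

# Statuses for which the recovery tables give a corrective procedure.
RECOVERY_DOCUMENTED = {1, 2, 3, 4, 1792, 9985, 9986, 9987, 9988, 9989, 9990, 9991}


def triage_status2(code):
    """Return (status name, category) for a status2 code (hypothetical helper)."""
    name = STATUS2.get(code)
    if name is None:
        return ("Status {}".format(code), "undocumented")
    if code in INFORMATIONAL:
        return (name, "informational")
    if code in RECOVERY_DOCUMENTED:
        return (name, "recovery documented")
    return (name, "no recovery listed")
```

For example, triage_status2(9990) identifies Unmatched GUID as a status with a documented recovery, which in this case means checking OSM alarms, configuration files, and cabling as described above.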



4206

This event has two forms, depending on the version of the switch and firmware.

When the switch is a 6770 switch this message appears:

SANMAN has altered an attribute of the [ ServerNet switch switchID | the ServerNet switch | ] on the external ServerNet fabric fabric. Reason: cause Attribute that was altered: [ Fabric setting changed to: fabric ] [ Halt symbol propagation changed to: { ENABLE | DISABLED }] [ Switch locator string changed to string ] [ New Globally Unique ID accepted. ]

When the switch is a 6780 switch this message appears:

SANMAN has altered an attribute of ServerNet switch fabric.zone.layer, group group, module module. Reason: cause Attribute that was altered: [ Halt symbol propagation changed to: { ENABLE | DISABLED }] [ Switch locator string changed to string2] [ New Globally Unique ID accepted. ]

NOTE: This event applies to the 6770 switch and the 6780 switch.

type

contains the version of the switch hardware.

switchID

has these possible values:

-2    Blank Mfg Dflt Switch Position ID
1    Cluster Switch Position ID 1
2    Cluster Switch Position ID 2
3    Cluster Switch Position ID 3

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

group, module

are the locations of the group and module, respectively.

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch, as follows:

1    None
2    X
3    Y

cause

is the reason the configuration attribute was altered, as follows:

1    Operator command
2    Required attr previously not config

string

indicates the physical location of the switch. This token is present if the switch hardware is a 6770 switch or earlier.

string2

indicates the physical location of the switch. This token is present if the switch hardware is a 6780 switch or later.

Cause  The operator issued an ALTER SWITCH command, or, if the physical hardware is a 6770 switch, SANMAN detected that the required fabric setting attribute of the switch was set to None when discovering an external ServerNet fabric. In the latter case, SANMAN automatically sets the switch fabric setting attribute to match the identity of the external fabric in which the switch was discovered.

Effect  SANMAN has successfully altered an attribute of a ServerNet switch configuration.

Recovery  This is an informational message. No corrective action is required.



4207

SANMAN detected an error when registering the driver Driver. Error: cause.

driver

is an enumerated value designating the particular driver with which SANMAN was attempting to register when the error was encountered. The possible values are IBC and SMC.

cause

contains the cause for the failure. Possible values are:

1    Kernal Memory Allocation Failure
2    Permissive Listener Registrative-Fail
3    IBC Already Registered
4    IBC TIB initialization error
5    SMC No GLobal Space Error
6    SMC Lock Memory Error
7    Mismatch between the executing versions of SANMAN and the NonStop Kernel
8    SMC Already Initialized
9    SMC Bad parameter

Cause  The primary SANMAN process attempted to register with a NonStop Kernel Driver, but registration failed.

Effect  If the failed attempt was to the IBC Driver, the primary SANMAN process will not be able to communicate with the external ServerNet fabrics via IBC packets, and will terminate itself. The backup SANMAN will attempt to register with the NonStop Kernel IBC Driver in a different processor when it becomes the new primary. The SANMAN process pair will terminate if registration with the NonStop Kernel IBC Driver does not succeed in at least one processor in SANMAN's configured CPU list.

If the failed attempt was to the SMC Driver, the primary SANMAN process will not be able to communicate directly with 6780 switches via ServerNet RDMA reads and writes, and will terminate itself. The backup SANMAN process will take over and attempt to register with the SMC Driver when it becomes the new primary. The SANMAN process pair will terminate if registration with the NonStop Kernel SMC Driver does not succeed in at least one processor in SANMAN's configured CPU list.

Recovery  SANMAN will automatically attempt to register with the NonStop Kernel Driver on a different processor. A ZZSA* save abend file will be created when the primary SANMAN process terminates. The ZZSA* save abend file should be provided for analysis, along with the ZZSV* service log event file containing the SMN 4207 event.

If the error detected by SANMAN was a failure to register a permissive listener interrupt handler, the recommended long-term solution is to request a TNet Millicode (T8460) product revision capable of supporting a larger number of registered subsystems.

If the errors indicate that SANMAN has attempted to register with the indicated driver more than once, the errors are programming errors. SANMAN should not attempt to register with a NonStop Kernel Driver more than once. Ensure that the saveabend file and event logs are preserved and report this incident to your HP service provider.

If the error code implies that there is a mismatch between the executing versions of SANMAN and the NonStop Kernel, the event logs should contain an SMN 4011 event with additional details on the version mismatch between SANMAN and the NonStop Kernel. The operator needs to ensure that the versions of SANMAN and the NonStop Kernel IBC and SMC drivers are compatible. Check the level of the installed product revisions for both T9050 and T0502 and the product revision requisites for these two products. If the two do not match, install any maintenance required to ensure that the products are at compatible levels.

If the error codes indicate a shortage of memory in NonStop Kernel data space, it is highly likely that the automatic retry when the backup takes over will recover this situation. If it does not, the operator may attempt to stop other processes to free up system global memory, then retry starting the SANMAN process using the SCF command START PROCESS $ZZKRN.ZZSMN. Otherwise, determine and correct the cause of the error.

When the error is corrected, it may be necessary to manually restart SANMAN by using the SCF START PROCESS $ZZKRN.ZZSMN command.
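
The processor-by-processor retry described in the Effect and Recovery text above can be modeled as a loop over the configured CPU list. The following Python sketch is a simplified illustration, not SANMAN's actual takeover logic: the function name register_on_any_cpu, the try_register callback, and the convention that 0 means success are assumptions. Nonzero values stand for the cause codes 1 through 9 listed for this message.

```python
def register_on_any_cpu(cpu_list, try_register):
    """Try driver registration on each configured processor in order.

    cpu_list: processors from the configured CPU list, in takeover order.
    try_register: callable(cpu) returning 0 on success (an assumed convention)
        or a nonzero cause code on failure.
    Returns (cpu, failures): cpu is the processor where registration succeeded,
    or None if it failed everywhere; failures lists (cpu, cause) per failed attempt.
    """
    failures = []
    for cpu in cpu_list:
        cause = try_register(cpu)
        if cause == 0:
            return cpu, failures
        # Each failure corresponds to one SMN 4207 event and one takeover.
        failures.append((cpu, cause))
    # No processor accepted the registration: the process pair terminates.
    return None, failures
```

If registration fails in processor 0 with cause 1 but succeeds in processor 1, the function returns processor 1 together with the single recorded failure. A return value of None for the processor corresponds to the case in which the process pair terminates and a manual restart is needed.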



4208

A power status change has been detected in [ ServerNet switch switchID | the ServerNet switch ] on the external ServerNet fabric fabric. The specific status that changed is: status

NOTE: This event applies only to the 6770 switch.

switchID

is the position ID of the reporting ServerNet switch. Possible values are: 1, 2 or 3. The position ID is the location of a cluster switch on the fabric using the current topology. X1/Y1 cluster switches have a position ID of 1. X2/Y2 cluster switches have a position ID of 2. X3/Y3 cluster switches have a position ID of 3.

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch, as follows:

1    None
2    X
3    Y

status

is the status that changed. Possible values are:

  • UPS on/ok status changed to: { Normal | Abnormal } Operation.

  • Mode of Operation status changed to: { Battery Operation | Line Operation }.

  • Battery Voltage status changed to: { OK | Low }.

  • Battery management status changed to: { Charging | Discharging | Floating | Resting }.

  • Line Regulation status changed to: { Normal,Straight Through | Step Down, Buck | Step Up, Boost }.

  • Attention Required status changed to: { No Attention Required | Battery Failure | Ground Failure | Overloaded (but not shutting down) }.

  • Malfunction! Immediate Action status changed to: { No Malfunction | Backfeed Contact Failure | Overload (UPS will shut down) | Inverter Under Voltage }.

  • Primary Power rail on/off status changed to: { On | Off }.

  • Backup Power rail on/off status changed to: { On | Off }.

  • UPS cable detected status changed to { On | Off }.

  • UPS responding status changed to { On | Off }.

  • Power Currently Available to the UPS:

  • Remaining Backup Time changed to: nn minutes [or more].

  • Nominal AC input voltages: nn volts

Cause  The ServerNet SAN manager has detected at least one power environment status change in a ServerNet switch.

Effect  Some element or elements of the AC power in a ServerNet switch has changed in status. These changes could include the detection of an error or the detection of recovery from a previously detected error.

Recovery  The following status changes are informational only and require no corrective action:

UPS on/ok status changed to: Normal operation.
Mode of Operation status changed to: Line Operation.
Battery Voltage status changed to: OK.
Battery management status changed to: { Charging | Discharging | Floating | Resting }.
Line Regulation status changed to: { Normal,Straight Through | Step Down, Buck | Step Up, Boost }.
Attention Required status changed to: No Failure.
Malfunction! Immediate Action status changed to: None.
Remaining Backup Time: nn minutes.

Other status changes require recovery action, depending on the change reported.



4209

[ SANMAN has gathered initial internal status in ServerNet switch switchID on the external ServerNet fabric fabric. | An internal status change has been detected in ServerNet switch switchID on the external ServerNet fabric fabric. { Detected error conditions: error | Repairs to previously detected errors: repaired-error }]

NOTE: This event applies only to the 6770 switch.

switchID

is the position ID of the reporting ServerNet switch. Possible values are: 1, 2 or 3. The position ID is the location of a cluster switch on the fabric using the current topology. X1/Y1 cluster switches have a position ID of 1. X2/Y2 cluster switches have a position ID of 2. X3/Y3 cluster switches have a position ID of 3.

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch, as follows:

1    None
2    X
3    Y

error

is a hardware or firmware error that was detected. Possible values are:

Unknown error type.
Bad program checksum.
Factory-default blank configuration was loaded.
New configuration bad after configuration download.
Both configurations bad after configuration download.
SRAM Memory test failed.
FLASH sector test failed.
Bad SEEPROM Checksum.
SBUSY error.
FLASH Program error.
Power supply fan failure.
Switch is not responding.
SP ran over allocated space.
Bad program CRC.
Bad switch configuration CRC.
Firmware images are different.
Configuration images are different.
Router self check.
Bad Flash ID String.
Flash boot lockout 0 error.
Flash boot lockout 1 error.

repaired-error

is a previously detected error that has been repaired. Possible values are:

No unknown errors.
Program checksum is now OK.
Current persistent router configuration was loaded.
New configuration OK after configuration download.
At least one configuration OK after configuration download.
SRAM Memory test OK.
FLASH sector test OK.
SEEPROM Checksum OK.
No SBUSY errors detected.
No FLASH Program errors detected.
Both power supply fans are now OK.
Switch is now responding.
SP is no longer over allocated space.
Program CRC is now OK.
Switch configuration CRC is now OK.
Firmware images are identical.
Configuration images are identical.
No router self check.
Flash ID string is now OK.
No flash boot lockout 0 errors detected.
No flash boot lockout 1 errors detected.

Cause  SANMAN has detected at least one firmware and/or hardware status change in a ServerNet switch.

Effect  Depends on the status change. An error has been detected or a repair has been detected in one or more aspects of the ServerNet Switch firmware or hardware.

Recovery  The following status changes are informational only and require no corrective action:

No unknown errors.
Program checksum is now OK.
Current persistent router configuration was loaded.
New configuration OK after configuration download.
At least one configuration OK after configuration download.
SRAM Memory test OK.
FLASH sector test OK.
SEEPROM Checksum OK.
No SBUSY errors detected.
No FLASH Program errors detected.
Both power supply fans are now OK.
Switch is now responding.
SP is no longer over allocated space.
Program CRC is now OK.
Switch configuration CRC is now OK.
Firmware images are identical.
Configuration images are identical.
No router self check.
Flash ID string is now OK.
No flash boot lockout 0 errors detected.
No flash boot lockout 1 errors detected.

Other status changes require recovery action, depending on the change reported.
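
The error and repaired-error lists for this message contain the same number of entries in the same order, so each error appears to have a matching repair text. The Python sketch below pairs them by position. That positional pairing is an assumption inferred from the list order, not something the manual states, and the helper name classify_4209 is hypothetical.

```python
# Error texts, in the order given under "error".
ERRORS = [
    "Unknown error type.",
    "Bad program checksum.",
    "Factory-default blank configuration was loaded.",
    "New configuration bad after configuration download.",
    "Both configurations bad after configuration download.",
    "SRAM Memory test failed.",
    "FLASH sector test failed.",
    "Bad SEEPROM Checksum.",
    "SBUSY error.",
    "FLASH Program error.",
    "Power supply fan failure.",
    "Switch is not responding.",
    "SP ran over allocated space.",
    "Bad program CRC.",
    "Bad switch configuration CRC.",
    "Firmware images are different.",
    "Configuration images are different.",
    "Router self check.",
    "Bad Flash ID String.",
    "Flash boot lockout 0 error.",
    "Flash boot lockout 1 error.",
]

# Repair texts, in the order given under "repaired-error".
REPAIRS = [
    "No unknown errors.",
    "Program checksum is now OK.",
    "Current persistent router configuration was loaded.",
    "New configuration OK after configuration download.",
    "At least one configuration OK after configuration download.",
    "SRAM Memory test OK.",
    "FLASH sector test OK.",
    "SEEPROM Checksum OK.",
    "No SBUSY errors detected.",
    "No FLASH Program errors detected.",
    "Both power supply fans are now OK.",
    "Switch is now responding.",
    "SP is no longer over allocated space.",
    "Program CRC is now OK.",
    "Switch configuration CRC is now OK.",
    "Firmware images are identical.",
    "Configuration images are identical.",
    "No router self check.",
    "Flash ID string is now OK.",
    "No flash boot lockout 0 errors detected.",
    "No flash boot lockout 1 errors detected.",
]

# Assumed pairing: the n-th error is cleared by the n-th repair text.
REPAIR_FOR = dict(zip(ERRORS, REPAIRS))


def classify_4209(text):
    """Classify a status text as 'error', 'repair' (informational), or 'unknown'."""
    if text in REPAIR_FOR:
        return "error"
    if text in REPAIRS:
        return "repair"
    return "unknown"
```

A monitoring script could use REPAIR_FOR to decide which repair text to wait for after a given error, and treat any text classified as a repair as informational.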



4210

This message has two forms.

When the hardware is a 6770 switch this message appears:

The External ServerNet SAN Manager process has loaded a new filetype into [ ServerNet switch switchID | the ServerNet switch in the external ServerNet fabric fabric. ] Download file: download-file Download file version: version Current revision: [ Release=rev ] Major=rev; Minor=rev New revision: [ Release=rev ] Major=rev; Minor=rev [ Current configuration tag: [0x]input-tag ] [ New configuration tag: [0x]tag ]

When the hardware is a 6780 switch this message appears:

The External ServerNet SAN Manager process has loaded a new filetype into ServerNet switch fabric.zone.layer, group group, module module, slot slot. Download file: download-file Download file version: version Current revision: [ Release=rev ] Major=rev; Minor=rev New revision: [ Release=rev ] Major=rev; Minor=rev [ Current configuration tag: input-tag ]

NOTE: This event applies to the 6770 switch and the 6780 switch.

filetype

indicates the type of download file. Possible values are:

-1    Undefined Load File
1    Firmware File
2    Configuration File
3    FPGA File

switchID

has these possible values:

-2    Blank Mfg Dflt Switch Position ID
1    Cluster Switch Position ID 1
2    Cluster Switch Position ID 2
3    Cluster Switch Position ID 3

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch, as follows:

1    None
2    X
3    Y

download-file

indicates the name of the download file.

version

indicates the VPROC of the download file.

rev

indicates revision number.

input-tag

indicates the current tag of the switch configuration block loaded into the specified ServerNet switch.

tag

indicates the new tag of the switch configuration block loaded into the specified ServerNet switch.

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

group, module, slot

are the locations of the group, module, and slot, respectively.

Cause  The operator issued a LOAD SWITCH command to download a firmware, configuration, or FPGA file into a ServerNet switch.

Effect  SANMAN has successfully loaded the download file into the ServerNet switch.

Recovery  This is an informational message. No corrective action is required.



4211

This message has three forms.

When the hardware is a 6770 switch this message appears:

The External ServerNet SAN Manager process has reset [ ServerNet switch switchID | the ServerNet switch ] on the external ServerNet fabric fabric. Reset type: reset-type

When the hardware is a 6780 switch and this SANMAN process requested the reset this message appears:

The External ServerNet SAN Manager process has reset ServerNet switch fabric.zone.layer, group group, module module on the external ServerNet fabric fabric. Reset type: reset-type Reset type detail: detail ServerNetID of reset requestor: snid

When the hardware is a 6780 switch and this SANMAN process is not the one which requested the reset this message appears:

The External ServerNet SAN Manager process has detected a reset in ServerNet switch fabric.zone.layer, group group, module module on the external ServerNet fabric fabric. Reset type: reset-type Reset type detail: detail [ ServerNetID of reset requestor: snid HDI Identifier: hdi Numeric Selector value used: sel. Configuration Tag: config-tag. | Firmware Exception Error: error. ]

NOTE: This event applies to the 6770 switch and the 6780 switch.

switchID

has these possible values:

-2    Blank Mfg Dflt Switch Position ID
1    Cluster Switch Position ID 1
2    Cluster Switch Position ID 2
3    Cluster Switch Position ID 3

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch, as follows:

1    None
2    X
3    Y

reset-type

indicates the type of reset. Possible values are Hard Reset and Soft Reset.

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

group, module

are the locations of the group and module, respectively.

detail

indicates details about the type of reset. Possible values include:

-2    Invalid Reset Type
-1    Unknown Reset Type
0    Power On Reset
1    Front Panel Button Pressed
2    Watchdog Level 3 Timer Expiration
4    Watchdog Level 2 Timer Expiration
16    Firmware Error Initiated
32    Soft Reset Command
64    Hard Reset Command

snid

contains the ServerNet ID of the requestor.

hdi

is the value of the current Hardware Data Identifier.

config-tag

contains the configuration tag of the configuration that was loaded.

sel

contains the value for the numeric selector (a.k.a. thumb wheels) that was used to load the configuration.

error

contains the exception error code.

Cause  SANMAN has detected that the indicated ServerNet switch has been reset.

Effect  The specified ServerNet switch has been reset.

Recovery  If the reset type is Watchdog Level 2 Timer Expiration or Watchdog Level 3 Timer Expiration, contact your service provider. Also, check for the presence of OSM alarms and perform any recommended recovery actions. Otherwise, this is an informational message. No corrective action is required.
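
The recovery rule above can be applied mechanically to the reset type detail value. The Python sketch below is illustrative: the codes and names come from the detail list for this message, and it assumes that the watchdog conditions named in the Recovery text are the ones reported through the detail field. The function name reset_recovery is hypothetical.

```python
# Reset type detail values, from the "detail" list for this message.
RESET_DETAIL = {
    -2: "Invalid Reset Type",
    -1: "Unknown Reset Type",
    0: "Power On Reset",
    1: "Front Panel Button Pressed",
    2: "Watchdog Level 3 Timer Expiration",
    4: "Watchdog Level 2 Timer Expiration",
    16: "Firmware Error Initiated",
    32: "Soft Reset Command",
    64: "Hard Reset Command",
}

# Details for which the Recovery text asks the operator to act.
WATCHDOG_DETAILS = {2, 4}


def reset_recovery(detail):
    """Return (detail name, recommended action) for a reset detail code."""
    name = RESET_DETAIL.get(detail)
    if name is None:
        return ("Reset detail {}".format(detail), "Not covered by this message description.")
    if detail in WATCHDOG_DETAILS:
        return (name, "Contact your service provider and check for OSM alarms.")
    return (name, "Informational; no corrective action is required.")
```

For example, reset_recovery(4) reports a Watchdog Level 2 Timer Expiration and recommends contacting the service provider, while reset_recovery(32) reports a Soft Reset Command as informational.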



4212

A neighbor status change has been detected in ServerNet switch switchID on the external ServerNet fabricID fabric [ Port port status changed to: status ] [ Switch port port neighbor status changed to: status ]

NOTE: This event applies only to the 6770 switch.

switchID

is the position ID of the reporting ServerNet switch. Possible values are: 1, 2 or 3. The position ID is the location of a cluster switch on the fabric using the current topology. X1/Y1 cluster switches have a position ID of 1. X2/Y2 cluster switches have a position ID of 2. X3/Y3 cluster switches have a position ID of 3.

fabricID

identifies the external ServerNet fabric of the ServerNet switch. Possible values are X and Y.

port

is the port number. Possible values are in the range 0 through 11.

status

is the status of the neighbor connected to a switch port. Possible values are:

Not Defined
Status OK
Link Dead
Disabled, unknown reason
Wrong Fabric
Invalid
Invalid Port
Mixed GUID
Invalid Part Number
Invalid Version ID
Mixed Configuration
Invalid uninitialized Configuration

Cause  SANMAN has detected at least one port status change in a neighbor ServerNet switch.

Effect  The status of the indicated neighbor switch port or ports has changed.

Recovery  Some status values require corrective action. The recommended actions are:

Cabling may be incorrect. Check the cables.
Check whether a valid switch configuration block is loaded on the neighbor switch connected to that port.
Check cabling. If cabling appears to be correct, make sure the correct configuration image is loaded on the neighbor switch connected to that port.



4213

This message has two forms:

When the hardware is a 6770 switch this message appears:

A neighbor check error has been detected in { ServerNet switch switchID | the ServerNet switch } on the external ServerNet fabric fabric. Error type: err [ Nearest switch port number: portnumber ] location

When the hardware is a 6780 switch this message appears:

A neighbor check error has been detected in ServerNet switch fabric.zone.layer, group group, module module, { slot slot, port port | router instance slot, router port } Error type: err3 location3

NOTE: This event applies to the 6770 switch and the 6780 switch.

switchID

has these possible values:

-2  Blank Mfg Dflt Switch Position ID
1  Cluster Switch Position ID 1
2  Cluster Switch Position ID 2
3  Cluster Switch Position ID 3

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch, as follows:

1  None
2  X
3  Y

err

indicates the error found when checking the neighbor port. Possible values:

-1  Not Defined
0  Status OK
1  Link Dead
2  Wrong Fabric
3  Invalid Neighbor
4  Invalid Port
5  Mixed GUID
6  Invalid Part Number
7  Invalid Version ID
8  Mixed Configuration tag
9  Invalid Configuration Tag
10  Uninitialized
11  Disabled, unknown reason

location

reports errored device information. It includes:

Neighbor switch current port number
Expected Neighbor switch port number
Nearest switch configuration tag; (0x10000, 0x10001, 0x10002, 0x10003, 0x10004)
Neighbor switch configuration tag; (0x10000, 0x10001, 0x10002, 0x10003, 0x10004)
Expected Neighbor switch configuration tag; (0x10000, 0x10001, 0x10002, 0x10003, 0x10004)
Current Neighbor switch fabric setting; X, Y or none
Expected Neighbor switch fabric setting; X or Y
Current Neighbor switch GUID; a 6 character switch identifier
Expected Neighbor switch GUID; a 6 character switch identifier
Current Neighbor switch manufacturing part number
Expected Neighbor switch manufacturing part number; currently it has a value of 0x100150A7
Current Neighbor SCB format version
Expected Neighbor SCB format version; currently it has an ASCII value of "S2C0"
Nearest switch configuration major revision
Nearest switch configuration minor revision
Neighbor switch configuration major revision
Neighbor switch configuration minor revision

err3

is one of:

-2  Undefined Port State
-1  Unknown
0  Reset
1  Reset - Lost Optical Signal
2  Reset - Transceiver Absent
3  Reset - CRU Absent
4  No Link Alive
1792  Link Alive, Disabled
5888  Nbr Chk In Progress
9984  Nbr Chk OK, Waiting For Manage Port Cmd
9985  Nbr Chk Failed - No Link Alive
9986  Invalid Neighbor
9987  Invalid Configuration Tag
9988  Invalid Configuration Version ID
9989  Unmatched Connector Number
9990  Unmatched GUID
9991  Invalid Switch Configuration Revision
9992  Nbr Chk Failed - No Rsp To Mng Port Cmd
9993  Nbr Chk Failed - Manage Port Cmd NACKed
16128  Enabled

location3

reports errored device information. It includes:

Config version: exp=, recv=
Config tag: exp=, recv=
Config major rev: exp=, recv=
Config minor rev: exp=, recv=
Switch GUID: exp=, recv=
Slot: exp=, recv=
Port: exp=, recv=

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

group, module, slot

are the locations of the group, module, and slot respectively.

port

contains router port information.

portnumber

contains the port number.

Cause  SANMAN has detected an error with respect to the neighbor ServerNet switch.

Effect  The switch port will not be enabled for ServerNet pass-through traffic.

Recovery  Most neighbor check errors require corrective action. Check for the presence of OSM alarms and perform any recommended recovery actions.
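
For the 6770 form of this message, the fabric and err codes can be decoded with a small table-driven helper. This is only a sketch, assuming that both fields arrive as the integer codes listed above. The dictionary and function names are hypothetical and do not correspond to any SANMAN or SCF interface.

```python
# Hypothetical lookup tables; values copied from the fabric and err lists above.
FABRIC_NAMES = {1: "None", 2: "X", 3: "Y"}

NEIGHBOR_CHECK_ERRORS_6770 = {
    -1: "Not Defined",
    0: "Status OK",
    1: "Link Dead",
    2: "Wrong Fabric",
    3: "Invalid Neighbor",
    4: "Invalid Port",
    5: "Mixed GUID",
    6: "Invalid Part Number",
    7: "Invalid Version ID",
    8: "Mixed Configuration tag",
    9: "Invalid Configuration Tag",
    10: "Uninitialized",
    11: "Disabled, unknown reason",
}


def summarize_neighbor_check(fabric: int, err: int) -> str:
    """Build a one-line, human-readable summary of a 6770 neighbor check error."""
    fabric_name = FABRIC_NAMES.get(fabric, f"unlisted fabric code {fabric}")
    err_name = NEIGHBOR_CHECK_ERRORS_6770.get(err, f"unlisted error {err}")
    return f"fabric {fabric_name}: {err_name} ({err})"
```

The 6780 form uses the separate err3 value list, so it would need its own table.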



4214

This message has three forms, depending on switch type.

When the switch is a 6770 switch this message appears:

ServerNet switch switchID on the external ServerNet fabric fabric has recovered from a blocking incident. Packet Grabber Receive Length: packet-length Port Routing Status: [ Router port port-number: port-status routing to zport-number2 ] [ ... ]

When the switch is a 6770 switch and the switch ID is not present this message appears:

The ServerNet switch on the external ServerNet fabric fabric has recovered from a blocking incident. Packet Grabber Receive Length: packet-length Port Routing Status: [ Router port port-number: port-status, routing to zport-number2 ] [ ... ]

When the switch is a 6780 switch this message appears:

ServerNet switch fabric.zone.layer in group group, module module, slot slot has recovered from a blocking incident. Packet Grabber Receive Length: packet-length Port Routing Status: [ Router port port-number: port-status routing to zport-number2 ] [ ... ]

NOTE: This event applies to the 6770 switch and the 6780 switch.

switchID

has these possible values:

-2  Blank Mfg Dflt Switch Position ID
1  Cluster Switch Position ID 1
2  Cluster Switch Position ID 2
3  Cluster Switch Position ID 3

fabric

identifies the external ServerNet fabric (X or Y) of the ServerNet switch, as follows:

1  None
2  X
3  Y

packet-length

represents the packet grabber receive length.

port-number

is the router port number.

port-status

can be:

-1  Undefined Router Port Status
0  Inport is idle
1  Not blocked
2  Internally blocked
3  Externally blocked

zport-number2

is the number of the port to which the router port is routing.

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

group, module, slot

are the locations of the group, module, and slot respectively.

Cause  The ServerNet switch detected a backpressure incident and became blocked for ServerNet traffic. This event appears only if T0502AAG and T0569AAE (or superseding product revisions) are installed. T0502AAG and T0569AAE are included in the G06.14 product version.

Effect  The ServerNet switch firmware automatically recovered the switch from the backpressure incident by performing a selective reset of the switch. If the cluster switch is running T0569AAE firmware (or a superseding product revision), detection and recovery from a blocked switch incident is typically performed in a few milliseconds. In most cases, selective reset recovery from a blocked switch incident does not cause any loss of interprocessor communication (IPC) connectivity through the cluster switch under recovery. For more information about blocked switch incidents, see Support Note S01122.

Recovery  Use the SCF STATUS SUBNET $ZZSCL, PROBLEMS command to validate the connectivity status in the cluster and confirm that all of the remote IPC paths were automatically repaired. If paths are found to be down, verify the status of the backpressured switch and of all NNAs in the cluster by using OSM or SCF. Use the diagnostic commands and recovery steps documented for an unprogrammed NNA in Support Note S01122.
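
When reviewing the Port Routing Status list in this event, it can help to pick out only the blocked ports. The sketch below assumes that each list entry has already been parsed into a (port-number, port-status, zport-number2) tuple of integers. The parsing step and all names are hypothetical.

```python
# Hypothetical helper; status names are copied from the port-status list above.
PORT_STATUS = {
    -1: "Undefined Router Port Status",
    0: "Inport is idle",
    1: "Not blocked",
    2: "Internally blocked",
    3: "Externally blocked",
}

# Status codes that indicate a blocked router port.
BLOCKED_STATUSES = {2, 3}


def blocked_ports(routing_status):
    """Return (port, status name, routed-to port) for every blocked entry.

    routing_status is an iterable of (port_number, port_status, routed_to)
    integer tuples.
    """
    return [
        (port, PORT_STATUS[status], routed_to)
        for port, status, routed_to in routing_status
        if status in BLOCKED_STATUSES
    ]
```

An empty result means that none of the reported router ports was internally or externally blocked.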



4215

An internal port status change has occurred in ServerNet Switch fabric zone layer, group address, module address router instance port-status.instance-number, router port port-status.port-number [ Connected slot number: port-status.slot-number ] current status port-status.new-status

NOTE: This event applies only to the 6780 switch.

port-status

is a structured token map that contains the specific location and status information for the port whose status has changed. Its contents are:

status-version

is the current version of the data structure. This value is incremented any time the structure is changed. The field is intended to allow ongoing ServerNet Cluster releases to maintain downward compatibility if the structure must be changed.

instance-number

is the instance number of the router with the port whose changed status is being reported. Valid values range from 1 to 5.

port-number

is the ordinal number of the port whose status is being reported. For a 6780 switch, the range of valid values is from zero (0) to eleven (11).

transceiver-port-number

is the transceiver number.

slot-number

is the slot number of the CRU where the port is located. For internal ports, this field does not hold a valid value and is set to zero.

old-status

holds the status of the port prior to the change being reported.

old-status-detail

holds the status detail of the port prior to the change being reported. The value of this field is not displayed in the event message text.

new-status

holds the current status of the port being reported.

new-status-detail

holds the current status detail of the port being reported. The value of this field is not displayed in the event message text.

fabric

identifies the external ServerNet fabric (X or Y) of the switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

address

is a map token containing, in this case, the switch group and switch module where the specified switch is physically located. The third field in this map, the switch slot, is unused in this event and is set to zero.

Cause  SANMAN has detected a change in the status of an internal port on one of the internal routers in a switch.

Effect  The status of the indicated router port, internal to the switch, has been changed.

Recovery  The Reset, Uninstalled, and Link Alive, enabled statuses are informational. No corrective action is required for these status changes. Recovery from a Link Dead or Link Alive, disabled status depends on the problem that caused the change to that status.
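
The Recovery guidance can be expressed as a small triage function. This sketch assumes the new status is available as text spelled as in the Recovery paragraph above; the exact strings emitted in a real event may differ, and the function name and return values are hypothetical.

```python
# Hypothetical triage of the new-status text; spellings follow the Recovery text.
INFORMATIONAL = {"reset", "uninstalled", "link alive, enabled"}
NEEDS_INVESTIGATION = {"link dead", "link alive, disabled"}


def triage_port_status(new_status: str) -> str:
    """Classify an internal port status as informational or needing a look."""
    key = new_status.strip().casefold()  # ignore case and surrounding spaces
    if key in INFORMATIONAL:
        return "informational"
    if key in NEEDS_INVESTIGATION:
        return "investigate"
    return "unrecognized"
```

Any status outside the five named in the Recovery text is reported as unrecognized rather than guessed at.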



4216

A power status change has been detected in ServerNet switch fabric zone layer, group address, module address on the external ServerNet fabric fabric. State change(s) detected: power-item.item-name (slot power-item.item-slot) : power-item.item-status

NOTE: This event applies only to the 6780 switch.

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

address

is a map token containing, in this case, the switch group and switch module where the specified switch is physically located. The third field in this map, the switch slot, is unused in this event and is set to zero.

power-item

is a structured token map that contains the power item name, specific location, and status information for the item whose status has changed. Its contents are:

item-version

is the current version of the data structure. This value is incremented any time the structure is changed. The field is intended to allow ongoing ServerNet Cluster releases to maintain downward compatibility if the structure must be changed.

item-name

is an enumerated value which encodes the name of the specific item that has the power status change.

item-slot

is the switch slot number containing the item whose power status has changed.

item-status

is the current value of the power status flag for the specific item.

Cause  SANMAN has detected a change in the status of at least one component of the power system in a switch.

Effect  The status of the indicated component of the power system has been changed.

Recovery  If the new status, as displayed in the event, indicates that the component has failed, see the OSM alarm and follow the suggested repair action. Otherwise, this is an informational message and no recovery action is required.
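
Several 6780 events identify a switch by fabric, zone, and layer, with the ranges given in the parameter descriptions above. The following sketch validates those ranges and builds a label. The dotted fabric.zone.layer form is borrowed from the message text of events 4213 and 4214 and is an assumption here, as are the function and dictionary names.

```python
# Hypothetical helper; fabric codes and ranges come from the descriptions above.
FABRIC_NAMES = {1: "None", 2: "X", 3: "Y"}


def switch_label(fabric: int, zone: int, layer: int) -> str:
    """Validate the documented ranges and return a fabric.zone.layer label."""
    if fabric not in FABRIC_NAMES:
        raise ValueError(f"fabric code must be 1, 2, or 3, got {fabric}")
    if not 1 <= zone <= 3:
        raise ValueError(f"zone must be in the range 1 through 3, got {zone}")
    if not 1 <= layer <= 4:
        raise ValueError(f"layer must be in the range 1 through 4, got {layer}")
    return f"{FABRIC_NAMES[fabric]}.{zone}.{layer}"
```

Rejecting out-of-range values early makes a mis-parsed event easier to spot than a silently malformed label.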



4217

{ SANMAN has gathered initial internal status from | An internal status change has been detected in } ServerNet switch fabric zone layer, group address, module address. [ State change(s) detected: message ]

NOTE: This event applies only to the 6780 switch.

message

is a map token containing the values and indicators for any changed status items. Each changed item is reported with one of the following labels, accompanied by its value:

Firmware state
RTC battery fail
HDI mismatch
Numeric selector
Firmware image A status
Firmware image B status
Config image A status
Config image B status
FPGA image A status
FPGA image B status
Internal temperature (C)
Saved Dump Available: true

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

address

contains the switch group and switch module where the specified switch is physically located.

Cause  SANMAN has detected a change in the status of a component of the firmware or hardware in a switch.

Effect  The status of the indicated component of the switch has changed.

Recovery  If the new status, as displayed in the event, indicates that the component has failed, see the OSM alarm and follow the suggested repair action. Otherwise, this is an informational message and no recovery action is required.



4218

Error(s) have been detected in a asic-type ASIC on the ServerNet switch fabric zone layer, group address, module address. [ [ Router instance: router-instance. err-rprt1.type errors: err-rprt1.value [ Partial list of detailed self-check errors: | Detailed self-check error(s): ] self-check-errors ] | [ Partial list of errors: | Detected error(s): ] err-rprt2.type : err-rprt2.value. [ Partial list of detailed self-check errors/hardware exceptions: | Detailed self-check error(s)/hardware exception(s): ] self-check/hardwareexcp ] err-rprt1

NOTE: This event applies only to the 6780 switch.

asic-type

is the type of the ASIC that is having errors.

type

is the router error counter type. It can be:

-1  Invalid or Unknown
1  Self-Check

self-check-errors

is the detailed self-check error. It can be:

-1  Invalid or Unknown
1  Port 0 Self-Check
2  Port 1 Self-Check
3  Port 2 Self-Check
4  Port 3 Self-Check
5  Port 4 Self-Check
6  Port 5 Self-Check
7  Port 6 Self-Check
8  Port 7 Self-Check
9  Port 8 Self-Check
10  Port 9 Self-Check
11  Port 10 Self-Check
12  Port 11 Self-Check
13  Packet Grabber Self-Check
14  Bad Parity On Registers
15  Bad Parity In Routing Table

err-rprt1

contains data specific to the ROUTER2 counter error being reported. The data includes:

version

is the version of the data structure. This structure is introduced in the 6780 switch.

type

is the error counter type.

error-counter

is the value of the error counter.

err-rprt2

contains data specific to the ServerNet 2 Packetizer error being reported. The data includes:

version

is the version of the data structure. This structure is introduced in the 6780 switch.

type

is the type of ServerNet 2 Packetizer error. It can be:

-1  Invalid or Unknown
1  Self-Check/Hardware Exception
2  BTE Timeout
3  BTE Packet Nack
4  BTE Parameter Programming
5  BTE Flush Timeout
6  BTE Status Inconsistency
7  Packet Header
8  Packet Length
9  Spurious Acknowledgement Packet
10  This Packet Bad (TPB)
11  Bad CRC
12  AVT Bad Mask
13  AVT Table Access
14  AVT Bad Source
15  AVT Bad Path
16  AVT Access
17  AVT Bad Interrupt
18  AVT Interrupt Queue Full
19  AVT Low Priority Queue Overflow
20  AVT High Priority Queue Overflow
21  AVT Error Queue Overflow
22  Stack Link Exception
23  Stack Receive Packet Tossed
24  Stack Forward Progress Timeout
25  Stack Back-Pressure Timeout
26  Stack Self-Check
27  PCI

error-counter

is the value of the error counter.

value

is the ServerNet 2 Packetizer error count.

self-check/hardwareexcp

is the type of hardware exception or self-check. It can be:

-1  Invalid or Unknown
1  ServerNet Stack 1 Self-Check
2  ServerNet Stack 0 Self-Check
3  PCI Logic Self-Check
4  PCI Logic Hardware Exception
5  Memory Controller Self-Check
6  Memory Controller HW Exception
7  IMAN Logic Self-Check
8  IMAN Logic Hardware Exception
9  AVT Logic Self-Check
10  AVT Logic Hardware Exception
11  BTE Logic Self-Check
12  BTE Logic Hardware Exception
13  VIP Logic Self-Check
14  VIP Logic Hardware Exception
15  SMI Self-Check
16  SMI Hardware Exception

router-instance

is the router instance that is having errors.

fabric

identifies the external ServerNet fabric (X or Y) of the switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

address

is a map token containing, in this case, the group, module, and slot where the specified ASIC switch is physically located.

Cause  SANMAN has detected the occurrence of an internal error in one of the ASICs in a switch.

Effect  If the error is from a Router 2, connectivity to the External ServerNet fabric via the affected ASIC is lost until the ASIC is reset. If the error is from a Colorado 2xy, management command capability is lost until the ASIC is reset. However, in both cases, the ASIC is automatically reset by the switch firmware.

Recovery  Once the ASIC reset is complete, any lost connections are automatically recovered. OSM tracks the occurrence of this event, and beyond a threshold level, will raise an alarm. If the OSM alarm appears, follow the suggested repair action. If no OSM alarm appears, this is an informational message and no recovery action is required.
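
In the self-check-errors list for this event, codes 1 through 12 map to router ports 0 through 11, which allows a compact decoder. The sketch below is illustrative only; the function name is hypothetical, and any code that does not appear in the list is reported as unlisted rather than interpreted.

```python
# Hypothetical decoder for the self-check-errors values listed for this event.
_OTHER_SELF_CHECKS = {
    -1: "Invalid or Unknown",
    13: "Packet Grabber Self-Check",
    14: "Bad Parity On Registers",
    15: "Bad Parity In Routing Table",
}


def describe_self_check(code: int) -> str:
    """Return the documented description for a router self-check error code."""
    if 1 <= code <= 12:
        # Codes 1..12 correspond to router ports 0..11.
        return f"Port {code - 1} Self-Check"
    return _OTHER_SELF_CHECKS.get(code, f"Unlisted self-check code {code}")
```

For example, code 12 decodes to the self-check for port 11, the highest router port number.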



4219

Port error(s) have been detected on a port-info, router instance port-info, router port port-info in ServerNet switch fabric zone layer, group address, module address. Router port type: port-info [ Port number: port-info ] [ Connected slot number: port-info | [ Port connects to: [ node | zone | layer ] conn-number Associated slot number: port-info ] [ port-err-rprt.Z-TYPE: port-err-rprt.Z-VALUE ]

NOTE: This event is emitted only for 6780 switches and beyond. One event is emitted for each router port with errors. This event is not emitted during SANMAN initialization or takeover processing. This event applies only to the 6780 switch.

port-err-rprt

contains data specific to the router error being reported. The data includes:

  • Router 2 Port Error Ver - the version of the data structure. This structure is introduced with 6780 switches and its version is incremented for every new release.

  • Router 2 Port Error Type - an enumeration of counter types.

  • Router 2 Port Error Count - the counter value.

port-info

contains data specific to the router 2 where the port error is being reported. It contains:

  • Router Port Info Version - the version of the data structure.

  • Router Type - the router ASIC type.

  • Router Instance - the ASIC instance number of the router reporting the error.

  • Router Port Number - the port number, within the router instance, where the error occurred.

  • Router Port Type - the port type. This is one of External Port, Internal Port, Packetizer Port, or External Loop Port.

  • Associated Slot Number - this is displayed if the Router Port Type is External Loop Port. It corresponds to the router interconnect PIC slot number associated to the router port reported in the event.

  • Connected Slot Number - this is displayed if the Router Port Type is External Port, Internal Port, or Packetizer Port. It corresponds to the logical board slot number if the port type is either Internal Port or Packetizer Port, or it corresponds to the PIC slot number connected to the router port if the port type is External Port.

  • Cru Port Number - this is displayed if the Router Port Type is External Port. It corresponds to the CRU port number connected to the router port.

Z-TYPE

can be:

-1  Invalid or unknown errors
1  Loss of optical signal errors
2  Receive clock OK transitions
3  Link alive status changes to on
4  Link alive status changes
5  Receive FIFO overflow errors
6  Command exception errors
7  Packet missing status errors
8  Packet CRC errors
9  Packet length errors
10  ATTN command symbol received errors
11  Packet routing errors
12  Other Link Bad Symbols Received
13  This Link Bad Symbols received

fabric

identifies the external ServerNet fabric (X or Y) of the switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

node

is the ServerNet node number.

address

is a map token containing, in this case, the switch group and switch module where the specified switch is physically located. The third field in this map, the switch slot, is unused in this event and is set to zero.

conn-number

identifies the node, zone, or layer number.

Cause  SANMAN has detected the occurrence of a specific port error in one of the router ASICs in a switch.

Effect  Depends on the error.

Recovery  See if OSM is reporting an alarm for the port, and follow the repair actions for the alarm.



4220

The SMC Driver API interface call error returned error error, error detail error. [ ServerNet switch fabric zone layer. ]

NOTE: This event applies only to the 6780 switch.

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

error

is a map token containing data relative to the call and its error returns. The data includes:

  • SMC API Error Version - the version of the data structure. This structure is introduced with 6780 switches and its version is set to one. The structure version is incremented whenever the structure changes.

  • SMC API - an enumeration of the specific interface call which returned the error.

  • SMC API Error - an enumeration of the possible error returns from the SMC Driver.

  • SMC API Error Detail - an integer code of the lower level error responses received by the SMC Driver, and which resulted in the driver returning an error to SANMAN. These error codes are usually TNet Services codes.

Cause  SANMAN invoked an SMC Driver API interface function. The function returned a code other than SMC_RTN_OK.

Effect  SANMAN will automatically retry the SMC Driver API call. If subsequent retries fail, and if this attempt to invoke the SMC Driver is the result of a request from a SANMAN client, the client request will ultimately receive a failure response.

Recovery  This event is primarily provided for support personnel. SANMAN will automatically retry the SMC Driver API call. If the error persists (as indicated by additional SMN 4220 events), use the error and error detail codes in the event to determine and correct the cause of the failure. Then retry the high level command that failed.



4221

[ A CRU internal status change has been detected in ServerNet switch fabric zone layer, group address, module address. Status change(s) detected in slot address: | SANMAN has gathered initial CRU internal status from Problem(s) detected in slot address ]

NOTE: This event applies only to the 6780 switch.

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

address

is a map token containing, in this case, the switch group and switch module where the specified switch is physically located. The third field in this map, the switch slot, is unused in this event and is set to zero.

Cause  SANMAN has detected a change in the status of a CRU in a switch.

Effect  The status of the indicated component of the 6780 switch has changed.

Recovery  If the new status, as displayed in the event, indicates that the component has failed, see the OSM alarm and follow the suggested repair action. Otherwise, this is an informational message and no recovery action is required.



4222

ServerNet switch fabric zone layer, group address, module address. update-type update state changed to: [ status ]

NOTE: This event applies only to the 6780 switch.

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

address

is a map token containing, in this case, the switch group and switch module where the specified switch is physically located. The third field in this map, the switch slot, is unused in this event and is set to zero.

update-type

is an enumeration with the type of update. Possible values are invalid, firmware, and configuration.

status

is the status of the update. Possible values are Erase Image A, Write Image A, Erase Image B, Write Image B.

Cause  SANMAN has received an indication that an update has reached a new phase.

Effect  The status of the indicated component of the switch has changed.

Recovery  No recovery needed.



4223

Event logging for missing status errors and/or packet CRC errors and/or TLB/OLB command exception errors on router instance port-info, router port port-info in ServerNet switch fabric.zone.layer, group group, module module is suppressed due to excessive errors. Router port type: port-info [ Port number: port-info ] [ Connected slot number: port-info Port connects to: conntype | Associated slot number: port-info ]

NOTE: This event applies only to the 6780 switch.

port-info

contains data specific to the router 2 where the port error is being reported. It contains:

  • Router Port Info Version - the version of the data structure.

  • Router Type - the router ASIC type.

  • Router Instance - the ASIC instance number of the router reporting the error.

  • Router Port Number - the port number, within the router instance, where the error occurred.

  • Router Port Type - the port type. This is one of External Port, Internal Port, Packetizer Port, or External Loop Port.

  • Associated Slot Number - this is displayed if the Router Port Type is External Loop Port. It corresponds to the router interconnect PIC slot number associated to the router port reported in the event.

  • Connected Slot Number - this is displayed if the Router Port Type is External Port, Internal Port, or Packetizer Port. It corresponds to the logical board slot number if the port type is either Internal Port or Packetizer Port, or it corresponds to the PIC slot number connected to the router port if the port type is External Port.

  • Cru Port Number - this is displayed if the Router Port Type is External Port. It corresponds to the CRU port number connected to the router port.

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1, and the maximum value is 4.

group, module

are the locations of the group and module, respectively.

conntype

is connectivity information for an external port, such as the type of neighbor the port is connected to and numeric information that identifies the neighbor.

Cause  SANMAN has detected excessive missing status errors, packet CRC errors, or TLB/OLB command exception errors on the indicated router port.

Effect  During the suppression interval, not all router port error counters generate an SMN 4219 event.

Recovery  No recovery needed. After the suppression interval ends, an SMN 4219 event is generated that summarizes the router port error counters whose values changed during the interval.
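
The suppress-then-summarize behavior described for this event can be pictured as an accumulator that stays silent during the interval and reports once at the end. The sketch below is a conceptual illustration only and is not SANMAN's implementation. The class name, method names, and error-type strings are hypothetical, and the manual does not state the length of the interval or the threshold that triggers suppression.

```python
from collections import Counter


class SuppressedPortErrors:
    """Accumulates counter changes while per-change events are suppressed."""

    def __init__(self) -> None:
        self._changes = Counter()

    def record(self, error_type: str, delta: int = 1) -> None:
        """Note a counter change without emitting an individual event."""
        self._changes[error_type] += delta

    def end_interval(self) -> dict:
        """Return one summary of everything that changed, then start over."""
        summary = dict(self._changes)
        self._changes.clear()
        return summary
```

A caller would invoke record for each counter change seen during the interval and end_interval once when the interval expires, which mirrors the single summary event described in the Recovery text.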



4224

$ZZSMN encountered a $ZCNF access error error, error detail err-detail Operation: op

error

is the error code returned from a system configuration database access routine.

err-detail

is the error detail code returned from a system configuration database access routine.

op

is the operation that SANMAN was attempting when the error occurred. Possible values are:

1  Record Size  The error occurred while checking the size of the configuration record.
2  Record Fetch  The error occurred while trying to fetch the record.
3  Record Insert  The error occurred while trying to insert the record.
4  Database Lock  The error occurred while trying to lock the database.
5  Database Unlock  The error occurred while trying to unlock the database.
6  Record Update  The error occurred while trying to update the record.

Cause  The External ServerNet SAN manager process (SANMAN) encountered an error while using the HP NonStop operating system configuration services application programming interface (API) for access to the External ServerNet SAN subsystem configuration record.

Effect  If this error occurs during process startup, SANMAN infers zone-to-zone distances in a long-distance ServerNet cluster topology from the configuration tags loaded on the switches instead of using the zone-to-zone distance attributes stored in its private configuration record. If this error occurs later, the configuration access was prompted by an SCF [ALTER | START | STOP] SUBSYS $ZZSMN command, and that command fails with an error.

Recovery   Restart the process or reissue the failed SCF command. If the error persists, contact your service provider.
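
When logging or forwarding this event, the op code can be expanded into the operation name and description from the table above. The helper below is a sketch that assumes op arrives as an integer; the names ZCNF_OPERATIONS and describe_operation are hypothetical.

```python
# Hypothetical lookup; names and descriptions are copied from the op table above.
ZCNF_OPERATIONS = {
    1: ("Record Size", "checking the size of the configuration record"),
    2: ("Record Fetch", "trying to fetch the record"),
    3: ("Record Insert", "trying to insert the record"),
    4: ("Database Lock", "trying to lock the database"),
    5: ("Database Unlock", "trying to unlock the database"),
    6: ("Record Update", "trying to update the record"),
}


def describe_operation(op: int) -> str:
    """Expand an op code into its documented name and description."""
    if op not in ZCNF_OPERATIONS:
        return f"Unlisted operation {op}"
    name, activity = ZCNF_OPERATIONS[op]
    return f"{name}: the error occurred while {activity}."
```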



4225

Firmware has encountered an error in reading the internal temperature of the Servernet switch fabric zone layer, group address, module address on the external Servernet fabric fabric.

NOTE: This event applies only to the 6780 switch.

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1; the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1; the maximum value is 4.

address

is a map token containing, in this case, the switch group and switch module where the specified switch is physically located. The third field in this map, the switch slot, is unused in this event.

Cause  Firmware reads the internal temperature of the logic board in the ServerNet switch. If the read fails, SANMAN detects the error and generates this event.

Effect   The internal temperature of the logic board in the ServerNet switch is not displayed correctly.

Recovery   This is an informational message, and no recovery action is required.



4226

The Internal temperature read error in the Servernet switch fabric zone layer, group address, module address on the external fabric fabric has been resolved.

NOTE: This event applies only to the 6780 switch.

fabric

identifies the external ServerNet fabric (X or Y) of the 6780 switch, as follows:

1  None
2  X
3  Y

zone

is the cluster switch zone number of the switch in which a status change has been detected. The minimum value is 1; the maximum value is 3.

layer

is the cluster switch layer number of the switch in which a status change has been detected. The minimum value is 1; the maximum value is 4.

address

is a map token containing, in this case, the switch group and switch module where the specified switch is physically located. The third field in this map, the switch slot, is unused in this event.

Cause  Firmware can now read the internal temperature of the logic board in the ServerNet switch correctly. SANMAN detects this and displays this event.

Effect   The internal temperature of the logic board is now displayed correctly.

Recovery   This is an informational message, and no recovery action is required.