Operator Messages Manual
Chapter 92 SMN (External ServerNet SAN Subsystem) Messages
The messages in this chapter are generated by the HP NonStop™
External ServerNet System Area Network (SAN) subsystem. The subsystem
ID displayed by these messages includes SMN as the subsystem name.
The External ServerNet SAN subsystem (SMN) is managed and monitored
by an External ServerNet SAN manager ($ZZSMN, sometimes referred to
as SANMAN) process that runs in every system connected to a ServerNet
Cluster.
NOTE: Negative-numbered messages are common to most subsystems. If
you receive a negative-numbered message that is not described in this
chapter, see Chapter 15.
4001 The External ServerNet SAN Manager process, process-name, has started in processor cpunum. Program file: filename Priority: pri Autorestart count: count Processor list: (first-processor-in-list [next-processor-in-list, ..., last-processor-in-list]) | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | cpunum | is the number of the processor in which the primary
External ServerNet SAN manager process has started. | filename | is the name of the program file for the External ServerNet
SAN manager process. | pri | is the priority at which the External ServerNet SAN
manager process is running. | count | is the autorestart count configured for the External
ServerNet SAN manager process. | first-processor-in-list ... last-processor-in-list | is the processor list configured for the External
ServerNet SAN manager process. |
Cause The External ServerNet SAN manager process has started. Effect The External ServerNet SAN Manager process is running. Recovery This is an informational message. No corrective action is required. |
4002 The External ServerNet SAN Manager process, process-name, has terminated. Reason: reason | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | reason | indicates the reason the process terminated: |
Cause The External ServerNet SAN manager process terminated voluntarily.
Either it was terminated by an operator command, or an environmental
problem caused it to self-terminate. If this event is due to self-termination,
an associated SMN 4010 (SANMAN-Additional-Information) message reports
the environmental problem found by the External ServerNet SAN manager
process. Effect The External ServerNet SAN manager process is no longer running. Recovery If this event is due to self-termination, follow recovery instructions
for the SMN 4010 event. After correcting any environmental problems,
restart the External ServerNet SAN manager process with an operator
command. |
4003 Process process-name: Primary processor cpunum. | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | cpunum | is the number of the processor in which the primary
External ServerNet SAN manager process is running. |
Cause Either the External ServerNet SAN manager process was initialized
for the first time, or a backup process has become the primary process. Effect The External ServerNet SAN manager process is running in the
indicated processor. Recovery This is an informational message. No corrective action is required. |
4004 Process process-name: Backup process created in processor cpunum. | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | cpunum | is the number of the processor in which the backup
External ServerNet SAN manager process is running. |
Cause The External ServerNet SAN manager process has successfully
created a backup process. Effect The External ServerNet SAN manager process is no longer vulnerable
to a single failure. Recovery This is an informational message. No corrective action is required. |
4005 Process process-name: Unable to create backup in processor cpunum. Process creation error: errnum Error
detail: err-detail | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | cpunum | is the number of the processor in which the backup
process-creation attempt was made. | errnum | is the Guardian process-creation error number. | err-detail | is the error detail subcode returned with the Guardian
process-creation error. |
Cause An attempt to create an External ServerNet SAN manager backup
process has failed. For information on process-creation errors and
error detail subcodes, see the Guardian Procedure Errors
and Messages Manual. Effect Until a backup process is started, External ServerNet SAN management
is vulnerable to a single failure. The External ServerNet SAN manager
process attempts to start a backup process immediately if any processor
in its processor list other than that used by the primary process
is running. The External ServerNet SAN manager process makes two restart
attempts in each processor eligible to contain the backup process.
Each failed attempt results in an SMN 4005 message. If all restart
attempts fail, an SMN 4007 message is generated. Recovery This is an informational message. Although no corrective action
is required, the message might provide information for recovery in
the event of an SMN 4007 message. |
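The retry behavior described in the Effect text above can be pictured with a short sketch. This is an illustrative Python model only, not SANMAN code: the names plan_backup_creation and try_create are hypothetical, and the order in which eligible processors are tried is an assumption.

```python
from typing import Callable, List, Optional, Set, Tuple

ATTEMPTS_PER_PROCESSOR = 2  # "two restart attempts in each processor" (from the text)


def plan_backup_creation(
    processor_list: List[int],
    primary_cpu: int,
    running: Set[int],
    try_create: Callable[[int], bool],
) -> Tuple[Optional[int], List[int]]:
    """Model the backup-creation retries described for SMN 4005.

    Returns (backup_cpu or None, SMN event numbers that would be logged).
    try_create is a hypothetical stand-in for the real process-creation call.
    """
    events: List[int] = []
    # Eligible processors: configured in the list, running, and not the primary.
    eligible = [c for c in processor_list if c != primary_cpu and c in running]
    for cpu in eligible:
        for _ in range(ATTEMPTS_PER_PROCESSOR):
            if try_create(cpu):
                events.append(4004)  # backup process created
                return cpu, events
            events.append(4005)  # each failed attempt is reported
    events.append(4007)  # all attempts failed, or no processor was eligible
    return None, events
```

With a processor list of 0, 1, 2, the primary in processor 0, and creation succeeding only in processor 2, the model yields two SMN 4005 events followed by an SMN 4004 event.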
4006 Process process-name: Backup process in processor cpunum failed. | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | cpunum | is the number of the processor in which the backup
External ServerNet SAN manager process had been running. |
Cause The backup process of the External ServerNet SAN manager process
pair failed. Effect Until a backup process is started, External ServerNet SAN management
is vulnerable to a single failure. The External ServerNet SAN manager
process attempts to start a new backup process immediately if any
processor in its processor list other than that used by the primary
process is running. The External ServerNet SAN manager process makes
two restart attempts in each processor eligible to contain the backup
process. Each failed attempt results in an SMN 4005 message. If all
restart attempts fail, an SMN 4007 message is generated. Recovery This is an informational message. Although no corrective action
is required, the message might provide information for recovery in
the event of an SMN 4007 message. |
4007 The External ServerNet SAN Manager process, process-name, is running without a backup. Reason: reason. | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | reason | indicates why there is no backup process: |
Cause Either there is no processor available for running the backup
process, or there have been multiple failures of the backup process
or attempts to create a backup process. If this event is due to backup
process-creation failures, there are associated SMN 4005 messages.
Other possible precursors are SMN 4006 and SMN 4008 messages. Effect The External ServerNet SAN manager process runs without a backup,
and the External ServerNet SAN subsystem is vulnerable to a single
failure. Whenever a processor in its processor list is reloaded, the
External ServerNet SAN manager process attempts to create a backup
there. If this event is caused by repeated backup failures or backup
process-creation failures, the External ServerNet SAN manager process
periodically attempts to create a backup. Recovery Either reload the processors on the processor list for the External
ServerNet SAN manager process, or use the information in the associated
SMN 4005, 4006, or 4008 messages to determine the cause and recovery
actions for backup process-creation failures. To list the processors
that have been configured in the processor list for the External ServerNet
SAN manager process, issue an SCF INFO command. |
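When working back from an SMN 4007 message to its precursors, it can help to collect the related events in one pass. The following Python sketch assumes the events have already been extracted into (event number, text) pairs, oldest first; that representation and the function name precursors_of_4007 are hypothetical. Stopping the search at the most recent SMN 4004 (backup created) is also an assumption about where the failure sequence begins.

```python
from typing import List, Tuple

PRECURSORS = {4005, 4006, 4008}  # precursor messages named in the Cause text

Event = Tuple[int, str]  # (SMN event number, message text); assumed shape


def precursors_of_4007(events: List[Event]) -> List[Event]:
    """Return the 4005/4006/4008 events that lead up to the last 4007.

    The scan walks backward from the last 4007 and stops at the most recent
    4004, on the assumption that a successful backup creation ends any
    earlier failure sequence.
    """
    last = max((i for i, (n, _) in enumerate(events) if n == 4007), default=None)
    if last is None:
        return []
    found: List[Event] = []
    for number, text in reversed(events[:last]):
        if number == 4004:
            break
        if number in PRECURSORS:
            found.append((number, text))
    found.reverse()  # restore oldest-first order
    return found
```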
4008 The External ServerNet SAN Manager process, process-name, backup process terminated. Reason: reason. | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | reason | indicates why the backup process was terminated: |
Cause The backup process failed, or it was terminated by the primary
(for example, if the primary found a fatal error when checkpointing
to the backup). Effect Until a backup process is started, External ServerNet SAN management
is vulnerable to a single failure. The External ServerNet SAN manager
process attempts to start a new backup process immediately if any
processor in its processor list other than that used by the primary
process is running. The External ServerNet SAN manager process makes
two restart attempts in each processor eligible to contain the backup
process. Each failed attempt results in an SMN 4005 message. If all
restart attempts fail, an SMN 4007 message is generated. Recovery This is an informational message. Although no corrective action
is required, the message might provide information for recovery in
the event of an SMN 4007 message. |
4009 External ServerNet SAN Manager internal trace
entry trace-entry. | trace-entry | contains an internally defined trace record in hexadecimal
format. |
Cause An internal trace was initiated on the External ServerNet SAN
manager process. Effect Trace data has been dumped into the EMS log. The External ServerNet
SAN manager state is unchanged. Recovery This is an informational message. No corrective action is required. |
4010 External ServerNet SAN Manager process, process-name,
reports problem. | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | problem | describes the environmental problem. Possible values
are: |
Cause The External ServerNet SAN manager process found an environmental
problem. Possibly the processor list in the startup message is invalid,
the process name is wrong (not $ZZSMN), the process was not started
under the SUPER.SUPER user ID, an internal coding error occurred,
or the process was inadvertently started in a system type that does
not require or support the External ServerNet SAN manager process.
This message is usually followed by a termination message.
Effect This is an informational message, but normally the External
ServerNet SAN manager process terminates after detecting an environmental
problem.
Recovery Corrective action might be required to correct the environmental
problem:
If problem is “Wrong Process Name” or “Bad CPU-LIST,” alter the External
ServerNet SAN manager process startup parameters (generic process
configuration under SCF).
If problem is “Wrong Processor,” correct the External ServerNet SAN
manager process startup parameters, or configure and start the External
ServerNet SAN manager process through SCF. This error occurs if the External
ServerNet SAN manager process was started manually from a TACL prompt
on a processor that is not in its processor list.
If problem is “Not Running As SUPER.SUPER,” configure and start the
External ServerNet SAN manager process under $ZZKRN as a generic process
through SCF. This error occurs if the External ServerNet SAN manager
process was started manually from a TACL prompt by a user other than
SUPER.SUPER.
If problem is “Internal Error” or “Nested Signal,” the External ServerNet
SAN manager process terminates and is restarted automatically. Submit
the ZZSA* savefile to your service provider for analysis of this problem.
If problem is “Unsupported System Topology,” abort and delete the External
ServerNet SAN manager process by issuing the SCF ABORT PROCESS $ZZKRN.#ZZSMN
and SCF DELETE PROCESS $ZZKRN.#ZZSMN commands. This error occurs if $ZZSMN
is started in a system type that does not require or support the External
ServerNet SAN manager process.
|
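The recovery cases for SMN 4010 amount to a lookup from the problem value to an action. A minimal Python sketch of that lookup follows. The problem strings and action summaries are taken from the Recovery text above; the exact spelling and case of the values in a real event are an assumption, and the name recovery_for is hypothetical.

```python
# Problem values and action summaries paraphrased from the SMN 4010 Recovery text.
_RECOVERY = {
    ("Wrong Process Name", "Bad CPU-LIST"):
        "Alter the startup parameters (generic process configuration under SCF).",
    ("Wrong Processor",):
        "Correct the startup parameters, or configure and start the process through SCF.",
    ("Not Running As SUPER.SUPER",):
        "Configure and start the process under $ZZKRN as a generic process through SCF.",
    ("Internal Error", "Nested Signal"):
        "The process restarts automatically; submit the ZZSA* savefile to your service provider.",
    ("Unsupported System Topology",):
        "Abort and delete the process: SCF ABORT PROCESS $ZZKRN.#ZZSMN, "
        "then SCF DELETE PROCESS $ZZKRN.#ZZSMN.",
}

# Flatten to one case-insensitive key per problem value.
RECOVERY_BY_PROBLEM = {
    name.casefold(): action for names, action in _RECOVERY.items() for name in names
}


def recovery_for(problem: str) -> str:
    """Return the recovery summary for an SMN 4010 problem value."""
    # Tolerate surrounding whitespace, quotation marks, and trailing punctuation.
    key = problem.strip().strip('"“”.,').casefold()
    return RECOVERY_BY_PROBLEM.get(key, "No recovery text found for this problem value.")
```

For example, recovery_for("Bad CPU-LIST") returns the instruction to alter the startup parameters under SCF.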
4011 External ServerNet SAN Manager process, process-name, version is not compatible with the current
version of the driver-type driver. | process-name | is the name of the External ServerNet SAN manager
process ($ZZSMN). | driver-type | is an enumerated value designating the driver that
SANMAN has determined incompatible with its current version. Possible
values are IBC and SMC. |
Cause SANMAN made a comparison of its own version and that of the
NonStop Kernel IBC and SMC drivers that it is using. One or both
of the versions was determined to be incompatible. Effect The External ServerNet SAN Manager process terminates. Recovery Check SPR requisites for NonStop Kernel (T9050) and SANMAN (T0502).
The operator needs to ensure that the versions of SANMAN and the
NonStop Kernel IBC and SMC drivers are compatible. |
4012 DSM Trace error err, err-detail. Operation op. | err | is the error code returned by the DSM trace routine. | err-detail | is the error detail returned by the DSM trace routine. | op | is the operation that was being performed when the
error was encountered. Possible values are: |
Cause The External ServerNet SAN manager process encountered an error
using the DSM trace routines. Effect Any pending trace is terminated. Recovery Investigate the cause of the error. Reissue the SCF TRACE command.
If the problem persists, contact your service provider. |
4101 External ServerNet fabric fabric not found due to cause [ {MSEB
| Cluster Connectivity CRU} Location group.module.slot. ] | fabric | identifies the ServerNet fabric (X or Y) that was
not found: | cause | contains the cause code identifying why the external
ServerNet fabric was not found. Possible values are: | group.module.slot | is the slot location in which the Modular ServerNet
Expansion Board (MSEB) is configured. |
Cause This message is generated by the External ServerNet SAN manager
when there is an attempt to discover an external fabric, but the discovery
fails. Discovery of an external fabric may fail due to the following
causes:
SANMAN detected a configuration error in the local
node (for example, an MSEB, NNA PIC, Cluster Connectivity CRU, or Cluster
Connectivity PIC is missing) that prevents it from sending an In-Band
Control (IBC) request to discover an external fabric.
SANMAN detected an initialization error when installing
a logical device for a nearest switch, and consequently could not
send an IBC request to discover an external fabric.
SANMAN detected an error when sending an IBC request
to discover an external fabric.
An IBC response has not been received from an external
ServerNet fabric despite several retries.
An IBC response was received from the external ServerNet
fabric, but SANMAN detected an incompatible or incorrect value in
one of the response data fields.
SANMAN tries to discover an external fabric in the following
cases:
During External ServerNet SAN manager initialization.
When the backup External ServerNet SAN manager takes
over.
When the External ServerNet SAN manager perceives
an environmental change that warrants an attempt to discover an external
fabric (for example, an MSEB CRU or Cluster Connectivity CRU insertion
event or return of link alive on the cable that connects the system
to an external fabric).
Effect The external ServerNet fabric discovery fails. The External
ServerNet SAN manager process will not attempt to bring up the physical
ServerNet connection between the node and the external fabric. The
node will not be able to communicate with other remote nodes via that
fabric. SANMAN will continue to perform periodic attempts to discover
the fabric, but in most cases these will succeed only after the condition
that caused the external fabric discovery failure is corrected.
Recovery Recovery is dependent on the cause: |
4102 External ServerNet fabric fabric found. Node number assigned to the system on the fabric fabric: node. Nearest
switch GUID: GUID. Nearest switch configuration
tag: tag. | fabric | identifies the ServerNet fabric (X or Y) that was
discovered: | node | is the ServerNet node number. | GUID | is the globally unique ID (GUID) of the nearest switch. | tag | is the configuration tag of the nearest switch. |
Cause External ServerNet fabric discovery was successful. Effect SANMAN proceeds and tries to bring up the physical ServerNet
connection between the node and the external fabric by programming
the NNA (Node Numbering Agent) for that connection. Recovery This is an informational message. No corrective action is required. |
4103 Connection to external ServerNet fabric fabric failed due to cause. [ ServerNet node number assigned by the X fabric: node-X. ServerNet node number assigned by the Y fabric: node-Y. Switch port number on the X fabric: port-X. Switch port number on the Y fabric: port-Y. Switch position ID on the X fabric: tag-X. Switch position ID on the Y fabric: tag-Y. ] [ Cluster connectivity CRU location: group.module.slot] | fabric | identifies the external ServerNet fabric (X or Y)
to which the connection failed: | cause | contains the cause code identifying why the connection to the external
ServerNet fabric failed. Possible values are: | node-X | is the ServerNet node number assigned to the X fabric.
This value is present only if cause = 16
(Node Number Mismatch). | node-Y | is the ServerNet node number assigned to the Y fabric.
This value is present only if cause = 16
(Node Number Mismatch). | port-X | is the switch port number on the X fabric. This value
is present only if cause = 16 (Node Number
Mismatch). | port-Y | is the switch port number on the Y fabric. This value
is present only if cause = 16 (Node Number
Mismatch). | tag-X | is the switch position ID on the X fabric. This value
is present only if cause = 16 (Node Number
Mismatch). | tag-Y | is the switch position ID on the Y fabric. This value
is present only if cause = 16 (Node Number
Mismatch). |
Cause The External ServerNet SAN manager process failed to activate
the physical ServerNet connection to an external fabric:
If this event is due to an NNA verification error,
an SMN 4202 message reports the NNA verification error detected by the External
ServerNet SAN manager process.
If this event is due to Service Processor I/O Library
(SPIOLIB) call errors, one or more SMN 4204 messages report the SPIOLIB
call errors detected by the External ServerNet SAN manager process.
Effect The physical ServerNet connection between the node and the external
fabric is down. The node is not able to communicate with other remote
nodes via that fabric.
Recovery Recovery depends on the cause: |
4104 Connection to external ServerNet fabric fabric brought up successfully. ServerNet node
number assigned to the system: node. | fabric | identifies the external ServerNet fabric (X or Y)
to which the connection was made. | node | is the ServerNet node number. |
Cause The physical connection between the node and an external ServerNet
fabric was brought up successfully after the External ServerNet SAN
manager process programmed the Node Numbering Agent (NNA) with the
node number assigned by that fabric. Effect Physical access to the external ServerNet fabric indicated in
the event is now enabled for the node. Recovery This is an informational message. No corrective action is required. |
4105 Connection to the external ServerNet fabric fabric was lost. Reason: cause [Cluster connectivity location: group.module.slot] | fabric | identifies the external ServerNet fabric (X or Y)
to which the connection failed: | cause | contains the cause code identifying why access to
the external ServerNet fabric failed. Possible values are: |
Cause A previously active connection to an external ServerNet fabric
was lost. Effect The physical ServerNet connection between the node and the external
fabric is down. The node cannot communicate with other remote nodes
via that fabric. In the case of a power failure, if both external fabric connections
were active before the power outage, the External ServerNet SAN manager
process generates one event for the X fabric and one event for the
Y fabric. Recovery Recovery depends on the cause. |
4201 SANMAN failed to register with Configuration
Services. Error returned by Configuration Services: err Error detail returned by Configuration Services:
err-detail | err | is the error code returned from NonStop Kernel Configuration
Services (CONFIG_SPEV_EVENTTYPE_REGISTER_ routine). | err-detail | is the error detail code returned from NonStop Kernel
Configuration Services (CONFIG_SPEV_EVENTTYPE_REGISTER_ routine). |
Cause The External ServerNet SAN manager process was not able to register
with NonStop Kernel Configuration Services. Effect The External ServerNet SAN manager process relies on NonStop
Kernel Configuration Services for fast notification of certain configuration
and status changes, such as Cluster Connectivity CRU insertion and
removal events, and link-alive status changes in the connection to
an external fabric. If the External ServerNet SAN manager fails to
register with NonStop Kernel Configuration Services, it can still
detect configuration and status changes by means of periodic checks
although a delay might occur in detecting configuration and status
changes. It might take longer for the External ServerNet SAN manager
to bring up the connection to an external fabric after a repair.
Recovery Stopping one or more subsystems that register with NonStop Kernel
Configuration Services should allow the External ServerNet SAN manager
to register itself. Registration occurs when the External ServerNet
SAN manager is initialized. The recommended long-term solution is
to request a NonStop Kernel Configuration Services (T6586) SPR capable
of supporting a larger number of registered subsystems.
If the External ServerNet SAN manager appears to be slow in
detecting configuration changes, and you want to accelerate recovery
of the external fabric connection after a repair, use the following
workaround. Issue the SCF PRIMARY PROCESS $ZZSMN command to force
the primary and backup External ServerNet SAN manager processes to
switch roles. Upon takeover, the new primary immediately checks the
External ServerNet SAN subsystem for configuration and status changes.
This list includes all possible errors returned by the NonStop
Kernel Configuration Services CONFIG_SPEV_EVENTTYPE_REGISTER_ routine: |
4202 SANMAN detected a verification error when programming
a Node Numbering Agent (NNA) PIC for the external ServerNet
fabric fabric connection. NNA PIC location: CRU in slot group.module.slot, port port | fabric | identifies the external ServerNet fabric (X or Y)
of the target ServerNet switch: | group.module.slot | is the slot location in which the Cluster Connectivity
CRU is configured. | port | is a port number on the Cluster Connectivity CRU. |
Cause The External ServerNet SAN manager attempted to configure the
ServerNet node number by writing to the internal registers of a Node
Numbering Agent (NNA). After the write operation completed, the External
ServerNet SAN manager read the register contents back for verification
purposes, compared the contents of the NNA registers with the intended
values, and found that at least one of the register values did not
match the intended value.
Effect The External ServerNet SAN manager automatically retries
programming the NNA and possibly resets the NNA during the retry. Until
the NNA is programmed correctly, the physical ServerNet connection
to the external fabric indicated by the event remains down. If the
retries fail, an SMN 4103 message is generated.
Recovery This is an informational message. Although no corrective action
is required, information for recovery might be available in the SMN
4103 message. |
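The write, read back, compare, and retry sequence described for SMN 4202 can be sketched as follows. This is an illustrative Python model under stated assumptions: the register access callables write_regs and read_regs are hypothetical stand-ins, the retry limit max_tries is an assumption (the text does not give a count), and the optional NNA reset between retries is omitted. SMN 4104 is used here to mark success, since that message reports the connection being brought up after the NNA is programmed.

```python
from typing import Callable, Dict, List


def program_nna(
    intended: Dict[str, int],
    write_regs: Callable[[Dict[str, int]], None],
    read_regs: Callable[[], Dict[str, int]],
    max_tries: int = 3,  # assumed limit; not stated in the manual
) -> List[int]:
    """Model the NNA write-and-verify loop.

    Returns the SMN event numbers that would be logged: one 4202 for each
    verification failure, then 4104 on success or 4103 when retries run out.
    """
    events: List[int] = []
    for _ in range(max_tries):
        write_regs(intended)
        actual = read_regs()
        # Verification: every intended register value must read back unchanged.
        if all(actual.get(name) == value for name, value in intended.items()):
            events.append(4104)
            return events
        events.append(4202)
    events.append(4103)  # connection to the external fabric failed
    return events
```

A single failed verification followed by a successful retry produces one SMN 4202 event and then an SMN 4104 event, which matches the guidance that SMN 4202 on its own requires no corrective action.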
4203 ServerNet node number changed from old-node to new-node. | old-node | is the old ServerNet node number. | new-node | is the new ServerNet node number. |
Cause The External ServerNet SAN manager detected that the ServerNet
node number assigned by the external fabrics to the node has changed,
because the node was moved to a different position in the ServerNet
Cluster topology. The ServerNet node number that the External ServerNet
SAN manager had previously configured in the Node Numbering Agents
(NNAs) is no longer valid. Effect The External ServerNet SAN manager configures the new ServerNet
node number in the Node Numbering Agents (NNAs). Recovery This is an informational message. No corrective action is required. |
4204 A call to Service Processor (SP) I/O library
routine routine failed. | routine | identifies the SP I/O library routine. These routines
are possible: |
Cause The External ServerNet SAN manager made a call to an SP I/O
library routine, but it failed. Effect The External ServerNet SAN manager automatically retries the
SP I/O library routine, possibly after destroying its previous session
with the Service Processor and starting a new session. If the retries
fail, and the SP I/O library routine had been called by the External
ServerNet SAN manager to program an NNA, an SMN 4103 event occurs. Recovery The External ServerNet SAN manager automatically retries the
SP I/O library routine. If the errors persist (as indicated by additional
SMN 4204 messages), an SMN 4103 message is generated. The Service
Processor might have to be initialized, or the Service Processor firmware
might have to be upgraded. Contact your service provider for assistance. |
4205 This event has three forms.
For the 6770 switch, when neither a switch tag nor a switch type
is supplied, the message has this form:
A status change has been detected in the ServerNet
switch on the external fabric fabric.
Port port-number status changed to: status1 [ Port number status
changed to: status1 ]
For the 6770 switch, when a switch tag value is supplied, the
message has this form:
A status change has been detected in ServerNet
switch switchID on the external fabric fabric. Port port-number status changed to: status1 [ Port number status changed to: status1 ]
For a 6780 switch, the message has this form:
A status change has been detected in ServerNet
switch fabric.zone.layer, in group group, module module,
slot slot. Port port-number status changed to: status2 Port connects to location
location-number [ Port number status changed to: status2, Port connects to location
location-number, ... ... ]
NOTE: This event applies to the 6770 switch and the 6780 switch.
fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch: | port-number | is the port number of one of the twelve ports on the
ServerNet switch. | status1 | Possible values are: | status2 | Possible values are: | switchID | has these possible values: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | group, module, slot | are the locations of the group, module, and slot,
respectively. | location | is one of: zone | appears if the port connects to a zone. | layer | appears if the port connects to a layer. | node | appears if the port connects to a node. |
| location-number | is the zone, layer, or node number. |
Cause SANMAN has detected at least one external port status change
in a ServerNet switch.
Effect The status of the indicated switch port or ports (including neighbors)
has changed.
Recovery Depends on the kind of switch. When the switch is a 6770 switch: Recovery When the switch is a 6780 switch: When the new status is: |
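The fabric.zone.layer identifier in the 6780 form of this message has fixed ranges: fabric is X or Y, zone is 1 through 3, and layer is 1 through 4. A small Python sketch of a validity check follows. It assumes the identifier appears as dot-separated text such as X.1.2, which is an assumption about the rendered message; the names SwitchId and parse_switch_id are hypothetical.

```python
from typing import NamedTuple


class SwitchId(NamedTuple):
    fabric: str  # "X" or "Y"
    zone: int    # 1 through 3
    layer: int   # 1 through 4


def parse_switch_id(text: str) -> SwitchId:
    """Parse and range-check a fabric.zone.layer identifier."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected fabric.zone.layer, got {text!r}")
    fabric = parts[0].upper()
    if fabric not in ("X", "Y"):
        raise ValueError(f"fabric must be X or Y, got {parts[0]!r}")
    zone, layer = int(parts[1]), int(parts[2])  # ValueError if not numeric
    if not 1 <= zone <= 3:
        raise ValueError(f"zone must be 1 through 3, got {zone}")
    if not 1 <= layer <= 4:
        raise ValueError(f"layer must be 1 through 4, got {layer}")
    return SwitchId(fabric, zone, layer)
```

Under these assumptions, parse_switch_id("X.1.4") returns a SwitchId with fabric X, zone 1, and layer 4, while an out-of-range value such as "Y.4.1" raises ValueError.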
4206 This event has two forms, depending on the version of the switch
and firmware. When the switch is a 6770 switch this message appears: SANMAN has altered an attribute of the [ ServerNet
switch switchID | the ServerNet switch
| ] on the external ServerNet fabric fabric. Reason: cause Attribute that
was altered: [ Fabric setting changed to: fabric ] [ Halt symbol propagation changed to: { ENABLE | DISABLED }] [
Switch locator string changed to string ] [ New Globally Unique ID accepted. ] When the switch is a 6780 switch this message appears: SANMAN has altered an attribute of ServerNet
switch fabric.zone.layer, group group, module module. Reason: cause Attribute that was altered:
[ Halt symbol propagation changed to: { ENABLE | DISABLED }] [ Switch
locator string changed to string2] [ New
Globally Unique ID accepted. ]
NOTE: This event applies to the 6770 switch and the 6780 switch.
type | contains the version of the switch hardware. | switchID | has these possible values: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | group, module | are the locations of the group and module, respectively. | fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch, as follows: | cause | is the reason the configuration attribute was altered,
as follows: | string | indicates the physical location of the switch. This
token is present if the switch hardware is a 6770 switch or earlier. | string2 | indicates the physical location of the switch. This
token is present if the switch hardware is a 6780 switch or later. |
Cause Either the operator issued an ALTER SWITCH command or, if the physical
hardware is a 6770 switch, SANMAN detected while discovering an external
ServerNet fabric that the required fabric setting attribute of the switch
was set to None. In the latter case, SANMAN automatically
sets the switch fabric setting attribute to match the identity of
the external fabric in which the switch was discovered.
Effect SANMAN has successfully altered an attribute of a ServerNet
switch configuration.
Recovery This is an informational message. No corrective action is required. |
4207 SANMAN detected an error when registering the driver Driver. Error: cause. | driver | is an enumerated value designating the particular
driver with which SANMAN was attempting to register when the error
was encountered. The possible values are IBC and SMC. | cause | contains the cause for the failure. Possible values
are: |
Cause The primary SANMAN process attempted to register with a NonStop
Kernel Driver, but registration failed. Effect If the failed attempt was to the IBC Driver, the primary SANMAN
process will not be able to communicate with the external ServerNet
fabrics via IBC packets, and will terminate itself. The backup SANMAN
will attempt to register with the NonStop Kernel IBC Driver in a different
processor when it becomes the new primary. The SANMAN process pair
will terminate if registration with the NonStop Kernel IBC Driver
does not succeed in at least one processor in SANMAN's configured
CPU list. If the failed attempt was to the SMC Driver, the primary
SANMAN process will not be able to communicate directly with 6780
switches via ServerNet RDMA reads and writes, and will terminate itself.
The backup SANMAN process will take over and attempt to register with
the SMC Driver when it becomes the new primary. The SANMAN process
pair will terminate if registration with the NonStop Kernel SMC Driver
does not succeed in at least one processor in SANMAN's configured
CPU list. Recovery SANMAN will automatically attempt to register with the NonStop
Kernel Driver on a different processor. A ZZSA* save abend file will
be created when the primary SANMAN process terminates. The ZZSA* save
abend file should be provided for analysis, along with the ZZSV* service
log event file containing the SMN 4207 event. If the error detected
by SANMAN was a failure to register a permissive listener interrupt
handler, the recommended long-term solution is to request a TNet Millicode
(T8460) product revision capable of supporting a larger number of
registered subsystems. If the errors indicate that SANMAN has attempted
to register with the indicated driver more than once, the errors are
programming errors. SANMAN should not attempt to register with a NonStop
Kernel Driver more than once. Ensure that the saveabend file and event
logs are preserved and report this incident to your HP service provider.
If the error code implies that there is a mismatch between the executing
versions of SANMAN and the NonStop Kernel, the event logs should contain
an SMN 4011 event with additional details on the version mismatch
between SANMAN and the NonStop Kernel. The operator needs to ensure
that the versions of SANMAN and the NonStop Kernel IBC and SMC drivers
are compatible. Check the level of the installed product revisions
for both T9050 and T0502 and the product revision requisites for these
two products. If the two do not match, install any maintenance required
to ensure that the products are at compatible levels. If the error
codes indicate a shortage of memory in NonStop Kernel data space,
it is highly likely that the automatic retry when the backup takes
over will recover this situation. If it does not, the operator may
attempt to stop other processes to free up system global memory, then
retry starting the SANMAN process using the SCF command START PROCESS
$ZZKRN.ZZSMN. Otherwise, determine and correct the cause of the error.
When the error is corrected, it may be necessary to manually restart
SANMAN by using the SCF START PROCESS $ZZKRN.ZZSMN command. |
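The failover rule described in the Effect text for event 4207 can be modeled as a short loop. The following Python sketch is illustrative only and is not SANMAN's implementation: the function name register_with_failover and the try_register callback are hypothetical, and the callback stands in for a registration attempt with the NonStop Kernel IBC or SMC Driver in one processor.

```python
from typing import Callable, Iterable, Optional


def register_with_failover(cpu_list: Iterable[int],
                           try_register: Callable[[int], bool]) -> Optional[int]:
    """Return the first processor in which registration succeeds, else None.

    Models the behavior described for event 4207: a failed registration
    ends the current primary, the backup retries in another processor,
    and the process pair terminates only if no configured processor
    succeeds (modeled here by returning None).
    """
    for cpu in cpu_list:
        if try_register(cpu):
            return cpu
    return None


# Hypothetical example: registration fails in processor 0, succeeds in 1.
result = register_with_failover([0, 1], lambda cpu: cpu == 1)
```

A None result corresponds to the case in which the SANMAN process pair terminates and must be restarted manually after the cause is corrected.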
4208 A power status change has been detected in [
ServerNet switch switchID | the ServerNet
switch ] on the external ServerNet fabric fabric. The specific status that changed is: status |  |  |  |  |  | NOTE: This event applies only to the 6770 switch. |  |  |  |  |
switchID | is the position ID of the reporting ServerNet switch.
Possible values are: 1, 2 or 3. The position ID is the location of
a cluster switch on the fabric using the current topology. X1/Y1 cluster
switches have a position ID of 1. X2/Y2 cluster switches have a position
ID of 2. X3/Y3 cluster switches have a position ID of 3. | fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch, as follows: | status | is the status that changed. Possible values are: UPS on/ok status changed to: { Normal | Abnormal }
Operation. Mode of Operation status changed to: { Battery
Operation | Line Operation }. Battery Voltage status changed to: { OK | Low }. Battery management status changed to: { Charging
| Discharging | Floating | Resting }. Line Regulation status changed to: { Normal,Straight
Through | Step Down, Buck | Step Up, Boost }. Attention Required status changed to: { No Attention
Required | Battery Failure | Ground Failure | Overloaded (but
not shutting down) }. Malfunction! Immediate Action status changed to:
{ No Malfunction | Backfeed Contact Failure | Overload (UPS
will shut down) | Inverter Under Voltage }. Primary Power rail on/off status changed to: { On
| Off }. Backup Power rail on/off status changed to: { On
| Off }. UPS cable detected status changed to { On | Off }. UPS responding status changed to { On | Off }. Power Currently Available to the UPS: Remaining Backup Time changed to: nn minutes [or more]. Nominal AC input voltages: nn volts
|
Cause The ServerNet SAN manager has detected at least one power environment
status change in a ServerNet switch. Effect One or more elements of the AC power in a ServerNet switch
have changed status. These changes could include the detection of
an error or the detection of recovery from a previously detected error. Recovery The following status changes are informational only and require
no corrective action: UPS on/ok status changed to: Normal operation. | Mode of Operation status changed to: Line Operation. | Battery Voltage status changed to: OK. | Battery management status changed to: { Charging | Discharging
| Floating | Resting }. | Line Regulation status changed to: { Normal,Straight Through
| Step Down, Buck | Step Up, Boost }. | Attention Required status changed to: No Failure. | Malfunction! Immediate Action status changed to: None. | Remaining Backup Time: nn minutes. |
Other status changes require recovery action, depending
on the change reported. |
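The informational list in the Recovery text for event 4208 lends itself to a simple lookup. The Python sketch below is a hedged illustration, not part of any HP tool: the names INFORMATIONAL and is_informational are hypothetical, and the input is assumed to be the status text exactly as it appears in the event. Because this chapter words two values differently in the status list and in the Recovery list, the sketch accepts both wordings.

```python
# Items and values that the Recovery text for event 4208 lists as
# informational. None means every value of that item is informational.
# Two values are worded differently in the status list and in the
# Recovery list, so both wordings are accepted here (an assumption).
INFORMATIONAL = {
    "UPS on/ok": {"normal operation"},
    "Mode of Operation": {"line operation"},
    "Battery Voltage": {"ok"},
    "Battery management": None,
    "Line Regulation": None,
    "Attention Required": {"no failure", "no attention required"},
    "Malfunction! Immediate Action": {"none", "no malfunction"},
}


def is_informational(status_line: str) -> bool:
    """Return True if the 4208 status text needs no corrective action."""
    line = status_line.strip().rstrip(".")
    if "Remaining Backup Time" in line:
        return True
    item, sep, value = line.partition(" status changed to")
    if not sep or item not in INFORMATIONAL:
        return False  # any other item: review the reported change
    allowed = INFORMATIONAL[item]
    value = value.lstrip(":").strip().lower()
    return allowed is None or value in allowed
```

Any status for which the function returns False falls under "other status changes" and should be reviewed against the change reported.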
4209 [ SANMAN has gathered initial internal status
in ServerNet switch switchID on the external
ServerNet fabric fabric. | An internal
status change has been detected in ServerNet switch switchID on the external ServerNet fabric fabric.
{ Detected error conditions: error |
Repairs to previously detected errors: repaired-error }] |  |  |  |  |  | NOTE: This event applies only to the 6770 switch. |  |  |  |  |
switchID | is the position ID of the reporting ServerNet switch.
Possible values are: 1, 2 or 3. The position ID is the location of
a cluster switch on the fabric using the current topology. X1/Y1 cluster
switches have a position ID of 1. X2/Y2 cluster switches have a position
ID of 2. X3/Y3 cluster switches have a position ID of 3. | fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch, as follows: | error | is a hardware or firmware error that was detected.
Possible values are: Unknown error type. | Bad program checksum. | Factory-default blank configuration was loaded. | New configuration bad after configuration download. | Both configurations bad after configuration download. | SRAM Memory test failed. | FLASH sector test failed. | Bad SEEPROM Checksum. | SBUSY error. | FLASH Program error. | Power supply fan failure. | Switch is not responding. | SP ran over allocated space. | Bad program CRC. | Bad switch configuration CRC. | Firmware images are different. | Configuration images are different. | Router self check. | Bad Flash ID String. | Flash boot lockout 0 error. | Flash boot lockout 1 error. |
| repaired-error | is a previously detected error that has been repaired.
Possible values are: No unknown errors. | Program checksum is now OK. | Current persistent router configuration was loaded. | New configuration OK after configuration download. | At least one configuration OK after configuration download. | SRAM Memory test OK. | FLASH sector test OK. | SEEPROM Checksum OK. | No SBUSY errors detected. | No FLASH Program errors detected. | Both power supply fans are now OK. | Switch is now responding. | SP is no longer over allocated space. | Program CRC is now OK. | Switch configuration CRC is now OK. | Firmware images are identical. | Configuration images are identical. | No router self check. | Flash ID string is now OK. | No flash boot lockout 0 errors detected. | No flash boot lockout 1 errors detected. |
|
Cause SANMAN has detected at least one firmware and/or hardware status
change in a ServerNet switch. Effect Depends on the status change. An error has been detected or
a repair has been detected in one or more aspects of the ServerNet
Switch firmware or hardware. Recovery The following status changes are informational only and require
no corrective action: No unknown errors.
Program checksum is now OK.
Current persistent router configuration was loaded.
New configuration OK after configuration download.
At least one configuration OK after configuration download.
SRAM Memory test OK.
FLASH sector test OK.
SEEPROM Checksum OK.
No SBUSY errors detected.
No FLASH Program errors detected.
Both power supply fans are now OK.
Switch is now responding.
SP is no longer over allocated space.
Program CRC is now OK.
Switch configuration CRC is now OK.
Firmware images are identical.
Configuration images are identical.
No router self check.
Flash ID string is now OK.
No flash boot lockout 0 errors detected.
No flash boot lockout 1 errors detected.
|
Other status changes require recovery action, depending on the
change reported. |
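Because event 4209 reports both errors and repairs, an operator script can replay the status texts to see which errors are still outstanding. The Python sketch below is a hedged illustration: the pairing of each error with a repair text is inferred from the order of the two lists above, only a few of the pairs are shown, and the names REPAIR_FOR, ERROR_FOR, and outstanding_errors are hypothetical.

```python
# Error text -> matching repair text, paired by position in the two
# lists given for event 4209. The pairing is an inference from list
# order, and only a few of the pairs are shown here.
REPAIR_FOR = {
    "Bad program checksum.": "Program checksum is now OK.",
    "SRAM Memory test failed.": "SRAM Memory test OK.",
    "Power supply fan failure.": "Both power supply fans are now OK.",
    "Switch is not responding.": "Switch is now responding.",
    "Bad program CRC.": "Program CRC is now OK.",
}
ERROR_FOR = {repair: error for error, repair in REPAIR_FOR.items()}


def outstanding_errors(event_texts):
    """Replay 4209 status texts in order; return errors not yet repaired."""
    open_errors = set()
    for text in event_texts:
        if text in REPAIR_FOR:
            open_errors.add(text)
        elif text in ERROR_FOR:
            open_errors.discard(ERROR_FOR[text])
    return open_errors
```

An empty result means that every error seen in the replayed events has a later matching repair; anything left in the set still calls for recovery action.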
4210 This message has two forms. When the hardware is a 6770 switch this message appears: The External ServerNet SAN Manager process has
loaded a new filetype into [ ServerNet switch switchID | the ServerNet switch in the external ServerNet fabric fabric. ] Download file: download-file Download file version: version Current
revision: [ Release=rev ] Major=rev; Minor=rev New revision:
[ Release=rev ] Major=rev; Minor=rev [ Current configuration tag:
[0x]input-tag ] [ New configuration tag:
[0x]tag ] When the hardware is a 6780 switch this message appears: The External ServerNet SAN Manager process has
loaded a new filetype into ServerNet switch fabric.zone.layer, group group, module module, slot slot. Download file: download-file Download
file version: version Current revision:
[ Release=rev ] Major=rev; Minor=rev New revision: [ Release=rev ] Major=rev; Minor=rev [ Current configuration tag: input-tag ] |  |  |  |  |  | NOTE: This event applies to the 6770 switch and the 6780 switch. |  |  |  |  |
filetype | indicates the type of download file. Possible values
are: | switchID | has these possible values: | fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch, as follows: | download-file | indicates the name of the download file. | version | indicates the VPROC of the download file. | rev | indicates revision number. | input-tag | indicates the current tag of the switch configuration
block loaded into the specified ServerNet switch. | tag | indicates the new tag of the switch configuration
block loaded into the specified ServerNet switch. | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | group, module, slot | are the locations of the group, module, and slot,
respectively. |
Cause The operator issued a LOAD SWITCH command to download a firmware,
configuration, or FPGA file into a ServerNet switch. Effect SANMAN has successfully loaded the download file into the ServerNet
switch. Recovery This is an informational message. No corrective action is required. |
4211 This message has three forms. When the hardware is a 6770 switch this message appears: The External ServerNet SAN Manager process has
reset [ ServerNet switch switchID | the
ServerNet switch ] on the external ServerNet
fabric fabric. Reset type: reset-type When the hardware is a 6780 switch and this SANMAN process requested
the reset this message appears: The External ServerNet SAN Manager process has
reset ServerNet switch fabric.zone.layer, group group, module module on the external ServerNet fabric fabric. Reset type: reset-type Reset type detail: detail ServerNetID of reset requestor: snid When the hardware is a 6780 switch and this SANMAN process is
not the one which requested the reset this message appears: The External ServerNet SAN Manager process has
detected a reset in ServerNet switch fabric.zone.layer, group group, module module on the external ServerNet fabric fabric. Reset type: reset-type Reset type detail: detail
[ ServerNetID of reset requestor: snid HDI Identifier: hdi Numeric Selector
value used: sel. Configuration Tag: config-tag. | Firmware Exception Error: error. ] |  |  |  |  |  | NOTE: This event applies to the 6770 switch and the 6780 switch. |  |  |  |  |
switchID | has these possible values: | fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch, as follows: | reset-type | indicates the type of reset. Possible values are Hard
Reset and Soft Reset. | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | group, module | are the locations of the group and module, respectively. | detail | indicates details about the type of reset. Possible
values include: | snid | contains the ServerNet ID of the requestor. | hdi | is the value of the current Hardware Data Identifier. | config-tag | contains the configuration tag of the configuration
that was loaded. | sel | contains the value for the numeric selector (a.k.a.
thumb wheels) that was used to load the configuration. | error | contains the exception error code. |
Cause SANMAN has detected that the indicated ServerNet switch has
been reset. Effect The specified ServerNet switch has been reset. Recovery If the reset type is Watchdog Level 2 Timer Expiration or Watchdog
Level 3 Timer Expiration, contact your service provider. Also, check
for the presence of OSM alarms and perform any recommended recovery
actions. Otherwise, this is an informational message. No corrective
action is required. |
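The Recovery rule for event 4211 reduces to a check for two watchdog expirations. The Python sketch below is illustrative only: the function name reset_recovery is hypothetical, and because this chapter does not show which field carries the watchdog text, the sketch assumes it may appear in either the reset type or the reset type detail.

```python
# Reset causes that the Recovery text for event 4211 says to escalate.
ESCALATE = (
    "Watchdog Level 2 Timer Expiration",
    "Watchdog Level 3 Timer Expiration",
)


def reset_recovery(reset_type: str, detail: str = "") -> str:
    """Return the recovery guidance for a 4211 event."""
    # Assumption: the watchdog text may be in either field.
    text = f"{reset_type} {detail}"
    if any(name in text for name in ESCALATE):
        return "Contact your service provider and check for OSM alarms."
    return "Informational; no corrective action is required."
```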
4212 A neighbor status change has been detected in
ServerNet switch switchID on the external
ServerNet fabricID fabric [ Port port status changed to:status
] [ Switch port port neighbor status changed
to: status ] |  |  |  |  |  | NOTE: This event applies only to the 6770 switch. |  |  |  |  |
switchID | is the position ID of the reporting ServerNet switch.
Possible values are: 1, 2 or 3. The position ID is the location of
a cluster switch on the fabric using the current topology. X1/Y1 cluster
switches have a position ID of 1. X2/Y2 cluster switches have a position
ID of 2. X3/Y3 cluster switches have a position ID of 3. | fabricID | identifies the external ServerNet fabric of the ServerNet
switch. Possible values are X and Y. | port | is the port number. Possible values are in the range
0 through 11. | status | is the status of the neighbor connected to a switch
port. Possible values are: Not Defined | Status OK | Link Dead | Disabled, unknown reason | Wrong Fabric | Invalid | Invalid Port | Mixed GUID | Invalid Part Number | Invalid Version ID | Mixed Configuration | Invalid uninitialized Configuration |
|
Cause SANMAN has detected at least one port status change in a neighbor
ServerNet switch. Effect The status of the indicated neighbor switch port or ports has
changed. Recovery The following messages require corrective action: Cabling may
be incorrect. Check cables. Check if a valid switch configuration
block is loaded on the neighbor switch connected to that port. Check
cabling. If cabling appears to be correct, make sure the correct configuration
image is loaded on the neighbor switch connected to that port. |
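The position ID and port ranges given for event 4212 can be checked mechanically. The Python sketch below is a hedged illustration with hypothetical function names; it assumes cluster switch names are written as a fabric letter followed by the position ID (for example X1 or Y3), as in the parameter description above.

```python
def parse_cluster_switch(name: str):
    """Split a cluster switch name such as 'X2' into (fabric, position ID)."""
    fabric, position = name[:1].upper(), name[1:]
    if fabric not in ("X", "Y") or position not in ("1", "2", "3"):
        raise ValueError(f"not a cluster switch name: {name!r}")
    return fabric, int(position)


def valid_port(port: int) -> bool:
    """Port numbers reported by event 4212 range from 0 through 11."""
    return 0 <= port <= 11
```

For example, parse_cluster_switch("X2") yields fabric X and position ID 2, which matches the X2/Y2 row of the position ID description.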
4213 This message has two forms: When the hardware is a 6770 switch this message appears: A neighbor check error has been detected in
{ ServerNet switch switchID | the ServerNet
switch } on the external ServerNet fabric fabric. Error type: err [ Nearest switch port number:
portnumber ] location When the hardware is a 6780 switch this message appears: A neighbor check error has been detected in
ServerNet switch fabric.zone.layer, group group, module module, {
slot slot, port port | router instance slot, router port } Error type: err3 location3 |  |  |  |  |  | NOTE: This event applies to the 6770 switch and the 6780 switch. |  |  |  |  |
switchID | has these possible values: | fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch, as follows: | err | indicates the error found when checking the neighbor
port. Possible values: | location | reports errored device information. It includes: Neighbor switch current port number | Expected Neighbor switch port number | Nearest switch configuration tag; (0x10000, 0x10001, 0x10002,
0x10003, 0x10004) | Neighbor switch configuration tag; (0x10000, 0x10001, 0x10002,
0x10003, 0x10004) | Expected Neighbor switch configuration tag; (0x10000, 0x10001,
0x10002, 0x10003, 0x10004) | Current Neighbor switch fabric setting; X, Y or none | Expected Neighbor switch fabric setting; X or Y | Current Neighbor switch GUID; a 6 character switch identifier | Expected Neighbor switch GUID; a 6 character switch identifier | Current Neighbor switch manufacturing part number | Expected Neighbor expected manufacturing part number; currently
it has a value of 0x100150A7 | Current Neighbor SCB format version | Expected Neighbor SCB format version; currently it has an
ASCII value of "S2C0" | Nearest switch configuration major revision | Nearest switch configuration minor revision | Neighbor switch configuration major revision | Neighbor switch configuration minor revision |
| err3 is one of | | location3 | reports errored device information. It includes: Config version: exp=, recv= | Config tag: exp=, recv= | Config major rev: exp=, recv= | Config minor rev: exp=, recv= | Switch GUID:exp=, recv= | Slot: exp=, recv= | Port: exp=, recv= |
| zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | group, module, slot | are the locations of the group, module, and slot respectively. | port | contains router port information. | portnumber | contains the port number. |
Cause SANMAN has detected an error with respect to the neighbor ServerNet
switch. Effect The switch port will not be enabled for ServerNet pass-through
traffic. Recovery Most neighbor check errors require corrective
action. Check for the presence of OSM alarms and perform any recommended
recovery actions. |
4214 This message has three forms, depending on switch type. When the switch is a 6770 switch this message appears: ServerNet switch switchID on the external ServerNet fabric fabric
has recovered from a blocking incident. Packet Grabber Receive Length:
packet-length Port Routing Status: [ Router port port-number: port-status routing to zport-number2 ] [ ... ] When the switch is a 6770 switch and the switch ID is not present
this message appears: The ServerNet switch on the external ServerNet fabric fabric has recovered from a blocking incident.
Packet Grabber Receive Length: packet-length Port Routing Status:
[ Router port port-number: port-status, routing to zport-number2
] [ ... ] When the switch is a 6780 switch this message appears: ServerNet switch fabric.zone.layer in group
group, module module, slot slot Has recovered from a blocking incident.
Packet Grabber Receive Length: packet-length Port Routing Status:
[ Router port port-number: port-status routing to zport-number2 ] [ ... ] |  |  |  |  |  | NOTE: This event applies to the 6770 switch and the 6780 switch. |  |  |  |  |
switchID | has these possible values: | fabric | identifies the external ServerNet fabric (X or Y)
of the ServerNet switch, as follows: | packet-length | represents the packet grabber receive length. | port-number | represents port number. | port-status | can be: | zport-number2 | represents port number. | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | group, module, slot | are the locations of the group, module, and slot respectively. |
Cause The ServerNet switch detected a backpressure incident and became
blocked for ServerNet traffic. This event appears only if T0502AAG
and T0569AAE (or superseding product revisions) are installed. T0502AAG
and T0569AAE are included in the G06.14 product version. Effect The ServerNet switch firmware automatically recovered the switch
from the backpressure incident by performing a selective reset of
the switch. If the cluster switch is running T0569AAE firmware (or
a superseding product revision), detection and recovery from a blocked
switch incident is typically performed in a few milliseconds. In most
cases, selective reset recovery from a blocked switch incident does
not cause any loss of interprocessor communication (IPC) connectivity
through the cluster switch under recovery. For more information about
blocked switch incidents, see Support Note S01122. Recovery Use the SCF STATUS SUBNET $ZZSCL, PROBLEMS command to validate
the connectivity status in the cluster and confirm that all of the
remote IPC paths were automatically repaired. If paths are found to
be down, verify the status of the backpressured switch and of all
NNAs in the cluster by using OSM or SCF. Use the diagnostic commands
and recovery steps documented for an unprogrammed NNA in Support Note
S01122. |
4215 An internal port status change has occurred
in ServerNet Switch fabric zone layer, group address, module address router instance port-status.instance-number, router port port-status.port-number [ Connected slot number: port-status.slot-number ] current status port-status.new-status |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
port-status | is a structured token map that contains the specific
location and status information for the port whose status has changed.
Its contents are: status-version | is the current version of the data structure. This
value is incremented any time the structure is changed. The field
is intended to allow ongoing ServerNet Cluster releases to maintain
downward compatibility if the structure must be changed. | instance-number | is the instance number of the router with the port
whose changed status is being reported. Valid values range from 1
to 5. | port-number | is the ordinal number of the port whose status is
being reported. For a 6780 switch, the range of valid values is from
zero (0) to eleven (11). | transceiver-port-number | is the transceiver number. | slot-number | is the CRU where the port is located. Does not have
a valid value (zero) for internal ports. | old-status | holds the status of the port prior to the change being
reported. | old-status-detail | holds the status detail of the port prior to the change
being reported. The value of this field is not displayed in the event
message text. | new-status | holds the current status of the port being reported. | new-status-detail | holds the current status detail of the port being
reported. The value of this field is not displayed in the event
message text. |
| fabric | identifies the external ServerNet fabric (X or Y)
of the switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | address | is a map token containing, in this case, the switch
group and switch module where the specified switch is physically located.
The third field in this map, the switch slot, is unused in this event
and is set to zero. |
Cause SANMAN has detected a change in the status of an internal port
on one of the internal routers in a switch. Effect The status of the indicated router port, internal to the switch,
has been changed. Recovery Reset, Uninstalled, and Link Alive, enabled are informational
messages. No corrective action is required for these status changes.
Recovery from Link Dead or Link Alive, disabled depends on the problem
that caused the change to this status. |
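The value ranges and the Recovery rule for event 4215 can be expressed as two small checks. The Python sketch below is illustrative only; the names are hypothetical, and the status strings are taken as written in the Recovery text above, which may not match the exact text displayed in every event.

```python
# Status values named in the Recovery text for event 4215.
INFORMATIONAL = {"Reset", "Uninstalled", "Link Alive, enabled"}
NEEDS_DIAGNOSIS = {"Link Dead", "Link Alive, disabled"}


def port_status_action(new_status: str) -> str:
    """Classify the new status of an internal router port."""
    if new_status in INFORMATIONAL:
        return "informational"
    if new_status in NEEDS_DIAGNOSIS:
        return "investigate"
    return "unknown"


def valid_location(zone: int, layer: int, instance: int, port: int) -> bool:
    """Check the documented ranges for a 6780 switch internal port."""
    return (1 <= zone <= 3 and 1 <= layer <= 4
            and 1 <= instance <= 5 and 0 <= port <= 11)
```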
4216 A power status change has been detected in
ServerNet switch fabric zone layer, group address, module address on the external ServerNet fabric fabric.
State change(s) detected: power-item.item-name (slot power-item.item-slot) : power-item.item-status |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | address | is a map token containing, in this case, the switch
group and switch module where the specified switch is physically located.
The third field in this map, the switch slot, is unused in this event
and is set to zero. | power-item | is a structured token map that contains the power
item name, specific location, and status information for the item
whose status has changed. Its contents are: item-version | is the current version of the data structure. This
value is incremented any time the structure is changed. The field
is intended to allow ongoing ServerNet Cluster releases to maintain
downward compatibility if the structure must be changed. | item-name | is an enumerated value which encodes the name of the
specific item that has the power status change. | item-slot | is the switch slot number containing the item whose
power status has changed. | item-status | is the current value of the power status flag for
the specific item. |
|
Cause SANMAN has detected a change in the status of at least one component
of the power system in a switch. Effect The status of the indicated component of the power system has
been changed. Recovery If the new status, as displayed in the event, indicates that
the component has failed, see the OSM alarm and follow the suggested
repair action. Otherwise, this is an informational message and no
recovery action is required. |
4217 { SANMAN has gathered initial internal status
from | An internal status change has been detected in } ServerNet
switch fabric zone layer, group address, module address. [ State change(s) detected:
message ] |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
message | is a map token containing the values and indicators
for any changed status items. Its message is listed below and is accompanied
by an appropriate value: Firmware state | RTC battery fail | HDI mismatch | Numeric selector | Firmware image A status | Firmware image B status | Config image A status | Config image B status | FPGA image A status | FPGA image B status | Internal temperature (C) | Saved Dump Available: true |
| fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | address | contains the switch group and switch module where
the specified switch is physically located. |
Cause SANMAN has detected a change in the status of a component of
the firmware or hardware in a switch. Effect The status of the indicated component of the switch has changed. Recovery If the new status, as displayed in the event, indicates that
the component has failed, see the OSM alarm and follow the suggested
repair action. Otherwise, this is an informational message and no
recovery action is required. |
4218 Error(s) have been detected in a asic-type ASIC on the ServerNet switch fabric zone layer, group address, module address. [ [ Router instance: router-instance. err-rprt1.type errors: err-rprt1.value [ Partial list of detailed self-check errors: | Detailed
self-check error(s): ] self-check-errors ] | [ Partial list
of errors: | Detected error(s): ] err-rprt2.type
: err-rprt2.value. [ Partial list of detailed self-check errors/hardware
exceptions: | Detailed self-check error(s)/hardware exception(s):
] self-check/hardwareexcp ] err-rprt1 |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
asic-type | is the type of the ASIC that is having errors. | type | is the router error counter type. It can be: | self-check-errors | It can be: | err-rprt1 | contains data specific to the ROUTER2 counter error
being reported. The data includes: version | is the version of the data structure. This structure
is introduced in the 6780 switch. | type | is the error counter type. | error-counter | is the value of the error counter |
| err-rprt2 | contains data specific to the ROUTER2 counter error
being reported. The data includes: version | is the version of the data structure. This structure
is introduced in the 6780 switch. | type | is the type of ServerNet 2 Packetizer error. It can
be: | error-counter | is the value of the error counter |
| value | is the ServerNet 2 Packetizer error count. | self-check/hardwareexcp | is the type of hardware exception or self check. It
can be: | router-instance | is the router instance that is having errors. | fabric | identifies the external ServerNet fabric (X or Y)
of the switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | address | is a map token containing, in this case, the group,
module, and slot where the specified ASIC switch is physically located. |
Cause SANMAN has detected the occurrence of an internal error in one
of the ASICs in a switch. Effect If the error is from a Router 2, connectivity to the External
ServerNet fabric via the affected ASIC is lost until the ASIC is reset.
If the error is from a Colorado 2xy, management command capability
is lost until the ASIC is reset. However, in both cases, the ASIC
is automatically reset by the switch firmware. Recovery Once the ASIC reset is complete, any lost connections are automatically
recovered. OSM tracks the occurrence of this event, and beyond a threshold
level, will raise an alarm. If the OSM alarm appears, follow the suggested
repair action. If no OSM alarm appears, this is an informational message
and no recovery action is required. |
4219 Port error(s) have been detected on a port-info, router instance port-info, router port port-info in ServerNet switch fabric zone layer, group address, module address. Router port type: port-info [ Port number: port-info
] [ Connected slot number: port-info | [ Port connects to: [ node | zone | layer
] conn-number Associated slot number: port-info ] [ port-err-rprt. Z-TYPE:port-err-rprt.Z-VALUE ] |  |  |  |  |  | NOTE: This event is emitted for 6780 switches and beyond, only. There
is one event for each router port with errors. This event is not emitted during SANMAN initialization or takeover
processing. This event applies only to the 6780 switch. |  |  |  |  |
port-err-rprt | contains data specific to the router error being reported.
The data includes: Router 2 Port Error Ver - the version of the data
structure. This structure is introduced with 6780 switches and its
version is incremented for every new release. Router 2 Port Error Type - an enumeration of counter
types. Router 2 Port Error Count - the counter value.
| port-info | contains data specific to the router 2 where the port
error is being reported. It contains: Router Port Info Version - the version of the data
structure. Router Type - the router ASIC type. Router Instance - the ASIC instance number of the
router reporting the error. Router Port Number - the port number, within the router
instance, where the error occurred. Router Port Type - the port type. This is one of External
Port, Internal Port, Packetizer Port, or External Loop Port. Associated Slot Number - this is displayed if the
Router Port Type is External Loop Port. It corresponds to the router
interconnect PIC slot number associated to the router port reported
in the event. Connected Slot Number - this is displayed if the Router
Port Type is External Port, Internal Port, or Packetizer Port. It
corresponds to the logical board slot number if the port type
is either Internal Port or Packetizer Port, or it corresponds to the
PIC slot number connected to the router port if the port type is External
Port. Cru Port Number - this is displayed if the Router
Port Type is External Port. It corresponds to the CRU port number
connected to the router port.
| Z-TYPE | can be: fabric | identifies the external ServerNet fabric (X or Y)
of the switch, as follows: |
| zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | node | is the ServerNet node number. | address | is a map token containing, in this case, the switch
group and switch module where the specified switch is physically located.
The third field in this map, the switch slot, is unused in this event
and is set to zero. | conn-number | identifies the node, zone, or layer number. |
Cause SANMAN has detected the occurrence of a specific port error
in one of the router ASICs in a switch. Effect Depends on the error. Recovery Check whether OSM is reporting an alarm for the port, and follow the
repair actions for the alarm. |
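The display rules for the optional port-info fields can be summarized in a short sketch. The Python below is illustrative only: the function name and the string spellings used for the port types are assumptions made for this example and are not part of the SMN event definition.

```python
# Hypothetical helper: which optional port-info fields accompany each
# router port type, following the field descriptions for this event.
EXTERNAL = "External Port"
INTERNAL = "Internal Port"
PACKETIZER = "Packetizer Port"
EXTERNAL_LOOP = "External Loop Port"


def optional_fields(port_type):
    """Return the optional port-info fields displayed for port_type."""
    if port_type == EXTERNAL_LOOP:
        # Slot of the router interconnect PIC associated with the port.
        return ["Associated Slot Number"]
    if port_type == EXTERNAL:
        # PIC slot connected to the port, plus the connected CRU port.
        return ["Connected Slot Number", "Cru Port Number"]
    if port_type in (INTERNAL, PACKETIZER):
        # Logical board slot number.
        return ["Connected Slot Number"]
    raise ValueError(f"unknown router port type: {port_type!r}")
```

For example, optional_fields("External Loop Port") yields only the Associated Slot Number field, which matches the alternative shown in the message text.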
4220 The SMC Driver API interface call error returned error error, error detail error. [ ServerNet switch fabric zone layer. ] |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | error | is a map token containing data relative to the call
and its error returns. The data includes: SMC API Error Version - the version of the data structure.
This structure is introduced with 6780 switches and its version is
set to one. The structure version is incremented whenever the structure
changes. SMC API - an enumeration of the specific interface
call which returned the error. SMC API Error - an enumeration of the possible error
returns from the SMC Driver. SMC API Error Detail - an integer code of the lower
level error responses received by the SMC Driver, and which resulted
in the driver returning an error to SANMAN. These error codes are
usually TNet Services codes.
|
Cause SANMAN invoked an SMC Driver API interface function. The function
returned a code other than SMC_RTN_OK. Effect SANMAN will automatically retry the SMC Driver API call. If
subsequent retries fail, and if this attempt to invoke the SMC Driver
is the result of a request from a SANMAN client, the client request
will ultimately receive a failure response. Recovery This event is primarily provided for support personnel. SANMAN
will automatically retry the SMC Driver API call. If the error persists
(as indicated by additional SMN 4220 events), use the error and error
detail codes in the event to determine and correct the cause of the
failure. Then retry the high level command that failed. |
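The retry behavior described for SMN 4220 can be pictured with a minimal sketch. Everything here is an assumption for illustration: the numeric value of SMC_RTN_OK, the retry limit, and the function names are hypothetical, and the manual does not state how many times SANMAN retries.

```python
SMC_RTN_OK = 0  # assumed numeric value; the manual names only the symbol


def call_with_retry(api_call, max_attempts=3, log_event=print):
    """Invoke api_call until it returns SMC_RTN_OK or attempts run out.

    api_call returns a (code, detail) pair. Returns True on success and
    False if every attempt failed (the client request then fails too).
    """
    for attempt in range(1, max_attempts + 1):
        code, detail = api_call()
        if code == SMC_RTN_OK:
            return True
        # Each failed attempt is reported, in the spirit of SMN 4220.
        log_event(f"SMN 4220: error {code}, error detail {detail} "
                  f"(attempt {attempt} of {max_attempts})")
    return False
```

A run of repeated SMN 4220 events for the same call corresponds to the False return path, which is the point at which the error and error detail codes are worth examining.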
4221 [ A CRU internal status change has been detected
in ServerNet switch fabric zone layer,
group address, module address. Status change(s) detected in slot address: | SANMAN has gathered
initial CRU internal status from Problem(s) detected in slot address
] |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | address | is a map token containing, in this case, the switch
group and switch module where the specified switch is physically located.
The third field in this map, the switch slot, is unused in this event
and is set to zero. |
Cause SANMAN has detected a change in the status of a CRU in a switch. Effect The status of the indicated component of the 6780 switch has
changed. Recovery If the new status, as displayed in the event, indicates that
the component has failed, see the OSM alarm and follow the suggested
repair action. Otherwise, this is an informational message and no recovery
action is required. |
4222 ServerNet switch fabric zone layer, group address, module address. update-type update state changed to:
[ status ] |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | address | is a map token containing, in this case, the switch
group and switch module where the specified switch is physically located.
The third field in this map, the switch slot, is unused in this event
and is set to zero. | update-type | is an enumeration that identifies the type of update. Possible
values are invalid, firmware, and configuration. | status | is the status of the update. Possible values are Erase
Image A, Write Image A, Erase Image B, Write Image B. |
Cause SANMAN has received an indication that an update has reached
a new phase. Effect The status of the indicated component of the switch has changed. Recovery No recovery needed. |
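The token values listed for SMN 4222 lend themselves to a small validation sketch. The class and function names below are hypothetical, and the string values simply mirror the names given in the descriptions above. The actual token encodings are not documented here.

```python
from enum import Enum


class UpdateType(Enum):
    INVALID = "invalid"
    FIRMWARE = "firmware"
    CONFIGURATION = "configuration"


class UpdateStatus(Enum):
    ERASE_IMAGE_A = "Erase Image A"
    WRITE_IMAGE_A = "Write Image A"
    ERASE_IMAGE_B = "Erase Image B"
    WRITE_IMAGE_B = "Write Image B"


def describe_update(fabric, zone, layer, update_type, status):
    """Build a one-line summary after checking the documented ranges."""
    if fabric not in ("X", "Y"):
        raise ValueError("fabric must be X or Y")
    if not 1 <= zone <= 3:
        raise ValueError("zone must be between 1 and 3")
    if not 1 <= layer <= 4:
        raise ValueError("layer must be between 1 and 4")
    # Enum lookup by value rejects any name not listed for this event.
    return (f"switch {fabric} zone {zone} layer {layer}: "
            f"{UpdateType(update_type).value} update state changed to "
            f"{UpdateStatus(status).value}")
```

Looking up the enumerations by value means that an unexpected update type or status raises ValueError rather than passing through silently.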
4223 Event logging for missing status errors and/or
packet CRC errors and/or TLB/OLB command exception errors on router
instance port-info, router port port-info in ServerNet switch fabric.zone.layer,
group group, module module is suppressed due to excessive errors. Router
port type: port-info [ Port number: port-info ] [ Connected slot number:
port-info Port connects to: conntype | Associated
slot number: port-info ] |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
port-info | contains data specific to the router 2 where the port
error is being reported. It contains: Router Port Info Version - the version of the data
structure. Router Type - the router ASIC type. Router Instance - the ASIC instance number of the
router reporting the error. Router Port Number - the port number, within the router
instance, where the error occurred. Router Port Type - the port type. This is one of External
Port, Internal Port, Packetizer Port, or External Loop Port. Associated Slot Number - this is displayed if the
Router Port Type is External Loop Port. It corresponds to the router
interconnect PIC slot number associated with the router port reported
in the event. Connected Slot Number - this is displayed if the Router
Port Type is External Port, Internal Port, or Packetizer Port. It
corresponds to the logical board slot number if the port type
is either Internal Port or Packetizer Port, or it corresponds to the
PIC slot number connected to the router port if the port type is External
Port. Cru Port Number - this is displayed if the Router
Port Type is External Port. It corresponds to the CRU port number
connected to the router port.
| fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1, and
the maximum value is 4. | group, module | are the locations of the group and module, respectively. | conntype | is connectivity information for an external port, such
as the type of neighbor the port is connected to and numeric information
that identifies the neighbor. |
Cause SANMAN has detected excessive missing status errors, packet CRC errors, or TLB/OLB command exception errors on a router port and has suppressed
event logging for that port. Effect Not all router port error counters generate an
event SMN 4219 during the suppression interval. Recovery No recovery needed. An event SMN 4219 with the summary of the
router port error counters that have changed values during the suppression
interval is generated after the interval is over. |
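The relationship between SMN 4219, SMN 4223, and the summary event can be modeled with a small sketch. This is a hypothetical model and not SANMAN's implementation: the class name, the threshold that triggers suppression, and the way the interval ends are all assumptions, because the manual does not state how "excessive" is determined.

```python
class PortErrorReporter:
    """Hypothetical model of per-port event suppression."""

    def __init__(self, threshold=3):
        self.threshold = threshold   # assumed trigger; not from the manual
        self.suppressed = False
        self.reported = 0            # individual events in this interval
        self.pending = {}            # counter type -> accumulated change
        self.events = []             # emitted event texts, for inspection

    def record(self, counter_type, delta):
        if self.suppressed:
            # During suppression, accumulate changes instead of reporting.
            self.pending[counter_type] = self.pending.get(counter_type, 0) + delta
            return
        self.events.append(f"SMN 4219: {counter_type} +{delta}")
        self.reported += 1
        if self.reported >= self.threshold:
            self.suppressed = True
            self.events.append("SMN 4223: event logging suppressed")

    def end_interval(self):
        """Close the interval; emit one summary event if changes are pending."""
        if self.pending:
            summary = ", ".join(f"{k} +{v}" for k, v in sorted(self.pending.items()))
            self.events.append(f"SMN 4219 (summary): {summary}")
        self.pending = {}
        self.suppressed = False
        self.reported = 0
```

In this model, counter changes that arrive during suppression are not lost. They are folded into a single summary event when the interval ends, which is why no recovery action is needed.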
4224 $ZZSMN encountered a $ZCNF access error error, error detail err-detail Operation: op | error | is the error code returned from a system configuration
database access routine. | err-detail | is the error detail code returned from a system configuration
database access routine. | op | is the operation that SANMAN was attempting when the
error occurred. Possible values are: |
Cause The External ServerNet SAN manager process (SANMAN) encountered
an error while using the HP NonStop operating system configuration
services application programming interface (API) for access to the
External ServerNet SAN subsystem configuration record. Effect If this error occurs during process startup, SANMAN infers
zone-to-zone distances in a long-distance ServerNet cluster topology
from the configuration tags loaded on switches, as opposed to using
the zone-to-zone distance attributes stored in its private configuration
record. If this error occurs later, the failing access was prompted by an SCF
[ALTER | START | STOP] SUBSYS $ZZSMN command. In this case, the command
fails with an error. Recovery Restart the process or reissue the failed SCF command. If
the error persists, contact your service provider. |
4225 Firmware has encountered an error in reading
the internal temperature of the ServerNet switch fabric
zone layer, group address, module address on the external ServerNet
fabric fabric. |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1; the
maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1; the
maximum value is 4. | address | is a map token containing, in this case, the switch
group and switch module where the specified switch is physically located.
The third field in this map, the switch slot, is unused in this event. |
Cause Firmware reads the internal temperature of the logic board in
the ServerNet switch. If the read fails, SANMAN detects
it and displays this event. Effect The internal temperature of the logic board in the ServerNet
switch is not displayed correctly. Recovery This is an informational message, and no recovery action is
required. |
4226 The Internal temperature read error in the ServerNet
switch fabric zone layer, group address, module address on
the external fabric fabric has been resolved. |  |  |  |  |  | NOTE: This event applies only to the 6780 switch. |  |  |  |  |
fabric | identifies the external ServerNet fabric (X or Y)
of the 6780 switch, as follows: | zone | is the cluster switch zone number of the switch in
which a status change has been detected. The minimum value is 1; the
maximum value is 3. | layer | is the cluster switch layer number of the switch in
which a status change has been detected. The minimum value is 1; the
maximum value is 4. | address | is a map token containing, in this case, the switch
group and switch module where the specified switch is physically located.
The third field in this map, the switch slot, is unused in this event. |
Cause Firmware can now read the internal temperature of the logic
board in the ServerNet switch correctly. SANMAN detects this and displays
this event. Effect The internal temperature of the logic board is now displayed
correctly. Recovery This is an informational message, and no recovery action is
required. |
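Because SMN 4226 reports that a condition raised by SMN 4225 has been resolved, an event log can be scanned to find switches whose temperature read error is still outstanding. The sketch below assumes a hypothetical log representation, a sequence of (event number, switch identifier) pairs. Only the two event numbers come from this chapter.

```python
def outstanding_temperature_errors(events):
    """Return switches with an SMN 4225 not yet cleared by an SMN 4226.

    events is an iterable of (event_number, switch_id) pairs in the order
    they were logged; switch_id is any hashable identifier.
    """
    outstanding = set()
    for number, switch_id in events:
        if number == 4225:
            outstanding.add(switch_id)
        elif number == 4226:
            outstanding.discard(switch_id)
    return outstanding
```

A tuple such as (fabric, zone, layer) is one possible switch identifier. Any other event numbers in the log are ignored.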
|