Operator Messages Manual
Chapter 84 SCL (ServerNet Cluster Subsystem) Messages
The messages in this chapter are generated by the HP NonStop™ ServerNet Cluster subsystem monitor process. The subsystem ID displayed by these messages includes SCL as the subsystem name.

NOTE: Negative-numbered messages are common to most subsystems. If you receive a negative-numbered message that is not described in this chapter, see Chapter 15.
1001

The ServerNet Cluster subsystem monitor process, process-name, has started in processor cpunum. Program file: filename Priority: pri Autorestart count: count Processor list: (first-processor-in-list [,next-processor-in-list, ..., last-processor-in-list])

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

cpunum
  is the number of the processor in which the primary ServerNet cluster monitor process has started.

filename
  is the name of the program file for the ServerNet cluster monitor process.

pri
  is the priority at which the ServerNet cluster monitor process is running.

count
  is the autorestart count configured for the ServerNet cluster monitor process.

first-processor-in-list ... last-processor-in-list
  is the processor list configured for the ServerNet cluster monitor process.

Cause
  The ServerNet cluster monitor process has been started by an operator or by the persistence manager process ($ZPM) after a failure of both the primary and backup ServerNet cluster monitor processes. (The ServerNet cluster monitor process has no means of distinguishing between the two cases.)

Effect
  The ServerNet cluster monitor process is running.

Recovery
  This is an informational message. No corrective action is required.
1002

The ServerNet Cluster subsystem monitor process, process-name, has terminated. Reason: reason.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

reason
  is the reason for the termination. Possible values are:

Cause
  The ServerNet cluster monitor process terminated voluntarily, either by an operator command or because an environmental problem caused it to self-terminate. If this event is due to self-termination, an SCL 1010 message reported the environmental problem.

Effect
  The ServerNet cluster monitor process is no longer running.

Recovery
  If this event is due to self-termination, follow the recovery instructions for the SCL 1010 message. After correcting any environmental problems, restart the ServerNet cluster monitor process with an operator command.
1003

Process process-name: Primary processor cpunum

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

cpunum
  is the number of the processor in which the primary ServerNet cluster monitor process is running.

Cause
  Either the ServerNet cluster monitor process was initialized for the first time or a backup process has become the primary process.

Effect
  The ServerNet cluster monitor process is running in the indicated processor.

Recovery
  This is an informational message. No corrective action is required.
1004

Process process-name: backup process created in processor cpunum.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

cpunum
  is the number of the processor in which the backup ServerNet cluster monitor process is running.

Cause
  The ServerNet cluster monitor process successfully created a backup process.

Effect
  The ServerNet cluster monitor process is no longer vulnerable to a single failure.

Recovery
  This is an informational message. No corrective action is required.
1005

Process process-name: Unable to create backup in processor cpunum. Process creation error: errnum, error detail: err-detail.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

cpunum
  is the number of the processor in which the backup process creation attempt was made.

errnum
  is the Guardian process creation error number.

err-detail
  is the error detail subcode returned with the Guardian process creation error.

Cause
  An attempt to create a ServerNet cluster monitor backup process has failed. For information on process creation errors and error detail subcodes, see the Guardian Procedure Errors and Messages Manual.

Effect
  Until a backup process is started, the ServerNet cluster monitor process is vulnerable to a single failure. The ServerNet cluster monitor process attempts to start a backup process immediately if any processor in its processor list, other than that used by the primary process, is running. The ServerNet cluster monitor process makes two restart attempts in each processor eligible to contain the backup process. Each failed attempt results in an SCL 1005 message. If all attempts fail, an SCL 1007 message is generated.

Recovery
  This is an informational message. Although no corrective action is required, the data in this message might provide information for recovery in the event of an SCL 1007 message.
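The backup-creation retry policy described in the Effect paragraph (two attempts in each processor from the configured processor list, never the processor used by the primary) can be sketched as follows. This is an illustrative model only; the function name and the attempts_per_cpu parameter are assumptions, not part of the subsystem's actual implementation.

```python
# Illustrative model of the backup-creation retry policy described above:
# two attempts in each processor from the configured processor list,
# skipping the processor used by the primary. Names are assumptions.

def backup_creation_attempts(processor_list, primary_cpu, attempts_per_cpu=2):
    """Yield (cpu, attempt) pairs in the order the monitor would try them.

    Each failed attempt would correspond to one SCL 1005 message; if every
    attempt fails, an SCL 1007 message is generated.
    """
    for cpu in processor_list:
        if cpu == primary_cpu:
            continue  # the backup never runs in the primary's processor
        for attempt in range(1, attempts_per_cpu + 1):
            yield cpu, attempt
```

For a processor list (0, 1, 2) with the primary in processor 0, this yields (1, 1), (1, 2), (2, 1), (2, 2): two tries in each eligible processor before giving up.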
1006

Process process-name: Backup process in processor cpunum failed.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

cpunum
  is the number of the processor in which the backup ServerNet cluster monitor process was running.

Cause
  The backup process of the ServerNet cluster monitor process pair failed.

Effect
  Until a new backup process is started, the ServerNet cluster monitor process is vulnerable to a single failure. The ServerNet cluster monitor process attempts to start a new backup process immediately if any processor in its processor list, other than that used by the primary process, is running. The ServerNet cluster monitor process makes two restart attempts in each processor eligible to contain the backup process. Each failed attempt results in an SCL 1005 event. If all of these attempts fail, an SCL 1007 event is generated.

Recovery
  This is an informational message. Although no corrective action is required, the data in this message might provide information for recovery in the event of an SCL 1007 event.
1007

The ServerNet Cluster subsystem monitor process, process-name, is running without a backup. Reason: reason.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

reason
  indicates why the backup process was terminated. Possible values are:

Cause
  Either no processor is available for running the backup process, or there have been multiple failures of the backup process or of attempts to create a backup process. If this event is due to backup process creation failures, associated SCL 1005 messages will have been generated. Other possible precursors are SCL 1006 and SCL 1008 messages.

Effect
  The ServerNet cluster monitor process is running without a backup, and the ServerNet cluster subsystem is vulnerable to a single failure. Whenever a processor in the processor list is reloaded, the ServerNet cluster monitor process attempts to create a backup there.

  If this event is caused by repeated backup failures or backup process creation failures, the ServerNet cluster monitor process periodically attempts to create a backup.

Recovery
  Either reload the processors on the processor list for the ServerNet cluster monitor process, or use the information in the associated SCL 1005 messages to determine the cause and recovery actions for backup process creation failures.

  To list the processors that are configured in the processor list for the ServerNet cluster monitor process, issue an SCF INFO command. For example:

  -> INFO PROCESS $ZZKRN.#ZZSCL, DETAIL
1008

The ServerNet Cluster subsystem monitor process, process-name, backup process is terminating. Reason: reason.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

reason
  indicates the reason the backup process is terminating. Possible values are:

Cause
  The backup processor failed, or the backup process was terminated by the primary process (for example, if the primary process found a fatal error when checkpointing to the backup).

Effect
  Until a new backup process has been started, the ServerNet cluster monitor process is vulnerable to a single failure. The ServerNet cluster monitor process attempts to start a new backup process immediately if any processor in its processor list, other than that used by the primary process, is running. The ServerNet cluster monitor process makes two restart attempts in each processor eligible to contain the backup process. Each failed attempt results in an SCL 1005 message. If all of these attempts fail, an SCL 1007 message is generated.

Recovery
  This is an informational message. Although no corrective action is required, the data in this message might provide information for recovery in the event of an SCL 1007 message.
1009

ServerNet Cluster subsystem/Message System Monitor trace entry trace-entry.

trace-entry
  contains an image of an internal ServerNet monitor process trace entry as it is recorded in memory.

Cause
  An internal trace was initiated on the ServerNet monitor process.

Effect
  Trace data is dumped into the EMS log.

Recovery
  This is an informational message. No corrective action is required.
1010

ServerNet Cluster subsystem monitor process, process-name, reports info.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

info
  describes the environmental problem. Possible values are:

Cause
  The ServerNet cluster monitor process found one or more environmental problems. The possible environmental problems are described in more detail under "Recovery." This message is usually followed by an SCL 1002 termination message.

Effect
  This is an informational message, but for certain environmental problems, the ServerNet cluster monitor process terminates.

Recovery
  Corrective action might be required to fix one or more environmental problems:

  - If info is "Wrong process name" or "Bad CPU-List," alter the ServerNet cluster monitor process startup parameters (generic process configuration under SCF).

  - If info is "Wrong processor," correct the ServerNet cluster monitor process startup parameters, or configure and start the ServerNet cluster monitor process through SCF. This error occurs only if the ServerNet cluster monitor process was started manually from an HP Tandem Advanced Command Language (TACL) prompt on a processor that is not in its processor list.

  - If info is "Internal error," "Nested signal," or "Power on processing error," the ServerNet cluster monitor process terminates and is restarted automatically. Submit the ZZSA* savefile to your service provider for analysis of this problem.

  - If info is "Unsupported system topology," the system hardware has been incorrectly configured, and the X and Y fabrics have different topologies. For example, the X fabric of a NonStop BladeSystem might be configured for connectivity to a BladeCluster Solution topology, but the Y fabric of the same system is not configured similarly (or vice versa). Correct the configuration so that both fabrics are configured with the same topology, and then restart SNETMON.

  - If info is "Bad SvNet node number," the configured ServerNet node number is out of range with respect to the allowed node number range for the current configuration. An older version of ME firmware might be running on one or both fabrics. This environmental problem might also be reported if the local node number has changed but SNETMON was not able to gracefully stop the subsystem to update the node number.

  - If info is "SvNet node number mismatch," the X and Y fabrics have been configured differently through the OSM Low-Level Link. The ServerNet Cluster subsystem is shut down, an event is generated, and SNETMON is terminated. For recovery, configure the system correctly, then restart SNETMON.

  - If info is "Missing ServerNet Cluster License," the $ZZSCL process has found that it is running on a node that has been configured, using the OSM Low-Level Link, for connectivity to a BladeCluster topology, but the node does not have a BladeCluster license file on either $SYSTEM.SYSnn or $SYSTEM.SYSTEM. The $ZZSCL process pair is terminated. Contact your service provider for the correct license file.
1011

$ZCNF Access error err, err-detail Operation: op.

err
  is the error code returned by the NonStop Kernel configuration services application program interface (API).

err-detail
  is the error detail code returned with the error.

op
  is the operation that was being performed when the error occurred. Possible values are:

Cause
  The ServerNet cluster monitor process encountered an error using the HP NonStop Kernel configuration services application program interface (API).

Effect
  If this error occurs during process startup, the ServerNet cluster monitor process uses STARTSTATE STOPPED (the default) instead of using the data stored in the private configuration record. If this error occurs later, the action is prompted by an SCF [ALTER | START | STOP] SUBSYS command. In this case, the command fails with an error.

Recovery
  Investigate the cause of the error. Restart the process or reissue the failed SCF command.
1012

DSM Trace error err, err-detail Operation: op.

err
  is the error code returned by the DSM trace routine.

err-detail
  is the error detail returned by the DSM trace routine.

op
  is the operation that was being performed when the error was encountered. Possible values are:

Cause
  The ServerNet cluster monitor process encountered an error using the DSM trace routines.

Effect
  Any pending trace is terminated.

Recovery
  Investigate the cause of the error. Reissue the SCF TRACE command. If the problem persists, contact your service provider.
1013

ServerNet Cluster subsystem monitor process, process-name, cannot register for SANMAN notifications. Reason: reason.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

reason
  is the reason that the registration attempt failed. Currently, the only reason is "Fabric Access Control struct Full" (-2).

Cause
  The ServerNet cluster monitor process attempted to register for ServerNet SAN manager (SANMAN) notifications relative to external fabric connection status changes. However, no room remained in the Fabric Access Control (FAC) structure in system global memory, so the process could not be registered.

  Probably all registration slots are being consumed by previously registered clients. Currently, the maximum number of clients allowed to register simultaneously is 32.

Effect
  The ServerNet Cluster subsystem is placed in the STOPPED state. Direct ServerNet communication with processors on other servers, even when physically connected, is not possible.

Recovery
  Stop at least one of the processes currently registered. Then use the SCF START SUBSYS command to restart the ServerNet Cluster subsystem.
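The registration limit described above can be modeled minimally as follows. Only the 32-slot limit and the -2 ("Fabric Access Control struct Full") failure code come from the message description; the names FAC_SLOTS and register_client, and the use of a plain Python list for the FAC structure, are illustrative assumptions.

```python
# Minimal model of SANMAN notification registration against a fixed-size
# Fabric Access Control (FAC) structure. Only the 32-slot limit and the
# -2 failure code come from the message text; everything else is assumed.

FAC_SLOTS = 32
FAC_FULL = -2  # "Fabric Access Control struct Full"

def register_client(registered, client):
    """Register client for fabric notifications; return 0 or FAC_FULL."""
    if len(registered) >= FAC_SLOTS:
        return FAC_FULL  # SNETMON would then log an SCL 1013 event
    registered.append(client)
    return 0
```

In this model, freeing any one slot (the "stop at least one registered process" recovery) lets the next registration succeed.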
1014

ServerNet Cluster subsystem, process-name, cannot start. Reason: reason.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL).

reason
  describes the ServerNet cluster monitor process's determination of why the external fabric connections for this ServerNet node are not ready. Possible values are:

Cause
  The ServerNet cluster monitor process has determined that the physical connection to the ServerNet cluster is not ready for ServerNet connections.

  Probably the ServerNet cluster monitor process was running before the ServerNet SAN manager process could communicate with the external fabrics or program the Node Numbering Agents (NNAs).

Effect
  The ServerNet Cluster subsystem remains in the STOPPED state. Direct ServerNet communication with processors on other servers is not possible.

Recovery
  The ServerNet SAN manager process automatically notifies the ServerNet cluster monitor process when the external fabric connections are ready for use. The ServerNet cluster monitor process then retries the Start command.
1015

InterProcessor Communication (IPC) Monitor Process, process-name, Reports cause.

process-name
  is the name of the message system monitor process ($ZIM<nn>).

cause
  describes the environmental problem.

Cause
  A message monitor process found an environmental problem, such as a wrong process name (not $ZIMnn). This event is usually followed by a message monitor termination event.

Effect
  This is an informational message, but for certain environmental problems, the message monitor process terminates.

Recovery
  Alter the message system monitor process's (MSGMON) startup parameters (generic configuration under SCF).
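As a rough illustration of the "wrong process name" check mentioned above, the following sketch tests whether a name has the $ZIMnn form. The assumption that nn is a two-digit processor number below the processor count is mine for illustration; the actual validation performed by MSGMON may differ.

```python
import re

# Hypothetical sketch of the process-name check described above. The
# two-digit $ZIMnn form and the processor-count bound are assumptions,
# not the documented MSGMON algorithm.
_MSGMON_NAME = re.compile(r"^\$ZIM(\d{2})$")

def looks_like_msgmon_name(name, num_cpus=16):
    """Return True if name has the $ZIMnn form for a valid processor."""
    m = _MSGMON_NAME.match(name)
    return m is not None and int(m.group(1)) < num_cpus
```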
1016

Process process-name is not compatible with the current version of the Kernel message system.

process-name
  is the name of the ServerNet cluster monitor process ($ZZSCL) or the message system monitor process ($ZIM<nn>).

Cause
  The ServerNet cluster monitor process or the message system monitor process compared its own version with that of the NonStop Kernel message system and determined that the versions are incompatible. This event will be followed by a process termination event.

Effect
  The ServerNet cluster monitor process or the message system monitor process terminates.

Recovery
  Ensure that the versions of the ServerNet cluster monitor process and message system monitor process (T0294) and the NonStop Kernel message system (T9050) are compatible. Check SPR requisites for T9050 and T0294.
1100

The ServerNet direct connection from processor local-cpu to processor remote-cpu in ServerNet node remote-node[,] [ system name sysname, Expand node remote-sysnum ] has become unusable.

local-cpu
  is the number of the local processor for which connectivity is lost.

remote-cpu
  is the number of the remote processor for which connectivity is lost.

remote-node
  is the ServerNet node number of the remote system.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

Cause
  Both the X and Y paths from the local processor to the indicated remote processor are down.

  This message is preceded by one or more IPC (NonStop Kernel Message System) 110 messages generated by the local processor. The IPC 110 messages report the causes of the failures.

  If the lost connection is due to the failure of the remote processor, an SCL 1102 message is generated.

NOTE: If the ServerNet cluster monitor process receives the remote processor down information in time, the SCL 1100 message is suppressed.

Effect
  All intersystem ServerNet traffic between the indicated local and remote processors is routed via the Expand-over-ServerNet line-handler process. Consequently, transmission is slower.

Recovery
  This is an informational message. For recovery information, see the accompanying IPC 110 or SCL 1102 messages.
1101

The ServerNet direct connection from processor local-cpu to processor remote-cpu in ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has been restored.

local-cpu
  is the number of the local processor for which connectivity is restored.

remote-cpu
  is the number of the remote processor for which connectivity is restored.

remote-node
  is the ServerNet node number of the remote system.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

Cause
  One or both of the paths between the indicated processors is restored. An IPC (NonStop Kernel Message System) 111 message with additional information is generated by the local processor.

NOTE: This event is not generated during ServerNet cluster monitor process initialization.

Effect
  Direct ServerNet communication between the processors is possible.

Recovery
  This is an informational message. No corrective action is required.
1102

Processor remote-cpu in ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has failed.

remote-cpu
  is the number of the remote processor that failed.

remote-node
  is the ServerNet node number of the remote system.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

Cause
  A remote processor failed. This event is logged on every other ServerNet cluster member when a processor on a node fails.

Effect
  ServerNet paths between all local processors and the remote processor are taken down. The local processors suppress IPC 110 messages in this case. Possibly local processors detected path failures and logged path-down events before being informed of the remote processor's failure. The particular sequence of events depends on the speed with which the ServerNet cluster monitor process is informed of the remote processor's failure and on the levels of message traffic from the local system to the failed remote processor.

Recovery
  Reload the remote processor.
1103

Processor remote-cpu in ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has been reloaded. ServerNet direct connectivity to that processor has been restored.

remote-cpu
  is the number of the remote processor that is reloaded.

remote-node
  is the ServerNet node number of the remote system.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

Cause
  A remote processor is reloaded, and its connections with this system are restored. This event is logged on every other ServerNet cluster member when a processor on a node is reloaded.

Effect
  Direct ServerNet message traffic with the indicated processor can resume.

Recovery
  This is an informational message. No corrective action is required.
1104

Local processor local-cpu was reloaded. ServerNet direct connectivity to remote systems has been restored.

local-cpu
  is the number of the local processor that is reloaded.

Cause
  A local processor is reloaded.

Effect
  Local connections of the processor are restored.

Recovery
  This is an informational message. No corrective action is required.
1105

Processor remote-cpu in ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has lost connectivity to the ServerNet fabric.

remote-cpu
  is the number of the remote processor that lost its ServerNet connectivity.

remote-node
  is the ServerNet node number of the remote system.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

fabric
  identifies the ServerNet fabric (X or Y) to which the indicated processor lost connectivity:

Cause
  A remote processor detected that its connection to the indicated ServerNet fabric failed. Possibly the other PMF CRU in the processor enclosure was removed. The indicated processor logs an IPC (NonStop Kernel Message System) 112 message on its local system.

Effect
  ServerNet paths on the indicated fabric between all local processors and the remote processor are taken down. Direct ServerNet message traffic is still possible by using the other fabric. When the local processors down the paths to the remote processor on the indicated fabric, IPC 110 message logging is suppressed.

  Possibly local processors detected path failures and logged path-down events before being informed of the remote processor's fabric failure. The particular sequence of events depends on the speed with which the ServerNet cluster monitor process is informed of the remote processor's fabric failure and on the levels of message traffic from the local system to the remote processor.

Recovery
  This is an informational message. For recovery information, see the IPC 112 message.
1106

Processor remote-cpu in ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has regained fabric ServerNet connectivity.

remote-cpu
  is the number of the remote processor whose fabric connection was restored.

remote-node
  is the ServerNet node number of the remote system.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

fabric
  identifies the ServerNet fabric (X or Y) to which the indicated processor regained connectivity:

Cause
  A remote processor regained connectivity to the indicated ServerNet fabric.

Effect
  Paths between the local system and the indicated remote processor on the indicated fabric wait for automatic recovery to kick in. The system on which the processor resides logs an IPC (NonStop Kernel Message System) 113 message.

Recovery
  This is an informational message. No corrective action is required.
1107

ServerNet connectivity over the fabric to ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has been lost.

fabric
  identifies the ServerNet fabric (X or Y) over which connectivity to the remote system is lost:

remote-node
  is the ServerNet node number of the remote system.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

Cause
  All individual processor-to-processor paths over the indicated fabric from the local system to the remote system failed. These failures are documented by IPC (NonStop Kernel Message System) 110 messages generated by the individual processors. Possibly a ServerNet router or cable failed. If both fabrics are indicated through the IPC 110 messages, the remote system itself might have failed or lost power.

Effect
  All communication with the remote system over the given fabric ceases.

Recovery
  If only one fabric is involved, investigate the condition of the intervening ServerNet routers and cables. If both fabrics are involved, investigate the condition of the remote system itself. In either case, the associated IPC 110 messages might provide further recovery information.
1108

ServerNet direct connectivity with ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has been lost due to reason.

remote-node
  is the ServerNet node number of the remote system to which ServerNet direct connectivity has been lost.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

reason
  indicates why connectivity to the remote system has been lost. Possible values are:

Cause
  ServerNet direct connectivity to a remote system is lost for the reason stated in the message. The remote system failed, lost power, or has a duplicate system number. The local system might have lost power. Router or cable failures might have occurred on both ServerNet fabrics.

Effect
  All connections with the remote system are shut down.

Recovery
  Depending upon the reason indicated, correct the problem. For example, bring up the failed processors, or bring up or replace the failed cable or router.

  If a protocol error occurred, there might be an associated HP Tandem Failure Data System (TFDS) (TFDS subsystem ID: DMP) failure capture event for the primary ServerNet cluster monitor process.

  If the reason was Duplicate System Number, change the Expand node number of the newly connected node to a number that is unique within the cluster (range 0 through 254). For more information about changing the system number, see the SCF Reference Manual for the Kernel Subsystem and contact your service provider.
1109

ServerNet direct connectivity with ServerNet node remote-node [,] [ system name sysname, Expand node remote-sysnum ] has been initialized [WITH WARNINGS].

remote-node
  is the ServerNet node number of the remote system with which a ServerNet direct connection is initialized.

sysname
  is the name of the remote system.

remote-sysnum
  is the Expand node number of the remote system.

Cause
  A ServerNet connection is established with a remote system. This connection might be the initial one, caused by starting ServerNet cluster services on either of the systems, or it might be the recovery of a failed connection.

Effect
  Direct ServerNet message-system traffic between the two systems is possible.

Recovery
  This is an informational message. No corrective action is required.
1110

ServerNet Cluster subsystem configuration error on fabric fabric while performing the test. [Fabric fabric is not usable for ServerNet connectivity with external nodes.]

fabric
  identifies a ServerNet fabric (X or Y) that has a configuration error.

test
  is the test that the service processor (SP) was running when it found the configuration error. Possible values are:

Cause
  The service processor found a configuration error when checking each ServerNet fabric prior to starting ServerNet cluster services.

  The tokens in the event contain details of the configuration error, including:

  - The number of the enclosure, the module, and the slot containing the CRU that was found to be not configured for inclusion in a ServerNet cluster

  - The error code returned by the service processor for the CRU that was found to be not configured for inclusion in a ServerNet cluster

Effect
  If either the X or Y fabric is correctly configured, ServerNet cluster services move to the STARTING state, and connections with remote systems are established on the correctly configured fabric.

  If both fabrics are incorrectly configured, ServerNet cluster services remain in the STOPPED state, and no connections with remote systems are established.

Recovery
  Determine the details of the configuration error and correct it.
1111

No systems were discovered for ServerNet direct connectivity.

Cause
  This system is the first in the ServerNet cluster to be started, or a ServerNet connectivity failure occurred.

Effect
  The system attains the STARTED state, but there is no ServerNet connectivity with other systems.

Recovery
  If this system is the first in the ServerNet cluster to be started, no corrective action is required. Otherwise, repair the ServerNet connectivity problem.
1112

Registration to listen to remote ServerNet Cluster discovery packets failed.

Cause
  The MS Driver in the reporting processor could not register to listen to permissive packets.

Effect
  Any discovery packets sent to the reporting processor are not delivered to the MS Driver. The MS Driver in the reporting processor cannot find out about any packets received in the permissive Access Validation and Translation Table Entry (AVTTE). If there are no registered processors in the target system, the target system can neither be discovered nor generate any SCL 1114 events.

Recovery
  Stopping one or more subsystems that register for permissive packets should allow the MS Driver to register itself. MS Driver registration occurs when a processor is loaded or reloaded, when the message monitor process is initialized, and when the ServerNet cluster monitor process is initialized. The recommended long-term solution is to request a TNet Services (T8460) SPR capable of supporting a larger number of registered listeners.
1113 Discovery of remote ServerNet node node failed due to reason. Details of failed discovery attempt: Protocol Stage: stage [ Selected target processors: disc-targs | Sender protocol version: send-prot | Target protocol version: targ-prot | Sender minimum protocol version: send-min-prot | Target minimum protocol version: targ-min-prot | Target processor number: targ-cpu | Node instantiation error: inst-err
] | node | is the target node for which discovery failed in the
sender system. | reason | indicates the reason discovery failed. Possible values
are: | stage | is the protocol stage at the time of discovery failure. | disc-targs | is a 16-bit binary mask representing the processors
selected by the sender ServerNet cluster monitor process as discovery
targets. | send-prot | is the discovery protocol version of the sender ServerNet
cluster monitor process. | targ-prot | is the discovery protocol version of the target ServerNet
cluster monitor process. | send-min-prot | is the minimum discovery protocol interpretation version
of the sender ServerNet cluster monitor process. | targ-min-prot | is the minimum discovery protocol interpretation version
of the target ServerNet cluster monitor process. | targ-cpu | is the processor number returned by the target ServerNet
cluster monitor process in the discovery response packet payload. | inst-err | is the type of error encountered by the processor
of the sender ServerNet cluster monitor process when instantiating
the target node. Possible error types are: 1 Memory
allocation 2 AVTTE allocation 3 Device installation |
Cause A discovery failure was detected by the sender system. The ServerNet
cluster monitor process could not discover the target system. Effect A ServerNet cluster connection to the target system is not established.
The systems perform periodic retries to discover each other. Once
the cause of failure is corrected, discovery should proceed automatically. Recovery The recovery procedure depends on the value of reason: |
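The disc-targs field above is a 16-bit binary mask naming the processors the sender selected as discovery targets. A minimal sketch of decoding such a mask, assuming bit 15 maps to processor 0 (the manual does not state the bit order, so that mapping is an assumption):

```python
def decode_disc_targs(mask: int) -> list[int]:
    """Return the processor numbers whose bits are set in a 16-bit mask.

    Assumes bit 15 corresponds to processor 0; this bit order is a
    guess, not something the event documentation specifies.
    """
    return [cpu for cpu in range(16) if mask & (1 << (15 - cpu))]
```

Under this assumption, a mask of 0xC000 would indicate that processors 0 and 1 were selected as discovery targets.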
1114 Discovery started by remote ServerNet node node failed due to reason. Details of failed discovery attempt: Protocol Stage: stage | [ Sender protocol version: send-prot | Target protocol version: targ-prot | Sender minimum protocol version: send-min-prot | Target minimum protocol version: targ-min-prot | Sender processor number: send-cpu | Sender processor type: send-cpu-type | ServerNet reported sender node: reported-node | ServerNet reported sender processor: reported-cpu | Expected sender ServerNet ID: exp-snetID | Reported sender ServerNet ID: ret-snetID ] | node | is the remote node that started the failed discovery attempt. | reason | indicates the reason discovery failed. Possible values
are: | stage | is the protocol stage at the time of discovery failure. | send-prot | is the discovery protocol version of the sender ServerNet
cluster monitor process. | targ-prot | is the discovery protocol version of the target ServerNet
cluster monitor process. | send-min-prot | is the minimum discovery protocol interpretation version
of the sender ServerNet cluster monitor process. | targ-min-prot | is the minimum discovery protocol interpretation version
of the target ServerNet cluster monitor process. | send-cpu | is the processor number returned by the target ServerNet
cluster monitor process in the discovery response packet payload. | reported-node | is the node number reported by ServerNet. | reported-cpu | is the processor number reported by ServerNet. | exp-snetID | is the ServerNet node identification (ServerNet ID)
stored by the sender ServerNet cluster monitor process for the target
(node, processor) during process initialization. This ServerNet ID
is calculated by the service processor in the sender system. | ret-snetID | is the actual ServerNet ID of the target processor
(determined by the sender on reception of a discovery response packet). |
Cause A discovery failure was detected by the target system. The ServerNet
cluster monitor process could not discover the sender system. Effect A ServerNet cluster connection to the target system is not established.
The systems perform periodic retries to discover each other. Once
the cause of failure is corrected, discovery should proceed automatically. Recovery The recovery procedure depends on the value of reason: |
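The exp-snetID and ret-snetID fields above imply a consistency check: the sender compares the node number, processor number, and ServerNet ID carried in the response against the values it stored at initialization. A hedged sketch of that check (function and field names are hypothetical, not taken from the SNETMON source):

```python
def validate_discovery_response(exp_node: int, exp_cpu: int, exp_snet_id: int,
                                reported_node: int, reported_cpu: int,
                                ret_snet_id: int):
    """Return None when the response is consistent with expectations,
    else a short reason string resembling the event's reason field."""
    if reported_node != exp_node:
        return "node mismatch"
    if reported_cpu != exp_cpu:
        return "processor mismatch"
    if ret_snet_id != exp_snet_id:
        return "ServerNet ID mismatch"
    return None
```

Any non-None result here corresponds to a failed discovery attempt that would be surfaced as an SCL 1113 or 1114 event rather than an established connection.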
1115 Error PQE received From ServerNet node: remote-node, processor: remote-cpu [ SNError: snerror ] X Path Error: xerror Y Path Error: yerror SIB retry count: retry-count | remote-node | is the ServerNet node number of the remote system
to which ServerNet cluster monitor process (SNETMON) connectivity
is lost. | remote-cpu | is the number of the processor on the remote system
on which SNETMON resides. | snerror | indicates the reason ServerNet cluster connectivity
failed. Possible values are: | xerror | identifies the reason for the transfer failure on
the X fabric. This internal error code is used for troubleshooting. | yerror | identifies the reason for the transfer failure on
the Y fabric. This internal error code is used for troubleshooting. | retry-count | is the number of times a packet was sent before the
transfer was considered failed. |
Cause The connection between the ServerNet cluster monitor process
(SNETMON) on the local system and SNETMON on a remote system is lost
due to a low-level communication problem. Effect The connection between the SNETMON process on the local node
and the SNETMON process on the remote node is lost. Recovery This is an informational message. No corrective action is required;
recovery is automatic. |
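The xerror, yerror, and retry-count fields above describe a transfer that was retried over both external fabrics before being declared failed. A minimal sketch of that pattern, assuming a hypothetical send_on(fabric, packet) transport hook that returns None on success or an error code on failure:

```python
def transfer(packet, send_on, retry_limit: int):
    """Attempt the transfer over the X then Y fabric on each retry.

    Returns ('ok', fabric) on the first success, or ('failed', errors)
    with the last error code seen per fabric, mirroring the X Path
    Error / Y Path Error fields of the SCL 1115 event.
    """
    errors = {}
    for _attempt in range(retry_limit):
        for fabric in ("X", "Y"):
            err = send_on(fabric, packet)   # None on success, error code otherwise
            if err is None:
                return ("ok", fabric)
            errors[fabric] = err            # remember the last error per fabric
    # Every attempt on both fabrics failed: connectivity is declared lost.
    return ("failed", errors)
```

This is only an illustration of the failure condition the event reports; the actual SIB retry logic lives in the low-level message system, not in SNETMON.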
1116 ServerNet connectivity over the fabric fabric to ServerNet node remote-node[,] [ system name sysname, Expand node sysnum ] has been established. | fabric | indicates which fabric’s connectivity has been
established. | remote-node | is the ServerNet node number of the remote system. | sysname | is the Expand name of the remote system. | sysnum | is the Expand system number of the remote system. |
Cause There is now at least one working path on the specified fabric from
the local node to the specified remote system, where previously
there were none. The path state changes are documented by IPC 111
events generated by the individual processors. Effect There is at least partial connectivity with the remote system
over the given fabric. Recovery This is an informational message only. No corrective action
is required. |
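As the Cause text states, this event marks a transition: the fabric previously had no working path to the remote system and now has at least one. A sketch of that edge-triggered condition, assuming a hypothetical per-fabric map of path name to up/down state:

```python
def connectivity_established(old_paths: dict, new_paths: dict) -> bool:
    """True when a fabric goes from zero working paths to at least one.

    old_paths / new_paths map path identifiers to booleans (True = up);
    the representation is illustrative, not the subsystem's actual state.
    """
    return not any(old_paths.values()) and any(new_paths.values())
```

Note the event fires only on the zero-to-nonzero transition; additional paths coming up afterward are reported by the per-processor IPC 111 events, not by further SCL 1116 events.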
1117 A call to Service Processor (SP) I/O library
routine routine failed. | routine | identifies the SP I/O library routine. These routines
are possible: |
Cause The ServerNet cluster monitor process (SNETMON) made a call
to an SP I/O library routine, but it failed. Effect The ServerNet cluster monitor process (SNETMON) automatically
retries the SP I/O library routine, possibly after destroying its
previous session with the Service Processor and starting a new session. Recovery The ServerNet cluster monitor process (SNETMON) automatically
retries the SP I/O library routine. The Service Processor might have
to be initialized, or the Service Processor firmware might have to
be upgraded. Contact your service provider for assistance. |
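The Effect and Recovery text above describes a retry that may destroy and recreate the Service Processor session between attempts. A minimal sketch of that pattern, with hypothetical open_session/close_session/sp_call hooks standing in for the SP I/O library:

```python
def call_with_retry(sp_call, open_session, close_session, max_tries: int = 3):
    """Retry an SP I/O routine, rebuilding the SP session after each failure.

    The hook names and the use of IOError as the failure signal are
    assumptions for illustration; the real library reports errors
    through its own return codes.
    """
    session = open_session()
    last_error = None
    for _attempt in range(max_tries):
        try:
            return sp_call(session)
        except IOError as err:
            last_error = err
            # Destroy the failed session and start a new one before retrying.
            close_session(session)
            session = open_session()
    raise IOError("SP I/O routine failed after retries") from last_error
```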
1118 The SMC Driver API interface call returned error err, error detail err-detail. [ Node Number nodeNum Fabric fabric. ] | nodeNum | is the ServerNet node number associated with
the failed SMC Driver API interface call. | fabric | identifies the external ServerNet fabric (X or Y). | err | is a map token containing data relative to the call
and its error returns. The data includes: |
Cause SNETMON invoked an SMC Driver API interface function. The function
returned a code other than SMC_RTN_OK. Effect SNETMON will automatically retry the SMC Driver API call. Recovery This event is primarily provided for support personnel. SNETMON
will automatically retry the SMC Driver API call. If the error persists
(as indicated by additional SCL 1118 events), use the error and error
detail codes in the event to determine and correct the cause of the
failure. |
1119 The ServerNet Cluster subsystem monitor process, process-name, detected an error when registering with
the SMC Driver. Error: err. | process-name | is the name of the ServerNet Cluster subsystem monitor
process ($ZZSCL). | err | contains the reason for the failure. Possible values
are: |
Cause The primary SNETMON process attempted to register with the SMC
Driver, but registration failed. Effect The primary SNETMON process will not be able to communicate
directly with remote nodes via ServerNet reads and will terminate
itself. The backup SNETMON process will take over and attempt to register
with the SMC Driver when it becomes the new primary. The SNETMON process
pair will terminate if registration with the SMC Driver does not succeed
in at least one processor in SNETMON's configured CPU list. Recovery SNETMON will automatically attempt to register with the SMC
Driver on a different processor. A ZZSA* save abend file will be created
when the primary SNETMON process terminates. The ZZSA* save abend
file should be provided for analysis, along with the ZZSV* service
log event file containing the SCL 1119 event. |
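The Effect text above implies a failover loop: registration is attempted in each processor of SNETMON's configured CPU list, and the process pair terminates only if it fails in every one. A hedged sketch of that logic, with a hypothetical register_in(cpu) hook that returns None on success or an error code on failure:

```python
def register_with_smc(cpu_list, register_in):
    """Return the first processor where SMC Driver registration succeeds.

    Returns None when registration fails in every configured processor,
    the condition under which the SNETMON process pair terminates.
    """
    for cpu in cpu_list:
        if register_in(cpu) is None:      # None is taken to mean success here
            return cpu
    return None
```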
1120 The ServerNet Cluster subsystem monitor process, process-name, version is not compatible with the current
version of the SMC driver. | process-name | is the name of the ServerNet Cluster subsystem monitor
process ($ZZSCL). |
Cause SNETMON compared its own version with that of the SMC driver
it is using and determined that the versions are incompatible. Effect The ServerNet Cluster subsystem monitor process terminates. Recovery Check SPR requisites for the SMC Driver (T2800) and SNETMON
(T0294). The operator needs to ensure that the versions of SNETMON
and the SMC Driver are compatible. |
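The check behind this event can be pictured as a simple version comparison. The rule sketched here (matching major version) is purely an assumption; the manual only says the versions must be compatible per the SPR requisites:

```python
def versions_compatible(snetmon_ver: tuple, smc_ver: tuple) -> bool:
    """Treat each version as (major, minor) and require equal majors.

    The (major, minor) shape and the equal-majors rule are illustrative
    assumptions, not the product's actual compatibility test.
    """
    return snetmon_ver[0] == smc_ver[0]
```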
1200 ServerNet statistics for {the local node | remote
ServerNet node node } logged due to cause. | node | is the ServerNet node number of the remote system
that collected the statistics data. | cause | indicates why the statistics were generated. Possible
values are: |
Cause Statistics data is sent to the EMS service log. Effect Statistics counters with nonzero values are written to the EMS
service log. If a request to reset the statistics counter was received,
all statistics counters are reset to 0. Recovery This is an informational message. No corrective action is required. |
Each processor keeps a set of counters for each system in the
ServerNet cluster. To keep the amount of data sent to the service
log at a minimum, statistics counters are present in the node statistics
event only if they have nonzero values. The node statistics event
contains this information: Statistics Node Identification Information | |
This information is always included in the statistics event
to identify the node that is the source of the statistics counters: Messages Sent and Received Counters | |
The node statistics event reports the number of each message
type sent or received to and from each system, including the local
one. Counts are kept separately for each system. This information is returned in the Datalist Messages Sent and
Datalist Messages Received structures: Message System Error Counters | |
The node statistics event reports the number of errors detected
on connections with each system (including the local one): Path Management Counters (X Fabric and Y Fabric) | |
Path management counters record path events pertaining to paths
that originate within the processor. Counts are kept separately for
each X and Y fabric. This information is returned in the Datalist Path Management
X and Datalist Path Management Y structures: ServerNet Path Error Counters (X Fabric and Y Fabric) | |
ServerNet path error counters are maintained separately for
each system, including the local one. Counts are kept separately for
each X and Y fabric. This information is returned in the Datalist TNet Errors X and
Datalist TNet Errors Y structures: Cause Register Error Counters | |
Cause register error counters that are not traceable to connections
with any particular system are included only in Node statistics events
generated for local nodes. Node statistics events for remote nodes
do not contain cause register error counter statistics. Generic Error Counters (X Fabric, Y Fabric, and Unknown Fabric) | |
Counters are maintained in each processor for errors that are
not traceable to connections with any particular system. These generic
error counters are included only in statistics events generated for
local nodes. Node statistics events for remote nodes do not contain
generic error counter statistics. There is a separate set of generic
error counter statistics for each X, Y, and unknown fabric. This information is returned in the Datalist Generic Errors
X, Datalist Generic Errors Y and Datalist Generic Errors U structures:
|
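The nonzero-only rule described above, together with the reset-on-request behavior from the 1200 Effect text, can be sketched as follows (counter names are illustrative, not the event's actual token names):

```python
def build_stats_event(counters: dict, reset: bool = False) -> dict:
    """Copy only nonzero counters into the event payload.

    When reset is True, zero every counter after capturing the event,
    matching the behavior when a reset request was received.
    """
    event = {name: value for name, value in counters.items() if value != 0}
    if reset:
        for name in counters:
            counters[name] = 0
    return event
```

Omitting zero-valued counters is what keeps the amount of data sent to the service log at a minimum, as the text above explains.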