Operator Messages Manual
Chapter 45 IPC (NonStop Kernel Operating System Message System) Messages
The messages in this chapter are generated by the NonStop™
Kernel operating system Message System (IPC) subsystem. The subsystem
ID displayed by these messages includes IPC as the subsystem name.
NOTE: Negative-numbered messages are common to most subsystems. If
you receive a negative-numbered message that is not described in this
chapter, see Chapter 15.
100 Sequence error packets received on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet
node node-number}. Expected sequence number: number-expected and the received sequence number: number-received | path | is the path that contains the fabric over which the
message was transmitted. | reporting-cpu | is the processor number of the processor on which
the error was detected. | problem-cpu | is the processor in which the message originated. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | number-expected | is the sequence number that the processor expected
to receive. | number-received | is the sequence number contained in the message received. |
Cause A ServerNet message system interrupt packet was received twice
or was dropped. Effect The out-of-sequence message will not be acknowledged. The message
sender will eventually get a WACK timeout and will undergo a recovery
protocol to resynchronize the sequence number. The message will be
retried. Recovery This is an informational message only; no corrective action
is needed. |
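The receiver-side check behind message 100 can be illustrated with a minimal sketch. The class and method names (`Receiver`, `accept`) are hypothetical and are not part of the NonStop Kernel message system; sequence-number wraparound and the sender's timeout and resynchronization are deliberately left out. The sketch only shows how a duplicated or dropped packet produces a mismatch between number-expected and number-received, and that the receiver withholds the acknowledgment in that case.

```python
class Receiver:
    """Hypothetical receiver that tracks the next expected sequence number."""

    def __init__(self, first_expected=0):
        self.expected = first_expected
        self.errors = []  # (number_expected, number_received) pairs

    def accept(self, received):
        """Return True (acknowledge) only for the expected packet."""
        if received != self.expected:
            # Out of sequence: record the mismatch and do not acknowledge.
            # The sender is left to time out and resynchronize.
            self.errors.append((self.expected, received))
            return False
        self.expected += 1
        return True


rx = Receiver()
# Packet 1 arrives twice and packet 3 is dropped.
acks = [rx.accept(n) for n in (0, 1, 1, 2, 4)]
```

Each entry in `rx.errors` corresponds to the pair of values that message 100 reports as the expected and received sequence numbers.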
101 Bad ServerNet packet received on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet
node node-number}. Details of the bad packet are: status of packet: packet-status, ServerNet transaction type: tr-type, ServerNet address used: snet-addr, bytecount
in packet: byte-count | path | is the path that contains the fabric over which the
message was transmitted. | reporting-cpu | is the processor number of the processor on which
the error was detected. | problem-cpu | is the processor in which the message originated. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | packet-status | is the ServerNet error code contained in the packet. | tr-type | is the type of ServerNet transaction being attempted. | snet-addr | is the target ServerNet address in the packet. | byte-count | is the length of the packet received, in bytes. |
Cause This is probably caused by a protocol error on the part of the
NonStop Kernel message system software. Effect The packet received is ignored and the problem processor is
responsible for initiating error recovery. Recovery This is an informational message only; no corrective action
is needed. |
102 ServerNet nack received on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet
node node-number}. Details of the nack are: Nack Status code: nack-code, sibs used for the transfer: sib-type, ServerNet address used for transfer: snet-addr, bytecount for the transfer: byte-count | path | is the path that contains the fabric over which the
message was transmitted. | reporting-cpu | is the processor number of the processor on which
the error was detected. | problem-cpu | is the processor in which the message originated. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | nack-code | is the ServerNet NACK code contained in the packet. | sib-type | is the type of Send Info Block (SIB, a NonStop Kernel
message system data structure) that was used by the NonStop Kernel
sending logic to send the packet that was rejected in this event. | snet-addr | is the target ServerNet address in the packet. | byte-count | is the length of the packet received, in bytes. |
Cause The problem processor might be down or recovering from a power
failure. Also, this event might be caused by a protocol error on the
part of the NonStop Kernel message system. Effect If the error is due to a power-fail recovery or the nack status
code indicates an interrupt to a full queue, the packet will eventually
be resent. Otherwise, both paths will be downed and communication
to the problem processor will be severed. If the problem and reporting
processors are in the same system, regroup will be invoked, which
will halt one of the processors. Recovery This is an informational message only; no corrective action
is needed unless a processor halts. If a processor does halt: Take a dump of the halted processor. Take an online dump of the problem or reporting processor. RELOAD the halted processor.
|
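The Effect text for message 102 amounts to a small decision table, sketched below. The function name and its boolean parameters are hypothetical; they restate the conditions named in the text and do not correspond to any real NonStop Kernel interface.

```python
def nack_effect(power_fail_recovery, interrupt_to_full_queue, same_system):
    """Return the outcomes the Effect text describes for a ServerNet nack."""
    if power_fail_recovery or interrupt_to_full_queue:
        # Transient condition: the packet is eventually resent.
        return ["resend packet"]
    # Otherwise communication with the problem processor is severed.
    effects = ["down both paths"]
    if same_system:
        # Regroup is invoked and halts one of the processors.
        effects.append("invoke regroup")
    return effects
```

Only the last outcome calls for operator action, which matches the Recovery text: dumps and a RELOAD are needed only if a processor halts.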
103 Bad destination ServerNet packet received on
the path path by processor reporting-cpu from processor problem-cpu {in ServerNet node node-number}. Details of the bad packets are: ServerNet transaction
type: tr-type, ServerNet address used: snet-addr, bytecount in the packet: byte-count, Intented destination node: dest-node | path | is the path that contains the fabric over which the
message was transmitted. | reporting-cpu | is the processor number of the processor on which
the error was detected. | problem-cpu | is the processor in which the message originated. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | tr-type | is the type of ServerNet transaction being attempted. | snet-addr | is the target ServerNet address in the packet. | byte-count | is the length of the packet received, in bytes. | dest-node | is the ServerNet address of the intended destination
processor. |
Cause An error might have occurred in the routing tables of the ServerNet
network. Effect The reporting processor ignores the packet. If the ServerNet routing tables are not reliable, then it is
possible that the reporting and problem processors will not be able
to communicate. The NonStop Kernel message system will detect the
problem and, if both the processors are in the same system, cause
one of the involved processors to halt. Recovery This is an informational message only; no corrective action
is needed unless a processor halts. If a processor does halt: Take a dump of the halted processor. Take an online dump of the problem or reporting processor. RELOAD the halted processor.
|
104 Unexpected packet received on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet
node node-number}. | path | is the path that contains the fabric over which the
message was transmitted. | reporting-cpu | is the processor number of the processor on which
the error was detected. | problem-cpu | is the processor in which the message originated. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. |
Cause The problem processor was sending “I'm alive” packets
to the reporting processor, and the reporting processor has declared
the problem processor as down. Effect Generally, the reporting processor ignores the packet; but for
packets originating in the local system, the reporting processor returns
a “poison packet” to the problem processor, causing it
to halt itself. Recovery This is an informational message only; no corrective action
is needed. If a processor halts, contact your service provider. |
110 The path path from
processor reporting-cpu to processor problem-cpu {in ServerNet node node-number} was DOWNED due to reason. OPERATOR ATTENTION
NEEDED. Path had excessive failures and will NOT be recovered automatically.
| path | is the path that contains the fabric over which the
message was transmitted. | reporting-cpu | is the processor number of the processor on which
the error was detected. | problem-cpu | is the processor in which the message originated. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | reason | is the reason the path was taken down. |
Cause The NonStop Kernel (NSK) has brought down the path used by the
reporting processor in order to communicate with the problem processor. Effect The downed path will no longer be used for communication between
the indicated processors. Another path will be used. Recovery The operator can try to bring the path back up using the SCF
START SERVERNET command. Alternatively, unless the path was downed
by the operator (SCF command) or due to fabric failure, NonStop Kernel
automatic path recovery will attempt to recover the path. |
111 The path path from
processor reporting-cpu to processor problem-cpu {in ServerNet node node-number}, was brought UP due to reason. | path | is the path that contains the fabric over which the
message was transmitted. | reporting-cpu | is the processor number of the processor making this
report. | problem-cpu | is the processor at the other end of the point-to-point
connection. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | reason | indicates the reason why the path was brought up. |
Cause The NonStop Kernel (NSK) message system has resumed using a
path from the reporting processor. Generally this event occurs when
a processor comes up or when an operator sends a Subsystem Control
Facility (SCF) command to bring up the fabric. Effect The indicated processors may resume communication via the restored
path. Recovery This is an informational message only; no corrective action
is needed. |
112 Processor reporting-cpu has lost connectivity to the path path
due to path-reason. | reporting-cpu | is the processor number of the processor that reports
the problem. | path | identifies the fabric to which connectivity has been
lost. | path-reason | is the reason that connectivity was lost. |
Cause The reporting processor’s connection to the indicated
ServerNet fabric was brought down. Effect The processor no longer attempts to communicate with the rest
of the system or ServerNet cluster via the indicated fabric. Recovery If the fabric was down because of: A hardware problem: correct the problem, then bring
the fabric up using the SCF START SERVERNET command. An operator SCF command: bring the fabric up using
the SCF START SERVERNET command.
|
113 Processor reporting-cpu has recovered connectivity to the path path due to path-reason. | reporting-cpu | is the processor number of the processor which has
just regained connectivity. | path | identifies the fabric to which the reporting processor
has regained connectivity. | path-reason | is the reason connectivity was regained. |
Cause Connectivity between the indicated processor and the fabric
has been restored. Effect The processor is able to again communicate with other system
components through this fabric. Recovery This is an informational message only; no corrective action
is needed. |
114 OPERATOR ATTENTION NEEDED. Connectivity on the path path of processor reporting-cpu is still down due to path-reason. | path | is the path that contains the fabric to which the
reporting processor cannot connect. | reporting-cpu | is the processor number of the processor which has
no connectivity to the fabric. | path-reason | is the reason that connectivity was lost. |
Cause The processor has had no connection to the fabric for the duration
displayed. Effect The processor is not able to communicate with the other system
components over the indicated fabric. Recovery If the fabric is down because of: A hardware problem: correct the problem, then bring
the fabric up using the SCF START SERVERNET command. An operator (SCF) command: bring the fabric up using
the SCF START SERVERNET command.
|
115 Event logging for path path from processor reporting-cpu to
processor problem-cpu {in ServerNet node node-number} is suppressed due to excessive path state
transitions. | path | is the fabric over which the message was transmitted. | reporting-cpu | is the processor number of the processor making the
report. | problem-cpu | is the processor at the other end of the point-to-point
connection. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. |
Cause There have been excessive path transitions. Effect The logging of PATH-UP and PATH-DOWN events on the indicated
path is suspended to avoid flooding the logs. Recovery This is an informational message only; no corrective action
is needed. |
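The suppression described for message 115 can be pictured as a sliding-window counter. This is a minimal sketch under stated assumptions: the threshold, the window length, the rule for resuming logging, and every name in the code are hypothetical, because this chapter does not say how the message system decides that transitions are excessive.

```python
from collections import deque


class TransitionLogger:
    """Hypothetical logger that suppresses events for a flapping path."""

    def __init__(self, max_transitions=5, window_seconds=60.0):
        self.max_transitions = max_transitions  # assumed threshold
        self.window_seconds = window_seconds    # assumed window length
        self.times = deque()
        self.suppressed = False

    def record(self, now, event):
        """Return the text to log, or None while logging is suppressed."""
        self.times.append(now)
        # Discard transitions that fall outside the sliding window.
        while self.times and now - self.times[0] > self.window_seconds:
            self.times.popleft()
        if len(self.times) > self.max_transitions:
            if not self.suppressed:
                self.suppressed = True
                return "logging suppressed: excessive path state transitions"
            return None
        self.suppressed = False
        return event


NOTICE = "logging suppressed: excessive path state transitions"
logger = TransitionLogger(max_transitions=3, window_seconds=10.0)
history = [(0, "PATH-DOWN"), (1, "PATH-UP"), (2, "PATH-DOWN"),
           (3, "PATH-UP"), (4, "PATH-DOWN"), (100, "PATH-UP")]
out = [logger.record(t, e) for t, e in history]
```

In this run the first three transitions are logged, the fourth produces the single suppression notice, the fifth is dropped, and the transition at time 100 is logged again because the window has emptied.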
116 The path path from
processor reporting-cpu to processor problem-cpu {in ServerNet node node-number} had count automatic recoveries since
the last log. | path | is the fabric over which the message was transmitted. | reporting-cpu | is the processor number of the processor making this
report. | problem-cpu | is the processor at the other end of the point-to-point
connection. | node-number | is the node number of the node containing the problem
processor. This is an optional token and will not be passed if the
reporting and problem processors are in the same (local) node. | count | is the count of automatic recoveries. |
Cause The number of automatic recoveries has been recorded for the
indicated path since the last log. Effect None; the system is simply displaying a count indicating ongoing
actions. Recovery This is an informational message only; no corrective action
is needed. |
120 BTE timeouts reported on the path path from processor reporting-cpu to
processor problem-cpu {in ServerNet node node-number}. Number of
BTE timeouts: count | path | is the path that contains the fabric over which the
message system was attempting to transmit. | reporting-cpu | is the processor number of the processor making this
report, in this case, the sending processor. | problem-cpu | is the processor at the other end of the point-to-point
connection, in this case, the target processor. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | count | is the count of BTE timeout occurrences. |
Cause BTE timeouts occurred on the indicated path. Effect The transmission is automatically retried by the NonStop Kernel
message system. Recovery This is an informational message only; no corrective action
is needed. |
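When these events are collected from a log, the tokens can be recovered from the message text with a regular expression built from the template. The sketch below does this for message 120. It assumes that the braces in the template mark optional text that is printed without braces, that the path is a single word, and that processor numbers, node numbers, and counts are decimal digits; none of these formats is stated in this chapter, and the function name is hypothetical.

```python
import re

# Pattern derived from the message 120 template; value formats are assumed.
MSG_120 = re.compile(
    r"BTE timeouts reported on the path (?P<path>\S+) "
    r"from processor (?P<reporting_cpu>\d+) "
    r"to processor (?P<problem_cpu>\d+)"
    r"(?: in ServerNet node (?P<node_number>\d+))?\. "
    r"Number of BTE timeouts: (?P<count>\d+)"
)


def parse_msg_120(text):
    """Return the tokens of a message 120 text as a dict, or None."""
    match = MSG_120.search(text)
    if match is None:
        return None
    tokens = match.groupdict()
    # node_number stays None when the optional token is absent.
    return {k: (int(v) if v is not None and v.isdigit() else v)
            for k, v in tokens.items()}


local = parse_msg_120(
    "BTE timeouts reported on the path X from processor 2 "
    "to processor 5. Number of BTE timeouts: 3")
remote = parse_msg_120(
    "BTE timeouts reported on the path Y from processor 0 "
    "to processor 1 in ServerNet node 4. Number of BTE timeouts: 12")
```

The same approach applies to the other summary messages (121 through 127), with the pattern adjusted to each template.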
121 BARRIER timeouts on the path path from processor reporting-cpu to processor problem-cpu {in ServerNet node node-number}. Number of BARRIER timeouts: count | path | is the path that contains the fabric over which the
message system was attempting to transmit. | reporting-cpu | is the processor number of the processor making this
report, in this case, the sending processor. | problem-cpu | is the processor at the other end of the point-to-point
connection, in this case, the target processor. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | count | is the number of barrier timeout occurrences. |
Cause Either the network is congested, the problem processor is in
a hardware freeze state, or the ServerNet connection is severed or unusable. Effect The path is downed and the message is retried on the other fabric.
NOTE: PATH-DOWN events will be reported as a result of this error.
Recovery This is an informational message only; no corrective action
is needed. |
122 Spurious ServerNet acks received on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet
node node-number}. Number of Spurious acks: count | path | is the path that contains the fabric over which the
acknowledgments were received. | reporting-cpu | is the processor number of the processor making this
report, in this case, the processor receiving the acknowledgments. | problem-cpu | is the processor at the other end of the point-to-point
connection, in this case, the processor purportedly sending the acknowledgments. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | count | is the count of spurious acknowledgments. |
Cause Spurious ServerNet acknowledgments occurred on the indicated
path. Effect None. Recovery This is an informational message only; no corrective action
is needed. |
123 Sequence errors received on the path path by
processor reporting-cpu from processor problem-cpu {in ServerNet node
node-number}. Number of sequence errors: count | path | is the path that contains the fabric on which the
out-of-sequence message or messages were received. | reporting-cpu | is the processor number of the processor making this
report; in this case, the processor receiving the out-of-sequence messages. | problem-cpu | is the processor at the other end of the point-to-point
connection; in this case, the processor sending the out-of-sequence
message or messages. | node-number | is the cluster node number of the cluster node containing
the problem processor. This is an optional parameter and is not passed
if the reporting and the problem processors are in the same (local)
node. | count | is the count of sequence errors. |
Cause One or more out-of-sequence messages occurred. Effect None. The summary is logged periodically. Recovery This is an informational message only; no corrective action
is needed. |
124 Bad ServerNet packets received on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet
node node-number}. ServerNet
Transaction typ: tr-type. Details of the error counts
are: Unsupported pkt type: unsup-pkt-type, Unsupported pkt length: unsup-pkt-length, Bad ServerNet address mask: bad-mask, Bad ServerNet source: bad-source, AVT
access error: access-error, Bad Interrupt: bad-interrupt, Interrupt to full Queue: int-to-full-q | path | is the path that contains the fabric on which the
errors occurred. | reporting-cpu | is the processor number of the processor making this
report. | problem-cpu | is the processor at the other end of the point-to-point
connection. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | tr-type | is the ServerNet transaction type (e.g. read, write,
etc.). | unsup-pkt-type | is the count of unsupported packet type errors detected. | unsup-pkt-length | is the count of unsupported packet length errors detected. | bad-mask | is the count of bad ServerNet address mask errors
detected. | bad-source | is the count of bad ServerNet source errors detected. | access-error | is the count of AVT access errors that occurred. | bad-interrupt | is the count of bad interrupts that occurred. | int-to-full-q | is the count of ServerNet interrupts that occurred
while the queue was full. |
Cause A summary of the bad packet-type errors detected on the indicated
path is logged. Effect None. The summary is logged periodically. Recovery This is an informational message only; no corrective action
is needed. |
125 Nacks received on the path path by processor reporting-cpu from
processor problem-cpu {in ServerNet node node-number}. Details of
the error counts are: Unsupported pkt type: unsup-pkt-type, Unsupported pkt length: unsup-pkt-length, Bad ServerNet address mask: bad-mask, Bad ServerNet source: bad-source, AVT
access error: access-error, Bad Interrupt: bad-interrupt, Interrupt to full Queue: int-to-full-q | path | is the path that contains the fabric on which the
negative acknowledgments occurred. | reporting-cpu | is the processor number of the processor making this
report. | problem-cpu | is the processor at the other end of the point-to-point
connection. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | unsup-pkt-type | is the count of unsupported packet type errors detected. | unsup-pkt-length | is the count of unsupported packet length errors detected. | bad-mask | is the count of bad ServerNet address mask errors
detected. | bad-source | is the count of bad ServerNet source errors detected. | access-error | is the count of AVT access errors that occurred. | bad-interrupt | is the count of bad interrupts that occurred. | int-to-full-q | is the count of ServerNet interrupts that occurred
while the queue was full. |
Cause A summary of NACKs encountered on the indicated path is logged. Effect None. The summary is logged periodically. Recovery This is an informational message only; no corrective action
is needed. |
126 Bad destination ServerNet packets are received
on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet node node-number}. Number of bad destination packets: count | path | is the path that contains the fabric on which the
errors were detected. | reporting-cpu | is the processor number of the processor making this
report. | problem-cpu | is the processor at the other end of the point-to-point
connection. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | count | is the count of invalid destination ID errors. |
Cause A summary count of packets with an invalid destination ID received
on the indicated path is logged. Effect None. The summary is logged periodically. Recovery This is an informational message only; no corrective action
is needed. |
127 Unexpected ServerNet packets received on the path path by processor reporting-cpu from processor problem-cpu {in ServerNet
node node-number}. Number of unexpected
packets: count | path | is the path that contains the fabric on which the
errors were detected. | reporting-cpu | is the processor number of the processor making this
report. | problem-cpu | is the processor at the other end of this point-to-point
connection. | node-number | is the node number of the node containing the problem
processor. This is an optional token and is not passed if the reporting
and problem processors are in the same (local) node. | count | is the count of unexpected packets. |
Cause A summary count of unexpected packets received on the indicated
path is logged. Effect None. The summary is logged periodically. Recovery This is an informational message only; no corrective action
is needed. |
140 R10K speculative write problem(s) encountered
on reporting-cpu Instances of this problem
since: last log log-spec-write, coldload life-spec-write Attempts to use alternate buffer:
Since last log: Since coldload: [Successful: log-alt-sw-ok life-alt-sw-ok] [Failed: log-alt-sw-fail life-alt-sw-fail] Last occurrence of the problem:
Buffer with error: Address: fail-buf-addr, Type: fail-buf-type, [Source processor: source-cpu, Req Ctrl Size: req-ctrl-size,
Req Data Size: req-data-size], [Source processor: source-cpu, Reply
Data Size: reply-data-size], [Source
processor: source-cpu, Reply Ctrl Size: reply-ctrl-size], [Source processor: source-cpu, Reply Ctrl Size: reply-ctrl-size,
Reply Data Size: reply-data-size], [Source: Node: source-cluster, Processor: source-cpu, PIN: pin [1], Destination: pin [2] [Source: Internal
Use: internal-data] | reporting-cpu | is the processor number of the processor on which
the speculative write error was detected. | log-spec-write | is the number of occurrences of this event since
the last time one was logged. | life-spec-write | is the number of occurrences of this event, in this
processor, since the system was coldloaded. | log-alt-sw-ok | is the number of times message transmission successfully
used the alternate buffer (because an error occurred in the primary
buffer) since the last time an occurrence of this event was logged. | log-alt-sw-fail | is the number of times the message system was unable
to switch to the alternate buffer (after detecting a speculative write
error in the primary buffer) since the last time an occurrence of
this event was logged. | life-alt-sw-ok | is the number of times message transmission successfully
used the alternate buffer (because an error occurred in the primary
buffer) since the reporting processor was coldloaded or reloaded. | life-alt-sw-fail | is the number of times the message system was unable
to switch to the alternate buffer (after detecting a speculative
write error in the primary buffer) since the reporting processor was
coldloaded or reloaded. | fail-buf-addr | is the address of the buffer in which the speculative
write error was detected. | fail-buf-type | is the type of buffer in which the speculative write
error was detected. | source-cpu | is the processor number of the other processor involved
in this event. | req-ctrl-size | is the size of the message’s request control
element, in bytes. | req-data-size | is the size of the message’s request data element,
in bytes. | reply-data-size | is the size of the message’s reply data element,
in bytes. | reply-ctrl-size | is the size of the message’s reply control
element, in bytes. | source-cluster | is the cluster number of the other cluster involved
in this event. | pin [1] | is the process identifier of the process that originated
the message. | pin [2] | is the process identifier of the message’s destination
process. | internal-data | is a piece of internal message system data intended
to assist in debugging. |
Cause The message system performed error recovery after detecting
a potential buffer corruption during message traffic handling. Effect The NonStop Kernel message system automatically retries the
failing message. Recovery This is an informational message only; no corrective action
is needed. |
150 Processor reporting-cpu started
regroup because of processor problem-cpu with reason: regroup-reason. Processors:
before regroup init-cpu-mask after regroup end-cpu-mask
Regroup sequence numbers: Current sequence-no [1] Previous sequence-no [2]. Duration of this incident: duration milliseconds. | reporting-cpu | is the processor number of the processor making this
report. It is the highest numbered processor which survived the regroup
incident. | problem-cpu | is the processor to which all communication was
lost. | regroup-reason | is the reason that the regroup was started. | init-cpu-mask | is a bit mask of the “up” processors before
the regroup began. | end-cpu-mask | is a bit mask of the “up” processors after
the regroup completes. | sequence-no [1] | is the system regroup sequence number as of the end
of the logged regroup incident. | sequence-no [2] | is the system regroup sequence number as of the beginning
of the logged regroup incident. |
Cause A regroup incident has occurred. Effect One or more processors might have halted. Recovery Determine the reason why the processors halted and follow the
installation procedures to document the problem and reload the processors. Minimally, any processor which halted should be dumped and the
dumps should be transmitted to your service provider along with system
documentation files (e.g. tsysclr, conflist, service logs, etc.). |
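Comparing init-cpu-mask with end-cpu-mask shows which processors dropped out during the regroup incident. The sketch below assumes a 16-bit mask in which the most significant bit represents processor 0; this chapter does not state the bit layout, so the layout is exposed as a parameter and should be checked against your system before relying on the result. The function names are hypothetical.

```python
def cpus_in_mask(mask, width=16, msb_is_cpu0=True):
    """List the processor numbers whose bit is set in mask."""
    cpus = []
    for cpu in range(width):
        # Assumed layout: processor 0 is the most significant bit.
        bit = (width - 1 - cpu) if msb_is_cpu0 else cpu
        if mask & (1 << bit):
            cpus.append(cpu)
    return cpus


def lost_cpus(init_cpu_mask, end_cpu_mask, **layout):
    """Processors that were up before the regroup but not after it."""
    return cpus_in_mask(init_cpu_mask & ~end_cpu_mask, **layout)


before = 0b1111000000000000  # processors 0-3 up (assumed layout)
after = 0b1101000000000000   # processor 2 no longer up
halted = lost_cpus(before, after)
```

The processors returned by `lost_cpus` are the candidates for the dump and reload steps in the Recovery text.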
160 RCVDUMP/RELOAD failed with the reason fail-cause for the processor cpu with the number of retries num-retries and type of dump is dumptype. Other details
are dump-rld-type: Specification of the
dump, either Reload or RCVDUMP, slice-id: Slice where error occurred, last-xfab-err: Last X fabric error, last-yfab-err:
Last Y fabric error, avt-mapping: AVT mapping
status for the dump. | fail-cause | indicates the reason for the RCVDUMP/RELOAD failure. | cpu | is the processor number where the failure occurred. | num-retries | is the number of implicit retries. | dumptype | is the type of dump. | dump-rld-type | is the dump specification, either Reload or RCVDUMP. | slice-id | identifies the slice where the error occurred. | last-xfab-err | is the last error that occurred on X fabric. | last-yfab-err | is the last error that occurred on Y fabric. | avt-mapping | is the AVT mapping status for the dump. |
Cause A RCVDUMP/RELOAD failure occurred. Effect The RCVDUMP/RELOAD did not complete. Recovery The occurrence of this message indicates a likely hardware
problem. Contact your service provider. |
170 Message failed due to a request buffer being
modified while the message was in transit. Sending Pin: sending-pin, Buffer size: buffer-size, Buffer context: buffer-context, Buffer
context-relative address: buffer-craddr, Expected checksum: expected-checksum, Calculated checksum: calculated-checksum, Recalculated checksum: recalculated-checksum, Buffer type: buffer-type, Retry count: num-retries. | sending-pin | is the client processor pin number that owned the
buffer that was modified. | buffer-size | is the size of the message buffer that was modified. | buffer-context | is the buffer CBA context. | buffer-craddr | is the buffer CBA context-relative address. | expected-checksum | is the expected checksum calculated by the client
CPU before the buffer was modified. | calculated-checksum | is the checksum calculated by the server CPU upon
receiving the modified buffer. | recalculated-checksum | is the checksum recalculated by the client CPU after
being informed by the server CPU that the expected and calculated
checksums did not match. | buffer-type | is the type of message buffer that was modified (i.e.,
either a request control or a request data buffer). | num-retries | is the number of retries performed to attempt to recover
from modifications in the message buffer. |
Cause Failed memory handling check due to a message buffer being modified
while the message was in transit. Effect The message fails with File System error 654 (“A message
or I/O operation failed due to a message or I/O buffer being modified
while the operation was in progress.”) Recovery Contact your service provider to determine if the buffer was
modified due to a possible programming error in the process represented
by the sending-pin. In particular, a programming
error is highly likely if a retry count (num-retries) greater than 0 is reported in the event (this signifies that the
buffer was modified multiple times, thereby preventing retries from
succeeding). Note that the NonStop Kernel message system automatically
retries the failing message (up to a maximum retry limit) if the AUTO_RETRY_ON_ERROR_654
Kernel subsystem parameter is configured with a value of ON. You can
determine the value of this parameter by issuing the SCF INFO SUBSYS
$ZZKRN, DETAIL command. Conversely, a retry count of 0 signifies that
the AUTO_RETRY_ON_ERROR_654 Kernel subsystem parameter is configured
with a value of OFF, thereby disallowing retries when the NonStop
Kernel message system detects that a message buffer was modified while
the message was in transit. A possible programming error in the process
represented by the sending-pin should also
be suspected even if the retry count is 0. However, if applications
running in the system have a legitimate reason to modify message buffers
of in-transit messages, then consider enabling automatic retries for
modified message buffers. This can be accomplished by configuring
the AUTO_RETRY_ON_ERROR_654 Kernel subsystem parameter with a value
of ON through the SCF ALTER SUBSYS $ZZKRN, AUTO_RETRY_ON_ERROR_654
ON command. For more details, see the SCF Reference
Manual for the Kernel Subsystem. |
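The three checksums reported by message 170 can be related to one another with a small simulation. This is a sketch under stated assumptions: CRC-32 stands in for the real checksum algorithm, which this chapter does not identify, and the function names are hypothetical. The client computes the expected checksum before sending, the server computes its own on receipt, and the client recalculates only after a mismatch is reported.

```python
import zlib


def checksum(buffer):
    """Stand-in checksum; the real algorithm is not given in this chapter."""
    return zlib.crc32(bytes(buffer))


def send_with_check(buffer, modify_in_transit=None):
    """Simulate one send; return (expected, calculated, recalculated)."""
    expected = checksum(buffer)          # client, before sending
    if modify_in_transit is not None:
        modify_in_transit(buffer)        # sender changes its own buffer
    calculated = checksum(buffer)        # server, on receipt
    recalculated = None
    if calculated != expected:
        recalculated = checksum(buffer)  # client, after the mismatch report
    return expected, calculated, recalculated


def scribble(buf):
    buf[0] ^= 0xFF  # flip the first byte while the message is "in transit"


clean = send_with_check(bytearray(b"request"))
dirty = send_with_check(bytearray(b"request"), scribble)
```

In the `dirty` case the recalculated checksum matches what the server computed rather than the original expected value, which is consistent with the sending process having changed its own buffer after the send was issued.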
171 Message failed due to a reply buffer being modified
while the message was in transit. Sending Pin: sending-pin, Buffer size: buffer-size, Buffer context: buffer-context, Buffer context-relative address: buffer-craddr, Expected checksum: expected-checksum, Calculated checksum: calculated-checksum, Buffer type: buffer-type, Retry count: num-retries. | sending-pin | is the server processor pin number that owned the
buffer that was modified. | buffer-size | is the size of the message buffer that was modified. | buffer-context | is the buffer CBA context. | buffer-craddr | is the buffer CBA context-relative address. | expected-checksum | is the expected checksum calculated by the server
CPU before the buffer was modified. | calculated-checksum | is the checksum calculated by the client CPU upon
receiving the modified buffer. | buffer-type | is the type of message buffer that was modified (either
a reply control or a reply data buffer). | num-retries | is the number of retries performed to attempt to recover
from modifications in the message buffer. |
Cause Failed memory handling check due to a message buffer being modified
while the message was in transit. Effect The message fails with File System error 654 (“A message
or I/O operation failed due to a message or I/O buffer being modified
while the operation was in progress.”) Recovery Contact your service provider to determine if the buffer was
modified due to a possible programming error in the client process.
In particular, a programming error is highly likely if a retry count
(num-retries) greater than 0 is reported
in the event (this signifies that the buffer was modified multiple
times, thereby preventing retries from succeeding). Note that the
NonStop Kernel message system automatically retries the failing message
(up to a maximum retry limit) for an in-flight reply buffer. A possible
programming error in the client process must also be suspected even
if the retry count is 0. For more information, see the SCF Reference Manual for the Kernel Subsystem. |
|