Operator Messages Manual

Chapter 99 SNT (ServerNet Error Handler) Messages

The messages in this chapter are generated by the ServerNet Error Handler subsystem. The subsystem ID displayed by these messages includes SNT as the subsystem name.

NOTE: Negative-numbered messages are common to most subsystems. If you receive a negative-numbered message that is not described in this chapter, see Chapter 15.


1000

CPU cpu: Rcv Ugly Packet Errors errors Occurred on Fabric fabric-number

cpu

is the processor number that encountered the RCV Ugly errors.

errors

is the count of RCV Ugly errors encountered.

fabric-number

is the fabric number (0 for X, 1 for Y) where the RCV Ugly errors were encountered.

Cause  The processor detected a badly formatted packet.

Effect  The corrupted packet is dropped and software error handling takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.
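
Many messages in this chapter share the layout "CPU cpu: kind Errors errors Occurred on Fabric fabric-number". As a minimal sketch, assuming the logged text follows that layout exactly and that the fields are printed as decimal numbers, the fields could be extracted as shown below. The names ERROR_EVENT, FABRIC_NAMES, and parse_error_event are hypothetical and are not part of the SNT subsystem.

```python
import re

# Hypothetical pattern for the "CPU cpu: <kind> Errors errors Occurred on
# Fabric fabric-number" layout. Decimal fields and single spacing are assumed.
ERROR_EVENT = re.compile(
    r"CPU (?P<cpu>\d+): (?P<kind>.+?) Errors (?P<errors>\d+) "
    r"Occurred on Fabric (?P<fabric>[01])$"
)

# Mapping stated in the message descriptions: 0 for X, 1 for Y.
FABRIC_NAMES = {0: "X", 1: "Y"}


def parse_error_event(text):
    """Return a dict of message fields, or None if the text does not match."""
    match = ERROR_EVENT.match(text.strip())
    if match is None:
        return None
    return {
        "cpu": int(match.group("cpu")),
        "kind": match.group("kind"),
        "errors": int(match.group("errors")),
        "fabric": FABRIC_NAMES[int(match.group("fabric"))],
    }


# Sample text with made-up values, used only for illustration.
sample = "CPU 3: Rcv Ugly Packet Errors 12 Occurred on Fabric 1"
print(parse_error_event(sample))
```

Messages that use a different layout, such as 1036 or 1044 and later, do not match this pattern and return None.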



1001

CPU cpu: TPB Packet Errors errors Occurred on Fabric fabric-number

cpu

is the processor number that encountered the TPB error.

errors

is the count of TPB errors encountered.

fabric-number

is the fabric number (0 for X, 1 for Y) where the TPB errors were encountered.

Cause  A corrupted ServerNet packet has been detected and tagged with the TPB label by a router, and forwarded to the reporting processor.

Effect  The corrupted packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1002

CPU cpu: CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number that encountered the CRC error.

errors

is the count of CRC errors encountered.

fabric-number

is the fabric number (0 for X, 1 for Y) where the CRC errors were encountered.

Cause  A corrupted ServerNet packet has been detected and tagged with the TPB label by a router, and then forwarded to the reporting processor.

Effect  The corrupted packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1003

CPU cpu: TPB with CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the TPB with CRC errors were encountered.

Cause  A corrupted ServerNet packet has been detected.

Effect  The corrupted packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1004

CPU cpu: Underrun Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the underrun errors were encountered.

Cause  A packet with an underrun has been detected on a ServerNet link.

Effect  The packet with the underrun is dropped and an internal error handler takes care of the dropped packet.

Recovery  This is an informational message only; no corrective action is needed.



1005

CPU cpu: Underrun with TPB Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the underrun with TPB errors were encountered.

Cause  A packet with an underrun and TPB symbol has been detected on a ServerNet link.

Effect  The packet is dropped and an internal error handler takes care of the dropped packet.

Recovery  This is an informational message only; no corrective action is needed.



1006

CPU cpu: Underrun with CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the underrun with CRC errors were encountered.

Cause  Packets with an underrun and a bad CRC have been detected.

Effect  The packets are dropped and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1007

CPU cpu: Underrun with TPB with CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the underrun with TPB with CRC errors were encountered.

Cause  Packets with an underrun accompanied by a TPB symbol and a bad CRC have been detected.

Effect  The packets are dropped and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1008

CPU cpu: Runt Packet Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the runt packet errors were encountered.

Cause  Runt packets have been detected.

Effect  The packet is dropped and an internal error handler takes care of the dropped packet.

Recovery  This is an informational message only; no corrective action is needed.



1009

CPU cpu: Runt Packet with TPB Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the runt packet with TPB errors were encountered.

Cause  A runt packet followed by a TPB symbol has been detected.

Effect  The packet is dropped and an internal error handler takes care of the dropped packet.

Recovery  This is an informational message only; no corrective action is needed.



1010

CPU cpu: Runt Packet with CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the runt packet with CRC errors were encountered.

Cause  Runt packets with bad CRC have been detected.

Effect  The packets are dropped and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1011

CPU cpu: Runt Packet with TPB with CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the runt packet with TPB with CRC errors were encountered.

Cause  Runt packets accompanied by a TPB symbol and a bad CRC have been detected.

Effect  The packets are dropped and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1012

CPU cpu: Overrun Packet Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the overrun errors were encountered.

Cause  A packet with an overrun has been detected.

Effect  The packet is dropped and an internal error handler takes care of the dropped packet.

Recovery  This is an informational message only; no corrective action is needed.



1013

CPU cpu: Overrun Packet with TPB Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the overrun with TPB errors were encountered.

Cause  A packet with an overrun and a TPB has been detected.

Effect  The packet is dropped and an internal error handler takes care of the dropped packet.

Recovery  This is an informational message only; no corrective action is needed.



1014

CPU cpu: Overrun Packet with CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the overrun packet with CRC errors were encountered.

Cause  Packets with an overrun and a bad CRC have been detected.

Effect  The packets are dropped and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1015

CPU cpu: Overrun Packet with TPB with CRC Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the overrun with TPB with CRC errors were encountered.

Cause  Packets with an overrun accompanied by a TPB symbol and a bad CRC have been detected.

Effect  The packets are dropped and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1016

CPU cpu: Unsupported Packet Type Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the unsupported packet type errors were encountered.

Cause  Packets with an unsupported ServerNet transaction type have been detected.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1017

CPU cpu: Unsupported Length Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the unsupported length errors were encountered.

Cause  Packets with an unsupported length have been detected.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1018

CPU cpu: Bad Destination ID Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad destination ID errors were encountered.

Cause  Packets with bad destination node IDs have been detected on a ServerNet link.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1019

CPU cpu: Bad SrcIdBad Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad SrcIdBad errors were encountered.

Cause  Packets with bad source node IDs have been detected.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1020

CPU cpu: Bad RdReqOvflo Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad read request overflow errors were encountered.

Cause  Packets with a bad read request overflow have been detected.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1021

CPU cpu: Spurious Acknowledgment Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the spurious acknowledgment errors were encountered.

Cause  An unexpected ServerNet acknowledgment packet has been detected on a ServerNet link.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1022

CPU cpu: Bad Mask Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad mask errors were encountered.

Cause  Packets with a bad ServerNet address that failed the mask check have been encountered.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1023

CPU cpu: Bad Path Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad path errors were encountered.

Cause  Packets that failed the path bit AVT check have been detected on a ServerNet link.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1024

CPU cpu: Bad Source Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad source errors were encountered.

Cause  Packets that failed the source node check in the AVT have been detected.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1025

CPU cpu: Bad Access Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad access errors were encountered.

Cause  Packets that failed the permission check in the AVT have been detected on a ServerNet link.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1026

CPU cpu: Bad Interrupt Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the bad interrupt errors were encountered.

Cause  Packets that failed the interrupt AVT check have been detected on a ServerNet link.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1027

CPU cpu: Interrupt Que Full Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the interrupt queue full errors were encountered.

Cause  An interrupt packet was received while the corresponding interrupt queue was full.

Effect  The offending packet is dropped and an internal error handler takes care of the corrupted packet.

Recovery  This is an informational message only; no corrective action is needed.



1028

CPU cpu: Babbling Source Detected Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the babbling source errors were detected.

Cause  A source of the ServerNet interrupt packet was found generating too many interrupt packets and causing the interrupt queue to be full.

Effect  These interrupts are counted, and if too many such interrupts occur, the offending ServerNet link is no longer used.

Recovery  This is an informational message only; no corrective action is needed.
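
The counting behavior described in the Effect above can be pictured with a small model. This is only an illustrative sketch and does not describe the actual error handler: the manual does not state how many interrupts count as "too many", so the MAX_INTERRUPTS value, the LinkMonitor class, and its method names are all assumptions.

```python
from collections import Counter

# Hypothetical threshold. The manual does not give the real limit.
MAX_INTERRUPTS = 100


class LinkMonitor:
    """Toy model: count interrupts per fabric and stop using a noisy link."""

    def __init__(self, limit=MAX_INTERRUPTS):
        self.limit = limit
        self.counts = Counter()   # interrupts seen so far, keyed by fabric
        self.disabled = set()     # fabrics whose link is no longer used

    def record(self, fabric, count=1):
        """Add interrupts for a fabric; return True if its link is still usable."""
        self.counts[fabric] += count
        if self.counts[fabric] > self.limit:
            self.disabled.add(fabric)
        return fabric not in self.disabled


monitor = LinkMonitor(limit=3)
print(monitor.record(0, 2))  # still within the assumed limit
print(monitor.record(0, 2))  # limit exceeded, link marked unusable
```

In this model a disabled link stays disabled. Bringing a fabric back into use corresponds to the operator action described for message 1036.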



1029

CPU cpu: Interrupt With No Device Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the interrupt with no device errors were encountered.

Cause  When a software subsystem that interacts on ServerNet is deinstalled, an interrupt packet can be posted to the subsystem. Such interrupts cause the “interrupt with no device” errors to occur.

Effect  These interrupts are counted, and if too many such interrupts occur, the offending ServerNet link is no longer used. The interrupt packet that caused the error is dropped.

Recovery  This is an informational message only; no corrective action is needed.



1030

CPU cpu: Exception Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the exception errors were encountered.

Cause  The exception interrupt queue was full or the exception interrupt AVT was corrupted.

Effect  These interrupts are counted and if too many such interrupts occur, the processor halts.

Recovery  This is an informational message only; no corrective action is needed.



1031

CPU cpu: Write Overflow Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the write overflow errors were encountered.

Cause  The write response buffer was exhausted. At least one write request response was lost.

Effect  These interrupts are counted and if too many such interrupts occur, the processor halts.

Recovery  This is an informational message only; no corrective action is needed.



1032

CPU cpu: Read Overflow Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the read overflow errors were encountered.

Cause  The receiver read request buffer received more than the maximum allowed read request packets.

Effect  These interrupts are counted and if too many such interrupts occur, the processor halts.

Recovery  This is an informational message only; no corrective action is needed.



1033

CPU cpu: Interrupt Queue Full Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the interrupt queue full errors were encountered.

Cause  The interrupt queue was being filled faster than the processor could empty it.

Effect  These interrupts are counted and reported.

Recovery  This is an informational message only; no corrective action is needed.



1034

CPU cpu: Link Exception Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the link exception errors were encountered.

Cause  The X ServerNet link might have failed.

Effect  Too many of these errors can cause the X ServerNet link to be unusable.

Recovery  This is an informational message only; no corrective action is needed.



1035

CPU cpu: Link Exception Errors errors Occurred on Fabric fabric-number

cpu

is the processor number.

errors

is the error counter.

fabric-number

is the fabric number (0 for X, 1 for Y) where the link exception errors were encountered.

Cause  The Y ServerNet link might have failed.

Effect  Too many of these errors can cause the Y ServerNet link to be unusable.

Recovery  This is an informational message only; no corrective action is needed.



1036

CPU cpu: Fabric fabric-number Down Due to Reason down-reason

cpu

is the processor number.

fabric-number

is the fabric number (0 for X, 1 for Y) that was downed.

down-reason

is the reason code for the problem that caused the fabric to be downed.

Cause  A ServerNet link has been marked as down and unusable. Too many errors might have caused the fabric to become unusable.

Effect  The processor no longer uses the downed fabric for either sending or receiving ServerNet packets.

Recovery  An operator intervention is needed to recover the downed ServerNet link.



1037

CPU cpu: Packet abnormal end Errors abnormal-end Occurred on Fabric fabric-number

cpu

is the processor number.

abnormal-end

is the number of RCV abnormal end errors encountered.

fabric-number

is the fabric number (0 for X, 1 for Y) where the abnormal end errors were encountered.

Cause  The processor has detected badly formatted packets on a ServerNet link.

Effect  The packets are dropped, and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1038

CPU cpu: Non-atomic-wrt during sleep Errors natmwrt-during-sleep Occurred on Fabric fabric-number

cpu

is the processor number.

natmwrt-during-sleep

is the number of nonatomic packets received during sleep mode.

fabric-number

is the fabric number (0 for X, 1 for Y) where the nonatomic packets were received.

Cause  The processor has detected badly formatted packets on a ServerNet link.

Effect  The packets are dropped, and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1039

CPU cpu: Unknow packet Errors avt-unknown Occurred on Fabric fabric-number

cpu

is the processor number.

avt-unknown

is the number of unknown errors encountered.

fabric-number

is the fabric number (0 for X, 1 for Y) where the unknown packet errors were encountered.

Cause  The processor has detected badly formatted packets on a ServerNet link.

Effect  The packets are dropped, and an internal error handler takes care of the dropped packets.

Recovery  This is an informational message only; no corrective action is needed.



1040

CPU cpu:Transfer sidebuffer corruption Errors sidebuf-error Occurred

cpu

is the processor number.

sidebuf-error

is the number of sidebuffer errors encountered.

Cause  The processor has detected a corrupted incoming transfer.

Effect  Depending on the size of the transfer, either the processor halts or the client is notified that the transfer was corrupted.

Recovery  This is an informational message only; no corrective action is needed. The client might retry the operation.



1041

CPU cpu: Fabric fabric-number Up

cpu

is the processor number.

fabric-number

is the fabric number (0 for X, 1 for Y) that was brought up.

Cause  A fabric was brought up.

Effect  A fabric that was not previously available is now available for data transfers.

Recovery  This is an informational message only; no corrective action is needed.
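
Messages 1036 (fabric down) and 1041 (fabric up) can be replayed in order to work out which fabrics each processor currently considers usable. The following is a minimal sketch, assuming the message text matches the layouts shown in this chapter and that the reason code contains no spaces. The names DOWN, UP, and fabric_states are hypothetical.

```python
import re

# Patterns for messages 1036 and 1041. The format of the reason code is an
# assumption; it is treated here as a single token without spaces.
DOWN = re.compile(r"CPU (\d+): Fabric ([01]) Down Due to Reason (\S+)")
UP = re.compile(r"CPU (\d+): Fabric ([01]) Up")


def fabric_states(messages):
    """Return {(cpu, fabric): "up" or "down"} after replaying messages in order."""
    states = {}
    for text in messages:
        down = DOWN.match(text)
        up = UP.match(text)
        if down:
            states[(int(down.group(1)), int(down.group(2)))] = "down"
        elif up:
            states[(int(up.group(1)), int(up.group(2)))] = "up"
    return states


# Sample log with made-up values, used only for illustration.
log = [
    "CPU 2: Fabric 0 Down Due to Reason 7",
    "CPU 2: Fabric 1 Down Due to Reason 3",
    "CPU 2: Fabric 0 Up",
]
print(fabric_states(log))
```

A fabric that still shows as "down" after the replay is a candidate for the operator intervention described under message 1036.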



1042

CPU cpu: ServerNet software encountered errors timeouts on Fabric fabric-number. Recovery details: details

cpu

is the processor number that encountered the error.

errors

is the number of congestion recovery occurrences that have been encountered.

fabric-number

indicates the fabric (0 for X, 1 for Y) where the timeout occurred, but it is not guaranteed that this is the fabric that is experiencing congestion.

details

is additional error information that can be appended to the event-message text. This information can vary, depending on the version of the software being used.

Cause  A ServerNet packet transfer timed out, and ServerNet software is performing recovery procedures.

Effect  The packet that timed out is not delivered, but higher-level software must tolerate this condition and recover from it. If the cause of the timeout was transient, the ServerNet software will be able to clear the error state and continue. If the cause of the timeout is persistent congestion, ServerNet software will reset the ports on the ServerNet hardware, potentially causing errors on other packets.

Recovery  This message is intended only for use by developers. It is an informational message only; no corrective action is needed.



1043

CPU cpu: ServerNet software sent errors information events. Information details: Information-id datum-1 datum-2 datum-3

cpu

specifies the processor that encountered the error.

errors

is the number of congestion errors encountered.

Information-id datum-1 datum-2 datum-3

is additional error information that can be appended to the event-message text.

Cause  The ServerNet software sent information to the event log.

Effect  None, other than the event is logged.

Recovery  This message is intended only for use by developers. It is an informational message only; no corrective action is needed.



1044

Processor cpunum: count Link Alive event(s) have occurred. [ The SNet Config Register at the time of the first event was: registervalue ] [ The SNet Config Register at the time of the last event was: registervalue ]

cpunum

is the processor on which the event occurred.

count

is the number of Link Alive events that have occurred.

registervalue

is the value of the ServerNet configuration register.

Cause  A hardware link that was down is now up.

Effect  The hardware link can be used for communications.

Recovery  This is an informational message only; no corrective action is needed.



1045

Processor cpunum: count Undefined ASIC error(s) have been reported. [ Additional info at the time of the first exception: LLP Event X: x-llp-evtreg Exception Data X: x-data LLP Event Y: y-llp-evtreg Exception Data Y: y-data Interrupt Cause: intcause HW SelfCheck: selfcheck ] [ Additional info at the time of the last exception: LLP Event X: x-llp-evtreg Exception Data X: x-data LLP Event Y: y-llp-evtreg Exception Data Y: y-data Interrupt Cause: intcause HW SelfCheck: selfcheck ]

cpunum

is the number of the processor on which the interrupt occurred.

count

is the number of undefined ASIC errors that have been reported.

x-llp-evtreg

is the value of LLP event register X.

x-data

is the value of the LLP X Exception Data Register.

y-llp-evtreg

is the value of LLP event register Y.

y-data

is the value of the LLP Y Exception Data Register.

intcause

is the cause of the interrupt.

selfcheck

is the value of the LLP Y SelfCheck Register.

Cause  Lower-level software presented ServerNet software with an interrupt of an indeterminate type.

Effect  None

Recovery  This is an informational message only; no corrective action is needed.



1046

Processor cpunum: count minor port error(s) on have occurred on port fabric. [ Additional info at the time of the first error: Port Error Cnt: errorcount Link Event: linkerror ] [ Additional info at the time of the last error: Port Error Cnt: errorcount Link Event: linkerror ]

cpunum

is the processor on which the event occurred.

count

is the number of minor port errors that have occurred on the port fabric.

fabric

is the number of the port on which the event occurred.

errorcount

is the number of minor port errors that the hardware has detected. (The hardware can detect and count many errors before the software logs the event.)

linkerror

is the type of link error for the first or last minor port error that occurred on the port fabric.

Cause  The ServerNet hardware detected one or more minor port errors.

Effect  The ServerNet hardware tries to recover any data transmission affected by the error.

Recovery  This is an informational message only; no corrective action is needed.
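
Messages 1044 and later append optional bracketed blocks that describe the first and the last occurrence. As a minimal sketch for message 1046, assuming each block follows the layout shown above and that each value is a single token without spaces, the blocks could be separated as shown below. The names BLOCK and parse_port_error_info are hypothetical.

```python
import re

# Pattern for one "[ Additional info at the time of the first/last error: ... ]"
# block of message 1046. Single-token values are an assumption.
BLOCK = re.compile(
    r"\[ Additional info at the time of the (first|last) error: "
    r"Port Error Cnt: (\S+) Link Event: (\S+) \]"
)


def parse_port_error_info(text):
    """Return {"first": {...}, "last": {...}} for whichever blocks are present."""
    info = {}
    for which, error_count, link_event in BLOCK.findall(text):
        info[which] = {"errorcount": error_count, "linkerror": link_event}
    return info


# Sample text with made-up values, used only for illustration.
sample_1046 = (
    "Processor 1: 2 minor port error(s) on have occurred on port 0. "
    "[ Additional info at the time of the first error: "
    "Port Error Cnt: 4 Link Event: 0x10 ] "
    "[ Additional info at the time of the last error: "
    "Port Error Cnt: 9 Link Event: 0x20 ]"
)
print(parse_port_error_info(sample_1046))
```

The values are kept as strings because the manual does not say whether they are printed in decimal or hexadecimal.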



1047

Processor cpunum: count Transmit FILL error(s) have occurred on port fabric. [ Additional info at the time of the first error: LLP Status: status Transmit Buffer Watermark: watermark ] [ Additional info at the time of the last error: LLP Status: status Transmit Buffer Watermark: watermark ]

cpunum

is the processor on which the event occurred.

count

is the number of Transmit FILL errors that have occurred on the port fabric.

fabric

is the number of the port on which the event occurred.

status

is the status of the LLP register at the time of the first or last Transmit FILL error that occurred on the port fabric.

watermark

is the value of the Transmit Buffer Watermark at the time of the first or last Transmit FILL error that occurred on the port fabric.

Cause  ServerNet hardware is transmitting excessive FILL symbols.

Effect  The receiver of the data stream might report a minor port error. ServerNet communications and performance might be slightly degraded.

Recovery  This is an informational message only; no corrective action is needed.



1048

Processor cpunum: count Transmit RunOn error(s) have occurred on port fabric. [ Additional info at the time of the first error: LLP Status: status Transmit Buffer Watermark: watermark ] [ Additional info at the time of the last error: LLP Status: status Transmit Buffer Watermark: watermark ]

cpunum

is the processor on which the event occurred.

count

is the number of Transmit RunOn errors that have occurred on the port fabric.

fabric

is the number of the port on which the event occurred.

status

is the status of the LLP register at the time of the first or last Transmit RunOn error.

watermark

is the value of the Transmit Buffer Watermark at the time of the first or last Transmit RunOn error.

Cause  ServerNet hardware is transmitting too many symbols in the data stream without sending an end-of-packet indicator.

Effect  The receiver of the data stream might report a minor port error. ServerNet communications and performance might be slightly degraded.

Recovery  This is an informational message only; no corrective action is needed.



1049

Processor cpunum: count BTE Timer expiration event(s) have occurred. [ Additional info at the time of the first expiration: Device: devicenum BTE: btenum BTE Hdr1: hdr1val BTE Hdr2: hdr2val BuffAddr: bufferaddress Status: transmitstatus ] [ Additional info at the time of the last expiration: Device: devicenum BTE: btenum BTE Hdr1: hdr1val BTE Hdr2: hdr2val BuffAddr: bufferaddress Status: transmitstatus ]

cpunum

is the processor on which the event occurred.

count

is the number of BTE Timer expiration events that have occurred.

devicenum

is the number of the device on which the first or last event occurred.

btenum

is the number of the BTE on which the first or last BTE Timer expiration event occurred.

hdr1val

is the value of the first BTE header when the first or last BTE Timer expiration event occurred.

hdr2val

is the value of the second BTE header when the first or last BTE Timer expiration event occurred.

bufferaddress

is the address of the buffer when the first or last BTE Timer expiration event occurred.

transmitstatus

is the transmission status of the first or last BTE Timer expiration event.

Cause  A packet transmission did not complete, and normal software and hardware timer functions did not detect the error.

Effect  The data was not completely transmitted. Data transmissions to multiple destinations on the same fabric will time out. A higher-level software transmission protocol might detect that the data did not arrive and attempt recovery.

Recovery  This is an informational message only; no corrective action is needed.



1050

Processor cpunum: count Loss of LinkAlive event(s) detected on port fabric. [ Additional info at the time of the first event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ] [ Additional info at the time of the last event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ]

cpunum

is the processor on which the event occurred.

count

is the number of Loss of LinkAlive events that have been detected on the port fabric.

fabric

is the number of the port on which the event occurred.

status

is the status of the LLP register at the time of the first or last Loss of LinkAlive event.

configreg

identifies the configuration register at the time of the first or last Loss of LinkAlive event.

eventreg

identifies the event register at the time of the first or last Loss of LinkAlive event.

Cause  ServerNet hardware detected that one of the transmission links has become inoperative.

Effect  Data cannot be transmitted on the specified port until the link is recovered.

Recovery  There is no user-level recovery. ServerNet hardware and software monitor the status of the port and automatically recover the port when it is operational and can transmit data without errors.



1051

Processor cpunum: Excessive Minor Port errors have caused a major error on port fabric. [ Additional info at the time of the first event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ] [ Additional info at the time of the last event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ]

cpunum

is the processor on which the event occurred.

fabric

is the number of the port on which the event occurred.

status

is the status of the LLP register at the time of the first or last minor port error.

configreg

identifies the configuration register at the time of the first or last minor port error.

eventreg

identifies the event register at the time of the first or last minor port error.

Cause  Excessive minor port errors caused the ServerNet hardware to shut down the port to protect data integrity.

Effect  Data cannot be transmitted on the specified port until the link is recovered.

Recovery  There is no user-level recovery. ServerNet hardware and software monitor the status of the port and automatically recover the port when it is operational and can transmit data without errors.



1052

Processor cpunum: Outbound Forward Progress error on port fabric. [ Additional info at the time of the first event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ] [ Additional info at the time of the last event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ]

cpunum

is the processor on which the event occurred.

fabric

is the number of the port on which the event occurred.

status

is the status of the LLP register at the time of the first or last Outbound Forward Progress error.

configreg

identifies the configuration register at the time of the first or last Outbound Forward Progress error.

eventreg

identifies the event register at the time of the first or last Outbound Forward Progress error.

Cause  ServerNet hardware cannot achieve satisfactory progress when transmitting data on the specified port. The reason might be congestion in the fabric.

Effect  The packet being transmitted is truncated. The receiver might report a minor port error. Data cannot be transmitted on the specified port until the link is recovered.

Recovery  There is no user-level recovery. ServerNet hardware and software monitor the status of the port and automatically recover the port when it is operational and can transmit data without errors.

Scan the event logs for other events that might indicate hardware failure that could have caused the congestion. If possible, reconfigure your system and software to reduce communication needs between nodes.



1053

Processor cpunum: Inbound Forward Progress error on port fabric. [ Additional info at the time of the first event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ] [ Additional info at the time of the last event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ]

cpunum

is the processor on which the event occurred.

fabric

is the number of the port on which the event occurred.

status

is the status of the LLP register at the time of the first or last Inbound Forward Progress error.

configreg

identifies the configuration register at the time of the first or last Inbound Forward Progress error.

eventreg

is the value of the event register at the time of the first or last Inbound Forward Progress error.

Cause  ServerNet hardware cannot achieve satisfactory progress when receiving data on the specified port and has brought the link down. The reason is usually a problem in the local ServerNet hardware or timer settings, not congestion in the fabric.

Effect  The port is down. Data in the receiving buffer is discarded, possibly causing timeouts at the sender's end.

Recovery  There is no user-level recovery. ServerNet hardware and software monitor the status of the port and automatically recover the port when it is operational and can transmit data without errors.



1054

Processor cpunum: ServerNet link layer SelfCheck error(s) on port fabric. [ Additional info at the time of the first event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ] [ Additional info at the time of the last event: LLP Status: status SNet Config Reg: configreg Link Event: eventreg ]

cpunum

is the processor on which the event occurred.

fabric

is the number of the port on which the event occurred.

status

is the status of the LLP register at the time of the first or last SelfCheck error.

configreg

identifies the configuration register at the time of the first or last SelfCheck error.

eventreg

identifies the event register at the time of the first or last SelfCheck error.

Cause  ServerNet hardware reported a SelfCheck condition in its link portion.

Effect  The port is down. Incoming data is discarded. Outgoing data is truncated. ServerNet hardware that is exchanging data with this port is subject to timeouts and minor link errors.

Recovery  After the hardware error condition is resolved, use SCF commands to reestablish connectivity on the port.



1055

Processor cpunum: count Packet Toss errors have occurred. [ Additional info at the time of the first error: LLP Status: status Transmit Buffer Watermark: watermark Toss Timeout: timeout Number of packets tossed: packet-count ] [ Additional info at the time of the last error: LLP Status: status Transmit Buffer Watermark: watermark Toss Timeout: timeout Number of packets tossed: packet-count ]

cpunum

is the processor on which the event occurred.

count

is the number of Packet Toss errors that have occurred.

status

is the status of the LLP register at the time of the first or last Packet Toss error.

watermark

identifies the Transmit Buffer Watermark at the time of the first or last Packet Toss error.

timeout

is the value of the toss timeout register.

packet-count

is the number of packets that had been tossed at the time of the first or last error.

Cause  The probable cause is congestion in the fabric.

Effect  The tossed packets are not received. The sender might experience timeouts. Higher-level software probably recovers and resends the data.

Recovery  This is an informational message only; no corrective action is needed.



1056

Processor cpunum: count Congestion NACKs seen on port fabric. [ Additional info at the time of the first event: Nack Data: data Source ID: srcid Destination ID: destid ] [ Additional info at the time of the last event: Nack Data: data Source ID: srcid Destination ID: destid ]

cpunum

is the processor on which the event occurred.

count

is the number of Congestion NACK events that have been detected on the port fabric.

fabric

is the number of the port on which the event occurred.

data

is the NACK data value.

srcid

is the ServerNet source ID of the packet that was not acknowledged.

destid

is the ServerNet destination ID of the packet that was not acknowledged.

Cause  A ServerNet router detected a minor congestion condition. The ServerNet node is not necessarily the cause of the congestion.

Effect  Packet routing might be delayed. Timeout conditions might occur when sending data.

Recovery  This is an informational message only; no corrective action is needed.



1057

Processor cpunum: count Congestion Link Down NACKs seen on port fabric. [ Additional info at the time of the first event: Nack Data: data Source ID: srcid Destination ID: destid ] [ Additional info at the time of the last event: Nack Data: data Source ID: srcid Destination ID: destid ]

cpunum

is the processor on which the event occurred.

count

is the number of Congestion Link Down NACK events that have been detected on the port fabric.

fabric

is the number of the port on which the event occurred.

data

is the NACK data value.

srcid

is the ServerNet source ID of the packet that was not acknowledged.

destid

is the ServerNet destination ID of the packet that was not acknowledged.

Cause  A ServerNet router detected a major congestion condition. The ServerNet node is not necessarily the cause of the congestion.

Effect  Data cannot be transmitted and routed on the indicated port.

Recovery  Remove the source of congestion and recover the affected port.
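When looking for the source of congestion, it can help to total the NACK counts from several 1056 and 1057 events by source and destination ID. The following Python sketch is an illustration only. It assumes the events have already been parsed into dictionaries whose keys are named after the variables in these messages. The function name nack_hotspots and the sample records are hypothetical. Because each message reports IDs only for the first and last events, the totals are approximate, and the reporting node is not necessarily the cause of the congestion.

```python
from collections import Counter


def nack_hotspots(events, top=3):
    """Sum NACK counts per (srcid, destid) pair and return the busiest pairs first."""
    totals = Counter()
    for event in events:
        # Assumption: srcid and destid are taken from the last-event block of each message.
        totals[(event["srcid"], event["destid"])] += event["count"]
    return totals.most_common(top)


# Hand-written sample records (not captured from a real system).
SAMPLE_NACKS = [
    {"cpunum": 0, "fabric": 0, "count": 4, "srcid": "0x0100", "destid": "0x0200"},
    {"cpunum": 1, "fabric": 0, "count": 9, "srcid": "0x0300", "destid": "0x0200"},
    {"cpunum": 0, "fabric": 1, "count": 2, "srcid": "0x0100", "destid": "0x0200"},
]
```

A destination ID that appears in most of the busiest pairs is a reasonable first place to look, together with any other events logged for that path.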



1058

Processor cpunum: There has been an error in the ServerNet Software error handling interface. Port: portnum Reason: reason Error Type: errtype Location: location Error Data: data

cpunum

is the processor on which the event occurred.

portnum

is the number of the port on which the event occurred.

reason

is the reason for the error.

errtype

is the error type.

location

is the location of the error.

data

is a data value that depends on the error type.

Cause  ServerNet error-logging software detected either an interface error or an internal error.

Effect  An expected event might not be generated properly.

Recovery  This is an informational message only; no corrective action is needed.



1059

Processor cpunum: A single ServerNet symbol was sent on port fabric. The Control Register is ctlreg, the data sent was symboldata.

cpunum

is the processor on which the event occurred.

fabric

is the number of the port on which the event occurred.

ctlreg

identifies the control register.

symboldata

is the single symbol that was transmitted.

Cause  The ServerNet hardware detected the transmission of a Single Symbol, which is an abnormal condition except during low-level hardware tests.

Effect  The symbol was transmitted. This is usually benign, but might cause ServerNet anomalies.

Recovery  This is an informational message only; no corrective action is needed.



1060

Processor cpunum: An Old Style IBC symbol was received on port fabric. The IBC data received is ibcdata.

cpunum

is the processor on which the event occurred.

fabric

is the number of the port on which the event occurred.

ibcdata

is the IBC symbol that was received.

Cause  The ServerNet hardware detected the receipt of an old-style in-band control (IBC) symbol, which is abnormal.

Effect  None

Recovery  This is an informational message only; no corrective action is needed.



1061

Processor cpunum A exceptiontype was seen on port fabric. [ Interrupt Source Register: sourcereg ] Cause Register: causereg, HW Exception Register: exceptionreg

cpunum

is the processor on which the event occurred.

exceptiontype

is the exception type.

fabric

is the number of the port on which the event occurred.

sourcereg

identifies the source register.

causereg

identifies the cause register.

exceptionreg

identifies the exception register.

Cause  ServerNet software determined that an exception condition was spurious because additional exception data was either missing or inconsistent.

Major port error handling modifies some port error register values such that they affect the handling of subsequent minor port errors, which can cause spurious minor port errors.

Effect  The spurious exception is ignored. Its presence might indicate faulty hardware or software.

Recovery  This is an informational message only; no corrective action is needed.



1063

Processor cpunum: errors soft memory error(s) detected in ServerNet data transfers. [ Transfer [ Source-ID: sourceID, ] Dest-ID: destinationID ]

cpunum

is the processor on which the event occurred.

errors

is the number of soft memory errors that occurred since this event was last reported.

sourceID

is the ServerNet ID of the adapter or processor that sent the transfer.

destinationID

is the ServerNet ID of the adapter or processor that received the transfer.

Cause  Mismatched data indicates that one or more soft memory errors occurred during a transfer.

Effect  The data path is checked and validated. If no other errors are detected, the transfer is retried. If there are additional errors, the ServerNet software attempts to identify the cause of those errors.

Recovery  This is an informational message only; no corrective action is needed.



1064

Processor cpunum: ServerNet is approaching its capacity at capacity_loc. [SNet client was notified via timeout status.]

cpunum

is the processor on which the event occurred.

capacity_loc

is the ServerNet router port that is approaching its capacity.

Cause  ServerNet hardware notified ServerNet software that the hardware is approaching its capacity and is taking action to alleviate that condition.

Effect  The alleviation mechanism might cause some data flowing through the ServerNet fabric to be truncated or returned to the sender, in which case the ServerNet and upper-level software resends the data.

Recovery  This is an informational message only; no corrective action is needed.
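The Recovery entries for messages 1050 through 1064 fall into three groups: informational messages, messages for which the port is recovered automatically, and messages that call for operator action. The following Python sketch shows one way a monitoring script might record that grouping. It is an illustration only. The group labels and the function name recovery_class are hypothetical. The assignments are taken from the Recovery entries in this chapter.

```python
# Recovery group for each SNT message from 1050 through 1064 (1062 is not described).
RECOVERY = {
    1050: "automatic",      # no user-level recovery; the port is recovered automatically
    1051: "automatic",
    1052: "automatic",      # also scan the event logs for related hardware failures
    1053: "automatic",
    1054: "operator",       # use SCF commands after the hardware error is resolved
    1055: "informational",
    1056: "informational",
    1057: "operator",       # remove the source of congestion and recover the port
    1058: "informational",
    1059: "informational",
    1060: "informational",
    1061: "informational",
    1063: "informational",
    1064: "informational",
}


def recovery_class(event_number):
    """Return the recovery group for an SNT event number, or "unknown" if not listed."""
    return RECOVERY.get(event_number, "unknown")
```

A script could use recovery_class to raise an alert only for the "operator" group and to log the other groups for later review.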