Operator Messages Manual

Chapter 13 CMAP (Complex Manager Auxiliary Process) Event Messages

The messages in this chapter are generated by the Complex Manager Auxiliary Process (CMAP) subsystem. The subsystem ID displayed by these messages includes CMAP as the subsystem name.

NOTE: Negative-numbered messages are common to most subsystems. If you receive a negative-numbered message that is not described in this chapter, see Chapter 15.


100

On CPU cpu, the PE on slice slice (LID lid) has stopped Slice module tracking ID : smtid LSU0 module tracking ID : lsu0id LSU1 module tracking ID: lsu1id Stop code : stopcode

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

stopcode

is the code for the reason that the PE stopped. Values are defined by the millicode.

Cause  The millicode stopped a PE for the reason that stopcode specifies.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.
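A monitoring script that receives the text of message 100 might split it into the fields described above. The following is a minimal sketch, not a documented interface: it assumes the message arrives as a single string laid out like the template, and that each field value is a token without embedded spaces. The sample values are invented for illustration; the manual does not specify the format of tracking IDs or stop codes, so every field is kept as an opaque string.

```python
import re

# Pattern follows the message 100 template. Field values are captured as
# opaque strings because their formats are not specified in this chapter.
MSG_100 = re.compile(
    r"On CPU (?P<cpu>\S+), the PE on slice (?P<slice>\S+) "
    r"\(LID (?P<lid>\S+)\) has stopped\s+"
    r"Slice module tracking ID\s*:\s*(?P<smtid>\S+)\s+"
    r"LSU0 module tracking ID\s*:\s*(?P<lsu0id>\S+)\s+"
    r"LSU1 module tracking ID\s*:\s*(?P<lsu1id>\S+)\s+"
    r"Stop code\s*:\s*(?P<stopcode>\S+)"
)


def parse_msg_100(text):
    """Return the message 100 fields as a dict, or None if the text does not match."""
    match = MSG_100.search(text)
    return match.groupdict() if match else None


# Invented sample values, for illustration only.
sample = (
    "On CPU 2, the PE on slice A (LID 5) has stopped "
    "Slice module tracking ID : 001234 "
    "LSU0 module tracking ID : 005678 "
    "LSU1 module tracking ID: 009abc "
    "Stop code : 17"
)
fields = parse_msg_100(sample)
print(fields["cpu"], fields["slice"], fields["stopcode"])
```

The same approach extends to messages 101 through 107, which share the cpu, slice, lid, and tracking ID fields and differ only in the verb phrase and trailing fields.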



101

On CPU cpu, PE on slice slice (LID lid) issued a request to join Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

Cause  A PE’s state changed from stopped to primitive.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



102

On CPU cpu, the PE on slice slice (LID lid) has been reintegrated Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Reintegration Retries : retries

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

retries

is the number of tries required to reintegrate the PE.

Cause  A PE has been reintegrated into a processor.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



103

On CPU cpu, reintegration of PE on slice slice (LID lid) started Target slice module tracking ID: target-smtid Source slice module tracking ID: source-smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Reintegration reason : reason

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

target-smtid

is the tracking ID of the target NonStop Blade Element.

source-smtid

is the tracking ID of the source NonStop Blade Element.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

reason

is the reason that reintegration started. Possible reasons include:

  • PE failed.

  • NonStop Blade Element was reset.

  • PE shares a memory copy channel with a PE that needed to be reintegrated; therefore, this PE must also be reintegrated.

  • Reintegration API was called.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



104

On CPU cpu, reintegration of PE on slice slice (LID lid) failed Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Reintegration failure reason : reason Reintegration Retries : retries

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE failed to reintegrate.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

reason

is the reason that reintegration failed. Possible reasons include:

  • PE could not be isolated.

  • Start reintegration phase failed.

  • Target PE could not be made quiescent.

  • Source FPGA 3 hardware could not be initialized.

  • Target FPGA 3 hardware could not be initialized.

  • Memory copy failed.

  • Final reintegration phase failed.

  • Processor being reintegrated failed.

  • CRL controlling reintegration failed.

retries

is the number of attempts made so far to reintegrate the PE (including this try).

Cause  The cause is indicated by reason.

Effect  None.

Recovery  CMAP tries to reintegrate the PE again, possibly after resetting the NonStop Blade Element where the PE resides. Repeated reintegration attempts become less frequent. If the problem persists, contact your service provider.
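The Recovery text for message 104 does not define how many failures count as a persistent problem. A site could track failures per PE and flag the PE once a locally chosen limit is reached. The sketch below is one possible approach under that assumption: the threshold of 3, the class name, and the method names are all hypothetical and do not come from CMAP.

```python
from collections import Counter

# Hypothetical site policy: the manual says only "if the problem persists".
PERSIST_THRESHOLD = 3


class ReintegrationFailureTracker:
    """Count message 104 occurrences per PE, keyed by (cpu, slice, lid)."""

    def __init__(self, threshold=PERSIST_THRESHOLD):
        self.threshold = threshold
        self.failures = Counter()

    def record_failure(self, cpu, slice_id, lid):
        """Record one message 104; return True once the PE reaches the threshold."""
        key = (cpu, slice_id, lid)
        self.failures[key] += 1
        return self.failures[key] >= self.threshold

    def record_success(self, cpu, slice_id, lid):
        """Clear the count when message 102 reports that the PE was reintegrated."""
        self.failures.pop((cpu, slice_id, lid), None)


tracker = ReintegrationFailureTracker(threshold=2)
print(tracker.record_failure(1, "A", 3))  # first failure for this PE
print(tracker.record_failure(1, "A", 3))  # second failure reaches the threshold
```

Resetting the count on message 102 keeps an old, recovered failure from contributing to a later escalation.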



105

On CPU cpu, reintegration of PE on slice slice (LID lid) is deferred Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Reintegration defer reason: reason

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

reason

is the reason that reintegration was deferred (see Table 13-1).

Table 13-1 Reason That Reintegration Was Deferred (105)

Value of reason                  Meaning
ZCMP-ENM-DEFER-REINT-DISABLED    PE reintegration is disabled.
ZCMP-ENM-DEFER-REINT-HALT        Processor would halt.
ZCMP-ENM-DEFER-REINT-SOURCE      Copy source is not available.

Cause  See Table 13-1.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.
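The reason values in Table 13-1 can be turned into readable text with a simple lookup. This is a minimal sketch that assumes the reason arrives as one of the three token names listed in the table; the function name and the fallback wording are illustrative only.

```python
# Reason tokens and meanings copied from Table 13-1.
DEFER_REASONS = {
    "ZCMP-ENM-DEFER-REINT-DISABLED": "PE reintegration is disabled.",
    "ZCMP-ENM-DEFER-REINT-HALT": "Processor would halt.",
    "ZCMP-ENM-DEFER-REINT-SOURCE": "Copy source is not available.",
}


def describe_defer_reason(reason):
    """Return the Table 13-1 meaning for a reason token, or a fallback string."""
    return DEFER_REASONS.get(reason, f"Unrecognized defer reason: {reason}")


print(describe_defer_reason("ZCMP-ENM-DEFER-REINT-HALT"))
```

Returning a fallback string rather than raising an error lets a log viewer keep running if it meets a value that is not in the table.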



106

On CPU cpu, reintegration of PE on slice slice (LID lid) is disabled Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Reintegration Disable reason : reason

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

reason

is the reason that reintegration was disabled. Possible reasons include:

  • Disable-reintegration API called.

  • Disable-reintegration-immediately API called.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



107

On CPU cpu, reintegration of PE on slice slice (LID lid) is enabled Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Reintegration enable reason : reason

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

reason

is the reason that reintegration was enabled. Possible reasons include:

  • Enable-reintegration time limit expired.

  • Enable-reintegration API called.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



108

New CRL in complex complex Slice module tracking ID : smtid Slice module tracking ID1 : smtid1 Slice module tracking ID2 : smtid2

complex

is the NonStop Blade Complex ID of the NonStop Blade Complex.

smtid

is the tracking ID of the first NonStop Blade Element in the NonStop Blade Complex.

smtid1

is the tracking ID of the second NonStop Blade Element in the NonStop Blade Complex.

smtid2

is the tracking ID of the third NonStop Blade Element in the NonStop Blade Complex.

Cause  Either:

  • The processor containing the previous NonStop Blade Complex leader stopped.

  • A cabling problem that made the NonStop Blade Complex invalid was corrected.

Effect  The new NonStop Blade Complex leader is responsible for handling NonStop Blade Complex-wide events.

Recovery  This is an informational message only; no corrective action is needed.



109

Slice reset has started Slice ID : slice Slice module tracking ID : smtid Reset reason : reason Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

slice

is the ID of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element.

reason

is the reason that the NonStop Blade Element was reset. Possible reasons include:

  • PE reinitialization failed.

  • Previous NonStop Blade Element reset failed.

  • NonStop-Blade-Element-reset API called.

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.
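Messages 109 through 114, 117, and 124 end with an Affected CPUs list of up to four CPU and LID pairs, or None. The sketch below shows one way to extract those pairs. It assumes the brackets in the template mark optional items and do not appear in the actual message text, and that each value is a token without spaces; the sample text is invented.

```python
import re

# One "CPU x, LID y" pair. Values are kept as strings because the manual
# does not state how they are formatted.
PAIR = re.compile(r"CPU\s+(\S+?),\s*LID\s+(\S+)")


def parse_affected_cpus(text):
    """Return a list of (cpu, lid) string pairs that follow 'Affected CPUs'."""
    _, _, tail = text.partition("Affected CPUs")
    return PAIR.findall(tail)


# Invented sample text, for illustration only.
sample = (
    "Slice reset has started Slice ID : 1 Slice module tracking ID : 001234 "
    "Reset reason : Previous NonStop Blade Element reset failed. "
    "Affected CPUs : CPU 0, LID 4 CPU 1, LID 5"
)
print(parse_affected_cpus(sample))
```

Searching only the text after "Affected CPUs" avoids matching the word CPU elsewhere in the message. A list that contains only None yields an empty result.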



110

Slice reset has completed Slice ID : slice Slice module tracking ID : smtid Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

slice

is the ID of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element.

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  A NonStop Blade Element reset completed.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



111

Slice reset has failed Slice ID : slice Slice module tracking ID : smtid Reset failure reason : reason Reset retry count : retries Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

slice

is the ID of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element.

reason

is the reason that the reset failed. Possible reasons include:

  • Could not isolate NonStop Blade Element.

  • Could not initiate NonStop Blade Element reset.

  • NonStop Blade Element-reset time limit expired.

  • Could not initialize FPGA 3 hardware.

retries

is the number of attempts made so far to reset the NonStop Blade Element (including this try).

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  The cause is indicated by reason.

Effect  Every processor that has a PE on this NonStop Blade Element continues to lose some redundancy.

Recovery  CMAP tries to reset the NonStop Blade Element again. Repeated reset attempts become less frequent. If the problem persists, contact your service provider.



112

Slice reset has been deferred Slice ID : slice Slice module tracking ID : smtid Reset defer reason : reason Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

slice

is the ID of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element.

reason

is the reason that the reset was deferred. Possible reasons include:

  • NonStop Blade Element resets are disabled.

  • Reset would halt processor.

  • Copy source for reset is unavailable.

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



113

Slice reset has been disabled Slice ID : slice Slice module tracking ID : smtid Reset disable reason : reason Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

slice

is the ID of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element.

reason

is the reason that the reset was disabled. Possible reasons include Disable-NonStop-Blade-Element-reset API called.

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



114

Slice reset has been enabled Slice ID : slice Slice module tracking ID : smtid Reset enable reason : reason Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

slice

is the ID of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element.

reason

is the reason that the reset was enabled. Possible reasons include:

  • Reset-disabled timer expired.

  • Enable-NonStop-Blade-Element-reset API called.

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



115

CMAP Process internal error Internal error : error Internal error detail : detail

error

is the value of the internal error.

detail

is the detail of the internal error.

Cause  The reason that error and detail specify (usually an assertion failure).

Effect  The processor on which CMAP is running halts.

Recovery  Contact your service provider.



116

On CPU cpu-1, the PE on slice slice-1 (LID lid-1) cannot be reintegrated Reason : reason Slice module tracking ID : smtid-1 CPU : cpu-2 LID : lid-2 Slice 1 : slice-2 Slice module tracking ID 1 : smtid-2

cpu-1

is the processor where the PE resides.

slice-1

is the NonStop Blade Element where the PE resides.

lid-1

is the LID of the PE.

reason

is the reason that the PE cannot be reintegrated. Possible reasons include:

  • Reintegration would halt processor.

  • PE not in state to be reintegrated.

  • FPGA 3 logic error occurred.

  • Reintegrations are disabled.

  • Source PE for reintegrations is down.

  • PE in HSS - cannot be reintegrated.

  • PE is running the NonStop operating system.

smtid-1

is the tracking ID of the NonStop Blade Element where the PE resides.

cpu-2

is the processor containing the PE that cannot be stopped without halting that processor.

lid-2

is the LID of the PE that cannot be stopped without halting the processor.

slice-2

if reason is DOWN, the value of this token is the ID of the NonStop Blade Element that is preventing reintegration of the PE; otherwise, the value is 255 (decimal).

smtid-2

if reason is DOWN, the value of this token is the tracking ID of the NonStop Blade Element that is preventing reintegration of the PE; otherwise, the value is 255 (decimal).

Cause  The cause is indicated by reason.

Effect  None.

Recovery  If the problem persists, contact your service provider.
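The slice-2 and smtid-2 tokens of message 116 carry the value 255 when no NonStop Blade Element is blocking reintegration. A script that reads these tokens can treat 255 as "not applicable". The sketch below assumes both tokens have already been converted to integers; the function and constant names are hypothetical.

```python
# Value the manual gives for slice-2 and smtid-2 when reason is not DOWN.
NOT_APPLICABLE = 255


def blocking_element(slice_2, smtid_2):
    """Return (slice, tracking ID) of the blocking Blade Element, or None."""
    if slice_2 == NOT_APPLICABLE and smtid_2 == NOT_APPLICABLE:
        return None
    return (slice_2, smtid_2)


print(blocking_element(255, 255))
print(blocking_element(1, 4660))
```

Checking both tokens together reduces the chance of discarding a real tracking ID that happens to equal 255.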



117

After resetting a slice a PE failed to enter primitive state Reason : reason Slice ID : slice Slice module tracking ID : smtid Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

reason

is the reason that the PE cannot be reset. Possible reasons include:

  • Reset would halt processor.

  • No PE available for reset.

  • FPGA 3 logic error occurred.

  • Resets are disabled.

  • No source for reintegration.

  • Reintegration disabled for one or more PEs.

  • One or more PEs are running the NonStop operating system.

slice

is the NonStop Blade Element that cannot be reset.

smtid

is the tracking ID of the NonStop Blade Element that cannot be reset.

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  If the problem persists, contact your service provider.



118

Operation aborted Operation : operation Abort reason : reason Affected Slice Tracking ID : smtid [ Affected CPU : cpu Affected LID : lid ]

operation

is the operation that aborted. Possible values include:

  • PE reintegration

  • NonStop Blade Element reset

  • Processor reload

  • Firmware update

  • Firmware scrub

reason

is the reason that operation aborted. Possible reasons include:

  • Invalid NonStop Blade Complex configuration

  • Invalid CPU configuration

  • Disable request received

smtid

is the tracking ID of the affected NonStop Blade Element.

cpu

is the processor where the PE resides.

lid

is the LID of the affected PE on the NonStop Blade Element.

Cause  Either:

  • A CMAP client explicitly requested that the operation be aborted.

  • An error event required that the operation be aborted.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



119

The complex complex leader running in CPU cpu has failed Slice module tracking ID : smtid-1 Slice module tracking ID1 : smtid-2 Slice module tracking ID2 : smtid-3

complex

is the NonStop Blade Complex ID of the NonStop Blade Complex.

cpu

is the processor where the CRL failed.

smtid-1

is the tracking ID of the first NonStop Blade Element in the NonStop Blade Complex.

smtid-2

is the tracking ID of the second NonStop Blade Element in the NonStop Blade Complex.

smtid-3

is the tracking ID of the third NonStop Blade Element in the NonStop Blade Complex.

Cause  The processor hosting a CRL halted.

Effect  The remaining processors running the NonStop operating system elect a new CRL.

Recovery  This is an informational message only; no corrective action is needed.



120

CPU cpu has been declared invalid Complex ID : complex Slice module tracking ID : smtid-1 Slice module tracking ID1 : smtid-2 Slice module tracking ID2 : smtid-3

cpu

is the invalid processor.

complex

is the NonStop Blade Complex ID of the NonStop Blade Complex.

smtid-1

is the tracking ID of the first NonStop Blade Element in the NonStop Blade Complex.

smtid-2

is the tracking ID of the second NonStop Blade Element in the NonStop Blade Complex.

smtid-3

is the tracking ID of the third NonStop Blade Element in the NonStop Blade Complex.

Cause  PEs report information that is inconsistent, either:

  • Within the processor

  • With the other members of the processor’s NonStop Blade Complex

The inconsistency can be due to any of these conditions (reported by one or more ZCMP events):

  • LSU tracking ID mismatches

  • NonStop Blade Element tracking ID mismatches

  • PE LID mismatches

Effect  PEs on the affected processor cannot be reintegrated. NonStop Blade Element resets cannot be performed without compromising this processor.

Recovery  Correct the cabling problem.



121

CPU cpu has been declared valid Complex ID : complex Slice module tracking ID : smtid-1 Slice module tracking ID1 : smtid-2 Slice module tracking ID2 : smtid-3

cpu

is the valid processor.

complex

is the NonStop Blade Complex ID of the NonStop Blade Complex.

smtid-1

is the tracking ID of the first NonStop Blade Element in the NonStop Blade Complex.

smtid-2

is the tracking ID of the second NonStop Blade Element in the NonStop Blade Complex.

smtid-3

is the tracking ID of the third NonStop Blade Element in the NonStop Blade Complex.

Cause  The problem that invalidated a processor was corrected.

Effect  The processor can reintegrate PEs and, if necessary, be reset.

Recovery  This is an informational message only; no corrective action is needed.



122

Halted state services error in the PE on slice slice (LID lid) Slice module tracking ID : smtid HSS Operation : operation HSS status : status

slice

is the NonStop Blade Element that contains the PE.

lid

is the LID of the affected PE.

smtid

is the tracking ID of the NonStop Blade Element.

operation

is the HSS operation being requested. Possible values include:

  • HSS ping

  • Isolate NonStop Blade Element

  • Make target quiescent

  • Finalize reintegration

  • Abort reintegration

status

is the status of HSS. Possible values include:

  • HSS communication time limit expired.

  • HSS rejected request.

Cause  Probably a PE hardware fault.

Effect  The affected processor has restricted ability to reintegrate PEs.

Recovery  CMAP resets the NonStop Blade Element, if possible. If the problem persists, contact your service provider.



123

Unable to isolate LID lid, in slice slice of complex complex Slice module tracking ID : smtid Isolate error : error

lid

is the LID of the PE that cannot be isolated.

slice

is the NonStop Blade Element that contains the PE.

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid

is the tracking ID of the NonStop Blade Element that cannot be isolated.

error

is the reason that the PE or NonStop Blade Element cannot be isolated. Possible values include:

  • Isolation request time limit expired.

  • Isolation request was rejected.

Cause  The cause is indicated by error.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



124

Unable to perform slice reset from cpu cpu Slice ID : slice Slice module tracking ID : smtid Reset error : error Affected CPUs : CPU cpu-1, LID lid-1 [ CPU cpu-2, LID lid-2 [ CPU cpu-3, LID lid-3 [ CPU cpu-4, LID lid-4 [ None ] ] ] ]

cpu

is the processor where the NonStop Blade Element that cannot be reset resides.

slice

is the NonStop Blade Element that cannot be reset.

smtid

is the tracking ID of the NonStop Blade Element that cannot be reset.

error

is the reason that reset could not start. Possible values include:

  • PE communication time limit expired.

  • PE rejected request.

cpu-n

is the processor of PE n on the NonStop Blade Element.

lid-n

is the LID of PE n on the NonStop Blade Element.

Cause  The cause is indicated by error.

Effect  None.

Recovery  CMAP tries to recover. If the problem persists, contact your service provider.



125

On CPU cpu, unable to quiesce the PE on slice slice (LID lid) for reintegration Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Quiesce error : reason

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the affected NonStop Blade Element.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

reason

is the reason that the PE could not be made quiescent. Possible reasons include:

  • PE communication time limit expired.

  • PE rejected request.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



126

Echo initialization on slice slice failed Source Slice module tracking ID: source-smtid Echo Error Status : source-echo-status Echo Error Register : source-echo-error Echo info : source-echo-info Spectator Slice module tracking ID: spectator-smtid Echo Error Status : spectator-echo-status Echo Error Register : spectator-echo-error Echo info : spectator-echo-info Echo Initialization failure reason : reason

slice

is the NonStop Blade Element that was to be reset.

source-smtid

is the tracking ID of the source NonStop Blade Element.

source-echo-status

is the status of the FPGA 3 for the source NonStop Blade Element (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

source-echo-error

is the contents of the FPGA 3 error register for the source NonStop Blade Element (which might provide FPGA 3 hardware error information).

source-echo-info

is additional FPGA 3 information returned for the source NonStop Blade Element in hexadecimal (contents of the echoAdditionalInfo field of the echoInfo structure).

spectator-smtid

is the tracking ID of the spectator NonStop Blade Element.

spectator-echo-status

is the status of the FPGA 3 for the spectator NonStop Blade Element (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

spectator-echo-error

is the contents of the FPGA 3 error register for the spectator NonStop Blade Element (which might provide FPGA 3 hardware error information).

spectator-echo-info

is additional FPGA 3 information returned for the spectator NonStop Blade Element in hexadecimal (contents of the echoAdditionalInfo field of the echoInfo structure).

reason

is the result of the FPGA 3 initialization. Possible values include:

  • FPGA 3 hardware error occurred.

  • FPGA 3 hardware could not be accessed.

  • FPGA 3 hardware initialization time limit expired.

Cause  An error occurred while trying to initialize FPGA 3 hardware in preparation for reintegrating a PE or resetting a NonStop Blade Element.

Effect  If FPGA 3 hardware is not working correctly, PEs that are on the NonStop Blade Element with the FPGA 3 hardware cannot be reintegrated.

Recovery  Contact your service provider.



127

Echo target initialization for PE LID lid-1 on slice slice failed Slice module tracking ID: smtid Echo Channel : lid-2 Echo Initialization failure reason : reason

lid-1

is the LID of the affected PE.

slice

is the affected NonStop Blade Element.

smtid

is the tracking ID of the affected NonStop Blade Element.

lid-2

is the LID of the affected channel.

reason

is the initialization error that occurred. Possible reasons include:

  • PE communication time limit expired.

  • PE rejected request.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



128

On CPU cpu, reintegrate of PE, slice slice (LID lid) final step failed Slice module tracking ID : smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id Reintegration failure reason : reason

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE failed to reintegrate.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

reason

is the reason that the final phase of the reintegration failed. Possible reasons include:

  • PE could not be isolated.

  • PE communication time limit expired.

  • PE rejected request.

  • Unable to synchronize PEs for final step.

  • Unable to turn off FPGA 3 copy mode.

Cause  The cause is indicated by reason.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



129

Initiation of a memory copy was unsuccessful Source slice module tracking ID : source-smtid Echo Status Register : 0xsource-echo-status Echo Error Register : 0xsource-echo-error Destination slice module tracking ID : destination-smtid Echo Status Register : 0xdestination-echo-status Echo Error Register : 0xdestination-echo-error Echo Channel : lid Echo Error State : error-state

source-smtid

is the tracking ID of the source NonStop Blade Element of the copy operation.

source-echo-status

is the status of the FPGA 3 for the source NonStop Blade Element (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

source-echo-error

is the contents of the FPGA 3 error register for the source NonStop Blade Element (which might provide FPGA 3 hardware error information).

destination-smtid

is the tracking ID of the destination NonStop Blade Element of the copy operation.

destination-echo-status

is the status of the FPGA 3 for the destination NonStop Blade Element (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

destination-echo-error

is the contents of the FPGA 3 error register for the destination NonStop Blade Element (which might provide FPGA 3 hardware error information).

lid

is the LID of the affected channel.

error-state

is the state of the FPGA 3 copy logic. Possible values include:

  • No errors detected.

  • FPGA 3 errors detected.

Cause  Either:

  • A hardware problem with the FPGA 3 logic

  • Software internal errors

Effect  None.

Recovery  Contact your service provider.
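Message 129 displays the FPGA 3 status and error registers as hexadecimal values with a 0x prefix. Before contacting the service provider, it can help to record which bits are set in each register. The sketch below converts a register token to an integer and lists the set bit positions. It assigns no meaning to individual bits, because this chapter does not define the register layouts; the sample value is invented.

```python
def parse_register(token):
    """Convert a hexadecimal register token such as '0x12' to an integer."""
    return int(token, 16)


def set_bits(value):
    """Return the positions of the bits that are set, lowest bit first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Invented sample value, for illustration only.
status = parse_register("0x12")
print(status, set_bits(status))
```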



130

On CPU cpu, PE on slice slice (LID lid) in error state Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id PE error : error

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

error

is the reason for this message. Possible values include:

  • PE communication time limit expired.

  • PE rejected request.

  • PE LID mismatch.

  • PE memory size mismatch.

  • PE processor type mismatch.

  • PE frontside bus mismatch.

  • PE Primitive Communications Protocol (PCP) revision mismatch.

  • PE Memory Address Registers (MAR) set mismatch.

Cause  The cause is indicated by error.

Effect  The PE cannot be reintegrated or participate in the reintegration of another PE.

Recovery  Correct the hardware error.



131

On CPU cpu, PE on slice slice (LID lid) not responding Slice module tracking ID: smtid LSU0 module tracking ID: lsu0id LSU1 module tracking ID: lsu1id

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0id

is the tracking ID of the first LSU to which the PE connects.

lsu1id

is the tracking ID of the second LSU to which the PE connects.

Cause  A PE is unresponsive.

Effect  The PE cannot be reintegrated or participate in the reintegration of another PE.

Recovery  Correct the hardware error.



132

On CPU cpu, unable to initiate reintegration the PE on slice slice (LID lid) Slice module tracking ID: slice-track-id LSU0 module tracking ID: lsu0-track-id LSU1 module tracking ID: lsu1-track-id Reint Start error : error

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where PE reintegration could not start.

lid

is the LID of the PE whose reintegration could not start.

slice-track-id

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0-track-id

is the tracking ID of the first LSU to which the PE connects.

lsu1-track-id

is the tracking ID of the second LSU to which the PE connects.

error

is the PE error that occurred.

Cause  As indicated by error, either the PE communication time limit expired or the PE rejected the request.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



133

On CPU cpu, PE on slice slice (LID lid) failed to enter state to allow quiesce Slice module tracking ID: slice-track-id LSU0 module tracking ID: lsu0-track-id LSU1 module tracking ID: lsu1-track-id

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where the PE resides.

lid

is the LID of the PE whose reintegration timed out.

slice-track-id

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0-track-id

is the tracking ID of the first LSU to which the PE connects.

lsu1-track-id

is the tracking ID of the second LSU to which the PE connects.

Cause  The PE did not reply to the request to become quiescent, possibly because of a hardware error.

Effect  None.

Recovery  Informational message only; no corrective action is needed.



134

On CPU cpu, PE on slice slice (LID lid) failed to enter state to allow reintegration start Slice module tracking ID: slice-track-id LSU0 module tracking ID: lsu0-track-id LSU1 module tracking ID: lsu1-track-id

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element where PE reintegration timed out.

lid

is the LID of the PE whose reintegration timed out.

slice-track-id

is the tracking ID of the NonStop Blade Element where the PE resides.

lsu0-track-id

is the tracking ID of the first LSU to which the PE connects.

lsu1-track-id

is the tracking ID of the second LSU to which the PE connects.

Cause  PE reintegration could not start, possibly because of a hardware (PE) error.

Effect  None.

Recovery  Informational message only; no corrective action is needed.



135

Reinitialization has started on CPU cpu.

cpu

is the processor for which reinitialization is requested.

Cause  A reinitialization request was received, and reinitialization has started.

Effect  The CPU will be reinitialized.

Recovery  Informational message only; no corrective action is needed.



136

Reinitialization has completed on CPU cpu.

cpu

is the processor on which reinitialization has completed.

Cause  Reinitialization has completed.

Effect  The CPU has been reinitialized.

Recovery  Informational message only; no corrective action is needed.



200

The status of the Echo logic for slice slice has changed. Slice module tracking ID: smtid Echo state : echo-state Echo status register : 0xecho-status Echo error register : 0xecho-error Channel chnl-1, Rcv: 0xrcv-1, Exp: 0xexp-1 Xmit: 0xxmit-1 [ Channel chnl-2, Rcv: 0xrcv-2, Exp: 0xexp-2 Xmit: 0xxmit-2 ]

slice

is the NonStop Blade Element to which the FPGA 3 is connected.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

echo-state

is the state of the FPGA 3 copy logic. Possible values include:

  • No errors detected.

  • FPGA 3 errors detected.

echo-status

is the status of the FPGA 3 (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

echo-error

is the contents of the FPGA 3 error register, which might provide FPGA 3 hardware error information.

chnl‑n

is the ID of channel n.

rcv‑n

is the contents of the receive register for channel n.

exp‑n

is the contents of the expecting register for channel n.

xmit‑n

is the contents of the transmit register for channel n.

Cause  FPGA 3 logic detected a state change.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.
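
When scanning event logs, the per-channel register values in this message can be extracted with a pattern match. The following Python sketch assumes that logged text follows the layout shown above (Channel, Rcv, Exp, and Xmit fields with hexadecimal values) and that channel IDs are decimal numbers. The function name parse_echo_channels and the sample text are illustrative only and are not part of CMAP; actual spacing in logged messages might differ.

```python
import re

# Pattern for one "Channel n, Rcv: 0x.., Exp: 0x.. Xmit: 0x.." group.
# Whitespace and the comma before "Exp" and "Xmit" are optional because
# the exact spacing of logged text is an assumption here.
CHANNEL_RE = re.compile(
    r"Channel\s+(?P<chnl>\d+)\s*,\s*"
    r"Rcv\s*:\s*0x(?P<rcv>[0-9A-Fa-f]+)\s*,?\s*"
    r"Exp\s*:\s*0x(?P<exp>[0-9A-Fa-f]+)\s*,?\s*"
    r"Xmit\s*:\s*0x(?P<xmit>[0-9A-Fa-f]+)"
)

def parse_echo_channels(text):
    """Return one dict per channel group found in the message text."""
    channels = []
    for m in CHANNEL_RE.finditer(text):
        channels.append({
            "channel": int(m.group("chnl")),
            "rcv": int(m.group("rcv"), 16),
            "exp": int(m.group("exp"), 16),
            "xmit": int(m.group("xmit"), 16),
        })
    return channels

# Hypothetical sample text, not captured from a real system.
sample = ("Channel 0, Rcv: 0x1A, Exp: 0x1A Xmit: 0x2B "
          "Channel 1, Rcv: 0x0, Exp: 0x3C Xmit: 0x4D")
channels = parse_echo_channels(sample)
```

Each returned entry holds the channel ID and the three register values as integers, so they can be compared or reformatted as needed.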



201

All PEs on a slice are not reporting the same slice ID Slice module tracking ID : smtid CPU : cpu-1, LID : lid-1, Slice ID : slice-1 [ CPU : cpu-2, LID : lid-2, Slice ID : slice-2 [ CPU : cpu-3, LID : lid-3, Slice ID : slice-3 [ CPU : cpu-4, LID : lid-4, Slice ID : slice-4 ] ] ]

smtid

is the tracking ID of the NonStop Blade Element where the PEs reside.

cpu-n

is the processor of PE n.

lid-n

is the LID of PE n.

slice-n

is the NonStop Blade Element ID of PE n.

Cause  Not all PEs that have the same NonStop Blade Element tracking ID are connected to the same port on each LSU NonStop Blade Complex; therefore, the PEs’ NonStop Blade Element IDs do not match.

Effect  The NonStop Blade Complex is invalid; therefore, CMAP cannot reintegrate its PEs or reset its NonStop Blade Elements.

Recovery  Correct the PE/LSU cabling problem.
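
The condition that this message reports can be illustrated with a short Python sketch. It assumes that each PE report is available as a (tracking ID, CPU, LID, slice ID) tuple; the function name and the sample values are hypothetical, and the sketch is not CMAP's implementation.

```python
from collections import defaultdict

def mismatched_slice_ids(reports):
    """Group PE reports by tracking ID and return the tracking IDs whose
    PEs report more than one slice ID.

    Each report is a (tracking_id, cpu, lid, slice_id) tuple.
    """
    seen = defaultdict(set)
    for tracking_id, _cpu, _lid, slice_id in reports:
        seen[tracking_id].add(slice_id)
    return {tid: sorted(ids) for tid, ids in seen.items() if len(ids) > 1}

# Hypothetical reports: CPU 2 reports slice 1 where the others report 0.
reports = [
    ("T100", 0, 0, 0),
    ("T100", 1, 1, 0),
    ("T100", 2, 2, 1),
    ("T200", 0, 0, 1),
]
result = mismatched_slice_ids(reports)
```

A nonempty result corresponds to the mismatch described in the Cause text; an empty result means every tracking ID maps to a single slice ID.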



202

All PEs on a slice are not reporting the same Node Name and Node number - complex invalid Slice module tracking ID : smtid CPU : cpu-1, LID : lid-1, Slice ID : slice-1 [ CPU : cpu-2, LID : lid-2, Slice ID : slice-2 [ CPU : cpu-3, LID : lid-3, Slice ID : slice-3 [ CPU : cpu-4, LID : lid-4, Slice ID : slice-4 ] ] ]

slice

is the NonStop Blade Element where the PEs reside.

smtid

is the tracking ID of the NonStop Blade Element where the PEs reside.

cpu-n

is the processor of PE n.

lid-n

is the LID of PE n.

slice-n

is the NonStop Blade Element ID of PE n.

Cause  Not all PEs that have the same NonStop Blade Element tracking ID are connected to the same port on each LSU NonStop Blade Complex; therefore, the PEs’ node names and node numbers do not match.

Effect  The NonStop Blade Complex is invalid; therefore, CMAP cannot reintegrate its PEs or reset its NonStop Blade Elements.

Recovery  Correct the PE/LSU cabling problem.



203

In complex complex, all CPUs do not share a common set of slices - complex invalid CPU : cpu LID : lid LSU0 Tracking ID : lsu0id LSU1 Tracking ID : lsu1id Slice 1, Slice Tracking number : stmid-1 [ Slice 2, Slice Tracking number : stmid-2 [ Slice 3, Slice Tracking number : stmid-3 ] ]

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

cpu

is the processor.

lid

is the LID of the PEs that compose the processor.

lsu0id

is the tracking ID of the first LSU to which the processor connects.

lsu1id

is the tracking ID of the second LSU to which the processor connects.

stmid-n

is the NonStop Blade Element tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

Cause  Not all PEs that have the same NonStop Blade Element tracking ID are connected to the same port on each LSU NonStop Blade Complex; therefore, the PEs’ NonStop Blade Element IDs do not match.

Effect  The NonStop Blade Complex is invalid; therefore, CMAP cannot reintegrate its PEs or reset its NonStop Blade Elements.

Recovery  Correct the PE/LSU cabling problem.



204

In complex ID complex, number of PEs reported by each slice does not match - complex invalid Slice 1 module tracking ID : stmid-1, Num Pes : pe-count-1 Slice 2 module tracking ID : stmid-2, Num Pes : pe-count-2 Slice 3 module tracking ID : stmid-3, Num Pes : pe-count-3

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

stmid-n

is the NonStop Blade Element tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

pe-count-n

is the number of unique LIDs (PEs) on NonStop Blade Element n in the NonStop Blade Complex.

Cause  A hardware (NonStop Blade Element) problem (not a cabling problem).

Effect  The NonStop Blade Complex is invalid; therefore, CMAP cannot reintegrate its PEs or reset its NonStop Blade Elements.

Recovery  Replace the NonStop Blade Element.
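
As a minimal sketch of the comparison behind this message, the following Python function checks whether every NonStop Blade Element reports the same PE count. The mapping from tracking ID to PE count is an assumed input format, and the function name is hypothetical.

```python
def pe_counts_match(pe_counts):
    """Return True when every slice reports the same number of PEs.

    pe_counts maps a slice tracking ID to the number of unique LIDs
    (PEs) that the slice reports.
    """
    return len(set(pe_counts.values())) <= 1

# Hypothetical counts: the third slice reports one PE fewer.
consistent = pe_counts_match({"T1": 4, "T2": 4, "T3": 4})
inconsistent = pe_counts_match({"T1": 4, "T2": 4, "T3": 3})
```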



205

CPU cpu reported slice tracking IDs that do not match the tracking IDs for the complex - complex invalid Complex ID : complex Slice module tracking ID : smtid-1 Slice module tracking ID1 : smtid-2 Slice module tracking ID2 : smtid-3

cpu

is the processor that reported unexpected NonStop Blade Element tracking IDs.

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid-n

is the NonStop Blade Element tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

Cause  The PEs composing a processor in a NonStop Blade Complex are cabled to an LSU that has the correct NonStop Blade Element ID but resides in another NonStop Blade Complex.

Effect  The NonStop Blade Complex is invalid; therefore, CMAP cannot reintegrate its PEs or reset its NonStop Blade Elements.

Recovery  Correct the PE/LSU cabling problem.



206

Complex ID complex has been declared invalid Slice module tracking ID : smtid-1 Slice module tracking ID1 : smtid-2 Slice module tracking ID2 : smtid-3

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid-n

is the NonStop Blade Element tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

Cause  A cabling or hardware error.

Effect  The NonStop Blade Complex is invalid; therefore, it cannot reintegrate PEs or reset NonStop Blade Elements.

Recovery  Correct the problem specified in preceding ZCMP event messages.



207

Complex ID complex has been declared valid Slice module tracking ID : smtid-1 Slice module tracking ID1 : smtid-2 Slice module tracking ID2 : smtid-3 Members of the Complex : CPU cpu-1 [ CPU cpu-2 [ CPU cpu-3 [ CPU cpu-4 ] ] ]

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid-n

is the NonStop Blade Element tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

cpu-n

is processor n.

Cause  Either:

  • The first processor in a NonStop Blade Complex was loaded and its configuration is valid.

  • The problem that invalidated a NonStop Blade Complex was corrected, and that NonStop Blade Complex is now valid.

Effect  The NonStop Blade Complex can reintegrate PEs and reset NonStop Blade Elements.

Recovery  This is an informational message only; no corrective action is needed.



208

For slice tracking ID smtid, both echo cables do not come from same slice Channel chnl-1, Rcv: 0xrcv-1, Exp: 0xexp-1 Xmit: 0xxmit-1 [ Channel chnl-2, Rcv: 0xrcv-2, Exp: 0xexp-2 Xmit: 0xxmit-2 ]

smtid

is the tracking ID of the NonStop Blade Element that connects to the FPGA 3.

chnl‑n

is the ID of channel n.

rcv‑n

is the contents of the receive register for channel n.

exp‑n

is the contents of the expecting register for channel n.

xmit‑n

is the contents of the transmit register for channel n.

Cause  A reintegration cabling problem.

Effect  CMAP cannot reintegrate one or more PEs or reset one or more NonStop Blade Elements. Miscabling the FPGA 3 does not exclude processors from the system, but it compromises the system’s ability to recover from PE failures.

Recovery  Correct the reintegration cabling problem as soon as possible.



209

For slice tracking ID smtid, both echo cables do not come from same complex Channel chnl-1, Rcv: 0xrcv-1, Exp: 0xexp-1 Xmit: 0xxmit-1 [ Channel chnl-2, Rcv: 0xrcv-2, Exp: 0xexp-2 Xmit: 0xxmit-2 ]

smtid

is the tracking ID of the NonStop Blade Element that connects to the FPGA 3.

chnl‑n

is the ID of channel n.

rcv‑n

is the contents of the receive register for channel n.

exp‑n

is the contents of the expecting register for channel n.

xmit‑n

is the contents of the transmit register for channel n.

Cause  One or more reintegration cables connect NonStop Blade Elements in different NonStop Blade Complexes. Reintegration cables must connect NonStop Blade Elements within a NonStop Blade Complex.

Effect  CMAP cannot reintegrate one or more PEs or reset one or more NonStop Blade Elements. Miscabling the FPGA 3 does not exclude processors from the system, but it compromises the system’s ability to recover from PE failures.

Recovery  Correct the reintegration cabling problem as soon as possible.



210

For slice tracking ID smtid, both echo cables do not come from same side Echo Error State : state Echo Status Register : 0xstatus Echo Error Register : 0xregister Channel chnl‑1, Rcv: 0xrcv‑1, Exp: 0xexp‑1 Xmit: 0xxmit‑1 [ Channel chnl‑2, Rcv: 0xrcv‑2, Exp: 0xexp‑2 Xmit: 0xxmit‑2 ]

smtid

is the tracking ID of the NonStop Blade Element that connects to the FPGA 3.

state

is the state of the FPGA 3 copy logic. Possible values include:

  • No errors detected.

  • FPGA 3 errors detected.

status

is the status of the FPGA 3 (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

register

is the contents of the FPGA 3 error register, which might provide FPGA 3 hardware error information.

chnl‑n

is the ID of channel n.

rcv‑n

is the contents of the receive register for channel n.

exp‑n

is the contents of the expecting register for channel n.

xmit‑n

is the contents of the transmit register for channel n.

Cause  Incoming reintegration cables come from different sides of the same NonStop Blade Element. For example, channels 0 and 1 of the incoming NonStop Blade Element connect to channels 1 and 0, respectively, of the other NonStop Blade Element (cable crossover).

Effect  CMAP cannot reintegrate one or more PEs or reset one or more NonStop Blade Elements. Miscabling the FPGA 3 does not exclude processors from the system, but it compromises the system’s ability to recover from PE failures.

Recovery  Correct the reintegration cabling problem as soon as possible.



211

In Complex ID complex, The echo cables do not form a ring topology Slice module tracking ID : smtid Channel chnl‑1, Rcv: 0xrcv‑1, Exp: 0xexp‑1 Xmit: 0xxmit‑1 [ Channel chnl‑2, Rcv: 0xrcv‑2, Exp: 0xexp‑2 Xmit: 0xxmit‑2 ]

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid

is the tracking ID of the NonStop Blade Element that connects to the FPGA 3.

chnl‑n

is the ID of channel n.

rcv‑n

is the contents of the receive register for channel n.

exp‑n

is the contents of the expecting register for channel n.

xmit‑n

is the contents of the transmit register for channel n.

Cause  Reintegration cables do not form a ring. (An example of a ring is: A transmits to B, B transmits to C, and C transmits to A.)

Effect  CMAP cannot reintegrate one or more PEs or reset one or more NonStop Blade Elements. Miscabling the FPGA 3 does not exclude processors from the system, but it compromises the system’s ability to recover from PE failures.

Recovery  Correct the reintegration cabling problem as soon as possible.
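
The ring requirement in the Cause text (A transmits to B, B transmits to C, and C transmits to A) can be expressed as a small graph check. The Python sketch below assumes the cabling is known as a mapping from each NonStop Blade Element to the element it transmits to; that representation and the function name forms_ring are assumptions for illustration, not how CMAP stores or validates cabling.

```python
def forms_ring(transmits_to):
    """Return True when the links form one ring through every element.

    transmits_to maps each element to the element it transmits to.
    """
    if not transmits_to:
        return False
    start = next(iter(transmits_to))
    visited = set()
    current = start
    while current not in visited:
        if current not in transmits_to:
            return False  # link leads to an element with no outgoing cable
        visited.add(current)
        current = transmits_to[current]
    # A ring returns to the start after visiting every element exactly once.
    return current == start and len(visited) == len(transmits_to)

# The example from the Cause text forms a ring.
ring_ok = forms_ring({"A": "B", "B": "C", "C": "A"})
# Here C transmits to B, so the walk from A never returns to A.
ring_bad = forms_ring({"A": "B", "B": "C", "C": "B"})
```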



212

In Complex ID complex, the echo cables form ring topology, but a slice was excluded Slice 1 tracking ID : smtid-1 Slice 2 tracking ID : smtid-2 Excluded slice module tracking ID : smtid-excluded Channel chnl‑1, Rcv: 0xrcv‑1, Exp: 0xexp‑1 Xmit: 0xxmit‑1 [ Channel chnl‑2, Rcv: 0xrcv‑2, Exp: 0xexp‑2 Xmit: 0xxmit‑2 ]

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid-1

is the tracking ID of the first NonStop Blade Element that connects to the FPGA 3.

smtid-2

is the tracking ID of the second NonStop Blade Element that connects to the FPGA 3.

smtid-excluded

is the tracking ID of the NonStop Blade Element that does not connect to the FPGA 3.

chnl‑n

is the ID of channel n.

rcv‑n

is the contents of the receive register for channel n.

exp‑n

is the contents of the expecting register for channel n.

xmit‑n

is the contents of the transmit register for channel n.

Cause  Reintegration cables form a ring, but a NonStop Blade Element is excluded from the ring.

Effect  CMAP cannot reintegrate one or more PEs or reset one or more NonStop Blade Elements. Miscabling the FPGA 3 does not exclude processors from the system, but it compromises the system’s ability to recover from PE failures.

Recovery  Correct the reintegration cabling problem as soon as possible.
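
To illustrate a ring that leaves out a NonStop Blade Element, the following Python sketch walks the cabling links and reports which elements are not on the ring. The input format (a mapping from each element to the element it transmits to, plus a list of all elements in the complex) and the function name are assumptions made for this example.

```python
def ring_and_excluded(transmits_to, elements):
    """Return (ring, excluded): the elements on the ring reached from the
    first linked element, and the elements left out of that ring."""
    path = []
    current = next(iter(transmits_to), None)
    while current in transmits_to and current not in path:
        path.append(current)
        current = transmits_to[current]
    # The walk closes a ring only if it stopped on an element already seen.
    ring = path[path.index(current):] if current in path else []
    excluded = sorted(set(elements) - set(ring))
    return ring, excluded

# Hypothetical triplex complex in which only A and B are cabled together.
ring, excluded = ring_and_excluded({"A": "B", "B": "A"}, ["A", "B", "C"])
```

In this example the ring contains A and B, and C is reported as excluded, which matches the situation that the message describes.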



213

Echo cabling in complex ID complex has been declared invalid Slice module tracking ID: smtid-1 Slice module tracking ID1: smtid-2 Slice module tracking ID2: smtid-3

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid-n

is the tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

Cause  A cabling or hardware error.

Effect  CMAP cannot reintegrate one or more PEs or reset one or more NonStop Blade Elements. Miscabling the FPGA 3 does not exclude processors from the system, but it compromises the system’s ability to recover from PE failures.

Recovery  Correct the problem specified in preceding ZCMP event messages as soon as possible.



214

Echo cabling in complex ID complex has been declared valid Slice module tracking ID: smtid-1 Slice module tracking ID1: smtid-2 Slice module tracking ID2: smtid-3

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

smtid-n

is the tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

Cause  Either:

  • The first processor in a NonStop Blade Complex was loaded and its reintegration cabling configuration is valid.

  • The problem that invalidated a reintegration cabling configuration was corrected, and that reintegration cabling configuration is now valid.

Effect  The reintegration cabling can reintegrate PEs and reset NonStop Blade Elements.

Recovery  This is an informational message only; no corrective action is needed.



215

Echo scrub error on slice slice detected Slice module tracking ID: smtid Echo Status Register : 0xstatus Echo Error Register : 0xerror Echo info : 0xinfo

slice

is the NonStop Blade Element ID of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element that connects to the FPGA 3.

status

is the status of the FPGA 3 (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

error

is the contents of the FPGA 3 error register, which might provide FPGA 3 hardware error information.

info

is additional FPGA 3 information returned from the source NonStop Blade Element in hexadecimal (contents of the echoAdditionalInfo field of the echoInfo structure).

Cause  Periodically the FPGA 3 hardware scrubs itself. While doing so, it found an error.

Effect  PEs on the NonStop Blade Element with the FPGA 3 cannot be reintegrated.

Recovery  Contact your service provider.



216

On CPU cpu, in complex complex the LIDs of the PEs have changed to an invalid configuration SLICE A LID : lid-1 [ SLICE B LID : lid-2 [ SLICE C LID : lid-3 ] ]

cpu

is the affected processor.

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

lid-n

is the LID of PE n.

Cause  Because of incorrect PE-to-LSU cabling, some of the PEs that compose a processor have different LIDs.

Effect  PE reintegration and voting logic function incorrectly on the affected processor.

Recovery  Correct the cabling problem.



217

On CPU cpu, the PE in slice slice (LID lid) detected an LSU state change Slice module tracking ID : smtid LSU module tracking ID : lsuid LSU Physical Location : Group : group Module : module Slot : slot PE A old status : pe-old-1, new status : pe-new-1 PE B old status : pe-old-2, new status : pe-new-2 PE C old status : pe-old-3, new status : pe-new-3 LSU strand A old status : lsu-old-1, new status : lsu-new-1 LSU strand B old status : lsu-old-2, new status : lsu-new-2 LSU strand C old status : lsu-old-3, new status : lsu-new-3

cpu

is the processor where the PE resides.

slice

is the NonStop Blade Element that is connected to the PE.

lid

is the LID of the PE.

smtid

is the tracking ID of the NonStop Blade Element where the PE resides.

lsuid

is the tracking ID of the affected LSU.

group

is the group identification of the affected LSU.

module

is the module identification of the affected LSU.

slot

is the slot identification of the affected LSU.

pe-old-n

is the previous status of PE n. Possible values include:

  • PE not present.

  • PE not responding.

  • PE returned error.

  • PE stopped.

  • PE in primitive mode.

  • PE executing HSS.

  • PE executing NSK.

pe-new-n

is the status of PE n. Possible values are the same as for pe-old-n.

lsu-old-n

is the previous status (from the LSU register) of strand n connecting the LSU to the PE.

lsu-new-n

is the status (from the LSU register) of strand n connecting the LSU to the PE.

Cause  The cause is indicated by pe-new-n.

Effect  The PE might be stopped. If there is a cabling problem, the PE might continue to run but might be unable to participate in reintegrations.

Recovery  Contact your service provider.
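
Because this message lists old and new status values for three PEs and three LSU strands, it can help to reduce the fields to only those that changed. The Python sketch below assumes the old and new values have already been collected into dictionaries keyed by PE or strand letter; the function name and the sample values are hypothetical.

```python
def changed_statuses(old, new):
    """Return {name: (old_status, new_status)} for entries that changed."""
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }

# Hypothetical values using the PE status strings listed above.
pe_old = {"A": "PE executing NSK.", "B": "PE executing NSK.", "C": "PE executing NSK."}
pe_new = {"A": "PE executing NSK.", "B": "PE stopped.", "C": "PE executing NSK."}
changes = changed_statuses(pe_old, pe_new)
```

The same function can be applied to the LSU strand old and new values.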



218

The tracking ID of a slice in a complex has changed Complex ID : complex Slice ID : slice New Slice module tracking ID : smtid-new Prior Slice module tracking ID : smtid-old

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

slice

is the NonStop Blade Element ID of the NonStop Blade Element whose tracking ID changed.

smtid-new

is the new NonStop Blade Element tracking ID.

smtid-old

is the original NonStop Blade Element tracking ID.

Cause  Either:

  • A NonStop Blade Element was replaced in a NonStop Blade Complex.

  • A NonStop Blade Element was inserted in a NonStop Blade Complex, changing the characteristics of the NonStop Blade Complex (for example, a duplex NonStop Blade Complex became a triplex NonStop Blade Complex).

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



219

In complex complex, no PEs on slice slice are responding Slice module tracking ID : smtid

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

slice

is the NonStop Blade Element ID of the NonStop Blade Element whose PEs are not responding.

smtid

is the tracking ID of the NonStop Blade Element whose PEs are not responding.

Cause  Either:

  • A NonStop Blade Element was removed.

  • A NonStop Blade Element encountered a failure that affects all PEs, such as a NonStop Blade Element power failure.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



220

In complex complex, LID lid reporting inconsistent slice memory sizes Slice module tracking ID : smtid-1, memory size memsize-1 Slice module tracking ID : smtid-2, memory size memsize-2 [ Slice module tracking ID : smtid-3, memory size memsize-3 ]

complex

is the NonStop Blade Complex ID of the affected NonStop Blade Complex.

lid

is the LID of the PE reporting inconsistent NonStop Blade Element memory sizes.

smtid-n

is the tracking ID of NonStop Blade Element n in the NonStop Blade Complex.

memsize-n

is the memory size reported for NonStop Blade Element n in the NonStop Blade Complex.

Cause  NonStop Blade Elements that compose a NonStop Blade Complex reported different memory sizes.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



221

For slice tracking ID smtid, echo cabling link problems Echo status register : 0xstatus Echo error register : 0xerror Channel chnl‑1, Rcv: 0xrcv‑1, Exp: 0xexp‑1 Xmit: 0xxmit‑1 [ Channel chnl‑2, Rcv: 0xrcv‑2, Exp: 0xexp‑2 Xmit: 0xxmit‑2 ]

smtid

is the tracking ID of the NonStop Blade Element that connects to the FPGA 3.

status

is the status of the FPGA 3 (contents of the FPGA 3 status register, which might provide FPGA 3 hardware error information).

error

is the contents of the FPGA 3 error register, which might provide FPGA 3 hardware error information.

chnl‑n

is the ID of channel n.

rcv‑n

is the contents of the receive register for channel n.

exp‑n

is the contents of the expecting register for channel n.

xmit‑n

is the contents of the transmit register for channel n.

Cause  One or more reintegration cables are missing, broken, or connected to a NonStop Blade Element that is not operating.

Effect  CMAP cannot reintegrate one or more PEs or reset one or more NonStop Blade Elements. Miscabling the FPGA 3 does not exclude processors from the system, but it compromises the system’s ability to recover from PE failures.

Recovery  Correct the reintegration cabling problem as soon as possible.



222

In complex ID complex, number of PEs reported by the logical CPUs does not match - complex invalid Slice 1 module tracking ID : track-id1, Num Pes : num-pe1 Slice 2 module tracking ID : track-id2, Num Pes : num-pe2 Slice 3 module tracking ID : track-id3, Num Pes : num-pe3

complex

is the identifier of the NonStop Blade Complex.

track-id1

is the tracking ID of the first NonStop Blade Element in the NonStop Blade Complex.

num-pe1

is the number of PEs that the first NonStop Blade Element in the NonStop Blade Complex reports.

track-id2

is the tracking ID of the second NonStop Blade Element in the NonStop Blade Complex.

num-pe2

is the number of PEs that the second NonStop Blade Element in the NonStop Blade Complex reports.

track-id3

is the tracking ID of the third NonStop Blade Element in the NonStop Blade Complex.

num-pe3

is the number of PEs that the third NonStop Blade Element in the NonStop Blade Complex reports.

Cause  The NonStop Blade Elements that compose a processor do not have the same number of PEs.

Effect  CMAP cannot reintegrate or reset any NonStop Blade Element connected to the NonStop Blade Complex.

Recovery  Correct the NonStop Blade Element problem.



223

In complex ID complex, Two CPUs have the same LID on a slice with the same tracking ID - complex invalid Slice tracking ID : track-id LID : lid CPU : cpu1, SLICE : slice1 CPU : cpu2, SLICE : slice2

complex

is the identifier of the NonStop Blade Complex.

track-id

is the tracking ID of the NonStop Blade Element.

lid

is the LID that two processors report.

cpu1

is the first processor that reports the LID.

slice1

is the identifier of the NonStop Blade Element for cpu1.

cpu2

is the second processor that reports the LID.

slice2

is the identifier of the NonStop Blade Element for cpu2.

Cause  Either a hardware error occurred or a NonStop Blade Element tracking ID was improperly programmed (two physical NonStop Blade Elements have the same tracking ID). One of the two reported processors is defective, but it is impossible to identify which one.

Effect  CMAP cannot reintegrate or reset any NonStop Blade Element connected to the NonStop Blade Complex.

Recovery  Contact your service provider.



224

CRAM errors with echo configuration error on slice slice detected Slice module tracking ID: smtid Echo Error Register : 0x Error0 for slice 0 Echo Error Register : 0x Error1 for slice 1 Echo Error Register : 0x Error2 for slice 2

slice

is the identifier of the NonStop Blade Element.

smtid

is the tracking ID of the NonStop Blade Element that connects to the FPGA 3.

Error0

is the contents of the FPGA 3 error register for SLICE 0.

Error1

is the contents of the FPGA 3 error register for SLICE 1.

Error2

is the contents of the FPGA 3 error register for SLICE 2.

Cause  Echo configuration error for CRAM HW error.

Effect  PEs on the NonStop Blade Element with the FPGA 3 cannot be reintegrated.

Recovery  Contact your service provider.



300

Slice slice module power redundancy state change Slice Physical Location : Group : group Module : module Slot : slot Slice module tracking ID : smtid Module power redundancy old state : old, new state : new

slice

is the NonStop Blade Element ID of the affected NonStop Blade Element.

group

is the group identification of the affected NonStop Blade Element.

module

is the module identification of the affected NonStop Blade Element.

slot

is the slot identification of the affected NonStop Blade Element.

smtid

is the tracking ID of the affected NonStop Blade Element.

old

indicates whether power to the NonStop Blade Element was redundant.

new

indicates whether power to the NonStop Blade Element is redundant.

Cause  Power supply failure or loss of alternating current to a power supply.

Effect  If redundancy is lost, a second failure causes the NonStop Blade Element to fail.

Recovery  Check redundant power and repair it if necessary.



301

Slice slice hardware environment state change Slice Physical Location : Group : group Module : module Slot : slot Slice module tracking ID : smtid BMC communications old state : bmc-old, new state : bmc-new First fan old state : fan-old-1, new state : fan-new-1 Second fan old state : fan-old-2, new state : fan-new-2 Third fan old state : fan-old-3, new state : fan-new-3 Temperature sensor 1 old state : ts-old, new state : ts-new Power supply 1 old state : ps-old-1, new state : ps-new-1 Power supply 2 old state : ps-old-2, new state : ps-new-2 BMC watchdog timer old state : wdt-old, new state : wdt-new

slice

is the NonStop Blade Element ID of the affected NonStop Blade Element.

group

is the group identification of the affected NonStop Blade Element.

module

is the module identification of the affected NonStop Blade Element.

slot

is the slot identification of the affected NonStop Blade Element.

smtid

is the tracking ID of the affected NonStop Blade Element.

bmc-old

indicates whether BMC communications were functioning normally.

bmc-new

indicates whether BMC communications are functioning normally.

fan-old-n

is the previous status of fan n. Possible values include:

  • Fan is within specification.

  • One fan failed.

  • Both fans failed.

  • Fans are either missing or not responding.

fan-new-n

is the status of fan n. Possible values are the same as for fan-old-n.

ts-old

is the previous status of the temperature sensor. Possible values include:

  • Temperature is within specification.

  • Temperature is near critical.

  • Temperature is critical.

  • Temperature is past critical - BMC shutdown.

  • Temperature access error occurred.

ts-new

is the status of the temperature sensor. Possible values are the same as for ts-old.

ps-old-n

is the previous status of power supply n. Possible values include:

  • Power is within specification.

  • Power supply is either missing or not responding.

  • Power supply fan fault.

  • Power is not within specification.

  • No AC power.

  • Power supply access error occurred.

ps-new-n

is the status of power supply n. Possible values are the same as for ps-old-n.

wdt-old

indicates whether the BMC watchdog timer had previously triggered.

wdt-new

indicates whether the BMC watchdog timer has triggered.

Cause  A problem with a fan, temperature sensor, or power supply.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.
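
When reviewing this message, it can be convenient to list only the components whose new state is something other than "within specification". The Python sketch below assumes the new-state fields have been collected into a dictionary keyed by field name and that the status strings match the values listed above; the function name and sample data are hypothetical.

```python
def out_of_spec(new_states):
    """Return the components whose new status is not 'within specification'.

    The match uses the phrase "is within specification" so that
    "Power is not within specification." is still reported.
    """
    return sorted(
        name for name, status in new_states.items()
        if "is within specification" not in status
    )

# Hypothetical new-state values taken from the status lists above.
new_states = {
    "fan-new-1": "Fan is within specification.",
    "fan-new-2": "One fan failed.",
    "ts-new": "Temperature is near critical.",
    "ps-new-1": "Power is not within specification.",
    "ps-new-2": "Power is within specification.",
}
flagged = out_of_spec(new_states)
```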



302

Slice slice LED led state change Slice Physical Location : Group : group Module : module Slot : slot Slice module tracking ID : smtid LED old state : old, new state : new

slice

is the NonStop Blade Element ID of the affected NonStop Blade Element.

led

is the ID of the LED that changed status. Possible values include:

  • NonStop Blade Element system LED.

  • NonStop Blade Element system locator LED.

  • NonStop Blade Element front LED.

  • NonStop Blade Element rear LED.

  • Medusa Optics Adapter (MOA) fault LED.

  • LSU logic LED.

  • LSU logic optics LED.

  • LSU ServerNet X LED.

  • LSU ServerNet Y LED.

  • LSU strand A LED.

  • LSU strand B LED.

  • LSU strand C LED.

group

is the group identification of the affected NonStop Blade Element.

module

is the module identification of the affected NonStop Blade Element.

slot

is the slot identification of the affected NonStop Blade Element.

smtid

is the tracking ID of the affected NonStop Blade Element.

old

is the previous status of the LED.

new

is the status of the LED.

Cause  An LED on a NonStop Blade Element changed status.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



303

LSU power supply state change. GMS gms LSU module tracking ID: lsuid Power supply rail 1 old status : psr-old-1, new status : psr-new-1 Power supply rail 2 old status : psr-old-2, new status : psr-new-2

gms

is the group-module-slot identification of the affected LSU.

lsuid

is the tracking ID of the affected LSU.

psr-old-n

indicates whether power supply rail n was up. Possible values include:

  • Power supply rail is within specification.

  • Power supply rail is near critical.

  • Power supply rail is critical.

  • Power supply rail is past critical - BMC shutdown.

  • Power supply rail access error occurred.

psr-new-n

indicates whether power supply rail n is up. Possible values are the same as for psr-old-n.

Cause  The power supply status of an LSU changed.

Effect  None.

Recovery  Check power supply and repair it if necessary.



304

LSU module LED led state change LSU Physical Location : Group : group Module : module Slot : slot LSU module tracking ID : lsuid LED old state : old, new state : new

led

is the ID of the LED that changed status. Possible values include:

  • NonStop Blade Element system LED.

  • NonStop Blade Element system locator LED.

  • NonStop Blade Element front LED.

  • NonStop Blade Element rear LED.

  • Medusa Optics Adapter (MOA) fault LED.

  • LSU logic LED.

  • LSU logic optics LED.

  • LSU ServerNet X LED.

  • LSU ServerNet Y LED.

  • LSU strand A LED.

  • LSU strand B LED.

  • LSU strand C LED.

lsuid

is the tracking ID of the affected LSU.

old

indicates whether the LED was on.

new

indicates whether the LED is on.

Cause  The fault LED status of an LSU changed.

Effect  None.

Recovery  If ZCMP-TKN-LED-STATUS is true, correct the cause of the fault.



330

CPU cpu slice slice LED state change: LED led, Old status: old, New status: new

cpu

is the processor reporting this event.

slice

is the NonStop Blade Element:

Value  NonStop Blade Element
0      A
1      B
2      C

led

is the LED:

Value  LED
0      System LED
1      System locator LED
2      Fault LED on MOA

old

indicates whether the LED was on:

Value  Status
0      Off
1      On

new

indicates whether the LED is on:

Value  Status
0      Off
1      On

Cause  The status of a processor module (NonStop Blade Element) LED changed.

Effect  None.

Recovery  This is an informational message only.
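
The value tables for message 330 can be applied mechanically when post-processing event logs. The following is a minimal sketch, not part of CMAP: the function name describe_led_change and its argument order are hypothetical, and only the value-to-name mappings come from the tables above.

```python
# Value tables copied from the message 330 description.
SLICE_NAMES = {0: "A", 1: "B", 2: "C"}
LED_NAMES = {0: "System LED", 1: "System locator LED", 2: "Fault LED on MOA"}
LED_STATUS = {0: "Off", 1: "On"}


def describe_led_change(cpu, slice_value, led, old, new):
    """Return a readable summary of a message 330 event.

    Hypothetical helper: values not listed in the tables are
    reported as 'unknown (n)' rather than guessed.
    """
    def lookup(table, value):
        return table.get(value, f"unknown ({value})")

    return (
        f"CPU {cpu}, NonStop Blade Element {lookup(SLICE_NAMES, slice_value)}: "
        f"{lookup(LED_NAMES, led)} changed from "
        f"{lookup(LED_STATUS, old)} to {lookup(LED_STATUS, new)}"
    )
```

For example, describe_led_change(2, 1, 2, 0, 1) reports that the fault LED on the MOA of NonStop Blade Element B on CPU 2 changed from Off to On.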



331

CPU cpu slice slice module redundancy power status change: Old power status: old, New power status: new

cpu

is the processor reporting this event.

slice

is the NonStop Blade Element:

Value  NonStop Blade Element
0      A
1      B
2      C

old

is the power’s old status:

Value  Status
0      Nonredundant
1      Fully redundant

new

is the power’s new status:

Value  Status
0      Nonredundant
1      Fully redundant

Cause  The status of the redundancy power changed.

Effect  The OSM checks the status of the power supplies.

Recovery  This is an informational message only.



332

CPU cpu Slice slice ESC status change: GMS cpu, Slice slice, Old fan status: fan-old, New fan status: fan-new, Old power status: power-old, New power status: power-new, Old temp status: temp-old, New temp status: temp-new, Old BMC Status: bmc-old, New BMC Status: bmc-new, Old BMC WDT Status: wdt-old, New BMC WDT Status: wdt-new

cpu

is the processor reporting this event.

slice

is the NonStop Blade Element:

Value  NonStop Blade Element
0      A
1      B
2      C

fan-old

is the previous status of the fan.

fan-new

is the status of the fan.

power-old

is the previous status of the power supply:

Value  Status
0      Nonredundant
1      Fully redundant

power-new

is the status of the power supply:

Value  Status
0      Nonredundant
1      Fully redundant

temp-old

is the previous status of the temperature.

temp-new

is the status of the temperature.

bmc-old

is the previous status of the BMC:

Value  Status
0      Not communicating with IPF processor
1      Communicating with IPF processor

bmc-new

is the status of the BMC:

Value  Status
0      Not communicating with IPF processor
1      Communicating with IPF processor

wdt-old

is the previous status of the watchdog timer:

Value  Status
0      Has not gone off
1      Has gone off

wdt-new

is the status of the watchdog timer:

Value  Status
0      Has not gone off
1      Has gone off

Cause  One of the following:

  • The status of the fan, power, or temperature changed.

  • The BMC is not able to communicate with the IPF processor.

  • The watchdog timer went off.

The Standard Millicode generates this event every 30 minutes, but only when the ESC status has changed.

Effect  None.

Recovery  This is an informational message only.
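
Because message 332 reports five old/new status pairs at once, it can help to isolate the pairs that actually differ. The following is a minimal sketch under stated assumptions: the function name changed_esc_fields and the dictionary keys are hypothetical, and the fan and temperature values are passed through unchanged because this chapter does not list their meanings.

```python
# Value tables copied from the message 332 description.
POWER_STATUS = {0: "Nonredundant", 1: "Fully redundant"}
BMC_STATUS = {
    0: "Not communicating with IPF processor",
    1: "Communicating with IPF processor",
}
WDT_STATUS = {0: "Has not gone off", 1: "Has gone off"}

# Fan and temperature values are not documented here, so they have no table.
TABLES = {"power": POWER_STATUS, "bmc": BMC_STATUS, "wdt": WDT_STATUS}
FIELDS = ("fan", "power", "temp", "bmc", "wdt")


def changed_esc_fields(old, new):
    """Return {field: (old_text, new_text)} for each field that changed.

    'old' and 'new' are dicts keyed by the names in FIELDS
    (hypothetical layout, chosen for this example only).
    """
    changes = {}
    for field in FIELDS:
        if old[field] != new[field]:
            table = TABLES.get(field, {})
            changes[field] = (
                table.get(old[field], old[field]),
                table.get(new[field], new[field]),
            )
    return changes
```

An empty result means that none of the five status pairs differ.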



333

LSU module LED state change: GMS gms, LED: led, Old status: old, New status: new

gms

is the GMS of the LSU.

led

is the LED:

Value  LED
3      LSU logic LED
4      Optic card LED

old

is the previous status of the LED:

Value  Status
0      Off
1      On

new

is the status of the LED:

Value  Status
0      Off
1      On

Cause  The state of the LED for the LSU logic or the optics card changed.

Effect  None.

Recovery  This is an informational message only.



400

FWUPDT : Firmware scrub start Firmware module : module Slice module tracking ID : slice CPU : cpu LID : lid LSU tracking ID : lsuid Firmware Device : device

module

is the module ID for the firmware that was scrubbed. Possible values include:

  • BMC firmware

  • PAL and SAL Firmware

  • Halted State Services Firmware

  • Itanium processor Primitive State Firmware

  • FIRMWARE_IPS 1040

  • IPS 1040 firmware

  • FPGA

  • LSU CPLD

  • NonStop Blade Element CPLD

  • NonStop Blade Element FPGA 1

  • FPGA 2

  • FPGA 3

  • PAL and SAL Firmware Header

  • BMC Firmware Header

slice

is the tracking ID of the NonStop Blade Element where the module that was scrubbed resides.

cpu

is the processor where the module that was scrubbed resides.

lid

is the LID of the PE on the NonStop Blade Element.

lsuid

is the tracking ID of the affected LSU.

device

is the firmware device. Possible values include:

  • Primary image

  • Shadow image

Cause  A scrub started on a firmware module.

Effect  The firmware might need to be scrubbed again or updated.

Recovery  This is an informational message only; no corrective action is needed.



401

FWUPDT : Firmware scrubbed Firmware module : module Slice module tracking ID : slice CPU : cpu LID : lid LSU tracking ID : lsuid Scrub Result : result Firmware Device : device

module

is the module ID for the firmware that was scrubbed. Possible values include:

  • BMC firmware

  • PAL and SAL Firmware

  • Halted State Services Firmware

  • Itanium processor Primitive State Firmware

  • FIRMWARE_IPS 1040

  • IPS 1040 firmware

  • FPGA

  • LSU CPLD

  • NonStop Blade Element CPLD

  • NonStop Blade Element FPGA 1

  • FPGA 2

  • FPGA 3

  • PAL and SAL Firmware Header

  • BMC Firmware Header

slice

is the tracking ID of the NonStop Blade Element where the module that was scrubbed resides.

cpu

is the processor where the module that was scrubbed resides.

lid

is the LID of the PE on the NonStop Blade Element.

lsuid

is the tracking ID of the affected LSU.

result

is the result of the scrub (contents of an FPGA 3 register). Possible results include:

  • Scrub succeeded.

  • Header error occurred.

  • Checksum error occurred.

  • Millicode error occurred.

device

is the firmware device. Possible values include:

  • Primary image

  • Shadow image

Cause  A scrub completed on a firmware module.

Effect  Depends on the result. The firmware might need to be scrubbed again or updated.

Recovery  If there is a problem, contact your service provider.



402

FWUPDT : Firmware update start Firmware module : module File Name : file Slice module tracking ID : slice CPU : cpu LID : lid LSU tracking ID : lsuid

module

is the module ID for the firmware that was updated. Possible values include:

  • BMC firmware

  • PAL and SAL Firmware

  • Halted State Services Firmware

  • Itanium processor Primitive State Firmware

  • FIRMWARE_IPS 1040

  • IPS 1040 firmware

  • FPGA

  • LSU CPLD

  • NonStop Blade Element CPLD

  • NonStop Blade Element FPGA 1

  • FPGA 2

  • FPGA 3

  • PAL and SAL Firmware Header

  • BMC Firmware Header

file

is the name of the file used to update the firmware.

slice

is the tracking ID of the NonStop Blade Element where the module that was updated resides.

cpu

is the processor where the module that was updated resides.

lid

is the LID of the PE on the NonStop Blade Element.

lsuid

is the tracking ID of the affected LSU.

Cause  An update started on a firmware module.

Effect  None.

Recovery  This is an informational message only; no corrective action is needed.



403

FWUPDT : Firmware update finished Firmware module : module File Name : file Slice module tracking ID : slice CPU : cpu LID : lid LSU tracking ID : lsuid Update Result : result

module

is the module ID for the firmware that was updated. Possible values include:

  • BMC firmware

  • PAL and SAL Firmware

  • Halted State Services Firmware

  • Itanium processor Primitive State Firmware

  • FIRMWARE_IPS 1040

  • IPS 1040 firmware

  • FPGA

  • LSU CPLD

  • NonStop Blade Element CPLD

  • NonStop Blade Element FPGA 1

  • FPGA 2

  • FPGA 3

  • PAL and SAL Firmware Header

  • BMC Firmware Header

file

is the name of the file used to update the firmware.

slice

is the tracking ID of the NonStop Blade Element where the module that was updated resides.

cpu

is the processor where the module that was updated resides.

lid

is the LID of the PE on the NonStop Blade Element.

lsuid

is the tracking ID of the affected LSU.

result

is the result of the update. Possible results include:

  • Update succeeded.

  • File format error occurred.

  • Millicode error occurred.

Cause  An update completed on a firmware module.

Effect  If the update failed, the firmware must be updated again.

Recovery  If there is a problem, contact your service provider.



404

FWUPDT : Firmware update program failed CPU : cpu PIN : pin Firmware module : module File Name : file Slice module tracking ID : slice CPU : cpu1 LID : lid LSU tracking ID : lsuid Reason : reason

cpu

is the processor where the firmware update process was running.

pin

is the PIN of the firmware update process.

module

is the module ID for the firmware that was updated. Possible values include:

  • BMC firmware

  • PAL and SAL Firmware

  • Halted State Services Firmware

  • Itanium processor Primitive State Firmware

  • FIRMWARE_IPS 1040

  • IPS 1040 firmware

  • FPGA

  • LSU CPLD

  • NonStop Blade Element CPLD

  • NonStop Blade Element FPGA 1

  • FPGA 2

  • FPGA 3

  • PAL and SAL Firmware Header

  • BMC Firmware Header

file

is the name of the file used to update the firmware.

slice

is the tracking ID of the NonStop Blade Element where the module that was updated resides.

cpu1

is the processor where the module that was updated resides.

lid

is the LID of the PE on the NonStop Blade Element.

lsuid

is the tracking ID of the affected LSU.

reason

is the reason that the firmware update process failed. Possible reasons include:

  • Processor hosting update process failed.

  • Update process abended.

Cause  The cause is indicated by reason.

Effect  Firmware update is incomplete.

Recovery  Restart the firmware update.



500

RCVDUMP : RCVDUMP with PARALLEL option failed CPU : cpu LID : lid Slice : slice Slice module tracking ID : track-id File Name : filename Process Launch error : error Completion Code : comp-code Termination Code : term-code

cpu

is the processor where the PE resides.

lid

is the LID of the PE.

slice

is the identifier of the NonStop Blade Element where the PE resides.

track-id

is the tracking ID of the NonStop Blade Element where the PE resides.

filename

is the name of the target file for the PE’s scheduled dump.

error

is the process launch failure code:

  • Zero indicates that the receive-dump process failed after launch.

  • Nonzero is an error returned by a process create call.

comp-code

is the completion code. If nonzero, it is the completion code that the process returned when it returned condition codes.

term-code

is the termination code. If nonzero, it is the termination code that the system returns to a process when the process stops for any reason.

Cause  The processor in which the receive dump was launched halted for the reason reported by term-code.

Effect  The PE’s scheduled receive dump is not performed.

Recovery  This is an informational message only; no corrective action is needed.
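
The three codes in message 500 are read together: error distinguishes a launch failure from a failure after launch, and comp-code and term-code are meaningful only when nonzero. The following is a minimal sketch of that reading; the function name explain_rcvdump_failure and the wording of its output are hypothetical, not part of RCVDUMP or CMAP.

```python
def explain_rcvdump_failure(error, comp_code, term_code):
    """Return a list of notes interpreting the codes in message 500.

    Hypothetical helper that follows the parameter descriptions above.
    """
    notes = []
    if error == 0:
        # Zero means the process was created but failed afterward.
        notes.append("The receive-dump process failed after launch.")
    else:
        # Nonzero is the error returned by the process create call.
        notes.append(f"Process creation failed with error {error}.")
    if comp_code != 0:
        notes.append(f"Completion code returned by the process: {comp_code}.")
    if term_code != 0:
        notes.append(f"Termination code returned by the system: {term_code}.")
    return notes
```

The numeric codes are reported as-is because their individual meanings are not defined in this chapter.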



600

CMAP : SAD process failure Failure type : failure-type Error returned : error Restart retry count : retries

failure-type

is the type of failure that occurred; either the launch of the SAD process failed or the SAD process was stopped or abended.

error

is the specific stop code or process-creation error.

retries

is the number of times CMAP attempted to restart the SAD process without receiving updated information from it. (The maximum is four.)

Cause  Either a usable SAD process (NSADPR) cannot be found in either default directory (SYSnn or SYSTEM), or a running SAD process encountered an unrecoverable error.

Effect  The affected processor loses all environmental monitoring.

If failure-type indicates that a SAD process launch failed, only one event is generated, because retrying the launch would not help.

If failure-type indicates that the SAD process stopped or abended after being launched, CMAP tries to restart the SAD process up to four times. After the fourth time, if the SAD process has not provided CMAP with updated information, CMAP does not try again to restart the SAD process. If, during any retry, the SAD process provides CMAP with updated information, CMAP resets the retry count.

Recovery  Contact your service provider.
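
The restart behavior described under Effect for message 600 can be modeled as a small counter. The following is a minimal sketch, assuming a simplified view of CMAP: the class name SadRestartPolicy and its method names are hypothetical, and only the limit of four retries and the reset-on-update rule come from the text above.

```python
MAX_RETRIES = 4  # maximum stated in the description of retries


class SadRestartPolicy:
    """Hypothetical model of the SAD restart rules for message 600."""

    def __init__(self):
        self.retries = 0

    def on_update_received(self):
        # Updated information from the SAD process resets the retry count.
        self.retries = 0

    def on_launch_failed(self):
        # A launch failure is reported once and is never retried.
        return False

    def on_sad_stopped(self):
        """Return True if another restart should be attempted."""
        if self.retries >= MAX_RETRIES:
            return False
        self.retries += 1
        return True
```

In this model, four consecutive stops without an update each trigger a restart, and a fifth does not; any update received in between restores the full allowance of four.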