Operator Messages Manual

Chapter 43 HRM (Host Resources Subagent) Messages

The messages in this chapter are sent by the Host Resources Subagent subsystem. The subsystem ID displayed by the Host Resources Subagent messages includes HRM as the subsystem name.

The following example shows the format for HRM operator messages as they are sent to printers, log files, or terminals:

95-03-10  16:34:20 \COMM.$HMSA  TANDEM.HRM.D20  000001 Object Unavailable
                                Host Resources MIB subagent process \COMM.$HMSA,
                                event number: Process Terminated,
                                cause: Process aborted
NOTE: Negative-numbered messages are common to most subsystems. If you receive a negative-numbered message that is not described in this chapter, see Chapter 15.
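
The header line of the example above can be split into fields mechanically, for instance when filtering a log file for HRM messages. The following Python sketch is illustrative only: the field names (owner, subsystem, version) and the regular expression are assumptions inferred from the single example shown here, not a documented format specification.

```python
import re

# Assumed layout, inferred from the example header line only:
#   date  time  process-name  owner.subsystem.version  message-number  text
HEADER_RE = re.compile(
    r"^(?P<date>\d{2}-\d{2}-\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<process>\S+)\s+"
    r"(?P<owner>[^.\s]+)\.(?P<subsystem>[^.\s]+)\.(?P<version>\S+)\s+"
    r"(?P<number>-?\d+)\s+"
    r"(?P<text>.*)$"
)


def parse_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    fields["number"] = int(fields["number"])  # "000001" becomes 1
    return fields


SAMPLE_HEADER = (
    r"95-03-10  16:34:20 \COMM.$HMSA  TANDEM.HRM.D20  000001 Object Unavailable"
)

if __name__ == "__main__":
    print(parse_header(SAMPLE_HEADER))
```

Continuation lines (the indented detail text that follows the header) do not match this pattern, so parse_header returns None for them; a caller could use that to attach them to the preceding header.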


1

000001 Object Unavailable - Host Resources MIB subagent process subagent-process-name, event number: Process Terminated, cause: explanatory-text

subagent-process-name

is the name of a Host Resources Subagent process.

explanatory-text

identifies one of the following causes:

  • Arithmetic overflow

  • Illegal address reference

  • Instruction failure

  • Loop-timer timeout

  • Memory space exhausted

  • Process aborted

  • Stack overflow

  • Termination

Cause  The subagent process terminated normally. (Abnormal termination is reported by the operating system.)

Effect  The backup process of the NonStop process pair should take over. If recovery is successful, an event announcing the takeover is generated.

Recovery  Examine the event logs for related events. Often event message 3 is generated prior to this event; see the message contents for a description of the associated problem. If you or your local Simple Network Management Protocol (SNMP) expert can determine the cause of the termination, correct it and, if necessary, restart the subagent process to allow continued monitoring of the host resources. If the termination resulted from incorrect specification of the agent process name, rerun the agent using the -a startup parameter to specify the proper name. See the SNMP Configuration and Management Manual for startup information.



2

000002 Object Available - Host Resources MIB subagent process subagent-process-name, event number: Process Started, reason: Process up, previous state: not running, current state: started

subagent-process-name

is the name of the Host Resources Subagent process.

Cause  The Host Resources Subagent process has started.

Effect  The Host Resources Subagent process is ready for service.

Recovery  Informational message only; no corrective action is needed.



3

000003 Transient Fault - Host Resources MIB subagent process subagent-process-name, event number: Process I/O error, fault type: I/O error, FS error: error-code, subcode: error-subcode, File Name: file-name, Additional Info: explanatory-text

subagent-process-name

is the name of a Host Resources Subagent process.

error-code error-subcode

are file-system error codes documented in the Guardian Procedure Errors and Messages Manual and listed in Appendix B.

file-name

is the name of the file with which the subagent experienced an I/O error.

explanatory-text

is information describing the problem.

Cause  The subagent process encountered an I/O error. The cause of the error is most likely the assignment of an incorrect agent process name at startup or a stopped NonStop agent process.

Effect  The subagent cannot function.

If the subagent was running as a process pair when the I/O error occurred, the backup process takes over. This new primary process tries to establish communication with the NonStop agent. If the process cannot establish communication, it retries every 10 minutes for a period of 1 hour, then stops.
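
The retry schedule described above (one attempt every 10 minutes for 1 hour, then stop) can be modeled with a short Python sketch. This is not the subagent's implementation; the function name, the try_connect callable, and the choice to make a final attempt at the 60-minute mark are assumptions made for illustration.

```python
import time

RETRY_INTERVAL_S = 10 * 60   # 10 minutes between attempts (from the text)
RETRY_PERIOD_S = 60 * 60     # give up after 1 hour (from the text)


def connect_with_retries(try_connect, sleep=time.sleep,
                         interval=RETRY_INTERVAL_S, period=RETRY_PERIOD_S):
    """Return True once try_connect() succeeds, False when the period elapses.

    try_connect is a hypothetical callable that returns True on success.
    sleep is injectable so the schedule can be exercised without waiting.
    """
    if try_connect():          # initial attempt right after takeover
        return True
    waited = 0
    while waited < period:
        sleep(interval)
        waited += interval
        if try_connect():
            return True
    return False               # corresponds to the process stopping
```

With the default values and a try_connect that always fails, this model makes one initial attempt plus six retries before returning False.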

Recovery  See Appendix B or the Guardian Procedure Errors and Messages Manual for a definition of the reported error code and subcode, and correct the condition described. For additional information, including recovery actions, also see this manual. If the error code is 14, the subagent could not open the NonStop agent process, and you should ensure that the NonStop agent is running or that you specified the correct agent process name when starting the subagent. To interpret other error codes, see “File-System Errors” in the Guardian Procedure Errors and Messages Manual.
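
As an illustration of the recovery guidance above, the following Python sketch extracts the error code from the text of message 3 and returns the matching hint. The field labels come from the message template; the sample text, the file name \COMM.$ZSNMP, the subcode, and the "open failed" detail are hypothetical, and the function names are assumptions.

```python
import re

# Field labels follow the message 3 template:
#   FS error: error-code, subcode: error-subcode, File Name: file-name,
#   Additional Info: explanatory-text
IO_ERROR_RE = re.compile(
    r"FS error:\s*(?P<code>\d+),\s*"
    r"subcode:\s*(?P<subcode>-?\d+),\s*"
    r"File Name:\s*(?P<file_name>[^,]+),\s*"
    r"Additional Info:\s*(?P<info>.*)$"
)


def parse_io_error(text):
    """Return the message 3 fields as a dict, or None if they are not found."""
    flat = " ".join(text.split())  # undo line wrapping in the displayed message
    m = IO_ERROR_RE.search(flat)
    if m is None:
        return None
    fields = m.groupdict()
    fields["code"] = int(fields["code"])
    fields["subcode"] = int(fields["subcode"])
    return fields


def recovery_hint(code):
    """Map a file-system error code to the recovery text given in this chapter."""
    if code == 14:
        return ("Ensure the NonStop agent is running and that the correct "
                "agent process name was specified when starting the subagent.")
    return ('See "File-System Errors" in the Guardian Procedure Errors '
            "and Messages Manual.")


# Hypothetical message text, made up for this example.
SAMPLE = r"""Host Resources MIB subagent process \COMM.$HMSA,
event number: Process I/O error, fault type: I/O error,
FS error: 14, subcode: 0, File Name: \COMM.$ZSNMP,
Additional Info: open failed"""

if __name__ == "__main__":
    fields = parse_io_error(SAMPLE)
    print(fields["code"], recovery_hint(fields["code"]))
```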



4

000004 Transient Fault - Host Resources MIB subagent process subagent-process-name, event number: Process No Memory Space, fault type: Memory full

subagent-process-name

is the name of a Host Resources Subagent process.

Cause  The subagent process ran out of memory. The most likely causes are an internal resource-management problem or insufficient swap space.

Effect  This error can prevent a part of the subagent from functioning. The subagent might be able to correct the situation or might subsequently fail. In the latter case, additional events would be emitted.

Recovery  Stop the subagent. For assistance, contact your local Simple Network Management Protocol (SNMP) expert and provide all relevant information as follows:

  • Description of the problem and accompanying symptoms

  • Details from the message or messages generated

  • Supporting documentation such as Event Management Service (EMS) logs

If your local operating procedures require contacting the Global Mission Critical Solution Center (GMCSC), supply your system number and the numbers and versions of all related products as well.



5

000005 Transient Fault - Host Resources MIB subagent process subagent-process-name, event number: Process Internal Error, fault type: Internal error, error detail: explanatory-text

subagent-process-name

is the name of a Host Resources Subagent process.

explanatory-text

is information describing the problem.

Cause  The subagent process experienced an internal or logic error.

Effect  The process might or might not be able to recover from the error. If the primary subagent process cannot recover, its backup process should take over; if recovery is successful, an event announcing the takeover is generated. If the error is not recoverable and the process terminates, additional events are generated.

Recovery  If the subagent primary and backup processes cannot recover from the error, try to restart the subagent. Regardless of whether the subagent is able to recover, contact your local Simple Network Management Protocol (SNMP) expert and provide all relevant information as follows:

  • Description of the problem and accompanying symptoms

  • Details from the message or messages generated

  • Supporting documentation such as Event Management Service (EMS) logs

If your local operating procedures require contacting the Global Mission Critical Solution Center (GMCSC), supply your system number and the numbers and versions of all related products as well.