Operator Messages Manual

Chapter 59 NFS (Network File System) for Open System Services (OSS) Messages

The event messages in this chapter are generated by the Network File System (NFS) for the Open System Services (OSS) subsystem. This chapter completely replaces the former chapter of the same name, which described NFS error messages rather than NFS event messages. Descriptions of NFS error messages can be found instead in the NFS Management and Operations Manual for Open System Services.

The subsystem ID displayed by the event messages described herein includes NFS as the subsystem name.

NOTE: Negative-numbered messages are common to most subsystems. If you receive a negative-numbered message that is not described in this chapter, see Chapter 15.


01

Proc: Backup open without primary open, process Issuing_process_name

Proc

is the name of the manager process.

Issuing_process

is the name of the process issuing the open.

Cause  An OPEN message was received from the backup manager process, but no corresponding primary open was found. There are two possible reasons:

  • A CHECKOPEN was attempted after the OPEN failed.

  • The primary and backup manager processes are out of synchronization.

Effect  The OPEN is rejected with Guardian File-System error 17. For more details, see the chapter on file-system errors in the Guardian Procedure Errors and Messages Manual.

Recovery  Informational only; no action is needed. If the problem recurs, stop the backup manager process. It will automatically restart.



02

File: Invalid type_of_file file

File

is the name of the OSS NFS file.

type_of_file

specifies the file type, which can be either of the following:

  • NFS subsystem user alias file (ZNFS User Alias)

  • NFS subsystem configuration file (ZNFS Config)

Cause  An invalid file was encountered in the NFS file system, or an invalid management file was found in a ZZNFSnnn or ZNFSUSR file. Either the file contains invalid entries, or it has the wrong file code for a management or NFS file.

Effect  The requested operation is not completed, and one of the following occurs:

  • If an invalid configuration file is encountered when starting a process, the process is not started.

  • A process that encounters invalid entries will terminate.

  • Encountering other conditions causes an error to be returned to the process that attempted the operation.

Recovery  Perform one of the following actions:

  • If the specified NFS configuration subvolume was incorrect, specify the correct one and restart the system.

  • If an invalid management file was named ZZNFSnnn or ZNFSUSR, rename it to a nonconflicting name. When an NFS management file is missing, NFS automatically creates one; however, all the information from the original file must be inserted into it.

  • If a valid management file contains invalid entries, report this situation to HP in a Genesis solution that includes the complete message and the EMS log.

  • If a valid file contains invalid entries, either repair it or rename it out of the NFS file system and replace it with a valid one.



03

Proc: Open from: \nodename, cpu, pin, name: Issuing_process, paid: paid rejected with error code: error_code

Proc

is the name of the LAN or server process.

nodename

is the node name of the process issuing the open. Examples of nodenames are: \IDEV, \IGATE, and \IDC12.

cpu

is the CPU number of the process issuing the open.

pin

is the process identification number (PIN) of the process issuing the open. For more information, see the Guardian Procedure Calls Reference Manual.

Issuing_process

is the name of the process issuing the open.

paid

is the process Access ID (PAID) of the process issuing the open. For more information, see the chapter on Guardian System security in the Security Management Guide.

error_code

is the optional Guardian error code. For more details, see the chapter on file-system errors in the Guardian Procedure Errors and Messages Manual.

Cause  An attempt by the indicated process to open an NFS process was rejected, which indicates either an incorrect open request or a possible security violation. An attempt to open the LAN process by any process other than the manager process is a security violation.

Effect  The open is rejected with an error, and a bad-open event is logged.

Recovery  Verify that the open is correctly specified and that the name of the issuing process is valid. Correct these items as necessary.



04

Proc: This product requires GUARDIAN 90 XF release C00 or later.

Proc

is the name of the LAN process.

Cause  The indicated NFS process was initiated on a version of Guardian earlier than C00.

Effect  The NFS process terminates.

Recovery  Restart on a suitable processor.



05

Proc: Backup process in CPU Backup_CPU died because cause

Proc

is the name of the manager process.

Backup_CPU

is the CPU where the backup was running.

cause

is the reason the backup terminated.

Cause  The backup manager process terminated, causing a loss of fault tolerance. The backup process has halted or abended, or the backup processor has gone down.

Effect  The manager process no longer has backup, and fault tolerance is lost.

Recovery  If the backup processor has gone down, either wait until the original backup processor is restored or, if appropriate, change the assigned backup processor. The primary manager process will automatically attempt to restart the backup on the designated processor. This event should not occur during normal operation and should be reported to HP in a Genesis case that includes the complete message and the EMS log.



06

Proc: Unable to start backup because Termination_cause procedure error_code error Procedure_number

Proc

is the name of the manager process.

Termination_cause

is the reason why backup terminated.

error_code

is the NEWPROCESS error code. See the Guardian Procedure Errors and Messages Manual for more information.

Procedure_number

is the procedure number.

Cause  The primary manager process was unable to create its backup process, and the backup process terminated with the indicated error. The procedure number and error can give a more detailed explanation. Procedure numbers are defined in the ZGRDDDL file for Guardian procedures and in the ZFILDDL file for file-system procedures. Typical procedure numbers for this event include:

Guardian procedure 3    NEWPROCESS
FILESYS procedure 4     CHECKOPEN
FILESYS procedure 5     CHECKPOINT
FILESYS procedure 27    CHECKPOINTX

Effect  The attempt to start the backup is abandoned and fault tolerance is lost.

Recovery  If the backup CPU is down, a CPU RELOAD is needed to bring up the backup process. NFS will attempt to create the backup when it receives the CPU RELOADED system message. If it is unsuccessful, there may be insufficient resources on the backup processor. Appropriate recovery depends on the error cause, which is revealed in the error code.



07

First_proc: Unable to communicate with Second_process

First_proc

is the name of the manager, LAN, or server process that reports the communication failure.

Second_process

is the name of the manager, LAN, or server process with which the first process is unable to communicate.

Cause  An NFS process (manager process, LAN interface process, or server process) was unable to communicate with another NFS process. The reason is usually one of the following:

  • NFS subsystem processes are running on different processors, and a processor failure occurred.

  • NFS components are running on different systems, and the network connection is lost.

  • An NFS process terminated abnormally, and the manager process has not yet started a replacement process.

Effect  The NFS subsystem is inaccessible.

Recovery  Perform one of these actions:

  • If the connection was lost because of a processor failure, reload the failed processor. When the manager process detects that the processor has been reloaded, it automatically restarts any NFS processes that were running there.

  • If the connection was lost because of a network connection failure, restore the network connection. Under most conditions, the process that generated the event will periodically try to re-establish communication.



08

Proc: EMS recording has been stopped.

Proc

is the name of the manager process.

Cause  An interactive or programmatic command was issued to stop event collection. This event occurs after all NFS subsystem components (manager process, LAN interface process, and server process or processes) have received the stop message.

Effect  No NFS-subsystem components will generate further EMS events.

Recovery  Informational message only; no corrective action is needed.



09

Proc: EMS recording switched from previous_collector to new_collector

Proc

is the name of the manager process.

previous_collector

is the EMS collector that formerly received events.

new_collector

is the EMS collector that now receives events.

Cause  An interactive or programmatic ALTER PROCESS command named the new collector and moved the EMS collection point.

Effect  The event collection functions switch from the indicated old collector to the new collector. This message is both the last message sent to the old collector, if it is accessible, and the first message sent to the new one.

Recovery  Informational message only; no corrective action is needed.



10

Proc: Error File_System_error_num encountered on procedure proc_name

Proc

is the name of a file or of the manager, LAN, or server process.

File_System_error_num

is the Guardian file-system error number, as specified in the Guardian Procedure Errors and Messages Manual.

proc_name

is the name of the procedure that encountered the file-system error. Possibilities are listed in the Guardian Procedure Calls Reference Manual.

Cause  The error was returned for an I/O operation on the indicated file or process. The most likely reason is that the device on which the file or process resides is inaccessible (the device is down or the network connection is lost).

Effect  If the file or process is critical to the NFS subsystem, the process abends and the subsystem becomes inaccessible.

Recovery  Restore the network connection or bring up the device. An error on the $RECEIVE file should not occur during normal operation and should be reported to HP in a Genesis case that includes the complete message and the EMS log.



11

Proc: Memory full, text

Proc

is the name of the LAN or server process that detected the error.

text

is the message text that indicates the reason for the memory shortage.

Cause  Insufficient memory is available to satisfy internal needs for any of the following reasons:

  • Insufficient configured memory.

  • Insufficient disk space. The swap volume may not have enough space to accommodate the needs of the process.

  • Process overload. Too many demands were made of the process, causing it to use up internal resources.

Effect  If this situation occurs when a process is starting up, the process will terminate. Otherwise, while the problem exists, requests that cannot be handled are rejected with the appropriate error.

Recovery  Perform one of these actions:

  • If insufficient memory is available, stop the NFS subsystem, adjust the DATAPAGES start-up argument to a larger value, and restart NFS.

  • If insufficient disk space is available, move the data swap volume to a disk with more space available.

  • If an overload exists, examine the load configuration for problems in calling applications, such as looping on certain requests or making unnecessary demands.



12

Proc: Internal error err_code - file: file_name, Timestamp: timestamp, Procedure: entry_point

Proc

is the name of the manager, LAN, or server process.

err_code

is the code identifying the internal program error.

file_name

is the name of the object file.

timestamp

is the object file’s bind timestamp.

entry_point

is the entry-point label in the procedure where the error occurred.

Cause  The indicated NFS process detected an internal error that must be corrected by HP.

Effect  One or more NFS components will abend, and the operation in progress cannot be completed.

Recovery  This event should not occur during normal operation, and it must be corrected by HP. Report this situation to HP in a Genesis solution that includes the complete message and the EMS log.



13

Proc: RPC procedure proc_num failed; errno = errno_code

Proc

is the name of the LAN or server process.

proc_num

is the RPC procedure number, which is documented in the program’s protocol specification and identifies the calling (client) procedure.

errno_code

is an error code that can be interpreted as follows:

errno < -2000        RPC library error (ZRPC.RPCPARMH)
-2000 < errno < 0    C library error (SYSTEM.ERRNOH)
1 < errno < 300      File-system error number (ZSPIDEF.ZFILDDL)
300 < errno          TCP/IP socket library error (ZTCPIP.PARAMH)

Cause  An RPC library procedure failed and returned the indicated value.

Effect  A client program’s request cannot be serviced, and the outcome depends on how the client responds.

Recovery  The outcome depends on how the client program deals with this error.
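
The errno_code ranges listed for this event can be applied mechanically when reading a log. The following Python sketch is illustrative only and is not part of the NFS product; the function name classify_errno is hypothetical. It follows the strict inequalities exactly as the table gives them, so the boundary values -2000, 0, 1, and 300 are reported as not covered rather than guessed.

```python
def classify_errno(errno_code: int) -> str:
    """Return the error family for an errno_code reported by event 13.

    Hypothetical helper: the ranges are copied from the table for this
    event, including its strict inequalities.
    """
    if errno_code < -2000:
        return "RPC library error (ZRPC.RPCPARMH)"
    if -2000 < errno_code < 0:
        return "C library error (SYSTEM.ERRNOH)"
    if 1 < errno_code < 300:
        return "File-system error number (ZSPIDEF.ZFILDDL)"
    if errno_code > 300:
        return "TCP/IP socket library error (ZTCPIP.PARAMH)"
    # -2000, 0, 1, and 300 fall between the documented ranges.
    return "not covered by the documented ranges"
```

For example, classify_errno(11) returns the file-system family, so the value would then be looked up as a file-system error number.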



14

Proc: Message with old sync ID, process: Manager_proc, file number file_num

Proc

is the name of the LAN or server process.

Manager_proc

is the name of the manager process.

file_num

is the number identifying the file that the request tried to access. This number was originally returned when the file was opened.

Cause  The SYNC-LEVEL parameter, supplied on the OPEN call, tells the NFS subsystem component how many completed requests must be retained for protection in the event of a takeover. This event indicates that the manager process reissued a request older than the set of saved replies, and it might signal that the process is performing checkpoints improperly.

Effect  The file system has rejected a client’s NFS request and returned a Guardian file-system error. The outcome depends on how the client is programmed to deal with this rejection.

Recovery  This event should not occur during normal operation and should be reported to HP in a Genesis case that includes the complete message and the EMS log.
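
The relationship between the sync level and the set of saved replies can be pictured with a small model. The Python sketch below is an assumption-laden illustration, not the subsystem's implementation: the class SavedReplies, the exception OldSyncIdError, and the idea that sync IDs are increasing integers are all hypothetical. It retains the replies for the most recent sync_level completed requests and rejects a reissued request that is older than everything still retained.

```python
from collections import OrderedDict


class OldSyncIdError(Exception):
    """Raised when a reissued request is older than every saved reply."""


class SavedReplies:
    """Hypothetical model: keep replies for the last `sync_level` requests."""

    def __init__(self, sync_level: int) -> None:
        if sync_level < 1:
            raise ValueError("sync_level must be at least 1")
        self.sync_level = sync_level
        self._replies = OrderedDict()  # sync ID -> reply, oldest first

    def complete(self, sync_id: int, reply: str) -> None:
        """Save the reply for a completed request, discarding the oldest."""
        self._replies[sync_id] = reply
        while len(self._replies) > self.sync_level:
            self._replies.popitem(last=False)

    def lookup(self, sync_id: int):
        """Return the saved reply for a reissued request, or None if unseen."""
        if self._replies:
            oldest = next(iter(self._replies))
            if sync_id < oldest:
                # Assumed analogue of the condition this event reports.
                raise OldSyncIdError(f"sync ID {sync_id} is older than {oldest}")
        return self._replies.get(sync_id)
```

With a sync level of 2, completing requests 1, 2, and 3 leaves only replies 2 and 3 saved, so a reissue of request 1 is rejected in this model.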



15

Proc: Trap # trap_num - File: file_name. Timestamp: time_stamp, Procedure: entry_point

Proc

is the name of the manager, LAN, or server process.

trap_num

is the trap number.

file_name

is the object filename of the running process: manager, LAN, or server process.

time_stamp

is the object file’s bind timestamp.

entry_point

is the entry-point label in the procedure where the error occurred.

Cause  A hardware or software trap was detected, which should not occur during normal operation.

Effect  The process abends.

Recovery  This event should not occur during normal operation and should be reported in a Genesis case that includes the complete message and the EMS log.



16

Proc: Unrecognized systems message on $RECEIVE. code: code

Proc

is the name of the LAN or server process.

code

is the first word of the message, which is the message code.

Cause  An unexpected system message was detected on $RECEIVE, which should not occur during normal operation.

Effect  The system message is ignored, and a dummy reply is generated.

Recovery  This event should not occur during normal operation and should be reported to HP in a Genesis case that includes the complete message and the EMS log.



17

Proc: Message from unknown source, \nodename, cpu, pin, name: proc_name, paid: paid

Proc

is the name of the LAN or server process issuing the message.

nodename

is the node name of the process issuing the message.

cpu

is the CPU number of the process issuing the message.

pin

is the process identification number (PIN) of the process issuing the message. For more information, see the Guardian Procedure Calls Reference Manual.

paid

is the process access ID (PAID) of the process issuing the message. For more information, see the chapter on Guardian system security in the Security Management Guide.

proc_name

is the name of the process issuing the message.

Cause  An inbound message was rejected because it originated from a process that does not have a current open of the indicated NFS process. This situation can occur after the NFS subsystem starts when a previously running subsystem had the same name.

Effect  The request is rejected with file system error 60. For more information, see the Guardian Procedure Errors and Messages Manual.

Recovery  Before restarting the NFS subsystem, stop all previously running subsystem components.



18

Proc: Socket Library Routine: sock_lib_routine, Socket name: sock_name, Socket type: sock_type, Wait type: wait_type, Remote IP addr: IP_addr

Proc

is the name of the LAN process.

sock_lib_routine

indicates the failed socket library routine. For more information, see the TCP/IP and IPX/SPX Programming Manual.

sock_name

indicates the socket number.

sock_type

is the socket type. Possible values:

1 - A transmission control protocol (TCP) stream socket.
2 - A user datagram socket.

wait_type

is a numeric value indicating whether the failed operation was waited or nowaited.

1 - The failed operation was waited.
2 - The failed operation was nowaited.

IP_addr

is the IP address of the remote host where the NFS client is running.

Cause  A socket call or I/O completion routine failed because of a failed socket library routine.

Effect  The LAN process is unable to communicate with a client process, so clients cannot access NFS.

Recovery  Ensure that the TCP/IP process is running and configured properly. If it is not, start it and retry the socket call.
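
The numeric sock_type and wait_type values in this event can be translated into readable text with a simple lookup. The Python sketch below is illustrative only; the function name describe_socket_failure is hypothetical, and the routine name used in the usage note is just an example value for sock_lib_routine. The value meanings are taken from the lists given for this event.

```python
# Value meanings copied from the sock_type and wait_type descriptions.
SOCK_TYPES = {1: "TCP stream socket", 2: "user datagram socket"}
WAIT_TYPES = {1: "waited", 2: "nowaited"}


def describe_socket_failure(sock_lib_routine: str, sock_type: int, wait_type: int) -> str:
    """Build a one-line description of an event 18 failure (hypothetical helper)."""
    kind = SOCK_TYPES.get(sock_type, f"unknown socket type {sock_type}")
    mode = WAIT_TYPES.get(wait_type, f"unknown wait type {wait_type}")
    return f"{sock_lib_routine} failed on a {kind} ({mode} operation)"
```

For instance, describe_socket_failure("recvfrom", 2, 2) describes a nowaited failure on a user datagram socket.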



19

Proc: Connection reestablished with Process

Proc

is the name of the LAN or server process.

Process

is the process name.

Cause  The connection to an NFS client process has been reestablished.

Effect  The NFS subsystem is up, and the client can start communicating.

Recovery  Informational message only; no corrective action is needed.



20

Proc: Initialization failure due to text

Proc

is the name of the LAN or server process.

text

The message text can be any of the following:

invalid manager devtype          - Invalid manager process name
invalid TCP/IP devtype           - Invalid TCP/IP process name
process must be named            - Process is unnamed
process must be owned by super   - Process was not owned by super-user ID

Cause  If the event was logged by the LAN process, one of the following occurred when the LAN process was started:

  • An invalid TCP/IP process name was given as an argument.

  • The LAN process was unnamed.

If the event was logged by the server process, one of the following occurred when the server process was started:

  • The server process was not started with the super-user ID.

  • The server process was unable to start the timer because the system call SIGNALTIMEOUT failed.

Effect  The NFS process fails to initialize and abends.

Recovery  Based on the cause described in text, correct the problem and restart the process.



21

Proc: Too many takeovers -- backup not restarted

Proc

is the name of the manager process.

Cause  Within the last 30 minutes (since the process was initiated or since the operator caused its primary instance to fail) multiple takeover attempts have been made on the process. Each takeover was due to abnormal termination of the primary manager process. When the takeover count exceeds five (MAX_TAKEOVERS), this event is generated, and the primary NFS manager assumes that recovery is not possible.

Effect  NFS stops.

Recovery  Assign a different backup CPU. This event is preceded by some other event, which specifies the reason for the primary abend. Examine the previous events and take appropriate action. If the problem persists, there might be an internal programming error in the subsystem. This event should not occur during normal operation and should be reported in a Genesis case that includes the complete message, the EMS log, and the preceding event that caused the primary abend.
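
The takeover limit described in the Cause can be modeled as a counter over a 30-minute period. The Python sketch below is only an illustration of that rule and does not reflect the actual NFS manager code: the class TakeoverTracker and the use of a sliding window are assumptions, while the limit of five (MAX_TAKEOVERS) and the 30-minute period come from the event description.

```python
from collections import deque

MAX_TAKEOVERS = 5          # limit named in the event description
WINDOW_SECONDS = 30 * 60   # 30 minutes, modeled here as a sliding window


class TakeoverTracker:
    """Hypothetical model of the takeover count behind event 21."""

    def __init__(self) -> None:
        self._times = deque()  # takeover times in seconds, oldest first

    def record_takeover(self, now: float) -> bool:
        """Record a takeover; return True if the backup may be restarted."""
        self._times.append(now)
        # Forget takeovers that happened more than 30 minutes before `now`.
        while now - self._times[0] > WINDOW_SECONDS:
            self._times.popleft()
        # The event is generated once the count exceeds MAX_TAKEOVERS.
        return len(self._times) <= MAX_TAKEOVERS
```

In this model, six takeovers one minute apart cause the sixth call to return False, whereas takeovers ten minutes apart never reach the limit.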



22

Proc: Too many backup failures -- backup not restarted.

Proc

is the name of the manager process.

Cause  The primary server process has detected that its backup has stopped or abended multiple times. When that number exceeds 20 (MAX_BACKUP_FAILS), this event is generated. The cause can be hardware failure or an internal programming error.

Effect  The backup is not restarted, and the primary process continues to run, but without backup.

Recovery  Perform one of these actions:

  • If the processor in which the backup was running went down, either change the backup processor assignment to restore the process to full backup status, or wait until the processor is restored. The primary server process will automatically attempt to restart the backup on the designated processor.

  • Otherwise, this event should not occur during normal operation and should be reported in a Genesis case that includes the complete message and the EMS log.



23

Proc: Backup process started in CPU cpu

Proc

is the name of the manager process.

cpu

is the CPU where the backup process was started.

Cause  The NFS manager process has started its backup process.

Effect  The backup process is running.

Recovery  Informational message only; no corrective action is needed.



24

Proc: Netgroup netgroup is nested too deep. Maximum of 10 levels allowed.

Proc

is the name of the manager process.

netgroup

is the netgroup name.

Cause  Netgroups can contain other netgroup objects as members, and the depth of these recursive netgroup definitions cannot exceed 10. This event is generated when this maximum recursive depth is reached.

Effect  Further addition of netgroups to that netgroup will fail.

Recovery  Reduce the number of netgroups in this netgroup.
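
Before adding a netgroup as a member of another, the nesting depth can be checked offline. The Python sketch below is a hypothetical aid, not an NFS utility: the function names, the dictionary representation of netgroups, and the convention that a netgroup with no netgroup members has depth 1 are all assumptions. Only the limit of 10 levels comes from the event description.

```python
MAX_NETGROUP_DEPTH = 10  # limit stated in the event text


def nesting_depth(name, members, _path=()):
    """Return the nesting depth of netgroup `name`.

    `members` maps each netgroup name to a list of its members; a member
    counts as a nested netgroup only if it is itself a key in `members`.
    """
    if name in _path:
        raise ValueError(f"netgroup cycle through {name}")
    nested = [m for m in members.get(name, []) if m in members]
    if not nested:
        return 1
    return 1 + max(nesting_depth(m, members, _path + (name,)) for m in nested)


def is_nested_too_deep(name, members):
    """True if `name` exceeds the assumed 10-level limit."""
    return nesting_depth(name, members) > MAX_NETGROUP_DEPTH
```

A chain of ten netgroups, each containing the next, is accepted by this check; adding an eleventh level makes is_nested_too_deep return True.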



25

Proc: Unable to recover corrupted database file

Proc

is the name of the manager process.

Cause  When the NFS manager is restarted, it checks the integrity of the NFS configuration database by searching the file for a recovery record. If one is found and if the NFS manager does not have write access to the database, this event is generated to indicate a corrupt database. On the other hand, if a recovery record is found and if the NFS manager has write access to the database, it performs the operation indicated by the recovery record, and NFS operation continues.

Effect  The NFS manager abends.

Recovery  The ZNFSUSR and ZNFSUSR1 files, where user information is stored, are Safeguard protected. Therefore, proceed as follows:

  • Permissions for these files must be set so that the NFS manager can access them.

  • The NFS subsystem must be started with the proper user ID.

  • As the last resort, recreate these two files.
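
The restart check described in the Cause reduces to two conditions: whether a recovery record exists and whether the manager has write access to the database. The Python sketch below restates that decision for clarity; the function name and the returned strings are hypothetical, and the case where no recovery record is found is assumed to mean a normal start.

```python
def startup_action(recovery_record_found: bool, has_write_access: bool) -> str:
    """Summarize the NFS manager restart check behind event 25 (illustrative)."""
    if not recovery_record_found:
        # Assumption: no recovery record means there is nothing to recover.
        return "start normally"
    if has_write_access:
        # The manager performs the operation named in the recovery record.
        return "apply recovery record and continue"
    # Recovery record present but the database cannot be written.
    return "generate event 25 and abend"
```

Only the combination of a recovery record and missing write access leads to this event, which is why the recovery steps concentrate on file permissions and the user ID used to start NFS.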



27

Proc: Security violation on user userid, host hostname, NFS operation NFS_op, NFS Filename NFS_filename, Remote IP addr IP_addr

Proc

is the name of the manager process.

userid

is the NFS user ID of the user that attempted to violate security.

hostname

is the name of the host where the NFS client that caused the violation is running.

NFS_op

is the NFS operation that failed. For more information, see the subsection on the NFS protocol specification in the chapter on requests for comments reference specification in the Overview of NFS for Open System Services manual.

NFS_filename

is the optional name of the file to which access was attempted.

IP_addr

is the IP address of the remote host that runs the NFS client that attempted to violate security.

Cause  The identified user has tried to access the named file without having access permission for it.

Effect  Access to this file by this user is denied.

Recovery  None required, but users needing access should contact the NFS administrator to obtain access permission.



28

Proc: QIO Monitor process QIO_proc is not running; NFS will run slower

Proc

is the name of the server process.

QIO_proc

is the name of the QIO monitor process.

Cause  The QIO monitor process (QIOMON) is not running.

Effect  NFS runs slower. The master NFS process for a single OSS file system is called the headpin. It manages a pool of slave processes, called workerbees. If the QIO monitor process (QIOMON) is not running in the CPU assigned to the OSS NFS server, all messages between the headpin process and workerbee processes are sent over $RECEIVE.

Recovery  Restart the QIO monitor in the CPU that runs the OSS NFS server.



29

Server: Internal error: Caller caller_fn, Callee callee_fn, Return status ret_status

Server

is the name of the NFS server that initiated the event.

caller_fn

is the caller function name.

callee_fn

is the name of the function called.

ret_status

is the status returned from the function called.

Cause  If any call to the OSS file-system API fails, this event is generated. A hardware error, perhaps a disk error, occurred when the operation was in progress.

Effect  The NFS operation fails.

Recovery  Appropriate recovery depends on the return status of the event message. This event should not occur during normal operation and should be reported to HP in a Genesis case that includes the complete message and the EMS log.



30

Server: Internal error: EFS is not configured or started.

Server

is the name of the edit file server that is not started.

Cause  The user tried to start the edit file server when it was not properly configured. The edit file server is used to access and operate on edit files.

Effect  The named edit file server does not start.

Recovery  Adjust the configuration of the edit file server object. The edit file server object is NFSEFS and is stored on the installation subvolume, typically $SYSTEM.ZOSSNFS. This object must be licensed and have the PROGID of SUPER.SUPER. Only one EFS server can be run for each volume to be converted from EDIT to UNIX or DOS format. For good performance, the EFS process priority must be high.



31

Server: Unable to start worker bee because text

Server

is the name of the NFS server process that tries to start the workerbee.

text

is the brief description of the reason why the server is unable to start the workerbee.

Cause  The master NFS process for a single OSS file system is called the headpin. It manages a pool of slave processes called workerbees, which allow multithreading of requests to NFS. This failure can occur for either of the following reasons:

  • Excessive load on the headpin process

  • Lack of memory

Effect  The request made by the remote client is not serviced. The headpin does not stay alive if it cannot create at least one workerbee, and NFS operation fails.

Recovery  Recovery depends on the cause of the workerbee failure. Try the following:

  • Reduce the load on the server process indicated by Server.

  • Make more memory available.