2.6. Error handling

The world is an imperfect place. Things go wrong. Sometimes a file can't be opened, or sometimes a tyrannical system administrator won't let us access something. It can be rough.

One of the problems with the example programs that we've written so far is that, although they detect when something goes wrong, they can't tell us what the problem was. They know something happened, but they don't know what.

2.6.1. Retrieving the error number

Like most of the UNIX-type APIs, our IFS functions return their error information using the C language "errno" variable. The idea is that there is a global variable called "errno" which a C program can check after something has gone wrong. The result is an integer that corresponds to a specific error message.

On the AS/400, the "errno" variable is actually returned by a sub-procedure that, for C programmers, gets called behind-the-scenes. So, for us to check errno, all we have to do is call that sub-procedure, and get the return value.

The sub-procedure that returns error information is called "__errno" and is part of the ILE C runtime library which is installed on every AS/400. The C language prototype for "__errno" looks like this:

      int *__errno(void);
   

What that means is that the procedure is called __errno, and it returns a ("int *") pointer to an integer. The "void" signifies that there are no parameters.

In RPG, you can't start a sub-procedure name with the underscore character, so we'll add another symbol to the front of the name in our prototype, and use the ExtProc keyword to point it at the real "__errno" procedure. The result looks like this:

     D @__errno        PR              *   ExtProc('__errno') 
   

Now, you'll note that although we're looking for an integer, this procedure actually returns a pointer. Yech! So what we'll do is create a simple sub-procedure that gets an integer from the area of memory that the pointer points at. That's a very simple sub-procedure, and it looks like this:

     P errno           B                                        
     D errno           PI            10I 0                      
     D p_errno         S               *                        
     D retval          S             10I 0 based(p_errno)       
     c                   eval      p_errno = @__errno           
     c                   return    retval                       
     P                 E                                        
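
As a rough sketch of how this errno sub-procedure might be used, the fragment below calls the unlink() API on a file that presumably doesn't exist, and then saves the error number. The unlink() prototype, the path name, and the "err" variable are assumptions made for this illustration; they aren't part of the code above. The value is copied into "err" right away, since a later API call could overwrite it.

```rpgle
     D errno           PR            10I 0
     D unlink          PR            10I 0 ExtProc('unlink')
     D   path                          *   Value Options(*string)

     D err             S             10I 0

      * unlink() returns -1 when it fails (hypothetical path name)
     c                   if        unlink('/tmp/no_such_file.txt') < 0
      * save the error number before calling anything else
     c                   eval      err = errno
     c                   endif
```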
   

2.6.2. What does the error number mean?

So, now we know that errno can be called, and it will give us an integer that tells us which error has occurred. But, what does the number mean? For example, if we got back the number 3401, how would we know what went wrong?

In C, programmers use a source member that defines a named constant for each error number. For example, it defines the constant EACCES to be the number 3401. The C program can compare errno to EACCES, and if they match, it knows that the user does not have enough access (or "authority") to carry out the function.

In fact, if you look at the text in the IBM Information Center that explains (for example) the write() API, you'll see that under "Error Conditions" it says "If write() is not successful, errno usually indicates one of the following errors . . ." and then goes on to list errors like [EACCES] and [ENOSPC]. These error conditions are nothing more than the named constants that I mentioned above.

Since errno can be used by other APIs besides the ones that this book covers, we will place the errno definitions in their own header member. That way, you can include them in your future programs without also including the code that's IFS-specific.

I've called my /copy member "ERRNO_H". If at all possible you should consider using the one that I provide with this book. In it, I've put RPG named constants corresponding to all the values of errno that I know about. Since it would be tedious for you to find all of these values and type them in, you may as well just use mine!
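
To give an idea of what such constants look like in RPG, here's a small sketch that uses only the two values mentioned in this chapter (EACCES is 3401 and ENOENT is 3025). The "err" variable is hypothetical, and is assumed to hold a value that was returned by errno. If you /copy ERRNO_H into your program, you would not define the constants yourself; they're written out here only so that the fragment is complete.

```rpgle
     D EACCES          C                   3401
     D ENOENT          C                   3025

     D err             S             10I 0

     c                   select
     c                   when      err = ENOENT
      * the path or directory doesn't exist
     c                   when      err = EACCES
      * the user lacks the authority for this operation
     c                   endsl
```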

2.6.3. Getting a human-readable error message

In addition to the named constants for each error number, it's useful to have a "human-readable" error message that corresponds to each error number. For example, when you want to print a message on the screen explaining what went wrong, you'd probably rather say "No such path or directory" than "Error 3025 has occurred."

The ILE C/400 runtime library contains a procedure called "strerror()" for this purpose. When you call strerror() with an error number as a parameter, it returns a pointer to a variable-length, null-terminated error message. Here are the C and RPG prototypes for strerror():

      char *strerror(int errnum); 

     D strerror        PR              *   ExtProc('strerror')     
     D    errnum                     10I 0 value                   
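
Because strerror() returns a pointer rather than an RPG character field, the result has to be converted before RPG can use it. One way to do that, sketched below, is the %str built-in function, which copies a null-terminated string into an RPG field. The "msg" field and its 52-byte length (the most that DSPLY can show) are choices made for this illustration, and the errno sub-procedure from the previous section is assumed to be available.

```rpgle
     D msg             S             52A

      * convert the null-terminated C string to an RPG field
     c                   eval      msg = %str(strerror(errno))
     c     msg           dsply
```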
   

In addition to strerror(), you can also view each error number as a message in an OS/400 message file. The QCPFMSG message file contains all of the C error numbers prefixed by a "CPE". For example, the error ENOENT is 3025. If you type DSPMSGD CPE3025 MSGF(QCPFMSG) at the command prompt, it will show you a message that says "No such path or directory." Likewise, if you looked up CPE3401, you'd see the human-readable message "Permission denied."

2.6.4. Utilities for communicating errors

As you may already know, OS/400 programs usually return error information from one program to another by sending a "program message". When an error occurs which causes a program to fail, the program usually sends back a program message of type "escape" to its caller.

For the sake of making error handling easier, we will create two simple sub-procedures that we can use to send back "escape messages", and add these to our ERRNO_H file, so all of our programs can use them.

The first utility is called "die". It will send back any user-supplied error message under a message number of CPF9897. This is useful for supplying simple text error messages in our example programs. Here's the code:

     P die             B
     D die             PI             1N
     D    msg                       256A   const

     D QMHSNDPM        PR                  ExtPgm('QMHSNDPM')
     D   MessageID                    7A   Const
     D   QualMsgF                    20A   Const
     D   MsgData                    256A   Const
     D   MsgDtaLen                   10I 0 Const
     D   MsgType                     10A   Const
     D   CallStkEnt                  10A   Const
     D   CallStkCnt                  10I 0 Const
     D   MessageKey                   4A
     D   ErrorCode                  256A

     D dsEC            DS
     D  dsECBytesP             1      4I 0 inz(%size(dsEC))
     D  dsECBytesA             5      8I 0 inz(0)
     D  dsECMsgID              9     15
     D  dsECReserv            16     16
     D  dsECMsgDta            17    256

     D MsgLen          S             10I 0
     D TheKey          S              4A

     c     ' '           checkr    msg           MsgLen
     c                   if        MsgLen<1
     c                   return    *off
     c                   endif

     c                   callp     QMHSNDPM('CPF9897': 'QCPFMSG   *LIBL':
     c                               Msg: MsgLen: '*ESCAPE':
     c                               '*': 3: TheKey: dsEC)

     c                   return    *off
     P                 E
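
A call to die might look something like the following sketch. It assumes the errno sub-procedure and the strerror() prototype shown earlier, plus a hypothetical unlink() prototype and path name that aren't part of this chapter's code.

```rpgle
     D unlink          PR            10I 0 ExtProc('unlink')
     D   path                          *   Value Options(*string)

     c                   if        unlink('/tmp/no_such_file.txt') < 0
      * sends CPF9897, with our text, as an escape message
     c                   callp     die('unlink(): ' + %str(strerror(errno)))
     c                   endif
```

Since die sends an escape message, the statements that follow the callp are not expected to run.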
   

The other utility function is called "EscErrno". We will pass an error number as an argument to this function, and it will send back the appropriate CPExxxx error message as an escape message to the calling program.

EscErrno is useful when we want our programs to crash and report errors that the calling program can monitor for individually. For example, a calling program could be checking for CPE3025, and handle it differently from CPE3401.

Here is the code for EscErrno:

     P EscErrno        B                                               
     D EscErrno        PI             1N                               
     D   errnum                      10i 0 value                       
                                                                       
     D QMHSNDPM        PR                  ExtPgm('QMHSNDPM')          
     D   MessageID                    7A   Const                       
     D   QualMsgF                    20A   Const                       
     D   MsgData                      1A   Const                       
     D   MsgDtaLen                   10I 0 Const                       
     D   MsgType                     10A   Const                       
     D   CallStkEnt                  10A   Const                       
     D   CallStkCnt                  10I 0 Const                       
     D   MessageKey                   4A                               
     D   ErrorCode                  256A  
                                                                       
     D dsEC            DS                                              
     D  dsECBytesP             1      4I 0 inz(%size(dsEC))            
     D  dsECBytesA             5      8I 0 inz(0)                      
     D  dsECMsgID              9     15                                
     D  dsECReserv            16     16                                
     D  dsECMsgDta            17    256                                
                                                                       
     D TheKey          S              4A                               
     D MsgID           S              7A                               
                                                                       
     c                   move      errnum        MsgID                 
     c                   movel     'CPE'         MsgID                 
                                                                       
     c                   callp     QMHSNDPM(MsgID: 'QCPFMSG   *LIBL':  
     c                               ' ': 0: '*ESCAPE':                
     c                               '*': 3: TheKey: dsEC)             
                                          
     c                   return    *off        
     P                 E                       
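
Calling EscErrno could then be as simple as the sketch below. As before, the unlink() prototype and the path name are hypothetical, and the errno sub-procedure from earlier in this chapter is assumed to be available.

```rpgle
     D unlink          PR            10I 0 ExtProc('unlink')
     D   path                          *   Value Options(*string)

     c                   if        unlink('/tmp/no_such_file.txt') < 0
      * for example, error number 3025 is sent back as CPE3025
     c                   callp     EscErrno(errno)
     c                   endif
```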
   

What I've done in my ERRNO_H is put the code for all of the procedures (errno, die, and EscErrno) at the bottom, and enclosed it in these:

      /if defined(ERRNO_LOAD_PROCEDURE)
     .... procedure code goes here ....
      /endif
   

This allows us to include all of the error handling code in our programs by copying the header member twice. The first copy goes in our D-specs, without the ERRNO_LOAD_PROCEDURE symbol defined, so it brings in only the definitions. The second copy goes where our sub-procedures go, with the ERRNO_LOAD_PROCEDURE symbol defined, so it brings in the procedure code.
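
Put together, a program that uses the header might be laid out roughly like the skeleton below. This is only a sketch: it assumes that ERRNO_H can be found through the default source file for /copy, and the H-spec reflects my assumption that the ILE C runtime procedures are bound in through the QC2LE binding directory. The call to die is just a placeholder for real program logic.

```rpgle
     H DFTACTGRP(*NO) BNDDIR('QC2LE')

      * first copy: only the definitions, alongside our D-specs
      /copy ERRNO_H

     c                   callp     die('Something went wrong')
     c                   eval      *inlr = *on

      * second copy: this time the procedure code is included
      /define ERRNO_LOAD_PROCEDURE
      /copy ERRNO_H
```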