
RE: XML_Parse ended in error



Hi Scott,
Thank you very much for help.
"One-person shop" - sometimes it is hard to catch those errors,
especially when the code has worked for years.
Thank you,
Vladimir Vayntraub


-----Original Message-----
From: ftpapi-bounces@xxxxxxxxxxxxxxxxxxxxxx
[mailto:ftpapi-bounces@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Scott Klement
Sent: Wednesday, October 07, 2009 5:06 PM
To: HTTPAPI and FTPAPI Projects
Subject: Re: XML_Parse ended in error

Hi Vlad,

I found the problem with your program.  The problem is in your 
chardata() subprocedure.  Here's the code:


      P chardata        B
      D chardata        PI
      D   data                          *   value
      D   string                   65535A   const options(*varsize)
      D   len                         10I 0 value

      D x               s             10I 0
      D val             s            132A
      D newval          s            132A   varying
       /free
          if (len < 1);
             return;
          endif;

          val = %subst(string:1:len);
          QDCXLATE( len
                  : val
                  : 'QTCPEBC' );


Note the length of 'String'.  It varies in size, but it can be as large 
as 65535 bytes -- a very big string.  The 'len' parameter contains the 
current length of that string, which logically could be between 0 and 
65535 bytes.  (Actually, technically, I don't think it can get larger 
than 32768 because the input buffer is 8192, and if it happens to be 
made entirely of 1-byte chars in ASCII that expand to 4-byte chars in 
UTF-8, the output could be 8192x4=32768.  But even that is pretty far 
fetched.)

Still, to be safe, since it's declared as 65535, you should expect that 
data can be as long as 65535 in this routine.

Then -- for no reason that I can fathom -- you use %subst() to copy the 
passed portion of 'string' to 'val'.  Since val is declared at 132A, you 
have the potential here to drop characters.  If the data happens to be 
larger than 132A, you'll lose everything beyond the 132.  That's 
foolish, and can cause you to get invalid junk back in your variables -- 
but it won't cause a crash.

But this line of code will:

          QDCXLATE( len
                  : val
                  : 'QTCPEBC' );

Note that you're passing 'len' in the first parameter.  'len' can be as 
large as 65535 -- but you are passing 'val' which can at maximum only be 
132 characters in the 2nd parameter.

That means that any time 'len' happens to be bigger than 132, you are 
lying to the QDCXLATE API!  In the particular example you sent me, the 
Restrictions element has 2290 bytes of data.  2290 is significantly 
larger than 132.  You say to the API "hey API, I have a variable that's 
2290 bytes long, and I need you to translate it from ASCII to EBCDIC".

The API doesn't know that you're lying.  It doesn't know that your 
variable is only 132 bytes long.  Remember, variables are just a 
convenience for people -- they don't really exist.  Under the covers, 
all you really have is data stored in the RAM banks of your computer at 
a particular address.  You've told the API, at address 2000 (for 
example) I have 2290 bytes that you can translate to EBCDIC for me.
Your actual variable occupies only 132 bytes, so it'd be from addresses 
2000-2131.  But the API thinks it occupies from 2000-4289.  It will go 
ahead and read the bytes from addresses 2132-4289, and run each one through 
the ASCII to EBCDIC table.  It doesn't know that your variable isn't 
really there.  So the question is...   what happens to be stored in your 
computer's memory banks in addresses 2132-4289?  (shrug)  I don't 
know... do you?   Whatever it is, you just radically changed the values 
of those bytes.  I suspect that at least part of it was a pointer, since 
the C routine suddenly found it had an invalid pointer.
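
To make the address arithmetic above concrete, here is a minimal sketch in 
C (the language Expat itself is written in, not RPG).  The struct and the 
function name overrun_range are made up for illustration; nothing here is 
part of Expat, HTTPAPI or QDCXLATE.  It only computes which addresses fall 
outside a variable when the caller claims a longer length than the 
variable really has.

```c
#include <assert.h>
#include <stddef.h>

/* Describes the bytes an API would touch beyond the caller's real variable.
   Hypothetical helper, for illustration only. */
struct overrun {
    size_t first;  /* first address past the end of the variable        */
    size_t last;   /* last address the API would touch                  */
    size_t count;  /* bytes outside the variable (0 means no overrun)   */
};

struct overrun overrun_range(size_t base, size_t var_size, size_t claimed_len)
{
    struct overrun r = {0, 0, 0};

    assert(var_size > 0);
    if (claimed_len > var_size) {
        r.first = base + var_size;         /* e.g. 2000 + 132  */
        r.last  = base + claimed_len - 1;  /* e.g. 2000 + 2290 - 1 */
        r.count = claimed_len - var_size;  /* e.g. 2290 - 132  */
    }
    return r;
}
```

With the numbers from this message (base 2000, a 132-byte variable, a 
claimed length of 2290) the helper reports addresses 2132 through 4289, 
which is 2158 bytes that do not belong to the variable.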

Anyway...  the fix for this is very easy.  Just increase the size of VAL 
to 65535, so you never have to worry about 'len' being larger than the 
length of 'val'.
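
Enlarging the variable is the fix suggested above.  A complementary guard 
is to never pass a length larger than the buffer you actually own.  The 
following is only a sketch of that idea in C, under stated assumptions: 
translate_in_place is a hypothetical stand-in for a table-driven 
translation API and does NOT reproduce the real QDCXLATE parameter list, 
and copy_and_translate is an invented name.  In RPG the same guard would 
amount to comparing 'len' against %size(val) before the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a table-driven translate API: it rewrites
   exactly `len` bytes at `buf` and trusts the caller's length completely. */
static void translate_in_place(unsigned char *buf, size_t len,
                               const unsigned char table[256])
{
    for (size_t i = 0; i < len; i++) {
        buf[i] = table[buf[i]];
    }
}

/* Copy at most dst_size bytes from src, then translate only what was
   copied.  Returns the number of bytes actually placed in dst, so the
   caller can detect truncation by comparing it with len. */
size_t copy_and_translate(unsigned char *dst, size_t dst_size,
                          const unsigned char *src, size_t len,
                          const unsigned char table[256])
{
    assert(dst != NULL && src != NULL && table != NULL);

    size_t n = len < dst_size ? len : dst_size;  /* never exceed the buffer */
    memcpy(dst, src, n);
    translate_in_place(dst, n, table);
    return n;
}
```

A return value smaller than 'len' means data was dropped, which is still 
a bug to fix, but it is a visible one rather than silent damage to 
whatever sits next to the variable in memory.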

However, there are a LOT of other problems with this program...

a) Expat is returning UTF-8, but you are treating it as ASCII.  That 
will work to an extent, because the basic ASCII characters have the same 
code point in UTF-8.  But if you have any international or unusual 
characters, you will get back garbage.

b) You are assuming that you'll get all character data for a given XML 
element in a single call to chardata.  That's not always true.  It's 
possible that it'll send you half (or some fraction) in the first call, 
and the rest in a second call to chardata.   Or, it's possible that 
it'll take 3-4 calls.   Don't make this assumption... accumulate the 
data on the stack until the 'end' routine is called, and THEN take it 
off the stack and put it into your variable.

c) You are concatenating the element value ('newval') onto the end of 
the element name in the stack array.  The stack array is only 50A!  That 
means that a name like 'Restrictions' which is already 12 long, can only 
have 37 bytes of data.  Why would you do that?  What is it gaining you 
to add the name & data together like that?

d) newval is also arbitrarily limited to 132 bytes.

e) You bring in the copy book for HTTPAPI_H, even though you don't use 
it.  But that's interesting, because it means you are using a POSITIVELY 
ANCIENT (i.e. more than 3 years out of date) version of HTTPAPI.  That's 
the only way Expat would be compiled to output UTF-8 instead of UTF-16. 
Current versions of HTTPAPI cannot work with UTF-8 data.

This was done very intentionally, since ILE RPG has native support for 
UTF-16 and UCS-2.  Therefore you don't have to resort to kludgy QDCXLATE 
calls or complex iconv() calls to translate the data.  You can just 
access it as Unicode data using RPG's native Unicode support.

f) You are using %trim() in an inefficient way that has the potential to 
lose blanks in the middle of a string.
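
Point (b) above, accumulating character data until the 'end' routine 
runs, can be sketched as follows.  This is C rather than RPG, and it is a 
simplified illustration, not the code HTTPAPI uses: it keeps one 
accumulator instead of a per-element stack, uses plain char where Expat 
uses XML_Char, and the names text_acc, on_chardata and on_end_element are 
invented.  The handler has the same shape as an Expat character data 
handler (user data, a pointer that is not null-terminated, and a length).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable buffer holding the text seen so far for the current element. */
struct text_acc {
    char  *data;
    size_t len;
    size_t cap;
};

/* May be called several times for one element: append, never overwrite. */
void on_chardata(void *user_data, const char *s, int len)
{
    struct text_acc *acc = (struct text_acc *)user_data;

    assert(acc != NULL);
    if (len < 1) {
        return;
    }
    size_t need = acc->len + (size_t)len;
    if (need > acc->cap) {
        size_t new_cap = acc->cap ? acc->cap : 256;
        while (new_cap < need) {
            new_cap *= 2;
        }
        char *p = (char *)realloc(acc->data, new_cap);
        if (p == NULL) {
            return;  /* out of memory: keep what was already collected */
        }
        acc->data = p;
        acc->cap  = new_cap;
    }
    memcpy(acc->data + acc->len, s, (size_t)len);
    acc->len += (size_t)len;
}

/* Called when the element ends: copy out the complete value (clamped to
   the size of the destination) and reset for the next element. */
size_t on_end_element(struct text_acc *acc, char *out, size_t out_size)
{
    size_t n = acc->len < out_size ? acc->len : out_size;

    if (n > 0) {
        memcpy(out, acc->data, n);
    }
    acc->len = 0;
    return n;
}
```

The value is only moved into the destination once the element has ended, 
so it does not matter whether the parser delivered the text in one call 
or in several.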

PLEASE consider using HTTPAPI's XML support instead of calling Expat 
directly.  I don't want to sound like a jerk but...   you are learning 
your lessons the hard way, and taking up a lot of my time in the 
process.  There's no reason for it!  HTTPAPI's routines are intended to 
make this easier for you so you don't have to understand all of this 
stuff...


Vladimir Vayntraub wrote:
> Hi Scott,
> Thanks for reply,
> I've attached simple program and xml file.
> 
> 
> -----Original Message-----
> From: ftpapi-bounces@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:ftpapi-bounces@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Scott Klement
> Sent: Wednesday, October 07, 2009 12:40 PM
> To: HTTPAPI and FTPAPI Projects
> Subject: Re: XML_Parse ended in error
> 
> Hello Vlad,
> 
> Vladimir Vayntraub wrote:
>> It takes a dump on statement 2433 in EXPAT in "C" Module "XMLPARSE.C" on
>> character data handling routine. 
> 
> In my copy of EXPAT, line 2433 is a comment.  Please don't post a line 
> number, that's not useful information.
> 
> Instead provide a program -- the simplest possible program -- that 
> demonstrates the problem you are having.  This should be a program that 
> I can take and run on my computer with a minimum of effort.
> 
> I've already tried parsing the XML file you attached to your e-mail, and 
> it parsed for me (in HTTPAPI) with absolutely no errors or problems.
> 
> 
>> Maybe there's a field size issue.
> 
> It's not a field size issue.  I've parsed fields that are megabytes long 
> in a single XML element.  Instead of taking random stabs at the problem, 
> please provide some real information.  I don't want to play 20 guesses. 
> Show me how to reproduce the problem.
> -----------------------------------------------------------------------
> This is the FTPAPI mailing list.  To unsubscribe, please go to:
> http://www.scottklement.com/mailman/listinfo/ftpapi
> -----------------------------------------------------------------------