
RE: Memory leak in EXPAT?



I would love to set up an environment so that the issue can be easily reproduced on a different system.  Alas, the application is highly proprietary and I just don't know of any way to reproduce it externally.

To stay realistic, here is the chronology so far:

1. 	An application program continuously reads a web service.  Each XML response is about 56K bytes.  We believe that the web service occasionally sends malformed XML.  (Sometimes the document stops abruptly, and any open tags are never closed.)  We believe that each time EXPAT gets invalid XML, a small amount of memory is consumed and never freed.  After 11 to 13 hours, all heap space is exhausted and the program crashes.

2.	Using Advanced Job Scheduler, we stop and then re-start the job every 8 hours.  Now the program runs without crashing (however there are three times per day, approx 15 minutes each time, when we don't monitor the service even though we'd like to monitor it 24/7).

3.	We changed the application to use IBM's XML Parser (XML-INTO).  We still get occasional parsing errors when the XML document is incomplete.

4.	We observed that the XML errors seem more frequent when we receive multiple responses during the same clock-second.  We changed the program so that it waits at least one full second between requests.  This seemed to reduce the number of errors; however, the parsing errors still occur.  We did NOT run this program until it croaked, although that kind of experiment could be insightful.

5.	We changed the program to make sure that the very last XML tag is present.  If that last tag is missing, then we ignore the document because we know that the XML is malformed and it will not parse properly.  Yet we still get occasional parsing errors.  The XML appears to contain occasional control characters such as LF (Line Feed) and occasional random < and > characters.

6.	We can filter out the control characters (CR, LF, etc.) and see what happens next.

7.	According to the vendor, we're their only customer having issues with the web service.  Except for the fact that the program crashes after 11 to 13 hours, we would be totally unaware of any issues.

8.	Our infrastructure team is checking to determine if there's any kind of "noise" in the line which might corrupt XML responses from a website.  The program reliably croaks after running continuously for 11 to 13 hours (11 hours on Production, 13 hours on Development).  We believe that the program croaks after a finite number of bad XML responses, and that this number is achieved sooner on Production, because the Production CPU is faster than the Development CPU.

9.	A completely separate application monitors a different web service 24/7, and that application works perfectly.  However, the flawless web service is in JSON while the croaky one is in XML.

10.	The troublesome service offers JSON as an option.  Our next step will be to change the program to request and receive the info in JSON, to see if this solves the issues.
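
To make steps 5 and 6 concrete, here is a rough sketch of the kind of pre-check I have in mind.  It is written in Python only because its standard library ships an Expat binding; our real program is not Python, and the function names and the sample closing tag are made up for illustration.  One thing the sketch makes visible: LF and CR are legal in XML, so filtering them may not cure the parse errors by itself, whereas a stray "<" inside the data is a genuine well-formedness error.

```python
import re
import xml.parsers.expat

# Control characters that XML 1.0 does not allow at all.
# Tab (0x09), LF (0x0A) and CR (0x0D) are legal and are left alone here.
_ILLEGAL_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_control_chars(text, also_line_breaks=False):
    """Remove illegal control characters; optionally remove CR/LF too."""
    cleaned = _ILLEGAL_CONTROL.sub("", text)
    if also_line_breaks:
        cleaned = cleaned.replace("\r", "").replace("\n", "")
    return cleaned


def ends_with_tag(text, closing_tag):
    """Cheap truncation check: is the expected last tag present?"""
    return text.rstrip().endswith(closing_tag)


def is_well_formed(text):
    """Full check: let Expat parse the document and report any error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True = this is the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True
```

The last-tag check is cheap but only catches truncation.  A document can end with the right tag and still fail to parse, which matches what we are seeing.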

Bottom line:  The evidence suggests that invalid XML can cause EXPAT to consume (not free up) a small amount of memory.  After a large number of requests with invalid XML, EXPAT runs out of heap space.
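
If it helps with reproducing this outside our application, the theory could be tested with something like the sketch below.  Again it is Python and purely illustrative: the document layout, element names, and iteration count are invented, and tracemalloc only reports memory that goes through Python's own allocator, so whether Expat's internal buffers show up in the number is an assumption.  The same loop against the C library, watched with the platform's own heap tools, would be the real test.

```python
import tracemalloc
import xml.parsers.expat


def make_document(n_quotes=90):
    """Build a made-up document with roughly the shape of our responses."""
    items = "".join(
        "<quote><symbol>SYM%d</symbol><price>%d.25</price></quote>" % (i, i)
        for i in range(n_quotes)
    )
    return "<quotes>" + items + "</quotes>"


def parse_once(text):
    """Parse with a fresh parser; return True if Expat reported an error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)
        return False
    except xml.parsers.expat.ExpatError:
        return True


def run_trial(iterations=2000):
    """Parse the same truncated document repeatedly.

    Returns (number of parse errors, growth in traced bytes).
    """
    doc = make_document()
    bad = doc[: len(doc) // 2]  # cut off mid-document, like our bad responses
    parse_once(bad)  # warm-up, so one-time allocations are not counted
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    errors = sum(parse_once(bad) for _ in range(iterations))
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return errors, after - before
```

If the growth figure stays flat as the iteration count goes up, the leak is more likely in the calling code than in the parser.  In the C API a parser created with XML_ParserCreate has to be released with XML_ParserFree, so an error path that returns before that call would produce exactly this kind of slow loss.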

The best solution is probably to correct the incoming data stream, so that the XML is always proper.  Still, it's certainly interesting that a sufficient number of malformed XML messages seems to eventually crash EXPAT.


Nasser Shukayr
   I.T. Application Development Team Lead
Heartland Co-op



-----Original Message-----

Date: Sat, 9 Apr 2016 12:14:53 -0500
From: Scott Klement <sk@xxxxxxxxxxxxxxxx>
To: HTTPAPI and FTPAPI Projects <ftpapi@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Memory leak in EXPAT?

Nasser,

There isn't just one place where it allocates memory; there are many places.  The little bit of information you provided points to the IBM QC2ALLOC routine, which is part of the ILE environment used by all ILE programs on the system, so that isn't specific enough.

I really need to know how to reproduce the problem.

-SK


On 4/8/2016 9:04 AM, Nasser Shukayr wrote:
> Thank you for the rapid reply!
>
> The heap space is exceeded when EXPAT tries to allocate a new buffer.
>
> MCH6903:
>      To module . . . . . . . . . :   QC2ALLOC
>      To procedure  . . . . . . . :   do_malloc_default__FUL
>      Statement . . . . . . . . . :   3
>      Message . . . . :   The heap space has reached its maximum allowable size.
>
> I believe that EXPAT allocates resources when it starts to parse, then frees up those resources after it finishes.
>
> Here are two possible theories for running out of heap space:
>
> 1. Theory:  every call to EXPAT depletes (does not free up) a few bytes.  After many thousands of calls, all available heap space is exhausted.
>
> or
>
> 2. Background:  The XML document (from a web service) could occasionally contain malformed XML.  The document is approx. 65K bytes and has info on about 90 different commodity price quotes.  On rare occasions in debug, I observed documents which end abruptly.  Some closing tags, and sometimes half of a data element, are missing.  Theory:  When an invalid document gets passed to EXPAT (and it errors out), it does not always free up all the allocated memory.  It happens every now and then.  After 13 hours, it happens often enough to exhaust all available heap space.
>
> On our production system, the program runs about 12 and a half hours (give or take 45 minutes) until it croaks.  On Development, it runs about an hour longer, i.e. 13 and a half hours, plus or minus 45 minutes.  Measurements were made during an 18-day period.  I believe that development is less active than production.
>
>
> Nasser Shukayr
>      I.T. Application Development Team Lead
>     http://www.heartlandcoop.com
>     2829 Westown Parkway, Suite 350
>     West Des Moines, Iowa 50266
> NShukayr@xxxxxxxxxxxxxxxxx
> Office: 515.309.3857
>
>
> -----Original Message-----
>
> Message: 1
> Date: Thu, 7 Apr 2016 20:01:22 +0000
> From: Nasser Shukayr <nshukayr@xxxxxxxxxxxxxxxxx>
> To: "ftpapi@xxxxxxxxxxxxxxxxxxxxxx" <ftpapi@xxxxxxxxxxxxxxxxxxxxxx>
> Subject: Memory leak in EXPAT?
> 	
> Setup:  Program queries a web service constantly, receiving XML response.  Response is parsed with EXPAT.  Program runs most hours of the day.
> Issue:  After 13 hours, the program croaks with MCH6903 (The heap space has reached its maximum allowable size).  (It's not always exactly 13 hours; it varies between 11:45 and 15:45.)
> Temporary solution:  We changed the program so that after about 8 hours, it exits and then re-starts as a brand-new copy.
>
> Why this is not ideal:  We really need to monitor the web service continuously, even during the few seconds needed to exit and restart the program.
>
> Has anyone else had a problem with running out of heap space after 13 hours of continuously using EXPAT?
>
> Many thanks,
>
>
> Nasser Shukayr
>      I.T. Application Development Team Lead
>     http://www.heartlandcoop.com
>     West Des Moines, Iowa 50266
> NShukayr@xxxxxxxxxxxxxxxxx
>
> Message: 3
> Date: Thu, 7 Apr 2016 16:33:39 -0500
> From: Scott Klement <sk@xxxxxxxxxxxxxxxx>
> To: HTTPAPI and FTPAPI Projects <ftpapi@xxxxxxxxxxxxxxxxxxxxxx>
> Subject: Re: Memory leak in EXPAT?
>
> Hello Nasser,
>
> This is the first time I can remember seeing this problem.
>
> So to reproduce it, I should create an XML document, and parse it
> repeatedly for 13 hours?   Does it matter what is in the document? Do I
> have to make an HTTP request each time, or is just parsing the XML sufficient?
>
> -SK
-----------------------------------------------------------------------
This is the FTPAPI mailing list.  To unsubscribe, please go to:
http://www.scottklement.com/mailman/listinfo/ftpapi
-----------------------------------------------------------------------