Re: HTTPAPI - XPAT - encoding "ISO-8859-2"
Hello,
The crux of the problem is that Expat natively understands only the
following encodings: ISO-8859-1, UTF-8, UTF-16 and US-ASCII. Both Expat
and HTTPAPI have mechanisms that let you overcome this limitation, but
they require you to write extra code.
So I have two possible solutions:
a) The Expat solution. Instead of letting HTTPAPI do the XML parsing,
call the Expat routines directly. Expat supports an
"UnknownEncodingHandler". When Expat analyzes the document and
discovers that the encoding is ISO-8859-2, it'll call your unknown
encoding handler. You write that handler to translate the encoding to
Unicode for Expat to process.
This requires an in-depth knowledge of Expat, so it's somewhat
complicated. However, it has the advantage that Expat still does the
analysis of the file to determine the encoding, and therefore you don't
have to write a routine to determine the encoding prior to calling Expat.
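To give a feel for what such a handler has to provide: in Expat's C API the unknown encoding handler fills in a 256-entry table (the map field of XML_Encoding) that gives the Unicode code point for each single-byte value. The following is only a Python sketch of the contents of that table for ISO-8859-2, not an Expat handler itself. The function name latin2_map is made up for illustration, and it relies on Python's built-in "iso8859_2" codec.

```python
# Illustrative sketch only: builds the byte -> Unicode code point table
# that an unknown encoding handler would have to supply for ISO-8859-2.
# latin2_map is a hypothetical name, not part of Expat or HTTPAPI.

def latin2_map():
    """Return a 256-entry list; entry b is the code point for byte b."""
    return [ord(bytes([b]).decode("iso8859_2")) for b in range(256)]


table = latin2_map()

# Bytes below 0x80 are plain ASCII; the upper half is where ISO-8859-2
# differs from ISO-8859-1 (for example 0xF5 is U+0151, not U+00F5).
print(hex(table[0x41]), hex(table[0xF5]))
```

A real handler in C or RPG would copy values like these into the table Expat passes to it.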
b) The HTTPAPI solution. In this solution, you will use
http_parse_xml_stmf() to parse the XML document. However, in the 2nd
parameter to this API, you'll specify the CCSID of the data (instead of
using HTTP_XML_CALC).
To do that, you'll have to first download the XML data, then open it up
and read it to determine if the encoding is iso-8859-2. If it is,
you'll tell http_parse_xml_stmf() that the CCSID is 912 (which
corresponds to iso-8859-2). Otherwise, you can still use HTTP_XML_CALC
to let Expat figure out the appropriate encoding.
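The "read it to determine the encoding" step could look roughly like the sketch below. It is written in Python purely to show the logic; in practice you would do the same thing in RPG against the downloaded IFS file. The names declared_encoding and choose_ccsid are hypothetical, and returning None simply stands in for "pass HTTP_XML_CALC". It assumes the XML declaration is at the very start of the file and is in an ASCII-compatible encoding.

```python
import re
from typing import Optional

CCSID_ISO_8859_2 = 912  # CCSID corresponding to ISO-8859-2

# Matches the encoding pseudo-attribute of an XML declaration that sits
# at the very beginning of the data.
_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z0-9._\-]+)["\']'
)


def declared_encoding(head: bytes) -> Optional[str]:
    """Return the declared encoding in lower case, or None if absent."""
    m = _DECL.match(head)
    return m.group(1).decode("ascii").lower() if m else None


def choose_ccsid(head: bytes) -> Optional[int]:
    """Return 912 for ISO-8859-2; None means 'let Expat work it out'."""
    if declared_encoding(head) == "iso-8859-2":
        return CCSID_ISO_8859_2
    return None


sample = b'<?xml version="1.0" encoding="ISO-8859-2"?><doc/>'
print(choose_ccsid(sample))
```

Only the first few hundred bytes of the file need to be read for this check, since the declaration has to come first.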
What will actually happen under the covers in this solution: When
HTTPAPI reads your IFS file, it'll translate the data in the file from
CCSID 912 (iso-8859-2) to UTF-8. It will tell Expat that the data is in
UTF-8 format (so Expat will ignore the encoding in the file's header).
Then it will parse it as a normal UTF-8 document.
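The same idea can be demonstrated outside of HTTPAPI. The sketch below uses Python's xml.parsers.expat binding (not HTTPAPI's own code) to mimic those steps: convert the ISO-8859-2 bytes to UTF-8, then create the parser with an explicit encoding so that the declaration inside the document is overridden. The function name parse_latin2_as_utf8 is made up for illustration.

```python
import xml.parsers.expat


def parse_latin2_as_utf8(raw: bytes) -> str:
    """Transcode ISO-8859-2 bytes to UTF-8, parse, return the text."""
    # Step 1: translate the data from ISO-8859-2 to UTF-8.
    utf8 = raw.decode("iso8859_2").encode("utf-8")

    # Step 2: tell Expat the data is UTF-8. An encoding passed here
    # overrides the encoding named in the document's declaration.
    parser = xml.parsers.expat.ParserCreate(encoding="utf-8")

    # Step 3: parse as a normal UTF-8 document, collecting the text.
    chunks = []
    parser.CharacterDataHandler = chunks.append
    parser.Parse(utf8, True)
    return "".join(chunks)


doc = b'<?xml version="1.0" encoding="ISO-8859-2"?><n>\xf5</n>'
text = parse_latin2_as_utf8(doc)
print(ascii(text))
```

Note that the document still says ISO-8859-2 in its header, yet the parse succeeds because the parser was told up front what the data really is.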
Both solutions should work, though I haven't done much testing of
specifying a CCSID for http_parse_xml_stmf(), so I suggest that you get
the latest beta version of HTTPAPI from
http://www.scottklement.com/httpapi/beta and help me test it. If it
works for you, great. If not, then I'll need your help to get the bugs
out of it.
Thanks!
RUDAS István wrote:
>
> Unfortunately some of the incoming Files are encoded with
> "ISO-8859-2" instead of "ISO-8859-1": and the Tool gives a Returncode
> of "-1" with automatic terminating.
>
> It would be nice to hear any suggestions, [the option to overtype
> this manually in the incoming file is not acceptable cause it should
> run finally unattended, but thank you for thinking about it, the
> universe and all that kind of things].
-----------------------------------------------------------------------
This is the FTPAPI mailing list. To unsubscribe, please go to:
http://www.scottklement.com/mailman/listinfo/ftpapi
-----------------------------------------------------------------------