
Re: Non english characters and http_url_get (Scott Klement)



   Hello Scott,
   thank you for your answer. "callp     HTTP_SetFileCCSID(1208)" solved my
   problem!
   Thanks.
   Matthias

     Message: 6
     Date: Fri, 28 Mar 2008 22:04:45 -0500
     From: Scott Klement <[1]sk@xxxxxxxxxxxxxxxx>
     Subject: Re: Non english characters and http_url_get
     To: HTTPAPI and FTPAPI Projects <[2]ftpapi@xxxxxxxxxxxxxxxxxxxxxx>
     Message-ID: <[3]47EDB1CD.4010207@xxxxxxxxxxxxxxxx>
     Content-Type: text/plain; charset=ISO-8859-1; format=flowed
     Hello Matthias,

     I have to admit that I'm a little confused by your example URL.  That
     URL points to a web page that's heavy on JavaScript and HTML.  I'm
     having a hard time conceiving of how that page would be useful in
     HTTPAPI...  it seems like that'd only be useful from a web browser.

     I've tried the URL, and no matter how I try, I can't get the words
     "Unter Führung" (either spelled correctly or incorrectly) to come up in
     the URL.  It appears that I'd have to write a JavaScript parser to get
     those words.

     Having said that... your example shows one correct character being
     replaced with two invalid ones.  In my experience, that means that
     you're trying to view a UTF-8 encoded document as if it's ASCII -- or
     you're telling the system to translate it from ASCII to EBCDIC -- and
     since it's not ASCII, but UTF-8, you have problems.
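
The "one character becomes two" symptom described above can be reproduced outside of HTTPAPI. The following is a minimal Python sketch, not RPG and not part of HTTPAPI; the function name and the sample string are assumptions chosen for illustration. It encodes the text as UTF-8 and then decodes the bytes as ISO-8859-1, which is roughly what happens when a UTF-8 document is treated as CCSID 819.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    raw = text.encode("utf-8")
    return raw.decode("iso-8859-1")


original = "Unter F\u00fchrung"  # "Unter Führung"; \u00fc is the u-umlaut
garbled = misread_utf8_as_latin1(original)

# In UTF-8 the u-umlaut occupies two bytes (0xC3 0xBC). ISO-8859-1 maps
# every byte to exactly one character, so those two bytes show up as two
# unrelated characters instead of one.
print(original.encode("utf-8"))
print(garbled)

# Reading the same bytes with the correct encoding restores the text.
fixed = garbled.encode("iso-8859-1").decode("utf-8")
print(fixed)
```

Declaring the correct encoding up front, as with CCSID 1208 in this thread, avoids the wrong decode in the first place.
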
     With HTTPAPI and http_url_get(), the solution is to tell HTTPAPI to
     mark the file you're downloading as UTF-8 (CCSID 1208) instead of the
     default of ISO-8859-1 (a flavor of ASCII, CCSID 819).
     To do that, put this code before your http_url_get():
                  callp     HTTP_SetFileCCSID(1208)
     If the document isn't UTF-8, but is something else, then you need to
     supply the correct CCSID for whatever it is.  HTTPAPI doesn't know what
     the document is, so if you don't tell it otherwise, it'll default to
     CCSID 819.
     Matthias Schatte wrote:
     >    Hello,
     >    I have to download an HTML page with special German characters. I use
     >    the procedure http_url_get.
     >    For example: [1]http://www.heise.de/pda/newsticker/m105633.html
     >    "Unter Führung" <- this is right.
     >    With http_url_get I get "Unter F??hrung"
     >    What can I change in my RPG program?
     >    --
     >    Matthias
     >
     > References
     >
     >    1. [4]http://www.heise.de/pda/newsticker/m105633.html
     >
     > -----------------------------------------------------------------------
     > This is the FTPAPI mailing list.  To unsubscribe, please go to:
     > [5]http://www.scottklement.com/mailman/listinfo/ftpapi
     > -----------------------------------------------------------------------
     ------------------------------
     -----------------------------------------------------------------------
     This is the FTPAPI mailing list digest.  To unsubscribe, go to:
     [6]http://www.scottklement.com/mailman/listinfo/ftpapi
     -----------------------------------------------------------------------
     End of Ftpapi Digest, Vol 21, Issue 10
     **************************************

References

   1. mailto:sk@xxxxxxxxxxxxxxxx
   2. mailto:ftpapi@xxxxxxxxxxxxxxxxxxxxxx
   3. mailto:47EDB1CD.4010207@xxxxxxxxxxxxxxxx
   4. http://www.heise.de/pda/newsticker/m105633.html
   5. http://www.scottklement.com/mailman/listinfo/ftpapi
   6. http://www.scottklement.com/mailman/listinfo/ftpapi