Re: Parser Problems
Hello,
On 10/7/2011 4:14 PM, James Lampert wrote:
> Hmm. Hex 93 in the WinDoze character set is "left double quote."
In which WinDoze character set?!
I'm guessing you mean Win9x, and assuming that you are using a US or
Western European character set -- in which case, that'd be CCSID 1252,
and x'93' is the left slanted double-quote ( “ ) character.
But, that's hardly the only character set in Windows!
> In the old IBM-PC hardware character set, it's a single vertical
> border line.
Or, in CCSID 437 (which was used on old IBM PCs) it'd be a lowercase o
with a circumflex on it. ( ô character ).
> In the old Macintosh character set, it's i-grave. In EBCDIC, it's a
> lower-case "l." For Xerox Ventura Publisher, I'd have to look it up, and
> I don't have the manuals here at work.
I think we can live without the "Xerox Ventura Publisher" data. If you
really like, you can go to the following link and look up x'93' in a
plethora of code pages:
http://www-01.ibm.com/software/globalization/ccsid/ccsid_registered.html
Not sure what that gains us, however. :-)
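For anyone who'd rather check than look it up, here is a rough Python sketch that decodes the single byte x'93' under a few of the code pages mentioned above. The codec names are Python's, and pairing them with CCSIDs in the labels is my assumption (the usual correspondence), not something taken from IBM's tables. The names BYTE, CODECS and decode_everywhere are made up for illustration.

```python
# Decode the single byte x'93' under several legacy code pages.
# Codec names are Python's; the CCSID labels are assumed equivalents.
BYTE = bytes([0x93])

CODECS = {
    "cp1252": "Windows Western (CCSID 1252)",
    "cp437": "original IBM PC (CCSID 437)",
    "mac_roman": "classic Macintosh",
    "cp037": "EBCDIC US/Canada (CCSID 37)",
}


def decode_everywhere(data: bytes) -> dict:
    """Return {codec name: decoded text} for every codec in CODECS."""
    return {name: data.decode(name) for name in CODECS}


results = decode_everywhere(BYTE)
for name, text in results.items():
    # Show the code point alongside the character so nothing is ambiguous.
    print(f"{name:10} {CODECS[name]:30} U+{ord(text):04X} {text!r}")
```

Same byte, four different characters, which is the whole point: x'93' means nothing until you know the code page.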
>
> It appears to have no meaning at all in any Unicode encoding, and that's
> presumably what's relevant to XML.
>
In Unicode, the slanted double-quote is U+201C. (U+0093 is a C1 control
code, not a quote.) So in UCS-2 or UTF-16 it's x'201C', and in UTF-8 it
would be x'e2809c', *not* x'93'. A lone x'93' is not valid UTF-8
(CCSID 1208). Nor is it a printable character in ISO-8859-1 (CCSID 819).
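The Unicode side can be checked the same way. This is only a sketch using Python's built-in codecs ("utf-8", "utf-16-be", "latin-1"), which I'm assuming line up with CCSID 1208, UTF-16 and CCSID 819 respectively. The helper try_decode and the variable names are hypothetical.

```python
import unicodedata

# The left slanted double-quote as a Unicode code point.
LEFT_QUOTE = "\u201c"

utf8 = LEFT_QUOTE.encode("utf-8")        # three bytes in UTF-8
utf16 = LEFT_QUOTE.encode("utf-16-be")   # two bytes, big-endian, no BOM


def try_decode(data: bytes, codec: str):
    """Return the decoded text, or None if the bytes are invalid."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# A lone x'93' is a continuation byte with nothing to continue.
lone_utf8 = try_decode(b"\x93", "utf-8")

# Python's latin-1 codec maps every byte, so this decodes,
# but the result is a control character rather than a quote.
latin1 = try_decode(b"\x93", "latin-1")
latin1_category = unicodedata.category(latin1)

print(utf8.hex(), utf16.hex(), lone_utf8, repr(latin1), latin1_category)
```

So the quote only shows up as x'93' in 1252. In the Unicode encodings it is a different byte sequence altogether.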
What's relevant to XML is what's specified in the
<?xml version="1.0" encoding="xyz"?> declaration at the start of the
document. If xyz=utf-8, then you're right, UTF-8 is what's relevant to
XML. However, if xyz=windows-1252, then it's CCSID 1252 that matters to XML.
The default, if no encoding is given, is ISO-8859-1 (CCSID 819) in
Expat. Though the XML specification itself calls for UTF-8 when no
encoding is declared (which is a better choice, IMHO). But x'93' isn't a
printable character in either of these, so if no <?xml encoding= is
given, this XML document is invalid and impossible to interpret correctly.
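To see the effect of the declaration, here is a small sketch using Python's xml.etree.ElementTree, which wraps Expat. Note the assumptions: Python's build treats undeclared input as UTF-8, which may not match the Expat port discussed here, and the <msg> document and the parse_text helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny document body containing the raw bytes x'93' and x'94'.
BODY = b"<msg>\x93hi\x94</msg>"


def parse_text(document: bytes):
    """Return the root element's text, or None if parsing fails."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


# With the declaration, the parser knows x'93' is a 1252 quote.
declared = parse_text(
    b'<?xml version="1.0" encoding="windows-1252"?>' + BODY
)

# Without it, this parser assumes UTF-8 and rejects the lone x'93'.
undeclared = parse_text(BODY)

print(repr(declared), repr(undeclared))
```

Either way, the fix belongs in the document: whoever produces the XML needs to declare the encoding they actually used.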
-----------------------------------------------------------------------
This is the FTPAPI mailing list. To unsubscribe, please go to:
http://www.scottklement.com/mailman/listinfo/ftpapi
-----------------------------------------------------------------------