
Re: Parser Problems



Hello,

On 10/7/2011 4:14 PM, James Lampert wrote:
> Hmm. Hex 93 in the WinDoze character set is "left double quote."

In which WinDoze character set?!

I'm guessing you mean Win9x, and assuming that you are using a USA or 
Western European character set -- in which case, that'd be CCSID 1252, 
and x'93' is the left double quotation mark ( “ ).

But, that's hardly the only character set in Windows!

> In the old IBM-PC hardware character set, it's a single vertical
> border line.

Or, in CCSID 437 (which was used on old IBM PCs) it'd be a lowercase o 
with a circumflex on it.  ( ô character ).


> In the old Macintosh character set, it's i-grave. In EBCDIC, it's a
> lower-case "l." For Xerox Ventura Publisher, I'd have to look it up, and
> I don't have the manuals here at work.

I think we can live without the "Xerox Ventura Publisher" data.  If you'd 
really like to, you can go to the following link and look up x'93' in a 
plethora of code pages:
http://www-01.ibm.com/software/globalization/ccsid/ccsid_registered.html

Not sure what that gains us, however.  :-)
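
For anyone who would rather check these values than look them up, here is a 
minimal Python sketch. Python and the codec names (cp1252, cp437, mac_roman, 
cp037) are my own assumptions for illustration; they are Python's closest 
equivalents to the code pages mentioned above, not anything used in this 
thread.

```python
# Decode the single byte x'93' under several code pages.
# The codec names are assumed Python equivalents of the code pages above.
BYTE = b"\x93"

CODE_PAGES = {
    "cp1252": "Windows Western European (CCSID 1252)",
    "cp437": "original IBM PC (CCSID 437)",
    "mac_roman": "classic Macintosh",
    "cp037": "EBCDIC US/Canada (CCSID 37)",
}


def decode_x93():
    """Return {codec name: the character that x'93' maps to}."""
    return {name: BYTE.decode(name) for name in CODE_PAGES}


if __name__ == "__main__":
    for name, char in decode_x93().items():
        print(f"{name:10} {CODE_PAGES[name]:40} U+{ord(char):04X} {char}")
```

The same byte comes out as four different characters, which is why the 
declared encoding matters more than any one table.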


>
> It appears to have no meaning at all in any Unicode encoding, and that's
> presumably what's relevant to XML.
>

In Unicode, the slanted double-quote is U+201C; U+0093 itself is an 
unprintable control code.  So UCS-2 or UTF-16 x'201C' is the slanted 
quote.  In UTF-8, it would be x'e2809c', *not* x'93'.  A lone x'93' is 
not a valid character in UTF-8 (CCSID 1208), and in ISO-8859-1 
(CCSID 819) it is that same unprintable control code, not a quote.
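
These byte values are easy to verify. The sketch below is only an 
illustration in Python (my choice of language, with hypothetical helper 
names); it shows how the left double quotation mark, U+201C, is encoded, and 
what a lone x'93' turns into under UTF-8 and ISO-8859-1.

```python
import unicodedata

LEFT_QUOTE = "\u201c"  # LEFT DOUBLE QUOTATION MARK


def utf8_hex(char):
    """Hex string of the UTF-8 bytes for one character."""
    return char.encode("utf-8").hex()


def is_valid_utf8(data):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_category(data):
    """Unicode general category of a single byte read as ISO-8859-1."""
    return unicodedata.category(data.decode("latin-1"))


if __name__ == "__main__":
    print(utf8_hex(LEFT_QUOTE))                 # UTF-8 bytes of the quote
    print(LEFT_QUOTE.encode("utf-16-be").hex()) # UTF-16 code unit
    print(is_valid_utf8(b"\x93"))               # lone x'93' as UTF-8
    print(latin1_category(b"\x93"))             # "Cc" means control code
```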

What's relevant to XML is what's specified in the
<?xml version="1.0" encoding="xyz"?> declaration at the start of the 
document.  If xyz=utf-8, then you're right, UTF-8 is what's relevant to 
XML.  However, if xyz=windows-1252, then it's CCSID 1252 that matters.

The default, if no encoding is given, is ISO-8859-1 (CCSID 819) in 
Expat, though the XML specification itself defaults to UTF-8 (which is a 
better choice, IMHO).  But x'93' isn't a printable character in either 
of these, so if no <?xml encoding= is given, the x'93' in this XML 
document cannot be interpreted as the quote it was meant to be.
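
To see the effect of the declaration, here is a rough sketch using Python's 
standard xml.etree.ElementTree. That module is built on Expat, but it is not 
necessarily configured like the Expat port discussed on this list, so treat 
its behaviour as an illustration only. The <msg> document and the helper 
name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document body containing x'93' and x'94' (1252 curly quotes).
BODY = b"<msg>\x93quoted\x94</msg>"

# The same body with an explicit windows-1252 declaration in front.
DECLARED = b'<?xml version="1.0" encoding="windows-1252"?>' + BODY


def try_parse(document):
    """Return the root element's text, or None if the parser rejects it."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(repr(try_parse(DECLARED)))  # quotes decoded via windows-1252
    print(repr(try_parse(BODY)))      # no declaration: parser rejects x'93'
```

With the declaration, the parser maps x'93' to the curly quote. Without it, 
this particular parser assumes UTF-8 and refuses the document.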
-----------------------------------------------------------------------
This is the FTPAPI mailing list.  To unsubscribe, please go to:
http://www.scottklement.com/mailman/listinfo/ftpapi
-----------------------------------------------------------------------