
RE: Parser Problems



Tom, do you want to "remove a character" on that arbitrary block as you say,
or do you want to replace it, as Scott suggests?

Removal has additional considerations (there must be a length field to
adjust... or is termination based on a special character such as nul?).
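
For the removal case, here is a rough (untested) sketch of one way to go,
assuming the length is tracked separately rather than by a nul terminator.
It compacts the block in place and hands back the new length so the caller
can adjust its own length field.  The name removeInBlock and its parameters
are made up for illustration:

```rpgle
P removeInBlock   B
  // Remove every occurrence of one character from a block by
  // shifting the kept bytes left, in place.
  // Returns the new (shorter) length.  Bytes past that point are
  // left as they were, so the caller must use the returned length.
  // NOTE: removeInBlock and its parameter names are hypothetical.
D removeInBlock   PI            10I 0
D  startHere                      *   Value
D  blockLen                     10I 0 Value
D  removeChar                    1    Value

D src             S               *
D dst             S               *
D srcCh           S              1    Based(src)
D dstCh           S              1    Based(dst)
D i               S             10I 0
D newLen          S             10I 0 Inz(*Zero)

 /Free
  src = startHere ;
  dst = startHere ;
  For i = 1 to blockLen ;
     // Copy only the bytes we are keeping; dst trails src
     If srcCh <> removeChar ;
        dstCh = srcCh ;
        dst += 1 ;
        newLen += 1 ;
     EndIf ;
     src += 1 ;
  EndFor ;
  Return newLen ;
 /End-Free

P removeInBlock   E
```

The caller would then treat the returned value as the length of the data
from that point on.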
 
If you want to replace it, you could use strchr() to find the characters to
replace, or the more flexible memchr(), which takes an explicit length instead
of stopping at the first nul.  Here's an (untested) example of replacement
using memchr():

P replaceInBlock  B
  ///////////////////////////////////////////////////////////////////
  // Replace an arbitrary character in an arbitrary block with
  // another arbitrary character.
  // ----------------------------------------------------------------
  //    Parameters:
  //       1) Pointer to the beginning of the block to be searched
  //       2) Length of block within which to replace data
  //       3) Original character to be replaced (such as x'3F')
  //       4) Replace p3 with this character
  //
  // Returns the number of replacements that occurred
  ///////////////////////////////////////////////////////////////////
D replaceInBlock  PI            10I 0
D  startHere                      *   Value Options(*String)
D  blockLen                     10I 0 Value
D  replaceChar                   1    Value
D  newChar                       1    Value


D memchr          PR              *   extproc(*CWIDEN: 'memchr')
D  haystack                       *   Value Options(*String)
D  needle                        1    Value
D  stack_size                   10I 0 Value


D p               S               *
D ch              S              1    Based(p)
D endPtr          S               *
D remainSize      S             10I 0
D replaced        S             10I 0 Inz(*Zero)

 /Free
  remainSize = blockLen ;
  endPtr = startHere + remainSize -1 ;
  p = startHere ;
  DoW remainSize > *Zero ;
     p = memchr(p: replaceChar: remainSize) ;
     If (p = *Null) ;
        Leave ;
     EndIF ;
     ch = newChar ;
     replaced += 1 ;
     remainSize = endPtr - p ;
     p += 1 ;
  EndDO ;
  Return replaced  ;
 /End-Free

P replaceInBlock  E
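
A call to the procedure above might look something like this (equally
untested).  The field names buf and nFixed are only for illustration, and
the prototype simply mirrors the procedure interface shown above.  In the
real case you would pass the pointer and length of the data you already
have instead of %addr(buf) and %size(buf):

```rpgle
D replaceInBlock  PR            10I 0
D  startHere                      *   Value Options(*String)
D  blockLen                     10I 0 Value
D  replaceChar                   1    Value
D  newChar                       1    Value

D buf             S             20A
D nFixed          S             10I 0

 /Free
  // Build a small test block containing two x'3F' bytes
  buf = 'AB' + x'3F' + 'CD' + x'3F' ;

  // Replace each x'3F' with a blank, directly in buf's storage
  nFixed = replaceInBlock(%addr(buf): %size(buf): x'3F': ' ') ;

  // nFixed now holds the number of bytes that were changed
  *InLR = *On ;
 /End-Free
```

Passing a pointer (rather than the character field itself) matters here,
because the replacement has to happen in the original storage and not in a
temporary copy.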


Dennis Lovelady
http://www.linkedin.com/in/dennislovelady
--
Very funny, Scotty.  Now beam my clothes down too. 


> I believe I follow what you are saying about your thoughts on the "best
> resolution" and maybe that can be an enhancement sometime in the future for
> HTTPAPI.  In order for me to get past the problem at hand, if you could help
> me with a code example or an article reference for a routine to remove a
> character on any arbitrary size block of memory, I would certainly be
> appreciative.  Thanks for all your help on this.
> 
> Regards,
> -Tom
> 
> -----Original Message-----
> From: ftpapi-bounces@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:ftpapi-bounces@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Scott Klement
> Sent: Saturday, October 08, 2011 7:02 AM
> To: ftpapi@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Parser Problems
> 
> Hi Tom,
> 
> > Yes, I am parsing the inner XML document almost exactly as you describe
> > towards the end of your response below. I'm sure the X'3F' is the culprit,
> > as you found.  I use the http_XmlReturnPtr(*ON) technique because this
> > service many times will return data in excess of the 64K limit on a V5R4
> > machine.  In my mind, (and my inexperience with pointers), it seems the
> > pointers complicate my ability to %scan/%replace the X'3F' characters.
> 
> Yes, they do complicate it.
> 
> My thinking is that the best resolution would be for HTTPAPI to have an
> option to return responses in Unicode.  That way, when you do the first
> (outer) parse, the result can be in Unicode, and you won't lose data due
> to unicode/ebcdic translation (which is happening now, it's where the
> x'3F' is coming from.)
> 
> Then when you do the second parse, you could use EBCDIC output -- as you
> are now...  but the input would be Unicode.  Then the x'3F' could be
> used as designed, as a placeholder for untranslatable characters.
> 
> Ultimately, that's the problem:  Unicode supports about 100000 different
> characters, and any single-byte character encoding (including the EBCDIC
> you're using) can support -- at most -- 256.  Unless your program is
> written in pure Unicode, there's always going to be a risk that the
> Unicode will contain some sort of character that doesn't exist in EBCDIC
> (or ASCII for that matter.)
> 
> So, having an option for HTTPAPI's XML parser to return data in Unicode
> seems like a good idea.
> 
> 
> > I could do the entire XML document easily enough if it was less than
> > 64k.  But, once I hand off the inner XML document (pointer) to the
> > parser, It remains in the parsing loop until it finishes or chokes on
> > the X'3F'.  I suspect there is a solution I haven't thought of but it
> > remains elusive to me.  Scott, please enjoy your weekend with family
> > and pick this up next week if you have time.
> 
> I could write a routine that would take care of replacing x'3F' with a
> blank on any arbitrary size block of memory.  That would be easy enough
> to do...    But I'm not sure it's the best solution going forward.
> -----------------------------------------------------------------------
> This is the FTPAPI mailing list.  To unsubscribe, please go to:
> http://www.scottklement.com/mailman/listinfo/ftpapi
> -----------------------------------------------------------------------
> 