
Re: which example can i use to access this webpage



   Hi Tim and Scott,

   May I suggest that you combine Scott's example with the xmlReader in
   powerEXT Core, which reads HTML as XML?

   I have made a little example program that reads an HTML result page
   from the search on the site:

   [1]http://89.239.242.111:6382/pextcgiCOR/readhtml.pgm

   The only change needed to Scott's code is to change the second POST so
   that it stores the result in a temp IFS file.
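
   For readers who want to try the label/value idea outside RPG, here is a
   minimal Python sketch of the same kind of scan, using only the standard
   html.parser module. It assumes the cells arrive in the order label,
   spacer, value, as they do in the sample result page. The names LABELS,
   LabelValueParser and extract_pairs are hypothetical and chosen for
   illustration; they are not part of powerEXT or HTTPAPI.

```python
from html.parser import HTMLParser

# Labels as they appear in the sample result page (assumption: extend as needed).
LABELS = {"Sale Date:", "Owner:", "Sale Time:", "Lender:",
          "Location:", "Contact:", "Orig Prin Bal:"}


class LabelValueParser(HTMLParser):
    """Collect (label, value) pairs from <td> cells laid out as label, spacer, value."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.in_td = False
        self.cell = []       # text fragments of the cell being read
        self.pending = None  # label still waiting for its value
        self.skip = 0        # spacer cells left to skip before the value
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True
            self.cell = []

    def handle_data(self, data):
        if self.in_td:
            self.cell.append(data)

    def handle_endtag(self, tag):
        if tag != "td" or not self.in_td:
            return
        self.in_td = False
        # &nbsp; arrives as U+00A0; treat it as ordinary whitespace.
        text = "".join(self.cell).replace("\xa0", " ").strip()
        if self.pending is None:
            if text in LABELS:
                self.pending, self.skip = text, 1
        elif self.skip > 0:
            self.skip -= 1
        else:
            self.pairs.append((self.pending, text))
            self.pending = None


def extract_pairs(html_text):
    """Return the list of (label, value) pairs found in html_text."""
    parser = LabelValueParser()
    parser.feed(html_text)
    parser.close()
    return parser.pairs
```

   Calling extract_pairs on the saved page text yields the pairs in
   document order, so consecutive pairs belonging to one property stay
   together.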

   On Fri, May 18, 2012 at 2:35 AM, Scott Klement <[2]sk@xxxxxxxxxxxxxxxx>
   wrote:

     Okay. I've attached an example that I hope will point you in the
     right direction.
     This type of coding is hard, because this site isn't intended to be
     called by a computer program -- it's intended to be called by a web
     browser. Accessing a web site (as opposed to a web service)
     requires you to have a pretty strong knowledge of how a programmer
     wrote the web page. And, figuring out how to read the output is
     challenging, because the output is designed to dictate a screen
     layout, it's not designed to identify what each field is and what
     it's for (as would be the case with a web service.) So what
     you're looking for is possible, but it's hard. Not because of the
     tool, but because the site just wasn't meant to be used this way.
     But, the attached example does work. It's just harder than it
     would be if it were a web service.
     1) You connect to the initial web page, and it sets cookies that it
     uses to identify your browser session. HTTPAPI will manage the
     cookie for you -- but make sure you're running version 1.24 or
     newer, because there have been bugs fixed recently in the cookie
     support.
     2) You create a web form containing the fields in the <input> tags
     in the HTML. Web sites can potentially modify this stuff using
     JavaScript on the page, so the <input> tags are a good starting
     point, but you shouldn't rely on them 100%. Instead, use a tool
     like the "Live HTTP Headers" plugin for Firefox to see exactly
     what's sent/received, then copy that in HTTPAPI.
     3) After submitting the login form, the site receives your session
     cookie and your login credentials (user/pass) and validates them.
     Once that's done, it sets your session ID's status (stored in a
     file on the server) to "logged in". From here on, you must
     re-submit the cookie with each request, or it won't know you're
     logged in. That's okay, though, HTTPAPI manages the cookies and
     resubmits them as long as you're still running in the same
     activation group.
     4) The server redirects you to a new page by sending a 302 HTTP
     response, and a new URL. Your code can call http_redir_loc to get
     the new URL, and one of the http_get routines to follow the
     redirect. You'll see that in the sample code. I always like to
     limit the number of redirects to prevent the program getting stuck
     in a loop if the redirect points to another redirect, and so on.
     5) Submit the form containing the zip code query. I coded the
     program to take the zip code as a parameter and send it as a query.
     Again, I looked at the <input> html tags on the page, and used
     Live HTTP Headers to make sure I was sending the right things. The
     only thing that I made a variable is the zip code, and you supply it
     like this:
         CALL PGM(MYFIRTEST) PARM(71635)   (where 71635 is the zip code)
     6) Finally, the response is received (as an HTML document,
     explaining how to format data on the browser's screen) containing
     the list of foreclosures. I simply displayed the raw HTML on the
     screen -- I'll leave it up to you to figure out how to get the data
     you need out of that page (by %scan, %subst, etc)
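
     As a rough illustration of the scan-and-substring approach in step
     6, the Python sketch below finds a label, skips a given number of
     cells, and cuts out the text of the next cell. str.find plays the
     role of %scan and slicing plays the role of %subst. The helper
     names cell_after and all_values are hypothetical, and the sketch
     assumes no ">" appears inside a tag's attribute values.

```python
def cell_after(page, label, skip=1, start=0):
    """Return (text, position) of the cell found `skip` cells after `label`.

    Returns (None, -1) when the label or a following cell cannot be found.
    """
    pos = page.find(label, start)
    if pos < 0:
        return None, -1
    pos = page.find("</td>", pos)          # end of the label's own cell
    text = None
    for _ in range(skip + 1):
        if pos < 0:
            return None, -1
        open_start = page.find("<td", pos)  # next opening <td ...>
        if open_start < 0:
            return None, -1
        open_end = page.find(">", open_start)
        if open_end < 0:
            return None, -1
        close = page.find("</td>", open_end)
        if close < 0:
            return None, -1
        text = page[open_end + 1:close]     # the "substring" step
        pos = close
    return text.replace("&nbsp;", " ").strip(), pos


def all_values(page, label, skip=1):
    """Collect the value cell for every occurrence of `label` in the page."""
    values, pos = [], 0
    while True:
        value, pos = cell_after(page, label, skip, pos)
        if pos < 0:
            return values
        values.append(value)
```

     This kind of scanning is brittle: it depends on the exact cell
     order of the page, so any layout change on the site breaks it.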
     Good luck!
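
     The flow in steps 2 to 5 (encode the form, POST it, then follow
     redirects with an upper limit) can be sketched in Python as below.
     This is not HTTPAPI code. The transport is passed in as a callable
     named send, a hypothetical stand-in that is assumed to keep cookies
     between calls the way HTTPAPI does. FakeSite is an offline
     stand-in for the server, and the form field names "user" and
     "pass" are guesses, not the site's real field names.

```python
from urllib.parse import urlencode, urljoin

MAX_REDIRECTS = 5  # arbitrary limit chosen for this sketch


def post_form_following_redirects(send, url, fields, max_redirects=MAX_REDIRECTS):
    """POST a form, then follow redirect responses with GET requests.

    `send(method, url, body)` must return (status, headers_dict, body_text).
    Raises RuntimeError if more than `max_redirects` redirects are seen.
    """
    status, headers, body = send("POST", url, urlencode(fields))
    hops = 0
    while status in (301, 302, 303) and "Location" in headers:
        hops += 1
        if hops > max_redirects:
            raise RuntimeError("too many redirects")
        # The Location header may be relative to the current URL.
        url = urljoin(url, headers["Location"])
        status, headers, body = send("GET", url, None)
    return status, url, body


class FakeSite:
    """Offline stand-in for the server, so the sketch runs without a network."""

    def __init__(self):
        self.cookies = {}  # what a cookie-aware client would resend
        self.log = []      # every request seen, for inspection

    def send(self, method, url, body):
        self.log.append((method, url, body))
        if method == "POST" and url.endswith("/login.asp"):
            self.cookies["session"] = "abc123"
            return 302, {"Location": "search.asp"}, ""
        if url.endswith("/search.asp"):
            logged_in = self.cookies.get("session") == "abc123"
            return 200, {}, ("Search page" if logged_in else "Please log in")
        return 404, {}, ""
```

     Capping the number of hops means a redirect that points back at
     itself ends in an error instead of an endless loop.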

   On 5/17/2012 6:00 PM, [3]tim.dclinc@xxxxxxxxx wrote:

     The site in question is [4]http://www.myfir.com/myFir/login.asp
     You can use [5]tim.dclinc@xxxxxxxxx as user, and "password" as
     password.
     It's a public site that anyone can join. I just wanted to
     programmatically "check" the site.

     --------------------------------------------------------------------
     This is the FTPAPI mailing list. To unsubscribe, please go to:
     [6]http://www.scottklement.com/mailman/listinfo/ftpapi
     --------------------------------------------------------------------

   --
   Regards,
   Henrik Rützou
   [7]http://powerEXT.com

References

   1. http://89.239.242.111:6382/pextcgiCOR/readhtml.pgm
   2. mailto:sk@xxxxxxxxxxxxxxxx
   3. mailto:tim.dclinc@xxxxxxxxx
   4. http://www.myfir.com/myFir/login.asp
   5. mailto:tim.dclinc@xxxxxxxxx
   6. http://www.scottklement.com/mailman/listinfo/ftpapi
   7. http://powerext.com/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<HTML>
<HEAD>
<TITLE>Welcome to MyFir</TITLE>
<META NAME="description" CONTENT="">
<META NAME="keywords" CONTENT="">
<meta name="ROBOTS" content="ALL, INDEX, FOLLOW">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="/styles/myFir.css" type="text/css" >
<script src="/jscript/shared.js" type="text/javascript"> </script>
<base target="_top">

  <script language="Javascript">
  <!-- 
  //Frame Breaker code
  if (top.location != self.location) {
    top.location = self.location.href
  }
  //--> 
  </script>

<script type="text/javascript"> 

  var _gaq = _gaq || []; 
  _gaq.push(['_setAccount', 'UA-22949028-3']); 
  _gaq.push(['_trackPageview']); 

  (function() { 
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; 
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; 
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); 
  })(); 

</script> 
</head>
<body bgcolor="#FFFFFF" marginwidth="0" marginheight="0" topmargin="0" leftmargin="0">
<!--Dub:192.168.100.10-->
<table width="100%" border="0" cellpadding="0" cellspacing="0" background="/images/myfir/top_extender.gif">
  <tr> 
    <td width="363"><a href="http://www.myfir.com/index.asp"><img src="/images/myfir/myFir_logo.gif" width="363" height="43" border="0"></a></td>
    <td>&nbsp;</td>
  </tr>
</table>
<table width="100%" height="36" border="0" cellpadding="0" cellspacing="0" background="/images/myfir/header_background.gif">
  <tr> 
    <td width="12">&nbsp;</td>
    <td width="312" align="left" valign="bottom" class="mfText12"> 
      <a href="http://www.myfir.com/about.html" class="mfText12Grey">About FIR</a>&nbsp; | &nbsp;
      <a href="http://www.myfir.com/search.asp" class="mfText12Grey">Search</a>&nbsp; | &nbsp;
      <a href="http://www.myfir.com/myAccount.asp" class="mfText12Grey">My Account</a>&nbsp; |&nbsp;
      <a href="http://www.myfir.com/faq.html" class="mfText12Grey">FAQ</a> &nbsp; | &nbsp; 
      <a href="http://www.myfir.com/contact.html" class="mfText12Grey">Contact</a></td>
    <td width="162" align="left" valign="bottom" class="mfText12Grey">&nbsp;</td>
    <td align="left" valign="bottom"><img src="/images/myfir/slogan.gif" width="289" height="19"></td>
  </tr>
  <tr> 
    <td colspan="4"><img src="/images/myfir/spacer.gif" height="3"></td>
  </tr>
</table>
<table width="800" border="0" cellspacing="0" cellpadding="0">
  <tr>
    <td colspan='2'>
 
<br>
<table width="800" border="0" cellspacing="0" cellpadding="0">
  <tr> 
    <td width="12">&nbsp;</td>
    <td colspan="2" class="mfHeaderBrown"> <p class="mfUnderlineBrown">Search Results</p></td>
  </tr>
  <tr> 
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    <td width="300">&nbsp;</td>
  </tr>
  <tr> 

    <td width="12">&nbsp;</td>
    <td valign="top"><span class="mfText12BoldBrown">Results for: </span><span class="mfText12BoldBlue">Arkansas County</span><br>
      <span class="mfText12BoldBlue">
      
      <a href="courthouse_ar.asp">View Arkansas County Courthouses</a>
      
      </span> <br></td>
    <td width="300" valign="top" class="mfText12BoldBlue"><a href="javascript:openCustomWindow('http://www.realtytrac.com/database/noframes/agentSearch.asp','_blank',820,630,50,50,false,true)">Need an agent? Click Here.</a><br> <a href="javascript:openCustomWindow('http://www.realtytrac.com/finance/landing.aspx','_blank',820,630,50,50,false,true)">Get Pre-Qualified!  Click Here.</a><br><a href="javascript:openCustomWindow('http://www.realtytrac.com/gateway_co.asp?accnt=1258&password=myfir','_blank',820,630,50,50,false,true)">Search Nation-Wide Foreclosure Listings! Click Here.</a></td>
  </tr>
  <tr> 
    <td width="12" height="20">&nbsp;</td>
    <td colspan="2"><p class="mfUnderlineBrown">&nbsp;</p></td>
  </tr>
  
</table>
<br>
<table width="800" border="0" cellspacing="0" cellpadding="0">
		<tr> 
		  <td width="30">&nbsp;</td>
		  <td class="mfText12">To refine your search or to search another area: 
		    <input name="Submit" type="submit" class="mfButtonBlue" value="New Search" onClick="javascript:window.location.href='search.asp'"></td>
	    </tr>
		<tr> 
    <td width="12" height="20">&nbsp;</td>
    <td></td>
  </tr>
</table>
	<br>
  <table width="800" border="0" cellspacing="0" cellpadding="0">
 <!--MFSP_Retrieve_Properties 'ARAr','xxxx','',''-->
		<tr> 
		  <td width="12">&nbsp;</td>
		  <td colspan="3" class="mfText12BoldBlue">Arkansas County-Gillett<br>
		    707 Rose, 72055</td>
		  <td>&nbsp;</td>
		  <td width="15">&nbsp;</td>
		  <td>&nbsp;</td>
		</tr>
	
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Sale Date:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">6/6/2012</td>
	    <td align="right" class="mfText12Bold">Owner:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12"></td>
	  </tr>
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Sale Time:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">3:00PM</td>
	    <td align="right" class="mfText12Bold">Lender:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Wells Fargo Bank N.A.</td>
	  </tr>
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Location: </td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Courthouse</td>
	    <td align="right" class="mfText12Bold">Contact: </td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Wilson & Associates, P.L.L.C.</td>
	  </tr>
	  <tr>
	    <td>&nbsp;</td>
	    <td align="right" class="mfText12Bold">Reference Number:</td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">1292-222100</td>
	    <td align="right" class="mfText12Bold">Contact Phone:</td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">501-219-9388</td>
    </tr>
	  <tr>
	    <td>&nbsp;</td>
	    <td align="right" class="mfText12Bold">Orig Prin Bal: </td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">$75,950.00&nbsp;</td>
	    <td align="right" class="mfText12Bold">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
    </tr>
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Status:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Active&nbsp; </td>
	    <td align="right" class="mfText12Bold">&nbsp;</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
	  </tr>
	  
	  <tr> 
	    <td>&nbsp;</td>
	    <td align="right" class="mfText12">&nbsp;</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12"><br>
	      <a href="javascript:openCustomWindow('http://foreclosuresinar.com/nod/ret2.php?ref=1292-222100','Doc_Search',480,640,false,true)" class="mfText12BoldBlue">View Notice Of Default Document</a><br></td>
	    <td align="right" class="mfText12">&nbsp;</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
	  </tr>
	  
	  <tr> 
	    <td>&nbsp;</td>
	    <td colspan="6" align="right" class="mfText12"><p class="mfUnderlineBrown">&nbsp;</p></td>
	  </tr>
	
		<tr> 
		  <td width="12">&nbsp;</td>
		  <td colspan="3" class="mfText12BoldBlue">Arkansas County-Humphrey<br>
		    20 Shirleys Lane, 72073</td>
		  <td>&nbsp;</td>
		  <td width="15">&nbsp;</td>
		  <td>&nbsp;</td>
		</tr>
	
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Sale Date:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">7/11/2012</td>
	    <td align="right" class="mfText12Bold">Owner:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12"></td>
	  </tr>
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Sale Time:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">3:45PM</td>
	    <td align="right" class="mfText12Bold">Lender:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Mortgage Electronic Registration Systems, Inc., As A Separate Corporation That Is Acting Solely As A</td>
	  </tr>
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Location: </td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Courthouse</td>
	    <td align="right" class="mfText12Bold">Contact: </td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Wilson & Associates, P.L.L.C.</td>
	  </tr>
	  <tr>
	    <td>&nbsp;</td>
	    <td align="right" class="mfText12Bold">Reference Number:</td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">587-190105</td>
	    <td align="right" class="mfText12Bold">Contact Phone:</td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">501-219-9388</td>
    </tr>
	  <tr>
	    <td>&nbsp;</td>
	    <td align="right" class="mfText12Bold">Orig Prin Bal: </td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">$64,400.00&nbsp;</td>
	    <td align="right" class="mfText12Bold">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
    </tr>
	  <tr> 
	    <td width="12">&nbsp;</td>
	    <td align="right" class="mfText12Bold">Status:</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">Active&nbsp; </td>
	    <td align="right" class="mfText12Bold">&nbsp;</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
	  </tr>
	  
	  <tr> 
	    <td>&nbsp;</td>
	    <td align="right" class="mfText12">&nbsp;</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12"><br>
	      <a href="javascript:openCustomWindow('http://foreclosuresinar.com/nod/ret2.php?ref=587-190105','Doc_Search',480,640,false,true)" class="mfText12BoldBlue">View Notice Of Default Document</a><br></td>
	    <td align="right" class="mfText12">&nbsp;</td>
	    <td width="15" class="mfText12">&nbsp;</td>
	    <td class="mfText12">&nbsp;</td>
	  </tr>
	  
	  <tr> 
	    <td>&nbsp;</td>
	    <td colspan="6" align="right" class="mfText12"><p class="mfUnderlineBrown">&nbsp;</p></td>
	  </tr>
	
	</table>
	
<br>
<table width="800" border="0" cellspacing="0" cellpadding="0">
		<tr> 
		  <td width="30">&nbsp;</td>
		  <td class="mfText12">To refine your search or to search another area: 
		    <input name="Submit" type="submit" class="mfButtonBlue" value="New Search" onClick="javascript:window.location.href='search.asp'"></td>
	    </tr>
</table>

    </td>
  </tr>
 
</TABLE><BR>

<BR>
<TABLE width="800" border="0" cellspacing="0" cellpadding="0">
	 <TR>
    
    <TD width='100%'><P class="mfUnderlineBrown">&nbsp;</P><BR></TD>
  </TR>
  <TR> 
    <TD align="center"><SPAN class="mfText10Grey">&copy; 2012 Foreclosure 
      Investors Report, LLC <A href="http://www.myfir.com/contact.html" class="mfText10Grey">Contact 
      Us</A> | <A href="http://www.myfir.com/disclaimer.html" class="mfText10Grey">Disclaimer</A> | <A href="http://www.myfir.com/partners.html" class="mfText10Grey">Partners</A></SPAN></TD>
  </TR>
</TABLE>
</BODY>
</HTML>

      *=====================================================================
      *  Project   : powerEXT Core
      *  Title     : Read HTML as XML
      *  Build     : 2012.05.18
      *  Website   : powerext.com
      *=====================================================================
      /copy qsrc,pxapihdr      General H-Spec's

      * powerEXT API Connectors
      /copy qsrc,pxapicgicn    Basic HTTP connecter & Productivity Services

      * Internal Variables
     d td_on           s              1a   inz(*off)
     d td_result       s            200a   varying
     d i               s             10i 0

      /free
       // Clear Service Program & Response Object
       clearSrvPgm();
       setContent();

       xmlFromStmf('/htmltest.htm');
       xmlReaderInz(xmladdr():xmlsize());
       xmlReaderCase('L':*on); // the second parameter indicates HTML

       dow xmlReader = 0;

         select;
           when xmlGetNode = 'td' and xmlGetAttr = '';
             select;
               when td_on = *off;
                 if xmlGetData = 'Sale Date:'
                  or xmlGetData = 'Owner:'
                  or xmlGetData = 'Sale Time:'
                  or xmlGetData = 'Lender:'
                  or xmlGetData = 'Location:'
                  or xmlGetData = 'Courthouse:'
                  or xmlGetData = 'Orig Prin Bal:'
                  or xmlGetData = 'Contact:';
                   td_on = *on;
                   i = 1;
                   td_result = xmlGetData;
                  endif;
               when td_on = *on and i = 1;
                 // cell after the label is a spacer: skip it
                 i = 2;
               when td_on = *on and i = 2;
                 if td_Result = 'Sale Date:'
                  or td_Result = 'Owner:'
                  or td_Result = 'Sale Time:'
                  or td_Result = 'Lender:'
                  or td_Result = 'Location:'
                  or td_Result = 'Courthouse:'
                  or td_Result = 'Orig Prin Bal:'
                  or td_Result = 'Contact:';
                   echo('<br />');
                   echo(td_result);
                   echo(xmlGetData);
                 endif;
                 td_on = *off;
               other;
                 td_on = *off;
             endsl;
         endsl;
       enddo;

       echoToStmf('/htmltestresult.htm':1208);  // store new result or
       echoToClient(); // send it to a browser

       return;