Re: which example can i use to access this webpage
Hi Tim and Scott,
May I suggest that you combine Scott's example with the xmlReader in
powerEXT Core, which can read HTML as XML?
I have made a little example program that reads an HTML result page from
the search on the site:
[1]http://89.239.242.111:6382/pextcgiCOR/readhtml.pgm
The only change needed to Scott's code is in the second POST, so that
it stores the result in a temporary IFS file.
On Fri, May 18, 2012 at 2:35 AM, Scott Klement <[2]sk@xxxxxxxxxxxxxxxx>
wrote:
Okay. I've attached an example that I hope will point you in the
right direction.
This type of coding is hard, because this site isn't intended to be
called by a computer program -- it's intended to be called by a web
browser. Accessing a web site (as opposed to a web service)
requires you to have a pretty strong knowledge of how a programmer
wrote the web page. And figuring out how to read the output is
challenging, because the output is designed to dictate a screen
layout; it's not designed to identify what each field is and what
it's for (as would be the case with a web service). So what
you're looking for is possible, but it's hard. Not because of the
tool, but because the site just wasn't meant to be used this way.
But the attached example does work. It's just harder than it
would be if it were a web service.
1) You connect to the initial web page, and it sets cookies that it
uses to identify your browser session. HTTPAPI will manage the
cookies for you -- but make sure you're running version 1.24 or
newer, because there have been bugs fixed recently in the cookie
support.
2) You create a web form containing the fields in the <input> tags
in the HTML. Web sites can potentially modify this stuff using
JavaScript on the page, so the <input> tags are a good starting
point, but you shouldn't rely on them 100%. Instead, use a tool
like the "Live HTTP Headers" plugin for Firefox to see exactly
what's sent/received, then copy that in HTTPAPI.
3) After submitting the login form, the site receives your session
cookie and your login credentials (user/pass) and validates them.
Once that's done, it sets your session ID's status (stored in a
file on the server) to "logged in". From here on, you must
re-submit the cookie with each request, or it won't know you're
logged in. That's okay, though: HTTPAPI manages the cookies and
resubmits them as long as you're still running in the same
activation group.
4) The server redirects you to a new page by sending a 302 HTTP
response and a new URL. Your code can call http_redir_loc to get
the new URL, and one of the http_get routines to follow the
redirect. You'll see that in the sample code. I always like to
limit the number of redirects to prevent the program getting stuck
in a loop if the redirect points to another redirect, and so on.
5) Submit the form containing the zip code query. I coded the
program to take the zip code as a parameter and send it as a query.
Again, I looked at the <input> HTML tags on the page, and used
Live HTTP Headers to make sure I was sending the right things. The
only thing that I made a variable is the zip code, and you supply it
like this:
    CALL PGM(MYFIRTEST) PARM(71635)   (where 71635 is the zip code)
6) Finally, the response is received (as an HTML document,
explaining how to format data on the browser's screen) containing
the list of foreclosures. I simply displayed the raw HTML on the
screen -- I'll leave it up to you to figure out how to get the data
you need out of that page (by %scan, %subst, etc.)
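As a rough illustration of steps 2 through 5 (form encoding plus a capped redirect loop), here is a minimal Python sketch. It is not the attached HTTPAPI program: `fetch` is a hypothetical callable standing in for whatever HTTP client you use, and the cap of 5 redirects is an assumed value, since no number is given above. A real client would also need a cookie jar (for example `http.cookiejar.CookieJar` with `urllib.request.HTTPCookieProcessor`) so the session cookie from step 1 is resent on every request.

```python
from urllib.parse import urlencode, urljoin

MAX_REDIRECTS = 5  # assumed cap; the post does not say which limit was used


def build_form(fields):
    """URL-encode form fields the way a browser submits an HTML form."""
    return urlencode(fields).encode("ascii")


def fetch_following_redirects(fetch, url, data=None,
                              max_redirects=MAX_REDIRECTS):
    """Request `url`, following redirects up to `max_redirects` times.

    `fetch` is a hypothetical callable: fetch(url, data) must return a
    (status, location, body) tuple, where `location` is the redirect
    target or None.
    """
    for _ in range(max_redirects + 1):
        status, location, body = fetch(url, data)
        if status not in (301, 302, 303) or not location:
            return url, status, body
        # Resolve relative targets against the URL we just requested.
        url = urljoin(url, location)
        # After a 301/302/303, browsers re-request the new URL with GET.
        data = None
    raise RuntimeError(f"gave up after {max_redirects} redirects")
```

Called as `fetch_following_redirects(fetch, login_url, build_form(fields))`, it returns the final URL, status and body, or raises once the cap is exceeded, which is the "don't get stuck in a loop" safeguard from step 4.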
Good luck!
On 5/17/2012 6:00 PM, [3]tim.dclinc@xxxxxxxxx wrote:
The site in question is [4]http://www.myfir.com/myFir/login.asp
you can use [5]tim.dclinc@xxxxxxxxx as user, and "password" as
password.
It's a public site that anyone can join... I just wanted to
programmatically "check" the site.
-----------------------------------------------------------------------
This is the FTPAPI mailing list. To unsubscribe, please go to:
[6]http://www.scottklement.com/mailman/listinfo/ftpapi
-----------------------------------------------------------------------
--
Regards,
Henrik Rützou
[7]http://powerEXT.com
References
1. http://89.239.242.111:6382/pextcgiCOR/readhtml.pgm
2. mailto:sk@xxxxxxxxxxxxxxxx
3. mailto:tim.dclinc@xxxxxxxxx
4. http://www.myfir.com/myFir/login.asp
5. mailto:tim.dclinc@xxxxxxxxx
6. http://www.scottklement.com/mailman/listinfo/ftpapi
7. http://powerext.com/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<TITLE>Welcome to MyFir</TITLE>
<META NAME="description" CONTENT="">
<META NAME="keywords" CONTENT="">
<meta name="ROBOTS" content="ALL, INDEX, FOLLOW">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="/styles/myFir.css" type="text/css" >
<script src="/jscript/shared.js" type="text/javascript"> </script>
<base target="_top">
<script language="Javascript">
<!--
//Frame Breaker code
if (top.location != self.location) {
top.location = self.location.href
}
//-->
</script>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-22949028-3']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body bgcolor="#FFFFFF" marginwidth="0" marginheight="0" topmargin="0" leftmargin="0">
<!--Dub:192.168.100.10-->
<table width="100%" border="0" cellpadding="0" cellspacing="0" background="/images/myfir/top_extender.gif">
<tr>
<td width="363"><a href="http://www.myfir.com/index.asp"><img src="/images/myfir/myFir_logo.gif" width="363" height="43" border="0"></a></td>
<td> </td>
</tr>
</table>
<table width="100%" height="36" border="0" cellpadding="0" cellspacing="0" background="/images/myfir/header_background.gif">
<tr>
<td width="12"> </td>
<td width="312" align="left" valign="bottom" class="mfText12">
<a href="http://www.myfir.com/about.html" class="mfText12Grey">About FIR</a> |
<a href="http://www.myfir.com/search.asp" class="mfText12Grey">Search</a> |
<a href="http://www.myfir.com/myAccount.asp" class="mfText12Grey">My Account</a> |
<a href="http://www.myfir.com/faq.html" class="mfText12Grey">FAQ</a> |
<a href="http://www.myfir.com/contact.html" class="mfText12Grey">Contact</a></td>
<td width="162" align="left" valign="bottom" class="mfText12Grey"> </td>
<td align="left" valign="bottom"><img src="/images/myfir/slogan.gif" width="289" height="19"></td>
</tr>
<tr>
<td colspan="4"><img src="/images/myfir/spacer.gif" height="3"></td>
</tr>
</table>
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td colspan='2'>
<br>
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="12"> </td>
<td colspan="2" class="mfHeaderBrown"> <p class="mfUnderlineBrown">Search Results</p></td>
</tr>
<tr>
<td> </td>
<td> </td>
<td width="300"> </td>
</tr>
<tr>
<td width="12"> </td>
<td valign="top"><span class="mfText12BoldBrown">Results for: </span><span class="mfText12BoldBlue">Arkansas County</span><br>
<span class="mfText12BoldBlue">
<a href="courthouse_ar.asp">View Arkansas County Courthouses</a>
</span> <br></td>
<td width="300" valign="top" class="mfText12BoldBlue"><a href="javascript:openCustomWindow('http://www.realtytrac.com/database/noframes/agentSearch.asp','_blank',820,630,50,50,false,true)">Need an agent? Click Here.</a><br> <a href="javascript:openCustomWindow('http://www.realtytrac.com/finance/landing.aspx','_blank',820,630,50,50,false,true)">Get Pre-Qualified! Click Here.</a><br><a href="javascript:openCustomWindow('http://www.realtytrac.com/gateway_co.asp?accnt=1258&password=myfir','_blank',820,630,50,50,false,true)">Search Nation-Wide Foreclosure Listings! Click Here.</a></td>
</tr>
<tr>
<td width="12" height="20"> </td>
<td colspan="2"><p class="mfUnderlineBrown"> </p></td>
</tr>
</table>
<br>
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="30"> </td>
<td class="mfText12">To refine your search or to search another area:
<input name="Submit" type="submit" class="mfButtonBlue" value="New Search" onClick="javascript:window.location.href='search.asp'"></td>
</tr>
<tr>
<td width="12" height="20"> </td>
<td></td>
</tr>
</table>
<br>
<table width="800" border="0" cellspacing="0" cellpadding="0">
<!--MFSP_Retrieve_Properties 'ARAr','xxxx','',''-->
<tr>
<td width="12"> </td>
<td colspan="3" class="mfText12BoldBlue">Arkansas County-Gillett<br>
707 Rose, 72055</td>
<td> </td>
<td width="15"> </td>
<td> </td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Sale Date:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">6/6/2012</td>
<td align="right" class="mfText12Bold">Owner:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"></td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Sale Time:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">3:00PM</td>
<td align="right" class="mfText12Bold">Lender:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Wells Fargo Bank N.A.</td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Location: </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Courthouse</td>
<td align="right" class="mfText12Bold">Contact: </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Wilson & Associates, P.L.L.C.</td>
</tr>
<tr>
<td> </td>
<td align="right" class="mfText12Bold">Reference Number:</td>
<td class="mfText12"> </td>
<td class="mfText12">1292-222100</td>
<td align="right" class="mfText12Bold">Contact Phone:</td>
<td class="mfText12"> </td>
<td class="mfText12">501-219-9388</td>
</tr>
<tr>
<td> </td>
<td align="right" class="mfText12Bold">Orig Prin Bal: </td>
<td class="mfText12"> </td>
<td class="mfText12">$75,950.00 </td>
<td align="right" class="mfText12Bold"> </td>
<td class="mfText12"> </td>
<td class="mfText12"> </td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Status:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Active </td>
<td align="right" class="mfText12Bold"> </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"> </td>
</tr>
<tr>
<td> </td>
<td align="right" class="mfText12"> </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"><br>
<a href="javascript:openCustomWindow('http://foreclosuresinar.com/nod/ret2.php?ref=1292-222100','Doc_Search',480,640,false,true)" class="mfText12BoldBlue">View Notice Of Default Document</a><br></td>
<td align="right" class="mfText12"> </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"> </td>
</tr>
<tr>
<td> </td>
<td colspan="6" align="right" class="mfText12"><p class="mfUnderlineBrown"> </p></td>
</tr>
<tr>
<td width="12"> </td>
<td colspan="3" class="mfText12BoldBlue">Arkansas County-Humphrey<br>
20 Shirleys Lane, 72073</td>
<td> </td>
<td width="15"> </td>
<td> </td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Sale Date:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">7/11/2012</td>
<td align="right" class="mfText12Bold">Owner:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"></td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Sale Time:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">3:45PM</td>
<td align="right" class="mfText12Bold">Lender:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Mortgage Electronic Registration Systems, Inc., As A Separate Corporation That Is Acting Solely As A</td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Location: </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Courthouse</td>
<td align="right" class="mfText12Bold">Contact: </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Wilson & Associates, P.L.L.C.</td>
</tr>
<tr>
<td> </td>
<td align="right" class="mfText12Bold">Reference Number:</td>
<td class="mfText12"> </td>
<td class="mfText12">587-190105</td>
<td align="right" class="mfText12Bold">Contact Phone:</td>
<td class="mfText12"> </td>
<td class="mfText12">501-219-9388</td>
</tr>
<tr>
<td> </td>
<td align="right" class="mfText12Bold">Orig Prin Bal: </td>
<td class="mfText12"> </td>
<td class="mfText12">$64,400.00 </td>
<td align="right" class="mfText12Bold"> </td>
<td class="mfText12"> </td>
<td class="mfText12"> </td>
</tr>
<tr>
<td width="12"> </td>
<td align="right" class="mfText12Bold">Status:</td>
<td width="15" class="mfText12"> </td>
<td class="mfText12">Active </td>
<td align="right" class="mfText12Bold"> </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"> </td>
</tr>
<tr>
<td> </td>
<td align="right" class="mfText12"> </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"><br>
<a href="javascript:openCustomWindow('http://foreclosuresinar.com/nod/ret2.php?ref=587-190105','Doc_Search',480,640,false,true)" class="mfText12BoldBlue">View Notice Of Default Document</a><br></td>
<td align="right" class="mfText12"> </td>
<td width="15" class="mfText12"> </td>
<td class="mfText12"> </td>
</tr>
<tr>
<td> </td>
<td colspan="6" align="right" class="mfText12"><p class="mfUnderlineBrown"> </p></td>
</tr>
</table>
<br>
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="30"> </td>
<td class="mfText12">To refine your search or to search another area:
<input name="Submit" type="submit" class="mfButtonBlue" value="New Search" onClick="javascript:window.location.href='search.asp'"></td>
</tr>
</table>
</td>
</tr>
</TABLE><BR>
<BR>
<TABLE width="800" border="0" cellspacing="0" cellpadding="0">
<TR>
<TD width='100%'><P class="mfUnderlineBrown"> </P><BR></TD>
</TR>
<TR>
<TD align="center"><SPAN class="mfText10Grey">© 2012 Foreclosure
Investors Report, LLC <A href="http://www.myfir.com/contact.html" class="mfText10Grey">Contact
Us</A> | <A href="http://www.myfir.com/disclaimer.html" class="mfText10Grey">Disclaimer</A> | <A href="http://www.myfir.com/partners.html" class="mfText10Grey">Partners</A></SPAN></TD>
</TR>
</TABLE>
</BODY>
</HTML>
      *=====================================================================
      * Project : powerEXT Core
      * Title   : Read HTML as XML
      * Build   : 2012.05.18
      * Website : powerext.com
      *=====================================================================
      /copy qsrc,pxapihdr General H-Spec's
      * powerEXT API Connectors
      /copy qsrc,pxapicgicn Basic HTTP connector & Productivity Services
      * Internal Variables
     d td_on           s              1a   inz(*off)
     d td_result       s            200a   varying
     d i               s             10i 0
      /free
        // Clear Service Program & Response Object
        clearSrvPgm();
        setContent();
        // Load the saved HTML page and start the reader
        xmlFromStmf('/htmltest.htm');
        xmlReaderInz(xmladdr():xmlsize());
        xmlReaderCase('L':*on); // the second parameter indicates HTML
        // td_on and i track the sequence: label cell, spacer, value cell
        dow xmlReader = 0;
          select;
          when xmlGetNode = 'td' and xmlGetAttr = '';
            select;
            when td_on = *off;
              // A known label starts a new label/value pair
              if xmlGetData = 'Sale Date:'
              or xmlGetData = 'Owner:'
              or xmlGetData = 'Sale Time:'
              or xmlGetData = 'Lender:'
              or xmlGetData = 'Location:'
              or xmlGetData = 'Courthouse:'
              or xmlGetData = 'Orig Prin Bal:'
              or xmlGetData = 'Contact:';
                td_on = *on;
                i = 1;
                td_result = xmlGetData;
              endif;
            when td_on = *on and i = 1;
              // Skip the spacer cell between label and value
              i = 2;
            when td_on = *on and i = 2;
              // Value cell: write the saved label followed by the value
              if td_Result = 'Sale Date:'
              or td_Result = 'Owner:'
              or td_Result = 'Sale Time:'
              or td_Result = 'Lender:'
              or td_Result = 'Location:'
              or td_Result = 'Courthouse:'
              or td_Result = 'Orig Prin Bal:'
              or td_Result = 'Contact:';
                echo('<br />');
                echo(td_result);
                echo(xmlGetData);
              endif;
              td_on = *off;
            other;
              td_on = *off;
            endsl;
          endsl;
        enddo;
        echoToStmf('/htmltestresult.htm':1208); // store new result or
        echoToClient(); // send it to a browser
        return;
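
The same label/spacer/value walk can be sketched outside RPG. The Python below uses only the standard library's `html.parser`; it is an illustration of the technique, not a port of the powerEXT calls, and the `LABELS` set, the `CellCollector` class and the `extract_pairs` name are assumptions made for this example. It relies on the page layout in the HTML sample, where each label cell is followed by a spacer cell and then the value cell.

```python
from html.parser import HTMLParser

# Labels taken from the result page; extend the set for other fields.
LABELS = {"Sale Date:", "Owner:", "Sale Time:", "Lender:",
          "Location:", "Orig Prin Bal:", "Contact:"}


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":  # HTMLParser lowercases tag names
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1] += data


def extract_pairs(html):
    """Return (label, value) tuples for every known label cell."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    cells = [c.strip() for c in parser.cells]
    # A label cell is followed by a spacer cell, then the value cell.
    return [(cells[i], cells[i + 2])
            for i in range(len(cells) - 2) if cells[i] in LABELS]


if __name__ == "__main__":
    row = ('<tr><td align="right">Sale Time:</td><td>&nbsp;</td>'
           '<td class="mfText12">3:00PM</td></tr>')
    for label, value in extract_pairs(row):
        print(label, value)
```

Matching on cell position is fragile: if the site changes its table layout, the offsets have to be adjusted, which is the same caveat that applies to the RPG version.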