I am building a spider that will crawl various white pages sites (e.g. anywho.com, switchboard.com, whitepages.com, etc.), collect the information on the people found there, and store it in a database. So far I've only made this little prototype, but after trying to run it I've run into a bunch of problems. I fixed a lot of them, but there are some with the regular expressions that I can't figure out.
Here are the errors:
Warning: preg_match_all() [function.preg-match-all]: Compilation failed: missing ) at offset 57 in /home/public_html/spider/inc/anywho.class.php on line 51
Warning: preg_match_all() [function.preg-match-all]: Delimiter must not be alphanumeric or backslash in /home/public_html/spider/inc/anywho.class.php on line 72
Warning: preg_match_all() [function.preg-match-all]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 73
Warning: preg_match() [function.preg-match]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 76
Warning: preg_replace() [function.preg-replace]: No ending delimiter '.' found in /home/public_html/spider/inc/anywho.class.php on line 92
Warning: preg_replace() [function.preg-replace]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 93
Warning: preg_replace() [function.preg-replace]: No ending delimiter '.' found in /home/public_html/spider/inc/anywho.class.php on line 94
Warning: preg_replace() [function.preg-replace]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 95
Warning: preg_replace() [function.preg-replace]: No ending delimiter '*' found in /home/public_html/spider/inc/anywho.class.php on line 96
Along with these, it isn't printing out the info like it is supposed to on line 56 of anywho.class.php.
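From what I've read in the PHP manual so far, the delimiter warnings seem to mean that every preg_* pattern has to be wrapped in a matching pair of delimiters that are not alphanumeric and not a backslash. If the pattern starts with '^', '.' or '*', PHP takes that character as the delimiter and then complains that it can't find the closing one. The "missing )" error looks like an unbalanced or unescaped parenthesis inside the pattern. Here is a minimal sketch of what I think the correct form looks like. The sample HTML, the pattern and the variable names are made up for illustration and are not the ones from my class:

```php
<?php
// Hypothetical sample input; not the real AnyWho markup.
$html = '<td class="name">Doe, John</td><td class="name">Roe, Jane</td>';

// The pattern is wrapped in '#' at both ends. Using '#' instead of '/'
// means the slashes in closing HTML tags don't need escaping.
// The trailing 'i' is a modifier (case-insensitive) placed after the closing delimiter.
$pattern = '#<td class="name">([^<]+)</td>#i';

// preg_match_all() returns the number of full matches;
// $matches[1] holds whatever the first (...) group captured.
$count = preg_match_all($pattern, $html, $matches);

// An anchored pattern still needs delimiters around it: '#^\s+#', not '^\s+'.
$trimmed = preg_replace('#^\s+#', '', "   Doe");

// Literal parentheses must be escaped, otherwise PCRE treats them as a group.
// preg_quote() does that; its second argument is the delimiter in use.
$quoted = preg_quote('(555)', '#');
$hasAreaCode = preg_match('#' . $quoted . '#', 'Phone: (555) 123-4567');

print_r($matches[1]);
echo $trimmed, "\n";
```

If that is right, then every pattern on lines 72 to 96 of my class would need the same kind of wrapping, but I'd appreciate it if someone could confirm.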
Since these are two files and a little bigger than the normal "snippet", I posted them both on a pin board. The links are below.
Spider Class: http://www.coderprofile.com/networks/code-pin-board/258/spiderclassphp
Anywho Class: http://www.coderprofile.com/networks/code-pin-board/257/anywhospiderclassphp
And here is the source of the form page:
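Code:
<?php
require("spider.class.php");
require("anywho.class.php");
$spider = new spider("Lorem Ipsum", "Lorem Ipsum", "Lorem Ipsum", "localhost", 15);
$any = new anywho;
if (isset($_POST['submit'])) {
    // Required fields (marked * on the form)
    $state = $_POST['state'];
    $last = $_POST['last'];
    // Optional fields: fall back to null when not submitted
    $first = (isset($_POST['first'])) ? $_POST['first'] : null;
    $street = (isset($_POST['street'])) ? $_POST['street'] : null;
    // $city was never set before being passed to initialize(); the form has
    // no city field yet, so this stays null until one is added
    $city = (isset($_POST['city'])) ? $_POST['city'] : null;
    $zip = (isset($_POST['zip'])) ? $_POST['zip'] : null;
    $any->initialize($last, $state, $first, $street, $city, $zip);
    $any->any_crawl($any->url, 0, 1);
}
?>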
<form action="index.php" method="post">
Last Name: <input type="text" name="last">*
First Name: <input type="text" name="first">
Street: <input type="text" name="street">
Zip: <input type="text" name="zip">
State:
<select name="state" style="height:17px; font-size:9px;">
<option value="">Select a State</option>
<option value="AL" selected="selected" >Alabama</option>
...........................
...........................
<option value="WY">Wyoming</option>
</select>*
<input type="submit" value="Crawl" name="submit">
</form>
I'm really sorry about the messy code and poor documentation.
Also I really appreciate any and all replies!