I am building a spider that will crawl various white-pages sites (e.g. anywho.com, switchboard.com, whitepages.com), collect the information on the people listed there, and store it in a database. So far I've only made this small prototype, and after trying to run it I hit a number of problems. I fixed a lot of them, but there are some with the regular expressions that I can't figure out.
Here are the errors:
Warning: preg_match_all() [function.preg-match-all]: Compilation failed: missing ) at offset 57 in /home/public_html/spider/inc/anywho.class.php on line 51
Warning: preg_match_all() [function.preg-match-all]: Delimiter must not be alphanumeric or backslash in /home/public_html/spider/inc/anywho.class.php on line 72
Warning: preg_match_all() [function.preg-match-all]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 73
Warning: preg_match() [function.preg-match]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 76
Warning: preg_replace() [function.preg-replace]: No ending delimiter '.' found in /home/public_html/spider/inc/anywho.class.php on line 92
Warning: preg_replace() [function.preg-replace]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 93
Warning: preg_replace() [function.preg-replace]: No ending delimiter '.' found in /home/public_html/spider/inc/anywho.class.php on line 94
Warning: preg_replace() [function.preg-replace]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 95
Warning: preg_replace() [function.preg-replace]: No ending delimiter '*' found in /home/public_html/spider/inc/anywho.class.php on line 96
Along with these, it isn't printing out the info like it is supposed to on line 56 of anywho.class.php.
Since these are two files and a little bigger than the normal "snippet", I posted them both on a pin board. The links are below.
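From what I can tell from the PHP manual, the "delimiter" warnings appear when a pattern passed to a preg_* function is not wrapped in delimiters: PHP treats the first character of the pattern ('^', '.', '*') as the opening delimiter and then cannot find the matching closing one. Here is a minimal sketch of what I understand a delimited pattern should look like. The sample HTML, the pattern, and the variable names are made up for illustration and are not taken from anywho.class.php:

```php
<?php
// Hypothetical sample input; NOT the real anywho.com markup.
$html = '<td class="name">Smith, John</td><td class="name">Doe, Jane</td>';

// Broken form: the pattern starts with "^", so PCRE takes "^" as the
// delimiter and warns "No ending delimiter '^' found".
// preg_match_all('^[A-Z][a-z]+', $html, $m);

// Delimited form: "~" opens and closes the pattern; the "i" after the
// closing delimiter is a modifier (case-insensitive matching).
$pattern = '~<td class="name">([^<]+)</td>~i';
$count = preg_match_all($pattern, $html, $matches);

// preg_quote() escapes regex metacharacters in literal text; passing the
// delimiter as the second argument escapes that character as well.
$literal = preg_quote('1.5*2', '~');
$clean = preg_replace('~' . $literal . '~', 'X', 'a 1.5*2 b');

print_r($matches[1]); // the captured names
echo $clean, "\n";
```

I picked "~" as the delimiter because it rarely shows up in HTML, so the slashes in closing tags don't need escaping. This would only address the delimiter warnings; the "missing )" error on line 51 looks like a separate problem with the parentheses in that pattern.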
Spider Class: http://www.coderprofile.com/networks/code-pin-board/258/spiderclassphp
Anywho Class: http://www.coderprofile.com/networks/code-pin-board/257/anywhospiderclassphp
And here is the source of the form page:
Code:
<?php
require("spider.class.php");
require("anywho.class.php");
$spider=new spider("Lorem Ipsum","Lorem Ipsum","Lorem Ipsum","localhost",15);
$any=new anywho;
if(isset($_POST['submit'])){
$state=$_POST['state'];
$last=$_POST['last'];
$first = (isset($_POST['first'])) ? $_POST['first'] : null;
$street = (isset($_POST['street'])) ? $_POST['street'] : null;
$zip = (isset($_POST['zip'])) ? $_POST['zip'] : null;
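// The form has no "city" field, so $city was undefined here; default it to null.
$city = (isset($_POST['city'])) ? $_POST['city'] : null;
$any->initialize($last,$state,$first,$street,$city,$zip);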
$any->any_crawl($any->url,0,1);
}
?>
<form action="index.php" method="post">
Last Name: <input type="text" name="last">*
First Name: <input type="text" name="first">
Street: <input type="text" name="street">
Zip: <input type="text" name="zip">
State:
<select name="state" style="height:17px; font-size:9px;">
<option value="">Select a State</option>
<option value="AL" selected="selected" >Alabama</option>
...........................
...........................
<option value="WY">Wyoming</option>
</select>*
<input type="submit" value="Crawl" name="submit">
</form>
I'm really sorry about the messy code and poor documentation.
Also I really appreciate any and all replies!