I am building a spider that will crawl through random whitepages (eg. anywho.com, switchboard.com, whitepages.com, etc..) and collect the information on the people found there and throw it into a database. So far I've only made this little prototype, however after trying to run it I've run into a bunch of problems....a lot of them I fixed but there are some with the expressions that I can't figure out.
Here are the errors:
QuoteWarning: preg_match_all() [function.preg-match-all]: Compilation failed: missing ) at offset 57 in /home/public_html/spider/inc/anywho.class.php on line 51
Warning: preg_match_all() [function.preg-match-all]: Delimiter must not be alphanumeric or backslash in /home/public_html/spider/inc/anywho.class.php on line 72
Warning: preg_match_all() [function.preg-match-all]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 73
Warning: preg_match() [function.preg-match]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 76
Warning: preg_replace() [function.preg-replace]: No ending delimiter '.' found in /home/public_html/spider/inc/anywho.class.php on line 92
Warning: preg_replace() [function.preg-replace]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 93
Warning: preg_replace() [function.preg-replace]: No ending delimiter '.' found in /home/public_html/spider/inc/anywho.class.php on line 94
Warning: preg_replace() [function.preg-replace]: No ending delimiter '^' found in /home/public_html/spider/inc/anywho.class.php on line 95
Warning: preg_replace() [function.preg-replace]: No ending delimiter '*' found in /home/public_html/spider/inc/anywho.class.php on line 96
Along with these it isn't printing out the info like it is suppose to on line 56 of anywho.class.php
As to the fact that these are two files and a little bigger then the normal "snippet" I posted them both in a pin board. The links are below.
Spider Class: http://www.coderprofile.com/networks/code-pin-board/258/spiderclassphp
Anywho Class: http://www.coderprofile.com/networks/code-pin-board/257/anywhospiderclassphp
And here is the source of the form page:
Code: <?php
require("spider.class.php");
require("anywho.class.php");
$spider=new spider("Lorem Ipsum","Lorem Ipsum","Lorem Ipsum","localhost",15);
$any=new anywho;
if(isset($_POST['submit'])){
$state=$_POST['state'];
$last=$_POST['last'];
$first = (isset($_POST['first'])) ? $_POST['first'] : null;
$street = (isset($_POST['street'])) ? $_POST['street'] : null;
$zip = (isset($_POST['zip'])) ? $_POST['zip'] : null;
$any->initialize($last,$state,$first,$street,$city,$zip);
$any->any_crawl($any->url,0,1);
}
?>
<form action="index.php" method="post">
Last Name: <input type="text" name="last">*
First Name: <input type="text" name="first">
Street: <input type="text" name="street">
Zip: <input type="text" name="zip">
State:
<select name="state" style="height:17px; font-size:9px;">
<option value="">Select a State</option>
<option value="AL" selected="selected" >Alabama</option>
...........................
...........................
<option value="WY">Wyoming</option>
</select>*
<input type="submit" value="Crawl" name="submit">
</form>
I'm really sorry about the messy code and poor documentation.
Also I really appreciate any and all replies!
formating when pulling data from a mysql database
Ok so Im not to sure if this is the right thread to post in but here is my catch 22 issue.I have a test web page www.aandstech.com.au/test.phpTest.php pulls its content from a my sql table.This works
PHP header help!
Hi all I am trying to get this php page to refresh every 5 seconds on my phone which is an aastra 480i IP desk phone.The code below the header line displays a menu on my phone. That works fine so no
IP question
ive got 2 ip addresses both global from same user how would i detect if they are local to each other
How to get all server headers like Live http Headers does
Hey all, like many of you I use the Firefox addon "Live http Headers". I'm trying to write a tool that will basically do the same thing, but web-based... so the user would enter a URL and
What exactly is net neatrality?
What exactly is it? I think it's anti-censorship and... stuff... but I don't really understand it
calculator
I can't figure out why this code doesn't work. No error messages. Page loads.Code: <?php # Script 3.9 - calculator.php$page_title = 'Widget Cost Calculator';include ('./header.html');function
Any decent php formatter/beautifier/pretty printer?
Any decent php formatter/beautifier/pretty printer class/function?I found the following whilst
Escape Latin Characters
I need to escape latin characters in an xml doc. Example: "é" is escaped to "é". I thought I could use the ASCII function, but SELECT ASCII('é') FROM DUAL in Oracle gives me 50089.
PHP login form help (Done Most of It)
Hi i am having a problem, when i try logging in it is always saying "Invalid Login" im not sure what is going wrong, a week ago i had it working. I cant remember the change i did but i know
The page should be expire when cilck back button
hi,i'm new to php world.i create user registration page.when i submit it,data goes to mysql.but when i click back button of browser it shows previous page.i need a code that perform session expirewhen