Simple Scraper... Weird Output
Posted on 16th Feb 2014 07:03 pm by admin
Okay, maybe I just need a Blue Monster and some sleep, but....
I'm scraping a ringtone site just so that I can download all of the ringtones and add them to my personal library.
The website's code that I am targeting is:
Code:
<tr><td><a href="/ringtone/527783/"><img src="/img/icon/ringt.jpg" border=0>Jackson 5 - Who's Loving You </a> </td><td align=center><a href="/ringtones/classical/" class=cat_link>Classical</a></td><td align=center><img src="/img/rating/star0.gif" border=0></td><td align="right" class=smgrey2>5 months ago</td><td align="center" class=smgrey2><span class="b">13895</span></td><td align="right"><span class="b"><a href="/profile/stambaugh01">stambaugh01</a></span></td></tr>
I would like it to output the actual filename, which in this case would be 527783, along with the title of the file.
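A minimal sketch of one way to pull both values out of a row like the one above, assuming every listing has the same `<a href="/ringtone/ID/"><img ...>Title </a>` shape. `extractRingtones` is a hypothetical helper name, and the sample row is hard-coded so the snippet runs without touching the site.

```php
<?php
// Sample row copied from the post (trimmed to the first cell).
// A real run would pass in the fetched page HTML instead.
$html = '<tr><td><a href="/ringtone/527783/"><img src="/img/icon/ringt.jpg" border=0>Jackson 5 - Who\'s Loving You </a> </td></tr>';

/**
 * Return a list of ['id' => ..., 'title' => ...] entries found in $html.
 * Hypothetical helper, not part of the original script.
 */
function extractRingtones(string $html): array
{
    // '#' is the delimiter, so the slashes in the URL need no escaping.
    // (\d+) captures the numeric ID; ([^<]*) captures the link text up to </a>.
    $pattern = '#<a href="/ringtone/(\d+)/"><img[^>]*>([^<]*)</a>#i';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $rows = [];
    foreach ($matches as $m) {
        $rows[] = ['id' => $m[1], 'title' => trim($m[2])];
    }
    return $rows;
}

foreach (extractRingtones($html) as $row) {
    // prints: 527783 --- Jackson 5 - Who's Loving You
    echo $row['id'] . ' --- ' . $row['title'] . PHP_EOL;
}
```

Each listing is matched on its own, and the ID gets its own capture group, so one match can never swallow a neighbouring row. For anything beyond a quick one-off, an HTML parser such as PHP's DOMDocument would probably hold up better than a regex if the markup changes.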
There are about 50 listings per page, and I would like the script to go to the next page automatically and keep scraping.
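Walking through the pages could be sketched roughly as follows. The post does not show how the site numbers its pages, so the fetching step is passed in as a callable (`$fetchPage`, a hypothetical name) and the loop simply stops at the first page that yields no listings. `scrapeAllPages`, the fake pages, and the page cap are all assumptions for illustration.

```php
<?php
/**
 * Collect listings from consecutive pages until one yields no rows.
 * $fetchPage takes a page number and returns that page's HTML (or false).
 * It is injected because the real URL scheme is not shown in the post.
 */
function scrapeAllPages(callable $fetchPage, int $maxPages = 100): array
{
    $all = [];
    for ($page = 1; $page <= $maxPages; $page++) {
        $html = $fetchPage($page);
        if ($html === false) {
            break; // fetch failed: stop instead of retrying forever
        }
        $found = preg_match_all(
            '#<a href="/ringtone/(\d+)/"><img[^>]*>([^<]*)</a>#i',
            $html,
            $matches,
            PREG_SET_ORDER
        );
        if (!$found) {
            break; // no listings: assume we have run past the last page
        }
        foreach ($matches as $m) {
            $all[$m[1]] = trim($m[2]); // keyed by ID, so repeated rows collapse
        }
    }
    return $all;
}

// Stand-in fetcher with two fake pages so the sketch runs offline.
$fakePages = [
    1 => '<a href="/ringtone/1/"><img src="/img/icon/ringt.jpg" border=0>First </a>',
    2 => '<a href="/ringtone/2/"><img src="/img/icon/ringt.jpg" border=0>Second </a>',
];
$fetchFake = function (int $page) use ($fakePages) {
    return $fakePages[$page] ?? '';
};

$result = scrapeAllPages($fetchFake);
foreach ($result as $id => $title) {
    echo $id . ' --- ' . $title . PHP_EOL;
}
```

For the real site, `$fetchPage` would wrap `file_get_contents()` around whatever URL the site's own "next page" link uses. That pattern has to be read from the page itself. The `$maxPages` cap is only a safety net against an endless loop.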
Here is my code:
Code:
<?php
$data = @file_get_contents("http://www.XXXXXXXXXXXX.com/ringtones/classical/");
// '#' is used as the delimiter so the slashes inside the pattern need no escaping.
preg_match_all('#href="/ringtone/.*?<img src="/img/icon/ringt.jpg" border=0>([^"]*).*?/"><img src="/img/icon/.*?border=0>([^"]*)</td><td align=center>#is', $data, $out);
// Error checking: if nothing matched, there is no data to process, so end the script.
if (empty($out[1]) || empty($out[2])) {
    exit;
}
// End error checking
$d = array_combine($out[1], $out[2]);
foreach ($d as $k => $v) {
    echo $k . " --- " . $v . "\n";
}
?>
The output is skipping rows: it prints the title of only every other row, and not the directory name.
Thanks in advance for the help.