writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this: Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" /> might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
if($string contains '../'){
$number_of_them=count(the number of them);
}
$i=1
while($i<=$number_of_them){
$tmp_url=go up one level from the $url;
$i++;
}
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?

No comments posted yet

Your Answer:

Login to answer
252 Like 22 Dislike
Previous forums Next forums
Other forums

Timer control causing error
I recently decided to add a timer control to an existing page that uses AJAX on my site. As soon as

Displaying returned XML in another PHP page
I have an online payment form that will return XML given if a payment is successful or declines. I

Help with PHP Calendar code...
Hello, I'm new to this forum and I'm glad I found it.
I wrote this code for a PHP calendar as an

Help! refer to a friend script with captcha code
Hi guys, I am posting on here in desperate need for some help with an ongoing search I have been doi

unoconv doc convert to pdf code prob
PHP/5.3.1

Hi. I am trying to use this code to convert docs to .pdf utilizing unoconv. Howe

ORA-01017: invalid username/password; logon denied
Dear All,

I am facing problem in taken backup from db13 it comes up with the following l

present value of sequence?
Hi

Please help me to find out the present value of sequence?

Thanks

MVC - Code review
I'm in the process of trying to wrap my head around MVC, and as part of that, I'm attempting to impl

Kill a process
I have a question - how can I kill a process from a command line or by using Oracle SQL Developer? I

PHP5 - Verifying a secure mail is secure
I need to send an e-mail from a form to a external department and because it contains personal custo

Sign up to write
Sign up now if you have flare of writing..
Login   |   Register
Follow Us
Indyaspeak @ Facebook Indyaspeak @ Twitter Indyaspeak @ Pinterest RSS



Play Free Quiz and Win Cash