writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this: Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" /> might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
if($string contains '../'){
$number_of_them=count(the number of them);
}
$i=1
while($i<=$number_of_them){
$tmp_url=go up one level from the $url;
$i++;
}
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?

No comments posted yet

Your Answer:

Login to answer
252 Like 22 Dislike
Previous forums Next forums
Other forums

check if value exists
I have googled this for a while and I am getting lots of different results. Is there a standard meth

Is it possible to put an entire 500-page book in a database with PHP?
I am working on an intranet and I was wondering if its possible to code php with mysql to enter a fu

Compare user input to flat file data
Help...Am a complete newbie to programming so my code is prolly quite long. Am trying to verify a us

Remove values in array2 from array1
I have two arrays.

Array 1 is where the array key holds various different numbers. For exampl

Setting a default timezone?
I have read about how to change the timezone in PHPMYADMIN, but it changes back, it doesn't STAY the

Libraries in C++
Hi all,

I have two libraries. one is based targeted on linux platform and uses another li

Allowing ' and "
Hello everyone,

I am creating a form where users submit information to go into a database. I

how can i expire the submitted page using session.
hi,
i'm new to php world.
i'm using "post" method.
when i submit it,data goes to

Error with Font and imagettfbbox
I keep getting an error that says "Warning: imagettfbbox() [function.imagettfbbox]: Could not f

what does this mean? +=
is anyone able to explain what this code is saying?

i had it written for me awhile back and n

Sign up to write
Sign up now if you have flare of writing..
Login   |   Register
Follow Us
Indyaspeak @ Facebook Indyaspeak @ Twitter Indyaspeak @ Pinterest RSS



Play Free Quiz and Win Cash