writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this: Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" /> might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
if($string contains '../'){
$number_of_them=count(the number of them);
}
$i=1
while($i<=$number_of_them){
$tmp_url=go up one level from the $url;
$i++;
}
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work

No comments posted yet

Your Answer:

Login to answer
132 Like 46 Dislike
Previous forums Next forums
Other forums

LOOPing Problem
Hello All!

The following code loops through the data and displays the data accordingly. My p

php or sql?
Sorry not sure if this is a sql problem or php the following code is supposed to delete data from th

that old Malformed Headers problem again!!!!! HELP!!!!!!
I've read the http://www.phpfreaks.com/forums/index.php/topic,37442.0.html

I don't think my c

How to validate from 2 possible answers
Hi

I hope somebody can help me with what will probably be really simple, I'm pulling my hair

array_combine() trouble w/csv file
I have a problem with a piece of code I wrote to import some records from a csv file into mysql. I h

DELETE FROM not working deletes wrong row
Hello

I have the following code which i found but it doesnt work properly.. it comes up with

Error In Syntax
I got this error:

Code: Parse error: syntax error, unexpected '>' in /home/bucket/publ

images aren't rendering
I'm trying to call a JPG file from within PHP (in an effort to hide the actual JPG folder). The imag

Aris, Netweaver BPM, Visual composer and X'app
Dear Experts,

Whats the relationship between the following components: Aris, Netweaver BP

is_dir() problem
Hello,

I'm buidling a php scripts that dynamically get's subfolders from a specific folder.

Sign up to write
Sign up now if you have flare of writing..
Login   |   Register
Follow Us
Indyaspeak @ Facebook Indyaspeak @ Twitter Indyaspeak @ Pinterest RSS



Play Free Quiz and Win Cash