Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:

Code:
<img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />

might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:

Code:
<?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace each ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
    $number_of_them=count(the number of them);
}
// start from the page URL and climb one directory per ../
$tmp_url=$url;
$i=1;
while($i<=$number_of_them){
    $tmp_url=go up one level from $tmp_url;
    $i++;
}
// put the file name back on the end
$tmp_url=$tmp_url . the file name from the src;
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />

How would I go about finding the code to make the pseudo code work?
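
One possible way to make the pseudo code above work is sketched below. It is only a minimal sketch, not a full URL resolver. The function name resolve_relative_url() is hypothetical (it is not a PHP built-in), and the code assumes the base URL is a well-formed http(s) address with a host. Protocol-relative links (//host/...), query strings and fragments in the relative link are not handled. Instead of counting the ../ occurrences, it joins the base directory and the relative path, then walks the segments and drops one directory for every "..".

```php
<?php
// Hypothetical helper (not a PHP built-in): resolve a relative link
// against the URL of the page it was found on.
function resolve_relative_url(string $base, string $relative): string
{
    // Already absolute (it has a scheme such as http): return unchanged.
    if (is_string(parse_url($relative, PHP_URL_SCHEME))) {
        return $relative;
    }

    // Assumption: $base is well formed and contains a scheme and a host.
    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (substr($relative, 0, 1) === '/') {
        // Root-relative link: the base directory is ignored.
        $path = $relative;
    } else {
        // Directory of the base page: everything up to the last "/".
        $basePath = $parts['path'] ?? '/';
        $dir      = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path     = $dir . $relative;
    }

    // Collapse "." and ".." segments. A ".." that would climb above the
    // site root is simply dropped. Empty segments (and so any trailing
    // slash) are dropped too, which is fine for image file names.
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.' && $segment !== '') {
            $out[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $out);
}

// Usage with the values from the question.
$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
$url    = 'http://www.myointernational.com/test_dir/';

// Simple pattern for a double-quoted src attribute; a real scraper
// would more likely use an HTML parser such as DOMDocument.
if (preg_match('/src="([^"]+)"/', $string, $m)) {
    echo resolve_relative_url($url, $m[1]), "\n";
    // prints http://www.myointernational.com/e-commerce_in_a_box_small.jpg
}
```

Note that with the base http://www.myointernational.com/test_dir/ the second ../ has nowhere left to go, so the sketch stops at the site root rather than failing.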
