writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:
Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:
Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
$number_of_them=count(the number of them);
}
// start from the page URL and climb one directory per ../
$tmp_url=$url;
$i=1;
while($i<=$number_of_them){
$tmp_url=go up one level from the $tmp_url;
$i++;
}
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?
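
One way the pseudo code above could be turned into working PHP is sketched below. Rather than counting the ../ occurrences, it splits the combined path on / and drops one directory for every .. segment, which also covers links with no ../ at all. The function name resolve_url and the regular expression are assumptions made for illustration, not part of the original post. The sketch also assumes the src contains no query string and is not a protocol-relative link (//host/...), and it stops at the site root if there are more ../ than directories.

```php
<?php
// Hypothetical helper (not from the original post): resolve a relative
// link against the URL of the page it was found on.
function resolve_url(string $base, string $relative): string
{
    // A link that already has a scheme (http:, https:, ...) is absolute.
    if (is_string(parse_url($relative, PHP_URL_SCHEME))) {
        return $relative;
    }

    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (strpos($relative, '/') === 0) {
        // Root-relative link: ignore the directory of the base page.
        $path = $relative;
    } else {
        // Keep the base path up to and including its last slash.
        $basePath = $parts['path'] ?? '/';
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $relative;
    }

    // Walk the segments: ".." removes the previous directory, "." and
    // empty segments are skipped. array_pop() on an empty array is a
    // no-op, so extra "../" cannot climb above the site root.
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $resolved);
}

// Usage with the values from the question.
$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
$url    = 'http://www.myointernational.com/test_dir/';

// Pull the src attribute out of the tag (assumes double-quoted attributes).
if (preg_match('/<img\b[^>]*?\bsrc="([^"]*)"/i', $string, $match)) {
    $absolute = resolve_url($url, $match[1]);
    // Swap the relative src for the absolute one in the original tag.
    echo str_replace($match[1], $absolute, $string), "\n";
}
```

With the $url from the question there is only one directory to climb, so the two ../ end at the site root and the src becomes http://www.myointernational.com/e-commerce_in_a_box_small.jpg. For a real scraper, an HTML parser such as PHP's DOMDocument is likely to be more robust than a regular expression for finding the img tags.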
