writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this: Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" /> might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
if($string contains '../'){
$number_of_them=count(the number of them);
}
$i=1
while($i<=$number_of_them){
$tmp_url=go up one level from the $url;
$i++;
}
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?

No comments posted yet

Your Answer:

Login to answer
252 Like 22 Dislike
Previous forums Next forums
Other forums

Preloading images
Posting this question here because I am not sure where this should belong.I am building an asp.net a

Unexpected T_Variable ?
Hi all,
I dont really know what I am doing!! I know I'm doing something wrong, and I know its on

php web service error
hey guys,
I'm working on a project requires the use of web services. I've been trying a few tutor

matching numbers inside ( )
I know I can match numbers by just [0-9]+, so I thought matching numbers inside ( ) would be somethi

Table Control
Hi Guru's,

I've created a Module pool program, which contains the Table Control.

Pass sql into pl/sql and create RMAN duplicate script.
Hi,

I'm new to pl/sql and I'm trying to write a script that will generate some RMAN comma

passing data from one page to another
hey guys
i have the follwoing code to get information from one page and place on another:

Using insert variable
need a way to inert variable data to mysql database

$acc = "212121212";
$nok =

removing space from the end of a variable
i have a variable $image which contains the following url "http://tiles.xbox.com/tiles/oo/P5/0m

CE 7.1 and External GIS integration
Hi All,

We want to develop an application on CE 7.1 which uses GIS features from an exter

Sign up to write
Sign up now if you have flare of writing..
Login   |   Register
Follow Us
Indyaspeak @ Facebook Indyaspeak @ Twitter Indyaspeak @ Pinterest RSS



Play Free Quiz and Win Cash