writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:
Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:
Code:
<?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
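$number_of_them=0;
$tmp_url=$url;
if($string contains '../'){
$number_of_them=count(the number of them);
}
$i=1;
while($i<=$number_of_them){
// climb from the previous result, not from $url each time
$tmp_url=go up one level from the $tmp_url;
$i++;
}
$tmp_url=$tmp_url . the file name from the src;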
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about writing the code to make the pseudo code work?
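
One possible way to turn the pseudo code above into working PHP is sketched below. This is only a minimal sketch, not a full URL resolver: `resolve_url()` is a hypothetical helper name, the page path `furniture/chairs/index.html` in the second call is made up for illustration, and it assumes the base URL has a scheme and host. It does not handle protocol-relative links (`//host/...`), and it drops a trailing slash, which should not matter for image files. It uses PHP's built-in `parse_url()` to split the base URL, then removes one directory level for every `..` segment.

```php
<?php
// Hypothetical helper (not from the original post): resolves a relative
// src such as "../img.jpg" against the URL of the page it was found on.
function resolve_url(string $base, string $relative): string
{
    // A src that already has a scheme (http, https, ...) is absolute.
    if (is_string(parse_url($relative, PHP_URL_SCHEME))) {
        return $relative;
    }

    // Assumption: $base is a well-formed URL with a scheme and a host.
    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (substr($relative, 0, 1) === '/') {
        // Root-relative link: it replaces the whole path of the base.
        $path = $relative;
    } else {
        // Directory of the base page: everything up to the last "/".
        $basePath = $parts['path'] ?? '/';
        $dir      = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path     = $dir . $relative;
    }

    // Walk the segments: skip "." and empty ones, go up one level per "..".
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out); // already at the root: nothing left to remove
        } elseif ($segment !== '.' && $segment !== '') {
            $out[] = $segment;
        }
    }

    return $root . '/' . implode('/', $out);
}

$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
$url    = 'http://www.myointernational.com/test_dir/';

// Pull the src value out of the tag (a simple pattern for double-quoted src).
if (preg_match('/src="([^"]*)"/', $string, $m)) {
    echo resolve_url($url, $m[1]), "\n";
    // prints http://www.myointernational.com/e-commerce_in_a_box_small.jpg
}

echo resolve_url(
    'http://www.myointernational.com/furniture/chairs/index.html',
    '../e-commerce_in_a_box_small.jpg'
), "\n";
// prints http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg
```

Note that with `$url` set to `test_dir/` and two `../` in the src, the second `../` would climb above the site root. The sketch stops at the root in that case, which is how browsers treat it too. For scraping whole pages, a regular expression on the tag is fragile, so PHP's `DOMDocument` may be a better way to collect the `src` attributes before resolving them.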
