writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:
Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:
Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace each ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
$number_of_them=count(the number of '../' in $string);
}
// start from the page URL and climb one directory per '../'
$tmp_url=$url;
$i=1;
while($i<=$number_of_them){
$tmp_url=go up one level from $tmp_url;
$i++;
}
// put the file name back on the end of the shortened URL
$tmp_url=$tmp_url . the file name from $string;
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?
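
One possible way to make the pseudo code work is sketched below. It is a minimal sketch under some assumptions, not a complete URL resolver: the function name resolve_relative_url is made up for this example, the second base URL in the usage lines is hypothetical, and query strings, fragments and protocol-relative links (//host/path) are not handled. Rather than counting the ../ occurrences, it splits the combined path into segments and removes one directory for every ".." it meets, which also copes with a ".." in the middle of a path.

```php
<?php
// Hypothetical helper (not from the original post): resolve a relative
// link against the absolute URL of the page it was found on.
function resolve_relative_url($base, $relative)
{
    // A link that already has a scheme (http:, https:, ...) is absolute.
    if (parse_url($relative, PHP_URL_SCHEME) !== null) {
        return $relative;
    }

    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (strpos($relative, '/') === 0) {
        // Root-relative link: the base path is ignored.
        $path = $relative;
    } else {
        // Keep the base path up to and including its last "/",
        // i.e. drop any file name, then append the relative link.
        $basePath = isset($parts['path']) ? $parts['path'] : '/';
        $dir  = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path = $dir . $relative;
    }

    // Collapse "." and ".." segments.
    $resolved = array();
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);   // go up one level (no-op at the root)
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $resolved);
}

// Values from the question:
echo resolve_relative_url(
    'http://www.myointernational.com/test_dir/',
    '../../e-commerce_in_a_box_small.jpg'
), "\n";
// prints http://www.myointernational.com/e-commerce_in_a_box_small.jpg

// Hypothetical page URL, chosen so that the first example resolves as described:
echo resolve_relative_url(
    'http://www.myointernational.com/furniture/some_dir/page.html',
    '../e-commerce_in_a_box_small.jpg'
), "\n";
// prints http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg
```

Note that with the base URL http://www.myointernational.com/test_dir/ there is only one directory to climb, so the second ../ has nowhere to go and the link resolves to the site root, which is also how browsers treat it. To get the src values out of the scraped HTML in the first place, PHP's DOMDocument class is probably a safer choice than string matching.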

