Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:
Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:
Code:
<?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
    $number_of_them=count(the number of them);
}
$tmp_url=$url; // start from the directory of the page being scraped
$i=1;
while($i<=$number_of_them){
    $tmp_url=go up one level from the $tmp_url;
    $i++;
}
$tmp_url=$tmp_url . the file name from the src; // otherwise the src is only a directory
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about writing real code to make this pseudo code work?
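
One way the pseudo code above could be turned into working PHP is sketched below. Instead of counting the ../ parts, it joins the relative path onto the directory of the page URL and then walks the path segments, dropping one level for every "..". The function name resolve_url() is made up for this example (it is not a PHP built-in), and the second base URL in the usage lines is only a guess at where the page might live. This is a minimal sketch: it assumes a well-formed http(s) base URL and does not handle protocol-relative links (//host/...), query strings or fragments.

```php
<?php
// Hypothetical helper, not part of PHP: resolve a relative link
// against the URL of the page it was found on.
function resolve_url(string $base, string $relative): string
{
    // A link that already has a scheme (http://, https://, ...) is absolute.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $relative)) {
        return $relative;
    }

    // Assumes $base is a full URL with a scheme and a host.
    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (substr($relative, 0, 1) === '/') {
        // Root-relative link: ignore the directory of the base page.
        $path = $relative;
    } else {
        // Keep everything up to and including the last "/" of the base path.
        $basePath = $parts['path'] ?? '/';
        $dir      = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path     = $dir . $relative;
    }

    // Walk the segments: ".." goes up one level, "." and "" are skipped.
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved); // does nothing once we are at the root
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    return $root . '/' . implode('/', $resolved);
}

// Usage with the values from the question.
$url = 'http://www.myointernational.com/test_dir/';
echo resolve_url($url, '../../e-commerce_in_a_box_small.jpg'), "\n";

// Assumed page location, chosen only to illustrate a single "../".
$page = 'http://www.myointernational.com/furniture/chairs/index.html';
echo resolve_url($page, '../e-commerce_in_a_box_small.jpg'), "\n";
```

Note that with the base URL from the question, the second ../ has nowhere left to go, so the sketch stops at the site root, which is also how browsers treat it. The src values themselves could be pulled out of the scraped HTML with DOMDocument before being passed to a function like this.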
