Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:

Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />

might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:

Code:
<?php

$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// We need to find the site root and replace each ../ with real values.

$url = 'http://www.myointernational.com/test_dir/';
$tmp_url = $url;       // start from the page URL
$number_of_them = 0;   // stays 0 when the link contains no ../
if ($string contains '../') {
    $number_of_them = count(the number of them);
}
$i = 1;
while ($i <= $number_of_them) {
    // go up from the current position, not from the original $url each time
    $tmp_url = go up one level from $tmp_url;
    $i++;
}
?>
<img src="<?php echo $tmp_url; ?>e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />

How would I go about writing real code to make the pseudo code work?
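
One way the pseudo code could be made to work is sketched below, assuming PHP 7 or later. Rather than counting the ../ parts, it joins the relative link onto the page's directory and then walks the path one segment at a time, dropping a segment for every "..". The function name resolveUrl is made up for this example and is not a built-in; parse_url and preg_replace_callback are standard PHP functions. Treat it as a starting point under those assumptions, not a complete URL resolver.

```php
<?php
// Hypothetical helper (name invented for this sketch): turns a relative
// link into an absolute URL, given the URL of the page it was found on.
function resolveUrl(string $base, string $relative): string
{
    // A link that already starts with a scheme (http:, https:, data:, ...)
    // is absolute, so return it unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $relative)) {
        return $relative;
    }

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $basePath = $parts['path'] ?? '/';

    if (strpos($relative, '/') === 0) {
        // Root-relative link such as /images/a.jpg
        $path = $relative;
    } else {
        // Keep only the directory part of the base path, then append the link.
        $dir  = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path = $dir . $relative;
    }

    // Walk the path: ".." removes the previous segment, "." and empty
    // segments are skipped. Going above the site root is simply ignored.
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.' && $segment !== '') {
            $out[] = $segment;
        }
    }

    return $root . '/' . implode('/', $out);
}

// Usage with the values from the question.
$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
$url    = 'http://www.myointernational.com/test_dir/';

// Rewrite every double-quoted src attribute in the scraped HTML.
$fixed = preg_replace_callback(
    '#\bsrc="([^"]*)"#i',
    function (array $m) use ($url): string {
        return 'src="' . resolveUrl($url, $m[1]) . '"';
    },
    $string
);

echo $fixed, "\n";
// Prints: <img src="http://www.myointernational.com/e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
```

Known gaps in this sketch: it does not handle protocol-relative links (//host/path), it drops a trailing slash from the result, and the regular expression only matches src attributes written with double quotes. For real pages, loading the HTML with DOMDocument and reading the src attributes from there is likely to be more robust than a regular expression.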
