Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:

Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />

might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:

Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace each ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
    $number_of_them=count(the number of '../' in $string);
}
$tmp_url=$url;
$i=1;
while($i<=$number_of_them){
    $tmp_url=go up one level from $tmp_url;
    $i++;
}
// $tmp_url is now only the directory; the file name still has to be appended.
?>
<img src="<?php echo $tmp_url;?>e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?
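
One possible way to turn the pseudo code into working PHP is sketched below. It is only a sketch under some assumptions: the function name resolve_relative_url() is made up for this example, the base URL is assumed to be an absolute http(s) address, and query strings, fragments and protocol-relative links (//host/...) are not handled. Rather than counting the ../ occurrences, it splits the combined path into segments and lets each ".." remove the directory before it, which is roughly how browsers resolve relative links.

```php
<?php
// Hypothetical helper (not from the original post): resolves a relative
// reference such as "../img.jpg" against the URL of the page it was found on.
function resolve_relative_url(string $base, string $relative): string
{
    // A reference that already starts with a scheme ("http:") is absolute.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $relative)) {
        return $relative;
    }

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $path  = $parts['path'] ?? '/';

    if (substr($relative, 0, 1) === '/') {
        // Root-relative link: the base path is ignored.
        $path = $relative;
    } else {
        // Keep the base path up to its last "/", then append the link.
        $path = substr($path, 0, strrpos($path, '/') + 1) . $relative;
    }

    // ".." removes the previous directory, "." and empty segments are skipped.
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved); // at the root this does nothing
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    // Note: a trailing "/" on the result is dropped, which is fine for files.
    return $root . '/' . implode('/', $resolved);
}

// Values taken from the question.
$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
$url    = 'http://www.myointernational.com/test_dir/';

// A simple regex is enough for this single tag; it only matches
// double-quoted src attributes.
preg_match_all('/<img\b[^>]*\bsrc="([^"]*)"/i', $string, $matches);

foreach ($matches[1] as $src) {
    echo resolve_relative_url($url, $src), "\n";
    // prints http://www.myointernational.com/e-commerce_in_a_box_small.jpg
}
```

Because $url has only one directory (test_dir) and the link goes up two levels, the result stops at the site root instead of going above it. For whole scraped pages, PHP's DOMDocument class is likely to be more reliable than a regular expression for pulling out the src attributes.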
