Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:
Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:
Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the site root and replace each ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
$number_of_them=count(the number of '../' in the src);
}
// start from the page URL and climb one level per '../'
$tmp_url=$url;
$i=1;
while($i<=$number_of_them){
$tmp_url=go up one level from the $tmp_url;
$i++;
}
$tmp_url=$tmp_url . the file name from the src;
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?
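One way the pseudo code above might be turned into working PHP is sketched below. It is only a sketch under a few assumptions: the function name `resolve_relative_url` is made up for illustration, the base URL is assumed to be a well-formed http(s) address, and protocol-relative links (`//host/...`), trailing slashes and fragments are not specially handled. Rather than counting the `../` occurrences, it joins the relative link onto the directory of the base URL and then walks the path segments, dropping one level for every `..`.

```php
<?php
// Hypothetical helper (not from the original post): resolve a relative
// link against the URL of the page it was found on.
function resolve_relative_url(string $base, string $relative): string
{
    // A link that already has a scheme (http:, https:, ...) is absolute.
    if (!empty(parse_url($relative, PHP_URL_SCHEME))) {
        return $relative;
    }

    // Assumption: $base is a well-formed URL with a scheme and a host.
    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (substr($relative, 0, 1) === '/') {
        // Root-relative link: the base path is ignored.
        $path = $relative;
    } else {
        // Keep the base path up to and including its last "/",
        // then append the relative link.
        $basePath = $parts['path'] ?? '/';
        $dir      = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path     = $dir . $relative;
    }

    // Walk the segments: ".." removes the previous one, "." and empty
    // segments are skipped. Extra ".." at the site root are ignored.
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $resolved);
}

$url = 'http://www.myointernational.com/test_dir/';
$src = '../../e-commerce_in_a_box_small.jpg';

echo resolve_relative_url($url, $src), "\n";
// prints http://www.myointernational.com/e-commerce_in_a_box_small.jpg
```

Note that with the `$url` from the question there is only one directory above the page, so the second `../` has nowhere to go and the result stays at the site root, which is also how browsers treat it. For a real scraper, the `src` values would probably be better pulled out with an HTML parser than with string matching before being passed to a function like this.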
