Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:

Code:
<img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />

might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:

Code:
<?php

$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace each ../ with REAL values.

$url = 'http://www.myointernational.com/test_dir/';
$number_of_them = 0;
if ($string contains '../') {
    $number_of_them = count(the number of them);
}
$tmp_url = $url;
$i = 1;
while ($i <= $number_of_them) {
    $tmp_url = go up one level from the $tmp_url;
    $i++;
}
?>
<img src="<?php echo $tmp_url; ?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work?
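
One possible way to make the pseudo code work is sketched below. Rather than counting the ../ occurrences, it joins the relative path onto the directory of the page URL and then walks the path segments, dropping one level for every '..'. The function name resolve_relative_url is made up for this example (it is not a PHP built-in), the regular expression assumes a double-quoted src attribute, and the code assumes PHP 7 or later. It is a minimal sketch and does not cover every case in the URL standard (for example protocol-relative //host links or trailing slashes).

```php
<?php
// Hypothetical helper (not a PHP built-in): turn a relative link found on
// the page at $base into an absolute URL.
function resolve_relative_url(string $base, string $relative): string
{
    // A link that already has a scheme (http:, https:, ...) is absolute.
    if (is_string(parse_url($relative, PHP_URL_SCHEME))) {
        return $relative;
    }

    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if ($relative !== '' && $relative[0] === '/') {
        // A leading slash means the path starts at the site root.
        $path = $relative;
    } else {
        // Keep the directory part of the page URL and drop any file name.
        $base_path = $parts['path'] ?? '/';
        $dir  = substr($base_path, 0, strrpos($base_path, '/') + 1);
        $path = $dir . $relative;
    }

    // Walk the segments: '..' goes up one level, '.' and '' are skipped.
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($segments); // already at the root: nothing to remove
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $segments);
}

// Usage with the values from the question.
$string = '<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
$url    = 'http://www.myointernational.com/test_dir/';

// Assumption: the src attribute is double-quoted, as in the example tag.
if (preg_match('/\bsrc="([^"]*)"/i', $string, $match)) {
    $absolute = resolve_relative_url($url, $match[1]);
    // The src becomes http://www.myointernational.com/e-commerce_in_a_box_small.jpg
    // (the second ../ would go above the site root, so it is ignored).
    echo str_replace($match[0], 'src="' . $absolute . '"', $string), "\n";
}
```

For scraping whole pages, pulling the src values out with DOMDocument instead of a regular expression is likely to be more robust, since real pages mix single quotes, double quotes and unquoted attributes.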
