Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:
Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
might resolve to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:
Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the site root and replace each ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
    $number_of_them=count(the number of them);
}
// start from the page URL and climb one directory per ../
$tmp_url=$url;
$i=1;
while($i<=$number_of_them){
    $tmp_url=go up one level from $tmp_url;
    $i++;
}
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about writing the code to make the pseudo code work?
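
One possible way to turn the pseudo code into working PHP is sketched below. It is only a minimal sketch, not a full RFC 3986 resolver: the function name resolve_url() is made up for illustration, it assumes PHP 7 or later and an absolute http(s) page URL, and it does not handle protocol-relative links (//host/path) or preserve trailing slashes. Instead of counting the ../ occurrences, it joins the page's directory with the relative link and then walks the path segments, dropping one directory for every "..".

```php
<?php
// Hypothetical helper (not from the original post): resolves a relative
// link against the URL of the page it was found on.
function resolve_url(string $base, string $relative): string
{
    // A link that already has a scheme (http:, https:, ...) is absolute.
    if (is_string(parse_url($relative, PHP_URL_SCHEME))) {
        return $relative;
    }

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (strpos($relative, '/') === 0) {
        // Root-relative link: ignore the page's own directory.
        $path = $relative;
    } else {
        // Keep the directory part of the page URL, drop any file name.
        $basePath = $parts['path'] ?? '/';
        $dir      = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path     = $dir . $relative;
    }

    // Walk the segments: ".." removes the previous directory,
    // "." and empty segments are skipped.
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($segments); // stays at the root if already empty
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;
        }
    }

    return $root . '/' . implode('/', $segments);
}

// Usage with the values from the question:
$url = 'http://www.myointernational.com/test_dir/';
$src = '../../e-commerce_in_a_box_small.jpg';
echo resolve_url($url, $src), "\n";
// prints http://www.myointernational.com/e-commerce_in_a_box_small.jpg
```

Note that with the page URL from the question there is only one directory (test_dir) to climb out of, so the second ../ has nowhere to go and the result stays at the site root, which is also how browsers treat such links. To pull the src value out of the scraped HTML before resolving it, PHP's DOMDocument class is usually a safer choice than string matching.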
