Writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this:
Code:
<img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />
might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:
Code:
<?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the site root and replace each ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
$number_of_them=0;
if($string contains '../'){
    $number_of_them=count(the number of them);
}
// start from the page URL and climb one directory per ../
$tmp_url=$url;
$i=1;
while($i<=$number_of_them){
    $tmp_url=go up one level from the $tmp_url;
    $i++;
}
// append what is left of the src once the ../ parts are removed
$tmp_url=$tmp_url . the file name from the src;
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />

How would I go about writing real code to make this pseudo code work?
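
One possible way to turn the pseudo code into working PHP, offered as a minimal sketch and not a definitive answer: rather than counting the ../ parts, join the directory of the page URL with the relative src and then walk the path segments, letting each ".." remove the segment before it. The function name resolve_url() is hypothetical (it is not a PHP built-in), and the sketch assumes the base is a full http(s) URL and that the src is a plain file path with no query string or fragment.

```php
<?php
// Hypothetical helper, not a PHP built-in: resolve a relative src
// against the absolute URL of the page it was found on.
function resolve_url(string $base, string $relative): string
{
    // Already absolute (scheme://...): nothing to resolve.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $relative)) {
        return $relative;
    }

    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Protocol-relative (//host/path): reuse the scheme of the base.
    if (strpos($relative, '//') === 0) {
        return $parts['scheme'] . ':' . $relative;
    }

    if ($relative !== '' && $relative[0] === '/') {
        // Root-relative: start from the site root.
        $path = $relative;
    } else {
        // Document-relative: start from the directory of the base path.
        $basePath = $parts['path'] ?? '/';
        $dir  = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path = $dir . $relative;
    }

    // Walk the segments: skip "." and empty ones, let ".." pop the last one.
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out); // on an empty stack this is a no-op, so we stay at the root
        } elseif ($segment !== '.' && $segment !== '') {
            $out[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $out);
}

// Usage with the values from the question.
$src = '../../e-commerce_in_a_box_small.jpg';
echo resolve_url('http://www.myointernational.com/test_dir/', $src), "\n";
// prints http://www.myointernational.com/e-commerce_in_a_box_small.jpg
```

Because /test_dir/ is only one level below the root, the second ../ has nowhere to go and the result stays at the site root, which is also how browsers treat it. The sketch drops trailing slashes and does not handle query strings or fragments; RFC 3986 section 5.2 describes the full resolution rules if those cases matter.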
