writing a screen scraper


Posted on 16th Feb 2014 07:03 pm by admin

Hello,

I'm writing a screen scraper application and want to be able to get absolute addresses for images from relative links.

So a link like this: Code: <img src="../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" /> might link to http://www.myointernational.com/furniture/e-commerce_in_a_box_small.jpg

If I am analysing a web address, I understand that the pseudo code would be something like this:Code: <?php

$string='<img src="../../e-commerce_in_a_box_small.jpg" alt="E-Commerce" width="100" height="134" border="0" />';
// we need to find the system root and replace the ../ with REAL values.

$url='http://www.myointernational.com/test_dir/';
if($string contains '../'){
$number_of_them=count(the number of them);
}
$i=1
while($i<=$number_of_them){
$tmp_url=go up one level from the $url;
$i++;
}
?>
<img src="<?php echo $tmp_url;?>" alt="E-Commerce" width="100" height="134" border="0" />
How would I go about finding the code to make the pseudo code work

No comments posted yet

Your Answer:

Login to answer
132 Like 46 Dislike
Previous forums Next forums
Other forums

question about n
I was looking at some of the things you could do with php and one of the things I have tried is n.<

Trouble checking SESSION cookie
I am trying to use $_SESSION cookies to verify admin privileges .
I don't understand why this is

XML Grouping
I'm using xml_parse_into_struct to get all my elements, but now I need to group them. For example, h

'210010106140040100' == '210010106140040101'
Debugging this simple line of a PHP script

Code: if($a == $b){ }
I've found that with val

Default TimeZone
The server I'm working with is hosted in America so all times inserted into the database are coming

"SEO" URLs
Hey, I'm wondering how to go about creating and using these types of URLs. I'm presuming it's PHP th

Session login issue
I'm wondering how to fix a problem I'm having with a session-based login system

Say I go to h

Run function every 5 mins ??
I have a function PostMessage()

How can I run it every 5 mins ??

Oracle Text CTX_DOC.snippet slow
I have a table (FILE_TABLE) that contains a blob column (ft_file) and I have created the following O

Character increment
Hi,

I am facing a scenario like above,but in my case i want to show up like Col A,Col B etc..

Sign up to write
Sign up now if you have flare of writing..
Login   |   Register
Follow Us
Indyaspeak @ Facebook Indyaspeak @ Twitter Indyaspeak @ Pinterest RSS



Play Free Quiz and Win Cash