How to extract/download content from HTTPS page?


Posted on 16th Feb 2014 07:03 pm by admin

Hello to all members of this forum. I'm Shoiab, a novice PHP programmer. For my first job I have been assigned a project in which I have to extract/download the contents of a web page on my client's website, served over HTTPS, using cURL. In other words, I want to save an exact copy of that page to my localhost.

Here is what I have done so far. I am able to download content from "www.virginholidays.co.uk". Here is the link to book a resort:
"http://www.virginholidays.co.uk/brochures/florida/holidays/orlando/kissimmee/champions_world_resort". When I click the BOOK THE HOLIDAY button, it takes me to an HTTPS page (https://www.virginholidays.co.uk/book/start), and that is the page I am not able to download.

I'm using Windows XP, IE 5, PHP 5.2 and Fiddler.

Here is my code:

// Raw HTTP request. Note: $req1 is never sent by the cURL code further down.
$req1  = "GET /book/start HTTP/1.0\r\n";
$req1 .= "Accept: */*\r\n";
$req1 .= "Accept-Encoding: gzip, deflate\r\n";
$req1 .= "Cookie: _#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; __utmc=262657675; "
       . "CoreID6=60127103647212586967853; "
       . "__utma=262657675.233062282.1258696796.1259047752.1259059734.14; "
       . "__utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); "
       . "_#uid=1258696798931.315033071.3223127.1883.436744734.051; "
       . "_#srchist=11611%3A1%3A20091221055958; _#sess=1%7C20091120062958%7C1; "
       . "_#vdf=11611%7C1%7C20091221055958; __utmb=262657675; "
       . "ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; "
       . "cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; _csoot=1259036845125; "
       . "RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN"
       . "&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0"
       . "&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8="
       . "&SearchType=complete; _csuid=X47174a9c82f607; "
       . "cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop\r\n";
$req1 .= "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.2; "
       . ".NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)\r\n";
$req1 .= "Host: www.virginholidays.co.uk\r\n"; // Host takes a host name only, no scheme
$req1 .= "Connection: Keep-Alive\r\n";
$req1 .= "Accept-Language: en-us\r\n\r\n";     // blank line ends the header block

// Extra request headers passed to cURL via CURLOPT_HTTPHEADER.
$header = array();
$header[] = "Accept: text/xml,application/xml,application/xhtml+xml,application/json,"
          . "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: public";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
// Cookie string copied from the browser session (one line, "; " between pairs).
$cookie = '_#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; __utmc=262657675; '
        . 'CoreID6=60127103647212586967853; '
        . '__utma=262657675.233062282.1258696796.1259047752.1259059734.14; '
        . '__utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); '
        . '_#uid=1258696798931.315033071.3223127.1883.436744734.051; '
        . '_#srchist=11611%3A1%3A20091221055958; _#sess=1%7C20091120062958%7C1; '
        . '_#vdf=11611%7C1%7C20091221055958; __utmb=262657675; '
        . 'ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; '
        . 'cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; _csoot=1259036845125; '
        . 'RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN'
        . '&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0'
        . '&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8='
        . '&SearchType=complete; _csuid=X47174a9c82f607; '
        . 'cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.virginholidays.co.uk/book/start");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);   // redirects are not followed
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);   // certificate verification disabled
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_HEADER, 1);           // response headers are included in $response1
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
$response1 = curl_exec($ch);
if ($response1 === false) {
    // Show why the request failed instead of echoing an empty response.
    echo 'cURL error: ' . curl_error($ch);
}
curl_close($ch);
echo $response1;

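// Rewrite root-relative paths so they point back at the original site.
// The first call must read $response1; $response is not defined before this point.
$response = str_replace("/_assets/", "http://www.virginholidays.co.uk/_assets/", $response1);
$response = str_replace("/brochures/", "http://www.virginholidays.co.uk/brochures/", $response);
$response = str_replace("/dynamichtag.aspx", "http://www.virginholidays.co.uk/dynamichtag.aspx", $response);
echo $response;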
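
The three str_replace calls above could be folded into one helper. What follows is only a sketch: the function name absolutize_paths and its parameters are made up for illustration and are not part of the code above. It assumes the paths to rewrite appear directly after a quote character (as in href="..." or src="..."), so URLs that are already absolute are left alone.

```php
<?php
/**
 * Prefix root-relative paths with a base URL.
 * Hypothetical helper: only rewrites a prefix that directly follows
 * a single or double quote, so already-absolute URLs are not touched.
 */
function absolutize_paths($html, $baseUrl, array $prefixes)
{
    $baseUrl = rtrim($baseUrl, '/'); // avoid a double slash when joining
    foreach ($prefixes as $prefix) {
        foreach (array('"', "'") as $quote) {
            $html = str_replace($quote . $prefix, $quote . $baseUrl . $prefix, $html);
        }
    }
    return $html;
}

// Example with a small HTML fragment (not the real page).
$sample = '<img src="/_assets/a.png"><a href="/brochures/florida">Florida</a>';
$fixed  = absolutize_paths(
    $sample,
    'http://www.virginholidays.co.uk',
    array('/_assets/', '/brochures/', '/dynamichtag.aspx')
);
```

With the code above, the same call would be made on $response1 instead of $sample.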

Could you please help me download the content of the HTTPS page? I'm not sure what the issue is. Has the cookie or session expired, or do I need to write different code?

Please help.
Thanks
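
On the cookie/session question, one way to find out is to let cURL collect fresh cookies itself rather than pasting a cookie string from an old browser session. The sketch below is not a confirmed fix. It assumes the site sets its session cookie on the brochure page and accepts it on the HTTPS page. The function names fetch_with_jar and describe_result, the result array keys, and the optional CA bundle parameter are all hypothetical.

```php
<?php
/**
 * Fetch a URL, storing and re-sending cookies through a cookie jar file.
 * Hypothetical helper; $caBundle is an optional path to a CA certificate file.
 */
function fetch_with_jar($url, $cookieJar, $caBundle = null)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar);  // write cookies here on close
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJar); // read cookies from here
    curl_setopt($ch, CURLOPT_ENCODING, '');           // accept any supported encoding
    if ($caBundle !== null) {
        curl_setopt($ch, CURLOPT_CAINFO, $caBundle);  // verify the HTTPS certificate
    }
    $body = curl_exec($ch);
    $result = array(
        'url'       => $url,
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'error'     => ($body === false) ? curl_error($ch) : '',
        'body'      => $body,
    );
    curl_close($ch);
    return $result;
}

/** Turn a result array into a short diagnosis string. */
function describe_result($result)
{
    if ($result['error'] !== '') {
        return 'cURL error: ' . $result['error'];
    }
    if ($result['final_url'] !== $result['url']) {
        return 'Redirected to ' . $result['final_url'];
    }
    if ($result['status'] == 200) {
        return 'OK';
    }
    return 'Unexpected HTTP status ' . $result['status'];
}

// Intended usage (makes network requests, so it is left commented out):
// $jar = dirname(__FILE__) . '/cookies.txt'; // assumed writable location
// fetch_with_jar('http://www.virginholidays.co.uk/brochures/florida/holidays/orlando/kissimmee/champions_world_resort', $jar);
// $page = fetch_with_jar('https://www.virginholidays.co.uk/book/start', $jar);
// echo describe_result($page);
```

If the diagnosis is a cURL error, the problem is on the connection or SSL side. If it is a redirect away from /book/start, the site may not have accepted the session, which would point to the cookies rather than the code.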
