How to extract/download content from an HTTPS page?
Posted on 16th Feb 2014 07:03 pm by admin
Hello to all the members of this forum. I'm Shoiab, a novice PHP programmer. For my first job I have recently been assigned a project in which I have to extract/download the contents of a web page (on my client's website) that is served over HTTPS, using cURL. In other words, I want to save an exact copy of that page to my localhost.
Here is what I have done so far. I am able to download content from "www.virginholidays.co.uk". Here is the link to book a resort:
"http://www.virginholidays.co.uk/brochures/florida/holidays/orlando/kissimmee/champions_world_resort". When I click the BOOK THE HOLIDAY button, it takes me to an HTTPS page that I am not able to download (https://www.virginholidays.co.uk/book/start).
I'm using Windows XP, IE 5, PHP 5.2 and Fiddler.
Here is my code:
// Raw HTTP request for the booking page ($req1 is never sent by the cURL code below).
$req1  = "GET /book/start HTTP/1.0\r\n";
$req1 .= "Accept: */*\r\n";
$req1 .= "Accept-Encoding: gzip, deflate\r\n";
$req1 .= "Cookie: _#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; "
       . "__utmc=262657675; CoreID6=60127103647212586967853; "
       . "__utma=262657675.233062282.1258696796.1259047752.1259059734.14; "
       . "__utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); "
       . "_#uid=1258696798931.315033071.3223127.1883.436744734.051; "
       . "_#srchist=11611%3A1%3A20091221055958; "
       . "_#sess=1%7C20091120062958%7C1; _#vdf=11611%7C1%7C20091221055958; "
       . "__utmb=262657675; "
       . "ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; "
       . "cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; "
       . "_csoot=1259036845125; "
       . "RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN"
       . "&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0"
       . "&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8="
       . "&SearchType=complete; _csuid=X47174a9c82f607; "
       . "cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop\r\n";
$req1 .= "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; "
       . "InfoPath.2; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)\r\n";
// The Host header carries only the host name, not a full URL.
$req1 .= "Host: www.virginholidays.co.uk\r\n";
$req1 .= "Connection: Keep-Alive\r\n";
$req1 .= "Accept-Language: en-us\r\n\r\n";
$header[0]  = "Accept: text/xml,application/xml,application/xhtml+xml,application/json,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: public";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
// Cookie string copied from the browser session, joined onto one logical line.
$cookie = "#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; "
        . "__utmc=262657675; CoreID6=60127103647212586967853; "
        . "__utma=262657675.233062282.1258696796.1259047752.1259059734.14; "
        . "__utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); "
        . "_#uid=1258696798931.315033071.3223127.1883.436744734.051; "
        . "_#srchist=11611%3A1%3A20091221055958; "
        . "_#sess=1%7C20091120062958%7C1; _#vdf=11611%7C1%7C20091221055958; "
        . "__utmb=262657675; "
        . "ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; "
        . "cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; "
        . "_csoot=1259036845125; "
        . "RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN"
        . "&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0"
        . "&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8="
        . "&SearchType=complete; _csuid=X47174a9c82f607; "
        . "cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.virginholidays.co.uk/book/start");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
$response1 = curl_exec($ch);
curl_close($ch);
echo $response1;
// Rewrite site-relative paths to absolute URLs, starting from the fetched page ($response1).
$response = str_replace("/_assets/", "http://www.virginholidays.co.uk/_assets/", $response1);
$response = str_replace("/brochures/", "http://www.virginholidays.co.uk/brochures/", $response);
$response = str_replace("/dynamichtag.aspx", "http://www.virginholidays.co.uk/dynamichtag.aspx", $response);
echo $response;
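The path rewriting at the end of the code above could also be folded into one helper. This is only a sketch: absolutize_paths() is a hypothetical name, not part of the original post, and it assumes the paths to rewrite always appear directly after a quote or an opening parenthesis (as in src="...", href='...' or url(...)). Unlike a plain str_replace, it leaves URLs that are already absolute untouched, so they do not get the site prefix a second time.

```php
<?php
// Hypothetical helper (not from the original post): prefix site-relative
// paths with the site's origin so a saved copy still loads its assets.
function absolutize_paths($html, $origin, array $prefixes)
{
    foreach ($prefixes as $prefix) {
        // Match the prefix only when it directly follows ", ' or (,
        // so an already absolute URL such as http://host/_assets/ is skipped.
        $pattern = '#(["\'(])' . preg_quote($prefix, '#') . '#';
        // ${1} puts back the quote or parenthesis that was matched.
        $html = preg_replace($pattern, '${1}' . $origin . $prefix, $html);
    }
    return $html;
}
```

It could be called as absolutize_paths($response1, 'http://www.virginholidays.co.uk', array('/_assets/', '/brochures/', '/dynamichtag.aspx')).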
Could you please help me download the content of the HTTPS page? I'm not sure what the issue is. Has the cookie or session expired, or do I need to write different code?
Please help,
Thanks
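One way to test the expired-cookie theory is to stop pasting cookies copied from the browser and let cURL collect fresh ones. The sketch below is a guess at a starting point, not a confirmed fix. It assumes the booking page relies on a session started on the brochure page. The function names, the cookie-jar path passed in, and the redirect limit are all made up for illustration. It also reports curl_error() and the HTTP status, so a failure says more than an empty response does.

```php
<?php
// Sketch only: function names, jar path and redirect limit are assumptions.

// Options for a cURL handle that stores and resends its own cookies.
function session_options($cookieJar)
{
    return array(
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow 3xx redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_COOKIEFILE     => $cookieJar, // enables the cookie engine, reads saved cookies
        CURLOPT_COOKIEJAR      => $cookieJar, // cookies are written here when the handle is closed
        CURLOPT_ENCODING       => '',         // accept every encoding libcurl supports
        CURLOPT_USERAGENT      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    );
}

// Fetch one URL on an existing handle and report what happened.
function fetch_page($ch, $url)
{
    curl_setopt($ch, CURLOPT_URL, $url);
    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure (DNS, SSL, timeout, ...).
        return array('ok' => false, 'status' => 0, 'error' => curl_error($ch), 'body' => '');
    }
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return array('ok' => $status < 400, 'status' => $status, 'error' => '', 'body' => $body);
}

// Visit the brochure page first so the server can set fresh cookies,
// then request the booking page on the same handle.
function download_booking_page($brochureUrl, $bookingUrl, $cookieJar)
{
    $ch = curl_init();
    curl_setopt_array($ch, session_options($cookieJar));
    $first  = fetch_page($ch, $brochureUrl);
    $result = $first['ok'] ? fetch_page($ch, $bookingUrl) : $first;
    curl_close($ch);
    return $result;
}
```

Calling download_booking_page() with the brochure URL, https://www.virginholidays.co.uk/book/start and a writable file path returns an array whose 'error' and 'status' entries show where the request failed. Certificate checking is left at cURL's defaults here, so an older Windows build may need CURLOPT_CAINFO pointed at a CA bundle. On PHP 5.2, CURLOPT_FOLLOWLOCATION is refused when safe_mode or open_basedir is set. If the BOOK THE HOLIDAY button submits a form, the second request may also need those form fields sent as POST data, which this sketch does not attempt.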