First you have to understand a subtle but very important difference between using cURL to get a webpage, and using your browser visiting that same page.
1. Loading a page with a browser
When you enter the address on the location bar, the browser converts the url into an ip address . Then it tries to reach the web server with that address asking for a web page. From now on the browser will only speak HTTP with the web server. HTTP is a protocol made for carrying documents over network. The browser is actually asking for an html document (A bunch of text) from the web server. The web server answers by sending the web page to the browser. If the web page is a static page, the web server is just picking an html file and sending it over network. If it's a dynamic page, the web server use some high level code (like php) to generate to the web page then send it over.
Once the web page has been downloaded, the browser will then parse the page and interprets the html inside which produces the actual web page on the browser. During the parsing process, when the browser finds script
tags it will interpret their content as javascript, which is a language used in browser to manipulate the look of the web page and do stuff inside the browser.
Remember, the web server only sent a web page containing html content he has no clue of what's javascript.
So when you load a web page on a browser the javascript is ONLY interpreted once it is downloaded on the browser.
2. What is cURL
If you take a look at curl man page, you'll learn that curl is a tool to transfer data from/to servers which can speak some supported protocols and HTTP is one of them.
When you download a page with curl, it will try to download the page the same way your browser does it but will not parse or interpret anything. cURL does not understand javascript or html, all it knows about is how to speak to web servers.
3. Solution
So what you need in your case is to download the page like cURL does it and also somehow make the javascript to be interpreted as if it was inside a browser.
If you had follwed me up to here then you're ready to take a look at CasperJS.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…