Is there any way of limiting the amount of data cURL will fetch? I'm screen-scraping data off a page that is 50 KB, but the data I need is in the top quarter of the page, so I really only need to retrieve the first 10 KB.
I'm asking because there is a lot of data I need to monitor, which results in me transferring close to 60 GB of data per month, when only about 5 GB of that bandwidth is relevant.
I am using PHP to process the data, but I am flexible in my retrieval approach; I can use cURL, wget, fopen(), etc.
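For reference, here is a minimal sketch of what a byte-limited fetch could look like with PHP's cURL extension (assuming PHP 5.3+ for the closure; the URL is the placeholder from above). It uses CURLOPT_RANGE, which only helps if the server honors Range requests, plus a CURLOPT_WRITEFUNCTION callback as a fallback that aborts the transfer once enough bytes have arrived:

$limit  = 10240; // only want the first 10 KB
$buffer = '';

$ch = curl_init("http://www.website.com");
// Ask for just the first 10 KB; only honored if the server supports Range requests.
curl_setopt($ch, CURLOPT_RANGE, "0-" . ($limit - 1));
// Fallback: abort the transfer ourselves once enough data has arrived.
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use (&$buffer, $limit) {
    $buffer .= $chunk;
    if (strlen($buffer) >= $limit) {
        return 0; // returning anything but the chunk length makes cURL abort
    }
    return strlen($chunk);
});
curl_exec($ch); // reports a write error when we abort early; the buffer is still usable
curl_close($ch);
$data_to_parse = substr($buffer, 0, $limit);

Even when the server ignores the Range header, this should cap the download at roughly the limit plus one extra chunk.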
One approach I'm considering is:
$fp = fopen("http://www.website.com", "r");
fseek($fp, 5000);                  // skip past the first 5 KB
$data_to_parse = fread($fp, 6000); // read the next 6 KB
Does the above mean I will only transfer 6 KB from www.website.com, or will fopen() download the whole page into memory, meaning I will still transfer the full 50 KB?
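For comparison, a minimal sketch of the stream approach I'd fall back on. Note that fseek() generally fails on http:// streams because they are not seekable, so the realistic option is to read from the start and stop early; the exact amount transferred still depends on PHP's internal stream buffering:

$fp = fopen("http://www.website.com", "r");
// Read at most 10 KB and stop; stream_get_contents() stops pulling
// data from the connection once the length limit is reached.
$data_to_parse = stream_get_contents($fp, 10240);
fclose($fp);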