Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - Finding Element with Simple HTML Dom

I am using cURL to pull a webpage and then loading that into a simple_html_dom.

When I search for the element with the id "tester" and try to return its plaintext, I get the error "Trying to get property 'plaintext' of non-object in"

The links below will hopefully make the issue clearer:

Page with my code: https://icerace.co.uk/counter/

Page I am attempting to pull from: https://creator.zohopublic.com/james.taylor5/order-management-application/view-perma/CounterGP/FhXFwMj4Tah9gXk52AMS7GZFDK45qaV6eF5M4dF50x1WuApG4BeYs2jVh9nCwggm5WfA1MH8bAUy8NxvZ9Onn9Ds6u80vSBEXTTv


<?php
include('simple_html_dom.php');
error_reporting(E_ALL);
ini_set('display_errors', true);


# $target is set to whatever web page url I am looking to scrape. I can update it so that if it finds Next or page 2 etc then it will cycle around again.

$target = 'https://creator.zohopublic.com/james.taylor5/order-management-application/view-perma/CounterGP/FhXFwMj4Tah9gXk52AMS7GZFDK45qaV6eF5M4dF50x1WuApG4BeYs2jVh9nCwggm5WfA1MH8bAUy8NxvZ9Onn9Ds6u80vSBEXTTv';
 
# Curl needs a User Agent so that it can emulate the end user browser. I often have to play around with different User Agents before I find one that works reliably.
$user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36';
 

$ch = curl_init($target);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_URL, $target);
curl_setopt($ch, CURLOPT_FAILONERROR, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, "");
 
# The next line is for debugging only and spits out the headers so you can look at the handshakes.
# curl_setopt($ch, CURLOPT_VERBOSE, TRUE);
 
# The next two lines turn off TLS certificate verification, which gets some https web sites working but is not secure in any way. If I am productionising something then I tighten these up.
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
 
# This section is only required for Tor, hence why it is commented out
# curl_setopt($ch, CURLOPT_PROXYTYPE, 7);
# curl_setopt($ch, CURLOPT_PROXY, $proxy.':'.$port);
 
# This is where the actual web scraping request is made and errors (if any) are flagged.
$initpage = curl_exec($ch);
curl_close($ch);

$html = new simple_html_dom();
$html->load($initpage);

echo $html->find("#tester")->plaintext;

?>
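
A likely cause of the "non-object" notice: when find() is called with only a selector, simple_html_dom returns an array of matches, and an array has no plaintext property. Passing an index as the second argument returns a single element, or null when nothing matches. The sketch below wraps that in a small helper; first_plaintext is a hypothetical name used for illustration and is not part of simple_html_dom.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): returns the text of the
// first element matching $selector, or null when nothing matches.
function first_plaintext($dom, string $selector): ?string
{
    // With an index as the second argument, find() returns one element
    // (or null) instead of an array of all matches.
    $element = $dom->find($selector, 0);

    // Guard against "no match" so we never read a property of a non-object.
    return is_object($element) ? $element->plaintext : null;
}

// Usage with the $html object built in the question:
// $text = first_plaintext($html, '#tester');
// echo $text ?? 'No element with id "tester" in the fetched HTML';
```

If the helper still returns null, the element is not in the HTML that cURL received. One possible reason, which is an assumption about the target page and not confirmed here, is that the page builds its content with JavaScript after loading; echoing $initpage shows what was actually fetched.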

Question from: https://stackoverflow.com/questions/65867716/finding-element-with-simple-html-dom
