c# - Html Agility Pack. Load and scrape webpage

Question

Welcome To Ask or Share your Answers For Others

c# - Html Agility Pack. Load and scrape webpage

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

c# - Html Agility Pack. Load and scrape webpage

Is this the best way to get a webpage when scraping?

HttpWebRequest oReq = (HttpWebRequest)WebRequest.Create(url);
HttpWebResponse resp = (HttpWebResponse)oReq.GetResponse();

var doc = new HtmlAgilityPack.HtmlDocument();

doc.Load(resp.GetResponseStream());
var element = doc.GetElementbyId("//start-left");
var element2 = doc.DocumentNode.SelectSingleNode("//body");
string html = doc.DocumentNode.OuterHtml;

I've seen HtmlWeb().Load to get a webpage. Is that a better alternative to load and the scrape the webpage?

Ok i'll try that instead.

HtmlDocument doc = web.Load(url);

Now when i got my doc and didn't get so mutch properties. No one like SelectSingleNode. The only one I can use is GetElementById, and that works but I whant to get a class.

Do I need to do it like this?

var htmlBody = doc.DocumentNode.SelectSingleNode("//body");
htmlBody.SelectSingleNode("//paging");

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-17T02:50:09+0000

Much easier to use HtmlWeb.

string Url = "http://something";
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(Url);

Categories

c# - Html Agility Pack. Load and scrape webpage

c# - Html Agility Pack. Load and scrape webpage

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags