Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
342 views
in Technique[技术] by (71.8m points)

c# - Html Agility Pack - Problem selecting subnode

I want to export my Asics running plan to iCal and since Asics do not offer this service, I decided to build a little scraper for my own personal use. What I want to do is to take all the scheduled runs from my plan and generate an iCal feed based on that. I am using C# and Html Agility Pack.

What I want to do is iterate through all my scheduled runs (they are div nodes). Then next I want to select a few different nodes with my run nodes. My code looks like this:

foreach (var run in doc.DocumentNode.SelectSingleNode("//div[@id='scheduleTable']").SelectNodes("//div[@class='pTdBox']"))
{
    number++;
    string date = run.SelectSingleNode("//div[@class='date']").InnerText;
    string type = run.SelectSingleNode("//span[@class='menu']").InnerHtml;
    string distance = run.SelectSingleNode("//span[@class='distance']").InnerHtml;
    string description = run.SelectSingleNode("//div[@class='description']").InnerHtml;
    ViewData["result"] += "Dato: " + date + "<br />";
    ViewData["result"] += "Tyep: " + type + "<br />";
    ViewData["result"] += "Distance: " + distance + "<br />";
    ViewData["result"] += "Description: " + description + "<br />";
    ViewData["result"] += run.InnerHtml.Replace("<", "&lt;").Replace(">", "&gt;") + "<br />" + "<br />" + "<br />";
}

My problem is that run.SelectSingleNode("//div[@class='date']").InnerText does not select the node with the given XPath within the given run node. It selects the first node that matches the XPath in the entire document.

How can I select the single node with the given XPath within the current node?

Thank you.

Update

I tried updating my XPath string to this:

string date = run.SelectSingleNode(".div[@class='date']").InnerText;

This should select the <div class="date"></div> element within the current node, right? Well, I tried this but got this error:

Expression must evaluate to a node-set. Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.Xml.XPath.XPathException: Expression must evaluate to a node-set.

Any suggestions?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

A few things that will help you when working with HtmlAgilityPack and XPath expressions.

If run is an HtmlNode, then:

  1. run.SelectNodes("//div[@class='date']")
    Will will behave exactly like doc.DocumentNode.SelectNodes("//div[@class='date']")

  2. run.SelectNodes("./div[@class='date']")
    Will give you all the <div> nodes that are children of run node. It won't search deeper, only at the very next depth level.

  3. run.SelectNodes(".//div[@class='date']")
    Will return all the <div> nodes with that class attribute, but not only next to the run node, but also will search in depth (every possible descendant of it)

You will have to choose between 2. or 3., depending on which one satisfy your needs :)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...