Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Locating a node by a value containing whitespace using XPath

I need to locate a node within an XML file by its value using XPath. The problem arises when the node to find has a value containing whitespace. For example:

<Root>
  <Child>value</Child>
  <Child>value with spaces</Child>
</Root>

I cannot construct an XPath expression that locates the second Child node.

The simple XPath /Root/Child works perfectly for both children, but /Root[Child=value with spaces] returns an empty collection.

I have already tried masking the spaces with %20, &#20; and &nbsp;, and using single and double quotes.

Still no luck.

Does anybody have an idea?


1 Reply


Depending on your exact situation, there are different XPath expressions that will select a node whose value contains whitespace.

First, let us recall that any one of these characters is "whitespace":

    &#x09; -- the Tab

    &#xA; -- newline

    &#xD; -- carriage return

    ' ' or &#x20; -- the space

If you know the exact value of the node, say it is "Hello World" with a single space, then the most direct XPath expression:

     /top/aChild[. = 'Hello World']

will select this node.
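
Applied to the document from the question, the same idea can be sketched in Python as follows. This is only an illustration and assumes Python 3.7 or later, where the standard library's xml.etree.ElementTree accepts the [.='text'] predicate; ElementTree implements just a small subset of XPath, so the path is written relative to the root element rather than as /Root/Child[...]. Note that the string literal is quoted inside the predicate and the predicate is attached to Child, not to Root.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
XML = """<Root>
  <Child>value</Child>
  <Child>value with spaces</Child>
</Root>"""

root = ET.fromstring(XML)

# ElementTree paths are relative to the element they are called on,
# so "Child[...]" here corresponds to /Root/Child[...] in full XPath.
# The value is a quoted string literal inside the predicate.
selected = root.findall("Child[.='value with spaces']")

texts = [child.text for child in selected]
print(texts)
```

Without the quotes, the words in the predicate are not read as a string value, which is consistent with the unquoted attempt in the question finding nothing.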

The difficulty with specifying a value that contains whitespace, however, comes from the fact that we see all whitespace characters just as ... well, whitespace, and cannot tell whether it is a group of spaces or a single tab.

In XPath 2.0 one may use regular expressions, and they provide a simple and convenient solution. Thus we can use an XPath 2.0 expression such as the one below:

    /*/aChild[matches(., "Hello\sWorld")]

to select any child of the top node whose value contains the string "Hello" followed by a whitespace character followed by the string "World". Note the use of the matches() function and of the "\s" pattern, which matches a single whitespace character.
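
The behaviour of such a pattern can be tried out with Python's re module. This is a rough analogue, not an XPath 2.0 engine: the character class below spells out the four whitespace characters listed earlier instead of relying on Python's own definition of whitespace, and the function name is made up for this sketch.

```python
import re

# One whitespace character (space, tab, newline or carriage return)
# between the two words.
HELLO_WS_WORLD = re.compile(r"Hello[ \t\n\r]World")

def matches_hello_world(value: str) -> bool:
    # XPath 2.0 matches() succeeds when the pattern matches any
    # substring, which corresponds to re.search, not re.fullmatch.
    return HELLO_WS_WORLD.search(value) is not None

candidates = ["Hello World", "Hello\tWorld", "HelloWorld", "Hello  World"]
results = [matches_hello_world(v) for v in candidates]
print(results)
```

The last candidate contains two spaces and is rejected, because the pattern asks for exactly one whitespace character. A quantifier (for example a "+" after the whitespace pattern) would be needed to accept a run of whitespace.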

In XPath 1.0 a convenient test for whether a given string contains any whitespace characters is:

not(string-length(.) = string-length(translate(., ' &#9;&#xA;&#xD;','')))

Here we use the translate() function to eliminate any of the four whitespace characters, and compare the length of the resulting string to that of the original string.

So, if in a text editor a node's value is displayed as

"Hello    World",

we can safely select this node with the XPath expression:

/*/aChild[translate(., ' &#9;&#xA;&#xD;','') = 'HelloWorld']
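
The two translate() idioms above can be mirrored in plain Python to see what they do to a string. This is a sketch of the string logic only, with hypothetical helper names; it does not evaluate XPath.

```python
# The four whitespace characters: space, tab, newline, carriage return.
XML_WHITESPACE = " \t\n\r"

# Translation table that deletes those characters, playing the role of
# translate(., ' &#9;&#xA;&#xD;', '') in the XPath expressions.
_DELETE_WS = str.maketrans("", "", XML_WHITESPACE)

def strip_all_whitespace(value: str) -> str:
    """Return value with every whitespace character removed."""
    return value.translate(_DELETE_WS)

def contains_whitespace(value: str) -> bool:
    """True if removing whitespace changes the length of the string."""
    return len(value) != len(strip_all_whitespace(value))

print(contains_whitespace("Hello    World"))
print(strip_all_whitespace("Hello \t\n World") == "HelloWorld")
```

One consequence of comparing against 'HelloWorld' is that the position of the whitespace is ignored: a value such as "Hell oWorld" would pass the same test.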

In many cases we can also use the XPath function normalize-space(), which produces from its string argument another string in which leading and trailing whitespace is removed and every run of whitespace within the string is replaced by a single space.

In the above case, we will simply use the following XPath expression:

/*/aChild[normalize-space() = 'Hello World']
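
For readers who want to check what normalize-space() would produce for a given value, its effect can be approximated in Python. The function below is an emulation written for this illustration, restricted to the four whitespace characters listed earlier; it is not a library API.

```python
import re

# A run of one or more of: space, tab, newline, carriage return.
_WS_RUN = re.compile(r"[ \t\n\r]+")

def normalize_space(value: str) -> str:
    """Collapse each whitespace run to one space and trim both ends."""
    return _WS_RUN.sub(" ", value).strip(" ")

print(repr(normalize_space(" \n Hello \t  World \r\n")))
```

Because the comparison happens after normalization, the XPath expression selects the node no matter which combination of spaces, tabs and line breaks separates the two words, while still requiring that some whitespace is there.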

