XPath contains(text(),'some string') doesn’t work when used with a node that has more than one Text subnode

I have a small problem with the XPath contains() function in dom4j …

Let’s say my XML is

<Home>
    <Addr>
        <Street>ABC</Street>
        <Number>5</Number>
        <Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>
    </Addr>
</Home>

Let’s say I want to find all the elements that have ABC in their text, starting from the root Element.

The XPath that I would need to write would be

//*[contains(text(),'ABC')]

However, this is not what dom4j returns: the query returns only the Street element and not the Comment element. Is this a dom4j problem, or am I misunderstanding how XPath works?

The DOM makes the Comment element a composite element with four child nodes, two of which are text nodes:

[Text="BLAH BLAH BLAH "][BR][BR][Text="ABC"]

I would assume that the query should still return the element since it should find the element and run contains on it, but it doesn’t …
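One likely explanation, going by XPath 1.0 semantics: when a node-set such as text() is passed to a function that expects a string, only the first node in document order is converted to a string. The sketch below checks this on the Comment element. It uses the JDK's built-in javax.xml.xpath API rather than dom4j (an assumption made to keep the example self-contained; dom4j's XPath engine is expected to follow the same XPath 1.0 rule). The class name FirstTextNodeDemo and the helper eval are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of dom4j or of the original post.
final class FirstTextNodeDemo {
    // Same document as in the question, written without indentation.
    static final String XML =
        "<Home><Addr>"
        + "<Street>ABC</Street>"
        + "<Number>5</Number>"
        + "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
        + "</Addr></Home>";

    // Evaluates an XPath 1.0 expression against XML and returns the result as a string.
    static String eval(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Comment has two text children, separated by the <br/> elements.
        System.out.println(eval("count(//Comment/text()) = 2"));
        // Converting the node-set to a string keeps only the first text node.
        System.out.println("[" + eval("string(//Comment/text())") + "]");
        // So contains() never sees the trailing "ABC" text node.
        System.out.println(eval("contains(//Comment/text(), 'ABC')"));
    }
}
```

If this holds for dom4j as well, the query in the question tests only the first text child of each element, which would explain why Street matches and Comment does not.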

The following query returns the element, but it returns far more than just that element: it returns the parent elements as well, which is not what I want.

//*[contains(.,'ABC')]

Does anyone know the XPath query that would return just the <Street/> and <Comment/> elements?
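One candidate worth trying is to move contains() into a predicate on the text nodes themselves, so that every text child is tested individually: //*[text()[contains(.,'ABC')]]. The sketch below compares it with the other two forms. As an assumption for the sake of a self-contained example, it runs on the JDK's javax.xml.xpath API rather than dom4j, so it illustrates XPath 1.0 behaviour and is not a confirmed dom4j result. The class name TextNodePredicateDemo and the helper matchedNames are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of dom4j or of the original post.
final class TextNodePredicateDemo {
    // Same document as in the question, written without indentation.
    static final String XML =
        "<Home><Addr>"
        + "<Street>ABC</Street>"
        + "<Number>5</Number>"
        + "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
        + "</Addr></Home>";

    // Returns the names of all elements selected by the given XPath expression.
    static List<String> matchedNames(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(XML)),
                          XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Tests only the first text child of each element.
        System.out.println(matchedNames("//*[contains(text(),'ABC')]"));
        // Tests the full string value, so ancestors match too.
        System.out.println(matchedNames("//*[contains(.,'ABC')]"));
        // Tests each text child on its own.
        System.out.println(matchedNames("//*[text()[contains(.,'ABC')]]"));
    }
}
```

The inner predicate is evaluated once per text node, so an element is selected as soon as any of its direct text children contains the string, while ancestors whose own text children lack it are left out.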
