.string on a Tag object returns a NavigableString object. By contrast, .text gets all the child strings and returns them concatenated using the given separator (an empty string by default). The return type of .text is unicode (str in Python 3).
From the documentation: a NavigableString is just like a Python Unicode string, except that it also supports some of the features described in Navigating the tree and Searching the tree.
From the documentation on .string, we can see that if the HTML looks like this:
<td>Some Table Data</td>
<td></td>
then .string on the second td returns None, but .text returns an empty string, which is a unicode object.
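The difference in return types can be checked with a short script. This is a minimal sketch, assuming Beautiful Soup 4 on Python 3 with the built-in html.parser backend; the table wrapper and the variable names are only there for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Two cells: one with a single string, one empty (wrapped in a table for validity).
html = "<table><tr><td>Some Table Data</td><td></td></tr></table>"
soup = BeautifulSoup(html, "html.parser")
first, second = soup.find_all("td")

first_string = first.string    # a NavigableString
second_string = second.string  # None, because the tag has no children
second_text = second.text      # an empty string, never None

print(type(first_string).__name__)
print(second_string)
print(repr(second_text))
```

Because .text always yields a string, it is the safer choice when the result is passed straight to string methods such as strip().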
To summarize the behavior:
string
- Convenience property of a tag to get the single string within this tag.
- If the tag has a single string child, the return value is that string.
- If the tag has no children or more than one child, the return value is None.
- If the tag has one child tag, the return value is the 'string' attribute of the child tag, recursively.
text
- Get all the child strings and return them concatenated using the given separator.
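The recursive rule and the separator are easier to see in a small example. The following is a sketch, assuming Beautiful Soup 4 with html.parser; the markup and the " | " separator are made up for illustration, and get_text() is the method form of .text that accepts the separator.

```python
from bs4 import BeautifulSoup

# First div: a chain of single child tags. Second div: a string plus a tag.
html = "<div><p><b>bold</b></p></div><div>one<i>two</i></div>"
soup = BeautifulSoup(html, "html.parser")
nested, mixed = soup.find_all("div")

# One child tag at each level, so .string recurses down to the innermost string.
nested_string = nested.string

# Two children (a string and a tag), so .string gives up and returns None.
mixed_string = mixed.string

# get_text() joins every descendant string with the chosen separator.
mixed_joined = mixed.get_text(separator=" | ")

print(nested_string, mixed_string, mixed_joined)
```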
If the HTML looks like this:
<td>some text</td>
<td></td>
<td><p>more text</p></td>
<td>even <p>more text</p></td>
.string on the four td elements returns:
some text
None
more text
None
.text on the same four td elements gives:
some text
(empty string)
more text
even more text
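The four cases above can be reproduced as follows. This is a minimal sketch, assuming Beautiful Soup 4 with html.parser; the cells are wrapped in a table row only to keep the markup valid.

```python
from bs4 import BeautifulSoup

html = (
    "<table><tr>"
    "<td>some text</td>"
    "<td></td>"
    "<td><p>more text</p></td>"
    "<td>even <p>more text</p></td>"
    "</tr></table>"
)
soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td")

# .string: a single string, None, a string found recursively, None.
strings = [td.string for td in cells]

# .text: always a string, empty for the empty cell.
texts = [td.text for td in cells]

for s, t in zip(strings, texts):
    print(repr(s), repr(t))
```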