Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
582 views
in Technique[技术] by (71.8m points)

regex php: find everything in div

I'm trying to find eveything inside a div using regexp. I'm aware that there probably is a smarter way to do this - but I've chosen regexp.

so currently my regexp pattern looks like this:

$gallery_pattern = '/<div class="gallery">([sS]*)</div>/';  

And it does the trick - somewhat.

The problem is if i have two divs after each other - like this.

<div class="gallery">text to extract here</div>
<div class="gallery">text to extract from here as well</div>

I want to extract the information from both divs, but my problem, when testing, is that im not getting the text in between as a result but instead:

"text to extract here </div>  
<div class="gallery">text to extract from here as well"

So to sum up. It skips the first end of the div. and continues on to the next. The text inside the div can contain <, / and linebreaks. just so you know!

Does anyone have a simple solution to this problem? Im still a regexp novice.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You shouldn't be using regex to parse HTML when there's a convenient DOM library:

$str = '
<div class="gallery">text to extract here</div>
<div class="gallery">text to extract from here as well</div>
';

$doc = new DOMDocument();
$doc->loadHTML($str);
$divs = $doc->getElementsByTagName('div');

if ( count($divs ) ) {
    foreach ( $divs as $div ) {
    echo $div->nodeValue . '<br>';
    }
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...