Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
785 views
in Technique[技术] by (71.8m points)

regex - Matching a multiple lines pattern via PHP's preg_match()

How can I match subject via a PHP preg_match() regular expression pattern in this HTML code:

      <table border=0>
  <tr>
  <td>


  <h2>subject</h2>



    </td>

All the whitespaces and newlines are left on purpose. So the problem is in extracting subject name using some multiple line pattern.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you're looking for (e.g.) a h2 tag nested within a td tag where there's only whitespace in between the two, just use s which includes spaces, newlines, etc. eg::

preg_match('#<td>s*<h2>(.*?)</h2>s*</td>#i',$str,$matches);
// result is in $matches[1]

See it in action here.

For your interest, here is a list of different modifiers you can pass in to preg_* functions. Flags that may interest you are:

  • s ("dotall") : this one makes . match every character, including newlines. So, say your <h2>.....</h2> was spread over multiple lines. Then you'd have to do

    preg_match('#<td>s*<h2>(.*?)</h2>s*</td>#is',$str,$matches);
    

    in order to have the .* go over multiple lines (see the extra s at the end of the regex?).

  • m ("multiline") : this one just lets ^ and $ match start/end of line instead of just the start/end of string. You only really need it if you're using ^ and $ in your pattern and want them to match the start/end of each individual line in your input.

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...