Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
790 views
in Technique[技术] by (71.8m points)

regex - PHP preg_match bible scripture format

I am struggling with building a regular expression for parsing this kind of strings (bible scriptures):

  'John 14:16–17, 25–26'
  'John 14:16–17'
  'John 14:16'
  'John 14'
  'John'

So the basic pattern is:

Book [[Chapter][:Verse]]

where chapter and verse is optional.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I think this does what you need:

w+s?(d{1,2})?(:d{1,2})?([-–]d{1,2})?(,sd{1,2}[-–]d{1,2})?

Assumptions:

  • The numbers are always in sets of either 1 or 2 digits
  • The dash will match either of the following - and

Below is the regex with comments:

"
w         # Match a single character that is a “word character” (letters, digits, and underscores)
   +          # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
s         # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   ?          # Between zero and one times, as many times as possible, giving back as needed (greedy)
(          # Match the regular expression below and capture its match into backreference number 1
   d         # Match a single digit 0..9
      {1,2}      # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?         # Between zero and one times, as many times as possible, giving back as needed (greedy)
(          # Match the regular expression below and capture its match into backreference number 2
   :          # Match the character “:” literally
   d         # Match a single digit 0..9
      {1,2}      # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?         # Between zero and one times, as many times as possible, giving back as needed (greedy)
(          # Match the regular expression below and capture its match into backreference number 3
   [-–]       # Match a single character present in the list “-–”
   d         # Match a single digit 0..9
      {1,2}      # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?         # Between zero and one times, as many times as possible, giving back as needed (greedy)
(          # Match the regular expression below and capture its match into backreference number 4
   ,          # Match the character “,” literally
   s         # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   d         # Match a single digit 0..9
      {1,2}      # Between one and 2 times, as many times as possible, giving back as needed (greedy)
   [-–]       # Match a single character present in the list “-–”
   d         # Match a single digit 0..9
      {1,2}      # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?         # Between zero and one times, as many times as possible, giving back as needed (greedy)
"

And here are some examples of its usage in php:

if (preg_match('/w+s?(d{1,2})?(:d{1,2})?([-–]d{1,2})?(,sd{1,2}[-–]d{1,2})?/', $subject)) {
    # Successful match
} else {
    # Match attempt failed
}

Get an array of all matches in a given string

preg_match_all('/w+s?(d{1,2})?(:d{1,2})?([-–]d{1,2})?(,sd{1,2}[-–]d{1,2})?/', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...