Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - Optional Whitespace Regex

I'm having a problem trying to ignore whitespace in-between certain characters. I've been Googling around for a few days and can't seem to find the right solution.

Here's my code:

// Get Image data
preg_match('#<a href="(.*?)" title="(.*?)"><img alt="(.*?)" src="(.*?)"[\s*]width="150"[\s*]height="(.*?)"></a>#', $data, $imagematch);
$image = $imagematch[4];

Basically these are some of the scenarios I have:

 <a href="/wiki/File:Sky1.png" title="File:Sky1.png"><img alt="Sky1.png" src="http://media-mcw.cursecdn.com/thumb/5/56/Sky1.png/150px-Sky1.png"width="150" height="84"></a>

(Notice the lack of a space between width="" and src="")

And

<a href="/wiki/File:TallGrass.gif" title="File:TallGrass.gif"><img alt="TallGrass.gif" src="http://media-mcw.cursecdn.com/3/34/TallGrass.gif" width="150"height="150"></a>

(Notice the lack of a space in between width="" and height="".)

Is there any way to make the whitespace between those attributes optional? I am not a regex expert.


1 Reply


Add \s? wherever a single optional whitespace character is allowed.

\s matches any whitespace character (space, tab, newline and so on).

? means the preceding element may occur once or not at all.

If any number of whitespace characters is allowed, including none, use \s*.

* means the preceding element may occur zero or more times.
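To make the difference between the two quantifiers concrete, here is a minimal sketch. The sample strings and variable names are made up for illustration and are not from the question.

```php
<?php
// Hypothetical sample strings: zero, one and three spaces between the attributes.
$samples = [
    'none'  => 'width="150"height="84"',
    'one'   => 'width="150" height="84"',
    'three' => 'width="150"   height="84"',
];

$optionalOne  = '#width="150"\s?height#'; // zero or one whitespace character
$optionalMany = '#width="150"\s*height#'; // zero or more whitespace characters

// For each sample, record whether each pattern matched (preg_match returns 1 or 0).
$results = [];
foreach ($samples as $name => $text) {
    $results[$name] = [
        preg_match($optionalOne, $text),
        preg_match($optionalMany, $text),
    ];
}

print_r($results);
```

Only the three-space sample should tell the two patterns apart: \s? gives up after one whitespace character, while \s* keeps consuming.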

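'#<a href\s?="(.*?)" title\s?="(.*?)"><img alt\s?="(.*?)" src\s?="(.*?)"\s*width\s?="150"\s*height\s?="(.*?)"></a>#'

This allows one optional whitespace character between each attribute name and the =. It also uses \s* in place of [\s*], for the reason explained at the end of this answer.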

If you also want to allow an optional space after the =, add another \s? right after it.
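As a rough illustration of optional whitespace on both sides of the =, the sketch below tests one pattern against four spacing variants. The fragments and the file name a.png are hypothetical.

```php
<?php
// Hypothetical tag fragments; only the spacing around "=" differs.
$fragments = [
    'src="a.png"',
    'src ="a.png"',
    'src= "a.png"',
    'src = "a.png"',
];

// \s? before and after "=" lets each side carry at most one whitespace character.
$pattern = '#src\s?=\s?"(.*?)"#';

// Collect the captured attribute value, or null when the pattern does not match.
$captured = [];
foreach ($fragments as $fragment) {
    $captured[] = preg_match($pattern, $fragment, $m) ? $m[1] : null;
}

print_r($captured);
```

If the markup may contain several spaces or tabs around the =, \s* would be the safer choice in both positions.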

Likewise, wherever a character is optional, follow it with ? if it may occur at most once, or with * if it may occur any number of times.

Your actual problem was [\s*]. Characters enclosed in [ and ] form a character class, so [\s*] matches exactly one character that is either whitespace or a literal *. A character class on its own matches any one of its members exactly once, so remove the * from inside it. If you append a quantifier (?, +, * and so on) after the ], the class is repeated according to that quantifier; for example, [\s]* (or simply \s*) matches zero or more whitespace characters.
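The sketch below runs the two scenarios from the question against the original [\s*] form and against a \s* form. The variable names $scenarios, $prefix, $broken and $fixed are introduced here for illustration only.

```php
<?php
// The two scenarios quoted in the question.
$scenarios = [
    '<a href="/wiki/File:Sky1.png" title="File:Sky1.png"><img alt="Sky1.png" src="http://media-mcw.cursecdn.com/thumb/5/56/Sky1.png/150px-Sky1.png"width="150" height="84"></a>',
    '<a href="/wiki/File:TallGrass.gif" title="File:TallGrass.gif"><img alt="TallGrass.gif" src="http://media-mcw.cursecdn.com/3/34/TallGrass.gif" width="150"height="150"></a>',
];

// Shared start of the pattern, up to the end of the src attribute.
$prefix = '#<a href="(.*?)" title="(.*?)"><img alt="(.*?)" src="(.*?)"';

// [\s*] is a character class: exactly one whitespace character or a literal "*".
$broken = $prefix . '[\s*]width="150"[\s*]height="(.*?)"></a>#';

// \s* is a quantified escape: zero or more whitespace characters.
$fixed = $prefix . '\s*width="150"\s*height="(.*?)"></a>#';

$brokenHits = [];
$images = [];
foreach ($scenarios as $data) {
    $brokenHits[] = preg_match($broken, $data);
    // Group 4 is the src value, as in the question's $imagematch[4].
    $images[] = preg_match($fixed, $data, $imagematch) ? $imagematch[4] : null;
}

print_r($brokenHits);
print_r($images);
```

The character-class version fails on both inputs because each one has a position where no character at all separates the attributes, whereas \s* accepts that case.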

