Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

php - Parsing CSS by regex

I'm building a CSS editor and am trying to write a regular expression that extracts data from a CSS document. The regex below works when a rule has one property, but I can't get it to capture all properties. I'm using the preg (Perl-compatible) functions in PHP.

Regex

(?<selector>[A-Za-z]+[\s]*)[\s]*{[\s]*((?<properties>[A-Za-z0-9-_]+)[\s]*:[\s]*(?<values>[A-Za-z0-9#, ]+);[\s]*)*[\s]*}

Test case

body { background: #f00; font: 12px Arial; }

Expected Outcome

Array(
    [0] => Array(
            [0] => body { background: #f00; font: 12px Arial; }
            [selector] => Array(
                [0] => body
            )
            [1] => Array(
                [0] => body
            )
            [2] => font: 12px Arial; 
            [properties] => Array(
                [0] => font
            )
            [3] => Array(
                [0] => font
            )
            [values] => Array(
                [0] => 12px Arial
                [1] => background: #f00
            )
            [4] => Array(
                [0] => 12px Arial
                [1] => background: #f00
            )
        )
)

Real Outcome

Array(
    [0] => Array
        (
            [0] => body { background: #f00; font: 12px Arial; }
            [selector] => body 
            [1] => body 
            [2] => font: 12px Arial; 
            [properties] => font
            [3] => font
            [values] => 12px Arial
            [4] => 12px Arial
        )
    )

Thanks in advance for any help - this has been confusing me all afternoon!


1 Reply


That just seems too convoluted for a single regular expression. Well, I'm sure that with the right extensions, an advanced user could create the right regex. But then you'd need an even more advanced user to debug it.

Instead, I'd suggest using a regex to pull out the pieces, and then tokenising each piece separately. e.g.,

/([^{]+?)\s*\{\s*([^}]*?)\s*\}/

Then you end up with the selector and the attributes in separate fields, which you can then split up. (Even the selector will be fun to parse.) Note that even this will cause trouble if a } can appear inside quotes or something. You could, again, convolute the heck out of it to avoid that, but it's probably even better to avoid regexes altogether here and handle it by parsing one field at a time, perhaps with a recursive-descent parser or yacc/bison or whatever.
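
To make the two-step idea concrete, here is a minimal sketch in PHP. It is only an illustration under stated assumptions, not a full CSS parser: the function name parseSimpleCss and the shape of the returned array are made up for this example, and it assumes there are no nested blocks (such as @media), no comments, and no }, ; or { inside quoted strings.

```php
<?php
// Hypothetical helper (not from the original answer): two-step parse of simple CSS.
function parseSimpleCss(string $css): array
{
    $rules = [];

    // Step 1: pull out each "selector { body }" block.
    // Assumption: no nested braces and no '}' inside quotes or comments.
    preg_match_all('/([^{}]+?)\s*\{\s*([^}]*?)\s*\}/', $css, $blocks, PREG_SET_ORDER);

    foreach ($blocks as $block) {
        $selector = trim($block[1]);
        $declarations = [];

        // Step 2: split the body into "property: value" declarations.
        // Assumption: no ';' inside a value (e.g. in a quoted string).
        foreach (explode(';', $block[2]) as $declaration) {
            $declaration = trim($declaration);
            if ($declaration === '') {
                continue; // empty piece after the final ';'
            }
            // A limit of 2 keeps any further ':' inside the value.
            $parts = explode(':', $declaration, 2);
            if (count($parts) !== 2) {
                continue; // not a "property: value" pair; ignored in this sketch
            }
            // Note: a repeated property overwrites the earlier one here.
            $declarations[trim($parts[0])] = trim($parts[1]);
        }

        $rules[] = ['selector' => $selector, 'declarations' => $declarations];
    }

    return $rules;
}

// Usage with the test case from the question.
print_r(parseSimpleCss('body { background: #f00; font: 12px Arial; }'));
```

Splitting the declarations in a loop sidesteps the problem in the question, where a repeated capture group only keeps the text from its last repetition. Anything beyond these simple cases is where a real tokenizer or parser becomes the safer choice.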

