Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Regular Expression, match characters outside curly braces { }

I have the following data:

int  time="1356280261"
char value="3000"

bankLine {
  char value="3000"
  char currency="EUR"
  int  time="1356280261"
} #bankLine

I am parsing this data recursively and only want to match the 2 variables outside the block separately.

I do have this regex to match the variable

/(?:char|int)\s*([A-z0-9]*)\s*=\s*"(.*)"/

Yet, the regex matches all occurrences inside the block, too.

How can I match only the first 2 variables individually and ignore everything inside the bankLine block?


1 Reply


It's a bit hackish, but you can try adding a negative lookahead, like this:

/(?:char|int)\s*([A-z0-9]*)\s*=\s*"(.*)"(?![^{]*})/
                                        ^^^^^^^^^^^

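This assumes that all braces are balanced. Nesting mostly shouldn't matter (whereas normally it would, in similar questions) since you're looking for the case outside braces, with one caveat: a variable inside an outer block that is followed by the open-brace of a nested block would still be matched, because the lookahead stops at the first open-brace it meets.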

The lookahead is based on this observation: if, scanning forward from the end of a match, you reach a close-brace before any open-brace, you can reasonably assume the match is inside braces.
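To make the observation concrete, here is a minimal Python sketch of the sub-pattern that the lookahead negates. The helper name looks_inside_braces and the shortened sample text are made up for illustration, and Python's re module is used only as an assumption, since the thread does not name a language.

```python
import re

# The sub-pattern the lookahead negates: a run of non-"{" characters, then "}".
CLOSES_FIRST = re.compile(r'[^{]*}')

def looks_inside_braces(text: str, pos: int) -> bool:
    """Return True if a "}" can be reached from pos without crossing a "{"."""
    return CLOSES_FIRST.match(text, pos) is not None

# Shortened, hypothetical version of the data from the question.
sample = 'int time="1"\nbankLine {\n  char value="2"\n} #bankLine\n'

outside_pos = sample.index('"1"') + 3  # just after the top-level assignment
inside_pos = sample.index('"2"') + 3   # just after the assignment in the block

print(looks_inside_braces(sample, outside_pos))  # False
print(looks_inside_braces(sample, inside_pos))   # True
```

The negative lookahead simply rejects any match for which this check would be true.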

One is tempted to extend this the other way to include a negative lookbehind, but unfortunately most implementations do not support variable-length lookbehinds.

EDIT:

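As discussed in the comments, two fixes are recommended: use A-Za-z instead of A-z (which also matches the punctuation characters that sit between Z and a in ASCII), and use [^"]* instead of .* so the value cannot run past its closing quote: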

/(?:char|int)\s*([A-Za-z0-9]*)\s*=\s*"([^"]*)"(?![^{]*})/
                    ^^^                ^^^^^
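
As a quick check, the corrected pattern can be run against the data from the question. This is only a sketch in Python (the thread does not say which language is in use, so re.findall and the names DATA and PATTERN are assumptions); the pattern itself is unchanged apart from dropping the / delimiters.

```python
import re

# Sample data copied from the question.
DATA = '''int  time="1356280261"
char value="3000"

bankLine {
  char value="3000"
  char currency="EUR"
  int  time="1356280261"
} #bankLine
'''

# Corrected pattern from the answer, without the surrounding / delimiters.
PATTERN = re.compile(
    r'(?:char|int)\s*([A-Za-z0-9]*)\s*=\s*"([^"]*)"(?![^{]*})'
)

# Each result is a (name, value) tuple for a variable outside the block.
top_level = PATTERN.findall(DATA)
print(top_level)  # [('time', '1356280261'), ('value', '3000')]
```

Only the two variables before the bankLine block are returned; the three inside it are rejected by the lookahead.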
