Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Exclude characters from a character class

Is there a simple way to match all characters in a class except a certain set of them? For example, in a language where I can use \w to match the set of all Unicode word characters, is there a way to exclude just one character, such as an underscore "_", from that match?

The only idea that came to mind was to use a negative lookahead/lookbehind around each character, but that seems more complex than necessary when I effectively just want to match a character against a positive match AND a negative match. For example, if & were an AND operator, I could do this...

^(\w&[^_])+$

1 Reply

It really depends on your regex flavor.

.NET

... provides only one simple character class set operation: subtraction. This is enough for your example, so you can simply use

[\w-[_]]

If a - is followed by a nested character class, it's subtracted. Simple as that...

Java

... provides a much richer set of character class set operations. In particular you can get the intersection of two sets like [[abc]&&[cde]] (which would give c in this case). Intersection and negation together give you subtraction:

[\w&&[^_]]

Perl

... supports set operations on extended character classes as an experimental feature (available since Perl 5.18). In particular, you can directly subtract arbitrary character classes:

(?[ \w - [_] ])

All other flavors

... (that support lookaheads) allow you to mimic the subtraction by using a negative lookahead:

(?!_)\w

This first checks that the next character is not a _ and then matches any \w (which can't be _ because of the negative lookahead).
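As a rough illustration of the lookahead approach, here is a minimal Python sketch (Python's re module has no class subtraction or intersection, so it falls under "all other flavors"). The names WORD_NO_UNDERSCORE and is_word_without_underscore are made up for this example, and the pattern is applied per character inside a repeated group, much like the ^(...)+$ idea in the question.

```python
import re

# Per character: the negative lookahead rejects "_", then \w consumes one word character.
# WORD_NO_UNDERSCORE is a hypothetical name chosen for this sketch.
WORD_NO_UNDERSCORE = re.compile(r"(?:(?!_)\w)+")


def is_word_without_underscore(text: str) -> bool:
    """Return True if text is non-empty and every character is a word character other than "_"."""
    return WORD_NO_UNDERSCORE.fullmatch(text) is not None


print(is_word_without_underscore("straße42"))    # True
print(is_word_without_underscore("snake_case"))  # False
```

With str patterns, Python 3 treats \w as Unicode-aware by default, so non-ASCII letters such as ß are accepted here as well.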

Note that each of these approaches is completely general: you can subtract one arbitrarily complex character class from another.
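To show that generality with the lookahead variant, the following Python sketch builds "base minus excluded" for any two classes. The helper subtract is hypothetical, not part of any library, and it assumes both arguments are patterns that match exactly one character.

```python
import re


def subtract(base: str, excluded: str) -> str:
    """Build a pattern for one character that matches `base` but not `excluded`.

    Hypothetical helper: both arguments are assumed to be single-character
    patterns such as a bracketed class or a shorthand like \\w.
    """
    return f"(?!{excluded}){base}"


# Lowercase ASCII letters minus vowels, i.e. consonants.
consonant = re.compile(subtract("[a-z]", "[aeiou]"))
print(consonant.findall("regex flavor"))  # ['r', 'g', 'x', 'f', 'l', 'v', 'r']

# Word characters minus digits and the underscore, i.e. letters only.
letter = re.compile(subtract(r"\w", r"[\d_]"))
print(letter.findall("a1_b"))  # ['a', 'b']
```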

