Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

java - Given a string, generate a regex that can parse *similar* strings

For example, given the string "2009/11/12" I want to get the regex ("\d{4}/\d{2}/\d{2}"), so I'll be able to match "2001/01/02" too.

Is there something that does that? Something similar? Any ideas as to how to do it?

1 Reply

There is text2re, a free web-based "regex by example" generator.

I don't think it is available as source code, though. I dare say there is no automatic regex generator that gets it right without user intervention, since that would require the machine to know what you want.
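
To make that limitation concrete, here is a minimal sketch of the most naive "regex by example" approach: collapse each run of digits or letters in the sample into a counted character class and copy everything else literally. This is not how text2re works; `RegexByExample`, `generalize`, and the choice of character classes are assumptions made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of text2re or any library.
final class RegexByExample {

    // Characters that get a backslash when copied literally into the output.
    private static final String META = "\\.[]{}()*+-?^$|";

    // 'd' = ASCII digit, 'a' = ASCII letter, 'x' = anything else (kept as a literal).
    private static char kind(char c) {
        if (c >= '0' && c <= '9') return 'd';
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) return 'a';
        return 'x';
    }

    // Turn each run of digits or letters into a counted class; copy the rest verbatim.
    static String generalize(String sample) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < sample.length()) {
            char c = sample.charAt(i);
            char k = kind(c);
            if (k == 'x') {
                if (META.indexOf(c) >= 0) regex.append('\\');
                regex.append(c);
                i++;
                continue;
            }
            int start = i;
            while (i < sample.length() && kind(sample.charAt(i)) == k) i++;
            regex.append(k == 'd' ? "\\d" : "[A-Za-z]");
            int run = i - start;
            if (run > 1) regex.append('{').append(run).append('}');
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String regex = generalize("2009/11/12");
        System.out.println(regex);                                // \d{4}/\d{2}/\d{2}
        System.out.println(Pattern.matches(regex, "2001/01/02")); // true
        System.out.println(Pattern.matches(regex, "9999/99/99")); // true
    }
}
```

The last line shows the problem: the sketch has no idea the sample is a date, so it happily accepts "9999/99/99". Deciding that the middle field is a month between 01 and 12 is exactly the kind of intent a human (or a template library like the one behind text2re) has to supply.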


Note that text2re uses a template-based, modularized and very generalized approach to regular expression generation. The expressions it generates work, but they are much more complex than the equivalent hand-crafted expression. It is not a good tool to learn regular expressions because it does a pretty lousy job at setting examples.

For instance, the string "2009/11/12" would be recognized as a yyyymmdd pattern, which is helpful. The tool transforms it into this 125-character monster:

((?:(?:[1]{1}\d{1}\d{1}\d{1})|(?:[2]{1}\d{3}))[-:\/.](?:[0]?[1-9]|[1][012])[-:\/.](?:(?:[0-2]?\d{1})|(?:[3][01]{1})))(?![\d])

The hand-made equivalent takes up roughly two fifths of that (about 50 characters):

([12]\d{3})[-:/.](0?\d|1[0-2])[-:/.]([0-2]?\d|3[01])
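
If you want to check the two expressions against each other in Java, a small harness along these lines should do. It assumes the backslashes lost from the expressions above are restored (`\d` for digits, `\/` inside the generated separator class); the class and method names are made up for this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical comparison harness; the names are illustrative only.
final class DateRegexComparison {

    // The generated expression, with backslashes doubled for a Java string literal.
    static final Pattern GENERATED = Pattern.compile(
        "((?:(?:[1]{1}\\d{1}\\d{1}\\d{1})|(?:[2]{1}\\d{3}))[-:\\/.]"
        + "(?:[0]?[1-9]|[1][012])[-:\\/.]"
        + "(?:(?:[0-2]?\\d{1})|(?:[3][01]{1})))(?![\\d])");

    // The shorter hand-made expression.
    static final Pattern HAND_MADE = Pattern.compile(
        "([12]\\d{3})[-:/.](0?\\d|1[0-2])[-:/.]([0-2]?\\d|3[01])");

    // Both helpers require the whole input to match, not just a substring.
    static boolean generatedMatches(String s) {
        return GENERATED.matcher(s).matches();
    }

    static boolean handMadeMatches(String s) {
        return HAND_MADE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"2009/11/12", "2001-01-02", "2009/13/12", "2009/00/12"};
        for (String s : samples) {
            System.out.printf("%-12s generated=%b handMade=%b%n",
                s, generatedMatches(s), handMadeMatches(s));
        }
    }
}
```

Both accept "2009/11/12" and "2001-01-02" and both reject a month of 13. They are not strictly identical, though: the shorter one also lets a month of "00" through, which the generated one rejects, so tighten the month group if that matters to you.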
