Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


bash - Extracting IP addresses as whole words with POSIX BRE/ERE regex

I am trying to match IP addresses found in the output of traceroute by means of a regex. I'm not trying to validate them, because it's safe enough to assume traceroute's output is valid (i.e. it is not printing something like 999.999.999.999). I'm trying the following regex:

([0-9]{1,3}.?){4}

I'm testing it in regex101, and it does match an IP address. However, when I try

echo '192.168.1.1 foobar' | grep '([0-9]{1,3}.?){4}' 

I get nothing. What am I missing?


1 Reply


You used a POSIX ERE pattern but did not pass the -E option, which makes grep use the POSIX ERE flavor. Without it, grep uses POSIX BRE, where the {n,m} quantifier and the (...) group must be escaped as \{n,m\} and \(...\) to be parsed as special regex operators rather than literal characters.

Note that you also need to escape the . (as \.) so that it matches only a literal dot.
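
The flavor difference can be checked directly. The following is a minimal sketch, assuming a grep that treats unescaped ( ) { } ? as literals in BRE (GNU grep does); the variable names are only illustrative. It runs the question's pattern in both flavors and counts matching lines with -c.

```shell
#!/bin/sh
# The pattern from the question, unchanged.
pattern='([0-9]{1,3}.?){4}'

# BRE (no -E): parentheses, braces and ? are literal characters here,
# so a line containing an IP address does not match ...
bre_ip=$(echo '192.168.1.1 foobar' | grep -c "$pattern")

# ... but text that literally looks like the pattern does.
bre_literal=$(echo '(7{1,3}.?){4}' | grep -c "$pattern")

# ERE (-E): the same characters are now operators, so the IP line matches.
ere_ip=$(echo '192.168.1.1 foobar' | grep -cE "$pattern")

echo "BRE on IP: $bre_ip, BRE on literal text: $bre_literal, ERE on IP: $ere_ip"
```

The counts show that in BRE the pattern only matches text that spells out the parentheses and braces literally, which is why the original command printed nothing.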

To make your pattern work with grep the way you wanted you could use:

grep -E '([0-9]{1,3}\.?){4}'      # POSIX ERE
grep '\([0-9]\{1,3\}\.\?\)\{4\}'  # BRE version of the same regex (\? is a GNU extension)

See an online demo.

However, this regex will also match a plain run of digits such as 1234, because the . is optional.
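
A small sketch of that false positive, using made-up input lines and the -o option so that only the matched text is shown:

```shell
#!/bin/sh
# Because \.? is optional, four digit groups can sit side by side with no dots.
loose='([0-9]{1,3}\.?){4}'

# A plain number: each of the four repetitions consumes one digit.
number_match=$(echo 'port 1234 open' | grep -oE "$loose")

# A real address still matches, which is why the problem is easy to miss.
ip_match=$(echo '192.168.1.1 foobar' | grep -oE "$loose")

echo "number: $number_match"
echo "ip: $ip_match"
```

Both lines produce a match, so the pattern cannot tell a dotted address from any number with four or more digits.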

You may solve it by unrolling the pattern as

grep -E '[0-9]{1,3}(\.[0-9]{1,3}){3}'      # POSIX ERE
grep '[0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}' # POSIX BRE

See another demo.

Basically, it matches:

  • [0-9]{1,3} - 1 to 3 occurrences of any ASCII digit
  • (\.[0-9]{1,3}){3} - 3 occurrences of:
    • \. - a literal .
    • [0-9]{1,3} - 1 to 3 occurrences of any ASCII digit
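
The breakdown above can be tried against a few inputs. This is only a sketch: has_dotted_quad is a hypothetical helper name and the test strings are invented.

```shell
#!/bin/sh
# has_dotted_quad is a hypothetical helper name, not part of grep.
# It succeeds when the argument contains four dot-separated digit groups.
has_dotted_quad() {
    printf '%s\n' "$1" | grep -qE '[0-9]{1,3}(\.[0-9]{1,3}){3}'
}

for candidate in '192.168.1.1 foobar' 'port 1234 open' '10.0.0' '999.999.999.999'; do
    if has_dotted_quad "$candidate"; then
        echo "match:    $candidate"
    else
        echo "no match: $candidate"
    fi
done
```

The plain number and the three-group string are rejected now that the dots are mandatory. 999.999.999.999 is still accepted, because this pattern checks only the shape of the address, not the range of each octet.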

To make sure you only match valid IPs, you might want to use a more precise IP matching regex:

grep -E '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}' # POSIX ERE

See this online demo.
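
As a rough check of the range-restricted pattern, the sketch below builds it from a single octet sub-pattern and tests whole strings with -x (match the entire line). The names octet, ipv4 and is_ipv4 are illustrative, not part of the original answer.

```shell
#!/bin/sh
# One octet in the range 0-255, then the full dotted quad built from it.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
ipv4="${octet}(\.${octet}){3}"

# is_ipv4 is a hypothetical helper: -x requires the whole line to match.
is_ipv4() {
    printf '%s\n' "$1" | grep -qxE "$ipv4"
}

for candidate in 192.168.1.1 255.255.255.255 256.1.1.1 999.999.999.999; do
    if is_ipv4 "$candidate"; then
        echo "valid:   $candidate"
    else
        echo "invalid: $candidate"
    fi
done

# Without -x (or word boundaries) the pattern can match inside a longer number.
partial=$(echo '256.1.1.1' | grep -oE "$ipv4")
echo "partial match in 256.1.1.1: $partial"
```

The last command finds 56.1.1.1 inside 256.1.1.1, so an unanchored search with this pattern still needs some form of boundary to avoid matching part of an invalid address.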

You may further tweak it with word boundaries (\< / \> or \b, which GNU grep supports as extensions), etc.

To extract only the IPs, use the -o option with grep: grep -oE 'ERE_pattern' file / grep -o 'BRE_pattern' file.
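
Putting extraction and word boundaries together, here is a sketch that assumes GNU grep for \< and \>. The sample text only imitates the shape of traceroute output; the hosts and timings are invented.

```shell
#!/bin/sh
# Sample text shaped like traceroute output; the hosts and timings are invented.
sample=' 1  192.168.1.1 (192.168.1.1)  1.123 ms  0.987 ms
 2  10.0.0.1 (10.0.0.1)  8.345 ms  8.120 ms
 3  * * *'

# \< and \> are word-boundary extensions (GNU grep assumed); -o prints only
# the matched text, one match per line, and sort -u drops the duplicates.
ips=$(printf '%s\n' "$sample" |
    grep -oE '\<[0-9]{1,3}(\.[0-9]{1,3}){3}\>' |
    LC_ALL=C sort -u)

printf '%s\n' "$ips"
```

The timing values such as 1.123 are skipped because they have only two digit groups, and each hop's address is listed once even though it appears twice per line.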

