Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

regex - IPV6 address into compressed form in Java

I have used Inet6Address.getByName("2001:db8:0:0:0:0:2:1").toString() method to compress IPv6 address, and the output is 2001:db8:0:0:0:0:2:1 ,but i need 2001:db8::2:1 . , Basically the compression output should based on RFC 5952 standard , that is

  1. Shorten as Much as Possible : For example, 2001:db8:0:0:0:0:2:1 must be shortened to
    2001:db8::2:1.Likewise, 2001:db8::0:1 is not acceptable, because the symbol "::" could have been used to produce a shorter representation 2001:db8::1.

  2. Handling One 16-Bit 0 Field : The symbol "::" MUST NOT be used to shorten just one 16-bit 0 field. For example, the representation 2001:db8:0:1:1:1:1:1 is correct, but 2001:db8::1:1:1:1:1 is not correct.

  3. Choice in Placement of "::" : = When there is an alternative choice in the placement of a "::", the longest run of consecutive 16-bit 0 fields MUST be shortened (i.e., the sequence with three consecutive zero fields is shortened in 2001: 0:0:1:0:0:0:1). When the length of the consecutive 16-bit 0 fields are equal (i.e., 2001:db8:0:0:1:0:0:1), the first sequence of zero bits MUST be shortened. For example, 2001:db8::1:0:0:1 is correct representation.

I have also checked another post in Stack overflow, but there was no condition specified (example choice in placement of ::).

Is there any java library to handle this? Could anyone please help me?

Thanks in advance.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

How about this?

String resultString = subjectString.replaceAll("((?::0\b){2,}):?(?!\S*\b\1:0\b)(\S*)", "::$2").replaceFirst("^0::","::");

Explanation without Java double-backslash hell:

(       # Match and capture in backreference 1:
 (?:    #  Match this group:
  :0    #  :0
      #  word boundary
 ){2,}  # twice or more
)       # End of capturing group 1
:?      # Match a : if present (not at the end of the address)
(?!     # Now assert that we can't match the following here:
 S*    #  Any non-space character sequence
      #  word boundary
 1     #  the previous match
 :0     #  followed by another :0
      #  word boundary
)       # End of lookahead. This ensures that there is not a longer
        # sequence of ":0"s in this address.
(S*)   # Capture the rest of the address in backreference 2.
        # This is necessary to jump over any sequences of ":0"s
        # that are of the same length as the first one.

Input:

2001:db8:0:0:0:0:2:1
2001:db8:0:1:1:1:1:1
2001:0:0:1:0:0:0:1
2001:db8:0:0:1:0:0:1
2001:db8:0:0:1:0:0:0

Output:

2001:db8::2:1
2001:db8:0:1:1:1:1:1
2001:0:0:1::1
2001:db8::1:0:0:1
2001:db8:0:0:1::

(I hope the last example is correct - or is there another rule if the address ends in 0?)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

56.8k users

...