Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
624 views
in Technique[技术] by (71.8m points)

regex - Can't use ^ to say "all but"

I have a text from which I want to extract only the hexadecimal codes. For example: "thisissometextthisistext\x64\x6f\x6e\x74\x74\x72\x61\x6e\x73\x6c\x61\x74\x65somemoretextoverhere"

It's possible to match the hex codes with \\x.. but it doesn't seem possible to use something like (^\\x..) to select everything except the hex codes.

Any workarounds?


1 Reply

0 votes
by (71.8m points)

You may use the regex (?s)((?:\\x[a-fA-F0-9]{2})+)|. which matches and captures into Group 1 any run of one or more hex codes, or otherwise matches any other single character, including a line break character. Replace with the conditional replacement pattern (?{1}$1\n:), which reinserts the hex run followed by a newline when Group 1 matched, and otherwise replaces the match with an empty string:

Find What:    (?s)((?:\\x[a-fA-F0-9]{2})+)|.
Replace With: (?{1}$1\n:)

Regex Details:

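  • (?s) - equivalent to turning the ". matches newline" option ON
  • ((?:\\x[a-fA-F0-9]{2})+) - Group 1, capturing one or more consecutive sequences of
    • \\x - a literal \x (the backslash is escaped)
    • [a-fA-F0-9]{2} - two characters, each a letter from a to f (either case) or a digit
  • | - or
  • . - any single character.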
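
To see how the two alternatives divide up the input, here is a minimal Python sketch. It assumes the hex codes appear as a literal backslash, an x, and two hex digits; the sample string and the classify helper are hypothetical and only illustrate which branch matches at each position.

```python
import re

# Same pattern as above: a run of \xHH codes goes to Group 1,
# anything else is consumed one character at a time by the "." branch.
PATTERN = re.compile(r"(?s)((?:\\x[a-fA-F0-9]{2})+)|.")


def classify(text):
    """Return (matched_text, is_hex_run) for each successive match."""
    return [(m.group(0), m.group(1) is not None) for m in PATTERN.finditer(text)]


sample = r"ab\x64\x6fcd\x6e"  # hypothetical input, raw string keeps the backslashes
for piece, is_hex in classify(sample):
    print(repr(piece), "hex run" if is_hex else "other char")
```

Because the hex branch is listed first, it wins wherever a hex code starts; every other character falls through to the single-character branch, which is what makes the "everything but" selection possible without a negation operator.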

Replacement pattern:

  • (?{1} - if Group 1 matched:
    • $1\n - replace with the Group 1 contents plus a newline
    • : - else replace with an empty string
  • ) - end of the replacement pattern.
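
The conditional replacement syntax above is specific to the editor's regex engine. As a rough equivalent, assuming you want the same result in Python (whose re module has no conditional replacement pattern), a replacement function can make the same decision. The keep_hex_runs name and the sample text are hypothetical.

```python
import re

PATTERN = re.compile(r"(?s)((?:\\x[a-fA-F0-9]{2})+)|.")


def keep_hex_runs(text):
    """Keep each run of hex codes followed by a newline; drop everything else."""

    def replace(match):
        # Mirrors the conditional replacement: if Group 1 took part in the
        # match, keep its contents plus a newline, else emit nothing.
        if match.group(1) is not None:
            return match.group(1) + "\n"
        return ""

    return PATTERN.sub(replace, text)


sample = r"thisissometext\x64\x6f\x6emoretext\x74\x72overhere"
print(keep_hex_runs(sample))
```

Each run of adjacent hex codes ends up on its own line, and text containing no hex codes at all is reduced to an empty string.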
