Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
600 views
in Technique[技术] by (71.8m points)

html - Regular expression to find URLs not inside a hyperlink

There's many regex's out there to match a URL. However, I'm trying to match URLs that do not appear anywhere within a <a> hyperlink tag (HREF, inner value, etc.). So NONE of the URLs in these should match:

<a href="http://www.example.com/">something</a>
<a href="http://www.example.com/">http://www.example2.com</a>
<a href="http://www.example.com/"><b>something</b>http://www.example.com/<span>test</span></a>

Any URL outside of <a></a> should be matched.

One approach I tried was to use a negative lookahead to see if the first <a> tag after the URL was an opening <a> or a closing </a>. If it is a closing </a> then the URL must be inside a hyperlink. I think this idea was okay, but the negative lookahead regex didn't work (or more accurately, the regex wasn't written correctly). Any tips are very appreciated.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I was looking for this answer as well and because nothing out there really worked like I wanted it too this is the regex that I created. Obviously since its a regex be aware that this is not a perfect solution.

/(?!<a[^>]*>[^<])(((([A-Za-z]{3,9}:(?://)?)(?:[-;:&=+$,w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=+$,w]+@)[A-Za-z0-9.-]+)((?:/[+~%/.w-_]*)???(?:[-+=&;%@.w_]*)#?(?:[w]*))?))(?![^<]*</a>)/gi

And the whole function to update html is:

function linkifyWithRegex(input) {
  let html = input;
  let regx = /(?!<a[^>]*>[^<])(((([A-Za-z]{3,9}:(?://)?)(?:[-;:&=+$,w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=+$,w]+@)[A-Za-z0-9.-]+)((?:/[+~%/.w-_]*)???(?:[-+=&;%@.w_]*)#?(?:[w]*))?))(?![^<]*</a>)/gi;
  html = html.replace(
    regx,
    function (match) {
      return '<a href="' + match + '">' + match + "</a>";
    }
  );
  return html;
}


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...