Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - regex for URL including query string

I thought this would be a simple Google search, but apparently not. What regex can I use in C# to extract a URL, including any query string, from a larger body of text? I have spent a lot of time searching and found plenty of examples that don't include the query string. I can't use System.Uri, because that assumes you already have the URL; I need to find it in the surrounding text.


1 Reply


This should get just about anything (feel free to add additional protocols):

@"(https?|ftp|file)://[A-Za-z0-9.-]+(/[A-Za-z0-9?&=;+!'()*._~%-]*)*"

The real difficulty is finding where the URL ends. As written, this pattern stops at the first character it does not allow. Within the domain name, that is anything other than letters, digits, a hyphen, or a period. After the domain name, it is anything other than those characters plus forward slash (/), question mark (?), ampersand (&), equals sign (=), semicolon (;), plus sign (+), exclamation point (!), apostrophe/single quote ('), open/close parentheses, asterisk (*), underscore (_), tilde (~), or percent sign (%).
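As a rough illustration of how the pattern can be applied to surrounding text, here is a minimal sketch using Regex.Matches. The helper name FindUrls and the sample sentence are made up for this example, and the hyphen in the path character class is placed last so that it is a literal rather than part of a range.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The answer's pattern; the hyphen sits at the end of the path class
// so it is treated as a literal character.
var urlRegex = new Regex(
    @"(https?|ftp|file)://[A-Za-z0-9.-]+(/[A-Za-z0-9?&=;+!'()*._~%-]*)*");

// Hypothetical helper: returns the text of every match found in the input.
string[] FindUrls(string text) =>
    urlRegex.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();

string sample = "Docs are at http://example.com/search?q=regex&page=2 for now";
foreach (string url in FindUrls(sample))
    Console.WriteLine(url);
```

In this sample the match ends at the space after "page=2", because a space is not in the allowed path characters.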

Note that this would allow invalid URLs like

http://../
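One possible way to reject hosts like the one above is to require every dot-separated label to start and end with a letter or digit. This is only a sketch under that assumption, not a full hostname validator; the names label and strictUrlRegex are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// One host label: starts and ends with a letter or digit, hyphens only inside.
const string label = @"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?";

// Same scheme and path parts as the answer's pattern, with a stricter host:
// one label followed by zero or more ".label" groups.
var strictUrlRegex = new Regex(
    @"(https?|ftp|file)://" + label + @"(?:\." + label + @")*" +
    @"(/[A-Za-z0-9?&=;+!'()*._~%-]*)*");

Console.WriteLine(strictUrlRegex.IsMatch("http://../"));
Console.WriteLine(strictUrlRegex.Match("Try http://www.google.com.").Value);
```

Because a period must now be followed by another label, a sentence-ending period directly after the host is left out of the match. A period that follows a path is still captured.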

And it would pick up stuff after a URL, such as in this string:

Maybe you should try http://www.google.com.

Here, "http://www.google.com." (with the trailing period) would be matched.
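A common workaround for this is to post-process each match and strip punctuation that is more likely to belong to the sentence than to the URL. The sketch below assumes that a trailing period, semicolon, exclamation point, or question mark is never part of the URL, which will be wrong for some inputs; FindUrlsTrimmed is a hypothetical helper name.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var urlRegex = new Regex(
    @"(https?|ftp|file)://[A-Za-z0-9.-]+(/[A-Za-z0-9?&=;+!'()*._~%-]*)*");

// Characters assumed to be sentence punctuation when they end a match.
char[] trailingPunctuation = { '.', ';', '!', '?' };

// Hypothetical helper: finds matches, then trims trailing punctuation.
string[] FindUrlsTrimmed(string text) =>
    urlRegex.Matches(text)
            .Cast<Match>()
            .Select(m => m.Value.TrimEnd(trailingPunctuation))
            .ToArray();

foreach (string url in FindUrlsTrimmed("Maybe you should try http://www.google.com."))
    Console.WriteLine(url);
```

Trimming after the match keeps the regex itself simple, at the cost of occasionally cutting a character that really was part of the URL.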

The pattern also misses URLs that don't begin with a protocol specification (specifically, one of the protocols within the first set of parentheses). For instance, it would miss the URL in this string:

Maybe you should try www.google.com.
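If scheme-less URLs matter, one option is to accept a leading "www." as an alternative to the protocol. This is a sketch under the assumption that such URLs always start with "www."; it will still miss bare hosts such as google.com, and the names looseUrlRegex and FindLooseUrls are invented here.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Accept either a known scheme followed by "://" or a literal "www." prefix.
// The \b keeps the match from starting in the middle of a word.
var looseUrlRegex = new Regex(
    @"\b(?:(?:https?|ftp|file)://|www\.)[A-Za-z0-9.-]+(/[A-Za-z0-9?&=;+!'()*._~%-]*)*");

// Hypothetical helper: returns the text of every match found in the input.
string[] FindLooseUrls(string text) =>
    looseUrlRegex.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();

foreach (string url in FindLooseUrls("Maybe you should try www.google.com."))
    Console.WriteLine(url);
```

The trailing-period problem described earlier still applies: for the sample sentence, the match includes the final period.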

It's very difficult to get every case without some better-defined boundaries.

