Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

.net - Best HashTag Regex

I'm trying to find all the hashtags in a string. The hashtags come from a stream like Twitter, and they can appear anywhere in the text, for example:

this is a #awesome event, lets use the tag #fun

I'm using the .NET Framework (C#), and I was thinking this would be a suitable regex pattern to use:

#\w+

Is this the best regex for this purpose?


1 Reply


If you are pulling statuses containing hashtags from Twitter, you no longer need to find them yourself. You can now specify the include_entities parameter to have Twitter automatically call out mentions, links, and hashtags.

For example, take the following call to statuses/show:

http://api.twitter.com/1/statuses/show/60183527282577408.json?include_entities=true

In the resultant JSON, notice the entities object.

"entities":{"urls":[{"expanded_url":null,"indices":[68,88],"url":"http://bit.ly/gWZmaJ"}],"user_mentions":[],"hashtags":[{"text":"wordpress","indices":[89,99]}]}

You can use the above to locate the specific entities in the tweet (which occur between the string positions denoted by the indices property) and transform them appropriately.
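As a rough illustration of reading those entities in C#, here is a minimal sketch using System.Text.Json (this assumes .NET Core 3.0 or later; the class name is made up for the example). The JSON fragment above is wrapped in an extra pair of braces so that it parses as a complete document, and the second index is treated as exclusive, which is consistent with [89,99] covering the ten characters of "#wordpress".

```csharp
using System;
using System.Text.Json;

class EntitiesDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // The "entities" fragment from the answer, wrapped in { } to make it valid JSON.
        string json = "{\"entities\":{\"urls\":[{\"expanded_url\":null,\"indices\":[68,88],"
                    + "\"url\":\"http://bit.ly/gWZmaJ\"}],\"user_mentions\":[],"
                    + "\"hashtags\":[{\"text\":\"wordpress\",\"indices\":[89,99]}]}}";

        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement hashtags = doc.RootElement
                                  .GetProperty("entities")
                                  .GetProperty("hashtags");

        foreach (JsonElement tag in hashtags.EnumerateArray())
        {
            string text = tag.GetProperty("text").GetString();
            JsonElement indices = tag.GetProperty("indices");
            int start = indices[0].GetInt32(); // position of the '#' in the tweet text
            int end = indices[1].GetInt32();   // position just past the hashtag

            Console.WriteLine($"#{text} spans [{start}, {end})");
        }
    }
}
```

With the tweet text in hand, `tweetText.Substring(start, end - start)` would give the hashtag as it appears in the status, ready to be replaced with a link or other markup.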

If you just need the regular expression to locate the hashtags, Twitter provides these in an open source library.

Hashtag Match Pattern

(^|[^&\p{L}\p{M}\p{Nd}_\u200c\u200d\ua67e\u05be\u05f3\u05f4\u309b\u309c\u30a0\u30fb\u3003\u0f0b\u0f0c\u00b7])(#|\uFF03)(?!\uFE0F|\u20E3)([\p{L}\p{M}\p{Nd}_\u200c\u200d\ua67e\u05be\u05f3\u05f4\u309b\u309c\u30a0\u30fb\u3003\u0f0b\u0f0c\u00b7]*[\p{L}\p{M}][\p{L}\p{M}\p{Nd}_\u200c\u200d\ua67e\u05be\u05f3\u05f4\u309b\u309c\u30a0\u30fb\u3003\u0f0b\u0f0c\u00b7]*)

The above pattern can be pieced together from this Java file (retrieved 2015-11-23). Validation tests for this pattern are located in this file around line 128.
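For the .NET case in the question, a minimal sketch of applying this pattern with System.Text.RegularExpressions might look like the following. The class and constant names are made up for the example, and the repeated character class is factored into a constant purely for readability; the assembled pattern is intended to be the same as the one above. Capture group 3 holds the hashtag text without the leading #.

```csharp
using System;
using System.Text.RegularExpressions;

class HashtagDemo // hypothetical name, for illustration only
{
    // Characters allowed inside a hashtag, shared by three character classes in the pattern.
    const string TagChars =
        @"\p{L}\p{M}\p{Nd}_\u200c\u200d\ua67e\u05be\u05f3\u05f4\u309b\u309c\u30a0\u30fb\u3003\u0f0b\u0f0c\u00b7";

    static readonly Regex Hashtag = new Regex(
        @"(^|[^&" + TagChars + @"])" +     // group 1: start of input or a non-tag character
        @"(#|\uFF03)" +                    // group 2: ASCII or full-width hash sign
        @"(?!\uFE0F|\u20E3)" +             // not a keycap emoji sequence
        @"([" + TagChars + @"]*[\p{L}\p{M}][" + TagChars + @"]*)"); // group 3: the tag text

    static void Main()
    {
        string input = "this is a #awesome event, lets use the tag #fun";

        foreach (Match m in Hashtag.Matches(input))
        {
            Console.WriteLine(m.Groups[3].Value);
        }
        // prints:
        // awesome
        // fun
    }
}
```

Unlike `#\w+`, this pattern requires at least one letter or combining mark in the tag, so a purely numeric token such as `#123` is not matched, and a `#` that directly follows `&` or another tag character is skipped as well.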

