Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

How to remove/replace special characters from a 'dynamic' regex/string in Ruby?

This code had been working for a few months. Say I have a table called Categories with a string column called name. I receive a string and want to know whether any category was mentioned (a mention occurs when the string contains the substring @name_of_a_category). My approach was something like this:

categories.select { |category_i| content_received.downcase.match(/@#{category_i.downcase}/)}

That worked well until today, when it suddenly started raising an "unmatched close parenthesis" exception. I realized that category names can contain special characters, so I decided to stop considering special characters and spaces altogether (I don't want to add restrictions for the user, and I don't want to handle those cases either, so the policy is simply to ignore them).

So the question is: is there a clean way to remove these special characters (keeping the @) and match the string? I don't want to modify the data, only ignore those characters while looking for mentions.
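For illustration, the exception described above can be reproduced with a minimal sketch. The category names and the message text here are made up, and plain strings stand in for the Categories records; the interpolated name is compiled as regex source, so an unbalanced ")" makes the pattern invalid.

```ruby
# Hypothetical sample data, standing in for Categories#name values.
category_names = ["Sports", "Music (live)", "News)"]
content_received = "Anything new in @sports today?"

# Same shape as the original approach: each name is interpolated
# into the regex unescaped, so its characters act as regex syntax.
def mentioned_naive(category_names, content)
  category_names.select { |name| content.downcase.match(/@#{name.downcase}/) }
end

error_message = nil
begin
  mentioned_naive(category_names, content_received)
rescue RegexpError => e
  # Raised when the loop reaches "News)": the pattern /@news)/ is invalid.
  error_message = e.message
end

puts error_message
```

Note that "Music (live)" does not raise: its balanced parentheses are silently treated as a capture group, so that pattern would match "@music live" rather than the literal name.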

question from:https://stackoverflow.com/questions/65839892/how-to-remove-replace-specials-characters-from-a-dynamic-regex-string-on-ruby


1 Reply


You can also use

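# Strip everything that is not a word character or whitespace, plus underscores
prep_content_received = content_received.gsub(/[^\w\s]|_/, '')
p categories.select { |c|
  # Clean each category name the same way, then match it as a whole word, ignoring case
  prep_content_received.match?(/\b#{c.gsub(/[^\w\s]|_/, '').strip}\b/i)
}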

See the Ruby demo

Details:

  • prep_content_received = content_received.gsub(/[^\w\s]|_/, '') creates a copy of content_received with all special chars and _ removed. Doing this once, outside the loop, reduces overhead when there are many categories.
  • Then you iterate over the categories list and, for each one, check whether prep_content_received matches \b (word boundary) + the category with all special chars, _ and leading/trailing whitespace stripped from it + \b, case-insensitively (see the /i flag; no need for .downcase).
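The two steps above can be put together as a small self-contained sketch. The sample names and message are hypothetical, plain strings stand in for the category records, and the helper names (NON_WORD, normalize, mentioned_categories) and the empty-name guard are additions for illustration, not part of the original snippet.

```ruby
# Anything that is not a word character or whitespace, plus the underscore.
NON_WORD = /[^\w\s]|_/

# Remove special characters and surrounding whitespace from a name.
def normalize(text)
  text.gsub(NON_WORD, '').strip
end

def mentioned_categories(category_names, content)
  # Clean the incoming text once, outside the loop.
  prepared = content.gsub(NON_WORD, '')
  category_names.select do |name|
    key = normalize(name)
    # Assumption: a name made only of special characters never matches.
    next false if key.empty?
    # key holds only word characters and whitespace, so it is safe to interpolate.
    prepared.match?(/\b#{key}\b/i)
  end
end

# Hypothetical sample data.
category_names = ["Sports", "Music (live)", "C++)", "Rock_n_Roll"]
content_received = "Loved the @Music (live) show, and @sports too!"

result = mentioned_categories(category_names, content_received)
p result
```

One consequence worth keeping in mind: because @ is itself a special character, it is stripped from the text too, so with this approach a bare "sports" counts as a mention just like "@sports".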
