Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Regex pattern for URLs with no trailing slash, excluding URLs that contain certain text

I'm looking for a preg_match_all pattern to find all URLs on a page that don't have a trailing slash.

For example, if I have:

<a href="/testing/abc/">end with slash</a>

<a href="/testing/test/mnl">no ending slash</a>

The result would be #2. A solution for that case is posted in the question "find pattern for url with no ending slash".

I have tried to modify the provided pattern to also exclude URLs that contain 'images' or '.pdf', but with no luck so far.

Thanks.



1 Reply


This one should suit your needs:

href="(?:(?<!images).(?!(?:[.]pdf|/)"))*?"
  • (?:...) = non-capturing group
  • (?<!images). = any character not preceded by images
  • .(?!(?:[.]pdf|/)") = any character not followed by .pdf" or by /"
  • *? = match as few characters as possible (lazy)
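
As a minimal sketch of how this pattern might be plugged into preg_match_all: the sample markup in $html, the '~' delimiter, and the extra capture group around the URL are assumptions added for illustration and are not part of the original answer.

```php
<?php
// Hypothetical sample markup, modeled on the two links from the question
// plus one link for each exclusion rule.
$html = <<<'HTML'
<a href="/testing/abc/">end with slash</a>
<a href="/testing/test/mnl">no ending slash</a>
<a href="/images/logo">contains images</a>
<a href="/docs/manual.pdf">ends with .pdf</a>
HTML;

// The pattern from the answer, with a capture group added around the URL.
// '~' is used as the delimiter so that '/' needs no escaping.
$pattern = '~href="((?:(?<!images).(?!(?:[.]pdf|/)"))*?)"~';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the captured URLs; with the sample above,
// only /testing/test/mnl is captured.
print_r($matches[1]);
```

Two edge cases are worth checking against real data: as written, the pattern still matches href="/" (the lone slash has no preceding character that the lookahead could reject), and it matches a URL that ends in exactly images, because the lookbehind only rejects the character that comes after images.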
