Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - How to exclude text that follows closing HTML tags using C#

To keep it short: I need to extract plain text from HTML. I have already researched this a lot on the web, but none of what I found actually helped. Let me explain with some code.

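using System;
using System.Text.RegularExpressions;

var source = "<div>Value1</div><h1> Value2</h1> {blablabla}";

// Strip every tag; the text outside the elements survives as well.
var plainText = Regex.Replace(source, "<.*?>", String.Empty); // Value1 Value2 {blablabla}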

You may ask why the source contains {blablabla} at all, since it is not valid HTML. The answer is that the HTML comes from a JSX file, which may contain such values, and I need to handle them somehow.

So I need to exclude the values that appear right after a closing HTML tag.

Is there any way? Thanks.



1 Reply

0 votes
by (71.8m points)
Waiting for an expert to reply.
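
One possible approach, offered as a minimal sketch rather than a definitive answer: instead of deleting the tags, walk over them, track how deeply nested the current position is, and keep text only while inside at least one element. Text that follows a closing tag at the top level, such as {blablabla}, is then dropped. The names PlainTextExtractor and ExtractElementText are hypothetical and chosen for illustration. The sketch assumes well-balanced markup in which void elements are self-closed (as JSX requires, e.g. <br />) and no attribute value contains a ">" character. It does not remove JSX expressions that sit inside an element.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class PlainTextExtractor
{
    // Matches one tag. Group 1 is "/" for a closing tag,
    // group 2 is "/" for a self-closing tag such as <br />.
    private static readonly Regex Tag =
        new Regex(@"<(/?)[A-Za-z][^>]*?(/?)>", RegexOptions.Compiled);

    // Returns only the text found inside at least one element.
    public static string ExtractElementText(string source)
    {
        var result = new StringBuilder();
        int depth = 0;     // number of currently open elements
        int position = 0;  // index just past the previous tag

        foreach (Match tag in Tag.Matches(source))
        {
            // Keep the text before this tag only if it sits inside an element.
            if (depth > 0)
                result.Append(source, position, tag.Index - position);

            bool isClosing = tag.Groups[1].Value == "/";
            bool isSelfClosing = tag.Groups[2].Value == "/";

            if (isClosing)
                depth = Math.Max(0, depth - 1);
            else if (!isSelfClosing)
                depth++;

            position = tag.Index + tag.Length;
        }

        // Trailing text is kept only if an element was left unclosed.
        if (depth > 0)
            result.Append(source, position, source.Length - position);

        return result.ToString();
    }
}
```

For the source string from the question, PlainTextExtractor.ExtractElementText(source) returns "Value1 Value2". For arbitrary real-world HTML a dedicated HTML parser is usually more robust than a regular expression.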
