I need to match parts of a string while ignoring HTML tags. For example, suppose the user searches for the string "foo and foo1" in this source:
Two strings, <u>foo</u> and foo1
They would not get a match, because of the tags.
I've tried a regex, but since the tags may or may not be present, it quickly gets too complicated.
It's not a server-side script; it's an application run from the console.
To be more specific: this is for syntax highlighting. The user wants "foo and foo1" to be italic, but part of it is already underlined, so a plain search wouldn't match. That's why I can't simply strip the tags from the string.
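One way to picture the problem is as a search over the visible text with a map back to offsets in the original source. The sketch below is only an illustration under several assumptions: Python is my choice of language (the question names none), the function name `italicize` and the `TAG_RE` pattern are hypothetical, tags are recognized by a naive `<...>` pattern with no `>` inside attribute values, and HTML entities and comments are not handled. It closes and reopens `<i>` around any existing tag so the output stays properly nested.

```python
import re

# Naive tag pattern (assumption): anything between '<' and the next '>'.
TAG_RE = re.compile(r"<[^>]*>")


def italicize(source: str, needle: str) -> str:
    """Wrap every tag-ignoring occurrence of `needle` in <i>...</i>."""
    # Build the visible text and, for each visible character,
    # remember its offset in the original source.
    text_chars = []
    offsets = []
    pos = 0
    for tag in TAG_RE.finditer(source):
        for i in range(pos, tag.start()):
            text_chars.append(source[i])
            offsets.append(i)
        pos = tag.end()
    for i in range(pos, len(source)):
        text_chars.append(source[i])
        offsets.append(i)
    text = "".join(text_chars)

    # Search the tag-free text and mark the source offsets of matched characters.
    marked = set()
    start = text.find(needle)
    while start != -1 and needle:
        marked.update(offsets[start:start + len(needle)])
        start = text.find(needle, start + len(needle))

    # Re-emit the source, opening <i> at the start of each run of marked
    # characters and closing it before any tag or unmarked character.
    out = []
    is_open = False
    for i, ch in enumerate(source):
        in_match = i in marked
        if in_match and not is_open:
            out.append("<i>")
            is_open = True
        elif not in_match and is_open:
            out.append("</i>")
            is_open = False
        out.append(ch)
    if is_open:
        out.append("</i>")
    return "".join(out)


print(italicize("Two strings, <u>foo</u> and foo1", "foo and foo1"))
# prints: Two strings, <u><i>foo</i></u><i> and foo1</i>
```

Because the match is split at every tag boundary, the existing underline is preserved and the italic markup never overlaps it. For real-world HTML a proper parser would be a safer way to find the tags than the regex assumed here.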