I just read a new question here on SO asking basically the same thing as mine does in the title. That got me thinking - and searching the web (most hits pointed to SO, of course ;). So I thought -
There should be a simple regex capable of removing C-style comments from any code.
Yes, there are answers to this question/statement on SO, but the ones I found are all incomplete and/or overly complex.
So I started experimenting, and came up with one that works on all types of code I can imagine:
(?://(?:\\\n|[^\n])*\n)|(?:/\*(?:\n|\r|.)*?\*/)|(("|')(?:\\\\|\\\2|\\\n|[^\2])*?\2)
The first alternative checks for double slash // comments. The second for ordinary ones, /* comment */. The third one is what I had trouble finding in other regexes dealing with the same task: strings containing character sequences that, outside the string, would be considered comments.
This part captures any string in capture group one. The opening quote sign goes in capture group two, and the match runs past any escaped quotes, up to the matching closing quote at the end of the string.
Capture group one should be kept in the replace and everything else discarded (replaced with ""), leaving un-commented code :).
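To make the replace step concrete, here is a minimal sketch in Python. It assumes the pattern reads as written in the block below once its backslashes are in place, and the names COMMENT_OR_STRING and strip_comments are made up for this illustration. Treat it as one way to try the idea, not as a finished tool.

```python
import re

# Assumed reading of the pattern from the post (backslashes restored).
COMMENT_OR_STRING = re.compile(
    r'(?://(?:\\\n|[^\n])*\n)'                # // comment, with backslash-newline continuation
    r'|(?:/\*(?:\n|\r|.)*?\*/)'               # /* block comment */
    r'|(("|\')(?:\\\\|\\\2|\\\n|[^\2])*?\2)'  # quoted string, captured in group 1
)


def strip_comments(source: str) -> str:
    """Drop comments; put back string literals (group 1) unchanged."""
    # Group 1 is None when a comment alternative matched, so fall back to "".
    return COMMENT_OR_STRING.sub(lambda m: m.group(1) or "", source)


if __name__ == "__main__":
    sample = (
        'int a = 1; // remove this\n'
        'char s[] = "keep // this /* too */"; /* gone */\n'
    )
    print(strip_comments(sample))
```

Two things to keep in mind with this sketch. The // alternative consumes the trailing newline, so the line after a removed comment is joined to the one before it. And in Python's re module a numeric escape inside a character class is a plain character, not a backreference, so [^\2] effectively means "any character" there; the lazy quantifier is what stops the match at the closing quote.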
Here's a C example at regex101.
OK... So that's not a question. It's an answer, you think...
Yes, you're right. So... on to the question.
Have I missed any type of code that this regex would miss?
It handles
multi line comments
/*
an easy one
*/
"end of line" comments
// Remove this
comments in strings
char array[] = "Following isn't a comment // because it's in a string /* this neither */";
which leads to - strings with escaped quotes
char array[] = "Handle /* comments */ - // - in strings with \" escaped quotes";
and strings with escaped escapes
char array[] = "Handle strings with **not** escaped quotes\\"; // <-EOS
JavaScript single-quoted strings
var myStr = 'Should also ignore enclosed // comments /* like these */ ';
line continuation
// This is a single line comment \
continuing on the next row (warns, but works in my C++ flavor)
So, can you think of any code cases messing this up? If you come up with any I'll try to complete the RE and hopefully it'll end up complete ;)
Regards.
PS. I know... Writing this it says in the right pane, under How to Ask: We prefer questions that can be answered, not just discussed. This question might violate that :S but I can't resist.
In fact, it may even turn out to be an answer, instead of a question, to some people. (Too cocky? ;)