Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - How can I split string by a special character and ignore everything inside parentheses?

I want to split the string by "/" and ignore "/" inside the outer parentheses.

Sample input string:

"Apple 001/(Orange (002/003) ABC)/Mango 003 )/( ASDJ/(Watermelon )004)/Apple 002 ASND/(Mango)"

Expected output in string array:

["Apple 001", "(Orange (002/003) ABC)", "Mango 003 )/( ASDJ", "(Watermelon )004)", "Apple 002 ASND", "(Mango)"]

This is my regex:

/(?=(?:[^()]*\([^()]*\))*[^()]*$)

But it only supports simple strings like this:

"Apple 001/(Orange 002/003 ABC)/Mango 003 ASDJ/(Watermelon 004)/Apple 002 ASND/(Mango)"

If there are inner parentheses, the result is incorrect.
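
The difference can be reproduced with a small sketch like the one below. It assumes the regex is meant to match literal parentheses, i.e. `\(` and `\)` (written `\\(` and `\\)` inside a Java string literal). The class name `RegexSplitDemo` and the helper `split` are made up for illustration only.

```java
import java.util.Arrays;

class RegexSplitDemo {
    // Assumed form of the regex: split on "/" only when the rest of the
    // string consists of flat (non-nested), balanced parenthesis groups.
    static final String REGEX = "/(?=(?:[^()]*\\([^()]*\\))*[^()]*$)";

    // Hypothetical helper, just wraps String.split.
    static String[] split(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        String simple = "Apple 001/(Orange 002/003 ABC)/Mango 003 ASDJ/(Watermelon 004)/Apple 002 ASND/(Mango)";
        String nested = "Apple 001/(Orange (002/003) ABC)/Mango 003 )/( ASDJ/(Watermelon )004)/Apple 002 ASND/(Mango)";

        // Simple case: six tokens, the "/" inside "(Orange 002/003 ABC)" is kept.
        System.out.println(Arrays.toString(split(simple)));

        // Nested / unbalanced case: the lookahead can no longer match for most
        // separators, so the result does not have the six expected tokens.
        System.out.println(Arrays.toString(split(nested)));
    }
}
```

The lookahead only understands one level of parentheses, which is why any nested or unbalanced group later in the string prevents earlier "/" characters from being treated as separators.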


1 Reply


Here's a sample parser that does what you need:

import java.util.ArrayList;
import java.util.List;

// Splits on '/' only when outside parentheses, tracking the nesting depth.
public static List<String> splitter(String input) {
    int nestingLevel = 0;
    StringBuilder currentToken = new StringBuilder();
    List<String> result = new ArrayList<>();
    for (char c : input.toCharArray()) {
        if (nestingLevel == 0 && c == '/') { // separator outside any parentheses
            result.add(currentToken.toString());
            currentToken = new StringBuilder();
        } else {
            if (c == '(') { nestingLevel++; }
            // An unmatched ')' is ignored so the nesting level never goes negative.
            else if (c == ')' && nestingLevel > 0) { nestingLevel--; }

            currentToken.append(c);
        }
    }
    result.add(currentToken.toString()); // the last token has no trailing separator
    return result;
}


Note that it doesn't produce the expected output you posted, but I'm not sure what algorithm you were following to obtain that result. In particular, I've made sure there's no "negative nesting level", so for starters the / in "Mango 003 )/( ASDJ" is considered outside of parentheses and is parsed as a separator.
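
To make that difference concrete, here is a self-contained sketch that runs the same splitting logic on the sample input from the question. The wrapper class `SplitterDemo` and its `main` method are only there for illustration. The output in the comment was traced by hand from the loop.

```java
import java.util.ArrayList;
import java.util.List;

class SplitterDemo {
    // Same logic as the parser above: '/' is a separator only at nesting level 0.
    static List<String> splitter(String input) {
        int nestingLevel = 0;
        StringBuilder currentToken = new StringBuilder();
        List<String> result = new ArrayList<>();
        for (char c : input.toCharArray()) {
            if (nestingLevel == 0 && c == '/') {
                result.add(currentToken.toString());
                currentToken = new StringBuilder();
            } else {
                if (c == '(') {
                    nestingLevel++;
                } else if (c == ')' && nestingLevel > 0) {
                    nestingLevel--; // unmatched ')' leaves the level at 0
                }
                currentToken.append(c);
            }
        }
        result.add(currentToken.toString());
        return result;
    }

    public static void main(String[] args) {
        String input = "Apple 001/(Orange (002/003) ABC)/Mango 003 )/( ASDJ/(Watermelon )004)/Apple 002 ASND/(Mango)";
        // Prints:
        // [Apple 001, (Orange (002/003) ABC), Mango 003 ), ( ASDJ/(Watermelon )004), Apple 002 ASND, (Mango)]
        System.out.println(splitter(input));
    }
}
```

The third and fourth tokens are where this differs from the expected output in the question: the stray ")" after "Mango 003" does not open a group, so the following "/" splits, and "( ASDJ/(Watermelon )004)" is then read as one balanced group.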

Anyway, I'm sure you can tweak my answer much more easily than you could a regex one. The whole point is to show that writing a parser for such problems is often more realistic than trying to craft a regex.

