Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
235 views
in Technique[技术] by (71.8m points)

php - Algorithm to get all possible string combinations from array up to certain length

What is the best algorithm to get all possible string combinations from a given array with a minimum & maximum length value.

Note: This adds complexity since the value is variable unlike the questions these are linked to.

For example:

$letters = array('a','b','c','1','2','3');
$min_length = 1;
$max_length = 4;

a
b
c
1
2
3
.
.
.
aaaa
a123
b123
c123
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Almost a Base Conversion

This solution is motivated by the observation that, if it were not for the possibility of repeating the character at array index 0 in the high-order position of a valid combination, this problem would simply be a base conversion from decimal to the new base of all integers from 0 to (base^length)-1. So,

0 = a
1 = b
...
1294 = 3332
1295 = 3333

The difficulty is that this misses combinations with one or more leading 'a', like:

aa
...
a1
aa1
aaa1

And these are the only solutions missing. So one could simply, for each generated combination with length less than max_length, add 'a' (or whatever is at letters[0]) to the front of the string, and output that string, repeating if necessary. So if you generate the string 'b', you'd output:

b
ab
aab
aaab

This works, but it is unsatisfying, first because it looks like a kludged solution. Second, it does not generate the solutions in lexicographical order, which may or may not be a problem. Third, it would be really nice if the function to generate the combinations was bijective so that we knew the unique combination that results from any given decimal integer and the unique index of any combination. This will be critical in a moment.

Imagine There's No Zero

If the zero index is giving us problems, then why not do away with it? Say we change the array to:

letters = {?,'a','b','c','1','2','3'}

where ? is an arbitrary placeholder that will never be used. We will now attempt to represent the decimal integers from 1 to base^max_length in the new base (in this case still 6, not 7) without using the digit zero. We'll represent the numbers from 1 to base-1 as normal (1, 2, 3, ...) but when we get to the number equal to the base, rather than represent it as 10, we'll represent it as the base digit (e.g., 6 rather than 10 in base 6). The next number, base+1, would be 11, then 12 up to 16 (which is equal to 12 decimal) and then up to 21. Each number

an,an-1,...,a1,a0

in the new base is equal to

an*bn+an-1*bn-1+...+a1*b1+a0*b0

in decimal, where b is the new base.

As I've come to learn, this is fittingly called a bijective numeration. Taking an example, in base 6 we would have:

Base 10    Base 6
1          1
2          2
...
6          6
7          11
8          12
...
11         15
12         16
13         21
...
36         56

From the Wikipedia link above, the first "digit" of the number in the new base is:

a0 = m - q0k

where k is the new base, m is the integer to convert, and q0 = ceiling(m/k)-1. Note the similarity to a normal base conversion where the only difference is that q0 would be floor(m/k) or ordinary integer division. Subsequent "digits" are computed similarly, using q0 instead of m to find a1, etc.

This formula can be broken down into two cases:

  • (m mod k) == 0: a0 = k and q0 = (m div k) - 1
  • (m mod k) != 0: a0 = (m mod k) and q0 = (m div k)

This is the heart of the algorithm. For any integer m we can iteratively apply this formula until the resulting qp == 0. Also note that the conversion is, naturally, reversible. Given a number 6666 in base 6, the decimal value is:

(6*6^3)+(6*6^2)+(6*6^1)+(6*6^0)=1554

We use this fact to find the range of integers to convert in order to get all combinations of a certain length. Sorry, but my PHP skills are non-existent. Hopefully the Java code is clear enough. Given:

    char [] letters = new char [] {'0','a','b','c','1','2','3'};
    int max_length = 4;

we have the function:

    int b = letters.length - 1;  // base to convert to
    int n = 0;
    for (int k = 0; k < max_length; k++)
        n = (n*b)+b;  // number of combinations

    int remainder;
    for (int i = 1; i <= n; i++) {
        int current = i;  // m and q_n in the formula
        String combination = "";
        do {
            remainder = current % b;
            if (remainder == 0) {
                combination += letters[b];
                current = current/b - 1;
            } else {
                combination += letters[remainder];
                current = current/b;
            }
        } while (current > 0);
        System.out.println(combination);
    }

where / represents integer division, or floor(a/b).

The function relies only on the input integer and not on the results of previous calculations, so the possibilities for parallel processing are pretty good.

Here is the output with the above input. Least significant digit is first, as per your example.

a
b
c
1
2
3
aa
ba
ca
1a
2a
3a
ab
bb
cb
1b
2b
3b
ac
bc
cc
1c
2c
3c
a1
b1
c1
11
21
31
a2
b2
c2
12
22
32
a3
b3
c3
13
23
33
aaa
baa
caa
1aa
2aa
3aa
aba
bba
cba
1ba
2ba
3ba
aca
bca
cca
1ca
2ca
3ca
a1a
b1a
c1a
11a
21a
31a
a2a
b2a
c2a
12a
22a
32a
a3a
b3a
c3a
13a
23a
33a
aab
bab
cab
1ab
2ab
3ab
abb
bbb
cbb
1bb
2bb
3bb
acb
bcb
ccb
1cb
2cb
3cb
a1b
b1b
c1b
11b
21b
31b
a2b
b2b
c2b
12b
22b
32b
a3b
b3b
c3b
13b
23b
33b
aac
bac
cac
1ac
2ac
3ac
abc
bbc
cbc
1bc
2bc
3bc
acc
bcc
ccc
1cc
2cc
3cc
a1c
b1c
c1c
11c
21c
31c
a2c
b2c
c2c
12c
22c
32c
a3c
b3c
c3c
13c
23c
33c
aa1
ba1
ca1
1a1
2a1
3a1
ab1
bb1
cb1
1b1
2b1
3b1
ac1
bc1
cc1
1c1
2c1
3c1
a11
b11
c11
111
211
311
a21
b21
c21
121
221
321
a31
b31
c31
131
231
331
aa2
ba2
ca2
1a2
2a2
3a2
ab2
bb2
cb2
1b2
2b2
3b2
ac2
bc2
cc2
1c2
2c2
3c2
a12
b12
c12
112
212
312
a22
b22
c22
122
222
322
a32
b32
c32
132
232
332
aa3
ba3
ca3
1a3
2a3
3a3
ab3
bb3
cb3
1b3
2b3
3b3
ac3
bc3
cc3
1c3
2c3
3c3
a13
b13
c13
113
213
313
a23
b23
c23
123
223
323
a33
b33
c33
133
233
333
aaaa
baaa
caaa
1aaa
2aaa
3aaa
abaa
bbaa
cbaa
1baa
2baa
3baa
acaa
bcaa
ccaa
1caa
2caa
3caa
a1aa
b1aa
c1aa
11aa
21aa
31aa
a2aa
b2aa
c2aa
12aa
22aa
32aa
a3aa
b3aa
c3aa
13aa
23aa
33aa
aaba
baba
caba
1aba
2aba
3aba
abba
bbba
cbba
1bba
2bba
3bba
acba
bcba
ccba
1cba
2cba
3cba
a1ba
b1ba
c1ba
11ba
21ba
31ba
a2ba
b2ba
c2ba
12ba
22ba
32ba
a3ba
b3ba
c3ba
13ba
23ba
33ba
aaca
baca
caca
1aca
2aca
3aca
abca
bbca
cbca
1bca
2bca
3bca
acca
bcca
ccca
1cca
2cca
3cca
a1ca
b1ca
c1ca
11ca
21ca
31ca
a2ca
b2ca
c2ca
12ca
22ca
32ca
a3ca
b3ca
c3ca
13ca
23ca
33ca
aa1a
ba1a
ca1a
1a1a
2a1a
3a1a
ab1a
bb1a
cb1a
1b1a
2b1a
3b1a
ac1a
bc1a
cc1a
1c1a
2c1a
3c1a
a11a
b11a
c11a
111a
211a
311a
a21a
b21a
c21a
121a
221a
321a
a31a
b31a
c31a
131a
231a
331a
aa2a
ba2a
ca2a
1a2a
2a2a
3a2a
ab2a
bb2a
cb2a
1b2a
2b2a
3b2a
ac2a
bc2a
cc2a
1c2a
2c2a
3c2a
a12a
b12a
c12a
112a
212a
312a
a22a
b22a
c22a
122a
222a
322a
a32a
b32a
c32a
132a
232a
332a
aa3a
ba3a
ca3a
1a3a
2a3a
3a3a
ab3a
bb3a
cb3a
1b3a
2b3a
3b3a
ac3a
bc3a
cc3a
1c3a
2c3a
3c3a
a13a
b13a
c13a
113a
213a
313a
a23a
b23a
c23a
123a
223a
323a
a33a
b33a
c33a
133a
233a
333a
aaab
baab
caab
1aab
2aab
3aab
abab
bbab
cbab
1bab
2bab
3bab
acab
bcab
ccab
1cab
2cab
3cab
a1ab
b1ab
c1ab
11ab
21ab
31ab
a2ab
b2ab
c2ab
12ab
22ab
32ab
a3ab
b3ab
c3ab
13ab
23ab
33ab
aabb
babb
cabb
1abb
2abb
3abb
abbb
bbbb
cbbb
1bbb
2bbb
3bbb
acbb
bcbb
ccbb
1cbb
2cbb
3cbb
a1bb
b1bb
c1bb
11bb
21bb
31bb
a2bb
b2bb
c2bb
12bb
22bb
32bb
a3bb
b3bb
c3bb
13bb
23bb
33bb
aacb
bacb
cacb
1acb
2acb
3acb
abcb
bbcb
cbcb
1bcb
2bcb
3bcb
accb
bccb
cccb
1ccb
2ccb
3ccb
a1cb
b1cb
c1cb
11cb
21cb
31cb
a2cb
b2cb
c2cb
12cb
22cb
32cb
a3cb
b3cb
c3cb
13cb
23cb
33cb
aa1b
ba1b
ca1b
1a1b
2a1b
3a1b
ab1b
bb1b
cb1b
1b1b
2b1b
3b1b
ac1b
bc1b
cc1b
1c1b
2c1b
3c1b
a11b
b11b
c11b
111b
211b
311b
a21b
b21b
c21b
121b
221b
321b
a31b
b31b
c31b
131b
231b
331b
aa2b
ba2b
ca2b
1a2b
2a2b
3a2b
ab2b
bb2b
cb2b
1b2b
2b2b
3b2b
ac2b
bc2b
cc2b
1c2b
2c2b
3c2b
a12b
b12b
c12b
112b
212b
312b
a22b
b22b
c22b
122b
222b
322b
a32b
b32b
c32b
132b
232b
332b
aa3b
ba3b
ca3b
1a3b
2a3b
3a3b
ab3b
bb3b
cb3b
1b3b
2b3b
3b3b
ac3b
bc3b
cc3b
1c3b
2c3b
3c3b
a13b
b13b
c13b
113b
213b
313b
a23b
b23b
c23b
123b
223b
323b
a33b
b33b
c33b
133b
233b
333b
aaac
baac
caac
1aac
2aac
3aac
abac
bbac
cbac
1bac
2bac
3bac
acac
bcac
ccac
1cac
2cac
3cac
a1ac
b1ac
c1ac
11ac
21ac
31ac
a2ac
b2ac
c2ac
12ac
22ac
32ac
a3ac
b3ac
c3ac
13ac
23ac
33ac
aabc
babc
cabc
1abc
2abc
3abc
abbc
bbbc
cbbc
1bbc
2bbc
3bbc
acbc
bcbc
ccbc
1cbc
2cbc
3cbc
a1bc
b1bc
c1bc
11bc
21bc
31bc
a2bc
b2bc
c2bc
12bc
22bc
32bc
a3bc
b3bc
c3bc
13bc
23bc
33bc
aacc
bacc
cacc
1acc
2acc
3acc
abcc
bbcc
cbcc
1bcc
2bcc
3bcc
accc
bccc
cccc
1ccc
2ccc
3ccc
a1cc
b1cc
c1cc
11cc
21cc
31cc
a2cc
b2cc
c2cc
12cc
22cc
32cc
a3cc
b3cc
c3cc
13cc
23cc
33cc
aa1c
ba1c
ca1c
1a1c
2a1c
3a1c
ab1c
bb1c
cb1c
1b1c
2b1c
3b1c
ac1c
bc1c
cc1c
1c1c
2c1c
3c1c
a11c
b11c
c11c
111c
211c
311c
a21c
b21c
c21c
121c
221c
321c
a31c
b31c
c31c
131c
231c
331c
aa2c
ba2c
ca2c
1a2c
2a2c
3a2c
ab2c
bb2c
cb2c
1b2c
2b2c
3b2c
ac2c
bc2c
cc2c
1c2c
2c2c
3c2c
a12c
b12c
c12c
112c
212c
312c
a22c
b22c
c22c
122c
222c
322c
a32c
b32c
c32c
132c
232c
332c
aa3c
ba3c
ca3c
1a3c
2a3c
3a3c
ab3c
bb3c
cb3c
1b3c
2b3c
3b3c
ac3c
bc3c
cc3c
1c3c
2c3c
3c3c
a13c
b13c
c13c
113c
213c
313c
a23c
b23c
c23c
123c
223c
323c
a33c
b33c
c33c
133c
233c
333c
aaa1
baa1
caa1
1aa1
2aa1
3aa1
aba1
bba1
cba1
1ba1
2ba1
3ba1
aca1
bca1
cca1
1ca1
2ca1
3ca1
a1a1
b1a1
c1a1
11a1
21a1
31a1
a2a1
b2a1
c2a1
12a1
22a1
32a1
a3a1
b3a1
c3a1
13a1
23a1
33a1
aab1
bab1
cab1
1ab1
2ab1
3ab1
abb1
bbb1
cbb1
1bb1
2bb1
3bb1
acb1
bcb1
ccb1
1cb1
2cb1
3cb1
a1b1
b1b1
c1b1
11b1
21b1
31b1
a2b1
b2b1
c2b1
12b1
22b1
32b1
a3b1
b3b1
c3b1
13b1
23b1
33b1
aac1
bac1
cac1
1ac1
2ac1
3ac1
abc1
bbc1
cbc1
1bc1
2bc1
3bc1
acc1
bcc1
ccc1
1cc1
2cc1
3cc1
a1c1
b1c1
c1c1
11c1
21c1
31c1
a2c1
b2c1
c2c1
12c1
22c1
32c1
a3c1
b3c1
c3c1
13c1
23c1
33c1
aa11
ba11
ca11
1a11
2a11
3a11
ab11
bb11
cb11
1b11
2b11
3b11
ac11
bc11
cc11
1c11
2c11
3c11
a111
b111
c111
1111
2111
3111
a211
b211
c211
1211
2211
3211
a311
b311
c311
1311
2311
3311
aa21
ba21
ca21
1a21
2a21
3a21
ab21
bb21
cb21
1b21
2b21
3b21
ac21
bc21
cc21
1c21
2c21
3c21
a121
b121
c121
1121
2121
3121
a221
b221
c221
1221
2221
3221
a321
b321
c321
1321
2321
3321
aa31
ba31
ca31
1a31
2a31
3a31
ab31
bb31
cb31
1b31
2b31
3b31
ac31
bc31
cc31
1c31
2c31
3c31
a131
b131
c131
1131
2131
3131
a231
b231
c231
1231
2231
3231
a331
b331
c331
1331
2331
3331
aaa2
baa2
caa2
1aa2
2aa2
3aa2
aba2
bba2
cba2
1ba2
2ba2
3ba2
aca2
bca2
cca2
1ca2
2ca2
3ca2
a1a2
b1a2
c1a2
11a2
21a2
31a2
a2a2
b2a2
c2a2
12a2
22a2
32a2
a3a2
b3a2
c3a2
13a2
23a2
33a2
aab2
bab2
cab2
1ab2
2ab

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...