I'm OK with either a PL/SQL solution or an Access VBA/Excel VBA (though Access VBA is preferred over Excel VBA) one. so, PL/SQL is the first choice, Access VBA is second and Excel VBA is third.
This is a very tough problem to explain. Please ask any questions and i will do my best to answer them clearly.
I have the following dataset in a table called NR_PVO_120. How do i pick out a number (which can change but let's say, 6) of UNIQUE OtherIDs without excluding any OtherIDs under any fax numbers?
So, if you pick OtherID from Row7 you then also must pick OtherIDs from rows 8 and 9 because they have the same fax number. Basically, once you pick an OtherID you're then obligated to pick all OtherIDs that have the same fax number as the one you picked.
If the number requested (6 for this example) isn't possible then "the closest number possible but not exceeding" would be the rule.
For example, if you take OtherIDs from rows 1-10 you will get 6 unique OtherIDs but row 10 shares a fax with rows 11 and 12. You either need to take all 3 (but that will raise the unique count to 8, which isn't acceptable) or skip this OtherID and find one with a fax that will add 1 unique OtherID (for example, it can have 4 OtherIDs but 3 of them exist on the result set and therefore don't add to unique counts). My result of 6 UNIQUE OtherIDs will need to contain ALL OtherIDs under any fax the existing OtherIDs are connected to.
So one solution is to take rows 1-6, 26. Another is to take rows 1-4,10-14. There are more but you get the idea.
There will be many possibilities (the real dataset has tens of thousands of rows and the number of people requested will be around 10K), as long all OtherIDs connected to all faxes on the result set are part of the requested number (6 in this case) any combination would do.
A few notes.
Getting as close as possible to the requested number is a requirement.
Some OtherIDs will have a blank fax, they should only be included as a last resort (not enough OtherIDs for the requested number).
How is this done?
Row OtherID Fax
1 11098554 2063504752
2 56200936 2080906666
3 11098554 7182160901
4 25138850 7182160901
5 56148974 7182232046
6 56530104 7182234134
7 25138850 7182234166
8 56148974 7182234166
9 11098554 7182234166
10 56597717 7182248132
11 56166294 7182248132
12 25138850 7182248132
13 56148974 7182390090
14 56226456 7182390090
15 56148974 7182395285
16 25138850 7182395285
17 56166614 7180930966
18 11098554 7180930966
19 56159509 7180930966
20 25138850 7185462234
21 56148974 7185462234
22 25138850 7185465013
23 56024315 7185465013
24 56115247 7185465281
25 25138850 7185465281
26 56148975 7185466029
A few sample outputs
one solution is taking rows 1-6 and 26.
Row OtherID Fax
1 11098554 2063504752
2 56200936 2080906666
3 11098554 7182160901
4 25138850 7182160901
5 56148974 7182232046
6 56530104 7182234134
26 56148975 7185466029
Another solution is taking rows 1-4 and 10-14.
Row OtherID Fax
1 11098554 2063504752
2 56200936 2080906666
3 11098554 7182160901
4 25138850 7182160901
10 56597717 7182248132
11 56166294 7182248132
12 25138850 7182248132
13 56148974 7182390090
14 56226456 7182390090
There are many more.
I only need FAX as my output.
This is for a fax campaign, we need to make sure no fax number is faxed twice, that all people connected to that fax number are contacted under one fax sent.
So the idea is to take all OtherIDs under ANY fax you end up using.
EDIT here's how it's currently done, maybe this helps paint a picture
list is sorted by fax, they go down the list to a random point MAKING SURE THE LAST RECORD ENDS WITH THE SAME FAX. so in my example they'd stop at either row 1,2,4,5,6,9,12,14,16,19,21,23,25,26. they then see how many unique OtherIDs they have up until that point. if it's too many they go up some, see how many they have. if it's too little, they go down some, see how many they have. and they keep doing this until they get their unique number. the only requirement is to always include all OtherIDs under a fax.
See Question&Answers more detail:
os