I've run into something I'm not sure how to handle. I'm building a database to store information on sports cards, and I'm having trouble with the sort order when I view certain cards.
For background, each card (row in the database) has information on year, set the card is from, player on the card, and card number (there's more info than that, but this is all that's relevant here). When I see results, I want things to be sorted by year, then set, then player, then card number. Everything but card number is working fine, as year is just an integer, and set and player are both varchars, so it's easy to sort those. However, the card number is what I'm running into some issues with.
The card number column is a varchar since the card number can include letters, numbers, and dashes. Most commonly, a card number will be a straight number (e.g. 1, 2, 3, 4, 5), straight letters (Ex-A, Ex-B, Ex-C), a number followed by a letter (1a, 1b, 2, 3a, 3b, 3c), or letters followed by a number (A1, A2, A3, A4, A5). This is how I currently have the sort portion of my SQL string set up:
order by year desc, cardset asc, subset asc, cast(cardNum as unsigned) asc;
This handles MOST things fine. The problem shows up when a group of cards share the same leading letters in their card number, followed by a number. I want the sort to essentially ignore the leading letters and sort by the numbers alone. Sometimes it doesn't do this correctly, particularly when there are more than 5 or so such cards to sort.
Specifically, it's incorrectly sorting some cards with the following card numbers into the following order:
When it should result in:
It currently sorts straight numbers and numbers followed by letters correctly (e.g. 1, 2, 3, 4, 5, or 1, 2, 3a, 3b, 4, 5a, 5b). I'm not aware of any issues with it sorting straight letters incorrectly, but I have very few test cases for that, so I'm not sure it's 100% correct.
In addition to not knowing how to modify my SQL sort statement without messing up other sorts, I don't really know how it's coming up with the order it is for the BCP example above. Any thoughts on how to correct it? I thought about trying to ignore letters in card number until we get to numbers, but that would cause major issues for cards with only letters in the card number. So I'm a bit stuck.
If worst comes to worst, I could probably split the card number column into 2 different columns: one for the more descriptive part and one for the part I want to sort by. That would probably end up working just fine, but it would require a lot of work to get things back!
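As a rough illustration of the two-column idea, here is a minimal Python sketch of how existing card numbers might be split into sortable parts. The helper name `split_card_number`, the regular expression, and the use of `-1` for card numbers without digits are all assumptions made for illustration, not part of the actual schema.

```python
import re

# Hypothetical pattern: leading non-digits, first run of digits, the rest.
_CARD_NUM = re.compile(r"^(\D*)(\d*)(.*)$", re.DOTALL)


def split_card_number(card_num):
    """Split a card number into (prefix, number, suffix).

    The numeric part is -1 when the card number has no digits at all
    (e.g. "Ex-A"); that placeholder is an arbitrary choice for this sketch.
    """
    prefix, digits, suffix = _CARD_NUM.match(card_num).groups()
    number = int(digits) if digits else -1
    return prefix, number, suffix


# Card numbers taken from the examples in this question, shuffled.
bowman = ["BCP125", "BCP32", "BCP97", "BCP61"]
topps = ["3", "2b", "1", "2a"]

print(sorted(bowman, key=split_card_number))
print(sorted(topps, key=split_card_number))
print(split_card_number("Ex-A"))
```

Sorting on the tuple compares the prefix first, then the number as an integer, then the trailing letters, which is the ordering the two separate columns would be meant to provide.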
Edit: Here is some more information, including data in my DB (sorry for the formatting, no idea how to do tables here):
| year | cardSet | subset | cardNum |
| --- | --- | --- | --- |
| 2016 | Bowman | | 52 |
| 2016 | Bowman | | 54 |
| 2016 | Bowman | | 147 |
| 2016 | Bowman | Chrome Prospects | BCP32 |
| 2016 | Bowman | Chrome Prospects | BCP61 |
| 2016 | Bowman | Chrome Prospects | BCP97 |
| 2016 | Bowman | Chrome Prospects | BCP125 |
| 2016 | Topps | | 1 |
| 2016 | Topps | | 2a |
| 2016 | Topps | | 2b |
| 2016 | Topps | | 3 |
I would expect my sort to spit out results in the following order:
- 2016 Bowman 52
- 2016 Bowman 54
- 2016 Bowman 147
- 2016 Bowman Chrome Prospects BCP32
- 2016 Bowman Chrome Prospects BCP61
- 2016 Bowman Chrome Prospects BCP97
- 2016 Bowman Chrome Prospects BCP125
- 2016 Topps 1
- 2016 Topps 2a
- 2016 Topps 2b
- 2016 Topps 3
However, here are the results I get with the sort statement above:
- 2016 Bowman 52
- 2016 Bowman 54
- 2016 Bowman 147
- 2016 Bowman Chrome Prospects BCP61
- 2016 Bowman Chrome Prospects BCP97
- 2016 Bowman Chrome Prospects BCP32
- 2016 Bowman Chrome Prospects BCP125
- 2016 Topps 1
- 2016 Topps 2a
- 2016 Topps 2b
- 2016 Topps 3
It handles card numbers with just numbers, or numbers followed by letters just fine, but it tends to mess things up when the card number begins with letters and is followed by numbers.
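A minimal sketch of the likely cause, assuming MySQL's string-to-number conversion keeps only a leading run of digits and yields 0 when there is none. The example uses Python's built-in `sqlite3` module as a stand-in, since SQLite's `CAST(... AS INTEGER)` treats these inputs the same way. It is not the original MySQL setup.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption): its CAST(text AS INTEGER)
# keeps only a leading run of digits and yields 0 when there is none.
conn = sqlite3.connect(":memory:")
card_nums = ["52", "2a", "BCP32", "BCP61", "BCP97", "BCP125"]

# Map each card number to the integer the cast produces for it.
cast_values = {
    num: conn.execute("SELECT CAST(? AS INTEGER)", (num,)).fetchone()[0]
    for num in card_nums
}
conn.close()

for num, value in cast_values.items():
    print(f"{num!r} -> {value}")
```

If that assumption holds, every BCP card gets the same sort value of 0, so the database may return those rows in any order relative to each other.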
I have tried using the length() trick in the comments so that the sort part of my SQL is:
order by year desc, cardset asc, subset asc, length(cardNum), cardNum asc
That does fix the issue I was describing above, but it messes up the 'Topps' part of my example: cards with a letter following the number always end up last. Here's the order I get with that sort:
- 2016 Bowman 52
- 2016 Bowman 54
- 2016 Bowman 147
- 2016 Bowman Chrome Prospects BCP32
- 2016 Bowman Chrome Prospects BCP61
- 2016 Bowman Chrome Prospects BCP97
- 2016 Bowman Chrome Prospects BCP125
- 2016 Topps 1
- 2016 Topps 3
- 2016 Topps 2a
- 2016 Topps 2b
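The length-based ordering above can be reproduced with a small sketch, again using Python's built-in `sqlite3` as a stand-in for MySQL (an assumption). '2a' is two characters long and '3' is one, so sorting by length first puts 3 ahead of 2a. The second ordering is one possible alternative, not a confirmed fix. It strips leading uppercase letters with SQLite's two-argument `ltrim` before casting. MySQL has no equivalent set-based trim, so something like `REGEXP_REPLACE` (MySQL 8.0+) would be needed there instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (cardSet TEXT, cardNum TEXT)")
# Rows from the example data, inserted out of order on purpose.
conn.executemany(
    "INSERT INTO cards VALUES (?, ?)",
    [
        ("Topps", "3"), ("Bowman", "BCP125"), ("Topps", "2b"),
        ("Bowman", "BCP32"), ("Topps", "1"), ("Bowman", "BCP97"),
        ("Topps", "2a"), ("Bowman", "BCP61"),
    ],
)


def card_order(order_by):
    """Return cardNum values sorted by cardSet, then the given expression."""
    rows = conn.execute(
        f"SELECT cardNum FROM cards ORDER BY cardSet, {order_by}"
    )
    return [row[0] for row in rows]


# The length() trick: shorter strings first, then plain string comparison.
length_order = card_order("length(cardNum), cardNum")

# Alternative sketch: drop leading uppercase letters, sort numerically,
# then break ties (2a vs 2b) on the full card number.
stripped_order = card_order(
    "CAST(ltrim(cardNum, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ') AS INTEGER), cardNum"
)
conn.close()

print(length_order)
print(stripped_order)
```

Card numbers made up only of letters would all cast to 0 under the second ordering and fall back to the plain string comparison on `cardNum`.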