I have a schema in "utf8 -- UTF-8 Unicode" as charset and a collation of "utf8_spanish_ci".
All the inside tables are InnoDB with same charset and collation as mentioned.
Here comes the problem:
with a query like
SELECT *
FROM people p
WHERE p.NAME LIKE '%jose%';
I get 83 result rows. I should have 84 results, because I know it.
Changing where for:
WHERE p.NAME LIKE '%JOSE%';
I get the exact same 83 rows.
With combinations like JoSe, Jose, JOSe, etc. All the same 83 rows are reported.
The problem comes when accents play in game. If do:
WHERE p.NAME LIKE '%josé%';
I get no results. 0 rows.
But if I do:
WHERE p.NAME LIKE '%JOSé%';
I get just one resulting row, so 1 row. This is the only row which has accented "jose" and capitalized.
I've tried with josé, or JoSé or whatever combination I do, as long as the accented letter stays capitalized or not, as it really is stored in the database and it stills returning the only row. If I suddenly change "é" for "é" in whatever combination I do with the capitalization in JOSE, it returns no rows.
So conclusions:
- Case insensitive if no latin characters plays in game.
- Case sensitive if latin characters appears.
- Accent sensitive, as if I search JOSE or jose, I only get 83 rows, instead of the 84 rows I need.
What I want?
- To search "jose", "JOSE", "José", "JOSé", "JòSE", "j?se", "JoSè", ... have to return the 84 rows I know that exists. I what to turn my searches to case insensitive and "latin" insensitive.
Solutions like COLLATION
on LIKE
doesn't work for me, don't know why...
What can I do?
EDIT:
If I do something like:
WHERE p.NAME LIKE '%jose%' COLLATE utf8_general_ci;
I get the error:
COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'
And I've changed all the possible collations on the columns too!
And if I do something like:
WHERE p.NAME LIKE _utf8 '%jose%' COLLATE utf8_general_ci;
The same 83 rows are reported, as if I've made nothing...
See Question&Answers more detail:
os