I have the following character encoding issue, somehow I have managed to save data with different character encoding into my database (UTF8) The code and outputs below show 2 sample strings and how they output. 1 of them would need to be changed to UTF8 and the other already is.
How do/should I go about checking if I should encode the string or not? e.g.
I need each string to be outputted correctly, so how do I check if it is already utf8 or whether it needs to be converted?
I am using PHP 5.2, mysql myisam tables:
CREATE TABLE IF NOT EXISTS `entities` (
....
`title` varchar(255) NOT NULL
....
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
<?php
$text = $entity['Entity']['title'];
echo 'Original : ', $text."<br />";
echo 'UTF8 Encode : ', utf8_encode($text)."<br />";
echo 'UTF8 Decode : ', utf8_decode($text)."<br />";
echo 'TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//TRANSLIT", $text)."<br />";
echo 'IGNORE TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//IGNORE//TRANSLIT", $text)."<br />";
echo 'IGNORE : ', iconv("ISO-8859-1", "UTF-8//IGNORE", $text)."<br />";
echo 'Plain : ', iconv("ISO-8859-1", "UTF-8", $text)."<br />";
?>
Output 1:
Original : France Télécom
UTF8 Encode : France T??l??com
UTF8 Decode : France T?l?com
TRANSLIT : France T??l??com
IGNORE TRANSLIT : France T??l??com
IGNORE : France T??l??com
Plain : France T??l??com
Output 2:###
Original : Cond? Nast Publications
UTF8 Encode : Condé Nast Publications
UTF8 Decode : Cond?ast Publications
TRANSLIT : Condé Nast Publications
IGNORE TRANSLIT : Condé Nast Publications
IGNORE : Condé Nast Publications
Plain : Condé Nast Publications
Thanks for you time on this one. Character encoding and I don't get on very well!
UPDATE:
echo strlen($string)."|".strlen(utf8_encode($string))."|";
echo (strlen($string)!==strlen(utf8_encode($string))) ? $string : utf8_encode($string);
echo "<br />";
echo strlen($string)."|".strlen(utf8_decode($string))."|";
echo (strlen($string)!==strlen(utf8_decode($string))) ? $string : utf8_decode($string);
echo "<br />";
23|24|Cond? Nast Publications
23|21|Cond? Nast Publications
16|20|France Télécom
16|14|France Télécom
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…