Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - How to remove diacritics from text?

I am making a Swedish website, and the Swedish letters are å, ä, and ö.

I need to make a user-entered string URL-safe with PHP.

Basically, I need to convert all characters to underscores, EXCEPT these:

 A-Z, a-z, 1-9

and all Swedish letters should be converted like this:

'å' to 'a', 'ä' to 'a', and 'ö' to 'o' (just remove the dots above).

The rest should become underscores, as I said.

I'm not good at regular expressions, so I would appreciate the help!

Thanks

NOTE: NOT URLENCODE. I need to store it in a database, etc., so urlencode won't work for me.


1 Reply


This function should help; it handles almost all cases.

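// Encode accented letters as HTML entities (e.g. "ä" becomes "&auml;"),
// then keep only the base letter(s) captured from each entity name.
function Unaccent($string)
{
    return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_COMPAT, 'UTF-8'));
}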
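
Unaccent() only strips the accents, so the underscore part of the question still needs a second step. Here is a minimal sketch of how the two could be combined. The name url_safe_name() is hypothetical and not part of the answer above. It assumes the input is valid UTF-8 and that the digit 0 should be kept as well (the question lists 1-9). The html_entity_decode() call is an addition of this sketch: without it, an "&" in the input would be left as "&amp;" and end up as "_amp_".

```php
<?php
// Copied from the answer above so the sketch is self-contained.
function Unaccent($string)
{
    return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_COMPAT, 'UTF-8'));
}

// Hypothetical helper: strip accents, then turn every remaining
// character outside A-Z, a-z, 0-9 into an underscore.
function url_safe_name($string)
{
    // Step 1: "ö" becomes "&ouml;" and then "o" (same for å, ä, etc.).
    $plain = Unaccent($string);

    // Step 2: undo the entities that Unaccent() left behind, such as
    // "&amp;", so that one input character maps to one underscore.
    $plain = html_entity_decode($plain, ENT_COMPAT, 'UTF-8');

    // Step 3: replace everything that is not an ASCII letter or digit.
    // The "u" modifier treats a multi-byte UTF-8 character as a single
    // character; it requires the string to be valid UTF-8.
    return preg_replace('~[^A-Za-z0-9]~u', '_', $plain);
}

echo url_safe_name('Smörgåsbord på Öland'), "\n";
```

With the file saved as UTF-8, this prints Smorgasbord_pa_Oland. Runs of punctuation or spaces turn into runs of underscores; if that is not wanted, changing the last pattern to '~[^A-Za-z0-9]+~u' should collapse each run into a single underscore.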
