Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


string - Using JavaScript to perform text matches with/without accented characters

I am using an AJAX-based lookup for names that a user searches for in a text box.

I am making the assumption that all names in the database will be transliterated to European alphabets (i.e. no Cyrillic, Japanese, Chinese). However, the names will still contain accented characters, such as ç, ê and even č and ć.

A simple search like "Micic" will not match "Mičić", though, and the user expectation is that it will.

The AJAX lookup uses regular expressions to determine a match. I have modified the regular expression comparison using this function in an attempt to match more accented characters. However, it's a little clumsy since it doesn't take into account all characters.

function makeComp (input)
{
    input = input.toLowerCase ();
    var output = '';
    for (var i = 0; i < input.length; i ++)
    {
        // Replace each plain letter with a character class of its accented variants
        if (input.charAt (i) == 'a')
            output = output + '[aàáâãäåæ]';
        else if (input.charAt (i) == 'c')
            output = output + '[cç]';
        else if (input.charAt (i) == 'e')
            output = output + '[eèéêëæ]';
        else if (input.charAt (i) == 'i')
            output = output + '[iìíîï]';
        else if (input.charAt (i) == 'n')
            output = output + '[nñ]';
        else if (input.charAt (i) == 'o')
            output = output + '[oòóôõöø]';
        else if (input.charAt (i) == 's')
            output = output + '[sß]';
        else if (input.charAt (i) == 'u')
            output = output + '[uùúûü]';
        else if (input.charAt (i) == 'y')
            output = output + '[yÿ]';
        else
            output = output + input.charAt (i);
    }
    return output;
}

Apart from a substitution function like this, is there a better way? Perhaps to "deaccent" the string being compared?


1 Reply


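There is a way to "deaccent" the string being compared without using a substitution function that lists every accent you want to remove.

Here is the easiest solution I can think of to remove accents (and other diacritics) from a string. Calling normalize('NFD') splits each accented letter into its base letter followed by separate combining marks, and the replace() call then strips those combining marks (U+0300 to U+036F).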

See it in action:

var string = "Ça été Mičić. àéïóû";
console.log(string);

var string_norm = string.normalize('NFD').replace(/[\u0300-\u036f]/g, "");
console.log(string_norm);
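
To connect this back to the name lookup in the question, here is a minimal sketch of how the same normalization could be applied to both the search text and the stored name before comparing them. The helper names deaccent and matchesName are hypothetical (they are not from the original post), and the sketch assumes a plain substring match is acceptable in place of the regular expression comparison.

```javascript
// Hypothetical helper: strips diacritics using the NFD approach shown above.
function deaccent(text) {
    // NFD splits each accented letter into a base letter plus combining marks;
    // the regex then removes the combining marks (U+0300 to U+036F).
    return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical helper: case-insensitive, accent-insensitive substring match.
function matchesName(query, name) {
    // Bring both sides into the same deaccented, lower-cased form first.
    var plainQuery = deaccent(query).toLowerCase();
    var plainName = deaccent(name).toLowerCase();
    return plainName.includes(plainQuery);
}

console.log(matchesName('Micic', 'Mičić'));        // true
console.log(matchesName('micic', 'Dragan Mičić')); // true
console.log(matchesName('Micic', 'Milić'));        // false
```

One caveat: letters such as ø, ß and æ have no canonical decomposition, so normalize('NFD') leaves them unchanged. If those need to match o, ss or ae, a small explicit substitution step would still be required for them.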
