Latin char in Javascript regexp


How can I include Latin characters like ČčĆćŠšĐđ in this JavaScript regexp?

var regex = new RegExp('\\b' + this.value, "i");

UPDATE:

I have this code for filtering checkbox labels, but it doesn't work well when the input contains Č, č or ć:

function listFilter(list, input) {
    var $lbs = list.find('.css-label');

    function filter(){
        var regex = new RegExp('\\b' + this.value);
        var $els = $lbs.filter(function(){
            return regex.test($(this).text());
        });
        $lbs.not($els).hide().prev().hide();
        $els.show().prev().show();
    };

    input.keyup(filter).change(filter)
}

jQuery(function($){
    listFilter($('#list'), $('.search-filter'))
})

Here is a fiddle: DEMO


There are 2 answers

5
Denys Séguret On BEST ANSWER

The problem with your regexp is that the word boundary \b isn't detected properly around those characters. \b is defined in terms of \w and \W, which only treat ASCII letters, digits and the underscore as word characters, so they handle Unicode badly.
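A quick way to see this, as a minimal sketch (the sample words are only illustrations, not taken from the question's data):

```javascript
// "č" is not a \w character, so \b sees no boundary before a leading "č"
const startOfWord = /\bč/.test('čaj');

// ...but it does see one between "a" (\w) and "č" (\W), in the middle of a word
const midWord = /\bč/.test('račun');

// The plain ASCII case behaves as expected
const asciiStart = /\bc/.test('caj');

console.log(startOfWord, midWord, asciiStart); // false true true
```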

I'd suggest starting with

new RegExp('(^|[\\s\\.])ČčĆćŠšĐđ', "i")

and adding to [\\s\\.] any other characters you need to treat as word boundaries.
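As a rough sketch of how that could be wired into the filter from the question: `escapeRegExp`, `SEPARATORS` and `startsWordRegex` are hypothetical names, not part of the original code, and the separator list is an assumption you would adjust to your own labels.

```javascript
// Hypothetical helper: escape regex metacharacters in the user's input
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Characters treated as word separators (an assumption; extend as needed)
const SEPARATORS = '\\s.,;:!?()\\-';

// Build a regex that matches `value` at the start of the text or after a separator
function startsWordRegex(value) {
  return new RegExp('(^|[' + SEPARATORS + '])' + escapeRegExp(value), 'i');
}

console.log(startsWordRegex('ča').test('Čaj'));       // true
console.log(startsWordRegex('ču').test('račun'));     // false
console.log(startsWordRegex('ču').test('moj čuvar')); // true
```

In the question's filter, new RegExp('\\b' + this.value) would then become startsWordRegex(this.value).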

If you can't enumerate the possible word boundaries, you'd be better off using a library that produces "Unicode compatible" regular expressions. Some are listed in this related question.
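If your target engines support ES2018 regular expressions (lookbehind plus Unicode property escapes with the u flag; that support is an assumption here, so check your browsers), a library may not be needed. This is a different technique from the explicit separator list above, sketched with hypothetical helper names:

```javascript
// Hypothetical helper: escape regex metacharacters in the user's input
function escapeForRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Matches `value` only where it is NOT preceded by a letter, digit or underscore,
// using Unicode-aware classes instead of the ASCII-only \b
function unicodeStartsWord(value) {
  return new RegExp('(?<![\\p{L}\\p{N}_])' + escapeForRegExp(value), 'iu');
}

console.log(unicodeStartsWord('ča').test('Čaj'));     // true
console.log(unicodeStartsWord('ču').test('račun'));   // false
console.log(unicodeStartsWord('ču').test('(čuvar)')); // true
```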

0
German Attanasio On

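Try this as the regular expression:

/^[A-Za-z\u00C0-\u017F\s'.,\-\/#!$%^&*;:{}=_`~()]+$/

The \u00C0-\u017F range covers the Latin-1 Supplement and Latin Extended-A letters, which include Č, ć, š and đ.

See the examples below:

// A-Za-z rather than A-z, which would also match [ \ ] ^ _ and `
// No "g" flag: with it, test() keeps lastIndex between calls and can give wrong results
var regexp = /[A-Za-z\u00C0-\u017F]+/,
  ascii = ' hello !@#$%^&*())_+=',
  latin = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏàáâãäåæçèéêëìíîïÐÑÒÓÔÕÖØÙÚÛÜÝÞßðñòóôõöøùúûüýþÿ',
  extended = 'ČčĆćŠšĐđ',
  chinese = ' 你 好 ';

console.log(regexp.test(ascii)); // true
console.log(regexp.test(latin)); // true
console.log(regexp.test(extended)); // true
console.log(regexp.test(chinese)); // false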

Gist: https://gist.github.com/germanattanasio/84cd25395688b7935182