Is there a way to detect which Unicode range the characters in an HTML file belong to and apply a class to the nearest div/span?

For example, if there is a div containing English (Latin characters), I want to apply an 'en' class to that div. And if there is a div containing Japanese characters, I want to apply a 'jp' class to that div.

(If both are present, I guess both classes can be applied.)

2 Answers

2
לבני מלכה On Best Solutions

Use divs[i].textContent.match with a Unicode range for each script, for example a range of Japanese characters:

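var divs = document.querySelectorAll("div");

for (var i = 0; i < divs.length; ++i) {
    var text = divs[i].textContent;
    // Latin letters only; \u0020-\u007F would also match spaces, digits and punctuation.
    if (/[A-Za-z]/.test(text)) {
       divs[i].classList.add('eng');
    }
    // A separate `if` (not `else if`) so mixed content gets both classes.
    // Covers hiragana, katakana and the common kanji block.
    if (/[\u3040-\u30FF\u4E00-\u9FFF]/.test(text)) {
       divs[i].classList.add('jp');
    }
}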
.eng{
color:red;
}

.jp{
color:blue;
}
<div>english</div>
<div>良い一日を</div>

EDIT

To query every element on the page instead of only divs, use document.querySelectorAll("body *"):

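    var elements = document.querySelectorAll("body *");

    for (var i = 0; i < elements.length; ++i) {
        var text = elements[i].textContent;
        // Latin letters only; \u0020-\u007F would also match spaces, digits and punctuation.
        if (/[A-Za-z]/.test(text)) {
           elements[i].classList.add('eng');
        }
        // A separate `if` (not `else if`) so mixed content gets both classes.
        // Covers hiragana, katakana and the common kanji block.
        if (/[\u3040-\u30FF\u4E00-\u9FFF]/.test(text)) {
           elements[i].classList.add('jp');
        }
    }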
    .eng{
    color:red;
    }

    .jp{
    color:blue;
    }
<div>english</div>
<div>良い一日を</div>
<span>english span</span>
<label>良い一日を</label>
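
One thing to watch with "body *": textContent includes the text of all descendants, so a wrapper element picks up the classes of everything nested inside it. If the goal is to class only the element that directly holds the text, something like the following might work. This is a rough sketch, not a tested drop-in: ownText and classesFor are hypothetical helper names, and the Japanese range (kana plus the common kanji block) is an assumption about what needs to be covered.

```javascript
// Hypothetical helper: concatenate only the element's direct text nodes,
// ignoring text that belongs to child elements.
function ownText(element) {
  var text = "";
  for (var i = 0; i < element.childNodes.length; ++i) {
    var node = element.childNodes[i];
    if (node.nodeType === 3) { // 3 is Node.TEXT_NODE
      text += node.nodeValue;
    }
  }
  return text;
}

// Hypothetical helper: map a string to the class names used in this answer.
function classesFor(text) {
  var classes = [];
  if (/[A-Za-z]/.test(text)) {
    classes.push("eng");
  }
  // Hiragana, katakana and CJK Unified Ideographs (assumed to be enough here).
  if (/[\u3040-\u30FF\u4E00-\u9FFF]/.test(text)) {
    classes.push("jp");
  }
  return classes;
}

// Only touch the DOM when one is available (for example in a browser).
if (typeof document !== "undefined") {
  var elements = document.querySelectorAll("body *");
  for (var i = 0; i < elements.length; ++i) {
    var names = classesFor(ownText(elements[i]));
    for (var j = 0; j < names.length; ++j) {
      elements[i].classList.add(names[j]);
    }
  }
}
```

With this, <div>english <span>良い一日を</span></div> would get 'eng' on the div and 'jp' on the span, rather than both classes on the div. Any script or style elements inside body are still visited, so you may want to skip those.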

0
JGNI On

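There is a Script property in Unicode regexes. With the u flag, you can write /^(\p{Script=Han}|\p{Script=Hiragana}|\p{Script=Katakana})+$/u to match text made up of characters occurring in Japanese writing. Note that the property name is case-sensitive in JavaScript, and that Han also covers Chinese characters.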
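
As a minimal sketch of how the Script property could be used for the classes in the question: scriptClasses is a hypothetical helper name, and the 'en' and 'jp' class names come from the question. It assumes an engine with Unicode property escapes (ES2018 or later).

```javascript
// Unicode property escapes need the `u` flag, and the property name
// (`Script`, or its alias `sc`) is case-sensitive.
const LATIN = /\p{Script=Latin}/u;
// Han is shared with Chinese, so this cannot tell the two languages apart.
const JAPANESE = /[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}]/u;

// Hypothetical helper: return the class names that apply to a piece of text.
function scriptClasses(text) {
  const classes = [];
  if (LATIN.test(text)) {
    classes.push("en");
  }
  if (JAPANESE.test(text)) {
    classes.push("jp");
  }
  return classes;
}

// Only touch the DOM when one is available (for example in a browser).
if (typeof document !== "undefined") {
  for (const el of document.querySelectorAll("div, span")) {
    for (const name of scriptClasses(el.textContent)) {
      el.classList.add(name);
    }
  }
}
```

Unlike the anchored ^...$ pattern, these unanchored patterns only ask whether at least one character of the script is present, which is what allows both classes to be applied to mixed content.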