Regexp arabic text paragraph


Given string:

QString unformatted =
   "Some non arabic text"
   "بعض النصوص العربية"
   "Another non arabic text"
   "النص العربي آخر";

How can I get the following result, using QRegExp or some other way?

"<p>Some non arabic text</p>"
"<p dir='rtl'>بعض النصوص العربية</p>"
"<p>Another non arabic text</p>"
"<p dir='rtl'>النص العربي آخر</p>";

Thanks!

Answer by eyllanesc (best answer)

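The following function finds runs of Arabic text with a regular expression and wraps each run in <p dir='rtl'>. Non-Arabic text is left unwrapped: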

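#include <QRegExp>
#include <QString>

QString split_arabic(const QString &text)
{
    // One Arabic character followed by any number of Arabic characters or spaces.
    // Using * instead of + lets a single Arabic character match as well.
    QRegExp rx("[\u0600-\u065F\u066A-\u06EF\u06FA-\u06FF][ \u0600-\u065F\u066A-\u06EF\u06FA-\u06FF]*");

    QString result;
    int last = 0; // index just past the previous match
    int pos = 0;

    // Build the result in a single pass. A global text.replace() per match
    // would wrap a repeated Arabic run more than once.
    while ((pos = rx.indexIn(text, last)) != -1) {
        result += text.mid(last, pos - last);           // non-Arabic text, copied as is
        result += "<p dir='rtl'>" + rx.cap(0) + "</p>"; // Arabic run
        last = pos + rx.matchedLength();
    }
    result += text.mid(last); // remaining non-Arabic tail

    return result;
}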

Example:

QString unformatted =
            "Some non arabic text"
            "بعض النصوص العربية"
            "Another non arabic text"
            "النص العربي آخر";


qDebug()<<unformatted;
qDebug()<<split_arabic(unformatted);

Output:

"Some non arabic textبعض النصوص العربيةAnother non arabic textالنص العربي آخر"
"Some non arabic text<p dir='rtl'>بعض النصوص العربية</p>Another non arabic text<p dir='rtl'>النص العربي آخر</p>"
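The output above wraps only the Arabic runs, while the question also asks for plain <p>...</p> around the other text. A minimal sketch of that variant follows, written with the standard library only so it can be compiled without Qt. The names isArabic and wrapParagraphs are hypothetical, and the code point ranges are simply copied from the regular expression in the answer. The same loop could presumably be written over a QString and QChar.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true for code points in the ranges used by the answer's regex.
static bool isArabic(char32_t c) {
    return (c >= 0x0600 && c <= 0x065F) ||
           (c >= 0x066A && c <= 0x06EF) ||
           (c >= 0x06FA && c <= 0x06FF);
}

// Hypothetical function: splits the text into alternating Arabic and
// non-Arabic runs and wraps each run in a paragraph tag.
std::u32string wrapParagraphs(const std::u32string &text) {
    std::u32string out;
    std::size_t i = 0;
    while (i < text.size()) {
        const bool rtl = isArabic(text[i]); // script of the current run

        // Decides whether the character at index k continues the current run.
        auto inRun = [&](std::size_t k) {
            if (isArabic(text[k]))
                return rtl;
            // A space stays inside an Arabic run only when another
            // Arabic character follows it directly.
            if (text[k] == U' ' && rtl)
                return k + 1 < text.size() && isArabic(text[k + 1]);
            return !rtl;
        };

        std::size_t j = i + 1;
        while (j < text.size() && inRun(j))
            ++j;

        out += rtl ? U"<p dir='rtl'>" : U"<p>";
        out.append(text, i, j - i); // the run text[i, j)
        out += U"</p>";
        i = j;
    }
    return out;
}
```

Called on a string such as U"abc\u0628\u0639 \u0636def", this sketch produces three paragraphs: <p>abc</p>, an rtl paragraph with the Arabic run, and <p>def</p>. It does not escape HTML special characters, and two or more consecutive spaces between Arabic words end the Arabic run.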