Is there any API in icu::BreakIterator class that gives the token offsets in number of "bytes"?

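#include <unicode/brkiter.h>
#include <unicode/unistr.h>
#include <cstdio>
#include <cstring>
#include <memory>

int main( void ) {

    char iInput[256];
    printf("Enter the input string: ");
    if (scanf("%255s", iInput) != 1) return 1;

    // The original did not show how boundary was created; a word iterator is assumed.
    UErrorCode iStatus = U_ZERO_ERROR;
    std::unique_ptr<icu::BreakIterator> boundary(
        icu::BreakIterator::createWordInstance(icu::Locale::getDefault(), iStatus));
    if (U_FAILURE(iStatus)) return 1;

    icu::UnicodeString iText = icu::UnicodeString::fromUTF8(iInput);  // assumes UTF-8 input
    boundary->setText(iText);
    int32_t iStartOffset = boundary->first();
    int32_t iEndOffset = boundary->next();
    int32_t iStrLength = static_cast<int32_t>(strlen(iInput));
    printf("iStartOffset: %d, iEndOffset: %d, iStrLength: %d\n", iStartOffset, iEndOffset, iStrLength);
    return 0;
}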

Calling setText() as above gives me offsets counted in Unicode characters (UTF-16 code units of the UnicodeString), not bytes. Is there any API in the BreakIterator class that gives the token offsets in terms of number of bytes?
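
One possible workaround, sketched under assumptions: if the input is well-formed UTF-8, the UTF-16 offsets returned by first()/next() can be mapped back to byte offsets by walking the original bytes. The helper name utf16OffsetToUtf8Bytes below is hypothetical and is not part of ICU. As far as I know, ICU also accepts a UText through BreakIterator::setText(UText*, UErrorCode&), and with a UText opened by utext_openUTF8() the reported offsets should be native UTF-8 byte indexes, which may avoid the conversion entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not an ICU API): maps an offset counted in UTF-16
// code units, as returned by BreakIterator over a UnicodeString, to the
// matching byte offset in the original UTF-8 string.
// Assumes utf8 is well-formed and the offset falls on a code point boundary.
std::size_t utf16OffsetToUtf8Bytes(const std::string& utf8, int32_t utf16Offset) {
    assert(utf16Offset >= 0);
    std::size_t bytePos = 0;  // current position in bytes
    int32_t units = 0;        // UTF-16 code units consumed so far
    while (bytePos < utf8.size() && units < utf16Offset) {
        unsigned char lead = static_cast<unsigned char>(utf8[bytePos]);
        if (lead < 0x80)      { bytePos += 1; units += 1; }  // ASCII
        else if (lead < 0xE0) { bytePos += 2; units += 1; }  // 2-byte sequence
        else if (lead < 0xF0) { bytePos += 3; units += 1; }  // 3-byte sequence
        else                  { bytePos += 4; units += 2; }  // 4 bytes, one surrogate pair
    }
    // Clamp in case the offset runs past the end of the string.
    return bytePos < utf8.size() ? bytePos : utf8.size();
}
```

With the question's variables this would be called as utf16OffsetToUtf8Bytes(iInput, iEndOffset). The walk is linear in the string length, so for many tokens it is cheaper to convert offsets incrementally or to use the UText route.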
