Let's suppose there is a piece of code like this:
my $str = 'some text';
my $result = my_subroutine($str);
and my_subroutine() should be implemented as Perl XS code. For example, it could return the sum of the bytes of the (Unicode) string.
In the XS code, how can I process a string (a) character by character, as a general method, and (b) byte by byte, if the string contains only characters from the ASCII subset (ideally with a built-in function that converts from the native data structure of a string to a char[])?
At the XS layer, you'll get byte or UTF-8 strings. In the general case, your code will likely contain a char * to point at the next item in the string, incrementing it as it goes. For a useful set of UTF-8 support functions to use in XS, read the "Unicode Support" section of perlapi.
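As a rough illustration of that pointer walk, here is a minimal sketch in plain C, kept free of Perl headers so it compiles on its own. The names sum_bytes, sum_codepoints and next_codepoint are hypothetical and not part of any Perl API. In real XS code the buffer and its length would typically come from SvPVbyte() or SvPVutf8(), and a perlapi function such as utf8_to_uvchr_buf() would take the place of the hand-rolled decoder, which assumes well-formed UTF-8 and does no validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* (b) Byte by byte: walk the buffer and add up every byte value. */
static unsigned long sum_bytes(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    const unsigned char *p = (const unsigned char *)s;
    const unsigned char *end = p + len;
    unsigned long sum = 0;

    while (p < end)
        sum += *p++;
    return sum;
}

/* Hypothetical helper: decode one code point at *pp and advance *pp.
 * Assumes well-formed UTF-8; malformed input is not detected. */
static uint32_t next_codepoint(const unsigned char **pp,
                               const unsigned char *end)
{
    const unsigned char *p = *pp;
    uint32_t cp;
    int extra;  /* number of continuation bytes that follow the lead byte */

    if (*p < 0x80)      { cp = *p;        extra = 0; }
    else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
    else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
    else                { cp = *p & 0x07; extra = 3; }
    p++;

    /* Each continuation byte contributes its low six bits. */
    while (extra-- > 0 && p < end)
        cp = (cp << 6) | (uint32_t)(*p++ & 0x3F);

    *pp = p;
    return cp;
}

/* (a) Character by character: add up the code points of a UTF-8 buffer. */
static unsigned long sum_codepoints(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    const unsigned char *p = (const unsigned char *)s;
    const unsigned char *end = p + len;
    unsigned long sum = 0;

    while (p < end)
        sum += next_codepoint(&p, end);
    return sum;
}
```

For the three-byte buffer "h\xC3\xA9" (the letter h followed by U+00E9 encoded as UTF-8), sum_bytes() returns 468 and sum_codepoints() returns 337. For a pure ASCII string both functions return the same value, since every character occupies exactly one byte.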
An example of mine from http://cpansearch.perl.org/src/PEVANS/Tickit-0.15/lib/Tickit/Utils.xs
In brief, this function iterates over the given string one Unicode character at a time, accumulating the width as given by wcwidth().