Perl API Inline C: How to get a substr of a Perl byte string by reference without copying that string

Hello community, I hope I can meet some byte string experts here. I guess SvPVbyte comes into play, but how?

My problem: I already successfully parse a Perl array XYZ (within a hash of arrays, with example index 6789) in Inline::C, called from Perl like this:

$testn=pnp($lengthofXYZ,\@{$XYZ{$_}});

Inline C:

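int pnp(int n, SV *vertx) {
    AV *arrayx;
    SV **yi;
    double val_of_interest;

    arrayx = (AV *)SvRV(vertx);       /* dereference the array reference */
    yi = av_fetch(arrayx, 6789, 0);   /* NULL if the element does not exist */
    val_of_interest = SvNV(*yi);
    return calculation_with_val_of_interest;  /* placeholder for the real calculation */
}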

This works perfectly. But let's say I have a very long byte string (about 10-50 MB) in Perl: $xyz="\x09\x07\x44\xaa......

Now I want to pass a reference to this SV and walk through the string in 9-byte steps (substr-like) in the C part, without copying it completely into a separate C array, for example.

The walking part: the first 4 bytes shall be checked against a 4-byte reference value ABC, which shall also be passed in the function call. If necessary, I can unpack "N" this search phrase beforehand and call the function with an integer. If the check at position 0 is not successful, jump/increment 9 bytes further; if it is successful, return the found position.

Thank you so much.

Answer by ikegami:
#include <stdint.h>
#include <string.h>

void foo(SV* sv) {
    STRLEN len;
    const char *buf = SvPVbyte(sv, len);

    if (len < 4) {
        /* ... Error ... */
    }

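    /* Cast to uint32_t before shifting so a byte >= 0x80 cannot overflow a signed int. */
    uint32_t sig =
        ((uint32_t)(unsigned char)buf[0] << 24) |
        ((uint32_t)(unsigned char)buf[1] << 16) |
        ((uint32_t)(unsigned char)buf[2] <<  8) |
        ((uint32_t)(unsigned char)buf[3] <<  0);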

    buf += 4;
    len -= 4;
    if (sig != ...) {
        /* ... Error ... */
    }

    while (len >= 9) {
        char block[9];
        memcpy(block, buf, 9);
        buf += 9;
        len -= 9;

        /* ... Use block ... */
    }

    if (len > 0) {
        /* ... Error ... */
    }
}
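
For the record scan described in the question (compare the first 4 bytes of every 9-byte record against a reference value and return the matching position), the inner loop might look like the following plain-C sketch. It deliberately uses no Perl API so that it can be compiled on its own. The names load_be32, find_record and RECORD_SIZE are hypothetical, and the key is assumed to be the big-endian integer that unpack "N" would produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RECORD_SIZE 9  /* record stride taken from the question */

/* Read 4 bytes as a big-endian unsigned integer (the layout of pack "N"). */
static uint32_t load_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}

/* Return the byte offset of the first complete record whose leading
 * 4 bytes equal key, or -1 if none matches. The buffer is only read,
 * never copied. */
static ptrdiff_t find_record(const char *buf, size_t len, uint32_t key) {
    const unsigned char *p = (const unsigned char *)buf;
    size_t pos;

    assert(buf != NULL);
    for (pos = 0; pos + RECORD_SIZE <= len; pos += RECORD_SIZE) {
        if (load_be32(p + pos) == key)
            return (ptrdiff_t)pos;
    }
    return -1;
}
```

Inside an Inline::C function, buf and len would come from SvPVbyte(sv, len) as in the code above, so the scan reads the string's own buffer directly. Returning a ptrdiff_t (or an SV built with newSViv) rather than an int leaves room for offsets in very large strings.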

[This is an answer to the question in the comments]

  • NEVER use the bytes pragma (use bytes;). "Use of this module for anything other than debugging purposes is strongly discouraged." (And it's not actually useful for debugging purposes. Devel::Peek is more useful.)
  • There is absolutely no reason to use "our" here.
  • An int could be too small for the return value.
  • It's not working because you're searching the stringification of a reference.
  • In fact, there's no need to create a reference.

use strict;
use warnings qw( all );

use Inline C => <<'__EOS__';

SV* find_first_pos_of_43h_in_byte_string(SV* sv) {
    STRLEN len;
    const char *p_start = SvPVbyte(sv, len);
    const char *p = p_start;
    const char *p_end = p_start + len;
    for (; p < p_end; ++p) {
        if (*p == 0x43)
            return newSVuv(p - p_start);
    }

    return newSViv(-1);
}

__EOS__

my $buf = "\x00\x00\x43\x01\x01\x01";
my $pos = find_first_pos_of_43h_in_byte_string($buf);

Of course, you could simply use Perl's built-in index:

use strict;
use warnings qw( all );

my $buf = "\x00\x00\x43\x01\x01\x01";
my $pos = index($buf, chr(67));
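
If the search has to stay on the C side, the byte-at-a-time loop in find_first_pos_of_43h_in_byte_string could probably be replaced by the standard memchr function. The following is a minimal plain-C sketch without any Perl API; find_byte is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of byte in buf[0..len),
 * or -1 if the byte does not occur. */
static ptrdiff_t find_byte(const char *buf, size_t len, unsigned char byte) {
    const char *hit;

    assert(buf != NULL);
    hit = memchr(buf, byte, len);
    return hit ? hit - buf : -1;
}
```

As before, buf and len would come from SvPVbyte(sv, len) when this is called from an Inline::C function.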