I'm trying to parse some data from a .X12 file using regular expressions in PHP. The pattern is the capital letters FS followed by exactly 13 numeric characters.

Here is an example of some text: PROCUREMENT333RFQ3PO100011EAFS8340015381823PKGFSHALL

I need to extract 'FS8340015381823' and other variations of the 13 numeric characters from other files.

Here is the code that is not working for me:

$regex = '/FS[0-9]{13}?/';
preg_match( $regex, $x12, $matches );
var_dump( $matches );

I've also tried these regex patterns:

$regex = '/FS8340015381823/';
$regex = '/FS\d{13}?/';

All of these regexes work fine if I assign the example string above to the $x12 variable before calling preg_match(), but they don't work on the raw file when I load its contents. When I echo the .X12 file contents to the screen, I see the exact string that I have used as an example above. If I use the regex /FS/, it finds the 'FS'.

This regex works on the raw file data, but returns matches that aren't just numeric characters after the 'FS':

$regex = '/FS.{13}?/';

Could there be strange characters that the terminal on my machine is not displaying? I'm running CentOS Linux on an Amazon EC2 instance.

There is 1 answer

Answer by hwnd (best answer)

Thanks to @HamZa and the OP for helping break down the data.

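The raw data has a non-printing control character, 0x1D (the ASCII group separator), between FS and the digits, which is why the terminal shows nothing there. Match it explicitly with /FS\x1d[0-9]{13}/ or /FS\x1d\d{13}/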
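
As a minimal sketch of how that pattern could be used to get back the clean 'FS8340015381823' value: the sample string below is an assumption (the question's example with a 0x1D byte placed after the first FS), and the capture group is added here only so the separator can be dropped from the result.

```php
<?php
// Assumed sample: the question's example with a 0x1D byte after the first "FS".
$x12 = "PROCUREMENT333RFQ3PO100011EAFS\x1d8340015381823PKGFSHALL";

$code = null;

// Capture the 13 digits on their own so the control byte is not part of the result.
if (preg_match('/FS\x1d(\d{13})/', $x12, $matches)) {
    $code = 'FS' . $matches[1];
}

var_dump($code); // string(15) "FS8340015381823"
```

The trailing ? from the original patterns is left out because it has no effect after a fixed count like {13}.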

If different control characters can appear in that position, you can match any of them with a character class.

/FS[\x00-\x1f]\d{13}/
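
To check which hidden bytes a file really contains before choosing a pattern, something like the following sketch may help. The helper name showControlBytes and the sample data (including the 0x1E byte and the second code) are made up for illustration, and the arrow functions assume PHP 7.4 or later.

```php
<?php
// Hypothetical helper: rewrite control bytes as visible \xNN escapes.
function showControlBytes(string $data): string
{
    return preg_replace_callback(
        '/[\x00-\x1f\x7f]/',
        fn(array $m): string => sprintf('\\x%02x', ord($m[0])),
        $data
    );
}

// Assumed sample with two different control bytes after "FS".
$x12 = "EAFS\x1d8340015381823PKGFS\x1e1234567890123";

$visible = showControlBytes($x12);
echo $visible, PHP_EOL;
// EAFS\x1d8340015381823PKGFS\x1e1234567890123

// Match any control byte after "FS" and keep only the digits.
preg_match_all('/FS[\x00-\x1f](\d{13})/', $x12, $matches);
$codes = array_map(fn(string $d): string => 'FS' . $d, $matches[1]);

print_r($codes);
```

On the command line, hexdump -C on the file shows the same bytes without any PHP.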